FANTOM5 CAGE profiles of human and mouse samples

Noguchi, Shuhei; Arakawa, Takahiro; Fukuda, Shiro; Furuno, Masaaki; Hasegawa, Akira; Hori, Fumi; Ishikawa-Kato, Sachi; Kaida, Kaoru; Kaiho, Ai; Kanamori-Katayama, Mutsumi; Kawashima, Tsugumi; Kojima, Miki; Kubosaki, Atsutaka; Manabe, Ri-ichiroh; Murata, Mitsuyoshi; Nagao-Sato, Sayaka; Nakazato, Kenichi; Ninomiya, Noriko; Nishiyori-Sueki, Hiromi; Noma, Shohei; Saijyo, Eri; Saka, Akiko; Sakai, Mizuho; Simon, Christophe; Suzuki, Naoko; Tagami, Michihira; Watanabe, Shoko; Yoshida, Shigehiro; Arner, Peter; Axton, Richard A.; Babina, Magda; Baillie, J. Kenneth; Barnett, Timothy C.; Beckhouse, Anthony G.; Blumenthal, Antje; Bodega, Beatrice; Bonetti, Alessandro; Briggs, James; Brombacher, Frank; Carlisle, Ailsa J.; Clevers, Hans C.; Davis, Carrie A.; Detmar, Michael; Dohi, Taeko; Edge, Albert S.B.; Edinger, Matthias; Ehrlund, Anna; Ekwall, Karl; Endoh, Mitsuhiro; Enomoto, Hideki; Eslami, Afsaneh; Fagiolini, Michela; Fairbairn, Lynsey; Farach-Carson, Mary C.; Faulkner, Geoffrey J.; Ferrai, Carmelo; Fisher, Malcolm E.; Forrester, Lesley M.; Fujita, Rie; Furusawa, Jun-ichi; Geijtenbeek, Teunis B.; Gingeras, Thomas; Goldowitz, Daniel; Guhl, Sven; Guler, Reto; Gustincich, Stefano; Ha, Thomas J.; Hamaguchi, Masahide; Hara, Mitsuko; Hasegawa, Yuki; Herlyn, Meenhard; Heutink, Peter; Hitchens, Kelly J.; Hume, David A.; Ikawa, Tomokatsu; Ishizu, Yuri; Kai, Chieko; Kawamoto, Hiroshi; Kawamura, Yuki I.; Kempfle, Judith S.; Kenna, Tony J.; Kere, Juha; Khachigian, Levon M.; Kitamura, Toshio; Klein, Sarah; Klinken, S. 
Peter; Knox, Alan J.; Kojima, Soichi; Koseki, Haruhiko; Koyasu, Shigeo; Lee, Weonju; Lennartsson, Andreas; Mackay-sim, Alan; Mejhert, Niklas; Mizuno, Yosuke; Morikawa, Hiromasa; Morimoto, Mitsuru; Moro, Kazuyo; Morris, Kelly J.; Motohashi, Hozumi; Mummery, Christine L.; Nakachi, Yutaka; Nakahara, Fumio; Nakamura, Toshiyuki; Nakamura, Yukio; Nozaki, Tadasuke; Ogishima, Soichi; Ohkura, Naganari; Ohno, Hiroshi; Ohshima, Mitsuhiro; Okada-Hatakeyama, Mariko; Okazaki, Yasushi; Orlando, Valerio; Ovchinnikov, Dmitry A.; Passier, Robert; Patrikakis, Margaret; Pombo, Ana; Pradhan-Bhatt, Swati; Qin, Xian-Yang; Rehli, Michael; Rizzu, Patrizia; Roy, Sugata; Sajantila, Antti; Sakaguchi, Shimon; Sato, Hiroki; Satoh, Hironori; Savvi, Suzana; Saxena, Alka; Schmidl, Christian; Schneider, Claudio; Schulze-Tanzil, Gundula G.; Schwegmann, Anita; Sheng, Guojun; Shin, Jay W.; Sugiyama, Daisuke; Sugiyama, Takaaki; Summers, Kim M.; Takahashi, Naoko; Takai, Jun; Tanaka, Hiroshi; Tatsukawa, Hideki; Tomoiu, Andru; Toyoda, Hiroo; van de Wetering, Marc; van den Berg, Linda M.; Verardo, Roberto; Vijayan, Dipti; Wells, Christine A.; Winteringham, Louise N.; Wolvetang, Ernst; Yamaguchi, Yoko; Yamamoto, Masayuki; Yanagi-Mizuochi, Chiyo; Yoneda, Misako; Yonekura, Yohei; Zhang, Peter G.; Zucchelli, Silvia; Abugessaisa, Imad; Arner, Erik; Harshbarger, Jayson; Kondo, Atsushi; Lassmann, Timo; Lizio, Marina; Sahin, Serkan; Sengstag, Thierry; Severin, Jessica; Shimoji, Hisashi; Suzuki, Masanori; Suzuki, Harukazu; Kawai, Jun; Kondo, Naoto; Itoh, Masayoshi; Daub, Carsten O.; Kasukawa, Takeya; Kawaji, Hideya; Carninci, Piero; Forrest, Alistair R.R.; Hayashizaki, Yoshihide

doi:10.1038/sdata.2017.112

Data Descriptor
Open access
Published: 29 August 2017

Shuhei Noguchi¹,
Takahiro Arakawa^1,2,
Shiro Fukuda²,
Masaaki Furuno^1,2,
Akira Hasegawa^1,2,
Fumi Hori^1,2,
Sachi Ishikawa-Kato^1,2,
Kaoru Kaida²,
Ai Kaiho²,
Mutsumi Kanamori-Katayama²,
Tsugumi Kawashima^1,2,
Miki Kojima^1,2,
Atsutaka Kubosaki²,
Ri-ichiroh Manabe^1,2,
Mitsuyoshi Murata^1,2,
Sayaka Nagao-Sato^1,2,
Kenichi Nakazato²,
Noriko Ninomiya²,
Hiromi Nishiyori-Sueki^1,2,
Shohei Noma^1,2,
Eri Saijyo²,
Akiko Saka²,
Mizuho Sakai^1,2,
Christophe Simon²,
Naoko Suzuki^1,2,
Michihira Tagami^1,2,
Shoko Watanabe^1,2,
Shigehiro Yoshida²,
Peter Arner^3,4,
Richard A. Axton⁵,
Magda Babina⁶,
J. Kenneth Baillie⁷,
Timothy C. Barnett^8,9,
Anthony G. Beckhouse¹⁰,
Antje Blumenthal¹¹,
Beatrice Bodega¹²,
Alessandro Bonetti^1,2,
James Briggs¹³,
Frank Brombacher^14,15,16,
Ailsa J. Carlisle⁷,
Hans C. Clevers^17,18,
Carrie A. Davis¹⁹,
Michael Detmar ORCID: orcid.org/0000-0002-5351-5054²⁰,
Taeko Dohi²¹,
Albert S.B. Edge²²,
Matthias Edinger^23,24,
Anna Ehrlund^3,4,
Karl Ekwall²⁵,
Mitsuhiro Endoh²⁶,
Hideki Enomoto²⁷,
Afsaneh Eslami²⁸,
Michela Fagiolini²⁹,
Lynsey Fairbairn⁷,
Mary C. Farach-Carson ORCID: orcid.org/0000-0002-4526-3088³⁰,
Geoffrey J. Faulkner³¹,
Carmelo Ferrai³²,
Malcolm E. Fisher ORCID: orcid.org/0000-0003-1074-8103⁷,
Lesley M. Forrester ORCID: orcid.org/0000-0002-9717-8299⁵,
Rie Fujita³³,
Jun-ichi Furusawa²⁶,
Teunis B. Geijtenbeek³⁴,
Thomas Gingeras¹⁹,
Daniel Goldowitz³⁵,
Sven Guhl⁶,
Reto Guler^14,15,16,
Stefano Gustincich^36,37,
Thomas J. Ha³⁵,
Masahide Hamaguchi³⁸,
Mitsuko Hara³⁹,
Yuki Hasegawa^1,2,
Meenhard Herlyn⁴⁰,
Peter Heutink ORCID: orcid.org/0000-0001-5218-1737⁴¹,
Kelly J. Hitchens^8,13,
David A. Hume⁷,
Tomokatsu Ikawa²⁶,
Yuri Ishizu^1,2,
Chieko Kai^42,43,
Hiroshi Kawamoto²⁶,
Yuki I. Kawamura²¹,
Judith S. Kempfle²²,
Tony J. Kenna⁴⁴,
Juha Kere ORCID: orcid.org/0000-0003-1974-0271^25,45,
Levon M. Khachigian^46,47,
Toshio Kitamura⁴⁸,
Sarah Klein²⁰,
S. Peter Klinken⁴⁹,
Alan J. Knox⁵⁰,
Soichi Kojima ORCID: orcid.org/0000-0002-5252-1612³⁹,
Haruhiko Koseki²⁶,
Shigeo Koyasu²⁶,
Weonju Lee⁵¹,
Andreas Lennartsson²⁵,
Alan Mackay-sim⁵²,
Niklas Mejhert^3,4,
Yosuke Mizuno⁵³,
Hiromasa Morikawa ORCID: orcid.org/0000-0002-6793-5885³⁸,
Mitsuru Morimoto²⁷,
Kazuyo Moro²⁶,
Kelly J. Morris³²,
Hozumi Motohashi⁵⁴,
Christine L. Mummery⁵⁵,
Yutaka Nakachi^53,56,
Fumio Nakahara⁴⁸,
Toshiyuki Nakamura⁴²,
Yukio Nakamura⁵⁷,
Tadasuke Nozaki⁵⁸,
Soichi Ogishima⁵⁹,
Naganari Ohkura³⁸,
Hiroshi Ohno²⁶,
Mitsuhiro Ohshima⁶⁰,
Mariko Okada-Hatakeyama^26,61,
Yasushi Okazaki^53,56,
Valerio Orlando^12,62,
Dmitry A. Ovchinnikov¹³,
Robert Passier⁵⁵,
Margaret Patrikakis⁴⁶,
Ana Pombo ORCID: orcid.org/0000-0002-7493-6288³²,
Swati Pradhan-Bhatt⁶³,
Xian-Yang Qin³⁹,
Michael Rehli^23,24,
Patrizia Rizzu⁴¹,
Sugata Roy²,
Antti Sajantila⁶⁴,
Shimon Sakaguchi³⁸,
Hiroki Sato⁴²,
Hironori Satoh³³,
Suzana Savvi^14,15,16,
Alka Saxena²,
Christian Schmidl²³,
Claudio Schneider⁶⁵,
Gundula G. Schulze-Tanzil⁶⁶,
Anita Schwegmann^14,15,16,
Guojun Sheng⁶⁷,
Jay W. Shin^1,2,
Daisuke Sugiyama⁶⁸,
Takaaki Sugiyama⁴²,
Kim M. Summers⁷,
Naoko Takahashi²,
Jun Takai³³,
Hiroshi Tanaka²⁸,
Hideki Tatsukawa⁶⁹,
Andru Tomoiu⁷,
Hiroo Toyoda⁵⁴,
Marc van de Wetering¹⁷,
Linda M. van den Berg³⁴,
Roberto Verardo⁷⁰,
Dipti Vijayan⁷¹,
Christine A. Wells⁷²,
Louise N. Winteringham⁴⁹,
Ernst Wolvetang¹³,
Yoko Yamaguchi⁷³,
Masayuki Yamamoto³³,
Chiyo Yanagi-Mizuochi⁷⁴,
Misako Yoneda⁴²,
Yohei Yonekura²⁷,
Peter G. Zhang³⁵,
Silvia Zucchelli³⁶,
Imad Abugessaisa ORCID: orcid.org/0000-0001-7458-801X¹,
Erik Arner^1,2,
Jayson Harshbarger^1,2,
Atsushi Kondo^1,2,
Timo Lassmann^1,2,75,
Marina Lizio ORCID: orcid.org/0000-0001-7337-7325^1,2,
Serkan Sahin^1,2,
Thierry Sengstag²,
Jessica Severin^1,2,
Hisashi Shimoji^2,76,
Masanori Suzuki²,
Harukazu Suzuki^1,2,
Jun Kawai^2,77,
Naoto Kondo^1,2,
Masayoshi Itoh^1,2,77,
Carsten O. Daub^1,2,25,
Takeya Kasukawa ORCID: orcid.org/0000-0001-5085-0802¹,
Hideya Kawaji ORCID: orcid.org/0000-0002-0575-0308^1,2,76,77,
Piero Carninci^1,2,
Alistair R.R. Forrest^1,2,49 &
…
Yoshihide Hayashizaki^2,77

Scientific Data volume 4, Article number: 170112 (2017)

Abstract

In the FANTOM5 project, transcription initiation events across the human and mouse genomes were mapped at single base-pair resolution and their frequencies were monitored by CAGE (Cap Analysis of Gene Expression) coupled with single-molecule sequencing. Approximately three thousand samples, consisting of a variety of primary cells, tissues, cell lines, and time series samples during cell activation and development, were subjected to a uniform pipeline of CAGE data production. The analysis pipeline started by measuring RNA extracts to assess their quality, and continued to CAGE library production by using a robotic or a manual workflow, single-molecule sequencing, and computational processing to generate frequencies of transcription initiation. The resulting data represent the consequence of transcriptional regulation in each analyzed state of mammalian cells. Non-overlapping peaks over the CAGE profiles, approximately 200,000 and 150,000 peaks for the human and mouse genomes respectively, were identified and annotated to provide the precise locations of known promoters as well as novel ones, and to quantify their activities.

Design Type(s)	organism part comparison design • species comparison design • cell type comparison design • organism development design
Measurement Type(s)	DNA-templated transcription, initiation
Technology Type(s)	cap analysis of gene expression
Factor Type(s)	Species • Organism Part • life cycle stage • cell type
Sample Characteristic(s)	Mus musculus • Homo sapiens • cerebellum • visual cortex • ileum • Peyer's patch • stomach • axillary lymph node • aorta • substantia nigra • hippocampal formation • brain • heart • liver • meningeal cluster • bone marrow • spinal cord • raphe nuclei • corpus striatum • cortex • peripheral nervous system • kidney • neural system • hemolymphoid system • blood • spleen • mesoderm • hematopoietic system • ventral wall of dorsal aorta • placenta • ganglion • spiral organ of cochlea • small intestine • intestine • adrenal gland • eyeball of camera-type eye • pituitary gland • thymus • lung • female gonad • testis • bone tissue • diencephalon • muscle organ • medulla oblongata • forelimb • pancreas • gonad • corpora quadrigemina • skin of body • tongue • colon • caecum • vesicular gland • epididymis • amnion • mammary gland • uterus • submandibular gland • prostate gland • intestinal mucosa • urinary bladder • vagina • oviduct

Machine-accessible metadata file describing the reported data (ISA-Tab format)

Background & Summary

Since the completion of human genome sequencing, the role of individual bases has been a central question. An international collaborative effort, FANTOM (Functional ANnoTation Of Mammalian Genome)¹, delineated a complex landscape of transcribed RNAs (the transcriptome) and their regulation. The initial key technology driving the project was the production of full-length cDNA clones, representing the complete primary structure of transcribed RNA molecules. Sequencing of the full-length cDNA clones uncovered an unexpected number of long non-coding RNAs as well as protein-coding genes^2–6. The CAGE (Cap Analysis Gene Expression)^7,8 protocol, in combination with high-throughput sequencing, was developed to monitor frequencies of transcription initiation by determining the 5′-ends of capped RNAs. The technology was devised to uncover the complexity of the transcriptome^4–6 and to elucidate transcriptional regulatory networks by focusing on promoter elements^9–12. By taking advantage of a single-molecule sequencer, HeliScopeCAGE was recently developed to provide more sensitive and accurate monitoring of transcription initiation activities^7,8.

In the fifth round of the FANTOM projects, FANTOM5, the challenge was to capture the transcriptome of as many varieties of cell states as possible, to understand the implication of each genomic base in different contexts. In the first phase of the FANTOM5 project, we targeted cells in steady state, called ‘snapshot’ samples¹³. Our central focus was on human primary cells, while cell lines, tissues and mouse samples were chosen to cover cells inaccessible as isolated human primary samples. The resulting data provided an atlas of promoter and enhancer activities in a wide range of cell states¹⁴, which is a baseline for understanding complex transcriptional regulation. In the second phase, we focused on transitions of cell states by monitoring ‘time course’ samples, such as activation, differentiation, and development at sequential time points¹⁵. The monitored activities of promoters and enhancers demonstrated that enhancer activity is the earliest event during dynamic changes of the transcriptome. These data sets are being utilized in many other studies inside and outside of the FANTOM5 consortium.

The data production scheme was implemented based on the FANTOM5 collaboration. Sample collection was performed at individual institutes, since specific types of samples require dedicated systems with special expertise or settings, as well as through purchase from commercial sources. RNA quality was first examined at the place where the samples were obtained (the first RNA quality check). The CAGE assay pipeline established in RIKEN GeNAS (Genome Network Analysis Support Facility) employed two workflows of HeliScopeCAGE: a manual workflow for samples with small amounts of total RNA⁸ and a robotic workflow for samples with standard requirements⁷. The assay pipeline started with checking RNA quality (the second RNA quality check), which provides a uniform quality assessment of the profiled RNA extracts. The resulting CAGE libraries were sequenced by HeliScope at RIKEN and also at Helicos Biosciences, and the obtained data were processed by the MOIRAI system¹⁶. The quality of the resulting CAGE profiles was checked with several statistics as well as by manual inspection using the ZENBU browser¹⁷. Finally, the CAGE profiles were shared among the consortium for further analysis.

In the course of the two phases focused on ‘snapshot’ and ‘time course’ samples, we profiled 1,816 human and 1,016 mouse samples in total, and obtained on average approximately four million single-molecule reads successfully aligned to the genome per sample. Based on frequencies of the observed 5′-ends of individual capped RNA molecules at single base-pair resolution, we identified 201,802 and 158,966 peaks for human and mouse respectively, where promoters are defined as the sequence immediately upstream of the peaks and frequencies of observed CAGE reads reflect activities of the promoters. All data generated during the course of the project were deposited in a public repository (DDBJ Read Archive, DRA) and/or provided at the FANTOM5 web resource (http://fantom.gsc.riken.jp/5/)¹⁸. Here we describe the data with the processing details and quality metrics.

Methods

Sample collection

Sample collection was performed as described previously^13,15. Briefly, primary cells were purchased as purified RNAs or frozen cells, or obtained as described previously^19–24 through collaboration in the consortium. Purchased cells were cultured according to the manufacturer’s instructions and the miRNeasy kit (QIAGEN) was used for RNA extraction. Human post mortem tissue RNAs were purchased or obtained through the Dutch Brain bank. Tissues collected through the consortium were snap-frozen in liquid nitrogen, transferred into Lysing Matrix D tubes (MP Biomedicals, Santa Ana, CA) containing chilled Trizol (Gibco), homogenized with a FastPrep Homogenizer (Thermo Savant), and centrifuged. The miRNeasy kit (QIAGEN) was also used for RNA extraction from cultured cell lines as well as frozen cell line stocks.

For the purchased samples, lot or catalogue numbers were recorded where available. Of the collected RNAs, those with more than 1 μg were measured with an Agilent BioAnalyzer (Agilent Technologies, Santa Clara, CA) and a Nanodrop spectrophotometer (Thermo Fisher Scientific, Wilmington, DE) to check the RIN (RNA integrity number) score and the absorbance ratios A260/A230 and A260/A280. The rest of the samples were directly subjected to CAGE library production to avoid wasting material. All 2,832 profiled samples are summarized in Table 1.

Table 1 Summary of FANTOM5 phase 1 and phase 2 samples.

Single molecule CAGE and data processing

HeliScopeCAGE libraries were prepared, sequenced, and processed as described previously^13,15. Most of the RNAs were subjected to the automated HeliScopeCAGE protocol⁷, except for RNAs with less than 1 μg, which were subjected to the manual protocol optimized for low-quantity RNAs⁸. The resulting libraries were measured with the OliGreen fluorescence assay kit (Life Technologies), and sequenced by following the manufacturer’s instructions (LB-016_01, LB-017_01, and LB-001_04; ref. 13). RNAs extracted from mouse whole body embryo E17.5 (called the internal control) were systematically subjected to this workflow, once per sequencing run.

The produced data were processed as previously described^13,15. Briefly, reads corresponding to ribosomal RNA were removed by using the program rRNAdust (http://fantom.gsc.riken.jp/5/suppl/rRNAdust/), the remaining reads were aligned to the human or mouse reference genome (hg19 or mm9) by using Delve²⁵, and alignments with a mapping quality of less than 20 (<99% chance of being correct) or a sequence identity of less than 85% were discarded. Frequencies of the CAGE read 5′ ends were counted per CAGE tag start site (CTSS), a single base-pair position on the reference genome. The entire flow of the data is illustrated in Fig. 1, and the number of CAGE profiles (equivalent to CTSS files) is summarized in Table 2.
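The filtering and counting step can be illustrated with a minimal Python sketch. It is not the MOIRAI or Delve implementation: the `Alignment` record, its field names, and the 0-based half-open coordinate convention are assumptions made for illustration; only the two thresholds (mapping quality 20, sequence identity 85%) come from the text above.

```python
from collections import Counter
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical alignment record (field names are illustrative)."""
    chrom: str
    start: int       # 0-based leftmost aligned position
    end: int         # 0-based, exclusive
    strand: str      # "+" or "-"
    mapq: int        # Phred-scaled mapping quality
    identity: float  # fraction of matching bases, 0..1


MIN_MAPQ = 20        # alignments below this were discarded
MIN_IDENTITY = 0.85  # alignments below this were discarded


def passes_filter(aln: Alignment) -> bool:
    """Keep alignments that meet both thresholds."""
    return aln.mapq >= MIN_MAPQ and aln.identity >= MIN_IDENTITY


def five_prime_end(aln: Alignment) -> tuple:
    """Return the genomic position of the read's 5' end.

    On the plus strand this is the leftmost base; on the minus
    strand it is the rightmost aligned base.
    """
    pos = aln.start if aln.strand == "+" else aln.end - 1
    return (aln.chrom, pos, aln.strand)


def count_ctss(alignments) -> Counter:
    """Count read 5' ends per (chrom, position, strand), i.e. per CTSS."""
    return Counter(five_prime_end(a) for a in alignments if passes_filter(a))
```

Given a list of such records, `count_ctss` returns a mapping from each CTSS to the number of reads starting there, which corresponds to one row of a CTSS file.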

Table 2 Sequence files (CTSS files).

Identification of peaks and their annotations

Non-overlapping peaks based on all the CAGE profiles were identified by using the DPI (decomposition-based peak identification, https://github.com/hkawaji/dpi1/) method and annotated as previously described^13,15. A ‘robust’ threshold, for which a peak must include a CTSS with more than 10 read counts and 1 TPM (tags per million) in at least one sample, was employed to define a stringent subset of the CAGE peaks. The robust peaks were associated with known transcripts, such as RefSeq²⁶, UCSC known gene²⁷, GENCODE²⁸, Ensembl²⁹, and mRNAs (full-length cDNA clones), based on their 5′-end proximity to the peaks. Official gene symbols, Entrez Gene IDs, and protein (UniProt) IDs associated with the transcripts were retrieved and assigned as part of the annotation. In addition to these associations, human-readable names and descriptions were assigned to each of the CAGE peaks. Peaks were given a name in the form pN@GENE, where GENE indicates the gene symbol or transcript name and N indicates the rank in the ranked list of promoter activities for that gene. For example, p1@SPI1 represents the peak with the highest number of observations (that is, read counts) in all of the FANTOM5 CAGE profiles, among the peaks associated with the SPI1 gene.
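As a rough illustration of the robust threshold and the pN@GENE naming rule, the sketch below uses hypothetical function names and input shapes; it is not taken from the DPI code. Two points are assumptions: that both conditions must hold for the same CTSS in the same sample, and that "1 TPM" means at least 1 TPM.

```python
def is_robust(ctss_observations, min_count=11, min_tpm=1.0):
    """Return True if any single (CTSS, sample) observation passes.

    ctss_observations: iterable of (read_count, tpm) pairs, one per
    CTSS per sample, for the CTSSs inside one peak.
    min_count=11 encodes "more than 10 read counts".
    """
    return any(count >= min_count and tpm >= min_tpm
               for count, tpm in ctss_observations)


def name_peaks(total_counts_by_peak, gene):
    """Assign pN@GENE names, with N = rank by total read count.

    total_counts_by_peak: dict mapping a peak identifier to its read
    count summed over all profiles, for peaks associated with `gene`.
    Ties are broken by peak identifier (an arbitrary choice here).
    """
    ranked = sorted(total_counts_by_peak.items(),
                    key=lambda item: (-item[1], item[0]))
    return {peak_id: f"p{rank}@{gene}"
            for rank, (peak_id, _) in enumerate(ranked, start=1)}
```

For instance, three peaks associated with SPI1 with total counts of 120, 4,500 and 30 would be named p2@SPI1, p1@SPI1 and p3@SPI1 respectively.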

Peak identification with the same method and the same threshold was performed twice: first for the ‘snapshot’ samples (phase 1), and second for all samples from both the ‘snapshot’ and ‘time course’ studies (phase 2). We integrated these two peak sets into a hybrid set consisting of all the phase 1 peaks over the robust threshold and the subset of phase 2 peaks that did not overlap with the phase 1 peaks. The annotation of the phase 1 peaks was used in the hybrid set, called the phase 1+2 peaks, which provides a consistent reference for the definition of promoters.
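The construction of the hybrid set can be sketched as follows. The `Peak` record and the strand-specific, half-open overlap test are assumptions for illustration; the text does not state how overlap was computed, and a production implementation would use an interval index rather than this quadratic scan.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    """Hypothetical peak record with 0-based, half-open coordinates."""
    chrom: str
    start: int
    end: int
    strand: str


def overlaps(a: Peak, b: Peak) -> bool:
    """Strand-specific interval overlap (an assumption of this sketch)."""
    return (a.chrom == b.chrom and a.strand == b.strand
            and a.start < b.end and b.start < a.end)


def hybrid_peak_set(phase1_robust, phase2):
    """All phase 1 robust peaks plus phase 2 peaks overlapping none of them."""
    extra = [p for p in phase2
             if not any(overlaps(p, q) for q in phase1_robust)]
    return list(phase1_robust) + extra
```

Keeping every phase 1 peak unchanged is what lets the phase 1 names and annotations carry over to the combined set.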

Quantification of promoter activities

All the obtained CAGE profiles were subjected to peak identification, even if they had some quality issues, since all of them still represent independent observations of RNA 5′-ends. However, for expression analyses requiring reliable quantification, promoter activities (that is, expression levels of CAGE peaks) were quantified only in samples satisfying the following criteria: a RIN score greater than 6, more than 500,000 reads successfully aligned to the genome, and more than 50% of the successful alignments lying close to the 5′-end of a RefSeq gene model. After discarding the few CAGE profiles of low quality, read counts for individual CTSSs belonging to the same peak were summed, normalization (or scaling) factors were calculated with the RLE (Relative Log Expression)³⁰ method in edgeR³¹, and tags per million (that is, counts per million) were computed as expression levels.
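The three quality criteria and the per-peak TPM computation can be written down in a short sketch. The function names and input shapes are hypothetical. The TPM formula follows the usual counts-per-million definition with an effective library size (library size multiplied by the normalization factor); which read total served as the library size in the original pipeline is not stated here, so it is simply passed in as a parameter.

```python
def usable_for_expression(rin, mapped_reads, promoter_rate):
    """Apply the three criteria given in the text.

    promoter_rate is the fraction (0..1) of aligned reads close to the
    5' end of a RefSeq gene model. How samples without a measured RIN
    were handled is not specified, so this sketch requires a number.
    """
    return rin > 6 and mapped_reads > 500_000 and promoter_rate > 0.5


def peak_tpm(ctss_counts, peak_members, library_size, norm_factor=1.0):
    """Sum CTSS counts per peak and scale to tags per million.

    ctss_counts:  dict mapping a CTSS identifier to its read count
    peak_members: dict mapping a peak identifier to its CTSS identifiers
    """
    effective_size = library_size * norm_factor
    return {
        peak: sum(ctss_counts.get(c, 0) for c in members)
              / effective_size * 1_000_000
        for peak, members in peak_members.items()
    }
```

With a normalization factor of 1.0 this reduces to plain counts per million of the supplied library size.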

The RLE normalization was first performed within the phase 1 samples. A naïve application of the same procedure to the entire data set, consisting of phase 1 and phase 2 samples, might cause inconsistencies in expression levels between the two normalizations. To avoid this, we took the geometric mean of CAGE peak read counts across the phase 1 samples and used it as the reference expression for the normalization factor calculation, in the same manner as the RLE method. This enabled us to keep the expression levels of phase 1 as they were, and to adjust the expression levels of the phase 2 samples to be comparable¹⁵.
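The idea of a fixed phase 1 reference can be illustrated with the core median-of-ratios step of RLE. This is a simplified sketch in plain Python, not the edgeR code: the subsequent rescaling that edgeR applies (dividing by library size and centering the factors) is omitted, and the function names are hypothetical.

```python
import math
from statistics import median


def geometric_mean_reference(phase1_counts):
    """Per-peak geometric mean across the phase 1 samples.

    phase1_counts: list of per-sample count lists, all in the same
    peak order. A peak with a zero count in any sample gets a
    reference of 0.0 and is ignored when computing factors.
    """
    n_samples = len(phase1_counts)
    reference = []
    for values in zip(*phase1_counts):
        if all(v > 0 for v in values):
            log_mean = sum(math.log(v) for v in values) / n_samples
            reference.append(math.exp(log_mean))
        else:
            reference.append(0.0)
    return reference


def rle_size_factor(sample_counts, reference):
    """Median ratio of a sample's counts to the fixed reference."""
    ratios = [c / r for c, r in zip(sample_counts, reference) if r > 0]
    return median(ratios)
```

Because the reference is computed from phase 1 only, factors for phase 1 samples do not change when phase 2 samples are added, while each phase 2 sample is scaled against the same baseline.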

Code availability

All software used in this study is publicly available. rRNAdust, for removing ribosomal RNA, is available at http://fantom.gsc.riken.jp/5/suppl/rRNAdust/. The mapping software Delve is available at http://fantom.gsc.riken.jp/5/suppl/delve/. The program implementing the DPI (decomposition-based peak identification) method is available at https://github.com/hkawaji/dpi1/.

Data Records

Data record 1: Metadata

Two types of metadata are available at figshare and LSDB Archive (Data Citation 1, 10). One is for the samples, including their origins and extracted RNA. The other is for the CAGE assay, including the result of RNA quality check, library production, and post-processing of the CAGE tag sequences. Both of them are described in SDRF (Sample and Data Relationship Format)³². Sample metadata for human and mouse are ‘HumanSamples2.0.sdrf.xlsx’ and ‘MouseSamples2.0.sdrf.xlsx’, respectively. The metadata for the CAGE assay are available as ‘*sdrf.txt’.

Data record 2: CAGE profiles

All of the CAGE sequences, their alignment to the genomes, and CTSS frequencies are available at DDBJ DRA (DDBJ Sequence Read Archive) (Data Citations 2–9). The accession number of each file is summarized in ‘DRA*.txt’ at figshare (Data Citation 1).

Data record 3: CAGE peaks

Genomic coordinates, annotations and expressions of the CAGE peaks are available as ‘*phase1and2combined_coord.bed.gz’, ‘*phase1and2combined_ann.txt.gz’, and ‘*phase1and2combined_tpm.osc.txt.gz’ respectively at figshare (Data Citation 1). Genomic coordinates are formatted in BED format, and the others are formatted in OSCtable (Order Switchable Column table). The detail of the OSCtable format is available at https://sourceforge.net/projects/osctf/.
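For readers who want to load the peak coordinates programmatically, a minimal BED reader might look like the sketch below. It assumes the files carry at least the six standard BED columns (chrom, start, end, name, score, strand) and uses only the Python standard library; the record type, function names, and the example peak name are illustrative, and any extra columns are ignored.

```python
import gzip
from typing import NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive (BED convention)
    name: str
    score: str   # kept as text; not interpreted here
    strand: str


def parse_bed_line(line: str) -> BedRecord:
    """Parse the first six tab-separated BED fields of one line."""
    fields = line.rstrip("\n").split("\t")
    return BedRecord(fields[0], int(fields[1]), int(fields[2]),
                     fields[3], fields[4], fields[5])


def read_bed_gz(path):
    """Yield records from a gzip-compressed BED file, skipping headers."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.strip() and not line.startswith(("#", "track", "browser")):
                yield parse_bed_line(line)
```

Calling `read_bed_gz` on a downloaded `*phase1and2combined_coord.bed.gz` file would then yield one record per CAGE peak; the annotation and expression tables are in OSCtable format and need a separate reader.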

Technical Validation

RNA quality

RNA qualities measured at the second check (that is, immediately before CAGE library production) are shown in Fig. 2a–c. The RNA Integrity Number (RIN) score, measured using an Agilent Bioanalyzer, was 8.96 on average (standard deviation 1.19), and the absorbance ratios at 260/230 nm (A260/A230) and 260/280 nm (A260/A280) were on average 2.01 (standard deviation 0.53) and 2.13 (standard deviation 0.14) respectively. These figures indicate that the majority of the RNAs were of good quality.

**Figure 2: RNA and mapping quality control.**

Mapped reads

The number of CAGE reads successfully aligned to the genome and the ratio of CAGE reads hitting conventional promoters are shown in Fig. 2d,e. The average number of mapped reads is 4,208,291 per CAGE profile. Of the 2,522 profiles, 98.3% (2,478) consist of at least 500,000 successfully aligned reads, which was a criterion for profiles used in expression analysis¹³. The average ratio of promoter-hitting reads is 76.5%, and 98.6% of all the profiles (2,437/2,472) have a promoter-hitting rate of more than 50%, which was another criterion for profiles used in expression analysis¹³.

Sample identity

Hierarchical clustering of the 126 mouse primary cells¹³ within phase 1 is shown in Fig. 3, and the same clustering of the 571 human primary cells¹³ is shown in Supplementary Fig. 1. The average linkage method was applied to log-scale expression (TPM) profiles at the promoter level, and sample identities were assessed by the expression of marker genes and also by manual inspection of the hierarchical clustering. The figures show that the majority of biological replicates belonged to the same branch of the tree, that is, the same cluster, except for samples with a low number of mapped reads.

**Figure 3: Hierarchical clustering of primary cells.**

Usage Notes

As well as providing access to individual data files, we set up a series of interfaces, as described in the FANTOM web resource^18,33. TET (Table Extraction Tool) provides an interface to obtain a subset of data by specifying the desired columns and rows. The BioMart interface³⁴ and FANTOM5 SSTAR (Semantic catalog of Samples, Transcription initiation And Regulators) provide the metadata of the profiled samples³⁵. The CAGE profiles on the genomic axis can be viewed in ZENBU¹⁷ with its interactive interface and also in the UCSC genome browser³⁶ via a track data hub³⁷.

Additional Information

How to cite this article: Noguchi, S. et al. FANTOM5 CAGE profiles of human and mouse samples. Sci. Data 4:170112 doi: 10.1038/sdata.2017.112 (2017).

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. de Hoon, M., Shin, J. W. & Carninci, P. Paradigm shifts in genomics through the FANTOM projects. Mamm Genome 26, 391–402 (2015).
2. The RIKEN Genome Exploration Research Group Phase II Team and the FANTOM Consortium. Functional annotation of a full-length mouse cDNA collection. Nature 409, 685–690 (2001).
3. The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I & II Team. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 420, 563–573 (2002).
4. RIKEN Genome Exploration Research Group and Genome Science Group (Genome Network Project Core Group) and the FANTOM Consortium. Antisense transcription in the mammalian transcriptome. Science 309, 1564–1566 (2005).
5. The FANTOM Consortium and RIKEN Genome Exploration Research Group and Genome Science Group (Genome Network Project Core Group). The Transcriptional Landscape of the Mammalian Genome. Science 309, 1559–1563 (2006).
6. Carninci, P. et al. Genome-wide analysis of mammalian promoter architecture and evolution. Nat Genet 38, 626–635 (2006).
7. Itoh, M. et al. Automated Workflow for Preparation of cDNA for Cap Analysis of Gene Expression on a Single Molecule Sequencer. PLoS ONE 7, e30809 (2012).
8. Kanamori-Katayama, M. et al. Unamplified Cap Analysis of Gene Expression on a single-molecule sequencer. Genome Res 21, 1150–1159 (2011).
9. The FANTOM Consortium and the Riken Omics Science Center. The transcriptional network that controls growth arrest and differentiation in a human myeloid leukemia cell line. Nat Genet 41, 553–562 (2009).
10. Taft, R. J. et al. Tiny RNAs associated with transcription start sites in animals. Nat Genet 41, 572–578 (2009).
11. Faulkner, G. J. et al. The regulated retrotransposon transcriptome of mammalian cells. Nat Genet 41, 563–571 (2009).
12. Ravasi, T. et al. An Atlas of Combinatorial Transcriptional Regulation in Mouse and Man. Cell 140, 744–752 (2010).
13. The FANTOM Consortium and the RIKEN PMI and CLST (DGT). A promoter-level mammalian expression atlas. Nature 507, 462–470 (2014).
14. Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
15. Arner, E. et al. Transcribed enhancers lead waves of coordinated transcription in transitioning mammalian cells. Science 347, 1010–1014 (2015).
16. Hasegawa, A., Daub, C., Carninci, P., Hayashizaki, Y. & Lassmann, T. MOIRAI: a compact workflow system for CAGE analysis. BMC Bioinformatics 15, 144 (2014).
17. Severin, J. et al. Interactive visualization and analysis of large-scale sequencing datasets using ZENBU. Nat Biotechnol 32, 217–219 (2014).
18. Lizio, M. et al. Gateways to the FANTOM5 promoter level mammalian expression atlas. Genome Biol 16, 22 (2015).
19. Pradhan, S. et al. Perlecan Domain IV Peptide Stimulates Salivary Gland Cell Assembly In Vitro. Tissue Eng Part A 15, 3309–3320 (2009).
20. Lee, W. J., Cha, H. W., Sohn, M. Y., Lee, S.-J. & Kim, D. W. Vitamin D increases expression of cathelicidin in cultured sebocytes. Arch Dermatol Res 304, 627–632 (2012).
21. Ohshima, M., Yamaguchi, Y., Micke, P., Abiko, Y. & Otsuka, K. In Vitro Characterization of the Cytokine Profile of the Epithelial Cell Rests of Malassez. J Periodontol 79, 912–919 (2008).
22. You, Y., Richer, E. J., Huang, T. & Brody, S. L. Growth and differentiation of mouse tracheal epithelial cells: selection of a proliferative population. Am J Physiol Lung Cell Mol Physiol 283, L1315–L1321 (2002).
23. Kajiya, K., Hirakawa, S., Ma, B., Drinnenberg, I. & Detmar, M. Hepatocyte growth factor promotes lymphatic vessel formation and function. EMBO J 24, 2885–2895 (2005).
24. Hori, S., Nomura, T. & Sakaguchi, S. Control of regulatory T cell development by the transcription factor Foxp3. Science 299, 1057–1061 (2003).
25. Djebali, S. et al. Landscape of transcription in human cells. Nature 489, 101–108 (2012).
26. Pruitt, K. D., Tatusova, T., Brown, G. R. & Maglott, D. R. NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy. Nucleic Acids Res 40, D130–D135 (2012).
27. Hsu, F. et al. The UCSC known genes. Bioinformatics 22, 1036–1046 (2006).
28. Harrow, J. et al. GENCODE: producing a reference annotation for ENCODE. Genome Biol 7 (Suppl 1): S4.1–S9 (2006).
29. Flicek, P. et al. Ensembl 2011. Nucleic Acids Res 39, 800–806 (2011).
30. Anders, S. & Huber, W. Differential expression analysis for sequence count data. Genome Biol 11, R106 (2010).
31. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
32. Rayner, T. F. et al. A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB. BMC Bioinformatics 7, 489 (2006).
33. Lizio, M. et al. Update of the FANTOM web resource: high resolution transcriptome of diverse cell types in mammals. Nucleic Acids Res 45, D737–D743 (2017).
34. Smedley, D. et al. The BioMart community portal: An innovative alternative to large, centralized data repositories. Nucleic Acids Res 43, W589–W598 (2015).
35. Abugessaisa, I. et al. FANTOM5 transcriptome catalog of cellular states based on Semantic MediaWiki. Database 2016, article ID baw105 (2016).
36. Speir, M. L. et al. The UCSC Genome Browser database: 2016 update. Nucleic Acids Res 44, D717–D725 (2016).
37. Raney, B. J. et al. Track data hubs enable visualization of user-defined genome-wide annotations on the UCSC Genome Browser. Bioinformatics 30, 1003–1005 (2014).

Data Citations

1. Noguchi, S. figshare https://doi.org/10.6084/m9.figshare.c.3728767 (2017)
2. DDBJ Sequence Read Archive DRA000991 (2013)
3. DDBJ Sequence Read Archive DRA001026 (2013)
4. DDBJ Sequence Read Archive DRA001027 (2013)
5. DDBJ Sequence Read Archive DRA001028 (2013)
6. DDBJ Sequence Read Archive DRA002216 (2014)
7. DDBJ Sequence Read Archive DRA002711 (2014)
8. DDBJ Sequence Read Archive DRA002747 (2014)
9. DDBJ Sequence Read Archive DRA002748 (2014)
10. LSDB Archive http://doi.org/10.18908/lsdba.nbdc01389-000.V002 (2016)

Acknowledgements

FANTOM5 was made possible by a Research Grant for the RIKEN Omics Science Center from MEXT to Y.H. and by a grant from the Innovative Cell Biology by Innovative Technology program (Cell Innovation Program) of MEXT, Japan, to Y.H. It was also supported by Research Grants for the RIKEN Preventive Medicine and Diagnosis Innovation Program to Y.H. and for the RIKEN Center for Life Science Technologies, Division of Genomic Technologies (from MEXT, Japan).

Author information

Authors and Affiliations

Division of Genomic Technologies, RIKEN Center for Life Science Technologies, Yokohama, 230-0045, Kanagawa, Japan
Shuhei Noguchi, Takahiro Arakawa, Masaaki Furuno, Akira Hasegawa, Fumi Hori, Sachi Ishikawa-Kato, Tsugumi Kawashima, Miki Kojima, Ri-ichiroh Manabe, Mitsuyoshi Murata, Sayaka Nagao-Sato, Hiromi Nishiyori-Sueki, Shohei Noma, Mizuho Sakai, Naoko Suzuki, Michihira Tagami, Shoko Watanabe, Alessandro Bonetti, Yuki Hasegawa, Yuri Ishizu, Jay W. Shin, Imad Abugessaisa, Erik Arner, Jayson Harshbarger, Atsushi Kondo, Timo Lassmann, Marina Lizio, Serkan Sahin, Jessica Severin, Harukazu Suzuki, Naoto Kondo, Masayoshi Itoh, Carsten O. Daub, Takeya Kasukawa, Hideya Kawaji, Piero Carninci & Alistair R.R. Forrest
RIKEN Omics Science Center, Yokohama, 230-0045, Kanagawa, Japan
Takahiro Arakawa, Shiro Fukuda, Masaaki Furuno, Akira Hasegawa, Fumi Hori, Sachi Ishikawa-Kato, Kaoru Kaida, Ai Kaiho, Mutsumi Kanamori-Katayama, Tsugumi Kawashima, Miki Kojima, Atsutaka Kubosaki, Ri-ichiroh Manabe, Mitsuyoshi Murata, Sayaka Nagao-Sato, Kenichi Nakazato, Noriko Ninomiya, Hiromi Nishiyori-Sueki, Shohei Noma, Eri Saijyo, Akiko Saka, Mizuho Sakai, Christophe Simon, Naoko Suzuki, Michihira Tagami, Shoko Watanabe, Shigehiro Yoshida, Alessandro Bonetti, Yuki Hasegawa, Yuri Ishizu, Sugata Roy, Alka Saxena, Jay W. Shin, Naoko Takahashi, Erik Arner, Jayson Harshbarger, Atsushi Kondo, Timo Lassmann, Marina Lizio, Serkan Sahin, Thierry Sengstag, Jessica Severin, Hisashi Shimoji, Masanori Suzuki, Harukazu Suzuki, Jun Kawai, Naoto Kondo, Masayoshi Itoh, Carsten O. Daub, Hideya Kawaji, Piero Carninci, Alistair R.R. Forrest & Yoshihide Hayashizaki
Department of Medicine, Karolinska Institutet, Stockholm, 141 86, Sweden
Peter Arner, Anna Ehrlund & Niklas Mejhert
Karolinska University Hospital, Center for Metabolism and Endocrinology, Stockholm, 141 86, Sweden
Peter Arner, Anna Ehrlund & Niklas Mejhert
Scottish Centre for Regenerative Medicine, University of Edinburgh, 5 Little France Drive, Edinburgh, EH16 4UU, UK
Richard A. Axton & Lesley M. Forrester
Department of Dermatology and Allergy, Charité University Medicine Berlin, Charitéplatz 1, Berlin, 10117, Germany
Magda Babina & Sven Guhl
The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, EH25 9RG, Midlothian, UK
J. Kenneth Baillie, Ailsa J. Carlisle, Lynsey Fairbairn, Malcolm E. Fisher, David A. Hume, Kim M. Summers & Andru Tomoiu
Australian Infectious Diseases Research Centre, The University of Queensland, St Lucia, 4072, QLD, Australia
Timothy C. Barnett & Kelly J. Hitchens
School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, 4072, QLD, Australia
Timothy C. Barnett
Bio-Rad Laboratories Pty Ltd, Hercules, 94547, California, USA
Anthony G. Beckhouse
The University of Queensland Diamantina Institute, The University of Queensland, Woolloongabba, 4102, QLD, Australia
Antje Blumenthal
IRCCS Fondazione Santa Lucia, Via del Fosso di Fiorano 64, Rome, 00143, Italy
Beatrice Bodega & Valerio Orlando
Australian Institute for Bioengineering and Nanotechnology (AIBN), University of Queensland, Brisbane, QLD 4072, St Lucia, Australia
James Briggs, Kelly J. Hitchens, Dmitry A. Ovchinnikov & Ernst Wolvetang
Division of Immunology, Institute of Infectious Diseases and Molecular Medicine (IDM), University of Cape Town, Anzio Road, Observatory 7925, Cape Town, South Africa
Frank Brombacher, Reto Guler, Suzana Savvi & Anita Schwegmann
Immunology of Infectious Diseases, Faculty of Health Sciences, South African Medical Research Council (SAMRC), University of Cape Town, Anzio Road, Observatory 7925, Cape Town, South Africa
Frank Brombacher, Reto Guler, Suzana Savvi & Anita Schwegmann
International Centre for Genetic Engineering and Biotechnology, Cape Town Component, Anzio Road, Observatory 7925, Cape Town, South Africa
Frank Brombacher, Reto Guler, Suzana Savvi & Anita Schwegmann
Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, Uppsalalaan 8, Utrecht, 3584 CT, The Netherlands
Hans C. Clevers & Marc van de Wetering
University Medical Centre Utrecht, Postbus 85500, Utrecht, 3508 GA, The Netherlands
Hans C. Clevers
Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, 11797, New York, USA
Carrie A. Davis & Thomas Gingeras
Institute of Pharmaceutical Sciences, ETH Zurich, Vladimir-Prelog-Weg 3, HCI H 303, Zurich, 8093, Switzerland
Michael Detmar & Sarah Klein
Gastroenterology, Research Center for Hepatitis and Immunology, Research Institute National Center for Global Health and Medicine, Ichikawa, 272-8516, Chiba, Japan
Taeko Dohi & Yuki I. Kawamura
Department of Otology and Laryngology, Harvard Medical School, Boston, 02114, Massachusetts, USA
Albert S.B. Edge & Judith S. Kempfle
Department of Internal Medicine III, University Hospital Regensburg, F.-J.-Strauss Allee 11, Regensburg, D-93053, Germany
Matthias Edinger, Michael Rehli & Christian Schmidl
RCI Regensburg Centre for Interventional Immunology, University Hospital Regensburg, F.-J.-Strauss Allee 11, Regensburg, D-93053, Germany
Matthias Edinger & Michael Rehli
Department of Biosciences and Nutrition, Karolinska Institutet, Halsovagen 7-9, Huddinge, SE-141 83, Sweden
Karl Ekwall, Juha Kere, Andreas Lennartsson & Carsten O. Daub
RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Kanagawa, Japan
Mitsuhiro Endoh, Jun-ichi Furusawa, Tomokatsu Ikawa, Hiroshi Kawamoto, Haruhiko Koseki, Shigeo Koyasu, Kazuyo Moro, Hiroshi Ohno & Mariko Okada-Hatakeyama
Laboratory for Neuronal Differentiation and Regeneration, RIKEN Center for Developmental Biology, Chuou-ku, 650-0047, Kobe, Japan
Hideki Enomoto, Mitsuru Morimoto & Yohei Yonekura
Department of Bioinformatics, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, 113-8510, Tokyo, Japan
Afsaneh Eslami & Hiroshi Tanaka
F.M. Kirby Neurobiology Center, Children's Hospital, Harvard Medical School, Boston, 02115, Massachusetts, USA
Michela Fagiolini
The University of Texas Health Science Center at Houston, Houston, 77251-1892, TX, USA
Mary C. Farach-Carson
Cancer Biology Program, Mater Medical Research Institute, South Brisbane, 4101, Queensland, Australia
Geoffrey J. Faulkner
Berlin Institute for Medical Systems Biology, Max Delbrueck Center, Robert Roessle Str.10, Berlin, 13125, Germany
Carmelo Ferrai, Kelly J. Morris & Ana Pombo
Department of Medical Biochemistry, Tohoku University Graduate School of Medicine, Sendai, 980-8575, Miyagi, Japan
Rie Fujita, Hironori Satoh, Jun Takai & Masayuki Yamamoto
Experimental Immunology, Academic Medical Center, University of Amsterdam, Meibergdreef 9, Amsterdam, 1105 AZ, The Netherlands
Teunis B. Geijtenbeek & Linda M. van den Berg
Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, V5Z 4H4, British Columbia, Canada
Daniel Goldowitz, Thomas J. Ha & Peter G. Zhang
Neuroscience, SISSA, Via Bonomea 265, Trieste, 34136, Italy
Stefano Gustincich & Silvia Zucchelli
Department of Neuroscience and Brain Technologies, Italian Institute of Technology, Via Morego 30, Genova, Italy
Stefano Gustincich
Department of Experimental Immunology, World Premier International Immunology Frontier Research Center, Osaka University, 565-0871, Suita, Osaka, Japan
Masahide Hamaguchi, Hiromasa Morikawa, Naganari Ohkura & Shimon Sakaguchi
RIKEN Center for Life Science Technologies, Wako, 351-0198, Saitama, Japan
Mitsuko Hara, Soichi Kojima & Xian-Yang Qin
Melanoma Research Center, The Wistar Institute, Philadelphia, 19104, Pennsylvania, USA
Meenhard Herlyn
German Center for Neurodegenerative Diseases (DZNE)-Tübingen, Otfried Müller Straße 23, Tübingen, 72076, Germany
Peter Heutink & Patrizia Rizzu
Laboratory Animal Research Center, Institute of Medical Science, The University of Tokyo, Minato-ku, 108-8639, Tokyo, Japan
Chieko Kai, Toshiyuki Nakamura, Hiroki Sato, Takaaki Sugiyama & Misako Yoneda
International Research Center for Infectious Diseases, Institute of Medical Science, The University of Tokyo, Minato-ku, 108-8639, Tokyo, Japan
Chieko Kai
Institute of Health and Biomedical Innovation, Queensland University of Technology, Translational Research Institute, Princess Alexandra Hospital, Brisbane, 4102, QLD, Australia
Tony J. Kenna
Department of Genetics and Molecular Medicine, King's College London, Guy’s St Thomas Street, London, UK
Juha Kere
Centre for Vascular Research, University of New South Wales, Sydney, 2052, New South Wales, Australia
Levon M. Khachigian & Margaret Patrikakis
Vascular Biology and Translational Research, School of Medical Sciences, University of New South Wales, Sydney, 2052, New South Wales, Australia
Levon M. Khachigian
Division of Cellular Therapy and Division of Stem Cell Signaling, Institute of Medical Science, University of Tokyo, Minato-ku, 108-8639, Tokyo, Japan
Toshio Kitamura & Fumio Nakahara
Harry Perkins Institute of Medical Research, Perth, 6009, WA, Australia
S. Peter Klinken, Louise N. Winteringham & Alistair R.R. Forrest
Respiratory Medicine, University of Nottingham, Hucknall Road, Nottingham, NG5 1PB, UK
Alan J. Knox
Dermatology, School of Medicine Kyungpook National University, Jung-gu, 41944, Daegu, Korea
Weonju Lee
Griffith University, Brisbane, 4111, Queensland, Australia
Alan Mackay-sim
Division of Functional Genomics and Systems Medicine, Research Center for Genomic Medicine, Saitama Medical University, Hidaka, 350-1241, Saitama, Japan
Yosuke Mizuno, Yutaka Nakachi & Yasushi Okazaki
Center for Radioisotope Sciences, Tohoku University Graduate School of Medicine, Sendai, 980-8575, Miyagi, Japan
Hozumi Motohashi & Hiroo Toyoda
Anatomy and Embryology, Leiden University Medical Center, Einthovenweg 20, P.O. Box 9600, Leiden, 2300 RC, The Netherlands
Christine L. Mummery & Robert Passier
Division of Translational Research, Research Center for Genomic Medicine, Saitama Medical University, Hidaka, 350-1241, Saitama, Japan
Yutaka Nakachi & Yasushi Okazaki
Cell Engineering Division, RIKEN BioResource Center, Tsukuba, 305-0074, Ibaraki, Japan
Yukio Nakamura
Department of Clinical Molecular Genetics, School of Pharmacy, Tokyo University of Pharmacy and Life Sciences, Hachioji, 192-0392, Tokyo, Japan
Tadasuke Nozaki
Department of Bioclinical Informatics, Tohoku Medical Megabank Organization, Tohoku University, Sendai, 980-8573, Miyagi, Japan
Soichi Ogishima
Department of Biochemistry, Ohu University School of Pharmaceutical Sciences, Koriyama, 963-8611, Fukushima, Japan
Mitsuhiro Ohshima
Institute for Protein Research, Osaka University, Suita, 565-0871, Osaka, Japan
Mariko Okada-Hatakeyama
Biological and Environmental Sciences and Engineering Division, Environmental Epigenetics Program, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
Valerio Orlando
University of Delaware, Newark, 19716, DE, USA
Swati Pradhan-Bhatt
Department of Forensic Medicine, Hjelt Institute, University of Helsinki, Kytosuontie 11, Helsinki, 003000, Finland
Antti Sajantila
Laboratorio Nazionale CIB, Padriciano 99, Trieste, 34149, Italy
Claudio Schneider
Department of Orthopedic, Trauma and Reconstructive Surgery, Charité Universitätsmedizin Berlin, Charitéplatz 1, Berlin, 10117, Germany
Gundula G. Schulze-Tanzil
International Research Center for Medical Sciences (IRCMS), Kumamoto University, Chuo-ku, 860-0811, Kumamoto, Japan
Guojun Sheng
Department of Clinical Study, Center for Advanced Medical Innovation, Kyushu University, Higashi-Ku, 812-8582, Fukuoka, Japan
Daisuke Sugiyama
Graduate School of Pharmaceutical Sciences, Nagoya University, Nagoya, 464-8601, Aichi, Japan
Hideki Tatsukawa
Laboratorio Nazionale del Consorzio Interuniversitario per le Biotecnologie (LNCIB), Padriciano 99, Trieste, 34149, Italy
Roberto Verardo
QIMR Berghofer Medical Research Institute, Brisbane, 4006, QLD, Australia
Dipti Vijayan
Department of Anatomy and Neuroscience, Centre for Stem Cell Systems, MDHS, University of Melbourne, Melbourne, 3010, VIC, Australia
Christine A. Wells
Department of Biochemistry, Nihon University School of Dentistry, Chiyoda-ku, 101-8310, Tokyo, Japan
Yoko Yamaguchi
Center for Clinical and Translational Research, Kyushu University Hospital, Higashi-Ku, 812-8582, Fukuoka, Japan
Chiyo Yanagi-Mizuochi
Telethon Kids Institute, the University of Western Australia, Perth, WA, Australia
Timo Lassmann
Preventive medicine and applied genomics unit, RIKEN Advanced Center for Computing and Communication, Yokohama, 230-0045, Kanagawa, Japan
Hisashi Shimoji & Hideya Kawaji
RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, 351-0198, Saitama, Japan
Jun Kawai, Masayoshi Itoh, Hideya Kawaji & Yoshihide Hayashizaki

Contributions

Samples were provided by P. Arner, R. Axton, M. Babina, J. Baillie, T. Barnett, A. Beckhouse, A. Blumenthal, B. Bodega, A. Bonetti, J. Briggs, F. Brombacher, A. Carlisle, H. Clevers, C. Davis, M. Detmar, T. Dohi, A. Edge, M. Edinger, A. Ehrlund, K. Ekwall, M. Endoh, H. Enomoto, A. Eslami, M. Fagiolini, L. Fairbairn, M. Farach-Carson, G. Faulkner, C. Ferrai, M. Fisher, L. Forrester, R. Fujita, J. Furusawa, T. Geijtenbeek, T. Gingeras, D. Goldowitz, S. Guhl, R. Guler, S. Gustincich, T. Ha, M. Hamaguchi, M. Hara, Y. Hasegawa, M. Herlyn, P. Heutink, K. Hitchens, D. Hume, T. Ikawa, Y. Ishizu, C. Kai, H. Kawamoto, Y. Kawamura, J. Kempfle, T. Kenna, J. Kere, L. Khachigian, T. Kitamura, S. Klein, S. Klinken, A. Knox, S. Kojima, H. Koseki, S. Koyasu, W. Lee, A. Lennartsson, A. Mackay-sim, N. Mejhert, Y. Mizuno, H. Morikawa, M. Morimoto, K. Moro, K. Morris, H. Motohashi, C. Mummery, Y. Nakachi, F. Nakahara, T. Nakamura, Y. Nakamura, T. Nozaki, S. Ogishima, N. Ohkura, H. Ohno, M. Ohshima, M. Okada-Hatakeyama, Y. Okazaki, V. Orlando, D. Ovchinnikov, R. Passier, M. Patrikakis, A. Pombo, S. Pradhan-Bhatt, X. Qin, M. Rehli, P. Rizzu, S. Roy, A. Sajantila, S. Sakaguchi, H. Sato, H. Satoh, S. Savvi, A. Saxena, C. Schmidl, C. Schneider, G. Schulze-Tanzil, A. Schwegmann, G. Sheng, J. Shin, D. Sugiyama, T. Sugiyama, K. Summers, N. Takahashi, J. Takai, H. Tanaka, H. Tatsukawa, A. Tomoiu, H. Toyoda, M. van de Wetering, L. van den Berg, R. Verardo, D. Vijayan, C. Wells, L. Winteringham, E. Wolvetang, Y. Yamaguchi, M. Yamamoto, C. Yanagi-Mizuochi, M. Yoneda, Y. Yonekura, P. Zhang, S. Zucchelli; CAGE data were produced by T. Arakawa, S. Fukuda, M. Furuno, A. Hasegawa, F. Hori, S. Ishikawa-Kato, K. Kaida, A. Kaiho, M. Kanamori-Katayama, T. Kawashima, M. Kojima, A. Kubosaki, R. Manabe, M. Murata, S. Nagao-Sato, K. Nakazato, N. Ninomiya, H. Nishiyori-Sueki, S. Noma, E. Saijyo, A. Saka, M. Sakai, C. Simon, N. Suzuki, M. Tagami, S. Watanabe, S. Yoshida; data quality was assessed by S. Noguchi, I. Abugessaisa, E. Arner, J. Harshbarger, A. Kondo, T. Lassmann, M. Lizio, S. Sahin, T. Sengstag, J. Severin, H. Shimoji, H. Kawaji, A. Forrest; the data description was prepared by S. Noguchi, T. Kasukawa, H. Kawaji; the project was organized by M. Suzuki, H. Suzuki, J. Kawai, N. Kondo, M. Itoh, C. Daub, T. Kasukawa, H. Kawaji, P. Carninci, A. Forrest, Y. Hayashizaki.

Corresponding author

Correspondence to Hideya Kawaji.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

ISA-Tab metadata

Supplementary information

Supplementary Fig. 1 (PDF 89 kb)

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the metadata files made available in this article.

About this article

Cite this article

Noguchi, S., Arakawa, T., Fukuda, S. et al. FANTOM5 CAGE profiles of human and mouse samples. Sci Data 4, 170112 (2017). https://doi.org/10.1038/sdata.2017.112

Received: 06 December 2016
Accepted: 25 April 2017
Published: 29 August 2017
DOI: https://doi.org/10.1038/sdata.2017.112
