Comprehensive comparative analysis of 5′-end RNA-sequencing methods

Adiconis, Xian; Haber, Adam L.; Simmons, Sean K.; Levy Moonshine, Ami; Ji, Zhe; Busby, Michele A.; Shi, Xi; Jacques, Justin; Lancaster, Madeline A.; Pan, Jen Q.; Regev, Aviv; Levin, Joshua Z.

doi:10.1038/s41592-018-0014-2

Analysis
Published: 04 June 2018

Xian Adiconis (1, 2; equal contribution),
Adam L. Haber (1; equal contribution),
Sean K. Simmons (2; equal contribution),
Ami Levy Moonshine (3),
Zhe Ji (1; ORCID: orcid.org/0000-0002-1809-8099),
Michele A. Busby (3),
Xi Shi (2),
Justin Jacques (2),
Madeline A. Lancaster (4; ORCID: orcid.org/0000-0003-2324-8853),
Jen Q. Pan (2),
Aviv Regev (1, 3, 5, 6) &
Joshua Z. Levin (1, 2; ORCID: orcid.org/0000-0002-0170-3598)

Nature Methods volume 15, pages 505–511 (2018)

An Author Correction to this article was published on 20 November 2018

This article has been updated

Abstract

Specialized RNA-seq methods are required to identify the 5′ ends of transcripts, which are critical for studies of gene regulation, but these methods have not been systematically benchmarked. We directly compared six such methods, including the performance of five methods on a single human cellular RNA sample and a new spike-in RNA assay that helps circumvent challenges resulting from uncertainties in annotation and RNA processing. We found that the ‘cap analysis of gene expression’ (CAGE) method performed best for mRNA and that most of its unannotated peaks were supported by evidence from other genomic methods. We applied CAGE to eight brain-related samples and determined sample-specific transcription start site (TSS) usage, as well as a transcriptome-wide shift in TSS usage between fetal and adult brain.

**Fig. 2: Sequence read performance metrics for 5′ end methods.**

**Fig. 3: TSS peak performance metrics.**

**Fig. 4: TSS discovery for unannotated CAGE peaks.**

**Fig. 5: Differential TSS usage in brain-related samples.**

**Fig. 6: Adult brain samples preferentially use more downstream TSSs.**

Change history

20 November 2018
The original version of this paper contained an incorrect primer sequence. In the Methods subsection “Rampage libraries,” the text for modification 3 stated that the reverse primer used for library indexing was 5′-CAAGCAGAAGACGGCATACGAGATXXXXXXXXGTGACTGGAGT-3′. The correct sequence of the oligonucleotide used is 5′-CAAGCAGAAGACGGCATACGAGATXXXXXXXXGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3′. This error has been corrected in the PDF and HTML versions of the paper.

Acknowledgements

We are grateful to M. Salit and J. McDaniel (National Institute of Standards and Technology, Gaithersburg, MD, USA) for ERCC spike-in RNA; P. Batut for sharing RAMPAGE peak-calling code; N. Shoresh for advice on epigenomics datasets; N. Sanjana for advice on preparing the NGN1/2 in vitro neuron sample; B. Haas, Y. Farjoun, and M. Hofree for statistical advice; L. Gaffney for assistance with figures; I. Wortman and C. Cheng for K-562 experiments; C. de Boer for helpful comments on this manuscript; and the Broad Genomics Platform for sequencing. We thank S. McCarroll for suggesting this research direction and helpful discussions in the early phases of this study. This work was supported by the Stanley Center for Psychiatric Research, the Klarman Cell Observatory, and the BRAIN Initiative (U01-MH105960-01 to A.R.).

Author information

These authors contributed equally: Xian Adiconis, Adam L. Haber, Sean K. Simmons.

Authors and Affiliations

Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Xian Adiconis, Adam L. Haber, Zhe Ji, Aviv Regev & Joshua Z. Levin
Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Xian Adiconis, Sean K. Simmons, Xi Shi, Justin Jacques, Jen Q. Pan & Joshua Z. Levin
Broad Institute of MIT and Harvard, Cambridge, MA, USA
Ami Levy Moonshine, Michele A. Busby & Aviv Regev
Laboratory of Molecular Biology, Medical Research Council, Cambridge, UK
Madeline A. Lancaster
Department of Biology, Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA, USA
Aviv Regev
The David H. Koch Institute for Integrative Cancer Research at Massachusetts Institute of Technology, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
Aviv Regev

Contributions

J.Z.L., X.A., and A.R. conceived the research. X.A. prepared the 5′-end RNA-seq libraries. J.J. prepared the standard RNA-seq library. X.S. prepared the in vitro neurons under the supervision of J.Q.P. M.A.L. prepared the brain organoid RNA. A.L.H., S.K.S., A.L.M., Z.J., and M.A.B. developed and performed computational analysis. J.Z.L., X.A., A.L.H., S.K.S., and A.R. wrote the paper. All of the authors edited the paper.

Corresponding authors

Correspondence to Aviv Regev or Joshua Z. Levin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Comparing lab methods of 5′-end sequencing for specific genes.

Sequencing coverage with five different lab methods for three highly expressed genes in K-562 cells. Shown is the scaled number of reads (y-axis) at each position in the genome (x-axis; top track). The bottom track shows the position of annotated exons (filled boxes) and introns (lines), with the direction of transcription shown by arrows based on the UCSC annotation. Plots were generated with IGV (Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011)).

Supplementary Figure 2 CapFilter improved peak-calling.

Sensitivity, precision, and F₁ scores (bars, y-axis) at varying levels of filtering by CapFilter (x-axis) for each of four lab methods. Each level corresponds to the minimum percentage of reads per peak that begin with an extra G (Online Methods).
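
As a rough illustration of the thresholding described in this caption, the sketch below keeps a peak only if at least a given percentage of its reads begin with an extra G. The function name, the dictionary keys (`reads_total`, `reads_extra_g`) and the example counts are assumptions made for illustration; this is not the CapFilter implementation, whose details are in the Online Methods.

```python
from typing import Dict, List


def filter_peaks_by_extra_g(peaks: List[Dict], min_percent: float) -> List[Dict]:
    """Keep peaks whose share of reads starting with an extra G is >= min_percent."""
    kept = []
    for peak in peaks:
        total = peak["reads_total"]
        if total == 0:
            continue  # a peak without reads cannot be evaluated, so it is skipped
        percent_extra_g = 100.0 * peak["reads_extra_g"] / total
        if percent_extra_g >= min_percent:
            kept.append(peak)
    return kept


# Hypothetical peaks with made-up read counts (not data from the paper).
example_peaks = [
    {"name": "peak_a", "reads_total": 100, "reads_extra_g": 40},
    {"name": "peak_b", "reads_total": 50, "reads_extra_g": 2},
    {"name": "peak_c", "reads_total": 0, "reads_extra_g": 0},
]

kept_names = [p["name"] for p in filter_peaks_by_extra_g(example_peaks, min_percent=10.0)]
print(kept_names)
```

Raising `min_percent` corresponds to moving along the x-axis of the figure: fewer peaks pass, which would typically trade sensitivity for precision.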

Supplementary Figure 3 Strand invasion and RAMPAGE filters did not improve peak-calling.

Sensitivity, precision, and F₁ scores (bars, y-axis) (a) at different levels of filtering with a strand invasion filter (Online Methods) and (b) for the RAMPAGE (with and without read 2) and ParaClu peak callers. In all cases CapFilter was used.

Supplementary Figure 4 Sensitivity with and without corroborative DNase-seq data.

Shown is the sensitivity (y-axis) for each method (x-axis). False negatives were defined either as all TSSs without overlapping 5′-end RNA-Seq peaks (“without” DNase-Seq) or as only the subset of those TSSs that overlap DNase-Seq peaks in K-562 cells (“with” DNase-Seq, the method used in Fig. 4). DNase-Seq data permit a better assessment of actual performance for K-562 cells than comparison to the UCSC annotation alone, which is compiled from diverse samples.
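
The two false-negative definitions in this caption can be sketched with the standard formulas sensitivity = TP / (TP + FN), precision = TP / (TP + FP) and F₁ = 2 · precision · sensitivity / (precision + sensitivity). The TSS identifiers, the set names and the counts below are hypothetical, and the exact way the paper counts true and false positives is described in the Online Methods rather than reproduced here.

```python
def safe_ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity."""
    total = precision + sensitivity
    return 2.0 * precision * sensitivity / total if total else 0.0


# Hypothetical identifiers, for illustration only.
annotated_tss = {"tss1", "tss2", "tss3", "tss4", "tss5"}  # TSSs in the annotation
tss_with_peak = {"tss1", "tss2"}                          # TSSs overlapped by a 5'-end peak
tss_with_dnase = {"tss1", "tss2", "tss3"}                 # TSSs overlapping a DNase-Seq peak
false_positive_peaks = 1                                  # peaks not at an annotated TSS

true_positives = len(annotated_tss & tss_with_peak)
missed_tss = annotated_tss - tss_with_peak

# "Without" DNase-Seq: every missed annotated TSS counts as a false negative.
fn_without = len(missed_tss)
# "With" DNase-Seq: only missed TSSs that show open chromatin count.
fn_with = len(missed_tss & tss_with_dnase)

sens_without = safe_ratio(true_positives, true_positives + fn_without)
sens_with = safe_ratio(true_positives, true_positives + fn_with)
precision = safe_ratio(true_positives, true_positives + false_positive_peaks)
f1_with = f1_score(precision, sens_with)

print(sens_without, sens_with, precision, f1_with)
```

Restricting false negatives to DNase-supported TSSs can only shrink the denominator, so the "with" sensitivity is never lower than the "without" value.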

Supplementary Figure 5 STRT performance is essentially independent of RNA input amount.

Sensitivity, precision, and F₁ score (y-axis) for STRT with RNA input amounts ranging from 10 ng to 10 μg. Also included to aid comparison are the STRT data shown in Fig. 3a (10 ng input).

Supplementary Figure 6 5′-end-method performance metrics with Gencode annotation.

Sensitivity, precision, and F₁ score (y-axis) for each lab method (x-axis) relative to the Gencode annotation.

Supplementary Figure 7 Performance of the 5′-end methods in published datasets.

Sensitivity, precision, and F₁ score (y-axis) for each lab method (x-axis). Comparison of (a) CAGE (replicates A and B) to RAMPAGE for K-562, (b) CAGE to Oligo capping for MCF-7, and (c) CAGE to STRT for mouse hippocampus. CAGE performed better than other methods in these comparisons.

Supplementary Figure 8 TSS initiator sequences.

For each method, shown are the nucleotides immediately before (−1 position) and after (+1 position) the dominant TSS for each tag cluster (TC). The results are displayed as sequence logos for (a) broad TCs and (b) narrow TCs. Although the methods differ in their nucleotide distributions, all show a preference for a pyrimidine at position −1 and a purine at position +1, as has been found previously.
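
The pyrimidine/purine preference mentioned in this caption can be checked with a simple tally. The sketch below assumes each dominant TSS has already been reduced to a two-letter string holding the −1 and +1 nucleotides on the transcribed strand; the function names and the example dinucleotides are hypothetical and are not taken from the paper.

```python
from typing import Iterable

PYRIMIDINES = {"C", "T"}
PURINES = {"A", "G"}


def is_yr_initiator(dinucleotide: str) -> bool:
    """True if position -1 is a pyrimidine and position +1 is a purine."""
    if len(dinucleotide) != 2:
        raise ValueError("expected exactly two nucleotides (-1 and +1)")
    minus1, plus1 = dinucleotide.upper()
    return minus1 in PYRIMIDINES and plus1 in PURINES


def yr_fraction(dinucleotides: Iterable[str]) -> float:
    """Fraction of TSSs whose (-1, +1) dinucleotide is pyrimidine-purine."""
    items = list(dinucleotides)
    if not items:
        return 0.0
    return sum(is_yr_initiator(d) for d in items) / len(items)


# Made-up (-1, +1) dinucleotides for six dominant TSSs.
example_dinucleotides = ["CA", "TG", "CA", "GG", "TA", "AC"]
fraction = yr_fraction(example_dinucleotides)
print(fraction)
```

A sequence logo conveys the same information per position; this tally only summarizes how often the two preferences co-occur.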

Supplementary Figure 9 Reproducibility of 5′-end methods.

(a) Shared peaks across CAGE replicates. Shown is the proportion of shared peaks. Main-1, Main-4, and Main-6 were processed in the same batch. (b) Normalized coverage by position for CAGE, RAMPAGE, and STRT replicates. For each library, shown is the average relative coverage (y-axis) at each relative position along the transcripts’ length (x-axis).

Supplementary Figure 10 Correlation of gene expression levels.

Shown are scatter plots for an all-versus-all comparison of gene expression levels (ln(TPM+1)) for (a) CAGE replicates, (b) RAMPAGE replicates, (c) STRT replicates, and (d) each 5′-end method and standard RNA-Seq. Points are colored based on their normalized density (Online Methods). Pearson's r is shown for each comparison. Sample size for each method: n = 1 library per replicate or method, except that CAGE in (d) is a combination of 3 libraries.

Supplementary Figure 11 TSS discovery for unannotated peaks.

(a,b) Corroborative data for TSS peaks from all methods. Shown are the proportion (a) and number (b) of peaks (y axis) with support from each corroborative data source (color legend) for peaks initially defined as ‘true positive’, ‘false positive’ and ‘intergenic’ based on the UCSC annotation. (a) Peaks were assigned to only one category of support as in Fig. 4a. (b) Peaks were assigned to as many corroborative categories as evidence supported as in Fig. 4b.

Supplementary Figure 12 Corroborative evidence for 5′ ends identified by standard RNA-seq.

Venn diagram showing TSS prediction with Standard RNA-Seq, DNase-Seq and H3K4me3 ChIP-Seq data. Numbers of peaks shown here in overlapping categories correspond to RNA-Seq peaks for all overlaps involving RNA-Seq peaks and DNase-Seq peaks in the overlap with only H3K4me3 ChIP-Seq peaks. For each subset of RNA-Seq peaks, we also show the % true positives (TPs) out of all the RNA-Seq peaks in that category. Areas not to scale.

Supplementary Figure 13 Correlation of CAGE-based gene expression for brain-related samples.

Heatmap showing the Pearson correlation of expression levels based on ln(TPM+1) between each pair of brain-related samples. Correlation was calculated using all genes expressed in at least one sample. The associated hierarchical clustering is displayed above and to the left of the heatmap. Sample size for each method: n = 1 library per sample.
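
A minimal sketch of the correlation described in this caption, for one pair of samples: genes are kept if they are expressed in at least one of the two samples, TPM values are transformed as ln(TPM+1), and Pearson's r is computed on the transformed values. Treating "expressed" as TPM > 0 is an assumption, and the TPM values are invented for illustration.

```python
import math
from typing import List, Sequence


def log_tpm(values: Sequence[float]) -> List[float]:
    """Apply the ln(TPM + 1) transform."""
    return [math.log(v + 1.0) for v in values]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented TPM values for five genes in two samples.
sample_a = [0.0, 5.0, 20.0, 100.0, 0.0]
sample_b = [0.0, 6.0, 18.0, 90.0, 0.0]

# Keep genes expressed (assumed here to mean TPM > 0) in at least one sample.
keep = [i for i, (a, b) in enumerate(zip(sample_a, sample_b)) if a > 0 or b > 0]
expr_a = log_tpm([sample_a[i] for i in keep])
expr_b = log_tpm([sample_b[i] for i in keep])

r = pearson_r(expr_a, expr_b)
print(round(r, 3))
```

Repeating this for every pair of samples gives the symmetric matrix that the heatmap displays; the hierarchical clustering step is not sketched here.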

About this article

Cite this article

Adiconis, X., Haber, A.L., Simmons, S.K. et al. Comprehensive comparative analysis of 5′-end RNA-sequencing methods. Nat Methods 15, 505–511 (2018). https://doi.org/10.1038/s41592-018-0014-2

Received: 18 September 2017
Accepted: 10 April 2018
Published: 04 June 2018
Issue Date: July 2018
DOI: https://doi.org/10.1038/s41592-018-0014-2
