Differential analysis of gene regulation at transcript resolution with RNA-seq

Abstract

Differential analysis of gene and transcript expression using high-throughput RNA sequencing (RNA-seq) is complicated by several sources of measurement variability and poses numerous statistical challenges. We present Cuffdiff 2, an algorithm that estimates expression at transcript-level resolution and controls for variability evident across replicate libraries. Cuffdiff 2 robustly identifies differentially expressed transcripts and genes and reveals differential splicing and promoter-preference changes. We demonstrate the accuracy of our approach through differential analysis of lung fibroblasts in response to loss of the developmental transcription factor HOXA1, which we show is required for lung fibroblast and HeLa cell cycle progression. Loss of HOXA1 results in significant expression level changes in thousands of individual transcripts, along with isoform switching events in key regulators of the cell cycle. Cuffdiff 2 performs robust differential analysis in RNA-seq experiments at transcript resolution, revealing a layer of regulation not readily observable with other high-throughput technologies.

Figure 1: Changes in fragment count for a gene do not necessarily equal a change in expression.
Figure 2: An overview of the Cuffdiff 2 approach to isoform-level differential analysis of RNA-seq data.
Figure 3: Comparison of Cuffdiff 2 with other expression platforms.
Figure 4: Accuracy of Cuffdiff 2 over varied experimental designs.
Figure 5: Changes in expression of cell cycle regulatory genes in response to HOXA1 knockdown.
Figure 6: Cell cycle analysis after HOXA1 knockdown.
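
The point made in the Figure 1 caption can be illustrated with a small worked example. The sketch below assumes the common simplification that the expected number of fragments from a transcript is proportional to its abundance times its length. The isoform lengths, copy numbers, and the FRAGMENTS_PER_KB_PER_COPY constant are hypothetical values chosen for illustration, and this is not Cuffdiff 2's actual statistical model.

```python
# Hypothetical gene with two isoforms: a short one (1 kb) and a long one (3 kb).
ISOFORM_LENGTHS_KB = [1, 3]

# Assumed sequencing yield: fragments per kilobase per transcript copy.
FRAGMENTS_PER_KB_PER_COPY = 10


def expected_gene_fragments(copies_per_isoform):
    """Expected fragment count for the gene, summed over its isoforms.

    Each isoform contributes (copies x length x yield) fragments, so
    longer isoforms produce more fragments per molecule.
    """
    return sum(
        copies * length_kb * FRAGMENTS_PER_KB_PER_COPY
        for copies, length_kb in zip(copies_per_isoform, ISOFORM_LENGTHS_KB)
    )


# Condition A: mostly the short isoform. Condition B: mostly the long one.
# The total number of transcript molecules (gene expression) is 100 in both.
condition_a = [90, 10]
condition_b = [10, 90]

fragments_a = expected_gene_fragments(condition_a)
fragments_b = expected_gene_fragments(condition_b)

print("total copies:", sum(condition_a), "vs", sum(condition_b))
print("gene fragments:", fragments_a, "vs", fragments_b)
```

Under these assumptions the gene's fragment count more than doubles between conditions even though the number of transcript molecules is unchanged, which is why a raw count comparison can mistake an isoform switch for a change in gene expression.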

Accession codes

Primary accessions

Gene Expression Omnibus

Acknowledgements

We are grateful to D. Kelley for a careful reading of the manuscript, and B. Wold for sharing the hESC RNA-seq data. We are also thankful for the ongoing development efforts of A. Roberts, B. Langmead, D. Kim, G. Pertea, H. Pimentel and S. Salzberg. C.T. and D.G.H. are Damon Runyon Postdoctoral Fellows. J.L.R. is a Damon Runyon-Rachleff Innovator Fellow. This work was supported by US National Institutes of Health grants DP2OD006670, P01GM099117, P50HG006193 and R01ES020260 (to J.L.R.) and R01HG006129 and R01DK094699 (to L.P.).

Author information

Contributions

C.T. and L.P. developed the mathematics and statistics. D.G.H. and M.S. performed the experiments. D.G.H. and C.T. designed the experiments and performed the analysis. C.T. and L.G. implemented the software. L.P., J.L.R., D.G.H. and C.T. conceived the research. All authors wrote and approved the manuscript.

Corresponding authors

Correspondence to John L Rinn or Lior Pachter.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–87 and Supplementary Tables 1–3 (PDF 21617 kb)

About this article

Cite this article

Trapnell, C., Hendrickson, D., Sauvageau, M. et al. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol 31, 46–53 (2013). https://doi.org/10.1038/nbt.2450

  • DOI: https://doi.org/10.1038/nbt.2450
