Detecting actively translated open reading frames in ribosome profiling data

Abstract

RNA-sequencing protocols can quantify gene expression regulation from transcription to protein synthesis. Ribosome profiling (Ribo-seq) maps the positions of translating ribosomes over the entire transcriptome. We have developed RiboTaper (available at https://ohlerlab.mdc-berlin.de/software/), a rigorous statistical approach that identifies translated regions on the basis of the characteristic three-nucleotide periodicity of Ribo-seq data. We used RiboTaper with deep Ribo-seq data from HEK293 cells to derive an extensive map of translation that covered open reading frame (ORF) annotations for more than 11,000 protein-coding genes. We also found distinct ribosomal signatures for several hundred upstream ORFs and ORFs in annotated noncoding genes (ncORFs). Mass spectrometry data confirmed that RiboTaper achieved excellent coverage of the cellular proteome. Although dozens of novel peptide products were validated in this manner, few of the currently annotated long noncoding RNAs appeared to encode stable polypeptides. RiboTaper is a powerful method for comprehensive de novo identification of actively used ORFs from Ribo-seq data.
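The three-nucleotide periodicity mentioned above can be illustrated with a small spectral calculation. The sketch below is not the RiboTaper implementation: it uses a plain discrete Fourier transform on a hypothetical per-nucleotide count track and reports the share of spectral power that falls at a period of three nucleotides. The function names (`power_at`, `periodicity_score`) and the input layout are assumptions made for illustration only, and the score is not a significance test.

```python
import cmath
import math


def power_at(counts, k):
    """Squared DFT magnitude of the mean-centred track at frequency k/n."""
    n = len(counts)
    mean = sum(counts) / n
    coeff = sum(
        (c - mean) * cmath.exp(-2j * math.pi * k * i / n)
        for i, c in enumerate(counts)
    )
    return abs(coeff) ** 2


def periodicity_score(counts):
    """Fraction of non-zero-frequency power found at a period of 3 nt.

    `counts` is a hypothetical list of read counts, one per nucleotide,
    whose length is a multiple of 3 so that frequency 1/3 is sampled exactly.
    """
    n = len(counts)
    if n < 3 or n % 3 != 0:
        raise ValueError("track length must be a positive multiple of 3")
    # Power at frequencies 1/n, 2/n, ..., up to one half cycle per nucleotide.
    powers = [power_at(counts, k) for k in range(1, n // 2 + 1)]
    total = sum(powers)
    if total == 0:
        return 0.0  # a perfectly flat track carries no periodic signal
    return powers[n // 3 - 1] / total


if __name__ == "__main__":
    in_frame = [10, 1, 0] * 20   # reads concentrated on one codon position
    flat = [5] * 60              # uniform coverage, no frame preference
    print(periodicity_score(in_frame))  # close to 1
    print(periodicity_score(flat))      # 0.0
```

A track with reads concentrated on one codon position scores near 1, while uniform coverage scores 0. Real data would need a proper statistical test on top of such a score, which this sketch deliberately leaves out.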

Figure 1: The RiboTaper workflow.
Figure 2: De novo ORF reconstruction and examples of detected ORFs.
Figure 3: Comparative analysis of translated ORFs from coding and noncoding regions.
Figure 4: Conserved and nonconserved RiboTaper-identified ORFs define the cellular proteome.

Accession codes

Primary accessions

Gene Expression Omnibus

Referenced accessions

Gene Expression Omnibus

Acknowledgements

L.C. sincerely thanks R. Marangoni (University of Pisa) for inspiring and fruitful discussions and A. Munteanu (MDC) for help with the supplementary figures. L.C. is funded by the MDC PhD program. B.O. acknowledges funding through a Delbrück fellowship at the MDC. N.M. acknowledges funding from EU Marie Curie IIF. N.M. and U.O. were supported by US National Institutes of Health grant R01GM104962.

Author information

Contributions

L.C. and U.O. developed the computational approach. L.C. implemented the RiboTaper method and analyzed the sequencing data, supervised by U.O. E.W., N.M. and A.H. performed the Ribo-seq experiments, supervised by M.L. B.O. carried out the evolutionary-conservation analysis and helped with the interpretation of the presented findings. H.Z., M.S. and L.C. analyzed the mass spectrometry data. L.C., N.M. and U.O. wrote the manuscript, with crucial input from all other authors.

Corresponding author

Correspondence to Uwe Ohler.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–17 (PDF 6623 kb)

Supplementary Table 1

Statistics about alignment and pre-processing of the sequencing datasets used in this study. (XLSX 5 kb)

Supplementary Table 2

Complete list of identified ORFs in the different datasets used. (XLSX 31374 kb)

Supplementary Table 3

Filtered list of identified ORFs in the different datasets used. Non-coding ORFs overlapping CDS regions were removed. Additionally, ORFs were filtered out when >30% of the Ribo-seq coverage were supported by multi-mapping reads only. (XLSX 29505 kb)

Supplementary Table 4

Evidence table for peptides identified by the RiboTaper database only. (XLSX 3115 kb)

Supplementary Data

Archive containing bed files for the identified ORFs. (ZIP 8432 kb)

Supplementary Software

RiboTaper (version 1.0) software code. (ZIP 69 kb)

About this article

Cite this article

Calviello, L., Mukherjee, N., Wyler, E. et al. Detecting actively translated open reading frames in ribosome profiling data. Nat Methods 13, 165–170 (2016). https://doi.org/10.1038/nmeth.3688

  • DOI: https://doi.org/10.1038/nmeth.3688
