Abstract
RNA-sequencing protocols can quantify gene expression regulation from transcription to protein synthesis. Ribosome profiling (Ribo-seq) maps the positions of translating ribosomes over the entire transcriptome. We have developed RiboTaper (available at https://ohlerlab.mdc-berlin.de/software/), a rigorous statistical approach that identifies translated regions on the basis of the characteristic three-nucleotide periodicity of Ribo-seq data. We used RiboTaper with deep Ribo-seq data from HEK293 cells to derive an extensive map of translation that covered open reading frame (ORF) annotations for more than 11,000 protein-coding genes. We also found distinct ribosomal signatures for several hundred upstream ORFs and ORFs in annotated noncoding genes (ncORFs). Mass spectrometry data confirmed that RiboTaper achieved excellent coverage of the cellular proteome. Although dozens of novel peptide products were validated in this manner, few of the currently annotated long noncoding RNAs appeared to encode stable polypeptides. RiboTaper is a powerful method for comprehensive de novo identification of actively used ORFs from Ribo-seq data.
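As a rough illustration of the periodicity idea, the sketch below scores how much of a footprint profile's spectral power sits at a period of three nucleotides. It is not RiboTaper's implementation: it swaps in a plain discrete Fourier transform for the tool's statistical test, and the function names, the per-nucleotide P-site count input, and the example profiles are all hypothetical.

```python
import cmath
import math


def dft_power(signal, k):
    """Squared magnitude of the k-th discrete Fourier coefficient of `signal`.

    The mean is subtracted first so that overall coverage depth (the k = 0
    term) does not leak into the result.
    """
    n = len(signal)
    mean = sum(signal) / n
    coeff = sum((x - mean) * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(signal))
    return abs(coeff) ** 2


def periodicity_score(psite_counts):
    """Fraction of spectral power at a period of 3 nt, between 0 and 1.

    `psite_counts` is an assumed input: one read count per nucleotide of a
    candidate ORF, already assigned to ribosomal P-site positions.
    """
    # Trim to a multiple of 3 so that 1/3 cycles per nt is an exact
    # Fourier frequency of the trimmed signal.
    usable = len(psite_counts) - len(psite_counts) % 3
    if usable < 3:
        return 0.0
    signal = list(psite_counts[:usable])
    # Power at every positive frequency up to the Nyquist limit.
    powers = [dft_power(signal, k) for k in range(1, usable // 2 + 1)]
    total = sum(powers)
    if total == 0:
        return 0.0  # perfectly flat profile: no periodic signal at all
    # Index usable // 3 corresponds to a period of exactly 3 nt.
    return powers[usable // 3 - 1] / total


# Hypothetical profiles: one in-frame (period 3), one with period 4.
in_frame = [10, 1, 0] * 20
off_period = [10, 1, 0, 3] * 15
print(periodicity_score(in_frame))
print(periodicity_score(off_period))
```

The in-frame profile scores close to 1 and the period-4 profile close to 0. A real analysis would also need a significance test and some handling of sparse or noisy coverage, which this sketch leaves out.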
Acknowledgements
L.C. sincerely thanks R. Marangoni (University of Pisa) for inspiring and fruitful discussions and A. Munteanu (MDC) for help with the supplementary figures. L.C. is funded by the MDC PhD program. B.O. acknowledges funding through a Delbrück fellowship at the MDC. N.M. acknowledges funding from EU Marie Curie IIF. N.M. and U.O. were supported by US National Institutes of Health grant R01GM104962.
Author information
Contributions
L.C. and U.O. developed the computational approach. L.C. implemented the RiboTaper method and analyzed the sequencing data, supervised by U.O. E.W., N.M. and A.H. performed the Ribo-seq experiments, supervised by M.L. B.O. carried out the evolutionary-conservation analysis and helped with the interpretation of the presented findings. H.Z., M.S. and L.C. analyzed the mass spectrometry data. L.C., N.M. and U.O. wrote the manuscript, with crucial input from all other authors.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–17 (PDF 6623 kb)
Supplementary Table 1
Statistics about alignment and pre-processing of the sequencing datasets used in this study. (XLSX 5 kb)
Supplementary Table 2
Complete list of identified ORFs in the different datasets used. (XLSX 31374 kb)
Supplementary Table 3
Filtered list of identified ORFs in the different datasets used. Non-coding ORFs overlapping CDS regions were removed. Additionally, ORFs were filtered out when >30% of the Ribo-seq coverage was supported by multi-mapping reads only. (XLSX 29505 kb)
Supplementary Table 4
Evidence table for peptides identified by the RiboTaper database only. (XLSX 3115 kb)
Supplementary Data
Archive containing bed files for the identified ORFs. (ZIP 8432 kb)
Supplementary Software
RiboTaper (version 1.0) software code. (ZIP 69 kb)
About this article
Cite this article
Calviello, L., Mukherjee, N., Wyler, E. et al. Detecting actively translated open reading frames in ribosome profiling data. Nat Methods 13, 165–170 (2016). https://doi.org/10.1038/nmeth.3688
DOI: https://doi.org/10.1038/nmeth.3688
This article is cited by
- Translational landscape of direct cardiac reprogramming reveals a role of Ybx1 in repressing cardiac fate acquisition. Nature Cardiovascular Research (2023)
- A novel tumor suppressor encoded by a 1p36.3 lncRNA functions as a phosphoinositide-binding protein repressing AKT phosphorylation/activation and promoting autophagy. Cell Death & Differentiation (2023)
- A hidden translatome in tumors—the coding lncRNAs. Science China Life Sciences (2023)
- Translational activation of ribosome-related genes at initial photoreception is dependent on signals derived from both the nucleus and the chloroplasts in Arabidopsis thaliana. Journal of Plant Research (2023)
- Shining in the dark: the big world of small peptides in plants. aBIOTECH (2023)