Translation of neutrally evolving peptides provides a basis for de novo gene evolution

Abstract

Accumulating evidence indicates that some protein-coding genes have originated de novo from previously non-coding genomic sequences. However, the processes underlying de novo gene birth are still enigmatic. In particular, the appearance of a new functional protein seems highly improbable unless there is already a pool of neutrally evolving peptides that are translated at significant levels and that can at some point acquire new functions. Here, we use deep ribosome-profiling sequencing data, together with proteomics and single nucleotide polymorphism information, to search for these peptides. We find hundreds of open reading frames that are translated and that show no evolutionary conservation or selective constraints. These data suggest that the translation of these neutrally evolving peptides may be facilitated by the chance occurrence of open reading frames with a favourable codon composition. We conclude that the pervasive translation of the transcriptome provides plenty of material for the evolution of new functional proteins.

Fig. 1: Detection of translated ORFs.
Fig. 2: Identification of selection signatures.
Fig. 3: Three-nucleotide periodicity of translated ORFs.
Fig. 4: Factors influencing the translation of neutrally evolving ORFs.


Acknowledgements

We are grateful for valuable discussions with many colleagues during this study. This work was funded by grants BFU2012-36820, BFU2015-65235-P and TIN2015-69175-C4-3-R from Ministerio de Economía e Innovación (Spanish Government) and co-funded by FEDER (EC). We also received funding from Agència de Gestió d’Ajuts Universitaris i de Recerca Generalitat de Catalunya (AGAUR), grant no. 2014SGR1121.

Author information

Contributions

J.R.-O. and M.M.A. conceived the study, interpreted the data and wrote the paper. J.R.-O. performed most of the analyses, including the transcript assemblies, identification of translated ORFs, BLAST searches, SNP mapping and generation of controls. J.R.-O., P.V.-G. and J.L.V.-C. wrote the code and performed analyses on the coding score. X.M. wrote the code to calculate the expected SNP frequencies. M.M.A. coordinated the study.

Corresponding authors

Correspondence to Jorge Ruiz-Orera or M. Mar Albà.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Tables 1–6, Supplementary Figures 1–10

Life Sciences Reporting Summary

About this article

Cite this article

Ruiz-Orera, J., Verdaguer-Grau, P., Villanueva-Cañas, J.L. et al. Translation of neutrally evolving peptides provides a basis for de novo gene evolution. Nat Ecol Evol 2, 890–896 (2018). https://doi.org/10.1038/s41559-018-0506-6
