Classification and function of small open reading frames

This article has been updated

Key Points

  • Small peptides of 100 amino acids or fewer are encoded by small open reading frames (smORFs) and mediate key physiological functions in animals and humans.

  • smORFs constitute 99% of transcribed, but only 1% of annotated, coding sequences in flies, mice and humans.

  • Different smORF classes show distinctive and predictive markers of functionality at the RNA level and the protein sequence level.

  • The characteristics of different smORF classes are evolutionarily conserved across animal species, encouraging the use of Drosophila melanogaster and Mus musculus as model organisms for studies of peptide biology in the context of development, physiology and disease.

  • Different smORF classes may represent steps in the origin and evolution of new genes and proteins.

Abstract

Small open reading frames (smORFs) of 100 codons or fewer are usually — if arbitrarily — excluded from proteome annotations. Despite this, the genomes of many metazoans, including humans, contain millions of smORFs, some of which fulfil key physiological functions. Recently, the transcriptome of Drosophila melanogaster was shown to contain thousands of smORFs of different classes that actively undergo translation, which produces peptides of mostly unknown function. Here, we present a comprehensive analysis of smORFs in flies, mice and humans. We propose the existence of several functional classes of smORFs, ranging from inert DNA sequences to transcribed and translated cis-regulators of translation and peptides with a propensity to function as regulators of membrane-associated proteins, or as components of ancient protein complexes in the cytoplasm. We suggest that the different smORF classes could represent steps in gene, peptide and protein evolution. Our analysis introduces a distinction between different peptide-coding classes of smORFs in animal genomes, and highlights the role of model organisms for the study of small peptide biology in the context of development, physiology and human disease.
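The length criterion above (100 codons or fewer) can be illustrated with a minimal sketch that scans the forward strand of a DNA sequence for short ORFs. It is not the authors' pipeline. The function name `find_smorfs`, the `min_codons` parameter, the restriction to ATG start codons and the choice to exclude the stop codon from the codon count are all assumptions made for illustration.

```python
# Hypothetical helper: enumerate forward-strand smORFs in a DNA string.
# Assumptions (not from the article): ATG-only starts, standard stop codons,
# and a codon count that excludes the stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(seq, max_codons=100, min_codons=1):
    """Return (start, end, n_codons) tuples for ORFs with at most max_codons codons.

    start is the 0-based index of the A in ATG; end is the index just past
    the stop codon. Only the three forward reading frames are scanned.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None  # position of the first in-frame ATG since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # sense codons, stop excluded
                if min_codons <= n_codons <= max_codons:
                    hits.append((start, i + 3, n_codons))
                start = None  # resume the search after this stop codon
    return hits


if __name__ == "__main__":
    # ATG AAA TTT TAG in the third reading frame: one 3-codon smORF.
    print(find_smorfs("CCATGAAATTTTAGGG"))
```

A real survey would also scan the reverse strand and work on annotated transcripts rather than raw sequence. The sketch only shows how the 100-codon cut-off separates smORFs from longer ORFs.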

Figure 1: Properties of smORFs in fruitflies and mammals.
Figure 2: Conservation of smORF numbers and lengths in animals.
Figure 3: RNA characteristics of transcribed smORFs.
Figure 4: Functions of smORF-encoded peptides.
Figure 5: Coding features of smORFs.
Figure 6: Stepwise evolution of smORFs towards proteins.
Figure 7: Evidence of smORF evolution.

Change history

  • 01 August 2017

    The original online version of this article contained four errors, which have now been corrected. The corrections included two typos in the main text, the addition of a missing point in the X axis in Figure 3b, and the exchange of the position of two column headers in Figure 5c.

Acknowledgements

The authors thank their colleagues J. Pueyo, E. Magny, S. Bishop and F. Casares for helpful suggestions about the manuscript. This work was funded by grants from the Spanish Ministerio de Economía, Industria y Competitividad (MINECO; ref. BFU/2016-77793-P) and the British Biotechnology and Biological Sciences Research Council (BBSRC; ref. BB/N001753/1) to J.-P.C.

Author information

Corresponding author

Correspondence to Juan-Pablo Couso.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Glossary

Ribosome profiling

A technique that globally probes RNA molecules that are being actively translated by ribosomes by analysing ribosome-protected RNA fragments (ribosomal footprints).

Translation efficiency

A measure of the rate of translation for a given mRNA feature, obtained in ribosome profiling experiments. It usually consists of the ratio between ribosomal footprints and RNA sequencing reads generated by the mRNA region.
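The ratio described in this entry can be written out as a short sketch. The glossary specifies only a ratio of ribosomal footprints to RNA sequencing reads for a region. The per-kilobase, per-million-reads (RPKM-style) normalization used here, and the function and parameter names, are assumptions added so that the two libraries can be compared.

```python
# Hypothetical helper: translation efficiency (TE) for one mRNA region.
# Assumption (not from the article): both read counts are normalized to
# reads per kilobase per million mapped reads before taking the ratio.
def translation_efficiency(footprint_reads, rna_reads, region_length_nt,
                           total_footprint_reads, total_rna_reads):
    """Return footprint density divided by RNA-seq density for a region."""
    if rna_reads <= 0:
        raise ValueError("TE is undefined without RNA-seq coverage")
    if min(region_length_nt, total_footprint_reads, total_rna_reads) <= 0:
        raise ValueError("lengths and library sizes must be positive")

    def rpkm(reads, library_size):
        # reads per kilobase of region per million mapped reads
        return reads * 1e9 / (region_length_nt * library_size)

    return (rpkm(footprint_reads, total_footprint_reads)
            / rpkm(rna_reads, total_rna_reads))


if __name__ == "__main__":
    # 50 footprints in a library of 1 million reads versus 100 RNA-seq reads
    # in a library of 2 million reads gives equal densities.
    print(translation_efficiency(50, 100, 300, 1_000_000, 2_000_000))
```

Because the same region length appears in both densities, it cancels in the ratio. It is kept in the sketch so that the intermediate densities stay interpretable on their own.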

Protein isoforms

Variants of a given protein generated by the translation of alternative mRNA sequences, in distinct mRNAs produced by the same gene.

ORF tagging

A technique to probe the translation of a specific open reading frame (ORF), whereby a reporter sequence without a start codon is cloned in-frame with the assessed ORF.

Helix–loop–helix (HLH)

A DNA-binding domain that characterizes members of a transcription factor family. It is composed of two α-helices connected by a short loop.

Pseudogene

A paralogue of a functional protein-coding gene, which has lost its gene expression and/or protein-coding capacities.

Paralogue

A homologous gene within a given species, usually generated by gene duplication.

Cite this article

Couso, JP., Patraquim, P. Classification and function of small open reading frames. Nat Rev Mol Cell Biol 18, 575–589 (2017). https://doi.org/10.1038/nrm.2017.58
