Random sequences are an abundant source of bioactive RNAs or peptides

Abstract

It is generally assumed that new genes arise through duplication and/or recombination of existing genes. The probability that a new functional gene could arise out of random non-coding DNA is so far considered to be negligible, as it seems unlikely that such an RNA or protein sequence could have an initial function that influences the fitness of an organism. Here, we have tested this question systematically, by expressing clones with random sequences in Escherichia coli and subjecting them to competitive growth. Contrary to expectations, we find that random sequences with bioactivity are not rare. In our experiments we find that up to 25% of the evaluated clones enhance the growth rate of their cells and up to 52% inhibit growth. Testing of individual clones in competition assays confirms their activity and provides an indication that their activity could be exerted by either the transcribed RNA or the translated peptide. This suggests that transcribed and translated random parts of the genome could indeed have a high potential to become functional. The results also suggest that random sequences may become an effective new source of molecules for studying cellular functions, as well as for pharmacological activity screening.
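As a rough illustration of how a competitive-growth readout can be turned into "enhancing" or "inhibiting" calls, the sketch below compares each clone's relative frequency between two time points. It is a minimal sketch, not the authors' statistical pipeline: the function names, the pseudocount, the log2 threshold and the example counts are all assumptions made here for illustration.

```python
import math

def relative_frequencies(counts):
    """Convert per-clone counts into fractions of the total pool."""
    total = sum(counts.values())
    return {clone: n / total for clone, n in counts.items()}

def log2_fold_changes(before, after, pseudocount=0.5):
    """Log2 ratio of each clone's relative frequency (after / before).

    The pseudocount (an assumption of this sketch) avoids division by
    zero for clones that are absent at one of the time points.
    """
    clones = set(before) | set(after)
    b = {c: before.get(c, 0) + pseudocount for c in clones}
    a = {c: after.get(c, 0) + pseudocount for c in clones}
    fb = relative_frequencies(b)
    fa = relative_frequencies(a)
    return {c: math.log2(fa[c] / fb[c]) for c in clones}

def classify(lfc, threshold=0.5):
    """Label clones by the direction of their frequency change.

    The threshold is arbitrary; a real analysis would use replicates
    and a significance test rather than a fixed cutoff.
    """
    labels = {}
    for clone, value in lfc.items():
        if value > threshold:
            labels[clone] = "enriched"
        elif value < -threshold:
            labels[clone] = "depleted"
        else:
            labels[clone] = "unchanged"
    return labels

# Hypothetical read counts for three clones at two time points.
before = {"clone_A": 100, "clone_B": 100, "clone_C": 100}
after = {"clone_A": 200, "clone_B": 50, "clone_C": 100}

lfc = log2_fold_changes(before, after)
labels = classify(lfc)
```

Because the frequencies are relative, a clone that rises in the pool is outgrowing its competitors, not necessarily growing faster in absolute terms. This is why the competition assays on individual clones mentioned above serve as confirmation.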

Figure 1: Induction of expression through IPTG drives changes in clone frequency over time.
Figure 2: Examples of four clones with significant changes in frequency over time.
Figure 3: Assessment of read depth on detection power.
Figure 4: Expression of peptides.
Figure 5: Growth competition experiment with three selected clones.

Acknowledgements

We thank S. Künzel for sequencing and E. Özkurt for contributions during her rotation project. The project was financed through an ERC advanced grant to D.T. (NewGenes—322564).

Author information

Contributions

R.N. and D.T. designed the experiment, C.A. constructed the library, C.A., B.Y. and E.M. conducted the experiments, R.N. did the bioinformatic analysis, and R.N. and D.T. wrote the paper.

Corresponding author

Correspondence to Diethard Tautz.

Ethics declarations

Competing interests

The work described in this publication is subject to patent application by the Max-Planck Society.

Supplementary information

Supplementary Figures

Supplementary Figures 1–3 (PDF 661 kb)

Supplementary Table 1

Supplementary Table 1 (XLSX 116 kb)

About this article

Cite this article

Neme, R., Amador, C., Yildirim, B. et al. Random sequences are an abundant source of bioactive RNAs or peptides. Nat Ecol Evol 1, 0127 (2017). https://doi.org/10.1038/s41559-017-0127

  • DOI: https://doi.org/10.1038/s41559-017-0127
