Random sequences are an abundant source of bioactive RNAs or peptides

Neme, Rafik; Amador, Cristina; Yildirim, Burcin; McConnell, Ellen; Tautz, Diethard

doi:10.1038/s41559-017-0127

Article
Published: 24 April 2017

Rafik Neme, Cristina Amador, Burcin Yildirim, Ellen McConnell & Diethard Tautz (ORCID: orcid.org/0000-0002-0460-5344)

Nature Ecology & Evolution volume 1, Article number: 0127 (2017)

Abstract

It is generally assumed that new genes arise through duplication and/or recombination of existing genes. The probability that a new functional gene could arise out of random non-coding DNA has so far been considered negligible, as it seems unlikely that such an RNA or protein sequence could have an initial function that influences the fitness of an organism. Here, we have addressed this question systematically by expressing clones with random sequences in Escherichia coli and subjecting them to competitive growth. Contrary to expectations, we find that random sequences with bioactivity are not rare. In our experiments, up to 25% of the evaluated clones enhance the growth rate of their cells and up to 52% inhibit growth. Testing of individual clones in competition assays confirms their activity and indicates that it could be exerted by either the transcribed RNA or the translated peptide. This suggests that transcribed and translated random parts of the genome could indeed have a high potential to become functional. The results also suggest that random sequences may become an effective new source of molecules for studying cellular functions, as well as for pharmacological activity screening.

**Figure 1: Induction of expression through IPTG drives changes in clone frequency over time.**

**Figure 2: Examples of four clones with significant changes in frequency over time.**

**Figure 3: Assessment of read depth on detection power.**

**Figure 5: Growth competition experiment with three selected clones.**

Acknowledgements

We thank S. Künzel for sequencing and E. Özkurt for contributions during her rotation project. The project was financed through an ERC advanced grant to D.T. (NewGenes—322564).

Author information

Rafik Neme & Cristina Amador
Present addresses: Department of Biochemistry and Molecular Biophysics, Columbia University Medical Center, 1212 Amsterdam Avenue, New York, NY 10027, USA (R.N.); Technical University of Denmark, Department of Biotechnology and Biomedicine, 2800 Kgs Lyngby, Denmark (C.A.).

Authors and Affiliations

Max-Planck Institute for Evolutionary Biology, August-Thienemannstrasse 2, Plön, 24306, Germany.
Rafik Neme, Cristina Amador, Burcin Yildirim, Ellen McConnell & Diethard Tautz

Contributions

R.N. and D.T. designed the experiment, C.A. constructed the library, C.A., B.Y. and E.M. conducted the experiments, R.N. did the bioinformatic analysis, and R.N. and D.T. wrote the paper.

Corresponding author

Correspondence to Diethard Tautz.

Ethics declarations

Competing interests

The work described in this publication is subject to a patent application by the Max-Planck Society.

Supplementary information

Supplementary Figures 1–3 (PDF 661 kb)

Supplementary Table 1 (XLSX 116 kb)

About this article

Cite this article

Neme, R., Amador, C., Yildirim, B. et al. Random sequences are an abundant source of bioactive RNAs or peptides. Nat Ecol Evol 1, 0127 (2017). https://doi.org/10.1038/s41559-017-0127

Received: 22 October 2016
Accepted: 01 March 2017
Published: 24 April 2017
DOI: https://doi.org/10.1038/s41559-017-0127

This article is cited by

Selection of a de novo gene that can promote survival of Escherichia coli by modulating protein homeostasis pathways
- Idan Frumkin
- Michael T. Laub
Nature Ecology & Evolution (2023)

Statistical analysis of synonymous and stop codons in pseudo-random and real sequences as a function of GC content
- Valentin Wesp
- Günter Theißen
- Stefan Schuster
Scientific Reports (2023)

Evolution and implications of de novo genes in humans
- Luuk A. Broeils
- Jorge Ruiz-Orera
- Sebastiaan van Heesch
Nature Ecology & Evolution (2023)

Experimental characterization of de novo proteins and their unevolved random-sequence counterparts
- Brennen Heames
- Filip Buchel
- Klára Hlouchová
Nature Ecology & Evolution (2023)

Selection in a growing colony biases results of mutation accumulation experiments
- Anjali Mahilkar
- Namratha Raj
- Supreet Saini
Scientific Reports (2022)