Article

Random sequences are an abundant source of bioactive RNAs or peptides

  • Nature Ecology & Evolution 1, Article number: 0127 (2017)
  • doi:10.1038/s41559-017-0127

Abstract

It is generally assumed that new genes arise through duplication and/or recombination of existing genes. The probability that a new functional gene could arise out of random non-coding DNA is so far considered to be negligible, as it seems unlikely that such an RNA or protein sequence could have an initial function that influences the fitness of an organism. Here, we have tested this question systematically, by expressing clones with random sequences in Escherichia coli and subjecting them to competitive growth. Contrary to expectations, we find that random sequences with bioactivity are not rare. In our experiments we find that up to 25% of the evaluated clones enhance the growth rate of their cells and up to 52% inhibit growth. Testing of individual clones in competition assays confirms their activity and provides an indication that their activity could be exerted by either the transcribed RNA or the translated peptide. This suggests that transcribed and translated random parts of the genome could indeed have a high potential to become functional. The results also suggest that random sequences may become an effective new source of molecules for studying cellular functions, as well as for pharmacological activity screening.


Acknowledgements

We thank S. Künzel for sequencing and E. Özkurt for contributions during her rotation project. The project was financed through an ERC advanced grant to D.T. (NewGenes—322564).

Author information

Author notes

    • Rafik Neme & Cristina Amador

    Present addresses: Department of Biochemistry and Molecular Biophysics, Columbia University Medical Center, 1212 Amsterdam Avenue, New York, NY 10027, USA (R.N.); Technical University of Denmark, Department of Biotechnology and Biomedicine, 2800 Kgs Lyngby, Denmark (C.A.).

Affiliations

  1. Max-Planck Institute for Evolutionary Biology, August-Thienemannstrasse 2, 24306 Plön, Germany.

    • Rafik Neme, Cristina Amador, Burcin Yildirim, Ellen McConnell & Diethard Tautz

Contributions

R.N. and D.T. designed the experiment; C.A. constructed the library; C.A., B.Y. and E.M. conducted the experiments; R.N. performed the bioinformatic analysis; and R.N. and D.T. wrote the paper.

Competing interests

The work described in this publication is subject to a patent application by the Max-Planck Society.

Corresponding author

Correspondence to Diethard Tautz.

Supplementary information

PDF files

  1. Supplementary Figures: Supplementary Figures 1–3

Excel files

  1. Supplementary Table 1