The emergence of new genes from non-coding DNA is common across eukaryotes, and how these genes contribute to adaptive evolutionary novelty is a fascinating question.
There are multiple ways by which new genes with new functions emerge. The traditional view includes mechanisms that modify functional sequences, such as duplication, horizontal transfer, gene fusion and fission, and other creative ways of tinkering with pre-existing genes. Although the idea was initially received with scepticism, it has become clear over the last five to ten years that de novo gene birth, the origin of new functional genes from non-coding DNA, is also common, especially in young lineages. Genome-wide scans and functional characterization have identified de novo genes in diverse taxa including Drosophila, yeast, Hydra, mouse, humans, Arabidopsis and rice, and as a result de novo gene birth has gained widespread acceptance as a phenomenon in eukaryotic evolution.
This radical form of genetic novelty is puzzling given how unlikely it seems that a random DNA sequence could be translated into a functional polypeptide. It is even more challenging to envisage such random new peptides increasing the fitness of an individual. In this issue, Neme and colleagues (article no. 0127) tackle this conundrum with an empirical approach. They generated clones carrying random sequences of 150 nucleotides, together with all the elements necessary for transcription and translation, and expressed them in E. coli. When translated, these clones yielded peptides with 50 amino acids of random sequence, and the authors monitored clone frequencies in competitive assays. Surprisingly, they found that a large fraction of these random peptides affected the fitness of their host E. coli cells (25% of the clones enhanced the growth rate of their cells and 52% inhibited growth). This study provides systematic empirical evidence that random DNA sequences are commonly bioactive, and it strengthens the idea that random non-coding parts of the genome have the potential to become functional.
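The arithmetic behind this design can be sketched in a few lines of Python: 150 random nucleotides read in a single frame give 50 codons. This is only an illustration under stated assumptions, not the authors' protocol. The function names, the uniform base frequencies and the use of the standard genetic code are assumptions made here. A fully random sequence can contain stop codons (shown as `*`), and how the actual library dealt with them is not described in this editorial.

```python
import random

# Standard genetic code in the usual compact layout: bases ordered T, C, A, G
# for each codon position; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def random_dna(length, rng):
    """Return a random DNA string with equal base frequencies (an assumption)."""
    return "".join(rng.choice(BASES) for _ in range(length))


def translate(dna):
    """Translate codon by codon in the first reading frame.

    Stop codons are written as '*' and translation is NOT halted at them,
    so the output always has one symbol per complete codon.
    """
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


rng = random.Random(0)             # fixed seed so the run is repeatable
insert = random_dna(150, rng)      # hypothetical 150-nucleotide random insert
peptide = translate(insert)
print(len(insert), len(peptide))   # 150 50
print(peptide.count("*"), "stop codon(s) in this particular random sequence")
```

Because translation is not halted at stop codons here, the output always has 50 symbols. A real peptide would end at the first in-frame stop.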
Although it is now widely accepted that de novo genes emerge at a high rate, at least in eukaryotes, how this happens is under debate. Two evolutionary models have been proposed. In the first, a gradual process involves the emergence of proto-genes from translation of non-genic open reading frames. A subset of these proto-genes might have adaptive potential and will progressively evolve the characteristics of genes and be retained in the genome¹. An alternative model posits that, because non-coding transcripts sometimes get translated by accident, natural selection can purge deleterious polypeptides and retain the benign ones, a process of pre-adaptation that generates material prone to become de novo functional genes². This month we publish bioinformatics work by Wilson and colleagues (article no. 0146) that lends support to the pre-adaptation view of de novo gene origin. The two hypotheses make different predictions about young and old genes: if emergence of de novo genes is a gradual process, we would expect young genes to be much more similar to non-coding sequences than to old genes; by contrast, under the pre-adaptation scenario, young genes should resemble old genes and differ from non-coding sequences. The authors stratify genes of the house mouse and baker's yeast by age and estimate their intrinsic structural disorder (an indicator of how likely a protein is to fold into a stable structure). They find high levels of intrinsic structural disorder in young and old genes but not in junk DNA, with young genes having the highest levels of intrinsic structural disorder of all. Thus, when considering folding properties of the proteins they encode, young genes look nothing like junk DNA but look just like old ones, a finding consistent with the pre-adaptation mode of de novo gene birth.
The emergence of genes from non-coding sequences is just one of many fascinating mysteries of the evolutionary dynamics of de novo genes. Researchers have established that they arise at a high rate, but how do they spread and how often do they get lost? A comparison of de novo genes among Drosophila melanogaster populations³ identified many polymorphisms and found patterns of nucleotide diversity and gene expression consistent with a role for directional selection in the spread of de novo genes. The authors observed a high rate of de novo gene loss but found no population genetic evidence for selection driving gene loss. Instead, they suggest that de novo genes are primarily lost by drift. Comparative analyses of close relatives in the Drosophila obscura group showed that the high rate of de novo origination is balanced by rapid gene loss due to mutations that render the new genes non-functional⁴. The same study found that young genes are more likely to be lost than older ones, which is consistent with stochastic or weak selection processes. The emerging picture for the life cycle of de novo genes in Drosophila suggests that they arise at a high rate, are fixed by natural selection and are lost at high rates too, either by drift or weak selection. We look forward to more comparative studies involving closely related species and multiple natural populations that will enable us to understand the evolutionary dynamics of de novo genes and how they contribute to diversity.
Mounting evidence in several taxa attests to the biological importance of de novo genes. Differences in expression levels indicate a role for several de novo genes in stress response in Arabidopsis thaliana, Daphnia magna, yeast and Drosophila⁵. De novo genes also contribute to reproduction and development in Drosophila, yeast and humans⁶. The study by Neme and colleagues published in this issue shows that it might not be as hard as we naively expected for a random sequence to acquire biological function. Nevertheless, it is fascinating to consider how a de novo gene can replace an existing one with the same function and, even more intriguingly, how it integrates into established gene networks and becomes essential. McLysaght and Hurst⁶ suggest that antagonistic co-evolution may well be a mechanism by which de novo genes replace existing ones and become essential.
Understanding the origin and evolutionary consequences of genomic novelty, and de novo genes in particular, is a vibrant new field in which Nature Ecology & Evolution takes a keen interest.
1. Carvunis, A.-R. et al. Nature 487, 370–374 (2012).
2. Wilson, B. A. & Masel, J. Genome Biol. Evol. 3, 1245–1252 (2011).
3. Zhao, L. et al. Science 343, 769–772 (2014).
4. Palmieri, N. et al. eLife 3, e01311 (2014).
5. Schlötterer, C. Trends Genet. 31, 215–219 (2015).
6. McLysaght, A. & Hurst, L. D. Nat. Rev. Genet. 17, 567–578 (2016).