There are various ways to explain how new genes can come about — exon shuffling, gene duplication and retroposition being just a few. But how are truly novel genes born out of a sequence that was previously non-coding? Using a comparative genomics approach, a research group has identified and characterized the function of a novel ORF in Saccharomyces cerevisiae, and has shown that it might contribute to the evolution of this yeast species.

The mechanism by which a gene is created from scratch receives relatively little attention, despite reports from sequencing projects and comparative analyses that many de novo genes exist in some bacterial and animal lineages. To investigate the process of gene creation, the authors concentrated on BSC4, a species-specific gene that was pulled out of genome comparisons among Saccharomyces species. Although the gene has no homologue in other fungal species, the region around BSC4 is syntenic across Saccharomyces species. This suggests that the gene probably did not jump into its current position in the S. cerevisiae genome by horizontal gene transfer, and it allows the birth of the BSC4 ancestral sequence to be dated to over 100 million years ago.

How do we know that BSC4 is a real protein-coding gene? Most criteria would suggest that it is: the sequence, an ORF encoding 132 amino acids, is fixed in S. cerevisiae strains, and RT-PCR experiments suggest that the ORF can be transcribed. Proteomics analyses support the existence of 29 peptides associated with this locus, and the BSC4 sequence seems to be under purifying selection. It is not clear what BSC4 actually does, but the fact that it is synthetically lethal with mutations in two DNA-damage repair genes makes it likely that it acts in this pathway.

This paper provides the first evidence that new genes can arise from a non-coding region that is initially transcribed and subsequently acquires an ORF through mutation. The fact that BSC4 expression is upregulated during the stationary phase of growth, when DNA repair is most pressing, supports the idea that this new gene is not only real but also selectively advantageous.