Rapid evolution of protein diversity by de novo origination in Oryza

Zhang, Li; Ren, Yan; Yang, Tao; Li, Guangwei; Chen, Jianhai; Gschwend, Andrea R.; Yu, Yeisoo; Hou, Guixue; Zi, Jin; Zhou, Ruo; Wen, Bo; Zhang, Jianwei; Chougule, Kapeel; Wang, Muhua; Copetti, Dario; Peng, Zhiyu; Zhang, Chengjun; Zhang, Yong; Ouyang, Yidan; Wing, Rod A.; Liu, Siqi; Long, Manyuan

doi:10.1038/s41559-019-0822-5

Article
Published: 11 March 2019

Nature Ecology & Evolution volume 3, pages 679–690 (2019)

Abstract

New protein-coding genes that arise de novo from non-coding DNA sequences contribute to protein diversity. However, de novo gene origination is challenging to study as it requires high-quality reference genomes for closely related species, evidence for ancestral non-coding sequences, and transcription and translation of the new genes. High-quality genomes of 13 closely related Oryza species provide unprecedented opportunities to understand de novo origination events. Here, we identify a large number of young de novo genes with discernible recent ancestral non-coding sequences and evidence of translation. Using pipelines examining the synteny relationship between genomes and reciprocal-best whole-genome alignments, we detected at least 175 de novo open reading frames in the focal species O. sativa subspecies japonica, which were all detected in RNA sequencing-based transcriptomes. Mass spectrometry-based targeted proteomics and ribosomal profiling show translational evidence for 57% of the de novo genes. In recent divergence of Oryza, an average of 51.5 de novo genes per million years were generated and retained. We observed evolutionary patterns in which excess indels and early transcription were favoured in origination with a stepwise formation of gene structure. These data reveal that de novo genes contribute to the rapid evolution of protein diversity under positive selection.
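As a rough illustration of the reciprocal-best idea mentioned in the abstract, the sketch below keeps only those region pairs in which each region is the other's best match. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `reciprocal_best`, the dictionary inputs and the region identifiers are all hypothetical, and real whole-genome alignments would also involve scoring and synteny checks that are not shown here.

```python
def reciprocal_best(a_to_b, b_to_a):
    """Return sorted (a, b) pairs where b is a's best hit and a is b's best hit.

    Both arguments map a region identifier to the identifier of its single
    best-scoring match in the other genome (an assumed, simplified input).
    """
    return sorted((a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a)


# Hypothetical region identifiers, for illustration only.
focal_to_outgroup = {
    "regionA": "regionX",
    "regionB": "regionY",
    "regionC": "regionY",  # regionY's best hit is regionB, so this pair is dropped
}
outgroup_to_focal = {
    "regionX": "regionA",
    "regionY": "regionB",
}

pairs = reciprocal_best(focal_to_outgroup, outgroup_to_focal)
print(pairs)
```

In this toy input, `regionC` is discarded because its best match points back to a different focal region, which is the kind of one-directional hit that a reciprocal-best filter is meant to exclude.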

**Fig. 1: Identification of de novo genes that originated recently during *Oryza* diversification.**

**Fig. 2: Stepwise origination processes for the de novo gene *Osjap05g30030*.**

**Fig. 3: Stepwise origination process for the de novo gene *Osjap06g21910*.**

**Fig. 4: Patterns of de novo origination in evolution, expression and gene structures.**

**Fig. 5: Example of the verification of protein products translated from a candidate de novo gene, *Osjap05g20760*.**

**Fig. 6: Summary of the protein products translated from candidate de novo genes in *O. sativa* subspecies *japonica*, as detected by experimental proteomics and ribosomal profiling analyses.**

Data availability

The data that support the findings of this study are available in Supplementary Files 1 and 2, Supplementary Figs. 4 and 5, and Supplementary Tables 11 and 14.

Acknowledgements

We appreciate valuable discussions with N. Jiang at MSU, the group of M. L. at Chicago, Y. Liao and M. Chen at the Institute of Genetics and Development in Beijing, and J. P. Staley at Chicago. We are thankful for the editing done by E. Mortola. This work was supported by the USA National Science Foundation (NSF) under Plant Genome Research Program numbers 0321678, 0638541 and 0822284, the Bud Antle Endowed Chair of Excellence in Agriculture and Life Sciences, and the AXA Chair for Evolutionary Genomics and Genome Biology (to R.A.W.), NSF MCB number 1026200 (to M.L. and R.A.W.), NSF MCB 1051826 and NIH R01 GM 100768 (to M.L.), the National Key R&D Program of China 2017YFC0908400 (to S.L.) and the National Program for Support of Top-notch Young Professionals of China (to Y.O.).

Author information

These authors contributed equally: Li Zhang, Yan Ren.

Authors and Affiliations

Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
Li Zhang, Jianhai Chen, Andrea R. Gschwend, Chengjun Zhang & Manyuan Long
BGI-Shenzhen, Shenzhen, China
Yan Ren, Jin Zi, Ruo Zhou, Bo Wen, Zhiyu Peng & Siqi Liu
National Key Laboratory of Crop Genetic Improvement and National Centre of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan, China
Tao Yang, Guangwei Li, Guixue Hou & Yidan Ouyang
Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
Yeisoo Yu, Jianwei Zhang, Kapeel Chougule, Muhua Wang, Dario Copetti & Rod A. Wing
Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China
Chengjun Zhang
Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
Yong Zhang

Contributions

L.Z., R.A.W., S.L. and M.L. conceived and designed the project. L.Z., Y.R., R.A.W., S.L. and M.L. wrote the manuscript, with significant contributions from C.Z., A.R.G., J.C. and Y.Z. L.Z. conducted the computational genomic analysis, with significant contributions from A.R.G., K.C., J.Z. and Y.Z. C.Z., Y.Y., J.Z., K.C., M.W., D.C. and R.A.W. generated and annotated the genome sequences. Y.R., G.H., J.Z., L.Z. and S.L. designed and conducted the proteomics experiments to detect proteins translated from de novo genes. R.Z., B.W., L.Z. and Z.P. conducted the analysis of public proteomics databases. Y.R., L.Z., J.C., M.L. and S.L. performed further evolutionary and proteomics analyses. T.Y., G.L. and Y.O. grew rice strains in Sanya (China) and dissected rice tissues. J.C., L.Z., C.Z. and M.L. conducted the evolutionary substitution analyses of de novo genes.

Corresponding authors

Correspondence to Rod A. Wing, Siqi Liu or Manyuan Long.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, L., Ren, Y., Yang, T. et al. Rapid evolution of protein diversity by de novo origination in Oryza. Nat Ecol Evol 3, 679–690 (2019). https://doi.org/10.1038/s41559-019-0822-5

Received: 08 May 2018
Accepted: 23 January 2019
Published: 11 March 2019
Issue Date: April 2019
DOI: https://doi.org/10.1038/s41559-019-0822-5
