Abstract

Whole-genome duplication (WGD), or polyploidy, followed by gene loss and diploidization has long been recognized as an important evolutionary force in animals, fungi and other organisms [1,2,3], especially plants. The success of angiosperms has been attributed, in part, to innovations associated with gene or whole-genome duplications [4,5,6], but evidence for proposed ancient genome duplications pre-dating the divergence of monocots and eudicots remains equivocal in analyses of conserved gene order. Here we use comprehensive phylogenomic analyses of sequenced plant genomes and more than 12.6 million new expressed-sequence-tag sequences from phylogenetically pivotal lineages to elucidate two groups of ancient gene duplications: one in the common ancestor of extant seed plants and the other in the common ancestor of extant angiosperms. Gene duplication events were intensely concentrated around 319 and 192 million years ago, implicating two WGDs in ancestral lineages shortly before the diversification of extant seed plants and extant angiosperms, respectively. Significantly, these ancestral WGDs resulted in the diversification of regulatory genes important to seed and flower development, suggesting that they were involved in major innovations that ultimately contributed to the rise and eventual dominance of seed plants and angiosperms.
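The abstract reports that inferred duplication dates cluster around two ages. As an illustration only, the sketch below shows one common way such clustering could be examined: fitting a two-component normal mixture to a set of duplication ages by expectation-maximization. Everything in it is an assumption made for illustration. The ages are simulated around the two reported values rather than taken from the study, the function names are hypothetical, and this plain EM routine is a stand-in, not the authors' pipeline or the EMMIX software listed in the references.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(ages, n_iter=200):
    """Fit a two-component 1-D normal mixture by EM.

    Returns a list of (weight, mean, sd) tuples sorted by mean.
    """
    ordered = sorted(ages)
    n = len(ordered)
    # Crude initialisation: lower and upper quartiles as starting means,
    # the overall standard deviation as the starting spread for both.
    means = [ordered[n // 4], ordered[(3 * n) // 4]]
    overall_mean = sum(ordered) / n
    overall_sd = math.sqrt(sum((a - overall_mean) ** 2 for a in ordered) / n)
    sds = [overall_sd, overall_sd]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: probability that each age belongs to component 0.
        resp0 = []
        for a in ages:
            p0 = weights[0] * normal_pdf(a, means[0], sds[0])
            p1 = weights[1] * normal_pdf(a, means[1], sds[1])
            denom = p0 + p1
            resp0.append(p0 / denom if denom > 0.0 else 0.5)

        # M-step: update weight, mean and sd of each component.
        for k in (0, 1):
            resp = resp0 if k == 0 else [1.0 - r for r in resp0]
            total = sum(resp)
            weights[k] = total / n
            means[k] = sum(r * a for r, a in zip(resp, ages)) / total
            var = sum(r * (a - means[k]) ** 2 for r, a in zip(resp, ages)) / total
            sds[k] = max(math.sqrt(var), 1e-6)

    return sorted(zip(weights, means, sds), key=lambda c: c[1])


# Simulated duplication ages in million years (NOT data from the study):
# two clusters centred on the ages quoted in the abstract.
random.seed(0)
simulated_ages = (
    [random.gauss(192.0, 15.0) for _ in range(400)]
    + [random.gauss(319.0, 20.0) for _ in range(300)]
)

components = fit_two_component_mixture(simulated_ages)
for weight, mean, sd in components:
    print(f"weight={weight:.2f} mean={mean:.1f} sd={sd:.1f}")
```

With well-separated clusters like these, the fitted means land close to the simulated centres. Real duplication-age estimates carry dating uncertainty and background small-scale duplications, so a model with more components or a formal model-selection step would likely be needed in practice.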


References

  1. Evolution by Gene Duplication (Springer, 1970)

  2. The Origins of Genome Architecture (Sinauer, 2007)

  3. Gene and genome duplications: the impact of dosage-sensitivity on the fate of nuclear genes. Chromosome Res. 17, 699–717 (2009)

  4. Genome duplication and the origin of angiosperms. Trends Ecol. Evol. 20, 591–597 (2005)

  5. Origin and early evolution of angiosperms. Ann. NY Acad. Sci. 1133, 3–25 (2008)

  6. Plants with double genomes might have had a better chance to survive the Cretaceous-Tertiary extinction event. Proc. Natl Acad. Sci. USA 106, 5737–5742 (2009)

  7. et al. Finding and comparing syntenic regions among Arabidopsis and the outgroups papaya, poplar, and grape: CoGe with rosids. Plant Physiol. 148, 1772–1781 (2008)

  8. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422, 433–438 (2003)

  9. et al. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature 449, 463–467 (2007)

  10. The origins of genomic duplications in Arabidopsis. Science 290, 2114–2117 (2000)

  11. Paleopolyploidy in the Brassicales: analyses of the Cleome transcriptome elucidate the history of genome duplications in Arabidopsis and other Brassicales. Genome Biol. Evol. 1, 391–399 (2009)

  12. et al. Synteny and collinearity in plant genomes. Science 320, 486–488 (2008)

  13. et al. Unraveling ancient hexaploidy through multiply-aligned angiosperm gene maps. Genome Res. 18, 1944–1954 (2008)

  14. et al. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313, 1596–1604 (2006)

  15. Angiosperm genome comparisons reveal early polyploidy in the monocot lineage. Proc. Natl Acad. Sci. USA 107, 472–477 (2010)

  16. et al. Widespread genome duplications throughout the history of flowering plants. Genome Res. 16, 738–749 (2006)

  17. et al. The gain and loss of genes during 600 million years of vertebrate evolution. Genome Biol. 7, R43 (2006)

  18. HaMStR: profile hidden Markov model based search for orthologs in ESTs. BMC Evol. Biol. 9, 157 (2009)

  19. Using plastid genome-scale data to resolve enigmatic relationships among basal angiosperms. Proc. Natl Acad. Sci. USA 104, 19363–19368 (2007)

  20. The EMMIX algorithm for the fitting of normal and t-components. J. Stat. Softw. 4, i02 (1999)

  21. The age of the angiosperms: a molecular timescale without a clock. Evolution 59, 1245–1258 (2005)

  22. et al. Ferns diversified in the shadow of angiosperms. Nature 428, 553–557 (2004)

  23. Bias in plant gene content following different sorts of duplication: tandem, whole-genome, segmental, or by transposition. Annu. Rev. Plant Biol. 60, 433–453 (2009)

  24. Evolution of gene function and regulatory control after whole-genome duplication: comparative analyses in vertebrates. Genome Res. 19, 1404–1418 (2009)

  25. Phytochrome E influences internode elongation and flowering time in Arabidopsis. Plant Cell 10, 1479–1487 (1998)

  26. Phytochromes differentially regulate seed germination responses to light quality and temperature cues during seed maturation. Plant Cell Environ. 32, 1297–1309 (2009)

  27. Adaptive evolution in the photosensory domain of phytochrome A in early angiosperms. Mol. Biol. Evol. 20, 1087–1097 (2003)

  28. et al. Complex regulation of the TIR1/AFB family of auxin receptors. Proc. Natl Acad. Sci. USA 106, 22540–22545 (2009)

  29. Phylogenetic analysis of the plant-specific zinc finger-homeobox and mini zinc finger gene families. J. Integr. Plant Biol. 50, 1031–1045 (2008)

  30. Evolution of the class III HD-Zip gene family in land plants. Evol. Dev. 8, 350–361 (2006)

  31. et al. Gene loss and silencing in Tragopogon miscellus (Asteraceae): comparison of natural and synthetic allotetraploids. Heredity 103, 73–81 (2009)

  32. Detecting the undetectable: uncovering duplicated segments in Arabidopsis by comparison with rice. Trends Genet. 18, 606–608 (2002)

  33. Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell 16, 1667–1678 (2004)

  34. The flowering world: a tale of duplications. Trends Plant Sci. 14, 680–688 (2009)

  35. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13, 2178–2189 (2003)

  36. et al. PLAZA: a comparative genomics resource to study gene and genome evolution in plants. Plant Cell 21, 3718–3731 (2009)

  37. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004)

  38. TrimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009)

  39. Multiple sequence alignment using ClustalW and ClustalX. Curr. Protoc. Bioinf. 2.3.1–2.3.22 (2002)

  40. RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics 21, 456–463 (2005)

  41. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690 (2006)

  42. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27, 401–410 (1978)

  43. A framework for the quantitative study of evolutionary trees. Syst. Zool. 38, 297–309 (1989)

  44. r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19, 301–302 (2003)

  45. et al. The Physcomitrella genome reveals evolutionary insights into the conquest of land by plants. Science 319, 64–69 (2008)

  46. The origin and early evolution of plants on land. Nature 389, 33–39 (1997)

  47. Implications of fossil conifers for the phylogenetic relationships of living families. Bot. Rev. 65, 239–277 (1999)

  48. in Pollen and Spores: Patterns of Diversification 169–195 (Clarendon, 1991)

  49. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 34, W609–W612 (2006)

  50. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11, 725–736 (1994)

  51. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13, 555–556 (1997)

  52. et al. Gene ontology: tool for the unification of biology. Nature Genet. 25, 25–29 (2000)

  53. InterProScan - an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17, 847–848 (2001)

  54. agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res. 38, W64–W70 (2010)


Acknowledgements

This work was supported primarily by the NSF Plant Genome Research Program (DEB 0638595, The Ancestral Angiosperm Genome Project) and in part by the Department of Biology and by the Huck Institutes of Life Sciences of the Pennsylvania State University. H.M. was also supported by funds from Fudan University. We thank J. Carlson, M. Frohlich, S. DiLoretto, L. Warg, S. Crutchfield, C. Johnson, N. Naznin, X. Zhou, J. Duarte, B. J. Bliss, J. Der and E. Wafula for help and discussion, D. Stevenson and C. Schultz for Zamia samples, J. McNeal, S. Kim and M. Axtell for photographs, and all the members of The Genome Center at Washington University production team, especially L. Fulton, K. Delehaunty and C. Fronick.

Author information

Affiliations

  1. Intercollege Graduate Degree Program in Plant Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA

    Yuannian Jiao, Hong Ma & Claude W. dePamphilis

  2. Department of Biology, Institute of Molecular Evolutionary Genetics, and the Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA

    Yuannian Jiao, Norman J. Wickett, Lena Landherr, Paula E. Ralph, Yi Hu, Hong Ma & Claude W. dePamphilis

  3. Department of Plant Biology, University of Georgia, Athens, Georgia 30602, USA

    Saravanaraj Ayyampalayam & Jim Leebens-Mack

  4. Department of Biology, University of Florida, Gainesville, Florida 32611, USA

    André S. Chanderbali & Douglas E. Soltis

  5. Center for Comparative Genomics, Center for Infectious Disease Dynamics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA

    Lynn P. Tomsho & Stephan C. Schuster

  6. Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina 29634, USA

    Haiying Liang

  7. Florida Museum of Natural History, University of Florida, Gainesville, Florida 32611, USA

    Pamela S. Soltis

  8. The Genome Center at Washington University, Saint Louis, Missouri 63108, USA

    Sandra W. Clifton

  9. Department of Forestry, Wildlife & Fisheries, Institute of Agriculture, The University of Tennessee, Knoxville, Tennessee 37996, USA

    Scott E. Schlarbaum

  10. State Key Laboratory of Genetic Engineering, School of Life Sciences, Institute of Plant Biology, Center for Evolutionary Biology, Fudan University, Shanghai 200433, China

    Hong Ma

  11. Institute of Biomedical Sciences, Fudan University, Shanghai 200433, China

    Hong Ma

Authors

Yuannian Jiao, Norman J. Wickett, Saravanaraj Ayyampalayam, André S. Chanderbali, Lena Landherr, Paula E. Ralph, Lynn P. Tomsho, Yi Hu, Haiying Liang, Pamela S. Soltis, Douglas E. Soltis, Sandra W. Clifton, Scott E. Schlarbaum, Stephan C. Schuster, Hong Ma, Jim Leebens-Mack & Claude W. dePamphilis

Contributions

Y.J. and C.W.d. designed the study and Y.J. performed the principal data analyses. A.S.C., L.L., P.E.R., Y.H., S.E.S. and H.L. prepared tissues, RNAs, and/or libraries. S.W.C., L.P.T. and S.C.S. generated sequence data. S.A. and J.L.-M. performed the Ancestral Angiosperm Genome Project transcriptome assemblies and MAGIC database construction. Y.J. and C.W.d. drafted the manuscript, and N.J.W., A.S.C., L.L., P.E.R., P.S.S., D.E.S., H.M. and J.L.-M. contributed to the planning and discussion of the research and the editing of the manuscript. All authors contributed to and approved the final manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Claude W. dePamphilis.

Alignments and phylogenetic trees have been deposited in Dryad with package identifier doi:10.5061/dryad.8546.

Supplementary information

PDF files

  1. Supplementary Information 1

    The file contains a Supplementary Discussion, Supplementary References, Supplementary Tables 1-5 and Supplementary Figures 1-8 with legends.

  2. Supplementary Information 2

    Additional File 4 shows a plot of the genomic positions of paralogous pairs of Vitis vinifera genes that arose from duplications prior to the divergence of monocots and eudicots.

Excel files

  1. Supplementary Table 1

    Additional File 1 displays a list of 799 orthogroups with Monocot + Eudicot duplication.

  2. Supplementary Table 2

    Additional File 2 displays the number of ancient duplications found in orthogroups in all four analyses.

  3. Supplementary Table 3

    Additional File 3 displays data on significant enrichment of GO-SLIM terms for the orthogroups with ancient duplication, measured by Fisher’s exact test followed by multiple testing corrections.

About this article

DOI

https://doi.org/10.1038/nature09916
