Plant Phylogenomics

A thousand plants’ phylogeny

The One Thousand Plant Initiative analysed an unprecedented collection of >1,000 plant vegetative transcriptomes from species spanning the green tree of life, resolved controversial phylogenetic placements and highlighted gene family expansions and whole genome duplications that occurred during different stages of evolution.

The Viridiplantae clade (green plants) includes land plants as well as green algae, and its members form the base of terrestrial and sunlit aquatic ecosystems (Fig. 1). They evolved from the engulfment of an ancestral cyanobacterium by a eukaryotic host more than 1 billion years ago and have been diversifying since then, giving rise in particular to the embryophyte (land plant) lineage. During this interval, they evolved a large range of features, including multicellularity as well as morphological and physiological innovations. Recent advances in genomic methods allow us to study these events by comparing extant lineages, and they also support and refine the phylogeny of these lineages1. The gene content of green plants may also help us understand which functions were selected, amplified or lost during their complex evolution. However, Viridiplantae are difficult to study at the genome level because of their large genome sizes and complex ploidy2. An alternative way to obtain genomic information spanning such a large group of diverse species is to sequence their expressed genes, or transcriptomes. This avoids some of the difficulties still associated with sequencing the genomes of relatively unstudied organisms. In a recent issue of Nature3, the One Thousand Plant Initiative provided and analysed an unprecedented collection of >1,000 plant transcriptomes, from unicellular lineages related to Viridiplantae to flowering plants. This unrivalled data set helps resolve conflicting phylogenetic placements and highlights key events in gene family expansion over time.

Fig. 1: Green plant diversity.

Photographs of Welwitschia mirabilis (a), green algae (b), wheat (c), Arabidopsis thaliana (d), green moss (e), liverwort (Marchantia polymorpha) (f), ostrich fern (Matteuccia struthiopteris) (g), conifer (h) and waterlily (i).

The high number of studies that have already used parts of the genomic data made publicly available before publication demonstrates the high relevance of this transcriptome collection4,5. However, some important evolutionary questions receive reliable answers only when the whole collection is used. A number of controversial issues received much stronger support with this extensive data set than with its previous, more limited release1. For instance, the branching order of Viridiplantae with two basal lineages of algae, the Rhodophyta and the Glaucophyta, has been debated6. The new study provides robust support for the Glaucophyta as a sister lineage to green plants, implying a complex pattern for the loss of peptidoglycan biosynthesis. Another controversial issue is the nature of the bryophytes (comprising liverworts, mosses and hornworts) and their relationships with other land plants (namely ferns, lycophytes and the large group of seed plants). The authors found strong support for the monophyly of the bryophytes, apparently resolving one main uncertainty in the origins of land conquest by the green lineage. Among gymnosperms, different results were obtained for the phylogenetic placement of the Gnetales when using nuclear versus organellar phylogenomics7. With this extensive data set, the authors placed the Gnetales as sister to conifers and explained much of the previous discrepancy by suggesting the possibility of gene flow among lineages. Using the divergence between paralogous gene copies in the transcriptomes, the authors also inferred many potential new whole genome duplications (WGDs) across seed plants. Accordingly, one of the most interesting observations made in this work is the difficulty in inferring some phylogenetic relationships, apparently because genes in the same genome have variable histories. How this difficulty relates to the high frequency of WGDs in plant evolution is an interesting topic to pursue.

One of the limitations of working with transcriptomes rather than genomes is the inherent incompleteness of the data, as only expressed genes are recorded, which potentially leads to the absence of weakly or tissue-specifically transcribed genes. However, looking at the dynamics of large gene families in the many species sequenced here led to the hypothesis that angiosperms (flowering plants) have no specific family expansions. Therefore, many of the novelties that appeared in this lineage may be due to the recruitment of pre-existing members of gene families to new functions. Other gene family expansions show a peak during the transition from algae to land plants, indicating that this major change was accompanied by a profound remodelling of gene content. No evidence for the involvement of a global polyploidization event was found at this stage, suggesting that these expansions occurred only for some gene families. Therefore, these families may be interesting targets for understanding the initial steps in the appearance of land plants.

The present work proposes 244 WGDs across green plant evolution, including events not detected previously. However, further genomic analysis is needed to confirm these inferences, which are based on estimates from paralogous genes. The presence of syntenic blocks with similar ages is still the best way to predict such events. Even with the impressive progress made over the last few years in the sequencing of complex genomes, exploring the whole diversity of plant genomes remains a difficult and costly task to envision, particularly given the high quality of genome sequences that would be needed to produce faithful gene annotations. As such, the current collection of transcriptomes presented in ref. 3 appears to be the best compromise to link genes and phenotypes within a robust evolutionary framework.


  1. Wickett, N. J. et al. Proc. Natl Acad. Sci. USA 111, E4859–E4868 (2014).
  2. Jung, H. et al. Trends Plant Sci. 24, 700–724 (2019).
  3. One Thousand Plant Transcriptomes Initiative. Nature (2019).
  4. Mutte, S. K. et al. eLife 7, e33399 (2018).
  5. Bourque, S. et al. Trends Plant Sci. 21, 1008–1016 (2016).
  6. Deschamps, P. & Moreira, D. Mol. Biol. Evol. 26, 2745–2753 (2009).
  7. Zhong, B. et al. Mol. Biol. Evol. 27, 2855–2863 (2010).


Author information



Corresponding author

Correspondence to Patrick Wincker.

Ethics declarations

Competing interests

The author declares no competing interests.


About this article


Cite this article

Wincker, P. A thousand plants’ phylogeny. Nat. Plants 5, 1106–1107 (2019).
