replying to A. Liston et al. Nature Genetics https://doi.org/10.1038/s41588-019-0543-3 (2019)
The origin of octoploid strawberry has been the focus of several phylogenetic studies over the past decade (for example, refs. 1,2,3). Our previous study, using the octoploid genome and transcriptomes of every extant diploid Fragaria species, provided support for four species (Fragaria vesca, Fragaria iinumae, Fragaria viridis and Fragaria nipponica) as the closest extant relatives of the diploids that contributed to the origin of octoploid strawberry4. In a response paper5, Liston et al. stated “that only two extant diploids were progenitors” with one subgenome being contributed by F. vesca and three by F. iinumae–like ancestors. Our reanalysis of the transcriptome data and comparative genomic analyses of a chromosome-scale F. iinumae genome support our previous model for the origin of octoploid strawberry4.
Liston et al.5 raised a concern regarding one of the steps in the phylogenetic analysis of the subgenome tree-searching algorithm (PhyDS) tool we developed to identify extant relatives of diploid progenitors of allopolyploids. Specifically, they argue that we may have incorrectly identified F. viridis and F. nipponica as extant relatives because in-paralogs were excluded from our previous phylogenetic analysis4. Our reanalysis of the data using PhyDS, now including in-paralogs, yielded results consistent with those presented in our previous study (Fig. 1; Supplementary Information and Supplementary Dataset 1). Furthermore, their alternative model for the origin of octoploid strawberry (1× F. vesca–like and 3× F. iinumae–like subgenomes) is not supported by comparative genomic analyses of a new chromosome-scale F. iinumae genome (Fig. 2).
PhyDS searches a set of gene trees to identify the sequences most closely related to a set of user-provided paralogs (or homoeologs in polyploids). Homoeologs are orthologous genes that were brought back into the same nucleus by allopolyploidization6. For our analyses, we used syntenic (that is, positionally conserved) homoeologs that were present on all subgenomes in octoploid strawberry. Gene trees were estimated using RAxML7 based on orthologs identified using established orthogrouping approaches8 applied to de novo assembled transcriptomes for each diploid Fragaria species4. PhyDS performs a relatively simple and straightforward analysis of gene trees. First, it identifies the user-provided paralog present in a gene tree and moves to the direct ancestral node of that paralog. Second, it returns to the user the direct descendants of that ancestral node (that is, the sequence identities, including the paralog) together with the node's bootstrap support value (Fig. 1).
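The two steps described above can be illustrated with a minimal Python sketch. This is not the PhyDS code itself: the `Node` class, the function names and the tip labels are hypothetical, and the sketch assumes a rooted gene tree in which each internal node stores its bootstrap support.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical minimal node of a rooted gene tree."""
    name: Optional[str] = None        # tip label; None for internal nodes
    support: Optional[float] = None   # bootstrap support (internal nodes only)
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, *kids: "Node") -> "Node":
        """Attach child nodes and return self so trees can be built inline."""
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def tips(node):
    """Return all tip labels below a node, left to right."""
    if not node.children:
        return [node.name]
    labels = []
    for child in node.children:
        labels.extend(tips(child))
    return labels


def find_tip(node, name):
    """Return the tip node with the given label, or None if absent."""
    if not node.children:
        return node if node.name == name else None
    for child in node.children:
        hit = find_tip(child, name)
        if hit is not None:
            return hit
    return None


def closest_relatives(root, paralog):
    """Step 1: locate the paralog and move to its direct ancestral node.
    Step 2: return that node's descendants and its bootstrap support."""
    tip = find_tip(root, paralog)
    if tip is None or tip.parent is None:
        return None
    ancestor = tip.parent
    return tips(ancestor), ancestor.support


# Toy gene tree with made-up labels and support values (assumptions).
tree = Node().add(
    Node(support=95.0).add(Node("homoeolog_A"), Node("F_vesca")),
    Node(support=80.0).add(Node("F_iinumae"), Node("F_nipponica")),
)
print(closest_relatives(tree, "homoeolog_A"))
```

In this toy tree the query homoeolog groups with the F_vesca tip, so the sketch reports that pair along with the support of their shared node. A real analysis would then filter results by bootstrap value, as noted later in the text.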
We have two major concerns regarding the methods used in refs. 2,5. First, phylogenetic analyses aimed at estimation of species relationships are reliant first on correct identification of orthologs9. These authors identified putative orthologs using a sequence similarity-based approach, which has relatively high error rates10. Furthermore, pangenome studies have shown that up to one-half of gene content exhibits presence–absence variation at the species level in plants11. In other words, many genes are individual- or population-specific. Thus, many of the putative ortholog predictions in their studies may be inaccurate. Second, Liston et al.5 performed analyses of 100-kb windows across each of the seven base chromosomes. This could be problematic because chromosomal regions from one parental species can be replaced with chromosomal regions from the other parental species during meiosis in polyploids (referred to as homoeologous exchanges12). Homoeologous exchanges can range in size from large megabase-sized regions to single genes (see a recent review on their impact on subgenome assignment in ref. 13). We identified extensive homoeologous exchanges throughout the octoploid strawberry genome4. Thus, the 100-kb windows Liston et al. used consist of genes with different evolutionary histories reflecting each of the different progenitor species. This could result in inaccurate estimates of species relationships.
Here we present a chromosome-scale genome of F. iinumae with a scaffold N50 (the minimum scaffold length needed to cover 50% of the genome) of 33.98 Mb and 23,665 protein-coding genes (see Supplementary Information). This genome was used to calculate the synonymous substitution (Ks) divergence between F. iinumae and each of the four subgenomes (Fig. 2a). This revealed that only one of the subgenomes of octoploid strawberry is F. iinumae–like, which does not support the model presented by Liston et al.5 that the origin of octoploid strawberry involved three F. iinumae–like and one F. vesca–like progenitor species. Instead, these results are consistent with our phylogenetic estimates supporting more than two diploid progenitors (Fig. 2b–d). The F. viridis (Fig. 2c) and F. nipponica (Fig. 2d) subgenomes are not F. iinumae–like.
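For readers unfamiliar with the assembly statistic quoted above, the following Python sketch shows one common way to compute a scaffold N50. The function name and the toy scaffold lengths are assumptions for illustration only; they are not taken from the F. iinumae assembly.

```python
def n50(lengths):
    """Return the scaffold N50: walking scaffolds from longest to shortest,
    the length of the scaffold at which the running total first reaches
    half of the summed assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no scaffold lengths given")


# Toy lengths (arbitrary units): the total is 100, and the two longest
# scaffolds together cover 70, which is the first time half is reached.
print(n50([40, 30, 20, 10]))  # prints 30
```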
Our new phylogenetic analyses support four distinct progenitor species, which is consistent with our previous results4 and those of other groups3. The conflicting results obtained by Liston et al.5 are probably due to differences in methodology. As pointed out above, establishing gene orthology is crucial for molecular phylogenetics. Our pipeline started by identifying high-confidence syntenic 1:1 homoeologs present on each of the subgenomes. This step alone filtered out 82.1% of genes from the octoploid strawberry genome4. The number of genes analyzed in our study was further reduced by absence from the transcriptome data, stringent orthogroup filtering and bootstrap value filtering. In short, more data are not always better if one introduces ‘phylogenetic noise’. It is unclear to us how Liston et al.5 obtained high unique mapping rates (~89% alignment) across the F. vesca genome, which consists of ~31% transposable elements and hundreds of duplicate genes. Furthermore, many genes are species-specific based on previous pangenome studies.
As pointed out by Liston et al.5, incomplete lineage sorting can impact phylogenetic inferences. However, that is far more likely to impact within-species than between-species estimates. This is exactly what was observed in our study. Other F. vesca subspecies were identified as contributors but were present at notably lower levels than F. viridis and F. nipponica (Fig. 1a). These patterns provide further support for F. viridis and F. nipponica as extant relatives of the progenitors that contributed to the origin of the intermediate hexaploid ancestor. Lastly, we did state that F. moschata may be an extant relative of the intermediate hexaploid ancestor. Given the high frequency of polyploid formation in Fragaria14 and birth–death dynamics of polyploids15, we agree it is possible that the hexaploid ancestor may be extinct. This remains to be properly evaluated using robust phylogenetic approaches and datasets.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
The phylogenetic trees and alignments are available on Dryad (https://doi.org/10.5061/dryad.b2c58pc). The genome assembly and annotation files are available on the Genome Database for Rosaceae (https://www.rosaceae.org/) and NCBI GenBank under BioProjects PRJNA544784 and PRJNA508389. The raw sequence data are available in the Sequence Read Archive under the same NCBI BioProject numbers, PRJNA544784 and PRJNA508389.
Custom software for running PhyDS phylogenetic analyses is available on GitHub (https://github.com/mrmckain/PhyDS/).
Rousseau-Gueutin, M. et al. Tracking the evolutionary history of polyploidy in Fragaria L. (strawberry): new insights from phylogenetic analyses of low-copy nuclear genes. Mol. Phylogenet. Evol. 51, 515–530 (2009).
Tennessen, J. A., Govindarajulu, R., Ashman, T.-L. & Liston, A. Evolutionary origins and dynamics of octoploid strawberry subgenomes revealed by dense targeted capture linkage maps. Genome Biol. Evol. 6, 3295–3313 (2014).
Yang, Y. & Davis, T. M. A new perspective on polyploid Fragaria (strawberry) genome composition based on large-scale, multi-locus phylogenetic analysis. Genome Biol. Evol. 9, 3433–3448 (2017).
Edger, P. P. et al. Origin and evolution of the octoploid strawberry genome. Nat. Genet. 51, 541–547 (2019).
Liston, A. et al. Revisiting the origin of the octoploid strawberry. Nat. Genet. https://doi.org/10.1038/s41588-019-0543-3 (2019).
Glover, N. M., Redestig, H. & Dessimoz, C. Homoeologs: what are they and how do we infer them? Trends Plant Sci. 21, 609–621 (2016).
Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
Emms, D. M. & Kelly, S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16, 157 (2015).
Duarte, J. M. et al. Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels. BMC Evol. Biol. 10, 61 (2010).
Nichio, B. T. L., Marchaukoski, J. N. & Raittz, R. T. New tools in orthology analysis: a brief review of promising perspectives. Front. Genet. 8, 165 (2017).
Gordon, S. P. et al. Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure. Nat. Commun. 8, 2184 (2017).
Xiong, Z., Gaeta, R. T. & Pires, J. C. Homoeologous shuffling and chromosome compensation maintain genome balance in resynthesized allopolyploid Brassica napus. Proc. Natl Acad. Sci. USA 108, 7908–7913 (2011).
Edger, P. P., McKain, M. R., Bird, K. A. & VanBuren, R. Subgenome assignment in allopolyploids: challenges and future directions. Curr. Opin. Plant Biol. 42, 76–80 (2018).
Hummer, K. The discovery and naming of the cascade strawberry (Fragaria cascadensis). Kalmiopsis 21, 26–31 (2015).
Mayrose, I. et al. Recently formed polyploid plants diversify at lower rates. Science 333, 1257 (2011).
We thank J. Lei and L. Xue for sample preparation of F. iinumae. This work was supported by Michigan State University AgBioResearch (to P.P.E.), USDA-NIFA HATCH (no. 1009804 to P.P.E.), USDA-NIFA (no. SCRI 2014-51181-22378) and NSF-DEB (no. 1737898) to P.P.E., USDA-NIFA (no. SCRI 2017-51181-26833 to S.J.K.), the California Strawberry Commission (to S.J.K.), the University of California (to S.J.K.) and the National Natural Science Foundation of China (nos. 31770408 to T.Z. and 31760082 to Q.Q.).
The authors declare no competing interests.
Previously, a high-density linkage map of F. iinumae was constructed from 4,173 markers, with 3,280 from the array and 893 from genotyping by sequencing7. Here we anchored the contigs to this genetic map to obtain a chromosome-scale genome of F. iinumae.
Edger, P.P., McKain, M.R., Yocca, A.E. et al. Reply to: Revisiting the origin of octoploid strawberry. Nat Genet 52, 5–7 (2020). https://doi.org/10.1038/s41588-019-0544-2