Ancient and recent origins of shared polymorphisms in yeast

Abstract

Shared genetic polymorphisms between populations and species can be ascribed to ancestral variation or to more recent gene flow. Here, we mapped shared polymorphisms in Saccharomyces cerevisiae and its sister species Saccharomyces paradoxus, which diverged 4–6 million years ago. We used a dense map of single-nucleotide diagnostic markers (mean distance 15.6 base pairs) in 1,673 sequenced S. cerevisiae isolates to catalogue 3,852 sequence blocks (≥5 consecutive markers) introgressed from S. paradoxus, with most being recent and clade-specific. The highly diverged wild Chinese S. cerevisiae lineages were depleted of introgressed blocks but retained an excess of individual ancestral polymorphisms derived from incomplete lineage sorting, perhaps due to less dramatic population bottlenecks. In the non-Chinese S. cerevisiae lineages, we inferred major hybridization events and detected cases of overlapping introgressed blocks across distinct clades due to either shared histories or convergent evolution. We experimentally engineered, in otherwise isogenic backgrounds, the introgressed PAD1-FDC1 gene pair that independently arose in two S. cerevisiae clades and revealed that it increases resistance against diverse antifungal drugs. Overall, our study retraces the histories of divergence and secondary contacts across S. cerevisiae and S. paradoxus populations and unveils a functional outcome.

Fig. 1: The species-wide landscape of S. paradoxus markers in S. cerevisiae.
Fig. 2: Incomplete lineage sorting in Chinese lineages.
Fig. 3: Major hybridization events.
Fig. 4: Ancient introgression on chromosome III centromere.
Fig. 5: Adaptive introgression of PAD1-FDC1.

Data availability

The gVCF of the S. cerevisiae collection and the raw data of the diagnostic markers and introgressed block coordinates are available at https://bitbucket.org/yeastgenomics. The genome sequences generated in this study are available at the European Nucleotide Archive under the accession code PRJEB71987. Source data are provided with this paper.

Code availability

The developed computational pipelines and scripts are available at https://bitbucket.org/yeastgenomics.

References

  1. Nei, M., Suzuki, Y. & Nozawa, M. The neutral theory of molecular evolution in the genomic era. Annu. Rev. Genomics Hum. Genet. 11, 265–289 (2010).

  2. Schrempf, D. & Szöllösi, G. in Phylogenetics in the Genomic Era (eds Scornavacca, C. et al.) 3.1:1–3.1:23 (HAL Open Science, 2020).

  3. Sousa, V. & Hey, J. Understanding the origin of species with genome-scale data: modelling gene flow. Nat. Rev. Genet. 14, 404–414 (2013).

  4. Rannala, B., Edwards, S. V., Leaché, A. & Yang, Z. in Phylogenetics in the Genomic Era (eds Scornavacca, C. et al.) 3.3:1–3.3:21 (HAL Open Science, 2020).

  5. Kingman, J. F. C. The coalescent. Stoch. Process. Appl. 13, 235–248 (1982).

  6. Maddison, W. P. Gene trees in species trees. Syst. Biol. 46, 523–536 (1997).

  7. Harrison, R. G. & Larson, E. L. Hybridization, introgression and the nature of species boundaries. J. Hered. 105, 795–809 (2014).

  8. Taylor, S. A. & Larson, E. L. Insights from genomes into the evolutionary importance and prevalence of hybridization in nature. Nat. Ecol. Evol. 3, 170–177 (2019).

  9. Suarez-Gonzalez, A., Lexer, C. & Cronk, Q. C. B. Adaptive introgression: a plant perspective. Biol. Lett. 14, 20170688 (2018).

  10. Moran, B. M. et al. The genomic consequences of hybridization. eLife 10, e69016 (2021).

  11. Martin, S. H. & Jiggins, C. D. Interpreting the genomic landscape of introgression. Curr. Opin. Genet. Dev. 47, 69–74 (2017).

  12. Steenwyk, J. L., Li, Y., Zhou, X., Shen, X.-X. & Rokas, A. Incongruence in the phylogenomics era. Nat. Rev. Genet. 24, 834–850 (2023).

  13. Bergström, A. et al. Insights into human genetic variation and population history from 929 diverse genomes. Science 367, eaay5012 (2020).

  14. Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328, 710–722 (2010).

  15. Slon, V. et al. The genome of the offspring of a Neanderthal mother and a Denisovan father. Nature 561, 113–116 (2018).

  16. Mao, Y. et al. A high-quality bonobo genome refines the analysis of hominid evolution. Nature 594, 77–81 (2021).

  17. Sankararaman, S., Mallick, S., Patterson, N. & Reich, D. The combined landscape of Denisovan and Neanderthal ancestry in present-day humans. Curr. Biol. 26, 1241–1247 (2016).

  18. Harris, D. N. et al. Diverse African genomes reveal selection on ancient modern human introgressions in Neanderthals. Curr. Biol. 33, 4905–4916 (2023).

  19. Huerta-Sánchez, E. et al. Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA. Nature 512, 194–197 (2014).

  20. Shen, X.-X. et al. Tempo and mode of genome evolution in the budding yeast subphylum. Cell 175, 1533–1545 (2018).

  21. Peter, J. et al. Genome evolution across 1,011 Saccharomyces cerevisiae isolates. Nature 556, 339–344 (2018).

  22. Duan, S.-F. et al. The origin and adaptive evolution of domesticated populations of yeast from Far East Asia. Nat. Commun. 9, 2690 (2018).

  23. Ono, J., Greig, D. & Boynton, P. J. Defining and disrupting species boundaries in Saccharomyces. Annu. Rev. Microbiol. 74, 477–495 (2020).

  24. Clark, A., Dunham, M. J. & Akey, J. M. The genomic landscape of Saccharomyces paradoxus introgression in geographically diverse Saccharomyces cerevisiae strains. Preprint at bioRxiv https://doi.org/10.1101/2022.08.01.502362 (2022).

  25. Barbosa, R. et al. Evidence of natural hybridization in Brazilian wild lineages of Saccharomyces cerevisiae. Genome Biol. Evol. 8, 317–329 (2016).

  26. Pontes, A., Čadež, N., Gonçalves, P. & Sampaio, J. P. A quasi-domesticate relic hybrid population of Saccharomyces cerevisiae × S. paradoxus adapted to olive brine. Front. Genet. 10, 449 (2019).

  27. Peris, D. et al. Macroevolutionary diversity of traits and genomes in the model yeast genus Saccharomyces. Nat. Commun. 14, 690 (2023).

  28. Gallone, B. et al. Domestication and divergence of Saccharomyces cerevisiae beer yeasts. Cell 166, 1397–1410 (2016).

  29. Barbosa, R. et al. Multiple rounds of artificial selection promote microbe secondary domestication—the case of cachaça yeasts. Genome Biol. Evol. 10, 1939–1955 (2018).

  30. Gonçalves, M. et al. Distinct domestication trajectories in top-fermenting beer yeasts and wine yeasts. Curr. Biol. 26, 2750–2761 (2016).

  31. Legras, J.-L. et al. Adaptation of S. cerevisiae to fermented food environments reveals remarkable genome plasticity and the footprints of domestication. Mol. Biol. Evol. 35, 1712–1727 (2018).

  32. Ramazzotti, M. et al. Population genomics reveals evolution and variation of Saccharomyces cerevisiae in the human and insects gut. Environ. Microbiol. 21, 50–71 (2019).

  33. Coi, A. L. et al. Genomic signatures of adaptation to wine biological ageing conditions in biofilm-forming flor yeasts. Mol. Ecol. 26, 2150–2166 (2017).

  34. Almeida, P. et al. A population genomics insight into the Mediterranean origins of wine yeast domestication. Mol. Ecol. 24, 5412–5427 (2015).

  35. Yue, J.-X. et al. Contrasting evolutionary genome dynamics between domesticated and wild yeasts. Nat. Genet. 49, 913–924 (2017).

  36. Bendixsen, D. P., Gettle, N., Gilchrist, C., Zhang, Z. & Stelkens, R. Genomic evidence of an ancient East Asian divergence event in wild Saccharomyces cerevisiae. Genome Biol. Evol. 13, evab001 (2021).

  37. O’Donnell, S. et al. 142 telomere-to-telomere assemblies reveal the genome structural landscape in Saccharomyces cerevisiae. Nat. Genet. 55, 1390–1399 (2023).

  38. D’Angiolo, M. et al. A yeast living ancestor reveals the origin of genomic introgressions. Nature 587, 420–425 (2020).

  39. Mancera, E., Bourgon, R., Brozzi, A., Huber, W. & Steinmetz, L. M. High-resolution mapping of meiotic crossovers and non-crossovers in yeast. Nature 454, 479–485 (2008).

  40. Liti, G. et al. Population genomics of domestic and wild yeasts. Nature 458, 337–341 (2009).

  41. Ramos-Cormenzana, A., Juárez-Jiménez, B. & Garcia-Pareja, M. P. Antimicrobial activity of olive mill wastewaters (alpechin) and biotransformed olive oil mill wastewater. Int. Biodeterior. Biodegrad. 38, 283–290 (1996).

  42. Richard, P., Viljanen, K. & Penttilä, M. Overexpression of PAD1 and FDC1 results in significant cinnamic acid decarboxylase activity in Saccharomyces cerevisiae. AMB Express 5, 12 (2015).

  43. Jamoussi, B., Bedoui, A., Hassine, B. B. & Abderraba, A. Analyses of phenolic compounds occurring in olive oil mill wastewaters by GC–MS. Toxicol. Environ. Chem. 87, 45–53 (2005).

  44. Klinke, H. B., Thomsen, A. B. & Ahring, B. K. Inhibition of ethanol-producing yeast and bacteria by degradation products produced during pre-treatment of biomass. Appl. Microbiol. Biotechnol. 66, 10–26 (2004).

  45. Diderich, J. A., Weening, S. M., van den Broek, M., Pronk, J. T. & Daran, J.-M. G. Selection of Pof− Saccharomyces eubayanus variants for the construction of S. cerevisiae × S. eubayanus hybrids with reduced 4-vinyl guaiacol formation. Front. Microbiol. 9, 1640 (2018).

  46. Skelly, D. A. et al. Integrative phenomics reveals insight into the structure of phenotypic diversity in budding yeast. Genome Res. 23, 1496–1504 (2013).

  47. Hillenmeyer, M. E. et al. The chemical genomic portrait of yeast: uncovering a phenotype for all genes. Science 320, 362–365 (2008).

  48. Wang, Q.-M., Liu, W.-Q., Liti, G., Wang, S.-A. & Bai, F.-Y. Surprisingly diverged populations of Saccharomyces cerevisiae in natural environments remote from human activity. Mol. Ecol. 21, 5404–5417 (2012).

  49. Lee, T. J. et al. Extensive sampling of Saccharomyces cerevisiae in Taiwan reveals ecology and evolution of predomesticated lineages. Genome Res. 32, 864–877 (2022).

  50. Bozdag, G. O. et al. Engineering recombination between diverged yeast species reveals genetic incompatibilities. Preprint at bioRxiv http://biorxiv.org/lookup/doi/10.1101/755165 (2019).

  51. Vilgalys, T. P. et al. Selection against admixture and gene regulatory divergence in a long-term primate field study. Science 377, 635–641 (2022).

  52. Wolf, A. B. & Akey, J. M. Outstanding questions in the study of archaic hominin admixture. PLoS Genet. 14, e1007349 (2018).

  53. De Chiara, M. et al. Domestication reprogrammed the budding yeast life cycle. Nat. Ecol. Evol. 6, 448–460 (2022).

  54. Gallone, B. et al. Interspecific hybridization facilitates niche adaptation in beer yeast. Nat. Ecol. Evol. 3, 1562–1575 (2019).

  55. Andrews, S. FastQC: a quality control tool for high throughput sequence data (2010).

  56. Perez-Sepulveda, B. M. et al. An accessible, efficient and global approach for the large-scale sequencing of bacterial genomes. Genome Biol. 22, 349 (2021).

  57. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics https://doi.org/10.1093/bioinformatics/btp324 (2009).

  58. Danecek, P. Twelve years of SAMtools and BCFtools. GigaScience https://doi.org/10.1093/gigascience/giab008 (2021).

  59. Zheng, X. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics https://doi.org/10.1093/bioinformatics/bts606 (2012).

  60. Gascuel, O. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol. 14, 685–695 (1997).

  61. Kurtz, S. et al. Versatile and open software for comparing large genomes. Genome Biol. https://doi.org/10.1186/gb-2004-5-2-r12 (2004).

  62. Garrison, E. & Marth, G. Haplotype-based variant detection from short-read sequencing. Preprint at https://doi.org/10.48550/arXiv.1207.3907 (2012).

  63. Tattini, L. et al. Accurate tracking of the mutational landscape of diploid hybrid genomes. Mol. Biol. Evol. 36, 2861–2877 (2019).

  64. Marco-Sola, S., Sammeth, M., Guigó, R. & Ribeca, P. The GEM mapper: fast, accurate and versatile alignment by filtration. Nat. Methods https://doi.org/10.1038/nmeth.2221 (2012).

  65. Boeva, V. et al. Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing data. Bioinformatics https://doi.org/10.1093/bioinformatics/btr670 (2012).

  66. Liti, G., Barton, D. B. H. & Louis, E. J. Sequence diversity, reproductive isolation and species concepts in Saccharomyces. Genetics 174, 839–850 (2006).

  67. He, P.-Y. et al. Highly diverged lineages of Saccharomyces paradoxus in temperate to subtropical climate zones in China. Yeast 39, 69–82 (2022).

  68. Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. https://doi.org/10.1089/cmb.2012.0021 (2012).

  69. Edgar, R. C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics https://doi.org/10.1186/1471-2105-5-113 (2004).

  70. Hoang, D. T. UFBoot2: improving the ultrafast bootstrap approximation. Mol. Biol. Evol. https://doi.org/10.1093/molbev/msx281 (2018).

  71. Nguyen, L.-T. et al. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. https://doi.org/10.1093/molbev/msu300 (2015).

  72. Schliep, K.-P. phangorn: phylogenetic analysis in R. Bioinformatics https://doi.org/10.1093/bioinformatics/btq706 (2011).

  73. Martin, S. H. et al. Genome-wide evidence for speciation with gene flow in Heliconius butterflies. Genome Res. 23, 1817–1828 (2013).

  74. Durand, E. Y., Patterson, N., Reich, D. & Slatkin, M. Testing for ancient admixture between closely related populations. Mol. Biol. Evol. 28, 2239–2252 (2011).

  75. Pfeifer, B. & Kapan, D. D. Estimates of introgression as a function of pairwise distances. BMC Bioinf. 20, 207 (2019).

  76. Darling, A. E., Mau, B. & Perna, N. T. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS ONE 5, e11147 (2010).

  77. Patterson, N. et al. Ancient admixture in human history. Genetics https://doi.org/10.1534/genetics.112.145037 (2012).

  78. Petr, M., Vernot, B. & Kelso, J. admixr—R package for reproducible analyses using ADMIXTOOLS. Bioinformatics https://doi.org/10.1093/bioinformatics/btz030 (2019).

  79. Tsai, I. J., Bensasson, D., Burt, A. & Koufopanou, V. Population genomics of the wild yeast Saccharomyces paradoxus: quantifying the life cycle. Proc. Natl Acad. Sci. USA 105, 4957–4962 (2008).

  80. Schacherer, J., Shapiro, J. A., Ruderfer, D. M. & Kruglyak, L. Comprehensive polymorphism survey elucidates population structure of Saccharomyces cerevisiae. Nature 458, 342–345 (2009).

  81. Pfeifer, B., Wittelsbürger, U., Ramos-Onsins, S. E. & Lercher, M. J. PopGenome: an efficient Swiss army knife for population genomic analyses in R. Mol. Biol. Evol. 31, 1929–1936 (2014).

  82. Gel, B. et al. regioneR: an R/Bioconductor package for the association analysis of genomic regions based on permutation tests. Bioinformatics 32, 289–291 (2016).

  83. Pearson, W. R. & Lipman, D. J. Improved tools for biological sequence comparison. Proc. Natl Acad. Sci. USA 85, 2444–2448 (1988).

  84. Pan, J. et al. A hierarchical combination of factors shapes the genome-wide topography of yeast meiotic recombination initiation. Cell 144, 719–731 (2011).

  85. Mozzachiodi, S. et al. Aborting meiosis allows recombination in sterile diploid yeast hybrids. Nat. Commun. 12, 6564 (2021).

  86. Eberlein, C. et al. Hybridization is a recurrent evolutionary stimulus in wild yeast speciation. Nat. Commun. 10, 923 (2019).

  87. Leducq, J.-B. et al. Speciation driven by hybridization and chromosomal plasticity in a wild yeast. Nat. Microbiol. 1, 15003 (2016).

  88. Koufopanou, V. et al. Population size, sex and purifying selection: comparative genomics of two sister taxa of the wild yeast Saccharomyces paradoxus. Genome Biol. Evol. 12, 1636–1645 (2020).

  89. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics https://doi.org/10.1093/bioinformatics/btr330 (2011).

  90. Kumar, S. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. https://doi.org/10.1093/molbev/msy096 (2018).

  91. Ortiz, E. M. vcf2phylip v2.0: convert a VCF matrix into several matrix formats for phylogenetic analysis. Zenodo https://doi.org/10.5281/zenodo.2540861 (2019).

  92. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. https://doi.org/10.1093/nar/gkab301 (2021).

  93. Lee, S., Lim, W. A. & Thorn, K. S. Improved blue, green and red fluorescent protein tagging vectors for S. cerevisiae. PLoS ONE 8, e67902 (2013).

  94. Fernandez-Ricaud, L., Kourtchenko, O., Zackrisson, M., Warringer, J. & Blomberg, A. PRECOG: a tool for automated extraction and visualization of fitness components in microbial growth phenomics. BMC Bioinf. 17, 249 (2016).

  95. Chevin, L.-M. On measuring selection in experimental evolution. Biol. Lett. 7, 210–213 (2011).

  96. Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).

  97. Minh, B. Q., Nguyen, M. A. T. & von Haeseler, A. Ultrafast approximation for phylogenetic bootstrap. Mol. Biol. Evol. 30, 1188–1195 (2013).

  98. Hoang, D. T., Chernomor, O., von Haeseler, A., Minh, B. Q. & Vinh, L. S. UFBoot2: improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518–522 (2018).

  99. Naseeb, S. et al. Whole genome sequencing, de novo assembly and phenotypic profiling for the new budding yeast species Saccharomyces jurei. G3 8, 2967–2977 (2018).

Acknowledgements

We thank M. D’Angiolo, E. Mancera, N. Vakirlis, G. Fischer, E. Danchin and P. Beltrao for discussions and critical reading of the manuscript. We also thank V. Koufopanou for sharing the strain OS162 and A. Pontes and J. P. Sampaio for providing information on the origin of the alpechin and Brazilian sequenced strains. This work was supported by Agence Nationale de la Recherche (ANR-11-LABX-0028-01, ANR-15-IDEX-01, ANR-18-CE12-0004, ANR-20-CE12-0020, ANR-22-CE12-0015), Fondation pour la Recherche Médicale (EQU202003010413), UCA AAP Start-up Deep tech, CEFIPRA to G.L. N.T. was partially supported by the PhD fellowship programme Region PACA. E.S.N. was supported by the Ministry of Science and Higher Education of the Russian Federation grant for Kurchatov Center of Genome Research (075-15-2019-1659).

Author information

Contributions

N.T., M.D.C. and G.L. conceived the project and designed the experiments. N.T. developed the bioinformatic pipeline. N.T., M.D.C. and L.T. designed and performed the genomic analyses. S.M. and C.V. designed and performed the PAD1-FDC1 functional characterization. E.S.N., A.B., J.W. and G.L. contributed resources and reagents. G.L. supervised the project. N.T. and G.L. wrote the paper with input from all the other authors.

Corresponding author

Correspondence to Gianni Liti.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Ecology & Evolution thanks Feng-Yan Bai, Xing-Xing Shen and Kevin Verstrepen for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Methods overview.

a, Workflow of the pipeline. Abbreviations: PWGA, pairwise whole-genome alignment; CHR, chromosome. b, Biallelic patterns across marker positions. X: S.c.c. allele; Y: alternative allele. Numbers on the phylogeny represent whole-genome sequence divergence between S.c.c. and S. paradoxus and within the main S. paradoxus populations. The abbreviations of the main S. paradoxus populations are shown in red. EU: European, FE: Far Eastern, NA: North American, SA: South American, HW: Hawaiian. The columns indicate the marker positions used to define the introgression boundaries (common, abbreviated comm.) and the origin (remaining columns). The counts correspond to the number of marker positions available for each pattern. c, A cartoon of the strategy adopted to construct the S.c.c. sequence (Methods). Briefly, for each clade of the 1,011 collection, we picked two strains and extracted the SNPs against the SGD (Saccharomyces Genome Database) reference genome. We then used the SGD reference genome as a scaffold and changed the alleles at the positions in which the ALT (alternative) allele was more frequent than the REF (reference) allele (freq. ≥ 0.75). d, Example of restoring the collinearity of the translocated genomic region between S.c.c. and HW S. paradoxus on the chromosome V/chromosome XIII translocation35. Line colour reflects the sequence divergence between S.c.c. and the HW S. paradoxus. Whole-genome alignment before (left panel) and after (right panel) genome collinearization. e, The blue rectangles represent the genomic regions, in S.c.c. coordinates, that were correctly aligned across S.c.c. and all the S. paradoxus whole-genome assemblies. f, Distribution of distances between pairs of consecutive diagnostic markers (n = 775,484) along the genome (1st quartile = 3, mean = 14.59, median = 8, 3rd quartile = 16). Box, interquartile range (IQR); whiskers, 1.5×IQR; thick horizontal line, median. Circles represent outliers. g, The per-chromosome marker density (MD) was measured as the number of diagnostic markers divided by the sum of all nucleotides in the aligned regions depicted in panel g. h, UPGMA phylogenies across the genome assemblies of 5,191 S. cerevisiae - S. paradoxus 1-to-1 orthologs. In blue, the most abundant individual gene topologies that follow the structure of the species tree. In red and green, two contrasting gene topologies. i, Zoom-in on the distribution of contrasting gene phylogenies across the assemblies showing the S. cerevisiae introgression on chromosome XIV in the European population of S. paradoxus. In red: the genes with the phylogeny depicted in the central panel in i. j, The inconsistent topology between gene and species tree involving the South American and the Hawaiian S. paradoxus in the region surrounding the centromere of chromosome V. In green: the genes with the phylogeny depicted in the right panel in h; in grey, gene trees with alternative topologies. The blue rectangles represent genes for which the gene tree topology is identical to the species tree topology.
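The consensus rule described for panel c can be illustrated with a minimal Python sketch. It assumes that the ALT allele frequencies among the selected strains have already been computed per position and that only single-nucleotide substitutions are involved; the function name, the data layout and the toy values are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Dict, Tuple


def build_consensus(reference: str,
                    alt_calls: Dict[int, Tuple[str, float]],
                    min_alt_freq: float = 0.75) -> str:
    """Return a copy of `reference` with frequent ALT alleles substituted.

    `alt_calls` maps a 0-based position to (ALT base, ALT allele frequency
    among the selected strains). The 0.75 default mirrors the threshold
    quoted in the caption; everything else here is illustrative.
    """
    seq = list(reference)
    for pos, (alt, freq) in alt_calls.items():
        if freq >= min_alt_freq:
            seq[pos] = alt  # replace the REF base with the majority ALT base
    return "".join(seq)


# Toy example: three variable sites, two of which pass the threshold.
reference = "ACGTACGT"
alt_calls = {1: ("T", 0.90), 4: ("G", 0.50), 6: ("A", 0.75)}
consensus = build_consensus(reference, alt_calls)
print(consensus)
```

Positions 1 and 6 are replaced because their ALT frequency reaches the threshold, while position 4 keeps the reference base.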

Extended Data Fig. 2 Global S. cerevisiae phylogeny and collections.

a, Unrooted neighbour-joining tree of the 1,673 S. cerevisiae strains analysed in this work. Red dots indicate the S. cerevisiae isolates used for the construction of S. cerevisiae consensus (S.c.c.) reference. The main clades are sequentially numbered and listed on the right side. Clades highlighted in colour are discussed in the main text. b, Number of strains included in this study from previously published collections (Methods).

Extended Data Fig. 3 Unknown origin of a chromosome XI introgression in AMH CHN-IX.

a, df statistics support the presence of the introgression on AMH chromosome XI. Red dots represent genomic windows whose absolute df value is at least fivefold higher than the average value across the chromosome. b, Polymorphism ancestry plot showing that the origin of the chromosome XI introgression cannot be traced to any S. paradoxus population, consistent with an origin in an unknown sister species. The grey bar indicates the position of the introgression. c, Introgression boundaries on chromosome XI in AMH. The red blocks represent homozygous introgressions, while the blue blocks are homozygous for S. cerevisiae. d–f, Maximum likelihood phylogenies of the sequences spanning the genes YKR064W-YKR078W derived from de novo assemblies35,37,99,67. SGD = S. cerevisiae reference genome, BAG = CHN-II S. cerevisiae, BAL = CHN-I S. cerevisiae, AMH = CHN-IX S. cerevisiae, CBS432 = European S. paradoxus, BJ-DLS32-26 = Chinese S. paradoxus, YPS138 = North American S. paradoxus and NCYC3947 = Saccharomyces jurei (outgroup). The trees in panels d and f are derived from the 25 kb flanking regions before and after the introgression, while the tree in panel e is that of the introgressed region. The chromosome XI introgression is basal to the entire S. paradoxus clade, with an estimated nucleotide distance of ~10% from S. paradoxus, consistent with an origin in an unknown donor species. The blue circles mark nodes with SH-aLRT ≥ 95% and UFboot ≥ 95%.

Extended Data Fig. 4 Size and location of introgression blocks.

a, Distribution of the diagnostic markers with S. paradoxus genotypes, grouped by size, across different clades. For each clade, isolated markers and introgressed blocks with the same boundaries are counted once. Overlapping blocks with different boundary coordinates are counted as separate events. The last column includes all the blocks with at least 5 consecutive S. paradoxus markers. The y axis is on a log10 scale. The absolute value of the counts is indicated on top of each column. The label ‘Other’ indicates S. cerevisiae strains that could not be placed in a specific clade. b, Physical position and frequency of the introgression blocks supported by ≥ 5 consecutive S. paradoxus markers across 1,459 out of 1,673 S. cerevisiae samples. The colour scale reflects the number of times a specific block is shared across the 1,459 S. cerevisiae strains. Two blocks shared by more than 156 S. cerevisiae strains are shown as equal to 156 to allow the visualization of less common events. The genomic coordinates indicate positions in the S.c.c. genome.
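The block definition and the counting rule in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: markers are given as sorted positions with a boolean flag for an S. paradoxus genotype, a block is a run of at least five consecutive flagged markers, and events in a clade are counted by unique boundary pairs. The function names and the toy data are hypothetical and do not reproduce the authors' pipeline.

```python
from typing import Dict, List, Tuple

# (first marker position, last marker position, number of markers)
Block = Tuple[int, int, int]


def call_blocks(positions: List[int], is_paradoxus: List[bool],
                min_markers: int = 5) -> List[Block]:
    """Return runs of >= min_markers consecutive S. paradoxus-type markers."""
    blocks: List[Block] = []
    run_start = run_end = 0
    run_len = 0
    for pos, flag in zip(positions, is_paradoxus):
        if flag:
            if run_len == 0:
                run_start = pos  # first marker of a new run
            run_end = pos
            run_len += 1
        else:
            if run_len >= min_markers:
                blocks.append((run_start, run_end, run_len))
            run_len = 0  # an S. cerevisiae-type marker ends the run
    if run_len >= min_markers:  # run reaching the last marker
        blocks.append((run_start, run_end, run_len))
    return blocks


def count_unique_events(blocks_per_strain: Dict[str, List[Block]]) -> int:
    """Count blocks once per distinct (start, end) boundary pair in a clade."""
    return len({(start, end)
                for blocks in blocks_per_strain.values()
                for start, end, _ in blocks})


positions = [10, 25, 31, 40, 52, 60, 75, 90, 101, 110]
flags = [False, True, True, True, True, True, False, True, True, False]
blocks = call_blocks(positions, flags)
print(blocks)

clade = {
    "strainA": [(25, 60, 5)],
    "strainB": [(25, 60, 5)],   # same boundaries as strainA: one event
    "strainC": [(25, 75, 6)],   # overlapping, different boundaries: new event
}
n_events = count_unique_events(clade)
print(n_events)
```

In the toy data, the run of two flagged markers at positions 90 and 101 falls below the five-marker threshold and is not reported as a block.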

Extended Data Fig. 5 Introgression block size.

a, Distributions of introgression lengths, defined as the number of consecutive S. paradoxus markers, for each S. cerevisiae clade. b, Distributions of introgression lengths, defined as the total length (in bp) of regions within S. paradoxus markers, for each S. cerevisiae clade. Box, interquartile range (IQR); whiskers, 1.5×IQR; thick horizontal line, median. Fully coloured data points beyond the whiskers are outliers. Empty dots represent values of clades with ≤ 20 values (African Beer = 42; African Palm Wine = 93; Alpechin = 6965; Asian Fermentation = 19; Asian Island = 54; Baijiu = 26; Beer = 139; Bioethanol = 257; CHN-I = 207; CHN-II = 19; CHN-III = 24; CHN-IV = 257; CHN-IX = 681; CHN-V = 54; CHN-VI = 186; CHN-VIII = 14; CHN-X = 29; Dairy Products = 394; Far East Russian = 77; French Guiana = 869; Huangjiu = 4; Kombucha = 67; Malaysian = 9; Mantou-1 = 1; Mantou-2 = 12; Mantou-3 = 6; Mantou-4 = 3; Mantou-5 = 4; Mantou-6 = 2; Mantou 7 = 48; Med. Oak = 280; Mexican Agave = 1440; Mix Origin = 131; Mosaics Beer = 207; MosaicsR1 = 104; Mosaics R2 = 99; MosaicsR3 = 724; North American = 342; Other = 171; Sake-A = 24; Sake-B = 35; South Africa = 13; South America Mix1 = 897; South America Mix2 = 839; South America Mix 3 = 179; West African Cocoa = Wine/European = 488; Wine/European 1 = 59; Wine/European 2 = 18; Wine/European 3 = 47; Wine/European 4 = 14).

Extended Data Fig. 6 Patterson’s D statistics.

a, D values measured across the S. cerevisiae collection using different quartet (P1-P4) arrangements. The strains on the top of each plot represent the P1 (WE: ADS, CHN-IV: BJ3 and CHN-I: BAL) population. P3 and P4 (outgroup) are fixed and represented by the European S. paradoxus (CBS432) and S. jurei, respectively. Multiple D values were calculated for each sample by means of jackknife resampling of genomic blocks. The colour gradient reflects the Z-score. D values associated with an absolute Z-score of 3 or greater were considered statistically significant, and the null hypothesis of absence of gene flow was consequently rejected for these. Box, interquartile range (IQR); whiskers, 1.5×IQR; thick horizontal line, median. Panel WE n. others = 1,586; Alpechin = 39; South America mix 2 = 10; Mexican agave = 7; CHN-II = 10; CHN-I = 12; CHN-IX = 9. Panel S.c.c. n. others = 1,577; Alpechin = 39; South America mix 2 = 10; Mexican agave = 7; CHN-II = 10; CHN-I = 12; CHN-IX = 9. Panel CHN-IV n. others = 1,585; Alpechin = 39; South America mix 2 = 10; Mexican agave = 7; CHN-II = 10; CHN-I = 12; CHN-IX = 9. Panel CHN-I n. others = 1,584; Alpechin = 39; South America mix 2 = 10; Mexican agave = 7; CHN-II = 10; CHN-I = 12; CHN-IX = 9. b, Absolute counts of both ABBA and BABA sites across the S. cerevisiae strains. c, Shared ABBA-BABA sites between the strains BAG - AMH (left) and FJ7 - AMH (right) using the S.c.c. in P1.
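For readers unfamiliar with the statistic, the following Python sketch shows how Patterson's D and a jackknife Z-score can be obtained from per-block ABBA and BABA counts. It uses the standard definition D = (ABBA − BABA) / (ABBA + BABA) and a simple delete-one block jackknife with equal block weights; the equal weighting, the function names and the toy counts are assumptions made for illustration, and the sketch is not the implementation used in the study.

```python
import math
from typing import List, Tuple


def patterson_d(abba: List[float], baba: List[float]) -> float:
    """D = (sum ABBA - sum BABA) / (sum ABBA + sum BABA) over genomic blocks."""
    total_abba, total_baba = sum(abba), sum(baba)
    return (total_abba - total_baba) / (total_abba + total_baba)


def jackknife_z(abba: List[float], baba: List[float]) -> Tuple[float, float, float]:
    """Return (D, standard error, Z) from a delete-one block jackknife."""
    n = len(abba)
    d_full = patterson_d(abba, baba)
    # Recompute D with each block left out in turn.
    leave_one_out = [
        patterson_d(abba[:i] + abba[i + 1:], baba[:i] + baba[i + 1:])
        for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    se = math.sqrt(variance)
    z = d_full / se if se > 0 else float("nan")
    return d_full, se, z


# Toy per-block counts with an excess of ABBA sites.
abba = [30, 28, 35, 31, 29]
baba = [10, 12, 9, 11, 10]
d, se, z = jackknife_z(abba, baba)
print(f"D = {d:.3f}, SE = {se:.3f}, Z = {z:.1f}")
print("significant" if abs(z) >= 3 else "not significant")
```

With the |Z| ≥ 3 criterion quoted in the caption, the toy excess of ABBA over BABA sites would be called significant, whereas balanced counts give D = 0.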

Source data

Extended Data Fig. 7 Highly introgressed clades.

a, Frequency and genomic position of the introgressions detected across the Alpechin (N = 40, on the left). b, The heatmap shows the percentage of S. paradoxus sequence in introgressed genes across S. cerevisiae strains with at least 5 introgressed genes. To reduce the complexity of the heatmap, we retained only the genes introgressed in at least one of these S. cerevisiae strains (322 strains × 1,305 genes). If less than 10% of a gene was introgressed, it is shown as white. Strains were grouped using hierarchical clustering. We coloured the strain groups (branches) for which all strains belong to the same phylogenetic clade.
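
The filtering behind panel b can be sketched as follows. This is a hypothetical illustration: the input layout (a nested dictionary of per-gene introgressed fractions) and the function name are assumptions, and the sketch further assumes that a gene counts as introgressed only when at least 10% of it is introgressed, which the legend does not state explicitly. The hierarchical clustering step is omitted.

```python
def filter_introgression(frac, min_genes=5, min_frac=0.10):
    """frac: dict mapping strain -> {gene: introgressed fraction in [0, 1]}.

    Returns (strains, genes, matrix), where matrix[i][j] is the fraction
    for strains[i] and genes[j], or 0.0 when masked (drawn as white).
    """
    # Mask genes below the display threshold.
    masked = {
        strain: {g: f for g, f in genes.items() if f >= min_frac}
        for strain, genes in frac.items()
    }
    # Keep strains with enough introgressed genes.
    strains = [s for s, genes in masked.items() if len(genes) >= min_genes]
    # Keep genes introgressed in at least one retained strain.
    genes = sorted({g for s in strains for g in masked[s]})
    matrix = [[masked[s].get(g, 0.0) for g in genes] for s in strains]
    return strains, genes, matrix

# Made-up toy input with a lowered strain threshold, for illustration only.
toy = {
    "s1": {"g1": 0.5, "g2": 0.05, "g3": 0.9},
    "s2": {"g1": 0.2},
    "s3": {"g2": 0.3, "g4": 1.0},
}
strains, genes, matrix = filter_introgression(toy, min_genes=2)
print(strains, genes, matrix)
```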

Source data

Extended Data Fig. 8 Hybrid subgenome ancestries.

a, Fraction of marker variants, across all marker positions in the S. paradoxus subgenome in each hybrid (panels), that match the corresponding variant in each category of S. paradoxus strains (x axis). Priv. = marker variant is private to this category. Comm. = marker variant is common to all strain categories. b, Maximum likelihood phylogenies of the S. paradoxus (on the left) and S. cerevisiae (on the right) subgenomes of hybrid isolates (red label). The S. paradoxus phylogeny was constructed including only American S. paradoxus strains (ref. 86), while the S. cerevisiae phylogeny was constructed with a selection of S. cerevisiae strains from previous studies (refs. 21,22). The circle represents nodes with SH-aLRT ≥ 95% and UFboot ≥ 95%. The size of the circle at each node reflects the value of SH-aLRT and UFboot (from 95 to 100). c, Genomic profiles of hybrid-descendant pairs from Spain (on the left) and Brazil (on the right). Details of the strain and genome origins are given in the Methods (section Saccharomyces collections). Red and blue blocks represent homozygous S. paradoxus and S. cerevisiae regions, respectively, while grey blocks are heterozygous regions.

Extended Data Fig. 9 Ancient introgression on chromosome III.

a, Density of origin-specific S. paradoxus diagnostic markers across a subset of strains (Supplementary Table 10). The x axis and y axis respectively indicate the density of origin-specific S. paradoxus diagnostic markers that support a Eurasian or American origin of the introgressions. Chromosome III has diagnostic markers with Eurasian ancestry even in clades whose genome-wide introgressed blocks have American origins. b, Ancestry plot of diagnostic markers within the chromosome III introgression block in a Brazilian bioethanol strain and a Sake/Mantou 7 strain. The origin of diagnostic markers is attributed to different S. paradoxus populations (listed along the y axis). Genomic coordinates are from the S.c.c. genome. c, Map of introgressed blocks detected around the chromosome III centromere (black dot) across different S. cerevisiae strains. Red and grey colours represent homozygous and heterozygous S. paradoxus introgression, respectively; blue indicates the S. cerevisiae genome. d, Heatmap of introgressed genes (x axis) detected in the region encompassing the centromere of chromosome III across different S. cerevisiae strains. The strains are clustered by their introgression profile (left side), while the coloured bar (right side) indicates their phylogenetic assignment illustrated in Fig. 4b. e, Alignment of the short reads against S.c.c. for a selection of four strains: CHA (Wine/European), AHQ (Mantou 7), and APA and CME (Sake). Rectangles 1 and 2 highlight SNPs shared between CHA and AHQ that are absent in the Sake strains. Rectangles 3, 4, 5 and 6 highlight private SNPs of the Sake clade. Rectangle 7 encloses the SNPs in AHQ shared with the Sake strains but absent in the Wine/European strain (CHA). f, Maximum likelihood phylogenies of the arms of chromosome III, external to the introgressed block, and of the 2 kb left-side flanking region of the introgression for a selection of strains. The introgression and the subtelomeric/telomeric regions are excluded. Red circles indicate strains with introgressed CEN3. The blue circles at the nodes represent the percentage of SH-aLRT ≥ 80% and UFboot ≥ 95%.

Source data

Extended Data Fig. 10 Introgressed PAD1-FDC1 phenotypic assessment.

a, Generation Time (GT) and yield of different wild type and PAD1-FDC1 knockout (KO) S. cerevisiae isolates (WE: Wine/European; BY: S288C; WA: West African; SA: Sake; NA: North American; and Alp: Alpechin) and the European S. paradoxus CBS432. Growth was phenotyped in both SDC and cinnamic acid (0.6 mM). Three biological replicates derived from single WT and KO clones are reported (n. dots = 3, all P values in Supplementary Table 12). b, Yield (YI) of the wild type, PAD1-FDC1 knockout (KO) and PAD1-FDC1 introgressed (I) Wine/European S. cerevisiae strain DBVPG6765 measured in SDC and cinnamic acid. ΔM refers to the difference between the median values of the distributions. Each genomic background has 2 independently constructed clones and 4 biological replicates (n = 8 dots, all P values in Supplementary Table 13). * indicates 0.01 < P value < 0.05, ** indicates P value < 0.01. Relevant significant P values: WEWT-WEI in CA (YI), 0.0129; WEWT-WEKO in CA (YI), 3.89 × 10⁻⁶ and WEKO-WEI in CA (YI), 2.89 × 10⁻⁶. c, Fitness measured in SDC and cinnamic acid with inverted fluorescent markers (SDCmRuby2-yGamillus - SDCyGamillus-mRuby2: P value = 0.21, SDCyGamillus-mRuby2-CAyGamillus-mRuby2: P value = 0.35). Each circle represents a competition replicate (n. dots = 3, all P values in Supplementary Table 11). d, FACS experiment results showing the two cell populations at time T0 (left panels) and T3 (right panels) in SDC (panels at the top) and Tebuconazole (0.0037 mg/mL, panels at the bottom). Cells with the PAD1-FDC1 Alpechin introgressed copy labelled with mRuby2 (WEI) are highlighted using a violet gate, whereas the green gate is used to group cells carrying the PAD1-FDC1 S. cerevisiae copy labelled with yGamillus (WEWT). In panels a, b and c, all P values were calculated with a two-sided Welch two-sample t-test (no adjustment for multiple tests). All boxes, interquartile range (IQR); whiskers, 1.5 × IQR; thick horizontal line, median.
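
For readers unfamiliar with the Welch two-sample t-test named in this legend, the sketch below computes its test statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library. The function name and the example values are illustrative assumptions, not data from the study. Obtaining the two-sided P value additionally requires the Student's t distribution, which the standard library lacks; `scipy.stats.ttest_ind(x, y, equal_var=False)` is one commonly used routine that returns it.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    nx, ny = len(x), len(y)
    # Squared standard error of each sample mean (sample variance / n).
    vx = variance(x) / nx
    vy = variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Made-up measurements, for illustration only.
t, df = welch_t([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(f"t = {t:.4f}, df = {df:.4f}")
```

Unlike Student's t-test, this form does not pool the two variances, so it stays valid when the groups differ in spread.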

Source data

Supplementary information

Source data

Source Data Fig. 1

Data for figure.

Source Data Fig. 2

Data for figure and statistics.

Source Data Fig. 3

Data for figure.

Source Data Fig. 4

Data for figure.

Source Data Fig. 5

Data for figure and statistics.

Source Data Extended Data Fig. 1

Data for figure.

Source Data Extended Data Fig. 2

Data for figure.

Source Data Extended Data Fig. 3

Data for figure.

Source Data Extended Data Fig. 4

Data for figure.

Source Data Extended Data Fig. 5

Data for figure.

Source Data Extended Data Fig. 6

Data for figure and statistics.

Source Data Extended Data Fig. 7

Data for figure.

Source Data Extended Data Fig. 8

Data for figure.

Source Data Extended Data Fig. 9

Data for figure.

Source Data Extended Data Fig. 10

Data for figure and statistics.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tellini, N., De Chiara, M., Mozzachiodi, S. et al. Ancient and recent origins of shared polymorphisms in yeast. Nat Ecol Evol 8, 761–776 (2024). https://doi.org/10.1038/s41559-024-02352-5

  • DOI: https://doi.org/10.1038/s41559-024-02352-5
