Grape derived wine is one of the most popular alcoholic beverages and has been produced by humans since ancient times. It is the result of grape juice fermentation by yeasts which consume the fruit sugars and mainly release ethanol and carbon dioxide. Even though microorganisms are an essential part of the winemaking process, they must cope with a very hostile and variable environment, characterised by high initial sugar content and subsequent high ethanol content, low pH, presence of antimicrobial agents and lack of nutrients. Despite these stressful conditions, some opportunistic microorganisms manage to survive and multiply during and after alcoholic fermentation. A striking example is the wine spoilage yeast Brettanomyces bruxellensis (teleomorph Dekkera bruxellensis) that is typically detected during wine aging but also at lower frequency during the early stages of the winemaking process (grapes and must)1,2. When it grows in wine, B. bruxellensis produces odorant molecules (namely volatile phenols), which are associated with unpleasant aromas described as barnyard, horse sweat, Band-aid®3,4,5. Therefore, the presence of B. bruxellensis in wine often provokes rejection by consumers and serious economic losses for winemakers6.

The wider industrial relevance of this yeast is highlighted by the fact that it is isolated from various fermented beverages and products. For example, B. bruxellensis is an essential contributor to the elaboration of some specialty Belgian and American beers, which are the result of complex spontaneous fermentations performed by various genera of bacteria and yeasts7,8. Indeed, B. bruxellensis was the first microorganism to be patented for its contribution to English ‘stock’ ales9, in 1904. This yeast has also been isolated from other fermented beverages and food like kombucha, kefir, cider, and olives7,10,11. Interestingly, B. bruxellensis was reported to be a common contaminant in bioethanol production plants12,13, and under the right conditions can take the place of the industrial Saccharomyces cerevisiae strains and perform molasses fermentation13.

The recurrent problem of B. bruxellensis in wine and its potential use for beer and bioethanol industrial fermentations has led to high and rising interest in this yeast species. Various studies highlighted great phenotypic diversity of B. bruxellensis regarding growth capacity14,15,16,17,18,19, sugar metabolism20,21,22,23, nitrogen source utilisation21,24, volatile phenols production5,14,18,20,23,25,26, behaviour in viable but not cultivable state27, and response to abiotic factors like temperature20,28, pH20,29, oxygen availability30,31,32 and sulfur dioxide (SO2)20,23,28,33,34,35. This phenotypic variation makes it difficult to predict the spoilage potential of B. bruxellensis and is therefore a major concern for winemakers. For example, across several studies the concentration of molecular SO2 (mSO2) required to stop B. bruxellensis’ growth ranged from 0.2 to 1.0 mg.L−136. This observed variability was at least partly due to the use of different strains. However, only a few studies have attempted to correlate SO2 tolerance to a genotypic profile20,34. A striking example is a study of 41 B. bruxellensis wine isolates from Australia showing that the most common genotype (92% of studied isolates) was correlated with SO2 tolerance, thus suggesting that SO2 usage patterns may have created a selective pressure on this population34.

Despite several studies that have explored genetic diversity of this species using fingerprinting techniques such as Random Amplified Polymorphism DNA (RAPD), Amplified Fragment Length Polymorphism (AFLP), pulsed field electrophoresis (REA-PFGE), and mtDNA restriction analysis14,17,20,25,26,34,37,38,39,40, our understanding of the B. bruxellensis global population structure and the factors that drive it remains limited. Several studies highlight an important intraspecific diversity of B. bruxellensis14,20,38,40 which makes the prediction of its occurrence and behaviour in industrial fermentations difficult. Further, recent genetic studies on a limited number of strains24,41,42 have suggested that polyploidy and hybridisation may play a significant role in microevolution of the species, along with plasticity in chromosomal structure due to “untraditional” centromeres43. The role of polyploidy in adaptive changes to suit environment and/or lifestyle has been observed in other organisms44,45,46,47, notably for S. cerevisiae which shares similar fermentation niches to those occupied by B. bruxellensis.

To enhance our knowledge of the global B. bruxellensis population, here we used a recently developed microsatellite profiling method42 to genotype 1488 isolates from various fermentation niches across five continents. Typing based on microsatellite markers is a rapid, reliable and discriminant genotyping approach that has been successfully used to decipher complex population structures48,49 and provide insight into the ploidy state42. This work aimed to determine the population structure of a large B. bruxellensis collection and to test for a link between the identified subpopulations and their adaptive ability, with a focus on tolerance to sulfur dioxide.


B. bruxellensis genotyping analysis and population structure

The B. bruxellensis collection used in this study comprised 1488 isolates from 29 countries and 9 different substrates, the majority of strains (87%) originating from wine (Supplementary Table S1). The 1488 isolates were genotyped with 12 primer pairs amplifying microsatellite regions, including four new loci in addition to the eight previously published42. Characteristics of the different loci and number of alleles are given in Supplementary Table S2. One locus out of the four additional loci (D1) displayed a high allelic diversity, presenting 18 different alleles. All isolates were shown to be heterozygous for at least one locus. Many isolates were shown to have more than 2 alleles per locus. About half of the isolates had up to 3 alleles per locus (792 isolates) and some had up to 4 and 5 alleles per locus (67 and 1 isolates, respectively). The high number of isolates with up to 3 alleles per locus suggests the existence of triploidy in the studied population. A similar observation was reported previously by Curtin et al.41 and Borneman et al.24, who performed de novo sequencing and comparative genomics respectively and highlighted two triploid strains with a core diploid genome plus an additional set of chromosomes, the extra set having a different origin in each strain. Based on those observations, we extend the triploidy hypothesis to the isolates in our collection whose genotypes present more than two alleles per locus.

The raw data obtained by the microsatellite analysis corresponds to the alleles (i.e. the size of the amplified microsatellite sequences) per locus and per strain (Supplementary Table S3). This data was further used for the construction of a dendrogram reflecting the genetic proximity between strains (Fig. 1A). The method was based on Bruvo’s distance and Neighbour Joining (NJ) and was chosen for being reliable and suitable for populations with mixed ploidy levels. The population clusters in 3 main genetic groups (Fig. 1A). Additional methods, including complementary tests and Bayesian approaches were applied to verify the reliability of the clustering obtained by NJ (Fig. 1). The NJ tree showed three main branches that were almost perfectly conserved with UPGMA method (Fig. 1A and B). Then, a multidimensional scaling was performed with Bruvo’s distance matrix on the same dataset and using the cmdscale function on R (Fig. 1C). The multidimensional scaling analysis showed that the three main groups were almost identical to the clusters previously defined. Furthermore, the partition method50 was applied on the same dataset. This algorithm identifies monophyletic clusters for which the individuals are more closely related than randomly selected individuals. The reliability of the node is then computed and nodes with reliability higher than 90% are considered (Fig. 1D). The partition method also confirmed the three main clusters obtained with NJ as reliable. Finally, clusters were identified using successive K-means (adegenet package, function ‘find.clusters’). This function implements the clustering procedure used in Discriminant Analysis of Principal Components (DAPC)51, where successive K-means are run with an increasing number of clusters (k), associated with a statistical measure of goodness of fit. This approach identified 3 clusters, once again very similar to those obtained by NJ (Fig. 1E). 
Overall, the five approaches taken together confirmed the reliability of the three main clusters observed in the studied B. bruxellensis population.
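To make the distance underlying the dendrograms more concrete, the following minimal Python sketch computes Bruvo's distance for the simplest case only: two genotypes of equal ploidy with no missing alleles. It is not the poppr implementation used in this study; the function names are hypothetical, alleles are assumed to be expressed as repeat counts rather than raw fragment sizes, and the treatment of comparisons between isolates of different ploidy is deliberately left out.

```python
from itertools import permutations


def bruvo_locus_distance(alleles_a, alleles_b):
    """Bruvo distance at one locus for two genotypes of equal ploidy.

    Alleles are repeat counts (fragment size divided by the repeat-unit
    length). Each allele pair contributes 1 - 2**(-|difference|).
    """
    if len(alleles_a) != len(alleles_b):
        raise ValueError("this sketch only handles genotypes of equal ploidy")
    ploidy = len(alleles_a)
    best = float("inf")
    # Try every pairing of alleles between the two genotypes and keep
    # the pairing with the smallest summed distance.
    for perm in permutations(alleles_b):
        total = sum(1.0 - 2.0 ** (-abs(a - b)) for a, b in zip(alleles_a, perm))
        best = min(best, total)
    return best / ploidy


def bruvo_distance(genotype_a, genotype_b):
    """Average the per-locus distances over all loci (lists of allele lists)."""
    per_locus = [bruvo_locus_distance(a, b) for a, b in zip(genotype_a, genotype_b)]
    return sum(per_locus) / len(per_locus)
```

Two genotypes carrying the same alleles are at distance 0, and the distance grows towards 1 as alleles differ by more repeat units. A matrix of such pairwise distances is the input to the NJ, UPGMA and multidimensional scaling steps described above.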

Figure 1

B. bruxellensis population clusters identification by combining different tools and parameters. (A) Dendrogram using Bruvo’s distance and NJ clustering. The figure was produced using the poppr package in R. (B) Dendrogram using Bruvo’s distance and UPGMA clustering. The figure was produced using poppr. Isolates are shown in the same colours as in A. (C) Multidimensional scaling performed with Bruvo’s distance matrix on the same dataset and using the cmdscale function in R. For isolates with incomplete genotyping, the missing data was inferred from the closest neighbour using Bruvo’s distance. Isolates are shown with the same colours as in A. (D) Node reliability using the partition method50. Only the nodes with reliability >90% are shown on the NJ tree. (E) Cluster identification using successive K-means. The find.clusters function from the adegenet package in R was applied, using within-groups sum of squares (WSS) statistics and the default criterion diffNgroup. This tool identifies an optimal number of 3 clusters, represented on the NJ tree using different arbitrary colours. (F) Inferred ploidy. The maximum number of alleles per locus was computed. Isolates with up to 2 alleles/locus were considered as diploid (2n). Isolates with up to 3 alleles/locus were considered as triploid (3n), and the number of loci showing up to 3 alleles was recorded (1–2 loci, or more than 2 loci showing up to three alleles). Finally, isolates with up to 4 or 5 alleles/locus were noted as 4n/5n. The inferred ploidy is represented on the NJ tree.

Since B. bruxellensis is known to exhibit different ploidy levels24,41, we inferred putative ploidy level based on the microsatellite genotyping. Isolates with up to 2 alleles per locus were considered diploid and noted 2n (Fig. 1F). Isolates with up to 3 alleles/locus were considered triploid (3n). Finally, isolates with up to 4–5 alleles/locus were noted as 4n/5n. The ploidy level coincided clearly with the three main branches of the dendrogram, the red and orange groups being mostly triploid and the blue-green mostly diploid. Within this last cluster, two triploid sub-groups based on the substrate origin and ploidy level of the strains were defined, marked with blue and cyan colours. Overall, the combination of different methods and factors defined 3 main groups, the ‘diploid’ one being further divided into 3 subgroups (Table 1 and Fig. 2).
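The ploidy inference rule described above can be sketched in a few lines of Python. This is an illustration of the rule as stated in the text and in Fig. 1F, not the script used in the study; the function names and the dictionary layout (locus name mapped to a list of allele sizes) are assumptions.

```python
def infer_ploidy(genotype):
    """Classify an isolate from its maximum number of distinct alleles per locus.

    `genotype` maps a locus name to the list of allele sizes observed there.
    Loci with no data (empty lists) are ignored.
    """
    allele_counts = [len(set(alleles)) for alleles in genotype.values() if alleles]
    if not allele_counts:
        return "unknown"
    max_alleles = max(allele_counts)
    if max_alleles <= 2:
        return "2n"
    if max_alleles == 3:
        return "3n"
    return "4n/5n"


def count_loci_with_three_alleles(genotype):
    """Number of loci showing exactly three distinct alleles (recorded in Fig. 1F)."""
    return sum(1 for alleles in genotype.values() if len(set(alleles)) == 3)
```

Note that this rule gives a lower bound on ploidy: a triploid isolate that happens to carry at most two distinct alleles at every genotyped locus would be scored as 2n.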

Table 1 Clusters considered as a result of the microsatellite analysis and cluster validation with five different clustering methods.
Figure 2

Dendrogram of 1488 isolates of B. bruxellensis using 12 microsatellite markers. The dendrogram was drawn via the poppr package, using Bruvo’s distance and NJ clustering. Five clusters were considered and are represented by different colours. Isolates displaying identical genotypes are represented by a unique tip whose size is proportional to the number of isolates. Inferred ploidy was made as described in Fig. 1F. The histograms represent the distribution of isolates depending on the substrate and the five considered clusters. The pie chart illustrates the proportion of the strains originating from different types of sources.

To assess the relative importance of geographical localisation, substrate origin and ploidy level on B. bruxellensis’ population structure, an analysis of molecular variance (AMOVA) was performed. The three factors were shown to be significant (p-value < 0.0001). Ploidy level explained 46.9% of the variance, whereas the geographical origin and substrate factors explained only small proportions of the total variation (around 5% for each) (Table 2). However, when considering non-wine isolates, the geographical origin explains 54.8% of the total variance, suggesting that wine genotypes are highly disseminated across the regions studied in comparison with other substrates. The correlation between genetic and geographic distance matrix (MANTEL test) was also significant (p-value = 0.0009), confirming that the genetic variation of the total population is significantly related to geographical localisation. The MANTEL test, performed only on the wine strains (p-value = 0.0040), also confirmed the results obtained with AMOVA, suggesting a different population structure amongst wine strains compared to those from the other niches.
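For readers unfamiliar with the Mantel test, the sketch below shows its principle in plain Python: the correlation between the off-diagonal entries of two distance matrices is compared with the correlations obtained after randomly relabelling the individuals of one matrix. The text does not specify the correlation coefficient or the number of permutations that were used, so the Pearson coefficient, the 999 permutations and all function names here are assumptions.

```python
import math
import random


def _upper_triangle(matrix, order):
    """Off-diagonal entries of a square matrix after relabelling by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """One-sided permutation Mantel test; returns (correlation, p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo_values = _upper_triangle(geographic, identity)
    observed = _pearson(_upper_triangle(genetic, identity), geo_values)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel rows and columns of the genetic matrix together
        if _pearson(_upper_triangle(genetic, order), geo_values) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent of one another, which is why an ordinary correlation test on the matrix entries would not be valid.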

Table 2 Impact of geographical localisation, substrate origin and ploidy on the population variance (AMOVA test).

Core genotype analysis

Core diploid data subset

Most classical population genetic analyses cannot be performed using our initial microsatellite dataset since the B. bruxellensis population includes diploid and polyploid isolates, and most traditional analyses are not available for mixed ploidy levels. To overcome such difficulties, we excluded the alleles identified as specific to the isolates showing more than 3 alleles for at least one locus. Among the 124 alleles included in the initial dataset, 70 were found to be significantly associated with the triploid isolates (χ² test, p-value < 0.01), and were excluded to create a new dataset comprising alleles representative of the core genotype (i.e. the genotype common to all groups). This approach is justified as previous comparative genomics studies showed that B. bruxellensis isolates shared a core diploid genome24.
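The allele filtering step can be illustrated with a χ² test on a 2 × 2 contingency table (allele present or absent, in triploid or in other isolates). The exact table layout used in the study is not given in the text, so the layout below, the absence of a continuity correction, the additional enrichment check and the function names are all assumptions. For one degree of freedom the p-value has the closed form erfc(√(χ²/2)), which avoids any dependency beyond the standard library.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test with 1 degree of freedom, no continuity correction.

    a: triploid isolates carrying the allele,  b: triploid isolates lacking it
    c: other isolates carrying the allele,     d: other isolates lacking it
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def is_triploid_associated(a, b, c, d, alpha=0.01):
    """True when the allele is significantly enriched in triploid isolates."""
    _, p_value = chi_square_2x2(a, b, c, d)
    enriched = a / (a + b) > c / (c + d)
    return enriched and p_value < alpha
```

An allele flagged by `is_triploid_associated` would be dropped from the dataset, leaving the alleles that make up the core genotype.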

The obtained core genotype dataset showed up to 2 alleles per locus for most individuals (1350 out of 1488) and only 138 remaining individuals had loci with 3 or 4 alleles. This indicates that the removal of specific triploid alleles allowed us to have access to the core diploid genome common to all B. bruxellensis isolates. Loci with more than 2 alleles were considered as missing data and only concerned 138 individuals, of which 130 only had one locus with 3 alleles.

Ancestral populations and inference of population structure

The LEA package and its snmf function in R were used to infer population structure for the ‘core diploid’ dataset. The number of ancestral populations tested ranged from K = 1 to K = 15 (100 repetitions), and the entropy criterion was computed to choose the number of ancestral populations that best explains the genotypic data (Supplementary Fig. S1). Entropy was minimal for K = 5 ancestral populations (K = 3, 4, 5, 6 shown in Supplementary Fig. S2). Such Bayesian analysis shows that these 5 ancestral populations are congruent with previous analyses that considered the complete dataset (Fig. 3): the AWRI1499-like (wine, red) and AWRI1608-like (beer, orange) groups were associated with only one ancestral population. Likewise, most of the blue-green subgroups (wine CBS 2499-like, wine L0308-like, kombucha L14165-like) previously defined were associated with only one ancestral population. Finally, only the tequila/ethanol group (CBS 5512-like) seemed to be associated with more than one ancestry. Altogether, the population structure analysis on the core diploid genotype confirmed the previous clustering and suggested the existence of only one ancestral population for each current population.

Figure 3

Ancestral populations of 1488 B. bruxellensis strains. STRUCTURE plots for K = 5 (the number of ancestral populations with the lowest entropy, see Supplementary Fig. S1). Each bar represents an isolate and the colour of the bar represents the estimated ancestry proportion of each of the K clusters. The same colour code is kept as in Figs 1 and 2.

Population differentiation analysis

A population differentiation analysis was performed by calculating the fixation index (FST) on the core diploid genotype dataset (Fig. 4). The wine AWRI1499-like population is highly differentiated from the beer AWRI1608-like and wine CBS 2499-like groups (with FST 0.36 and 0.39 respectively). This confirms the grouping obtained by the previous analyses. In addition, the pairwise FST values showed high differentiation between the beer AWRI1608-like and wine CBS 2499-like populations (FST 0.28). The L14165-like kombucha population seems to be mostly differentiated from the AWRI1608-like beer population and is closer to the CBS 5512-like tequila/ethanol group. Finally, it is interesting to point out that the CBS 5512-like group is not highly differentiated from any of the other groups, which is congruent with the fact that the population structure analysis inferred multiple ancestral populations for that group.
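As a reminder of what the fixation index measures, the following sketch computes a simple Nei-style FST = (HT − HS)/HT for one locus and two populations from their allele frequencies. The estimator actually used for Fig. 4 is not stated in the text, so this formula, the equal weighting of the two populations and the function names are assumptions made for illustration only.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) from a dict of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def pairwise_fst(freqs_1, freqs_2):
    """Nei-style F_ST = (H_T - H_S) / H_T for one locus, equal population weights."""
    alleles = set(freqs_1) | set(freqs_2)
    # Allele frequencies in the pooled population.
    pooled = {a: (freqs_1.get(a, 0.0) + freqs_2.get(a, 0.0)) / 2.0 for a in alleles}
    # Mean within-population heterozygosity.
    h_s = (expected_heterozygosity(freqs_1) + expected_heterozygosity(freqs_2)) / 2.0
    h_t = expected_heterozygosity(pooled)
    if h_t == 0.0:
        return 0.0  # both populations fixed for the same allele
    return (h_t - h_s) / h_t
```

Two populations with identical allele frequencies give FST = 0, whereas two populations fixed for different alleles give FST = 1.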

Figure 4

Population differentiation represented by fixation index (FST) of B. bruxellensis genetic groups between each other. The range of FST is from 0 to 1, 1 meaning that the two populations do not share any genetic diversity.

Sulfite tolerance

Sulfur dioxide tolerance was assayed for a subset of B. bruxellensis (a total of 39 strains). The chosen strains were selected according to their various geographical origins, substrates and different genetic groups. Some isolates showing identical microsatellite genotypes were included to evaluate possible sulfur tolerance variation between strains with undifferentiated genotypic patterns (13-EN11C11 = L0417 = L0424; UWOPS 92–244.4 = UWOPS 92–262.3; L0469 = L14186). Each strain was grown in medium with increasing SO2 concentration (ranging from 0 to 0.6 mg.L−1 molecular SO2) in biological triplicates, so that more than 480 fermentations were monitored.

Three growth parameters (lag phase, maximum growth rate, maximal OD) in the presence of four different concentrations of mSO2 were followed until stationary phase was reached or for a maximum of 300 h when growth was slow or absent. The isolates presented different behaviour according to mSO2 concentration (Fig. 5). Based on the growth parameters of the strains when exposed to increased concentrations of mSO2, two main groups were identified: (1) sensitive strains (S) characterised by an altered growth with (i) a significant lag phase prolongation, (ii) a significant decrease in maximum growth rate, and/or (iii) a significant decrease in maximum OD600 (e.g. the sensitive strain L0422 had a lag phase of 17.2 h, 40.7 h and 255.8 h, a growth rate of 0.11, 0.07 and 0.02 divisions/h, and a maximum OD600 of 2, 1.9 and 0.8 at 0, 0.2 and 0.4 mg.L−1 mSO2 respectively, and did not grow at 0.6 mg.L−1 mSO2); (2) tolerant strains (T) that showed unmodified growth rate and maximum OD600, although a significant prolongation of lag phase was sometimes observed (e.g. the tolerant strain AWRI1499 had a maximal growth rate of 0.07, 0.09, 0.08 and 0.07 divisions/h, a maximum OD600 of 1.9, 2.0, 1.9 and 1.9, and a lag phase of 75, 56.5, 91.5 and 110.3 h at 0, 0.2, 0.4 and 0.6 mg.L−1 mSO2 respectively). Mean values of those parameters for each strain are shown in Supplementary Table S4. A clear relation between genetic group and SO2 tolerance was highlighted (Fig. 5). The isolates from groups AWRI1608-like, CBS 5512-like, CBS 2499-like and L14165-like were mostly identified as sensitive (S), whereas the triploid AWRI1499-like and triploid L0308-like groups were mostly classified as tolerant (T). Furthermore, the isolates with an identical microsatellite profile presented similar behaviour in terms of growth parameters in the different conditions studied here (Fig. 5 and Supplementary Table S4).
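The three growth parameters can be extracted from an OD600 time series as sketched below. This is an illustration rather than the analysis script of the study. Two points are assumptions: the lag-phase criterion of Fig. 5 ("OD above initial OD*5%") is read here as the first reading exceeding the initial OD by 5%, and the maximal growth rate is taken as the largest number of doublings per hour between consecutive readings. The function names are hypothetical.

```python
import math


def lag_phase(times_h, od_values, threshold=0.05):
    """First time point at which OD exceeds the initial OD by `threshold`.

    Returns None when the culture never crosses the threshold (no growth).
    """
    cutoff = od_values[0] * (1.0 + threshold)
    for t, od in zip(times_h, od_values):
        if od > cutoff:
            return t
    return None


def max_growth_rate(times_h, od_values):
    """Largest number of doublings (divisions) per hour between consecutive readings."""
    best = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        # Skip unusable intervals (non-increasing time or non-positive OD).
        if dt <= 0 or od_values[i - 1] <= 0 or od_values[i] <= 0:
            continue
        rate = math.log2(od_values[i] / od_values[i - 1]) / dt
        best = max(best, rate)
    return best


def max_od(od_values):
    """Maximal OD reached during the monitored period."""
    return max(od_values)
```

In practice, growth rates estimated from pairs of consecutive readings are sensitive to measurement noise, so a fit over a sliding window of several readings would usually be more robust.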

Figure 5

Growth parameters of B. bruxellensis strains at different concentrations of SO2. 39 strains belonging to the 6 genetic groups defined previously were tested in small scale fermentations and growth (OD600) was measured in media containing different concentrations of sulfur dioxide (0, 0.2, 0.4, and 0.6 mg.L−1 mSO2) and in biological triplicates. Three parameters were considered: lag phase (h): end of lag phase considered when OD above initial OD*5%; maximal growth rate (r) = number of cellular divisions per hour; maximal OD; S and T stand for sensitive and tolerant (Kruskal-Wallis test, α = 5%). Genetic groups are represented in the same colours as on Fig. 2.


The yeast B. bruxellensis has gained importance for its impact not only in the wine industry, but also in beer- and bioethanol-associated fermentation processes. Consequently, many independent studies were conducted on different B. bruxellensis collections, but without yielding a holistic picture of the species. In this study, a large collection of B. bruxellensis strains (1488 isolates) from various substrates (9, the majority of strains (87%) being isolated from wine) and geographic origins (5 continents) was genotyped. The use of a reliable and robust method (microsatellite analysis) provided a general picture of the species’ genetic diversity and population structure. The analysis of the complete genotype dataset highlighted 3 main genetic clusters in the B. bruxellensis population, represented by the AWRI1499-like, AWRI1608-like and CBS 2499-like groups, which correlate with ploidy level and substrate of isolation. Three sub-clusters were also defined according to their ploidy level and substrate of isolation, namely the tequila/ethanol CBS 5512-like group, the wine L0308-like group, and the kombucha L14165-like group. Our results are consistent with comparative genomics analysis showing that the AWRI1499, AWRI1608 and AWRI1613 (genetically close to the strain CBS 2499) strains are genetically distant and that the AWRI1499 and AWRI1608 strains are triploid while AWRI1613 is diploid24.

Heterozygosity for at least one out of the 12 microsatellite loci was shown for all B. bruxellensis isolates. This observation supports the assumption that a simple haploid organisation of the genome is excluded, which is congruent with previous results based on the Southern analysis of single gene probes of 30 B. bruxellensis strains from different geographical origins52. In comparison, using microsatellite analysis, Legras et al. (2007) reported 102 out of 410 S. cerevisiae isolates (about 25%) and 75% of Saccharomyces uvarum strains (among 108 isolates from various geographical and substrates origins) to be homozygous53. In general, highly homozygous strains are associated with sporulation and selfing phenomena54. So, this could suggest that in the case of B. bruxellensis these mechanisms are non-existent or very rare amongst isolates from industrial fermentation environments. Indeed, there is only one study to our knowledge55, which reports spore formation for B. bruxellensis (and therefore its teleomorph form Dekkera bruxellensis). In the scenario of rare or non-existent sexual reproduction, a large proportion of heterozygous strains would promote higher phenotypic diversity and therefore colonisation of new niches and adaptation to new environments56.

Our results confirm on a large scale the assumption that the B. bruxellensis population is composed of strains with different ploidy level24,41,42,52, as 57.8% of the isolates were shown to have more than 2 alleles for at least one locus. Moreover, polyploid strains were associated with various fermentation niches and geographical regions. A strong correlation between genetic clustering and ploidy level was highlighted, with some clusters predicted to be diploid (CBS 2499-like) while others were composed of mainly triploid isolates (e.g. AWRI1499- and AWRI1608-like). The latter two clusters derive from distinct ancestral populations and thus, presumably from different triploidisation events. The polyploid state typically has a high fitness cost on the eukaryote cell due to the difficulty of maintaining an imbalanced number of chromosomes during cell division as well as other effects caused by nucleus and cell enlargement45. Thus, it is presumed that a stable polyploid or aneuploid state is maintained when it confers an advantage for the survival of the cell in particular conditions47. Indeed, aneuploidy and polyploidy contribute to genome plasticity and have been shown to confer selective and fitness advantages to fungi in extreme conditions, such as the presence of high concentrations of drugs, high osmotic pressure, low temperature, and others (for review, see44,47,57). Similar observations have been made in clinical microbiology, for example, 70% of 132 completely sequenced S. cerevisiae clinical isolates with different geographic origins were shown to be poly- or aneuploid58. It has been suggested that the aneuploid state contributes to the transition from commercial (industrial fermentations) to clinical (human pathogen lifestyle) environments. Aneuploidy was also reported for another human pathogen, C. albicans, for which an aneuploidy of an isochromosome [i(5L)] is shown to confer resistance to fluconazole59. In the industry, stable autotetraploid S. cerevisiae strains have been described among isolates from a bakery environment and it was suggested that their prevalence in sour dough fermentation could be the result of human selection for tolerance to high osmotic pressure and high metabolic flux, both highly favourable characteristics for baking60. In the case of B. bruxellensis, however, polyploidy seems to be not only due to a “simple” duplication of chromosomes and/or regions of chromosomes but is the result of independent hybridisation events with closely or distantly related unknown species24, which result in allotriploid strains. Efficient hybrid species are not rare in human related fermentations44,61,62 and often the hybridisation with a genetically close species is believed to confer tolerance to a specific stress factor in a given environment. This is the case of S. pastorianus, used for lager beer fermentations, which are characterised by low temperatures. This yeast has recently been shown to be a hybrid between S. cerevisiae and S. eubayanus, a cryotolerant species isolated from forests in Patagonia63, Tibet64 and recently from New Zealand65. Thus, presumably sterile hybrids were naturally generated and they multiplied clonally, accumulating mutations which enhanced the adaptability of the new “species”63. Hybrids are also a widespread state among wine yeast, where natural or laboratory obtained combinations between two species could have interesting technological properties62,66,67,68,69. Another form of genome dynamics was also highlighted for the diploid CBS 2499 strain, which possesses a specific centromeric loci configuration that enables genome rearrangements and ploidy shifts43. Based on the body of knowledge concerning other polyploid micro- and macro-organisms and the prevalence of polyploid strains highlighted in this study, we assume that B. bruxellensis has adapted to environmental stress factors by means of genome plasticity, namely polyploidy.

Our study showed that at least one group, the AWRI1499-like triploid wine group, is composed of wine isolates that are highly tolerant to SO2 and that are clearly divergent from other B. bruxellensis clusters (FST higher than 0.35 when compared with AWRI1608-like and CBS 2499-like groups). Nevertheless, for some wine samples, isolates from both AWRI1499-like triploid group and the CBS 2499-like diploid group were identified. Coexistence of diploid and polyploid (auto- and allopolyploid) “microspecies” has often been reported for plants, in which the polyploids are widely distributed as opposed to the diploids that have a more restricted distribution70. Babcock and Stebbins were the first to name this coexistence of populations a diploid-polyploid complex71 for a Crepis species defined as a group of interrelated and interbreeding species that also have different levels of ploidy. These authors claimed that such polyploid complex can arise when there are at least two genetically isolated diploid populations and auto- and allopolyploid derivatives that coexist and interbreed. In the case of B. bruxellensis, the sexual cycle of this yeast is not yet elucidated and interbreeding remains to be evidenced. However, we propose that B. bruxellensis could be described as a diploid-triploid complex, in which sub-populations with different ploidy levels coexist.

To obtain a deeper understanding of the factors shaping B. bruxellensis population structure, we explored the impact of geographical localisation and industrial fermentation environment of origin on the total genetic variance of the studied population. Contribution of the “geographic origin” factor to the population structure was shown to be significant yet only explained a relatively small proportion of variation. However, the variance proportion explained by this factor is much higher when considering non-wine isolates, suggesting that wine strains are highly dispersed worldwide. This dispersal could easily reflect exchange of material and human transport associated with winemaking, followed by adaptation to local winemaking practices38. Exchange of material also happens between different industries, which would facilitate local transfer of microorganisms between beverages. For example, some beers are aged in oak barrels previously used for winemaking72. Also, in the past, beer fermentation is thought to have been initiated by the addition of a small amount of wine73. Such exchanges could be a possible explanation for the low (but significant) contribution of the “substrate of isolation” factor to the total genetic variance in the studied population (5.93%, p-value < 0.0001). Substrate of isolation and geographic origin contributed to a similar extent to the total genetic variance of the population. However, this percentage remained low (5%) compared to S. cerevisiae for which geographic origin was shown to contribute to 28% of the genetic variance53, and Candida albicans for which 39% were reported74. For S. cerevisiae, a significant contribution of geographic origin to the genetic variance is often perceived as a sign of local domestication53,75. Like S. cerevisiae, B. bruxellensis is isolated from human-conducted fermentations including beer and wine. However, until now there are no B. bruxellensis isolates from “natural” non-human related habitats contrary to the case of S. cerevisiae76,77,78. A recent comparative study of strains with different industrial origins and their growth capacities in various types of media (wine, beer, and soft drink) suggests adaptation of B. bruxellensis strains to different fermented beverages23. In our study, a low but significant contribution of substrate of isolation to the total genetic variance of the species was highlighted (5.93%, p-value < 0.0001), which is an indicator for the adaptation of certain sub-groups to different human-related niches (e.g. winemaking conditions, kombucha fermentation, and others). This structuration is further accompanied by a specific genetic configuration, some groups being mostly diploid and others polyploid.

The hypothesis that the triploid state of B. bruxellensis is maintained for some genetic groups because of its contribution to adaptation to a certain type of environment or stress factors is strongly supported by the sulfite tolerance assay performed in our study. This indicated that strains representative of the globally dispersed wine triploid AWRI1499-like group are highly tolerant to SO2. Sulfur dioxide is the most common antimicrobial agent used in winemaking. However, very tolerant B. bruxellensis strains have been reported36. Particularly, in Australia 92% of the isolates are genetically close to a strain that has be shown to be triploid by genome sequencing and highly tolerant to SO2 (normal growth at more than 0.6 mg.L−1 mSO2)34. Here, we show that isolates from this genetic group are highly represented worldwide, namely in France, Italy, Portugal, Southern Argentina and Chile. Furthermore, we confirmed on a larger scale (39 strains from different geographical and fermentation niches) that even high SO2 doses could not guarantee the absence of growth of these strains and therefore their potential to spoil wine. In this context, it is worth noting that isolates from substrates other than wine, were all sensitive to SO2 which suggests a direct link between SO2 exposure in wine and tolerance to this compound. Survival in the presence of SO2 has been broadly studied in S. cerevisiae but is still not fully elucidated. Molecular SO2 was reported to be the major active antiseptic species of SO2 in wine by different authors (see review of Divol et al., 2012) whereas bisulfites species could also play a role at minor level, in the biocidic effect of PMB79. Molecular SO2 could enter the cell passively or via selective transport80. Once inside the cell, molecular SO2 at approximate intracellular pH 5.5–6.5, rapidly dissociates into bisulphite and sulphite anions. 
Bisulphite is then the dominant antimicrobial species of SO2 inside the cell, where it can interact with different enzymes and molecules and thus affect basic metabolic pathways such as glycolysis. Strategies to tolerate SO2 are as numerous as its modes of action on the cell: production of molecules that bind SO2 (acetaldehyde, pyruvate, and others), SO2 oxidation, and active SO2 efflux by a sulfite pump (SSU1)80. Even if these mechanisms are not elucidated in B. bruxellensis, SO2 tolerance could be linked to different aspects: the presence of gene(s) coding for a sulfite transporter, the presence of this gene (or genes) in multiple copies and therefore its overexpression, differences in gene regulation leading to a more efficient response to SO2 toxicity, or a morphological and physiological state of the cell that gives it the ability to tolerate this antimicrobial agent (cell membrane structure, growth, etc.). The fact that all the highly tolerant B. bruxellensis strains are triploid indicates that this genetic configuration could contribute to SO2 tolerance. As mentioned in the previous paragraphs, polyploid states are maintained when they confer a selective advantage. In this case, we can hypothesise that the allotriploid AWRI1499-like strains combine genetic and physiological characteristics from the parent genomes that confer on them the ability to survive in the presence of SO2.

A possible strategy to cope with the issue of highly tolerant strains would be to increase the SO2 concentration added to the must and wine. However, strict legislation and consumer pressure to reduce any kind of wine additive make it undesirable to produce wines with the high concentrations of SO2 that would be needed to prevent the growth of AWRI1499-like strains. Therefore, the genetic content of B. bruxellensis has to be considered when choosing spoilage prevention and treatment methods in the winery in order to obtain optimal effect with minimum intervention. Overall, our results show that polyploid strains are widely disseminated and suggest that B. bruxellensis is a diploid-triploid complex whose population structure has been influenced by the use of sulfur dioxide as a preservative in winemaking. Thus, we highlight the importance of the B. bruxellensis species as a non-conventional model microorganism for the study of polyploidy as an adaptation mechanism to human-related environments.

Materials and Methods

Yeast strains

B. bruxellensis strains used in this study were collected from different origins: (i) from CRB Oenologie collection (Centre de Ressources Biologiques Oenologie, Institut des Sciences de la Vigne et du Vin, France), (ii) sent from other laboratories, and (iii) isolated from wines for the purpose of this work. Overall, the collection of B. bruxellensis used in this study contained 1488 isolates (Supplementary Table S1) which were further analysed by genotyping.

Strain isolation from contaminated wines was performed by spreading 100 µL of wine sample on solid YPD medium containing 10 g.L−1 yeast extract (Difco Laboratories, Detroit, MI), 10 g.L−1 bactopeptone (Difco Laboratories, Detroit, MI), 20 g.L−1 D-glucose (Sigma-Aldrich) and 20 g.L−1 agar (Sigma-Aldrich). This medium was supplemented with antibiotics in order to limit the growth of bacteria (5 g.L−1 chloramphenicol, Sigma-Aldrich), moulds (7.5 g.L−1 biphenyl, Sigma-Aldrich), and yeast of the Saccharomyces genus (50 g.L−1 cycloheximide, Sigma-Aldrich). The samples were then incubated at 30 °C for 5 to 10 days. Ten colonies were then picked randomly and analysed by PCR using the DB1/DB2 primers81 (Eurofins MWG Operon, Les Ulis, France) for species identity confirmation (DNA extraction was performed as described below for the microsatellite analysis). Putative B. bruxellensis colonies were streaked and grown on selective YPD medium twice consecutively in order to ensure strain purity. Colonies that gave a positive result by DB1/DB2 PCR were stored at −80 °C in 50% YPD/glycerol medium.

Genotyping by microsatellite analysis

DNA extraction

For DNA extraction, strains were grown on YPD solid medium at 30 °C for 5 to 7 days, and fresh colonies were lysed in 30 µL of 20 mM NaOH solution heated at 99 °C for 10 minutes using an iCycler thermal cycler (Biorad, Hercules, CA, USA).

Microsatellite loci identification and primers design

Twelve pairs of primers were designed on the basis of the de novo genome assembly of the triploid B. bruxellensis strain AWRI149941 as previously described by Albertin et al.42. Four pairs of primers were added to the eight that were previously described in order to improve the discriminative power of the test and to ensure its robustness (Supplementary Table S2).

Microsatellites amplification

In order to reduce the time and cost of analysis, some of the PCR reactions were multiplexed as shown in the Tm column in Supplementary Table S2. By this procedure the number of PCR reactions per sample was reduced from 12 to 9.

PCR reactions were performed in a final volume of 15 µL containing 1 µL of DNA extract (extraction performed as described above), 0.05 µM of forward primer, 0.5 µM of reverse primer and labelled primer (or 1 µL in the case of duplex PCR reactions), 1×Taq-&GO (MP Biomedicals, Illkirch, France). The forward primers were tailed on their 5′ end with the M13 sequence as described by Schuelke et al.82. Universal M13 primers were labelled with FAM-, HEX-, AT565- (equivalent to PET) or AT550- (equivalent to NED) fluorescent dyes (Eurofins MWG Operon, Les Ulis, France). This method allows several microsatellite marker primers to be labelled with the same fluorochrome-labelled primer (M13) instead of labelling each of the 12 forward primers, and thus significantly reduces the analysis cost.

Touch-down PCR was carried out using an iCycler thermal cycler (Biorad, Hercules, CA, USA). The program consisted of an initial denaturation step of 1 min at 94 °C followed by 10 cycles of 30 s at 94 °C, 30 s at Tm + 10 °C (followed by a 1 °C decrease per cycle until Tm is reached) and 30 s at 72 °C, then 20 cycles of 30 s at 94 °C, 30 s at Tm and 30 s at 72 °C, and a final extension step of 2 min at 72 °C.
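
As an illustration only, the touch-down program described above can be written out as an explicit list of steps. The Python sketch below is not part of the original protocol. The function name `touchdown_program` is hypothetical, and the sketch assumes that the first touch-down cycle anneals at Tm + 10 °C and the tenth at Tm + 1 °C, so that Tm is reached at the start of the following 20 cycles.

```python
def touchdown_program(tm_celsius):
    """Return the cycling program as a list of (step, temperature in C, seconds)."""
    program = [("initial denaturation", 94.0, 60)]

    # Touch-down phase: 10 cycles, annealing starts at Tm + 10 C
    # and decreases by 1 C per cycle (assumed reading of the protocol).
    for cycle in range(10):
        annealing = float(tm_celsius + 10 - cycle)
        program.append(("denaturation", 94.0, 30))
        program.append(("annealing", annealing, 30))
        program.append(("extension", 72.0, 30))

    # Standard phase: 20 cycles with annealing at Tm.
    for _ in range(20):
        program.append(("denaturation", 94.0, 30))
        program.append(("annealing", float(tm_celsius), 30))
        program.append(("extension", 72.0, 30))

    program.append(("final extension", 72.0, 120))
    return program


def total_hold_time_seconds(program):
    """Sum of programmed hold times (ramping between temperatures is ignored)."""
    return sum(seconds for _, _, seconds in program)
```

For a marker with Tm = 55 °C, the annealing temperatures of the first ten cycles run from 65 °C down to 56 °C, and the programmed hold times sum to 48 min, ramping time excluded.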

Amplicons were first analysed by a microchip electrophoresis system (MultiNA, Shimadzu) and the optimal conditions for PCR amplifications were assessed. Then, the exact sizes of the amplified fragments were determined using the ABI3730 DNA analyser (Applied Biosystems) (a core facility of INRA, UMR Biodiversité Gènes et Ecosystèmes, PlateForme Génomique, 33610 Cestas, France). Prior to the ABI3730 analysis, PCR amplicons were diluted (1800-fold for FAM, 600-fold for HEX, 1200-fold for AT565 and 1800-fold for AT550) and multiplexed in formamide. The LIZ 600 molecular marker (ABI GeneScan 600 LIZ Size Standard, Applied Biosystems) was diluted 100-fold and added to each multiplex. Before loading, diluted amplicons were heated 4 min at 94 °C. Allele size was recorded manually using GeneMarker Demo software V2.2.0 (SoftGenetics).

Microsatellite data analysis

To investigate the genetic relationships between strains, the microsatellite dataset was analysed using the Poppr package83 in R (version 3.1.3). A dendrogram was established using Bruvo’s distance84 and Neighbour Joining (NJ) clustering85. Bruvo’s distance takes into account the mutational process of microsatellite loci and is well adapted to populations with mixed ploidy levels; it is therefore suitable for the study of the B. bruxellensis strain collection used in this work. Supplementary tests were applied to the same dataset in order to confirm the clusters obtained by Neighbour Joining. First, a UPGMA (Unweighted Pair Group Method with Arithmetic Mean) analysis was compared with NJ. Then, the partition method50 was applied in order to confirm the reliability of the nodes obtained by NJ. Also, multidimensional scaling was performed on Bruvo’s distance matrix for the same dataset using the cmdscale function in R, and finally, the function ‘find.clusters’ available in the adegenet R package was used to identify clusters by successive K-means86. Further, AMOVA (analysis of molecular variance) was used to assess the relative importance of geographical localisation and substrate origin regarding B. bruxellensis genetic diversity. To confirm the results obtained by the AMOVA analysis, the link between genetic divergence and geographic distance was further evaluated by a Mantel test.
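
To make the last step concrete, the following Python sketch shows the principle of a permutation-based Mantel test between a genetic and a geographic distance matrix. It is a minimal illustration and not the R implementation used in this work. The function names, the one-sided alternative (positive correlation), the use of Pearson correlation and the default of 999 permutations are all assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    n = len(genetic)
    identity = list(range(n))
    geo = upper_triangle(geographic, identity)
    observed = pearson(upper_triangle(genetic, identity), geo)

    rng = random.Random(seed)
    as_extreme = 0
    for _ in range(permutations):
        # Permute strains (rows and columns together) in one matrix only.
        order = identity[:]
        rng.shuffle(order)
        if pearson(upper_triangle(genetic, order), geo) >= observed:
            as_extreme += 1

    # The observed arrangement counts as one of the permutations.
    p_value = (as_extreme + 1) / (permutations + 1)
    return observed, p_value
```

Permuting rows and columns together preserves the dependence between entries of the same strain, which is why the significance cannot be read from an ordinary correlation test on the matrix entries.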

Core genotype analysis

Among the 124 alleles included in the initial dataset, 70 were found to be significantly associated with the triploid isolates (χ² test, p < 0.01) and were excluded to create a new dataset comprising alleles common to all groups and representative of the core genotype (i.e. the genotype common to all groups).
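
The filtering principle can be sketched as follows. This Python example is illustrative only: it assumes a 2×2 contingency table per allele (allele present or absent, isolate triploid or not) and a Pearson χ² test with one degree of freedom and no continuity correction, details that are not specified above. The function names and the example counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test on a 2x2 table.

    a: triploid isolates carrying the allele
    b: triploid isolates lacking the allele
    c: other isolates carrying the allele
    d: other isolates lacking the allele
    Returns (chi2, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        # An empty row or column margin: no association can be tested.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def core_alleles(counts, alpha=0.01):
    """Keep alleles that are NOT significantly associated with triploid isolates."""
    kept = []
    for allele, (a, b, c, d) in counts.items():
        _, p_value = chi_square_2x2(a, b, c, d)
        if p_value >= alpha:
            kept.append(allele)
    return kept
```

With hypothetical counts such as `{"locus1_180": (30, 10, 10, 30), "locus1_184": (10, 10, 10, 10)}`, the first allele is excluded (χ² = 20) and the second is retained (χ² = 0).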

For the inference of population structure with this dataset, the LEA package87 was used in combination with the TESS tool to map the geographical cluster assignments of the ancestral populations as defined by Höhna et al.88. Further, a differentiation test analysis was performed by calculating the fixation index (FST) for the core diploid genotype.

Sulfite tolerance assessment

The assay was performed in liquid medium containing 6.7 g.L−1 of YNB (Difco™ Yeast Nitrogen Base, Becton, Dickinson and Company), 2.5 g.L−1 D-glucose, 2.5 g.L−1 D-fructose, 5% (v/v) ethanol and increasing concentrations of potassium metabisulfite (PMB, K2S2O5) (Thermo Fisher Scientific) in order to obtain final concentrations of 0, 0.2, 0.4 and 0.6 mg.L−1 mSO2. For the calculation of mSO2, it was considered that K2S2O5 corresponds to about 50% of total SO2 (therefore a solution of 10 g.L−1 K2S2O5 corresponds to approximately 5 g.L−1 total SO2). In order to deduce the final mSO2 concentration, the free SO2 concentration was assessed by the aspiration/titration method. Then, the mSO2 was calculated by applying the Henderson-Hasselbalch equation with the dissociation constant pK189. Final pH was adjusted to 3.5 (corresponding to an average pH value generally encountered in red winemaking conditions) with phosphoric acid (1 M H3PO4), and the four media (corresponding to the 4 different concentrations of SO2) were filtered separately through a 0.22 µm pore filter (Millipore).
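
The relationship between free and molecular SO2 can be illustrated with a short calculation. The Python sketch below applies the Henderson-Hasselbalch equation in the form mSO2 = free SO2 / (1 + 10^(pH − pK1)). The default pK1 of 1.81 is an assumption (a value commonly quoted for aqueous solution at about 20 °C) and may differ from the constant used in this study; the function names are hypothetical.

```python
def total_so2_from_pmb(pmb_g_per_l, so2_fraction=0.5):
    """Approximate total SO2 (g/L) supplied by potassium metabisulfite.

    The 50% fraction is the approximation stated in the text.
    """
    return pmb_g_per_l * so2_fraction


def molecular_so2(free_so2_mg_per_l, ph, pk1=1.81):
    """Molecular SO2 (mg/L) from free SO2 via the Henderson-Hasselbalch equation.

    pk1 = 1.81 is an assumed value, not taken from the study.
    """
    return free_so2_mg_per_l / (1.0 + 10.0 ** (ph - pk1))


def free_so2_for_target(target_mso2_mg_per_l, ph, pk1=1.81):
    """Free SO2 (mg/L) required to reach a target molecular SO2 concentration."""
    return target_mso2_mg_per_l * (1.0 + 10.0 ** (ph - pk1))
```

Under these assumptions, about 2% of the free SO2 is in the molecular form at pH 3.5, so roughly 30 mg.L−1 of free SO2 would be needed to reach 0.6 mg.L−1 mSO2.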

Small-scale fermentations were performed in sterile 4 mL spectrophotometer cuvettes containing a sterile magnetic stirrer bar (Dutscher, France). The cells were grown on YPD agar and inoculated into the YNB-based medium without SO2. After 96 h of pre-culture (the point at which all strains reached stationary phase), the cells were inoculated at OD600 0.1 in a final volume of 3 mL. The inoculated medium was then covered with 300 µL of sterile silicone oil (Sigma-Aldrich) to avoid oxidation of the medium, which could favour the consumption of free SO2. Then, the cuvette was capped with a plastic cap (Dutscher) and sealed with parafilm. A sterile needle was inserted by piercing the cap to allow CO2 release. The “nano-fermenters” were then placed in a spectrophotometer cuvette container box on a 15-position magnetic stirrer plate at 25 °C (the final temperature in the “nano-fermenters” was therefore 29 °C due to the stirrer heating). Optical density (OD600) was measured every 24 h for at least 300 h to follow cell population growth until stationary phase was reached.

For each growth curve, the following three parameters were calculated: the maximal OD was the maximal OD reached at 600 nm; the lag phase (in hours) was the time between inoculation and the beginning of cell growth (5% maximal OD increase); and the maximal growth rate was the maximal number of divisions per hour (based on the OD measurement divided by time). A non-parametric Kruskal-Wallis test was used at α = 5% to identify significant differences between groups.
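
One possible way of deriving the three parameters from an OD600 time series is sketched below in Python. It is an interpretation rather than the exact procedure used here: the lag phase is taken as the first measurement time at which OD has risen by 5% of the total OD increase, and the maximal growth rate is taken as the largest value of log2(OD2/OD1)/(t2 − t1) between consecutive measurements. The function name and both definitions are assumptions.

```python
import math


def growth_parameters(times_h, od_values, lag_fraction=0.05):
    """Return maximal OD, lag phase (h) and maximal growth rate (divisions/h)."""
    if len(times_h) != len(od_values) or len(times_h) < 2:
        raise ValueError("need at least two paired time/OD measurements")

    initial_od = od_values[0]
    max_od = max(od_values)

    # Lag phase: first time point at which OD reaches the growth threshold.
    lag_h = None  # stays None when no growth is observed
    if max_od > initial_od:
        threshold = initial_od + lag_fraction * (max_od - initial_od)
        for t, od in zip(times_h, od_values):
            if od >= threshold:
                lag_h = t
                break

    # Maximal growth rate: largest number of doublings per hour
    # between two consecutive measurements.
    rates = []
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        if dt > 0 and od_values[i - 1] > 0 and od_values[i] > 0:
            rates.append(math.log2(od_values[i] / od_values[i - 1]) / dt)
    max_rate = max(rates) if rates else 0.0

    return {"max_od": max_od, "lag_h": lag_h, "max_rate_div_per_h": max_rate}
```

With measurements every 24 h, both the lag phase and the growth rate are resolved only to the sampling interval, so these estimates are coarse and best used for comparing strains within the same assay.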

Data availability

The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.