Wine is probably the oldest produced beverage in human history, with its first appearance dated to 7000 BC in Georgia. It is extremely important for many cultures, not only economically but also from a social and religious point of view. For these reasons it is also the most common and most widely consumed type of alcoholic beverage in the world.1, 2 In addition, several studies indicate that modest wine consumption can have some beneficial health effects, including protection against cardiovascular and Alzheimer’s disease.3, 4, 5, 6

Extensive research has aimed to improve both its production and its flavor. The flavor of wine depends on many different sensory perceptions, ranging from taste and olfaction to texture.7 Wine is a mixture of thousands of molecules that can contribute to its final taste,8 and its composition varies greatly depending on the grapes used, the climate and the production method.9 Although the flavor of some of these compounds is very well known, it remains difficult to establish what role they have in the final taste.7, 10, 11

The relevant role of genetic factors in both alcohol consumption and dependence has been clearly demonstrated by genome-wide association studies (GWAS).12, 13, 14, 15, 16 Moreover, studies on genetic variation in bitter taste receptors have shown that variants in the TAS2R16 and TAS2R38 genes can influence alcohol intake17, 18, 19 but not dependence.19

In contrast, very little is known about the possible contribution of genetic factors to alcohol preferences in humans. Recently, it has been demonstrated that ethanol flavor perception (but not taste) is associated with variants in the olfactory receptor (OR) gene OR7D4 and in SCNN1D, a gene encoding a subunit of the salt taste receptor.20 Moreover, McRae et al have demonstrated that a non-synonymous variant of OR2J3, an OR gene, impairs the ability to detect cis-3-hexen-1-ol, a key component of many different foods including wine.21 Finally, we conducted a study on numerous taste-related genes, which allowed us to associate white wine and vodka liking with variants in the TAS1R2 gene in Central Asian populations, suggesting a general role in ethanol perception for this sweet receptor gene.22

Nevertheless, the molecular bases underlying food liking in general, and wine liking more specifically, are still poorly understood. For these reasons, we have conducted the first GWAS on red and white wine liking in five isolated populations from different geographic areas.

Materials and methods

Study populations

Samples were collected in various populations from Europe and Central Asia. More specifically, our study includes 381 individuals from INGI-CARL, a population from Carlantino, a small village located in Puglia (Southern Italy); 744 from INGI-FVG, which refers to six villages all situated in the Friuli Venezia Giulia region in North-Eastern Italy; and 1115 from INGI-VB, a population from the Val Borbera valley in North-Western Italy. In addition, 1261 samples were from the Erasmus Rucphen Family (ERF) study, a cross-sectional cohort including 3000 living descendants of 22 couples who had at least six children baptized in the community church around 1850–1900. Finally, Silk Road (SR) is a cohort of ~1000 individuals resulting from the sampling of 20 communities from six nations (Armenia, Azerbaijan, Georgia, Uzbekistan, Tajikistan and Kazakhstan) located along the Silk Road; 335 of these individuals were used for this study.

Wine-liking ascertainment

Red and white wine liking was ascertained through a questionnaire in which each participant was asked to rate their liking for each type of wine on a scale from 1 (dislike extremely) to 9 (like extremely).23 To assess individual liking in the SR population, a five-point scale coupled with smiley faces was used. This scale is commonly used in case of linguistic barriers or when working with illiterate people, as was the case for the SR population.24 Given the differences between the two scales, data were standardized by dividing each score by the number of categories of the scale used: 9 for the European populations and 5 for the SR study. This approach is the same as the ‘Simple proportion method’ described in Colman et al (1977)25 and similar to the formulas used in Preston and Colman (2000)26 and Dawes (2002).27 The only difference in a regression would appear in the intercept; however, because the meta-analysis uses only the betas and association is performed within each group, this difference is negligible.
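The rescaling described above can be illustrated with a minimal Python sketch. The function name `standardize_liking` and the example ratings are hypothetical and are not taken from the study; the sketch only shows the division by the number of scale categories.

```python
def standardize_liking(score: float, n_categories: int) -> float:
    """Simple proportion rescaling: raw rating divided by the number of categories."""
    if not 1 <= score <= n_categories:
        raise ValueError("score must lie between 1 and n_categories")
    return score / n_categories


# Hypothetical ratings: nine-point scale (European cohorts), five-point scale (SR).
european = [standardize_liking(s, 9) for s in (1, 5, 9)]
silk_road = [standardize_liking(s, 5) for s in (1, 3, 5)]
```

After rescaling, the top category maps to 1 on both scales, whereas the lowest category maps to 1/9 and 1/5, respectively.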

Genotyping and Imputation

Genotyping was carried out as previously described.28, 29, 30 In brief, INGI-CARL, INGI-FVG and INGI-VB were genotyped with the Illumina 370 k high-density SNP array, whereas SR was genotyped with Illumina 700 k high-density SNP arrays. Genotype imputation on the INGI cohorts and SR was conducted after standard QC using SHAPEIT2 (ref. 31) for the phasing step and IMPUTE2 (ref. 32) for the imputation, using the 1000 Genomes phase I v3 reference set.33 ERF was genotyped with different genotyping platforms: Illumina 318 k, 350 k, 610 k and Affymetrix 200 k. Genotypes were pooled together after QC, phased and imputed to the 1000 Genomes phase I v3 data set (ref. 33) using MaCH and minimac.34 After imputation, we excluded from the statistical analyses SNPs with MAF<0.01 or Info<0.4 for all populations except ERF, for which R2<0.3 was used instead of the Info threshold.
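The post-imputation filter can be sketched in Python as follows. This is an illustration only: the function name and the example values are hypothetical, and it assumes that the MAF threshold applies to every cohort while the imputation-quality threshold is Info≥0.4 (IMPUTE2 cohorts) or R2≥0.3 (ERF).

```python
def passes_imputation_qc(maf: float, quality: float, cohort: str) -> bool:
    """Return True if a SNP survives the post-imputation filter for a cohort.

    `quality` is the IMPUTE2 Info score, or the minimac R2 for ERF.
    Assumption: the MAF cut-off is shared by all cohorts.
    """
    if maf < 0.01:
        return False
    threshold = 0.3 if cohort == "ERF" else 0.4
    return quality >= threshold


# Hypothetical SNP with quality 0.35: dropped in an INGI cohort, kept in ERF.
kept_in_fvg = passes_imputation_qc(0.15, 0.35, "INGI-FVG")
kept_in_erf = passes_imputation_qc(0.15, 0.35, "ERF")
```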

Association analysis

Association analysis was conducted using mixed-model linear regression, in which the standardized wine liking was used as the dependent variable and the SNP dosages as the independent variable. Sex and age were used as covariates. The kinship matrix based on all available genotyped SNPs was used as the random effect. For ERF the kinship matrix was estimated on the 14.4 k SNPs common to all the genotyping platforms used.35 The GenABEL R package was used to eliminate the effect of relatedness from the trait. Corrected environmental residuals were estimated according to the formula trait~sex+age+genomic kinship using the GRAMMAR+ method36 as implemented in GenABEL 1.7–2. MixABEL35 was used to run the linear regressions between the estimated residuals and all the imputed SNPs. Only SNPs that passed post-imputation quality control were used for the association analysis. SNPs that did not pass quality control in more than one population were discarded as well. For the association analysis we used a two-step approach.37 For the genome-wide discovery step, association analysis was conducted separately for each INGI cohort and the results were pooled using the inverse-variance weighting method. After association analysis, all SNPs that showed P<1 × 10−5 were selected for the replication step using ERF and SR. We considered SNPs with P<5 × 10−8 as significant at the replication step. All meta-analyses were conducted using in-house R scripts. Power analysis was conducted using the ‘Genetic Power Calculator’ software.38
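For readers unfamiliar with inverse-variance weighting, the following is a minimal Python sketch of a fixed-effects meta-analysis of per-cohort betas and standard errors. It is not the authors' in-house R code; the function name and the input values are hypothetical, and the P-value is assumed to come from a two-sided normal test on the pooled estimate.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effects inverse-variance meta-analysis of per-cohort effect estimates.

    Each cohort is weighted by 1/SE^2. Returns the pooled beta, its standard
    error and a two-sided P-value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return beta, se, p


# Hypothetical per-cohort estimates for one SNP (not values from the study).
beta, se, p = inverse_variance_meta([0.06, 0.05, 0.04], [0.020, 0.015, 0.012])
selected_for_replication = p < 1e-5  # discovery threshold used in the text
```

Because the weights depend on the standard errors, this scheme assumes that the trait is measured on a comparable scale in every cohort.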

Data availability

Genotypes for the cohorts used in this study have been deposited in the EGA database under the following accession numbers: INGI-CARL EGAS00001001005, INGI-FVG EGAS00001001006, INGI-VB EGAS00001001007, ERF EGAS00001001134 and SR EGAS00001001008.


Results

Table 1 describes each cohort’s characteristics and trait distribution. The mean liking rating was very similar across the Italian populations, whereas lower liking was observed in SR and ERF. In particular, ERF shows the lowest liking compared with the other populations. Although the means were different, SDs were consistent across all populations, showing that the standardization procedure was effective in making the measures comparable.

Table 1 Cohort demographics and trait distribution

For the discovery step we performed genome-wide association analysis on red wine liking and white wine liking in the three Italian populations (2271 samples) and then pooled the results using the inverse-variance weighting method. We then chose all SNPs with P<1 × 10−5 for the second-step joint analysis.

Figure 1 shows the Manhattan plot for the discovery step, which formed the basis for the SNPs selected for replication. In all, 42 SNPs were selected for white wine and 72 for red wine. The selected variants were tested for association with wine preference in two additional cohorts: ERF, using 1261 samples, and SR, using 335 samples. Joint analysis revealed a genome-wide significant association between white wine liking and rs9276975 (NC_000006.12:g.33005822C>T) (P=2.1 × 10−8), a SNP in the 3′-UTR region of the HLA-DOA gene. Given that the inverse-variance method for meta-analysis is sensitive to differences in trait distributions and that we had to transform the SR scale to adapt it to the other populations, we also performed a z-score-based meta-analysis. The results were almost identical to those obtained with the inverse-variance method (regression coefficient 0.98), with the P-value for rs9276975:C>T equal to 1.5 × 10−8. This result shows that the transformation applied to the traits did not influence the final result. To verify that we did not have residual stratification in the replication step, we performed a genome-wide meta-analysis on both analyzed traits. Lambda was 1.01 for red wine and 1.002 for white wine; QQ-plots are reported in the additional materials (Supplementary Figures S2 and S3). Finally, power analysis using the Genetic Power Calculator revealed that we have 0.54 power to detect association at α=5 × 10−8 under the same conditions as the described association (MAF=0.15, β=0.053). We also estimated that we have 0.8 power to detect association at α=5 × 10−8 at 0.01 explained variance regardless of the MAF.
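The text does not specify how the z-scores were weighted. The Python sketch below assumes the common sample-size weighted (Stouffer) scheme, in which each cohort's z-score is weighted by the square root of its sample size; the function name and the z-scores are hypothetical, whereas the sample sizes are those reported for the Italian cohorts, ERF and SR.

```python
import math


def z_score_meta(z_scores, sample_sizes):
    """Sample-size weighted z-score meta-analysis (Stouffer's method).

    Assumption: weights are sqrt(N); the weighting used in the study is not stated.
    Returns the combined z-score and a two-sided P-value.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    z = numerator / math.sqrt(sum(w * w for w in weights))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical z-scores for one SNP in the Italian meta-analysis, ERF and SR.
z, p = z_score_meta([4.5, 3.0, 1.2], [2271, 1261, 335])
```

Unlike inverse-variance weighting, this approach uses only the direction and strength of evidence in each cohort, so it does not depend on the scale on which the trait was measured.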

Figure 1

Manhattan plot for the meta-analysis of the Italian populations. The red line is set at P=1 × 10−5, and the red points represent the SNPs selected for replication. The top panel refers to red wine and the lower one to white wine.

Figure 2 shows the regional association plot around rs9276975:C>T for white and red wine, whereas Supplementary Figure S1 shows a similar plot for the whole HLA locus. The association signal is clearly confined to the HLA-DOA locus and does not extend to the wider HLA region, which is in strong linkage disequilibrium. This is probably because HLA-DOA lies between two recombination hotspots that isolate it from the rest of the HLA locus.

Figure 2

Regional association plot of the locus containing the HLA-DOA gene. The top figure refers to white wine, whereas the bottom one to red wine. Clearly, the two figures look alike, although in the case of red wine the association signal is much weaker. Different colors refer to the r2 with rs9276975 in the CEU population. Squares represent coding SNPs, whereas circles represent non-coding SNPs.

Table 2 reports the results for the significant SNP, whereas Supplementary Table S2 contains the results for all the SNPs selected in step one.

Table 2 Results for the association analyses for rs9276975 for white wine and red wine

To verify whether the associated SNP had an effect on food likings in general or only on wine, we ran association analysis between the two SNPs and 40 more food likings, which were available for the studied cohorts and had been collected in the same way as the wine-liking data. This analysis did not show significant or close to significant results (Supplementary Table S4), suggesting that the described associations are specific to wine liking.

No significant association was found for any of the 72 SNPs used for the replication step of red wine, as shown in Supplementary Table S3. Supplementary Table S1 reports the descriptive statistics for all GWAS and meta-analyses, whereas Supplementary Figures S2 and S3 show QQ-plots for the same analyses.
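The lambda values reported for the genome-wide meta-analyses are genomic inflation factors, which summarize the same information as the QQ-plots. A minimal Python sketch of the usual calculation follows: the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.455). The function name and the input P-values are hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic inflation factor (lambda) from two-sided association P-values.

    Each P-value is converted back to a 1-df chi-square statistic and the
    observed median is divided by the median expected under the null.
    P-values must lie in (0, 1]; values below ~1e-15 lose precision here.
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.75) ** 2  # median of chi-square with 1 df
    return median(chi2) / expected_median


# Evenly spaced P-values mimic a null distribution, so lambda is close to 1.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(null_p)
```

Values close to 1, such as those reported here, indicate no appreciable inflation of the test statistics.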

Given the possible genetic differences between men and women, the analyses were also conducted on men and women separately, limited to the rs9276975:C>T SNP. This analysis revealed that the association between white wine and HLA-DOA was much stronger in women than in men (P=1 × 10−7 in women vs P=0.01 in men). Looking at the effect size, women exhibited twice the effect seen in men (0.067 vs 0.033). Table 3 shows the results for the sex-specific analysis of white wine and red wine, which again showed a similar effect.

Table 3 Summary of results for the sex-separated analysis

Annotating rs9276975:C>T with HaploReg v2,39 SNPNexus and Regulome DB did not give any insight into what its role in the observed association might be. We therefore extended the annotation to all the SNPs in strong LD (r2>0.8) with rs9276975:C>T in the CEU 1000 G population. Only rs1367731 (NC_000006.12:g.33017422C>T) and rs9276981 (NC_000006.12:g.33007922G>C) are suggested to have a possible functional role by Regulome DB, with scores of 5 and 6, respectively, which correspond to the lowest possible scores and to very weak evidence.


Discussion

In this study we present the first GWAS on wine liking, showing a significant association with HLA-DOA, a non-canonical class II MHC molecule, in particular for white wine. Although it is true that the MHC locus contains at least two clusters of OR genes, HLA-DOA is situated between two recombination hotspots (Figure 2), which makes it independent from the main MHC locus containing the OR genes. The most strongly associated SNP, rs9276975:C>T, is located in the 3′-UTR region of the gene, which suggests a regulatory role rather than a functional one. This fits particularly well if we consider the function of HLA-DOA in the general MHC class II system. Its role is to inhibit HLA-DM, another non-canonical MHC class II molecule, which has the double role of activating all other MHC class II proteins by de-coupling them from the CLIP protein and, by binding to them, of determining the specificity of their binding to the correct antigens.40 In other words, HLA-DO, by inhibiting HLA-DM, prevents the activation of MHC class II molecules in general. Unfortunately, the bioinformatics analysis did not give us any insight into possible regulatory elements modified by SNPs in LD with rs9276975; however, we must consider that, given the particular nature of our traits, not much information is available on the relevant tissues, such as the olfactory epithelium.

MHC could be linked to wine liking directly through the olfactory epithelium cells. In fact, numerous studies have linked MHC molecules to mate choice through scent in various species, including humans.41, 42, 43 In mice, non-canonical MHC class I molecules are specifically expressed in a particular section of the vomeronasal organ (VNO),44 and are thought to modulate the response to MHC antigens.45 Given that humans do not have a VNO, it is unclear how they could distinguish different HLA types. However, studies on VNO sham mice show that they can recognize MHC class II antigens as well as non-sham mice, suggesting that, at least for this type of molecule, the VNO is not necessary.46 Another recent study conducted on monogamous birds with highly developed olfaction shows that it is MHC class II molecules, and not class I, that determine mate recognition.47 In humans it has been demonstrated that different HLA types show preference or dislike for specific scents, suggesting that some volatile molecules responsible for these odors mimic some HLA-specific scent.48, 49 MHC class II molecules are responsible for the immune response to external stimuli and are thus more suitable candidates than class I MHC for a possible olfactory function. One of the main theories behind MHC recognition is that people with different HLA profiles have a different composition of bacterial flora, which affects body odor and thus generates the HLA-specific scent.50 Extending this view to our findings, the association we detect seems to be specific to wine, in which bacteria are an important part of the production process and could be responsible for the difference in liking. It is thus possible that HLA-DOA, by regulating MHC class II molecules, also modulates the perception of specific volatile compounds present in wine, which are produced by the bacteria used in wine production.

It is also very suggestive that this effect is much stronger in white wine than in red wine, in which the mouth-feel component of flavor is more accentuated owing to the presence of more tannins.51 The fact that the effect of rs9276975:C>T on wine liking is twice as large in women as in men seems to support the olfactory hypothesis. Women have in fact been described as being more sensitive than men to specific odors,52, 53 including in the specific case of MHC-driven body odor recognition.54, 55

The precise mechanism of how MHC class II molecules actually interact with the olfactory epithelium and how the signal is then transferred to the brain is still unknown and further studies are needed to clarify this point.

We have previously reported rs4920566 (NC_000001.11:g.18853330 A>G) in the TAS1R2 gene as being associated with white wine liking. In the results of the combined meta-analysis, this SNP was not significant (P-value=0.83). This lack of replication could be due to allelic heterogeneity between the SR populations and the European ones, and further studies are needed to clarify this point.

In conclusion, this study is one of the first to examine the genetics of liking for a commonly consumed food on a genome-wide scale. Our findings suggest that, at least in the case of wine, MHC has an important role in defining its liking, probably through olfaction. This opens a new perspective not only on food liking, and thus choice, but also on understanding the mechanisms that link flavor to olfaction in general.

Ethical statement

All studies adhered to the tenets of the Declaration of Helsinki. The ERF study was approved by the Medical Ethics Committee of the Erasmus Medical Center in Rotterdam. Informed consent was obtained after explanation of the nature and possible consequences of the study.

All subjects in the INGI-CARL, INGI-FVG and SR studies provided written informed consent before participation. Approval for the research protocol was obtained from the ethical committee of IRCCS Burlo Garofolo Hospital.

The VB study, including the overall plan and the informed consent form, was reviewed and approved by the institutional review boards of San Raffaele Hospital in Milan and by the ethical committee of the Regional Authorities of Piemonte.