Abstract
Alcohol consumption is a heritable behavior seriously endangers human health. However, genetic studies on alcohol consumption primarily focuses on common variants, while insights from rare coding variants are lacking. Here we leverage whole exome sequencing data across 304,119 white British individuals from UK Biobank to identify protein-coding variants associated with alcohol consumption. Twenty-five variants are associated with alcohol consumption through single variant analysis and thirteen genes through gene-based analysis, ten of which have not been reported previously. Notably, the two unreported alcohol consumption-related genes GIGYF1 and ANKRD12 show enrichment in brain function-related pathways including glial cell differentiation and are strongly expressed in the cerebellum. Phenome-wide association analyses reveal that alcohol consumption-related genes are associated with brain white matter integrity and risk of digestive and neuropsychiatric diseases. In summary, this study enhances the comprehension of the genetic architecture of alcohol consumption and implies biological mechanisms underlying alcohol-related adverse outcomes.
Similar content being viewed by others
Introduction
Alcohol consumption is a prominent risk factors for death and disability worldwide, accounting for over two million deaths each year1. It poses a tremendous threat to human health through multiple mechanisms, including cumulative damage to organs and leading to self-harm or violence2,3. Notably, these adverse effects are largely dependent on the average volume of alcohol consumption4. Identifying the risk factors that influence one’s level of alcohol consumption can contribute to the prevention, identification, and treatment of adverse outcomes from alcohol consumption5.
Over the recent decades, comprehensive genome-wide association studies (GWAS) have indicated the potential influence of genetic factors on one’s alcohol consumption volume and identified over 100 related variants6,7. However, a predominant proportion of the identified variants are localized within noncoding regions, and their effect sizes tend to be small, making interpretation and identification of the causal gene challenging8. In addition, previous GWAS mainly utilized imputed genotype data, which only cover limited regions of the genome, and thus may have missed many potential genes. Furthermore, GWAS studies focused mainly on common variants, and few studies have investigated rare variants associated with alcohol consumption, which yield greater potential to interpret biological function and elucidate mechanisms9. Although there are studies that have attempted to leverage exome chip data to identify rare variants contributing to alcohol consumption, the sample size was small and limited regions of the whole exome were examined10.
The introduction of whole exome sequencing (WES) provides a great chance to overcome the limitations of previous genetic studies on alcohol consumption with a substantially larger amount of rare and ultra-rare protein-coding variants11,12,13. Collapsing of loss-of-function (LOF) variants helps estimate the effect direction of associated genes13,14. When combined with large-scale population cohorts with multi-modal phenotypic data, WES would greatly facilitate our understanding of the genetic underpinnings of alcohol consumption as well as its implication on physical and mental health6. However, to our knowledge, there have been few large-scale WES studies on alcohol consumption, let alone elucidating the potential implications of the identified genes10,15. Meanwhile, as indicated by a previous genome-wide association study, significant genetic associations existed between alcohol consumption and several body health phenotypes7. The application of phenome-wide analysis for alcohol-related genes can help extend and deepen our current comprehension of the association between alcohol consumption and human health.
Hence, aiming to refine the genetic architecture of alcohol consumption, we conduct an exome-wide association study (ExWAS) for alcohol consumption among 304,119 individuals from the UK Biobank (UKB). We also examine the rare-variant associations with genes reported by previous GWAS6,7,16,17. Finally, we provide biological insights into the identified genes via bioinformatics analyses and phenome-wide association analysis (PheWAS).
Results
Study population and data description
We leveraged exome sequencing data and phenotypic data from UKB and excluded low-quality variants and samples (Methods)13,18. For the main analysis, we included 304,119 unrelated white British participants. The average age was 56.87 years at enrollment and 54.09% participants were female. Information about alcohol drinking per week were obtained from self-completed touchscreen interviews at baseline (Methods and Supplementary Data 1). The average alcohol consumption (alcohol amounts after natural logarithm) of the whole sample was 2.06 (Standard Deviation (SD) = 1.44), with a mean of 2.47 (SD = 1.41) and 1.72 (SD = 1.38) for males and females respectively (Supplementary Data 2). Finally, the exome-wide association analysis included 100,101 common variants (with a MAF of ≥1%) and 13,018,630 rare variants (with a MAF of < 1%). Figure 1 provided the general schema of our study.
ExWAS for alcohol consumption
To test whether alcohol consumption was associated with damaging coding variants, we conducted ExWAS using a linear mixed model with adjustments for ten principal components, age, and sex (Methods). The analysis discovered two rare variants and 23 independent common variants linked to alcohol consumption (P < 5 × 10−8) (Table 1, Fig. 2a, b). The genomic control lambda is 1.04, indicating that the association statistics are not systematically inflated (see Supplementary Fig. 1 for the corresponding quantile-quantile plot). The top rare variant, rs283413 (MAF = 0.8%; βA = −0.15, P = 2.73 × 10−31) is a stop-gain variant in ADH1C, the well-known gene related to alcohol metabolism. Among the 23 common variants, three were not reported previously (rs41288799, rs4975020 and rs77623289). Most of the identified variants are intron (46%) or missense (19%) (Fig. 2c, Supplementary Data 3, Methods). Additionally, 15 of the 22 identified variants, which were examined in an independent alcohol consumption GWAS19, showed nominal significance (P < 0.05) (Table 1, Supplementary Data 4). Further, 17 of the 24 identified variants available in the FinnGen study20 exhibited nominal associations with alcohol use disorder (AUD) (P < 0.05) (Supplementary Data 5). To assess the robustness of the main analysis, we adjusted for rs1229984, a well-established marker strongly linked to alcohol consumption6,21. Notably, 23 of 25 variants (92%) retained the same association directions, with 22 variants (88%) maintaining their significance (P < 5 × 10−8) (Supplementary Data 6). Additionally, the main analysis maintained its robustness after excluding former drinkers and non-drinkers. Further, all effect directions remained the same, and 20 of the initially identified 25 variants (80%) retained their significance (P < 5 × 10−8) (Supplementary Data 7). Finally, the ExWAS for scores of alcohol use disorders identification test (AUDIT) identified a rare variant (rs283413) and two independent common variants (rs13107325 and rs201168482) associated with alcohol use problems (Supplementary Figs. 2–4, Supplementary Data 8).
Since a single rare variant tends to be of insufficient power to identify significant signals, we further performed gene-based collapsing analysis to detect genes related to alcohol consumption. LOF and missense rare variants of each gene and three MAF thresholds (< 1%, < 0.1% and < 0.01%) were utilized. In total, we identified 19 associations (covering seven genes) after Bonferroni correction (Table 2 and Fig. 3a; Supplementary Data 9; P < 0.05/19852 = 2.5 × 10−6). Rare variants in the known alcohol consumption-related gene, ADH1C showed the most significant gene-based association at P = 1.91 × 10−30. The maximum genomic control lambda was 1.076 (see Supplementary Fig. 5 for the corresponding quantile-quantile plots). The total rare burden heritability of alcohol consumption was 0.88% (Fig. 3b and Supplementary Data 10). We additionally identified six putative alcohol consumption-related genes under the threshold of overall false discovery rate (FDR) < 0.05 (P <1.69 × 10−5). Among these rare-variant genes, seven (GIGYF1, ANKRD12, KDM5B, APC2, LGI2, ATP1A2, and ENSG00000224076 (not officially designated and excluded from further analysis)) were not previously reported in GWAS studies for alcohol consumption. The LOF and missense burden in eleven of the rare-variant genes reduced alcohol consumption (β = −0.003 to −0.023; Fig. 3c, Table 2). In addition, 2.03% (n = 8825) of the participants carried a LOF variant located in ADH1C exons and GIGYF1 variants were carried by 1.72% (n = 7449) participants (Fig. 3d). After excluding the former drinkers and non-drinkers, 19 out of the initially identified 39 associations retained the significance (P < 1.69 × 10−5, Supplementary Data 11). Following adjustment for rs1229984, the identified associations were robust except for ADH1C, ADH1A, SNX17 and ADH5 (Supplementary Data 12). Additionally, we performed ExWAS for AUDIT and identified two genes (ADH1C and CA1) associated with alcohol use problems (Supplementary Figs. 6-8, Supplementary Data 13).
Leave-one-variant-out (LOVO) and conditional analysis
To investigate whether a single variant dominated the gene-based associations, we firstly conducted LOVO analysis. While the maximum P-value for ADH1C was P = 0.802 after the removal of rs283413, P = 0.873 for ADH1A after the removal of rs190428650, P = 0.446 for SNX17 after the removal of rs147740391, and P = 0.016 for ADH5 after the removal of rs62325244, the other associations did not exhibit substantial attenuation (Supplementary Figs. 9–20, Supplementary Data 14). Hence, even a single variant, i.e. of ADH1C, ADH1A, and ADH5, may critically influence alcohol consumption, whereas the other significant associations were based on a burden of multiple rare variants. Subsequently, conditional analysis was performed to assess whether the significant associations with rare variants were influenced by adjacent common variants (Methods). Seven genes were found to have nearby common variants exhibiting significant associations with alcohol consumption. The associations of GIGYF1, ANKRD12 and APC2 did not exhibit substantial attenuation, whereas the associations of ADH1C, ADH5 and SNX17 exhibited attenuation, though still nominally significant, and the association of ADH1A lost its significance after adjustment for the nearby common variants (Supplementary Data 15).
Sex-specific analysis of the associations
As the average alcohol consumption showed a significant difference between males and females, we conducted gene-based collapsing analyses on participants separated by sex to explore whether the genetic contributions to alcohol consumption also differed by sex. While the KDM5B gene’s association with alcohol consumption was only observed in males (P = 3.04 × 10−7 for males and P = 0.170 for females), the other genes were significantly associated with alcohol consumption in both males and females (P < 0.05, Supplementary Data 16).
Associations of rare variants in alcohol-related genes
We then examined the impact of rare variants based on previous GWAS findings on alcohol consumption. We assessed a total of 174 alcohol consumption-related genes identified by the most recent GWAS studies6,7,16,17. Although 25 genes showed nominal significance, only the ADH1C gene was significant after Bonferroni correction (Supplementary Data 17). The influence of coding variants within the GWAS regions did not exhibit substantial effects, potentially due to the limited statistical power of ExWAS.
Biological function and tissue expression of the alcohol consumption-related genes
We further conducted a series of bioinformatics analyses to investigate the biological functions of the alcohol consumption-related genes. We first performed pathway enrichment analyses. We found the enrichment of gene ontology (GO) pathways relevant to alcohol dehydrogenase activity, oxidoreductase activity, ethanol oxidation and ethanol metabolism (Fig. 4a, Supplementary Data 18). Also, the analysis of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways identified the enrichment of these genes in tyrosine metabolism, fatty acid degradation, and pyruvate metabolism. These results hence supported the biological validity of our genetic findings.
We further analyzed tissue-specific expression enrichment of the identified genes based on the Human Protein Atlas project using the TissueEnrich R package22. We observed six, four, and two genes enriched in the liver, duodenum, and adipose tissue, respectively (Fig. 4b, Supplementary Fig. 21, and Supplementary Data 19). Genes, including SERPINA1, ADH1C, ADH1A, MLXIPL, MTTP, and KLB were specifically enriched in the liver (Supplementary Fig. 22). Subsequently, we evaluated the expression levels of these six genes across various cell types in the liver with single-cell RNA sequencing (scRNA-seq) data. While SERPINA1 was widely expressed in all cell types, ADH1A, ADH1C, MTTP, and MLXIPL were all predominantly expressed in the hepatocytes (Fig. 4c, d).
We further estimated the similarities between genes based on the association results of collapsing analyses across 1419 quantitative traits in UKB using Gene-SCOUT23. Notably, the GIGYF1 gene exhibited the highest similarity to ANKRD12 (Fig. 5a and Supplementary Data 20). Interestingly, the top 10 similar genes of ANKRD12 are enriched in brain function-related pathways containing glial cell differentiation, cognitive function, and glutamate secretion (Fig. 5b and Supplementary Data 21). Thus, to gain more insights into how these rare-variant genes may be related to alcohol use, we further examined the expression of ANKRD12 and GIGYF1 across tissues within the Human Protein Atlas24. Notably, both ANKRD12 and GIGYF1 exhibited strong expression in the brain, particularly in the cerebellum (Fig. 5c, Supplementary Fig. 23). In addition, both ANKRD12 and GIGYF1 showed broad expression in all cell types in brain (Supplementary Fig. 24). We subsequently characterized the spatiotemporal expression trajectories of ANKRD12 and GIGYF1 in the human brain, using mRNA sequencing (mRNA-seq) data from the PsychEncode study25. Our findings revealed unique temporal expression patterns of these genes in the cerebellum compared to other regions of the brain (Fig. 5d, e). These results imply that these two genes associated with alcohol consumption may alter the function of brain, which are important targeted organ of alcohol intake, providing clues for future research on the alcohol-related brain injury.
Phenotypic associations with alcohol consumption-related genes
Alcohol consumption has been documented to correlate with various biological markers, including metabolites, and health outcomes7,26,27,28. To systematically assess the relationship between genetic variation in alcohol consumption and a broad spectrum of health phenotypes, we performed PheWAS for the identified alcohol consumption-related genes across blood indices, major diseases, body function, and brain structures from the UKB (Methods and Supplementary Data 22).
Among the 82 significant gene-phenotype and 380 variant-phenotype associations (P < 0.05/316/12 = 1.32 × 10−5, P < 0.05/316/25 = 6.33 × 10−6, respectively), 81.7% and 47.4% were related to inflammatory and blood biochemistry indices (Fig. 6 and Supplementary Data 23, 24, Supplementary Figs. 25–61). Indicators of inflammation and disturbance of lipid metabolism showed significant associations with alcohol consumption-related genes. GIGYF1 and ANKRD12 showed the most phenotypic associations. GIGYF1 showed strong positive associations with HbA1c (βburden = 0.029, P = 2.51 × 10−13) and glucose (βburden = 0.027, P = 1.92 × 10−10), and negative associations with total cholesterol level (βburden = −0.029, P = 4.52 × 10−13), low-density lipoprotein cholesterol level (LDLC) (βburden = −0.026, P = 1.33 × 10−10) and Apolipoprotein B (βburden = −0.024, P = 6.15 × 10−10). ANKRD12 showed strong positive associations with neutrophil percentage (βburden = 0.021, P = 1.19 × 10−13) and neutrophil-lymphocyte ratio (βburden = 0.020, P = 7.34 × 10−13), and negative associations with lymphocyte percentage (βburden = −0.021, P = 2.75 × 10−13), total protein level (βburden = −0.019, P = 1.36 × 10−10), and monocyte percentage (βburden = −0.015, P = 3.04 × 10−8).
Interestingly, the gene-phenotype associations also extended to cognitive function and white matter. ANKRD12 showed significant associations with lower fluid intelligence scores (βburden = −0.028, P = 6.03 × 10−10) and worse performance in the pairs matching task (βburden = 0.010, P = 2.93×10−7). GIGYF1 showed nominal associations with lower fractional anisotropy (FA) in the fornix tract (βburden = −0.059, P = 1.06 × 10−4), and longer reaction time (βburden = 0.014, P = 1.40×10−4). The Mendelian randomization analyses failed to uncover any causal relationship between cognition and alcohol consumption (Supplementary Data 25), in line with results from previous studies29. Given the limited evidence supporting causal links between cognition and alcohol consumption, it is plausible that the observed associations may stem from the pleiotropic effects of ANKRD12 and GIGYF1.
The variant-phenotype association analyses revealed significant correlations with various white matter tracts. Notably, significant correlations were observed for FA in specific regions, including left anterior limb of the internal capsule (β = −0.072, P = 9.52 × 10−13), genu of corpus callosum (β = −0.069, P = 9.68 × 10−13), and left superior frontal-occipital fasciculus (β = −0.067, P = 6.13 × 10−12).
ExWAS in all white British participants and unrelated non-white British participants
Since SAIGE can handle sample relatedness in the regression model, we included all 373,152 white British participants (including both unrelated and related participants) in the analyses to increase statistical power. In the ExWAS for single variants, we identified 26 independent significant variants associated with alcohol consumption, including four variants not detected in the unrelated white British sample, of which two were not previously linked to alcohol consumption (Supplementary Data 26). The gene-based collapsing analysis identified 23 potential alcohol consumption-related genes with an overall FDR < 0.05. Of the 23 genes, 13 were not found in the unrelated white British participants, and among these, eight were not previously associated with alcohol consumption (Supplementary Data 27).
Moreover, ExWAS was conducted in 61,076 unrelated non-white British participants. While the ExWAS for single variants identified one locus significantly linked to alcohol consumption (Supplementary Data 28), the gene-based collapsing analysis did not uncover any significant associations after FDR correction, potentially attributed to the constrained sample size among non-white British participants.
Discussion
Herein we describe the largest comprehensive ExWAS of alcohol consumption to date and provide deep biological insights into the identified genes via functional analysis and phenome-wide association analysis with health-related data from the UKB. We identified ten previously unreported genes associated with alcohol consumption as well as replicated several known genes, which may shed light on pathophysiological processes in alcohol use. Furthermore, bioinformatics analyses supported the biological validity of the genetic associations and gene expression analysis highlighted the role of the cerebellum in alcohol consumption. PheWAS analyses provide strong support for the pleiotropic and consequent effects of alcohol consumption-related genes on human health, especially on inflammation, lipid metabolism, and white matter integrity.
Previous GWAS studies have enabled the identification of alcohol consumption-related genes, but our study extended previous findings via the discovery of more genes as well as the identification of more common and rare variants to the reported genes of alcohol consumption. We have identified thirteen genes at exome-wide significance based from rare variants using gene-based collapsing analysis, seven of which (GIGYF1, ANKRD12, KDM5B, APC2, LGI2, ENSG00000224076 and ATP1A2) were not reported by previous GWAS studies. Moreover, among the 174 reported genes from the most recent GWAS studies6,7,16,17, twenty-five showed nominal significance and ADH1C passed Bonferroni correction. Notably, utilizing the LOVO analysis, we found that for those reported GWAS genes including ADH1C, ADH1A, SNX17, and ADH5, removal of a single SNP leads to loss of significance in gene-based collapsing analysis, while for the genes not previously reported in the GWAS studies, removal of any single SNP does not influence the significance. The results indicated that the significance of these genes is the cumulative effect from a group of rare SNPs, which may explain why they were not detected by previous GWAS studies. This is further validated by the single variant analysis, where a significant signal was detected in ADH1C while not in those genes. Our results emphasized the value of rare variants as well as the necessity of gene-based collapsing analysis in WES studies on alcohol consumption.
For the two rare-variant genes (GIGYF1 and ANKRD12) associated with alcohol consumption, GIGYF1, identified as a risk gene for diabetes in earlier research13,30, is a protein-coding gene intricately involved in the regulation of cell growth and division. One meta-analysis of 38 studies demonstrated that a moderate level of alcohol intake was linked to a lower risk of type 2 diabetes compared to abstainers31. The association might be mediated by the beneficial metabolic effect of alcohol consumption such as altered HDL cholesterol and inflammation levels26. Meanwhile, results of the PheWAS showed that the alcohol consumption-related gene, GIGYF1, was significantly associated with blood levels of HDL cholesterol and several inflammatory biomarkers. Therefore, GIGYF1 may participate in the metabolic disturbance caused by alcohol consumption. As for another gene ANKRD12, less evidence was found on its possible role in alcohol consumption. While the Gene-SCOUT analysis provided interesting findings that GIGYF1 and ANKRD12 showed high similarity in biomarker profiles, which suggested that they might execute similar biological functions. Interestingly, ANKRD12 and GIGYF1 are associated with a higher Townsend deprivation index, which could possibly lead to a less access to alcohol32. Given that those genes were associated with cognitive function in our PheWAS results and in previous studies33,34, it is possible that reduced cognitive function in the gene carriers results in increased material deprivation and in turn reduced alcohol consumption. Nevertheless, the findings may be confounded by many factors and the causality is not validated by mechanism study, so further research is needed to clarify the potential associations of GIGYF1 and ANKRD12 with alcohol consumption.
In addition to the discovery of genetic associations, we also provide insights into alcohol metabolism-related brain alterations based on the two rare-variant genes identified in this study. The Gene-SCOUT analysis identified a series of genes that were highly similar to these two genes. These genes displayed significant enrichment in the regulation of glial cell differentiation and observational learning. As evidenced by previous human and animal studies, disrupted differentiation of glial cell (astrocytes and oligodendrocytes) is one of the human alcohol-related neuropathology35 and heavy alcohol exposures could result in cognitive impairment36. Since alcohol consumption influences intracellular signaling mechanisms, causing alterations in gene expression that gradually produce long-lasting damage in the brain37, these identified genes might be involved in the pathological process. What’s more, glia dysfunction is known to cause white matter atrophy, and these two genes are significantly expressed in white matter, further hinting that they might mediate alcohol-related brain damage. Another finding lies in their dominant expression in the cerebellum, one of the major target organs of alcohol abuse. Moreover, ANKRD12 and GIGYF1 are well-known genes for reduced cognitive function and intellectual disability as evidenced by previous studies33,34. Consistently with previous findings, our PheWAS analyses indicated strong correlations between these genes and cognitive decline as well as altered white matter integrity, which suggests that these genes play a significant role in brain function and structure. The findings are plausible as prior studies have observed the associations between heavy alcohol consumption and changes in brain structure38,39. More interestingly, alcohol consumption-related white matter microstructure changes have been considered a hallmark of AUD40,41. Therefore, ANKRD12, significantly associated with alcohol consumption, AUDIT, and white matter integrity alterations, might serve as therapeutic targets for the prevention of AUD.
We observed sex heterogeneity for KDM5B, the association between KDM5B and alcohol consumption was only observed in the male group. As we only observed heterogeneity in one gene, it is possible due to the sex-specific biological function of KDM5B. KDM5B encodes a lysine-specific histone demethylase, which is an important regulator of liver molecular pathways after alcohol consumption42. Previous studies found sex-specific roles of KDM5B in the alcohol-induced hepatic response, which regulates a fibrogenic program in females while contributes to hepatocyte dedifferentiation and fatty acid synthesis in males43,44. However, the sex-specific mechanisms underlying the influence of KDM5B on alcohol consumption is still unclear. Future studies to identify the mechanisms will be necessary.
Despite these significant findings, our study has some limitations. First, as WES could only detect variants in the protein-coding regions, the possible genetic associations in non-protein-coding region were less investigated in this work. Second, because of the scarcity of a comparable population cohort with genetic sequencing and phenotype data for replication, we relied on existing GWAS data for alcohol consumption and AUD to support our findings. Further whole-exome studies are needed to replicate the identified genes. Third, the causality between the reported genes and alcohol use was largely unknown. Further research are needed to replicate and verify the identified genes and the potential relationship with alcohol consumption. Lastly, participants who drinking up to 3 times monthly and less were assigned a weekly drinking level of zero following a previous study45. While this simplified approach may introduce some error into their drinking levels, it is expected to be relatively small given the infrequency of their alcohol consumption.
In conclusion, by sequencing the protein-coding regions, we were able to replicate the genes previously reported and identify common and rare coding variants that have a strong effect on alcohol consumption. Additionally, functional analysis of the identified genes not only recapitulated known biological processes in alcohol consumption but also provided insights into the brain’s role in alcohol consumption. We anticipate that our findings of the alcohol consumption-related genes will facilitate the identification of individuals that are vulnerable or intolerant to alcohol consumption, contributing eventually to the prevention as well as treatment of alcohol-related adverse outcomes.
Methods
UK Biobank
The UKB included phenotypic and genetic information for approximately 500,000 participants of ages between 40 and 6946,47. Informed consent has been signed by all participants. The UKB cohort was approved by the NHS National Research Ethics Service North West (reference number: 16/NW/0274). The data utilized in the study included demographic data, alcohol-related phenotypes, neuropsychiatric diseases, cardiovascular diseases, cognition, brain grey matter and white matter phenotypes, heart function, lung function, biochemistry, and inflammation phenotypes. The research was performed under application number 19542.
Study phenotypes
The alcohol consumption score was determined through a self-administered touchscreen interview conducted during the baseline appointment. Initial data acquisition involved obtaining mean weekly alcohol consumption data, taking into account various beverage types, from participants reporting alcohol consumption more than once or twice weekly. Each alcoholic drink type was measured in specific units: spirits in measures, wines in glasses, and beer/cider in pints, approximately equating to one, two, and two point five units, respectively. For respondents indicating intake frequencies of “one to three times a month,” “special occasions only,” or “never” (for whom weekly alcohol consumption data were unavailable), a weekly volume of 0 units was assigned. The determination of alcoholic units per week involved aggregating the intakes for these five drink types, consistent with a previous study45. The median alcoholic units per week of the whole sample was 10 (Supplementary Data 2). The alcohol consumption score was the log (units+1) transformed alcoholic units per week. Detailed information was available in Supplementary Data 1.
Whole exome sequencing data
WES was performed for approximately 454,756 individuals from the UKB with IDT xGen Exome Research Panel v1.011,18. We implemented centralized quality control following extensive quality control procedures following previous research13. Concisely, multi-allelic sites were segregated into bi-allelic sites and calls with poor genotype quality or excessively low/high genotype depth were marked as no-call. Next, we excluded variants located in Ensembl low-complexity regions, along with variants possessing call rate ≤ 90%, and Hardy-Weinberg Equilibrium (HWE) P-value ≤ 10−15. Finally, we removed participants who withdrew from the UKB, duplicates, participants exhibiting discrepancies between self-reported and genetically indicated sex, and participants with Ti/Tv, Het/Hom, SNV/indel, and the amount of singletons exceeding 8 standard deviations from the mean. Additionally, we excluded individuals who were genetically related at the 3rd degree or closer in the main analysis. Overall, a total of 304,119 individuals with available alcohol consumption data and genetic data passed the initial quality check and were used in the main analysis. We additionally conducted ExWAS in all (both genetically related and unrelated) white British participants and unrelated non-white British individuals. White British individuals were identified as the intersection of participants who self-reported as ‘White British’ and those who exhibited very similar genetic ancestry based on genetic components. To control population stratification, we generated the top 10 ancestral principal components (PCs) using a high-quality independent autosomal variants subset, as outlined in a prior study13. Specifically, this subset of variants comprised variants with MAF > 0.1%, HWE P > 10−6, missingness < 1%, and underwent two rounds of pruning (--indep-pairwise 200 100 0.1 and 200 100 0.05 in PLINK).
Variant annotation
First, rare variants were defined as MAF less than 1%. SnpEff was utilized to annotate the variants48, during which the most detrimental consequence of the gene transcript was retained. Subsequently, variants annotated as frameshift, splicing donor, stop gain, splicing acceptor, stop loss, and start loss were categorized as loss of function (LOF). Variants that were consistently predicted as deleteriousness in SIFT49, PolyPhen2 HDIV, and PolyPhen2 HVAR50, LRT51, and MutationTaster52 were defined as likely deleterious missense.
ExWAS
ExWAS analysis was conducted using the SKAT-O test through SAIGE-GENE + 53. In SAIGE-GENE + , ultra-rare variants (minor allele carrier (MAC) ≤ 10) were collapsed into a pseudo marker, effectively addressing data sparsity caused by the presence of ultra-rare variants53. Therefore, both rare and ultra-rare variants could be investigated. First, single-variant association analyses were performed for all variants with MAC ≥ 20, as suggested by SAIGE-GENE + 53. Independent significant variants were identified using linkage disequilibrium (LD)-clumping (r2 < 0.1), with the UKB WES data utilized as the reference panel, and subsequently mapped to genes using VEP54. Then, in the gene-based collapsing analyses, SKAT-O tests were conducted utilizing the minimum p-value method53,55. We used three distinct maximum MAF cutoffs (0.01%, 0.1%, and 1%) and two annotations masks (LOF and LOF plus missense). We adjusted age, sex, and the top ten ancestral PCs (which were calculated with WES data). All quantitative phenotypes underwent inverse normalization in SAIGE-GENE + . A relative coefficient cutoff of 0.05 was applied to the sparse genetic relationship matrix for the estimation of variance ratios.
Genotype and imputation
Genotype data (version 3) were from the UKB cohort. The UKB conducted array design, genotyping, quality control, and imputation procedures46. We performed quality control (excluding variants with MAF < 0.005, INFO < 0.3, call rate < 90% or HWE P < 10−50) with PLINK v256 software. Additionally, participants with missingness less than 0.05, no sex mismatch, no abnormal sex chromosome aneuploidy, no outliers in heterozygosity rate, and estimated white British ancestry, with a maximum of ten putative third-generation relatives, were incorporated into the analysis.
ExWAS for AUDIT
To extend the implications of alcohol consumption findings to alcohol use disorder, we conducted an ExWAS utilizing measures from the Alcohol Use Disorders Identification Test (AUDIT)57, obtained through an online mental health questionnaire and processed following the methodology detailed in the previous study58. Specifically, the scores for the AUDIT subdomains, representing alcohol consumption (AUDIT-C) and indicating alcohol dependence and problematic alcohol use (AUDIT-P), were calculated by consolidating scores from items 1–3 and items 4–10, respectively. The total score (AUDIT-T) was the sum of items 1–10. Detailed information was available in Supplementary Data 1. A total of 101,240 participants with available AUDIT measurements, WES data and covariate information were used for the analyses. We conducted ExWAS for both the total score and the subscores.
LOVO analysis
The LOVO analysis was performed for associations identified in the gene-based analysis. For each gene-phenotype association, the collapsing test was iterated upon excluding each variant initially included, where each variant would have a P-value. This was undertaken to address specific aspects: firstly, to examine the stability and consistency of the results across variant exclusions; secondly, to discern whether the gene-based collapsing association results were predominantly driven by specific variants; and finally, to investigate whether the observed gene-based collapsing associations were influenced by numerous rare variants characterized by relatively small effect sizes. If the collapsing analysis after removing a single variant yields an attenuated significance (P > 0.01), that single variant was considered to predominantly drive the gene-phenotype association13. This analytical approach allows for a comprehensive evaluation of the role of individual variants within the broader gene-based context.
Conditional analysis
To test for independence between the significant rare variant associations and nearby common variation, we re-conducted the gene-based collapsing analyses additionally correcting the nearby common variants associated with alcohol consumption13. First, we conducted association analyses for common variants (MAF > 0.5%) within the 500 kb genomic region of the identified genes, utilizing the UKB imputed genotype data. Then, LD-clumping was performed to identify independent significant loci (P < 1 × 10−5 and r2 < 0.01). At last, we performed the collapsing analyses additionally adjusting for the independent significant loci.
Burden heritability estimation
We estimated the burden heritability based on rare coding variants (LOF and missense) using the burden heritability regression (BHR) method59. The BHR performed regression of the burden test statistic on the burden score using summary statistics of the association analysis and allele frequencies at the variant level, and derived the burden heritability through estimation of the regression slope59.
Pathway enrichment analysis
We used the g:Profiler60 software to conduct the enrichment analysis, selecting Gene Ontology and KEGG database as the gene set databases. The g:SCS (Set Counts and Sizes) correction method was employed for multiple testing correction.
Tissue enrichment and expression analysis
To determine whether the identified genes were enriched in multiple tissues, we conducted tissue enrichment analysis using the R package TissueEnrich22. The source data were from the Human Protein Atlas, and the hypergeometric test was used22.
Transcript expression levels of the two genes (GIGYF1 and ANKRD12) in 256 tissues were determined utilizing RNA sequencing data from the Human Protein Atlas24. The dataset corresponds to Human Protein Atlas version 22.0 and Ensembl version 103.38. Additional details regarding the data are available elsewhere at (https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-2836).
Lifespan spatio-temporal gene expression trajectory
The lifespan spatio-temporal brain expression trajectories of the alcohol consumption-related genes were characterized using the mRNA-seq data of human brain from the PsychENCODE study25. The expression of each gene in each anatomical tissue was estimated. Gene expression levels was quantified utilizing the reads per kilobase per million mapped reads (RPKM) metric.
Single-cell expression
We used liver scRNA-seq data from Gene Expression Omnibus (GEO) database (accession ID: GSE115469)61 and processed it with the R package Seurat62. Individual cells with low quality, defined as the cells with less than 200 expressed genes or larger than 75% mitochondrial counts, were excluded. Then the gene expression matrix underwent normalization using the NormalizeData function in Seurat62. The top 25 PCs and a resolution of 0.4 were used to conduct clustering, and then the clusters were annotated according to the previous publication61.
Additionally, the brain scRNA-seq data sourced from temporal cortex tissues was obtained from the GEO database under accession ID GSE17373163. In the dataset, all cell types in the brain were isolated and sequenced63. Analysis and visualization were performed using the metadata files with the R package Seurat62.
Gene similarity
We utilized Gene-SCOUT23 to estimate the similarities between genes using association results of collapsing analyses across various quantitative traits in the UKB. In this tool, we searched the “seed gene” ANKRD12 to identify the similar genes. The top 10 similar genes and the “seed gene” were then employed in the enrichment analysis with Gene Ontology terms23.
MRI data and preprocessing
Structural MRI data were obtained from three dedicated and identical imaging centers64,65. Preprocessing of this data followed a pipeline established in previous studies66,67 with SPM12 software and the CAT12 toolbox68 with default settings. This included high-dimensional spatial normalization, nonlinear modulations, and smoothing (with an 8 mm half-maximum full-width Gaussian kernel). For regional grey matter volume, we employed the Automated Anatomical Labeling 3 (AAL3) atlas69, a brain parcellation system that subdivides the brain into 166 distinct regions. We utilized the AAL3 atlas due to its finer parcellation, especially in the subcortical regions, which are closely linked to alcohol use and addiction.
We utilized fractional anisotropy (FA) of white matter tracts provided by UKB. Detailed data processing and quality control procedures have been comprehensively outlined in prior study60. Specifically, dual diffusion-weighted shells were employed to acquire diffusion-weighted images, incorporating 50 distinct diffusion-encoding directions for each shell, and with a resolution of 2 × 2 × 2 mm. TBSS70 was used to conduct the alignment of FA images to a standard-space white matter skeleton. FA images was further improved with high-dimensional FNIRT-based warping for enhanced alignment71. Our analyses encompassed 48 distinct white matter tracts extracted based on the JHU ICBM-DTI-81 atlas72.
Phenome-wide association analysis
The phenotypes in PheWAS were centered around traits that are associated with alcohol consumption, including behavioral aspects and health outcomes. The disease-related analysis covered neuropsychiatric diseases, cardiovascular diseases, and digestive diseases, which can be impacted by alcohol consumption patterns. Additionally, the analysis incorporated cognitive tasks, inflammatory traits, blood biochemistry traits, neuroimaging traits (including grey and white matter measures), and cardiac and lung function measures, all of which are pertinent to understanding the impacts of genes related to alcohol consumption on human health and functioning. This comprehensive selection of phenotypes aligns with the aim of investigating the potential genetic influences on alcohol consumption and its related health implications. In the analysis of diseases, we investigated 10 neuropsychiatric diseases, 7 cardiovascular diseases, and 19 digestive diseases. For the analysis of continuous phenotypes, we examined 10 cognition tasks, 9 inflammatory traits, 30 blood biochemistry traits, 214 neuroimaging traits (including 166 grey matter measures and 48 white matter measures), 8 heart structure measures, and 9 spirometry measures. Comprehensive details regarding the phenotypes can be found in Supplementary Data 22. We used single-variant association tests for identified variants and SKAT-O tests for identified genes53, adjusting for the top ten ancestral PCs, age, and sex.
For the cognitive function tasks, data were preprocessed similar to the previous study73. We incorporated cognitive tests from both baseline and imaging follow-up. Specifically, we selected the timepoints that corresponded to the maximum sample size for each cognitive test.
Mendelian randomization analysis
To explore the mediating relationships between ANKRD12, GIGYF1, cognition, and alcohol consumption, we first conducted a bidirectional Mendelian randomization (MR) between cognition and alcohol consumption using TwoSampleMR R package. We employed GWAS summary data for the general factor of intelligence, derived from a compilation of seven distinct cognitive tests74, all sourced from the UK Biobank. Ensuring the avoidance of sample overlap, we utilized separate GWAS summary data for alcohol consumption, excluding participants from the UK Biobank19.
Sensitivity analysis
To evaluate the stability of the main results, we conducted multiple sensitivity analyses. Initially, we excluded participants who were former drinkers and non-drinkers (Field 20117) and performed association analysis for the identified genes. Additionally, we adjusted for rs1229984, a well-known alcohol consumption-related locus6,21, to identify independent associations.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The data used in the study from the UKB were accessible under restricted access (application number 19542). Access can be procured by submitting an application through the UKB platform (https://www.ukbiobank.ac.uk/). The scRNA-seq data are documented in the GEO database, accessible under accession codes GSE115469 and GSE173731. The transcript expression data are accessible in the Human Protein Atlas database (https://v22.proteinatlas.org/about/download). The processed human brain mRNA sequencing data were available in the PsychENCODE study (http://development.psychencode.org/files/processed_data/RNA-seq/). Summary GWAS statistics from FinnGen are available at https://storage.googleapis.com/finngen-public-data-r9/summary_stats/finngen_R9_AUD.gz and https://storage.googleapis.com/finngen-public-data-r9/summary_stats/finngen_R9_AUD_SWEDISH.gz. KEGG database used in gProfiler are available at https://www.genome.jp/kegg/pathway.html. The Gene Ontology database used in gProfiler are available at https://geneontology.org/docs/download-ontology/. The paper and/or the Supplementary Information contain all necessary data to assess the conclusions. In addition, this paper includes source data. Source data are provided with this paper.
Code availability
ExWAS analyses and PheWAS analyses was performed via the R package SAIGE GENE+ which was available on https://github.com/saigegit/SAIGE. Burden heritability regression analysis was performed via the R package BHR (v.0.1.0, https://github.com/ajaynadig/bhr). Annotation of significant variants was conducted with SnpEff (https://pcingola.github.io/SnpEff/). Gene ontology enrichment analysis was conducted using g:Profiler (https://biit.cs.ut.ee/gprofiler/gost) and tissue enrichment analysis was performed via the R package TissueEnrich (v.1.16.0, https://github.com/Tuteja-Lab/TissueEnrich). The scRNA-seq data were analyzed and visualized using the R package Seurat using the R package Seurat (v.4.3.0, https://satijalab.org/seurat/index.html). Gene-Scout was available through the website: https://astrazeneca-cgr-publications.github.io/gene-scout/.
References
Griswold, M. G. et al. Alcohol use and burden for 195 countries and territories, 1990-2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet 392, 1015–1035 (2018).
Freisthler, B., Wolf, J. P., Hodge, A. I. & Cao, Y. Alcohol Use and Harm to Children by Parents and Other Adults. Child Maltreat 25, 277–288 (2020).
Friesen, E. L. et al. Hazardous alcohol use and alcohol-related harm in rural and remote communities: a scoping review. Lancet Public Health 7, e177–e187 (2022).
Rehm, J. et al. The relationship of average volume of alcohol consumption and patterns of drinking to burden of disease: an overview. Addiction 98, 1209–1228 (2003).
Witkiewitz, K., Litten, R. Z. & Leggio, L. Advances in the science and treatment of alcohol use disorder. Sci. Adv. 5, eaax4043 (2019).
Saunders, G. R. B. et al. Genetic diversity fuels gene discovery for tobacco and alcohol use. Nature 612, 720–724 (2022).
Kranzler, H. R. et al. Genome-wide association study of alcohol consumption and use disorder in 274,424 individuals from multiple populations. Nat. Commun. 10, 1499 (2019).
Visscher, P. M. et al. 10 Years of GWAS Discovery: Biology, Function, and Translation. Am. J. Hum. Genet 101, 5–22 (2017).
Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
Brazel, D. M. et al. Exome Chip Meta-analysis Fine Maps Causal Variants and Elucidates the Genetic Architecture of Rare Coding Variants in Smoking and Alcohol Use. Biol. Psychiatry 85, 946–955 (2019).
Van Hout, C. V. et al. Exome sequencing and characterization of 49,960 individuals in the UK Biobank. Nature 586, 749–756 (2020).
Cirulli, E. T. et al. Genome-wide rare variant analysis for thousands of phenotypes in over 70,000 exomes from two cohorts. Nat. Commun. 11, 542 (2020).
Jurgens, S. J. et al. Analysis of rare genetic variation underlying cardiometabolic diseases and traits among 200,000 individuals in the UK Biobank. Nat. Genet 54, 240–250 (2022).
Szustakowski, J. D. et al. Advancing human genetics research and drug discovery through exome sequencing of the UK Biobank. Nat. Genet. 53, 942–948 (2021).
Marees, A. T. et al. Exploring the role of low-frequency and rare exonic variants in alcohol and tobacco use. Drug Alcohol Depend. 188, 94–101 (2018).
Zhou, H. et al. Genome-wide meta-analysis of problematic alcohol use in 435,563 individuals yields insights into biology and relationships with other traits. Nat. Neurosci. 23, 809–818 (2020).
Schumann, G. et al. KLB is associated with alcohol drinking, and its gene product β-Klotho is necessary for FGF21 regulation of alcohol preference. Proc. Natl Acad. Sci. 113, 14372–14377 (2016).
Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 599, 628–634 (2021).
Liu, M. et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat. Genet. 51, 237–244 (2019).
Kurki, M. I. et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613, 508–518 (2023).
Bierut, L. J. et al. ADH1B is associated with alcohol dependence and alcohol consumption in populations of European and African ancestry. Mol. psychiatry 17, 445–450 (2012).
Jain, A. & Tuteja, G. TissueEnrich: Tissue-specific gene enrichment analysis. Bioinformatics 35, 1966–1967 (2018).
Middleton, L. et al. Gene-SCOUT: identifying genes with similar continuous trait fingerprints from phenome-wide association analyses. Nucleic Acids Res. 50, 4289–4301 (2022).
Uhlén, M. et al. Tissue-based map of the human proteome. Science 347, 1260419 (2015).
Li, M. et al. Integrative functional genomic analysis of human brain development and neuropsychiatric risks. Science 362, eaat7615 (2018).
Brien, S. E., Ronksley, P. E., Turner, B. J., Mukamal, K. J. & Ghali, W. A. Effect of alcohol consumption on biological markers associated with risk of coronary heart disease: systematic review and meta-analysis of interventional studies. BMJ. 342, d636 (2011).
Peng, B. et al. Role of Alcohol Drinking in Alzheimer’s Disease, Parkinson’s Disease, and Amyotrophic Lateral Sclerosis. Int J. Mol. Sci. 21, 2316 (2020).
Biddinger, K. J. et al. Association of Habitual Alcohol Intake With Risk of Cardiovascular Disease. JAMA Netw. Open 5, e223849 (2022).
Mahedy, L. et al. Alcohol use and cognitive functioning in young adults: improving causal inference. Addiction 116, 292–302 (2021).
Curtis, D. Analysis of rare coding variants in 200,000 exome-sequenced subjects reveals novel genetic risk factors for type 2 diabetes. Diabetes Metab. Res Rev. 38, e3482 (2022).
Knott, C., Bell, S. & Britton, A. Alcohol Consumption and the Risk of Type 2 Diabetes: A Systematic Review and Dose-Response Meta-analysis of More Than 1.9 Million Individuals From 38 Observational Studies. Diabetes Care 38, 1804–1812 (2015).
Karczewski, K. J. et al. Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes. Cell Genom. 2, 100168 (2022).
Fu, J. M. et al. Rare coding variation provides insight into the genetic architecture and phenotypic context of autism. Nat. Genet. 54, 1320–1331 (2022).
Chen, C. Y. et al. The impact of rare protein coding genetic variation on adult cognitive function. Nat. Genet. 55, 927–938 (2023).
de la Monte, S. M. & Kril, J. J. Human alcohol-related neuropathology. Acta. Neuropathol. 127, 71–90 (2014).
Tiwari, V. & Chopra, K. Resveratrol abrogates alcohol-induced cognitive deficits by attenuating oxidative-nitrosative stress and inflammatory cascade in the adult rat brain. Neurochem. Int. 62, 861–869 (2013).
Egervari, G., Siciliano, C. A., Whiteley, E. L. & Ron, D. Alcohol and the brain: from genes to circuits. Trends Neurosci. 44, 1004–1015 (2021).
Daviet, R. et al. Associations between alcohol consumption and gray and white matter volumes in the UK Biobank. Nat. Commun. 13, 1175 (2022).
Sullivan, E. V. & Pfefferbaum, A. Brain-behavior relations and effects of aging and common comorbidities in alcohol use disorder: A review. Neuropsychology 33, 760–780 (2019).
Monnig, M. A., Tonigan, J. S., Yeo, R. A., Thoma, R. J. & McCrady, B. S. White matter volume in alcohol use disorders: a meta-analysis. Addict. Biol. 18, 581–592 (2013).
Pfefferbaum, A. & Sullivan, E. V. Disruption of brain white matter microstructure by excessive intracellular and extracellular fluid in alcoholism: evidence from diffusion tensor imaging. Neuropsychopharmacology 30, 423–432 (2005).
Schonfeld, M., O’Neil, M., Weinman, S. A. & Tikhanovich, I. Alcohol-induced epigenetic changes prevent fibrosis resolution after alcohol cessation in miceresolution. Hepatology. https://doi.org/10.1097/HEP.0000000000000675 (9900).
Schonfeld, M., Averilla, J., Gunewardena, S., Weinman, S. A. & Tikhanovich, I. Alcohol‐associated fibrosis in females is mediated by female‐specific activation of lysine demethylases KDM5B and KDM5C. Hepatol. Commun. 6, 2042–2057 (2022).
Schonfeld, M., Averilla, J., Gunewardena, S., Weinman, S. A. & Tikhanovich, I. Male‐Specific Activation of Lysine Demethylases 5B and 5C Mediates Alcohol‐Induced Liver Injury and Hepatocyte Dedifferentiation. Hepatol. Commun. 6, 1373–1391 (2022).
Howe, L. J. et al. Genetic evidence for assortative mating on alcohol consumption in the UK Biobank. Nat. Commun. 10, 1–10 (2019).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203 (2018).
Sudlow, C. et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 6, 80–92 (2012).
Vaser, R., Adusumalli, S., Leng, S. N., Sikic, M. & Ng, P. C. SIFT missense predictions for genomes. Nat. Protoc. 11, 1–9 (2016).
Adzhubei, I., Jordan, D. M. & Sunyaev, S. R. Predicting Functional Effect of Human Missense Mutations Using PolyPhen-2. Curr. Protoc. Hum. Genet. 76, 7.20.1–7.20.41 (2013).
Chun, S. & Fay, J. C. Identification of deleterious mutations within three human genomes. Genome Res. 19, 1553–1561 (2009).
Schwarz, J. M., Rödelsperger, C., Schuelke, M. & Seelow, D. MutationTaster evaluates disease-causing potential of sequence alterations. Nat. methods 7, 575–576 (2010).
Zhou, W. et al. SAIGE-GENE+ improves the efficiency and accuracy of set-based rare variant association tests. Nat. Genet. 54, 1466–1469 (2022).
Martin, F. J. et al. Ensembl 2023. Nucleic Acids Res. 51, D933–D941 (2022).
Zhou, W. et al. Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts. Nat. Genet. 52, 634–639 (2020).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Saunders, J. B., Aasland, O. G., Babor, T. F., De La Fuente, J. R. & Grant, M. Development of the Alcohol Use Disorders Identification Test (AUDIT): WHO Collaborative Project on Early Detection of Persons with Harmful Alcohol Consumption-II. Addiction 88, 791–804 (1993).
Sanchez-Roige, S. et al. Genome-Wide Association Study Meta-Analysis of the Alcohol Use Disorders Identification Test (AUDIT) in Two Population-Based Cohorts. Am. J. Psychiatry 176, 107–118 (2018).
Weiner, D. J. et al. Polygenic architecture of rare coding variation across 394,783 exomes. Nature 614, 492–499 (2023).
Raudvere, U. et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 47, W191–W198 (2019).
MacParland, S. A. et al. Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations. Nat. Commun. 9, 4383 (2018).
Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
Garcia, F. J. et al. Single-cell dissection of the human brain vasculature. Nature 603, 893–899 (2022).
Miller, K. L. et al. Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nat. Neurosci. 19, 1523 (2016).
Alfaro-Almagro, F. et al. Image processing and Quality Control for the first 10,000 brain imaging datasets from UK Biobank. NeuroImage 166, 400–424 (2018).
Chun, S. et al. Associations of Social Isolation and Loneliness With Later Dementia. Neurology 99, e164 (2022).
Jujiao, K. et al. Association between obesity, brain atrophy and accelerated brain aging and their genetic mechanisms. medRxiv, 2022.12.30.22284052 (2022).
Gaser, C. et al. CAT—A Computational Anatomy Toolbox for the Analysis of Structural MRI Data. bioRxiv, 2022.06.11.495736 (2023).
Rolls, E. T., Huang, C.-C., Lin, C.-P., Feng, J. & Joliot, M. Automated anatomical labelling atlas 3. NeuroImage 206, 116189 (2020).
Smith, S. M. et al. Tract-based spatial statistics: Voxelwise analysis of multi-subject diffusion data. NeuroImage 31, 1487–1505 (2006).
de Groot, M. et al. Improving alignment in Tract-based spatial statistics: Evaluation and optimization of image registration. NeuroImage 76, 400–411 (2013).
Wakana, S., Jiang, H., Nagae-Poetscher, L. M., Van Zijl, P. C. & Mori, S. Fiber tract-based atlas of human white matter anatomy. Radiology 230, 77–87 (2004).
Kang, J. et al. Increased brain volume from higher cereal and lower coffee intake: shared genetic determinants and impacts on cognition and metabolism. Cereb. Cortex 32, 5163–5174 (2022).
de la Fuente, J., Davies, G., Grotzinger, A. D., Tucker-Drob, E. M. & Deary, I. J. A general dimension of genetic sharing across diverse cognitive traits inferred from molecular data. Nat. Hum. Behav. 5, 49–58 (2021).
Acknowledgements
We express our gratitude to the participants of the UK Biobank for their valuable time, and we extend our appreciation to the dedicated team members of the UK Biobank for their efforts in data collection. We acknowledge the participants and investigators involved in the FinnGen study. We also acknowledge the contributions of MacParland, S.A. et al. and Garcia, F.J. et al. for providing the scRNA-seq matrix. We acknowledge the Human Protein Atlas project, the PsychENCODE Consortium and the FinnGen project for their unwavering dedication to advancing scientific research. W Cheng received support through grants from the National Natural Sciences Foundation of China (no. 82071997) and the Shanghai Rising-Star Program (no. 21QA1408700). J.T. Yu received support through grants from the Science and Technology Innovation 2030 Major Projects (2022ZD0211600), National Natural Science Foundation of China (82071201, 81971032, 92249305), Shanghai Municipal Science and Technology Major Project (No.2018SHZDZX01), Research Start-up Fund of Huashan Hospital (2022QD002), Excellence 2025 Talent Cultivation Program at Fudan University (3030277001), Shanghai Talent Development Funding for The Project (2019074), and ZHANGJIANG LAB, Tianqiao and Chrissy Chen Institute, and the State Key Laboratory of Neurobiology and Frontiers Center for Brain Science of Ministry of Education, Fudan University. J.F. Feng received support through grants from National Key R&D Program of China (No. 2018YFC1312904 and No. 2019YFA0709502), the Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX01), the 111 Project (No. B18015), Shanghai Center for Brain Science and Brain-Inspired Technology and Zhangjiang Lab. T.Y. Jia received support through grants from the National Key R&D Program of China (No. 2019YFA0709501) and the National Natural Science Foundation of China (T2122005, No. 81801773).
Author information
Authors and Affiliations
Contributions
All authors had complete access to the data in this study and acknowledged the responsibility for its submission for publication. W.C., J.T.Y., and J.F.F. designed the study. J.J.K. and Y.T.D. conducted the main analyses and drafted the manuscript. B.S.W., Z.Y.L., W.S.L., S.T.X., L.Y., and J.Y. contributed to data collection and analyses. X.H.G. and T.Y.J. contributed to data interpretation. J.T.Y., W.C., and J.F.F. provided critical revisions to the manuscript. All authors have reviewed and approved the final version.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Neelroop Parikshak and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Source data
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kang, J., Deng, YT., Wu, BS. et al. Whole exome sequencing analysis identifies genes for alcohol consumption. Nat Commun 15, 5777 (2024). https://doi.org/10.1038/s41467-024-50132-3
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41467-024-50132-3
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.