Introduction

Alcohol use disorders (AUDs) are highly prevalent, disabling disorders that often go untreated in the USA1. Although a substantial heritable component has been found to underlie the variation in AUDs (see reviews2,3), the identification of specific genetic variants associated with the disorder in genome-wide association studies (GWAS), though appearing promising, has proved to be challenging. The most consistent findings among studies have been variations in alcohol-metabolizing genes alcohol dehydrogenase (ADH) and aldehyde dehydrogenase (ALDH) that have been shown to confer protection against alcohol dependence in several populations2,4,5. In recent years, GWAS have yielded an additional small yet diverse set of single-nucleotide polymorphisms (SNPs) that have been associated with alcohol dependence, alcohol consumption, and related traits in a number of ethnic groups5,6,7,8,9,10,11. Among these recent findings, β-Klotho (KLB) has been repeatedly associated with alcohol consumption in large population studies of European ancestry11,12, although alcohol consumption may have significantly different genetic patterns from alcohol use disorders (AUD)13. One ethnic group that is particularly understudied yet has a high prevalence of AUD is American Indians (AI)1,14. While several studies provide data to demonstrate that a substantial genetic component for risk for AUD exists in the select AI tribes that have been studied15, little is known as to the exact genes and genetic variants that may confer this possibly elevated risk, with the exception of the ADH and ALDH loci16,17,18.

There are several reasons for the paucity of findings linking specific variants to AUDs in any ethnic group. One potential reason is the difficulty of defining the phenotype. One aspect of AUDs that appears to be highly consistent is the clinical course of the disorder19,20. The clinical course, as described by Schuckit et al.20, consists of the order and progression of 36 alcohol-related life events. These life events have been shown to be highly similar and consistent across many different subgroups and populations, although age of onset and endorsement rates of individual events can differ21,22,23,24,25,26,27,28,29. Although this phenotype has been extensively described, it has not yet been utilized as a trait to evaluate the genetics of AUDs.

Another phenotype that has been little explored in genetic studies is substance-induced mood disturbance. Among individuals with moderate, and especially severe, AUD, mood disturbances can arise that sometimes mimic major depressive episodes (MDE). This phenomenon was initially called “secondary depression” and later substance-induced MDE30,31. More subtle substance-induced affective symptoms may also occur during a bout of heavy drinking and/or when an individual cuts down on their drinking or during withdrawal32. These more subtle symptoms have been suggested by Koob et al.33 as comprising the “dark side” of addiction. It has been further hypothesized that a “negative emotional state” can arise during heavy alcohol exposure and then act as negative reinforcement that promotes additional drinking in an attempt to eliminate the affective symptoms34,35.

Another potential reason for the paucity of genetic findings in AUD is that, similar to other complex diseases, the variants that have so far been identified in GWAS for AUD-related traits are primarily common variants that may collectively explain only a small portion of the heritability36. There are many theories regarding this “missing heritability” of complex diseases37. One class of genetic variation that is largely understudied for AUD is rare variants in the genome. Rare variants have been understudied, in part, due to technological constraints limiting comprehensive whole-genome sequencing (WGS) of population samples and the lack of statistical models that incorporate variables such as family relatedness and ethnic admixture. Genotyping followed by imputation to reference panels is insufficient for studying special high-risk populations, as rare variants are often unique to each population. Recent advances in WGS technologies and analytical methods, however, have made possible the identification of both rare and common variants in studies of novel and admixed populations enriched for substance dependence phenotypes, such as American Indians17,18,38.

In the present study, we sought to investigate the genetic basis of two understudied phenotypes, (1) the clinical course of AUD as indexed by alcohol-related life events and (2) alcohol-induced affective symptoms, in two independent cohorts: American Indians (AI) and Euro-Americans (EA). Specifically, we conducted (1) genome-wide association analysis, (2) rare-variant analysis, (3) functional and pathway analyses, and (4) tissue-specific gene expression enrichment analysis, using low-coverage whole-genome sequencing data, in order to identify both shared and distinct genetic factors between the two populations.

Materials and methods

Participants

Two independent populations were investigated: 742 Native Americans of extended pedigrees from an American Indian cohort (AI), and 1711 primarily Euro-American (EA) participants from the San Francisco Family Alcohol Study (SFFS) (see Table S1 for demographics). We refer to the first cohort as AI and the second as EA or SFFS interchangeably. The population characteristics and the recruitment procedures of the two cohorts have been described previously39,40,41,42. The protocol of the study of the American Indian cohort was approved by the Scripps Research Institute Institutional Review Board and the Indian Health Council, a tribal review group overseeing health issues for the reservations where recruitment took place. The protocol for collection of participants in the SFFS was approved by the University of California San Francisco (UCSF) Committee for the Protection of the Rights of Human Subject while the recruitment took place. Subsequently, the University of North Carolina at Chapel Hill IRB approved the data analysis plan. Written informed consent was obtained from each participant after study procedures had been fully explained. Participants were compensated for their time spent in the study.

Phenotypes

The clinical course of AUD as indexed by the 36 alcohol-related life events described by Schuckit et al.20 in these two populations has been previously studied22,23,26. We herein defined a new weighted alcohol-related life events phenotype: a quantitative trait derived from the 36 alcohol-related life events as listed in Table 1. The life events were given a severity weight of 1 for events 1–12, 2 for 13–24 and 3 for 25–36. The order of the events was based on the mean age of occurrence, with the first event happening earliest and the last event (36th) occurring latest in a lifetime. The phenotype was defined as the sum of the severity weights of the 36 alcohol-related life events. The resulting alcohol-related life events for AI and SFFS (EA) cohorts are characterized in Table S2. Of the 1711 individuals in SFFS, 1702 had valid values for this trait. This newly derived phenotype is correlated with DSM-5 AUD diagnoses with correlation coefficients ranging from 0.62 to 0.78 and 0.81 for mild to moderate and severe AUDs, respectively, in the AIs (Table S3).
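The weighting scheme above can be illustrated with a minimal sketch. It assumes, as an interpretation not spelled out in the text, that the sum runs over the events a participant has endorsed, and that events are identified by their 1-based rank in the Table 1 ordering. The function names are hypothetical.

```python
def severity_weight(event_rank: int) -> int:
    """Severity weight for a life event given its 1-based rank (1..36).

    Ranks 1-12 get weight 1, ranks 13-24 weight 2, ranks 25-36 weight 3.
    """
    if not 1 <= event_rank <= 36:
        raise ValueError("event rank must be between 1 and 36")
    return (event_rank - 1) // 12 + 1


def weighted_life_events_score(endorsed_ranks) -> int:
    """Sum of severity weights over the endorsed events.

    Assumption: only endorsed events contribute; duplicates are ignored.
    """
    return sum(severity_weight(rank) for rank in set(endorsed_ranks))


# A participant endorsing one early, one middle, and one late event.
example_score = weighted_life_events_score([1, 13, 25])
# A participant endorsing all 36 events reaches the maximum possible score.
maximum_score = weighted_life_events_score(range(1, 37))
```

Under this reading the trait ranges from 0 (no events endorsed) to 72 (all 36 events endorsed).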

Table 1 Clinical course of alcohol use disorders: 36 alcohol-related life events, from which the weighted life events phenotype was derived

Two symptoms, which were assessed across both samples, were used as an index of substance-induced affective states or “dark side symptoms”. The first was a measure of withdrawal that queried whether participants ever felt anxious or depressed when they stopped or cut down on drinking. The second measure queried whether participants’ drinking had ever caused them to feel depressed or uninterested in things for more than 24 h and to the point that it interfered with their functioning. Both phenotypes are dichotomous. The distributions of the two “dark side” phenotypes are also listed in Table S2. These symptoms are rare in moderate and mild AUD and common in severe AUD32.

Whole-genome sequencing and association analysis

The same methods and pipeline, which have been previously published38, were used to conduct low-coverage whole-genome sequencing (LCWGS) on blood-derived DNA for the AI and SFFS cohorts.

A linear mixed model approach as implemented in EMMAX43 was used in the whole-genome association analysis to control for population structure (as the AI cohort is primarily admixed)17,44 as well as familial relatedness. The association for each variant was conditioned on a kinship matrix that was estimated from the genotypes, in order to capture a wide range of sample structures. We further included sex, age, and age-squared as covariates in all association analyses. The significance of associations was corrected for multiple traits using the effective independent number of traits (meff)45. For the three traits in the present study, meff = 2.369 and 2.105 for the AI and SFFS cohorts, respectively. We used a p-value of 5 × 10⁻⁸ as the genome-wide significance threshold and 5 × 10⁻⁷ as the threshold for suggestive significance.
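As a rough illustration of how a multiple-trait correction interacts with these thresholds, the sketch below assumes a Bonferroni-style adjustment in which a nominal p-value is multiplied by meff before being compared with the thresholds. The exact form of the correction is not stated in this section, so this is an assumption, and the function name and example p-values are hypothetical rather than results from the study.

```python
GENOME_WIDE_THRESHOLD = 5e-8   # genome-wide significance, from the text
SUGGESTIVE_THRESHOLD = 5e-7    # suggestive significance, from the text


def classify_association(p_nominal: float, meff: float) -> str:
    """Classify an association after a Bonferroni-style meff correction.

    Assumption: corrected p = nominal p * meff, capped at 1.
    """
    p_corrected = min(1.0, p_nominal * meff)
    if p_corrected < GENOME_WIDE_THRESHOLD:
        return "significant"
    if p_corrected < SUGGESTIVE_THRESHOLD:
        return "suggestive"
    return "not significant"


# Hypothetical nominal p-values, using the meff values reported for each cohort.
ai_call = classify_association(1e-9, meff=2.369)
sffs_call = classify_association(4e-8, meff=2.105)
```

With these made-up inputs, a nominal p-value just under 5 × 10⁻⁸ no longer clears the genome-wide threshold once multiplied by meff, which is the kind of situation a multiple-trait correction is meant to flag.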

We additionally performed a gene-based test using fastBAT46. For each gene, all variants within ± 50 Kb of the gene and with MAF ≥ 1% were included. The p-values were corrected for meff. The number of genes (N) was 24,690 and 24,681 for AI and SFFS, respectively. Thus, the significance threshold for the p-value was set at 0.05/N = 2.0 × 10⁻⁶, and the suggestive threshold at 2.0 × 10⁻⁵.
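The gene-based threshold is a plain Bonferroni division, which can be checked with a few lines. The helper name below is hypothetical; the gene counts are the ones given in the text.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold for n_tests independent tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Number of genes tested in each cohort, as reported in the text.
genes_tested = {"AI": 24690, "SFFS": 24681}
thresholds = {
    cohort: bonferroni_threshold(n_genes)
    for cohort, n_genes in genes_tested.items()
}
for cohort, value in thresholds.items():
    print(cohort, f"{value:.1e}")
```

Both cohorts round to the same value of 2.0 × 10⁻⁶, so a single threshold serves for the two analyses.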

Gene-based low-frequency and rare variants association analysis

A linear mixed model-based combined multivariate and collapsing method47 as implemented in EMMAX was used to collectively analyze the variants having lower than 5% minor allele frequency (MAF). We grouped the low-frequency (1% ≤ MAF < 5%) and rare variants (MAF < 1%) by genes. For each gene, we formed two groups. One group considered all variants on exons, 5′- and 3′-UTRs, upstream and downstream of the gene (denoted as Exon + Reg). The other included only the nonsynonymous and the splicing-site variants (denoted as Nonsyn). Intergenic variants were not considered in the present study. For each variant group type, a gene was excluded if fewer than three markers were found, or if <1% of the samples had any such markers on the gene. The p-values were corrected for meff. The significance thresholds for corrected p-values were set at 0.05/(N_Exon+Reg + N_Nonsyn) for each trait and cohort, where N_Exon+Reg is the number of genes in the group Exon + Reg and N_Nonsyn is the number of genes in the group Nonsyn. Note that correcting for the sum of the numbers of genes in the two groups is likely an overcorrection, as the two groups of variants are correlated.
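The frequency classes, the gene inclusion rule, and the threshold described above can be sketched as follows. This is only an illustration of the stated rules, not the EMMAX implementation; the function names are hypothetical and the gene counts in the example are made up.

```python
def maf_class(maf: float) -> str:
    """Frequency class of a variant from its minor allele frequency."""
    if maf < 0.01:
        return "rare"
    if maf < 0.05:
        return "low-frequency"
    return "common"  # not included in the rare-variant test


def gene_is_testable(n_markers: int, carrier_fraction: float) -> bool:
    """Inclusion rule for one variant group of one gene.

    A gene is excluded if it has fewer than three qualifying markers, or if
    fewer than 1% of samples carry any qualifying marker.
    """
    return n_markers >= 3 and carrier_fraction >= 0.01


def rare_variant_threshold(n_exon_reg: int, n_nonsyn: int,
                           alpha: float = 0.05) -> float:
    """Bonferroni threshold over both variant groups combined."""
    return alpha / (n_exon_reg + n_nonsyn)


# Hypothetical gene counts for one trait and cohort.
example_threshold = rare_variant_threshold(n_exon_reg=15000, n_nonsyn=10000)
```

Because the Nonsyn variants are a subset of the Exon + Reg variants for the same gene, dividing by the sum of both gene counts is stricter than needed, which is the overcorrection the text mentions.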

Functional and pathway analyses

Top variants from our association analyses were tested against the brain-specific cis-eQTL database BRAINEAC48. Polyphen-2 was used to predict whether nonsynonymous variants might be potentially damaging49. The variants with p-values < 10⁻⁵ from each GWAS were annotated with genes using SGAdviser50; the associated sets of genes were then subjected to functional analyses. We used GeneMANIA51 to extract potential functional networks, and DAVID 6.8 (ref. 52) for disease enrichment analysis.

Tissue-specific gene expression enrichment analysis

We obtained the median tissue-specific gene expression data from The Genotype-Tissue Expression (GTEx) Project release V7 (at GTEx Portal) for the sets of genes associated with variants that had a p-value < 10⁻⁵ in the GWAS of each trait and cohort. For each gene, its expression profile across tissues was standardized. For each tissue, we then counted the number of genes in each gene set whose expression z-score was at least 2 (that is, genes for which the tissue was among their most highly expressing tissues). If this gene count was significantly higher than expected, the tissue was considered enriched with respect to tissue-specific expression for the gene set. The significance was determined through permutation tests.
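The enrichment procedure can be sketched in a few lines. The permutation scheme is not described in this section, so the sketch assumes a simple null in which random gene sets of the same size are drawn from all genes with expression data; the function names, the use of the population standard deviation, and the toy data are all illustrative assumptions.

```python
import random
from statistics import mean, pstdev


def tissue_zscores(expression):
    """Standardize one gene's median expression values across tissues."""
    mu, sd = mean(expression), pstdev(expression)
    if sd == 0:
        return [0.0] * len(expression)
    return [(value - mu) / sd for value in expression]


def count_high(gene_set, zscores, tissue_idx, z_cut=2.0):
    """Number of genes in gene_set with z-score >= z_cut in one tissue."""
    return sum(1 for gene in gene_set if zscores[gene][tissue_idx] >= z_cut)


def enrichment_pvalue(gene_set, expression_by_gene, tissue_idx,
                      n_perm=1000, seed=0):
    """Observed count and permutation p-value for one tissue.

    Assumed null: random gene sets of equal size drawn without replacement
    from all genes in expression_by_gene.
    """
    zscores = {g: tissue_zscores(v) for g, v in expression_by_gene.items()}
    observed = count_high(gene_set, zscores, tissue_idx)
    rng = random.Random(seed)
    background = sorted(zscores)
    hits = 0
    for _ in range(n_perm):
        random_set = rng.sample(background, len(gene_set))
        if count_high(random_set, zscores, tissue_idx) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: 50 genes, 10 tissues; genes g0..g4 are specific to tissue 0,
# the remaining genes are each specific to one of tissues 1..9.
toy_expression = {}
for i in range(50):
    profile = [1.0] * 10
    profile[0 if i < 5 else (i % 9) + 1] = 10.0
    toy_expression[f"g{i}"] = profile

observed_count, p_value = enrichment_pvalue(
    [f"g{i}" for i in range(5)], toy_expression, tissue_idx=0
)
```

In the toy data every gene in the set is specific to tissue 0, so the observed count equals the set size and random gene sets almost never match it.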

Data and code availability

The SFFS dataset has been deposited in dbGaP (accession: phs001458.v1.p1). In accordance with the wishes of the tribes, no sharing of the AI data is possible. The analysis code is available upon request.

Full details of all the analyses are given in Supplemental Materials and Methods.

Results

GWAS for the American Indian cohort

All variants associated with any of the three alcohol-related traits at the suggestive significance level or better (p < 5 × 10⁻⁷) in the AI cohort had <5% allele frequency (see Tables 2 and S5, and Fig. S1). Variant rs200577368, downstream of gene NAF1 and 658 Kbp upstream of FSTL5 (Fig. S3A), was significantly associated with alcohol-related life events (p = 6.35 × 10⁻⁹, see Fig. S2A). Variant rs79833306, downstream of DMRTA1, was also associated with alcohol-related life events (p = 5.14 × 10⁻⁸). Six additional variants were associated with the phenotype at suggestive significance levels (Tables 2, S5, and S7), including one SNP located 31 Kbp upstream of PCCA and 71 Kbp downstream of ZIC2, and two upstream of gene KCTD3 and downstream of KCNK2, both potassium channel genes (Fig. 1). Additionally, gene-based tests using the fastBAT statistic identified only a single gene, MME (a.k.a. NEP, CD10), as suggestively associated with alcohol-related life events (p = 1.47 × 10⁻⁵).

Table 2 Genomic variants for the strongest associations (nominal p < 5 × 10⁻⁸) or the top variant for each AUD-related trait in the American Indian (AI) and the European American (EA) cohorts
Fig. 1: Regional Manhattan plot of KCNK2 and KCTD3 variants for alcohol-related life events in AI.

Two variants (MAF = 1.4%), upstream of KCTD3 and downstream of KCNK2, were suggestively associated with alcohol-related life events in the American Indian (AI) cohort. The gene-based rare-variant test also showed that KCNK2 was associated with alcohol-related life events and affective symptoms when cutting down or during withdrawal. These two single-nucleotide polymorphisms (SNPs) are in high LD and also in LD with variants on and near the two nearby genes. They are both near gene activation sites and located in regions methylated in frontal cortex. Additionally, rs72739250 is located in a CpG island, together suggesting potential regulatory roles for these two variants81.

Variant rs150351153, located in an intronic region of PRKG2, was significantly associated with affective symptoms during withdrawal in the AI cohort (p = 9.75 × 10⁻⁹, Figs. S2B and S5), and nine others were associated at suggestive significance levels. No variant was significantly associated with alcohol-induced depression. However, ten variants were associated at suggestive significance levels (Fig. S2C).

GWAS for the European American cohort

No single variant remained genome-wide significant after correcting for the number of traits (Tables 2 and S6, Fig. S2). The top association for alcohol-related life events was rs11100375, a common (MAF = 40%) intronic variant in the gene FSTL5, at a suggestive significance level (p = 4.82 × 10⁻⁷, Fig. S3B). Note that the top variant rs200577368 associated with the same trait in the AI cohort resides 1.2 Mb upstream of rs11100375 in the intergenic region between FSTL5 and NAF1. Although rs200577368 is in linkage disequilibrium (LD) with many variants in the region, as reflected by high D’ (Fig. S3A), there was no clear evidence of LD between rs200577368 and rs11100375. FSTL5, the follistatin-like 5 gene, was most highly expressed in the cerebellar hemisphere. The variant rs11100375 was found to be a cis-eQTL for FSTL5 in the hippocampus, thalamus, and substantia nigra (FDR = 0.022–0.035).

The top association for alcohol-induced affective symptoms during withdrawal in SFFS was rs2500086 (Fig. S2E). This variant is a cis-eQTL for JARID2 in the substantia nigra (FDR = 5.0 × 10⁻⁴) and a cis-eQTL for RNF182, a gene involved in the innate immune system, in the frontal cortex (FDR = 5.2 × 10⁻⁴).

A number of variants in or near a long non-coding RNA, LINC02347 (a.k.a. LOC100128554), were associated with alcohol-induced depression at suggestive significance (Figs. S2F and S4). The fastBAT gene-based test also identified this locus as significantly associated with alcohol-induced depression, with 155 SNPs included (p = 1.77 × 10⁻⁷). LINC02347 is located on chromosome 12q24.32, a sub-telomere region. The most significant variant was rs4309206 (p = 9.08 × 10⁻⁸), located upstream of LINC02347. The variant rs4309206 was identified as a cis-eQTL for LINC02347 in occipital cortex (FDR = 0.0024), substantia nigra (FDR = 0.0057), and putamen (FDR = 0.013).

Sequences of LINC02347 were also partially mapped to transcripts of FAM32A, CHIA, ROS1, NIPA2, and TAP2. Additionally, top SNPs in high LD with rs4309206 were found to be brain cis-eQTLs for LOC283435, LOC400084, and TMEM132B (777 Kbp downstream), which are located near LINC02347 (Fig. S4). For TMEM132B, the top eQTLs in brain were rs10847158 (the second most significant SNP) for expression in frontal cortex (FDR = 0.0042), and rs4765395 for expression in white matter (FDR = 0.0023), occipital cortex (FDR = 0.0068), and substantia nigra (FDR = 0.033). All variants that were associated with one of the three AUD-related traits at the suggestive significance level or better in the SFFS cohort were common variants (see Tables 2 and S6) and represented cis-eQTLs for the related genes in brain regions.

Gene-based rare-variant analysis for the American Indians

Rare and less-frequent (MAF < 5%) exonic and regulatory (upstream or downstream) variants of a potassium channel gene, KCNK2, were significantly associated with alcohol-related life events (p = 7.74 × 10⁻⁷) and suggestively associated with alcohol-induced affective symptoms during withdrawal (p = 6.32 × 10⁻⁶) in the AI cohort (Table 3). The rare coding variants in KCNK2 were also associated with alcohol-induced affective symptoms during withdrawal in the SFFS cohort at p = 0.043 (Table 3). Note that the two variants (MAF = 1.4%) upstream of KCTD3 that were suggestively associated with alcohol-related life events in AI are also downstream of KCNK2 (see Fig. 1). KCNK2 was most highly expressed in fibroblasts, adrenal gland, thyroid, and several brain regions. KCTD3 was more ubiquitously expressed and highly expressed in the adrenal gland.

Table 3 Rare and low-frequency variants in genes (MAF < 5%) with the strongest associations in the AI and the EA cohorts

Rare coding variants in EBI3 (a.k.a. IL27B) were significantly associated with alcohol-induced affective symptoms during withdrawal (p = 1.67 × 10⁻⁷), followed by KCNK2. Only weak support was found for EBI3 in the Euro-American cohort (nonsynonymous variants in EBI3 were associated with alcohol-induced depression in SFFS with nominal p = 0.048). EBI3 was most highly expressed in lymphocytes and spleen. Rare, nonsynonymous variants on DSG1 were suggestively associated with affective symptoms during withdrawal, followed by SLC39A13. DSG1 was most highly expressed in skin, vagina, and esophagus. Although below the suggestive significance threshold in this rare-variant gene test, SLC39A13 has recently been identified as a novel locus for alcohol use disorder identification test (AUDIT) total score in the UK Biobank13. Rare variants in an lncRNA, RP11-94B19.6, were suggestively associated with alcohol-induced depression, followed by rare coding variants in TICAM1 (which was most highly expressed in esophagus) and nonsynonymous variants in ZNF644.

Gene-based rare-variant analysis for the Euro-Americans

PDE4C was suggestively associated with alcohol-related life events (p = 1.44 × 10⁻⁶) using the rare nonsynonymous or splice-site variants (Table 3). This finding was replicated in the AI cohort (p = 2.93 × 10⁻³). Interestingly, although not unexpectedly given that these were rare variants, of the 10 and 9 rare nonsynonymous or splice-site variants found in PDE4C in the AI and SFFS cohorts, respectively, only one variant, rs182916479, at a splice site, was shared across cohorts (0.2% MAF in both cohorts). Of the ten variants in the AI cohort, Polyphen-2 predicted one variant to be possibly damaging (probability = 0.904) and three to be probably damaging (prob. = 0.992–0.999). Of the nine variants in the SFFS cohort, one was predicted to be possibly damaging (prob. = 0.736) and four probably damaging (prob. = 0.986–1).

IZUMO4 was the top gene for which rare and low-frequency nonsynonymous variants were associated with alcohol-induced affective symptoms during withdrawal. IZUMO4 was primarily highly expressed in testis. Rare coding and regulatory variants on COX19 were the top associations for alcohol-induced depression. COX19 was expressed in many tissues and most highly expressed in the adrenal gland. None passed the suggestive significance threshold after multiple-test correction.

Table S7 summarizes relevant functional details of all genes and variants that were significantly or suggestively associated with one of the traits in either cohort.

Functional analysis and tissue-specific gene expression analysis

The top functional group for alcohol-related life events for the AI cohort was potassium ion transport (FDR = 0.015) (Table S8), while the top functional group for alcohol-induced affective symptoms during withdrawal was arachidonic acid metabolic process (FDR = 0.068). The top functional groups for alcohol-related life events for the SFFS cohort (Table S9) were regulation of Rac protein signal transduction and regulation of Rac GTPase activity (FDR = 0.11). The top groups for alcohol-induced affective symptoms during withdrawal were response to virus (FDR = 2.64 × 10⁻⁵) and cellular response to type I interferon (FDR = 7.7 × 10⁻⁴). No functional group was found significant for alcohol-induced depression for either cohort. The enriched diseases for each trait are listed in Tables S10 and S11 for AI and SFFS, respectively.

The most enriched tissues with respect to tissue-specific gene expression by the top associated genes were adrenal gland and visceral adipose for alcohol-related life events in AI (see Fig. 2 and Table S12). Esophagus tissues were the most enriched for alcohol-induced affective symptoms during withdrawal, while nucleus accumbens was the most enriched for alcohol-induced depression in AI. In contrast, the most enriched tissues for alcohol-related life events in SFFS were mostly brain tissues including cortex, PFC, anterior cingulate cortex, and nucleus accumbens (Fig. 2 and Table S13). Sigmoid colon was the most enriched for alcohol-induced affective symptoms during withdrawal, followed by the tibial artery and visceral adipose. PFC, cortex, skeletal muscle, and caudate were most enriched for alcohol-induced depression in SFFS.

Fig. 2: Significance of tissue-specific gene expression enrichment by the top genes associated with one of the three AUD-related traits in the GWAS.

Median gene expression by tissues from GTEx V7 was used. Only the most expressed tissues (z-score ≥ 2) by a gene were included. Gene sets include those with SNPs that were associated with alcohol-related life events or alcohol-induced affective symptoms in the AI or EA cohort at p-value < 10⁻⁵. Labels: x-axis: cohort and trait (Life Events: alcohol-related life events; Depression: 24 h depression when drinking; Anxiousness: affective symptoms when cutting down or during withdrawal); y-axis: GTEx tissue name. The color scale from white to dark-blue corresponds to −log₁₀(p) = 0–3. Tissues with enrichment p-value < 0.1 for all gene sets are omitted. AI American Indians, EA European Americans, SNPs single-nucleotide polymorphisms, GWAS genome-wide association studies.

Discussion

The present study utilized low-coverage whole-genome sequence data to identify potential variants and pathways underlying two types of phenotypes associated with severe AUD, the severity of the clinical course of AUD and alcohol-induced affective symptoms, in an American Indian and a Euro-American population.

Converging evidence from AI and EA suggested two new loci for alcohol-related life events and affective symptoms

Rare variants in a K2P channel gene KCNK2 were associated with the alcohol-related life events and alcohol-induced affective symptoms during withdrawal in the American Indians. The latter also found some supporting evidence in the Euro-Americans. KCNK2, the most studied K2P channel, has been found to play a key role in the cellular mechanisms of neuroprotection, anesthesia, pain-sensing, and depression (see review53). It has been shown that Kcnk2-knockout mice have increased efficacy of serotonin neurotransmission and are resistant to depression; they also exhibit substantially reduced elevation of corticosterone levels under stress54. It has also been shown in humans that KCNK2 might be related to susceptibility to major depressive disorder (MDD) and involved in antidepressant treatment response55. Associated pathways include potassium channels and neuronal system. The Collaborative Study on the Genetics of Alcoholism (COGA) also identified a potassium channel gene KCNJ6 to be associated with endophenotypes of AUD56. Notably, KCNK2 channels can be opened by neuroprotective agents and anesthetics, and inhibited by clinical doses of antidepressant drugs, making it a potential pharmacological target53.

Nonsynonymous rare variants in a pro-inflammatory mediator gene, PDE4C, were associated with alcohol-related life events in the Euro-American cohort. The association was supported in the AI, although the identified rare variants were largely different. PDE4C-encoded protein belongs to the cyclic nucleotide phosphodiesterase (PDE) family and is one of the four PDE4 iso-enzymes, which are the most prevalent PDE in immune cells. The PDE4 inhibitors have long been recognized as anti-inflammatory agents57. Preclinical research has found that PDE4 inhibitors reduced ethanol consumption and preference in rodent models by increasing cAMP, thus reducing inflammatory signaling58,59,60. This is consistent with the hypothesis linking neuroimmune signaling with alcohol consumption and dependence61,62. Pathways associated with gene PDE4C include G protein and GPCR signaling, cAMP signaling, morphine addiction, and opioid signaling63. Genes of the same PDE family have previously been implicated in alcohol use. For instance, the COGA has identified a SNP near PDE11A to be associated with alcohol dependence6. PDE4B has been found to be associated with alcohol consumption in the UK Biobank11.

Additionally, a low-frequency variant between NAF1 and FSTL5 and common variants in FSTL5 were, respectively, the top variants associated with the alcohol-related life events in the American Indian and the Euro-American cohorts (Fig. S3). Although potentially an interesting locus, there was no clear evidence of LD between the top variants in the two cohorts. Functions of FSTL5 include calcium ion binding and protein binding. Variants near the gene have been associated with alcohol dependence64, response to amphetamines65, interferon-γ induced monokine66, and paliperidone response in schizophrenia67.

Loci uniquely associated with alcohol-related life events and affective symptoms in the American Indian cohort

A variant in PRKG2 and rare variants in an interleukin subunit gene EBI3 (IL-27B) were identified for affective symptoms during withdrawal in the AI. PRKG2 has been associated with obesity traits in a number of ethnic groups68,69. The gene has also been associated with EEG alpha power in the COGA cohort70. The variant in PRKG2 is in moderate LD with a few variants in or near RASGEF1B (see Fig. S5), a regulator of ICAM-1 in the TLR4/LPS signal transduction pathway involved in pro-inflammatory cytokines release to activate immune response71. This gene is ubiquitously expressed in many tissues and has been implicated in MDD72. EBI3 is a subunit of the composite cytokines IL-27 and IL-35. It is involved in IL-27-mediated signaling and cytokine signaling in the immune system and plays a role in cell-mediated immune response. It can also promote pro-inflammatory IL-6 functions by mediating trans-signaling73.

A novel long non-coding RNA was uniquely associated with alcohol-induced depression in the European American cohort

An lncRNA and a gene in chromosome segment 12q24.32 are of particular relevance to alcohol use phenotypes. There has been evidence suggesting that aberrant methylation of LINC02347 was associated with MDD in European populations74. Evidence has also shown that a hemizygous interstitial deletion at chromosome 12q24.31-q24.33 caused multiple dysmorphic features and developmental delay75. SNPs in this locus have been shown to act as brain cis-eQTLs for TMEM132B, which encodes a transmembrane protein in the TMEM132 gene family whose members have been implicated in brain development76, panic/anxiety77, bipolar disorder78, and insomnia79. The gene was most highly expressed in tibial nerve and many brain regions, followed by testis and thyroid. TMEM132B has been associated with excessive daytime sleepiness (EDS) with BMI adjustment79, for which depression was suggested as the most significant risk factor80.

In summary, this study presents the first genome-wide analysis of an AUD clinical course severity phenotype and alcohol-induced affective symptoms (“dark side” traits) in two independent populations: American Indians and Euro-Americans. We have identified several novel loci containing rare variants. Many associated genes show increased expression in brain regions, adrenal gland, and digestive tract, confirming the importance of neuronal, stress, immune, and metabolic systems in AUDs. However, certain limitations should be considered when making inferences from these findings. At the moderate sample sizes of 742 and 1711 for the AI and EA cohorts, respectively, we had limited statistical power to detect genome-wide significant associations (see Fig. S6 for power calculations for the study). Further, given the uniqueness of our American Indian sample, there are presently no replication samples available for the LCWGS study in AI, although there was corroborative evidence between the AI and the EA cohorts (two independent populations) to support two of the top rare-variant gene findings. Although all of the top GWAS variants from the EA cohort were found to be cis-eQTLs for certain brain regions, there was no eQTL information available in the public domain for any of the AI top variants, likely because they were all low-frequency variants. The fact that low-frequency variants predominated in our findings, especially for the AI cohort, suggests that rare and less-frequent variants may play important roles in complex diseases such as AUD, especially in unique high-risk populations such as American Indians.