Genetic loci for alcohol-related life events and substance-induced affective symptoms: indexing the “dark side” of addiction

A limited number of genetic variants have been identified in traditional GWAS as risk or protective factors for alcohol use disorders (AUD) and related phenotypes. We herein report whole-genome association and rare-variant analyses on AUD traits in American Indians (AI) and European Americans (EA). We evaluated 742 AIs and 1711 EAs using low-coverage whole-genome sequencing. Phenotypes included: (1) a metric based on the occurrence of 36 alcohol-related life events that reflect AUD severity; (2) two alcohol-induced affective symptoms that accompany severe AUDs. We identified two new loci for alcohol-related life events with converging evidence from both cohorts: rare variants of K2P channel gene KCNK2, and rare missense and splice-site variants in pro-inflammatory mediator gene PDE4C. A NAF1-FSTL5 intergenic variant and an FSTL5 variant were respectively associated with alcohol-related life events in AI and EA. PRKG2 of serine/threonine protein kinase family, and rare variants in interleukin subunit gene EBI3 (IL-27B) were uniquely associated with alcohol-induced affective symptoms in AI. LncRNA LINC02347 on 12q24.32 was uniquely associated with alcohol-induced depression in EA. The top GWAS findings were primarily rare/low-frequency variants in AI, and common variants in EA. Adrenal gland was the most enriched in tissue-specific gene expression analysis for alcohol-related life events, and nucleus accumbens was the most enriched for alcohol-induced affective states in AI. Prefrontal cortex was the most enriched in EA for both traits. These studies suggest that whole-genome sequencing can identify novel, especially uncommon, variants associated with severe AUD phenotypes although the findings may be population specific.


Introduction
Alcohol use disorders (AUDs) are highly prevalent, disabling disorders that often go untreated in the USA 1 . Although a substantial heritable component has been found to underlie the variation in AUDs (see reviews 2,3 ), the identification of specific genetic variants associated with the disorder in genome-wide association studies (GWAS), though appearing promising, has proved to be challenging. The most consistent findings among studies have been variations in alcohol-metabolizing genes alcohol dehydrogenase (ADH) and aldehyde dehydrogenase (ALDH) that have been shown to confer protection against alcohol dependence in several populations 2,4,5 . In recent years, GWAS have yielded an additional small yet diverse set of single-nucleotide polymorphisms (SNPs) that have been associated with alcohol dependence, alcohol consumption, and related traits in a number of ethnic groups [5][6][7][8][9][10][11] . Among these recent findings, β-Klotho (KLB) has been repeatedly associated with alcohol consumption in large population studies of European ancestry 11,12 , although alcohol consumption may have significantly different genetic patterns from alcohol use disorders (AUD) 13 . One ethnic group that is particularly understudied yet has a high prevalence of AUD is American Indians (AI) 1,14 . While several studies provide data to demonstrate that a substantial genetic component for risk for AUD exists in the select AI tribes that have been studied 15 , little is known as to the exact genes and genetic variants that may confer this possibly elevated risk, with the exception of the ADH and ALDH loci 16-18 . There are several reasons for the paucity of findings linking specific variants to AUDs in any ethnic group. One potential reason may be defining the phenotype. One aspect of AUDs that appears to be highly consistent is the clinical course of the disorder 19,20 . The clinical course, as described by Schuckit et al. 20 , consists of the order and progression of 36 alcohol-related life events. These life events have been shown to be highly similar and consistent across many different subgroups and populations, although age of onset and endorsement rates of individual events can differ [21][22][23][24][25][26][27][28][29] . Although this phenotype has been extensively described, it has not yet been utilized as a trait to evaluate the genetics of AUDs.
Another phenotype that has been little explored in genetic studies is substance-induced mood disturbance. Among individuals with moderate, and especially severe AUD, mood disturbances can arise that can sometimes mimic major depressive episodes (MDE). This phenomenon was called "secondary depression" and later called substance-induced MDE 30,31 . More subtle substanceinduced affective symptoms may also occur during a bout of heavy drinking and/or when an individual cuts down on their drinking or during withdrawal 32 . These more subtle symptoms have been suggested by Koob et al. 33 as comprising the "dark side" of addiction 33 . It has been further hypothesized that a "negative emotional state" can arise during heavy alcohol exposure that then acts as negative reinforcement that promotes additional drinking in an attempt to eliminate the affective symptoms 34,35 .
Another potential reason for the paucity of genetic findings in AUD is that, similar to other complex diseases, the variants that have so-far been identified in GWAS for AUD-related traits are primarily common variants that may collectively explain only a small portion of the heritability 36 . There are many theories regarding this "missing heritability" of complex diseases 37 . One class of genetic variation that is largely understudied for AUD is rare variants in the genome. Rare variants have been understudied, in part, due to technological constraints limiting comprehensive whole-genome sequencing (WGS) of population samples and the lack of statistical models that incorporate variables such as family relatedness and ethnic admixture. Genotyping followed by imputation to reference panels is insufficient for studying special high-risk populations, as rare variants are often unique to each population. Recent advances in WGS technologies and analytical methods, however, have made possible the identification of both rare and common variants in studies of novel and admixed populations enriched for substance dependence phenotypes, such as American Indians 17,18,38 .
In the present study, we sought to investigate the genetic basis of two understudied phenotypes: (1) the clinical course of AUD as indexed by alcohol-related life events and (2) alcohol-induced affective symptoms, in two independent cohorts: American Indians (AI) and Euro-Americans (EA). Specifically, we conducted: (1) genomewide association analysis, (2) rare-variant analysis (3) functional and pathway analyses, and (4) tissue-specific gene expression enrichment analysis, using low-coverage whole-genome sequencing data, in order to identify both shared and distinct genetic factors between the two populations.

Participants
Two independent populations were investigated: 742 Native Americans of extended pedigrees from an American Indian cohort (AI), and 1711 primarily Euro-American (EA) participants from the San Francisco Family Alcohol Study (SFFS) (See Table S1 for demographics). We refer to the first cohort as AI and the second as EA or SFFS interchangeably. The population characteristics and the recruitment procedures of the two cohorts have been, respectively, described [39][40][41][42] . The protocol of the study of the American Indian cohort was approved by the Scripps Research Institute Institutional Review Board and Indian Health Council, a tribal review group overseeing health issues for the reservations where recruitments took place. The protocol for collection of participants in the SFFS was approved by the University of California San Francisco (UCSF) Committee for the Protection of the Rights of Human Subject while the recruitment took place. Subsequently, the University of North Carolina, at Chapel Hill IRB approved the data analysis plan. Written informed consent was obtained from each participant after study procedures had been fully explained. Participants were compensated for their time spent in the study.

Phenotypes
The clinical course of AUD as indexed by the 36 alcohol-related life events described by Schuckit et al. 20 in these two populations has been previously studied 22,23,26 . We herein defined a new weighted alcohol-related life events phenotype: a quantitative trait derived from the 36 alcohol-related life events as listed in Table 1. The life events were given a severity weight of 1 for events 1-12, 2 for 13-24 and 3 for 25-36. The order of the events was based on the mean age of occurrence with the first event happening earliest and the last event (36th) occurring latest in a lifetime. The phenotype was defined as the sum of the severity weights of the 36 alcohol-related life events. The resulting alcohol-related life events for AI and SFFS (EA) cohorts are characterized in Table S2. Of the 1711 individuals in SFFS, 1702 had valid values for this trait. This newly derived phenotype is correlated with DSM5 AUD diagnoses with correlation coefficients ranging from 0.62 to 0.78 and 0.81 for mild to moderate and severe AUDs respectively, in the AIs (Table S3).
Two symptoms, which were assessed across both samples, were used as an index of substance-induced affective states or "dark side symptoms". The first was a measure of withdrawal that queried whether participants ever felt anxious or depressed when they stopped or cut down on drinking. The second measure queried whether participants' drinking had ever caused them to feel depressed or uninterested in things for more than 24 h and to the point that it interfered with their functioning. Both phenotypes are dichotomous. The distributions of the two "dark side" phenotypes are also listed in Table S2. These symptoms are rare in moderate and mild AUD and common in severe AUD 32 .

Whole-genome sequencing and association analysis
The same methods and pipeline were used to conduct low-coverage whole-genome sequencing (LCWGS) on blood-derived DNA for AI and SFFS cohorts, and has been previously published 38 .
A linear mixed model approach as implemented in EMMAX 43 was used in the whole-genome association analysis, to control for population structures (as AI cohort is primarily admixed) 17,44 as well as familial relatedness. The association for each variant was conditioned on a kinship matrix that was estimated from the genotypes, in order to capture a wide range of sample structures. We further included sex, age, and age-squared as covariates in all association analyses. The significance of associations was corrected for multiple traits using the effective independent number of traits (m eff ) 45 . For the three traits in the present study, m eff = 2.369 and 2.105 for the AI and SFFS cohorts, respectively. We used a p-value of 5 × 10 -8 as the genome-wide significant threshold and 5 × 10 -7 as the threshold for suggestive significance.
We additionally performed a gene-based test using fastBAT 46 . For each gene, all variants in the range of ± 50 Kb of the gene and of MAF ≥ 1% were included. The pvalues were corrected for m eff . The number of genes (N) was 24,690 and 24,681 for AI and SFFS, respectively. Thus, the significant threshold for p-value was set at 0.05/ N = 2.0 × 10 -6 , and a suggestive threshold at 2.0 × 10 -5 .

Gene-based low-frequency and rare variants association analysis
A linear mixed model-based combined multivariate and collapsing method 47 as implemented in EMMAX was used to collectively analyze the variants having lower than 5% minor allele frequency (MAF). We grouped the lowfrequency (1% ≤ MAF < 5%) and rare variants (MAF < 1%) by genes. For each gene, we formed two groups. One group considered all variants on exons, 5′-and 3′-UTRs, The order of the events was based on the mean age of the occurrence with the first event happening earliest and the last event (36th) occurring latest in a lifetime upstream and downstream of the gene (denoted as Exon + Reg). The other included only the nonsynonymous and the splicing-site variants (denoted as Nonsyn). Intergenic variants were not considered in the present study. For each variant group type, a gene was excluded if fewer than three markers were found, or if <1% of the samples had any such markers on the gene. The p-values were corrected for m eff . The significant thresholds for corrected pvalues were set at 0.05/(N Exon+Reg + N Nonsyn ) for each trait and cohort, where N Exon + Reg is the number of genes in the group Exon + Reg and N Nonsyn in the group Nonsyn. Note that correcting for the sum of the numbers of genes in the two groups is likely an overcorrection as two groups of variants are correlated.

Functional and pathway analyses
Top variants from our association analyses were tested against the brain-specific cis-eQTL database BRAI-NEAC 48 . Polyphen-2 was used to predict whether nonsynonymous variants might be potentially damaging 49 . The variants with p-values < 10 -5 from each GWAS were annotated with genes using SGAdviser 50 ; the associated set of genes were then subjected to functional analyses.
We used GeneMANIA 51 to extract potential functional networks, and DAVID 6.8 52 for disease enrichment analysis.

Tissue-specific gene expression enrichment analysis
We obtained the median tissue-specific gene expression data from The Genotype-Tissue Expression (GTEx) Project release V7 (at GTEx Portal) for the sets of genes associated with variants that had a p-value < 10 -5 in the GWAS of each trait and cohort. For each gene, its expression profile across tissues was standardized. For each tissue, we then counted the number of genes in each gene set that had expression levels over zscore of 2 (representing the most expressed tissues by the gene). If this gene count was significantly higher than expected, the tissue was considered enriched with respect to tissue-specific expressions for the gene set. The significance was determined through permutation tests.

Data and code availability
The SFFS dataset has been deposited in dbGaP (accession: phs001458.v1.p1). In accordance with the wishes of

GWAS for the American Indian cohort
All variants that were found associated with any of the three alcohol-related traits at over a suggestive significant level (p < 5 × 10 -7 ) in the AI cohort had <5% allele frequency (see Table 2, S5, and Fig. S1). Variant rs200577368, downstream of gene NAF1 and 658Kbp upstream of FSTL5 (Fig. S3A), was found significantly associated with alcohol-related life events (p = 6.35 × 10 -9 , see Fig. S2A). Variant rs79833306 downstream of DMRTA1 was also associated with alcohol-related life events (p = 5.14 × 10 -8 ). Six additional variants were associated with the phenotype at suggestive significant levels ( Table 2, S5, and S7), including one SNP located 31Kbp upstream of PCCA and 71Kbp downstream of ZIC2, and two upstream of gene KCTD3 and downstream of KCNK2, both potassium channel genes (Fig. 1). Additionally, gene-based tests using the fastBAT statistic identified only a single gene, MME (a.k.a NEP, CD10), to be suggestively associated with alcohol-related life events (p = 1.47 × 10 -5 ).
Variant rs150351153, located in an intronic region of PRKG2, was significantly associated with affective symptoms during withdrawal in the AI cohort (p = 9.75 × 10 -9 , Figs. S2B and S5), and nine others were associated at suggestive significant levels. No variant was significantly associated with alcohol-induced depression. However, ten variants were associated at suggestive significant levels (Fig. S2C).

GWAS for the European American cohort
No single variant remained genome-wide significant after correcting for the number of traits (Tables 2 and S6, Fig. S2). The top association for alcohol-related life events was rs11100375, a common (MAF = 40%) intronic variant on gene FSTL5, at a suggestive significant level (p = 4.82 × 10 -7 , Fig. S3B). Note that the top variant rs200577368 associated with the same trait in the AI cohort resides 1.2 Mb upstream of rs11100375 in the intergenic region between FSTL5 and NAF1. Although it's in linkage disequilibrium (LD) with many variants in the region as reflected by high D' (Fig. S3A), there was no clear evidence of LD between rs200577368 and rs11100375. FSTL5, a follistatin-like five gene, was most Gene-based rare variants test also showed that KCNK2 was associated with alcohol-related life events and affective symptoms when cutting down or during withdrawal. These two single nucleotide polymorphisms (SNPs) are in high LD and also in LD with variants on and near the two nearby genes. They are both near gene activation sites, and located in regions methylated in frontal cortex. Additionally, rs72739250 is located in a CpG island, together suggesting the potential regulatory roles of these two variants 81 highly expressed in the cerebellar hemisphere. Rs11100375 found cis-eQTL for FSTL5 in the hippocampus, thalamus and substantia nigra regions (FDR = 0.022-0.035).
Sequence of LINC02347 were also partially mapped to transcripts of FAM32A, CHIA, ROS1, NIPA2, and TAP2. Additionally, top SNPs in high LD with rs4309206 were found to be brain cis-eQTLs for LOC283435, LOC400084, and TMEM132B (777Kbp downstream), which are located near LINC02347 (Fig. S4). For TMEM132B, the top eQTLs in brain were rs10847158 (the 2nd most significant SNP) for expression in frontal cortex (FDR = 0.0042), and rs4765395 for expression in white matter (FDR = 0.0023), occipital cortex (FDR = 0.0068), and substantia nigra (FDR = 0.033). All variants that were associated with one of the three AUD-related traits at over a suggestive significant level in the SFFS cohort were common variants (see Tables 2 and S6) and represented cis-eQTLs for the related genes in brain regions.

Gene-based rare-variant analysis for the American Indians
Rare and less-frequent (MAF < 5%) exonic and regulatory (upstream or downstream) variants of a potassium channel gene, KCNK2, were significantly associated with alcohol-related life events (p = 7.74 × 10 -7 ) and suggestively associated with alcohol-induced affective symptoms during withdrawal (p = 6.32 × 10 -6 ) in the AI cohort ( Table 3). The rare coding variants in KCNK2 were also associated with alcohol-induced affective symptoms during withdrawal in the SFFS cohort at p = 0.043 ( Table 3). Note that the two variants (MAF = 1.4%) upstream of KCTD3 that were suggestively associated with alcoholrelated life events in AI are also downstream of KCNK2 (see Fig. 1). KCNK2 was most highly expressed in fibroblasts and adrenal gland, thyroid and several brain regions. KCTD3 was more ubiquitously expressed and highly expressed in the adrenal gland.
Rare coding variants in EBI3 (a.k.a IL27B) were significantly associated with alcohol-induced affective symptoms during withdrawal (p = 1.67 × 10 -7 ), followed by KCNK2. Only weak support was found for EBI3 in the Euro-American cohort (nonsynonymous variants in EBI3 were associated with alcohol-induced depression in SFFS with nominal p = 0.048). EBI3 was most highly expressed in lymphocytes and spleen. Rare, nonsynonymous variants on DSG1 were suggestively associated with affective symptoms during withdrawal, followed by SLC39A13. DSG1 was most highly expressed in skin, vagina, and esophagus. Although below the suggestive significant threshold in this rare-variant gene test, SLC39A13 has recently been identified as a novel locus for alcohol use disorder identification test (AUDIT) total score in the UK Biobank 13 . Rare variants in an lncRNA RP11-94B19.6 were suggestively associated with alcohol-induced depression, followed by rare coding variants in TICAM1 (that was most highly expressed in esophagus) and nonsynonymous variants in ZNF644.

Gene-based rare-variant analysis for the Euro-Americans
PDE4C was suggestively associated with alcohol-related life events (p = 1.44 × 10 -6 ) using the rare nonsynonymous or splice-site variants (Table 3). This finding was replicated in the AI cohort (p = 2.93 x 10 -3 ). Interestingly, although not unexpectedly given that these were rare variants, between the 10 and 9 rare nonsynonymous or splice-site variants, respectively, found in PDE4C in the AI and SFFS cohorts, only one variant, rs182916479, at a splice-site was shared across cohorts (0.2% MAF in both cohorts). Of the ten variants in the AI cohort, Polyphen-2 predicted one variant possibly damaging (probability = 0.904) and three probably damaging (prob. = 0.992-0.999). Of the nine variants in the SFFS cohort, the prediction was one possibly damaging (prob. = 0.736) and four probably damaging (prob. = 0.986-1).
IZUMO4 was the top gene for which rare and lowfrequency nonsynonymous variants were associated with alcohol-induced affective symptoms during withdrawal. IZUMO4 was primarily highly expressed in testis. Rare coding and regulatory variants on COX19 were the top associations for alcohol-induced depression. COX19 was expressed in many tissues and the most highly expressed in the adrenal gland. None passed suggestive significant threshold after multiple test correction. Table S7 summarizes relevant functional details of all genes and variants that were significantly or suggestively associated with one of the traits in either cohort.

Functional analysis and tissue-specific gene expression analysis
The top functional group for alcohol-related life events for the AI cohort was potassium ion transport (FDR = 0.015) (Table S8), while the top functional group for alcohol-induced affective symptoms during withdrawal was with arachidonic acid metabolic process (FDR = 0.068). The top functional groups for alcohol-related life events for the SFFS cohort (Table S9) were regulation of Rac protein signal transduction and regulation of Rac GTPase activity (FDR = 0.11). The top groups for alcohol-induced affective symptoms during withdrawal were response to virus (FDR = 2.64 × 10 -5 ) and cellular response to type I interferon (FDR = 7.7 × 10 -4 ). No functional group was found significant for alcoholinduced depression for either cohort. The enriched diseases for each trait are listed in Tables S10 and S11 for AI and SFFS, respectively.
The most enriched tissues with respect to tissue-specific gene expression by the top associated genes were adrenal gland and visceral adipose for alcohol-related life events in AI (see Fig. 2 and Table S12). Esophagus tissues were the most enriched for alcohol-induced affective symptoms during withdrawal, while nucleus accumbens was the most enriched for alcohol-induced depression in AI. In Fig. 2 Significance of tissue-specific gene expression enrichment by the top genes associated with one of the three AUD-related traits in the GWAS. Median gene expression by tissues from GTEx V7 was used. Only the most expressed tissues (z-score ≥ 2) by a gene were included. Gene sets include those with SNPs that were associated with alcohol-related life events or alcohol-induced affective symptoms in the AI or EA cohort at pvalue < 10 -5 . Labels: x-axis: cohort and trait (Life Events: alcohol-related life events; Depression: 24 h depression when drinking; Anxiousness: affective sympotoms when cutting down or during withdrawal); y-axis: GTEx tissue name. The color scale from white to dark-blue corresponds to -log 10 (p) = 0-3. Tissues with enrichment p-value < 0.1 for all gene sets are omitted. AI American Indians, EA European Americans, SNPs single-nucleotide polymorphisms, GWAS genome-wide association studies contrast, the most enriched tissues for alcohol-related life events in SFFS were mostly brain tissues including cortex, PFC, anterior cingulate cortex, and nucleus accumbens ( Fig. 2 and Table S13). Sigmoid colon was the most enriched for alcohol-induced affective symptoms during withdrawal, followed by the tibial artery and visceral adipose. PFC, cortex, skeletal muscle, and caudate were most enriched for alcohol-induced depression in SFFS.

Discussion
The present study utilized low-coverage whole-genome sequence data to identify potential variants and pathways underlying two types of phenotypes associated with severe AUD: the severity of the clinical course of AUD and alcohol-induced affective symptoms, in an American Indian and a Euro-American populations.
Converging evidence from AI and EA suggested two new loci for alcohol-related life events and affective symptoms Rare variants in a K 2P channel gene KCNK2 were associated with the alcohol-related life events and alcoholinduced affective symptoms during withdrawal in the American Indians. The latter also found some supporting evidence in the Euro-Americans. KCNK2, the most studied K 2P channel, has been found to play a key role in the cellular mechanisms of neuroprotection, anesthesia, painsensing, and depression (see review 53 ). It has been shown that Kcnk2-knockout mice have increased efficacy of serotonin neurotransmission and are resistant to depression; they also exhibit substantially reduced elevation of corticosterone levels under stress 54 . It has also been shown in humans that KCNK2 might be related to susceptibility to major depressive disorder (MDD) and involved in antidepressant treatment response 55 . Associated pathways include potassium channels and neuronal system. The Collaborative Study on the Genetics of Alcoholism (COGA) also identified a potassium channel gene KCNJ6 to be associated with endophenotypes of AUD 56 . Notably, KCNK2 channels can be opened by neuroprotective agents and anesthetics, and inhibited by clinical doses of antidepressant drugs, making it a potential pharmacological target 53 .
Nonsynonymous rare variants in a pro-inflammatory mediator gene, PDE4C, were associated with alcoholrelated life events in the Euro-American cohort. The association was supported in the AI, although the identified rare variants were largely different. PDE4C-encoded protein belongs to the cyclic nucleotide phosphodiesterase (PDE) family and is one of the four PDE4 iso-enzymes, which are the most prevalent PDE in immune cells. The PDE4 inhibitors have long been recognized as antiinflammatory agents 57 . Preclinical research has found that PDE4 inhibitors reduced ethanol consumption and preference in rodent models by increasing cAMP, thus reducing inflammatory signaling [58][59][60] . This is consistent with the hypothesis linking neuroimmune signaling with alcohol consumption and dependence 61,62 . Pathways associated with gene PDE4C include G protein and GPCR signaling, cAMP signaling, morphine addiction, and opioid signaling 63 . Genes of the same PDE family have previously been implicated in alcohol use. For instance, the COGA has identified a SNP near PDE11A to be associated with alcohol dependence 6 . PDE4B has been found to be associated with alcohol consumption in the UK Biobank 11 .
Additionally, a low-frequency variant between NAF1 and FSTL5 and common variants in FSTL5 were, respectively, the top variants associated with the alcoholrelated life events in the American Indian and the Euro-American cohorts (Fig. S3). Although potentially an interesting locus, there was no clear evidence of LD between the top variants in the two cohorts. Functions of FSTL5 include calcium ion binding and protein binding. Variants near the gene have been associated with alcohol dependence 64 , response to amphetamines 65 , interferon-γ induced monokine 66 , and paliperidone response in schizophrenia 67 .
Loci uniquely associated with alcohol-related life events and affective symptoms in the American Indian cohort A variant in PRKG2 and rare variants in an interleukin subunit gene EBI3 (IL-27B) were identified for affective symptoms during withdrawal in the AI. PRKG2 has been associated with obesity traits in a number of ethnic groups 68,69 . The gene has also been associated with EEG alpha power in the COGA cohort 70 . The variant in PRKG2 is in moderate LD with a few variants in or near RAS-GEF1B (see Fig. S5), a regulator of ICAM-1 in the TLR4/ LPS signal transduction pathway involved in proinflammatory cytokines release to activate immune response 71 . This gene is ubiquitously expressed in many tissues and has been implicated in MDD 72 . EBI3 is a subunit of the composite cytokines IL-27 and IL-35. It is involved in IL-27-mediated signaling and cytokine signaling in the immune system and plays a role in cell-mediated immune response. It can also promote pro-inflammatory IL-6 functions by mediating trans-signaling 73 .
A novel long non-coding RNA was uniquely associated with alcohol-induced depression in the European American cohort An lncRNA and a gene in chromosome segment 12q24.32 are of particular relevance to alcohol use phenotypes. There has been evidence suggesting that aberrant methylation of LINC02347 was associated with MDD in European populations 74 . Evidence has also shown that a hemizygous interstitial deletion at chromosome 12q24.31-q24.33 caused multiple dysmorphic features and developmental delay 75 . SNPs in this locus have been shown to act as brain cis-eQTLs for TMEM132B, which encodes a transmembrane protein in the TMEM132 gene family whose members have been implicated in brain development 76 , panic/anxiety 77 , bipolar disorder 78 , and insomnia 79 . The gene was most highly expressed in tibial nerve and many brain regions, followed by testis and thyroid. TMEM132B has been associated with excessive daytime sleepiness (EDS) with BMI adjustment 79 , for which depression was suggested as the most significant risk factor 80 .
In summary, this study presents the first genome-wide analysis of an AUD clinical course severity phenotype and alcohol-induced affective symptoms ("dark side" traits) in two independent populations: American Indians and Euro-Americans. We have identified several novel loci containing rare variants. Many associated genes show increased expression in brain regions, adrenal gland, and digestive track, confirming the importance of neuronal, stress, immune, and metabolic systems in AUDs. However, certain limitations should be considered when making inferences from these findings. At the moderate sample sizes of 742 and 1711 of the AI and EA cohorts, respectively, we had limited statistical power to detect genome-wide significant associations (see Fig. S6 for power calculations for the study). Further, given the uniqueness of our American Indian sample, there are presently no replication samples available for the LCWGS study in AI, although there was corroborative evidence between the AI and the EA cohorts (two independent populations) to support two of the top rare-variant gene findings. Although all of the top GWAS variants from the EA cohort were found to be cis-eQTLs for certain brain regions, there was no eQTL information available in the public domain for any of the AI top variants, likely because they were all low-frequency variants. The fact that low-frequency variants predominated our findings, especially for the AI cohort, suggests that rare and lessfrequent variants may play important roles in complex diseases such as AUD, especially in unique high-risk populations such as American Indians.