INTRODUCTION

Formal genetic studies show that the heritability of alcohol dependence (AD) is around 40–60% (Prescott and Kendler, 1999; Sullivan et al, 2012). Candidate gene studies and genome-wide association studies (GWAS) have suggested numerous candidate genes for AD and alcohol consumption. The most consistent associations have been reported for: (i) chromosome 4q22/4q23 in/near the alcohol dehydrogenase (ADH) gene cluster; and (ii) chromosome 12q24, a region which harbors the aldehyde dehydrogenase 2 (ALDH2) gene. Whereas candidate studies rely on prior knowledge, GWAS allow systematic screening of the whole genome with hundreds of thousands to millions of genetic variants, thus facilitating the identification of genes in novel biological contexts. Owing to correction for multiple testing, the probability of identifying genome-wide significant associations hinges on the sample size as well as on the degree of heritability and heterogeneity (Sullivan et al, 2012). Genome-wide significant association findings for AD have been detected: (i) in the chr4q22/4q23 region in/near the ADH genes, PDLIM5, METAP1, and LOC100507053 (Frank et al, 2012; Gelernter et al, 2013a; Park et al, 2013); (ii) at chr1p35 in SERINC2 (Zuo et al, 2013); (iii) at chr2p16 near MTIF2, CCDC88A, and PRORSD1P (Gelernter et al, 2013a); (iv) at chr2q21 near DARS and CXCR4 (Gelernter et al, 2013a); (v) at chr2q35 near PECR (Treutlein et al, 2009); (vi) at chr5p15 (Gelernter et al, 2013a); (vii) at chr9p13 (Gelernter et al, 2013a); (viii) at chr12q24 in ALDH2 (Quillen et al, 2014); (ix) at chr13q32 in NALCN (Wetherill et al, 2013); and (x) at chr19p13 in LOC100131094 and DPP9 (Gelernter et al, 2013a). GWAS of alcohol consumption have identified significant variants at: (i) chr12q24 in/near ALDH2, C12orf51, CCDC63, MYL2, OAS3, CUX2 (alias CUTL2), and BRAP (Baik et al, 2011; Takeuchi et al, 2011); (ii) chr3p24 near SGOL1 (Pan et al, 2013); (iii) chr4q23 in ADH1B (McKay et al, 2011); and (iv) chr7q11 in AUTS2 (Schumann et al, 2011). With the exception of the ADH and ALDH2 genes, none of the variants identified for AD to date exert strong effects, which is consistent with the hypothesis of a polygenic contribution.

In accordance with the latter hypothesis, polygenic score analyses of European GWAS data have shown that many common variants—most of which do not exceed the threshold for genome-wide significance after correction for multiple testing—contribute to the risk of AD (Frank et al, 2012). It can therefore be assumed that many risk factors at the level of individual genes still await identification. A promising approach that circumvents extreme correction for multiple testing is to analyze the aggregated contribution of variants in functionally related gene groups under the assumption that these gene groups contain a large number of variants with a disruptive influence on gene function. Specific gene groups can be defined by searching for specific gene-networks (eg grouping genes whose protein products interact (Jia et al, 2011)), or by relying on a priori information concerning gene-pathways. Recent studies of such system-level approaches have identified new genes for alcohol- (Han et al, 2013), cocaine- (Gelernter et al, 2013c), and opioid dependence (Gelernter et al, 2013b).

The aim of the present study was to identify as yet unknown genes with an involvement in AD through a gene-set-based analysis of data from a previous GWAS of AD in the German population (Frank et al, 2012), using the Global Test method (Deelen et al, 2013). The most convincing finding, XRCC5, was followed up in a functional genetic analysis of its homolog in the fruit fly Drosophila melanogaster. The phenotype alcohol sensitivity (Scholz, 2009) was investigated, as this Drosophila phenotype corresponds to the human phenotype ‘level of response’. The level of response to alcohol refers to the psychological and physiological response of an individual to an alcohol challenge. A previous study reported an association between a low level of response to alcohol in humans and an increased risk for AD (Schuckit, 1994). As low responders are likely to consume more alcohol in order to experience the desired pleasurable feelings, we also performed a follow-up human genetic association study of XRCC5 and alcohol self-administration under free access to intravenous ethanol in a laboratory setting (Zimmermann et al, 2009).

MATERIALS AND METHODS

Genome-Wide Association Study

Ethics statement

All participants provided written informed consent following a detailed explanation of the study, and all data were anonymized prior to analysis. The study was approved by the ethics committees of the German Universities of Heidelberg, Bonn, Dresden, Düsseldorf, Essen, Cologne, Mainz, Munich, and Regensburg. All research procedures were conducted in accordance with the Declaration of Helsinki.

GWAS and replication study

The case and control samples from the German GWAS, and the COGA and SAGE GWAS samples used to replicate significant gene-sets, are described in detail elsewhere (Frank et al, 2012; Edenberg et al, 2010; Bierut et al, 2010; for details, see Supplementary Materials and Methods).

GWAS data set

Briefly, genome-wide data for 1333 cases and 2168 controls were available for analysis after stringent quality control. The quality control criteria were a sample call rate (CR) of ≥0.98, and conformity between reported sex and genotypic sex. In the case of duplicates or cryptic relatedness (identity by state across autosomal markers >1.6), the sample with the lower CR was removed. Outliers were identified using principal component analysis and removed. Only those variants that were present on all of the applied genotyping platforms were included in the analysis. To remain in the data set, single nucleotide polymorphisms (SNPs) had to meet the following criteria: CR ≥0.98; minor allele frequency ≥0.01; conformity with Hardy–Weinberg equilibrium (HWE; P≥1E-6) in any subsample; HWE conformity across all samples; and no significant difference in allele frequency between subsamples for controls (Frank et al, 2012). Owing to German regulations concerning informed consent and data protection, the data cannot be made available via publicly accessible databases. However, the data have been made available to the Psychiatric Genomics Consortium on AD, which is being established at the time of writing (http://pgc.unc.edu).
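The SNP-level criteria above can be sketched as a simple filter. This is a minimal illustration and not the pipeline used in the study: the field names (call_rate, maf, hwe_p) and the example records are hypothetical, and the direction of each comparison (keep a SNP at or above each threshold) is an assumption.

```python
from dataclasses import dataclass

# Thresholds taken from the text; the comparison direction is an assumption.
MIN_CALL_RATE = 0.98
MIN_MAF = 0.01
MIN_HWE_P = 1e-6


@dataclass
class SnpStats:
    """Per-SNP summary statistics (hypothetical field names)."""
    snp_id: str
    call_rate: float
    maf: float
    hwe_p: float


def passes_qc(snp: SnpStats) -> bool:
    """Return True if the SNP meets all three illustrative QC criteria."""
    return (
        snp.call_rate >= MIN_CALL_RATE
        and snp.maf >= MIN_MAF
        and snp.hwe_p >= MIN_HWE_P
    )


# Invented example records, one per failure mode.
snps = [
    SnpStats("snp_a", 0.995, 0.21, 0.40),   # passes all criteria
    SnpStats("snp_b", 0.970, 0.30, 0.55),   # fails the call rate criterion
    SnpStats("snp_c", 0.990, 0.004, 0.80),  # fails the minor allele frequency criterion
    SnpStats("snp_d", 0.999, 0.15, 1e-9),   # fails the HWE criterion
]
kept = [s.snp_id for s in snps if passes_qc(s)]
print(kept)
```

The remaining criteria in the text (HWE conformity across all samples and allele frequency differences between control subsamples) would need per-subsample statistics and are not modeled here.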

Gene-set-based analysis

Gene-set-based analysis was performed using these GWAS data. The association model used by Frank et al (2012) was applied. Briefly, logistic regression was performed, using AD diagnosis as the response variable and the number of minor alleles at the considered loci as predictor variables. The principal component analysis in Frank et al (2012) revealed strong ethnic homogeneity across the sample. Therefore, no correction for population stratification was performed in the present analyses. Gene-set descriptions were retrieved from the following gene-set collections: Kyoto Encyclopedia of Genes and Genomes (dbKEGG, http://www.genome.jp/kegg/, R package KEGG.db version 2.5.0); Reactome (dbRC, http://www.reactome.org/, R package reactome.db version 1.44.0); Gene Ontology (dbGO, www.geneontology.org, R package GO.db version 2.5.0); Biocarta (dbBC); microRNA targets (dbMIR); transcription factor targets (dbTFT); and positional gene-sets (dbPOS). The latter four gene-set collections were retrieved via MSigDB (http://www.broadinstitute.org/gsea/msigdb/, version 3.0).

Gene-sets with a minimum of 5 and a maximum of 200 genes were retained. To reduce multicollinearity in the GWAS data, a pruned SNP set was obtained using a variance inflation factor (VIF=10) implemented in PLINK (http://pngu.mgh.harvard.edu/~purcell/plink/). Linkage disequilibrium (LD)-based pruning was applied to the raw data using the complete SNP set. After LD-based pruning, 100471 SNPs (from a total of 462775) were considered for the gene-set analysis. To account for important regulatory regions, SNPs were assigned to a gene if the variant was located within the genomic sequence or within 20 kb of the 5′ and 3′ ends of the first and last exons (Veyrieras et al, 2008). If a SNP was within a region shared by more than one gene, the SNP was assigned to all of the respective genes. These SNPs were analyzed within the context of 10367 gene-sets (dbBC 217; dbCGP 1817; dbGO 6427; dbKEGG 215; dbMIR 177; dbPOS 288; dbRC 770; dbTFT 456).
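The SNP-to-gene assignment and the gene-set size filter described above can be sketched as follows. This is an illustrative simplification under stated assumptions: gene boundaries are represented by a single start and end coordinate (standing in for the ends of the first and last exons), and all function names, gene names, and coordinates are invented.

```python
FLANK_BP = 20_000  # 20 kb window on either side of the gene, as in the text


def assign_snps_to_genes(snps, genes, flank=FLANK_BP):
    """Map each gene to the SNPs lying within the gene or its flanks.

    snps:  dict snp_id -> (chromosome, position)
    genes: dict gene -> (chromosome, start, end)
    A SNP in a region shared by several genes is assigned to all of them.
    """
    assignment = {gene: [] for gene in genes}
    for snp_id, (snp_chr, pos) in snps.items():
        for gene, (gene_chr, start, end) in genes.items():
            if snp_chr == gene_chr and start - flank <= pos <= end + flank:
                assignment[gene].append(snp_id)
    return assignment


def filter_gene_sets(gene_sets, min_genes=5, max_genes=200):
    """Keep gene-sets whose size lies within [min_genes, max_genes]."""
    return {
        name: members
        for name, members in gene_sets.items()
        if min_genes <= len(members) <= max_genes
    }


# Toy coordinates, invented for illustration only.
genes = {"GENE_A": ("2", 100_000, 150_000), "GENE_B": ("2", 160_000, 200_000)}
snps = {
    "snp_1": ("2", 85_000),   # within 20 kb upstream of GENE_A only
    "snp_2": ("2", 155_000),  # in the flanking windows of both genes
    "snp_3": ("2", 300_000),  # outside every window
}
result = assign_snps_to_genes(snps, genes)
print(result)

toy_sets = {"too_small": ["g1", "g2"], "ok": ["g1", "g2", "g3", "g4", "g5"]}
retained = filter_gene_sets(toy_sets)
print(sorted(retained))
```

The nested loop is quadratic and is only meant to make the rule explicit; an interval index would be preferable for several hundred thousand SNPs.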

Global Test

For the gene-set-based analysis, R package globaltest version 5.12.0 was applied. Details of the Global Test are provided elsewhere (Goeman et al, 2004; Juraeva et al, 2014; for details, see Supplementary Materials and Methods). The contribution of each SNP to the gene-set association score was calculated using the component Global Test. SNPs with a component Global Test P-value of <1E-3 are reported as ‘top SNPs’. A SNP was classified as a significant contributor to a gene-set association score if its component Global Test P-value was <5E-2. A complete list of SNPs contributing to a gene-set association score is provided in Supplementary Table S1. For the top genes in the merged COGA/SAGE replication data set, single SNP analysis and gene-wide analysis were performed. The gene-wide analysis was performed using VEGAS (http://gump.qimr.edu.au/VEGAS/).

Drosophila Study

The most promising finding, ie XRCC5, was subjected to gene-targeted functional genetic analysis in an invertebrate model of AD (Scholz, 2009). Homologs of XRCC5 exist across a wide range of organisms (Altenhoff et al, 2013; Li et al, 2006). Initial sensitivity to alcohol was used as the phenotype of interest in determining genetic factors that might influence the development of AD. To determine whether altered Ku80 function interferes with ethanol-induced behaviors, flies with altered Ku80 function were tested for changes in ethanol sensitivity using an inebriometer, an assay that measures the effect of ethanol intoxication on postural control (Hoffmann and Cohan, 1987). Briefly, the assay consists of an ethanol-vapor-filled column with a length of 1.22 m. A population of 3–5-day-old flies (n=100) is inserted into the top of the device. Over time, the flies lose their balance and fall through the column. At the bottom, the flies are counted as they pass through a light beam. The time required for a population to elute from the column is defined as the mean elution time, and is around 20 min for a control population. To reduce Ku80 function, an RNAi hairpin construct of the Ku80 gene (dsKu80JF02790) was expressed under the control of the UAS/GAL4 system. These experiments involved a population of female flies in which a UAS-Ku80-RNAi transgene was expressed under the control of the neuronal GAL4 driver Appl-GAL4. The latter normally drives pan-neural transgene expression. To control for the putative effects of the transposon insertion sites and the putative effects of the transgenes alone, heterozygous flies carrying only one copy of the respective transgene were used as controls. To reduce the influence of modifiers in the genetic background, the transgenic lines were backcrossed for five generations using the w1118 stock of the Scholz laboratory. All experiments with D. melanogaster were performed in accordance with relevant guidelines and regulations of the German Research Foundation (DFG).
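As an illustration of how a mean elution time could be derived from the light-beam counts, the sketch below computes a count-weighted mean. The text does not specify the exact calculation, so the weighting scheme is an assumption, and the function name and counts are invented.

```python
def mean_elution_time(counts_by_minute):
    """Count-weighted mean of elution times.

    counts_by_minute: dict mapping elution time (min) -> number of flies
    counted at the bottom of the column at that time.
    """
    total_flies = sum(counts_by_minute.values())
    if total_flies == 0:
        raise ValueError("no flies were counted")
    weighted_sum = sum(t * n for t, n in counts_by_minute.items())
    return weighted_sum / total_flies


# Invented counts for one population of 100 flies.
counts = {10: 5, 15: 20, 20: 40, 25: 25, 30: 10}
met = mean_elution_time(counts)
print(f"{met:.2f} min")
```

A longer mean elution time corresponds to flies that keep postural control for longer, ie to reduced ethanol sensitivity.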

Investigation of Association between Free Access Alcohol Self-Administration and the XRCC5 Marker rs828701 in Social Drinkers

Subjects

Data sets were available for 85 healthy non-alcohol-dependent social drinkers (n=49 male, n=36 female; mean age: 18.39 ± 0.49 years), all of whom were of German descent. These individuals had been identified from the citizen registry of the German city of Dresden. The majority (76%) were living with their parents, and 77% were in their 13th year of education and planning to attend university. They were phenotyped using a previously described experimental paradigm involving free access to the intravenous self-infusion of ethanol (Zimmermann et al, 2009).

Genotyping of XRCC5

Of the four top SNPs in XRCC5, only rs828701, ie the marker with the most significant P-value in this gene, was genotyped. This approach was chosen because the four top SNPs were intronic, did not constitute expression quantitative trait loci (eQTLs), and had no other reported functional effects. Genotyping was performed on an Applied Biosystems 7900HT Fast Real-Time PCR System (assay ID: C___8839929_10), in accordance with the manufacturer’s instructions (Applied Biosystems, Darmstadt, Germany). An ANOVA (function Anova, package car in R) was used to test the effects of rs828701 on maximum achieved blood alcohol concentration (BAC).
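The logic of such a test can be sketched with a plain one-way ANOVA across genotype groups. This is not the car implementation used in the study; the function name is hypothetical and the values are invented, in arbitrary units.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of measurements per group.
    """
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the individual values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented values in arbitrary units, one list per genotype group.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F({df_between},{df_within}) = {f_stat:.2f}")
```

With three genotype groups and 85 participants, the same formulas give 2 and 82 degrees of freedom.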

RESULTS

Gene-Set Association Analysis

Nineteen gene-sets with a false discovery rate (FDR) of <5E-2 were identified (Table 1, Supplementary Table S1). FDR values ranged from 9.01E-3 to 4.62E-2. Relationships between gene-sets, as revealed by SNPs with an individual SNP P-value of <5E-2, are shown in Supplementary Figure S1. The 19 gene-sets with FDR <5E-2 included five sets that characterize processes of DNA integrity and DNA repair, ie ‘telomeres telomerase cellular aging and immortality’ (set 1); ‘provirus integration’ (set 2); ‘nonhomologous end joining complex’ (set 4); ‘DNA integration’ (set 5); and ‘nonhomologous end-joining’ (set 11). The 19 gene-sets also included: the ‘GTPase inhibitor activity’ set containing gene products that prevent enzymatic hydrolysis of guanosine triphosphate (set 3); the ‘peroxisome’ set containing gene products that have a role in lipid homeostasis and redox reactions of this organelle (set 6); the ‘focal adhesion’ set containing gene products that form specialized structures at contact points between the cell and the extracellular matrix (set 7); the ‘leukocyte transendothelial migration’ set containing gene products of relevance to the migration of leukocytes between blood and tissue (set 8); the ‘glycolysis/gluconeogenesis’ set, which encodes enzymes for the synthesis of pyruvate from glucose and for the formation of glucose from noncarbohydrate precursors (set 9); the ‘lysosome’ set containing the protein configuration of this digestive compartment (set 10); the ‘biosynthesis of unsaturated fatty acids’ set containing the enzymes required for formation of these compounds (set 12); the ‘mir-377’ set containing genes that share the microRNA binding motif ‘tgtgtga’ in the 3′-untranslated region (set 13); the four positional sets ‘chr12q12’ (set 14), ‘chr5q21’ (set 15), ‘chr2q35’ (set 16), and ‘chr7q32’ (set 17), which contain the genes of the respective cytogenetic bands; and the two transcription factor target sets ‘srf’ (set 18) and ‘foxj2’ (set 19), which contain genes that share the respective transcription factor binding site.

Table 1 Associated Gene-Sets of the German GWAS and Replication

Association with four of these gene-sets, ie ‘peroxisome’ (set 6); ‘glycolysis/gluconeogenesis’ (set 9); ‘biosynthesis of unsaturated fatty acids’ (set 12); and ‘chr2q35’ (set 16), was driven by the following three highly significant association findings (P<1E-5): rs1789891 between ADH1B and ADH1C (set 9); rs11499823 near ADH1C (set 9); and rs1344694 near PECR (sets 6, 12, and 16). The remaining 15 gene-sets were defined exclusively by variants with less significant P-values (Table 1).

The 19 gene-sets contained 38 contributory genes in the most significant category (P<1E-3) (Table 1). The gene XRCC5 was present in six of the 19 associated gene-sets (Table 1). These gene-sets were: ‘telomeres telomerase cellular aging and immortality’; ‘nonhomologous end joining’; ‘DNA integration’; ‘provirus integration’; ‘nonhomologous end joining complex’; and the positional gene-set ‘chr2q35’. Nine variants at the XRCC5 locus contributed with nominal significance to the association of these six gene-sets (Supplementary Table S1). These included four variants in the top category (P<1E-3), ie rs828701, rs828704, rs207938, and rs2032765.

The top XRCC5 marker rs828701 (risk allele: C), ie the SNP with the most significant P-value among these four SNPs, showed P=2.25E-5 (Table 1). The rs828701 C-allele frequency was 48.6% in cases and 43.5% in controls, respectively.
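To give a rough sense of the effect size implied by these allele frequencies, an allelic odds ratio can be computed from them. This value is not reported in the text; it is derived here from the rounded frequencies only, and the function name is hypothetical.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds ratio for carrying the allele, from its frequency in each group."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# C-allele frequencies of rs828701 as given in the text.
odds_ratio = allelic_odds_ratio(0.486, 0.435)
print(round(odds_ratio, 2))  # -> 1.23
```

An odds ratio of this size is in line with the small per-variant effects expected under a polygenic model.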

XRCC5 was present in 65 of the 10 367 analyzed gene-sets. A total of 1737 genes were present in 65 or more gene-sets. This indicates that the frequent presence of XRCC5 among the associated gene-sets was not due to overrepresentation of this gene in the gene-set collections, and that XRCC5 had no greater a priori likelihood of appearing in the 19 significantly associated gene-sets, ie that this gene was truly enriched for variants that drive the association of the gene-sets.

In the replication study, the set ‘DNA integration’ (Gene Ontology: 0015074), which contains the XRCC5 gene, achieved a nominally significant P-value of 3.07E-2. In addition, four further sets achieved a nominally significant P-value of <5E-2, ie focal adhesion (KEGG: 04510) P=3.80E-3; leukocyte transendothelial migration (KEGG: 04670) P=3.91E-3; chr7q32 (dbPOS-mSIGdb) P=2.39E-2; and MIR-377 (dbMIR-mSIGdb) P=3.51E-2 (Table 1). Thus a nominally significant result was obtained for 5 of the 19 gene-sets tested, which is a significantly higher number than would be expected by chance (P=0.002). For the 38 top genes, single-marker analyses in the merged COGA/SAGE data set generated no significant finding (P>5E-2). Gene-wide analysis using VEGAS revealed nominally significant P-values for CLDN14 and SLC2A13.
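The text does not state how the chance probability was calculated. One calculation that yields a value of this size is the upper tail of a binomial distribution, under the assumption that the 19 gene-sets are independent tests that each reach nominal significance with probability 0.05 under the null hypothesis. The function name is hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Probability of 5 or more nominally significant sets out of 19 by chance.
p_chance = binomial_tail(5, 19, 0.05)
print(f"{p_chance:.4f}")
```

Because several gene-sets share genes, the independence assumption is only approximate.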

Functional Genetic Study in Ku80, the Drosophila Homolog of XRCC5

The one-to-one ortholog of the human gene XRCC5 in Drosophila is the Ku80 gene (Altenhoff et al, 2013; Li et al, 2006, Supplementary Figure S2). Experiments in Drosophila showed that initial sensitivity to ethanol, as measured by the effect of ethanol intoxication on postural control, was significantly reduced (P<0.05) in Ku80 mutants compared with controls. This finding supports the hypothesis that Ku80 has a functional role in the regulation of ethanol sensitivity (Figure 1).

Figure 1

Effect of alcohol on postural control in Drosophila Ku80 mutants. Expression of the RNAi hairpin construct UAS-dsKu80JF02790 of the Ku80 gene under the control of the pan-neural GAL4 driver Appl-GAL4 significantly reduced ethanol sensitivity in the flies. To control for transgene insertion effects, heterozygous flies carrying only Appl-GAL4 or only UAS-dsKu80JF02790 were used. The average times (in min) required to lose postural control due to ethanol intoxication were: UAS-dsKu80JF02790/+: 22.3±1.7; Appl-GAL4/+: 20.5±1.8; and UAS-dsKu80JF02790/Appl-GAL4: 27.8±1.1. The numbers of experiments performed were 7, 7, and 8, respectively. Each experiment involved one fly population of n=100, and the error bars represent SEM. (*) indicates significance at P<0.05, as determined with an ANOVA and Tukey-Kramer post hoc test comparing the experimental group with controls.


Human Genetic Study of Free Access to Alcohol Self-Administration

The XRCC5 rs828701 genotypes were in HWE (P>0.05). Their distribution was (i) CC=8, CT=24, TT=17 in the 49 males; and (ii) CC=7, CT=19, TT=10 in the 36 females. The C-allele frequency in non-alcohol-dependent social drinkers was 44.9%, and no sex difference was observed (χ2-test P=0.78). The rs828701 variant had a robust effect on the self-administration of ethanol (Figure 2). An ANOVA model was computed to predict maximum achieved BACs from genotype. This revealed a significant variation in maximum BACs across genotypes, with TT>CT>CC (F(2,82)=3.6, P=0.03). Sex did not interact with genotype when entered as a covariate into this model. These results show that the T allele was associated with high alcohol consumption in the present experiments.
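The two checks on the genotype counts can be sketched as follows. The text does not name the exact tests, so this sketch assumes a Pearson χ2 goodness-of-fit test for HWE (1 degree of freedom) and a Pearson χ2 test on the 2 × 3 table of sex by genotype (2 degrees of freedom). Function names are hypothetical.

```python
from math import erfc, exp, sqrt


def hwe_chi_square(n_cc, n_ct, n_tt):
    """Pearson chi-square test for Hardy-Weinberg equilibrium (1 df)."""
    n = n_cc + n_ct + n_tt
    p = (2 * n_cc + n_ct) / (2 * n)  # frequency of the C allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_cc, n_ct, n_tt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(chi2 / 2))  # chi-square survival function for 1 df
    return chi2, p_value


def genotype_by_sex_chi_square(males, females):
    """Pearson chi-square test on a 2 x 3 sex-by-genotype table (2 df)."""
    n_m, n_f = sum(males), sum(females)
    n = n_m + n_f
    chi2 = 0.0
    for m, f in zip(males, females):
        column_total = m + f
        for observed, row_total in ((m, n_m), (f, n_f)):
            expected = row_total * column_total / n
            chi2 += (observed - expected) ** 2 / expected
    p_value = exp(-chi2 / 2)  # chi-square survival function for 2 df
    return chi2, p_value


# Genotype counts (CC, CT, TT) as given in the text.
males = (8, 24, 17)
females = (7, 19, 10)
pooled = tuple(m + f for m, f in zip(males, females))

hwe_chi2, hwe_p = hwe_chi_square(*pooled)
sex_chi2, sex_p = genotype_by_sex_chi_square(males, females)
print(f"HWE: chi2={hwe_chi2:.3f}, P={hwe_p:.2f}")
print(f"Sex: chi2={sex_chi2:.3f}, P={sex_p:.2f}")
```

Under these assumptions, the pooled counts show no departure from HWE, and the sex comparison gives a P-value close to the reported 0.78.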

Figure 2

Effect of genotype on alcohol self-administration. Mean and standard error of the mean (SEM) of the highest blood alcohol concentration (BAC) achieved during 2 h of voluntary free access intravenous self-infusion of ethanol in participants with the genotypes CC (n=15), CT (n=43), and TT (n=27). The effect of genotype was statistically significant. Error bars represent SEM. Significance code: *P<0.05.


DISCUSSION

The present results implicate 19 gene-sets in AD susceptibility, which represent functionally diverse pathways. In the replication data set, five gene-sets achieved nominally significant P-values ranging from 3.80E-3 to 3.51E-2, ie DNA integration, focal adhesion, leukocyte transendothelial migration, chr7q32, and microRNA mir-377. The replication of this number of gene-sets in an independent sample is significantly higher than would be expected by chance (P=0.002). This finding is consistent with the hypothesis that multiple genes and genetic variants with low effect sizes contribute to AD (Frank et al, 2012), and underlines the validity of such a systematic exploitation of GWAS data with respect to the identification of vulnerability genes.

On the level of gene-sets/biological processes, one of the present findings has been implicated in previous studies of AD and alcohol-related traits, ie glycolysis/gluconeogenesis. According to KEGG, these processes include alcohol degradation and the conversion of the resulting acetate to acetyl CoA. In rat models, chronic alcohol consumption has been associated with changes in the enzymes of glycolysis and gluconeogenesis in the liver (Klouckova et al, 2006), as well as with enzymes of glycolysis in the hippocampus (Hargreaves et al, 2009). Research in humans has demonstrated inhibition of gluconeogenesis following alcohol ingestion, as determined through quantification of the gluconeogenic flux (Siler et al, 1998). The glycolysis/gluconeogenesis set contained two well-established groups of candidate genes (Supplementary Table S1), ie the alcohol dehydrogenases ADH1B and ADH1C (for review, see Rietschel and Treutlein, 2013; see also www.addictiongwas.com, described in Spanagel et al, 2013), and the aldehyde dehydrogenases ALDH1A3 (Sherva et al, 2009) and ALDH7A1 (Bhave et al, 2006). These aldehyde dehydrogenases belong to the same gene family as ALDH2, which exerts strong protective effects against AD in Asian ethnicities (for review, see Rietschel and Treutlein, 2013).

With respect to the identified gene-sets, glycolysis/gluconeogenesis was the only one to have received strong support from previous studies. We therefore sought independent support for the 38 most significant contributory genes (P<1E-3) from our 19 identified gene-sets. In the gene-wide analysis of the COGA/SAGE replication data set, the genes CLDN14 and SLC2A13 achieved nominal significance. With respect to the non-replicated genes, it must be borne in mind that for complex traits, non-replication does not necessarily indicate that the initial results were false. This is underlined by the fact that in addition to ADH1B and ADH1C, 16 of the 38 genes in the top category of the 19 gene-sets have been reported as candidate genes for alcohol-related traits in previous studies, whereas 20 are new candidates (Table 2).

Table 2 Previously Implicated Candidates Among the Thirty-Eight Most Significant Contributory Genes (P<1E-3) in the 19 Identified Gene-Sets

The fact that 15 of the 19 gene-sets achieved significance owing to variants that were not among the best single-marker findings (P<1E-5) demonstrates the ability of the present approach to identify pathophysiologically relevant genes that are overlooked in single-marker analyses of GWAS data. This is also illustrated by the markers in XRCC5, all of which had P-values of >1E-5.

The four top SNPs in XRCC5 were only moderately correlated (rs828701-rs828704: r2=0.271, D′=1.000; rs828701-rs207938: r2=0.257, D′=0.533; rs828701-rs2032765: r2=0.016, D′=0.424; rs828704-rs207938: r2=0.021, D′=0.333; rs828704-rs2032765: r2=0.094, D′=0.469; rs207938-rs2032765: r2=0.080, D′=1.000; 1000 Genomes Pilot 1 CEU data accessed via https://www.broadinstitute.org/mpg/snap/ldsearch.php). Only the top SNP rs828701 was genotyped, as none of the four SNPs constituted an eQTL or an amino-acid exchange.
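The two linkage disequilibrium measures quoted above can diverge, as for rs828701 and rs828704 (D′=1.000 but r2=0.271). The sketch below shows the standard definitions for two biallelic loci and reproduces this pattern with invented haplotype frequencies; it does not use the 1000 Genomes data, and the function name is hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (r2, D') for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return r2, d_prime


# Invented frequencies: allele B occurs only on haplotypes carrying allele A.
r2, d_prime = ld_stats(p_ab=0.2, p_a=0.5, p_b=0.2)
print(f"r2={r2:.2f}, D'={d_prime:.2f}")
```

D′ reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r2 also depends on how similar the allele frequencies at the two loci are, which is why a pair of SNPs can show complete D′ and still be only moderately correlated.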

XRCC5 was present in approximately one third of the discovered gene-sets. The application of a convergent approach (Spanagel, 2013) generated strong further independent support for XRCC5. Previous gene expression studies in rodents have implicated XRCC5 in the etiology of AD. Expression profiling in inbred long sleep/short sleep mouse strains—which display pronounced differences in ethanol sensitivity as measured by loss of the righting response (LORR)—identified 15 differentially expressed genes, one of which was XRCC5 (MacLaren et al, 2006). LORR refers to the inability of mice to roll back onto their abdomens after receiving a sedative dose of alcohol and being placed on their backs (Crabbe et al, 2010). LORR is the correlate of the level of response to alcohol in humans, which refers to the reactions of an individual following alcohol ingestion (eg feeling ‘high’, reduced motor coordination) (Joslyn et al, 2010). An early linkage study mapped low level of response to alcohol to several chromosomal regions. These included chr2q35 (Schuckit et al, 2001), which hosts the XRCC5 gene. Furthermore, in a recent gene-set-enrichment study of the level of response, XRCC5 was among the 173 loci that contributed to this phenotype (Joslyn et al, 2010).

However, recent research has shown that the role of alcohol sensitivity in the development of AD is more complex (Newlin and Renton, 2010), and that responses to alcohol, such as greater pleasurable and excitatory effects, can increase the risk of future alcohol problems (King et al, 2011; King et al, 2014).

Therefore the issues of whether, and under which circumstances, level of response predicts AD remain unclear. However, the result of the present Drosophila study corroborates previous observations that XRCC5 modifies the acute response to alcohol.

The present human genetic study provides further evidence for an association between XRCC5 and effects of alcohol consumption in healthy young adult social drinkers.

The XRCC5 variant rs828701, which was the most significant XRCC5 finding in our GWAS of AD (P=2.25E-5), showed an allele-dosage-dependent association with the maximum achieved BAC in an experiment in which volunteers used a free access alcohol self-infusion paradigm to reproduce the pleasant alcohol effects that they prefer within the context of a weekend party.

Compared with the findings described above, the results of this experiment are not only influenced by the level of response to inebriating effects of alcohol, but also by the behaviorally relevant constructs of ‘liking’ and ‘wanting’ alcohol, according to the theory of incentive sensitization (Robinson and Berridge, 2000). However, association was found with the T allele, and not with the AD risk allele C.

The variant rs828701 does not constitute an amino-acid exchange, and has no known effect on gene expression. Thus the possibility that this represents a chance finding cannot be excluded. However, we consider the finding genuine, and attribute it to the well-known phenomenon whereby different alleles of the same variant are associated with the same phenotype. This phenomenon was first described in 2007 (Lin et al, 2007), and has been observed for some of the most robust genetic association findings in both humans (Lin et al, 2007; Maher et al, 2010) and Drosophila (Gruber et al, 2007). These authors proposed that this phenomenon may be explained by an interaction or correlation between the examined (proxy) variant and an unknown causal variant. Events of this nature are most likely to occur when proxies are examined in populations with different genetic backgrounds. All of the present study participants were of German descent. Nevertheless, certain differences between cohorts may have introduced some unrecognized difference in genetic composition: the risk allele was found in AD patients (Frank et al, 2012) with an average age of 42.0 years who had commenced drinking many years ago, whereas the study of alcohol self-administration was conducted in healthy, non-alcohol-dependent younger social drinkers with an average age of 18.39 years. Furthermore, research has shown that light social drinkers could be a unique group, as their drinking over follow-up remained light, indicating the presence of factors that mitigate their risk for heavy drinking (King et al, 2014).

The effect may also have been due to a gene × environment (GxE) interaction. The presence of a GxE interaction means that the effect of a gene differs between environments. This may also explain inconsistent findings in the literature concerning whether increased alcohol sensitivity confers an increased risk of AD. Unrecognized, yet important, environmental factors may act in our study populations. More detailed comparison of the alcohol intake characteristics of controls and the alcohol self-administration group was hampered by the fact that the controls were population based, as are many GWAS control samples. The possibility that the reversed allele effect was due to the chance occurrence of erratic allele frequencies can be excluded by the finding that the allele frequency in the non-alcohol-dependent social drinkers (C-allele frequency: 44.9%) was closer to that of the controls (C-allele frequency: 43.5%) than to that of the cases (C-allele frequency: 48.6%).

XRCC5 encodes a protein involved in the repair of DNA double strand breaks (Downs and Jackson, 2004). XRCC5 homologs are involved in similar phenotypes in the organismal lineages human, mouse, and fly, whose last common ancestor existed around 970 million years ago (Nei et al, 2001). This suggests that the function of XRCC5/Ku80 in the cellular context, which is presumably related to double strand break repair, has been preserved owing to its vital importance. XRCC5’s involvement in neuronal homeostasis (De Zio et al, 2012) may also influence the manner in which an organism reacts to ethanol.

A potential limitation of our gene-set-based study is that although correction was made for gene-set significance within each gene-set database, no correction was made for all databases in total. This increases the likelihood of false-positive findings. The procedure was chosen to limit the risk of rejecting true findings on the basis of their small effect sizes and indeed we could replicate 5 out of the 19 gene-sets in an independent sample. However, the other results require replication in future independent studies.

A further issue concerns comparability between alcohol sensitivity in model organisms and response to alcohol in humans (Crabbe et al, 2010). Although the comparability of these phenotypes is open to question, human XRCC5 and D. melanogaster Ku80 show 1:1 orthology, and thus comparability in terms of the corresponding gene is strong.

Despite uncertainty concerning phenotypic correlation between species, preliminary analyses of the biological mechanisms that are associated with alcohol-induced behaviors in both species (eg cAMP signaling) suggest that the ethanol phenotypes in Drosophila may indeed serve as a proxy for more complex alcohol-induced behaviors in humans (Moore et al, 1998; Schuckit et al, 2004). In particular, the correlation between alcohol sensitivity/resistance in Drosophila and the level of response in humans is supported by convincing molecular data (Lasek et al, 2011).

In conclusion, the present gene-set-based analysis identified known and new candidate genes for AD. Our follow-up studies of XRCC5 in Drosophila and humans support previous convergent findings from gene expression studies in rodents, and findings from linkage and gene-set-enrichment studies in humans, in suggesting that this gene influences the response to alcohol. Further studies are warranted to replicate these candidate genes, and to elucidate their precise mechanisms of action and their relevance in AD.

FUNDING AND DISCLOSURE

MR, MMN, and SC were supported by grant FKZ 01GS08152 from the National Genome Research Network (NGFN plus) and BMBF 01ZX1311A (e:Med program) of the German Federal Ministry of Education and Research (BMBF). KM was supported by grant 01EB0410 from the Bundesministerium für Bildung und Forschung. RS was supported by grants FKZ 01GS0117/NGFN and FKZ EB 01011300, and by grant FKZ 01GS08152 from the National Genome Research Network (NGFN plus) and BMBF 01ZX1311A (e:Med program, see Spanagel et al, 2013). BB was supported by the German Federal Ministry of Research and Education (BMBF) through grants 01GS0896, 01GS08149, and 01GS08153. USZ, MNS, and EJ were supported by grant 1U01AA017900-01 from the National Institute on Alcohol Abuse and Alcoholism (NIAAA). The contents of this work are solely the responsibility of the authors and do not necessarily represent the official view of the NIAAA or NIH. USZ, MNS, and EJ were also supported by grant ZI 1119/4-1 from the Deutsche Forschungsgemeinschaft (DFG). MMN is a member of the DFG-funded Excellence-Cluster ImmunoSensation 3 and received support from the Alfried Krupp von Bohlen und Halbach-Stiftung. JT was supported by the DFG (TR 920/2-1). HS was supported by the Heisenberg grant DFG Scho 656/7-2. ND was supported by grants from DFG and BMBF. Wolfgang Gaebel has received symposia support from Janssen-Cilag GmbH, Neuss, Lilly Deutschland GmbH, Bad Homburg, and Servier, Munich. He is a member of the Faculty of the Lundbeck International Neuroscience Foundation (LINF), Denmark. Monika Ridinger received compensation from Lundbeck Switzerland and Lundbeck Institute for advisory boards and expert meetings, and from Lundbeck and Lilly Suisse for workshops and presentations. Dr Wodarz has received funding from the German Research Foundation (DFG) and Federal Ministry of Education and Research Germany (BMBF); he has received speaker’s honoraria and travel funds from Janssen-Cilag and Essex Pharma. He took part in industry-sponsored multi-center randomized trials by D&A Pharma and Lundbeck. USZ received compensation for professional services from Janssen, Lundbeck, Servier, Sächsische Landesärztekammer, Gewerkschaft Erziehung und Wissenschaft, Park-Krankenhaus Leipzig, and ABW Wissenschaftsverlag. Prof Dr N Scherbaum received honoraria for several activities (advisory boards, lectures, manuscripts, and educational material) from the companies Sanofi-Aventis, Reckitt-Benckiser, Lundbeck, and Janssen-Cilag. During the last 3 years, he participated in clinical trials financed by the pharmaceutical industry (Reckitt & Benckiser). Prof Dr N Dahmen has received support from Astra Zeneca and Janssen-Cilag, and for expert opinions for organizations, courts, insurances, and private persons. All other authors declare no conflict of interest. The authors have neither submitted nor published any related manuscripts elsewhere. The funding bodies had no role in study design, data collection, data analysis, manuscript preparation, or the decision to publish the manuscript.