INTRODUCTION

Cigarette smoking represents the most important source of preventable morbidity and premature mortality worldwide. Individual differences in smoking behavior can be attributed to genetic as well as to environmental factors. Twin studies have shown that a variety of nicotine-dependence (ND)-related phenotypes are substantially heritable, including ND and smoking persistence (h2 between 46 and 69%) (Maes et al, 2004; True et al, 1997; Haberstick et al, 2007; Li et al, 2003), failed smoking cessation (Xian et al, 2003), and various withdrawal symptoms (Xian et al, 2003; Pergadia et al, 2006). Linkage studies (Li, 2008), candidate–gene association studies, and more recently genome-wide association studies (GWAS) have been used to identify genes associated with ND. The most remarkable results from GWAS and follow-up candidate–gene studies implicate a gene cluster on chromosome 15q24–15q25 containing CHRNA5, CHRNA3, and CHRNB4, which respectively encode the α5, α3, and β4 subunits of the nicotinic acetylcholine receptors (Bierut et al, 2008, 2009; Stevens et al, 2008; Saccone et al, 2007; Li et al, 2009; Thorgeirsson et al, 2008; Hung et al, 2008; Berrettini et al, 2008; Weiss et al, 2008). Associations between these genes and ND have been replicated using several different measures of ND. An association between variation at these loci and risk for lung cancer has also been reported (Thorgeirsson et al, 2008; Hung et al, 2008). Although initial studies were conducted in subjects of mainly European descent, more recent studies have confirmed associations between variation within the CHRNA5/A3/B4 gene cluster and ND-related phenotypes in African-American (Saccone et al, 2009; Li et al, 2009) and Asian populations (Li et al, 2010).

Despite widespread public knowledge of the harmful consequences of smoking, and several interventions (including medications) available for ND, about 27% of individuals in the United States continue to use tobacco regularly (World Health Organization, 2008). Craving has been suggested to be among the most important acute factors contributing to unsuccessful quitting or smoking relapse (Tiffany et al, 2009; Aubin et al, 2010; Killen et al, 2006; Killen and Fortmann, 1997; Piasecki, 2006), but it remains understudied. For example, no clear data on heritability of tobacco craving are available. However, a study of craving for sweet foods in 663 twin pairs (Keskitalo et al, 2007) found that additive genetic factors explain 38% of the variance of such craving. This estimate is within the range typical of a variety of addiction-related phenotypes, and provides at least suggestive evidence that genetic factors may be important in tobacco craving as well.

The main foci of most ND-related genetic studies have been the diagnosis of ND according to DSM-IV or other criteria, or ND-related symptoms such as scores on the Fagerström Test for Nicotine Dependence (FTND), the number of cigarettes smoked per day (CPD), or cotinine levels (Greenbaum and Lerer, 2009). To date, candidate–gene studies focusing on cessation-related phenotypes, such as craving (Perkins et al, 2009; Baker et al, 2009; Erblich et al, 2004) or withdrawal symptoms (Stern et al, 2007; Conti et al, 2008; Pergadia et al, 2009; Perkins et al, 2009; Baker et al, 2009), have not yielded compelling positive results. Finding genes associated with craving or other cessation-related phenotypes may help identify new strategies to help people quit smoking.

Recently, it has been shown that several genes that affect feeding can also be involved in drug reward; galanin (GAL) and neuropeptide Y (NPY) are among these genes (Zachariou et al, 2001; Fulton et al, 2006). GAL in particular can inhibit dopamine and norepinephrine release and may therefore modulate the rewarding properties of nicotine and other drugs of abuse (Robinson and Brewer, 2008; Melnikova et al, 2006; Narasimhaiah et al, 2009). GAL and its receptors have been shown to modulate behaviors such as conditioned place preference, locomotor activation, and withdrawal responses, in studies of addictive drugs in animal models (Zachariou et al, 1999, 2000; Hawes et al, 2008). In human beings, a nominally significant association between a single-nucleotide polymorphism (SNP) in the GAL locus and heroin addiction was recently observed (Levran et al, 2008), and a SNP in GALR3, encoding the type-3 GALR, has been reported to associate with alcoholism (Belfer et al, 2007).

The current case-only candidate gene study was conducted to evaluate a variety of candidate loci and their associations to phenotypes relevant to subjective responses to tobacco, and the ability to quit smoking, assessed at baseline in smokers seeking cessation treatment. Candidate genes were chosen based on the neurobiology of ND, and previous reports of genetic associations. Our results suggest an association between variation at the GALR1 locus and baseline craving for tobacco in smokers seeking cessation treatment.

MATERIALS AND METHODS

Samples

Six hundred and eleven subjects were recruited from three smoking cessation trials at the Yale University Transdisciplinary Tobacco Use Research Center (TTURC; Toll et al (2007a, 2010) and O’Malley et al (2006)). Common inclusion criteria for admission into the studies were a minimum age of 18 years, fluency in English, weight of at least 45 kg, daily cigarette smoking (10 or more; Toll et al (2010)), and an expired carbon monoxide level of greater than or equal to 10 p.p.m. Exclusion criteria included current serious neurological, psychiatric, or medical illness, current dependence on illicit drugs of abuse, and for women, being pregnant or nursing, or being sexually active without using reliable birth control. Additional study-specific criteria can be found in the original reports (O’Malley et al, 2006; Toll et al, 2010). All subjects were prescreened by phone; this report is based on data provided by subjects who consented at the in-person intake evaluation whether or not they ultimately met all eligibility criteria. For the association tests presented here, we restricted analyses to individuals whose self-reported race was European American (EA) or African American (AA).

Assessments

At intake sessions, participants completed questionnaires and interviews assessing demographics (age, gender, race, weight, height, marital status, numbers of children, education level), personal and family history of smoking and alcohol use, mood, and other areas of functioning.

Written informed consent for participation, including explicit consent to donate blood samples for use in genetic studies, was obtained from each subject, following procedures approved by the Institutional Review Board (IRB) of Yale University School of Medicine. The Emory University IRB approved laboratory and statistical analysis of de-identified DNA samples and data.

Scales for Craving and ND

The Minnesota Withdrawal Scale (MWS) is an 8-item questionnaire, which assesses craving (Toll et al, 2007b), plus seven symptoms of nicotine withdrawal listed in the DSM-IV (Gmitrowicz and Kucharska, 1994) (irritability, frustration, or anger, anxiety, difficulty concentrating, restlessness, increased appetite or weight gain, depressed or sad mood, insomnia or sleep problems) (Hatsukami et al, 1992, 1998; Hughes and Hatsukami, 1986). Each symptom is rated from 0 (none) to 4 (severe). We administered the MWS to each participant at intake (ie, before quitting). The wording of the MWS instructions was designed to assess each subject's experience during a previous quit attempt: ‘Listed below are a number of symptoms some people have experienced when they have cut down their smoking in an effort to quit. Please rate the severity of each of the symptoms listed below that you experienced DURING A PRIOR QUIT ATTEMPT.’

Severity of ND was assessed using the FTND (Heatherton et al, 1991), number of CPD, plasma cotinine concentration, and age at first cigarette. The FTND is a 6-item questionnaire (scale 0–10) that evaluates symptoms of ND, including smoking quantity (SQ) and the behaviors related to the individual's urge to smoke (‘how soon after waking up do you smoke your first cigarette’ … ‘do you smoke if you are so ill that you are in bed most of the day’). Cotinine, the main metabolite of nicotine, is a biomarker of nicotine intake and was measured from blood samples using a modified HPLC procedure (Hariharan et al, 1988).

Bioinformatics and SNP Selection

Twenty-six genes were selected for this genetic association study based on recent pharmacological and GWA findings (Table 1, Supplementary Table 1). SNPs were identified using publicly available online resources, including the UCSC Genome Bioinformatics Site (http://genome.ucsc.edu/), HAPMAP (hapmap.org), and SNPbrowser 2.0 (Applied Biosystems, Foster City, CA) software. Selection of tagging SNPs to cover a particular gene was carried out using HAPMAP and SNPBrowser (ABI). The criteria for choosing tag SNPs were as follows: (1) having a proxy (surrogate) SNP with r2 ≥85% for any SNP in the region that (2) had a minor allele frequency (MAF) ≥5%. We also made sure to include non-synonymous SNPs regardless of MAF or pair-wise r2 with neighboring SNPs. For NRXN1 and NRXN3, which are very large genes, we only selected SNPs that have been reported in the literature as potentially associated with ND (Bierut et al, 2007; Novak et al, 2009; Nussbaum et al, 2008) or that could be functional variants. Supplementary Table 1 shows a list of the genes and which SNP selection approach was used for each. A total of 145 SNPs were selected (Supplementary Table 1).
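The two tagging criteria can be illustrated with a minimal greedy sketch. This is not the algorithm implemented in HAPMAP or SNPbrowser; it is only one simple way to satisfy the stated thresholds, and the SNP names, MAF values, and r2 values below are hypothetical.

```python
def select_tag_snps(maf, r2, maf_min=0.05, r2_min=0.85):
    """Greedy tag-SNP selection (illustrative only).

    maf: dict mapping SNP id -> minor allele frequency
    r2:  dict mapping frozenset({snp_a, snp_b}) -> pairwise r^2
    Returns tag SNPs such that every SNP with MAF >= maf_min is either
    a tag itself or has r^2 >= r2_min with at least one tag.
    """
    common = {s for s, f in maf.items() if f >= maf_min}

    def covered_by(snp):
        # SNPs that `snp` would capture, including itself
        return {t for t in common
                if t == snp or r2.get(frozenset((snp, t)), 0.0) >= r2_min}

    tags, uncovered = [], set(common)
    while uncovered:
        # Pick the SNP capturing the most still-uncovered SNPs
        # (ties resolved by alphabetical order of the SNP id).
        best = max(sorted(common), key=lambda s: len(covered_by(s) & uncovered))
        tags.append(best)
        uncovered -= covered_by(best)
    return tags


# Hypothetical toy data, not taken from the study
maf = {"snpA": 0.30, "snpB": 0.28, "snpC": 0.12, "snpD": 0.01}
r2 = {frozenset(("snpA", "snpB")): 0.92, frozenset(("snpA", "snpC")): 0.40}
tags = select_tag_snps(maf, r2)
```

In this toy input, snpB is captured by snpA, snpC needs its own tag, and snpD is ignored because its MAF is below 5%. Non-synonymous SNPs, which the study included regardless of MAF or r2, would have to be added to the result separately.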

Table 1 List of the Candidate Genes used in the Candidate Association Study

Ancestry Informative Markers

To address potential artifacts due to population stratification, we typed 106 Ancestry Informative Markers (AIMs) from a previously described panel (Fejerman et al, 2008) that effectively discriminates European, African, and Native American origins, in a subsample of 358 subjects from the three smoking cessation trials.

Genotyping

Genomic DNA was extracted from whole blood using a Mag-Bind SQ Blood DNA kit (Omega Bio-Tek, Norcross, GA) on a KingFisher Flex Magnetic Particle Processor (Thermo Scientific, Waltham, MA) following the manufacturer's instructions. DNA samples (5 μg) were arrayed in 384-well plates using a Biomek FX liquid handling system and dried overnight. For quality control, each plate contained at least two blanks, two repeat samples, and two lab control samples known to be of high DNA quality. Genotyping was performed for all the SNPs using the Sequenom MassArray platform (Ehrich et al, 2005; Gabriel et al, 2009). PCR primers and extension primers were automatically designed using MassArray Assay Designer software and ordered from Integrated DNA Technologies (Coralville, IA; primer sequences are available on request). Genotype calls were made using the Spectrotyper software (Sequenom) and then downloaded into a BC-Gene database (http://www.bcplatforms.com), which ran standard quality control analyses, including evaluation of Hardy–Weinberg equilibrium, call rates, and minor allele frequencies.

Genotypes for the GALR1 SNP rs2717162 were confirmed using the 5′-exonuclease (TaqMan) method. The TaqMan assay kit and genotyping reagents were ordered from Applied Biosystems. Genotyping was performed on an ABI 7900HT system, with samples arrayed in 384-well plates. Six discordant genotypes were eliminated from the analyses.

Data Cleaning and Quality Control

Candidate genes

Twenty-one of the initial 611 samples failed; 590 samples were used in the final analyses. From the initial 145 SNPs, 11 did not amplify and/or significantly deviated from Hardy–Weinberg equilibrium and/or had a call rate lower than 90%; 24 SNPs turned out not to be polymorphic and 29 had an MAF of <0.02; a total of 81 SNPs were used for our analyses. The allele-frequency data observed in our sample were similar to population data reported in NCBI (Supplementary Table 2).
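As a rough illustration of the per-SNP filters described above, the sketch below computes call rate, MAF, and a Hardy–Weinberg chi-square statistic from genotype counts. It is not the BC-Gene implementation. The 90% call-rate and 0.02 MAF thresholds come from the text; the Hardy–Weinberg cutoff (p=0.05, 1 df) is an assumption, because the text does not state the threshold used, and the genotype counts are hypothetical.

```python
def snp_qc(n_aa, n_ab, n_bb, n_missing):
    """Return (call rate, minor allele frequency, HWE chi-square) for one SNP."""
    n_called = n_aa + n_ab + n_bb
    call_rate = n_called / (n_called + n_missing)
    p = (2 * n_aa + n_ab) / (2 * n_called)  # frequency of allele A
    q = 1.0 - p
    maf = min(p, q)
    # Genotype counts expected under Hardy-Weinberg equilibrium
    expected = (p * p * n_called, 2 * p * q * n_called, q * q * n_called)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return call_rate, maf, chi2


def passes_qc(call_rate, maf, chi2, min_call=0.90, min_maf=0.02, chi2_crit=3.841):
    # chi2_crit = 3.841 is the 1-df critical value at p = 0.05 (assumed cutoff)
    return call_rate >= min_call and maf >= min_maf and chi2 <= chi2_crit


# Hypothetical genotype counts for a single SNP
call_rate, maf, chi2 = snp_qc(n_aa=250, n_ab=200, n_bb=40, n_missing=10)
keep = passes_qc(call_rate, maf, chi2)

# SNP bookkeeping reported in the text: 145 - 11 - 24 - 29 = 81
snps_analyzed = 145 - 11 - 24 - 29
```

A monomorphic SNP yields an MAF of 0 and is rejected by the same function, which corresponds to the non-polymorphic category in the text.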

AIMs

Five samples were excluded owing to genotyping failure. For the AIMs, 10 SNPs with a call rate below 90% or poor QC metrics were discarded from the subsequent analyses.

Statistical Analysis

Candidate genes

A univariate general linear model (GLM) was run for each SNP using the following dependent variables: (i) MWS mean; (ii) craving score; (iii) FTND sum; (iv) plasma cotinine level; (v) age at first cigarette; and (vi) CPD. We chose a GLM over the alternative of testing additive, dominant, and recessive models for each SNP to minimize multiple testing. The following variables were assessed for association with the outcomes, and used as co-variates when appropriate: age, sex, BMI, education, and estimated proportion of European chromosomal ancestry (see below). For analyses that included both EA and AA samples, we performed a stratified analysis by including a covariate for EA vs AA. For important results, we also used permutation tests to compute empirical p-values. We generated 1 000 000 permuted data sets in which the values of the dependent variable were randomly re-assigned to individuals within each race, and computed the empirical p-value as the proportion of permuted tests for which the F statistic exceeded the original. To maintain an experiment-wide significance level of 0.05, we set α to 0.00010 (0.05/(81 × 6)), to correct for multiple testing (81 SNPs and six phenotypic outcomes). All statistical analyses were implemented in SPSS (17.0) or R.
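The Bonferroni threshold and the within-race permutation scheme can be sketched as follows. This is a simplified illustration rather than the SPSS or R code used in the study: it uses a one-way ANOVA F statistic with genotype as a three-level factor and omits the covariates, the number of permutations is reduced from 1 000 000 to 2000, and the craving scores are hypothetical.

```python
import random


def f_statistic(values, groups):
    """One-way ANOVA F statistic of `values` across the levels of `groups`."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    n, k = len(values), len(by_group)
    grand = sum(values) / n
    ss_between = sum(len(x) * (sum(x) / len(x) - grand) ** 2
                     for x in by_group.values())
    ss_within = sum((v - sum(x) / len(x)) ** 2
                    for x in by_group.values() for v in x)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p(values, genotypes, strata, n_perm=2000, seed=0):
    """Empirical p-value; phenotype values are shuffled within each stratum."""
    rng = random.Random(seed)
    observed = f_statistic(values, genotypes)
    index_by_stratum = {}
    for i, s in enumerate(strata):
        index_by_stratum.setdefault(s, []).append(i)
    exceed = 0
    for _ in range(n_perm):
        permuted = list(values)
        for idx in index_by_stratum.values():
            shuffled = [values[i] for i in idx]
            rng.shuffle(shuffled)
            for i, v in zip(idx, shuffled):
                permuted[i] = v
        # Count permutations whose F statistic exceeds the observed one
        if f_statistic(permuted, genotypes) > observed:
            exceed += 1
    return exceed / n_perm


# Bonferroni threshold from the text: 81 SNPs x 6 phenotypes
alpha = 0.05 / (81 * 6)

# Hypothetical craving scores (0-4) for 18 subjects in a single stratum
genotypes = ["CC"] * 6 + ["TC"] * 6 + ["TT"] * 6
craving = [1, 2, 1, 2, 1, 2, 3, 4, 3, 4, 3, 4, 4, 3, 4, 4, 3, 4]
race = ["EA"] * 18
p_emp = permutation_p(craving, genotypes, race)
```

With only 2000 permutations the smallest non-zero empirical p-value is 0.0005, which is why a far larger number of permutations is needed to resolve p-values near the corrected threshold of about 0.0001.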

In silico analysis

Linkage disequilibrium (LD) plots for SNPs in GALR1 in the EA subsample were prepared using Haploview 4.1, and compared with those derived from the CEU data available in HAPMAP (http://hapmap.ncbi.nlm.nih.gov/); r2 was used to evaluate LD between SNPs. Predicted canonical transcription factor analyses in the region surrounding the significant SNP (rs2717162 T/C) were performed using the Matinspector software by Genomatix (www.genomatix.de/cgi-bin/eldorado).
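For reference, r2 between two biallelic SNPs is D²/(pA(1−pA)pB(1−pB)), where D = pAB − pA·pB. The sketch below computes it from phased two-locus haplotypes. This is an assumption made for simplicity: Haploview works from unphased genotypes and estimates haplotype frequencies itself, and the haplotypes shown here are hypothetical.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic SNPs from phased two-locus haplotypes.

    haplotypes: list of 2-character strings, e.g. "TA" means allele T at
    SNP 1 and allele A at SNP 2 on the same chromosome.
    """
    n = len(haplotypes)
    a1 = haplotypes[0][0]  # reference allele at SNP 1
    b1 = haplotypes[0][1]  # reference allele at SNP 2
    p_a = sum(h[0] == a1 for h in haplotypes) / n
    p_b = sum(h[1] == b1 for h in haplotypes) / n
    p_ab = sum(h == a1 + b1 for h in haplotypes) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else float("nan")


# Hypothetical haplotypes: complete LD, then no LD
r2_complete = ld_r2(["TA"] * 6 + ["CG"] * 4)
r2_none = ld_r2(["TA", "TG", "CA", "CG"])
```

The choice of reference allele does not affect the result, because r2 is symmetric in the two alleles at each locus.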

AIMs

We used the Structure software (Pritchard et al, 2000a) and the above-described AIMs to analyze population stratification. Structure estimates proportions of chromosomal ancestry based on K (the number of source populations). As recommended in the documentation of the software, we ran analyses with a range of values for K (in this case, between 1 and 4), with 10 000 and 100 000 iterations for burn-in. All other settings were at the default values. For each K, at least three runs were performed with and without using self-reported race data from each individual (Pritchard et al, 2000b; Rosenberg et al, 2002).

RESULTS

Four hundred and eighty-six smokers (47.8% male, FTND mean±SD=5.8±2.1) who underwent evaluation for enrollment in clinical trials for smoking cessation were included in the study. Additional demographic and ND-related phenotypic characteristics of the sample are summarized in Table 2. Level of education associated significantly with five of the six phenotypes of interest (Supplementary Table 3). We therefore used level of education, coded as a three-level variable, as a covariate in our model. The degree of depressive symptoms varied widely in the sample, with total Center for Epidemiological Studies Depression Scores (CES-D) (Radloff, 1977) ranging from 0 to 43 (mean±SD: 9.21±7.22).

Table 2 Demography and Phenotype Description of Studied Population

First, for the candidate gene study, we performed GLM analyses in the EA subset of the sample (n=432); the sample sizes of other self-reported ethnicities were not large enough to support separate analyses at this stage.

One SNP in the GALR1 locus (rs2717162) was significantly associated with severity of craving as measured by the MWS in the EA subsample (p=6.48 × 10−6). A similar direction of association was observed in the AA subsample, but was not significant (p=0.57; Table 3). Analysis of the combined EA and AA samples revealed an association of similar strength to that observed in the EA sample (p=9.23 × 10−6) (Table 3). Individuals with TT and TC genotypes had significantly higher craving scores than CC subjects in the EA subsample (p=7.7 × 10−6) and the entire sample (p=5.35 × 10−8; Figure 1). To evaluate whether the lack of significance in the AA subsample reflected the relatively low power of this subsample, we performed a power analysis using the program QUANTO (Gauderman, 2002a, 2002b), assuming a recessive model (consistent with low craving only in the CC homozygotes). On the basis of an overall mean craving score of 3.3 and SD=0.85, and a MAF of 0.26, the power of the AA sample (N=43) to detect the observed effect size in a gene-only model at α=0.05 was 0.024; for the EA sample, power was 0.96 and for the overall sample, 0.99. MWS total scores, which included the craving score, were also nominally associated with rs2717162 (Table 3), but did not meet Bonferroni-corrected criteria for experiment-wide significance.

Table 3 Rs2717162 SNP is Significantly Associated with the Craving Score from the MWS
Figure 1

Comparison of craving score in smokers with different genotype at rs2717162 (Galanin Receptor 1, GALR1) in European American (EA), African American (AA) subjects, or in the Entire population (EP). The numbers on the bars are the subjects in each genotype group. The p-values were calculated based on permutations. The CC genotype shows the lowest craving score in the three groups, reaching significance in the EA and EP.

Neither alcohol consumption, alcohol abuse, nor CES-D depression scores were significantly associated with the GALR1 SNP, and none of those measures accounted for any additional variance in craving scores (data not shown).

rs2717162 was not in strong LD with any of the other eight SNPs analyzed at GALR1, a result consistent with those from HAPMAP. Figure 2 shows a Haploview plot of LD among the GALR1 SNPs, using r2 as the measure of LD. The plot was similar to that generated from the CEPH samples in HAPMAP (Figure 2). Using HAPMAP, we analyzed an extended region of GALR1 (1 Mb) to find other possible SNPs in LD with rs2717162; Figure 3 shows a portion of this area. We observed only one other SNP in LD with rs2717162, namely rs2850878, located in the same intronic region, about 1 kb downstream.

Figure 2

Linkage disequilibrium (LD) structure of the galanin receptor 1 (GALR1) gene in European Americans (EA) in our sample (left) and based on the data from the HAPMAP project (right). LD matrix of GALR1 single-nucleotide polymorphisms (SNPs) measured by r2 for SNPs in EAs (left) and in the Centre d'Etude du Polymorphisme Humain (CEPH) samples (right). Each box represents r2 (% LD) between SNP pairs, as generated by Haploview (Whitehead Institute for Biomedical Research). Boxes without numbers represent complete LD; r2=100%. SNP IDs are public database rs numbers. The plot from the EA (left) was similar to that generated from the CEU samples in HAPMAP (right), with no other SNPs in LD with rs2717162.

Figure 3

Linkage disequilibrium (LD) of HAPMAP single-nucleotide polymorphisms (SNPs) in the region surrounding galanin receptor 1 (GALR1). LD matrix of the GALR1 gene and surrounding areas measured by r2 for SNPs in the Centre d'Etude du Polymorphisme Humain (CEPH) samples. Each box represents r2 (% LD) between SNP pairs, as generated by Haploview (Whitehead Institute for Biomedical Research). Boxes in black without numbers represent complete LD; r2=100%. SNP IDs are public database rs numbers. SNP rs2717162 is circled: it is in LD with only one other SNP located in the same intronic region. On the basis of the HAPMAP data, there appears to be no other known variant that could explain our finding.

To evaluate the possibility that population stratification accounted for some or all of the above results, 358 subjects were typed at 96 AIMs. After running analyses with K between 1 and 4, the value of K that maximized the estimated model Ln P(D) was K=2, with and without data on self-reported population of origin (Pritchard et al, 2000a; Falush et al, 2007). The model reflected the self-reported race data very well (Supplementary Figure 1), with proportion of European ancestry 0.855 (SD=0.32) for self-identified EAs and 0.145 (SD=0.32) for self-identified AAs with K=2. We used the estimated cluster membership under models of K=2 and 3 as covariates in a GLM with craving as the dependent variable. The proportion of chromosomal ancestry in either the K=2 or 3 models for each single subject (cluster) did not account for any significant proportion of the variance (Supplementary Table 4); rs2717162 in GALR1 still reached the Bonferroni-corrected threshold for experiment-wide significance after accounting for ancestry proportions (p=6.52 × 10−6 and 7.5 × 10−6 for K=2 and 3, respectively; Supplementary Table 4).

Our analyses did not identify experiment-wide significant associations between any SNP and FTND, cotinine, age of first cigarette, or cigarettes per day in the EA subpopulation after Bonferroni correction (Supplementary Table 3). However, the GLM revealed nominally significant associations (p<0.02) within the CHRNA5 locus between both rs16969968 and rs684513, and the two measures of ND (FTND and cotinine level, Supplementary Figure 2).

DISCUSSION

The major finding in this study was that one SNP in the GALR1 gene strongly associated with self-reported craving for tobacco experienced during a previous quit attempt, as assessed by the MWS. Evidence from previous reports suggests that craving may be one of the most sensitive and consistent predictors of smoking behavior and relapse (Piasecki, 2006). This finding could therefore be important for understanding genetic contributions to smokers’ ability to quit once addicted. For the purposes of our study, we analyzed craving separately from other items on the MWS because previous work (Hughes and Hatsukami, 1998) has suggested that craving may reflect distinct central nervous system pathways and genes from other measures (Teneggi et al, 2005; Koob and Volkow, 2010).

rs2717162 is an intronic SNP in the GALR1 gene. Interestingly, in silico analysis of 35 bp of sequence surrounding rs2717162 predicted four transcription factor binding sites for the T allele that were not predicted for the C allele. Conversely, the presence of the C allele predicted a distinct transcription factor binding motif (Supplementary Table 5). This observation raises the possibility that rs2717162 is a functional variant that influences expression of the gene. The hypothesis that the SNP directly influences gene expression is an intriguing one that will require experimental evaluation in transient-transfection/reporter gene assays, chromatin immunoprecipitation, or other paradigms. Our analysis of the SNP data from this study, and the data available in HAPMAP, shows that there are no known common SNPs in strong LD with rs2717162, suggesting the possibility that rs2717162 could be directly involved in heritable differences in craving for tobacco. Alternatively, an as-yet undiscovered variant(s), influencing the function of GALR1 and leading to differences in craving, could be in LD with rs2717162. Deep re-sequencing studies in samples from opposite homozygotes at rs2717162 are warranted to evaluate that hypothesis.

The association of rs2717162 with tobacco craving appears unlikely to be due to population stratification, as the proportion of chromosomal ancestry, estimated using a previously validated panel of AIMs, neither associated with craving score nor reduced the significance of the SNP association when added to the model as a covariate.

GALR1 is widely expressed throughout the brain and in particular in the cortico-limbic regions implicated in emotional behavior (Holmes et al, 2003). GAL is co-expressed with acetylcholine and it can inhibit acetylcholine release. GAL inhibits glutamate, but not GABA, release in the hippocampus (for a review see Picciotto et al (2010)). We speculate that a functional change in GALR1 could lead to a decrease in dopamine release in the VTA and a resultant decrease in drug reward. Moreover, GALR1 is highly expressed in the amygdala; human neuroimaging studies have shown that this brain area, together with the prefrontal cortex, is critical for drug- and cue-induced craving in human beings (Franklin et al, 2007). Taken together, these data support the hypothesis that GALR1 and GAL can modulate the craving phenotype.

Few GWAS studies have focused on measures of ND that correspond to the ability to quit smoking, despite the obvious clinical and public health importance of such phenotypes. Uhl et al (2007, 2008) identified several candidate genes comparing successful vs unsuccessful smoking cessation. Those loci were involved in cell adhesion, enzymatic, transcriptional regulation, and DNA/RNA protein binding. Drgon et al (2009) reported a GWAS study comparing individuals who had successfully quit smoking to those who had not, and found some overlap with the work of Uhl and co-workers (2007, 2008) identifying genes involved in cell adhesion and cell-to-cell communication processes. None of these previous studies, to our knowledge, identified GALR1 as a candidate gene. However, these studies compared two groups (successful vs unsuccessful quitters) and none of the previous studies examined craving during a previous quit attempt as the phenotype.

Numerous studies have recently shown an association between SNPs within the CHRNA5/A3/B4 cluster at 15q24 and several measures of nicotine addiction (Bierut et al, 2008, 2009; Stevens et al, 2008; Saccone et al, 2007; Thorgeirsson et al, 2008; Hung et al, 2008). Moreover, a recent GWAS meta-analysis of 74 053 subjects, using four smoking phenotypes, with follow-up analyses in more than 140 000 subjects for the most significant regions, revealed three loci associated with CPD: rs1051730 in CHRNA3 (10−73), and rs16969968 (10−72) and rs684513 (10−8) in CHRNA5 (Tobacco and Genetics Consortium, 2010). Our study supported a nominally significant association of rs16969968 and rs684513 with FTND, but not CPD. Given the extensive data already supporting associations between the foregoing SNPs and ND-related phenotypes, we cautiously conclude that the modest associations observed here may represent true associations even though they did not meet Bonferroni-corrected criteria for experiment-wide significance.

Strengths of this work include the fact that subjects were recruited for smoking cessation studies, and thus represent a unique subpopulation of the smoking population. The carefully assessed phenotypes available from each subject allowed us to examine understudied phenotypes such as withdrawal symptoms and craving, which are germane to the ability to quit. Our results suggest that the measure of craving might be an important phenotype to examine in other samples of ND subjects.

An important limitation of this study was that we did not have a demographically matched control sample of non-smokers, and hence were confined to analyzing within-case phenotypes. We therefore cannot draw any conclusions about the potential association of GALR1 to ND per se. Another limitation is the modest sample size, although we note that the GALR1-craving association observed here would have approached genome-wide significance in the context of a typical GWAS. Although our sample size was clearly too small to support genome-wide analysis, it was larger than most previous candidate–gene studies (Uhl et al, 2010; Conti et al, 2008; Kim et al, 2009), and as the power calculations presented above indicate, adequate to detect the effect we observed for GALR1. Replication of the result in independent samples is clearly necessary. Finally, the MWS craving item used in this study captured the subject's recollection of craving during previous quit attempts. We focused on that method of assessment because it maximized the number of subjects from whom useful data could be gathered, and hence maximized our statistical power. Future studies of craving rated during a quit attempt would be valuable.

In conclusion, the results of this study, together with preclinical evidence, suggest sequence variation at GALR1 associates with baseline craving for tobacco in smokers motivated to quit. If confirmed, this association would imply that GAL and its receptors might be useful therapeutic targets for the pharmacological treatment of ND.