INTRODUCTION

Cigarette smoking is the primary risk factor for many chronic diseases (Bergen and Caporaso, 1999). Genetic epidemiological studies have estimated the heritability of nicotine dependence to range from 40 to 75% (Broms, 2007; Li, 2003; Maes, 2004; Vink, 2005; Lessov, 2004). Recent candidate gene association studies (Baker et al, 2009; Bergen et al, 2009; Bierut et al, 2008; Breitling et al, 2009a, 2009b; Chen et al, 2008, 2009; Etter et al, 2009; Freathy et al, 2009; Greenbaum et al, 2006; Hoft et al, 2009; Keskitalo et al, 2009; Le Marchand et al, 2008; Li et al, 2008, 2009; Perkins et al, 2008a, 2008b, 2009; Philibert et al, 2009; Rigbi et al, 2008; Saccone et al, 2009a, 2009b; Sherva et al, 2008; Spitz et al, 2008; Stevens et al, 2008; Wang et al, 2009; Weiss et al, 2008), and genome-wide association scans (GWASs) (Drgon et al, 2009a, 2009b; Liu et al, 2010, 2009; TAG, 2010; Thorgeirsson et al, 2008, 2010), with many published candidate gene and GWAS studies through 2009 referenced by Wang and Li (2010), have searched for, and, at varying levels of significance, identified, common variants associated with measures of response to tobacco, tobacco consumption, nicotine dependence, nicotine metabolism, or smoking cessation. However, most of the attributable fraction assumed to be due to heredity cannot be explained by these common single-nucleotide polymorphisms (SNPs), suggesting that rare variants in candidate genes, ie, those with minor allele frequencies (MAFs) substantially <5%, may also contribute to complex disease (Bodmer and Bonilla, 2008; Frazer et al, 2009; Schork et al, 2009).
In contrast to common SNP associations, which typically exhibit odds ratios in case–control studies substantially less than two, rare variants at the genetic loci known to contribute to specific diseases exhibit substantially greater odds ratios, leading to the suggestion that despite their reduced allele frequency, rare variants might account for a substantial minor fraction of the population attributable risk due to genetic variation (Bodmer and Bonilla, 2008). Recent technological advances in high-throughput, next-generation sequencing enable direct sequencing of candidate genes, which is the optimal way to identify rare variants (Harismendy et al, 2009). A number of methods for identifying associations between rare variants and common diseases have been developed (Li and Leal, 2008; Madsen and Browning, 2009; Schork et al, 2008), which share the approach of combining information obtained from multiple variants.

In light of recent evidence from GWAS and candidate gene results, we chose 10 presynaptic neuronal nicotinic acetylcholine receptor (nAChR) subunit genes (namely CHRNA2, CHRNA3, CHRNA4, CHRNA5, CHRNA6, CHRNA7, CHRNB1, CHRNB2, CHRNB3, and CHRNB4) for resequencing to identify both common and rare variants for association analyses with measures of smoking behavior collected in a sample of treatment-seeking smokers involved in a randomized behavioral and pharmacological intervention for smoking cessation (Swan et al, 2010). Collectively, these genes code for receptors that bind the exogenous ligand nicotine, responsible for nicotine addiction (Changeux, 2009; Dani and Bertrand, 2007; Stolerman and Jarvis, 1995). On the basis of recent studies in the literature (Saccone et al, 2009a; Saccone et al, 2007), we hypothesized that common variation in these 10 nAChR subunit genes might contribute to variation in a commonly assessed measure of nicotine dependence, the Fagerström test for nicotine dependence (FTND) (Heatherton et al, 1991). In addition, on the basis of the emerging appreciation for the potential of rare variants to contribute to common disease (the common disease:rare variant hypothesis), we hypothesized that rare variation at these genes might also contribute to variation in this measure of nicotine dependence in this same sample. To evaluate these hypotheses, we employed a recently developed method and a traditional method (Margulies et al, 2005; Sanger et al, 1977) of resequencing to identify both common and rare sequence variations, and several recently developed methods for analysis of sequence variation that assess the association between common and/or rare variants and phenotype by combining information obtained from multiple variants (Li and Leal, 2008; Madsen and Browning, 2009; Schork et al, 2008).
Herein, we assess the contribution of common and/or rare sequence variation identified in 10 nAChR subunit genes in a sample of treatment-seeking smokers with the FTND score collected at the time of intake in a randomized smoking cessation trial.

MATERIALS AND METHODS

Population and Phenotype

Current smokers (≥10 cigarettes per day over the past year, N=1202) were recruited from Group Health (GH), a consumer-governed nonprofit health-care organization that serves >630 000 residents of Washington and Idaho, to participate in a smoking cessation randomized behavioral intervention combined with the FDA-approved course of varenicline. Recruitment and treatment methods for the Comprehensive Medication Program and Support Services (COMPASS) study, a randomized trial (NCT00301145) sponsored by the National Cancer Institute, have been described previously (Swan et al, 2010). Nicotine dependence was measured at baseline with the FTND. Scores on this scale range from 0 to 10 with a higher score indicating greater dependence; although the FTND has suboptimal psychometric properties, it continues to be used in part because of the amount of previous clinical, descriptive, and genetic research that has used it, and in part because of the modest associations observed with the clinical outcome of smoking cessation (Piper et al, 2006). DNA was extracted from saliva samples provided by COMPASS participants (Nishita et al, 2009), and DNA obtained from 448 individuals who self-identified as belonging to non-Hispanic White race/ethnicity, who had never used varenicline previously, and who had questionnaire data on smoking habits, was used for genetic analyses. All protocols were reviewed and approved by the Institutional Review Boards of GH and SRI International.

454 Resequencing

We designed target regions (amplicons) to amplify nAChR subunit gene exons and flanking regions including splice sites and selected 5′ and 3′ untranslated regions (UTRs) and SNPs upstream and downstream of transcription units. Primer design, sample preparation, and sequencing followed the protocols supplied by Roche for the 454 Genome Sequencer System (Supplementary File 1). We used the Roche GS Reference Mapper alignment tool to align sequence reads to a target consisting of the amplified intervals from the reference genome (NCBI Build 36). We separately used BLAST (Altschul et al, 1990) to align each read against the entire reference genome, and removed reads with the best BLAST hits that did not match the GS Reference Mapper alignment to the target. Candidate single-nucleotide variants (SNVs) were identified by parsing the GS Reference Mapper alignments and identifying positions at which at least one individual had reads of the same alternate base in forward and reverse orientations. We then used a Bayesian algorithm similar to that used in the study by Craig et al (2008) to assign genotypes and quality scores to all sequenced individuals at each putative variant position based on read counts for the reference and alternate base. This approach assigns genotypes taking into account genotype frequencies and error rates estimated at each variant position. Genotypes that were less than twice as likely as the next most likely genotype were coded as missing. Positions that were substantially out of Hardy–Weinberg equilibrium (P<10^−7) were excluded from analysis; note that the Hardy–Weinberg equilibrium test has power to detect disequilibrium only at common SNPs.
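As a rough illustration of the genotype-assignment step, the following Python sketch computes posterior genotype probabilities from reference and alternate read counts under a simple binomial read model, and codes a genotype as missing when it is less than twice as likely as the next most likely genotype. It is not the algorithm used in this study or by Craig et al (2008). The function name, the fixed error rate, and the assumed alternate allele frequency are hypothetical placeholders, whereas the study estimated genotype frequencies and error rates at each variant position.

```python
from math import comb


def call_genotype(ref_reads, alt_reads, alt_freq=0.05, error_rate=0.01, min_ratio=2.0):
    """Return (genotype, posteriors); genotype is None when the call is ambiguous.

    alt_freq and error_rate are illustrative assumptions, not values from the study.
    """
    n = ref_reads + alt_reads
    q = alt_freq
    # Hardy-Weinberg genotype priors from the assumed alternate allele frequency.
    prior = {"ref/ref": (1 - q) ** 2, "ref/alt": 2 * q * (1 - q), "alt/alt": q ** 2}
    if n == 0:
        return None, prior  # no reads, so no call is attempted
    # Probability that a single read shows the alternate base under each genotype.
    p_alt = {"ref/ref": error_rate, "ref/alt": 0.5, "alt/alt": 1.0 - error_rate}
    unnormalized = {
        g: prior[g] * comb(n, alt_reads) * p_alt[g] ** alt_reads * (1 - p_alt[g]) ** ref_reads
        for g in prior
    }
    total = sum(unnormalized.values())
    posterior = {g: v / total for g, v in unnormalized.items()}
    ranked = sorted(posterior, key=posterior.get, reverse=True)
    best, second = ranked[0], ranked[1]
    # Code the genotype as missing unless it is at least min_ratio times as likely
    # as the next most likely genotype.
    if posterior[best] < min_ratio * posterior[second]:
        return None, posterior
    return best, posterior


if __name__ == "__main__":
    for counts in [(20, 0), (10, 10), (0, 20), (2, 1)]:
        print(counts, call_genotype(*counts)[0])
```

With these placeholder settings, a position covered by only three reads (two reference, one alternate) is left uncalled, which illustrates why low read depth reduces call rates.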

Sample Quality Control

We excluded 18 DNA samples with overall call rates <90%. The amplified regions included 29 SNPs in the Phase II HapMap release, 24 that were polymorphic in the CEU (Utah residents with Northern and Western European ancestry from the CEPH collection) panel. All of these SNPs were identified as polymorphic in the sequencing data, although four failed a quality control (QC) threshold as a result of fold coverage <10×. Allele frequencies for the remaining 25 were within 6% of HapMap CEU frequencies. We included 10 additional amplicons containing 1 QC SNP each and a single Y-chromosome amplicon as control amplicons. Each of the sequenced DNA samples was also genotyped with a custom high-density oligonucleotide array containing 312 SNPs that contained the 10 QC SNPs (Hinds et al, 2004). Principal components analysis of the SNP array data did not indicate the presence of population structure, and confirmed that all samples were of European ancestry. No samples were excluded from association analyses because of a high discordance rate between the genotypes for the 10 QC SNPs from the array and sequencing data sets (data not shown).

Sanger Resequencing

Selected regions of CHRNB2 and CHRNA4 with low read depth (<10×) from 454 resequencing were resequenced using Sanger sequencing methods (Sanger et al, 1977), primers as developed by Weiss et al (2008) or as optimized in this study, enzymatic methods for PCR as optimized in this study, and BigDye Terminator v3.1 Cycle Sequencing protocols on an Applied Biosystems 3130xl Genetic Analyzer. Variant detection was performed using Mutation Surveyor v3.24 (SoftGenetics, LLC, State College, Pennsylvania), Reference Sequence (RefSeq) NM_000748.2 for CHRNB2 and NM_000744.5 for CHRNA4 and reference files generated in the Mutation Surveyor using the RefSeq and the ‘generate synthetic chromatograms’ function. See Supplementary File 1 for further details.

Association Analyses

Common variants (MAF ≥5%) were analyzed separately for association with normally distributed FTND using a generalized linear model. Genotypes were coded in additive (Add) or dominant (Dom) models. The significance of regression models is reported for each SNP and with adjustment for correlated tests (PACT) (Conneely and Boehnke, 2007) for the most significantly associated SNP within each gene and model to a PACT value threshold of 0.10. For rare variants, gene-based association tests were performed by the cohort allelic sum test (CAST) and by the weighted sum statistic (WSS) (Madsen and Browning, 2009). CAST tests for association between FTND and counts of rare alleles were based on two fixed thresholds for ‘rareness’ (MAF <0.01 and <0.05). The WSS was used to test for association between FTND and weighted counts of minor alleles (MAF <0.05), with an inverse relationship between weights and the MAF of minor alleles. Both tests are applied only to rare variants under the assumption that rare variants are more likely to be deleterious than common variants (Kryukov et al, 2009). Linear regression coefficients, P-values from likelihood ratio tests, and empirical P-values from permutation testing are reported. Multivariate distance-based matrix regression (MDMR) was also employed to test associations of common and rare (MAF <0.05) variants with FTND, with either identical-by-state allele sharing across individuals and variants in each gene, or allele sharing weighted by the Lynch–Ritland calculation, with 100 000 permutations. The latter approach gives more weight to rare variants (Nievergelt et al, 2007; Schork et al, 2008; Wessel and Schork, 2006). For nAChR subunit genes that are clustered in the genome (CHRNB3 and CHRNA6 at chr8p11, and CHRNA5, CHRNA3, and CHRNB4 at chr15q25.1), we also performed association analyses to evaluate all sequenced SNPs available in these genes together.
When MDMR analyses with both common and rare SNPs identified a significant association, we performed and report results from two post hoc tests: analysis of rare SNPs alone and analysis of common SNPs alone. Power to detect the effect sizes given, for a two-sided test with α=0.05, was estimated using Quanto software (Gauderman, 2002). Pairwise linkage disequilibrium (LD) (D′ and r²) values were calculated from the COMPASS sequence data using Haploview (Barrett et al, 2005). Genes and SNPs are ordered in tables by the chromosome and coordinate, respectively.
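The general idea behind collapsing rare variants within a gene and assessing significance by permutation can be sketched as follows. This is a simplified burden-style illustration in Python, not the published CAST or WSS procedures and not the code used for the analyses reported here. The function names, the use of a least-squares slope as the test statistic, and the number of permutations are assumptions made for the example.

```python
import random


def rare_allele_burden(genotypes, mafs, maf_threshold=0.01):
    """Count minor alleles per individual across variants with MAF below the threshold.

    genotypes: one row per individual, one 0/1/2 minor-allele count per variant.
    mafs: minor allele frequency of each variant, in the same column order.
    """
    rare_columns = [j for j, maf in enumerate(mafs) if maf < maf_threshold]
    return [sum(row[j] for j in rare_columns) for row in genotypes]


def slope(x, y):
    """Least-squares slope of y regressed on x (0.0 if x does not vary)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        return 0.0
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx


def permutation_p_value(burden, phenotype, n_perm=1000, seed=0):
    """Empirical two-sided P-value obtained by shuffling phenotypes across individuals."""
    rng = random.Random(seed)
    observed = abs(slope(burden, phenotype))
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(slope(burden, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Calling `rare_allele_burden` with `maf_threshold=0.01` or `0.05` corresponds to the two fixed 'rareness' thresholds described above. A weighted variant of the same sketch would multiply each allele count by a weight that decreases with MAF before summing.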

RESULTS

nAChR Exome Sequencing and SNP Discovery

We attempted to resequence all coding exons, splice sites, and selected SNPs at 10 nAChR subunit genes comprising 18 kb of target sequence in 448 European ancestry treatment-seeking participants of the COMPASS clinical trial using 454 sequencing supplemented by Sanger sequencing for two genes (Supplementary File 1). For the 10 nAChR subunit genes sequenced by 454 technology, 66% of targeted base pairs in these amplicons had >10 reads, with 3 genes accounting for most of the shortfall in target coverage (CHRNA4, 0%; CHRNB2, 22%; and CHRNA7, 39%, vs an average of 81% for the 7 other genes; see Table 3 and Figure 1 in Supplementary File 1). A total of 167 unique SNPs were identified through 454 sequencing. Before analyses, we excluded 18 DNA samples and 24 SNPs that had call rates <90% and 5 SNPs that were monomorphic in the remaining 430 samples. The mean (SD) Sanger amplicon resequencing read depth was 1.98 (0.20) and 1.88 (0.43) for CHRNB2 and CHRNA4, respectively, and SNPs were scored using default base-calling parameters. Sanger resequencing identified 15 SNPs at CHRNB2 and 32 SNPs at CHRNA4 (Supplementary File 1 and Table 1). Six SNPs found in CHRNA4 exon 5 and CHRNB2 exon 5 by 454 resequencing were not found by Sanger resequencing and hence were excluded. One SNP, found by both methods, was unreliable in both methods, because of poor coverage in 454 resequencing and unclear reads in Sanger resequencing and hence was also excluded. Using the two methods, a total of 172 SNPs were identified in 430 samples (Supplementary Table 1). The observed distribution of MAFs in the total sample is strongly skewed to low MAF values (Figure 1 and Supplementary Table 1), in which 73.9% have MAF <5% and 65.3% have MAF <1%, with MAFs ranging from 0.1 to 49.2%.
Percentages and types of variants identified (Supplementary Table 1) are 4.1% 5′ of the 5′ UTR, 4.1% 5′ UTR, 51.4% coding (26.0% nonsynonymous, 25.4% synonymous), 23.1% intronic, 14.5% 3′ UTR, and 2.9% 3′ of the 3′ UTR.
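For readers unfamiliar with how the MAF summaries above are derived, the following minimal Python sketch computes a MAF from genotype calls and the fraction of SNPs falling below a frequency threshold. The genotype coding (0, 1, or 2 copies of the alternate allele, with None for a missing call) and the function names are assumptions for illustration and do not describe the software used in this study.

```python
def minor_allele_frequency(genotypes):
    """MAF from 0/1/2 alternate-allele counts; None marks a missing genotype."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None  # no usable calls at this position
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1 - alt_freq)


def fraction_below(mafs, threshold):
    """Fraction of SNPs whose MAF is below the given threshold."""
    return sum(maf < threshold for maf in mafs) / len(mafs)


if __name__ == "__main__":
    # Hypothetical MAFs for five SNPs, used only to show the calculation.
    example_mafs = [0.002, 0.004, 0.03, 0.12, 0.45]
    print(fraction_below(example_mafs, 0.01), fraction_below(example_mafs, 0.05))
```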

Figure 1

Minor allele frequencies (x axis), proportion (left y axis) and cumulative fraction (right y axis) of nAChR SNPs identified through resequencing. Bars represent the percentage of SNPs within each category of minor allele frequency (MAF) with the figure under each bar being the maximum MAF within that category. The line represents the cumulative frequency of SNPs.

Table 1 Common nAChR Subunit Gene Variant Associations with the FTND Score

Treatment-Seeking Smoker Characteristics

The 430 individuals included in association analyses of nAChR subunit gene SNPs and the FTND score are mostly female (68%), with a mean (SD) age of 49.0 (11.3) years and a mean (SD) of 20.4 (8.5) cigarettes smoked per day. In this sample of 430 treatment-seeking smokers, gender and age were not significantly associated with FTND (P>0.05). The sample had a mean (SD) FTND score of 5.2 (2.1), indicating moderate addiction to tobacco (Heatherton et al, 1991). This mean FTND score is higher than that of population-based samples (Fagerstrom and Furberg, 2008), but typical of treatment-seeking smokers from this patient population (Swan et al, 2003; Swan et al, 2010) and other treatment-seeking smoker samples (Bergen et al, 2009).

Common Variation Association Analyses

In all, 44 SNPs with MAF ≥5% were tested for association with FTND using two genetic models (additive and dominant). Significant (P<0.05) unadjusted associations were found between CHRNB1 (rs2302764 and rs7210231, D′=1, r²=0.049) and CHRNB2 (rs2072660 and rs2072661, D′=1, r²=0.918) SNPs with FTND (Table 1). After adjustment for multiple correlated tests within each gene, significant associations remained between two CHRNB2 SNPs located in the 3′ UTR and the FTND (rs2072660, PACT, Additive=0.013, and rs2072661, PACT, Additive=0.009). Individuals with one or two copies of the minor allele of these CHRNB2 SNPs exhibit a 0.6 unit increase in the mean FTND score (for rs2072661, FTND mean (SD)=5.5 (0.2) vs 4.9 (0.1)). After adjustment for correlated tests, the minor allele of the CHRNB1 SNP rs7210231 was nonsignificantly (PACT, Dominant=0.082) associated with a decrease in the FTND score. In a post hoc test of mean FTND scores, individuals with the rs7210231 heterozygote genotype exhibit significantly lower mean FTND scores than do individuals with homozygote genotypes (heterozygote mean (SE)=4.70 (0.19), major allele homozygote=5.77 (0.50), minor allele homozygote mean (SE)=5.36 (0.13), with heterozygote to major homozygote P=0.044 and heterozygote to minor homozygote P=0.004).
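A pair of SNPs can show complete LD by D′ while having a low r², as reported above for the two CHRNB1 SNPs, when their allele frequencies differ markedly. The short Python sketch below computes both statistics from haplotype and allele frequencies to make this concrete. The frequencies in the example are hypothetical and are not the COMPASS estimates, and the function assumes both SNPs are polymorphic.

```python
def ld_statistics(p_ab, p_a, p_b):
    """Return (D', r^2) from the A-B haplotype frequency and the allele frequencies of A and B."""
    d = p_ab - p_a * p_b
    # The maximum attainable |D| depends on the sign of D and the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical case: allele A (frequency 5%) is found only on haplotypes
    # carrying allele B (frequency 50%).
    d_prime, r_squared = ld_statistics(p_ab=0.05, p_a=0.05, p_b=0.50)
    print(d_prime, round(r_squared, 3))
```

In this hypothetical case only three of the four possible haplotypes occur, so D′ equals 1, yet r² is close to 0.05 because the rarer allele predicts the common one far better than the reverse.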

Common and Rare Variation Analyses

Associations between rare SNPs within a gene and the FTND are reported in Table 2 for WSS (MAF <0.05) and CAST (two thresholds, MAF <0.01 and <0.05) analyses, and associations with common and rare (MAF <0.05) SNPs and FTND score are reported in Table 3 for MDMR analyses, along with post hoc MDMR tests. Significant (P<0.05) association with rare variation in CHRNA4 and the FTND score was observed using the CAST tests, but not the WSS test (Table 2). Significant associations between common and rare variants and the FTND score were identified at CHRNB2, CHRNA5, and at the chr15q25.1 nAChR locus using the MDMR method (Table 3). When either common or rare variants at these genes were removed before association analysis, associations between rare or common SNPs at CHRNA5 and at the chr15q25.1 nAChR locus and the FTND score, and those between rare CHRNB2 SNPs and the FTND score, were no longer significant. Some association results that did not reach the threshold of statistical significance used herein, eg, P-values between 0.05 and 0.10, suggest that common variation at CHRNA3 (Table 1), rare variation at CHRNB2 and CHRNA6 (Table 2), common and rare variations at CHRNA2 (Table 3), and common and/or rare variation at CHRNB1 (Tables 1 and 3) may also contribute to the FTND score in this sample of treatment-seeking smokers. We note that CHRNA6 and CHRNA7 had very limited numbers of variants available for analysis, and thus tests for association at these genes that combine variants will have low power.

Table 2 Rare nAChR Subunit Gene Variant Associations with the FTND Score
Table 3 Common and Rare nAChR Subunit Gene Variant Associations with the FTND Score

DISCUSSION

We resequenced the exons of 10 highly relevant candidate genes for the study of nicotine dependence (Changeux, 2009), using DNA extracted from saliva (Nishita et al, 2009), provided by treatment-seeking participants of a randomized smoking cessation trial (Swan et al, 2010), using a next-generation sequencing method (Margulies et al, 2005), together with the Sanger method (Sanger et al, 1977). We identified sequence variation and employed standard regression and specialized analysis methods (Madsen and Browning, 2009; Wessel and Schork, 2006) for analyzing association of this variation with the FTND score. The latter analysis methods group SNPs from a gene together into a summary variable to test for an excess of variants associated with the phenotype (Li and Leal, 2008). We selected methods to analyze only common (regression), only rare (CAST and WSS), and both common and rare (MDMR) variations at a gene to evaluate the possible influence of common and rare variations under the common disease:common variant and the common disease:rare variant hypotheses (Bodmer and Bonilla, 2008; Schork et al, 2009).

In the sample of 430 DNA samples obtained from treatment-seeking smokers and 173 nAChR subunit gene SNPs analyzed for association, we observed that common variants at CHRNB2 and rare variants at CHRNA4 were significantly associated with the FTND score, after correction for multiple correlated tests and by permutation testing within each gene, respectively. Employment of a method (allele sharing and weighted allele sharing MDMR with permutation testing) that assesses the contribution of common and rare variants combined, and post hoc analyses with rare or common variants alone, identified a significant association with variation in the FTND score at CHRNB2, CHRNA5, and at the chr15q25.1 nAChR locus with both types of variation, and with the combination of common variants alone at CHRNB2.

There are at least two possible mechanisms underlying the latter findings described above, ie, the observation that the combination of common SNPs and rare variants, or of common SNPs alone, at a gene or gene region, are associated with FTND, but that rare variants do not exhibit association. One possible mechanism is the interaction between common variants at the same gene or within the same gene region, which is the mechanism that has been elucidated by the studies by Li et al for the chr15q25.1 nAChR cluster, in which the interaction among CHRNA3 SNPs was observed in association with measures of nicotine dependence (Li et al, 2009; Lou et al, 2008). Formal interaction analyses within genes or within multigene regions will be necessary to explore the possible role of this mechanism in the observed association of common and rare or common variants alone with the FTND score in the COMPASS sample. Another possible mechanism is the phenomenon of common SNPs acting as tagging SNPs for multiple rare causal variants that exist in the same LD region, ie, so-called synthetic associations (Dickson et al, 2010; Wang et al, 2010). When we removed common SNPs from the association test as we did when testing only rare variants in the MDMR analysis, we may have removed the only SNPs that have the power to detect the association of rare causal variants with the FTND score. It is well known that the chr15q25.1 nAChR locus exhibits extensive LD across a region of >200 000 bp, and that the chr1q21.3 CHRNB2 gene region exhibits strong LD extending 150 000 bp in the HapMap CEU sample 3′ from CHRNB2 through ADAR to the 3′ exons of KCNN3; thus conditions exist for synthetic associations to be present at the chr15q25.1 and chr1q21.3 nAChR loci. Resequencing of selected individuals based on a common SNP allele or haplotype content, followed by common and rare variant analyses, will be necessary to test the synthetic association hypothesis in these two regions.

Three other groups have resequenced nAChR subunit genes to identify variants for subsequent association analysis. The first group used DNA obtained from 192 European Americans (144 smokers and 48 nonsmokers sampled from low (0–4) and high (6–10) FTND strata) to resequence 10 genes and identified 262 SNPs (Weiss et al, 2008). They identified the chr15q25.1 nAChR locus to be significantly associated with the FTND strata in 144 smokers (72 high vs 72 low FTND strata) and genotyped 87 tag SNPs selected with MAF >0.05 across the 10 genes in 3 larger cohorts (total N=2827), and a significant association between 6 chr15q25.1 nAChR locus SNPs with the FTND strata was observed among individuals initiating smoking at age ≤16 years. The size of the COMPASS sample is approximately one-sixth that of the sample used by Weiss et al to characterize the effect size of chr15q25.1 nAChR locus haplotypes. Replication of the Weiss et al finding will require the use of a larger sample of treatment-seeking smokers than available in the COMPASS sample. Sabatelli et al (2009) resequenced 3 nAChR subunit genes (namely CHRNA3, CHRNA4, and CHRNB4) in 245 sporadic amyotrophic lateral sclerosis (SALS) cases and 450 controls from Italy identifying 55 SNPs; significantly more missense variants in the SALS cases compared with controls were observed in the intracellular loop region of nAChRs (P=0.0001). In our analyses, we identified and evaluated 45 nonsynonymous variants at 10 nAChR subunit genes for association with the FTND score, most of which are rare (38 with MAF <0.01 and 41 with MAF <0.05). Using several methods to assess the association with rare variants in the COMPASS sample, we identified evidence for a significant association with rare variants at CHRNA4 with the FTND score. We observed a total of 21 rare (20 variants with MAF <0.01 and 1 variant with MAF <0.05) CHRNA4 variants, with 5 nonsynonymous substitutions, including a nonsense variant, in the COMPASS sample analyzed.
Further stratification of association testing of nAChR subunit gene variants in the COMPASS sample by variant type, nAChR structural region, and evaluation of function by in vitro expression and electrophysiology would be necessary to test the finding of Sabatelli et al. Rana et al (2009) resequenced the 3 chr15q25.1 nAChR subunit genes in 80 individuals from 5 ethnicities identifying 63 variants; upon genotyping of 6 tag SNPs, significant associations in 370 European-American twins with CHRNA3 SNPs rs3743075 and rs3743074 (in strong LD) were found with plasma catestatin and systolic blood pressure levels. These common CHRNA3 SNPs were not associated with the FTND score in the COMPASS sample.

It is noteworthy that three recent meta-analyses of GWAS of smoking phenotypes in DNA samples obtained from population and clinically based data sets exceeding 140 000 individuals in toto have identified several common SNPs at the chr15q25.1 nAChR locus (rs55853698 and rs6495308 (Liu et al, 2010); rs1051730 and rs16969968 (TAG, 2010); and rs1051730 (Thorgeirsson et al, 2010)) associated with smoking quantity. In addition, one of these meta-analyses also identified common SNPs at the chr8p11 nAChR locus (rs6474412 and rs13280604 (Thorgeirsson et al, 2010)) associated with smoking quantity. Of these six SNPs, our resequencing study identified only rs1051730 and rs16969968, and neither of these SNPs was statistically significantly associated with the FTND score in our sample. However, we did observe an association with multiple common SNPs at CHRNA5 and at the chr15q25.1 nAChR locus using the MDMR approach, although none of the 16 common chr15q25.1 nAChR SNPs showed a nominally significant association (P<0.05) in individual regression models.

Our finding of a significant association of CHRNB2 common variants with a measure of nicotine dependence is supported by some previous candidate gene studies reporting significant associations with the FTND and related phenotypes in treatment-seeking, in laboratory, and in population and clinically based data sets distinct from the treatment-seeking smokers we studied (Swan et al, 2010). An analysis of 417 treatment-seeking smokers from a placebo-controlled randomized clinical trial of bupropion and placebo (Lerman et al, 2006) reported significant associations with CHRNB2 SNPs rs2072660 and rs2072661 and abstinence in a logistic model incorporating main effects and interaction with treatment at the end of treatment (P=0.01) and at 6 months (P=0.0002) (Conti et al, 2008). A short-term test of nicotine vs placebo patch effects in 156 smokers reported a significant association between rs2072661 and the number of abstinent days (Perkins et al, 2009). In an analysis of CHRNA4 and CHRNB2 SNPs in a sample of 1068 older adolescents, rs2072660 was reported to be significantly associated with sensitivity to tobacco (subjective effects of ‘nauseous’ or ‘dizzy’) (Ehringer et al, 2007).

However, candidate gene studies evaluating the association of CHRNB2 SNPs rs2072660 and/or rs2072661 with smoking status and measures of nicotine dependence have more often failed to identify significant associations. These included studies recruiting individuals, twins, or families from the community, including 2037 individuals from 602 multiplex heavy-smoking pedigrees (Li et al, 2005), 1929 ever-smoking unrelated individuals (Saccone et al, 2009a; Saccone et al, 2007), 872 unrelated members of a population-based twin sample (Silverman et al, 2000), 742 unrelated individuals recruited for behavioral studies (Lueders et al, 2002), 621 heavy-smoking men from 206 families recruited for nicotine addiction studies (Feng et al, 2004), 516 individuals recruited for behavioral studies (Philibert et al, 2009), in addition to treatment-seeking smokers (Bergen et al, 2009). In this last study, a combined analysis of 821 treatment-seeking participants of two smoking cessation trials (Lerman et al, 2006) did not identify significant associations between rs2072660 and rs2072661 and baseline nicotine dependence as measured by the FTND score (uncorrected P-values of 0.280 and 0.195 for rs2072660 and rs2072661, respectively) (Bergen et al, 2009). Gene–gene interaction analyses did not identify significant interactions of CHRNB2 SNPs (including rs2072660 and rs2072661) with SNPs from CHRNA4, NTRK2, and BDNF in association with smoking status in a sample of 191 unrelated smokers and 191 unrelated nonsmokers (Lou et al, 2007), but did identify a significant interaction of CHRNB2 SNPs (including rs2072660 and rs2072661) and CHRNA4 SNPs with smoking status in a sample of 275 unrelated smokers with FTND scores ≥4 and 348 unrelated nonsmokers (Li et al, 2008).

A large candidate gene association study in 1929 ever smokers that included 119 SNPs at 16 nAChR subunit genes reported significant associations with SNPs at the CHRNB3/CHRNA6, CHRNA5/CHRNA3/CHRNB4, and CHRND/CHRNG multigene loci and a dichotomized measure of nicotine dependence (Saccone et al, 2007). A follow-up study using the same sample and phenotype, and analyzing a total of 226 SNPs at 16 nAChR subunit genes, reported significant associations with SNPs within the same multigene loci, and nominal association (not associated after multiple test correction) with SNPs at CHRNB1 and at CHRNA4 (Saccone et al, 2009a). We identified nominal associations between CHRNB1 SNPs and the FTND score in our analyses of 430 treatment-seeking smokers, including at some CHRNB1 SNPs studied by Saccone et al (2009a) and Lou et al (2006); however, these nominal associations became nonsignificant after correction for correlated tests at CHRNB1.

We identified a significant negative association between rare CHRNA4 SNPs and the FTND score in the CAST test, which remained significant after permutation testing. Previous results of candidate gene association with CHRNA4 (Breitling et al, 2009b; Feng et al, 2004; Hutchison et al, 2007; Li et al, 2005; Saccone et al, 2009a) have suggested that multiple common CHRNA4 SNPs may be associated with nicotine dependence-related phenotypes. The lack of a significant association of common CHRNA4 SNPs in this sample with FTND could be because of a lack of power and/or differences in the distribution of common and/or rare SNPs in our sample of treatment-seeking smokers relative to previous samples. Differences in results between different tests of association of rare CHRNA4 variation with the FTND score in this sample (CAST vs WSS) may be because of the large number of rare variants available at CHRNA4, which improves the power of the CAST test. However, the nominal significance observed in the CAST test and the lack of association observed in the WSS and MDMR tests suggest that this result should be regarded as preliminary and in need of replication.

One or more differences in study design, ascertainment criteria, assessment of nicotine dependence, case and control definitions, MAF thresholds, demographic variables, or sample sizes could be contributing to differences in results of association of nAChR subunit gene SNPs with measures of nicotine dependence or smoking status among the various studies. For example, the treatment-seeking smokers in our sample have a minimum past year history of 10 cigarettes smoked per day, are mostly female, have an average age of 49 years and are screened for various physical and mental disorders typical of exclusion criteria for smoking cessation clinical trials. These smoking behavior characteristics differ somewhat from population- or family-based samples ascertained with smoking behavior criteria with lower or higher smoking intensities than those observed among treatment-seeking smokers. It is of interest that FTND scores seemed to be normally distributed (data not shown) in this treatment-seeking smoker sample, as in a previous investigation in this patient population (Swan et al, 2003) and in other treatment-seeking smoker samples recently investigated (Bergen et al, 2009). We note that the sequencing coverage varied across nAChR subunit genes in this study and that some genes had more common and/or rare SNPs identified than other genes in these analyses. Chance is another factor that could be responsible for association differences among studies.

There were limitations to our analysis. One limitation is the use of DNA extracted from saliva of participants in a smoking cessation trial; genotyping performance with saliva-extracted DNA may be reduced by increased nonhuman DNA content (Herraez and Stoneking, 2008). Three genes (namely CHRNA4, CHRNB2, and CHRNA7) exhibited reduced 454 resequencing coverage because targeted regions of these genes have GC content exceeding 80% (CHRNA4: 5′ UTR, exon 1, exon 5, exon 6, 3′ UTR; CHRNB2: 5′ UTR, exon 1, exon 2, exon 5, exon 6, 3′ UTR; CHRNA7: exon 1). For two of these genes (CHRNA4 and CHRNB2), we supplemented next-generation sequencing with Sanger sequencing protocols, which increased the number of rare and common SNPs available for analysis at these genes. The CHRNA7 region contains a partial duplication (Gault et al, 1998), which may have contributed to reduced resequencing coverage of this gene.

Power analyses to discover de novo rare variant associations from sequencing efforts have cited sample sizes from 100 to 10 000 depending on a number of factors, including subject sampling criteria, effect sizes, variant frequency, strength of purifying selection, and the probability of detecting variants (Kryukov et al, 2009; Li and Leal, 2008). Including this study, four groups have resequenced 3–10 nAChR genes in 80–675 individuals, identifying 55–262 SNPs (Rana et al, 2009; Sabatelli et al, 2009; Weiss et al, 2008). Each of the four studies identified previously unreported variants; however, the power to identify associations with rare variants is limited by the small sample sizes, suggesting the need for larger samples and more efficient approaches toward identifying rare, potentially functional variants. Small sample sizes leading to underpowered candidate gene studies have been an often-cited reason for inconsistent results across studies (Ioannidis, 2008). In our sample of 430 individuals with a mean (SD) FTND score of 5.2 (2.1), the power of single SNP tests (two-sided with α=0.05) to detect β-values of 0.5, 1.0, 1.5, and 2.0 with a MAF=0.01 and a dominant model is 11, 28, 54, and 78%, respectively, whereas the post hoc power of the linear regression approach to detect the effect of the common CHRNB2 variants significantly associated with FTND score is 70%.
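
The single SNP power figures quoted above can be approximated with a short calculation. The Python sketch below is illustrative only and is not the software used for the analysis; it rests on assumptions not stated in the text, namely a normal approximation to the test statistic, Hardy–Weinberg proportions for the carrier frequency, and a residual SD equal to the phenotype SD of 2.1. The function name `single_snp_power` is hypothetical.

```python
from statistics import NormalDist

def single_snp_power(beta, n=430, sd=2.1, maf=0.01, alpha=0.05):
    """Approximate two-sided power to detect a dominant-model SNP effect
    on a quantitative trait, using a normal approximation."""
    std_normal = NormalDist()
    # Carrier frequency under a dominant model (assumes Hardy-Weinberg).
    carrier_freq = 1.0 - (1.0 - maf) ** 2
    # Variance of the 0/1 carrier indicator.
    indicator_var = carrier_freq * (1.0 - carrier_freq)
    # Standard error of the regression slope
    # (assumes residual SD is approximately the phenotype SD).
    se = sd / (n * indicator_var) ** 0.5
    # Expected value of the test statistic under the alternative.
    ncp = beta / se
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Probability of rejecting in either tail.
    return std_normal.cdf(ncp - z_crit) + std_normal.cdf(-ncp - z_crit)

for beta in (0.5, 1.0, 1.5, 2.0):
    print(f"beta={beta:.1f}  approximate power={single_snp_power(beta):.2f}")
```

Under these assumptions the approximation lands within about one percentage point of the reported values. With MAF=0.01 only about 2% of the sample (roughly nine of 430 individuals) is expected to carry the minor allele, which is why even an effect of one FTND point is detected less than a third of the time.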

In conclusion, numerous previous GWASs and candidate gene studies support the hypothesis that common variation in several nAChR subunit genes contributes to variation in nicotine dependence as measured by the FTND or by measures of smoking intensity. In our resequencing study of DNA from a sample of treatment-seeking smokers, we found significant evidence for the contribution of common variants at CHRNB2 and rare variants at CHRNA4 to variation in a measure of nicotine dependence. We note that in mouse knockout studies, the β2 and α4 nicotinic receptor subunit genes have been shown to be necessary and sufficient to observe the reinforcing properties of nicotine (Mineur and Picciotto, 2008), and that the human α4β2 nAChR is upregulated upon nicotine administration (Buisson and Bertrand, 2001), providing support for the involvement of these two nAChR subunits in the etiology of nicotine dependence. The heteropentameric structure of the nAChR suggests that future gene–gene interaction analyses of nAChR subunit variants will be a useful approach to identify nAChR sequence variation influencing smoking behavior. Analysis of both common and rare variants in the COMPASS samples suggests that the combination of common and rare variants at CHRNB2, at CHRNA5, and at the chr15q25.1 nAChR locus also contributes to nicotine dependence. The chr15q25.1 and CHRNA5 findings are concordant with multiple previous association results in both GWASs and candidate gene studies, whereas the CHRNB2 finding may reflect the common SNP association findings already observed in this sample with SNP-wise analysis.
Resequencing of DNA from nAChR subunit genes targeting coding and regulatory regions, using larger sample sizes and multiple approaches to statistical genetic analysis, including analyses of interaction and synthetic association hypotheses, will be necessary to further characterize the relationship between common and rare nAChR subunit gene sequence variation and smoking-related behaviors in treatment-seeking participants of smoking cessation trials.