INTRODUCTION

Nicotine addiction continues to be the largest modifiable risk factor for morbidity and mortality in developed countries (Bergen and Caporaso, 1999) because of its causative link to cancer, cardiovascular, and respiratory diseases. The attributable risk of lung cancer due to smoking is 90%, with the burden of lung cancer being greater in African Americans than in European Americans (Haiman et al, 2006). The average annual age-adjusted incidence and mortality rates of lung cancer per 100 000 for 1975–2007 were 81.8 and 63.4, respectively, in African Americans, compared with 64.1 and 53.9 in European Americans (Ries et al, 2008). Recent genome-wide association studies (GWAS) have identified several genetic variants that influence nicotine intake, including a strong, replicated association between genetic variants in the chromosome 15 nicotinic receptor subunit cluster and smoking quantity in European Americans (Liu et al, 2010; Thorgeirsson et al, 2008; Thorgeirsson et al, 2010; Tobacco and Genetics Consortium, 2010).

Although genetic association studies have succeeded in identifying genetic variants on chromosome 15 that influence tobacco use, self-reported measures of smoking may not accurately reflect nicotine intake. Cotinine is the major metabolite of nicotine, with 75–80% of nicotine transformed to cotinine (Hukkanen et al, 2005; Swan et al, 2009). It has a longer half-life than nicotine, and it is widely used as a biomarker for nicotine exposure. Serum cotinine levels provide a more accurate measure of nicotine intake than self-reported cigarettes per day, and both the genetic (Keskitalo et al, 2009) and non-genetic (Gorber et al, 2009) literature have recommended serum cotinine over cigarettes per day for research.

Cotinine levels vary across populations. After controlling for number and yield of cigarettes, Wagenknecht et al (1990) observed that African American smokers have significantly higher serum cotinine levels in comparison with European American smokers, a finding that has since been independently replicated (Caraballo et al, 1998). To date, there are no published genetic association analyses of serum cotinine that evaluate and compare the genome in these two racial groups. We therefore conducted a genetic association study of serum cotinine levels in African and European Americans using the ITMAT-Broad-Candidate Gene Association Resource (CARe) (IBC) chip, which includes genetic variants in 2100 genes. This approach allows for the examination of the effect of genetic variation on nicotine intake as measured by cotinine levels rather than self-reported cigarette consumption and for the comparison of variation in effects across these two populations. This comparison is important because of the potential for differences across populations in linkage disequilibrium patterns, disease allele frequency, genetic effect size, and rare variant effects.

MATERIALS AND METHODS

Study Participants

Participants were a part of The National Heart, Lung, and Blood Institute's Coronary Artery Risk Development in Young Adults ‘CARDIA’ Study. CARDIA was designed to examine the development, determinants, and risk factors of clinical and subclinical cardiovascular disease. A total of 5115 young adult African and European American men and women completed the baseline examination in 1985–1986. The participants were selected so that there would be approximately the same number of people in subgroups of race, gender, education (high school or less and more than high school), and age (18–24 and 25–30) in each of four centers: Birmingham, AL; Chicago, IL; Minneapolis, MN; and Oakland, CA. Periodic follow-up examinations were held through 2006 with high participant retention (72–90%). Additional details can be found on the CARDIA website (http://www.cardia.dopm.uab.edu/).

This report focuses on participants with baseline serum cotinine levels who reported, on an interviewer-administered questionnaire, current regular smoking of at least five cigarettes per week almost every week for the past 3 months. Subjects reporting current use of nicotine gum, cigar, or pipe, and subjects with cotinine values of zero were excluded from the analysis. The final sample included subjects with available phenotype and genotype data in African Americans (N=365) and European Americans (N=315). In the African American cohort, 43.6% of subjects were male and the mean age of the participants was 25.1 (SD=±3.6). In the European American cohort, 44.4% of subjects were male and the mean age of the participants was 25.3 (SD=±3.4).

Phenotype

Blood for cotinine level analysis was collected under a standardized protocol. A 1 ml aliquot of serum was frozen and shipped to the Clinical Biochemistry Laboratories at the American Health Foundation. Cotinine was determined by radioimmunoassay using the method of Haley et al (1983) after a modification of the method described by Van Vunakis et al (1987). A 3.4% sample of randomly selected blind duplicates was submitted by the field centers over the entire study period for an external assessment of quality. Cross-classification of the duplicate samples of both smokers and non-smokers yielded a 91% (177/194) exact agreement rate. Internal quality control was maintained as well; the inter-assay coefficient of variation was 7%.

The mean value of serum cotinine levels in European Americans was 194.1 (SE=±7.8), whereas in African Americans it was 236.5 (SE=±8.1). The mean of the self-reported cigarettes per day in European Americans was 16.6 (SE=±0.6), whereas in African Americans it was 10.5 (SE=±0.4). A t-test showed that the means of both serum cotinine (p<0.001) and cigarettes per day (p<0.001) differed significantly between the two populations. The cotinine finding agrees with previous studies, which showed that African Americans have higher serum cotinine levels than European Americans (Caraballo et al, 1998; Clark et al, 1996; English et al, 1994; Pattishall et al, 1985; Wagenknecht et al, 1990). Some researchers have suggested that this difference is attributable, at least in part, to nicotine metabolism (see Supplementary Figure 2 for nicotine metabolism; Nakajima et al, 1996; Perez-Stable et al, 1998). Nicotine and cotinine are metabolized primarily by the enzyme CYP2A6 (Cashman et al, 1992; Nakajima et al, 1996). CYP2A6 is highly polymorphic (http://www.cypalleles.ki.se/cyp2a6.htm), and the frequency of CYP2A6 polymorphisms differs significantly across populations (Mwenifumbo et al, 2010).
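As a rough consistency check on the comparison above, the following Python sketch computes an unequal-variance (Welch-type) t statistic directly from the reported means and standard errors. It is an illustration only: the original test was presumably run on individual-level data, the function names are hypothetical, and the p-value uses a normal approximation, which is an assumption that is reasonable only because each group has several hundred subjects.

```python
import math


def welch_t_from_summary(mean_a, se_a, mean_b, se_b):
    """Welch-type t statistic from two group means and their standard errors."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


def two_sided_p_normal(t):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Means and standard errors as reported in the text.
t_cotinine = welch_t_from_summary(236.5, 8.1, 194.1, 7.8)  # African vs European Americans
t_cpd = welch_t_from_summary(16.6, 0.6, 10.5, 0.4)         # European vs African Americans

print(round(t_cotinine, 2), two_sided_p_normal(t_cotinine) < 0.001)
print(round(t_cpd, 2), two_sided_p_normal(t_cpd) < 0.001)
```

Both statistics computed this way fall below the p<0.001 level under the normal approximation, in line with the reported results.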

Genotyping Assay

Samples from the CARDIA study were genotyped as part of the CARe project (Musunuru et al, 2010). The content of the genotyping array, ITMAT-Broad-CARe or ‘IBC chip’, is informed by GWAS, expression quantitative trait loci, pathway-based approaches, and comprehensive literature searching. Although it was designed to study phenotypes such as coronary heart disease, aging, blood biomarkers, and hypertension, it also includes loci relevant to addiction. For example, it contains densely spaced SNPs from 84 of the 130 genes on the ‘addiction array’ (Hodgkinson et al, 2008), as well as additional genes that are not on the addiction array but were found to be associated with addiction phenotypes in later genetic association studies. With respect to genes previously associated with nicotine dependence, the IBC chip has good coverage of dopamine receptor genes and of CYP2A6, the gene that mediates most of the metabolism of nicotine (to cotinine), but no coverage of UGT2B10, the gene that mediates nicotine glucuronidation and is responsible for 10–20% of nicotine metabolism. The coverage of the chromosome 15 nicotinic receptor subunit region is moderate and includes SNPs upstream of CHRNA5 that are in LD with the enhancer region previously associated with nicotine dependence (Smith et al, 2011), as well as a variant within CHRNA5. Please see Supplementary Table 2 for information regarding the degree of coverage of some of the genes previously found to be associated with nicotine dependence (Wang and Li, 2010).

The loci (candidate genes and regions) on the IBC chip are divided into three Groups: Group (1) n=435 loci with a high likelihood of functional significance (tag SNPs selected to capture known variation with minor allele frequency (MAF)>0.02 and an r² of at least 0.8 in HapMap populations); Group (2) n=1349 loci potentially involved in phenotypes of interest or established loci that required very large numbers of tagging SNPs (tag SNPs selected to capture known variation with MAF>0.05 with an r² of at least 0.5 in HapMap populations); and Group (3) n=232 loci comprised mainly of the larger genes (100 kb), which were of lower interest a priori to the investigators (includes only non-synonymous SNPs and known functional variants). The average number of SNPs per locus across Group 1 and Group 2 loci was compared with the average number of SNPs per equivalent locus from GWAS products. The average coverage for Group 1 loci is 36.5 SNPs per locus on the IBC chip. The Illumina Human 1 M (San Diego, CA, USA) and Affymetrix 6.0 platform (Santa Clara, CA, USA), for comparison, have an average of 28.0 and 17.4 SNPs, respectively, across the equivalent IBC loci. The average number of SNPs observed for the Group 2 loci is 16.3 SNPs, which is comparable with the current GWAS products.

Additional details regarding the design of the IBC chip have been described previously (Keating et al, 2008). In total, 49 320 SNPs were chosen to map 2100 candidate gene loci. For detailed genotyping and QC information, see Musunuru et al (2010).

Statistical Analyses

As with previous genetic association analyses of cotinine levels (He et al, 2009; Keskitalo et al, 2009), we evaluated association between serum cotinine levels and age, gender, BMI, and education. In the African American cohort, we found associations with BMI (p=0.025) and education (p=0.008), and in the European American cohort we found associations with age (p=0.003), gender (p=0.004), and education (p=0.023). We used the R statistical package (The R Foundation for Statistical Computing, Vienna, Austria) to compute residuals after adjustment for the covariates significantly associated with the phenotype within each racial group. The serum cotinine phenotype was then Box–Cox transformed, yielding normally distributed data (African American cohort: Lilliefors (Kolmogorov–Smirnov) normality test p-value=6.24 × 10⁻⁷ before normalization and 0.799 after normalization; European American cohort: Lilliefors normality test p-value=2.38 × 10⁻⁵ before normalization and 0.609 after normalization). In the European American cohort, one subject was removed after Box–Cox transformation because the subject's phenotype value was >3 SDs away from the mean.
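The transformation and outlier steps described above can be sketched in Python as follows. This is a minimal illustration and not the R code used in the study: the function names, the grid search for the Box–Cox parameter, and the toy values are all assumptions. Box–Cox requires strictly positive inputs, so residuals would need to be shifted to a positive range before being transformed this way.

```python
import math
import statistics


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def box_cox_loglik(values, lam):
    """Profile log-likelihood of lambda under a normal model for the transformed data."""
    n = len(values)
    variance = statistics.pvariance(box_cox(values, lam))
    return -0.5 * n * math.log(variance) + (lam - 1.0) * sum(math.log(v) for v in values)


def best_lambda(values, grid=None):
    """Pick the lambda on a coarse grid (hypothetical default: -2 to 2) that maximizes the likelihood."""
    if grid is None:
        grid = [i / 100.0 for i in range(-200, 201)]
    return max(grid, key=lambda lam: box_cox_loglik(values, lam))


def drop_outliers(values, n_sd=3.0):
    """Keep only values within n_sd sample standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= n_sd * sd]


# Toy, right-skewed values for illustration only (not study data).
toy = [12.0, 35.0, 60.0, 88.0, 120.0, 150.0, 210.0, 340.0, 520.0, 900.0]
lam = best_lambda(toy)
cleaned = drop_outliers(box_cox(toy, lam))
print(lam, len(cleaned))
```

A grid search is used here only to keep the sketch dependency-free; a numerical optimizer would normally estimate the parameter more precisely.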

Association analysis was performed in PLINK (Purcell et al, 2007) using linear regression under an additive genetic model. We addressed population stratification by conducting principal component analysis as implemented in EIGENSTRAT (Price et al, 2006). The first 10 principal components were included as covariates in the genetic association analysis. The Bonferroni-adjusted significance threshold for multiple comparisons was set at α = 2.3 × 10⁻⁶. The α value was set at 0.05 for the comparison of top variation in effects across the two populations.
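To make the additive model concrete, the sketch below regresses a phenotype on allele dosage (0, 1, or 2 copies of the minor allele) and returns the per-allele effect β with its standard error. It is a simplified illustration, not the PLINK analysis: it omits the 10 principal-component covariates, the function names and toy data are hypothetical, and the number of tests implied by the threshold is inferred from 0.05 / 2.3 × 10⁻⁶ rather than stated in the text.

```python
import math


def additive_regression(dosages, phenotype):
    """Simple linear regression of phenotype on allele dosage; returns (beta, se)."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotype))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    rss = sum((y - intercept - beta * x) ** 2 for x, y in zip(dosages, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Toy data for illustration only (not study data).
dosages = [0, 1, 2, 0, 1, 2]
phenotype = [1.0, 3.1, 4.9, 1.1, 2.9, 5.0]
beta, se = additive_regression(dosages, phenotype)

# Number of tests implied by the reported threshold (an inference, not a stated value).
implied_tests = round(0.05 / 2.3e-6)
print(beta, se, implied_tests)
```

Each additional copy of the counted allele shifts the expected phenotype by β, which is how the per-allele effects in the Results are to be read.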

Imputation of ungenotyped variants was done using a combined CEU+YRI reference panel, including SNPs segregating in both CEU and YRI, as well as SNPs segregating in one panel and monomorphic and non-missing in the other, resulting in 270 000 total SNPs. The use of the CEU+YRI panel resulted in an allelic concordance rate of 95.6%, calculated as 1 − ½ × |imputed_dosage − chip_dosage|. This rate is comparable to rates calculated for individuals of African descent imputed with the HapMap 2 YRI individuals (Huang et al, 2009). In the first step of imputation, individuals with pedigree relatedness or cryptic relatedness (pi_hat>0.05) were filtered out. Recombination and error rate estimates for the entire sample were calculated based on a subset of random individuals. Next, these rates were used to impute all sample individuals across the entire reference panel. SNPs with poor imputation scores (RSQ_HAT<0.6) and a MAF<0.01 were filtered out.
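The concordance measure can be illustrated with a short Python sketch. It assumes that the per-genotype quantity, one minus half the absolute difference between the imputed and chip dosages, is averaged over all compared genotypes; that averaging step and the function name are assumptions, as the text gives only the per-genotype expression.

```python
def allelic_concordance(imputed_dosages, chip_dosages):
    """Mean of 1 - 0.5 * |imputed - chip| over paired genotype dosages (0 to 2 scale)."""
    if not imputed_dosages or len(imputed_dosages) != len(chip_dosages):
        raise ValueError("dosage lists must be non-empty and of equal length")
    total_diff = sum(abs(i - c) for i, c in zip(imputed_dosages, chip_dosages))
    return 1.0 - 0.5 * total_diff / len(imputed_dosages)


# Toy dosages for illustration only (not study data).
imputed = [0.0, 1.0, 2.0, 1.8]
genotyped = [0, 1, 2, 2]
print(allelic_concordance(imputed, genotyped))
```

Because dosages range from 0 to 2, the factor of one half scales the difference to the proportion of discordant alleles, so a value of 1 means perfect agreement.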

RESULTS

Genomic control (GC) analysis showed no significant inflation of the χ²-test statistic in the African American cohort (GC inflation factor λGC=0.967). In the European American cohort (λGC=1.012), the test statistics were adjusted for inflation. Single-nucleotide polymorphisms that were the most strongly associated with serum cotinine in the two cohorts are summarized in Table 1. In African Americans, the variant rs11187065 (imputed SNP) in IDE (the gene encoding insulin-degrading enzyme or Insulysin) exhibited the strongest association, where each additional copy of the rs11187065*C minor allele corresponded to lower serum cotinine levels (β=−85.14; SE=18.88; p=8.91 × 10⁻⁶). The next two most strongly associated SNPs, rs11187064 (genotyped SNP) and rs17445328 (imputed SNP), also reside in IDE. See Figure 1 for a regional plot of the IDE region on 10q23–q25. In European Americans, rs11763963 (imputed SNP), located on chromosome 7p15, exhibited the strongest association, where each additional copy of the rs11763963*C minor allele corresponded to higher serum cotinine levels (β=106.82; SE=22.22; p=1.53 × 10⁻⁶; Figure 2). This finding was closely followed by a second locus in MORF4L1 (mortality factor 4 like 1; Figure 3), with each copy of the major A allele of rs12050510 (imputed SNP) corresponding to higher serum cotinine levels (β=83.21; SE=17.83; p=3.07 × 10⁻⁶). rs12050510 was not significantly associated with serum cotinine levels in the African American cohort (p=0.17).
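For readers unfamiliar with genomic control, the following Python sketch shows the standard calculation: λGC is the median of the observed 1-df χ² statistics divided by the median of the χ² distribution with one degree of freedom (about 0.455), and statistics are divided by λGC only when it exceeds 1. The text does not spell out the exact procedure used, so this is an assumed, conventional form with hypothetical function names and toy values.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Genomic control inflation factor: observed median over expected median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def gc_adjust(chi2_stats):
    """Divide statistics by lambda when lambda > 1; otherwise return them unchanged."""
    lam = genomic_inflation(chi2_stats)
    if lam <= 1.0:
        return list(chi2_stats)
    return [s / lam for s in chi2_stats]


# Toy statistics for illustration only (not study data).
toy_stats = [0.2, 0.9098728, 4.0]
print(genomic_inflation(toy_stats), gc_adjust(toy_stats))
```

Under this convention a factor below 1, such as the 0.967 reported for the African American cohort, leaves the statistics untouched, whereas a factor of 1.012 shrinks each statistic slightly.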

Table 1. Association of Genomic Variants with Serum Cotinine Levels in African and European American CARDIA Cohorts

Figure 1. Association between insulin-degrading enzyme (IDE) variants and serum cotinine in the Coronary Artery Risk Development in Young Adults (CARDIA) African American cohort. The most significant association in the CARDIA African American cohort was with SNP rs11187065 (shown in purple) in the gene encoding IDE. The same variant was also associated with serum cotinine levels in CARDIA European Americans (p=0.044).

Figure 2. Association between rs11763963 and serum cotinine in the Coronary Artery Risk Development in Young Adults (CARDIA) European American cohort. The most significant association in CARDIA European Americans was with SNP rs11763963 (shown in purple), located on chromosome 7 outside of a gene transcript. This SNP is non-polymorphic in African Americans.

Figure 3. Association between mortality factor 4 like 1 (MORF4L1) variants and serum cotinine in Coronary Artery Risk Development in Young Adults (CARDIA) European Americans. The second strongest association in European Americans was on chromosome 15 in MORF4L1. This association was not found in the CARDIA African American cohort (p=0.17).

For the top SNP associations in both African and European Americans, we examined whether a similar signal existed in the other group. We found that the top variant in African Americans, rs11187065, was also associated with serum cotinine levels in the European American group, where each additional copy of rs11187065*C minor allele corresponded to lower serum cotinine levels (β=−24.91; SE=12.36; p=0.044). The top variant identified in European Americans, rs11763963, was non-polymorphic in the African American sample. Because rs11187065 was associated with serum cotinine levels in both populations, in Supplementary Table 1 we present associations of variants across IDE in both populations. Whereas in the European American population there is a high degree of LD in the region (Supplementary Figure 1b), leading to multiple, significant associations across the gene, lower levels of LD in African Americans (Supplementary Figure 1a) result in fewer variants significantly associated with serum cotinine levels. The most significant association in African Americans is localized to the region in the first intron of IDE.

DISCUSSION

We examined serum cotinine levels for association with genetic variants from a large number of candidate genes in two populations with different linkage disequilibrium patterns. Cotinine levels are a more objective measure of nicotine intake than cigarettes per day (Gorber et al, 2009) because self-reported questionnaires or interviews may result in underreporting of smoking quantity (Jarvis et al, 1984) and because the way a cigarette is smoked, and hence the nicotine delivered per cigarette, differs from person to person. We found that rs11187065, located in intron 1 of IDE, was the variant most strongly associated with serum cotinine levels in African Americans. The association observed with this variant was also found in the CARDIA European American cohort.

The associated IDE SNP, rs11187065, is located in intron 1 of the gene. No previous study has reported a genetic association with this SNP; however, two proxies of rs11187065, namely rs4646955 (imputed SNP and SNP #38 in Supplementary Table 1) and rs4646953 (r² value of 0.785 with rs11187065), have been associated with an approximately twofold increased risk for Alzheimer's disease in the Finnish population (Vepsalainen et al, 2007). rs4646953, a functional SNP (based on location in the 5′ promoter region of IDE), was not genotyped or imputed in our study. rs4646955 was associated with serum cotinine levels at a p-value of 1.78 × 10⁻⁵ in our African American sample and a p-value of 0.082 in our European American cohort. It is important to note that in this region there is high LD in European Americans, presumably responsible for the multiple associations observed across the gene in this population. In the African American population, which has lower levels of LD in this region, the association is localized to intron 1, suggesting the approximate position of a functionally relevant region. Given the proximity and LD between the associated region and the 5′-UTR region, it is possible that our top SNP is tagging a functional SNP in the promoter region of the gene.

Although genetic association studies of cotinine levels do not reveal whether a genetic variant influences behavioral (pharmacodynamic) or metabolic (pharmacokinetic; either in the periphery or the brain) mechanism(s), results of previous research on insulin and addictive substances suggest that our finding may represent a pharmacodynamic or a behavioral effect. Nicotine treatment of cells enhances insulin-induced activation of extracellular signal-regulated kinase (ERK) and phosphoinositide 3-kinase pathways, as well as increases expression of insulin receptor substrate (IRS) proteins IRS-1 and IRS-2 (Sugano et al, 2006), which are expressed in the brain (Taguchi et al, 2007; Wang et al, 2009). However, whether and how this interferes with behavioral aspects of nicotine, namely insulin's regulation of dopamine and/or other neurotransmitters (Garcia et al, 2005; Owens et al, 2005; Williams et al, 2007), relevant to cellular and behavioral aspects of psychostimulant addiction (Sugano et al, 2006; Williams et al, 2007), or with insulin's effect on ERK signaling, essential for learning and memory formation, still needs to be determined. IDE is highly expressed in the brain (Farris et al, 2005) where the mechanism of IDE to degrade amyloid-β protein (Farris et al, 2003; Farris et al, 2004; Qiu et al, 1998; Qiu and Folstein, 2006; Vekrellis et al, 2000; Baskin et al, 1994; Miller et al, 2003) has been extensively evaluated.

In our sample of European American subjects, a relatively rare SNP, rs11763963, was the most strongly associated variant (Figure 2). rs11763963 lies on chromosome 7 and is not polymorphic in African Americans. A nearby gene, SKAP2, encodes Src kinase-associated phosphoprotein 2. The protein belongs to the src family kinases, which regulate the neuronal nicotinic acetylcholine receptor (Wang et al, 2004). Whether the observed finding in this region is modulated by this gene or other nearby genes is a topic of future investigation. This top SNP was closely followed by two SNPs, rs12050510 and rs1836556 (imputed SNP), in and near MORF4L1, a transcription regulator that was found to be significantly upregulated in the ventral tegmental area of nicotine-infused rats (Kurochkin and Goto, 1994). Linkage disequilibrium in the region across MORF4L1 is very high (Figure 3), and the proxies of rs12050510, our top SNP in this region, extend to both the 5′ and 3′ regions of the gene.

Previously, investigators conducted a single locus analysis between serum cotinine levels and the cluster of nicotinic receptor subunit genes on chromosome 15 (Keskitalo et al, 2009). They showed larger effect sizes for serum cotinine in comparison to the cigarettes per day phenotype, and suggested that future studies concerning the effects of nicotine should strive to use cotinine levels from serum or saliva rather than self-reported smoking quantity as a measure of nicotine intake or its regulation. In our European American cohort, multiple variants in CHRNA5, CHRNA3, and CHRNB4, were associated with serum cotinine levels (p<0.05; Figure 3). Interestingly, of the 114 SNPs in the region, none were associated with cotinine levels in our African American sample of individuals at a p-value <0.05.

Although the p-value (8.9 × 10⁻⁶) of IDE SNP rs11187065 in the African American cohort did not reach the Bonferroni-adjusted threshold of statistical significance (2.3 × 10⁻⁶), the central theme of our study is that this top SNP in African Americans replicated in our European American cohort (p=0.044) despite the fact that the sample included only 315 subjects. Replicability of a finding with one available cohort, especially of a different genetic ancestry and such a small size, may be limited even for non-behavioral phenotypes, but perhaps more so for a behavioral phenotype such as smoking. These results suggest that this locus should be investigated further in additional cohorts. It is important to note that our cohort included subjects 18–30 years of age. It remains to be determined whether the association may be observed in older smokers. In addition, genetic studies typically attempt to address genetic risk of a trait rather than current state. Successful methodologies for assessing lifetime smoking patterns have been difficult to achieve; however, it is possible that a self-report measure of cigarettes smoked per day during the heaviest period of smoking may be more valuable for a genetic study than the current state measured by cotinine levels.

Even with the limited sample size, the power to detect the effects observed here was more than sufficient (>90%). This is the first study to extensively evaluate the genome (2100 genes) in an analysis of serum cotinine levels in both African Americans and European Americans. We have shown that the IDE locus is involved in regulation of nicotine intake, as measured through serum cotinine levels, in both populations. Our work supports the idea of using serum cotinine to efficiently map regions in the genome that influence tobacco use.