Genome-Wide Meta-Analysis of Cotinine Levels in Cigarette Smokers Identifies Locus at 4q13.2

Genome-wide association studies (GWAS) of complex behavioural phenotypes such as cigarette smoking typically employ self-report phenotypes. However, precise biomarker phenotypes may afford greater statistical power and identify novel variants. Here we report the results of a GWAS meta-analysis of levels of cotinine, the primary metabolite of nicotine, in 4,548 daily smokers of European ancestry. We identified a locus close to UGT2B10 at 4q13.2 (minimum p = 5.89 × 10⁻¹⁰ for rs114612145), which was subsequently replicated. This variant is in high linkage disequilibrium with a known functional variant in the UGT2B10 gene which is associated with reduced nicotine and cotinine glucuronidation activity, but intriguingly is not associated with nicotine intake. Additionally, we observed association between multiple variants within the 15q25.1 region and cotinine levels, all located within the CHRNA5-A3-B4 gene cluster or adjacent genes, consistent with previous much larger GWAS using self-report measures of smoking quantity. These results clearly illustrate the increase in power afforded by using precise biomarker measures in GWAS. Perhaps more importantly, however, they also highlight that biomarkers do not always mark the phenotype of interest. The use of metabolite data as a proxy for environmental exposures should be carefully considered in the context of individual differences in metabolic pathways.

cohort was supported by grant U01 HG004729 from the National Human Genome Research Institute.

FinnTwin12 and FinnTwin16
Study description: Analyses were based on samples from unrelated individuals from the
Cotinine assessment: All biological samples were stored at -80°C at the National Institute for Health and Welfare. Serum samples were sent to the University of Toronto (Dr. Rachel Tyndale) for cotinine measurement. Samples were assessed by LC-MS as previously described 5.
Acknowledgements: The study has been supported by ENGAGE -European Network for

FINRISK 1992 and FINRISK 2007
Study description: The National FINRISK Study has monitored trends in risk factor levels every five years since 1972. In 1992, the FINRISK survey was carried out to assess cardiovascular risk factor levels in Finland. The survey was conducted in four areas of Finland: (1) North Karelia province, (2) Kuopio, (3) Turku and Loimaa (representing Southwestern Finland), and (4) Helsinki and Vantaa. A random sample of approximately 2,000 individuals aged 25-64 from each of the survey areas was drawn from the National Population Register. Altogether 7,927 individuals were enrolled, and 6,051 participated (76.3% participation rate). The survey included a self-administered questionnaire (mainly covering socioeconomic factors, medical history, health behavior, and psychological factors) and a cardiovascular risk factor examination. A detailed smoking history as described by Vartiainen and colleagues 6 was obtained. Blood samples for DNA extraction and biochemistry analyses were taken as part of the risk factor examination.
In 2007, the FINRISK survey was carried out in six regions of Finland: 1) Helsinki and Vantaa, 2) Turku and Loimaa, 3) North Savo, 4) North Karelia, 5) the Oulu region, and 6) Lapland. A two-stage process was used. Participants (n = 11,953) from all regions were invited to fill in an extensive baseline questionnaire (n = 7,993, 67% response rate) and to attend a locally organized health examination in which blood samples were taken (all regions except Lapland). After the baseline study, a self-administered questionnaire with detailed smoking history was given to individuals who had stated during the first part of the study that they had smoked at least 100 cigarettes during their lifetime (regions 1-3), or that they were current smokers (regions 4 and 5) (n = 1,992). Completed questionnaires were returned by mail, with one reminder. The number of participants in the smoking sub-study was 1,746 (91% response rate). Plasma cotinine was analyzed for those who identified themselves as daily smokers during the main FINRISK data collection and responded to the tobacco-specific questionnaire 7.
Altogether 218 subjects from FINRISK 1992 and FINRISK 2007 were included in the GWAS discovery phase, with genome-wide genotype data generated with the Illumina HumanOmniExpress BeadChip (Illumina, Inc., San Diego, CA, USA). These samples belong to the Predict-CVD sub-cohort, which is a random subset of the whole FINRISK cohort and has been previously described by Ganna and colleagues 8. The Predict-CVD sample is enriched for individuals with a diagnosed cardiovascular event (coronary heart disease and/or ischemic stroke). The participants originate from FINRISK cohorts collected in 1992, 1997, 2002 and 2007. The size of the sub-cohort in each stratum was made proportional to the number of incident cardiovascular disease cases in the corresponding stratum.
In the replication phase, 620 additional subjects from FINRISK 2007 were included, with genotype data for the top loci extracted from genotype data generated with the MetaboChip. The MetaboChip is a custom Illumina iSelect genotyping array designed to test ~200,000 SNPs of interest for metabolic and atherosclerotic / cardiovascular disease traits.

Framingham Heart Study
Thus, the FHS has evolved into a prospective, community-based, three-generation family study. The FHS is a joint project of the National Heart, Lung and Blood Institute and Boston University. A subset of the FHS subjects participated in a plasma metabolite sub-study. Data used in our study were extracted from this metabolite profiling study.
Cotinine assessment: Cotinine was assayed from plasma samples using liquid chromatography-tandem mass spectrometry (LC-MS) with a 4000 QTRAP triple quadrupole mass spectrometer (Applied Biosystems/Sciex) coupled to a multiplexed LC system comprising two 1200 Series pumps (Agilent Technologies) and an HTS PAL autosampler (Leap Technologies) equipped with two injection ports and a column selection valve 9.
Acknowledgements: The Framingham Heart Study is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with Boston University. Funding support for the Framingham Metabolomics (HILIC - Installment 2) dataset was provided by NIH grant R01 DK081572.

GenMets/Health2000
Study description: The Health2000 study is a nationally representative sample of the adult Finnish population, which includes a total of 8,028 subjects aged 30 or over. They were invited for an in-person study during which a blood sample was taken for cotinine measurement and for DNA extraction. A detailed smoking history as described by Keskitalo and colleagues 10 was obtained. Altogether 485 subjects from GenMets were included in the GWAS discovery phase, with genome-wide genotype data generated with the Human610-Quad BeadChip (Illumina, Inc., San Diego, CA, USA). All participants gave written informed consent, and the study was approved by the Ethics Committee for Epidemiology and Public Health of the Hospital District of Helsinki and Uusimaa, Finland.

ng/ml range we used a threshold of 20% for the inter-assay coefficient of variation, whereas for values exceeding 10 ng/ml the threshold was 15%. Values below 1 ng/ml were treated as undetectable.
Acknowledgements: We thank the Netherlands Twin Register participants whose data were analyzed in this study. We acknowledge the Netherlands Organisation for Scientific Research

TwinsUK
Study description: Study subjects were twins enrolled in the TwinsUK registry, a national register of adult twins. Twins were recruited as volunteers by successive media campaigns without selecting for particular diseases or traits 17. In this study we analysed data from 674 female twins who had cotinine data available. The study was approved by St. Thomas' Hospital Research Ethics Committee, and all twins provided informed written consent.
Cotinine assessment: Cotinine levels in TwinsUK were measured by non-targeted mass spectrometry-based metabolomic profiling on the Metabolon platform, as described previously 18.
Metabolomic profiling was done in three batches. The raw cotinine levels were median-normalised (dividing each cotinine concentration by that day's cotinine median), then inverse-normalised because the metabolite concentration was not normally distributed. To measure the correlation between the Metabolon platform and the LC-MS method described in reference 5, 12 samples were analysed with both methods.

International Journal of Epidemiology 37, 1220-1226 (2008).

Figure S1. Chromosome 4 conditional analyses. Panels illustrate regional association plots of the 4q13.2 region (69.5 to 70 Mb) based on the original results (top), and after conditioning on top hit rs114612145 (bottom). SNPs are plotted by their position on chromosome 4 against -log10 p-value for their association with cotinine level in the genome-wide meta-analysis. The ALSPAC mothers cohort (n=8,890) was used as an LD reference panel for these analyses.

Figure S2. Chromosome 15 conditional analyses. Panels illustrate regional association plots of the 15q25 region based on the original results (top), after conditioning on top hit rs10851907 (middle), and after conditioning on both rs10851907 and second independent signal rs57064725 (bottom). SNP rs57064725 is highlighted in pink in the top panel for reference. SNPs are plotted by their position on chromosome 15 against -log10 p-value for their association with cotinine level in the genome-wide meta-analysis. The ALSPAC mothers cohort (n=8,890) was used as an LD reference panel for these analyses.

Figure S3. Chromosome 15 conditional analyses using rs16969968, which codes for D398N. Panels illustrate regional association plots of the 15q25 region based on the original results (top), after conditioning on rs16969968 (middle), and after conditioning on both rs16969968 and second independent signal rs7170068 (bottom). SNP rs7170068 is highlighted in pink in the top panel for reference. SNPs are plotted by their position on chromosome 15 against -log10 p-value for their association with cotinine level in the genome-wide meta-analysis. The ALSPAC mothers cohort (n=8,890) was used as an LD reference panel for these analyses.
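
Illustrative sketch of the TwinsUK cotinine normalisation. The two-step processing described for the TwinsUK cotinine data (division by the day median, followed by inverse normalisation) could be sketched roughly as below. This is not the code used in the study. The function names, the toy values, the use of average ranks for ties, and the rank offset of 0.5 are all assumptions made for illustration, because the text does not state which variant of the rank-based inverse normal transformation was applied.

```python
from collections import defaultdict
from statistics import NormalDist, median


def median_normalise(values, run_days):
    """Divide each value by the median of all values measured on the same day.

    Assumes every day median is non-zero.
    """
    by_day = defaultdict(list)
    for value, day in zip(values, run_days):
        by_day[day].append(value)
    day_median = {day: median(vals) for day, vals in by_day.items()}
    return [value / day_median[day] for value, day in zip(values, run_days)]


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def inverse_normalise(values, offset=0.5):
    """Rank-based inverse normal transformation.

    The offset (0.5 here) is an assumed choice, not taken from the source.
    """
    n = len(values)
    std_normal = NormalDist()
    return [
        std_normal.inv_cdf((rank - offset) / (n - 2 * offset + 1))
        for rank in average_ranks(values)
    ]


# Hypothetical toy data: six measurements spread over two days.
cotinine = [120.0, 80.0, 200.0, 50.0, 150.0, 100.0]
run_day = ["d1", "d1", "d1", "d2", "d2", "d2"]

scaled = median_normalise(cotinine, run_day)
transformed = inverse_normalise(scaled)
```

Dividing by the day median removes day-to-day shifts in instrument response before samples are pooled, and the rank-based step maps the skewed concentrations onto a standard normal scale while preserving their order.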