Investigating the causal effect of smoking on hay fever and asthma: a Mendelian randomization meta-analysis in the CARTA consortium

Observational studies on smoking and risk of hay fever and asthma have shown inconsistent results. However, observational studies may be biased by confounding and reverse causation. Mendelian randomization uses genetic variants as markers of exposures to examine causal effects. We examined the causal effect of smoking on hay fever and asthma by using the smoking-associated single nucleotide polymorphism (SNP) rs16969968/rs1051730. We included 231,020 participants from 22 population-based studies. Observational analyses showed that current vs never smokers had lower risk of hay fever (odds ratio (OR) = 0·68, 95% confidence interval (CI): 0·61, 0·76; P < 0·001) and allergic sensitization (OR = 0·74, 95% CI: 0·64, 0·86; P < 0·001), but similar asthma risk (OR = 1·00, 95% CI: 0·91, 1·09; P = 0·967). Mendelian randomization analyses in current smokers showed a slightly lower risk of hay fever (OR = 0·958, 95% CI: 0·920, 0·998; P = 0·041), a lower risk of allergic sensitization (OR = 0·92, 95% CI: 0·84, 1·02; P = 0·117), but higher risk of asthma (OR = 1·06, 95% CI: 1·01, 1·11; P = 0·020) per smoking-increasing allele. Our results suggest that smoking may be causally related to a higher risk of asthma and a slightly lower risk of hay fever. However, the adverse events associated with smoking limit its clinical significance.
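The odds ratios above are reported with 95% confidence intervals. As a rough illustration of how such a summary estimate maps onto the log-odds scale on which estimates are commonly pooled, the sketch below recovers an approximate standard error from a reported interval. It assumes a symmetric Wald-type interval on the log scale, and the function name is hypothetical. Because the inputs are rounded, the result will not exactly reproduce the reported P values.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Return (log odds ratio, approximate standard error).

    Assumes the 95% CI is symmetric on the log scale (Wald-type interval).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    return log_or, se


# Asthma estimate per smoking-increasing allele in current smokers (abstract).
log_or, se = log_or_and_se(1.06, 1.01, 1.11)
z_score = log_or / se  # approximate Wald statistic from the rounded inputs
```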


Study descriptions

British 1958 Birth Cohort
The British 1958 Birth Cohort (1958 BC) is a longitudinal population-based cohort study that includes all births during one week in March 1958 in England, Scotland and Wales [1]. Approximately 17,000 participants were recruited at birth and were subsequently followed up at ages 7, 11, 16, 23, 33, 42 and 45 years. At each follow-up, information on socioeconomic status, health and development, and familial and educational factors was obtained. At 33 and 42 years, information on diet, lifestyle and occupational factors was also collected. Information on smoking status, asthma and hay fever was collected at the age of 42. At 45 years of age, 11,971 participants currently living in Britain were invited to take part in a biomedical survey, of whom 9,377 (78%) filled in a questionnaire and 8,302 (89%) also provided a blood sample, in which serum IgE concentrations were measured and from which DNA was extracted for genotyping.

Genotyping
Genetic information was obtained from blood samples collected at 45 years, through two substudies from case-control studies that had used the 1958 BC as a source for population controls: 3,000 samples were randomly selected as part of the Wellcome Trust Case Control Consortium (WTCCC2 [2]) and 2,592 distinct samples were randomly selected as part of the Type 1 Diabetes Genetics Consortium (T1DGC [3]). The WTCCC2 samples were genotyped on the Affymetrix 6.0 platform, whereas the T1DGC samples were genotyped using the Illumina Infinium 550K chip. The SNP rs16969968 was imputed in both T1DGC and WTCCC2, with an average posterior call rate >0·99 in both studies.

Hay fever, asthma, and allergic sensitization
At age 42, participants were asked whether they had ever had hay fever. There were 3·20% missing hay fever data (percentage of participants with missing information on the outcome among those with genotyping information). At age 42, participants were also asked whether they had ever had asthma. There were 3·22% missing asthma data (same definition). Allergic sensitization was defined as specific IgE ≥0·30 kU/l in blood serum for any of the following 3 inhalant allergens: dust, cat and grass. Of note, specific IgE was only measured if total IgE was greater than 30 kU/l. There were 3·33% missing data regarding allergic sensitization (same definition).
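The sensitization definition above can be sketched as a small classification rule. This is a minimal illustration only: the function and constant names are hypothetical, and the handling of participants whose total IgE was at or below 30 kU/l (treated here as not sensitized) is an assumption that the text does not state.

```python
SPECIFIC_IGE_CUTOFF = 0.30  # kU/l, threshold for a positive specific IgE result
TOTAL_IGE_GATE = 30.0       # kU/l, specific IgE was only measured above this


def is_sensitized(total_ige, specific_ige):
    """Classify allergic sensitization for one participant.

    total_ige: total serum IgE in kU/l, or None if missing.
    specific_ige: dict mapping allergen name (dust, cat, grass) to specific
        IgE in kU/l; empty when specific IgE was not measured.
    Returns True or False, or None when the outcome is missing.
    """
    if total_ige is None:
        return None
    if total_ige <= TOTAL_IGE_GATE:
        # Assumption: no specific IgE measurement means "not sensitized".
        return False
    if not specific_ige:
        return None
    # Positive if any tested allergen reaches the cutoff.
    return any(value >= SPECIFIC_IGE_CUTOFF for value in specific_ige.values())
```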

Smoking status
Cigarette smoking was recorded at age 42 by Computer Aided Personal Interviewing, and was classified as never, ex- or current smoker. Reports of never smoking were verified using data from surveys at ages 23 and 33. Number of cigarettes smoked per day at 42 years was also reported for current smokers. Pipe and cigar smokers were excluded from analyses. In the current analyses, we included 4,882 participants with rs1051730 genotype, allergic respiratory disease, and smoking data available.

Ethics
Written consent was obtained from participants for the use of information in medical studies.

ALSPAC Mothers
Genotyping
Rs1051730 was directly genotyped as part of genome-wide SNP genotyping using the Illumina Human660W-Quad array. Genotypes were called with Illumina GenomeStudio. PLINK (v1.07) was used to carry out quality control measures on an initial set of 10,015 subjects and 557,124 directly genotyped SNPs. SNPs with more than 5% missingness were removed during quality control. A total of 8,340 subjects and 526,688 SNPs passed quality control filters. Further details of the genotyping methods have been published previously [6].

Hay fever, asthma, and allergic sensitization
Self-reported hay fever in the mothers was evaluated in a questionnaire completed when the study child was 97 months old. The mothers were asked "Have you ever had any of the following problems: hay fever". The choices were "Yes-had it recently (in the past year)", "Yes, in the past, not recently", or "No, never". Mothers who had had hay fever recently or in the past were classified as having hay fever, and those who answered that they had never had hay fever were classified as not having hay fever. Asthma was evaluated in the same way, with the same question and response options applied to asthma: mothers who had had asthma recently or in the past were classified as having asthma, and those who answered never were classified as not having asthma. Allergic sensitization was not determined.

Smoking status
The mothers' self-reported smoking status was also evaluated in a questionnaire sent out when their participating children were 97 months old (on average). Women were asked whether they had ever smoked and whether they were currently smokers. Former and current smokers who reported not being daily smokers were excluded from the analyses. No questions were asked about cigar or pipe smoking. Smoking heaviness (cigarettes per day) was reported as a continuous variable. In the current analyses, we included 4,834 participants with rs1051730 genotype, allergic respiratory disease and smoking data available.

Ethics
Ethics approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committee.

Additional information
The analyses are not adjusted for principal components. The associations have not been published before.

ALSPAC Children
Genotyping
Participants (N=9,912) were genotyped using the Illumina HumanHap550 quad genome-wide SNP genotyping platform by 23andMe subcontracting the Wellcome Trust Sanger Institute, Cambridge, UK and the Laboratory Corporation of America, Burlington, NC, USA. Individuals were excluded from further analysis on the basis of incorrect gender assignments; minimal or excessive heterozygosity; disproportionate levels of individual missingness (>3%); evidence of cryptic relatedness (>10% IBD); or non-European ancestry. SNPs with more than 5% missingness were removed during quality control. After quality control, 8,365 unrelated individuals were available for analysis.
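The two missingness thresholds above (individuals >3%, SNPs >5%) can be illustrated with a minimal sketch. The data layout, the function names, and the order of the two filters (individuals first, then SNPs) are assumptions for illustration; the heterozygosity, relatedness, gender and ancestry checks are not shown.

```python
def missing_rate(calls):
    """Fraction of missing genotype calls (None) in a sequence."""
    return sum(call is None for call in calls) / len(calls)


def apply_missingness_filters(genotypes, max_sample_missing=0.03,
                              max_snp_missing=0.05):
    """Filter a genotype table by missingness.

    genotypes: dict mapping sample ID to a list of allele counts (0, 1, 2),
        with None for a missing call; one list entry per SNP.
    Returns (kept sample IDs, kept SNP indices).
    """
    # Step 1: drop individuals with too many missing calls.
    kept_samples = [sample for sample, calls in genotypes.items()
                    if missing_rate(calls) <= max_sample_missing]
    if not kept_samples:
        return [], []
    # Step 2: drop SNPs with too many missing calls among remaining individuals.
    n_snps = len(genotypes[kept_samples[0]])
    kept_snps = [j for j in range(n_snps)
                 if missing_rate([genotypes[s][j] for s in kept_samples])
                 <= max_snp_missing]
    return kept_samples, kept_snps
```

On real data this kind of filtering is normally done with dedicated genetics software rather than hand-written code; the sketch only makes the thresholds concrete.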
Hay fever, asthma, and allergic sensitization
Hay fever and asthma were evaluated by questionnaire when the participants were 18 years old, using the questions "Have you ever had hay fever?" and "Have you ever had asthma?". Allergic sensitization was not determined.

Smoking status
At the age of 18 years, participants were asked about their lifetime smoking behaviour. From these, two categories of smoking status were created: never smokers and current daily smokers. Never smokers reported never having tried a cigarette in their lifetime and current daily smokers smoked at least one cigarette per day. Individuals reporting less frequent smoking were excluded from analyses. In the current analyses, we included 1,549 participants with rs1051730 genotype, allergic respiratory disease and smoking data available.

Ethics
Ethics approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committee.

Additional information
The analyses are not adjusted for principal components. The associations have not been published before.

COPSAC2000
The Copenhagen Prospective Study on Asthma in Childhood (COPSAC2000) cohort is a population-based, single-center, prospective clinical birth cohort consisting of 411 children, with phenotypic and genotypic information available for their parents. The children in the COPSAC2000 cohort were born to asthmatic mothers enrolled from the greater Copenhagen area in the period from August 1998 to December 2001. Parents of the COPSAC2000 cohort with non-missing phenotype data (smoking, asthma, dermatitis, and allergic sensitization, from questionnaires or doctor's diagnosis) and genotype data constituted the 543 participants for the current study.

Genotyping
Genotyping of 951,117 genetic markers was carried out on the Illumina Infinium HumanOmniExpressExome BeadChip at the AROS Applied Biotechnology AS center in Aarhus, Denmark. Genotypes were called with Illumina's GenomeStudio software. We excluded individuals with gender mismatches, genetic duplicates, outlying heterozygosity (>0·27 or <0·037), and those not clustering with the CEU individuals (Utah residents with ancestry from northern and western Europe) in a multidimensional scaling (MDS) analysis seeded with individuals from the International HapMap Phase 3. SNP data were extracted from the genome-wide array data for the current study, and the genotyping call rate was 99·8%.

Hay fever, asthma, and allergic sensitization
Evaluation of hay fever, asthma and allergic sensitization was done, on average, 4 years before childbirth. Participants with doctor-diagnosed hay fever or a positive answer to the question "Have you ever had hay fever?" were considered to have hay fever. Participants with doctor-diagnosed asthma were considered to have asthma. Serum was tested for specific IgE, and allergic sensitization was defined as serum specific IgE positivity against allergens.

Smoking status
Smoking information was collected 2 years after childbirth. Participants were divided into smokers and non-smokers, since there was no information on former smoking. There was also no information on smoking heaviness.

Ethics
The study was conducted in accordance with the guiding principles of the Declaration of [...] Council for Independent Research (Grant no 10-082884 and 271-08-0815); the Capital Region Research Foundation (no grant number) and NIH-NHLBI R01 HL129735 have provided core support for COPSAC. We express our gratitude to the children and families of the COPSAC cohorts for their support and commitment to our studies and acknowledge and appreciate the unique efforts of the COPSAC research team.

Additional information
The analyses are not adjusted for principal components. The observational or genetic data have previously been published [7].

The Dan-Monica10 study
In 1982-1984, a random sample of 4,807 participants from the referral area of Glostrup County Hospital, Copenhagen, was invited to participate in the Danish MONICA I health survey [8]. The study was part of an international World Health Organization (WHO) coordinated study, MONItoring of trends and determinants in CArdiovascular Diseases (MONICA). The sample was selected to represent an equal number of men and women born in 1922, 1932, 1942 [...]

Hay fever, asthma, and allergic sensitization
Hay fever was defined as a positive answer to the question: "Has a doctor ever told you that you had allergic hay fever?" Asthma was defined as a positive answer to the question: "Has a doctor ever told you that you had asthma?" Allergic sensitization was defined by determination of serum specific IgE as described in previous studies [9][10][11][12]. In the Monica1 study, serum specific IgE positivity was tested using the ADVIA Centaur Allergy Screen assay (Bayer HealthCare Diagnostics division, Tarrytown, N.Y., USA) [13], a multi-allergen assay that detects specific serum IgE antibodies to 19 common inhalant allergens. Allergic sensitization was defined as one or more positive results according to the manufacturer's instructions.

Smoking status
Information on smoking status was collected by a self-administered questionnaire to be filled in at home prior to the health examination. Smoking status was recorded as never, former, occasional (<1 cigarette, cheroot, cigar, or pipe per day) and daily smokers. Occasional smokers as well as daily smokers, who exclusively smoke cheroots, cigars, or pipe, were excluded from all analyses. Smoking heaviness among daily smokers was recorded as number of cigarettes per day. In the current study, we included 2,054 participants between 41 and 73 years of age.

Ethics
Ethics approval was given by the local research ethics committee. All participants gave written consent and the study was conducted in accordance with the Second Helsinki Declaration.

Additional information
The analyses are not adjusted for principal components. The genetic data used in this study have not previously been published.

ELSA
The English Longitudinal Study of Ageing (ELSA) is a population-based national cohort of participants (48% men) aged over 50 years recruited from the Health Surveys for England in 1998, 1999 and 2001, as previously described [14]. The sample has been followed up every 2 years, and data have been collected via computer-assisted personal interviews and self-completion questionnaires. A wide range of phenotypic measures relevant to ageing are available. These measures were taken at Wave 0 of the study (1998, 1999 and 2001) and at follow-up (2004/5). Data on health behaviors and a wide range of health outcomes are available. Nearly all participants (97%) have agreed to let us collect other register-based data, which allows for the assessment of health outcomes and cause-specific mortality. More information can be found at http://www.ifs.org.uk/elsa/.

Genotyping
In Wave 2 (2004/2005) of the study, 5,633 participants provided blood samples for DNA extraction. Genotyping was performed by KBioscience using in-house KASPar technology.

Hay fever, asthma, and allergic sensitization
Whether the participants had asthma was evaluated at Wave 1, when they were asked: "Has a doctor ever told you that you have/have had any of the following conditions (asthma)?". Hay fever and allergic sensitization were not determined.

Smoking status
Individuals were classified as current smokers if they reported smoking at least one cigarette per day or at least one gram of tobacco per day on weekdays. Former smokers were individuals who reported ever having smoked cigarettes but who were not current smokers. Never smokers were individuals who had never reported smoking cigarettes. Current or former pipe and cigar smokers who did not also smoke cigarettes were excluded from all analyses. In the current analyses, we included 5,263 participants with rs16969968 genotype and smoking data available.

Ethics
ELSA has been approved by the National Research Ethics Service and all participants have given informed consent.

Funding and acknowledgements
ELSA is funded by the National Institute on Aging in the US (R01 AG017644; R01AG1764406S1) and by a consortium of UK Government departments (including: Department for Communities and Local Government, Department for Transport, Department for Work and Pensions, Department of Health, HM Revenue and Customs and Office for National Statistics).

Additional information
The analyses are not adjusted for principal components. The results have not been published previously.

FINRISK
The National Finland Cardiovascular Risk Study (FINRISK) is a large population survey on risk factors of non-communicable diseases in Finland [15]. Every five years since 1972, random samples of the population, stratified by area, sex and age, have been drawn from the Population Register. In these analyses, data from the FINRISK 1992, 1997, 2002 and 2007 surveys were used. The age range of the participants was 25 to 64 years in study years 1992 and 1997 and 25 to 74 years in study years 2002 and 2007. Surveys have included a self-administered questionnaire, a physical examination and a blood draw for laboratory analyses and extraction of DNA.
Genotyping
DNA was derived from whole blood samples, which were frozen immediately at the clinical study sites. The samples were transferred to the National Institute of Health and Welfare, where the DNA was extracted. Genotyping of rs16969968 (CHRNA5 D398N) was done under standard protocols of iPLEX Gold technology on the MassARRAY System (Sequenom, San Diego, CA, USA). The success rate was >0·99 and the genotype distribution was in Hardy-Weinberg equilibrium (HWE). The minor allele frequency was 0·32.
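For orientation, the genotype proportions expected under Hardy-Weinberg equilibrium follow directly from the minor allele frequency. The sketch below is illustrative only (the function name is hypothetical) and computes these expected proportions. It does not reproduce the HWE test used in the study, which is not described here.

```python
def hwe_expected_frequencies(maf):
    """Expected genotype frequencies under Hardy-Weinberg equilibrium.

    maf: minor allele frequency q; the major allele frequency is p = 1 - q.
    Returns (major homozygote, heterozygote, minor homozygote) frequencies,
    i.e. (p^2, 2pq, q^2).
    """
    q = maf
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


# Reported minor allele frequency of rs16969968 in FINRISK.
hom_major, het, hom_minor = hwe_expected_frequencies(0.32)
```

With a minor allele frequency of 0·32, roughly 46% of participants would be expected to carry no minor allele, 44% one, and 10% two.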
Hay fever, asthma, and allergic sensitization
Diagnoses of hay fever and asthma were based on self-report. Participants who answered "last week", "last month" or "last year" to the question "When have you last used hay fever medication" were classified as having hay fever. Likewise, participants with a positive answer to the question "Has a doctor ever told you that you had asthma?" were classified as having asthma. Data on hay fever were available in the FINRISK 1997, 2002 and 2007 surveys, whereas asthma data were available in the FINRISK 1992 survey as well. Allergic sensitization was not determined.

Smoking status
In the same questionnaire, respondents were asked whether they had ever smoked. Those stating that they had never smoked were categorized as never smokers and skipped the other smoking-related questions. Ever smokers were defined as those who had smoked at least 100 cigarettes in their lifetime. Further questions were used to classify ever smokers as current or former smokers. Former smokers reported having been either regular or occasional smokers but were not smoking currently. For the current analyses, only those who had quit over 6 months ago were included in the former smoker category. Current smokers reported regular or daily smoking and had smoked on the day of the assessment or the previous day. Exclusive pipe or cigar smokers were excluded from all analyses. To create a variable for smoking quantity, the participants were asked to indicate the average number of both manufactured and self-rolled cigarettes they smoked, or had smoked per day before quitting. Manufactured and self-rolled cigarettes were totalled for the analysis.
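The classification rules above can be summarized as a small decision function. This is a minimal sketch under stated assumptions: the function and argument names are hypothetical, the "regular or daily smoking, smoked today or yesterday" criterion is collapsed into a single boolean, and respondents who fit none of the described categories (for example, fewer than 100 lifetime cigarettes, or quitting within the last 6 months) are returned as None because the text does not say how they were handled.

```python
def classify_smoking(ever_smoked, lifetime_100_cigarettes=False,
                     smokes_currently=False, months_since_quit=None,
                     exclusive_pipe_or_cigar=False):
    """Return 'never', 'former', 'current', or None (excluded/unclassified)."""
    if not ever_smoked:
        return "never"
    if exclusive_pipe_or_cigar:
        return None  # exclusive pipe or cigar smokers were excluded
    if not lifetime_100_cigarettes:
        return None  # assumption: does not meet the ever-smoker definition
    if smokes_currently:
        return "current"
    if months_since_quit is not None and months_since_quit > 6:
        return "former"
    return None  # quit 6 months ago or less, or quit date unknown
```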

Additional information
The analyses are not adjusted for principal components. The data concerning the smoking-associated SNP and allergy have not been published previously.

GOYA Females
The GOYA females consist of mothers from a birth cohort randomly selected according to body mass index (BMI) distribution. The GOYA females were derived from the Danish genome-wide association study GOYA (Genomics of Overweight in Young Adults), nested within the Danish National Birth Cohort (DNBC) [16][17]. The DNBC is a collection of data on 92,274 pregnant women recruited between 1996 and 2002 at their first antenatal visit to their general practitioner. Women participated in four telephone interviews (at 16 and 30 weeks of gestation and at 6 and 18 months after birth) and in a questionnaire-based follow-up 7 years after birth. They also provided two blood samples during pregnancy. The GOYA females used in this study were drawn as a random cohort sample from the 67,863 women within the DNBC who provided information about pre-pregnancy BMI, gave birth to a live-born singleton infant and provided a blood sample during pregnancy. This comprised 2,542 women.

Hay fever, asthma, and allergic sensitization
At approximately 16 weeks of gestation, women were asked whether they had ever had allergic rhinitis diagnosed by a doctor (0·6% missing) and whether they had ever had asthma diagnosed by a doctor (5·3% missing). Allergic sensitization was not determined.

Smoking status
Women were asked about their present smoking status and any smoking during pregnancy at approximately 16 weeks of gestation. Women were therefore classified as 'current' (current + any smoking in pregnancy) or 'never/former' smokers (no smoking at any time in pregnancy). In the current analyses, there were 2,016 participants with data on rs1051730 genotype and smoking habits; of these, 2,009 women had information on asthma and 1,897 had information on hay fever available.

Ethics
The study was approved by the regional scientific ethics committee and by the Danish Data Protection Board. All participants provided written informed consent.

Additional information
The analyses are not adjusted for principal components. The data concerning the smoking-associated SNP and allergy have not been published previously.

GOYA Males
Genomics of Overweight in Young Adults (GOYA) males is a longitudinal case-cohort (obese, non-obese) study comprising a randomly selected (1%) control group and all extremely overweight men identified among 362,200 Caucasian men examined at a mean age of 20 years at the draft boards in Copenhagen and its surrounding areas during 1943-1977. Obesity was defined as 35% overweight relative to a local standard in use at the time (mid-1970s), corresponding to a BMI ≥31·0 kg/m2, which proved to be above the 99th percentile. All of the obese men and 50% of the randomly sampled controls who were still living in the region were invited to a follow-up survey in 1992-94 at a mean age of 46 years, at which time blood samples were taken; genotyping was performed for a total of 673 extremely overweight men and 792 controls. With a sampling fraction of 0·5% (50% of 1%), the controls represent about 158,000 men, among whom the case group was the most obese. In the current study, information from the cohort part, comprising 789 individuals with non-missing data, was utilized.
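The sampling-fraction arithmetic in this paragraph can be checked with a few lines. The variable names are illustrative; the figures are those given in the text.

```python
initial_fraction = 0.01   # 1% random sample of the examined men
invited_fraction = 0.50   # 50% of that sample invited to follow-up
n_controls = 792          # controls genotyped at follow-up

# Overall sampling fraction: 50% of 1% = 0.5%.
sampling_fraction = initial_fraction * invited_fraction

# Number of men in the source population that the controls stand for.
represented_men = n_controls / sampling_fraction
```

Dividing the 792 controls by the 0·5% sampling fraction gives 158,400, consistent with the "about 158,000 men" stated above.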

Genotyping
Genome-wide genotyping on the Illumina 610k quad chip was carried out at the Centre National de Génotypage (CNG), Evry, France. We excluded SNPs with minor allele frequency <1%, >5% missing genotypes or which failed an exact test of Hardy-Weinberg equilibrium (HWE) in the controls. We also excluded any individual who did not cluster with the CEU individuals (Utah residents with ancestry from northern and western Europe) in a multidimensional scaling analysis seeded with individuals from the International HapMap release 22. Rs1051730 was extracted from the GWAS dataset with a call rate of 99·9%.

Hay fever, asthma, and allergic sensitization
Asthma was evaluated by questionnaire with the question: "Does food, medicine, or grass give you asthma?" Hay fever was evaluated by the question: "Does food, medicine or grass give you hay fever?" Allergic sensitization was not determined.

Ethics
The study was approved by the regional scientific ethics committee and the Danish Data Protection Board with consent from the participants.

Funding and acknowledgements
The GOYA study was conducted as part of the activities of the Danish Obesity Research Centre (DanORC, www.danorc.dk) and the MRC Centre for Causal Analyses in Translational Epidemiology (MRC CAiTE). The genotyping for GOYA was funded by the Wellcome Trust (WT 084762). GOYA is a nested study within the Danish National Birth Cohort, which was established with major funding from the Danish National Research Foundation. Additional support for this cohort has been obtained from the Pharmacy Foundation, the Egmont Foundation, the March of Dimes Birth Defects Foundation, the Augustinus Foundation, and the Health Foundation. TSA was supported by the Gene Diet Interactions in Obesity (GENDINOB, www.gendinob.dk) postdoctoral fellowship grant. LP is funded by an MRC Population Health Scientist Fellowship (MR/J012165/1).

Additional information
The analyses are not adjusted for principal components, but ethnic outliers and related individuals were already excluded during post-genotyping quality control. Some data were previously published [16].

Health2006
The Health2006 study took place during 2006-2008 and consisted of a random sample of 7,931 Danish (Danish nationality and born in Denmark) men and women aged 18-69 years invited to participate in a health examination [18]. A total of 3,471 (43·8%) participated. Potential participants living in the Copenhagen area were identified in the central Danish Civil Registration System and then recruited by invitation. The aim was to investigate the prevalence and risk factors of chronic conditions such as mental health problems, asthma, allergies, cardiovascular disease, and diabetes.

Genotyping
Blood samples were taken from all participants as part of their health examination. The buffy coat was frozen for DNA extraction, and later genomic DNA was extracted using a Qiagen AutoPure LS system. Genotyping was performed using KBiosciences allele-specific PCR (KASPar) (KBiosciences, Hoddesdon, UK). The call rate for this SNP (rs1051730) was > 99·2%. No errors were observed in 370 duplicate samples.

Hay fever, asthma, and allergic sensitization
We used the ADVIA Centaur sIgE assay (Bayer Corporation, New York, NY) to test serum specific IgE to mite (Dermatophagoides [D.] pteronyssinus), grass, cat, and birch [19]. The specific IgE analysis was positive if the measurement was ≥0·35 kU/l. Allergic sensitization was defined as one or more positive tests for specific IgE against the allergens. Classification of hay fever and asthma was done by the questions: "Has a doctor ever told you that you had/have hay fever?" and "Has a doctor ever told you that you had/have asthma?".

Smoking status
Information on smoking status was collected by a self-administered questionnaire to be filled in at home prior to the health examination. Smoking status was recorded as never, former, occasional (<1 cigarette, cheroot, cigar, or pipe per day) and daily smokers. Occasional smokers as well as daily smokers smoking exclusively cheroots, cigars, or pipe were excluded from all analyses. Smoking heaviness among daily smokers was recorded as number of cigarettes per day. In the current analyses, we included 3,143 participants with rs1051730 genotype, allergic respiratory disease and smoking data available.

Ethics
The Health2006 study was approved by the Ethical Committee of Copenhagen (KA-20060011) and the Danish Data Protection Agency. Informed written consent was obtained from all participants.

Funding and acknowledgements
The Health2006 study was financially supported by grants from the Velux Foundation; the Danish Medical Research Council, Danish Agency for Science, Technology and Innovation; the Aase and Ejner Danielsens Foundation; ALK-Abelló A/S (Hørsholm, Denmark), Timber Merchant Vilhelm Bangs Foundation, MEKOS Laboratories (Denmark) and Research Centre for Prevention and Health, the Capital Region of Denmark.

Additional information
The analyses are not adjusted for principal components. The data concerning the smoking-associated SNP and allergy have not been published previously.

Health2008
The Health2008 study is a study of health and chronic disease initiated in 2008 and completed in 2009 [20][21][22]. Participants were recruited from the Danish Central Personal Register as random samples of the background population living in the western part of the Copenhagen Region [22]. A total of 2,218 persons aged 30-60 years were invited. Pregnant women and persons with known diabetes, chronic obstructive pulmonary disease, cardiovascular disease, hypertension, a history of blood clots, or an inability to participate in physical activities such as climbing stairs were excluded from the study. A total of 795 persons participated (participation rate 36%). All studies included comprehensive questionnaire and interview data as well as clinical and biochemical data [22].

Genotyping
DNA was extracted from blood samples taken from all participants as part of their health examination. Genotyping was performed with the Illumina Human Exome BeadChip (version 1.2) using the Illumina HiScan (Illumina, San Diego, CA). Genotypes were called using the genotyping module (version 1.9.4) of GenomeStudio software (version 2011.1; Illumina). The call rate for this SNP (rs16969968) was >99%.

Hay fever, asthma, and allergic sensitization
Classification of hay fever and asthma was done by the questions: "Has a doctor ever told you that you had hay fever?" and "Has a doctor ever told you that you had asthma?". In the Health2008 study, we used the ADVIA Centaur sIgE assay (Bayer Corporation, New York, NY) to test serum specific IgE to mite (Dermatophagoides [D.] pteronyssinus), grass, cat, and birch. Allergic sensitization was defined as serum specific IgE positivity to one or more of the tested allergens.

Smoking status
Information on smoking status was collected by a self-administered questionnaire to be filled in at home prior to the health examination. Smoking status was recorded as never, former, occasional (<1 cigarette, cheroot, cigar, or pipe per day) and daily smokers. Occasional smokers as well as daily smokers smoking exclusively cheroots, cigars, or pipe were excluded from all analyses. Smoking heaviness among daily smokers was recorded as number of cigarettes per day. In the current analyses, we included 618 participants with rs16969968 genotype, allergic sensitization and smoking data available.

Ethics
The study was approved by the Ethics Committee of Copenhagen and the Danish Data Protection Agency. We followed the recommendations of the Declaration of Helsinki, and each participant gave informed written consent.

Funding and acknowledgements
This work was supported by the Timber Merchant Vilhelm Bang's Foundation, the Danish Heart Foundation (Grant number 07-10-R61-A1754-B838-22392F), and the Health Insurance Foundation (Grant number 2012B233). Tea Skaaby was supported by a grant from the Lundbeck Foundation (Grant number R165-2013-15410).

Additional information
The analyses are not adjusted for principal components. The data concerning the smoking-associated SNP and allergy have not been published previously.

HUNT2
The second wave of the HUNT Study in Norway (HUNT 2) took place in 1995-97, when all adults aged 20 years and older in Nord-Trøndelag County were invited to participate. A total of 65,237 (70%) accepted the invitation and gave written informed consent to use the data for medical research. The data collection included questionnaires, clinical measurements and blood samples (http://www.ntnu.edu/hunt/data/que).

Genotyping
Altogether 56,664 participants were genotyped for the rs1051730 single nucleotide polymorphism. DNA was extracted from blood samples for all participants of the HUNT 2 study and stored at the HUNT biobank. The rs1051730 polymorphism was genotyped at the HUNT biobank using TaqMan genotyping assays (Applied Biosystems, Foster City, CA, USA), performed on an Applied Biosystems 7900HT Fast Real-Time PCR System using 10 ng of genomic DNA. The call rate cut-off was set to 90%. The genotyping success rate was 98·6% and the quality score for each individual genotype was >90 (mean 99·7). Genotype frequencies were in agreement with HapMap data.

Hay fever, asthma, and allergic sensitization
Hay fever was defined as a confirmative answer to the question: "Do you have hay fever or nasal allergies?" Asthma was defined as a confirmative answer to the question: "Do you have or have you had asthma?". Allergic sensitization was not determined.

Smoking status
Smoking status was measured as a categorical variable from self-completed questionnaire data, and the participants were classified as never smokers, former smokers or current smokers. Current smokers were asked how many cigarettes they smoked per day, the age at which they started smoking and, where applicable, the age at smoking cessation. Exclusive pipe and/or cigar smokers were excluded from the analyses. In the current analyses, we included 43,211 participants aged 19 to 101 years with rs1051730 genotype, allergic respiratory disease and smoking data available.

Ethics
Use of data in the present study was approved by the Regional Committee for Medical Research Ethics (Reference nr. 2013/1127/REK midt). Participants gave written informed consent.

Funding and acknowledgements
Nord-Trøndelag Health Study (The HUNT Study) is a collaboration between HUNT Research Centre (Faculty of Medicine, Norwegian University of Science and Technology NTNU), Nord-Trøndelag County Council and the Norwegian Institute of Public Health.

Additional information
The analyses were not adjusted for principal components. Associations between the SNP and e.g., lung cancer have previously been published 23 .

Inter99
The Inter99 study is a randomised controlled trial (CT00289237, ClinicalTrials.gov) investigating the effects of lifestyle intervention on CVD (N=61,301) 24 . We used baseline data from a random subsample of 12,934 men and women aged approximately 30, 35, 40, 45, 50, 55, or 65 years invited to participate in a health examination during 1999-2001. Participants were living in the Copenhagen area and were identified in the central Danish Civil Registration System, and recruited by invitation. Only participants with a Northern European origin (Denmark, Norway, Sweden, Iceland, and Faeroe Islands) were included in the present study.

Genotyping
DNA was extracted from blood samples taken from all participants as part of their health examination. Genotyping was performed using KBiosciences allele-specific PCR (KASPar) (KBiosciences, Hoddesdon, UK). The call rate for this SNP (rs1051730) was > 98·8%. No errors were observed in 353 duplicate samples.

Hay fever, asthma, and allergic sensitization
Asthma was defined as a positive answer to the question: "Has a doctor ever told you that you had asthma?" Serum samples were analyzed for specific IgE to mite (D. pteronyssinus), grass, cat, and birch by the IMMULITE 2000 Allergy Immunoassay System 25 . The specific IgE analysis was positive if the measurement was ≥0·35 kU/l. Allergic sensitization was defined as a positive test for specific IgE against any of these allergens. Hay fever was not determined.
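The sensitization rule described for Inter99 can be sketched in Python as follows. This is a minimal illustration and not the study's own code: the function name `is_sensitized`, the dictionary layout and the allergen keys are assumptions made for the example. Only the 0·35 kU/l cut-off and the four allergens come from the text.

```python
# Illustrative sketch only: names and data layout are assumptions, not study code.
POSITIVE_CUTOFF_KU_PER_L = 0.35  # cut-off stated in the text
ALLERGENS = ("mite", "grass", "cat", "birch")  # allergens named in the text


def is_sensitized(specific_ige):
    """Return True if any measured specific IgE is at or above the cut-off.

    `specific_ige` maps allergen name -> concentration in kU/l. Allergens
    missing from the mapping are ignored (an assumption of this sketch).
    """
    return any(
        specific_ige[allergen] >= POSITIVE_CUTOFF_KU_PER_L
        for allergen in ALLERGENS
        if allergen in specific_ige
    )
```

A participant with a single value at the cut-off, for example `{"mite": 0.35, "grass": 0.1}`, would be classified as sensitized under this sketch.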

Smoking status
Information on smoking status was collected by a self-administered questionnaire to be filled in at home prior to the health examination. Smoking status was recorded as never, former, occasional (<1 cigarette, cheroot, cigar, or pipe per day) and daily smokers. Occasional smokers as well as daily smokers smoking exclusively cheroots, cigars, or pipe were excluded from all analyses. Smoking heaviness among daily smokers was recorded as number of cigarettes per day. In the current analyses, we included 4,991 participants with rs1051730 genotype, asthma, allergic sensitization, and smoking data available.

Ethics
Informed written consent was obtained from all participants. The study was approved by the Ethical Committee of Copenhagen.

Funding and acknowledgements
Data collection in the Inter99 study was supported economically by The Danish Medical Research Council, The Danish Centre for Evaluation and Health Technology Assessment, Novo Nordisk, Copenhagen County, The Danish Heart Foundation, The Danish Pharmaceutical Association, Augustinus foundation, Ib Henriksen foundation and Becket foundation.

Additional information
The analyses are not adjusted for principal components. The data concerning the smoking-associated SNP and allergy have not been published previously.

KORA
The Cooperative Health Research in the Region of Augsburg (KORA) study is a population-based case-control study. The study participants were recruited from the third MONICA survey (S3) which was conducted in 1994-1995 in Augsburg, Germany. The objective and protocols of the MONICA surveys have been previously described 26 . Briefly, four cross-sectional health surveys (MONICA S1 to S4) were performed in the population aged 25-74 years of the city of Augsburg and two surrounding counties. In total, 4856 participants were recruited in the S3 survey in 1994/1995. The study used for these analyses is a nested case-control study comprising 1537 participants, which was performed between September 1997 and December 1998 and in which cases were defined by sensitization status (SPT/RAST). Details of the sampling frame and study design have been published earlier 27 .

Genotyping
The study participants underwent a standardized medical examination including blood draw. Genotyping was performed on the Illumina Omni 2.5 and the Illumina Omni Express platform. Genotypes were called with Genome Studio and annotated to NCBI build 37. The call rate for this SNP (rs1051730) was ≥ 98%. (Before imputation, SNPs with call rates <98% were excluded. Imputation was performed with IMPUTE v2.3.0 using the 1000G phase 1 (v3) reference panel) 28 .

Hay fever, asthma, and allergic sensitization
Information on hay fever was requested using a self-administered questionnaire. The participants were asked whether hay fever was ever diagnosed by a doctor. Information on asthma was requested using a self-administered questionnaire. The participants were asked whether asthma was ever diagnosed by a doctor. Allergen specific IgE antibodies to common aeroallergens (grass and birch pollen, house dust mite, cat and Cladosporium) were determined by the fluorescence enzyme immunoassay technique (CAP-FEIA, Pharmacia, Uppsala, Sweden). Allergic sensitization was defined as serum specific IgE sensitivity (≥ 0·35 kU/l) against at least one inhalant allergen.

Smoking status
Information on smoking status was collected using a self-administered questionnaire.
Smoking status was defined based on the following three questions: 1) "Do you currently smoke cigarettes?", 2) "Have you ever smoked?" and 3) "Do you smoke regularly?" Participants were classified as current smokers if they answered yes to questions 1) and 3); as former smokers if they answered no to question 1) but yes to question 2); and as never smokers if they answered no to questions 1) and 2). Information on smoking heaviness was requested by the question "How many cigarettes per day?".
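The three-question rule used in KORA can be written as a small decision function. The sketch below is an illustration under stated assumptions: the function name `classify_smoking` and the boolean encoding of the answers (True = yes) are hypothetical, and the text does not say how answer combinations outside the three definitions were handled, so the sketch returns `None` for them.

```python
# Illustrative sketch only: the name and boolean encoding are assumptions.
def classify_smoking(currently_smokes, ever_smoked, smokes_regularly):
    """Classify smoking status from the three yes/no answers (True = yes).

    Returns "current", "former", "never", or None when the answers match
    none of the three definitions given in the text (for example a current
    but non-regular smoker); the handling of such cases is not stated.
    """
    if currently_smokes and smokes_regularly:
        return "current"
    if not currently_smokes and ever_smoked:
        return "former"
    if not currently_smokes and not ever_smoked:
        return "never"
    return None
```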

Ethics
The study was approved by the ethics committee of the Bavarian Medical Association, and written informed consent was obtained from each participant.

Funding
The KORA study was initiated and financed by the Helmholtz Zentrum München - German Research Center for Environmental Health, which is funded by the German Federal Ministry of Education and Research (BMBF) and by the State of Bavaria. Furthermore, KORA research was supported within the Munich Center of Health Sciences (MC-Health), Ludwig-Maximilians-Universität, as part of LMUinnovativ.

Additional information
The analyses were not adjusted for principal components. The data have not been published before.

MIDSPAN Family Study
The Middle-aged Span-of-Life (MIDSPAN) Family Study is an offspring cohort of one of the original MIDSPAN cohorts. It is one of four MIDSPAN population cohort studies based in Scotland 29 . The three original studies took place between 1964 and 1976. Twenty years later, in 1996, the next generation was studied when offspring of couples in the original Renfrew/Paisley Study were recruited into the Family Study. This latter group is the subject of the present analysis. Details of the study have been described previously 30 . All 2,120 participants used in the current project are of European ancestry and have data on rs1051730 genotype, smoking habits, age, sex, asthma and hay fever.

Genotyping
Genotyping was performed on an ABI PRISM 7900HT sequence detection system using a Taqman assay (Assay ID: C_9510307_20, Applied Biosystems), followed by allelic discrimination using software from Applied Biosystems (SDS V2.0).3.

Hay fever, asthma, and allergic sensitization
Participants were classified as having hay fever if they answered "Yes" to the question: "Do you suffer from, or have you ever suffered from hay fever? (Yes/No)". Participants were classified as having asthma if they answered "Yes" to the question: "Do you suffer from, or have you ever suffered from asthma?" as well as "Yes" to at least one of the following two questions: "Have you suffered from an asthma attack in the last 12 months?" and "Are you currently taking medication (puffers or inhalers) for asthma?" Allergic sensitization was not determined.
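The MIDSPAN asthma definition combines one required answer with at least one of two further answers. A minimal Python sketch of that logic is given below; the function and parameter names are hypothetical and the answers are assumed to be coded as booleans (True = "Yes").

```python
# Illustrative sketch only: names and boolean coding are assumptions.
def has_asthma(ever_asthma, attack_last_12_months, on_asthma_medication):
    """Apply the asthma definition described in the text.

    Asthma requires "Yes" to ever having asthma AND "Yes" to at least one
    of: an attack in the last 12 months, or current asthma medication.
    """
    return bool(ever_asthma and (attack_last_12_months or on_asthma_medication))
```

Under this rule, a participant who reports past asthma but neither a recent attack nor current medication is not counted as a case.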

Smoking status
The participants completed a questionnaire which included questions on smoking habit. Three categories of smoking habit were defined: never smoker, current smoker, and former smoker.

Ethics
Ethics approval was obtained from the Argyll and Clyde Health Board Local Research Ethics Committee.

Funding
The MIDSPAN Family Study was funded by The Wellcome Trust, and the NHS Research and Development Cardiovascular Research Programme. Participants gave their informed consent.

The NEO study
The NEO study was designed for extensive phenotyping to investigate pathways that lead to obesity-related diseases 36 . The NEO study is a population-based, prospective cohort study that includes 6,671 individuals aged 45-65 years, with an oversampling of individuals with overweight or obesity (BMI > 27 kg/m²). At baseline, information on demography, lifestyle, and medical history was collected by questionnaires. In addition, samples of 24-h urine, fasting and postprandial blood plasma and serum, and DNA were collected. Participants underwent an extensive physical examination, including anthropometry, electrocardiography, spirometry, and measurement of the carotid artery intima-media thickness by ultrasonography. In random subsamples of participants, magnetic resonance imaging of abdominal fat, pulse wave velocity of the aorta, heart, and brain, magnetic resonance spectroscopy of the liver, indirect calorimetry, dual energy X-ray absorptiometry, or accelerometry measurements were performed. The collection of data started in September 2008 and was completed at the end of September 2012. Participants are currently being followed for the incidence of obesity-related diseases and mortality.

Genotyping
Genotyping was performed using the Illumina HumanCoreExome chip, and the genotypes were subsequently imputed to the 1000 Genomes reference panel. The genotype calling algorithm was GenCall. Genotyping and SNP call rates were >98%.

Hay fever, asthma, and allergic sensitization
Participants were defined as having asthma if they had the general practitioner record code R96 (asthma) according to the International Classification of Primary Care. Hay fever and allergic sensitization were not determined.

Smoking status
Questionnaires on health and lifestyle factors were sent to all participants. Smoking status was classified as never-smoker, former smoker, or current smoker on the basis of a questionnaire, in which the participants could answer the question "Do you smoke?" as either: "No, I never smoked"; "No, but I did smoke in the past"; or "Yes, currently". Long-term tobacco exposure was expressed in pack years of smoking, calculated by multiplying the number of packs of cigarettes smoked per day by the number of years the participant smoked.
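The pack-year calculation described for NEO can be illustrated with a short Python sketch. The function name `pack_years` is hypothetical, and the pack size of 20 cigarettes is an assumption (a common convention); the text does not state the pack size used.

```python
# Illustrative sketch only. The pack size is an assumption, not stated in the text.
CIGARETTES_PER_PACK = 20


def pack_years(cigarettes_per_day, years_smoked):
    """Pack years = packs of cigarettes smoked per day x years smoked."""
    packs_per_day = cigarettes_per_day / CIGARETTES_PER_PACK
    return packs_per_day * years_smoked
```

For example, under the assumed pack size, 10 cigarettes per day for 30 years corresponds to `pack_years(10, 30)`, that is 0·5 packs per day times 30 years, or 15 pack years.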

Ethics
The Medical Ethical Committee of the Leiden University Medical Center (LUMC) approved the design of the study. All participants gave their written informed consent.

Funding and acknowledgements
The authors of the NEO study thank all individuals who participated in the Netherlands Epidemiology in Obesity study, all participating general practitioners for inviting eligible participants and all research nurses for collection of the data. We thank the NEO study group, Pat van Beelen, Petra Noordijk and Ingeborg de Jonge for the coordination, lab and data management of the NEO study. The genotyping in the NEO study was supported by the Centre National de Génotypage (Paris, France), headed by Jean-Francois Deleuze. The NEO study is supported by the participating Departments, the Division and the Board of Directors of the Leiden University Medical Center, and by the Leiden University, Research Profile Area Vascular and Regenerative Medicine. Dennis Mook-Kanamori is supported by Dutch Science Organization (ZonMW-VENI Grant 916.14.023).

Additional information
The analyses are adjusted for principal components (4). The data has not been published previously.

NSHD
The Medical Research Council National Survey of Health and Development (NSHD) is an on-going prospective population-based birth cohort study consisting of all births in England, Scotland and Wales in one week in March 1946 37 . The sample includes single births to married mothers whose fathers were in non-manual or agricultural occupations and a randomly selected one in four of all others, whose fathers were in manual labor. The original cohort, now 70 years of age, comprised 2,547 women and 2,815 men who have been followed up over 20 times since their birth. The data collected to date include repeat cognitive function, physical, lifestyle and anthropometric measures, as well as blood analytes and other measures. In 2006-10 the cohort underwent a particularly intensive phase of clinical assessment and biological sampling with blood and urine sampling and analysis, and cardiac and vascular imaging 38 .

Hay fever, asthma, and allergic sensitization
The study has data on self-reported hay fever and self-reported asthma. The outcome hay fever variable was derived from responses to questions about hay fever asked by research nurses at home visits in 1989 (age 43) and 1999 (age 53). In 1989, cohort members were asked whether they had ever had hay fever, and in 1999 they were asked whether they had had it in the last 10 years. The asthma variable was derived from responses to questions about asthma asked by the research nurses at home visits in 1989 and 1999. In 1989, cohort members were asked whether they had ever had asthma, and in 1999 they were asked whether they had had it in the last 10 years. Allergic sensitization was not determined.

Smoking status
Smoking status was collected during a home interview in 1999 at age 53 years by trained interviewers 40 . Current cigarette smoking status ("yes", "no") and the number of cigarettes smoked per day were obtained. Study members who provided an affirmative response to being current cigarette smokers, regardless of the quantity of cigarettes smoked per day, were classified as "smokers", while those who provided a negative response were classified as "non-smokers". Pipe and cigar smokers who did not also report cigarette smoking were excluded from analyses. In the current analyses, we included 2,484 participants with rs16969968 genotype, allergic respiratory disease and smoking data available.

Additional information
The analyses are not adjusted for principal components. Some of the observational or genetic data have previously been published in CARTA papers or elsewhere 41 .

Ethics
Ethical approval was given by the Central Manchester Research Ethics Committee, and the participants gave informed written consent.

Funding and acknowledgements
We are very grateful to the members of this birth cohort for their continuing interest and participation in the study. We would like to acknowledge the Swallow group at University College London, who performed the DNA extractions. This work was funded by the Medical Research Council [MC_UU_12019/1].

The 1936 Cohort
The 1936 Cohort is a longitudinal population-based study based on a random sample of 1,200 persons (aged 40 years at the time of the study) who in 1976 were living in 4 municipalities (Broendby, Glostrup, Herlev, and Ledoeje-Smoerum) of Copenhagen, drawn from the Danish Civil Registration. They were invited by letter to a health examination that focused on risk factors for cardiovascular disease. Enclosed was a questionnaire regarding medical history and health and lifestyle to be completed in advance. Between 1976 and 1977, a total of 1,052 participants were examined (participation rate=87·7%). In 1995-1996 all participants were invited for a re-examination where a total of 695 were examined (participation rate=66%). We use these re-examination data in the current study.

Genotyping
DNA was extracted and purified from leukocytes (LGC Genomics, Hoddesdon, UK). All participants were genotyped with the Illumina Infinium HumanCoreExome-12 BeadChip (CoreExomeChip) using the HiScan system (Illumina) at the Novo Nordisk Foundation Centre for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark. The standard pipeline in Illumina Genome Studio software was used for the genotype calling. A total of 538,448 markers on 684 individuals entered the Quality Control (QC) pipeline, where 9886 markers were removed due to a call-rate below 95% before QC on individuals could begin. A total of 28 individuals were excluded due to: 1) a call-rate below 95%, 2) extreme positive or negative inbreeding coefficients, 3) ethnic outliers using Principal Component Analysis (PCA) on ancestral markers, 4) unknown pedigree relation found by Identical By Descent (IBD) analysis, where the individual with the lowest call rate for each pedigree-pair was removed, 5) sample duplicates, and finally 6) sex discrepancy.
We used a combination of scripts written in Python (v2.7.3) and R (v3.0.1) together with the PLINK (v1.07) software in our QC pipeline. After QC, 656 individuals with 528,562 markers were ready for imputation.
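The order of the two call-rate filters in this QC pipeline (markers first, then individuals) can be illustrated with a small Python sketch. This is not the authors' pipeline, which used Python, R and PLINK scripts that are not shown in the text. The function names, the list-of-lists genotype layout and the use of `None` for a missing call are assumptions, and the other exclusion criteria (inbreeding coefficients, PCA outliers, IBD, duplicates, sex discrepancy) are not covered.

```python
# Illustrative sketch only: data layout and names are assumptions, not the
# authors' QC code. Only the two call-rate steps are shown.
def call_rate(calls):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    if not calls:
        return 0.0
    return sum(call is not None for call in calls) / len(calls)


def filter_by_call_rate(genotypes, threshold=0.95):
    """Apply the marker-level filter, then the individual-level filter.

    `genotypes` is a list of rows (one per individual), each a list of
    calls (one per marker). Returns a tuple
    (kept_individual_indices, kept_marker_indices).
    """
    n_markers = len(genotypes[0]) if genotypes else 0
    # Step 1: keep markers whose call rate across individuals meets the threshold.
    kept_markers = [
        j for j in range(n_markers)
        if call_rate([row[j] for row in genotypes]) >= threshold
    ]
    # Step 2: keep individuals whose call rate over the retained markers
    # meets the threshold.
    kept_individuals = [
        i for i, row in enumerate(genotypes)
        if call_rate([row[j] for j in kept_markers]) >= threshold
    ]
    return kept_individuals, kept_markers
```

Filtering markers before individuals, as described in the text, means that an individual's call rate is judged only on markers that passed QC.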

Hay fever, asthma, and allergic sensitization
The questionnaire used in 1995-1996 (at 60 years of age) included the following questions on atopic diseases: "Has a doctor ever told you that you had hay fever?" (self-reported hay fever if they answered "yes") and "Has a doctor ever told you that you had asthma?" (self-reported asthma if they answered "yes").
Measurement of aeroallergen sensitization was performed using the ADVIA Centaur® Allergy Screen assay (Bayer HealthCare Diagnostics division, Tarrytown, N.Y., USA), a multi-allergen assay for the qualitative detection of serum IgE antibodies specific to common inhalant allergens. The test includes a total of 19 common inhalant allergens. Allergic sensitization was defined as a positive result of the dichotomized assay output.

Smoking status
Smoking status assessed at the re-examination in 1995-1996 was classified as "never smokers", "former smokers", and "current smokers" according to the answers to the questions "Do you smoke?" and "If you don't smoke now, have you smoked before?".
Smoking heaviness was calculated as the sum of self-reported cigarettes with and without filter. In the current analyses, we included 557 participants with rs1051730 genotype, allergic respiratory disease, and smoking data available.

Ethics
The study was conducted according to the principles of the Declaration of Helsinki. It was approved by the Local Ethics Committee, and participants gave written informed consent.

Additional information
The analyses are not adjusted for principal components. The current associations have not been previously published.

UK Biobank
The UK Biobank is a large prospective study with over 500,000 participants from across the United Kingdom, aged 40-69 years at recruitment in 2006-2010 44 . The study has data from questionnaires, physical measures, sample assays, accelerometry, multimodal imaging, genome-wide genotyping and longitudinal follow-up for a large number of health-related outcomes. In the current study, we use the interim UK Biobank genetic data that comprise more than 150,000 samples. We have restricted the analyses to individuals genetically defined as Caucasian and to participants who are unrelated.
In additional analyses, we excluded the approximately one third of the participants who participated in the UK BiLEVE Study because they were selected according to smoking habits and lung function 45 .

Genotyping
Approximately 450,000 of the participants have been/are being genotyped using the UK Biobank Axiom array from Affymetrix. There are approximately 800,000 markers on this array. The other approximately 50,000 samples were genotyped on the closely related UK BiLEVE array. These are two very similar arrays with more than 95% common marker content.
The rs16969968 SNP was directly genotyped and showed no evidence of deviation from Hardy-Weinberg equilibrium. The analysis sample was restricted to unrelated individuals, based on a threshold of 0·05 for estimated genetic kinship, and to individuals of Caucasian genetic ancestry using principal components analyses (PCA).
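A check for deviation from Hardy-Weinberg equilibrium is commonly done with a chi-square goodness-of-fit test on the three genotype counts. The text does not state which test was used for UK Biobank, so the Python sketch below is an assumption-laden illustration rather than the authors' method; the function name `hwe_chi_square` and its arguments are hypothetical.

```python
# Illustrative sketch only: the text does not state which HWE test was used.
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes (AA, AB, BB) and
    returns (chi2, p_value) with 1 degree of freedom. Assumes at least one
    genotyped individual.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A large p-value from such a test would be read as no evidence of deviation from equilibrium; for very large samples or rare alleles an exact test is often preferred.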

Hay fever, asthma, and allergic sensitization
Hay fever was defined as a positive answer to the question: "Has a doctor ever told you that you have had any of the following conditions? Hayfever, allergic rhinitis, or eczema". Asthma was defined as a positive answer to the question: "Has a doctor ever told you that you have had any of the following conditions? Asthma". Of note, the data include information about age at first diagnosis, which enabled us to exclude participants below 16 years at diagnosis in additional analyses. Allergic sensitization was not determined.

Smoking status
Participants were asked about current and past tobacco (cigarette, pipe, cigar or other) smoking behavior in a computerized questionnaire. A full list of the questions is available at: http://biobank.ctsu.ox.ac.uk/crystal/docs/TouchscreenQuestionsMainFinal.pdf. Two questions were asked about current and past smoking status: "Do you smoke tobacco now?" (Yes, on most or all days, Only occasionally, No, Prefer not to answer) and "In the past, how often have you smoked tobacco?" (Smoked on most or all days, Smoked occasionally, Just tried once or twice, I have never smoked, Prefer not to answer). From the answers to these two questions, individuals were categorized as current (current daily or occasional smokers), former (past daily or occasional smokers) or never smokers (individuals who had never tried tobacco or had smoked tobacco once or twice).
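The mapping from the two UK Biobank questionnaire answers to the three smoking categories can be sketched as follows. The answer strings are taken from the text; the function name `categorize_smoking`, the constant names and the choice to return `None` for "Prefer not to answer" or missing answers are assumptions of this illustration.

```python
# Illustrative sketch only: names and the handling of unclassifiable answers
# are assumptions, not taken from the study.
CURRENT_ANSWERS = {"Yes, on most or all days", "Only occasionally"}
PAST_SMOKER_ANSWERS = {"Smoked on most or all days", "Smoked occasionally"}
PAST_NEVER_ANSWERS = {"Just tried once or twice", "I have never smoked"}


def categorize_smoking(current_answer, past_answer):
    """Return "current", "former", "never", or None if not classifiable."""
    if current_answer in CURRENT_ANSWERS:
        return "current"  # current daily or occasional smokers
    if current_answer == "No":
        if past_answer in PAST_SMOKER_ANSWERS:
            return "former"  # past daily or occasional smokers
        if past_answer in PAST_NEVER_ANSWERS:
            return "never"  # never tried, or smoked once or twice
    return None
```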

Additional information
The analyses are adjusted for 15 principal components. The allergy- and smoking-associated SNP associations have not been previously published.

Ethics
Each participant has given informed consent. An independent Ethics and Governance Council oversees adherence to the Ethics and Governance Framework 44 .

Funding and acknowledgements
UK Biobank has received funding from the UK Medical Research Council, Wellcome Trust, Department of Health, British Heart Foundation, Diabetes UK, Northwest Regional Development Agency, Scottish Government, and Welsh Assembly Government. As described in the manuscript, the MRC and Wellcome Trust played a key role in the decision to establish UK Biobank, a large, population-based, prospective, open access resource that would allow detailed investigations of the genetic and environmental determinants of the diseases of middle and old age. The MRC, Wellcome Trust, Department of Health, and Scottish Chief Scientist Office each have a representative on the UK Biobank Board. The MRC and Wellcome Trust fund the independent Ethics and Governance Council 44 .

Whitehall II
Whitehall II is a cohort study that recruited 10,308 participants (70% men) between 1985 and 1988 from 20 London-based Civil Service departments [83]. Genetic samples were collected in 2004 from over 6,000 participants. The study is highly phenotyped for cardiovascular and other ageing-related health outcomes, with 9 phases of follow-up (5 with clinical assessment and biological sampling) over 20 years. A wide variety of health behaviour and environmental data are also collected, and the participants have consented to linkage to recorded clinical data such as Hospital Episode Statistics (HES), the Office of National Statistics mortality data and the national registry of acute coronary syndromes in England and Wales (Myocardial Ischaemia National Audit Project).

Genotyping
Genotyping of rs16969968 was performed as part of genotyping using the Metabochip 46 .

Hay fever, asthma, and allergic sensitization
Hay fever and asthma were measured at phase 1. Hay fever was defined as a positive answer to the question: "There are some kinds of health problems that keep recurring and some that people have all the time. In the last 12 months have you suffered from any of the following health problems? Hay fever". Asthma was defined as a positive answer to the question: "There are some kinds of health problems that keep recurring and some that people have all the time. In the last 12 months have you suffered from any of the following health problems? Asthma". Allergic sensitization was not determined.

Smoking status
Information on smoking status was collected by questionnaire during the first phase of data collection. Individuals who reported smoking cigars or pipes but not cigarettes were excluded from all analyses.

Additional information
The analyses are not adjusted for principal components.

Ethics
Ethical approval for the Whitehall II study was obtained from the University College London Medical School committee on the ethics of human research. Informed consent was gained from every participant.

Funding and acknowledgements
The Whitehall II study has been supported by grants from the Medical Research Council

SHIP
The Study of Health in Pomerania (SHIP) is a population-based cohort study in the German region of West Pomerania. SHIP was at first planned as a cross-sectional study 47 . Examinations were performed in centers located at Stralsund and Greifswald between the 16th of October 1997 and the 19th of May 2001. SHIP-0 had a response of 68·8%. The response in women was slightly higher (69·4%) than in men (68·2%). Participation in the different age groups for women ranged from 76·6% (participants aged between 50 and 60) to 49·5% (participants aged between 70 and 80) and for men from 74·3% (participants aged between 50 and 60) to 63·2% (participants aged between 70 and 80) 48 .

Genotyping
Genotyping was performed using the Human SNP 6.0 Array (Affymetrix, Santa Clara, CA, USA). Hybridization of genomic DNA was performed according to the manufacturer's standard recommendations. Genotypes were determined using the Birdseed2 clustering algorithm. All remaining arrays had a sample call rate > 92 %.

Hay fever, asthma, and allergic sensitization
Participants were classified as having hay fever if they answered affirmatively to the question: "Do you sometimes or all the time suffer from hay fever?" The participants were defined as having allergic asthma or not according to self-reported questionnaire information. Allergic sensitization was not determined.

Smoking status
The participants completed an interview-based questionnaire which included questions on smoking habit. Three categories of smoking habits were defined: never, former and current smokers.

Additional information
The analyses are not adjusted for principal components.

Ethics
SHIP was planned and accompanied with support and advice from an external Data Safety and Monitoring Committee (DSMC). Each participant gave written informed consent. The study conformed to the principles of the Declaration of Helsinki and was approved by the Ethics Committee of the University of Greifswald.

Funding and acknowledgements
The Study of Health in Pomerania (SHIP) is part of the Research Network of Community Medicine (www.community-medicine.de).

SHIP TREND
SHIP-TREND is the second cohort of the population project Study of Health in Pomerania (SHIP). Baseline data were collected between 2008 and 2012 48 .

Genotyping
Genotyping was performed using the Human SNP 6.0 Array (Affymetrix, Santa Clara, CA, USA). Hybridization of genomic DNA was performed according to the manufacturer's standard recommendations. Genotypes were determined using the Birdseed2 clustering algorithm. All remaining arrays had a sample call rate > 92 %.

Hay fever, asthma, and allergic sensitization
In the standardized interview, participants were asked whether a physician had ever diagnosed them with allergy, and if so, which type of allergy. Participants who answered "allergy to house dust mite" or "pollen allergy" were defined as having hay fever in the current study. Participants who reported having bronchial asthma were defined as having asthma. Allergic sensitization was not determined.

Smoking status
The participants completed an interview-based questionnaire which included questions on smoking habit. Three categories of smoking habits were defined: never, former and current smokers.

Additional information
The analyses are not adjusted for principal components.

Ethics
SHIP was planned and accompanied with support and advice from an external Data Safety and Monitoring Committee (DSMC). Each participant gave written informed consent. The study conformed to the principles of the Declaration of Helsinki and was approved by the Ethics Committee of the University of Greifswald.

Funding and acknowledgements
The Study of Health in Pomerania (SHIP) is part of the Research Network of Community Medicine (www.community-medicine.de).

Main supplementary tables
Definitions of hay fever, asthma and allergic sensitization by study (NA: not determined)
[...] Hay fever: [...] or "Yes, in the past, not recently" to the question: "Have you ever had any of the following problems: hay fever?" Asthma: Answering "Yes, had it recently (in the past year)" or "Yes, in the past, not recently" to the question: "Have you ever had any of the following problems: [...]
SHIP. Asthma: The participants were defined as having allergic asthma or not according to self-reported questionnaire information. Allergic sensitization: NA.
SHIP TREND. Hay fever: Self-reported doctor-diagnosed "allergy to house dust mite" or "pollen allergy". Asthma: Participants who reported to have bronchial asthma. Allergic sensitization: NA.

Figure S1-S3
Age- and sex-adjusted association of smoking status with hay fever, asthma and allergic sensitization. Odds ratio compared to never smokers.

Figure S8
Age- and sex-adjusted association of genotype and smoking heaviness. Cigarettes per day per smoking-increasing allele.

Crude analyses

Figure S9-S11
Crude association of smoking status with hay fever, asthma and allergic sensitization. Odds ratio compared to never smokers.

Figure S12
Crude association of smoking heaviness with hay fever, asthma and allergic sensitization. Odds ratio per cigarette per day.

Figure S13-S15
Mendelian randomization analysis of the crude associations of rs1051730/rs16969968 and hay fever (N=208,365), asthma (N=231,013) and allergic sensitization (N=17,623). Odds ratio per smoking-increasing allele.

Figure S16
Crude association of genotype and smoking heaviness. Odds ratio per smoking-increasing allele.

Figure S19-S20
Age- and sex-adjusted association of the rs1051730/rs16969968 SNP with hay fever and asthma in selected samples of the UK Biobank data. Odds ratio per smoking-increasing allele.

Figure S23-S24
Age- and sex-adjusted association of the rs1051730/rs16969968 SNP with hay fever and asthma, excluding the ALSPAC Children. Odds ratio per smoking-increasing allele.