Abstract
Cigarette smoking contributes to numerous diseases and is one of the leading causes of death in the United States. Smoking behaviors vary widely across race/ethnicity, but it is not clear why. Here, we examine the contribution of genetic ancestry to variation in two smoking-related traits in 43,485 individuals from four race/ethnicity groups (non-Hispanic white, Hispanic/Latino, East Asian, and African American) from a single U.S. healthcare plan. Smoking prevalence was the lowest among East Asians (22.7%) and the highest among non-Hispanic whites (38.5%). We observed significant associations between genetic ancestry and smoking-related traits. Within East Asians, we observed higher smoking prevalence with greater European (versus Asian) ancestry (P = 9.95 × 10−12). Within Hispanic/Latinos, higher cigarettes per day (CPD) was associated with greater European ancestry (P = 3.34 × 10−25). Within non-Hispanic whites, the lowest number of CPD was observed for individuals of southeastern European ancestry (P = 9.06 × 10−5). These associations remained after considering known smoking-associated loci, education, socioeconomic factors, and marital status. Our findings support the role of genetic ancestry and socioeconomic factors in cigarette smoking behaviors in non-Hispanic whites, Hispanic/Latinos, and East Asians.
Introduction
Cigarette smoking contributes to numerous common diseases, including cancers, chronic obstructive pulmonary disease, and cardiovascular diseases, and it is one of the leading causes of death in the United States1,2,3,4,5,6. Despite the substantial decrease in cigarette smoking prevalence over the last half century, ~40 million people are still smokers in the United States, and disparities among smokers remain7,8. A higher prevalence of smoking has been observed in socially and economically disadvantaged populations7,9. Further, among smokers, socioeconomic status is a major determinant of the degree of nicotine dependence10, which can be approximated by the number of cigarettes smoked per day (CPD)11.
In the United States, smoking behaviors vary widely across race/ethnicity, with individuals of Asian and Hispanic/Latino ancestry having the lowest smoking prevalence compared to individuals of other ancestries7,8. The reasons for these disparities may include variation in genetic ancestry, which has the potential to explain variation in smoking behaviors between Asian and Hispanic/Latino ancestry populations and other populations. However, to date, no study has investigated the role of genetic ancestry in smoking behavior-related traits.
Twin and family studies suggest that genetic factors accounted for approximately half of the variance in smoking initiation and smoking quantity, and heritable variation in cigarette use seems comparable across ethnic groups12,13,14. Recently, the GWAS and Sequencing Consortium of Alcohol and Nicotine Use (GSCAN) study15 conducted in European ancestry individuals reported 467 genetic variants associated with cigarette smoking-related traits, including age at smoking initiation, smoking initiation, smoking cessation, and CPD.
Here, we hypothesize that genetic ancestry may explain some of the wide variability in cigarette smoking behaviors across ethnic groups. To test this hypothesis, we conduct genetic ancestry analyses of cigarette smoking behaviors within each of the four ethnic groups (non-Hispanic whites, Hispanic/Latinos, East Asians, and African Americans) from the Genetic Epidemiology Research in Adult Health and Aging (GERA) cohort16. Two smoking-related traits were used: smoking initiation (15,862 ‘ever’ smokers vs. 27,623 ‘never’ smokers) and CPD for all smokers (i.e., 2271 ‘current’ + 13,591 ‘former’ smokers). We then investigate whether genetic ancestry associations are: (1) due to genetically determined smoking-related traits based on known smoking genetic variants15; and (2) modified by education, socioeconomic factors such as employment/work status and household income, and marital status.
Materials and methods
Study population
Individuals were selected from the Kaiser Permanente Research Program on Genes, Environment, and Health (RPGEH) Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. The cohort consists of over 110,000 adult members of Kaiser Permanente Northern California (KPNC), ranging in age from 18 to 100 years at enrollment16. The RPGEH was established as a resource for research on genetic and environmental influences on health and disease, and participants were asked to complete a mailed survey. On this survey, participants were asked: ‘What best describes your race/ethnicity?’. Briefly, and as previously described16, self-reported race/ethnicity for each individual was derived from responses to this question, and, for individuals who reported more than one category, the selections were collapsed into race/ethnicity categories. In particular, all East Asian nationalities (i.e., Chinese, Japanese, Korean, Filipino, Vietnamese, or other Southeast Asian) were collapsed into a single East Asian group; all Latino nationalities (i.e., Mexican, Central/South American, Puerto Rican, or other Latino/Hispanic) were collapsed into a single Hispanic/Latino category; all African descent populations (i.e., African-American, African, or Africo-Caribbean) were collapsed into a single group; all white-European ethnicities (i.e., White or European-American, Middle Eastern, or Ashkenazi Jewish) were collapsed into a single non-Hispanic white group. In addition to self-reported race/ethnicity, individuals included in the current study provided self-reported information regarding their cigarette use, education, employment/work status, household income, and marital status (N = 43,485, Table 1). All study procedures were approved by the Institutional Review Board of the Kaiser Foundation Research Institute.
Smoking-related traits
Two smoking-related traits (i.e., smoking initiation and the number of CPD) were assessed based on the RPGEH survey, via the following questions: ‘Have you ever smoked one or more cigarettes per day for six months or longer?’ (yes or no); ‘Do you currently smoke, or have you stopped smoking?’ (current smoker or former smoker); and ‘On average how many packs of cigarettes do you (or did you) smoke per day?’ (< ½ pack, ½–1 pack, 1–1½ packs, or more than 1½ packs). For smoking initiation, ever (former/current) and never smokers were assigned as cases and controls, respectively. For smokers (‘former’ and ‘current’ smokers), the number of CPD, as a quantitative trait, was assessed by considering ~20 cigarettes per pack. The RPGEH survey has been shown to be successful in assessing other substance use, such as alcohol consumption: in our recent study17, we confirmed previous findings implicating the ADH1B, AUTS2, SGOL1, SERPINC1, KLB, and GCKR loci in alcohol consumption18,19,20,21.
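The trait definitions above can be illustrated with a minimal Python sketch. The text states only that CPD was derived by considering ~20 cigarettes per pack; it does not say which number of packs was assigned to each survey category. The category midpoints used here, the value chosen for the open-ended top category, and all function and variable names are therefore assumptions made for illustration, not the study's actual coding.

```python
CIGARETTES_PER_PACK = 20  # stated in the text (~20 cigarettes per pack)

# ASSUMPTION: representative packs/day for each survey category.
# Midpoints are used for the bounded categories; 1.75 packs for the
# open-ended top category is a hypothetical choice.
ASSUMED_PACKS_PER_DAY = {
    "< 1/2 pack": 0.25,
    "1/2-1 pack": 0.75,
    "1-1 1/2 packs": 1.25,
    "more than 1 1/2 packs": 1.75,
}


def cpd_from_category(category):
    """Convert a packs-per-day survey answer into cigarettes per day."""
    return ASSUMED_PACKS_PER_DAY[category] * CIGARETTES_PER_PACK


def smoking_initiation_status(ever_smoked):
    """Ever (former/current) smokers are cases; never smokers are controls."""
    return "case" if ever_smoked else "control"
```

Under these assumed midpoints, an answer of ‘½–1 pack’ would be coded as 15 CPD; a different choice of representative values would shift the CPD scale but not the ordering of the categories.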
Socioeconomic covariates
The RPGEH survey was also used to assess education, socioeconomic factors (i.e., employment/work status and household income), and marital status, via the following questions: ‘What is the highest level of school that you have completed?’; ‘What is your employment or work status?’; ‘What best describes your household income (before taxes)?’; and ‘What is your current marital status?’. Answers to these questions were combined into: (1) 4 categories for education: ‘less than high school’ which corresponds to “grade school (grades 1–8)”, ‘high school’ which combines “some high school (grades 9–11)” with “high school or GED”, ‘some college’, and ‘college degree or more’ which combines “college”, “graduate school”, and “technical/trade school”; (2) 4 categories for employment or work status: ‘full-time employed’, ‘part-time employed’, ‘unemployed’ and ‘disabled for work’; (3) 3 categories for household income: ‘<$20,000’ which corresponds to an annual household income (before taxes) <$19,999 per year, ‘$20,000 to $59,999/year’, and ‘$60,000/year or more’; and (4) 3 categories for marital status: ‘never married’, ‘married or living as married’, and ‘separated or divorced’. ‘Female’ sex, ‘college degree or more’ education, ‘$60,000/year or more’ income, ‘full-time employed’ employment, and ‘married or living as married’ marital status served as the reference groups for Model 3.
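As an illustration of the recoding described above, the following Python sketch collapses the education answers into the four analysis categories and dummy-codes a categorical covariate against its reference group. The exact answer strings and all names (`EDUCATION_RECODE`, `dummy_code`, and so on) are hypothetical; the category groupings and reference levels follow the text.

```python
# Survey answer -> analysis category, as described in the text.
# The literal answer strings are assumed renderings of the survey options.
EDUCATION_RECODE = {
    "grade school (grades 1-8)": "less than high school",
    "some high school (grades 9-11)": "high school",
    "high school or GED": "high school",
    "some college": "some college",
    "college": "college degree or more",
    "graduate school": "college degree or more",
    "technical/trade school": "college degree or more",
}

MARITAL_LEVELS = [
    "never married",
    "married or living as married",
    "separated or divorced",
]

# Reference groups used for Model 3, as listed in the text.
REFERENCE_LEVELS = {
    "sex": "female",
    "education": "college degree or more",
    "household_income": "$60,000/year or more",
    "employment": "full-time employed",
    "marital_status": "married or living as married",
}


def recode_education(answer):
    """Collapse a raw education answer into one of the four categories."""
    return EDUCATION_RECODE[answer]


def dummy_code(value, levels, reference):
    """Return 0/1 indicators for every level except the reference level."""
    return {lvl: int(value == lvl) for lvl in levels if lvl != reference}
```

With this coding, each regression coefficient for a socioeconomic covariate is interpreted relative to its reference group (for example, ‘never married’ versus ‘married or living as married’).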
Genotyping and imputation
GERA DNA samples were genotyped on four custom Affymetrix Axiom arrays that were designed for individuals of non-Hispanic white, East Asian, African American, and Latino race/ethnicity, as previously described22,23. We applied genotype quality control (QC) procedures for the GERA samples on an array-wise basis23. Briefly, we included genetic markers with an initial genotyping call rate ≥97%, genotype concordance rate >0.75 across duplicate samples, and allele frequency difference ≤0.15 between females and males for autosomal markers.
Approximately 94% of samples and more than 98% of genetic markers assayed passed QC procedures. In total, over 665,000 genotyped single nucleotide polymorphisms (SNPs)22,24 and over 15,000,000 imputed SNPs were available for analyses. The 1000 Genomes reference panel (phase I integrated release, March 2012) was used for imputation (IMPUTE2 v2.3.0, SHAPEIT v2.r72719).
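The marker-level QC thresholds described above (call rate ≥97%, duplicate concordance >0.75, and female–male allele frequency difference ≤0.15 for autosomal markers) can be expressed as a simple filter. This is a minimal sketch only: the `MarkerQC` record and its field names are hypothetical, and the full array-wise QC pipeline used in GERA23 involves additional steps not shown here.

```python
from dataclasses import dataclass


@dataclass
class MarkerQC:
    """Per-marker QC summary (hypothetical field names)."""
    call_rate: float              # fraction of samples with a genotype call
    duplicate_concordance: float  # genotype concordance across duplicate samples
    freq_female: float            # allele frequency in females
    freq_male: float              # allele frequency in males
    autosomal: bool = True


def passes_qc(marker):
    """Apply the three marker-level thresholds stated in the text."""
    if marker.call_rate < 0.97:
        return False
    if marker.duplicate_concordance <= 0.75:
        return False
    # The sex-difference check applies to autosomal markers only.
    if marker.autosomal and abs(marker.freq_female - marker.freq_male) > 0.15:
        return False
    return True
```

A marker is retained only if it satisfies all applicable criteria; for example, `passes_qc(MarkerQC(0.98, 0.99, 0.30, 0.32))` returns `True`.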
Principal component (PC) and genetic ancestry
Banda et al.16 conducted an analysis of ancestry in GERA using PC analysis (Eigenstrat v4.2), and identified 10 and 6 ancestry PCs reflecting genetic ancestry among non-Hispanic whites, and the other ethnic groups, respectively. To adjust for genetic ancestry, we also included the percentage of Ashkenazi (ASHK) Jewish ancestry as a covariate for the non-Hispanic white ethnic group analysis. For genetic ancestry analyses, for each ethnic group, we examined the effect of the first 2 PCs, which are the only ones geographically interpretable and represent geographic clines, on smoking-related traits prevalence/distribution. Each model was adjusted for additional PCs (i.e., up to 10 for non-Hispanic whites and up to 6 for the other ethnic groups). To visualize the smoking-related traits prevalence/distribution by the ancestry PCs, we created a smoothed distribution of each individual’s smoking phenotype using a radial kernel density estimate, as previously described25.
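To make the smoothing step concrete, the sketch below computes a kernel-weighted average of a phenotype at a given location in (PC1, PC2) space using a Gaussian radial kernel. It is a generic Nadaraya–Watson-style smoother written under our own assumptions: the bandwidth, the kernel form, and the function name are illustrative, and the procedure in the cited reference25 may differ in its details.

```python
import math


def radial_kernel_smooth(points, values, query, bandwidth):
    """Kernel-weighted mean of `values` at `query` in (PC1, PC2) space.

    points    : list of (pc1, pc2) coordinates, one per individual
    values    : phenotype per individual (e.g., 0/1 ever-smoker, or CPD)
    query     : (pc1, pc2) location at which to estimate the smoothed value
    bandwidth : ASSUMED Gaussian kernel width, in PC units
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (pc1, pc2), value in zip(points, values):
        # The weight depends only on the Euclidean distance to the query
        # point, which is what makes the kernel radial.
        dist_sq = (pc1 - query[0]) ** 2 + (pc2 - query[1]) ** 2
        weight = math.exp(-dist_sq / (2.0 * bandwidth ** 2))
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

Evaluating this function over a grid of (PC1, PC2) values yields a smoothed surface of smoking prevalence or mean CPD across the ancestry space. A small bandwidth follows local structure more closely, while a large bandwidth produces a flatter surface.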
Genetic risk score (GRS)
To determine if known smoking-associated SNPs could explain the ancestry effect, we repeated the ancestry analyses including a GRS for each smoking-related trait based on the findings of the largest genetic study conducted to date, including up to 1.2 million individuals with information on multiple stages of tobacco use15. To derive the GRS, we used a ‘classic’ method26 which consists of computing GRS based on a subset of SNPs exceeding a specific GWAS association P-value threshold (i.e., P ≤ 5.0 × 10−8 in Liu et al.15). The first GRS was based on 365 SNPs associated with smoking initiation at a genome-wide level of significance, and the second was based on 53 SNPs previously reported to be associated at a genome-wide level of significance with CPD15. Out of the 365 SNPs, 133 (36.4%) were confirmed to be associated with smoking initiation in GERA, including 14 at a Bonferroni-corrected alpha level of 1.37 × 10−4 (0.05/365) (Supplementary Data 1). Out of the 53 SNPs, 34 (64.1%) were confirmed to be associated with CPD in GERA, including 15 at a Bonferroni-corrected alpha level of 9.43 × 10−4 (0.05/53) (Supplementary Data 2). The GRSs were built on these known smoking-associated SNPs by summing up the additive coding of each SNP weighted by the effect size ascertained from the original study15. As the original study15 was conducted in cohorts of European ancestry, we also generated unweighted GRSs and included those in the models for each ethnic group. Results were similar using unweighted or weighted GRS in all ethnic groups (Supplementary Data 3).
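The weighted and unweighted scores and the Bonferroni thresholds described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, and the sketch assumes that each SNP's additive coding (0, 1, or 2 copies, or an imputed dosage) has already been oriented to the trait-increasing allele.

```python
def genetic_risk_score(dosages, weights=None):
    """Sum of additively coded SNP dosages for one individual.

    dosages : allele counts (0, 1, 2) or imputed dosages, one per SNP,
              ASSUMED to be oriented to the trait-increasing allele
    weights : per-SNP effect sizes from the discovery study; if None,
              the unweighted score (a simple allele count) is returned
    """
    if weights is None:
        return float(sum(dosages))
    if len(weights) != len(dosages):
        raise ValueError("need exactly one weight per SNP")
    return sum(d * w for d, w in zip(dosages, weights))


def bonferroni_alpha(n_tests, alpha=0.05):
    """Bonferroni-corrected significance level for n_tests comparisons."""
    return alpha / n_tests
```

For the two SNP sets used here, `bonferroni_alpha(365)` and `bonferroni_alpha(53)` give approximately 1.37 × 10−4 and 9.43 × 10−4, matching the thresholds quoted in the text.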
Statistical analyses
For smoking initiation, we used a logistic regression model to examine the impact of ancestry on this smoking-related trait using R version 3.4.1 with the following covariates: age, sex, and ancestry PCs (first 10 PCs for the non-Hispanic white analyses and first 6 PCs for the other ethnic groups) (Model 1). For the number of CPD, we used a linear regression model. In Model 2, in addition to all covariates included in Model 1, we added one of the two GRS described above. In Model 3, in addition to all covariates included in Model 2, we added education, socioeconomic factors, and marital status as covariates.
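The nested structure of Models 1 to 3 can be summarized by listing the covariates that enter each model. The sketch below is illustrative only (the analyses themselves were run in R): the variable names are hypothetical, and the inclusion of the Ashkenazi ancestry percentage for non-Hispanic whites follows the description in the ancestry section rather than the covariate list given for Model 1.

```python
def regression_type(trait):
    """Logistic regression for smoking initiation, linear regression for CPD."""
    return "logistic" if trait == "smoking initiation" else "linear"


def model_covariates(model, group):
    """Covariates for Model 1, 2, or 3 within one ethnic group."""
    # 10 ancestry PCs for non-Hispanic whites, 6 for the other groups.
    n_pcs = 10 if group == "non-Hispanic white" else 6
    covariates = ["age", "sex"] + [f"PC{i}" for i in range(1, n_pcs + 1)]
    if group == "non-Hispanic white":
        # Percentage of Ashkenazi Jewish ancestry (hypothetical variable name).
        covariates.append("ASHK_pct")
    if model >= 2:
        # Trait-specific genetic risk score.
        covariates.append("GRS")
    if model >= 3:
        covariates += ["education", "employment", "household_income", "marital_status"]
    return covariates
```

Because each model contains all covariates of the previous one, a change in the PC1 or PC2 coefficient between models indicates how much of the ancestry association is accounted for by the GRS (Model 2) or by education, socioeconomic factors, and marital status (Model 3).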
Results
GERA cohort and smoking behavior
The study sample consisted of 43,485 GERA participants from four ethnic groups (non-Hispanic whites, Hispanic/Latinos, East Asians, and African Americans) (Table 1). In our study, the prevalence of ‘ever’ smokers varied by ethnicity with the lowest prevalence (22.7%) for East Asians and the highest (38.5%) for non-Hispanic whites. On average, the number of cigarettes per day (CPD) smoked by non-Hispanic whites was higher (21.2 CPD) compared to the number of CPD smoked by individuals from other ethnic groups (range of 16.4–17.1 CPD). ‘Ever’ smokers were more likely to be ‘former’ smokers compared to ‘current’ smokers in all ethnic groups.
In our study, the prevalence of ‘ever’ smokers also varied by education level, employment, income level, and marital status (Supplementary Table 1). Individuals with high school education levels were more likely to have smoked compared to individuals with a college degree or higher education level (51.3% vs. 31.7%). Individuals who were disabled were more likely to have smoked compared to individuals who were part- or full-time employed (53.3% vs. 34.8–36.1%), and individuals having an annual income of $60,000 or more were less likely to have smoked compared to individuals who had an annual income of <$59,999 (34.5% vs. 43.6%). Finally, individuals who were separated/divorced were more likely to have ever smoked compared to individuals who were never married (45.7% vs. 28.9%). Similar trends were observed across the four ethnic groups (Supplementary Table 2).
Genetic ancestry and smoking behaviors
We first investigated genome-wide genetic ancestry using principal components (PCs) that were assessed within each ethnic group separately16. Genetic ancestry associations with smoking initiation and CPD were then assessed, and visual representations are provided in Figs. 1 and 2. Within non-Hispanic whites, the first two PCs represented geographically interpretable genetic ancestry, with PC1 characterizing a northwestern vs. southeastern European cline and PC2 a northeastern vs. southwestern European cline. The first two PCs were both associated with CPD (Model 1: β = 27.95, PPC1 = 0.017; β = −50.32, PPC2 = 9.06 × 10−5) (Table 2), with the lowest number of CPD observed for individuals of southeastern European ancestry (Fig. 2a). In contrast, neither PC1 nor PC2 was associated with smoking initiation within non-Hispanic whites.
Within Hispanic/Latinos, the first two PCs were also geographically interpretable, with PC1 representing greater European versus Native American ancestry and PC2 representing greater African versus European ancestry. In Hispanic/Latinos, we observed higher smoking initiation prevalence and higher CPD correlating with greater European (versus Native American) ancestry (Model 1: β = 17.67, PPC1 = 1.12 × 10−5 for smoking initiation; and β = 271.29, PPC1 = 3.34 × 10−25 for CPD) (Table 2; Figs. 1b and 2b).
In East Asians, PC1, which represents European admixture, was strongly associated with smoking initiation (Model 1: β = −23.15, PPC1 = 9.95 × 10−12) and nominally with CPD (Model 1: β = −48.22, PPC1 = 0.03). For PC2, which differentiates geographical clines across East Asia, we observed a non-linear association between smoking initiation and PC2 (Model 1: β = 10.12, PPC2 = 0.011 for smoking initiation). This non-linear association represents a U-shaped association of ancestry from north to south (or south to north) (Table 3; Fig. 1c). Recently, we reported a similar pattern of ancestry association for body mass index in East Asians27. Significant associations were also detected between PC2 and CPD (Model 1: β = 66.74, PPC2 = 3.92 × 10−3) (Fig. 2c).
In African Americans, neither PC1 (representing African vs. European ancestry) nor PC2 (representing East Asian ancestry) was associated with smoking initiation or CPD (Table 3; Figs. 1d and 2d).
Genetic ancestry and known smoking-associated loci
To determine whether the genetic ancestry associations with smoking-related traits were due to known smoking-associated loci, we repeated the ancestry analyses, including one of the two following GRS: the first GRS was based on 365 smoking initiation associated-SNPs, and the second GRS was based on 53 SNPs previously reported to be associated with CPD15. While the GRS for smoking initiation was significantly associated with smoking initiation in all four ethnic groups, the GRS for CPD was a predictor for CPD in all ethnic groups, except Hispanic/Latinos (Table 2).
In non-Hispanic whites, the genetic ancestry associations between PC1 or PC2 and CPD were not attenuated after including the GRS for CPD (Model 2: β = 34.07, PPC1 = 3.34 × 10−3; β = −50.90, PPC2 = 6.69 × 10−5) (Table 2). In Hispanic/Latinos, while the genetic ancestry association between PC1 and smoking initiation was not attenuated when including a GRS, the genetic association between PC1 and CPD was slightly attenuated (Model 2: β = 22.80, PPC1 = 4.07 × 10−8 for smoking initiation; β = 263.32, PPC1 = 2.18 × 10−23 for CPD) (Table 2). In East Asians, while the genetic ancestry association between PC1 and smoking initiation was not attenuated when including a GRS, the genetic ancestry association between PC2 and smoking initiation was slightly attenuated (Model 2: β = −24.06, PPC1 = 2.05 × 10−12; β = 9.10, PPC2 = 0.022 for smoking initiation) (Table 3). Further, in East Asians, while the genetic ancestry association between PC1 and CPD was no longer significant when including a GRS, the genetic ancestry association between PC2 and CPD was slightly attenuated (Model 2: β = −31.97, P = 0.15 for PC1 and β = 66.22, P = 4.07 × 10−3 for PC2) (Table 3).
Genetic ancestry associations and socioeconomic factors
To determine whether education, socioeconomic factors, and marital status explain the remaining genetic ancestry associations (after considering genetically determined smoking-related traits), we repeated the ancestry analyses, including education, employment, income level, and marital status. In non-Hispanic whites, only the genetic ancestry association between PC2 and CPD was attenuated after considering education, socioeconomic factors, and marital status (Model 3: β = −46.06, PPC2 = 2.74 × 10−4) (Table 2). In Hispanic/Latinos, while the genetic ancestry association between PC1 and smoking initiation was not attenuated when considering education, socioeconomic factors, and marital status, the genetic association between PC1 and CPD was attenuated further but not eliminated (Model 3: β = 27.85, PPC1 = 1.58 × 10−10 for smoking initiation; β = 248.22, PPC1 = 1.14 × 10−19 for CPD) (Table 2). In East Asians, the genetic ancestry association between PC1 and smoking initiation was attenuated when considering education, socioeconomic factors, and marital status, and the genetic ancestry association between PC2 and CPD was attenuated further but not eliminated (Model 3: β = −19.97, PPC1 = 1.26 × 10−8 for smoking initiation and β = 60.76, PPC2 = 9.27 × 10−3 for CPD) (Table 3).
Discussion
In this study, we observed substantial differences in cigarette smoking behaviors across race/ethnicity groups, and we found that smoking initiation and/or CPD were associated with genetic ancestry within non-Hispanic whites, Hispanic/Latinos, and East Asians. Specifically, a higher smoking initiation prevalence and higher number of CPD were associated with greater European (versus Native American) ancestry among Hispanic/Latinos and were associated with greater European (versus Asian) ancestry among East Asians. Furthermore, individuals of northwestern European ancestry had a higher number of CPD compared to individuals of southeastern European ancestry among non-Hispanic whites. No significant associations between genetic ancestry and cigarette smoking behaviors were detected in African Americans, the group with the smallest sample size. After considering genetic variants known to contribute to cigarette smoking behaviors and accounting for education, socioeconomic factors such as employment/work status and household income, and marital status, these genetic ancestry associations remained, but were attenuated. Our findings suggest that genetically determined smoking traits and socioeconomic factors can explain some of the ancestry effects in Hispanic/Latinos, East Asians, and non-Hispanic whites, and that additional factors correlated with genetic ancestry remain to be discovered.
Our results are consistent with previous studies showing disparities in adult cigarette smoking prevalence across sub-populations defined by ethnic group, education level, and socioeconomic status. Indeed, we found that East Asian and Hispanic/Latino individuals had the lowest prevalence of smoking initiation compared to non-Hispanic white and African American individuals, consistent with the previous studies7,28. Similarly, in our study, the prevalence of ‘ever’ smokers was much lower for college-educated individuals compared to those with high school education, and for individuals who earned $60,000 or more compared to those with lower income, consistent with previous studies7,28,29,30. Furthermore, in our study, married individuals had the highest prevalence of smoking cessation compared to those who were single or divorced, consistent with previous findings31.
We recognize several potential limitations of our study. First, the cigarette smoking-related traits were based on self-reported information, and no information regarding other forms of tobacco use, such as pipes, cigars, or e-cigarettes, was collected in our survey. Further, GERA cohort members are older on average compared to the general population. As older adults may consume tobacco in a different form than younger adults who may prefer e-cigarettes32,33, this may limit the generalizability of the findings to the groups represented in this study. Second, no information regarding the previous U.S. addresses of the participants included in the current study was collected. All the GERA members were living in the Northern California region at the time of survey completion; however, as smoking prevalence has been shown to vary considerably across states7,34, considering the previous U.S. addresses of the participants could identify an additional potential source of variation in smoking behavior. Third, because of the limited number of ‘current’ smokers in our sample (N = 2271), we did not consider the smoking cessation phenotype (i.e., ‘current’ vs. ‘former’ smokers) for the subsequent genetic ancestry association analyses. Lastly, for the calculation of GRS for smoking-related traits, we used a ‘classic’ GRS method26 that is restricted to genetic variants reaching genome-wide significance in the original GWAS15. This ‘classic’ approach has been commonly applied35,36,37,38,39 and has key advantages26, including that it is relatively fast to apply and is more interpretable compared to more sophisticated methods, such as Bayesian regression models that perform shrinkage39,40,41. Further, this ‘classic’ approach has been shown to have relatively similar performance compared to alternative methods39,40,41.
Future studies applying those alternative methods to derive GRS for smoking-related traits may provide a further refinement to the effects that we observed in the current study. Despite these limitations, our study is based on a unique and very large cohort of individuals, who were all members of the KPNC health plan, a single integrated healthcare delivery system. Participants were recruited in a similar manner and were assessed for their cigarette smoking behaviors using a single questionnaire providing greater consistency, in contrast to consortia which often include different questions across studies.
In conclusion, this study is the first investigation of genetic ancestry and cigarette smoking-related trait associations. We observed significant associations between genetic ancestry and smoking-related traits within each race/ethnicity, except for African Americans. Known smoking-associated genetic variants identified in populations of European ancestry15 explained only a small proportion of these associations, and the observed ancestry effects may be due to population-specific genetic variants. Future studies including additional genetic variants associated with smoking behavior-related traits in non-European populations, such as those recently identified in a Japanese population42 but not validated yet, may better explain these genetic ancestry associations.
Data availability
Genotype data of GERA participants are available from the database of Genotypes and Phenotypes (dbGaP) under accession phs000674.v2.p2. This includes individuals who consented to having their data shared with dbGaP. The complete GERA data are available upon application to the KP Research Bank (https://researchbank.kaiserpermanente.org/).
References
Ahmad, T. et al. Impaired mitophagy leads to cigarette smoke stress-induced cellular senescence: implications for chronic obstructive pulmonary disease. FASEB J. 29, 2912–2929 (2015).
Ambrose, J. A. & Barua, R. S. The pathophysiology of cigarette smoking and cardiovascular disease: an update. J. Am. Coll. Cardiol. 43, 1731–1737 (2004).
Jethwa, A. R. & Khariwala, S. S. Tobacco-related carcinogenesis in head and neck cancer. Cancer Metastasis Rev. 36, 411–423 (2017).
Malhotra, J., Malvezzi, M., Negri, E., La Vecchia, C. & Boffetta, P. Risk factors for lung cancer worldwide. Eur. Respir. J. 48, 889–902 (2016).
Torre, L. A. et al. Global cancer statistics, 2012. CA Cancer J. Clin. 65, 87–108 (2015).
Tran, I. et al. Role of cigarette smoke-induced aggresome formation in chronic obstructive pulmonary disease-emphysema pathogenesis. Am. J. Respir. Cell Mol. Biol. 53, 159–173 (2015).
Drope, J. et al. Who’s still smoking? Disparities in adult cigarette smoking prevalence in the United States. CA Cancer J. Clin. 68, 106–115 (2018).
Creamer, M. R. et al. Tobacco product use and cessation indicators among adults - United States, 2018. Morb. Mortal. Wkly Rep. 68, 1013–1019 (2019).
Dutra, L. M. et al. Differential relationship between tobacco control policies and U.S. adult current smoking by poverty. Int J. Environ. Res. Public Health 16, 4130 (2019).
Chen, A., Machiorlatti, M., Krebs, N. M. & Muscat, J. E. Socioeconomic differences in nicotine exposure and dependence in adult daily smokers. BMC Public Health 19, 375 (2019).
Mooney, M. E., Johnson, E. O., Breslau, N., Bierut, L. J. & Hatsukami, D. K. Cigarette smoking reduction and changes in nicotine dependence. Nicotine Tob. Res. 13, 426–430 (2011).
Agrawal, A. et al. The genetic relationship between cannabis and tobacco cigarette use in European- and African-American female twins and siblings. Drug Alcohol Depend. 163, 165–171 (2016).
Sartor, C. E. et al. Genetic and environmental contributions to initiation of cigarette smoking in young African-American and European-American women. Drug Alcohol Depend. 157, 54–59 (2015).
Vink, J. M., Willemsen, G. & Boomsma, D. I. Heritability of smoking initiation and nicotine dependence. Behav. Genet. 35, 397–406 (2005).
Liu, M. et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat. Genet. 51, 237–244 (2019).
Banda, Y. et al. Characterizing race/ethnicity and genetic ancestry for 100,000 subjects in the genetic epidemiology research on adult health and aging (GERA) cohort. Genetics 200, 1285–1295 (2015).
Jorgenson, E. et al. Genetic contributors to variation in alcohol consumption vary by race/ethnicity in a large multi-ethnic genome-wide association study. Mol. Psychiatry 22, 1359–1367 (2017).
Pan, Y. et al. Genome-wide association studies of maximum number of drinks. J. Psychiatr. Res. 47, 1717–1724 (2013).
Schumann, G. et al. Genome-wide association and genetic functional studies identify autism susceptibility candidate 2 gene (AUTS2) in the regulation of alcohol consumption. Proc. Natl Acad. Sci. USA 108, 7119–7124 (2011).
Schumann, G. et al. KLB is associated with alcohol drinking, and its gene product beta-Klotho is necessary for FGF21 regulation of alcohol preference. Proc. Natl Acad. Sci. USA 113, 14372–14377 (2016).
Xu, K. et al. Genomewide Association Study for Maximum Number of Alcoholic Drinks in European Americans and African Americans. Alcohol Clin. Exp. Res. 39, 1137–1147 (2015).
Hoffmann, T. J. et al. Next generation genome-wide association tool: design and coverage of a high-throughput European-optimized SNP array. Genomics 98, 79–89 (2011).
Kvale, M. N. et al. Genotyping Informatics and Quality Control for 100,000 Subjects in the Genetic Epidemiology Research on Adult Health and Aging (GERA) Cohort. Genetics 200, 1051–1060 (2015).
Hoffmann, T. J. et al. Design and coverage of high throughput genotyping arrays optimized for individuals of East Asian, African American, and Latino race/ethnicity using imputation and a novel hybrid SNP selection algorithm. Genomics 98, 422–430 (2011).
Hoffmann, T. J. et al. Imputation of the rare HOXB13 G84E mutation and cancer risk in a large population-based cohort. PLoS Genet. 11, e1004930 (2015).
Choi, S. W., Mak, T. S. & O’Reilly, P. F. Tutorial: a guide to performing polygenic risk score analyses. Nat. Protoc. 15, 2759–2772 (2020).
Hoffmann, T. J. et al. A Large Multi-ethnic Genome-Wide Association Study of Adult Body Mass Index Identifies Novel Loci. Genetics 210, 499–515, https://doi.org/10.1534/genetics.118.301479 (2018).
Wang, T. W. et al. Tobacco product use among adults - United States, 2017. Morb. Mortal. Wkly Rep. 67, 1225–1232 (2018).
Amroussia, N., Gustafsson, P. E. & Pearson, J. L. Do inequalities add up? Intersectional inequalities in smoking by sexual orientation and education among U.S. adults. Prev. Med. Rep. 17, 101032 (2020).
Barbeau, E. M., Krieger, N. & Soobader, M. J. Working class matters: socioeconomic disadvantage, race/ethnicity, gender, and smoking in NHIS 2000. Am. J. Public Health 94, 269–278 (2004).
Broms, U., Silventoinen, K., Lahelma, E., Koskenvuo, M. & Kaprio, J. Smoking cessation by socioeconomic status and marital status: the contribution of smoking behavior and family background. Nicotine Tob. Res. 6, 447–455 (2004).
Kava, C. M., Hannon, P. A. & Harris, J. R. Use of cigarettes and E-cigarettes and dual use among adult employees in the US workplace. Prev. Chronic Dis. 17, E16 (2020).
Dai, H. & Hao, J. Flavored tobacco use among U.S. adults by age group: 2013-2014. Subst. Use Misuse 54, 315–323 (2019).
Dwyer-Lindgren, L. et al. Cigarette smoking prevalence in US counties: 1996-2012. Popul. Health Metr. 12, 5 (2014).
Dudbridge, F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 9, e1003348 (2013).
International Schizophrenia Consortium et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752 (2009).
Wray, N. R. et al. Research review: Polygenic methods and their application to psychiatric traits. J. Child Psychol. Psychiatry 55, 1068–1087 (2014).
Euesden, J., Lewis, C. M. & O’Reilly, P. F. PRSice: Polygenic Risk Score software. Bioinformatics 31, 1466–1468 (2015).
Choi, S. W. & O’Reilly, P. F. PRSice-2: Polygenic Risk Score software for biobank-scale data. Gigascience 8, giz082 (2019).
Mak, T. S. H., Porsch, R. M., Choi, S. W., Zhou, X. & Sham, P. C. Polygenic scores via penalized regression on summary statistics. Genet. Epidemiol. 41, 469–480 (2017).
Ge, T., Chen, C. Y., Ni, Y., Feng, Y. A. & Smoller, J. W. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun. 10, 1776 (2019).
Matoba, N. et al. GWAS of smoking behaviour in 165,436 Japanese people reveals seven new loci and shared genetic architecture. Nat. Hum. Behav. 3, 471–477 (2019).
Acknowledgements
This research was funded by a grant from the National Institute on Aging, National Institute of Mental Health, and National Institute of Health Common Fund (RC2 AG036607). Data analyses were facilitated by grants from the National Eye Institute (R01 EY027004), the National Institute of Diabetes and Digestive and Kidney Diseases (R01 DK116738), and the National Cancer Institute (R01 CA241623).
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Choquet, H., Yin, J. & Jorgenson, E. Cigarette smoking behaviors and the importance of ethnicity and genetic ancestry. Transl Psychiatry 11, 120 (2021). https://doi.org/10.1038/s41398-021-01244-7