Rare genetic variants explain missing heritability in smoking

Abstract

Common genetic variants explain less variation in complex phenotypes than inferred from family-based studies, and the source of this ‘missing heritability’ is debated. We investigated the contribution of rare genetic variants to tobacco use with whole-genome sequences from up to 26,257 unrelated individuals of European ancestries and 11,743 individuals of African ancestries. Across four smoking traits, estimates of single-nucleotide-polymorphism-based heritability (\(h^2_{\mathrm{SNP}}\)) ranged from 0.13 to 0.28 (s.e., 0.10–0.13) in European ancestries, with 35–74% of it attributable to rare variants with minor allele frequencies between 0.01% and 1%. These heritability estimates are 1.5–4 times higher than past estimates based on common variants alone and account for 60% to 100% of our pedigree-based estimates of narrow-sense heritability (\(h^2_{\mathrm{ped}}\), 0.18–0.34). In the African ancestry samples, estimates of \(h^2_{\mathrm{SNP}}\) ranged from 0.03 to 0.33 (s.e., 0.09–0.14) across the four smoking traits. These results suggest that rare variants are important contributors to the heritability of smoking.

Fig. 1: SNP-based heritability estimates in the European ancestry sample for each of the six MAF/LD bins, and sums across bins.
Fig. 2: SNP-based heritability estimates in the European ancestry sample from sensitivity analyses.
Fig. 3: Comparison of heritability estimates between current and published studies.
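
The quantities reported in the abstract (a total \(h^2_{\mathrm{SNP}}\) summed across MAF/LD bins, the share attributable to rare variants, and the fraction of \(h^2_{\mathrm{ped}}\) accounted for) are related by simple arithmetic. The following is a minimal sketch of that arithmetic, not the authors' analysis code. The function name `summarize_bins`, the bin labels, the assignment of bins to the rare class and every numeric value are hypothetical placeholders chosen for illustration; they are not estimates from the study. The standard error of the sum assumes independent bin estimates, which is a simplification.

```python
from math import sqrt


def summarize_bins(bins, h2_ped):
    """Summarize partitioned SNP-heritability estimates.

    bins   : list of (label, is_rare, h2_estimate, standard_error) tuples
    h2_ped : pedigree-based narrow-sense heritability for the same trait
    """
    total = sum(h2 for _, _, h2, _ in bins)
    rare = sum(h2 for _, is_rare, h2, _ in bins if is_rare)
    # Simplifying assumption: bin estimates are treated as independent, so
    # their variances add. A real analysis would use the sampling covariance
    # of the variance components.
    total_se = sqrt(sum(se ** 2 for _, _, _, se in bins))
    return {
        "h2_snp": total,
        "h2_snp_se": total_se,
        "rare_fraction": rare / total,
        "fraction_of_h2_ped": total / h2_ped,
    }


# Hypothetical placeholder values for six bins; NOT results from the study.
example_bins = [
    ("bin 1 (rare)", True, 0.04, 0.03),
    ("bin 2 (rare)", True, 0.06, 0.03),
    ("bin 3 (common)", False, 0.02, 0.03),
    ("bin 4 (common)", False, 0.03, 0.03),
    ("bin 5 (common)", False, 0.02, 0.03),
    ("bin 6 (common)", False, 0.03, 0.03),
]

summary = summarize_bins(example_bins, h2_ped=0.25)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

With real output, the per-bin estimates and their standard errors would replace the placeholder tuples, and `h2_ped` would be the pedigree-based estimate for the same trait.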

Data availability

Phenotypes are available through authorized access in dbGaP (https://dbgap.ncbi.nlm.nih.gov/) or by direct request to TOPMed principal investigators. Accession numbers and email addresses of the principal investigators are presented in the Supplementary Note. Genetic data are available through the dbGaP TOPMed exchange area.

Code availability

All software used is publicly available and can be found at the references cited.

References

  1. Johnson, T. & Barton, N. Theoretical models of selection and mutation on quantitative traits. Phil. Trans. R. Soc. B 360, 1411–1425 (2005).

  2. Keinan, A. & Clark, A. G. Recent explosive human population growth has resulted in an excess of rare genetic variants. Science 336, 740–743 (2012).

  3. Yengo, L. et al. Meta-analysis of genome-wide association studies for height and body mass index in 700000 individuals of European ancestry. Hum. Mol. Genet. 27, 3641–3649 (2018).

  4. Graham, S. E. et al. The power of genetic diversity in genome-wide association studies of lipids. Nature 600, 675–679 (2021).

  5. Vujkovic, M. et al. Discovery of 318 new risk loci for type 2 diabetes and related vascular outcomes among 1.4 million participants in a multi-ancestry meta-analysis. Nat. Genet. 52, 680–691 (2020).

  6. Ripke, S. et al. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).

  7. Manolio, T. A. et al. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009).

  8. Ezzati, M., Lopez, A. D., Rodgers, A., Vander Hoorn, S. & Murray, C. J. Selected major risk factors and global and regional burden of disease. Lancet 360, 1347–1360 (2002).

  9. Reitsma, M. B. et al. Smoking prevalence and attributable disease burden in 195 countries and territories, 1990–2015: a systematic analysis from the Global Burden of Disease Study 2015. Lancet 389, 1885–1906 (2017).

  10. Carter, B. D. et al. Smoking and mortality—beyond established causes. N. Engl. J. Med. 372, 631–640 (2015).

  11. Maes, H. H. et al. A genetic epidemiological mega analysis of smoking initiation in adolescents. Nicotine Tob. Res. 19, 401–409 (2017).

  12. Vink, J. M. & Boomsma, D. I. Interplay between heritability of smoking and environmental conditions? A comparison of two birth cohorts. BMC Public Health 11, 316 (2011).

  13. Quach, B. C. et al. Expanding the genetic architecture of nicotine dependence and its shared genetics with multiple traits. Nat. Commun. 11, 1–13 (2020).

  14. Degenhardt, L. & Hall, W. The relationship between tobacco use, substance-use disorders and mental health: results from the National Survey of Mental Health and Well-Being. Nicotine Tob. Res. 3, 225–234 (2001).

  15. McCabe, S. E., West, B. T. & McCabe, V. V. Associations between early onset of e-cigarette use and cigarette smoking and other substance use among US adolescents: a national study. Nicotine Tob. Res. 20, 923–930 (2018).

  16. King, S. M., Iacono, W. G. & McGue, M. Childhood externalizing and internalizing psychopathology in the prediction of early substance use. Addiction 99, 1548–1559 (2004).

  17. Polderman, T. J. C. et al. Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nat. Genet. 47, 702–709 (2015).

  18. Liu, M. et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat. Genet. 51, 237–244 (2019).

  19. Erzurumluoglu, A. M. et al. Meta-analysis of up to 622,409 individuals identifies 40 novel smoking behaviour associated genetic loci. Mol. Psychiatry 25, 2392–2409 (2020).

  20. Evans, L. M. et al. Genetic architecture of four smoking behaviors using partitioned SNP heritability. Addiction 116, 2498–2508 (2021).

  21. Eichler, E. E. et al. Missing heritability and strategies for finding the underlying causes of complex disease. Nat. Rev. Genet. 11, 446–450 (2010).

  22. Zaitlen, N. et al. Using extended genealogy to estimate components of heritability for 23 quantitative and dichotomous traits. PLoS Genet. 9, e1003520 (2013).

  23. Wray, N. R. & Maier, R. Genetic basis of complex genetic disease: the contribution of disease heterogeneity to missing heritability. Curr. Epidemiol. Rep. 1, 220–227 (2014).

  24. Zuk, O., Hechter, E., Sunyaev, S. R. & Lander, E. S. The mystery of missing heritability: genetic interactions create phantom heritability. Proc. Natl Acad. Sci. USA 109, 1193–1198 (2012).

  25. Young, A. I. Solving the missing heritability problem. PLoS Genet. 15, e1008222 (2019).

  26. Gibson, G. Rare and common variants: twenty arguments. Nat. Rev. Genet. 13, 135–145 (2012).

  27. Eyre-Walker, A. Genetic architecture of a complex trait and its implications for fitness and genome-wide association studies. Proc. Natl Acad. Sci. USA 107, 1752–1756 (2010).

  28. Visscher, P. M., Goddard, M. E., Derks, E. M. & Wray, N. R. Evidence-based psychiatric genetics, AKA the false dichotomy between common and rare variant hypotheses. Mol. Psychiatry 17, 474–485 (2012).

  29. McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).

  30. Loh, P.-R. et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat. Genet. 48, 1443–1448 (2016).

  31. Yang, J. et al. Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index. Nat. Genet. 47, 1114–1120 (2015).

  32. Evans, L. M. et al. Comparison of methods that use whole genome data to estimate the heritability and genetic architecture of complex traits. Nat. Genet. 50, 737–745 (2018).

  33. Derkach, A., Zhang, H. & Chatterjee, N. Power Analysis for Genetic Association Test (PAGEANT) provides insights to challenges for rare variant association studies. Bioinformatics 34, 1506–1513 (2018).

  34. Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).

  35. Wainschtein, P. et al. Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data. Nat. Genet. 54, 263–273 (2022).

  36. Hernandez, R. D. et al. Ultrarare variants drive substantial cis heritability of human gene expression. Nat. Genet. 51, 1349–1355 (2019).

  37. Sul, J. H. et al. Contribution of common and rare variants to bipolar disorder susceptibility in extended pedigrees from population isolates. Transl. Psychiatry 10, 74 (2020).

  38. Halvorsen, M. et al. Increased burden of ultra-rare structural variants localizing to boundaries of topologically associated domains in schizophrenia. Nat. Commun. 11, 1842 (2020).

  39. Luo, Y. et al. Exploring the genetic architecture of inflammatory bowel disease by whole-genome sequencing identifies association at ADCY7. Nat. Genet. 49, 186–192 (2017).

  40. Fuchsberger, C. et al. The genetic architecture of type 2 diabetes. Nature 536, 41–47 (2016).

  41. Wainschtein, P. et al. Recovery of trait heritability from whole genome sequence data. Preprint at bioRxiv https://doi.org/10.1101/588020 (2021).

  42. Nait Saada, J. et al. Identity-by-descent detection across 487,409 British samples reveals fine scale population structure and ultra-rare variant associations. Nat. Commun. 11, 6130 (2020).

  43. Nelson, M. R. et al. An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people. Science 337, 100–104 (2012).

  44. Mullaert, J. et al. Taking population stratification into account by local permutations in rare-variant association studies on small samples. Genet. Epidemiol. 45, 821–829 (2021).

  45. Gazal, S. et al. Linkage disequilibrium dependent architecture of human complex traits shows action of negative selection. Nat. Genet. 49, 1421–1427 (2017).

  46. Mills, M. C. et al. Identification of 371 genetic variants for age at first sex and birth linked to externalising behaviour. Nat. Hum. Behav. 5, 1717–1730 (2021).

  47. Domingue, B. W., Rehkopf, D. H., Conley, D. & Boardman, J. D. Geographic clustering of polygenic scores at different stages of the life course. RSF 4, 137–149 (2018).

  48. Abdellaoui, A. et al. Genetic correlates of social stratification in Great Britain. Nat. Hum. Behav. 3, 1332–1342 (2019).

  49. Tropf, F. C. et al. Hidden heritability due to heterogeneity across seven populations. Nat. Hum. Behav. 1, 757–765 (2017).

  50. Boardman, J. D., Blalock, C. L. & Pampel, F. C. Trends in the genetic influences on smoking. J. Health Soc. Behav. 51, 108–123 (2010).

  51. Bi, W. et al. Efficient mixed model approach for large-scale genome-wide association studies of ordinal categorical phenotypes. Am. J. Hum. Genet. 108, 825–839 (2021).

  52. Mathieson, I. & McVean, G. Differential confounding of rare and common variants in spatially structured populations. Nat. Genet. 44, 243–246 (2012).

  53. Zaidi, A. A. & Mathieson, I. Demographic history mediates the effect of stratification on polygenic scores. eLife 9, e61548 (2020).

  54. Treur, J. L., Vink, J. M., Boomsma, D. I. & Middeldorp, C. M. Spousal resemblance for smoking: underlying mechanisms and effects of cohort and age. Drug Alcohol Depend. 153, 221–228 (2015).

  55. Agrawal, A. et al. Assortative mating for cigarette smoking and for alcohol consumption in female Australian twins and their spouses. Behav. Genet. 36, 553–566 (2006).

  56. Vink, J. M., Willemsen, G. & Boomsma, D. I. The association of current smoking behavior with the smoking behavior of parents, siblings, friends and spouses. Addiction 98, 923–931 (2003).

  57. Border, R. et al. Assortative mating biases marker-based heritability estimators. Nat. Commun. (in the press).

  58. Yengo, L. et al. Imprint of assortative mating on the human genome. Nat. Hum. Behav. 2, 948–954 (2018).

  59. Howe, L. J. et al. Within-sibship genome-wide association analyses decrease bias in estimates of direct genetic effects. Nat. Genet. 54, 581–592 (2022).

  60. Young, A. I., Benonisdottir, S., Przeworski, M. & Kong, A. Deconstructing the sources of genotype–phenotype associations in humans. Science 365, 1396–1400 (2019).

  61. Abdellaoui, A. & Verweij, K. J. H. Dissecting polygenic signals from genome-wide association studies on human behaviour. Nat. Hum. Behav. 5, 686–694 (2021).

  62. Warrington, N. M., Hwang, L.-D., Nivard, M. G. & Evans, D. M. Estimating direct and indirect genetic effects on offspring phenotypes using genome-wide summary results data. Nat. Commun. 12, 5420 (2021).

  63. Zhang, D., Dey, R. & Lee, S. Fast and robust ancestry prediction using principal component analysis. Bioinformatics 36, 3439–3446 (2020).

  64. Yang, J. et al. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569 (2010).

  65. Fidler, J., Ferguson, S. G., Brown, J., Stapleton, J. & West, R. How does rate of smoking cessation vary by age, gender and social grade? Findings from a population survey in England. Addiction 108, 1680–1685 (2013).

  66. Karp, I., O’Loughlin, J., Paradis, G., Hanley, J. & DiFranza, J. Smoking trajectories of adolescent novice smokers in a longitudinal study of tobacco use. Ann. Epidemiol. 15, 445–452 (2005).

  67. Mathew, A. R. et al. Life-course smoking trajectories and risk for emphysema in middle age: the CARDIA Lung Study. Am. J. Respir. Crit. Care Med. 199, 237–240 (2018).

  68. Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).

  69. Lee, S. H., Wray, N. R., Goddard, M. E. & Visscher, P. M. Estimating missing heritability for disease from genome-wide association studies. Am. J. Hum. Genet. 88, 294–305 (2011).

  70. Powell, L. A. Approximating variance of demographic parameters using the delta method: a reference for avian biologists. Condor 109, 949–954 (2007).

  71. Hernandez, R. D. et al. Ultrarare variants drive substantial cis heritability of human gene expression. Nat. Genet. 51, 1349–1355 (2019).

  72. Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6, 80–92 (2012).

  73. Bouaziz, M. et al. Controlling for human population stratification in rare variant association studies. Sci. Rep. 11, 19015 (2021).

  74. Conomos, M. P., Reiner, A. P., Weir, B. S. & Thornton, T. A. Model-free estimation of recent genetic relatedness. Am. J. Hum. Genet. 98, 127–148 (2016).

  75. Maples, B. K., Gravel, S., Kenny, E. E. & Bustamante, C. D. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 93, 278–288 (2013).

  76. Bergström, A. et al. Insights into human genetic variation and population history from 929 diverse genomes. Science 367, eaay5012 (2020).

  77. Athanasiadis, G. et al. Estimating narrow-sense heritability using family data from admixed populations. Heredity (Edinb.) 124, 751–762 (2020).

  78. Evans, L. M. et al. Genetic architecture of four smoking behaviors using partitioned SNP heritability. Addiction 116, 2498–2508 (2021).

  79. Polderman, T. J. C. et al. Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nat. Genet. 47, 702–709 (2015).

Acknowledgements

The molecular data for the TOPMed programme was supported by the National Heart, Lung and Blood Institute. See the TOPMed Omics Support Table (Supplementary Note) for study-specific omics support information. Core support including centralized genomic read mapping and genotype calling, along with variant quality metrics and filtering, was provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1; contract no. HHSN268201800002I). Core support including phenotype harmonization, data management, sample-identity quality control and general programme coordination was provided by the TOPMed Data Coordinating Center (R01HL-120393; U01HL-120393; contract no. HHSN268201800001I). We acknowledge the studies and participants who provided biological samples and data for TOPMed. Funding for this project included grant nos R01DA044283, R01DA037904 and R01HG008983 to S.V. and grant no. R01MH100141 to M.C.K. Cohort-wise acknowledgement is provided in the Supplementary Note. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

S.-K.J., S.V., M.C.K. and L.E. designed the study. A.F. contributed to the data analysis. All authors contributed to the data collection and curation and critically reviewed the manuscript.

Corresponding author

Correspondence to Scott Vrieze.

Ethics declarations

Competing interests

B.M.P. serves on the Steering Committee of the Yale Open Data Access Project funded by Johnson & Johnson. E.K.S. has received grant support from GSK and Bayer Research support to the University of Pennsylvania from RenalytixAI and personal fees from Calico Labs, both outside the current work. All other authors declare no competing interests.

Peer review

Peer review information

Nature Human Behaviour thanks Roseann Peterson, Abdel Abdellaoui and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–4 and Note.

Reporting Summary

Peer Review File

Supplementary Table 1

Supplementary Tables 1–12.

About this article

Cite this article

Jang, SK., Evans, L., Fialkowski, A. et al. Rare genetic variants explain missing heritability in smoking. Nat Hum Behav 6, 1577–1586 (2022). https://doi.org/10.1038/s41562-022-01408-5

  • DOI: https://doi.org/10.1038/s41562-022-01408-5
