The contribution of genetic variants to disease depends on the ruler

Journal name: Nature Reviews Genetics
Volume: 15
Pages: 765–776
DOI: 10.1038/nrg3786

Abstract

Our understanding of the genetic basis of disease has evolved from descriptions of overall heritability or familiality to the identification of large numbers of risk loci. One can quantify the impact of such loci on disease using a plethora of measures, which can guide future research decisions. However, different measures can attribute varying degrees of importance to a variant. In this Analysis, we consider and contrast the most commonly used measures — specifically, the heritability of disease liability, approximate heritability, sibling recurrence risk, overall genetic variance using a logarithmic relative risk scale, the area under the receiver–operating curve for risk prediction and the population attributable fraction — and give guidelines for their use that should be explicitly considered when assessing the contribution of genetic variants to disease.
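As a rough illustration of how the ranking of variants can change with the measure, the sketch below computes three per-variant quantities for a single biallelic variant: the population attributable fraction, the logRR genetic variance and the locus-specific sibling recurrence risk ratio. It assumes Hardy-Weinberg genotype frequencies and a multiplicative model, and it uses standard textbook expressions that are not necessarily the exact ones used in this Analysis. The function name and the two example variants are hypothetical.

```python
import math


def single_variant_measures(raf, rr):
    """Per-variant measures for one biallelic risk variant.

    Assumptions (not taken from the article): Hardy-Weinberg proportions and
    a multiplicative model, so genotype relative risks are 1, rr and rr**2.
    """
    # Mean relative risk in the population, relative to non-carriers.
    mean_rr = (1 + raf * (rr - 1)) ** 2
    # Population attributable fraction: 1 - 1 / (mean relative risk).
    paf = 1 - 1 / mean_rr
    # Variance of the log relative risk contributed by the variant.
    logrr_var = 2 * raf * (1 - raf) * math.log(rr) ** 2
    # Locus-specific sibling recurrence risk ratio for a multiplicative locus.
    w = raf * (1 - raf) * (rr - 1) ** 2 / (1 + raf * (rr - 1)) ** 2
    lambda_s = (1 + w / 2) ** 2
    return {"paf": paf, "logrr_var": logrr_var, "lambda_s": lambda_s}


# Two hypothetical variants: common with a small effect, rare with a larger one.
common = single_variant_measures(raf=0.50, rr=1.2)
rare = single_variant_measures(raf=0.01, rr=3.0)
for name, measures in (("common", common), ("rare", rare)):
    print(name, {key: round(value, 4) for key, value in measures.items()})
```

With these inputs the common variant has the larger PAF, whereas the rare variant has the larger logRR variance and sibling recurrence risk ratio, so the choice of ruler changes which variant looks more important.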

At a glance

Figures

  1. Different measures of genetic effects on disease.

    Various measures can be used to assess the extent to which known genetic factors contribute to the overall genetic variation in disease. These include heritability (part a), sibling recurrence risk (part b), logarithmic relative risk (logRR) genetic variance (part c), area under the receiver–operating curve (AUC; part d) and population attributable fraction (PAF; part e). These measures have their bases in traditionally distinct disciplines such as quantitative genetics and epidemiology, which have recently begun to coalesce. Although epidemiological measures were originally developed to address different questions, they are now being repurposed to assess how much genetic variation can be explained. We compare these measures by simulation and applications.

  2. Empirical evaluation of measures of genetic effects.

    Comparison of heritability, approximate heritability, sibling recurrence risk, logarithmic relative risk (logRR) genetic variance and proportion of area under the receiver–operating curve (pAUC) explained across a range of complex disease architectures is shown. The measures are calculated for single causal variants with risk allele frequencies (RAFs) of 0.01, 0.10, 0.25, 0.50, 0.75 and 0.99, and genetic relative risk (RR) of 1.0–3.0 (assuming a multiplicative model). The overall disease risk is assumed to be 0.01, and the total sibling recurrence risk is 5, which gives an overall genetic heritability on the liability scale of 0.55 and a maximum AUC of 0.95. The percentages of heritability, sibling recurrence risk and logRR genetic variance explained are fairly modest for low RRs and small RAFs, but as these increase the measures start to materially differ. Heritability is always one of the smallest measures and is overestimated by the approximate heritability as the RR increases. The sibling recurrence risk and pAUC are generally the largest measures for lower RAFs.

  3. Application of measures to four diseases.

    Commonly used measures for assessing the impact of known risk variants on disease are compared for four diseases: breast cancer (65 variants; part a), Crohn's disease (143 variants; part b), rheumatoid arthritis (36 variants; part c) and schizophrenia (32 variants; part d). The measures considered are heritability explained, approximate heritability explained (Approx. herit.), sibling recurrence risk explained, logarithmic relative risk (logRR) genetic variance explained, and the proportion of area under the receiver–operating curve (pAUC). Each line corresponds to an individual risk variant and indicates the percentage of each measure (for example, total heritability) explained by the variant. Lines are different colours depending on the relative risk (which is estimated by the odds ratio) for the variants. The y axes are on a squared scale. The percentages given in parentheses after each measure on the x axes indicate the total across all risk variants.

  4. Aspects of disease heritability: known, hiding and missing.

    A growing proportion of the total heritability estimated from family studies can be explained by known variants detected in existing genome-wide association studies (GWASs). This is one of the key measures considered here. The remaining heritability can be categorized as 'hiding' heritability and 'still-missing' heritability. The hiding heritability can be estimated from genome-wide arrays using the Genetic Relatedness Estimation through Maximum Likelihood (GREML) model (ref. 34). The still-missing heritability may remain even after GWASs and could reflect different genetic architectures (for example, rare variants). Note that the total heritability may be biased upwards owing to confounding by non-additive genetic or non-genetic factors.
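The caption of the second figure states that an overall disease risk of 0.01 and a total sibling recurrence risk of 5 correspond to a liability-scale heritability of 0.55 and a maximum AUC of 0.95. The sketch below reproduces that arithmetic under a normal liability-threshold model with purely additive effects. The expressions are the commonly cited ones of the kind presented in reference 4; this page does not show the formulas, so they are an assumption here, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution of liability


def liability_h2_from_sib_risk(k, lambda_s, a_r=0.5):
    """Liability-scale heritability from prevalence and sibling recurrence risk.

    k is the overall disease risk, lambda_s the sibling recurrence risk ratio
    and a_r the additive relationship coefficient (0.5 for full siblings).
    """
    t = STD_NORMAL.inv_cdf(1 - k)              # threshold in the population
    t1 = STD_NORMAL.inv_cdf(1 - lambda_s * k)  # threshold for siblings of cases
    i = STD_NORMAL.pdf(t) / k                  # mean liability of cases
    num = t - t1 * math.sqrt(1 - (t**2 - t1**2) * (1 - t / i))
    den = a_r * (i + t1**2 * (i - t))
    return num / den


def max_auc(k, h2):
    """Maximum AUC achievable by a perfect genetic predictor."""
    t = STD_NORMAL.inv_cdf(1 - k)
    i = STD_NORMAL.pdf(t) / k          # mean liability of cases
    v = -i * k / (1 - k)               # mean liability of non-cases
    # Sum of the genetic variances among cases and among non-cases.
    var_sum = h2 * ((1 - h2 * i * (i - t)) + (1 - h2 * v * (v - t)))
    return STD_NORMAL.cdf((i - v) * h2 / math.sqrt(var_sum))


h2 = liability_h2_from_sib_risk(k=0.01, lambda_s=5)
auc = max_auc(k=0.01, h2=h2)
print(f"h2 on liability scale: {h2:.3f}, maximum AUC: {auc:.3f}")
```

Under these assumptions the two values should land close to the 0.55 and 0.95 quoted in the caption; small differences would reflect rounding or a slightly different formulation in the original analysis.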

References

  1. Visscher, P. M., Brown, M. A., McCarthy, M. I. & Yang, J. Five years of GWAS discovery. Am. J. Hum. Genet. 90, 7–24 (2012).
  2. Witte, J. S. Genome-wide association studies and beyond. Annu. Rev. Publ. Health 31, 9–20 (2010).
  3. Manolio, T. A. et al. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009).
  4. Wray, N. R., Yang, J., Goddard, M. E. & Visscher, P. M. The genetic interpretation of area under the ROC curve in genomic profiling. PLoS Genet. 6, e1000864 (2010).
  5. Yang, J. et al. Common SNPs explain a large proportion of the heritability for human height. Nature Genet. 42, 565–569 (2010).
  6. Cole, P. & MacMahon, B. Attributable risk percent in case–control studies. Br. J. Prev. Soc. Med. 25, 242–244 (1971).
  7. Lee, S. H. et al. Estimating the proportion of variation in susceptibility to schizophrenia captured by common SNPs. Nature Genet. 44, 247–250 (2012).
  8. Barrett, J. C. et al. Genome-wide association defines more than 30 distinct susceptibility loci for Crohn's disease. Nature Genet. 40, 955–962 (2008).
  9. Wang, K. et al. Interpretation of association signals and identification of causal variants from genome-wide association studies. Am. J. Hum. Genet. 86, 730–742 (2010).
  10. Dempster, E. R. & Lerner, I. M. Heritability of threshold characters. Genetics 35, 212–236 (1950).
    This study explores the relationship between heritability on disease and liability scales.
  11. Slatkin, M. Exchangeable models of complex inherited diseases. Genetics 179, 2253–2261 (2008).
  12. Falconer, D. The inheritance of liability to certain diseases, estimates from the incidence among relatives. Ann. Hum. Genet. 29, 51–76 (1965).
    This paper presents a formal derivation of the relationship between disease risk in relatives and heritability, and also provides a thoughtful exploration of scenarios and caveats.
  13. Falconer, D. & Mackay, T. F. Introduction to Quantitative Genetics (Pearson Education, 1996).
  14. Risch, N. J. Searching for genetic determinants in the new millennium. Nature 405, 847–856 (2000).
    This paper describes variance explained by a single locus on the disease and liability scale.
  15. Purcell, S. M. et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752 (2009).
  16. Stahl, E. A. et al. Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis. Nature Genet. 44, 483–489 (2012).
  17. Pharoah, P. D. et al. Polygenic susceptibility to breast cancer and implications for prevention. Nature Genet. 31, 33–36 (2002).
    This is a clear presentation of the logRR model.
  18. Wray, N. R. & Goddard, M. E. Multi-locus models of genetic risk of disease. Genome Med. 2, 10 (2010).
  19. Pharoah, P. D., Day, N. E., Duffy, S., Easton, D. F. & Ponder, B. A. Family history and the risk of breast cancer: a systematic review and meta-analysis. Int. J. Cancer 71, 800–809 (1997).
  20. James, J. W. Frequency in relatives for an all-or-none trait. Ann. Hum. Genet. 35, 47–49 (1971).
  21. Pharoah, P. D., Antoniou, A. C., Easton, D. F. & Ponder, B. A. Polygenes, risk prediction, and targeted prevention of breast cancer. N. Engl. J. Med. 358, 2796–2803 (2008).
  22. Park, J. H. et al. Estimation of effect size distribution from genome-wide association studies and implications for future discoveries. Nature Genet. 42, 570–575 (2010).
  23. Jostins, L. et al. Host–microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature 491, 119–124 (2012).
  24. Chen, G.-B. et al. Estimation and partitioning of (co)heritability of inflammatory bowel disease from GWAS and immunochip data. Hum. Mol. Genet. 23, 4710–4720 (2014).
  25. Okada, Y. et al. Genetics of rheumatoid arthritis contributes to biology and drug discovery. Nature 506, 376–381 (2014).
  26. Ripke, S. et al. Genome-wide association analysis identifies 13 new risk loci for schizophrenia. Nature Genet. 45, 1150–1159 (2013).
  27. Kirov, G. et al. Neurexin 1 (NRXN1) deletions in schizophrenia. Schizophr. Bull. 35, 851–854 (2009).
  28. Kirov, G. et al. Support for the involvement of large copy number variants in the pathogenesis of schizophrenia. Hum. Mol. Genet. 18, 1497–1503 (2009).
  29. International Schizophrenia Consortium. Rare chromosomal deletions and duplications increase risk of schizophrenia. Nature 455, 237–241 (2008).
  30. Stefansson, H. et al. Large recurrent microdeletions associated with schizophrenia. Nature 455, 232–236 (2008).
  31. Sullivan, P. F., Kendler, K. S. & Neale, M. C. Schizophrenia as a complex trait: evidence from a meta-analysis of twin studies. Arch. Gen. Psychiatry 60, 1187–1192 (2003).
  32. Rockhill, B., Weinberg, C. R. & Newman, B. Population attributable fraction estimation for established breast cancer risk factors: considering the issues of high prevalence and unmodifiability. Am. J. Epidemiol. 147, 826–833 (1998).
    This study considers the limitations of the PAF.
  33. Saha, S., Chant, D., Welham, J. & McGrath, J. A systematic review of the prevalence of schizophrenia. PLoS Med. 2, e141 (2005).
  34. Alonso, A., Logroscino, G., Jick, S. S. & Hernan, M. A. Incidence and lifetime risk of motor neuron disease in the United Kingdom: a population-based study. Eur. J. Neurol. 16, 745–751 (2009).
  35. Wray, N. R. et al. Polygenic methods and their application to psychiatric traits. J. Child Psychol. Psychiatry http://dx.doi.org/10.1111/jcpp.12295 (2014).
  36. Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
  37. Gail, M. H. & Pfeiffer, R. M. On criteria for evaluating models of absolute risk. Biostatistics 6, 227–239 (2005).
  38. Tenesa, A. & Haley, C. S. The heritability of human disease: estimation, uses and abuses. Nature Rev. Genet. 14, 139–149 (2013).
  39. So, H. C., Gui, A. H., Cherny, S. S. & Sham, P. C. Evaluating the heritability explained by known susceptibility variants: a survey of ten complex diseases. Genet. Epidemiol. 35, 310–317 (2011).
  40. So, H. C., Li, M. & Sham, P. C. Uncovering the total heritability explained by all true susceptibility variants in a genome-wide association study. Genet. Epidemiol. 35, 447–456 (2011).
  41. So, H. C., Kwan, J. S., Cherny, S. S. & Sham, P. C. Risk prediction of complex diseases from family history and known susceptibility loci, with applications for cancer screening. Am. J. Hum. Genet. 88, 548–565 (2011).
    This study uses variance explained by loci and considers complications of age-related risk.
  42. Do, C. B., Hinds, D. A., Francke, U. & Eriksson, N. Comparison of family history and SNPs for predicting risk of complex disease. PLoS Genet. 8, e1002973 (2012).
  43. Zaitlen, N. et al. Informed conditioning on clinical covariates increases power in case–control association studies. PLoS Genet. 8, e1003032 (2012).

Author information

Affiliations

  1. Department of Epidemiology and Biostatistics, and Department of Urology, University of California, San Francisco.

    • John S. Witte
  2. Institute for Human Genetics, University of California, San Francisco.

    • John S. Witte
  3. Helen Diller Comprehensive Cancer Center, University of California, San Francisco, 1450 3rd Street, San Francisco, California 94158, USA.

    • John S. Witte
  4. Queensland Brain Institute, The University of Queensland, Building 79, Research Road, Brisbane, 4072, Queensland, Australia.

    • Peter M. Visscher
    • Naomi R. Wray
  5. The University of Queensland Diamantina Institute, The University of Queensland, 37 Kent Street, Brisbane, 4102, Queensland, Australia.

    • Peter M. Visscher

Competing interests statement

The authors declare no competing interests.

Author details

  • John S. Witte

    John S. Witte is Professor of Epidemiology and Biostatistics, and Associate Director of the Institute for Human Genetics at the University of California, San Francisco, USA. His research programme encompasses a synthesis of methodological and applied genetic epidemiology, with the overall aim of deciphering the mechanisms underlying complex diseases and traits. His current work on methods is focused on the design and statistical analysis of next-generation sequencing and genetic association studies. He is applying these methods to studies of prostate cancer, birth defects and pharmacogenomics to improve individual- and population-level health.

  • Peter M. Visscher

    Peter M. Visscher is Professor of Quantitative Genetics at the University of Queensland, Brisbane, Australia, and a senior principal research fellow of the National Health and Medical Research Council in Australia. His research is at the interface of quantitative genetics, statistical genetics, population genetics, human genetics, animal genetics, evolution, bioinformatics and genetic epidemiology. His current research focuses on estimation and dissection of complex-trait variation in human populations — through the development of new statistical genetics methods for estimation and prediction — as well as on applications of such knowledge to quantitative traits and disease in human populations.

  • Naomi R. Wray

    Naomi R. Wray is Professor of Psychiatric Genetics at the University of Queensland, Brisbane, Australia, and a senior research fellow of the National Health and Medical Research Council in Australia. Her Ph.D. and early postdoctoral work was on the prediction of rates of inbreeding in populations that are undergoing selection. She currently leads a research programme in psychiatric genomics. Her recent research has focused on the application of quantitative genetics methods to psychiatric disorders, including the estimation of genetic variation in liability to disease, and the prospects and limitations of making predictions of individual risk from genetic data.

Supplementary information

PDF files

  1. Supplementary information S1 (table) (216 KB)

    Measures of overall impact of risk variants for breast cancer.

  2. Supplementary information S2 (table) (329 KB)

    Measures of overall impact of risk variants for Crohn's disease.

  3. Supplementary information S3 (table) (166 KB)

    Measures of overall impact of risk variants for rheumatoid arthritis.

  4. Supplementary information S4 (table) (162 KB)

    Measures of overall impact of risk variants for schizophrenia.
