Although the historically distinct fields of quantitative genetics and epidemiology are converging to answer fundamental questions about genetic variation in risk underlying human diseases, the many measures used to quantify the contribution of variants to disease risk differ in terminology and assumptions, which obscures their use and interpretation.
In this Analysis, we consider and contrast the most commonly used measures that assess disease risk contributed to the population by individual variants: the heritability of disease liability explained, approximate heritability explained, the sibling recurrence risk explained, the proportion of genetic variance explained on a logarithmic relative risk scale, the area under the receiver–operating curve (AUC) and the population attributable fraction (PAF). We give numerical examples in breast cancer, Crohn's disease, rheumatoid arthritis and schizophrenia.
We discuss the properties of these measures, show how they are connected to each other, consider the situations for which they are best suited and provide an online tool for their calculation.
The most appropriate measure to use depends on the importance given to the frequency of a risk variant relative to its effect size on disease and on the baseline against which importance is expressed. These factors should be explicitly considered when assessing the contribution of genetic variants to disease.
We recommend that investigators focus primarily on the heritability of liability explained or the genetic variance explained on the logarithmic relative risk scale, as these give estimates that are less sensitive to rare high-risk variants than the other measures considered here. Moreover, we caution against using the PAF for genetic risk variants because it has various undesirable properties.
The concept of individual loci providing an explanation for disease is less straightforward than it may seem at first sight, and we recommend that investigators undertake sensitivity analyses that explore how measures of the contribution of genetic variants to risk vary across a range of underlying assumptions.
Our understanding of the genetic basis of disease has evolved from descriptions of overall heritability or familiality to the identification of large numbers of risk loci. One can quantify the impact of such loci on disease using a plethora of measures, which can guide future research decisions. However, different measures can attribute varying degrees of importance to a variant. In this Analysis, we consider and contrast the most commonly used measures — specifically, the heritability of disease liability, approximate heritability, sibling recurrence risk, overall genetic variance using a logarithmic relative risk scale, the area under the receiver–operating curve for risk prediction and the population attributable fraction — and give guidelines for their use that should be explicitly considered when assessing the contribution of genetic variants to disease.
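The point that different measures weight the same variant differently can be illustrated with a minimal sketch. It is not the article's online tool, and the formulas are textbook forms chosen here as assumptions: a biallelic locus in Hardy-Weinberg equilibrium, a multiplicative per-allele relative risk, a PAF baseline of individuals carrying no risk allele, and variance on the log relative risk scale computed as 2p(1 - p)(ln RR)^2. The function names and the two example variants are hypothetical.

```python
import math


def paf_per_allele(p, rr):
    """PAF for a risk allele with frequency p and per-allele relative risk rr.

    Assumes Hardy-Weinberg equilibrium, multiplicative risks, and a baseline
    of individuals with zero copies of the risk allele.
    """
    # Mean relative risk in the population, relative to non-carriers.
    mean_rr = (1 - p + p * rr) ** 2
    return 1 - 1 / mean_rr


def log_rr_variance(p, rr):
    """Variance contributed by the locus on the log relative risk scale."""
    return 2 * p * (1 - p) * math.log(rr) ** 2


# Hypothetical variants: one common with a small effect, one rare with a large effect.
variants = {
    "common, low risk": (0.30, 1.2),
    "rare, high risk": (0.001, 5.0),
}

for label, (p, rr) in variants.items():
    print(f"{label}: PAF={paf_per_allele(p, rr):.4f}, "
          f"logRR variance={log_rr_variance(p, rr):.4f}")
```

Under these assumptions the common variant has a PAF more than ten times that of the rare variant, while its variance on the log relative risk scale is only about three times larger, so the ranking of "importance" depends on the measure and on the chosen baseline.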
The authors thank C. Nolan and B. Beyamin for developing the companion website, M. Robinson for help with the figure in Box 1, T. Hoffmann for help in plotting Figure 3, and J. Liu for linkage disequilibrium filtering of the breast cancer SNPs. This work is supported by the US National Institutes of Health grants R01 CA088164, U01 CA127298, U01 GM061390 and P30 CA82103, and by the Australian National Health and Medical Research Council grants 613602, 613601, 1011506, 1050218 and 1048853.
The authors declare no competing financial interests.
Measures of overall impact of risk variants for breast cancer. (PDF 216 kb)
Measures of overall impact of risk variants for Crohn's disease. (PDF 329 kb)
Measures of overall impact of risk variants for rheumatoid arthritis. (PDF 166 kb)
Measures of overall impact of risk variants for schizophrenia. (PDF 162 kb)
- Mendelian loci
Genetic loci that have alleles with discrete effects on the phenotype and that follow Mendel's laws of segregation and independent assortment.
- Heritability
The proportion of phenotypic variation in a population that is attributable to genetic variation among individuals.
- Disease liability
An underlying or latent continuous variable such that those with a liability above a threshold are considered diseased. The quantitative trait of liability reflects both genetic and environmental factors.
- Sibling recurrence risk
The ratio of the probability that a sibling of an individual affected by a disease will also be affected to the risk of disease in the general population.
- Genetic variance
The variance of trait values that can be ascribed to genetic differences among individuals. The total genetic variance of a trait can be dissected into additive, dominance and other components.
- Area under the receiver–operating curve
(AUC). The receiver–operating curve for a predictor (for example, a genetic test) plots the proportion of cases correctly identified by the test against the proportion of controls that are incorrectly classified as cases. The AUC indicates the probability that a factor (for example, a genetic risk score) will predict a higher risk of disease in a randomly selected case than in a control.
- Population attributable fraction
(PAF; also known as population attributable risk). For a given disease, risk factor and population, the fraction by which the incidence rate of the disease in the population would be reduced if the risk factor were eliminated.
- Overall disease risk
The lifetime probability that an individual will be affected by a disease.
- Genetic architectures
The number of risk alleles underlying disease, their allele frequency spectrum, effect sizes and mode of interaction.
- Linkage disequilibrium
A measure of whether alleles at two loci coexist in a population in a nonrandom manner. Alleles that are in linkage disequilibrium are found together on the same haplotype more often than expected by chance.
- Genomic profile risk
A predicted measure of genetic risk for individuals constructed from a set of loci, the risk alleles and corresponding effect sizes of which have been estimated in an independent sample.
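Two of the glossary definitions above translate directly into small calculations, sketched here under stated assumptions rather than as the article's method. The liability threshold assumes liability follows a standard normal distribution, which is the usual convention but is not stated in the glossary. The AUC is computed from its probabilistic definition by comparing every case with every control, with ties counted as one half (also an assumption). The function names are hypothetical.

```python
from statistics import NormalDist


def liability_threshold(lifetime_risk):
    """Threshold t such that P(liability > t) equals the lifetime risk.

    Assumes liability is standard normal in the population.
    """
    return NormalDist().inv_cdf(1 - lifetime_risk)


def auc_from_scores(case_scores, control_scores):
    """Probability that a random case scores higher than a random control.

    Ties contribute one half, which is a common convention.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# A disease with 1% lifetime risk puts the threshold in the upper tail.
print(liability_threshold(0.01))

# Toy risk scores: three of four case-control pairs are ordered correctly, one is tied.
print(auc_from_scores([2, 3], [1, 2]))
```

The pairwise loop is quadratic in sample size and is meant only to mirror the definition; a rank-based calculation would be the usual choice for large samples.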
Cite this article
Witte, J., Visscher, P. & Wray, N. The contribution of genetic variants to disease depends on the ruler. Nat Rev Genet 15, 765–776 (2014). https://doi.org/10.1038/nrg3786