Review Article | Published:

Rare and common variants: twenty arguments

Nature Reviews Genetics volume 13, pages 135–145 (2012)

Abstract

Genome-wide association studies have greatly improved our understanding of the genetic basis of disease risk. The fact that they tend not to identify more than a fraction of the specific causal loci has led to divergence of opinion over whether most of the variance is hidden as numerous rare variants of large effect or as common variants of very small effect. Here I review 20 arguments for and against each of these models of the genetic basis of complex traits and conclude that both classes of effect can be readily reconciled.

Key points

  • For the past couple of years, discussion of the so-called 'missing heritability problem' has focused attention on the failure of genome-wide association studies (GWASs) to discover more than a minor fraction of the genetic variance for complex traits and disease susceptibility.

  • However, it now appears that the problem is largely a result of the small contribution of most variants, either because the variants are too rare to contribute population-wide, or because the effect sizes of common variants are, in general, very small.

  • This Review presents five arguments for, and five against, each of these two models and concludes that although the infinitesimal model is essentially correct, rare alleles of large effect almost certainly also make an essential contribution to risk of disease. Some of the more important arguments are listed below.

  • Standard evolutionary and quantitative genetic theory both provide strong expectations for rare and common variant contributions. There is also increasingly solid empirical evidence for both classes of contribution.

  • Neither model can yet be said to provide compelling explanations for epidemiological transitions and other demographic phenomena, including familial clustering and sibling resemblance.

  • Mechanistic explanations for additive within-locus effects and multiplicative between-locus effects are ultimately desired to complement a purely statistical description of effects. In either model, the majority of healthy individuals carry disease-associated alleles.

  • Although common variants may establish the background liability to many complex diseases, environmental and rare variant perturbations often provide the extra impetus that pushes an individual over the disease threshold.
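The second key point can be illustrated with the standard quantitative-genetic expression for the additive variance contributed by a single biallelic locus under Hardy-Weinberg equilibrium, 2p(1 − p)a², where p is the allele frequency and a is the average effect per allele copy. The Python sketch below is an illustration only: the function name, the frequencies and the effect sizes are hypothetical choices for the example and are not taken from the Review.

```python
def locus_variance(p: float, a: float) -> float:
    """Additive variance from one biallelic locus under Hardy-Weinberg
    equilibrium: 2 * p * (1 - p) * a**2, with a in phenotypic SD units."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * p * (1.0 - p) * a ** 2


# Hypothetical variants (values assumed for illustration only).
rare_large = locus_variance(p=0.001, a=1.0)    # rare allele, large effect
common_small = locus_variance(p=0.30, a=0.05)  # common allele, small effect

print(f"rare, large effect:   {rare_large:.5f}")
print(f"common, small effect: {common_small:.5f}")
```

With the phenotypic variance scaled to 1, both hypothetical variants account for roughly 0.1 to 0.2% of the variance, one because it is rare and the other because its effect is small.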

Acknowledgements

I particularly thank F. Vannberg, D. Goldstein, P. Visscher and E. Cirulli for discussions and suggestions and the Georgia Institute of Technology and the US National Institutes of Health for funding.

Author information

Affiliations

  1. School of Biology and Center for Integrative Genomics, 770 State Street, Georgia Institute of Technology, Atlanta, Georgia 30332, USA.

    • Greg Gibson

Competing interests

The author declares no competing financial interests.

Corresponding author

Correspondence to Greg Gibson.

Glossary

Common disease–common variant hypothesis

(CDCV hypothesis). The model that complex disease is largely attributable to a moderate number of common variants, each of which explains several per cent of the risk in a population.

Heritability

The proportion of the phenotypic variance in a population that is due to genotypic differences among individuals.

Genetic variance

The contribution of genotypic differences among individuals to phenotypic variation.

Narrow sense variance

The additive component of the genetic variance: namely, the average effect of substituting one allele for another at a locus.
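The three preceding entries can be related through the conventional variance decomposition. The notation below is an assumption of this sketch rather than the Review's own: V_P is the phenotypic variance, V_G the genetic variance, V_E the environmental variance, and V_A, V_D and V_I the additive, dominance and interaction components. Genotype-environment covariance and interaction are ignored for simplicity.

```latex
\begin{align}
V_P &= V_G + V_E, \\
V_G &= V_A + V_D + V_I, \\
H^2 &= \frac{V_G}{V_P}, \qquad h^2 = \frac{V_A}{V_P}.
\end{align}
```

Here H² is heritability in the broad sense and h² is heritability in the narrow sense, which uses only the additive component.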

Genotype relative risk

(GRR). The ratio of the risk of disease in individuals with the genotype to the risk in individuals without it. A ratio of 1.1 equates to a 10% increase in risk.
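As a minimal worked example of this definition, the Python sketch below computes a relative risk from two absolute risks. The function name and the risk values are hypothetical and were chosen only to reproduce the 1.1 ratio mentioned above.

```python
def genotype_relative_risk(risk_with: float, risk_without: float) -> float:
    """Ratio of disease risk in carriers to disease risk in non-carriers."""
    if risk_without <= 0.0:
        raise ValueError("risk in non-carriers must be positive")
    return risk_with / risk_without


# Hypothetical risks: 5.5% in carriers versus 5.0% in non-carriers.
grr = genotype_relative_risk(0.055, 0.050)
percent_increase = (grr - 1.0) * 100.0
print(f"GRR = {grr:.2f}, a {percent_increase:.0f}% increase in risk")
```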

Penetrance

Describes the proportion of individuals with a mutation or risk variant who have the disease.

Expressivity

The severity of the disease in individuals who have the risk variant and disease.

Genotype-by-genotype interactions

(G×G interactions). Otherwise known as epistasis, this refers to the situation in which the effect of one genotype is conditional on genotypes at one or more other unlinked loci.

Genotype-by-environment interactions

(G×E interactions). Refers to the situation in which the effect of the genotype is conditional on the environment, which may include abiotic (temperature), biotic (viral exposure) and cultural/behavioural influences.

Parent-of-origin genetic contributions

Genetic effects that are only seen when the allele is transmitted either from the mother or from the father.

Purifying selection

Selection against genetic variants that reduce fitness. Purifying selection generally keeps deleterious alleles at a low frequency or removes them from the population.

Chronic disease

Medical conditions that develop slowly and persist, generally with a strong genetic component.

Ciliopathies

A class of diseases due to disruption of the cilium, a cellular organelle.

Linkage disequilibrium

(LD). Nonrandom association between genotypes, generally discussed in relation to loci that are closely located on a chromosome: for example, within a gene.

Haplotype

A set of alleles that commonly segregate together. Haplotypes are defined by regions of extended linkage disequilibrium, which in humans are often up to 100 kb in length.

Mutation–selection balance

An evolutionary model that accounts for the maintenance of genetic variation as a balance between mutation generating variance and purifying selection removing it.

Decanalization

The notion that genetic systems evolved to be buffered but that large effect mutations or environmental change can overcome this buffering, thereby increasing the genetic variance.

Genomic selection

The use of genetic markers that are spread throughout the genome to select individuals with desired predicted breeding values.

Predicted breeding value

The estimated phenotype of progeny of individuals that have a particular genotype.

Threshold-dependent models

Models that postulate that individuals who exceed some threshold value of a continuous physiological characteristic (called 'liability') have, or are at high risk of, disease.
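A threshold model of this kind can be sketched numerically. The Python example below assumes, purely for illustration, that liability is normally distributed with unit variance and that the threshold is set so that 1% of the population is affected. The function name and the size of the shift are hypothetical.

```python
from statistics import NormalDist


def disease_risk(mean_liability: float, threshold: float, sd: float = 1.0) -> float:
    """Probability that liability ~ Normal(mean_liability, sd) exceeds threshold."""
    return 1.0 - NormalDist(mu=mean_liability, sigma=sd).cdf(threshold)


# Assumed setup: standard normal liability, threshold at the 99th percentile.
threshold = NormalDist().inv_cdf(0.99)

baseline = disease_risk(0.0, threshold)
# Hypothetical perturbation (a rare variant or an exposure) that shifts
# mean liability upwards by one standard deviation.
shifted = disease_risk(1.0, threshold)

print(f"baseline risk:            {baseline:.3f}")
print(f"risk after a +1 SD shift: {shifted:.3f}")
```

Under these assumptions, a shift of one standard deviation raises risk from 1% to roughly 9%, which shows how a modest change in liability near the tail can translate into a large change in risk.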

Endophenotypes

Intermediate physiological or psychological traits, such as metabolite and transcript abundance or a specific neuronal function.

Expression quantitative trait locus analysis

(eQTL analysis). Studies of the association between genotypes and gene expression (transcript abundance), leading to the detection of eQTLs.

Cryptic variation

Genetic variation with effects that are only seen under perturbed conditions, such as in the presence of a particular mutation or environmental exposure.

Transgressive segregation

The appearance of traits in the offspring that are more extreme than those observed in either parent.

Beavis effect

Also called the 'winner's curse', this is the observation that the effect sizes estimated in a discovery sample tend to be overestimates of the true effect sizes, as they typically receive the benefit of sampling variance in the same direction as the true effect in order to exceed strict genome-wide significance levels.
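The overestimation described in this entry can be reproduced with a small simulation. The Python sketch below is an illustration under assumed values: the true effect, the standard error, the significance cut-off and the number of simulated studies are all hypothetical, and a simple one-sided z threshold stands in for genome-wide significance.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 0.10     # hypothetical true effect size
STANDARD_ERROR = 0.05  # hypothetical sampling error of each estimate
Z_THRESHOLD = 3.0      # hypothetical one-sided significance cut-off
N_STUDIES = 100_000

# Each simulated discovery study yields a noisy estimate of the same effect.
estimates = [random.gauss(TRUE_EFFECT, STANDARD_ERROR) for _ in range(N_STUDIES)]

# Keep only the estimates that would have been declared significant.
significant = [b for b in estimates if b / STANDARD_ERROR > Z_THRESHOLD]

print(f"true effect:                    {TRUE_EFFECT:.3f}")
print(f"mean of all estimates:          {mean(estimates):.3f}")
print(f"mean of significant estimates:  {mean(significant):.3f}")
```

The mean over all simulated studies stays close to the true effect, whereas the mean over the significant subset is necessarily above the cut-off and therefore above the true effect.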

About this article

DOI

https://doi.org/10.1038/nrg3118
