Estimation of effect size distribution from genome-wide association studies and implications for future discoveries


We report a set of tools to estimate the number of susceptibility loci and the distribution of their effect sizes for a trait on the basis of discoveries from existing genome-wide association studies (GWASs). We propose statistical power calculations for future GWASs using estimated distributions of effect sizes. Using reported GWAS findings for height, Crohn's disease and breast, prostate and colorectal (BPC) cancers, we determine that each of these traits is likely to harbor additional loci within the spectrum of low-penetrance common variants. These loci, which can be identified from sufficiently powerful GWASs, together could explain at least 15–20% of the known heritability of these traits. However, for BPC cancers, which have modest familial aggregation, our analysis suggests that risk models based on common variants alone will have modest discriminatory power (63.5% area under curve), even with new discoveries.
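The two quantities the abstract turns on, power for future GWASs and the discriminatory power of a risk model, can be illustrated with a minimal sketch. This is not the authors' estimation procedure. It assumes a standard normal approximation for a 1-degree-of-freedom association test, with a non-centrality of sqrt(N × variance explained), and a risk score that is normally distributed on the log-risk scale for a rare disease. All function names and parameters are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_single_locus(variance_explained, n_samples, alpha=5e-8):
    """Approximate two-sided power of a 1-df association test at one locus.

    Assumption (not from the article): the test statistic is
    N(sqrt(n_samples * variance_explained), 1) under the alternative.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    ncp = (n_samples * variance_explained) ** 0.5
    upper_tail = 1 - _STD_NORMAL.cdf(z_crit - ncp)
    lower_tail = _STD_NORMAL.cdf(-z_crit - ncp)
    return upper_tail + lower_tail


def expected_discoveries(effect_sizes, n_samples, alpha=5e-8):
    """Expected number of loci reaching significance.

    effect_sizes holds one variance-explained value per locus; the
    expectation is the sum of the per-locus powers.
    """
    return sum(power_single_locus(v, n_samples, alpha) for v in effect_sizes)


def auc_from_log_risk_variance(sigma2):
    """AUC of a risk score whose log-risk variance is sigma2.

    Assumption: the score is normal with equal variance in cases and
    controls and a mean shift of sigma2, which gives Phi(sqrt(sigma2 / 2)).
    """
    return _STD_NORMAL.cdf((sigma2 / 2) ** 0.5)
```

Under these assumptions, a hypothetical list of per-locus effect sizes passed to `expected_discoveries` shows how the expected yield grows with sample size, and `auc_from_log_risk_variance` shows why a modest log-risk variance caps discrimination well below 1.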

Figure 1: Nonparametric estimates for distributions of effect sizes for susceptibility loci.
Figure 2: Receiver operating characteristic curves for genetic risk models.



This work was supported by the intramural program of the National Cancer Institute, US National Institutes of Health. The research of N.C. and J.-H.P. was also partially funded by the Gene-Environment Initiative of the National Institutes of Health.

Author information

J.-H.P. and N.C. developed the statistical methods and designed the analyses. J.-H.P. implemented the methods and carried out all analyses. N.C. and S.J.C. drafted the manuscript. S.W., M.H.G., K.B.J. and U.P. made important suggestions for presentation and interpretation of the results. All the authors participated in critically reviewing the paper and approved the final version of the manuscript.

Corresponding author

Correspondence to Nilanjan Chatterjee.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Tables 1–7 and Supplementary Note. (PDF 544 kb)

About this article

Cite this article

Park, JH., Wacholder, S., Gail, M. et al. Estimation of effect size distribution from genome-wide association studies and implications for future discoveries. Nat Genet 42, 570–575 (2010).
