Reevaluation of SNP heritability in complex human traits


SNP heritability, the proportion of phenotypic variance explained by SNPs, has been reported for many hundreds of traits. Its estimation requires strong prior assumptions about the distribution of heritability across the genome, but current assumptions have not been thoroughly tested. By analyzing imputed data for a large number of human traits, we empirically derive a model that more accurately describes how heritability varies with minor allele frequency (MAF), linkage disequilibrium (LD) and genotype certainty. Across 19 traits, our improved model leads to estimates of common SNP heritability on average 43% (s.d. 3%) higher than those obtained from the widely used software GCTA and 25% (s.d. 2%) higher than those from the recently proposed extension GCTA-LDMS. Previously, DNase I hypersensitivity sites were reported to explain 79% of SNP heritability; using our improved heritability model, their estimated contribution is only 24%.


Figure 1: Comparison of the GCTA and LDAK models.
Figure 2: Relationship between heritability and MAF.
Figure 3: Comparison of methods for estimating SNP heritability for real and simulated data.
Figure 4: Comparing the GCTA and LDAK models for the GWAS traits.
Figure 5: Enrichment of SNP classes.
Figure 6: Varying quality control for the UCLEB traits.




Access to Wellcome Trust Case Control Consortium data was authorized as work related to the project “Genome-wide association study of susceptibility and clinical phenotypes in epilepsy,” while access to Children's Hospital of Philadelphia (CHOP) data was granted under Project 49228-1, “Assumptions underlying estimates of SNP heritability.” We thank A. Molloy, J. Mills and L. Brody for permission to use genotype data from the Trinity College Dublin Student Study and S. Langley for help accessing the CHOP data. This work is funded by the UK Medical Research Council under grant MR/L012561/1 (awarded to D.S.) and the British Heart Foundation under grant RG/10/12/28456 (the UCLEB Consortium) and is supported by researchers at the National Institute for Health Research (NIHR) University College London Hospitals Biomedical Research Centre. N.C. is an ESPOD Fellow from the European Molecular Biology Laboratory, European Bioinformatics Institute, and Wellcome Trust Sanger Institute. M.R.J. receives funding from the Imperial College NIHR Biomedical Research Centre (BRC) Scheme. S.N. is a Wellcome Trust Senior Research Fellow in Basic Biomedical Science and is also supported by the NIHR Cambridge Biomedical Research Centre. Analyses were performed with the use of the UCL Computer Science Cluster and the help of the CS Technical Support Group, as well as the use of the UCL Legion High-Performance Computing Facility (Legion@UCL) and associated support services.

Author information





D.S. and N.C. performed the analyses. D.S. and D.J.B. wrote the manuscript with assistance from N.C., M.R.J., S.N. and members of the UCLEB Consortium.

Corresponding author

Correspondence to Doug Speed.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

A full list of members and affiliations appears in the Supplementary Note.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–25 and Supplementary Tables 1–12 (PDF 3715 kb)


About this article


Cite this article

Speed, D., Cai, N., Johnson, M. et al. Reevaluation of SNP heritability in complex human traits. Nat Genet 49, 986–992 (2017).
