Comparison of methods that use whole genome data to estimate the heritability and genetic architecture of complex traits


Multiple methods have been developed to estimate narrow-sense heritability, \(h^{2}\), using single nucleotide polymorphisms (SNPs) in unrelated individuals. However, a comprehensive evaluation of these methods has not yet been performed, leading to confusion and discrepancies in the literature. We present the most thorough and realistic comparison of these methods to date. We used thousands of real whole-genome sequences to simulate phenotypes under varying genetic architectures and confounding variables, and we used array, imputed, or whole genome sequence SNPs to obtain ‘SNP-heritability’ estimates. We show that SNP-heritability can be highly sensitive to assumptions about the frequencies, effect sizes, and levels of linkage disequilibrium of underlying causal variants, but that methods that bin SNPs according to minor allele frequency and linkage disequilibrium are less sensitive to these assumptions across a wide range of genetic architectures and possible confounding factors. These findings provide guidance for best practices and proper interpretation of published estimates.
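To make the idea of binning SNPs by minor allele frequency (MAF) concrete, the following is a minimal sketch rather than the authors' pipeline. The bin edges in `MAF_EDGES`, the function names, and the genotype layout (one list of 0/1/2 allele counts per SNP) are all assumptions made for illustration; the abstract does not specify them, and binning by linkage disequilibrium is not shown.

```python
from bisect import bisect_right

# Hypothetical MAF bin edges, chosen only for illustration; they define
# four bins: [0, 0.0025), [0.0025, 0.01), [0.01, 0.05), [0.05, 0.5].
MAF_EDGES = [0.0025, 0.01, 0.05]


def minor_allele_frequency(genotype_column):
    """Return the MAF of one SNP from per-individual allele counts (0, 1 or 2)."""
    allele_freq = sum(genotype_column) / (2 * len(genotype_column))
    return min(allele_freq, 1.0 - allele_freq)


def maf_bin(maf, edges=MAF_EDGES):
    """Return the index of the MAF bin; a value equal to an edge goes to the upper bin."""
    return bisect_right(edges, maf)


def group_snps_by_maf(genotypes_by_snp):
    """Map each bin index to the list of SNP indices that fall in that bin."""
    groups = {}
    for snp_index, column in enumerate(genotypes_by_snp):
        bin_index = maf_bin(minor_allele_frequency(column))
        groups.setdefault(bin_index, []).append(snp_index)
    return groups
```

In a multicomponent analysis, each group of SNPs would then contribute its own variance component. Monomorphic SNPs land in bin 0 in this sketch, whereas a real analysis would normally filter them out beforehand.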

Fig. 1: Comparison of heritability estimation methods. Mean \(\hat{h}_{\mathrm{SNP}}^{2}\) across 100 replicates from GRMs built from WGS SNPs in the least structured subsamples.
Fig. 2: Partitioned heritability methods to explore allelic spectra of traits. Mean \(\hat{h}_{\mathrm{SNP}}^{2}\) for four MAF bins across 100 replicates from multicomponent approaches in unrelated individuals using WGS SNPs in the least structured subsample.
Fig. 3: Influence of model assumptions using phenotypes simulated under alternative genetic architectures. Mean \(\hat{h}_{\mathrm{SNP}}^{2}\) across 100 replicates from GRMs built from imputed SNPs in the least structured subsamples across different model assumptions (bars) and different ways of simulating CVs (x axes).
Fig. 4: Influence of model assumptions using phenotypes simulated with LD-dependent genetic architecture. Mean \(\hat{h}_{\mathrm{SNP}}^{2}\) across 100 replicates from GRMs built from imputed SNPs in the least structured subsamples across different model assumptions (bars) and different ways of simulating CVs (x axes).
Fig. 5: Bias of heritability estimates under different model assumptions. Boxplots of the absolute bias of heritability estimates (\(\left|E\left(\hat{h}_{\mathrm{SNP}}^{2}\right)-h^{2}\right|\)) across all simulated phenotypes.
Fig. 6: Estimated \(\hat{h}_{\mathrm{SNP}}^{2}\) using multiple methods with imputed variants for six complex traits in the UK Biobank.
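The absolute bias plotted in Fig. 5 is the distance between the mean estimate over replicates and the simulated heritability. A minimal sketch of that calculation follows, assuming the expectation is approximated by the sample mean over replicates; the function name and the example values are hypothetical and are not results from the paper.

```python
from statistics import fmean


def absolute_bias(estimates, true_h2):
    """Return |mean(estimates) - true_h2| for one simulated phenotype setting.

    estimates: SNP-heritability estimates, one per replicate.
    true_h2:   the heritability used to simulate the phenotype.
    """
    return abs(fmean(estimates) - true_h2)


# Made-up replicate estimates for a phenotype simulated with h2 = 0.5.
example_bias = absolute_bias([0.46, 0.52, 0.49, 0.47], 0.5)
```

Because the mean is taken before the absolute value, estimates that scatter symmetrically around the true value give a bias near zero even when individual replicates are noisy.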




We thank D. Speed (University College London) for providing LDAK5. We thank the Keller and Vrieze lab groups, the Institute for Behavioral Genetics, N. Wray, A. Price, and S. Caron for helpful comments. This work was supported by NIH grant R01MH100141 (to M.C.K.), NHMRC grants 1078037 (to P.M.V.) and 1113400 (to P.M.V. and J.Y.), Sylvia & Charles Viertel Charitable Foundation Senior Medical Research Fellowship (to J.Y.), and NIH grants R01DA037904 and R01HG008983 (to S.V.). This work used the Janus supercomputer, which is supported by the National Science Foundation (award number CNS-0821794), the University of Colorado Boulder, the University of Colorado Denver, and the National Center for Atmospheric Research. The Janus supercomputer is operated by the University of Colorado Boulder. We thank the participants of the individual Haplotype Reference Consortium cohorts. This research has been conducted using the UK Biobank Resource.

Author information

L.M.E. and M.C.K. conceived and designed the study. L.M.E. performed the statistical analyses and simulations. R.T., S.I.V., S.G., G.R.A., S.D., D.W.B., T.R.d.C., M.E.G., B.M.N., J.Y., and P.M.V. provided statistical support. The Haplotype Reference Consortium, G.R.A., and S.D. contributed to data collection and management. L.M.E. and M.C.K. wrote the manuscript with participation of all authors.

Correspondence to Luke M. Evans or Matthew C. Keller.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–34 and Supplementary Note

Reporting Summary

Supplementary Tables

Supplementary Tables 1–10
