  • Review Article

Genetic correlations of polygenic disease traits: from theory to practice

Abstract

The genetic correlation describes the genetic relationship between two traits and can contribute to a better understanding of the shared biological pathways and/or the causality relationships between them. The rarity of large family cohorts with recorded instances of two traits, particularly disease traits, has made it difficult to estimate genetic correlations using traditional epidemiological approaches. However, advances in genomic methodologies, such as genome-wide association studies, and widespread sharing of data now allow genetic correlations to be estimated for virtually any trait pair. Here, we review the definition, estimation, interpretation and uses of genetic correlations, with a focus on applications to human disease.

Fig. 1: Different mechanisms of pleiotropy between two diseases.
Fig. 2: Genome-wide genetic correlation versus regional genetic correlation.
Fig. 3: Relation between cross-disorder relative risk (CDRR) and genetic correlation.
Fig. 4: Precision of genetic correlation estimates compared to heritability estimates and power for GREML and LDSC.
Fig. 5: Bias in estimated genetic parameters.
Fig. 6: Prediction accuracy increases when correlated traits are combined.

Acknowledgements

W.v.R. was funded by the ALS Foundation Netherlands. W.J.P. was funded by an NWO Veni grant (91619152). S.H.L. is an ARC Future Fellow (FT160100229). N.R.W. acknowledges funding from the Australian National Health and Medical Research Council (1078901, 1087889 and 1113400). W.v.R. and N.R.W. acknowledge funding from the EU Joint Programme – Neurodegenerative Disease Research (JPND) project (Australia, NHMRC 1151854; The Netherlands, ZonMW project number 733051071). The authors thank K. Tilling, G. Davey Smith and the members of the University of Queensland Program in Complex Trait Genomics for their insightful discussions.

Competing interests

The authors declare no competing interests.

Reviewer information

Nature Reviews Genetics thanks D. Balding, B. Pasaniuc and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Author information

Contributions

All authors researched data for the article, made substantial contributions to discussions of the content and reviewed and/or edited the manuscript before submission. W.v.R. and N.R.W. wrote the article.

Corresponding authors

Correspondence to Wouter van Rheenen or Naomi R. Wray.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

BUHMBOX: http://software.broadinstitute.org/mpg/buhmbox/

fastPAINTOR: https://github.com/gkichaev/PAINTOR_V3.0

GCTA: http://cnsgenomics.com/software/gcta/

GNOVA: https://github.com/xtonyjiang/GNOVA

GWIS: https://sites.google.com/site/mgnivard/gwis

JPND: www.jpnd.eu

LCV: https://github.com/lukejoconnor/LCV

LDAK: http://dougspeed.com/ldak

LD Hub: http://ldsc.broadinstitute.org/ldhub/

LDSC: https://github.com/bulik/ldsc

MR Steiger: https://github.com/explodecomputer/causal-directions

MTAG: https://github.com/omeed-maghzian/mtag

PCGC: https://data.broadinstitute.org/alkesgroup/PCGC/

ρHESS: https://github.com/huwenboshi/hess

PleioPred: https://github.com/yiminghu/PleioPred

Popcorn: https://github.com/brielin/Popcorn

RiVIERA: https://github.com/yueli-compbio/RiVIERA-beta

SMTpred: https://github.com/uqrmaie1/smtpred

SumHer: http://dougspeed.com/sumher/

Glossary

Parameter

A numerical value that summarizes a characteristic of a population, such as the mean height of men, the lifetime risk of schizophrenia or the heritability of a specific trait.

Traits

Measurements or phenotypes that are usually studied as the outcome of statistical analyses. They can be quantitative (for example, height) or dichotomous (for example, schizophrenia).

Estimates

Approximations of a parameter based on a sample of observed data drawn from a population.

Ascertainment biases

Types of bias that occur when the studied trait or disease affects how data were ascertained. For example, patients with a family history of diabetes may have more frequent examinations for cardiovascular diseases.

Genome-wide association studies

Studies in which up to millions of mostly common single-nucleotide polymorphisms from across the genome are each tested for association with a trait.

GWAS summary statistics

The output of statistical tests of association of a trait with each single-nucleotide polymorphism generated by a genome-wide association study (GWAS), typically including the effect allele, signed effect estimate, standard error, test statistic (for example, a z-score) and/or p-value.
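As a minimal sketch of how the fields of a summary-statistics row relate to one another, the snippet below derives a z-score and a two-sided p-value from a signed effect estimate and its standard error, assuming the test statistic is approximately standard normal under the null. The row contents and the function name `z_and_p` are hypothetical and are not taken from any real GWAS or software package.

```python
from statistics import NormalDist


def z_and_p(beta_hat, se):
    """Return the z-score and two-sided p-value for one SNP's effect estimate."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = beta_hat / se
    # Two-sided p-value under a standard normal null distribution.
    p = 2.0 * NormalDist().cdf(-abs(z))
    return z, p


# Hypothetical summary-statistic row (illustrative values only).
row = {"snp": "rs_example", "effect_allele": "A", "beta": 0.05, "se": 0.01}
z, p = z_and_p(row["beta"], row["se"])
print(f"{row['snp']}: z = {z:.2f}, p = {p:.2e}")
```

The sign of the z-score is only meaningful relative to the recorded effect allele, which is why summary statistics need to carry that field.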

Power

The probability that a study correctly rejects the null hypothesis of no association or correlation when a true effect exists; equivalently, 1 − the type II error rate.
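A rough illustration of this definition: for an estimate that is approximately normally distributed, power can be computed from the true effect, the standard error of the estimate and the significance threshold. The function name and the example values (a true correlation of 0.3 estimated with a standard error of 0.1) are assumptions chosen for illustration, not figures from the article.

```python
from statistics import NormalDist


def power_two_sided_z(true_effect, se, alpha=0.05):
    """Power of a two-sided z-test for an approximately normal estimate."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)  # critical value of the test
    ncp = true_effect / se                    # expected value of the z-statistic
    # Probability that the z-statistic falls in either rejection region.
    return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)


# Hypothetical scenario: true effect 0.3, standard error 0.1.
power = power_two_sided_z(true_effect=0.3, se=0.1)
# With no true effect, the rejection probability equals alpha (the type I error rate).
power_null = power_two_sided_z(true_effect=0.0, se=0.1)
print(f"power = {power:.3f}, rejection rate under the null = {power_null:.3f}")
```

Halving the standard error doubles the expected z-statistic, which is why larger samples increase power.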

Bias

A phenomenon in which statistical analyses of observed data produce estimates that systematically overestimate or underestimate the population parameter. Bias can arise from the ascertainment of the observed data or from the statistical procedures used to generate the estimates.

Linkage disequilibrium

(LD). The non-random association of alleles at two distinct loci. LD induces a correlation between two single-nucleotide polymorphism (SNP) genotypes in the population and arises because alleles of neighbouring SNPs are transmitted together until recombination events break the association down.
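The correlation that LD induces between two SNPs is commonly summarized as r², the squared correlation between alleles on the same haplotype. The sketch below computes it from phased haplotypes; the haplotype data are invented for illustration and `ld_r_squared` is a hypothetical helper, not a function from any of the tools listed under Related links.

```python
def ld_r_squared(haplotypes):
    """Squared allelic correlation (r^2) between two biallelic SNPs.

    `haplotypes` is a list of (a, b) pairs, where a and b are 0/1 indicators
    of the reference allele at SNP A and SNP B on the same haplotype.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D measures the departure from independent association of the two alleles.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Hypothetical phased haplotypes for two neighbouring SNPs.
haps = [(1, 1), (1, 1), (0, 0), (0, 0), (1, 0), (0, 1), (1, 1), (0, 0)]
r2 = ld_r_squared(haps)
print(f"r^2 = {r2:.2f}")
```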

Genetic value

(g). The sum of the total effects of all genetic loci on the trait in an individual, that is \(g={X}^{{\prime} }\beta \), where X is a vector of genotypes for all loci and β is a vector of additive allelic effects on the trait. It is also called the genotypic value, true polygenic (risk) score or breeding value.
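
As a minimal sketch of this definition, the genetic value of one individual can be computed as the sum over loci of the genotype multiplied by the additive allelic effect. The genotype coding (0, 1 or 2 copies of the effect allele), the function name and the effect sizes below are illustrative assumptions, not values from any study.

```python
def genetic_value(genotypes, effects):
    """Return the sum over loci of genotype * additive allelic effect.

    genotypes: allele counts per locus (assumed coding: 0, 1 or 2).
    effects:   additive per-allele effects, in the same locus order.
    """
    if len(genotypes) != len(effects):
        raise ValueError("genotypes and effects must have the same length")
    return sum(x * b for x, b in zip(genotypes, effects))


# Hypothetical individual genotyped at three loci.
g = genetic_value([0, 1, 2], [0.5, -0.25, 0.25])  # 0 - 0.25 + 0.5 = 0.25
```

In practice the true effects are unknown, so a polygenic score built from estimated effects only approximates this quantity.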

Covariance

(\({\sigma }_{x,y}\)). The expected product of the deviation of two random variables from their mean (\({\sigma }_{x,y}=E[(X-{\mu }_{x})(Y-{\mu }_{y})]\)).

Genetic variance

(\({\sigma }_{g}^{2}\)). The expected squared deviation of genetic values from the mean genetic value (\({\sigma }_{g}^{2}=E[{(G-{\mu }_{g})}^{2}]\)), and can also be considered the covariance of a genetic value with itself.
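
The two definitions above can be sketched on a finite sample by replacing the expectation with an average. This is an illustration under stated assumptions: it divides by n (the population form) rather than n − 1, and the genetic values are made up.

```python
from statistics import fmean


def covariance(xs, ys):
    """Average product of deviations from the means (divisor n)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    mx, my = fmean(xs), fmean(ys)
    return fmean((x - mx) * (y - my) for x, y in zip(xs, ys))


def variance(xs):
    """Variance written as the covariance of a variable with itself."""
    return covariance(xs, xs)


# Hypothetical genetic values for four individuals.
genetic_values = [-1.0, 0.0, 0.0, 1.0]
var_g = variance(genetic_values)  # (1 + 0 + 0 + 1) / 4 = 0.5
```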

Heritability

(\({h}^{2}\)). The proportion of phenotypic variance (parameter \({\sigma }_{P}^{\,2}\), estimate VP) attributable to variance in genetic factors. In the context of human traits, most often only additive genetic factors are considered for the genetic variance (parameter \({\sigma }_{A}^{2}\), estimate VA) and the ratio of variances is the narrow-sense heritability.

Latent model

A collection of formalized assumptions to describe a data-generating process through which observed variables (such as disease occurrence) can be used to identify unobserved (latent) variables (for example, genetic parameters: heritability and genetic correlation).

Phenotypic variance

(\({\sigma }_{P}^{2}\)). Variance of phenotypic values (for example, height or disease liability) after accounting for the variance attributable to fixed effects (for example, sex). When phenotypes are standardized, these phenotypic values are scaled such that \({\mu }_{P}\) = 0 and \({\sigma }_{P}^{2}\) = 1.

Coheritability

(\({h}_{x,y}\)). The genetic covariance of standardized traits. This is a useful measure for comparisons of coheritabilities and heritabilities on the same scale.
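
Because the coheritability and the two heritabilities share a scale, the genetic correlation follows from them as the coheritability divided by the square root of the product of the heritabilities. The sketch below assumes standardized traits; the function name and input values are hypothetical.

```python
from math import sqrt


def genetic_correlation(coheritability, h2_x, h2_y):
    """Genetic correlation = coheritability / sqrt(h2_x * h2_y)."""
    if h2_x <= 0 or h2_y <= 0:
        raise ValueError("heritabilities must be positive")
    return coheritability / sqrt(h2_x * h2_y)


# Hypothetical traits, both with heritability 0.5 and coheritability 0.25.
r_g = genetic_correlation(0.25, 0.5, 0.5)  # 0.25 / 0.5 = 0.5
```

The division is unstable when either heritability is close to zero, which is one reason estimates of genetic correlation are noisy for weakly heritable traits.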

Linear mixed model

(LMM). A linear model that includes both fixed and random effects to describe phenotypic values and that allows a correlation structure between the random effect levels.

Restricted maximum likelihood

(REML). A method for maximum likelihood estimation of variance–covariance components of the parameters in linear mixed models.

Liability threshold model

A model that describes a dichotomous trait (disease) as a threshold partitioning of ‘liability’, which is a latent variable assumed to follow a standard normal distribution in the population. The liability threshold (T) defines lifetime risk (K) of disease as the proportion of individuals exceeding this threshold.
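
Under this model the threshold and the lifetime risk determine each other through the standard normal distribution. A minimal sketch using Python's standard library follows; the function names and the 1% risk are illustrative assumptions.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def liability_threshold(k):
    """Threshold T such that a proportion k of liabilities exceeds T."""
    if not 0.0 < k < 1.0:
        raise ValueError("lifetime risk must lie strictly between 0 and 1")
    return STANDARD_NORMAL.inv_cdf(1.0 - k)


def lifetime_risk(threshold):
    """Proportion of the population whose liability exceeds the threshold."""
    return 1.0 - STANDARD_NORMAL.cdf(threshold)


# Hypothetical disease with a lifetime risk of 1%: T is roughly 2.33.
T = liability_threshold(0.01)
```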

Risk ratio

Ratio between the risk of disease in a specific group (for example, relatives of affected individuals) and the risk of disease in the general population.

Tetrachoric correlation

The correlation between two latent normally distributed liability phenotypes assumed to underlie dichotomous population data and estimated from an observed 2 × 2 frequency table.

Genomic relationship matrix

(GRM). A matrix whose off-diagonal elements represent a coefficient of genetic sharing between individuals to describe the variance–covariance structure between their genetic values calculated from observed single-nucleotide polymorphism (SNP) data. GRM coefficients can be calculated based on different assumptions of the expected distribution of per-SNP heritability.
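
One common way to compute such coefficients is sketched below, assuming genotypes coded as allele counts and each SNP standardized by its sample allele frequency, which implicitly gives every SNP the same expected heritability. This is only one of the possible assumptions mentioned above, and the function name is hypothetical.

```python
from math import sqrt


def genomic_relationship_matrix(genotypes):
    """Build a GRM from allele counts (rows: individuals, columns: SNPs).

    Each SNP is centred by 2p and scaled by sqrt(2p(1 - p)), where p is the
    sample allele frequency. Monomorphic SNPs are skipped.
    """
    n, m = len(genotypes), len(genotypes[0])
    standardized = [[] for _ in range(n)]
    for snp in range(m):
        counts = [row[snp] for row in genotypes]
        p = sum(counts) / (2 * n)
        if p <= 0.0 or p >= 1.0:
            continue  # no variation at this SNP in the sample
        scale = sqrt(2 * p * (1 - p))
        for ind in range(n):
            standardized[ind].append((counts[ind] - 2 * p) / scale)
    used = len(standardized[0])
    if used == 0:
        raise ValueError("no polymorphic SNPs in the sample")
    # Average cross-product of standardized genotypes for every pair.
    return [
        [sum(a * b for a, b in zip(standardized[j], standardized[k])) / used
         for k in range(n)]
        for j in range(n)
    ]
```

Real analyses use millions of SNPs and dedicated software; the nested lists here are only meant to make the arithmetic explicit.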

SNP-based heritability

An estimate of the proportion of the total phenotypic variance attributable to the additive effects of the class of variants (that is, common single-nucleotide polymorphisms (SNPs)) that are typically genotyped and imputed in pursuit of a genome-wide association study. It is often shortened to SNP heritability, but this should be avoided.

Genotype by environment (G × E) interaction

Differences in size and/or direction of the effect of genotype on disease risk in two different environments.

Sample heterogeneity

Differences in the effects of genotype on disease risk in two different cohorts. Potential causes include differences in phenotype criteria, ascertainment methods and unknown environmental differences with genotype by environment interaction.

Infinitesimal model

This model assumes that a trait is shaped by a very large number of variants with small (infinitesimal) effects, resulting in a normally distributed phenotype. A polygenic architecture of more than about 10 causal variants is already approximated well by the normal distribution of infinitesimal model theory.

Haseman–Elston regression

Regression of the product of the standardized phenotypes of pairs of individuals on their coefficient of genetic sharing as defined in the genomic relationship matrix.
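
The regression described above can be sketched as follows. The sketch assumes one data point per pair of distinct individuals and ordinary least squares with an intercept; under these assumptions the slope estimates the genetic variance of the standardized phenotype. The function name is hypothetical, and the input is taken to be a precomputed genomic relationship matrix given as nested lists.

```python
from statistics import fmean, pstdev


def haseman_elston_slope(phenotypes, grm):
    """Slope of phenotype products regressed on pairwise genetic sharing."""
    mu, sd = fmean(phenotypes), pstdev(phenotypes)
    if sd == 0:
        raise ValueError("phenotypes have no variance")
    z = [(y - mu) / sd for y in phenotypes]  # standardized phenotypes
    n = len(z)
    # One observation per pair of distinct individuals (j < k).
    sharing = [grm[j][k] for j in range(n) for k in range(j + 1, n)]
    products = [z[j] * z[k] for j in range(n) for k in range(j + 1, n)]
    # Ordinary least squares slope with an intercept.
    mx, my = fmean(sharing), fmean(products)
    sxy = sum((x - mx) * (y - my) for x, y in zip(sharing, products))
    sxx = sum((x - mx) ** 2 for x in sharing)
    if sxx == 0:
        raise ValueError("genetic sharing does not vary between pairs")
    return sxy / sxx
```

Replacing the product of one trait's phenotypes with the cross-product of two traits' phenotypes gives the analogous estimate of genetic covariance.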

Confounding bias

A type of bias that emerges when a covariate, a ‘confounder’, causally influences the predictor variable and outcome variable. When the confounder is not accounted for, the relationship between predictor and outcome may be biased (confounded).

Assortative mating

Mate choice based on a trait such that the phenotypes of mates are positively correlated. Traits that show assortative mating in humans include height and educational attainment.

Collider bias

A type of bias that emerges when estimates are conditioned on a covariate, a ‘collider,’ that is causally influenced by both the predictor variable and outcome variable.

About this article

Cite this article

van Rheenen, W., Peyrot, W.J., Schork, A.J. et al. Genetic correlations of polygenic disease traits: from theory to practice. Nat Rev Genet 20, 567–581 (2019). https://doi.org/10.1038/s41576-019-0137-z
