Analysis

Classification of common human diseases derived from shared genetic and environmental determinants

Nature Genetics volume 49, pages 1319–1325 (2017)

Abstract

In this study, we used insurance claims for over one-third of the entire US population to create a subset of 128,989 families (481,657 unique individuals). We then used these data to (i) estimate the heritability and familial environmental patterns of 149 diseases and (ii) infer the genetic and environmental correlations for disease pairs from a set of 29 complex diseases. The majority (52 of 65) of our study's heritability estimates matched earlier reports, and 84 of our estimates appear to have been obtained for the first time. We used correlation matrices to compute environmental and genetic disease classifications and corresponding reliability measures. Among unexpected observations, we found that migraine, typically classified as a disease of the central nervous system, appeared to be most genetically similar to irritable bowel syndrome and most environmentally similar to cystitis and urethritis, all of which are inflammatory diseases.
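The abstract does not say how the correlation matrices were turned into classifications. As a generic illustration only, the sketch below converts a correlation matrix into dissimilarities with the common choice d = 1 − r and reports each item's closest neighbour, which is the kind of "most similar to" statement made above. The labels, the numeric values, and the function names `correlation_to_distance` and `nearest_neighbors` are invented placeholders, not estimates or procedures from the study.

```python
# Toy illustration only: the labels and correlation values below are invented
# placeholders, not estimates from the study.
labels = ["disease_A", "disease_B", "disease_C", "disease_D"]
corr = [
    [1.0, 0.6, 0.1, 0.2],
    [0.6, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.5],
    [0.2, 0.1, 0.5, 1.0],
]


def correlation_to_distance(matrix):
    """Map each correlation r to a dissimilarity 1 - r (one common choice)."""
    return [[1.0 - r for r in row] for row in matrix]


def nearest_neighbors(names, dist):
    """For every item, return the name of the other item at the smallest distance."""
    result = {}
    for i, name in enumerate(names):
        # Collect (distance, name) pairs for all items except the item itself.
        others = [(dist[i][j], names[j]) for j in range(len(names)) if j != i]
        result[name] = min(others)[1]
    return result


dist = correlation_to_distance(corr)
nearest = nearest_neighbors(labels, dist)

for name in labels:
    print(f"{name} is closest to {nearest[name]}")
```

A full classification would feed such a distance matrix into a tree-building or clustering method and assess its stability by resampling; those steps are omitted here because the abstract gives no details of them.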

Acknowledgements

We thank E. Gannon, R. Melamed, R. Mork, and M. Rzhetsky for numerous comments on earlier versions of the manuscript. This work was funded by the DARPA Big Mechanism program under ARO contract W911NF1410333, by National Institutes of Health grants R01HL122712, 1P50MH094267, and U01HL108634-01, and by a gift from Liz and Kent Dauten.

Author information

Affiliations

  1. Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, Illinois, USA.

    • Kanix Wang
  2. Institute of Genomics and Systems Biology, University of Chicago, Chicago, Illinois, USA.

    • Kanix Wang
    • Hallie Gaitsch
    • Andrey Rzhetsky
  3. Microsoft Research, Redmond, Washington, USA.

    • Hoifung Poon
  4. Vanderbilt Genetics Institute, Vanderbilt University, School of Medicine, Nashville, Tennessee, USA.

    • Nancy J Cox
  5. Department of Medicine, Department of Human Genetics, and Computation Institute, University of Chicago, Chicago, Illinois, USA.

    • Andrey Rzhetsky

Authors

  1. Kanix Wang

  2. Hallie Gaitsch

  3. Hoifung Poon

  4. Nancy J Cox

  5. Andrey Rzhetsky

Contributions

All authors contributed extensively to the work presented in this paper. K.W. and A.R. designed experiments, analyzed data, and wrote the manuscript; K.W., H.G., and H.P. performed computational experiments; and N.J.C., H.G., and H.P. contributed to iterative improvement of the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Andrey Rzhetsky.

Supplementary information

PDF files

  1. 1.

    Supplementary Text and Figures

    Supplementary Figures 1–6 and Supplementary Tables 3 and 5–8

Excel files

  1. 1.

    Supplementary Table 1

    Acronyms, biological systems, prevalence percentages and standard errors for 149 studied diseases.

  2. 2.

    Supplementary Table 2

    Heritability and preventability estimates and standard deviations for 149 studied diseases.

  3. 3.

    Supplementary Table 4

    Pairwise estimates and standard deviations of genetic, environmental and phenotypic correlations for 29 diseases.

About this article

DOI

https://doi.org/10.1038/ng.3931
