Analysis

Linkage disequilibrium–dependent architecture of human complex traits shows action of negative selection

Nature Genetics volume 49, pages 1421–1427 (2017)

Abstract

Recent work has hinted at the linkage disequilibrium (LD)-dependent architecture of human complex traits, where SNPs with low levels of LD (LLD) have larger per-SNP heritability. Here we analyzed summary statistics from 56 complex traits (average N = 101,401) by extending stratified LD score regression to continuous annotations. We determined that SNPs with low LLD have significantly larger per-SNP heritability and that roughly half of this effect can be explained by functional annotations negatively correlated with LLD, such as DNase I hypersensitivity sites (DHSs). The remaining signal is largely driven by our finding that more recent common variants tend to have lower LLD and to explain more heritability (P = 2.38 × 10^−104); the youngest 20% of common SNPs explain 3.9 times more heritability than the oldest 20%, consistent with the action of negative selection. We also inferred jointly significant effects of other LD-related annotations and confirmed via forward simulations that they jointly predict deleterious effects.
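As a rough illustration of the quantities named in the abstract, the sketch below assumes the linear model commonly used with stratified LD score regression: per-SNP heritability is a weighted sum of annotation values, and each SNP's LD score for an annotation is the sum of r² to every SNP weighted by that SNP's annotation value. The function names, the toy LD matrix, the age-like annotation, and the coefficients are all hypothetical and chosen for readability; none of them come from the paper, and the code does not fit a regression.

```python
# Toy sketch only: every number and name here is an illustrative assumption.

def ld_scores(r2, annot):
    """Annotation-weighted LD scores: l(j, c) = sum_k annot[k][c] * r2[j][k]."""
    n_snps, n_annot = len(annot), len(annot[0])
    return [[sum(r2[j][k] * annot[k][c] for k in range(n_snps))
             for c in range(n_annot)]
            for j in range(n_snps)]

def per_snp_h2(annot, tau):
    """Per-SNP heritability under the assumed linear model: sum_c annot[j][c] * tau[c]."""
    return [sum(a * t for a, t in zip(row, tau)) for row in annot]

def quintile_shares(values, h2):
    """Share of total heritability carried by each quintile of a continuous annotation."""
    order = sorted(range(len(values)), key=lambda j: values[j])
    n = len(order)
    bounds = [q * n // 5 for q in range(6)]  # quintile boundaries as list indices
    total = sum(h2)
    return [sum(h2[j] for j in order[bounds[q]:bounds[q + 1]]) / total
            for q in range(5)]

n = 10
# Column 0: base annotation (all ones). Column 1: hypothetical age-like value.
annot = [[1.0, float(j)] for j in range(n)]
# Hypothetical coefficients: older variants get less per-SNP heritability.
tau = [1.0, -0.1]

# Toy LD matrix: SNPs paired off (0-1, 2-3, ...) with r^2 = 0.5 within a pair.
r2 = [[0.0] * n for _ in range(n)]
for j in range(n):
    r2[j][j] = 1.0
for j in range(0, n, 2):
    r2[j][j + 1] = r2[j + 1][j] = 0.5

ld = ld_scores(r2, annot)
h2 = per_snp_h2(annot, tau)
shares = quintile_shares([row[1] for row in annot], h2)
ratio = shares[0] / shares[4]  # youngest quintile relative to oldest quintile
print(shares, ratio)
```

With real data the coefficients would be estimated by regressing association statistics on the LD scores; here they are fixed by hand so that the quintile comparison, the same kind of contrast as the youngest versus oldest 20% of SNPs in the abstract, is easy to follow.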

Acknowledgements

We thank the research participants and employees of 23andMe for making this work possible. We thank S. Sunyaev, Y. Reshef, G. Kichaev, D. Speed, and F. Day for helpful discussions. This research has been conducted using the UK Biobank Resource (application number 16549). This research was funded by NIH grants R01 MH101244, R01 MH107649, and U01 HG009088.

Author information

Affiliations

  1. Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.

    Steven Gazal, Hilary K Finucane, Po-Ru Loh, Pier Francesco Palamara, Xuanyao Liu, Armin Schoech, Alexander Gusev & Alkes L Price

  2. Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA.

    Steven Gazal, Hilary K Finucane, Po-Ru Loh, Pier Francesco Palamara, Xuanyao Liu, Armin Schoech, Brendan Bulik-Sullivan, Benjamin M Neale, Alexander Gusev & Alkes L Price

  3. Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.

    Hilary K Finucane

  4. 23andMe, Inc., Mountain View, California, USA.

    Nicholas A Furlotte

  5. Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA.

    Armin Schoech

  6. Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA.

    Brendan Bulik-Sullivan & Benjamin M Neale

  7. Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA.

    Benjamin M Neale

  8. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.

    Alkes L Price

Authors

  1. Steven Gazal

  2. Hilary K Finucane

  3. Nicholas A Furlotte

  4. Po-Ru Loh

  5. Pier Francesco Palamara

  6. Xuanyao Liu

  7. Armin Schoech

  8. Brendan Bulik-Sullivan

  9. Benjamin M Neale

  10. Alexander Gusev

  11. Alkes L Price

Contributions

S.G. and A.L.P. designed experiments. S.G. performed experiments. S.G., H.K.F., N.A.F., and P.-R.L. analyzed data. S.G. and A.L.P. wrote the manuscript with assistance from H.K.F., N.A.F., P.-R.L., P.F.P., X.L., A.S., B.B.-S., B.M.N., and A.G.

Competing interests

N.A.F. is an employee of 23andMe, Inc.

Corresponding authors

Correspondence to Steven Gazal or Alkes L Price.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–8 and 10–16, Supplementary Tables 5, 6, 8a, 10–16, 18–22, 24 and 25, and Supplementary Note

  2. Life Sciences Reporting Summary

  3. Supplementary Figure 9

    Proportion of heritability explained by the quintiles of each LD-related annotation of the baseline-LD model for each of the 62 data sets analyzed.

Excel files

  1. Supplementary Tables 1–4, 7, 8b,c, 17 and 23

  2. Supplementary Table 9

    Effect size and enrichment of each annotation of the baseline-LD model in each of the 62 data sets analyzed.

About this article

DOI

https://doi.org/10.1038/ng.3954
