Abstract

There is increasing evidence that many risk loci found using genome-wide association studies are molecular quantitative trait loci (QTLs). Here we introduce a new set of functional annotations based on causal posterior probabilities of fine-mapped molecular cis-QTLs, using data from the Genotype-Tissue Expression (GTEx) and BLUEPRINT consortia. We show that these annotations are more strongly enriched for heritability (5.84× for eQTLs; P = 1.19 × 10⁻³¹) across 41 diseases and complex traits than annotations containing all significant molecular QTLs (1.80× for expression (e)QTLs). eQTL annotations obtained by meta-analyzing all GTEx tissues generally performed best, whereas tissue-specific eQTL annotations produced stronger enrichments for blood- and brain-related diseases and traits. eQTL annotations restricted to loss-of-function intolerant genes were even more enriched for heritability (17.06×; P = 1.20 × 10⁻³⁵). All molecular QTLs except splicing QTLs remained significantly enriched in joint analysis, indicating that each of these annotations is uniquely informative for disease and complex trait architectures.
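To make the annotation idea concrete, here is a minimal sketch, assuming that a SNP's MaxCPP value is the largest causal posterior probability (CPP) it receives across all genes or molecular features in which it was fine-mapped, as the name suggests. The function names, the toy CPP values and the per-SNP heritabilities are hypothetical and are not taken from the study. Enrichment is computed in the way commonly used for a continuous annotation: the share of heritability attributed to the annotation divided by the share of SNPs it covers.

```python
from collections import defaultdict


def max_cpp(cpp_records):
    """Collapse (snp, gene, cpp) triples into one MaxCPP value per SNP.

    Assumption: MaxCPP is the maximum CPP of a SNP over all genes.
    """
    best = defaultdict(float)
    for snp, _gene, cpp in cpp_records:
        if not 0.0 <= cpp <= 1.0:
            raise ValueError(f"CPP must lie in [0, 1], got {cpp} for {snp}")
        best[snp] = max(best[snp], cpp)
    return dict(best)


def enrichment(annotation, per_snp_h2):
    """Heritability enrichment of a continuous annotation.

    annotation: dict snp -> annotation value in [0, 1] (missing SNPs count as 0)
    per_snp_h2: dict snp -> per-SNP heritability for every SNP considered
    """
    total_h2 = sum(per_snp_h2.values())
    n_snps = len(per_snp_h2)
    annot_h2 = sum(annotation.get(s, 0.0) * h2 for s, h2 in per_snp_h2.items())
    annot_size = sum(annotation.get(s, 0.0) for s in per_snp_h2)
    # (proportion of heritability) / (proportion of SNPs)
    return (annot_h2 / total_h2) / (annot_size / n_snps)


# Hypothetical fine-mapping output: rs1 is fine-mapped for two genes.
records = [("rs1", "GENE_A", 0.2), ("rs1", "GENE_B", 0.7), ("rs2", "GENE_A", 0.1)]
annotation = max_cpp(records)

# Hypothetical per-SNP heritabilities (arbitrary units) for four SNPs.
h2 = {"rs1": 3.0, "rs2": 1.0, "rs3": 1.0, "rs4": 1.0}
print(annotation, enrichment(annotation, h2))
```

An enrichment above 1 means that the annotation carries more heritability than its size alone would predict. In the study itself the per-SNP heritabilities are not observed directly but are estimated from summary statistics.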

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements

We thank S. Raychaudhuri, N. Zaitlen, B. Pasaniuc, M. Nivard, J.-H. Sul and F. Hormozdiari for helpful discussions. This research was funded by NIH grants U01 HG009379, R01 MH101244, R01 MH109978, T32 DK110919 and R01 MH107649. This research was conducted using the UK Biobank Resource under application 16549.

Author information

Affiliations

  1. Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA

    Farhad Hormozdiari, Steven Gazal, Bryce van de Geijn, Hilary K. Finucane, Armin Schoech, Yakir Reshef, Xuanyao Liu, Luke O’Connor & Alkes L. Price

  2. Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA

    Farhad Hormozdiari, Steven Gazal, Bryce van de Geijn, Hilary K. Finucane, Po-Ru Loh, Armin Schoech, Yakir Reshef, Xuanyao Liu & Alkes L. Price

  3. Department of Computer Science, University of California, Los Angeles, CA, USA

    Chelsea J.-T. Ju & Eleazar Eskin

  4. Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA

    Po-Ru Loh & Alexander Gusev

  5. Program in Bioinformatics and Integrative Genomics, Harvard Graduate School of Arts and Sciences, Boston, MA, USA

    Luke O’Connor

  6. Dana Farber Cancer Institute, Harvard Medical School, Boston, MA, USA

    Alexander Gusev

  7. Department of Human Genetics, University of California, Los Angeles, CA, USA

    Eleazar Eskin

Contributions

F.H. and A.L.P. designed experiments. F.H. performed experiments. F.H., S.G., B.v.d.G., H.K.F., C.J.-T.J., P.-R.L., A.S., Y.R., X.L., L.O., A.G. and E.E. analyzed data. F.H. and A.L.P. wrote the manuscript with assistance from all authors.

Competing interests

The authors declare no competing interests.

Corresponding authors

Correspondence to Farhad Hormozdiari or Alkes L. Price.

Integrated supplementary information

  1. Supplementary Figure 1 Enrichment and τ* of different QTL annotations for whole blood in the GTEx dataset without conditioning on the baselineLD model.

    a, Meta-analysis results of enrichment for whole blood tissue from the GTEx datasets. b, Meta-analysis results of τ* for whole blood. The y axis is the meta-analyzed value, and error bars represent jackknife 95% confidence intervals. These values were computed by meta-analyzing 41 independent traits (n = 41 independent traits to derive the statistics). Numerical results are reported in Supplementary Table 9.

  2. Supplementary Figure 2 Simulation results at different eQTL sample sizes.

    We report annotation effect size (τ*) estimates at different eQTL sample sizes simulated under the alternative simulation framework (Methods). We simulated eQTL studies with sample sizes of 100, 200, 300, 400, 500, 600, 700, 800, 900 and 1,000. The x axis is the eQTL sample size, and the y axis is the estimated annotation effect size (τ*). We observed a linear relationship between eQTL sample size and τ* (R² = 0.84, P = 9.39 × 10⁻⁵).

  3. Supplementary Figure 3 Histogram of values for MaxCPP annotation in GTEx.

    The x axis is the MaxCPP value, and the y axis is the frequency corresponding to each MaxCPP value.

  4. Supplementary Figure 4 Pairwise correlation among all baselineLD model annotations and our six molecular QTL MaxCPP annotations.

    We compute the pairwise correlation between annotation values.

  5. Supplementary Figure 5 Pairwise correlation among LD scores of all baselineLD model annotations and six molecular QTL MaxCPP annotations.

    We compute the pairwise correlation between the LD score of each annotation.

  6. Supplementary Figure 6 Pairwise correlation among all baselineLD model annotations and GTEx blood and Brain+Nerve MaxCPP annotations.

    We compute the pairwise correlation between annotation values.

  7. Supplementary Figure 7 Pairwise correlation among LD scores of all baselineLD model annotations and GTEx blood and Brain+Nerve MaxCPP annotations.

    We compute the pairwise correlation between the LD score of each annotation.

  8. Supplementary Figure 8 Histogram of values for MaxCPP annotation for FE-Meta-Tissue for each molecular QTL in BLUEPRINT.

    a, eQTL. b, H3K27ac hQTL. c, H3K4me1 hQTL. d, meQTL. e, sQTL. All these annotations were obtained by creating the MaxCPP annotation for each QTL dataset.

  9. Supplementary Figure 9 MaxCPP enrichment estimates are not sensitive to the maximum number of causal variants per locus modeled by CAVIAR.

    We report MaxCPP enrichment estimates for each of 44 GTEx tissues with CAVIAR modeling either up to six or up to three causal variants per locus. We determined that results were not statistically different. The y axis is the enrichment meta-analyzed value, and error bars represent 95% confidence intervals. These values were computed by meta-analyzing 41 independent traits (n = 41 independent traits to derive the statistics). Numerical results are reported in Supplementary Table 37.

  10. Supplementary Figure 10 MaxCPP τ* estimates are not sensitive to the maximum number of causal variants per locus modeled by CAVIAR.

    We report MaxCPP τ* estimates for each of 44 GTEx tissues with CAVIAR modeling either up to six or up to three causal variants per locus. We determined that results were not statistically different. The y axis is the τ* meta-analyzed value, and error bars represent 95% confidence intervals. These values were computed by meta-analyzing 41 independent traits (n = 41 independent traits to derive the statistics). Numerical results are reported in Supplementary Table 37.

  11. Supplementary Figure 11 Simulations confirm that S-LDSC estimates for τ* are unique to the focal annotation.

    We generated simulated data under a model in which only baselineLD and GTEx-FE-Meta-Tissue MaxCPP annotations directly influence trait heritability, using estimated τ* values from meta-analysis of 41 independent traits (estimated using a model that contains only baselineLD and GTEx-FE-Meta-Tissue MaxCPP annotations). We then estimated τ* values using a model that contains baselineLD, GTEx-FE-Meta-Tissue MaxCPP and GTEx whole-blood MaxCPP annotations. Results are averaged across 2,000 simulations. a, We report τ* estimates for each annotation. The y axis is the mean of τ* values, and error bars represent 95% confidence intervals. We computed the mean and confidence intervals using 400 simulations (n = 400 independent simulations to derive the statistics). τ* estimates for GTEx whole-blood MaxCPP were not significantly different from 0. b, We report the false positive rate for different P-value thresholds of τ* estimates for GTEx whole-blood MaxCPP. We observed correct null calibration. We computed the false positive rate using 400 simulations (n = 400 independent simulations to derive the statistics).
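The null-calibration check described for Supplementary Figure 11b can be illustrated with a short sketch. It assumes that each simulation yields a τ* estimate with a standard error, that a two-sided P value is obtained from the resulting z-score under a normal approximation, and that the false positive rate at a threshold is the fraction of null simulations with a P value below it. The function names and the simulated inputs are hypothetical. The draws below come from a standard normal distribution, not from the study's simulation framework.

```python
import math
import random


def two_sided_p(estimate, std_err):
    """Two-sided P value for estimate / std_err under a normal approximation."""
    z = estimate / std_err
    return math.erfc(abs(z) / math.sqrt(2.0))


def false_positive_rate(p_values, threshold):
    """Fraction of null simulations whose P value falls below the threshold."""
    return sum(p < threshold for p in p_values) / len(p_values)


# Hypothetical null simulations: the true effect is 0 and the standard error is 1,
# so each estimate is a draw from N(0, 1). 400 matches the count in the caption.
rng = random.Random(0)
null_p_values = [two_sided_p(rng.gauss(0.0, 1.0), 1.0) for _ in range(400)]

for alpha in (0.05, 0.01, 0.001):
    fpr = false_positive_rate(null_p_values, alpha)
    print(f"threshold {alpha}: false positive rate {fpr:.4f}")
```

Under correct calibration the observed false positive rate should be close to each threshold, up to sampling noise from the finite number of simulations.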

Supplementary information

  1. Supplementary Figures, Text and Tables

    Supplementary Figures 1–11 and Supplementary Tables 1–5, 7–15, 17–26 and 28–36

  2. Reporting Summary

  3. Supplementary Table 6

    List of 47 datasets analyzed in this study

  4. Supplementary Table 16

    τ* of MaxCPP annotations for individual tissues and individual traits

  5. Supplementary Table 27

    Enrichment and τ* for CD14+ monocytes, CD16+ neutrophils, naive CD4+ T cells and FE-Meta-Tissue in the BLUEPRINT dataset

  6. Supplementary Table 37

    MaxCPP enrichment and τ* estimates are not sensitive to the maximum number of causal variants per locus modeled by CAVIAR

  7. Supplementary Table 38

    Results using baselineLD model v1.1 and baselineLD model v1.0 are highly concordant

About this article

DOI

https://doi.org/10.1038/s41588-018-0148-2