Abstract

Many genetic variants influence complex traits by modulating gene expression, thus altering the abundance of one or multiple proteins. Here we introduce a powerful strategy that integrates gene expression measurements with summary association statistics from large-scale genome-wide association studies (GWAS) to identify genes whose cis-regulated expression is associated with complex traits. We leverage expression imputation from genetic data to perform a transcriptome-wide association study (TWAS) to identify significant expression-trait associations. We applied our approaches to expression data from blood and adipose tissue measured in 3,000 individuals overall. We imputed gene expression into GWAS data from over 900,000 phenotype measurements to identify 69 new genes significantly associated with obesity-related traits (BMI, lipids and height). Many of these genes are associated with relevant phenotypes in the Hybrid Mouse Diversity Panel. Our results showcase the power of integrating genotype, gene expression and phenotype to gain insights into the genetic basis of complex traits.
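As a rough illustration of how imputed expression might be tested against GWAS summary statistics, the sketch below combines per-SNP GWAS z-scores into a single gene-level z-score using expression prediction weights and an LD (SNP correlation) matrix. The function names `twas_z` and `two_sided_p`, the input layout, and the toy numbers are assumptions made for illustration; they are not taken from the article or from any released software.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Gene-level z-score from summary statistics (illustrative sketch).

    weights: per-SNP expression prediction weights (hypothetical values,
             e.g. learned in an expression reference panel)
    gwas_z:  per-SNP GWAS z-scores for the trait, in the same SNP order
    ld:      SNP-by-SNP correlation (LD) matrix as a list of lists
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching dimensions")

    # Weighted sum of the GWAS z-scores.
    numerator = sum(w * z for w, z in zip(weights, gwas_z))

    # Variance of that weighted sum under the null: w' * LD * w.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("predicted expression has non-positive variance")

    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Toy inputs: three SNPs in one cis locus (all values are made up).
    weights = [0.5, -0.2, 0.1]
    gwas_z = [3.1, -1.4, 0.6]
    ld = [
        [1.0, 0.3, 0.1],
        [0.3, 1.0, 0.2],
        [0.1, 0.2, 1.0],
    ]
    z = twas_z(weights, gwas_z, ld)
    print(f"gene-level z = {z:.3f}, p = {two_sided_p(z):.3g}")
```

Dividing by the square root of the weighted LD term keeps the statistic on a standard normal scale under the null, so correlated SNPs are not counted as independent evidence.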



Acknowledgements

We thank the individuals who participated in the study. We also acknowledge L. Yang for helpful discussions that have improved the quality of this manuscript. We also thank K. Mohlke, M. Boehnke and F. Collins for help with the METSIM data. This work was funded in part by US National Institutes of Health (NIH) grants F32 GM106584 (A.G.), R01 GM053725 (B.P.), R01 GM105857 (A.L.P., A.G. and G.B.), HL-28481 (P.P., A.J.L. and M.C.) and HL-095056 (P.P. and B.P.) and by the US NIH training grant in Genomic Analysis and Interpretation T32 HG002536 (A.K.).

Author information

Affiliations

  1. Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.

    • Alexander Gusev
    • Gaurav Bhatia
    • Wonil Chung
    • Alkes L Price
  2. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.

    • Alexander Gusev
    • Gaurav Bhatia
    • Alkes L Price
  3. Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA.

    • Alexander Gusev
    • Gaurav Bhatia
    • Alkes L Price
  4. Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA.

    • Arthur Ko
    • Elina Nikkola
    • Marcus Alvarez
    • Aldons J Lusis
    • Päivi Pajukanta
    • Bogdan Pasaniuc
  5. Molecular Biology Institute, University of California, Los Angeles, Los Angeles, California, USA.

    • Arthur Ko
    • Päivi Pajukanta
  6. Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, California, USA.

    • Huwenbo Shi
    • Bogdan Pasaniuc
  7. Department of Psychiatry, VU University Medical Center, Amsterdam, the Netherlands.

    • Brenda W J H Penninx
    • Rick Jansen
  8. Department of Biological Psychology, VU University, Amsterdam, the Netherlands.

    • Eco J C de Geus
    • Dorret I Boomsma
  9. Bioinformatics Research Center, Department of Statistics, Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA.

    • Fred A Wright
  10. Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, USA.

    • Patrick F Sullivan
  11. Department of Psychiatry, University of North Carolina, Chapel Hill, North Carolina, USA.

    • Patrick F Sullivan
  12. Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.

    • Patrick F Sullivan
  13. Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA.

    • Mete Civelek
    • Aldons J Lusis
  14. Department of Clinical Chemistry, Fimlab Laboratories and University of Tampere School of Medicine, Tampere, Finland.

    • Terho Lehtimäki
    • Emma Raitoharju
    • Ilkka Seppälä
  15. Department of Clinical Physiology, Pirkanmaa Hospital District and University of Tampere School of Medicine, Tampere, Finland.

    • Mika Kähönen
  16. Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, Turku, Finland.

    • Olli T Raitakari
  17. Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, Turku, Finland.

    • Olli T Raitakari
  18. Department of Medicine, University of Eastern Finland and Kuopio University Hospital, Kuopio, Finland.

    • Johanna Kuusisto
    • Markku Laakso
  19. Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA.

    • Bogdan Pasaniuc

Authors

  1. Alexander Gusev

  2. Arthur Ko

  3. Huwenbo Shi

  4. Gaurav Bhatia

  5. Wonil Chung

  6. Brenda W J H Penninx

  7. Rick Jansen

  8. Eco J C de Geus

  9. Dorret I Boomsma

  10. Fred A Wright

  11. Patrick F Sullivan

  12. Elina Nikkola

  13. Marcus Alvarez

  14. Mete Civelek

  15. Aldons J Lusis

  16. Terho Lehtimäki

  17. Emma Raitoharju

  18. Mika Kähönen

  19. Ilkka Seppälä

  20. Olli T Raitakari

  21. Johanna Kuusisto

  22. Markku Laakso

  23. Alkes L Price

  24. Päivi Pajukanta

  25. Bogdan Pasaniuc

Contributions

A.G. and B.P. conceived and designed the experiments. A.G., A.K. and H.S. performed the experiments and analyzed the data. G.B., W.C., B.W.J.H.P., R.J., E.J.C.d.G., D.I.B., F.A.W., P.F.S., E.N., M.A., M.C., A.J.L., T.L., E.R., M.K., I.S., O.T.R., J.K. and M.L. generated data, reagents, materials and analysis tools. A.G., A.L.P., P.P. and B.P. wrote the manuscript. All authors reviewed, revised and wrote feedback for the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Alexander Gusev or Bogdan Pasaniuc.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–19 and Supplementary Note.

Excel files

  1. Supplementary Tables 1–17

About this article

DOI

https://doi.org/10.1038/ng.3506
