Abstract

Genome-wide association studies (GWAS) have identified thousands of variants robustly associated with complex traits. However, the biological mechanisms underlying these associations are, in general, not well understood. We propose a gene-based association method called PrediXcan that directly tests the molecular mechanisms through which genetic variation affects phenotype. The approach estimates the component of gene expression determined by an individual's genetic profile and correlates 'imputed' gene expression with the phenotype under investigation to identify genes involved in the etiology of the phenotype. Genetically regulated gene expression is estimated using whole-genome tissue-dependent prediction models trained with reference transcriptome data sets. PrediXcan enjoys the benefits of gene-based approaches such as reduced multiple-testing burden and a principled approach to the design of follow-up experiments. Our results demonstrate that PrediXcan can detect known and new genes associated with disease traits and provide insights into the mechanism of these associations.
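
The two steps described above (imputing the genetically regulated component of expression, then correlating it with the phenotype) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an additive linear prediction model, and the SNP identifiers, weights, genotypes, phenotype values and function names are all hypothetical. In practice the weights would come from models trained on a reference transcriptome data set.

```python
from math import sqrt


def predict_expression(dosages, weights):
    """Impute expression for one individual as a weighted sum of allele dosages.

    Assumes an additive linear model; SNPs missing from `dosages` count as 0.
    """
    return sum(w * dosages.get(snp, 0.0) for snp, w in weights.items())


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-SNP weights for a single gene in a single tissue.
weights = {"rs_a": 0.5, "rs_b": -0.25}

# Hypothetical allele dosages (0, 1 or 2) for four individuals.
genotypes = [
    {"rs_a": 0, "rs_b": 2},
    {"rs_a": 1, "rs_b": 1},
    {"rs_a": 2, "rs_b": 0},
    {"rs_a": 1, "rs_b": 2},
]

# Hypothetical quantitative phenotype for the same four individuals.
phenotype = [1.0, 2.1, 3.2, 1.4]

# Step 1: impute genetically regulated expression from genotype alone.
imputed = [predict_expression(g, weights) for g in genotypes]

# Step 2: test the gene by correlating imputed expression with the phenotype.
r = pearson_r(imputed, phenotype)
print(imputed, round(r, 3))
```

Repeating step 2 once per gene rather than once per variant is what yields the reduced multiple-testing burden noted above; a real analysis would use a regression framework with covariates and a significance test rather than a raw correlation.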

Acknowledgements

We thank A. Konkashbaev and C. Fuchsberger for outstanding technical support and N. Knoblauch for assistance in performing the quality control pipeline. We acknowledge the following US National Institutes of Health grants: K12 CA139160 (H.K.I.), T32 MH020065 (K.P.S.), F32 CA165823 (H.E.W.), R01 MH101820 and R01 MH090937 (GTEx), P30 DK20595 and P60 DK20595 (Diabetes Research and Training Center), P50 DA037844 (Rat Genomics), U01 GM61393 (Pharmacogenomics of Anticancer Agents Research), P50 MH094267 (Conte), U01 GM092691 (J.C.D.) and U19 HL065962 (PGRN Statistical Analysis Resource). Additional acknowledgments can be found in the Supplementary Note.

Author information

Author notes

    Eric R Gamazon, Heather E Wheeler & Kaanan P Shah

    These authors contributed equally to this work.

Affiliations

  1. Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois, USA.

    Eric R Gamazon, Kaanan P Shah, Keston Aquino-Michaels, Dan L Nicolae, Nancy J Cox & Hae Kyung Im

  2. Division of Genetic Medicine, Vanderbilt University, Nashville, Tennessee, USA.

    Eric R Gamazon & Nancy J Cox

  3. Section of Hematology/Oncology, Department of Medicine, University of Chicago, Chicago, Illinois, USA.

    Heather E Wheeler

  4. Department of Human Genetics, University of Chicago, Chicago, Illinois, USA.

    Sahar V Mozaffari, Dan L Nicolae & Nancy J Cox

  5. Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, USA.

    Robert J Carroll & Joshua C Denny

  6. Rheumatology Center, NorthCrest Medical Center, Springfield, Tennessee, USA.

    Anne E Eyler

  7. Department of Statistics, University of Chicago, Chicago, Illinois, USA.

    Dan L Nicolae

Consortia

  1. GTEx Consortium

    A full list of members and affiliations appears in the Supplementary Note.

Contributions

H.K.I., H.E.W., E.R.G., K.P.S., S.V.M. and K.A.-M. performed the analyses. J.C.D., R.J.C. and A.E.E. provided replication data. E.R.G., H.E.W., K.P.S. and H.K.I. wrote the manuscript. D.L.N., N.J.C. and H.K.I. provided intellectual input and supervised the study. H.K.I. designed the study. All authors reviewed and contributed to the final manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Hae Kyung Im.

Supplementary information

PDF files

  1. Supplementary Text and Figures: Supplementary Figures 1–7 and Supplementary Note.

Excel files

  1. Supplementary Table 1

About this article

DOI

https://doi.org/10.1038/ng.3367
