Abstract
Transcriptome-wide association studies using predicted expression have identified thousands of genes whose locally regulated expression is associated with complex traits and diseases. In this work, we show that linkage disequilibrium induces significant gene–trait associations at non-causal genes as a function of the expression quantitative trait loci weights used in expression prediction. We introduce a probabilistic framework that models the correlation among transcriptome-wide association study signals and assigns to every gene in the risk region a probability of explaining the observed association signal. Importantly, by leveraging expression prediction from other tissues, our approach remains accurate when expression data for causal genes are not available in the causal tissue. Our approach yields credible sets of genes that contain the causal gene at a nominal confidence level (for example, 90%) and can be used to prioritize genes for functional assays. We illustrate the approach through an integrative analysis of lipid traits, in which it prioritizes genes with strong evidence for causality.
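The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the FOCUS implementation: the function names, gene labels, and numbers are hypothetical and chosen only for illustration. The first function assumes an eQTL weight matrix W (SNPs by genes) and an LD matrix V (SNPs by SNPs), and computes the correlation of predicted expression between genes as the normalized form of W^T V W. The second function assumes the per-gene posterior probabilities are already normalized to sum to at most 1, and builds a credible set by adding genes in decreasing order of probability until the requested level is reached.

```python
import numpy as np


def predicted_expression_correlation(W, V):
    """Correlation of predicted expression between genes.

    W: (n_snps, n_genes) eQTL weights (illustrative input).
    V: (n_snps, n_snps) LD (SNP correlation) matrix.
    """
    cov = W.T @ V @ W                # covariance of predicted expression
    sd = np.sqrt(np.diag(cov))       # per-gene standard deviation
    return cov / np.outer(sd, sd)    # rescale to a correlation matrix


def credible_set(probs, level=0.9):
    """Smallest set of genes whose probabilities sum to at least `level`.

    probs: dict mapping gene name -> posterior probability
           (assumed normalized; a simplification for this sketch).
    """
    chosen, total = [], 0.0
    for gene in sorted(probs, key=probs.get, reverse=True):
        chosen.append(gene)
        total += probs[gene]
        if total >= level:
            break
    return chosen


# Two SNPs in LD (r = 0.8); each hypothetical gene is predicted from a
# different SNP, yet their predicted expression is still correlated.
V = np.array([[1.0, 0.8],
              [0.8, 1.0]])
W = np.eye(2)
corr = predicted_expression_correlation(W, V)
print(corr[0, 1])  # 0.8

# Hypothetical posterior probabilities for four genes in one risk region.
probs = {"GENE_A": 0.70, "GENE_B": 0.22, "GENE_C": 0.05, "GENE_D": 0.03}
print(credible_set(probs, level=0.9))  # ['GENE_A', 'GENE_B']
```

In the toy example the two genes share no eQTL SNP, so the 0.8 correlation in predicted expression comes entirely from LD. This is the mechanism by which a non-causal gene can show a significant association.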
Code availability
The FUSION TWAS software is available at http://www.gusevlab.org/projects/fusion/ and the FOCUS fine-mapping software at http://github.com/bogdanlab/focus.
Data availability
Data used in this study are available at the following links: TWAS eQTL weights (http://www.gusevlab.org/projects/fusion/), TWAS and fine-mapping results (http://github.com/bogdanlab/focus_paper/), and lipid GWAS summary data (http://lipidgenetics.org/).
Acknowledgements
We would like to thank C. Giambartolomei for discussions. This work was funded by NIH awards nos. T32NS048004 (N.M.), T32LM012424 (M.K.F.), R01HG009120 (N.M., M.K.F., R.J., G.K., H.S., B.P.), R01MH115676 (N.M., M.K.F., R.J., G.K., H.S., A.G., B.P.), R01HG006399 (N.M., M.K.F., R.J., G.K., H.S., B.P.), and U01CA194393 (N.M., M.K.F., R.J., G.K., H.S., B.P.); NSF award no. DGE-1829071 (R.J.); and the Claudia Adams Barr Award (A.G.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Author information
Contributions
N.M., A.G., and B.P. developed the model. N.M., M.K.F., H.S., and G.K. performed simulations and analyses. N.M. and R.J. designed and wrote the FOCUS software. All authors read and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Supplementary information
Supplementary Text and Figures
Supplementary Note and Supplementary Figures 1–24
Supplementary Tables
Supplementary Tables 1–5
Cite this article
Mancuso, N., Freund, M.K., Johnson, R. et al. Probabilistic fine-mapping of transcriptome-wide association studies. Nat Genet 51, 675–682 (2019). https://doi.org/10.1038/s41588-019-0367-1
DOI: https://doi.org/10.1038/s41588-019-0367-1