Probabilistic fine-mapping of transcriptome-wide association studies


Transcriptome-wide association studies using predicted expression have identified thousands of genes whose locally regulated expression is associated with complex traits and diseases. In this work, we show that linkage disequilibrium induces significant gene–trait associations at non-causal genes as a function of the expression quantitative trait loci weights used in expression prediction. We introduce a probabilistic framework that models correlation among transcriptome-wide association study signals to assign a probability for every gene in the risk region to explain the observed association signal. Importantly, our approach remains accurate when expression data for causal genes are not available in the causal tissue by leveraging expression prediction from other tissues. Our approach yields credible sets of genes containing the causal gene at a nominal confidence level (for example, 90%) that can be used to prioritize genes for functional assays. We illustrate our approach by using an integrative analysis of lipid traits, where our approach prioritizes genes with strong evidence for causality.
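The two ideas in the abstract can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the FOCUS implementation. It assumes a hypothetical eQTL weight matrix `weights` (SNPs by genes), an LD matrix `ld`, and per-gene posterior probabilities `pips` that are normalized to sum to one before accumulation. The function names are invented for this example, and the published software may differ in details such as how a null model is handled.

```python
import numpy as np


def predicted_expression_correlation(weights, ld):
    """Correlation among predicted expression levels of genes in a region.

    weights: (n_snps, n_genes) eQTL weights (hypothetical input).
    ld:      (n_snps, n_snps) SNP correlation (LD) matrix.
    """
    # Covariance of predicted expression induced by LD and the weights.
    cov = weights.T @ ld @ weights
    scale = np.sqrt(np.diag(cov))
    return cov / np.outer(scale, scale)


def credible_set(pips, level=0.9):
    """Indices of the smallest gene set whose normalized posterior mass
    reaches `level` (assumes normalization to one is appropriate)."""
    pips = np.asarray(pips, dtype=float)
    probs = pips / pips.sum()
    order = np.argsort(-probs)  # genes sorted by decreasing probability
    cumulative = np.cumsum(probs[order])
    n_keep = int(np.searchsorted(cumulative, level) + 1)
    return order[:n_keep].tolist()


# Two genes with no shared eQTL, whose eQTLs are in LD (r = 0.8).
weights = np.array([[1.0, 0.0],
                    [0.0, 1.0]])
ld = np.array([[1.0, 0.8],
               [0.8, 1.0]])
print(predicted_expression_correlation(weights, ld))

# Toy posterior probabilities for four genes in one risk region.
print(credible_set([0.6, 0.05, 0.25, 0.1], level=0.8))
```

In this toy example the predicted expression of the two genes is correlated at 0.8 even though they share no eQTL, which is how a non-causal gene can inherit an association signal. The 80% credible set contains genes 0 and 2.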

Fig. 1: Illustration of the induced correlation structure for predicted expression.
Fig. 2: Simulation diagram for alternative and null scenarios.
Fig. 3: Credible gene sets are well calibrated in simulations.
Fig. 4: FOCUS credible sets alleviate bias in confounding simulations.
Fig. 5: FOCUS accurately prioritizes causal genes in simulations.
Fig. 6: 1p13 locus for LDL.

Code availability

The FUSION TWAS method and the FOCUS fine-mapping method (URLs missing from this copy).

Data availability

Data used in this study are available online: TWAS eQTL weights, TWAS and fine-mapping results, and lipid GWAS summary data (URLs missing from this copy).



Acknowledgements

We would like to thank C. Giambartolomei for discussions. This work was funded by NIH awards nos. T32NS048004 (N.M.), T32LM012424 (M.K.F.), R01HG009120 (N.M., M.K.F., R.J., G.K., H.S., B.P.), R01MH115676 (N.M., M.K.F., R.J., G.K., H.S., A.G., B.P.), R01HG006399 (N.M., M.K.F., R.J., G.K., H.S., B.P.), and U01CA194393 (N.M., M.K.F., R.J., G.K., H.S., B.P.); NSF award no. DGE-1829071 (R.J.); and the Claudia Adams Barr Award (A.G.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Author information

Contributions

N.M., A.G., and B.P. developed the model. N.M., M.K.F., H.S., and G.K. performed simulations and analyses. N.M. and R.J. designed and wrote the FOCUS software. All authors read and approved the manuscript.

Corresponding authors

Correspondence to Nicholas Mancuso or Bogdan Pasaniuc.

Ethics declarations

Competing interests

The authors declare no competing interests.

Supplementary information

Supplementary Text and Figures

Supplementary Note and Supplementary Figures 1–24

Reporting Summary

Supplementary Tables

Supplementary Tables 1–5

About this article

Cite this article

Mancuso, N., Freund, M.K., Johnson, R. et al. Probabilistic fine-mapping of transcriptome-wide association studies. Nat Genet 51, 675–682 (2019).
