
  • Technical Report

Conditional transcriptome-wide association study for fine-mapping candidate causal genes

Abstract

Transcriptome-wide association studies (TWASs) aim to integrate genome-wide association studies with expression-mapping studies to identify genes with genetically predicted expression (GReX) associated with a complex trait. In the present report, we develop a method, GIFT (gene-based integrative fine-mapping through conditional TWAS), that performs conditional TWAS analysis by explicitly controlling for GReX of all other genes residing in a local region to fine-map putatively causal genes. GIFT is frequentist in nature, explicitly models both expression correlation and cis-single nucleotide polymorphism linkage disequilibrium across multiple genes and uses a likelihood framework to account for expression prediction uncertainty. As a result, GIFT produces calibrated P values and is effective for fine-mapping. We apply GIFT to analyze six traits in the UK Biobank, where GIFT narrows down the set size of putatively causal genes by 32.16–91.32% compared with existing TWAS fine-mapping approaches. The genes identified by GIFT highlight the importance of vessel regulation in determining blood pressures and lipid metabolism for regulating lipid levels.
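The conditioning idea described above can be illustrated with a toy simulation. The sketch below is not the GIFT implementation. It treats GReX as known (so it ignores the prediction uncertainty that GIFT handles through its likelihood framework), uses independent SNPs and fits ordinary least squares. The function names, effect size (0.3), GReX correlation (0.8) and sample size are all illustrative assumptions.

```python
import numpy as np


def simulate_region(n=5000, n_snps=20, rho=0.8, effect=0.3, seed=0):
    """Simulate two genes with correlated GReX; only gene 1 affects the trait.

    All settings are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    genotypes = rng.standard_normal((n, n_snps))  # independent, standardized SNPs

    # Unit-length cis-SNP weights for gene 1.
    w1 = rng.standard_normal(n_snps)
    w1 /= np.linalg.norm(w1)
    # A unit vector orthogonal to w1, used to set the GReX correlation to rho.
    e = rng.standard_normal(n_snps)
    e -= (e @ w1) * w1
    e /= np.linalg.norm(e)
    w2 = rho * w1 + np.sqrt(1.0 - rho**2) * e

    grex = genotypes @ np.column_stack([w1, w2])  # n x 2, treated as known
    trait = effect * grex[:, 0] + rng.standard_normal(n)
    return grex, trait


def ols_z_scores(predictors, trait):
    """Return OLS z-scores for each predictor column (intercept excluded)."""
    predictors = np.asarray(predictors, dtype=float)
    if predictors.ndim == 1:
        predictors = predictors[:, None]
    design = np.column_stack([np.ones(len(trait)), predictors])
    beta, *_ = np.linalg.lstsq(design, trait, rcond=None)
    residuals = trait - design @ beta
    sigma2 = residuals @ residuals / (len(trait) - design.shape[1])
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return (beta / np.sqrt(np.diag(cov)))[1:]


def marginal_and_conditional(grex, trait):
    """Compare one-gene-at-a-time tests with a joint (conditional) fit."""
    marginal = np.array([ols_z_scores(grex[:, j], trait)[0]
                         for j in range(grex.shape[1])])
    conditional = ols_z_scores(grex, trait)
    return marginal, conditional


if __name__ == "__main__":
    grex, trait = simulate_region()
    marginal, conditional = marginal_and_conditional(grex, trait)
    print("marginal z:   ", np.round(marginal, 2))
    print("conditional z:", np.round(conditional, 2))
```

Under these assumptions, both genes show strong marginal association because their GReX values are correlated, whereas the joint fit leaves a strong signal only for the gene that was simulated as causal.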


Fig. 1: Schematic overview of GIFT for conditional TWAS analysis.
Fig. 2: Comparison of GIFT with other TWAS fine-mapping methods in simulations.
Fig. 3: TWAS fine-mapping results from different methods for the two BP traits in the UK Biobank.
Fig. 4: TWAS fine-mapping results from different methods for the four blood lipid traits in the UK Biobank.

Data availability

No data were generated in the present study. The GEUVADIS gene expression data are publicly available at https://www.internationalgenome.org/data-portal/data-collection/geuvadis. The UK Biobank data were obtained from the UK Biobank resource (http://www.ukbiobank.ac.uk). The 1000 Genomes project data (phase 3) are available at https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502. GENCODE (release 12) is available at https://www.gencodegenes.org/human/release_12.html.

Code availability

GIFT is implemented in the R package GIFT, freely available at https://yuanzhongshang.github.io/GIFT and https://zenodo.org/records/10070491. The analysis code used to reproduce all results is available at https://zenodo.org/records/10070491. We also performed analysis using FOCUS (v.0.802, https://github.com/mancusolab/ma-focus), FOGS (v.2.0, https://github.com/ChongWu-Biostat/FOGS), MV-IWAS (v.0.0.0.9000, https://github.com/kathalexknuts/MVIWAS), clusterProfiler (v.3.14.3, https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html) and PLINK (v.1.90b6.13, https://www.cog-genomics.org/plink/1.9).


Acknowledgements

The present study was supported by the National Natural Science Foundation of China (grant nos. 82373686 and 82173624), the Natural Science Foundation of Shandong Province (grant no. ZR2019ZD02), the Taishan Scholar Project of Shandong Province (grant no. tsqn202211025) and the Cheeloo Young Talent Program of Shandong University, all awarded to Z.Y. X.Z. is supported by the University of Michigan, Ann Arbor, MI, USA. The present study has been conducted using the UK Biobank resource under application no. 30686. The UK Biobank was established by the Wellcome Trust medical charity, Medical Research Council, Department of Health, Scottish Government and the Northwest Regional Development Agency. It has also received funding from the Welsh Assembly Government, British Heart Foundation and Diabetes UK.

Author information

Contributions

X.Z. and Z.Y. conceived the idea. L.L. developed the methods. L.L. developed the software tool with assistance from R.Y. L.L. performed simulations and real-data analysis with assistance from P.G., J.J., W.G., F.X. and X.Z. X.Z., Z.Y. and L.L. wrote the manuscript with input from all the other authors. All authors reviewed and approved the final manuscript.

Corresponding authors

Correspondence to Zhongshang Yuan or Xiang Zhou.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Arjun Bhattacharya and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Simulation setups in the present study.

The information shown includes the simulation workflow, the different simulation scenarios and the corresponding parameter settings.

Extended Data Fig. 2 Quantile-quantile plots of -log10 p-values for testing the non-causal genes in the complete null-simulation settings.

Compared methods include GIFT (orange), FOGS (purple), and MV-IWAS (blue). (a) Sparse simulation settings, where either one SNP (circle), three SNPs (square), or five SNPs (triangle) have non-zero effects on gene expression. (b) Polygenic simulation settings, where either 1% of SNPs (circle), or 10% of SNPs (square) have non-zero effects on gene expression. (c) Homogeneous expression heritability settings, where the expression heritability PVEzx is either 1% (circle), 5% (square), 10% (triangle) or 20% (diamond). (d) Heterogeneous expression heritability simulation settings, where the expression heritability of each gene in the region is randomly set to be 1%, 5%, 10% or 20%. (e) Simulation settings with varying sample sizes: circle represents the settings where the sample size of the gene expression study is 250 and the sample size of GWAS is 5,000; while square represents the settings where the sample size of the gene expression study is 465 and the sample size of GWAS is 50,000. (f) Simulation settings where the expression correlation of multiple genes ρ is set to be either 0 (circle), 0.3 (square), 0.6 (triangle), or 0.9 (diamond). Two-sided p-values are calculated for all methods.
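Calibration plots of this kind compare sorted observed −log10 p-values with the quantiles expected under a uniform null. A minimal sketch of how such plot coordinates could be computed is given here. The function name `qq_points` and the (i − 0.5)/m plotting positions are assumptions for illustration and are not taken from the study's analysis code.

```python
import numpy as np


def qq_points(pvalues):
    """Return (expected, observed) -log10 p-values for a quantile-quantile plot.

    Expected values use the plotting positions (i - 0.5) / m, one common
    convention; it is an assumption here, not the paper's stated choice.
    """
    p = np.sort(np.asarray(pvalues, dtype=float))
    if p.size == 0 or np.any(p <= 0) or np.any(p > 1):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    m = p.size
    expected = -np.log10((np.arange(1, m + 1) - 0.5) / m)
    observed = -np.log10(p)
    return expected, observed


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Stand-in for p-values of non-causal genes; 1 - U lies in (0, 1].
    null_p = 1.0 - rng.uniform(size=10_000)
    expected, observed = qq_points(null_p)
    # For calibrated p-values the points lie close to the diagonal.
    print(np.round(expected[:3], 3), np.round(observed[:3], 3))
```

Points that rise above the diagonal at the upper end would indicate inflated test statistics, and points that fall below it would indicate conservative ones.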

Extended Data Fig. 3 Quantile-quantile plots of -log10 p-values for testing the non-causal genes in the more challenging null-simulation settings.

Quantile-quantile plots of -log10 p-values from different methods for the non-causal genes in the simulations, where the gene expression heritability PVEzx is set to be either 1% (magenta), 5% (green), 10% (orange), 20% (blue) or randomly set to be these four values (purple). Compared methods include GIFT (a,b), FOGS (c,d), and MV-IWAS (e,f). Simulations are performed in settings where the number of causal genes in the region is set to be either one (a,c,e) or two (b,d,f). Two-sided p-values are calculated for all methods.

Extended Data Fig. 4 Power comparisons for different methods based on a true FDR of 0.05 under different simulation settings.

The number of causal genes in the region is set to be either one (a,c,e,g) or two (b,d,f,h). Compared methods include GIFT (orange), FOCUS (green), FOGS (purple), and MV-IWAS (blue). Simulations are performed with different gene expression heritability (a,b), different gene expression correlations (c,d), the sparse simulation settings (e,f) or the polygenic simulation settings (g,h).

Extended Data Fig. 5 TWAS fine-mapping results from different methods across six traits in the UK Biobank.

(a) Ring plot shows the proportion of analyzed regions across six traits that harbor either 1 (blue), 2–5 (red), 6–10 (peachpuff) or >10 (yellow) genes that have significant marginal TWAS evidence. (b) The 90%-credible sets from FOCUS across six traits are divided into two categories: those that include only genes (blue) and those that include both genes and the null model (red). (c) Summary plot shows the proportion of regions that harbor 1, 2, 3, or more genes detected by different methods across six traits. Four-way Venn diagram of genes identified by GIFT, FOCUS, FOGS, and MV-IWAS for SBP (d), DBP (e), TC (f), HDL (g), LDL (h) and TG (i).

Extended Data Fig. 6 Manhattan plots of the fine-mapping results from GIFT for three blood lipid traits.

Different colors represent different levels of evidence: a gene is ‘Known’ (red) if its association with the trait has been previously reported and well documented; a gene is a significant ‘TWAS’ gene (blue) if its marginal TWAS p-value is below the Bonferroni corrected transcriptome-wide threshold; a gene is a significant ‘GWAS’ gene (purple) if its marginal GWAS p-value is below the usual genome-wide threshold 5 × 10⁻⁸ or such association was previously reported; otherwise, a gene is denoted as ‘NA’ (brown). Two-sided p-values are calculated for all methods. Manhattan plots are displayed for TC (a), LDL (b) and TG (c).

Extended Data Fig. 7 TWAS fine-mapping results from different methods for two binary traits in the UK Biobank.

(a) Quantile–quantile plot of -log10 p-values from the three frequentist methods, including GIFT (orange), FOGS (purple), and MV-IWAS (blue), for testing gene associations with cardiovascular disease (CVD; circle) and obesity (diamond). (b) Ring plot shows the proportion of analyzed regions that harbor either 1 (blue), 2–5 (red), 6–10 (peachpuff) or >10 (yellow) genes that have significant marginal TWAS association evidence. (c) The 90%-credible sets from FOCUS are divided into two categories: those that include only genes (blue) and those that include both genes and the null model (red). (d) Summary plot shows the proportion of regions that harbor 1, 2, 3, or more associated genes detected by different methods. (e) Manhattan plots show the fine-mapping results from GIFT for CVD. (f) Manhattan plots show the fine-mapping results from GIFT for obesity. For (e) and (f), different colors represent different levels of evidence: a gene is ‘Known’ (red) if its association with the trait has been previously reported and well documented; a gene is a significant ‘TWAS’ gene (blue) if its marginal TWAS p-value is below the Bonferroni corrected transcriptome-wide threshold; a gene is a significant ‘GWAS’ gene (purple) if its marginal GWAS p-value is below the usual genome-wide threshold 5 × 10⁻⁸ or such association was previously reported; otherwise, a gene is denoted as ‘NA’ (brown). Two-sided p-values are calculated for all frequentist methods.

Supplementary information

Supplementary Information

Supplementary Notes 1–14, Figs. 1–52, Tables 1–9 and References.

Reporting Summary

Peer Review File

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, L., Yan, R., Guo, P. et al. Conditional transcriptome-wide association study for fine-mapping candidate causal genes. Nat Genet 56, 348–356 (2024). https://doi.org/10.1038/s41588-023-01645-y

  • DOI: https://doi.org/10.1038/s41588-023-01645-y
