
  • Protocol

BridGE: a pathway-based analysis tool for detecting genetic interactions from GWAS

Abstract

Genetic interactions have the potential to modulate phenotypes, including human disease. In principle, genome-wide association studies (GWAS) provide a platform for detecting genetic interactions; however, traditional methods for identifying them, which tend to focus on testing individual variant pairs, lack statistical power. In this protocol, we describe a novel computational approach, called Bridging Gene sets with Epistasis (BridGE), for discovering genetic interactions between biological pathways from GWAS data. We present a Python-based implementation of BridGE along with instructions for its application to a typical human GWAS cohort. The major stages include initial data processing and quality control, construction of a variant-level genetic interaction network, measurement of pathway-level genetic interactions, evaluation of statistical significance using sample permutations and generation of results in a standardized output format. The BridGE software pipeline includes options for running the analysis on multiple cores and multiple nodes for users who have access to computing clusters or a cloud computing environment. In a cluster computing environment with 10 nodes and 100 GB of memory per node, the method can be run in less than 24 h for typical human GWAS cohorts. Using BridGE requires knowledge of running Python programs and basic shell script programming experience.

Key points

  • This protocol describes a method for discovering interactions between biological pathways in genome-wide association study data by evaluating the variant-level interactions that connect variants between and within biological pathways.

  • The technique differs from approaches that perform interaction tests for every pair of variants with a phenotype of interest as it specifically assesses the impact of combinations of interacting loci at the pathway level, which affords Bridging Gene sets with Epistasis (BridGE) greater statistical power for identifying genetic interactions.
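The idea of aggregating variant-level interactions to the pathway level and assessing significance with sample permutations can be illustrated with a minimal sketch. This is a toy under stated assumptions and not the BridGE implementation: the interaction score (difference in joint-carrier frequency between cases and controls), the threshold, and all function names (`interaction_network`, `between_pathway_density`, `permutation_p_value`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def interaction_network(geno, is_case, threshold):
    """Toy binary SNP-SNP network (assumption, not BridGE's statistic).

    An edge is drawn when the fraction of samples carrying both SNPs is
    higher in cases than in controls by more than `threshold`.
    geno: (samples x SNPs) array of 0/1 carrier indicators.
    is_case: boolean array, True for cases.
    """
    cases = geno[is_case].astype(float)
    ctrls = geno[~is_case].astype(float)
    # Joint-carrier frequency for every SNP pair, per group.
    joint_cases = cases.T @ cases / max(len(cases), 1)
    joint_ctrls = ctrls.T @ ctrls / max(len(ctrls), 1)
    net = (joint_cases - joint_ctrls) > threshold
    np.fill_diagonal(net, False)  # a SNP does not interact with itself
    return net


def between_pathway_density(net, snps_a, snps_b):
    """Fraction of possible SNP pairs between two pathways that are edges."""
    return float(net[np.ix_(snps_a, snps_b)].mean())


def permutation_p_value(geno, is_case, snps_a, snps_b,
                        threshold=0.1, n_perm=200, seed=0):
    """Empirical p-value for the between-pathway density.

    Case/control labels are shuffled, the network is rebuilt, and the
    density is recomputed to form the null distribution.
    """
    rng = np.random.default_rng(seed)
    observed = between_pathway_density(
        interaction_network(geno, is_case, threshold), snps_a, snps_b)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(is_case)
        null = between_pathway_density(
            interaction_network(geno, shuffled, threshold), snps_a, snps_b)
        hits += int(null >= observed)
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Simulated example: 6 SNPs, pathway A = SNPs 0-2, pathway B = SNPs 3-5.
rng = np.random.default_rng(1)
geno = rng.integers(0, 2, size=(200, 6))
is_case = np.arange(200) < 100
density, p_value = permutation_p_value(geno, is_case, [0, 1, 2], [3, 4, 5])
print(density, p_value)
```

Testing one density per pathway pair, rather than one statistic per variant pair, is what reduces the number of hypotheses in this sketch; the actual scoring and significance procedures used by BridGE are described in the protocol itself.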


Fig. 1: An overview of the BridGE pipeline.
Fig. 2: An example of the MDS plot produced by the BridGE pipeline.
Fig. 3: Visualization of discovered pathway–pathway interactions (top 18).


Data availability

The datasets used in this protocol included (1) a sample 1000 Genomes Project GWAS dataset with simulated binary phenotypes, which is available at the Zenodo repository (https://doi.org/10.5281/zenodo.8067407); (2) a copy of 1000 Genomes Project data, which is available at the Zenodo repository (https://doi.org/10.5281/zenodo.8067407); and (3) a Parkinson’s disease GWAS dataset from the IPDGC (dbGaP study accession: phs000918.v1.p1), which is available at the dbGaP (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000918.v1.p1).

Code availability

BridGE-Python can be obtained from the GitHub repository (https://github.com/csbio/BridGE-Python). It can be freely used for educational and research purposes by nonprofit institutions and US government agencies. A license for commercial use of this software is available from the University of Minnesota’s Office for Technology Commercialization.


Acknowledgements

This work was partially supported by a grant from the Alzheimer’s Association, Alzheimer’s Research UK, The Michael J. Fox Foundation for Parkinson’s Research and the Weston Brain Institute (BAND-19-615151), and grants from the NIH (R21CA235352, R01HG005084 and R01HG005853). The content is solely the responsibility of the authors and does not necessarily represent the official views of the funders. This study makes use of genome-wide association datasets provided by dbGaP (study accession numbers: phs000089.v3.p2, phs000126.v2.p1, phs000918.v1.p1). We acknowledge the contributing investigators who submitted data from their original study to dbGaP, the primary funding organization that supported the contributing investigators and the NIH data repository. Computing resources and data storage services were partially provided by the Minnesota Supercomputing Institute and the University of Minnesota’s Office of Information Technology, respectively.

Author information


Contributions

M.H., W.W. and C.L.M. created the software design based on the original BridGE method. M.H., W.W. and M.A. developed the software. M.H., M.F. and W.W. tested the protocol and provided feedback on improvements. M.H. and W.W. wrote an initial draft of the protocol manuscript, which was then reviewed and revised by all co-authors. W.W. and C.L.M. supervised the software development process, manuscript writing and testing of the protocol, and secured funding to support the project.

Corresponding authors

Correspondence to Wen Wang or Chad L. Myers.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Protocols thanks Marylyn Ritchie and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key references using this protocol

Wang, W. et al. PLoS Genet. 13, e1006973 (2017): https://doi.org/10.1371/journal.pgen.1006973

Fang, G. et al. Nat. Commun. 10, 4274 (2019): https://doi.org/10.1038/s41467-019-12131-7

Supplementary information

Reporting Summary

Supplementary Data 1

The outputs of the original BridGE and BridGE 2.0 are compared for the two Parkinson’s disease GWAS cohorts that were also analyzed in the original BridGE paper. The Excel files include summary tables and lists of BPM, WPM and PATH discovered at the FDR cutoff shown in the summary tables.

Supplementary Data 2

BridGE output files from the simulated 1000 Genomes GWAS data based on the combined disease model. The Excel files include BridGE results and interaction lists for BPMs and WPMs. The PDF file provides the visualization of pathway–pathway interactions.

Supplementary Data 3

A complete set of BridGE output files from the Parkinson’s disease IPDGC cohort (dbGaP study accession: phs000918.v1.p1). The Excel files include information about significant pathway-level interactions (FDR <0.25) discovered under the corresponding disease model (RR, DD, RD and combined) and pairwise SNP interaction lists and associated statistics (DD disease model only), one for BPM and one for WPM. The PDF file provides a network visualization.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Hajiaghabozorgi, M., Fischbach, M., Albrecht, M. et al. BridGE: a pathway-based analysis tool for detecting genetic interactions from GWAS. Nat Protoc (2024). https://doi.org/10.1038/s41596-024-00954-8

