  • Review Article

Detecting epistasis in human complex traits

Key Points

  • Tremendous activity in the development of methodology has now rendered the exhaustive search for pairwise genetic interactions computationally routine, but addressing the statistical problems of detecting epistasis remains a big challenge.

  • Most reports of epistasis influencing human complex traits that exist in the literature raise concerns regarding their validity and do not follow the same strict protocols that are in place for reporting additive effects.

  • There is mounting evidence against the existence of pairwise epistatic effects influencing human complex traits that are sufficiently large for detection in standard single-sample genome-wide association studies (GWASs). If epistatic effects do influence complex traits, then each interaction effect will probably be small, as is observed with additive effects.

  • The majority of robust additive effects are only found when GWASs are carried out using huge sample sizes and good single-nucleotide polymorphism coverage, often as a result of multistudy meta-analyses. Similar approaches are necessary if epistatic effects are also to be robustly detected, although methodology or attempts at implementation are yet to surface.

  • Methods have emerged for estimating the total contribution of additive effects across the whole genome; similar methods for estimating the total contribution of genetic interactions would be valuable but have not yet been developed.
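To give a sense of scale for the statistical problem raised in the first key point, the sketch below counts the tests in an exhaustive pairwise scan and the per-test significance threshold that a simple Bonferroni correction would imply. The SNP count of 500,000, the family-wise error rate of 0.05 and the choice of Bonferroni are illustrative assumptions, not values or methods taken from the Review; the function name is hypothetical.

```python
from math import comb


def pairwise_test_burden(n_snps: int, alpha: float = 0.05):
    """Return (number of SNP pairs, Bonferroni per-test threshold).

    Assumes every unordered pair of SNPs is tested once and that a plain
    Bonferroni correction is applied; both are illustrative assumptions.
    """
    n_pairs = comb(n_snps, 2)      # n * (n - 1) / 2 unordered pairs
    threshold = alpha / n_pairs    # conservative per-test significance level
    return n_pairs, threshold


# Hypothetical array size, chosen only for illustration.
n_pairs, threshold = pairwise_test_burden(500_000)
print(f"pairs tested: {n_pairs:,}")
print(f"per-test threshold: {threshold:.2e}")
```

Because the number of pairs grows quadratically with the number of SNPs, the per-test threshold falls far below the level commonly used for single-SNP scans, which is one reason small interaction effects are hard to detect in a single sample.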

Abstract

Genome-wide association studies (GWASs) have become the focus of the statistical analysis of complex traits in humans, successfully shedding light on several aspects of genetic architecture and biological aetiology. Single-nucleotide polymorphisms (SNPs) are usually modelled as having additive, cumulative and independent effects on the phenotype. Although evidently a useful approach, it is often argued that this is not a realistic biological model and that epistasis (that is, the statistical interaction between SNPs) should be included. The purpose of this Review is to summarize recent directions in methodology for detecting epistasis and to discuss evidence of the role of epistasis in human complex trait variation. We also discuss the relevance of epistasis in the context of GWASs and potential hazards in the interpretation of statistical interaction terms.
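The distinction between the additive model and statistical interaction can be illustrated with a minimal two-SNP sketch. It assumes genotypes coded as allele counts (0, 1 or 2) and a single additive-by-additive interaction term, which is only one of several possible parameterizations of epistasis; the function names and all effect sizes are hypothetical and chosen for illustration.

```python
def expected_phenotype(g1, g2, mu=0.0, a1=0.5, a2=0.3, i12=0.0):
    """Expected phenotype for allele counts g1 and g2 (each 0, 1 or 2).

    mu  : baseline mean (hypothetical value)
    a1  : additive per-allele effect of SNP 1 (hypothetical value)
    a2  : additive per-allele effect of SNP 2 (hypothetical value)
    i12 : additive-by-additive interaction effect; 0 gives the purely
          additive model
    """
    return mu + a1 * g1 + a2 * g2 + i12 * g1 * g2


def per_allele_effect_of_snp1(g2, **params):
    """Change in expected phenotype from one extra SNP 1 allele, given g2."""
    return expected_phenotype(1, g2, **params) - expected_phenotype(0, g2, **params)


# Under the additive model the effect of SNP 1 is the same for every SNP 2 genotype.
additive = [per_allele_effect_of_snp1(g2) for g2 in (0, 1, 2)]

# With a non-zero interaction term the effect of SNP 1 depends on the SNP 2 genotype.
epistatic = [per_allele_effect_of_snp1(g2, i12=0.2) for g2 in (0, 1, 2)]

print("additive model: ", [round(x, 3) for x in additive])
print("with epistasis: ", [round(x, 3) for x in epistatic])
```

In this parameterization, testing for epistasis amounts to asking whether the interaction coefficient differs from zero once the additive effects of both SNPs are accounted for.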

Figure 1: Types of methods to detect epistasis in genome-wide association studies.

References

  1. Phillips, P. C. Epistasis — the essential role of gene interactions in the structure and evolution of genetic systems. Nature Rev. Genet. 9, 855–867 (2008).

    CAS  PubMed  Google Scholar 

  2. Cordell, H. J. Epistasis: what it means, what it doesn't mean, and statistical methods to detect it in humans. Hum. Mol. Genet. 11, 2463–2468 (2002).

    CAS  PubMed  Google Scholar 

  3. Wang, X., Elston, R. C. & Zhu, X. The meaning of interaction. Hum. Hered. 70, 269–277 (2010).

    PubMed  PubMed Central  Google Scholar 

  4. Visscher, P. M., Hill, W. G. & Wray, N. R. Heritability in the genomics era — concepts and misconceptions. Nature Rev. Genet. 9, 255–266 (2008).

    CAS  PubMed  Google Scholar 

  5. Huang, Y., Wuchty, S. & Przytycka, T. M. eQTL epistasis — challenges and computational approaches. Front. Genet. 4, 51 (2013).

    PubMed  PubMed Central  Google Scholar 

  6. McKinney, B. A. & Pajewski, N. M. Six degrees of epistasis: Statistical network models for GWAS. Front. Genet. 2, 109 (2011).

    CAS  PubMed  Google Scholar 

  7. Pang, X. et al. A statistical procedure to map high-order epistasis for complex traits. Brief. Bioinform. 14, 302–314 (2013).

    CAS  PubMed  Google Scholar 

  8. Ritchie, M. D. Using biological knowledge to uncover the mystery in the search for epistasis in genome-wide association studies. Ann. Hum. Genet. 75, 172–182 (2011).

    PubMed  PubMed Central  Google Scholar 

  9. Steen, K. V. Travelling the world of gene–gene interactions. Brief. Bioinform. 13, 1–19 (2012).

    PubMed  Google Scholar 

  10. Zhang, Y., Jiang, B., Zhu, J. & Liu, J. S. Bayesian models for detecting epistatic interactions from genetic data. Ann. Hum. Genet. 75, 183–193 (2011).

    PubMed  Google Scholar 

  11. Gyenesei, A. et al. BiForce Toolbox: powerful high-throughput computational analysis of gene–gene interactions in genome-wide association studies. Nucleic Acids Res. 40, W628–632 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  12. Hemani, G., Theocharidis, A., Wei, W. & Haley, C. EpiGPU: exhaustive pairwise epistasis scans parallelized on consumer level graphics cards. Bioinformatics 27, 1462–1465 (2011).

    CAS  PubMed  Google Scholar 

  13. Liu, Y. et al. Genome-wide interaction-based association analysis identified multiple new susceptibility loci for common diseases. PLoS Genet. 7, e1001338 (2011).

    CAS  PubMed  PubMed Central  Google Scholar 

  14. Schüpbach, T., Xenarios, I., Bergmann, S. & Kapur, K. FastEpistasis: a high performance computing solution for quantitative trait epistasis. Bioinformatics 26, 1468–1469 (2010).

    PubMed  PubMed Central  Google Scholar 

  15. Yung, L. S., Yang, C., Wan, X. & Yu, W. GBOOST: a GPU-based tool for detecting gene–gene interactions in genome-wide case control studies. Bioinformatics 27, 1309–1310 (2011).

    CAS  PubMed  PubMed Central  Google Scholar 

  16. Cordell, H. J. Detecting gene–gene interactions that underlie human diseases. Nature Rev. Genet. 10, 392–404 (2009). This is an excellent review of methods to study epistasis in GWASs of human diseases.

    CAS  PubMed  Google Scholar 

  17. Ueki, M. & Cordell, H. J. Improved statistics for genome-wide interaction analysis. PLoS Genet. 8, e1002625 (2012). This is a comprehensive assessment of LD- and haplotype-based methods for genome-wide detection of epistasis.

    CAS  PubMed  PubMed Central  Google Scholar 

  18. Hu, J. K., Wang, X. & Wang, P. Testing gene–gene interactions in genome wide association studies. Genet. Epidemiol. 38, 123–134 (2014).

    PubMed  PubMed Central  Google Scholar 

  19. Kam-Thong, T. et al. EPIBLASTER-fast exhaustive two-locus epistasis detection strategy using graphical processing units. Eur. J. Hum. Genet. 19, 465–471 (2010).

    PubMed  PubMed Central  Google Scholar 

  20. Wang, Z., Wang, Y. & Tan, K. L., Wong, L. & Agrawal, D. eCEO: an efficient Cloud Epistasis cOmputing model in genome-wide association study. Bioinformatics 27, 1045–1051 (2011).

    CAS  PubMed  Google Scholar 

  21. Prabhu, S. & Pe'er, I. Ultrafast genome-wide scan for SNP–SNP interactions in common complex disease. Genome Res. 22, 2230–2240 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  22. Wan, X. et al. BOOST: a fast approach to detecting gene–gene interactions in genome-wide case–control studies. Am. J. Hum. Genet. 87, 325–340 (2010).

    CAS  PubMed  PubMed Central  Google Scholar 

  23. Gyenesei, A., Moody, J., Semple, C. A., Haley, C. S. & Wei, W.-H. High-throughput analysis of epistasis in genome-wide association studies with BiForce. Bioinformatics 28, 1957–1964 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  24. Wei, W., Gyenesei, A., Semple, C. A. & Haley, C. S. Properties of local interactions and their potential value in complementing genome-wide association studies. PLoS ONE 8, e71203 (2013).

    CAS  PubMed  PubMed Central  Google Scholar 

  25. Gauderman, W. J. Sample size requirements for association studies of gene–gene interaction. Am. J. Epidemiol. 155, 478–484 (2002). This is an important work that investigates power and sample sizes required for studying epistasis in GWASs.

    PubMed  Google Scholar 

  26. Zuk, O., Hechter, E., Sunyaev, S. R. & Lander, E. S. The mystery of missing heritability: genetic interactions create phantom heritability. Proc. Natl Acad. Sci. USA 109, 1193–1198 (2012). This paper provides an interesting theoretical exploration of how disease traits can be the sum of many lower-level pathways and how polygenic modes of inheritance may invoke high-level epistasis.

    CAS  PubMed  PubMed Central  Google Scholar 

  27. Ma, L. et al. Knowledge-driven analysis identifies a gene–gene interaction affecting high-density lipoprotein cholesterol levels in multi-ethnic populations. PLoS Genet. 8, e1002714 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  28. Evans, D. M. et al. Interaction between ERAP1 and HLA-B27 in ankylosing spondylitis implicates peptide handling in the mechanism for HLA-B27 in disease susceptibility. Nature Genet. 43, 761–767 (2011).

    CAS  PubMed  Google Scholar 

  29. Strange, A. et al. A genome-wide association study identifies new psoriasis susceptibility loci and an interaction between HLA-C and ERAP1. Nature Genet. 42, 985–990 (2010).

    CAS  PubMed  Google Scholar 

  30. Carlborg, O. & Haley, C. S. Epistasis: too often neglected in complex trait studies? Nature Rev. Genet. 5, 618–625 (2004).

    CAS  PubMed  Google Scholar 

  31. Evans, D. M., Marchini, J., Morris, A. P. & Cardon, L. R. Two-stage two-locus models in genome-wide association. PLoS Genet. 2, e157 (2006).

    PubMed  PubMed Central  Google Scholar 

  32. Marchini, J., Donnelly, P. & Cardon, L. R. Genome-wide strategies for detecting multiple loci that influence complex diseases. Nature Genet. 37, 413–417 (2005). This important simulation study investigates key issues in studying epistasis in GWASs.

    CAS  PubMed  Google Scholar 

  33. Hoh, J. & Ott, J. Mathematical multi-locus approaches to localizing complex human trait genes. Nature Rev. Genet. 4, 701–709 (2003).

    CAS  PubMed  Google Scholar 

  34. Zhao, J., Jin, L. & Xiong, M. Test for interaction between two unlinked loci. Am. J. Hum. Genet. 79, 831–845 (2006).

    CAS  PubMed  PubMed Central  Google Scholar 

  35. Haig, D. Does heritability hide in epistasis between linked SNPs? Eur. J. Hum. Genet. 19, 123 (2011). This paper presents an early suggestion of examining interactions between neighbouring SNPs.

    PubMed  Google Scholar 

  36. Wellek, S. & Ziegler, A. A genotype-based approach to assessing the association between single nucleotide polymorphisms. Hum. Hered. 67, 128–139 (2009).

    CAS  PubMed  Google Scholar 

  37. Yuan, Z. et al. From interaction to co-association — a fisher R-to-Z transformation-based simple statistic for real world genome-wide association study. PLoS ONE 8, e70774 (2013).

    CAS  PubMed  PubMed Central  Google Scholar 

  38. Zhang, Y. & Liu, J. S. Bayesian inference of epistatic interactions in case–control studies. Nature Genet. 39, 1167–1173 (2007).

    CAS  PubMed  Google Scholar 

  39. Tang, W., Wu, X., Jiang, R. & Li, Y. Epistatic module detection for case–control studies: a Bayesian model with a Gibbs sampling strategy. PLoS Genet. 5, e1000464 (2009).

    PubMed  PubMed Central  Google Scholar 

  40. Chen, G. K. & Thomas, D. C. Using biological knowledge to discover higher order interactions in genetic association studies. Genet. Epidemiol. 34, 863–878 (2010).

    PubMed  Google Scholar 

  41. Yi, N., Kaklamani, V. G. & Pasche, B. Bayesian analysis of genetic interactions in case–control studies, with application to adiponectin genes and colorectal cancer risk. Ann. Hum. Genet. 75, 90–104 (2011).

    PubMed  Google Scholar 

  42. Zhang, Y. A novel bayesian graphical model for genome-wide multi-SNP association mapping. Genet. Epidemiol. 36, 36–47 (2012).

    PubMed  Google Scholar 

  43. Li, J., Zhang, K. & Yi, N. A. Bayesian hierarchical model for detecting haplotype–haplotype and haplotype–environment interactions in genetic association studies. Hum. Hered. 71, 148–160 (2011).

    PubMed  PubMed Central  Google Scholar 

  44. Ferreira, T. & Marchini, J. Modeling interactions with known risk loci — a Bayesian model averaging approach. Ann. Hum. Genet. 75, 1–9 (2011).

    PubMed  Google Scholar 

  45. Turner, S. D. et al. Knowledge-driven multi-locus analysis reveals gene–gene interactions influencing HDL cholesterol level in two independent EMR-linked biobanks. PLoS ONE 6, e19586 (2011).

    CAS  PubMed  PubMed Central  Google Scholar 

  46. Ackermann, M. & Beyer, A. Systematic detection of epistatic interactions based on allele pair frequencies. PLoS Genet. 8, e1002463 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  47. Xie, M., Li, J. & Jiang, T. Detecting genome-wide epistases based on the clustering of relatively frequent items. Bioinformatics 28, 5–12 (2012).

    CAS  PubMed  Google Scholar 

  48. Zhang, X., Huang, S., Zou, F. & Wang, W. TEAM: efficient two-locus epistasis tests in human genome-wide association study. Bioinformatics 26, i217–i227 (2010).

    CAS  PubMed  PubMed Central  Google Scholar 

  49. Brinza, D., Schultz, M., Tesler, G. & Bafna, V. RAPID detection of gene–gene interactions in genome-wide association studies. Bioinformatics 26, 2856–2862 (2010).

    CAS  PubMed  PubMed Central  Google Scholar 

  50. Ueki, M. & Tamiya, G. Ultrahigh-dimensional variable selection method for whole-genome gene–gene interaction analysis. BMC Bioinformatics 13, 72 (2012).

    PubMed  PubMed Central  Google Scholar 

  51. Yang, C. et al. SNPHarvester: a filtering-based approach for detecting epistatic interactions in genome-wide association studies. Bioinformatics 25, 504–511 (2009).

    CAS  PubMed  Google Scholar 

  52. Shen, X., Pettersson, M., Ronnegard, L. & Carlborg, O. Inheritance beyond plain heritability: variance-controlling genes in Arabidopsis thaliana. PLoS Genet. 8, e1002839 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  53. Ronnegard, L. & Valdar, W. Recent developments in statistical methods for detecting genetic loci affecting phenotypic variability. BMC Genet. 13, 63 (2012).

    PubMed  PubMed Central  Google Scholar 

  54. Brown, A. A. et al. Genetic interactions affecting human gene expression identified by variance association mapping. Elife 3, e01381 (2014).

    PubMed  PubMed Central  Google Scholar 

  55. Lewinger, J. P. et al. Efficient two-step testing of gene–gene interactions in genome-wide association studies. Genet. Epidemiol. 37, 440–451 (2013).

    PubMed  Google Scholar 

  56. Sun, X. et al. Analysis pipeline for the epistasis search — statistical versus biological filtering. Front. Genet. 5, 106 (2014).

    PubMed  PubMed Central  Google Scholar 

  57. Fairfax, B. P. et al. Genetics of gene expression in primary immune cells identifies cell type-specific master regulators and roles of HLA alleles. Nature Genet. 44, 502–510 (2012).

    CAS  PubMed  Google Scholar 

  58. Trynka, G. et al. Chromatin marks identify critical cell types for fine mapping complex trait variants. Nature Genet. 45, 124–130 (2013).

    CAS  PubMed  Google Scholar 

  59. Yang, C. et al. The choice of null distributions for detecting gene–gene interactions in genome-wide association studies. BMC Bioinformatics 12 (Suppl. 1), S26 (2011).

    PubMed  PubMed Central  Google Scholar 

  60. Fang, G. et al. High-order SNP combinations associated with complex diseases: efficient discovery, statistical power and functional interactions. PLoS ONE 7, e33531 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  61. Culverhouse, R. C. A comparison of methods sensitive to interactions with small main effects. Genet. Epidemiol. 36, 303–311 (2012).

    PubMed  PubMed Central  Google Scholar 

  62. Molinaro, A. M. et al. Power of data mining methods to detect genetic associations and interactions. Hum. Hered. 72, 85–97 (2011).

    PubMed  PubMed Central  Google Scholar 

  63. Zhu, Z. et al. Development of GMDR-GPU for gene–gene interaction analysis and its application to WTCCC GWAS data for type 2 diabetes. PLoS ONE 8, e61943 (2013).

    CAS  PubMed  PubMed Central  Google Scholar 

  64. Schwarz, D. F., König, I. R. & Ziegler, A. On safari to Random Jungle: a fast implementation of Random Forests for high-dimensional data. Bioinformatics 26, 1752–1758 (2010).

    CAS  PubMed  PubMed Central  Google Scholar 

  65. Knights, J., Yang, J., Chanda, P., Zhang, A. & Ramanathan, M. SYMPHONY, an information-theoretic method for gene–gene and gene–environment interaction analysis of disease syndromes. Heredity 110, 548–559 (2013).

    CAS  PubMed  PubMed Central  Google Scholar 

  66. Shervais, S., Kramer, P. L., Westaway, S. K., Cox, N. J. & Zwick, M. Reconstructability analysis as a tool for identifying gene–gene interactions in studies of human diseases. Stat. Appl. Genet. Mol. Biol. 9, article18 (2010).

    PubMed  Google Scholar 

  67. Zwick, M. Reconstructability analysis of epistasis. Ann. Hum. Genet. 75, 157–171 (2011).

    PubMed  Google Scholar 

  68. Lishout, F. V. et al. An efficient algorithm to perform multiple testing in epistasis screening. BMC Bioinformatics 14, 138 (2013).

    PubMed  PubMed Central  Google Scholar 

  69. Mahachie John, J. M., Van Lishout, F. & Van Steen, K. Model-based multifactor dimensionality reduction to detect epistasis for quantitative traits in the presence of error-free and noisy data. Eur. J. Hum. Genet. 19, 696–703 (2011).

    CAS  PubMed  PubMed Central  Google Scholar 

  70. Gui, J. et al. A novel survival multifactor dimensionality reduction method for detecting gene–gene interactions with application to bladder cancer prognosis. Hum. Genet. 129, 101–110 (2011).

    PubMed  Google Scholar 

  71. Lee, S., Kwon, M. S., Oh, J. M. & Park, T. Gene–gene interaction analysis for the survival phenotype based on the Cox model. Bioinformatics 28, i582–i588 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  72. Yoshida, M. & Koike, A. SNPInterForest: a new method for detecting epistatic interactions. BMC Bioinformatics 12, 469 (2011).

    PubMed  PubMed Central  Google Scholar 

  73. Li, J., Horstman, B. & Chen, Y. Detecting epistatic effects in association studies at a genomic level based on an ensemble approach. Bioinformatics 27, i222–i229 (2011).

    CAS  PubMed  PubMed Central  Google Scholar 

  74. Lu, Q., Wei, C., Ye, C., Li, M. & Elston, R. C. A likelihood ratio-based Mann–Whitney approach finds novel replicable joint gene action for type 2 diabetes. Genet. Epidemiol. 36, 583–593 (2012).

    PubMed  PubMed Central  Google Scholar 

  75. De Lobel, L. et al. A screening methodology based on Random Forests to improve the detection of gene–gene interactions. Eur. J. Hum. Genet. 18, 1127–1132 (2010).

    PubMed  PubMed Central  Google Scholar 

  76. Lin, H. Y. et al. TRM: a powerful two-stage machine learning approach for identifying SNP–SNP interactions. Ann. Hum. Genet. 76, 53–62 (2012).

    PubMed  Google Scholar 

  77. Wang, Y., Liu, X., Robbins, K. & Rekaya, R. AntEpiSeeker: detecting epistatic interactions for case–control studies using a two-stage ant colony optimization algorithm. BMC Res. Notes 3, 117 (2010).

    PubMed  PubMed Central  Google Scholar 

  78. Hu, T. et al. An information-gain approach to detecting three-way epistatic interactions in genetic association studies. J. Am. Med. Inform. Assoc. 20, 630–636 (2013).

    PubMed  PubMed Central  Google Scholar 

  79. Ma, L., Clark, A. G. & Keinan, A. Gene-based testing of interactions in association studies of quantitative traits. PLoS Genet. 9, e1003321 (2013).

    CAS  PubMed  PubMed Central  Google Scholar 

  80. Oh, S. et al. A novel method to identify high order gene–gene interactions in genome-wide association studies: gene-based MDR. BMC Bioinformatics 13 (Suppl. 9), S5 (2012).

    PubMed  PubMed Central  Google Scholar 

  81. Wu, M. C. et al. Powerful SNP-set analysis for case–control genome-wide association studies. Am. J. Hum. Genet. 86, 929–942 (2010).

    CAS  PubMed  PubMed Central  Google Scholar 

  82. Wu, C. & Cui, Y. Boosting signals in gene-based association studies via efficient SNP selection. Br. Bioinform. 15, 279–291 (2014).

    Google Scholar 

  83. He, S. & Wu, Z. Gene-based Higher Criticism methods for large-scale exonic single-nucleotide polymorphism data. BMC Proceedings. 5 (Suppl. 9), S65 (2011).

    PubMed  PubMed Central  Google Scholar 

  84. Rajapakse, I., Perlman, M. D., Martin, P. J., Hansen, J. A. & Kooperberg, C. Multivariate detection of gene–gene interactions. Genet. Epidemiol. 36, 622–630 (2012).

    PubMed  PubMed Central  Google Scholar 

  85. Zhang, X. et al. A PLSPM-based test statistic for detecting gene–gene co-association in genome-wide association study with case–control design. PLoS ONE 8, e62129 (2013).

    CAS  PubMed  PubMed Central  Google Scholar 

  86. Davis, N. A., Crowe, J. E. Jr, Pajewski, N. M. & McKinney, B. A. Surfing a genetic association interaction network to identify modulators of antibody response to smallpox vaccine. Genes Immun. 11, 630–636 (2010).

    CAS  PubMed  PubMed Central  Google Scholar 

  87. Carter, G. W., Hays, M., Sherman, A. & Galitski, T. Use of pleiotropy to model genetic interactions in a population. PLoS Genet. 8, e1003010 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  88. Snitkin, E. S. & Segre, D. Epistatic interaction maps relative to multiple metabolic phenotypes. PLoS Genet. 7, e1001294 (2011).

    CAS  PubMed  PubMed Central  Google Scholar 

  89. Li, F. et al. A powerful latent variable method for detecting and characterizing gene-based gene–gene interaction on multiple quantitative traits. BMC Genet. 14, 89 (2013).

    PubMed  PubMed Central  Google Scholar 

  90. Lehner, B. Molecular mechanisms of epistasis within and between genes. Trends Genet. 27, 323–331 (2011). This is an overview of possible molecular mechanisms that can cause epistasis and links between functional and statistical epistasis.

    CAS  PubMed  Google Scholar 

  91. Becker, J., Wendland, J. R., Haenisch, B., Nöthen, M. M. & Schumacher, J. A systematic eQTL study of cis–trans epistasis in 210 HapMap individuals. Eur. J. Hum. Genet. 97–101 (2011).

  92. Zhang, W., Zhu, J., Schadt, E. E. & Liu, J. S. A Bayesian partition method for detecting pleiotropic and epistatic eQTL modules. PLoS Comput. Biol. 6, e1000642 (2010).

    PubMed  PubMed Central  Google Scholar 

  93. Lee, S. & Xing, E. P. Leveraging input and output structures for joint mapping of epistatic and marginal eQTLs. Bioinformatics 28, i137–146 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  94. Holzinger, E. R. et al. Initialization parameter sweep in ATHENA: optimizing neural networks for detecting gene–gene interactions in the presence of small main effects. Genet. Evol. Comput. Conf. 12, 203–210 (2010).

    PubMed  PubMed Central  Google Scholar 

  95. Wise, A. L., Gyi, L. & Manolio, T. A. eXclusion: toward integrating the X chromosome in genome-wide association analyses. Am. J. Hum. Genet. 92, 643–647 (2013).

    CAS  PubMed  PubMed Central  Google Scholar 

  96. Chen, C. C. et al. Methods for identifying SNP interactions: a review on variations of Logic Regression, Random Forest and Bayesian logistic regression. IEEE/ACM Trans. Comput. Biol. Bioinform. 8, 1580–1591 (2011).

    PubMed  Google Scholar 

  97. Garcia-Magarinos, M., Lopez-de-Ullibarri, I., Cao, R. & Salas, A. Evaluating the ability of tree-based methods and logistic regression for the detection of SNP–SNP interaction. Ann. Hum. Genet. 73, 360–369 (2009).

    PubMed  Google Scholar 

  98. Kapur, K., Schupbach, T., Xenarios, I., Kutalik, Z. & Bergmann, S. Comparison of strategies to detect epistasis from eQTL data. PLoS ONE 6, e28415 (2011).

    CAS  PubMed  PubMed Central  Google Scholar 

  99. Shang, J. et al. Performance analysis of novel methods for detecting epistasis. BMC Bioinformatics 12, 475 (2011).

    PubMed  PubMed Central  Google Scholar 

  100. Winham, S., Wang, C. & Motsinger-Reif, A. A. A comparison of multifactor dimensionality reduction and L1-penalized regression to identify gene–gene interactions in genetic association studies. Stat. Appl. Genet. Mol. Biol. 10, Article 4 (2011).

    PubMed  Google Scholar 

  101. An, P. et al. The challenge of detecting epistasis (G × G interactions): genetic analysis workshop 16. Genet. Epidemiol. 33 (Suppl. 1), S58–67 (2009).

    PubMed  PubMed Central  Google Scholar 

  102. Hemani, G., Knott, S. & Haley, C. An evolutionary perspective on epistasis and the missing heritability. PLoS Genet. 9, e1003295 (2013).

    CAS  PubMed  PubMed Central  Google Scholar 

  103. Lippert, C. et al. An exhaustive epistatic SNP association analysis on expanded Wellcome Trust data. Sci. Rep. 3, 1099 (2013).

    PubMed  PubMed Central  Google Scholar 

  104. Schadt, E. et al. Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302 (2003).

    CAS  PubMed  Google Scholar 

  105. Powell, J. E. et al. The Brisbane Systems Genetics Study: genetical genomics meets complex trait genetics. PLoS ONE 7, e35430 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  106. Hemani, G. et al. Detection and replication of epistasis influencing transcription in humans. Nature 10, 249–253 (2014).

    Google Scholar 

  107. Combarros, O., Cortina-Borja, M., Smith, A. D. & Lehmann, D. J. Epistasis in sporadic Alzheimer's disease. Neurobiol. Aging 30, 1333–1349 (2009).

    CAS  PubMed  Google Scholar 

  108. Kolsch, H. et al. Interaction of insulin and PPAR-α genes in Alzheimer's disease: the Epistasis Project. J. Neural Transm. 119, 473–479 (2012).

    PubMed  Google Scholar 

  109. Bullock, J. M. et al. Discovery by the Epistasis Project of an epistatic interaction between the GSTM3 gene and the HHEX/IDE/KIF11 locus in the risk of Alzheimer's disease. Neurobiol. Aging 34, 1309.e1–1309.e7 (2013).

    CAS  Google Scholar 

  110. Combarros, O. et al. The dopamine β-hydroxylase -1021C/T polymorphism is associated with the risk of Alzheimer's disease in the Epistasis Project. BMC Med. Genet. 11, 162 (2010).

    CAS  PubMed  PubMed Central  Google Scholar 

  111. Combarros, O. et al. Replication by the Epistasis Project of the interaction between the genes for IL-6 and IL-10 in the risk of Alzheimer's disease. J. Neuroinflammation 6, 22 (2009).

    PubMed  PubMed Central  Google Scholar 

  112. Rhinn, H. et al. Integrative genomics identifies APOE ε4 effectors in Alzheimer's disease. Nature 500, 45–50 (2013). This paper presents a good example of how knowledge of protein–protein interactions can lead to the identification of statistical interactions between genetic variants.

    CAS  PubMed  Google Scholar 

  113. Gregersen, J. W. et al. Functional epistasis on a common MHC haplotype associated with multiple sclerosis. Nature 443, 574–577 (2006).

    CAS  PubMed  Google Scholar 

  114. Lincoln, M. R. et al. Epistasis among HLA-DRB1, HLA-DQA1, and HLA-DQB1 loci determines multiple sclerosis susceptibility. Proc. Natl Acad. Sci. 106, 7542–7547 (2009).

    CAS  PubMed  PubMed Central  Google Scholar 

  115. Castillejo-López, C. et al. Genetic and physical interaction of the B-cell systemic lupus erythematosus-associated genes BANK1 and BLK. Ann. Rheum. Dis. 71, 136–142 (2012).

    PubMed  Google Scholar 

  116. Dempster, E. R. & Lerner, I. M. Heritability of threshold characters. Genetics 35, 212–236 (1950). This is a clear and insightful paper that explains the concepts behind the liability scale and observed scale in binary phenotypes.

    CAS  PubMed  PubMed Central  Google Scholar 

  117. Lucas, G. et al. Hypothesis-based analysis of gene–gene interactions and risk of myocardial infarction. PLoS ONE 7, e41730 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  118. Bell, J. T. et al. Genome-wide association scan allowing for epistasis in type 2 diabetes. Ann. Hum. Genet. 75, 10–19 (2011).

    PubMed  Google Scholar 

  119. Wei, W. H. et al. Genome-wide analysis of epistasis in body mass index using multiple human populations. Eur. J. Hum. Genet. 20, 857–862 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  120. Wei, W. et al. Characterisation of genome-wide association epistasis signals for serum uric acid in human population isolates. PLoS ONE 6, e23836 (2011).

    CAS  PubMed  PubMed Central  Google Scholar 

  121. Visscher, P. M., Brown, M. a, McCarthy, M. I. & Yang, J. Five years of GWAS discovery. Am. J. Hum. Genet. 90, 7–24 (2012).

    CAS  PubMed  PubMed Central  Google Scholar 

  122. Hill, L. D. et al. Epistasis between COMT and MTHFR in maternal–fetal dyads increases risk for preeclampsia. PLoS ONE 6, e16681 (2011).

    CAS  PubMed  PubMed Central  Google Scholar 

  123. Génin, E. et al. Epistatic interaction between BANK1 and BLK in rheumatoid arthritis: results from a large trans-ethnic meta-analysis. PLoS ONE 8, e61044 (2013).

    PubMed  PubMed Central  Google Scholar 

  124. Verhoeven, K. J. F., Casella, G. & McIntyre, L. M. Epistasis: obstacle or advantage for mapping complex traits? PLoS ONE 5, e12264 (2010).

    PubMed  PubMed Central  Google Scholar 

  125. Hill, W. G., Goddard, M. E. & Visscher, P. M. Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet. 4, e1000008 (2008). This study explores the apparent dichotomy between evidence for functional epistasis and lack of evidence for statistical epistasis; it points out that, with allele frequency distributions typical of natural populations, non-additive gene action typically generates little epistatic variance.

    PubMed  PubMed Central  Google Scholar 

  126. Gjuvsland, a B., Vik, J. O., Woolliams, J. a & Omholt, S. W. Order-preserving principles underlying genotype–phenotype maps ensure high additive proportions of genetic variance. J. Evol. Biol. 24, 2269–2279 (2011).

    CAS  PubMed  Google Scholar 

  127. Mäki-Tanila, A. & Hill, W. Influence of gene interaction on complex trait variation with multi-locus models. Genetics http://dx.doi.org/10.1534/genetics.114.165282 (2014).

  128. Falconer, D. S. & Mackay, T. F. C. Introduction to Quantitative Genetics (Longman, 1996).

    Google Scholar 

  129. Stringer, S., Derks, E., Kahn, R., Hill, W. & Wray, N. Assumptions and properties of limiting pathway models for analysis of epistasis in complex traits. PLoS ONE 8, 1–9 (2013).

    Google Scholar 

  130. Evans, D. M., Gillespie, N. a & Martin, N. G. Biometrical genetics. Biol. Psychol. 61, 33–51 (2002).

    PubMed  Google Scholar 

  131. Silventoinen, K. et al. Heritability of adult body height: a comparative study of twin cohorts in eight countries. Twin Res. 6, 399–408 (2003).

    PubMed  Google Scholar 

  132. Elks, C. E. et al. Variability in the heritability of body mass index: a systematic review and meta-regression. Front. Endocrinol. 3, 29 (2012).

    Google Scholar 

  133. Hu, X. et al. Integrating autoimmune risk loci with gene-expression data identifies specific pathogenic immune cell subsets. Am. J. Hum. Genet. 89, 496–506 (2011).

    CAS  PubMed  PubMed Central  Google Scholar 

  134. Wray, N. R., Yang, J., Goddard, M. E. & Visscher, P. M. The genetic interpretation of area under the ROC curve in genomic profiling. PLoS Genet. 6, e1000864 (2010).

    PubMed  PubMed Central  Google Scholar 

  135. Daetwyler, H. D., Villanueva, B. & Woolliams, J. A. Accuracy of predicting the genetic risk of disease using a genome-wide approach. PLoS ONE 3, e3395 (2008).

  136. Quon, G., Lippert, C., Heckerman, D. & Listgarten, J. Patterns of methylation heritability in a genome-wide analysis of four brain regions. Nucleic Acids Res. 41, 2095–2104 (2013).

  137. Gervin, K. et al. Extensive variation and low heritability of DNA methylation identified in a twin study. Genome Res. 21, 1813–1821 (2011).

  138. Purcell, S. M. et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752 (2009).

  139. Wray, N. R. et al. Pitfalls of predicting complex traits from SNPs. Nature Rev. Genet. 14, 507–515 (2013). This is essential reading for those interested in prediction of complex disease from genetic signals — some of the pitfalls may be even more dangerous when using epistatic signals.

  140. Zhou, X., Carbonetto, P. & Stephens, M. Polygenic modeling with Bayesian sparse linear mixed models. PLoS Genet. 9, e1003264 (2013).

  141. Becker, T., Herold, C., Meesters, C., Mattheisen, M. & Baur, M. P. Significance levels in genome-wide interaction analysis (GWIA). Ann. Hum. Genet. 75, 29–35 (2011).

  142. Carlborg, O., Jacobsson, L., Ahgren, P., Siegel, P. & Andersson, L. Epistasis and the release of genetic variation during long-term selection. Nature Genet. 38, 418–420 (2006).

  143. Álvarez-Castro, J. M., Le Rouzic, A., Andersson, L., Siegel, P. B. & Carlborg, Ö. Modelling of genetic interactions improves prediction of hybrid patterns — a case study in domestic fowl. Genet. Res. 94, 255–266 (2012).

  144. Wang, D. et al. Prediction of genetic values of quantitative traits with epistatic effects in plant breeding populations. Heredity 109, 313–319 (2012).

  145. Dudley, J. W. & Johnson, G. R. Epistatic models improve prediction of performance in corn. Crop Sci. 49, 763–770 (2009).

  146. Hu, Z. et al. Genomic value prediction for quantitative traits under the epistatic model. BMC Genet. 12, 15 (2011).

  147. González-Camacho, J. M. et al. Genome-enabled prediction of genetic values using radial basis function neural networks. Theor. Appl. Genet. 125, 759–771 (2012).

  148. Buckler, E. S. et al. The genetic architecture of maize flowering time. Science 325, 714–718 (2009).

  149. Mackay, T. F. C. Epistasis and quantitative traits: using model organisms to study gene–gene interactions. Nature Rev. Genet. 15, 22–33 (2014). This review argues that detection of epistasis is often more tractable in model organisms, but differences in populations and genetic architecture (especially allele frequency and effect size) make it difficult to extrapolate conclusions on the importance of epistasis to human populations.

  150. Houle, D., Pélabon, C., Wagner, G. & Hansen, T. Measurement and meaning in biology. Q. Rev. Biol. 86, 3–34 (2011). This is an interesting discussion on the science of measuring things and is informative when thinking about scale effects that may underlie epistatic signals.

  151. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  152. Dinu, I. et al. SNP–SNP interactions discovered by logic regression explain Crohn's disease genetics. PLoS ONE 7, e43035 (2012).

  153. Piriyapongsa, J. et al. iLOCi: a SNP interaction prioritization technique for detecting epistasis in genome-wide association studies. BMC Genomics 13 (Suppl. 7), S2 (2012).

  154. Hu, X. et al. SHEsisEpi, a GPU-enhanced genome-wide SNP–SNP interaction scanning algorithm, efficiently reveals the risk genetic epistasis in bipolar disorder. Cell Res. 20, 854–857 (2010).

  155. Wu, X. et al. A novel statistic for genome-wide interaction analysis. PLoS Genet. 6, e1001131 (2010).

  156. Emily, M. IndOR: a new statistical procedure to test for SNP–SNP epistasis in genome-wide association studies. Stat. Med. 31, 2359–2373 (2012).

  157. Li, M., Romero, R., Fu, W. J. & Cui, Y. Mapping haplotype–haplotype interactions with adaptive LASSO. BMC Genet. 11, 79 (2010).

  158. Yi, N., Liu, N., Zhi, D. & Li, J. Hierarchical generalized linear models for multiple groups of rare and common variants: jointly estimating group and individual-variant effects. PLoS Genet. 7, e1002382 (2011).

  159. Winham, S. J. & Motsinger-Reif, A. A. An R package implementation of multifactor dimensionality reduction. BioData Min. 4, 24 (2011).

  160. Yang, P., Ho, J. W., Yang, Y. H. & Zhou, B. B. Gene–gene interaction filtering with ensemble of filters. BMC Bioinformatics 12 (Suppl. 1), S10 (2011).

  161. Wan, X. et al. Predictive rule inference for epistatic interaction detection in genome-wide association studies. Bioinformatics 26, 30–37 (2010).

  162. Winham, S. J. et al. SNP interaction detection with Random Forests in high-dimensional genetic data. BMC Bioinformatics 13, 164 (2012).

Acknowledgements

The authors are grateful to three anonymous reviewers for help in improving the manuscript. W.-H.W. acknowledges financial support from the Higher Education Funding Council for England (HEFCE) and the Medical Research Council. G.H. is grateful for support from the Medical Research Council (MC_UU_12013/1-9) and from the US National Institutes of Health (GM057091). C.S.H. is grateful for financial support from the UK Medical Research Council and the Biotechnology and Biological Sciences Research Council.

Author information

Corresponding author

Correspondence to Chris S. Haley.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary information S1 (box)

Rationale of frequentist methods for testing statistical interactions (PDF 555 kb)

Glossary

Complex traits

Traits for which variation between individuals is controlled by several or many genes and different environmental effects, potentially with interactions between these different effects.

Mutational target size

The fraction of the genome in which new mutations can potentially cause variation for a trait. For most complex traits this is large, thus suggesting that many loci can influence trait variation.

Causal variants

Genetic variants that directly modify a phenotype and/or cause a change of disease risk. Owing to the limited amount of variation interrogated by single-nucleotide polymorphism (SNP) genotyping microarrays, SNPs in genome-wide association studies typically merely tag the causal region rather than being the causal variants themselves.

Genetic architecture

The complete description of the genetic factors influencing trait variation, such as the number of genetic loci, their effects, allele frequencies, actions and interactions.

Epistasis

Statistical interactions between loci in their effect on a trait such that the impact of a particular single-locus genotype depends on the genotype at other loci.

Narrow-sense heritability

(h²). The proportion of phenotypic variation due to the additive effects of genes.

Broad-sense heritability

(H²). The proportion of phenotypic variation due to all genetic effects (that is, both additive and non-additive, including dominance and epistasis).

Exhaustive search

A search of all possible pairwise combinations of loci for evidence of epistatic interactions.

Bonferroni correction

The simplest and perhaps most conservative method to control the family-wise error rate (α) by correcting for the number of independent hypothesis tests (n) when n is large; that is, the corrected threshold is P_corrected = α/n.
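
As a rough illustration of how an exhaustive pairwise search and the Bonferroni correction combine, the sketch below counts the SNP pairs to be tested and derives the corrected threshold α/n. The function names and the array size of 500,000 SNPs are illustrative assumptions, not values from this article, and treating every pair as an independent test is itself a simplification.

```python
from math import comb


def n_pairwise_tests(n_snps: int) -> int:
    """Number of distinct SNP pairs in an exhaustive pairwise search."""
    return comb(n_snps, 2)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Corrected per-test threshold, P_corrected = alpha / n."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    n_snps = 500_000  # hypothetical array size, chosen only for illustration
    n_tests = n_pairwise_tests(n_snps)
    print(n_tests, bonferroni_threshold(0.05, n_tests))
```

Because the number of pairs grows roughly with the square of the number of SNPs, the corrected threshold for a pairwise scan is many orders of magnitude stricter than for a single-SNP scan of the same array.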

Hypothesis-free

An analysis in which no assumption is made about the loci involved in epistasis or their effects and so all possible pairs of single-nucleotide polymorphisms are tested (that is, an exhaustive search).

Hypothesis-driven

An analysis that limits the combinations of loci tested for epistasis according to some prior hypothesis (for example, only loci with a marginal effect or loci involved in a particular biological pathway should be tested).

Quantitative traits

Phenotypes that vary continuously (for example, height), in contrast to qualitative traits in which phenotypes are discrete (for example, diseased or healthy).

Saturated and reduced models

There are nine joint genotypes for a pair of single-nucleotide polymorphisms (SNPs) each with three genotypes (for example, AA, Aa and aa). These can be modelled in full using nine parameters: one as the baseline (for example, aa/aa), two for each SNP (for example, AA/aa and Aa/aa) and four for interactions (for example, AA/Aa, AA/AA, Aa/Aa, Aa/AA). The saturated model fits all the nine parameters, whereas the reduced model fits the first five parameters and excludes the four interaction parameters.
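
The comparison of the two models can be sketched as follows, assuming a quantitative phenotype and ordinary least squares. The code builds the five-parameter (reduced) and nine-parameter (saturated) dummy-coded design rows and compares their residual sums of squares with an F statistic on 4 and n − 9 degrees of freedom. All function names are hypothetical, every one of the nine joint genotypes must be observed for the saturated fit to be estimable, and no P value is computed here.

```python
GENOTYPES = (0, 1, 2)  # assumed coding: copies of allele A (aa, Aa, AA)


def design_row(g1, g2, saturated):
    """Baseline, two dummy terms per SNP, and optionally four interaction terms."""
    a = [float(g1 == 1), float(g1 == 2)]
    b = [float(g2 == 1), float(g2 == 2)]
    row = [1.0] + a + b
    if saturated:
        row += [x * y for x in a for y in b]
    return row


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def residual_ss(genos, y, saturated):
    """Least-squares fit via the normal equations; returns the residual sum of squares."""
    X = [design_row(g1, g2, saturated) for g1, g2 in genos]
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(bk * xk for bk, xk in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


def interaction_f_stat(genos, y):
    """F statistic (4 and n - 9 d.f.) for the four interaction parameters."""
    rss_reduced = residual_ss(genos, y, saturated=False)
    rss_saturated = residual_ss(genos, y, saturated=True)
    return ((rss_reduced - rss_saturated) / 4) / (rss_saturated / (len(y) - 9))


if __name__ == "__main__":
    # Toy data: two individuals per joint genotype, with an extra effect in AA/AA.
    genos = [(g1, g2) for g1 in GENOTYPES for g2 in GENOTYPES for _ in range(2)]
    noise = [0.1, -0.1] * 9
    y = [g1 + 0.5 * g2 + (3.0 if (g1, g2) == (2, 2) else 0.0) + e
         for (g1, g2), e in zip(genos, noise)]
    print(interaction_f_stat(genos, y))
```

A large F statistic indicates that the four interaction parameters explain variation that the five-parameter model cannot; for a binary phenotype the analogous comparison would use logistic regression and a likelihood ratio test.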

Hardy–Weinberg equilibrium

(HWE). A principle stating that allele and genotype frequencies of variants in a population will remain constant from one generation to the next in the absence of disturbing evolutionary factors such as mutation and genetic drift.
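
In practice, HWE is usually checked by comparing observed genotype counts with those expected from the allele frequencies. The sketch below relies on the standard result (not stated in this glossary) that a biallelic locus with allele frequencies p and q = 1 − p has expected genotype frequencies p², 2pq and q² under HWE, and computes a simple chi-square statistic. The function names are hypothetical, and exact tests are often preferred for small counts.

```python
def hwe_expected_counts(n_AA, n_Aa, n_aa):
    """Expected genotype counts under HWE, given the observed counts."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p
    return n * p * p, 2 * n * p * q, n * q * q


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed and HWE-expected counts."""
    observed = (n_AA, n_Aa, n_aa)
    expected = hwe_expected_counts(n_AA, n_Aa, n_aa)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    print(hwe_chi_square(25, 50, 25))  # counts that match HWE exactly
    print(hwe_chi_square(50, 0, 50))   # complete absence of heterozygotes
```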

Marginal effects

(Also known as main effects). The average effect of a locus across all other loci and environmental effects.

Linkage disequilibrium

(LD). The nonrandom association of alleles at two or more loci in a population owing to limited recombination. LD is often used to measure the relationship between genetic markers at these loci: high LD means that the markers are closely related (that is, their alleles co-occur), so the genotype at one marker is predictive of the genotype at another.
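
Two commonly used summaries of LD are the coefficient D and the squared correlation r². The sketch below computes both from phased haplotypes, assuming two biallelic loci with alleles coded 0/1; the function name and the coding are illustrative assumptions, and unphased genotype data would need a phasing or estimation step that is not shown.

```python
def ld_stats(haplotypes):
    """Return (D, r2) for two biallelic loci.

    haplotypes: list of (a, b) pairs, each allele coded 0 or 1.
    D = p_AB - p_A * p_B and r2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denom


if __name__ == "__main__":
    complete_ld = [(1, 1)] * 5 + [(0, 0)] * 5
    no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]
    print(ld_stats(complete_ld), ld_stats(no_ld))
```

An r² close to 1 means that the genotype at one marker almost fully predicts the genotype at the other, whereas an r² close to 0 means that the markers carry nearly independent information.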

Haplotype

A combination of alleles (DNA sequences) inherited from a single parent. A haplotype can be within one locus or across multiple loci, with or without physical coupling on the DNA strand.

Linkage phase

(Also known as gametic phase). Information on which combinations of alleles in a diploid individual were inherited together from the mother or from the father.

Polygenic architecture

A trait genetic architecture under which many genes of small effect contribute to trait variation.

Covariates

Variables that may confound the outcome variable of a statistical model; for example, age is a covariate of human height.

Bayes' theorem

A theorem in probability theory, attributed to Thomas Bayes, for calculating conditional probabilities from the prior distributions of the parameters in a model and the observed experimental data.

Variance heterogeneity

Difference in the variance of a quantitative trait between the three possible genotypes of a biallelic single-nucleotide polymorphism (SNP), which can arise in the presence of genetic interactions; it can therefore be used to screen for potentially interacting SNPs.
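
One way to screen a SNP for variance heterogeneity is to compare the spread of the trait across its three genotype groups. The sketch below uses the Brown-Forsythe (median-centred Levene) statistic as an illustrative choice; this glossary does not prescribe a particular test, the function name is hypothetical, and the P value (from an F distribution with k − 1 and N − k degrees of freedom) is not computed.

```python
from statistics import fmean, median


def brown_forsythe_w(groups):
    """Brown-Forsythe statistic for equality of variances across groups.

    groups: one list of trait values per genotype (for example aa, Aa, AA).
    """
    # Absolute deviations from each group's median.
    z = [[abs(y - median(g)) for y in g] for g in groups]
    n_total = sum(len(g) for g in z)
    k = len(z)
    grand = fmean([v for g in z for v in g])
    between = sum(len(g) * (fmean(g) - grand) ** 2 for g in z)
    within = sum((v - fmean(g)) ** 2 for g in z for v in g)
    return (n_total - k) / (k - 1) * between / within


if __name__ == "__main__":
    same_spread = [[1, 2, 3, 4, 5], [11, 12, 13, 14, 15], [21, 22, 23, 24, 25]]
    wider_third = [[1, 2, 3, 4, 5], [11, 12, 13, 14, 15], [-10, 0, 3, 6, 16]]
    print(brown_forsythe_w(same_spread), brown_forsythe_w(wider_third))
```

Differences in group means do not affect the statistic, so a large value points to genotype-dependent variance rather than to a marginal effect.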

Publication bias

A bias that arises because certain types of results (for example, those that successfully reject the null hypothesis) are much more likely to be published than others, leading to their disproportionate representation in the literature.

Large P small N problem

The statistical challenge of estimating a large number of parameters (P) from a small number of samples (N).

Multifactor dimensionality reduction

A data-mining algorithm that reduces a high-dimensional multilocus model of multifactorial classes (that is, single-nucleotide polymorphism genotype combinations) to a one-dimensional model with a single variable that takes one of two values, high risk (potentially interacting) or low risk, based on the ratio of cases to controls in each class. The algorithm uses cross-validation iteratively to define the best classification.
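
The core pooling step of the algorithm can be sketched for a single pair of SNPs as below. This is a simplified illustration, not a full implementation: the cross-validation loop and the search over SNP combinations are omitted, the function names are hypothetical, and using the overall case:control ratio as the high-risk threshold is an assumption.

```python
from collections import Counter


def mdr_labels(genos, status):
    """Label each observed two-SNP genotype cell as high risk (True) or low risk (False).

    genos: list of (g1, g2) genotype pairs; status: list of 1 (case) or 0 (control).
    A cell is high risk when its case:control ratio exceeds the overall ratio.
    """
    cases = Counter(g for g, s in zip(genos, status) if s == 1)
    controls = Counter(g for g, s in zip(genos, status) if s == 0)
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    # Compare cross-products so that cells without controls need no division.
    return {g: cases[g] * n_controls > controls[g] * n_cases
            for g in set(cases) | set(controls)}


def mdr_accuracy(genos, status, labels):
    """Fraction of individuals whose status matches their cell's risk label."""
    hits = sum(int(labels.get(g, False)) == s for g, s in zip(genos, status))
    return hits / len(status)


if __name__ == "__main__":
    genos = [(0, 0)] * 4 + [(1, 1)] * 4 + [(0, 1)] * 4 + [(1, 0)] * 4
    status = [1, 1, 1, 0] * 2 + [0, 0, 0, 1] * 2
    labels = mdr_labels(genos, status)
    print(labels, mdr_accuracy(genos, status, labels))
```

In a full analysis the labels would be learned on training folds and the accuracy evaluated on held-out folds, so that the reported classification is not simply fitted to the same data.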

Tree-based methods

Model-free or non-parametric machine-learning approaches for regression and classification analyses by recursive partitioning of variables into tree structures. Popular applications in epistasis studies include random forest, random jungle, and classification and regression trees.

Entropy-based methods

Entropy is a key measure of uncertainty associated with a random variable in information theory. Entropy-based methods examine the information entropy difference between different models with and without interactions to detect epistasis.
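
One concrete form of such an entropy difference is the information that a pair of SNPs carries about disease status beyond what each SNP carries alone. The sketch below computes this quantity from empirical counts; the choice of this particular measure and the function names are illustrative assumptions, and significance would normally be assessed separately (for example, by permutation).

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def interaction_gain(snp1, snp2, status):
    """I(SNP1,SNP2;Y) - I(SNP1;Y) - I(SNP2;Y)."""
    joint = list(zip(snp1, snp2))
    return (mutual_information(joint, status)
            - mutual_information(snp1, status)
            - mutual_information(snp2, status))


if __name__ == "__main__":
    snp1 = [0, 0, 1, 1]
    snp2 = [0, 1, 0, 1]
    xor_status = [0, 1, 1, 0]  # status depends only on the combination
    print(interaction_gain(snp1, snp2, xor_status))
```

A positive value suggests that the joint genotype is more informative than the two SNPs considered separately, whereas a value near zero is consistent with the absence of interaction on this scale.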

Imputation

Statistical inference of unobserved single-nucleotide polymorphism (SNP) genotypes based on a reference panel of known haplotypes in a population (for example, the 1000 Genomes Project). Imputation can greatly reduce the distance between SNPs and causal variants, and thus increase the power to detect associations.

Pleiotropic epistasis

Statistical interaction signals that are shared across multiple traits.

Expression quantitative trait locus

(eQTL). A locus that controls variation in expression of a particular gene. An eQTL may lie adjacent to the gene being controlled (cis-acting control) or some distance away (trans-acting control).

Wellcome Trust Case–Control Consortium

(WTCCC). One of the first large collaborative genome-wide association studies that included eight disease traits. This study has become a role model for subsequent studies, and the data set has been subjected to additional analyses, including for epistasis.

Endophenotypes

Heritable traits that are genetically correlated with disease traits. They are often traits (such as the level of a metabolite or transcript) that can be measured in all individuals (both diseased and healthy) and that can potentially provide a predictor of disease status.

Observed scale

Measurement of a binary phenotype in terms of whether the participant exhibits the phenotype or not.

Liability scale

An unobserved underlying risk of a binary phenotype or disease that is measured on a continuous scale and that is likely to be influenced by many genetic and environmental factors.

Binary phenotypes

Disease traits that have two major states on the observed scale: diseased or healthy. They may nonetheless be complex traits in which transition to the disease state is influenced by continuous variation on an underlying liability scale for disease that is controlled by many genetic loci and environmental effects.

Cite this article

Wei, WH., Hemani, G. & Haley, C. Detecting epistasis in human complex traits. Nat Rev Genet 15, 722–733 (2014). https://doi.org/10.1038/nrg3747

  • DOI: https://doi.org/10.1038/nrg3747
