Analysing biological pathways in genome-wide association studies

Key Points

  • This article provides a background introduction to pathway-based approaches for analyzing genome-wide association (GWA) studies. An example is shown to illustrate that many genes in a susceptibility pathway may show evidence of association, although not genome-wide significance, in any given GWA study.

  • A brief overview of published studies that use pathway approaches for interpreting data from GWA studies is given.

  • A summary and classification of the currently available pathway approaches is provided. The differences in their statistical approaches and analytical procedures are described.

  • The challenges and pitfalls of using pathway approaches to analyze GWA studies are then discussed.

  • An outline of the future research directions that could further mine information from existing GWA study data sets is given. The extension of pathway approaches to next-generation sequencing data is also discussed.

Abstract

Genome-wide association (GWA) studies have typically focused on the analysis of single markers, which often lacks the power to uncover the relatively small effect sizes conferred by most genetic variants. Recently, pathway-based approaches have been developed, which use prior biological knowledge on gene function to facilitate more powerful analysis of GWA study data sets. These approaches typically examine whether a group of related genes in the same functional pathway are jointly associated with a trait of interest. Here we review the development of pathway-based approaches for GWA studies, discuss their practical use and caveats, and suggest that pathway-based approaches may also be useful for future GWA studies with sequencing data.

Figure 1: Linking pathways to disease: Crohn's disease.
Figure 2: Types of pathway association method.

Acknowledgements

We thank D. C. Thomas (University of Southern California) for his helpful critiques which greatly improved the manuscript.

Author information

Corresponding author

Correspondence to Hakon Hakonarson.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Related links

DATABASES

BioCarta database

Cytoscape

GeneGo

Gene Ontology database

Ingenuity Pathway Analysis

Kyoto Encyclopedia of Genes and Genomes (KEGG)

MetaCyc

Molecular Signatures Database

Nature Pathway Interaction Database

Pathguide

Science Signal Transduction Knowledge Environment

TRANSPATH

FURTHER INFORMATION

Kai Wang's homepage

Mingyao Li's homepage

Hakon Hakonarson's homepage

ALIGATOR

Genetic Analysis Workshop 16

GenGen

GESBAP

GRASS

GSA-SNP

GSEA-SNP

i-GSEA4GWAS

Nature Reviews Genetics series on Genome-wide association studies

Nature Reviews Genetics series on Study designs

PLINK

SNP ratio test

Glossary

Permutation

A strategy for assessing the probability of observing a value of a particular statistic at least as extreme as the one obtained. The data are randomly shuffled and the statistic is recomputed from the shuffled data many times; the resulting values are then compared with the value of the statistic obtained from the non-shuffled data.
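
The shuffling procedure can be illustrated with a minimal sketch. It assumes a simple two-group comparison of means; the function name `permutation_p_value`, the choice of statistic and the number of permutations are illustrative assumptions and are not taken from the article.

```python
import random


def permutation_p_value(cases, controls, n_permutations=10000, seed=0):
    """Empirical two-sided p-value for a difference in group means.

    Hypothetical helper: the statistic (absolute difference in means)
    is an assumption chosen only to keep the example short.
    """
    rng = random.Random(seed)
    n_cases = len(cases)
    n_controls = len(controls)
    observed = abs(sum(cases) / n_cases - sum(controls) / n_controls)

    pooled = list(cases) + list(controls)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        perm_cases = pooled[:n_cases]
        perm_controls = pooled[n_cases:]
        stat = abs(sum(perm_cases) / n_cases - sum(perm_controls) / n_controls)
        if stat >= observed:
            hits += 1

    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_permutations + 1)
```

In a pathway setting the same idea would typically be applied by shuffling case and control labels and recomputing a pathway-level statistic, which preserves the correlation between markers.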

Multi-marker test

A statistical method that measures the strength of association between a trait and multiple SNP markers.

SNP ascertainment

Identification of SNPs that should be placed on a genotyping array to ensure representative coverage of the genome.

Linkage disequilibrium

The non-random association of alleles at two or more closely linked loci.
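
One common way to quantify this non-random association between two biallelic loci is the squared correlation r². The sketch below assumes that the haplotype frequency and the two allele frequencies are already known; the function name and arguments are hypothetical.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation (r^2) between alleles A and B at two loci.

    p_ab: frequency of the haplotype carrying both A and B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    # D measures the departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
```

A value of 0 corresponds to independent loci, and a value of 1 means that the allele at one locus fully predicts the allele at the other.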

Genomic inflation

The presence of excess false-positive results, measured by quantifying the ratio of the median of the empirically observed distribution of the test statistic to the expected median.
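
The ratio described here can be computed as follows. This is a minimal sketch that assumes the association test statistics follow a chi-square distribution with one degree of freedom under the null hypothesis; that assumption and the function name are not stated in the glossary.

```python
import statistics

# Median of a 1-df chi-square: the square of the 75th percentile of a
# standard normal distribution (approximately 0.455).
EXPECTED_MEDIAN_CHI2_1DF = statistics.NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation_factor(chi2_stats):
    """Ratio of the observed median test statistic to its expected median."""
    return statistics.median(chi2_stats) / EXPECTED_MEDIAN_CHI2_1DF
```

Values close to 1 suggest little inflation, whereas values clearly above 1 suggest an excess of false-positive results.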

Type I error

A false-positive result from a statistical hypothesis test, that is, rejection of a null hypothesis that is true. The type I error rate is the probability of such a result.

Bonferroni correction

A multiple-comparison adjustment approach that tests each individual hypothesis at a threshold for declaring statistical significance that is reduced n-fold (that is, divided by n) when n hypotheses are being tested.
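
As a worked illustration of the n-fold adjustment, the sketch below divides an overall significance level by the number of tests. The function name and the default level of 0.05 are illustrative assumptions.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value that passes the Bonferroni-adjusted threshold."""
    n = len(p_values)
    if n == 0:
        return []
    # Each hypothesis is tested at alpha / n instead of alpha.
    threshold = alpha / n
    return [p <= threshold for p in p_values]
```

With three tests and an overall level of 0.05, for example, each p-value is compared against 0.05 / 3, or roughly 0.0167.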

False Discovery Rate

A multiple comparison adjustment approach to control the expected proportion of incorrectly rejected null hypotheses in a list of rejected hypotheses.
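
One widely used procedure for controlling this quantity is the Benjamini-Hochberg step-up method. The glossary does not name a specific procedure, so its use here is an assumption, as are the function name and the default target rate.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking hypotheses rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= k * q / m.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            max_rank = rank

    # Reject every hypothesis ranked at or below that cut-off.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected
```

Because the cut-off is the largest qualifying rank, a p-value can be rejected even if it exceeds its own rank-specific threshold, provided that a larger p-value meets its threshold.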

Genotype imputation

A statistical method that predicts individual genotypes at ungenotyped markers from genotypes of other nearby markers, usually using the HapMap data as a reference.

About this article

Cite this article

Wang, K., Li, M. & Hakonarson, H. Analysing biological pathways in genome-wide association studies. Nat Rev Genet 11, 843–854 (2010). https://doi.org/10.1038/nrg2884

  • DOI: https://doi.org/10.1038/nrg2884
