InnateDB: facilitating systems-level analyses of the mammalian innate immune response
David J Lynn1,2, Geoffrey L Winsor1, Calvin Chan2, Nicolas Richard1, Matthew R Laird1, Aaron Barsky3, Jennifer L Gardy2, Fiona M Roche1, Timothy H W Chan2, Naisha Shah1,2, Raymond Lo1, Misbah Naseer2, Jaimmie Que2, Melissa Yau2, Michael Acab1, Dan Tulpan1, Matthew D Whiteside1, Avinash Chikatamarla2, Bernadette Mah2, Tamara Munzner3, Karsten Hokamp4, Robert E W Hancock2 & Fiona S L Brinkman1
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Centre for Microbial Diseases and Immunity Research, University of British Columbia, Vancouver, British Columbia, Canada
- Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
Correspondence to: David J Lynn1,2 Department of Molecular Biology and Biochemistry, Room SSB 8166, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada V5A 1S6. Tel.: +778 782 2061; Fax: +778 782 5583; Email: dlynn@sfu.ca
Correspondence to: Fiona S L Brinkman1 Department of Molecular Biology and Biochemistry, Room SSB 8166, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada V5A 1S6. Tel.: +778 782 5646; Fax: 778 782 5583; Email: brinkman@sfu.ca
Received 21 May 2008; Accepted 17 July 2008; Published online 2 September 2008
Article highlights
- InnateDB is a molecular interaction and pathway database and analysis platform that has been developed to facilitate systems level analyses of the complex networks of pathways and interactions that govern the innate immune response, the wider immune system and the entire mammalian interactome.
- To date, more than 3,500 innate immunity-relevant interactions have been contextually annotated through the review of more than 1,000 publications.
- Integrated into InnateDB are novel bioinformatics resources, including network visualization software, pathway analysis, orthologous interaction network construction and the ability to overlay user-supplied gene expression data in an intuitively displayed molecular interaction network and pathway context. These resources will enable biologists without a computational background to explore their data in a more systems-oriented, yet user-friendly, manner.
Synopsis
The importance of the innate immune response has long been recognized in the front line of defense against invading pathogens. If not tightly regulated, however, an overwhelming immune response can lead to what is sometimes called a cytokine storm. One such out-of-control response, sepsis, results in more than 200 000 deaths a year in the United States alone (Angus et al, 2001). Over the course of the last decade, significant progress has been made in understanding the innate immune response, including the detailed dissection of some of the critical signaling pathways involved (Lang and Mansell, 2007; Matsukawa, 2007). Despite these efforts, many questions remain unanswered including how the innate immune system initiates distinct responses toward particular pathogens. It is becoming increasingly clear that the innate immune response does not involve simple linear pathways but rather complex networks of pathways and interactions, positive and negative feedback loops and multifaceted transcriptional responses (Tegner et al, 2006; Lee and Kim, 2007). To better understand the complexities of the innate immune response and the cross-talk between its components, complementary systems-level analyses and more focused follow-up experimental approaches are now needed.
Recently, researchers have started to apply systems biology approaches to the study of the immune system (Gilchrist et al, 2006; Tegner et al, 2006; Andersen et al, 2008) and bioinformatics resources are now emerging to aid these types of analyses. The major publicly available interaction and pathway databases have made enormous efforts to provide coverage that is as wide-ranging as possible (Salwinski et al, 2004; Alfarano et al, 2005; Joshi-Tope et al, 2005; Breitkreutz et al, 2007; Chatr-aryamontri et al, 2007; Kanehisa et al, 2007; Kerrien et al, 2007). Nevertheless, it was quickly apparent to us that currently available bioinformatics resources provided poor coverage and detail of the molecular interactions and pathways relevant to innate immunity, information that is essential for the systems-oriented interpretation of large-scale genomics data.
To overcome these problems and to provide a resource that will enable biologists without a computational background to explore their data in a more systems-oriented manner, we have developed InnateDB. InnateDB (www.innatedb.ca) is a publicly available database and analysis platform for the genes, proteins, experimentally verified interactions and pathways involved in the human and murine innate immune responses.
One of the primary goals of InnateDB is to provide a manually curated centralized resource for experimentally verified human and mouse protein, gene and RNA molecular interactions involved in the innate immune system. To do this, a dedicated full-time team of curators has been assembled to review the relevant biomedical literature and to submit detailed annotation on these interactions and pathways to InnateDB through customized submission system software. To date, more than 3500 innate immunity-relevant interactions, involving around 1000 genes, have been manually curated through the review of approximately 1000 publications. Only interactions with published direct experimental evidence of a physical or biochemical interaction are submitted to InnateDB. The importance of manual curation is clear, as we are often able to double the number of interactions for a given gene or protein compared to the number currently present in the other interaction databases combined. Furthermore, this detailed manual curation permitted us to richly annotate these interactions and to place them in their relevant context. Interaction data in InnateDB are also curated, stored and downloadable in the Proteomics Standards Initiative Molecular Interaction (PSI-MI) 2.5-compliant XML format (Hermjakob et al, 2004).
In addition to the detailed manual curation of the genes, proteins and their interactions and pathways that are specifically known to have a role in the innate immune response, InnateDB also incorporates data on the entire human and mouse interactomes. To do this, annotation on more than 100 000 human and mouse interactions was integrated from several of the major publicly available interaction databases into InnateDB (Figure 1). To enable the investigation of genes, proteins and their molecular interactions that are relevant to particular pathways, InnateDB also includes cross-references of genes not only to innate immunity-relevant pathways but also to more than 2500 pathways from several of the major publicly available pathway databases (Figure 1). Detailed gene and protein annotation has also been extracted from a variety of other data sources.
Figure 1
Integration of publicly available resources as a foundation for InnateDB. More than 100 000 interactions were integrated into InnateDB from the Molecular Interaction (MINT) database (Chatr-aryamontri et al, 2007); the IntAct database (Kerrien et al, 2007); the Database of Interacting Proteins (DIP) (Salwinski et al, 2004); the General Repository for Interaction Datasets (BioGRID) (Breitkreutz et al, 2007) and the Biomolecular Interaction Network Database (BIND) (Alfarano et al, 2005). Cross-references to more than 2500 pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database (Kanehisa et al, 2007), the NCI-Nature Pathway Interaction Database (PID) (http://pid.nci.nih.gov), Integrating Network Objects with Hierarchies (INOH) pathway database (http://www.inoh.org/), NetPath (http://www.netpath.org) and the Reactome database (Joshi-Tope et al, 2005) were also incorporated. Up-to-date versions of Ensembl (www.ensembl.org) provide details of human and mouse genes, transcripts and proteins along with rich protein and gene annotation from the Universal Protein Resource (UniProt) (The UniProt Consortium, 2007), Gene Ontology (Ashburner et al, 2000) and Entrez Gene (http://www.ncbi.nlm.nih.gov/).
Specific interactions, pathways and genes or proteins of interest can be interactively searched for in InnateDB through the flexible web-based search interface of the database, providing a knowledge base for the community, whereas the bioinformatics and network visualization tools incorporated into InnateDB elevate the system from database to robust analysis platform. InnateDB allows one to integrate quantitative data (such as differential gene expression) into a molecular interaction network and pathway context, enabling the interrogation of such data in novel and insightful ways. Investigating differentially expressed molecular interaction networks may identify subnetworks or as-yet unidentified pathways as being significantly involved in the response to a particular stimulus. By incorporating Cytoscape into InnateDB, investigators are able to take a closer look at the interactions involved in these pathways or subnetworks, potentially identifying cross-talk between key pathways, and highlighting the molecules that are the hubs of these networks. Our Cerebral plugin allows one to further extend this experience, visually interrogating quantitative data across multiple conditions in more biologically intuitive pathway-like layouts of networks, which are generated using subcellular localization information.
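The idea of restricting a network to differentially expressed molecules and then looking for hubs can be illustrated with a minimal sketch. It is not InnateDB or Cytoscape code: the function names, the toy interaction list and the use of node degree as the definition of a hub are all assumptions made for illustration.

```python
from collections import defaultdict


def build_network(interactions):
    """Return an undirected adjacency map from (a, b) interaction pairs."""
    neighbors = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-interactions when counting degree
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def de_subnetwork(interactions, de_genes):
    """Keep only interactions whose two partners are both differentially expressed."""
    de = set(de_genes)
    return [(a, b) for a, b in interactions if a in de and b in de]


def rank_hubs(interactions, top=3):
    """Rank molecules by degree (number of distinct partners), highest first."""
    network = build_network(interactions)
    ranked = sorted(
        ((gene, len(partners)) for gene, partners in network.items()),
        key=lambda item: (-item[1], item[0]),
    )
    return ranked[:top]


# Toy data for illustration only; these are not records exported from InnateDB.
toy_interactions = [
    ("TLR4", "MYD88"),
    ("MYD88", "IRAK4"),
    ("MYD88", "IRAK1"),
    ("IRAK4", "IRAK1"),
    ("IRAK1", "TRAF6"),
    ("TRAF6", "GENE_X"),  # GENE_X is a hypothetical placeholder
]
toy_de_genes = {"TLR4", "MYD88", "IRAK4", "IRAK1", "TRAF6"}

toy_sub = de_subnetwork(toy_interactions, toy_de_genes)
toy_hubs = rank_hubs(toy_sub, top=2)
print(toy_hubs)
```

Degree is only one of several possible hub measures; a weighted or centrality-based ranking could be substituted without changing the overall shape of the sketch.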
Integrated pathway over-representation analysis can identify those pathways that are significantly associated with differentially regulated genes, highlighting those pathways that are significantly altered in their gene expression. Through such pathway analysis, it is possible to identify common pathways that are involved in the innate immune response to particular infections, and to identify the common central regulators of these pathways as attractive targets for immune modulation (Figure 4).
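As a rough illustration of what a pathway over-representation analysis computes, the sketch below scores each pathway with a one-sided hypergeometric test. This synopsis does not state which statistical test or multiple-testing correction InnateDB applies, so the choice of test, the function names and the toy gene sets are assumptions, and no correction for multiple testing is performed.

```python
from math import comb


def hypergeom_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which belong to the pathway."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def over_representation(de_genes, pathways, background):
    """Return {pathway name: (overlap size, p-value)} for each pathway."""
    background = set(background)
    de = set(de_genes) & background
    results = {}
    for name, members in pathways.items():
        members = set(members) & background
        overlap = de & members
        p_value = hypergeom_tail(len(overlap), len(background), len(members), len(de))
        results[name] = (len(overlap), p_value)
    return results


# Toy data for illustration only; pathway names and GENE_* identifiers are hypothetical.
toy_pathways = {
    "toy TLR pathway": {"TLR4", "MYD88", "IRAK4", "TRAF6"},
    "toy unrelated pathway": {"GENE_1", "GENE_2", "GENE_3"},
}
toy_background = {"TLR4", "MYD88", "IRAK4", "TRAF6"} | {f"GENE_{i}" for i in range(16)}
toy_de = {"TLR4", "MYD88", "IRAK4", "GENE_0"}

toy_results = over_representation(toy_de, toy_pathways, toy_background)
for name, (overlap, p_value) in sorted(toy_results.items(), key=lambda kv: kv[1][1]):
    print(f"{name}: overlap={overlap}, p={p_value:.4f}")
```

In practice the p-values from many pathways would be adjusted for multiple testing before any pathway is called significantly over-represented.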
Figure 4
Cerebral enables the overlay of quantitative data in a molecular interaction network context. To facilitate the side-by-side comparison of specific experimental conditions, Cerebral uses a series of small, linked views to visualize quantitative data (gene expression values, for example) across multiple conditions simultaneously, alongside a larger central window, which permits more detailed investigation of particular conditions or regions of the network. Nodes (i.e. molecules) are colored according to user-defined thresholds of fold change in gene expression (red, significantly upregulated; green, significantly downregulated). The small multiple windows and the larger overview window are linked: if one zooms in or out, pans, mouses over or selects a node in one of the views, the same action is propagated across all views. Selecting one of the small multiple windows promotes the expression values for that condition to the larger overview window. If two small multiple windows are selected simultaneously, the difference in gene expression between the two conditions is computed and displayed in the main window. In this view, the nodes in the main window are colored according to the magnitude of the change between the two conditions. InnateDB uses the number of pieces of evidence supporting an interaction, usually separate publications or experiments, as a measure of confidence in the interaction. Cerebral uses weighted lines in its display to represent these confidence scores (heavier lines indicate higher confidence). By right-clicking on a node or edge, one may interactively link to the relevant pages in InnateDB for more detailed annotation regarding the gene, protein or interaction of interest.
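The display rules described in this legend can be sketched in a few lines. This is not Cerebral's implementation (Cerebral is a Cytoscape plugin); it is a minimal sketch in which the input values are assumed to be log2 fold changes, and the default thresholds, the neutral "grey" color, the line-width scaling and the publication identifiers are all hypothetical.

```python
from collections import defaultdict


def node_color(log2_fc, up=1.0, down=-1.0):
    """Color a node by user-defined fold-change thresholds (assumed log2 scale)."""
    if log2_fc >= up:
        return "red"    # significantly upregulated
    if log2_fc <= down:
        return "green"  # significantly downregulated
    return "grey"       # assumed neutral color for genes within the thresholds


def condition_difference(cond_a, cond_b):
    """Per-gene difference (b minus a) for genes measured in both conditions."""
    return {gene: cond_b[gene] - cond_a[gene] for gene in cond_a.keys() & cond_b.keys()}


def evidence_counts(records):
    """Count distinct pieces of evidence per undirected interaction.

    records: iterable of (molecule_a, molecule_b, evidence_id) tuples.
    """
    support = defaultdict(set)
    for a, b, evidence_id in records:
        support[frozenset((a, b))].add(evidence_id)
    return {pair: len(ids) for pair, ids in support.items()}


def edge_width(evidence_count, base=1.0, step=0.5, max_width=4.0):
    """Map an evidence count to a line width; heavier lines mean higher confidence."""
    return min(base + step * (evidence_count - 1), max_width)


# Toy records for illustration only; "pub1" and "pub2" are placeholder identifiers.
toy_records = [
    ("TLR4", "MYD88", "pub1"),
    ("MYD88", "TLR4", "pub2"),  # same interaction reported in a second publication
    ("MYD88", "IRAK4", "pub1"),
]
toy_counts = evidence_counts(toy_records)
for pair, count in toy_counts.items():
    print(sorted(pair), count, edge_width(count))
```

Capping the line width keeps heavily studied interactions from visually dominating the network; the cap and step size here are arbitrary and would be tuned to the display.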
InnateDB, along with other emerging resources for bioinformatics and systems-level analysis of immunology (Kelley et al, 2005; Ortutay and Vihinen, 2006; Hijikata et al, 2007; Korb et al, 2008), will undoubtedly lead to novel and much deeper insights into the innate immune response to particular pathogens.
Acknowledgements
We thank Kathleen Wee, Eddie Yuen, Patrick Taylor, Sheena Tam, Tom Yang and other members of the Pathogenomics of Innate Immunity project for their assistance in manual curation of InnateDB. InnateDB has been funded by Genome Canada and Genome BC through the Pathogenomics of Innate Immunity (PI2) project and by the Foundation for the National Institute of Health and the Canadian Institutes of Health Research (CIHR) under the Grand Challenges in Global Health Research Initiative (Grand Challenges ID: 419). DJL and JLG hold Postdoctoral Trainee Awards from the Michael Smith Foundation for Health Research (MSFHR), and JLG also holds a Sanofi Pasteur CIHR fellowship. MDW holds a Junior Graduate Studentship Award from the MSFHR. FSLB is a CIHR New Investigator and an MSFHR Senior Scholar. REWH holds a Canada Research Chair (CRC). We also thank the various interaction, pathway and annotation databases that have been integrated into InnateDB for freely providing their data to the public.
References
- Alfarano C, Andrade CE, Anthony K, Bahroos N, Bajec M, Bantoft K, Betel D, Bobechko B, Boutilier K, Burgess E, Buzadzija K, Cavero R, D'Abreo C, Donaldson I, Dorairajoo D, Dumontier MJ, Dumontier MR, Earles V, Farrall R, Feldman H et al (2005) The Biomolecular Interaction Network Database and related tools 2005 update. Nucleic Acids Res 33: D418–D424
- Andersen J, VanScoy S, Cheng TF, Gomez D, Reich NC (2008) IRF-3-dependent and augmented target genes during viral infection. Genes Immun 9: 168–175
- Angus DC, Linde-Zwirble WT, Lidicker J, Clermont G, Carcillo J, Pinsky MR (2001) Epidemiology of severe sepsis in the United States: analysis of incidence, outcome, and associated costs of care. Crit Care Med 29: 1303–1310
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G (2000) Gene Ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25: 25–29
- Breitkreutz BJ, Stark C, Reguly T, Boucher L, Breitkreutz A, Livstone M, Oughtred R, Lackner DH, Bahler J, Wood V, Dolinski K, Tyers M (2007) The BioGRID Interaction Database: 2008 update. Nucleic Acids Res 36: D637–D640
- Chatr-aryamontri A, Ceol A, Palazzi LM, Nardelli G, Schneider MV, Castagnoli L, Cesareni G (2007) MINT: the Molecular INTeraction database. Nucleic Acids Res 35: D572–D574
- Gilchrist M, Thorsson V, Li B, Rust AG, Korb M, Kennedy K, Hai T, Bolouri H, Aderem A (2006) Systems biology approaches identify ATF3 as a negative regulator of Toll-like receptor 4. Nature 441: 173–178
- Hermjakob H, Montecchi-Palazzi L, Bader G, Wojcik J, Salwinski L, Ceol A, Moore S, Orchard S, Sarkans U, von Mering C, Roechert B, Poux S, Jung E, Mersch H, Kersey P, Lappe M, Li Y, Zeng R, Rana D, Nikolski M et al (2004) The HUPO PSI's molecular interaction format - a community standard for the representation of protein interaction data. Nat Biotechnol 22: 177–183
- Hijikata A, Kitamura H, Kimura Y, Yokoyama R, Aiba Y, Bao Y, Fujita S, Hase K, Hori S, Ishii Y, Kanagawa O, Kawamoto H, Kawano K, Koseki H, Kubo M, Kurita-Miki A, Kurosaki T, Masuda K, Nakata M, Oboki K et al (2007) Construction of an open-access database that integrates cross-reference information from the transcriptome and proteome of immune cells. Bioinformatics 23: 2934–2941
- Joshi-Tope G, Gillespie M, Vastrik I, D'Eustachio P, Schmidt E, de Bono B, Jassal B, Gopinath GR, Wu GR, Matthews L, Lewis S, Birney E, Stein L (2005) Reactome: a knowledgebase of biological pathways. Nucleic Acids Res 33: D428–D432
- Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, Katayama T, Kawashima S, Okuda S, Tokimatsu T, Yamanishi Y (2007) KEGG for linking genomes to life and the environment. Nucleic Acids Res 36: D480–D484
- Kelley J, de Bono B, Trowsdale J (2005) IRIS: a database surveying known human immune system genes. Genomics 85: 503–511
- Kerrien S, Alam-Faruque Y, Aranda B, Bancarz I, Bridge A, Derow C, Dimmer E, Feuermann M, Friedrichsen A, Huntley R, Kohler C, Khadake J, Leroy C, Liban A, Lieftink C, Montecchi-Palazzi L, Orchard S, Risse J, Robbe K, Roechert B et al (2007) IntAct - open source resource for molecular interaction data. Nucleic Acids Res 35: D561–D565
- Korb M, Rust AG, Thorsson V, Battail C, Li B, Hwang D, Kennedy KA, Roach J, Rosenberger CM, Gilchrist M, Zak D, Johnson C, Marzolf B, Aderem A, Shmulevich I, Bolouri H (2008) The Innate Immune Database (IIDB). BMC Immunol 9: 7
- Lang T, Mansell A (2007) The negative regulation of Toll-like receptor and associated pathways. Immunol Cell Biol 85: 425–434
- Lee MS, Kim YJ (2007) Signaling pathways downstream of pattern-recognition receptors and their cross talk. Annu Rev Biochem 76: 447–480
- Matsukawa A (2007) STAT proteins in innate immunity during sepsis: lessons from gene knockout mice. Acta Med Okayama 61: 239–245
- Ortutay C, Vihinen M (2006) Immunome: a reference set of genes and proteins for systems biology of the human immune system. Cell Immunol 244: 87–89
- Salwinski L, Miller CS, Smith AJ, Pettit FK, Bowie JU, Eisenberg D (2004) The Database of Interacting Proteins: 2004 update. Nucleic Acids Res 32: D449–D451
- Tegner J, Nilsson R, Bajic VB, Bjorkegren J, Ravasi T (2006) Systems biology of innate immunity. Cell Immunol 244: 105–109
- The UniProt Consortium (2007) The Universal Protein Resource (UniProt). Nucleic Acids Res 36: D190–D195


