Genome-scale gene/reaction essentiality and synthetic lethality analysis
There is a Report associated with this document.
Patrick F Suthers1,a, Alireza Zomorrodi1,a & Costas D Maranas1
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
Correspondence to: Costas D Maranas1 Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802, USA. Tel.: +1 814 863 9958; Fax: +1 814 865 7846; Email: costas@psu.edu
Received 23 December 2008; Accepted 8 July 2009; Published online 18 August 2009
aThese authors contributed equally to this work
Article highlights
- Customized algorithms enable in silico identification of different orders of synthetic lethals at both the gene and reaction level.
- We present a comprehensive map of gene and reaction synthetic lethals up to order three and identify some higher-order synthetic lethals for the latest genome-scale metabolic network of E. coli, iAF1260.
- Graph representation of synthetic lethals reveals a number of topological motifs providing insights into complex patterns of gene and reaction utilizations.
- We propose model correction strategies based on the mismatches between in vivo and in silico results.
Synopsis
Synthetic lethals (SL) refer to pairs of non-essential genes whose simultaneous deletion is lethal (Novick et al, 1989; Guarente, 1993). The study of synthetic lethality plays a pivotal role in elucidating functional associations between genes and in predicting gene function (Ooi et al, 2006). One can extend the concept of synthetic lethality by considering gene groups of increasing size where only the simultaneous elimination of all genes is lethal whereas individual gene deletions are not. The availability of genome-scale metabolic models of organisms has provided the foundation for the development of computational frameworks to rapidly predict the effect of multiple genetic manipulations on the strain growth phenotype in different media. The majority of in vivo and in silico studies have concentrated on perturbing or deleting a single gene or a gene pair at a time. Thus, these analyses might fail to assess the full range of robustness and functional organization of metabolic networks afforded by higher-order interactions and redundancies. Extending the concept of lethality beyond gene pairs to triples, quadruples, etc. can capture multi-gene/reaction interdependencies. The challenge in exhaustively identifying higher-order SLs lies in the combinatorial complexity of the underlying mathematical problem. We made this computationally intensive task tractable by developing an efficient procedure based on bilevel optimization.
This framework is applied to the iAF1260 model of E. coli K12 (Feist et al, 2007) for aerobic growth on minimal glucose medium. We contrast the predicted SLs against experimental data and provide a number of model refinement possibilities. We elucidate all SL gene and reaction triples. We also introduce the concept of degree of essentiality to unravel the contribution of each reaction in "buffering" cellular functionalities. This study provides a complete analysis of gene and reaction essentiality and lethality for the latest E. coli model iAF1260 and provides the computational means for performing similar analyses for other genome-scale models. Furthermore, by exhaustively elucidating all model growth predictions in response to multiple gene knockouts, it provides a many-fold increase in the number of genetic perturbations that can be used to assess the performance of in silico metabolic models.
We identified 83 genes and 4 non-gene associated reactions involved in 86 SL pairs (0.01% of total possible pairs) as shown in Figure 1. All these SL pairs were analyzed in detail in terms of their phenotypic, topological and functional impact. Of the 86 predicted SL pairs, 53 (62%) were found to yield auxotrophic strains in silico whose growth can be restored through supplementation. By representing all genes forming SL pairs as nodes connected by an edge, a variety of different topological motifs emerge (see Figure 1). These include disjoint pairs, stars and highly connected subgraphs.
Figure 1
Topological and functional classification of clusters of SL gene pairs. Three types of network motifs are present: disjoint pairs (left); stars, or 1-connected motifs (center); and highly connected subgraphs, or k-connected motifs (right). Genes are color-coded in accordance to the COG (Tatusov et al, 2003) functional categorization. Names of genes are set in italics and the names of non-gene associated reactions are set in roman. Note that all the reaction abbreviations follow those in iAF1260 (Feist et al, 2007).
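The grouping of SL pairs into motifs can be illustrated with a small graph routine. The following is a minimal sketch, not the procedure used in the study: it treats each SL pair as an undirected edge, finds connected components, and labels each one. The gene names (g1 to g9), the function name and the exact labelling rules are assumptions made for illustration; in particular, a cluster is called "highly connected" here only if every member has at least two SL partners, and anything else falls into a catch-all "other" label.

```python
from collections import defaultdict

def classify_sl_clusters(sl_pairs):
    """Group SL pairs into connected clusters and label each cluster."""
    # Build an undirected adjacency map: gene -> set of SL partners.
    adj = defaultdict(set)
    for u, v in sl_pairs:
        adj[u].add(v)
        adj[v].add(u)

    seen = set()
    clusters = []
    for start in sorted(adj):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp

        n = len(comp)
        degrees = sorted(len(adj[g]) for g in comp)
        if n == 2:
            label = "disjoint pair"
        elif degrees == [1] * (n - 1) + [n - 1]:
            # One hub paired with every other gene, which have no other partner.
            label = "star"
        elif degrees[0] >= 2:
            # Assumed criterion: every gene has at least two SL partners.
            label = "highly connected"
        else:
            label = "other"
        clusters.append((sorted(comp), label))
    return clusters

# Hypothetical SL pairs (placeholder names, not genes from iAF1260).
EXAMPLE_PAIRS = [
    ("g1", "g2"),
    ("g3", "g4"), ("g3", "g5"), ("g3", "g6"),
    ("g7", "g8"), ("g8", "g9"), ("g7", "g9"),
]
EXAMPLE_CLUSTERS = classify_sl_clusters(EXAMPLE_PAIRS)
```

With these placeholder pairs, the routine reports one disjoint pair, one star centered on g3, and one highly connected triangle.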
We investigated the membership of SL gene pairs in the clusters of orthologous groups (COG) ontology (Tatusov et al, 2003), as illustrated using different colors in Figure 1. It has been previously noted that two functionally distant genes can cause synthetic lethality because a gene deletion not only causes the loss of the primary function but also creates a cascade of compensatory cellular responses possibly affecting many pathways (Schoner et al, 2008). These inter-category connections are thus indicative of the need to bring to bear different parts of metabolism to enable the production of all biomass precursors.
We searched for experimental evidence to examine the validity of the in silico predicted SL pairs. Explicit experimental evidence was found in the literature for eleven such SLs. All of these SLs could be rescued by nutrient supplementation: five with amino acids alone, five with other metabolites, and one with a combination of amino acids and other nutrients.
Comparisons of in silico predictions and in vivo observations for single gene essentiality data (Kumar and Maranas, 2009) have previously been used to drive the process of metabolic model refinement (Becker and Palsson, 2008). Extending this workflow to include SL pairs, triples, etc. provides additional layers of model validation and opportunities for correction. We identified 27 in silico SLs that are inconsistent with in vivo SL data in two different ways. The first group includes predicted SLs that contain one or more essential genes, whereas the second contains predicted SLs that are in agreement with in vivo SL data but imply incorrect supplementation rescue (i.e. auxotrophy) scenarios. Using these results, we suggested 18 iAF1260 model modifications.
The concept of synthetic (pair) lethality can be extended to SL triples, where the simultaneous deletion of three genes is lethal. When searching for SL triples, all essential genes and SL pairs are excluded from consideration to eliminate trivial results. We identified 193 SL gene triples involving 114 genes and 15 non-gene associated reactions. Analyzing reactions, we found 96 SL reaction pairs and a total of 243 SL reaction triples involving 163 reactions. The extent of participation in SLs varies widely among reactions. Notably, TPI (triose-phosphate isomerase) participates in the most triples, with membership in 35 different SL triples.
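The exclusion of essential genes and SL pairs when searching for SL triples can be sketched with a brute-force enumeration. This is only an illustration of the definition and is not the bilevel optimization procedure used in the study; exhaustive enumeration of this kind scales combinatorially and would be impractical for a genome-scale model. The toy viability rule, the gene names (a to g) and the names `FUNCTION_GROUPS`, `is_viable` and `minimal_lethal_sets` are all hypothetical. In a real workflow, `is_viable` would be replaced by a growth simulation on the metabolic model.

```python
from itertools import combinations

# Hypothetical toy network: each group lists genes that can each supply one
# required function; the cell is non-viable if any group is fully deleted.
FUNCTION_GROUPS = [{"a"}, {"b", "c"}, {"c", "d"}, {"e", "f", "g"}]
GENES = set().union(*FUNCTION_GROUPS)

def is_viable(deleted):
    """Toy stand-in for a growth simulation: True if every function survives."""
    return all(group - deleted for group in FUNCTION_GROUPS)

def minimal_lethal_sets(genes, viable, max_order=3):
    """Enumerate minimal lethal deletion sets of size 1..max_order.

    Returns a dict mapping each order to a list of frozensets.
    """
    found = []       # minimal lethal sets of all lower orders
    by_order = {}
    for k in range(1, max_order + 1):
        current = []
        for combo in combinations(sorted(genes), k):
            candidate = frozenset(combo)
            # Skip supersets of a smaller lethal set: they are trivially
            # lethal and carry no new information.
            if any(prev <= candidate for prev in found):
                continue
            if not viable(candidate):
                current.append(candidate)
        found.extend(current)
        by_order[k] = current
    return by_order

EXAMPLE_SLS = minimal_lethal_sets(GENES, is_viable)
```

In this toy case the essential gene is a, the SL pairs are {b, c} and {c, d}, and the only SL triple is {e, f, g}. Triples such as {b, c, e} are never tested because they already contain a lethal pair.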
To quantify the degree of dispensability of a gene or reaction in a metabolic network with respect to biomass formation, we introduce the concept of degree of essentiality (DOE). This metric is defined as the size of the smallest SL that the gene or reaction is a member of. Therefore, essential genes or reactions have a DOE of one, while genes or reactions that participate in SL pairs (and perhaps in higher-order SLs) have a DOE of two. We determined DOE values up to three for all genes and reactions, and up to four for all reactions of central metabolism active under aerobic glucose conditions (see Figure 8). The majority of reactions in central metabolism have a DOE greater than one. This is most likely due to the presence of multiple diverging and converging branches in pathways of central metabolism. It is important to note that reactions operating in opposite directions can have different DOEs. Examples include the reaction pairs FBP and PFK as well as PPC and PPCK.
Figure 8
Color-coded representation of the reactions in central metabolism according to their degree of essentiality.
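The definition of DOE as the size of the smallest SL containing a gene or reaction translates directly into a few lines of code. The following is a minimal sketch assuming the lethal sets have already been identified; the function name and the reaction labels (r1 to r7) are placeholders rather than reactions from iAF1260.

```python
def degree_of_essentiality(lethal_sets):
    """Map each member to the size of the smallest lethal set containing it."""
    doe = {}
    for lethal_set in lethal_sets:
        size = len(lethal_set)
        for member in lethal_set:
            # Keep the smallest set size seen so far for this member.
            doe[member] = min(doe.get(member, size), size)
    return doe

# Hypothetical lethal sets: one essential reaction, one pair, two triples.
EXAMPLE_LETHAL_SETS = [
    {"r1"},
    {"r2", "r3"},
    {"r3", "r4", "r5"},
    {"r2", "r6", "r7"},
]
EXAMPLE_DOE = degree_of_essentiality(EXAMPLE_LETHAL_SETS)
```

Here r1 has a DOE of one, r2 and r3 have a DOE of two even though they also appear in triples, and the remaining reactions have a DOE of three. A gene or reaction absent from the result belongs to no lethal set up to the highest order searched, so its DOE can only be bounded from below.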
References
- Becker SA, Palsson BO (2008) Three factors underlying incorrect in silico predictions of essential metabolic genes. BMC Syst Biol 2: 14
- Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, Broadbelt LJ, Hatzimanikatis V, Palsson BO (2007) A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol 3: 121
- Guarente L (1993) Synthetic enhancement in gene interaction: a genetic tool come of age. Trends Genet 9: 362–366
- Kumar VS, Maranas CD (2009) GrowMatch: an automated method for reconciling in silico/in vivo growth predictions. PLoS Comput Biol 5: e1000308
- Novick P, Osmond BC, Botstein D (1989) Suppressors of yeast actin mutations. Genetics 121: 659–674
- Ooi SL, Pan X, Peyser BD, Ye P, Meluh PB, Yuan DS, Irizarry RA, Bader JS, Spencer FA, Boeke JD (2006) Global synthetic-lethality analysis and yeast functional profiling. Trends Genet 22: 56–63
- Schoner D, Kalisch M, Leisner C, Meier L, Sohrmann M, Faty M, Barral Y, Peter M, Gruissem W, Buhlmann P (2008) Annotating novel genes by integrating synthetic lethals and genomic information. BMC Syst Biol 2: 3
- Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, Smirnov S, Sverdlov AV, Vasudevan S, Wolf YI, Yin JJ, Natale DA (2003) The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4: 41


