An in-silico approach to predict and exploit synthetic lethality in cancer metabolism

Synthetic lethality is a promising concept in cancer research, potentially opening new possibilities for the development of more effective and selective treatments. Here, we present a computational method to predict and exploit synthetic lethality in cancer metabolism. Our approach relies on the concept of genetic minimal cut sets and gene expression data, demonstrating a superior performance to previous approaches predicting metabolic vulnerabilities in cancer. Our genetic minimal cut set computational framework is applied to evaluate the lethality of ribonucleotide reductase catalytic subunit M1 (RRM1) inhibition in multiple myeloma. We present a computational and experimental study of the effect of RRM1 inhibition in four multiple myeloma cell lines. In addition, using publicly available genome-scale loss-of-function screens, a possible mechanism by which the inhibition of RRM1 is effective in cancer is established. Overall, our approach shows promising results and lays the foundation to build a novel family of algorithms to target metabolism in cancer.


GIMME 4 and iMAT
. For illustration, we will evaluate the essentiality of r 2 in the resulting contextualized networks from these two approaches.
iMAT. In this method a particular reaction is removed from the reference metabolic network if, when it is blocked, its consistency with gene expression data is strictly higher than when it is forced to be active. To measure the consistency with gene expression data, iMAT gives the same weight to include a reaction in H (subset of highly expressed reactions) as to exclude a reaction in L.
For instance, if we delete r 4 , the maximum consistency score with gene expression data would be 6 (agreement with r 1 , r 2 , r 3 , r 4 , r 6 , r 7 ), while, if we activate r 4 , this score would be 5 (agreement with r 1 , r 2 , r 3 , r 6 , r 7 ). In light of this, r 4 is excluded from the reconstruction. The same procedure is applied to each reaction. If both scores obtain the same result, the reaction is included in the reconstruction.
When applying iMAT to our toy example, we obtain the sub-network shown in Supplementary Figure 1c. The essentiality of r 2 is not predicted following the iMAT algorithm, since it has an escape pathway through r 3 , r 5 , r 6 .
Note here that, in order to evaluate the performance of iMAT in the Results section of the main text, given its high computational demand, we had to introduce modifications with respect to the original version of the iMAT presented in Shlomi et al. 3 (see Supplementary Note 2). GIMME. This approach first calculates an inconsistency score for each reaction in L, as a function of the difference between its expression level and the threshold which determines if a reaction is expressed or not. As reactions in L are not supposed to take part in the reconstruction, this algorithm includes all reactions in and some reactions in L which minimize the sum of inconsistency scores. This minimization problem must satisfy steady-state condition, thermodynamic constraints and biomass production.
Supplementary Figure 1d shows the reconstructed network obtained following GIMME.
This method does predict the essential role of r 2 , that is, there are not alternative pathways to reach r 8 after knocking out r 2 . Nevertheless, GIMME is not able to explain the reason why r 2 is essential, since when the reconstruction is conducted, r 4 and r 5 are removed from the solution network and, therefore, this information is lost.
Note here that, in order to evaluate the performance of GIMME in the Results section of the main text, we implemented the algorithm presented in Becker et al. 4 . As done with our gMCS approach, we used the Gene Expression Barcode 3.0 (ref. 5) to obtain the set lowly expressed genes, L.

Extending MCSs at the gene level (gMCSs).
In contrast with existing methods for MCS computation 2,6 , we extend the analysis to the gene level and determine gMCSs. It is important to emphasize that the subset of genes associated with the reactions involved in a particular MCS, determined using Gene-Protein-Reaction (GPR) rules, does not necessarily constitute a minimal knockout strategy. This is due to the fact that GPR rules are not always trivial (one-to-one association) and may involve complex relationships. In Recon2.v04 (ref. 7), for instance, this is the case for 88% of genes included. For illustration, assume that we are concerned in finding gMCSs involving g 2 for the toy metabolic network in Supplementary Figure 1a in a slightly more complex GPR rules scenario ( Supplementary Fig. 1e). In this case, g 2 is only related to r 2 , which can be catalyzed by one additional enzyme encoded by g 3 ; the rest of reactions are catalyzed by only one enzyme. In order to delete r 2 (the only potential effect over the network of knocking out g 2 ), we need to suppress g 2 and g 3 simultaneously and, when this is achieved, r 3 is indirectly deleted. As g 2 is necessarily coupled to g 3 to have any effect and they form a synthetic lethal, the knockout of g 4 , g 5 or g 6 is not necessary any more to disrupt r 8 . Therefore, if we obtain the genes associated to MCS 1 -MCS 3 using GPR rules, the only true minimal gene knockout solution would be gMCS 1 ={g 2 , g 3 } (Supplementary Fig.   1f). As g 3 is not included in the L set, our approach would lead to an 'infeasible' problem, i.e. with no solution. We conclude that g 2 is not essential for the activation of Subject to: , where H and L represents the subset of highly and lowly expressed reactions, 7), this approach is prohibitive in terms of computation time, as we need to solve at least 2*n MILPs of similar complexity as the one shown above for each sample. To overcome this issue, we carried out the following implementation: 1) Solve the MILP shown above (equations (S1)-(S10)), extract the value of fluxes in the solution found (henceforth denoted as u) and include non-zero fluxes in the output reconstruction. These active reactions must be part of the reconstruction since no better objective value can be found and, therefore, if they are knocked out, the objective value will be less or equal than the current one.
2) Evaluate whether other reactions without expression data available (set E), currently not part in the reconstruction, can be included. To that end, we force the fluxes in the same direction found as in the previous solution (u), force to zero fluxes in H and L inactive in the previous solution and maximizes the number of reactions in E, which leads to the following MILP: Subject to: (S12) (S13) (S14) (S15) With this second step, we keep the objective value found in Step 1 (given by reactions in H and L) and identify alternative pathways through reactions in E. This approach constitutes a computationally tractable approximation to iMAT. In terms of gene essentiality analysis, this approximation results in a best-case scenario, as we may have additional reactions in H and L that could be part of the reconstruction, which add new escape pathways and, therefore, would reduce the list of essential genes. In other words, with this implementation, a predicted non-essential gene is certainly non-essential in iMAT; however, a predicted essential gene could be non-essential in iMAT.  11 , we used a conservative strategy and decided to keep these reactions with unknown GPR rules. Note that if these reactions were deleted from the reference metabolic network, RRM1 becomes an essential gene for any type of cell, which is not in consonance with functional studies of RRM1 silencing 12 .

Supplementary Note 3. GPR rules for
Supplementary Note 4. G matrix. As noted in the main text, we introduce the binary g x n matrix G, which defines for each row the set of blocked reactions arising from the knockout of a particular subset of genes in L. Genes associated with each row in G must be functionally interrelated and their simultaneous knockout is required to delete at least one of the reactions in the metabolic network. [ ] The naïve approach considered above which calculates all the possible combinations of genes in L is not computationally tractable. To overcome this issue, we have only calculated the combinations of genes in L related to each lowly expressed reaction. In the example discussed above, r 1 , r 2 , r 3 , r 5 and r 6 are lowly expressed (colored blue), so the combinations of genes to study will be: g 1 related to r 1 and r 2 ; g 2 related to r 5 ; g 4 related to r 6 ; and {g 1 , g 2 } related to r 3 .
Note that all combinations containing g 3 have not been included because it is not lowly expressed. As a consequence, following this method we obtain the same G matrix in a more straightforward way.
The G matrix is used in our algorithm to define the potential list of reaction knockouts arising from the combination of genes in L (see equation (4)