Integration of machine learning and genome-scale metabolic modeling identifies multi-omics biomarkers for radiation resistance

Resistance to ionizing radiation, a first-line therapy for many cancers, is a major clinical challenge. Personalized prediction of tumor radiosensitivity is not currently implemented clinically due to insufficient accuracy of existing machine learning classifiers. Despite the acknowledged role of tumor metabolism in radiation response, metabolomics data is rarely collected in large multi-omics initiatives such as The Cancer Genome Atlas (TCGA) and consequently omitted from algorithm development. In this study, we circumvent the paucity of personalized metabolomics information by characterizing 915 TCGA patient tumors with genome-scale metabolic Flux Balance Analysis models generated from transcriptomic and genomic datasets. Metabolic biomarkers differentiating radiation-sensitive and -resistant tumors are predicted and experimentally validated, enabling integration of metabolic features with other multi-omics datasets into ensemble-based machine learning classifiers for radiation response. These multi-omics classifiers show improved classification accuracy, identify clinical patient subgroups, and demonstrate the utility of personalized blood-based metabolic biomarkers for radiation sensitivity. The integration of machine learning with genome-scale metabolic modeling represents a significant methodological advancement for identifying prognostic metabolite biomarkers and predicting radiosensitivity for individual patients.

D espite being one of the oldest forms of cancer therapy and still a primary treatment modality, radiation therapy is not effective for over one-fifth of cancer patients distributed across almost all cancer types 1,2 . While biological understanding of radiation resistance has been advanced, use of a priori prediction of radiation response for individual cancer patients is not yet implemented clinically 3 . Early studies that identified biomarkers for radiation response focused on tumor histology, clinical factors including staging and Karnofsky performance score, and physiological parameters such as tumor oxygenation status [4][5][6] . As methods for transcriptomic analysis have improved, gene expressionbased classifiers for radiation response have proliferated (recently curated in the RadiationGeneSigDB database) 7 . To date, however, these radiation response classifiers do not integrate multiple -omics modalities, owing in part to a lack of available -omics datasets for individual patient tumors. Specifically, while genomic and transcriptomic data are becoming more widely available for large numbers of patient tumors through initiatives such as The Cancer Genome Atlas (TCGA), metabolomic data associated with tumor biobanks are rarely captured, limiting inclusion of tumor metabolic features in predictive models for radiation therapy response 2 .
Given the lack of available tumor metabolomic data, genome-scale metabolic modeling approaches such as flux balance analysis (FBA) are becoming increasingly popular for predicting metabolic phenotypes 8,9 . By combining a curated reconstruction of the human metabolic network with constraints on metabolic reaction activities and an objective function to maximize a particular metabolic phenotype, predictions of steady-state reaction fluxes or metabolite production rates under physiological constraints can be obtained at a genome scale 10 . We previously developed a bioinformatics pipeline for integrating genomic, transcriptomic, kinetic, and thermodynamic parameters into personalized FBA models of 716 radiation-sensitive and 199 radiation-resistant patient tumors from TCGA across multiple cancer types 11 . Using these metabolic models, we identified differences in redox metabolism between radiation-sensitive and -resistant tumors, as well as personalized gene targets for inhibiting antioxidant production and clearance of reactive oxygen species. By validating model predictions using a panel of matched radiationsensitive and -resistant cancer cell lines, we demonstrated that genome-scale metabolic models provide accurate predictions of tumor metabolism and can identify diagnostic and therapeutic biomarkers for radiation response.
While machine learning methods have been previously combined with genome-scale metabolic models to improve prediction of metabolic phenotypes, most studies combining these two methodologies have focused on microbiological applications rather than applications to cancer metabolism or predicting treatment outcomes [12][13][14] . We hypothesize that predictions from genome-scale metabolic models of patient tumors would provide additional information for distinguishing pathophysiological differences between radiation-sensitive and -resistant tumors, as well as for prediction of radiation response.
In this work, we utilize personalized FBA models of TCGA patient tumors to predict genome-scale metabolite production rates for incorporation into machine learning classifiers and identification of metabolite biomarkers associated with radiation resistance. In addition, through integration with clinical, genomic, and transcriptomic datasets, we develop gene expression, multiomics, and non-invasive classifiers that outperform previous predictors of radiation response, as well as provide personalized diagnostic biomarker panels for individual patient tumors.

Results
Gene expression classifier implicates cellular metabolism. Because the majority of previously developed classifiers for radiation response are based on gene expression data (curated in the RadiationGeneSigDB database), we first developed a machine learning classifier utilizing transcriptomic data from 716 radiation-sensitive and 199 radiation-resistant TCGA tumors to compare predictive accuracy and identified gene sets (Supplementary Data 1) 7 . A gradient boosting machine (GBM) algorithm was used with Bayesian optimization for determining optimal hyperparameter values, providing optimal performance accuracy on TCGA datasets ( Supplementary Fig. 1). The significance of individual gene expression features toward the GBM classifier was determined by calculating the mean absolute Shapley Additive Explanations (SHAP) value of the feature across all samples (mean |ΔP|), where each SHAP value (ΔP) represents the contribution of a feature toward the predicted probability of radiation resistance for an individual sample 15,16 . In all, 782 of the 22,819 genes in the dataset (3.43%) were identified as significant in the classification of radiation response, determined by a 95% cumulative sum threshold on mean absolute SHAP values ( Fig. 1a and Supplementary Data 2-4). Ten of the 50 genes with largest mean |ΔP| scores were previously implicated in radiation therapy response [17][18][19][20][21][22][23][24][25][26] . To determine whether the identified set of 782 genes has more predictive value than previously identified gene sets in RadiationGeneSigDB, machine learning classifiers were subsequently trained using only the genes from each respective gene set, and the predictive accuracy of each classifier was compared ( Fig. 1b). Our set of 782 genes had the best performance among all gene sets when trained on the TCGA dataset, and was among the best gene sets when trained on a separate dataset from the Cancer Cell Line Encyclopedia (CCLE) 27 . In addition, our gene set shows statistically significant overlap with significant genes identified from differential gene expression analysis of an independent dataset comparing breast cancer patient tumors exhibiting locoregional recurrence (LRR) versus non-recurrent controls (CTL) (Fisher's exact test, representation factor = 0.6, p = 7e-3) (Supplementary Fig. 2 and Supplementary Table 1) 28 .
Gene set enrichment analysis (GSEA) of these 782 genes among the Hallmarks of Cancer showed significantly increased enrichment of the "Deregulating cellular energetics" hallmark, with very low enrichment of the "Genome instability & mutation" hallmark ( Fig. 1c) 29,30 . Hierarchical clustering of the hallmark enrichment ranks for each gene set in RadiationGeneSigDB revealed two major clusters: a larger cluster with very high rank of "Genome instability & mutation," and a smaller cluster with much higher ranks for other hallmarks involved in cellular metabolism, angiogenesis, and metastasis (Fig. 1d). This dichotomy suggests that although the biological response to radiation therapy certainly involves genomic instability and DNA damage repair, other biological processes such as cellular metabolism may play critical roles as well 31,32 . GSEA of cancer expression modules additionally showed increased enrichment of many modules involved in cellular metabolism, including amino acid and sulfur metabolism, redox metabolism, and lipid metabolism (Fig. 1e) 33 . Finally, GSEA of Recon3D metabolic subsystems demonstrated increased enrichment of pathways involved in central carbon metabolism and lipid metabolism, with the majority of genes being associated with increased probability of radiation resistance (Fig. 1f) 10 . Together, analysis of this gene expression classifier suggests that radiation-resistant tumors exemplify dysregulation in their cellular metabolic networks, and that additional features involving the metabolism of radiation-sensitive and -resistant tumors will provide significant benefit in developing machine learning classifiers for radiation response.
FBA models accurately predict relative metabolite production. Personalized genome-scale FBA models of radiation-sensitive and ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-22989-1 -resistant TCGA tumors were generated to obtain metabolic features that could be used in machine learning classifiers for radiation response (see Methods section). These FBA models were developed through integration of gene expression and mutation information from individual patient tumors, as well as kinetic and thermodynamic parameters from publicly available repositories 11 . By systematically creating artificial metabolite sinks in the Recon3D metabolic network and evaluating fluxes to these sinks, the production rates of different metabolites were predicted and compared between radiation-sensitive and -resistant tumors ( Fig. 2a and Supplementary Data 5). Figure 2b shows that many of the metabolite classes implicated from the gene expression classifier showed significantly increased production in radiation-resistant tumors. These included antioxidant and cysteine-containing metabolites (including precursors of glutathione, an antioxidant with previously implicated roles in radiation response), lipid and fatty acid metabolites (including those previously implicated in lipid peroxidation in response to  (mean |ΔP|) for individual genes, signifying the absolute change in predicted probability of radiation resistance attributed to each feature averaged across all samples. Those features within the top 50 with previous literature suggesting a role in tumor radiation response are annotated. (Right, gray) Cumulative mean |ΔP| scores. b Performance of the identified set of 782 significant gene expression features from this study (red) versus previously identified gene sets in RadiationGeneSigDB (black), on both the (left) TCGA dataset performing a classification task on patient tumor radiation response and (right) CCLE dataset performing a regression task on cancer cell line radiation response. n = 20 training+validation/testing splits. Error bars: mean ± 1 standard error. AUROC; area under the receiver operating characteristic curve, MAE; mean absolute error, MSE; mean squared error. c Gene set enrichment analysis (GSEA) of significant features from our gene expression classifier among the Hallmarks of Cancer. Statistical test: χ 2 test with Yates' correction. d Hierarchical clustering of Hallmarks of Cancer enrichment ranks from the gene set in this study and those in RadiationGeneSigDB, based on both (row) hallmark, and (column) gene set. e GSEA of significant gene expression features among the cancer expression modules from Segal et al. Modules relevant to cellular metabolism are annotated with their number and descriptions. f GSEA of significant gene expression features among Recon3D metabolic subsystems. Significant genes within each subsystem are annotated above or below p-value value bars based on whether their expression is positively correlated with (above, green) radiation sensitivity, or (below, red) radiation resistance.
ionizing radiation), and immune system mediators [34][35][36] . While fewer metabolites were predicted to be significantly downregulated in radiation-resistant tumors, many metabolites involved in nucleotide metabolism were among this group.
Regression of experimental metabolite concentrations among the NCI-60 cancer cell line panel with cell line surviving fraction at 2 Gy radiation (SF2) showed up-and downregulation of the same metabolite classes predicted from FBA models (Fig. 2c) 37 . Many lipid and fatty acid metabolites positively correlate with radiation resistance (including cholesterol, which had the most positive correlation among all metabolites tested); antioxidant metabolites including glutathione positively correlate as well. On the other hand, many nucleotide metabolites are anti-correlated with radiation resistance (including UDP-MurNAc, which had the most negative correlation among all metabolites tested). Regression of experimental metabolite concentrations with cell line radiation response among the larger CCLE cancer cell line panel yielded similar findings, with upregulation of lipid/ antioxidant metabolites and downregulation of nucleotide metabolites in radiation-resistant cell lines ( Supplementary  Fig. 3).
To experimentally validate FBA model predictions of individual metabolite levels, we analyzed matched pairs of radiationsensitive and -resistant cell lines from four different cancer types via untargeted metabolomics ( Fig. 2d- . Predictions of lipid production accurately captured the observed heterogeneity in lipid levels between cell lines. Although model predictions of absolute oxidized (GSSG) and reduced (GSH) glutathione production did not match with experimentally measured values, previous model predictions of increased reduction potential of GSSG to GSH in radiationresistant tumors agreed with experimental findings of greater GSH/GSSG ratios in radiation-resistant cell lines 11 . Modelpredicted production of the antioxidant lipoamide as well as immune mediators anandamide and 2-arachidonylglycerol corresponded very well with experimental measurements, which were upregulated in nearly all radiation-resistant cell lines. Mean experimentally measured metabolite concentrations averaged across all cell line pairs correlated well with FBA-predicted production among all metabolite classes (Fig. 2e, Pearson r = 0.425, p = 6.96e-3). Additional validation was performed analyzing metabolite concentrations among 130 breast, colorectal, glioma, and upper aerodigestive cancer cell lines among the CCLE panel (Supplementary Data 8) 38,39 . FBA models were able to accurately predict nucleotide metabolites that are downregulated (adenine and guanine derivates) and upregulated (cytidine and uracil derivatives) in radiation-resistant cancers, as well as correctly predict the upregulation of antioxidant metabolites including glutathione ( Supplementary Fig. 8).  Table 2). e Comparison of model-predicted and experimentally measured levels of individual putative metabolites within the four major classes identified in (b). BRCA, COAD, GBM, HNSC: log 2 ratio of putative metabolite levels in radiation-resistant versus -sensitive cell lines. Statistically significant differences within each cell line pair are represented by box outlines and p-values. Statistical test: two-sided t-test. MEAN EXP; average experimental log 2 ratio across all four cell line pairs, FBA; log 2 ratio of model-predicted metabolite production rates in radiation-resistant versus -sensitive TCGA tumors.
Overall, these findings demonstrate that genome-scale metabolic models derived from transcriptomic and genomic data provide surprisingly accurate predictions of relative metabolite production between radiation-sensitive and -resistant cancers, allowing for their use in machine learning classifiers for radiation response.
Machine learning architecture for radiation response. To integrate FBA model predictions of metabolite production rates with other TCGA datasets into multi-omics machine learning classifiers, a dataset-independent ensemble architecture was developed (Fig. 3a). Multiple independent "base learner" classifiers are trained on an individual -omics dataset (either clinical, genomics, transcriptomics, or metabolomics data), as described in Supplementary Fig. 1. Subsequently, by comparing predicted class probabilities from each individual base learner to known radiation responses, a "meta-learner" classifier is trained to determine which base learner provides the most accurate prediction of radiation response based on the multi-omics features of individual samples (Fig. 3b) 40 . For an individual testing sample, each base learner outputs the predicted probability of radiation resistance (p i ), and the meta-learner outputs the predicted probability that each base learner will provide the most accurate prediction (w i ); the final probability of radiation resistance is the weighted average of each p i , with weights being each w i (Fig. 3c). This dataset-independent ensemble architecture performs better across multiple performance metrics compared to the common practice of initially combining all -omics datasets and training on a single classifier ( Fig. 3d and Supplementary Figs. 9 and 10). Overall, this machine learning architecture is a robust platform for integrating multi-omics data and providing accurate predictions of radiation response in individual patient tumors.

Multi-omics classifier identifies clinical patient subgroups.
Using the dataset-independent ensemble architecture described above, a multi-omics machine learning classifier integrating clinical, gene expression, mutation, and FBA-predicted metabolite production rates from TCGA tumors was developed (Supplementary Data 9). With an AUROC of 0.906 ± 0.004, this classifier has significantly greater performance compared to previously developed machine learning classifiers for radiation response (Figs. 4a and 1b) 7,41 . In addition, the threshold for separating radiation-sensitive and -resistant classes can be altered to optimize sensitivity, specificity, or a balance of both. In all, 725 of the 52,223 features from the four datasets (1.39%) were identified as significant in the classification of radiation response as determined by a 95% cumulative sum threshold on mean absolute Machine learning architecture for improved prediction of radiation therapy response. a Dataset-independent ensemble architecture, with independent base learners for each dataset and one meta-learner for integration of base learner outputs. b Meta-learner performing N d -class classification of the most accurate base learner/dataset for each sample, where N d is the number of independent base learners/datasets. c Prediction of radiation response for each testing set sample using predicted probabilities from each base learner and weights from the meta-learner. d Performance of multi-omics classifier trained on clinical, gene expression, mutation, and FBA-predicted metabolite data from TCGA tumors, comparing the dataset-independent ensemble architecture versus combining datasets together before training on a single classifier. Weighted log loss and AUROC performance metrics are shown here, with other metrics shown in Supplementary Fig. 9. n = 20 training+validation/testing splits. Boxplots: box = 25th, 50th, and 75th percentiles, whiskers = 1.5 times the interquartile range. Statistical test: two-sided t-test. . While the majority of these 725 features were gene expression (48.3%) and metabolite (32.6%) features, clinical features including tumor histology, chemotherapeutic response, and cancer type contributed more than half of the total mean |ΔP| scores (60.1%; Fig. 4c and Supplementary Fig. 11). Mutations with significant mean |ΔP| scores included those directly involved in redox metabolism (IDH1 R132H) and lipid metabolism (BRAF V600E) 42,43 . Many of these significant mutation features were among those with the largest differences in mutation rates between breast cancer tumors exhibiting LRR versus CTL in an independent patient-derived dataset ( Supplementary Fig. 12) 28 .
The contribution of a particular dataset toward radiation response classification for each individual patient was calculated by summing the patient-specific absolute SHAP values (|ΔP|) for all features within the particular dataset, normalized by the sum of patient-specific |ΔP| values across all datasets (Fig. 4d).
Individual samples varied significantly in the contribution of different datasets toward radiation response classification. Using unsupervised clustering, three clusters of patients with varying contributions of clinical features were identified ( Fig. 4e and Supplementary Fig. 13a). While "High Clinical" patients were categorized by large clinical feature contributions and small contributions from multi-omics datasets, multi-omics features provided the majority of cumulative SHAP values for "Low Clinical" patients, with metabolic features alone providing nearly as much utility as clinical features ( Fig. 4f and Supplementary Figs. [14][15][16][17][18]. For this "Low Clinical" cluster, certain clinical features including chemotherapeutic response have diminutive utility, whereas multi-omics features including IDH1 SNP and lipid metabolite levels have much higher importance scores compared to the overall patient cohort. Significant heterogeneity in clinical clusters was observed based on patient clinical factors, especially cancer type and tumor histology (Fig. 4g-i). Output weights from the meta-learner provide an accurate prediction of clinical cluster, effectively differentiating between "Low Clinical" and "Medium/High Clinical" patients; this provides a valuable strategy for determining whether clinical information from electronic medical records is sufficient to accurately predict radiation response in an individual patient, or whether multiomics features from tumor biopsy samples are needed (Fig. 4j).
Metabolic biomarkers identified for radiation response. Metabolite set enrichment analysis of the 236 significant metabolite features from the multi-omics classifier indicated significant enrichment of several metabolic pathways involved in central carbon metabolism, lipid metabolism, and nucleotide metabolism (Fig. 5a). To identify individual metabolites with the largest impact on radiation response prediction, the Spearman correlation between SHAP value (ΔP) and predicted metabolite production rate across all patients was calculated for each metabolite ( Supplementary  Fig. 19). Figure 5b highlights many of the significant metabolic features, as well as metabolism-related gene expression and mutation features. Significant glycolytic and TCA cycle metabolites (fructose 1,6-bisphosphate, 3-phosphoglyceric acid, succinyl-CoA, and succinate) were all positively correlated with radiation resistance, while genes promoting gluconeogenesis (PCK2 and LDHC) were associated with radiation sensitivity. Fructose 2,6-bisphosphate, an allosteric regulator of PFK-1 that activates glucose breakdown, had one of the most positive correlation values. In addition, many metabolites in early mannose metabolism had positive correlation values, in accordance with previously observed radiation-induced upregulation of mannose-6-phosphate receptors and high-mannose type N-glycan production 44,45 .  Fig. 19). Significant gene expression and mutation features are denoted by colored reaction arrows, either in green (associated with radiation sensitivity) or in red (associated with radiation resistance). 13BPG 1,3-bisphosphoglycerate, 2HG 2-hydroxyglutarate, 2PG 2-phosphoglycerate, Greater glycolytic fluxes in radiation-resistant tumor models resulted in increased production of the majority of significant lipid and fatty acid metabolites, including many with previously identified roles in antioxidation such as capric acid, butyric acid, eicosatrienoic acid, and γ-linolenic acid (Fig. 5c) [46][47][48][49] . On the other hand, significant nucleotide metabolites were highly correlated with radiation sensitivity, in agreement with the observed downregulation in radiation-resistant cancer cell lines (Fig. 5d). While production of energy metabolites including ATP was correlated with radiation sensitivity, FBA models predict significantly greater conversion of ADP to ATP in radiationresistant tumors, in agreement with previous experimental findings ( Fig. 5e and Supplementary Fig. 20) 50,51 . Finally, increased production of membrane phospholipids and arachidonic acid precursors resulted in significant correlations between inflammation-mediating eicosanoids and radiation resistance, corroborating previous evidence of radiation-sensitizing effects of cyclooxygenase inhibitors including aspirin (Fig. 5f) 52 . Together, these findings suggest that metabolic features from multiple interconnected pathways including central carbon, lipid, and nucleotide metabolism are viable diagnostic biomarkers for prediction of radiation sensitivity.

Non-invasive classifier implicates blood metabolic features.
Because non-invasive metabolic predictors of radiation response could be rapidly applied for informing patient-specific treatment, we refined machine learning classification to only integrate clinical data derived from non-invasive means (excluding any pathologic staging or histological information from tumor biopsies) with FBA-predicted production rates of metabolites known to be quantifiable in human blood samples (Fig. 6a) 53 . This noninvasive classifier performed similarly overall to the multi-omics classifier, with increased sensitivity and decreased specificity; this suggests that the non-invasive classifier may be optimal as a firstline screening test, followed by the multi-omics classifier as a second-line diagnostic test (Fig. 6b) 54 . A total of 97 of the 363 features from the two datasets (26.7%) were identified as significant in the classification of radiation response ( Fig. 6c and Supplementary Data 12 and 13). Similar to the multi-omics classifier (Fig. 4e), individual patient contributions of clinical features formed a bimodal distribution of "Low Clinical" and "High Clinical" groups ( Fig. 6d and Supplementary Fig. 13b). Blood metabolite features-including many lipid, nucleotide, and inflammation-mediating metabolites previously identified from the multi-omics classifier-provided almost one-half of the cumulative mean absolute SHAP values (mean |ΔP|) for "Low Clinical" patients (Fig. 6e). Dataset contributions and SHAP values for individual cancer patients can identify personalized biomarkers with maximal diagnostic utility (Fig. 6f-h). Overall, these findings demonstrate the value of blood-based biomarkers as a non-invasive approach toward personalized prediction of radiation response.

Discussion
Despite significant interest in methodologies for the a priori prediction of radiation response in cancer patients, machine learning algorithms have yet to be used in the clinical setting for informing radiation treatment 43 . Recently developed classifiers for predicting tumor radiation response have focused mainly on gene expression data, rather than the integration of multiple -omics datasets 7,55 . This may be in part due to a lack of metabolomics datasets from tumor biobanks including TCGA, limiting inclusion of metabolic features in machine learning classifiers for radiation response. Here, we propose the strategy of utilizing personalized genome-scale FBA models of radiation-sensitive and -resistant patient tumors to predict the production rates of metabolites across the Recon3D metabolic network, leveraging the accessibility of genomic and transcriptomic tumor datasets to generate metabolic insight. These metabolic features are subsequently integrated with clinical, genomic, and gene expression data from TCGA tumors to generate gene expression, multiomics, and non-invasive classifiers for radiation response. These classifiers provide more accurate predictions of tumor radiation response compared to previously developed classifiers, as well as multi-omics biomarkers associated with radiation sensitivity.
FBA model predictions of tumor metabolism and experimental validation with matched radiation-sensitive and -resistant cancer cell lines demonstrated significant re-routing of metabolic fluxes in radiation-resistant cancers, as observed by the up-and downregulation of metabolites across multiple interconnected metabolic pathways. (Figs. 2 and 5). This flux re-routing was observed previously in the context of redox metabolism in radiation-resistant cell lines and tumors, but findings from this study suggest more widespread metabolic alterations throughout central carbon, lipid, and nucleotide metabolism 11,56,57 . Our approach of systematically introducing metabolite sinks into the Recon3D network allows for relating production fluxes to relative changes in experimentally measured metabolite levels. We observe association between increased levels of fatty acid and cholesterol metabolites with tumor radiation resistance, in agreement with previous experimental evidence. Radiationresistant head and neck cancer cells have enhanced fatty acid production from increased expression of fatty acid synthase 58 . In addition, ionizing radiation was shown to cause increased cholesterol production in lung cancer cells 59 . Plasma levels of total and HDL cholesterol were found to be elevated in radiationresistant SPRET/EiJ mice compared to radiation-sensitive BALB/ cByJ mice, implicating cholesterol as a potential non-invasive metabolic biomarker 60,61 . Treatment with HMG-CoA reductase inhibitors including simvastatin was reported to sensitize prostate cancer cells to radiation therapy, potentially by compromising DNA damage repair 62,63 . Other agreements between model predictions and experimental studies include implication of inflammation-mediating eicosanoids in radiation resistance. Many prostaglandin metabolites identified in this study have previous associations with radiation resistance, and cyclooxygenase inhibitors such as aspirin may act as radiation sensitizers and improve outcomes in cervical, prostate, and rectal cancers 52,64-66 . These findings suggest that lipid and eicosanoid metabolites may have utility as both diagnostic biomarkers and therapeutic targets for improving radiation response. Integration of FBA model predictions into multi-omics machine learning classifiers for radiation response was performed by employing a dataset-independent ensemble architecture (Fig. 3). This approach was based on the concept of stacked generalization (having multiple "base learners" make predictions that are used as inputs for a separate "meta-learner"), which was shown to improve predictive accuracy in this study as well as multiple previous medical applications [67][68][69] . However, while in previous studies there is only one input dataset being supplied to the multiple base learners, we instead input different -omics datasets to separate base learners. The benefit of this dataset-independent approach is that the meta-learner can subsequently be used to predict which individual datasets will provide the most utility for determining radiation response in individual patients. For example, the meta-learner can accurately differentiate between "Low Clinical" patients (with large contributions of gene expression, mutation, and metabolic datasets from patient biopsy samples and genome-scale metabolic modeling) and "High Clinical" patients (with greater contribution of clinical data from electronic medical records) (Fig. 4). This stratification of patient populations allows for optimal resource allocation for collecting biological measurements with maximal diagnostic utility for individual cancer patients. Moreover, the use of GBM models as the base and meta-learners provides a significant amount of embedded feature selection; this decrease in model complexity not only lowers the cost of measuring biological features needed for prediction, but also improves the interpretability of models, increasing the likelihood of adoption by clinicians 70 .
In addition to demonstrating the utility of multi-omics data for the classification of radiation response, we found that a classifier utilizing non-invasive clinical information and blood-based metabolic biomarkers can predict radiation sensitivity with comparable accuracy (Fig. 6). Blood-based diagnostic tools are garnering attention for their use in early detection, monitoring, and optimal treatment identification for cancer patients 71 .
Identification of circulating biomarkers through the integration of machine learning and genome-scale metabolic modeling could provide significant utility in adaptive radiotherapy to modify patient treatment with radiation or radiation-sensitizing chemotherapies in response to the observed efficacy of previous treatments 72 .
Although our approach toward integrating machine learning and genome-scale metabolic modeling for the prediction of radiation response provides many enhancements in performance accuracy and biomarker identification compared to previous studies, further improvements could yield additional benefits. Our multi-omics classifier showed that tumor histology has major impacts on both the prediction of radiation response and the clustering of "Low/High Clinical" patients, suggesting that additional features from tumor histological images could further improve classifier performance. A convolutional neural network  Supplementary Fig. 13b). e Clinical and metabolic dataset contributions among the "Low Clinical" group. Individual features with mean |ΔP| scores above 1% are shown. n = 457 samples. Boxplots: box = 25th, 50th, and 75th percentiles, whiskers = 1.5 times the interquartile range. f Breakdown of individual feature contributions toward prediction of radiation response in a representative radiation-resistant TCGA patient (TCGA-S9-A7IY). (Upper) Contribution of each dataset toward the progression from prior to posterior probability of radiation resistance. (Lower) SHAP values for this individual patient. g, h Plots of SHAP value versus predicted metabolite production rate for two metabolic features, illustrating g a feature with large individual importance score relative to other patients (significant utility as a personalized blood-based biomarker), and h a feature with small individual importance score relative to other patients (little utility as a personalized blood-based biomarker). could be added as an additional base learner with tumor H&E images from TCGA as input, providing additional histological features and identifying associations with radiation response 73,74 . In addition, changes in both DNA methylation and microRNA expression have been implicated in the tumor response to radiation therapy; inclusion of these features into the multiomics classifier may yield additional biomarkers for improved radiation response prediction 75,76 . Finally, the course-grained classification of RECIST criteria is an imperfect metric of radiation response; using more clinically applicable or quantitative radiation response metrics in future classifier training such as LRR or surviving fraction at 2 Gy radiation (SF2) could improve the utility of multi-omics biomarkers obtained from machine learning classifiers.
Metabolomic profiling is powerful for understanding cancer pathophysiology, identifying and monitoring clinical biomarkers, and predicting patient outcomes, but challenging to retrospectively analyze in specimen biobanks for inclusion in multi-omics data mining 77 . In this study, we demonstrate that integration of machine learning and genome-scale metabolic modeling methodologies allows for improved biomarker identification and prediction of radiation response in individual patient tumors without direct metabolomics measurements. This approach is generalizable toward other applications in guiding patient treatment, such as the prediction of chemotherapeutic response as well as identification of metabolic targets for pharmacological inhibition and treatment sensitization. The synergistic integration of machine learning and genome-scale metabolic modeling will inevitably yield additional insights for improving precision medicine and long-term care of cancer patients.

Methods
TCGA data retrieval and processing. Clinical data from TCGA patients were obtained from the GDC data portal (https://portal.gdc.cancer.gov; clinical drug, clinical patient, and clinical radiation files) and the Synapse TCGA_Pancancer project (https://www.synapse.org/#!Synapse:syn300013/wiki/70804; biological sample files) 2 . Drug names were standardized according to the standard available from the Gene-Drug Interactions for Survival in Cancer database 78 . Categorical clinical features were one-hot encoded before inputting into machine learning classifiers. RNA-Seq gene expression data were obtained from Rahman et al.'s alternative preprocessing method (GEO: GSE62944) 79 . Data from this preprocessing method showed fewer missing values, more consistent expression between replicates, and improved prediction of biological pathway activity compared to the original TCGA pipeline. Mutation data using the MuTect variant caller were obtained from the GDC data portal 2,80 . For all data types, only features with at least two unique non-missing values were included.
Radiation sensitivity. TCGA samples were classified into radiation-sensitive and radiation-resistant classes according to their reported sensitivity to radiation therapy based upon the RECIST classification method. Patients with a complete or partial response to radiation (greater than 30% decrease in tumor size) were classified as radiation-sensitive, and patients with stable or progressive disease (either less than 30% decrease in tumor size, or increase in tumor size) were classified as radiation-resistant. If a patient received multiple courses of radiation therapy, they were classified based on the response to their first course.
Data splitting. Supplementary Fig. 21 provides an overview of data splitting for machine learning classifier training and testing. The collection of 716 radiationsensitive and 199 radiation-resistant samples was randomly split into training+ validation (80% of all samples) and testing (20% of all samples) groups. Within the training+validation group, five-fold cross-validation was performed to optimize hyperparameter values. The training (80% of training+validation samples) group was used for training the model with a given set of hyperparameters; within this training group, 87.5% was directly used for training, and 12.5% was used to identify the iteration at which to perform early stopping during training. The validation (20% of training+validation samples) group was used to assess model performance with the given set of hyperparameters. The average validation performance across all five folds was used to determine the optimized set of hyperparameters; once this set was determined, the model was retrained on the entire training+validation group, and the testing group (20% of all samples) was used to assess overall model performance. Twenty iterations of randomized training+validation/testing splitting were performed to analyze model predictions and performance metrics over multiple instances. All data splits were performed using stratified shuffle splitting, where the proportion of radiation-sensitive and -resistant samples was kept the same (refer to Supplementary Fig. 21).
Base learners. N d base learners were trained using an individual -omics dataset (either clinical, gene expression, mutation, or metabolic datasets), where N d is the number of individual datasets being used for the classifier. Each base learner is composed of a GBM model that performs two-class classification (predicting either radiation sensitivity or resistance for each patient) using features from an individual dataset, such as clinical, genomics, or transcriptomics. GBM models using decision tree ensembles have many useful characteristics compared to other machine learning algorithms, including embedded feature selection, capability of handling missing values (which is common in clinical datasets), and efficient management of high-dimensional datasets (where the number of features greatly exceeds the number of samples) 81,82 . XGBoost (v0.90) was used to develop GBM base learners and meta-learners 83 .
Bayesian optimization was performed to optimize hyperparameter values for each GBM model. At each iteration of Bayesian optimization, five-fold crossvalidation was used to calculate the performance of a particular set of hyperparameters. Weighted log loss was used as the performance metric for both GBM model training and evaluating model performance on validation sets: where y i is the true class label of sample i (y i = 0 if sensitive, y i = 1 if resistant), p i is the predicted probability of sample i being radiation-resistant (belonging to class 1), w R is the weight given to radiation-resistant samples (w R = no. of sensitive samples/no. of resistant samples), and N S is the total number of samples. The weight given to radiation-resistant samples accommodates for the fact that there are more radiation-sensitive samples than radiation-resistant samples, and prevents classifiers from focusing on optimizing performance exclusively on radiationsensitive samples. The mean weighted log loss plus one standard error over all five folds of cross-validation is used to choose the hyperparameter set with best performance. During model training, early stopping is employed to prevent overfitting. For individual samples, each of the N d base learners outputs the predicted probability of radiation resistance (p 1 , p 2 , …, p Nd ) using features from the individual data type. Each base classifier receives the same training/validation/ testing split of samples.
Meta-learner. For every sample within the five validation sets used for the base learners, each base learner's output prediction of radiation resistance (p i ) is compared to the sample's true radiation response class (y i ). The meta-learner is trained to predict the optimal base learner that provides the most accurate prediction of radiation response for each sample, based on the sample's multi-omics features. This meta-learner performs an N d -class classification, where N d is the number of independent base learners. The features this meta-learner is trained on include all features from the N d datasets that have non-zero SHAP values from their respective base learners; features that do not impact base learner predictions are not included, which increases the training speed while maintaining meta-learner accuracy. Because validation samples from the five-fold cross-validation were not directly used in base learner training, they can be used to train this meta-learner without overfitting or inflation of model performance metrics.
Implementation of the meta-learner is analogous to that of each base learner, using a GBM model, Bayesian optimization, early stopping, and five-fold crossvalidation. Multiclass log loss was used as the performance metric for both GBM model training and evaluating model performance 84 : where y i,k is 1 if dataset k is the true optimal dataset of sample i and 0 otherwise, p i,k is the predicted probability of dataset k being the optimal dataset of sample i, N S is the total number of samples, and N k is the total number of datasets. The mean multiclass log loss plus one standard error over all five folds of cross-validation is used to choose the optimal hyperparameter set with best performance. For individual samples, the meta-learner outputs N d probabilities (w 1 , w 2 , …, w Nd ) that each base learner is optimal for that sample (all N d probabilities sum to 1). Note that, once the meta-learner is trained using the predicted probabilities from the base learners, the base learners and meta-learner act independently of each other when used on new testing samples.
Radiation response prediction. Each testing sample is run through (1) all N d base learners to obtain the predicted probabilities of radiation resistance using each of the N d individual datasets (p 1 , p 2 , …, p Nd ), and (2) the meta-learner to obtain the predicted probabilities that each of the N d base learners/datasets is optimal for that sample (w 1 , w 2 , …, w Nd ). To obtain the final predicted probability of radiation resistance for the testing sample, the weighted average of the base learner probabilities is taken, with the meta-learner probabilities as weights: Samples with p < 0.5 are classified as radiation-sensitive, while samples with p > 0.5 are classified as radiation-resistant.
Bayesian optimization. Bayesian optimization was used to optimize GBM hyperparameters for both the base learner and meta-learner classifiers. This iterative approach automates the search for hyperparameter values by calculating an acquisition function that provides the expected benefit of sampling a particular point in hyperparameter space on the overall search for hyperparameters with minimal cross-validation error. At each iteration, the point in hyperparameter space with the largest acquisition function value is chosen, five-fold cross-validation is used to determine the performance of those particular hyperparameters, and the acquisition function is updated to then determine which next point in hyperparameter space will be sampled. Hyperopt (v0.1.2) was used to perform Bayesian optimization 85 . Supplementary Table 3 provides the eight GBM hyperparameters chosen for optimization of both base learner and meta-learner classifiers, with the ranges of values in the hyperparameter search space. A total of 2 8 = 256 iterations of Bayesian optimization were performed for each classifier.
Classifier performance metrics. Final classifier performance was assessed on testing samples across the 20 iterations of randomized training+validation/testing splitting. The following performance metrics were used: 1. Weighted log loss: Eq. (1) 2. Area under the receiver operating characteristic curve (AUROC) 3. Balanced accuracy, an accuracy metric that corrects for unequal numbers of radiation-sensitive and -resistant patients: 4. Sensitivity: 5. Specificity: 6. Positive predictive value: 7. Negative predictive value: SHAP values. The importance of individual features toward the prediction of radiation response, both averaged across all samples and for individual samples, was determined by calculating SHAP values for each classifier 15,16 . Each SHAP value (ΔP) represents the change in predicted probability of radiation resistance for patient i attributed to feature j. Features with positive SHAP values for patient i signify those where the particular value of feature j attributed to patient i is such that it increases patient i's predicted probability of radiation resistance. Larger absolute SHAP values (|ΔP|) indicate features with larger overall contributions (either negatively or positively). Mean absolute SHAP values across all samples (Mean |ΔP|) provide an indication of the overall importance of a particular feature in the classifier's prediction of radiation response. SHAP values were averaged across 20 training+validation/testing splits by a weighted average, with weights proportional to the inverse of the weighted log loss performance metric on the testing set for that split 86 . This weighted average allows model analysis to be more reflective of the more accurate predictions, so that identified biomarkers are more likely to be true biomarkers rather than artifacts of poorly performing predictions. Values were normalized by the difference between prior and posterior probabilities of radiation resistance for each sample. SHAP v0.29.1 was used to calculate SHAP values 16 .
Comparison of machine learning algorithms. scikit-learn v0.21.2 functions sklearn.ensemble.RandomForestClassifier() and sklearn.linear_model.LogisticRegression() were used to implement random forest and logistic regression with L1 regularization classifiers, respectively 87 . Keras v2.3.1 was used to implement the neural network with L1 regularization classifier (https://github.com/keras-team/ keras). Weighted log loss (Eq. (1)) was used as the loss function for the neural network classifier, and early stopping was performed. Missing values were imputed and scaled using sklearn.impute.SimpleImputer() and sklearn.preprocessing.Stan-dardScaler() functions, respectively, before training with the random forest, logistic regression, and neural network classifiers. Supplementary Tables 4-6 provide the hyperparameters and value ranges used for Bayesian optimization with each algorithm; 256 iterations of Bayesian optimization were performed for each classifier. Each classifier, including the GBM classifier, was run using the same training, validation, and testing samples at each of 20 training+validation/testing splits so that performance can be accurately compared.
Comparison of gene expression datasets. Eleven gene expression sets for oxic radiation response in the RadiationGeneSigDB database were compared to the set of 782 significant genes from the gene expression classifier 7 . Gene names from RadiationGeneSigDB gene sets were converted to Entrez gene ID's and gene symbols. Those genes where a matching Entrez gene ID or gene symbol could not be found were removed. In addition, those genes that were not in both TCGA and CCLE gene expression datasets were removed.
To compare performance of gene expression sets on TCGA data, classification models were trained to predict radiation-sensitive or -resistant classes of TCGA tumor samples using gene expression data from only the subset of genes for an individual set. Model performance was assessed using weighted log loss (Eq. (1)) and AUROC metrics. To compare performance of gene expression sets on CCLE data, regression models were trained to predicted radiation response (reported as area under the curve of survival vs. radiation dose) of CCLE cell lines using gene expression data from only the subset of genes for an individual set 39 . Model performance was assessed using mean absolute error and mean squared error metrics.
Flux balance analysis (FBA). Generation of personalized FBA models of individual TCGA tumor samples was performed as described in Lewis et al. 11 . To predict the maximum production of a particular metabolite in FBA models, the following objective function was used: where "met" is the metabolite to be maximized, and "[all]" represents the maximization of this objective function across all cellular compartments where the metabolite is located. This creates an artificial sink for a particular metabolite in the Recon3D metabolic network, resulting in the maximization of reaction fluxes generating this metabolite. We hypothesized that this objective function would be valid and thus yield accurate predictions for metabolites with large differences in production between radiation-sensitive and -resistant tumors, as these would be particularly beneficial to either tumor class and thus the metabolic network of these tumors would be optimized to maximize levels of the metabolite. The modeled external compartment contained all metabolites found in DMEM/ F-12 cell culture media (Thermo Fisher Scientific, Cat#11320) as well as fetal bovine serum (FBS) to match the cell culture media used for experimental validation 88 . All 871 metabolites in the Recon3D human metabolic reconstruction that (1) had KEGG database IDs, (2) were not present in the extracellular media, and (3) were capable of being produced by all FBA tumor models, were included in the FBA metabolite production screen. NCI-60 data retrieval and processing. Experimental metabolomics data from NCI-60 cancer cell lines were obtained from the Developmental Therapeutics Program of the National Cancer Institute (https://wiki.nci.nih.gov/display/ NCIDTPdata/Molecular+Target+Data). Normalized concentration entries without metabolite names or for isobars were excluded. Cell line surviving fraction at 2 Gy radiation (SF2) values was obtained from Amundson et al. 37 .
CCLE data retrieval and processing. TPM-normalized RNA-Seq transcriptomic data from the CCLE cancer cell lines were obtained from the Broad Institute CCLE database (https://data.broadinstitute.org/ccle/CCLE_RNAseq_rsem_genes_tpm_ 20180929.txt.gz). Experimental metabolomics data were obtained from Li et al. 38 . Normalized log 10 -transformed concentration entries were utilized. Cell line radiation responses measured using the area under the radiation response curve were obtained from Yard et al. 39 .
Keene et al. data retrieval and processing. RNA-Seq transcriptomic data from breast cancer tumors exhibiting LRR or CTL from Keene et al. were obtained from GEO GSE119937 28 . RPKM-normalized values were normalized to TPM, and only genes with no missing values were kept in the analysis. For duplicate gene entries, the entry with the largest mean value across all samples was kept. Mutation data were obtained from Keene et al.'s Table S1.
Cell culture. Supplementary Table 2 provides the matched radiation-sensitive and radiation-resistant cell lines used for experimental validation of metabolite levels predicted from FBA models. All cell lines were maintained in DMEM/F-12 cell culture media (Thermo Fisher Scientific, Cat #11320) with 10% FBS (Sigma-Aldrich, Cat #F4135) at 37°C and 5% CO 2 , and were free of Mycoplasma.
Metabolomics. Three biological replicates of each cell line were grown in separate T-25 flasks with the cell culture conditions described above. Cell pellets with approximately 1 million cells were obtained from trypsinization, centrifugation, and removal of supernatant. Samples were reconstituted in 90% MeOH, 10% H 2 O at a ratio of 200 μL/1 million cells. Aliquots of the supernatant were combined to create a pooled sample used for quality control. Aliquots of the samples were transferred to LC vials and stored at 4°C.
Compound Discoverer 3.1 was used to perform quality control, putative metabolite identification, and quantification of metabolite levels. Results for positive and negative ion modes were combined. Metabolites with no identified name were removed from the analysis. If duplicate metabolites with the same identification were obtained, then the entry with the largest maximum area was used. KEGG IDs for each metabolite were manually identified based on metabolite name, molar mass, and chemical formula. Metabolites from experimental metabolomics were matched to those from FBA analysis by matching KEGG IDs.

Data availability
TCGA, NCI-60, CCLE, and Keene et al. datasets used in this study are available as cited above in their respective Methods subsections. Information about KEGG database IDs is available at https://genome.jp. The following datasets generated from this study are available at https://github.com/kemplab/ML-radiation (https://doi.org/10.5281/ zenodo.4540314) 89

Code availability
Jupyter notebooks containing Python code for running and analyzing the gene expression, multi-omics, and non-invasive classifiers for radiation response are available at https:// github.com/kemplab/ML-radiation (https://doi.org/10.5281/zenodo.4540314) 89 . In addition, the gene sets and code used to compare the significant gene list from our gene expression classifier to those from the RadiationGeneSigDB database are available. Jupyter notebooks containing Python code related to the generation of personalized genome-scale FBA models of TCGA tumors are available at https://github.com/kemplab/FBA-pipeline (https://doi.org/10.5281/zenodo.4540330) 11,90 . Received: 23 June 2020; Accepted: 9 April 2021;