Machine learning approach for quantitative biodosimetry of partial-body or total-body radiation exposures by combining radiation-responsive biomarkers

During a large-scale radiological event such as an improvised nuclear device detonation, many survivors will be shielded from radiation by environmental objects, and experience only partial-body irradiation (PBI), which has different consequences, compared with total-body irradiation (TBI). In this study, we tested the hypothesis that applying machine learning to a combination of radiation-responsive biomarkers (ACTN1, DDB2, FDXR) and B and T cell counts will quantify and distinguish between PBI and TBI exposures. Adult C57BL/6 mice of both sexes were exposed to 0, 2.0–2.5 or 5.0 Gy of half-body PBI or TBI. The random forest (RF) algorithm trained on ½ of the data reconstructed the radiation dose on the remaining testing portion of the data with mean absolute error of 0.749 Gy and reconstructed the product of dose and exposure status (defined as 1.0 × Dose for TBI and 0.5 × Dose for PBI) with MAE of 0.472 Gy. Among irradiated samples, PBI could be distinguished from TBI: ROC curve AUC = 0.944 (95% CI: 0.844–1.0). Mouse sex did not significantly affect dose reconstruction. These results support the hypothesis that combinations of protein biomarkers and blood cell counts can complement existing methods for biodosimetry of PBI and TBI exposures.


Materials and methods
Experimental procedures. The mouse experiments were approved by the Columbia University Institutional Animal Care and Use Committee (IACUC, approved protocol #AABA9506) and were conducted under all relevant federal and state guidelines. Male and female C57BL/6 mice aged (aged 12-14 weeks) were purchased from Charles River Laboratories (Frederick, MD) and randomly assigned to the sham (0 Gy) and irradiated (2.0-2.5 and 5 Gy) study groups. A summary of the numbers of mice in each exposure group is provided in Table 1, and the full data are provided in Supplementary_table_S1 online. All methods were performed in accordance with ARRIVE guidelines (https:// arriv eguid elines. org) and with other relevant guidelines and regulations.

Irradiation and dosimetry. Clinac. PBI and TBI exposures were performed at the Radiological Research
Accelerator Facility (RARAF), using 9 MeV electrons generated by our modified Clinac 2100C 37 . Batches of mice were irradiated on different dates, with random assignment of the mice to exposure type and dose. Mice were anesthetized using isofluorane and placed into a custom irradiation jig with a movable ¼ inch thick lead shield of the lower half of the body (for PBI exposures), or no shielding for TBI. The jig was placed at a source to surface distance of 90 cm and dose was delivered at a dose rate of 5-10 Gy/sec, which ensured that the circulation time of blood in the mouse, ~ 15 s 38 , was much longer than the dose delivery time (≤ 1 s).
Dose rate was evaluated prior to the experiment using a NIST-traceable advanced Markus ion chamber and Unidos E electrometer (PTW, Germany). The jig was placed at a source to surface distance of 90 cm and dose was delivered at a dose rate of 7 Gy/sec (~ 0.4 Gy/pulse @ 180 Hz). The number of Clinac pulses required to deliver 2.5 or 5 Gy was evaluated prior to the experiment using a NIST-traceable advanced Markus ion chamber and Unidos E electrometer (PTW, Germany). 2.5 Gy irradiations required 65 pulses and 5 Gy irradiations required 130 pulses, each after 20 s warm up time in which the electron gun was active but no dose was delivered 37 . To verify dose on a per-mouse basis, EBT3 film (Ashland, Bridgewater, NJ) was irradiated with each mouse. The film was scanned using an V700 photo scanner (Epson, Suwa Japan) 39 and dose was reconstructed from the red channel data using the previously determined calibration curve:D Gy = 7.404OD 0.818−OD , where the optical density, OD, is the negative log transformed ratio of the pixel values (red channel only) of exposed and unexposed film, scanned simultaneously. Dose variation through the mouse thickness was previously measured to be about 10% in this irradiation geometry.
The experimental plan was to irradiate 4 batches of mice, where each batch included irradiated mice and corresponding controls which were sham-irradiated with the corresponding TBI or PBI procedures. The samples from 2 female mice exposed to 2.5 Gy TBI were excluded from analysis due to very low levels of B and T cells, www.nature.com/scientificreports/ insufficient for scoring (Table 1). Consequently, the analyzed data set (Supplementary_table_S1 online) included 42 animals exposed to Clinac PBI and 36 animals exposed to TBI.
X-RAD. For comparison, 25 mice (15 male; 10 female) were irradiated with 0, 2 or 5 Gy of TBI exposures using 320 kVp x-rays, a current of 12.5 mA, and dose rate of 1 Gy/min, using the X-RAD 320 biological irradiator (Precision X-Ray Inc, North Branford, CT) at the Center for Radiological Research. This additional data set enabled us to increase the sample size of the study, and to compare the effects of different types of radiations. Mouse irradiations were performed according to previous protocols 22,40 . For in-vivo irradiations, mice were placed in a specifically designed mouse irradiation holder (Precision X-ray). Control mice were sham irradiated. All doses were validated using a Radcal ion chamber (Monrovia, CA) placed in the mouse holder. During the actual irradiations, the delivered dose was measured by placing the ion chamber at the same position into the mouse holder. These x-ray exposures were performed to compare TBI exposures to high dose rate Clinac electrons at 5-10 Gy/sec with TBI exposures to lower dose rate x-rays at 1 Gy/min.
Blood sample collection and cell counts. All irradiated and sham-control mice were euthanized by CO 2 asphyxiation at 24 h after radiation exposure to mimic realistic scenarios of biodosimetry measurements following a mass radiological event. Peripheral whole blood (WB) samples were collected from each mouse by cardiac puncture using a heparin-coated syringe prepared by adding 500 µl DPBS to BD Vacutainer containing 158 USP units of sodium heparin (#366,480). Similar to our earlier work 22 , leukocyte, T and B cell counts were determined by flow cytometry (CytoFLEX, Beckman Coulter, Pasedena, CA) using 20 μL of heparinized blood, using the following antibodies purchased from Biolegend (San Diego, CA): APC-CD45 (catalog #103,112), FITC-CD3e (#100,306), PE-CD19 (#115,508). Blood counts were determined using CytExpert software (Beckman Coulter). Table 1. Summary of the number of mice in each exposed group. As described in the main text, detailed dosimetry was performed on each mouse and dose variations around the nominally assigned values were accounted for in the analysis, and two mouse samples in the TBI females batch 3 group were excluded from analysis due to insufficient number of B and T cells for scoring. In total, 42 mice were assigned to PBI (16 of them sham-exposed) and 36 mice to TBI exposures (13 of them sham-exposed) using CLINAC electrons, and 25 mice were assigned to x-ray exposures (9 of them sham-exposed). To compensate for spectral spillover, cells stained with single fluorescence only were acquired using the compensation wizard on INSPIRE software (488 nm laser on with the brightfield and side scatter inactivated). The compensation coefficients were determined automatically by the IDEAS software (Luminex ver. 6.2) to create a compensation matrix.
Analysis imaging flow cytometry images and spectral data were performed on IDEAS software (version 6.2), similar to previous work done in our laboratory 22 . As seen in Fig. 1, we developed a uniform analysis template to quantify the Mean Fluorescence Intensity (MFI) of each biomarker in non-apoptotic mouse leukocytes, CD19 + (B cell) and CD3 + (T cell) populations. Figure 1A illustrates our cell gating methods, as follows: To select only focused cells for analysis, images of cells were visually inspected, and a region with X coordinate beginning at 57.87 was set on the brightfield (BF) Gradient root mean square (RMS) feature ( Fig. 1 Ai). Single cells were selected by creating a gate in a bivariate plot of BF Aspect Ratio versus BF Area ( Fig. 1 Aii). Healthy cells were selected by creating a gate in bivariate plot of BF Circularity versus BF Contrast, thus excluding apoptotic cells ( Fig. 1 Aiii). Regions CD19 + and CD3 + were created to select for B and T cells, respectively ( Fig. 1 Aiv). The Mean of Fluorescence Intensity (MFI) value of each biomarker within all healthy leukocytes, CD3 + , and CD19 + cell populations was then computed by the IDEAS software (Fig. 1B). This analysis template was applied to all data files and automatically batch processed within IDEAS.
Combining conventional flow cytometry and IFC data. As described above, we quantified the leukocyte subtypes, using two samples from the same mouse: From the first sample, we obtained raw concentration values from interrogating surface labeled fresh whole blood via conventional flow cytometry ("ln_Bcells / ln_T_cells", as described in the methods). Later, a second sample was prepared involving fixing, permeabilizing, and multiple washes, from which we obtained percentages of surface labeled subtypes present in the total number cells analyzed on the IFC ("Percent_T cells / Percent_B cells", as described in the methods). Due to the inherent differences in the preparation methods for these pre-and post-processed samples and the capabilities of the instrument they were interrogated with, each sample generated data with different metrics: The pre-processed raw (non-fixed) counts measured by conventional flow cytometry are likely to provide values that more closely reflect absolute cell numbers in the sample. These numbers represent exponential cell killing by radiation and are therefore log-transformed. In contrast, the IFC-prepared samples undergo several processing steps towards the measurement of intracellular and surface biomarker labeling of the B and T cell subtypes, all of which are based on brightfield morphology and refined by several image gating steps (as seen in Fig. 1). Therefore, it is of interest to look at both methods of quantifying blood counts in determining correlation with radiation dose and exposure type (Interaction).
Data set for machine learning analyses. Biomarker signals and conventional flow cytometry blood counts were natural log (ln) transformed to bring their distributions closer to the normal distribution. The main variables in the resulting data set were: The radiation dose (Dose, in Gy). The exposure type (Exposure), with 1.0 for TBI and 0.5 for PBI. The product of dose and exposure type (Interaction), with 1.0 × Dose for TBI and 0.5 × Dose for PBI. Sex, with 0 = females and 1 = males. Radiation type (Radiation_type), with 0 for electrons and 1 for x-rays. Ln-transformed B and T cell counts (ln_B_cells and ln_T_cells, respectively) from CytoFLEX measurements are given in events/µl. Percentages of cells displaying CD3 or CD19 surface markers (Percent_T cells and Percent_B cells, respectively) from IFC measurements are given as a percentage of all healthy, single cells which were analyzed. Ln-transformed signals (from the Intensity_MC_Ch02, Mean, healthy & single & focused channel) for the radiation-responsive biomarkers. This data set is provided in Supplementary_table_S1 online. Dose and Interaction were treated as the target variables to be predicted by the ML models, using the other variables (except Exposure) as predictors.
Machine learning analysis procedure. We imported the data into the R 4.2.0 41 programming language.
We used the geometric mean of unstained blood samples from each batch to normalize biomarker fluorescence intensities and reduce potential differences in signal intensities between experimental batches (i.e., groups of mice irradiated on the same day). Analyses and visualizations of the data were performed in R and in Microsoft Excel software.
We split the data set randomly into halves for training and testing. We used the Boruta feature selection algorithm (implemented by the Boruta R package) 42 to identify and discard any weak predictor variables, which would www.nature.com/scientificreports/ not be useful for reconstructing the Dose or Interaction variables. Boruta iteratively compares the importance score of each predictor with the importance score of its randomly shuffled "shadow", in the context of a random forest model 42 . It duplicates the data set and randomly shuffles the values in each column. These shuffled values are called shadow features, and they are re-created in each iteration. Those predictors that had significantly (p-value < 0.05 with Bonferroni correction) worse importance than shadow features during Boruta implementation on a randomly selected training half of the data were discarded from further analysis. We trained the random forest (RF) ML algorithm 43 on the training portion of the data set, using all predictor variables retained by Boruta, to predict Dose or Interaction. Each of these RF models was refined by grid search hyperparameter tuning, using the caret and ranger R packages, separately for each of the two target variables. In addition, we trained a separate RF model in classification mode to distinguish between exposed and unexposed samples (i.e. those with radiation Dose > 0 versus 0).
The strengths of the RF algorithm include its ability to model non-linear relationships and interactions between variables, and its low sensitivity to correlations between predictor variables and to outlier observations 43 . RF generates many uncorrelated decision trees by bootstrap aggregation, or "bagging" (randomly selecting samples from training data with replacement) and feature randomness (selecting a random subset of predictor variables for each tree). Predictions from all trees are then averaged for regression problems such as the one here.
To counteract the problem of overfitting, we trained each RF model using repeated k-fold cross validation (threefold, repeated 100 times) on the training data, and evaluated its performance on the testing data. Three performance metrics were used for evaluation on each of the target variables (Dose or Interaction): mean absolute error (MAE), root mean square root error (RMSE) and coefficient of determination (R 2 ).

Results
Biomarker and blood cell count dose responses for PBI and TBI. The dose responses for B and T cell counts obtained by conventional flow cytometry after TBI or PBI exposures are shown in Fig. 2. Overall, these results show that despite some variability between different mice, it is clear that the dose response slopes were markedly different for TBI (red) and PBI (blue) exposures. The linear regression analysis which generated the fitted lines in Fig. 2 showed that the PBI slopes were roughly twofold lower than TBI slopes, reflecting that PBI was half-body in this case and that the differences in slopes between TBI and PBI were statistically significant: p-value = 6.13 × 10 −7 for B cells and 2.65 × 10 −8 for T cells. In each case, the null hypothesis was that the regression slopes are equivalent for PBI versus TBI. Coefficient of determination (R 2 ) values are also shown in Fig. 2. These values (especially for TBI, 0.66 and 0.75 for B and T cells, respectively) suggested that most of the data variability was explained by the linear regression.
The TBI and PBI dose responses of the percentages of T and B cells that met the sequential gating criteria by IFC shown in Fig. 3. As described in the Methods section, these percentages represent a different metric, than the raw B and T cell counts shown in Fig. 2. The difference in the shapes of the curves in Figs. 2 and 3 may be due to the different methods in sample preparation: the raw counts (non-fixed) measured by conventional flow cytometry are more representative of a total population of healthy and dying cells, whereas the IFC-processed fixed/permeabilized samples gated for healthy T and B leukocyte subtypes were used to estimate in more detail how the percentages of different cell populations with different surface markers changed as a function of radiation dose and type. Importantly, for both types of metrics the dose responses looked considerably different for PBI www.nature.com/scientificreports/ versus TBI exposures. Consequently, both the raw B and T cell counts and the percentages of B and T leukocyte subtypes from IFC were incorporated as predictor variables into the dose reconstruction ML modeling. The measured radiation dose responses for the protein biomarkers DDB2, FDXR and ACTN1 are shown in Fig. 4. Here the ln-transformed fold changes are increasing with dose instead of decreasing, but also there is a clear and statistically significant difference in dose response slopes between TBI and PBI: p-value = 1.20 × 10 −3 for DDB2, 5.09 × 10 −4 for FDXR, and 3.02 × 10 −4 for ACTN1. The dose response slopes for PBI are approximately twofold lower than the corresponding TBI values, which supports the expectation that approximately half of the body was irradiated in the PBI scenario. The DDB2 and FDXR biomarkers showed the most reproducible dose response patterns among experimental batches, and therefore we focused on measuring these two biomarkers in the subsequent ML analyses.

Selection of strong predictors of radiation dose and exposure type. As described in Materials and
Methods, the data set was split randomly into training and testing halves. The training part was used for feature selection (i.e., identifying the most important predictors of Dose and Interaction), tuning and fitting of the RF model or each target variable. The testing part was used to evaluate model performance. A visualization of the matrix of Spearman's correlation coefficients between all variables (e.g., blood cell counts, biomarkers) in the training data is displayed in Fig. 5. It shows that many of the predictor variables were strongly correlated with the outcome variables, Dose and/or Interaction. The B and T cell counts were very strongly correlated with the outcome variables, and the selected protein biomarkers (especially DDB2) showed significant correlations as well.
To determine which predictor variables are most important and need to be retained for ML analysis, we implemented the Boruta feature selection algorithm 42 . Each predictor is retained only if it outperforms its "shadow" with a specified level of statistical significance (here set to 0.05 with Bonferroni correction). In this case, the Sex and Radiation_type variables did not pass the Boruta screening, suggesting that they are not very important for reconstruction of Dose or Interaction (where Interaction = Dose for TBI, and Interaction = Dose/2 for PBI). Specifically, Sex and Radiation_type outperformed noise in only 0-16.7% of Boruta iterations, whereas the other predictor variables (ln_B_cells, ln_T_cells, Percent_T cells, Percent_B cells, DDB2 and FDXR) did so in 87.5-100% of iterations. Therefore, the data from mice irradiated with TBI x-rays were not distinguished by www.nature.com/scientificreports/ Boruta screening from those data that came from electron-exposed TBI mice, which is biologically plausible since low-LET photons and electrons tend to have similar biological effectiveness per unit dose 44 . This finding of similarity between electron and x-ray effects in this study is supported by calculation of dose reconstruction performance metrics for testing TBI samples, separately for electron and x-ray irradiations. For electrons, R 2 = 0.949, RMSE = 0.539 Gy, and MAE = 0.413 Gy. For x-rays, the numbers were quite similar: R 2 = 0.962, RMSE = 0.501 Gy, and MAE = 0.403 Gy.
Machine learning results for dose and exposure type reconstructions. In this analysis, Interaction = Dose for TBI exposures, and Interaction = Dose/2 for PBI exposures. Two separate random forest models were fitted to the data. The set of predictor variables was the same for each model (ln_B_cells, ln_T_cells, Percent_T cells, Percent_B cells, DDB2, DXR), but the target variable to be predicted was different: Dose in one model, and Interaction in the other. The rationale for this approach was that in a hypothetical realistic situation where samples with unknown exposures are analyzed, both models will be used and predictions for both Dose and Interaction will be generated for each sample. This dual prediction is intended to be informative about the type and magnitude of exposure for the sample.
Among irradiated mouse samples in the testing data set, it was possible to discriminate between PBI and TBI by predicting the Dose -Interaction difference (Fig. 7). Here, only irradiated animal data was used, and unirradiated controls were excluded. Predicted values of Dose and Interaction, which were calculated by RF models as described above, were used to calculate the predicted Dose-Interaction difference for each irradiated sample. www.nature.com/scientificreports/ This difference was used to classify samples into the TBI or PBI classes, and classification results were compared with true known values of TBI or PBI for each sample to generate the ROC curve shown in Fig. 7. These results suggest that ML-based methods can be useful for detecting PBI exposures based on protein biomarker and blood cell count data as inputs. In addition, despite the variability in responses between individual mice, since the RF algorithm integrates information from several predictors (B and T cell counts and percentages, FDXR and DDB2 biomarkers), it was able to accurately classify samples as exposed or unexposed: classification accuracy on the testing data set was 92.2% and ROC curve AUC = 0.982 (95% CI: 0.953, 1.0). The comparisons of data with RF predictions are provided in Supplementary_table_S3 online.

Discussion
The objective of this study was to investigate the usefulness of radiation responsive protein biomarkers, in combination with blood cell counts, as potential rapid and high-throughput biodosimeters for PBI as well as TBI exposure situations. Enhancing the number of available tools for PBI biodosimetry is important because currently available techniques have limitations in terms of time-to-result, throughput and/or accuracy. We hypothesized that combining radiation-responsive protein biomarkers and blood cell counts in an ML model context can be used to generate quantitative reconstructions of the radiation dose for PBI as well as for TBI exposures. The results suggest that the top two intracellular protein biomarker expression (DDB2, FDXR), and immunophenotyping through either traditional flow (cell counts) or IFC (cell percentages after gating) correlated strongly with radiation exposure (Fig. 5), and showed consistent and reproducible dose-dependent radiation responses (Figs. 2, 3, 4). The slopes of these responses for biomarkers and blood cell counts were significantly different for TBI versus PBI irradiated mice. These findings support the expectation that PBI exposures "spare" a large fraction of blood cells from radiation damage. Based on these differences, an ML analysis using the RF algorithm was able to generate accurate reconstructions of PBI exposures, as well as TBI. The RF algorithm was also able to distinguish between unirradiated and irradiated samples. Consequently, it may be possible in the future to use separate RF models in a two-stage process to first classify each unknown sample as either irradiated or unirradiated, and then to distinguish TBI from PBI on those samples classified as irradiated.
It is important to emphasize that quantification of radiation dose, as well as classification of the exposure type as TBI versus PBI, are important first steps, but ultimately they need to be followed by much more detailed assessment of the possible acute and long-term health effects of the exposure. In other words, predicting the likely symptoms of irradiation and taking the correct actions to prevent and/or mitigate them is the ultimate goal, where dose and exposure type reconstruction are the initial steps.
We believe that the results of this study are the first use of intracellular protein and cell surface biomarkers for biodosimetry in an ML context and support the potential usefulness of the proposed approach for biodosimetry in practical mass-exposure situations, such as improvised nuclear device explosion scenarios, for time points soon after the event (e.g., 24 h), as well as for longer time points. However, limitations of the current study include selection of one age group only (young adult), a single (half-body) PBI shielding setup, and only two non-zero dose levels. Other limitations include the use of a single ML method (RF) and, ultimately, the challenges of translatability from the mouse system to humans. Also, the classification of exposures into TBI versus PBI categories, as performed here, is a simple "extreme" representation of a more complex picture of inhomogeneous radiation exposures, which was used here mainly as a proof of principle to develop/refine biodosimetry methods.
We are planning to address the first three of these limitations by acquiring young (4 week old) mice, using a hind leg shielding set up (which shields only a small percentage of the bone marrow), and investigating the performances of other state of the art ML methods, such as extreme gradient boosting (XGBoost) 45 , to improve the dose and exposure reconstructions. The age issue is particularly important since radiosensitivity, for example carcinogenesis, can be higher in pediatric populations than in adults 46,47 . We plan to assess whether or not the same exposure reconstruction approaches and choice of predictors are applicable to young mice as well as to adults, or whether different approaches and biomarkers are needed for different age groups.
In summary, we are developing a biomarker-based FAST-DOSE biodosimetry assay that can be used to rapidly quantify intracellular and surface protein markers to accurately estimate absorbed dose after exposure to TBI and PBI. The current study shows that this approach can distinguish between PBI and TBI exposures and quantify them, but this was only a first step, which used a limited number of dose levels and a single -half-body -PBI exposure condition. The development of an in-the-field FAST-DOSE biodosimeter for estimation of absorbed radiation dose in potentially exposed individuals shortly after radiation exposure would allow for rapid triage and treatment decisions prior to sending blood samples for more accurate cytogenetic testing 32,33 . In future work, we plan to further validate this system using more doses and PBI exposure scenarios, and to optimize its performance for time points up to a week after radiation exposure and to transition the top biomarker candidates to an in-the-field deployable device.

Data availability
All datasets analyzed during the current study are available in Supplementary_table_S1 online and also available from corresponding author on reasonable request.