## Introduction

Cell cycle is a process of fundamental importance, at the basis of cell growth and replication. In eukaryotic cells it is typically divided in four phases: $${\text{G}}_{1}$$-phase, in which the cell gets ready to DNA synthesis; S-phase, when DNA replication takes place; $${\text{G}}_{2}$$-phase, when the cell begins preparation for mitosis; and M-phase, where cell division takes place. Each phase is characterized by a different length. Various proteins, cyclins and cyclin-dependent kinases, regulate transitions between two phases and block progression in specific checkpoints if the system detects errors1.

Exposure to ionizing radiation is known to affect cell-cycle progression: radiation causes DNA damage, and an arrest in cell-cycle progression follows as the cell activates DNA repair mechanisms. A “successful” repair can lead to further progression in the cycle (recovery of the arrest, with a resulting delay), while, if the repair is unsuccessful, the cell can eventually die thus exiting the cycle or progress fixing alteration (genome instability). Cell-cycle arrests following radiation exposure have been observed experimentally mainly as accumulation in $${\text{G}}_{2}$$/M-phase, followed by possible $${\text{G}}_{1}$$-phase arrest. It is also well known that cells in different phases of the cycle at the time of exposure show a different sensitivity to the radiation insult, with cells in S-phase being generally more radioresistant than cells in $${\text{G}}_{1}$$ and $${\text{G}}_{2}$$, and cells in M being highly radiosensitive. Quiescent cells in a $${\text{G}}_{0}$$ state are also radioresistant, as they cannot undergo clonogenic death until brought to re-enter the replicative cycle.

Findings on the interplay between radiation action and cell cycle progression are applied in radiation therapy for cancer treatment. In particular, when a fractionation scheme is used, the total dose is split in smaller fractions: this allows the redistribution of surviving cancer cells within the cell cycle, the repair of sub-lethal damage, the re-oxygenation of the tumor and repopulation of normal and malignant tissues2. Chemotherapeutic drugs also affect cell cycle progression, and their action can as well be phase-specific, e.g. interfering with replication in the S-phase or damaging the formation or dissociation of the mitotic spindle in the M phase. The treatment effectiveness will be finally dependent on several factors, as the spatial distribution of the tumor cell mass (oxygenation heterogeneity), the timing of the drug/radiation dose delivery, the time between doses, the specific radiosensitivity of the tumor, etc. From the combination of treatment-dependent perturbations of cell-cycle progression and cell-cycle-dependent therapeutic sensitivity we get the rationale behind the use of kinetically-based administration protocols of chemotherapy and radiation therapy: as a general consideration, favouring synchrony and arrest of cells at a particular cell-cycle phase can improve the effectiveness of the next dose of radiation/chemotherapy, administered within an appropriate time so that synchrony/arrest is not lost3.

For radiation treatments, the modelling of the perturbation of the cell cycle might then be used as an input to refine the evaluation of the Tumour Control Probability (TCP), defined as the probability that no cancer cells clonogenically survive, and to optimize the fractionated treatment protocol in terms of fraction numbers, dose per fraction and time between fractions4. Possible synergistic effects of concurrent treatments with radiation and chemotherapeutic drugs have to be explored, especially, in perspective, going from conventional radiotherapy to particle therapy for radioresistant tumours, in both target- and healthy tissues. Clinically driven mathematical models can be used for this purpose as tools to understand, study, and provide useful predictions related to the outcome of various treatment protocols used to treat human malignancies. The use of such tools could speed up delivery of efficacious treatments to patients, providing indications prior to beginning actual testing and long and costly clinical trials, and also preventing the use of potentially sub-optimal treatment combinations5.

Tools of this kind, to be used in a pre-clinical/clinical framework, have to rely on solid computational models able to describe cell cycle progression and predict the outcome of a given perturbation. Different cell-cycle models have been developed, greatly varying in complexity, from compartmental models based on ordinary differential equations (ODEs), to multi-scale models predicting population growth, possibly taking into account intracellular biochemical processes or factors of the cell environment that affect the fate of each individual cell. Generally speaking, models limited to the prediction of the distribution of cells in the cycle have a deterministic nature, their output being fully determined by parameter values and initial conditions. Different options are available: the model can include explicit expressions for the concentration of regulators of cell-cycle progression and their time evolution (usually limited to essential interactions), thus providing a molecular insight on the system6. In this case, model parameters are activation and degradation rates of regulatory proteins and their concentration. Alternatively, the model can include expressions for the percentage of cells that are found to be in a given cell-cycle phase, thus providing a “population overview”7,8. Model parameters are then transition probabilities between different phases. The perturbation of the system is finally described by a variation of the values of parameters that govern its evolution. Radiation action can be described as leading to an outcome subject to probability laws, as it is the case for clonogenic cell survival. This would suggest the use of a stochastic model, where the same set of input parameters and initial conditions will lead to an ensemble of different outputs. When describing cell survival coupled to the perturbation of the cell cycle, hybrid models can be implemented, incorporating randomness in the deterministic evolution of the system5. The choice of which model and how to implement it necessarily depends on the needs and aims of the study, and above all, on the availability of data for model benchmark.

The purpose of this work is to set the basis of a computational model to specifically describe how cells that survive radiation exposure are distributed in the cell cycle. Flow cytometry is adopted to provide dedicated experimental data for the testing and benchmark of this model. For a first proof-of-concept validation of the potential of the model we obtained data on well-characterized healthy human fibroblasts (IMR90 cell line). Cells were exposed to X-ray doses of 2 Gy (as a typical radiotherapy dose per fraction) and 5 Gy, and followed from 6 h up to 72 h post irradiation. Using flow cytometry, the position of the cell in the cycle is primarily determined measuring its DNA content. We therefore present a new implementation from a previously published deterministic model (up to now applied to cell-cycle perturbation by chemotherapeutic drugs only), explicitly taking into account DNA content as a variable, besides the time variable for the temporal evolution of the system9. The developed model is able to describe how radiation modifies in time the percentage of cells in all phases of the cycle, with a benchmark also on the distinction of $${\text{G}}_{2}$$- and M-phases (at difference with previous implementations). Radiation-induced perturbations and their evolution in time are reproduced varying two of the model parameters, representing transition rates from $${\text{G}}_{1}$$- to S-phase and from $${\text{G}}_{2}$$- to M-phase. Upon further developments, that we later discuss in great detail, this model can be integrated in the framework of tools to be developed for predictions/optimization with cancer cells and future prediction about cancer therapy, considering the dynamics of cell-cycle progression and its alteration by chemotherapeutic drugs and radiation and, eventually, cell death leading to the control of tumour cell population.

## Results

The structure of this paper is as follows: we first present experimental measurements with the IMR90 cell line. We then introduce the theoretical model and discuss how model parameters are estimated reproducing the unperturbed experimental condition (cells in exponential growth in absence of radiation exposure). We finally show that the modification of such parameters allows to reproduce cell-cycle perturbations induced by radiation. All details on cell culture and experimental procedures and on the mathematical model and its MATLAB implementation are given in the "Methods" section at the end of the manuscript.

### Experimental results

We present experimental data obtained to characterize the human fibroblast cell line IMR90 in terms of: i) the growth of the cell population as a function of time in the control (pre-treatment, unirradiated) condition; ii) how the distribution of vital cells in the different cell-cycle phases is affected, again as a function of time, when cells are either sham-irradiated (0 Gy) or exposed to X-ray doses of 2 and 5 Gy. In all experimental conditions cells are followed in time until 72 h.

#### Estimation of population doubling time in the pre-treatment condition

In order to characterize the growth of the control cell population we estimated the population doubling time ($${\text{T}}_{\mathrm{D}}$$). As later discussed, this parameter is necessary in the model as a constraint to determine the average time spent in each cell-cycle phase. In Fig. 1 we report the average number of cells in unirradiated conditions for each time point (counted with a Bürker chamber), fitted to an exponential function. The doubling time can be derived from the fit using the relation:

\begin{aligned} T_D=\frac{\ln {2}}{\gamma } \end{aligned}
(1)

where $$\gamma$$ is the exponent coefficient of the fit function (see “Methods”). The estimated $${\text{T}}_{\mathrm{D}}$$ is of 38.4 (CL at 95$$\%$$ $$\left[ 31.7, 48.8\right]$$).

#### Flow-cytometry analysis for cell-cycle distribution

Cells were analyzed with flow cytometry to discriminate the different phases of the cell cycle. All details are given in the "Methods" section, representative flow-cytometry panels to illustrate the gating strategy are also reported in Fig. 2. Results on experimental percentages of cells in each phase of the cell cycle for sham, 2 Gy and 5 Gy are given in Table 1 and plotted in Fig. 3. Data are given as average between both biological (3) and technical (2) replicates with the corresponding standard error of the mean.

#### Cell-cycle distribution in unirradiated conditions

Figure 3a shows cell-cycle data representing the sham condition. We observe almost constant percentages in cell-cycle phases versus time in the early time points, while at 48 h and 72 h the percentage of cells in $${\text{G}}_{1}$$-phase increases from 61% (average value between 0 h and 24 h) to 79%. Data from Fig. 3a suggest that already at 48 h cells are approaching confluence, and their distribution in the cell cycle is affected. Based on these results, time points until 24 h only can be considered to have cells in an exponential growth phase.

Figures 3b,c show cell-cycle data for samples irradiated with 2 Gy and 5 Gy, respectively. Compared to the sham condition, for both doses there is a strong increase of cells in $${\text{G}}_{2}$$-phase at the early 6 h time point. A block in $${\text{G}}_{2}$$ following radiation exposure is to be attributed to the activation of cell repair mechanisms. For the 2 Gy condition, the percentage of cells in $${\text{G}}_{2}$$ reaches its maximum at this time point, and later decreases to an average value of 8.10% (between 24 h and 72 h), comparable with the average value for the sham between 0 and 24 h (8.07% ). Correspondingly, the percentage of $${\text{G}}_{1}$$-phase cells goes up in the same time-frame to an average value of 85.90%, and the percentage of S-phase is drastically reduced under 10% for any time point.

In the 5 Gy condition, the percentage of cells in $${\text{G}}_{2}$$ is maintained higher at all the time points, being maximal at 16 h with a value of 37.54%. A similar increase of $${\text{G}}_{1}$$-phase is shown at later time points, with a peak at 48 h. This latter effect can be both attributed to a block in $${\text{G}}_{1}$$, and to influx and accumulation in $${\text{G}}_{1}$$ of cells surviving the earlier block in $${\text{G}}_{2}$$ that return cycling.

Using data from Table 1 we calculated the relative differences in percentages of cells in cell-cycle phases between the sham and the two irradiated conditions. Results are given in Table 2 and plotted in Fig. 4. The value calculated for each time point is:

\begin{aligned} \delta \%=\frac{f_{dose}-f_{sham}}{f_{sham}} \end{aligned}
(2)

where $$f_{\textit{dose}}$$ and $$f_{\textit{sham}}$$ are, respectively, the percentages for the irradiated condition and for the sham. From Fig. 4 we observe that the relative difference $$\delta$$ has a similar trend versus time for the two doses as far as S, $${\text{G}}_{1}$$ and M-phases are concerned, though with different absolute values. The peak in $${\text{G}}_{2}$$-phase at 6 h for 2 Gy appears shifted to 16 h-24 h for the 5 Gy condition.

It has to be noted that relative differences with the sham condition at time points later than 24 h are affected by the fact that the unirradiated cells are approaching confluence, leading e.g. to the pronounced peak for the $${\text{G}}_{2}$$-phase at 72 h for 5 Gy.

### Data reproduction with the mathematical model

Starting from the work of Basse et al. 20039, we extended a mathematical model for a population of cells describing their position within the cell cycle. The model is based on a system of partial differential equations that governs the kinetics of cell densities in all phases of the cell cycle, depending on time t and DNA content x and, only for DNA synthesis, on an age-like variable $$\tau _S$$ describing the time since arrival in S-phase. The explicit consideration of DNA content as a variable in the model allows, in principle, tuning of model parameter to reproduce experimental flow-cytometry profiles.

In this section we briefly describe the general features of the model, we then turn to reproduction of experimental percentages of cells in cell-cycle phases and extraction of model parameter values. Full details on the model formalism and on the strategy for data reproduction are given in the "Methods" section.

#### Mathematical model

The model is a classical compartmental model with four main compartments, one for each sub-population of cells in a given cell-cycle phase. Four dependent variables: $${\text{G}}_{1}$$(x,t), S(x,t), $${\text{G}}_{2}$$(x,t) and M(x,t) represent the density number of cells with relative DNA content x, scaled to $$\textit{x} = 1$$ in the $${\text{G}}_{1}$$-phase, at time t. The main parameters of the model are the transition rates that control the transfer of cells between phases. The dynamics between the compartments is represented by the following equations (3):

\begin{aligned} \frac{\partial G_1}{\partial t} \left( x,t\right)&= 2^2b M\left( 2x,t\right) - \left( k_1 + \mu _{G_1} \right) G_1 \left( x,t\right) \end{aligned}
(3a)
\begin{aligned} \frac{\partial S}{\partial t} \left( x,t\right)&= D \frac{\partial ^2S}{\partial x^2} - g \frac{\partial S}{\partial x} \left( x,t\right) + k_1G_1 \left( x,t\right) - \mu _S S\left( x,t\right) - I\left( x,t;T_S\right) \end{aligned}
(3b)
\begin{aligned} \frac{\partial G_2}{\partial t} \left( x,t\right)&= I\left( x,t;T_S\right) - \left( k_2+\mu _{G_2}\right) G_2\left( x,t\right) \end{aligned}
(3c)
\begin{aligned} \frac{\partial M}{\partial t} \left( x,t\right)&= k_2G_2 \left( x,t\right) - bM\left( x,t\right) - \mu _MM\left( x,t\right) \end{aligned}
(3d)

Each equation in (3) describes the variation of cell density in the corresponding phase. This is generally determined by a source term, that gives the influx of cells coming from the preceding phase, minus the loss of cells transitioning to the next phase and that due to cell death. The source term in $${\text{G}}_{1}$$-phase is given by the rate of divisions per unit time b. For the S-phase we assume that, on average, the time spent during DNA replication ($${\text{T}}_{\mathrm{S}}$$) is constant for all cells. This implies that a cell entering the S-phase either exits after $${\text{T}}_{\mathrm{S}}$$ hours or dies. In addition to the influx from $${\text{G}}_{1}$$ and the loss term due to death, in Eq. (3b) for the S-phase we have: the first term that is a dispersion term, taking into account the experimental variability of signal at increasing DNA content; the second term that describes the increase of DNA content at an average growth rate of g per unit time (instead, DNA content is constant in all other phases, so similar terms are not needed); the last term $$I\left( x,t;T_S\right)$$, that represents the sub-population of cells which entered S-phase $${\text{T}}_{\mathrm{S}}$$ hours before and are therefore ready to exit from S to enter $${\text{G}}_{2}$$. This latter term is the influx term in Eq. (3c) for the $${\text{G}}_{2}$$-phase. Full details on the model are given in the original work9. In normal conditions, cells are considered in a state of asynchronous balanced growth, where the percentage of cells in each of the four phases of the cycle remains constant. Experimentally, for an in vitro system, this corresponds to unperturbed exponential growth and requires cells to be far from confluence. This state represents the so-called steady DNA distribution (SDD)10. Rates of transition between phases are also constant in this state. It has been shown that in such condition, the variable x is both uniquely determined by model parameters and independent of the initial distribution at time $$\textit{t} = 0$$11. A theoretical solution on the infinite temporal domain $$-\infty<t<\infty$$ exists, and asymptotically approaches the SDD condition8. For the purpose of this work, focusing on vital cells only, all $$\mu _{i}$$ parameters are set to 0. Also, a matricial formulation is adopted for the model implementation in MATLAB, as described in the "Methods" section.

#### Initial parameter values from experimental data

Experimental data on percentages of cells in cell-cycle phases for the sham condition are used to estimate model parameters for the SDD state. Average percentages among the first 4 time points (including the 0 h pre-treatment sample), excluding time points for which cells get close to confluence, are as follows: $$\textit{f}_{{\text{G}}_{1}}= 61.06 \pm 1.59$$; $$\textit{f}_{\mathrm{S}}= 30.07 \pm 1.52$$; $$\textit{f}_{{\text{G}}_{2}}= 8.07 \pm 1.01$$ and $$\textit{f}_{\mathrm{M}}= 0.80 \pm 0.28$$ (see Fig. 3). Using the experimental value of the doubling time $${\text{T}}_{\mathrm{D}}$$, and an estimate of the time spent in mitosis as $$T_M = T_D\cdot f_{M}$$, we can apply Steel’s formula (see Methods, Eqs. 1720) and obtain initial guess values for model parameters: $$k_{1} = 0.0496$$, $$\textit{g} = 0.0740$$, $$k_2 = 0.2272$$, $$\textit{b} = 3.2577$$.

#### Model performance for the unperturbed condition

Given the set of initial parameters, the model can be run. For its evolution, a step-size in time $${\text{h}}_{\mathrm{t}} = 0.05 \, h$$ is used. The model starts with 100% of cells in $${\text{G}}_{1}$$-phase. As a general criterion, we consider that convergence to a steady profile is reached when the sum of squared differences between percentages of all phases at time t and those at $$(t-5)$$ h becomes less than $$10^{-5}$$.

When running the model with the initial parameters given above, convergence is reached in $${\text{T}}_{\mathrm{SDD}} = 40.58 \, \text{h}$$. The corresponding model evolution is shown in Fig. 5a and compared to experimental data. In the SSD condition, model percentages are only close to experimental data, though within errors. Model parameters can therefore be adjusted, starting from initial guess values, to obtain a better reproduction of experimental data. This is done via $$\chi ^2$$ minimization (see later in Methods, Eq. 21). Best fit parameters obtained via minimization are as follows: $$k_1 = 0.0429$$, $$\textit{g} = 0.0796$$, $$k_2 = 0.2347$$, $$\textit{b} = 2.3655$$, and the corresponding evolution of the model and reproduction of the experimental unperturbed condition are reported in Fig. 5b.

#### Data reproduction strategy for the perturbed condition

The unperturbed cell population is growing exponentially at $$\textit{t} = {\text{T}}_{\mathrm{SSD}}$$, before treatment. We then consider that cells are exposed to radiation at $$t^\prime >T_{SSD}$$. For the sake of comparison to experimental data, we rescale the perturbation time $$t^\prime$$ to 0. After the exposure, the effects of radiation are manifested as alteration of the transition rates between compartments.

A $$\chi ^2$$ minimization strategy is not practical to find model parameters reproducing experimental data for the perturbed condition, as we justify in detail in the "Methods" section. Therefore, we use a simplified strategy for data reproduction, based on the following considerations:

1. 1.

alterations of $$k_1$$ and $$k_2$$, representing the transition probability from the gap phases to S or M, are expected to be more important. This is supported by biological evidence, as cell-cycle checkpoints exist in $${\text{G}}_{1}$$ and $${\text{G}}_{2}$$-phases, and a larger impact on the population of such phases is observed in experimental data;

2. 2.

from the reproduction of data for the sham condition it is known that g and b parameters have less importance in the $$\chi ^2$$ minimisation. We can assume that they are less modified by radiation, which is also supported by biological evidence, as S-phase and M-phase maintain an average constant duration.

To further simplify our strategy, we assume at first approximation that parameters variations can be described with step functions versus time, thus altering their values in fixed time interval. We recall that the main features of the cell-cycle perturbations that have been measured experimentally and that we want to reproduce are: an early $${\text{G}}_{2}$$ block (peaked at 6 h from irradiation for 2 Gy, and at a later 16 h time point for 5 Gy), and, after that, a strong increment of cells in $${\text{G}}_{1}$$-phase, that persists in time (see previous sections). We find that the following alteration of model parameters $$k_1$$ and $$k_2$$ lead to a good reproduction of experimental data:

1. 1.

for the 2 Gy condition, $$k_2$$ is divided by a factor of 12 in the interval 0 h-6 h after irradiation. After that, from 6 h to 16 h, $$k_2$$ is increased up to half its initial value, and later maintained constant throughout the simulation; at the 6 h time point $$k_1$$ is divided by a factor 10 and maintained constant until 48 h, when its current value is doubled and later kept constant until the end;

2. 2.

for the 5 Gy condition, $$k_2$$ is divided by a factor of 12 (as for the 2 Gy condition), but this value is maintained for a longer time, until 16 h after irradiation, to reproduce the longer block in $${\text{G}}_{2}$$-phase. At the 24 h time point, $$k_2$$ is restored to 1/4 of its initial value, and later maintained constant. At 16 h, $$k_1$$ is divided by 10 and later maintained constant.

Model calculations are shown in Fig. 6 and compared to experimental measurements. As it can be observed in the figures, simple changes in the parameters as a function of time lead already to a very good agreement between model and data. Note that, to provide a visual reference for phase percentages in the unperturbed condition, the time scale is shifted such that the irradiation time is at 10 h in the plots.

## Discussion

Starting from the work of Basse et al.9, we implemented a computational model to reproduce how cells that survive radiation exposure are distributed in the cell cycle as a function of dose and time after exposure. To our knowledge, the original model had not been applied to cell-cycle perturbations induced by ionizing radiation, but only tested for chemotherapeutic drugs.

This quantitative approach to model cell-cycle perturbation underlies the following assumptions: transitions between phases are complete and irreversible12,13,14; and perturbations, e.g. DNA damage, can affect the timing of these transitions15. The model is deterministic, built with four compartments, each representing cells in a given phase of the cycle. The position in the cycle is given by the value of a DNA content variable, and the evolution in time is governed by a set of differential equations. Model parameters are transition rates (between $${\text{G}}_{1}$$- and S-phase and between $${\text{G}}_{2}$$- and M-phase), cell division rate and DNA synthesis rate. For the MATLAB implementation presented in this work we adopted a matrix formulation of the model, explicitly taking into account the time variable only, while information on the DNA content can be recovered a posteriori once the model is solved. The model gives as main output percentages of cells in each phase of the cycle. A DNA content distribution can also be obtained, to be directly compared to a cell-cycle profile acquired with flow cytometry (which is a distribution of signal intensity versus a quantity proportional to DNA content).

The performance of the model was tested using data on IMR90 cells, a well-characterized healthy fibroblast cell line. It is important to state that the formulation of the model is general, and does not depend on the specific cell line under investigation. Cells in exponential growth were exposed to X-rays, with doses of 0 Gy (sham-irradiated), 2 Gy and 5 Gy, and their cell-cycle distribution was measured via flow cytometry at multiple time points after exposure: 6 h, 16 h, 24 h, 48 h and 72 h. At difference with previous implementations of similar models, data include explicit discrimination of cells in $${\text{G}}_{2}$$- and M-phase, using a flow-cytometry protocol identifying M-phase through the detection of pH3 phosphorylation during mitosis.

Experimental results indicate that unperturbed cells in exponential growth settle over time to a steady DNA distribution, in agreement with expectations, in which the percentages of cells in each phase remain constant in time. When cells are exposed to X-rays, a block in $${\text{G}}_{2}$$-phase in observed (at the early time point of 6 h after irradiation for 2 Gy and at 16 h for 5 Gy). After that, the percentage of cells in $${\text{G}}_{2}$$-phase decreases. The recovery to the pre-treatment condition is not total in both cases and seems to be dose dependent. At the same time cells accumulate in $${\text{G}}_{1}$$-phase right after the block in $${\text{G}}_{2}$$-phase, with a modification that seems persistent in time.

The proposed modeling approach is phenomenological: model parameters have to be adjusted to reproduce experimental data. As a general strategy, we obtained the set of model parameters needed to reproduce the sham condition, and assume that any perturbation can be described by the alteration of these parameters. The trends observed in experimental data for irradiated conditions are reproduced by varying only two of the parameters: $$k_1$$ and $$k_2$$, respectively the transition rate from $${\text{G}}_{1}$$- to S-phase and from $${\text{G}}_{2}$$- to M-phase. This is consistent with the biological knowledge of the main role of the two checkpoints in the gap phases to control the cell-cycle progression. Interestingly, as the analysis is limited to living cells, the model parameters (giving the entity of the perturbation) are not so different in the two perturbed conditions. We observed instead a difference on the duration of the blocks in gap phases, and on the ability to recover to the unperturbed condition. Therefore, a perturbation of similar entity was applied to $$k_2$$ for both doses, but kept for longer times for the higher dose, with a later recovery to a lower value at 5 Gy compared to the 2 Gy. Similarly, the same perturbation is applied to $$k_1$$ but a later time point (16 h vs. 6 h) for the highest dose condition, probably representing (more than a block in $${\text{G}}_{1}$$-phase) the renewed influx to $${\text{G}}_{1}$$ of cells surviving the $${\text{G}}_{2}$$ block, which coherently happens later for the higher dose.

How parameters are varied in order to reproduce data also gives an insight on possible biological interpretations: up until 4 - 6 h post irradiation only a slowing of S-phase entry is observed, even at high doses (10 Gy), and this can be explained by the principal molecular mechanisms for $${\text{G}}_{1}$$ arrest, that is a slow process involving transcriptional activation by p53 of p21 that leads to inhibition of pRb phosphorylation and $${\text{G}}_{1}$$ arrest16,17,18,19.

In this work a $$\chi ^2$$ fit strategy involving all the experimental time-points is used for the reproduction of the unperturbed condition only, while for the irradiated conditions a minimization is performed in discrete time intervals, giving as results step functions for parameter variation as a function of time. Perturbed parameters therefore represent average rates in discrete time interval. A first improvement of the reproduction strategy of the perturbed data would be to refine the time dependence of the perturbed parameters, introducing linear functions with imposed continuity conditions between time points. Furthermore, it would be necessary to implement a $$\chi ^2$$ fit strategy also for the perturbed conditions that could include all time-points simultaneously. This deserves further investigation as it is expected that the objective functions will have a complex behaviour as a function of model parameters. Also, the perturbed biological system is less easy to characterize with additional constraints, as the population increase rate, which was measured only for the unperturbed condition in this work. Optimization of the reproduction of the shape of the cell-cycle profile could also be performed, fully exploiting the flow-cytometry data.

It has to be stressed that results presented in this work in terms of percentages of cells in the different phases of the cycle always correspond to the fraction of living cells only. The importance of cell death cannot be quantified from our dataset. This information could be added experimentally and used as input for further model developments: the model could be expanded introducing a new compartment of dying cells. This same formulation of the model can be already adapted to reproduce cell death data adjusting the values of death rate parameters (currently set to 0). Dedicated experimental flow-cytometry protocols must be planned, e.g. using Annexin V with FITC versus Propidium Iodide, that can discriminate cells based on viability, and possibly identify the death mechanism (apoptosis vs. necrosis).

Additionally, a quiescent cell compartment could be added to the model to take into account the existence of the $${\text{G}}_{0}$$-phase. What we observe in our dataset as an increase in the population of $${\text{G}}_{1}$$ after irradiation could also be due to cells that exit the replicative cycle and accumulate in a quiescent $${\text{G}}_{0}$$-state.

It is also of interest to mention that an alternative approach could have been taken from the start, using as experimental model a synchronized cell population. Synchronizing cells in a given cell-cycle phase and monitoring their rate of transition to the next phase, one could also obtain an estimate of model parameters. It can be expected that such parameters would be close to those extracted in our work for the unirradiated condition. However, when irradiating a synchronized cell population, the perturbation to the parameter describing the transition rate to the next phase would be more strongly dependent on the specific radiosensitivity of that phase. In our approach we irradiate an asynchronous cell population, and parameters that describe the average behaviour are affected by the initial distribution of cells in the different phases of the cycle in an interdependent way. The comparison between these two approaches deserves further investigation.

To conclude, the computational model we present in this work has the potential to reproduce perturbations of cell cycle following exposure to ionizing radiation. We here focus on the percentage of living cells in different phases of the cycle as a function of radiation dose and time, offering a proof-of-concept validation of the model with flow-cytometry experimental data on a chosen cell line. The model formulation is very general, such that it can be easily applied to reproduce different datasets. In particular, applications to cancer cells will be needed, in the perspective of using this model in the context of radiotherapy research. The response of healthy or cancer cells can be tested at different doses, also to address more specifically either effects to healthy tissues or to the target tumour mass. Different types of radiation or different agents as drugs can be considered. Biological insight on underlying response phenomena can be gained, comparing model parameters that are necessary to reproduce different datasets. By construction, the model can be easily extended to the consideration of cells exiting the cycle, either undergoing cell death or transition to a quiescent state. New dedicated datasets should be acquired to benchmark possible model extensions.

A model of this kind constitutes the core of more complex tools for pre-clinical/clinical use. These tools should be able to include cell-cycle perturbations for predictions in radiotherapy, hence considering characteristics of the tumour microenvironment. From the experimental point of view, further application of our model to spheroids20 or organoids would be beneficial in this sense. From the modeling point of view, advanced model development could include the influence of the microenvironment, starting with the consideration of cell distribution in space (e.g. on a 2D grid for in vitro cultures or 3D lattice for spheroids), oxygen/nutrient supply, cell quiescence and cell clonogenic survival also due to radiation exposure5,20. Tools of this kind could allow predictions on the tumour mass growth (monitored in terms of number of cancer cells with replicative potential) as a function of time. Comparing this time evolution for different treatment combinations could allow to obtain information on optimal fractionation schemes or on the possible synergistic combination of radiation and chemotherapeutic drugs.

## Methods

### Experimental cell-cycle analysis

#### Cell culture

In this work a primary culture of IMR90 human fibroblast cells (ATCC CCL-186, USA) originated from lung tissue is used. Cells were cultured in complete medium (EMEM supplemented with 12.5% of FBS, 1% of L-glutamine 2 mM and 1%of NEAA) at $$37 \, ^{\circ }{}\text{C}$$ and 5% $${\text{CO}}_{2}$$, until confluence was reached at 80-90%. Cells used for the measurements were always between the $$8^{\mathrm{th}}$$ and the $$25^{\mathrm{th}}$$ passage. IMR90 cells were plated at a density of $$10^{5}$$ per T25 flask (Greiner Bio-One, Germany).

#### Growth curve

Independent measurements on cellular growth curves were performed in order to estimate the doubling time ($${\text{T}}_{\mathrm{D}}$$) of an unirradiated population. $$10^{5}$$ cells were seeded per T25 flask, they were cultivated and harvested at the following time points: 6 h, 16 h, 24 h, 48 h, 72 h (the same time points for the following study of the radiation effects). The samples were trypsinized, $$10 \, \mu l$$ of the pellet was placed in a Bürker counting chamber. The number of cells were counted under an inverted microscope. Experiments were repeated in biological triplicate, each of them was performed in technical duplicate. The data were fitted to the following exponential function via non-linear least squares methods: $$fit(t) = N_0 \exp (\gamma t)$$, where $$N_0$$ is the number of cells at $$t= 0$$, and $$\gamma$$ is the growth rate.

Irradiations were performed at the Radiotherapy facility of IRCCS Istituti Clinici Maugeri (Pavia, Italy), with a 6 MV linear accelerator Clinac (Varian, USA), regularly used for radiotherapy treatments. Cell samples were exposed to X-rays at two different doses of 2 Gy and 5 Gy, with a dose rate of 3 Gy/min. Irradiations were performed as previously described in detail21. Control samples, so-called “sham” samples, underwent the same environmental stress as the irradiated samples, except for exposure to radiation. After irradiation samples were transferred in the incubator at $$37 \,^{\circ }\text{C}$$ with 5% $${\text{CO}}_{2}$$ level. At each time point (6 h, 16 h, 24 h, 48 h, 72 h) cells were treated for flow-cytometry measurements, as described in the next section. Experiments were repeated in biological triplicate, each of them was performed in technical duplicate.

#### Cell-cycle characterization

After irradiation, cells were incubated with $$2 \, \mu g/ \mu l$$ EdU (Click-iT Plus EdU Alexa Fluor 488 Flow Cytometry Assay Kit, Invitrogen, USA) for 1 h, then fixed following the kit manufacture instructions with minor changes, to discriminate S-phase. M-phase was discriminated with the phospho-Histone H3 (Ser-10) primary antibody (1:25 in PBS with 1% of NGS, Cell Signaling Technology, USA [RRID:AB_331748]) and the secondary anti-mouse IgG antibody (1:500, Alexa Fluor 555 Conjugate, Cell Signaling Technology, USA [RRID:AB_1904022]). FxCycle Violet dye (4’,6-diamidino-2-phenylindole, dihydrochloride) was used to measure the total DNA content, following manufacture instructions. The Attune NxT Acoustic Focusing flow cytometer (Thermo Fisher Scientific, USA) employed for these experiments is equipped with two lasers emitting, respectively, in the blue (488 nm, 50 mW) and in the violet (405 nm, 50 mW).

The full gating strategy to discriminate the four cell-cycle phases is as follows: after the identification of the cell population and of singlets (Fig. 2a,b, respectively), the characteristic cell-cycle spectrum, as measured with FxCycle Violet, (Fig. 2c) is analyzed. A further gate is applied, to select the whole spectrum where the signal of FxCycle Violet is proportional to the amount of DNA: this has the typical form of two Gaussians, the second with a mean value equal to the double of the first peak, and a plateau in between. These three distributions represent respectively cells in $${\text{G}}_{1}$$-phase, $${\text{G}}_{2}$$/M-phase and S-phase. The cell-cycle spectrum in VL1-A channel alone cannot provide a quantitative discrimination of the four phases, so stainings with EdU and pH3 antibody were introduced. In the bi-parametric plane BL1-A versus VL1-A (EdU vs. FxCycle Violet, Fig. 2d) there are three distinct clouds of events: the cloud in the bottom left represents cells in $${\text{G}}_{1}$$-phase, that have a low signal of FxCycle Violet and low signal of EdU. The events in the bottom right part of the graph represent cells in $${\text{G}}_{2}$$ or M-phase, while the horseshoe-like subset with a high signal of EdU represents cells in S-phase. The events collected in the gate “$${\text{G}}_{2}$$-M” are visualized in the VL1-A versus BL2-A plane (pH3 vs. FxCycle Violet): here events with a double positive fluorescence correspond to cells in M-phase, that have a double content of DNA and a high content of phosphorylated H3 (Fig. 2e).

#### Statistical analysis

Data acquisition was performed with the Attune NxT software. Data analysis was carried out with FlowJo, a specialised software used for flow-cytometry applications. As a result of the gating procedure described above, we obtained cell numbers in each of the four cell-cycle phases. Data were then expressed in terms of percentages with respect to the whole cell population (hence normalized to the sum of cells in all gates used to identify cell-cycle phases). Data are given as average between both biological (3) and technical (2) replicates, and all experimental uncertainties reported are intended as standard error of the mean.

### Mathematical model details

#### Mathematical model: boundary and initial conditions and steady-state solution in unperturbed condition

Model equations (3) govern the dynamics between cell-cycle phases. To solve the system and find the time t and DNA content x dependence of the four phases, $${\text{G}}_{1}$$(x,t), S(x,t), $${\text{G}}_{2}$$(x,t) and M(x,t), boundary and initial conditions are needed, that we now discuss. The following boundary condition (Eq. (4)):

\begin{aligned} D\frac{\partial S}{\partial x}\left( 0,t\right) -gS\left( 0,t\right) =0, \quad t>0 \end{aligned}
(4)

ensures a positive DNA content in all cells at all times.

At first, as initial conditions, all cells are synchronised in $${\text{G}}_{1}$$-phase, while the other three compartments are empty. An approximation of the flow-cytometric profile of $${\text{G}}_{1}$$-phase at time t is given by Eq. (5):

\begin{aligned}&G_1 \left( x,t\right) = \frac{n_{G_1} \left( t\right) }{\sqrt{2\pi \theta _{G_1}^2}} exp\frac{-\left( x-{{\bar{m}}}_{G_1}\right) ^2}{2\theta _{G_1}^2} \nonumber \\&S\left( x,0\right) =0, \quad G_2\left( x,0\right) =0,\quad M\left( x,0\right) =0 \qquad 0<x<\infty \end{aligned}
(5)

where $${{\bar{m}}}_{G_1}$$ is the mean DNA content in $${\text{G}}_{1}$$-phase and $$\theta _{G_1}^2$$ is the corresponding variance. For simplicity, the mean parameter is normalized to a relative value $$\textit{x} = 1$$ for $${\text{G}}_{1}$$-phase, thus giving $$\textit{x} = 2$$ for $${\text{G}}_{2}$$-phase and M-phase, while the variance is chosen sufficiently small so that $$G_1\left( x,0\right)$$ exists only for $$x>0$$, and can be adapted to simulate the experimental variance of the flow-cytometric profile.

Given the initial conditions as Eq. (5), the model is made evolve until $${\text{T}}_{\mathrm{SDD}}$$ hours to reach a steady DNA distribution. After such time, the model is considered to give the cell-cycle distribution of cells in exponential growth. The variance $$\theta _1^2$$ for the Gaussian distribution of cells in $${\text{G}}_{1}$$-phase is fixed at 0.05, which is chosen based on the experimental variance of the $${\text{G}}_{1}$$-phase sham profiles (around 5% of the mean). The variance in $${\text{G}}_{2}$$-phase and M, $$\theta _{G_2}^2$$, is considered as two times $$\theta _{G_1}^2$$. In practice, starting from the initial condition, the full profile in x is superimposed to the solution of the problem in its matricial formulation, that gives the number of cells in each phase at each time (see next section).

#### Matricial model formulation

For a practical implementation of the model in MATLAB we resort to a matricial model formulation, integrating Eqs. (3) over the DNA content domain from 0 to X, with X upper limit for the x-axis, large enough to simulate infinity so that the difference of the integrations over 2X and over X is negligible. The matrix system is resolved using finite forward differences method, with discretization in the time variable. The integration leads to the following system of temporal ordinary and partial differential equations:

\begin{aligned}&\frac{dn_1\left( t\right) }{dt}=2bn_M\left( t\right) -k_1n_1\left( t\right) \end{aligned}
(6)
\begin{aligned}&\frac{\partial n_S\left( t;\tau _S\right) }{\partial \tau _S}+\frac{\partial n_S\left( t;\tau _S\right) }{\partial t}=0 \end{aligned}
(7)
\begin{aligned}&\frac{dn_2\left( t\right) }{dt}=n_S\left( t;\tau _S=T_S\right) -k_2n_2\left( t\right) \end{aligned}
(8)
\begin{aligned}&\frac{dn_M\left( t\right) }{dt}=k_2n_2\left( t\right) -bn_M\left( t\right) \end{aligned}
(9)

where:

\begin{aligned}&n_1\left( t\right) =\int _{0}^{X}{G_1\left( x,t\right) dx}&n_S\left( t;\tau _S\right) =\int _{0}^{X}{{\bar{S}}\left( x,t;\tau _S\right) dx} \end{aligned}
(10)
\begin{aligned}&n_2\left( t\right) =\int _{0}^{X}{G_2\left( x,t\right) dx}&n_M\left( t\right) =\int _{0}^{X}M\left( x,t\right) dx \end{aligned}
(11)

Then, Eqs. (6) – (9) are rewritten as a linear system of equations and finite forward differences method is applied to resolve it:

\begin{aligned} u_{t+\Delta t}=A\cdot u_t \end{aligned}
(12)

with:

\begin{aligned} u_t=\left[ n_1\left( t\right) n_S\left( t;\tau _S=0\right) \quad n_S\left( t;\tau _S=\Delta t\right) \quad \ldots \quad n_S\left( t;\tau _S=T_S\right) \quad n_2\left( t\right) \quad n_M\left( t\right) \right] ^T \end{aligned}
(13)

and

\begin{aligned} A=\left[ \begin{matrix} 1-k_1\Delta t&{}0&{}0&{}\ldots &{}0&{}0&{}2b\Delta t\\ k_1\Delta t&{}0&{}0&{}\ldots &{}0&{}0&{}0\\ 0&{}1&{}0&{}\ldots &{}0&{}0&{}0\\ \vdots &{}\ddots &{}\ddots &{}\ddots &{}\vdots &{}\vdots &{}\vdots \\ 0&{}0&{}0&{}\ddots &{}0&{}0&{}0\\ 0&{}0&{}0&{}\ddots &{}1&{}1-k_2\Delta t&{}0\\ 0&{}0&{}0&{}\ldots &{}0&{}k_2\Delta t&{}1-b\Delta t\\ \end{matrix}\right] \end{aligned}
(14)

The eigenvector corresponding to the eigenvalue of largest magnitude of the transition matrix A gives the long-term distribution of cells. The eigenvalue determines the rate of population growth and allows to compute the doubling time of the cell population22. In the discretized matrix form of the model of an unperturbed cell line, the number of cells in each phase can be tracked over time without any knowledge of the DNA distribution in each phase. This formulation is more convenient for practical model implementation, but it provides no information on the flow-cytometric profile to be compared to data. To obtain this information at any time in each phase, normal distributions can be superimposed to data from the matrix model. It is assumed that experimental flow-cytometric profiles obtained from a homogeneous cell population are approximately Gaussian, and experimental variances can be used to adapt theoretical distributions. For $${\text{G}}_{1}$$-phase at the arbitrary time $$t^\prime$$, the flow-cytometric profile is:

\begin{aligned} G_1\left( x,t_0\right) =\frac{n_1(t')}{\sqrt{2\pi \theta _{1}^{2}}} \exp { \frac{-(x-\mu _1)^2}{2\theta _{1}^2}} \end{aligned}
(15)

The mean $$\mu _{1}$$ for $${\text{G}}_{1}$$-phase is chosen at a relative DNA content $$\textit{x} = 1$$. The S-phase profile is composed by a sum of Gaussian distributions with mean values shifted along the x-axis, proportional to $$\tau _S$$ (the time each cell has spent in S-phase). So, the mean in S-phase is $$\mu _{1}+ {\text{g}}_{\mathrm{S}} \tau _S$$. For $${\text{G}}_{2}$$ and M, the profile is a Gaussian function like the one in Eq. 15, but with mean respectively $$\mu _{2}$$ and $$\mu _{\mathrm{M}}$$ at $$\textit{x} = 2$$23. It is assumed also that the variances increase linearly from $$\theta _{1}^2$$ to $$\theta _{2}^2$$ during S-phase. As mentioned, the main motivation to consider this discretized formulation of the model is the faster computational time when using a language as MATLAB.

#### Initial parameter values from experimental data

For a first estimate of model parameter values in the unperturbed condition we can start from the so-called Steel’s formulas24: such formulas give the average time spent in each phase ($${\text{T}}_{\mathrm{phase}}$$) as a function of measured percentages of cells in cell-cycle phases ($$f_{\textit{phase}}$$) and the doubling time $${\text{T}}_{\mathrm{D}}$$. Formulas are as follows (Eq. 16):

\begin{aligned}&T_S=T_D\frac{\log {\left( f_S+f_{G_2}+f_M+1\right) }}{log{2}}-T_{G_2}-T_M \end{aligned}
(16a)
\begin{aligned}&T_{G_2}=T_D\frac{\log {\left( f_{G_2}+f_M+1\right) }}{log{2}}-T_M \end{aligned}
(16b)
\begin{aligned}&T_{G_1}=T_D-T_{G_2}-T_M-T_S \end{aligned}
(16c)

where $${\text{T}}_{\mathrm{M}}$$ can be separately estimated as $${\text{T}}_{\mathrm{D}} \cdot f_{\textit{M}}$$.

As previously mentioned, for a population of cells with a steady DNA distribution, $$f_{\textit{phase}}$$ percentage values are unaltered in time. Steel’s formulas are then combined with the relation between the average time in each phase and the model parameters as in25:

\begin{aligned} T_{G_1}&=\frac{1}{k_1+\mu _{G_1}} \end{aligned}
(17)
\begin{aligned} T_S&=\frac{1}{g+\mu _S} \end{aligned}
(18)
\begin{aligned} T_{G_2}&=\frac{1}{k_2+\mu _{G_2}} \end{aligned}
(19)
\begin{aligned} T_M&=\frac{1}{b+\mu _M} \end{aligned}
(20)

If $$\mu _{\mathrm{i}}$$ parameters determining cell death in Eq.(3) are set to 0, transition rates are simply the reciprocal of the average time spent in each phase: if a block occurs in a particular phase, the time spent in that phase will increase and the rate transition will tend to zero.

#### Fit strategy for the unperturbed condition

Parameter values obtained from Eqs. (17) – (20) are used as an initial guess. A minimization strategy is then necessary, to obtain the set of parameters that allow the best reproduction of experimental data for the unperturbed condition. The minimization strategy is not uniquely defined, and possible objective functions show multiple local minima, with different sets of parameters returning similar profiles, with very different doubling time.

We follow a sequential minimisation procedure on the four parameters: $$k_{1}$$, g, $$k_{2}$$, b. We define the objective function to minimize the difference between experimental and theoretical percentages as:

\begin{aligned} \Phi \left( k_1,g,k_2,b\right) =\sum _{k}\sum _{j=1}^{4}{\frac{\left( M_j\left( t_k\right) -f_j\left( t_k\right) \right) }{{\sigma ^2}_{f_j}\left( t_k\right) }}^2 \end{aligned}
(21)

where [$${\text{k}}_{1}$$, g, $${\text{k}}_{2}$$, b] is the vector of the model parameters spanning between their limits, $$M_{j}(t_{k})$$ is the theoretical percentage of cells in the phase j at time $$t_k$$, $$f_{j}(t_{k})$$ is the experimental percentage of cells in the phase j at time $$t_k$$ and $$\sigma _{f_j}^2$$ is the corresponding experimental error.

The following constraints are imposed: $$k_1\in \left[ 0,1\right] ,g\in \left[ 0.05,0.5\right] ,k_2\in \left[ 0,1\right]$$ and $$b\in \left[ 0,\infty \right)$$. These constraints ensure positivity of the parameter values, reflect the fact that $$k_1$$ and $$k_2$$ are probabilities and assume that the time spent in S-phase is approximately between 2 and 20 h26. Lower and upper bounds of the fit parameters kept during minimization are given as follows: $$k_1\in \left[ 0.03, 0.06\right] , g\in \left[ 0.05,0.5\right] ,k_2\in \left[ 0.05, 0.5\right]$$ and $$b\in \left[ 2, 5\right]$$.

It has to be recalled that the four parameters are related: the sum of their reciprocal gives the overall duration of the cell cycle, hence what is experimentally measured as the doubling time. Therefore, this can be used as a criterion to drive the optimization: any local minimum of the objective function for which the derived cell-cycle length is far from our estimation can be discarded. The same holds for minima that occur for parameter sets too far from initial conditions, as initial parameter guesses provided by Steel’s formula have been shown to give a reasonable agreement with experimental data. For all these reasons, the minimization is performed varying parameters in relatively narrow intervals, in particular for $$k_1$$, as the objective function is found to have a strong dependence on its value. On the contrary, minimization is less affected by the values of b and g. Taking also into account the lower importance of the b parameter and the low values assumed by M-phase percentages, in a first instance the parameter was fixed to the value 3.

The optimisation of the fit (minimization of $$\Phi$$) for the unperturbed condition is performed with a sequential quadratic programming method via the MATLAB routine fmincon27.