Radiation-induced cell cycle perturbations: a computational tool validated with flow-cytometry data

Cell cycle progression can be studied with computational models that allow to describe and predict its perturbation by agents as ionizing radiation or drugs. Such models can then be integrated in tools for pre-clinical/clinical use, e.g. to optimize kinetically-based administration protocols of radiation therapy and chemotherapy. We present a deterministic compartmental model, specifically reproducing how cells that survive radiation exposure are distributed in the cell cycle as a function of dose and time after exposure. Model compartments represent the four cell-cycle phases, as a function of DNA content and time. A system of differential equations, whose parameters represent transition rates, division rate and DNA synthesis rate, describes the temporal evolution. Initial model inputs are data from unexposed cells in exponential growth. Perturbation is implemented as an alteration of model parameters that allows to best reproduce cell-cycle profiles post-irradiation. The model is validated with dedicated in vitro measurements on human lung fibroblasts (IMR90). Cells were irradiated with 2 and 5 Gy with a Varian 6 MV Clinac at IRCCS Maugeri. Flow cytometry analysis was performed at the RadBioPhys Laboratory (University of Pavia), obtaining cell percentages in each of the four phases in all studied conditions up to 72 h post-irradiation. Cells show early \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{G}}_{2}$$\end{document}G2-phase block (increasing in duration as dose increases) and later \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{G}}_{1}$$\end{document}G1-phase accumulation. For each condition, we identified the best sets of model parameters that lead to a good agreement between model and experimental data, varying transition rates from \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{G}}_{1}$$\end{document}G1- to S- and from \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{G}}_{2}$$\end{document}G2- to M-phase. This work offers a proof-of-concept validation of the new computational tool, opening to its future development and, in perspective, to its integration in a wider framework for clinical use.


Results
The structure of this paper is as follows: we first present experimental measurements with the IMR90 cell line. We then introduce the theoretical model and discuss how model parameters are estimated reproducing the unperturbed experimental condition (cells in exponential growth in absence of radiation exposure). We finally show that the modification of such parameters allows to reproduce cell-cycle perturbations induced by radiation. All details on cell culture and experimental procedures and on the mathematical model and its MATLAB implementation are given in the "Methods" section at the end of the manuscript. www.nature.com/scientificreports/ Experimental results. We present experimental data obtained to characterize the human fibroblast cell line IMR90 in terms of: i) the growth of the cell population as a function of time in the control (pre-treatment, unirradiated) condition; ii) how the distribution of vital cells in the different cell-cycle phases is affected, again as a function of time, when cells are either sham-irradiated (0 Gy) or exposed to X-ray doses of 2 and 5 Gy. In all experimental conditions cells are followed in time until 72 h.
Estimation of population doubling time in the pre-treatment condition. In order to characterize the growth of the control cell population we estimated the population doubling time ( T D ). As later discussed, this parameter is necessary in the model as a constraint to determine the average time spent in each cell-cycle phase. In Fig. 1 we report the average number of cells in unirradiated conditions for each time point (counted with a Bürker chamber), fitted to an exponential function. The doubling time can be derived from the fit using the relation: where γ is the exponent coefficient of the fit function (see "Methods"). The estimated T D is of 38.4 (CL at 95% [31. 7, 48.8]).
Flow-cytometry analysis for cell-cycle distribution. Cells were analyzed with flow cytometry to discriminate the different phases of the cell cycle. All details are given in the "Methods" section, representative flow-cytometry panels to illustrate the gating strategy are also reported in Fig. 2. Results on experimental percentages of cells in each phase of the cell cycle for sham, 2 Gy and 5 Gy are given in Table 1 and plotted in Fig. 3. Data are given as average between both biological (3) and technical (2) replicates with the corresponding standard error of the mean.
Cell-cycle distribution in unirradiated conditions. Figure 3a shows cell-cycle data representing the sham condition. We observe almost constant percentages in cell-cycle phases versus time in the early time points, while at 48 h and 72 h the percentage of cells in G 1 -phase increases from 61% (average value between 0 h and 24 h) to 79%. Data from Fig. 3a suggest that already at 48 h cells are approaching confluence, and their distribution in the cell cycle is affected. Based on these results, time points until 24 h only can be considered to have cells in an exponential growth phase.
Cell-cycle distribution after irradiation. Figures 3b,c show cell-cycle data for samples irradiated with 2 Gy and 5 Gy, respectively. Compared to the sham condition, for both doses there is a strong increase of cells in G 2 -phase at the early 6 h time point. A block in G 2 following radiation exposure is to be attributed to the activation of cell repair mechanisms. For the 2 Gy condition, the percentage of cells in G 2 reaches its maximum at this time point, and later decreases to an average value of 8.10% (between 24 h and 72 h), comparable with the average value for the sham between 0 and 24 h (8.07% ). Correspondingly, the percentage of G 1 -phase cells goes up in the same time-frame to an average value of 85.90%, and the percentage of S-phase is drastically reduced under 10% for any time point. In the 5 Gy condition, the percentage of cells in G 2 is maintained higher at all the time points, being maximal at 16 h with a value of 37.54%. A similar increase of G 1 -phase is shown at later time points, with a peak at 48 h.  www.nature.com/scientificreports/ This latter effect can be both attributed to a block in G 1 , and to influx and accumulation in G 1 of cells surviving the earlier block in G 2 that return cycling.
Control versus irradiated conditions. Using data from Table 1 we calculated the relative differences in percentages of cells in cell-cycle phases between the sham and the two irradiated conditions. Results are given in Table 2 and plotted in Fig. 4. The value calculated for each time point is: where f dose and f sham are, respectively, the percentages for the irradiated condition and for the sham. From Fig. 4 we observe that the relative difference δ has a similar trend versus time for the two doses as far as S, G 1 and  Table 1). Percentages of cells in the M-phase are generally too low to be appreciated in the plot. Errors are reported as standard deviations at 1σ. It has to be noted that relative differences with the sham condition at time points later than 24 h are affected by the fact that the unirradiated cells are approaching confluence, leading e.g. to the pronounced peak for the G 2 -phase at 72 h for 5 Gy.
Data reproduction with the mathematical model. Starting from the work of Basse et al. 2003 9 , we extended a mathematical model for a population of cells describing their position within the cell cycle. The model is based on a system of partial differential equations that governs the kinetics of cell densities in all phases of the cell cycle, depending on time t and DNA content x and, only for DNA synthesis, on an age-like variable τ S describing the time since arrival in S-phase. The explicit consideration of DNA content as a variable in the model allows, in principle, tuning of model parameter to reproduce experimental flow-cytometry profiles.
In this section we briefly describe the general features of the model, we then turn to reproduction of experimental percentages of cells in cell-cycle phases and extraction of model parameter values. Full details on the model formalism and on the strategy for data reproduction are given in the "Methods" section.
Mathematical model. The model is a classical compartmental model with four main compartments, one for each sub-population of cells in a given cell-cycle phase. Four dependent variables: G 1 (x,t), S(x,t), G 2 (x,t) and M(x,t) represent the density number of cells with relative DNA content x, scaled to x = 1 in the G 1 -phase, at time t. The main parameters of the model are the transition rates that control the transfer of cells between phases. The dynamics between the compartments is represented by the following equations (3): Each equation in (3) describes the variation of cell density in the corresponding phase. This is generally determined by a source term, that gives the influx of cells coming from the preceding phase, minus the loss of cells transitioning to the next phase and that due to cell death. The source term in G 1 -phase is given by the rate of divisions per unit time b. For the S-phase we assume that, on average, the time spent during DNA replication ( T S ) is constant for all cells. This implies that a cell entering the S-phase either exits after T S hours or dies. In addition to the influx from G 1 and the loss term due to death, in Eq. (3b) for the S-phase we have: the first term that is a dispersion term, taking into account the experimental variability of signal at increasing DNA content; the second term that describes the increase of DNA content at an average growth rate of g per unit time (instead,   www.nature.com/scientificreports/ DNA content is constant in all other phases, so similar terms are not needed); the last term I(x, t; T S ) , that represents the sub-population of cells which entered S-phase T S hours before and are therefore ready to exit from S to enter G 2 . This latter term is the influx term in Eq. (3c) for the G 2 -phase. Full details on the model are given in the original work 9 . In normal conditions, cells are considered in a state of asynchronous balanced growth, where the percentage of cells in each of the four phases of the cycle remains constant. Experimentally, for an in vitro system, this corresponds to unperturbed exponential growth and requires cells to be far from confluence. This state represents the so-called steady DNA distribution (SDD) 10 . Rates of transition between phases are also constant in this state. It has been shown that in such condition, the variable x is both uniquely determined by model parameters and independent of the initial distribution at time t = 0 11 . A theoretical solution on the infinite temporal domain −∞ < t < ∞ exists, and asymptotically approaches the SDD condition 8 . For the purpose of this work, focusing on vital cells only, all µ i parameters are set to 0. Also, a matricial formulation is adopted for the model implementation in MATLAB, as described in the "Methods" section.
Initial parameter values from experimental data. Experimental data on percentages of cells in cell-cycle phases for the sham condition are used to estimate model parameters for the SDD state. Average percentages among the first 4 time points (including the 0 h pre-treatment sample), excluding time points for which cells get close to confluence, are as follows: Model performance for the unperturbed condition. Given the set of initial parameters, the model can be run. For its evolution, a step-size in time h t = 0.05 h is used. The model starts with 100% of cells in G 1 -phase. As a general criterion, we consider that convergence to a steady profile is reached when the sum of squared differences between percentages of all phases at time t and those at (t − 5) h becomes less than 10 −5 . When running the model with the initial parameters given above, convergence is reached in T SDD = 40.58 h . The corresponding model evolution is shown in Fig. 5a and compared to experimental data. In the SSD condition, model percentages are only close to experimental data, though within errors. Model parameters can therefore be adjusted, starting from initial guess values, to obtain a better reproduction of experimental data. This is done via χ 2 minimization (see later in Methods, Eq. 21). Best fit parameters obtained via minimization are as follows: k 1 = 0.0429 , g = 0.0796 , k 2 = 0.2347 , b = 2.3655 , and the corresponding evolution of the model and reproduction of the experimental unperturbed condition are reported in Fig. 5b.
Data reproduction strategy for the perturbed condition. The unperturbed cell population is growing exponentially at t = T SSD , before treatment. We then consider that cells are exposed to radiation at t ′ > T SSD . For the sake of comparison to experimental data, we rescale the perturbation time t ′ to 0. After the exposure, the effects of radiation are manifested as alteration of the transition rates between compartments. www.nature.com/scientificreports/ A χ 2 minimization strategy is not practical to find model parameters reproducing experimental data for the perturbed condition, as we justify in detail in the "Methods" section. Therefore, we use a simplified strategy for data reproduction, based on the following considerations: 1. alterations of k 1 and k 2 , representing the transition probability from the gap phases to S or M, are expected to be more important. This is supported by biological evidence, as cell-cycle checkpoints exist in G 1 and G 2 -phases, and a larger impact on the population of such phases is observed in experimental data; 2. from the reproduction of data for the sham condition it is known that g and b parameters have less importance in the χ 2 minimisation. We can assume that they are less modified by radiation, which is also supported by biological evidence, as S-phase and M-phase maintain an average constant duration.
To further simplify our strategy, we assume at first approximation that parameters variations can be described with step functions versus time, thus altering their values in fixed time interval. We recall that the main features of the cell-cycle perturbations that have been measured experimentally and that we want to reproduce are: an early G 2 block (peaked at 6 h from irradiation for 2 Gy, and at a later 16 h time point for 5 Gy), and, after that, a strong increment of cells in G 1 -phase, that persists in time (see previous sections). We find that the following alteration of model parameters k 1 and k 2 lead to a good reproduction of experimental data: 1. for the 2 Gy condition, k 2 is divided by a factor of 12 in the interval 0 h-6 h after irradiation. After that, from 6 h to 16 h, k 2 is increased up to half its initial value, and later maintained constant throughout the simulation; at the 6 h time point k 1 is divided by a factor 10 and maintained constant until 48 h, when its current value is doubled and later kept constant until the end; 2. for the 5 Gy condition, k 2 is divided by a factor of 12 (as for the 2 Gy condition), but this value is maintained for a longer time, until 16 h after irradiation, to reproduce the longer block in G 2 -phase. At the 24 h time point, k 2 is restored to 1/4 of its initial value, and later maintained constant. At 16 h, k 1 is divided by 10 and later maintained constant.
Model calculations are shown in Fig. 6 and compared to experimental measurements. As it can be observed in the figures, simple changes in the parameters as a function of time lead already to a very good agreement between model and data. Note that, to provide a visual reference for phase percentages in the unperturbed condition, the time scale is shifted such that the irradiation time is at 10 h in the plots.

Discussion
Starting from the work of Basse et al. 9 , we implemented a computational model to reproduce how cells that survive radiation exposure are distributed in the cell cycle as a function of dose and time after exposure. To our knowledge, the original model had not been applied to cell-cycle perturbations induced by ionizing radiation, but only tested for chemotherapeutic drugs. This quantitative approach to model cell-cycle perturbation underlies the following assumptions: transitions between phases are complete and irreversible [12][13][14] ; and perturbations, e.g. DNA damage, can affect the timing of these transitions 15 . The model is deterministic, built with four compartments, each representing cells in a given www.nature.com/scientificreports/ phase of the cycle. The position in the cycle is given by the value of a DNA content variable, and the evolution in time is governed by a set of differential equations. Model parameters are transition rates (between G 1 -and S-phase and between G 2 -and M-phase), cell division rate and DNA synthesis rate. For the MATLAB implementation presented in this work we adopted a matrix formulation of the model, explicitly taking into account the time variable only, while information on the DNA content can be recovered a posteriori once the model is solved. The model gives as main output percentages of cells in each phase of the cycle. A DNA content distribution can also be obtained, to be directly compared to a cell-cycle profile acquired with flow cytometry (which is a distribution of signal intensity versus a quantity proportional to DNA content).
The performance of the model was tested using data on IMR90 cells, a well-characterized healthy fibroblast cell line. It is important to state that the formulation of the model is general, and does not depend on the specific cell line under investigation. Cells in exponential growth were exposed to X-rays, with doses of 0 Gy (sham-irradiated), 2 Gy and 5 Gy, and their cell-cycle distribution was measured via flow cytometry at multiple time points after exposure: 6 h, 16 h, 24 h, 48 h and 72 h. At difference with previous implementations of similar models, data include explicit discrimination of cells in G 2 -and M-phase, using a flow-cytometry protocol identifying M-phase through the detection of pH3 phosphorylation during mitosis.
Experimental results indicate that unperturbed cells in exponential growth settle over time to a steady DNA distribution, in agreement with expectations, in which the percentages of cells in each phase remain constant in time. When cells are exposed to X-rays, a block in G 2 -phase in observed (at the early time point of 6 h after irradiation for 2 Gy and at 16 h for 5 Gy). After that, the percentage of cells in G 2 -phase decreases. The recovery to the pre-treatment condition is not total in both cases and seems to be dose dependent. At the same time cells accumulate in G 1 -phase right after the block in G 2 -phase, with a modification that seems persistent in time.
The proposed modeling approach is phenomenological: model parameters have to be adjusted to reproduce experimental data. As a general strategy, we obtained the set of model parameters needed to reproduce the sham condition, and assume that any perturbation can be described by the alteration of these parameters. The trends observed in experimental data for irradiated conditions are reproduced by varying only two of the parameters: k 1 and k 2 , respectively the transition rate from G 1 -to S-phase and from G 2 -to M-phase. This is consistent with the biological knowledge of the main role of the two checkpoints in the gap phases to control the cell-cycle progression. Interestingly, as the analysis is limited to living cells, the model parameters (giving the entity of the perturbation) are not so different in the two perturbed conditions. We observed instead a difference on the duration of the blocks in gap phases, and on the ability to recover to the unperturbed condition. Therefore, a perturbation of similar entity was applied to k 2 for both doses, but kept for longer times for the higher dose, with a later recovery to a lower value at 5 Gy compared to the 2 Gy. Similarly, the same perturbation is applied to k 1 but a later time point (16 h vs. 6 h) for the highest dose condition, probably representing (more than a block in G 1phase) the renewed influx to G 1 of cells surviving the G 2 block, which coherently happens later for the higher dose.
How parameters are varied in order to reproduce data also gives an insight on possible biological interpretations: up until 4 -6 h post irradiation only a slowing of S-phase entry is observed, even at high doses (10 Gy), and this can be explained by the principal molecular mechanisms for G 1 arrest, that is a slow process involving transcriptional activation by p53 of p21 that leads to inhibition of pRb phosphorylation and G 1 arrest [16][17][18][19] .
In this work a χ 2 fit strategy involving all the experimental time-points is used for the reproduction of the unperturbed condition only, while for the irradiated conditions a minimization is performed in discrete time intervals, giving as results step functions for parameter variation as a function of time. Perturbed parameters therefore represent average rates in discrete time interval. A first improvement of the reproduction strategy of the perturbed data would be to refine the time dependence of the perturbed parameters, introducing linear functions with imposed continuity conditions between time points. Furthermore, it would be necessary to implement a χ 2 fit strategy also for the perturbed conditions that could include all time-points simultaneously. This deserves further investigation as it is expected that the objective functions will have a complex behaviour as a function of model parameters. Also, the perturbed biological system is less easy to characterize with additional constraints, as the population increase rate, which was measured only for the unperturbed condition in this work. Optimization of the reproduction of the shape of the cell-cycle profile could also be performed, fully exploiting the flow-cytometry data.
It has to be stressed that results presented in this work in terms of percentages of cells in the different phases of the cycle always correspond to the fraction of living cells only. The importance of cell death cannot be quantified from our dataset. This information could be added experimentally and used as input for further model developments: the model could be expanded introducing a new compartment of dying cells. This same formulation of the model can be already adapted to reproduce cell death data adjusting the values of death rate parameters (currently set to 0). Dedicated experimental flow-cytometry protocols must be planned, e.g. using Annexin V with FITC versus Propidium Iodide, that can discriminate cells based on viability, and possibly identify the death mechanism (apoptosis vs. necrosis).
Additionally, a quiescent cell compartment could be added to the model to take into account the existence of the G 0 -phase. What we observe in our dataset as an increase in the population of G 1 after irradiation could also be due to cells that exit the replicative cycle and accumulate in a quiescent G 0 -state.
It is also of interest to mention that an alternative approach could have been taken from the start, using as experimental model a synchronized cell population. Synchronizing cells in a given cell-cycle phase and monitoring their rate of transition to the next phase, one could also obtain an estimate of model parameters. It can be expected that such parameters would be close to those extracted in our work for the unirradiated condition. However, when irradiating a synchronized cell population, the perturbation to the parameter describing the transition rate to the next phase would be more strongly dependent on the specific radiosensitivity of that phase. In our approach we irradiate an asynchronous cell population, and parameters that describe the average www.nature.com/scientificreports/ behaviour are affected by the initial distribution of cells in the different phases of the cycle in an interdependent way. The comparison between these two approaches deserves further investigation.
To conclude, the computational model we present in this work has the potential to reproduce perturbations of cell cycle following exposure to ionizing radiation. We here focus on the percentage of living cells in different phases of the cycle as a function of radiation dose and time, offering a proof-of-concept validation of the model with flow-cytometry experimental data on a chosen cell line. The model formulation is very general, such that it can be easily applied to reproduce different datasets. In particular, applications to cancer cells will be needed, in the perspective of using this model in the context of radiotherapy research. The response of healthy or cancer cells can be tested at different doses, also to address more specifically either effects to healthy tissues or to the target tumour mass. Different types of radiation or different agents as drugs can be considered. Biological insight on underlying response phenomena can be gained, comparing model parameters that are necessary to reproduce different datasets. By construction, the model can be easily extended to the consideration of cells exiting the cycle, either undergoing cell death or transition to a quiescent state. New dedicated datasets should be acquired to benchmark possible model extensions.
A model of this kind constitutes the core of more complex tools for pre-clinical/clinical use. These tools should be able to include cell-cycle perturbations for predictions in radiotherapy, hence considering characteristics of the tumour microenvironment. From the experimental point of view, further application of our model to spheroids 20 or organoids would be beneficial in this sense. From the modeling point of view, advanced model development could include the influence of the microenvironment, starting with the consideration of cell distribution in space (e.g. on a 2D grid for in vitro cultures or 3D lattice for spheroids), oxygen/nutrient supply, cell quiescence and cell clonogenic survival also due to radiation exposure 5,20 . Tools of this kind could allow predictions on the tumour mass growth (monitored in terms of number of cancer cells with replicative potential) as a function of time. Comparing this time evolution for different treatment combinations could allow to obtain information on optimal fractionation schemes or on the possible synergistic combination of radiation and chemotherapeutic drugs.

Methods
Experimental cell-cycle analysis. Cell culture. In this work a primary culture of IMR90 human fibroblast cells (ATCC CCL-186, USA) originated from lung tissue is used. Cells were cultured in complete medium (EMEM supplemented with 12.5% of FBS, 1% of L-glutamine 2 mM and 1%of NEAA) at 37 • C and 5% CO 2 , until confluence was reached at 80-90%. Cells used for the measurements were always between the 8 th and the 25 th passage. IMR90 cells were plated at a density of 10 5 per T25 flask (Greiner Bio-One, Germany).
Growth curve. Independent measurements on cellular growth curves were performed in order to estimate the doubling time ( T D ) of an unirradiated population. 10 5 cells were seeded per T25 flask, they were cultivated and harvested at the following time points: 6 h, 16 h, 24 h, 48 h, 72 h (the same time points for the following study of the radiation effects). The samples were trypsinized, 10 µl of the pellet was placed in a Bürker counting chamber. The number of cells were counted under an inverted microscope. Experiments were repeated in biological triplicate, each of them was performed in technical duplicate. The data were fitted to the following exponential function via non-linear least squares methods: fit(t) = N 0 exp(γ t) , where N 0 is the number of cells at t = 0 , and γ is the growth rate.
Irradiation setup. Irradiations were performed at the Radiotherapy facility of IRCCS Istituti Clinici Maugeri (Pavia, Italy), with a 6 MV linear accelerator Clinac (Varian, USA), regularly used for radiotherapy treatments. Cell samples were exposed to X-rays at two different doses of 2 Gy and 5 Gy, with a dose rate of 3 Gy/min. Irradiations were performed as previously described in detail 21 . Control samples, so-called "sham" samples, underwent the same environmental stress as the irradiated samples, except for exposure to radiation. After irradiation samples were transferred in the incubator at 37 • C with 5% CO 2 level. At each time point (6 h, 16 h, 24 h, 48 h, 72 h) cells were treated for flow-cytometry measurements, as described in the next section. Experiments were repeated in biological triplicate, each of them was performed in technical duplicate.
The full gating strategy to discriminate the four cell-cycle phases is as follows: after the identification of the cell population and of singlets (Fig. 2a,b, respectively), the characteristic cell-cycle spectrum, as measured with FxCycle Violet, (Fig. 2c) is analyzed. A further gate is applied, to select the whole spectrum where the signal of FxCycle Violet is proportional to the amount of DNA: this has the typical form of two Gaussians, the second with a mean value equal to the double of the first peak, and a plateau in between. These three distributions represent respectively cells in G 1 -phase, G 2 /M-phase and S-phase. The cell-cycle spectrum in VL1-A channel alone cannot provide a quantitative discrimination of the four phases, so stainings with EdU and pH3 antibody were introduced. In the bi-parametric plane BL1-A versus VL1-A (EdU vs. FxCycle Violet, Fig. 2d)  www.nature.com/scientificreports/ FxCycle Violet and low signal of EdU. The events in the bottom right part of the graph represent cells in G 2 or M-phase, while the horseshoe-like subset with a high signal of EdU represents cells in S-phase. The events collected in the gate " G 2 -M" are visualized in the VL1-A versus BL2-A plane (pH3 vs. FxCycle Violet): here events with a double positive fluorescence correspond to cells in M-phase, that have a double content of DNA and a high content of phosphorylated H3 (Fig. 2e).
Statistical analysis. Data acquisition was performed with the Attune NxT software. Data analysis was carried out with FlowJo, a specialised software used for flow-cytometry applications. As a result of the gating procedure described above, we obtained cell numbers in each of the four cell-cycle phases. Data were then expressed in terms of percentages with respect to the whole cell population (hence normalized to the sum of cells in all gates used to identify cell-cycle phases). Data are given as average between both biological (3) and technical (2)  where m G 1 is the mean DNA content in G 1 -phase and θ 2 G 1 is the corresponding variance. For simplicity, the mean parameter is normalized to a relative value x = 1 for G 1 -phase, thus giving x = 2 for G 2 -phase and M-phase, while the variance is chosen sufficiently small so that G 1 (x, 0) exists only for x > 0 , and can be adapted to simulate the experimental variance of the flow-cytometric profile.
Given the initial conditions as Eq. (5), the model is made evolve until T SDD hours to reach a steady DNA distribution. After such time, the model is considered to give the cell-cycle distribution of cells in exponential growth. The variance θ 2 1 for the Gaussian distribution of cells in G 1 -phase is fixed at 0.05, which is chosen based on the experimental variance of the G 1 -phase sham profiles (around 5% of the mean). The variance in G 2 -phase and M, θ 2 G 2 , is considered as two times θ 2 G 1 . In practice, starting from the initial condition, the full profile in x is superimposed to the solution of the problem in its matricial formulation, that gives the number of cells in each phase at each time (see next section).

Matricial model formulation.
For a practical implementation of the model in MATLAB we resort to a matricial model formulation, integrating Eqs. (3) over the DNA content domain from 0 to X, with X upper limit for the x-axis, large enough to simulate infinity so that the difference of the integrations over 2X and over X is negligible. The matrix system is resolved using finite forward differences method, with discretization in the time variable. The integration leads to the following system of temporal ordinary and partial differential equations: where: www.nature.com/scientificreports/ Then, Eqs. (6) -(9) are rewritten as a linear system of equations and finite forward differences method is applied to resolve it: with: and The eigenvector corresponding to the eigenvalue of largest magnitude of the transition matrix A gives the longterm distribution of cells. The eigenvalue determines the rate of population growth and allows to compute the doubling time of the cell population 22 . In the discretized matrix form of the model of an unperturbed cell line, the number of cells in each phase can be tracked over time without any knowledge of the DNA distribution in each phase. This formulation is more convenient for practical model implementation, but it provides no information on the flow-cytometric profile to be compared to data. To obtain this information at any time in each phase, normal distributions can be superimposed to data from the matrix model. It is assumed that experimental flow-cytometric profiles obtained from a homogeneous cell population are approximately Gaussian, and experimental variances can be used to adapt theoretical distributions. For G 1 -phase at the arbitrary time t ′ , the flow-cytometric profile is: The mean µ 1 for G 1 -phase is chosen at a relative DNA content x = 1 . The S-phase profile is composed by a sum of Gaussian distributions with mean values shifted along the x-axis, proportional to τ S (the time each cell has spent in S-phase). So, the mean in S-phase is µ 1 + g S τ S . For G 2 and M, the profile is a Gaussian function like the one in Eq. 15, but with mean respectively µ 2 and µ M at x = 2 23 . It is assumed also that the variances increase linearly from θ 2 1 to θ 2 2 during S-phase. As mentioned, the main motivation to consider this discretized formulation of the model is the faster computational time when using a language as MATLAB.
Initial parameter values from experimental data. For a first estimate of model parameter values in the unperturbed condition we can start from the so-called Steel's formulas 24 : such formulas give the average time spent in each phase ( T phase ) as a function of measured percentages of cells in cell-cycle phases ( f phase ) and the doubling time T D . Formulas are as follows (Eq. 16): where T M can be separately estimated as T D · f M .
As previously mentioned, for a population of cells with a steady DNA distribution, f phase percentage values are unaltered in time. Steel's formulas are then combined with the relation between the average time in each phase and the model parameters as in 25 :  www.nature.com/scientificreports/ If µ i parameters determining cell death in Eq.(3) are set to 0, transition rates are simply the reciprocal of the average time spent in each phase: if a block occurs in a particular phase, the time spent in that phase will increase and the rate transition will tend to zero.
Fit strategy for the unperturbed condition. Parameter values obtained from Eqs. (17) - (20) are used as an initial guess. A minimization strategy is then necessary, to obtain the set of parameters that allow the best reproduction of experimental data for the unperturbed condition. The minimization strategy is not uniquely defined, and possible objective functions show multiple local minima, with different sets of parameters returning similar profiles, with very different doubling time.
We follow a sequential minimisation procedure on the four parameters: k 1 , g, k 2 , b. We define the objective function to minimize the difference between experimental and theoretical percentages as: where [ k 1 , g, k 2 , b] is the vector of the model parameters spanning between their limits, M j (t k ) is the theoretical percentage of cells in the phase j at time t k , f j (t k ) is the experimental percentage of cells in the phase j at time t k and σ 2 f j is the corresponding experimental error. The following constraints are imposed: k 1 ∈ [0, 1], g ∈ [0.05, 0.5], k 2 ∈ [0, 1] and b ∈ [0, ∞) . These constraints ensure positivity of the parameter values, reflect the fact that k 1 and k 2 are probabilities and assume that the time spent in S-phase is approximately between 2 and 20 h 26 . Lower and upper bounds of the fit parameters kept during minimization are given as follows: k 1 ∈ [0.03, 0.06], g ∈ [0.05, 0.5], k 2 ∈ [0.05, 0.5] and b ∈ [2,5].
It has to be recalled that the four parameters are related: the sum of their reciprocal gives the overall duration of the cell cycle, hence what is experimentally measured as the doubling time. Therefore, this can be used as a criterion to drive the optimization: any local minimum of the objective function for which the derived cell-cycle length is far from our estimation can be discarded. The same holds for minima that occur for parameter sets too far from initial conditions, as initial parameter guesses provided by Steel's formula have been shown to give a reasonable agreement with experimental data. For all these reasons, the minimization is performed varying parameters in relatively narrow intervals, in particular for k 1 , as the objective function is found to have a strong dependence on its value. On the contrary, minimization is less affected by the values of b and g. Taking also into account the lower importance of the b parameter and the low values assumed by M-phase percentages, in a first instance the parameter was fixed to the value 3.
The optimisation of the fit (minimization of ) for the unperturbed condition is performed with a sequential quadratic programming method via the MATLAB routine fmincon 27 . (18)