First-passage times and normal tissue complication probabilities in the limit of large populations

The time of a stochastic process first passing through a boundary is important to many diverse applications. However, we can rarely compute the analytical distribution of these first-passage times. We develop an approximation to the first and second moments of a general first-passage time problem in the limit of large, but finite, populations using Kramers–Moyal expansion techniques. We demonstrate these results by application to a stochastic birth-death model for a population of cells in order to develop several approximations to the normal tissue complication probability (NTCP): a problem arising in the radiation treatment of cancers. We specifically allow for interaction between cells, via a nonlinear logistic growth model, and our approximations capture the effects of intrinsic noise on NTCP. We consider examples of NTCP in both a simple model of normal cells and in a model of normal and damaged cells. Our analytical approximation of NTCP could help optimise radiotherapy planning, for example by estimating the probability of complication-free tumour under different treatment protocols.

but their mutual key characteristic is that they describe the dynamics of cell division and death. Many of these models are intrinsically stochastic. Mitosis and cell death are random events in such models, and the precise outcome is therefore uncertain; the tumour may or may not be controlled, and NTCs can arise, but do not have to. The aim of this line of research is to obtain, for a given model of the population dynamics of cells and a given radiation protocol, the TCP and NTCP. The word 'obtain' includes by computer simulation of the population, or by direct mathematical computation when this is possible. While simulations are sometimes viable, the mathematical route when available is generally preferable as explicit formulae provide an efficient way of evaluating TCP or NTCP, often much faster than simulation. Not all types of population dynamics can be treated mathematically exactly however. In such cases approximations have to be made in the mathematical calculation of TCP and NTCP.
TCP computed as the first-passage distribution of a stochastic birth-death model has previously been described by Zaider and Minerbo 6 , and extended in subsequent work [7][8][9] , using generating-function methods. The use of such methods to solve first-passage time problems, however, is limited to those considering the extinction of all cells and where the dynamics are linear, and so are not directly applicable to NTCP.
A stochastic birth-death model of normal tissue cells where cell death rates are affected by the dose and timing of radiotherapy was described by Stocks et al. 10 . NTCP can be seen as the cumulative distribution function of the first-passage time of this stochastic birth-death process through a boundary 10 ; NTC sets in when the number of functional cells falls below a certain threshold. However, the analysis by Stocks et al. 10 of this model was restricted to the deterministic limit in which intrinsic noise within the population is not taken into account, and therefore NTCP was approximated as either zero or one.
The application of our theoretical results in approximating the distribution of first-passage times extends the analysis in Stocks et al. 10 to capture features of intrinsic noise. Note that we do not aim to develop a novel model of NTCP, but instead further the analysis of existing models 5,10 using novel mathematical developments while maintaining the assumptions, and therefore clinical relevance, of previous work.
Mathematically, our main result is intuitive. We find that, for a sufficiently large population, the distribution of first-passage times through a threshold is approximately normal. The variance of this normal distribution decreases proportionally to the size of the population. The deterministic result for NTCP by Stocks et al. 10 is recovered in the limit of infinite population size.
The remainder of this paper is set out as follows. We first present the microscopic model of normal tissue cells adapted from the model of Stocks et al. 10 and a definition of NTCP. We use this model to explain the steps of our approximation and derive our main results for approximating the first-passage time of a birth-death process through a boundary. Following Hanin and Zaider 5 we then consider a more complicated model of two cell types (normal and 'doomed'), and we show how our method can be extended to systems with more than one degree of freedom. In the context of this model we also develop a second approximation method for NTCP.

Logistic model of healthy tissue
Model definitions. We first focus on a model of normal tissue, similar to that in Stocks et al. 10 , which we refer to as Model 1. The model describes a well-mixed population of identical and independent cells; we write N t for the size of the population at time t. The model does not distinguish between organ stem cells, functional cells or different types of tissue renewal but simply assumes that all cells can divide by mitosis with the same rate. We assume that overall growth is limited by spatial constraints, the presence of nutrients, or other regulatory mechanisms. For a population of size N, the overall growth rate b N is a logistic function, N where b > 0 is a constant parameter. This indicates that the per capita birth rate decreases with increasing population size, and growth ceases completely when the carrying capacity K is reached; K is a model parameter and constant in time. Cells can die due to natural causes and from external radiation. Natural death occurs with rate d. We note that explicitly separating death processes from birth events is necessary for a stochastic treatment of the model; basing the analysis on an effective net growth rate (i.e., b N − d) is insufficient to model the dynamics outside of the deterministic limit. External radiation damage to cells is captured via a hazard function h(t), denoting the per capita death rate due to radiation at time t 10,11 . This rate will generally depend on time, as determined by the details of the specific treatment protocol applied and biological assumptions made.
Model 1 can be summarised as a list of 'reactions' , with notation similar to that used in chemical reaction systems, where we write N to represent an individual normal cell (death due to radiation), where the rates above the arrows are per capita rates. The deterministic rate equation for this system can be formulated heuristically as follows, It can also be derived systematically from the lowest-order terms in an expansion in the inverse system size, as discussed below.
In the absence of radiation [i.e., when h(t) = 0], the non-zero fixed point of Eq. (3) is given by Since the population dynamics are stochastic, in the absence of radiation the size of the population fluctuates about this value. To simplify the notation we will use = − K in the following, such that-in the absence of radiation-the average population size is M.
Master equation. The process defined by Eq. (2) can equivalently be described by a (chemical) master equation 12 , where we write P N (t) for the probability that the population has size N at time t Definition of normal-tissue complication probability and strategies to calculate it. An organ requires a minimum number of cells to function properly 13 . We introduce a threshold L and say that a normal tissue complication (NTC) is encountered when the number of cells in the population N t falls below L. Given that N t is a stochastic process, NTC will occur at different times in different realisations of the model dynamics (or potentially, it may never occur in a given realisation). This leads to the definition of normal tissue complication probability (NTCP). We assume that once NTC has been encountered in a given realisation of the dynamics, it cannot be repaired, even if the number of cells ultimately recovers to values above the threshold L. We therefore define NTCP(t) as the probability that, at some time before t, the population contained L cells or fewer. Mathematically the calculation of NTCP constitutes a first-passage time problem 14 . More precisely, NTCP(t) is the cumulative distribution function of the first-passage time through the threshold L.
Realisations of the process defined by Eq. (2) can be generated using the stochastic simulation algorithm by Gillespie 15,16 . In principle, a large ensemble of such simulations can be used to measure NTCP(t). However, in practice this approach is of limited use since a large number of runs need to be collected to obtain sufficient statistics. Simulations also offer relatively little in the way of mechanistic insight.
One can also find the NTCP(t) by direct numerical integration of Eq. (5) (subject to an absorbing boundary at L). In practice, this approach is computationally costly, especially in more realistic models where there are several different types of cells and the master equation is then a large set of coupled ODEs.
An alternative approach involves the use of generating functions 12 . However, this technique is usually only viable for relatively simple models (and not for the above logistic growth process). For example, generating functions can be calculated analytically when per capita birth and death rates do not depend on the current population size, i.e., when b N is independent of N. A notable example of an exact calculation using generating functions is the work of Zaider and Minerbo 6 who obtain TCP in closed form for a linear-birth death process with time-dependent death rate due to irradiation.
Given the limitations of these numerical and analytical methods, we now develop and use an approximation to estimate NTCP. The approach is based on Kramers-Moyal expansion techniques 12,17 and retains features of the intrinsic noise resulting from the finiteness of the population of cells. At the same time, we assume that the population is sufficiently large so that the jump process defined by the master Eq. (5) can be approximated by a stochastic differential equation (SDE).

Kramers-Moyal expansion and linear-noise approximation.
The expansion method is based on the assumption of a large, but finite population. We refer to M as the average system size and introduce the population density n t = N t /M 12,17 . We re-scale the threshold for the onset of NTC in the same way =  L M / ; NTC thus occurs when ≤  n t . We also introduce a re-scaled carrying capacity k = K/M. Given our above choice for the action of the step operator. We proceed to consider the limit where the system size is large, M ≫ 1. In this limit one can expand www.nature.com/scientificreports www.nature.com/scientificreports/ where we have neglected higher-order terms in M −1 . The probability of finding the random process n t with a value in the interval + n n n [ , d ) at time t is ∏(n, t)dn. For the current model, the drift and diffusion terms in Eq. (7) are given by respectively. Equation (7) describes the statistics generated by solutions of the It o SDE where W t is a standard Wiener process.
In principle, trajectories of this SDE can be generated in simulations, for example using the Euler-Maruyama method 18 . These simulations are more efficient than simulating the original model, in particular the population size only enters in the noise strength and does not affect computing time required to generate a set number of realisations. However, our aim is to make analytical progress. This requires further approximation, first because μ(n t , t) is a non-linear function of n t , and more importantly because the noise in Eq. (9) is multiplicative. We proceed by making a further simplification using the LNA 12,17 , effectively turning multiplicative noise into additive noise.
To carry out the LNA we introduce the stochastic process ξ t via the transformation 17 is a deterministic function of t, to be determined shortly. We next substitute this ansatz into Eq. (9) and expand in powers of M −1/2 . From the two lowest-order terms we find is the partial derivative of the drift μ(n, t) with respect to n, evaluated at φ(t) and t. The first of these equations indicates that φ(t) is the solution of a deterministic rate equation in the limit M → ∞. Within the linear-noise approximation, the population density n t is normally distributed with mean φ(t). Up to re-scaling of N and K this rate equation is identical to Eq. (3). The SDE (11b) describes fluctuations about this deterministic trajectory, due to demographic noise. We note that the LNA is only valid provided corrections to the deterministic dynamics remain small; if this is not the case higher-order terms in the system-size expansion become important. The approximation is generally appropriate if the deterministic trajectory is locally attracting, i.e., if μ φ ′ < t t [ ( ), ] 0 at all times. This condition is fulfilled in the present model. The linear SDE (11b) can be solved straightforwardly 12,17,19 , and, within the LNA, the distribution of n t is found to be Gaussian, centred around the solution φ(t) of Eq. (11a), The variance of this distribution, M −1 ∑ 2 (t), is a function of time, and can be obtained from the solution of  (13) can be solved exactly, and one can obtain an analytical expression for ∏(n, t) in Eq. (12). We discuss this in the context of the current model below. For the general case, these equations can be integrated forward numerically, using standard Runge-Kutta methods. This only requires the integration of two ODEs.

Approximation of NTCP(t).
We now proceed to estimate NTCP using the outcome of the LNA. Taking Eqs. (11a) and (11b) as a starting point, the calculation of NTCP amounts to a first-passage time problem for an SDE with time-dependent drift and noise strength. Equation (11b) describes an Ornstein-Uhlenbeck process with time-dependent rates 12  www.nature.com/scientificreports www.nature.com/scientificreports/ first-passage time distribution of Ornstein-Uhlenbeck processes is available for constant rates and a static boundary 20 , studies of instances with time-dependence are often based on approximation schemes for specific cases; examples can be found in the literature 21,22 .
To make progress we therefore use a further approximation. We focus on cases in which the deterministic trajectory φ(t) crosses the threshold =  L M / , as illustrated in Fig. 1(a); we write t * for this time. The exact value of t * will depend on the applied radiation protocol and the other model parameters. The calculation of NTCP(t) by Stocks et al. 10 is based on this deterministic contribution, and within their calculation NTCP (t) = Θ(t − t * ) is a Heaviside step function [Θ(u) = 1 for u ≥ 0, and Θ(u) = 0 otherwise]. Our aim is to build on these results and to capture some of the influence of intrinsic fluctuations on NTCP.
As a next step we look at the dynamics of Eqs. (11a) and (11b) in a time window around t * , as shown in Fig. 1(b). Some trajectories of the stochastic system will cross the threshold  before t * , and others after t * . We expect these fluctuations in the crossing time to decrease as the system-size parameter M is increased. To evaluate this further we consider the Gaussian distribution for the population density ⁎ n t obtained by evaluating Eq. (12) at time t * . By construction, this distribution is centred on , as shown in Fig. 1(c). We now proceed on the basis that trajectories with values >  ⁎ n t will first cross the threshold at a time greater than t * , and estimate this time of crossing from the dynamics near t * . Similarly, trajectories with >  ⁎ n t have already crossed the threshold, and we estimate how long before t * this has occurred. This procedure implies several assumptions, for example a trajectory with >  ⁎ n t may have had its first crossing before t * and then returned to values n t above  due to further fluctuations. This is not captured by our estimate of NTCP.
In order to focus on the dynamics in a time window near t * , it is useful to introduce re-scaled coordinates  Table 2 (parameter set (D)).

Coordinate Interpretation Relations
deviation from mean-field path due to linear noise www.nature.com/scientificreports www.nature.com/scientificreports/ Considering values of τ and ζ of order M 0 allows us to magnify the region around t * where boundary crossings are likely (ζ τ refers to the random process, while ζ is a value in the process's state space). In these coordinates, the crossing of the deterministic trajectory occurs at τ = 0, and the position of the threshold is at ζ = 0. We note that μ <  ⁎ t ( , ) 0 so that positive values of the re-scaled time (τ > 0) correspond to t > t * . A summary of the coordinates used in our analysis is given in Table 1.
Substituting the new coordinates into Eq. (7), and writing ζ τ Π ∼ ( , ) for the probability density in these coordinates, we find , i.e., near the threshold the dynamics of the system can be approximated by where ζ 0 is the location of the path at time τ = 0 (i.e., at t = t * ). Figure 1(b) shows a number of different stochastic trajectories in this region. Broadly, they travel along approximately parallel straight paths of gradient minus one (in the coordinate system of τ and ζ).
We now use this result to approximate the distribution of crossing times. To do this we estimate when a particular trajectory located at ξ 0 at time t * crosses (or did cross) the threshold. We write τ × (ζ 0 ) for this crossing time in the re-scaled coordinates. Using Eq. (16) we find We show this schematically in Fig. 1(c). We now combine this with the Gaussian distribution for ξ 0 obtained from the LNA, also shown in Fig. 1(c). Equation (12), evaluated at t = t * , can be written as  Using the definition of NTCP as outlined above we find where erf is the error function. We now test this approximation scheme on the logistic growth model defined in Eq. (2). We focus on a particularly simple case where there is no radiation prior to a certain time, and a constant rate of cell death due to radiation thereafter so that the hazard function h(t) is the step function  Table 2. Five sets of parameters used in Fig. 2 for Model 1. These parameter sets are the same as those considered by Stocks et al. 10 , but we have defined separate mitosis and natural death rates to be able to analyse stochastic effects in finite populations. The ratio of mitosis and natural death was chosen as 10:1, consistent with literature 5 . We primarily consider radiation of this type as a simple initial example, following the study of NTCP in Stocks et al. 10 . For this case, analytical and numerical methods exist for obtaining the moments of the first-passage time 23 . More complicated radiation protocols (for which no such methods are available) will be discussed in later examples.
We assume that the dynamics of the population start long before t = 0, so that the stationary state of the master Eq. (5) (with h(t) = 0) is reached by t = 0. The mean and variance of this stationary distribution are given by the fixed points of Eqs. (11a) and (13), using μ and σ 2 for the logistic model and setting h(t) = 0. We have At times t ≥ 0, Eqs. (11a) and (13) are given by From this, one then finds the passage time t * of the deterministic trajectory as Next we turn to Eq. (23b) in order to find ∑ 2 (t * ). For constant radiation the path φ(t) is monotonically decreasing in time. This allows us to trade the time derivative in Eq. (23b) for a derivative with respect to φ, resulting in a linear ODE for ∑ 2 as a function of φ. For our specific example this ODE can be solved in closed form, and we find the variance of first-passage times as This can then be used in Eq. (20) to obtain NTCP(t).
In Fig. 2 we show the resulting NTCP as a function of time for several sets of model parameters; these parameter sets are summarised in Table 2, and were previously motivated and used in Stocks et al. 10 to consider normal tissue complications arising from the treatment of prostate cancer. For these sets of parameters, the standard deviation in the time for NTCP onset varies from 3% (parameter set (D)) to 21% (parameter set (E)) of the mean onset time. In order to test the accuracy of our approximation, we have also obtained NTCP(t) for the original model by numeral integration of the master equation Eq. (5); these values are shown as black circles in Fig. 2. These results are compared with the analytical approximations in Eqs. (20) and (26), and for most of the parameter sets tested we find good agreement. The approximation works noticeably less well for parameter set (E) than for the other four sets. In this case, the speed with which the deterministic path crosses the boundary is lower than for the other parameter sets. This leads to a longer time window around t * within which crossings are likely, and thus a larger amount of error in our approximation. www.nature.com/scientificreports www.nature.com/scientificreports/

Extended model of normal and doomed cells
Model definitions. Hanin and Zaider 5 proposed a model which adds complexity by including radiation-damaged cells. In this model, damaged cells continue to occupy the limited volume available to the population. Damaged cells also carry out their functions, but fail to proliferate. The presence of such cells has been offered an explanation for the observation that, after irradiation, an initial lag period occurs before re-population 5,24 . We refer to this as Model 2.
As before there are 'normal cells' N which carry out the functions of the organ; these cells have the ability to proliferate (and also die with a constant rate d 1 from causes unrelated to radiation). Once damaged by radiation, a normal cell does not vanish immediately; rather, it becomes a 'doomed cell' X 5 . Doomed cells continue to contribute to the normal functions of the organ, however they are unable to proliferate. Thus, although they may temporarily aid the function of the organ, they ultimately die at a constant rate d 2 without reproducing. Doomed cells also consume resources and so are in direct competition with the normal cells. As a result of this, the per capita mitosis (birth) rate of normal cells decreases as the total size of the population of both types increases. The dynamics of Model 2 can be summarised as follows: We write N and X for the numbers of normal and doomed cells, respectively. As before, the constant k ≡ (1 − d 1 /b) −1 is chosen so that-in the absence of radiation-the stationary average size of the population of normal cells is M. An NTC is assumed to arise when the total number of functional cells, N + X, falls below a threshold L.
Writing s = (N + X)/M for the (re-scaled) total number of functional cells in the population, and x = X/M for the (re-scaled) number of doomed cells, one has the following rate equations in the deterministic limit, In this example, as per the literature 5,10 , we consider a hazard function representing brachytherapy where there is a time-varying dose of radiation acting on the population of normal cells, resulting from the decay of a radioactive implant: where α, β, γ, λ and R 0 are model parameters; R 0 in particular denotes the initial dose rate. Further details are given in the Supplementary Information. We consider a specific set of realistic parameters, proposed by Hanin and Zaider 5 and summarised in Table 3. These parameters were chosen to model the treatment of prostate cancer, where the normal-tissue complication refers to grade 2, or larger, toxicity ('GU2+') of the genitourinary tract.
Alternative approximation for NTCP. Results for Model 2 are presented in Fig. 3. We first focus on the deterministic dynamics, indicated by the blue lines in panels (a) and (c). In panel (a) the mitosis rate b is sufficiently low for deterministic trajectory to fall below the threshold  for the onset of NTCs. The approximation for NTCP developed previously can be applied, as discussed in more detail later.  25) and (26). Model parameters are given in Table 2.
Scientific RepoRtS | (2020) 10:8786 | https://doi.org/10.1038/s41598-020-64618-9 www.nature.com/scientificreports www.nature.com/scientificreports/ The second parameter set in Table 3 describes a case with a higher mitosis rate b. As shown in Fig. 3(c), the solution of the deterministic rate equations then only briefly falls below the threshold . The number of functional cells then increases again to values above . In the stochastic system we expect only a fraction of trajectories to cross the threshold; some realisations may never fall below , and hence NTCP(t) can be expected to take a long-time limit below one. This cannot be captured by the approximation method developed previously.
With this in mind, we propose the following improved method of estimating NTCP. Within the LNA, at each moment in time t the distribution of the population of interest (in this case s t ) is approximately normal with a mean φ(t) and variance ∑ 2 (t) given by Eqs. (11a) and (13), respectively. The amount of probability below the threshold  at a given time is then obtained as We now estimate NTCP(t) as the maximum amount of probability below the threshold at any earlier time t ≤ t, i.e., t t Parameter Definition Fig. 3(a,b)  Fig. 3(c,d) b  Table 3. Parameters used in Fig. 3. Similar parameters were previously proposed in Hanin and Zaider 5 . We have explicitly included normal-cell birth and death and made the assumption that d 1 = d 2 .  Table 3) Further steps of the mathematical evaluation are presented in the Supplementary Information. We briefly comment on the limitations of this approximation, before we discuss the results for the model of normal and doomed cells. Equation (31) provides a lower bound for NTCP of the process described by the LNA. This can be seen as follows. At a given time t, let the maximum in Eq. (31) have occurred at a time t m ≤ t; the estimate for NTCP(t) is then Q(t m ). Consider now a trajectory with a total population density above the boundary at time n m , >  s t m . Such a trajectory does not contribute to NTCP(t) within our approximation, even though the population size might have taken values below the threshold before t m , or go below threshold between t m and t. The above approximation therefore underestimates NTCP. We note that the SDE obtained in the LNA is itself an approximation, so the above calculation is not necessarily a lower bound to the NTCP of the discrete population dynamics from which we started.
Despite these limitations, the method provides useful estimates for NTCP. For example, NTCP(t) obtained from Eqs. (30) and (31) for the simple logistic model of healthy tissue does not significantly differ from the predictions of the previous approximation method for NTCP(t). To keep the language compact we will refer to the previous approximation procedure as Approximation 1 from now on, and to that in Eqs. (30) and (31) as Approximation 2. A quantitative comparison of the distributions of first-passage time from the two approximations for Model 1 is included in the Supplementary Information. It indicates that Approximation 2 provides an improvement relative to Approximation 1 and that both methods do considerably better than the deterministic approximation in Stocks et al. 10 .
ntcp for model of normal and doomed cells. Approximation 2 applied to Model 2 can provide a significantly improved prediction of NTCP compared to Approximation 1, as we will discuss now. In this context it is useful to distinguish the cases in which normal tissue complication occurs with certainty at long times and those in which long-time NTCP stays below one.
Certain normal tissue complication at long times. For the first set of parameters in Table 3 normal-tissue complication occurs with probability one at long times. We show results in panel (a) of Fig. 3. The source of radiation is implanted at time zero, assuming that the population of normal cells is at its stationary state at this time. The population of functional cells then decreases monotonously, and the number of functional cells crosses the threshold for the onset of NTC. Panel (b) shows the estimates for NTCP as a function of time for Approximation 1 and Approximation 2. Their predictions are largely indistinguishable, and they both agree well with results for the original model found by numerical integration of the master equation.
We note that for this choice of parameter values, carrying out the numerical integration of the master equation takes approximately 10 5 times longer than to evaluate each of the two approximations. This is because the master equation consists of a set of M 2 coupled ODEs, whereas evaluation of each of the approximations only involves integrating forward five ODEs (for the mean proportion of each type of cell, their variances and the covariance). Thus, the approximation methods offer a significant increase in efficiency for large populations, at moderate reduction of accuracy.
Uncertain onset of normal tissue complication. In panels (c) and (d) of Fig. 3 we show the same quantities, but for a different choice of birth rate (see Table 3). The deterministic path barely crosses the boundary , and for this choice of parameters only a fraction of trajectories of the stochastic model will lead to an onset of NTC. In this case, the predictions of the two approximations are widely different. Approximation 1 assumes a Gaussian distribution of first-passage times and deviates significantly from the NTCP seen in the original model. Most notably, this approximation predicts that all trajectories eventually cross the boundary so that NTCP(t) → 1 at large times. Although this is not the case for typical population size used in this example (M = 1000), we remark that for M → ∞ NTC becomes certain at long times in the original model for the present parameter set.
As seen in Fig. 3(d) Approximation 2 outperforms Approximation 1. This is because, in the narrow region where boundary-crossings are likely, there is a significant change in the drift for the total population size; the sign of the drift changes from negative to positive. Approximation 2 takes this into account, whereas Approximation 1 is based on constant drift within the region near the boundary . Unlike Approximation 1, Approximation 2 does not (wrongly) predict that all trajectories eventually cross the boundary. Instead NTCP(t) remains below unity at t → ∞ within Approximation 2.

Discussion
We have derived approximations for the distribution of the first-passage time of a stochastic birth-death model through a boundary. These approximations capture effects of fluctuations in the population which were discarded in previous approaches. The improvements rely on an expansion in the inverse typical size of the population. One can therefore expect the approach to be particularly useful for large, but finite populations. Intrinsic noise is then weak, but not always weak enough to be ignored altogether. The methods we have developed do not require the birth-death model to be linear, for example we have considered logistic growth.
Our analysis was presented in the context of normal tissue complication probabilities for radiotherapy treatment. In particular, we have obtained approximations of NTCP for models of normal tissue with a single type of cell and for an extended model with two different cell types. Our results demonstrate that these approximations can lead to a significant increase in efficiency over simulation methods, at a moderate loss of accuracy. This is the case particularly when the underlying model becomes complex and has many different internal states. We note that NTCP takes the form of an error function in our approximation; this functional form has previously been reported in statistical models of NTCP, see for example Lyman 25 . (2020) 10:8786 | https://doi.org/10.1038/s41598-020-64618-9 www.nature.com/scientificreports www.nature.com/scientificreports/ Our applied analysis is limited to stylised models, and we do not claim direct clinical applicability. For example, we assume that NTC occurs when the number of functioning cells first falls below a threshold. This threshold assumption, and more broadly the assumption of a 'functional reserve' or critical number of structural elements that must be undamaged to avoid tissue failure, has been widely made in models of NTCP in the literature 5,10,26,27 . We acknowledge that this assumption is more valid for certain types of tissue and complications than others, in particular for tissues arranged 'in parallel' rather than 'in series' 28 . However, to what extent this established assumption is clinically appropriate remains a matter for further biological exploration and is beyond the scope of the mathematical approach of this manuscript. In addition, only a limited number of parameter sets have been investigated in this study and, although informed by previous literature, we note that these serve as an illustration of our methodological developments rather than claiming they are necessarily clinically realistic values.
However, the analytical approaches developed here have the potential to give clinical benefit as more realistic models of cancerous cells and normal tissue evolve. For example, combining approximations of NTCP with values for tumour control probabilities -the probability of eliminating all cancer cells -can give an estimate of the success of a particular treatment protocol. In particular our approximations of NTCP, used with approximations of TCP from the literature, can be used to estimate the probability of complication-free tumour control. This can be used for the efficient identification of optimised parameters for treatment planning; we give more details on this application of our approximations in the Supplementary Information.
One may ask whether the inclusion of intrinsic noise is necessary in modelling NTCP. Hanin and Zaider 5 argue that deterministic approaches might be sufficient, due to the high numbers of cells involved. However we note that the size of the population may vary depending on context. For example, the model could describe a functional subunit (FSU) of an organ, rather than the entire organ 27,29,30 . NTCP would then not necessarily indicate the probability that an organ fails, but instead that such a subunit no longer fulfils its function. For instance, Niemierko and Goitein consider a kidney split into 10 7 FSUs, where each FSU contains 10 4 cells 27 . In such circumstances noise in the population (i.e., within a FSU) may become relevant. Intrinsic stochasticity may also be important in the context of stem cells, especially if they are present in relatively small numbers [31][32][33][34] .
To summarise, we have developed approximations to the first and second moment of a general first-passage time problem and applied them to models of cells damaged by radiotherapy. We hope that our mathematical results can be adapted to more clinically realistic models of cancerous cells and normal tissue as they evolve and note that they have wider applicability to other problems, in many wide-ranging fields, in which first-passage times of stochastic processes are of interest.