Elucidating the correlations between cancer initiation times and lifetime cancer risks

Cancer is a genetic disease that results from the accumulation of unfavorable mutations. As soon as the genetic and epigenetic modifications associated with these mutations become strong enough, uncontrolled tumor cell growth is initiated, eventually spreading through healthy tissues. Clarifying the dynamics of cancer initiation is thus critically important for understanding the molecular mechanisms of tumorigenesis. Here we present a new theoretical method to evaluate the dynamic processes associated with cancer initiation. It is based on a discrete-state stochastic description of the formation of tumors as a fixation of cancerous mutations in tissues. Using a first-passage analysis, the probabilities for the cancer to appear and the times before it happens, which are viewed as fixation probabilities and fixation times, respectively, are explicitly calculated. It is predicted that the slowest cancer initiation dynamics is observed for neutral mutations, while it is fast for both advantageous and, surprisingly, disadvantageous mutations. The method is applied to estimate the cancer initiation times from experimentally available lifetime cancer risks for different types of cancer. It is found that a higher probability of the cancer occurring does not necessarily lead to a shorter cancer initiation time. Our theoretical analysis helps to clarify microscopic aspects of cancer initiation processes.

the system to reach the state 2 if at t = 0 the system was in the state i (i = 0, 1, or 2). These functions satisfy backward master equations. In addition, we have the following boundary condition $\Pi_2(t) = \delta(t)$. These equations can be solved by using Laplace transformations, yielding

$$(s + rbn)\,\Pi_0(s) = rbn\,\Pi_1(s), \tag{S3}$$

$$\Pi_0(s) = \frac{rbn\,A(N-n)}{s^2 + s\,[rbn + A(N-n) + A(n+1)] + rbn\,A(N-n)}. \tag{S5}$$

The mean first-passage time $T_{n,n+1}$ to reach the state $(n+1, N-n-1)$ from the state $(n, N-n)$ is given by

$$T_{n,n+1} = -\left.\frac{d\Pi_0(s)}{ds}\right|_{s=0}.$$

After some algebra, the expression for the mean first-passage time can be written as

$$T_{n,n+1} = \frac{rbn + A(N-n) + A(n+1)}{rbn\,A(N-n)}.$$

Figure S1. Schematic view for the derivation of Eq. S8.
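As a quick numerical sanity check of the mean first-passage time, the following Python sketch differentiates the Laplace-domain expression of Eq. S5 at s = 0 by a central finite difference and compares it with the closed form (rbn + A(N − n) + A(n + 1)) / (rbn A(N − n)). The function names, the step size `h` and the parameter values are illustrative assumptions and are not taken from the original derivation.

```python
def pi0_laplace(s, r, b, n, N, A):
    """Laplace-domain first-passage function Pi_0(s) as written in Eq. S5."""
    k_div = r * b * n        # the r*b*n term of Eqs. S3 and S5
    k_fwd = A * (N - n)      # the A(N - n) term of Eq. S5
    k_back = A * (n + 1)     # the A(n + 1) term of Eq. S5
    return k_div * k_fwd / (s * s + s * (k_div + k_fwd + k_back) + k_div * k_fwd)


def mfpt_numeric(r, b, n, N, A, h=1e-6):
    """Estimate T = -dPi_0/ds at s = 0 with a central finite difference."""
    up = pi0_laplace(h, r, b, n, N, A)
    down = pi0_laplace(-h, r, b, n, N, A)
    return -(up - down) / (2.0 * h)


def mfpt_closed_form(r, b, n, N, A):
    """Closed-form mean first-passage time T_{n,n+1}."""
    k_div, k_fwd, k_back = r * b * n, A * (N - n), A * (n + 1)
    return (k_div + k_fwd + k_back) / (k_div * k_fwd)


if __name__ == "__main__":
    params = dict(r=1.5, b=1.0, n=3, N=10, A=5.0)  # illustrative values only
    print("finite difference:", mfpt_numeric(**params))
    print("closed form:      ", mfpt_closed_form(**params))
```

With these definitions one can also check that for very large A the effective rate 1/T_{n,n+1} approaches rbn(N − n)/(N + 1), which follows directly from the closed form.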

Consider a tissue compartment that has N normal cells. At time zero one of them is mutated. The problem is then analogous to a random walk on the lattice of N sites. At t = 0 the walk starts at the site 1. The state n corresponds to n mutated and (N − n) normal cells. As shown above, the transition rate from the state n to n + 1 is equal to $r a_n$, while the reverse transition from n to n − 1 occurs with the rate $a_n$. The problem of understanding when the whole compartment becomes mutated is then analogous to a first-passage problem of the random walker starting on the site 1 to reach the site N for the first time before disappearing to the site 0. One can define the corresponding first-passage probability density functions, $F_n(t)$, to start from any site n and reach the site N at time t (if at t = 0 the walker was at the site n). The temporal evolution of these probabilities follows the backward master equations

$$\frac{dF_n(t)}{dt} = r a_n F_{n+1}(t) + a_n F_{n-1}(t) - (1 + r)\,a_n F_n(t)$$

for 1 < n < N; and

$$\frac{dF_1(t)}{dt} = r a_1 F_2(t) - (1 + r)\,a_1 F_1(t).$$

In addition, we have the boundary condition $F_N(t) = \delta(t)$, which means that the process is immediately accomplished if the walker starts from the site N. Let us also do the calculations assuming b = 1, i.e., all time scales are renormalized with respect to the cell replication time.
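The random-walk picture can be illustrated with a minimal Monte Carlo sketch in Python. It assumes that the forward and backward rates out of the state n are r a_n and a_n, so that every jump goes up with probability r/(1 + r) regardless of n; waiting times are ignored because only the fixation probability is estimated here. The function names, the seed and the number of trials are hypothetical choices made for illustration.

```python
import random


def reaches_fixation(r, N, rng):
    """Run one walk from site 1 until it hits site 0 (loss) or site N (fixation)."""
    p_up = r / (1.0 + r)   # ASSUMPTION: rates r*a_n (up) and a_n (down)
    n = 1
    while 0 < n < N:
        n += 1 if rng.random() < p_up else -1
    return n == N


def estimate_fixation_probability(r, N, trials=20000, seed=0):
    """Fraction of walks started at site 1 that reach site N before site 0."""
    rng = random.Random(seed)
    hits = sum(reaches_fixation(r, N, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    r, N = 2.0, 10   # illustrative values only
    gamblers_ruin = (1.0 - 1.0 / r) / (1.0 - r ** (-N))
    print("simulated:", estimate_fixation_probability(r, N))
    print("gambler's-ruin formula:", gamblers_ruin)
```

The estimate can be compared with the classical gambler's-ruin result (1 − 1/r)/(1 − r^(−N)) for a biased walk between two absorbing boundaries.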

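It is convenient to solve this problem using the Laplace transformation, which changes the backward master equations into

$$\left(\frac{s}{r a_n} + 1 + \frac{1}{r}\right) F_n(s) = F_{n+1}(s) + \frac{F_{n-1}(s)}{r}, \qquad 1 < n < N; \tag{S12}$$

$$\left(\frac{s}{r a_1} + 1 + \frac{1}{r}\right) F_1(s) = F_2(s); \tag{S13}$$

and $F_N(s) = 1$. Because we are interested only in the fixation probabilities and fixation times, there is no need to obtain full analytical expressions for $F_n(s)$; it is sufficient to determine the expansion of this function up to the linear term in s. Thus, we can write

$$F_n(s) \simeq \pi_n + \frac{b_n s}{r}, \tag{S14}$$

where $\pi_n = F_n(s = 0)$ is the fixation probability starting from n mutated cells, and the unknown parameters $b_n$ are related to the fixation times $T_n$ (viewed as conditional mean first-passage times) as

$$T_n = -\frac{1}{\pi_n}\left.\frac{dF_n(s)}{ds}\right|_{s=0} = -\frac{b_n}{r\,\pi_n}.$$

Note that $\pi_N = 1$ and $b_N = 0$. Substituting Eq. S14 into Eqs. S12 and S13 we obtain for the fixation probabilities

$$\left(1 + \frac{1}{r}\right)\pi_n = \pi_{n+1} + \frac{\pi_{n-1}}{r}, \quad 1 < n < N; \qquad \left(1 + \frac{1}{r}\right)\pi_1 = \pi_2,$$

which, together with $\pi_N = 1$, gives

$$\pi_n = \frac{1 - r^{-n}}{1 - r^{-N}}.$$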
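The zeroth-order (s = 0) equations can be checked numerically. The sketch below assumes that the fixation probabilities obey the recursion (1 + 1/r) π_n = π_{n+1} + π_{n−1}/r with π_0 = 0 and π_N = 1, solves it by a simple shooting procedure, and compares the result with the closed form (1 − r^(−n))/(1 − r^(−N)). The function names and parameter values are illustrative.

```python
def fixation_probs_closed_form(r, N):
    """pi_n for n = 0..N; the neutral case r = 1 is the limit n / N."""
    if abs(r - 1.0) < 1e-12:
        return [n / N for n in range(N + 1)]
    return [(1.0 - r ** (-n)) / (1.0 - r ** (-N)) for n in range(N + 1)]


def fixation_probs_shooting(r, N):
    """Solve (1 + 1/r) pi_n = pi_{n+1} + pi_{n-1}/r with pi_0 = 0, pi_N = 1.

    The recursion is linear, so we iterate with the trial value pi_1 = 1
    and rescale the whole sequence at the end so that pi_N = 1.
    """
    x = [0.0, 1.0]
    for n in range(1, N):
        x.append((1.0 + 1.0 / r) * x[n] - x[n - 1] / r)
    return [value / x[N] for value in x]


if __name__ == "__main__":
    r, N = 2.0, 12   # illustrative values only
    for n, (p, q) in enumerate(zip(fixation_probs_shooting(r, N),
                                   fixation_probs_closed_form(r, N))):
        print(n, p, q)
```

The shooting approach avoids building a matrix; it works here because the boundary value π_0 = 0 makes the solution proportional to the trial value of π_1.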

Calculation of fixation times
From Eqs. S12, S13 and S14, the corresponding equations for the parameters $b_n$ can be written as

$$\frac{\pi_n}{a_n} + \left(1 + \frac{1}{r}\right) b_n = b_{n+1} + \frac{b_{n-1}}{r}, \qquad 1 < n < N; \tag{S19}$$

$$\frac{\pi_1}{a_1} + \left(1 + \frac{1}{r}\right) b_1 = b_2. \tag{S20}$$

To solve Eqs. S19 and S20, let us write the following ansatz

$$b_{n+1} = b_n + K_n, \tag{S21}$$

where $K_n$ is another unknown parameter that will be determined. Then the substitution of Eq. S21 into Eqs. S19 and S20 yields

$$K_n - \frac{K_{n-1}}{r} = \frac{\pi_n}{a_n}. \tag{S22}$$

Eq. S22 can be easily solved for $K_n$. Then from Eq. S21 we can write an expression for $b_n$ that is valid for any $1 \le n \le N$, which due to $b_N = 0$ determines $b_1$. The final fixation time (normalized to the cell replication rate b) then follows, and after some algebra it takes the form of a double summation. The summations can be further simplified by defining a new index n = j − l.
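The chain of steps above can be sketched numerically. The Python example below assumes the recursion π_n/a_n + (1 + 1/r) b_n = b_{n+1} + b_{n−1}/r with b_0 = 0 and b_N = 0, the relation T_1 = −b_1/(r π_1), and an illustrative rate a_n = n(N − n)/(N + 1) (b = 1); this choice of a_n is an assumption made here, not a definition quoted from the text. It solves for b_1 by shooting and compares the resulting fixation time with the double summation (1/r) Σ_{l=1}^{N−1} Σ_{j=1}^{l} r^{j−l} π_j/a_j obtained by iterating the recursion by hand.

```python
def fixation_probability(n, r, N):
    """pi_n = (1 - r^-n) / (1 - r^-N); neutral limit is n / N."""
    if abs(r - 1.0) < 1e-12:
        return n / N
    return (1.0 - r ** (-n)) / (1.0 - r ** (-N))


def rate_a(n, N):
    # ASSUMPTION: illustrative rate a_n = n (N - n) / (N + 1), with b = 1.
    return n * (N - n) / (N + 1)


def fixation_time_shooting(r, N):
    """Solve the b_n recursion with b_0 = 0 and b_N = 0, return -b_1/(r pi_1)."""
    # b_n is linear in b_1: b_n = alpha_n + beta_n * b_1.
    alpha_prev, alpha = 0.0, 0.0   # alpha_0, alpha_1
    beta_prev, beta = 0.0, 1.0     # beta_0, beta_1
    for n in range(1, N):
        source = fixation_probability(n, r, N) / rate_a(n, N)
        alpha_prev, alpha = alpha, source + (1 + 1 / r) * alpha - alpha_prev / r
        beta_prev, beta = beta, (1 + 1 / r) * beta - beta_prev / r
    b1 = -alpha / beta             # enforce b_N = 0
    return -b1 / (r * fixation_probability(1, r, N))


def fixation_time_double_sum(r, N):
    """(1/r) * sum_{l=1}^{N-1} sum_{j=1}^{l} r^(j-l) * pi_j / a_j."""
    total = 0.0
    for l in range(1, N):
        for j in range(1, l + 1):
            total += r ** (j - l) * fixation_probability(j, r, N) / rate_a(j, N)
    return total / r


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):   # illustrative fitness values
        print(r, fixation_time_shooting(r, 20), fixation_time_double_sum(r, 20))
```

For N = 2 both routines reduce to 1/((1 + r) a_1), the mean lifetime of the single transient state, which is a convenient hand check.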

Now we can change the order of summations, which eventually produces a compact expression. It can be further simplified for r → 1 by employing L'Hôpital's rule.
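To illustrate what such a compact expression looks like, the sketch below evaluates a single-sum form that one obtains by exchanging the order of a double summation over r^{j−l} π_j/a_j, namely T = Σ_{j=1}^{N−1} (1 − r^(−j))(1 − r^(−(N−j))) / [a_j (r − 1)(1 − r^(−N))], together with its r → 1 limit Σ_j j(N − j)/(N a_j). Both the formula as written here and the rate a_n = n(N − n)/(N + 1) are assumptions of this sketch and may differ in form from the expression in the original text.

```python
def rate_a(n, N):
    # ASSUMPTION: illustrative rate a_n = n (N - n) / (N + 1), with b = 1.
    return n * (N - n) / (N + 1)


def fixation_time(r, N):
    """Single-sum fixation time starting from one mutated cell."""
    if abs(r - 1.0) < 1e-9:
        # Neutral limit (L'Hopital): each factor (1 - r^-k) ~ k (r - 1).
        return sum(j * (N - j) / (N * rate_a(j, N)) for j in range(1, N))
    norm = (r - 1.0) * (1.0 - r ** (-N))
    return sum(
        (1.0 - r ** (-j)) * (1.0 - r ** (-(N - j))) / (rate_a(j, N) * norm)
        for j in range(1, N)
    )


if __name__ == "__main__":
    N = 50   # illustrative compartment size
    for r in (0.5, 0.9, 1.0, 1.1, 2.0):
        print(f"r = {r:4.1f}  T = {fixation_time(r, N):.4f}")
```

Under these assumptions the printed values are largest at r = 1 and decrease on both sides, in line with the statement that neutral mutations show the slowest initiation dynamics.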

Explicit expression for fixation times for N → ∞
In general it is difficult to perform the explicit summation in Eq. S27 for large N. For N → ∞, we can convert the summation to integration and perform the integrals term by term. After some algebra, we obtain an expression in terms of $\mathrm{Ei}(x)$ and $\gamma$, where $\mathrm{Ei}(x)$ represents the exponential integral function defined by $\mathrm{Ei}(x) = -\int_{-x}^{\infty} \frac{e^{-z}}{z}\,dz$, and $\gamma$ is the Euler-Mascheroni constant. Because c < 0 and N → ∞, the first two terms in the resulting expression for the fixation time vanish, and we obtain the final expression. Because the number of stem cells is very large, it can be shown from Eq. 12 in the main text and (S42) that
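For readers who want to evaluate the exponential integral numerically, the following Python sketch uses the standard series Ei(x) = γ + ln|x| + Σ_{k≥1} x^k/(k·k!) and, for negative arguments, checks it against a direct midpoint-rule evaluation of the definition Ei(x) = −∫_{−x}^{∞} e^{−z}/z dz. The function names, the truncation length and the quadrature settings are illustrative assumptions; the series is only suitable for moderate |x|.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def ei_series(x, terms=200):
    """Ei(x) from the series gamma + ln|x| + sum_{k>=1} x^k / (k * k!)."""
    if x == 0:
        raise ValueError("Ei(x) diverges at x = 0")
    total = 0.0
    term = 1.0                     # running value of x^k / k!
    for k in range(1, terms + 1):
        term *= x / k
        total += term / k
    return EULER_GAMMA + math.log(abs(x)) + total


def ei_quadrature_negative(x, cutoff=40.0, steps=100000):
    """Ei(x) for x < 0 from -int_{-x}^{inf} e^{-z}/z dz (midpoint rule).

    The upper limit is truncated at -x + cutoff, where the integrand
    is negligibly small.
    """
    if x >= 0:
        raise ValueError("this helper only handles x < 0")
    lower = -x
    h = cutoff / steps
    integral = 0.0
    for i in range(steps):
        z = lower + (i + 0.5) * h
        integral += math.exp(-z) / z
    return -integral * h


if __name__ == "__main__":
    for x in (-2.0, -1.0, -0.5):   # illustrative negative arguments
        print(x, ei_series(x), ei_quadrature_negative(x))
```

In practice a library routine such as `scipy.special.expi` would normally be preferred; the series is shown only because it makes the appearance of γ explicit.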