## Introduction

Cancer is a disease of multicellular organisms, which occurs when individual cells fail to contribute to normal tissue function and instead divide selfishly, resulting in uncontrolled local growth, metastasis, and often death1. Multicellular organisms have evolved both species- and tissue-specific mechanisms to suppress somatic evolution and, thus, delay aging and the emergence of cancers. The most striking evidence for the evolution of cancer suppression originates with a prediction of the multistage model2,3, which was succinctly expressed by Peto4: He observed that even though humans are around 1000 times larger than mice and live about 30 times longer, the overall incidence of cancer in the two species is very similar, a sign of evolutionary fine-tuning5.

Similar to large and long-lived species, tissues within an individual that are large and rapidly dividing also face potentially higher rates of somatic evolution and, as a result, a higher incidence of tumors. This raises the question of whether tissue-specific mechanisms to suppress somatic evolution have also evolved. A recent empirical dataset assembled by Tomasetti and Vogelstein6 offers key insight into this question. The dataset, which gathers lifetime cancer risk and the total number of divisions of healthy self-replicating cells (i.e., stem cells) for 31 different tissues, displays a striking tendency: the dependence of cancer incidence on the number of stem cell divisions is sub-linear. In particular, a hundred-fold increase in the number of divisions only results in a ten-fold increase in incidence7,8. As first pointed out by Nobel et al.7, this trend supports theoretical predictions9,10,11 that tissues with more stem cell divisions (typically larger ones with rapid turnover, e.g., the colon) are relatively less prone to develop cancer, which by analogy we may call Peto’s paradox for tissues7,8.

However, while there are clear examples of how species-specific differences can control cancer risk, e.g., by increasing the copy number of tumor suppressor genes12, it is not clear how different tissues, subject to different constraints but sharing the same genome, can control somatic evolution.

Self-renewing tissues that must generate a large number of cells during an individual’s lifetime and in which cancers typically arise are characterized by hierarchical differentiation, which can suppress somatic evolution in two fundamental respects. First, hierarchical organization limits the mutational burden of maintaining tissues8,13,14 by reducing divisional load, i.e., the number of cell divisions along cell lineages. Second, the rate of somatic evolution also depends on the strength of somatic selection, which is limited by “washing out”, i.e., the ability of differentiation to drive cells higher in the hierarchy towards the terminally differentiated state and permanent loss of proliferative ability (Fig. 1a)15,16,17,18.

Washing out can be quantified by the “proliferative disadvantage” of cells, a quantity (formally defined below) that is proportional to the difference between the rate of cell loss (via symmetric differentiation or cell death) and the rate of self-renewal of cells at a given level of differentiation. In healthy tissues, stem cells are lost and self-renewed at the same rate and, as a result, have no proliferative disadvantage. Higher in the hierarchy, however, more differentiated progenitor cells always have an inherent proliferative disadvantage as some cells arrive by differentiation from lower levels, and self-renewal replenishes only a fraction of the cells lost (Fig. 1b). As a result, the descendants of progenitors are eventually “washed out” of the tissue by cells differentiating from lower levels of the hierarchy.

The higher the proliferative disadvantage, i.e., the more committed the cells are to differentiation rather than self-renewal, the more resistant they are to somatic evolution toward uncontrolled growth, because they must accumulate more or stronger mutations before being washed out. Cells with a higher proliferative disadvantage are, thus, more resistant to mutations leading to cancer.

Hierarchical tissues can optimally restrain somatic evolution by simultaneously minimizing divisional load and maximizing washing out when a sufficiently large number of progressively faster differentiating cell types are present. As Derényi et al.8 showed, this requires $${\log }_{2}(N/{N}_{0})$$ hierarchical levels in a tissue where N0 stem cells are responsible for generating N terminally differentiated cells over an individual’s lifetime. In such optimal differentiation hierarchies only stem cells are self-renewed, all other cell types are fully committed to differentiation (i.e., do not self-renew) and, therefore, have a maximal proliferative disadvantage.

Peto’s paradox for tissues, however, implies that in real tissues we do not in general see optimal hierarchies that reduce cancer incidence to the lowest possible value. This is reflected in Tomasetti and Vogelstein’s6 data by the smaller and slower tissues that divide less often being less protected against cancer than larger ones scaled to the same size (e.g., cancer of the esophagus vs. colorectal cancer). The evolutionary explanation for suboptimal tissue organization is that reduction of cancer incidence, especially beyond the reproductive age of the individual, is expected to provide diminishing fitness advantages and, consequently, a tissue-specific limit exists beyond which the effects of subsequent beneficial mutations will not be large enough to overcome random genetic drift. This “drift-barrier hypothesis” has been successful in explaining variation in a variety of traits such as genome size and mutation rate across diverse taxa19.

Consider as an example the per generation mutation rate: Selection is generally expected to favor reduced mutation rates20,21 as it reduces the load of deleterious mutations. Current evidence, however, indicates that differences in mutation rates, which can vary over orders of magnitude across different species, are not the results of physiological constraints on DNA-replication fidelity. Instead, mutation rates in different species are the result of a balance between selection and genetic drift as evidenced by their negative correlation with effective population size22,23.

In case of hierarchical tissues, it is not well understood how suboptimal tissues (with fewer than optimal hierarchical levels) restrain somatic evolution. On the one hand, washing out (characterized by the proliferative disadvantage) is maximized when progenitor cells can only differentiate and never self-renew. On the other hand, minimizing mutation accumulation (characterized by the lifetime divisional load, i.e., the length of the longest cell lineages) requires non-vanishing self-renewal of the progenitors8. Crucially, vanishing self-renewal of the progenitors delegates the self-renewal burden to the stem cells, which would make the cell lineages longer. Understanding the organization of real tissues (with less than optimal complexity) requires us to consider this inherent conflict between maximizing washing out, i.e., making progenitor cells more resistant to mutations leading to cancer, and minimizing the accumulation of the same mutations.

Here, we explore the organizational properties of hierarchical tissues that keep the lifetime risk of cancer below a threshold value, determined by the “drift-barrier”. We show that under general conditions there exists a trade-off between minimizing mutation accumulation and maximizing the proliferative disadvantage of cells. This trade-off provides an explanation for the observed higher division rate of stem cells than what would be expected solely from the minimization of the accumulation of mutations.

## Results

Consider a minimal generic model of hierarchically organized, self-sustaining tissue with cells arranged into n + 1 hierarchical levels based on their differentiation state8: The bottom level (level 0) is comprised of tissue-specific stem cells, while higher levels (levels k, where 0 < k < n) contain progressively more differentiated progenitors, and the top level (level n) corresponds to the terminally differentiated cells (Fig. 1a). During tissue homeostasis the stem cell level produces differentiated cells at a rate of δ0, while the differentiation rates of higher levels (denoted by δk for level k) are progressively larger. The increasing tendency of the differentiation rates of the progenitor levels (0 < k < n) is specified by the level-specific amplification factors γk = δk/δk−1, which relate the differentiation rate of a progenitor level to that of the level below it (cf. Fig. 1b).

### The proliferative disadvantage of cells is determined by the amplification factor

A differentiation hierarchy $${{{{{{{\mathcal{H}}}}}}}}$$ with n + 1 levels and N0 stem cells is fully described by the per cell rates of the four microscopic events: symmetric differentiation ($${r}_{k}^{\uparrow \uparrow }$$), asymmetric differentiation ($${r}_{k}^{\circ \uparrow }$$), symmetric cell division ($${r}_{k}^{\circ \circ }$$), and cell death ($${r}_{k}^{\times }$$) for each level k (Fig. 1a), which together specify the homeostatic cell numbers Nk at each level k (Fig. 1b).

The per cell rate of net cell production at level k can be expressed as

$${R}_{k}={r}_{k}^{\uparrow \uparrow }+{r}_{k}^{\circ \uparrow }+{r}_{k}^{\circ \circ }-{r}_{k}^{\times }={A}_{k}-{r}_{k}^{\times },$$
(1)

where

$${A}_{k}={r}_{k}^{\uparrow \uparrow }+{r}_{k}^{\circ \uparrow }+{r}_{k}^{\circ \circ },$$
(2)

the sum of all three types of cell division rates, characterizes the divisional activity of the cells.

Homeostasis, cf. Fig. 1b, implies that on any particular level the number of cells remains constant on average:

$${\delta }_{k-1}+{N}_{k}{r}_{k}^{\circ \circ }-{N}_{k}{r}_{k}^{\uparrow \uparrow }-{N}_{k}{r}_{k}^{\times }={\delta }_{k-1}-{N}_{k}{W}_{k}=0,$$
(3)

i.e.,

$${\delta }_{k-1}={N}_{k}{W}_{k},$$
(4)

where

$${W}_{k}={r}_{k}^{\uparrow \uparrow }-{r}_{k}^{\circ \circ }+{r}_{k}^{\times }$$
(5)

can be identified as the net per cell rate at which cells are depleted (“washed out”) from level k. Note that because there is no differentiation toward the stem cell level, δ−1 is formally set to 0 and, therefore, W0 = 0. For all progenitor levels Wk > 0.

To derive the relationship between the amplification factor and the strength of “washing out”, we first express δk, the rate at which differentiated cells are produced by level k, as

$${\delta }_{k}={N}_{k}\left(2{r}_{k}^{\uparrow \uparrow }+{r}_{k}^{\circ \uparrow }\right)={N}_{k}({R}_{k}+{W}_{k}),$$
(6)

where symmetric differentiation events ($${r}_{k}^{\uparrow \uparrow }$$) produce two differentiated descendants on level k + 1 and asymmetric differentiation events ($${r}_{k}^{\circ \uparrow }$$) produce only one descendant on level k + 1 (the other one remains on level k). Using Eqs. (4) and (6) the amplification factor γk for the progenitor levels can be expressed as

$${\gamma }_{k}=\frac{{\delta }_{k}}{{\delta }_{k-1}}=\frac{{R}_{k}+{W}_{k}}{{W}_{k}}=2+\frac{{R}_{k}-{W}_{k}}{{W}_{k}}.$$
(7)

The proliferative disadvantage of cells at level k > 0 can be quantified by the dimensionless ratio of the rate of washing out Wk and the divisional activity Ak:

$${\pi }_{k}=\frac{{W}_{k}}{{A}_{k}}=\frac{{W}_{k}}{{R}_{k}}\frac{{R}_{k}}{{A}_{k}}=\frac{1}{{\gamma }_{k}-1}(1-{\varepsilon }_{k}) \; > \; 0\quad {{{{{{{\rm{for}}}}}}}}\,{{{{{{{\rm{all}}}}}}}}\quad k \; > \; 0,$$
(8)

where

$${\varepsilon }_{k}=\frac{{r}_{k}^{\times }}{{A}_{k}}$$
(9)

is the death to birth ratio at level k.

As we are interested in how tissues of different complexity can restrain somatic evolution, in the following we will omit cell death, which can only increase the mutational burden, because a lost cell must always be replaced by cell division. Under such conditions the above expressions simplify to: Ak = Rk and εk = 0, from which it follows that Rk ≥ Wk and, therefore, γk ≥ 2. The minimum of γk = 2 is reached when only symmetric differentiation events ($${r}_{k}^{\uparrow \uparrow }$$) occur. Finally, the proliferative disadvantage πk = 1/(γk − 1) becomes a decreasing function of the amplification factor γk that is maximized at γk = 2, corresponding to progenitors that only divide via symmetric differentiation.
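The relations in Eqs. (1)–(9) can be checked numerically. The following is a minimal sketch, not the authors' implementation: the class name `LevelRates`, its field names, and the example rate values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelRates:
    """Per-cell event rates at one progenitor level k (hypothetical container)."""
    sym_diff: float      # symmetric differentiation rate
    asym_diff: float     # asymmetric differentiation rate
    sym_div: float       # symmetric (self-renewing) division rate
    death: float = 0.0   # cell death rate

    @property
    def activity(self) -> float:        # A_k, Eq. (2)
        return self.sym_diff + self.asym_diff + self.sym_div

    @property
    def net_production(self) -> float:  # R_k, Eq. (1)
        return self.activity - self.death

    @property
    def washout(self) -> float:         # W_k, Eq. (5)
        return self.sym_diff - self.sym_div + self.death

    @property
    def amplification(self) -> float:   # gamma_k, Eq. (7)
        return (self.net_production + self.washout) / self.washout

    @property
    def disadvantage(self) -> float:    # pi_k, Eq. (8)
        return self.washout / self.activity

    @property
    def death_to_birth(self) -> float:  # epsilon_k, Eq. (9)
        return self.death / self.activity


# Illustrative (made-up) rates: a mixed progenitor with some cell death,
# and a fully committed progenitor that only differentiates symmetrically.
mixed = LevelRates(sym_diff=0.5, asym_diff=0.3, sym_div=0.2, death=0.1)
committed = LevelRates(sym_diff=1.0, asym_diff=0.0, sym_div=0.0)
for level in (mixed, committed):
    print(level.amplification, level.disadvantage)
```

For the fully committed progenitor the sketch gives γk = 2 and πk = 1, the extreme values discussed above. For the mixed progenitor, πk agrees with (1 − εk)/(γk − 1).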

Derényi et al.8 showed that for a tissue with N0 stem cells that produce a total of N terminally differentiated cells during the tissue’s expected lifetime the optimal differentiation hierarchy that minimizes divisional load has $${n}_{{{{{{{{\rm{opt}}}}}}}}}={\log }_{2}(N/{N}_{0})$$ levels and a uniform amplification factor of γk = 2. In such optimal self-sustaining differentiation hierarchies no more than $${\log }_{2}(N/{N}_{0})+2$$ cell divisions are sufficient along any cell lineage while, at the same time, the proliferative disadvantage of progenitors is also maximized (cf. Eq. (8)).

However, for suboptimal hierarchies, in particular, ones with n < nopt levels the amplification factor that minimizes divisional load8 is $${\gamma }_{k}={\gamma }^{* }(n)={(N/{N}_{0})}^{1/n} \; > \; 2$$, while the proliferative disadvantage of progenitors is still maximized by γk = 2. This implies the existence of a trade-off between minimizing divisional load and maximizing proliferative disadvantage. To understand the effect of this trade-off on tissue organization requires developing a quantitative theory of cancer incidence in hierarchical tissues.
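To make the trade-off concrete, the sketch below tabulates γ*(n) and the corresponding proliferative disadvantage 1/(γ*(n) − 1) for a few values of n. The ratio N/N0 = 2^20 and the function names are assumptions made purely for illustration.

```python
import math


def gamma_star(n: int, ratio: float) -> float:
    """Amplification factor minimizing divisional load for n levels: (N/N0)**(1/n)."""
    return ratio ** (1.0 / n)


def proliferative_disadvantage(gamma: float) -> float:
    """pi = 1/(gamma - 1); valid in the absence of cell death, for gamma >= 2."""
    return 1.0 / (gamma - 1.0)


ratio = 2.0 ** 20            # hypothetical N/N0, chosen so that n_opt is an integer
n_opt = math.log2(ratio)     # optimal number of levels for this ratio

for n in (5, 10, 20):
    g = gamma_star(n, ratio)
    print(f"n={n:2d}  gamma*={g:6.2f}  pi={proliferative_disadvantage(g):.3f}")
```

With fewer levels than n_opt, the load-minimizing amplification factor rises above 2 and the proliferative disadvantage at that γ falls below its maximum of 1.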

### Necessary conditions for cancer

In healthy tissues the proliferative disadvantage πk of all “wild type” cells except tissue-specific stem cells is strictly positive. As a result, descendants of progenitor cells are inexorably driven toward the terminally differentiated state and eventually lost from the tissue unless they accumulate mutations that lead to a negative proliferative disadvantage at some level of the hierarchy. If the proliferative disadvantage does become negative at any point in the hierarchy the mutant population will start to grow exponentially.

Mutants with a reduced, but still positive proliferative disadvantage can lead to hyperplasia, which can be life-threatening. As the conditions for a cell to be able to proliferate into a macroscopically large number or to proliferate exponentially are very similar, we do not distinguish the two, and only focus on uncontrolled growth, which occurs when the proliferative disadvantage becomes negative.

To model the accumulation of mutations we consider driver mutations that each reduce the strength of washing out by the same fraction of the total cell division rate Ak, i.e., the rate $${\hat{W}}_{k}(d,s)$$ at which mutant cells with d driver mutations of strength s are depleted is:

$${\hat{W}}_{k}(d,s)={W}_{k}-d\cdot s\cdot {A}_{k},$$
(10)

for all levels 0 < k < n. The terminally differentiated level (k = n), where only cell loss is assumed to occur, but not cell division, is considered unaffected by driver mutations.

Stem cells (k = 0), which must fully self-renew and, as a result, have a proliferative disadvantage of zero, i.e., π0 = 0, must be considered separately. Formally, even a single driver mutation, no matter how weak, will, if it is not lost, lead to an exponential, albeit potentially very slow expansion of the stem cell pool. The differentiated descendants of these mutant stem cells, however, will still be at a proliferative disadvantage and will be washed out from higher levels k > 0 of the hierarchy, unless a sufficient number of drivers are accumulated to overcome the proliferative disadvantage πk.

The critical number dcrit(k, s) of driver mutations of strength s necessary on level k to overcome the proliferative disadvantage πk is the smallest value of d for which $${\hat{W}}_{k}(d,s)\le 0$$. This gives

$${d}_{{{{{{{{\rm{crit}}}}}}}}}(k,s)=\left\lceil \frac{{W}_{k}/{A}_{k}}{s}\right\rceil =\left\lceil \frac{{\pi }_{k}}{s}\right\rceil =\left\lceil \frac{1}{s({\gamma }_{k}-1)}\right\rceil ,$$
(11)

where ⌈x⌉ denotes the ceiling function, i.e., the smallest integer that is equal to or larger than x. Note that the increase in proliferative disadvantage for decreasing amplification factors is reflected in a larger critical number of mutations.
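Eq. (11) is straightforward to evaluate. The sketch below is one possible implementation; the function name `d_crit` and the small relative tolerance (a guard against floating-point round-off when π/s is an exact integer) are assumptions of this example, not part of the model.

```python
import math


def d_crit(gamma: float, s: float) -> int:
    """Critical number of drivers, ceil(1 / (s * (gamma - 1))), cf. Eq. (11)."""
    x = 1.0 / (s * (gamma - 1.0))
    # Shrink x by a tiny relative amount so that, e.g., 10.000000000000002
    # is not rounded up to 11 (assumed tolerance, illustrative only).
    return math.ceil(x * (1.0 - 1e-12))


# Lower amplification factors require more drivers of the same strength.
for gamma in (2.0, 4.0, 11.0):
    print(gamma, d_crit(gamma, s=0.1))
```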

Equation (10), however, does not fully specify the effect of driver mutations, as reduction of $${\hat{W}}_{k}(d,s)$$ can potentially be achieved in two ways: either by increasing $${r}_{k}^{\circ \circ }$$ or by decreasing $${r}_{k}^{\uparrow \uparrow }$$. Of these two possibilities mutations increasing $${r}_{k}^{\circ \circ }$$ alone can always lead to uncontrolled growth if they are of sufficient strength and number, while mutations decreasing only $${r}_{k}^{\uparrow \uparrow }$$ require the condition that $${r}_{k}^{\circ \circ } \; > \; 0$$, i.e., γk > 2. Mutations affecting $${r}_{k}^{\circ \uparrow }$$ cannot lead to exponential growth, but do have an effect on the accumulation rate of mutations.

Here we restrict the discussion to mutations that increase $${r}_{k}^{\circ \circ }$$, i.e., the case where mutants with d driver mutations exhibit increased self-proliferation:

$${\hat{r}}_{k}^{\circ \circ }(d,s)={r}_{k}^{\circ \circ }+d\cdot s\cdot {A}_{k}.$$
(12)

This leads to modified total cell division rates:

$${\hat{A}}_{k}(d,s)={A}_{k}+d\cdot s\cdot {A}_{k}$$
(13)

and an increased amplification factor for mutant cells:

$${\hat{\gamma }}_{k}(d,s)=\frac{{\hat{{A}}_{k}(d,s)+{\hat{W}}_{k}(d,s)}}{{\hat{{W}}_{k}(d,s)}}=\frac{{A}_{k}+{W}_{k}}{{W}_{k}-d\cdot s\cdot {A}_{k}}=\frac{{\pi }_{k}+1}{{\pi }_{k}-d\cdot s},$$
(14)

which diverges as d approaches dcrit.
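The divergence in Eq. (14) can be seen with a short calculation. This is a sketch under the no-cell-death assumption made above; the function name `mutant_gamma` and the example values π = 1 and s = 0.3 are hypothetical.

```python
def mutant_gamma(pi: float, d: int, s: float) -> float:
    """Mutant amplification factor (pi + 1) / (pi - d*s), cf. Eq. (14).

    Only defined for subcritical mutants, i.e., while d*s < pi.
    """
    margin = pi - d * s
    if margin <= 0.0:
        raise ValueError("mutant is critical: washing out no longer exceeds self-renewal")
    return (pi + 1.0) / margin


# Wild-type gamma = 2 (pi = 1) and drivers of strength s = 0.3 (illustrative values).
for d in range(4):
    print(d, mutant_gamma(1.0, d, 0.3))
```

Each additional driver shrinks the denominator, so the mutant amplification factor grows rapidly as d·s approaches π.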

### The probability of accumulating the critical number of driver mutations

Our aim is to calculate the probability of accumulating the critical number dcrit(s) of driver mutations during the lifetime of a tissue hierarchy $${{{{{{{\mathcal{H}}}}}}}}$$. To do so we make the following assumptions: we assume that driver mutations of strength s occur with probability μ in each descendant cell following a division event and we consider hierarchy $${{{{{{{\mathcal{H}}}}}}}}$$ described by the number of levels n, the amplification factors γk, the homeostatic cell numbers Nk, the relative contributions of the two differentiation rates $$2{r}_{k}^{\uparrow \uparrow }/{r}_{k}^{\circ \uparrow }$$, and the total rate at which stem cells produce differentiated cells $${\delta }_{0}=(2{r}_{0}^{\uparrow \uparrow }+{r}_{0}^{\circ \uparrow })\cdot {N}_{0}$$ during an expected lifetime tlife. Note that the driver mutation rate per cell division μ is the rate at which mutations that lead to a decreased proliferative disadvantage occur and corresponds to the product of the number of driver genes, the average mutational target size per gene, and the base pair mutation rate per cell division.

To calculate the probability $${P}_{{{{{{{{\rm{cancer}}}}}}}}}(s,\mu ,{{{{{{{\mathcal{H}}}}}}}},{t}_{{{{{{{{\rm{life}}}}}}}}})$$ of accumulating the critical number of mutations, we introduce an efficient mathematical tool capable of handling lineage trees of hierarchically organized tissues even in the presence of mutations that alter the microscopic properties (e.g., the rates of cellular events) of affected cells and, thereby, alter the structures of the corresponding sublineages. (To make the notations more concise, the μ and $${{{{{{{\mathcal{H}}}}}}}}$$ dependences of the mathematical functions defined below are not indicated).

The key is to determine the probability Qk(d, s, t) that a single cell at a non-stem level 0 < k that has already acquired d ≤ dcrit(s) driver mutations at time t gives rise to a sublineage along which the remaining dcrit(s) − d driver mutations needed to reach criticality are eventually accumulated. By definition Qk(dcrit(s), s, t) = 1 for every level 0 < k < n, except for the terminal one k = n, where Qn(d, s, t) = 0 for every d. For any 0 < k < n and d < dcrit(s) in the (k, d) parameter space the probabilities Qk(d, s, t) can be derived recursively from these boundary conditions (defined at the k = n and d = dcrit boundaries).

Any cell that appears on level k is washed out together with all of its descendants from this level at a rate $${\hat{W}}_{k}(d,s)$$, which means that after the first appearance of the cell at time t its expected number (including that of its descendants on level k) decays exponentially with time $$t^{\prime}$$ as $${{{{{{{{\rm{e}}}}}}}}}^{-{\hat{W}}_{k}(d,s)\cdot (t^{\prime} -t)}$$. The expected number of times some event with rate r involving this cell or its descendants on level k occurs is then $$r\cdot {\hat{\tau }}_{k}(d,s,t)$$, where

$${\hat{\tau }}_{k}(d,s,t)=\int\nolimits_{t}^{{t}_{{{{{{{{\rm{life}}}}}}}}}}{{{{{{{{\rm{e}}}}}}}}}^{-{\hat{W}}_{k}(d,s)\cdot (t^{\prime} -t)}{{{{{{{\rm{d}}}}}}}}t^{\prime} =\frac{1-{{{{{{{{\rm{e}}}}}}}}}^{-{\hat{W}}_{k}(d,s)\cdot ({t}_{{{{{{{{\rm{life}}}}}}}}}-t)}}{{\hat{{W}}_{k}(d,s)}}=\frac{{\hat{{P}}_{k}(d,s,t)}}{{\hat{{W}}_{k}(d,s)}}$$
(15)

and the survival probability $${\hat{P}}_{k}(d,s,t)$$, as defined by this equation, is the probability of the cell (and its descendants) not being washed out of level k in the time interval between t and tlife.
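Eq. (15) has a closed form that is easy to evaluate. The sketch below assumes a strictly positive washing-out rate; the function names are hypothetical, and `math.expm1` is used only to keep the result accurate when the exponent is small.

```python
import math


def survival_probability(w_hat: float, t: float, t_life: float) -> float:
    """1 - exp(-w_hat * (t_life - t)), the numerator in Eq. (15)."""
    return -math.expm1(-w_hat * (t_life - t))


def expected_residence_time(w_hat: float, t: float, t_life: float) -> float:
    """Integral of exp(-w_hat * (t' - t)) from t to t_life, Eq. (15); requires w_hat > 0."""
    return survival_probability(w_hat, t, t_life) / w_hat


# A cell appearing long before the end of the tissue's lifetime stays, together
# with its same-level descendants, for about 1/w_hat on average.
print(expected_residence_time(0.5, 0.0, 1000.0))
# A cell appearing shortly before t_life has less time available.
print(expected_residence_time(0.5, 999.0, 1000.0))
```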

In particular, the expected number of birth events giving rise to differentiated descendants (i.e., cells on one level higher) at a rate $${r}_{k}^{\uparrow }=2{r}_{k}^{\uparrow \uparrow }+{r}_{k}^{\circ \uparrow }$$ is

$${m}_{k}^{\uparrow }(d,s,t) =\; {r}_{k}^{\uparrow }\cdot {\hat{\tau }}_{k}(d,s,t)=\frac{{\hat{{A}}_{k}(d,s)+{\hat{W}}_{k}(d,s)}}{{\hat{{W}}_{k}(d,s)}}\\ \cdot {\hat{P}}_{k}(d,s,t) ={\hat{\gamma }}_{k}(d,s)\cdot {\hat{P}}_{k}(d,s,t),$$
(16)

while the expected number of birth events giving rise to undifferentiated descendants (i.e., cells on the same level) at a rate $${\hat{r}}_{k}^{\circ }=2{\hat{r}}_{k}^{\circ \circ }(d,s)+{r}_{k}^{\circ \uparrow }$$ is

$${m}_{k}^{\circ }(d,s,t) =\; {\hat{r}}_{k}^{\circ }\cdot {\hat{\tau }}_{k}(d,s,t)=\frac{{\hat{{A}}_{k}(d,s)-{\hat{W}}_{k}(d,s)}}{{\hat{{W}}_{k}(d,s)}}\\ \cdot {\hat{P}}_{k}(d,s,t) =\left[{\hat{\gamma }}_{k}(d,s)-2\right]\cdot {\hat{P}}_{k}(d,s,t).$$
(17)

For a cell on level 0 < k < n with d < dcrit(s) mutations the probability of giving rise to a lineage that eventually accumulates dcrit(s) mutations can be approximated recursively (starting from the k = n and d = dcrit boundaries) as

$${Q}_{k}(d,s,t)= \; {m}_{k}^{\uparrow }(d,s,t)\cdot [(1-\mu )\cdot {Q}_{k+1}(d,s,t)+\mu \\ \cdot {Q}_{k+1}(d+1,s,t)]+{m}_{k}^{\circ }(d,s,t)\cdot \mu \cdot {Q}_{k}(d+1,s,t),$$
(18)

where the terms on the right-hand side correspond to three possibilities: (i) a fraction of 1 − μ of the $${m}_{k}^{\uparrow }(d,s,t)$$ descendants of the cell on level k + 1 acquire no driver mutation, and lead to sublineages with probability Qk+1(d, s, t), (ii) a fraction of μ of these descendants acquire an additional driver mutation, and lead to sublineages with probability Qk+1(d + 1, s, t), and (iii) a fraction of μ of the undifferentiated descendants on level k manage to acquire a driver mutation before being washed out, and lead to sublineages with probability Qk(d + 1, s, t). The recursion gives a small (typically negligible) overestimation of the Qk(d, s, t) probabilities in three respects. First, it replaces probabilities with expected values, thus, it does not discount the possibility of the simultaneous appearance of critical mutants along the $${m}_{k}^{\uparrow }(d,s,t)$$ and $${m}_{k}^{\circ }(d,s,t)$$ parallel sublineages. This has a negligible effect as long as Qk(d, s, t) ≪ 1, which is the typical case (except when mutants with dcrit(s) − 1 mutations are almost critical). Second, the survival probability $${\hat{P}}_{k}(d,s,t)$$ accounts for the limited time available for a cell (including its descendants) on level k, if the cell appears close to the end of the lifetime of the tissue (measured in units of $$1/{\hat{W}}_{k}(d,s)$$), however, the even shorter times available for the sublineages initiated by the $${m}_{k}^{\uparrow }(d,s,t)$$ and $${m}_{k}^{\circ }(d,s,t)$$ descendants are not taken into account. The effect of the $${\hat{P}}_{k}(d,s,t)$$ correction factor is typically very small and this second order correction is even more negligible (again with the exception of the almost critical subcritical mutants). Third, when the critical mutant appears, it may stochastically go extinct before it can establish an exponentially growing population. This, however, is also negligible unless the last driver mutation arrives close to the terminal level k = n and is only slightly critical (i.e., only has a marginally negative $${\hat{W}}_{k}({d}_{\mathrm{crit}},s)/{\hat{A}}_{k}({d}_{\mathrm{crit}},s)$$).
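The recursion in Eq. (18) can be sketched in a few lines. This illustration makes simplifying assumptions that are not part of the general formulation: the same amplification factor γ on every progenitor level, no cell death, and a survival probability set to 1 (the long-lifetime limit), so that the table no longer depends on t. The function name `q_table` and the parameter values are hypothetical.

```python
import math


def q_table(n: int, gamma: float, s: float, mu: float):
    """Return Q[k][d] for levels 1..n and d = 0..d_crit (row 0 is unused).

    Simplified version of Eq. (18): uniform gamma, survival probability = 1.
    Entries are expected values, so they only approximate probabilities
    while they remain much smaller than 1.
    """
    pi = 1.0 / (gamma - 1.0)
    d_crit = math.ceil(pi / s * (1.0 - 1e-12))       # Eq. (11), with a round-off guard
    Q = [[0.0] * (d_crit + 1) for _ in range(n + 1)]  # terminal level n stays 0
    for k in range(1, n):
        Q[k][d_crit] = 1.0                            # boundary at d = d_crit
    for k in range(n - 1, 0, -1):                     # from the top progenitor level down
        for d in range(d_crit - 1, -1, -1):           # from nearly critical mutants down
            g_hat = (pi + 1.0) / (pi - d * s)         # Eq. (14)
            m_up = g_hat                              # Eq. (16) with survival = 1
            m_same = g_hat - 2.0                      # Eq. (17) with survival = 1
            Q[k][d] = (m_up * ((1.0 - mu) * Q[k + 1][d] + mu * Q[k + 1][d + 1])
                       + m_same * mu * Q[k][d + 1])
    return Q


Q = q_table(n=3, gamma=2.0, s=1.0, mu=0.01)
print(Q[1][0])   # lowest progenitor level, no drivers yet
```

In this small example γ = 2 leaves no self-renewal on the progenitor levels, so the only route to criticality from level 1 is a driver acquired when differentiating to level 2.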

To complete the calculation of $${P}_{\mathrm{cancer}}(s,\mu ,\mathcal{H},{t}_{\mathrm{life}})$$ we have to consider the accumulation of mutations on the stem cell lineage (i.e., the bottommost line leading to each yellow stem cell in Fig. 1c). To do so, here, we neglect the expansion of the stem cell pool for mutants. This is motivated by the qualitatively different nature of stem cells, which, in contrast to progenitors, lack a proliferative disadvantage. The lack of a proliferative disadvantage implies that if stem cells were affected by driver mutations in the same manner as progenitors, even a single driver mutation would lead to exponential growth of mutant stem cells. We discuss this assumption in detail below.

The time evolution of the expected number of stem cells N0(d, s, t) that have acquired d drivers but have not yet given rise to a progenitor sublineage along which the critical number of drivers will have accumulated is given by the following system of ordinary differential equations:

$$\frac{\partial }{\partial t}{N}_{0}(0,s,t)=-{r}_{0}^{\circ }\cdot \mu \cdot {N}_{0}(0,s,t)-{r}_{0}^{\uparrow }\cdot {q}_{0}(0,s,t)\cdot {N}_{0}(0,s,t)\quad {{{{{{{\rm{for}}}}}}}}\,d=0,$$
(19)
$$\frac{\partial }{\partial t}{N}_{0}(d,s,t)= \; {r}_{0}^{\circ }\cdot \mu \cdot \left[{N}_{0}(d-1,s,t)-{N}_{0}(d,s,t)\right]\\ -{r}_{0}^{\uparrow }\cdot {q}_{0}(d,s,t)\cdot {N}_{0}(d,s,t)\quad {{{{{{{\rm{for}}}}}}}}\,d \; > \; 0,$$
(20)

with initial conditions N0(0, s, 0) = N0 and N0(d, s, 0) = 0 for d > 0, where $${r}_{0}^{\circ }=2{r}_{0}^{\circ \circ }+{r}_{0}^{\circ \uparrow }$$, $${r}_{0}^{\uparrow }=2{r}_{0}^{\uparrow \uparrow }+{r}_{0}^{\circ \uparrow }$$, and

$${q}_{0}(d,s,t)=(1-\mu )\cdot {Q}_{1}(d,s,t)+\mu \cdot {Q}_{1}(d+1,s,t)\quad \quad {{{{{{{\rm{for}}}}}}}}\,d \; < \; {d}_{{{{{{{{\rm{crit}}}}}}}}}(s),$$
(21)
$${q}_{0}(d,s,t)=1\quad{{{{{{{\rm{for}}}}}}}}\;d\;\ge\; {d}_{{{{{{{{\rm{crit}}}}}}}}}(s)$$
(22)

is the probability that a progenitor sublineage descending from a single stem cell accumulates the critical number of mutations.

Using the above, the lifetime probability of accumulating dcrit(s) mutations can be expressed as:

$${P}_{\mathrm{cancer}}(s,\mu ,\mathcal{H},{t}_{\mathrm{life}})=\sum_{d=0}^{\infty }\int_{0}^{{t}_{\mathrm{life}}}{r}_{0}^{\uparrow }\cdot {q}_{0}(d,s,t)\cdot {N}_{0}(d,s,t)\cdot \mathrm{d}t+\sum_{k=1}^{n-1}{N}_{k}\cdot {Q}_{k}(0,s,0),$$
(23)

where the first term describes the probability of accumulating the critical number of mutations from the time t = 0 the tissue has fully developed until the end of its expected lifetime tlife and the second term corresponds to the contribution of cells created during tissue development.
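The stem-cell contribution, i.e., Eqs. (19)–(20) together with the first term of Eq. (23), can be integrated with any standard ODE scheme. The sketch below uses plain forward-Euler steps and assumes, for simplicity, that the q0 values are time-independent inputs; the last entry of `q0` stands for all stem cells with d ≥ dcrit lumped into one class (mutation terms between these classes cancel in the sum). The function name and all numerical values are hypothetical.

```python
import math


def stem_cell_term(N0, r_self, r_up, mu, q0, t_life, steps=10_000):
    """Forward-Euler estimate of the first term of Eq. (23).

    q0[d] for d < d_crit follows Eq. (21); q0[-1] = 1 represents d >= d_crit.
    r_self and r_up are the stem-cell rates of producing undifferentiated
    and differentiated descendants, respectively.
    """
    D = len(q0) - 1
    cells = [float(N0)] + [0.0] * D          # initial conditions: all stem cells at d = 0
    dt = t_life / steps
    risk = 0.0
    for _ in range(steps):
        hazard = [r_up * q0[d] * cells[d] for d in range(D + 1)]
        risk += sum(hazard) * dt             # integrand of Eq. (23), summed over d
        updated = list(cells)
        for d in range(D + 1):
            gain = r_self * mu * cells[d - 1] if d > 0 else 0.0
            loss = r_self * mu * cells[d] if d < D else 0.0   # lumped class has no outflow
            updated[d] += (gain - loss - hazard[d]) * dt      # Eqs. (19)-(20)
        cells = updated
    return risk


# Illustrative values only: d_crit = 1 and a small q0 for unmutated stem cells.
without_mutation = stem_cell_term(100, 0.5, 1.0, 0.0, [1e-4, 1.0], 10.0)
with_mutation = stem_cell_term(100, 0.5, 1.0, 1e-3, [1e-4, 1.0], 10.0)
print(without_mutation, with_mutation)
```

With μ = 0 the result can be compared against the closed form N0·(1 − exp(−r_up·q0·t_life)), which gives a quick sanity check of the integrator.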

Equation (23) can be solved numerically using standard methods and, as shown in Fig. S1, it is in very good agreement with explicit, but extremely time consuming, population dynamics simulations (see Supplementary Note 1). We provide an open-source implementation of both the numerical solution and the explicit simulation used to validate it (see Code availability).

In the following we assume that all amplification factors are equal, i.e., γk = γ and $${\hat{\gamma }}_{k}(d,s)=\hat{\gamma }(d,s)$$, which also implies that πk = π and dcrit(s) = ⌈π/s⌉ = ⌈1/(s(γ − 1))⌉ for all progenitor levels 0 < k < n. The assumption of uniform amplification factors, which corresponds to differentiation rates increasing exponentially along the hierarchy13,14, is motivated by both mathematical convenience and the optimality of identical γk values in minimizing the lifetime divisional load8. Model parameters are summarized in Table S1.

The above result for the risk of cancer during the lifetime of a tissue hierarchy $$\mathcal{H}$$ described by the number of levels n, the uniform amplification factor γ, homeostatic cell numbers Nk for k > 0 and the total rate at which stem cells produce differentiated cells $${\delta }_{0}={r}_{0}^{\uparrow }\cdot {N}_{0}$$ has some clear implications: the accumulation of cancer risk during a tissue’s lifetime, i.e., the first term on the right-hand side of Eq. (23), is proportional to N0, increases with increasing tissue lifetime, and decreases with increasing n, because δ0 = γ1/n and the q0(d, s, t) terms describing the probability of accumulating the critical number of mutations along a progenitor lineage also decrease as the divisional load decreases with increasing n.

The dependence on the amplification factor γ, however, is more complicated. As illustrated in Fig. 2b, c, in contrast to the probability of accumulating a fixed number of mutations, the minimum of the probability of cancer as a function of the amplification factor γ is not, in general, close to the value $${\gamma }^{* }(n)={(N/{N}_{0})}^{1/n}$$ that minimizes the lifetime divisional load of the tissue8. Instead, the amplification factor $${\gamma }_{\mathrm{cancer}}^{* }$$ that minimizes the probability of cancer is determined by a trade-off between the proliferative disadvantage along the hierarchy, reflected in increasing dcrit(s) = ⌈1/(s(γ − 1))⌉ for decreasing γ as shown in Fig. 2a, and mutation accumulation, which is minimized near γ*(n), as illustrated by the colored bands in Fig. 2b, c.

The question arises if the trade-off is the result of the stem cells being unaffected by driver mutations. The mathematical description developed here can be readily extended to situations where the drivers do affect the stem cell rates. The results (not detailed here) show that the amplification factor that minimizes the cancer incidence is shifted to higher values (to relieve the cell divisional burden on the stem cells), but a trade-off remains.

To explore the relevance of such a trade-off consider two human tissues, the hierarchical organization of which is best understood: the hematopoietic system, where approximately N0 = 10⁴ stem cells24 produce about N = 10¹⁵ terminally differentiated cells, and the colon, where approximately N0 = 10⁸ stem cells produce N = 10¹⁴ terminally differentiated cells during a person’s lifetime.

For a fully optimal hierarchy with $${n}_{\mathrm{opt}}={\log }_{2}(N/{N}_{0})$$ levels, where γ*(n) = 2, the minimum of the lifetime divisional load coincides with maximal proliferative disadvantage along the hierarchy. Based on the above order of magnitude estimates this would require nopt ≈ 36 hierarchical levels in blood, while the colon would require nopt ≈ 20. In addition, stem cells at the bottom of both hierarchies would only divide twice during an entire lifetime8.
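The order-of-magnitude estimates of nopt follow directly from the logarithm. The snippet below is a small sketch using the stem cell and terminally differentiated cell numbers quoted above for the two tissues (10⁴ and 10¹⁵ for blood, 10⁸ and 10¹⁴ for the colon); the dictionary and variable names are hypothetical.

```python
import math

# (N0, N): stem cells and lifetime output of terminally differentiated cells.
tissues = {
    "hematopoietic": (1e4, 1e15),
    "colon": (1e8, 1e14),
}

# n_opt = log2(N / N0) for each tissue.
n_opt = {name: math.log2(N / N0) for name, (N0, N) in tissues.items()}

for name, value in n_opt.items():
    print(f"{name}: n_opt = {value:.1f}")
```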

Detailed modeling of human hematopoiesis has provided estimates of between 17 and 31 hierarchical levels25, and long-term hematopoietic stem cells are thought to divide at most a few times a year (estimates of every 25–50 weeks26 and every 2–20 months27 have been proposed). The colon is organized into millions of crypts, each containing only a few stem cells, and selection for driver mutations occurs within single crypts. The number of hierarchical levels in colonic crypts is less clear, but stem cells are known to divide approximately every 4 days28,29.

These data indicate that neither tissue possesses a fully optimal hierarchy, despite evidence that large and rapidly dividing human tissues have evolved increased cancer resistance6,7,8. This observation is consistent with the existence of a “drift-barrier”, i.e., that selection can only optimize tissues to the extent that the selective advantage achieved is sufficiently large to overcome genetic drift.

### The organization of hierarchical tissues that have evolved to limit somatic evolution

To model the existence of a drift-barrier we consider the least complex tissue, i.e., the one with the smallest number of hierarchical levels, that can keep the probability of cancer below a threshold value. We consider the number of stem cells N0 and the number of terminally differentiated cells produced N as fixed by external constraints and vary the rate of driver mutations per cell division μ and their strength s.

We determined the minimum number of levels ndrift and the corresponding uniform amplification factor γdrift necessary to keep the lifetime risk of cancer below the threshold value of 2% for cancers of the hematopoietic system and about 4% for colorectal cancer30 (see Methods for details). Varying s and μ in Fig. 3a, b, we show results for the number of levels ndrift and the amplification factor γdrift, together with the number of drivers (which is determined by s and γdrift, cf. Eq. (11)) and the stem cell division time (determined by ndrift, γdrift, and N0).

Estimates for the rate of driver mutations per cell division31,32,33 vary over μ = 10⁻⁶–10⁻⁴, reflecting potentially tissue-specific uncertainty in both the number of mutational targets and the somatic mutation rate per cell division. Recently, Watson et al.24 have estimated that there are ≥ 2500 variants that confer moderate to high selective advantages in hematopoietic stem cells, which combined with an approximate mutation rate of 10⁻⁹ per base pair per cell division corresponds to μ ≈ 2.5 × 10⁻⁶. For the average selective advantage of driver mutations, estimates range from s ≈ 10⁻³ (ref. 31) to s > 10⁻¹ (refs. 32, 34). For the colon, empirical measurements35 and theoretical arguments suggest that s < 10⁻¹ is unlikely, but for blood the entire range of values is plausible, with estimates by Watson et al.24 indicating that 40% of variants confer moderate to high fitness effects of s > 0.04.
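Written out, the estimate of μ for hematopoietic stem cells is the product of the number of driver variants and the per-base-pair mutation rate. This treats each variant as a single-site mutational target, a simplification assumed here for illustration:

$$\mu \approx 2500 \times 10^{-9} = 2.5 \times 10^{-6}$$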

The unshaded areas bounded by the dashed lines in Fig. 3a, b show the ranges of μ and s values consistent with the above estimates. For the hematopoietic system we find that the number of hierarchical levels ranges between n = 15 and 30, and the amplification factor between γ = 2 and 6, broadly consistent with estimates25 based on available in vivo data. The number of drivers falls between d = 4 and 6, while stem cells divide a few times per year. For the colon we find a significantly lower number of levels, between n = 5 and 15, and an amplification factor of γ = 2, corresponding to maximal washing out, again consistent with our understanding of the organization of the colorectal epithelium36,37,38. As can be seen when comparing the rightmost panels in Fig. 3a, b, the maximization of washing out in the case of the colon, however, comes at the cost of relegating the burden of cell proliferation to the stem cells, which divide at least an order of magnitude faster than in blood for the same values of μ and s.

## Discussion

Animals have been evolving mechanisms to suppress cancer ever since the origin of multicellularity. The existence of species-level adaptations, as exemplified by the near irrelevance of mammalian body size and lifespan to lifelong cancer risk, has been clear for several decades4,5. The realization that rapidly renewing tissues of long-lived animals, such as humans, must also have evolved tissue-specific protective mechanisms likewise dates back several decades9,10. Evidence for tissue-specific adaptations is, however, more recent6,7,8.

In the above we calculate the lifetime risk of cancer in a hierarchically differentiating self-renewing tissue, taking into account the effects of driver mutations that reduce the proliferative disadvantage of mutants. Using this result we determine the organizational properties of hierarchical tissues that have evolved to limit somatic evolution by keeping the lifetime risk of cancer below a maximum acceptable value. We find that the optimal tissue organization is determined by a trade-off between two competing mechanisms: reduced mutation accumulation8 and increased “washing out” through the progression of increasingly differentiated cell types15.

We show that such a trade-off exists as long as differentiation hierarchies are not fully optimal in reducing divisional load. This is likely the case in most tissues of most species, as fully optimal tissues require complex hierarchies with a large number of levels incompatible with current empirical evidence6,8. Such complex hierarchies are also unlikely to have evolved according to the “drift-barrier” hypothesis22,23,39 which, in contrast to the view that natural selection fine-tunes every aspect of organisms, predicts that genetic drift, resulting from finite population sizes, can limit the power of selection and constrain the degree to which phenotypes can be optimized by selection.

The trade-off occurs in the tempo of increase of the cell production rate along the differentiation hierarchy, which we parametrize by the amplification factor γ. The amplification factor corresponds to the ratio of the rates at which adjacent levels produce differentiated cells. Tissues with a smaller amplification factor experience increased mutational burden but, at the same time, exhibit increased washing out, resulting in a trade-off between the two.

We demonstrate that based on the lifetime number of the terminally differentiated cells produced per stem cell, our theoretical description (Fig. 3a, b) provides realistic predictions for the organization of the human hematopoietic system and the epithelial tissue of the colon. In particular, the hematopoietic differentiation hierarchy is predicted to have a relatively larger number of levels with a relatively high amplification factor ensuring low mutational load from cell divisions, in agreement with previous results25. The colorectal epithelium, the paradigmatic model of differentiation induced proliferative disadvantage15,18, in contrast, has a near minimal amplification factor and few differentiation levels ensuring strong washing out and requiring a fast stem cell turnover rate in agreement with experimental data28,29.

In summary, trade-off theory does not lead to a different optimum, but rather argues that, given the relevant limits of natural selection set by genetic drift, tissues have not evolved to be optimal. The quantitative model developed here provides a general analytical tool for predicting the organization (including the cell differentiation rates and the number of hierarchical levels) of tissues of various sizes (N0 and N) based on the rate (μ) and strength (s) of driver mutations. Based on these results we demonstrate that under a broad range of parameters characteristic of real tissues, hierarchical structure optimized to the limits of natural selection set by genetic drift is determined by a trade-off between mutation accumulation and the strength of washing out. An immediate consequence of our predictions is the explanation of the surprisingly fast turnover rate of the stem cells of the colonic crypts.

It is, however, important to emphasize that our results only consider the balance between mutation accumulation and washing out resulting from cell differentiation, while keeping other variables fixed. In particular, N/N0, the number of terminally differentiated cells produced per stem cell during the lifetime of the tissue, is a constraint of fundamental importance (cf. Figs. 2 and 3). For the two examples considered above, blood and colon, the number of terminally differentiated cells produced during the lifetime of the two tissues is similar (N ≈ 10¹⁵ and 10¹⁴, respectively), while the number of cells produced per stem cell differs by several orders of magnitude (N/N0 = 10¹¹ and 10⁶). In fact, the two tissues are markedly different in their physical organization, and this is reflected in the differences in the number of stem cells in each. Blood is replenished in a centralized manner by the bone marrow, while the intestinal epithelium of the colon is renewed in a highly localized manner by a large number of stem cells that reside at the base of a large number of distinct crypts. Understanding the evolutionary and physiological origins of differences in the hierarchical organization of different tissues will require a theory that considers all relevant evolutionary forces and physiological constraints together.

## Methods

### Calculating the minimum number of levels and the corresponding amplification factor

For specific values of N, N0, μ, and s, we determine the minimum number of levels n and the corresponding uniform amplification factor γ as follows. Starting with n = 1, we determine the minimum of the lifetime cancer risk (defined by Eq. (23)) as a function of γ. If this minimum is above the threshold value of 2% for cancers of the hematopoietic system and about 4% for colorectal cancer30, we increase n by one; otherwise, we stop the procedure. Supplementary Fig. S2 (see Supplementary Note 2) shows the robustness of our results to changing the value of the threshold between 0.1% and 10%, while Supplementary Fig. S3 (see Supplementary Note 3) explores the effect of driver mutations that do not change the proliferative disadvantage until the critical number is accumulated.
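The procedure above can be sketched as a simple incremental search. The lifetime cancer risk of Eq. (23) is not reproduced here, so the sketch takes it as a caller-supplied function `risk(n, gamma)`. The grid over γ, the cap `n_max`, and the placeholder `toy_risk` are all hypothetical, and `toy_risk` has no biological meaning: it only exercises the search logic.

```python
def min_levels(risk, threshold, gammas, n_max=100):
    """Smallest n whose minimum risk over the gamma grid is at or below
    the threshold.

    risk      : callable (n, gamma) -> lifetime cancer risk; stands in for
                Eq. (23), which is NOT implemented here
    threshold : maximum acceptable lifetime risk (e.g. 0.02 or 0.04)
    gammas    : candidate uniform amplification factors (hypothetical grid)
    n_max     : safety cap on the number of levels (hypothetical)

    Returns (n, gamma, risk) or None if no n up to n_max satisfies the bound.
    """
    for n in range(1, n_max + 1):
        # Minimize the risk over gamma for this number of levels.
        best_gamma = min(gammas, key=lambda g: risk(n, g))
        best_risk = risk(n, best_gamma)
        if best_risk <= threshold:
            return n, best_gamma, best_risk
    return None


def toy_risk(n, gamma):
    """Placeholder risk surface, NOT Eq. (23); for demonstration only."""
    return 0.5 / n + 0.01 * (gamma - 3.0) ** 2


grid = [2.0, 2.5, 3.0, 3.5, 4.0]
print(min_levels(toy_risk, threshold=0.04, gammas=grid))
```

A grid search over γ is used only for simplicity. Any one-dimensional minimizer could replace it, provided the risk is evaluated for γ values admissible in the model.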

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.