Cell population heterogeneity driven by stochastic partition and growth optimality

A fundamental question in biology is how cell populations evolve into different subtypes based on homogeneous processes at the single cell level. Here we show that population bimodality can emerge even when biological processes are homogenous at the cell level and the environment is kept constant. Our model is based on the stochastic partitioning of a cell component with an optimal copy number. We show that the existence of unimodal or bimodal distributions depends on the variance of partition errors and the growth rate tolerance around the optimal copy number. In particular, our theory provides a consistent explanation for the maintenance of aneuploid states in a population. The proposed model can also be relevant for other cell components such as mitochondria and plasmids, whose abundances affect the growth rate and are subject to stochastic partition at cell division.

A fundamental question in biology is how cell populations evolve into different subtypes based on homogeneous processes at the single cell level. Here we show that population bimodality can emerge even when biological processes are homogenous at the cell level and the environment is kept constant. our model is based on the stochastic partitioning of a cell component with an optimal copy number. We show that the existence of unimodal or bimodal distributions depends on the variance of partition errors and the growth rate tolerance around the optimal copy number. In particular, our theory provides a consistent explanation for the maintenance of aneuploid states in a population. the proposed model can also be relevant for other cell components such as mitochondria and plasmids, whose abundances affect the growth rate and are subject to stochastic partition at cell division.
It is generally believed that, in simple homogeneous environments, only one competitor class can be sustained on a single resource 1 . The intuition is that in the long run, a fittest competitor class outgrows the rest. However, this contradicts the existence of structured and heterogeneous communities, and in particular their emergence from initially clonal populations 2 . One explanation to this contradiction rests in the idea that evolution in simple environments is a sequence of selective sweeps where dominant clones are regularly replaced by fitter descendants 3 . In this case diversity is a temporary state in the transition from a dominant clone to another. The validity of this formulation is restricted to a regime where replication errors giving rise to fit descendants are rare. In the opposite extreme, the quasi-species model applies to very large populations with frequent mutations 4 , such as viruses. In this case, a population is predicted to form a cloud around a fitness peak (the so-called "quasi-species" 4 ), unless the mutability exceeds an "error threshold", in which case individuals drift randomly on the fitness landscape 5 . Since the distribution is uni-modal or uniform, in this case there are no clearly defined sub-types within the population. Furthermore, the high mutation rates of viruses are not common in other types of cell, and this restricts the applicability of the model. On the other hand, mathematical descriptions of chemostat experiments predict that diversity cannot be maintained unless cells engage in cross-feeding 6 , are subject to product inhibition 7 , rate-yield trade-offs 8,9 , or there are periodic variations in the dilution rate of the chemostat 10 .
Despite these difficulties to explain the emergence of multiple cell types, experiments supporting the phenomenon are ubiquitous. Clonal bacterial populations can diverge into multiple phenotypic clusters in the chemostat 11 . Tissues of multicellular organisms are composed of a hierarchy of genetically identical cells with different phenotypes that maintain a stable coexistence 12,13 . Resistance to cancer therapy is related in part to the heterogeneity of cancer cells before treatment [14][15][16] . Although it is not obvious how to disentangle extrinsic and intrinsic influences in complex examples like these, experiments where cells ex vivo reproduce aspects of their original population structure 17 indicate that there is an intrinsic propensity towards the maintenance of diversity even in absence of external cues.
The dominant explanation is that regulatory feedback loops amplify gene expression noise, eventually giving rise to distinct cells [18][19][20][21][22] . For example, some transcription factors regulate their own expression. Such a closed circuit may admit more than one stable steady state for appropriate values of the kinetic parameters, which can be occupied by different cells in the population due to the stochastic nature of gene expression, resulting in a bimodal distribution of phenotypes. These mechanisms are well studied in the literature, with analytical solutions available (2019) 9:9406 | https://doi.org/10.1038/s41598-019-45882-w www.nature.com/scientificreports www.nature.com/scientificreports/ for their simplest variants 20 and abundant experimental evidence supporting the theoretical results [23][24][25] . The main ingredients common to these approaches are: a properly tuned gene regulatory circuit, and gene expression noise.
However, in many contexts partition errors during cell division are more relevant than gene expression noise [26][27][28][29][30][31][32] . Up to now partition noise, which is in average symmetric, has not been studied as a potential source of bimodality in a cell population. The novelty of this contribution is to propose such a model and justify its relevance in actual biological scenarios. In particular, we study the division of a cell that carries a certain number of components that influence its growth rate. Stochastic models of partitioning errors have until now assumed that the growth rate of cells is homogeneous [26][27][28][29][30][31][32] . As we show here, relaxing this assumption is key to obtain a bimodal distribution.

Model Definition
For simplicity we focus our attention on a single component. Here the component may represent an organelle (e.g., mitochondria) or a macromolecule (e.g., chromosome or plasmid), that will be referred as the component or the particle. We model the particle copy number dynamics across the population as a type-dependent branching process where individual cells replicate at a rate μ n , where n is the number of particles in the cell at birth. To keep things simple, we consider that during their life cycle cells that were born with k particles will duplicate their content resulting 2k particles at the time of division. This is a plausible hypothesis for most cell components, which replicate autonomously in coordination with the cell cycle 33 . Notice that this condition could be relaxed and the qualitative picture stays the same. Therefore, a cell born with k copies replicates giving birth to daughter cells with n copies with probability Ω nk . We denote by x n (t) the expected value at time t of cells born with particle copy number n. For large populations x n (t) satisfies the dynamical equation I is the identity matrix and U is a diagonal matrix with entries U nn = μ n . In the long time limit x(t) ≈ ce λt , where λ is the largest eigenvalue of W and c its corresponding eigenvector (see Appendix) When λ > 0 the population follows balanced growth and λ represents the average population growth rate 34 . The forms of μ n and Ω nk will depend on specific biological mechanisms controlling growth optimality and partitioning at cell division. We focus our attention on a scenario where molecular mechanisms enforce the maintenance of an optimal growth state with n = m particles. The enforcement acts at two levels. First, cells will tend to arrest or slow down growth when n deviates from m. For the sake of illustration and mathematical simplicity we will model the growth rate by the Gaussian function form where μ m is the maximum growth rate and κ quantifies the range of tolerated deviations from the optimal copy number m. Second, cells will tend to enforce even partition at division but, due to stochastic errors, random partition may occur. We introduce a parameter  between 0 and 1 to quantify the rate of partition errors. Tightly regulated error-free divisions resulting in even distribution of cellular contents correspond to  = 0, while absence of control or bias corresponds to  = 1, where each particle can be in either daughter cell with equal probability. To interpolate between both situations in a simple manner, we employ the following error prone even partition model, and  is the error probability of random partition per particle pair. Notice that introducing the  dependency of Ω nk in this manner, we get an identity matrix if  = 0, and a binomial law if  = 1, thus smoothly interpolating between the extreme situations just described. By an appropriate choice of the time unit we can set μ m = 1 without loss of generality. The model is left then with three parameters, κ, m and .

Results
To start our analysis we numerically estimated the population growth rate as a function of m and κ for the case  = 1. The population growth rate becomes effectively zero when m increases, specially if m ≫ κ (Fig. 1a). In this limit partition errors drive the majority of cells away from the fitness peak, at the expense of a decreasing pool of fitter cells. In contrast, if κ is large then cells are more robust to variations in copy numbers and therefore less sensitive to partition errors (Fig. 1b). This explains the increase in the growth rate with κ. It is also evident that λ may exhibit abrupt changes when the parameters vary (Fig. 1a, blue arrows). This observation suggests that www.nature.com/scientificreports www.nature.com/scientificreports/ varying the parameter values one may find a solution space characterized by qualitatively different phases. Indeed, the numerically estimated eigenvector displays different behaviors (Fig. 2). Depending on the choice of the parameters we obtain a population of cells where the particles effectively disappear (Fig. 2a), a homogeneous population with a unimodal distribution of particle copy number (Fig. 2b) or a bimodal distribution of particle copy number (Fig. 2c). The transition from unimodal to bimodal is continuos as the partition errors increase, with an initial single mode smoothly splitting into two peaks on both sides of the optimal copy number.
To obtain a qualitative insight into the origin of the transition between the different behaviors, we derive approximate analytical solutions. In the unimodal phase, simulations suggest that the population distribution c n has a single-peaked bell-like shape that can be approximated by a normal distribution. This suggests an ansatz of the form: with parameters a, ν. Under this ansatz the product c n μ n is also normal. If we equate the first two moments of the left and right-hand sides of Eq. (3) and take the continuous limit, we obtain a pair of equations that can be solved for a, ν, obtaining (see Appendix for details) is the ratio between the variances of μ n and Ω mn , i.e., the fitness robustness to partition noise ratio. The mean growth rate is approximated by where λ U denotes the value of λ in the unimodal phase. Both ν and λ U are increasing functions of r. Another possible solution is the deletion phase, where c n = δ n0 and λ D = μ 0 . This latter solution will dominate when λ D > λ U and therefore the condition U DU 0 defines the boundary separating the unimodal and deletion phases.
Finally, the numerical simulations in Fig. 2c indicate the existence of bimodal solutions. One way to understand the emergence of this transition, is to refine the unimodal ansatz by further applications of the matrix W. Denote by = | | || () the normalized vector obtained after +  1 applications of W, starting from the unimodal ansatz found above, that we now denote by c (0) . As → ∞  the vector  c ( ) converges to the true eigenvector 35 . To obtain a tractable analytical expression, we analyze the vector obtained after the first iteration, c (1) . Using (3) we can compute the moments of c (1) from those of c (0) . In particular, we obtain an expression for the kurtosis of c (1) www.nature.com/scientificreports www.nature.com/scientificreports/ In the limit m, κ → ∞ with  κ = r m 2 / fixed, the kurtosis simplifies to 2 which increases with r. The kurtosis measures the tailedness of a distribution. In particular, the inequality K ≥ 9/5 holds for all unimodal symmetric distributions 36 . Though this is only a necessary condition for unimodality (but not sufficient), violation of this inequality can be used as an indication that the unimodality to bimodality transition (BU) has ocurred. From (13) we see that for large m, κ, this occurs when r = rBU ≈ 0.26. An excess of partitioning noise over the robustness of growth is the cause of the bimodal transition in this approximation.
Putting all together, the model can be described by a phase diagram in the (m, κ) plane for any particular value of . In Fig. 3 we display the result for the bimodal partitioning model, when = 1  . Though the analytical approximation described above fails to describe the quantitative location of the boundaries, it captures some of its qualitative features, such as the dependance of the unimodal-bimodal transition on the ratio between m and κ and that the regimes κ ≫ m (m ≫ κ) result in the deletion (bimodal) phases. In particular, the bimodal region in the analytical approximation is contained in its simulated counterpart because the inequality K ≤ 9/5 is only necessary for unimodality. A corrected threshold K = 2.15 gives a much better quantitative agreement (see blue dashed line in the figure). The regions delimited by these boundaries correspond with the following steady states or phases. In the deletion phase phase, the particles effectively disappear from most cells in the population. Since cells cannot synthesize de novo particles in this model, the deletion state is irreversible (also called an absorbing state in the language of branching processes). The unimodal phase is characterized by a rather homogeneous population where the distribution of particle copy number is unimodal. In this case, all cells replicate with appreciable rate and have very similar off-springs, that in turn replicate at similar rates. The third and most relevant  www.nature.com/scientificreports www.nature.com/scientificreports/ phase is characterized by a bimodal distribution of particle copy number. In this case, a minority of cells replicates at an appreciable rate, but because of larger partition errors, the majority of their offspring have too low or too high copy numbers (thus feeding the two peaks with newborn cells) with negligible replication rates.
A case of particular interest is that of low m that may represent the evolution of chromosome copy number in a population (ploidy). In the low m limit the analytical approximations are less accurate and we resort on numerical simulations alone. In Fig. 4 we report examples and the phase diagram in the plane κ ( , )  for the case m = 2. The specific examples presented in Fig. 4a,b show that for low m there are unimodal and bimodal solutions, respectively. In fact, the phase diagram retains the three phases: deletion, unimodal and bimodal distributions. It is evident that bimodal solutions are obtained only for partition errors above the threshold ≈ . 0 4 c  . A second requirement for bimodality is that the growth rate tolerance parameter lies below the threshold κ c ≈ 0.6, which is equivalent to requirement of small r (9) in the phase diagram of Fig. 3.

Discussion
Mathematical models of partition errors at cell division typically assume that the growth rate of cells is a constant 26,27,37 . This assumption simplifies conveniently mathematical derivations and might be applicable in some particular scenarios. However, as we have shown here, when the growth rate dependency on the copy number is considered, partition errors might be an effective and robust mechanism of cell diversification.
Our analysis reveals that population bimodality is a feasible state of balanced growth even when all quantities follow unimodal behavior at the single cell level. Within the model considered here, there are two necessary conditions for bimodality to manifest. First, some degree of stochasticity in the partition of particles at division is needed (i.e.,   > c ). Second, an optimal copy number and a sharp decrease in the growth rate for sub-optimal copy numbers (i.e., small κ).
At first sight it might seem puzzling that sharp fitness peaks result in bimodality. One might expect that such a peak would eradicate any deviation from the optimal copy number and thus not allow bimodality. However, this is why partition errors are essential, because as the fitness peaks get sharper, the probability that a reproductive event results in a newborn inside the peak decreases. In simple terms, there is a critical point where more individuals are born outside the peak than inside, which results in bimodality in our model. A similar transition occurs at the error threshold of the quasi-species model 4 .
There are different biological scenarios where this picture could be relevant. One particularly interesting example is the evolution of ploidy in a population with faulty chromosome segregation discussed above. Aneuploidy, is a phenomena at the basis of many disabilities and connected with tumorigenesis. Excluding sex chromosomes, mammalian cells have the diploid state as default, and deviations from this state are rarely tolerated. They have also molecular mechanisms to enforce even chromosome partition at cell division, but mutations in components of the chromosome segregation machinery may lead to an increase of partitioning errors. However, there is a clear contradiction between the low growth rate of aneuploid cells 38 when compared with normal ones, and their prevalence in the context of cancer. The theory proposed here provides a hypothesis for how cells with decreased and increased copy number could be more abundant than the "optimal" diploid cells.
On the other hand, mitochondria are organelles involved in the production of ATP through oxidative phosphorylation. Unsurprisingly then, experimental evidence shows that the larger the number of mitochondria, the larger the growth rate of cells 31 . However, on the extreme case of an excessive mitochondrial content, the decrease of the cytoplasmic space available for other essential components, such as enzymes and ribosomes [39][40][41] becomes detrimental for the cell. In between there must exist an optimal copy number of mitochondria that maximizes growth in a given environment. However cells typically depart from this optimum. In particular cell www.nature.com/scientificreports www.nature.com/scientificreports/ differentiation often involves qualitative changes in the quantity of mitochondria content in different stages 42,43 . Moreover, recent experiments suggest that the partition of mitochondria at cell division is well approximated by binomial statistics 31,33 , but it is not understood how this could support heterogeneous populations.
Plasmids are small, circular DNA molecules within bacterial cells that replicate autonomously. The abundance of plasmids in a cell is known to affect its growth rate 44 , due to excess metabolic burden 45 or because the plasmid contains genes that might increase bacterial fitness 46 . An intermediate plasmid copy number maximizes the growth rate. However, heterogeneity in plasmid content is an important source of evolutionary innovation in bacteria 46 , but we don't know under which conditions this heterogeneity could be maintained. Partition errors at cell division have been associated with the loss of plasmids in bacterial populations 47,48 , in accord with the deletion phase of our model. For example, malfunction of ParA F , a protein involved in the regulation of the faithful partition of DNA in bacteria, leads to an enhanced loss rate of plasmids 49 .
Future experimental work is required to validate or invalidate the relevance of the proposed theory in these scenarios. But our model displays, based on simple and biologically plausible hypothesis, all the phenomenology described by the current experimental results.
Several simplifications were made in the formulation of this model. We considered an oversimplified view of cell growth by duplication of components without acknowledging regulatory mechanisms coupling component synthesis to cell size and division 50 . There is evidence that some components are produced in a cell size dependent manner maintaining constant concentrations 33 , and that cell size homeostasis in a population is achieved by a nearly constant addition of volume after birth (the "adder" mechanism) 51 . Investigating the heterogeneity induced by partitioning noise in light of these observations is an interesting topic for a future work. The ideal Gaussian form of the growth rate (4) simplifies our calculations, but it is only an approximation to a generic unimodal bell-like shape expected for a component exhibiting an optimal copy number 31,39,45,46 . Partition errors are also subject to increasingly detailed mechanisms depending on the molecule in question, with particular statistical properties 27 that are not all covered by (5). Inclusion of all these details would have made our model too complicated. Future work is needed to decide which of these phenomena are more relevant to the mechanism of cell diversification by partitioning errors and growth optimality proposed in this work.