Stochasticity and Spatial Interaction Govern Stem Cell Differentiation Dynamics

Stem cell differentiation underlies many fundamental processes such as development, tissue growth and regeneration, as well as disease progression. Understanding how stem cell differentiation is controlled in mixed cell populations is an important step in developing quantitative models of cell population dynamics. Here we focus on quantifying the role of cell-cell interactions in determining stem cell fate. Toward this, we monitor stem cell differentiation in adherent cultures on micropatterns and collect statistical cell fate data. Results show high cell fate variability and a bimodal probability distribution of stem cell fraction on small (80–140 μm diameter) micropatterns. On larger (225–500 μm diameter) micropatterns, the variability is also high but the distribution of the stem cell fraction becomes unimodal. Using a stochastic model, we analyze the differentiation dynamics and quantitatively determine the differentiation probability as a function of stem cell fraction. Results indicate that stem cells can interact and sense cellular composition in their immediate neighborhood and adjust their differentiation probability accordingly. Blocking epithelial cadherin (E-cadherin) can diminish this cell-cell contact mediated sensing. For larger micropatterns, cell motility adds a spatial dimension to the picture. Taken together, we find stochasticity and cell-cell interactions are important factors in determining cell fate in mixed cell populations.

All images were processed using a custom written Matlab algorithm incorporating the imaging processing toolbox. Prior to enumerating Tra-1-81 positive cells, individual JPEG files were preprocessed via background subtraction. Total nuclei was then pinpointed by local pixel intensities within a defined regional array and counted. Finally, individual channels were masked as binary images subsequent to excluding extraneous pixel values and co-localized with DAPI images for quantifying the percentage of stem to non stem cell populations.

D. E-cadherin Anti-body Blocking Experiment
To elucidate the role of cell-cell interactions in stem cell di↵erentiation kinetics within confined geometric domains, 50 µg/mL of anti-E-cadherin antibody (clone 67A4; Millipore) was incubated with freshly dissociated hiPSCs for two hours. As evidence suggests disruption of E-cadherin signaling leads to increased stem cell death 2 , 500,000 cells were subsequently seeded onto the micropatterns, cultured for an additional 24 hours 3 and prepped for immunofluorescence as described above.

II. MATHEMATICAL MODEL DETAILS
A. Stochastic two-species growth model: e↵ects of interconversion and competition The experimental system in the main text can be modeled by a stochastic birth-death process 4-6 with den- sity dependent rates for symmetric cell division u S = v S S (n S 1 + n D ) and u D = v D D (n D 1 + n S ) for species S (stem cells) and D (di↵erentiated cells), respectively. Here, v S and v D are intrinsic rates of cell proliferation in very sparse cell cultures while S and D are crowding coe cients. In this system, there is also potentially cell loss or cell death, mostly from cells detaching from substrate. We model cell loss with rates w S and w D for stem and di↵erentiated cells. Since cell loss comes from mostly cells detaching from the substrate, and this rate is similar for stem and di↵erentiated cells, i.e., w S ⇠ w D . Note that the rate constants of cell division for species S and D depend on total population size n S + n D . We will call this stochastic model as Model 1. Also, linear expressions for rate constants of cell proliferation make sense as long as the rates are positive. For large cell abundances (n S 1, or/and n D 1) we will assume that u S = 0 and u D = 0 when the linear expression result in negative rates. Model 1 predicts that the net growth diminishes as the population grows (and cell density increases, as the space for growth is limited) and can be used to describe homeostatic populations. In this model, all cells divide slower in the more crowed environment and the population reaches steady state.
In addition to cell proliferation, stem cells S can differentiate, i.e., convert to species D with rate r using several di↵erent possible mechanisms. We can consider three scenarios: (a) direct conversion: S ! D, (b) asymmetric cell division: S ! S + D, and (c) symmetric cell division: S ! D+D. Note that for mechanism (a) the total number of cells does not change while for mechanisms (b) and (c) the total number of cells increases. Therefore, in this paper, we focus on (a). (c) is not very di↵erent from (a) if we consider cell division and then conversion as two success steps. Therefore, results only vary quantitatively. The form of the di↵erentiation probability and main conclusions of the paper remain the same. For (b), the number of stem cells cannot decrease, which is not what we observe in the experiment. Therefore, we eliminate (b) from our consideration.
In the experiments, we discover that the rate of stem cell di↵erentiation is a function of the composition of a population. We expect that rate of di↵erentiation is a function of fractions of stem and nonstem cells in a population. In general, we can expand the rate as a polynomial function of stem cell fraction, = n S /(n S + n D ). To second order, the rate of cell di↵erentiation can be written as (1) The fraction of di↵erentiated cells is 1 , therefore writing r as a function of di↵erentiated cell fraction will yield a similar expression. As previously, this expression for stem cell conversion rate is valid as long as it gives the non-negative rates of di↵erentiation. For the range of parameters where expression becomes negative we assume that r = 0. Now let us consider the set of stochastic equations describing the time dependence of cell number distribution of S and D). Let P (k, m, t) be the probability to find n S ⌘ k stem cells and n D ⌘ m di↵erentiated cells at time t. The general stochastic master equations are with cell division rates based on cell density dependent rate constants for k = 0, 1, . . . and m = 0, 1, . . . , and loss rates based on fixed rate constants for m = 0, 1, . . . and k = 0, 1, . . . . For states (k, m) where the expressions in Eq. (3) become negative, U is replaced by zero. The rate of stem cell di↵erentiation is given by piecewise function For states (k, m) of the system where the rates expressed by Eq. (5) become negative, R is replaced by zero. The coe cients f 1 and f 2 for stem cells may have di↵erent signs. From Eqs. (3) it follows that The infinite set of Eq. (2) should be supplemented by for j = 0, 1, . . . . Practically, we solve the finite set of stochastic master equations by introducing the truncation size K. The set of rates at m or k equal to K is defined as for j = 0, 1, . . . , K and also for j = 0, 1, . . . , K + 1. Also instead of Eq. (7) in numerical computations we use the following expression for rates: for j = 0, 1, . . . , K.
Eqs. (8-11) expresses the condition that the population size cannot grow larger than n S = K and n D = K. We used K = 100 as the truncation size in our computations for describing the population dynamics of stem/nonstem cells on small micro patterns.
The set of stochastic master equations can be solved for a given initial conditions P (k, m, 0) = P 0 (k, m). In the experiments, the micropatterns initially are plated by stem cells with some constant cell density. In this case di↵erentiated cells are not present at t = 0 and the initial distribution of stem cells is expected to be a Poisson distribution and P (k, m > 0, 0) = 0, where is the initial average number of stem cells on a pattern of a given size. We solve the system of stochastic master equations numerically and obtain at each time t the probabilities P (k, m, t) of states with k stem and m di↵erentiated cells. The probabilities of states fully define the state of a system. For instance, they allow computing any moments of joint probability distribution. The average number of stem and di↵erentiated cells at time t are given by and the variances (squares of standard deviations) and covariance are obtained as The average numbers of stem hn S i and di↵erentiated hn D i cells along with their standard deviations S and D are compared with experimentally observed values ( Fig. 2 main text). For a state (k, m) stem cells fraction is k/(k + m). Hence, the probability to obtain the fraction (t) of stem cell in a population at time t is given by summing up of all P (k, m, t) with k/(k + m) = . The one-dimensional distribution ⇢ S ( , t) is normalized since it is derived from normalized two-dimensional distribution P (k, m, t).
From the master equation, we computed twodimensional probability distributions for the model that assumes appreciable dependence of the conversion rate on stem cell fraction. Fig. S2 depicts the joint probability distribution for numbers of stem and nonstem cells at t = 1, 2 and 5 days. This distribution has two distinct maxima. The composition of cell colonies in collection of micropatterns described by this distribution is highly heterogeneous corresponding to populations highly enriched by stem cells or nonstem cells. The one-dimensional stem cell fraction probability distribution is bimodal with sharp maxima around = 0 and = 1. In contrast, we the di↵erentiation probability r as a constant, independent of , we always find a single peak in the cell number distribution (Fig. S3). Given the modeling framework, we fit the model parameters to experimental data. The procedure of finding the best parameters of the models is based on the minimization of an error function by the Monte Carlo (MC) method. The error function is composed of two parts and represents the sum of deviations of theoretical data from experimental observations. The first part shows how the average numbers of stem and nonstem cells on micro pattern agree with experimental observations. The second part measures the deviations of theoretical probability distribution function for stem cell fraction for a group of micropatterns of a given size from experimental results.
In our case the error function reads where E is the total error function and g i (i = 1, 2) are its parts; hn S i and hn D i are theoretical numbers of stem and nonstem cells at time t i (K = 3 is the number of time points at which the cell colonies were quantified), and n 0 S and n 0 D are experimental values of average numbers of stem and nonstem cells for micropatterns of a given size. ⇢ S ( j , t i ) and ⇢ 0 S ( j , t i ) are theoretical and experimental values of probability density function for stem cell fraction evaluated at time point t i .  Table I.
The algorithm of finding the best parameters using the MC method is as follows: (1) We start with an initial guess of parameters in the model: intrinsic proliferation rates, crowding coe cients and cell death rates for stem and nonstem cells, respectively, and also parameters describing the rate of di↵erentiation as a function of local fraction of stem cells in a population; (2) All parameters are changed randomly by small values; (3) We solve numerically the system of stochastic master equations for initial guess and perturbed parameters; (4) If the trial function g is smaller for changed set of parameters and all these parameters are consistent with constraints, the new parameters are accepted otherwise the parameters remain intact. The new or unchanged parameters are perturbed again (see item 2). Thus, we accomplish the constrained search of parameters, so the new set of parameters that diminishes the trial function is accepted only if all these parameters are consistent with constraints. We put the following constraints on parameters: all crowding coe cients that describe the proliferation rates of cells on the total population size should be positive and also the intrinsic proliferation rates should exceed some pre-set values. The obtained best fits are checked by starting with other initial guesses to see convergence to the same set of final parameters. We complete the procedure of finding the best parameters when the trial function practically ceases to change and theoretical description of experimental data remains practically the same. The parameters of the model found by minimization of error function E by the MC method is presented in Table I.

STEM CELL DIFFERENTIATION ON LARGE MICRO PATTERNS: THE MODEL WITH COMPARTMENTS
The model with compartments is used to capture the main experimental observations on stem cells di↵erentiation in micro patterns with large sizes. A large micro pattern is considered as a combination of small compartments. The probability of stem cell conversion is expected to be a function of local composition of a single compartment rather than the global composition of a population in the whole micropattern. The functional dependence of conversion rate vs. stem cell fraction is taken identical to cell di↵erentiation on small micropatterns. Cell motility is taken into account by introducing stochastic hopping between compartments. Stem and nonstem cells can leave their original compartment and occupy the neighboring one. The hopping rate of a cell from one compartment to another one is k h , which is related to the e↵ective di↵usion constant of cells as where L = 100µm is the compartment size. We use a stochastic simulation algorithm to compute dynamics in the compartment model. For a given time interval, t, we consider the probability of escaping the current state, e K t , where K is the sum of all possible rates of escape from the current state. For example, we include sums of all rates of proliferation u S,D and cell loss S,D for each cell in all compartments in K. We also include the rate of stem cell di↵erentiation, r, and rates of di↵usion for each cell, k h . We then use an acceptance criterion R = min[1, e K t ], where R is a random number. To determine the actual stochastic event, we select another random number and select from the list of all possible events weighted by their rates. The parameters specifying proliferation and cell loss in each compartment are exactly the same as 140µm parameters in Table 1. Using these parameters we obtained 3000 trajectories for a  probability distribution of numbers of stem and di↵erentiated cells in the whole pattern comprised four compartments at t = 1 day. Cell motility is taken into account. The total distribution summed for all compartments reveals one maximum that is also consistent with unimodal one-dimensional distribution of stem cell fraction where the maximum is close to average value. The individual compartment distribution can still exhibit a bimodal distribution.

MEAN POPULATION MODEL FOR STEM CELL DIFFERENTIATION
Here we focus briefly on di↵erence between stochastic and mean population model models for describing the population dynamics of stem cells. The mean population model considers only the time evolution of average number of species derived from expressions for rate constants. These models are often used in chemical kinetic equations. For our system, equations that describe the time evolution of average numbers of stem and di↵eren-tiated cells can be written as with rate constants expressed through the average numbers of species (X = S, D) This simple system of two di↵erential equations can be solved numerically. Comparison between this mean field approach and the full stochastic model for average stem and di↵erentiated cells are shown in Fig. S4. Note that the mean field model cannot compute probability distributions of stem cell fractions from our data. The mean field model also does not consider possible population fluctuations away from the average. Since in our system the cell number is small, fluctuations are substantial, and a↵ect the mean population.
Alternatively, starting with the full stochastic model of Eq. (2), we can perform a cumulant expansion by considering moments such as Substituting the right hand side of Eq. (2) for @P/@t will give a complicated function of higher moments of P . Note that this set of coupled moment equations are in principle exact. Approximations can be made by cutting o↵ the moment equations and introducing a closure relation. The typical population models of Eq. (20) is not equivalent to the cumulant expansion, even if we cut the expansion o↵ at first order and neglect any terms involving higher moments of P .
Comparisons between Eq. (20) and the full stochastic equation solutions are shown in Fig. S3. The mean-field