Hidden heterogeneity and circadian-controlled cell fate inferred from single cell lineages

The origin of lineage correlations among single cells and the extent of heterogeneity in their intermitotic times (IMT) and apoptosis times (AT) remain incompletely understood. Here we developed single cell lineage-tracking experiments and computational algorithms to uncover correlations and heterogeneity in the IMT and AT of a colon cancer cell line before and during cisplatin treatment. These correlations could not be explained using simple protein production/degradation models. Sister cell fates were similar regardless of whether they divided before or after cisplatin administration and did not arise from proximity-related factors, suggesting fate determination early in a cell’s lifetime. Based on these findings, we developed a theoretical model explaining how the observed correlation structure can arise from oscillatory mechanisms underlying cell fate control. Our model recapitulated the data only with very specific oscillation periods that fit measured circadian rhythms, thereby suggesting an important role of the circadian clock in controlling cellular fates.


Cell fate is correlated in related cells
To test the impact of cell state prior to cisplatin treatment on the response of HCT116 cells to cisplatin, we calculated how frequently related cells shared the same fate by lineage relationship and compared these frequencies to unrelated cells. We first considered only death or survival as a cell fate (Supplementary Figure 2a). Since 62.5% of cells died in response to cisplatin treatment, then 53% of cells should share the same fate if the response of each cell was an independent event as the probability of both cells dying is ~39% (0.625 * 0.625) and the probability of both cells surviving is ~14% (0.375 * 0.375). Indeed, unrelated cells shared the same fate 52% of the time (N = 12,495 cell pairs). In contrast, sister cells shared the same fate over 80% of the time and the correlation in cell fate decayed with lineage distance. Cells separated by 4 divisions, or 3rd cousins, shared the same fate in similar proportions to unrelated cells (Supplementary Figure 2a).
We next included cell division as one of the outcomes following cisplatin treatment. Following cisplatin treatment cells can divide, die, do neither or do both (Supplementary Figure 2b). Again, we found that related cells share the same fate much more frequently than expected for independent events and the correlation in cell fate decayed with lineage distance (Supplementary Figure 2c).

Cell cycle stage does not affect p53 dynamics or cell death in response to cisplatin
The efficacy of many chemotherapy drugs varies with cell cycle stage 2 . To determine if the rate of p53 accumulation and cell fate is linked to cell cycle stage, we engineered a system to measure cell cycle stage, p53 dynamics, and cell fate in live cells. We used HCT116 p53-VKI human colon cancer cells, a previously established clonal cell line in which one allele of TP53 is tagged with the Venus fluorescent protein 3 . To track cell cycle phase, we expressed Cerulean fused with the N-terminal domain of human geminin (Cer-hGem). Cer-hGem is degraded by the anaphase-promoting complex (APC) during M-phase and G1 when APC activity is high. APC inactivation upon S-Phase entry triggers accumulation of Cer-hGem which remains high until the next M-phase(Supplementary Figure 3a We compared p53 dynamics and cell fate between cells that were in G1 or S/G2 at the time of cisplatin treatment. We found no connection between cell cycle stage at the time of treatment and either cell death or onset of p53 accumulation (Supplementary Figure 3d,e). It is possible that cisplatin acts slowly to damage DNA and that this delay masks our ability to observe a connection between cell cycle stage and cell fate. However, the cell cycle stage of cells at later time points also had no effect on cell death or p53 onset (Supplementary Figure 3f,h). These data suggest that the variation observed in p53 dynamics and cell death in response to cisplatin is not due to variation in cell cycle stage at the time of or during treatment. In contrast, we did observe a correlation between the cell cycle stage at the time of cisplatin addition and the probability that a cell will divide after treatment (Supplementary Figure 3j). For example, although only 34% of cells were in G1 when cisplatin was added, these G1 cells accounted for 83% of the cells that did not die or divide following cisplatin treatment (Supplementary Figure  3j).

Comparison of model fits to IMT and AT distribution data
The most common two and three parameter functions that have previously been used to describe cell division and apoptosis time distributions -Exponentially Modified Gaussian (EMG), gamma, and shifted gamma -were tested against our single cell data. The functional forms of the probability density functions (pdf) are given by where Erfc is the complementary error function. The EMG is a convolution of a Gaussian density with mean and variance : , and an exponential with parameter . The mean of the EMG is given by + 1/ and the variance is : + 1/ : . , where is the gamma function. The mean of the shifted gamma distribution is + and the variance is : .
All nonlinear model fitting and model selection calculations (using the Akaike Information Criterion, AIC 6 ) were performed using the "NonlinearModelFit" function in Mathematica. The Exponentially Modified Gaussian (EMG) function describes the data before cisplatin dosing significantly better than any other model (Supplementary Table 1). The best fitting EMG parameters are = 28.57, = 2.45, = 0.27. The closest competitor, the Gamma function, has an AIC larger than the AIC for EMG by 6.5, implying that the Gamma function is only +[.\/: = 0.038 times (3.8%) as likely to best explain the data as the EMG function 6 . Similarly, the EMG function describes the IMT distribution of cells that straddle the cisplatin dosing event significantly better than the other functions (Supplementary Table 2). The best fitting EMG parameters for the straddling cells are = 29.26, = 3.5, = 0.093. The next best model was the shifted gamma, which was only 6 % as likely to best explain the IMT data as the EMG function. Finally, the same analysis showed that the EMG also best describes the time to death distribution data of cells existing purely after cisplatin treatment (Supplementary Table 3). The best fitting EMG parameters for this case are = 45.836, = 27.9, = 0.209. The next best model was the Gamma, which was only 8.9% as likely to best explain the time to death data as the EMG function.

Basic introduction to survival and competing risks analysis
Survival analysis is a statistical technique for analyzing time to event data, and competing risks occur when there are multiple events that are mutually exclusive-the occurrence of one event precludes the occurrence of any of the other events. This technique is usually framed in terms of hazard functions, which describe the instantaneous rates of occurrence of various events and are central to the development of our statistical algorithm.
Let T be a non-negative random variable denoting the time until occurrence of an event, and be an instantiation of . Let the probability density function (pdf) of be ( ) and the cumulative distribution function (cdf) be ( ), such that The survival function ( ) is the complement of the cdf, and denotes the probability that the event has not occurred by time t: The hazard function for the event is a measure of the risk that the event will occur at any point in time, given the event has not happened up to that time. Formally, the hazard function is defined as .
The expression in Supplementary Equation (3) can easily be shown to simplify to The hazard function is therefore simply a ratio of the pdf and survival functions, as given in Supplementary Equation (4). It can be interpreted as a rate of occurrence of the event. For a Poisson process with rate and an exponentially distributed waiting time distribution given by ( ) = +9? , the hazard function is a constant and given simply by ℎ( ) = ; this therefore represents the simplest hazard function. When the waiting time distributions are no longer exponential, as with the cellular IMT and AT distributions, the hazard functions become timedependent.
Since -( ) is the derivative of ( ), Supplementary Equation (4) can be rewritten as This provides the following equation connecting just the survival and hazard functions: In the case of multiple competing (say k) risks, the above equations can be generalized in a straightforward manner. The total hazard function is defined as the sum of cause-specific hazard functions ℎ • ( ), and the all-cause survival function is given by The all-cause survival function in Supplementary Equation (8) gives the probability of none of the k events having occurred until time and provides the basic equation for the competing risks analysis to be used later in the estimation of parameters of the IMT and AT distributions.

Basic introduction to copulas
Copulas (Latin for "link" or "bond"), as the name suggests, are functions that "join together" one-dimensional distribution functions to form multivariate distribution functions 7 . A more mathematical intuition for these functions can be obtained as follows: consider a pair of random variables and , which could represent, for example, the times to division of two sister cells. Note that these random variables need not be independent, and indeed in the case of division times of sister cells, are highly correlated. Denote the cumulative distribution functions (cdf) of and by ( ) and ( ), respectively, and a joint distribution ( , ) = Pr ( ≤ , ≤ ).
Conversely, if is a copula and and are distribution functions, then the function defined by Sklar's theorem in Supplementary Equation (9) is a joint distribution function with and as marginals. The likelihood framework developed later relies on this converse statement of Sklar's theorem. Functionally, the power of copulas becomes evident by rewriting Supplementary Equation (9) in terms of density functions (denoted by small letters) instead of the cdfs (denoted by capital letters): The copula density on the right hand side of Supplementary Equation (10) captures the correlations among the random variables and , thereby separating the dependence structure from the univariate marginals. Supplementary Equation (10) will form the core component of modeling the correlations among sister cells in the statistical algorithm developed later in this section.
How are copula functions created? There are a number of ways, including geometric methods, to construct copulas 7 . The easiest method, however, is by inverting Sklar's Theorem in Supplementary Equation (9) and using known multivariate distributions. Noting that ( ( ), ( )) lies in the unit square [0,1] × [0,1], we have from Supplementary Equation (9): where ( , ) lie in the unit square [0,1] × [0,1]. For example, by choosing and to be the standard univariate normal distribution , and to be the standard bivariate normal -(with Pearson correlation coefficient ), we obtain what is known as the Gaussian copula, given by --ƒ˜•• ( , ) = -( +" ( ), +" ( )). Note that using the Gaussian copula along with standard univariate normal margins in Supplementary Equation (9) would again provide the standard bivariate normal distribution.

Algorithm based on competing risks analysis and copulas to infer unbiased IMT and AT distributions
To infer the underlying (unobserved or "hidden") IMT and AT distributions that can generate the experimentally observed distributions, it is necessary to model the stochastic competition of cellular fates and the high sister correlations in IMT and AT. High sister correlations can lead to an inaccurate inference of the distribution parameters if unaccounted for, especially when IMT data is limited due to a large extent of drug-induced apoptosis. While it is challenging to model correlated data and this usually neglected for simplicity, recent work in the context of DNA methylation has highlighted the improvement in inference that can be achieved by incorporating correlations 8 . We therefore incorporated sister correlations into our algorithm, neglecting higher order lineage correlations since we found them to be smaller than sistersister correlations. Since the Exponentially Modified Gaussian (EMG) does not have a multivariate version, copulas provide a flexible technique for modeling bivariate EMG distributions with arbitrary correlations 7 . While copula functions have been used extensively in finance 9 , there exists surprisingly few applications in biological problems. Our work highlights the power of copulas, and should motivate more widespread use in biological contexts especially in the growing field of genomics and epigenetics where modeling correlated data is often important 8 . The copula framework was used in conjunction with the competing risks analysis method to develop a statistical algorithm to simultaneously infer unbiased estimates of the inter-mitotic time (IMT) and apoptosis time (AT) distributions of HCT116 colon cancer cells while accounting for the experimentally observed sister correlations. Note that this method is completely general and can be used in the analysis of any cell line with arbitrary IMT and AT distributions, treated with drugs at any concentration. The steps of the method are explained in detail below: (a) A likelihood model was first developed to describe the data before drug dosing, where no cell death was observed. The single cell dataset comprises snapshots at every 30 minute interval, recording the fate and lineage relationship of every cell. This data was partitioned into i pairs of sister cells that divided before cisplatin addition, and using Supplementary Equation (10) each sister pair was ascribed a joint probability density of division given by ™ > " ™ , : ™ ; ›oe• žŸ C = ¡ G > " ™ ; ›oe• žŸ C, > : ™ ; ›oe• žŸ CI > " ™ ; ›oe• žŸ C ( : ™ ; ›oe• žŸ ).
Here i denotes the sister-pair, " ™ and : ™ denote the division times for the two sisters in pair i respectively (in other words, " ™ and : ™ represent the ages of the cells at times of division), ›oe• žŸ represents the parameter vector of the function f, and F denotes the cumulative distribution for the density function f. Throughout this work f is chosen to be the EMG function based on evidence from the data (see Supplementary section 4), but any arbitrary density function can be used instead, if deemed appropriate. The subscript on ›oe• žŸ denotes "division", and the superscript means "before-drug". The function cz is the copula density for modeling the sister-sister correlation in the data. The subscript z in cz refers to a single parameter in the copula which can be related to some measure of correlation in the data 7 . For elliptic copulas like the Gaussian copula, this parameter can be the Pearson correlation. For other copula families, however, measures of linear correlation cannot be used and other measures of dependence like Kendall's Tau or Spearman's rank correlation are required 7 . The final likelihood function for the entire data set (before drug dosing) then becomes > ›oe• žŸ , ¤ ) = ∏ ™ > " ™ , : where the product is over all sister pairs. Since the likelihood function involves the copula, the R package 'copula' 10 was used in the entire analysis. This likelihood function was maximized using the "MLE" command in R to obtain the maximum likelihood parameter estimate of ›oe• žŸ and z. The parameter errors were obtained from the variance-covariance matrix derived from the Hessian matrix. Since the EMG function has three parameters, the before-drug scenario involved inferring four parameters in total. Note that the inference of the copula parameter z (Pearson correlation of the division time of sisters) serves as a validation of the modeling of correlations -with 80 pairs of sister cells in the pre-cisplatin scenario, the inferred value of z is expected to be almost identical to the Pearson correlation measured directly from the data.
(b) After addition of cisplatin, multiple fates were observed (cell division and death; the possibility of survival will be dealt with in the next section). Therefore the likelihood function to describe the data post cisplatin was a combination of the copula (modeling the correlations among times to fates of sisters) and the competing risks framework. Under this scenario, the data is described as a set of sister pairs, where the sisters could either have been born before the time of cisplatin administration ( { ) reaching their fate after { (straddling cells), or they could have been born after { . Since the functional form of the distribution of inter-division times and apoptosis after drug dosing was found to be best described by the EMG (see Supplementary Tables 2 and 3), ( ; ) represents the EMG function as with the before-drug scenario. The after drug IMT and time to death density functions are defined as ( ; ›oe• © ª ) and ( ; ›oeŸ © ª ), respectively. The superscript denotes "after-drug". The corresponding hazard functions for division and death after drug dosing are denoted ℎ( ; ›oe• © ª ) and ℎ( ; ›oeŸ © ª ) respectively, and can be derived from the density functions via Supplementary Equation (4).
The hazard function for either of the two cells of sister pair number that straddles the dosing event will be the hazard for division before drug dosing, and the sum of division and death hazards after drug dosing. This was mathematically implemented by defining a piece-wise function: Note that in Supplementary Equation (14) denotes the time since the individual cell was born, not the absolute time from the start of the experiment. In other words, denotes the age of the individual cell in Supplementary Equation (14), and can be expressed as = − T™«?T , with denoting the absolute time since the start of the experiment and T™«?T denoting the absolute time when the th sister pair was born. { represents the absolute time when the drug was added. With the hazard function thus defined, a straddling cell's probability to survive till an age can be computed using Supplementary Equation (8) and Supplementary Equation (14), and is given by the following survival function: Note that this expression for the survival function is the same for both sisters in pair number , since their hazards are identical. Noting that the cumulative distributions of the two sisters are given by 1 − ™ > " ™ C and 1 − ™ > : ™ C respectively (see Supplementary Equation (2)), and that the density functions for division are given by the products of survival and hazard functions, the joint density for observing both straddling sister cells of pair number to divide is given by Supplementary Equation (16) (16) simply for notational convenience. To reduce the number of parameters that need to be simultaneously inferred, we fixed the values of ›oe• žŸ to those obtained from maximum likelihood estimation from the precisplatin dataset, and inferred only the six parameters ›oe• © ª and ›oeŸ © ª from the postcisplatin dataset. We also fixed the value of z to that calculated directly from the data (separate values of z for cell division and death). The next section discusses a seventh parameter -the probability of a cell being in a state of cell cycle arrest, which was also inferred along with the six mentioned here.
Sister cells that straddle or are born after { are more likely to die due to the effect of cisplatin. Hence along with Supplementary Equation (16), similar equations were developed for all the other possibilities, including discordant fates among the sister cells. The only modifications necessary for such cases is changing the hazard functions on the right hand side of Supplementary Equation (16) to the appropriate hazard functions that describe the eventual fate of the sister cells. For cells that were born after function can be obtained from Supplementary Equation (8). Finally, the total likelihood of the post-cisplatin data analogous to Supplementary Equation (13) is a product of the joint density given in Supplementary Equation (16) and all such joint densities describing all possible combinations of fates, over all sister pairs of cells. For sister pairs with discordant cell fates, we used the simplifying condition of no correlations between the sisters and set the copula term to 1, denoting independence.

Modeling the probability of surviving treatment
Several cells survive until the end of the lineage-tracing experiment. These cells may simply have been censored (they would have divided or died had more time been allowed to elapse), or they may have been under cell cycle arrest due to the action of cisplatin. The older a surviving cell was at the end of the experiment, the more likely it is that the cell was maintained in an arrested state. Conversely, the younger a cell was when the experiment was ended, the more likely it is that that particular cell was not given enough time to undergo a fate. This physically intuitive observation was incorporated into the likelihood framework discussed above using a variable , which denotes the probability of a surviving cell in the dataset to have been in a state of cell cycle arrest 11 . Therefore, for a sister pair that straddled { and both cells eventually divided, the joint density given in Supplementary Equation (16) was multiplied by the factor (1 − ) : , denoting that neither cell was under cell cycle arrest. For a cell that survived till the end of the experiment, its density function would be given by + (1 − ) ™ > " ™ C : the cell could have been arrested or not arrested; if not arrested with probability (1 − ), then ™ > " ™ C quantifies the probability that the cell survived at least till age " ™ given the competing risk scenario. The appropriate function of was multiplied with each of the pair of sisters, and the final complete likelihood constructed. This final likelihood could not be maximized using standard maximization techniques, and a Metropolis Hastings MCMC algorithm was used to generate posterior distributions of all the seven parameters to be inferred -three each from {™ ¶ ƒx? and {™S ƒx? , and the probability of arrest . The MCMC results are presented later.

Modeling delayed response of cells after drug administration
Previous work has suggested that certain drugs may take time to act on cells after being added to the in vitro medium 11 . This delay in drug action could in principle affect the estimation of the parameters from the last sections. To account for this possible effect, we introduced a new parameter (to be inferred from the data via the likelihood algorithm), {S"ƒ· . Physically, the introduction of this term delays the moment when the hazard of a cell switches from just division to division and death. Mathematically, {S"ƒ· adds a delay to the time when the hazard function switches in Supplementary Equation (14). As a result, the piece-wise function is changed to Similarly, the limits of integration in Supplementary Equation (15)

Accounting for sister correlations improves parameter estimation of the IMT distribution
Before using our full computational algorithm of the copulas combined with competing risks to analyze the data, we tested the improvement of parameter inference that can be achieved by accounting for correlations among sister cells. To this end, we used two complementary methods: Method (1): In this approach, we assumed that the true underlying IMT distribution is represented by a dataset comprising only one cell per lineage, which ensures that there are no correlations in this dataset. Therefore, parameters of the distribution inferred from this uncorrelated dataset will represent the 'true' underlying parameters. Correlated datasets of similar size (obtained by analyzing sister pairs) can then be used to test the usefulness of our copula-based inference approach -our approach based on the correlated data should provide inferred parameters closer to the 'true' parameters as compared to inference using the standard method of non-linear least squares (NLS). Specifically, by randomly choosing one cell division time from each lineage in the experimental dataset, we obtained 41 inter-mitotic times and inferred the parameters of this IMT distribution ('true' parameters). To obtain a correlated dataset of similar size, we randomly chose 20 sister pairs out of the 80 sister pairs in our full dataset. We then used NLS as well as our copula-based approach to infer parameters from the correlated dataset. We compared the closeness of each of the inferred parameters ( = , , ) to the 'true' parameters using the square of a simple distance metric Since the 41 independent samples and the 20 sister pairs can be chosen in many ways, we performed this entire analysis 1,000 times, each time drawing a different random set of data and computing the closeness of the parameter inferences to the 'true' values. We found that our copula-approach consistently outperformed the NLS approach, doing better 64.3% of the time for , 56.3% of the time for , and 61.1% of the time for . These values did not change when we increased the sampling from 1,000 to 1,500 and 2,000 times, ruling out the possibility that the higher frequency of improved parameter estimates occurred by chance.
While the above analysis highlights the improvement achieved by our method, it provides an under-estimate of the improvement that could potentially be reached. This fact arises because the 'true' parameters, inferred from a dataset of just 41 samples, do not perfectly represent the real underlying distribution due to the small sample size, and different realizations of drawing 41 independent division times produce slightly different distributions. This leads to an increased chance of the NLS method having better performance than our copula method in specific instances. Therefore, while this method provides a way of demonstrating the importance of determining correlations directly from the data, we investigated a second method in which the truth is unequivocally known.
Method 2: In this method, we used simulations to generate correlated random number pairs (representing sister cells) from an underlying distribution. Hence by construction we know the underlying true distribution. We chose an EMG distribution with the same parameters that described our pre-cisplatin IMT data ( = 28.576, = 2.453, = 0.274). We then used the NLS method as well as our copula method to infer the distribution parameters from this correlated dataset. As with Method 1, we calculated the distance of parameter estimates from both NLS and copula methods to the true parameter values ( ¹ºy,™ and À‚Á˜"ƒ,™ , respectively) which were used to generate the data. This procedure was repeated ~1,000 times. When we used 80 simulated sister pairs, a number chosen to match our experimental data, we found that our method was closer to the truth 61.8% of the time for , 71.1% of the time for , and 66.9% of the time for . We repeated the entire analysis for 100 simulated sister pairs and found that our method does better 62.4% of the time for , 71.4% of the time for , and 67.3% of the time for . In addition, since in this method we have precise knowledge of the true underlying parameters, we also checked the magnitude of improvement in parameter estimation achieved by our copula method over the NLS method. To do this, we defined an improvement metric ™ ( = , , ) given by ™ = | ¹ºy,™ | − | À‚Á˜"ƒ,™ | ™ ⁄ for each of the three parameters, whenever | ¹ºy,™ | > | À‚Á˜"ƒ,™ |. This metric provides a measure of how close the copula inference is to the true parameters compared to the NLS inference. While we found that there was only ~ 1% median improvement for , the improvement in estimation of the other two parameters was very large: ~ 14.6% median, 23.4% 3 rd quartile improvement for and ~ 12% median, 26% 3 rd quartile improvement for (see Supplementary Figure 4 for the distributions of ™ ).

Results from the statistical algorithm -maximum likelihood and MCMC simulations
We will now discuss results from the application of the statistical algorithm described in the previous sections to single cell lineage tracing data of HCT116 cells. Results of the maximum likelihood analysis on pre-cisplatin data are presented in Supplementary

Basic model with no circadian coupling
To validate the results of the inference procedure and explore the mechanistic origins of the observed correlations among cellular lineages, a computational model based on birth-death processes was developed to mimic the single cell lineage tracing experiments. Cellular proliferation was modeled such that a cell divides or dies based on probabilistic rules. In particular, each cell division results in exactly two daughters being born and death removes one cell from the population. We used a version of this model that keeps track of the age of every single extant cell 12 . The probability per unit time of division or death depends on the cell's age in a manner that reproduces the correct, non-exponential functional form (EMG in this case) of these distributions. In brief, a kinetic Monte Carlo simulation was performed where the acceptance probability of any event (division or death) for a cisplatin treated cell depends on was used. Details of the general simulation procedure are given in 12 . The time step ∆ was chosen such that it was much smaller than the average times of the IMT and time to death distributions. Throughout this work ∆ = 0.1 frames (= 3 minutes) was used.
In addition, the birth-death process simulations described above were implemented on a directed graph, such that the vertices or nodes of the graph represent individual cells and edges represent mother-daughter relationships. Growing the cellular population on a graph allows for tracking of lineage relationships between every cell in the population, extant or dead. The absolute time of birth, time of fate, and type of fate (division or death) were recorded as vertex attributes. Initial conditions for all simulations were 30 ancestor cells (similar to the HCT116 single cell dataset), whose ages were randomly chosen from a uniform distribution on the interval [0, ], where is the average of the IMT distribution before cisplatin administration (parameters given in Supplementary Table 4). The progeny of these 30 ancestor cells were tracked over time starting from = 0. All simulations were performed using the R package "igraph" 13 .
Similar to the single cell lineage tracking experiment data, the initial 30 ancestor cells were first allowed to proliferate to ~250 − 290 cells (~3 generations) in the absence of drug. The absence of drug was modeled by setting ›oe• žŸ , the EMG parameters for division in the absence of cisplatin, from Supplementary Table 4. After proliferating to ~250 cells, the parameters were changed and set using the results in Supplementary Table 5 to reflect addition of cisplatin, and further proliferation was simulated. Division times before and after drug addition were calculated from the recorded vertex attributes, allowing computation of the histograms and correlations shown in the main text.

Modeling cell fate dependence on stochastic protein production/degradation
To investigate the dependence of cell fate on stochastically fluctuating levels of one or multiple proteins, we developed a computational model where, in addition to the basic single cell lineage generating mechanism (described above), we simulated protein production with rate ÓrÔ› and degradation with rate ›ŸÕ within each single cell over time. To this end we used the standard approach of generating a uniformly distributed random number between 0 and 1 for each cell at each time step and comparing it with the quantities [Protein] ›ŸÕ ∆ and [Protein] ›ŸÕ ∆ + ÓrÔ› ∆ to decide which reaction will occur, if any. We developed two models: one in which the concentration of one protein (Protein X) controls the cell division probability, and one in which the concentrations of two proteins (Protein X and Protein Y) control the cell division probability. The concentrations of the proteins in a mother cell at the time of division are kept identical to the concentrations inherited by the two daughters. These inherited concentrations are then coupled to the hazard functions of the cells to control cell fate probabilities. Since the goal of these models was to investigate whether the cousin-mother inequality could be recapitulated, using data from our pre-cisplatin study, we modeled only division and neglected cell death. Finally, for both these models, we studied different mixing properties of the proteins: for the case when the protein is 'mixing', i.e. when it loses memory of its level over time scales that are shorter than a single cell division time, we chose the production/degradation parameters ÓrÔ› = 2.5 frame +" and ›ŸÕ = 0.05 frame +" . This choice allows for large fluctuations in the protein levels over time. For the case when the protein is 'non-mixing' and loses memory over time scales much larger than a single cell's lifetime, the parameters were chosen to be ÓrÔ› = 0.05 frame +" and ›ŸÕ = 0.005 frame +" . This choice ensured that the protein level a cell was born with hardly changes by the time that cell divides. The mathematical details of the two models are as follows: Protein X model: As the Protein X concentration at the time at which the mother divides increases, the parameter for the two daughters' hazard functions also increases, thereby enhancing the probability of longer division times for the two daughter cells. If the Protein X concentration in the mother decreases, there is an increased probability of shorter divisions for the daughters. Mathematically, this was achieved by setting = g + 0.65 [Protein X] − 25 for every cell, where [Protein X] represents the Protein X concentration in a cell at the time it was born. The concentration of Protein X in the 30 ancestor cells were randomly chosen from [10,50] (arbitrary units). This parametrization allowed us to maintain the correct magnitude of the sister correlations as observed in our pre-cisplatin dataset. The remainder of the parameters of the model were kept the same as inferred for the pre-cisplatin case (Supplementary Table 4): g = 28.5, = 2.4, = 0.27, while ÓrÔ› and ›ŸÕ were chosen as described above.
Protein X and Protein Y model: To allow the division probability of each cell to depend on the levels of two proteins, we generated stochastic simulations of two independent proteins X and Y. When the ratio of their levels X/Y increases beyond 1, the daughters are more likely to divide more slowly while they are more likely to divide faster if the ratio of the protein levels is smaller than 1. This effect was achieved by coupling the hazard functions of each cell to the protein levels: = ⁄ − 1) when both X and Y are mixing. As with the Protein X only case, these parametrizations were chosen so as to reproduce the experimentally observed sister correlations. The remainder of the parameters of the model were kept the same as inferred for the pre-cisplatin case (Supplementary Table 4): g = 28.5, = 2.4, = 0.27, while ÓrÔ› and ›ŸÕ were chosen as described above.

Modeling circadian gating of the cell cycle
The circadian clock was ascribed a period of 24 hours (48 frames) as found in experiments using the HCT116 cell line 14  To model gating of the cell cycle by the circadian clock, the probability for division of a cell was chosen based on the clock phase at which the cell was born -cells born at certain phases of the clock would have a higher probability of dividing at lower ages than cells born at other phases of the clock. Mathematically, this was achieved by making the parameter of the EMG a sinusoidal function of the clock phase at cell birth. As mentioned in Supplementary section 4, the mean of the EMG is given by + 1/ , while the variance is given by : + 1/ : . Therefore, by changing , the mean of the IMT distribution and hence the hazard of division can be changed without affecting the variance. The following general structure for was used to introduce gating of the cell cycle for any individual cell: where is the clock phase at time of birth of that particular cell. Sister cells are born at the same clock phase (for simulations relaxing this assumption, see below), cousins at similar phases, and mother-daughter pairs at potentially very different phases, depending on the length of the cell cycle compared to the clock period. For the simulations incorporating circadian gating, an extra vertex attribute recording for each cell was added. The crucial aspect to note regarding the two free parameters g and in Supplementary Equation ( (18), is that they need to be chosen in a way that reproduces the IMT distributions defined by ›oe• žŸ (for that the pre-competition IMT distribution is similar to the distributions parameterized by ›oe• žŸ and ›oe• © ª . Furthermore, not all combinations of parameters g and that recapitulate the underlying distributions can recapitulate the observed correlations in the division times. Hence this poses an additional constraint on the choice of parameterization of Supplementary  Equation ( (18). In order to satisfy both these constraints simultaneously, the following parameterization was used in all simulations incorporating gating of the cell-cycle: The superscript denotes the sister pair number, as in Supplementary section 5. Therefore, for the pre-cisplatin scenario, the circadian model to describe IMT distributions and correlations in IMT has four free parameters g , , , . Since the EMG function describing the IMT distribution requires 3 parameters for complete characterization, our model requires only one extra free parameter to also explain the entire IMT correlation structure pre-cisplatin. Note that there may exist other choices of parameter values for g , , , that explain the data equally wellthe choice given above is only meant to serve as an example of how a minimal model with circadian gating can explain the single cell lineage tracing data. The number of free parameters for the post-cisplatin scenario is discussed below. Finally, note that the definition of the clock phase based on the absolute time implies that all cells have their circadian clocks synchronized. This is a simplifying assumption and usually not true in bulk populations of cells. However, this assumption does not affect the experimental observations this model aims to explaincorrelations within lineages of single cells. This point is explained in more detail with supporting simulations below.
The simulations using the circadian gating model were started with an initial condition of 30 cells with randomly drawn ages between 0 and the mean of the inferred IMT distribution (parameters in Supplementary Table 4). The initial absolute time was set to = 0 and as before, the time step ∆ was chosen to be ∆ = 0.1 frames (= 3 minutes). The lineages of each of the 30 original cells were then followed over time.

Varying the periods of the oscillator gating of the cell cycle
Besides a 24hour circadian period for the oscillator gating of the cell cycle, a number of other periods were tested for their ability to reproduce the cousin-mother inequality observed in the pre-cisplatin HCT116 lineage data. The parameters = 2.2 , = 0.85 were kept unchanged across all of these simulations. Only the parameters g and were tuned to reproduce the correct sister correlations and the IMT distributions observed in the data. Supplementary Table  7 provides all parameter values that were used to generate Fig. 6 and Supplementary Figure  10. Parameters for the period 24 hours (48 frames) are given in the preceding paragraph.

Gating of apoptosis by the circadian clock
Gating of apoptosis was incorporated in a manner similar to Supplementary Equation ( (18), with an identical underlying principle -cells born at certain phases of the circadian clock are more likely to die at lower ages as compared to other cells. With gating of two pathways (division and death), an additional phase difference between the two gated pathways must be considered. This phase difference was accounted for by using the variable in the following manner for defining of the time to death distribution: , and another set of four g , , , representing ›oeŸ,áoerá©›oe©â © ª . Note that since three parameters each were required to characterize the post-cisplatin IMT and AT distributions, only two additional parameters were required by our model to capture the entire correlation structures in post-cisplatin IMT and AT. Finally, various values of between 0 and 2 were explored, and ~ 0 was found to best explain the data, suggesting that gating of cell cycle and cell death pathways must occur approximately in phase. These parameters for the time to death distribution along with the parameters for the IMT distribution (given above) together explain all experimentally observed IMT and AT distributions as well as the correlations before and after cisplatin treatment.

Random phase change of the circadian clock during cell division
The model for circadian gating described above makes the simplifying assumption that the circadian clocks of all cells in the population are synchronized. This assumption is usually not true in a bulk population where the individual clocks are poorly synchronized, unless treated with dexamethasone or serum shocked 15 . The single cell lineage tracing experiment analyzed in this work was performed under standard cell culture conditions, and hence the cells would not be expected to show synchronicity at the bulk level. However, when the oscillations of circadian proteins in progeny emerging from a single cell are tracked, the phase of the clocks in the two daughters after division faithfully start close to the phase value where the mother cell ended [15][16][17] . Hence, over a few generations (which is the duration of the entire experiment studied here), the cells within one particular lineage would have circadian clocks that are synchronized to a high degree. Since the model developed here aims to explain correlations between family members within a lineage, the simplifying assumption of synchronized clocks is a very good approximation. Having said that, small phase shifts have been noticed at the time of division, presumably due to stochastic distribution of circadian proteins from mother to daughter cells 15,17 . To ensure that these small phase shifts do not qualitatively change any of this work's results, simulations were performed including random phase shifts in the clock phase at birth of daughter cells (Supplementary Figure 8). This therefore ensures that the two daughters at the time of birth have slightly different clock phases. The random phase shift was chosen uniformly between [0, ]. All results shown in the main text remained almost quantitatively the same even with as high as 6 ï , as shown in Supplementary Figure 8. Only with ≥ 2 ï do the correlations among sisters and cousins show a noticeable decrease.

Random circadian phases for different cellular lineages
As discussed above, the results of the simulations shown in the main text were derived under the assumption that all cells across different lineages had synchronized circadian clocks. This assumption was made purely for the sake of simplicity of the model and for reducing computing time and memory requirements as we needed to keep track of the phases of each lineage separately over time. This assumption was not meant to be physically realistic, but captured the minimal requirement that sisters have similar circadian phases. Since different lineages are independent in birth-death models such as the one we use in this study, our results would not change if we assumed different phases across lineages.
To demonstrate the validity of our results shown in the main text regarding the lineage correlation, in particular the cousin-mother inequality, even in the absence of this assumption, we developed a modified model: we chose the circadian phase of the 30 starting cells randomly from a uniform distribution, [0, 2 ]. As a result, the phases of cells across lineages are no longer synchronized at any later time (Supplementary Figure 9a). We found that our original results for the lineage correlations still hold in this modified version of the model (Supplementary Figure 9b), proving that as long as the sister cells have similar circadian phases, the experimentally observed correlation structures can be quantitatively reproduced by our model. Note that in this modified model, the cells within a lineage still have synchronized circadian clocks, but we showed in the previous subsection that adding small amounts of randomness in the passing of phase from mother to daughter does not change our results.

Modeling correlated cell fates of sisters
In addition to the correlation structures in times to fate, very high similarities between the eventual fates of the sister and cousin cells were also observed (Supplementary Figure 2a). To check whether these results could also be accounted for in the birth-death simulation framework, the circadian gating model was updated based on our experimental observation that p53 dynamics of sisters is correlated and associated with cellular fate (Fig. 2e,f). Two additions were made to the simulation procedure: (1) In the step where the decision to undergo an event (either division or death) for every extant cell is made depending on comparisons of independent uniform random numbers between 0 and 1 with the exponential factors exp [− Gℎ> ; ›oe• © ª C + ℎ> ; ›oeŸ © ª CI ∆ ], correlated random numbers were introduced for the sisters. Sister cells among the extant cells were identified in each time step and correlated uniform random numbers between 0 and 1 were generated from a bivariate Gaussian copula, using the "copula" package in R. Independent random numbers were generated for all the other non-sister cells.
(2) A similar method for generating correlated uniform random numbers for sisters was used for the fate-determining step. Once an event is destined to occur, the choice between division or death was generated based on comparisons of uniform random numbers to the ratios ℎ> ; ›oe• © ª C/ Gℎ> ; ›oe• © ª C + ℎ> ; ›oeŸ © ª CI for each cell destined to a fate at that time step. Instead of using independent random numbers, correlated random numbers were generated for the sisters. In both the steps, the same value of correlation was used, thereby adding only one extra free parameter overall. A Pearson correlation of 0.95 was used for the random numbers to generate Supplementary Figure  . We found that in our simulations «ƒñ{ was always approximately 50%, while •™• and À‚˜• were higher, as shown in Supplementary Figure 13a.

Live cell microscopy for cell cycle analysis
To track cell cycle stage we incorporated a Cerulean-hGem lentiviral reporter into HCT116 p53-VKI cells. Approximately 10,000 cells were plated to poly-D-lysine coated glass bottom dishes (MatTek corporation) in McCoy's media with 10% FBS. Cells were grown for 48 hours prior to imaging to allow cells to attach. Prior to imaging, media was replaced with RPMI without phenol red or riboflavin and supplemented with 5% FBS (clear RPMI) to minimize background fluorescence. Cells were imaged every 30 minutes for 24 hours to identify cell cycle stage, then media was replaced with premixed clear RPMI-media + 12.5 µM cisplatin and cells were imaged for an additional 72 hours without media replacement.

Data analysis
Cells were tracked with custom ImageJ scripts using the phase channel. Custom Matlab (Mathworks) scripts were used to subtract the background from images using the rolling ball algorithm (Sternberg 1983) and acquire single cell trajectories of p53-Venus and Cerulean-hGem levels. Code available upon request. Cell-cycle stage was determined by visual inspection of Cer-hGem levels.   i.   (a)-(d) Contour plots of the bivariate density obtained from four independent sets of simulated data with 80 sister pairs, a number chosen to match the number of sisters in the experimental data. In the simulated datasets, the univariate margins were chosen to be the inferred EMG distribution as given in Supplementary Table 4, and the Pearson correlation was set to the inferred value given in Supplementary Table 4. Sister−1 IMT (hours) Sister−2 IMT (hours) 10 12 c. d.  (a)-(c) Lineage correlations obtained from simulations of stochastic protein production and degradation combined with the single cell birth-death process. (a) Both proteins X and Y are mixing, thereby losing memory of the initial protein levels a mother passes on to the daughter cells; (b) neither X nor Y are mixing, and hence retain memory of the initial protein levels a cell is born with; (c) only protein X is mixing. As can be seen in the three panels, the cousin-mother inequality cannot be recapitulated in any case. In addition, in (b) and (c), the mother-daughter correlations become very large, inconsistent with our experimental observations. All boxplots represent the 1 st , 2 nd and 3 rd quartiles of the lineage correlations generated from 30 simulation runs.   d. c.  Only certain multiples of approximately 12 hour time periods were able to recapitulate the data, for example 12 and 48 hours (24 hours as well; these results are shown in Fig. 5f), but not 36 hours. Parameters used for generating these plots are given in Supplementary section 6. All boxplots represent the 1 st , 2 nd and 3 rd quartiles of the lineage correlations generated from 30 simulation runs. b. c.

SUPPLEMENTARY FIGURES
d. e.