The single-progenitor model as the unifying paradigm of squamous epithelial maintenance

Adult tissues such as the epidermis of the skin and the epithelium lining the esophagus are continuously turned over throughout life. Cells are shed from the tissue surface and replaced by cell division. Yet, the cellular mechanisms that underpin these tissues homeostasis remain poorly established, having important implications for wound healing and carcinogenesis. Lineage tracing, in which of a cohort of proliferating cells and their descendants are genetically labelled in transgenic mice, has been used to study the fate behavior of the proliferating cells that maintain these tissues. However, based on this technique, distinct mutually irreconcilable models, differing in the implored number and hierarchy of proliferating cell types, have been proposed to explain homeostasis. To elucidate which of these conflicting scenarios should prevail, here we performed cell proliferation assays across multiple body sites in transgenic H2BGFP mouse epidermis and esophagus. Cell-cycle properties were then extracted from the H2BGFP dilution kinetics and adopted in a common analytic approach for a refined analysis of a new lineage-tracing experiment and eight published clonal data sets from esophagus and different skin territories. Our results show H2BGFP dilution profiles remained unimodal over time, indicating the absence of slow-cycling stem cells across all tissues analyzed. We find that despite using diverse genetic labelling approaches, all lineage-tracing data sets are consistent with tissues maintenance by a single population of proliferating cells. The outcome of a given division is unpredictable but, on average the likelihood of producing proliferating and differentiating cells is balanced, ensuring tissue homeostasis. The fate outcomes of sister cells are anticorrelated. We conclude a single cell population maintains squamous epithelial homeostasis.


Introduction
The squamous epithelia that cover the external surface of the body and line the mouth and esophagus consist of layers of keratinocytes. In the mouse epidermis and esophagus cell division is confined to the deepest, basal cell layer (FIGURE 1A). On commitment to terminal differentiation, proliferating cells exit the cell cycle and migrate to the suprabasal cell layers, undergoing a series of biochemical and morphological changes before being ultimately shed from the tissue surface. Cellular homeostasis requires that cells are generated by proliferation at the same rate at which they are shed. Further, to maintain a constant number of proliferating cells, on average each cell division must generate one daughter that will go on to divide and one that will differentiate after first exiting the cell cycle. However, the nature of the dividing cell population has been subject to controversy, revolving around whether dividing cells are a single, functionally equivalent population or a hierarchy of two or more distinct cell types with different proliferative potential [1][2][3][4][5] . Resolving proliferating cell behavior is key for understanding not only normal tissue maintenance but also processes such as wound healing and the accumulation of somatic mutations in normal tissues during aging and carcinogenesis 6,7 .
Whilst murine epidermis and esophageal epithelium share the same basic organization, there are significant differences between the tissues. The esophageal epithelium is uniform, with no appendages, while the epidermis is punctuated by hair follicles and sweat ducts, which form distinct proliferative compartments independent of the epidermis (FIGURE 1A) 5,[8][9][10] . The structure of the epidermis also varies with body site. In typical mouse epidermis, such as that on the back (dorsum), hair follicles are frequent but there are no sweat glands 10 . In contrast, in the mouse paw epidermis hair is absent but sweat ducts are common 10,11 . The ear epidermis is different again; it has uniform columns of differentiating cells, not present elsewhere 12 . Finally, the mouse tail has the most unusual structure, being a scale forming epidermis like that of chicken legs and Crocodillia rather than typical mammalian skin [13][14][15] . This structural diversity has motivated a range of studies to define the properties of proliferating cells at each site.
Genetic lineage tracing in transgenic mice has emerged as a powerful technique for tracking the behavior of cells within tissues (FIGURE 1B) 16 . This is performed in mice expressing two transgenic constructs (FIGURE S1A). The first is a genetic switch, using a bacterial recombinase enzyme Cre, expressed either from a transgenic promoter or targeted to a specific gene 17 . A variety of Cre expressing mouse strains have been used for studies of esophageal epithelium and epidermis (FIGURE S1A). Cre is fused to a mutant hormone receptor so it is only active following treatment with a drug, giving control over when recombination is induced. Using low doses of inducing drug allows the labelling of scattered single cells.
The second construct is a reporter, such as a fluorescent protein, typically targeted to the ubiquitously expressed Gt(ROSA)26Sor (Rosa26) locus. The reporter is only expressed following the excision of a "stop" cassette by Cre. As Cre mediates a genetic rearrangement, the progeny of the labelled cell also express the reporter. If cells are labelled at a low enough frequency, single-cell-derived clones of reporter expressing cells result. If a representative sample of proliferating cells is labelled and their progeny tracked over a time course, statistical analysis of the evolving clone size distributions may be used to infer cell behavior 3 .
Alongside lineage tracing, a complementary transgenic assay may be used to detect cells cycling at different rates and infer the average rate of cell division (FIGURE 1C). This uses a transgenic, drug regulated synthetic promoter to control expression of a protein comprising Histone2B fused to green fluorescent protein (H2B-GFP) (FIGURE S1B). The H2B-GFP is initially expressed at high levels in keratinocytes. Its transcription is then shut off and levels of H2B-GFP protein measured by microscopy or flow cytometry. The stable H2B-GFP protein is diluted by cell division, so if the tissue contains cell populations dividing at different rates, the more slowly dividing cells will retain higher levels of protein 18 .
Measurements of the rate of loss of fluorescence have been used to estimate the rate of cell division 4,5,19 .
Lineage tracing has ruled out older deterministic models of a proliferative hierarchy of asymmetrically dividing stem cells generating 'transit amplifying' cells that undergo a fixed number of divisions prior to differentiation 3 . These models predict that clone sizes will rise and then remain stable. In multiple lineage tracing experiments, however, mean clone size has been found to increase progressively with time.
However, several mutually incompatible models in which proliferating cells have stochastic fate have been proposed that do appear consistent with the data in one or more experiments (FIGURE 1D; Suppl.

Methods).
The simplest stochastic model, the single progenitor (SP) hypothesis, proposes that all dividing keratinocytes are functionally equivalent and generate dividing and differentiating daughters with equal probability 3,5 . An alternative stem cell-committed progenitor (SC-CP) paradigm, applied to the epidermis proposes a hierarchy of rare, slowly cycling stem cells which generate stem and progenitor daughters.
The progenitors are biased towards differentiation so continual stem cell proliferation is required 4,20 . A third model argues that two independent populations of stem cells (2xSC) dividing at different rates exist in the epidermis 19 . These models all give comparably good fits to the results from individual experiments.
However, each has been proposed on the basis of distinct data sets analyzed by different inference and fitting procedures, with limited testing of alternative hypotheses.
Motivated by the disparity of the proposed models of cell dynamics we set out to determine whether a consistent quantitative approach to lineage tracing data, further constrained by cell proliferation analysis in transgenic H2B-GFP mice could identify a single model that is consistent across multiple data sets in the esophagus and different epidermal regions. Here we develop an integrative stochastic-modelling approach where cell-cycle properties are extracted from new H2B-GFP dilution data, embodied in model simulations and lineage tracing results fitted, by maximum likelihood parameter inference. We test this method on a new experimental data set from mouse esophagus, and then apply it to published data sets in different skin territories, performing new H2B-GFP dilution measurements to constrain model fitting at each body site. We find that the data sets are consistent with a common, simple SP model of homeostasis.
We are able to improve the parameter estimates within the model to show that the fates of pairs of sister cells are anti-correlated, and that the basal layer contains a substantial proportion of cells which will differentiate rather than going on to divide.

Cell division rates and cycle times in epidermis and esophagus
Analysis of cell proliferation in epithelia offers a simple way to test the predictions of the disparate models of epithelial homeostasis by revealing the level of heterogeneity in the division rate of basal-layer cells.
The SP model predicts a single cell population dividing at the same average rate while the alternative hypotheses argue for discrete populations dividing at different average rates. We therefore investigated the dilution of H2B-GFP in the epidermis and esophagus of R26 M2rtTA /TetO-H2BGFP mice ( Fig. 2A). The animals were treated with doxycycline (Dox) for 4 weeks to induce H2BGFP expression. Dox was then withdrawn and H2BGFP protein levels in individual basal keratinocytes tracked by direct measurement of GFP fluorescence using confocal imaging in epithelial wholemounts at multiple time points. We examined esophagus and epidermis from paw, ear, and tail ( Fig. 2B; Fig. S2; Fig. S3). Non-epithelial cells in the form of CD45 + leukocytes, which retain high levels of H2BGFP, were excluded from the analysis, but served as internal reference for label retention 5 (e.g. Fig. 2B, insert; Table S1). In addition, for the analysis below we included a recently published data set from dorsal epidermis performed using an identical protocol 21 ( Fig. S2).
We first examined images for the presence of label retaining cells (LRCs) ( Table S1). We found no keratinocyte LRC in the basal cell layer of the esophagus or any epidermal site other than the interscale region of the tail (Fig. S3A). Rare keratinocyte LRCs (4/1923, i.e. 0.2% of basal layer keratinocytes) were observed in interscale epidermis, in a single animal, 18 days after DOX withdrawal. Their scarcity however suggests that they are unlikely to make a substantial contribution to tissue maintenance.
Next, we performed a quantitative analysis of the time series of the individual-cell H2BGFP intensity histograms (Table S2). If there were multiple subpopulations of cells proliferating at different rates the distribution of H2BGFP intensities would progressively diverge, becoming wider over time. We found no evidence of such behavior in the esophagus and at multiple sites in the epidermis ( Fig. 2B; Fig. S2).
Specifically, several statistical tests of the modality of the distribution were applied, showing no evidence for multiple populations ( Fig. 2C; Fig. S3; Table S3; Suppl. Methods).
To further challenge the hypothesis that there is a single proliferating cell population, we examined  Table S4).
Altogether, these observations strongly argue against scenarios of heterogeneous proliferating cell populations, such as the SC-CP or 2xSC models, at all sites other than in the tail where marked variation between animals precluded reliable inference on cell proliferation rates (Fig. S3). We conclude that in the basal layer of the epidermis at multiple body sites and in the esophagus proliferating cells divide at a unique average rate with highly homogenous cell-cycle periods, consistent with the SP model ( Table 1).

A common analytical approach integrating cell proliferation and lineage tracing data
The ability of lineage tracing to track the behavior of cohorts of proliferating cells and their progeny over time courses extending to many rounds of cell division offers the potential to validate models of homeostasis. Having established the homogeneity in the division rate of basal-layer cells, we then set out to determine whether clonal dynamics across different lineage tracing data sets were consistent with the SP paradigm.
Multiple lineage tracing studies have been published but these used distinct approaches to infer models of cell behavior and did not apply the additional constraint imposed by measuring the cell cycle time distribution 3,4,11 . Computational simulations revealed that the SP, SC-CP and 2xSC models all predict very similar development of clonal features over time, which rendered them hardly distinguishable from lineage tracing data alone (Suppl. Methods). However, as our cell-proliferation analyses do not support the SC-CP and 2xSC paradigms we focused on testing the SP model. By incorporating the measurement of the average division rate, we could reduce the uncertainty in the parameter estimation, a problem that has been generally overlooked in these stochastic models (Fig. S4). In turn, whilst long-term model predictions on clone-size distributions remained largely unaffected by the assumptions on the cell-cycle time distribution, introducing realistic estimates for the distribution of individual cell-cycle lengths affected short-term clone-size predictions, impacting on the inferred parameter values (Fig. S5). This disputes most common, Markovian implementations that take the biologically implausible assumption that cell cycle times are distributed exponentially (i.e. the likeliest time for a cell to divide is immediately after the division that generated it). We therefore developed a robust quantitative approach where cellcycle attributes estimated from H2B-GFP experiments were embodied in (non-Markovian) model simulations, and a subsequent maximum likelihood inference (MLE) method was applied across the available data sets for each body site to challenge whether each of them was consistent with the SP paradigm (Suppl. Methods).

Clonal dynamics in esophageal epithelium
In order to explore in vivo clonal dynamics, we began by studying a new lineage tracing dataset in mouse esophageal epithelium (Fig. 3A). In this experiment we used a strain (Lrig1-cre) in which a tamoxifenregulated form of Cre recombinase and enhanced green fluorescent protein (EGFP) are targeted to one allele of the Lrig1 locus 8,[22][23][24] . We found LRIG1 protein was ubiquitously expressed in the basal layer of esophageal epithelium in wild type mice (Fig. S6A). Consistent with this finding, in Lrig1-cre animals, EGFP, reporting Lrig1 transcription was detected in 94+/-0.3% (s.e.m.) of basal cells (Fig. S6B). These observations indicate Lrig1 is widely expressed in the proliferative compartment of esophageal epithelium and is suitable for lineage tracing of proliferating esophageal keratinocytes (Suppl. Methods).
To track the fate of basal cells, Lrig1-cre mice were crossed with the Rosa26 flConfetti/wt (Confetti) reporter strain which labels cells with one of four possible fluorescent proteins (green, GFP, cyan, CFP, yellow, YFP or red RFP) after recombination ( Fig. S6C-D) 25,26 . In some Cre inducible mouse lines, reporter expressing clones can appear without induction with Tamoxifen. However, no fluorescent protein expression was found in adult uninduced Lrig1-cre/Confetti mice (Table S5) 27 . Next, cohorts of Lrig1-cre/Confetti animals were treated with a low dose of Tamoxifen that resulted in labelling of only 1 in 300 ± 106 (mean ± SEM) basal cells at 10 days post-induction. Clones containing one or more basal cells were imaged in esophageal epithelial wholemounts from at least 3 mice at multiple time points over 6 months following induction ( Fig. 3B; Table S5). Only CFP, YFP and RFP expressing clones were counted because of Lrig1 driven GFP expression in all basal cells.
The clone data set displayed several important features. The density (clones/area) of labelled clones decreased progressively, consistent with clone loss through differentiation, while the number of basal and suprabasal cells in the remaining clones rose ( Fig. 3C; Fig. S7A). The proportion of labelled basal cells remained constant during the experiment, indicating the labelled population was self-maintaining over a six month period, consistent with labelled cells being a representative sample of all proliferating cells in the homeostatic tissue ( Fig. 3C). At late time points, the clone size distribution scaled with time (i.e. the probability of seeing clones larger than x times the average clone size became time-invariant, following a simple exponential f(x) = e -x ) (Fig. S7B). Collectively, these features are hallmarks of neutral competition, in which clonal dynamics result from stochastic cell fates, with an average cell division generating one proliferating and one differentiating daughter cell, a scenario consistent with the SP model (Suppl. 3,28 .

Methods)
The measurement of the average cell division rate (l) and inference of the cell cycle time distribution constrain the fitting of lineage tracing data, providing a stringent test of the candidate SP model ( Fig. S4B-C; Fig. S5E). Within this paradigm unknown parameters are the probability of a progenitor cell division generating two dividing (PP), or two differentiating (DD) daughters (r), and the stratification rate (Γ), which in homeostasis sets the fraction of progenitor cells in the basal layer (r) (Fig. S4A). Our technique for identifying the most appropriate cell-cycle distribution coupled with a MLE grid search discriminated parameter estimates that gave excellent fits on clone-size distributions at both early and late time points for the Lrig1/confetti dataset ( Fig. 3D-E; Table 1; Table S4).
Next, we applied the same approach to an independent, published lineage tracing data set from esophageal epithelium where clones were labelled with YFP by Cre expressed from an inducible Cyp1a1 (Ah) promoter in AhYFP mice 5 . A close fit to the SP model was obtained for very similar parameter values to the ones above, with cell-cycle time constrains resulting in an improved match over short-term clone sizes compared with fits in the original publication ( Fig. 3D-E; Fig. S7C; Table 1; Table S4). As a further validation, we tested our parameterized model against a third, more limited dataset from Krt15-cre PR1 R26 mT/mG mice in which a red-to-green fluorescent reporter was used with inducible Cre expressed from a Krt15 promoter (Suppl. Methods) 29 . Although the SP paradigm was criticized by these authors, it yielded an adequate fit with their own data over the experimental time course ( Fig. S7D; Table S4). The consistent fit of the SP model to three independent lineage tracing data sets using different combinations of transgenic Cre and reporter alleles strongly supports the conclusion that the esophageal epithelium is maintained by a single progenitor population and reinforces the reliability of our parameter estimates.

Clonal dynamics in skin epidermis
We next investigated clonal dynamics in the epidermis through available lineage tracing data sets from the typical interfollicular epidermis of the mouse hind paw, ear and back (dorsum). Applying the MLE approach constrained by the cell-cycle time analysis at each body site yielded improved fits of the SP model to data from paw epidermis in Axin2-cre ERT R26 Rainbow animals and from ear and dorsal interfollicular epidermis in AhYFP mice compared with those fits reported in the original publications ( Fig. S8; Suppl. 11,21,30 . Despite differences in average keratinocyte division rates across territories (l » 2.0, 1.5, 1.2/week for hind paw, ear and dorsum, respectively), all analyzed regions share comparable intermediate proportions of progenitor basal cells r (~55%, the rest corresponding to differentiating basal cells) and a predominance of asymmetric cell divisions (i.e. low inferred values for the probability of symmetric division, r < 0.25) ( Table 1; Table S4).

Methods)
Particularly relevant are the implications for the mode of keratinocyte renewal in back skin, as a previous work claims that two stem cell populations dividing at different rates coexist at this site (2xSC model) 19 .
This argument was supported by a quantitative analysis of H2BGFP dilution patterns in Krt5 tTA /pTRE-H2BGFP mice, a system that differs from that we use above in that mice are treated with Dox to suppress H2B-GFP expression instead of using it as activator (Fig. S9A) 19 . However, in that publication, in the rejection of the SP model, exponential distributions for cell division/cell stratification rates were assumed, which we have here shown to be inappropriate for the short time scale of the experiment (Suppl.  Table S4) 33 .
F Finally, we turned to revisit clonal dynamics in the mouse tail epidermis. Previous studies of tail have argued that the hierarchical SC-CP paradigm applies to proliferating cells in the interscale areas while the SP paradigm describes behavior in the scale regions ( Fig. S3A) 4,20 . These claims were primarily supported by the observation of LRCs in the interscale region in H2BGFP dilution experiments in Krt5 tTA /pTRE-H2BGFP mice. However, our quantitative reanalysis of this data set revealed the SP model fits the reported H2BGFP intensity histograms over time as well as the SC-CP model ( Fig. S10A; Table S4) 20 . Even though we cannot discard the possibility of a subpopulation of slow-cycling stem cells in the tail, such cells would seem to be rare in interscale epidermis (Table S1). Further analysis argued that there was no conflict between the reported tail lineage tracing data and the SP model (Fig S10B-H; Suppl. Methods; Table S4).

Discussion
Overall, we find that combining cell cycle distribution analysis with lineage tracing argues mouse esophageal epithelium and epidermis are maintained by a single population of progenitor cells, with the sole possible exception of the inter-scale compartment of tail skin. The quality of the fit of the SP model to the data is equivalent to or exceeds that of more complex models, rendering the need to invoke additional cell populations redundant. The nine lineage tracing data sets analyzed include a variety of Cre and reporter strain combinations, and are all consistent with the SP model. In addition, live imaging studies of the epidermis are consistent with a single proliferating cell population maintaining the tissue 31 .
Quantitative analysis of cell proliferation in the different tissue types reveals further constraints that must be considered by researchers exploring the appropriateness of alternative models. The original SC-CP and 2xSC models invoked 12% and 30% of basal-layer keratinocytes constitute slow-cycling stem cells, It follows that any slow cycling stem-cell population must have substantially fewer than one slow-cycling cell per thousand basal keratinocytes to be compatible with observations reported here, making it unlikely that such slow cycling cells will make a detectable contribution to tissue homeostasis. The hypothesis that two subpopulations exist, but that they both divide at a similar rate, is hard to sustain in the face of the close agreement of the simpler SP model across all the analyzed data sets.
The improved resolution of parameter estimates reveals differences in cell division rates across the epidermis and the esophagus ( Table 1). Proliferating cells divide rapidly in the esophageal epithelium, on average every ~2.4 days (similar to keratinocyte turnover rate in oral mucosa), while progenitor cells cycle comparatively slower in the epidermis, on average between 3.5-6 days depending on body site 32 .
However, our study suggests individual cell-cycle periods are tightly controlled, showing little variation around average division rate, per territory.
The proportion of progenitor cells in the basal layer and the probability of symmetric cell division outcomes (r ) are similar across body sites ( Table 1). The insight that a substantial proportion of cells in the basal layer will proceed to differentiate rather than divide will be important for the interpretation for the growing body of single cell RNA sequencing data in these tissues 32,33 . In addition, the low values of r we identify give insight into the basis of cell fate determination (Fig. 4). In principle, if every basal cell divides or differentiates with equal probability, as proposed by Leblond, r will be 0.25, as expected from any pair of uncorrelated basal cells 1 . However, this scenario is excluded by our analysis. Instead, the consistent values of r <0.25 indicate the fate of sister cells is preferentially anti-correlated. This phenomenon can be associated to local coordination of neighboring cell stratification and division events, as has been observed in vivo in adult ear epidermis by live imaging 34 . Our results argue that anticorrelation of sister cell fates applies generally in the epidermis and esophagus, pointing to common mechanisms of keratinocyte cell fate regulation. We conclude that the SP model explains squamous epithelial homeostasis.

Wholemount preparation & immunostaining
Esophageal epithelium wholemounts for lineage tracing were prepared as follows: The esophagus was cut longitudinally and the middle two thirds of the tract was incubated for 3 hours in 5 mM EDTA in PBS at 37°C. The epithelium was then peeled away from the underlying submucosa, stretched and fixed for 30 minutes in 4% paraformaldehyde in PBS. Samples were stored in PBS at 4°C until subsequent analysis. Skin pieces of approximately 0.5cm 2 were cut and incubated for 1 hour in 5 mM EDTA in PBS at 37°C. Skin epidermis was then peeled away using fine forceps and processed as described above for the esophageal epithelium. Lineage-tracing data from Ah-cre ERT R26 flEYFP derived clones in esophagus 5 , ear 30 and dorsal epidermis 21 were obtained from experimental colleagues (data available upon request). Data on induced Axin2-cre ERT R26 Rainbow clones in hindpaw 11 and Lgr6-eGFPcre ERT R26 flConfetti in back epidermis 33 were kindly provided by the authors. Data from lineage tracing in scale and interscale tail epidermis 20 was accessed through the online publication material, while authors were unable to provide original data from 4 . Data on Krt15-cre PR1 R26 mT/mG mouse esophagus 29 were retrieved by digitalizing Fig. 2E and Fig. S3B from the original publication. A similar procedure was used to extract Krt5 tTA /pTRE-H2BGFP dilution data from back skin ( Fig. 3 from 19 ) and tail epidermis (Fig. 3K from 4 ).     Transgenic mice for H2BGFP dilution experiments are designed with a first construct, typically targeted to a constitutive promoter, coding either for a tetracycline-controlled transactivator (tTA; Tet-Off system) (i) or a reverse tetracycline-controlled transactivator (rtTA; Tet-On system) (ii). A second construct codes for a Histone 2B-green fluorescent protein fusion (H2BGFP) controlled by a tetracycline-response promoter element (tetO; sometimes referred to as pTRE). Treatment with tetracycline or its derivative doxycycline (Dox) preempts tTA protein from binding to tetO elements in Tet-Off systems, causing repression of pTREcontrolled H2BGFP expression, whilst it is required for binding of rtTA to tetO elements in Tet-On systems, hence having an opposite effect. Dox is administered for induction and withdrawn during the H2BGFP dilution chase in Tet-On mice, while in Tet-Off animals its application gets required for the duration of the experiment.  found to be bimodal in the overall, per-animal modality test (Fig. 2C). Analyses per field of view all resulted non-significant (unimodality).