Mapping single-cell-resolution cell phylogeny reveals cell population dynamics during organ development

Abstract

Mapping the cell phylogeny of a complex multicellular organism relies on somatic mutations accumulated from zygote to adult. Available cell barcoding methods can record about three mutations per barcode, enabling only low-resolution mapping of the cell phylogeny of complex organisms. Here we developed SMALT, a substitution mutation-aided lineage-tracing system that outperforms the available cell barcoding methods in mapping cell phylogeny. We applied SMALT to Drosophila melanogaster and obtained on average more than 20 mutations on a three-kilobase-pair barcoding sequence in early-adult cells. Using the barcoding mutations, we obtained high-quality cell phylogenetic trees, each comprising several thousand internal nodes with 84–93% median bootstrap support. The obtained cell phylogenies enabled a population genetic analysis that estimates the longitudinal dynamics of the number of actively dividing parental cells (Np) in each organ through development. The Np dynamics revealed the trajectory of cell births and provided insight into the balance of symmetric and asymmetric cell division.

Fig. 1: Invention of the SMALT cell barcoding system in yeast.
Fig. 2: Performance of SMALT in fruit fly.
Fig. 3: Building high-quality cell phylogenetic trees of two fly individuals.
Fig. 4: Population reconstruction of the demographic history of ten fly organs.
Fig. 5: The Np dynamics informs the rate of cell births and cell division modes.

Data availability

Raw data have been deposited in the National Center for Biotechnology Information’s Sequence Read Archive with accession numbers PRJNA716791, PRJNA761270 and PRJNA761271. Source data supporting the findings of the present study are provided as online materials for this paper.

Code availability

Codes for processing the data are available at https://github.com/CellLineage/SLOTH.

Acknowledgements

We are grateful to J. Zhang for inspiration, Y. Rong for information on the phiC31 system and for providing related fly strains, the Tsinghua Fly Center at Tsinghua University for providing fly strains, Y. Zhao for suggestions and guidance on fly embryo microinjection, and T. Tang, C-I. Wu, X. Shen and members of the Wu laboratory for help with fly work. We thank H. Chen, W. Qian, Y. Zhang, W. Zhai, C-I. Wu, X. Huang, Z. Wang and J. Yang for discussion and comments on the paper. This work was supported by grants from the National Key R&D Program of China (grant no. 2017YFA0103504), the National Natural Science Foundation of China (grant nos. 31630042, 31970570 and 32070687), the Shanghai Municipal Science and Technology Major Project (grant no. 2017SHZDZX01) and the Guangdong Special Support Program (grant no. 2017TX04R395).

Author information

Contributions

X.H., C.Y. and L.L. designed the study. C.Y., K.L. and L.L. conducted the yeast assays. K.L. and H.G. conducted the fly experiments. S.D., C.Y., Z.Y., K.L., J.W. and X.H. analyzed the data. X.H. supervised the study and wrote the paper with input from C.Y., S.D., K.L. and other coauthors.

Corresponding authors

Correspondence to Li Liu or Xionglei He.

Ethics declarations

Competing interests

A patent related to the developed technique has been filed.

Additional information

Peer review information Nature Methods thanks Hugo Bellen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Madhura Mukhopadhyay was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 SMALT mutations depend heavily on cell divisions.

a. Schematic of the Tet-On system used in the experiment. AI and GFP are both controlled by the Tet promoter, so the induced expression of AI can be inferred from GFP. b. A single diploid clone was grown to saturation (~10 h, 10⁷ cells per ml), then split into four independent groups (~10⁶ cells in 100 μl), followed by ~4 h of starvation in PBS to deplete the original YPDA medium. The yeast cells were then washed twice with ddH2O and grown for ~18 h in the corresponding media (dividing group: YPDA Dox+ and Dox-; non-dividing group: PBS Dox+ and YEKAC Dox+). Distribution of the mutation rate per site along the 120 barcode sequence. The mutation rate of a site is the number of reads with mutations at the site divided by the total number of reads covering the site (Methods). The relative position in the barcode is indicated and each dot represents a site. The two iSceI binding sites are shaded with grey rectangles and marked with black arrows. A cutoff of 0.1% (dashed line) is used as the threshold for over-background mutations. Clones 1, 2 and 3 represent three independent initial single clones. c. Expression induction (measured by GFP) in Dox+ media supplied with 10 μg/ml doxycycline. The photos were taken after the yeast cells had stayed for ~18 h in each of the media. GFP levels are comparable between the yeast cells in the dividing and non-dividing Dox+ media. The bright-field, GFP and merged images are shown, respectively, with the scale bar = 100 μm (40X objective with PH40 mode).
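The per-site mutation rate and the 0.1% cutoff described in panel b can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the function names and the read counts in the usage lines are hypothetical, and only the ratio (mutated reads divided by covering reads) and the 0.1% threshold come from the caption.

```python
from typing import List, Sequence


def per_site_mutation_rate(mutant_reads: Sequence[int],
                           total_reads: Sequence[int]) -> List[float]:
    """Return, for each site, mutated reads / reads covering the site.

    Sites with zero coverage get NaN so they are never counted as mutated.
    """
    if len(mutant_reads) != len(total_reads):
        raise ValueError("both inputs must have one entry per site")
    rates = []
    for mutated, covered in zip(mutant_reads, total_reads):
        rates.append(mutated / covered if covered > 0 else float("nan"))
    return rates


def sites_above_background(rates: Sequence[float],
                           cutoff: float = 0.001) -> List[int]:
    """Return 0-based indices of sites whose rate exceeds the cutoff (0.1%)."""
    # NaN compares False against any number, so uncovered sites are skipped.
    return [i for i, rate in enumerate(rates) if rate > cutoff]


if __name__ == "__main__":
    # Hypothetical counts for three sites, for illustration only.
    rates = per_site_mutation_rate([0, 5, 30], [1000, 10000, 10000])
    print(rates)
    print(sites_above_background(rates))
```

Returning NaN for uncovered sites is a design choice of this sketch; a real pipeline might instead drop such sites or require a minimum coverage.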

Extended Data Fig. 2 Tubulin-GAL4 ubiquitously drives UAS-GFP expression.

a. Images of developing D. melanogaster embryos at the stage of fast cleavage. Embryos were collected within 30 min after egg laying. Over five larvae were examined and a random sample is shown for each organ. Bar = 100 μm. b. Images of 10 dissected organs from a third instar larva. Bar = 200 μm.

Extended Data Fig. 3 Genome-wide off-target analysis of the SMALT system in fly.

We compared flies with the SMALT system (AID+) to those without the system (NC). Six individuals of each category were examined, and their genomes were subjected to Illumina sequencing (PE150). In total we obtained ~0.8 billion reads after trimming with trim_galore, leaving on average ~108× sequencing coverage for each genome. The processed sequences were mapped to the Drosophila melanogaster genome (BDGP6) using bwa-mem and duplicates were removed, followed by variant calling with GATK HaplotypeCaller. Alternative loci with allele frequency over 10% were defined as polymorphisms and excluded from further analysis. a. The number of sites with detected putative somatic mutations in each genome. The six AID+ individuals and six NC individuals are compared under a variety of mutant allele frequencies. The numbers in parentheses are the average coverage of reliably mapped reads on the genome. There are no apparent differences between the AID+ and NC groups. b. The relative frequency of the different mutation types for the putative somatic mutations detected in each of the genomes. The frequency of AID signature mutations (C > T and G > A) is similar between the AID+ and NC individuals. c. To increase the sensitivity we pooled the reads of all AID+ individuals (and likewise of all NC individuals) and focused on the five potential off-target sites, each with ≤2 mismatches to the 18 bp iSceI binding site. For each site we considered the two flanking regions of 150 bp in length (hence 18 + 2 × 150 = 318 bp), and identified the fragments (paired reads) fully covering the 318 bp regions. The bar plots represent the frequency of fragments harboring any kind of mutation (n shows the number of fragments analyzed in the region). The error bars show the standard error of the frequency. We observed a higher frequency of mutated fragments in the AID+ group than in the NC group. However, the AID-specific mutations (C > T and G > A) appear similar between the two groups, as shown by the pie charts. This suggests that the higher frequency of mutated fragments is not due to the AID enzymatic activity. A possible explanation is that strong overexpression of a heterologous protein stresses genome stability by consuming a large amount of cellular energy.
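The window arithmetic and the fragment-level frequency in panel c can be illustrated with a short Python sketch. The caption does not say how the standard error was computed, so the binomial formula sqrt(p(1 − p)/n) used here is an assumption; the function names, the coordinate convention and the counts in the usage lines are hypothetical.

```python
import math
from typing import Tuple


def window_length(site_bp: int = 18, flank_bp: int = 150) -> int:
    """Binding site plus two flanks: 18 + 2 * 150 = 318 bp."""
    return site_bp + 2 * flank_bp


def fully_covers(frag_start: int, frag_end: int,
                 win_start: int, win_end: int) -> bool:
    """True if a fragment (paired reads) spans the whole window.

    Coordinates are treated as inclusive positions on one reference.
    """
    return frag_start <= win_start and frag_end >= win_end


def mutated_fraction_with_se(n_mutated: int,
                             n_fragments: int) -> Tuple[float, float]:
    """Fraction of mutated fragments and its (assumed) binomial standard error."""
    if n_fragments <= 0:
        raise ValueError("need at least one fragment")
    p = n_mutated / n_fragments
    se = math.sqrt(p * (1.0 - p) / n_fragments)
    return p, se


if __name__ == "__main__":
    print(window_length())
    # Hypothetical counts: 20 mutated fragments out of 100 covering a site.
    print(mutated_fraction_with_se(20, 100))
```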

Extended Data Fig. 4 The cell tree of Fly-1 that comprises 5,003 alleles, with no organ information shown.

The bootstrap supports on the early internal branches are generally low. The first 30 cell generations are highlighted regarding the bootstrap values, with the red line showing the median and the blue lines showing the 25th to 75th percentiles.

Extended Data Fig. 5 The cell tree of Fly-2 that comprises 5,421 alleles, with no organ information shown.

The bootstrap supports on the early internal branches are generally low. The first 30 cell generations are highlighted regarding the bootstrap values, with the red line showing the median and the blue lines showing the 25th to 75th percentiles.

Extended Data Fig. 6 Validation of the coalescent method by simulations.

a. A schematic diagram showing how the coalescent rate (CR) of each small time interval is calculated from a hypothetical phylogenetic tree. The reconstruction is conducted until the 40th generation, by which point 95% of the terminal nodes in the tree have been examined. b. The development of an organ is simulated, with the actual Np trajectory defined by the given parameters (Methods). As the sampling proportion and the per-generation mutation rate increase, the Np trajectory can be reliably reconstructed by computing the CR of each generation (Np = 0.5(1/CR + 1)). c. The estimated Np trajectories are robust to different modes of cell death.
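The conversion stated in panel b, Np = 0.5(1/CR + 1), can be sketched in Python as below. The function names are hypothetical. The inverse relation CR = 1/(2Np − 1) is only the stated formula rearranged; one reading consistent with it, assumed here rather than taken from the paper, is that Np parental cells each divide into two daughters, so two randomly chosen daughters share a parent with probability 1/(2Np − 1).

```python
def np_from_cr(cr: float) -> float:
    """Number of actively dividing parental cells from a coalescent rate.

    Implements Np = 0.5 * (1/CR + 1) as given in the caption.
    """
    if not 0.0 < cr <= 1.0:
        raise ValueError("CR must lie in (0, 1]")
    return 0.5 * (1.0 / cr + 1.0)


def cr_from_np(n_parents: float) -> float:
    """Inverse of np_from_cr: CR = 1 / (2 * Np - 1).

    Assumed interpretation: with Np parents each giving two daughters,
    a random pair of daughters are sisters with probability 1/(2Np - 1).
    """
    if n_parents < 1:
        raise ValueError("Np must be at least 1")
    return 1.0 / (2.0 * n_parents - 1.0)


if __name__ == "__main__":
    # Round trip over a few illustrative population sizes.
    for n_parents in (1, 2, 10, 1000):
        cr = cr_from_np(n_parents)
        print(n_parents, cr, np_from_cr(cr))
```

Under this formula a CR of 1 corresponds to a single dividing parent, and smaller CR values map to larger Np.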

Extended Data Fig. 7 The coalescent method is not sensitive to population structure.

a. The mathematical proof that two random cells have an equal probability of coalescing in the immediately preceding generation in a panmictic population (A) and in a structured population (B = B1 + B2) of the same total size. b. A and B are two simulated populations with identical total Np trajectories, while B is composed of two separate sub-populations (B1 and B2), each with a distinct Np trajectory. The simulation procedure is similar to that of Extended Data Fig. 6b, and the per-generation mutation rate is set to one. The coalescent method performs well in the structured population B, as shown by the consistency between the reconstructed Np trajectories (blue curves) and the actual Np trajectory (black curves), although a higher sampling proportion is required to achieve the same performance as in the panmictic counterpart population A (yellow curves).

Supplementary information

Supplementary Information

Supplementary Tables 1–3, Figs. 1–15, Notes I and II and Datasets I and II.

Reporting Summary

Supplementary Data

Mutation rate estimated with a maximum-likelihood method.

Supplementary Data

Raw unprocessed image.

Source data

Source Data Fig. 1

Mutation rate estimated with a maximum-likelihood method.

Source Data Fig. 2

Mutation for each readout processed from Sanger sequencing and mutation for each allele processed from PacBio data in binary format.

Source Data Fig. 3

Two phylogenetic trees in Newick format.

Source Data Fig. 4

Estimation result from instantaneous coalescence analysis.

Source Data Fig. 5

Estimation result from instantaneous coalescence analysis.

Source Data Extended Data Fig. 1

Mutation table for each sample.

Source Data Extended Data Fig. 3

Mutation table for each sample.

Cite this article

Liu, K., Deng, S., Ye, C. et al. Mapping single-cell-resolution cell phylogeny reveals cell population dynamics during organ development. Nat Methods 18, 1506–1514 (2021). https://doi.org/10.1038/s41592-021-01325-x

  • DOI: https://doi.org/10.1038/s41592-021-01325-x
