Abstract
Studying cellular and developmental processes in complex multicellular organisms can require the non-destructive observation of thousands to billions of cells deep within an animal. DNA recorders address the staggering difficulty of this task by converting transient cellular experiences into mutations at defined genomic sites that can be sequenced later in high throughput. However, existing recorders act primarily by erasing DNA. This is problematic because, in the limit of progressive erasure, no record remains. We present a DNA recorder called CHYRON (Cell History Recording by Ordered Insertion) that acts primarily by writing new DNA through the repeated insertion of random nucleotides at a single locus in temporal order. To achieve in vivo DNA writing, CHYRON combines Cas9, a homing guide RNA and the template-independent DNA polymerase terminal deoxynucleotidyl transferase. We successfully applied CHYRON as an evolving lineage tracer and as a recorder of user-selected cellular stimuli.

Data availability
All NGS datasets were deposited at the NCBI’s Sequence Read Archive under accession no. PRJNA561027. All plasmids and full sequences are available at Addgene. See Supplementary Table 4 for a guide to these data and reagents. Please contact C.C.L. for cell lines.
Code availability
All scripts are available at https://github.com/liusynevolab.
References
McDole, K. et al. In toto imaging and reconstruction of post-implantation mouse development at the single-cell level. Cell 175, 859–876 (2018).
McKenna, A. et al. Whole-organism lineage tracing by combinatorial and cumulative genome editing. Science 353, aaf7907 (2016).
Perli, S. D., Cui, C. H. & Lu, T. K. Continuous genetic recording with self-targeting CRISPR–Cas in human cells. Science 353, aag0511 (2016).
Kalhor, R., Mali, P. & Church, G. M. Rapidly evolving homing CRISPR barcodes. Nat. Methods 14, 195–200 (2017).
Frieda, K. L. et al. Synthetic recording and in situ readout of lineage information in single cells. Nature 541, 107–111 (2017).
Schmidt, S. T., Zimmerman, S. M., Wang, J., Kim, S. K. & Quake, S. R. Quantitative analysis of synthetic cell lineage tracing using nuclease barcoding. ACS Synth. Biol. 6, 936–942 (2017).
Sheth, R. U., Yim, S. S., Wu, F. L. & Wang, H. H. Multiplex recording of cellular events over time on CRISPR biological tape. Science 358, 1457–1461 (2017).
Tang, W. & Liu, D. R. Rewritable multi-event analog recording in bacterial and mammalian cells. Science 360, eaap8992 (2018).
Shipman, S. L., Nivala, J., Macklis, J. D. & Church, G. M. CRISPR–Cas encoding of a digital movie into the genomes of a population of living bacteria. Nature 547, 345–349 (2017).
Shipman, S. L., Nivala, J., Macklis, J. D. & Church, G. M. Molecular recordings by directed CRISPR spacer acquisition. Science 353, aaf1175 (2016).
Kalhor, R. et al. Developmental barcoding of whole mouse via homing CRISPR. Science 361, eaat9804 (2018).
Raj, B. et al. Simultaneous single-cell profiling of lineages and cell types in the vertebrate brain. Nat. Biotechnol. 36, 442–450 (2018).
Spanjaard, B. et al. Simultaneous lineage tracing and cell-type identification using CRISPR–Cas9-induced genetic scars. Nat. Biotechnol. 36, 469–473 (2018).
Sheth, R. U. & Wang, H. H. DNA-based memory devices for recording cellular events. Nat. Rev. Genet. 19, 718–732 (2018).
Hwang, B. et al. Lineage tracing using a Cas9-deaminase barcoding system targeting endogenous L1 elements. Nat. Commun. 10, 1234 (2019).
Chan, M. M. et al. Molecular recording of mammalian embryogenesis. Nature 570, 77–82 (2019).
Bowling, S. et al. An engineered CRISPR–Cas9 mouse line for simultaneous readout of lineage histories and gene expression profiles in single cells. Cell 181, 1410–1422 (2020).
Alemany, A., Florescu, M., Baron, C. S., Peterson-Maduro, J. & van Oudenaarden, A. Whole-organism clone tracing using single-cell sequencing. Nature 556, 108–112 (2018).
Landau, N. R., Schatz, D. G., Rosa, M. & Baltimore, D. Increased frequency of N-region insertion in a murine pre-B-cell line infected with a terminal deoxynucleotidyl transferase retroviral expression vector. Mol. Cell Biol. 7, 3237–3243 (1987).
Pryor, J. M. et al. Ribonucleotide incorporation enables repair of chromosome breaks by nonhomologous end joining. Science 361, 1126–1129 (2018).
Wang, T., Wei, J. J., Sabatini, D. M. & Lander, E. S. Genetic screens in human cells using the CRISPR–Cas9 system. Science 343, 80–84 (2013).
Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).
Zuo, Z. & Liu, J. Cas9-catalyzed DNA cleavage generates staggered ends: evidence from molecular dynamics simulations. Sci. Rep. 6, 37584 (2016).
Gisler, S. et al. Multiplexed Cas9 targeting reveals genomic location effects and gRNA-based staggered breaks influencing mutation efficiency. Nat. Commun. 10, 1598 (2019).
Shannon, C. E. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948).
Motea, E. A. & Berdis, A. J. Terminal deoxynucleotidyl transferase: the story of a misguided DNA polymerase. Biochim. Biophys. Acta 1804, 1151–1166 (2010).
Liu, M. et al. Genomic discovery of potent chromatin insulators for human gene therapy. Nat. Biotechnol. 33, 198–203 (2015).
Semenza, G. L. Hypoxia-inducible factors in physiology and medicine. Cell 148, 399–408 (2012).
Rankin, E. B. & Giaccia, A. J. Hypoxic control of metastasis. Science 352, 175–180 (2016).
Ede, C., Chen, X., Lin, M.-Y. & Chen, Y. Y. Quantitative analyses of core promoters enable precise engineering of regulated gene expression in mammalian cells. ACS Synth. Biol. 5, 395–404 (2016).
McKenna, A. & Gagnon, J. A. Recording development with single cell dynamic lineage tracing. Development 146, dev169730 (2019).
Fu, Y., Sander, J. D., Reyon, D., Cascio, V. M. & Joung, J. K. Improving CRISPR–Cas nuclease specificity using truncated guide RNAs. Nat. Biotechnol. 32, 279–284 (2014).
Palluk, S. et al. De novo DNA synthesis using polymerase–nucleotide conjugates. Nat. Biotechnol. 36, 645–650 (2018).
Barthel, S., Palluk, S., Hillson, N. J., Keasling, J. D. & Arlow, D. H. Enhancing terminal deoxynucleotidyl transferase activity on substrates with 3′ terminal structures for enzymatic de novo DNA synthesis. Genes 11, 102 (2020).
Lee, H. H., Kalhor, R., Goela, N., Bolot, J. & Church, G. M. Terminator-free template-independent enzymatic DNA synthesis for digital information storage. Nat. Commun. 10, 2383 (2019).
Zamft, B. M. et al. Measuring cation dependent DNA polymerase fidelity landscapes by deep sequencing. PLoS ONE 7, e43876 (2012).
Marblestone, A. H. et al. Physical principles for scalable neural recording. Front. Comput. Neurosci. 7, 137 (2013).
Glaser, J. I. et al. Statistical analysis of molecular signal recording. PLoS Comput. Biol. 9, e1003145 (2013).
Bhan, N. J. et al. Recording temporal data onto DNA with minutes resolution. Preprint at bioRxiv https://doi.org/10.1101/634790 (2019).
Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823–826 (2013).
Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420–424 (2016).
Yan, Q., Bartz, S., Mao, M., Li, L. & Kaelin, W. G. The hypoxia-inducible factor 2α N-terminal and C-terminal transactivation domains cooperate to promote renal tumorigenesis in vivo. Mol. Cell Biol. 27, 2092–2102 (2007).
Campeau, E. et al. A versatile viral system for expression and depletion of proteins in mammalian cells. PLoS ONE 4, e6529 (2009).
Tsai, S. Q. et al. Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing. Nat. Biotechnol. 32, 569–576 (2014).
Waldo, G. S., Standish, B. M., Berendzen, J. & Terwilliger, T. C. Rapid protein-folding assay using green fluorescent protein. Nat. Biotechnol. 17, 691–695 (1999).
Yang, B., Gathy, K. N. & Coleman, M. S. Mutational analysis of residues in the nucleotide binding domain of human terminal deoxynucleotidyl transferase. J. Biol. Chem. 269, 11859–11868 (1994).
Repasky, J. A. E., Corbett, E., Boboila, C. & Schatz, D. G. Mutational analysis of terminal deoxynucleotidyltransferase-mediated N-nucleotide addition in V(D)J recombination. J. Immunol. 172, 5478–5488 (2004).
Lee, M. E., DeLoache, W. C., Cervantes, B. & Dueber, J. E. A highly characterized yeast toolkit for modular, multipart assembly. ACS Synth. Biol. 4, 975–986 (2015).
Chen, S. et al. Genome-wide CRISPR screen in a mouse model of tumor growth and metastasis. Cell 160, 1246–1260 (2015).
Tanida-Miyake, E., Koike, M., Uchiyama, Y., Tanida, I. & Sato, M. Optimization of mNeonGreen for Homo sapiens increases its fluorescent intensity in mammalian cells. PLoS ONE 13, e0191108 (2018).
Kleinstiver, B. P. et al. Engineered CRISPR–Cas9 nucleases with altered PAM specificities. Nature 523, 481–485 (2015).
Zhang, J., Kobert, K., Flouri, T. & Stamatakis, A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30, 614–620 (2013).
Acknowledgements
We thank S.K. Paul and T.C. Lone for technical assistance. We thank the following people for helpful discussions: C. Guerrero-Juarez, C. Li, J. Zimak, Q. Nie and all members of the Liu laboratory. We thank the following people for plasmids: Y. Chen, K. Joung, W. Kaelin, G. Church, D. Liu, E. Campeau, P. Kaufman, T. Lu, P. Sharp, F. Zhang, K. Oka and I. Tanida. This work was made possible, in part, through access to the Genomics High Throughput Facility Shared Resource of the Cancer Center Support Grant (P30CA-062203) at the University of California, Irvine and NIH shared instrumentation grants 1S10RR025496-01, 1S10OD010794-01 and 1S10OD021718-01. This work was funded by NIH grants 1DP2GM119163-01 and 1R21GM126287-01 to C.C.L., AHA Predoctoral and NSF Graduate Research fellowships to C.K.C. and a fellowship from the NSF-Simons Center for Multiscale Cell Fate Research (NSF award 1763272) to T.B.L.
Author information
Contributions
T.B.L. and C.C.L. designed experiments. C.K.C. performed NGS library preparation for hypoxia recording experiments; T.B.L. and J.H.G. performed all other experiments, with assistance from M.W.S. C.C.L. and T.B.L. developed hypoxia recording protocols, G.L. determined how to remove unedited CHYRON sequences during NGS library preparation, and J.H.G. and T.B.L. developed all other NGS library preparation protocols. T.B.L., J.H.G. and C.K.C. established cell lines. T.B.L., J.H.G., C.K.C., G.L. and M.F. cloned plasmid vectors. T.B.L., M.W.S., E.F., B.S.A., X.X. and C.C.L. discussed experimental analyses. M.W.S. and B.S.A. wrote lineage reconstruction scripts and then performed initial reconstructions for the experiment shown in Fig. 5; T.B.L. performed all other lineage analyses. E.F. wrote code for the analysis of NGS data, which was subsequently edited by M.W.S., B.L. and T.B.L. B.L., M.F. and T.B.L. analyzed the proportions of all 2-, 3- and 4-nt sequences in CHYRON insertions. E.F. wrote the description of NGS analysis in the Methods, and J.H.G. and T.B.L. wrote the remainder of the Methods. T.B.L. and C.C.L. wrote the remainder of the paper, with input from all authors, especially C.K.C. C.C.L. procured funding and oversaw the project.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Chemical Biology thanks Randall Platt and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Detailed model for the progressive accumulation of insertions at the CHYRON locus.
The nucleotides initially added by TdT may be ribonucleotides (ref. 20: Pryor et al., 2018).
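The progressive accumulation described in this model can be pictured with a toy simulation. This is only a sketch under simplifying assumptions: the function names, the uniform nucleotide choice and the cap of three nucleotides per round are illustrative and are not taken from the paper or its scripts. It shows the one property that matters for ordered recording: each new round extends the existing record, so earlier states remain readable as prefixes.

```python
import random

NUCLEOTIDES = "ACGT"

def write_round(insertion, rng, max_added=3):
    """Append 1..max_added random nucleotides to the end of the record.

    Assumption: nucleotides are drawn uniformly; the real enzyme is biased.
    """
    n_added = rng.randint(1, max_added)
    new_bases = "".join(rng.choice(NUCLEOTIDES) for _ in range(n_added))
    return insertion + new_bases

def simulate(rounds, seed=0):
    """Return the record after each round, starting from the empty root."""
    rng = random.Random(seed)
    history = [""]
    for _ in range(rounds):
        history.append(write_round(history[-1], rng))
    return history

if __name__ == "__main__":
    for state in simulate(4):
        print(repr(state))
```

Because every state is a prefix of the next, two cells that share a long prefix in this toy model diverged later than two cells that share only a short one.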
Extended Data Fig. 2 TdT writes stretches of random nucleotides at a Cas9-induced DSB in primary cells.
a, Expression of TdT promoted insertion mutations. Adult human primary dermal fibroblasts were nucleofected with plasmids expressing Cas9 and TdT, or Cas9 alone, and an sgRNA against a genomic site (site3, as in Fig. 2). After 7 days, cells were collected, processed, and analyzed as described in Fig. 2a and Methods. Each point represents a single technical replicate. (Three replicates were assayed.) b, Expression of TdT resulted in longer insertion mutations than those minority insertions created in the presence of Cas9 alone, suggesting that TdT acts as a DNA writer. From the pool of pure insertions, the average length was calculated and plotted. Each point represents a single technical replicate. (Three replicates were assayed.) c, Insertion sequences generated by TdT had the same nucleotide biases in primary fibroblasts as in HEK293T cells. The proportions of each nucleotide (on the top strand) found in all pure insertion sequences 4 bp in length were calculated and plotted. Each point represents a single technical replicate. (HEK293T data from Fig. 2c, sequence ‘GT’).
Extended Data Fig. 3 TdT-mediated insertions are added 3 bp upstream of the PAM.
For the pure insertions shown in Fig. 2a, the position of the insertion was determined, if the insertion sequence made this determination possible. Insertions were annotated as having an ‘indeterminate’ position, for example, if the 3′ nt of the insertion was identical to the protospacer nt 5′ of where the insertion was placed. Each point represents the mean of two technical replicates of a single biological replicate.
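The reason some positions cannot be determined can be illustrated with a short sketch. The function names and the four-letter toy reference below are hypothetical and are not drawn from the authors' analysis code. The idea is simply to list every site at which a single contiguous insertion would reproduce the observed sequence; more than one candidate site means the position is indeterminate.

```python
def possible_insertion_sites(reference, edited):
    """Return every index in `reference` at which one contiguous insertion
    could have been placed to yield `edited`."""
    added = len(edited) - len(reference)
    if added <= 0:
        return []  # not a pure insertion
    sites = []
    for pos in range(len(reference) + 1):
        # A pure insertion at `pos` leaves the reference prefix and suffix intact.
        prefix_ok = edited[:pos] == reference[:pos]
        suffix_ok = edited[pos + added:] == reference[pos:]
        if prefix_ok and suffix_ok:
            sites.append(pos)
    return sites

def is_indeterminate(reference, edited):
    """True when more than one insertion site explains the edited sequence."""
    return len(possible_insertion_sites(reference, edited)) > 1

if __name__ == "__main__":
    print(possible_insertion_sites("ACGT", "ACTGGT"))  # one candidate site
    print(possible_insertion_sites("ACGT", "ACTCGT"))  # two candidate sites
```

In the second call, the inserted sequence read at the later site ends in C, the same nucleotide that sits immediately 5′ of that site in the reference, which is the situation the caption describes.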
Extended Data Fig. 4 Further characterization of CHYRON20.
a, Schematic of 293T-CHYRON20. b, Western blots of samples shown in Fig. 3a–c. c, The Shannon entropy was calculated at each timepoint of this experiment.
Extended Data Fig. 5 Further characterization of CHYRON20 over successive rounds of activity.
a, Plan of the experiment for b, Fig. 3d, and Supplementary Fig. 3. b, Cas9 and TdT mediated multiple rounds of editing on an integrated hgRNA. The 293T-CHYRON20 cell line was transfected with Cas9 and TdT to induce insertions, then single colonies were isolated. The clonal isolate shown bears the insertions CTGAAAAACT and CCT. This cell line was treated and its sequences were analyzed as in Fig. 3d. Editing outcomes were classified as the root CHYRON20 sequence (not shown), deletions, both dominant insertions (not shown), any insertion containing these insertions or a shortened version as a prefix (CTGAAAAACT+, CCT+, CTGAA+), any insertion containing the sequences CTGAA or CCT other than as a prefix (*CTGAA* and *CCT*), or other insertions. Each point represents a single biological replicate.
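The insertion categories in panel b could be assigned with a few string checks, as in the sketch below. It assumes that root sequences and deletions have already been separated upstream and that only insertion sequences reach this function. The function name and the order in which the checks are applied (exact match, then longest prefix first, then internal motif) are assumptions, since the caption does not specify precedence.

```python
DOMINANT = ("CTGAAAAACT", "CCT")            # insertions carried by the clonal isolate
PREFIXES = ("CTGAAAAACT", "CCT", "CTGAA")   # full or shortened versions used as prefixes
MOTIFS = ("CTGAA", "CCT")                   # sequences searched for away from the prefix

def classify_insertion(insertion):
    """Assign one insertion sequence to a category label from the caption."""
    if insertion in DOMINANT:
        return "dominant"
    # Prefix categories: the longer CTGAAAAACT is tested before CTGAA.
    for prefix in PREFIXES:
        if insertion.startswith(prefix):
            return prefix + "+"
    # Motif present, but not at the start of the insertion.
    for motif in MOTIFS:
        if motif in insertion:
            return "*" + motif + "*"
    return "other"

if __name__ == "__main__":
    for seq in ("CCT", "CTGAAAAACTGG", "CTGAAGG", "GGCTGAAT", "GGGG"):
        print(seq, classify_insertion(seq))
```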
Extended Data Fig. 6 Further characterization of CHYRON20i and CHYRON16i.
a, From the insertions detected in the experiment shown in Fig. 5, the proportions of all possible single-nt, 2-nt, 3-nt, and 4-nt sequences were determined, and the Shannon entropy calculated. The Shannon entropy was divided by the length of the sequences considered to calculate the bits per bp encoded in the insertions. b, CHYRON loci with an initial hgRNA length of 20 nt accumulated insertions quickly, with a gradual increase in length, whereas those with an initial hgRNA length of 16 nt accumulated insertions more slowly, ending up with a much longer insertion distribution. In the experiment shown in Fig. 4, 293T-CHYRON20i and 293T-CHYRON16i cells were transfected with a plasmid expressing Cas9 and TdT for the indicated time before collection. Cells were re-transfected every 3 days. The CHYRON locus was analyzed by NGS and each sequence was annotated as root, pure insertion or any sequence that involves a loss of information. The percentage of total sequences that are a pure insertion of the indicated length was plotted. Each point represents a single technical replicate. (Some points overlap; three replicates were assayed.)
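The bits-per-bp calculation in panel a can be sketched as follows. The Shannon entropy of the observed k-nt proportions p is H = -Σ p log2(p), and dividing H by k gives bits per bp, with a ceiling of 2 bits per bp for uniformly random DNA. The function names are illustrative, and counting k-mers with a sliding window over each insertion is an assumption. The caption does not say how the sequences were enumerated.

```python
from collections import Counter
from math import log2

def kmer_counts(insertions, k):
    """Count every k-nt window across all insertion sequences."""
    counts = Counter()
    for seq in insertions:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

def bits_per_bp(insertions, k):
    """Shannon entropy of the k-mer proportions, divided by k."""
    counts = kmer_counts(insertions, k)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sequences of length k were found")
    entropy = -sum((n / total) * log2(n / total) for n in counts.values())
    return entropy / k

if __name__ == "__main__":
    toy = ["ACGT", "TTGCA", "GGAC"]  # made-up insertions for illustration
    for k in (1, 2, 3, 4):
        print(k, round(bits_per_bp(toy, k), 3))
```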
Extended Data Fig. 7 CHYRON17 delivered to primary cells by virus accumulates insertion mutations.
a, Plan of the experiment. Cells were infected with two lentiviruses, one expressing Cas9, at a multiplicity of infection (MOI) of 15, and one expressing TdT and an hgRNA with a 17 nt initial spacer length, at an MOI of 30, then grown for the indicated time before collection. b, The CHYRON locus was analyzed by NGS and each sequence was annotated as in Fig. 2a. c, From these data, the average length of all pure insertions was calculated. For b and c, each point represents a single biological replicate.
Extended Data Fig. 8 Reconstruction of cell relatedness by DNA recording in primary cells.
a, Plan of the experiment. This procedure was performed in quadruplicate to generate 16 final wells. 76,000 human adult primary dermal fibroblasts growing in 2 wells of a 24-well plate were infected with lentiviruses carrying CHYRON17 at high MOI as described in Extended Data Fig. 7, then re-plated after 3 days to begin the experiment. Each well was split evenly into two new wells after 4 days, then split again after 14 more days, then collected 4 days after that. The CHYRON locus was sequenced, and unique insertions enumerated for each well. Low-abundance, artifactual insertions were removed (see Supplementary Fig. 4b and Methods). Lineage reconstruction was performed as in Fig. 5. b, Dendrograms for reconstruction using all unique insertions 7–15 bp in length.
Extended Data Fig. 9 Reconstruction of cell relatedness by DNA recording requires high information when sampling is limited.
For the experiment shown in Fig. 5, lineage reconstruction under sampling constraints would have been unsuccessful with a recorder that makes use of an hgRNA and Cas9 only. For each population in the experiment, the data set was degraded at random so that only 20% of all unique insertions remained. Then, each insertion was truncated so that the proportions of insertions encoding each amount of self-information matched the proportions of mutated hgRNA sequences encoding that amount of self-information in a published dataset. This pipeline was run five times and the number of correctly reconstructed relationships was calculated in the following way: for each well, the reconstruction was awarded one point for grouping the well with the proper sibling well and one point for grouping the well with at least 2 of the 3 other wells in its clade. Because the relationships among 16 wells were reconstructed, a maximum of 32 points is possible. a, Each point represents a replicate of the entire truncation and degradation process. For each reconstruction, the insertion length cutoff that produced the most accurate reconstruction for that sample was used. b, Representative dendrograms for reconstruction from 20% sampling data before or after truncation. For reference, the reconstruction on the left had a score of 30, and the reconstruction on the right had a score of 18. c, Reconstructions were scored for each minimum insertion length (in bp) used for reconstruction, which is noted below each bar. Results were plotted as in (a).
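The point-based scoring rule described above can be written out as a small function. This is a sketch, not the authors' script: it assumes the reconstruction has already been reduced to a predicted sibling and a predicted four-well group for each well, and the dictionary layout, function names and well numbering are hypothetical.

```python
def score_reconstruction(true_sibling, true_clade, pred_sibling, pred_clade):
    """Award up to two points per well, following the caption's rule."""
    score = 0
    for well in true_sibling:
        # One point for being grouped with the proper sibling well.
        if pred_sibling.get(well) == true_sibling[well]:
            score += 1
        # One point for being grouped with at least 2 of the 3 other clade wells.
        others = true_clade[well] - {well}
        recovered = (pred_clade.get(well, set()) - {well}) & others
        if len(recovered) >= 2:
            score += 1
    return score

def example_truth():
    """Hypothetical layout: 16 wells, sibling pairs (0,1), (2,3), ..., clades of 4."""
    wells = range(16)
    true_sibling = {w: w ^ 1 for w in wells}
    true_clade = {w: set(range(4 * (w // 4), 4 * (w // 4) + 4)) for w in wells}
    return true_sibling, true_clade

if __name__ == "__main__":
    sib, clade = example_truth()
    print(score_reconstruction(sib, clade, sib, clade))  # perfect reconstruction
```

With 16 wells and two points available per well, a perfect reconstruction reaches the 32-point maximum stated in the caption.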
Extended Data Fig. 10 293T-CHYRON16 insertions grow longer with increasing duration of exposure to DMOG.
For the experiment shown in Fig. 6d, the length of each insertion recorded in each condition was tabulated and plotted. At each dose, the timepoints were significantly different by one-way Welch analysis of variance (ANOVA). For the 0.25 mM dose, F = 17.659, p < 0.001; for the 0.5 mM dose, F = 18.463, p < 0.001; for the 1 mM dose, F = 7.461, p < 0.001. Pairs of samples marked * were significantly different according to post hoc Games-Howell test (p < 0.001).
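For readers unfamiliar with the test named above, the Welch F statistic can be computed from group sizes, means and variances alone. The sketch below uses the standard Welch formula with only the Python standard library. The function name and the toy data are illustrative, and the authors' own statistical software is not stated in this caption. The p value is not computed here; it would come from the upper tail of an F distribution with the returned degrees of freedom.

```python
from statistics import mean, variance

def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA.

    `groups` is a list of samples, each with at least two values and
    non-zero variance.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by n / s^2, so noisy groups count for less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)
    grand_mean = sum(w * m for w, m in zip(weights, means)) / total_w
    between = sum(w * (m - grand_mean) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2

if __name__ == "__main__":
    # Made-up insertion lengths at three timepoints, for illustration only.
    print(welch_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```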
Supplementary information
Supplementary Information
Supplementary Note and Figs. 1–5
Supplementary Table 1
Data underlying Fig. 2c.
Supplementary Table 2
Data underlying Fig. 5.
Supplementary Table 3
Data underlying Extended Data Fig. 9.
Supplementary Table 4
A guide to plasmids, primers and NGS datasets produced in this work.
Source data
Source Data Extended Data Fig. 4
Unprocessed western blots.
Source Data Extended Data Fig. 8
Statistical source data.
Source Data Extended Data Fig. 10
Statistical source data.
About this article
Cite this article
Loveless, T.B., Grotts, J.H., Schechter, M.W. et al. Lineage tracing and analog recording in mammalian cells by single-site DNA writing. Nat Chem Biol 17, 739–747 (2021). https://doi.org/10.1038/s41589-021-00769-8
DOI: https://doi.org/10.1038/s41589-021-00769-8