Abstract
DddA-derived cytosine base editors (DdCBEs), which are fusions of split DddA halves and transcription activator-like effector (TALE) array proteins from bacteria, enable targeted C•G-to-T•A conversions in mitochondrial DNA (ref. 1). However, their genome-wide specificity is poorly understood. Here we show that the mitochondrial base editor induces extensive off-target editing in the nuclear genome. Genome-wide, unbiased analysis of its editome reveals hundreds of off-target sites that are TALE array sequence (TAS)-dependent or TAS-independent. TAS-dependent off-target sites in the nuclear DNA are often specified by only one of the two TALE repeats, challenging the principle that DdCBEs are guided by paired TALE proteins positioned in close proximity. TAS-independent off-target sites on nuclear DNA are frequently shared among DdCBEs with distinct TALE arrays. Notably, they co-localize strongly with binding sites for the transcription factor CTCF and are enriched in topologically associating domain boundaries. We engineered DdCBE to alleviate such off-target effects. Collectively, our results have implications for the use of DdCBEs in basic research and therapeutic applications, and suggest the need to thoroughly define and evaluate the off-target effects of base-editing tools.
Data availability
All data generated for this paper have been deposited at NCBI Gene Expression Omnibus (GEO) and are available under GEO accession number GSE173859 (Detect-seq data), GSE173689 (ATAC-seq data and in situ ChIP–seq data) and GSE176089 (targeted deep sequencing data). hg38 was used as the reference genome. The Hi-C, DNase-seq, Bisulfite-seq and ChIP–seq data were downloaded from the GEO or ENCODE database; accession numbers of these public data sets are available in Supplementary Table 5.
Code availability
Detect-seq tools, including several Python scripts, have been deposited on GitHub (https://github.com/menghaowei/Detect-seq). The tools support Detect-seq analysis, including but not limited to Detect-seq signal finding, enrichment testing, off-target site identification, TALE sequence alignment and visualization of alignment results.
References
Mok, B. Y. et al. A bacterial cytidine deaminase toxin enables CRISPR-free mitochondrial base editing. Nature 583, 631–637 (2020).
Russell, O. M., Gorman, G. S., Lightowlers, R. N. & Turnbull, D. M. Mitochondrial diseases: hope for the future. Cell 181, 168–188 (2020).
Vafai, S. B. & Mootha, V. K. Mitochondrial disorders as windows into an ancient organelle. Nature 491, 374–383 (2012).
Stewart, J. B. & Chinnery, P. F. The dynamics of mitochondrial DNA heteroplasmy: implications for human health and disease. Nat. Rev. Genet. 16, 530–542 (2015).
Stewart, J. B. & Chinnery, P. F. Extreme heterogeneity of human mitochondrial DNA from organelles to populations. Nat. Rev. Genet. 22, 106–118 (2021).
Montano, V., Gruosso, F., Simoncini, C., Siciliano, G. & Mancuso, M. Clinical features of mtDNA-related syndromes in adulthood. Arch. Biochem. Biophys. 697, 108689 (2021).
Bacman, S. R. et al. MitoTALEN reduces mutant mtDNA load and restores tRNA(Ala) levels in a mouse model of heteroplasmic mtDNA mutation. Nat. Med. 24, 1696–1700 (2018).
Bacman, S. R., Williams, S. L., Pinto, M., Peralta, S. & Moraes, C. T. Specific elimination of mutant mitochondrial genomes in patient-derived cells by mitoTALENs. Nat. Med. 19, 1111–1113 (2013).
Gammage, P. A., Rorbach, J., Vincent, A. I., Rebar, E. J. & Minczuk, M. Mitochondrially targeted ZFNs for selective degradation of pathogenic mitochondrial genomes bearing large-scale deletions or point mutations. EMBO Mol. Med. 6, 458–466 (2014).
Lei, Z. et al. Detect-seq reveals out-of-protospacer editing and target-strand editing by cytosine base editors. Nat. Methods 18, 643–651 (2021).
Zhu, C. et al. Single-cell 5-formylcytosine landscapes of mammalian early embryos and ESCs at single-base resolution. Cell Stem Cell 20, 720–731.e725 (2017).
Shu, X. et al. Genome-wide mapping reveals that deoxyuridine is enriched in the human centromeric DNA. Nat. Chem. Biol. 14, 680–687 (2018).
Xia, B. et al. Bisulfite-free, base-resolution analysis of 5-formylcytosine at the genome scale. Nat. Methods 12, 1047–1050 (2015).
Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420–424 (2016).
Nishida, K. et al. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science 353, aaf8729 (2016).
Koblan, L. W. et al. Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction. Nat. Biotechnol. 36, 843–846 (2018).
Li, X. et al. Base editing with a Cpf1-cytidine deaminase fusion. Nat. Biotechnol. 36, 324–327 (2018).
Wang, X. et al. Cas12a base editors induce efficient and specific editing with low DNA damage response. Cell Rep. 31, 107723 (2020).
Sardo, L. et al. Real-time visualization of chromatin modification in isolated nuclei. J. Cell Sci. 130, 2926–2940 (2017).
Wang, Q. et al. CoBATCH for high-throughput single-cell epigenomic profiling. Mol. Cell 76, 206–216.e207 (2019).
Boch, J. et al. Breaking the code of DNA binding specificity of TAL-type III effectors. Science 326, 1509–1512 (2009).
Moscou, M. J. & Bogdanove, A. J. A simple cipher governs DNA recognition by TAL effectors. Science 326, 1501 (2009).
Lamb, B. M., Mercer, A. C. & Barbas, C. F. III. Directed evolution of the TALE N-terminal domain for recognition of all 5′ bases. Nucleic Acids Res. 41, 9779–9785 (2013).
Jin, S. et al. Cytosine, but not adenine, base editors induce genome-wide off-target mutations in rice. Science 364, 292–295 (2019).
Zuo, E. et al. Cytosine base editor generates substantial off-target single-nucleotide variants in mouse embryos. Science 364, 289–292 (2019).
Doman, J. L., Raguram, A., Newby, G. A. & Liu, D. R. Evaluation and minimization of Cas9-independent off-target DNA editing by cytosine base editors. Nat. Biotechnol. 38, 620–628 (2020).
Nakahashi, H. et al. A genome-wide map of CTCF multivalency redefines the CTCF code. Cell Rep. 3, 1678–1689 (2013).
Schmidt, D. et al. Waves of retrotransposon expansion remodel genome organization and CTCF binding in multiple mammalian lineages. Cell 148, 335–348 (2012).
Merkenschlager, M. & Nora, E. P. CTCF and cohesin in genome folding and transcriptional gene regulation. Annu. Rev. Genomics Hum. Genet. 17, 17–43 (2016).
Rowley, M. J. & Corces, V. G. Organizational principles of 3D genome architecture. Nat. Rev. Genet. 19, 789–800 (2018).
Shi, Z., Gao, H., Bai, X. C. & Yu, H. Cryo-EM structure of the human cohesin–NIPBL–DNA complex. Science 368, 1454–1459 (2020).
Davidson, I. F. et al. DNA loop extrusion by human cohesin. Science 366, 1338–1345 (2019).
Kim, Y., Shi, Z., Zhang, H., Finkelstein, I. J. & Yu, H. Human cohesin compacts DNA by loop extrusion. Science 366, 1345–1349 (2019).
Murayama, Y. & Uhlmann, F. Biochemical reconstitution of topological DNA binding by the cohesin ring. Nature 505, 367–371 (2014).
Petela, N. J. et al. Scc2 is a potent activator of cohesin’s ATPase that promotes loading by binding Scc1 without Pds5. Mol. Cell 70, 1134–1148.e1137 (2018).
Hashimoto, H. et al. Structural basis for the versatile and methylation-dependent binding of CTCF to DNA. Mol. Cell 66, 711–720.e713 (2017).
Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
Yu, M. & Ren, B. The three-dimensional organization of mammalian genomes. Annu. Rev. Cell Dev. Biol. 33, 265–289 (2017).
Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
Schmitt, A. D., Hu, M. & Ren, B. Genome-wide mapping and analysis of chromosome architecture. Nat. Rev. Mol. Cell Biol. 17, 743–755 (2016).
Suzuki, K. et al. In vivo genome editing via CRISPR/Cas9 mediated homology-independent targeted integration. Nature 540, 144–149 (2016).
Yang, J. et al. ULtiMATE system for rapid assembly of customized TAL effectors. PLoS ONE 8, e75649 (2013).
Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr. Protoc. Mol. Biol. 109, 21–29 (2015).
Neely, A. E. & Bao, X. Nuclei isolation staining (NIS) method for imaging chromatin-associated proteins in difficult cell types. Curr. Protoc. Cell Biol. 84, e94 (2019).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal https://doi.org/10.14806/ej.17.1.200 (2011).
Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Zhang, Y. et al. Model-based analysis of ChIP–seq (MACS). Genome Biol. 9, R137 (2008).
Ramirez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).
Koboldt, D. C. et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).
Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 16, 259 (2015).
Wolff, J. et al. Galaxy HiCExplorer 3: a web server for reproducible Hi-C, capture Hi-C and single-cell Hi-C data analysis, quality control and visualization. Nucleic Acids Res. 48, W177–W184 (2020).
Acknowledgements
We thank W. Wei for providing related plasmids; H. Cheng for sharing the antibodies for mitochondrial markers; W. Xie and X. Lu for discussion about the ChIP assay; X. Zhang and Chuyun Shao for help with ATAC-seq experiments and data processing; National Center for Protein Sciences at Peking University for assistance with FACS, imaging, sequencing, Imaris, Fragment Analyzer and Agilent 4150 TapeStation System; C. Shan, L. Fu, S. Qin and Y. Guo for assistance with immunofluorescence experiments, FACS and image processing; and G. Li and X. Zhang for assistance with NGS experiments. Bioinformatics analysis was performed on the High-Performance Computing Platform of the School of Life Sciences and High-Performance Computing Platform of the Center for Life Science. This work was supported by the National Natural Science Foundation of China (nos. 21825701, 91953201, 92153303 and 22107006), National Key R&D Program (2019YFA0110900 and 2019YFA0802200) and China Postdoctoral Science Foundation (2020M680218, 2021M700238). L. Liu was supported in part by the Postdoctoral Fellowship of Peking-Tsinghua Center for Life Sciences.
Author information
Contributions
Z.L., H.M. and C.Y. conceived and led the project. Z.L., H.M., L.L. and C.Y. designed the experiments to investigate the DdCBE off-target effect, which were performed by Z.L. and L.L. H.M. analysed Detect-seq, in situ ChIP–seq and the downloaded public data. H.Z. analysed ATAC-seq and targeted deep sequencing data. Z.L. performed the co-immunoprecipitation and non-fixation immunofluorescence assays. L.L. conducted the cell fractionation assay and the sequential western blot and fixation-based immunofluorescence experiments. X.R. and Y.Y. assisted with the experiments and data processing. Z.L. and C.Y. designed the in situ ChIP experiments with the advice of A.H. M.L. performed the in situ ChIP–seq experiments with cell samples prepared by Z.L. Z.L., H.M., L.L., H.Z. and C.Y. wrote the paper with the help of X.R. and H.W.
Ethics declarations
Competing interests
Peking University has filed patent applications on Detect-seq and optimized DdCBE variants described in this study, listing Z.L., H.M., Z.C.L., L.L., H.Z. and C.Y. as inventors.
Peer review
Peer review information
Nature thanks Bryan Dickinson and Fyodor Urnov for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Workflow of Detect-seq.
Endogenous 5fdC was protected by O-ethylhydroxylamine (EtONH2). The damage repair step eliminates endogenous DNA damage, including abasic (AP) sites and single-strand breaks (SSBs). Deoxyuridine (dU) generated by DdCBE in vivo was labeled by an in vitro reconstituted base excision repair (BER) reaction: UDG specifically recognizes and excises dU, leaving an abasic site; Endo IV removes the abasic site, leaving a 3′-OH end; Bst DNA polymerase initiates DNA strand replacement from the 3′-OH; and ligase seals the final nick. Through the so-called “nick translation” activity of Bst polymerase during this step, biotinylated dUTPs and 5fdCTPs were incorporated 3′ to the dU. Malononitrile treatment marks the incorporated 5fdCs, generating a characteristic tandem C-to-T mutation pattern that traces DdCBE edits. Biotin pulldown followed by NaOH treatment enriches DdCBE-edited DNA fragments and enhances Detect-seq signals.
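The tandem C-to-T read-out described in this legend can be illustrated with a minimal Python sketch. It is not taken from the Detect-seq tools: the function names, the gapless read-to-reference comparison and the `min_events` threshold of 2 are all assumptions made purely for illustration.

```python
from typing import List


def c_to_t_positions(ref: str, read: str) -> List[int]:
    """Return 0-based positions where the reference has C and the read has T.

    Assumes a gapless alignment, so ref and read must be the same length.
    """
    if len(ref) != len(read):
        raise ValueError("ref and read must have the same length")
    return [
        i
        for i, (r, q) in enumerate(zip(ref.upper(), read.upper()))
        if r == "C" and q == "T"
    ]


def has_tandem_signal(ref: str, read: str, min_events: int = 2) -> bool:
    """Flag a read carrying at least `min_events` C-to-T mismatches.

    The threshold is a hypothetical value, not one stated in the source.
    """
    return len(c_to_t_positions(ref, read)) >= min_events


if __name__ == "__main__":
    reference = "ACGTCCAGC"
    read = "ATGTTCAGT"
    print(c_to_t_positions(reference, read))
    print(has_tandem_signal(reference, read))
```

A real analysis would work from aligned reads with indels and both strands (G-to-A on the opposite strand); the sketch only shows the idea of counting several C-to-T events on one read.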
Extended Data Fig. 2 Comparisons of Detect-seq signals for off-target edits under two different transfection conditions.
The two conditions are: 4 × 10⁵ HEK293T cells seeded on 6-well plates were transfected with 4 μg of each monomer using 12 μl Lipofectamine 2000; or 6.4 × 10⁵ cells were transfected with 3.5 μg of each monomer using 21 μl Lipofectamine LTX. The Detect-seq signals are highly consistent between the two conditions for all three DdCBEs.
Extended Data Fig. 3 Editing ratios of nuclear DNA off-target sites identified for the three L1397N-DdCBEs.
a–c, Targeted deep sequencing results for selected nuclear off-target sites of ND4-L1397N (a), ND5.1-L1397N (b) and ND6-L1397N (c). For each off-target site, the editing ratio of the most highly edited cytosine is plotted in blue, and the matched ratio in the untreated control sample is plotted in gray.
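As a rough illustration of how a per-site editing ratio and the most highly edited cytosine could be derived from targeted deep sequencing counts, consider the sketch below. The exact definition used in the paper is not given in this section; the sketch assumes the ratio is the fraction of reads carrying T at a reference C position, and the function names and input layout are hypothetical.

```python
from typing import Dict, Tuple


def editing_ratio(base_counts: Dict[str, int], edited_base: str = "T") -> float:
    """Fraction of reads carrying the edited base at one reference cytosine.

    Assumed definition: edited reads / all reads covering the position.
    """
    total = sum(base_counts.values())
    if total == 0:
        return 0.0
    return base_counts.get(edited_base, 0) / total


def highest_edited_cytosine(
    site_counts: Dict[int, Dict[str, int]]
) -> Tuple[int, float]:
    """Return (position, ratio) for the cytosine with the highest editing ratio."""
    if not site_counts:
        raise ValueError("site_counts is empty")
    ratios = {pos: editing_ratio(counts) for pos, counts in site_counts.items()}
    best = max(ratios, key=ratios.get)
    return best, ratios[best]


if __name__ == "__main__":
    # Hypothetical read counts at two cytosines within one amplicon.
    counts = {101: {"C": 95, "T": 5}, 105: {"C": 80, "T": 20}}
    print(highest_edited_cytosine(counts))
```

The same calculation would be applied to the untreated control sample to obtain the matched background ratio.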
Extended Data Fig. 4 A real-time IF staining assay using unfixed HeLa nuclei to demonstrate the nuclear localization of DdCBE.
a, Fluorescence imaging of DAPI (navy blue), HA-tagged left half (Anti-HA, orange red) and Flag-tagged right half (Anti-Flag, green) in unfixed nuclei of HeLa cells that were untreated or transfected using Lipofectamine LTX. Possible mitochondrial contamination was tested by MitoTracker (magenta). The images were obtained at a representative Z-axis position under the same exposure conditions using a High Speed Spinning Disk Confocal Microscope (ANDOR). Scale bars, 3 μm. Images are representative of 3 independent biological replicates. b, c, The projected 2D fluorescence image (b) and 3D snapshot (c) of a representative nucleus from cells transfected using Lipofectamine 3000. d, e, The projected 2D fluorescence image (d) and 3D snapshot (e) of a representative nucleus from cells transfected using Lipofectamine LTX. f, Statistical plot of the 3D mean fluorescence intensity per voxel for all scanned nuclei under different treatments. The data in b–f for each nucleus were obtained from z-stack images collected at 0.4 μm intervals under the same exposure conditions using DeltaVision OMX SR (GE). The same colors and scale bars as in a were used. HeLa cells on 6-well plates were transfected with 2 μg of each monomer using 6 μl Lipofectamine 3000; or cells were transfected with 3.5 μg of each monomer using 21 μl Lipofectamine LTX. “ND6-WT”: wild-type ND6-L1397N; “ND6-(TALE-)”: ND6-L1397N architecture with the TALE arrays deleted. In f, error bars reflect the mean +/− SD, and p-values were calculated by one-sided Student’s t-test.
Extended Data Fig. 5 A small portion of DdCBE is localized in the nucleus of transfected HEK293T cells.
a, Western blotting results showing the distribution of ND6-L1397N (WT) in different subcellular fractions of 2 × 10³ HEK293T cells that were untreated or transfected using Lipofectamine 2000 or LTX; and the distribution of three deletion variants of ND6-L1397N in different fractions of cells transfected using LTX. “DddA-free”, “UGI-free” and “TALE-free” denote deletion of DddA, UGI and the TALE arrays, respectively, from the full-length ND6-L1397N. The results show that ND6-L1397N is partially localized in the chromatin fraction regardless of which transfection reagent was used. The signal of the TALE-free construct in the chromatin fraction is present only when the exposure time is extended. This observation suggests that, compared with DddA and UGI, the TALE arrays most strongly affect the nuclear localization. ATP5a (mitochondria), GAPDH (cytosolic) and H3 (chromatin) were chosen as compartment-specific markers, demonstrating the purity of each subcellular fraction. HA (tagged left half) and Flag (tagged right half) were used to indicate the localization of DdCBEs. Molecular weight is given in kDa; images are representative of 2 independent biological replicates; samples are derived from the same batch of experiments and gels were processed in parallel. b, Fluorescence imaging of nuclei (DAPI, blue), HA-tagged left half (Anti-HA, red) and Flag-tagged right half (Anti-Flag, green) in fixed nuclei isolated from HEK293T cells that were untreated or transfected with ND6-L1397N (WT) using Lipofectamine 2000, or transfected with ND6-L1397N (WT), DddA-free, UGI-free and TALE-free constructs using Lipofectamine LTX. Possible mitochondrial contamination was tested by MitoTracker (magenta). The results show that a small portion of DdCBE is localized in nuclei, regardless of the transfection conditions. TALE arrays affect the nuclear localization more strongly than DddA and UGI. Scale bars, 5 μm for zoomed-in images of TALE-free; 40 μm for all remaining images. The images were obtained under the same exposure conditions and are representative of 2 independent biological replicates.
Extended Data Fig. 6 The editing spectra of DdCBE at TAS-dependent nDNA off-target sites.
a, Sequence logos for Cs with the highest Detect-seq signal, obtained via WebLogo using DNA sequences at TAS-dependent off-target sites of ND6-L1397N, ND5.1-L1397N and ND4-L1397N. b, Sequence logos generated from the pTBSs of ND5.1-L1397N and ND4-L1397N. Bits reflect the level of sequence conservation at a given position. c, Aggregate distribution of C·Gs with the highest Detect-seq signal across the flanking region of each pTBS for TAS-dependent off-target sites of ND6-L1397N, ND5.1-L1397N and ND4-L1397N. The position of the pTBS for the left or right TALE protein is shaded. d, A schematic illustrating the editing spectra of the three L1397N DdCBEs based on the pTBS-edits distribution analysis. The first base pair after the 3′ end of the pTBS is counted as position +1. NTD, N-terminal domain; CTD, C-terminal domain.
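The position convention in d (the first base pair after the 3′ end of the pTBS is position +1) can be made concrete with a short Python sketch. Everything beyond that convention is an assumption for illustration: the 0-based half-open coordinates, the strand handling and the function name are hypothetical and are not taken from the Detect-seq tools.

```python
from typing import Optional


def position_relative_to_ptbs(
    ptbs_start: int, ptbs_end: int, strand: str, site: int
) -> Optional[int]:
    """Return the position of `site` counted from the 3' end of a pTBS.

    Coordinates are 0-based; the pTBS occupies [ptbs_start, ptbs_end) on the
    reference. For a '+' strand pTBS the 3' end is at ptbs_end, so the base at
    index ptbs_end is position +1. For a '-' strand pTBS the 3' end is at
    ptbs_start, so the base at index ptbs_start - 1 is position +1.
    Sites inside or upstream of the pTBS return None.
    """
    if strand == "+":
        offset = site - ptbs_end + 1
    elif strand == "-":
        offset = ptbs_start - site
    else:
        raise ValueError("strand must be '+' or '-'")
    return offset if offset >= 1 else None


if __name__ == "__main__":
    print(position_relative_to_ptbs(100, 120, "+", 125))
    print(position_relative_to_ptbs(100, 120, "-", 95))
```

Aggregating these offsets over all TAS-dependent off-target sites would give a distribution of the kind summarized in c.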
Extended Data Fig. 7 The TALE independence of TAS-independent off-target sites validated by targeted deep sequencing.
Results of targeted deep sequencing at five representative TAS-independent off-target sites for different ND6-L1397N constructs in Fig. 2a.
Extended Data Fig. 8 Motif search result from sequences of all TAS-independent off-target sites.
The results (p-value < 0.05) were generated by the Tomtom program with the JASPAR CORE motif database.
Extended Data Fig. 9 Strategies to improve the specificity of DdCBE.
a, Fusing nuclear export signal (NES) sequences into the DdCBE constructs. This should decrease the protein level of DdCBE in the nucleus and hence lower the risk of nDNA off-target editing. b, Co-expressing DddIA fused with nuclear localization signals (NLS). DddIA is a natural immunity protein of the deaminase DddA; bpNLS-linked DddIA is expected to antagonize the deamination activity of DdCBEs mis-localized to the nucleus. bpNLS, bipartite NLS at both the N and C termini. c, Mutating the DddAtox in the DdCBE architecture to reduce its intrinsic DNA binding affinity. Ideally, the mutated deaminase would not be able to deaminate DNA substrates without simultaneous, stable binding of the two TALE repeats.
Supplementary information
Supplementary Information
This file contains Supplementary Discussion, legends for Supplementary Tables 1–5 and Supplementary Videos 1 and 2, Supplementary Figures 1–27 and references.
About this article
Cite this article
Lei, Z., Meng, H., Liu, L. et al. Mitochondrial base editor induces substantial nuclear off-target mutations. Nature 606, 804–811 (2022). https://doi.org/10.1038/s41586-022-04836-5
DOI: https://doi.org/10.1038/s41586-022-04836-5