Detect-seq reveals out-of-protospacer editing and target-strand editing by cytosine base editors

Abstract

Cytosine base editors (CBEs) have the potential to correct human pathogenic point mutations. However, their genome-wide specificity remains poorly understood. Here we report Detect-seq for the evaluation of CBE specificity. It enables sensitive detection of CBE-induced off-target sites at the genome-wide level. Detect-seq leverages chemical labeling and biotin pulldown to trace the editing intermediate deoxyuridine, thereby revealing the editome of CBE. In addition to Cas9-independent and typical Cas9-dependent off-target sites, we discovered edits outside the protospacer sequence (that is, out-of-protospacer) and on the target strand (which pairs with the single-guide RNA (sgRNA)). Such unexpected off-target edits are prevalent and can exhibit a high editing ratio, while their occurrence is cell-type dependent and cannot be predicted from the sgRNA sequence. Moreover, we found out-of-protospacer and target-strand edits near the on-target sites tested, challenging the general understanding that CBEs do not induce proximal off-target mutations. Collectively, our approaches allow unbiased analysis of the CBE editome and provide a widely applicable tool for specificity evaluation of various emerging genome editing tools.

Fig. 1: Detect-seq assesses the genome-wide specificity of CBE.
Fig. 2: Comparisons of Detect-seq results with GUIDE-seq results.
Fig. 3: Comparisons of Detect-seq with WGS-based methods and computational prediction methods for off-target identification.
Fig. 4: Detect-seq discovered prevalent out-of-protospacer edits and target-strand edits.
Fig. 5: The characteristics of out-of-protospacer edits and target-strand edits.

Data availability

All data generated for this paper have been deposited at NCBI GEO and are available under accession numbers GSE151265 and GSE152907. Source data are provided with this paper.

Code availability

Detect-seq tools are available at https://github.com/menghaowei/Detect-seq.

Acknowledgements

We thank W. Wei (Peking University) and J. Hu (Peking University) for discussion, W. Wei together with J. Chen (ShanghaiTech University) for kindly providing related plasmids, and J. Liu (Peking University) for help with experiments. We thank the National Center for Protein Sciences at Peking University in Beijing, China, for assistance with FACS and the Fragment Analyzer. Bioinformatics analysis was performed on the High-Performance Computing Platform of the School of Life Sciences. This work was supported by the National Natural Science Foundation of China (grant nos. 21825701 and 91953201), National Key R&D Program (grant no. 2019YFA0110900) and the Peking University Ge Li and Ning Zhao Education Fund.

Author information

Contributions

Z. Lei, H.M., Z. Lv and C.Y. conceived and guided the research. Z. Lei and M.L. led the development of Detect-seq protocol. H.M. developed the computational pipeline for Detect-seq. H.M., H.W. and H.Z. analyzed all high-throughput sequencing data. Z. Lei and H.M. optimized the targeted amplicon sequencing methodology. Z. Lv conducted cellular experiments and molecular cloning assays. Z. Lei executed Detect-seq experiments. L.L., K.Y., X.Z., Y.Z. and Y.Y. assisted with the experiments. Z. Lei, H.M., Z. Lv and C.Y. wrote the paper.

Corresponding author

Correspondence to Chengqi Yi.

Ethics declarations

Competing interests

The authors have filed patent applications on related sequencing technologies.

Additional information

Peer review information: Nature Methods thanks Jia Chen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Lei Tang was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Overview of Detect-seq.

Schematic of the Detect-seq procedure. Genomic DNA was fragmented and then end-repaired to avoid mistaken labeling or protection at overhangs. Endogenous 5fdC was blocked by O-ethylhydroxylamine (EtONH2)-based protection. A damage repair step eliminates false-positive signals from endogenous DNA damage, including abasic (AP) sites and single-strand breaks (SSBs). Deoxyuridine (dU) generated by CBE in vivo was labeled by an in vitro reconstituted base excision repair (BER) reaction: UDG removes dU with high specificity, generating abasic sites; Endo IV cleaves the abasic sites, leaving a 3’-OH remnant; Bst DNA polymerase initiates DNA synthesis from the 3’-OH and successively replaces the nucleotides 3’ to it; and ligase finally seals the nicks. Through the ‘nick translation’ activity of Bst during the dU-labeling step, biotinylated dUTPs and 5fdCTPs were incorporated 3’ to the dU. Malononitrile treatment marks the incorporated 5fdCs, introducing a characteristic tandem C-to-T mutation pattern that traces CBE editing events. Biotin pulldown followed by NaOH treatment enriches CBE-edited DNA fragments and enhances signals for high-throughput sequencing.
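The tandem C-to-T pattern described in this legend can be illustrated with a minimal sketch. It assumes a read that is already aligned gap-free to its reference segment. The function names and the `min_count` threshold are hypothetical choices for illustration and are not taken from the Detect-seq pipeline.

```python
def c_to_t_positions(ref: str, read: str) -> list[int]:
    """Return 0-based positions where the reference has C and the read has T.

    Assumes a gap-free alignment, so both strings must have equal length.
    """
    if len(ref) != len(read):
        raise ValueError("ref and read must have the same length")
    return [
        i
        for i, (r, q) in enumerate(zip(ref.upper(), read.upper()))
        if r == "C" and q == "T"
    ]


def has_tandem_signal(ref: str, read: str, min_count: int = 2) -> bool:
    """Flag a read carrying at least `min_count` C-to-T mismatches.

    `min_count` is an illustrative threshold, not a published parameter.
    """
    return len(c_to_t_positions(ref, read)) >= min_count


ref_segment = "ACGTCCAGCT"
read_segment = "ATGTTCAGTT"
print(c_to_t_positions(ref_segment, read_segment))
print(has_tandem_signal(ref_segment, read_segment))
```

A single C-to-T mismatch on a read is hard to tell apart from an SNV or a sequencing error, which is why the sketch asks for several such mismatches on the same read before flagging it.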

Extended Data Fig. 2 The characteristics of Detect-seq signal pattern.

a, Enriched peaks and Detect-seq signals at the on-target sites of the EMX1 and VEGFA_site_2 (abbreviated as VEGFA) sgRNAs. The sequencing data shown were generated from CBE-transfected HEK293T cells. Two independent biological replicates are shown, demonstrating high reproducibility. Red ‘T’s in the upper IGV (Integrative Genomics Viewer) view indicate C-to-T mutations on the non-target strand. Green ‘G’s in the lower IGV view indicate G-to-A mutations on the target strand (that is, C-to-T mutations on the non-target strand). b, A representative example showing that Detect-seq signals can be easily distinguished from SNVs. Red blocks without a black triangle above represent C-to-T mutations on the non-target strand, while red blocks with a black triangle above indicate a G-to-T SNV; the red and green inverted triangles indicate genuine C-to-T edits on the forward and reverse strands, respectively, according to the results of targeted amplicon sequencing. The pRBS is shaded, and the corresponding targeted amplicon sequencing results are shown below. c, Normalized signals within a 4 kb window at pRBSs identified by Detect-seq (navy line) and at sites obtained by genome sampling (green line). For each sgRNA, the left panel shows signals in WGS data, while the right panel shows Detect-seq data.

Extended Data Fig. 3 Detect-seq identified prevalent pRBS-containing loci.

a, Genome-wide distribution of pRBSs identified by Detect-seq on each chromosome for the three sgRNAs. On- and off-target edits are indicated by red squares and blue circles, respectively. b, Sequence logos for different sgRNAs obtained via WebLogo using DNA sequences at the pRBSs. The upper, middle and lower panels show sequence logos of pRBSs with high, intermediate and low editing frequency, respectively. c, Detect-seq identified dozens to hundreds of off-target sites that are highly reproducible between two biological replicates for EMX1, HEK293_site_4 and VEGFA_site_2 in HEK293T cells, as well as for HEK293_site_4 in MCF7 cells. d, A representative, reproducible off-target site observed in two biological replicates. The pRBS is shaded, and the corresponding targeted amplicon sequencing results are shown below. In the IGV views, green blocks indicate G-to-A mutations on the target strand (that is, C-to-T mutations on the non-target strand), while red blocks indicate C-to-T mutations on the target strand. Orange asterisks indicate signal regions of out-of-protospacer edits; the red and green inverted triangles indicate genuine C-to-T edits on the forward and reverse strands, respectively, according to the results of targeted amplicon sequencing. e, Recall ratio plot for all identified pRBSs of the three sgRNAs according to the downsampling results of Detect-seq data. f, Recall ratio plot for the top one-third of pRBSs (ranked by Detect-seq signal strength) of the three sgRNAs according to the downsampling results of Detect-seq data.
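The recall ratios in panels e and f can be read as the fraction of reference pRBSs that are still recovered after downsampling. The following is a minimal sketch of that calculation. It assumes each pRBS is represented by a string identifier, and the function names and example values are hypothetical rather than taken from the Detect-seq tools.

```python
def recall_ratio(reference_sites: set[str], downsampled_sites: set[str]) -> float:
    """Fraction of reference sites that are also found in the downsampled data."""
    if not reference_sites:
        raise ValueError("reference_sites must not be empty")
    return len(reference_sites & downsampled_sites) / len(reference_sites)


def top_share(signal_by_site: dict[str, float], denominator: int = 3) -> set[str]:
    """Return the strongest 1/denominator of sites, ranked by signal strength."""
    ranked = sorted(signal_by_site, key=signal_by_site.get, reverse=True)
    return set(ranked[: len(ranked) // denominator])


# Made-up signal strengths for six hypothetical sites.
signals = {"s1": 9.0, "s2": 7.5, "s3": 3.0, "s4": 2.0, "s5": 1.0, "s6": 0.5}
found_after_downsampling = {"s1", "s3", "s9"}

print(recall_ratio(set(signals), found_after_downsampling))
print(recall_ratio(top_share(signals), found_after_downsampling))
```

Sites found only in the downsampled data (here `s9`) do not affect recall, because the ratio is taken over the reference set.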

Extended Data Fig. 4 Cas9-dependent off-target sites are verified by an optimized targeted amplicon sequencing strategy.

a, Schematic workflow of the improved targeted amplicon sequencing procedure. Extended unique molecular identifiers (UMIs) are introduced to mark each amplicon during the first round of PCR amplification. b, The bioinformatic strategy used to remove PCR duplicates and errors generated during the second round of PCR amplification according to the left and right UMIs. c, d, Matched targeted amplicon sequencing results for Fig. 1e (c) and Supplementary Fig. 4b (d). e, Detect-seq and the matched targeted high-throughput sequencing results are shown for a representative putative sgRNA-binding site for the constructs in Fig. 1c. The pRBS is shaded. Green and red blocks in the IGV views indicate C-to-T mutations on the non-target strand and target strand, respectively; the red and green inverted triangles indicate genuine C-to-T edits on the forward and reverse strands, respectively, according to the results of targeted amplicon sequencing.
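One common way to use paired UMIs for removing PCR duplicates and errors is to group reads that share both UMIs and call a per-position majority consensus. The sketch below illustrates that general idea only. The function names, the `min_family_size` cutoff and the equal-length assumption are illustrative and may differ from the strategy shown in panel b.

```python
from collections import Counter, defaultdict


def consensus(seqs: list[str]) -> str:
    """Majority base at each position; all sequences are assumed equal length."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


def collapse_by_umi(
    reads: list[tuple[str, str, str]], min_family_size: int = 2
) -> dict[tuple[str, str], str]:
    """Group (left_umi, right_umi, sequence) reads and return one consensus per family.

    Families smaller than `min_family_size` (a hypothetical cutoff) are dropped,
    because a consensus cannot correct errors in a single read.
    """
    families: dict[tuple[str, str], list[str]] = defaultdict(list)
    for left_umi, right_umi, seq in reads:
        families[(left_umi, right_umi)].append(seq)
    return {
        key: consensus(seqs)
        for key, seqs in families.items()
        if len(seqs) >= min_family_size
    }


example_reads = [
    ("AAC", "GGT", "ACGT"),
    ("AAC", "GGT", "ACGT"),
    ("AAC", "GGT", "ACTT"),  # one read with a late-cycle PCR error
    ("TTG", "CCA", "ACGT"),  # singleton family, dropped
]
print(collapse_by_umi(example_reads))
```

Because the UMIs are attached in the first PCR round, reads sharing both UMIs descend from the same tagged molecule, so a base seen in only a minority of the family is more likely an amplification or sequencing error than a genuine edit.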

Extended Data Fig. 5 The distribution of the target-strand editing.

a, Signal distribution of Detect-seq. C-to-T mutations herein reflect non-target-strand edits, while G-to-A mutations are target-strand edits. b, Distribution of edited cytosines on the target strands. The pRBS regions for off-target sites of the VEGFA_site_2 sgRNA are indicated by the dashed lines. The PAM is counted as positions 21–23.
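The position convention in panel b (PAM at positions 21–23, which places a 20-nt protospacer at positions 1–20) can be sketched as a small classifier. The labels returned below and the decision to report the PAM as its own category are assumptions made for illustration.

```python
PROTOSPACER = range(1, 21)  # positions 1-20, implied by the PAM at 21-23
PAM = range(21, 24)         # positions 21-23, as stated in the legend


def classify_position(position: int) -> str:
    """Classify an edited position relative to the protospacer and PAM."""
    if position in PROTOSPACER:
        return "protospacer"
    if position in PAM:
        return "PAM"
    # Everything else lies outside the protospacer, on either side of it.
    return "out-of-protospacer"


for pos in (-5, 6, 22, 30):
    print(pos, classify_position(pos))
```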

Extended Data Fig. 6 Comparisons of off-target effect induced by dCpf1-based and Cas9-based BEs.

a, Illustration of the two genomic sites used for direct comparison of the two BE systems. b, c, Detect-seq identified off-target sites of BE4max (b) or LbCpf1-BE (c) that are highly reproducible between two biological replicates for the RUNX1 and DYRK1A sites. d, Genome-wide distribution of reproducible off-target sites for the RUNX1 and DYRK1A sites. On- and off-target edits are indicated by red squares and blue circles, respectively. f, g, Sequence logos for RUNX1 (f) and DYRK1A (g) obtained via WebLogo using DNA sequences at the pRBSs (putative sgRNA/crRNA binding sites). The pRBSs are identified by Detect-seq for BE4max or LbCpf1-BE and compared with off-target sites predicted by Cas-OFFinder (allowing no more than 7 mismatches).
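The mismatch cutoff used for the prediction comparison can be illustrated with a simple mismatch count between a guide and a candidate site. This is a sketch of the filtering criterion only and not Cas-OFFinder's implementation: it ignores the PAM, bulges and degenerate bases, and the sequences and function names are made up for the example.

```python
MAX_MISMATCHES = 7  # cutoff stated in the legend


def count_mismatches(guide: str, site: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(g != s for g, s in zip(guide.upper(), site.upper()))


def within_mismatch_limit(guide: str, site: str, limit: int = MAX_MISMATCHES) -> bool:
    """True if the candidate site differs from the guide at no more than `limit` positions."""
    return count_mismatches(guide, site) <= limit


guide = "ACGTACGTACGTACGTACGT"      # arbitrary 20-nt example sequence
close_site = "ACGTACGTACGTACGTACCC"  # differs at the last two G/T positions
far_site = "TGCATGCAACGTACGTACGT"    # differs at the first eight positions

print(count_mismatches(guide, close_site), within_mismatch_limit(guide, close_site))
print(count_mismatches(guide, far_site), within_mismatch_limit(guide, far_site))
```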

Supplementary information

Supplementary Information

Supplementary Figs. 1–19.

Reporting Summary

Supplementary Table 1

Primer sequences for targeted amplicon sequencing, and designed spike-in model sequences.

Supplementary Table 2

Validation results for edits by targeted amplicon sequencing.

Supplementary Table 3

The lists of Detect-seq identified pRBSs.

Supplementary Table 4

Public data used in this study.

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 6

Statistical source data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lei, Z., Meng, H., Lv, Z. et al. Detect-seq reveals out-of-protospacer editing and target-strand editing by cytosine base editors. Nat Methods 18, 643–651 (2021). https://doi.org/10.1038/s41592-021-01172-w

  • DOI: https://doi.org/10.1038/s41592-021-01172-w
