Cytosine base editors (CBEs) offer a powerful tool for correcting point mutations, yet their DNA and RNA off-target activities have raised concerns for biomedical applications. We describe screens of 23 rationally engineered CBE variants, which reveal that mutating residues in the predicted DNA-binding site can dramatically decrease Cas9-independent off-target effects. Furthermore, we obtained a CBE variant, YE1-BE3-FNLS, that retains high on-target editing efficiency while causing extremely low off-target and bystander edits.
All sequencing data were deposited in the NCBI Sequence Read Archive (SRA) under project accession PRJNA527003 and at https://www.biosino.org/node/project/detail/OEP000272. The data that support the findings of this study are available from the corresponding author upon request.
Rees, H.A. & Liu, D.R. Base editing: precision chemistry on the genome and transcriptome of living cells. Nat. Rev. Genet. 19, 770–788 (2018).
Zuo, E. et al. Cytosine base editor generates substantial off-target single-nucleotide variants in mouse embryos. Science 364, 289–292 (2019).
Jin, S. et al. Cytosine, but not adenine, base editors induce genome-wide off-target mutations in rice. Science 364, 292–295 (2019).
Grünewald, J. et al. Transcriptome-wide off-target RNA editing induced by CRISPR-guided DNA base editors. Nature 569, 433–437 (2019).
Grünewald, J. et al. CRISPR adenine and cytosine base editors with reduced RNA off-target activities. Nat. Biotechnol. 37, 1041–1048 (2019).
Zhou, C. et al. Off-target RNA mutation induced by DNA base editing and its elimination by mutagenesis. Nature 571, 275–278 (2019).
Kim, Y. B. et al. Increasing the genome-targeting scope and precision of base editing with engineered Cas9-cytidine deaminase fusions. Nat. Biotechnol. 35, 371–376 (2017).
Holden, L. G. et al. Crystal structure of the anti-viral APOBEC3G catalytic domain and functional implications. Nature 456, 121–124 (2008).
Chen, K. M. et al. Structure of the DNA deaminase domain of the HIV-1 restriction factor APOBEC3G. Nature 452, 116–119 (2008).
Teng, B. B. et al. Mutational analysis of apolipoprotein B mRNA editing enzyme (APOBEC1): structure-function relationships of RNA editing and dimerization. J. Lipid Res. 40, 623–635 (1999).
Teng, B., Burant, C. F. & Davidson, N. O. Molecular cloning of an apolipoprotein B messenger RNA editing protein. Science 260, 1816–1819 (1993).
Gehrke, J. M. et al. An APOBEC3A–Cas9 base editor with minimized bystander and off-target activities. Nat. Biotechnol. 36, 977–982 (2018).
Wang, X. et al. Efficient base editing in methylated regions with a human APOBEC3A-Cas9 fusion. Nat. Biotechnol. 36, 946–949 (2018).
Zafra, M. P. et al. Optimized base editors enable efficient editing in cells, organoids and mice. Nat. Biotechnol. 36, 888–893 (2018).
Koblan, L. W. et al. Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction. Nat. Biotechnol. 36, 843–846 (2018).
Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420–424 (2016).
Komor, A. C. et al. Improved base excision repair inhibition and bacteriophage Mu Gam protein yields C:G-to-T:A base editors with higher efficiency and product purity. Sci. Adv. 3, eaao4774 (2017).
Doman, J. L., Raguram, A., Newby, G. A. & Liu, D. Evaluation and minimization of Cas9-independent off-target DNA editing by cytosine base editors. Nat. Biotechnol. (in the press).
Clement, K. et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat. Biotechnol. 37, 224–226 (2019).
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Park, J., Lim, K., Kim, J. S. & Bae, S. Cas-analyzer: an online tool for assessing genome editing results using NGS data. Bioinformatics 33, 286–288 (2017).
Wang, X. et al. CRISPR–DAV: CRISPR NGS data analysis and visualization pipeline. Bioinformatics 33, 3811–3812 (2017).
Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).
Wilm, A. et al. LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets. Nucleic Acids Res. 40, 11189–11201 (2012).
Saunders, C. T. et al. Strelka: accurate somatic small-variant calling from sequenced tumor-normal sample pairs. Bioinformatics 28, 1811–1817 (2012).
Fang, H. et al. Indel variant analysis of short-read sequencing data with Scalpel. Nat. Protoc. 11, 2529–2548 (2016).
Bae, S., Park, J. & Kim, J. S. Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics 30, 1473–1475 (2014).
Haeussler, M. et al. Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol. 17, 148 (2016).
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
Chen, C. C., Hwang, J. K. & Yang, J. M. (PS)2: protein structure prediction server. Nucleic Acids Res. 34, W152–157 (2006).
Huang, T. T. et al. (PS)2: protein structure prediction server version 3.0. Nucleic Acids Res. 43, W338–342 (2015).
We thank the FACS facility staff H. Wu, L. Quan and S. Qian at ION, M. Zhang at IPS, and L. Yuan at Big Data Platform (SIBS, CAS). This study was supported by the R&D Program of China (2018YFC2000100, 2017YFC1001302 and 2017YFC0909701), the CAS Strategic Priority Research Program (XDB32060000, XDBS01060100), the National Natural Science Foundation of China (31871502, 31522037, 31822035, 31922048, 31925016, 91957122), the Basic Frontier Scientific Research Program of Chinese Academy of Sciences From 0 to 1 original innovation project (ZDBS-LY-SM001), the Shanghai Municipal Science and Technology Major Project (2018SHZDZX05), the Shanghai City Committee of science and technology project (18411953700, 18JC1410100, 16JC1420202), the National Science and Technology Major Project (2015ZX10004801-005), the National Key Research and Development Program of China (2017YFA0505500, 2016YFC0901704), the Agricultural Science and Technology Innovation Program to E.Z. and the International Partnership Program of Chinese Academy of Sciences (153D31KYSB20170059).
The authors disclose a patent application relating to aspects of this work (engineered base editors).
Peer review information: Lei Tang was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
a, The mutated residues are highlighted in the predicted structure of rAPOBEC1. Green and yellow indicate residues in the helix and the loop of the structure, respectively. b, The crystal structure of APOBEC3G. c, The on-target efficiency and indel frequencies of different versions of engineered CBEs for an additional 11 target sites. d, The C-to-T editing efficiency for the engineered variants at each C of the 21 target sites. n = 21 independent experiments for each group. All values are presented as mean ± s.e.m. e, Comparison of indel frequencies among the engineered variants for the 21 target sites. n = 21 independent experiments for each group. P values were calculated by two-sided Student’s t-tests. Box-and-whisker plots: the center line indicates the median, the bottom and top lines of the box represent the first and third quartiles, respectively, and the ends of the vertical line represent the minimum and maximum values. f, The on-target C-to-T editing efficiency of engineered BE3 variants at each target site. n = 3 biologically independent samples for each group. P values were calculated by two-sided Student’s t-tests. Sequences of the on-target protospacers and primers are shown in Extended Data Table 5. The data for BE3 and YE1-BE3 are also used in Figs. 3a and 3d-g.
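The box-and-whisker convention described in this legend (median, first and third quartiles, whiskers at the minimum and maximum) can be illustrated with a minimal Python sketch. The quartile interpolation rule used for the published plots is not stated, so the `method="inclusive"` choice, the function name `box_summary` and the example values are assumptions made for illustration only.

```python
from statistics import median, quantiles


def box_summary(values):
    """Return the five numbers drawn in a min-to-max box-and-whisker plot."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    # Assumption: "inclusive" quartiles, which always lie between the
    # observed minimum and maximum. Other tools may interpolate differently.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),      # bottom of the vertical line
        "q1": q1,                # bottom line of the box
        "median": median(values),  # center line
        "q3": q3,                # top line of the box
        "max": max(values),      # top of the vertical line
    }


# Hypothetical indel frequencies (%), not data from the paper.
example = [0.5, 1.0, 1.5, 2.0, 4.0]
summary = box_summary(example)
print(summary)
```

Because the whiskers extend to the extremes, no points are treated as outliers in this style of plot.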
a, The blastocyst rate of BE3 and BE3 variants with sgRNA-D. All values are presented as mean ± s.e.m. b, The blastocyst rate for BE3-hA3A and BE3-FNLS with additional sgRNAs. All values are presented as mean ± s.e.m. n = 3 biologically independent samples for each group.
Extended Data Fig. 3 On-target editing efficiency and characteristics of off-target SNVs of engineered CBEs.
a, On-target editing efficiency of BE3 and CBE variants from WGS data. Two BE3 embryos without sgRNAs are not shown because they have no target site. b, Comparison of C-to-T and G-to-A conversions between CBE variant-treated groups and the Cre or BE3 groups. n = 3 biologically independent samples for Cre, n = 6 biologically independent samples for BE3, n = 12 biologically independent samples for BE3R126E, n = 3 biologically independent samples for BE3R132E, n = 8 biologically independent samples for YE1-BE3, and n = 3 biologically independent samples for FE1-BE3. Two Cre samples and six BE3 samples were derived from Zuo et al.22 and one Cre sample was newly generated in this study. All values are presented as mean ± s.e.m. P values were calculated by two-sided Student’s t-tests.
Extended Data Fig. 4 Venn diagrams of SNVs detected in each embryo by WGS data using the indicated software tools.
a, SNVs identified in BE3R126E-treated embryos. b, SNVs identified in BE3R132E-treated embryos. c, SNVs identified in YE1-BE3-treated embryos. d, SNVs identified in FE1-BE3-treated embryos. e, SNVs identified in the newly generated Cre-treated embryo.
a, The overlap between SNVs detected in our analysis and off-target sites predicted by Cas-OFFinder and CRISPOR.
Editing rate of each variant across the chromosomes for each sample.
a, Comparison of the total number of detected RNA off-target SNVs at 72 h post-transfection. n = 6 biologically independent samples for GFP, n = 9 biologically independent samples for BE3, n = 7 biologically independent samples for BE3R126E and n = 2 biologically independent samples for YE1-BE3 groups. All values are presented as mean ± s.e.m. P values above each bar were calculated by comparison with the GFP group using two-sided Student’s t-tests. b, The distribution of mutation types for GFP-, BE3- and BE3 variant-treated groups. c, Editing rate of RNA off-targets for BE3 variants at 72 h post-transfection.
a, The C-to-T editing efficiency for the engineered variants at each C of the 21 target sites. n = 21 independent experiments for each group. All values are presented as mean ± s.e.m. The data for BE3 are also used in Fig. 3d. b, SNVs identified in BE3-hA3AY130F- and YE1-BE3-FNLS-treated embryos. c, The overlap between SNVs detected in our analysis and off-target sites predicted by Cas-OFFinder and CRISPOR. d, The distribution of mutation types of DNA off-target SNVs for BE3-hA3AY130F- and YE1-BE3-FNLS-treated embryos. e, The distribution of mutation types of RNA off-target SNVs for BE3-hA3AY130F- and YE1-BE3-FNLS-treated embryos. f, The expression level of APOBEC1 in BE3 and BE3-FNLS variants. n = 3 biologically independent samples for each group. Box-and-whisker plots: the center line indicates the median, the bottom and top lines of the box represent the first and third quartiles, respectively, and the ends of the vertical line represent the minimum and maximum values. g, Editing rate of RNA off-targets for BE3 and BE3-FNLS variants at 36 h post-transfection. n = 3 biological replicates for each group. P values were calculated by two-sided Student’s t-tests.
a, The C-to-T editing efficiency for BE3-FNLS, YE1-BE3-FNLS and BE4max at the indicated target sites. b, Indel frequency for BE3-FNLS, YE1-BE3-FNLS and BE4max at the indicated target sites. Data are shown as mean ± s.e.m. for n = 3 biological replicates performed at the same time. P values were calculated by two-sided Student’s t-tests.
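The summary statistics named in these legends (mean ± s.e.m. and Student's t-test) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' analysis code: the helper names `mean_sem` and `student_t` and the example values are hypothetical, and the equal-variance form of the test is assumed because the legends say "Student's" rather than Welch's.

```python
from math import sqrt
from statistics import mean, stdev, variance


def mean_sem(values):
    """Mean and standard error of the mean (sample s.d. / sqrt(n))."""
    n = len(values)
    return mean(values), stdev(values) / sqrt(n)


def student_t(a, b):
    """Equal-variance (Student's) t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance weights each group's sample variance by its own df.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical editing efficiencies (%), n = 3 per group; not from the paper.
group_a = [40.0, 42.0, 44.0]
group_b = [30.0, 32.0, 34.0]
m, sem = mean_sem(group_a)
t, df = student_t(group_a, group_b)
print(f"{m:.1f} ± {sem:.2f} (s.e.m.), t = {t:.3f}, df = {df}")
```

The two-sided P value is then the probability, under a t distribution with `df` degrees of freedom, of a statistic at least as extreme as `|t|`; a statistics library such as SciPy (`scipy.stats.ttest_ind`) returns it directly.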
Extended Data Fig. 10 Activities of CBE and CBE variants at the indicated Cas9-dependent off-target sites.
a, The Cas9-dependent off-target effects of the CBE and CBE variants. b, Comparison of editing frequencies of the CBE and CBE variants at 34 potential off-target sites. P values were calculated by two-sided Student’s t-tests, compared with the YE1-BE3-FNLS group. Each cell represents the percentage of total sequencing reads with C-to-T conversion. n = 21 independent experiments for each group. Box-and-whisker plots: the center line indicates the median, the bottom and top lines of the box represent the first and third quartiles, respectively, and the ends of the vertical line represent the minimum and maximum values. HEK293T cells were transfected with plasmids expressing BE3, BE3R126E, BE3R132E, YE1-BE3, FE1-BE3, BE3-hA3A, BE3-hA3AY130F, BE3-FNLS and YE1-BE3-FNLS and sgRNAs matching the indicated on-target sequence using Lipofectamine 3000. Three days after transfection, genomic DNA was extracted, amplified by PCR, and analyzed by high-throughput DNA sequencing at the on-target loci, plus the top ten known Cas9 off-target loci for these sgRNAs, as previously determined using the GUIDE-seq method23, 24 and ChIP-seq method25. Sequences of the on-target and off-target protospacers and primers are shown in Extended Data Table 5.
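The quantity reported per cell, the percentage of total sequencing reads with C-to-T conversion, can be illustrated with a small Python sketch. It assumes reads that are already aligned to the amplicon without gaps, so that read index equals reference index; the function name `c_to_t_percent` and the sequences are hypothetical, and the sketch does not reproduce the pipeline actually used in the study.

```python
def c_to_t_percent(reference, reads, position):
    """Percentage of reads carrying T at a reference C (0-based position)."""
    if reference[position] != "C":
        raise ValueError("reference base at this position is not C")
    # Count only reads that cover the position with an unambiguous base.
    covered = [r for r in reads if len(r) > position and r[position] in "ACGT"]
    if not covered:
        return 0.0
    converted = sum(1 for r in covered if r[position] == "T")
    return 100.0 * converted / len(covered)


# Hypothetical amplicon and reads, not data from the paper.
reference = "GACCTA"
reads = ["GACCTA", "GATCTA", "GATCTA", "GACCTA"]
pct = c_to_t_percent(reference, reads, 2)
print(pct)
```

Real amplicon data also contain indels and sequencing errors, which is why dedicated tools are normally used for this step.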
Cite this article
Zuo, E., Sun, Y., Yuan, T. et al. A rationally engineered cytosine base editor retains high on-target activity while reducing both DNA and RNA off-target effects. Nat Methods 17, 600–604 (2020). https://doi.org/10.1038/s41592-020-0832-x