Base editors (BEs) have recently been developed by combining members of the APOBEC (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like)/AID (activation-induced deaminase) cytidine deaminase family1 with the CRISPR/Cas9 system to perform targeted C-to-T base editing2,3,4,5,6,7,8. Mechanistically, APOBEC/AID fused to a Cas9 variant is directed to the target site by the sgRNA, where it introduces C-to-T substitutions at the single-base level2,3,4. Compared to earlier generations of BEs (BE1 and BE2), the latest BE3 achieved much higher base editing frequencies by replacing catalytically dead Cas9 (dCas9) with Cas9 nickase (nCas9)2. Because BEs achieve gene corrections without introducing DNA double-strand breaks (DSBs), the unwanted indels that arise when DSBs are repaired by non-homologous end joining (NHEJ) were thought to be avoided in base editing. However, non-negligible levels of indels (∼4%-12% in published cases2,3) were still observed in BE3-mediated base editing. In addition, unwanted non-C-to-T (i.e., C-to-A or C-to-G) substitutions were observed, and the frequencies of C-to-A/C-to-G substitutions could be as high as those of C-to-T substitutions in some examined cases5. The existence of unwanted indels and C-to-A/C-to-G substitutions compromises the fidelity of base editing outcomes.
Thus, understanding what causes these unwanted indels and C-to-A/C-to-G substitutions during base editing will help achieve cleaner BE3 editing outcomes. Ideally, along with the U:G mismatch introduced by APOBEC-mediated cytidine deamination on the non-target strand (NTS), the nCas9-generated nick on the sgRNA target strand (TS) activates the mismatch repair (MMR) pathway9,10 to excise the nicked TS (Supplementary information, Figure S1A). Subsequent TS DNA re-synthesis using the edited NTS as a template converts the original U:G mismatch into a U:A pair, whereby the desired C-to-T substitution is achieved after DNA replication (Supplementary information, Figure S1B). However, the U on the single-stranded NTS could also be converted into an apurinic/apyrimidinic (AP) site by various DNA glycosylases, including uracil DNA glycosylase (UDG)11 (Supplementary information, Figure S1C), thereby triggering other DNA repair pathways. For instance, AP endonuclease-mediated cleavage or spontaneous breakage of AP site-containing ssDNA could trigger NHEJ to form indels (Supplementary information, Figure S1C, left); additionally, translesion synthesis (TLS) over the AP site by TLS DNA polymerases could result in a C-to-A or C-to-G substitution (Supplementary information, Figure S1C, right). Thus, it is tempting to speculate that preventing the conversion of the APOBEC-generated U into an AP site on the single-stranded NTS could reduce unwanted indels and non-C-to-T substitutions. In BE3, a uracil DNA glycosylase inhibitor (UGI) domain is fused to nCas9 to prevent the conversion of U into an AP site. To test the importance of UGI in base editing, we first removed the fused UGI from BE3. Consistent with the hypothesis above (Supplementary information, Figure S1C), the UGI-deleted BE3 (BE3-ΔUGI; Supplementary information, Figure S2A) was less competent in base editing (Supplementary information, Figure S2B-S2L).
Compared to BE3, BE3-ΔUGI induced higher frequencies of unwanted indels and lower frequencies of the desired C-to-T editing (Supplementary information, Figure S2B-S2D, P < 0.01 and Figure S2E-S2G, P < 10⁻⁵). As a consequence, the ratios of C-to-T editing to indels decreased considerably (Supplementary information, Figure S2H-S2J, P < 10⁻⁶). Meanwhile, the unwanted C-to-A/C-to-G substitutions also increased in the absence of UGI (Supplementary information, Table S2), leading to a significant reduction in the proportion of C-to-T relative to C-to-A/C-to-G substitutions (Supplementary information, Figure S2K-S2L, P < 10⁻⁴). These results thus indicated that preventing the conversion of U into an AP site is pivotal for efficient and high-fidelity base editing.
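The quantities compared above (indel frequency, C-to-T editing frequency, the ratio of C-to-T editing to indels, and the share of C-to-T among C substitutions) can be illustrated with a minimal Python sketch. The exact calculations used in this study are described in Supplementary information, Tables S1 and S2; the definitions below are simplifying assumptions, and the class name, field names and read counts are hypothetical values chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCounts:
    """Hypothetical deep-sequencing read counts at one target site."""
    total_reads: int    # reads covering the sgRNA target site
    indel_reads: int    # reads carrying an insertion or deletion
    c_to_t_reads: int   # reads with C-to-T at the target base
    c_to_a_reads: int   # reads with C-to-A at the target base
    c_to_g_reads: int   # reads with C-to-G at the target base


def indel_frequency(s: SiteCounts) -> float:
    """Fraction of all reads that carry an indel (assumed definition)."""
    return s.indel_reads / s.total_reads


def c_to_t_frequency(s: SiteCounts) -> float:
    """Fraction of all reads with the desired C-to-T substitution."""
    return s.c_to_t_reads / s.total_reads


def editing_to_indel_ratio(s: SiteCounts) -> float:
    """Ratio of C-to-T editing to indels; infinite if no indel was seen."""
    if s.indel_reads == 0:
        return math.inf
    # Both frequencies share the same denominator, so counts suffice.
    return s.c_to_t_reads / s.indel_reads


def c_to_t_fraction(s: SiteCounts) -> float:
    """Share of C-to-T among all C substitutions (assumed definition)."""
    substituted = s.c_to_t_reads + s.c_to_a_reads + s.c_to_g_reads
    if substituted == 0:
        return math.nan
    return s.c_to_t_reads / substituted


# Illustrative numbers only; these are not data from the paper.
example = SiteCounts(
    total_reads=10_000,
    indel_reads=200,
    c_to_t_reads=3_000,
    c_to_a_reads=100,
    c_to_g_reads=100,
)

if __name__ == "__main__":
    print("indel frequency:", indel_frequency(example))
    print("C-to-T frequency:", c_to_t_frequency(example))
    print("editing/indel ratio:", editing_to_indel_ratio(example))
    print("C-to-T fraction:", c_to_t_fraction(example))
```

Under these assumed definitions, a drop in C-to-T reads together with a rise in indel reads lowers the editing-to-indel ratio, which is the direction of change reported for BE3-ΔUGI relative to BE3.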
Although UGI is fused to nCas9 in BE3, indels were still observed in reported studies2,3. This suggests that additional UGI activity may be required to further improve the efficiency and fidelity of BE3-mediated base editing. We therefore tested this hypothesis by co-expressing UGI in trans with BE3. After co-transfection of UGI in trans with sgRNA/BE3 in 293FT cells (Figure 1A and Supplementary information, Figure S3A and S3B), we used deep sequencing to determine the indel and base substitution frequencies at three sgRNA target sites. Compared to BE3 alone, co-expressing BE3 and UGI in trans clearly reduced the indel frequencies (Figure 1B and 1C, P < 10⁻⁶; Supplementary information, Table S1) and increased C-to-T editing frequencies at target bases (Figure 1D and 1E, P < 10⁻⁵; Supplementary information, Table S2). Specifically, the expression level of UGI was positively correlated with the ratio of C-to-T editing to indels (Figure 1F). When a high level of free UGI was present, the ratio of desired base editing to unwanted indels increased by ∼6-fold (Figure 1G, P < 10⁻⁴). At the same time, the unwanted C-to-A/C-to-G substitutions were also suppressed by free UGI expression in most tested cases (Supplementary information, Table S2), resulting in a significant increase in the proportion of C-to-T relative to C-to-A/C-to-G substitutions (Figure 1H and 1I, P < 10⁻⁶). We noticed that the variations among biological replicates were not trivial (Figure 1B, 1D and 1F, standard deviation represented by error bars), which could be explained by different transfection efficiencies among replicates. To exclude the influence of transfection efficiency among biological replicates, we normalized the indel frequencies, C-to-T editing frequencies and ratios of editing to indels obtained with BE3/UGI co-expression to those obtained in the paired BE3 tests.
As illustrated in Supplementary information, Figure S3C-S3E, base editing outcomes were consistently better with BE3/UGI co-expression than with BE3 alone. Moreover, statistical analysis indicated that the improvements conferred by a high level of free UGI were highly significant (Figure 1C, 1E and 1G; all P values were within the range of 10⁻⁶ to 10⁻⁴). These results indicated that additional free UGI could reduce AP site formation on the single-stranded NTS, thereby suppressing the generation of unwanted indels and C-to-A/C-to-G substitutions while increasing the desired C-to-T editing.
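The paired normalization described above can be sketched as follows, assuming that each BE3/UGI measurement is simply divided by the BE3 measurement from the same biological replicate. The function name and all numbers are hypothetical; the sketch illustrates the idea and is not the analysis code used in this study.

```python
from statistics import mean
from typing import List, Sequence


def normalize_to_paired_control(
    treated: Sequence[float], control: Sequence[float]
) -> List[float]:
    """Divide each treated value by the control value of the same replicate.

    Pairing by replicate is meant to cancel replicate-specific effects
    such as transfection efficiency, assuming they scale both values alike.
    """
    if len(treated) != len(control):
        raise ValueError("treated and control must have one value per replicate")
    if any(c <= 0 for c in control):
        raise ValueError("control values must be positive to normalize against")
    return [t / c for t, c in zip(treated, control)]


# Illustrative indel frequencies for three replicates (not data from the paper).
be3_indels = [0.04, 0.08, 0.02]        # BE3 alone
be3_ugi_indels = [0.02, 0.04, 0.01]    # BE3 with free UGI, same replicates

relative_indels = normalize_to_paired_control(be3_ugi_indels, be3_indels)
mean_relative_indels = mean(relative_indels)

if __name__ == "__main__":
    print("relative indel frequency per replicate:", relative_indels)
    print("mean relative indel frequency:", mean_relative_indels)
```

In this toy example the raw indel frequencies differ fourfold between replicates, yet every normalized value is the same, which is the kind of replicate-to-replicate variation that pairing is intended to remove.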
We next sought to make the enhanced BE (eBE) more convenient to use by constructing a single vector that co-expresses BE3 with either one (eBE-S1) or three (eBE-S3) copies of the 2A-UGI sequence (Figure 1J). After transfection into 293FT cells together with five sgRNAs targeting different genomic loci, both eBEs showed lower indel frequencies and higher C-to-T editing frequencies than the original BE3 (Figure 1K and 1M; Supplementary information, Tables S1 and S2); eBE-S3, with three copies of 2A-UGI and the highest level of UGI expression (Supplementary information, Figure S4A), displayed the most robust and highly significant effect (Figure 1K-1N, P < 10⁻⁸ to 10⁻⁴; Supplementary information, Figure S4B and S4C, Tables S1 and S2). Consistently, the ratios of C-to-T editing to indels were elevated when either eBE was used (Figure 1O and 1P, P < 10⁻⁴ for eBE-S3; Supplementary information, Figure S4D). Moreover, the C-to-A/C-to-G substitutions were also suppressed by eBEs (Supplementary information, Table S2), and eBE-S3 induced a highly significant increase in C-to-T fractions over C-to-A/C-to-G (Figure 1Q and 1R, P < 10⁻⁹). It is worth noting that the nCas9-fused UGI domain is still important for achieving high-fidelity base editing, even when high levels of free UGI are present (data not shown). These observations corroborate the importance of preventing U from being converted into an AP site and are consistent with the hypothesis presented above (Supplementary information, Figure S1C).
Next, we tested the effects of co-expressing BE3 and free UGI in another cell line, HeLa (Supplementary information, Figure S5). Compared to BE3 alone, co-expressing free UGI, whether from a separate vector or from the same vector, induced significantly lower indel frequencies (Supplementary information, Figure S5B-S5D), higher C-to-T editing frequencies (Supplementary information, Figure S5E-S5G), higher ratios of C-to-T editing to indels (Supplementary information, Figure S5H-S5J) and higher C-to-T fractions over C-to-A/C-to-G (Supplementary information, Figure S5K and S5L). Taken together, these results indicated that our enhanced base editing system can improve the efficiency and outcome fidelity of base editing, leading to more accurate gene editing at the single-base level.
In conclusion, we have developed an enhanced base editing system by co-expressing BE3 together with free UGI. This enhanced base editing system not only suppressed the formation of unwanted indels and substitutions but also increased the frequency of C-to-T editing, thereby improving both the fidelity and efficiency of base editing. In settings such as therapy-related applications of BEs, clean editing outcomes are a priority. Our finding thus provides a method to further improve BEs for cleaner editing outcomes. Since new BEs utilizing nCas9s with altered PAMs have recently been developed4, the enhanced base editing strategy reported here could also be used to improve the fidelity and efficiency of these newly developed BEs.
We thank Drs Xingxu Huang (ShanghaiTech University) and Haopeng Wang (ShanghaiTech University) for discussion of this work, and Ms Xiao Wang for participation in experiments. This work was supported by the National Natural Science Foundation of China (31600619, 31600654, 31471241 and 91540115), the Ministry of Science and Technology of China (2014CB910600), and the Shanghai Municipal Science and Technology Commission (16PJ1407000 and 16PJ1407500).
Calculation of indels.
Calculation of base substitutions.
Oligos used for plasmid construction.
sgRNA target sequences and PCR primers for amplifying sgRNA-targeted genomic regions.
(Supplementary information is linked to the online version of the paper on the Cell Research website.)
This work is licensed under a Creative Commons Attribution 4.0 Unported License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/