Abstract
We describe base editors that combine both cytosine and adenine base-editing functions. A codon-optimized fusion of the cytosine deaminase PmCDA1, the adenosine deaminase TadA and a Cas9 nickase (Target-ACEmax) showed a high median simultaneous C-to-T and A-to-G editing activity at 47 genomic targets. On-target as well as DNA and RNA off-target activities of Target-ACEmax were similar to those of existing single-function base editors.
Data availability
The high-throughput sequencing data of this study are available at the Sequence Read Archive (PRJNA596330) of the NCBI. The original fluorescent microscopy image data are available at https://doi.org/10.6084/m9.figshare.12016785.v1.
Code availability
The source code for the base-editing prediction model is available at https://github.com/yachielab/base-editing-prediction. Other code used in this study is available upon request.
Acknowledgements
We thank members of the Yachie lab for useful discussions and critical assessment of this work, especially A. Adel for reviewing the manuscript. We also thank K. Shiina, Y. Takai and N. Ishii for technical support with high-throughput sequencing. This study was mainly funded by the Uehara Memorial Foundation (to N.Y.), the NOVARTIS Foundation (Japan) for the Promotion of Science (to N.Y.), and the Japan Agency for Medical Research and Development (AMED) Platform Project for Supporting Drug Discovery and Life Science Research (to N.Y., H.N. and O.N.), and partly supported by the New Energy and Industrial Technology Development Organization (NEDO), AMED PRIME program (17gm6110007), the Japan Science and Technology Agency (JST) PRESTO program (10814), the Naito Foundation, the SECOM Science and Technology Foundation (all to N.Y.), the Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for Scientific Research (16J06287) (to S.I.) and research funds from the Yamagata Prefectural Government and Tsuruoka City, Japan (to K.A. and M. Tomita). S.I. was supported by a JSPS DC1 Fellowship; S.I., H.M. and N.M. were supported by TTCK Fellowships; H.M. and N.M. were supported by the Mori Memorial Foundation; and N.M. was supported by the Yamagishi Student Project Support Program of Keio University.
Ethics declarations
Competing interests
K.N. and A.K. are shareholders and board members of BioPalette Co., Ltd.
Additional information
Integrated supplementary information
Extended Data Fig. 1 Single- and dual-function base editors used in this study.
Developmental lineages of single- and dual-function base editors used in this study are represented by arrows. Base editor mix controls for dual-function base editors are indicated by dashed lines.
Extended Data Fig. 2 Base-editing activity in base-editing reporter cells.
a, Schematic representation of the C→T base-editing reporter. C→T base editing of the antisense strand followed by DNA replication restores the translation of EGFP by converting a mutated start codon GTG (valine) to ATG (methionine). b, Schematic representation of the A→G base-editing reporter. A→G base editing of the antisense strand followed by DNA replication converts the stop codon TAA to CAA (glutamine), releasing the translation of the downstream EGFP. c, Microscopy images of the positive control cells for C→T and A→G base-editing reporters transiently transfected with different base editor reagents and non-targeting (NT) gRNAs. Scale bar, 40 µm. d, Frequency of start codon restoration in C→T editing reporter cells. Each bar shows the mean of three independent transfection experiments represented by dots. e, Frequency of stop codon destruction in A→G editing reporter cells. f, Frequency of amplicon sequencing reads showing C→T editing at any position of the gRNA target site of C→T editing reporter cells (from –30 to +10 bp relative to the PAM). g, Frequency of amplicon sequencing reads showing A→G editing at any position of the gRNA target site of A→G editing reporter cells (from –30 to +10 bp relative to the PAM).
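The codon logic of the two reporters in panels a and b can be checked with a few lines of Python. This is a minimal sketch for illustration only: the function names `reverse_complement` and `edit_antisense` are hypothetical and are not taken from the study's code. It applies a single base edit to the antisense strand and reads back the resulting sense-strand codon, which is what DNA replication fixes in the reporter.

```python
# Illustrative sketch; helper names are hypothetical, not from the study.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def edit_antisense(sense_codon, sense_index, source, dest):
    """Edit the antisense base paired with sense_codon[sense_index].

    The antisense base must equal `source` (C for C->T editing, A for
    A->G editing). Returns the sense-strand codon after the edit.
    """
    antisense = list(reverse_complement(sense_codon))
    # Position i on the sense strand pairs with position n-1-i on the
    # antisense strand when both are written 5'->3'.
    anti_index = len(sense_codon) - 1 - sense_index
    if antisense[anti_index] != source:
        raise ValueError("antisense base does not match the editor's substrate")
    antisense[anti_index] = dest
    return reverse_complement("".join(antisense))

# C->T reporter: mutated start codon GTG is restored to ATG.
print(edit_antisense("GTG", 0, "C", "T"))  # ATG
# A->G reporter: stop codon TAA is converted to CAA (glutamine).
print(edit_antisense("TAA", 0, "A", "G"))  # CAA
```

In both reporters the edited base is on the antisense strand, so the visible change on the coding strand is the complementary transition (G→A or T→C).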
Extended Data Fig. 3 DNA off-target editing activity.
Editing frequencies of EMX1 site 1 and FANCF site 1 and site 2 and their corresponding off-target sites. Amplicon sequencing experiments were performed in triplicate.
Extended Data Fig. 4 Prediction of base-editing outcome frequencies.
a, Schematic diagram of the model to predict the frequencies of each base-editing outcome. In brief, to train a given base editor model using a training amplicon sequencing dataset for different target sites, probabilities of single base transition events and their conditional probabilities given each of the other single events are thoroughly calculated for different positions relative to the PAM. The frequency of a given editing outcome in a new test target site is then predicted as a geometric mean of probabilities of base transitions at all edited positions, each given by the other independent base transition patterns. b, Correlation of measured and predicted relative editing outcome frequencies in the 5-fold cross-validation experiment.
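One possible reading of the prediction step in panel a is sketched below in Python. It is an interpretation under stated assumptions, not the implementation in the authors' repository: the function name `predict_outcome_frequency`, the dictionary layouts and the choice to average over all ordered pairs of edited positions are hypothetical. Each edit event is keyed by its position relative to the PAM and its transition type; for an outcome with several edits, the sketch takes the geometric mean of the conditional probabilities of each edit given each of the other edits, and falls back to the marginal probability when only one position is edited.

```python
from itertools import permutations
from math import prod

def predict_outcome_frequency(edits, marginal, conditional):
    """Predict the frequency of one editing outcome (illustrative sketch).

    edits:       list of edit events, e.g. [(-17, "C>T"), (-15, "A>G")]
    marginal:    dict mapping event -> P(event), learned from training data
    conditional: dict mapping (event_i, event_j) -> P(event_i | event_j)
    """
    if not edits:
        raise ValueError("an outcome needs at least one edited position")
    if len(edits) == 1:
        return marginal[edits[0]]
    # Assumption: use every ordered pair (i given j) of edited positions.
    probs = [conditional[(i, j)] for i, j in permutations(edits, 2)]
    return prod(probs) ** (1.0 / len(probs))

# Toy numbers for illustration only (not measured values).
c_edit, a_edit = (-17, "C>T"), (-15, "A>G")
marginal = {c_edit: 0.4, a_edit: 0.2}
conditional = {(c_edit, a_edit): 0.5, (a_edit, c_edit): 0.25}
print(predict_outcome_frequency([c_edit, a_edit], marginal, conditional))
```

A geometric mean keeps the prediction on the same scale as a single probability regardless of how many positions are edited, whereas a plain product would shrink with every additional edit.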
Extended Data Fig. 5 Heterologous trinucleotide co-editing frequencies predicted by the computational model.
To predict the multidimensional co-editing spectra of the different base-editing methods using the base-editing prediction model, 100 synthetic target sequences consisting of only cytosine and/or adenine bases in the region from −20 to −1 bp relative to the PAM were generated in silico. For each target sequence, all possible outcomes with C→T and/or A→G edits (2^20 outcomes in total) were predicted using the base-editing prediction model trained on the amplicon sequencing data for all 47 genomic sites. The average heterologous trinucleotide-editing spectra shown by the bubble charts were then calculated using all predicted frequencies.
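The in silico setup described here can be sketched as follows. The function names, the fixed random seed and the uniform choice between C and A are assumptions made for illustration; the source does not state how the synthetic sequences were sampled. Because every one of the 20 positions is an editable base that is either edited or left unchanged, each target has 2^20 = 1,048,576 possible outcomes (counting the unedited sequence), so the sketch enumerates them lazily.

```python
import random
from itertools import product

# C->T and A->G are the two transitions considered.
EDIT = {"C": "T", "A": "G"}

def synthetic_targets(n=100, length=20, seed=0):
    """Generate n random targets made only of C and A (sampling is assumed)."""
    rng = random.Random(seed)
    return ["".join(rng.choice("CA") for _ in range(length)) for _ in range(n)]

def enumerate_outcomes(target):
    """Lazily yield every sequence reachable by C->T and/or A->G edits."""
    choices = [(base, EDIT[base]) for base in target]
    for combo in product(*choices):
        yield "".join(combo)

targets = synthetic_targets()
print(len(targets), len(targets[0]))     # 100 20
print(list(enumerate_outcomes("CA")))    # ['CA', 'CG', 'TA', 'TG']
print(2 ** 20)                           # 1048576 outcomes per 20-nt target
```

Each yielded sequence could then be passed to a trained prediction model to obtain its frequency.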
Extended Data Fig. 6 Codon convertibility matrices (CCMs) of single-function base editors without allowing bystander mutations to occur.
For each codon in the human genome (hg38), possible gRNA target sites were first screened within ±25 bp of the codon. For all gRNAs, base-editing outcome probabilities of all possible C→T and/or A→G editing patterns in the ±15 bp region of the target codon were predicted using the base-editing prediction model trained on the amplicon sequencing data for all 47 genomic sites. The conversion potential of the target source codon to each destination codon without allowing bystander mutations to occur was then defined as the maximum probability of generating the target outcome among those induced by all possible gRNAs. After calculating conversion potentials to different destination codons for all genomic codons, a CCM was generated to show the genome-wide frequency of each source-destination codon conversion type with a conversion potential threshold of 5%.
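The aggregation described in this legend can be sketched in Python as below. The function names and input layout are hypothetical, and two details are assumptions rather than statements from the source: a conversion counts when its potential is greater than or equal to the threshold, and each matrix cell is normalized by the number of genomic codons of that source type. The per-gRNA probabilities are taken as given, for example from a trained prediction model.

```python
from collections import Counter

def conversion_potentials(per_grna):
    """Maximum probability of each destination codon over all gRNAs.

    per_grna: dict mapping gRNA id -> {destination codon: probability of
              producing exactly that codon with no bystander edits}
    """
    potentials = {}
    for outcome_probs in per_grna.values():
        for dest, p in outcome_probs.items():
            potentials[dest] = max(p, potentials.get(dest, 0.0))
    return potentials

def codon_convertibility_matrix(codons, threshold=0.05):
    """Fraction of source codons convertible to each destination codon.

    codons: iterable of (source codon, per_grna) pairs, one per genomic codon.
    Returns {(source, destination): fraction}; normalization is assumed.
    """
    totals, hits = Counter(), Counter()
    for source, per_grna in codons:
        totals[source] += 1
        for dest, p in conversion_potentials(per_grna).items():
            if p >= threshold:  # assumption: threshold is inclusive
                hits[(source, dest)] += 1
    return {pair: n / totals[pair[0]] for pair, n in hits.items()}

# Toy input for illustration only (not measured values).
codons = [
    ("CAA", {"g1": {"TAA": 0.10, "CGA": 0.02}, "g2": {"TAA": 0.03, "CGA": 0.06}}),
    ("CAA", {"g1": {"TAA": 0.01}}),
]
print(codon_convertibility_matrix(codons))
```

Taking the maximum over gRNAs reflects that only the best available guide matters for deciding whether a codon is convertible at a given site.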
Extended Data Fig. 7 Codon convertibility matrices (CCMs) of base editor mixes and dual-function base editors without allowing bystander mutations to occur.
Extended Data Fig. 8 Codon convertibility matrices (CCMs) of single-function base editors allowing bystander mutations to occur.
Extended Data Fig. 9 Codon convertibility matrices (CCMs) of base editor mixes and dual-function base editors allowing bystander mutations to occur.
