Main

Plastid genomes, which encode key genes for photosynthetic processes, including both light reactions and carbon assimilation, are potential targets of plant breeding. Plastid genetic transformation is currently feasible in only a limited number of species2 and is difficult even in the model plant Arabidopsis3,4. In addition, it requires insertion of a marker gene into the plastid genome5, so the created plants are regarded as genetically modified organisms (GMOs). Recently, cytidine deaminase (CD), which converts C to U to change G/C pairs to A/T pairs in double-stranded DNA, was successfully used for in vitro targeted base editing of mitochondrial DNAs in mammalian cultured cells1. Here, we applied this technology to edit targeted bases in three genes in the plastid genome in Arabidopsis plantlets, without leaving any foreign genes in either the plastid or nuclear genomes. Targeted single nucleotide substitutions are expected to be the best way to make desired single nucleotide polymorphisms (SNPs) without disturbing any other genes or regulatory regions in the plastid genomes of common crops or elite lines. For the targets, we selected three genes whose modifications would be expected to lead to observable effects: 16SrRNA, whose modification was expected to confer resistance to an antibiotic, and two genes whose modifications would lead to poor growth, namely rpoC1, which encodes a part of the DNA-directed RNA polymerase subunit beta′, and psbA, which encodes photosystem II (PSII) protein D1.

As in the previous study1, the CD domain (163 amino acids (aa)) of Burkholderia cenocepacia DddA toxin (1,427 aa) was split at amino acid 1,333 or 1,397. Each of the amino terminal or carboxy terminal halves of the CD was linked to the C terminus of the DNA binding domain of the platinum TALEN6 (pTALECD; Fig. 1a). The N terminus of the pTALECD was linked to a plastid-targeting signal peptide (PTP) of Arabidopsis thaliana RecA1 protein (51 aa)7,8 (Fig. 1b), while the C terminus was linked to a uracil glycosylase inhibitor (UGI)1,9 to inhibit hydrolysis of the generated uracil (Fig. 1b). The nucleotide sequences of CD and UGI were optimized for A. thaliana codon usage. A pair of PTP-pTALECD-UGIs (ptpTALECDs) were expressed in a single plant transformation vector under control of efficient RPS5A promoters10 (Fig. 1b). We established a system to smoothly assemble the complicated tandem expression vectors of ptpTALECD for each target sequence on the Ti plasmid (Extended Data Fig. 1) by replacing the FokI in the vectors used in a previous study11 with CD-UGI (Extended Data Fig. 2). We introduced the vectors into the nucleus of A. thaliana by floral dipping12 and attempted to substitute C/G to T/A in 16SrRNA (Fig. 1c), rpoC1 (Fig. 1d) and psbA (Fig. 1e). Substitution of G5 and/or G8 in 16SrRNA (highlighted in red in Fig. 1a) to A would confer spectinomycin (Spm) resistance (below)13,14, while substitutions of C6 in rpoC1 and C10 in psbA to T would change their initiation codons from ATG (methionine) to ATA (isoleucine). As a result, accumulation of the encoded proteins would decrease and mutants would grow poorly15,16 and/or be unable to grow photoautotrophically17,18. Other neutral mutations in some C/G pairs in the target windows, which are the regions between the sequences that the left and right transcription activator-like effector (TALE) domains recognize, would also be expected.
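The expected effect of deamination on an initiation codon can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy sequence, the 0-based coordinates and the helper names (reverse_complement, deaminate) are hypothetical and are not taken from the study. The point it shows is that a C-to-T change on the bottom strand reads as G-to-A on the top strand, which is how ATG becomes ATA.

```python
# Hypothetical illustration: how a C-to-T deamination on either strand
# changes the top-strand sequence. Sequences here are toy examples.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def deaminate(seq: str, pos: int, strand: str = "top") -> str:
    """Return the top-strand sequence after C-to-T editing at 0-based pos.

    strand="top": the edited C is on the given strand (C -> T).
    strand="bottom": the edited C is on the complementary strand,
    so the top strand shows G -> A at the same position.
    """
    expected, new = ("C", "T") if strand == "top" else ("G", "A")
    if seq[pos] != expected:
        raise ValueError(f"position {pos} is {seq[pos]}, expected {expected}")
    return seq[:pos] + new + seq[pos + 1:]


# Toy window containing a start codon at positions 2-4 (assumed sequence).
toy = "TTATGACT"
edited = deaminate(toy, 4, strand="bottom")
print(toy[2:5], "->", edited[2:5])
```

Editing the C of the complementary strand directly gives the same answer: reverse_complement("ATG") is "CAT", deaminating its C gives "TAT", and the reverse complement of "TAT" is "ATA".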

Fig. 1: ptpTALECD for three plastid genes.

a, A pair of pTALECD proteins and its targeting window (shown in a red rectangle) in 16SrRNA gene and CD half combination. Substitution of the fifth and/or eighth C/G pairs (shown in red) with T/A pairs was predicted to confer Spm resistance. b, T-DNA region of the tandem expression vectors for ptpTALECD. c–e, Base-edited plant numbers, editing efficiencies shown in colour (bottom) and predicted amino acid substitutions in the three target windows of 23 DAS T1 plants (c, 16SrRNA; d, rpoC1; e, psbA). f–h, Representative data from Sanger sequencing in the ptpTALECD targeting windows of 23 DAS T1 plants (f, 16SrRNA; g, rpoC1; h, psbA). i, Transition of substitution frequency states at the targeted bases between 11 DAS and 23 DAS T1 plants. Abbreviations: h/c, heteroplasmically or chimaerically substituted; homo, homoplasmically substituted; Cp, preferential target cytosines; and C*, special target cytosines predicted to cause biological effects.

Twelve ptpTALECD expression vectors were constructed (four pairs of CD halves for each of three targets). Each vector was introduced into A. thaliana and, 23 days after stratification (DAS), the targeted regions of the T1 plants were sequenced by Sanger sequencing. Only constructs from which T1 plants were obtained are shown in Fig. 1c–e and Supplementary Table 1a–c. In all three target windows, C/G pairs were replaced with T/A pairs in multiple T1 lines (Fig. 1c–h and Supplementary Table 1a–c). Surprisingly, in many lines, the targeted base(s) seemed to be homoplasmically substituted (homo), while in other lines, they seemed to be heteroplasmically or chimaerically substituted (h/c; Fig. 1c–h and Supplementary Table 1a–c). Such homoplasmic mutations might have occurred through stochastic sorting processes, such as selection of mutations in a small number of copies of plastid genomes (that is, plastid sorting)19 or gene conversion20. Nevertheless, it is also conceivable that ptpTALECD mutated the C/G pairs in the target windows of all plastid genomes at the early stage of embryogenesis, because the RPS5A promoter used for ptpTALECD expression was reported to drive strong gene expression in egg cells and early embryos21,22. Not all C/G pairs in the target windows were substituted and the positions of the substituted C/G pairs were biased in all three target windows (Fig. 1c–e). Three homoplasmically substituted bases were the C of (5′)TC(3′) (Cp in Fig. 1c–e), which was the preferential target1, but a C of (5′)AC(3′) in the 16SrRNA gene was also substituted (Fig. 1c).

To investigate the stability of mutation rates during plant development, total DNAs extracted from an emerging leaf of T1 plants at 11 and 23 DAS (or from a cotyledon of slowly growing plants at 11 DAS; Supplementary Table 1a–c) were sequenced. Among the plants with base change(s) in the target window on either day, some had bases heteroplasmically or chimaerically (h/c) substituted on both days with their mutation rate increased or decreased (27.5%, 14/51; Fig. 1i) and others had bases at which the mutation rate differed between the two time points (for example, homoplasmy (homo) to h/c, 5.9% (3/51); h/c to null, 15.7% (8/51); h/c to homo, 11.8% (6/51); or null to h/c, 2.0% (1/51); Fig. 1i). Many of the remaining plants had bases that were homoplasmically substituted on both days (37.3%, 19/51; Fig. 1i). Interestingly, one leaf of one T1 plant (16SrRNA 1397NC 3) showed an example of sector formation19; that is, it had differently coloured sectors (wild-type-like green and pale) and the mutation rate at the Cp* in 16SrRNA differed between the sectors (Extended Data Fig. 3a,b). Remarkably, most of the bases homoplasmically substituted at 11 DAS were also homoplasmically substituted at 23 DAS (86.4%, 19/22; Fig. 1i), suggesting that the targeted bases of T1 plants transformed by the ptpTALECD expression vector were homoplasmically substituted at a high frequency and that the homoplasmic mutations were stably fixed through development.
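The percentages quoted above follow directly from the reported counts, as the short Python check below shows. The dictionary labels are our own shorthand. The assumption that the 22 plants homoplasmic at 11 DAS are the 19 that stayed homoplasmic plus the 3 that became h/c is an inference from the quoted numbers, not a statement from the study.

```python
# Counts of T1 plants per transition between 11 DAS and 23 DAS, as quoted
# in the text (n = 51). Labels are shorthand introduced for this sketch.
transitions = {
    "h/c -> h/c": 14,
    "homo -> h/c": 3,
    "h/c -> null": 8,
    "h/c -> homo": 6,
    "null -> h/c": 1,
    "homo -> homo": 19,
}

total = sum(transitions.values())
percent = {k: round(100 * v / total, 1) for k, v in transitions.items()}

# Assumption: plants homoplasmic at 11 DAS = stayed homo + changed to h/c.
homo_at_11 = transitions["homo -> homo"] + transitions["homo -> h/c"]
retained = round(100 * transitions["homo -> homo"] / homo_at_11, 1)

for label, value in percent.items():
    print(f"{label}: {value}%")
print(f"homoplasmy retained: {retained}% ({transitions['homo -> homo']}/{homo_at_11})")
```

The six categories sum to 51, so together they account for every plant with a base change on either day.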

Next, we investigated the off-target effects of ptpTALECD on the plastid and mitochondrial genomes, because both organelle genomes are maternally inherited, so off-target mutations in these two organelle genomes cannot be segregated from the desired mutation by usual cross-breeding. The total genomes of 17 T1 plants were sequenced (NovaSeq, Illumina; Fig. 2a and Supplementary Table 2). The targeted C appeared to be homoplasmically substituted to T in 16 of them (16SrRNA 1397C-1397N (1397CN) 1, 2, 7, 8, 12, 16; 1397N-1397C (1397NC) 1–3; psbA 1397CN 6; 1397NC 1, 5; and rpoC1 1397CN 8, 9, 13, 16), while it was either heteroplasmically or chimaerically substituted in the remaining one (rpoC1 1397CN 3). Each redundant mutation in the inverted repeats of the plastid genome was counted as one mutation. The targeted bases in these 16 lines were confirmed to be homoplasmically or dominantly substituted and the base in the remaining line was confirmed to be heteroplasmically or chimaerically substituted (Fig. 2a and Supplementary Table 2). Dominant off-target point mutations (with substitution frequencies >50%) were detected at six places in the line 16SrRNA 1397CN 1, while no dominant off-target point mutations were detected in the other lines (Fig. 2a and Supplementary Table 2). The 16SrRNA 1397CN 1 plant did not have any true leaves (Extended Data Fig. 4) and died before 23 DAS. A total of 116 off-target mutations (allele frequencies ≥1%) were detected in the plastid genomes (Fig. 2b–f and Supplementary Table 2). Most (69.0%) were located within 2,000 base pairs (bp) of the target windows, while only a few (11.2%) were located within 20 bp of sequences similar to those recognized by TALEs. The rest were found in other regions. No dominant off-target mutations were detected in the mitochondrial genomes of these 17 lines, including 16SrRNA 1397CN 1 (Supplementary Table 2). These results indicate that ptpTALECD only infrequently introduced off-target point mutations into organelle genomes and can specifically and homoplasmically (or dominantly) substitute C/G to T/A in the target windows.

Fig. 2: Investigation of on- and off-target mutations and determinants of off-target mutations.

a, Mutated read frequencies in the plastid genomes of nine T1 lines (11 DAS) and one T2 line (49 DAS) targeted for 16SrRNA. SNPs where ≥10% of the reads were different from the reference genome (AP000423.1) in at least one plant are listed. Supplementary Table 2 shows all the on- and off-target mutations in both organelle genomes, where different reads frequencies were ≥1%. b–e, Positions and frequencies of on- and off-target mutations (shown as green dots) in T1 plants targeting 16SrRNA (b), rpoC1 (d) and psbA (e) and in null-segregant T2 plants mutated in 16SrRNA (c). Magenta lines represent the target windows. The right panels are magnified views of the regions surrounded by the dotted rectangles in left panels within 1,000 bp from the target nucleotides in each gene (G5 in 16SrRNA, G3 in rpoC1 and C10 in psbA). f, List of the off-target mutations categorized by types. Off-target mutations within 2 kb of the target windows and/or within 20 bp of sequences similar to those recognized by TALEs and in other regions are shown. In b–f, off-target mutations were defined as SNPs in which C/G-to-T/A substitutions (allele frequencies ≥0.01) were detected only in the T1 or T2 plants but not in three wild-type plants used as controls.

All but one of the T1 plants that were transformed by the 16SrRNA-targeting ptpTALECD vector and whose first Cp* (G5) and/or C10 was homoplasmically substituted were fertile (Supplementary Table 1a). The exception was 16SrRNA 1397CN 1. To investigate whether the mutations were stably inherited by the offspring, the T2 progenies of three of these lines (16SrRNA 1397CN 2, 8 and 1397NC 3) were genotyped (Fig. 3a and Extended Data Fig. 5a). Transgenic T2 plants were identified by having seed-specific green fluorescent protein (GFP) fluorescence from Ole1 pro::Ole1-GFP (ref. 23) on the transfer DNA (T-DNA; Fig. 1b) and/or a positive polymerase chain reaction (PCR) result showing the presence of the ptpTALECD reading frame (Fig. 1b and 3a). The progeny of these lines stably inherited the homoplasmic mutations (Fig. 3a and Extended Data Fig. 5a). Interestingly, some T2 plants had white, red or variegated cotyledons (Fig. 3b and Extended Data Fig. 5b), which were different from the phenotypes of their parents (Extended Data Fig. 4). All of these plants were GFP positive (Fig. 3a and Extended Data Fig. 5a) and many of them (8/9) had additional mutation(s) in or near the target window of 16SrRNA (Extended Data Fig. 5a). Because, as mentioned above, the RPS5A promoter used for ptpTALECD expression was reported to drive strong gene expression in egg cells and early embryos21,22, de novo mutagenesis may have occurred during the early developmental stage in these transgenic T2 plants with abnormal cotyledons. In contrast, all T2 plants of the T-DNA-free null segregants examined had the targeted mutations without any of the additional altered phenotypes described above. No major off-target mutations were detected in the three null-segregant T2 plants (16SrRNA 1397CN 8 lines 1, 2 and 8), whose genomes were sequenced by next generation sequencing (Fig. 2a and Supplementary Table 2). Homoplasmic mutations in rpoC1 (G3) and psbA (C10) in other lines were also inherited by their T2 progeny (Extended Data Figs. 6a and 7a). These data indicate that plastid genomes with artificially introduced point mutations were stably inherited, independently of nuclear T-DNA inheritance, and also suggest that transgene-free plants with targeted point mutations in the plastid genomes were successfully established.

Fig. 3: T2 generation analysis.

a,b, Genotypes and phenotypes of T2 progenies of 16SrRNA 1397CN 2. a, Gel images of bands of PCR products of 16SrRNA and ptpTALECD, presence of seed GFP fluorescence, genotypes of G5 SNP and phenotypes of T2 progenies of 16SrRNA 1397CN 2 are shown. Abbreviations: WT, wild type (Col-0); NTC, non-template control. b, Representative phenotype images of five WT-like plants (lines 1, 3, 4, 6, 8) and a plant with red cotyledons (line 7). Scale bar, 1 mm. c,d, Phenotypes of T2 progenies of 16SrRNA 1397CN 2 and 15 in the presence of Spm. c, Images of seeds (0 DAS) and seedlings (8 DAS) of the T2 progenies of the two lines and WT (Col-0) on 1/2 MS medium containing 50 mg l–1 of Spm. Scale bar, 1 cm. d, Table showing the relationship between the presence of seed GFP fluorescence and plant colour at 8 DAS. Abbreviations: W/G, white or red cotyledons and green leaves; n.g., not germinated.


The antibiotic Spm binds to a specific location in Escherichia coli 16SrRNA and inhibits translation24. Substitution of a specific G near this region to A confers Spm resistance (Spmr)13. The targeted G5 in the Arabidopsis plastid 16SrRNA gene is homologous to this G. Several mutations are known to confer Spmr to flowering plants25,26 but none of them occur at the position of the targeted G5. T2 seeds obtained from a T1 plant in which G5 was homoplasmically substituted to A (16SrRNA 1397CN 2; Supplementary Table 1a) were sown on plates containing Spm. Many of the seedlings that germinated from these seeds showed Spmr, regardless of the presence of seed GFP fluorescence (Fig. 3c and Extended Data Fig. 8b–d). However, some T2 progenies from 16SrRNA 1397CN 2 showed a Spm-sensitive (Spms)-like phenotype (white plantlet with purple cotyledon; Fig. 3c and Extended Data Fig. 8a–d). All the Spms-like plantlets germinated from GFP-positive seeds (Fig. 3c and Extended Data Fig. 8a–d) and all of those examined (5/5; Extended Data Fig. 9) had multiple de novo mutations (in addition to G5) in 16SrRNA. This suggests that the de novo mutations caused dysfunction of 16SrRNA, resulting in the Spms-like phenotype. The existence of Spms-like T2 plants of 16SrRNA 1397CN 2 on Spm-free medium (Extended Data Fig. 5a,b) with the de novo mutation(s) supports this suggestion. Surprisingly, some of the progeny of a T1 plant (16SrRNA 1397CN 15) that had the G5 mutation at very low frequency at 11 DAS and no mutation at 23 DAS (Supplementary Table 1a) also showed Spmr. These progeny germinated from GFP-positive seeds (Fig. 3c). In five of them, the G5 was homoplasmically substituted to A and in 13 others it was dominantly substituted to A (Extended Data Fig. 9). This suggests that the inherited nuclear T-DNA caused a major de novo mutation on the G5. These results suggest that homoplasmic substitution of G5 to A confers Spmr to A. thaliana. Furthermore, the GFP-negative T2 progeny showed the Spmr or Spms phenotype predicted from the SNPs at G5 in their T1 parents, indicating that the null-segregant T2 plants were likely to inherit the mutation(s) of their parent and unlikely to carry additional mutations.

Previous studies showed that accumulation of D1 protein (encoded by psbA) and/or the maximum quantum yield of PSII (Fv/Fm) drastically decreased in mutants deficient in psbA expression17,18. Furthermore, these mutants looked pale17 and could not grow photoautotrophically17,18. Surprisingly, psbA 1397NC 1, which had the homoplasmic mutation at the psbA initiation codon (C10) at both 11 and 23 DAS (Supplementary Table 1c), could grow photoautotrophically and set viable seeds. Thus, to investigate the effects of the homoplasmic mutation at the psbA initiation codon (C10) on its expression, we measured Fv/Fm and the accumulation of D1 protein in T-DNA-free null-segregant T2 progeny of psbA 1397NC 1, which were confirmed to inherit the homoplasmic mutation. Unexpectedly, their growth (Extended Data Fig. 6b,c) and accumulation of D1 protein (Extended Data Fig. 6d–f) were comparable to those in wild-type plants, while Fv/Fm was only slightly decreased compared with wild-type plants (Extended Data Fig. 6g). One possibility is that another codon served as the initiation codon. It could be another AUG, or possibly a GUG or UUG, which can also serve as start codons in the chloroplast27. Upstream of the altered AUA, no such sites occur after the nearest stop codon. Downstream, the next potential start codons would shorten the protein by at least 10%, but they can be excluded because the recombinant protein was the same size as the wild-type protein (Extended Data Fig. 6d). These results suggest that the AUA codon does not greatly affect the initiation of translation of psbA or the D1 level but that the AUG codon is necessary for the full activity of PSII. Thus, a better way to knock out a plastid gene might be to create a premature stop codon in its reading frame rather than to change the initiation codon to AUA. In rpoC1, contrary to expectation, none of the homoplasmic mutations obtained were at the initiation codon. Instead, they were at the second codon, where they caused a synonymous mutation (Ile to Ile; Fig. 1d and Supplementary Table 1b). Null-segregant T2 progeny of rpoC1 1397CN 8, which had the synonymous homoplasmic mutation at both 11 and 23 DAS, inherited the homoplasmic mutation and appeared to grow as well as wild-type plants (Extended Data Fig. 7b,c).

These experiments showed that ptpTALECD could specifically introduce homoplasmic C-to-T mutations in target windows in the A. thaliana plastid genome and that the mutations were stably (and probably maternally) inherited by the progeny seeds. Previous attempts to introduce homoplasmic mutations in mammalian mitochondrial genomes were unsuccessful1,28. The method was also successful in a region of inverted repeats, where mutations are thought to occur at a lower rate due to their greater potential for copy correction29; 16SrRNA occurs in inverted repeats and targeted point mutations were successfully introduced in both copies. Compared to traditional methods for plastid transformation, such as biolistic methods, ptpTALECD technology has three advantages. First, it allows plastid-genome editing of A. thaliana without using specific mutants3,4 or a specific ecotype30 and without tissue culture, which is a major obstacle to plastid transformation. Second, it could probably be used to edit plastid genomes of other plant species that are recalcitrant to plastid transformation but amenable to nuclear transformation. And third, it could be used to create plastid-genome-edited plants without leaving any foreign gene in their genomes. Such plants are not regarded as GMOs in several countries. On the other hand, the ptpTALECD method has some problems with respect to accuracy. For example, unwanted substitutions in the target windows occurred (C10 in 16SrRNA and G3 in rpoC1; Fig. 1c,d), while homoplasmic mutations at some special target C/G pairs in the target windows were not introduced (G8 in 16SrRNA and C6 in rpoC1; Fig. 1c,d). These problems might be avoided by sliding the TALE recognition targets a few base pairs upstream or downstream or by using different sizes of target windows or by optimizing the sequences linking the TALE and CD31. In any case, only a few mutations in this study were off target. We also obtained null-segregant T2 plants that had the targeted homoplasmic mutation but had no off-target mutations (Fig. 2a and Supplementary Table 2).

This technology may also be useful for strengthening agronomic traits. For example, amino acid polymorphisms in the plastid-encoded ribulose 1,5-bisphosphate carboxylase/oxygenase (Rubisco) large subunit are expected to affect the carbon assimilation (and oxidation) rate32,33 and some polymorphisms in psbA (not involving the C10 in this study) enhance herbicide resistance34. In addition, null-segregant plants are not regarded as GMOs in some countries and the introduced mutations would not leak out of the pollen2,5. Therefore, plants with their plastid genomes precisely edited by ptpTALECD might be more acceptable to the public. Also, this technology could be used for creating premature stop codons, substituting amino acids and modifying RNA editing sites. Thus, ptpTALECD technology has the potential to accelerate both plant breeding and basic research on plastid-encoded genes.

Methods

Plant material and growth conditions

A. thaliana Columbia-0 (Col-0) and transgenic plants were grown at 22°C under long-day conditions (16 h light, 8 h dark). Col-0 seeds were sown on 1/2 Murashige and Skoog (MS) medium (pH 5.7) containing 2.3 g l−1 of MS Plant Salt Mixture (Wako), 500 mg l−1 of MES, 10 g l−1 of sucrose, 1 ml l−1 of Plant Preservative Mixture (Plant Cell Technology), 1 ml l−1 of Gamborg’s Vitamin Solution (Sigma–Aldrich) and 8 g l−1 of agar. Two- to three-week-old seedlings were transferred to Jiffy-7 (Jiffy Products International) and thereafter subjected to Agrobacterium-mediated transformation. Most T1 plants were transplanted to Jiffy-7 but several growth-retarded plants were transplanted to plant boxes containing the 1/2 MS medium at 23 DAS.

Designing the TALE binding sequence

The TALE targeting sequences were designed on both sides of the CD targeting window with Old TALEN Targeter (https://tale-nt.cac.cornell.edu/node/add/talen-old). Wherever possible, the first recognized base was chosen to lie immediately 3′ of a ‘T’. The minimum length of a TALE targeting sequence was 15 bp so that the TALE would bind the sequence specifically. All the sequences that the TALEs bind and the target windows between the TALE binding sequences are shown in Supplementary Table 3 and Fig. 1c–e.
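The two design rules above (a preceding T and a minimum length of 15 bp) can be expressed as a simple check. The Python sketch below is hypothetical: it is not the algorithm of the TALEN Targeter tool, the toy genome string is invented, and coordinates are assumed to be 0-based on the strand that the TALE reads.

```python
# Hypothetical check of the two stated design rules for a candidate
# TALE binding site. Not the TALEN Targeter algorithm.
MIN_TALE_LENGTH = 15  # minimum binding-site length stated in the text


def check_tale_site(genome: str, start: int, length: int) -> dict:
    """Report whether a candidate site meets the two stated rules.

    genome: sequence on the strand read by the TALE (assumption)
    start:  0-based index of the first recognized base
    length: number of bases recognized
    """
    site = genome[start:start + length]
    return {
        "site": site,
        "long_enough": len(site) >= MIN_TALE_LENGTH,
        # The base immediately 5' of the first recognized base should be T.
        "preceded_by_T": start > 0 and genome[start - 1] == "T",
    }


toy_genome = "ACGTACGGATTCAGGCTAAGCTTGA"  # invented sequence
print(check_tale_site(toy_genome, 4, 15))
```

Because the text says the T rule was applied only as far as possible, a real design would treat preceded_by_T as a preference rather than a hard filter.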

Vector constructions

A pair of left and right ptpTALECDs in Ti-plasmids (Extended Data Fig. 1b) for each target was constructed by using Platinum Gate assembling kit and multisite Gateway (Thermo Fisher) as described in our previous study of mitochondria-targeted TALEN11.

The DNA binding domains of ptpTALECD were assembled with the Platinum Gate TALEN system6 on the basis of the same previous study11 (Extended Data Fig. 1a). Each FokI coding sequence in the previous vectors of mitoTALENs used for assembly-step2 was replaced in advance by the CD half and UGI coding sequence with In-Fusion HD Cloning Kit (TaKaRa; Extended Data Fig. 2). The CD half and UGI coding sequences were designed to encode the same amino acids as those of Mok’s experiment1 and artificially synthesized by Eurofins Genomics with the codon usage optimized for A. thaliana (https://www.eurofinsgenomics.jp/jp/orderpages/gsy/gene-synthesis-multiple/; Supplementary Table 4). The reading frames in the assembled first and third entry vectors and the second entry vector (below) were transferred into the Ti plasmid10 by a multi-LR reaction with LR Clonase II Plus enzyme (Thermo Fisher Scientific; Extended Data Fig. 1b). The second entry vector had an Arabidopsis heat-shock protein terminator35, an Arabidopsis RPS5A promoter and the N terminal (51 aa) PTP of Arabidopsis RECA1 (refs. 7,8; Extended Data Fig. 10a). This Ti plasmid was made from a Gateway destination Ti plasmid pK7WG2 (ref. 36) by replacing the CaMV 35S promoter with the Arabidopsis RPS5A promoter and inserting the PTP coding sequence and Ole1 pro::Ole1-GFP derived from pFAST02 (ref. 23; Extended Data Fig. 10b; http://www.inplanta.jp/pfast.html). All primers used for vector construction are listed in Supplementary Table 5. All plasmids are deposited in Addgene and their sequences are also available in Addgene (ID 171723–171736).

Plant transformation and screening transformants

Col-0 plants were transformed by floral dipping12 with Agrobacterium tumefaciens strain C58C1 that harboured one of the transformation vectors described above. Transgenic T1 seeds were selected at first by observing seed GFP fluorescence. GFP-positive seeds were sown on the 1/2 MS medium (section Plant material and growth conditions) further containing 125 mg l−1 of claforan. In addition, GFP-negative seeds were sown on the 1/2 MS medium containing 50 mg l−1 of kanamycin and 125 mg l−1 of claforan.

Sanger sequencing and next generation sequencing and their analyses

Total DNAs were extracted from an emerging true leaf or a cotyledon of the selected seedlings with the Maxwell RSC Plant DNA Kit (Promega). To genotype transgenic lines, plastid DNA sequences adjacent to the CD targeting windows were amplified with primer sets (Supplementary Table 6). Purified PCR products were subjected to Sanger sequencing (Eurofins Genomics) to detect substitution of the targeted bases. The data were analysed with Geneious Prime (v.2020.2.2).

We called SNPs in the plastid and mitochondrial genomes using total DNA sequencing data. First, we commissioned Macrogen Japan to prepare paired-end libraries using a Nextera XT DNA Library Prep Kit (Illumina) and to sequence them on the Illumina NovaSeq 6000 platform. As preprocessing for analysis, low-quality and adaptor sequences in the reads were trimmed using Platanus_trim v.1.0.7 (http://platanus.bio.titech.ac.jp/pltanus_trim). Paired-end reads of each strain were mapped to reference sequences (AP000423.1 and BK010421.1) using BWA (v.0.7.12)37 in single-ended mode. We filtered out inadequately mapped reads with mapping identities ≤97% or alignment cover rates ≤80%. SNPs were then called using the samtools mpileup command (-uf -d 30000 -L 2000) and the bcftools call command (-m -A -P 0.1)38. We finally listed positions in which variants with allele frequencies (AFs) ≥0.1 were detected in at least one strain including the WT (Fig. 2a). SNP calls with AFs ≥0.01 were also performed for positions with read depths ≥500 (Supplementary Table 2).
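The thresholds described above can be illustrated with a small Python sketch. It is not the authors' pipeline: the Call record, the function names and the toy read counts are hypothetical, and only the numerical cut-offs (identity >97%, cover >80%, AF ≥0.1, AF ≥0.01 with depth ≥500) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One variant observation (hypothetical record layout)."""
    strain: str
    pos: int
    alt_reads: int
    depth: int

    @property
    def af(self) -> float:
        # Allele frequency = variant reads / total reads at the position.
        return self.alt_reads / self.depth if self.depth else 0.0


def keep_read(identity: float, cover: float) -> bool:
    """Keep reads unless identity <= 97% or alignment cover <= 80%."""
    return identity > 0.97 and cover > 0.80


def positions_for_main_list(calls, min_af=0.1):
    """Positions with AF >= min_af in at least one strain."""
    return sorted({c.pos for c in calls if c.af >= min_af})


def sensitive_calls(calls, min_af=0.01, min_depth=500):
    """Lower-frequency calls, restricted to well-covered positions."""
    return [c for c in calls if c.depth >= min_depth and c.af >= min_af]


# Toy data, invented for illustration only.
calls = [
    Call("WT", 100, 0, 2000),
    Call("line1", 100, 1990, 2000),  # AF 0.995
    Call("line1", 250, 30, 2000),    # AF 0.015
    Call("line2", 300, 4, 200),      # AF 0.02 but depth below 500
]
print(positions_for_main_list(calls))
print([(c.strain, c.pos) for c in sensitive_calls(calls)])
```

In this toy set only position 100 reaches the 0.1 threshold, while the 0.01 threshold adds position 250 and rejects position 300 because its depth is too low.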

To evaluate whether closeness to target sites or similarity to TALE sequences influenced the locations of off-target mutations, we tallied off-target mutations that were either within 2,000 bp of the target site or within 20 bp of sequences ≥70% similar to those recognized by one of the TALEs.
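The tally described above amounts to labelling each off-target position by its distance to the target and to TALE-like sequences. The Python sketch below makes several assumptions that the text does not specify: distances are measured to the edges of an interval, similarity is computed as ungapped identity over equal-length sequences, the circularity of the plastid genome is ignored, and all names and coordinates are hypothetical.

```python
def ungapped_identity(a: str, b: str) -> float:
    """Fraction of matching bases; assumed similarity measure."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def distance_to_interval(pos: int, interval: tuple) -> int:
    """Distance in bp from pos to a closed interval (0 if inside)."""
    start, end = interval
    if start <= pos <= end:
        return 0
    return min(abs(pos - start), abs(pos - end))


def classify_off_target(pos, target_window, tale_like_sites,
                        near_target=2000, near_tale=20):
    """Label an off-target position using the thresholds from the text.

    tale_like_sites: intervals already judged >= 70% similar to a TALE
    recognition sequence (for example with ungapped_identity).
    """
    labels = set()
    if distance_to_interval(pos, target_window) <= near_target:
        labels.add("near_target")
    if any(distance_to_interval(pos, s) <= near_tale for s in tale_like_sites):
        labels.add("near_tale_like")
    return labels or {"other"}


# Toy coordinates, invented for illustration.
window = (1000, 1020)
tale_like = [(5000, 5015)]
for p in (1500, 5030, 9000):
    print(p, sorted(classify_off_target(p, window, tale_like)))
```

A position can carry both labels, which matches the "and/or" wording used for the categories in Fig. 2f.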

Genotyping T2 plants

T2 seeds obtained from several T1 lines were sown on the 1/2 MS medium (section Plant material and growth conditions). The genotypes of the target windows of a cotyledon of the 7 DAS (for Fig. 3a and Extended Data Figs. 5a, 6a and 7a) or 13 DAS (for Extended Data Fig. 9) seedlings were determined in the same way as for the T1 plants (above). The ptpTALECD PCRs were performed with primers described in Supplementary Table 6.

Screening Spm-resistant plants

T2 seeds obtained from a T1 line in which G5 in 16SrRNA was homoplasmically substituted at 11 and 23 DAS and control seeds were sown on the 1/2 MS medium (section Plant material and growth conditions) containing 0, 10, 50 or 100 mg l−1 of Spm (without Plant Preservative Mixture for Extended Data Fig. 8a–d). Phenotypes of germinated seedlings were observed at 8 DAS.

Measurement of chlorophyll fluorescence

Chlorophyll fluorescence was measured using a MINI-pulse-amplitude modulation portable chlorophyll fluorometer (MINI-PAM; Walz). Minimal fluorescence at open PSII centres in the dark-adapted state (Fo) was excited by a weak measuring light (650 nm) at a photon flux density of 0.05 to 0.1 μmol of photons m−2 s−1. A saturating pulse of white light (800 ms, 8,000 μmol of photons m−2 s−1) was applied to determine the maximal fluorescence at closed PSII centres in the dark-adapted state (Fm). Maximum quantum yield of PSII was calculated as Fv/Fm, where Fv = Fm − Fo. These procedures were done independently three times (experimental replicates = 3). In each replicate, four plants of each genotype (Col-0 and psbA 1397NC 1 T2) were analysed, average values and standard errors were calculated and Fv/Fm values of the two groups were compared by a two-tailed Welch’s t-test.
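The calculation described above can be sketched with the Python standard library. The fluorescence readings below are invented placeholders, not data from the study, and the helper names are hypothetical. The sketch computes Fv/Fm = (Fm − Fo)/Fm for each plant and then the Welch t statistic with its Welch–Satterthwaite degrees of freedom; obtaining the two-tailed P value would additionally require a t-distribution function, which is not included here.

```python
from math import sqrt
from statistics import mean, variance


def fv_fm(fo: float, fm: float) -> float:
    """Maximum quantum yield of PSII: (Fm - Fo) / Fm."""
    return (fm - fo) / fm


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented (Fo, Fm) readings for four plants per genotype.
wild_type = [(200, 1000), (210, 1010), (195, 990), (205, 1005)]
mutant = [(220, 1000), (230, 1010), (225, 995), (215, 1000)]

wt_values = [fv_fm(fo, fm) for fo, fm in wild_type]
mut_values = [fv_fm(fo, fm) for fo, fm in mutant]

t_stat, dof = welch_t(wt_values, mut_values)
print(f"mean Fv/Fm WT={mean(wt_values):.3f}, mutant={mean(mut_values):.3f}")
print(f"Welch t={t_stat:.2f}, df={dof:.1f}")
```

Welch's version of the test does not assume equal variances in the two groups, which is why the degrees of freedom are estimated from the data rather than fixed at n1 + n2 − 2.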

SDS–polyacrylamide gel electrophoresis and immunoblot analyses

Leaf extract was prepared by grinding the rosette leaves using mortar and pestle in an ice-cold buffer (20 mM Tricine (pH 8.4) containing 330 mM sorbitol, 10 mM NaHCO3, 5 mM EGTA and 5 mM EDTA). After filtration with two layers of Miracloth, intact chloroplasts were collected by centrifugation for 5 min at 4,800g. The purified chloroplasts were ruptured in a buffer (20 mM HEPES-KOH (pH 7.6), 5 mM MgCl2, 2.5 mM EDTA and complete ULTRA protease-inhibitor cocktail (Roche)). The insoluble fraction containing thylakoids and envelopes was separated from the soluble fraction by centrifugation for 2 min at 15,000g and resuspended in the above buffer. The concentration of chlorophyll was determined as described previously39. Chloroplast thylakoid and membrane proteins were solubilized in SDS–PAGE sample buffer. Proteins solubilized from the thylakoid membrane corresponding to 1–2 μg of chlorophyll were separated by 12.5% (w/v) SDS–PAGE and electrotransferred onto polyvinylidene fluoride membranes. The antibodies were added and the protein–antibody complexes were labelled using the ECL Prime western-blotting detection system (GE Healthcare). The chemiluminescence was detected with a lumino-image analyser (LAS4000, GE Healthcare). Anti-PsbA and anti-AtpB were purchased from Agrisera. Anti-PetA and anti-PsbO were kindly provided by A. Makino (Tohoku University, Japan) and T. Endo (Kyoto University, Japan), respectively.

Image processing

Plant images were taken with an iPhone Xs (Apple) and a LEICA MC 170 HD (Leica). Gel images were taken with a ChemiDoc MP Imaging System (Bio-Rad). These images were processed with Adobe Photoshop 2021 (Adobe). Figures and tables were made with Adobe Photoshop 2021 and Adobe Illustrator 2021 (Adobe).

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.