Introduction

Two types of designer nucleases have been widely used, each made by fusing a sequence-specific DNA-binding polypeptide to the catalytic domain of the FokI endonuclease. Zinc finger nucleases (ZFNs) are based on a DNA-binding domain from zinc finger proteins: each unit recognizes a nucleotide triplet with high specificity1,2. A pair of polypeptides, each consisting of three or more tandem zinc finger units fused to a FokI nuclease domain, dimerize with each other to form an active endonuclease that makes a double-strand break (DSB) in the spacer region (5–7 bases) between the two target DNA sequences of 9 bases or longer1,2. A major limitation of ZFNs is that they prefer GNN triplets and have a lower affinity for AT-rich target sequences1,2. The second type of designer nuclease, which emerged in the past several years, is based on transcription activator-like effectors (TALEs), DNA-binding proteins of plant pathogenic bacteria3. A TALE contains a tandem array of ~34-amino-acid units (monomers): each monomer preferentially recognizes a single nucleotide via two adjacent amino acids termed the repeat-variable di-residue (RVD). At least 11 TALE monomers are needed for efficient DNA binding3. Pairs of dimeric TALE-based nucleases (TALENs) that recognize two adjacent DNA sequences (11–24 bases, separated by a 10–30 base spacer) have been shown to aid gene targeting via homologous recombination (HR) in many cell types, including human induced pluripotent stem (iPS) cells3,4,5. Methods for designing high-efficiency TALENs continue to improve; however, there are still no monomers that recognize guanine (G) with both high affinity and high specificity3,4,5. In addition, off-target DNA recognition by TALENs occurs, but it has not been examined as extensively as that of ZFNs6.
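As a rough illustration of the one-monomer-per-nucleotide recognition described above, the following Python sketch maps a target half-site to an RVD array. The RVD assignments (NI for A, HD for C, NG for T, NN for G) are the commonly used TALE code and are an assumption here, not something stated in this paper. The function name, the example sequence and the use of 11 as a hard lower bound on array length are illustrative only.

```python
# Commonly used RVD code (assumed; not taken from this paper).
# NN is listed for G, although it is known to bind A as well, which is
# consistent with the lack of a fully G-specific monomer noted in the text.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}

MIN_MONOMERS = 11  # lower bound for efficient binding cited in the text


def rvd_array(target: str) -> list[str]:
    """Return one RVD per nucleotide of a target half-site (5'->3')."""
    target = target.upper()
    unknown = set(target) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    if len(target) < MIN_MONOMERS:
        raise ValueError(
            f"target has {len(target)} bases; need at least {MIN_MONOMERS}"
        )
    return [RVD_FOR_BASE[base] for base in target]


if __name__ == "__main__":
    # Hypothetical 13-base half-site, for illustration only.
    print("-".join(rvd_array("GTCAGGATCCTAG")))
```

A real design would also have to account for constraints not modeled here, such as spacer length between the two half-sites.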

In our previous study7, a pair of ZFNs was used to target the HBB locus near the point mutation causing sickle cell disease (SCD) in human iPS cells. However, the efficiency was lower than that of other ZFNs used previously to target the PIG-A or AAVS1 loci8,9, likely because the HBB gene is not actively transcribed in human iPS cells. To develop a more efficient designer nuclease for the HBB locus near the SCD mutation, we made a pair of TALENs that is more efficient than our previous HBB-ZFNs in stimulating homologous recombination (HR)-mediated gene targeting in a reporter system containing the HBB target sequence. We also discovered that one component of a ZFN pair and one component of a TALEN pair can form a hybrid nuclease pair with expanded specificity at the HBB locus, and that this hybrid stimulates gene targeting with improved efficiency in multiple cell types, including human iPS cells. Finally, using AAVS1-TALEN and AAVS1-ZFN components, we show similar behavior of hybrid nucleases at the AAVS1 locus in human 293T and iPS cells. These observations expand the approaches available for manipulating genomes with improved efficiency in multiple cell types, including human iPS cells.

Results

We designed several TALEN pairs based on a published protocol4 and tested all seven pairs of obligate TALEN heterodimers using an EGFP reporter assay, shown in Figure 1A. Human cell lines harboring a chromosomally integrated, inactive EGFP reporter gene that is interrupted by a stop codon and the HBB target DNA sequence were transfected with a pair of expression vectors expressing either ZFNs or TALENs, together with a non-expressing donor DNA, tGFP. Removal of the insertion by HR, using tGFP as a template, restores the EGFP sequence and expression in the reporter. One of these TALEN pairs (TALEN1s), designed to recognize nearby 19-base and 13-base DNA segments (Figure 1B), showed the highest efficiency of the seven (Figure 1C; comparison data not shown). The HR frequency, measured as the frequency of acquired GFP-positive cells, increased from <10^−6 to 0.35% in the presence of TALEN1s, 1.7-fold higher than with the HBB-ZFNs used previously (Figure 1C and 1D). We also tested the efficiency of the ZFN pair A10, hereinafter named ZFA, which was recently reported to efficiently target the HBB sequence present in our EGFP reporter construct (Figure 1B). As for the TALEN1 pair, the DNA-binding domains of the ZFA pair were cloned into the same pair of vectors expressing two heterodimeric FokI nuclease domains7. In the same assay, the ZFA pair (recognizing two 9-base DNA sequences) showed 4.5-fold higher stimulatory activity than the HBB-ZFNs (Figure 1D). We confirmed that the differences observed were not due to the level of ZFN or TALEN protein expression in transfected cells (Figure 2). The two ZFA subunits were expressed at levels similar to those of the HBB-ZFN pair. The expression levels of the two TALEN1 subunits, especially TALEN1-L, were lower than those of the ZFNs, likely due to the presence of many repeats of the 34-amino-acid monomer (Figure 2).

Figure 1

Evaluation of novel TALEN and hybrid designer nucleases targeting the human HBB sequence in human cell lines.

(a). A scheme to evaluate the stimulatory activity of new designer endonucleases targeting the HBB locus and enhancing homologous recombination (HR) using an EGFP reporter assay. (b). The target sequences recognized by HBB-ZFNs (underlined by black lines), ZFAs (boxed) and a novel pair of TALEN1s (underlined by blue lines) are marked in the 96-base HBB sequence that is inserted in the middle of the EGFP sequence following a stop codon, taa (in red). In addition to HBB96, we also made EGFP reporter constructs and 293T reporter cell lines containing corresponding sequences from the related HBD, HBE and HBG genes; mismatches are marked in pink. HBB96-SCD contains the sickle cell disease (SCD or βS) mutation. (c). Flow cytometric (FACS) assay to detect HR events in the 293T-HBB96 reporter cell line with the donor DNA in the absence or presence of a pair of designer nucleases targeting HBB. Four days after transfection, cells were analyzed by FACS to detect GFP-positive cells resulting from an HR event. Dot plots show EGFP signal (in the FL1 channel) and background fluorescence (in the FL2 channel). (d). Efficiency of gene targeting stimulated by various pairs of ZFNs and TALENs in 293T cells, as measured by numbers of GFP-positive cells per million transfected cells (mean +/− SEM, n = 3). Note that a hybrid of ZFA-L + TALEN1-R showed the greatest stimulatory efficiency. (e). FACS analysis of HR stimulation in human iPS cell line TNC1 transfected with an EGIP*-HBB96 reporter. Stably transfected iPS cells were selected and used for the HR-mediated gene targeting assay as in (c). The reporter iPS cells were transfected by electroporation with the tGFP donor plasmid and a designer nuclease pair and then plated on feeder cells. Four days later, numbers of GFP-positive iPS cells (TRA-1-60-positive) were analyzed by FACS. Contour plots of representative samples are shown. (f). Numbers of GFP-positive iPS cells (TRA-1-60-positive) per million transfected iPS cells were plotted (mean +/− SEM, n = 3), in the absence (donor only) or presence of a designer nuclease pair.

Figure 2

Expression levels of each designer nuclease detected by Western blot.

(a). Schematic of the designer endonucleases used in this study. (b). Gel image of individual designer nucleases detected by Western blot. Full-length blots are presented in Supplementary Figure 1. Arrows indicate the TALEN bands. (c). Relative expression level of each designer nuclease versus the endogenous β-tubulin protein.

To test the specificity of these ZFNs and TALENs, we constructed similar EGFP reporters in which the HBB target sequence was replaced by homologous regions from the highly related HBD, HBE and HBG loci, or by the HBB sequence carrying the sickle mutation (βs) (Figure 1B). The stimulatory activity of each pair of HBB designer nucleases on these related but mismatched targets provided a glimpse into their specificity (the numbers of mismatches are shown in Table 1). The stimulatory activity in each EGFP reporter cell line was normalized to the efficiency of the EGFP-ZFNs, which recognize the EGFP sequence common to all reporters (Table 2). We detected a low level (0.07%) of off-target activity of HBB-ZFNs on the HBE and HBG sequences, but not on the HBD target. In contrast, we did not detect off-target activity by the ZFA pair, which recognizes only 9-base left and 9-base right DNA sequences. The TALEN1s for HBB showed off-target activity (2.4%) on the HBD target (2-base mismatches), but not on HBE and HBG. The activity of the TALEN1s, which were designed to recognize wild-type HBB, remained high on the HBB-SCD target, indicating low specificity, at least for TALEN1-L, whose 19-base recognition sequence includes the SCD (βs) site. Our data illustrate that a longer recognition sequence for TALENs or ZFNs does not necessarily translate to higher efficiency or specificity. This is relevant because the only published TALEN pair targeting the wild-type HBB gene was designed to recognize 23-base and 20-base DNA sequences, with untested specificity11.

Table 1 Length and space of recognition sites of various designer nuclease pairs targeting the HBB and mismatches at its paralogue in reporters
Table 2 Efficiencies of various designer nucleases at HBB and related beta-globin loci in reporters

We next tested whether one component of a ZFN pair can dimerize, via the FokI nuclease domain, with one component of a TALEN pair that recognizes a nearby DNA sequence to form a functional endonuclease. Both possible hybrid combinations resulted in higher efficiencies: 28,319 and 12,159 events per million cells for the ZFA-L + TALEN1-R and TALEN1-L + ZFA-R combinations, respectively (Figure 1D). The best combination, ZFA-L + TALEN1-R, showed a stimulatory efficiency 8-fold and 3-fold higher than the TALEN1 and ZFA pairs, respectively. This hybrid nuclease pair did not show any off-target effect on the related HBD, HBE and HBG sequences (Table 2).
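The fold changes quoted above can be cross-checked with a few lines of arithmetic. The Python sketch below uses only numbers reported in the text: 0.35% for TALEN1s, the 1.7-fold and 4.5-fold ratios relative to HBB-ZFNs, and the two hybrid event counts. The HBB-ZFN and ZFA values are back-calculated from those ratios and are therefore estimates, not measured data. Function and variable names are illustrative.

```python
PER_MILLION = 1_000_000


def percent_to_per_million(percent: float) -> float:
    """Convert a percentage of GFP-positive cells to events per million."""
    return percent / 100 * PER_MILLION


def fold_change(value: float, reference: float) -> float:
    """Ratio of one efficiency to a reference efficiency."""
    return value / reference


# Reported: TALEN1s reach 0.35% GFP-positive cells.
talen1 = percent_to_per_million(0.35)

# Estimates derived from the reported ratios (not measured values):
# TALEN1s are 1.7-fold above HBB-ZFNs; ZFA is 4.5-fold above HBB-ZFNs.
hbb_zfn_estimate = talen1 / 1.7
zfa_estimate = hbb_zfn_estimate * 4.5

# Reported hybrid efficiencies, in events per million cells.
hybrids = {
    "ZFA-L + TALEN1-R": 28_319,
    "TALEN1-L + ZFA-R": 12_159,
}

for name, events in hybrids.items():
    print(
        f"{name}: {fold_change(events, talen1):.1f}x TALEN1s, "
        f"{fold_change(events, zfa_estimate):.1f}x ZFA (estimated)"
    )
```

Under these assumptions, the best hybrid comes out at roughly 8-fold over the TALEN1 pair and roughly 3-fold over the ZFA pair, in line with the figures given in the text.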

We further investigated whether the hybrid designer nucleases also show higher efficiency in human iPS cells harboring the same HBB-SCD EGFP reporter (Figure 1E). Although the overall gene targeting efficiency, measured as GFP-positive cells, is about 10-fold lower in human iPS cells than in 293T cells, as previously reported7, the same pattern of stimulation by designer nucleases was observed (Figure 1F). The two hybrid nuclease pairs showed higher efficiencies than the ZFN and TALEN1 pairs, and the ZFA-L + TALEN1-R pair showed the highest efficiency in human iPS cells, as it did in 293T cells.

To examine whether the concept of ZFN and TALEN hybrid designer nucleases is applicable to other genomic sequences, we tested HR-mediated gene targeting of the AAVS1 locus (more precisely, PPP1R12C) with existing ZFNs, TALENs and their hybrids. The DNA-binding domains of the AAVS1-ZFNs5,9 or TALENs4 were cloned into the same pair of vectors expressing heterodimeric FokI nuclease domains7. Each component of the AAVS1-ZFNs or AAVS1-TALENs recognizes 12 or 17 nucleotides, respectively. Gene targeting of the endogenous AAVS1 locus was examined in human 293T cells (Figure 3A–3D). The two hybrid combinations (with spacers of 10 and 12 bases) showed stimulatory efficiencies similar to or higher than those of the AAVS1-ZFNs or AAVS1-TALENs. The best hybrid combination, ZFN-R and TALEN-L, was then used to target the AAVS1 locus in human iPS cells (Figure 3E, 3F). The efficiency of GFP integration mediated by designer nucleases was determined by counting the total number of GFP-positive and puromycin-resistant cells (Figure 3E). Targeted integration was confirmed by HR-specific PCR (Figure 3F). The pair of AAVS1 hybrid nucleases yielded 10–15-fold more GFP-positive and puromycin-resistant cells than either the TALEN or the ZFN pair alone.

Figure 3

Targeting the endogenous AAVS1 locus using hybrid designer nucleases.

(a). Schematic of the AAVS1 locus and the AAV-CAGGS-GFP donor plasmid with left (L) and right (R) homology arms to promote homologous recombination. GFP is driven by the CAG promoter and is expressed regardless of whether integration has occurred. The purple arrows represent PCR primers that amplify only if targeted integration has occurred at the AAVS1 locus. The recognition sites of the ZFNs and TALENs lie within the first intron of the PPP1R12C gene. (b). Recognition sequences of the ZFNs9 and TALENs4 at the AAVS1 locus. (c). Gel image of PCR to detect AAVS1 targeted integration in 293T cells comparing various ZFN and TALEN combinations (top) and a control amplifying the HBB locus (bottom) from the same genomic DNA sample. Full-length gels are presented in Supplementary Figure 2. The samples derive from the same experiment and the gels were processed in parallel. (d). The AAVS1 gene-targeting efficiency in 293T cells was measured by the band intensity of the AAVS1 targeted integration PCR. Band intensity was normalized to the intensity of the genomic control and standardized to the positive control (AAVS1-ZFNs) to allow multiple repeats to be compared (mean +/− SEM, n = 5). (e). Total number of puromycin-resistant (PuroR) human iPS cells derived from 6 × 10^6 starting cells under each condition. In the absence of ZFNs or TALENs (donor only), few cells survived puromycin selection. In the presence of the ZFNs, TALENs or a hybrid nuclease pair, nearly all the cells were GFP-positive after selection. The total cells were counted and plotted (mean +/− SEM, n = 2). (f). Confirmation of the targeted integration (TI) mediated by HR in the puromycin-resistant human iPS cells. The parental iPS cells were used as a negative control for the PCR that detects AAVS1 targeted integration mediated by HR (top). As a control, the same genomic DNA was amplified with a primer set for the HBB locus (bottom). Full-length gels are presented in Supplementary Figure 3. The samples derive from the same experiment and the gels were processed in parallel.

Discussion

We found that simply increasing the length of the target DNA sequences recognized by either TALENs or ZFNs did not necessarily increase gene-targeting efficiency or specificity. We report here that one component of a ZFN pair and one component of a TALEN pair can form a hybrid designer nuclease pair with expanded DNA recognition and even improved efficiency for gene targeting in multiple cell types. The concept of using a pair of hybrid designer nucleases for genome editing is not restricted to a specific ZFN and TALEN pair or DNA locus, because we observed similar phenomena with various ZFN or TALEN pairs targeting two different loci. Combining components from existing ZFN or TALEN pairs, each of which might have one suboptimal subunit, may accelerate the search for a satisfactory pair of designer nucleases. Further optimization, such as adjusting the lengths of the DNA target sequences and of the spacer between them, will likely further improve the efficiency and specificity of ZFN and TALEN hybrid designer nucleases. It will be important to compare the efficiency and specificity of ZFNs, TALENs and their optimized hybrids with those of the newly reported CRISPR-Cas9 endonucleases12,13 to further expand the toolbox for genome editing.

Methods

Cell culture

Human iPS cells were maintained on primary mouse embryonic fibroblasts (PMEFs) as feeder cells using standard human ES cell medium and then transferred to feeder-free culture conditions on Matrigel (BD Biosciences) in E8 medium (Invitrogen). The BC1 iPS cell line was established with a single episomal vector, and its whole genome has been fully sequenced14. The TNC1 iPS cell line was also derived by an episomal vector method, from peripheral blood mononuclear cells of an adult SCD patient14. To adapt them to single-cell passaging, human iPS cells were first passaged from PMEF feeders to feeder-free culture on Matrigel in E8 medium at a 1:3 ratio using 0.5 mM EDTA digestion, and they were then maintained by EDTA passaging under the feeder-free culture conditions. Laboratory research using anonymous human cells, including iPS cell lines, was approved by the Johns Hopkins University internal review board. Human 293T cells used to validate ZFNs and TALENs were grown in high-glucose DMEM supplemented with 10% FBS.

Expression vectors encoding TALENs and ZFNs for targeting the endogenous HBB or AAVS1 locus

The pair of HBB-ZFNs was previously described7. They were expressed from a pair of plasmid vectors (CompoZr, Sigma) containing two different versions of the FokI nuclease domain that form a heterodimer. The DNA-binding motifs of the AAVS1-ZFNs and of the ZFAs targeting the HBB locus were synthesized by Blue Heron Bio (Rockville, MD) following the published DNA sequences9,10. The DNA-binding motif sequences were cloned into the same pair of heterodimeric CompoZr vectors between the KpnI and BamHI sites, replacing the original HBB-binding motifs. To design TALENs targeting the human HBB gene adjacent to the SCD point mutation, we followed a previously published protocol15 and obtained the TALE backbone and monomers from Addgene: pLenti-EF1a-Backbone (NG) (Addgene Plasmid 27963), monomer-NN (Addgene Plasmid 27965), monomer-NI (Addgene Plasmid 27966), monomer-NG (Addgene Plasmid 27967) and monomer-HD (Addgene Plasmid 27968). The TALE backbone (N- and C-terminal portions flanking the DNA-binding motif) was amplified with the primer set TN-FP: acgtacggtacccatgtcgcggacccg and TC-RP: ttgcgcggatcctgccactcgatgtgatgtc, cleaved with KpnI and BamHI and ligated into the CompoZr vectors to form a pair of basal expression vectors for the various TALENs. The TALEN1-L coding sequence was synthesized by GenScript. TALEN1-R was assembled by PCR and ligation following the published protocol15. A pair of AAVS1-TALEN constructs4 was obtained from Addgene: hAAVS1-L TALEN (Addgene plasmid 35431) and hAAVS1-R TALEN (Addgene plasmid 35432). The FokI domains of the AAVS1-TALENs were replaced with the heterodimeric FokI domains of the CompoZr vectors7; the constructs were cleaved with SalI and BglII or XbaI and ligated into our TALEN backbone vector. Therefore, all the TALENs and ZFNs used in this study are expressed from the same pair of expression vectors with heterodimeric FokI nuclease domains. All the described vectors will be available via Addgene (http://www.addgene.org/Linzhao_Cheng/) or upon request.
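The cloning strategy relies on the KpnI and BamHI sites carried by the two backbone primers. As a quick sanity check, the Python sketch below searches each primer for the corresponding recognition sequence (GGTACC for KpnI and GGATCC for BamHI, the standard recognition sites of these enzymes; both are palindromic, so searching one strand is sufficient). The helper name `find_sites` is hypothetical, and the primer sequences are copied from the text.

```python
# Standard recognition sequences of the two enzymes used for cloning.
SITES = {"KpnI": "GGTACC", "BamHI": "GGATCC"}

# Backbone primers as given in the Methods (5'->3').
PRIMERS = {
    "TN-FP": "acgtacggtacccatgtcgcggacccg",
    "TC-RP": "ttgcgcggatcctgccactcgatgtgatgtc",
}


def find_sites(primer: str) -> dict[str, int]:
    """Map each enzyme whose site occurs in the primer to its 0-based position."""
    seq = primer.upper()
    return {
        enzyme: seq.find(site)
        for enzyme, site in SITES.items()
        if site in seq
    }


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
```

Each primer carries exactly one of the two sites near its 5' end, preceded by a few extra bases, which is the usual layout for primers that add a restriction site to an amplicon.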

Gene targeting assays using a GFP reporter system

To examine the efficiency and specificity of various TALENs and ZFNs targeting the HBB locus, we conducted an improved assay using an enhanced green fluorescent protein (EGFP)-based HR reporter, similar to previous studies7. We cloned a 96-bp DNA sequence of the HBB gene (from codon 1, nt 58 to nt 141 from transcription starting site) containing the putative ZFN and TALEN recognition sites and inserted it into the middle of the EGIP* vector7. The insertion, together with an upstream stop codon (taa) and a downstream HindIII site (aagctt), disrupts the EGFP reading frame. Full-length GFP activity can be restored when the mutated GFP, GFP*, is corrected by HR (see Figure 1A). A plasmid bearing a non-expressing, truncated EGFP (tGFP, missing the first 36 nucleotides) that provides a repair template was transfected, together with a pair of plasmids expressing a pair of ZFNs or TALENs, into 293T cells stably expressing the EGIP*-HBB gene. Gene targeting efficiency was measured by FACS as the proportion of GFP-positive cells 4 days after transient transfection. To test the specificity of the various HBB ZFNs and TALENs, the homologous regions from other HBB-locus genes such as HBD, HBE and HBG were also inserted into the EGIP vector and tested in parallel.

Western blot for assessing the level of designer nuclease expression

Human 293T cells were transfected with individual subunits of HBB-ZFNs, ZFA and TALEN1. A common FLAG tag (3x) is present in the coding sequence of each expression vector. Equal amounts of protein extracts were prepared 3 days after transfection. Ready Gels (4–15% Tris-HCl) were purchased from BioRad. MagicMark™ XP Western Protein Standard was purchased from Invitrogen. The antibodies used in this study were anti-FLAG M2 (Sigma) and anti-β-tubulin (Sigma). The relative expression level of each nuclease versus endogenous β-tubulin was analyzed with ImageQuant software.

HBB targeting in human iPS cells

For the EGFP-based reporter assay in human iPS cells, we stably transfected EGIP*-HBB96-SCD into TNC1 iPS cells14. Two to three million TNC1-EGIP* cells grown on Matrigel were harvested at 80% confluence by accutase digestion and pipetting. The cells were then resuspended in 100 μl Amaxa mouse ES nucleofection buffer with 5 μg tGFP donor and 2.5 μg ZFNs and electroporated using the Nucleofector II with preset programs such as A-23 (Amaxa). Cells were then replated 1:6 onto feeder cells, and GFP-positive cells began to emerge four days later. GFP gene targeting efficiency was analyzed by flow cytometry after collecting 0.5–1 million events.

HR at the endogenous AAVS1 locus in human 293T cells

A total of 0.25 million 293T cells were plated into a 12-well plate 24 hours before lipofection. One μg of AAV-CAGGS-GFP donor9 and 0.5 μg of each ZFN or TALEN component were transfected following the standard Invitrogen Lipofectamine 2000 protocol. The cells were harvested 96 hours after lipofection with 0.05% trypsin and pelleted for DNA extraction with the Qiagen DNeasy Blood & Tissue kit. Targeted integration at the AAVS1 locus was confirmed by PCR with the primer set used in a previous study9 and Phusion high-fidelity DNA polymerase (New England Biolabs). A genomic control (HBB locus) was amplified to ensure consistent levels of starting genomic DNA. The AAVS1 targeted integration PCR was used to measure the relative targeting efficiency of the various designer nucleases. Gel band density was calculated using ImageJ software to quantitate the relative amount of targeted integration. Each experiment was normalized to the positive control (AAVS1-ZFN) to compare results across multiple experiments (n = 5). The values were also normalized to the genomic control PCR to account for differences in the amount of DNA in the reaction.
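The two-step normalization described above can be sketched as follows. This is a minimal illustration, assuming that each lane's targeted-integration band is first divided by its genomic control band and that the result is then divided by the same ratio for the AAVS1-ZFN positive control from the same experiment. The order of the two steps, the function names and all intensity values are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def relative_efficiency(ti_band, genomic_band, ctrl_ti_band, ctrl_genomic_band):
    """Targeting efficiency of a sample relative to the positive control.

    Each targeted-integration (TI) band is corrected for DNA input using the
    genomic control band from the same lane, then expressed relative to the
    positive control processed in the same experiment.
    """
    sample = ti_band / genomic_band
    control = ctrl_ti_band / ctrl_genomic_band
    return sample / control


def mean_sem(values):
    """Mean and standard error of the mean across repeat experiments."""
    return mean(values), stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Made-up band intensities for five repeats of one nuclease combination:
    # (sample TI, sample genomic, control TI, control genomic)
    repeats = [
        (1800, 1000, 1200, 1000),
        (1500, 900, 1100, 950),
        (2100, 1100, 1300, 1050),
        (1700, 1000, 1250, 1000),
        (1600, 950, 1150, 980),
    ]
    ratios = [relative_efficiency(*r) for r in repeats]
    avg, sem = mean_sem(ratios)
    print(f"relative efficiency: {avg:.2f} +/- {sem:.2f} (n = {len(ratios)})")
```

By construction the positive control scores 1.0 in every experiment, which is what allows repeats run on different days to be pooled.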

Targeted integration at the AAVS1 locus in human iPS cells

Approximately 2 million BC1 iPS cells (at ≤80% confluence) were harvested with accutase and resuspended in 100 μl Amaxa mouse ES nucleofection buffer with 5 μg AAV-CAGGS-GFP and 2.5 μg of each ZFN or TALEN component. The mixture was electroporated using the Nucleofector II program A-23 (Amaxa/Lonza), or the Nucleofector 4D with P3 solution (Lonza). Nucleofected cells were immediately diluted in pre-warmed E8 medium and plated onto a Matrigel-coated six-well plate. Two to four days after nucleofection, puromycin (0.25–0.5 μg/mL) was added to the medium to select for targeted and stable integration. At day 10, puromycin-resistant cells were counted and pelleted for genomic DNA isolation. Targeted integration was confirmed by PCR with the primer set reported previously, AAVS1U-F2: CTGCCGTCTCTCTCCTGAGT and PuroU-R: GTGGGCTTGTACTCGGTCAT9. As a control, the same genomic DNA was amplified with a primer set for the HBB locus, HBB-NHE-FJ: CCCTAGGGTTGGCCAATCTACTCC and HBB-NHE-RJ: CAGCCTAAGGGTGGGAAAATAGACC.