Genome targeting by hybrid Flp-TAL recombinases

Genome engineering is a rapidly evolving field that benefits from the availability of different tools for genome manipulation tasks. We describe here the development of Flp-TAL recombinases, which can target genomic FRT-like sequences in their native chromosomal locations. Flp-TAL recombinases are hybrid enzymes composed of two functional modules: a variant of the site-specific tyrosine recombinase Flp, which can have either narrow or broad target specificity, and the DNA-binding domain of the transcription activator-like effector, TAL. In Flp-TAL, the TAL module is responsible for delivering the Flp module to the desired genomic FRT-like sequence and stabilizing it there, where the Flp module mediates recombination. We demonstrate the functionality of the Flp-TAL recombinases by performing integration and deletion experiments in human HEK-293 cells. In the integration experiments we targeted a vector to three genomic FRT-like sequences located in the β-globin locus. In the deletion experiments we excised ~15 kilobases of DNA that contained a fragment of the integrated vector sequence and the neighboring genome sequence. On average, the efficiency of the integration and deletion reactions was about 0.1% and 20%, respectively.


Supplementary Note 1. Target specificity of Flp-TAL recombinases, the Flp module of which can recombine multiple FRT-like sequences.
Target specificity of the hybrid Flp-TAL recombinases is the product of the target specificities of the TAL and the Flp modules. For this analysis, we treat the target specificity of the TAL modules as a fixed value and estimate the overall specificity of Flp-TAL recombinases that contain a Flp module with broad target specificity by assessing the probability of finding an FRT-like sequence between two binding sequences for the TAL modules (see Figs. 1 and 2).
Target specificity of a Flp module that is capable of recombining several FRT-like sequences can be estimated from the sequence characteristics that distinguish FRT-like sequences from random DNA sequences. In mammalian genomes, these sequence characteristics translate into about one FRT-like sequence per 5,000 base pairs (Shultz et al., 2011). As such, the probability of finding an FRT-like sequence between the two binding sequences for the TAL modules is ~1/5,000. Consequently, the target specificity of Flp-TAL should be about three orders of magnitude higher than that of the TAL modules alone.
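As a quick check of the "about three orders of magnitude" estimate, the sketch below converts the ~1/5,000 density into a base-10 exponent. The variable names are illustrative and do not come from the original note; the only input is the density quoted above.

```python
import math

# Density quoted in the note: roughly one FRT-like sequence per 5,000 bp.
BP_PER_FRT_LIKE_SITE = 5_000  # illustrative name; value taken from the text

# Probability of finding an FRT-like sequence between two TAL binding
# sequences, under the note's simple density argument.
p_frt_like = 1 / BP_PER_FRT_LIKE_SITE

# Gain in specificity over the TAL modules alone, in orders of magnitude.
orders_of_magnitude = math.log10(BP_PER_FRT_LIKE_SITE)

print(f"p(FRT-like) ~ {p_frt_like:.0e}")
print(f"specificity gain ~ {orders_of_magnitude:.1f} orders of magnitude")
```

The exponent falls between 3 and 4, which is consistent with the rounded figure given in the text.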
Additionally, we have to consider an important property of the Flp/FRT system: recombination depends on the spacer sequence of the recombination target. Only targets with the same spacer recombine efficiently with each other, while targets with different spacers do not. Since FRT has an 8-bp spacer, the probability of finding a particular spacer sequence of this length is $1/4^{8}$ (or 1/65,536). For the high-scoring FRT-like sequences this probability is higher, since the first and the last base pairs of the spacer in these sequences are invariant (T/A and A/T, respectively) and the G/C content of the spacer is set to be equal to or lower than 50% (Shultz et al., 2011).
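Collectively, these spacer features increase the probability of finding two FRT-like sequences with the same spacer to $1/(4^{6}/2)$ (or ~1/2,000): six spacer positions remain variable, and the G/C constraint is taken to halve the number of allowed spacers.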
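The spacer arithmetic above can be reproduced with a short sketch. The first two values follow the note directly. The enumeration at the end is an added cross-check under one possible reading of the G/C constraint (at most 50% G/C over all eight spacer positions); that reading and all names in the code are assumptions, not part of the original analysis.

```python
from fractions import Fraction
from itertools import product

SPACER_LEN = 8  # FRT spacer length stated in the note

# Probability of one specific 8-bp spacer in random sequence: 1/4^8.
p_any_spacer = Fraction(1, 4 ** SPACER_LEN)

# The note's estimate: first and last bases are invariant, leaving
# 6 variable positions, and the G/C constraint halves the pool.
p_note_estimate = Fraction(1, (4 ** (SPACER_LEN - 2)) // 2)


def count_allowed_spacers():
    """Count spacers of the form T-NNNNNN-A with G/C content <= 50%.

    ASSUMPTION: the G/C limit is applied over all 8 spacer positions.
    """
    count = 0
    for middle in product("ACGT", repeat=SPACER_LEN - 2):
        spacer = "T" + "".join(middle) + "A"
        gc = sum(base in "GC" for base in spacer)
        if 2 * gc <= SPACER_LEN:
            count += 1
    return count


allowed = count_allowed_spacers()
print("any spacer:     ", p_any_spacer)
print("note's estimate:", p_note_estimate)
print("enumerated:     ", Fraction(1, allowed))
```

Under this reading the enumeration yields 3,648 allowed spacers, which is the same order of magnitude as the ~1/2,000 estimate, so the halving works as a rough approximation rather than an exact count.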
Taken together, the theoretical probability of finding an FRT-like sequence with a unique spacer between two TAL binding sequences is $\sim 1/10^{7}$ ($\sim 1/(5 \times 10^{3}) \times \sim 1/(2 \times 10^{3})$). Such a low probability should ensure that the TAL-guided Flp variant with broad target specificity will recombine only the FRT-like sequence of interest. This, of course, can only be realized if the Flp module is not sufficiently active to recombine the FRT-like sequences on its own, without being stabilized on the target by the TAL module.

Table 4. FRT-like sequences in the human genome that were tested for potential off-target integration events. The FRT-like sequences shown have the highest level of homology to FRT (Shultz et al., 2011), and their spacer sequences, which are highlighted in bold, are identical to that of FL-61 (see Fig. 2b). The relative orientation of these FRT-like sequences (which is specified by the direction of their spacer sequences) differs among the human chromosomes, but the sequences are arranged so that the direction of their spacers matches that of FL-61 (Fig. 2b).
CTTCGAGGACGGCGGCGTGGTGACCGTGACCCAGGACTCCTCCCTcCAGGACGGCTCCTTCATCTACAAG  GTGAAGTTCATCGGCGTGAACTTCCCCTCCGACGGCCCCGTAATGCAGAAGAAGACTATGGGCTGGGAGG  CCTCCACCGAGCGCCTGTACCCCCGCGACGGCGTGCTGAAGGGCGAGATCCACAAGGCCCTGAAGCTGAA  GGACGGCGGCCACTACCTGGTGGAGTTCAAGTCTATCTATATGGCCAAGAAGCCCGTGCAGCTGCCCGGC  TACTACTACGTGGACTCCAAGCTGGACATCACCTCCCACAACGAGGACTACACCATCGTGGAGCAGTACG  AGCGCGCCGAGGGCCGCCACCACCTGTTCCTGTAAGGATCCGCGGGACTCTGGGGTTCGAAATGACCGAC  CAAGCGACGCCCAACCTGCCATCACGAGATTTCGATTCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCG  GAATCGTTTTCCGGGACGCCGGCTGGATGATCCTCCAGCGCGGGGATCTCATGCTGGAGTTCTTCGCCCA  CCCCAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAA  GCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTATAC  CGTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCT  CACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAA  CTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAAT  GAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTC  GCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACA  GAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGG  CCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCA  GAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCT  CCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTC  AATGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACC  CCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGAC  TTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGT  TCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCC  AGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTT  TTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGG  GGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTT  CACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCT  
GACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTG  CCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGAT  ACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGC  AGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTA  GTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTT  TGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAA  AAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGG  TTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTA  CTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGAT  AATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCT  CAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATC  TTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGG