Efficient targeted integration into the bovine Rosa26 locus using TALENs

The genetic modification of cattle has many agricultural and biomedical applications. However, random integration often results in the unstable expression of transgenes and unpredictable phenotypes. Targeting genes to the “safe locus” and stably expressing foreign genes at a high level are desirable methods for overcoming these hurdles. The Rosa26 locus has been widely used to produce genetically modified animals in some species expressing transgenes at high and consistent levels. For the first time, we identified a bovine orthologue of the mouse Rosa26 locus through a genomic sequence homology analysis. According to 5′ rapid-amplification of cDNA ends (5′RACE), 3′ rapid-amplification of cDNA ends (3′RACE), reverse transcription PCR (RT-PCR) and quantitative PCR (Q-PCR) experiments, this locus encodes a long noncoding RNA (lncRNA) comprising two exons that is expressed ubiquitously and stably in different tissues. The bovine Rosa26 (bRosa26) locus appears to be highly amenable to transcription activator-like effector nucleases (TALENs)-mediated knock-in, and ubiquitous expression of enhanced green fluorescent protein (EGFP) inserted in the bRosa26 locus was observed in various stages, including cells, embryos, fetus and cattle. Finally, we created a valuable master bRosa26-EGFP fetal fibroblast cell line in which any gene of interest can be efficiently introduced and stably expressed using recombinase-mediated cassette exchange (RMCE). The new tools described here will be useful for a variety of studies using cattle.

the Rosa26 locus exists in the bovine genome or whether this locus possesses similar characteristics as observed in other species.
In this study, we identified and characterized the bovine Rosa26 locus. We further conducted in vivo experiments to produce cattle with a knock-in at this locus to functionally validate its applicability. We also established a master bRosa26-EGFP cell line, which was used to perform RMCE mediated by TAT-Cre without a selective marker gene [21][22][23]. Our results add bRosa26 to the tool box of transgenic research in cattle. The availability of a safe-harbor locus in cattle is expected to improve the applications of genetically modified cattle.

Results
Identification of the bovine Rosa26 locus. The 1 kb mouse Rosa26 promoter and exon 1 sequences (NC_000072.6) were used as a template to search the Ensembl bovine database (UMD3.1.1). A highly conserved sequence (sequence similarity of 83%) on bovine chromosome 22 was identified as the putative bovine Rosa26 locus, based on the results from DNAMAN software. This region contained both the bovine Rosa26 locus and the neighboring genes that have also been identified in mice, rats, humans and pigs (Supplementary Fig. S1A). Alignments of the mouse, rat, pig, human and bovine Rosa26 promoter and exon 1 sequences showed high sequence conservation (>75%) among these species (Supplementary Fig. S1B). Bovine Rosa26 exon 1 was predicted by aligning the sequence to the mouse, rat, human and pig Rosa26 exon 1 sequences, and exhibited the greatest sequence conservation (>98%). These findings provided convincing evidence that the region on bovine chromosome 22 identified here represents the equivalent of the mouse Rosa26 locus and should thus be referred to as bovine Rosa26 (bRosa26). A screen of the Ensembl gene expression database for the bRosa26 locus failed to identify any expressed sequence tags (ESTs). Therefore, we designed primers that anneal within the predicted exon 1 to perform 5′ and 3′ rapid-amplification of cDNA ends (RACE) analyses and identified a noncoding RNA product of 377 bp transcribed from the bRosa26 locus that comprised two exons (Fig. 1A and Supplementary Fig. S1C). Reverse transcription PCR (RT-PCR) and quantitative PCR (Q-PCR) assays revealed that this noncoding RNA is expressed in a wide variety of adult tissues (Fig. 1B,C).
Assessment of TALEN activity toward bRosa26 in bovine fetal fibroblasts. TALENs efficiently mediate the knock-in of a reporter vector into the pRosa26 locus. Therefore, we selected the intron of the bRosa26 locus for targeting with three pairs of TALENs (Fig. 2A).
First, the activities of the TALENs were screened in human 293T cells with a luciferase single-strand annealing (SSA) assay 24. All three pairs of TALENs displayed cleavage activity (Fig. 2B). Subsequently, a T7 endonuclease I (T7E1) assay was used to assess the frequencies of TALEN-induced indels in bovine fetal fibroblasts (BFFs). TALEN pairs were transfected into the BFFs using nucleofection. After 72 h, genomic DNA was extracted from the BFFs and analyzed. TALEN pair 1 cleaved the target site with the greatest efficiency, as evidenced by the increased incidence of allelic mutations (non-homologous end joining (NHEJ) frequency) (Fig. 2C). TA-cloning and a DNA sequencing analysis of the PCR amplicons
Figure 1 caption (partial): The expression of the bRosa26 lncRNA in various tissues relative to GAPDH was determined using Q-PCR. Error bars represent the mean ± SD. (C) The expression of the bRosa26 lncRNA in various tissues relative to GAPDH was analyzed using RT-PCR. For RT-PCR, the designed primers annealed to the bRosa26 sequence and amplified a correctly spliced product of 155 bp. GAPDH served as a control (488 bp).
Efficient gene knock-in at the bovine Rosa26 locus. To determine whether the bRosa26 region enables the ubiquitous expression of an inserted gene, similar to the Rosa26 locus in other species, we constructed a reporter donor vector consisting of a splice acceptor (SA), a promoterless neomycin (neo) resistance gene, and an inverted EGFP (iEGFP) flanked by homologous arms. The loxP and mutant loxP2272 sites, which flanked the neo and iEGFP genes, respectively, were used for Cre-mediated removal of the neo gene and inversion of the iEGFP gene. As a result, EGFP expression is activated under the control of the bRosa26 promoter (Fig. 3A). Moreover, these sites allow virtually any gene of interest to be inserted into the bRosa26 locus using RMCE. For gene targeting, BFFs were electroporated with the linearized targeting vector together with the TALEN pairs. After selection with G418 (1 mg/ml from day 7 to day 10), 37 cell clones were screened and expanded (Table 1). Twenty-eight of the 37 clones (bRosa26-iEGFP) were correctly targeted, based on PCR analyses of the 5′ and 3′ arms. The PCR results of several positive clones are shown in Fig. 3B. The positive bRosa26-iEGFP cell clones were used in subsequent experiments.
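The targeting efficiency implied by the clone counts above (28 correctly targeted clones out of 37 screened) can be worked out as a short sketch. The Wilson score interval shown here is an added illustration of the uncertainty that comes with a small number of clones; it is not part of the original analysis, and the function name is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Approximate 95% Wilson score interval for a proportion (illustrative only)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1.0 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Clone counts reported in the text (Table 1): 28 targeted of 37 screened.
targeted, screened = 28, 37
efficiency = targeted / screened
low, high = wilson_interval(targeted, screened)

print(f"knock-in efficiency: {100 * efficiency:.1f}%")
print(f"approx. 95% interval: {100 * low:.1f}% to {100 * high:.1f}%")
```

The point estimate is consistent with the "greater than 70%" efficiency stated in the Discussion; the interval is only a rough indication of how much that figure could vary with so few clones.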
Transgenes at the bRosa26 locus are expressed ubiquitously. We generated bRosa26-EGFP cells by adding the TAT-Cre protein to bRosa26-iEGFP cell line #7 to determine whether the bRosa26 promoter ubiquitously drives gene expression similar to the Rosa26 promoter of other species (Fig. 4A). TAT-Cre proteins were purified from a bacterial source using a two-step method based on our previous studies 21,22 (Supplementary Fig. S2A and B), and their activity was then analyzed in an in vitro recombination reaction. We constructed a pDFR plasmid, which was used as a substrate (Supplementary Fig. S2C). Incubation of linearized pDFR (8.3 kb) with Cre resulted in a linearized pL (5.7 kb) and a recircularized pC (2.0 kb). Based on the results of the in vitro assay, the TAT-Cre protein recombined the substrate (linearized pDFR, 8.3 kb in size) and produced two bands (2.0 kb and 5.7 kb in size) (Supplementary Fig. S2D). Subsequently, we used the TAT-Cre protein to treat bRosa26-iEGFP cell line #7 and obtain bRosa26-EGFP cells (Supplementary Fig. S2E and F), which were used as donors to perform nuclear transfer (NT). As shown in Supplementary Fig. S3A, EGFP was ubiquitously Table 2, we obtained four bRosa26-EGFP cloned cattle. However, all four died soon after birth, possibly due to the commonly observed adverse effects of somatic cell cloning or because the calves were born as twins. PCR and Southern blot analyses showed that all four cattle produced by somatic cell nuclear transfer (SCNT) were derived from the donor cells (Fig. 4B,C). In the positive bRosa26-EGFP cloned cattle, EGFP fluorescence was observed in all organs examined in this study (Supplementary Fig. S4A). In addition, Q-PCR, RT-PCR and Western blotting confirmed the ubiquitous expression of EGFP in these organs (Fig. 4D-F and Supplementary Fig. S4B,C). Thus, the bRosa26 locus is an excellent site for inducing the ubiquitous expression of exogenous genes.
RMCE at the bRosa26 locus mediated by TAT-Cre. RMCE at a "safe locus" is important for creating transgenic farm animals because it avoids random integration and does not require drug selection. The insertion of heterotypic loxP sites along with EGFP into bRosa26 introduced a donor site, which allowed us to replace the EGFP cassette in the bRosa26 locus with any gene of interest by RMCE. To test the feasibility of this approach, we engineered an exchange vector (Fig. 5A) containing a promoterless tandem dimer of DsRed (tdimer2(12)) 25 flanked by heterotypic loxP sites. This exchange vector, named bRosa26-tdDIMER, was electroporated into bRosa26-EGFP fetal fibroblasts (Fig. 5A). The recombinant TAT-Cre protein was used to avoid integration of the Cre plasmid. After screening single-cell cultures, we isolated 88 colonies. The PCR analysis confirmed the successful replacement of EGFP with tdimer2(12) in 43 colonies (Table 3). Partial PCR results are shown in Fig. 5B. The positive bRosa26-tdDIMER cell clone was also identified by fluorescence microscopy (Fig. 5C). As expected, tdimer2(12) was ubiquitously expressed in the cell clones. Based on these results, the TAT-Cre-mediated RMCE technology can be used in cattle along with our bRosa26-EGFP reporter cell line.
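The cassette-exchange logic described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the locus and donor are represented as ordered lists of labels, the order of the two lox sites and the element names are hypothetical (they are not taken from Fig. 5A), and the model ignores site orientation and the recombination intermediates. It captures just one idea: heterotypic sites recombine only with their own kind, so the segment between the pair is swapped for the donor segment.

```python
from typing import List, Tuple

LOX_SITES = {"loxP", "lox2272"}


def lox_pair(elements: List[str]) -> Tuple[Tuple[int, str], Tuple[int, str]]:
    """Return (index, name) for the two lox sites; require a heterotypic pair."""
    sites = [(i, e) for i, e in enumerate(elements) if e in LOX_SITES]
    if len(sites) != 2 or sites[0][1] == sites[1][1]:
        raise ValueError("expected exactly one heterotypic lox pair")
    return sites[0], sites[1]


def rmce_exchange(locus: List[str], donor: List[str]) -> List[str]:
    """Replace the locus segment between the lox pair with the donor segment."""
    (l_start, l_left), (l_end, l_right) = lox_pair(locus)
    (d_start, d_left), (d_end, d_right) = lox_pair(donor)
    # Heterotypic sites are assumed not to cross-react, so identity and
    # order of the sites must agree between locus and donor.
    if (l_left, l_right) != (d_left, d_right):
        raise ValueError("lox sites must match in identity and order")
    return locus[: l_start + 1] + donor[d_start + 1 : d_end] + locus[l_end:]


# Hypothetical, simplified layouts (labels are illustrative only).
locus = ["bRosa26 promoter", "exon 1", "loxP", "EGFP", "lox2272", "3' arm"]
donor = ["loxP", "tdimer2(12)", "lox2272"]

exchanged = rmce_exchange(locus, donor)
print(exchanged)
```

Because the incoming cassette is promoterless, its expression after exchange depends on the endogenous bRosa26 promoter, which is why no selectable marker or exogenous promoter appears in the toy donor.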

Discussion
Precision genetic engineering based on the stable chromosomal insertion of exogenous DNA at a "safe locus" of the bovine genome may become important for the development of improved biomedical models, pharmaceutical research and accelerated breeding programs. Although this approach has been widely established in mice 9,26, rats 16, human cells 10,15 and pigs 17, it has been poorly investigated in cattle. The bRosa26 locus characterized here is the first transgene safe harbor identified in cattle. Similar to the locus in mice 12,27, rats 16, humans 15, pigs [17][18][19], and rabbits 20, this locus consists of highly conserved promoter and exon 1 sequences and shares the location of a flanking gene. Moreover, the bRosa26 locus comprises two exons and encodes an lncRNA, which was expressed in all tissues examined in the present study. Efficient gene targeting is an important factor if bRosa26 is to be used as a general-purpose permissive locus for transgene placement, similar to mouse Rosa26. The bRosa26 locus was highly amenable to TALEN-mediated knock-in in the present study, with greater than 70% efficiency in bovine fetal fibroblasts. Hence, the application of the transgene safe-harbor locus identified in this study in combination with emerging gene-editing nuclease tools, such as CRISPR/Cas9 28,29, may enable the efficient and simple targeted integration of a single copy (or any desired number) of a transgene with predictable transgene expression in the future. We confirmed that a foreign reporter gene inserted into the bRosa26 locus was expressed ubiquitously at various stages, including cells, embryos, fetuses and cattle. Strategies targeting the bRosa26 locus may overcome the problem of gene silencing and the concerns regarding gene disruptions or alterations induced by insertional mutagenesis.
RMCE is a simple and effective technique for replacing and expressing any gene at a safe locus 14. Recently, RMCE has also been used in transgenic pigs 17, but it has not been reported in cattle. Therefore, we created a valuable master bRosa26-EGFP fetal fibroblast cell line in which any genetic material of interest can be efficiently introduced and stably overexpressed using RMCE without the need for a drug-resistance gene. We used the TAT-Cre protein 30 in bovine fetal fibroblasts for the first time to perform RMCE and avoid integration of the Cre plasmid. These features define bRosa26 as a safe-harbor locus that is attractive for versatile transgenic applications, including gain-of-function, loss-of-function and lineage-tracing studies in cattle.
Many genetically modified animals produced using the Rosa26 safe locus are healthy and viable. However, the four cloned cattle all died soon after birth in our study, which may have been due to several factors. First, SCNT is associated with increased rates of abortion and health problems, such as death soon after birth, due to incomplete epigenetic reprogramming of the somatic donor nuclei 31,32. Therefore, many reconstructed embryos must be transplanted into a large number of recipient animals. Only eight recipient animals were used in our study, which was fewer than the number employed in our previous studies 5,33. Second, cattle normally carry a single calf, but we transplanted two embryos, resulting in twins and a higher chance of death. The zygote injection pathway has been used to efficiently generate random insertions, deletions (indels) or knock-in sequences in cattle. These studies provide strong technological support for the use of the bRosa26 locus in the future.
In summary, the bRosa26 locus characterized here is the first transgene safe-harbor locus identified in the bovine genome. We confirmed that foreign genes inserted into the bRosa26 locus were expressed ubiquitously, and we created a valuable master bRosa26-EGFP fetal fibroblast cell line in which any genetic material of interest can be efficiently introduced and stably overexpressed using RMCE. Hence, the results of this study will serve as an excellent platform for generating genetically modified cattle for biomedical and agricultural applications, particularly transgenic cattle mammary bioreactors.

Materials and Methods
Identification of the bRosa26 locus. The mouse Rosa26 promoter and exon 1 sequences were used as a template to search the Ensembl bovine database. A highly conserved sequence (sequence similarity of 83%) on bovine chromosome 22 was identified as the putative bRosa26 locus. The sequence of bRosa26 exon 1 was predicted using the alignment of mouse, rat, human, and pig Rosa26 exon 1 sequences.
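As a rough illustration of what a "sequence similarity" percentage summarizes, the following sketch computes percent identity over two pre-aligned sequences. It is a minimal example under stated assumptions: the scoring used by DNAMAN is not described in the text and may differ, the function name is hypothetical, and the two short sequences are made-up toy fragments, not the real Rosa26 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical bases.

    Columns where both sequences have a gap are skipped; a gap opposite a
    base counts as a mismatch (one common convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments for illustration only (not real Rosa26 sequence).
mouse_toy = "ACGTACGT-ACGT"
bovine_toy = "ACGTTCGTAACGA"

identity = percent_identity(mouse_toy, bovine_toy)
print(f"{identity:.1f}% identity over the aligned region")
```

Different tools count gaps and alignment ends differently, so values from this sketch should not be expected to reproduce the 83% figure reported above.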
Luciferase single-strand annealing (SSA) assay. The luciferase reporter plasmids were constructed using the SSA Kit (Viewsolid Biotech, Beijing, China) as described previously 24. The luciferase gene contained two repeats and was disrupted by a "stop" signal. Effective TALEN pairs generate a double-strand break (DSB) in the "stop" sequence, which allows the functional luciferase gene to be restored. The TALEN pair plasmids (500 ng each) and the luciferase reporter plasmid (100 ng) were co-transfected into human 293T cells in a 48-well plate using Lipofectamine 3000 (Invitrogen, USA). After 48 h, the cells were lysed and luciferase activity was determined using luciferase assay reagent (Promega, USA).
T7 endonuclease I (T7E1) assay. TALEN pairs were transfected into the BFFs using nucleofection. After 72 h, genomic DNA was extracted from the BFFs and the editing activity of each TALEN pair was assayed using T7 endonuclease I (T7E1) (New England Biolabs, USA), as described previously 34. Briefly, genomic DNA was extracted from TALEN-treated cells using a DNeasy Blood and Tissue kit (Qiagen, Hilden, Germany). PCR amplicons including the nuclease target sites were generated using the primers Rosa26-F, 5′-GCCGCAATACCTTTATGGGAG-3′ and Rosa26-R, 5′-ATTGGTGGTGAAACCTGTCTG-3′. The 700 bp PCR amplicons were denatured by heating and annealed to form heteroduplex DNA using a thermocycler, digested with T7E1 for 30 min at 37 °C, and analyzed using agarose gel electrophoresis. Mutation frequencies (indels, %) were calculated by quantifying the relative band intensities using ImageJ software.
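The conversion from gel band intensities to an indel percentage can be sketched as follows. The text does not state which formula was applied to the ImageJ measurements, so this example assumes the estimate commonly used for mismatch-cleavage assays, indel % = 100 × (1 − sqrt(1 − f_cut)), where f_cut is the cleaved fraction of total band intensity. The function name and the intensity values are hypothetical.

```python
import math
from typing import Sequence


def t7e1_indel_percent(uncut: float, cut_bands: Sequence[float]) -> float:
    """Estimate indel frequency (%) from background-corrected band intensities.

    Assumed formula (not confirmed by the source):
        f_cut = sum(cut) / (uncut + sum(cut))
        indel % = 100 * (1 - sqrt(1 - f_cut))
    The square root reflects that heteroduplexes form by random re-annealing
    of mutant and wild-type strands.
    """
    if uncut < 0 or any(b < 0 for b in cut_bands):
        raise ValueError("band intensities must be non-negative")
    total = uncut + sum(cut_bands)
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    fraction_cleaved = sum(cut_bands) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Made-up intensities: one uncut parental band and two cleavage products.
example_indel = t7e1_indel_percent(uncut=64.0, cut_bands=[20.0, 16.0])
print(f"estimated indel frequency: {example_indel:.1f}%")
```

With these made-up values the cleaved fraction is 0.36, which corresponds to an estimated indel frequency of 20%; the cleaved fraction itself would overstate the mutation rate.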
Cell culture and transfection. Primary BFFs were isolated from a Holstein cattle fetus by disaggregating the entire body, with the exception of the head and viscera, and cultured in Dulbecco's Modified Eagle's Medium (DMEM; Gibco, Grand Island, New York, USA) supplemented with 10% fetal bovine serum (FBS) (Gibco, Grand Island, New York, USA) at 37.5 °C in an atmosphere of 5% CO₂ and humidified air. Next, 4 µg of TALEN pair 1 and 4 µg of linearized pbRosa26-iEGFP donor were nucleofected into 1 × 10⁶ BFFs using Amaxa Nucleofector reagent (Lonza Group AG, Basel, Switzerland) according to the manufacturer's guidelines with program T-016. G418 (1 mg/ml) selection was applied to cell colonies that formed within 48 h after transfection, at a cell density of approximately 1 × 10⁵ cells/dish (10 cm²). Individual cell clones were isolated 7-10 days after G418 selection, expanded, cultured, sequenced and cryopreserved after a total of 12-14 days in culture.
The TAT-Cre protein was produced in vitro by transforming the BL21 (DE3) E. coli strain with the pTAT-Cre vector according to our previously published method 21,22. Protein expression was induced with 0.5 mM IPTG for 16 h at 16 °C. TAT-Cre was purified using Ni-NTA resin (Qiagen, Hilden, Germany) and cation-exchange chromatography (HiTrap SP HP) (GE Healthcare, Uppsala, Sweden), according to the manufacturer's manual. The protein purity was examined by SDS-PAGE. The protein concentration was determined with a Bradford assay and adjusted to approximately 1 µg/µl. Proteins were stored at −80 °C.
Generation of bRosa26-EGFP master cell lines. Approximately 1.0 × 10⁵ bRosa26-iEGFP cells were plated on 48-well cell culture plates and cultivated. After 24 h, cells were treated with 2 µM recombinant TAT-Cre proteins in serum-free DMEM (Gibco, Grand Island, New York, USA) for 2 h and then washed and cultivated for an additional 48 h with DMEM supplemented with 10% FBS (Gibco, Grand Island, New York, USA). Cells expressing EGFP were used to perform rejuvenation based on a previous method. Briefly, E46 embryos of bRosa26-EGFP fetuses from which the head, limbs, and internal organs had been removed were minced and digested with 0.25% Trypsin-EDTA in DMEM supplemented with 10% FBS (Gibco, Grand Island, New York, USA) and 1% penicillin-streptomycin (Gibco, Grand Island, New York, USA) for 10 min at 37 °C. Dissociated cells were centrifuged at 1000 g for 5 min and cultivated in DMEM containing 10% FBS on 10 cm² culture dishes. Cells were frozen in CellBanker 2 (ZENOAQ, Tokyo, Japan) for future use.
Production of cloned embryos and cattle. The somatic cell nuclear transfer (SCNT) procedure was performed as previously described 35. Briefly, the nuclei of transgenic cells were transferred into enucleated oocytes to produce reconstructed embryos in vitro using the ECM® 2001 Electro Cell Manipulation System (BTX, San Diego, CA, USA). Day 7 blastocysts were collected for future transplantation. Sixteen blastocysts were transferred into 8 recipient cattle. One to two transgenic cloned blastocysts were transferred into each recipient. Pregnancy was detected by ultrasonography at 60 days and 180 days post-transfer. All experiments were performed in accordance with the relevant guidelines and regulations, and the Institutional Animal Care and Use Committee of China Agricultural University approved this research.
Southern blot. Genomic DNA was extracted from animal ear tissue using phenol/chloroform. At least 10 μg of genomic DNA from transgenic and wild-type cattle was digested with the ScaI or DraIII restriction enzyme (New England Biolabs, USA) overnight. A 347 bp EGFP probe was amplified with the primer pair 5′-ATGGTGAGCAAGGGCGAGGAG-3′ and 5′-TTACTTGTACAGCTCGTCCATGC-3′ and was labeled using a PCR DIG Probe Synthesis Kit (Roche, Mannheim, Germany). After electrophoresis on a 0.8% agarose gel for 6 h, the DNA was transferred to a nitrocellulose membrane (Roche, Mannheim, Germany). Pre-hybridization and hybridization were performed at 45 °C, and washing steps were performed at 68 °C. The positive bands were expected to be 5.8 kb and 9.2 kb, respectively.
Western blotting. Samples were isolated from different tissues of the transgenic and wild-type (WT) cattle and homogenized in cell lysis buffer for Western and IP analyses (Beyotime, Shanghai, China). After centrifugation at 10000 g for 10 min at 4 °C, the total protein supernatants were collected, and protein concentrations were measured using a BCA Protein Assay kit (Beyotime, Shanghai, China). Approximately 20 μg of protein was separated on 10% SDS-PAGE gels and transferred to Immobilon-P membranes (MilliporeSigma, Burlington, MA, USA). After blocking in 3% BSA in TBST for 1 h, membranes were incubated with a GFP antibody (dilution, 1:10000; Abcam, Cambridge, MA, USA) or bovine GAPDH antibody (dilution, 1:10000; Abcam, Cambridge, MA, USA) overnight at 4 °C. After washes with TBST, membranes were incubated with a goat anti-rabbit antibody conjugated with horseradish peroxidase (dilution, 1:20000; Sino-American Co, Beijing, China) for 1 h followed by three washes with TBST. Protein signals were detected using an ECL Chemiluminescence kit (Thermo Fisher Scientific, Waltham, MA, USA).