Identification of telomere-associated molecules by engineered DNA-binding molecule-mediated chromatin immunoprecipitation (enChIP)

Biochemical analysis of molecular interactions in specific genomic regions requires their isolation while retaining molecular interactions in vivo. Here, we report isolation of telomeres by engineered DNA-binding molecule-mediated chromatin immunoprecipitation (enChIP) using a transcription activator-like (TAL) protein recognizing telomere repeats. Telomeres recognized by the tagged TAL protein were immunoprecipitated with an antibody against the tag and subjected to identification of telomere-binding molecules. enChIP-mass spectrometry (enChIP-MS) targeting telomeres identified known and novel telomere-binding proteins. The data have been deposited to the ProteomeXchange with identifier PXD000461. In addition, we showed that RNA associated with telomeres could be isolated by enChIP. Identified telomere-binding molecules may play important roles in telomere biology. enChIP using TAL proteins would be a useful tool for biochemical analysis of specific genomic regions of interest.

Molecular complexes in the context of chromatin mediate the functions of the eukaryotic genome 1. Identification of chromatin components is essential for elucidating the molecular mechanisms of genome functions. We recently developed insertional chromatin immunoprecipitation (iChIP) technology to isolate specific genomic regions while retaining molecular interactions in vivo 2,3. In iChIP, recognition sequences of an exogenous DNA-binding molecule such as LexA are inserted into a target genomic region. Subsequently, the tagged genomic region is affinity-purified and subjected to downstream analyses such as mass spectrometry (MS) (iChIP-MS) to identify proteins and RT-PCR (iChIP-RT-PCR) to identify RNA 3,4. Although iChIP enables us to isolate specific genomic regions of interest and dissect chromatin components, insertion of recognition sequences of LexA or other DNA-binding molecules is a time-consuming step.
To eliminate the insertion of recognition sequences of an exogenous DNA-binding molecule and make the procedure more straightforward, we recently developed a novel method, engineered DNA-binding molecule-mediated chromatin immunoprecipitation (enChIP), for purification of specific genomic regions 5. In enChIP, a tagged engineered DNA-binding molecule recognizing an endogenous target DNA sequence is expressed in the cell to be analyzed. Subsequently, the target genomic region is affinity-purified, for example by immunoprecipitation with an antibody (Ab) against the tag.
In the present study, we isolated telomeres by enChIP using a transcription activator-like (TAL) protein recognizing telomere repeats. enChIP-mass spectrometry (enChIP-MS) targeting telomeres identified known and novel telomere-binding proteins. In addition, we showed that RNA associated with telomeres could be detected by enChIP combined with RT-PCR (enChIP-RT-PCR). Identified telomere-binding molecules may play important roles in telomere biology. Thus, enChIP using TAL proteins would be a useful tool for biochemical analysis of specific genomic regions of interest.

Results
Scheme of enChIP. enChIP consists of the following steps 5 (Fig. 1): (i) A DNA-binding molecule/complex (DB) recognizing a target DNA sequence in a genomic region of interest is engineered (Fig. 1a, b). Zinc-finger proteins 6, TAL proteins 7, and a catalytically inactive Cas9 (dCas9) plus small guide RNA (gRNA) 8 can be used as the DB. A tag(s) and a nuclear localization signal (NLS) are fused to the engineered DB (Fig. 1b), and the fusion is expressed in the cell to be analyzed. (ii) If necessary, the resultant cell is stimulated and crosslinked with formaldehyde or other crosslinkers. (iii) The cell is lysed, and chromatin is fragmented by sonication or digested with nucleases. (iv) The chromatin complexes containing the engineered DB are affinity-purified by immunoprecipitation or other methods. (v) After reverse crosslinking, DNA, RNA, proteins, or other molecules are purified and identified by various methods, including next-generation sequencing and mass spectrometry (Fig. 1c).
Isolation of telomeres by enChIP using a TAL protein. To isolate telomeres, a TAL protein, Telomere-TAL (Tel-TAL), recognizing a 19-bp sequence containing an array of TTAGGG (telomere repeats) (Fig. 2a) was fused with a 3xFLAG tag and an NLS (3xFN-Tel-TAL) (Fig. 1b, Supplemental Fig. 1). First, we examined binding of 3xFN-Tel-TAL to telomere repeats or to an irrelevant interferon regulatory factor-1 (IRF-1) promoter sequence in vitro using a DNA-affinity precipitation assay (DNAP) 9. As shown in Fig. 2b (the full-length blot with size markers is shown in Supplemental Fig. 2), binding of 3xFN-Tel-TAL to telomere repeats was clearly detected, whereas its binding to the irrelevant IRF-1 promoter sequence was marginal, showing that binding of 3xFN-Tel-TAL is specific to telomere repeats in vitro. Next, 3xFN-Tel-TAL was expressed in a mouse hematopoietic cell line, Ba/F3, which expresses functional telomerase 10. Expression of 3xFN-Tel-TAL or a negative control protein consisting of a 3xFLAG tag, an NLS, and LexA protein (3xFNLDD) 11 was confirmed by flow cytometry with anti-FLAG M2 Ab (Fig. 2c). The cells were crosslinked with formaldehyde, and the crosslinked chromatin was fragmented by sonication. Subsequently, chromatin complexes containing 3xFN-Tel-TAL or 3xFNLDD were immunoprecipitated with anti-FLAG M2 Ab. Southern blot analysis using a telomere probe showed that telomere DNA was specifically detected in the immunoprecipitates prepared from Ba/F3 cells expressing 3xFN-Tel-TAL; 1.2% of input genomic DNA was immunoprecipitated with 3xFN-Tel-TAL (Fig. 2d; the full-length blot with size markers is shown in Supplemental Fig. 3). In contrast, irrelevant γ-satellite repeats were not specifically enriched in the immunoprecipitates prepared from Ba/F3 cells expressing 3xFN-Tel-TAL (Supplemental Fig. 4). These results showed that enChIP with 3xFN-Tel-TAL can isolate telomeres specifically.
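The recovery figure quoted above (1.2% of input) is a ratio of the immunoprecipitated signal to the total input signal. The text does not describe how the Southern blot was quantified, so the following Python sketch is only an illustration of the usual percent-of-input arithmetic. The function name, the densitometry values, and the assumption that 1% of the input chromatin was loaded in the input lane are all hypothetical.

```python
def percent_of_input(ip_signal: float, input_signal: float, input_fraction: float) -> float:
    """Express an IP band signal as a percentage of the total input chromatin.

    ip_signal      -- band intensity of the immunoprecipitated sample
    input_signal   -- band intensity of the input lane
    input_fraction -- fraction of the total input loaded in the input lane
                      (for example 0.01 if 1% of the input was loaded)
    """
    if input_signal <= 0:
        raise ValueError("input_signal must be positive")
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Scale the input lane up to the signal expected from 100% of the input.
    total_input_signal = input_signal / input_fraction
    return 100.0 * ip_signal / total_input_signal


# Hypothetical densitometry readings, chosen only to illustrate the arithmetic.
recovery = percent_of_input(ip_signal=600.0, input_signal=500.0, input_fraction=0.01)
print(f"{recovery:.1f}% of input")
```

With these made-up readings the IP lane carries 600 units against an extrapolated 50,000 units of total input, which corresponds to a recovery of 1.2%.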
Identification of telomere-binding proteins by enChIP-MS. Next, enChIP-MS was used to perform a non-biased search for proteins associated with telomeres. Cells (4 × 10^7) were subjected to enChIP followed by elution with 3xFLAG peptide. The eluate was reverse-crosslinked and subjected to SDS-PAGE. After staining with Coomassie Brilliant Blue, proteins were excised for MS analysis (Supplemental Fig. 5). We detected many telomere-related proteins (Table 1, Supplemental Table 1). They included known telomere-binding proteins, proteins interacting with telomere-binding proteins, and proteins whose mutations affect telomere function. These data showed that it is feasible to identify proteins interacting with endogenous genomic regions by enChIP-MS.
Co-localization of novel telomere-binding proteins with telomeres. The above-mentioned list contained many proteins whose localization at mammalian telomeres has not been reported. We examined localization of several of them at telomeres in U2OS cells by imaging analysis. Immunofluorescence microscopy with Abs against endogenous proteins showed that selected candidate proteins (KDM5C, POLA1, CTBP1, DDX54, GNL3L) co-localized with TRF1, a marker protein of telomeres (Fig. 3a-e). Consistently, Halo-tagged KDM5C and CTBP1 also showed co-localization with TRF2, another marker protein of telomeres (Supplemental Fig. 6). In addition, Halo-tagged BEND3 protein co-localized with TRF2 (Fig. 3f). Detected telomere-binding proteins showed a variety of localization patterns with telomeres (Fig. 3). KDM5 is a lysine-specific histone demethylase. It has been shown that its yeast homologue LSD1 localizes in heterochromatin including subtelomeric regions and that its mutations cause spreading of telomeric heterochromatin 12. POLA1 is the DNA polymerase α catalytic subunit. It has been shown that its yeast homologue binds to the telomere protein Cdc13p 13. A mouse cell line expressing a temperature-sensitive mutant of POLA1 has been shown to have telomere defects 14. It has been shown that CTBP (C-terminal binding protein) binds to FoxP2 protein, which is associated with the POT1 telomere-binding protein 15. DDX54 is a DEAD-box RNA helicase, and it has been reported that human DDX54 interacts with estrogen receptors (ERs) and represses transcription of ER target genes 16. Involvement of DDX54 in telomere functions has not been reported. GNL3L (guanine nucleotide-binding protein-like 3-like protein) is a paralog of the stem cell-enriched GTP-binding protein GNL3, also known as nucleostemin 17. It has been shown that GNL3L binds to the telomerase complex and TRF1 18. BEND3 (BEN domain-containing protein 3) has been shown to localize to heterochromatin associated with HP1 19.
Detection of telomerase RNA by enChIP-RT-PCR. Next, we examined whether RNA species interacting with telomeres can also be isolated by enChIP. After isolation of telomeres by enChIP from Ba/F3 cells, RNA purified from the chromatin was treated with DNase and subjected to RT-PCR analysis to detect the RNA component of telomerase 20,21. As shown in Fig. 4 (the full-length gel image with size markers is shown in Supplemental Fig. 7), telomerase RNA was clearly detected in chromatin isolated with 3xFN-Tel-TAL but not in the negative control sample. This result showed that enChIP can be used for identification of RNA associated with target genomic regions.

Discussion
In this study, we developed enChIP using a TAL protein for purification of specific genomic regions retaining molecular interactions in vivo, allowing non-biased identification of binding molecules (Fig. 1). We showed that enChIP using 3xFN-Tel-TAL is able to isolate telomeres (Fig. 2). Using enChIP-MS, we detected known and novel proteins interacting with telomeres (Table 1, Supplemental Table 1 and Fig. 3). Detected telomere-binding proteins showed a variety of localization patterns with telomeres (Fig. 3). For example, DDX54 showed a high frequency of co-localization with telomeres, suggesting that it functions mainly at telomeres. In contrast, relatively small fractions of KDM5C and POLA1 co-localized with telomeres, suggesting that they function at telomeres as well as at other sites in the nucleus. In this regard, the U2OS cells used in the imaging analysis lack telomerase activity and maintain telomere DNA via homologous recombination (alternative lengthening of telomeres (ALT)) 22. Therefore, it would be interesting to examine localization of the detected proteins in telomerase-positive cells. Nevertheless, enChIP-MS could detect a wide variety of telomere-interacting proteins, which is beneficial for comprehensive understanding of telomere biology. It would also be interesting in future work to investigate the functions of the newly identified telomere-binding proteins at mammalian telomeres in telomerase-positive and -negative cells.
Furthermore, telomerase RNA associated with telomeres was detected by enChIP-RT-PCR (Fig. 4). Combination of enChIP with microarray analysis (enChIP-chip) or RNA-Seq analysis (enChIP-RNA-Seq) would enable us to perform a non-biased search for RNA species associated with a given genomic region. In fact, we have identified a list of telomere-binding RNA species by enChIP-RNA-Seq analysis (T.F. and H.F., unpublished data). enChIP can also be combined with next-generation sequencing (enChIP-Seq) to detect interactions between genomic regions.
In addition to TAL proteins, other engineered DNA-binding molecules such as zinc-finger proteins 6 and the CRISPR system consisting of dCas9 and gRNA 8 can be used for enChIP. Indeed, in concurrent work we successfully isolated a single-copy locus by enChIP using dCas9 plus gRNA and identified associated proteins by MS 5.
enChIP is related to the iChIP technology we developed recently 2,4. In contrast to iChIP, enChIP does not require insertion of recognition sequences of exogenous DNA-binding proteins such as LexA. Therefore, the isolation procedure of enChIP is much more straightforward than that of iChIP. On the other hand, enChIP cannot distinguish two alleles if the target genomic region is on an autosome, whereas iChIP can differentially isolate genomic regions in a specific allele. Consequently, if the genome function is regulated in an allele-specific manner, e.g., in genomic imprinting, iChIP would be the method of choice. Thus, enChIP and iChIP are complementary approaches.
Recently, Kingston's group developed proteomics of isolated chromatin (PICh) as a novel technique to isolate specific genomic regions while retaining molecular interactions 23. In PICh, specific biotinylated nucleic acid probes that hybridize to target genomic regions and streptavidin beads are used to isolate the regions. Telomere-associated proteins were identified by PICh. However, it has not been shown whether PICh can be used for identification of RNA species associated with specific genomic regions. Since the nucleic acid probes used in PICh can hybridize not only with genomic DNA but also with RNA, careful analysis would be required to determine whether a detected RNA is associated with the target genomic region or hybridizes directly with the probe. In contrast, RNA detected by enChIP is, in principle, interacting with the target genomic regions. Therefore, detection of RNA associated with specific genomic regions can be done more easily by enChIP.
In summary, we isolated telomeres by enChIP using a TAL protein. enChIP-MS and enChIP-RT-PCR successfully identified telomere-associated proteins and RNA, respectively. Thus, enChIP using TAL proteins would be a useful tool for biochemical analysis of specific genomic regions of interest.

Methods
Plasmids. The plasmid encoding the NLS-fused Telomere-TAL protein (Tel-TAL) recognizing a 19-mer telomere repeat (TAGGGTTAGGGTTAGGGTT) (the Tel-TAL cloning plasmid) was generated by Life Technologies. To construct 3xFN-Tel-TAL/pCMV-7.1, the Tel-TAL cloning plasmid was cleaved with Not I, blunted, and subsequently cleaved with Pme I to obtain the coding sequence of Tel-TAL. The coding sequence of Tel-TAL was inserted into the p3XFLAG-CMV-7.1 vector (Sigma-Aldrich). To construct 3xFN-Tel-TAL/pMXs-puro and 3xFN-Tel-TAL/pMXs-neo, 3xFN-Tel-TAL/pCMV-7.1 was cleaved with Sac I, blunted, and subsequently cleaved with Sma I to obtain the coding sequence of 3xFN-Tel-TAL. The coding sequence of 3xFN-Tel-TAL was inserted into the pMXs-puro or pMXs-neo retroviral vector, which was constructed by replacing the GFP reporter cassette of pMXs-IG 24 with the puromycin or neomycin resistance gene of pMX-puro or pMX-neo 25, respectively. Expression vectors of Halo-tagged proteins were purchased from Promega.
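Because the 19-mer recognized by Tel-TAL is itself a stretch of the TTAGGG repeat, a tandem repeat array contains one overlapping recognition site per repeat unit. The following Python sketch illustrates this on a model array. The function name and the 10-unit model array are assumptions made for illustration, and exact string matching says nothing about actual TAL binding affinity or chromatin accessibility.

```python
TEL_TAL_TARGET = "TAGGGTTAGGGTTAGGGTT"  # 19-mer recognized by Tel-TAL (from the text)
REPEAT_UNIT = "TTAGGG"                  # telomere repeat unit (from the text)


def find_sites(sequence: str, target: str = TEL_TAL_TARGET) -> list[int]:
    """Return 0-based start positions of all (overlapping) exact matches of target."""
    positions = []
    start = sequence.find(target)
    while start != -1:
        positions.append(start)
        # Advance by one base so that overlapping matches are also found.
        start = sequence.find(target, start + 1)
    return positions


# Hypothetical model array of 10 tandem repeats (60 nt); real telomeres are far longer.
model_telomere = REPEAT_UNIT * 10
sites = find_sites(model_telomere)
print(len(TEL_TAL_TARGET), sites)
```

In this model the sites start at positions 1, 7, 13, and so on, one per repeat unit, until fewer than 19 nt remain. This is consistent with a single TAL protein being able to tag a telomere at many positions.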
enChIP-Southern blot analysis. Cells (2 × 10^7) were fixed with 1% formaldehyde at 37 °C for 5 min. The chromatin fraction was extracted and fragmented by sonication (the average length of fragments was about 2 kb) as described previously 3.
enChIP-RT-PCR. For the enChIP-RT-PCR analysis, the enChIP procedure was performed as described for enChIP-Southern blot analysis, with 5 U/ml of rRNasin Plus (Promega) in all buffer solutions except Sonication Buffer, to which 40 U/ml of rRNasin Plus was added. The Dynabeads were suspended in 285 µl of TE and 12 µl of 5 M NaCl and incubated at 65 °C overnight to reverse the crosslinks. After Proteinase K treatment at 45 °C for 2 h, RNA was purified with Isogen II (Nippon Gene). After treatment with RQ1 RNase-free DNase (Promega), samples were subjected to RT-PCR analysis using the TITANIUM One-Step RT-PCR Kit (Clontech). The cycling conditions were as follows: 50 °C for 1 h followed by 94 °C for 5 min; 43 cycles of 94 °C for 30 sec, 60 °C for 30 sec, and 68 °C for 1 min; and an additional incubation at 68 °C for 2 min. The primers used in this experiment were mTR 3-RT (#27234) (5′-CCGGCGCCCCGCGGCTGACAGAG-3′) (0.4 µM), mTR 5-b (#27235) (5′-GCTGTGGGTTCTGGTCTTTTGTTC-3′) (0.9 µM), and mTR 3 (#27236) (5′-GCGGCAGCGGAGTCCTAAG-3′) (0.9 µM), described previously 26.
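Two quantities implied by the protocol above can be checked with simple arithmetic: the final NaCl concentration of the reverse-crosslinking mix (12 volumes of 5 M NaCl added to 285 volumes of TE) and the programmed hold time of the one-step RT-PCR. The Python sketch below is only a worked example. The variable names are illustrative, TE is assumed to contribute no NaCl, and thermocycler ramp times are ignored.

```python
# Reverse-crosslinking mix: the result depends only on the volume ratio,
# so both volumes just need to be expressed in the same unit.
te_volume = 285.0
nacl_stock_volume = 12.0
nacl_stock_molar = 5.0  # 5 M NaCl stock

final_nacl_molar = nacl_stock_molar * nacl_stock_volume / (te_volume + nacl_stock_volume)

# One-step RT-PCR program as (temperature in degrees C, hold time in minutes).
rt_and_denaturation = [(50, 60.0), (94, 5.0)]
cycle = [(94, 0.5), (60, 0.5), (68, 1.0)]
n_cycles = 43
final_extension = [(68, 2.0)]


def hold_minutes(steps):
    """Sum the programmed hold times of a list of (temperature, minutes) steps."""
    return sum(minutes for _, minutes in steps)


total_minutes = (
    hold_minutes(rt_and_denaturation)
    + n_cycles * hold_minutes(cycle)
    + hold_minutes(final_extension)
)

print(f"Final NaCl: {final_nacl_molar * 1000:.0f} mM")
print(f"Programmed hold time: {total_minutes:.0f} min")
```

Under these assumptions the mix ends up at roughly 200 mM NaCl and the program holds for 153 min in total (65 min of reverse transcription and initial denaturation, 86 min of cycling, and 2 min of final extension).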