Allele-specific locus binding and genome editing by CRISPR at the p16INK4a locus

The clustered regularly interspaced short palindromic repeats (CRISPR) system has been adopted for a wide range of biological applications including genome editing. In some cases, dissection of genome functions requires allele-specific genome editing, but the use of CRISPR for this purpose has not been studied in detail. In this study, using the p16INK4a gene in HCT116 as a model locus, we investigated whether chromatin states, such as CpG methylation, or a single-nucleotide gap form in a target site can be exploited for allele-specific locus binding and genome editing by CRISPR in vivo. First, we showed that allele-specific locus binding and genome editing could be achieved by targeting allele-specific CpG-methylated regions, which was successful for one, but not all guide RNAs. In this regard, molecular basis underlying the success remains elusive at this stage. Next, we demonstrated that an allele-specific single-nucleotide gap form could be employed for allele-specific locus binding and genome editing by CRISPR, although it was important to avoid CRISPR tolerance of a single nucleotide mismatch brought about by mismatched base skipping. Our results provide information that might be useful for applications of CRISPR in studies of allele-specific functions in the genomes.

Scientific RepoRts | 6:30485 | DOI: 10.1038/srep30485 In this study, using the p16INK4a gene in HCT116 as a model locus, we investigated whether different chromatin states or a single-nucleotide gap form at target sites can be exploited for allele-specific CRISPR applications in vivo. We showed that allele-specific targeting of CpG-methylated regions could be achieved with one of six guide RNAs (gRNAs) tested. The allelic specificity was not determined by CpG methylation. In addition, we showed that a single-nucleotide gap form in one allele could be exploited for allele-specific locus binding and genome editing by CRISPR. Our results might facilitate applications of CRISPR to studies of allele-specific genome functions.

Results
Allele-specific genome editing using the CRISPR complex at CpG-methylated target regions in vivo.
In the human colorectal carcinoma cell line HCT116, one allele of the p16INK4a gene is not transcribed due to heavy methylation of the associated CpG island, which extends from the promoter to the first intron ( Fig. 1) [20][21][22][23] , and contains H3K9m2, a heterochromatin mark 24 . The other allele, which is not CpG-methylated, is transcribed; however, it bears a single-guanine insertion in the first exon, resulting in a frameshift mutation that prevents production of a functional protein (Fig. 1a) [20][21][22][23] . These properties of the p16INK4a locus in HCT116 make this cell line ideal for investigating whether chromatin states and a single-nucleotide gap form can be utilized for allele-specific CRISPR-mediated locus binding and genome editing in vivo.
First, to examine feasibility of allele-specific genome editing, we designed three chimeric single gRNAs (sgRNAs) that included different numbers of CpGs, namely sgRNA_lef5, sgRNA_mid2, and sgRNA_rig3, which targeted genomic sequences containing five, two, and three CpGs, respectively, in the CpG island of p16INK4a (Figs 1b-d  and 2a). In addition, they contained four, two, and two CpGs, respectively, in the seed sequence and/or protospacer adjacent motif (PAM) (Fig. 2a), positions that determine recognition of target sequences by CRISPR 25 .
Using these sgRNAs, we investigated whether allele-specific genome editing can be achieved by targeting allele-specifically transcribed locus in vivo. We transfected wild-type Cas9 and sgRNA expression plasmids, along with donor single-strand DNA (ssDNA), into wild-type HCT116 or HCT116-derived HCT116/del#3 cells to induce knock-in by homologous recombination (Supplementary Figs S1 and S2). We analyzed the outcome by genotyping PCR followed by DNA cloning and sequencing ( Supplementary Fig. S1); the allele-specific single-guanine insertion (wild-type HCT116) or 31 nt deletion (HCT116/del#3) in the first exon of the p16INK4a gene can be used to distinguish the corresponding alleles by DNA sequencing. As shown in Fig. 2b, when sgRNA_lef5 or sgRNA_ mid2 was used, the efficiencies of genome editing were comparable for both alleles. By contrast, when sgRNA_rig3 was used, the intended mutation was introduced preferentially in the non-CpG-methylated Gx5 allele (Fig. 2b). These results suggest that CRISPR-mediated genome editing is not necessarily affected by CpG methylation in vivo. However, the results obtained with sgRNA_rig3 show a possibility that allele-specific genome editing can be achieved by targeting an imprinted locus. Table 1 summarizes information on CpG positions at target sites. Our results suggest that the allelic preference of sgRNA_rig3 was not related to the level of CpG methylation per se, because sgRNA_lef5 (which targets more CpGs) had no allelic preference. Moreover, sgRNA_lef5 and sgRNA_mid2 contain four and two CpGs, respectively, in their seed sequences and PAMs, but did not exhibit allelic preferences, whereas sgRNA_rig3, which has two CpGs in these sequences, did exhibit a preference. Therefore, the allelic preference of a given sgRNA might not be related to the number of CpGs in the seed sequence and/or PAM, but might instead simply be determined by the local accessibility of target sites.

Allele-specific locus binding of the CRISPR complex in CpG-methylated target regions in vivo.
To explore this idea, we examined the locus accessibility of CpG-methylated target sites using a CRISPR complex consisting of dCas9 and sgRNA in vivo. We developed engineered DNA-binding molecule-mediated chromatin immunoprecipitation (enChIP) technology using dCas9 for identification of molecules that interact with genomic regions of interest in vivo [26][27][28][29][30][31] (see review 10 ). In enChIP, sgRNA and dCas9 (fused to an epitope tag, if necessary) expressed in cells bind to a locus specified by the sequence of the sgRNA. The targeted locus can be isolated by affinity purification using an antibody (Ab) against the epitope tag or dCas9 itself ( Supplementary Fig. S3a). In this study, we quantitatively evaluated allele-specific binding of the CRISPR complex to the target sites by enChIP followed by bisulfite treatment and quantitative methylation-specific PCR (MSP) (Supplementary Fig. S3a). The primer set designed for MSP could clearly distinguish CpG-methylated and non-CpG-methylated alleles ( Supplementary Fig. S3b,c). As shown in Fig. 2c, enChIP with sgRNA_lef5 or sgRNA_mid2 resulted in a comparable percentage of input (DNA yield) between the CpG-methylated Gx4 and non-CpG-methylated Gx5 alleles, suggesting that CpG methylation had no effect on binding of the CRISPR complex to these loci in vivo. By contrast, enChIP with sgRNA_rig3 resulted in significantly higher DNA yields for the non-CpG-methylated Gx5 allele. These findings are consistent with the results of genome editing ( Fig. 2b and Table 1), suggesting that the allelic preference of genome editing reflects the accessibility of a locus to the CRISPR complex.
CpG methylation does not directly suppress binding of the CRISPR complex to purified DNAs in vitro. Next, we investigated whether the allelic locus-binding preference of sgRNA_rig3 was not directly affected by CpG methylation. To this end, we employed in vitro enChIP technology using recombinant CRISPR ribonucleoproteins (RNPs) 32 . In in vitro enChIP, target genomic regions can be isolated without loss of the molecular interactions in cells that do not express the CRISPR complex 32 . This technology can be applied to sequence-specific isolation of target DNA from purified genomic DNA ( Supplementary Fig. S4) 32 . In this study, we performed in vitro enChIP with CRISPR RNA (crRNA) : trans-activating crRNA (tracrRNA) duplex instead of sgRNA. As shown in Fig. 3, when in vitro enChIP was performed with gRNA_lef5 (crRNA_lef5 : tracrRNA) or gRNA_mid2 (crRNA_mid2 : tracrRNA) using purified genomic DNA (which retains its characteristic in vivo methylation patterns) from HCT116 cells, CpG methylation did not suppress binding of CRISPR to the target site (Fig. 3). This result was consistent with the observed binding of CRISPR to the target site in vivo (Fig. 2c). Moreover, in vitro enChIP with gRNA_rig3 (crRNA_rig3 : tracrRNA) resulted in comparable DNA yields of the target site between the CpG-methylated Gx4 and non-CpG-methylated Gx5 alleles, suggesting that CpG methylation had no suppressive effect on binding of CRISPR to purified genomic DNA in vitro. This result is in sharp contrast to the allelic locus-binding preference of the CRISPR complex containing sgRNA_rig3 in vivo (Fig. 2c). Thus, the allelic preference of locus binding and genome editing by each sgRNA reflects the locus accessibility of the target sites in vivo, which may be influenced by nucleosome positioning, chromatin structure, occupancy by DNA-binding molecules, and other factors.

Analysis of allele-specific locus binding of the CRISPR complex in another locus in vivo.
It is interesting to know how easily we can find an sgRNA suitable for allele-specific locus binding of CRISPR in an imprinted locus. To this end, we also examined allele-specific locus binding of CRISPR to the p14ARF gene, which is also allele-specifically CpG-methylated and silenced at the level of transcription 33 (Fig. 4a-d). We designed three sgRNAs targeting the CpG island in the gene; two sgRNAs target DNA sequences containing three CpG sites similarly to sgRNA_rig3 (Fig. 4b-d). Although two sgRNAs (p14ARF_LEF3 and p14ARF_RIG3, Fig. 4b) showed statistically significant difference in allelic preference ( Fig. 4e and Supplementary Fig. S5), the difference (~1.4-fold higher DNA yields for CpG-methylated allele, Fig. 4e) was much smaller than that with sgRNA_rig3 for the p16INK4a locus (~5-fold difference, Fig. 2c). Thus, although allele-specific locus binding of CRISPR is possible with a suitable sgRNA such as sgRNA_rig3 for p16INK4a, it might not be easy to find sgR-NAs with large allelic preference. More effort and trial-and-error approaches would be required to find a logical designing strategy for allele-specific applications of CRISPR using different chromatin states.

Allele-specific genome editing by the CRISPR complex using a single-nucleotide gap form.
Allele-specific genome editing by CRISPR can take advantage of SNPs between alleles 14,15 . Therefore, using the single-guanine insertion in the p16INK4a gene, we next investigated whether a single-nucleotide gap form could also be utilized for allele-specific genome editing by CRISPR in vivo. Recent work showed that CRISPR can tolerate sequence mismatches through single-nucleotide skipping in the sgRNA and cleave the mismatched sites 34,35 .
We confirmed that cleavage of target sites caused by single-nucleotide skipping is observed if the gap occurs at the first or second base 5′ of the PAM (Supplementary Fig. S6a-c). In addition, the CRISPR complex tolerated sequence mismatches at the locus-binding step ( Supplementary Fig. S6d). In light of the results of the aforementioned reports 34,35 , skipping between the third and seventh bases 5′ of the PAM is unlikely to occur or to be less effective in tolerating sequence mismatches even if it occurs. Therefore, we designed two sgRNAs specifically targeting the Gx4 (sgRNA_Gx4#2) or Gx5 (sgRNA_Gx5#2) position to avoid undesirable single-nucleotide skipping at the first or second base 5′ of the PAM and allow it between the third and seventh bases ( Fig. 5a and Supplementary Fig. S7). As shown in Fig. 5b and Supplementary Fig. S8, Gx4 and Gx5 allele-specific genome editing occurred successfully using sgRNA_Gx4#2 and sgRNA_Gx5#2, respectively. These results clearly showed that an allele-specific single-nucleotide gap form can be used for allele-specific genome editing by CRISPR if the gRNA includes the gap form between the third and seventh bases 5′ of the PAM. We also investigated whether a single-nucleotide gap form could also be utilized for allele-specific locus binding by CRISPR in vivo. As shown in Fig. 5c, dCas9 in complex with sgRNA_Gx5#2 bound to the target site in a Gx5 allele-specific manner, consistent with the results of genome editing (Fig. 5b). This result showed that an allele-specific single-nucleotide gap form can also be used for allele-specific locus binding by CRISPR. By contrast,    sgRNA_Gx4#2 induced recruitment of dCas9 to both alleles equally (Fig. 5c), suggesting that allele-specific genome editing occurs through different molecular mechanisms for sgRNA_Gx4#2 and sgRNA_Gx5#2: sgRNA_Gx4#2 distinguishes the single-guanine insertion after the locus-binding step, whereas sgRNA_Gx5#2 distinguishes it at the binding step (Fig. 6). Therefore, the reported single-nucleotide skipping rule 34,35 might apply to allele-specific genome editing, but not necessarily to allele-specific locus binding.

Discussion
In this study, focusing on allele-specific applications of CRISPR, we first examined the effects of chromatin states at target sites on CRISPR-mediated locus binding as well as on genome editing in vivo (Figs 2 and 3). Although the allele-specific locus binding and genome editing were achieved only by sgRNA_rig3, our results revealed the following issues concerning the targeting of CpG-methylated sites for these applications: (1) CpG methylation does not necessarily affect locus binding or genome editing by CRISPR in vivo, although we are currently unable to eliminate the possibility that the position and number (> 5) of methylated cytosines (mCs) in a target site might change this viewpoint.
(2) Allelic preference may simply be determined by the accessibility of target sites in each allele, rather than by direct effects of CpG methylation. (3) Conventional in vivo enChIP technology is useful for the evaluation of the accessibility of target sites in vivo. It has been reported that CpG methylation (~4 mCs in a target site) does not affect genome editing by CRISPR using a CpG-methylated pUC19 plasmid in vitro and a CpG-methylated endogenous locus in vivo 19 . Because our conclusions are consistent with this observation, the lack of a direct effect of CpG methylation on CRISPR-mediated genome editing may be considered a general phenomenon. In addition, we also showed that CpG methylation (~5 mCs in a target site) does not directly suppress locus binding by CRISPR (Fig. 3). To our knowledge, this study is the first report to investigate the effects of CpG methylation on locus binding of CRISPR in vivo and in vitro while focusing on the allele-specific applications of CRISPR.
CpG methylation of target sequences interferes with DNA recognition by TAL proteins 36,37 . However, because the TAL DNA-binding module that recognizes thymine also recognizes mC 36,37 , this module could be applied to the targeting of CpG-methylated sequences 38 . In this study, in addition to the in vivo observations, we showed that CpG methylation of purified genomic DNA did not directly affect target binding by CRISPR (Fig. 3). Therefore, it is not necessary to consider the direct effects of CpG methylation of target sites when CRISPR is used for applications such as genome editing and locus binding. However, in the context of allele-specific applications, the aforementioned properties of TAL proteins (i.e., the ability to distinguish between methylated or un-methylated cytosines) might provide an advantage over the CRISPR system.
We showed a possibility that allele-specific locus binding and genome editing can be achieved by targeting an allele-specifically transcribed locus in vivo. However, only sgRNA_rig3 for the p16INK4a locus was successful (Figs 1 and 2), and detailed molecular basis for the allele-specific locus binding have not been yet elucidated at this stage. In addition, it remains unclear how we can find sgRNAs suitable for allele-specific targeting using different chromatin states. In this regard, it would be an interesting future study to identify the factors that contribute to the allelic preference of sgRNA_rig3 (see below).
In the Gx4 allele of p16INK4a, the exon 1 region contains H3K9m2, a heterochromatin mark 24 . Genomic regions with heterochromatin structures are thought to be closed and inaccessible 12,[16][17][18] . Although the p16INK4a exon 1 region is heavily methylated, some of the sgRNAs we tested could effectively recruit dCas9 to target sites in Figure 6. Summary of genome editing and locus binding using sgRNA_Gx4#2 or sgRNA_Gx5#2. Singlenucleotide skipping ["sgRNA jump (DNA bulge)" for sgRNA_Gx4#2 and "sgRNA bulge" for sgRNA_Gx5#2] can occur between the third and seventh positions 5′ of the PAM for each sgRNA. As a representative, singlenucleotide skipping at the seventh nucleotide 5′ of the PAM is shown for each sgRNA. PAMs are shown in green. The single-guanine insertion in the Gx5 allele is shown in red. the region in vivo (Fig. 2c), suggesting that these target sites are locally open even in heterochromatin structures in vivo, allowing CRISPR to access these sites. By contrast, sgRNA_rig3 could not recruit dCas9 to the target site in the Gx4 allele (Fig. 2c). Because CpG methylation does not directly affect binding of dCas9, allelic preference might be determined by the local accessibility of the target site in vivo, in which nucleosome positioning, chromatin structure, occupancy by DNA-binding molecules, and other factors might play important roles. In this regard, a nucleosome boundary is likely to be present around the target site for sgRNA_mid2 in CpG-methylated p16INK4a in the human gastric adenocarcinoma cell line AGS 39 . The CpG-methylated Gx4 allele in HCT116 might be organized in the same way. Thus, it might be possible that nucleosome positioning and/or chromatin structures are involved in the allelic preference of sgRNA_rig3. Alternatively, unidentified DNA-binding molecules might simply occupy the target site of sgRNA_rig3, which would inhibit binding of dCas9. On the other hand, our results suggest that accessibility of target sites in vivo in heterochromatin regions can be evaluated by conventional in vivo enChIP technology. If target sites prove to be accessible even in heterochromatin regions, they can be targeted for CRISPR-mediated genome editing or other applications, such as locus-specific epigenetic modifications to introduce euchromatinization 40,41 .
We succeeded in performing allele-specific CRISPR-mediated genome editing using an allele-specific single-guanine insertion in the p16INK4a gene. Thus, in addition to SNPs 14,15 , a single-nucleotide gap form between two alleles can be exploited for allele-specific applications of CRISPR in vivo (Fig. 5). In contrast to the situation using sgRNA_rig3, we could systematically design sgRNAs suitable for this purpose on the basis of the previous reports 34,35 . To avoid potential nucleotide skipping, it was necessary to design the sgRNA to include the gap form between the 3rd and 7th bases 5′ from the PAM. Indeed, CRISPR with sgRNA_Gx4 tolerated the single-nucleotide guanine insertion by single-nucleotide skipping, preventing successful allele-specific locus binding and genome editing (Supplementary Fig. S6). Previous reports showed that nucleotide mismatches can be tolerated by CRISPR through single-nucleotide skipping 34,35 . However, no previous report has investigated whether single-nucleotide skipping can also occur during locus binding by CRISPR. In this study, we found that the reported rule for single-nucleotide skipping in genome editing 34,35 can be applied to locus binding in some cases (i.e. sgRNA_Gx5#2) (Fig. 5c). We also observed that although single-nucleotide skipping does not occur for nucleotide mismatches between the 3rd and 7th bases 5′ from the PAM during genome editing, skipping occurred even for these mismatches when sgRNA_Gx4#2 but not sgRNA_Gx5#2 was used to specify the locus binding of dCas9 (Fig. 5c). The mode of single-nucleotide skipping obviously differed between these sgRNAs (Fig. 6): specifically, sgRNA_Gx4#2 and sgRNA_Gx5#2 are predicted to engage in "sgRNA jumping (DNA bulge)" and "sgRNA bulge" types of skipping, respectively. Therefore, single-nucleotide skipping by the DNA bulge type might simply tolerate more single-nucleotide mismatches at the locus binding step than at the genome editing step. Alternatively, an alternative PAM "cgg" for sgRNA_Gx4#2 might result in increased mismatch tolerance in locus binding by CRISPR (Supplementary Fig. S9). Indeed, Supplementary Fig. S9 shows that 2-bp mismatches adjacent to PAM could be tolerated, although such off-target sequences would not be effectively tolerated [42][43][44][45] . An additional study will be required to identify the detailed mechanisms.
The CRISPR system can also be used for a wide range of biological applications including artificial transcriptional regulation 2,5,6,9 , epigenetic modification 9 , locus imaging 5,6,9 , and isolation of specific genomic regions in a locus-specific manner 5,9,10 . Therefore, defining the rule for single-nucleotide skipping in locus binding would be beneficial for the aforementioned allele-specific applications of CRISPR using a single-nucleotide gap form. Our results might provide information that might be useful for implementing CRISPR in studies of allele-specific genome functions in phenomena such as X-chromosome inactivation, genomic imprinting, and cancer.

Methods
Cell lines, plasmids, primers, and donor DNAs. The human colorectal carcinoma cell line HCT116 was obtained from the American Type Culture Collection (ATCC). HCT116 or HCT116-derived cells were maintained in McCoy's 5A Medium (Thermo Fisher Scientific) with 10% (v/v) fetal bovine serum at 37 °C.
The Cas9 expression plasmid (Addgene #41815) was provided by Dr. George Church via Addgene. The dCas9 expression plasmid 3xFLAG-dCas9/pCMV7.1 (Addgene #47948) was described previously 26 . For construction of the sgRNA_rig3 expression plasmid, a plasmid containing the U6 promoter, target sequence, and gRNA scaffold was synthesized by GeneArt Gene Synthesis (Thermo Fisher Scientific). For construction of other sgRNA expression plasmids, the plasmid gRNA Cloning Vector (BbsI), which contains the U6 promoter, BbsI site, and gRNA scaffold, was synthesized by GeneArt Gene Synthesis, and gRNA sequences were cloned into the BbsI site. Primers and donor ssDNA used in this study are shown in Supplementary Table S1.
Bisulfite treatment and sequencing. Genomic DNA (400 ng) was subjected to bisulfite treatment with the EZ DNA Methylation-Lightning Kit (Zymo Research). Bisulfite-treated DNA (40 ng) was subjected to PCR with TaKaRa EpiTaq HS (for bisulfite-treated DNA) (Takara Bio). PCR cycles were as follows: 40 cycles of 98 °C for 10 sec, 55 °C for 30 sec, and 72 °C for 1 min. PCR products were cloned into T-vector pMD20 (Takara Bio) and subjected to DNA sequencing analysis with the M13 reverse primer. Methylation status was analyzed by QUMA, a methylation analysis tool (http://quma.cdb.riken.jp/index_j.html).
Genome editing by CRISPR. HCT116 or HCT116-derived cells (4 × 10 5 cells) were transfected with 2 μ g each of Cas9 expression plasmid, sgRNA expression plasmid, and donor ssDNA using the Lipofectamine 3000 transfection reagent (Thermo Fisher Scientific). To confirm gene targeting, genomic DNA was extracted 2 days after transfection using the Quick-DNA Universal Kit (Zymo Research), and then subjected to genotyping PCR with KOD FX (Toyobo). PCR products were cloned into pCR4-TOPO (Thermo Fisher Scientific) and analyzed by DNA sequencing.
Statistical analysis. P-values were calculated using the Excel software (Microsoft) using Student's t-test.