Internal guide RNA interactions interfere with Cas9-mediated cleavage

The CRISPR/Cas system uses guide RNAs (gRNAs) to direct sequence-specific DNA cleavage. Not every gRNA elicits cleavage and the mechanisms that govern gRNA activity have not been resolved. Low activity could result from either failure to form a functional Cas9–gRNA complex or inability to recognize targets in vivo. Here we show that both phenomena influence Cas9 activity by comparing mutagenesis rates in zebrafish embryos with in vitro cleavage assays. In vivo, our results suggest that genomic factors such as CTCF inhibit mutagenesis. Comparing near-identical gRNA sequences with different in vitro activities reveals that internal gRNA interactions reduce cleavage. Even though gRNAs containing these structures do not yield cleavage-competent complexes, they can compete with active gRNAs for binding to Cas9. These results reveal that both genomic context and internal gRNA interactions can interfere with Cas9-mediated cleavage and illuminate previously uncharacterized features of Cas9–gRNA complex formation.

(b-e) Plots of published chromatin marks, made using the ZENCODE genome browser, near in vivo inactive gRNAs, with the gRNA region marked in pink: (b) in vivo inactive gRNA 4, (c) in vivo inactive gRNA 1, (d) in vivo inactive gRNA 2, and (e) in vivo inactive gRNA 5. In vivo inactive gRNA 3 is not shown but is similar to gRNAs 1 and 2, lacking notable peaks for these histone marks near the cleavage site.

Table 3) with 13 µM of Cas9 enzyme and injected 1 nL (300 pg gRNA, 6.5 µM Cas9) for the condition where gRNA/Cas9 is equal. We found little effect of competition at these ratios, even when the inactive gRNA was pre-incubated at room temperature for five minutes with the Cas9 protein. This indicates that there could be some exchange between active and inactive gRNAs over the time they are in the embryo, because the Cas9 protein should be limiting under these conditions. However, when the Cas9 protein is reduced four-fold and the gRNA amount is kept constant, there is a substantial effect of competition with the inactive gRNA compared to the control injections. Pre-incubating the inactive gRNA and mixing the gRNAs at the same time did not give substantially different outcomes for any condition, supporting the idea that some gRNA exchange could be occurring in the embryo.

First, an active gRNA is found that is similar to the inactive gRNA being studied, by comparing the sequence to a large set of active gRNAs tested in our lab. Then, chimeric gRNAs are tested against cognate chimeric target sites to identify which regions of the inactive gRNA are responsible for low activity. In this case, for inactive gRNA 5, a putative hairpin was identified mainly in the gene-specific portion of the gRNA, but also involving the first two bases of the gRNA backbone.
Single nucleotide substitutions (purple) within the putative hairpin (underlined) showed varying levels of increased activity over the original gRNA, likely depending on their importance to the formation of the hairpin. Mutations on both sides of the hairpin had the same effect, indicating that the hairpin forms between the predicted regions. Mfold free energies were collected for this set of gRNAs, as well as for all subsequent calculations except those in Figure S13, for the entire gene-specific portion and the first "GU" of the gRNA backbone, because we found that, for this gRNA, the backbone "GU" can interact with the gene-specific portion. The predictions are not done with the entire backbone because the prediction programs are not aware of the Cas9 enzyme, or that it is energetically favorable to form the correct backbone configuration to bind to the protein. We found that many unlikely structures were predicted when the entire backbone was included in the calculation.

Step 1: Identify an active gRNA that closely matches inactive gRNA.

ggUAGAGGCGGCCCUGCAGG - Does not cut
ggUAGAACAGAGACUGCAGG - Cuts

Step 2: Identify region of inactive gRNA responsible for low cleavage by testing chimeric gRNAs.
ggUAGAGGCGGCCCUGCAGG
ggUAGAACAGGCCCUGCAGG
ggUAGAGGCGAGACUGCAGG

Step 3: Identify regions of gRNA that are potentially misfolded and account for results in step 2.

Step 4: Test single nucleotide changes to determine whether hypothesis in step 3 is likely.
ggUAGAGGCGGCCCUGCAGGgu
ggUAGAGGCGACCCUGCAGGgu
ggUAGAGGCGGGCCUGCAGGgu
ggUAGAGGCGGCACUGCAGGgu
ggUAGAGGCGGCCCUGCAGCgu
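The chimera and substitution steps above can be sketched in a few lines of Python. This is only an illustration, not the authors' tooling: the function names are hypothetical, and the breakpoint of 10 is inferred from the chimeric sequences listed above rather than stated in the text.

```python
def make_chimera(five_prime_source, three_prime_source, breakpoint):
    """Join the 5' part of one gRNA to the 3' part of another (Step 2)."""
    if len(five_prime_source) != len(three_prime_source):
        raise ValueError("gRNAs must have the same length")
    return five_prime_source[:breakpoint] + three_prime_source[breakpoint:]


def differing_positions(seq_a, seq_b):
    """Return 0-based positions where two equal-length gRNAs differ."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b))
            if a.upper() != b.upper()]


def single_substitutions(seq, positions, alphabet="ACGU"):
    """Enumerate every single-nucleotide variant at the given positions (Step 4)."""
    variants = []
    for pos in positions:
        for base in alphabet:
            if base != seq[pos].upper():
                variants.append(seq[:pos] + base + seq[pos + 1:])
    return variants


# Sequences taken from the figure text above.
inactive = "ggUAGAGGCGGCCCUGCAGG"
active = "ggUAGAACAGAGACUGCAGG"

# Assumed breakpoint (inferred from the listed chimeras, not given in the text).
BREAKPOINT = 10

chimera_1 = make_chimera(active, inactive, BREAKPOINT)
chimera_2 = make_chimera(inactive, active, BREAKPOINT)
print(chimera_1)
print(chimera_2)
print(differing_positions(inactive, active))
```

With this breakpoint the two chimeras reproduce the Step 2 sequences, and a G to A change at position 10 (0-based) reproduces the gene-specific part of the first Step 4 variant.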

Step 1:
ggGAGUUCCAGGAGUCCCAG - Does not cut
ggGAGGUGCUGAAGCCGCUG - Cuts

Step 2:
ggGAGUUCCAGGAGUCCCAG
ggGAGUUCCAGGAGCCGCUG
ggGAGGUGCUGAAGUCCCAG

Fig. 3. The wild-type sequence is displayed on top, boxed in teal (multiple boxes if multiple hairpins are possible). Putative hairpins were predicted (underlined and bold) and substitutions (purple) were made. In all cases the Mfold free energy was predicted, and the energies are shown in red when the predicted lowest free energy does not follow the expected trend. These discrepancies could be due to altered energetics of folding within the Cas9 active sites, as the lowest free energy predicted hairpin may not be favorable in this context. The structures described in each case are examined in detail in Supplementary Fig. 15e-h (corresponding to a, d, c, and e in this figure).
a) For the last, best-cleaving gRNA (pink), a completely different hairpin structure was predicted with a lower free energy; this hairpin is unlikely to be plausible.
b) Mfold gives G:U pairings better free energies than they appear to have experimentally (-6.2 vs. -2.9, although both gRNAs cleave equally), but otherwise the prediction is reasonable.
c) Two hairpins are possible. The substitution that breaks the one with the worse free energy (-2.3) shows the best experimental cleavage, and for this best-cleaving gRNA Mfold identifies a completely different hairpin with a comparably low free energy (likely not forming).
d) The finding that the gRNA boxed in pink is high-cleaving conflicts with the lowest free energy predicted hairpin (-9.5). There are two plausible explanations: 1) The hairpin with free energy -9.5 is not forming and instead the less stable hairpin with free energy -3.9 is forming. This -3.9 hairpin is compatible with all data, if a new and different hairpin forms for the gRNA boxed in orange (new hairpin in darker grey, free energy -5.1). 2) The C to A substitution in the gRNA boxed in pink is not compatible with forming the -9.5 hairpin for some unknown reason, such as space within the Cas9 binding site.
e) A repetitive stretch of GCs (underlined and bold) could be forming internal interactions, as activity increased when the stretch was broken. However, a second gRNA with a similar stretch of GCs also had low activity, although the region was potentially too short to form internal gRNA interactions. It is possible that the repetitive nature of the sequence, rather than hairpins, was responsible for low activity in this case.

Fig. 3. gRNAs were tested with either one or two nucleotide substitutions that were predicted to increase the strength of a hairpin. Three of the gRNAs did not have strongly reduced activity from the added interactions (f, h, i); however, two of these gRNAs (f, h) already had relatively poor cleavage activity without the substitutions and were likely already forming detrimental internal gRNA interactions.

Fig. 3. These gRNAs were predicted to contain strong hairpins with Mfold, but they showed relatively high in vitro cleavage activity. Changing the G:U pairing to a G:C resulted in reduced cleavage activity.

Supplementary Figure 12. Comparison of Mfold free energies and intra-gRNA complementarity.
The biggest distinction between prediction with Mfold and a simple calculation of complementarity, defined as four or more bases that can interact within the gene-specific region, is the low free energy that Mfold predicts for G:U base-pairing. Among the gRNAs that do not contain obvious complementarity but are predicted by Mfold to have low free energies (we examined all those with free energies below -2.5), those with the lowest free energies all contain predicted G:U pairings (unpaired t-test, p = 0.0114, comparing the two subsets with free energies < -2.5).
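To make the "simple calculation of complementarity" concrete, here is a minimal Python sketch that searches a gene-specific sequence for stretches of four or more bases that could pair with each other. Only the four-base threshold comes from the text. The minimum loop size of three bases, the restriction to contiguous stems without bulges, and the function name are assumptions made for illustration, and this is not necessarily how the authors computed it. The `allow_gu` switch shows how counting G:U wobble pairs changes the result.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def find_stems(seq, min_stem=4, min_loop=3, allow_gu=False):
    """Return (start, end, length) for each maximal contiguous stem.

    start and end are the 0-based positions of the outermost base pair.
    min_loop (assumed value) is the minimum number of unpaired bases
    enclosed by the innermost pair.
    """
    seq = seq.upper()
    allowed = (WATSON_CRICK | WOBBLE) if allow_gu else WATSON_CRICK
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            # Skip stems that could be extended outward; they are
            # reported from their outermost pair instead.
            if i > 0 and j + 1 < n and (seq[i - 1], seq[j + 1]) in allowed:
                continue
            length = 0
            # Extend inward while the loop stays large enough and bases pair.
            while ((j - length) - (i + length) - 1 >= min_loop
                   and (seq[i + length], seq[j - length]) in allowed):
                length += 1
            if length >= min_stem:
                stems.append((i, j, length))
    return stems


# Synthetic example: four G:C pairs around a four-base loop.
print(find_stems("GGGGAAAACCCC"))  # [(0, 11, 4)]

# Synthetic example: the stem only reaches four pairs if G:U pairs count.
print(find_stems("GGGGAAAAUUCC", allow_gu=False))
print(find_stems("GGGGAAAAUUCC", allow_gu=True))
```

A gRNA flagged only when `allow_gu=True` corresponds to the class discussed above: no obvious Watson-Crick complementarity, but a structure that a folding program scoring G:U pairs may still report.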
As we show experimentally (Fig. 3c, Supplementary Fig. 11), replacing a G:U pairing with a G:C pairing can significantly strengthen a hairpin and reduce activity. The remaining gRNAs with low Mfold-predicted free energies and without G:U pairings often contained a hairpin with just three base-pairs, or a hairpin with a bulge in the sequence that may not be compatible with folding in the Cas9 binding site.

Supplementary Fig. 14. a) This gRNA showed high cleavage activity in the context of the original backbone, but not the extended one, indicating that interactions were forming with part of the extension. Making single nucleotide substitutions in the gene-specific region, in the context of the previously inactive extended backbone, corrected the low activity. The Mfold free energy predictions for these substitutions do trend correctly; however, they are not valid because the prediction does not identify this interaction between the gene-specific region and backbone, while the experimental evidence clearly supports it. b and c) Making single nucleotide substitutions in the gene-specific region for both of these gRNAs, both of which interact with the same predicted region of the backbone, resulted in increased activity comparable to substitutions in the backbone itself (Fig. 3f). Mfold free energy predictions with the whole backbone yielded structures and free energies that matched the expected interactions for both b and c.

The structure of the Cas9 complex, with the gene-specific region of the gRNA shown in white, the DNA in brown, and the hairpins that occur within the gRNA backbone and stabilize the Cas9-gRNA complex in four different colors. The hairpin that is most susceptible to misfolding caused by interactions with the gene-specific region of the gRNA is hairpin 2. This hairpin has fewer base-pairs (4 compared to 6 and 7 for hairpins 3 and 4, respectively).
The first region (hairpin 1) is the longest by total number of interactions, but it contains two regions of strong interactions separated by a less stable region. This hairpin was also susceptible to misfolding when it was extended by ten base-pairs (Supplementary Fig. 13). This extension was predicted to stabilize the hairpin, but it is unclear whether it had the intended effect without further structural studies. All assays conducted in this work used the extended hairpin unless otherwise noted, as in Supplementary Fig. 13.

The gRNA mutagenesis rate for these gRNAs was collected from cleavage in zebrafish embryos using pools of gRNAs. For this set we compared gRNAs with a mutagenesis rate of less than 0.05 to all the rest of the gRNAs. We used different criteria here than for our own dataset because the tests of pooled gRNAs likely underestimate the mutagenesis induced by each gRNA, due to limiting gRNA concentrations and potential competition effects.

[Plot: fold increase with refolding, for inactive gRNAs 1, 2, and 9, each with Cas9 mRNA or Cas9 protein.]

Supplementary Figure 18. Comparison of Mfold predicted free energies and experimental data. The experimentally-derived ΔG of Cas9 binding was estimated using the EC1/2max as a proxy for KD. This assumption is valid if all Cas9-gRNA complexes have similar turnover rates and if their cleavage of the DNA is controlled by enzyme concentration. a) The correlation between the experimentally-derived free energy and the Mfold free energy of gRNA folding for the sets of gRNAs shown in Supplementary Fig. 6 (S6) and 7 (S7). For both sets the predictions are accurate (R² = 0.85 for the S6 set and 0.95 for S7), although the relative trends differ between the two sets. b) Energies for breaking interactions between the gene-specific region and gRNA backbone shown in Supplementary Fig. 13.
The S13a set displays a linear trend but, as discussed in the S13 legend, the values are not based on a correct structure. The S13b and S13c datasets match expectations reasonably well, especially considering that several of the points (open triangles) are based on experimental EC1/2max values that are underestimated because the cleavage is so poor. c) Free energies for gRNA pairs where hairpins were strengthened and G:U pairings were substituted with G:Cs (Supplementary Fig. 10 and Supplementary Fig. 11). d) Free energies for the sets of broken hairpins from Supplementary Fig. 8 that are not implausible (nothing shown in red text there is included here). e-h) Structures of Mfold-predicted interactions that correspond with predictions in Supplementary Fig. 8. e) Structure 1 is predicted for the wild-type sequence, and the gRNAs with low cleavage and high free energy (2, 3) break this structure; the exception is the high-cleaving gRNA with a low free energy prediction (4), which is likely not forming the predicted structure. f) As in e, structure 4 is predicted, but is likely not forming, for S8c. g) Examples of the putative hairpins for the wild-type sequence in S8d (1, 2) and two low-cleaving mutant sequences (3, 4) with potentially different detrimental hairpins. h) Putative interactions for the wild-type sequence in S8e (1, 2), compared to alternative low-energy structures that are found for the high-cleaving gRNAs (3, 4) but that are likely not forming. Structure 2 is compatible with the experimental data.
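The Supplementary Figure 18 legend estimates the binding free energy by treating EC1/2max as a proxy for KD. A minimal sketch of that conversion follows, using the standard relation ΔG = RT ln(KD) with KD expressed relative to a 1 M standard state. The temperature (37 °C), the kcal/mol units, and the function name are assumptions made for illustration; the text does not state which values were used.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def binding_free_energy(ec_half_max_molar, temp_kelvin=310.15):
    """Estimate the binding free energy (kcal/mol) from EC1/2max.

    Treats EC1/2max (in mol/L) as a stand-in for KD, as described in the
    legend, and applies dG = R * T * ln(KD / 1 M).
    The default temperature of 310.15 K (37 C) is an assumed value.
    """
    if ec_half_max_molar <= 0:
        raise ValueError("EC1/2max must be a positive concentration")
    return R_KCAL_PER_MOL_K * temp_kelvin * math.log(ec_half_max_molar)


# Hypothetical EC1/2max values (not from the paper), in mol/L.
for ec in (1e-9, 1e-8, 1e-7):
    print(f"EC1/2max = {ec:.0e} M -> dG = {binding_free_energy(ec):.2f} kcal/mol")
```

Under these assumptions each ten-fold increase in EC1/2max raises the estimated ΔG by RT ln(10), about 1.4 kcal/mol at 37 °C, which gives a sense of the scale on which experimental values can be compared with Mfold folding energies.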