To the Editor,

The generation of knockout mouse models is essential for understanding complex biological pathways within the whole organism. Elucidating cell death mechanisms in vivo or identifying cancer vulnerabilities in a model system has only been possible through the generation of knockout mice lacking the genes thought to be involved in these processes.1, 2, 3 However, the constitutive deletion of certain genes, such as MCL-1, often results in early embryonic lethality.4 Conditional deletion methodologies have therefore been invaluable for analyzing, in the adult mouse, the function of genes that are critical for embryonic development. Until now, however, generating mice with conditional alleles or other complex genetic alterations has been both very difficult and labor intensive.

The use of the CRISPR/Cas9 system in the one-cell stage embryo has revolutionized the speed, simplicity and efficiency of generating genetically modified mice.5, 6 Although it is well accepted that simple modifications, such as gene deletions and small insertions, are highly reproducible, the use of this methodology to develop mouse models with more complex genetic modifications, such as floxed or large knock-in alleles, has been associated with much lower success rates. Indeed, there is considerable ongoing debate among mouse geneticists regarding the feasibility of completely replacing ES cell-based techniques for generating genetically engineered mouse models with CRISPR/Cas9 technology.7 Importantly, we note that the key reason for the low efficiency of CRISPR/Cas9-mediated generation of floxed alleles is the methodology outlined by the initial landmark studies describing the use of CRISPR/Cas9 in mouse zygotes.6 Specifically, these studies described the use of two sgRNAs and two loxP-encoding single-stranded oligonucleotide donors (ssODNs) to generate floxed alleles. This approach has the following limitations: (i) high Cas9 cleavage activity combined with the use of two sgRNAs often results in the deletion of the entire genomic region flanked by the sgRNAs (Figure 1a (i)), (ii) homology-directed repair (HDR)-mediated incorporation of loxP-encoding ssODNs can occur on different alleles, leading to unusable loxP integration sites and potential false positives (Figure 1a (ii)) and (iii) the palindromic nature of loxP sequences often contributes to poor ssODN synthesis and frequent errors, especially when long gene-targeting oligonucleotides are synthesized (Figure 1a (iii), Supplementary Figure S1 and Supplementary Table S1).
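To make limitation (iii) concrete, the short Python sketch below checks the inverted-repeat structure of a loxP site. The letter does not spell out the sequence, so the sketch assumes the commonly cited 34 bp wild-type loxP site (two 13 bp arms flanking an 8 bp spacer). The helper names are illustrative only.

```python
# Assumption: the commonly cited 34 bp wild-type loxP site
# (13 bp arm + 8 bp spacer + 13 bp arm); it is not quoted in this letter.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def split_loxp(site: str = LOXP):
    """Split a 34 bp loxP site into (left arm, spacer, right arm)."""
    return site[:13], site[13:21], site[21:]


left_arm, spacer, right_arm = split_loxp()

# The two arms are reverse complements of each other (inverted repeats),
# so a single-stranded oligo carrying the site can fold back on itself.
arms_are_inverted_repeats = reverse_complement(left_arm) == right_arm

# The spacer is not self-complementary, which gives the site its orientation.
spacer_is_palindromic = reverse_complement(spacer) == spacer

print(left_arm, spacer, right_arm)
print("arms are inverted repeats:", arms_are_inverted_repeats)
print("spacer is palindromic:", spacer_is_palindromic)
```

The self-complementary arms are the feature that plausibly complicates synthesis of long loxP-encoding oligonucleotides; the sketch only demonstrates the sequence symmetry and does not model synthesis error rates.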

Figure 1

Schematic of loxP insertions using the ssODN or targeting vector method. (a) The ssODN loxP insertion method utilizes Cas9-mediated double-stranded breaks at two sgRNA positions flanking the targeted exon to stimulate the insertion of two loxP-encoding ssODNs. This method is prone to (i) the formation of a deleted allele, (ii) single loxP insertions and (iii) error-prone loxP insertions. (b) The targeting vector loxP insertion method utilizes Cas9-mediated double-stranded breaks at the two sgRNA positions to excise the intervening DNA sequence, resulting in the formation of a deleted allele intermediate. Before the deleted allele intermediate is repaired by NHEJ, the targeting vector with appropriate homology arms can introduce a floxed allele.

We report that these limitations can be overcome by the use of large targeting vectors to introduce loxP sequences. We found that this approach works more consistently and efficiently than ssODN-mediated incorporation of loxP sequences. Specifically, we have so far utilized large targeting vectors featuring 2 kb or more of homology arms and two sgRNA recognition sites (Figure 1b) to generate >10 unique floxed alleles across multiple genetic loci. With this approach, we take advantage of the high Cas9 cleavage activity to delete the genomic sequence between the two sgRNA recognition sites. This facilitates the replacement of the deleted sequence with a loxP-flanked allele via HDR-mediated incorporation of the large targeting vector. The use of large targeting vectors allows for comprehensive and reliable sequence validation prior to gene targeting and also minimizes the likelihood of single loxP insertions. In addition, we use targeting vectors in circular form without the need for selection markers and have observed almost no off-target integration of our vectors. Analogous to the generation of floxed alleles, we have also successfully knocked in >10 large DNA sequences, such as sequences encoding Cre recombinase or reverse tet transactivator (using either one or two sgRNAs).

Interestingly, even with similar experimental parameters, we observed variable targeting efficiencies across different gene loci when using large targeting vectors to generate genetically modified mice (Supplementary Table S2). Because high targeting efficiencies have been reported for knock-ins of large sequences into gene loci thought to be active during embryonic development, such as Nanog, Oct4 and Sox2,6 we hypothesized that chromatin accessibility and/or gene expression levels at the targeted locations during the one- to eight-cell stage of early embryonic development may significantly influence HDR-mediated introduction of DNA sequences. To test this hypothesis, we utilized RNA-seq and ATAC-seq data obtained during early embryonic development,8 and assessed the correlation between gene-targeting efficiency and chromatin accessibility and/or gene expression levels. This analysis revealed that gene-targeting efficiency showed a significant positive correlation with chromatin accessibility analyzed at the four-cell stage (likelihood ratio test; P=0.00017) and a significant negative correlation with RNA expression levels measured at the eight-cell stage (likelihood ratio test; P=0.00024). Predictions using chromatin accessibility and RNA expression data together explained more than half (55%) of the total variation in efficiency (logistic regression, P=7.7e−7). Our observations suggest that open chromatin structures are permissive for Cas9 activity and thereby facilitate HDR-mediated incorporation of large targeting vectors. Conversely, excessively high transcriptional activity at certain genomic loci, with its associated high density of mRNA, transcriptional regulators and RNA polymerases, may hinder Cas9 and HDR activity.
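The kind of analysis described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' pipeline: it assumes targeting outcomes are modeled as binomial counts (targeted pups out of pups screened) in a logistic regression, with a one-degree-of-freedom likelihood ratio test for each predictor. All values in `LOCI` are invented toy numbers, not data from Supplementary Table S2, and the function names are hypothetical.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_logistic(X, successes, trials, iterations=50):
    """Newton-Raphson fit of a binomial logistic regression.

    Returns (coefficients, log-likelihood up to an additive constant).
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iterations):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for x, k, n in zip(X, successes, trials):
            eta = sum(b * v for b, v in zip(beta, x))
            mu = 1.0 / (1.0 + math.exp(-eta))
            for i in range(p):
                grad[i] += (k - n * mu) * x[i]
                for j in range(p):
                    hess[i][j] += n * mu * (1.0 - mu) * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    loglik = 0.0
    for x, k, n in zip(X, successes, trials):
        eta = sum(b * v for b, v in zip(beta, x))
        loglik += k * eta - n * math.log1p(math.exp(eta))
    return beta, loglik


def lrt_pvalue_1df(loglik_full, loglik_reduced):
    """Likelihood ratio test for one dropped predictor (chi-square, 1 df)."""
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Toy data (invented): accessibility score, expression level,
# targeted pups, pups screened.
LOCI = [
    (0.2, 5.0, 1, 20), (0.5, 3.0, 3, 20), (0.8, 1.0, 8, 20), (1.0, 4.0, 5, 20),
    (1.5, 0.5, 12, 20), (1.2, 2.0, 9, 20), (0.3, 1.5, 3, 20), (0.9, 6.0, 2, 20),
]
successes = [k for _, _, k, _ in LOCI]
trials = [n for _, _, _, n in LOCI]

X_full = [[1.0, acc, expr] for acc, expr, _, _ in LOCI]
X_no_access = [[1.0, expr] for _, expr, _, _ in LOCI]
X_no_expr = [[1.0, acc] for acc, _, _, _ in LOCI]

beta_full, ll_full = fit_logistic(X_full, successes, trials)
_, ll_no_access = fit_logistic(X_no_access, successes, trials)
_, ll_no_expr = fit_logistic(X_no_expr, successes, trials)

stat_access, p_access = lrt_pvalue_1df(ll_full, ll_no_access)
stat_expr, p_expr = lrt_pvalue_1df(ll_full, ll_no_expr)

print("coefficients (intercept, accessibility, expression):", beta_full)
print("LRT accessibility: stat=%.3f p=%.3g" % (stat_access, p_access))
print("LRT expression:    stat=%.3f p=%.3g" % (stat_expr, p_expr))
```

Each test compares the full model with a reduced model that omits one predictor; a positive fitted coefficient for accessibility and a negative one for expression would correspond to the directions of association reported in the text.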

Importantly, we have incorporated our targeting data into an updatable predictive online tool (https://bioapps.wehi.edu.au/CrisperPredictor/Prediction/) (Supplementary Methods). The purpose of this tool is to provide researchers with a resource for identifying optimal loci within genes for targeting and to provide an indication of the success rate of future gene-targeting projects. The accuracy of this tool can be improved further with the incorporation of data collected from future targeting projects.

While ES cell-based gene targeting remains viable, generating mice via CRISPR/Cas9-mediated manipulation of the one-cell stage embryo has many advantages: (i) germ line transmission has always been observed in our models, (ii) gene targeting in one-cell stage embryos is relatively cheap and not labor intensive and (iii) previously genetically engineered mouse models of various genetic backgrounds can be manipulated directly. In conclusion, using CRISPR/Cas9 technology in one-cell stage embryos has proven to be extremely versatile and efficient in generating a variety of genetically modified mice.