To the Editor:

Recently, Schaefer et al.1 reported the presence of more than a thousand genomic differences between mice that had been edited with Streptococcus pyogenes Cas9 at the zygote stage and a control mouse of the same strain, which the authors attributed to a Cas9-dependent activity. We feel that this conclusion was inappropriate, since there was no consideration that the variation could reflect the normal Mendelian segregation of pre-existing variants in these animals. Furthermore, the inferred behavior of Cas9 lies outside of its understood mechanism of action.

The authors propose, given the pattern of homozygosity and heterozygosity they describe, that the variations occurred within the first few divisions of the zygote as a result of the Cas9-mediated repair of the rd1 allele of Pde6b2, an ophthalmic disease target. The alternative explanation, namely that genetic variation was present in the FVB/NJ mouse lineage before the animals were treated, was not considered, on the assumption that the FVB/NJ line is homozygous at all sites. To examine whether this is true, we reprocessed the Schaefer et al.1 data (Fcon, F03 and F05) as well as the FVB/NJ reference genome data (Fref)3 using an adaptation of the GATK best practices workflow, and we compared these to the reference mm10 sequence (Supplementary Methods). All four FVB/NJ genomes exhibit extensive heterozygosity, despite these animals being inbred (Supplementary Tables 1 and 2; Supplementary Fig. 1). The FVB/NJ line is therefore far from universally homozygous. In the initial description of Fref, heterozygous SNVs were filtered out of the analysis3. However, when Wong et al.3 attempted to validate 231 nominally homozygous sites from the >5 million surviving SNVs (with respect to mm10), 4 proved to be heterozygous, a rate from which we might expect more than 50,000 heterozygous sites overall, consistent with our reanalysis (Supplementary Table 1).
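The extrapolation from the validation rate can be checked with a few lines of arithmetic. The sketch below is only an illustration using the figures quoted above; the function name is hypothetical, and 5,000,000 is taken as a lower bound for the ">5 million" surviving SNVs, so the result is a rough point estimate rather than a formal one.

```python
def expected_heterozygous_sites(n_het: int, n_validated: int, n_total: int) -> float:
    """Scale the heterozygous fraction seen in a validated sample up to all sites."""
    if n_validated <= 0:
        raise ValueError("n_validated must be positive")
    return n_het / n_validated * n_total


# Figures quoted in the text: 4 of 231 validated sites were heterozygous.
# 5_000_000 is an assumed lower bound for the ">5 million" surviving SNVs.
estimate = expected_heterozygous_sites(4, 231, 5_000_000)
print(f"Expected heterozygous sites: about {estimate:,.0f}")
```

With only four heterozygous calls in the validated sample, the estimate carries wide uncertainty, but its order of magnitude is what matters for the argument.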

The pattern of heterozygosity reveals that the three mice in Schaefer et al.1 are clearly more closely related to one another than any is to Fref. For example, of 96,243 Fcon sites with homozygous differences from Fref, 91,975 and 91,539 are homozygously present in F03 and F05, respectively (Supplementary Fig. 2). While the exact relationship between Fcon and F03 and F05 was not reported, the genetic divergence of these three animals does not appear to support the idea of a burst of induced variation, especially given the amount of variation already present in the FVB/NJ line. While there may be a small excess of variations induced by Cas9 on account of well-understood off-target mechanisms4, the large excess of background variation masks their detection at the whole-genome sequencing (WGS) level. What is required, and lacking, is the sequence of the parents of F03 and F05 as well as data processing that does not assume universal homozygosity.
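To put the site counts above on a common scale, they can be expressed as fractions of the Fcon total. This is a minimal sketch using only the numbers quoted in the text; the variable and function names are illustrative.

```python
# Counts quoted in the text (Supplementary Fig. 2 of the letter).
FCON_HOM_DIFF_SITES = 96_243                 # Fcon sites homozygously different from Fref
SHARED_HOM_SITES = {"F03": 91_975, "F05": 91_539}


def shared_fraction(shared: int, total: int) -> float:
    """Fraction of the control's homozygous differences also seen in an edited mouse."""
    return shared / total


fractions = {
    mouse: shared_fraction(count, FCON_HOM_DIFF_SITES)
    for mouse, count in SHARED_HOM_SITES.items()
}
for mouse, fraction in fractions.items():
    print(f"{mouse}: {fraction:.1%} of Fcon homozygous differences shared")
```

Both edited mice share roughly 95% of these sites with the control, which is the sense in which the three animals are closer to one another than to Fref.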

In an unrelated study in a different strain of mouse, WGS of unedited inbred (except for a small region around agouti) C57BL/6J littermates showed 985 sites of variation (SNVs and indels) between individuals5. While the number of segregating variants in C57BL/6 and FVB/N has not been independently compared, the 985 sites found in C57BL/6J littermates are similar in number to those found in the authors' FVB/NJ data set (696 SNVs and indels). A second unrelated study directly examined the effects of Cas9 editing (with intact Cas9 cleavase or Cas9D10A nickase) using WGS6. These authors saw significant genetic variation between individuals, but after comparison with a colony control and sibling genomes, they concluded the opposite of Schaefer et al.1, namely, that Cas9 induced no unexpected mutations.

Consistent with the evidence that the variation is simply segregating background variation, none of the illustrated edited sites in Schaefer et al.1 has sufficient target homology to the guide to support Watson–Crick gRNA–DNA base pairing1, which is a requirement for Cas9 activity4. Finally, of the variation present, there is an excess of transitions over transversions (Supplementary Table 3), which is a hallmark of naturally occurring variation7. Therefore, we feel that the main conclusions in Schaefer et al.1 are likely incorrect.
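For readers unfamiliar with the transition/transversion comparison, the sketch below shows one way such a ratio might be tallied from a list of SNVs. It is not the code used for Supplementary Table 3; the function name and the toy input are hypothetical. If all twelve possible substitutions were equally likely, transversions would outnumber transitions two to one (a ratio of 0.5), so a ratio well above that indicates an excess of transitions.

```python
from math import inf

BASES = set("ACGT")
# Transitions swap purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_counts(snvs):
    """Count transitions and transversions in an iterable of (ref, alt) base pairs."""
    transitions = transversions = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # skip indels, ambiguous bases and non-variant records
        if (ref, alt) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


def ti_tv_ratio(snvs):
    """Transition/transversion ratio; infinite if no transversions are observed."""
    transitions, transversions = ti_tv_counts(snvs)
    return transitions / transversions if transversions else inf


# Toy input for illustration only, not data from any of the studies discussed.
toy_snvs = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
print(ti_tv_counts(toy_snvs), ti_tv_ratio(toy_snvs))
```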

Data availability

The code used to analyze the data, along with the parameters used for variant calling, is available at