In 2017, Nature Methods published a peer-reviewed paper (Schaefer et al.) in which the authors reported that CRISPR–Cas9 caused unexpected off-target changes in mice. Since its publication, we have been contacted by many scientists challenging this work. Five of these critiques are now published, as peer-reviewed Correspondences, in this issue.

As the authors of these critiques point out, and as confirmed by four independent referees who evaluated both the critiques and the original authors' response to them, the study by Schaefer et al. lacked key controls, so it is not possible to ascribe the observed genomic variants to CRISPR with reasonable confidence. The paper has now been retracted to maintain the accuracy of the published record.

The authors made their observation as part of a study in which they successfully used CRISPR to correct a genetic mutation in inbred FVB/NJ mice. As a follow-up, the authors examined off-target changes by sequencing the genomes of two CRISPR-edited animals. In contrast to much other work assessing CRISPR off-targets in vivo, they did not examine only predicted sites, but looked at the entire genome. In comparison to a colony control animal of the same inbred strain, the authors observed numerous unexpected genomic differences in the CRISPR-treated animals, and they attributed these changes to CRISPR.

The parents of the CRISPR-treated animals, and the colony control animal, were directly purchased by the authors from the JAX genetic quality control program in an attempt to control for background variation. But the exact genetic relationship between the colony control and the CRISPR-treated animals is unknown; they were not siblings. In addition, the parents of the CRISPR-treated animals were not sequenced, and the two CRISPR-treated animals were not independently derived. The level of background genetic variation, in the FVB/NJ strain in general and in the mice in the authors' laboratories in particular, is not known.

Without a more direct assessment of the background variation in the animals used for the experiment, it is not possible to determine whether the variants reported by Schaefer et al. represent off-target effects of CRISPR or simply reflect variation already present in the background of these mice. The central claim of the paper is therefore not sufficiently supported by the data. This is the reason for retraction of the paper.

The original paper was peer reviewed, but we should have sought at least one additional referee with expertise in the genetics of inbred mouse strains. We regret this omission. While ensuring appropriate referee expertise is a task we have always taken seriously, and is a central part of the editorial process, we have now put in place further processes to reduce the likelihood that such an error will happen again.

The five critiques of Schaefer et al., most of which included re-analysis of the data, as well as the response to these critiques from the original authors, went through multiple rounds of additional peer review during the months following publication. While we aim to prioritize our handling of challenges to previous work, we also need to allow enough time for the issues to be examined thoroughly in such situations, with further data or analysis if needed.

During this process, we rapidly posted an editorial note to inform readers that questions had been raised about the work, and then published an editorial expression of concern once it became clear that there were genuine reasons to doubt the central message of the work. We finally concluded, on the basis of comments from four independent referees and our further discussions with referees and authors, that there was insufficient data to support the main claim of the paper. We decided to publish the critiques but not the authors' response, because the latter did not resolve the central criticism: that the study lacked controls needed to ascribe a causal role to CRISPR.

There has also been criticism leveled against this work regarding the small number of animals used, the single gRNA examined, and the method of Cas9 and donor-construct delivery. We recognize that these are weaknesses, and we knew about them when we accepted the paper for publication. We note that these facts were reported in Schaefer et al. and that the authors did not claim that their observations were general. The paper was published as a Correspondence rather than as a full research paper. The intention was to report a single observation, along the lines of a medical case study, of sufficient potential importance that the community should be alerted to it.

Indeed, although controls were lacking to support a causal role for CRISPR even for this single observation, the work of Schaefer et al. highlights limitations in the current literature that should be considered.

There is relatively little published data on genome-wide effects of in vivo CRISPR treatment. Most studies of off-target changes in CRISPR-treated organisms are not agnostic; they examine genomic sites that are algorithmically predicted to harbor off-target sequences. While this is in keeping with the known mechanism of Cas9, the enzyme could, at least in principle, have unpredicted effects on the in vivo genome. In previous work that has examined the whole genome, including a paper published in our pages that came to the opposite conclusion from Schaefer et al., the number of animals sequenced was similarly small. Although two recent studies used experimental whole-genome methods to examine CRISPR-edited human embryos and, encouragingly, reported no significant off-target changes, a definitive understanding of the in vivo genomic effects of CRISPR will need more data.

On the question of whether CRISPR can be safely used in vivo, the stakes are high for many. But for none are they higher than for the people in whom this technology may be used in the future. They are owed a careful and rigorous answer.