Predicting the mutations generated by repair of Cas9-induced double-strand breaks


The DNA mutation produced by cellular repair of a CRISPR–Cas9-generated double-strand break determines its phenotypic effect. It is known that the mutational outcomes are not random, but depend on DNA sequence at the targeted location. Here we systematically study the influence of flanking DNA sequence on repair outcome by measuring the edits generated by >40,000 guide RNAs (gRNAs) in synthetic constructs. We performed the experiments in a range of genetic backgrounds and using alternative CRISPR–Cas9 reagents. In total, we gathered data for >10^9 mutational outcomes. The majority of reproducible mutations are insertions of a single base, short deletions or longer microhomology-mediated deletions. Each gRNA has an individual cell-line-dependent bias toward particular outcomes. We uncover sequence determinants of the mutations produced and use these to derive a predictor of Cas9 editing outcomes. Improved understanding of sequence repair will allow better design of gene editing experiments.
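To make the notion of a microhomology-mediated deletion concrete, the following is a minimal Python sketch that enumerates deletions spanning a cut site where the same short sequence occurs on both sides of the break. It is an illustration only and not the predictor described in the paper: the function name `microhomology_deletions`, the thresholds `min_mh` and `max_del`, and the example sequence are all assumptions made for this sketch.

```python
def microhomology_deletions(seq, cut, min_mh=2, max_del=30):
    """List deletions spanning the cut that are flanked by microhomology.

    seq     -- target DNA, 5'->3' (hypothetical input, any A/C/G/T string)
    cut     -- break position: the cut lies between seq[cut-1] and seq[cut]
    min_mh  -- shortest repeat counted as microhomology (assumed threshold)
    max_del -- longest deletion considered (assumed threshold)

    Returns (product, deletion_length, microhomology_length) tuples,
    longest microhomology first.
    """
    best = {}
    for a in range(cut):  # start of the repeat copy left of the cut
        stop = min(len(seq), a + max_del + 1)
        for b in range(cut, stop):  # start of the copy right of the cut
            k = 0
            # Extend the match while the left copy stays left of the cut.
            while a + k < cut and b + k < len(seq) and seq[a + k] == seq[b + k]:
                k += 1
            if k < min_mh:
                continue
            # Joining at the repeat removes seq[a:b] and keeps one copy.
            product = seq[:a] + seq[b:]
            # Shifted sub-repeats give the same product; keep the longest.
            if product not in best or k > best[product][1]:
                best[product] = (b - a, k)
    rows = [(p, d, k) for p, (d, k) in best.items()]
    return sorted(rows, key=lambda r: (-r[2], r[1], r[0]))


if __name__ == "__main__":
    # "ACGG" occurs on both sides of the cut (marked |): TTACGGA|TCACGGC
    for row in microhomology_deletions("TTACGGATCACGGC", cut=7):
        print(row)
```

For the example sequence, the sketch reports a single candidate: a 7-bp deletion supported by the 4-bp repeat ACGG, leaving the product TTACGGC. It only lists candidate outcomes; it does not estimate how often each would occur, which is what a trained predictor would add.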


Figure 1: Mutational profiles generated by CRISPR–Cas9 and a method for their high-throughput measurement.
Figure 2: Synthetic mutational profiles are reproducible, specific to individual gRNAs and closely resemble endogenously measured profiles in human K562 cells.
Figure 3: Mutational profiles are diverse and biased in K562 cells, as measured using 6,568 gRNAs with a median 991 sequenced reads with mutations per target.
Figure 4: Local sequence context strongly influences editing outcomes in the explorative set of gRNA–target pairs.
Figure 5: Differences between editing outcomes in K562-Cas9 and other cell lines and effector proteins.
Figure 6: Accurate prediction of repair profiles.

Accession codes

Primary accessions

European Nucleotide Archive




Acknowledgements

We thank J. Eliasova for help with Figure 1, E. de Braekeleer from Wellcome Sanger Institute for providing the K562-Cas9 line, and A. Lawson for comments on the text. F.A. was supported by a Royal Commission for the Exhibition of 1851 Research Fellowship. L.P. was supported by Wellcome (206194) and the Estonian Research Council (IUT 34-4). H.P.H. was supported by a Wellcome Trust grant (200848/Z/16/Z) and a Wellcome Trust Strategic Award to the Cambridge Institute for Medical Research (100140). Y.G. is funded by Cancer Research UK C6/A18796 and Wellcome Trust Investigator Award 206388/Z/17/Z in the Jackson laboratory. F.M.M. was funded by a Marie Curie Intra-European Fellowship, project number 626375, DDR SYNVIA, and by Wellcome Trust Investigator Award 206388/Z/17/Z and an AstraZeneca Collaborative Award in the Jackson laboratory.

Author information




F.A.: designed experiments, analyzed data, wrote paper. L.C.: designed experiments, performed experiments, wrote paper. C.A.: performed experiments in human iPSCs. A.J.S., E.M.: performed experiments in mouse ESCs. V. Kleshchevnikov: analyzed data, wrote paper. A.K., V. Kiselev: created web server. P.D.A., P.P.: performed experiments. M.K., A.R.B.: generated TREX2 constructs. H.H.: generated CHO-Cas9 line. Y.G., F.M.-M., S.P.J.: generated RPE-1-Cas9 and HAP1-Cas9 lines. L.P.: designed experiments, contributed to data analysis, wrote paper. All authors contributed to drafting the manuscript.

Corresponding authors

Correspondence to Felicity Allen or Leopold Parts.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures: Supplementary Figures 1–27 and Supplementary Tables 1–6 (PDF 4737 kb)

Life Sciences Reporting Summary (PDF 397 kb)

Supplementary Data 1 (TXT 7714 kb)

Supplementary Data 2 (ZIP 33 kb)

Supplementary Data 3 (TXT 20 kb)

Supplementary Software (ZIP 156 kb)


Cite this article

Allen, F., Crepaldi, L., Alsinet, C. et al. Predicting the mutations generated by repair of Cas9-induced double-strand breaks. Nat Biotechnol 37, 64–72 (2019).
