Combinatorial mutagenesis en masse optimizes the genome editing activities of SpCas9

A Publisher Correction to this article was published on 23 July 2019

This article has been updated


The combined effect of multiple mutations on protein function is hard to predict; thus, the ability to functionally assess a vast number of protein sequence variants would be practically useful for protein engineering. Here we present a high-throughput platform that enables scalable assembly and parallel characterization of barcoded protein variants with combinatorial modifications. We demonstrate this platform, which we name CombiSEAL, by systematically characterizing a library of 948 combination mutants of the widely used Streptococcus pyogenes Cas9 (SpCas9) nuclease to optimize its genome-editing activity in human cells. The ease with which the editing activities of the pool of SpCas9 variants can be assessed at multiple on- and off-target sites accelerates the identification of optimized variants and facilitates the study of mutational epistasis. We successfully identify Opti-SpCas9, which possesses enhanced editing specificity without sacrificing potency and broad targeting range. This platform is broadly applicable for engineering proteins through combinatorial modifications en masse.

Fig. 1: Generation of a high-coverage library of combination mutants of SpCas9 and efficient delivery of the library to human cells.
Fig. 2: Strategy for the profiling of on- and off-target activities of SpCas9 variants in human cells.
Fig. 3: High-throughput profiling reveals the broad-spectrum specificity and efficiency of SpCas9 combination mutants.
Fig. 4: Heat maps depicting editing efficiency and epistasis for on- and off-target sites.
Fig. 5: Opti-SpCas9 exhibits robust on-target and reduced off-target activities.

Data availability

Source data for the count matrices determined for SpCas9 variants on the basis of pooled characterization that are shown in Fig. 3 are provided with the online version of this paper. GUIDE-seq data are available from the European Nucleotide Archive under accession PRJEB32521.

Code availability

The custom scripts for data analysis are available at

Change history

  • 23 July 2019

    An amendment to this paper has been published and can be accessed via a link at the top of the paper.


  1. Bornscheuer, U. T. et al. Engineering the third wave of biocatalysis. Nature 485, 185–194 (2012).
  2. Weinreich, D. M., Delaney, N. F., Depristo, M. A. & Hartl, D. L. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312, 111–114 (2006).
  3. Slaymaker, I. M. et al. Rationally engineered Cas9 nucleases with improved specificity. Science 351, 84–88 (2016).
  4. Kleinstiver, B. P. et al. High-fidelity CRISPR–Cas9 nucleases with no detectable genome-wide off-target effects. Nature 529, 490–495 (2016).
  5. Chen, J. S. et al. Enhanced proofreading governs CRISPR–Cas9 targeting accuracy. Nature 550, 407–410 (2017).
  6. Casini, A. et al. A highly specific SpCas9 variant is identified by in vivo screening in yeast. Nat. Biotechnol. 36, 265–271 (2018).
  7. Hu, J. H. et al. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature 556, 57–63 (2018).
  8. Packer, M. S. & Liu, D. R. Methods for the directed evolution of proteins. Nat. Rev. Genet. 16, 379–394 (2015).
  9. Romero, P. A. & Arnold, F. H. Exploring protein fitness landscapes by directed evolution. Nat. Rev. Mol. Cell Biol. 10, 866–876 (2009).
  10. Gasperini, M., Starita, L. & Shendure, J. The power of multiplexed functional analysis of genetic variants. Nat. Protoc. 11, 1782–1787 (2016).
  11. Fowler, D. M. & Fields, S. Deep mutational scanning: a new style of protein science. Nat. Methods 11, 801–807 (2014).
  12. Ma, S., Saaem, I. & Tian, J. Error correction in gene synthesis technology. Trends Biotechnol. 30, 147–154 (2012).
  13. Kosuri, S. & Church, G. M. Large-scale de novo DNA synthesis: technologies and applications. Nat. Methods 11, 499–507 (2014).
  14. Engler, C., Kandzia, R. & Marillonnet, S. A one pot, one step, precision cloning method with high throughput capability. PLoS ONE 3, e3647 (2008).
  15. Gibson, D. G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343–345 (2009).
  16. Trudeau, D. L., Smith, M. A. & Arnold, F. H. Innovation by homologous recombination. Curr. Opin. Chem. Biol. 17, 902–909 (2013).
  17. Wong, A. S., Choi, G. C., Cheng, A. A., Purcell, O. & Lu, T. K. Massively parallel high-order combinatorial genetics in human cells. Nat. Biotechnol. 33, 952–961 (2015).
  18. Wong, A. S. et al. Multiplexed barcoded CRISPR–Cas9 screening enabled by CombiGEM. Proc. Natl Acad. Sci. USA 113, 2544–2549 (2016).
  19. Cheng, A. A., Ding, H. & Lu, T. K. Enhanced killing of antibiotic-resistant bacteria enabled by massively parallel combinatorial genetics. Proc. Natl Acad. Sci. USA 111, 12462–12467 (2014).
  20. Doudna, J. A. & Charpentier, E. The new frontier of genome engineering with CRISPR–Cas9. Science 346, 1258096 (2014).
  21. Hsu, P. D., Lander, E. S. & Zhang, F. Development and applications of CRISPR–Cas9 for genome engineering. Cell 157, 1262–1278 (2014).
  22. Mali, P., Esvelt, K. M. & Church, G. M. Cas9 as a versatile tool for engineering biology. Nat. Methods 10, 957–963 (2013).
  23. Barrangou, R. & Horvath, P. A decade of discovery: CRISPR functions and applications. Nat. Microbiol. 2, 17092 (2017).
  24. Kim, S., Bae, T., Hwang, J. & Kim, J. S. Rescue of high-specificity Cas9 variants using sgRNAs with matched 5′ nucleotides. Genome Biol. 18, 218 (2017).
  25. Kulcsar, P. I. et al. Crossing enhanced and high fidelity SpCas9 nucleases to optimize specificity and cleavage. Genome Biol. 18, 190 (2017).
  26. Zhang, D. et al. Perfectly matched 20-nucleotide guide RNA sequences enable robust genome editing using high-fidelity SpCas9 nucleases. Genome Biol. 18, 191 (2017).
  27. Kato-Inui, T., Takahashi, G., Hsu, S. & Miyaoka, Y. Clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 with improved proof-reading enhances homology-directed repair. Nucleic Acids Res. 46, 4677–4688 (2018).
  28. Sternberg, S. H., LaFrance, B., Kaplan, M. & Doudna, J. A. Conformational control of DNA target cleavage by CRISPR–Cas9. Nature 527, 110–113 (2015).
  29. Singh, D. et al. Mechanisms of improved specificity of engineered Cas9s revealed by single-molecule FRET analysis. Nat. Struct. Mol. Biol. 25, 347–354 (2018).
  30. Fu, Y. et al. High-frequency off-target mutagenesis induced by CRISPR–Cas nucleases in human cells. Nat. Biotechnol. 31, 822–826 (2013).
  31. Lee, J. K. et al. Directed evolution of CRISPR–Cas9 to increase its specificity. Nat. Commun. 9, 3048 (2018).
  32. Haeussler, M. et al. Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol. 17, 148 (2016).
  33. Fu, Y., Sander, J. D., Reyon, D., Cascio, V. M. & Joung, J. K. Improving CRISPR–Cas nuclease specificity using truncated guide RNAs. Nat. Biotechnol. 32, 279–284 (2014).
  34. Vakulskas, C. A. et al. A high-fidelity Cas9 mutant delivered as a ribonucleoprotein complex enables efficient gene editing in human hematopoietic stem and progenitor cells. Nat. Med. 24, 1216–1224 (2018).
  35. Ran, F. A. et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature 520, 186–191 (2015).
  36. Zetsche, B. et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR–Cas system. Cell 163, 759–771 (2015).
  37. Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420–424 (2016).
  38. Nishida, K. et al. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science 353, aaf8729 (2016).
  39. Gaudelli, N. M. et al. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature 551, 464–471 (2017).
  40. Li, X. et al. Base editing with a Cpf1–cytidine deaminase fusion. Nat. Biotechnol. 36, 324–327 (2018).
  41. Honma, K. et al. RPN2 gene confers docetaxel resistance in breast cancer. Nat. Med. 14, 939–948 (2008).
  42. Kampmann, M., Bassik, M. C. & Weissman, J. S. Functional genomics platform for pooled screening and generation of mammalian genetic interaction maps. Nat. Protoc. 9, 1825–1847 (2014).
  43. Olson, C. A., Wu, N. C. & Sun, R. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr. Biol. 24, 2643–2651 (2014).
  44. Aakre, C. D. et al. Evolving new protein–protein interaction specificity through promiscuous intermediates. Cell 163, 594–606 (2015).
  45. Guschin, D. Y. et al. A rapid and general assay for monitoring endogenous gene modification. Methods Mol. Biol. 649, 247–256 (2010).
  46. Tsai, S. Q. et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR–Cas nucleases. Nat. Biotechnol. 33, 187–197 (2015).
  47. Tsai, S. Q., Topkar, V. V., Joung, J. K. & Aryee, M. J. Open-source guideseq software for analysis of GUIDE-seq data. Nat. Biotechnol. 34, 483 (2016).


We thank members of the Wong lab for helpful discussions, and Z. Dong, L. Qin, N. Shirgaonkar and L. Pardeshi from the Genomics, Bioinformatics and Single Cell Analysis Core of the Faculty of Health Sciences at the University of Macau for their technical support. We thank J. Chan for support at the High Performance Computing Cluster (HPCC) of ICTO of the University of Macau. We thank T. Ochiya for OVCAR8-ADR cells. We thank the Faculty Core Facility at the LKS Faculty of Medicine of The University of Hong Kong for providing and maintaining the equipment needed for flow cytometry analysis and cell sorting. This work was supported by The University of Hong Kong start-up and internal funds, the Croucher Foundation Start-up Allowance and the Hong Kong Research Grants Council (ECS-27105716, GRF-17104619 and TRS-T12-710/16-R) (to A.S.L.W.); the Swedish Research Council (2016-02830) and the National Natural Science Foundation of China (81672098) (to Z.Z.); and the Science and Technology Development Fund of Macau S.A.R. (FDCT 085/2014/A2), the Research Services and Knowledge Transfer Office of the University of Macau (MYRG2016-00211-FHS and MYRG2018-00017-FHS), and the Start-up fund from the Faculty of Health Sciences, University of Macau (to K.H.W. and K.T.).

Author information




G.C.G.C. and A.S.L.W. conceived the work. G.C.G.C., P.Z., C.T.L.Y., B.K.C.C., F.X., D.T. and A.S.L.W. designed and performed the experiments and interpreted and analyzed the data. G.C.G.C., C.T.L.Y., K.T., K.H.W. and A.S.L.W. performed computational analyses on next-generation sequencing data for CombiSEAL experiments. G.C.G.C., P.Z., A.S.L.W., S.B., H.Y.C. and Z.Z. performed GUIDE-seq experiments and analyzed the data. G.C.G.C. and A.S.L.W. wrote the paper.

Corresponding author

Correspondence to Alan S. L. Wong.

Ethics declarations

Competing interests

A.S.L.W. and G.C.G.C. have filed a patent application that is based on this work.

Additional information

Peer review information: Lei Tang and Nicole Rusk were the primary editors on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Examples of strategies for characterizing combinatorial mutations on a protein sequence.

Supplementary Figure 2 Strategy for seamless assembly of the barcoded combination mutant library pool.

a, To create barcoded DNA parts in storage vectors, genetic inserts were generated by PCR or synthesis and cloned by Gibson assembly into storage vectors harboring a random barcode (pAWp61 and pAWp62; digested with EcoRI and BamHI). BsaI digestion was performed to generate the barcoded DNA parts (that is, P1, P2,…, P(n)). BbsI sites and a primer-binding site for barcode sequencing were introduced between the insert and the barcode for pAWp61 and pAWp62, respectively. b, To create the barcoded combination mutant library, the pooled DNA parts and destination assembly vectors were digested with BsaI and BbsI, respectively. A one-pot ligation created a pooled vector library, which was then iteratively digested and ligated with the subsequent pool of DNA parts to generate higher-order combination mutants. The barcoded inserts were linked through compatible overhangs that originate from the protein-coding sequence after digestion with type IIS restriction enzymes (that is, BsaI and BbsI), so that no fusion scar is formed in the ligation reactions. All barcodes were localized into a contiguous stretch of DNA. The final combination mutant library was encoded in lentiviruses and delivered into targeted human cells. The integrated barcodes representing each combination were amplified from the genomic DNA within the pooled cell populations in an unbiased fashion and quantified using high-throughput sequencing to identify shifts in representation under different experimental conditions. c, High correlations between barcode representations (normalized barcode counts obtained from a single set of experiments) within the plasmid and infected OVCAR8-ADR cell pools indicate efficient lentiviral delivery of the library into human cells. Barcode representations were also highly reproducible between two biological replicates of OVCAR8-ADR cells infected with the library. R is Pearson's r.
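The comparison in panel c can be illustrated with a minimal sketch. It assumes that "normalized barcode counts" means each barcode's share of the total reads in its pool, which the legend does not spell out. The function names and the example counts are hypothetical and are not taken from the authors' scripts.

```python
import math

def normalize(counts):
    """Convert raw barcode read counts to fractions of the pool total
    (an assumed normalization; the legend does not define it)."""
    total = sum(counts)
    return [c / total for c in counts]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical read counts for the same four barcodes in two pools.
plasmid_pool = [100, 200, 300, 400]
infected_cells = [12, 19, 33, 38]
r = pearson_r(normalize(plasmid_pool), normalize(infected_cells))
```

An r close to 1 would indicate that barcode representation in the infected cells tracks the plasmid pool, which is the reading the legend gives for efficient delivery.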

Supplementary Figure 3 Fluorescence-activated cell sorting of SpCas9 library-infected human cells harboring on- and off-target reporters.

OVCAR8-ADR reporter cell lines that express RFP and GFP genes driven by UBC and CMV promoters, respectively, and a tandem U6 promoter-driven expression cassette of gRNA targeting the RFP site (RFPsg5 or RFPsg8) were either uninfected or infected with the SpCas9 library. RFPsg5-ON and RFPsg8-ON lines harbor sites that match the gRNA sequence completely, while RFPsg5-OFF5-2 and RFPsg8-OFF5 lines contain synonymous mutations in RFP and are mismatched to the gRNA. Cells were sorted by flow cytometry into bins, each encompassing ~5% of the population with low RFP fluorescence. These experiments were repeated independently twice with similar results.

Supplementary Figure 4 Positive correlation between enrichment score determined from the pooled screen and individual validation data.

The normalized log2(E) for each SpCas9 combination mutant is the mean score determined from the pooled screens in two biological replicates, and the normalized RFP disruption value is the mean cell percentage with depleted RFP level when compared to WT determined from three biological replicates. R is the Pearson’s r.

Supplementary Figure 5 Heatmaps depicting editing efficiency for the on- and off-target sites.

Editing efficiency was measured by the log-transformed enrichment ratio (log2(E)) determined for each SpCas9 combination mutant. Enriched and depleted mutants have log2(E) > 0 and log2(E) < 0, respectively. To aid visualization, amino acid residues that are predicted to make contacts with the target DNA strand or that are located in the linker region connecting SpCas9’s HNH and RuvC domains are grouped on the y-axis, while those predicted to make contacts with the non-target DNA strand are presented on the x-axis. Combinations with no enrichment are indicated in gray.
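To make the sign convention concrete, the following sketch computes a log2 enrichment score per barcode. The legend does not define E, so the definition used here is an assumption: E is the barcode's read fraction in the sorted (low-RFP) bin divided by its read fraction in the unsorted population, with a pseudocount to avoid division by zero. The function name, the pseudocount and the example counts are hypothetical.

```python
import math

def log2_enrichment(sorted_counts, unsorted_counts, pseudocount=1.0):
    """Return {barcode: log2(E)} under an ASSUMED definition of E:
    read fraction in the sorted bin over read fraction in the unsorted pool."""
    total_sorted = sum(sorted_counts.values())
    total_unsorted = sum(unsorted_counts.values())
    scores = {}
    for barcode, unsorted in unsorted_counts.items():
        frac_sorted = (sorted_counts.get(barcode, 0) + pseudocount) / total_sorted
        frac_unsorted = (unsorted + pseudocount) / total_unsorted
        scores[barcode] = math.log2(frac_sorted / frac_unsorted)
    return scores

# Hypothetical read counts for two barcoded variants.
scores = log2_enrichment(
    sorted_counts={"variant_A": 300, "variant_B": 100},
    unsorted_counts={"variant_A": 100, "variant_B": 300},
)
```

Under this assumed definition, variant_A is over-represented in the sorted bin and scores above zero, while variant_B is depleted and scores below zero, matching the convention in the legend.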

Supplementary Figure 6 Frequency of N20-NGG and G-N19-NGG sites in the reference human genome.

Custom Python code was used to count the occurrences of N20-NGG and G-N19-NGG sites on both strands of the reference human genome (hg19), as estimates of the targeting ranges of Opti-SpCas9 and of other engineered SpCas9 variants (eSpCas9(1.1), SpCas9-HF1, HypaCas9 and evoCas9), respectively. N20-NGG sites are about 4.3 times more frequent than G-N19-NGG sites in the human genome.
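The authors' script is not reproduced here. As a rough illustration of the kind of count described, the sketch below scans one sequence on both strands with regular expressions. It assumes that overlapping sites are counted separately and that sites containing ambiguous bases (N) are skipped. All names are hypothetical, and a real genome-wide run would stream each hg19 chromosome through the same function.

```python
import re

# Zero-width lookaheads so that overlapping sites are each counted.
N20_NGG = re.compile(r"(?=[ACGT]{20}[ACGT]GG)")
G_N19_NGG = re.compile(r"(?=G[ACGT]{19}[ACGT]GG)")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (N is left unchanged)."""
    return seq.translate(_COMPLEMENT)[::-1]

def count_sites(seq):
    """Count N20-NGG and G-N19-NGG sites on both strands of seq."""
    seq = seq.upper()
    counts = {"N20-NGG": 0, "G-N19-NGG": 0}
    for strand in (seq, reverse_complement(seq)):
        counts["N20-NGG"] += sum(1 for _ in N20_NGG.finditer(strand))
        counts["G-N19-NGG"] += sum(1 for _ in G_N19_NGG.finditer(strand))
    return counts

# A 23-nt toy sequence: a protospacer starting with G, followed by a TGG PAM.
toy = "G" + "A" * 19 + "TGG"
toy_counts = count_sites(toy)
```

Every G-N19-NGG site is also an N20-NGG site, so the second count can never exceed the first; the ratio of the two counts is the quantity the legend reports.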

Supplementary Figure 7 Summary of T7 endonuclease I (T7E1) assay results for DNA mismatch cleavage in OVCAR8-ADR cells.

Cells were infected with an SpCas9 variant and the indicated gRNA, and genomic DNA was collected for the T7E1 assay 11 to 16 days post-infection. Indel quantification for the infected samples is displayed as a bar graph.

Supplementary Figure 8 Expression of SpCas9 variants in OVCAR8-ADR cells.

Cells were infected with lentiviruses encoding WT SpCas9, Opti-SpCas9, eSpCas9(1.1), HypaCas9, SpCas9-HF1, Sniper-Cas9, evoCas9, xCas9, or OptiHF-SpCas9. Protein lysates were extracted for Western blot analysis and immunoblotted with anti-SpCas9 antibodies. Beta-actin was used as a loading control. We failed to detect the expression of SpCas9-HF1 and xCas9 in OVCAR8-ADR cells, which could be due to their non-optimized sequence for expression in mammalian cells (ref. 24 and Nat. Biotechnol. 36, 888–893, 2018); SpCas9-HF1 and xCas9 were therefore not included in other activity assays. These experiments were repeated independently three times with similar results.

Supplementary Figure 9 Evaluation of the editing efficiency of SpCas9 variants with gRNAs bearing or lacking an additional mismatched 5’ guanine (5’G) using GFP disruption assay.

OVCAR8-ADR cells expressing WT SpCas9, Opti-SpCas9, eSpCas9(1.1), or HypaCas9 were infected with lentiviruses encoding gRNAs carrying or lacking an additional mismatched 5’G. Editing efficiency was measured by cell percentage with depleted GFP level using flow cytometry. Values and error bars reflect the mean and s.d. of four independent biological replicates.

Supplementary Figure 10 Opti-SpCas9 exhibits reduced off-target activity when compared to wild-type SpCas9.

Assessment of off-target editing by SpCas9 variants with the VEGFA site 3 or DNMT1 site 4 gRNA at eight endogenous loci. The percentage of indels was measured using the T7E1 assay and averaged from three independent experiments. A dash indicates none detected. Specificity of WT SpCas9 and its variants with the VEGFA site 3 gRNA at OFF1 loci is plotted as the ratio of on-target to off-target activity (on-target activity data were obtained from Supplementary Fig. 7).

Supplementary Figure 11 Characterization of SpCas9 variants for editing target sites harboring sequences that are perfectly matched with the gRNA’s spacer or contain mismatch(es) using GFP disruption assay.

OVCAR8-ADR cells expressing WT SpCas9, Opti-SpCas9, eSpCas9(1.1), or HypaCas9 were infected with lentiviruses encoding gRNAs carrying no or one- to four-base mismatch(es) against the target. Editing efficiency was measured by cell percentage with depleted GFP level using flow cytometry. Values and error bars reflect the mean and s.d. of three independent biological replicates.

Supplementary Figure 12 On-target editing activity of SpCas9 variants using truncated gRNAs.

a,b, OVCAR8-ADR cells expressing WT SpCas9, Opti-SpCas9, eSpCas9(1.1), or HypaCas9 were infected with lentiviruses encoding gRNAs of varied length (17 to 19 nucleotides) targeting the GFP sequence (a) and endogenous loci (b). Editing efficiency was measured by cell percentage with depleted GFP level using flow cytometry (a) and T7E1 assay (b). The list of gRNA sequences used is presented in Supplementary Table 5. For (a), values and error bars reflect the mean and s.d. of four independent biological replicates.

Supplementary information

Supplementary Information

Supplementary Figs. 1–12

Reporting Summary

Supplementary Table 1

List of SpCas9 combination mutants that were generated and tested.

Supplementary Table 2

Enrichment scores determined for SpCas9 variants on the basis of pooled characterization.

Supplementary Table 3

A table comparing the on- and off-target activities of SpCas9 variants.

Supplementary Table 4

List of constructs used in this work.

Supplementary Table 5

List of gRNA protospacer sequences used in this study.

Supplementary Table 6

List of reporter cell lines used in this work.

Supplementary Table 7

List of primers and PCR conditions used for the T7E1 assay.

Supplementary Table 8

Adaptor and primer sequences for GUIDE-seq.

Source data

Source Data

Source Data for Fig. 3a.

About this article

Cite this article

Choi, G.C.G., Zhou, P., Yuen, C.T.L. et al. Combinatorial mutagenesis en masse optimizes the genome editing activities of SpCas9. Nat Methods 16, 722–730 (2019).
