GuideScan software for improved single and paired CRISPR guide RNA design

Journal name: Nature Biotechnology
Volume: 35
Pages: 347–349
DOI: 10.1038/nbt.3804

Abstract

We present GuideScan software for the design of CRISPR guide RNA libraries that can be used to edit coding and noncoding genomic regions. GuideScan produces high-density sets of guide RNAs (gRNAs) for single- and paired-gRNA genome-wide screens. We also show that the trie data structure of GuideScan enables the design of gRNAs that are more specific than those designed by existing tools.
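As a rough illustration of the trie indexing mentioned above (and outlined in Figure 1a), the Python sketch below enumerates targetable sequences next to a PAM on both strands, stores them in a trie with genomic coordinates at the leaves, and retrieves all indexed sequences within a mismatch bound of a query. It is a toy sketch under stated assumptions, not GuideScan's implementation: the names `targetable_sequences`, `build_trie`, and `neighbors` are hypothetical, the PAM is assumed to lie 3' of the gRNA, and only substitutions (Hamming distance) are considered.

```python
from dataclasses import dataclass, field

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def matches_pam(site: str, pam: str) -> bool:
    """True if site matches the PAM pattern; 'N' in the pattern matches any base."""
    return len(site) == len(pam) and all(p in ("N", s) for p, s in zip(pam, site))


def targetable_sequences(chrom, seq, pam="NGG", guide_len=20):
    """Yield (guide, (chrom, start, strand)) for each guide followed 3' by the PAM."""
    seq = seq.upper()
    n, span = len(seq), guide_len + len(pam)
    for strand, text in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(n - span + 1):
            guide = text[i:i + guide_len]
            if set(guide) <= set("ACGT") and matches_pam(text[i + guide_len:i + span], pam):
                # Report the guide's leftmost coordinate on the forward strand.
                start = i if strand == "+" else n - i - guide_len
                yield guide, (chrom, start, strand)


@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)
    sites: list = field(default_factory=list)  # non-empty only at leaf nodes


def build_trie(targets) -> TrieNode:
    """Index guides base by base; genomic coordinates are stored at the leaves."""
    root = TrieNode()
    for guide, site in targets:
        node = root
        for base in guide:
            node = node.children.setdefault(base, TrieNode())
        node.sites.append(site)
    return root


def neighbors(root, query, max_mismatches):
    """Return sorted (sequence, mismatches, sites) for indexed guides within the bound."""
    hits = []
    stack = [(root, 0, 0, "")]
    while stack:
        node, depth, mismatches, prefix = stack.pop()
        if depth == len(query):
            if node.sites:
                hits.append((prefix, mismatches, node.sites))
            continue
        for base, child in node.children.items():
            cost = mismatches + (base != query[depth])
            if cost <= max_mismatches:  # prune branches that exceed the bound
                stack.append((child, depth + 1, cost, prefix + base))
    return sorted(hits)
```

Because branches are abandoned as soon as the mismatch budget is exceeded, a search of this kind visits only the part of the index that could contain near matches, and every sequence it visits is checked exactly rather than heuristically.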

At a glance

Figures

  1. Figure 1: The GuideScan gRNA design tool.

    (a) Overview of GuideScan. Left, GuideScan takes as input a FASTA file containing any genome of choice. Middle, targetable sequences are defined by choosing the PAM sequence(s) (Cas9's canonical PAM, red; non-canonical PAM, blue), its position relative to the gRNA, and the length of the gRNA (gray box). Right, targetable sequences are indexed in a retrieval tree (trie), and associated information is stored at leaf nodes. R, trie root node. (b) Distributions of combined distance of flanking gRNA-pairs to the boundaries of selected noncoding genomic features using GuideScan (blue) or mit.edu genome-wide tracks (red). (c,d) Example deletions of genomic regions containing RNA (c) and DNA (d) noncoding elements using pairs of gRNAs designed by GuideScan. gRNA sequences, blue and red; PAM sequences, bold underlined. The predicted sequence after deletion, the sequences of three edited alleles, and a representative chromatogram are shown for each targeted locus.

  2. Figure 2: GuideScan correctly enumerates off-target sequences and filters out promiscuous gRNAs.

    (a) Number of murine gRNAs (20-mers) designed by each tool for a random sample of protein-coding genes, noncoding elements, and repetitive regions. Number of gRNAs with off-target sites at least two mismatches from the gRNA (black), a single mismatch away (white), and with perfect off-target sites (red). OT, off-target. (b) Number of perfect off-target sites for the gRNAs designed by each tool. Each dot represents a gRNA (mean, red line). (c) Cumulative distribution of specificity scores for the gRNAs designed by each tool. (d) T7 cleavage assay for gRNAs having a single (black, on-target site) or multiple (red, on-target site; blue, perfect off-target site) perfect matches in the genome. Position of the cleavage substrates, filled triangles; position of cleavage products, open triangles. Estimated total editing (TE) at each site is shown below the corresponding lane. (e) Left, schematic representation of the chromosomal locations of three perfect target sites of a gRNA labeled highly specific by competitor tools (mit.edu score = 78). Right, PCR-based identification of chromosomal translocations between all three targets. +, gRNA; –, empty plasmid. (f) Left, schematic representation of the position of three perfect target sites, all within chromosome 2, of a gRNA labeled highly specific by competitor tools (mit.edu score = 89). Genomic sequence: target sites, red; PAM sequence, bold. Right, PCR-based identification of chromosomal deletions between target sites. Position of the wild-type amplicon, filled triangle; position of deletion amplicons, open triangle. +, gRNA; –, empty plasmid. Gels in d–f were cropped from full-length versions shown in Supplementary Figure 2.

  3. Supplementary Figure 1: Output of GuideScan and competitor tools.

    (a) Genome-wide density of guides in GuideScan’s murine Cas9 database (blue), compared to the UCSC genome track from mit.edu (red). (b) Dot plot showing specificity scores and number of perfect off-target sites for promiscuous guides designed by the mit.edu gRNA web design tool. Data points are color-coded by the specificity score of the corresponding gRNA, following the guidelines of the mit.edu web portal (red = low specificity, score 1–19; yellow = medium specificity, score 20–49; green = high specificity, score 50–100).

  4. Supplementary Figure 2: Uncropped gel images.

    (a) Uncropped gel of the T7 cleavage assay shown in Figure 2d. The first lane shows the separation of the 1 kb+ (Invitrogen) molecular size ladder; the sizes of selected ladder bands are shown on the left. Numbers above the gel refer to the lanes on which T7 assays were run. Red and blue bars below the gel highlight assays unrelated to this work (red; lanes 1–4) and those shown in Figure 2 (blue; lanes 5–10). (b) Uncropped gel of the PCR shown in Figure 2f. (c) Uncropped gel of the PCRs shown in Figure 2e. Bars below the gels in panels b and c highlight the chromosomal region amplified.
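Figure 2a groups gRNAs by their closest off-target site, and Figure 2b counts perfect off-target sites per gRNA. A minimal Python sketch of that bookkeeping follows. It assumes a hypothetical input, `sites`, listing every targetable sequence in the genome once per genomic occurrence (including the intended target), and it uses a brute-force Hamming comparison purely for clarity. The function names are illustrative and do not come from any of the tools compared.

```python
from collections import Counter


def mismatch_profile(guide, sites):
    """Count genomic sites at each Hamming distance from the guide."""
    return Counter(
        sum(a != b for a, b in zip(guide, site))
        for site in sites
        if len(site) == len(guide)
    )


def perfect_off_targets(guide, sites):
    """Perfect matches beyond the single intended on-target site."""
    return max(mismatch_profile(guide, sites)[0] - 1, 0)


def closest_off_target_class(guide, sites):
    """Label a gRNA by its closest off-target, mirroring the three groups in Figure 2a."""
    profile = mismatch_profile(guide, sites)
    if perfect_off_targets(guide, sites) > 0:
        return "perfect"
    if profile[1] > 0:
        return "single mismatch"
    # Assumption: gRNAs with no off-target at all are also placed in this group.
    return "two or more mismatches"
```

A gRNA such as those in Figure 2e,f, with three perfect target sites, would be reported here as having two perfect off-targets regardless of any aggregate specificity score.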


Author information

  1. These authors contributed equally to this work.

    • Alexendar R Perez,
    • Yuri Pritykin &
    • Joana A Vidigal

Affiliations

  1. Computational Biology Program, Memorial Sloan Kettering Cancer Center, New York, New York, USA.

    • Alexendar R Perez,
    • Yuri Pritykin,
    • Sagar Chhangawala,
    • Lee Zamparo &
    • Christina S Leslie
  2. Cancer Biology and Genetics Program, Memorial Sloan Kettering Cancer Center, New York, New York, USA.

    • Alexendar R Perez,
    • Joana A Vidigal &
    • Andrea Ventura
  3. Weill Cornell Graduate School of Medical Sciences of Cornell University, New York, New York, USA.

    • Alexendar R Perez &
    • Sagar Chhangawala

Contributions

J.A.V., C.S.L., and A.V. conceived and supervised the project. Y.P. and A.R.P. developed the GuideScan algorithm with input from C.S.L.; A.R.P. and Y.P. implemented the GuideScan software package; A.R.P. performed the computational experiments; J.A.V. performed the wet-lab experiments; A.R.P. and S.C. implemented the web-server; L.Z. provided expertise in software development and helped improve the website user experience; J.A.V. drafted the manuscript with contributions from all authors.

Competing financial interests

The authors declare no competing financial interests.


Supplementary information

Supplementary Figures

  1. Supplementary Figure 1: Output of GuideScan and competitor tools. (602 KB)


  2. Supplementary Figure 2: Uncropped gel images. (232 KB)


PDF files

  1. Supplementary Text and Figures (689 KB)

    Supplementary Figures 1 and 2 and Supplementary Note 1

Excel files

  1. Supplementary Table 1 (67 KB)

    Off-target reporting by competitor tools for a subset of promiscuous gRNAs.

  2. Supplementary Table 2 (80 KB)

    Number of gRNAs with off-targets with 0 or 1 mismatches.

  3. Supplementary Table 3 (51 KB)

    Genomic coordinates used for the tool comparison experiment.

  4. Supplementary Table 4 (45 KB)

    Sequences of gRNAs and primers used in Fig. 2.

Zip files

  1. Supplementary Code (27,970 KB)

    Supplementary Code
