Abstract
Genome-wide CRISPR-based knockout (CRISPR-KO) screening is an emerging technique that enables systematic genetic analysis of a cellular or molecular phenotype of interest. Continuous improvements, such as modifications to the guide RNA (gRNA) scaffold and the development of gRNA on-target prediction algorithms, have since been made to increase screening performance. We compared the performance of three available second-generation human genome-wide CRISPR-KO libraries that included at least one of the improvements, and examined the effect of gRNA scaffold, number of gRNAs per gene and number of replicates on screen performance. We identified duplicated screens using a library with 6 gRNAs per gene as providing the best trade-off. Despite the improvements, we found that each improved library still has library-specific false negatives and, for the first time, estimated the false negative rates of CRISPR-KO screens, which are between 10% and 20%. Our newly-defined optimal screening parameters would be helpful in designing screens and constructing bespoke gRNA libraries.
Introduction
The discovery of the bacterial adaptive immune system, namely clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system1,2,3,4,5, and its subsequent development6,7,8,9 into a genome editing tool have made possible the precise targeting and modification of DNA, including in mammalian cells. In addition to genome editing, the high versatility of the CRISPR-Cas system has led to the development of various derivative tools for epigenome10,11,12 and base editing13, 14, as well as transcriptional activation/silencing15,16,17,18.
We and others have exploited the highly consistent editing efficiency of the CRISPR-Cas system and the scalable nature of gRNA vector production, and developed genome-wide CRISPR-knockout (KO) screening19,20,21. In CRISPR-KO screening, a pool of synthetic single guide RNAs (sgRNAs, or simply gRNAs) that targets every gene in the genome under interrogation is transduced into the cells under study, usually at the stoichiometry of a single gRNA per cell. The final cell population that exhibits the phenotype of interest is then recovered and the integrated gRNAs, which reveal the identity of the knocked-out genes, are sequenced in parallel. Analysis of the differences in the abundance of gRNAs between the controls and the treated cells then unveils the relationship between genes and the phenotype of interest22.
Several factors need to be considered when constructing CRISPR gRNA libraries for effective genome-wide CRISPR-KO screening. First and foremost is the selection of gRNAs to be included in a library, as pooled screening requires the gRNAs to be effective in knocking out only the target genes without affecting other loci which may share high sequence similarity. Various algorithms for selecting suitable gRNAs with minimal off-target activity have been reported23,24,25,26,27. In addition to off-target effects, it is now evident that different gRNAs exhibit varying KO efficiency due to different on-target cutting efficiency20, 28,29,30,31 and/or DNA repair outcome32. Lastly, genetic variations present in individuals or cell lines may also affect on-target activity. It is therefore important to choose gRNAs that target common sequences within the human population and to ensure that every gene has a fair chance of being knocked out in various human cell lines.
In this study, we seek to examine the vital design parameters that affect screening efficiency. We first compared the performance of three second-generation human genome-wide CRISPR libraries that employed different gRNA selection strategies, and used the results to arrive at an estimate of the false negative rate of CRISPR-KO screens. We then proceeded to determine the optimum number of gRNAs per gene and the number of replicates that provide the best trade-off for genome-wide CRISPR-KO screens.
Results
Screening performance of second-generation CRISPR libraries
To date, 7 genome-wide CRISPR gRNA libraries targeting human genes have been described27, 31, 33,34,35,36,37. Of these, 4 libraries were designed and constructed with features that improve gRNA efficacy, thereby increasing screen performance27, 31, 36, 37. In this study, we refer to these 4 libraries as second-generation CRISPR libraries (Table 1 and Fig. 1). Apart from differences in the constituent gRNAs and a small variation in the human genes targeted (Fig. 1), two other major differences among these libraries are (1) the use of on-target prediction for gRNA selection20, 27 and (2) the gRNA scaffold38 used. The number of gRNAs targeting each gene also differs among the 4 libraries (Table 1). We directly compared the performance of 3 second-generation libraries in an identical condition to examine the effect of these major differences on screen outcomes.
To eliminate any bias that might be caused by differences in the lentiviral vector backbone used to generate each library, we re-constructed the complete Brunello27 and half of the Whitehead36 libraries on our lentiviral backbone (see Methods and Fig. 2a). Doing so also ensured that the three libraries had a similar level of complexity (77 k–92 k gRNAs), which allowed us to perform screens in an identical experimental setting. Screens were performed in the colon cancer cell line HT-29 Cas9#3 clone37 in technical triplicate and the gRNA abundance at 16 days post transduction was analysed (Fig. 2a). For meaningful comparisons we subsampled the read counts for 8,948 genes common to the three reconstructed libraries (Fig. 2a; read counts in Supplementary Table S1) in all our subsequent analyses. We obtained the false discovery rate (FDR) for these genes with MAGeCK39 using the read counts from all three replicates, and separately calculated their respective gene-level fold change (FC).
The receiver operating characteristic (ROC) curves of all three libraries showed nearly identical performance in discriminating the essential from non-essential genes (Fig. 2b). We however found that the FCs in the Human v1 library exhibited a greater reduction than those in the other two libraries (Fig. 2c; Kolmogorov-Smirnov test, p = 2.8 × 10^−7 and 5.3 × 10^−9, against the Brunello and Whitehead libraries, respectively, and Fig. 2d). When the gene-level statistical results returned by MAGeCK were compared, the Whitehead library detected the largest number of significantly depleted genes (Fig. 2e). This is likely due to the fact that the Whitehead library comprises 10 gRNAs per gene, considerably more than the Brunello (4 gRNAs per gene) and the Human v1 (5 gRNAs per gene) libraries.
We then integrated the FDR and FC values and counted the number of genes depleted at different FDR and FC cut-offs. The differences between the libraries immediately became more apparent (Fig. 2f). Nearly all genes with FC ≤ 0.5 in the Whitehead library were below 1% FDR, but the fraction of such genes is lower at 62% and 45% in the Human v1 and Brunello libraries, respectively. Although the Whitehead library detected the largest number of genes, approximately 45% of the genes detected below FDR 20% showed a smaller depletion (FC > 0.5). The fraction of such genes was smaller in the Human v1 (23.5%) and Brunello (22.6%) libraries. When the Whitehead library read count data were separated into two subsets (sets A and B, as described36), each of the 5 gRNAs-per-gene subsets detected comparable numbers of depleted genes and showed similar trends as the Human v1 and Brunello libraries (Fig. 2f). These results indicate that these second-generation CRISPR libraries show comparable screening performance and identify similar numbers of depleted genes when the number of gRNAs per gene is similar. It is however worth noting that the fraction of genes with FDR ≤ 1% is largest in the Human v1 library.
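The tally behind this kind of comparison can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function name `tally_depleted` and the toy gene table are hypothetical, and it assumes that each gene already has a MAGeCK FDR and a gene-level FC.

```python
def tally_depleted(genes, fdr_cutoff=0.2, fc_cutoff=0.5):
    """Split genes passing the FDR cut-off by the size of their depletion.

    genes: dict mapping gene name -> (fdr, fold_change).
    Returns (strong, weak): genes with FC <= fc_cutoff and with FC > fc_cutoff.
    """
    strong, weak = [], []
    for name, (fdr, fc) in genes.items():
        if fdr > fdr_cutoff:
            continue  # not significant at this cut-off
        (strong if fc <= fc_cutoff else weak).append(name)
    return sorted(strong), sorted(weak)


# Toy values for illustration only (not data from the screens).
toy = {
    "GENE_A": (0.001, 0.20),
    "GENE_B": (0.150, 0.45),
    "GENE_C": (0.050, 0.80),
    "GENE_D": (0.600, 0.30),
}
strong, weak = tally_depleted(toy)
# Fraction of significant genes that show only a smaller depletion (FC > 0.5).
weak_fraction = len(weak) / (len(strong) + len(weak))
```

Repeating the tally over several FDR cut-offs gives the kind of breakdown summarised in Fig. 2f.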
Concordance of depleted genes detected and false negative rate estimation
We next analysed the overlaps of genes that satisfied both cut-off criteria (FDR ≤ 20% and FC ≤ 0.5) in the Human v1, Brunello and the two Whitehead subset libraries (Fig. 3a). A total of 955 non-redundant genes were detected by these 4 libraries. Of these, 718 genes (75.1%) were detected by 2 or more libraries, which we dubbed “high-confidence genes”. Within this high-confidence set, 317 genes (44.2%) were detected by all 4 libraries. The concordance between any two given libraries was similar, at between 55% and 60%, but the Human v1 library consistently detected more genes than the other libraries (Fig. 3b). As demonstrated using the genes detected by the Human v1 library, the FCs and FDRs tended to be greater if the genes were detected by more libraries (Fig. 3c). In each library, library-specific hits accounted for approximately 10% of the total hits and they had a lower FC. These trends were consistent across all gRNA libraries tested (Supplementary Fig. S1a).
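The definition of the high-confidence set can be expressed compactly. The sketch below is illustrative only: the library and gene names are placeholders, and the helper names `genes_by_support` and `high_confidence` are hypothetical.

```python
from collections import Counter


def genes_by_support(hits_by_library):
    """Count, for each gene, the number of libraries that detected it."""
    counts = Counter()
    for hits in hits_by_library.values():
        counts.update(set(hits))
    return counts


def high_confidence(hits_by_library, min_libraries=2):
    """Genes detected by at least `min_libraries` libraries."""
    counts = genes_by_support(hits_by_library)
    return {gene for gene, n in counts.items() if n >= min_libraries}


# Placeholder hit lists (genes passing FDR <= 20% and FC <= 0.5 in each library).
toy_hits = {
    "lib1": {"G1", "G2", "G3"},
    "lib2": {"G1", "G2"},
    "lib3": {"G1", "G4"},
    "lib4": {"G1", "G2", "G5"},
}
confident = high_confidence(toy_hits)                           # 2 or more libraries
common = high_confidence(toy_hits, min_libraries=len(toy_hits)) # all libraries
library_specific = {g for g, n in genes_by_support(toy_hits).items() if n == 1}
```

In this toy example G1 and G2 form the high-confidence set, G1 alone is common to all libraries, and G3, G4 and G5 are library-specific hits.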
One important question that remains unanswered in CRISPR-KO screening is the false negative (FN) rate. As we do not know the phenotype of every gene KO in a given cell type, theoretically it is not possible to measure the true FN rate of CRISPR-KO screens. We sought to address this issue by utilising the high-confidence gene set we defined above to arrive at an estimate. Among the 718 high-confidence genes, the number of genes missed by each library was 102, 163, 158 and 143 for the Human v1, Brunello, Whitehead_A and Whitehead_B libraries, respectively (Fig. 3a), resulting in observed FN rates of between 14% and 23%. Among the FN genes of each library, the GO terms related to cell survival and proliferation were highly enriched (Supplementary Table S2), indicating that similar to commonly detected genes, these FN genes have fundamental cellular functions. As exemplified by the Human v1 library in Fig. 3d, FN genes from this library showed substantially lower FC and weaker significance but were detected by the other 3 libraries, albeit at a slightly weaker FC and FDR when compared to the 317 commonly-detected genes. A similar trend was observed in all other libraries for library-specific FN genes (Supplementary Fig. S1b). We then analysed the genes missed by 2 libraries and found a similar trend, although the FC of genes in this category was considerably lower than that of the commonly detected genes (Supplementary Fig. S1c). The gRNAs targeting the FN genes of a given library exhibited marked variation in their FC, many of which showed little to no effect on proliferation, whereas the gRNAs from the other libraries indeed produced phenotypic effects (Supplementary Fig. S2). We also calculated the FN rates by comparing the Human v1, Brunello and the Whitehead full-set and found FN rates of 9.9%, 18.5% and 9.3%, respectively (Supplementary Fig. S3).
These results suggest that each library has a varying level of inherent false negatives and that false negatives are primarily caused by the selection and inclusion of less or non-effective gRNAs in the respective libraries.
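The observed FN rates quoted above follow directly from the counts of missed genes. The short calculation below reproduces them, assuming only that each rate is the number of high-confidence genes missed by a library divided by the size of the high-confidence set (718).

```python
# Counts taken from the text: high-confidence genes missed by each library.
HIGH_CONFIDENCE_TOTAL = 718
missed = {
    "Human v1": 102,
    "Brunello": 163,
    "Whitehead_A": 158,
    "Whitehead_B": 143,
}

# Observed FN rate = missed high-confidence genes / all high-confidence genes.
fn_rate_percent = {
    library: round(100 * n / HIGH_CONFIDENCE_TOTAL, 1)
    for library, n in missed.items()
}

for library, rate in fn_rate_percent.items():
    print(f"{library}: {rate}%")
```

The four values span the 14% to 23% range reported for the four-library comparison.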
Optimal number of gRNAs per gene and number of replicates on screening performance
As reported previously39, the number of gRNAs per gene has a substantial impact on statistical results. We sought to revisit this question as the greater efficiency accorded by the second-generation libraries may affect the optimal parameters for CRISPR-KO screening. We cloned the 10 gRNAs-per-gene Whitehead sub-library used above into our lentiviral vector carrying the optimised scaffold37, 38 and performed the same screen (read counts in Supplementary Table S1). This library showed greater FC than the library using the conventional scaffold, indicating that the optimised scaffold can further improve gRNA efficacy even if the gRNAs were already selected based on on-target prediction scores (Supplementary Fig. S4).
In order to assess the effect of the number of gRNAs per gene, we down-sampled this dataset and generated a series of read count subsets, ranging between 3 and 9 gRNAs per gene (5 subsets for each per-gene gRNA number) and performed MAGeCK and ROC curve analyses individually. As expected, the library with 10 gRNAs per gene exhibited the greatest area under the curve (AUC); however, even with fewer gRNAs, ROC curve analysis detected only marginal performance differences across the subsets (Fig. 4a). Consistent with our previous observation (Fig. 2f), greater differences became evident when we compared the numbers of depleted genes with FDR and FC cut-offs (Fig. 4b). The subsampled libraries with 3 gRNAs per gene detected few hits below 10% FDR, whereas the performance improved markedly in libraries with 4 gRNAs per gene. The total number of depleted genes detected below 10% FDR increased further as the number of gRNAs per gene increased up to 10 gRNAs per gene. However, when FC was also used as an additional criterion, the number of depleted genes with FDR ≤ 10% and FC ≤ 0.5 plateaued at around 6 gRNAs per gene, whereas the number of genes with FDR ≤ 10% and FC > 0.5 kept increasing. These results indicated that increasing the number of gRNAs per gene is beneficial up to a certain point, but beyond this point the gain in significance is confined to genes with a minor effect (as indicated by their modest depletion, FC > 0.5), which are thus unlikely to be crucial to the phenotype in question.
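A down-sampling scheme of this kind can be sketched as below. The text does not state how gRNAs were chosen for each subset, so random sampling without replacement is an assumption here, and the function names, seed and toy library are hypothetical.

```python
import random


def subsample_guides(guides_by_gene, k, rng):
    """Pick k gRNAs per gene at random, without replacement.

    Genes with fewer than k gRNAs keep all of their gRNAs (an assumption).
    """
    subset = {}
    for gene, guides in guides_by_gene.items():
        n = min(k, len(guides))
        subset[gene] = rng.sample(sorted(guides), n)
    return subset


def build_series(guides_by_gene, sizes=range(3, 10), n_subsets=5, seed=0):
    """Generate n_subsets independent subsets for each per-gene gRNA number."""
    rng = random.Random(seed)  # fixed seed so the subsets are reproducible
    return {
        k: [subsample_guides(guides_by_gene, k, rng) for _ in range(n_subsets)]
        for k in sizes
    }


# Toy library: 3 genes with 10 gRNAs each (placeholder identifiers).
toy_library = {
    f"GENE_{i}": [f"GENE_{i}_g{j}" for j in range(1, 11)] for i in range(1, 4)
}
series = build_series(toy_library)
```

Each subset defines which rows of the read count table are retained before the table is passed to the statistical analysis.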
We next analysed the effect of the number of replicates on the screening outcome. For each down-sampled dataset, we performed statistical analysis with the number of replicates reduced from 3 to 2 to 1 and counted the genes with FDR ≤ 0.1 and FC ≤ 0.5 (Fig. 4c). This analysis revealed that conducting a screen in more replicates improved sensitivity and detected a greater number of genes, particularly when using libraries with 4 or 5 gRNAs per gene. However, the difference between 2 and 3 replicates became negligible in libraries that contain 6 or more gRNAs per gene. Moreover, when using libraries with 9 gRNAs per gene, screens with a single replicate were as robust as those with 2 or more replicates.
Taken together, we concluded that screens using a second-generation library with 6 gRNAs per gene in duplicate would provide the most suitable and practical trade-off. To evaluate the sensitivity of libraries with 6 gRNAs per gene, we re-designed our library and generated a new Human v3 library, yielding a total of 114,749 gRNAs targeting 18,740 human genes (Supplementary Table S3). To test this new v3 library and to evaluate the screening parameters we had determined, we randomly selected 5,000 genes as a test set, cloned an oligo pool into a modified version of our lentiviral vector (Supplementary Fig. S5) and similarly performed a screen in HT-29 cells (read counts in Supplementary Table S4). The new lentiviral gRNA vector incorporated a library-specific barcode, which can be used as a reverse primer-annealing site, thus preventing gRNA amplification from existing lentiviral gRNA vectors due to cross-contamination. The vast majority of depleted genes with the new v3 library showed improved significance (Fig. 5a). Further corroboration using RNA-seq data37 also showed the v3 library as producing higher significance in the expressed genes but not in the non-expressed genes (Fig. 5b,c). As the principle of gRNA selection and the gRNA scaffold used remained unchanged, there was no significant difference in the FC between the v1 and v3 libraries (Fig. 5d; Kolmogorov-Smirnov test, p = 0.44). We next conducted statistical analysis with various numbers of replicates and compared the number of depleted genes (Fig. 5e). Consistent with the observation in Fig. 4c, there was almost no difference in the number of depleted genes between the 3- and 2-replicate screens with the v3 library. Even with a single replicate, the reduction in detecting depleted genes was marginal and the new library could thus detect more genes than the three-replicate screen with the v1 library.
Discussion
Prior to the advent of CRISPR-Cas9-based genome editing technology, genome-wide recessive screening had typically been performed using RNA interference (RNAi) technology40. However, the intrinsic nature of RNAi, namely incomplete silencing and frequent off-target effects, resulted in low detection sensitivity and inconsistent results across laboratories. In contrast, the CRISPR-Cas9 system directly modifies genomic DNA and generates inactivating mutations, thereby exhibiting greater gene perturbation efficiency and stronger phenotypic outcomes. This was well exemplified by a recent study on CRISPR-KO screens for host factors required for HIV-1 infection41. Nonetheless, the CRISPR-Cas9 system has several pitfalls. As a result of nucleic acid-based recognition of targets, the CRISPR-Cas9 system could exhibit off-target effects through imperfect hybridisation between a gRNA and target DNA23, 42. A few computational prediction methods have been developed23, 27, but predicting which potential off-target sites are actually cleaved is still difficult. Selecting gRNAs with the fewest predicted off-target sites would still be the most practical approach.
Another major pitfall of CRISPR-KO screening, especially screens in cancer cell lines, is the false positives caused by copy number aberration43, 44. As CRISPR-KO screening relies on double-stranded DNA breaks (DSBs) generated by wildtype Cas9 protein to inactivate gene function, the higher number of DSBs in copy-number-amplified regions generates higher genotoxicity than in normal copy number regions, causing cell proliferation delay or deletion of cancer driver genes in such regions. As a result, genes without any proliferative impact may be falsely identified as screen hits. Gene copy number data should ideally be incorporated when interpreting screening results from cancer cell lines to minimise false-positive calls.
CRISPR-KO screening using the first-generation libraries did not robustly detect genes in negative selection. In order to improve screening performance, two different approaches have been taken. One approach is to optimise the gRNA scaffold sequences. Several studies have been conducted thus far and resulted in a similar structure, which showed more robust phenotypic outcomes38, 45, 46. Another approach is to computationally predict on-target efficiency using nucleotide biases on gRNA KO efficiency. These improvements have been adopted when generating the second-generation CRISPR-KO libraries.
The degree of differences between the four second-generation CRISPR libraries may not be immediately apparent with a quick glance at Table 1. However, the overlapping patterns of the human genes targeted and the constituent gRNAs in Fig. 1 revealed that only a measly 0.3% of the 350,860 unique gRNAs are shared by all four libraries, and only about 16% are present in more than one library, despite having 17,070 genes (87.1% of the total) targeted by all four libraries. Even the three libraries that were designed using on-target prediction algorithms (Table 1) shared only 4,002 gRNAs, a mere 1.42% of the three-library 281,418 gRNA pool. This great diversity of gRNAs is a testament to the huge effect that the gRNA design principles have on the composition of CRISPR libraries.
Despite the huge differences in the composition of the gRNAs and other parameters (Table 1), the estimated false negative rates of the four libraries (Human v1, Brunello and the two Whitehead subsets) turned out to be remarkably similar at between 14% and 23% (at FDR ≤ 20% and FC ≤ 0.5). This indicates that each of the libraries has its fair share of gRNAs which are not active in producing gene knockouts. This also suggests that false negatives are difficult to avoid simply via currently-available computational methods of gRNA selection, as the elaborate on-target prediction algorithms did not show a significant advantage over the gRNA design principles we used for the Human v1 library. Further work is required to understand factors that influence gRNA efficacy. It would also be worth considering the construction of new CRISPR libraries by selecting and incorporating validated active gRNAs when more CRISPR-KO datasets using different libraries become available.
Our in silico subsampling analyses from the 10-gRNA-per-gene Whitehead half library (Fig. 4) identified a substantial gRNA number-dependent effect on statistical results returned by MAGeCK. It is evident from our results that a higher number of gRNAs per gene allows for a more robust statistical analysis but can become over-sensitive, with a tendency to call genes with a relatively minor phenotypic effect (i.e. lower fold change values) as significant hits. From the perspective of practicality, a higher number of gRNAs per gene has the following two disadvantages. Firstly, increasing the number of gRNAs per gene invariably results in a larger library size. As a consequence, ensuring that the library complexity is maintained throughout the screening process would become increasingly challenging, and the loss of library complexity could lead to poor screening outcomes. Secondly, using a higher number of gRNAs per gene leads to over-detection of hits, resulting in unnecessary downstream validation work. Therefore, it is important to identify a practical trade-off that reports biologically meaningful outcomes with libraries small enough to perform a proper screen. For second-generation CRISPR-KO libraries, we have shown that the use of a library containing 6 gRNAs per gene in duplicated screens is likely to be the optimal design parameter for genome-wide CRISPR-KO screens. In the validation screen of our new Human v3 library, we have detected with just a single replicate more depleted genes than triplicate screens using our v1 library (Fig. 5e). This translates to a reduction of 60% in required reagents and laboratory work such as tissue culture and sequencing. For data analysis, several statistical packages have been developed and more are likely to be developed47,48,49,50. To narrow down a large number of primary hits to high confidence hits, additional statistical analyses using these newly developed tools would be of great utility.
CRISPR-KO screens in cancer cell lines have been successfully conducted and have identified novel therapeutic targets34, 36, 37, 51, 52. There will be, however, growing demand for screening cell types less amenable to scaling up, such as organoids and primary cells53. Our newly defined optimal screening parameters would be helpful to other researchers in the field who are considering conducting such screens with genome-wide libraries or constructing bespoke gRNA libraries.
Methods
Plasmid construction
pKLV3-U6gRNA5(BbsI)-PGKpuroBFP-W-L1 was constructed by cloning a gBlock fragment containing the gRNA expression cassette and a bar code into the MluI-BamHI site of pKLV2-U6gRNA5(BbsI)-PGKpuroBFP-W37 using a Gibson assembly kit (NEB).
CRISPR libraries
The Brunello library was purchased from Addgene (catalog #73178). The gRNA expression cassette of the Brunello library was PCR-amplified with primers (5′-TAGTACCGGGCCCTACGCGTGAGGGCCTATTTCCCATG-3′ and 5′-CTACCCGGTAGAATTGGATCCAAAAAAAGCACCGACTCG-3′). The resulting PCR product was purified with a gel purification kit (Qiagen) and cloned into the MluI-BamHI site of pKLV2-U6gRNA5(BbsI)-PGKpuroBFP-W using a Gibson assembly kit (NEB). Five assembly reactions were performed with 100 ng of the vector and 9.1 ng of the insert fragment per reaction at 50 °C for 30 min. The reactions were then combined, purified with a MinElute PCR purification kit (Qiagen) and eluted into 10 µl water. Assembled DNA was electroporated into DH10B electro-competent cells (NEB) using 1 µl per electroporation. All 10 electroporation reactions were combined, incubated at 37 °C for 1 h and directly inoculated into 500 ml 2xTY medium containing 50 µg/ml ampicillin. Plasmid DNA was purified from overnight culture using a Plasmid Maxi kit (Qiagen).
For the Whitehead library (Addgene catalog #1000000067), gRNA sequences for 9,190 randomly selected genes were obtained from the published gRNA list. Flanking sequences were appended to each 20-nt gRNA sequence (N20) as follows: 5′-GCAGATGGCTCTTTGTCCTAGACATCGAAGACAACACCGN20GTTTTAGTCTTCTCGTCGC-3′. Pooled oligos were purchased from CustomArray and cloned into pKLV2-U6gRNA5(BbsI)-PGKpuroBFP-W as described previously21.
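Assembling the oligo sequences from the template above is a simple string operation. The sketch below is illustrative: the flanking sequences are copied from the template, whereas the function name `build_oligo`, the validation rules and the example guide are hypothetical.

```python
# Flanking sequences copied from the oligo template in the text.
PREFIX = "GCAGATGGCTCTTTGTCCTAGACATCGAAGACAACACCG"
SUFFIX = "GTTTTAGTCTTCTCGTCGC"
GUIDE_LENGTH = 20  # the N20 position of the template


def build_oligo(guide):
    """Return the full oligo for one 20-nt gRNA sequence (5' to 3')."""
    guide = guide.upper()
    if len(guide) != GUIDE_LENGTH:
        raise ValueError(f"expected {GUIDE_LENGTH} nt, got {len(guide)}")
    if set(guide) - set("ACGT"):
        raise ValueError("guide contains characters other than A, C, G, T")
    return PREFIX + guide + SUFFIX


# Placeholder guide sequence, not taken from any library.
example_guide = "ACGTACGTACGTACGTACGT"
example_oligo = build_oligo(example_guide)
```

Applying `build_oligo` to every gRNA in the selected gene list yields the sequences to submit for pooled oligo synthesis.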
The v3 library was designed essentially as described previously37 with a minor modification. We added another gRNA selection filter of GC% being 40–80% and selected 6 gRNAs per consensus coding sequence (CCDS). An oligo pool containing 30,000 gRNAs targeting 5,000 randomly selected genes was purchased from Twist Bioscience and cloned into pKLV3-U6gRNA5(BbsI)-PGKpuroBFP-W-L1 as described previously21.
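The additional GC filter can be illustrated as follows. This is a sketch under stated assumptions: the 40–80% bounds are treated as inclusive, the candidate list for each CCDS is assumed to be already ranked by the selection criteria described previously37, and the function names and sequences are hypothetical.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a gRNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_gc_filter(seq, low=40.0, high=80.0):
    """True if GC% lies within the 40-80% window (bounds assumed inclusive)."""
    return low <= gc_percent(seq) <= high


def pick_guides(ranked_candidates, per_gene=6):
    """Keep the first `per_gene` candidates that pass the GC filter.

    ranked_candidates is assumed to be ordered best-first by the other
    selection criteria, which are not reproduced here.
    """
    kept = [seq for seq in ranked_candidates if passes_gc_filter(seq)]
    return kept[:per_gene]


# Placeholder 20-nt sequences for illustration only.
candidates = [
    "GCGCGCGCGCATATATATAT",  # 50% GC, kept
    "ATATATATATATATATATGC",  # 10% GC, rejected
    "GCGCGCGCGCGCGCGCGCAT",  # 90% GC, rejected
    "GGCCGGCCATATGGCCATAT",  # 60% GC, kept
]
selected = pick_guides(candidates)
```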
Virus production, transduction, screening, gRNA amplification and sequencing
These were performed as described previously21, 37. To amplify gRNA fragments from the v3 library, 5′-ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTTGTGGAAAGGACGAAACA-3′ and 5′-TCGGCATTCCTGCTGAACCGCTCTTCCGATCTTACCCAGACTGCTCATCGTC-3′ were used.
Processing of gRNA count tables
All FDR values were calculated using MAGeCK39 (version 0.5.3).
Prior to the calculation of fold change (FC) values, all gRNA count values were first normalised against the total of the baseline plasmid counts to account for the sequencing yield variability between replicates, then incremented by 1 to avoid division-by-zero errors. The incremented gRNA counts for the replicates were then totalled and averaged, before division by the gRNA baseline plasmid read count to arrive at the gRNA-level fold change. Gene-level FC values were calculated as the average of the fold change values of all constituent gRNAs for each gene.
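The fold change calculation described above can be sketched in a few lines. This is one reading of the procedure, not the original analysis script: it assumes that each replicate is scaled so that its total matches the total of the baseline plasmid counts, and that the baseline count is incremented by 1 in the same way as the replicate counts. All function and variable names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def grna_fold_changes(plasmid, replicates):
    """gRNA-level FC from a baseline plasmid count table and replicate tables.

    plasmid: dict gRNA -> baseline plasmid read count.
    replicates: list of dicts, each gRNA -> read count in one replicate.
    """
    plasmid_total = sum(plasmid.values())
    # Scale each replicate to the plasmid sequencing depth (assumption).
    factors = [plasmid_total / sum(rep.values()) for rep in replicates]
    fold_changes = {}
    for grna, baseline in plasmid.items():
        # Normalise, then add 1 to avoid division-by-zero errors.
        scaled = [rep.get(grna, 0) * f + 1 for rep, f in zip(replicates, factors)]
        fold_changes[grna] = mean(scaled) / (baseline + 1)
    return fold_changes


def gene_fold_changes(grna_fc, gene_of):
    """Gene-level FC: mean FC of all gRNAs that target the gene."""
    by_gene = defaultdict(list)
    for grna, fc in grna_fc.items():
        by_gene[gene_of[grna]].append(fc)
    return {gene: mean(values) for gene, values in by_gene.items()}


# Toy counts for illustration only.
plasmid = {"a": 99, "b": 99}
replicates = [{"a": 49, "b": 149}]
grna_fc = grna_fold_changes(plasmid, replicates)
gene_fc = gene_fold_changes(grna_fc, {"a": "GENE_X", "b": "GENE_X"})
```

In this toy example the two gRNAs have FCs of 0.5 and 1.5, so the gene-level FC is 1.0, which shows how a single inactive gRNA can mask the depletion caused by an active one.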
ROC analysis
ROC analysis was done in R, using predefined lists31 of non-essential genes (n = 927) and essential genes (n = 684).
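The quantity summarised by the ROC curve can be illustrated with a short Python sketch (the analysis itself was done in R, and this is not that code). It assumes that genes are ranked by gene-level FC, with lower FC taken as evidence of essentiality, and computes the AUC as the probability that a randomly chosen essential gene has a lower FC than a randomly chosen non-essential gene. The function name and toy values are hypothetical.

```python
def roc_auc(essential_fc, nonessential_fc):
    """AUC for separating essential from non-essential genes by FC.

    Counts the fraction of (essential, non-essential) pairs in which the
    essential gene has the lower FC; ties contribute one half.
    """
    wins = 0.0
    for e in essential_fc:
        for n in nonessential_fc:
            if e < n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(essential_fc) * len(nonessential_fc))


# Toy gene-level FCs for illustration only.
essential = [0.1, 0.3, 0.6]
nonessential = [0.5, 0.9, 1.1]
auc = roc_auc(essential, nonessential)  # 8 of the 9 pairs are ordered correctly
```

The pairwise loop is quadratic but remains small for reference lists of the size used here (684 essential and 927 non-essential genes).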
Data Availability
All data generated or analysed during this study are included in this published article and its Supplementary Information files.
Change history
12 April 2018
A correction to this article has been published and is linked from the HTML and PDF versions of this paper. The error has not been fixed in the paper.
References
Barrangou, R. et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712, doi:10.1126/science.1138140 (2007).
Deltcheva, E. et al. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471, 602, doi:10.1038/nature09886 (2011).
Jinek, M. et al. A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity. Science 337, 816–821, doi:10.1126/science.1225829 (2012).
Gasiunas, G., Barrangou, R., Horvath, P. & Siksnys, V. Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc Natl Acad Sci U S A 109, E2579–2586, doi:10.1073/pnas.1208507109 (2012).
Doudna, J. A. & Charpentier, E. Genome editing. The new frontier of genome engineering with CRISPR-Cas9. Science 346, 1258096, doi:10.1126/science.1258096 (2014).
Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823, doi:10.1126/science.1231143 (2013).
Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823–826, doi:10.1126/science.1232033 (2013).
Hwang, W. Y. et al. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat Biotechnol 31, 227–229, doi:10.1038/nbt.2501 (2013).
Cho, S. W., Kim, S., Kim, J. M. & Kim, J. S. Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nat Biotechnol 31, 230–232, doi:10.1038/nbt.2507 (2013).
Hilton, I. B. et al. Epigenome editing by a CRISPR-Cas9-based acetyltransferase activates genes from promoters and enhancers. Nat Biotechnol 33, 510–517, doi:10.1038/nbt.3199 (2015).
Liu, X. S. et al. Editing DNA Methylation in the Mammalian Genome. Cell 167(233–247), e217, doi:10.1016/j.cell.2016.08.056 (2016).
Morita, S. et al. Targeted DNA demethylation in vivo using dCas9-peptide repeat and scFv-TET1 catalytic domain fusions. Nat Biotechnol 34, 1060–1065, doi:10.1038/nbt.3658 (2016).
Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420–424, doi:10.1038/nature17946 (2016).
Nishida, K. et al. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science 353, doi:10.1126/science.aaf8729 (2016).
Gilbert, L. A. et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442–451, doi:10.1016/j.cell.2013.06.044 (2013).
Gilbert, L. A. et al. Genome-Scale CRISPR-Mediated Control of Gene Repression and Activation. Cell 159, 647–661, doi:10.1016/j.cell.2014.09.029 (2014).
Konermann, S. et al. Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex. Nature 517, 583–588, doi:10.1038/nature14136 (2015).
Chavez, A. et al. Highly efficient Cas9-mediated transcriptional programming. Nat Methods 12, 326–328, doi:10.1038/nmeth.3312 (2015).
Shalem, O. et al. Genome-Scale CRISPR-Cas9 Knockout Screening in Human Cells. Science 343, 84–87, doi:10.1126/science.1247005 (2014).
Wang, T., Wei, J. J., Sabatini, D. M. & Lander, E. S. Genetic Screens in Human Cells Using the CRISPR-Cas9 System. Science 343, 80–84, doi:10.1126/science.1246981 (2014).
Koike-Yusa, H., Li, Y., Tan, E. P., Velasco-Herrera Mdel, C. & Yusa, K. Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library. Nat Biotechnol 32, 267–273, doi:10.1038/nbt.2800 (2014).
Shalem, O., Sanjana, N. E. & Zhang, F. High-throughput functional genomics using CRISPR-Cas9. Nat Rev Genet 16, 299–311, doi:10.1038/nrg3899 (2015).
Hsu, P. D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol 31, 827–832, doi:10.1038/nbt.2647 (2013).
Bae, S., Park, J. & Kim, J. S. Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics 30, 1473–1475, doi:10.1093/bioinformatics/btu048 (2014).
Heigwer, F., Kerr, G. & Boutros, M. E-CRISP: fast CRISPR target site identification. Nat Methods 11, 122–123, doi:10.1038/nmeth.2812 (2014).
Hodgkins, A. et al. WGE: a CRISPR database for genome engineering. Bioinformatics 31, 3078–3080, doi:10.1093/bioinformatics/btv308 (2015).
Doench, J. G. et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat Biotechnol 34, 184–191, doi:10.1038/nbt.3437 (2016).
Doench, J. G. et al. Rational design of highly active sgRNAs for CRISPR-Cas9-mediated gene inactivation. Nat Biotechnol 32, 1262–U1130, doi:10.1038/nbt.3026 (2014).
Chari, R., Mali, P., Moosburner, M. & Church, G. M. Unraveling CRISPR-Cas9 genome engineering parameters via a library-on-library approach. Nat Methods 12, 823–826, doi:10.1038/nmeth.3473 (2015).
Xu, H. et al. Sequence determinants of improved CRISPR sgRNA design. Genome Res 25, 1147–1157, doi:10.1101/gr.191452.115 (2015).
Hart, T. et al. Evaluation and Design of Genome-wide CRISPR/Cas9 Knockout Screens. G3: Genes, Genomes, Genetics, doi:10.1534/g3.117.041277 (2017).
Bae, S., Kweon, J., Kim, H. S. & Kim, J. S. Microhomology-based choice of Cas9 nuclease target sites. Nat Methods 11, 705–706, doi:10.1038/nmeth.3015 (2014).
Sanjana, N. E., Shalem, O. & Zhang, F. Improved vectors and genome-wide libraries for CRISPR screening. Nat Methods 11, 783–784, doi:10.1038/nmeth.3047 (2014).
Hart, T. et al. High-Resolution CRISPR Screens Reveal Fitness Genes and Genotype-Specific Cancer Liabilities. Cell 163, 1515–1526, doi:10.1016/j.cell.2015.11.015 (2015).
Ma, H. et al. A CRISPR-Based Screen Identifies Genes Essential for West-Nile-Virus-Induced Cell Death. Cell Rep 12, 673–683, doi:10.1016/j.celrep.2015.06.049 (2015).
Wang, T. et al. Identification and characterization of essential genes in the human genome. Science 350, 1096–1101, doi:10.1126/science.aac7041 (2015).
Tzelepis, K. et al. A CRISPR Dropout Screen Identifies Genetic Vulnerabilities and Therapeutic Targets in Acute Myeloid Leukemia. Cell Rep 17, 1193–1205, doi:10.1016/j.celrep.2016.09.079 (2016).
Chen, B. et al. Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system. Cell 155, 1479–1491, doi:10.1016/j.cell.2013.12.001 (2013).
Li, W. et al. MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens. Genome Biol 15, 554, doi:10.1186/s13059-014-0554-4 (2014).
Mohr, S. E., Smith, J. A., Shamu, C. E., Neumuller, R. A. & Perrimon, N. RNAi screening comes of age: improved techniques and complementary approaches. Nat Rev Mol Cell Biol 15, 591–600, doi:10.1038/nrm3860 (2014).
Park, R. J. et al. A genome-wide CRISPR screen identifies a restricted set of HIV host dependency factors. Nat Genet 49, 193–203, doi:10.1038/ng.3741 (2017).
Fu, Y. et al. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat Biotechnol 31, 822–826, doi:10.1038/nbt.2623 (2013).
Munoz, D. M. et al. CRISPR Screens Provide a Comprehensive Assessment of Cancer Vulnerabilities but Generate False-Positive Hits for Highly Amplified Genomic Regions. Cancer Discov 6, 900–913, doi:10.1158/2159-8290.CD-16-0178 (2016).
Aguirre, A. J. et al. Genomic Copy Number Dictates a Gene-Independent Cell Response to CRISPR/Cas9 Targeting. Cancer Discov 6, 914–929, doi:10.1158/2159-8290.CD-16-0154 (2016).
Cross, B. C. et al. Increasing the performance of pooled CRISPR-Cas9 drop-out screening. Sci Rep 6, 31782, doi:10.1038/srep31782 (2016).
Dang, Y. et al. Optimizing sgRNA structure to improve CRISPR-Cas9 knockout efficiency. Genome Biol 16, 280, doi:10.1186/s13059-015-0846-3 (2015).
Hart, T., Brown, K. R., Sircoulomb, F., Rottapel, R. & Moffat, J. Measuring error rates in genomic perturbation screens: gold standards for human functional genomics. Mol Syst Biol 10, 733, doi:10.15252/msb.20145216 (2014).
Diaz, A. A., Qin, H., Ramalho-Santos, M. & Song, J. S. HiTSelect: a comprehensive tool for high-complexity-pooled screen analysis. Nucleic Acids Res 43, e16, doi:10.1093/nar/gku1197 (2015).
Yu, J., Silva, J. & Califano, A. ScreenBEAM: a novel meta-analysis algorithm for functional genomics screens via Bayesian hierarchical modeling. Bioinformatics 32, 260–267, doi:10.1093/bioinformatics/btv556 (2016).
Winter, J. et al. caRpools: an R package for exploratory data analysis and documentation of pooled CRISPR/Cas9 screens. Bioinformatics 32, 632–634, doi:10.1093/bioinformatics/btv617 (2016).
Steinhart, Z. et al. Genome-wide CRISPR screens reveal a Wnt-FZD5 signaling circuit as a druggable vulnerability of RNF43-mutant pancreatic tumors. Nat Med 23, 60–68, doi:10.1038/nm.4219 (2017).
Wang, T. et al. Gene Essentiality Profiling Reveals Gene Networks and Synthetic Lethal Interactions with Oncogenic Ras. Cell 168, 890–903.e15, doi:10.1016/j.cell.2017.01.013 (2017).
Parnas, O. et al. A Genome-wide CRISPR Screen in Primary Immune Cells to Dissect Regulatory Networks. Cell 162, 675–686, doi:10.1016/j.cell.2015.06.059 (2015).
Acknowledgements
This work was funded by the Wellcome Trust (WT077187).
Author information
Contributions
K.Y. conceived the study and designed the experiments. H.Y. and K.Y. performed the screens. Y.L. and K.Y. designed the gRNA libraries. S.H.O. analysed the data. S.H.O. and K.Y. wrote the paper with input from all authors.
Ethics declarations
Competing Interests
The authors declare that they have no competing interests.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ong, S.H., Li, Y., Koike-Yusa, H. et al. Optimised metrics for CRISPR-KO screens with second-generation gRNA libraries. Sci Rep 7, 7384 (2017). https://doi.org/10.1038/s41598-017-07827-z
DOI: https://doi.org/10.1038/s41598-017-07827-z