We compared the ability of short hairpin RNA (shRNA) and CRISPR/Cas9 screens to identify essential genes in the human chronic myelogenous leukemia cell line K562. We found that the precision of the two libraries in detecting essential genes was similar and that combining data from both screens improved performance. Notably, results from the two screens showed little correlation, which can be partially explained by the identification of distinct essential biological processes with each technology.
Sequence Read Archive
Bassik, M.C. et al. Cell 152, 909–922 (2013).
Silva, J.M. et al. Science 319, 617–620 (2008).
Barbie, D.A. et al. Nature 462, 108–112 (2009).
Shalem, O., Sanjana, N.E. & Zhang, F. Nat. Rev. Genet. 16, 299–311 (2015).
Wang, T., Wei, J.J., Sabatini, D.M. & Lander, E.S. Science 343, 80–84 (2014).
Shalem, O. et al. Science 343, 84–87 (2014).
Koike-Yusa, H., Li, Y., Tan, E.-P., Velasco-Herrera, M.D.C. & Yusa, K. Nat. Biotechnol. 32, 267–273 (2014).
Zhou, Y. et al. Nature 509, 487–491 (2014).
Kaelin, W.G. Science 337, 421–422 (2012).
Barrangou, R. et al. Nucleic Acids Res. 43, 3407–3419 (2015).
Jackson, A.L. & Linsley, P.S. Nat. Rev. Drug Discov. 9, 57–67 (2010).
Grimm, D. et al. Nature 441, 537–541 (2006).
Kampmann, M. et al. Proc. Natl. Acad. Sci. USA 112, E3384–E3391 (2015).
Deans, R.M. et al. Nat. Chem. Biol. http://dx.doi.org/10.1038/nchembio.2050 (2016).
Hart, T., Brown, K.R., Sircoulomb, F., Rottapel, R. & Moffat, J. Mol. Syst. Biol. 10, 733 (2014).
Parnas, O. et al. Cell 162, 675–686 (2015).
Gilbert, L.A. et al. Cell 159, 647–661 (2014).
Li, W. et al. Genome Biol. 15, 554 (2014).
König, R. et al. Nat. Methods 4, 847–849 (2007).
Luo, B. et al. Proc. Natl. Acad. Sci. USA 105, 20380–20385 (2008).
Diaz, A.A., Qin, H., Ramalho-Santos, M. & Song, J.S. Nucleic Acids Res. 43, e16 (2015).
Hart, T. et al. Cell 163, 1515–1526 (2015).
Wang, T. et al. Science 350, 1096–1101 (2015).
Tsvetkov, P. et al. Elife 4, e08467 (2015).
Frock, R.L. et al. Nat. Biotechnol. 33, 179–186 (2015).
Tsai, S.Q. et al. Nat. Biotechnol. 33, 187–197 (2015).
Pruett-Miller, S.M., Reading, D.W., Porter, S.N. & Porteus, M.H. PLoS Genet. 5, e1000376 (2009).
Shi, J. et al. Nat. Biotechnol. 33, 661–667 (2015).
Efron, B. Stat. Sci. 29, 285–301 (2014).
Kass, R.E. & Steffey, D. J. Am. Stat. Assoc. 84, 717–726 (1989).
Birmingham, A. et al. Nat. Methods 3, 199–204 (2006).
Jackson, A.L. et al. Nat. Biotechnol. 21, 635–637 (2003).
Eden, E., Navon, R., Steinfeld, I., Lipson, D. & Yakhini, Z. BMC Bioinformatics 10, 48 (2009).
ENCODE Project Consortium. Nature 489, 57–74 (2012).
Cherry, J.M. et al. Nucleic Acids Res. 40, D700–D705 (2012).
We thank K. Han, G. Hess, M. Dubreuil, M. Haney, K. Tsui, and the Bassik lab for technical expertise and helpful discussions. We thank G. Sherlock, N. Boley, A. Kundaje, R. Tibshirani, and A. Fire for their critical reading of the manuscript and suggestions. Many thanks to A. Regev, N. Hacohen, O. Parnas, and R.H. Herbst for providing data from their published screen. We acknowledge the ENCODE Consortium and the Gingeras lab for the generation and release of the K562 gene expression data used here. This work was funded by the NIH Director's New Innovator Award Program (project no. 1DP2HD084069-01), NIH/NHGRI (training grant T32 HG000044 to D.W.M.), and a seed grant from Stanford ChEM-H. This material is based on work supported by the National Science Foundation Graduate Research Fellowship (grant DGE-114747 to D.W.M. and R.M.D.). R.M.D. was additionally supported by a Burt and DeeDee McMurtry Stanford Graduate Fellowship.
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Distribution of targeting and control elements.
(a) Distribution of negative controls for a single replicate of Cas9 and shRNA screens. Enrichments are calculated as a median-normalized log ratio of counts. (b,c) Distribution of targeting elements is shown in meta-gene plots for the top 50 (b) enriched and (c) disenriched genes found in a single replicate of the Cas9 and shRNA screens as identified by casTLE. To normalize, the enrichment of each individual element was divided by the effect size estimate for the gene generated by casTLE. The dotted line is placed at the estimated effect size and normalized to one.
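The enrichment measure named in this caption, a median-normalized log ratio of counts, can be illustrated with a minimal sketch. The base-2 logarithm, the pseudocount of 1, centering on the median of all elements, and every name and count below are assumptions made for illustration; the caption does not specify them.

```python
import math
from statistics import median


def median_normalized_log_ratios(before, after, pseudocount=1.0):
    """Return log2(after / before) per element, shifted so the median is zero.

    Assumptions (not stated in the caption): base-2 logs, a pseudocount to
    avoid log(0), and normalization to the median over all elements.
    """
    raw = {
        element: math.log2(
            (after.get(element, 0) + pseudocount) / (before[element] + pseudocount)
        )
        for element in before
    }
    center = median(raw.values())
    return {element: value - center for element, value in raw.items()}


# Toy counts (hypothetical): plasmid library versus cells after growth.
plasmid = {"ctrl_1": 100, "ctrl_2": 120, "geneA_1": 110, "geneA_2": 90, "geneB_1": 105}
day14 = {"ctrl_1": 210, "ctrl_2": 240, "geneA_1": 30, "geneA_2": 20, "geneB_1": 200}
enrichment = median_normalized_log_ratios(plasmid, day14)
```

In this toy input the two elements targeting geneA drop out of the population, so their normalized enrichments are strongly negative while the typical element sits near zero.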
Supplementary Figure 2 Distribution of targeting sgRNAs for top disenriched genes.
(a-d) Enrichment of targeting elements and estimated effect size is shown for the top four disenriched genes from Cas9 data from a single replicate. Enrichments are calculated as a median-normalized log ratio of counts. Gray lines represent the smoothed distribution of non-targeting controls. Red vertical lines represent enrichment of individual targeting guides towards indicated genes. Vertical dotted line represents effect size estimate from casTLE. Red distribution is a smoothed distribution of guides targeting the genes indicated.
Supplementary Figure 3 Distribution of targeting shRNAs for top disenriched genes.
(a-d) Enrichment of targeting elements and estimated effect size is shown for the top four disenriched genes from shRNA data from a single replicate. Enrichments are calculated as a median-normalized log ratio of counts. Gray lines represent the smoothed distribution of non-targeting controls. Blue vertical lines represent enrichment of individual targeting hairpins towards indicated genes. Vertical dotted line represents effect size estimate from casTLE. Blue distribution is a smoothed distribution of hairpins targeting the genes indicated.
Supplementary Figure 4 casTLE provides a statistical framework to account for high-throughput screens.
The unknown relationship between gene dosage and measured phenotype, together with the unknown distribution of shRNA and Cas9 efficacies, restricts the predicted effect size of reagents to a bounded region (blue shaded region) between 0 and the maximum effect I (dotted line). Some fraction (1-θ) of the reagents have no on-target effect at all. The phenotype observed is thus the true effect obscured by noise, which is estimated using the distribution of non-targeting controls. The likelihoods of models with different values of I and θ are calculated, and the most likely effect size is selected after marginalizing over θ. A likelihood ratio is then calculated by comparing to a null model where I is zero.
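The model described in this caption can be sketched on a grid of candidate values. This is a simplified illustration and not the casTLE implementation: it assumes Gaussian noise with a known standard deviation (the caption says noise is estimated from non-targeting controls), a uniform distribution of on-target effects between 0 and I, and a uniform prior over θ. All function names and numbers are hypothetical.

```python
import math


def normal_pdf(x, sigma):
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def normal_cdf(x, sigma):
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2))))


def element_likelihood(x, max_effect, theta, sigma):
    """Mixture: with probability theta the true effect is uniform between 0 and
    max_effect (then blurred by noise); otherwise the element is pure noise."""
    noise_only = normal_pdf(x, sigma)
    lo, hi = min(0.0, max_effect), max(0.0, max_effect)
    # Uniform(lo, hi) convolved with Gaussian noise, evaluated at x.
    on_target = (normal_cdf(x - lo, sigma) - normal_cdf(x - hi, sigma)) / (hi - lo)
    return theta * on_target + (1.0 - theta) * noise_only


def log_marginal_likelihood(enrichments, max_effect, sigma, n_theta=20):
    """Average the likelihood over a grid of theta values (uniform prior)."""
    per_theta = []
    for k in range(n_theta):
        theta = (k + 0.5) / n_theta
        per_theta.append(sum(
            math.log(element_likelihood(x, max_effect, theta, sigma) + 1e-300)
            for x in enrichments
        ))
    peak = max(per_theta)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(v - peak) for v in per_theta) / n_theta)


def estimate_effect(enrichments, sigma, candidates):
    """Return (most likely nonzero max effect, log-likelihood ratio vs. I = 0)."""
    null = sum(math.log(normal_pdf(x, sigma) + 1e-300) for x in enrichments)
    best = max(candidates, key=lambda i: log_marginal_likelihood(enrichments, i, sigma))
    return best, log_marginal_likelihood(enrichments, best, sigma) - null


# Toy enrichments for one gene (hypothetical values).
hairpins = [-2.1, -1.8, -2.4, -0.1, -1.5, 0.2, -2.0, -1.2]
grid = [i / 10 for i in range(-40, 41) if i != 0]
effect, log_ratio = estimate_effect(hairpins, sigma=0.3, candidates=grid)
```

Because inactive reagents are absorbed by the (1-θ) component, the two elements near zero in the toy data do not pull the estimated maximum effect toward zero.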
Supplementary Figure 5 Reanalysis of previous screens.
(a) Results are shown for a previously published shRNA screen for ricin sensitivity reanalyzed with casTLE and compared to published results based on an MW test (ref. 1). (b) Previous CRISPR/Cas9 deletion screen for LPS-induced TNF expression in primary mouse bone-marrow derived dendritic cells, analyzed with casTLE and the published DESeq results (ref. 16). (c) Previous CRISPRi screen for sensitivity to the fusion toxin CTx-DTA, analyzed with casTLE versus the average of the top three sgRNA effects (ref. 17). (d) Previous CRISPRa screen for sensitivity to the fusion toxin CTx-DTA, analyzed with casTLE versus the average of the top three sgRNA effects (ref. 17).
Supplementary Figure 6 Comparison of casTLE to other methods.
(a-d) ROC curves indicate screen performance in identifying essential genes from the change in library composition between the plasmid library and cells after two weeks of growth. True positive rates and false positive rates are calculated using a previously established gold standard set of essential and nonessential genes (ref. 15). Genes are ranked by likelihood to be essential using the indicated methods, including casTLE. The highest effect heuristic was calculated by ranking the genes according to their most disenriched element. Data are shown from single replicates of the (a,c) Cas9 and (b,d) shRNA screens for (a,b) replicate 1 and (c,d) replicate 2.
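The ROC evaluation and the highest effect heuristic described in this caption can be sketched as follows. The gene names, enrichment values, and function names are hypothetical, ties between scores are not treated specially, and the gold-standard sets are assumed to be disjoint.

```python
def highest_effect_scores(element_enrichments):
    """Score each gene by its most disenriched (most negative) element."""
    return {gene: min(values) for gene, values in element_enrichments.items()}


def roc_curve(scores, essential, nonessential):
    """Walk gold-standard genes from most to least negative score and
    return the list of (false positive rate, true positive rate) points."""
    labelled = [
        (scores[gene], gene in essential)
        for gene in scores
        if gene in essential or gene in nonessential
    ]
    labelled.sort(key=lambda pair: pair[0])
    n_pos = sum(1 for _, is_essential in labelled if is_essential)
    n_neg = len(labelled) - n_pos
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for _, is_essential in labelled:
        if is_essential:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy data (hypothetical): per-gene element enrichments and gold-standard sets.
elements = {
    "ESS1": [-3.0, -2.5, -0.2],
    "ESS2": [-2.0, -1.0],
    "NON1": [0.1, -0.3],
    "NON2": [0.4, 0.0],
}
gene_scores = highest_effect_scores(elements)
curve = roc_curve(gene_scores, essential={"ESS1", "ESS2"}, nonessential={"NON1", "NON2"})
auc = area_under_curve(curve)
```

Any per-gene score in which more negative means more likely essential can be passed to `roc_curve` in place of the heuristic scores, which is how different ranking methods would be compared on the same gold standard.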
Supplementary Figure 7 Performance of combination of shRNA and Cas9 data.
(a) ROC curves from combination of different replicates of Cas9 and shRNA using casTLE. ROC curves indicate screen performance in identifying essential genes from the change in library composition between the plasmid library and cells after two weeks of growth. True positive rates and false positive rates are calculated using a previously established gold standard set of essential and nonessential genes (ref. 15). (b) Combination score has high reproducibility. A large positive casTLE score indicates a high-confidence increase in growth rate, while a highly negative casTLE score indicates a high-confidence decrease in growth rate, i.e. gene essentiality. The graphs compare replicate measurements of likelihood ratio between plasmid and T14 of the combination score based on replicate 1 of the Cas9 and shRNA screens and replicate 2 of the Cas9 and shRNA screens. Density is in log scale.
Supplementary Figure 8 Comparison of casTLE combination to casTLE analysis of single screens.
(a) ROC curves indicate screen performance in identifying essential genes by comparing the library composition between the plasmid library and cells after two weeks growth. ROC curves for Cas9 (red) and shRNA (blue) screens based on duplicate data combined using casTLE. Alternatively, data from single replicates of both Cas9 and shRNA screens were combined using casTLE (purple). (b) The number of essential genes at 10% false positive rate and their overlap based on the duplicate data from Cas9 and shRNA screens, as well as combination of a single replicate from both screens. False positive rate was estimated using gold standard nonessential genes. (c) Precision recall curve for Cas9, shRNA, and combination data using casTLE.
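Calling essential genes at a fixed false positive rate estimated from gold-standard nonessential genes, and then comparing the calls between screens, can be sketched as below. The cutoff rule (allowing at most 10% of nonessential genes to pass), the convention that more negative scores mean more essential, and all gene names and scores are assumptions for illustration.

```python
import math


def cutoff_at_fpr(scores, nonessential, fpr=0.10):
    """Score cutoff such that at most `fpr` of gold-standard nonessential
    genes score strictly below it (more negative = more essential)."""
    null_scores = sorted(scores[gene] for gene in nonessential if gene in scores)
    allowed = math.floor(fpr * len(null_scores))
    return null_scores[allowed]


def essential_calls(scores, nonessential, fpr=0.10):
    cutoff = cutoff_at_fpr(scores, nonessential, fpr)
    return {gene for gene, score in scores.items() if score < cutoff}


# Toy scores (hypothetical) for ten nonessential genes and three test genes.
nonessential = {f"NON{i}" for i in range(10)}
cas9 = {f"NON{i}": -float(i) for i in range(10)}
cas9.update({"G1": -20.0, "G2": -15.0, "G3": -2.0})
shrna = {f"NON{i}": -float(9 - i) for i in range(10)}
shrna.update({"G1": -1.0, "G2": -12.0, "G3": -30.0})

cas9_hits = essential_calls(cas9, nonessential)
shrna_hits = essential_calls(shrna, nonessential)
overlap = cas9_hits & shrna_hits
```

The overlap is a plain set intersection, so the same pattern extends to a third call set such as one derived from combined scores.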
Supplementary Figure 9 Comparison to an in silico four-shRNA-per-gene library.
Results from the 25-shRNA-per-gene library were downsampled by including only four hairpins per gene, selected by previous computational ranking. (a) ROC curves indicate screen performance in identifying essential genes by comparing the library composition between the plasmid library and cells after two weeks growth. (b) The number of essential genes at 10% false positive rate and their overlap based on the duplicate data from Cas9 and shRNA screens, as well as combination of a single replicate from both screens. (c) Comparison of casTLE scores between single replicates of Cas9 and shRNA data. (d) Adjusted p-values for select GO terms for shRNA and Cas9 screens as well as for data from both screens combined with casTLE.
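The in silico downsampling described here amounts to keeping the four best-ranked hairpins for each gene. A minimal sketch, assuming each hairpin carries a precomputed rank where lower is better (the caption does not describe the ranking itself); the identifiers and ranks are hypothetical.

```python
def downsample_library(element_ranks, per_gene=4):
    """Keep the `per_gene` best-ranked elements for every gene.

    element_ranks maps gene -> {element_id: rank}; a lower rank is assumed
    to mean a better-predicted hairpin.
    """
    return {
        gene: sorted(ranks, key=ranks.get)[:per_gene]
        for gene, ranks in element_ranks.items()
    }


# Toy library (hypothetical): 25 hairpins for one gene with scrambled ranks.
library = {"GENE1": {f"sh{i}": (i * 7) % 25 for i in range(1, 26)}}
subset = downsample_library(library)
```

The enrichments of the retained hairpins would then be analyzed exactly as for the full library.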
Supplementary Figure 10 Screen reproducibility and time-dependence of phenotypes.
(a,b) shRNA and Cas9 screens have high reproducibility. A large positive casTLE score indicates a high confidence increase in growth rate, while a highly negative casTLE score indicates a high confidence decrease in growth rate, i.e. gene essentiality. The graphs compare replicate measurements of casTLE scores between plasmid and T14 for (a) Cas9 and (b) shRNA screens. Density is in log scale. (c,d) Time dependence of phenotypes. casTLE scores in different time-frames for (c) Cas9 and (d) shRNA screens.
Supplementary Figure 11 Analysis of gene expression and yeast essential homologs.
Gene sets are defined for Cas9, shRNA, and Combination by a 10% FPR cutoff. Gene sets are defined for Cas9-combo and shRNA-combo as the genes present in the Cas9 or shRNA set, respectively, and not in the Combination set. The Overlap set is defined as genes present in both the Cas9 and shRNA sets (see Supplementary Fig. 8b). (a,b) ~7,000 genes with detectable expression in K562 were binned by expression. The fraction of genes identified as essential in each bin is reported versus the average expression level of the bin. (c,d) Fraction of genes that are homologs of essential yeast genes versus genes that are homologs of nonessential yeast genes. P-values calculated using Fisher’s exact test.
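Both analyses in this caption can be sketched in a few lines: binning genes by expression to report the fraction called essential per bin, and a two-sided Fisher's exact test on a 2x2 table. Equal-sized bins, the number of bins, the two-sided form of the test, and all names and values below are assumptions; the caption does not state them.

```python
from math import comb


def fraction_essential_by_bin(expression, essential, n_bins=10):
    """Sort genes by expression, split into equal-sized bins, and return
    (mean expression, fraction essential) for each bin."""
    genes = sorted(expression, key=expression.get)
    size = len(genes) / n_bins
    rows = []
    for b in range(n_bins):
        members = genes[round(b * size):round((b + 1) * size)]
        if not members:
            continue
        mean_expression = sum(expression[g] for g in members) / len(members)
        fraction = sum(g in essential for g in members) / len(members)
        rows.append((mean_expression, fraction))
    return rows


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]:
    sum the probabilities of all tables no more likely than the observed one."""
    row1, col1, total = a + b, a + c, a + b + c + d

    def prob(k):
        return comb(col1, k) * comb(total - col1, row1 - k) / comb(total, row1)

    observed = prob(a)
    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Toy inputs (hypothetical): 20 genes, the five most expressed called essential.
expression = {f"g{i}": float(i) for i in range(20)}
called = {f"g{i}" for i in range(15, 20)}
bins = fraction_essential_by_bin(expression, called, n_bins=4)
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

For the homolog comparison, the table rows would be genes in and out of a called set, and the columns homologs of essential versus nonessential yeast genes.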
Supplementary Text and Figures
Supplementary Figures 1–11, Supplementary Discussion, Supplementary Methods (PDF 2398 kb)
Supplementary Data 1
Gold standard of essential and nonessential genes (XLSX 27 kb)
Supplementary Data 2
Raw counts for sequencing (XLSX 28624 kb)
Supplementary Data 3
Full results from Cas9 screens (XLSX 3505 kb)
Supplementary Data 4
Full results from shRNA screens (XLSX 4388 kb)
Supplementary Data 5
Previous results of shRNA screen for modulators of ricin toxicity (XLSX 1455 kb)
Supplementary Data 6
Previous results for CRISPR/Cas9 deletion screen (XLSX 1740 kb)
Supplementary Data 7
Previous results from CRISPRi and CRISPRa screens (XLSX 1986 kb)
Supplementary Data 8
Full results from screen combinations (XLSX 3585 kb)
Supplementary Data 9
GO term information (XLSX 867 kb)
Supplementary Data 10
Results of gene expression analysis (XLSX 161 kb)
Supplementary Data 11
Results from yeast essential comparison (XLSX 627 kb)
Cite this article
Morgens, D., Deans, R., Li, A. et al. Systematic comparison of CRISPR/Cas9 and RNAi screens for essential genes. Nat Biotechnol 34, 634–636 (2016). https://doi.org/10.1038/nbt.3567