Pooled CRISPR screens are a powerful tool for assessments of gene function. However, conventional analysis is based exclusively on the relative abundance of integrated single guide RNAs (sgRNAs) between populations, which does not discern distinct phenotypes and editing outcomes generated by identical sgRNAs. Here we present CRISPR-UMI, a single-cell lineage-tracing methodology for pooled screening to account for cell heterogeneity. We generated complex sgRNA libraries with unique molecular identifiers (UMIs) that allowed for screening of clonally expanded, individually tagged cells. A proof-of-principle CRISPR-UMI negative-selection screen provided increased sensitivity and robustness compared with conventional analysis by accounting for underlying cellular and editing-outcome heterogeneity and detection of outlier clones. Furthermore, a CRISPR-UMI positive-selection screen uncovered new roadblocks in reprogramming mouse embryonic fibroblasts as pluripotent stem cells, distinguishing reprogramming frequency and speed (i.e., effect size and probability). CRISPR-UMI boosts the predictive power, sensitivity, and information content of pooled CRISPR screens.
We acknowledge everybody involved in the generation of data and of the manuscript. We thank J. Zuber (Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Vienna, Austria) for the retroviral backbone used to generate the guide library, and K. Hochedlinger (Department of Molecular Biology, Cancer Center and Center for Regenerative Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA; Department of Stem Cell and Regenerative Biology and Harvard Stem Cell Institute, Cambridge, Massachusetts, USA; Howard Hughes Medical Institute, Chevy Chase, Maryland, USA) for provision of the Dox-inducible reprogramming system. We thank A. Stark for important suggestions and critical reading of the manuscript, J. Jude for critical discussion and sharing of protocols, and P. Svoboda for discussion and advice. We are grateful to all IMBA/IMP services, in particular bioinformatics, biooptics, molecular biology, media kitchen, and graphics for technical support, as well as to J. Brennecke and A. Andersen (Life Science Editors) for critical reading and editing of the manuscript. We thank the VBCF NGS facility. This work was supported by IMBA, the Austrian Academy of Sciences (OEAW), Novartis Institute of Biomedical Research, and AstraZeneca.
The authors declare no competing financial interests.
Integrated supplementary information
(a) Data analysis in CRISPR screens is conventionally based on several sgRNAs targeting the same gene. Introducing random barcodes at complexities well above the total number of analyzed cells tags each individual cell with a unique molecular identifier (UMI). This adds a third layer of information at the single-cell level, providing biological replicates. (b) Schematic illustration of the generation of single-cell-derived clones by limiting dilution followed by clonal expansion.
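The three layers of information described in (a), gene, sgRNA, and UMI, can be pictured as a nested count table. The following is a minimal sketch, assuming reads have already been parsed into (gene, sgRNA, barcode) tuples; the function names and the example barcodes are hypothetical and are not taken from the authors' pipeline.

```python
from collections import defaultdict

def group_reads(reads):
    """Count reads per gene -> sgRNA -> barcode (one barcode = one tagged clone)."""
    table = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for gene, sgrna, barcode in reads:
        table[gene][sgrna][barcode] += 1
    return table

def clones_per_guide(table):
    """Number of distinct barcodes (clones) observed for each sgRNA."""
    return {
        (gene, sgrna): len(barcodes)
        for gene, guides in table.items()
        for sgrna, barcodes in guides.items()
    }

# Hypothetical parsed reads: two clones carry guide_1, one clone carries guide_2.
example_reads = [
    ("Nhej1", "guide_1", "ACGTACGTAC"),
    ("Nhej1", "guide_1", "ACGTACGTAC"),
    ("Nhej1", "guide_1", "TTGCAAGGCT"),
    ("Nhej1", "guide_2", "GGATCCATGA"),
]
example_table = group_reads(example_reads)
example_clone_counts = clones_per_guide(example_table)
```

Each barcode under an sgRNA behaves like an independent replicate of that guide, which is what allows per-clone statistics in place of a single pooled read count.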
Doench scores, a measure of predicted sgRNA activity, were calculated for all exonic sgRNAs compatible with our cloning strategy. The scores were then penalized according to a ruleset for biological effects. The rules combine evaluation of exon length, prediction of protein domains, alternative splicing and ATG start codons, Pol III terminator sequences, and the position of the sgRNA within the CDS. The penalties also spread the selected sgRNAs over different exons and include off-target prediction penalties.
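To illustrate how a penalty ruleset of this kind can reorder guides, here is a minimal sketch covering only three of the listed rules (Pol III terminator, position within the CDS, and spreading over exons). All penalty weights, the 0.5 CDS cutoff, the `TTTT` terminator heuristic, and the `Guide` fields are assumptions made for illustration; the caption does not give the authors' actual values.

```python
from dataclasses import dataclass

# Hypothetical penalty weights, NOT the authors' values.
TERMINATOR_PENALTY = 0.5   # sgRNA contains a run of T's (possible Pol III terminator)
LATE_CDS_PENALTY = 0.2     # cut site lies in the second half of the CDS
SAME_EXON_PENALTY = 0.1    # exon already covered by a selected guide

@dataclass
class Guide:
    name: str
    sequence: str        # protospacer, 5'->3'
    doench_score: float  # predicted activity between 0 and 1
    exon: int
    cds_fraction: float  # relative cut position within the CDS, 0..1

def penalized_score(guide, exons_used):
    """Doench score minus the penalties triggered by this guide."""
    score = guide.doench_score
    if "TTTT" in guide.sequence.upper():
        score -= TERMINATOR_PENALTY
    if guide.cds_fraction > 0.5:
        score -= LATE_CDS_PENALTY
    if guide.exon in exons_used:
        score -= SAME_EXON_PENALTY
    return score

def select_guides(guides, n):
    """Greedily pick n guides, re-scoring after each pick to spread over exons."""
    selected, remaining = [], list(guides)
    while remaining and len(selected) < n:
        exons_used = {g.exon for g in selected}
        best = max(remaining, key=lambda g: penalized_score(g, exons_used))
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical candidates for one gene.
candidates = [
    Guide("g1", "GACGTACGATCGATCGACGA", 0.90, exon=1, cds_fraction=0.1),
    Guide("g2", "ACGATCGACGGACTAGCATG", 0.85, exon=1, cds_fraction=0.2),
    Guide("g3", "CAGTCAGGACTGACCATGCA", 0.80, exon=2, cds_fraction=0.3),
    Guide("g4", "GACGTTTTATCGATCGACGA", 0.95, exon=3, cds_fraction=0.1),
]
chosen = [g.name for g in select_guides(candidates, 2)]
```

In this toy example the guide with the highest raw score (`g4`) is not selected because of its T run, and `g3` is preferred over `g2` because it targets a different exon than `g1`.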
The CRISPR-UMI library is generated in two consecutive complex cloning steps. First, a random barcode of 10 nucleotides is integrated into the vector backbone. Next, the pool of 26,514 sgRNAs is ligated into the barcode library with more than 1,000 ligation events per sgRNA, so that each ligation event pairs the sgRNA with a different random barcode. The combination of sgRNAs and random barcodes therefore yields a complexity of more than 1,000 times the number of sgRNAs. We refer to this highly complex combination of sgRNA and barcode as a UMI (unique molecular identifier). Our library reached a complexity of 83 million.
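The numbers in this description can be checked with a few lines of arithmetic. The sketch below uses only the figures stated above (10-nt barcode, 26,514 sgRNAs, 83 million library complexity); the helper `expected_unique_fraction` is a hypothetical addition that assumes, as a simplification, that every UMI in the library is equally likely to be drawn.

```python
BARCODE_LENGTH = 10                 # random nucleotides per barcode (from the text)
N_SGRNAS = 26_514                   # sgRNAs in the pool (from the text)
REPORTED_COMPLEXITY = 83_000_000    # reported library complexity (from the text)

# Number of distinct 10-nt barcodes over the alphabet A/C/G/T.
barcode_space = 4 ** BARCODE_LENGTH

# Upper bound on distinct sgRNA-barcode combinations.
max_umis = N_SGRNAS * barcode_space

# Average number of distinct barcodes per sgRNA in the reported library.
umis_per_sgrna = REPORTED_COMPLEXITY / N_SGRNAS

def expected_unique_fraction(n_cells: int, n_umis: int) -> float:
    """Expected fraction of cells whose UMI is shared with no other cell.

    Assumes (hypothetically) uniform UMI representation: a given cell's UMI
    is missed by each of the other n_cells - 1 cells with probability
    1 - 1/n_umis.
    """
    return (1 - 1 / n_umis) ** (n_cells - 1)

# Example: one million infected cells drawn from the 83-million-UMI library.
unique_fraction = expected_unique_fraction(1_000_000, REPORTED_COMPLEXITY)
```

The barcode space of 4^10 = 1,048,576 sequences is consistent with more than 1,000 distinct barcodes per sgRNA, and the reported complexity corresponds to roughly 3,130 barcodes per sgRNA on average. Real libraries are not uniformly represented, so the uniqueness estimate is an optimistic bound.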
(a) Vector design for library generation. After pooled parallel cloning of a barcode of 10 random nucleotides into retroviral backbones at complexities of 10^6, chip-synthesized sgRNA pools (at a complexity of 26,514) were cloned into the UMI-containing backbone at a coverage of >1,000 clones per guide. Cassette-flanking PacI sites allowed liberation of small sgRNA-containing fragments from mammalian genomic DNA. (b) Library subpools and cloning complexity, resulting in an overall complexity of 83 million. (c) Ethidium bromide-stained agarose gel, 200 ng DNA per lane. Digestion of genomic DNA after screens, and of plasmid DNA as a control, with PacI (an enzyme with an eight-base recognition site) yields mostly large genomic fragments, whereas sgRNA fragments are 589 bp long (arrowhead). Long and short fragments can be fractionated using magnetic beads. (d) qPCR on genomic DNA (gDNA), PacI-digested gDNA, and size-separated fractions of digested gDNA. Error bars are s.d.; shown are two biologically independent experiments in technical triplicate, equivalent to Figure 2b. All data points are shown.
Supplementary Figure 5 A pilot screen to identify optimal conditions for UMI-based CRISPR screen analysis.
(a) Setup of the screen. After editing, various clonal outgrowth regimens, followed by clonal expansion and dropout screening, were run in parallel. Cas9 expression was induced by Dox, and cells harboring guide RNAs were selected with neomycin (G418). Limiting dilution and expansion were varied in the experiment. Cells were treated with or without 3.3 nM etoposide at LD30 for 8 days. (b) Scheme illustrating variation in clone number and size. (c) Average clone numbers and sizes determined from NGS data. (d) Distribution of single-cell-derived clones in each regimen, illustrated with guide_1 against Nhej1. The P value for each clone correlates with read depth, but higher read depth results in fewer data points. (e) Plot illustrating median dropout for each condition as well as the P value determined by combining multiple clones using MAGeCK. Signal-to-noise ratios (SNR) are highest for 148 clones of 35 reads, and the percentage of guides expected to have fewer than 5 clones owing to variability in representation is, at 0.06%, lower than for the 52- or 21-clone datasets.
Graphical illustration of the large-scale screen setup used to identify mutations that sensitize cells to etoposide. Cas9 expression was induced by doxycycline, and cells harboring guide RNAs were selected with neomycin (G418). After washout of doxycycline and neomycin, single-cell-derived clones were generated by limiting dilution and clonal expansion. Cells were treated with 3.3 nM etoposide or mock treated for 8 days.
(a) No strong outlier clones are detected in hits identified by both conventional analysis and CRISPR-UMI. (b) Strong outlier clones with very high read counts as well as depletion are seen in putative false-positive hits called by conventional analysis. (c) Genes identified only by CRISPR-UMI show modest but reproducible depletion in multiple independent clones but are often dominated by clones with high read counts that do not deplete.
Supplementary Figure 8 A pooled dropout screen without clonal outgrowth showing outliers resulting in false positive calls.
(a) Comparison of CRISPR-UMI with conventional screen analysis at the guide level, in the absence of clonal dilution and outgrowth, shows highly correlated results (yellow) as well as discrepancies between the two regimens (mixed colors; Pearson correlation: 0.729). (b) Whereas correlating sgRNAs do not contain strong outlier clones based on total read count, guides called only in conventional analysis show outlier clones that are responsible for the overall dropout. (c) Ranking of sgRNAs improves upon removal of outlier clones (top 3 clones by read count) from the dataset, illustrating their confounding effects.
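The confounding effect of outlier clones can be reproduced with a toy calculation. The sketch below is a simplified illustration, not the MAGeCK-based analysis used in the study: it compares a pooled log2 fold change (all reads of a guide summed), the same estimate after removing the top 3 clones by read count, and a per-clone median. The read counts, the pseudocount, and the choice to rank clones by total reads across both conditions are assumptions.

```python
import math
from statistics import median

def pooled_log2fc(clones, pseudocount=1.0):
    """Conventional-style estimate: sum reads over all clones, then compare."""
    control = sum(c for c, _ in clones)
    treated = sum(t for _, t in clones)
    return math.log2((treated + pseudocount) / (control + pseudocount))

def per_clone_median_log2fc(clones, pseudocount=1.0):
    """UMI-aware estimate: one fold change per clone, summarized by the median."""
    return median(
        math.log2((t + pseudocount) / (c + pseudocount)) for c, t in clones
    )

def drop_top_clones(clones, k=3):
    """Remove the k clones with the highest total read count."""
    ranked = sorted(clones, key=lambda ct: ct[0] + ct[1], reverse=True)
    return ranked[k:]

# Hypothetical (control_reads, treated_reads) per clone for a single sgRNA:
# one very large clone that drops out, plus ten small clones that do not change.
toy_clones = [(5000, 50)] + [(30, 30)] * 10

pooled = pooled_log2fc(toy_clones)
pooled_trimmed = pooled_log2fc(drop_top_clones(toy_clones, k=3))
clone_median = per_clone_median_log2fc(toy_clones)
```

With these toy numbers the pooled estimate suggests strong depletion, while both the trimmed and the per-clone estimates show no change, which mirrors how a single dominant clone can produce a false-positive call when clone identity is ignored.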
Supplementary Figure 9 A comparison of positive control guide detection in conventional analysis versus CRISPR-UMI.
Venn diagrams illustrating the number of sgRNAs targeting the positive controls of the NHEJ complex (Lig4, Xrcc4-6, Nhej1) called within the top 50/100 hits.
Supplementary Figure 10 Validation of the reprogramming efficiency and predicted size distribution of colonies by UMI analysis from NGS data.
(a) Median size distribution of read counts per UMI for each sgRNA. Reads for each UMI were filtered for sequencing errors, and median colony size is plotted relative to median size, illustrating a marked size increase per iPS colony in many but not all identified roadblocks of reprogramming. Center line, median; hinges, 25th and 75th percentiles; whiskers, median ± 1.58 × the interquartile range (IQR)/2. Individual data points represent outliers. (b) Alkaline phosphatase staining in 6-well dishes 10 days after Dox induction in the transgenic system, illustrating enhanced iPS colony formation for guides targeting Men1 or Pias1. (c) Representative colonies, for comparison with Figure 5e, stained with alkaline phosphatase in the validation experiment on day 10 after Dox.
Supplementary Figures 1–10 (PDF 5814 kb)
sgRNA library content. (XLSX 2028 kb)
Individual oligonucleotides used for this study. (XLSX 47 kb)
Experimental indices used in NGS experiments. (XLSX 44 kb)
About this article
Cite this article
Michlits, G., Hubmann, M., Wu, S. et al. CRISPR-UMI: single-cell lineage tracing of pooled CRISPR–Cas9 screens. Nat Methods 14, 1191–1197 (2017). https://doi.org/10.1038/nmeth.4466