CRISPR-UMI: single-cell lineage tracing of pooled CRISPR–Cas9 screens


Pooled CRISPR screens are a powerful tool for assessments of gene function. However, conventional analysis is based exclusively on the relative abundance of integrated single guide RNAs (sgRNAs) between populations, which does not discern distinct phenotypes and editing outcomes generated by identical sgRNAs. Here we present CRISPR-UMI, a single-cell lineage-tracing methodology for pooled screening to account for cell heterogeneity. We generated complex sgRNA libraries with unique molecular identifiers (UMIs) that allowed for screening of clonally expanded, individually tagged cells. A proof-of-principle CRISPR-UMI negative-selection screen provided increased sensitivity and robustness compared with conventional analysis by accounting for underlying cellular and editing-outcome heterogeneity and detection of outlier clones. Furthermore, a CRISPR-UMI positive-selection screen uncovered new roadblocks in reprogramming mouse embryonic fibroblasts as pluripotent stem cells, distinguishing reprogramming frequency and speed (i.e., effect size and probability). CRISPR-UMI boosts the predictive power, sensitivity, and information content of pooled CRISPR screens.


Figure 1: The conceptual framework of CRISPR screen analysis by single-cell tracing.
Figure 2: CRISPR-UMI library representation and post-screen enrichment.
Figure 3: Conventional and clonal analysis of negative-selection screens.
Figure 4: The performance of conventional analysis compared with that of CRISPR-UMI.
Figure 5: CRISPR-UMI analysis of a positive-selection screen to identify roadblocks to reprogramming.






We acknowledge everybody involved in the generation of data and of the manuscript. We thank J. Zuber (Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Vienna, Austria) for the retroviral backbone used to generate the guide library, and K. Hochedlinger (Department of Molecular Biology, Cancer Center and Center for Regenerative Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA; Department of Stem Cell and Regenerative Biology and Harvard Stem Cell Institute, Cambridge, Massachusetts, USA; Howard Hughes Medical Institute, Chevy Chase, Maryland, USA) for provision of the Dox-inducible reprogramming system. We thank A. Stark for important suggestions and critical reading of the manuscript, J. Jude for critical discussion and sharing of protocols, and P. Svoboda for discussion and advice. We are grateful to all IMBA/IMP services, in particular bioinformatics, biooptics, molecular biology, media kitchen, and graphics for technical support, as well as to J. Brennecke and A. Andersen (Life Science Editors) for critical reading and editing of the manuscript. We thank the VBCF NGS facility. This work was supported by IMBA, the Austrian Academy of Sciences (OEAW), Novartis Institute of Biomedical Research, and AstraZeneca.

Author information




G.M. and U.E. conceived the study. G.M. cloned the library and performed the etoposide screens and bioinformatic studies. M.H. generated the Cas9 inducible ESC line and supported follow-up experiments. E.B. and U.E. performed the iPSC screen, and S.-H.W., G.V., and U.E. validated it. T.R.B. and M.N. performed bioinformatic analyses. S.Z., Y.L., and D.S. supported experiments. M.A. generated the vector backbone for the sgRNA library. J.R.-H., R.N., and D.H. supported the design and generation of sgRNA libraries. U.E. wrote the manuscript with support from G.M., D.S., D.H., and all other co-authors.

Corresponding author

Correspondence to Ulrich Elling.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 A schematic illustration of UMI use in CRISPR–Cas9 screens.

(a) Data analysis in CRISPR screens is conventionally based on several sgRNAs targeting the same gene. Introducing random barcodes at a complexity well above the total number of analyzed cells tags each individual cell with a unique molecular identifier (UMI). This adds a third layer of information at the single-cell level and yields biological replicates. (b) Schematic illustration of the generation of single-cell-derived clones by limiting dilution followed by clonal expansion.
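
The three layers of information described in (a), namely gene, sgRNA, and UMI, can be illustrated with a minimal Python sketch. The read format, the toy guide names, and the function name `count_clones` are hypothetical and are not taken from the authors' pipeline; the sketch only shows how reads that share an sgRNA and a barcode collapse into one clone.

```python
from collections import Counter, defaultdict

def count_clones(reads):
    """Group reads into clones keyed by (sgRNA, barcode).

    `reads` is an iterable of (gene, sgrna, barcode) tuples; this input
    format is an assumption made for illustration only.
    Returns a nested mapping {gene: {sgrna: {barcode: read_count}}}.
    """
    per_clone = Counter(reads)  # one counter entry per (gene, sgrna, barcode)
    nested = defaultdict(lambda: defaultdict(dict))
    for (gene, sgrna, barcode), n_reads in per_clone.items():
        nested[gene][sgrna][barcode] = n_reads
    return nested

# Toy input: two guides against one gene, three clones in total.
reads = [
    ("Lig4", "Lig4_g1", "ACGTACGTAC"),
    ("Lig4", "Lig4_g1", "ACGTACGTAC"),
    ("Lig4", "Lig4_g1", "TTTTGGGGCC"),
    ("Lig4", "Lig4_g2", "ACGTACGTAC"),
]
clones = count_clones(reads)
```

Note that the same barcode sequence paired with two different sgRNAs counts as two distinct UMIs in this sketch, because the clone key is the sgRNA and barcode combination.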

Supplementary Figure 2 The bioinformatic pipeline of sgRNA prediction.

Doench scores, as a measure of predicted sgRNA activity, were calculated for all exonic sgRNAs compatible with our cloning strategy. The scores were then penalized according to a rule set for biological effects. These rules combine evaluation of exon length, prediction of protein domains, alternative splicing and ATG start codons, Pol III terminator sequences, and the position of the sgRNA within the CDS. The penalties also spread the selected sgRNAs over different exons and include off-target prediction penalties.

Supplementary Figure 3 A scheme illustrating the generation of CRISPR-UMI library complexity.

The CRISPR-UMI library is generated in two successive complex cloning steps. First, a random barcode of 10 nucleotides is integrated into the vector backbone. Next, the pool of 26,514 sgRNAs is ligated into the barcode library with more than 1,000 ligation events per sgRNA, so that each ligation event pairs the sgRNA with a different random barcode. The combination of sgRNA and random barcode therefore generates a complexity of more than 1,000 times the number of sgRNAs. We refer to this highly complex combination of sgRNA and barcode as the UMI (unique molecular identifier). Our library reached a complexity of 83 million.
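
The complexity figures in this legend can be checked with simple arithmetic. The sketch below uses only the numbers stated above; the variable names are ours, and the per-sgRNA average assumes an even distribution of UMIs across guides, which the legend does not claim.

```python
# Values taken from the legend above.
N_BARCODE_NT = 10                  # random barcode length in nucleotides
N_SGRNAS = 26_514                  # sgRNAs in the pool
MIN_LIGATIONS_PER_SGRNA = 1_000    # "more than 1,000", so this is a lower bound
REPORTED_COMPLEXITY = 83_000_000   # reported overall library complexity

# Number of distinct 10-nt barcodes over the alphabet {A, C, G, T}.
barcode_space = 4 ** N_BARCODE_NT

# Lower bound on sgRNA-barcode combinations implied by >1,000 ligations per sgRNA.
min_complexity = N_SGRNAS * MIN_LIGATIONS_PER_SGRNA

# Average UMIs per sgRNA at the reported complexity (assumes even representation).
mean_umis_per_sgrna = REPORTED_COMPLEXITY / N_SGRNAS

print(f"barcode space:            {barcode_space:,}")
print(f"minimum complexity:       {min_complexity:,}")
print(f"mean UMIs per sgRNA:      {mean_umis_per_sgrna:,.0f}")
```

The barcode space of 4^10 = 1,048,576 sequences matches a barcode complexity on the order of one million, and the reported 83 million combinations correspond to roughly 3,100 UMIs per sgRNA on average.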

Supplementary Figure 4 CRISPR-UMI sgRNA library and cassette amplification.

(a) Vector design for library generation. After pooled parallel cloning of a barcode of 10 random nucleotides into retroviral backbones at complexities of 10^6, chip-synthesized sgRNA pools (at a complexity of 26,514) were cloned into the UMI-containing backbone at a coverage of >1,000 clones per guide. Cassette-flanking PacI sites allowed small sgRNA-containing fragments to be liberated from mammalian genomic DNA. (b) Library subpools and cloning complexity, resulting in an overall complexity of 83 million. (c) Ethidium bromide-stained agarose gel, 200 ng DNA per lane. Digestion of genomic DNA after screens, and of plasmid DNA as a control, with PacI (which has an 8-bp recognition site) yields mostly large genomic fragments, whereas sgRNA fragments are 589 bp long (arrowhead). Long and short fragments can be fractionated using magnetic beads. (d) qPCR on genomic DNA (gDNA), PacI-digested gDNA, and size-separated fractions of digested gDNA. Error bars are s.d.; shown are two biologically independent experiments in technical triplicate, equivalent to Figure 2b; all data points are shown.

Supplementary Figure 5 A pilot screen to identify optimal conditions for UMI-based CRISPR screen analysis.

(a) Setup of the screen. After editing, various clonal outgrowth regimens, followed by clonal expansion and dropout screening, were run in parallel. Cas9 expression was induced by Dox, and cells harboring guide RNAs were selected with neomycin (G418). Limiting dilution and expansion were varied in the experiment. Cells were treated with or without 3.3 nM etoposide (LD30) for 8 days. (b) Scheme illustrating variation in clone number and size. (c) Average clone numbers and sizes determined from NGS data. (d) Distribution of single-cell-derived clones in each regimen, illustrated with guide_1 against Nhej1. The P value for each clone correlates with read depth, but greater depth per clone results in fewer data points. (e) Plot illustrating median dropout for each condition as well as the P value determined by combining multiple clones using MAGeCK. Signal-to-noise ratios (SNR) are highest for 148 clones of 35 reads, and the percentage of guides expected to have fewer than 5 clones owing to variability in representation is, at 0.06%, lower than for the 52- or 21-clone datasets.
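
The idea of a median dropout across clones can be sketched as follows. This is a simplified illustration and not the authors' pipeline: it does not reproduce MAGeCK, it omits library-size normalization, and the pseudocount, the input format, and the function names are assumptions.

```python
import math
from statistics import median

def clone_log2fc(control_reads, treated_reads, pseudocount=1.0):
    """log2 fold change of one clone; the pseudocount avoids division by zero."""
    return math.log2((treated_reads + pseudocount) / (control_reads + pseudocount))

def median_dropout(clone_counts, pseudocount=1.0):
    """Median log2 fold change over all clones of one sgRNA.

    `clone_counts` is a list of (control_reads, treated_reads) pairs,
    one pair per single-cell-derived clone (hypothetical input format).
    """
    return median(clone_log2fc(c, t, pseudocount) for c, t in clone_counts)

# Toy example: three clones of one guide with different degrees of depletion.
toy_clones = [(31, 7), (31, 15), (31, 31)]
toy_median = median_dropout(toy_clones)
```

Because the median is taken over clones rather than over summed reads, a single clone with an extreme read count cannot dominate the estimate for its guide.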

Supplementary Figure 6 The screen layout for a sensitizer screen against etoposide.

Graphical illustration of the large-scale screen setup used to identify sensitizing mutations for etoposide. Cas9 expression was induced by doxycycline, and cells harboring guide RNAs were selected with neomycin (G418). After washout of doxycycline and neomycin, single-cell-derived clones were generated by limiting dilution and clonal expansion. Cells were treated with 3.3 nM etoposide or mock treatment for 8 days.

Supplementary Figure 7 Read distributions of single-cell-derived clones.

(a) No strong outlier clones are detected in hits identified by both conventional analysis and CRISPR-UMI. (b) Strong outlier clones with very high read counts as well as depletion are seen in putative false-positive hits called by conventional analysis. (c) Genes identified only by CRISPR-UMI show modest but reproducible depletion in multiple independent clones but are often dominated by clones with high read counts that do not deplete.

Supplementary Figure 8 A pooled dropout screen without clonal outgrowth showing outliers resulting in false positive calls.

(a) Comparison of CRISPR-UMI with conventional screen analysis at the guide level, in the absence of clonal dilution and outgrowth, shows highly correlated results (yellow) as well as discrepancies between the two regimens (mixed colors; Pearson correlation, 0.729). (b) Whereas correlating sgRNAs do not contain strong outlier clones based on total read count, guides called only in conventional analysis show outlier clones that are responsible for the overall dropout. (c) Ranking of sgRNAs improves upon removal of outlier clones (top 3 clones by read count) from the dataset, illustrating their confounding effects.
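
The outlier removal described in (c) can be sketched as dropping the three clones with the highest read counts for each sgRNA before re-ranking. The function name, the input format, and the toy counts below are hypothetical; the legend does not state whether the top clones were ranked in control or treated samples, so the sketch takes a single read count per clone.

```python
def drop_top_clones(clone_reads, n_top=3):
    """Return a copy of `clone_reads` without the `n_top` largest clones.

    `clone_reads` maps barcode -> read count for one sgRNA
    (hypothetical input format).
    """
    ranked = sorted(clone_reads, key=clone_reads.get, reverse=True)
    return {barcode: clone_reads[barcode] for barcode in ranked[n_top:]}

# Toy example: three large clones dominate the total read count of one guide.
toy_counts = {"bc_a": 900, "bc_b": 40, "bc_c": 35, "bc_d": 30, "bc_e": 500, "bc_f": 450}
filtered = drop_top_clones(toy_counts)

share_of_top3 = 1 - sum(filtered.values()) / sum(toy_counts.values())
```

In this toy case the three largest clones account for most of the reads, so a conventional analysis based on summed reads would mainly reflect their behavior.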

Supplementary Figure 9 A comparison of positive control guide detection in conventional analysis versus CRISPR-UMI.

Venn diagrams illustrating the number of sgRNAs targeting the positive controls of the NHEJ complex (Lig4, Xrcc4-6, Nhej1) called within the top 50 or top 100 hits.

Supplementary Figure 10 Validation of the reprogramming efficiency and predicted size distribution of colonies by UMI analysis from NGS data.

(a) Median size distribution of read counts per UMI for each sgRNA. Reads for each UMI were filtered for sequencing errors, and median colony size is plotted relative to median size, illustrating a marked size increase per iPS colony in many but not all identified roadblocks of reprogramming. Center line, median; hinges, 25th and 75th percentiles; whiskers, median ± 1.58× the interquartile range (IQR)/2. Individual data points represent outliers. (b) Alkaline phosphatase staining in 6-well dishes 10 days after Dox induction in the transgenic system, illustrating enhanced iPS colony formation for guides targeting Men1 or Pias1. (c) Representative colonies, for comparison with Figure 5e, stained with alkaline phosphatase in the validation experiment on day 10 after Dox.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–10 (PDF 5814 kb)

Life Sciences Reporting Summary (PDF 160 kb)

Supplementary Table 1

sgRNA library content. (XLSX 2028 kb)

Supplementary Table 2

Individual oligonucleotides used for this study. (XLSX 47 kb)

Supplementary Table 3

Experimental indices used in NGS experiments. (XLSX 44 kb)


Cite this article

Michlits, G., Hubmann, M., Wu, S. et al. CRISPR-UMI: single-cell lineage tracing of pooled CRISPR–Cas9 screens. Nat Methods 14, 1191–1197 (2017).
