CRISPR-based genetic screens are accelerating biological discovery, but current methods have inherent limitations. Widely used pooled screens are restricted to simple readouts including cell proliferation and sortable marker proteins. Arrayed screens allow for comprehensive molecular readouts such as transcriptome profiling, but at much lower throughput. Here we combine pooled CRISPR screening with single-cell RNA sequencing into a broadly applicable workflow, directly linking guide RNA expression to transcriptome responses in thousands of individual cells. Our method for CRISPR droplet sequencing (CROP-seq) enables pooled CRISPR screens with single-cell transcriptome resolution, which will facilitate high-throughput functional dissection of complex regulatory mechanisms and heterogeneous cell populations.
We would like to thank J. Bigenzahn, A. Fauster, and M. Owusu (CeMM) for providing Cas9-expressing cell lines; M. Farlik for contributing to the Drop-seq setup; F. Müller and J. Menche for bioinformatic discussions; N. Winhofer for feedback on the illustrations; the Biomedical Sequencing Facility at CeMM for assistance with next-generation sequencing; and all members of the Bock lab for their help and advice. C.S. is supported by a Feodor Lynen Fellowship of the Alexander von Humboldt Foundation. C.B. is supported by a New Frontiers Group award of the Austrian Academy of Sciences and by an ERC Starting Grant (European Union's Horizon 2020 research and innovation programme, grant agreement no. 679146).
The authors declare no competing financial interests.
Integrated supplementary information
a) As a starting point for preparing the CROPseq-Guide-Puro plasmid, we amplified four PCR products (A, B, C, and D) from the LentiGuide-Puro plasmid with the indicated primer pairs. b) CROPseq-Guide-Puro was constructed from these four amplicons using the ligase cycling reaction (LCR). The assembly was directed by four overlapping bridge oligonucleotides to flip the order of parts C and D. This rearrangement places the hU6-gRNA cassette into the 3′ LTR, downstream of the EF-1a puromycin marker. c) To validate the duplication of the hU6-gRNA cassette during lentiviral integration, we performed PCRs with primers that bind to the hU6 promoter but face in opposite directions. Productive amplification can occur only when amplifying from a circular plasmid or following duplication of the cassette during viral integration. As templates, we used gDNA from LentiGuide-Puro transduced cells (lane 1, resulting in no amplification), a plasmid preparation of CROPseq-Guide-Puro (lane 2), or gDNA from CROPseq-Guide-Puro transduced cells (lane 3).
Supplementary Figure 2 Genome editing efficiencies of LentiGuide-Puro and CROPseq-Guide-Puro based on the T7 endonuclease assay
a) Clonally expanded HEK293T cell lines as well as a HEK293T bulk population were transduced with LentiGuide-Puro (LentiGuide) or CROPseq-Guide-Puro (CROP-seq) vectors containing a gRNA targeting the MBD1 locus (+) or targeting a different locus (−). Genome editing efficiencies for MBD1 were measured using the T7 endonuclease assay, which indicated highly similar performance between the two vectors. HEK293T clone 5 did not show any genome editing and was not used for further experiments. b) Table summarizing genome editing efficiencies for four cell lines (HEK293T, K562, Jurkat, KBM7) and two gRNAs (MBD1, DNMT3B).
Supplementary Figure 3 Configuration and validation of the droplet-based assay for single-cell transcriptome profiling
a) Setup of the Drop-seq workflow used as part of CROP-seq. b) Bioanalyzer trace of a typical cDNA library prepared with CROP-seq. c) Electropherogram of a sequencing-ready CROP-seq library after tagmentation. d) Doublet estimates based on a HEK293T (human) / 3T3 (mouse) mixing experiment across all detected cells and transcripts (without filtering). e) Percent of detected genes aligning to the human and mouse transcriptomes (filtered for cells with more than 500 detected genes). f) PCR duplication rates based on unique molecular identifiers (UMIs) in the HEK293T (human) / 3T3 (mouse) mixing experiment. g) Distribution of the distance of read mapping positions to the 3′ end of gene models (blue line) and their cumulative sum (red line). h) Detailed performance statistics for twelve CROP-seq experiments. Green and orange labels indicate different batches of Drop-seq beads, where batch 1 suffered from production problems affecting the cell barcodes, which have been bioinformatically corrected to improve the data quality of the affected samples.
a) gRNA representation in the T cell receptor (TCR) gRNA library, assessed by amplicon sequencing of the plasmid pool (top) and the gDNA of Jurkat cells at day 10 post transduction with CROPseq-Guide-Puro (bottom), both displayed as cumulative distribution plots. The fold change between the 10th and 90th percentile is highlighted as a measure of library imbalance, which expectedly increases upon transduction. b) Abundance of each gRNA shown as a heatmap. c) Scatterplots of gRNA abundance from amplicon libraries at day 10 versus the original plasmid library. Frequencies of detected gRNAs have been normalized to the evaluated reads in each experiment.
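The library imbalance measure in panel a can be illustrated with a small calculation. The following is a minimal sketch, not the authors' code: it assumes raw read counts per gRNA as input, normalizes them to frequencies, and uses linear interpolation between ranks for the percentiles. The function names `percentile` and `library_imbalance` are hypothetical.

```python
def percentile(sorted_values, q):
    """Percentile q (0 to 100) of a pre-sorted list, with linear interpolation."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def library_imbalance(read_counts):
    """Fold change between the 90th and 10th percentile of gRNA frequencies.

    read_counts: one read count per gRNA (assumed input format).
    """
    total = sum(read_counts)
    if total <= 0:
        raise ValueError("no reads")
    # Normalize to the evaluated reads so that experiments are comparable.
    freqs = sorted(count / total for count in read_counts)
    p10 = percentile(freqs, 10)
    p90 = percentile(freqs, 90)
    if p10 == 0:
        raise ValueError("10th percentile is zero; fold change is undefined")
    return p90 / p10
```

A perfectly balanced library gives a value of 1, and larger values indicate a wider spread of gRNA abundance.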
Supplementary Figure 5 Similarities and differences in the transcriptome response at the gRNA and gene level
a) Mean and standard deviation of pairwise distances (L2-norm) between CROP-seq transcriptomes for pairs of gRNAs that target the same gene (orange) or different genes (blue). Statistical significance was assessed with the Mann-Whitney U test. b) Pairwise distances as in panel a, shown separately for gRNAs targeting specific genes and for naïve as well as anti-CD3/CD28 stimulated cells. The dotted line indicates the 99th percentile of the distribution of distances between non-targeting gRNAs. c) Statistical significance of the transcriptome-wide effect induced by gRNAs targeting specific genes relative to cells with non-targeting gRNAs (based on the Mann-Whitney U test). The dotted line indicates a p-value of 0.01. d) As in panel c, but aggregated at the gene level by combining p-values using Fisher's method. Insets show scatterplots for the expression of two example genes with low (left) and high (right) systematic effects on the transcriptome compared to cells expressing non-targeting control gRNAs in the same stimulation condition (y-axis).
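Two of the quantities named in this caption, the L2-norm distance (panel a) and Fisher's method for combining p-values (panel d), can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the published analysis code: transcriptomes are assumed to be equal-length lists of expression values, and the function names are hypothetical. Fisher's statistic is X = -2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis; for even degrees of freedom the tail probability has the closed form used below.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2-norm) distance between two expression vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Uses the closed-form chi-square survival function for 2k degrees
    of freedom: sf(X) = exp(-X/2) * sum_{i<k} (X/2)^i / i!
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half_stat / i
        series += term
    return math.exp(-half_stat) * series
```

For example, `fisher_combine` applied to the per-gRNA p-values of one target gene yields a single gene-level p-value, assuming the gRNA-level tests are independent.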
Supplementary Figure 6 Unsupervised analysis of the transcriptome response to T cell receptor stimulation
a) Principal component analysis of CROP-seq transcriptomes for cells with an assigned gRNA. b) Principal component analysis of median gene expression aggregated across cells expressing the same gRNA. c) Principal component analysis of median gene expression aggregated across cells expressing gRNAs targeting the same gene. The first principal component provided best separation between naïve and anti-CD3/CD28 stimulated cells, and the genes on the 99th percentile of loading contributions were selected as the CROP-seq derived TCR activation signature (n = 165 genes). d) Genes of the TCR activation signature and their absolute loading values for the first principal component in panel c. e) Principal component analysis as in panel a, with cells colored by the expression of three marker genes selected as part of the TCR activation signature. f) Enriched pathways and biological processes of genes in panel d, as identified by Enrichr. The combined enrichment score (p-value * z-score) is displayed for the top 8 terms of each gene set library.
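The signature selection described in panel c (genes in the 99th percentile of loadings on the first principal component) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a matrix with one row per target gene group and one column per measured gene, estimates the first principal component by power iteration rather than a full decomposition, and selects genes by absolute loading. The names `first_pc_loadings` and `signature_genes` are hypothetical.

```python
import math


def first_pc_loadings(matrix, n_iter=200):
    """Loadings of the first principal component, by power iteration.

    matrix: rows are samples, columns are genes (assumed layout).
    The sign of the returned vector is arbitrary.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in matrix]
    # Deterministic start vector; it must not be orthogonal to the first PC.
    v = [float(j + 1) for j in range(n_cols)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        # Multiply by X^T X without forming the gene-by-gene matrix.
        scores = [sum(r[j] * v[j] for j in range(n_cols)) for r in centered]
        w = [sum(centered[i][j] * scores[i] for i in range(n_rows))
             for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data
        v = [x / norm for x in w]
    return v


def signature_genes(gene_names, loadings, q=99.0):
    """Genes whose absolute loading is at or above the q-th percentile."""
    abs_sorted = sorted(abs(x) for x in loadings)
    pos = (len(abs_sorted) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(abs_sorted) - 1)
    threshold = abs_sorted[lo] + (abs_sorted[hi] - abs_sorted[lo]) * (pos - lo)
    return [g for g, x in zip(gene_names, loadings) if abs(x) >= threshold]
```

In practice a library routine for principal component analysis would replace the power iteration; the percentile cutoff on the loadings is the step that defines the signature.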
Supplementary Figure 7 Positioning cells and target genes on a spectrum defined by naive and stimulated cell states
a) Hierarchical clustering of single-cell transcriptomes with unambiguously assigned gRNA (n = 5,798) based on all genes included in the TCR activation signature. Clustering for cells and genes used the Pearson correlation, and the z-score of expression is displayed along with the stimulation state for each cell (left column). b) Hierarchical clustering of median gene expression values aggregated across cells expressing gRNAs for the same target gene (n = 107). c) Analytical procedure for assigning each cell to a specific position on a spectrum defined by the CROP-seq transcriptomes of non-targeted cells in the naive and the anti-CD3/CD28 stimulated cell state. In a first step (left), a matrix of synthetic transcriptome signatures (Z) is built by linear combination of the transcriptomes in the two defining cell states (μA, μB). In a second step (right), the position of the Z matrix that shows the maximum Pearson correlation with the transcriptome of the cells (E matrix) is taken as the cell's position along the spectrum of cell states. d) Correlation (row-wise z-score) of single-cell transcriptomes with a matrix comprising synthetic mixtures of transcriptome profiles between the median of non-targeted cells in both conditions (values close to one reflect similarity with anti-CD3/CD28 stimulated cells). Additional data on the transcriptome quality for each cell (unique reads per cell) and the overall correlation performance are shown as columns and reveal no relationship with the inferred position of the corresponding cells. All cells were ordered by the inferred signature value (position with maximum correlation), rather than being clustered as in panel a. e) Same as in panel c, but for cells grouped by gRNA target genes.
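The two-step procedure in panel c can be sketched as follows. This is an illustrative sketch under assumptions, not the published code: μA and μB are taken as equal-length lists of signature gene expression values for the naive and stimulated reference states, the mixing weights are evenly spaced between 0 and 1, and the number of steps (101) is an arbitrary choice. All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns -1.0 if either vector is constant.

    The -1.0 fallback is an assumption of this sketch, chosen so that an
    undefined correlation is never selected as the maximum.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        return -1.0
    return cov / math.sqrt(vx * vy)


def synthetic_signatures(mu_a, mu_b, n_steps=101):
    """Step 1: rows of Z are linear combinations (1 - w) * mu_a + w * mu_b."""
    weights = [i / (n_steps - 1) for i in range(n_steps)]
    return [[(1.0 - w) * a + w * b for a, b in zip(mu_a, mu_b)]
            for w in weights]


def position_on_spectrum(expression, mu_a, mu_b, n_steps=101):
    """Step 2: weight w of the row of Z that correlates best with the cell.

    Returns a value between 0 (like mu_a, naive) and 1 (like mu_b,
    stimulated).
    """
    z = synthetic_signatures(mu_a, mu_b, n_steps)
    best = max(range(n_steps), key=lambda i: pearson(expression, z[i]))
    return best / (n_steps - 1)
```

Because the Pearson correlation ignores overall scale and offset, the inferred position depends on the shape of the expression profile across signature genes rather than on sequencing depth.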
a) Hierarchical clustering of the median expression of TCR activation signature genes, based on bulk RNA-seq data aggregated across gRNAs for the same target gene. Clustering of rows and columns used the Pearson correlation, and z-scores of expression are shown. b) Correlation (row-wise z-score) of bulk RNA-seq transcriptomes with a matrix comprising synthetic mixtures of transcriptome profiles between the median of non-targeted cells in both conditions (values close to one reflect similarity with anti-CD3/CD28 stimulated cells). Additional metrics are shown as columns (center) and reveal no relationship with the inferred position of the corresponding samples. The effect that perturbing each target gene had on the TCR activation signature based on the bulk RNA-seq data was assessed in comparison to the control group (barplot on the right). c) Correlation (top) of median expression levels based on CROP-seq aggregating across target genes (left) or gRNAs (right) compared to the respective bulk RNA-seq libraries across all TCR activation signature genes. The corresponding number of cells in each group is shown at the bottom. Empty values reflect lack of matching bulk RNA-seq libraries due to failed samples in the bulk RNA-seq. d) Comparison of the signatures inferred from CROP-seq (x-axis) with those derived from bulk RNA-seq data (y-axis) across all shared gRNAs. e) Comparison of the relative impact of each gRNA (left) or target gene (right) on the TCR activation signature relative to the corresponding control for CROP-seq (x-axis) and bulk RNA-seq (y-axis).
a) Experimental strategy of the arrayed validation screen. For this screen, 48 CROPseq-Guide-Puro constructs were individually cloned, targeting 20 genes with two gRNAs each and including eight non-targeting controls. Lentivirus production and transductions were performed in 96-well plates, and cells were expanded for 10 days under puromycin and blasticidin selection. Cells were then split into two parts, serum starved for three hours, and subjected to either anti-CD3/CD28-stimulation or continuous starvation for another four hours. The resulting cell populations were validated by Sanger sequencing of the corresponding gRNAs. For validation of the CROP-seq signature, bulk RNA-seq was performed using a 3′ enrichment protocol, yielding data similar to Drop-seq (n = 87 RNA-seq libraries). As a complementary single-cell and protein-based read-out, flow cytometry (n = 96 samples) was performed for surface markers enriched in the TCR induction signature derived from CROP-seq (CD69, CD82) or previously reported as markers of T cell activation (CD25, CD38, CD154, PD-1). b) Examples of marker expression changes for TCR pathway activators identified by CROP-seq (ZAP70, LCK, LAT). c) Scatterplots comparing protein levels for TCR induction markers to RNA expression values obtained by CROP-seq.
Supplementary Figures 1–9 and Supplementary Protocol (PDF 9631 kb)
Oligonucleotide sequences used for developing and validating CROP-seq (XLSX 12 kb)
gRNA library used for the CROP-seq T cell receptor screen (XLSX 21 kb)
Arrayed validation screen for the CROP-seq T cell receptor screen (XLSX 28 kb)
CROPseq-Guide-Puro plasmid sequence (TXT 16 kb)
Source code for CROP-seq computational analyses (ZIP 69 kb)
About this article
Cite this article
Datlinger, P., Rendeiro, A., Schmidl, C. et al. Pooled CRISPR screening with single-cell transcriptome readout. Nat Methods 14, 297–301 (2017). https://doi.org/10.1038/nmeth.4177