Single-cell CRISPR screens enable the exploration of mammalian gene function and genetic regulatory networks. However, use of this technology has been limited by reliance on indirect indexing of single-guide RNAs (sgRNAs). Here we present direct-capture Perturb-seq, a versatile screening approach in which expressed sgRNAs are sequenced alongside single-cell transcriptomes. Direct-capture Perturb-seq enables detection of multiple distinct sgRNA sequences from individual cells and thus allows pooled single-cell CRISPR screens to be easily paired with combinatorial perturbation libraries that contain dual-guide expression vectors. We demonstrate the utility of this approach for high-throughput investigations of genetic interactions and, leveraging this ability, dissect epistatic interactions between cholesterol biogenesis and DNA repair. Using direct capture Perturb-seq, we also show that targeting individual genes with multiple sgRNAs per cell improves efficacy of CRISPR interference and activation, facilitating the use of compact, highly active CRISPR libraries for single-cell screens. Last, we show that hybridization-based target enrichment permits sensitive, specific sequencing of informative transcripts from single-cell RNA-seq experiments.
Raw and processed sequencing data are available at Gene Expression Omnibus under accession code GSE146194.
Cell Ranger 3.0 is available from 10x Genomics (https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest). Our previously published analytic framework for Perturb-seq analysis (ref. 17) is available at https://github.com/thomasmaxwellnorman/Perturbseq_GI. Python scripts and Jupyter notebooks for direct capture guide identity assignment are available at https://github.com/josephreplogle/guide_calling. Python Jupyter notebooks for the design of hybridization capture probes are available at https://github.com/josephreplogle/target_enrichment.
Packer, J. & Trapnell, C. Single-cell multi-omics: an engine for new quantitative models of gene regulation. Trends Genet. 34, 653–665 (2018).
Feldman, D. et al. Optical pooled screens in human cells. Cell 179, 787–799.e17 (2019).
Rubin, A. J. et al. Coupled single-cell CRISPR screening and epigenomic profiling reveals causal gene regulatory networks. Cell 176, 361–376.e17 (2018).
Adamson, B. et al. A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867–1882.e21 (2016).
Dixit, A. et al. Perturb-Seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866.e17 (2016).
Jaitin, D. et al. Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-seq. Cell 167, 1883–1896.e15 (2016).
Xie, S., Duan, J., Li, B., Zhou, P. & Hon, G. C. Multiplexed engineering and analysis of combinatorial enhancer activity in single cells. Mol. Cell 66, 285–299.e5 (2017).
Datlinger, P. et al. Pooled CRISPR screening with single-cell transcriptome readout. Nat. Methods 14, 297–301 (2017).
Hill, A. J. et al. On the design of CRISPR-based single-cell molecular screens. Nat. Methods 15, 271 (2018).
Adamson, B., Norman, T. M., Jost, M. & Weissman, J. S. Approaches to maximize sgRNA-barcode coupling in Perturb-seq screens. Preprint at bioRxiv https://doi.org/10.1101/298349 (2018).
Xie, S., Cooley, A., Armendariz, D., Zhou, P. & Hon, G. C. Frequent sgRNA-barcode recombination in single-cell perturbation assays. PLoS ONE 13, e0198635 (2018).
Feldman, D., Singh, A., Garrity, A. J. & Blainey, P. C. Lentiviral co-packaging mitigates the effects of intermolecular recombination and multiple integrations in pooled genetic screens. Preprint at bioRxiv https://doi.org/10.1101/262121 (2018).
Gasperini, M. et al. A genome-wide framework for mapping gene regulation via cellular genetic screens. Cell 176, 377–390.e19 (2019).
Smits, A. H. et al. Biological plasticity rescues target activity in CRISPR knock outs. Nat. Methods 16, 1087–1093 (2019).
Anzalone, A. V. et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149–157 (2019).
Ran, F. A. et al. Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell 154, 1380–1389 (2013).
Norman, T. M. et al. Exploring genetic interaction manifolds constructed from rich single-cell phenotypes. Science 365, 786–793 (2019).
Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
Zheng, G. X. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
Mimitou, E. P. et al. Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells. Nat. Methods 16, 409–412 (2019).
Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).
Peterson, V. M. et al. Multiplexed quantification of proteins and transcripts in single cells. Nat. Biotechnol. 35, 936–939 (2017).
Zhang, S.-Q. et al. High-throughput determination of the antigen specificities of T cell receptors in single cells. Nat. Biotechnol. 36, 1156–1159 (2018).
Gilbert, L. A. et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell 159, 647–661 (2014).
Gilbert, L. A. et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442–451 (2013).
Mandegar, M. A. et al. CRISPR interference efficiently induces specific and reversible gene silencing in human iPSCs. Cell Stem Cell 18, 541–553 (2016).
Horlbeck, M. A. et al. Mapping the genetic landscape of human cells. Cell 174, 953–967.e22 (2018).
Horlbeck, M. A. et al. Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation. eLife 5, e19760 (2016).
Moreno, A. M. et al. In situ gene therapy via AAV-CRISPR-Cas9-mediated targeted gene regulation. Mol. Ther. 26, 1818–1827 (2018).
Savell, K. E. et al. A neuron-optimized CRISPR/dCas9 activation system for robust and specific gene regulation. eNeuro 6, https://doi.org/10.1523/ENEURO.0495-18.2019 (2019).
Cleary, B., Cong, L., Cheung, A., Lander, E. S. & Regev, A. Efficient generation of transcriptomic profiles by random composite measurements. Cell 171, 1424–1436.e18 (2017).
Subramanian, A. et al. A next generation connectivity map: L1000 platform and the first 1,000,000 profiles. Cell 171, 1437–1452.e17 (2017).
Salomon, R. et al. Droplet-based single cell RNAseq tools: a practical guide. Lab. Chip 19, 1706–1727 (2019).
Saikia, M. et al. Simultaneous multiplexed amplicon sequencing and transcriptome profiling in single cells. Nat. Methods 16, 59–62 (2019).
Vallejo, A. F. et al. Resolving cellular systems by ultra-sensitive and economical single-cell transcriptome filtering. Preprint at bioRxiv https://doi.org/10.1101/800631 (2019).
Chan, M. M. et al. Molecular recording of mammalian embryogenesis. Nature 570, 77–82 (2019).
We thank S.E. Vazquez, A. Guna, M. Jost, D. Yang, R. Saunders, X. Qiu, E. Chow, R. Sit and all members of the Weissman and Adamson laboratories and 10x Genomics for helpful discussions. This work was funded by National Institutes of Health grant nos. P50 GM102706, U01 CA168370, R01 DA036858 and RM1HG009490 (all to J.S.W.), the Defense Advanced Research Projects Agency (DARPA) (grant no. HR0011-19-2-0007), the Chan Zuckerberg Initiative and Princeton University. J.S.W. is a Howard Hughes Medical Institute Investigator. J.M.R. is an NIH/NINDS Ruth L. Kirschstein National Research Service Award fellow (no. F31 NS115380). T.M.N. is a fellow of the Damon Runyon Cancer Research Foundation (no. DRG-(2211-15)). J.A.H. is the Rebecca Ridley Kry Fellow of the Damon Runyon Cancer Research Foundation (no. DRG-2262-16). J.C. is funded by the Jane Coffin Childs Memorial Fund for Medical Research and the NIH K99/R00 Pathway to Independence Award (no. GM134154).
10x Genomics was involved in producing this work. I.T.F., J.G.A., L.J.A., K.A.P., E.J.M., J.M.T., D.P.R., N.S. and T.S.M. are employees of 10x Genomics. The Regents of the University of California with T.M.N., J.S.W. and B.A. as inventors have filed patent applications related to CRISPRi/a screening, Perturb-seq and GI mapping. J.S.W. consults for and holds equity in KSQ Therapeutics, Maze Therapeutics and Tenaya Therapeutics; is a venture partner at 5AM Ventures and is a member of the Amgen Scientific Advisory Board. J.M.R. and T.M.N. consult for Maze Therapeutics. B.A. is a member of a ThinkLab Advisory Board for and holds equity in Celsius Therapeutics.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Integrated supplementary information
Supplementary Figure 1 Optimization of modified guide constant regions to enable 3’ direct capture Perturb-seq.
a) Schematic of 3’ single-cell RNA-sequencing (3’ scRNA-seq). Polyadenylated mRNAs from individual cells (top, light blue) anneal to barcoded oligo-dT primers in emulsion droplets (delivered to droplets on gel beads) and are reverse transcribed into indexed cDNA (bottom). TSO, template switch oligo. UMI, unique molecular identifier. CBC, cell barcode. b) Schematic of 5’ single-cell RNA-sequencing (5’ scRNA-seq). Polyadenylated mRNAs from individual cells (top, light blue) anneal to unbarcoded oligo-dT primers in emulsion droplets (delivered to droplets as free oligos) and are reverse transcribed. Indexing of cDNA (bottom) occurs when template switching allows for extension of barcoded TSOs (delivered to droplets on gel beads). c) Schematic of constant region 1 (CR1) guide RNAs. Arrows indicate the positions of capture sequence insertions. d) CRISPRi activity of guides carrying the indicated capture sequences (all programmed with an identical GFP targeting sequence) in GFP+ K562 dCas9-KRAB cells 10 days post-transduction. Data from guides selected for direct capture experiments (sgRNA-CR1cs1 and sgRNA-CR1cs2) are indicated. For comparison, data from standard guides targeting GFP (programmed with the same targeting region but without capture sequences and expressed from 3 other vectors) were also included. One of these, indicated as “CROP-seq”, has a different, previously published constant region (Datlinger et al., Nature Methods, 14-3, 2017; ref. 8) and is expressed from a different promoter. Data represent the average of independently infected triplicates normalized to controls ± standard deviation. The data were collected in two separate batches (independently controlled). Representative flow cytometry gating for one sample is also shown. e) Gaussian kernel density estimates of normalized flow-cytometry measurements representing GFP expression demonstrate CRISPRi activity of the indicated sgRNAs (programmed with an identical GFP targeting sequence). Data were collected in three independent biological replicates and a representative replicate is shown. AU, arbitrary units. f) Gaussian kernel density estimates of normalized flow-cytometry measurements representing GFP expression demonstrate CRISPRi activity of the indicated sgRNAs (programmed with an identical GFP targeting region). Data were collected in three independent biological replicates and a representative replicate is shown. AU, arbitrary units. g) Schematic of experimental workflow for direct capture Perturb-seq (3’ or 5’) based on protocols from 10x Genomics. Red indicates generation of sequencing libraries. Box details construction of index sequencing library for GBC Perturb-seq, which is based on a previously published protocol (Adamson et al., Cell, 167-7, 2016; ref. 4).
Supplementary Figure 2 Cell indexing by direct guide capture is robust and comparable to indexing by GBC capture.
a) Box plot of total index (GBC or guide) UMI counts per cell for all cells (prior to guide identity mapping). Box plots denote quartile ranges (box), median (center mark), and 1.5 × interquartile range (whiskers). Data represent n=10036 cells for 3’ GBC, n=8267 cells for 3’ sgRNA-CR1cs2, n=8727 cells for 3’ sgRNA-CR1cs1, n=6789 cells for 5’ sgRNA-CR1, and n=7043 cells for 5’ sgRNA-CR1cs1. Several direct capture methods gave higher index capture than the GBC-based method (Mann-Whitney two-sided U test: U=65, p=2e-9 for 3’ sgRNA-CR1cs1 capture; U=64, p=2e-9 for 5’ sgRNA-CR1cs1 capture; U=32, p=1e-10 for 5’ sgRNA-CR1 capture), while 3’ capture of sgRNA-CR1cs2 had modestly lower capture (Mann-Whitney two-sided U test: U=788, p=0.0002). b) Representative guide identity mapping. Data correspond to NegCtrl3 in 3’ sgRNA-CR1cs1 Perturb-seq (n=8727 cells). Guide identity mapping relies on fitting a 2-component Poisson and Gaussian mixture model (black line), where cells with a posterior probability >0.5 (dotted line) of belonging to the upper mode component are assigned to NegCtrl3. c) Median index UMI counts per cell (capture rate) for cells assigned to each guide identity in Perturb-seq experiments (n=32 guides per experiment). Across platforms, NegCtrl2 has the worst capture rate, which may be explained by the fact that this negative control guide has a targeting sequence containing an extended run of guanine nucleotides (5’-GCGATGGGGGGGTGGGTAGC-3’). Data plotted here are also plotted in Figure 1c. d) For each pairwise comparison of Perturb-seq experiments, we calculated a Pearson correlation of guide capture rates (n=32 guides). Across experiments performed with direct capture Perturb-seq, guide capture rates are correlated (r>0.6), suggesting that targeting sequence-dependent features influence guide capture. e) Box plots of median guide UMIs per cell stratified by the 5’ terminal nucleotides of the targeting region. Box plots denote quartile ranges (box), median (center mark), and 1.5 × interquartile range (whiskers). The displayed data are from Perturb-seq by 5’ sgRNA-CR1 capture. There is a significant relationship between capture rate and the 5’ terminal nucleotides (Kruskal-Wallis H-test: n=32 guides, H=10.2, p=0.017 for 5’ sgRNA-CR1). f) Box plots of median guide UMIs per cell stratified by the 5’ terminal nucleotides of the targeting region. Box plots denote quartile ranges (box), median (center mark), and 1.5 × interquartile range (whiskers). The displayed data are from Perturb-seq by 3’ sgRNA-CR1cs1 capture. There is a significant relationship between capture rate and the 5’ terminal nucleotides (Kruskal-Wallis H-test: n=32 guides, H=10.9, p=0.012 for 3’ sgRNA-CR1cs1). g) Scatterplot of the median number of UMIs per cell and targeting region GC content for each of n=32 guides. The displayed data are from Perturb-seq by 5’ sgRNA-CR1 capture. For all platforms, we observed no significant Pearson correlation between capture rate and targeting region GC content (p>0.4 for all platforms). h) Identity assignment rates per guide for Perturb-seq experiments. Balanced representation among cells assigned to each of n=32 guides (with intentionally 4-fold overrepresented negative controls) was achieved by titering lentiviruses prior to pooling.
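The guide identity mapping described in panel b can be illustrated with a small expectation-maximization fit. This is a minimal sketch and not the authors' implementation (their scripts are in the guide_calling repository listed above). It assumes, for illustration only, that the model is fitted to log2(UMI + 1) values, that the Poisson component describes background and the Gaussian component describes the upper mode, and that the Poisson density is extended to non-integer values through the gamma function. The function names `upper_mode_posteriors` and `assign_guide` are hypothetical.

```python
import math


def poisson_logpdf(x, lam):
    """Poisson log density with rate lam, extended to non-integer x via lgamma
    (an assumption made here so the model can be fitted to log-scaled counts)."""
    return x * math.log(lam) - lam - math.lgamma(x + 1.0)


def normal_logpdf(x, mu, sigma):
    """Gaussian log density with mean mu and standard deviation sigma."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)


def upper_mode_posteriors(umi_counts, n_iter=100):
    """Fit a Poisson (lower mode) + Gaussian (upper mode) mixture by EM.

    Returns, per cell, the posterior probability of the Gaussian component.
    Expects at least two cells.
    """
    x = [math.log2(c + 1.0) for c in umi_counts]
    ordered = sorted(x)
    half = len(ordered) // 2
    # Crude initialisation: lower half seeds the Poisson, upper half the Gaussian.
    lam = max(sum(ordered[:half]) / half, 1e-3)
    upper = ordered[half:]
    mu = sum(upper) / len(upper)
    sigma = max(math.sqrt(sum((v - mu) ** 2 for v in upper) / len(upper)), 0.05)
    weight = 0.5  # mixing proportion of the Gaussian component

    post = [0.5] * len(x)
    for _ in range(n_iter):
        # E-step: posterior of the Gaussian component for every cell.
        post = []
        for v in x:
            log_lo = math.log(1.0 - weight) + poisson_logpdf(v, lam)
            log_hi = math.log(weight) + normal_logpdf(v, mu, sigma)
            top = max(log_lo, log_hi)  # subtract the max for numerical stability
            hi = math.exp(log_hi - top)
            lo = math.exp(log_lo - top)
            post.append(hi / (hi + lo))
        # M-step: re-estimate parameters from the soft assignments.
        n_hi = max(sum(post), 1e-9)
        n_lo = max(len(x) - sum(post), 1e-9)
        weight = min(max(n_hi / len(x), 1e-6), 1.0 - 1e-6)
        lam = max(sum((1.0 - p) * v for p, v in zip(post, x)) / n_lo, 1e-3)
        mu = sum(p * v for p, v in zip(post, x)) / n_hi
        sigma = max(math.sqrt(sum(p * (v - mu) ** 2 for p, v in zip(post, x)) / n_hi), 0.05)
    return post


def assign_guide(umi_counts, threshold=0.5):
    """Call a guide present in a cell when the upper-mode posterior exceeds threshold."""
    return [p > threshold for p in upper_mode_posteriors(umi_counts)]
```

Calling `assign_guide` on the per-cell UMI counts of one guide returns one boolean per cell; the 0.5 threshold mirrors the posterior cutoff stated in the caption, while the initialisation and the floors on the parameters are choices made only for this sketch.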
Supplementary Figure 3 Direct capture Perturb-seq performs comparably to GBC Perturb-seq for phenotypic analysis.
a) Mean target knockdown (fraction mRNA remaining) for each targeting guide (n=30) in the indicated experiments. For each guide, the data point represents the mean normalized expression level of the target gene across cells bearing the corresponding guide divided by the mean normalized expression level of the target gene in control cells (NegCtrl3). b) The Pearson correlation of pseudo-bulk expression profiles from direct capture Perturb-seq and GBC Perturb-seq experiments for each perturbation (n=30 targeting guides). Profiles were generated from the top 100 most differentially expressed genes in GBC Perturb-seq. Grey lines indicate medians. c) Scatterplot indicates the relationship between the number of differentially expressed genes for each guide (determined by a two-sided, two-sample Kolmogorov-Smirnov test using GBC Perturb-seq data) and the Pearson correlation of pseudo-bulk expression profiles between GBC Perturb-seq and direct capture Perturb-seq on the indicated platform (n=30 targeting guides per platform). d) Scatterplots of the balanced accuracy of random forest classifiers trained to distinguish perturbed and unperturbed (NegCtrl3) cells for each of n=30 targeting guides on the indicated platforms. Direct capture Perturb-seq accuracies were highly correlated with GBC Perturb-seq (Pearson correlation: r=0.91 for 3’ sgRNA-CR1cs1 capture; r=0.90 for 5’ sgRNA-CR1 capture). We failed to detect significant differences in performance between direct capture Perturb-seq and GBC Perturb-seq (Wilcoxon signed-rank two-sided test: p=0.2 for 3’ sgRNA-CR1cs1; p=0.6 for 5’ sgRNA-CR1 capture).
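Two of the quantities in this caption reduce to short formulas: the fraction of mRNA remaining (panel a) and the balanced accuracy of a classifier (panel d). The following is a minimal sketch under the assumption that per-cell normalized expression values and classifier labels are available as plain Python lists; the function names are hypothetical, and the random forest classifier itself is not reproduced here.

```python
def fraction_mrna_remaining(perturbed_expr, control_expr):
    """Mean normalized target expression in guide-bearing cells divided by
    the mean normalized target expression in control cells."""
    mean_perturbed = sum(perturbed_expr) / len(perturbed_expr)
    mean_control = sum(control_expr) / len(control_expr)
    return mean_perturbed / mean_control


def balanced_accuracy(true_labels, predicted_labels):
    """Average of sensitivity and specificity for binary labels
    (truthy = perturbed, falsy = unperturbed control)."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t and p)
    true_neg = sum(1 for t, p in pairs if not t and not p)
    n_pos = sum(1 for t, _ in pairs if t)
    n_neg = len(pairs) - n_pos
    return 0.5 * (true_pos / n_pos + true_neg / n_neg)
```

Balanced accuracy is a natural choice when perturbed and control populations differ in size, because a classifier that labels every cell as the majority class scores only 0.5.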
Supplementary Figure 4 Direct capture Perturb-seq allows for robust guide assignment and phenotypic analysis in iPSCs with CRISPR cutting (CRISPRn).
a) Box plots displaying the index (guide) UMIs per cell, mRNA UMIs per cell, and transcripts per cell in iPSCs expressing Cas9 (n=5300 cells). Box plots denote quartile ranges (box), median (center mark), and 1.5 × interquartile range (whiskers). b) Median index (guide) UMI counts per cell (capture rate) for cells assigned to each guide identity. c) Heatmap represents gene expression of the most differentially expressed genes across all cells with the indicated genetic perturbation, as determined by a random forest classifier. Expression values are the z-scored expression relative to unperturbed cells. Profiles for each gene are calculated by averaging the pseudo-bulk expression profiles of the two independent sgRNAs targeting the gene (n=19 genes targeted by two sgRNAs each). d) Box plot of Pearson correlations of pseudo-bulk expression profiles caused by sgRNAs targeting the same gene (n=19) versus sgRNAs targeting different genes (n=684 pairs). Box plots denote quartile ranges (box), median (center mark), and 1.5 × interquartile range (whiskers). sgRNAs targeting the same gene cause significantly more similar profiles than sgRNAs targeting different genes (Mann-Whitney two-sided U test: U=1636.0, p=0.0002). Differences in expression profiles caused by sgRNAs targeting the same genes are likely due to variation of CRISPR cutting efficacy, indel profiles, and/or genetic compensation.
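The pair counts in panel d follow from the library design: 19 genes with two sgRNAs each give 38 guides and 38 × 37 / 2 = 703 guide pairs, of which 19 target the same gene and 684 target different genes. The sketch below enumerates the pairs and includes a plain Pearson correlation of the kind used to compare pseudo-bulk profiles. The guide names are hypothetical placeholders, not the identifiers used in the study.

```python
import math
from itertools import combinations


def count_guide_pairs(guide_to_gene):
    """Count unordered guide pairs that share a target gene and pairs that do not."""
    same = different = 0
    for g1, g2 in combinations(sorted(guide_to_gene), 2):
        if guide_to_gene[g1] == guide_to_gene[g2]:
            same += 1
        else:
            different += 1
    return same, different


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical library layout: 19 genes, two sgRNAs per gene.
library = {f"gene{g}_sg{s}": f"gene{g}" for g in range(1, 20) for s in (1, 2)}
same_pairs, different_pairs = count_guide_pairs(library)
print(same_pairs, different_pairs)  # 19 684
```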
Supplementary Figure 5 Optimization of additional guide constant regions to enable dual-guide 3’ direct capture Perturb-seq.
a) Gaussian kernel density estimates of normalized flow-cytometry measurements representing GFP expression demonstrate CRISPRi activity of sgRNAs with indicated constant regions (programmed with an identical GFP targeting region). Data was collected in three independent biological replicates and a representative replicate is shown. In this experiment, sgRNAs were expressed from a single-guide vector. As all CR2 and CR3 sgRNA variants were highly active, we used CR3 with cs1 in the stem loop (CR3cs1) and CR2 with cs2 at the 3’ end (CR2cs2) for downstream experiments. AU, arbitrary units. b) Median index (guide) UMIs per cell for each of n=45 sgRNAs where grey lines indicate medians. Data are from our dual-guide genetic interaction 3’ direct capture Perturb-seq experiments. c) Median index (guide) UMIs per cell for each of n=45 sgRNAs where grey lines indicate medians. Data are from our dual-guide genetic interaction 3’ direct capture Perturb-seq experiment with an mU6-CR3cs1-hU6-CR1cs1 design. d) Fraction of cells in each cell cycle phase across cells with the indicated perturbations.
a) Scatterplot depicts the relative target expression per gene to compare the CRISPRa activity of a single sgRNA (expressed from a dual-guide vector paired with a non-targeting control) with CRISPRa activity from multiplexed sgRNAs. Multiplexing sgRNAs significantly improves activation (sgRNAs 1+control, median fold-activation=2.9; sgRNAs 1+2, median fold-activation=4.7; Wilcoxon signed-rank two-sided test n=49 genes, W=162, p=7e-6). b) Box plots depict the relative target expression per gene for genes stratified by essentiality in K562 cells (Horlbeck et al., eLife, 2016; ref. 29). “min(1,2)” indicates the minimum remaining target expression between sgRNA 1 (paired with negative control) and sgRNA 2 (paired with negative control), i.e. the predicted multiplexed sgRNA knockdown based on a dominant model. For essential genes, sgRNA multiplexing improves median knockdown from 80% to 87%. For nonessential genes, sgRNA multiplexing improves median knockdown from 76% to 90%. Box plots denote quartile ranges (box), median (center mark), and 1.5 × interquartile range (whiskers). Data from multiplexed CRISPRi experiment. c) Scatterplot depicts relative target expression per gene to compare observed CRISPRi-based gene knockdown with predicted knockdown using multiplexed sgRNAs (assuming a dominant model). Multiplexing sgRNAs performs better than predicted based on the dominant model (Wilcoxon signed-rank two-sided test n=87 genes, W=698, p=3e-7). d) Scatterplot depicts relative target expression per gene to compare predicted CRISPRa activity from multiplexed sgRNAs (assuming a dominant model) to observed activity. Multiplexing sgRNAs performs better than predicted based on the dominant model (Wilcoxon signed-rank two-sided test n=49 genes, W=233, p=0.0002). e) Scatterplot depicts relative target expression per gene to compare CRISPRi-based gene knockdown using sgRNAs 1+2 (<80bp apart) with the activity of sgRNAs 1+3 (>80bp apart). We failed to detect a significant increase in CRISPRi activity with increased distance between sgRNAs (Wilcoxon signed-rank two-sided test n=55 genes, W=643, p=0.3). f) Scatterplot depicts relative target expression per gene to compare the CRISPRa activity of sgRNAs 1+2 (<80bp apart) with sgRNAs 1+3 (>80bp apart). We failed to detect a significant increase in CRISPRa activity with increased distance between sgRNAs (Wilcoxon signed-rank two-sided test n=30 genes, W=190, p=0.4).
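The dominant-model prediction used in panels b and c can be written down directly: for CRISPRi, the predicted remaining expression with two sgRNAs is the smaller of the two single-sgRNA values, min(1,2). The sketch below is a minimal illustration with hypothetical function names and made-up input values; it only covers the CRISPRi case stated in the caption and does not reproduce the Wilcoxon signed-rank tests.

```python
def predicted_remaining_dominant(remaining_sg1, remaining_sg2):
    """Dominant-model prediction for CRISPRi: the more active sgRNA
    (lower remaining target expression) determines the multiplexed outcome."""
    return min(remaining_sg1, remaining_sg2)


def percent_knockdown(remaining):
    """Convert a fraction of remaining target expression into percent knockdown."""
    return 100.0 * (1.0 - remaining)


def fraction_better_than_predicted(genes):
    """Fraction of genes whose observed dual-sgRNA remaining expression is
    lower than the dominant-model prediction.

    genes: iterable of (remaining_sg1, remaining_sg2, remaining_both) tuples.
    """
    genes = list(genes)
    better = sum(
        1 for sg1, sg2, both in genes
        if both < predicted_remaining_dominant(sg1, sg2)
    )
    return better / len(genes)
```

For example, a gene with 24% and 20% remaining expression under the two single sgRNAs has a dominant-model prediction of 20% remaining (80% knockdown); an observed 13% remaining with both sgRNAs (87% knockdown) would count as better than predicted. These numbers are illustrative and are not taken from the per-gene data.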
Supplementary Figure 7 Target enriched gene expression libraries are well correlated with deeply sequenced, unenriched libraries.
a) Cumulative density function of gene expression from K562 cells. In K562 cells, 2% of expressed genes consume >50% of sequencing reads. b) Heatmap compares gene ontology (GO) terms by their expression level in K562 cells. While some GO terms are enriched for highly expressed genes (for example ribosome, protein targeting, mRNA binding, protein folding) others are enriched for lowly expressed genes (e.g. DNA-binding transcription factor activity, phosphatase activity). c) Histogram depicts ratio of total UMIs in the target enriched library compared to total UMIs in the unenriched library for each of n=978 targeted genes. d) Histogram depicts Pearson correlations of the L1000 gene expression profiles per cell, before and after target enrichment (n=6349 cells with identified guide identities). e) Histogram depicts Pearson correlations of gene expression across cells per each gene, before and after target enrichment (n=978 genes). f) Scatterplot shows gene expression level (mean UMIs per cell) compared to the per gene Pearson correlation before and after target enrichment (n=978 genes). g) Histogram depicts Pearson correlations between differential expression profiles for the same guide pairs in enriched and unenriched libraries. Perturbations leading to >10 differentially expressed genes by two-sided, two-sample Kolmogorov-Smirnov test were included (n=137). Genes expressed at >1 UMI/cell were considered (n=263 genes), and gene expression profiles were z-scored with respect to the population of cells expressing non-targeting control guides in order to determine differential gene expression profiles. The median Pearson correlation of r=0.71 shows that genetic perturbation-dependent differential expression patterns are overall conserved before and after target enrichment. h) Scatterplot of the guide-guide Spearman’s rank correlations, calculated before and after target enrichment (n=10440 pairwise comparisons, cophenetic correlation r=0.95).
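The statement in panel a, that a small fraction of expressed genes consumes most sequencing reads, corresponds to a simple cumulative calculation, sketched below with a hypothetical function name and toy inputs. The block also checks the pair count in panel h: 10440 equals the number of unordered pairs among 145 items, so the value is consistent with 145 guide pairs, although that number is inferred here and not stated in the caption.

```python
import math


def gene_fraction_for_read_share(umis_per_gene, read_share=0.5):
    """Smallest fraction of expressed genes (ranked from most to least abundant)
    whose combined UMIs reach the requested share of all UMIs."""
    expressed = sorted((u for u in umis_per_gene if u > 0), reverse=True)
    total = sum(expressed)
    running = 0
    for n_genes, umis in enumerate(expressed, start=1):
        running += umis
        if running >= read_share * total:
            return n_genes / len(expressed)
    return 1.0  # only reached when no gene is expressed


# Number of unordered pairs among 145 items (145 is an inferred value).
n_pairs = math.comb(145, 2)
print(n_pairs)  # 10440
```

Applied to a real per-gene UMI vector, a return value near 0.02 would correspond to the 2% figure quoted for K562 cells; the motivation for target enrichment is that the remaining sequencing depth can then be redirected to informative, lowly expressed transcripts.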
About this article
Cite this article
Replogle, J.M., Norman, T.M., Xu, A. et al. Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing. Nat Biotechnol (2020). https://doi.org/10.1038/s41587-020-0470-y