The scale and capabilities of single-cell RNA-sequencing methods have expanded rapidly in recent years, enabling major discoveries and large-scale cell mapping efforts. However, these methods have not been systematically and comprehensively benchmarked. Here, we directly compare seven methods for single-cell and/or single-nucleus profiling, including two low-throughput and five high-throughput methods; representative methods were selected based on their usage and on our expertise and resources to prepare libraries. We tested the methods on three types of samples: cell lines, peripheral blood mononuclear cells and brain tissue, generating 36 libraries in six separate experiments in a single center. To directly compare the methods and avoid processing differences introduced by the existing pipelines, we developed scumi, a flexible computational pipeline that can be used with any single-cell RNA-sequencing method. We evaluated the methods both for basic performance, such as the structure and alignment of reads, sensitivity and extent of multiplets, and for their ability to recover known biological information in the samples.
The scumi Python package is freely available from the Bitbucket repository at https://bitbucket.org/jerry00/scumi-dev/src/master/ and as Supplementary code. The R scripts (used to assign cell types to clusters based on a set of marker genes, to select parameters for clustering analysis and to filter low-quality cells) are available from the same repository.
We especially thank M. Chatterjee, A. Ratner and S. Boswell of the Single Cell Core at Harvard Medical School for performing the inDrops experiments. We are grateful to A. Neumann, J. Lee, D. Dionne and N. Sharif for assistance with project coordination; A. Klein for helpful discussions and suggestions; R. Kirchner for advice on inDrops data analysis; D. Leib for advice on CEL-Seq2 data analysis; B. Li for advice on PBMC data analysis; K. Shekhar for precision analysis in cell line mixture data; M. Cuoco for sample transportation; Broad Flow Cytometry Facility for cell sorting; Broad Genomics Platform for sequencing; and L. Gaffney for assistance with figures. Work was supported by the Klarman Cell Observatory, the Manton Foundation and the BRAIN Initiative (grant no. 1U19 MH114821, A.R.). A.R. is an Investigator of the Howard Hughes Medical Institute. This publication is part of the Human Cell Atlas at www.humancellatlas.org/publications.
A.R. is a founder and equity holder in Celsius Therapeutics; an equity holder in Immunitas; and an SAB member of Syros Pharmaceuticals, Neogene Therapeutics and Thermo Fisher Scientific. A.K.S. is a founder of, and consultant for, Honeycomb Biotechnologies, Inc., which manufactures Seq-Well peripherals. A.K.S. and A.R. are also named inventors on patents filed by the Broad Institute related to either Drop-seq (A.R. and A.K.S.), DroNc-seq (A.R.) or Seq-Well (A.K.S.). The interests of A.K.S. and A.R. were reviewed and are subject to a management plan overseen by their institutions in accordance with their conflict of interest policies. The other authors declare no competing interests.
a, scumi workflow. b, Removing low-quality cell barcodes. c, Profiling samples. d, Bulk data workflow.
a, Mixture. b, PBMCs. c, Cortex. For each pair of bar graphs, experiment 1 is on the left and experiment 2 is on the right. For Smart-seq2, there were no poly(T) reads owing to the full transcriptome coverage and to library construction with transposase-based Nextera reagents, which attach adapters to both ends of cDNA fragments. Reads were assigned in the following order: no poly(T), unmapped or multi-mapped, ambiguous (mapping to a single location that overlaps two or more genes), and then one of the remaining categories. Reads were assigned as antisense only for the cortex datasets (c). Percentages of reads may not sum to 100 because of rounding and because values are not shown for categories containing <2% of reads.
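The assignment order described in this caption can be sketched as a small priority function. This is a minimal illustration under stated assumptions and is not the scumi implementation: the function names `assign_read_category` and `category_percentages`, and the per-read fields `has_poly_t`, `n_alignments`, `n_genes_overlapped` and `feature`, are hypothetical.

```python
from collections import Counter


def assign_read_category(read):
    """Return exactly one category per read, testing conditions in the caption's order.

    `read` is a dict with hypothetical keys (assumptions, not scumi fields):
    has_poly_t, n_alignments, n_genes_overlapped, feature.
    """
    if not read.get("has_poly_t", True):
        return "no poly(T)"
    if read.get("n_alignments", 1) != 1:
        # 0 alignments means unmapped; more than 1 means multi-mapped.
        return "unmapped or multi-mapped"
    if read.get("n_genes_overlapped", 1) >= 2:
        # A single genomic location that overlaps two or more genes.
        return "ambiguous"
    # Any remaining category (for example exonic or intronic) is passed through.
    return read.get("feature", "other")


def category_percentages(reads):
    """Percentage of reads in each category."""
    counts = Counter(assign_read_category(r) for r in reads)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}
```

Because each read stops at the first matching condition, a read that lacks poly(T) and is also unmapped is counted only once, under "no poly(T)".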
a, Human cells and mouse cells from the Mixture experiments; multiplet cells are not shown. b, PBMCs. c, Cortex. The number of cells (x-axis) with a given mean number of genes detected (y-axis), when cells are ordered from highest (left) to lowest (right) total number of genes. The rightmost point of each curve shows the average number of detected genes for the final selected number of cells in this study.
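One way to read these curves is as a running mean over cells ranked by genes detected. The sketch below follows that reading; it is an assumption about how such a curve could be computed, not code from the study, and the function name `cumulative_mean_genes` is hypothetical.

```python
def cumulative_mean_genes(genes_per_cell):
    """Mean number of genes detected among the top-k cells, for k = 1..n.

    Cells are ranked from most to fewest genes detected, so element k-1 of the
    result is the y-value at x = k on a curve of this kind.
    """
    ordered = sorted(genes_per_cell, reverse=True)
    means = []
    running_total = 0
    for k, n_genes in enumerate(ordered, start=1):
        running_total += n_genes
        means.append(running_total / k)
    return means
```

Under this reading, the last element of the returned list corresponds to the rightmost point of a curve: the average over all retained cells.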
Extended Data Fig. 5 Impact of sequencing depth on gene and UMI detection per cell in the PBMC datasets.
a–d, The median number of genes (a, b, y-axis) and UMIs (c, d, y-axis) detected per cell at different sequencing depths (x-axis) for low-throughput (a, c) and high-throughput (b, d) methods from PBMC1 (left) and PBMC2 (right). Far right point of each curve: median number of detected genes per cell at full sequencing depth. e, Relation between median number of genes and UMIs per cell in PBMC1 (left) and PBMC2 (right). (n = 1 biologically independent sample for each curve in each plot).
Fraction of UMIs (or reads for Smart-seq2) aligned to either mouse (y-axis) or human (x-axis) in each cell from the Mixture1 (a) and Mixture2 (b) experiments (n = 1 biologically independent sample per panel). Each dot represents a cell. Dashed line and number: robust linear regression fitted line and its slope. The number of genes detected from the “wrong” species is higher in cells with more reads.
Distributions of the extra-Poisson coefficients of variation (“Extra Poisson CV”, y-axis) from each method (x-axis). a, b, Human cells and c, d, mouse cells from Mixture1 (left) and Mixture2 (right) (n = 1 biologically independent sample per panel). Violin and box plot elements are defined as in Figure 2.
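For orientation, a commonly used definition of an extra-Poisson coefficient of variation removes the variance expected from Poisson sampling (equal to the mean) before dividing by the mean. The sketch below uses that definition as an assumption; the exact formula used in the study is given in its Methods and may differ, and the function name `extra_poisson_cv` is hypothetical.

```python
import math
from statistics import mean, pvariance


def extra_poisson_cv(counts):
    """Extra-Poisson CV of one gene's counts across cells.

    Assumed definition: sqrt(max(variance - mean, 0)) / mean, so that counts
    with purely Poisson noise (variance equal to the mean) give a value near 0.
    """
    mu = mean(counts)
    if mu == 0:
        return float("nan")  # undefined when the gene is never detected
    var = pvariance(counts, mu)  # population variance around the mean
    excess = max(var - mu, 0.0)  # variance beyond the Poisson expectation
    return math.sqrt(excess) / mu
```

For example, counts of 0, 0, 10 and 10 have mean 5 and population variance 25, giving an extra-Poisson CV of sqrt(20)/5, whereas constant counts give 0.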
t-SNEs of single-cell profiles (dots) from each method, colored by cell type assignment, from PBMC1 (a) and PBMC2 (b) (n = 1 biologically independent sample per panel).
t-SNEs of single nucleus profiles (dots) from each method colored by cell type assignment from Cortex2 (n = 1 biologically independent sample).
a, t-SNE plot generated with Harmony clustering all PBMC cells in this study (n = 2 biologically independent samples). b, All libraries contain cells of every cell type, according to this joint annotation. This differs from the individual level clustering results, in which many libraries are missing particular cell types (n = 2 biologically independent samples). c, PBMC1 and d, PBMC2. For each annotated cell type and method in the jointly clustered dataset (y-axis), we calculated the percentage of cells from that cell type that come from each cell type in the individual level clustering results (x-axis). This is denoted by the color of the corresponding boxes (n = 1 biologically independent sample for (c) and (d)).
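The percentages shown in panels c and d amount to a row-normalized cross-tabulation of two label sets for the same cells. A minimal sketch, assuming one joint label and one individual-level label per cell; the function name `label_overlap_percentages` and the input layout are hypothetical and not taken from the study's scripts.

```python
from collections import Counter, defaultdict


def label_overlap_percentages(joint_labels, individual_labels):
    """For each joint cell type, the percentage of its cells carrying each
    individual-level label. Both inputs list one label per cell, in the same order.
    """
    if len(joint_labels) != len(individual_labels):
        raise ValueError("label lists must have the same length")
    counts = defaultdict(Counter)
    for joint, individual in zip(joint_labels, individual_labels):
        counts[joint][individual] += 1
    percentages = {}
    for joint, row in counts.items():
        total = sum(row.values())
        percentages[joint] = {ind: 100.0 * n / total for ind, n in row.items()}
    return percentages
```

Each row sums to 100, so a row dominated by a single individual-level label indicates that the joint and individual clusterings agree for that cell type.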
Supplementary Notes 1–8, Methods, Figs. 1–13 and table legends.
Supplementary Tables 1–12 and 15–19. Titles are in the Supplementary Information file.
Supplementary Table 13. Title is in the Supplementary Information file.
Supplementary Table 14. Title is in the Supplementary Information file.
Code for scumi computational pipeline.
Cite this article
Ding, J., Adiconis, X., Simmons, S.K. et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat Biotechnol 38, 737–746 (2020). https://doi.org/10.1038/s41587-020-0465-8