Systematic comparison of single-cell and single-nucleus RNA-sequencing methods

Abstract

The scale and capabilities of single-cell RNA-sequencing methods have expanded rapidly in recent years, enabling major discoveries and large-scale cell mapping efforts. However, these methods have not been systematically and comprehensively benchmarked. Here, we directly compare seven methods for single-cell and/or single-nucleus profiling—selecting representative methods based on their usage and our expertise and resources to prepare libraries—including two low-throughput and five high-throughput methods. We tested the methods on three types of samples: cell lines, peripheral blood mononuclear cells and brain tissue, generating 36 libraries in six separate experiments in a single center. To directly compare the methods and avoid processing differences introduced by the existing pipelines, we developed scumi, a flexible computational pipeline that can be used with any single-cell RNA-sequencing method. We evaluated the methods for both basic performance, such as the structure and alignment of reads, sensitivity and extent of multiplets, and for their ability to recover known biological information in the samples.

Fig. 1: Study overview.
Fig. 2: Performance metrics for mixture experiments.
Fig. 3: PBMCs sensitivity.
Fig. 4: Cortex nuclei sensitivity.
Fig. 5: Cell type identification and assignment in PBMCs.
Fig. 6: Cell type identification and assignment in cortex nuclei.

Data availability

RNA-seq data generated in this project are available from the Gene Expression Omnibus with accession number GSE132044 and the Single Cell Portal with accession numbers SCP424, SCP425 and SCP426. Source data for Figs. 2–6 are presented with the paper.

Code availability

The scumi Python package is freely available from a Bitbucket repository at https://bitbucket.org/jerry00/scumi-dev/src/master/ and as Supplementary code. The R scripts (used to assign cell types to clusters based on a set of marker genes, to select parameters for clustering analysis and to filter low-quality cells) are available from the same Bitbucket repository at https://bitbucket.org/jerry00/scumi-dev/src/master/.

Change history

  • 27 April 2020

    An amendment to this paper has been published and can be accessed via a link at the top of the paper.

References

  1. Haque, A., Engel, J., Teichmann, S. A. & Lonnberg, T. A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications. Genome Med. 9, 75 (2017).
  2. Tanay, A. & Regev, A. Scaling single-cell genomics from phenomenology to mechanism. Nature 541, 331–338 (2017).
  3. Wu, A. R., Wang, J., Streets, A. M. & Huang, Y. Single-cell transcriptional analysis. Annu. Rev. Anal. Chem. (Palo Alto Calif.) 10, 439–462 (2017).
  4. Regev, A. et al. The Human Cell Atlas. eLife 6, e27041 (2017).
  5. Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 10, 1096–1098 (2013).
  6. Hashimshony, T. et al. CEL-Seq2: sensitive highly-multiplexed single-cell RNA-Seq. Genome Biol. 17, 77 (2016).
  7. Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
  8. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
  9. Zheng, G. X. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
  10. Gierahn, T. M. et al. Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput. Nat. Methods 14, 395–398 (2017).
  11. Han, X. et al. Mapping the mouse cell atlas by Microwell-seq. Cell 172, 1091–1107 (2018).
  12. Cao, J. et al. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357, 661–667 (2017).
  13. Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).
  14. Rosenberg, A. B. et al. Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science 360, 176–182 (2018).
  15. Svensson, V., Vento-Tormo, R. & Teichmann, S. A. Exponential scaling of single-cell RNA-seq in the past decade. Nat. Protoc. 13, 599–604 (2018).
  16. Ziegenhain, C. et al. Comparative analysis of single-cell RNA sequencing methods. Mol. Cell 65, 631–643 (2017).
  17. Dueck, H. R. et al. Assessing characteristics of RNA amplification methods for single cell RNA sequencing. BMC Genomics 17, 966 (2016).
  18. Svensson, V. et al. Power analysis of single-cell RNA-sequencing experiments. Nat. Methods 14, 381–387 (2017).
  19. Bhargava, V., Head, S. R., Ordoukhanian, P., Mercola, M. & Subramaniam, S. Technical variations in low-input RNA-seq methodologies. Sci. Rep. 4, 3678 (2014).
  20. Zhang, X. et al. Comparative analysis of droplet-based ultra-high-throughput single-cell RNA-seq systems. Mol. Cell 73, 130–142 (2019).
  21. Wang, Y. J. et al. Comparative analysis of commercially available single-cell RNA sequencing platforms for their performance in complex human tissues. Preprint at bioRxiv 541433 (2019).
  22. Habib, N. et al. Div-Seq: single-nucleus RNA-seq reveals dynamics of rare adult newborn neurons. Science 353, 925–928 (2016).
  23. Habib, N. et al. Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat. Methods 14, 955–958 (2017).
  24. Bakken, T. E. et al. Single-nucleus and single-cell transcriptomes compared in matched cortical cell types. PLoS ONE 13, e0209648 (2018).
  25. Lake, B. B. et al. Neuronal subtypes and diversity revealed by single-nucleus RNA sequencing of the human brain. Science 352, 1586–1590 (2016).
  26. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
  27. Wallrapp, A. et al. The neuropeptide NMU amplifies ILC2-driven allergic lung inflammation. Nature 549, 351–356 (2017).
  28. Tilgner, H. et al. Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncRNAs. Genome Res. 22, 1616–1625 (2012).
  29. Grun, D., Kester, L. & van Oudenaarden, A. Validation of noise models for single-cell transcriptomics. Nat. Methods 11, 637–640 (2014).
  30. Wagner, F., Yan, Y. & Yanai, I. K-nearest neighbor smoothing for high-throughput single-cell RNA-seq data. Preprint at bioRxiv 217737 (2018).
  31. Zeisel, A. et al. Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347, 1138–1142 (2015).
  32. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
  33. Saelens, W., Cannoodt, R., Todorov, H. & Saeys, Y. A comparison of single-cell trajectory inference methods. Nat. Biotechnol. 37, 547–554 (2019).
  34. Arazi, A. et al. The immune cell landscape in kidneys of patients with lupus nephritis. Nat. Immunol. 20, 902–914 (2019).
  35. Lacar, B. et al. Nuclear RNA-seq of single neurons reveals molecular signatures of activation. Nat. Commun. 7, 11022 (2016).
  36. van den Brink, S. C. et al. Single-cell sequencing reveals dissociation-induced gene expression in tissue subpopulations. Nat. Methods 14, 935–936 (2017).
  37. Lake, B. B. et al. A comparative strategy for single-nucleus and single-cell transcriptomes confirms accuracy in predicted cell-type expression from nuclear RNA. Sci. Rep. 7, 6031 (2017).
  38. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
  39. Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).
  40. Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
  41. Parekh, S., Ziegenhain, C., Vieth, B., Enard, W. & Hellmann, I. zUMIs—a fast and flexible pipeline to process RNA sequencing data with UMIs. Gigascience 7, giy059 (2018).
  42. Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).
  43. Ding, J. et al. Systematic analysis of somatic mutations impacting gene expression in 12 tumour types. Nat. Commun. 6, 8554 (2015).
  44. Levine, J. H. et al. Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell 162, 184–197 (2015).
  45. Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory E. 2008, P10008 (2008).
  46. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
  47. Tasic, B. et al. Adult mouse cortical cell taxonomy revealed by single cell transcriptomics. Nat. Neurosci. 19, 335–346 (2016).
  48. Palmer, C., Diehn, M., Alizadeh, A. A. & Brown, P. O. Cell-type specific gene expression profiles of leukocytes in human peripheral blood. BMC Genomics 7, 115 (2006).

Acknowledgements

We especially thank M. Chatterjee, A. Ratner and S. Boswell of the Single Cell Core at Harvard Medical School for performing the inDrops experiments. We are grateful to A. Neumann, J. Lee, D. Dionne and N. Sharif for assistance with project coordination; A. Klein for helpful discussions and suggestions; R. Kirchner for advice on inDrops data analysis; D. Leib for advice on CEL-Seq2 data analysis; B. Li for advice on PBMC data analysis; K. Shekhar for precision analysis in cell line mixture data; M. Cuoco for sample transportation; Broad Flow Cytometry Facility for cell sorting; Broad Genomics Platform for sequencing; and L. Gaffney for assistance with figures. Work was supported by the Klarman Cell Observatory, the Manton Foundation and the BRAIN Initiative (grant no. 1U19 MH114821, A.R.). A.R. is an Investigator of the Howard Hughes Medical Institute. This publication is part of the Human Cell Atlas at www.humancellatlas.org/publications.

Author information

Contributions

J.Z.L., A.K.S., O.R.-R. and A.R. conceived the research. X.A., C.C.H., N.D.M., T.K.H., M.H.W., T.B., L.T.N., J.Y.H.K., S.C. and S.L. performed the scRNA-seq experiments. X.A. and C.C.H. organized the sequencing. X.A. prepared the bulk RNA-seq libraries. J.D. created the scumi pipeline. J.D., S.K.S., A.-C.V., A.J.K. and J.Z.L. analyzed the data. M.S.K. contributed an optimized Smart-seq2 protocol. J.Y.H.K. prepared the cell lines. A.-C.V. and W.G. prepared the PBMCs. B.B. prepared the mouse cortex. J.Z.L., N.H., O.R.-R., A.K.S., A.-C.V. and A.R. provided supervisory guidance. J.D., X.A., S.K.S., C.C.H., T.K.H., M.H.W., T.B., J.Y.H.K., A.-C.V., A.R. and J.Z.L. wrote the paper. All authors assisted in editing the paper.

Corresponding author

Correspondence to Joshua Z. Levin.

Ethics declarations

Competing interests

A.R. is a founder and equity holder in Celsius Therapeutics; an equity holder in Immunitas; and an SAB member of Syros Pharmaceuticals, Neogene Therapeutics and Thermo Fisher Scientific. A.K.S. is a founder of, and consultant for, Honeycomb Biotechnologies, Inc., which manufactures Seq-Well peripherals. A.K.S. and A.R. are also named inventors on patents filed by the Broad Institute related to either Drop-seq (A.R. and A.K.S.), DroNc-seq (A.R.) or Seq-Well (A.K.S). The interests of A.K.S. and A.R. were reviewed and are subject to a management plan overseen by their institutions in accordance with their conflict of interest policies. The other authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Description of scRNA-seq methods evaluated.

Salient details for seven protocols tested in this paper. Adapted from Figure 2 of ref. 16.

Extended Data Fig. 2 Flowchart detailing computational analysis.

a, scumi workflow. b, Removing low-quality cell barcodes. c, Profiling samples. d, Bulk data workflow.

Extended Data Fig. 3 Characterization of genome alignments for sequence reads.

a, Mixture, b, PBMCs, c, Cortex. For each pair of bar graphs, experiment 1 is on the left and experiment 2 is on the right. For Smart-seq2, there were no poly(T) reads due to the full transcriptome coverage and the library construction using transposase-based Nextera reagents to attach adapters to both ends of cDNA fragments. Reads were assigned in the following order: no poly(T), unmapped or multi-mapped, ambiguous (mapping to a single location that overlaps 2 or more genes), and then one of the remaining categories. Reads were assigned as antisense only for the cortex datasets (c). Percentages of reads may not sum to 100 because of rounding and because values are not shown for categories containing <2% of reads.
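The assignment order described in this legend can be sketched as a simple priority cascade, in which each read receives the first category whose condition it meets. This is a minimal illustration and not the scumi implementation: the `Read` fields, the `check_poly_t` switch and the `feature` labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Read:
    # All field names are hypothetical and exist only for this sketch.
    has_poly_t: bool   # barcode read carries the expected poly(T) stretch
    n_alignments: int  # 0 = unmapped, 1 = uniquely mapped, >1 = multi-mapped
    n_genes: int       # genes overlapping the unique alignment location
    feature: str       # label for a remaining category, e.g. "exon" or "intron"


def classify_read(read, check_poly_t=True):
    """Return the first category, in the legend's order, that the read meets."""
    if check_poly_t and not read.has_poly_t:
        return "no poly(T)"
    if read.n_alignments != 1:
        return "unmapped or multi-mapped"
    if read.n_genes >= 2:
        return "ambiguous"
    # Only reads that passed every earlier check reach a remaining category.
    return read.feature
```

Because the checks run in a fixed order, a read that lacks poly(T) and is also unmapped is counted once, under "no poly(T)". Setting `check_poly_t=False` is one assumed way to skip the first category for a method such as Smart-seq2, where the legend reports no poly(T) reads.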

Extended Data Fig. 4 Impact of number of cells on sensitivity.

a, Human cells and mouse cells from Mixture experiments. Multiplet cells are not shown in this plot. b, PBMC. c, Cortex. The number of cells (x-axis) with a given mean number of genes detected (y-axis), when cells are ordered from highest (left) to lowest (right) total number of genes. The rightmost point of each curve shows the average number of detected genes for the final selected number of cells in this study.
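One reading of these curves is a running mean over cells ranked by genes detected: the point at x = n is the mean over the n cells with the most genes. The sketch below follows that reading; the function name and the input list are illustrative assumptions and are not taken from the scumi code.

```python
from itertools import accumulate


def mean_genes_by_cell_count(genes_per_cell):
    """Return (n_cells, mean genes over the top n_cells) for every n_cells.

    genes_per_cell: number of genes detected in each cell, in any order.
    """
    # Rank cells from most to fewest detected genes.
    ordered = sorted(genes_per_cell, reverse=True)
    # The running total divided by the rank gives the mean of the top n cells.
    running_totals = accumulate(ordered)
    return [(n, total / n) for n, total in enumerate(running_totals, start=1)]
```

Under this reading the curve can only stay flat or fall as more cells are included, and its last point is the mean over all selected cells.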

Extended Data Fig. 5 Impact of sequencing depth on gene and UMI detection per cell in the PBMC datasets.

a–d, The median number of genes (a, b, y-axis) and UMIs (c, d, y-axis) detected per cell at different sequencing depths (x-axis) for low-throughput (a, c) and high-throughput (b, d) methods from PBMC1 (left) and PBMC2 (right). Far right point of each curve: median number of detected genes per cell at full sequencing depth. e, Relation between median number of genes and UMIs per cell in PBMC1 (left) and PBMC2 (right). (n = 1 biologically independent sample for each curve in each plot).
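A common way to produce depth curves of this kind is to subsample a fraction of the reads and recount genes and UMIs. The legend does not say how the authors downsampled, so the sketch below is only an assumed approach for a single cell; `downsample_counts`, its `(gene, umi)` input format and the fixed seed are hypothetical.

```python
import random


def downsample_counts(reads, fraction, seed=0):
    """Count genes and UMIs detected after keeping a fraction of one cell's reads.

    reads: list of (gene, umi) tuples, one tuple per sequenced read.
    fraction: share of reads to keep, between 0 and 1.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    n_keep = round(len(reads) * fraction)
    kept = rng.sample(reads, n_keep)  # sample reads without replacement
    genes = {gene for gene, _ in kept}
    umis = set(kept)  # identical (gene, umi) pairs collapse to one molecule
    return len(genes), len(umis)
```

Repeating this for every cell at several fractions and taking the per-cell median at each fraction would give one curve per library.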

Extended Data Fig. 6 Fraction of reads from each species in Mixture experiments.

Fraction of UMIs (or reads for Smart-seq2) aligned to either mouse (y-axis) or human (x-axis) in each cell from the Mixture1 a, and Mixture2 b, experiments (n = 1 biologically independent sample per panel). Each dot represents a cell. Dashed line and number: robust linear regression fitted line and its slope. Number of genes detected from the “wrong” species is higher in cells with more reads.
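The per-cell species fractions behind a plot like this can be computed directly from the human and mouse UMI counts. The sketch below also labels cells with a purity cutoff; the 0.9 cutoff and the function names are assumptions for illustration, and the source does not state how the authors called multiplets.

```python
def species_fractions(human_umis, mouse_umis):
    """Return (human fraction, mouse fraction) of a cell's UMIs, or None if empty."""
    total = human_umis + mouse_umis
    if total == 0:
        return None
    return human_umis / total, mouse_umis / total


def label_cell(human_umis, mouse_umis, purity=0.9):
    """Label a cell by species; purity is a hypothetical cutoff, not the paper's."""
    fractions = species_fractions(human_umis, mouse_umis)
    if fractions is None:
        return "empty"
    human_fraction, mouse_fraction = fractions
    if human_fraction >= purity:
        return "human"
    if mouse_fraction >= purity:
        return "mouse"
    # Neither species dominates, so the barcode likely captured both.
    return "multiplet"
```

For Smart-seq2, which has no UMIs, read counts would take the place of UMI counts, as the legend notes.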

Extended Data Fig. 7 Technical precision plots for mixture experiments.

Distributions of the extra Poisson coefficients of variation (“Extra Poisson CV”, y-axis) from each method (x-axis). a, b, Human cells, c, d, mouse cells – from Mixture1 (left) and Mixture2 (right) (n = 1 biologically independent sample per panel). Violin and box plot elements are defined as in Figure 2.

Extended Data Fig. 8 Cell type analysis for each PBMC dataset.

t-SNEs of single cell profiles (dots) from each method colored by cell type assignment from PBMC1 a, and PBMC2 b, (n = 1 biologically independent sample per panel).

Extended Data Fig. 9 Cell type analysis for Cortex2.

t-SNEs of single nucleus profiles (dots) from each method colored by cell type assignment from Cortex2 (n = 1 biologically independent sample).

Extended Data Fig. 10 Cell type analysis of the combined PBMC datasets.

a, t-SNE plot generated with Harmony clustering all PBMC cells in this study (n = 2 biologically independent samples). b, All libraries contain cells of every cell type, according to this joint annotation. This differs from the individual level clustering results, in which many libraries are missing particular cell types (n = 2 biologically independent samples). c, PBMC1 and d, PBMC2. For each annotated cell type and method in the jointly clustered dataset (y-axis), we calculated the percentage of cells from that cell type that come from each cell type in the individual level clustering results (x-axis). This is denoted by the color of the corresponding boxes (n = 1 biologically independent sample for (c) and (d)).

Supplementary information

Supplementary Information

Supplementary Notes 1–8, Methods, Figs. 1–13 and table legends.

Reporting Summary

Supplementary Tables 1–12 and 15–19

Supplementary Tables 1–12 and 15–19. Titles are in the Supplementary Information file.

Supplementary Table 13

Supplementary Table 13. Title is in the Supplementary Information file.

Supplementary Table 14

Supplementary Table 14. Title is in the Supplementary Information file.

Supplementary Software

Code for scumi computational pipeline.

Source data

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Fig. 6

Statistical source data.

About this article

Cite this article

Ding, J., Adiconis, X., Simmons, S.K. et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat Biotechnol 38, 737–746 (2020). https://doi.org/10.1038/s41587-020-0465-8
