Single-nucleus RNA sequencing (sNuc-seq) profiles RNA from tissues that are preserved or cannot be dissociated, but it does not provide high throughput. Here, we develop DroNc-seq: massively parallel sNuc-seq with droplet technology. We profile 39,111 nuclei from mouse and human archived brain samples to demonstrate sensitive, efficient, and unbiased classification of cell types, paving the way for systematic charting of cell atlases.
Gene Expression Omnibus
1. Wagner, A., Regev, A. & Yosef, N. Nat. Biotechnol. 34, 1145–1160 (2016).
2. Tanay, A. & Regev, A. Nature 541, 331–338 (2017).
3. Habib, N. et al. Science 353, 925–928 (2016).
4. Lake, B.B. et al. Science 352, 1586–1590 (2016).
5. Lacar, B. et al. Nat. Commun. 7, 11022 (2016).
6. Grindberg, R.V. et al. Proc. Natl. Acad. Sci. USA 110, 19802–19807 (2013).
7. Macosko, E.Z. et al. Cell 161, 1202–1214 (2015).
8. Dixit, A. et al. Cell 167, 1853–1866 (2016).
9. Adamson, B. et al. Cell 167, 1867–1882 (2016).
10. Klein, A.M. et al. Cell 161, 1187–1201 (2015).
11. Shekhar, K. et al. Cell 166, 1308–1323 (2016).
12. Ziegenhain, C. et al. Mol. Cell 65, 631–643 (2017).
13. Rabani, M. et al. Nat. Biotechnol. 29, 436–442 (2011).
14. Rabani, M. et al. Cell 159, 1698–1710 (2014).
15. Schwanhäusser, B. et al. Nature 473, 337–342 (2011).
16. Cheadle, C. et al. BMC Genomics 6, 75 (2005).
17. Tasic, B. et al. Nat. Neurosci. 19, 335–346 (2016).
18. GTEx Consortium. Science 348, 648–660 (2015).
19. Zeisel, A. et al. Science 347, 1138–1142 (2015).
20. Tirosh, I. et al. Science 352, 189–196 (2016).
21. Anindita, B. et al. Protocol Exchange https://doi.org/10.1038/protex.2017.094 (2017).
22. McDonald, J.C. et al. Electrophoresis 21, 27–40 (2000).
23. Carithers, L.J. et al. Biopreserv. Biobank. 13, 311–319 (2015).
24. Dobin, A. et al. Bioinformatics 29, 15–21 (2013).
25. Brennecke, P. et al. Nat. Methods 10, 1093–1095 (2013).
26. Rosvall, M. & Bergstrom, C.T. Proc. Natl. Acad. Sci. USA 105, 1118–1123 (2008).
27. van der Maaten, L. & Hinton, G. J. Mach. Learn. Res. 9, 2579–2605 (2008).
28. McDavid, A. et al. Bioinformatics 29, 461–467 (2013).
29. Subramanian, A. et al. Proc. Natl. Acad. Sci. USA 102, 15545–15550 (2005).
30. Breiman, L. Mach. Learn. 45, 5–32 (2001).
31. Lein, E.S. et al. Nature 445, 168–176 (2007).
We thank R. Macare, A. Rotem, C. Muus, and E. Drokhlyansky for helpful discussions, T. Habib for babysitting, T. Tickle and A. Bankapur for technical support, and L. Gaffney and A. Hupalowska for help with graphics. Work was supported by the Klarman Cell Observatory, National Institute of Mental Health (NIMH) grant U01MH105960, National Cancer Institute (NCI) grant 1R33CA202820-1 and NIAID grant U24AI118672-01 (to A.R.), and Koch Institute Support (core) grant P30-CA14051 from the NCI. Microfluidic devices were fabricated at the Center for Nanoscale Systems, Harvard University, supported by National Science Foundation award no. 1541959. N.H. is supported by HHMI through the HHWF, A.R. is supported by HHMI, and F.Z. is supported by the New York Stem Cell Foundation. F.Z. is supported by NIMH (5DP1-MH100706 and 1R01-MH110049), NSF, HHMI, and the New York Stem Cell, Simons, Paul G. Allen Family, and Vallee Foundations, and by J. and P. Poitras, R. Metcalfe, and D. Cheng. D.A.W. thanks NSF DMR-1420570, NSF DMR-1310266, and NIH P01HL120839 grants for their support. GTEx is supported by the NIH Common Fund (Contract HHSN268201000029C to K.A.).
N.H., A.B., I.A.D., D.A.W., F.Z., and A.R. are co-inventors on international patent application PCT/US16/59239 of Broad Institute, Harvard and MIT, relating to inventions of methods of this manuscript. A.R. is a member of Scientific Advisory Boards for Thermo Fisher Scientific, Syros Pharmaceuticals, and Driver Genomics.
Integrated supplementary information
(a) Overview of DroNc-seq. (b) CAD schema of DroNc-seq microfluidic device. (c) Comparison of library complexity from experiments with different microfluidic device parameters. Distribution of number of transcripts (y-axis) detected across nuclei, ranked in decreasing order (x-axis), in each of two representatives of three independent replicates (left and right) performed with nuclei from the mouse cortex and mid-brain regions, each using the DroNc-seq 75 μm microfluidic device (dark blue, Methods) and the Drop-seq (ref. 7) 125 μm device (light blue). (d,e) Comparison between nuclei isolation protocols. (d) Distribution of number of transcripts detected per nucleus (x-axis), for samples from mouse Lateral Cortex (LC) or Prefrontal Cortex (PFC) processed using either the sucrose gradient centrifugation method as in sNuc-Seq (sucrose gradient) or the EZ-PREP based isolation method used in DroNc-seq (EZ) (Methods). (e) Bar plot showing the number of nuclei per library (y-axis) with at least 20,000 reads and 200 genes detected per nucleus (Methods) on the same samples as in (d), using either sNuc-Seq (ref. 3) (Gradient) or DroNc-seq (EZ) nuclei isolation protocols. (f) Comparison of library complexity between frozen and RNAlater-fixed brain samples. Distribution of number of transcripts detected per nucleus (x-axis, nuclei with at least 20,000 reads and 200 genes detected), for nuclei from mouse cortex and mid-brain tissues that were either fresh-frozen or lightly fixed with RNAlater (Methods); two independent experiments for each isolation method are shown. (g) Single nucleus specificity in DroNc-seq. Scatter plot showing the number of transcripts assigned to human (HEK293 cells, x-axis) or mouse (3T3 cells, y-axis) for each nucleus barcode (dot). 2.5% (27/1,064) of nuclei are human-mouse doublets, reflected by barcodes associated with a high number of both human and mouse transcripts (purple). We estimate a 5% expected doublet rate at our current loading and flow parameters. Four replicate species mixing experiments are in Supplementary Table 1.
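The relation between the observed 2.5% human-mouse doublets and the 5% expected doublet rate in panel (g) can be illustrated with a short calculation. The caption does not state how the 5% figure was derived; the sketch below assumes a common convention for species-mixing experiments, in which only mixed-species doublets are detectable and, for an equal mix of the two species, they make up half of all doublets. The function name and the `human_fraction` parameter are hypothetical.

```python
def estimated_total_doublet_rate(n_mixed, n_total, human_fraction=0.5):
    """Extrapolate a total doublet rate from observed mixed-species barcodes.

    Assumption (not stated in the source): nuclei pair at random, so a
    fraction 2 * p * (1 - p) of all doublets is human-mouse, where p is the
    fraction of human nuclei in the input mix.
    """
    observed = n_mixed / n_total
    heterotypic_share = 2 * human_fraction * (1 - human_fraction)
    return observed / heterotypic_share


# Counts quoted in panel (g): 27 mixed-species barcodes out of 1,064.
observed_rate = 27 / 1064
total_rate = estimated_total_doublet_rate(27, 1064)
print(f"observed mixed-species doublets: {observed_rate:.1%}")
print(f"estimated total doublet rate:    {total_rate:.1%}")
```

Under the equal-mix assumption the estimate is simply twice the observed fraction; an unequal input mix would change the correction factor.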
(a) Bioanalyzer traces of DroNc-seq libraries showing the length of the cDNA library fragments of (from left): 3T3 cells, 3T3 nuclei, human brain nuclei, and mouse brain nuclei. (b,c) Distribution of number of transcripts (b) and genes (c) detected in DroNc-seq from nuclei isolated from: Left: 3T3 cells, mouse frozen brain tissue, and human frozen archived brain tissue; Right: 3T3 cells by Drop-seq or nuclei by DroNc-seq. (d) Scatter plot comparing average gene expression levels detected in single 3T3 nuclei (DroNc-seq, y-axis) and cells (Drop-seq, x-axis). Red dots mark outlier genes highly expressed in one but not the other experiment. (e) Percent reads mapped to the mouse genome (out of the total reads, left, y-axis), and percent mapped to exons, introns, intergenic regions, and rRNA loci (out of the genomic-mapped reads, right, y-axis), for 3T3 cells (dark bars) and nuclei (light bars). (f,g) Saturation analysis. (f) Saturation curves of the number of transcripts (y-axis) at different numbers of sequenced reads (x-axis), estimated by subsampling of reads in each nucleus (Methods). Circles are the observed subsampling values averaged across 10 replicates of 3T3 cells (top) or nuclei (bottom). The blue line indicates the nonlinear fit of a saturation function to all observations. The grey line is the extrapolated trend given the estimated fit parameters. (g) Shown for each cell or nucleus (grey dot) in each sample (as in f) is the estimated number of reads (y-axis) needed to achieve 80% saturation of transcripts (left) followed by the total number of usable reads per cell or nucleus (right). Red lines indicate the population mean; pink regions indicate the 95% C.I. of the mean.
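The saturation analysis in panels (f,g) can be sketched as follows. The caption does not give the saturation function or the fitting procedure, so this example makes two assumptions: a hyperbolic form T(r) = t_max · r / (k + r), and a simple linearized least-squares fit on 1/T versus 1/r in place of the nonlinear fit described above. All function names are hypothetical.

```python
import random


def subsample_curve(read_umis, depths, n_reps=10, seed=0):
    """Mean number of distinct transcripts (UMIs) seen at each read depth.

    read_umis holds one transcript identifier per sequenced read.
    """
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        counts = [len(set(rng.sample(read_umis, depth))) for _ in range(n_reps)]
        curve.append(sum(counts) / n_reps)
    return curve


def fit_saturation(depths, transcripts):
    """Fit T(r) = t_max * r / (k + r) (assumed form) by least squares on
    the linearized relation 1/T = 1/t_max + (k / t_max) * (1/r)."""
    xs = [1.0 / d for d in depths]
    ys = [1.0 / t for t in transcripts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    t_max = 1.0 / intercept
    return t_max, slope * t_max


def reads_for_saturation(k, fraction=0.8):
    """Read depth at which T(r) reaches the given fraction of t_max."""
    return k * fraction / (1.0 - fraction)


# Synthetic, noise-free example generated from the assumed curve.
depths = [250, 500, 1000, 2000, 4000]
transcripts = [1000 * d / (500 + d) for d in depths]
t_max, k = fit_saturation(depths, transcripts)
reads_80 = reads_for_saturation(k, 0.8)
print(t_max, k, reads_80)
```

With this functional form, 80% saturation is reached at four times the half-saturation depth k. The linearized fit is sensitive to noise at low depths, so a nonlinear fit would be preferable on real data.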
(a) Cell type classification. tSNE plot of DroNc-seq libraries from adult frozen mouse hippocampus (Hip) and prefrontal cortex (PFC) brain regions, as in Fig. 1a, but with clusters colour coded by cell type annotations and anatomical distinctions (exPFC=pyramidal neurons from the PFC, GABA=GABAergic interneurons, exCA=pyramidal neurons from the Hip CA region, exDG=granule neurons from the Hip dentate gyrus region, ASC=astrocytes, ODC=oligodendrocytes, MG=microglia, OPC=oligodendrocyte precursor cells, SMC=smooth muscle cells, END=endothelial cells). (b) Cell filtering. tSNE plot as in (a), but marking clusters of nuclei (black) that were identified as either doublets or unclassified cells that are likely from adjacent brain regions (ChP = Choroid plexus, Methods) and thus removed from subsequent analyses. (c) Cell quality. tSNE plot showing a 2D embedding of DroNc-seq nuclei as in (a) with additional lower complexity nuclei, marking (turquoise) nuclei with fewer than either 400 (left) or 300 (right) genes. (d) Distribution of number of transcripts detected in each cluster. Violin plots show the distribution of the number of transcripts in each cluster (x-axis; as in Fig. 1a). (e) Fraction of nuclei from each brain region associated with each cell type. Cell types are defined as in (a) and sorted from left by types enriched in Hip vs. PFC.
Supplementary Figure 4 Sub-clusters of mouse pyramidal neurons from the CA region in the hippocampus
(a) Marker gene expression. tSNE embedding of DroNc-seq nuclei from mouse hippocampus and PFC (as in Fig. 1a), colour coded by expression levels (scaled log(transcripts)) of selected genes with distinct expression across the anatomical sub-regions of the hippocampus: Neurod6 in CA regions, Golm1 in CA3 and dentate gyrus regions, Pex5l in CA1 and subiculum regions, and Prox1 in dentate gyrus region. (b) RNA in situ hybridization (ISH) images from the Allen Brain Atlas (ref. 31) showing expression patterns of marker genes (from a) in the mouse hippocampus (sagittal sections, scale = 839 μm).
(a) Number of nuclei from each sample (colour code: PFC=blue; Hip=yellow) associated with each cluster (as defined in Fig. 1a). In the legend, numbers denote different samples, and letters denote technical replicates from the same sample. (b) Number of high quality nuclei (with at least 10,000 reads and 200 genes detected, Methods) in each mouse brain sample (as in a). (c,d) Complexity. (c) Distribution of number of genes (left) and transcripts (right) in the nuclei in each subset (as in Supplementary Fig. 3a). exPFC=glutamatergic neurons from the PFC region, GABA=GABAergic interneurons, exCA=pyramidal neurons from the Hip CA region, exDG=granule neurons from the Hip dentate gyrus region, ASC=astrocytes, NSC=neuronal stem cells, ODC=oligodendrocytes, OPC=oligodendrocyte precursor cells, SMC=smooth muscle cells, END=endothelial cells, MG=microglia. (d) Distribution of number of genes in the cells in each subset from the scRNA-seq mouse brain study of Tasic et al. (ref. 17) (Glut=glutamatergic neurons).
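The quality threshold quoted in panel (b), at least 10,000 reads and 200 genes detected per nucleus, amounts to a simple per-nucleus filter followed by a per-sample tally. The sketch below is illustrative only: the record layout, sample names, and function names are hypothetical, and the full filtering procedure is described in the Methods rather than here.

```python
from collections import Counter

MIN_READS = 10_000  # threshold quoted in the caption
MIN_GENES = 200     # threshold quoted in the caption


def passes_qc(nucleus, min_reads=MIN_READS, min_genes=MIN_GENES):
    """True if a nucleus meets both the read and gene thresholds."""
    return nucleus["reads"] >= min_reads and nucleus["genes"] >= min_genes


def high_quality_counts(nuclei, min_reads=MIN_READS, min_genes=MIN_GENES):
    """Number of nuclei passing the filter in each sample."""
    return Counter(
        n["sample"] for n in nuclei if passes_qc(n, min_reads, min_genes)
    )


# Toy records with hypothetical sample labels (number = sample, letter = replicate).
nuclei = [
    {"sample": "PFC_1a", "reads": 25_000, "genes": 850},
    {"sample": "PFC_1a", "reads": 9_000, "genes": 400},   # too few reads
    {"sample": "Hip_2a", "reads": 12_000, "genes": 150},  # too few genes
    {"sample": "Hip_2a", "reads": 10_000, "genes": 200},  # exactly at threshold
]
counts = high_quality_counts(nuclei)
print(dict(counts))
```

The same function applies to the 20,000-read threshold used in Supplementary Fig. 1 by passing `min_reads=20_000`.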
(a) Reproducibility. tSNE embedding (as in Fig. 1d) of 816 DroNc-seq nuclei profiles from mouse GABAergic neuron clusters (clusters 10-11 in Fig. 1a), colour coded by sample of origin. Technical replicates are marked as separate samples. (b) Complexity. Violin plots show the distribution of the number of transcripts in the nuclei in each mouse GABAergic sub-cluster. (c,d) Each cluster is characterized by a unique combination of expressed marker genes. (c) tSNE embedding as in (a), colour coded by the expression level of marker genes (marked on top). (d) Violin plots of the distribution of expression levels (scaled log(transcripts)) of marker genes (marked on top) in the nuclei (dots) in each of the mouse GABAergic sub-clusters (defined as in Fig. 1d).
(a) tSNE plot of DroNc-seq profiles from adult human hippocampus (Hip) and prefrontal cortex (PFC), as in Fig. 2a, but with clusters grouped by cell type annotations and anatomical distinctions. exPFC=pyramidal neurons from the PFC, exCA1/exCA3=pyramidal neurons from the Hip CA regions, GABA=GABAergic interneurons, exDG=granule neurons from the Hip dentate gyrus region, ASC=astrocytes, ODC=oligodendrocytes, OPC=oligodendrocyte precursor cells, MG=microglia, NSC=neuronal stem cells, END=endothelial cells. (b) Complexity. Violin plots show the distribution of the number of transcripts detected in the nuclei of each cluster (clusters as in Fig. 2a). (c) Quality. Number of nuclei passing quality filter (Methods; at least 10,000 reads and 200 genes detected per nucleus) in each sample. Numbers denote different samples, and letters denote technical replicates from the same sample. (d) Each cluster is supported by multiple samples. Number of nuclei from each human sample (colour code; blue=PFC; yellow=Hip; samples and technical replicates marked as in c) that are associated with each cluster (x-axis, defined as in Fig. 2a). (e) Interferon signalling and MHC I genes in single endothelial cells. Shown is the expression of a sub-set of genes (rows) up-regulated (Methods) in the endothelial nuclei Cluster 15 (in Fig. 2a) across the nuclei (columns) in the cluster. Right: membership of the genes in two enriched pathways (hypergeometric p-value FDR < 0.01, Methods). (f) Violin plots of the distribution of expression levels (scaled log(transcripts)) of neuronal stem cell marker genes (marked on top) in nuclei (dots) in each of the human clusters (defined as in Fig. 2a).
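The pathway enrichment reported in panel (e) (hypergeometric p-value, FDR < 0.01) can be sketched with a one-sided hypergeometric test followed by multiple-testing correction. The caption does not name the FDR procedure; Benjamini-Hochberg is assumed here, and the function names and toy numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when drawing `draws` genes without replacement from
    `population` genes, of which `successes` belong to the pathway."""
    upper = min(successes, draws)
    total = comb(population, draws)
    return sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    ) / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example: all 5 up-regulated genes fall in a 5-gene pathway out of 10 genes.
p_small = hypergeom_upper_tail(5, population=10, successes=5, draws=5)
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(p_small, q_values)
```

A pathway would be reported as enriched when its adjusted value falls below the chosen cutoff, here 0.01.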
(a) Sub-clusters. tSNE embedding of DroNc-seq nuclei profiles from 1,116 nuclei of human pyramidal neurons in the CA region of the hippocampus (CA3, CA1 and subiculum; clusters 3-4 in Fig. 2a), colour coded by cluster membership (legend). (b) Each sub-cluster is supported by multiple samples. tSNE embedding as in a, colour coded by the sample of origin. Technical replicates are marked as separate samples. (c) Complexity. Violin plots of the distribution of number of transcripts in nuclei (dots) from each exCA sub-cluster. (d,e) Marker gene expression. (d) tSNE embedding as in a, but with nuclei coloured by the expression levels of genes up-regulated in specific clusters (SPOCK1 – cluster 5, RASL10A – cluster 4, PEX5L – clusters 1-4, FCHO2 – cluster 3, CCSER1 – cluster 2, FN1 – cluster 1). (e) ISH images from the mouse Allen Brain Atlas (ref. 31) of three marker genes from (d) with distinct anatomical expression patterns in the hippocampus (Spock1 in CA3, Fn1 in subiculum, and Pex5l in CA1 and subiculum).
(a) Sub-clusters. tSNE embedding of 2,501 DroNc-seq nuclei profiles of human glutamatergic neurons in the prefrontal cortex (PFC) (clusters 1 and 2 in Fig. 2a), colour coded by cluster membership. (b) Each sub-cluster is supported by multiple samples. tSNE embedding as in a, colour coded by the sample of origin. Technical replicates are marked as separate samples. (c) Complexity. Violin plots show the distribution of number of transcripts in the nuclei in each exPFC sub-cluster. (d-f) Each sub-cluster is characterized by a unique combination of expressed marker genes. (d) Violin plots show the distribution of expression levels (scaled log(transcripts)) of known cortical layer marker genes or genes differentially expressed between exPFC sub-clusters, in nuclei (dots) from each of the sub-clusters. (e) tSNE embedding as in (a), but colour coded by expression level of genes, showing unique combinatorial expression patterns across sub-clusters. (f) ISH images from the mouse Allen Brain Atlas (ref. 31) of marker genes (from e) that have a unique expression pattern in specific cortical layers: Rorb (cortical layer 4-5), Pcp4 (cortical layer 5), Tle4 (cortical layer 6), and Cux2 (cortical layer 2-4). Scale: 1678 μm.
(a,b) Each cluster is supported by multiple samples and most by multiple brain regions. tSNE embedding (as in Fig. 2f) of 1,500 DroNc-seq nuclei profiles of human GABAergic neurons (clusters 5 and 6 in Fig. 2a), colour coded by the sample of origin (a) or by brain region (b, PFC=blue, Hip=yellow). Each cluster has nuclei from both brain regions, except clusters 4 and 7, which are hippocampus-specific. (c) Fraction of nuclei from each brain region associated with each GABAergic sub-cluster defined in Fig. 2f. (d,e) Each cluster is characterized by a unique combination of expressed marker genes. (d) tSNE embedding as in Fig. 2f with each nucleus coloured by the expression level (scaled log(transcripts)) of canonical GABAergic marker genes. (e) Violin plots of the distribution of expression levels (scaled log(transcripts)) of each GABAergic marker gene in the nuclei (dots) in each human GABAergic sub-cluster (as in Fig. 2f).
Supplementary Figure 11 Comparison of DroNc-seq human GABAergic sub-clusters to previously defined sub-clusters of human cells.
Mapping of GABAergic nuclei sub-clusters defined in Fig. 2f to subsets defined from nuclei profiles in the human cortex in Lake et al. Dot plot shows the proportion of cells in each cluster defined by DroNc-seq (rows) that were classified to each of the Lake et al. clusters (columns) using a decision tree classifier defined in the previous study (ref. 4) (Methods).
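The quantity shown in the dot plot, the proportion of nuclei in each DroNc-seq sub-cluster assigned to each previously defined cluster, is a row-normalized contingency table. The sketch below computes it from two parallel label lists; it does not implement the decision tree classifier itself, and the function name and cluster labels are hypothetical placeholders.

```python
from collections import Counter, defaultdict


def row_proportions(source_labels, assigned_labels):
    """For each source cluster (row), the fraction of its members assigned
    to each target cluster (column). Each row sums to 1."""
    counts = defaultdict(Counter)
    for source, assigned in zip(source_labels, assigned_labels):
        counts[source][assigned] += 1
    return {
        source: {target: n / sum(row.values()) for target, n in row.items()}
        for source, row in counts.items()
    }


# Placeholder labels: DroNc-seq sub-cluster per nucleus and the cluster
# assigned to the same nucleus by an external classifier.
dronc_clusters = ["GABA1", "GABA1", "GABA1", "GABA1", "GABA2", "GABA2"]
classifier_calls = ["In1", "In1", "In1", "In3", "In6", "In6"]
proportions = row_proportions(dronc_clusters, classifier_calls)
print(proportions)
```

In a dot plot, each proportion would set the size or colour of the dot at the corresponding row and column.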
Supplementary Figures 1–11.
Step-by-step protocol for DroNc-seq.
Cells Nuclei 3T3.
Data Info Mouse.
Mouse Cluster Markers.
Mouse GABAergic Cluster Markers.
Tissue Samples from GTEX.
Data Info Human.
Human Cluster Markers.
Human GABAergic Clusters Markers.
CAD scheme of DroNc-seq microfluidic device.
About this article
Cite this article
Habib, N., Avraham-Davidi, I., Basu, A. et al. Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat Methods 14, 955–958 (2017). https://doi.org/10.1038/nmeth.4407