Abstract
High-throughput single-cell RNA sequencing has transformed our understanding of complex cell populations, but it does not provide phenotypic information such as cell-surface protein levels. Here, we describe cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), a method in which oligonucleotide-labeled antibodies are used to integrate cellular protein and transcriptome measurements into an efficient, single-cell readout. CITE-seq is compatible with existing single-cell sequencing approaches and scales readily with throughput increases.
Acknowledgements
We acknowledge S. Jaini and K. Pandit of the NYGC Technology Innovation Lab for critical discussions and support; E. Papalexi for help with CBMC isolation; M. Coppo, S. Fennessey, B. Baysa and S. Pescatore at NYGC for sequencing support; and C. Kocks for manuscript discussions. Multiparameter flow cytometer instrument time and reagents were generously provided by the Vaccine Research Center of the National Institutes of Health. C.H. was supported by a Deutsche Forschungsgemeinschaft research fellowship. R.S. was supported by an NIH New Innovator Award (DP2-HG-009623). Research reported here was partially supported by the National Human Genome Research Institute under Award Number UM1HG008901.
Author information
Contributions
M.S. conceived and designed the study with input from B.H.-L., R.S., H.S. and P.S. M.S. performed all experiments. C.H. and R.S. designed and contributed the computational analyses. W.S. assisted with Drop-seq experiments. P.K.C. provided conceptual input on how to benchmark CITE-seq against flow cytometry and performed multiparameter flow cytometry analysis. M.S., C.H., R.S. and P.S. interpreted the data. M.S. and P.S. wrote the manuscript with input from all authors.
Ethics declarations
Competing interests
M.S., B.H.-L. and P.S. have filed a patent application based on this work (US provisional patent application 62/515-180).
Integrated supplementary information
Supplementary Figure 1 CITE-seq library preparation.
(a) Illustration of the DNA-barcoded antibodies used in CITE-seq. (b) Antibody-oligonucleotide complexes appear as a high-molecular-weight smear when run on an agarose gel (1). Cleavage of the oligo from the antibody by reduction of the disulfide bond collapses the smear to oligo length (2). (c) Drop-seq beads are microparticles with conjugated oligonucleotides comprising a common PCR handle and a cell barcode, followed by a unique molecular identifier (UMI) and a polyT tail. (d) Schematic illustration of CITE-seq library prep in Drop-seq (downstream of Fig. 1b). Reverse transcription and template switching are performed in bulk after emulsion breakage. After amplification, full-length cDNA and antibody-oligo products can be separated by size and amplified independently (also shown in d). (e) Reverse transcription and amplification produce two product populations with distinct sizes (left panel). These can be size-separated and amplified independently to obtain full-length cDNAs (top panel, capillary electrophoresis trace) and ADTs (bottom panel, capillary electrophoresis trace).
Supplementary Figure 2 Analysis of mixtures of mouse and human cells that were incubated with oligo-tagged antibodies specific for either human or mouse cell-surface markers.
(a) Fractions of human RNA molecules compared to human ADTs for detected cell barcodes in the Drop-seq species mixing experiment. Each point represents one cell barcode (i.e., a droplet containing one or more cells). (b-e) CITE-seq of the human and mouse mixing experiment repeated on a commercially available system from 10x Genomics. (b) Quantification of the number of human and mouse transcripts associated with each cell barcode. Each green point indicates a cell barcode (i.e., a droplet containing one or more cells) from which we measured >90% human transcripts; each red point indicates a cell barcode with >90% mouse transcripts. Blue points indicate cell barcodes (i.e., droplets) from which we observed a mixture of human and mouse transcripts. (c) Quantification of antibody tags (ADTs) associated with each cell barcode. Points are colored based on the species classifications derived from transcripts in panel b. (d) Quantification of human, mouse or mixed-cell barcodes based on RNA transcripts, or ADTs, detected in 10x Genomics workflows. (e) Fractions of human RNA molecules compared to human ADTs for detected cell barcodes. Each point represents one cell barcode (i.e., a droplet containing one or more cells).
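The species classification described in panel b can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `classify_barcode`, the handling of barcodes with no transcripts, and the use of a strict "greater than" comparison are assumptions; only the 90% threshold comes from the caption.

```python
def classify_barcode(human_count, mouse_count, threshold=0.9):
    """Label one cell barcode by the species origin of its transcripts.

    The 0.9 threshold follows the caption (>90% of transcripts from one
    species). The "empty" label for barcodes without transcripts is an
    assumption made for this sketch.
    """
    total = human_count + mouse_count
    if total == 0:
        return "empty"
    if human_count / total > threshold:
        return "human"
    if mouse_count / total > threshold:
        return "mouse"
    # Neither species dominates: the droplet likely held cells of both species.
    return "mixed"


if __name__ == "__main__":
    # Hypothetical (human, mouse) transcript counts for three barcodes.
    for counts in [(950, 50), (30, 970), (600, 400)]:
        print(counts, classify_barcode(*counts))
```

The same function could be applied to per-barcode ADT counts instead of transcript counts to reproduce the kind of comparison shown in panel d.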
Supplementary Figure 3 Comparing qualitative and quantitative readout in CITE-seq and flow cytometry.
(a,b) Peripheral blood mononuclear cells were processed by flow cytometry and CITE-seq to compare the qualitative readout of the two technologies. (a) Relevant immune populations were labelled. (b) Relative abundances of relevant immune cell subsets as determined by flow cytometry and CITE-seq (see also Fig. 2a,b and panel a). (c-e) Relative quantitative comparison of flow cytometry and CITE-seq. (c) Cells were first gated based on forward and side scatter and separated from dead cells. Profile of CD4 and CD8a fluorescence in CBMCs. Colored boxes are gates set to sort different levels of CD8. (d) Re-analysis of cells sorted into CD8a very-high (+++), high (++), intermediate (+) and low (+/-) by flow cytometry. Histograms of CD8a levels (fluorescence intensities) in the four different pools of cells. 2,000 cells were measured for each run. (e) CD8a levels obtained by CITE-seq of the different pools of cells sorted in panel c. Histograms of four CITE-seq runs of the separate pools. 288-522 cells were measured for each run.
Supplementary Figure 4 Gene expression in CBMC clusters.
(a) Gene expression heatmap of most differentially expressed marker genes defining clusters. Dimensionality reduction followed by modularity optimization was used to cluster 8,005 CBMCs (methods). Cluster color assignments are identical to Figure 3a. The mouse control cell population was excluded from the clustering. (b) Expression of individual marker genes in the context of tSNE representation of cell relationships based on single-cell gene-expression profiles. Levels of transcripts corresponding to specific marker genes are indicated by blue shading.
Supplementary Figure 5 Multimodal bi-axial plots.
(a) Histograms of centered log-ratio (CLR)-transformed ADT counts in CBMCs (red) and mouse control cells spiked in at very low frequency (blue). Solid line shows the cutoff determined for significant ADT signal (mean of mouse values + standard deviation of mouse values). (b) Histograms of CLR-transformed ADT counts for the three antibody-oligo conjugates in our 13-antibody pool that did not pass the mouse-derived threshold. (c) Multimodal bi-axial plots. Pairwise comparison of different CLR-transformed ADT levels in CBMCs. Upper right: CBMCs plotted with colors based on RNA clusters shown in Figure 3a. Lower left: mouse control cells (black) that were spiked in at very low frequency are overlaid on the CBMCs (light grey).
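As a rough illustration of the quantities in panel a, the sketch below applies a centered log-ratio transform to the ADT counts of one cell and computes a mouse-derived cutoff as mean plus one standard deviation. Several details are assumptions rather than taken from the caption: the pseudocount of 1 (needed so that zero counts have a defined logarithm), the choice to transform across antibodies within a cell, the use of the sample standard deviation, and the function names.

```python
import math
import statistics


def clr_transform(counts, pseudocount=1.0):
    """Centered log-ratio transform of one cell's ADT counts.

    Each value becomes log(count + pseudocount) minus the mean of those logs,
    i.e. the log of the count relative to the geometric mean across antibodies.
    The pseudocount is an assumption made for this sketch.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def mouse_cutoff(mouse_values):
    """Cutoff for one antibody: mean of mouse values + their standard deviation.

    statistics.stdev is the sample standard deviation; whether the sample or
    population form was used is not stated in the caption.
    """
    return statistics.mean(mouse_values) + statistics.stdev(mouse_values)


if __name__ == "__main__":
    # Hypothetical raw ADT counts for one cell across three antibodies.
    print(clr_transform([10, 200, 3]))
    # Hypothetical CLR values of one antibody in the mouse control cells.
    print(mouse_cutoff([1.0, 2.0, 3.0]))
```

An antibody would then be considered informative in a given cell only when its CLR value exceeds the cutoff computed from the mouse control cells.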
Supplementary Figure 6 Joint analysis of protein and RNA to finely resolve NK cell (CD56bright vs CD56dim) populations that have only subtle transcriptomic differences.
(a) mRNA (blue) and corresponding ADT (green) signal for CD56 and CD16 projected onto the NK cell cluster tSNE plot. Darker shading corresponds to higher levels measured. The NK cell cluster was split into CD56bright and CD56dim groups based on CD56 ADT levels. (b) Histogram of CD8a levels in the CD56bright and CD56dim cells. A two-sample Kolmogorov–Smirnov test p-value < 0.001 indicates that the two CD8a distributions differ. Histograms for CD4 and CD45RA, ADTs that show low/absent or high expression in these cells but no difference between the CD56bright and CD56dim populations, are shown as controls.
Supplementary Figure 7 Correlations between mRNA and protein marker levels in CITE-seq.
(a) Correlation between normalized mRNA expression and CLR-transformed ADT counts at a single cell level. For each gene, cells with no detected mRNA molecules were excluded. Pearson’s correlation coefficient is shown in the boxed labels. (b) Correlation between normalized mRNA expression and CLR-transformed ADT counts at the cluster level. No cells were excluded. mRNA and ADT signals were averaged per cluster before calculating Pearson’s correlation coefficient, shown in boxed labels.
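The two correlation schemes in this caption can be sketched as below. This is an illustration only: the function names are hypothetical, and the per-cell filter assumes that a normalized mRNA value of 0 corresponds to a cell with no detected mRNA molecules for that gene.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


def single_cell_correlation(mrna, adt):
    """Panel a: correlate per cell, dropping cells with no detected mRNA."""
    kept = [(m, a) for m, a in zip(mrna, adt) if m > 0]
    return pearson([m for m, _ in kept], [a for _, a in kept])


def cluster_correlation(mrna, adt, clusters):
    """Panel b: average both signals per cluster (no cells excluded), then correlate."""
    grouped = defaultdict(lambda: ([], []))
    for m, a, label in zip(mrna, adt, clusters):
        grouped[label][0].append(m)
        grouped[label][1].append(a)
    mean_mrna = [sum(ms) / len(ms) for ms, _ in grouped.values()]
    mean_adt = [sum(vals) / len(vals) for _, vals in grouped.values()]
    return pearson(mean_mrna, mean_adt)


if __name__ == "__main__":
    # Hypothetical values for one gene/antibody pair across six cells.
    mrna = [0, 2, 2, 4, 4, 6]
    adt = [1, 1, 2, 2, 3, 3]
    clusters = ["a", "a", "b", "b", "c", "c"]
    print(single_cell_correlation(mrna, adt))
    print(cluster_correlation(mrna, adt, clusters))
```

Averaging per cluster before correlating smooths out dropout in the mRNA signal, which is one reason cluster-level coefficients can be higher than single-cell ones.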
Supplementary Figure 8 Clustering of CBMCs based on ADTs.
(a) CITE-seq single-cell ADT profiles of 8,005 CBMCs (and ~600 mouse control cells) were clustered using modularity optimization, resulting in 17 cell populations (including the mouse control cell population) with distinct antibody compositions. (b) ADT levels for 10 markers in clusters defined by ADT levels. Levels of ADT are indicated by blue to dark blue shading. (c) Single-cell ADT level heatmap in the ADT-derived clusters. Colors of clusters in top panel represent colors in panel a. (d) Single-cell gene-expression heatmap of top marker genes of ADT-derived clusters. Colors of clusters in top panel represent colors in panel a.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–8 and Supplementary Tables 1 and 2 (PDF 3080 kb)
Supplementary Protocol
CITE-seq protocol (PDF 971 kb)
About this article
Cite this article
Stoeckius, M., Hafemeister, C., Stephenson, W. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 14, 865–868 (2017). https://doi.org/10.1038/nmeth.4380
DOI: https://doi.org/10.1038/nmeth.4380