We present a tool to measure gene and protein expression levels in single cells with DNA-labeled antibodies and droplet microfluidics. Using the RNA expression and protein sequencing assay (REAP-seq), we quantified proteins with 82 barcoded antibodies and >20,000 genes in a single workflow. We used REAP-seq to assess the costimulatory effects of a CD27 agonist on human CD8+ lymphocytes and to identify and characterize an unknown cell type.
Macosko, E.Z. et al. Cell 161, 1202–1214 (2015).
Zheng, G.X. et al. Nat. Commun. 8, 14049 (2017).
Villani, A.-C. et al. Science 356, eaah4573 (2017).
Rizvi, A.H. et al. Nat. Biotechnol. 35, 551–560 (2017).
Tirosh, I. et al. Science 352, 189–196 (2016).
Liu, Y., Beyer, A. & Aebersold, R. Cell 165, 535–550 (2016).
Darmanis, S. et al. Cell Reports 14, 380–389 (2016).
Battle, A. et al. Science 347, 664–667 (2015).
Schwanhäusser, B. et al. Nature 473, 337–342 (2011).
Perfetto, S.P., Chattopadhyay, P.K. & Roederer, M. Nat. Rev. Immunol. 4, 648–655 (2004).
Bendall, S.C. et al. Science 332, 687–696 (2011).
Genshaft, A.S. et al. Genome Biol. 17, 188 (2016).
van der Maaten, L. & Hinton, G. J. Mach. Learn. Res. 9, 2579–2605 (2008).
French, R.R. et al. Blood 109, 4810–4815 (2007).
Roberts, D.J. et al. J. Immunother. 33, 769–779 (2010).
Ramakrishna, V. et al. J. Immunother. Cancer 3, 37 (2015).
Manz, M.G., Miyamoto, T., Akashi, K. & Weissman, I.L. Proc. Natl. Acad. Sci. USA 99, 11872–11877 (2002).
Stoeckius, M. et al. Nat. Methods http://dx.doi.org/10.1038/nmeth.4380 (2017).
Gierahn, T.M. et al. Nat. Methods 14, 395–398 (2017).
Alles, J. et al. BMC Biol. 15, 44 (2017).
Kotecha, N., Krutzik, P.O. & Irish, J.M. Curr. Protoc. Cytom. 53, 10.17 (2010).
Wognum, A.W., Eaves, A.C. & Thomas, T.E. Arch. Med. Res. 34, 461–475 (2003).
Thomas, T.E., Miller, C.L. & Eaves, C.J. Methods 17, 202–218 (1999).
Dobin, A. et al. Bioinformatics 29, 15–21 (2013).
Satija, R., Farrell, J.A., Gennert, D., Schier, A.F. & Regev, A. Nat. Biotechnol. 33, 495–502 (2015).
Waltman, L. & Jan van Eck, N. Eur. Phys. J. B 86, 471 (2013).
Li, H. et al. Bioinformatics 25, 2078–2079 (2009).
Ashburner, M. et al.; The Gene Ontology Consortium. Nat. Genet. 25, 25–29 (2000).
Gene Ontology Consortium. Nucleic Acids Res. 43, D1049–D1056 (2015).
Subramanian, A. et al. Proc. Natl. Acad. Sci. USA 102, 15545–15550 (2005).
Mootha, V.K. et al. Nat. Genet. 34, 267–273 (2003).
Hu, J., Ge, H., Newman, M. & Liu, K. Bioinformatics 28, 1933–1934 (2012).
Li, B. & Dewey, C.N. BMC Bioinformatics 12, 323 (2011).
We acknowledge R. Riener for help with staining and profiling Rhesus blood blocked with aCD27 drug.
The authors declare no competing financial interests.
Integrated supplementary information
(a) Cells are labeled with Ab-Barcodes (AbBs, Supplementary Fig. 2) before compartmentalization into discrete droplets containing a bead with cell-barcode primers. Upon droplet formation, the cell is lysed and the polyadenylated mRNA and AbBs hybridize to the poly(dT) cell-barcoded primer. REAP-seq leverages the DNA polymerase activity of reverse transcriptase to simultaneously extend the hybridized AbB and synthesize complementary DNA from mRNA in the same reaction. The droplet emulsion is then broken and the cell-barcoded AbB sequences (∼155 bp) are size fractionated from the cell-barcoded cDNA derived from mRNA (>∼500 bp). mRNA and protein libraries are prepared and sequenced (see Methods). (b) Paired-end sequencing reads for mRNA and protein libraries are generated on a high-throughput sequencer. The mRNA workflow is similar to previously published methods1,2. Protein sequencing reads are first aligned using an antibody-barcode dictionary that associates each antibody with a unique 8 bp sequence (Supplementary Table 1). Next, reads are grouped by their cell barcodes, and sequences with unique UMIs are counted for each protein and gene in each cell. The result is a digital protein and gene expression matrix where each column corresponds to a cell, and each row corresponds to a different protein or gene. Each entry in this matrix is the integer number of unique UMIs detected for that protein or gene in that cell.
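The protein counting step in (b) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the barcode sequences and the function name `count_protein_umis` are hypothetical, reads are assumed to be already parsed into (cell barcode, UMI, antibody barcode) tuples, and only exact barcode matches are accepted.

```python
from collections import defaultdict

# Hypothetical 8 bp antibody barcodes, for illustration only.
# The real dictionary is given in Supplementary Table 1.
AB_BARCODES = {"ACGTACGT": "CD8", "TTGACCAA": "CD27"}


def count_protein_umis(reads):
    """Count unique UMIs per protein per cell.

    reads: iterable of (cell_barcode, umi, ab_barcode) tuples.
    Returns {protein: {cell_barcode: n_unique_umis}}.
    """
    seen = defaultdict(set)  # (protein, cell) -> set of UMIs observed
    for cell, umi, ab_barcode in reads:
        protein = AB_BARCODES.get(ab_barcode)
        if protein is None:
            continue  # barcode not in the dictionary: discard the read
        seen[(protein, cell)].add(umi)

    matrix = defaultdict(dict)
    for (protein, cell), umis in seen.items():
        matrix[protein][cell] = len(umis)
    return dict(matrix)


reads = [
    ("CELL1", "AAAA", "ACGTACGT"),
    ("CELL1", "AAAA", "ACGTACGT"),  # same UMI again: counted once
    ("CELL1", "CCCC", "ACGTACGT"),
    ("CELL2", "GGGG", "TTGACCAA"),
    ("CELL2", "TTTT", "NNNNNNNN"),  # unknown antibody barcode: skipped
]
counts = count_protein_umis(reads)
```

Collapsing reads into sets of UMIs is what makes each matrix entry a molecule count rather than a read count, so PCR duplicates do not inflate the signal.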
Antibodies were conjugated to oligonucleotides (65-66 bp) that consisted of three parts: 1) a 33 bp Nextera Read 1 sequence that was used as a primer for amplification as well as sequencing, 2) a unique 8 bp antibody barcode (Ab BC) and 3) a 24-25 bp poly(dA) sequence. Because oligonucleotides with long runs of the same base are difficult to manufacture, if the Ab BC ended in an A, then the poly(dA) was 24 bp to keep consecutive insertions of the same base <26 bp. All oligonucleotides were purchased from Integrated DNA Technologies with a 5′ amine modification (/5AmMC6/).
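The length rule described above can be written out as a short sketch. The function name `build_abb_oligo` is hypothetical, and the Read 1 segment is a placeholder of the stated length (33 bp), not the actual Nextera sequence.

```python
# Placeholder with the stated length of the Nextera Read 1 segment (33 bp).
# This is NOT the real sequence; it only keeps the arithmetic honest.
READ1_PLACEHOLDER = "N" * 33


def build_abb_oligo(ab_barcode, read1=READ1_PLACEHOLDER):
    """Assemble Read 1 + 8 bp antibody barcode + poly(dA).

    The poly(dA) tail is 24 bp when the barcode ends in A and 25 bp
    otherwise, following the rule stated in the text.
    """
    if len(ab_barcode) != 8 or set(ab_barcode) - set("ACGT"):
        raise ValueError("antibody barcode must be 8 bases of A/C/G/T")
    poly_da_length = 24 if ab_barcode.endswith("A") else 25
    return read1 + ab_barcode + "A" * poly_da_length


oligo_ending_in_a = build_abb_oligo("ACGTACGA")  # 33 + 8 + 24 = 65 bp
oligo_other = build_abb_oligo("ACGTACGT")        # 33 + 8 + 25 = 66 bp
```

The two resulting lengths, 65 and 66 bp, match the oligonucleotide size range given in the text.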
REAP-seq leverages the DNA polymerase activity of reverse transcriptase to simultaneously synthesize complementary DNA from mRNA and extend the hybridized AbB, resulting in dsDNA containing the Ab barcode, cell barcode, UMI, and two PCR primer sequences. The droplets are then broken and a silane bead cleanup step is used to purify the DNA from the oil emulsion. The shorter dsDNA molecule (∼155 bp) is then separated from the longer cDNA molecules (typically >500 bp) using a 0.6x SPRI bead enrichment step. All downstream scRNA-seq steps remain the same (10x Genomics v1 standard protocol), and the supernatant that is typically discarded from the SPRI cleanup step is used for the protein part of the assay. Exonuclease I is used to degrade any excess unbound single-stranded oligonucleotides from the protein dsDNA (∼155 bp) product to prevent crosstalk between AbBs and cell barcodes from different cells. The dsDNA product is then amplified using Illumina adapter sequence primers (P7 and P5) and a primer containing a sample index (P5-Sample_Index-Part_Rd1, Supplementary Table 6) to create a final library product that can be sequenced in a multiplexed fashion with other samples (Supplementary Table 7).
Supplementary Figure 4 Validation of REAP-seq protein assay using Anti-mouse IgG beads labeled with AbBs.
(a) The protein part of the REAP-seq assay was conducted on a mixture of beads labeled with either AbB 1 (CD70) or AbB 2 (CD13). The scatter plot shows the number of AbB counts associated with each bead. Blue dots indicate beads designated as AbB 1 specific (>70% AbB 1 counts); red dots indicate beads that are AbB 2 specific (>70% AbB 2 counts). Of the 574 beads identified (>100 counts), 4 (0.7%) had a mixed phenotype, suggesting that the Exonuclease I step was successful at degrading excess unbound single-stranded barcodes and preventing crosstalk between AbBs and cell barcodes from different beads. (b) t-SNE visualization of clusters identified among 1,082 beads that were labeled with one of the 10 different AbBs (CD127, TIGIT, CD27, CD8a, CD73, CD28, CD9, CD40, OX40, and Mouse IgG1 isotype control) and processed through the REAP-seq protein pipeline. Beads were assigned a color based on the AbB with the maximum number of counts. As expected, 10 clusters were identified, one for each of the 10 AbBs.
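The bead classification in (a) can be sketched as below. The function name `classify_bead` is hypothetical, and the sketch assumes that the ">100 counts" filter applies to the combined AbB 1 and AbB 2 counts per bead, which the legend does not state explicitly.

```python
def classify_bead(abb1_counts, abb2_counts, min_counts=100, min_fraction=0.70):
    """Label a bead as AbB1-specific, AbB2-specific, mixed, or too sparse.

    Assumption: the count filter is applied to the sum of both barcodes.
    """
    total = abb1_counts + abb2_counts
    if total <= min_counts:
        return "below_threshold"
    if abb1_counts / total > min_fraction:
        return "AbB1"
    if abb2_counts / total > min_fraction:
        return "AbB2"
    return "mixed"


# Mixed-phenotype rate reported in the legend: 4 of 574 beads.
mixed_percent = round(100 * 4 / 574, 1)
```

For example, a bead with 900 AbB 1 counts and 100 AbB 2 counts is 90% AbB 1 and is labeled `"AbB1"`, whereas a 600/400 bead falls below the 70% cutoff for both barcodes and is labeled `"mixed"`.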
(a) PBMCs were blocked with either salmon sperm DNA (1 mg/ml), dextran sulfate (0.2 mg/ml), or polyanionic inhibitor (1 μM). Bulk PBMCs were labeled with an AbB mix (n=28) and protein libraries were prepared for bulk cells rather than single cells (less expensive for initial optimization experiments). The table shows normalized counts (AbB counts / total AbB counts × 1×10⁴) of DNA barcodes from isotype control antibodies: Mouse IgG1, Mouse IgG2b, and Rat IgG1. Dextran sulfate showed the best reduction in non-specific binding of the AbB isotype controls. (b) PBMCs were either blocked with dextran sulfate (0.2 mg/ml) or not blocked and then labeled with an AbB mix (n=45). UMI count graphs show the % total cells (# cells that had a specific number of UMI counts / total # cells × 100) that have a specific # of UMI counts. Single cells (without DS, n=3,158; with DS, n=4,330) were processed with REAP-seq, and protein measurements show that dextran sulfate blocking helped reduce non-specific binding of the isotype controls and increased the % cells with 0 UMI counts. All three isotype controls blocked with dextran sulfate had a background noise of ≤2 UMI counts in >96% of the cells.
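The two quantities defined in this legend are simple ratios; a minimal sketch is given here with hypothetical function names and made-up input values.

```python
from collections import Counter


def normalize_counts(abb_counts, scale=1e4):
    """Normalized count = AbB counts / total AbB counts x 1e4."""
    total = sum(abb_counts.values())
    return {ab: count / total * scale for ab, count in abb_counts.items()}


def percent_cells_by_umi(umi_counts_per_cell):
    """% total cells = (# cells with a given UMI count / total # cells) x 100."""
    n_cells = len(umi_counts_per_cell)
    tally = Counter(umi_counts_per_cell)
    return {umi: 100 * n / n_cells for umi, n in sorted(tally.items())}


# Illustrative values only, not data from the study.
normalized = normalize_counts({"Mouse IgG1": 1, "CD3": 3})
distribution = percent_cells_by_umi([0, 0, 0, 1, 2])
```

With these toy inputs, the isotype control receives 2,500 of the 10,000 normalized counts, and 60% of cells have 0 UMI counts.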
Supplementary Figure 6 Evaluation of specificity and non-specific binding in the REAP-seq protein assay.
(a) High correlation (R²=0.95) between two different monoclonal antibodies against CD8 (clones RPA-T8 and SK1) was observed in PBMCs, indicating high specificity and reproducibility of single cell protein measurements. (b) Evaluation of non-specific binding using isotype controls in aCD27 treated and untreated cells from Donor 1 (n=5,196 cells). UMI count graphs show the % total cells (# cells that had a specific number of UMI counts / total # cells × 100) that have a specific # of UMI counts. Isotype controls Mouse IgG2a, Mouse IgG2b, Rat IgG1, and Rat IgG2a have >90% cells with 0 UMI counts and >98% of cells have <2 UMI counts.
(a) CD3+ T cells (n=3,797), CD19+ B cells (n=1,533), and CD11b+ myeloid cells (n=2,883) were magnetically enriched from PBMCs and processed with REAP-seq. Gene expression matrices from the 3 magnetically enriched cell populations (CD3+, CD19+, CD11b+) were merged into one matrix and the nonlinear dimensionality reduction method, t-Distributed Stochastic Neighbor Embedding (t-SNE), was used to visualize the PCA-reduced dataset in two-dimensional space. Six clusters were identified using the top 9 significant principal components across 1,789 variable genes and are shown in the t-SNE visualization. (b) Cells are colored by the magnetic beads used for isolation: CD3+ (pink), CD11b+ (green), CD19+ (blue) and projected on the t-SNE plot from (a). There are three easily discernible purified populations of cells, which can be used as a positive control to assess the sensitivity and specificity of REAP-seq mRNA and protein measurements for canonical markers of these cell types. (c) mRNA and protein signal for canonical markers expressed in myeloid cells (CD11b, CD33, CD14, CD155), B cells (CD19, CD20) and T cells (CD3, CD4, CD8) projected on the t-SNE plot from (a). For each marker, the Pearson correlation coefficient (R) between mRNA and protein expression across 8,213 single cells is displayed. Purple indicates high expression and grey indicates low expression. (d) mRNA signal for markers expressed in FCGR3A+ monocytes (FCGR3A) and mature B cells (TNFRSF17) projected on the t-SNE plot from (a). Purple indicates high expression and grey indicates low expression.
(a) Violin plots showing the distribution of gene expression in PBMCs that were processed with REAP-seq (n=4,113) versus standard scRNA-seq (n=3,158) using the 10× Genomics platform. Genes on the left were expressed in fewer cells and genes on the right were expressed in a larger number of cells. The distribution of gene expression is comparable between the two platforms, suggesting that the protein assay does not affect the standard scRNA-seq assay.
(a) The REAP-seq transcriptomic assay was run on PBMCs labeled with or without AbBs (scRNA-seq). Sequencing reads were processed with our bulk RNA-seq pipeline (Omicsoft, see Methods) and the 5,000 highest expressed genes were compared between the two conditions. High correlation (R²=0.97) suggests that labeling cells with AbBs does not affect the transcriptomic signature. (b) PBMCs were either blocked with dextran sulfate (0.2 mg/ml) or not blocked before labeling cells with AbBs and processed through the REAP-seq pipeline. Comparison between the top 5,000 expressed genes shows good correlation (R²=0.96), suggesting that blocking cells with dextran sulfate does not affect the transcriptomic signature. Gene counts were normalized by total counts and scaled by a factor of 1×10⁶.
Supplementary Figure 10 REAP-seq characterization of ex vivo activation of naïve CD8+ T cells with aCD27
(a) Schematic of the aCD27 ex vivo assay where naïve CD8+ enriched T cells were either treated with aCD3 and aCD28 (top, aCD27 untreated) or aCD3, aCD28, and aCD27 (bottom, aCD27 treated). (b) t-SNE visualization plots based on either gene or protein expression for each of the 3 donors where cells were treated with aCD27 (Donor 1, 4,246; Donor 2, 4,044; Donor 3, 3,550 cells) or not treated with aCD27 (Donor 1, 950; Donor 2, 622; Donor 3, 406 cells). For t-SNE visualization of mRNA data, 6-7 clusters were identified using the top 10, 15, and 10 significant principal components across 1,452, 1,159, and 1,706 genes for Donor 1, 2, and 3, respectively. For t-SNE visualization of protein data, 6-7 clusters were identified using the top 10 significant principal components across all measured proteins. Cells are colored by cluster.
Supplementary Figure 11 Unsupervised clustering of aCD27 treated and untreated cells based on gene expression
(a) t-SNE plots based on gene expression for each of the 3 donors. Blue dots indicate cells treated with aCD27 (Donor 1, 4,246; Donor 2, 4,044; Donor 3, 3,550 cells) and magenta dots indicate cells not treated with aCD27 (Donor 1, 950; Donor 2, 622; Donor 3, 406 cells). (b) t-SNE plots based on a reduced set of genes (n=69) consisting of markers that were also measured for protein expression (excluding isotype controls and post-translationally modified proteins such as CD45RO and CD45RA). For comparison, clustering based on protein expression is shown in Fig. 2a.
Supplementary Figure 12 Differentially expressed proteins and transcripts in aCD27 treated naïve CD8+ T cells
Volcano plots showing differential gene and protein expression between aCD27 treated and untreated cells in three different donors. Cyan, genes and proteins with adjusted p-values < 0.01 (corrected for multiple testing using the Bonferroni correction) and fold changes greater than 1.3 (threshold used for differential expression). Magenta, remaining genes and proteins. Selected genes and proteins labeled were differentially expressed in all three donors.
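The significance call used for the cyan points can be sketched as follows. The function name `call_differential` and the input values are hypothetical, the statistical test that produces the raw p-values is not shown, and treating a decrease of 1/1.3 as equivalent to an increase of 1.3 is an assumption of this sketch rather than something the legend states.

```python
def call_differential(results, alpha=0.01, min_fold=1.3):
    """Apply a Bonferroni correction and a fold-change threshold.

    results: {feature: (raw_p_value, fold_change)}, where fold_change is
    treated / untreated on a linear scale.
    Returns {feature: (adjusted_p_value, fold_change)} for features passing
    both thresholds.
    """
    n_tests = len(results)
    significant = {}
    for feature, (p_value, fold_change) in results.items():
        p_adjusted = min(1.0, p_value * n_tests)  # Bonferroni correction
        # Assumption: up- and down-regulation are thresholded symmetrically.
        magnitude = fold_change if fold_change >= 1 else 1 / fold_change
        if p_adjusted < alpha and magnitude > min_fold:
            significant[feature] = (p_adjusted, fold_change)
    return significant


# Illustrative values only, not results from the study.
example = {
    "A": (1e-6, 2.0),   # passes both thresholds
    "B": (0.004, 2.0),  # adjusted p = 0.016, fails
    "C": (1e-6, 1.1),   # fold change too small
    "D": (1e-6, 0.5),   # two-fold decrease, passes
}
called = call_differential(example)
```

Note that feature B has a raw p-value below 0.01 but no longer passes after multiplication by the number of tests, which is the effect the Bonferroni correction is meant to have.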
Supplementary Figure 13 Heatmap showing normalized expression of genes (n=61) that were either upregulated or downregulated in the aCD27 treated and untreated cells for all three donors.
Red indicates high expression and blue indicates low expression. Expression levels are log normalized, first scaling each cell to a total of 1×10⁴ molecules.
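A minimal sketch of this normalization for a single cell is shown here. The function name `log_normalize` is hypothetical, and the use of a natural logarithm with a pseudocount of 1 is an assumption (it is a common convention in single-cell toolkits), since the legend does not specify the log base or pseudocount.

```python
import math


def log_normalize(cell_counts, scale=1e4):
    """Scale one cell to `scale` total molecules, then take log(1 + x).

    cell_counts: {gene: UMI count} for one cell.
    Assumption: natural log with a pseudocount of 1.
    """
    total = sum(cell_counts.values())
    return {
        gene: math.log1p(count / total * scale)
        for gene, count in cell_counts.items()
    }


# Illustrative counts only.
normalized_cell = log_normalize({"GENE_A": 0, "GENE_B": 10})
```

Scaling before the log transform removes differences in sequencing depth between cells, and the pseudocount keeps genes with zero counts at exactly zero.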
(a) In addition to differential expression, single cell measurements can help distinguish if changes are due to gene or protein regulation versus those that arise due to compositional changes of different cellular states15. (b) Histograms showing protein expression for markers that increased upon stimulation with aCD27. (blue, aCD27; orange, no aCD27). Protein markers that had statistically significant differential expression (**) had an adjusted p-value < 0.01 (corrected for multiple testing using the Bonferroni correction) and a fold change > 1.3. For each marker, the fold change in the percentage of positive cells (right of red line) between the aCD27 treated and untreated samples is shown in the table below the histogram.
(a) Histograms showing REAP-seq protein expression for markers that decreased upon stimulation with aCD27. (blue, aCD27; orange, no aCD27). Protein markers that had statistically significant differential expression (**) had an adjusted p-value < 0.01 (corrected for multiple testing using the Bonferroni correction) and a fold change > 1.3. For each marker the fold change in the percentage of positive cells (right of red line) between the aCD27 untreated and treated samples is shown in the table below the histogram. (b) Flow cytometry histograms showed an increase in CD69 expression upon aCD27 treatment of the CD8+ naïve T cells in all three donors. (c) Schematic of CD69 Ab characterization using the Biacore assay. First, rabbit anti-mouse FC polyclonal antibodies are immobilized to the flow cell. Then antibodies (anti-CD69, anti-CD69+DNA Barcode, and Mouse IgG1 isotype control) are captured to the flow cell. Binding stability to either recombinant CD69 (dimer) or recombinant CD47 (negative control) is measured. (d) Anti-CD69, Anti-CD69+DNA barcode, and Mouse IgG1 Isotype control antibodies were captured to 130, 160, and 150 RU, respectively, on the flowcell. (e) Anti-CD69 and anti-CD69+DNA barcode both demonstrated binding to recombinant human CD69 protein and did not bind to recombinant human CD47 (negative control). The Mouse IgG1 isotype control did not exhibit any binding to either human CD69 or CD47 recombinant proteins.
Supplementary Figure 16 Scatter plots looking at the relationship between the change in protein expression and the change in mRNA expression between aCD27 treated and untreated CD8+ naïve T cells for each of the three donors.
Supplementary Figure 17 Comparative analysis of differentially expressed proteins and genes using Gene Set Enrichment Analysis (GSEA, Broad).
(a,b) Differential gene expression between aCD27 treated and untreated cells was used to generate a rank order list where red indicates genes with increased expression in aCD27 untreated cells (control) and blue indicates genes with increased expression in aCD27 treated cells. GSEA was then used to compare this rank order gene list to a rank order protein list (abs(log fold change) >0.2, adjusted p-value <0.05). Enrichment plots are shown for (a) proteins that had higher expression in the control aCD27 untreated cells and for (b) proteins that had higher expression in the aCD27 treated cells. The top 5 correlated proteins are shown for each donor. The horizontal scale bar from red (left) to blue (right) represents the DE genes ranked from highest expression in the aCD27 untreated cells (left) to highest expression in the aCD27 treated cells (right). The vertical black lines represent the projection of DE protein markers onto the ranked DE gene list. The green curve corresponds to the calculation of the enrichment score (ES): a green ES curve shifted to the upper left of the graph (a) indicates an enrichment of genes that show increased expression in aCD27 untreated cells. Conversely, a green ES curve shifted to the lower right (b) indicates an enrichment of genes with increased expression in aCD27 treated cells. Normalized enrichment scores (NES) and false discovery rate (FDR) q-values are indicated in the upper right corner of each plot.
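The running-sum idea behind the green ES curve can be illustrated with a simplified sketch. This is the unweighted (Kolmogorov-Smirnov-like) form, not the weighted statistic that the GSEA software uses by default, and it omits the permutation step that yields NES and FDR values. The function name `enrichment_score` and the gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    Walk down the ranked list, stepping up at each gene in the set and
    down at each gene outside it. The score is the running-sum value
    with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_misses = len(hits) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1 / n_hits if is_hit else -1 / n_misses
        if abs(running) > abs(best):
            best = running
    return best


# Ranked from highest in untreated cells (left) to highest in treated (right).
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})     # set sits at the left end
es_bottom = enrichment_score(ranked, {"g5", "g6"})  # set sits at the right end
```

A positive score corresponds to a curve peaking toward the upper left, as in panel (a), and a negative score to a curve dipping toward the lower right, as in panel (b).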
Supplementary Figure 18 REAP-seq comparison of mRNA versus protein expression upon aCD27 stimulation in Donor 2
(a) t-SNE visualization plot (on left) based on protein expression for Donor 2 where blue dots indicate cells treated with aCD27 (n=4,044) and magenta dots indicate cells not treated with aCD27 (n=622). mRNA and protein expression of IL7R, NT5E, CD70, PD1, and CD4 is projected on the t-SNE visualization plot (on right). Genes and proteins in the aCD27 treated cells that had adjusted p-values <0.01 and fold changes greater than 1.3 were considered significant (** with red circles). Purple indicates high expression and grey indicates low expression. (b) REAP-seq protein measurements showing increased CD4 expression in Donor 2 aCD27 treated naïve CD8+ T cells were validated with flow cytometry.
Supplementary Figure 19 Flow cytometry experiment showing aCD27 drug partially blocks anti-CD27 (clone M-T271) from binding CD27
(a) A representative T cell gating strategy for flow cytometry where Rhesus blood was stained with a phenotypic panel of Abs (Supplementary Table 5). (b) Flow cytometry data showing that treating T cells from Rhesus blood with aCD27 drug (10 μg/ml and 0.25 μg/ml) partially blocks binding of the CD27-APC monoclonal antibody (clone M-T271, BioLegend) used in the AbB panel, compared to no aCD27 drug (0 μg/ml). (c) T cell staining of a Mouse IgG1 isotype control conjugated to APC showed minimal background fluorescence signal.
(a) Flow cytometry histograms showed an increase in CD25 and CD4 expression upon aCD27 treatment of the CD8+ naïve T cells in all three donors, which is consistent with REAP-seq findings (Supplementary Fig. 14b). (b) Flow cytometry bivariate scatter plots show a 14-, 22-, and 5-fold increase in the percentage of double positive cells (CD8+/CD4+) in aCD27 treated vs untreated naïve CD8+ T cells in Donor 1, 2, and 3, respectively.
Supplementary Figure 21 mRNA and protein correlations in aCD27 treated and untreated naïve CD8+ T cells
(a) Statistics of the percentage of cells with zero UMI counts and >0 UMI counts for mRNA and protein expression in naïve CD8+ T cells in each donor. (b) Scatter plots looking at the relationship between protein and mRNA expression in aCD27 treated (turquoise) and untreated (red) CD8+ naïve T cells for each donor. Pearson R correlation scores including (black) or not including (red) cells with zero UMI protein or gene counts. Scatter plots shown include cells with zero protein or gene UMI counts.
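The two correlation scores in (b) differ only in which cells are included. A minimal sketch is given here with hypothetical function names and made-up values; it assumes that "not including" means dropping any cell where either the mRNA or the protein count is zero.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pearson_with_and_without_zeros(mrna, protein):
    """Return (R over all cells, R over cells with both counts > 0)."""
    r_all = pearson_r(mrna, protein)
    kept = [(m, p) for m, p in zip(mrna, protein) if m > 0 and p > 0]
    r_nonzero = pearson_r([m for m, _ in kept], [p for _, p in kept])
    return r_all, r_nonzero


# Illustrative per-cell UMI counts only, not data from the study.
mrna_counts = [0, 0, 1, 2, 3]
protein_counts = [5, 0, 2, 4, 6]
r_all, r_nonzero = pearson_with_and_without_zeros(mrna_counts, protein_counts)
```

In this toy example the first cell has protein signal but no detected transcript, which lowers the correlation over all cells relative to the correlation over cells where both are detected.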
Supplementary Figure 22 Differential expression at both the protein and transcript level in the outlier cluster.
Violin plots showing the expression distribution of the three markers (HLA-DRA, CD27, CD2) that had both differential gene and protein expression in the outlier cluster (purple) compared to the rest of the cells (grey) in all three donors. Expression levels are log transformed, first scaling each cell to a total of 1×10⁴ molecules.
Supplementary Figure 23 MetaCore pathway analysis of the outlier cluster in CD8+ naïve T cells (Fig. 2a).
The "development transcriptional regulation of megakaryopoiesis" pathway was the most significantly enriched pathway (FDR 1.2e-5, Clarivate Analytics) for genes and proteins upregulated in the outlier cluster compared to the rest of the cells (fold change >1.5, adjusted p-value <0.01). Red circles indicate genes that were upregulated in the outlier cluster in at least one of the donors and the red square indicates the two proteins (CD34 and CD38) that were upregulated in the outlier cluster in all three donors.
(a) PBMCs were initially gated on live cells. (b) aCD27 untreated CD8+ naïve T cells were initially gated on live cells. Representative live cell gating strategy for aCD27 untreated and treated CD8+ naïve T cells for all three donors.
About this article
Cite this article
Peterson, V., Zhang, K., Kumar, N. et al. Multiplexed quantification of proteins and transcripts in single cells. Nat Biotechnol 35, 936–939 (2017). https://doi.org/10.1038/nbt.3973