MULTI-seq: sample multiplexing for single-cell RNA sequencing using lipid-tagged indices

Abstract

Sample multiplexing facilitates scRNA-seq by reducing costs and identifying artifacts such as cell doublets. However, universal and scalable sample barcoding strategies have not been described. We therefore developed MULTI-seq: multiplexing using lipid-tagged indices for single-cell and single-nucleus RNA sequencing. MULTI-seq reagents can barcode any cell type or nucleus from any species with an accessible plasma membrane. The method involves minimal sample processing, thereby preserving cell viability and endogenous gene expression patterns. When cells are classified into sample groups using MULTI-seq barcode abundances, data quality is improved through doublet identification and recovery of cells with low RNA content that would otherwise be discarded by standard quality-control workflows. We use MULTI-seq to track the dynamics of T-cell activation, perform a 96-plex perturbation experiment with primary human mammary epithelial cells and multiplex cryopreserved tumors and metastatic sites isolated from a patient-derived xenograft mouse model of triple-negative breast cancer.

Fig. 1: MULTI-seq demultiplexes cell types, culture conditions and time-points for single-cell and single-nucleus RNA sequencing.
Fig. 2: MULTI-seq barcoding of multiplexed HMEC culture conditions.
Fig. 3: PDX sample multiplexing demonstrates low-RNA cell detection, reveals immune cell proportional shifts and classical monocyte heterogeneity in the progressively metastatic lung.

Data availability

Raw gene expression and barcode count matrices were uploaded to the Gene Expression Omnibus (GSE129578) along with pertinent metadata.

Code availability

R implementations of the MULTI-seq sample classification and barcode pre-processing pipelines are available in the ‘deMULTIplex’ R package, and can be downloaded at https://github.com/chris-mcginnis-ucsf/MULTI-seq.

References

1. Ramsköld, D. et al. Full-length mRNA-Seq from single cell levels of RNA and individual circulating tumor cells. Nat. Biotechnol. 30, 777–782 (2012).
2. Hashimshony, T., Wagner, F., Sher, N. & Yanai, I. CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep. 2, 666–673 (2012).
3. Gierahn, T. M. et al. Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput. Nat. Methods 14, 395–398 (2017).
4. Cao, J. et al. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357, 661–667 (2017).
5. Rosenberg, A. B. et al. Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science 360, 176–182 (2018).
6. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
7. Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
8. Zheng, G. X. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
9. Habib, N. et al. Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat. Methods 14, 955–958 (2017).
10. Tabula Muris Consortium. Single-cell transcriptomic characterization of 20 organs and tissues from individual mice creates a Tabula Muris. Nature 562, 367–372 (2018).
11. Regev, A. et al. The Human Cell Atlas. eLife 6, e27041 (2017).
12. Wagner, D. E. et al. Single-cell mapping of gene expression landscapes and lineage in the zebrafish embryo. Science 360, 981–987 (2018).
13. Ordovas-Montanes, J. et al. Allergic inflammatory memory in human respiratory epithelial progenitor cells. Nature 560, 649–654 (2018).
14. Kang, H. M. et al. Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat. Biotechnol. 36, 89–94 (2018).
15. Dixit, A. et al. Perturb-Seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866.e17 (2016).
16. Adamson, B. et al. A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867–1882.e21 (2016).
17. Jaitin, D. A. et al. Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-seq. Cell 167, 1883–1896.e15 (2016).
18. Aarts, M. et al. Coupling shRNA screens with single-cell RNA-seq identifies a dual role for mTOR in reprogramming-induced senescence. Genes Dev. 31, 2085–2098 (2017).
19. Shin, D., Lee, W., Lee, J. H. & Bang, D. Multiplexed single-cell RNA-seq via transient barcoding for simultaneous expression profiling of various drug screening. Sci. Adv. 5, eaav2249 (2019).
20. Guo, C., Biddy, B. A., Kamimoto, K., Kong, W. & Morris, S. A. CellTag indexing: genetic barcode-based sample multiplexing for single-cell technologies. Genome Biol. 20, 90 (2019).
21. Stoeckius, M. et al. Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics. Genome Biol. 19, 224 (2018).
22. Gehring, J., Park, J. H., Chen, S., Thomson, M. & Pachter, L. Highly multiplexed single-cell RNA-seq for defining cell population and transcriptional spaces. Preprint at https://www.biorxiv.org/content/10.1101/315333v1 (2018).
23. Gaublomme, J. T. et al. Nuclei multiplexing with barcoded antibodies for single-nucleus genomics. Preprint at https://www.biorxiv.org/content/10.1101/476036v1 (2018).
24. Weber, R. J., Liang, S. I., Selden, N. S., Desai, T. A. & Gartner, Z. J. Efficient targeting of fatty-acid modified oligonucleotides to live cell membranes through stepwise assembly. Biomacromolecules 15, 4621–4626 (2014).
25. Wu, H., Kirita, Y., Donnelly, E. L. & Humphreys, B. D. Advantages of single-nucleus over single-cell RNA sequencing of adult kidney: rare cell types and novel cell states revealed in fibrosis. J. Am. Soc. Nephrol. 30, 23–32 (2019).
26. Coutelier, J. P. et al. Binding and functional effects of thyroid stimulating hormone on human immune cells. J. Clin. Immunol. 10, 204–210 (1990).
27. Jeffrey, K. L. et al. Positive regulation of immune cell function and inflammatory responses by phosphatase PAC-1. Nat. Immunol. 7, 274–283 (2006).
28. Ziegler, S. F., Ramsdell, F. & Alderson, M. R. The activation antigen CD69. Stem Cells 12, 456–465 (1994).
29. Lieberman, J. & Fan, Z. Nuclear war: the granzyme A-bomb. Curr. Opin. Immunol. 15, 553–559 (2003).
30. Garbe, J. C. et al. Molecular distinctions between stasis and telomere attrition senescence barriers shown by long-term culture of normal human mammary epithelial cells. Cancer Res. 69, 7557–7568 (2009).
31. Brisken, C. Progesterone signalling in breast cancer: a neglected hormone coming into the limelight. Nat. Rev. Cancer 13, 385–396 (2013).
32. McGinnis, C. S., Murrow, L. M. & Gartner, Z. J. DoubletFinder: doublet detection in single-cell RNA sequencing data using artificial nearest neighbors. Cell Syst. 8, 329–337.e4 (2019).
33. Wolock, S. L., Lopez, R. & Klein, A. M. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 8, 281–291.e9 (2019).
34. Chitale, D. et al. An integrated genomic analysis of lung cancer reveals loss of DUSP4 in EGFR-mutant tumors. Oncogene 28, 2773–2783 (2009).
35. Fearon, A. E. et al. PHLDA1 mediates drug resistance in receptor tyrosine kinase-driven cancer. Cell Rep. 22, 2469–2481 (2018).
36. Savage, P. et al. A targetable EGFR-dependent tumor-initiating program in breast cancer. Cell Rep. 21, 1140–1149 (2017).
37. DeRose, Y. S. et al. Tumor grafts derived from women with breast cancer authentically reflect tumor pathology, growth, metastasis and disease outcomes. Nat. Med. 17, 1514–1520 (2011).
38. Jiang, K., Sun, X., Chen, Y., Shen, Y. & Jarvis, J. N. RNA sequencing from human neutrophils reveals distinct transcriptional differences associated with chronic inflammatory states. BMC Med. Genom. 8, 55 (2015).
39. Lun, A. T. L. et al. EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data. Genome Biol. 20, 63 (2019).
40. Reyfman, P. A. et al. Single-cell transcriptomic analysis of human lung provides insights into the pathobiology of pulmonary fibrosis. Am. J. Respir. Crit. Care Med. https://doi.org/10.1164/rccm.201712-2410OC (2018).
41. Jablonska, J., Lang, S., Sionov, R. V. & Granot, Z. The regulation of pre-metastatic niche formation by neutrophils. Oncotarget 8, 112132–112144 (2017).
42. Sharma, S. K. et al. Pulmonary alveolar macrophages contribute to the premetastatic niche by suppressing antitumor T cell responses in the lungs. J. Immunol. 194, 5529–5538 (2015).
43. Condamine, T., Ramachandran, I., Youn, J. & Gabrilovich, D. I. Regulation of tumor metastasis by myeloid-derived suppressor cells. Annu. Rev. Med. 66, 97–110 (2015).
44. Kitamura, T. et al. Monocytes differentiate to immune suppressive precursors of metastasis-associated macrophages in mouse models of metastatic breast cancer. Front. Immunol. 8, 2004 (2018).
45. Catena, R. et al. Bone marrow-derived Gr1+ cells can generate a metastasis-resistant microenvironment via induced secretion of thrombospondin-1. Cancer Discov. 3, 578–589 (2013).
46. Ouzounova, M. et al. Monocytic and granulocytic myeloid derived suppressor cells differentially regulate spatiotemporal tumour plasticity during metastatic cascade. Nat. Commun. 8, 14979 (2017).
47. Nabavi, S., Schmolze, D., Maitituoheti, M., Malladi, S. & Beck, A. H. EMDomics: a robust and powerful method for the identification of genes differentially expressed between heterogeneous classes. Bioinformatics 32, 533–541 (2016).
48. Subramanian, A. et al. A next generation connectivity map: L1000 Platform and the first 1,000,000 profiles. Cell 171, 1437–1452.e17 (2017).
49. Ye, C. et al. DRUG-seq for miniaturized high-throughput transcriptome profiling in drug discovery. Nat. Commun. 9, 4307 (2018).
50. Romero, J. M. et al. Coordinated downregulation of the antigen presentation machinery and HLA class I/beta2-microglobulin complex is responsible for HLA-ABC loss in bladder cancer. Int. J. Cancer 113, 605–610 (2005).
51. Lim, E. et al. Aberrant luminal progenitors as the candidate target population for basal tumor development in BRCA1 mutation carriers. Nat. Med. 15, 907–913 (2009).
52. Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).
53. Lawson, D. A. et al. Single-cell analysis reveals a stem-cell program in human metastatic breast cancer cells. Nature 526, 131–135 (2015).
54. Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).
55. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
56. van der Maaten, L. J. P. Accelerating t-SNE using tree-based algorithms. J. Mach. Learn. Res. 15, 3221–3245 (2014).
57. McDavid, A. et al. Data exploration, quality control and testing in single-cell qPCR-based gene expression experiments. Bioinformatics 29, 461–467 (2013).
58. Morgan, M. et al. ShortRead: a Bioconductor package for input, quality assessment and exploration of high-throughput sequence data. Bioinformatics 25, 2607–2608 (2009).
59. van der Loo, M. The stringdist package for approximate string matching. R J. 6, 111–122 (2014).
60. Wand, M. P. & Jones, M. C. Kernel Smoothing (Chapman & Hall, 1995).

Acknowledgements

This research was supported in part by grants from the Department of Defense Breast Cancer Research Program (nos. W81XWH-10-1-1023 and W81XWH-13-1-0221), NIH (nos. U01CA199315 and DP2 HD080351-01), the NSF (no. MCB-1330864) and the UCSF Center for Cellular Construction (no. DBI-1548297), the 2019 Mary Anne Koda-Kimble Seed Award for Innovation, and the NSF Science and Technology Center. Z.J.G. is a Chan Zuckerberg BioHub Investigator. D.M.P. is supported by the NIGMS of the National Institutes of Health (grant no. F32GM128366). L.M.M. is a Damon Runyon Fellow supported by the Damon Runyon Cancer Research Foundation (grant no. DRG-2239-15). J.W. and M.Y.H. are supported by EMBO long-term post-doctoral fellowships (grant nos. ALTF-159-2017 and ALTF-1193-2015, respectively). J.L.H. is supported by an NSF GRFP award. We thank M. Thomson for insightful discussions. We thank the UCSF Flow Core (grant no. NIHS10 1S10OD021822-01) and M. Owyong, S. Liu and C. Diadhiou for technical support.

Author information

E.D.C. and Z.J.G. conceptualized the method. C.S.M. and D.M.P. designed experiments, synthesized LMOs and optimized the method. C.S.M., D.M.P. and D.N.C. performed analytical flow cytometry experiments. C.S.M. and D.M.P. performed proof-of-concept scRNA-seq experiments. D.M.P. and D.N.C. performed proof-of-concept snRNA-seq experiments. C.S.M. and J.W. performed PDX scRNA-seq experiments. C.S.M., D.M.P., J.L.H. and V.S. performed HMEC scRNA-seq experiments. Z.W. and J.S.W. provided tissue and computational resources, respectively. C.S.M., D.M.P. and L.M.M. performed bioinformatics analysis. C.S.M., M.Y.H., J.W. and J.L.H. implemented the sample classification pipeline. C.S.M. implemented the barcode pre-processing pipeline. C.S.M., D.M.P., Z.J.G. and E.D.C. wrote the manuscript.

Correspondence to Eric D. Chow or Zev J. Gartner.

Ethics declarations

Competing interests

Z.J.G., E.D.C., D.M.P. and C.S.M. have filed patent applications related to the MULTI-seq barcoding method. The contents of this manuscript are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Flow cytometry using fluorophore-conjugated MULTI-seq barcode oligonucleotide probes demonstrates robust LMO and CMO labeling efficiency on living cells and nuclei, as well as label stability over time and LMO quenching with BSA, related to Figure 1.

(a) Live-cell LMO (gold) and CMO (purple) labeling efficiency varies predictably across a titration curve of anchor and co-anchor LMO/CMO concentrations. Qualitative trends shown with histograms (top) are supported by regression analyses (bottom) demonstrating technical reproducibility and linear relationship between LMO/CMO concentration and fluorescence abundance. n=10,000 events/sample. Data represented as mean ± SEM over 3 experimental replicates. (b) Time-course analysis of LMO and CMO scaffold loss and exchange on ice following mixing of live cell populations labeled with either AF647- or FAM-conjugated barcode probes. Qualitative trends (contour plots, left) document how LMO or CMO labeled cells maintain fluorescence signal over unlabeled control cells (grey) over time. Quantitative analysis (right) illustrates how LMO scaffolds more stably embed in the plasma membrane relative to CMO scaffolds, although sample-to-sample cross-talk is minimal. n=10,000 events/time-point. Experiment was repeated 3 times with similar results. (c) Same experiment as described in Supplementary Fig. 1a, except with nuclei. n=10,000 events/sample. Data represented as mean ± SEM over 3 experimental replicates. (d) Same experiment as described in Supplementary Fig. 1b, except with nuclei. Difference between LMO and CMO membrane residency kinetics does not occur during nuclear membrane labeling. n=10,000 events/time-point. (e) Same experiment as described in Supplementary Fig. 1c, except at room temperature. The LMO advantage in label stability shown at 4 °C is lost at RT as both CMO (purple) and LMO (gold) labels decrease at similar rates. n=10,000 events/time-point. (f) Live cells were labeled with LMO or CMO at 200 nM and diluted with either PBS (black) or 1% BSA in PBS (red). The cells were pelleted and the supernatant was transferred to unlabeled cells to determine the labeling efficiency of remaining LMO or CMO label. Dilution with BSA leads to a decrease in supernatant labeling relative to dilution with PBS alone. n=10,000 events/sample. Data represented as mean ± SEM over 3 experimental replicates.

Supplementary Figure 2 MULTI-seq with LMOs preserves endogenous gene expression during single-cell RNA sequencing, while both LMOs and CMOs are non-perturbative during single-nucleus RNA sequencing, related to Figure 1.

(a) Cell state annotations for aggregated LMO, CMO, and unlabeled control scRNA-seq data. Violin plots illustrate marker genes used to define HEKs (pink), MEPs (cyan), and LEPs (dark teal). n = 15,482 cells. (b) MULTI-seq classifications for LMO- (left) and CMO-labeled cells (right) mapped onto aggregated gene expression space. As with LMO labeling, sample classifications for CMO-labeled samples match their expected cell type annotations. n = 15,482 cells. (c) Violin plots (left) describing the number of detected UMIs, number of detected genes, and percentage of mitochondrial gene expression for LMO-labeled (gold), CMO-labeled (purple), and unlabeled control HEKs (black). Distributions are largely identical, suggesting that LMO- and/or CMO-labeling does not negatively influence endogenous mRNA capture. This point is further supported by the observation that barcode UMIs and RNA UMIs are slightly positively correlated for both LMO- and CMO-labeled cells (right), suggesting that sample barcodes do not outcompete endogenous transcripts during mRNA capture. n = 7,888 HEK cells. r = Pearson’s correlation. (d) HEK gene expression space colored by whether cells were LMO-labeled (gold), CMO-labeled (purple), or unlabeled (black) reveals sub-structure specific to CMO-labeling. Quantifying the proportion of LMO-, CMO-, and unlabeled cells amongst each cell’s 100 nearest-neighbors in gene expression space emphasizes CMO-specific sub-structure. Unlabeled and LMO-labeled cells have similar neighborhoods while CMO-labeled cells preferentially co-localize in gene expression space. n = 7,888 HEK cells. (e) Schematic overview of a proof-of-concept snRNA-seq experiment using MULTI-seq. Nuclei were isolated from 10 distinct cell samples (HEKs (dark red), MEFs (blue), and Jurkats (green) stimulated with ionomycin and PMA for 8 distinct time points) prior to LMO barcoding and sequencing. CMO-labeled and unlabeled HEK and MEF nuclei were sequenced in parallel. (f) MULTI-seq classifications exhibit species-specificity between Jurkat cells (green), HEKs (dark red) and MEFs (blue). n = 4,848 (human) and 1,046 (mouse) MULTI-seq labeled nuclei. (g) Cell state annotations for aggregated LMO, CMO, and unlabeled control snRNA-seq data. Violin plots illustrate marker genes used to define HEKs (pink) and Jurkats (dark teal). n = 8,468 human nuclei. (h) Same analysis as described in Supplementary Fig. 2c on snRNA-seq data. n = 2,113 MEF nuclei. r = Pearson’s correlation. (i) Same analysis as described in Supplementary Fig. 2d on snRNA-seq data. Unlike in live cells, gene expression neighborhoods for LMO- and CMO-labeled cells both mirror unlabeled cells, suggesting that both LMOs and CMOs enable non-perturbative single-nucleus RNA sequencing sample multiplexing. n = 2,113 MEF nuclei.
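
The neighborhood analysis described in panels (d) and (i) can be illustrated with a short sketch. It assumes cells are rows of a low-dimensional gene expression embedding and that each cell carries a labeling-condition tag (LMO, CMO or unlabeled). The function name `neighbor_label_proportions`, the brute-force Euclidean distance and the toy data are illustrative assumptions, not the authors' code; the caption specifies only that 100 nearest neighbors were used.

```python
import numpy as np

def neighbor_label_proportions(embedding, labels, k=100):
    """Fraction of each cell's k nearest neighbors that carry each label.

    Hypothetical helper: brute-force Euclidean neighbors, suitable only
    for small datasets (a real analysis would use a KNN index).
    """
    embedding = np.asarray(embedding, dtype=float)
    labels = np.asarray(labels)
    n = embedding.shape[0]
    k = min(k, n - 1)  # a cell cannot have more than n - 1 neighbors
    # Pairwise squared Euclidean distances.
    sq = np.sum(embedding ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * embedding @ embedding.T
    np.fill_diagonal(dist, np.inf)  # exclude each cell from its own neighborhood
    # Indices of the k closest cells per row (unordered within the k).
    nn = np.argpartition(dist, k - 1, axis=1)[:, :k]
    categories = np.unique(labels)
    props = np.stack(
        [(labels[nn] == c).mean(axis=1) for c in categories], axis=1
    )
    return categories, props

# Toy example: two well-separated groups of cells, one per labeling condition.
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0, 0.1, (5, 2)), rng.normal(10, 0.1, (5, 2))])
tags = ["LMO"] * 5 + ["CMO"] * 5
cats, props = neighbor_label_proportions(emb, tags, k=4)
print(cats)
print(props)
```

Under these assumptions, a labeling condition that perturbs gene expression would show up as cells whose neighborhoods are dominated by their own label, whereas a non-perturbative label would give neighborhoods that resemble the overall mixture.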

Supplementary Figure 3 HMEC MULTI-seq sample classification results and technical replicate, related to Figure 2.

(a) Schematic overview of 96-plex HMEC scRNA-seq analysis. 96 distinct HMEC cultures consisting of LEPs alone (blue), MEPs alone (green), or both cell types together (dark red) were grown in media supplemented with 15 distinct signaling molecules or signaling molecule combinations and one control. (b) 96-well plate schematic overlaid with a heat map showing the number of cells assigned to each sample barcode group. Twenty samples, predominantly those arising from column 2, were not represented in the original large-scale HMEC experiment due to technical error during sample preparation. (c) Normalized barcode UMI heat map demonstrating that sample groups are predominantly associated with single MULTI-seq barcodes. (d) Violin plots describing the barcode UMI SNR for negative cells, doublets, and singlets. n=40,009 cells. (e) Same analysis as described in Supplementary Fig. 3b, except with the 96-plex HMEC technical replicate experiment. All samples were classified in the technical replicate. (f) Same analysis as described in Supplementary Fig. 3c, except with the 96-plex HMEC technical replicate experiment. (g) Same analysis as described in Supplementary Fig. 3d, except with the 96-plex HMEC technical replicate experiment. n=48,091 cells.

Supplementary Figure 4 MULTI-seq barcode pre-processing and sample classification workflows, related to Figure 2.

Results from the 96-plex HMEC experiment are used as representative examples for the barcode classification workflow. Results from the 96-plex technical replicate HMEC experiment are used as representative examples for the semi-supervised negative cell reclassification workflow. PDF=probability density function.
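
The three classification outcomes referred to throughout these captions (singlets, doublets and negative cells) can be illustrated with a deliberately simplified sketch. It assumes a cells-by-barcodes count matrix and one fixed positivity threshold per barcode. The function `classify_cells`, the threshold values and the toy counts are hypothetical; this is not the deMULTIplex implementation, which derives its thresholds from the data.

```python
import numpy as np

def classify_cells(counts, thresholds):
    """Call each cell a singlet, doublet or negative from barcode counts.

    Hypothetical rule: a cell is positive for a barcode when its count
    exceeds that barcode's threshold. Zero positives -> "Negative",
    one positive -> that barcode's index, several -> "Doublet".
    """
    counts = np.asarray(counts, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    positive = counts > thresholds      # thresholds broadcast across cells
    n_positive = positive.sum(axis=1)   # positive barcodes per cell
    calls = []
    for i, n in enumerate(n_positive):
        if n == 0:
            calls.append("Negative")
        elif n == 1:
            calls.append(f"Barcode{int(np.argmax(positive[i]))}")
        else:
            calls.append("Doublet")
    return calls

# Toy counts: rows are cells, columns are three sample barcodes.
toy_counts = [
    [100, 2, 1],   # clearly barcode 0
    [3, 80, 2],    # clearly barcode 1
    [90, 70, 1],   # two barcodes above threshold
    [1, 2, 0],     # no barcode above threshold
]
toy_thresholds = [10, 10, 10]  # assumed values, for illustration only
print(classify_cells(toy_counts, toy_thresholds))
```

Fixed thresholds keep the example short. In practice thresholds would need to be estimated for each barcode, because labeling efficiency and sequencing depth vary between samples.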

Supplementary Figure 5 Transcriptional signatures of co-culturing and signaling molecule stimulation in MEPs, related to Figure 2.

(a) Distributions of marker gene expression used to identify MEPs (KRT14) and LEPs (KRT19) in gene expression space. n=25,166 cells. (b) Distributions of MKI67 expression in gene expression space for LEPs (left) and MEPs (right). MKI67 enrichment was used as a proxy for distinguishing proliferative and resting LEPs and MEPs. n=6,159 (LEP) and 14,428 (MEP) cells. (c) MEPs co-cultured with LEPs are not induced to proliferate relative to MEPs grown in monoculture. Clusters corresponding to resting (black) and proliferative (blue) LEPs are identifiable in gene expression space (Supplementary Fig. 5b). Projecting sample classification densities onto gene expression space for co-cultured MEPs (red, left) and MEPs cultured alone (green, middle) illustrates that both culture compositions are equally proliferative (table, right). n=14,428 cells. (d) MEPs co-cultured with LEPs exhibit enriched TGF-β signaling (as measured by TGFBI expression) relative to MEPs grown in monoculture. Each point represents an average of MEPs grouped according to signaling molecule treatment. *** = Wilcoxon rank sum test (two-sided), p = 1.5 × 10^−6. n = 32 signaling molecule condition groups. Data are represented as mean ± SEM. (e) Hierarchical clustering and heat map analysis of MEPs grouped by signaling molecule treatment highlights an EGFR signaling transcriptional response specific to EGF and AREG treatment. Dendrogram labels: E = EGF, W = WNT4, A = AREG, I = IGF-1, R = RANKL, C = Control.

Supplementary Figure 6 MULTI-seq application to primary PDX samples, related to Figure 3.

(a) Representative histology of lung tissue illustrates metastatic progression in early, mid, and late-stage PDX mice. Individual metastases denoted with black arrows. H&E staining was performed 3 times (early), 4 times (mid), and 10 times (late), yielding the same result. (b) Negative cell reclassification improves sample classification results. Singlets (black) localize into clusters in barcode space whereas doublets (red) localize between singlet clusters. Negative cells either co-localize with singlet or doublet clusters (blue outline, bottom) or cluster separately (red outline, bottom). Negative cell reclassification is insensitive to the true-negatives that cluster separately, while rescuing a subset of false-negatives that cluster amongst singlets and doublets. n = 12,086 cells. (c) Barcode SNR comparisons between samples ordered according to the viability (top) or total cell number (nCell, bottom) of the MULTI-seq barcoding conditions. See Supplementary Table 3 for details. Data are represented as mean ± SEM. n=10,427 cells. (d) Mouse immune cells in gene expression space colored according to tissue of origin. Lung immune cells (brown) cluster separately from primary tumor immune cells (teal). n=8,420 cells. (e) Bar plots describing the proportion of mouse (pink) and human (blue) cells detected during FACS enrichment and detected in the final 10X dataset. Classification of human and mouse cells from the L-A lung demonstrates sample classification accuracy and species-independence. (f) Marker gene heat map describing markers utilized for defining cell type annotations. RNA UMI abundances are scaled from 0–1 for each gene. Values correspond to the average expression within each annotation group. Displayed genes represent the top 3 most statistically-significant genes for each cell type (Likelihood-ratio test for single cell gene expression, ref. 57, with Bonferroni multiple comparisons adjustment). (g) Unsupervised clustering of classical monocytes recapitulates intercellular heterogeneity due to metastatic progression (displayed in Supplementary Fig. 3d). See Supplementary Table 6 for differential gene expression analysis results. n=2,496 cells.

Supplementary Figure 7 FACS purification of LEP and MEP cells from bulk HMECs, related to Experimental Methods.

Bulk HMECs were labeled with FITC anti-EpCAM and APC-Cy7 anti-CD49f to identify and isolate LEPs and MEPs. LEPs are identified as EpCAM high and CD49f low, while MEPs are CD49f high and EpCAM low. Gating strategy causes minor cell type impurities in final sorted population.

Supplementary Figure 8 FACS gating strategy for PDX lung and primary tumor samples, related to Experimental Methods.

(a) Dissociated human metastases and mouse immune cells were separated from dissociated PDX mouse lungs using hCD298 and mCD45 following gating for live singlets. Mouse 847 (Sample L-A) is presented here as a representative example. (b) Dissociated human primary tumor cells and mouse tumor-associated immune cells were separated using hCD298 and mCD45 following gating for live singlets. Sample A is presented here as a representative example for all other primary tumor samples.

Supplementary Figure 9 Bioanalyzer traces of representative MULTI-seq barcode library, related to Experimental Methods.

(a) Bioanalyzer traces following cDNA amplification and MULTI-seq barcode enrichment using 3.2X SPRI with 1.8X 100% isopropanol exhibit two distinct peaks. Bioanalyzer traces are representative of all datasets presented in this study (n=4). The first peak (p1) is an average of 65–70bp in length and likely corresponds to barcodes amplified via the MULTI-seq additive primer. The second peak (p2) is an average of 100bp in length and likely corresponds to barcodes that successfully underwent MMLV-RTase template switching and were subsequently amplified by the standard 10X Genomics Single Cell V2 primer. (b) Bioanalyzer analysis following library preparation PCR exhibits one distinct peak (p3) with an average length of 173bp, matching expectations. Bioanalyzer traces are representative of all datasets presented in this study (n=4). (c) Schematic illustrating the two species of reverse-transcribed MULTI-seq barcodes with and without template switching. Processive reverse-transcription without template switching (p1) is more likely than reverse-transcription with template switching (p2), resulting in relative enrichment of the 65–70bp product following cDNA amplification.

Supplementary information

Supplementary Information

Supplementary Figs. 1–9, Supplementary Tables 1–8, Supplementary Note, Supplementary Protocol

Reporting Summary

Source data

Source Data, Fig. 1

Source Data, Fig. 2

Source Data, Fig. 3

Source Data, Supplementary Fig. 1

Source Data, Supplementary Fig. 2

Source Data, Supplementary Fig. 3

Source Data, Supplementary Fig. 5

Source Data, Supplementary Fig. 6
