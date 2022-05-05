
Abstract
Deciphering the functional interactions of cells in tissues remains a major challenge. Here we describe DIALOGUE, a method to systematically uncover multicellular programs (MCPs)—combinations of coordinated cellular programs in different cell types that form higher-order functional units at the tissue level—from either spatial data or single-cell data obtained without spatial information. Tested on spatial datasets from the mouse hypothalamus, cerebellum, visual cortex and neocortex, DIALOGUE identified MCPs associated with animal behavior and recovered spatial properties when tested on unseen data while outperforming other methods and metrics. In spatial data from human lung cancer, DIALOGUE identified MCPs marking immune activation and tissue remodeling. Applied to single-cell RNA sequencing data across individuals or regions, DIALOGUE uncovered MCPs marking Alzheimer’s disease, ulcerative colitis and resistance to cancer immunotherapy. These programs were predictive of disease outcome and predisposition in independent cohorts and included risk genes from genome-wide association studies. DIALOGUE enables the analysis of multicellular regulation in health and disease.
Data availability
The datasets analyzed in this study include seq-FISH data37 and HMRF37 annotations obtained from https://bitbucket.org/qzhu/smfish-hmrf/src/master/hmrf-usage/; scRNA-seq from the mouse neocortex obtained via the Gene Expression Omnibus (GEO), accession number GSE115746 (ref. 38); Slide-seq data11 obtained via the Single Cell Portal: https://singlecell.broadinstitute.org/single_cell/study/SCP354/slide-seq-study#study-summary; MERFISH data10 obtained from the DRYAD repository: https://datadryad.org/stash/dataset/doi:10.5061/dryad.8t8s248; and SMI data from human NSCLC samples39, including cell type annotations, from https://www.nanostring.com/products/cosmx-spatial-molecular-imager/ffpe-dataset/. scRNA-seq data of colon biopsies16 were obtained from the Single Cell Portal: https://singlecell.broadinstitute.org/single_cell/study/SCP259/intra-and-inter-cellular-rewiring-of-the-human-colon-during-ulcerative-colitis#study-download; snRNA-seq data from human prefrontal cortex17 were obtained via the AMP-AD Knowledge Portal: https://adknowledgeportal.synapse.org/ (Synapse IDs: syn18686381, syn18686382, syn18686372 and syn3505720; available through controlled access and subject to the use conditions set by human privacy regulations); and melanoma and brain organoid single-cell data were obtained via the GEO under accession numbers GSE120575 (ref. 55), GSE115978 (ref. 18) and GSE86153 (ref. 43).
Code availability
DIALOGUE is implemented as an R package and can be installed using the devtools::install('DIALOGUE') command. Further documentation and tutorials are provided in the package help pages (for example, ?DIALOGUE). We also provide DIALOGUE via GitHub (https://github.com/livnatje/DIALOGUE) and the Klarman Cell Observatory repository10, along with additional guidelines and specifications.
Acknowledgements
We thank L. Gaffney and A. Hupalowska for help with figure preparation. L.J.A. is a Chan Zuckerberg Biohub Investigator and holds a Career Award at the Scientific Interface from the Burroughs Wellcome Fund. L.J.A. was a Cancer Research Institute (CRI) Irvington Fellow supported by the CRI and a fellow of the Eric and Wendy Schmidt Postdoctoral Program. A.R. was a Howard Hughes Medical Institute (HHMI) Investigator. Work was supported by the Klarman Cell Observatory, National Institute of Diabetes and Digestive and Kidney Diseases RC2 DK114784, the Food Allergy Science Initiative, the Manton Foundation and the HHMI. The AD dataset was provided by the Rush Alzheimer’s Disease Center, Rush University Medical Center. Data collection was supported through funding by National Institute on Aging grants P30AG10161, R01AG15819, R01AG17917, R01AG30146, R01AG36836, U01AG32984, U01AG46152 and U01AG61356, the Illinois Department of Public Health and the Translational Genomics Research Institute.
Ethics declarations
Competing interests
A.R. is a co-founder and equity holder of Celsius Therapeutics, an equity holder in Immunitas Therapeutics and, until 31 July 2020, was a scientific advisory board member of Thermo Fisher Scientific, Syros Pharmaceuticals, Asimov and Neogene Therapeutics. From 1 August 2020, A.R. is an employee of Genentech, a member of the Roche group, and has equity in Roche. The remaining authors declare no competing interests.
Extended data
Extended Data Fig. 1 DIALOGUE identified MCPs in the mouse hypothalamus that are not recovered with other dimensionality reduction and clustering approaches.
(a)* Pearson correlation coefficient between genes, PCs, NMF, and DIALOGUE MCPs from either the training or the test set (x axis) across different pairs of cell types (panels) in spatial niches in the mouse hypothalamus. (b) Pearson correlation coefficient (red/blue, color bar) between the Overall Expression of the relevant MCP component when considering only defined subsets of the pertaining cell types (rows, columns), as previously identified by clustering10. White: missing values (cell subtypes that cannot be compared). (c) MCPs are not merely driven by cell subtype composition in a niche. Fraction of cells from different clusters (as previously defined10, y axis) among cells of a given type (label on top) that over- or under-express the relevant component of each pair-wise MCP1 (top or bottom 25%, respectively, x axis) involving that cell type. (d)* Similarity (y axis, Spearman’s r) between the gene loadings of MCPs identified in the microenvironment setting (x axis) and the gene loadings of matching MCPs identified in the macro-environment setting, when computed for different pairs of cell types using MERFISH data. *In both (a) and (d) middle line: median; box edges: 25th and 75th percentiles, whiskers: most extreme points that do not exceed ±IQR*1.5; further outliers are marked individually.
Extended Data Fig. 2 DIALOGUE captures spatial patterns.
(a) Average Overall Expression in a niche (dot, 15 cells on average) of the first MCP (MCP1) in the first (x axis) and second (y axis) cell type in that MCP. The red line is the locally weighted polynomial (LOWESS) regression line. (b) As in (a), but depicting the Overall Expression residuals after regressing out the impact of cell clusters, as previously defined10. (a-b) Spearman correlation coefficient (R) and significance (P, one-sided). (c) Performance (AUROC, y axis) when predicting the expression of the corresponding DIALOGUE component in the neighboring cells located in the same macro-environment (dark blue, ~500 cells) or micro-environment (purple and light blue, ~15 cells), when tested on an unseen test set; the training data include either spatial coordinates and single-cell profiles (light blue, ‘spatial data’) or only single-cell profiles from ~500-cell aggregates, without spatial information (‘dissociated’, Methods).
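The statistics named in panels (a) and (b) can be illustrated with a minimal sketch. It assumes that "regressing out the impact of cell clusters" amounts to subtracting each cluster's mean Overall Expression, which is the least-squares residual for a single categorical covariate. The function names and toy values below are hypothetical and are not taken from the DIALOGUE package. The one-sided P value is not computed here.

```python
from collections import defaultdict
from statistics import mean


def regress_out_clusters(values, clusters):
    """Residuals after subtracting each cluster's mean (one categorical covariate)."""
    by_cluster = defaultdict(list)
    for v, c in zip(values, clusters):
        by_cluster[c].append(v)
    cluster_mean = {c: mean(vs) for c, vs in by_cluster.items()}
    return [v - cluster_mean[c] for v, c in zip(values, clusters)]


def niche_means(values, niches):
    """Average a per-cell quantity within each niche."""
    by_niche = defaultdict(list)
    for v, n in zip(values, niches):
        by_niche[n].append(v)
    return {n: mean(vs) for n, vs in by_niche.items()}


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: per-cell Overall Expression, cluster label and niche id.
oe_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
clusters_a = ["a1", "a1", "a1", "a2", "a2", "a2"]
niche_a = [0, 1, 2, 0, 1, 2]
oe_b = [10.0, 30.0, 20.0]
clusters_b = ["b1", "b1", "b1"]
niche_b = [0, 1, 2]

res_a = niche_means(regress_out_clusters(oe_a, clusters_a), niche_a)
res_b = niche_means(regress_out_clusters(oe_b, clusters_b), niche_b)
shared = sorted(set(res_a) & set(res_b))  # niches containing both cell types
rho = spearman([res_a[n] for n in shared], [res_b[n] for n in shared])
print(rho)
```

Residualizing before averaging per niche means that a correlation surviving this step cannot be explained by cluster composition alone, which is the point panel (b) makes.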
Extended Data Fig. 3 DIALOGUE vs. HMRF.
(a) Overall Expression of HMRF37 domain-specific programs in neighboring pairs of glutamatergic (y axis) and GABAergic (x axis) neurons from different regions (colors). (b) Overall Expression of the relevant components of MCPs 1-5 in glutamatergic (y axis) and their adjacent GABAergic (x axis) neurons from different regions (colors). (a-b) Spearman correlation coefficient (R) and significance (P, one-sided).
Extended Data Fig. 4 MCPs mark spatial patterns and phenotypes.
(a) Spatial distribution of MCPs and HMRF programs. Overall Expression of MCPs identified by DIALOGUE and the HMRF37 domain programs in glutamatergic (circles) and GABAergic (dots) neurons in the mouse visual cortex. As shown, whereas many of the programs follow either a layered or a salt-and-pepper pattern, MCP2 distinguishes a more discrete region. Although such boundaries sometimes reflect measurement artifacts, we found no association with the number of genes or reads (typical quality measures) or with simple alignment to fields of view (FOVs). (b,c) Shared and cell-type-specific components in DIALOGUE MCP1s in the mouse hypothalamus. (b) Fraction of genes (y axis) that are shared (yellow) or specific to one (A, dark blue) or another (B, light blue) of the cell types in each of the hypothalamus MCPs (x axis). (c) The two cell programs in each of the MCPs in (b) and their specific and shared (intersection) genes. P-values denote association with naïve animal behavior (mixed-effects models, two-sided test).
Extended Data Fig. 5 DIALOGUE identifies mis-localized cells and disease MCPs in single-cell data.
(a) ROC curves showing the true positive (y axis) and false positive (x axis) rate when predicting mis-localized cells of each major subset (panels) with different types of ‘contamination’ with cells that are either from the same layer (LP/EPI) within control (black, from replicate biopsy) or UC (blue; from adjacent biopsy with a different clinical status: inflamed or non-inflamed); or from a different layer but same clinical status, when considering either all samples (green) or only samples from control (yellow) or UC patients (red). (b) UC multicellular program genes. Average expression (Z score residuals after regressing out the associations with the LP/EPI location, red/blue color bar) of top genes (columns) from the UC multicellular program, sorted by their pertaining cell type (top color bar), across samples (rows), sorted by Overall Expression (right, Methods), and labeled by clinical status, location and patient ID (left color bars). (c) Melanoma MCP1. Average expression (Z score, red/blue color bar) of top genes (columns) from MCP1 identified in four different cell types (top color bar), across melanoma tumor samples (rows), sorted by Overall Expression of MCP1 (right, Methods), and labeled by treatment status and ICB response (left color bar).
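For readers unfamiliar with the ROC curves in panel (a), the following minimal sketch shows how true and false positive rates and the area under the curve can be computed from per-cell scores. The scores and labels are hypothetical toy values (label 1 standing for a mis-localized cell). The sketch does not reproduce how DIALOGUE derives its predictions.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by lowering the decision threshold step by step.

    Labels are 1 (positive) or 0 (negative); both classes must be present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Cells with identical scores cross the threshold together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auroc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical toy example: higher score = more likely to be mis-localized.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
auc = auroc(curve)
print(curve)
print(auc)
```

Grouping tied scores into a single step keeps the curve independent of the input order, and the trapezoidal area then equals the probability that a randomly chosen positive cell outscores a randomly chosen negative one, with ties counted as one half.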
Supplementary information
Supplementary Table
Supplementary Tables 1–4.
