Cell segmentation in imaging-based spatial transcriptomics

Petukhov, Viktor; Xu, Rosalind J.; Soldatov, Ruslan A.; Cadinu, Paolo; Khodosevich, Konstantin; Moffitt, Jeffrey R.; Kharchenko, Peter V.

doi:10.1038/s41587-021-01044-w

Article
Published: 14 October 2021

Nature Biotechnology volume 40, pages 345–354 (2022)

Abstract

Single-molecule spatial transcriptomics protocols based on in situ sequencing or multiplexed RNA fluorescent hybridization can reveal detailed tissue organization. However, distinguishing the boundaries of individual cells in such data is challenging and can hamper downstream analysis. Current methods generally approximate cell positions using nuclei stains. We describe a segmentation method, Baysor, that optimizes two-dimensional (2D) or three-dimensional (3D) cell boundaries by considering the joint likelihood of transcriptional composition and cell morphology. While Baysor can take into account segmentation based on co-stains, it can also perform segmentation based on the detected transcripts alone. To evaluate performance, we extend multiplexed error-robust fluorescence in situ hybridization (MERFISH) to incorporate immunostaining of cell boundaries. Using this and other benchmarks, we show that Baysor segmentation can, in some cases, nearly double the number of cells compared to existing tools while reducing segmentation artifacts. We demonstrate that Baysor performs well on data acquired using five different protocols, making it a useful general tool for analysis of imaging-based spatial transcriptomics.

**Fig. 1: Segmentation-free analysis of spatial data using NCVs.**

**Fig. 2: Application of an MRF framework for segmentation-free cell-type inference and background filtration.**

**Fig. 3: Examples of Baysor cell segmentation over the published protocols.**

**Fig. 4: Comparison of Baysor segmentation with other methods and published results.**

**Fig. 5: Examples of the segmentation differences on the osmFISH data.**

**Fig. 6: Comparing Baysor segmentation to MERFISH measurements with stained cell boundaries in the mouse ileum.**

Data availability

The following datasets were used in evaluating the developed methods:

1. osmFISH mouse somatosensory cortex⁸, 35 genes: http://linnarssonlab.org/osmFISH/availability/.

2. MERFISH mouse preoptic hypothalamus¹⁹, 140 genes: https://doi.org/10.5061/dryad.8t8s248.

3. ISS mouse CA1 region¹⁶, 95 genes: https://doi.org/10.6084/m9.figshare.7150760.v1.

4. STARmap mouse VISp¹⁸, 1,020 genes: https://www.starmapresources.com/data/ (visual_1020, 20180505_BY3_1kgenes).

5. STARmap mouse VISp¹⁸, 160 genes: https://www.starmapresources.com/data/ (visual_160, 20171120_BF4_light).

6. seqFISH⁺ NIH/3T3 cells⁷, 10,000 genes: https://doi.org/10.5281/zenodo.2669683.

7. seqFISH mouse embryo⁴⁵, 387 genes: https://marionilab.cruk.cam.ac.uk/SpatialMouseAtlas/.

8. Allen smFISH mouse VISp, 22 genes: https://github.com/spacetx-spacejam/data.

9. MERFISH mouse ileum, 241 genes: https://doi.org/10.5061/dryad.jm63xsjb2.

Code availability

The Baysor package is available at https://github.com/kharchenkolab/Baysor. Baysor parameters for different datasets are reported in Supplementary Table 3. The code to reproduce the results is available at https://github.com/kharchenkolab/BaysorAnalysis/. This repository also contains the links to interactive visualization of the processed datasets using the Vitessce tool (http://vitessce.io/). MERFISH probe design and analysis software is available at https://github.com/ZhuangLab/MERFISH_analysis.

References

1. Mereu, E. et al. Benchmarking single-cell RNA-sequencing protocols for cell atlas projects. Nat. Biotechnol. 38, 747–755 (2020).
2. Regev, A. et al. The human cell atlas. eLife 6, e27041 (2017).
3. HuBMAP Consortium. The human body at cellular resolution: the NIH Human Biomolecular Atlas Program. Nature 574, 187–192 (2019).
4. Aldridge, S. & Teichmann, S. A. Single cell transcriptomics comes of age. Nat. Commun. 11, 4307 (2020).
5. Lee, J. H. et al. Highly multiplexed subcellular RNA sequencing in situ. Science 343, 1360–1363 (2014).
6. Ke, R. et al. In situ sequencing for RNA analysis in preserved tissue and cells. Nat. Methods 10, 857–860 (2013).
7. Eng, C.-H. L. et al. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH. Nature 568, 235–239 (2019).
8. Codeluppi, S. et al. Spatial organization of the somatosensory cortex revealed by osmFISH. Nat. Methods 15, 932–935 (2018).
9. Xia, C., Fan, J., Emanuel, G., Hao, J. & Zhuang, X. Spatial transcriptome profiling by MERFISH reveals subcellular RNA compartmentalization and cell cycle-dependent gene expression. Proc. Natl Acad. Sci. USA 116, 19490–19499 (2019).
10. Rodriques, S. G. et al. Slide-seq: a scalable technology for measuring genome-wide expression at high spatial resolution. Science 363, 1463–1467 (2019).
11. Vickovic, S. et al. High-definition spatial transcriptomics for in situ tissue profiling. Nat. Methods 16, 987–990 (2019).
12. Lein, E., Borm, L. E. & Linnarsson, S. The promise of spatial transcriptomics for neuroscience in the era of molecular cell typing. Science 358, 64–69 (2017).
13. Bingham, G. C., Lee, F., Naba, A. & Barker, T. H. Spatial-omics: novel approaches to probe cell heterogeneity and extracellular matrix biology. Matrix Biol. 91-92, 152–166 (2020).
14. Soldatov, R. et al. Spatiotemporal structure of cell fate decisions in murine neural crest. Science 364, eaas9536 (2019).
15. Chen, W.-T. et al. Spatial transcriptomics and in situ sequencing to study Alzheimer’s disease. Cell 182, 976–991 (2020).
16. Qian, X. et al. Probabilistic cell typing enables fine mapping of closely related cell types in situ. Nat. Methods 17, 101–106 (2020).
17. Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. RNA imaging. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).
18. Wang, X. et al. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science 361, eaat5691 (2018).
19. Moffitt, J. R. et al. Molecular, spatial, and functional single-cell profiling of the hypothalamic preoptic region. Science 362, eaau5324 (2018).
20. Wang, Z. Cell segmentation for image cytometry: advances, insufficiencies, and challenges. Cytometry A 95, 708–711 (2019).
21. Park, J. et al. Cell segmentation-free inference of cell types from in situ transcriptomics data. Nat. Commun. 12, 3545 (2021).
22. Dirmeier, S. & Beerenwinkel, N. Structured hierarchical models for probabilistic inference from perturbation screening data. Preprint at bioRxiv https://doi.org/10.1101/848234 (2019).
23. Zhu, Q., Shah, S., Dries, R., Cai, L. & Yuan, G.-C. Identification of spatially associated subpopulations by combining scRNAseq and sequential fluorescence in situ hybridization data. Nat. Biotechnol. 36, 1183–1190 (2018).
24. Rueden, C. T. et al. ImageJ2: ImageJ for the next generation of scientific image data. BMC Bioinformatics 18, 529 (2017).
25. Wang, G., Moffitt, J. R. & Zhuang, X. Multiplexed imaging of high-density libraries of RNAs with MERFISH and expansion microscopy. Sci. Rep. 8, 4847 (2018).
26. Moffitt, J. R. et al. High-performance multiplexed fluorescence in situ hybridization in culture and tissue with matrix imprinting and clearing. Proc. Natl Acad. Sci. USA 113, 14456–14461 (2016).
27. Stringer, C., Wang, T., Michaelos, M. & Pachitariu, M. Cellpose: a generalist algorithm for cellular segmentation. Nat. Methods 18, 100–106 (2021).
28. Yangel, B. & Vetrov, D. in Energy Minimization Methods in Computer Vision and Pattern Recognition (eds Heyden, A., Kahl, F., Olsson, C., Oskarsson, M. & Tai, X.-C.) pp. 137–150 (Springer, 2013).
29. McInnes, L., Healy, J. & Melville, J. UMAP: uniform manifold approximation and projection for dimension reduction. Preprint at arXiv https://arxiv.org/abs/1802.03426v3 (2018).
30. Kanemura, A., Maeda, S. & Ishii, S. Superresolution with compound Markov random fields via the variational EM algorithm. Neural Netw. 22, 1025–1034 (2009).
31. Blei, D. M., Kucukelbir, A. & McAuliffe, J. D. Variational inference: a review for statisticians. J. Am. Stat. Assoc. 112, 859–877 (2017).
32. Dempster, A. P., Laird, N. M. & Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Series B Stat. Methodol. 39, 1–38 (1977).
33. Nielsen, S. F. The stochastic EM algorithm: estimation and asymptotic results. Bernoulli 6, 457–489 (2000).
34. Kimura, T. et al. Expectation–maximization algorithms for inference in Dirichlet processes mixture. Pattern Anal. Appl. 16, 55–67 (2013).
35. Wolf, F. A., Angerer, P. & Theis, F. J. Scanpy: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
36. Zeisel, A. et al. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347, 1138–1142 (2015).
37. Harris, K. D. et al. Classes and continua of hippocampal CA1 inhibitory neurons revealed by single-cell transcriptomics. PLoS Biol. 16, e2006387 (2018).
38. Hodge, R. D. et al. Conserved cell types with divergent features in human versus mouse cortex. Nature 573, 61–68 (2019).
39. Haber, A. L. et al. A single-cell survey of the small intestinal epithelium. Nature 551, 333–339 (2017).
40. Gehart, H. et al. Identification of enteroendocrine regulators by real-time single-cell differentiation mapping. Cell 176, 1158–1173 (2019).
41. Tsoucas, D. et al. Accurate estimation of cell-type composition from gene expression data. Nat. Commun. 10, 2975 (2019).
42. Moffitt, J. R. et al. High-throughput single-cell gene-expression profiling with multiplexed error-robust fluorescence in situ hybridization. Proc. Natl Acad. Sci. USA 113, 11046–11051 (2016).
43. Stringer, C., Wang, T., Michaelos, M. & Pachitariu, M. Cellpose: a generalist algorithm for cellular segmentation. Nat. Methods 18, 100–106 (2021).
44. Traag, V. A., Waltman, L. & Van Eck, N. J. From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9, 5233 (2019).
45. Lohoff, T. et al. Highly multiplexed spatially resolved gene expression profiling of mouse organogenesis. Preprint at bioRxiv https://doi.org/10.1101/2020.11.20.391896 (2020).

Acknowledgements

We thank B. Tasic and B. Long for sharing the unpublished Allen smFISH data and aiding in its interpretation and the SpaceTx consortium for facilitating the collaborations. We also thank Y. Boykov (University of Waterloo) for the initial discussions and advice on an alternative segmentation approach based on graph cuts. We are also grateful to a number of colleagues who advised us on the published protocols, including N. Pierson and L. Cai (seqFISH⁺), S. Codeluppi, L. Borm and S. Linnarsson (osmFISH) and X. Qian, M. Hilscher and M. Nilsson (ISS). Additionally, we thank J. Miller for his input on segmentation benchmarks and B. Lelieveldt for his advice on NCV visualization. We express our gratitude to D. Molchanov and D. Vetrov (HSE, Moscow) for their input on the algorithm. J.R.M. acknowledges pilot funding from the Harvard Digestive Disease Center (P30 DK034854). V.P., P.V.K. and J.R.M. were supported by the Seed Network grant 2019-202743 from the Chan Zuckerberg Initiative. V.P. is funded through a cooperative agreement between University of Copenhagen and Harvard Medical School.

Author information

Authors and Affiliations

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Viktor Petukhov, Ruslan A. Soldatov & Peter V. Kharchenko
Biotech Research and Innovation Centre, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
Viktor Petukhov & Konstantin Khodosevich
Program in Cellular and Molecular Medicine, Boston Children’s Hospital, Boston, MA, USA
Rosalind J. Xu, Paolo Cadinu & Jeffrey R. Moffitt
Department of Microbiology, Harvard Medical School, Boston, MA, USA
Rosalind J. Xu, Paolo Cadinu & Jeffrey R. Moffitt
Department of Chemistry, Harvard University, Boston, MA, USA
Rosalind J. Xu
Harvard Stem Cell Institute, Cambridge, MA, USA
Peter V. Kharchenko

Contributions

P.V.K. and V.P. formulated the study and the overall approach. V.P. developed the detailed algorithms with advice from R.A.S. and K.K. V.P. implemented the Baysor package. J.R.M., R.J.X. and P.C. developed boundary immunostaining and performed MERFISH measurements. V.P. and P.V.K. drafted the manuscript, with contributions by J.R.M., R.A.S. and R.J.X. All authors provided suggestions and corrections on the manuscript text.

Corresponding author

Correspondence to Peter V. Kharchenko.

Ethics declarations

Competing interests

P.V.K. serves on the Scientific Advisory Board to Celsius Therapeutics, Inc., and Biomage, Inc. J.R.M. is a cofounder and Scientific Advisory Board member of Vizgen, Inc. J.R.M. is an inventor on patents associated with MERFISH applied for on his behalf by Harvard University and Boston Children’s Hospital. The other authors declare no conflict of interest.

Additional information

Peer review information Nature Biotechnology thanks Kenneth Harris and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Graphical models of the segmentation process.

Graphical representations of the Bayesian models used for the general Markov-Random Field (MRF) segmentation process (a) and the extended model for cell segmentation (b) are shown. Blue squares represent input parameters and data for the algorithm. The yellow circles represent the hidden parameters, fitted by the algorithm. Optional input and parameters are shown with dashed border lines. Round-corner boxes represent plate notation for a mixture of distributions with the size of the mixture shown on the bottom right corner. N^mols denotes the number of molecules in the dataset, and N^comps is the specified number of the mixture components. Arrow labels show the distributions used to model dependencies between the corresponding variables. Matrix variables are shown with the capital letters and vector variables are designated with the overline. a, The general MRF model, where the MRF prior with weights W is used to account for the spatial dependency of the inferred labels \(\overrightarrow{z}\in 1:{N}^{comps}\). Examples of the variables and distributions for different labelling problems are noted below the boxes. b, The detailed model for the Cell Segmentation problem. Here, Bayesian Mixture Models with Dirichlet prior were used, so the possible number of components of the mixture is infinite, which allows the algorithm to estimate the number of components automatically. To ensure that the components correspond to the actual cells, the Global Scale parameter s was introduced, which specifies the expected cell radius.

Extended Data Fig. 2 Comparison of Baysor, pciSeq, and DAPI Watershed segmentations based on poly(A) signal.

a, Number of cells in different segmentations. b, Distribution of poly(A) brightness (x-axis) across background molecules (that is, molecules outside of the predicted segmentations) for different segmentations (color). Baysor shows the lowest number of transcripts in bright poly(A) regions, while Watershed has the heaviest tail. c, Number of cells overlapping with the segmented poly(A) regions is shown as a distribution for different segmentation methods. Baysor shows the highest frequency of one-to-one mapping with poly(A) segmentations. d, Mutual information (y-axis) of molecule assignment with poly(A) segmentation is shown for different segmentation methods. To account for local variation, we split the data over a 7 × 7 grid and show the mean and 95% CI, as well as individual values for n = 49 sub-regions. e, Size of the overlap for a best-matching cell in the poly(A) segmentation, normalized to the total size (in molecules) of the best-matching poly(A) cell, is shown as a distribution for different segmentation methods. A peak near 1.0 indicates that many poly(A) cells are fully covered by the best-matching target cells reported by a segmentation method. f, Similar to (e), the size of the overlap with best-matching poly(A) cells is shown as a fraction of the target cell size for different target segmentation methods. A peak near 1.0 indicates that many reported target cells are fully covered by the best-matching poly(A) cell. g-k, Example borders of Baysor (purple) and Watershed (blue) segmentations are shown. The left plots show poly(A) signal, while the right plots show molecules colored based on local expression patterns (NCVs). While in most cases there is a good correspondence between the two modalities (g-i), in some cases molecular composition clearly indicates the presence of distinct cells that are not easily separated from the poly(A) signal intensity (j,k).
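The per-tile mutual information described for panel d can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the use of bits as the unit, the equal-width tiling and the normal-approximation confidence interval are all assumptions made for the example, since the caption does not specify them.

```python
import numpy as np

def mutual_information(a, b):
    """Mutual information (in bits) between two label vectors of equal length."""
    _, ai = np.unique(np.asarray(a).ravel(), return_inverse=True)
    _, bi = np.unique(np.asarray(b).ravel(), return_inverse=True)
    ai, bi = ai.ravel(), bi.ravel()
    # Joint distribution of the two labelings over molecules
    joint = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(joint, (ai, bi), 1)
    joint /= joint.sum()
    # Product of the marginals, same shape as the joint table
    indep = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / indep[nz])).sum())

def grid_mutual_information(x, y, labels_a, labels_b, n_bins=7):
    """MI per tile of an n_bins x n_bins grid, with mean and an approximate 95% CI."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    labels_a, labels_b = np.asarray(labels_a), np.asarray(labels_b)
    # Interior edges only, so np.digitize returns tile indices 0..n_bins-1
    x_edges = np.linspace(x.min(), x.max(), n_bins + 1)[1:-1]
    y_edges = np.linspace(y.min(), y.max(), n_bins + 1)[1:-1]
    tile = np.digitize(x, x_edges) * n_bins + np.digitize(y, y_edges)
    values = np.array([
        mutual_information(labels_a[tile == t], labels_b[tile == t])
        for t in np.unique(tile)  # empty tiles are skipped
    ])
    mean = values.mean()
    # Assumption: normal approximation for the CI of the mean across tiles
    half = 1.96 * values.std(ddof=1) / np.sqrt(len(values))
    return values, mean, (mean - half, mean + half)
```

Two identical labelings give a mutual information equal to the label entropy, and independent labelings give values near zero, so higher values indicate closer agreement with the poly(A) segmentation.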

Extended Data Fig. 3 Impact of the ‘prior segmentation confidence’ parameter on the difference between the prior and the posterior segmentations on the example of ISS CA1 region.

a, The Mutual Information between the Baysor and the Paper (published) segmentations (y-axis) is shown as a function of the prior segmentation confidence (x-axis). Mutual Information does not reach the value of 1.0, as even for prior confidence set to 1.0, Baysor is still allowed to re-assign molecules recognised as background in the Paper segmentation. b, For each cell of the source segmentation (shown with colour), a cell with the largest overlap was picked from the target segmentation. The overlap fraction is shown on the y-axis for the different values of prior segmentation confidence. The boxes represent distribution quartiles with the maximal length of whiskers equal to 1.5 times the inter-quartile range. It can be seen that for high values of the prior confidence, for each Paper cell there is a Baysor cell that covers it completely (confidence ≥0.9, Source=Paper). The opposite is not true, as Baysor is allowed to re-assign the background molecules from the Paper segmentation.
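The best-match overlap fraction used in panel b can be illustrated with a short sketch. It assumes that each segmentation is given as one cell label per molecule and that label 0 marks background; the function name and this encoding are hypothetical choices for the example, not taken from the Baysor package.

```python
from collections import Counter, defaultdict

def best_match_overlap(source, target, background=0):
    """For each source cell, the fraction of its molecules that fall in the
    single target cell overlapping it the most."""
    overlaps = defaultdict(Counter)  # source cell -> target cell -> shared molecules
    sizes = Counter()                # source cell -> total molecules
    for s, t in zip(source, target):
        if s == background:
            continue
        sizes[s] += 1
        if t != background:
            overlaps[s][t] += 1
    return {
        s: max(overlaps[s].values(), default=0) / n
        for s, n in sizes.items()
    }

# Toy example: one label per molecule, 0 = background
paper = [1, 1, 1, 1, 2, 2, 0, 0]
baysor = [5, 5, 5, 6, 7, 7, 7, 0]

paper_as_source = best_match_overlap(paper, baysor)
baysor_as_source = best_match_overlap(baysor, paper)
```

In this toy input, Baysor cell 7 contains one molecule that the Paper labeling treats as background, so its overlap fraction stays below 1 even though Paper cell 2 is fully covered. This is the asymmetry the caption describes.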

Extended Data Fig. 4 Cell statistics for different segmentation methods.

The boxplots show distributions of the number of molecules per cell (a, log-scale y-axis) and the square root of the cell area, which is an approximation for cell radii (b), for different protocols (x-axis) and segmentation methods (fill colours). The boxes represent quartiles with the maximal length of whiskers equal to 1.5 times the inter-quartile range. For all datasets, Baysor has approximately the same values as the published segmentations, which suggests that it is not biased towards over- or under-segmentation. The Watershed and pciSeq methods consistently show lower values, consistent with registering mostly nuclei molecules.

Extended Data Fig. 5 Comparison of the Baysor and the published segmentation on the MERFISH Hypothalamus dataset.

The figure shows the comparison in the same format as Fig. 5. a, A joint UMAP embedding of the cells from both Baysor and the paper segmentations. The colors correspond to the annotated cell types. b, The same embedding, colored by the segmentation that produced a specific cell. c, A heatmap showing expression patterns of marker genes (columns) for the different cell types (rows). The colors show expression levels, normalised for each gene. d, The frequency of different cell types is shown for the Baysor (brown bars) and the paper (blue bars) segmentations. The numbers on the top of the bars show excess percentage for Baysor. The largest difference is observed for Endothelial cells, where the Paper segmentation has 42% fewer cells compared to Baysor. e,f, Examples of Astrocytes (e) and Endothelial cells (f) that were not segmented in the Paper annotation but were distinguished by Baysor. The dots correspond to the measured molecules, colored by gene (only the three most abundant genes are shown). The grayscale background shows the DAPI signal, and the black contours show the determined cell boundary. g, Example of a region with Ependymal cells, showing that for such regions molecules have homogeneous expression patterns. This results in Baysor slightly under-segmenting such cells, which causes the difference in the number of detected cells.

Extended Data Fig. 6 Comparison of cell clusters in the MERFISH mouse ileum dataset recovered from different segmentation methods.

a,b,c, Leiden clusters, cell type spatial distributions, and marker gene expressions in the Na⁺K⁺-ATPase immunofluorescence (IF) MERFISH mouse ileum dataset, where cells are segmented by (a) Baysor with RNA information only, (b) Baysor with priors provided by Cellpose-derived IF boundaries, (c) Cellpose-derived IF boundaries. Left: UMAP of all identified cells colored based on Leiden clustering. Middle: Spatial distributions of all identified cell clusters colored as in the UMAP. Right: Expressions of marker genes in each of the identified cell clusters. The size of the dots represents the fraction of cells with at least one count of the indicated gene. The color of the dots represents the average expression of each gene across all cell types, log-transformed, and normalized to the cell type with the largest expression. DC: dendritic cells; ICC: interstitial cells of Cajal; TA: transit amplifying cells. d, The numbers of each cell type identified by each of the segmentation methods in a-c.
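The two quantities encoded in the marker-gene dot plots (dot size and dot color) can be computed roughly as in the sketch below. The caption does not state the exact order of averaging, log transformation and normalization, so the choices here (mean per cluster, then `log1p`, then division by the per-gene maximum) are assumptions, as is the function name.

```python
import numpy as np

def dotplot_stats(counts, clusters):
    """Per-cluster dot size (fraction of cells with >= 1 count) and dot color
    (log-transformed mean expression scaled to the top cluster for each gene)."""
    counts = np.asarray(counts, dtype=float)  # shape: cells x genes
    clusters = np.asarray(clusters)
    names = np.unique(clusters)
    frac = np.zeros((len(names), counts.shape[1]))
    mean = np.zeros_like(frac)
    for i, name in enumerate(names):
        sub = counts[clusters == name]
        frac[i] = (sub > 0).mean(axis=0)      # dot size
        mean[i] = np.log1p(sub.mean(axis=0))  # assumed transform: log(1 + mean)
    top = mean.max(axis=0)
    # Scale each gene to its highest-expressing cluster; all-zero genes stay at 0
    scaled = np.divide(mean, top, out=np.zeros_like(mean), where=top > 0)
    return names, frac, scaled

# Toy example: 4 cells, 2 genes, 2 clusters
toy_counts = [[2, 0], [0, 0], [4, 1], [4, 3]]
toy_clusters = ["a", "a", "b", "b"]
names, frac, scaled = dotplot_stats(toy_counts, toy_clusters)
```

Scaling each gene to its top cluster keeps markers with very different absolute abundance comparable on a single color scale.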

Extended Data Fig. 7 Spatial distributions of all cell types identified by Baysor (with RNA information only).

Gray dots represent the location of all cells, and colored dots represent the location of the indicated cell type. DC: dendritic cells; ICC: interstitial cells of Cajal; TA: transit amplifying cells.

Extended Data Fig. 8 Additional benchmarks against MERFISH membrane staining data.

a, Similar to Fig. 6i of the main manuscript, the distribution shows overlap of different segmentations with membrane IF segmentation. Baysor+DAPI and Baysor+IF correspond to Baysor run with DAPI and IF segmentations as priors, respectively. b, Size of the overlap of different target segmentations with IF segments is shown relative to the size of the predicted cell in the target segmentation. c, Distribution of the number of target cells matching to cells of the membrane IF segmentation is shown for different segmentation results. d, Number of cells recovered by different segmentation methods. e,f, Number of molecules (e) and area (f) per cell, reported by different segmentation methods. The boxes represent distribution quartiles with the maximal length of whiskers equal to 1.5 times the inter-quartile range. g, Agreement between different segmentations and membrane IF segmentation is assessed using mutual information across molecules for n = 5 central z-planes. The average and 95% confidence intervals across z-planes, as well as dots for individual values, are shown. Only molecules assigned to some cell in any of the methods are used.

Extended Data Fig. 9 Outstanding challenges: intracellular compartmentalization and homotypic cells.

a, An example of intracellular compartmentalization, illustrated by polarized expression pattern of enterocytes in the mouse ileum, as captured by MERFISH. RNAs are colored by NCV. b, Example of a homotypic cell cluster from the mouse ileum. Three panels show the same region with membrane IF signal. The left panel shows NCV molecule coloring, whereas center and right panels color molecules assigned to each cell differently. Red arrows point at homotypic cells that Baysor was only able to segment with the help of IF prior.

Extended Data Fig. 10 Outstanding segmentation challenges.

a, Seq-FISH+ Fibroblast⁷ data colored by NCVs with black contours showing the published segmentation borders. b, The same data, segmented by Baysor with colors showing cell assignment. c, Example of cells which are separable only in 3D in the Allen smFISH data. The two plots show 2D projections on the physical x-y and x-z axes, respectively. Each point represents a molecule, coloured by its gene of origin. Gad2 and Pvalb are markers of inhibitory neurons, while Sv2c and Satb2 are markers of excitatory neurons. These markers are mutually exclusive, and there should be no cell that expresses all four of these markers. d-e, Seq-FISH mouse embryo⁴⁵ data colored by the published cell-type assignment (d) and by the Baysor cell segmentation (e) with black contours showing the published segmentation borders. It can be seen that the dataset captures cytoplasm-specific genes, lacking nuclei expression, which leads to the holes in the middle of cells. f, Example of a cell from the STARmap VISp 160 dataset¹⁸. The black lines show the published cell boundaries. The plot shows colouring by gene for the 15 most expressed genes.

About this article

Cite this article

Petukhov, V., Xu, R.J., Soldatov, R.A. et al. Cell segmentation in imaging-based spatial transcriptomics. Nat Biotechnol 40, 345–354 (2022). https://doi.org/10.1038/s41587-021-01044-w

Received: 05 October 2020
Accepted: 04 August 2021
Published: 14 October 2021
Issue Date: March 2022
DOI: https://doi.org/10.1038/s41587-021-01044-w
