Sprod for de-noising spatially resolved transcriptomics data based on position and image information

Abstract

Spatially resolved transcriptomics (SRT) technologies measure gene expression at a resolution close to, or even finer than, that of single cells, while retaining the physical locations of sequencing and often also providing matched pathology images. However, SRT expression data suffer from high noise levels, due to the shallow coverage in each sequencing unit and the extra experimental steps required to preserve the locations of sequencing. Fortunately, such noise can be removed by leveraging information from the physical locations of sequencing and the tissue organization reflected in the corresponding pathology images. In this work, we developed Sprod, based on latent graph learning of matched location and imaging data, to impute accurate SRT gene expression. We validated Sprod comprehensively and demonstrated its advantages over previous methods for removing drop-outs in single-cell RNA-sequencing data. We showed that, after imputation by Sprod, differential expression analyses, pathway enrichment and cell-to-cell interaction inferences are more accurate. Overall, we envision that de-noising by Sprod will become a key first step towards empowering SRT technologies for biomedical discoveries.
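The idea of borrowing information from nearby, similar-looking spots can be illustrated with a small sketch. This is not Sprod's latent graph learning: it is a simplified stand-in that assumes fixed Gaussian kernels over spot positions and image features, and the function name `denoise_by_similarity` and the parameters `sigma_loc`, `sigma_img` and `self_weight` are hypothetical.

```python
import math


def denoise_by_similarity(expr, coords, img_feats,
                          sigma_loc=1.0, sigma_img=1.0, self_weight=1.0):
    """Smooth each spot's expression by a weighted average over similar spots.

    expr      : list of per-spot expression vectors (spots x genes)
    coords    : list of (x, y) spot positions
    img_feats : list of per-spot image feature vectors
    The weights are an ASSUMED Gaussian kernel, not Sprod's learned graph.
    """
    n_spots = len(expr)
    n_genes = len(expr[0])
    smoothed = []
    for i in range(n_spots):
        weights = []
        for j in range(n_spots):
            if i == j:
                weights.append(self_weight)
                continue
            d_loc = math.dist(coords[i], coords[j])        # physical distance
            d_img = math.dist(img_feats[i], img_feats[j])  # image dissimilarity
            # A spot contributes only if it is both close and similar-looking.
            w = math.exp(-(d_loc / sigma_loc) ** 2) * math.exp(-(d_img / sigma_img) ** 2)
            weights.append(w)
        total = sum(weights)
        smoothed.append([
            sum(weights[j] * expr[j][g] for j in range(n_spots)) / total
            for g in range(n_genes)
        ])
    return smoothed
```

With two well-separated tissue regions, a drop-out spot inside a highly expressing region is pulled up towards its neighbours, while spots in the distant, differently textured region stay essentially unchanged.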

Fig. 1: Extensive noise in spatially resolved transcriptomics data.
Fig. 2: Sprod for de-noising of spatially resolved transcriptomics data.
Fig. 3: Validation of Sprod on real Visium and Slide-Seq datasets.
Fig. 4: Detection of spatially differentially expressed genes is more accurate after de-noising.
Fig. 5: Inference of cell-to-cell communication is more accurate with Sprod-corrected expression data.

Data availability

The Visium datasets were obtained from the public 10X resources/datasets website: https://www.10xgenomics.com/resources/datasets. The IDs of the datasets are: human-lymph-node-1-standard-1-1-0, Human-ovarian-cancer-whole-transcriptome-analysis-stains-dapi-anti-pan-ck-anti-cd-45-1-standard-1-2-0, human-ovarian-cancer-targeted-pan-cancer-panel-stains-dapi-anti-pan-ck-anti-cd-45-1-standard-1-2-0 and human-breast-cancer-block-a-section-1-1-standard-1-1-0. The ID of the standard 10X scRNA-seq dataset used in Fig. 1c is 10-k-peripheral-blood-mononuclear-cells-pbm-cs-from-a-healthy-donor-single-indexed-4.0.0. The Slide-Seq data are available from the publicly archived data of Stickels et al. (ref. 1). Specifically, we used the Puck_200115_08 data from https://singlecell.broadinstitute.org/single_cell/study/SCP815/highly-sensitive-spatial-transcriptomics-at-near-cellular-resolution-with-slide-seqv2.

Code availability

The Sprod software is available at https://github.com/yunguan-wang/SPROD. The DOI is https://doi.org/10.5281/zenodo.6047752 (ref. 29).

References

  1. Stickels, R. R. et al. Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-seqV2. Nat. Biotechnol. 39, 313–319 (2021).

  2. Rodriques, S. G. et al. Slide-seq: a scalable technology for measuring genome-wide expression at high spatial resolution. Science 363, 1463–1467 (2019).

  3. Vickovic, S. et al. High-definition spatial transcriptomics for in situ tissue profiling. Nat. Methods 16, 987–990 (2019).

  4. Cho, C.-S. et al. Microscopic examination of spatial transcriptome using Seq-Scope. Cell 184, 3559–3572.e22 (2021).

  5. Lee, Y. et al. XYZeq: spatially resolved single-cell RNA sequencing reveals expression heterogeneity in the tumor microenvironment. Sci. Adv. 7, eabg4755 (2021).

  6. Li, W. V. & Li, J. J. An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nat. Commun. 9, 997 (2018).

  7. Huang, M. et al. SAVER: gene expression recovery for single-cell RNA sequencing. Nat. Methods 15, 539–542 (2018).

  8. Lu, T. et al. Overcoming expressional drop-outs in lineage reconstruction from single-cell RNA-sequencing data. Cell Rep. 34, 108589 (2021).

  9. Nakagawa, T., Yamada, M. & Suzuki, Y. 18F-FDG uptake in reactive neck lymph nodes of oral cancer: relationship to lymphoid follicles. J. Nucl. Med. 49, 1053–1059 (2008).

  10. Weller, S. et al. Human blood IgM ‘memory’ B cells are circulating splenic marginal zone B cells harboring a prediversified immunoglobulin repertoire. Blood 104, 3647–3654 (2004).

  11. Agbay, R. L. M. C. et al. Characteristics and clinical implications of reactive germinal centers in the bone marrow. Hum. Pathol. 68, 7–21 (2017).

  12. Mayford, M., Baranes, D., Podsypanina, K. & Kandel, E. R. The 3′-untranslated region of CaMKII alpha is a cis-acting signal for the localization and translation of mRNA in dendrites. Proc. Natl Acad. Sci. USA 93, 13250–13255 (1996).

  13. Svensson, V., Teichmann, S. A. & Stegle, O. SpatialDE: identification of spatially variable genes. Nat. Methods 15, 343–346 (2018).

  14. Tushev, G. et al. Alternative 3′ UTRs modify the localization, regulatory potential, stability, and plasticity of mRNAs in neuronal compartments. Neuron 98, 495–511.e6 (2018).

  15. Ainsley, J. A., Drane, L., Jacobs, J., Kittelberger, K. A. & Reijmers, L. G. Functionally diverse dendritic mRNAs rapidly associate with ribosomes following a novel experience. Nat. Commun. 5, 4510 (2014).

  16. Wang, H., Wu, X. & Chen, Y. Stromal-immune score-based gene signature: a prognosis stratification tool in gastric cancer. Front. Oncol. 9, 1212 (2019).

  17. Wang, T. et al. An empirical approach leveraging tumorgrafts to dissect the tumor microenvironment in renal cell carcinoma identifies missing link to prognostic inflammatory factors. Cancer Discov. 8, 1142–1155 (2018).

  18. Jin, S. et al. Inference and analysis of cell-cell communication using CellChat. Nat. Commun. 12, 1088 (2021).

  19. Planes-Laine, G. et al. PD-1/PD-L1 targeting in breast cancer: the first clinical evidences are emerging. a literature review. Cancers (Basel) 11, 1033 (2019).

  20. Yuan, C. et al. Expression of PD-1/PD-L1 in primary breast tumours and metastatic axillary lymph nodes and its correlation with clinicopathological parameters. Sci. Rep. 9, 14356 (2019).

  21. Li, C.-J., Lin, L.-T., Hou, M.-F. & Chu, P.-Y. PD‑L1/PD‑1 blockade in breast cancer: the immunotherapy era (Review). Oncol. Rep. 45, 5–12 (2021).

  22. Wang, X. et al. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science 361, eaat5691 (2018).

  23. Wang, L. & Li, R.-C. Learning low-dimensional latent graph structures: a density estimation approach. IEEE Trans. Neural Netw. Learn. Syst. 31, 1098–1112 (2020).

  24. Zhang, R., Atwal, G. S. & Lim, W. K. Noise regularization removes correlation artifacts in single-cell RNA-seq data preprocessing. Patterns (N. Y.) 2, 100211 (2021).

  25. Andrews, T. S. & Hemberg, M. False signals induced by single-cell imputation. [version 2; peer review: 4 approved]. F1000Res. 7, 1740 (2018).

  26. Haralick, R. M., Shanmugam, K. & Dinstein, I. Textural features for image classification. IEEE Trans. Syst. Man Cybern. 3, 610–621 (1973).

  27. Zhang, Z., Xiong, D., Wang, X., Liu, H. & Wang, T. Mapping the functional landscape of T cell receptor repertoires by single-T cell transcriptomics. Nat. Methods 18, 92–99 (2021).

  28. Adossa, N. A., Rytkönen, K. T. & Elo, L. L. Dirichlet process mixture models for single-cell RNA-seq clustering. Biol. Open 11, bio059001 (2022).

Acknowledgements

We acknowledge the ENCODE Consortium and the ENCODE production laboratories that generated the eCLIP datasets used in our study. We acknowledge J. Johnson for providing input on the interpretation of the mouse Slide-Seq data. This study was supported by the National Institutes of Health (NIH) (5P30CA142543 to T.W., G.X. and Y.X., 1R01CA258584 to T.W., U01AI156189 to T.W. and Y.X., R01DE030656 to G.X., R01GM141519 to G.X., R01GM140012 to G.X., U01CA249245 to G.X., R35GM136375 to Y.X., 2P50CA070907 to T.W., Y.X. and G.X., R01AG075582 to L.W., 3U01AI156189-01S1 to T.W.), National Science Foundation (NSF DMS-2009689 to L.W.), and Cancer Prevention Research Institute of Texas (CPRIT RP190208 to T.W. and RP190107 to G.X.).

Author information

Contributions

Y.W. and B.S. implemented the software and contributed bioinformatics analyses. L.W. and T.W. designed the model. M.C. provided input on the interpretation of the pathology analyses. Y.X., S.W. and G.X. provided input on the analyses and the writing. T.W. supervised the whole study.

Corresponding authors

Correspondence to Li Wang or Tao Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks Nikos Karaiskos and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary handling editor: Lin Tang, in collaboration with the Nature Methods team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Correlation between PTPRC RNA expression (targeted) and CD45 protein expression (IF).

Red arrows indicate that the results are limited to spots of higher quality. Yellow arrows indicate that the single gene PTPRC is replaced by a PTPRC signature that also includes correlated genes.

Extended Data Fig. 2 Overlaying the un-corrected IgD gene expression on the H&E stained image in the 10X Visium human lymph node dataset.

The whole slide is shown, with the three examples of Fig. 1d picked from this slide. The yellow circle marks the mantle zone to be highlighted in Fig. 3d.

Extended Data Fig. 3 Gene expression clustering of the beads in the mouse brain Slide-Seq dataset.

Gene expression clustering of the beads in the mouse brain Slide-Seq dataset reflects the multi-cellular structures of the mouse brain hippocampus. a, Slide-Seq puck 200306_03. b, Slide-Seq puck 200115_08. Both pucks are from Stickels et al. (ref. 1).

Extended Data Fig. 4 Deviances between CD45 IF intensities and the expression levels of PTPRC (left: original, right: denoised).

CD45 IF intensities and PTPRC expression values were normalized and distributionally warped to the same scale so they can be directly compared. The differences between CD45 IF and PTPRC on each spot are denoted by color. Red refers to small differences and green refers to larger differences.
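One way to put two measurements on a common scale before comparing them spot by spot is a rank-based quantile mapping. The sketch below is an assumption about what such a warping could look like, not the authors' exact procedure, and the helper names `to_quantiles` and `per_spot_deviation` are hypothetical.

```python
def to_quantiles(values):
    """Map values to rank-based quantiles in [0, 1]; ties share their average rank."""
    n = len(values)
    if n == 1:
        return [0.5]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [r / (n - 1) for r in ranks]


def per_spot_deviation(if_intensity, expression):
    """Absolute difference between the two quantile-mapped measurements per spot."""
    q_if = to_quantiles(if_intensity)
    q_ex = to_quantiles(expression)
    return [abs(a - b) for a, b in zip(q_if, q_ex)]
```

Under this mapping a deviation of 0 means the spot has the same relative rank in both measurements, and a deviation of 1 means it is the highest in one and the lowest in the other.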

Extended Data Fig. 5 Spatial IgD expression of the raw Visium data (left) and the Sprod-adjusted data (right).

The red circles mark the mantle zone to be highlighted in Fig. 3d.

Extended Data Fig. 6 Spearman correlations between IgD and CD3/CD20/CD1c for the human lymph node Visium dataset.

Results are shown for the original expression data, SAVER/scImpute-corrected data, the Sprod-corrected data, and the Sprod-corrected data with image/location information scrambled.
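The comparison described here rests on Spearman correlations computed across spots, with a scrambled control in which the link between spots and their image/location information is broken. The sketch below shows the correlation itself and one simple way to scramble a per-spot vector; the function names `spearman` and `scramble` and the `seed` parameter are assumptions for illustration, and how the authors scrambled their inputs is not specified in this caption.

```python
import math
import random


def _average_ranks(values):
    """Return ranks (0-based) with tied values sharing their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def scramble(values, seed=0):
    """Return a shuffled copy, breaking the pairing between spots and values."""
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    return shuffled
```

A usage pattern would be to compute `spearman(igd, marker)` on each version of the expression data and compare it with the value obtained after the auxiliary information has been passed through `scramble`.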

Extended Data Fig. 7 Extraction of four tumor regions.

The four tumor regions (blue, green, orange and red) that were extracted, according to expressional clustering and concordance with the H&E stained slide.

Supplementary information

Supplementary Information

Supplementary Notes 1 and 2 and Table 1.

Reporting Summary

Supplementary Table

Supplementary Table 2.

Supplementary Software

Sprod v.1.0 was provided with this publication for documentation purposes.

About this article

Cite this article

Wang, Y., Song, B., Wang, S. et al. Sprod for de-noising spatially resolved transcriptomics data based on position and image information. Nat Methods 19, 950–958 (2022). https://doi.org/10.1038/s41592-022-01560-w

  • DOI: https://doi.org/10.1038/s41592-022-01560-w
