Integrating spatial gene expression and breast tumour morphology via deep learning

Abstract

Spatial transcriptomics allows for the measurement of RNA abundance at a high spatial resolution, making it possible to systematically link the morphology of cellular neighbourhoods and spatially localized gene expression. Here, we report the development of a deep learning algorithm for the prediction of local gene expression from haematoxylin-and-eosin-stained histopathology images using a new dataset of 30,612 spatially resolved gene expression measurements matched to histopathology images from 23 patients with breast cancer. We identified over 100 genes, including known breast cancer biomarkers of intratumoral heterogeneity and the co-localization of tumour growth and immune activation, the expression of which can be predicted from the histopathology images at a resolution of 100 µm. We also show that the algorithm generalizes well to The Cancer Genome Atlas and to other breast cancer gene expression datasets without the need for re-training. Predicting the spatially resolved transcriptome of a tissue directly from tissue images may enable image-based screening for molecular biomarkers with spatial variation.
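To make the 100 µm scale concrete, the following is a minimal sketch of how an image window of that physical size could be located around one measurement position on a scanned slide. It is an illustration only and not the authors' pipeline: the function name `patch_around_spot`, the `Patch` type, the `microns_per_pixel` scanner calibration and the example coordinates are all hypothetical, and the patch size actually fed to ST-Net is not stated in this excerpt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Pixel bounding box: left/top inclusive, right/bottom exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def patch_around_spot(x_px, y_px, microns_per_pixel, window_um=100.0,
                      image_width=None, image_height=None):
    """Return a square window of side `window_um` centred on a spot.

    All parameters are illustrative assumptions: `x_px` and `y_px` are the
    spot centre in pixels, and `microns_per_pixel` is the scan resolution.
    """
    if microns_per_pixel <= 0:
        raise ValueError("microns_per_pixel must be positive")

    # Convert the physical window size into a whole number of pixels.
    side = max(1, round(window_um / microns_per_pixel))

    left = round(x_px - side / 2)
    top = round(y_px - side / 2)
    right = left + side
    bottom = top + side

    # Optionally clamp the window to the image bounds.
    if image_width is not None:
        left, right = max(0, left), min(image_width, right)
    if image_height is not None:
        top, bottom = max(0, top), min(image_height, bottom)

    return Patch(left, top, right, bottom)


if __name__ == "__main__":
    # Hypothetical spot at pixel (1000, 2000) on a 0.5 µm/pixel scan.
    print(patch_around_spot(1000, 2000, microns_per_pixel=0.5))
```

A window cropped this way would be paired with the expression values measured at the same position to form one training example; any resizing or normalization before the network is left out of this sketch.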

Fig. 1: ST-Net pipeline and data.
Fig. 2: Results from the ST-Net predictions.
Fig. 3: Three example patches and regions of interest identified by interpreting ST-Net.
Fig. 4: UMAP visualization of the ST-Net latent space.

Data availability

The main data supporting the results in this study are available within the paper and its Supplementary Information. Raw files for the breast cancer samples are available through a materials transfer agreement with Å.B. (ake.borg@med.lu.se). All images and processed data are available at http://www.spatialtranscriptomicsresearch.org. The 10x Spatial Genomics data can be downloaded from https://wp.10xgenomics.com/spatial-transcriptomics. All data from TCGA are publicly available from the Genomic Data Commons Data Portal (https://portal.gdc.cancer.gov).

Code availability

The code for ST-Net is available at https://github.com/bryanhe/ST-Net.

Acknowledgements

J.Z. is supported by the NSF (grant no. CCF 1763191), NIH (grant nos. R21 MD012867-01, P30AG059307 and U01MH098953) and grants from the Silicon Valley Foundation and Chan–Zuckerberg Initiative. J.L. is supported by the Swedish Foundation for Strategic Research, Swedish Cancer Society and Swedish Research Council.

Author information

Contributions

All of the authors contributed to the project planning and writing of the manuscript. B.H. and L.B. performed analysis. L.S., Å.B. and J.L. generated data. J.M., J.L. and J.Z. supervised the project.

Corresponding authors

Correspondence to Jonas Maaskola, Joakim Lundeberg or James Zou.

Ethics declarations

Competing interests

J.L. is an author on patent nos. PCT/EP2012/056823 (WO2012/140224), PCT/EP2013/071645 (WO2014/060483) and PCT/EP2016/057355 applied for by Spatial Transcriptomics AB/10x Genomics Inc. covering the described technology.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary figures and tables.

Reporting Summary

About this article

Cite this article

He, B., Bergenstråhle, L., Stenbeck, L. et al. Integrating spatial gene expression and breast tumour morphology via deep learning. Nat Biomed Eng 4, 827–834 (2020). https://doi.org/10.1038/s41551-020-0578-x

  • DOI: https://doi.org/10.1038/s41551-020-0578-x
