Integrating spatial gene expression and breast tumour morphology via deep learning

Abstract

Spatial transcriptomics measures RNA abundance at high spatial resolution, making it possible to systematically link the morphology of cellular neighbourhoods to spatially localized gene expression. Here, we report the development of a deep learning algorithm for the prediction of local gene expression from haematoxylin-and-eosin-stained histopathology images, using a new dataset of 30,612 spatially resolved gene expression measurements matched to histopathology images from 23 patients with breast cancer. We identified over 100 genes whose expression can be predicted from the histopathology images at a resolution of 100 µm, including known breast cancer biomarkers of intratumoral heterogeneity and of the co-localization of tumour growth and immune activation. We also show that the algorithm generalizes well to The Cancer Genome Atlas and to other breast cancer gene expression datasets without the need for re-training. Predicting the spatially resolved transcriptome of a tissue directly from tissue images may enable image-based screening for molecular biomarkers with spatial variation.
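As a rough illustration of the bookkeeping that patch-level prediction at a 100 µm resolution involves, the following Python sketch shows three small helpers: converting a 100 µm window into pixel bounds around a measurement location, holding out one patient at a time so that no patient contributes to both training and evaluation, and scoring one gene by the Pearson correlation between predicted and measured expression. The function names, the microns-per-pixel value and the choice of split and metric are assumptions made for illustration; they are not taken from the ST-Net code and may differ from what the authors did.

```python
import math
from statistics import mean


def patch_bounds(cx, cy, microns_per_pixel, patch_um=100.0):
    """Pixel bounds (left, top, right, bottom) of a square patch.

    The patch has a side of `patch_um` micrometres and is centred on the
    pixel (cx, cy). `microns_per_pixel` is a hypothetical scanner setting.
    """
    if microns_per_pixel <= 0:
        raise ValueError("microns_per_pixel must be positive")
    side = round(patch_um / microns_per_pixel)  # patch side in pixels
    half = side // 2
    left, top = cx - half, cy - half
    return (left, top, left + side, top + side)


def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices).

    Each patient is held out once, so patches from the same patient never
    appear on both sides of a split (an assumed evaluation scheme).
    """
    for held_out in sorted(set(patient_ids)):
        train = [i for i, p in enumerate(patient_ids) if p != held_out]
        test = [i for i, p in enumerate(patient_ids) if p == held_out]
        yield held_out, train, test


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")  # correlation is undefined for a constant input
    return cov / math.sqrt(vx * vy)
```

For example, with a hypothetical scan at 0.5 µm per pixel, `patch_bounds(1000, 1000, 0.5)` covers a 200 × 200 pixel window centred on the measurement location, and `pearson` would be applied per gene to the predictions on each held-out patient.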

Fig. 1: ST-Net pipeline and data.
Fig. 2: Results from the ST-Net predictions.
Fig. 3: Three example patches and regions of interest identified by interpreting ST-Net.
Fig. 4: UMAP visualization of the ST-Net latent space.

Data availability

The main data supporting the results in this study are available within the paper and its Supplementary Information. Raw files for the breast cancer samples are available through a material transfer agreement with Å.B. (ake.borg@med.lu.se). All images and processed data are available at http://www.spatialtranscriptomicsresearch.org. The 10x Spatial Genomics data can be downloaded from https://wp.10xgenomics.com/spatial-transcriptomics. All data from TCGA are publicly available from the Genomic Data Commons Data Portal (https://portal.gdc.cancer.gov).

Code availability

The code for ST-Net is available at https://github.com/bryanhe/ST-Net.

Acknowledgements

J.Z. is supported by the NSF (grant no. CCF 1763191), NIH (grant nos. R21 MD012867-01, P30AG059307 and U01MH098953) and grants from the Silicon Valley Foundation and Chan–Zuckerberg Initiative. J.L. is supported by the Swedish Foundation for Strategic Research, Swedish Cancer Society and Swedish Research Council.

Author information

Contributions

All of the authors contributed to the project planning and writing of the manuscript. B.H. and L.B. performed analysis. L.S., Å.B. and J.L. generated data. J.M., J.L. and J.Z. supervised the project.

Corresponding authors

Correspondence to Jonas Maaskola, Joakim Lundeberg or James Zou.

Ethics declarations

Competing interests

J.L. is an author on patent nos. PCT/EP2012/056823 (WO2012/140224), PCT/EP2013/071645 (WO2014/060483) and PCT/EP2016/057355 applied for by Spatial Transcriptomics AB/10x Genomics Inc. covering the described technology.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary figures and tables.

Reporting Summary

Cite this article

He, B., Bergenstråhle, L., Stenbeck, L. et al. Integrating spatial gene expression and breast tumour morphology via deep learning. Nat Biomed Eng 4, 827–834 (2020). https://doi.org/10.1038/s41551-020-0578-x
