The microscopic assessment of tissue samples is instrumental for the diagnosis and staging of cancer, and thus guides therapy. However, these assessments demonstrate considerable variability and many regions of the world lack access to trained pathologists. Though artificial intelligence (AI) promises to improve the access and quality of healthcare, the costs of image digitization in pathology and difficulties in deploying AI solutions remain as barriers to real-world use. Here we propose a cost-effective solution: the augmented reality microscope (ARM). The ARM overlays AI-based information onto the current view of the sample in real time, enabling seamless integration of AI into routine workflows. We demonstrate the utility of ARM in the detection of metastatic breast cancer and the identification of prostate cancer, with latency compatible with real-time use. We anticipate that the ARM will remove barriers towards the use of AI designed to improve the accuracy and efficiency of cancer diagnosis.
The Camelyon16 dataset utilized to develop the deep learning algorithms used in this study is available from the Camelyon challenge (ref. 8; https://camelyon16.grand-challenge.org/). The prostate dataset from TCGA that was used to develop the deep learning algorithms used in this study is available from the Genomic Data Commons portal (https://portal.gdc.cancer.gov/), which is based on data generated by the TCGA Research Network (http://cancergenome.nih.gov/). The remainder of the prostate and the lymph node data sets is not publicly available due to restrictions in data-sharing agreements with the data sources. The use of de-identified tissue for this study was approved by the Institutional Review Board.
The deep learning architecture will be made available at https://github.com/google-research/google-research/tree/master/nopad_inception_v3_fcn. The deep learning framework used here (TensorFlow) is available at https://www.tensorflow.org/. The camera grabber drivers (BitFlow) are available at http://www.bitflow.com/. The software used for basic image processing (OpenCV) is available at https://opencv.org/. The Python libraries used for computation and plotting of the performance metrics (SciPy, NumPy and Matplotlib) are available at https://www.scipy.org/, http://www.numpy.org/ and https://matplotlib.org/, respectively.
Elmore, J. G. et al. Diagnostic concordance among pathologists interpreting breast biopsy specimens. JAMA 313, 1122–1132 (2015).
Brimo, F., Schultz, L. & Epstein, J. I. The value of mandatory second opinion pathology review of prostate needle biopsy interpretation before radical prostatectomy. J. Urol. 184, 126–130 (2010).
Amin, M. B. et al. The eighth edition AJCC cancer staging manual: continuing to build a bridge from a population-based to a more ‘personalized’ approach to cancer staging. CA Cancer J. Clin. 67, 93–99 (2017).
Salgado, R. et al. The evaluation of tumor-infiltrating lymphocytes (TILs) in breast cancer: recommendations by an International TILs Working Group 2014. Ann. Oncol. 26, 259–271 (2015).
Wilson, M. L. et al. Access to pathology and laboratory medicine services: a crucial gap. Lancet 391, 1927–1938 (2018).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Gulshan, V. et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316, 2402–2410 (2016).
Ehteshami Bejnordi, B. et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 318, 2199–2210 (2017).
Esteva, A. et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115–118 (2017).
Liu, Y. et al. Detecting cancer metastases on gigapixel pathology images. arXiv, https://arxiv.org/abs/1703.02442 (2017).
Saltz, J. et al. Spatial organization and molecular correlation of tumor-infiltrating lymphocytes using deep learning on pathology images. Cell Rep. 23, 181–193.e7 (2018).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (IEEE, 2016).
Shelhamer, E., Long, J. & Darrell, T. Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 640–651 (2017).
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. & Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2818–2826 (IEEE, 2016).
Liu, Y. et al. Artificial intelligence-based breast cancer nodal metastasis detection. Arch. Pathol. Lab. Med. 143, 859–868 (2018).
Steiner, D. F. et al. Impact of deep learning assistance on the histopathologic review of lymph nodes for metastatic breast cancer. Am. J. Surg. Pathol. 42, 1636–1646 (2018).
Nagpal, K. et al. Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer. NPJ Digit. Med. 2, 48 (2019).
Vandenberghe, M. E. et al. Relevance of deep learning to facilitate the diagnosis of HER2 status in breast cancer. Sci. Rep. 7, 45938 (2017).
Russ, J. C. Computer-Assisted Microscopy: The Measurement and Analysis of Images (Springer Science & Business Media, 2012).
Pirnstill, C. W. & Coté, G. L. Malaria diagnosis using a mobile phone polarized microscope. Sci. Rep. 5, 13368 (2015).
Quinn, J. A. et al. Deep convolutional neural networks for microscopy-based point of care diagnostics. In Proceedings of the International Conference on Machine Learning for HealthCare 56, 271–281 (2016).
Xie, W., Noble, J. A. & Zisserman, A. Microscopy cell counting and detection with fully convolutional regression networks. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 6, 283–292 (2018).
Hegde, N. et al. Similar image search for histopathology: SMILY. NPJ Digit. Med. 2, 56 (2019).
Coudray, N. et al. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat. Med. 24, 1559–1567 (2018).
Kather, J. N. et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat. Med. 25, 1054–1056 (2019).
Krupinski, E. A. et al. Eye-movement study and human performance using telepathology virtual slides: implications for medical education and differences with experience. Hum. Pathol. 37, 1543–1556 (2006).
Gleason, D. F. & Mellinger, G. T. Prediction of prognosis for prostatic adenocarcinoma by combined histological grading and clinical staging. J. Urol. 111, 58–64 (1974).
Epstein, J. I., Allsbrook, W. C. Jr, Amin, M. B. & Egevad, L. L., ISUP Grading Committee. The 2005 International Society of Urological Pathology (ISUP) Consensus Conference on Gleason grading of prostatic carcinoma. Am. J. Surg. Pathol. 29, 1228–1242 (2005).
Abadi, M. et al. TensorFlow: a system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (2016).
Kohlberger, T. et al. Whole-slide image focus quality: automatic assessment and impact on AI cancer detection. arXiv, https://arxiv.org/abs/1901.04619 (2019).
Gutman, D. A. et al. Cancer digital slide archive: an informatics resource to support integrated in silico analysis of TCGA pathology data. J. Am. Med. Inform. Assoc. 20, 1091–1098 (2013).
van Der Laak, J. A., Pahlplatz, M. M., Hanselaar, A. G. & de Wilde, P. C. Hue-saturation-density (HSD) model for stain recognition in digital images from transmitted light microscopy. Cytometry 39, 275–284 (2000).
We would like to thank the following pathologists who provided initial user feedback: M. Amin, S. Binder, T. Brown, M. Emmert-Buck, I. Flament, N. Olson, A. Sangoi and J. Smith, as well as colleagues who provided assistance with engineering components and paper writing: T. Boyd, A. Chai, L. Dong, W. Ito, J. Kumler, T.-Y. Lin, M. Moran, R. Nagle, D. Stephenson, S. Sudhir, D. Sykora and M. Weakly.
P.-H.C.C., R.M., K.G., Y.L., S.K., K.N., T.K., J.D., G.S.C., and C.H.M. are employees of Google and own Alphabet stock. J.D.H. is an employee of AstraZeneca. M.C.S. is an employee of Tempus Labs.
Peer review information: Javier Carmona was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
a, Hardware components of the integrated ARM system on the Olympus BX43. b, Photograph of the ARM system implementation labeled with the corresponding modules. c, Sample view of lymph node metastasis detection model through the lens.
In the development phase, we first sample patches of size 911 × 911 pixels from digitized whole-slide imagery. The patches are then preprocessed to match the data distribution of microscope images. In the application phase, an image of size 2,560 × 2,560 pixels is provided to the network. The output of the network is a heatmap depicting the likelihood of cancer at each pixel location.
Modified InceptionV3 network that avoids introducing artifacts when run in ‘fully convolutional mode’ at inference. A ‘Crop’ layer with parameter k crops a border of width k from its input. The principles we followed in the modifications were: (1) use of ‘valid’ rather than ‘same’ padding for all convolutions, to avoid introducing artificial zeroes when the input size is increased at inference time; (2) differential cropping of the output of the branches in each Inception block as appropriate, to maintain the same spatial size (height and width of each feature block) for the channel-wise concatenation operation; and (3) increasing the receptive field to increase the tissue context available to the neural network for interpretation.
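Principles (1) and (2) can be illustrated with a small sketch. It is not the authors' network: the function names, the 9 × 9 input and the 1 × 1, 3 × 3 and 5 × 5 branch kernels are hypothetical values chosen only to show why unpadded branches end up with different spatial sizes and how a centered crop of width k aligns them before channel-wise concatenation.

```python
def valid_conv_size(size, kernel, stride=1):
    """Spatial size after one 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1


def crop_border(feature_map, k):
    """Remove a border of width k from a 2-D feature map (list of rows)."""
    if k == 0:
        return [row[:] for row in feature_map]
    return [row[k:-k] for row in feature_map[k:-k]]


def align_branches(branches):
    """Center-crop square feature maps to the smallest branch size.

    Assumption for this sketch: all size differences are even, so the
    same border width can be removed on every side.
    """
    target = min(len(b) for b in branches)
    aligned = []
    for b in branches:
        diff = len(b) - target
        if diff % 2 != 0:
            raise ValueError("branch sizes must differ by an even amount")
        aligned.append(crop_border(b, diff // 2))
    return aligned


# Hypothetical Inception-style block on a 9 x 9 input with unpadded
# 1x1, 3x3 and 5x5 branches: the outputs are 9, 7 and 5 pixels wide.
branch_sizes = [valid_conv_size(9, k) for k in (1, 3, 5)]
branches = [[[0.0] * s for _ in range(s)] for s in branch_sizes]

# After cropping borders of width 2, 1 and 0, every branch is 5 x 5
# and the branches can be concatenated along the channel axis.
aligned = align_branches(branches)
print(branch_sizes, [len(b) for b in aligned])
```

With ‘same’ padding every branch would keep the input size and no cropping would be needed, but the padded zeroes would then enter the computation, which is the effect principle (1) is meant to avoid.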
a, Schematic of the optic pathway. The standard upright microscope illuminates the specimen (S) from behind and captures the image rays with a conventional objective. These rays propagate upward, in a collimated state, towards the oculars. A teaching module (Nikon Y-IDP) with a beam splitter (BS1) was inserted into the optical pathway in the collimated light space. This module was modified to accept a microscope camera (C), so that the specimen image relayed from BS1 was in focus at the camera sensor when the specimen was also in focus for the microscope user. A second customized teaching module (Nikon T-THM) was inserted between the oculars and the first teaching module. The beam splitter in this module (BS2) was rotated 90° to combine light from the specimen image (SI) with that from the projected image (PI) from the microdisplay (P). The augmented reality display includes a microdisplay and collimating optics, which were chosen to match the display size with the ocular size (22 mm). In this prototype, we tested two microdisplays—one that supports arbitrary colors (RGB) and another, brighter, display that supports only the green channel. The position of the collimator was adjusted to position the microdisplay in the virtual focal plane of the specimen. This collocation of SI and PI in the same plane minimizes relative motion when the observer moves, a phenomenon known as parallax. Note that BS1 needs to precede BS2 in the optical pathway from objective to ocular, so that camera C sees a view of the specimen without the projection PI. The observer looking through the eyepiece (EP) sees PI superimposed onto SI. b, Photograph of the actual implementation labeled with the corresponding modules.
The images show actual views through the lens of the ARM, with green outlines highlighting the predicted tumor region. a, Left to right: lymph node metastasis detection at ×4, ×10, ×20 and ×40. b, Left to right: prostate cancer detection at ×4, ×10 and ×20.
Extended Data Fig. 6 Lymph node cancer detection at ×10 compared to the corresponding immunohistochemistry as the reference standard.
Left to right: FOV as seen from the ARM, predicted heatmap, corresponding FOV from a digital scanner and corresponding FOV of an immunohistochemistry (pancytokeratin AE1/AE3 antibody)-stained slide from a digital scanner. This immunohistochemistry stain highlights the tumor cells in brown.
Extended Data Fig. 7 Visualization of color distribution of slides used in the training and test sets.
In the polar scatter plots, the angle represents the hue (color) while the distance from origin represents the saturation. Each point represents the average hue and saturation of an image after mapping RGB values to optical densities followed by a hue–saturation–density (HSD) color transform. The HSD transform is similar to hue–saturation–value, but corrects for the logarithmic relationship between light intensity and stain amount and has been shown to better represent stained slides (refs. 10, 32). The training set from a digital scanner is shown in blue, and test sets from microscopy are shown in red and green.
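The mapping from RGB to optical density and then to hue and saturation can be sketched per pixel as follows. This is a minimal illustration of the HSD model as it is commonly formulated, not the code used for the figure; the function name `rgb_to_hsd`, the white reference of 255 per channel and the clamping of zero intensities are assumptions made for the example.

```python
import math


def rgb_to_hsd(rgb, white=(255, 255, 255)):
    """Convert one RGB pixel to (hue, saturation, density).

    Assumptions for this sketch: 8-bit input, a white reference of 255
    per channel, and intensities clamped to at least 1 to avoid log(0).
    """
    # Optical density per channel: D_ch = -ln(I_ch / I_0,ch).
    od = [-math.log(max(c, 1) / w) for c, w in zip(rgb, white)]
    density = sum(od) / 3.0
    if density < 1e-6:
        # Background pixel: no stain, so hue and saturation are undefined.
        return 0.0, 0.0, 0.0
    # Chromatic coordinates are normalised by the overall density.
    cx = od[0] / density - 1.0
    cy = (od[1] - od[2]) / (math.sqrt(3.0) * density)
    hue = math.atan2(cy, cx)         # angle in the polar plot
    saturation = math.hypot(cx, cy)  # distance from the origin
    return hue, saturation, density


# A neutral gray pixel absorbs all channels equally, so its saturation is ~0.
print(rgb_to_hsd((128, 128, 128)))
```

How the per-pixel values are averaged into the single point plotted for each image is not specified in the caption, so that step is left out here.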
Extended Data Fig. 8 Visualization of optical density distribution of slides used in the training and test sets.
In the histogram plots, the x axis represents the luma (brightness) of the image. Each point represents the average optical density of an image after mapping RGB values to optical densities followed by HSD color transform (ref. 32). The training set from a digital scanner is shown in blue, and test sets from microscopy are shown in red and green.
Extended Data Fig. 9 Training and inference comparison across three design choices for the deep learning algorithm.
The standard patch-based approach crops the input image into smaller patches for training, and applies the algorithm in a sliding window across these patches for inference. This results in poor computational efficiency at inference because computations are repeated across adjacent patches. The naive FCN eliminates the repeated computations but causes grid-like artifacts due to the mismatched context between training and inference. The modified FCN removes padding in the network, ensuring consistent context between training and inference. This provides artifact-free inference results with no repeated computations between adjacent patches.
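The difference between the three designs can be reproduced on a one-dimensional toy problem. The sketch below is an illustration only: the two-layer "network", its kernels `K1` and `K2` and the test signal are hypothetical, and a real model would use 2-D convolutions with nonlinearities. It shows that an unpadded network gives the same outputs whether it is run per patch or once over the whole signal, whereas a zero-padded network run on a patch sees artificial zeroes at the patch border that it does not see when run over the whole signal.

```python
def conv_valid(signal, kernel):
    """Unpadded ('valid') 1-D convolution, written as a correlation."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def conv_same(signal, kernel):
    """Zero-padded ('same') 1-D convolution: output length equals input length."""
    pad = len(kernel) // 2
    return conv_valid([0] * pad + list(signal) + [0] * pad, kernel)


K1 = [1, 2, 1]   # hypothetical first-layer kernel
K2 = [1, 0, -1]  # hypothetical second-layer kernel


def network(signal, conv):
    """Two stacked 3-tap layers; the receptive field is 5 samples."""
    return conv(conv(signal, K1), K2)


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
start, size = 2, 7
patch = signal[start:start + size]

# Modified (unpadded) FCN: patch outputs equal the matching slice of the
# whole-signal output, so one pass can replace the sliding window.
full_valid = network(signal, conv_valid)
patch_valid = network(patch, conv_valid)
print(patch_valid == full_valid[start:start + len(patch_valid)])

# Padded network: outputs near the patch border were computed from padded
# zeroes, so they disagree with the whole-signal pass at the same positions.
full_same = network(signal, conv_same)
patch_same = network(patch, conv_same)
print(patch_same == full_same[start:start + size])
```

In this toy, only the outputs whose receptive field reaches the patch border differ; the output at the patch center agrees in both cases because its 5-sample receptive field fits inside the 7-sample patch.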
Extended Data Fig. 10 Qualitative and quantitative comparison of patch-level AUC between convolutional neural network, naive FCN and proposed FCN.
Confidence intervals (CIs) were calculated with 5,000 bootstrap replications.
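A bootstrap confidence interval for a patch-level AUC can be sketched as below. The caption states only the number of replications; the percentile method, the 95% level, the fixed seed and the function names are assumptions of this example, and the labels and scores are made-up illustration data rather than results from the study.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties count as 0.5. Returns None if only one class is present.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement n_boot times."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        value = auc([labels[i] for i in idx], [scores[i] for i in idx])
        if value is not None:  # skip resamples that contain a single class
            stats.append(value)
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Made-up example data: 1 = tumor patch, 0 = benign patch.
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.75, 0.6, 0.55, 0.3, 0.7, 0.4, 0.35, 0.2, 0.15, 0.1]
print(auc(labels, scores), bootstrap_ci(labels, scores))
```

Resampling whole cases keeps each label paired with its score, so the spread of the resampled AUC values reflects sampling variability in the test set rather than in the model.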
About this article
Nature Medicine (2019)