Global voxel transformer networks for augmented microscopy

A preprint version of the article is available at arXiv.


Advances in deep learning have led to remarkable success in augmented microscopy, enabling us to obtain high-quality microscope images without using expensive microscopy hardware and sample preparation techniques. Current deep learning models for augmented microscopy are mostly U-Net-based neural networks and therefore share certain drawbacks that limit their performance. In particular, U-Nets are composed of local operators only and lack dynamic non-local information aggregation. In this work, we introduce global voxel transformer networks (GVTNets), a deep learning tool for augmented microscopy that overcomes intrinsic limitations of the current U-Net-based models and achieves improved performance. GVTNets are built on global voxel transformer operators, which are able to aggregate global information, as opposed to local operators like convolutions. We apply the proposed methods to existing datasets for three different augmented microscopy tasks under various settings.
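
To make the contrast between local and global operators concrete, the following is a minimal NumPy sketch of attention-style aggregation over all voxels of a feature volume, in the spirit of refs. 38 and 39. It is not the authors' global voxel transformer operator: the function name global_voxel_attention, the single-head design, the projection matrices w_q, w_k and w_v, and the toy volume size are all assumptions made for illustration.

```python
import numpy as np


def global_voxel_attention(x, w_q, w_k, w_v):
    """Aggregate information across every voxel of a (D, H, W, C) volume.

    Hypothetical single-head self-attention, for illustration only.
    w_q and w_k must share an output width; w_v sets the output channels.
    """
    d, h, w, c = x.shape
    flat = x.reshape(-1, c)                  # (N, C) with N = D * H * W
    q = flat @ w_q                           # queries, one per voxel
    k = flat @ w_k                           # keys, one per voxel
    v = flat @ w_v                           # values, one per voxel

    # Pairwise similarity between all voxels, scaled as in ref. 38.
    scores = q @ k.T / np.sqrt(k.shape[1])   # (N, N)

    # Numerically stable softmax over the key dimension.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each output voxel is a data-dependent weighted sum of ALL input voxels,
    # whereas a convolution only sums over a fixed local neighbourhood.
    out = weights @ v
    return out.reshape(d, h, w, -1), weights


# Toy example: a 2 x 4 x 4 volume with 8 feature channels (assumed sizes).
rng = np.random.default_rng(0)
volume = rng.standard_normal((2, 4, 4, 8))
w_q, w_k, w_v = (rng.standard_normal((8, 8)) * 0.1 for _ in range(3))
out, weights = global_voxel_attention(volume, w_q, w_k, w_v)
```

Because the weight matrix has one row per output voxel and one column per input voxel, its memory cost grows quadratically with the number of voxels, so a sketch like this is only practical on small or downsampled volumes.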

Fig. 1: GVTNets architecture, training and inference.
Fig. 2: GVTNets on label-free prediction of 3D fluorescence images from transmitted-light microscopy.
Fig. 3: GVTNets on content-aware 3D image denoising.
Fig. 4: GVTNets on content-aware 3D-to-2D image projection.
Fig. 5: Generalization ability of GVTNets.

Data availability

Datasets for label-free prediction of 3D fluorescence images from transmitted-light microscopy are available for download from the source given in ref. 25. Datasets for content-aware 3D image denoising and 3D-to-2D image projection are available for download from the source given in ref. 27.

Code availability

The code for GVTNets training, prediction and evaluation (in Python/TensorFlow) is publicly available; see ref. 60.


  1. Gustafsson, M. G. Surpassing the lateral resolution limit by a factor of two using structured illumination microscopy. J. Microsc. 198, 82–87 (2000).
  2. Huisken, J., Swoger, J., Del Bene, F., Wittbrodt, J. & Stelzer, E. H. Optical sectioning deep inside live embryos by selective plane illumination microscopy. Science 305, 1007–1009 (2004).
  3. Betzig, E. et al. Imaging intracellular fluorescent proteins at nanometer resolution. Science 313, 1642–1645 (2006).
  4. Rust, M. J., Bates, M. & Zhuang, X. Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM). Nat. Meth. 3, 793–796 (2006).
  5. Heintzmann, R. & Gustafsson, M. G. Subdiffraction resolution in continuous samples. Nat. Photon. 3, 362–364 (2009).
  6. Tomer, R., Khairy, K., Amat, F. & Keller, P. J. Quantitative high-speed imaging of entire developing embryos with simultaneous multiview light-sheet microscopy. Nat. Meth. 9, 755–763 (2012).
  7. Chen, B.-C. et al. Lattice light-sheet microscopy: imaging molecules to embryos at high spatiotemporal resolution. Science 346, 1257998 (2014).
  8. Belthangady, C. & Royer, L. A. Applications, promises, and pitfalls of deep learning for fluorescence image reconstruction. Nat. Meth. 16, 1215–1225 (2019).
  9. Laissue, P. P., Alghamdi, R. A., Tomancak, P., Reynaud, E. G. & Shroff, H. Assessing phototoxicity in live fluorescence imaging. Nat. Meth. 14, 657–661 (2017).
  10. Icha, J., Weber, M., Waters, J. C. & Norden, C. Phototoxicity in live fluorescence microscopy, and how to avoid it. Bioessays 39, 1700003 (2017).
  11. Selinummi, J. et al. Bright field microscopy as an alternative to whole cell fluorescence in automated analysis of macrophage images. PLoS ONE 4, e7497 (2009).
  12. Pawley, J. B. in Handbook of Biological Confocal Microscopy (ed. Pawley, J. B.) 20–42 (Springer, 2006).
  13. Scherf, N. & Huisken, J. The smart and gentle microscope. Nat. Biotechnol. 33, 815–818 (2015).
  14. Skylaki, S., Hilsenbeck, O. & Schroeder, T. Challenges in long-term imaging and quantification of single-cell dynamics. Nat. Biotechnol. 34, 1137–1144 (2016).
  15. LeCun, Y. et al. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).
  16. Sullivan, D. P. & Lundberg, E. Seeing more: a future of augmented microscopy. Cell 173, 546–548 (2018).
  17. Chen, P. et al. An augmented reality microscope with real-time artificial intelligence integration for cancer diagnosis. Nat. Med. 25, 1453–1457 (2019).
  18. Moen, E. et al. Deep learning for cellular image analysis. Nat. Meth. 16, 1233–1246 (2019).
  19. Johnson, G. R., Donovan-Maiye, R. M. & Maleckar, M. M. Building a 3D integrated cell. Preprint at (2017).
  20. Ounkomol, C. et al. Three dimensional cross-modal image inference: label-free methods for subcellular structure prediction. Preprint at (2017).
  21. Osokin, A., Chessel, A., Carazo Salas, R. E. & Vaggi, F. GANs for biological image synthesis. In Proc. IEEE International Conference on Computer Vision 2233–2242 (2017).
  22. Yuan, H. et al. Computational modeling of cellular structures using conditional deep generative networks. Bioinformatics 35, 2141–2149 (2019).
  23. Johnson, G., Donovan-Maiye, R., Ounkomol, C. & Maleckar, M. M. Studying stem cell organization using ‘label-free’ methods and a novel generative adversarial model. Biophys. J. 114, 43A (2018).
  24. Christiansen, E. M. et al. In silico labeling: predicting fluorescent labels in unlabeled images. Cell 173, 792–803 (2018).
  25. Ounkomol, C., Seshamani, S., Maleckar, M. M., Collman, F. & Johnson, G. R. Label-free prediction of three-dimensional fluorescence images from transmitted-light microscopy. Nat. Meth. 15, 917–920 (2018).
  26. Wu, Y. et al. Three-dimensional virtual refocusing of fluorescence microscopy images using deep learning. Nat. Meth. 16, 1323–1331 (2019).
  27. Weigert, M. et al. Content-aware image restoration: pushing the limits of fluorescence microscopy. Nat. Meth. 15, 1090–1097 (2018).
  28. Wang, H. et al. Deep learning enables cross-modality super-resolution in fluorescence microscopy. Nat. Meth. 16, 103–110 (2019).
  29. Rivenson, Y. et al. Deep learning microscopy. Optica 4, 1437–1443 (2017).
  30. Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention 234–241 (Springer, 2015).
  31. Falk, T. et al. U-Net: deep learning for cell counting, detection, and morphometry. Nat. Meth. 16, 67–70 (2019).
  32. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).
  33. He, K., Zhang, X., Ren, S. & Sun, J. Identity mappings in deep residual networks. In European Conference on Computer Vision 630–645 (Springer, 2016).
  34. Fakhry, A., Zeng, T. & Ji, S. Residual deconvolutional networks for brain electron microscopy image segmentation. IEEE Trans. Med. Imaging 36, 447–456 (2017).
  35. Lee, K., Zung, J., Li, P., Jain, V. & Seung, H. S. Superhuman accuracy on the SNEMI3D connectomics challenge. Preprint at (2017).
  36. Çiçek, Ö., Abdulkadir, A., Lienkamp, S. S., Brox, T. & Ronneberger, O. 3D U-Net: learning dense volumetric segmentation from sparse annotation. In International Conference on Medical Image Computing and Computer-Assisted Intervention 424–432 (Springer, 2016).
  37. Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. Preprint at (2014).
  38. Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems 5998–6008 (2017).
  39. Wang, X., Girshick, R., Gupta, A. & He, K. Non-local neural networks. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 7794–7803 (2018).
  40. Wilson, D. R. & Martinez, T. R. The general inefficiency of batch training for gradient descent learning. Neural Networks 16, 1429–1451 (2003).
  41. Wang, Z. et al. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004).
  42. Aigouy, B. et al. Cell flow reorients the axis of planar polarity in the wing epithelium of Drosophila. Cell 142, 773–786 (2010).
  43. Etournay, R. et al. Interplay of cell dynamics and epithelial tension during morphogenesis of the Drosophila pupal wing. eLife 4, e07090 (2015).
  44. Pan, S. J. & Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22, 1345–1359 (2009).
  45. Blasse, C. et al. PreMosa: extracting 2D surfaces from 3D microscopy mosaics. Bioinformatics 33, 2563–2569 (2017).
  46. Cai, L., Wang, Z., Gao, H., Shen, D. & Ji, S. Deep adversarial learning for multi-modality missing data completion. In Proc. 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 1158–1166 (Association for Computing Machinery, 2018).
  47. Zhang, Q., Cui, Z., Niu, X., Geng, S. & Qiao, Y. Image segmentation with pyramid dilated convolution based on ResNet and U-Net. In International Conference on Neural Information Processing 364–372 (Springer, 2017).
  48. Huang, J. et al. Range scaling global U-Net for perceptual image enhancement on mobile devices. In Proc. European Conference on Computer Vision (ECCV) (Springer, 2018).
  49. Oktay, O. et al. Attention U-Net: learning where to look for the pancreas. Preprint at (2018).
  50. Zhou, Z., Siddiquee, M. M. R., Tajbakhsh, N. & Liang, J. UNet++: a nested U-Net architecture for medical image segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support 3–11 (Springer, 2018).
  51. Gu, Z. et al. CE-Net: context encoder network for 2D medical image segmentation. IEEE Trans. Med. Imaging 38, 2281–2292 (2019).
  52. Goodfellow, I. et al. Generative adversarial nets. In Advances in Neural Information Processing Systems 2672–2680 (MIT Press, 2014).
  53. Rivenson, Y. et al. Virtual histological staining of unlabelled tissue-autofluorescence images via deep learning. Nat. Biomed. Eng. 3, 466–477 (2019).
  54. Finn, C., Abbeel, P. & Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proc. 34th International Conference on Machine Learning 70, 1126–1135 (JMLR, 2017).
  55. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 1097–1105 (2012).
  56. Kolda, T. G. & Bader, B. W. Tensor decompositions and applications. SIAM Rev. 51, 455–500 (2009).
  57. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proc. 3rd International Conference on Learning Representations (2015).
  58. Ioffe, S. & Szegedy, C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning 448–456 (2015).
  59. Kendall, A. & Gal, Y. What uncertainties do we need in Bayesian deep learning for computer vision? In Advances in Neural Information Processing Systems 5574–5584 (2017).
  60. Wang, Z., Xie, Y. & Ji, S. zhengyang-wang/GVTNets: Code for “Global voxel transformer networks for augmented microscopy” (version v1.0.0). Zenodo (2020).


We thank the teams at CARE and the Allen Institute for Cell Science for making their data and tools publicly available. This work was supported in part by National Science Foundation grants DBI-1922969, IIS-1908166 and IIS-1908220, National Institutes of Health grant 1R21NS102828 and Defense Advanced Research Projects Agency grant N66001-17-2-4031.

Author information




S.J. conceived and initiated the research. Z.W. and S.J. designed the methods. Z.W. and Y.X. implemented the training and validation methods. Z.W. and Y.X. designed and developed the software package. S.J. supervised the project. Z.W., Y.X. and S.J. wrote the manuscript.

Corresponding author

Correspondence to Shuiwang Ji.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks Ruogu Fang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–13, Tables 1–6 and Notes 1 and 2.

Cite this article

Wang, Z., Xie, Y. & Ji, S. Global voxel transformer networks for augmented microscopy. Nat Mach Intell 3, 161–171 (2021).
