The promise and peril of deep learning in microscopy

Ever since van Leeuwenhoek peered into his homemade microscope and revealed a world inhabited by “small animals,” scientists have been pushing the limits of microscopy to see ever finer details in living cells and organisms. Though current methods would be science fiction to van Leeuwenhoek, we’ve entered an era of diminishing returns: camera sensors are 95% efficient, modern lasers can evaporate samples, fluorescent molecules reliably emit thousands of photons, and objective lenses hit their fundamental physical performance limit over a century ago. Nevertheless, a “resolution revolution” seeded by Lukosz in the 1960s and heralded by Hell, Gustafsson and Betzig in the 2000s introduced “super-resolution” (SR) to the scientific vernacular. In this issue of Nature Methods, Qiao et al.¹ continue the revolution by riding the tidal wave of deep learning (DL), a framework rooted in the 1940s² that has only recently enjoyed the computational muscle required by any but the simplest tasks³, and present alchemic results: transforming low-resolution, low-contrast, noisy images into super-resolved, high-contrast, clean micrographs.

Qiao et al.’s achievement is threefold. First, they collect an exceptional training dataset¹, an invaluable public resource for new method development, consisting of matched noisy, low-resolution images and high-quality, super-resolved, structured illumination microscopy (SIM, a variant of SR microscopy) reconstructions. Second, they introduce two DL architectures, termed deep Fourier channel attention networks (DFCAN) and deep Fourier generative adversarial networks (DFGAN), which, as their names imply, learn feature representations in the Fourier domain. Third, they apply the networks to both the perennial problem of SIM reconstruction from nine low-quality images and the more fantastical concept of single-image SR (SISR)⁴, in which an SR image is inferred entirely from a single diffraction-limited, or lower-resolution, image.

How does SISR work? DL essentially memorizes patterns: the network is trained on pairs of low- and high-resolution images and learns to map low-frequency inputs to the high-frequency components they lack. Currently, biological microscopists are bound by an uncertainty principle in which they can either image with high spatiotemporal fidelity while worrying that the intense illumination and protein overexpression required are disturbing the dynamics under study, or capture only blurry images of healthy, normally behaving cells and organisms. The methods explored by Qiao et al. promise to circumvent this dilemma.
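To see why SISR has to lean on memorized patterns rather than on the image alone, consider the toy sketch below. It is only an illustration under stated assumptions: the signals are one-dimensional, the diffraction limit is modeled as an ideal low-pass filter, and the names `low_pass`, `object_a` and `object_b` as well as the chosen frequencies are ours, not anything taken from Qiao et al.

```python
import numpy as np

def low_pass(signal, cutoff):
    """Ideal low-pass filter: zero every Fourier component above `cutoff`.

    Assumption: this stands in for the diffraction limit of a microscope.
    """
    spectrum = np.fft.rfft(signal)
    spectrum[cutoff + 1:] = 0.0
    return np.fft.irfft(spectrum, n=signal.size)

n = 256
x = np.arange(n) / n

# Two hypothetical specimens that share a coarse shape (3 cycles)
# but differ in their fine detail (40 versus 60 cycles).
coarse = np.sin(2 * np.pi * 3 * x)
object_a = coarse + 0.5 * np.sin(2 * np.pi * 40 * x)
object_b = coarse + 0.5 * np.sin(2 * np.pi * 60 * x)

# "Image" both specimens with a cutoff that keeps only the 10 lowest frequencies.
cutoff = 10
blurry_a = low_pass(object_a, cutoff)
blurry_b = low_pass(object_b, cutoff)

# The specimens differ, yet their blurry images are numerically identical.
same_object = np.allclose(object_a, object_b)
same_image = np.allclose(blurry_a, blurry_b)
print("objects identical:", same_object)
print("blurry images identical:", same_image)
```

Because the two blurry images coincide, nothing in the measurement itself distinguishes the two specimens; any method that returns one of them must draw the missing detail from prior examples.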

While Qiao et al. have pushed DL in microscopy forward, they are certainly not the only ones tilling this fertile field. DL is infiltrating every aspect of biological imaging, with applications ranging from denoising to SISR to segmentation and even to ‘in silico labeling’ (a process in which fluorescence images are inferred from non-fluorescence ones), demonstrating that DL can be a powerful addition to the lab bench. But statistical models, of which DL networks are a subset, have been aptly compared to the Golem of Prague: “Animated by truth, but lacking free will, a golem always does exactly what it is told. This is lucky, because the golem is incredibly powerful, able to withstand and accomplish more than its creators could. However, its obedience also brings danger, as careless instructions or unexpected events can turn a golem against its makers. Its abundance of power is matched by its lack of wisdom.”⁵

With this in mind, it’s worth pausing to consider what, specifically, we are asking golems like that of Qiao et al. to do: conjure more from less. As the authors note: “it is theoretically impossible for network inference to obtain [ground truth] images in every detail.” We certainly can’t trust every detail, but can we in fact trust any detail? Generative networks like those in this paper are specifically trained to lie plausibly. Occasionally, this is the goal; consider that the most famous generative network, the so-called Deepfakes model, was originally designed to switch faces in video footage⁶. But for scientific imaging this trait can be problematic: input a low-resolution image of a mitochondrion to a model trained on mitochondria, and it will fill in the details with whatever features it has memorized from other mitochondria (Fig. 1a); feed it a low-resolution image of a cat, and it will still happily try to fill in the details with mitochondrial features (Fig. 1b).

Fig. 1: Do androids dream of electric cells?

a, Under the right conditions a properly trained deep neural network can predict a super-resolved micrograph from a blurry diffraction-limited (or lower resolution) image. b, However, if the neural network encounters unknown specimens, or known specimens imaged with unknown microscopes, it can produce nonsensical results.
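The failure mode of Fig. 1b can be mimicked with a deliberately crude stand-in for a trained network. In the sketch below, "training" just memorizes the average detail that blurring removes from a set of hypothetical training signals, and "inference" adds that detail back to whatever it is given. Everything here (the one-dimensional signals, the ideal low-pass `blur`, the `toy_sr` function and all frequencies) is an assumption made for illustration and is not the architecture of Qiao et al.

```python
import numpy as np

n = 256
x = np.arange(n) / n

def blur(signal, cutoff=10):
    """Ideal low-pass filter used here as a stand-in for a blurry microscope."""
    spectrum = np.fft.rfft(signal)
    spectrum[cutoff + 1:] = 0.0
    return np.fft.irfft(spectrum, n=signal.size)

# Hypothetical training set: coarse shapes of varying amplitude,
# all carrying the same fine texture (40 cycles).
rng = np.random.default_rng(0)
texture = 0.5 * np.sin(2 * np.pi * 40 * x)
train_sharp = [a * np.sin(2 * np.pi * 3 * x) + texture
               for a in rng.uniform(0.5, 1.5, size=20)]

# "Training": memorize the average detail lost to blurring.
learned_detail = np.mean([s - blur(s) for s in train_sharp], axis=0)

def toy_sr(blurry):
    """Toy super-resolution: add the memorized detail to any input."""
    return blurry + learned_detail

# A specimen like the training data, and one unlike it (no fine texture at all).
in_distribution = 1.2 * np.sin(2 * np.pi * 3 * x) + texture
out_of_distribution = np.cos(2 * np.pi * 5 * x)

err_in = np.abs(toy_sr(blur(in_distribution)) - in_distribution).max()
err_out = np.abs(toy_sr(blur(out_of_distribution)) - out_of_distribution).max()
print("max error, familiar specimen:  ", err_in)
print("max error, unfamiliar specimen:", err_out)
```

For the familiar specimen the memorized texture happens to be the right one, so the error is negligible; for the unfamiliar specimen the same texture is painted onto a signal that never had it.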

Perhaps we can think of DL in microscopy as replacing one dilemma with another: the more we rely on DL, the less confident we can be because ‘all models are wrong’, and the infinitude of incorrect model outputs ranges from the obviously wrong to the perniciously plausible. If we want to characterize novel biological phenotypes, we should require, at the very least, an estimate of how confident we are that the observed phenotype is real and not a hopeful guess. Moreover, the immense variability inherent in biological systems (for example, between organisms, between tissues, between cell types and even between cell states, not to mention the myriad microscopy modalities used to image them) and the scarcity of resources to generate manually annotated, procedurally generated, or experimentally captured ground truth compound these issues.

The good news for microscopists is threefold. First, research into Bayesian DL promises to build trust by explicitly quantifying uncertainty⁷, a property that may prove useful for discovery research by flagging interesting features and phenotypes not observed in the large volumes of model training data. Second, models such as that of Qiao et al. could be used in combination with a ‘smart’ microscope⁸ that images biological processes with gentle epifluorescence until model confidence drops, causing a switch to real SR imaging to capture the never-before-seen dynamics with high spatiotemporal resolution. Third, microscopy is not alone in this learning curve: even machine learning’s most ardent adepts have compared the state of DL to alchemy⁹, and all fields are struggling to understand how to adopt DL responsibly¹⁰.
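The ‘smart’ microscope idea can be sketched as a simple decision rule. The example below is a minimal sketch, not a Bayesian DL implementation: it assumes that uncertainty is approximated by the disagreement among several model predictions of the same frame (an ensemble), and the functions `ensemble_uncertainty` and `choose_mode`, the threshold value and the simulated predictions are all hypothetical.

```python
import numpy as np

def ensemble_uncertainty(predictions):
    """Mean per-pixel standard deviation across an ensemble.

    `predictions` has shape (members, height, width). Disagreement between
    members is used here as a rough proxy for model uncertainty.
    """
    return float(np.std(predictions, axis=0).mean())

def choose_mode(predictions, threshold):
    """Switch to real SR imaging only when the models stop agreeing."""
    if ensemble_uncertainty(predictions) > threshold:
        return "SR"
    return "epifluorescence"

# Simulated ensemble outputs (assumption: 8 members, 32 x 32 pixel frames).
rng = np.random.default_rng(0)
base = rng.random((32, 32))
familiar = np.stack([base + 0.01 * rng.standard_normal(base.shape)
                     for _ in range(8)])  # members agree closely
novel = np.stack([base + 0.30 * rng.standard_normal(base.shape)
                  for _ in range(8)])     # members disagree strongly

threshold = 0.1  # hypothetical; would need calibration on real data
print("familiar scene ->", choose_mode(familiar, threshold))
print("novel scene    ->", choose_mode(novel, threshold))
```

In practice the threshold would have to be calibrated against ground-truth SR acquisitions, and the uncertainty estimate itself would need validation before it could be trusted to trigger the switch.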

In the end, science is fundamentally curiosity-driven, and we encourage continued interest in DL, with the caveat that scientists should maintain an awareness of its limits as the field slowly matures, graduating from black-box empirical performance to sound theoretical underpinnings and from chatbots to medical diagnoses. Biology is by nature chaotic, and relevance comes from replicates; thus, while unexpected or theoretically surprising results obtained by DL, or any other novel method, are powerful hypothesis generators, they ultimately should be taken as suggestive until substantiated and validated by well-understood methods and repeated measurements. In short, we ought to maintain an open mind and a healthy skepticism while remembering the dictum popularized by Robert A. Heinlein: “There ain’t no such thing as a free lunch.”


References

1. Qiao, C. et al. Nat. Methods https://doi.org/10.1038/s41592-020-01048-5 (2021).
2. McCulloch, W. S. & Pitts, W. Bull. Math. Biophys. 5, 115–133 (1943).
3. Silver, D. et al. Nature 550, 354–359 (2017).
4. Dong, C., Loy, C. C., He, K. & Tang, X. IEEE Trans. Pattern Anal. Mach. Intell. 38, 295–307 (2016).
5. McElreath, R. Statistical Rethinking: A Bayesian Course with Examples in R and Stan (Taylor and Francis, 2020).
6. Citron, D. Deep fakes: a looming challenge for privacy, democracy, and national security. Center for Internet and Society https://cyberlaw.stanford.edu/publications/deep-fakes-looming-challenge-privacy-democracy-and-national-security (2018).
7. Wang, H. & Yeung, D.-Y. A survey on Bayesian deep learning. ACM Comput. Surv. 53, 1–37 (2020).
8. Eisenstein, M. Nat. Methods 17, 1075–1079 (2020).
9. Hutson, M. Science https://doi.org/10.1126/science.aau0577 (2018).
10. D’Amour, A. et al. Underspecification presents challenges for credibility in modern machine learning. Preprint at arXiv https://arxiv.org/abs/2011.03395 (2020).

Author information



Corresponding author

Correspondence to David P. Hoffman.

Ethics declarations

Competing interests

The authors declare no competing interests.

Cite this article

Hoffman, D.P., Slavitt, I. & Fitzpatrick, C.A. The promise and peril of deep learning in microscopy. Nat Methods 18, 131–132 (2021). https://doi.org/10.1038/s41592-020-01035-w
