Reinforcing neuron extraction and spike inference in calcium imaging using deep self-supervised denoising


Calcium imaging has transformed neuroscience research by providing a methodology for monitoring the activity of neural circuits with single-cell resolution. However, calcium imaging is inherently susceptible to detection noise, especially when imaging with high frame rate or under low excitation dosage. Here we developed DeepCAD, a self-supervised deep-learning method for spatiotemporal enhancement of calcium imaging data that does not require any high signal-to-noise ratio (SNR) observations. DeepCAD suppresses detection noise and improves the SNR more than tenfold, which reinforces the accuracy of neuron extraction and spike inference and facilitates the functional analysis of neural circuits.

Fig. 1: General principle and validation of DeepCAD.
Fig. 2: Spatiotemporal enhancement with DeepCAD.
Fig. 3: DeepCAD denoises calcium imaging data of large neuronal populations.

Data availability

The dataset of synchronized low-SNR and high-SNR two-photon calcium imaging (covering various imaging depths, excitation power, and cell structures) has been made publicly available at The dataset of simultaneous two-photon imaging and electrophysiological recording can be downloaded from the Collaborative Research in Computational Neuroscience (CRCNS) platform at Source data are provided with this paper.

Code availability

Our PyTorch implementation of DeepCAD is publicly available at The Fiji plugin and the pretrained model for denoising of large neuronal populations are readily accessible at Because the plugin is only compatible with TensorFlow, a companion TensorFlow implementation of DeepCAD is also made publicly available at the same GitHub repository.


  1. Grienberger, C. & Konnerth, A. Imaging calcium in neurons. Neuron 73, 862–885 (2012).

  2. Lu, R. et al. Video-rate volumetric functional imaging of the brain at synaptic resolution. Nat. Neurosci. 20, 620–628 (2017).

  3. Weisenburger, S. et al. Volumetric Ca2+ imaging in the mouse brain using hybrid multiplexed sculpted light microscopy. Cell 177, 1050–1066 e1014 (2019).

  4. Chow, D. M. et al. Deep three-photon imaging of the brain in intact adult zebrafish. Nat. Methods 17, 605–608 (2020).

  5. Calarco, J. A. & Samuel, A. D. Imaging whole nervous systems: insights into behavior from worms to fish. Nat. Methods 16, 14–15 (2019).

  6. Sabatini, B. L., Oertner, T. G. & Svoboda, K. The life cycle of Ca2+ ions in dendritic spines. Neuron 33, 439–452 (2002).

  7. Chen, T. W. et al. Ultrasensitive fluorescent proteins for imaging neuronal activity. Nature 499, 295–300 (2013).

  8. Ji, N., Freeman, J. & Smith, S. L. Technologies for imaging neural activity in large volumes. Nat. Neurosci. 19, 1154–1164 (2016).

  9. Svoboda, K. & Yasuda, R. Principles of two-photon excitation microscopy and its applications to neuroscience. Neuron 50, 823–839 (2006).

  10. Skylaki, S., Hilsenbeck, O. & Schroeder, T. Challenges in long-term imaging and quantification of single-cell dynamics. Nat. Biotechnol. 34, 1137–1144 (2016).

  11. Podgorski, K. & Ranganathan, G. Brain heating induced by near-infrared lasers during multiphoton microscopy. J. Neurophysiol. 116, 1012–1023 (2016).

  12. Wang, T. et al. Quantitative analysis of 1300-nm three-photon calcium imaging in the mouse brain. eLife 9, e53205 (2020).

  13. Dana, H. et al. High-performance calcium sensors for imaging activity in neuronal populations and microcompartments. Nat. Methods 16, 649–657 (2019).

  14. Samantaray, N., Ruo-Berchera, I., Meda, A. & Genovese, M. Realization of the first sub-shot-noise wide field microscope. Light Sci. Appl. 6, e17005 (2017).

  15. Weigert, M. et al. Content-aware image restoration: pushing the limits of fluorescence microscopy. Nat. Methods 15, 1090–1097 (2018).

  16. Belthangady, C. & Royer, L. A. Applications, promises, and pitfalls of deep learning for fluorescence image reconstruction. Nat. Methods 16, 1215–1225 (2019).

  17. Wang, H. et al. Deep learning enables cross-modality super-resolution in fluorescence microscopy. Nat. Methods 16, 103–110 (2019).

  18. Ouyang, W. et al. Deep learning massively accelerates super-resolution localization microscopy. Nat. Biotechnol. 36, 460–468 (2018).

  19. Lehtinen, J. et al. Noise2Noise: learning image restoration without clean data. in Proc. 35th International Conference on Machine Learning (eds Dy, J. & Krause, A.) 2965–2974 (PMLR, 2018).

  20. Çiçek, Ö. et al. 3D U-Net: learning dense volumetric segmentation from sparse annotation. in Medical Image Computing and Computer-Assisted Intervention 424–432 (2016).

  21. Maggioni, M., Katkovnik, V., Egiazarian, K. & Foi, A. Nonlocal transform-domain filter for volumetric data denoising and reconstruction. IEEE Trans. Image Process. 22, 119–133 (2013).

  22. Batson, J. & Royer, L. Noise2Self: blind denoising by self-supervision. in Proc. 36th International Conference on Machine Learning (eds Chaudhuri, K. & Salakhutdinov, R.) 524–533 (PMLR, 2019).

  23. Krull, A., Buchholz, T.-O. & Jug, F. Noise2Void—learning denoising from single noisy images. in Proc. IEEE Conference on Computer Vision and Pattern Recognition (eds Davis, L., Torr, P. & Zhu, S. C.) 2129–2137 (2019).

  24. Pnevmatikakis, E. A. et al. Simultaneous denoising, deconvolution, and demixing of calcium imaging data. Neuron 89, 285–299 (2016).

  25. Wu, Y. & He, K. Group normalization. in European Conference on Computer Vision (ECCV) 3–19 (2018).

  26. Kingma, D. P. & Ba, J. L. Adam: a method for stochastic optimization. in International Conference on Learning Representations 1–15 (2015).

  27. Deneux, T. et al. Accurate spike estimation from noisy calcium signals for ultrafast three-dimensional imaging of large neuronal populations in vivo. Nat. Commun. 7, 12190 (2016).

  28. GENIE project. Simultaneous imaging and loose-seal cell-attached electrical recordings from neurons expressing a variety of genetically encoded calcium indicators. (2015);

  29. Berens, P. et al. Community-based benchmarking improves spike rate inference from two-photon calcium imaging data. PLoS Comput. Biol. 14, e1006157 (2018).

  30. Pnevmatikakis, E. A. & Giovannucci, A. NoRMCorre: an online algorithm for piecewise rigid motion correction of calcium imaging data. J. Neurosci. Methods 291, 83–94 (2017).

  31. Caicedo, J. C. et al. Nucleus segmentation across imaging experiments: the 2018 Data Science Bowl. Nat. Methods 16, 1247–1253 (2019).

  32. Giovannucci, A. et al. CaImAn an open source tool for scalable calcium imaging data analysis. eLife 8, e38173 (2019).

  33. Zeiler, M. D. & Fergus, R. Visualizing and understanding convolutional networks. in European Conference on Computer Vision (ECCV) 818–833 (2014).


We acknowledge Y. Tang and Y. Yang (School of Medicine of Tsinghua University) for providing transgenic mice for imaging and the mesoscope imaging data for cross-system validation. We thank the Svoboda lab (Janelia Research Campus) for releasing their data of simultaneous electrophysiology and two-photon imaging. This work was supported by the National Natural Science Foundation of China (62088102, 61831014, 61531014, 62071272, 61927802 and 6181001011), the Beijing Municipal Science & Technology Commission (Z181100003118014), the National Key Research and Development Program of China (2020AAA0130000), and the Shenzhen Science and Technology Project under Grant (ZDYBH201900000002 and JCYJ20180508152042002). J.W. and H.Q. were also funded by the National Postdoctoral Program for Innovative Talent and Shuimu Tsinghua Scholar Program.

Author information

Authors and Affiliations



Q. D., H. W., L. F., and X. Li conceived this project. Q. D., H. W., and L. F. supervised this research. X. Li and G. Z. designed detailed implementations and processed the data. X. Li designed and set up the imaging system. X. Li and G. Z. conducted the experiments. G. Z. developed the Python code and the Fiji plugin. J. W., Y. Z., and X. Lin directed the experiments and data analysis. L. F., Y. Z., Z. Z., H. Q., and H. X. provided critical support on system setup and imaging procedure. J. W., L. F., Y. Z., X. Lin, H. Q., H. X., H. W., and Q. D. participated in critical discussions about the results. All authors participated in the writing of the paper.

Corresponding authors

Correspondence to Haoqian Wang, Lu Fang or Qionghai Dai.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Peer review information Nature Methods thanks Jaakko Lehtinen, Adam Packer and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Nina Vogt was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Network architecture.

Our model adopted the 3D U-Net (ref. 20), which is composed of a 3D encoder module, a 3D decoder module, and three skip connections from the encoder module to the decoder module. In the encoder module, there are three encoder blocks. Each block consists of two 3 × 3 × 3 convolutional layers followed by a leaky rectified linear unit (LeakyReLU), a group normalization layer, and a 2 × 2 × 2 max pooling layer with a stride of 2 in all three dimensions. In the decoder module, there are three decoder blocks, each of which contains two 3 × 3 × 3 convolutional layers followed by a LeakyReLU, a group normalization layer, and a 3D nearest-neighbor interpolation. The skip connections pass feature maps from the encoder module to the decoder module to integrate low-level and high-level features. Feature maps of the encoder module and the decoder module are represented in different colors. All operations are in 3D and all feature maps are 4D tensors. 3D (c, t, x) feature maps are drawn here to simplify the representation.
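The block structure described in this caption can be followed by tracking feature-map sizes through the network. The sketch below is only shape bookkeeping, not the authors' implementation. It assumes that the convolutions are padded so they preserve size, that each skip connection is taken before pooling, that pooling floors odd sizes, and that the nearest-neighbor interpolation resizes to the size of the matching skip connection. Channel counts are not stated in the caption and are left out. The function name `trace_unet_shapes` is hypothetical.

```python
def trace_unet_shapes(shape, levels=3):
    """Trace (t, y, x) feature-map sizes through a three-level 3D U-Net.

    Assumptions (not stated in the caption): size-preserving convolutions,
    skips taken before pooling, floor division in the 2 x 2 x 2 pooling,
    and upsampling that resizes to the matching skip connection.
    """
    skips = []
    current = tuple(shape)
    for _ in range(levels):
        # Encoder block: convolutions keep the size; save the skip connection.
        skips.append(current)
        # 2 x 2 x 2 max pooling with stride 2 halves every dimension.
        current = tuple(max(1, s // 2) for s in current)
    bottleneck = current

    decoder = []
    for skip in reversed(skips):
        # Decoder block: nearest-neighbor interpolation back to the skip size.
        current = skip
        decoder.append(current)
    return skips, bottleneck, decoder


if __name__ == "__main__":
    skips, bottleneck, decoder = trace_unet_shapes((300, 64, 64))
    print("skip connections:", skips)
    print("bottleneck:", bottleneck)
    print("decoder outputs:", decoder)
```

Under these assumptions, a 64 × 64 × 300 input tile returns to its original size at the output, which is what a denoising network needs so that input and target tiles can be compared voxel by voxel.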

Extended Data Fig. 2 Data processing pipeline.

a, The training process. Raw data captured by the imaging system are organized in 3D (x, y, t) and saved as a temporal stack. The original noisy stack is partitioned into thousands of 3D sub-stacks (64 × 64 × 600 pixels) with about 25% overlap in each dimension. For temporal stacks with a small lateral size or a short recording period, sub-stacks can be randomly cropped from the original stack to augment the training set. Then, interlaced frames of each sub-stack are extracted to form two 3D tiles (64 × 64 × 300 pixels). One of them serves as the input and the other serves as the target for network training. b, Deployment of the pre-trained model. New recordings obtained with the imaging system are partitioned into 3D sub-stacks (64 × 64 × 300 pixels) with 25% overlap in each dimension. Then, pre-trained models are loaded into memory and the sub-stacks are fed directly into the model. Enhanced sub-stacks are output sequentially from the network, and the overlapping regions (both lateral and temporal) are trimmed from the output sub-stacks. The final enhanced stack is obtained by stitching all trimmed sub-stacks.
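The two data-preparation steps in this caption, overlapping partition and interlaced frame splitting, can be sketched in a few lines. This is a minimal illustration on index lists rather than image data, and it is not taken from the released code. The helper names `patch_starts` and `split_interlaced` are hypothetical, and the handling of the last patch (shifting it so it ends exactly at the stack border) is an assumption.

```python
def patch_starts(length, patch, overlap=0.25):
    """Start indices of patches of size `patch` along an axis of size `length`.

    Neighboring patches overlap by roughly `overlap` of the patch size.
    Assumption: the last patch is shifted back to end at the border.
    """
    if patch > length:
        raise ValueError("patch is larger than the axis")
    stride = max(1, int(patch * (1 - overlap)))
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        starts.append(length - patch)
    return starts


def split_interlaced(frames):
    """Split a temporal sub-stack into two interlaced tiles.

    Even-indexed frames form one tile and odd-indexed frames the other,
    so a 600-frame sub-stack yields two 300-frame tiles.
    """
    return frames[0::2], frames[1::2]


if __name__ == "__main__":
    # Lateral axis of 512 pixels, 64-pixel patches, about 25% overlap.
    print(patch_starts(512, 64))
    # Frame indices of one 600-frame sub-stack.
    input_tile, target_tile = split_interlaced(list(range(600)))
    print(len(input_tile), len(target_tile))
```

Adjacent frames share nearly the same underlying signal but carry independent noise, which is why one interlaced tile can serve as the training target for the other.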

Extended Data Fig. 3 Interpretability of DeepCAD model.

To demonstrate the interpretability and reliability of our pre-trained DeepCAD model, a small 3D patch (64 × 64 × 300 pixels) was fed into the model and feature maps of the convolutional layers were visualized (ref. 33). Scale bar, 20 μm. Example feature maps of three intermediate convolutional layers in the decoder module (Layer 10, Layer 12, and Layer 14) are shown here, displayed as the average intensity projection (AVG) of the original 3D feature maps. The feature representations learned by DeepCAD have substantial semantic meaning, such as soma-like structures, cytoplasm-like structures, and vessel-like structures (or shadows). These interpretable semantic representations would contribute to locating neurons, restoring cytoplasmic fluorescence, and avoiding unwanted intensity fluctuations in vascular regions.

Extended Data Fig. 4 SNR improvement of calcium traces after denoising.

a, Trace SNR before and after denoising. Calcium traces (N = 107) were divided into three groups according to input SNR (36 low-SNR traces, 37 medium-SNR traces, 34 high-SNR traces). Quantitatively, low-SNR traces are those with SNR < −8.10 dB, medium-SNR traces are those with −8.10 dB ≤ SNR < −4.71 dB, and high-SNR traces are those with SNR ≥ −4.71 dB. b, The distribution of trace SNR before and after denoising (N = 36 for low-SNR, N = 37 for medium-SNR, N = 34 for high-SNR). c, SNR improvements at different input SNR levels (N = 36 for low-SNR, N = 37 for medium-SNR, N = 34 for high-SNR). The trace SNR was calculated by $10\log\left(\lVert x\rVert_2 / \lVert y - x\rVert_2\right)$, where x is the normalized calcium trace and y is the corresponding normalized noise-free trace estimated by MLspike (ref. 27). Boxplots were plotted in standard Tukey box-and-whisker plot format with outliers indicated with small black dots.
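The trace SNR and the three input groups in this caption can be written out as a short calculation. The sketch below reads the expression in the caption as ten times the base-10 logarithm of the ratio of the L2 norm of x to the L2 norm of y − x. Both the logarithm base and the norm type are assumptions made here because the caption reports values in dB. The function names `trace_snr_db` and `snr_group` are hypothetical.

```python
import math


def trace_snr_db(x, y):
    """Trace SNR in dB for a normalized trace x and a noise-free estimate y.

    Assumed reading of the caption: 10 * log10(||x||_2 / ||y - x||_2).
    """
    signal = math.sqrt(sum(v * v for v in x))
    noise = math.sqrt(sum((b - a) ** 2 for a, b in zip(x, y)))
    if noise == 0:
        return math.inf
    if signal == 0:
        return -math.inf
    return 10 * math.log10(signal / noise)


def snr_group(snr_db):
    """Assign a trace to the low, medium, or high input-SNR group."""
    if snr_db < -8.10:
        return "low"
    if snr_db < -4.71:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Toy two-sample trace whose residual norm is ten times its own norm.
    snr = trace_snr_db([1.0, 0.0], [1.0, 10.0])
    print(round(snr, 2), snr_group(snr))
```

The group boundaries in `snr_group` are the thresholds given in the caption. The toy trace in the usage example is made up for illustration only.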

Source data

Extended Data Fig. 5 DeepCAD reduces the error rate of spike inference at different input SNRs.

a, The error rate (ER) of raw data and DeepCAD enhanced data at different input SNRs. b, The decrements of ER at different input SNRs. Sample size: N = 36 for low-SNR, N = 37 for medium-SNR, N = 34 for high-SNR. Boxplots were plotted in standard Tukey box-and-whisker plot format with outliers indicated with small black dots.

Source data

Extended Data Fig. 6 Timing jitters of inferred spikes relative to real spikes before and after denoising.

a, Boxplots showing the distribution of timing jitters relative to real spikes (electrophysiology) of all inferred spike pairs before (N = 2031) and after (N = 2574) denoising. b, Histograms showing the probability distributions of timing jitters before and after denoising. The two probability distributions were verified to be equivalent by Kolmogorov–Smirnov test (one-sided, P ≤ 0.01, N = 2031 for raw data, N = 2574 for DeepCAD enhanced). c, Distributions of timing jitters at different input noise levels (Raw data, N = 326 for low-SNR, N = 689 for medium-SNR, N = 1016 for high-SNR; DeepCAD enhanced, N = 545 for low-SNR, N = 880 for medium-SNR, N = 1149 for high-SNR). d, Distributions of timing jitters at different baseline spike rates (Raw data, N = 663 for low spike rate, N = 766 for medium spike rate, N = 602 for high spike rate; DeepCAD enhanced, N = 1095 for low spike rate, N = 837 for medium spike rate, N = 642 for high spike rate). Baseline spike rates were calculated with 2 s binning time. All timing jitters were divided into three groups, that is, low spike rate (baseline spike rate ≤ 2.0 spk/s), medium spike rate (2.0 spk/s < baseline spike rate ≤ 3.5 spk/s), and high spike rate (baseline spike rate > 3.5 spk/s). These timing jitters were caused by the spike inference algorithm. Boxplots were plotted in standard Tukey box-and-whisker plot format with outliers indicated with small black dots.
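The grouping by baseline spike rate in panel d can be illustrated as follows. The caption states only the 2 s binning time and the group boundaries, so the exact rate definition used here (spike count per bin divided by the bin width) is an assumption. The names `binned_spike_rates` and `spike_rate_group` are hypothetical, and the spike times in the usage example are made up.

```python
import math


def binned_spike_rates(spike_times, duration, bin_s=2.0):
    """Spike rate (spk/s) in consecutive bins of width `bin_s` seconds.

    Assumption: rate = spike count in the bin / bin width. If `duration`
    is not a multiple of `bin_s`, the last bin is only partly covered.
    """
    n_bins = int(math.ceil(duration / bin_s))
    counts = [0] * n_bins
    for t in spike_times:
        if 0 <= t < duration:
            counts[int(t // bin_s)] += 1
    return [c / bin_s for c in counts]


def spike_rate_group(rate):
    """Group a baseline spike rate using the boundaries in the caption."""
    if rate <= 2.0:
        return "low"
    if rate <= 3.5:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Made-up spike times in seconds over a 6 s recording.
    spikes = [0.1, 0.5, 1.9, 2.2, 3.0, 3.1, 3.5, 3.9, 4.0]
    for rate in binned_spike_rates(spikes, duration=6.0):
        print(rate, spike_rate_group(rate))
```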

Source data

Extended Data Fig. 7 Simultaneous low-SNR and high-SNR two-photon imaging system.

Microscope set-up for simultaneous acquisition of high-SNR and low-SNR calcium imaging data. Ti:sapp: titanium-sapphire laser with tunable wavelength; HWP: half-wave plate; EOM: electro-optic modulator; M1: mirror; L1, L2, L3, L4, L5, L6, L7, L8, L9: lenses; Scanner: galvo-resonant scanners; DM: long-pass dichroic mirror to separate fluorescence signals (green path) from excitation light (red path); BS: 1:9 (reflectance:transmission) non-polarizing plate beam splitter; PMT1, PMT2: photomultiplier tubes. Fluorescence signals were split into a low-SNR (~10%) component and a high-SNR (~90%) component and were synchronously detected by two PMTs.

Extended Data Fig. 8 System calibration.

a, Example frames captured by the low-SNR detection path (left) and the high-SNR detection path (right). There were 15 isolated fluorescent beads (1 μm diameter) in the field of view (FOV). b, Average projection of 500 continuously acquired frames. Scale bar, 50 μm. c, Intensity profiles (normalized to the maximum of high-SNR recording) along the red dashed lines in b. d, The intensity ratios (high-SNR relative to low-SNR) of all 15 fluorescent beads. Each point represents one bead. The average intensity ratio is 10.4 (blue dashed line).

Source data

Extended Data Fig. 9 Human inspection of segmentation results.

a, Left: manually annotated neuron borders. The standard deviation projection served as the background image. Right: manually annotated segmentation masks. Scale bar, 100 μm. b, Left: segmentation masks of the low-SNR recording. Right: segmentation masks of the DeepCAD enhanced recording. The constrained nonnegative matrix factorization (CNMF) algorithm (refs. 24, 32) was used as the segmentation method. c, Magnified view of the blue boxed region showing the segmentation of three neurons. d, Magnified view of the red boxed region showing the segmentation results of five neurons.

Extended Data Fig. 10 Cross-system validation.

Denoising performance of DeepCAD on three two-photon laser-scanning microscopes (2PLSMs) with different system setups. Our system was equipped with alkali PMTs (PMT1001, Thorlabs) and a 25×/1.05 NA commercial objective (XLPLN25XWMP2, Olympus). The standard 2PLSM was equipped with a GaAsP PMT (H10770PA-40, Hamamatsu) and a 25×/1.05 NA commercial objective (XLPLN25XWMP2, Olympus). The two-photon mesoscope was equipped with a GaAsP PMT (H11706-40, Hamamatsu) and a 2.3×/0.6 NA custom objective. The same pre-trained model was used for processing these data. All scale bars represent 100 μm.

Supplementary information

Supplementary Information

Supplementary Figures 1–19, Table 1, and Notes 1–3.

Reporting Summary

Supplementary Video 1

Denoising performance of DeepCAD on single-neuron recordings. The top panel shows simultaneous electrophysiological recording of the neuron, which reflects the ground-truth neural activity. Detected spikes are marked with black dots. The original noisy data and DeepCAD enhanced data are shown in the middle panel and the bottom panel, respectively.

Supplementary Video 2

DeepCAD enhanced the neurite activity data from cortical layer 1 of a mouse expressing GCaMP6f. The low-SNR recording, DeepCAD enhanced recording, and the high-SNR recording are played synchronously. Below are magnified views of the boxed regions. The bottom panel shows fluorescence traces extracted from 40 dendritic pixels.

Supplementary Video 3

From left to right are the low-SNR recording of spontaneous calcium transients of a large neuronal population (layer 2/3, GCaMP6f), the DeepCAD enhanced counterpart, and corresponding high-SNR recording, respectively. The bottom panel shows magnified views of boxed regions.

Supplementary Video 4

Denoising performance of DeepCAD on three two-photon laser-scanning microscopes (2PLSMs) with different system setups. Our system was equipped with alkali PMTs (PMT1001, Thorlabs) and a ×25/1.05 NA commercial objective (XLPLN25XWMP2, Olympus). The standard 2PLSM was equipped with a GaAsP PMT (H10770PA-40, Hamamatsu) and a ×25/1.05 NA commercial objective (XLPLN25XWMP2, Olympus). The two-photon mesoscope was equipped with a GaAsP PMT (H11706-40, Hamamatsu) and a ×2.3/0.6 NA custom objective. The same pre-trained model was used for denoising these data.

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 6

Statistical source data.

Source Data Extended Data Fig. 8

Statistical source data.

About this article

Cite this article

Li, X., Zhang, G., Wu, J. et al. Reinforcing neuron extraction and spike inference in calcium imaging using deep self-supervised denoising. Nat Methods 18, 1395–1400 (2021).
