Spatial redundancy transformer for self-supervised fluorescence image denoising

Fluorescence imaging with high signal-to-noise ratios has become the foundation of accurate visualization and analysis of biological phenomena. However, the inevitable noise poses a formidable challenge to imaging sensitivity. Here we provide the spatial redundancy denoising transformer (SRDTrans) to remove noise from fluorescence images in a self-supervised manner. First, a sampling strategy based on spatial redundancy is proposed to extract adjacent orthogonal training pairs, which eliminates the dependence on high imaging speed. Second, we designed a lightweight spatiotemporal transformer architecture to capture long-range dependencies and high-resolution features at low computational cost. SRDTrans can restore high-frequency information without producing oversmoothed structures or distorted fluorescence traces. Finally, we demonstrate the state-of-the-art denoising performance of SRDTrans on single-molecule localization microscopy and two-photon volumetric calcium imaging. SRDTrans does not rely on any assumptions about the imaging process or the sample, and thus can be easily extended to various imaging modalities and biological applications.

The referees' reports seem to be quite clear. Naturally, we will need you to address all of the points raised.
While we ask you to address all of the points raised, the following points need to be substantially worked on:
- Please provide clear comparisons between DeepCAD and SRDTrans.
- As indicated by Reviewer #1, SRDTrans is primarily designed to address challenges in high-speed imaging. Please discuss whether, by foregoing temporal averaging, its SNR might be inferior to that of DeepCAD.
- Please show the impact on network performance if two or four pixels are randomly selected to form sub-stacks for training instead.
- Please discuss how inference is performed in the actual experiment.
- Please ensure and verify the generalization ability of the network.
- As indicated by Reviewer #2, please provide further examples of the application of SRDTrans to alternative modalities.
- As requested by Reviewer #3, please discuss whether orthogonal selection in the spatial domain is necessary.
- Please provide a description of the structure of the spatiotemporal transformer block.
Please use the following link to submit your revised manuscript and a point-by-point response to the referees' comments (which should be in a separate document to any cover letter): [REDACTED] ** This URL links to your confidential homepage and associated information about manuscripts you may have submitted or be reviewing for us. If you wish to forward this e-mail to co-authors, please delete this link to your homepage first. ** To aid in the review process, we would appreciate it if you could also provide a copy of your manuscript files that indicates your revisions by making use of Track Changes or similar mark-up tools. Please also ensure that all correspondence is marked with your Nature Computational Science reference number in the subject line.
In addition, please make sure to upload a Word Document or LaTeX version of your text, to assist us in the editorial stage.
To improve transparency in authorship, we request that all authors identified as 'corresponding author' on published papers create and link their Open Researcher and Contributor Identifier (ORCID) with their account on the Manuscript Tracking System (MTS), prior to acceptance. ORCID helps the scientific community achieve unambiguous attribution of all scholarly contributions. You can create and link your ORCID from the home page of the MTS by clicking on 'Modify my Springer Nature account'. For more information, please visit www.springernature.com/orcid.

We hope to receive your revised paper within three weeks. If you cannot send it within this time, please let us know.
We look forward to hearing from you soon.
Best regards,
Ananya Rastogi, PhD
Senior Editor
Nature Computational Science

Reviewers' comments:

Reviewer #1 (Remarks to the Author):

Fluorescence imaging is constrained by its limited photon budget, a critical factor that hampers its potential. Employing deep learning techniques for denoising has emerged as one of the promising solutions to this challenge. Several algorithms have been proposed in the past, including the authors' own DeepCAD and DeepCAD-RT. These algorithms, as presented, rely heavily on the continuity of pixels between frames. Consequently, if the information within the image changes rapidly between frames, these algorithms tend to err. My tests have confirmed that they tend to force pixel values to approximate between frames. Beyond the authors' contributions, several other algorithms, especially those based on CNNs, have been proposed in the field. Each of these, to varying degrees, has its limitations, such as the intricate parameter tuning required for non-self-supervised algorithms.
The primary motivation behind this paper appears to be addressing the challenges faced by DeepCAD, especially in high-speed imaging scenarios. When there is significant variation between frames, an over-reliance on temporal pixel continuity becomes problematic. However, the spatial continuity of pixels remains a consistent feature. This paper capitalizes on this aspect, integrating a spatial redundancy sampling strategy with a lightweight spatiotemporal transformer architecture. SRDTrans holds significant promise for fluorescence microscopic image denoising, particularly in applications like single-molecule localization microscopy and two-photon volumetric calcium imaging.
However, the authors should clearly delineate the scenarios where DeepCAD might be more appropriate versus those where SRDTrans shines. A direct comparison, backed by metrics like SNR, would be beneficial. It is evident that SRDTrans is primarily designed to address challenges in high-speed imaging, but by foregoing temporal averaging, is there a possibility that its SNR might be inferior to DeepCAD's? Further, if both DeepCAD and SRDTrans have their respective strengths, is there potential for an integrated algorithm? For biologists, choosing between algorithms can be daunting. For instance, in calcium imaging, where the calcium signal response time hovers around 1 second, it would be valuable to know which algorithm to opt for based on the sampling rate relative to the response time.
Reviewer #1 (Remarks on code availability):

1. In the spatial redundancy sampling scheme, the authors randomly select three adjacent pixels from 2x2 small patches to form three sub-stacks. In these small patches, any two pixels can be considered adjacent. What would be the impact on network performance if two or four pixels were randomly selected to form sub-stacks for training instead?
2. Why are there two MSA (Multi-Head Self-Attention) layers in the STB (Spatial Temporal Block)? Could their effects overlap, potentially rendering the second MSA layer less effective? This question is especially pertinent considering there are eight heads in each MSA layer.
3. How is inference performed in the actual experiment? What type of data is used to train the network? Is it necessary to retrain the network when the experimental conditions (such as imaging targets or speeds) change?
4. Given the lightweight design of the network structure and the absence of a pooling layer, is there a risk of overfitting during network training? How can the generalization ability of the network be ensured and verified?
5. The L2 loss function tends to produce smoother network fitting results. Would employing the L1 loss function instead have a significant effect on the network's ability to fit more high-frequency information?
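For context on the reviewer's second question, the following is a minimal, hypothetical sketch of how two MSA layers can play non-overlapping roles by attending over different axes (time first, then space). This is not the authors' actual STB code; the class name, layer arrangement, and dimensions are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class ToySpatioTemporalBlock(nn.Module):
    """Hypothetical sketch (not the authors' STB implementation): one MSA
    layer attends along the time axis and a second MSA layer attends along
    the spatial axis, so their roles need not overlap even with 8 heads each."""

    def __init__(self, dim=64, heads=8):
        super().__init__()
        self.temporal_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.spatial_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):
        # x: (batch, time, tokens, dim), where tokens index spatial patches.
        b, t, n, d = x.shape
        # First MSA: each spatial token attends across frames (temporal axis).
        xt = x.permute(0, 2, 1, 3).reshape(b * n, t, d)
        h = self.norm1(xt)
        xt = xt + self.temporal_attn(h, h, h)[0]
        x = xt.reshape(b, n, t, d).permute(0, 2, 1, 3)
        # Second MSA: tokens within each frame attend to one another (spatial axis).
        xs = x.reshape(b * t, n, d)
        h = self.norm2(xs)
        xs = xs + self.spatial_attn(h, h, h)[0]
        return xs.reshape(b, t, n, d)

# Example: block = ToySpatioTemporalBlock(); y = block(torch.randn(1, 8, 16, 64))
```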
Reviewer #2 (Remarks to the Author):

This article demonstrates "SRDTrans", a self-supervised deep learning method for denoising time-lapse microscopy data. The novelty of creating a network that utilizes a spatiotemporal transformer architecture allows significant improvement over U-Net-based architectures, which by their nature tend to over-smooth high-frequency information. A related benefit is that the lightweight nature of the transformer allows high-resolution information to be retained at relatively low computational cost. Similarly, by employing a transformer model, the network does not rely on similarities between temporally adjacent frames, and hence can be utilized on relatively fast-moving imaging modalities such as calcium-flux imaging or single-molecule localization.
The authors show impressive denoising results on a variety of simulated and "real" (for lack of a better word) microscopy data, not only demonstrating qualitative and quantitative improvements in the images, but also noting the improvement in the reliability of downstream analysis of the denoised data through SMLM point detection accuracy and calcium flux traces.
The article also directly compares SRDTrans to a range of other self-supervised denoising methods, mostly in the supplementary information. I'd be interested to see probabilistic Noise2Void (https://www.frontiersin.org/articles/10.3389/fcomp.2020.00005/full) tested alongside these other methods. While it does use a CNN and I suspect it will have the resolution degradation associated with its CNN architecture, it does take some temporal information into account. I grant that it also requires a separate training dataset and so may not technically fall within the self-supervised category, but the collection of such a dataset is usually trivial to do at the time of data acquisition.
While I find the application of SRDTrans to both SMLM and multiphoton calcium imaging compelling, the authors note the applicability of the network to other microscopy techniques, as there are no underlying assumptions about sample dynamics or imaging speed. Without wanting to place undue burden on the authors, I'd welcome any further examples of the application of SRDTrans to alternative modalities -

Reviewer #2 (Remarks on code availability):

The code is well documented on the GitHub repo. I was able to test both training and inference on my own experimental data with minimal alteration. I would suggest altering some of the filepath definitions in test.py and train.py to use os.path.join instead of string concatenation, as I ran into some issues potentially due to OS differences. I also had to manually define the model pth_name variable in test.py, as for a reason I couldn't determine the code was including "49" in the model name (I suspect due to the default checkpoint index, but I didn't have time to debug more fully).

2. As one of the main innovation points, the structure of the spatiotemporal transformer block needs a detailed description. I had a glance at the provided code: in SpatioTemporalTrans, it executes timeTrans first and then spatialTrans. It would be better to describe these two steps. spatialTrans contains a SwinTransformerBlock, which may need a citation. Regarding lines 108-110 of the manuscript, what makes the proposed transformer lightweight? Fewer layers or fewer convolution kernels? For the position embedding layer, does it require the input size to be fixed, or can it deal with arbitrary sizes?

3. In Supplementary Figure 5, a large input temporal scale can provide better results on 30 Hz data. This result is impressive. How about in the 1 Hz condition (I see in line 364 that you have 1 Hz synthesized data)? Will a large temporal scale be harmful? This can help users determine how many timepoints it is proper to feed into the network.

8. The background of SRDTrans in Fig. 2a looks very clean, but when I applied both the self-trained model and the provided pretrained model (Pretrained_Model_for_noise_200Hz_2400frames_pxlsize30nm_-0.05dBSNR_24000x328x328.pth) on cropped noisy data (noise_200Hz_2400frames_pxlsize30nm_-0.05dBSNR_24001x328x328.tif), the results are not good. I think it is necessary to clarify how to preprocess or post-process these data.

Some minor comments:
1. In line 118, it is better to point out which "typical deep layers" are meant, after the "STB" or the last layer.
2. In lines 136-137, judging from Supplementary Video 1, DeepCAD provides better visual quality than SRDCNN, so it is difficult to be convinced that "spatial redundancy sampling is more reasonable". I think Supplementary Table 3 can

Thank you for submitting your revised manuscript "Spatial redundancy transformer for self-supervised fluorescence image denoising" (NATCOMPUTSCI-23-0725A). It has now been seen by the original referees and their comments are below. The reviewers find that the paper has improved in revision, and therefore we will be happy in principle to publish it in Nature Computational Science, pending minor revisions to satisfy the referees' final requests and to comply with our editorial and formatting guidelines.
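Regarding the filepath suggestion in the code-availability remarks above, here is a minimal sketch of replacing string concatenation with os.path.join. The variable names are illustrative, not the repository's actual ones in train.py or test.py.

```python
import os

# Hypothetical names used only to illustrate the reviewer's suggestion.
datasets_dir = "datasets"
datasets_name = "simulated_SMLM"

# Fragile across operating systems: hard-coded separators.
fragile_path = datasets_dir + "//" + datasets_name

# Portable alternative: os.path.join inserts the correct separator on any OS.
portable_path = os.path.join(datasets_dir, datasets_name)
print(portable_path)
```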
We are now performing detailed checks on your paper and will send you a checklist detailing our editorial and formatting requirements in about a week. Please do not upload the final materials and make any revisions until you receive this additional information from us.
TRANSPARENT PEER REVIEW
Nature Computational Science offers a transparent peer review option for original research manuscripts. We encourage increased transparency in peer review by publishing the reviewer comments, author rebuttal letters and editorial decision letters if the authors agree. Such peer review material is made available as a supplementary peer review file. Please remember to choose, using the manuscript system, whether or not you want to participate in transparent peer review. Please note: we allow redactions to authors' rebuttal and reviewer comments in the interest of confidentiality. If you are concerned about the release of confidential data, please let us know specifically what information you would like to have removed. Please note that we cannot incorporate redactions for any other reasons. Reviewer names will be published in the peer review files if the reviewer signed the comments to authors, or if reviewers explicitly agree to release their name. For more information, please refer to our FAQ page at https://www.nature.com/documents/nrtransparent-peer-review.pdf.

Message:
Dear Professor Dai,

We are pleased to inform you that your Article "Spatial redundancy transformer for self-supervised fluorescence image denoising" has now been accepted for publication in Nature Computational Science.
Once your manuscript is typeset, you will receive an email with a link to choose the appropriate publishing options for your paper and our Author Services team will be in touch regarding any additional information that may be required.
Please note that Nature Computational Science is a Transformative Journal (TJ).
Authors may publish their research with us through the traditional subscription access route or make their paper immediately open access through payment of an article-processing charge (APC). Authors will not be required to make a final decision about access to their article until it has been accepted. If you have any questions about our publishing options, costs, Open Access requirements, or our legal forms, please contact ASJournals@springernature.com. Acceptance of your manuscript is conditional on all authors' agreement with our publication policies (see https://www.nature.com/natcomputsci/for-authors). In particular, your manuscript must not be published elsewhere and there must be no announcement of the work to any media outlet until the publication date (the day on which it is uploaded onto our web site).
Before your manuscript is typeset, we will edit the text to ensure it is intelligible to our wide readership and conforms to house style. We look particularly carefully at the titles of all papers to ensure that they are relatively brief and understandable.
Once your manuscript is typeset, you will receive a link to your electronic proof via email with a request to make any corrections within 48 hours. If, when you receive your proof, you cannot meet this deadline, please inform us at rjsproduction@springernature.com immediately.
If you have queries at any point during the production process, please contact the production team at rjsproduction@springernature.com. Once your paper has been scheduled for online publication, the Nature press office will be in touch to confirm the details.
Content is published online weekly on Mondays and Thursdays, and the embargo is set at 16:00 London time (GMT)/11:00 am US Eastern time (EST) on the day of publication. If you need to know the exact publication date or when the news embargo will be lifted, please contact our press office after you have submitted your proof corrections. Now is the time to inform your Public Relations or Press Office about your paper, as they might be interested in promoting its publication. This will allow them time to prepare an accurate and satisfactory press release. Include your manuscript tracking number NATCOMPUTSCI-23-0725B and the name of the journal, which they will need when they contact our office.
About one week before your paper is published online, we shall be distributing a press release to news organizations worldwide, which may include details of your work. We are happy for your institution or funding agency to prepare its own press release, but it must mention the embargo date and Nature Computational Science. Our Press Office will contact you closer to the time of publication, but if you or your Press Office have any inquiries in the meantime, please contact press@nature.com.
An online order form for reprints of your paper is available at https://www.nature.com/reprints/author-reprints.html. All co-authors, authors' institutions and authors' funding agencies can order reprints using the form appropriate to their geographical region.
We welcome the submission of potential cover material (including a short caption of around 40 words) related to your manuscript; suggestions should be sent to Nature Computational Science as electronic files (the image should be 300 dpi at 210 x 297 mm in either TIFF or JPEG format). We also welcome suggestions for the Hero Image, which appears at the top of our homepage (http://www.nature.com/natcomputsci); these should be 72 dpi at 1400 x 400 pixels in JPEG format. Please note that such pictures should be selected more for their aesthetic appeal than for their scientific content, and that colour images work better than black and white or grayscale images. Please do not try to design a cover with the Nature Computational Science logo etc., and please do not submit composites of images related to your work. I am sure you will understand that we cannot make any promise as to whether any of your suggestions might be selected for the cover of the journal.
You can now use a single sign-on for all your accounts, view the status of all your manuscript submissions and reviews, access usage statistics for your published articles and download a record of your refereeing activity for the Nature journals.
To assist our authors in disseminating their research to the broader community, our SharedIt initiative provides you with a unique shareable link that will allow anyone (with or without a subscription) to read the published article. Recipients of the link with a subscription will also be able to download and print the PDF.
As soon as your article is published, you will receive an automated email with your shareable link.
We look forward to publishing your paper.

I would think light sheet and/or SIM might be of particular interest to the field. That said, I do not believe the work suffers from the absence of further examples, as those presented are sufficiently impressive. I wish to commend the authors for their commitment to open science with the open-sourcing of their code, which is well documented and easy to follow, as well as the availability of their training datasets and pre-trained models. I was able to test both training and inference on SMLM datasets with minimal alteration and look forward to further testing on my own datasets in the near future.
Reviewer #3 (Remarks to the Author):

This manuscript by Xinyang and colleagues presents SRDTrans, a self-supervised-learning-based framework that removes noise from fluorescence time-lapse images. They provide a novel sampling strategy based on spatial redundancy to generate the training datasets, avoiding the high dependency on quick imaging. A lightweight deep learning architecture is proposed to restore high-frequency information without producing over-smoothed structures. SRDTrans enables low-speed imaging of fast biological activities over a wide range of imaging SNRs. Image denoising is an important problem in image processing and computer vision. The spectral bias problem of CNNs is a main limitation in self-supervised denoising tasks. This manuscript utilizes a transformer to capture global spatiotemporal information, which can effectively solve this problem. Both the synthetic and experimental results provided are impressive and convey that this is meaningful work. Although the manuscript provides many illustrations, some statements and network characteristics remain unclear, and the statistical analyses should be further improved both in the figures (error bars should be shown in the main SNR figures), figure legends and tables. Below I list my major comments:

1. In Fig. 1a, the spatial redundancy sampling strategy utilizes orthogonal masks in the spatial domain to generate the training datasets, and each timepoint (time domain) utilizes different masks. I wonder whether the orthogonal selection in the spatial domain is necessary. In each 2x2 cell of the raw images, can two neighboring pixels be randomly chosen and categorized into two sub-images (just as in the Neighbor2Neighbor manner)? Or use the extra pixel in the 2x2 cell and generate four sub-sampled images, where one is selected as the training input and the other sub-stacks are designated as the corresponding training targets.
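As a rough illustration of the random-neighbor alternative the reviewer describes, the sketch below subsamples each 2x2 cell in a Neighbor2Neighbor-like way. It is not the SRDTrans orthogonal-mask implementation; the function name is hypothetical and, for simplicity, only the two top-row pixels of each cell are considered.

```python
import numpy as np

def random_neighbor_substacks(stack, rng=None):
    """Toy sketch: in every 2x2 cell of every frame, pick one of the two
    top-row pixels at random as the "input" and its horizontal neighbor as
    the "target", yielding two half-resolution sub-stacks.
    `stack` has shape (T, H, W) with even H and W."""
    rng = np.random.default_rng() if rng is None else rng
    t, h, w = stack.shape
    cells = stack.reshape(t, h // 2, 2, w // 2, 2)   # (T, H/2, 2, W/2, 2)
    top = cells[:, :, 0, :, :]                       # top row of each 2x2 cell
    pick = rng.integers(0, 2, size=(t, h // 2, w // 2))
    sub_input = np.take_along_axis(top, pick[..., None], axis=-1)[..., 0]
    sub_target = np.take_along_axis(top, (1 - pick)[..., None], axis=-1)[..., 0]
    return sub_input, sub_target
```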
4. In line 279, it is said that SRDTrans does not rely on any assumptions about the noise model. In the experiments, the authors added mixed Gaussian-Poisson noise to the data, but I still wonder which noise (Gaussian or Poisson) has a more severe influence on the SRDTrans results.
5. In lines 247-248, the authors applied SRDTrans to volumetric recording; it would be better to provide the detailed process. Do they apply spatial redundancy sampling to xyz-t data and change the network structure?
6. If possible, the experimentally obtained data need a high-SNR reference to make the SRDTrans results convincing (e.g., in Fig. 3b).
7. It is necessary for the authors to comment on any failure cases of SRDTrans or artifacts after recovery, or to show the stability and generalization ability of SRDTrans.
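For the noise question in point 4, the following is a common way to synthesize mixed Poisson-Gaussian noise so that the two components can be probed separately; it is given as an assumption, and the paper's actual simulation parameters and pipeline may differ.

```python
import numpy as np

def add_mixed_poisson_gaussian(clean, photons_per_unit=10.0, read_sigma=2.0, rng=None):
    """Illustrative noise synthesis: signal-dependent shot noise (Poisson)
    plus signal-independent read noise (Gaussian). Either term can be zeroed
    out to test which component degrades the denoising result more."""
    rng = np.random.default_rng() if rng is None else rng
    shot = rng.poisson(np.clip(clean, 0, None) * photons_per_unit) / photons_per_unit
    read = rng.normal(0.0, read_sigma, size=clean.shape)
    return shot + read
```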

Authors may need to take specific actions to achieve compliance (https://www.springernature.com/gp/open-research/funding/policy-compliance-faqs) with funder and institutional open access mandates.
If your research is supported by a funder that requires immediate open access (e.g. according to Plan S principles: https://www.springernature.com/gp/open-research/plan-scompliance) then you should select the gold OA route, and we will direct you to the compliant route where possible. For authors selecting the subscription publication route, the journal's standard licensing terms will need to be accepted, including self-archiving policies (https://www.springernature.com/gp/open-research/policies/journalpolicies). Those licensing terms will supersede any other terms that the author or any third party may assert apply to any version of the manuscript.