Hidden Markov modeling for maximum probability neuron reconstruction

Recent advances in brain clearing and imaging have made it possible to image entire mammalian brains at sub-micron resolution. These images offer the potential to assemble brain-wide atlases of neuron morphology, but manual neuron reconstruction remains a bottleneck. Several automatic reconstruction algorithms exist, but most focus on single-neuron images. In this paper, we present a probabilistic reconstruction method, ViterBrain, which combines a hidden Markov state process that encodes neuron geometry with a random field appearance model of neuron fluorescence. ViterBrain uses dynamic programming to compute the global maximizer of what we call the most probable neuron path. We applied our algorithm to imperfect image segmentations and showed that it can follow axons in the presence of noise or nearby neurons. We also provide an interactive framework in which users can trace neurons by fixing start and end points. ViterBrain is available in our open-source Python package brainlit.
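As an illustration of the dynamic-programming machinery the abstract refers to, a generic Viterbi maximizer for a discrete hidden Markov model can be sketched as follows. The state space, notation, and log-space formulation here are illustrative only, not ViterBrain's actual implementation:

```python
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    """Most probable state sequence for a discrete HMM.

    log_init:  (S,) log initial-state probabilities
    log_trans: (S, S) log transition probabilities
    log_emit:  (T, S) log emission probabilities of the observed sequence
    """
    T, S = log_emit.shape
    delta = log_init + log_emit[0]         # best log-prob of any path ending in each state
    back = np.zeros((T, S), dtype=int)     # backpointers to the best previous state
    for t in range(1, T):
        scores = delta[:, None] + log_trans   # scores[i, j]: come from i, move to j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emit[t]
    # trace the argmax path backwards through the backpointers
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

The same recursion underlies any "most probable path" computation: because the objective factors over consecutive pairs of states, the global maximum is found in O(T·S²) time rather than by enumerating all S^T paths.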

We truly appreciate the time and effort you invested in reading our submission and giving constructive feedback. We have tried to address all of the suggestions. Notably, we expanded our comparison to the state of the art by using more algorithms, a larger dataset, and a more thorough statistical analysis of the results.
We also overhauled several sections in the Results and Methods in order to make the writing clearer and more concise. Below is a point-by-point reproduction of your review comments, each followed by our corresponding responses.
Thank you for considering our updated manuscript.

This paper presents a probabilistic model, named ViterBrain, for neuron reconstruction. The authors integrate a newly designed hidden Markov model that directly encodes the neuron geometry with appearance models of neuron fluorescence images. The proposed idea sounds interesting, and the research topic of better reconstructing neuron paths is of high interest to the neuroscience community. While the backbone of the developed methodology seems convincing, the current manuscript has a lot of room for improvement. Below are my major comments and suggestions:

• While the structure of this paper is well organized, many technical details are sloppily defined or written, and hence difficult for readers to follow. Some of the math notations (especially in Section 4.2) are either missing or defined after being used in equations (e.g., the notations $\alpha_0, \alpha_1, \alpha_k$).
o We overhauled our description of the methods. In particular, we:
§ Shortened our discussion and notation of the Poisson process underlying the imaging.
§ Moved all proofs to the supplement, but retained descriptions of the ideas behind the proofs.
§ Introduced the important variables ($\alpha_0, \alpha_1, \alpha_k, \alpha_d$) earlier, in the Results section.
• I am not sure whether it is appropriate to call the definition of the 'most probable solution' a 'Theorem'. I would at least expect a rigorous mathematical proof of how/why the proposed formulation can achieve a 'most probable solution'.
o We combined it with the "Proposition" and renamed it "Statement" (Statement 1 in Section 4.3). The rigorous proof is in Supplement Section S3 and is based on a recursive factoring.
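For readers curious about the recursive factoring mentioned above, the standard Viterbi-style argument (written here in generic HMM notation, not the manuscript's) shows why a 'most probable solution' exists and can be found by dynamic programming:

```latex
\delta_1(s) = p(s)\, p(x_1 \mid s), \qquad
\delta_t(s) = \max_{s'} \; \delta_{t-1}(s')\, p(s \mid s')\, p(x_t \mid s),
```

so that $\max_{s_{1:T}} p(s_{1:T}, x_{1:T}) = \max_s \delta_T(s)$, and backtracking the argmax at each step recovers the global maximizer.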
• For the estimation of foreground-background intensity distributions, I am wondering whether the authors have thoroughly validated the reliability/consistency of fitting Gaussian kernels to subsampled datasets. I assume different estimates of these intensity distributions will lead to different generations of fragments.
o Though our validation was mostly qualitative, we use a method that has been heavily studied in the applied mathematics literature. In particular, the method has been proven to be consistent in estimating arbitrary distributions (under some regularity conditions). We added details of the theoretical backing of our approach in Section 4.
• The validation of the proposed approach is less convincing for two major reasons. First, the authors select only two baseline algorithms (APP2 and Snake) for comparison out of twenty-six. Moreover, the two selected baseline methods are not able to provide a sufficient number of metrics for a fair comparison in Fig. 8. Second, it is not clear whether the results of all methods (on either the spatial distance metric or the Frechet distance metric) are statistically significantly different. A thorough statistical analysis would be necessary.
o We have significantly expanded our reconstruction experiment. We now compare against four baseline algorithms: APP2, GTree, Advantra, and Snake. There are indeed more algorithms in Vaa3D, but they are typically not thoroughly described (i.e., there is no accompanying publication that the reader can consult for troubleshooting), nor are they used or cited as frequently in the field as the ones we chose, so we limited our collection to five algorithms.
o We more than tripled the testing dataset, from 10 to 35 images. We also show box-and-whisker plots of the accuracy metrics of the successful reconstructions. We exclude the metrics of failed reconstructions because they would obscure the informative metrics.
§ See Figure 7.
o We also performed two-proportion z-tests to statistically compare the success rates of the different algorithms.
§ From Section 2.5: According to two-proportion z-tests, the success rate of ViterBrain (11/35) was higher than that of all other methods at $\alpha=0.05$. Also, APP2 had a higher success rate (4/35) than Advantra at $\alpha=0.05$.
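A minimal sketch of the pooled two-proportion z-test applied to the reported success counts; we assume a one-sided alternative here, which may differ from the exact test configuration in the manuscript:

```python
from math import erf, sqrt

def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns z and the one-sided p-value
    for the alternative that group 1's success rate is higher."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)              # common rate under the null
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 1 - 0.5 * (1 + erf(z / sqrt(2)))  # 1 - Phi(z), standard normal CDF
    return z, p_value

# ViterBrain (11/35 successes) vs. APP2 (4/35 successes)
z, p = two_proportion_ztest(11, 35, 4, 35)
```

For these counts the difference is significant at $\alpha=0.05$ (z ≈ 2.04, one-sided p ≈ 0.02).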
• What are the variations of the different metrics in Fig. 8?
o The new box-and-whisker plots should adequately convey the average and variation of these metrics.

Minor comments:

• The segmentation of the neuron node in Fig. 1 is surprisingly off. I would expect much better quality since the boundary of the node is quite clear. As the proposed model heavily depends on the quality of binary segmentation masks, I'd suggest the authors use better examples in Fig. 1.
o We modified Figure 1 to show a better segmentation. There are indeed still some visually obvious false negatives, which could be addressed with a less conservative binarization threshold. However, we maintain the conservative threshold for two reasons. First, a less conservative threshold would lead to more false positives in the background and would fuse together different neuronal processes (there is, of course, always a tradeoff). Second, we think it is more illustrative of our approach, which strings together distinct fragments. Our primary goal in Figure 1 is to describe the problem and our approach.
• It would be helpful to show the original images of a) and b) in Fig. 7 without any overlaid manual labels or estimated reconstructions.
o We added the original image of Figure 7b, which now appears in the supplement as Figure S2. We removed Figure 7a after some rearrangement in order to satisfy the figure limit.
• It would be good to add a brief description of the tested dataset (e.g., the number and resolution of the images) in the Results section.

Automatic neuron reconstruction is a challenging task in brain clearing and imaging. Although many automatic reconstruction algorithms have been proposed over the past decade, most focus on single-neuron images. This manuscript presents a probabilistic reconstruction method for multiple-neuron images, which combines a hidden Markov process encoding neuron geometry with a random field appearance model of neuron fluorescence. Moreover, the proposed method, ViterBrain, performed better than two selected baselines on imperfect image segmentations. It is also noteworthy that the authors of this manuscript shared their implementation code on the Internet. However, a few major or minor issues are listed as follows.
• What is the difference between this manuscript and an online preprint (arXiv:2106.02701) authored by the same authors? Perhaps the authors should mention the preprint in this manuscript.
o Yes, that arXiv post is a preprint version of this same article. Earlier, the arXiv version was not completely up to date, which may have caused some confusion. We have since updated that post to match this submitted version.
• The organization of Section 2 and Section 4 can be improved. Specifically, Subsection 2.1 could be moved to the beginning of Section 4 because it only presents the overview of ViterBrain and does not provide any actual result. Besides, since Figure 2 is relatively simple, the authors should describe how the ViterBrain components interact with other modules to perform a given task.
o Regarding Subsection 2.1, we are inclined to keep it where it is, since it describes the primary result of our work (the ViterBrain algorithm) and provides important background for the following sections. However, in order to make it more appropriate for the Results section, we adjusted its contents (e.g., we added a link to our Python package). This structure of opening the Results with an overview is in part inspired by other papers in the field, such as the G-Cut paper (Li et al. 2019, Nature Communications).
o We added some of the important equations and notation to Figure 2 in order to more closely connect the figure with the notation in the Results/Methods sections.
• What is the correlation of ViterBrain with image intensity modeling? For the reviewer's part, Subsection 2.2 appears to be less important than the following three subsections in Section 2. The authors should make the motivation of this result more precise.
o We worked on the clarity and precision of the writing in this section. The first important takeaway from this section is the evidence it provides for our conditional independence assumption on the observed image. Our model assumes that, for example, two different voxels that are already known to be background have independent image intensities. We added this motivation to Section 2.2.
§ From Section 2.2: Figure 3a shows the correlations of image intensities between voxels at varying distances of separation.
• There are 26 reconstruction algorithms available in Vaa3D (version 3.2), and the authors selected two baselines to compare with the proposed method. However, the two baselines were proposed many years ago. Could this manuscript introduce more recently published methods, primarily based on deep learning?
o We reference these methods in the Discussion as possible ways to supplement or improve our algorithm.
• The comparison among three reconstruction algorithms was conducted on a dataset of 10 partial axon reconstructions. For the reviewer's part, the experimental result is less convincing. Therefore, the authors should compare them on a larger dataset to demonstrate the effectiveness of the proposed method.
o We expanded the dataset to 35 partial axons.
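The correlation-versus-separation analysis quoted from Section 2.2 above can be sketched in one dimension with synthetic data. A moving average of white noise stands in for a smooth fluorescence profile; none of this is the manuscript's actual data:

```python
import numpy as np

def lag_correlation(signal, lag):
    """Pearson correlation between an intensity profile and a copy shifted by `lag`."""
    return float(np.corrcoef(signal[:-lag], signal[lag:])[0, 1])

rng = np.random.default_rng(0)
noise = rng.normal(size=10_000)
# A 5-sample moving average introduces short-range correlation, loosely
# mimicking the spatial smoothness of microscopy images.
smooth = np.convolve(noise, np.ones(5) / 5, mode="valid")

# Correlation decays with separation: high at lag 1, near zero beyond the window.
near, far = lag_correlation(smooth, 1), lag_correlation(smooth, 10)
```

The point of such a plot is that once the correlation has decayed to near zero at some separation, treating intensities of well-separated voxels as conditionally independent becomes a reasonable modeling assumption.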
• Two distance metrics are employed to measure the discrepancies between reconstruction algorithms and manual traces. Obviously, they are used at the level of individual examples. Could the authors provide additional metrics or statistical analysis methods for the whole dataset?
o We performed two-proportion z-tests to compare success rates between algorithms. Additionally, we produced box-and-whisker plots of the two distance metrics for a more descriptive analysis of the algorithms' performance.
§ See Figure 7.
• Hidden Markov modeling is a relatively old technique in computer science. Due to some inherent disadvantages, this technique has gradually been replaced by recurrent neural network (RNN) architectures (more specifically, long short-term memory networks and gated recurrent units) in natural language processing and other sequential tasks. Therefore, the novelty of ViterBrain remains unknown unless the authors can demonstrate that the proposed method outperforms RNN-based baselines.
o It is true that RNN approaches have made tremendous strides in many sequential decision processes that were classically solved with HMMs. However, we could not find any notable publications applying RNN approaches to neuron reconstruction. We added a sentence to the Discussion noting that this would be a ripe avenue for future work.
§ From Section 3: Future benchmark comparisons could include reinforcement learning or recurrent neural network approaches, which have become prevalent in sequential decision processes. However, there is not much scientific literature on these approaches to neuron reconstruction with accompanying functional code.
o We do, however, reference a deep reinforcement learning method (Dai et al. 2019). Unfortunately, the software accompanying this publication has issues (following the README, the training script fails with an error), so we did not add it to our algorithm comparison.
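For reference, the discrete Fréchet distance between two polyline traces (a common approximation of the continuous Fréchet metric; the manuscript's exact implementation may differ) can be computed with a simple dynamic program:

```python
import numpy as np

def discrete_frechet(P, Q):
    """Discrete Frechet distance between polylines P (n, d) and Q (m, d)."""
    n, m = len(P), len(Q)
    dist = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)  # pairwise distances
    ca = np.empty((n, m))                       # ca[i, j]: best coupling up to (i, j)
    ca[0, 0] = dist[0, 0]
    for i in range(1, n):                       # first column: coupling is forced
        ca[i, 0] = max(ca[i - 1, 0], dist[i, 0])
    for j in range(1, m):                       # first row: coupling is forced
        ca[0, j] = max(ca[0, j - 1], dist[0, j])
    for i in range(1, n):
        for j in range(1, m):
            ca[i, j] = max(min(ca[i - 1, j], ca[i - 1, j - 1], ca[i, j - 1]),
                           dist[i, j])
    return float(ca[-1, -1])
```

Unlike a pointwise average-distance metric, the Fréchet distance respects the ordering of points along each curve, which makes it natural for comparing traced neuron paths against manual traces.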
• Some recent papers could be further analyzed and discussed in the revised version of this manuscript.