Precise segmentation of densely interweaving neuron clusters using G-Cut

Characterizing the precise three-dimensional morphology and anatomical context of neurons is crucial for neuronal cell type classification and circuitry mapping. Recent advances in tissue clearing techniques and microscopy make it possible to obtain image stacks of intact, interweaving neuron clusters in brain tissues. As most current 3D neuronal morphology reconstruction methods are only applicable to single neurons, it remains challenging to reconstruct these clusters digitally. To advance the state of the art beyond these challenges, we propose a fast and robust method named G-Cut that is able to automatically segment individual neurons from an interweaving neuron cluster. Across various densely interconnected neuron clusters, G-Cut achieves significantly higher accuracies than other state-of-the-art algorithms. G-Cut is intended as a robust component in a high throughput informatics pipeline for large-scale brain mapping projects.

b. The MES of neurons segmented by G-Cut, NeuroGPS-tree, and TREES toolbox is represented by squares, circles, and asterisks, respectively. Error bars show the standard deviation. Source data are provided as a Source Data file.

Supplementary Note 1. Simulation of neuron clusters
Because reconstructed intact neuron clusters are not publicly available, we simulated neuron clusters by selecting and joining random subsets of 2693 well-reconstructed neurons (435 interneurons and 2258 principal neurons) hosted on neuromorpho.org. In our synthetic data, each neuron is represented by two parts: a set of node information (node type, x, y, z location, and radius) and an adjacency matrix representing the edges that connect the nodes. We generated two datasets to evaluate the effects of cluster scale and degree of entanglement, respectively. The procedure is as follows:
1. Starting neuron population: Denote the predetermined number of neurons in a cluster as n and the corresponding cluster scale as $C_n$. From the public dataset of well-reconstructed neurons, we randomly chose m pyramidal neurons and $n - m$ interneurons as the starting neuron population, where $P(m = k \mid 1 \le k < n) = (n - 1)^{-1}$. One of the n neurons was randomly selected as the base neuron, and the remaining neurons became connecting neurons. Each connecting neuron was joined with the base neuron as described in steps 2 to 4. The process was iterated until all neurons were joined into a single cluster.
2. Spurious link construction: During neuron tracing, if the distance between branches of two neurons is very small, an automatic tracing method will erroneously bridge the gap between the two branches with a spurious link. To mimic real-world neuron cluster tracing, we placed the cell bodies of connecting neurons at random locations within the same bounded volume as the base neuron. If two branches from different neurons lay closer to each other than the sum of their radii, we counted this as a spurious link and constructed a connection between the two branches (as shown in Supplementary Figure 2).
3. Synthetic dataset with varying cluster scales: To understand how cluster scale affects segmentation, the number of spurious links must be bounded to a reasonable range. We first empirically derived a distribution for the number of spurious links between a random neuron pair, using the criterion described in step 2. A neuron pair was randomly drawn from the set of well-reconstructed neurons and randomly positioned together 1000 times. We repeated the drawing and positioning 50 times, yielding spurious link counts for a total of 50,000 clusters. The results show that in the majority of cases the number of spurious links is less than 10. A count of 1 is too low to be realistic for real image stacks. We therefore bounded the number of spurious links between a neuron pair to between 2 and 10.
We then iteratively joined the $n - 1$ connecting neurons to the base neuron. For each connecting neuron, a random cell body position was generated and the spurious links were counted. If the count fell between 2 and 10, we accepted the cell body position and constructed links between the connecting neuron and the base neuron. Otherwise, a new position was generated.
The cluster formed by the joining operation was then treated as the new base neuron.
The final cluster $C_n$ therefore contains between $2(n - 1)$ and $10(n - 1)$ spurious links, to be assigned among n cell bodies. We generated 100 clusters for each cluster scale $C_n$.
For clusters with the same scale, the difficulty of the segmentation problem will only differ up to a bounded constant factor, and no unusually dense entanglement can occur. This allows us to analyze how cluster scale affects segmentation accuracy.
4. Synthetic dataset with varying degrees of entanglement: To understand how a cluster's degree of entanglement affects segmentation accuracy, we used a fixed cluster scale, $n = 6$, for the entire dataset. Spurious links were constructed without an upper bound.
We generated 10,000 clusters and stratified the cluster population by the probability distribution of the spurious link number (Fig. 5c). From the clusters with a spurious link number in each of the intervals [10, 20), [20, 30), …, [120, ∞), we randomly drew 100 samples for analysis.
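The stratification in step 4 can be illustrated with a minimal Python sketch. It assumes that the elided intervals are all of width 10 between 10 and 120, and that clusters with fewer than 10 spurious links fall outside every interval. The names `bin_index` and `stratified_sample` are hypothetical and are not taken from the G-Cut code.

```python
import random
from collections import defaultdict

def bin_index(link_count, start=10, width=10, last_edge=120):
    """Map a spurious-link count to its interval index.

    Assumed intervals: [10, 20) -> 0, [20, 30) -> 1, ..., [120, inf) -> 11.
    Counts below `start` belong to no interval and return None.
    """
    if link_count < start:
        return None
    if link_count >= last_edge:
        return (last_edge - start) // width
    return (link_count - start) // width

def stratified_sample(link_counts, per_bin=100, rng=None):
    """Draw up to `per_bin` cluster ids at random from each interval."""
    if rng is None:
        rng = random.Random()
    strata = defaultdict(list)
    for cluster_id, count in enumerate(link_counts):
        idx = bin_index(count)
        if idx is not None:
            strata[idx].append(cluster_id)
    # Sample without replacement; a sparse interval yields all of its members.
    return {idx: rng.sample(ids, min(per_bin, len(ids)))
            for idx, ids in sorted(strata.items())}
```

Here `link_counts` is a list holding one spurious link count per simulated cluster, so the returned dictionary maps each interval index to the ids of the clusters drawn from it.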
One example of the reconstructed neuron cluster is shown in Fig. 4.
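The placement and linking procedure of steps 2 and 3 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure used to build the datasets: it compares node-to-node distances rather than distances between whole branches, it assumes the connecting neuron's coordinates are given relative to its cell body at the origin, and it treats the bounded volume as an axis-aligned box. The names `Node`, `spurious_links`, and `join_neuron` are hypothetical.

```python
import math
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    x: float
    y: float
    z: float
    radius: float

def spurious_links(base, other):
    """Index pairs (i, j) of nodes lying closer than the sum of their radii."""
    links = []
    for i, a in enumerate(base):
        for j, b in enumerate(other):
            dist = math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))
            if dist < a.radius + b.radius:
                links.append((i, j))
    return links

def translate(nodes, offset):
    """Shift every node by `offset`, i.e. move the cell body to that position."""
    dx, dy, dz = offset
    return [Node(n.x + dx, n.y + dy, n.z + dz, n.radius) for n in nodes]

def join_neuron(base, connecting, bounds, rng, lo=2, hi=10, max_tries=10_000):
    """Rejection-sample a cell body position giving lo..hi spurious links.

    `bounds` is [(xmin, xmax), (ymin, ymax), (zmin, zmax)] (assumed box volume).
    Returns the merged node list and the links as index pairs into that list.
    """
    for _ in range(max_tries):
        offset = tuple(rng.uniform(low, high) for low, high in bounds)
        placed = translate(connecting, offset)
        links = spurious_links(base, placed)
        if lo <= len(links) <= hi:
            shift = len(base)  # re-index the connecting neuron's nodes
            return base + placed, [(i, shift + j) for i, j in links]
    raise RuntimeError("no acceptable cell body position found")
```

Calling `join_neuron` repeatedly, each time passing the returned node list as the new `base`, mirrors the iteration in which the joined cluster becomes the next base neuron.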

Supplementary Note 2. The redundant branches pruning method.
We compute the GOF of every branch in a neuron. The total GOF of a branch i is calculated by the equation: ∑ where j indexes the child branches of branch i and $\mathrm{length}_i$ is the length of branch i.
After calculating the total GOF of all branches of a neuron, we use a threshold value to prune the branches. The threshold can be a constant or can vary from neuron to neuron. If the total GOF of a branch is larger than the threshold value, all of its child branches are discarded from the neuron.
Example: As shown in Supplementary Figure 8, branch 7 and branch 8 are the child branches of branch 6 in the right figure. If the total GOF of branch 6 is larger than a threshold value, branch 7 and branch 8 are discarded from the neuron.
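The thresholding step can be illustrated with a short Python sketch. It takes the total GOF of each branch as a precomputed input, since the equation itself is not reproduced here, and it assumes that discarding a child branch also removes everything below it in the tree. The function name `prune_branches`, the tree layout, and the GOF values are hypothetical and chosen only to mirror the example of branches 6, 7, and 8.

```python
def prune_branches(children, total_gof, threshold, root):
    """Return the set of branch ids kept after threshold pruning.

    children:  dict mapping a branch id to the list of its child branch ids
    total_gof: dict mapping a branch id to its precomputed total GOF
    threshold: constant cutoff (it could also be chosen per neuron)
    """
    kept = set()
    stack = [root]
    while stack:
        branch = stack.pop()
        kept.add(branch)
        if total_gof[branch] > threshold:
            # Discard all child branches (and, by assumption, their subtrees).
            continue
        stack.extend(children.get(branch, []))
    return kept

# Hypothetical neuron: branch 6 has child branches 7 and 8.
children = {1: [2, 6], 6: [7, 8]}
total_gof = {1: 0.1, 2: 0.2, 6: 0.9, 7: 0.3, 8: 0.4}  # illustrative values
kept = prune_branches(children, total_gof, threshold=0.5, root=1)
```

With these illustrative values, the total GOF of branch 6 exceeds the threshold, so branches 7 and 8 are dropped while branch 6 itself is kept.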