There is great interest in tailoring diagnostic and therapeutic tools to individual patients. This concept reflects the growing recognition that there is significant variability between individuals. As therapies focus on molecular targets, diagnostic medical imaging tools must reveal focal pathologies and the effects of therapy in each patient. High-resolution object detection and image segmentation are thus critical to obtaining meaningful data in a heterogeneous image.

In image analysis, detection is used to identify objects such as organs and tumors, and segmentation is used to isolate the objects from an image. While large objects can often be automatically or semi-automatically isolated, small objects (blobs) are difficult to detect and segment. Blobs can range in size and location in images. Examples of blobs include cells or cell nuclei in images from optical microscopy1, exudative lesions in images of the retina2, breast lesions in ultrasound images3, and glomeruli in magnetic resonance (MR) images of the kidney4,5,6. Major challenges to detecting these blobs include low image resolution and high image noise. The small blobs are often numerous and can overlap each other. Many approaches have been proposed for blob detection7,8,9 of which intensity thresholding is among the most common10. Intensity thresholding assumes that the blobs have consistently different intensities from the background. Global differences can be addressed with a fixed threshold and local differences can be addressed with an adaptive threshold11,12. However, the assumptions required for consistent thresholding are often violated, and thresholding alone can lead to erroneous detection or segmentation. To address this, researchers have proposed multi-step pipelines13,14 in which thresholding is only the first step. Intensity-based features are then derived using filters for improved detection. One popular class of filters is based on mathematical morphology15,16. Operators such as erosion, dilation, opening and closing allow geometrical and topological properties of objects. This approach often begins with selected seed points in the image and iteratively adds connected points to form labeled regions. Mathematical morphology is preferred when the blobs are relatively large in size and small in number. Weaknesses of this approach include the tendency to under-segment and diminished performance in the presence of noise. Under-segmentation occurs when multiple blobs within close proximity are detected as one, resulting in an erroneously low detected number. Another type of filter is based on space transformation. For example, Radial-Symmetry17, a point detector for small blobs, uses radially symmetric space as a transformation space to detect radially symmetric blobs. SIFT18, SURF19 and BRISK20 are region detectors. Each of the region detectors extracts scale invariant features to detect small objects but may suffer from poor performance in optical imaging21. Recently, the Laplacian of Gaussian (LoG) detector22,23, from scale space theory, has attracted attention in blob detection8,24. Similar to the radially symmetric detector, the LoG detector is unreliable in detecting rotationally asymmetric blobs. To solve this, LoG extensions have been proposed, including the Difference of Gaussian (DoG)18,25,26,27 and the Generalized Laplacian of Gaussian (gLoG)21. While each approach detects small blobs to some extent, non-blob objects are detected as false blob candidates resulting in over-detection. A post-pruning procedure can remove false blob candidates, but results have been inconsistent28.

Here we focus on detecting individual glomeruli in MR images of the kidney as a specific blob detection problem. To date, most biomarkers of kidney pathology have come from histology using destructive techniques that estimate glomerular number29,30,31. A non-destructive imaging approach to measuring nephron endowment provides a new marker for renal health and susceptibility to kidney disease. Cationic ferritin enhanced MRI (CFE-MRI) enables the detection of glomeruli in animals32,33 and in human kidneys5,32,33,34. Because each glomerulus is associated with a nephron, CFE-MRI may provide an important imaging marker to detect changes in the number of nephrons and susceptibility to renal and cardiovascular disease5. Glomerulus detection by CFE-MRI presents difficulties because glomeruli are small and have a spatial frequency similar to image noise. Zhang et al. developed the Hessian-based Laplacian of Gaussian (HLoG) detector1 and the Hessian-based Difference of Gaussian (HDoG) detector28 to automatically detect glomeruli in CFE-MR images. They employed the LoG or DoG to smooth the images, followed by Hessian analysis of each voxel for pre-segmentation. Since LoG and DoG suffer from over-detection, a Variational Bayesian Gaussian Mixture Model (VBGMM) was implemented as a final step. LoG and DoG were the first two detectors applied to MR images of the kidney to identify glomeruli. However, deriving Hessian-based features from each blob candidate is computationally expensive, limiting high-throughput studies. In addition, unsupervised learning using the VBGMM in the post-pruning procedure requires a number of carefully tuned parameters for optimal clustering. Here we propose a new approach, termed UH-DoG, which applies joint constraints from spatial probability maps derived from U-Net, a deep learning model, and Hessian convexity maps derived from Hessian analysis on the DoG detector. The theoretical foundation of Hessian analysis guarantees that pre-segmentation will recognize all true convex blobs and some non-blob convex objects, resulting in a blob superset. Joining probability maps allows us to distinguish true blobs from the superset. The joint-constraint extension of the detector requires no post-pruning and thus is robust, generalizable and computationally efficient.

Within the field of deep learning, the Convolutional Neural Network (CNN) has been successfully implemented in medical imaging applications ranging from object detection and segmentation to classification35,36,37. The first generation of CNN models was used to classify images through fully connected layers. Shelhamer et al.38 first proposed a Fully Convolutional Network (FCN) that transfers the fully connected layers to deconvolutional layers and provides a dense class map with arbitrarily-sized input image. The FCN changes “image-label” mapping to “pixel/voxel-label” mapping for object detection and image segmentation. One limitation of the FCN for medical imaging is the need for large training datasets. A lightly weighted FCN model, the U-Net39, employs a modified FCN architecture to require fewer training images but yield precise, fast segmentation. U-Net has been implemented in various medical segmentation tasks such as nucleus, cell, and breast lesion segmentation40,41,42, all drawn from limited datasets. The U-Net yields a probability map where each pixel or voxel indicates the likelihood of being within the imaging object. However, based on our previous study43, U-Net does not reliably separate glomeruli within close proximity. Therefore, we choose to adopt the probability map as part of UH-DoG in conjunction with Hessian analysis for glomerulus detection from CFE-MR images.

There are three main advantages of the UH-DoG method. First, a global blob likelihood constraint from the U-Net probability map reduces over-detection by DoG. Second, a local convex constraint from the Hessian convexity map reduces under-segmentation. Third, integrating the probability map constraint with the Hessian convexity map eliminates the need for post-pruning. To validate the performance of UH-DoG, four methods were chosen from the literature: HLoG1, gLoG21, LoG22, and Radial-Symmetry17. We tested these on dataset of 2D fluorescent images (n = 200) where the locations of blobs were known. UH-DoG outperformed the other four methods in F-score and performed comparably to the other four methods in recall. Next, we compared blob detection of these methods on a 3D kidney MR dataset against the HDoG method. The differences between UH-DoG and HDoG were negligible but the average computation time of UH-DoG was 35% shorter than that of HDoG.


We propose UH-DoG, a joint constraint-based detector for glomeruli detection. UH-DoG consists of three steps (Fig. 1). Step 1 is to use the Difference of Gaussian (DoG) to smooth the images, followed by Hessian analysis to identify possible blob candidates based on local convexity. Step 2 is to use a trained U-Net to generate a probability map, which captures the most likely blob locations. Step 3 is to combine the probability map from Step 2 with blob candidates from Step 1 as joint constraints to identify true blobs. Each step is discussed in detail in the following sections.

Figure 1
figure 1

Proposed UH-DoG for glomerulus identification.

Hessian analysis and hessian convexity map

Before implementing Hessian analysis, DoG is used to smooth the images. By employing a convolution operator, DoG can filter image noise and enhance objects at the selected scale24. DoG is a fast approximation of the LoG filter to highlight blob structure4 and is thus computationally efficient18.

Let a 3D image be \(f:{R}^{3}\to R.\,\)The scale-space representation \(L(x,y,z;\sigma )\) at point (x, y, z), with scale parameter σ, is the convolution of image f(x, y, z) with the Gaussian kernel \(G(x,y,z;\sigma )\):

$$L(x,y,z;\sigma )=\,G(x,y,z;\sigma )\ast f(x,y,z),$$

where * is the convolution operator and the Gaussian kernel \(G(x,y,z;\sigma )=\,\frac{1}{{(2\pi {\sigma }^{2})}^{\frac{3}{2}}}{e}^{-\frac{({x}^{2}+{y}^{2}+{z}^{2})}{2{\sigma }^{2}}}\). The Laplacian of \(L(x,y,z;\sigma )\) is:

$${\nabla }^{2}L(x,y,z;\sigma )=\frac{{\partial }^{2}L(x,y,z;\sigma )}{\partial {x}^{2}}+\frac{{\partial }^{2}L(x,y,z;\sigma )}{\partial {y}^{2}}+\frac{{\partial }^{2}L(x,y,z;\sigma )}{\partial {z}^{2}}.$$

According to18, \({\rm{\sigma }}{\nabla }^{2}L(x,y,z;\sigma )={L}_{\sigma }(x,y,z;\sigma ).\,\)We approximate the partial derivative \({L}_{\sigma }(x,y,z;\sigma )\) by a one-sided difference quotient, the DoG approximation of LoG is:

$$\begin{array}{rcl}{\nabla }^{2}L(x,y,z;\sigma ) & = & \frac{{L}_{\sigma }(x,y,z;\sigma )\,}{{\rm{\sigma }}}\approx \frac{L(x,y,z;\sigma +\Delta \sigma )-L(x,y,z;\sigma )}{{\rm{\sigma }}\Delta \sigma }\\ & = & f(x,y,z)\ast \frac{(G(x,y,z;\sigma +\Delta \sigma )-G(x,y,z;\sigma ))}{{\rm{\sigma }}\Delta \sigma }\end{array}$$

To locate an optimum scale for the blobs, similar to1, we add γ-normalization to form the normalized DoG detector \({\sigma }^{\gamma }{\nabla }^{2}L(x,y,z;\sigma )\), which is:

$$Do{G}_{nor}(x,y,z;\sigma )={\sigma }^{\gamma -1}f(x,y,z)\ast \frac{(G(x,y,z;\sigma +\Delta \sigma )-G(x,y,z;\sigma ))}{\Delta \sigma },$$

where γis introduced to automatically determine the optimum scale for the blobs. We set γ to 2 here. For details on tuning γ, refer to1. The normalized DoG transformation underlies Hessian-based convexity analysis to detect blobs.

After the image is smoothed by the normalized DoG, for a voxel (x, y, z) in the normalized DoG image \(Do{G}_{nor}(x,y,z;\sigma )\) at scale σ, the Hessian matrix for this voxel is:

$$H(Do{G}_{nor}(x,y,z;\sigma ))=[\begin{array}{ccc}\frac{{\partial }^{2}Do{G}_{nor}(x,y,z;\sigma )}{\partial {x}^{2}} & \frac{{\partial }^{2}Do{G}_{nor}(x,y,z;\sigma )}{\partial x\partial y} & \frac{{\partial }^{2}Do{G}_{nor}(x,y,z;\sigma )}{\partial x\partial z}\\ \frac{{\partial }^{2}Do{G}_{nor}(x,y,z;\sigma )}{\partial x\partial y} & \frac{{\partial }^{2}Do{G}_{nor}(x,y,z;\sigma )}{\partial {y}^{2}} & \frac{{\partial }^{2}Do{G}_{nor}(x,y,z;\sigma )}{\partial y\partial z}\\ \frac{{\partial }^{2}Do{G}_{nor}(x,y,z;\sigma )}{\partial x\partial z} & \frac{{\partial }^{2}Do{G}_{nor}(x,y,z;\sigma )}{\partial y\partial z} & \frac{{\partial }^{2}Do{G}_{nor}(x,y,z;\sigma )}{\partial {z}^{2}}\end{array}].$$

In a normalized DoG-transformed 3D image, each voxel of a transformed bright blob has a negative definite Hessian matrix28. We define a binary indicator matrix, \(HI(x,y,z;\sigma )\), termed the Hessian convexity map. \(HI(x,y,z;\sigma )=1\) when \(H(Do{G}_{nor}(x,y,z;\sigma ))\) is negative definite; otherwise, \(HI(x,y,z;\sigma )=0\).

To determine a single optimum scale σ*, the maximum value of the normalized DoG is used here28. Let the average DoG value per blob candidate voxel measure BGoG be:

$${B}_{DoG}(\sigma )=\frac{{\sum }_{(x,y,z)}DoG(x,y,z)HI(x,y,z;\sigma )}{{\sum }_{(x,y,z)}HI(x,y,z;\sigma )}.$$

We have \({\sigma }^{\ast }=argmax\,{B}_{DoG}(\sigma )\). σ* is used to generate the optimum Hessian convexity map \(HI(x,y,z;{\sigma }^{\ast })\). This map is the local convexity constraint for detecting the convex blob regions. Result is a set of convex objects including all true blobs and some non-blob convex objects.

U-Net and Probability Map

A classical CNN usually consists of multiple convolutional layers followed by pooling layers, activation layers, and fully connected layers. Convolutional layers learn hierarchical and high-level feature representation. Pooling layers can reduce feature dimensions and capture spatial feature invariance. The final fully-connected layers categorize the images into different groups. Compared to classical CNNs, FCNs replace all fully-connected layers with a fully convolutional layer. There are several advantages to this approach. First, the input image size can be arbitrary because all models consist of convolution layers, so output size only depends on input size. Second, the FCN can be trained from whole images without patch sampling, thus the effects of patch-wise training need not be considered. However, FCNs require a large dataset for training. To address this issue, a modified FCN model, the U-Net, was proposed. Interested readers are referred to U-Net39 for details. The output of U-Net is a probability map in [0, 1].

In U-Net, let the input images be \(X\in {R}^{{I}_{1}\times {I}_{2}\times {I}_{3}}\), and the output map be \(Y\in {R}^{{I}_{1}\times {I}_{2}\times {I}_{3}}\), where \({y}^{{i}_{1},{i}_{2},{i}_{3}}\in [0,1]\). A binary cross entropy loss function is used in the training process to obtain the output map:

$$L=-\,\frac{1}{{I}_{1}{I}_{2}{I}_{3}}\mathop{\sum }\limits_{k=1}^{{I}_{1}\times {I}_{2}\times {I}_{3}}{Y}_{k}\Delta \,\log ({\hat{Y}}_{k})+(1-{Y}_{k})\Delta \,\log (1-{\hat{Y}}_{k}),$$

where Yk is the true label and \({\hat{Y}}_{k}\,\)is the predicted probability for voxel k.

In our U-Net output, we obtain a probability map \(U(x,y,z),\,\)where x and y are dimensions of each 2D image slice and z is the slice number. The probability map is the global blob likelihood constraint in our joint constraints operations to detect the most likely blob regions. By setting a probability threshold, most noise is removed. However, some touching blobs could have a higher probability than the threshold in the boundary and might not be split up, resulting in reduced detection known as under-segmentation. Joining the Hessian convexity map with U-Net probability map will address the challenges of small blob detection.

Joint constraint operation for true blob identification

Given a 3D image \(f:{R}^{3}\to R,\) Hessian analysis is applied to render a convexity map \(\,HI(x,y,z;{\sigma }^{\ast })\), U-Net is applied to render a probability map \(U(x,y,z)\). We introduce a joint operator

$$UH(x,y,z)=HI(x,y,z;{\sigma }^{\ast })\circ I(x,y,z),$$

where \(I(x,y,z)\) is a binary indicator matrix. Given a probability threshold \({\delta }_{b},\,I(x,y,z)\) = 1 when \(U(x,y,z) > {\delta }_{b}\); otherwise, \(I(x,y,z)\) = 0. We define the true blob candidate as a 27-connected voxel44, and the blob set is represented as:

$${S}_{blob}=\{(x,y,z)|(x,y,z)\in Do{G}_{nor}(x,y,z;\sigma ),UH(x,y,z)=1\,\}.$$

To illustrate, Fig. 2 shows images of blobs detected during the joint constraint operation of the U-Net probability map and the Hessian convexity map. The blue circle in Fig. 2a shows only one blob. The same blue circle on Fig. 2b, after application of the Hessian convexity map, shows there is one “bigger” blob in the middle and a number of smaller blobs around the boundary of the blue circle. Figure 2c shows the correct outcome, only one blob in the middle. This clearly illustrates the sensitivity of the Hessian matrix to noise. Even though the Hessian analysis guarantees the detection of the convex object, some non-blob convex objects (noise) will also be detected, resulting in over-detection. This noise can be readily filtered by the U-Net probability map (Fig. 2c). We conclude that U-Net may be useful for denoising, which alleviates the over-detection of Hessian analysis.

Figure 2
figure 2

(a) A 2D gray scale image preprocessed from experiment 1 fluorescent image (b) Binary Hessian convexity map of (a), the convex pixels are marked as the white color. (c) U-Net probability map of (a), pixel is illustrated with a color indicating a probability of the pixel belonging to a blob. (d) Blob identification map joined from Hessian convexity map and U-Net probability map with 0.5 threshold.

The red circle in Fig. 2a shows overlapped blobs. They are still overlapping in the U-Net probability map from Fig. 2c. But they are split up in the Hessian convexity map from Fig. 2b. By joining the Hessian convexity map and U-Net probability map with a single global threshold, the overlapped blobs in the red circle are visualized as distinct entities, as shown in Fig. 2d. We conclude Hessian analysis could alleviate the under-segmentation issue from U-Net.

Our proposed UH-DoG integrates the probability map from U-Net and convexity map from Hessian analysis to guarantee robustness to noise and effective blob detection. The detailed steps of UH-DoG are shown in Table 1.

Table 1 Detail Steps of proposed UH-DoG.


Two experiments were conducted to validate of the performance of our proposed UH-DoG detector. The first experiment validated the UH-DoG on 200 fluorescence, 2D light microscopy images for cell detection45. The 2D cell images were of interest because (1) to the best of our knowledge, there are no 3D small blob datasets available for comparison; (2) the blobs from these images are small and each image could be used to test the performance of the algorithm in the presence of background noise; (3) this dataset has the ground truth of the locations of each blob. The detection accuracy measured by recall, precision, and F-score can be used to compare this approach with methods from the literature. The second experiment validated the performance of UH-DoG on CFE-MR images of mouse kidneys where each glomerulus was detected. All experiments were approved by the University of Virginia Institutional Care and Use Committee, in accordance with the NIH Guide for the Care and Use of Laboratory Animals.


Training dataset and data augmentation

We used a public dataset46 to train our deep learning model, based on optical images of cell nuclei. This dataset has 141 optical microscopy pathology images (2,000 × 2,000 pixels), as shown in Fig. 3a. The 12,000 ground truth annotations are typically done by an expert, which involves delineating object boundaries over 40 hours46. Due to the large amount of time and effort required, the annotated nuclei in this dataset only represents a small fraction of the total number of nuclei present in all images. Since we aim to facilitate U-Net to denoise our blobs images based on the ground truth labeled images, as shown in Fig. 3b, we generated Gaussian distributed noise with \({\mu }_{noise}=0\) and \({\sigma }_{noise}^{2}=0.01\) and we added it to the ground truth labeled images, resulting in 141 simulated training images, as shown in Fig. 3c. Data were augmented to increase the in variance and robustness properties of U-Net39. We generated the augmented data by a combination of rotation shift, width shift, height shift, shear, zoom, and horizontal flip.

Figure 3
figure 3

Training images. (a) Original image. (b) Ground truth labeled image. (c) Simulated training image.

Experiment I: Validation experiments using 2D fluorescent images

Figure 4 illustrates an example fluorescent image (256 × 256 pixels). Since this was a 2D image, our proposed UH-DoG must incorporate a modified 2D DoG because comparison algorithms were from the 2D LoG and its extensions.

Figure 4
figure 4

(a) Sample 2D fluorescent image. (b) Ground truth dots of (a).

To revise the DoG to a 2D version, for 2D images \(f(x,y)\) with the Gaussian kernel \(G(x,y;\sigma )\), we modified the normalized 3D DoG detector from Eq. 4 in a 2D format:

$$Do{G}_{nor}(x,y;\sigma )={\sigma }^{\gamma -1}f(x,y)\ast \frac{(G(x,y;\sigma +\Delta \sigma )-G(x,y;\sigma ))}{\sigma \Delta \sigma }.$$

Then the corresponding Hessian matrix were modified from Eq. 5 as follows:

$$H(Do{G}_{nor}(x,y;\sigma ))=[\begin{array}{cc}\frac{{\partial }^{2}Do{G}_{nor}(x,y;\sigma )}{\partial {x}^{2}} & \frac{{\partial }^{2}Do{G}_{nor}(x,y;\sigma )}{\partial x\partial y}\\ \frac{{\partial }^{2}Do{G}_{nor}(x,y;\sigma )}{\partial x\partial y} & \frac{{\partial }^{2}Do{G}_{nor}(x,y;\sigma )}{\partial {y}^{2}}\end{array}].$$

The parameter settings for Hessian analysis and DoG were as suggested in28. γ is set to 2. σ varies from 0.5 to 3 with step-size 0.5. Δσ is set to 0.001.

We used precision, recall and F-score to evaluate the performance of our proposed algorithm. Precision measures the fraction of retrieved candidates confirmed by the ground-truth. Recall measures the fraction of ground-truth data retrieved. F-score measures overall performance. Since ground truth data were provided in the form of dots (the coordinates of the blob centers), as in the literature1,28, a candidate was considered a true positive if its intensity centroid was within a threshold d of the corresponding ground truth dot. Specifically, if the Euclidian distance Dij between dot i and blob candidate j was less than or equal to d, the blob was considered a true positive. To avoid duplicate counting, the number (#) of true positives TP was calculated by Eq. 11. Precision, recall, and F-score are calculated by Eqs. 12, 13, 14 respectively:

$$TP=\,\min \{\#\{j:{\min }_{i=1}^{m}{D}_{ij}\le d\},\,\#\{i:{\min }_{j=1}^{n}{D}_{ij}\le d\}\},$$
$$F-score=2\times \frac{precision\times recall}{(precision+recall)},$$

where m is the number of ground-truth and n is the number of blob candidates; d is a thresholding parameter set to a positive value \(\,(0,+\infty )\). If d is small, fewer blob candidates are counted since the distance between the blob candidate centroid and ground-truth should be small. If d is too large, more blob candidates are counted. Here, since local intensity extremes could be anywhere within a small blob with an irregular shape, we set d to the average diameter of the blobs: \(d=2\times \sqrt{\frac{{\sum }_{(x,y)}I(x,y;\sigma )}{\pi }}\).

Since the results of detection by the complete versions of HLoG, gLoG, Radial-Symmetry and LoG on 200 pathological images are available online1,17,21, the results were directly used from these papers for comparison.

Figure 5 shows a comparison of UH-DoG to the HLoG, gLoG, LoG and Radial-Symmetry algorithms. While UH-DoG is comparable to HLoG, gLoG and Radial Symmetry algorithms in recall, it significantly outperforms the four algorithms in both precision and F-score (Table 2). The standard deviation of F-score in UH-DoG was 0.025, compared to 0.0377 with the HLoG method, compared to 0.1436 with the gLoG method, 0.0795 with the Radial-Symmetry method, and 0.0385 with the LoG method. We conclude that UH-DoG provides more accurate and robust detection of blobs in this dataset. In addition, statistical analysis was performed with the results summarized in Table 2. While comparable to the four algorithms on recall, our approach statistically outperformed the others on precision and F-score.

Figure 5
figure 5

Comparison of full versions of UH-DoG, HLoG, gLoG, Radial-Symmetry and LoG on 200 fluorescence images. The error bar indicates the standard deviation of the corresponding measure across 200 images. For precision and F-score, UH-DoG has significant different (see Table 2) with others. For recall, UH-DoG has significant difference with gLoG and LoG.

Table 2 ANOVA using Tukey’s HSD pairwise test on 200 Fluorescent Images (*significance p < 0.05).

Experiment II: Validation experiments using 3D Kidney MRI

In this section, we conducted experiments on CF-labeled glomeruli from a dataset of 3D magnetic resonance images (256 × 256 × 256 voxels) to measure number (Nglom) and apparent size (aVglom) of glomeruli in diseased kidneys and healthy control kidneys. Acute kidney injury was induced in adult male C57Bl/6 mice using an intraperitoneal injection of folic acid (125 mg). A subset of the group receiving folic acid, the AKI group (n = 4) was euthanized 4 days after the folic acid was administered and the remainder of those that received folate were euthanized 4 weeks later and termed the chronic kidney disease (CKD) group, n = 3. The control groups for AKI (n = 5) and CKD (n = 6) were age-matched adult male C57Bl/6 mice that received intraperitoneal sodium bicarbonate.

For improved detection, we adopted a preprocessing step to segment the medulla from the image because no glomeruli are located there. Based on the segmented kidney image, shown in Fig. 6a, we converted it to a binary mask (Fig. 6b). Then we generated a distance mask, seen in Fig. 6c. With the map showing the distance between each kidney’s voxel and the kidney boundary, we set up a distance threshold to remove regions farther from the boundary than this threshold. Figure 6d shows the 2D image slice after removing the medulla.

Figure 6
figure 6

(a) One slice of healthy mouse kidney (ID: 477) image. (b) Binary image of (a). (c) Distance mask of (b). (d) Remove medulla from (a).

Then we performed the proposed UH-DoG method to segment the kidney glomeruli in Fig. 6d. The parameter settings are as follows: γ is set to 2. σ varies from 0.5 to 1.8 with step-size 0.1. Δσ is set to 0.001. Example segmentation results are shown in Figs. 7 and 8. The number of glomeruli (Nglom), mean apparent glomerular volume (aVglom) and median aVglom are reported in Table 3, where the UH-DoG method is compared to the HDoG method. We used the method of calculating apparent glomerular volume from the paper34. Similarly, Table 4 summarizes the results from the AKI and control groups.

Figure 7
figure 7

(a) Glomerular segmentation results from 3D MR images of mouse kidneys (selected slices presented). (ae) One slice for the CKD group. (fj) Identified glomeruli are marked in red. (k) is the zoom-in region of (d) while (l) is the segmentation result of (k). (b) Glomerular segmentation results from 3D MR images of mouse kidneys (selected slices presented). (mp) One slice for the control group. (qt) Identified glomeruli are marked in red. (u) is the zoom-in region of (o) while (v) is the segmentation results of (u).

Figure 8
figure 8

(a) Glomerular segmentation results from 3D MR images of mouse kidneys (selected slices presented). (ae) One slice for the AKI group. (fj) Identified glomeruli are marked in red. (k) is the zoom-in region of (d) while (l) is the segmentation result of (k). (b) Glomerular segmentation results from 3D MR images of mouse kidneys (selected slices presented). (m–p) One slice for the control group. (q–t) Identified glomeruli are marked in red. (u) is the zoom-in region of (o) while (v) is the segmentation results of (u).

Table 3 Glomerular number (Nglom) and volume (aVglom) for the CKD and control mice kidneys using the proposed UH-DoG method comparing with HDoG method (*aVglom unit mm3 × 10−4).
Table 4 Glomerular number (Nglom) and volume (aVglom) for the AKI and control mice kidneys using the proposed UH-DoG method comparing with HDoG method (*aVglom unit mm3 × 10−4).

We performed quality control by visually checking the identified glomeruli in kidney images. Figure 7 shows glomerular identification for CKD and control kidneys. Figure 8 shows glomerular identification for kidneys in the AKI and control groups.

Discussion: computation cost

UH-DoG significantly decreases computation time compared to the HDoG algorithm28, as shown in Tables 5 and 6. The training time of U-Net is not included in the estimates of computation time as it is trained beforehand and can be used to test on all images.

Table 5 Computation time for CKD and Control kidneys using HDoG and the proposed method with scale = 1 (Intel Xeon 3.6 GHz CPU and 16 GB of memory, NVIDIA TITAN XP and 12 GB of memory).
Table 6 Computation time for AKI and Control kidneys using HDoG and the proposed method with scale = 1 (Intel Xeon 3.6 GHz CPU and 16 GB of memory, NVIDIA TITAN XP and 12 GB of memory).

Discussion: Clinical translation

The use of imaging biomarkers in humans has increased both for disease early detection and disease severity assessment. Additionally, imaging biomarkers can serve as surrogate endpoints in clinical trials, reducing cost and burden associated with these studies. For example, total kidney volume has recently been accepted as a surrogate marker for disease progression in autosomal dominant polycystic kidney disease trials47. Although the importance of glomerular number has been universally accepted, the detection of glomerular number and size has been limited because the only methodology to obtain these metrics were destructive stereological approaches that could only be performed post mortem. With the advent of CFE-MRI, the need for image analysis tools is paramount. However, it is critical to the success of any imaging biomarker that the marker be accurate and rapidly obtained. This study demonstrates some of the challenges in detecting small objects, such as glomeruli, particularly in the settings of low image resolution, image noise and overlap of objects. It also shows the promise of rapid acquisition where data can be used in a timeframe to influence patient care. Further work is necessary to validate the accuracy of the detection of diseased glomeruli to apply this algorithm to a wider range of renal disease models.


Discovering imaging biomarkers is important to inform disease diagnosis, prognosis, therapy development and treatment assessment. Of particular interest in this research is to identify quantitative glomeruli biomarkers from CFE-MR image. This is a challenging problem because the number of glomeruli is large, the size is small. In addition, the limitation from imaging acquisition such as hardware and variable acquisition parameters often renders the images with less desirable resolution resulting the overlapping glomeruli. In this paper, we demonstrated a new small blob detector by joining the Hessian convexity map and probability map from U-Net. This joint constraint-based approach overcomes under-segmentation by U-Net and over-detection by Hessian analysis. While it was successfully implemented in segmenting the kidney glomeruli, there are still some limitations. First, the assumption that the blobs are convex and similar in size may not be robust for non-convex objects with difference sizes. A future possible improvement is to enhance ability of U-Net to detect both convex and non-convex small objects. Second, the probability map is sensitive to the threshold. We plan to explore the use of thresholding to improve UH-DoG.