Introduction

Gliomas are the most common primary brain tumors, arising from the glial cells that support the cerebral nerve cells. The World Health Organization (WHO) grading system for gliomas defines grades I–IV, where grade I tumors are the least aggressive and grade IV tumors are the most aggressive1. Of these, 70% are considered malignant gliomas (anaplastic astrocytoma, WHO grade III, and glioblastoma multiforme, WHO grade IV). Glioblastoma multiforme (GBM), named for its histopathological appearance, is the most frequent malignant primary brain tumor and one of the most highly malignant human neoplasms. Treatment of glioblastomas typically includes maximum safe resection, percutaneous radiation and chemotherapy. Despite new radiation strategies and the development of oral alkylating substances (e.g. temozolomide), the life expectancy of GBM patients is still only about fifteen months2. Although the role of surgery was controversial in previous years, recent literature favors maximum safe surgical resection as a positive predictor of extended patient survival3. Microsurgical resection can now be optimized with the technical development of neuronavigation based on data from diffusion tensor imaging (DTI), functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG), magnetic resonance spectroscopy (MRS), or positron-emission-computed-tomography (PET). An early postoperative magnetic resonance imaging (MRI) scan with a contrast agent can be used to determine how much of the tumor mass has been removed, and frequent MRI scans can help to monitor any new tumor growth.

For automatic glioma segmentation in general (World Health Organization grade I–IV), several algorithms that rely on magnetic resonance imaging have already been proposed. Szwarc et al.4 have presented a segmentation approach that uses fuzzy clustering techniques. In their evaluation, the authors used six magnetic resonance (MR) studies of three subjects, and the reported Dice Similarity Coefficient (DSC)5,6 ranged from 67.21% to 75.63%. Angelini et al.7 have presented an extensive overview of some deterministic and statistical approaches. Gibbs et al.8 have introduced a combination of region growing and morphological edge detection for segmenting enhancing tumors in T1-weighted MRI data. The authors evaluated their method with one phantom data set and ten clinical data sets. An interactive method for segmentation of full-enhancing, ring-enhancing and non-enhancing tumors has been proposed by Letteboer et al.9 and was evaluated on twenty clinical cases. Based on intensity-based pixel probabilities for tumor tissue, Droske et al.10 have presented a deformable model method, using a level set formulation, to divide the MRI data into regions of similar image properties for tumor segmentation. Clark et al.11 have introduced a knowledge-based automated segmentation on multispectral data in order to partition glioblastomas. In a direct comparison with hand-labeled segmentations, 89 of 120 slices had a percent matching rate of 90% or higher. Segmentation based on outlier detection in T2-weighted MR data has been proposed by Prastawa et al.12. For each case, the time required for the automatic segmentation method was about ninety minutes. Sieg et al.13 have introduced an approach to segment contrast-enhanced, intracranial tumors and anatomical structures of registered, multispectral MR data. The approach has been tested on twenty-two data sets, but no computation times were provided. Egger et al.14,15 present a graph-based approach. After the graph has been constructed, the minimal cost closed set on the graph is computed via a polynomial time s-t cut16. The presented method has been evaluated with fifty glioblastoma multiforme cases, yielding an average Dice Similarity Coefficient of 80.37 ± 8.93%.

Since fully automated segmentation often fails to match human judgments of tumor boundaries, a number of interactive segmentation algorithms have been proposed. Vezhnevets and Konouchine17 give an overview of methods for generic image editing and methods for editing medical images. An interactive segmentation technique called Magic Wand17 is a common selection tool in image editing software applications. The tool gathers color statistics from the user-specified image point (or region) and segments (connected) image regions with pixels whose color properties fall within some given tolerance of the gathered statistics. Reese18 has presented a region-based interactive segmentation technique called Intelligent Paint, based on hierarchical image segmentation by tobogganing, with a connect-and-collect strategy to define an object's region. Mortensen and Barrett19 have introduced a boundary-based method to compute a minimum-cost path between user-specified boundary points. The intelligent scissors method20 treats each pixel as a graph node and uses shortest-path graph algorithms for boundary calculation; a faster variant of region-based intelligent scissors uses tobogganing for image over-segmentation and then treats homogeneous regions as graph nodes. GraphCut is a combinatorial optimization technique applied to the task of image segmentation by Boykov and Jolly21. GrabCut, an extension of GraphCut developed by Rother et al.22, is an iterative segmentation scheme that uses a graph-cut for intermediate steps. A marker-based watershed transformation algorithm for medical image segmentation, developed by Moga and Gabbouj23, uses user-specified markers for segmenting gray level images. The Random Walker algorithm of Grady and Funka-Lea24 is a probabilistic approach using a small number of user-labeled pixels. Heimann et al.25 have presented an interactive region growing method that is a descendant of one of the classic image segmentation techniques. A manual refinement system for graph-based approaches has recently been presented by Egger et al.26,27. The approach takes advantage of the basic design of graph-based image segmentation algorithms and restricts a graph-cut by using additional user-defined seed points to set up fixed nodes in the graph. Another recent publication by Zukić et al.28 presents semi-automatic GBM segmentation with a balloon inflation approach29. The balloon inflation method has been evaluated with twenty-seven magnetic resonance imaging data sets with a reported average DSC of 80.46%. The GrowCut method, developed by Vezhnevets and Konouchine17, is a cellular automaton-based algorithm for interactive multilabel segmentation of N-dimensional images. The GrowCut algorithm is freely available as a module30 for the medical image computing platform 3D Slicer31 and has been used in a recent study to segment pituitary adenomas32.

In this paper, we present a detailed study of the volumetric analysis of glioblastoma multiforme using the GrowCut tool of 3D Slicer. Our objective is to evaluate the utility of 3D Slicer in simplifying the time-consuming manual slice-by-slice segmentation while achieving comparable accuracy. To this end, 4 physicians segmented GBMs in 10 patients, once using the competitive region-growing based GrowCut segmentation module of 3D Slicer and once by drawing boundaries manually on a slice-by-slice basis. The time required for GrowCut vs. manual segmentation was recorded. The 3D Slicer based segmentation was compared with the manual slice-by-slice segmentation using the Dice Similarity Coefficient (DSC) and the Hausdorff Distance (HD)33,34,35.

Methods that use all slices to calculate the tumor boundaries have more information available to make accurate predictions of tumor volume. Simpler methods such as geometric models provide only a rough estimate of the tumor volume and may not be suitable for accurate determination of tumor burden. Geometric approximations use one or several user-defined diameters to estimate the tumor volume36,37,38. The Macdonald criteria39 for measuring brain tumors adopt uniform, rigorous response criteria similar to those in general oncology, where response is defined as a ≥50% reduction in tumor size and the usual measure of "size" is the largest cross-sectional area (the largest cross-sectional diameter multiplied by the largest diameter perpendicular to it). Because such measures capture only a single cross-section of the tumor, accurate and repeatable methods to calculate tumor volume are an important aspect of clinical care.
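The bidimensional measure described above can be illustrated with a short sketch. The function names and the diameters below are hypothetical and chosen only for illustration; the sketch covers only the ≥50% size-reduction rule quoted in the text and none of the other clinical conditions that the criteria may include.

```python
def bidimensional_product(largest_diameter_mm, perpendicular_diameter_mm):
    """Tumor 'size' as the product of the two perpendicular diameters (mm^2)."""
    return largest_diameter_mm * perpendicular_diameter_mm


def relative_reduction(baseline_size, followup_size):
    """Fractional reduction in size between baseline and follow-up."""
    return (baseline_size - followup_size) / baseline_size


def is_response(baseline_size, followup_size, threshold=0.5):
    """True if the size reduction reaches the >= 50% threshold."""
    return relative_reduction(baseline_size, followup_size) >= threshold


# Illustrative (made-up) measurements, not patient data.
baseline = bidimensional_product(40.0, 30.0)    # 1200 mm^2
followup_a = bidimensional_product(28.0, 20.0)  # 560 mm^2
followup_b = bidimensional_product(35.0, 28.0)  # 980 mm^2

print(is_response(baseline, followup_a))  # reduction of about 53%
print(is_response(baseline, followup_b))  # reduction of about 18%
```

Because only two diameters on a single cross-section enter the calculation, two tumors with very different three-dimensional shapes can receive the same "size", which is the limitation that motivates volumetric segmentation.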

The rest of this article is organized as follows: Section 2 presents the results of our experiments. Section 3 discusses the performance of the proposed approach, concludes the contribution and outlines areas for future work. Finally, Section 4 presents the material and the methods.

Results

The goal of this study was to evaluate the utility of 3D Slicer for segmentation of GBMs compared to manual slice-by-slice segmentation. We used two metrics for this evaluation: a) the time it took for physicians to segment GBMs manually vs. using 3D Slicer, b) the agreement between the two segmentations. In using these metrics to evaluate our results, our assumption is that if 3D Slicer can be used to produce GBM segmentations that are statistically equivalent to what the physicians achieve manually and in substantially less time, then the tool is useful for volumetric follow-ups of GBM patients. Overall, four physicians participated in our study: three physicians provided the manual slice-by-slice segmentations and one physician was trained in Slicer-based segmentation as described in the Methods section. The results of our study are detailed in Table 1, the primary conclusion of which is that 3D Slicer based GBM segmentation can be performed in about 60% of the time and with acceptable agreement (DSC: 88.43 ± 5.23%, HD: 2.32 ± 5.23 mm) to manual segmentation by a qualified physician. In Table 1, the MT column shows the time (in minutes) it took a physician to segment each of ten GBMs on a slice-by-slice basis. The SlicerT column shows the time (in minutes) it took a physician to segment the same GBM using 3D Slicer. The Slices column shows the number of slices that the tumor spans in each case, as a rough approximation of the complexity of the segmentation task. Note that in 9 out of 10 cases, SlicerT < MT and, on average, the time it took to segment with 3D Slicer was 61% of the time it took to segment manually on a slice-by-slice basis. The columns DSC and HD show the agreement between the two segmentations using the Dice Similarity Coefficient and the Hausdorff Distance, respectively.

Table 1 This table presents a comparison of a) the time it took for physicians to segment GBMs manually vs. using 3D Slicer, b) the agreement between the two segmentations. The MT column shows the time (in minutes) it took a physician to segment each of ten GBMs on a slice-by-slice basis. The SlicerT column shows the time (in minutes) it took a physician to segment the same GBM using 3D Slicer. The Slices column shows the number of slices that the tumor spans in each case, as a rough approximation of the complexity of the segmentation task. Note that in 9 out of 10 cases, SlicerT < MT and, on average, the time it took to segment with 3D Slicer was 61% of the time it took to segment manually on a slice-by-slice basis. The columns DSC and HD show the agreement between the two segmentations using the Dice Similarity Coefficient and the Hausdorff Distance, respectively.

To provide readers with a point of comparison on how DSC and HD computations vary between expert raters, we include in Table 2 some statistics that we published in another article where we analyzed the results of 12 manual slice-by-slice GBM segmentations by 3 neurosurgeons40,41.

Table 2 Manual intra- and inter-physician segmentation results (min, max, mean μ and standard deviation σ) for three neurosurgeons – X, Y and Z – for twelve glioblastoma multiforme (GBM) data sets. The first column represents the intra-physician segmentation result: Physician X segmented the twelve GBMs slice-by-slice twice, with an interval of two weeks between the two sessions. The second and third columns present the inter-physician segmentation results, whereby the manual slice-by-slice segmentations from Physician Y and Physician Z have been compared with the first manual segmentation of Physician X.

In addition to the quantitative results, we present sample GBM segmentation results in Figures 1 and 2 for visual inspection. Figure 1 shows the results of the 3D Slicer GrowCut function (for the tumor and background initialization shown in Figure 3). The rendered 3D tumor segmentation is superimposed (green) on three orthogonal cross-sections of the data. Figure 2 presents the direct comparison of 3D Slicer vs. manual segmentation on an axial slice: the semi-automatic segmentation under 3D Slicer (green) is shown on the left side of the figure and the pure manual segmentation (blue) is shown in the middle of the figure. A fused visualization of the 3D masks of the manual and the Slicer segmentations is displayed on the right side of the figure.

Figure 1

This image presents the segmentation results of GrowCut (green) for the tumor and background initialization of Figure 3.

After the initialization of the GrowCut algorithm under Slicer it took about ten seconds to get the segmentation result on an Intel Core i7-990 CPU, 12 × 3.47 GHz, 12 GB RAM, Windows 7 Home Premium x64 Version, Service Pack 1.

Figure 2

Comparison of glioblastoma multiforme (GBM) segmentation results on an axial slice: semi-automatic segmentation under Slicer (green, left image) and pure manual segmentation (blue, middle image).

Moreover, a fused visualization of the 3D masks of the manual and the Slicer segmentation is presented (rightmost image).

Figure 3

These images present a typical user initialization for glioblastoma multiforme (GBM) segmentation under Slicer with GrowCut: axial (left image), sagittal (second image from the left) and coronal (third image from the left).

In addition, a 3D visualization of all three slices is presented (rightmost image). Note: the tumor has been initialized in green and the background has been initialized in yellow.

Discussion

We observed that the automatic segmentation results produced by 3D Slicer (GrowCut) typically required some additional editing on some slices to achieve the desired boundary and the time required for this manual correction is included in our measurements. Manual segmentation by neurosurgeons took three to nineteen minutes (mean: ten minutes), in contrast to the semi-automatic segmentation with the GrowCut implementation under 3D Slicer that took about 60% of that time (mean: five minutes) including the time needed for editing the GrowCut results.

To quantify the quality of the GrowCut algorithm, we performed intra- and inter-physician segmentations40,41. The results also provided an upper segmentation threshold and therefore a quality measure for our algorithm. For the intra-physician segmentation, a neurosurgeon segmented twelve glioblastoma multiforme cases. After two weeks, the same neurosurgeon segmented these twelve cases again. The detailed results are presented in Table 2 and provide a mean value μ and a standard deviation σ of 90.29 ± 4.48% with a minimal Dice Similarity Coefficient of 84.01% and a maximal Dice Similarity Coefficient of 96.30% (see the first column). Table 2 also shows inter-physician segmentation results for the twelve glioblastoma multiforme cases (see the second and third columns). For these, the segmentations of neurosurgeons Y and Z have been compared with the segmentations of neurosurgeon X. It is evident that there is an upper threshold with a Dice Similarity Coefficient of around ninety percent for the manual intra- and inter-physician segmentations (average DSC when compared with an automatic segmentation: 79.96 ± 8.06% (neurosurgeon X), 77.79 ± 8.49% (neurosurgeon Y) and 76.83 ± 13.67% (neurosurgeon Z)). The DSC of 90% can be thought of as a reference for estimating how well an automatic segmentation result performs relative to the range of performance of experts, and it may also serve as an indicator of how much manual post-editing will be required after the automatic segmentation is performed.

In this paper, the evaluation of glioblastoma multiforme segmentation with the free and open source medical image analysis software 3D Slicer has been presented. Slicer provides a semi-automatic, 3D segmentation algorithm, GrowCut, that is a viable alternative to the time-consuming process of volume determination during monitoring of a patient, for which slice-by-slice contouring has been the best demonstrated practice. Editing tools available in 3D Slicer are used for manual editing of the results upon completion of the automatic GrowCut segmentation. The volume of the 3D tumor is then computed and stored as an aid for the surgeon in decision making and for comparison with follow-up scans. This segmentation has been evaluated on 10 GBM data sets against manual expert segmentations using the Dice Similarity Coefficient (DSC) and the Hausdorff Distance (HD). Additionally, intra-physician segmentations have been performed to provide a quality measure of the presented evaluation. In summary, the research highlights of the presented work are:

  • Manual slice-by-slice segmentations of glioblastoma multiforme (GBM) have been performed by clinical experts to obtain ground truth of tumor boundaries and estimates of rater variability.

  • Physicians have been trained in segmenting glioblastoma multiforme with GrowCut and the Editor module of 3D Slicer.

  • Trained physicians used Slicer to segment a glioblastoma multiforme evaluation set.

  • Semi-automatic segmentation times have been measured for GrowCut based segmentation in 3D Slicer.

  • Dice Similarity Coefficient (DSC) and Hausdorff Distance (HD) have been calculated to evaluate the quality of the segmentations.

There are several areas for future work. In particular, some steps of the segmentation workflow under Slicer can be automated. Instead of initializing the foreground on three single 2D slices, a single 3D initialization could be used by means of generating a sphere around the position of the user-defined seed point. Additionally, the algorithm can be enhanced with statistical information about the shape42 and the texture of the desired object43 to improve the automatic segmentation. Furthermore, we plan to evaluate the method on magnetic resonance imaging (MRI) data sets with World Health Organization grade I, II and III gliomas. As compared to high-grade gliomas, low-grade tumor MR images lack gadolinium enhancement. Thus, for these tumors, outlines cannot be expressed by contrast-enhancing T1-weighted images, but by surrounding edema in T2-weighted images. In addition, we want to study how Slicer can be used to enhance the segmentation process of vertebral bodies. Besides, we want to apply the scheme to segment other organs and pathologies. Moreover, we are considering improving the algorithm by performing the whole segmentation iteratively; that is, after the segmentation has been performed, the result of the segmentation can be used as a new initialization for a new segmentation run with the process repeated under user control. We anticipate that the iterative approach will result in more robustness with respect to initialization.

Methods

Data

Ten diagnostic T1-weighted MRI scans with gadolinium enhancement were used for segmentation. These were acquired on a 1.5 Tesla MRI scanner (Siemens MAGNETOM Sonata, Siemens Medical Solutions, Erlangen, Germany) using a standard head coil. Scan parameters were: TR/TE 2020/4.38 msec, isotropic matrix, 1 mm; FOV, 250 × 250 mm; 160 sections.

Software

For the semi-automatic segmentation work in this study we used 3D Slicer 4.0, which is freely downloadable from the website http://www.slicer.org.

Manual segmentation of each data set was performed on a slice-by-slice basis by neurosurgeons at the University Hospital of Marburg in Germany (Chairman: Prof. Dr. Ch. Nimsky) with several years of experience in the resection of gliomas (note: if the tumor border was very similar between consecutive slices, the software allowed the user to skip manual segmentation in each slice and instead interpolated the boundaries in these areas). The software used for this manual contouring provided simple contouring capabilities and was created by us using the medical prototyping platform MeVisLab (see http://www.mevislab.de/). The hardware platform used was an Intel Core i5-750 CPU, 4 × 2.66 GHz, 8 GB RAM, Windows XP Professional x64 Version, Version 2003, Service Pack 2.

GrowCut segmentation in 3D Slicer

GrowCut is an interactive segmentation algorithm based on the idea of a cellular automaton. The algorithm achieves reliable and reasonably fast segmentation of moderately difficult objects in 2D and 3D using an iterative labeling procedure resembling competitive region growing. The user's interactions result in a set of seed pixels, which in turn try to assign their labels to their pixel neighborhood. A pixel is assigned the label of its neighbor when the similarity measure of the two pixels, weighted by the neighboring pixel's weight or “strength”, exceeds its current weight. Label assignment also results in an update of the pixel's weight. The labeling procedure continues iteratively until a stable configuration is reached, in which modification of the pixel labels is no longer possible. The algorithm is simple to use, requiring no additional inputs from the user besides the painted strokes on the apparent foreground and background. Furthermore, the user can modify the segmentation by adding additional labels in the image, thereby influencing the segmentation result.
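The labeling rule described above can be sketched in a few lines of Python. This is a minimal 2D illustration and not the 3D Slicer implementation: the similarity function (one minus the absolute intensity difference divided by the intensity range), the 4-connected neighborhood and all names are assumptions made for this example.

```python
NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # assumed 4-connected neighborhood


def growcut(image, seeds, max_iterations=100):
    """Toy GrowCut on a 2D intensity grid.

    image: list of rows of intensities.
    seeds: list of rows of labels, 0 meaning "unlabeled".
    Returns the label grid once no pixel can be updated any more.
    """
    rows, cols = len(image), len(image[0])
    values = [v for row in image for v in row]
    max_diff = float(max(values) - min(values)) or 1.0  # avoid division by zero
    labels = [row[:] for row in seeds]
    # Seed pixels start with full strength, unlabeled pixels with none.
    strength = [[1.0 if lab != 0 else 0.0 for lab in row] for row in seeds]

    for _ in range(max_iterations):
        new_labels = [row[:] for row in labels]
        new_strength = [row[:] for row in strength]
        changed = False
        for r in range(rows):
            for c in range(cols):
                for dr, dc in NEIGHBORS:
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    # Assumed similarity measure in [0, 1].
                    similarity = 1.0 - abs(image[r][c] - image[nr][nc]) / max_diff
                    attack = similarity * strength[nr][nc]
                    # The neighbor wins if its weighted similarity exceeds
                    # the pixel's current strength.
                    if attack > new_strength[r][c]:
                        new_labels[r][c] = labels[nr][nc]
                        new_strength[r][c] = attack
                        changed = True
        labels, strength = new_labels, new_strength
        if not changed:  # stable configuration reached
            break
    return labels


# Dark region on the left, bright region on the right, one seed in each.
image = [[10, 12, 11, 200, 198, 199],
         [11, 10, 12, 199, 200, 198]]
seeds = [[1, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 2]]
result = growcut(image, seeds)
for row in result:
    print(row)
```

In this toy example the two seed labels spread through their own homogeneous regions and stop at the intensity edge, because an attack across the edge carries a similarity close to zero and cannot exceed the strength already established on the other side.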

Our implementation of the algorithm in 3D Slicer consists of a GUI front-end to enable interactions of the user with the image and an algorithm back-end where the segmentation is computed. We employ a minimal interface, where the user interacts by painting on the image. The algorithm requires labeling with at least two different colors (for a foreground and a background label class). A naïve implementation of the algorithm would require every pixel to be visited in each iteration. Furthermore, each pixel would need to visit every one of its neighbors to update the pixel strengths and labels. Such an implementation would be computationally expensive, especially for large 3D images. We implemented the following techniques for speeding up the segmentation. First, as the user may be interested only in segmenting out a small area in the image, the algorithm computes the segmentation only within a small region of interest (ROI). The ROI is computed as a convex hull of all user-labeled pixels with an additional margin of approximately 5% for our study. Second, the iterations involving the image are executed in multiple threads, such that several small regions of the image are updated simultaneously (note: the implementation is multithreaded and automatically makes use of all the cores of the computer). Third, the similarity distances between the pixels are pre-computed once and reused. Finally, the algorithm keeps track of saturated pixels (those whose weights and therefore labels can no longer be updated) and avoids the expensive neighborhood computation on those pixels. Keeping track of such pixels also helps to determine when to terminate the algorithm.

GBM segmentation using 3D Slicer

After trials of the various segmentation facilities available in Slicer, we determined that the use of GrowCut followed by morphological operations such as erosion, dilation and island removal provides the most efficient segmentation method for GBMs from gadolinium-enhanced T1 images. As shown in Figure 4, we used the following workflow to perform GBM segmentation: 1) loading the data set into Slicer, 2) initializing an area inside the tumor and drawing a stroke outside the tumor with a brush size of about 1 cm, 3) running the automatic competing region-growing using GrowCut and 4) using Editor tools like dilation, erosion and island removal, or pure manual refinement, after visual inspection of the results (note: the users are responsible for qualitatively deciding how much dilation, erosion and island removal are required for the segmentation). Figure 5 shows the Slicer Editor module user interface on the left side and a loaded GBM data set on the right side. Figure 3 presents a typical user initialization for GrowCut on the axial, sagittal and coronal cross-sections. Figure 1 shows the results of the GrowCut method and Figure 6 shows the results of a subsequent dilation followed by erosion.
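The effect of the morphological refinement in step 4 can be illustrated with a small, self-contained sketch. It is a hypothetical re-implementation on a 2D binary mask stored as a set of (row, column) pixels with an assumed 4-connected neighborhood; it is not the code of the Slicer Editor tools, and all names are chosen for illustration.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # assumed 4-connected neighborhood


def dilate(mask):
    """Grow the mask by one pixel in every neighbor direction."""
    return mask | {(r + dr, c + dc) for r, c in mask for dr, dc in NEIGHBORS}


def erode(mask):
    """Keep only pixels whose neighbors all belong to the mask."""
    return {(r, c) for r, c in mask
            if all((r + dr, c + dc) in mask for dr, dc in NEIGHBORS)}


def largest_island(mask):
    """Island removal: keep only the largest connected component."""
    remaining = set(mask)
    best = set()
    while remaining:
        start = remaining.pop()
        component = {start}
        queue = deque([start])
        while queue:  # breadth-first flood fill of one component
            r, c = queue.popleft()
            for dr, dc in NEIGHBORS:
                neighbor = (r + dr, c + dc)
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        if len(component) > len(best):
            best = component
    return best


# A 5 x 5 "tumor" with a one-pixel hole and a stray island far away.
block = {(r, c) for r in range(5) for c in range(5)}
raw = (block - {(2, 2)}) | {(10, 10)}

# Island removal, then Dilate followed by Erode to close the hole.
smoothed = erode(dilate(largest_island(raw)))
print(len(raw), len(smoothed))
```

Dilation followed by erosion fills small holes and gaps without enlarging the overall outline, whereas the reverse order removes thin protrusions; how many of these operations to apply remains a qualitative decision of the user, as noted above.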

Figure 4

Detailed workflow of the segmentation process that is used in the training and the evaluation phase (left).

The segmentation process starts with the initialization of the GrowCut algorithm by the user on an axial, sagittal and coronal slice. Then, the automatic segmentation is started and afterwards reviewed by the user. This leads to the refinement phase, in which the Editor tools under Slicer are used to correct the automatic segmentation result – mostly by navigating along the axial slices. During the evaluation phase, the time for the initialization and the refinement has been measured. The overall workflow of the proposed study is presented on the right side; it starts with the image data and ends with the training or the evaluation process. For this purpose, the data is divided into two pools of data sets: the training data set and the evaluation data set. The segmentation process is the same for both stages. However, for the evaluation phase further image processing (voxelization and volume calculation) is required to calculate the Dice Similarity Coefficient (DSC) and the Hausdorff Distance (HD) for a quantitative evaluation.

Figure 5

Slicer interface with the Editor on the left side and a loaded glioblastoma multiforme (GBM) data set on the right side: axial slice (upper left window), sagittal slice (lower left window), coronal slice (lower right window) and the three slices shown in a 3D visualization (upper right window).

Figure 6

These images illustrate the use of the Dilate and Erode options under Slicer.

The background shows an axial slice with a glioblastoma multiforme (white rectangle). The left white rectangle presents the zoomed segmentation result of GrowCut (green). As shown, the segmentation result is not very smooth at the tumor border. To get a smoother result, the Dilate and Erode options under Slicer can be used. For this example, Dilate, Erode and an additional Erode have been performed. The result of these operations is shown in the right white rectangle (green).

The hardware platform used was an Apple MacBook Pro (4 Intel Core i7, 2.3 GHz, 8 GB RAM, AMD Radeon HD 6750M, Mac OS X 10.6 Snow Leopard).

Measurement of segmentation time

We measured the time taken by the same physician to segment manually vs. the 3D Slicer method. Within the 3D Slicer segmentation, we separately measured the time taken by each of the three steps (initialization, GrowCut, refinement using morphological operations) of the 3D Slicer method (see left chart of Figure 4).

Metrics for comparison between 3D Slicer and manual segmentation

The resulting segmentations from both methods were saved as binary volumes and the agreement between the two was compared using the Dice Similarity Coefficient and the Hausdorff Distance.

The Dice Similarity Coefficient (DSC) of agreement between two binary volumes is calculated as follows:

$$\mathrm{DSC} = \frac{2 \cdot V(A \cap R)}{V(A) + V(R)}$$

The DSC measures the relative volume overlap between A and R, where A and R are the binary masks from the automatic (A) and the reference (R) segmentation. V(·) is the volume (in mm³) of the voxels inside the binary mask, obtained by counting the number of voxels and multiplying by the voxel size.
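As an illustration, the following sketch computes the DSC as twice the overlap volume divided by the sum of the two mask volumes, which is the usual definition of the coefficient. The masks are represented as sets of voxel indices, and the function name and the `voxel_volume_mm3` parameter are hypothetical choices for this example rather than part of the software used in the study.

```python
def dice_similarity(mask_a, mask_r, voxel_volume_mm3=1.0):
    """DSC between two binary masks given as sets of (x, y, z) voxel indices.

    The volume V(.) is the voxel count multiplied by the voxel size.
    """
    volume_a = len(mask_a) * voxel_volume_mm3
    volume_r = len(mask_r) * voxel_volume_mm3
    volume_overlap = len(mask_a & mask_r) * voxel_volume_mm3
    if volume_a + volume_r == 0:
        raise ValueError("both masks are empty, DSC is undefined")
    return 2.0 * volume_overlap / (volume_a + volume_r)


# Toy masks: two 4 x 4 squares in one slice, shifted by one voxel in x.
automatic = {(x, y, 0) for x in range(4) for y in range(4)}
reference = {(x, y, 0) for x in range(1, 5) for y in range(4)}

dsc = dice_similarity(automatic, reference)
print(f"DSC = {100 * dsc:.2f}%")
```

Here 12 of the 16 voxels of each mask overlap, so the DSC is 2 · 12 / (16 + 16) = 75%. Since the voxel size appears in both the numerator and the denominator, it cancels out and the DSC depends only on the voxel counts.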

The Hausdorff Distance (HD) between two binary volumes is defined in terms of the Euclidean distance between the boundary voxels of the masks. Let the sets A (of the automatic segmentation) and R (of the reference segmentation) consist of the points that correspond to the centers of the segmentation mask boundary voxels in the two images. The distance d(a,R) from a point a in A to the set R is the minimum Euclidean distance from a to any point of R, the directed HD h(A,R) is the maximum of these distances over all points of A, and the HD between the two sets H(A,R) is the larger of the two directed distances:

$$d(a,R) = \min_{r \in R} \lVert a - r \rVert, \qquad h(A,R) = \max_{a \in A} d(a,R)$$

$$H(A,R) = \max\bigl(h(A,R),\, h(R,A)\bigr)$$
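A brute-force sketch of this computation is given below, assuming the standard definition in which the directed distance is the largest nearest-neighbor distance from one point set to the other and the HD is the larger of the two directed distances. The function names and the toy point sets are illustrative; for real boundary voxel sets a spatial index would be preferable to this quadratic-time loop.

```python
import math


def directed_hausdorff(points_a, points_r):
    """h(A, R): largest distance from a point of A to its nearest point of R."""
    return max(min(math.dist(a, r) for r in points_r) for a in points_a)


def hausdorff(points_a, points_r):
    """H(A, R): the larger of the two directed distances."""
    return max(directed_hausdorff(points_a, points_r),
               directed_hausdorff(points_r, points_a))


# Toy boundary points in mm (centers of boundary voxels).
boundary_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
boundary_r = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

print(directed_hausdorff(boundary_a, boundary_r))  # 1.0
print(directed_hausdorff(boundary_r, boundary_a))  # 3.0
print(hausdorff(boundary_a, boundary_r))           # 3.0
```

The example shows why the two directed distances generally differ: every point of the first set lies within 1 mm of the second set, but the point at 4 mm in the second set is 3 mm away from the nearest point of the first set, and this worst case determines the HD.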