Quantifying the unknown impact of segmentation uncertainty on image-based simulations

Image-based simulation, the use of 3D images to calculate physical quantities, relies on image segmentation for geometry creation. However, this process introduces image segmentation uncertainty because different segmentation tools (both manual and machine-learning-based) will each produce a unique and valid segmentation. First, we demonstrate that these variations propagate into the physics simulations, compromising the resulting physics quantities. Second, we propose a general framework for rapidly quantifying segmentation uncertainty. Through the creation and sampling of segmentation uncertainty probability maps, we systematically and objectively create uncertainty distributions of the physics quantities. We show that physics quantity uncertainty distributions can follow a Normal distribution, but, in more complicated physics simulations, the resulting uncertainty distribution can be surprisingly nontrivial. We establish that bounding segmentation uncertainty can fail in these nontrivial situations. While our work does not eliminate segmentation uncertainty, it improves simulation credibility by making visible the previously unrecognized segmentation uncertainty plaguing image-based simulation.


Multi-class EQUIPS Workflow
In the main manuscript, we primarily presented results for binarized images, where each voxel is classified into one of two phases. However, the EQUIPS workflow is equally applicable to images where there are multiple classes to segment, as we present in this supplementary information. We demonstrate this approach on a manufactured five-class image with a nominal segmentation shown in Supplementary Figure 1.
Consider a 3D image that is composed of a set of $n_c$ segmented classes, $C = \{1, 2, \ldots, n_c\}$. The probability map for a multi-class image segmentation is generated identically to the binary case ($n_c = 2$). Namely,

$$\phi_{v,i} = \frac{1}{N} \sum_{k=1}^{N} p^{k}_{v,i}, \tag{S1}$$

where $\phi_{v,i}$ is the probability of voxel $v$ being in class $i$, $N$ is the number of image segmentation samples, and $p^{k}_{v,i}$ is the binarized value of voxel $v$ being in class $i$ in sample $k$. Here, $p \in \{0, 1\}$ is the binarized value inside voxel $v$, where $p = 1$ means that the voxel is in class $i$ while $p = 0$ means that the voxel is not in class $i$.
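As a minimal sketch of Eq. (S1), the per-class probability maps can be computed by averaging one-hot indicators over the segmentation samples. The function name `probability_maps`, the array name `phi`, the array layout (samples along the first axis, class labels starting at 1), and the toy data are all assumptions made for illustration; they are not taken from the EQUIPS implementation.

```python
import numpy as np


def probability_maps(samples, n_classes):
    """Per-class probability maps from N labeled segmentation samples.

    samples: integer array of shape (N, *image_shape) with labels 1..n_classes.
    Returns a float array of shape (n_classes, *image_shape); entry [i - 1]
    holds the probability map of class i.
    """
    samples = np.asarray(samples)
    classes = np.arange(1, n_classes + 1)
    # Reshape the class labels to (1, n_classes, 1, ..., 1) for broadcasting.
    classes = classes.reshape((1, -1) + (1,) * (samples.ndim - 1))
    # indicators[k, i - 1, v] plays the role of the binarized value p^k_{v,i}.
    indicators = samples[:, np.newaxis, ...] == classes
    # Averaging over the N samples gives the fraction of samples that place
    # voxel v in class i.
    return indicators.mean(axis=0)


# Four hypothetical segmentation samples of a 2x2 three-class image.
samples = np.array([
    [[1, 2], [3, 3]],
    [[1, 1], [3, 2]],
    [[1, 2], [2, 3]],
    [[1, 2], [3, 3]],
])
phi = probability_maps(samples, n_classes=3)
print(phi[1])            # probability map of class 2
print(phi.sum(axis=0))   # per-voxel sum over all classes
```

Because every sample assigns each voxel to exactly one class, the per-voxel sum over classes equals one by construction.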
The key to exploring segmentation uncertainty for multi-class problems, which is not obvious in the binary case, is that a separate probability map exists for each class $i$. As a result, the class probabilities in each voxel $v$ must sum to unity:

$$\sum_{i \in C} \phi_{v,i} = 1. \tag{S2}$$

In the binary case ($n_c = 2$), summing Eq. (S1) over both classes $C = \{1, 2\}$ results in the simple identity $\phi_{v,1} = 1 - \phi_{v,2}$, as it must. For this reason, only one probability map is necessary to describe segmentation uncertainty in the binary case, as we have used throughout the main paper.

Let $\mathcal{A}$ represent a segmented multi-class image. Each voxel $v$ in the segmented image has a unique class label such that $\mathcal{A}_v \in C$, where $\mathcal{A}_v$ specifies a single voxel $v$ in this image segmentation. In a two-class image, $\mathcal{A}$ is a binarized image segmentation. The nominal image segmentation of a multi-class image ($\mathcal{N}$, Supplementary Figure 1) is made by assigning each voxel to the class for which the probability is highest,

$$\mathcal{N}_v = \underset{c \in C}{\arg\max}\; \phi_{v,c}, \tag{S3}$$

where $\mathcal{N}_v$ is simply the nominal segmentation for voxel $v$. The $\arg\max$ operator returns the value of $c$ when class $c$ has the highest probability of all classes in $C$. In the binary image case, this is equivalent to defining $\mathcal{N}_v$ by $\phi_{v,1} > 0.5$, as was done throughout the main manuscript. It is also useful to specify the nominal segmentation omitting class $c \in C$ from the $\arg\max$ operator:

$$\mathcal{N}^{c}_v = \underset{i \in C \setminus \{c\}}{\arg\max}\; \phi_{v,i}. \tag{S4}$$

We probe the uncertainty of this multi-class segmentation on a single class $c \in C$ at a time following Supplementary Algorithm 1. To generate a percentile segmentation where class $c$ is above the percentile threshold $\alpha$, first set all voxels where $\phi_{v,c} > \alpha$ to $c$. The remaining unassigned voxels are assigned to the most probable class other than $c$, $\mathcal{N}^{c}_v$, using Eq. (S4). The resulting image is the multi-class equivalent to the percentile segmentations of binary images in the main manuscript.
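The nominal segmentation described above can be sketched with a per-voxel arg max over the class axis. This is an illustrative sketch only: the function name `nominal_segmentation`, the array layout (one row per class, labels starting at 1), and the probability values are assumptions. The example also checks the stated binary equivalence with a 0.5 threshold on the class-1 map.

```python
import numpy as np


def nominal_segmentation(phi):
    """Assign each voxel to its most probable class (labels 1..n_classes).

    phi: array of shape (n_classes, *image_shape). Ties go to the lowest
    class label; that is NumPy's argmax convention, not a rule from the text.
    """
    return np.argmax(np.asarray(phi), axis=0) + 1


# Hypothetical binary probability maps for five voxels
# (row 0: class 1, row 1: class 2).
phi_binary = np.array([
    [0.9, 0.6, 0.45, 0.2, 0.0],
    [0.1, 0.4, 0.55, 0.8, 1.0],
])
nominal = nominal_segmentation(phi_binary)
# Binary shortcut: class 1 wherever its probability exceeds 0.5.
thresholded = np.where(phi_binary[0] > 0.5, 1, 2)

# Hypothetical three-class voxel whose most probable class is below 0.5.
phi_three = np.array([[0.4], [0.35], [0.25]])
nominal_three = nominal_segmentation(phi_three)
```

In the three-class voxel, class 1 is still the nominal label although its probability is only 0.4, which is why a fixed 0.5 threshold is specific to the binary case.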
We demonstrate Supplementary Algorithm 1 by probing the probability maps (Supplementary Figure 2(f-j)) of our manufactured multi-class image (Supplementary Figure 1). As in the main manuscript, we probe the uncertainty of each class at multiple probability values, in this case 10%, 50%, and 90%; the results are shown in the remaining three rows (Supplementary Figure 2(k-y)).
As expected, for higher percentile values the area of the probed class is smaller, indicating that fewer voxels have a high probability of being in that class, whereas for lower percentile values the area of that class is larger. For the high percentile values, the voxels no longer in class $c$ are assigned to the most probable other class, which is most frequently (but not necessarily) the nearest neighboring class. The boundary between any two classes that are not $c$ remains unchanged.
In the binary case, the nominal segmentation corresponds to the 50% probability values because of Eq. (S2). In contrast, selecting a percentile value of 50% from a multi-class probability map does not return the nominal segmentation. The explanation for this observation is simple. When $n_c > 2$, the maximal probability value for any given voxel near the boundary between classes is most likely less than 0.5. Therefore, the 50-percentile segmentation sets a higher probability threshold than the nominal segmentation, resulting in a smaller area for class $c$ than in the nominal segmentation.

The EQUIPS workflow allows for probing the segmentation uncertainty of a single class from a multi-class image at a time. One can imagine probing the segmentation uncertainty of multiple classes simultaneously, using an approach similar to the one proposed in this section. Doing this, however, would necessarily require a constraint on how the percentile values of the selected classes are chosen in order to satisfy Eq. (S2). Because it remains ambiguous to the authors how to properly constrain the percentile value selections, we omit further generalization of this approach.

Supplementary Algorithm 1 Probe class $c$ segmentation uncertainty
Require: Image $I$ segmented into $C = \{1, 2, \ldots, n_c\}$ classes and represented by $\mathcal{A}$, and percentile threshold value $\alpha$.
1: Generate probability maps $\phi_{v,i}$ for voxel $v$ and $i, c \in C$.
2: for all $v \in I$ do
3:   if $\phi_{v,c} > \alpha$ then
4:     $\mathcal{A}_v = c$.
5:   else
6:     $\mathcal{A}_v = \mathcal{N}^{c}_v$ using Eq. (S4).
7:   end if
8: end for
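The single-class probing procedure of Supplementary Algorithm 1 can be sketched in vectorized form as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `probe_class`, the array layout (one row per class, labels starting at 1), and the four-voxel probability values are hypothetical. Voxels whose class-$c$ probability exceeds the threshold are set to $c$; all others fall back to the most probable class other than $c$.

```python
import numpy as np


def probe_class(phi, c, alpha):
    """Percentile segmentation probing class c at threshold alpha.

    phi: array of shape (n_classes, *image_shape), summing to one per voxel.
    c: probed class label in 1..n_classes. alpha: threshold in [0, 1].
    """
    phi = np.asarray(phi, dtype=float)
    # Most probable class other than c: mask out class c, then take arg max.
    others = phi.copy()
    others[c - 1] = -np.inf
    fallback = np.argmax(others, axis=0) + 1
    # Voxels above the threshold become c; the rest take the fallback class.
    return np.where(phi[c - 1] > alpha, c, fallback)


# Hypothetical three-class probability maps for four voxels
# (rows: classes 1 to 3; each column sums to one).
phi = np.array([
    [0.9, 0.4,  0.2, 0.05],
    [0.1, 0.35, 0.5, 0.15],
    [0.0, 0.25, 0.3, 0.8],
])
nominal = np.argmax(phi, axis=0) + 1
segmentations = {a: probe_class(phi, c=1, alpha=a) for a in (0.1, 0.5, 0.9)}
# Number of voxels assigned to the probed class at each threshold.
areas = {a: int((seg == 1).sum()) for a, seg in segmentations.items()}
```

In this toy example the second voxel has a class-1 probability of 0.4, so it belongs to class 1 in the nominal segmentation but not in the 50% segmentation. Conversely, any voxel with a class-$c$ probability above 0.5 is necessarily nominal class $c$, since the remaining classes share less than 0.5 between them.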