Selected annotated instance segmentation sub-volumes from a large scale CT data-set of a historic aircraft

The Me 163 was a Second World War fighter airplane and is currently displayed in the Deutsches Museum in Munich, Germany. A complete computed tomography (CT) scan was obtained using a large scale industrial CT scanner to gain insights into its history, design, and state of preservation. The CT data enables visual examination of the airplane’s structural details across multiple scales, from the entire fuselage to individual sprockets and rivets. However, further processing requires instance segmentation of the CT data-set. Currently, there are no adequate computer-assisted tools for automated or semi-automated segmentation of such large scale CT airplane data. As a first step, an interactive data annotation process has been established. So far, seven 512 × 512 × 512 voxel sub-volumes of the Me 163 airplane have been annotated, which can potentially be used for various applications in digital heritage, non-destructive testing, or machine learning. This work describes the data acquisition process, outlines the interactive segmentation and post-processing, and discusses the challenges associated with interpreting and handling the annotated data.


Introduction
The Messerschmitt Me 163 (see Figure 1) was a German rocket-powered fighter airplane of the Second World War and was part of the secret developments of the German air force [19]. With its unique rocket engine, it was the first piloted aircraft to reach a speed of about 1000 km/h. Of the 350 Me 163s built between 1941 and 1945, only ten survive in museums, one of which is displayed in the historic aircraft exhibition of the 'Deutsches Museum' in Munich, Germany.
To gain new insights into the history, design and state of preservation of this unique and historical airplane, a complete CT scan was obtained using the XXL-computer tomography scanner of the Fraunhofer IIS' development center for X-ray technologies EZRT in Fürth, Germany [21], see Figure 2.
Besides viewing and examining the XXL-CT data in detail using adequate interactive 3D volume reader and viewer software [13], the airplane's many individual components, such as screws, wheels, sprockets and rivets, are also of interest.
To obtain more information about these parts, their distribution within the airplane, and their spatial and functional relationships to each other, an automated, semi-automatic, or purely manual instance segmentation of the CT data is necessary, partitioning all components and objects of interest into disjoint parts.
To this end, automated CT volume segmentation methods of varying complexity could be applied to obtain a set of segmented airplane parts. However, automatic and semi-automatic 3D image segmentation methods usually depend strongly on the availability of sufficient and adequate labelled reference data, which are needed for parameter optimisation or training as well as for the evaluation of the developed delineation approaches. As the Me 163 airplane is a unique object with partially very exceptional components, of which only one CT scan exists (also known as the lot-one problem), adequate automatic or semi-automatic segmentation methods that support the delineation of the airplane's different components are currently not available.
However, if, as a first step, some adequate labelled reference data from such an XXL-CT airplane scan were available, new segmentation methods, based either on traditional image processing or on deep-learning approaches (e.g. employing deep convolutional neural networks (DCNNs) [11,7,14]), could be developed and evaluated more efficiently. This holds especially as the performance of such DCNNs on vision tasks tends to increase logarithmically with the volume of training data [22]. Thus, in order to optimize an automated segmentation scheme, a large set of well-curated ground truth data is of utmost importance [5,17].
Hence, within this contribution, we provide historical background on the Me 163 and its current home in the 'Deutsches Museum' (Section 2), describe the data acquisition process using an XXL-CT scanner (Section 3), outline the interactive labelling and annotation process of some distinct sub-volumes of the airplane (Section 4), and discuss various challenges with respect to interpreting and handling the annotated and labelled data (Section 5). Furthermore, we introduce a matrix-based metric to compare two (manually or automatically) labelled segmentations (Section 6), which can handle erroneously split or merged segments as well as voxel overlap of the segments.
Seven of the sub-volumes, together with their manually obtained annotations, will be made publicly available [12] so that researchers in the fields of digital heritage, non-destructive testing, machine vision and artificial intelligence can visualize and interact with them as well as develop, train, optimise and evaluate novel 3D instance segmentation approaches.

The Me 163
The Messerschmitt Me 163 [1,2,19] in the historic aircraft collection and exhibition of the 'Deutsches Museum' (see Figure 1) is still a mysterious plane. The British Royal Air Force (RAF) gifted it to the museum in 1964, but since the ID plate in the nose is empty, not much is known about its operational history in the Second World War or its second life in Great Britain. After it was captured in 1945, the plane was modified for flight-testing by the RAF. When an accident with another Me 163 nearly killed a test pilot, the Me 163s were kept only as technological curiosities. Some were scrapped afterwards, some found their way into museums around the globe.
The British had realized that this alleged Nazi 'wonder weapon' was more of a danger to its own pilots than to Allied planes. Developed from innovative tailless gliders by Alexander Lippisch and fitted with a Walter HWK 109-509 rocket engine with 14.7 kN of thrust in 1941, the Me 163 reached exceptional speeds and climb rates. The small and light airframe with its thick, swept wings reached Mach 0.84 and could climb at up to 81 m/s. These achievements, however, came at a very high price: with no space for a retractable landing gear, the wheels were jettisoned after takeoff, often bouncing back and damaging the plane. The rocket fuel was depleted in just seven minutes, leaving very little time to reach the enemy. The armament was weak and unsuited to the purpose of the Me 163, namely intercepting heavy Allied bombers. When gliding back to base, pilots could evade attacking fighters thanks to the good maneuverability of the Me 163, only to be sitting ducks once they came to a halt on the landing skid. What really made the plane an unacceptable hazard for pilots and ground crew was its highly flammable rocket fuel: 'C-Stoff' and 'T-Stoff' (the latter 80 % hydrogen peroxide) exploded on contact, and fumes could dissolve any organic matter. Fatal accidents at take-off or landing were common.
The Me 163, as well as the so-called 'V1' and 'V2' weapons, embodies a widespread belief in innovative technology as a miraculous savior from vastly superior Allied air power. Forced laborers, willingly exploited by German industry by the hundreds of thousands, had to build many parts of the plane in murderous conditions. In the end, the approx. 350 Me 163s produced in total shot down only nine heavy Allied bombers between 1943 and 1945. In telling us about the hubris of its engineers as well as cultural aspects of technology, the Me 163 is a highly sought-after study object.

Data acquisition
The XXL-CT data-set of the historic Me 163 airplane was acquired at the XXL-CT facility of the Fraunhofer IIS' development center for X-ray technologies EZRT in Fürth, Germany [21]. To cover the complete airplane, four consecutive CT scans were performed, two for the fuselage (see Figure 2a) and two for the disassembled wings (see Figure 2b). Afterwards, the two sub-data-sets for the fuselage and the two sub-data-sets for the wings were each manually merged into one data-set (see Figure 3a left).
In total the four CT scans of the airplane parts needed approximately 17 days to complete.
To provide enough power to penetrate the airplane with X-rays, a linear accelerator X-ray source with 9 MeV was used. The distance between the X-ray source and the detector was set to d_SD = 12 m and the source-to-object distance to about d_SO = 10 m.
The use of a line detector with a width of w = 4 m and a pixel spacing of 400 µm results in a horizontal [...]
As expected, and as can be seen in Figure 3b as well as in the last column of Table 1, most of the airplane's reconstructed interior consists of empty space or air. Apart from that, the CT volumes mainly depict a plethora of thin metal sheets with poorly or barely visible edge transitions to the adjacent metal sheets.
For the many cases where two metal sheets abut, semantic information must be used to decide on the correct object boundaries between the entities. In addition, many regions in the XXL-CT volume are severely affected by artefacts from the data acquisition and reconstruction, such as beam hardening or scattered radiation, especially in the vicinity of solid, thick-walled metal structures.

Data Annotation
Even though manual data labelling is currently regarded as the 'gold standard' for unique complex image data [25], it requires considerable resources in terms of experienced staff and delineation time, even if specialized annotation pipelines (e.g. [16] and [9]) that allow image-processing-guided annotation, proofreading of inference results, and model refinement are applied.
Hence, to reduce the cost of experts needed for manual or interactive image labelling tasks, so-called 'crowd-sourcing' approaches have been proposed and partially established [6,20,3].
Nevertheless, to be effective, crowd-sourcing also depends strongly on specialized data management, annotation tools, and the soft skills of the annotators [24]. Besides the large organizational, legal and logistic overhead, one drawback of crowd-sourcing is the annotators' limited understanding of the annotation problem at hand and of the complex 3D data depicting the various objects.
As a compromise between expert annotation and crowd-sourcing, each individual sub-volume was initially annotated and labelled by a first annotator, and the resulting annotation was subsequently proofread and corrected by a second, experienced annotator.
The complete annotation of each of the first two 512³ sub-volumes needed about 350 working hours (approximately two months at 40 hours per week), as the first annotator was trained on these sub-volumes and they contained many segments compared to the later, emptier sub-volumes. The manual annotation of each subsequent sub-volume took about 10 % to 50 % of that time, mostly depending on the sub-volume complexity. The subsequent correction by different but trained annotators took about the same amount of time, or 4 to 120 hours per sub-volume.
The following Section 4.1 gives a brief overview of the annotated XXL-CT data, while the annotation pipeline is introduced in Section 4.2. Table 1 gives a brief overview of the depicted objects in the annotated sub-volumes. Even though the sub-volumes considered are located in the centre of the airplane (see Figure 3a), the 0-coordinates in the second column indicate that they are placed exactly at the border between the two sub-scans. The largest object in Table 1, a complex metal sheet, consists of 1,768,078 voxels (equivalent to approximately 116 cm³), while the smallest object, a rivet, contains only 158 voxels (equivalent to 0.1 cm³). Both of these objects are bounded by the side surfaces of their surrounding sub-volumes and actually extend beyond them into adjacent sub-volumes.
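Voxel counts and physical volumes are linked by the volume of a single voxel. As a rough cross-check, the following sketch derives that per-voxel volume from the figures quoted for the largest segment and applies it to an arbitrary voxel count. It assumes isotropic voxels; the function and variable names are illustrative and not part of the published data-set.

```python
# Sketch: convert between voxel counts and physical volume.
# Assumption: isotropic voxels. The per-voxel volume is estimated from the
# largest segment quoted in the text and is not an official scan parameter.

LARGEST_SEGMENT_VOXELS = 1_768_078
LARGEST_SEGMENT_CM3 = 116.0  # approximate value quoted in the text


def voxel_volume_mm3(voxel_count: int, volume_cm3: float) -> float:
    """Per-voxel volume in mm^3 implied by a voxel count and a volume in cm^3."""
    return volume_cm3 * 1000.0 / voxel_count


def segment_volume_cm3(voxel_count: int, per_voxel_mm3: float) -> float:
    """Physical volume in cm^3 of a segment with the given voxel count."""
    return voxel_count * per_voxel_mm3 / 1000.0


per_voxel = voxel_volume_mm3(LARGEST_SEGMENT_VOXELS, LARGEST_SEGMENT_CM3)
edge_mm = per_voxel ** (1.0 / 3.0)  # edge length of an isotropic voxel

# Example: physical size of a hypothetical 100-voxel segment.
small_segment_cm3 = segment_volume_cm3(100, per_voxel)

print(f"per-voxel volume: {per_voxel:.4f} mm^3 (edge {edge_mm:.2f} mm)")
print(f"100-voxel segment: {small_segment_cm3:.4f} cm^3")
```

Under these assumptions the estimate lands slightly below 0.07 mm³ per voxel, which corresponds to a voxel edge length of roughly 0.4 mm.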

Description of data
Over all annotated sub-volumes, approximately 93 % of all voxels are background, namely air, while only 7 % (or 62.6 million voxels) belong to the depicted objects. These comprise a total of 344 segments.
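The quoted shares follow directly from the sub-volume size. A minimal sketch of the arithmetic, using only the rounded figures given in the text (the variable names are illustrative):

```python
# Sketch: share of foreground (non-air) voxels over all annotated sub-volumes.
SUB_VOLUME_EDGE = 512            # voxels per edge, from the text
NUM_SUB_VOLUMES = 7              # annotated sub-volumes, from the text
FOREGROUND_VOXELS = 62_600_000   # rounded figure from the text
NUM_SEGMENTS = 344               # from the text

total_voxels = NUM_SUB_VOLUMES * SUB_VOLUME_EDGE ** 3
foreground_share = FOREGROUND_VOXELS / total_voxels
background_share = 1.0 - foreground_share
mean_segment_size = FOREGROUND_VOXELS / NUM_SEGMENTS

print(f"total voxels:       {total_voxels:,}")
print(f"foreground share:   {foreground_share:.1%}")
print(f"background share:   {background_share:.1%}")
print(f"mean segment size:  {mean_segment_size:,.0f} voxels")
```

The strong imbalance between background and foreground is worth keeping in mind when the data are used for training, since a classifier that predicts only air would already be correct for the vast majority of voxels.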

Annotation pipeline
The annotation process of the XXL-CT data depends, on the one hand, on the annotation software used. On the other hand, it is highly dependent on the annotation rules and guidelines provided to the annotators as well as on institutional knowledge that develops over time (see Section 4.2.3). Furthermore, some postprocessing options such as filtering, morphological operators, or data fusion must be considered (see Section 4.2.4).

Annotation Software
We used the application 3D Slicer [18,8] for most of the annotation.This software provides the annotator with different types of interactive annotation tools such as 'paint strokes', 'boolean operations' or gray value aware 'fill methods' to select individual voxels and voxel groups.Furthermore, it can easily be extended with new segmentation functions [15] and includes a powerful scripting interface.

Annotation Hardware
We used graphic tablets with digital styluses as input devices for the slice-by-slice manual annotation and labelling of the sub-volumes, as they allow easy and intuitive drawing. Compared to a mouse, this approach is more precise, more intuitive and, more importantly, gentler on the annotators' wrists [4]. Figure 6 shows a typical manual segmentation and labelling task on a sub-volume of the XXL-CT data using a graphics tablet.

Annotation guidelines
In our annotation guidelines, provided to all annotators, we stipulated that the 'human interpreted reality' of the data (based on the a-priori knowledge about the depicted objects) and not the 'perceived visual representation' should be segmented.For example, if scattered radiation artefacts were encountered, represented through bright or dark streaks through the volume or cupping artefacts from beam hardening, it was suggested to annotate the guessed real specimen and not the distorted image.This should increase the uniformity of the annotations since otherwise it is difficult to find the same thresholds over different volume regions and artifacts.The ultimate goal of the work is to develop methods to separate all components from each other in a meaningful way.This may not be achievable in some cases, e.g. if there is not enough data available.However, this can only be known when everything has been tried, for which a meaningful annotation of the desired ideal result has to be available.

Annotation Post-Processing
After the individual segments have been annotated, partly by hand and partly with automated tool support, they usually do not yet have the quality expected from a ground truth. Due to noise on the segment surfaces and voxels annotated as belonging to more than one segment, postprocessing is necessary.
Morphological Closing: The use of the previously mentioned bandpass filter to visually smooth the grey values sometimes yields grainy textures inside the segments (see example in Section 5.1). Due to this coarse-grained noise, we decided to postprocess the results of the manual annotation to close the gap between the quality of the manual annotation and the desired quality of the segmentation. Overall, the aim was to achieve semantically reasonable and at the same time visually pleasing segmentation results. For this purpose, the manual annotation of each segment was first postprocessed using a morphological closing filter [10] with a 3 × 3 × 3 structuring element. Figure 7 depicts two orthogonal slices from the manually segmented sub-volume V4 (3072,6144,0) before and after morphological processing. While most of the changes introduced by the postprocessing are simple alterations of surface voxels (see Figure 7c), they may also include changes to the surfaces of noisy metal sheets (see Figure 7f), which are prone to more pronounced changes due to their noisy nature.
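The closing step can be illustrated with a small, dependency-free sketch. It represents one segment as a set of voxel coordinates and applies a binary closing (dilation followed by erosion) with a full 3 × 3 × 3 structuring element. This is an illustration of the operation only, not the implementation used for the data-set; all names are hypothetical.

```python
from itertools import product

Voxel = tuple[int, int, int]

# All 27 offsets of a 3 x 3 x 3 structuring element (including the centre).
OFFSETS = list(product((-1, 0, 1), repeat=3))


def dilate(segment: set[Voxel]) -> set[Voxel]:
    """Binary dilation: every voxel within one chessboard step of the segment."""
    return {(z + dz, y + dy, x + dx)
            for (z, y, x) in segment
            for (dz, dy, dx) in OFFSETS}


def erode(segment: set[Voxel]) -> set[Voxel]:
    """Binary erosion: keep voxels whose full 3 x 3 x 3 neighbourhood is set."""
    return {(z, y, x)
            for (z, y, x) in segment
            if all((z + dz, y + dy, x + dx) in segment
                   for (dz, dy, dx) in OFFSETS)}


def close(segment: set[Voxel]) -> set[Voxel]:
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(segment))


# Toy segment: a solid 5 x 5 x 5 block with a one-voxel hole in the middle,
# standing in for grainy noise inside a metal sheet.
block = set(product(range(5), repeat=3))
noisy = block - {(2, 2, 2)}
closed = close(noisy)

print(len(noisy), len(closed))
```

Because the set representation has no volume border, the sketch sidesteps boundary handling. On a real 512³ array, the treatment of voxels at the sub-volume faces has to be chosen explicitly.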
Overlapping Entities: We annotated each entity in the sub-volume individually, slice by slice. In some rare cases, this resulted in voxels being annotated as belonging to multiple segments, for example where the spatial resolution of the reconstructed volume data (approximately 0.07 mm³ per voxel) was not sufficient to represent the exact border between two adjacent thin metal sheets. Such cases cannot always be represented in an annotation data-set with only voxel resolution, so the corresponding voxels were annotated as belonging to several segments.
After finishing the annotation and labelling of all depicted entities in a sub-volume V_i, all segmented entities were combined at the voxel level into one single volume. In this step, we allowed voxels of previously included segments to be overwritten. Such overwriting primarily occurs at the edges between two adjacent segments. Consequently, the order in which the segments are processed and fused partially influences the final segmentation result. Hence, the order of the fusion sequence was assigned pseudo-randomly.
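The fusion step can be sketched as follows. Segments are written into a common label map in a pseudo-random order, and a voxel claimed by several segments keeps the label written last. The fixed seed and all names are assumptions made for reproducibility of the example; the source does not state how the pseudo-random order was generated.

```python
import random

Voxel = tuple[int, int, int]


def fuse_segments(segments: dict[int, set[Voxel]],
                  seed: int = 0) -> tuple[dict[Voxel, int], list[int]]:
    """Merge per-segment voxel sets into one label map.

    Segments are written in a pseudo-random order; a voxel claimed by several
    segments ends up with the label of the segment written last.
    """
    order = sorted(segments)              # deterministic starting order
    random.Random(seed).shuffle(order)    # reproducible pseudo-random order
    label_map: dict[Voxel, int] = {}
    for label in order:
        for voxel in segments[label]:
            label_map[voxel] = label      # overwrite earlier labels
    return label_map, order


# Two adjacent toy sheets that both claim the voxel (0, 0, 1).
segments = {
    1: {(0, 0, 0), (0, 0, 1)},
    2: {(0, 0, 1), (0, 0, 2)},
}
label_map, order = fuse_segments(segments, seed=42)
shared_label = label_map[(0, 0, 1)]

print("fusion order:", order, "-> shared voxel gets label", shared_label)
```

Randomising the order avoids a systematic bias in which, for instance, segments annotated first would always lose their contested boundary voxels.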
Connected Component Analysis: Finally, we performed a connected component analysis with a chessboard metric (also known as Chebyshev distance or L∞ norm) [10] to find the separated chunks. This also allows for a simple fix of the challenges described in Section 5.3. Furthermore, to avoid over-segmentation, we discarded small segments with fewer than 100 voxels. The threshold of θ = 100 voxels was determined empirically.
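A minimal sketch of this step, assuming that a chessboard distance of 1 defines adjacency (i.e. the 26-neighbourhood in 3D): each labelled segment is split into its connected chunks by breadth-first search, and chunks with fewer than `min_voxels` voxels are dropped. The function names are illustrative, and a production pipeline would normally use an array-based labelling routine instead.

```python
from collections import deque
from itertools import product

Voxel = tuple[int, int, int]

# 26-neighbourhood: all voxels at chessboard (Chebyshev) distance 1.
NEIGHBOURS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def connected_components(voxels: set[Voxel]) -> list[set[Voxel]]:
    """Split a voxel set into 26-connected components via breadth-first search."""
    remaining = set(voxels)
    components = []
    while remaining:
        start = remaining.pop()
        component = {start}
        queue = deque([start])
        while queue:
            z, y, x = queue.popleft()
            for dz, dy, dx in NEIGHBOURS:
                neighbour = (z + dz, y + dy, x + dx)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def split_and_filter(segments: dict[int, set[Voxel]],
                     min_voxels: int = 100) -> list[set[Voxel]]:
    """Split every segment into components and drop those below min_voxels."""
    kept = []
    for label in sorted(segments):
        for component in connected_components(segments[label]):
            if len(component) >= min_voxels:
                kept.append(component)
    return kept


# Toy segment: a 5 x 5 x 4 chunk (100 voxels) plus a detached 3-voxel crumb.
chunk = set(product(range(5), range(5), range(4)))
crumb = {(20, 20, 20), (20, 20, 21), (21, 21, 22)}
kept = split_and_filter({1: chunk | crumb})

print(len(kept), "component(s) kept")
```

With the chessboard metric, voxels touching only at a corner still count as connected, which is why the diagonal step inside the crumb does not split it further.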

Challenges
In the following, we discuss some characteristics of the above-introduced data-set and challenges regarding its annotation and labelling. Both the XXL-CT imaging and the labelling steps introduce ambiguities with respect to the data. To this end, an example from sub-volume V6 (3072,7168,0), see Figure 8, is used as representative for the corresponding categories of challenges, namely noise (see Section 5.1), low-contrast segments (see Section 5.2), segments leaving and re-entering the sub-volume (see Section 5.3), and annotator noise (see Section 5.4). However, these categories are only exemplary and not to be understood as comprehensive.

Noise
Figure 8 shows at location (a) a region in which three parallel thin metal sheets are visible. Figure 9a depicts an enlarged version, in which it can be observed that the three metal plates are interspersed with coarse-grained noise. Figure 9b provides the naive annotation strictly based on the visible grey values, leading to a result permeated by granular noise. However, using the a-priori knowledge that the displayed metal components do not consist of sponge-like porous material, and that the coarse-grained texture is instead due to measurement or reconstruction artefacts, the annotation is modified using morphological closing (see above) as a postprocessing step, yielding the desired result shown in Figure 9c.

Figure 9: Three parallel metal plates with high noise in the reconstruction. Figure 9a: enlarged section from sub-volume V6 (3072,7168,0) (see Figure 8 (a)). The grainy texture is due to the low data quality and should therefore not be included in the annotation. Figure 9b: result of naive segmentation; Figure 9c: desired segmentation after morphological closing.

Low to no contrast between segments
Figure 8 at location (b), as well as the zoomed-in area in Figure 10a, shows a region in which the bright object components to be annotated have no appreciable grey value or texture contrast to each other. Figure 10b shows a possible annotation in which the presumed bolt or screw (depicted in orange) runs through the nut (in light green). Figure 10c provides the grey value plot along the green dashed line in Figure 10a. The coloured backgrounds refer to the annotation, see Figure 10b. This annotation cannot be justified by the existing grey values and textures alone but must be made by examining neighbouring similar structures and by drawing on knowledge or assumptions about the production process. Another example of such low-contrast segment boundaries between adjacent metal sheets is shown in the field of view in Figure 8 (c) or in Figure 11. Figure 11b depicts a possible manual segmentation of the two metal sheets and the rivets in the region. Figure 11c shows a grey value profile along the green dashed line shown in Figure 11b, together with its possible segmentation as a coloured background. As before, the course of the segment boundaries can only be argued using a-priori knowledge from the surrounding segments and layers.
Finally, Figure 12 shows a similar case from sub-volume V3. Here, a rivet penetrates three adjacent metal sheets. Due to similar material densities and the large and evenly shaped contact surface, the transition between the rivet and the sheet metal cannot be discerned clearly.

Re-entering segments
Some components in the volumetric data leave the visible area of the current sub-volume and reappear as disconnected segments at a different location of the same sub-volume (see Figure 13). Here the component of interest, a helical wire support structure, probably for a suction hose, is located in the upper left corner of a sub-volume; see Figure 13a for an overview and Figure 13b for an enlarged view. Without any further semantic information, the individual coils appear to be thirteen separate segments. Figure 13c depicts the result of a human segmentation of these entities. Figure 13d provides the final annotation result after applying a connected component analysis, where no correspondences and connections among the thirteen entities have been found.
However, without additional semantic information about the course of the entities outside the sub-volume, it must be assumed that these segments are most likely separated from each other. For this reason, we performed a connected component analysis on the hand-annotated data-set and separated these segments as they leave and re-enter the sub-volume.

Figure 11: Example of low to no contrast entities. Figure 11a shows a slice from sub-volume V6 (3072,7168,0) (see Figure 8 (c)) containing two metal sheets riveted together. No appreciable grey value and texture differences between the two components can be determined. Figure 11b: possible manual annotation of the left (light green) and right (orange) metal sheets.

Figure 12: Example of an entity with low to no contrast. Figure 12a: slice from sub-volume V3 (3072,5632,0) presumably depicting three metal sheets riveted together. No appreciable grey value or texture differences between the components can be determined.

Annotator noise
Limited knowledge of the actual ground truth often leads to severe annotator noise [23,26], which can often be observed within vast and difficult-to-label data sets. Different annotators will have inconsistent knowledge of the problem domain, are possibly fatigued, subconsciously introduce their own bias into the annotation output, or will label multiple parts differently. Thus, the annotation obtained from a specific annotator, or from a fusion of several annotations, should be understood as only one possible annotation.
Figure 14 shows a small region (Figure 14a (d)) of sub-volume V6 (3072,7168,0) which has been annotated by two different annotators (see Figures 14b and 14c). Additionally, the difference volume between the two annotations is depicted in Figure 14. As can be seen, most of the metal sheets diverge only in some surface voxels, whereas the rivet was annotated quite differently by each annotator.

Comparison Metric
The example of a segment correlation matrix depicted in Figure 16 shows how well the results of two different segmentations provided by two different annotators may match. In this case, the set of reference segments S_R was initially generated by one annotator and is depicted on the vertical axis of the matrix in Figure 16. The set of detected segments S_D was created by a second annotator, refining the first segmentation with our current understanding of the data-set. This set is depicted on the horizontal axis of the matrix in Figure 16.
Each row is assigned to one reference segment S_R(i) and each column is assigned to one detected segment S_D(j). The value or colour of each cell corresponds to the Intersection over Union (IoU) score (also known as the Jaccard index) of the two segments S_R(i) and S_D(j): If these two segments overlap completely (meaning that their segmentations match exactly), the IoU is equal to 1.0. If the two compared segments do not share at least one common voxel, the IoU is 0.0. All other overlap scenarios are mapped to values in the open interval IoU ∈ (0, 1).
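The cell values can be computed as in the following sketch, which treats every segment as a set of voxel coordinates. It is a minimal illustration of the IoU definition above, not the evaluation code used for the figures; the names are hypothetical.

```python
Voxel = tuple[int, int, int]


def iou(a: set[Voxel], b: set[Voxel]) -> float:
    """Intersection over Union (Jaccard index) of two voxel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def iou_matrix(reference: list[set[Voxel]],
               detected: list[set[Voxel]]) -> list[list[float]]:
    """Rows follow the reference segments, columns the detected segments."""
    return [[iou(r, d) for d in detected] for r in reference]


# Toy segments along one axis.
ref = [{(0, 0, x) for x in range(4)}]       # voxels x = 0..3
det = [
    {(0, 0, x) for x in range(4)},          # identical to the reference
    {(0, 0, x) for x in range(2, 6)},       # shares 2 of 6 voxels in the union
    {(0, 0, x) for x in range(10, 14)},     # disjoint from the reference
]
matrix = iou_matrix(ref, det)

print([round(v, 3) for v in matrix[0]])
```

The three toy columns cover the three cases named in the text: complete overlap, partial overlap, and no common voxel.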
The rows in the matrix are sorted in descending order by the count of voxels of their corresponding reference segments.Consequently, the top rows correspond to the largest segments and the bottom rows to the smallest segments.The columns have been sorted by searching for the detected segment with the best match, or highest IoU to the reference segment of the current row.Each detected segment can only be assigned to a single reference segment.Detected segments unmatched to a reference segment are sorted by their voxel count.We excluded segments with a voxel count of fewer than 100 voxels to reduce the size of the matrix.
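The ordering rules can be sketched as a greedy assignment. Rows are sorted by descending voxel count, each row then claims the still-unassigned detected segment with the highest non-zero IoU, and the remaining detected segments are appended by size. The descending order for unmatched segments and the alphabetical tie-breaking are assumptions of this sketch, as the text does not specify them; voxels are represented by plain integers for brevity.

```python
def iou(a: set, b: set) -> float:
    """Intersection over Union of two voxel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def order_segments(reference: dict[str, set], detected: dict[str, set],
                   min_voxels: int = 100) -> tuple[list[str], list[str]]:
    """Return row (reference) and column (detected) orderings for the matrix."""
    ref = {k: v for k, v in reference.items() if len(v) >= min_voxels}
    det = {k: v for k, v in detected.items() if len(v) >= min_voxels}

    # Rows: largest reference segment first.
    rows = sorted(ref, key=lambda k: len(ref[k]), reverse=True)

    # Columns: best still-unassigned detected segment for each row in turn.
    columns: list[str] = []
    unassigned = set(det)
    for r in rows:
        scores = {d: iou(ref[r], det[d]) for d in sorted(unassigned)}
        best = max(scores, key=scores.get, default=None)
        if best is not None and scores[best] > 0.0:
            columns.append(best)
            unassigned.remove(best)

    # Unmatched detected segments follow, largest first (assumed direction).
    columns += sorted(sorted(unassigned), key=lambda k: len(det[k]),
                      reverse=True)
    return rows, columns


# Toy example with integer voxel ids and a lowered size threshold.
reference = {"sheet": set(range(10)), "rivet": set(range(20, 23))}
detected = {"a": set(range(20, 23)), "b": set(range(0, 8)),
            "c": set(range(30, 35))}
rows, columns = order_segments(reference, detected, min_voxels=1)

print(rows, columns)
```

Because the assignment is greedy, a large reference segment processed early can claim a detected segment that would have matched a later, smaller reference segment better. A globally optimal assignment would require a different matching scheme.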
Hence, a perfect segmentation S with respect to a reference segmentation R should be reflected by a square correlation matrix with the same number of rows and columns, and thus the same number of reference and detected segments. Additionally, all cells outside the main diagonal should have IoU = 0.0, while all cells on the main diagonal should have IoU = 1.0. However, in realistic application examples, the row and column counts will differ. Usually, an over-segmentation will result in more columns than rows. Boundary errors will result in suboptimal correlation values. Rows with multiple non-zero values denote either an over-segmentation in the detected segmentation or a reference segment that has accidentally been split into multiple segments. In contrast, vertical lines indicate detected segments that span, and thus merge, multiple reference segments. Breaks in the diagonal line indicate reference segments without a good match among the detected segments.
Figure 15 shows an example result of the manual annotation of a sub-volume compared to the postprocessed version of the same sub-volume. The desired bright diagonal line from the top left to the bottom right is pronounced, indicating that most of the reference segments (prior to postprocessing) could be assigned to the detected segments (after postprocessing). The scattered purple cells, mostly located in the top third of the matrix, signal that some voxels of the manually annotated segments overlap multiple postprocessed segments and are assigned to them. This often happens if the surface of a manual segmentation created using the bandpass selection gets smoothed by the postprocessing (see Section 4.2.4).
Figure 16 shows the correlation matrix between the two manually annotated versions of sub-volume V4 (3072,6144,0), which were created by two different annotators. It can be seen that the annotated segments of the two annotators, especially the smaller segments, mostly match. As the two more or less pronounced vertical lines towards the left side of the matrix indicate, most segments annotated by the first annotator lose voxels, most likely surface voxels, to the bigger segments segmented by the second annotator. The gap in the diagonal line almost at the centre of the matrix corresponds to a rivet that was annotated as much sturdier by the first annotator than by the second.
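The row and column patterns described above can also be detected programmatically. The following sketch flags rows and columns of an IoU matrix that contain more than one entry above a threshold; the threshold parameter and the function name are assumptions made for illustration.

```python
def find_splits_and_merges(matrix: list[list[float]],
                           threshold: float = 0.0):
    """Flag rows and columns with more than one overlap above the threshold.

    A row with several hits is a reference segment spread over several
    detected segments (split); a column with several hits is a detected
    segment covering several reference segments (merge).
    """
    n_cols = len(matrix[0]) if matrix else 0
    split_rows = [i for i, row in enumerate(matrix)
                  if sum(v > threshold for v in row) > 1]
    merged_cols = [j for j in range(n_cols)
                   if sum(row[j] > threshold for row in matrix) > 1]
    return split_rows, merged_cols


# Toy matrix: reference 0 is split over detected 0 and 1;
# detected 2 merges references 1 and 2.
matrix = [
    [0.6, 0.4, 0.0],
    [0.0, 0.0, 0.5],
    [0.0, 0.0, 0.3],
]
split_rows, merged_cols = find_splits_and_merges(matrix)

print("split reference rows:", split_rows)
print("merging detected columns:", merged_cols)
```

A threshold slightly above zero can be useful in practice, since a few shared surface voxels between neighbouring segments would otherwise be reported as splits or merges.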

Discussion and Conclusions
In this work, we presented a data collection of seven manually annotated sub-volumes obtained from an XXL-CT data-set of a historic airplane. These sub-volumes can potentially serve as a novel benchmark data collection for instance segmentation in the field of non-destructive testing using XXL-CT sub-volumes. To our knowledge, similar public data sets from XXL-CT are not available at this point in time.
For the complete XXL-CT volume data, we described the acquisition and measurement procedures as well as its further processing. We described how and according to which criteria the seven sub-volumes were annotated and labelled manually by various annotators, including a description and discussion of challenges regarding possible ambiguities contained in the data-set.
We would like to note that although we have taken great care to annotate the sub-volumes to the best of our knowledge and belief, we may still have made mistakes. Some regions of the data-set simply cannot be clearly annotated due to the quality of the data and the recording modality.
All reconstructed and labelled sub-volumes are available under [12]. We hope that the provided data sets are useful for further research.

[...] built all necessary interfaces and integrated it in the segmentation workflow. He drafted and wrote the manuscript including the graphics.
NR, together with MB, TF and MS, organized and performed the XXL scan of the airplane and provided the background information in the paper about the scanning process and setups as well as the scanning parameters.
AH, together with NS, organized the museum logistics of the airplane scanning and provided the historical background and contextual setting in the paper.
SG was involved in the idea and conception of the XXL-data preparation and annotation, as well as in the planning, conception and proofreading of the paper.
MB, together with NR, TF and MS, organized and performed the XXL scan of the airplane.
TF, together with NR, MB and MS, organized and performed the XXL scan of the airplane.
MS, together with NR, MB and TF, organized, performed, and supervised the XXL scan of the airplane.
TW, together with RG and SG, provided the idea, planned and conceived the paper, wrote the introduction and the setting, and did the final proofreading and editing.

Figure 1: Image of the Messerschmitt Me 163 in the historic aircraft exhibition of the 'Deutsches Museum'

Figure 2: Fuselage (Figure 2a) and wings (Figure 2b) of the Me 163 airplane inside the mounting brackets for the CT scan.

Figure 3: Rendering of the reconstructed fuselage of the scanned airplane (Figure 3a) and detail (Figure 3b) located at approximately the midpoint of the fuselage between the nose and the tail of the airplane.

Figure 4 provides examples of 3D-renderings of annotated and labelled sub-volumes. Each sub-volume contains between 5 and 172 individual object entities of various sizes, materials, and types.

Figure 4: Examples of 3D-renderings of manually annotated and labelled sub-volumes from the XXL-Scan of the Me 163, depicting various semantic objects of different types, shapes, and materials.

Figure 5: Example renderings of sub-volume V 6 (3072,7168,0). While Figure 5a shows the unannotated volume, Figure 5b depicts all labelled segments separated by colour. To increase clarity, only the segments of a specific category are shown in the following sub-figures: Figure 5c provides all metal sheets; Figure 5d gives the presumably pressure-carrying pipes, pressure tanks and lines; Figure 5e contains all rivets and screw connections; finally, Figure 5f shows all brackets, clamp connectors and other miscellaneous transition elements that could not otherwise be assigned a category.

Figure 6: Manual labelling of large-scale industrial CT data of an airplane part using a high-resolution graphics tablet and a digital pen (composite image to illustrate the process).

Figure 7: Slices from sub-volume V 4 (3072,6144,0) depicting typical changes introduced by morphological postprocessing. Figures 7a and 7d (1st column) show the manually annotated input volumes. Figures 7b and 7e (2nd column) depict the morphologically postprocessed output. Finally, Figures 7c and 7f (3rd column) show the difference between the input and output volumes. In the upper row, the changes introduced by the postprocessing consist mainly of small voxel alterations on the surface of a thin metal sheet. The bottom row depicts the changes close to the surface of the orange metal sheet located at the bottom of the upper row of images. This metal sheet appears to be quite noisy and is therefore prone to the more pronounced changes visible in the residual in Figure 7f.
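The input, output and residual columns described in this caption can be mimicked on synthetic data. The sketch below is an illustration only: the morphological operators are not specified here, so a per-label binary opening followed by a binary closing with SciPy's default cross-shaped structuring element is assumed, and the function name `postprocess_labels` is hypothetical.

```python
import numpy as np
from scipy import ndimage


def postprocess_labels(labels: np.ndarray) -> np.ndarray:
    """Apply opening then closing to every label mask (assumed operators)."""
    out = np.zeros_like(labels)
    for seg_id in np.unique(labels):
        if seg_id == 0:  # 0 is assumed to be background
            continue
        mask = labels == seg_id
        cleaned = ndimage.binary_closing(ndimage.binary_opening(mask))
        # Do not overwrite voxels already claimed by an earlier label.
        out[cleaned & (out == 0)] = seg_id
    return out


# Toy annotation: one solid block plus a single isolated "noise" voxel.
labels = np.zeros((16, 16, 16), dtype=np.uint8)
labels[4:12, 4:12, 4:12] = 1
labels[1, 1, 1] = 1

processed = postprocess_labels(labels)
residual = labels != processed  # voxels changed by the postprocessing
```

In this toy case the isolated voxel disappears and only voxels on the surface of the block change, which corresponds qualitatively to the small surface alterations shown in the residual images.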

Figure 10: Example of low to no contrast entities. Figure 10a: slice from sub-volume V 6 (3072,7168,0) (see Figure 8 (b)) presumably showing a screw and its corresponding nut. No appreciable grey value or texture differences between the two components can be determined. Figure 10b: possible semantic annotation with an orange screw and a light green nut inside a blue structure. Figure 10c: grey value profile along the green dashed section marked in Figure 10a, where the background colours indicate the possible annotation into semantic segments.
Figure 11: Example of low to no contrast entities. Figure 11a shows a slice from sub-volume V 6 (3072,7168,0) (see Figure 8 (c)) containing two metal sheets riveted together. No appreciable grey value or texture differences between the two components can be determined. Figure 11b: possible manual annotation of the left (light green) and right (orange) metal sheets. Figure 11c: grey value profile along the green dashed section marked in Figure 11a. The background colours indicate the possible annotation into semantic segments.
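A grey value profile of the kind plotted in Figures 10c and 11c can be obtained by sampling a slice along a straight section. The following is a minimal sketch, assuming the slice is available as a 2D array indexed as (row, column); the helper `grey_value_profile` and the synthetic slice are illustrative and not taken from the data-set.

```python
import numpy as np
from scipy import ndimage


def grey_value_profile(slice_2d, start, end, num=None):
    """Sample grey values along the straight section from start to end.

    start and end are (row, column) positions; values are interpolated
    linearly between pixel centres.
    """
    (r0, c0), (r1, c1) = start, end
    if num is None:
        num = int(np.hypot(r1 - r0, c1 - c0)) + 1
    rows = np.linspace(r0, r1, num)
    cols = np.linspace(c0, c1, num)
    return ndimage.map_coordinates(slice_2d.astype(float), [rows, cols], order=1)


# Synthetic slice: two touching "components" with identical grey value,
# so the profile contains no step at their common boundary (column 15).
slice_2d = np.zeros((32, 32))
slice_2d[:, 10:15] = 200.0  # first component
slice_2d[:, 15:20] = 200.0  # second component

profile = grey_value_profile(slice_2d, (16, 0), (16, 31))
```

In the synthetic case the profile is flat across the boundary between the two components, which is the situation the captions describe: the grey values alone give no indication of where one part ends and the next begins.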

Figure 12: Example of an entity with low to no contrast. Figure 12a: slice from sub-volume V 3 (3072,5632,0) presumably depicting three metal sheets riveted together. No appreciable grey value or texture differences between the components can be determined. Figure 12b: possible (assumed) annotation of the regions, taking the surrounding topology into account.

Figure 13: Example of a slice (from sub-volume V 6 (3072,7168,0)) depicting a component which is not fully contained in the current sub-volume. Figures 13a and 13b: helical wire support structure. Without further information, the individual coils appear to be thirteen separate segments; Figure 13c: the result of human segmentation; Figure 13d: annotation result after connected component analysis, where no correspondences among the entities have been found.
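The effect described in this caption can be reproduced on synthetic data: a helix that is a single connected object in the full volume falls apart into several components once only part of it lies inside a sub-volume. The sketch below is an illustration under stated assumptions; the helix geometry, the crop position and the use of 26-connectivity are chosen freely and do not correspond to the actual scan.

```python
import numpy as np
from scipy import ndimage

FULL = np.ones((3, 3, 3), dtype=bool)  # 26-connectivity in 3D


def synthetic_helix(shape=(48, 48, 48), radius=12.0, pitch=8.0, turns=4):
    """Rasterise a helix along the z axis into a boolean (z, y, x) volume."""
    vol = np.zeros(shape, dtype=bool)
    t = np.linspace(0.0, 2.0 * np.pi * turns, 4000)
    z = np.rint(6 + pitch * t / (2.0 * np.pi)).astype(int)
    y = np.rint(shape[1] / 2 + radius * np.sin(t)).astype(int)
    x = np.rint(shape[2] / 2 + radius * np.cos(t)).astype(int)
    vol[z, y, x] = True
    # Thicken the wire by one voxel in every direction.
    return ndimage.binary_dilation(vol, structure=FULL)


helix = synthetic_helix()
_, n_full = ndimage.label(helix, structure=FULL)

# Crop so that only one side of every coil remains inside the sub-volume.
sub_volume = helix[:, :, 28:]
_, n_sub = ndimage.label(sub_volume, structure=FULL)
```

Here `n_full` is 1, whereas `n_sub` is larger than 1, because the connecting parts of the coils lie outside the crop. Connected component analysis alone therefore cannot recover that the pieces belong to one object.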

Figure 14: Example of a small region (from Figure 14a (d)) depicting multiple metal sheets riveted together, annotated by two different annotators (Figures 14b and 14c). The differences between the two annotations are shown in Figure 14d.

Figure 15: Correlation matrix of the segmentation of sub-volume V 4 (3072,6144,0) before and after the postprocessing. Rows correspond to reference segments, here the manual annotation (see Figure 9b), which are sorted top to bottom by decreasing voxel count of the segments. The columns correspond to the detected segments, here the postprocessed segments (see Figure 9c), which are sorted by the maximum IoU to a reference segment.
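A matrix of this kind can be assembled from two label volumes by computing the intersection over union (IoU) for every pair of reference and detected segments. The following is a minimal sketch under stated assumptions: 0 is taken as background, the rows are ordered by decreasing voxel count as described, and the column ordering is interpreted as sorting each detected segment by the row of its best-matching reference segment, which is one possible reading of the caption. The function name `iou_matrix` is hypothetical.

```python
import numpy as np


def iou_matrix(reference: np.ndarray, detected: np.ndarray):
    """Return (ref_ids, det_ids, iou) with rows = reference, columns = detected."""
    ref_ids = np.unique(reference)
    ref_ids = ref_ids[ref_ids != 0]
    det_ids = np.unique(detected)
    det_ids = det_ids[det_ids != 0]

    # Rows: reference segments sorted by decreasing voxel count.
    ref_sizes = np.array([(reference == r).sum() for r in ref_ids])
    ref_ids = ref_ids[np.argsort(-ref_sizes, kind="stable")]

    iou = np.zeros((len(ref_ids), len(det_ids)))
    for i, r in enumerate(ref_ids):
        ref_mask = reference == r
        for j, d in enumerate(det_ids):
            det_mask = detected == d
            inter = np.logical_and(ref_mask, det_mask).sum()
            union = np.logical_or(ref_mask, det_mask).sum()
            iou[i, j] = inter / union

    # Columns: ordered by the row index of the best-matching reference
    # segment (assumed interpretation of the sorting).
    col_order = np.argsort(iou.argmax(axis=0), kind="stable")
    return ref_ids, det_ids[col_order], iou[:, col_order]


# Toy example with two reference and two detected segments.
reference = np.array([1, 1, 1, 1, 2, 2, 0, 0])
detected = np.array([7, 7, 7, 0, 3, 3, 3, 0])
ref_ids, det_ids, iou = iou_matrix(reference, detected)
```

With this ordering, well-matched segment pairs line up along the diagonal, so off-diagonal entries point to split, merged or shifted segments. The double loop is kept for readability; for volumes with many segments a contingency table over label pairs would be considerably faster.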

Figure 16: Correlation matrix of the segmentation results for sub-volume V 4 (3072,6144,0), annotated by two different annotators. The rows correspond to reference segments, here the first initial annotation. The columns correspond to detected segments, here the second refined annotation, which are sorted by the maximum IoU to a reference segment.

Table 1: Key metrics of the annotated sub-volumes depicted in Figure 4.