A reusable neural network pipeline for unidirectional fiber segmentation

Fiber-reinforced ceramic-matrix composites are advanced, temperature-resistant materials with applications in aerospace engineering. Their analysis involves detecting and separating the fibers, embedded in a fiber bed, in an imaged sample. Currently, this is mostly done using semi-supervised techniques. Here, we present an open, automated computational pipeline to detect fibers in a tomographically reconstructed X-ray volume. We apply our pipeline to a non-trivial dataset by Larson et al. To separate the fibers in these samples, we tested four different convolutional neural network architectures. When comparing our neural network approach to a semi-supervised one, we obtained Dice and Matthews coefficients reaching up to 98%, showing that these automated approaches can match human-supervised methods, in some cases separating fibers that human-curated algorithms could not find. The software written for this project is open source, released under a permissive license, and can be freely adapted and re-used in other domains.


Introduction
Fiber-reinforced ceramic-matrix composites are advanced materials used in aerospace gas-turbine engines 1,2 and nuclear fusion 3 , owing to their resistance to temperatures 100-200 °C higher than those tolerated by the alloys used in the same applications.
Larson et al. investigated new manufacturing processes for curing preceramic polymer into unidirectional fiber beds, studying the microstructure evolution during matrix impregnation with the aim of reinforcing ceramic-matrix composites 4,5 . They used X-ray computed tomography (CT) to characterize the three-dimensional microstructure of their composites non-destructively, studying their evolution in-situ while processing the materials at high temperatures 4 and describing overall fiber bed properties and microstructures of unidirectional composites 5 . The X-ray CT images acquired from these fiber beds are available at Materials Data Facility 6 .
Larson et al.'s fiber beds have widths of approximately 1.5 mm and contain 5000-6200 fibers per stack. Each fiber has an average radius of 6.4 ± 0.9 μm, with diameters ranging from 13 to 20 pixels in the micrographs 5 . They present semi-supervised techniques to separate the fibers within the fiber beds; their segmentation is available for five samples 7 . We were curious to see whether their results could be improved using different techniques.
In this study we separate fibers in ex-situ X-ray CT fiber beds of nine samples from Larson et al. Our paper makes the following contributions:
• It annotates, explains, and expands Larson et al.'s dataset 7 to facilitate reproducible research and benchmarking.
• It provides open source tools to analyze such datasets, so that researchers may compare their results with ours and one another's.
• It shows that automated analysis can perform similarly to or better than human-steered fiber segmentations.
The samples we used in this study correspond to two general states: wet (obtained after pressure removal) and cured. These samples were acquired using microtomographic instruments at the Advanced Light Source at Lawrence Berkeley National Laboratory, operated in a low-flux, two-bunch mode 5 . We used their reconstructions obtained without phase retrieval; Larson et al. provide segmentations for five of these samples 7 , which we compare to our results.
To separate the fibers in these samples, we tested four different fully convolutional neural networks (CNNs), algorithms from computer vision and deep learning. When comparing our neural network approach to Larson et al.'s results, we obtained Dice 8 and Matthews 9 coefficients greater than 92.28 ± 9.65%, reaching up to 98.42 ± 0.03%, showing that the network results are close to the human-supervised ones in these fiber beds, in some cases separating fibers that the algorithms created by Larson et al. 5 could not find. All software and data generated in this study are available for download, along with instructions for their use. The code is open source, released under a permissive software license, and can easily be adapted to other domains.

Results
Larson et al. provide segmentations for their fibers (Fig. 1) in five of the wet and cured samples, obtained using the following pipeline 5 :
1. Fiber detection using the circular Hough transform 10,11 ;
2. Correction of improperly identified pixels using filters based on connected region size and pixel value, and by comparisons using ten slices above and below the slice of interest;
3. Separation of fibers using the watershed algorithm 12 .
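Larson et al.'s implementation is not public; purely as an illustration of step 1 above (fiber detection via the circular Hough transform), the following scikit-image sketch detects synthetic "fibers" on a toy slice. All parameter values here are our own assumptions, not theirs.

```python
import numpy as np
from skimage.draw import disk
from skimage.feature import canny
from skimage.transform import hough_circle, hough_circle_peaks

# Synthetic slice: two bright "fibers" of radius ~8 px on a dark background.
image = np.zeros((120, 120))
for center in [(40, 40), (80, 85)]:
    rr, cc = disk(center, 8)
    image[rr, cc] = 1.0

# Edge map, then a Hough search over the expected range of fiber radii
# (13-20 px diameters in the micrographs -> roughly 6-10 px radii).
edges = canny(image, sigma=1.0)
radii = np.arange(6, 11)
accumulator = hough_circle(edges, radii)
_, cx, cy, found_radii = hough_circle_peaks(
    accumulator, radii, min_xdistance=20, min_ydistance=20, total_num_peaks=2)
```

Each returned (cx, cy, radius) triple is a detected circle center and its radius; on real slices the subsequent filtering and watershed steps would refine these candidates.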
Their paper gives a high-level overview of these steps, but provides neither the parameters used nor the source code for computing their segmentation. We tried different approaches to reproduce their results, focusing on separating the fibers in the fiber bed samples. Our first approach was to create a classic, unsupervised image processing pipeline. We used histogram equalization 13 , Chambolle's total variation denoising 14,15 , multi-Otsu thresholding 16,17 , and the WUSEM algorithm 18 to separate individual fibers. The result is a labeled image containing the separated fibers (Fig. 2). The pipeline had limitations when processing fibers on the edges of fiber beds, where its labels differed from those produced by Larson et al. Restricting the segmentation region to the center of the beds gives satisfactory results (Fig. 2(e)), but reduces the total number of detected fibers.
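A minimal sketch of such a classic pipeline, assuming scikit-image and SciPy, with an ordinary marker-based watershed standing in for WUSEM (the function name and parameter values are illustrative only):

```python
import numpy as np
from scipy import ndimage as ndi
from skimage import exposure, feature, filters, restoration, segmentation

def classic_pipeline(image, min_distance=7):
    """Label fibers in a 2D slice with a classic, unsupervised pipeline."""
    # Contrast equalization followed by total-variation denoising.
    equalized = exposure.equalize_hist(image)
    denoised = restoration.denoise_tv_chambolle(equalized, weight=0.1)
    # Multi-Otsu thresholding; keep the brightest class as fiber candidates.
    thresholds = filters.threshold_multiotsu(denoised, classes=3)
    binary = denoised > thresholds[-1]
    # Marker-based watershed on the distance transform (stand-in for WUSEM).
    distance = ndi.distance_transform_edt(binary)
    coords = feature.peak_local_max(distance, min_distance=min_distance,
                                    labels=binary)
    markers = np.zeros(binary.shape, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, coords.shape[0] + 1)
    return segmentation.watershed(-distance, markers, mask=binary)
```

The returned array assigns one integer label per separated fiber, matching the labeled-image output described above.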
To obtain more robust results, we evaluated four fully convolutional neural network architectures: Tiramisu 19 and U-Net 20 , as well as their three-dimensional counterparts, 3D Tiramisu and 3D U-Net 21 . We also investigated whether three-dimensional networks generate better segmentation results, leveraging the structure of the material.
Fully convolutional neural networks (CNNs) for fiber detection. We implemented four architectures of fully convolutional neural networks (CNNs), Tiramisu, U-Net, 3D Tiramisu, and 3D U-Net, to reproduce the results provided by Larson et al. Labeled data, in our case, consists of fibers within fiber beds. To train the neural networks to recognize these fibers, we used slices from two different samples: "232p3 wet" and "232p3 cured", registered according to the wet sample. Larson et al. provided the fiber segmentation for these samples 7 , which we used as labels in the training. The training and validation datasets contained 250 and 50 images from each sample, respectively, for a total of 600 images. Each image in the original samples has a size of 2560 × 2560 pixels.
For all networks, we used a learning rate of 10⁻⁴ and binary cross entropy 22 as the loss function. During training, the networks reached accuracy higher than 0.9 and loss lower than 0.1 within the first epoch. Two-dimensional U-Net is the exception, presenting a loss of 0.23 at the end of the first epoch. Despite that, 2D U-Net reaches the lowest loss among the four architectures at the end of its training. 2D U-Net is also the fastest network to finish its training (7 h, 43 min), followed by Tiramisu (13 h, 10 min), 3D U-Net (24 h, 16 min) and 3D Tiramisu (95 h, 49 min, Fig. 3). Examining convergence behavior in the first epoch, the 2D U-Net does not progress as smoothly as the other networks (Fig. 4). However, this does not impair U-Net's accuracy (0.977 after one epoch). Accuracy and loss for the validation dataset also improve significantly: Tiramisu had a validation loss vs. validation accuracy ratio of 0.034, U-Net had 0.048, and both 3D architectures had ratios of 0.043. The large size of the training set and the similarities between slices in the input data are responsible for these high accuracies and low losses.
We used the trained networks to predict fiber labelings for twelve different datasets in total. These datasets were made available by Larson et al. 7 , and we keep the same file identifiers for fast cross-reference:
• "232p1": wet
• "232p3": wet, cured, cured registered
• "235p1": wet
• "235p4": wet, cured, cured registered
• "244p1": wet, cured, cured registered
• "245p1": wet
Here, the first three numeric characters correspond to a sample, and the last character corresponds to different extrinsic factors, e.g. deformation. Despite being samples of similar materials, the reconstructed files presented several differences, for example in the amount of ringing artifacts, intensity variation, and noise; therefore, we treat them as different samples in this paper.
We calculated the mean and standard deviation of the prediction times for each sample (Fig. 5). As with the training time results, 2D U-Net and 2D Tiramisu are the fastest architectures, processing a sample in about one hour on average, while 3D Tiramisu is the slowest, taking on average more than a day to process one sample.

Evaluation of our results and comparison with Larson et al. (2019).
After processing all samples, we compared our predictions with the results that Larson et al. made available on their dataset 7 . They provided segmentations for five datasets from the twelve we processed: "232p1 wet", "232p3 cured", "232p3 wet", "244p1 cured", "244p1 wet".
First, we compared our predictions to their results using receiver operating characteristic (ROC) curves and the area under curve (AUC, Fig. 6). AUC is larger than 98% for all comparisons; therefore, our predictions are accurate when compared with the semi-supervised method suggested by Larson et al. 5 . The 2D versions of U-Net and Tiramisu have similar results, performing better than 3D U-Net and 3D Tiramisu.
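This comparison can be reproduced in a few lines with scikit-learn by flattening the probability maps and gold-standard masks into 1D arrays (toy stand-in data shown here):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Toy stand-in arrays: flattened network probabilities vs. gold-standard labels.
gold = np.array([0, 0, 0, 1, 1, 1])
probabilities = np.array([0.10, 0.40, 0.20, 0.85, 0.70, 0.90])

fpr, tpr, _ = roc_curve(gold, probabilities)
print(auc(fpr, tpr))  # 1.0 for this perfectly separable toy example
```

On real slices, `gold` and `probabilities` would be the flattened segmentation mask and network output for one dataset.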
We also examined the binary versions of our predictions and compared them with Larson et al.'s results. For each slice or cube from the dataset, we used a hard threshold of 0.5; values above it are treated as fibers, while values below it are treated as background. We used the Dice 8 and Matthews 9 correlation coefficients for our comparison (Table 1). The comparison using U-Net yields the highest Dice and Matthews coefficients for three of five datasets. Tiramisu had the highest Dice/Matthews coefficients for the "244p1 cured" dataset, and both networks have similar results for "232p1 wet". 3D Tiramisu had the lowest Dice and Matthews coefficients in our comparison.
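A minimal NumPy sketch of this scoring step, using the standard Dice and Matthews definitions (the 0.5 threshold matches the text; the function name is ours):

```python
import numpy as np

def dice_matthews(prediction, gold, threshold=0.5):
    """Binarize a probability map and score it against a gold-standard mask."""
    pred = np.asarray(prediction) > threshold   # hard threshold at 0.5
    truth = np.asarray(gold).astype(bool)
    tp = np.sum(pred & truth)                   # true positives
    fp = np.sum(pred & ~truth)                  # false positives
    tn = np.sum(~pred & ~truth)                 # true negatives
    fn = np.sum(~pred & truth)                  # false negatives
    dice = 2 * tp / (2 * tp + fp + fn)
    mcc = (tp * tn - fp * fn) / np.sqrt(
        float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return dice, mcc
```

A perfect prediction yields 1.0 for both coefficients; a prediction uncorrelated with the gold standard yields a Matthews coefficient near 0.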

Discussion
The analysis of ceramic-matrix composites (CMCs) depends on the detection of their fibers. Semi-supervised algorithms, such as the one presented by Larson et al. 5 , can perform that task satisfactorily. The description of that specific algorithm, however, lacks information on the parameters necessary for replication. It also includes steps that involve manual curation. As such, it was not possible for us to reimplement it fully.
Convolutional neural networks are being used successfully in the segmentation of different two- and three-dimensional scientific data 23-28 , including microtomographies. For example, fully convolutional neural networks were used to generate 3D tau inclusion density maps 29 , to segment the tidemark on osteochondral samples 30 , and to build 3D models of temporal-bone anatomy structures 31 .
Researchers have studied fiber detection and analysis for some time, using a variety of tools. Approaches include tracking, statistical methods, and classical image processing [32][33][34][35][36][37][38][39] . To the best of our knowledge, there are two deep learning approaches applied to this problem:
• Yu et al. 40 use an unsupervised learning approach based on Faster R-CNN 41 and Kalman-filter-based tracking. They compare their results with Zhou et al. 36 , reaching a Dice coefficient of up to 99%.
• Miramontes et al. 42 reach an average accuracy of 93.75% using a 2D LeNet-5 CNN 43 to detect fibers in a specific sample.
Our study builds upon previous work using similar material samples, but expands the tests to many more samples and includes the implementation and training of four architectures (2D U-Net, 2D Tiramisu, 3D U-Net, and 3D Tiramisu), used to process twelve large datasets (≈140 GB total); we compare our results with the gold-standard labeling provided by Larson et al. 7 for five of them. We used ROC curves and the area under the curve (AUC) to assess the quality of our predictions, obtaining AUC larger than 98% (Fig. 6). We also used Dice and Matthews coefficients to compare our results with Larson et al.'s solutions (Table 1), reaching coefficients of up to 98.42 ± 0.03%.
When processing a defective slice (a slice with severe artifacts), the 3D architectures perform better than the 2D ones since they are able to leverage information about the structure of the material (Fig. 7).
Based on the research presented, we recommend using the 2D U-Net to process microtomographies of CMC fibers. Both 2D networks lead to similar accuracy and loss values in our comparisons (Table 1); however, U-Nets converge more rapidly and are therefore computationally cheaper to train than Tiramisu. The 3D architectures, while performing better on defective samples (Fig. 7), do not generally achieve better results than the 2D architectures. In fact, the 3D architectures require more training to achieve comparable accuracy (Fig. 3) and are slower to predict (Fig. 5), therefore requiring considerable additional computation for marginal gains.
Our CNN architectures perform at the level of human-curated accuracy (i.e., Larson et al.'s semi-supervised approach), sometimes even surpassing it. For instance, the 2D U-Net identified fibers that the Larson et al. algorithm did not find (Fig. 8).
Using labels predicted by the U-Net architecture, we render a three-dimensional visualization of the fibers (Fig. 9). Despite the absence of tracking, the U-Net segmentation clearly outlines fibers across the stack.
In this paper, we presented neural networks for analyzing microtomographies of CMC fibers in fiber beds. The data used is publicly available 7 and was acquired in a real materials design experiment. Results are comparable to human-curated segmentations; yet, the networks can predict fiber locations in large stacks of microtomographies without any human intervention. Despite the encouraging results achieved in this study, there is room for improvement. For example, the training time, especially of the 3D networks, turned out to be prohibitive for a full hyperparameter sweep. A search for the optimal parameters of all networks used could be implemented in a future study. We also aim to investigate whether an ensemble of networks would perform better. Finally, we would like to explore how best to adjust thresholds at the last layer of the network. Here, we maintained a hard threshold of 0.5, which suited the sigmoid on the last layer of the implemented CNNs, but one could, e.g., use conditional random field networks instead.

Methods
Fully convolutional neural networks. We implemented four architectures, two-dimensional U-Net 20 and Tiramisu 19 and their three-dimensional versions, to attempt to improve on the results provided by Larson et al. We used supervised algorithms: they rely on labeled data to learn which regions are of interest (in our case, fibers within microtomographies of fiber beds).
All CNN algorithms were implemented using TensorFlow 44 and Keras 45 on a computer with two Intel Xeon Gold 6134 processors and two Nvidia GeForce RTX 2080 graphics processing units. Each GPU has 10 GB of RAM.
To train the neural networks to recognize fibers, we used slices from two different samples: "232p3 wet" and "232p3 cured", registered according to the wet sample. Larson et al. provided the fiber segmentation for these samples, which we used as labels in the training. The training and validation procedures processed 350 and 149 images from each sample, respectively, for a total of 998 images. Each image in the original samples has a size of 2560 × 2560 pixels.
To feed the two-dimensional networks, we padded the images with 16 zero-valued pixels on each side. Then, each image was cut into tiles of 288 × 288 pixels with a stride of 256 pixels, creating an overlap of 32 pixels between neighboring tiles. These overlapping regions, which are removed after processing, avoid artifacts on the borders of processed tiles. Each input slice therefore generated 100 tiles of 288 × 288 pixels, for a total of 50,000 images in the training set and 10,000 in the validation set.
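The tiling step can be sketched in NumPy as follows (the function name is ours; the numbers match the text: 16-pixel padding, 288-pixel tiles, 256-pixel stride):

```python
import numpy as np

def tile_slice(image, tile=288, stride=256, pad=16):
    """Zero-pad a slice by 16 px per side, then cut 288x288 tiles every 256 px."""
    padded = np.pad(image, pad, mode="constant")
    tiles = [
        padded[i:i + tile, j:j + tile]
        for i in range(0, padded.shape[0] - tile + 1, stride)
        for j in range(0, padded.shape[1] - tile + 1, stride)
    ]
    return np.stack(tiles)

# A 2560 x 2560 slice becomes a 2592 x 2592 padded array and yields
# a 10 x 10 grid of overlapping tiles, i.e. 100 tiles per slice.
tiles = tile_slice(np.zeros((2560, 2560), dtype=np.uint8))
```

After prediction, the 16-pixel margins of each tile are discarded and the 256 × 256 cores are stitched back into a full slice.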
We pre-processed the training images differently for the three-dimensional networks. We loaded the entire samples, each of size 2160 × 2560 × 2560 voxels, and padded them with 16 zero-valued voxels in each dimension. Then, we cut cubes of 64 × 64 × 64 voxels with a stride of 32 voxels. Hence, the training and validation sets for the three-dimensional networks have 96,000 and 19,200 cubes, respectively.
We implemented data augmentation, aiming for a network capable of processing samples with varying characteristics. We augmented the images in the training sets using rotations, horizontal and vertical flips, width and height shifts, zoom, and shear transforms. For the two-dimensional networks, we used the tools embedded in Keras's ImageDataGenerator module. Since ImageDataGenerator cannot currently process three-dimensional input, we adapted it for the three-dimensional networks. The adapted version, named ChunkDataGenerator, is provided in the repository referenced in the Code Availability section, along with the rest of the software produced in this study.
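We do not reproduce ChunkDataGenerator here; as a framework-free illustration of the key constraint, that an image and its label mask must receive identical transforms, here is a minimal NumPy sketch covering only flips and quarter-turn rotations (shifts, zoom, and shear are omitted):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def augment_pair(image, label):
    """Apply the same random flips/rotations to an image and its label mask."""
    k = int(rng.integers(0, 4))                 # 0-3 quarter turns
    image, label = np.rot90(image, k), np.rot90(label, k)
    if rng.random() < 0.5:                      # horizontal flip
        image, label = np.fliplr(image), np.fliplr(label)
    if rng.random() < 0.5:                      # vertical flip
        image, label = np.flipud(image), np.flipud(label)
    return image, label
```

Applying a transform to the image but not its mask silently corrupts the training labels, which is why the paired design matters for segmentation.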
To reduce the possibility of overfitting, we implemented dropout regularization 46 . We followed the suggestions in the original papers for U-Net architectures: 2D U-Net received a dropout rate of 50% in the last analysis layer and in the bottleneck, while 3D U-Net 21 did not receive any dropout. The Tiramisu structures received a dropout rate of 20%, as suggested by Jégou et al. 19 .
Hyperparameters. To better compare the networks, we maintained the same training hyperparameters when possible. Ideally, we would conduct a hyperparameter sweep (a search for the optimal hyperparameters for each network), but training time turned out to be prohibitive, especially for the three-dimensional networks. Due to the large amount of training data and the similarities between training samples (2D tiles or 3D cubes), we decided to train all architectures for five epochs. The 2D architectures were trained with batches of four images, while the batches for the 3D architectures had two cubes each. The learning rate used was 10⁻⁴, and the loss function used was binary cross entropy 22 . We followed the advice of the original papers with regard to optimization algorithms: we used the Adam optimizer 47 .
Fig. 9 Fibers on the sample "232p3 wet" processed using the U-Net architecture. As seen in the longitudinal cut, this pipeline identifies fibers across the sample height despite the absence of tracking.
Dice and Matthews coefficients receive true positive (TP), false positive (FP), true negative (TN), and false negative (FN) pixel counts, which are determined as:
• TP: pixels correctly labeled as being part of a fiber.
• FP: pixels incorrectly labeled as being part of a fiber.
• TN: pixels correctly labeled as background.
• FN: pixels incorrectly labeled as background.
TP, FP, TN, and FN are obtained when the prediction data is compared with the gold standard.
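In terms of these counts, the two coefficients follow their standard definitions (stated here for reference, not reproduced verbatim from Larson et al.):

```latex
\mathrm{Dice} = \frac{2\,\mathit{TP}}{2\,\mathit{TP} + \mathit{FP} + \mathit{FN}},
\qquad
\mathrm{MCC} = \frac{\mathit{TP}\cdot\mathit{TN} - \mathit{FP}\cdot\mathit{FN}}
{\sqrt{(\mathit{TP}+\mathit{FP})(\mathit{TP}+\mathit{FN})(\mathit{TN}+\mathit{FP})(\mathit{TN}+\mathit{FN})}}
```

Both reach 1 for a perfect prediction; the Matthews coefficient is close to 0 for a prediction uncorrelated with the gold standard.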
Dataset. The dataset accompanying Larson et al. 5 includes raw images, segmentation results, and a brief description of segmentation tools (the Hough transform, mathematical morphology, and statistical filters). Fully reproducing their work would have required further information, including metadata, the parameters used and, ideally, the analysis code. To aid in reproducing segmentation results, we contribute a set of twelve processed fiber beds, based on the Larson et al. data. We also include the weights for each neural network architecture we implemented and trained. These weights can be used to process fibers of similar structure in other datasets.
Visualization. Imaging CMC specimens at high resolution, as in the Larson et al. samples 7 , leads to large datasets. For example, each stack we used in this paper occupies around 14 GB after reconstruction, with the following exceptions: the registered versions of the cured samples 232p3, 235p4, and 244p1, at 11 GB each, and the sample 232p3 wet, at around 6 GB.
Often, specialists need software to visualize results during data collection. Yet, it can be challenging to produce meaningful figures without advanced image analysis and/or computational platforms with generous amounts of memory. We wanted to show that interactive exploration of large datasets is viable on a modest laptop computer. We therefore used matplotlib 50 and ITK 51 (Fig. 9) to generate all figures in this paper, using a standard laptop with 16 GB of RAM. This means that a scientist could use, e.g., Jupyter Notebooks 52 to do quick, interactive probing of specimens during beamtime.

Data availability
This study uses neural networks to process fibers in fiber beds, using the Larson et al. datasets 7 . Reproducing our study requires downloading that data.
The data generated in this study is available in Dryad 53 , under a CC0 license. CC0 dedicates the work to the public domain, to the extent allowed by law.

Code availability
The software produced throughout this study is available at https://github.com/alexdesiqueira/fcn_microct/ under the BSD-3 license.