Segmentation of pancreatic ductal adenocarcinoma (PDAC) and surrounding vessels in CT images using deep convolutional neural networks and texture descriptors

Fully automated volumetric segmentation of critical tumors may play a crucial role in diagnosis and surgical planning. One of the most challenging tumor segmentation tasks is localization of pancreatic ductal adenocarcinoma (PDAC). Exclusive application of conventional methods does not appear promising. Deep learning approaches have achieved great success in computer-aided diagnosis, especially in biomedical image segmentation. This paper introduces a framework based on convolutional neural networks (CNNs) for segmentation of the PDAC mass and surrounding vessels in CT images that also incorporates powerful classic features. First, a 3D-CNN architecture is used to localize the pancreas region within the whole CT volume using a 3D Local Binary Pattern (LBP) map of the original image. Segmentation of the PDAC mass is subsequently performed using a 2D attention U-Net and a Texture Attention U-Net (TAU-Net). TAU-Net is obtained by fusing dense Scale-Invariant Feature Transform (SIFT) and LBP descriptors into the attention U-Net. An ensemble model based on a 3D-CNN is then used to combine the advantages of both networks. In addition, to reduce the effects of imbalanced data, a multi-objective loss function is proposed as a weighted combination of three classic losses: Generalized Dice Loss (GDL), Weighted Pixel-Wise Cross Entropy loss (WPCE) and boundary loss. Because the sample size for vessel segmentation is insufficient, we fine-tuned the above-mentioned pre-trained networks. Experimental results show that the proposed method improves the Dice score (DSC) for PDAC mass segmentation in the portal-venous phase by 7.52% compared to state-of-the-art methods. In addition, three-dimensional visualization of the tumor and surrounding vessels can facilitate the evaluation of PDAC treatment response.

integrated the information from different phases. They reported Dice scores of 63.94 ± 22.74 and 53.08 ± 27.06 using multi-phase and venous-phase data, respectively. The results of these studies indicate the need for better performance in PDAC segmentation.
Furthermore, several studies have investigated abdominal blood vessel segmentation in CT images. Farag et al. 25 proposed a method for abdominal vessel segmentation in contrast-enhanced CT images. They obtained automated seed points for vessel tracking and for generating statistical models of the vessels using the outputs of multi-organ multi-atlas label fusion and the Frangi vesselness filter. In another work, Oda et al. 26 presented a fully convolutional network (FCN) for abdominal artery segmentation in CT volumes. They used a patch-based method for segmentation of the small arteries and obtained an average F1-measure, precision and recall of 87.1%, 85.8%, and 88.4%, respectively.
Texture descriptors such as the Scale-Invariant Feature Transform (SIFT), Local Binary Pattern (LBP), gray-level co-occurrence matrix (GLCM) and wavelets have been shown to be promising in pancreas segmentation and PDAC detection 11,27,28. Several other studies have fused texture features and deep features to achieve robust segmentation and prognosis 11,29. Dense SIFT, proposed in 30, and Local Binary Pattern (LBP) 31 have proven to be powerful and informative descriptors.
In this study, a fully automatic system based on an improved attention U-Net is proposed for segmentation of the PDAC mass and surrounding vessels, using a combination of these texture descriptors and convolutional neural networks. The contributions of this work are introduced in a staged framework as follows:
• A localization stage is designed to find the smallest bounding box covering the entire pancreas in the whole CT scan. For this purpose, a 3D CNN and 3D LBP are used to extract the corresponding sub-volumes from 3D CT scans.
• A finer stage is proposed for automated segmentation of the PDAC mass and normal pancreas from the sub-volumes cropped in the previous stage. For this purpose, Attention U-Net and Texture Attention U-Net (TAU-Net) are used, the latter integrated with dense SIFT and 3D LBP, which provide rich information in the spatial domain.
• A hybrid model is developed as an ensemble model to fuse the Attention U-Net and TAU-Net from the fine stage and to achieve better segmentation performance.
• A new loss function is proposed as a weighted combination of Generalized Dice Loss (GDL), Weighted Pixel-Wise Cross Entropy loss (WPCE) and boundary loss.
• An approach is designed for segmentation of the superior mesenteric artery (SMA) and superior mesenteric vein (SMV) by fine-tuning the above-mentioned models.
The remainder of the paper is structured as follows: section "Materials and methods" outlines the details of the proposed methodology. Experimental results are presented in section "Results". Section "Conclusion and discussion" is devoted to the discussion and conclusion.

Materials and methods
In this study, pancreatic ductal adenocarcinoma CT scans from two datasets are used. The first is the publicly available Medical Segmentation Decathlon (MSD) dataset 32. It includes three types of pancreatic tumor (intraductal mucinous neoplasms, pancreatic neuroendocrine tumors, and pancreatic ductal adenocarcinoma). The radiologists in our team selected 138 pancreatic ductal adenocarcinoma cases. The second dataset was collected at our university hospital. It contains 19 retrospective, pathologically proven PDAC cases. This study was approved by the Tehran University of Medical Sciences Institutional Review Board (IRB) (IR.TUMS.MEDICINE.REC.1397.119) and followed the tenets of the Declaration of Helsinki. Written informed consent was obtained from all participants. All patient identifiers have been removed from the data. In the second dataset, the pancreas, tumor and vessels such as the SMA and SMV are labeled in each manually selected slice by two experienced radiologists using ITK-SNAP v.3.6.0 (http://www.itksnap.org/) 33. The resolution of each CT scan in both datasets is 512 × 512 × L, where L is the number of slices sampled along the long axis of the body. For data pre-processing, all images are clipped to the [0.5, 99.5] percentiles of the intensity values, followed by min-max normalization of each volume.
Related works.
Attention U-Net. The U-Net architecture has been shown to achieve accurate and robust performance in medical image segmentation 19. Attention Gates (AGs) were introduced as modules that can be used in CNN frameworks for dense label prediction to focus on target structures without additional supervision 17. Attention U-Net, an extension of the standard U-Net, applies a soft self-attention technique in a feed-forward CNN model for medical image segmentation 18. The sensitivity of the model to target pixels is improved without complicated heuristics.
Using the gating module, salient features are highlighted while noisy and irrelevant responses are suppressed. These operations are performed just before each skip connection so that only the relevant features are fused. High-level and low-level feature maps are fed into the AG module according to Eq. (1).
$$\phi_{AG} = \mathrm{Sigmoid}\left(W^{T}\left(x_l \oplus x_h\right)\right) \qquad (1)$$
where $\phi_{AG}$ is called the attention coefficient, $W^T$ denotes a linear transformation, $x_l$ and $x_h$ represent the low-level and the high-level feature maps respectively, $\mathrm{Sigmoid}(x) = 1/(1 + e^{-x})$, and ⊕ denotes matrix addition. Attention features are obtained from the element-wise product of the low-level feature map and the attention coefficients [Eq. (2)].
$$\hat{x}_l = x_l \odot \phi_{AG} \qquad (2)$$
where $\hat{x}_l$ is the attention feature, ⊙ represents the element-wise product and $\phi_{AG}$ is the attention coefficient of the higher-level features. These attention features are subsequently concatenated with the high-level features extracted from the network in the decoder path.
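The gating operation described above can be illustrated with a minimal NumPy sketch. It assumes that the two feature maps have already been brought to the same shape, that the linear transformation is a single weight vector `w` applied across channels, and that the gate is purely additive; these choices, as well as the function names, are illustrative assumptions rather than the configuration of the trained network.

```python
import numpy as np

def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def attention_gate(x_l, x_h, w):
    """Additive attention gate (illustrative).

    x_l, x_h: low- and high-level feature maps of shape (C, H, W).
    w: hypothetical weight vector of shape (C,) standing in for the
       linear transformation W^T.
    Returns the attention coefficients (H, W) and the gated
    low-level features (C, H, W).
    """
    if x_l.shape != x_h.shape:
        raise ValueError("x_l and x_h must share a shape (resample x_h first)")
    summed = x_l + x_h                                  # matrix addition
    logits = np.tensordot(w, summed, axes=([0], [0]))   # linear map over channels
    phi = sigmoid(logits)                               # coefficients in (0, 1)
    gated = x_l * phi[None, :, :]                       # element-wise product
    return phi, gated
```

With zero weights every coefficient equals 0.5, so the low-level features are simply halved; a trained gate would instead push the coefficients toward 1 in the target region and toward 0 elsewhere.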
Dense SIFT. U-Net and Attention U-Net, as classical semantic segmentation networks, are not particularly effective for the challenging pancreatic tumor segmentation task 8. Because the labelling process is difficult and therefore yields a rather small sample size, these networks may not benefit from a large number of layers and channels. To address this issue, we propose an approach that combines selected traditional salient features with high-level features in the decoder path.
One of the most popular feature-space transforms is the SIFT descriptor proposed by Lowe 34, which includes two stages: a key point detector and a descriptor. The SIFT algorithm is invariant to rotation, translation and scaling. This property is obtained by characterizing the local gradient information around each detected point of interest.
The original SIFT is a sparse feature representation of an image, whereas a dense feature representation is preferred in pixel classification. Liu et al. 30 proposed the dense SIFT descriptor for object recognition and registration, which eliminates the feature detection stage while preserving the pixel-wise feature descriptor.
2D and 3D local binary pattern (LBP). The idea of combining classical features with CNN-extracted features can be extended to LBP. Two-dimensional LBP, presented by Ojala et al. 35, is a powerful texture descriptor that characterizes local texture patterns by encoding pixel values. It provides strong rotation and gray-level invariance.
Banerjee et al. 31 introduced a rotationally invariant three-dimensional LBP algorithm in which the invariants are constructed implicitly using spherical harmonics for increasing spatial support. A non-Gaussian statistical measure (kurtosis) was used because of the loss of phase information. The number of obtained variables is equal to the number of spherical harmonics plus the kurtosis term. This method can locally estimate the invariant features, which is useful in describing small patterns.
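To make the encoding idea concrete, the following sketch computes the basic 8-neighbour 2D LBP code for every interior pixel. It is only an illustration of how local patterns are encoded by thresholding against the centre pixel: it is neither the rotation-invariant variant nor the spherical-harmonics-based 3D LBP used in this work, and the neighbour ordering is an arbitrary assumption.

```python
import numpy as np

def lbp_8(image):
    """Basic 8-neighbour LBP code for each interior pixel of a 2D array."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets, clockwise from the top-left pixel (arbitrary order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.uint8) << np.uint8(bit)
    return codes
```

A flat region yields the code 255 (all neighbours tie with the centre), while an isolated bright pixel yields 0, which is why the resulting map emphasises local texture rather than absolute intensity.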
Segmentation of PDAC mass. The proposed method consists of a three-step segmentation process. Because the pancreas occupies only a small portion of the CT volume, in the first step the approximate smallest bounding box of the pancreas is localized using the proposed 3D LBP sub-volumes and 3D CNNs. Subsequently, the cropped ROIs are fed into a segmentation network based on Attention U-Net to segment the pancreas and the PDAC mass. We also introduce a modified Attention U-Net, named Texture Attention U-Net (TAU-Net), which uses information from dense SIFT and 3D-LBP. Finally, an ensemble approach using a light-weight CNN is applied to take advantage of both networks. The overall diagram of the proposed method is depicted in Fig. 2.
Localization of pancreas region. The goal of this stage is to obtain an approximate estimate of where the pancreas is located inside the whole 3D CT volume so as to exclude irrelevant organs and tissues. Although 2D CNN frameworks have achieved considerable progress, they cannot preserve contextual information along the z-axis. On the other hand, training 3D networks on a large number of sub-volumes requires considerable computational power and memory, while an inadequate number of samples leads to poor results. As a solution to this problem, we propose a new approach based on the LBP texture descriptor. Instead of training the network on the original sub-volumes, informative sub-volumes are extracted using 3D-LBP and fed to the network. As the pancreas region occupies a small portion of the CT volume, sub-volumes of size 64 × 64 × 64 voxels are randomly sampled in the training phase. The extracted sub-volumes either cover a fraction of the pancreas/tumor voxels (foreground) or contain only background regions. Although the background volume is much larger than the foreground in the original image, we selected balanced sets of sub-volumes for the training phase.
In the testing phase, a sliding window is applied to the entire CT volume with a spatial stride of 50 along both the X and Y axes and a stride of 20 along the Z axis. Subsequently, a 3D-LBP map is calculated for each sub-volume; it has the same size as the sub-volume, with voxel-wise correspondence. These 3D-LBP volumes are then fed into a 3D-CNN composed of five convolution layers with max-pooling, batch normalization (BN) 36 and the leaky rectified linear unit (leaky ReLU) as the nonlinear activation function. In each convolution block, two 3 × 3 × 3 convolution kernels with stride 1 and no padding are used to reduce the number of network parameters. The pooling layers are applied on a 2 × 2 × 2 sliding window with stride 2, padded with 2 pixels. The network is trained by the Adam optimizer with 5 mini-batches and a base learning rate of 0.01 with polynomial decay (gamma = 0.1) for a total of 30 iterations. Training is carried out using the cross entropy loss. Figure 3 summarizes the proposed localization process of the pancreas within the whole CT volume.
In the testing phase, to find the location of the pancreas, the center of each sub-volume predicted as pancreas is obtained, and the center of gravity of these points is then taken as the center of the detected pancreas volume. Finally, a bounding box is placed around this center to cover the whole pancreas.
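The test-time localization can be sketched as follows. The window size (64) and the strides (50 in-plane, 20 along Z) come from the text; the extra window added to cover the far border, the bounding-box size passed as `box_size`, and the clamping of the box to the volume are assumptions made only for this illustration.

```python
import numpy as np

def window_starts(length, window=64, stride=50):
    """Start indices of sliding windows along one axis."""
    if length <= window:
        return [0]
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] != length - window:
        starts.append(length - window)  # assumed: one extra window at the border
    return starts

def pancreas_box(positive_centers, volume_shape, box_size):
    """Box centred on the centre of gravity of the positive sub-volume centres.

    positive_centers: iterable of (x, y, z) centres of sub-volumes that the
    classifier labelled as pancreas. box_size is a hypothetical fixed size.
    """
    centroid = np.asarray(positive_centers, dtype=float).mean(axis=0)
    lo, hi = [], []
    for c, size, limit in zip(centroid, box_size, volume_shape):
        start = int(round(float(c) - size / 2))
        start = max(0, min(start, limit - size))  # keep the box inside the volume
        lo.append(start)
        hi.append(start + size)
    return tuple(lo), tuple(hi)
```

For a 512-voxel axis with stride 50, `window_starts` yields starts 0, 50, ..., 400 plus a final window at 448, so consecutive windows overlap by 14 voxels.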
Fine stage (segmentation of PDAC mass). The previous stage yields the smallest sub-volume of the whole CT scan that contains the pancreas; the resulting dimensionality reduction facilitates the fine segmentation. The fine segmentation results are mapped back to their positions in the entire CT scan using the coordinates of the cuboid that defines where the bounding box was cut from the whole CT scan.
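Mapping the cropped prediction back to the full scan amounts to writing the label map into an empty volume at the stored offset. The sketch below assumes that label 0 denotes background and that `lo` holds the lower corner of the bounding box; both names are illustrative.

```python
import numpy as np

def paste_back(crop_labels, volume_shape, lo):
    """Place a cropped label map into a full-size volume of background (0)."""
    full = np.zeros(volume_shape, dtype=crop_labels.dtype)
    region = tuple(slice(s, s + n) for s, n in zip(lo, crop_labels.shape))
    full[region] = crop_labels
    return full
```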
Texture attention U-Net (TAU-Net). In the fine stage, the extracted 3D sub-volume of each case is cut into a series of 2D slices that are fed into the next network. The network proposed in this stage is an attention U-Net integrated with texture descriptors. In detail, dense SIFT and 3D LBP are used to enrich the deep features extracted in the convolutional layers, giving the TAU-Net shown in Fig. 4a. TAU-Net has an architecture similar to that of attention U-Net. The input images are progressively down-sampled in five convolution steps, which extract higher-level image representations $x_h$. Each step consists of a convolution block (two 3 × 3 convolution layers followed by batch normalization and ReLU activation). A max-pooling operation with a kernel size of 2 × 2 and a stride of 2 is subsequently used to reduce the size of the intermediate feature maps. In the decoder path we use deconvolution layers, AGs and Texture Attention Gates (TAGs) along with skip connections. AGs and TAGs are applied to the skip connections to aggregate information from multiple scales. In addition to the deep features generated by the previous network layers, low-level features such as dense SIFT and 3D-LBP are used to provide a more comprehensive representation of the pathological characteristics of the tissues. The Texture Attention Block (TA Block) is responsible for integrating these features with the features from the decoding path.
Texture attention block (TA Block). In the TA Block (Fig. 4b), the high-level feature maps ($x_h$) are first concatenated with the texture features (TXFea) to produce more informative features. The output of this stage ($x_{h\_new}$) is one of the inputs to the gate module. The newly produced high-level feature maps ($x_{h\_new}$) and the low-level feature maps ($x_l$) from the encoding stage are combined to form the inputs of the attention gate [Eq. (3)].
where $\hat{x}_l$ represents the attention features, $x_{h\_new}$ denotes the high-level features concatenated with the texture features, $x_h$ denotes the input to the next stage in the decoder path and U denotes up-sampling. The best configuration was found empirically based on the network performance. As can be seen from Fig. 4, SIFT features are integrated into the network at the second up-sampling layer. Since the size of the feature map in this layer is a quarter of the input image size, SIFT features are calculated with a sampling step of four. LBP is integrated into the network at the last layer, whose feature map has the same size as the input image. The network is trained using the Adam optimizer in 10 mini-batches and a base learning rate of 5 × 10e−5 with polynomial decay (gamma = 0.1). We also used random rotation, flip and shift to augment the data. A random 70-30 split was used for the training and testing phases.
Proposed hybrid model. In this study, Attention U-Net and TAU-Net are used for segmentation of the pancreas and the tumor in CT images. To benefit from the advantages of both networks, a novel but simple approach based on hybrid models 37,38 is proposed, in which a 3D-CNN aggregates the outputs of these networks. First, three predicted masks (pancreas, tumor and background) are generated by each of the two segmentation networks, giving six predicted masks as the inputs to the 3D-CNN aggregator network. This network consists of one convolution layer and a softmax function, as shown in Fig. 5. The 3D convolution layer predicts the label of each pixel using its neighbors along the three axes.
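The aggregation step can be sketched in NumPy as a single 3D convolution over the six stacked masks followed by a softmax over the three classes. The kernel size (3 × 3 × 3 here), the zero padding and the channel order are assumptions for illustration, since only the overall structure (one convolution layer plus softmax) is specified above.

```python
import numpy as np

def conv3d_same(x, weight, bias):
    """Single 3D convolution layer with zero padding (cross-correlation).

    x: (C_in, D, H, W); weight: (C_out, C_in, k, k, k) with odd k; bias: (C_out,).
    """
    c_out, _, k, _, _ = weight.shape
    p = k // 2
    padded = np.pad(x, ((0, 0), (p, p), (p, p), (p, p)))
    d, h, w = x.shape[1:]
    out = np.zeros((c_out, d, h, w))
    for dz in range(k):
        for dy in range(k):
            for dx in range(k):
                patch = padded[:, dz:dz + d, dy:dy + h, dx:dx + w]
                # Mix input channels for this kernel offset.
                out += np.tensordot(weight[:, :, dz, dy, dx], patch, axes=([1], [0]))
    return out + bias[:, None, None, None]

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def ensemble(masks_a, masks_b, weight, bias):
    """Fuse two (3, D, H, W) mask stacks into one label volume (D, H, W)."""
    stacked = np.concatenate([masks_a, masks_b], axis=0)   # six input channels
    probs = softmax(conv3d_same(stacked, weight, bias), axis=0)
    return probs.argmax(axis=0)
```

With a kernel that is non-zero only at its centre and simply adds the matching class channels of the two networks, the layer reduces to voxel-wise voting; learned off-centre weights are what let neighbouring slices influence each voxel.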
Loss function. Standard losses such as cross entropy loss and Dice loss cannot handle imbalanced tasks appropriately 39. Hence, to deal with the small foreground as well as the large intra-class variation of region sizes, we use three different loss functions, namely GDL, boundary loss and WPCE.
Weighted pixel-wise cross entropy (WPCE). The WPCE loss is defined as follows:
$$L_{WPCE} = -\frac{1}{N}\sum_{c}\sum_{i=1}^{N} W_c\,U_{c,i}\,\log V_{c,i}$$
where U represents the ground truth, V is the predicted mask, c indexes the classes and i runs over the N pixels. In this study, the sample weight of each class was selected to be the ratio of the total number of background pixels $n_B$ to the total number of pixels $n_c$ of that class (pancreas or tumor), i.e. $W_c = n_B / n_c$.
Generalized dice loss (GDL). The Generalized Dice Score (GDS) has been used as a metric for assessing multiple-class segmentation with a single score. The Generalized Dice Loss (GDL) was proposed in 40 as a loss function for training deep neural networks with highly unbalanced data:
$$GDL = 1 - 2\,\frac{\sum_{c} w_c \sum_{i} U_{c,i} V_{c,i}}{\sum_{c} w_c \sum_{i}\left(U_{c,i} + V_{c,i}\right) + \varepsilon}$$
where $w_c = 1/\left(\sum_{i} U_{c,i}\right)^2$ is a weight used for balancing the effect of unbalanced region sizes in the loss, c represents the target class and ε is employed to avoid division by zero.
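A minimal NumPy sketch of the two region-based losses is given below. It assumes flattened one-hot ground truth and predicted probabilities of shape (classes, pixels), class 0 as background, and the inverse squared region size as the GDL weight, as is common in the literature; the function names and the exact placement of ε are illustrative assumptions.

```python
import numpy as np

def class_weights_wpce(labels, num_classes):
    """W_c = n_B / n_c, with class 0 taken as background (assumption)."""
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(float)
    return counts[0] / np.maximum(counts, 1.0)  # guard against empty classes

def wpce(one_hot, probs, weights, eps=1e-7):
    """Weighted pixel-wise cross entropy; one_hot and probs have shape (C, N)."""
    per_pixel = -(weights[:, None] * one_hot * np.log(probs + eps)).sum(axis=0)
    return per_pixel.mean()

def generalized_dice_loss(one_hot, probs, eps=1e-7):
    """Generalized Dice loss with inverse squared region-size weights."""
    w = 1.0 / (one_hot.sum(axis=1) ** 2 + eps)
    intersect = (w * (one_hot * probs).sum(axis=1)).sum()
    union = (w * (one_hot + probs).sum(axis=1)).sum()
    return 1.0 - 2.0 * intersect / (union + eps)
```

For a toy label vector with four background, two pancreas and two tumor pixels, the class weights are 1, 2 and 2, so each foreground pixel contributes twice as much to the cross entropy as a background pixel.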
Boundary loss. Because of the low contrast around the boundary of the pancreas and the tumor, exact determination of the boundary between tumor and healthy tissue can be challenging. To mitigate this issue, we also use the differentiable version of the boundary loss proposed in 41, which is based on the boundary metric $BF_1$. The boundaries of the ground truth and predicted segments are obtained using a max-pooling operation as follows:
$$U^{b} = \mathrm{pool}(1 - U, \theta_0) - (1 - U), \qquad V^{b} = \mathrm{pool}(1 - V, \theta_0) - (1 - V)$$
where a pixel-wise max-pooling operation pool(·, ·) with a kernel size $\theta_0$ is applied to add an outside border to the segment complement. Subtracting the initial complement from the altered one then generates the inner boundary of the original segment. An extended boundary is then constructed by applying max-pooling to the obtained boundaries:
$$U^{b,ext} = \mathrm{pool}(U^{b}, \theta), \qquad V^{b,ext} = \mathrm{pool}(V^{b}, \theta)$$
The minimal distance between neighboring segments of the binary ground truth map may be used to choose the value of the hyperparameter θ. Subsequently, P (precision) and R (recall) are calculated from the maps constructed above as follows:
$$P = \frac{\mathrm{sum}(V^{b} \circ U^{b,ext})}{\mathrm{sum}(V^{b})}, \qquad R = \frac{\mathrm{sum}(U^{b} \circ V^{b,ext})}{\mathrm{sum}(U^{b})}$$
where the operations ∘ and sum() denote pixel-wise multiplication of two binary maps and pixel-wise summation of a binary map respectively. Finally, $BF_1$ and the corresponding loss function are calculated from P and R as follows:
$$BF_1 = \frac{2PR}{P + R}, \qquad L_{BF_1} = 1 - BF_1$$
Segmentation of vessels. PDAC staging is mainly based on the degree of involvement between the tumor and the surrounding vessels, so segmentation of the vessels surrounding the pancreas can be a crucial task in pancreas surgery planning. As mentioned at the beginning of section "Materials and methods", the 19 self-collected 3D CT scans have vessel labels. This sample size is insufficient to train the vessel segmentation networks from scratch. To overcome this limitation, we used the models trained for PDAC segmentation, namely Attention U-Net, TAU-Net and the hybrid model, as pre-trained models and fine-tuned them for vessel segmentation. The data are augmented by translation, flipping and rotation of each image to enlarge the dataset 12-fold.
The initial learning rate of the Adam optimizer is 1e−5. We train the networks for 30 epochs.
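Evaluation metrics. The following metrics were used to evaluate the pancreas and PDAC mass segmentation: Dice similarity coefficient (DSC), recall and precision 42:
$$\mathrm{DSC} = \frac{2\,|U \cap V|}{|U| + |V|}, \qquad \mathrm{Recall} = \frac{|U \cap V|}{|U|}, \qquad \mathrm{Precision} = \frac{|U \cap V|}{|V|}$$
where U and V denote the ground-truth and predicted binary masks respectively.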
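These overlap metrics can be computed directly from two binary masks, as in the short sketch below. Returning NaN when a denominator is zero (for example, an empty ground truth) is a convention chosen here for illustration, not one stated in this work.

```python
import numpy as np

def segmentation_metrics(u, v):
    """Return (DSC, recall, precision) for ground truth u and prediction v."""
    u = np.asarray(u, dtype=bool)
    v = np.asarray(v, dtype=bool)
    tp = float(np.logical_and(u, v).sum())   # |U ∩ V|
    n_u, n_v = float(u.sum()), float(v.sum())
    nan = float("nan")                       # assumed convention for empty masks
    dsc = 2.0 * tp / (n_u + n_v) if (n_u + n_v) > 0 else nan
    recall = tp / n_u if n_u > 0 else nan
    precision = tp / n_v if n_v > 0 else nan
    return dsc, recall, precision
```

For example, a prediction that marks one of three true tumor pixels and nothing else has perfect precision but a recall of 1/3, giving a DSC of 0.5.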
All algorithms are implemented using the publicly available PyTorch package 43. All models were trained on a machine with a single NVIDIA GeForce GTX 1080 Ti GPU with 11 GB of memory.

Results
Pancreas localization. The 3D CNN is trained to localize the pancreas in the whole CT scan. Table 1 shows the results of training this network with the original images and with 3D LBP sub-volumes using different numbers of 3 × 3 filters.
PDAC mass segmentation. The Attention U-Net and TAU-Net were trained using the images obtained from the previous stages. As shown in Fig. 4a, the best performance of TAU-Net was obtained when the SIFT images were added to the second layer and the LBP images to the last layer of the decoder path. Various loss functions such as GDL, WPCE and boundary loss were examined to train the network, and the results of these loss functions for Attention U-Net are shown in Table 2. The results of U-Net, Attention U-Net, TAU-Net and the hybrid network with manual and automated localization of the pancreas region are reported in Table 3.
As can be seen from Table 3, the hybrid model with the "ensemble" approach using a 3D CNN shows the best performance in PDAC mass segmentation. Furthermore, the lowest Hausdorff distance value belongs to PDAC segmentation. To show the performance of the localization stage, in addition to the results of fully automated pancreas and PDAC mass segmentation, we also report in Table 3 the results of feeding manual localization of the pancreas region to the attention U-Net, TAU-Net and hybrid network. Furthermore, Fig. 6 shows the qualitative segmentation results for different CT slices, where rows 1 and 2 depict two samples from the MSD dataset while rows 3 and 4 show results for two samples from the self-collected dataset. 3D reconstruction of the results using ITK-SNAP (Fig. 6d) shows an anatomical and realistic view with very small irregularities. As the data were normally distributed, a paired-sample t-test was carried out, which indicates that our proposed method (hybrid model) performs significantly better than Attention U-Net and U-Net (p = 0.007 and p = 0.017 respectively).
In Table 4, successful pancreatic tumor segmentation methods for portal venous phase CT images are compared with our proposed method in terms of Dice, recall, precision, Hausdorff distance and run time.
Vessel segmentation. We compared the performance of three networks, namely the pre-trained Attention U-Net, TAU-Net and hybrid model, for segmentation of vessels. Segmentation results for two samples are shown in Fig. 7. For further validation, DSC, recall and precision for all networks are reported in Table 5. Furthermore, Fig. 7d and i show the segmented PDAC mass and vessels. Three-dimensional visualization of the PDAC mass and surrounding vessels is presented in Fig. 7e and j. As can be seen from Table 5, most criteria are higher for the hybrid model and, on the whole, the hybrid model is the best model. Furthermore, the lowest Hausdorff distance value belongs to SMA segmentation.

Conclusion and discussion
In this paper, we introduced a fully automatic convolutional neural network for segmentation of the PDAC mass and abdominal vessels such as the SMA and SMV in CT images. We also introduced TAU-Net as a modified Attention U-Net that uses information from dense SIFT and 3D-LBP. As PDAC staging is mainly based on the degree of involvement between the tumor and the surrounding vessels, segmentation of the PDAC mass together with the surrounding vessels is an important task, which is addressed in this study.
To localize the pancreas slices in the whole CT image, a single network could possibly be used in both cascaded stages (coarse and fine steps). Nevertheless, the desired outcome can only be expected from extremely complicated networks. To overcome this challenge, the proposed method used a simple 3D CNN with inputs from the 3D LBP maps extracted from sub-volumes of the original images. With this approach, instead of using a large number of original images, a small sample of informative 3D LBP maps resulted in an accuracy of 95% in classification between pancreas and background sub-volumes. Furthermore, as can be seen from Table 3, feeding manual and automated localizations of the pancreas region to the segmentation stage networks yields comparable performance.
Although the performance of the segmentation stage might improve with the integration of contextual information through 3D network architectures 7,44, 2D architectures are selected in this work to analyse each CT slice individually because of the limited size of the dataset and the high computational cost involved. The disadvantage of the 2D architecture is that each slice does not include all of the contextual and geometric information, because the z-axis information is disregarded.
To partially overcome this weakness, the hybrid model using a simple 3D CNN was introduced to incorporate information from the adjacent slices into the segmentation of the intended slice. One may assume that the superior performance of the hybrid network compared to each network alone is due to using data from adjacent slices as well as combining texture features and deep features.
To incorporate texture information, we examined fusion of SIFT, 2D-LBP and slices of 3D-LBP into selected layers of the network. As expected, experimental results showed that 3D-LBP performs better than 2D-LBP. Figure 8 compares the 2D-LBP and 3D-LBP of a single slice. The 2D slice of the 3D-LBP is much more informative than both the original image and the 2D-LBP.
The best result for fusion of the 3D LBP into the network was obtained when it was added to the last layer in the decoder path. SIFT maps were also fused into the network. The SIFT maps are calculated using different step sizes and each map is concatenated into its matching layer in the decoding path. This combines the advantages of high- and low-level SIFT features during decoding. The mean Dice values for PDAC mass segmentation using (1) Attention U-Net + LBP, (2) Attention U-Net + SIFT_1 + LBP, (3) Attention U-Net + SIFT_2 + LBP, (4) Attention U-Net + SIFT_4 + LBP and (5) Attention U-Net + SIFT_8 + LBP were 0.54, 0.55, 0.53, 0.58 and 0.55 respectively. SIFT_n means that the SIFT map with a step size of n is added to the corresponding layer in the decoder path. The best performance among all architectures belongs to Attention U-Net with SIFT_4 + LBP. It seems that, for detecting the targeted texture, neither very low-level nor extremely high-level SIFT features provide enough information.
To reduce the effects of unbalanced data (pancreas vs background), we examined a tailored set of loss functions consisting of weighted combinations of three classic losses, namely GDL, WPCE and boundary loss. Among the individual losses, WPCE shows the best performance, but the best final results were achieved using the balanced combination of all three. It seems that this combination gives the best results because it uses high- and low-resolution losses simultaneously.
Three-dimensional segmentation of the tumor can be useful in tumor size evaluation. In PDAC, however, the involvement of the tumor with the surrounding vessels is an important factor in treatment response assessment, so 3D visualization of the tumor and surrounding vessels can facilitate the evaluation of pre- and post-treatment conditions. As can be seen from Fig. 7, three-dimensional examination of the tumor and vessels can give the physician a better view than 2D scans. An intuitive evaluation of Fig. 7j and e shows that the involvement of the tumor with the vessels decreased after treatment.
Previous pancreatic tumor segmentation approaches have led to Dice scores between 0.52 and 0.71, depending on the tumor type (PDAC and others), the use of single- or multi-phase CT (non-contrast, arterial, venous), the dataset size and the image quality.
Zhu et al. 7 suggested a multi-scale coarse-to-fine segmentation for screening PDAC in CT images on a private dataset with tumor sizes varying between one thousand and a million voxels. Classification of PDAC was carried out with acceptable results, but the Dice score for PDAC mass segmentation was 57.3%. Other evaluation metrics for tumor segmentation, such as sensitivity and specificity, were not reported. In addition, they reported a test time of 11 min, whereas our method required 9 min per sample (most of which was spent on changing the data format for better representation). Tureckova et al. 8 proposed a CNN using deep supervision and attention gates for segmentation of lesions such as liver and pancreatic tumors. They used the MSD dataset for pancreatic tumor segmentation. This dataset includes three types of pancreatic tumors: adenocarcinoma, neuroendocrine tumors and intraductal mucinous neoplasms. The latter two types show better contrast than PDAC, which leads to an improvement in segmentation results. They reported 54.66 ± 4.54, 62.18 ± 3.35 and 58.12 ± 6.12 for DSC, precision and recall respectively, but did not report the run time for their test samples. In another work, Zhang et al. 9 used multi-phase CT images for PDAC segmentation with a large dataset and the nnUNet network. The advantage of this work is the use of a large dataset (n ~ 1000) with multi-phase CT images. Although they achieved a Dice score of 0.709 ± 0.159 for multi-phase data, a Dice score of 0.522 ± 0.250 was reported for the venous phase, the phase routinely used in the clinic. The average testing run time, precision, recall and Hausdorff distance were not reported. Zhou et al. 10 utilized multi-phase CT images and hyper-pairing networks.
While they achieved a good relative improvement in the segmentation of PDAC with multi-phase data (Dice score = 0.639 ± 0.227), the reported DSC for the single venous phase was 0.531 ± 0.271. Compared to the studies that use portal venous phase data, which is the current routine imaging protocol, the proposed method achieves an improvement of up to 7.52%. A beneficial outcome of our study is the segmentation and 3D visualization of the PDAC mass and surrounding vessels, which can facilitate the assessment of treatment response in pancreas surgery planning.
In vessel segmentation we had a small sample size (19 self-collected cases). Therefore, the pre-trained networks obtained from PDAC segmentation were fine-tuned for this task, and the results are reported in Table 5. As can be seen, TAU-Net performs better than Attention U-Net on the segmentation of vessels. Furthermore, most criteria are higher for the hybrid model and, overall, the hybrid model is the best model.
It can be concluded that this network can be promising in segmentation of medical images with small sample size.
As a future step, we propose fusing 3D texture descriptors into 3D networks. In addition, we suggest using GLCM and multi-resolution features such as the dual-tree complex wavelet transform. Furthermore, fine-tuning the final combined network of the hybrid model may be a good solution.
In this study, we tried to enhance the diagnosis and visualization results using current imaging protocols and routines. A subsequent phase may involve modifying the protocols and even the media to achieve better results. Although this work mainly focuses on segmentation of the PDAC mass, the proposed idea can be applied to the segmentation of other pancreatic tumors, such as intraductal mucinous neoplasms and pancreatic neuroendocrine tumors, as well as lesions in other organs such as the liver, brain and lung. Three-dimensional visualization of the PDAC mass and surrounding vessels can facilitate the assessment of treatment response in pancreas surgery planning. However, integration into clinical applications will require dedicated 3D visualization software in further developments.