Tensor Decomposition for Colour Image Segmentation of Burn Wounds

Research in burns has been a continuing demand over the past few decades, and important advancements are still needed to facilitate more effective patient stabilization and to reduce mortality rates. Burn wound assessment, an important task for surgical management, largely depends on the accuracy of burn-area and burn-depth estimates. Automated quantification of these burn parameters plays an essential role in reducing the errors of estimates conventionally carried out by clinicians. The task of automated burn-area calculation is known as image segmentation. In this paper, a new segmentation method for burn wound images is proposed. The proposed method utilizes a tensor decomposition of colour images, from which effective texture features can be extracted for classification. Experimental results showed that the proposed method outperforms other methods not only in terms of segmentation accuracy but also in computational speed.

Colour model. The purpose of a colour model is to facilitate the specification of colours in some standard, generally accepted way. In essence, a colour model is a specification of a coordinate system and a subspace within that system, where each colour is represented by a single point 24 . There exist several colour models for different purposes: (i) the RGB model; (ii) the CMY and CMYK models; (iii) the HSI model, which decouples the intensity component from the colour-carrying information (hue and saturation) 24 ; and (iv) the YCbCr, CIELab, CIELuv and CIELch models, whose components represent the image luminance and the chromatic scales separately. In burn images, the green and blue components are represented by negative CIELab a* and b* values, whereas the skin and the burn wound are represented by positive values.
The CIELab colour model is the most complete one specified by the International Commission on Illumination. CIELab separates the luminance and the chromatic information of an image using three coordinates: the L* coordinate (L* = 0 denotes black and L* = 100 denotes white), which describes the luminance, and the a* and b* coordinates, which represent the pure colours from green to red (a* = −127 denotes green, a* = +128 denotes red) and from blue to yellow (b* = −127 denotes blue, b* = +128 denotes yellow), respectively. Another important characteristic of this model is its uniformity: the distance between two colours corresponds to the Euclidean distance and coincides with the perceptual difference detected by the human visual system [25][26][27] . For all these reasons, the CIELab colour model was chosen in this study.
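For illustration, the sRGB-to-CIELab conversion can be sketched in a few lines of NumPy. This is a minimal sketch, assuming sRGB input scaled to [0, 1] and the standard D65 white point; the function name rgb_to_lab is ours, not code from the paper.

```python
import numpy as np

# Standard sRGB -> XYZ matrix and D65 white point (well-known constants)
M = np.array([[0.4124564, 0.3575761, 0.1804375],
              [0.2126729, 0.7151522, 0.0721750],
              [0.0193339, 0.1191920, 0.9503041]])
WHITE = np.array([0.95047, 1.0, 1.08883])

def rgb_to_lab(rgb):
    """Convert an (..., 3) array of sRGB values in [0, 1] to CIELab (D65)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB gamma to get linear RGB
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    xyz = lin @ M.T / WHITE                       # normalized tristimulus values
    eps = (6 / 29) ** 3
    f = np.where(xyz > eps, np.cbrt(xyz), xyz / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[..., 1] - 16                      # luminance, 0..100
    a = 500 * (f[..., 0] - f[..., 1])             # green (-) to red (+)
    b = 200 * (f[..., 1] - f[..., 2])             # blue (-) to yellow (+)
    return np.stack([L, a, b], axis=-1)
```

As a sanity check, pure white maps to L* ≈ 100 with a* ≈ b* ≈ 0, and black to (0, 0, 0).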
Furthermore, the L*, a* and b* images were filtered with Gaussian filters in the frequency domain using the Gaussian low-pass filter function 24 . The best cut-off frequency of the filters, σ0, which keeps 99% of the power spectrum of the zero-padded discrete Fourier transforms of the CIELab coordinates, was estimated with the bisection method 28 .
tensor decomposition. A tensor is a multidimensional array. More formally, an N-way or Nth-order tensor is an element of the tensor product of N vector spaces, each of which has its own coordinate system [29][30][31] . There are two main techniques for tensor decomposition: the CANDECOMP/PARAFAC and the Tucker decomposition. The CANDECOMP/PARAFAC, or shortly CP, decomposition factorizes a tensor into a sum of rank-one components. For example, a third-order tensor X ∈ ℝ^(I×J×K) is decomposed as

X = Σ_{r=1}^{R} a_r ∘ b_r ∘ c_r + E,  (1)

where E and its entries e(i, j, k) represent the noise or the error, R is the number of rank-one components that decompose the tensor X, and the symbol "∘" indicates the vector outer product.
It is often useful to assume that the columns of the factor matrices are normalized to length one, with the weights absorbed into a vector λ ∈ ℝ^R:

X ≈ Σ_{r=1}^{R} λ_r (a_r ∘ b_r ∘ c_r),  (2)

X ≈ Λ ×_1 A ×_2 B ×_3 C,  (3)

where Λ = diag(λ) and the symbol "×_n" indicates the n-mode product between the core tensor and the factor matrices.
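As a concrete illustration of the weighted CP model above, the reconstruction of a third-order tensor from rank-R CP factors can be written with a single einsum. This is our own minimal NumPy sketch, not code from the paper; cp_reconstruct is a hypothetical name.

```python
import numpy as np

def cp_reconstruct(lam, A, B, C):
    """Rebuild X ≈ sum_r lam[r] * (a_r outer b_r outer c_r) from CP factors.
    A, B, C have shapes (I, R), (J, R), (K, R); lam has length R."""
    return np.einsum('r,ir,jr,kr->ijk', lam, A, B, C)
```

For a single rank-one component this reduces to a weighted outer product of the three factor vectors.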
On the other hand, the Tucker decomposition decomposes a tensor X into a core tensor G multiplied (or transformed) by a matrix along each mode. For example, the third-order tensor X ∈ ℝ^(I×J×K) is decomposed as

X ≈ G ×_1 U_1 ×_2 U_2 ×_3 U_3,  (4)

where U_1, U_2 and U_3 are the factor matrices, considered as the principal components in each mode. Equation (4) can be written element-wise as

x(i, j, k) ≈ Σ_p Σ_q Σ_r g(p, q, r) u_1(i, p) u_2(j, q) u_3(k, r).

Both decompositions are a form of higher-order principal component analysis 30,31 . Equation (3) represents the decomposition as a multi-linear product of a core tensor and factor matrices, and it is often used in signal processing. Equation (4) builds the core tensor with a different dimension for each mode, and it is often used for data compression 31 . For the purpose of data compression in this study, the Tucker decomposition, known as the Tucker3 model 30,31 , is preferred to the CP. An RGB image I ∈ ℝ^(M×N×3) can be converted into the CIELab space and expressed as a tensor X ∈ ℝ^(I×J×K), where I, J and K are the number of grey-levels in the L*, a* and b* images, respectively. Figure 1 graphically shows the Tucker decomposition of the CIELab tensor.
Y = G ×_1 U_1 ×_2 U_2 ×_3 U_3,  (5)

X = Y + E,  (6)

Equations (5) and (6) can be written element-wise as

y(i, j, k) ≈ Σ_p Σ_q Σ_r g(p, q, r) u_1(i, p) u_2(j, q) u_3(k, r),

where X is the CIELab colour tensor, Y is the tensor to estimate, E is the tensor error, U_1, U_2 and U_3 are the factor matrices calculated from the a*, b* and L* modes of the X tensor, respectively, whereas i, j and k are the coordinates of X and p, q and r are those of the core tensor G.
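The Tucker3 factors above can be computed, for example, with a truncated higher-order SVD. The sketch below is our own illustrative implementation, not the authors' code; hosvd and unfold are hypothetical names, and the paper does not state which fitting algorithm was used.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: mode-n fibres become the columns of a matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(X, ranks):
    """Truncated higher-order SVD of a third-order tensor:
    X ≈ G ×1 U1 ×2 U2 ×3 U3, with core G of size `ranks`."""
    # Factor matrix per mode: leading left singular vectors of each unfolding
    U = [np.linalg.svd(unfold(X, n), full_matrices=False)[0][:, :r]
         for n, r in enumerate(ranks)]
    # Core tensor G = X ×1 U1' ×2 U2' ×3 U3'
    G = np.einsum('ijk,ip,jq,kr->pqr', X, U[0], U[1], U[2])
    # Low-multilinear-rank estimate Y = G ×1 U1 ×2 U2 ×3 U3
    Y = np.einsum('pqr,ip,jq,kr->ijk', G, U[0], U[1], U[2])
    return G, U, Y
```

When the requested ranks match the true n-ranks of X, the estimate Y reproduces X exactly; smaller ranks yield a compressed approximation, which is the use made of the Tucker3 model in this study.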
Setting ρ(i, j, k) as the CIELab vector module with coordinates (i, j, k), two tensors, X_d and X_s, are built in order to mix the image luminance and colour as in Equations (9) and (10), where i ∈ a*, j ∈ b* and k ∈ L*. The tensors X_d and X_s in Equations (9) and (10) contain the normalized values of the addition and the difference of the colour sets in proportion to the luminance, respectively. The values set to 0 indicate the background: negative a* and b* values correspond to the green and blue components, respectively, whereas L* ≤ δ defines the dark regions, where δ is an arbitrarily chosen parameter (δ = 10 in this study). Finally, the estimated tensors Y_d and Y_s are re-transformed into images Y_d and Y_s, the chromaticity sources of the image I, from which the texture features are extracted.
tensor rank. Let X be an Nth-order tensor of size I_1 × I_2 × ⋯ × I_N. Then the n-rank of the tensor X, rank_n(X), is the dimension of the vector space spanned by the mode-n fibres. Bro et al. 29,32 developed a technique, called core consistency diagnostics (CORCONDIA), for estimating an optimal number R of rank-one tensors, which produces the factor matrices for the CP decomposition. Unfortunately, there is no such algorithm for the Tucker decomposition, so the core tensor dimensions have to be decided with a reasonable choice.
In this study, the three-mode tensor X ∈ ℝ^(I×J×K) represents the CIELab colour space. Its upper-limit rank is rank(X) = [256, 256, 101], when all the luminance and chroma values define the image, whereas its lower limit is rank(X) = [1, 1, 1], when just one luminance and one chroma grey-level define the image. An intuitive choice for the rank of the CIELab tensor is the number of grey-levels of each CIELab component that defines the image I. Since, as expressed in Equations (9) and (10), the background values are set to 0, there is a further reduction, which excludes the background grey-levels of the a* and b* colour sets and the very dark regions of the L* set.
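Under these conventions, the reduced rank amounts to counting the distinct grey-levels per channel outside the background and the dark regions. The sketch below is illustrative only; cielab_tensor_rank, its argument names, and the assumption that background is coded as 0 in a*/b* are ours.

```python
import numpy as np

def cielab_tensor_rank(L, a, b, delta=10):
    """Count distinct quantized grey-levels per CIELab channel, ignoring
    dark regions (L* <= delta) and background pixels (coded as 0 in a*/b*).
    Returns [rank_a, rank_b, rank_L]."""
    fg = L > delta                                       # drop very dark regions
    ranks = []
    for ch in (a, b):
        vals = ch[fg]
        ranks.append(len(np.unique(vals[vals != 0])))    # drop background code 0
    ranks.append(len(np.unique(L[fg])))
    return ranks
```
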
Feature extraction. In this study, the grey-level co-occurrence matrix (GLCM) 9 was applied to the two sources of the decomposed luminance and colour components to extract the contrast, homogeneity, correlation and energy. They were calculated with a 5 × 5 mask at offsets of 0°, 45°, 90° and 135°. The means of the luminance-colour images were also included. A total of 10 values of the 5 features was extracted, 5 for each luminance-colour source. These feature vectors were used for cluster analysis to identify burn regions.
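The GLCM features described above can be sketched in plain NumPy as follows. This is an illustrative implementation of contrast, homogeneity, correlation and energy averaged over the four offsets, plus the mean; the function names are ours, and a library such as scikit-image could equally be used.

```python
import numpy as np

def glcm(img, dy, dx, levels):
    """Normalized symmetric grey-level co-occurrence matrix for one offset.
    `img` must contain integer grey-levels in [0, levels)."""
    h, w = img.shape
    P = np.zeros((levels, levels))
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            i, j = img[y, x], img[y + dy, x + dx]
            P[i, j] += 1
            P[j, i] += 1                    # make the matrix symmetric
    return P / P.sum()

def glcm_features(img, levels):
    """Mean contrast, homogeneity, correlation and energy over the
    0°, 45°, 90° and 135° offsets, plus the image mean (5 features)."""
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]
    i, j = np.indices((levels, levels))
    feats = np.zeros(4)
    for dy, dx in offsets:
        P = glcm(img, dy, dx, levels)
        mu_i, mu_j = (i * P).sum(), (j * P).sum()
        sd_i = np.sqrt(((i - mu_i) ** 2 * P).sum())
        sd_j = np.sqrt(((j - mu_j) ** 2 * P).sum())
        contrast = ((i - j) ** 2 * P).sum()
        homogeneity = (P / (1.0 + np.abs(i - j))).sum()
        correlation = ((i - mu_i) * (j - mu_j) * P).sum() / (sd_i * sd_j)
        energy = np.sqrt((P ** 2).sum())
        feats += np.array([contrast, homogeneity, correlation, energy])
    return np.append(feats / len(offsets), img.mean())
```

In the study these features are computed locally over a 5 × 5 mask around each pixel; the sketch computes them for one image patch.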
Cluster analysis. The cluster analysis was carried out using the FCM algorithm. Since the number of clusters of colours and of their shades under different luminance levels is unknown, a high number of clusters (20) was selected for the FCM analysis, and a cluster-merging process was then performed to distinguish the burn from the healthy skin and the background regions. Figure 2 illustrates the steps of the proposed segmentation method, which works as follows. Given an RGB burn image as the input, it is converted into the CIELab space, filtered with a Gaussian filter, and two image components are computed by the Tucker3 decomposition for colour texture feature extraction. These features are the inputs of the fuzzy c-means algorithm, which divides the image into 20 clusters. The clusters are then merged in order to obtain three main regions of interest: burn wound, skin, and background. Once these 3 regions are obtained, the closing morphological operation is applied to obtain the burn-wound contour. However, the user has the possibility of choosing hole filling after the closing operation. Figure 3 shows a burn image of 1330 × 1925 × 3 pixels of a paediatric patient with a burn wound located on the right hand, assessed 96 hours after the burn injury. The acquired RGB image was converted into the CIELab colour space with the standard D65 illuminant, and its components were filtered in the frequency domain with Gaussian filters, keeping 99% of the power spectrum of their zero-padded discrete Fourier transforms (see Fig. 4). Figures 3 and 4 show the effect of the Gaussian filtering on reducing the reflections and producing a homogeneous background. This study does not consider the effect of illumination, which will be an issue for future investigation. Moreover, Fig. 5 shows the CIELab colour space and the CIELab tensor X for the image in Fig. 4(a).
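The clustering step can be sketched as a minimal fuzzy c-means in NumPy. This is illustrative only; fcm and its defaults are our own names, the study uses 20 clusters rather than the toy setting below, and real implementations add further stopping criteria.

```python
import numpy as np

def fcm(X, c=20, m=2.0, iters=100, tol=1e-5, seed=0):
    """Minimal fuzzy c-means. X is (n_samples, n_features); returns the
    cluster centres (c, n_features) and memberships U (c, n_samples)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)          # memberships sum to 1 per sample
    for _ in range(iters):
        Um = U ** m
        centres = (Um @ X) / Um.sum(axis=1, keepdims=True)
        d = np.linalg.norm(X[None, :, :] - centres[:, None, :], axis=2)
        d = np.maximum(d, 1e-12)               # avoid division by zero
        # Standard membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        U_new = 1.0 / (d ** (2 / (m - 1)) * (d ** (-2 / (m - 1))).sum(axis=0))
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centres, U
```

In the proposed pipeline, each pixel's 10-dimensional texture feature vector would play the role of a row of X.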

Results and Discussion
Tensors X_d and X_s were constructed as explained in Equations (9) and (10). The tensor estimates Y_d and Y_s were obtained by the Tucker3 tensor decomposition technique. The tensor rank is the number of a*, b* and L* grey-levels: [66, 43, 76]. Since there are a background (the green blanket) and some dark areas (left side) in the image, the core tensor rank is reduced by using Equations (9) and (10) to rank(X) = [51, 38, 68]. It should be pointed out that the decomposition was carried out on the number of unique combinations of luminance and chromatic values instead of on the number of pixels in the original image. In this case, the decomposition was performed on 51 × 38 × 68 = 131,784 elements instead of 1925 × 1330 × 3 = 7,680,750 pixels, resulting in a data reduction of about 98.3% without losing the image information. Finally, the Y_d and Y_s values were re-transformed into images Y_d and Y_s with the same size as the original (see Fig. 6). Figure 6 illustrates that the tensor decomposition can enhance the contrast of the tensors X_d and X_s through the estimates Y_d and Y_s after eliminating the errors E_d and E_s, respectively. In the end, Figs 4 and 6 show how the Gaussian filtering and the tensor decomposition can remove errors and/or artefacts from the image. Based on these estimated tensors, one statistical feature (mean) and four GLCM-based texture features (contrast, homogeneity, correlation, and energy) were extracted. These features were used in the FCM analysis, which initially grouped the data into 20 different clusters that were then manually merged into 3 clusters: burn wound, healthy skin and background (see Fig. 7(a)). Figure 7(b) shows the final image segmentation result with the burn contour superimposed over the original image, after a closing morphological operation with a disk structuring element of radius 2. Moreover, it is the user's choice whether to fill eventual holes with another morphological operation.
In order to compare the proposed method with others, image segmentation results were obtained using four other techniques: Gaussian pre-filtering (CIELab), PCA, ICA 20 and JSEG 11 . Figure 8 shows six segmentation results in six rows obtained from the proposed and the other four methods, which are discussed as follows. It is obvious that, in all cases, JSEG can only distinguish the human body from the background but not the burn wound from the healthy skin; it is therefore not further included in the following comparisons.
For the results shown in the first row, the original image is the one discussed previously, with size 1925 × 1330 × 3. The CIELab and PCA methods present under-segmented areas along the burn wound on the left side, and they took 375 and 1527 seconds for the segmentation, respectively. The ICA result is comparable with that of the proposed method, but it required 2286 seconds for the segmentation. The proposed method successfully detects the burn wound contour after 297 seconds. For the results shown in the third row, the PCA and ICA methods present under-segmentation along the right side, and they required 1952 and 2348 seconds for the task, respectively; the proposed method detects the burn wound contour well, with a minor under-segmentation on the upper-right side of the injury, in 214 seconds.
Table 1 reports the computational times for the image segmentation obtained from the different methods illustrated in Fig. 8, except for the JSEG method, which produces unsatisfactory results for every image. The experimental results suggest that the proposed method provides the best results not only in terms of segmentation accuracy; its computational speed is also approximately 10 times faster than ICA, 5 times faster than PCA, and 1.5 times faster than CIELab. Table 2 shows the quantitative measurements, which consist of the positive predicted value (PPV) and the sensitivity (SEN) of a segmented image. The PPV and SEN are defined as 21

PPV = (number of pixels correctly segmented by the algorithm) / (total number of pixels segmented by the algorithm),  (12)

SEN = (number of pixels correctly segmented by the algorithm) / (total number of pixels in the segmented region according to the expert).  (13)

For a perfect segmentation, PPV = 1 and SEN = 1. In the case of under-segmentation, PPV = 1 and SEN < 1, whereas in the case of over-segmentation, PPV < 1 and SEN = 1.
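The PPV and SEN measures above amount to simple pixel counts on binary masks, e.g. as follows (a minimal sketch; ppv_sen is our own name):

```python
import numpy as np

def ppv_sen(pred, truth):
    """Positive predicted value and sensitivity of a binary segmentation
    mask `pred` against an expert reference mask `truth` (boolean arrays)."""
    tp = np.logical_and(pred, truth).sum()   # correctly segmented pixels
    ppv = tp / pred.sum()                    # vs. all pixels the algorithm segmented
    sen = tp / truth.sum()                   # vs. all pixels in the expert's region
    return ppv, sen
```

A mask that covers only part of the expert's region gives PPV = 1 and SEN < 1 (under-segmentation), mirroring the cases discussed for Table 2.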
Based on the results shown in Table 2, the cases of under-segmentation are ICA with image R5 and Tucker with R5, and the cases of over-segmentation are CIELab with R6; PCA with R2, R3, R4, R5 and R6; ICA with R1, R2, R3, R4 and R6; and Tucker with R4.
The results shown in Fig. 8 and Table 2 suggest that the proposed method provides better segmentation results for the images in the 1st, 2nd, 3rd and 6th rows of Fig. 8. The average PPV and SEN values of the segmentations obtained from the proposed method are better than those of the other three methods in terms of the balance between over-segmentation and under-segmentation.
Furthermore, the proposed method is also compared with the simple linear iterative clustering (SLIC) superpixel 33 , the efficient graph-based image segmentation 34 , and the SegNet 35 methods, which are discussed as follows.
Table 1. Computational times (seconds) for the image segmentation results obtained from the different methods shown in Fig. 8, where R1, …, R6 stand for the images shown in rows 1, …, 6 of Fig. 8.

Method            R1    R2    R3    R4    R5    R6    Average
PCA               1527  1178  1952  800   1430  958   1307.5
ICA               2286  1705  2348  3275  3653  2824  2073
Proposed method   297   358   214   118   115   185   214.5

The SLIC superpixel method performs local clustering of CIELab values and pixel coordinates. It is fast and requires the desired number of superpixels as an input. Figure 9 shows the segmentation results of the original burn image shown in Fig. 3(a) obtained from the superpixel method using 5, 20, 100, 500 and 1000 as the numbers of desired superpixels. It is quite obvious that the larger the number of superpixels, the better the segmentation result; however, it becomes very difficult to decide to which class each superpixel belongs. The algorithm distinguishes the skin from the background quite well, but then encounters a problem in classifying a superpixel as skin or burn wound, resulting in either under- or over-segmentation. Similarly to the proposed method, the SLIC superpixel technique requires a manual merging process in order to distinguish the three classes of interest. However, an advantage of the proposed method over the superpixel method is that the merging process can be carried out faster, since the number of clusters specified for the proposed method is much smaller than that required by the SLIC superpixel technique to achieve a good final segmentation result, as shown in Fig. 7(a).
The efficient graph-based image segmentation method defines a predicate to highlight the boundary between two or more regions using a graph-based representation of the image of interest. Figure 10 shows the segmentation results of the original burn image shown in Fig. 3(a) obtained from the graph-based method with input parameters σ = 0.5, 0.8 and k = 100, 300, 500, 800, 1000, where σ is the standard deviation of the Gaussian filter used in pre-processing and k is a scaling parameter. It is easy to observe that none of the results is satisfactory, as they are all largely over-segmented.
The SegNet consists of an encoder network and a corresponding decoder network, followed by a pixel-wise classification layer. The depth of the network is specified by a scalar D that determines the number of times an input image is downsampled or upsampled as it is processed: the encoder network downsamples the input image by a factor of 2^D, and the decoder network performs the opposite operation, upsampling the encoder output by a factor of 2^D. Two types of SegNet were designed: (i) an encoder and decoder with D = 4, and (ii) a network initialized with the VGG-16 weights and D = 5. Using these two networks with 11 images for training and 2 for testing, the accuracies achieved after 10 epochs with a learning rate of 10^−3 are 26% and 25%, respectively. A reason for the poor results produced by the SegNet may be that it was primarily designed for scene-understanding applications (road and indoor scenes), while the data domain in this study is medical imaging of burn wounds. Another possible reason is the very small training sample size (11 images), which was not sufficient for the deep network to capture the feature-map information for correct learning, particularly the vague boundary information between the skin and burn areas.
It should be pointed out that a good outcome of the automated merging of the fuzzy clusters obtained from the FCM depends on sufficient training data for the various types of burn, as shown in Fig. 11, where the assignment of new fuzzy clusters to the trained ones is based on the minimum-distance criterion. A dataset of about 220 cluster centres (8 belonging to the background, 135 to healthy skin, and 77 to burn wounds) was developed and used as a reference to assign each new cluster centre extracted from an image under analysis to the class of the reference centre at minimum distance. As the number of reference cluster centres is limited, the automated merging sometimes fails. Therefore, the user has the opportunity to perform this process manually and then add the newly labelled cluster to the reference set.
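The minimum-distance assignment of new cluster centres to the reference set can be sketched as follows (illustrative only; assign_clusters and the toy labels are our own):

```python
import numpy as np

def assign_clusters(new_centres, ref_centres, ref_labels):
    """Give each new FCM cluster centre the label of the nearest reference
    centre (minimum Euclidean distance criterion).
    new_centres: (n, f); ref_centres: (m, f); ref_labels: (m,)."""
    d = np.linalg.norm(new_centres[:, None, :] - ref_centres[None, :, :], axis=2)
    return ref_labels[np.argmin(d, axis=1)]
```
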
Regarding sufficient data and the training process, at present it is difficult to determine how much more data would be needed to train the proposed algorithm well. In fact, sample size planning for classifiers is an area of research in its own right. Most methods of sample size planning for developing classifiers require non-singularity of the sample covariance matrix of the covariates 36 . In biomedicine, methods for sample size planning for classification models were developed on the basis of learning curves, which show the classification performance as a function of the training sample size, to select the sample size needed to train a classifier appropriately. However, these methods require extensive previous data that attempted to differentiate the same classes 37 , or suggest sizes of 75 to 100 samples to achieve only reasonable accuracy in the validation 38 . This issue will certainly be investigated in our future research when more clinical data become available.

Conclusion
The proposed method has been shown to be able to extract burn wounds from a complex background with a relatively fast computational time. The tensor decomposition is independent of the camera resolution, because it works on the CIELab tensor model instead of on the number of pixels of the image. The proposed method results in a large data reduction without any information loss for the image-source estimation, and it is therefore applicable to real-time processing. The CIELab, PCA and ICA methods do not consistently provide good segmentation results over various burn images, showing over- and under-segmentation errors. Moreover, these techniques require longer computational times than the proposed method.
Besides, the fuzzy burn-wound centres extracted by the FCM during the cluster analysis, used in this paper to distinguish partial-thickness burns from normal skin and first-degree burns, could also be used to identify the depth of the burn and classify it as superficial partial-thickness, deep partial-thickness, or full-thickness. First-degree burns are not part of the total burn area and should therefore not be included in its estimation.
The strategy for colour image segmentation with the fuzzy integral and mountain clustering 39 , which does not require an initial estimate of the number of fuzzy clusters, is worth investigating for improving the proposed approach under limited training data for cluster merging. It would be desirable to apply the proposed method to images captured with a polarized camera, which can eliminate the light-reflection problem, and to try to extract features on the CIELab tensor instead of on the reconstructed images. As another issue for future research, it is worth investigating the segmentation of burn areas on 3D images, including curves and depth, to further improve the segmentation accuracy. Furthermore, there exist many other image segmentation techniques, such as semantic segmentation 35,40 , superpixel segmentation 33,41 , spectral clustering 42 , fully connected conditional random fields 43 , and Mask R-CNN 44 . However, all these techniques require a huge amount of training data to achieve a high degree of accuracy. To utilize such techniques, developing a method for simulating and generating a large quantity of burn images would be an important area of research to pursue in our future work.