Morphological diagnosis of hematologic malignancy using feature fusion-based deep convolutional neural network

Leukemia is a cancer of the white blood cells characterized by immature lymphocytes. Blood cancer claims many lives every year, so early detection of these blast cells is essential. To aid the diagnosis of leukemia cells, we developed 3SNet, a novel deep convolutional neural network (CNN) with depth-wise convolution blocks that reduce computation costs. The proposed method feeds three inputs to the deep CNN model: the grayscale image and its corresponding histogram of oriented gradients (HOG) and local binary pattern (LBP) images. The HOG image captures local shape, and the LBP image describes the texture pattern of the leukemia cell. The model was trained and tested with images from the AML-Cytomorphology_LMU dataset. The mean average precision (MAP) was 84% for cell classes with fewer than 100 images in the dataset and 93.83% for classes with more than 100 images. In addition, the area under the ROC curve for these cells exceeds 98%. This confirms that the proposed model could serve as an adjunct tool providing a second opinion to a doctor.

The rest of the paper is organized as follows.
In the "Proposed method" section, the proposed algorithm and model architecture are elaborated. The results of 3SNet are discussed in the "Results" section, and the "Discussion" section compares the results with state-of-the-art methods. Finally, the "Conclusion" section concludes the proposed method.

Proposed method
In this study, we developed a deep convolutional neural network model called 3SNet, which incorporates a multilayer feature fusion approach. The architecture of 3SNet is depicted in Fig. 1. The feature fusion model employed in this study extracts features from the grayscale image as well as its corresponding histogram of oriented gradients (HOG) and local binary patterns (LBP) images. These features are then fused to enhance their effectiveness, after which a classification module performs the classification of leukemia cells.
The convolution blocks are designed using depth-wise convolution to reduce computation costs. Several past methods have significantly improved leukemia cell classification; however, their limitations motivated us to design a robust and efficient model. A detailed summary of these models is given in Table 2.
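As a rough illustration of why depth-wise separable convolutions cut computation, the following sketch compares parameter counts for a standard and a separable 3 × 3 convolution. The channel sizes are hypothetical, not taken from the 3SNet architecture:

```python
def standard_conv_params(k, c_in, c_out):
    """Parameters of a k x k convolution mapping c_in -> c_out channels (no bias)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel, then a 1 x 1 pointwise
    convolution to mix channels (no bias)."""
    return k * k * c_in + c_in * c_out

# Hypothetical layer: 3 x 3 kernel, 64 -> 128 channels.
std = standard_conv_params(3, 64, 128)        # 73,728 parameters
sep = depthwise_separable_params(3, 64, 128)  # 8,768 parameters
print(std, sep, round(std / sep, 1))
```

The roughly 8x parameter reduction for this layer shape is what makes depth-wise blocks attractive for keeping the model lightweight.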

Local binary pattern (LBP)
The texture of leukemia cells is heterogeneous, which can be exploited to categorize them. Hence, in the proposed work, we use the powerful feature descriptor developed by Ojala et al.52. This descriptor combines occurrence analysis with local structure analysis by assigning a binary pattern to each pixel p_c. The difference between the grey-level value of pixel p_c and that of its circular neighbourhood of radius R centred at p_c is then evaluated. The LBP of the central pixel p_c is calculated as follows.
If q_c − p_c > 0, then 1 is assigned in Eq. (1); otherwise, 0. Finally, the LBP image is created by combining the texture descriptor and the LBP distribution pattern, as illustrated in Fig. 2. The histogram vector H of the LBP used for image representation is given as follows.
The LBP image and its feature descriptor calculation are shown in Figs. 3 and 4, respectively.
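A minimal sketch of the LBP computation described above, using a fixed 8-neighbour window of radius 1; the paper's circular neighbourhood of radius R is approximated by the immediate neighbours, and the threshold follows the q_c − p_c > 0 rule of Eq. (1):

```python
import numpy as np

def lbp_image(img):
    """Sketch of an 8-neighbour LBP (radius 1) on a grayscale array.
    Border pixels are skipped for simplicity; production code would
    interpolate on a circular neighbourhood as in Ojala et al."""
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint8)
    # Neighbour offsets, clockwise from top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            center = img[r, c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if img[r + dr, c + dc] > center:  # q_c - p_c > 0 -> bit set
                    code |= 1 << bit
            out[r, c] = code
    return out

def lbp_histogram(lbp, bins=256):
    """Normalized histogram vector H of the LBP codes (the texture descriptor)."""
    hist, _ = np.histogram(lbp, bins=bins, range=(0, bins))
    return hist / hist.sum()
```

Calling `lbp_histogram(lbp_image(gray))` yields the per-image texture vector used downstream.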

Histogram of oriented gradient (HOG)
Dalal and Triggs first used the HOG descriptor for object detection53. It focuses on the local shape and structure of an object. For each region of the image, a histogram is generated by calculating the magnitude and direction of the gradient. In the proposed work, images are resized to 256 × 256. A sliding window of size 3 × 3 is then used to calculate the gradients Grad_x along the horizontal (X) direction and Grad_y along the vertical (Y) direction as follows.
where r and c refer to the row and column of the image. Finally, the magnitude and direction are calculated using the following formulae.
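The gradient step can be sketched as follows; simple central differences stand in for whatever 3 × 3 kernel the authors use, so this is an illustration rather than the exact implementation:

```python
import numpy as np

def gradient_features(img):
    """Magnitude and direction of the image gradient (the first HOG step)."""
    img = img.astype(np.float64)
    grad_x = np.zeros_like(img)
    grad_y = np.zeros_like(img)
    grad_x[:, 1:-1] = img[:, 2:] - img[:, :-2]  # horizontal central difference
    grad_y[1:-1, :] = img[2:, :] - img[:-2, :]  # vertical central difference
    magnitude = np.sqrt(grad_x ** 2 + grad_y ** 2)
    # Unsigned orientation in [0, 180), as is standard for HOG cells.
    direction = np.degrees(np.arctan2(grad_y, grad_x)) % 180
    return magnitude, direction
```

The full descriptor would then bin `direction`, weighted by `magnitude`, over local cells and blocks.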

Novel 3-scale deep CNN model (3SNet)
We have designed a novel three-scale deep CNN model, 3SNet. The ReLU activation adds non-linearity to the model by applying a threshold to the feature maps obtained from the BN layers. The model has 9 × 10⁹ trainable parameters, and it can avoid degradation, saturation and vanishing-gradient problems54. The ReLU activation is defined as
where x is the input to the layer. After each convolution layer, a max-pooling layer of size 3 × 3 with stride 2 × 2 is incorporated. Finally, global average pooling is applied at each scale, generating channel descriptors that are combined to form the fused features. The fused feature vector acts as input to a fully connected layer with 1024 units, followed by BN and ReLU activation. In the end, a dense layer of 15 neurons is added for the AML-Cytomorphology_LMU dataset. Multiclass classification is performed using the Softmax function, which converts the logits into probabilities; the probability is computed from the input, weights and bias, and the highest-probability class is taken as the predicted leukemia cell class. The Softmax value can be calculated using Eqs. (7) and (8).
where N = 15, w_k and b_k are the weight vector and bias of the kth class, x is the input vector, and k = 0–14 for the 15 classes of leukemia cells.
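A sketch of the Softmax classification head in Eqs. (7) and (8); the weight and bias shapes below are illustrative, not the exact 3SNet layer sizes:

```python
import numpy as np

def softmax_probabilities(logits):
    """Convert class logits into probabilities. Subtracting the max first
    improves numerical stability without changing the result."""
    z = logits - np.max(logits)
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

def predict_class(features, weights, bias):
    """Hypothetical dense head: logits = W x + b, then Softmax, then argmax."""
    logits = weights @ features + bias
    probs = softmax_probabilities(logits)
    return int(np.argmax(probs)), probs
```

For the full model, `weights` would have shape (15, 1024) to map the fully connected layer's output onto the 15 leukemia classes.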

Feature fusion
Feature fusion improves the performance of the deep CNN. In the proposed method, we use three deep CNN branches for feature extraction. The features extracted from the HOG, grayscale leukemia cell and LBP images are fused as follows, where F_con is the final feature vector with a bag of 1536 features. The original images and their LBP and HOG images are shown in Fig. 5.
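The fusion itself reduces to concatenating the three global-average-pooled descriptors; the per-branch size of 512 is inferred from the stated 1536-feature bag, not given explicitly in the text:

```python
import numpy as np

def fuse_features(f_gray, f_hog, f_lbp):
    """Concatenate the three branch descriptors into the fused vector F_con.
    Assumes each branch produces a 512-d global-average-pooled vector."""
    f_con = np.concatenate([f_gray, f_hog, f_lbp])
    assert f_con.shape == (1536,), "expected 3 x 512-d branch descriptors"
    return f_con
```

The 1536-d `f_con` then feeds the 1024-unit fully connected layer described above.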

Algorithm1
(1) Compute the HOG and LBP images using the equations described in the "Local binary pattern (LBP)" and "Histogram of oriented gradient (HOG)" sections.
(2) Apply fivefold cross-validation to the leukemia dataset and set an initial learning rate of 0.0001.

Consent to participate
The authors declare their consent to participate in this article.

Dataset
The images used in this research were taken from the publicly available Munich AML Morphology Dataset, containing 18,365 expert-labelled single-cell images55. These single-cell images were produced using an M8 digital microscope/scanner from peripheral blood smears of 100 people from each group: the first group comprised patients diagnosed with acute myeloid leukemia at Munich University Hospital between 2014 and 2017, and the second group comprised patients without signs of hematological malignancy.

Training and validation
The training and validation of the proposed method are performed in Python 3.6 and TensorFlow 2.0 on Windows 10 with an Nvidia GeForce GTX TITAN X GPU and 128 GB RAM. Leukemia cells such as lymphocytes and promyelocytes have very similar morphological characteristics. Moreover, several classes in the dataset (Lymphocyte, Basophil, Promyelocyte, Promyelocyte (bilobed), Myelocyte, Metamyelocyte, Monoblast, Erythroblast and Smudge cells) have fewer than 100 images. Because of this, high classification accuracy is difficult to achieve. Considering these challenges, a multimodal feature fusion-based model has been proposed to discriminate 15 classes of leukemia cells. The 3SNet model is trained with an image size of 256 × 256 pixels and a batch size of 32 for 50 epochs, with the initial learning rate set to 0.0001. Since the dataset is imbalanced, we applied fivefold cross-validation to avoid biased performance of the model. In each fold of fivefold cross-validation, one set is used for validation and four sets are used for training; hence, 20% of the images are used for validation and 80% for training. Fig. 6 depicts the confusion matrix of each fold, from which average performance measures such as precision, recall, F1-score and accuracy are calculated.
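The fivefold split described above can be sketched as follows; the seeded shuffle is an assumption, since the paper does not specify how the folds are formed:

```python
import numpy as np

def fivefold_indices(n_samples, seed=0):
    """Yield (train_idx, val_idx) pairs for fivefold cross-validation:
    each fold holds out 20% of the samples for validation and trains
    on the remaining 80%."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, 5)
    for k in range(5):
        val_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(5) if j != k])
        yield train_idx, val_idx
```

Each `(train_idx, val_idx)` pair would drive one training run, after which the five validation confusion matrices are averaged.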
The categorical cross-entropy loss function is used to calculate the training and validation loss of the proposed method, shown in Fig. 7. Fig. 7a shows that the validation accuracy initially fluctuates, but after 40 epochs the changes are negligible. Similarly, in Fig. 7b, the training loss approaches zero, while the validation loss initially fluctuates and stabilizes after 40 epochs. This shows that the 3SNet model can differentiate leukemia cells with high accuracy and low training and validation loss.
The performance measures of the model are calculated for each fold, as shown in Table 3, which reports the precision, recall, F1-score and accuracy values for each fold. It can be observed that in fold 1 the model performance is below 50%, after which it gradually increases in subsequent folds. Overall, the proposed model achieved an average of 87.93% precision, 88.65% recall, 88.11% F1-score and 98.16% accuracy.
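The averaged measures can be computed from each fold's confusion matrix as in this sketch; rows are assumed to be true classes and columns predicted classes, which the paper does not state explicitly:

```python
import numpy as np

def macro_metrics(cm):
    """Macro-averaged precision, recall and F1 from a confusion matrix
    (rows = true class, columns = predicted class). Classes with a zero
    denominator contribute 0."""
    cm = np.asarray(cm, dtype=np.float64)
    tp = np.diag(cm)
    col = cm.sum(axis=0)  # predicted-positive counts per class
    row = cm.sum(axis=1)  # actual-positive counts per class
    precision = np.divide(tp, col, out=np.zeros_like(tp), where=col > 0)
    recall = np.divide(tp, row, out=np.zeros_like(tp), where=row > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    return precision.mean(), recall.mean(), f1.mean()
```

Averaging these per-fold values over the five folds gives the table-level figures.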

Discussion
Microscopic image analysis of blood smears provides essential data for diagnosing and predicting diseases in hematological assessment. Blood comprises three major components: red blood cells (RBCs), white blood cells (WBCs) and platelets. Of these, WBCs are part of the immune system and play an important role in the body's defence. Leukemia, a blood malignancy that affects the bone marrow and lymphatic system, is generally caused by abnormalities in these WBCs. Various studies have noted morphological differences between the lymphocytes in blood and bone marrow of patients with chronic lymphocytic leukemia and those of healthy individuals. These morphological differences can potentially diagnose the malignancy at various stages, from the primary to the acute stage. Nevertheless, the manual detection of these morphological differences needs expertise, effort and time. This makes identifying these cells very difficult, and it is necessary to automate this diagnosis with the help of CNNs. In this study, we used a dataset of 18,365 leukemia cells divided into 15 classes. The dataset is expert-annotated and unbalanced due to the unequal distribution of data; out of 15 classes, nine contain fewer than 100 images. In Table 4 we present a summary of several methods using different CNN models on different datasets.
Several studies on leukemia cell classification have been reported, as summarized in Table 4. Thahn et al.35 developed a CNN model for normal and abnormal cell classification. They applied data augmentation to increase the dataset's size, and the model's classification accuracy is 96.6%. In similar research, Shafique et al.56 classified blood smears and their three subtypes using AlexNet; overfitting of the model is avoided using data augmentation, achieving 96.06% classification accuracy. Pansombut et al.57 utilized machine and deep learning to classify leukemia cells: features are first extracted using a ConvNet, then optimized using a genetic algorithm, and finally a classification accuracy of 81.74% is obtained using a support vector machine (SVM). Ahmed et al.59 reported a comparative study of several machine-learning algorithms and the effect of data augmentation on training. They also proposed a deep CNN model for the classification of leukemia cells, which classifies leukemia cells with an accuracy of 88% and their subtypes with an accuracy of 81%.
Prellber and Kramer60 classified leukemia cells using ResNeXt50 with a Squeeze-and-Excitation block. They trained their model with original and augmented images and achieved a weighted F1-score of 89.91%. Much research on leukemia cell classification has also applied a transfer learning-based approach. Loey et al.61 compared the performance of AlexNet before and after fine-tuning, claiming that the fine-tuned AlexNet performed better and achieved an accuracy of 100%. In similar research, Vogado et al.62 applied three deep learning models, AlexNet, CaffeNet and VGG-f, to extract features from leukemia cells; two classifiers, SVM and KNN, were then applied for classification. They reported that the SVM classifier outperformed, achieving an accuracy of 99.76%. Ruberto et al.63 also extracted features from a pre-trained AlexNet; however, before extracting features from the leukemia cells, they applied preprocessing, blob detection and segmentation to extract objects of interest. Their method achieves 94.1% classification accuracy.
Rehman et al.64 extracted features using a deep CNN model and performed a comparative analysis of three classifiers, Naïve Bayes, KNN and SVM, on the deep features. Of these, Naïve Bayes achieved 78.34%, KNN 80.42%, SVM 90.91%, and the proposed deep classifier 97.78%. Huang et al.65 also applied a transfer-learning approach to extract features from leukemia cells; Inception-V3, ResNet50 and DenseNet121 classify with notable accuracies of 74.8%, 84.9% and 95.3%, respectively.
In short, all these methods have a high potential for the classification of blood smears. However, many researchers experimented on small datasets and used data augmentation techniques to increase the dataset size. Image augmentation can avoid overfitting, but many images of the same type can bias the performance of the model. In addition, blood smear classes with fewer images in the dataset need to be explored for better classification. Therefore, we did not apply data augmentation in the proposed method and instead focused on the blood smear classes with fewer images. Features extracted from the HOG, grayscale leukemia and LBP images are aggregated into a fused feature vector that improves the classification performance for leukemia cells. The 3SNet is a three-scale sequential model used for feature extraction and classification. Each branch is trained with 256 × 256 pixel input images with a batch size of 32 for 50 epochs. Further, a fivefold cross-validation scheme is applied to evaluate bias-free performance. The multi-scale fusion-based CNN model outperforms on most blood smears, with outstanding performance for the cells with fewer than 100 images in the dataset. The average sensitivity and precision obtained from fivefold cross-validation are more than 95% for cells with more than 1000 images in the dataset, and 70% for cells with fewer than 100 images. The class-wise performance of each cell class has been compared with the method proposed by Matek et al.34. Table 5 shows that the Neutrophil (segmented) cells have 8484 images, the highest number in the dataset. For this cell, the precision of the model is close to 99%, and the sensitivity of 99.4% is better than the 96% of Matek et al.34. For the other leukemia cells with more than 1000 images in the dataset, the fusion-based model also outperforms the available method. Furthermore, 3SNet is highly sensitive toward cells with fewer than 100 images in the dataset: all such cells, except the myelocyte cells with 76.2% precision, achieved more than 80% precision and 80% sensitivity. This notable precision and sensitivity confirm that the proposed 3SNet model can be used for real-time diagnosis. Further, a receiver operating characteristic (ROC) curve is plotted for performance visualization, taking the true positive rate on the Y-axis and the false positive rate on the X-axis67,68, as shown in Fig. 8. Most of the leukemia cell classes have an ROC curve area of 1, while EBO shows 98% and MON 99%. This confirms that our model is highly sensitive towards leukemia identification. The class-wise performance can also be observed in the bar chart in Fig. 9, which shows that the proposed 3SNet model's sensitivity and specificity are better than the state-of-the-art method.
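A minimal one-vs-rest ROC sketch matching the axes described above (true positive rate against false positive rate), with the area computed by the trapezoidal rule; libraries such as scikit-learn provide equivalent routines:

```python
import numpy as np

def roc_curve_points(y_true, scores):
    """TPR and FPR at every score threshold for a single class,
    one-vs-rest. y_true holds 0/1 labels; scores are class probabilities."""
    order = np.argsort(-np.asarray(scores))  # descending by score
    y = np.asarray(y_true)[order]
    tps = np.cumsum(y)        # true positives as the threshold lowers
    fps = np.cumsum(1 - y)    # false positives as the threshold lowers
    tpr = tps / y.sum()
    fpr = fps / (len(y) - y.sum())
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve via the trapezoidal rule."""
    f = np.concatenate([[0.0], fpr])
    t = np.concatenate([[0.0], tpr])
    return float(np.sum((f[1:] - f[:-1]) * (t[1:] + t[:-1]) / 2.0))
```

Running this per class on the validation scores would reproduce per-class ROC areas like those reported for EBO and MON.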

Ablation study of the proposed model
We conducted two experiments with settings similar to those discussed in the "Training and validation" section, changing the proposed model as follows. In the first experiment, we removed the HOG branch and trained the model for 50 epochs with a batch size of 32. After training, the precision, recall, F1-score and accuracy of the model were calculated, as shown in Table 6, which shows that 3SNet achieved average precision and F1-score of 86.60% and 85.10%, respectively.
In the second experiment, we removed the LBP branch and trained the model using grayscale and HOG features for 50 epochs with a batch size of 32. The average performance measures are shown in Table 7, where we can observe that the model achieved an accuracy of 96.13% and a recall of 84.61%.
The dataset used in the study is divided into training and validation sets. The proposed method used the same training and validation split as Matek et al.34. However, we also conducted an ablation study in which the dataset is divided into 80%, 10% and 10% for training, validation and testing, respectively. The class-wise sensitivity and precision of each cell on the test dataset are shown in Table 8, where we notice that the sensitivity and precision of the cells with large numbers of images are more than 90%. Furthermore, the cells with fewer images also achieved notable performance values.

Conclusion
This research proposes 3SNet, a novel deep CNN model for leukemia cell classification. Leukemia is a major form of blood cancer, and the morphological characteristics of blood smears are very similar across several classes, which makes the classification task difficult. To tackle this problem, our method implicitly extracts features from leukemia images and their corresponding HOG and LBP images using 3SNet. The HOG feature captures the local shape, and the LBP feature describes the texture pattern of leukemia cells, which helps to discriminate the morphological characteristics of blood smears. The features extracted at the three scales are fused and refined to enhance the feature pool, after which the feature vector is passed to the classification module. The classification performance depicted in Table 5 confirms that the proposed method classifies with high accuracy not only cells with a large number of images in the dataset but also cells with a smaller number of images. Further, the depth-wise separable convolution blocks reduce computation costs and resources. Hence, this method can be used to design computer-aided diagnostic (CAD) tools that provide a second opinion to a doctor. A limitation of the model is that images must be fed at three scales for training; in addition, the computation costs of the algorithm can be further reduced. In future work, we will add other texture features alongside the grayscale image to the deep CNN model for further performance improvement. Feature optimization techniques can also be applied to the feature pool to enhance the fused features, and other lightweight deep CNN models with attention mechanisms can be explored to improve classification performance. Further, the 2D convolutional layers of the proposed model can be replaced with 3D convolution layers to analyse 3D images, which would improve the model's capability to diagnose disease more accurately.

Figure 2 .
Figure 2. Here, (a-c) are sample images used in the experiment.

(3) for i = 1 to 50 do
    (a) Train and validate the model with a batch size of 32
    (b) Calculate the training loss and validation loss
    end
(4) Generate the confusion matrix for each fold of the validation dataset.
(5) Plot the training as well as validation loss graph for each epoch.
(6) Draw the ROC curve for each class.

Figure 7 .
Figure 7. The training and validation accuracy and loss of 3SNet are shown in (a) and (b), respectively.

Figure 8 .
Figure 8. The ROC plot for the proposed method.

Figure 9 .
Figure 9. The bar plot comparing precision and sensitivity with the method of Matek et al.34.

Table 1 .
Summary of the recent work using machine learning and deep learning.

Table 2 .
The detailed summary of the previous models used for leukemia classification.
… × 10⁶: This model is slow in training and computationally expensive due to many trainable parameters.
AlexNet, 24 × 10⁶: Due to the large number of trainable neurons, AlexNet is also costly. Moreover, the model is unable to detect all high-dimensional spatial features.
ResNeXt, 23 × 10⁶: ResNeXt is a fifty-layer deep CNN model that can extract high-dimensional features but requires a large training dataset. In addition, it cannot be used for real-time applications due to the significant number of trainable parameters.
DenseNet-121, 7.2 × 10⁶: DenseNet-121 has significantly fewer trainable parameters. However, this model's performance is lower than other state-of-the-art models.
Scientific Reports (2023) 13:16988 | https://doi.org/10.1038/s41598-023-44210-7

Table 3 .
The performance measures of the 3SNet model.

Table 4 .
Comparison of 3SNet with the recent deep learning methods.

Table 6 .
The performance measures of the 3SNet model using grey and LBP features.

Table 7 .
The performance measures of the 3SNet model using grey and HOG features.

Table 8 .
Class-wise performance of the proposed 3SNet on the test dataset.