New segmentation and feature extraction algorithm for classification of white blood cells in peripheral smear images

This article presents a new method for the classification of white blood cells (WBCs) using image processing techniques and machine learning methods. The proposed method consists of three steps: detecting the nucleus and cytoplasm, extracting features, and classification. First, a new algorithm is designed to segment the nucleus. To detect the cytoplasm, only the part of it located inside the convex hull of the nucleus is used. This approach helps us overcome the difficulties of segmenting the cytoplasm. In the second phase, three shape features and four novel color features are devised and extracted. Finally, the WBCs are classified using an SVM model. The segmentation algorithm can detect the nucleus with a dice similarity coefficient of 0.9675. The proposed method can categorize WBCs in the Raabin-WBC, LISC, and BCCD datasets with accuracies of 94.65%, 92.21%, and 94.20%, respectively. Besides, we show that the proposed method possesses more generalization power than pre-trained CNN models. It is worth mentioning that the hyperparameters of the classifier are fixed using only the Raabin-WBC dataset, and these parameters are not readjusted for the LISC and BCCD datasets.

sification of WBCs by tuning pre-trained AlexNet and LeNet-5 networks as well as training a new CNN from scratch. They declared that the novel network they have proposed performed better than the fine-tuned networks mentioned previously. Jung et al. 22 designed a new CNN architecture called W-Net to classify WBCs in the LISC dataset. Baydilli and Atila 23 adopted capsule networks to classify the WBCs existing in the LISC dataset. Banik et al. 24 devised a fused CNN model in the task of differential WBC count and evaluated their model with the BCCD dataset. Liang et al. 25 combined the output feature vector of the flatten layer in a fine-tuned CNN and a long short term memory network to classify WBCs in BCCD dataset. A new complicated fused CNN introduced in 26 was trained from scratch on 10,253 augmented WBCs images from the BCCD dataset. Despite the complexity of the proposed CNN in 26 , the number of its parameters stands at 133,000.
For the classification of WBCs based on traditional frameworks, segmenting the nucleus and the cytoplasm of WBCs is a vital but difficult task. In this study, a novel and accurate method to segment the nucleus is put forward. To segment the nucleus, some researchers applied thresholding algorithms (e.g., Otsu's thresholding algorithm and the Zack algorithm) after various pre-processing techniques 27–30 . A combination of machine learning and image processing techniques is also commonly employed to segment the nucleus of the WBC 31,32 . Moreover, during the last decade, CNNs have gained popularity and are used to segment the nucleus and cytoplasm of the WBC 33 . Segmenting the cytoplasm is more complicated and less accurate than segmenting the nucleus. Therefore, in this paper, a part of the cytoplasm rather than the whole cytoplasm is detected and used as a representative of the cytoplasm (ROC). As a result, this approach avoids the difficulties of segmenting the cytoplasm. We discuss this method further in the materials and methods section.
To classify WBCs after segmenting the nucleus and the cytoplasm, discriminative features need to be extracted. Shape characteristics such as circularity, convexity, and solidity are meaningful features for the nucleus. This is because lymphocytes and monocytes are mononuclear, and the shapes of their nuclei are circular and ellipsoidal, respectively 11 . On the other hand, the nuclei of neutrophils and eosinophils are multi-lobed 11 and non-solid. Characteristics such as color and texture, e.g., local binary pattern (LBP) or gray level co-occurrence matrix (GLCM), are also interpretable features for the cytoplasm 11 . In addition to the mentioned features, the SIFT (scale-invariant feature transform) or dense SIFT algorithm can be employed for feature extraction. In the next paragraph, we review some related works that use traditional frameworks for classifying WBCs.
Rezatofighi and Soltani-zadeh 7 proposed a new system for the classification of five types of WBCs. In this system, nucleus and cytoplasm were extracted using the Gram-Schmidt method and Snake algorithm, respectively. Then, LBP and GLCM were used for feature extraction, and WBCs were categorized using a hybrid classifier including a neural network and an SVM model. Hiremath et al. 28 segmented the nucleus utilizing a global thresholding algorithm and classified WBCs using geometric features of the nucleus and cytoplasm. In 29 , Otsu's thresholding algorithm was used to detect the nucleus, and shape features such as area, perimeter, eccentricity, and circularity were extracted to identify five types of WBCs. Diagnosing ALL using images of WBCs was investigated in 30 . The authors of this paper applied the Zack algorithm to estimate the threshold value to segment the cells. Then, shape, texture, and color features were extracted, and the best features were selected by the means of the social spider optimization algorithm. Finally, they classified WBCs into two types of healthy and non-healthy, using a KNN classifier. Ghane et al. 31 designed a new method to segment the nucleus of the WBCs through a novel combination of Otsu's thresholding algorithm, k-means clustering, and modified watershed algorithm, and succeeded in segmenting nuclei with a precision of 96.07%. Laosai and Chamnongthai 32 examined the task of diagnosing ALL and acute myelogenous leukemia using the images of the WBCs. They detected the nuclei by employing the k-means clustering algorithm, extracted shape and texture features, and finally categorized WBCs utilizing an SVM classifier.
In this section, we briefly introduced WBCs, their clinical importance, and the available datasets, together with the methods used to classify and count WBCs in other studies. In the materials and methods section, we present our proposed method for classifying WBCs. Afterwards, we present the obtained results and compare them with those of other studies.

Materials and methods
Overview of the proposed method. This research aims to propose a new method for classifying white blood cells in peripheral smear images that is lighter, faster, and more robust than CNN-based methods. Since the proposed method is light and fast, it has a low processing cost; therefore, the algorithm can be easily executed on minicomputers and mobile devices, and there is no need for a TPU or GPU. The method we put forward is based on classical machine learning. In other words, we extract features manually and do not use CNNs to extract features automatically. As stated before, the method can be divided into three main steps: detecting the nucleus and cytoplasm, extracting shape and color features, and classifying WBCs with an SVM model. Figure 1 shows the block diagram of our method. In the nucleus detection phase, a novel method is designed and compared with other existing methods. For the feature extraction phase, four new color features are designed, and it will be shown that these new features enhance the classification accuracy. It is worth noting that the color features designed in this research are not general and can only be used for the WBC classification problem. In the final phase, the proposed method is evaluated on three different datasets, which are described in detail in the next section. Two of these datasets are also used to assess the robustness of our method against changes in imaging instruments and staining techniques, which can be treated as generalization power. In the real world, the generalizability of intelligent systems is a very important ability, so this aspect of the proposed method deserves attention. For this purpose, this research has investigated the generalization of the suggested method and compared it with well-known CNN models. In the results section, it can be seen that the proposed method possesses more generalization power than these well-known CNN models.
Datasets. Three different datasets used in this study are Raabin-WBC 15 , LISC 7 , and BCCD 16 . These datasets are discussed in the next three subsections, and are compared in Table 1. Also, Fig. 2 shows some sample images of these three datasets.
Raabin-WBC dataset. Raabin-WBC 15 is a large free-access dataset published in 2021. The Raabin-WBC dataset possesses three sets of cropped WBC images for classification: Train, Test-A, and Test-B. All WBCs in the Train and Test-A sets have been separately labeled by two experts. However, the images of Test-B have not yet been labeled thoroughly. Therefore, in this study we only used the Train and Test-A sets. These two sets have been collected from 56 normal peripheral blood smears (for lymphocyte, monocyte, neutrophil, and eosinophil) and one chronic myeloid leukemia (CML) case (for basophil) and contain 14,514 WBC images. All these films were stained through the Giemsa technique. The normal peripheral blood smears have been imaged using the camera phone of a Samsung Galaxy S5 and an Olympus CX18 microscope. Also, the CML slide has been imaged utilizing an LG G3 camera phone along with a Zeiss microscope. It is worth noting that the images have all been taken with a magnification of 100.
www.nature.com/scientificreports/
LISC dataset. LISC dataset 7 contains 257 WBCs from peripheral blood, which have been labeled by only one expert. The LISC dataset has been acquired from peripheral blood smear and stained through Gismo-right technique. These images have been taken at a magnification of 100 using a light microscope (Microscope-Axioskope 40) and a digital camera (Sony Model No. SSCDC50AP). We cropped all WBCs in this dataset as shown in Fig. 2.
BCCD dataset. BCCD dataset 16 has been taken from the peripheral blood and includes 349 WBCs labeled by one expert. The Gismo-right technique has been employed for staining the blood smears. This dataset, also, has been imaged at a magnification of 100 using a regular light microscope together with a CCD color camera 34 . In addition, based on diagnosis made by two of our experts, we found that one of the images of the BCCD dataset had been incorrectly labeled, and thus, we corrected this label.
Training, augmented training, and test sets. For the Raabin-WBC dataset, we have employed already split sets of the original data namely Train and Test-A sets for training and test. In this dataset, different blood smears have been considered for the training and testing sets. Test-A and Train sets comprise almost 30 percent and 70 percent of the whole data, respectively. For the LISC dataset, we randomly selected 70 percent of the data for training, and 30 percent for testing. BCCD dataset has two splits in the original data, 80% of which serve as training and 20% as testing. Since this dataset had only three basophils, we ignored the basophils in BCCD and only considered the remaining four types.
To train an appropriate classifier, it is necessary to balance the training data by adopting various augmentation methods. For this reason, augmentation methods such as horizontal flip, vertical flip, random rotation (between −90 and +90 degrees), random scaling (between 0.8 and 1.2), and combinations of them were used to augment the training sets of the Raabin-WBC and LISC datasets. The training data of the BCCD dataset had already been augmented. Table 1 presents the amount of data in each set.
Nucleus segmentation. Nucleus segmentation consists of three steps. First, a color balancing algorithm 1 is applied to the RGB input image. Then, the CMYK and HLS color spaces are computed and combined into a soft map. Finally, the nucleus is segmented by applying Otsu's thresholding. In this research, the color balancing algorithm of 1 is utilized to reduce color variations. To create a color-balanced representation of the image, it is necessary to compute the mean of the R, G, and B channels as well as the grayscale representation of the RGB image. Then, by using Eq. (1), the new balanced R, G, and B components are obtained.
It is worth mentioning that the proposed segmentation algorithm was obtained through extensive trial and error. It was found that the algorithm detects the nuclei very well. To evaluate its performance, 250 new images from the Raabin-WBC dataset were used; the evaluation details are described in the results section.
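The color balancing and thresholding steps can be illustrated with a short sketch. Eq. (1) and the exact construction of the soft map are not reproduced in this text, so the code below rests on stated assumptions: the balanced channels are obtained by scaling each channel so that its mean matches the mean gray level (a gray-world style correction), the grayscale image is the plain channel average, and `soft_map` is a hypothetical, already computed array in which nucleus pixels are brighter than the background. The function names are illustrative only.

```python
import numpy as np

def color_balance(rgb):
    """Scale each channel so its mean equals the mean gray level.

    Assumed form of the balancing step; assumes no channel is entirely zero.
    """
    rgb = rgb.astype(np.float64)
    gray_mean = rgb.mean(axis=2).mean()           # mean of the grayscale image
    channel_means = rgb.reshape(-1, 3).mean(axis=0)
    return rgb * (gray_mean / channel_means)

def otsu_threshold(values, bins=256):
    """Return the bin center that maximizes the between-class variance."""
    hist, edges = np.histogram(values.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)                          # pixels at or below each bin
    w1 = w0[-1] - w0                              # pixels above each bin
    s0 = np.cumsum(hist * centers)
    m0 = s0 / np.maximum(w0, 1)                   # mean of the lower class
    m1 = (s0[-1] - s0) / np.maximum(w1, 1)        # mean of the upper class
    between = w0 * w1 * (m0 - m1) ** 2
    return centers[np.argmax(between)]

def nucleus_mask(soft_map):
    """Binary mask of pixels above Otsu's threshold (nucleus assumed brighter)."""
    return soft_map > otsu_threshold(soft_map)
```

In practice, a library implementation of Otsu's method could replace `otsu_threshold`; the NumPy version is used here only to keep the sketch self-contained.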
Cytoplasm detection. To extract proper features from the cytoplasm, it is first necessary to segment it. However, segmenting the cytoplasm is more difficult and less accurate than segmenting the nucleus. Hence, we designed a new method to solve this problem. In this method, the convex hull of the nucleus is obtained first, and the part of the cytoplasm located inside the convex hull is taken as the representative of the cytoplasm (ROC). The more convex the nucleus is, the smaller the ROC is. Thus, lymphocytes, which usually have a circular nucleus, have a smaller ROC than neutrophils. Figure 5 illustrates this point.
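The ROC idea can be sketched as follows: given a binary nucleus mask, compute the convex hull of its pixels and keep the hull pixels that do not belong to the nucleus. This is a minimal illustration under assumptions, not the authors' implementation: the hull is built with Andrew's monotone chain algorithm on pixel centers, and the function names are hypothetical.

```python
import numpy as np

def convex_hull(points):
    """Andrew's monotone chain on (x, y) points; returns hull vertices in order."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return np.array(pts)

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return np.array(lower[:-1] + upper[:-1])

def hull_mask(nucleus):
    """Mask of pixels whose centers lie inside the convex hull of the nucleus."""
    nucleus = nucleus.astype(bool)
    ys, xs = np.nonzero(nucleus)
    hull = convex_hull(np.column_stack([xs, ys]))
    if len(hull) < 3:                 # degenerate hull (empty, point, or line)
        return nucleus.copy()
    yy, xx = np.mgrid[:nucleus.shape[0], :nucleus.shape[1]]
    inside = np.ones(nucleus.shape, dtype=bool)
    for (x0, y0), (x1, y1) in zip(hull, np.roll(hull, -1, axis=0)):
        # Keep pixels on the inner side of (or exactly on) every hull edge.
        inside &= (x1 - x0) * (yy - y0) - (y1 - y0) * (xx - x0) >= 0
    return inside

def roc_mask(nucleus):
    """Representative of the cytoplasm: hull pixels that are not nucleus."""
    return hull_mask(nucleus) & ~nucleus.astype(bool)
```

For a fully convex nucleus mask the ROC is empty, while a lobed or L-shaped nucleus leaves a non-empty region between its lobes, which matches the lymphocyte versus neutrophil behavior described above.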
Feature extraction. In this study, two groups of features are taken into account. The first group includes shape features of the nucleus (convexity, circularity, and solidity). The equations associated with the shape features are as follows 1 :

$$\text{Solidity} = \frac{\text{Area of nucleus}}{\text{Area of convex hull}} \tag{2}$$

The second group of features is color characteristics. According to the experience of hematologists, in addition to the shape features of the nucleus, the color features of the nucleus and the cytoplasm can also provide us with useful information about the type of WBC 11 . In this research, four novel color features by means of nucleus region, convex hull region, and ROC region are designed as follows:

3. $\dfrac{\text{Mean of ROC}}{\text{Mean of convex hull}}$

4. $\dfrac{\text{Standard deviation of ROC}}{\text{Standard deviation of convex hull}}$
These color features were extracted from the components of RGB, HSV, LAB, and YCrCb color spaces. Therefore, 48 color features and 3 shape features were extracted, which comes up to a total of 51 features. By looking at the classifier's performance in the results section, it is evident that the introduced color features significantly improve the classification accuracy.
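A minimal sketch of how such ratio features could be computed is given below. It covers only what is recoverable from the text: solidity and the two ROC-based ratios (mean and standard deviation of the ROC over those of the convex hull), evaluated per channel. The function names, the mask arguments, and the stacking of the 12 channels (RGB, HSV, LAB, YCrCb) into one array are assumptions made for illustration; the sketch also assumes a non-empty ROC and non-constant hull values so that no denominator is zero.

```python
import numpy as np

def solidity(nucleus, hull):
    """Area of the nucleus divided by the area of its convex hull."""
    return nucleus.sum() / hull.sum()

def roc_color_features(channels, hull, roc):
    """Per-channel ratios of ROC statistics to convex hull statistics.

    channels: (H, W, C) array holding the color components, e.g. C = 12.
    hull, roc: boolean masks of the convex hull and the ROC region.
    Returns [mean ratio, std ratio] for each channel, in channel order.
    """
    feats = []
    for c in range(channels.shape[2]):
        ch = channels[..., c].astype(np.float64)
        feats.append(ch[roc].mean() / ch[hull].mean())
        feats.append(ch[roc].std() / ch[hull].std())
    return np.array(feats)
```

With 12 channels this yields 24 values; the remaining 24 color features reported above would come from the other two feature definitions, which are not sketched here.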
Classification. After features are extracted from the augmented data, they are normalized using the max-min method and fed into an SVM classifier. We also tested other classifiers such as KNN and deep neural networks; however, we observed that the SVM provides the best results. Through trial and error, we found that the best overall accuracy is obtained when the weight of the neutrophil class in training is set to more than one and the weights of the remaining classes are kept at one. Three commonly used kernels (linear, polynomial, and radial basis function) are tested in this regard. Besides, the regularization parameter C is an important parameter for training an SVM model. Thus, three important hyperparameters (class weight, kernel, and C) are tuned to properly train the SVM model. To find the optimal hyperparameters, we applied fivefold cross-validation on the Train set of Raabin-WBC with three different kernels (linear, polynomial with degree three, and radial basis function), neutrophil-weight = 1, 2, 5, 10, 15, 20, and C = 1, 2, 4, 6, 8, 10. Hence, 3 × 6 × 6 = 108 combinations were examined. Table 2 shows the results for the different combinations of hyperparameters. From Table 2, it can be seen that the best accuracy is obtained with the polynomial kernel, a neutrophil-weight of 10, and C equal to 6. We fixed these hyperparameters obtained on the Raabin-WBC dataset, meaning that we did not readjust them for the LISC and BCCD datasets.
Table 2. The accuracy of fivefold cross-validation on Raabin-WBC used to find the optimal hyperparameters; RBF (radial basis function), Poly (polynomial with degree 3), C (regularization parameter), Neut-W (neutrophil-weight). The results show that the SVM model with polynomial kernel, C = 6, and neutrophil-weight = 10 provides the best accuracy. Bold values illustrate the best-obtained value.
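The search space and the normalization step can be written down compactly. The sketch below enumerates the 108 hyperparameter combinations and applies min-max scaling; it is an illustration under assumptions rather than the authors' code. The dictionary keys mirror the `kernel`, `degree`, `C`, and `class_weight` arguments of scikit-learn's `SVC`, which the text does not name as the library used, and computing the scaling statistics on the training features only is likewise an assumption.

```python
from itertools import product

import numpy as np

KERNELS = ["linear", "poly", "rbf"]
NEUTROPHIL_WEIGHTS = [1, 2, 5, 10, 15, 20]
C_VALUES = [1, 2, 4, 6, 8, 10]

def hyperparameter_grid(neutrophil_label="neutrophil"):
    """All kernel / neutrophil-weight / C combinations examined in the text."""
    return [
        {
            "kernel": kernel,
            "degree": 3,  # only relevant for the polynomial kernel
            "C": c,
            # Classes missing from the mapping are assumed to keep weight one.
            "class_weight": {neutrophil_label: weight},
        }
        for kernel, weight, c in product(KERNELS, NEUTROPHIL_WEIGHTS, C_VALUES)
    ]

def min_max_normalize(train, other):
    """Scale features to [0, 1] using the training minimum and maximum."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (train - lo) / span, (other - lo) / span
```

Each dictionary could then be evaluated with fivefold cross-validation on the training set, keeping the combination with the highest mean accuracy.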

Results
The result of nucleus segmentation. The performance of the proposed nucleus segmentation algorithm is evaluated using three different metrics, namely dice similarity coefficient (DSC), sensitivity, and precision. These metrics are computed using the true positive (TP), false positive (FP), true negative (TN), and false negative (FN) counts of the resulting segmentation (as shown in Fig. 6) and are given by the following equations.

$$\text{DSC} = \frac{2\,TP}{2\,TP + FP + FN} \tag{5}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{6}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{7}$$
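As a brief illustration, the three metrics can be computed from a predicted mask and a ground-truth mask using their standard definitions. The function name and the dictionary layout are illustrative choices, and the sketch assumes both masks contain at least one foreground pixel.

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """DSC, sensitivity, and precision of a binary mask against ground truth."""
    pred, truth = np.asarray(pred, dtype=bool), np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)    # nucleus pixels correctly detected
    fp = np.sum(pred & ~truth)   # background pixels marked as nucleus
    fn = np.sum(~pred & truth)   # nucleus pixels that were missed
    return {
        "dsc": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }
```

True negatives do not enter any of the three metrics, which keeps them insensitive to how much background surrounds the cell in the cropped image.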
In order to extract the ground truth, 250 images including 50 images from each type of WBCs were randomly selected from Raabin-WBC dataset. Then, the ground truths for these images were identified by an expert with the help of Easy-GT software 35 . Also, since very dark purple granules cover the basophil's surface, it is almost impossible to distinguish the nucleus 11 . Therefore, the whole basophil cell was considered as the ground truth. The results of the proposed segmentation algorithm have been presented in Table 3. The proposed segmentation method can detect the nucleus with precision, sensitivity, and dice similarity coefficient of 0.9972, 0.9526, and 0.9675, respectively.
The performance of the proposed segmentation algorithm is compared with that of U-Net++ 33 , Attention U-Net 36 , Mask R-CNN 37 (with ResNet50 38 as backbone), and Mousavi et al.'s method 35 . U-Net++, Attention U-Net, and Mask R-CNN are three well-known deep CNNs developed for image segmentation. To train these models, 989 images from the Raabin-WBC dataset were randomly chosen, and their ground truths were extracted by an expert utilizing the Easy-GT software 35 . The training set includes 199 lymphocytes, 199 monocytes, 199 neutrophils, 195 eosinophils, and 197 basophils. The three aforesaid deep CNN models were trained for 40 epochs and then evaluated with the 250 ground truths mentioned in the previous paragraph. Table 3 presents the results of the different segmentation algorithms. It can be seen that the proposed segmentation method has a very low standard deviation for DSC and precision, which indicates that it works consistently well for different cells in the data. In addition, U-Net++, Attention U-Net, and Mask R-CNN are deep CNNs, and their training process is supervised. Hence, they need far more data to be trained, whereas our proposed method does not need to be trained. Also, these three models have many parameters and need more time to segment an image, but the proposed segmentation algorithm is simpler and faster. The suggested method can detect the nucleus of a WBC in a 575 by 575 image in 45 ms, whereas U-Net++, Attention U-Net, and Mask R-CNN need 1612, 628, and 1740 ms, respectively. The proposed method, U-Net++, Attention U-Net, Mask R-CNN, and Mousavi et al.'s method 35 were implemented in Google Colab in CPU mode, and their execution times were compared.

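Result of classification. To evaluate the classification accuracy, four metrics are used: Precision, Sensitivity, F1-score (F1), and Accuracy (Acc). For a two-class classification problem in which the first class is called Positive and the second class is called Negative, the confusion matrix can be assumed as in Table 4, and the mentioned criteria are obtained through relations (8), (9), (10), and (11).

$$\text{Precision} = \frac{TP}{TP + FP} \tag{8}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{9}$$

$$\text{F1} = \frac{2 \times \text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}} \tag{10}$$

$$\text{Acc} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$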
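For a multi-class problem such as WBC classification, the same criteria are usually computed per class in a one-vs-rest manner from the confusion matrix. The sketch below uses the standard definitions; the function name and the convention that rows are true classes and columns are predicted classes are assumptions, and every class is assumed to appear at least once among both the true and the predicted labels.

```python
import numpy as np

def classification_metrics(confusion):
    """Per-class precision, sensitivity, and F1, plus overall accuracy.

    confusion[i, j] counts samples of true class i predicted as class j.
    """
    cm = np.asarray(confusion, dtype=np.float64)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)      # TP / (TP + FP), column-wise
    sensitivity = tp / cm.sum(axis=1)    # TP / (TP + FN), row-wise
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = tp.sum() / cm.sum()
    return precision, sensitivity, f1, accuracy
```

Reporting the metrics per class, as done in the comparison tables, avoids the situation in which a dominant class such as neutrophils hides poor performance on rare classes.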
In order to evaluate the effectiveness of color features, Raabin-WBC, LISC, and BCCD datasets are classified in two modes: classification using the shape features, and classification using the shape features together with the color ones. The comparison of the classification accuracy of these two modes is provided in Table 5. It can be seen in Table 5 that adding proposed color features significantly changes the classification results. Addition of color features leads to a remarkable increase in precision, sensitivity, and F1-score for all five types of WBCs. The proposed method classifies WBCs in Raabin-WBC, LISC, and BCCD datasets with accuracies of 94.65%, 92.21%, and 94.20%, respectively. The resulting confusion matrices of our proposed method for the three datasets are shown in Fig. 7.
Comparison with the state-of-the-art methods. Since the LISC and BCCD datasets have been publicly available for several years, the performance of the proposed method on these two datasets is compared to that of state-of-the-art works in terms of precision, sensitivity, and F1-score. Also, because the categorization of WBCs in peripheral blood is an imbalanced classification problem 15 , the comparison has been made for each class. Table 6 shows the detailed comparisons. Considering the F1-score, which covers both precision and sensitivity, our proposed method achieves the best performance in most classes. In the LISC dataset, the proposed method has classified neutrophils, eosinophils, and basophils with F1-scores of 97.14%, 100%, and 100%, respectively. Also, in the BCCD dataset, our method was able to classify lymphocytes, monocytes, and neutrophils with F1-scores of 100%, 100%, and 96%, respectively. In comparison with traditional approaches, the method employed in this article is simple and creative and can be easily implemented. In this method, suitable shape and color features are extracted from the nucleus and the cytoplasm, yet there is no need for the cytoplasm to be segmented. The methods used in 19, 22–25 , and 26 are based on deep learning approaches. Therefore, their models are more complex and have more trainable parameters than our classifier, which is an SVM. For example, the models utilized in 22,23 , and 25 have 16.5, 23.5, and 59.5 million parameters, respectively. Besides, it should be noted that the hyperparameters of our SVM model were set using only the Raabin-WBC dataset and were not readjusted on the LISC and BCCD datasets. In contrast, the other methods tuned the hyperparameters of their classifiers on each dataset separately.
Generalizability. In this section, we compare our method with five well-known pre-trained CNN models in terms of generalization power: ResNet50 38 , ResNext50 39 , MobileNet-V2 40 , MnasNet1 41 , and ShuffleNet-V2 42 . These models and the proposed method are trained with the augmented training set of Raabin-WBC and then evaluated with the test set of Raabin-WBC and all cropped images of the LISC dataset. The Raabin-WBC test and train sets have been acquired with the same imaging and staining process, so the performance of the models is not expected to decrease. The LISC dataset, however, has been collected with different imaging devices and staining techniques, so the performance of the models is likely to drop significantly. A drop in accuracy is natural; what matters is how large the drop is. At first glance, it seems that the aforementioned pre-trained CNN models should be robust to a change of dataset because they have been trained on the ImageNet dataset, which contains more than one million images from 1000 categories 43 . Therefore, these models should extract robust features and possess high generalization ability; the results, however, show otherwise. According to Table 7, the accuracy of the pre-trained CNN models drops from above 98% to 30% or below, while the proposed method is more robust, and its accuracy drops from 94.65% to 50.97%. This is probably because the features extracted by pre-trained models are too many, and most of them are zeros or redundant 20 . Extracting a large number of features before the fully connected layers also increases the number of trainable parameters.
In addition, extracting a large number of features imposes a huge processing cost for training, which makes the use of expensive GPUs or TPUs inevitable, whereas our method is very light and simple and can be easily executed on a CPU or an affordable, tiny processor such as a Raspberry Pi. From the results presented in Table 7, it can be seen that our proposed method outperforms pre-trained CNNs in terms of generalization power and execution time. Although the proposed method is more robust than pre-trained CNNs, we do not claim that it has high generalization power; it still needs improvement to reach high generalization ability. Finally, we should mention that the hyperparameters of the aforesaid pre-trained CNNs were selected according to reference 15 ; the last fully connected layer of each model was modified for a five-class problem, and then all the layers were retrained/fine-tuned. All the pre-trained CNN models mentioned in Table 7 were implemented in Python 3.6.9 with PyTorch 1.5.1 and trained on a single NVIDIA GeForce RTX 2080 Ti graphics card. For comparing inference time, however, all the models reported in Table 7 were executed on a CPU configuration (Windows 10 64-bit, Intel Core-i7 HQ 7700, 12 GB RAM, and 256 GB SSD hard disk).

Discussion
As mentioned before, the proposed method contains three phases. Segmenting the nucleus and detecting the part of the cytoplasm located in the nucleus's convex hull are performed in the first phase. After shape and color features are extracted, WBCs are finally categorized using these features. Our proposed nucleus segmentation algorithm consists of several steps depicted in Fig. 3. These steps have been designed to remove the red blood cells and the cytoplasm. From Table 3, it is clear that the segmentation algorithm can detect the nucleus with a very high precision of 0.9972 and a DSC of 0.9675. The proposed segmentation algorithm is also very fast in comparison with the U-Net++, Attention U-Net, and Mask R-CNN models (Table 3). In the cytoplasm detection phase, in contrast to the common practice of segmenting the whole cytoplasm, only the parts of the cytoplasm that are inside the convex hull of the nucleus were selected as a representative of the cytoplasm (ROC). This approach avoids the difficulties of segmenting the cytoplasm, while the classification accuracy is boosted with the help of features extracted from the ROC.
In the Feature extracting phase, we used three common shape features namely solidity, convexity, and circularity. Besides, we designed four novel color features and extracted them from channels of RGB, HSV, LAB, and YCrCb color spaces. According to Table 5, it is obvious that the designed color features have remarkably increased the classification accuracy.
In the final phase, the classification is done with an SVM model. To choose the best hyperparameters for the SVM model, fivefold cross-validation was applied only on the Raabin-WBC dataset. The SVM model was trained separately for each combination of hyperparameters to obtain the best one (Table 2). The method we put forward is automatic, simple, and fast, and it does not need to resize the images or segment the cytoplasm. According to Table 6, in the LISC dataset, the proposed method came first in distinguishing neutrophils, eosinophils, and basophils. In addition, in the BCCD dataset, our method ranked first in detecting lymphocytes, monocytes, and neutrophils. Besides, according to Table 7, our proposed method has more generalization power than pre-trained networks. The features designed and extracted in this research describe the shape of the nucleus and the color of the nucleus and cytoplasm, which are important characteristics that hematologists pay attention to when identifying the WBC type. Therefore, these features are meaningful, whereas the features extracted by CNNs are not interpretable, and many of them are zeros or redundant 20 , which causes the network to overfit the dataset and harms generalization power. Even though the proposed method outperforms pre-trained CNNs in terms of generalization power, its generalizability is still insufficient and needs improvement, and we aim to design new features in future work to address this. In addition to generalization ability, our method outperforms the CNNs in terms of inference time. The CNN models have many trainable parameters, which increases the inference time and makes the use of powerful hardware such as a GPU or TPU inevitable (Table 7), while our method is faster and can run quickly on a CPU or an affordable processor such as a Raspberry Pi.
Table 7. The comparison of the pre-trained models and our proposed method in terms of generalization ability, trainable parameters, and inference time. Acc and ms are the abbreviations of accuracy and millisecond, respectively.

Conclusion
This research designed a novel nucleus segmentation algorithm and four new color features to classify WBCs. This paper has two contributions. The first is a new algorithm for segmenting the nucleus that is fast and accurate and, unlike CNN-based methods, does not need to be trained. The second is the design and extraction of four new color features from the nucleus and cytoplasm. To extract color features from the cytoplasm, we used the convex hull of the nucleus, which eliminates the need for segmenting the cytoplasm, a challenging task. We showed that these features help the SVM model classify WBCs more accurately. The proposed method successfully classified three datasets differing in terms of microscope, camera, staining technique, variation, and lighting conditions, with accuracies of 94.65% (Raabin-WBC), 92.21% (LISC), and 94.20% (BCCD). In addition, the results presented in Table 7 indicate that the proposed method is faster and has more generalization ability than the CNN-based methods. Therefore, we can conclude that the suggested method is not only robust and reliable but can also be utilized for laboratory applications.