COVID-19 infection segmentation using hybrid deep learning and image processing techniques

The coronavirus disease 2019 (COVID-19) epidemic has become a worldwide problem that continues to affect people's lives daily, and early diagnosis of COVID-19 is critically important for medical and healthcare organizations treating infected patients. To detect COVID-19 infections, medical imaging techniques, including computed tomography (CT) scans and X-ray images, are among the helpful medical tests that healthcare providers carry out. However, in addition to the difficulty of segmenting infected areas from CT scan images, these approaches offer limited accuracy for identifying the virus. Accordingly, this paper examines the effectiveness of combining deep learning (DL) with image processing techniques, which serve to expand the dataset without the need for any augmentation strategies, and presents a novel approach for detecting COVID-19 infections in lung images, focusing on the infection prediction issue. In our proposed method, to reveal the infection, the input images are first preprocessed using a threshold and then resized to 128 × 128. After that, a density heat map tool is used to color the resized lung images. The three channels (red, green, and blue) are then separated from the colored image, further preprocessed through image inverse and histogram equalization, and fed independently into three separate U-Nets with the same architecture for segmentation. Finally, the segmentation results are combined and passed through a 1 × 1 convolution layer to obtain the detection. The performance of the proposed approach was measured on the CT scan dataset and compared with other state-of-the-art techniques in terms of accuracy, sensitivity, precision, and the dice coefficient. The proposed approach reached 99.71%, 0.83, 0.87, and 0.85, respectively.
These results show that coloring the CT scan images dataset and then dividing each image into its RGB image channels can enhance the COVID-19 detection, and it also increases the U-Net power in the segmentation when merging the channel segmentation results. In comparison to other existing segmentation techniques employing bigger 512 × 512 images, this study is one of the few that can rapidly and correctly detect the COVID-19 virus with high accuracy on smaller 128 × 128 images using the metrics of accuracy, sensitivity, precision, and dice coefficient.


Literature review
Despite some claims to the contrary, COVID-19 is resurfacing, so early identification of COVID-19 illness is necessary for controlling this outbreak. For this reason, medical image processing techniques have recently become widely used for the automated identification of COVID-19, as CT scan and X-ray images are analyzed using artificial intelligence technologies 33 . In this regard, several methods have been proposed recently for the detection and segmentation of COVID-19 infection in the lungs using chest X-rays and CT scans. To identify COVID-19 using these methods, feature extraction is a crucial step, performed using either manual techniques or deep learning techniques [12][13][14][15][16][17]23,34,35 . For instance, the authors of 14 employed manual feature extraction methods to calculate frequency and spatial features from X-ray images and build a feature vector of 256 elements. After that, they used Principal Component Analysis (PCA) to select the most significant features, which were then used to train and test a multilayer perceptron (MLP) network to classify healthy, pneumonia, and COVID-19 cases. Because deep learning removes the need for manual feature extraction and offers an end-to-end architecture, scholars have been motivated to carry out more research in this direction.
Accordingly, hybrid approaches have been used to detect COVID-19. For instance, the scholars in 36 discuss various deep learning techniques used for COVID-19 detection and diagnosis, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and hybrid models. These hybrid approaches use pre-trained CNN models as feature extractors and classical machine learning algorithms in the classification process, as in 11,[37][38][39][40] . For example, the authors of 40 handled COVID-19 CT scan binary classification using four potent pre-trained CNN models: VGG16, DenseNet121, ResNet50, and ResNet152. This approach uses a FastAI ResNet framework to automatically choose the optimum architecture from CT scans. To get around the lengthy training period, transfer learning strategies were also employed, since transfer learning performs best when few training data sets are available. Another method, proposed in 11 , compared the effectiveness of three pre-trained CNN models (AlexNet, ResNet50, and SqueezeNet) with that of three machine learning classifiers (Naive Bayes, Bagging, and REPTree) in categorizing chest CT images into two classes, namely COVID and non-COVID. The three CNN models were also trained using the transfer learning strategy. According to their research, the Naive Bayes classifier had the best accuracy among the machine learning classifiers (97%), while ResNet50 had the highest accuracy among the three CNN models (99.1%). As a result, they reported that deep learning networks outperformed machine learning approaches when classifying COVID-19 chest CT scan images. On the other hand, because of their high sensitivity, specificity, and correct prediction rate, molecular methods like reverse transcription-polymerase chain reaction (RT-PCR) are frequently recognized as the gold standard for identifying COVID-19 5 . The speed and effectiveness of screening suspected cases, however, are hindered by a lack of resources for performing RT-PCR assays. Furthermore, several studies have shown that RT-PCR tests have high false positive and false negative rates 6,7 . The fast mutations and genetic variety of COVID-19 are thought to be the main cause of this. In addition to molecular methods and medical imaging modalities, several studies [41][42][43] have identified COVID-19 from features in its genomic sequence. These methods use a variety of genomic signal mapping algorithms to convert genome sequences into genomic signals, which are then analyzed using digital signal processing technologies to create useful systems that can detect COVID-19. In another study, COVID-19 was identified using genomic image-processing (GIP) methods 44 . GIP is an area that connects bioinformatics with image processing techniques.

Contribution
Since COVID-19 is a novel pandemic, only a limited number of datasets are publicly available. The size of the data is a key factor in the performance of any DL model. Data augmentation is a useful approach to overcoming the dataset's size limitation and preventing over-fitting or under-fitting issues. Most methods employed traditional augmentation techniques, such as rotation, shearing, scaling, and horizontal and vertical translations, among others, or a generative adversarial network (GAN) 45 to overcome the reliance on the original data that may occur with traditional augmentation techniques. Although most methods have sought to employ different data augmentation techniques to make up for the lack of data, the efficacy of data augmentation in real-life and live images for the detection of COVID-19 remains unclear. Furthermore, general data expansion methods produce images that strongly resemble the original images. In our method, we did not use any augmentation methods and instead employed various image pre-processing techniques to split each image into red, green, and blue (RGB) channels, which are fed into three separate U-Nets, one for each RGB image channel. Consequently, the original dataset (3138 CT images) is immediately enlarged to 3138 × 3 without the need for any augmentation methods. Furthermore, this division passes different information to each U-Net, so infection detection draws on a larger amount of information than the grey image, which contains only one channel. The proposed system merges the prediction results of the three RGB image channels to present its final result. The basic idea of the proposed algorithm is to map scalar values of the given CT images to colors (e.g., RGB image channels in our case) and use the heat map as a data visualization tool. These colors can be used to highlight features in the given images, such as bone density, tissue density, or blood flow, as given in 27 . In the case of COVID-19, infected individuals often exhibit a fever, which can be detected using thermal imaging and represented as a heat map [46][47][48] . Heat maps are graphical representations used to identify the region of interest in data by observing how colors change. In the current study, a heat map is applied to the lung images to visualize the lung part in RGB color space and give more details that facilitate the detection of infection inside the lung. Every channel is passed into a U-Net architecture; before classification, the segmented results are merged and passed into a 1 × 1 convolution layer. The three RGB channels were employed to expand the feature information, which enhanced infection prediction. The proposed technique can detect COVID-19 with a 99.71% accuracy rate and improved sensitivity and specificity. The proposed approach possesses the following merits:
1. It avoids the use of data augmentation to overcome the limited size of the dataset, which is commonly applied by most of the previous methods.
2. It uses a small image size to train the model, which increases the speed of training and testing compared with the large images commonly used.
3. It uses RGB image channels to predict COVID-19 infection.
4. It merges the COVID-19 prediction results of three U-Nets, one per image channel, which increases the prediction accuracy.
5. The different number of filters in the convolution block in the proposed approach reduces the total number of parameters, produces accurate results, and increases the running speed of the model.

Basic notions
Convolutional neural networks (CNNs)
One of the most important networks in the field of DL is the convolutional neural network (CNN), which is used for a variety of classification problems 49,50 . The success of CNNs is largely due to their built-in capacity to automatically extract features from input data without operator intervention 51 . One of the major advantages of CNNs over other neural networks is that they can deal with 2D image data directly, so the input images do not need to be flattened to 1D, which helps retain the "spatial" features of the images. A hierarchical feature representation, that is, a multilevel representation from pixels to high-level semantic features, can be learned automatically from data using a hierarchical multi-stage structure. A CNN may be thought of as a series of convolution layers interspersed with nonlinearity and pooling layers that translate activations received at one end into activations at the other end, trained using a loss or error function. The loss function calculates the error between a single prediction and its associated actual value, so that we are able to evaluate how well a specific classification function classifies the data points in our dataset. A CNN architecture for image classification is generally composed of four layers: input, convolution, pooling (sub-sampling), and fully connected, as demonstrated in Fig. 1. The input layer stores the pixel values of the input images. The image is divided into receptive fields that are fed into a convolutional layer. Receptive fields are areas of the visual field where a single neuron is activated in response to a stimulus. The features of the input image are extracted using a convolution layer. The convolution layers are based on the term "convolution," which is a mathematical operation performed on two functions (f * g) to produce a third function. These layers are responsible for identifying low- and high-level complex features in each input. A group of parameters, called hyperparameters, is associated with a convolution layer: filter size, stride, and zero-padding 51 . The hyperparameters are constants whose values must be set before the models can be built. For instance, stride is the number of units by which the filter slides over an input image. A convolutional layer takes an image as an input matrix of pixels and applies learnable filters (or kernels) of a fixed size to each block of the input matrix (see Fig. 2 as an example). A kernel convolves images using a specific set of weights by multiplying its elements with the corresponding elements of the receptive field. These multiplications are summed, and the process is repeated for every location in the input volume. Figure 3 depicts what occurs when a filter is applied to an input image with a stride. Convolution produces "feature maps," which are collections of several features. Subsequently, a pooling layer (e.g., max pooling) is used to reduce the dimensionality of each feature map while retaining the most critical data. Up to and including the convolutional layer, the computations performed by the system are linear. An appropriate activation function, such as Sigmoid, the Rectified Linear Unit (ReLU), or variants of ReLU, is used to introduce nonlinearity into the system. For instance, the purpose of ReLU is to replace negative activations with 0. The architecture of each CNN has a fully connected (FC) layer at the end. Inside this layer, each neuron is connected to all neurons of the previous layer. It is used as the CNN classifier.

U-Nets
U-Net is a convolutional neural network architecture developed for biomedical image segmentation tasks 52 . It was originally proposed by Ronneberger et al. 53 for biomedical image processing. The name "U-Net" comes from the shape of the network, which resembles the letter "U". The U-Net architecture is based on the CNN encoder and decoder approaches. The encoder is responsible for encoding context using the typical CNN architecture of alternating convolution and pooling operations. It is composed of five blocks, each of which is composed of two convolutional layers and uses a ReLU activation function to provide network nonlinearity. These blocks produce feature maps through a convolution process. One max-pooling layer then reduces the size of these feature maps while the number of feature channels per block increases, allowing the architecture to effectively learn complex structures. The decoder is responsible for decoding the information and enabling exact localization, employing transposed convolution (deconvolution) procedures to create the segmentation mask of the picture. It also consists of five blocks, each of which is made up of two convolutional layers, which also use ReLU as the activation function; one up-sampling layer, which reverts the max-pooling operation to restore the feature maps to their original size in the network; and a skip connection, which combines the up-sampled features with high-resolution encoded features from the encoder part. According to Zhou et al. 54 , skip connections can aid gradient propagation in deep networks by reducing the likelihood of gradient dispersion, which enhances segmentation performance. Figure 4 illustrates the overall architecture of the U-Net model.
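The convolution, activation, and pooling operations described in this section can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`conv2d`, `relu`, `max_pool2d`) and the edge filter are hypothetical choices made for illustration, and a real CNN would learn its kernel weights during training.

```python
import numpy as np


def conv2d(image, kernel, stride=1, padding=0):
    """Slide a kernel over a 2D image (cross-correlation, as in CNN layers)."""
    if padding > 0:
        image = np.pad(image, padding, mode="constant")  # zero-padding
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Receptive field: the image block currently under the kernel.
            field = image[i * stride:i * stride + kh,
                          j * stride:j * stride + kw]
            out[i, j] = np.sum(field * kernel)  # multiply, then sum
    return out


def relu(x):
    """Replace negative activations with 0."""
    return np.maximum(x, 0)


def max_pool2d(x, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h * size, :w * size]
    return x.reshape(h, size, w, size).max(axis=(1, 3))


# Toy 6 x 6 "image" whose intensity grows from left to right.
image = np.arange(36, dtype=float).reshape(6, 6)
# Hypothetical fixed filter that responds to horizontal intensity changes.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = max_pool2d(relu(conv2d(image, kernel, stride=1)))
print(feature_map.shape)  # (2, 2)
```

With a 3 × 3 kernel and stride 1, the 6 × 6 input yields a 4 × 4 feature map, which the 2 × 2 max pooling reduces to 2 × 2; increasing the stride to 2 shrinks the convolution output directly.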

Prior research
In recent years, several illnesses have been monitored using medical image-processing techniques 55 . The development of DL and artificial intelligence technologies, which have become popular methods for the identification and segmentation of various medical issues, has accelerated this field's advancement 23,33,52 . In the last few months, several approaches have been proposed for the detection and segmentation of COVID-19 infection in the lungs using chest X-rays and CT scans. The proposed approaches can be divided into three groups: (1) techniques for classification, (2) techniques for segmenting diseased regions, and (3) diagnostic systems that can handle both tasks. The second group is the main topic of this study. For instance, under classification techniques, the authors of 56 [...] CT scan images. Wang et al. 57 developed COVID-Net, a densely connected deep convolutional neural network architecture that achieved a test accuracy of 93.3% in detecting COVID-19 cases from chest X-ray images. In addition, Ahuja et al. 58 presented a three-phase detection model using deep transfer learning to increase detection accuracy. A hybrid model using transfer learning on CT scans to detect COVID-19 has also been discussed in 39 . Other researchers used infection segmentation techniques [59][60][61][62][63][64][65] . The authors of 59 , for instance, developed a novel deep network named "Inf-Net" to automatically identify infected areas from chest CT slices. That approach is built around a parallel partial decoder that combines high-level features to produce a global map. They employed a modest dataset of 100 labeled CT images and obtained a Dice score of 0.682. Another study was conducted by 60 , who suggested a method for detecting COVID-19 characteristics in chest X-ray images by combining CNN and VGG19. They employed a dataset of eighty-seven chest X-ray images linked to twenty-five cases and reported 96.93% accuracy, 57.14% sensitivity, and 99.2% specificity. Using CT imaging, the authors of 61 created a diagnostic system based on deep learning methods to identify and quantify COVID-19 infection and to screen for pneumonia. In addition, the authors of 62

Datasets and methodology
This section presents in-depth information on our proposed methodology and the datasets that are used to train and evaluate our proposed approach.

Datasets
The size of the data has a significant impact on how well any deep learning model performs.

Methodology
Our proposed approach is presented in detail in this section. The methodology includes the following stages: the first stage is to acquire patient data in CT scan imaging format; the second is data pre-processing; and the third is training and classification using the pre-processed data.

Data pre-processing
Pre-processing is the initial procedure that takes place before the dataset is fed into the deep learning model. The CT scan images must be resized because the original images are large. To speed up training and testing, the data has been scaled to 128 × 128 instead of using large images as usual. Medical images must be handled with care because they are noisy and must be cleaned up before being fed to the model; otherwise, the model would pick up on the noise 66,67 . An effective pre-processing phase is needed to improve the model's performance by removing noise and artifacts that might hinder the model's ability to learn and generalize. Medical images frequently suffer from two issues that affect contrast, namely noise and intensity inhomogeneity, so we used the Contrast Limited Adaptive Histogram Equalization (CLAHE) method, which was proposed in 68 , to improve the contrast of the obtained images. Figure 6 displays a sample lung CT scan image for each RGB image channel before and after applying CLAHE.
To enlarge the dataset for deep learning, a data augmentation strategy would normally be employed to acquire more images, given that Kaggle's CT scan image collection contains fewer images than necessary. As previously mentioned, the majority of approaches have attempted to address the data shortage using a variety of data augmentation techniques; however, there is no conclusive proof of the efficacy of data augmentation in real-life and live images for the detection of COVID-19. In contrast, we enlarged the dataset to 3138 × 3 without the need for any augmentation methods by employing various image pre-processing techniques to split each image into three RGB image channels, which are fed into three separate U-Nets, one for each channel. The use of three RGB image channels provides further information to improve infection detection. The convolution blocks of the U-Net contain two convolution layers, and each layer has a different number of filters, which enhances U-Net segmentation and reduces the number of parameters. The image processing techniques used include thresholding, resizing, inverting image colors, histogram equalization, and the heat map as a data visualization tool. A density heat map was applied to the lung images to visualize the lung part in RGB color form. This provides more details that facilitate the detection of infection in the lungs. Our idea was motivated by the fact that mapping scalar values of medical images to colors is a helpful process that may be used to highlight features like bone density, tissue density, or blood flow, as discussed in 27,69 . Existing COVID-19 studies generally achieve high performance in lung segmentation and detection; however, COVID-19 infection segmentation and detection need further enhancement because they achieve low accuracy, as given in [48][49][50][51][52] . In the case of COVID-19, infected individuals often exhibit fever, which can be detected using thermal imaging and represented as a heat map 9,48 . A heat map is a graphical representation used to identify the regions of interest in the data by observing how colors change, and it is considered an effective data visualization tool. It works with all image types, not just thermal images, such as those in 70,71 . To detect COVID-19 infection in the lungs, some researchers have used lung instruction. Others used grey lung images and depended on the power of the segmentation method, as in 35 . The authors of 36 highlighted the challenges faced in developing deep learning models for COVID-19 detection and diagnosis, such as the lack of large-scale datasets and the need for interpretability of the models.
The following stages give an overview of the pre-processing phase of our proposed method:
1. Eliminate artifacts and noise.
2. Remove all the images that do not depict lungs from the datasets (see Fig. 7).
3. Resize the dataset's images to match the dimensions of the input layer of the network (128 × 128).
4. Apply a threshold to the images. The threshold follows a simple rule: if the pixel intensity is less than or equal to 50, set it to zero; otherwise, leave it unchanged. This process simplifies the lung pictures by removing extraneous details. Figure 8a,b shows samples of images before and after thresholding.
5. After thresholding, create a density heat map 46 by applying an RGB colour scheme to the area of interest (i.e., the lung). The original image is then overlaid with the heat map. Figure 8c shows the effect of applying the heat map on lung images. This step was quite helpful in highlighting the infection in the lung area.
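Steps 3 and 4 above (resizing and thresholding) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names are hypothetical, nearest-neighbour sampling is an assumption (the interpolation method is not stated in the text), and the random array stands in for a real CT slice.

```python
import numpy as np


def resize_nearest(image, size=(128, 128)):
    """Resize a 2D image by nearest-neighbour sampling (assumed method)."""
    rows = np.arange(size[0]) * image.shape[0] // size[0]
    cols = np.arange(size[1]) * image.shape[1] // size[1]
    return image[rows[:, None], cols]


def threshold_image(image, cutoff=50):
    """Set pixels with intensity <= cutoff to zero; keep the rest unchanged."""
    out = image.copy()
    out[out <= cutoff] = 0
    return out


# Placeholder 512 x 512 8-bit slice; a real pipeline would load a CT image.
rng = np.random.default_rng(0)
ct_slice = rng.integers(0, 256, size=(512, 512), dtype=np.uint8)

processed = threshold_image(resize_nearest(ct_slice))
print(processed.shape)  # (128, 128)
```

After this step every pixel is either zero or brighter than 50, which matches the thresholding rule stated in step 4.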
Free software is available for constructing heat maps, including R (a language for statistical computing and graphics), OpenLayers, Gnuplot, and Python, among others. In this study, we implemented the heat map using Python. Implementation and setup are discussed later in the "Result analysis and discussion" section. The thresholded images and the lung masks from the datasets, as shown in Fig. 8d, were used to create the heat map image (colored image), which is presented in Fig. 8c. The process is as follows:
1. A Gaussian blur was first applied to the mask image using OpenCV, where x is the input image and σ is the standard deviation of the Gaussian distribution of the image pixels. Here, blur is the output image of the Gaussian blur, and lung_mask is the lung mask from the dataset.
2. An OpenCV function was used to select the color space applied to the heat map, where heatmap_imag is the output of choosing the color space.
3. A further OpenCV function was used to create the heat map's colored image.
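The three steps above (blur the mask, map it to colors, overlay it on the image) can be sketched as follows. The exact OpenCV calls and parameters used in the study are not given in this text, so the sketch below is a NumPy-only stand-in under stated assumptions: the value of σ, the blue-to-red color ramp, the blending weight, and all function names are hypothetical. In OpenCV, `cv2.GaussianBlur`, `cv2.applyColorMap`, and `cv2.addWeighted` cover the same three operations.

```python
import numpy as np


def gaussian_kernel1d(sigma):
    """Normalized 1D Gaussian kernel truncated at 3 standard deviations."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()


def gaussian_blur(image, sigma=2.0):
    """Separable Gaussian blur: filter the rows, then the columns."""
    k = gaussian_kernel1d(sigma)
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, k, mode="same"), 1, image.astype(float))
    return np.apply_along_axis(
        lambda c: np.convolve(c, k, mode="same"), 0, rows)


def to_heatmap(density):
    """Map values in [0, 1] to RGB: blue (low), green (middle), red (high)."""
    d = np.clip(density, 0.0, 1.0)
    r = np.clip(2 * d - 1, 0, 1)
    g = 1 - np.abs(2 * d - 1)
    b = np.clip(1 - 2 * d, 0, 1)
    return np.round(np.stack([r, g, b], axis=-1) * 255).astype(np.uint8)


def overlay(gray, heatmap, alpha=0.5):
    """Blend the heat map with the grey image (alpha is an assumed weight)."""
    gray_rgb = np.repeat(gray[..., None], 3, axis=-1).astype(float)
    return np.round(alpha * heatmap + (1 - alpha) * gray_rgb).astype(np.uint8)


# Placeholder inputs: a rectangular "lung" mask and a uniform grey slice.
lung_mask = np.zeros((128, 128), dtype=np.uint8)
lung_mask[32:96, 40:88] = 255
ct_slice = np.full((128, 128), 120, dtype=np.uint8)

blur = gaussian_blur(lung_mask, sigma=2.0)   # step 1
heatmap_img = to_heatmap(blur / 255.0)       # step 2
colored = overlay(ct_slice, heatmap_img)     # step 3
print(colored.shape)  # (128, 128, 3)
```

Blurring the mask before coloring gives a smooth color gradient at the lung boundary instead of a hard edge, which is one plausible reason for applying the Gaussian blur first.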

Proposed algorithm
After the pre-processing phase, including the heat map, the images are divided into three channels, namely, the R-channel, G-channel, and B-channel. Then, a U-Net architecture based on CNN encoder and decoder approaches is applied to each of the RGB channels of the colored image for quick and accurate image segmentation to obtain the lung segmentation model. After visualizing the channels, we discovered that the red channel included the majority of the crucial data on infections and lung disorders. In the images, this information is surrounded by white areas. The white portion of the lung can have an impact on image segmentation, while the green and blue channels can also include information about infections. To clarify the information in the image, the image channels are processed using histogram equalization and the image inverse. Samples of the employed RGB channels are shown in Fig. 9. The example images given in Fig. 10 were created using the image inverse and histogram equalization. Each channel is passed into the U-Net architecture; before classification, the segmented outputs are merged and fed into a 1 × 1 convolution layer.
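The channel separation and per-channel processing described above can be sketched with NumPy. The order of operations (inverse first, then histogram equalization) and the function names are assumptions made for illustration, and the random array stands in for a real heat-map-colored lung image.

```python
import numpy as np


def invert(channel):
    """Image inverse of an 8-bit channel."""
    return 255 - channel


def equalize_hist(channel):
    """Histogram equalization of an 8-bit single-channel image."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = cdf[-1] - cdf_min
    if denom == 0:  # constant image: nothing to equalize
        return channel.copy()
    lut = np.round((cdf - cdf_min) / denom * 255).clip(0, 255).astype(np.uint8)
    return lut[channel]  # look up the new intensity of every pixel


def prepare_channels(colored):
    """Split an RGB image into R, G, B and process each one independently."""
    return [equalize_hist(invert(colored[..., c])) for c in range(3)]


# Placeholder colored image of shape (128, 128, 3).
rng = np.random.default_rng(0)
colored = rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8)

red, green, blue = prepare_channels(colored)
print(red.shape, green.shape, blue.shape)
```

Each returned array is a single-channel 128 × 128 image, so one colored input yields three network inputs, which is how the 3138-image dataset becomes 3138 × 3 inputs without augmentation.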

Network architecture
The proposed approach contains three U-Nets. Each U-Net contains nine convolutional neural network blocks: four blocks in the expanding path and five blocks in the contracting path. Each convolution block consists of two convolution layers with a filter size of 3 × 3. The first convolution layer in the block has 50 filters, and the second has 25. To avoid overfitting, we used dropout regularization after each convolution block in the U-Net. During training, dropout avoids overfitting by preventing units from becoming codependent on one another. Dropout is often used to improve the performance (accuracy) of deep learning tasks on unseen datasets 72 . Our approach aims to avoid overfitting by minimizing both the batch size and feature count. This is accomplished by applying a max-pooling layer with a size of 2 × 2 and a dropout layer from the Python Keras package with a rate of 0.01 to each convolution block in the U-Net. The ReLU activation function is then used. During training, the Keras dropout function randomly sets feature values to 0 at the given rate and scales the remaining values by a factor of 1/(1 − 0.01), while leaving zero-valued features unchanged. The batch size was set to 32. In the classification layer, the sigmoid activation function is applied, as shown below:
$$y = \frac{1}{1 + e^{-x}}$$
where x is the input to the activation function and y is its output. The stride in the U-Net was equal to 2. Figure 11 shows the complete structure of the network architecture.
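The dropout scaling, the sigmoid activation, and the merging of the three U-Net outputs through a 1 × 1 convolution can be illustrated with a short NumPy sketch. It does not reproduce the trained network: the function names are hypothetical, the random arrays stand in for the three per-channel segmentation outputs, and the weights and bias of the 1 × 1 convolution are placeholder values that a real model would learn.

```python
import numpy as np


def sigmoid(x):
    """Sigmoid activation: y = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def inverted_dropout(features, rate=0.01, rng=None):
    """Zero units with probability `rate`; scale the rest by 1 / (1 - rate)."""
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(features.shape) >= rate
    return features * keep / (1.0 - rate)


def merge_1x1(seg_maps, weights, bias=0.0):
    """1 x 1 convolution over stacked maps of shape (H, W, 3), then sigmoid."""
    return sigmoid(seg_maps @ weights + bias)


rng = np.random.default_rng(0)
# Placeholder outputs of the red, green, and blue U-Nets, stacked on the last axis.
seg_maps = rng.normal(size=(128, 128, 3))
weights = np.array([1.0, 1.0, 1.0])  # hypothetical values; learned in practice

probability = merge_1x1(seg_maps, weights)
mask = probability > 0.5  # assumed decision threshold
print(probability.shape, mask.dtype)
```

A 1 × 1 convolution is a per-pixel weighted sum across channels, so it lets the network learn how much each color channel's segmentation should contribute to the final prediction.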

Implementation and setups
Our system is implemented in the Python programming language with libraries such as TensorFlow, Keras, and OpenCV. We ran the network using Keras and TensorFlow on a Dell laptop with a 10th-generation Intel(R) Core(TM) i5-1035G1 CPU, 8 GB of RAM, and 64-bit Windows 10 on an x64-based processor. Experimental results were obtained for 20 CT scan datasets. The total number of images and masks used was 7040. After excluding closed lungs, 5946 images were obtained, of which 2973 were lung images and 2973 were masks. There were 1783 images in the training set and 1190 images in the test set. The masks were divided in the same way.

Training and evaluation metrics
Image classification comprises two stages: training and testing. The dataset was divided randomly into two groups: 70% of the images were used for training and 30% for testing. The research community uses a variety of performance metrics to evaluate the efficiency of classification and segmentation algorithms. The F1-score (also known as the Dice coefficient), recall, accuracy, specificity, and precision are some of these metrics. Four measurements are required to calculate these metrics and validate the proposed model: true positive (TP), true negative (TN), false positive (FP), and false negative (FN).
• True positive (TP): the number of lung/infection pixels correctly identified in the segmentation tasks and the number of correctly predicted infected CTs in the classification task. Simply, it represents positive cases that are correctly classified as positive.
• True negative (TN): the number of non-lung/infection pixels correctly identified as non-lung/infection in the segmentation tasks and the number of correctly predicted healthy CTs in the classification task. Simply, it represents negative cases that are correctly classified as negative.
• False positive (FP): the number of non-lung/infection pixels wrongly classified as lung/infection pixels in the segmentation tasks and the number of mistakenly predicted infected CTs in the classification task. Simply, it represents negative cases that are wrongly classified as positive.
• False negative (FN): the number of lung/infection pixels wrongly classified as non-lung/infection pixels in the segmentation tasks and the number of mistakenly predicted healthy CTs in the classification task. Simply, it represents positive cases that are wrongly classified as negative.
Based on the above four measurements, the performance metrics, which include accuracy, precision, sensitivity, specificity, and F1-score, were calculated as follows:
• Accuracy: the ratio of correct predictions to all predictions over the whole dataset:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
• Precision: the ratio of correctly predicted positive observations to all observations predicted as positive:
$$\text{Precision} = \frac{TP}{TP + FP}$$
• Recall (also known as sensitivity): the proportion of actual positive observations that the model correctly predicts as positive:
$$\text{Recall} = \frac{TP}{TP + FN}$$
• Specificity: the proportion of actual negative observations that the model correctly predicts as negative:
$$\text{Specificity} = \frac{TN}{TN + FP}$$
• F1-score (Dice coefficient): the degree of overlap between the predictions and the actual data. Its value ranges from 0 to 1, and the higher the value, the more accurate the segmentation:
$$\text{F1} = \frac{2\,TP}{2\,TP + FP + FN}$$
The evaluation metrics findings for our proposed approach are shown in Table 1, and they demonstrate good performance. This good performance is confirmed by the learning curve illustrated in Fig. 12, which is based on the training and testing losses across 50 epochs. From this curve, we found that the accuracy [...] Figure 13 shows the evaluation metrics results of the proposed approach in terms of accuracy, precision, sensitivity, and dice coefficient. Experimental results in the segmentation and detection process are demonstrated in Fig. 14.
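As a worked illustration of these metrics, the following sketch computes them from a predicted and a ground-truth binary mask. The function names and the tiny 2 × 4 masks are hypothetical and are not taken from the study's data; the formulas are the standard definitions in terms of TP, TN, FP, and FN.

```python
import numpy as np


def confusion_counts(pred, truth):
    """Count TP, TN, FP, FN between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))
    tn = int(np.sum(~pred & ~truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    return tp, tn, fp, fn


def safe_div(num, den):
    """Return 0.0 instead of dividing by zero (e.g. an empty mask)."""
    return num / den if den else 0.0


def segmentation_metrics(pred, truth):
    tp, tn, fp, fn = confusion_counts(pred, truth)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "dice": safe_div(2 * tp, 2 * tp + fp + fn),
    }


# Toy masks: 1 marks an infection pixel, 0 marks background.
truth = [[1, 1, 0, 0],
         [1, 0, 0, 0]]
pred = [[1, 0, 0, 0],
        [1, 1, 0, 0]]

metrics = segmentation_metrics(pred, truth)
print(metrics)
```

For these masks TP = 2, TN = 4, FP = 1, and FN = 1, giving an accuracy of 0.75, precision and recall of 2/3, specificity of 0.8, and a Dice coefficient of 2/3. The example also shows why accuracy alone can look high on images dominated by background pixels.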

Ablation study and comparative analysis
In deep learning, we frequently use models that are composed of a variety of components, each of which has a significant impact on the overall performance, so it is crucial to have a method for evaluating the contribution of these components to the overall model. This is where the idea of an ablation study comes in: specific network components are eliminated in order to better understand the behavior of the network. In other words, an ablation study, which is used to quantify causality, is a simple method for investigating the effect of those components.
We conducted an ablation study to demonstrate the model's robustness by removing some of the proposed approach's components (each such configuration is referred to as a "mode") while leaving the others unchanged. The significance of each component used in the proposed approach (red channel, green channel, blue channel, histogram equalization, among others) is presented in Tables 2 and 3, using the evaluation metrics (sensitivity, precision, dice coefficient, and accuracy) to highlight their impact on the segmentation performance for COVID-19 detection.
Table 4 compares our proposed framework with existing frameworks that use the same dataset, in terms of training time, error rate, and dice coefficient. We found that our method, which operates on 128 × 128 images, outperforms existing methods that use larger 512 × 512 images. Although we use the same U-Net and dataset as other frameworks, our proposed approach has a distinct structure and better accuracy. For instance, the authors of [63] achieved a dice coefficient of 0.76 and used data augmentation, increasing the data by 15% through random intensity, to enhance the results of the 3D U-Net. In contrast, our proposed approach achieved a dice coefficient of 0.85 without the need for any augmentation strategies. The authors of [65] used two cascaded residual attention inception U-Net (RAIU-Net) models and achieved a dice coefficient of 0.81, while study [64] used edge-enhancing diffusion filtering (EED) to improve the contrast and intensity homogeneity of the infection areas and achieved a dice coefficient of 0.78. Finally, the authors of [62] proposed a U-Net for COVID-19 detection and trained it using 80% of the data, attaining a dice coefficient of 0.67. It is important to point out that the studies [63][64][65] split their data into 70% for training and 30% for testing, as we did in our proposed methodology.

Limitations
Some potential limitations need to be highlighted. One is the limited availability of datasets for training models, owing to the novelty of COVID-19. The proposed technique was implemented using [23]. Further training with larger and more varied datasets would be needed to compare outcomes and to make the method more effective in clinical practice; on a larger dataset, the algorithm's accuracy could differ. Another limitation of our research is that we did not consider the different levels of infection severity in the lung CT scan images of COVID-19. Moreover, the accuracy of COVID-19 identification may also be affected by a lack of diverse data, by false positive and false negative measurements, and by the choice of hyper-parameter configuration (e.g., number of epochs, batch size, activation function, optimizer), among others. Addressing these aspects could help achieve a more reliable model for practical applications.

Table 2. Effects of the proposed components on the model's performance (without heat map).

| Mode | Sensitivity | Precision | Dice coefficient | Accuracy (%) |
|---|---|---|---|---|
| Removal of the red channel and its U-Net from the system | 0.71 | 0.72 | 0.72 | 99.56 |
| Removal of the green channel and its U-Net from the system | 0.68 | 0.70 | 0.69 | 99.52 |
| Removal of the blue channel and its U-Net from the system | 0.71 | 0.74 | 0.72 | 99.57 |
| Our proposed approach without removing any channel | 0.66 | 0.76 | 0.71 | 99.57 |

Table 3. Effects of the proposed components on the model's performance (with heat map).

| Mode | Sensitivity | Precision | Dice coefficient | Accuracy (%) |
|---|---|---|---|---|
| Removal of the red channel and its U-Net from the system | 0.70 | 0.80 | 0.74 | 99.52 |
| Removal of the green channel and its U-Net from the system | 0.63 | 0.64 | 0.63 | 99.27 |
| Removal of the blue channel and its U-Net from the system | 0.72 | 0.76 | 0.74 | 99.49 |
| Our proposed approach without removing any channel | 0.83 | 0.87 | 0.85 | 99.71 |
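
One way to read the ablation results is to compute how much the dice coefficient drops when each channel is removed. The short Python sketch below does this using only the dice values reported in Tables 2 and 3; the dictionary keys and the helper name `dice_drop` are hypothetical labels chosen for this example.

```python
# Dice coefficients copied from Table 3 (with heat map).
TABLE3_DICE = {
    "full": 0.85,      # proposed approach, no channel removed
    "no_red": 0.74,
    "no_green": 0.63,
    "no_blue": 0.74,
}

# Dice of the full approach without the heat map, copied from Table 2.
FULL_DICE_WITHOUT_HEAT_MAP = 0.71


def dice_drop(table, baseline="full"):
    """Return the dice loss of each ablated mode relative to the baseline."""
    base = table[baseline]
    return {
        mode: round(base - dice, 2)
        for mode, dice in table.items()
        if mode != baseline
    }


drops = dice_drop(TABLE3_DICE)
# The mode with the largest drop marks the most influential channel.
most_influential = max(drops, key=drops.get)
# Gain attributable to the heat map when all three channels are kept.
heat_map_gain = round(TABLE3_DICE["full"] - FULL_DICE_WITHOUT_HEAT_MAP, 2)
```

With the reported values, removing the green channel causes the largest drop in dice coefficient (0.22), removing the red or blue channel costs 0.11 each, and the heat map adds 0.14 to the dice coefficient of the full three-channel approach.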

Conclusion
The COVID-19 virus, a worldwide threat, affects millions of people's lives. The fight against this pandemic depends on the earliest possible detection of COVID-19 symptoms. To aid in early illness detection and disease prevention, deep learning algorithms have been trained to identify and classify lung images. In this paper, we propose a new approach that combines image processing, data visualization, and deep learning (DL) techniques, particularly the U-Net architecture, to accurately detect COVID-19 infections using the existing COVID-19 CT scans publicly available in the Kaggle dataset repository. Most earlier research that used the same dataset for training and testing attempted to address the data shortage with various data augmentation techniques, since Kaggle's collection of CT scan images has fewer images than necessary. However, data augmentation on real-life and live images has not been demonstrated to be particularly helpful in detecting COVID-19. In contrast, our approach uses a variety of image pre-processing techniques to divide the image into red, green, and blue (RGB) channels, which are then fed into three separate U-Nets, one for each RGB channel. This method directly expands the original dataset without the need for any augmentation strategies. The three RGB channels increase the available feature information compared with the grey image, which comprises only one channel, and this division consequently improves infection detection. The image processing techniques used include thresholding, resizing, image inversion, histogram equalization, and heat map coloring. A heat map is an effective data visualization method that can be used to render the lung image in RGB color space and to identify the region of interest by observing how colors change. We evaluated the performance of our approach using accuracy, precision, sensitivity, specificity, and dice coefficient metrics, and we found that our method showed good performance on 128 × 128 images. However, other algorithms were able to improve their segmentation by using larger 512 × 512 images.
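
The pre-processing chain summarized above (thresholding, resizing to 128 × 128, heat map coloring, channel separation, inversion, and histogram equalization) can be sketched in a few lines of NumPy. This is only a rough illustration under stated assumptions, not the original implementation: the threshold value, the nearest-neighbour resize, the simple blue-to-red colormap, the use of global rather than adaptive histogram equalization, the order of inversion and equalization, and all function names are choices made for this example.

```python
import numpy as np


def threshold(img, t=100):
    """Zero out pixels below t (the value 100 is an assumed threshold)."""
    return np.where(img >= t, img, 0).astype(np.uint8)


def resize_nearest(img, size=128):
    """Nearest-neighbour resize of a 2-D image to size x size."""
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[rows[:, None], cols]


def heat_map(img):
    """Color a grey image with a simple blue-to-red ramp (assumed colormap)."""
    x = img.astype(np.float32) / 255.0
    r = np.clip(1.5 - np.abs(4 * x - 3), 0, 1)
    g = np.clip(1.5 - np.abs(4 * x - 2), 0, 1)
    b = np.clip(1.5 - np.abs(4 * x - 1), 0, 1)
    return (np.stack([r, g, b], axis=-1) * 255).astype(np.uint8)


def equalize(channel):
    """Global histogram equalization of one 8-bit channel."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = max(cdf[-1] - cdf_min, 1)  # avoid division by zero
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255)
    return lut.astype(np.uint8)[channel]


def preprocess(ct_slice):
    """Return the three processed channels, one per U-Net input."""
    colored = heat_map(resize_nearest(threshold(ct_slice)))
    # Invert each channel, then equalize it (assumed order).
    return [equalize(255 - colored[..., c]) for c in range(3)]


# Usage on a synthetic 512 x 512 slice standing in for a CT image.
rng = np.random.default_rng(0)
ct = rng.integers(0, 256, size=(512, 512), dtype=np.uint8)
channels = preprocess(ct)
```

Each of the three returned arrays is a 128 × 128 8-bit image that would serve as the input to one of the three U-Nets. In practice, a library such as OpenCV would typically be used for the resize, colormap, and equalization steps.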

Figure 3. An example of a filter applied to a two-dimensional input image to produce a feature map.

Figure 5. Four sample images from the dataset used. First row (a): original CT lung images. Second row (b): sample of lung masks. Third row (c): sample of infection and lung mask together. Fourth row (d): infection masks.

Figure 6. An example of the RGB image channel before and after applying CLAHE.

Figure 7. (a) Samples of images that were excluded; (b) samples of images that were included.

Figure 8. (a) A sample of CT scan images before thresholding, (b) a sample of those same images after thresholding, (c) a sample of images following the application of a heat map, and (d) an illustration of heat map image (colored image) generation.

Figure 10. (a) Original image, (b) red channel after histogram equalization and image inverse, (c) blue channel after histogram equalization and image inverse, and (d) green channel after histogram equalization and image inverse.

Figure 11. The detailed design of the network architecture.
$$\text{F1-score} = \frac{TP}{TP + \frac{1}{2}(FP + FN)}$$

From the learning curve, the accuracy improved during the 50th epoch, and the loss from epoch 9 did not significantly change.
developed three standards for lung and infection segmentation based on 70 annotated COVID-19 cases. Although numerous authors have segmented lung CT images, ground-glass opacity in COVID-19-infected regions caused by inflammation has not been segmented effectively in the current literature. A small number of studies have focused on the segmentation and classification of the COVID-19 area using image processing and DL techniques, whereas the majority of recent research has focused only on the detection of COVID-19 using DL approaches. The current study tackles these problems by pre-processing the input images with thresholds to exclude infections and coloring the images with a heat map tool; it also uses U-Net to identify and quantify COVID-19 infections in clinical CT scan images.

Table 1. Evaluation criteria for our proposed algorithm.

Table 4. A comparative study of our proposed framework and other existing ones.