An augmentation aided concise CNN based architecture for COVID-19 diagnosis in real time

Kaur, Balraj Preet; Singh, Harpreet; Hans, Rahul; Sharma, Sanjeev Kumar; Kaushal, Chetna; Hassan, Md. Mehedi; Shah, Mohd Asif

doi:10.1038/s41598-024-51317-y

Article
Open access
Published: 11 January 2024

Balraj Preet Kaur¹,
Harpreet Singh²,
Rahul Hans¹,
Sanjeev Kumar Sharma³,
Chetna Kaushal⁴,
Md. Mehedi Hassan⁵ &
…
Mohd Asif Shah^6,7,8

Scientific Reports volume 14, Article number: 1136 (2024)

Abstract

Over 6.5 million people around the world have lost their lives due to the highly contagious COVID-19 virus. The virus increases the danger of fatal health effects by severely damaging the lungs. The only way to reduce mortality and contain the spread of this disease is to detect it promptly. Recently, deep learning has become one of the most prominent approaches to computer-aided diagnosis (CAD), helping surgeons make more informed decisions. However, deep learning models are computation hungry, and devices with TPUs and GPUs are needed to run them. The current focus of machine learning research is on developing models that can be deployed on mobile and edge devices. To this end, this research aims to develop a concise convolutional neural network-based computer-aided diagnostic system for detecting the COVID-19 virus in X-ray images, which may be deployed on devices with limited processing resources, such as mobile phones and tablets. The proposed architecture uses image enhancement in the first phase and data augmentation in the second phase for image pre-processing; in the third phase, hyperparameters are optimized to obtain the settings that provide the best results. The experimental analysis has provided empirical evidence of the impact of image enhancement, data augmentation, and hyperparameter tuning on the proposed convolutional neural network model, which increased accuracy from 94 to 98%. Results from the evaluation show that the suggested method gives an accuracy of 98%, which is better than popular transfer learning models like Xception, Resnet50, and Inception.

Introduction

Coronavirus¹ was identified in Wuhan, China, in 2019, and it has affected more than 760 million people around the globe². The virus causes respiratory diseases such as Middle East Respiratory Syndrome, Severe Acute Respiratory Syndrome (SARS)³, and other deadly complications. The most common symptoms are cough, sore throat, headache, fever and fatigue (https://covid19.who.int/). The virus is passed from person to person by respiratory droplets. During past COVID-19 waves, the sudden surge in cases made it difficult for laboratories to confirm positive or negative cases using RT-PCR, which is time-consuming, costly, and has a high false-negative rate⁴. Therefore, the development of real-time diagnostic tools that can be executed on mobile and edge devices is the need of the hour⁵. Since most diagnostic centers already have X-ray machines, and because acquiring an X-ray takes less time than getting an RT-PCR test done, using chest X-rays of patients⁶ satisfies the urgent need for a speedy diagnostic approach.

Deep learning^7,8 is one of the most promising techniques for accurately diagnosing diseases from images and is widely used in the medical field to detect severe diseases at early stages⁹. A deep network is made up of an input layer, hidden layers, activation functions, and an output layer. The mathematical operations in each step, with feed-forward and backward functions, help in finding better results¹⁰. An activation function activates or deactivates neurons and defines the output of a node. Convolutional neural networks (CNNs)^11,12 are deep learning networks made up of neurons that learn from experience and self-optimize; they are used primarily by researchers working on disease diagnosis from images. The popularity of CNNs is attributed mainly to their ability to automatically learn features from domain-specific images¹³. Furthermore, transfer learning models¹⁴ store knowledge gained on one problem and apply it to another. However, conventional CNN models such as Resnet50, AlexNet, Inception, and Xception cannot be run on devices with low computing power, such as tablets, embedded chips, and mobile phones, and hence cannot be used in real-time applications^15,16. These conventional models are also complex and need a lot of training time. To overcome these shortcomings, lightweight¹⁷ and concise CNN models are being developed that have fewer parameters than conventional CNNs, so that they can be executed on devices with low computing power and limited memory. Figure 1 shows the features of concise CNNs.

In the case of CNNs, images also need to be preprocessed to obtain better classification. For that purpose, image enhancement techniques¹⁸ are used, in which masks and filters improve the quality of the image. In addition, data augmentation techniques¹⁹ enlarge the training data to improve the success rate of the model. The goal of this study is to present a new, simple CNN-based model for diagnosing COVID-19 from X-ray images; the proposed model has been compared to existing transfer learning methods using a number of different criteria. The primary contributions of this research work are as follows: (1) A novel set of layers, together with image enhancement and hyperparameter tuning, has been suggested for the classification of COVID-19, normal, and pneumonia cases. (2) Data augmentation has been carried out to prevent the models from overfitting. (3) The proposed framework may be used as one of the effective methods for classifying data in the medical industry. Furthermore, it helps radiologists diagnose and treat ailments earlier.

The paper is organized as follows. The second section reviews the literature, and the third section explains our suggested model. Materials and methods are discussed in the fourth section. The fifth section describes the experimental outcomes. Conclusions and future work are discussed in the sixth section.

Literature survey

Recently, much research has been carried out on diagnosing diseases from images using CNNs. This section summarizes some of the existing work on disease diagnosis. A comparison of the models' performance is shown in Table 1.

Table 1 State of the art techniques.

Litjens et al.²⁹ discussed application aspects of deep learning. Different deep learning techniques extracted spatial features from sophisticated image data, i.e., CT, X-ray, color fundus, and ultrasound images, and the implemented models can help hospitals detect severe diseases such as diabetic retinopathy, skin lesions, bone fractures, and breast cancer at their early stages. Kermany et al.³⁰ used an optical coherence tomography image dataset to detect viral pneumonia, macular degeneration, and diabetic retinopathy. Cao et al.³ introduced deep learning and image analysis performed with deep learning architectures such as RNNs, CNNs, and stacked machine auto encoders. With these models, pediatric pneumonia can be detected from chest X-ray images. The authors also presented challenges such as handling unlabeled data and privacy issues in the medical field. Jaiswal et al.³¹ used regions of interest, an align convolution layer, and pixel-wise segmentation of disease.

Toğaçar et al.¹⁴ proposed a minimum redundancy maximum relevance (mRMR) model for the diagnosis of pneumonia. Three knowledge transfer models, namely VGG-16, AlexNet, and VGG-19, are used in the proposed architecture. Moreover, decision tree, linear discriminant analysis, k-nearest neighbor, and support vector machine classifiers are applied to the features generated by the transfer learning models. Singh et al.³² proposed a multi-objective differential evolution model for the classification of COVID-19, in which an exponential crossover algorithm is used. The proposed model gives higher accuracy than an adaptive neuro-fuzzy inference system, an artificial neural network, and CNN variants.

Das et al.³³ designed an Xception model to diagnose COVID infection using a chest X-ray dataset containing three classes: pneumonia and COVID-19 negative, COVID-19 positive, and other infections except for COVID. The features are extracted by applying different masks in the convolution layer, and cross-entropy is used as the loss function. Brunese et al.³⁴ built two models: the first model assesses whether the image belongs to a healthy patient or a patient suffering from general pulmonary illness. If the patient has a general pulmonary condition, the X-ray image is sent to the second model, which checks whether the patient has COVID-19 or only a pulmonary disease.

Liu et al.⁷ suggested a model for dental disease diagnosis that uses a mask region-based convolutional neural network to classify seven different dental diseases. The model uses an IoT platform through which patients upload their dental images. A broad-level prototype for dental image acquisition is also given in the paper. Jain et al.³⁵ presented a four-phase model in which a ResNet50 network is used to differentiate between bacterial pneumonia and pneumonia. The approach suggested by Varela et al.³⁶ uses feature extraction to minimize the number of pixels, grey-level co-occurrence matrix features that focus on the spatial relationship between pixels, and the local binary patterns method to encode the pixel values. Marques et al.³⁷ suggested an EfficientNetB4 model, which is a convolutional neural network. Ezzat et al.³⁸ suggested a technique in which the gravitational search method is used as an optimization tool to identify the optimal hyperparameter settings. The new method is contrasted with Social Ski Driver-DenseNet121. Data preparation, hyperparameter selection, and the learning step for COVID-19 diagnosis are all part of the technique. Hassantabar et al.³⁹ proposed a technique for detecting COVID-19 patients in which two approaches are used for diagnosis: the first is a deep neural network, and the second is an image segmentation approach for detecting diseased areas. Table 1 summarizes studies relating to COVID-19 and CNN architectures on chest X-ray images (Type1) and on PIMA and UCI (Type2), as well as information on additional approaches utilized in the papers.

Various studies have explored the application of deep learning techniques across diverse imaging modalities, including CT scans, X-rays, color fundus images, ultrasound, and optical coherence tomography^40,37,42. The investigated diseases range from diabetic retinopathy and skin lesions to bone fractures, breast cancer, viral pneumonia, and COVID-19^13,43. These studies employ a variety of deep learning architectures such as RNNs, CNNs, and stacked machine auto encoders. Notably, researchers have addressed challenges like handling unlabeled data and privacy issues in the medical field⁴⁴. Key findings include the efficacy of models like Xception for diagnosing COVID-19 from chest X-ray images, the use of multi-objective models for COVID-19 classification, and innovative approaches like dental disease diagnosis using a mask region-based CNN⁴⁵. The comparison in Table 1 underscores the performance of different models in COVID-19 diagnosis and CNN architecture across chest X-ray images and datasets like PIMA and UCI. Overall, these studies demonstrate the versatility and potential impact of deep learning in advancing early disease detection and diagnostic accuracy in medical imaging.

Proposed concise CNN based architecture

The framework of a convolutional neural network depends on the number of layers, activation function, optimizer, number of filters and batch size^46,43,48. Figure 2 represents the proposed architecture of the COVID-19 diagnosis structure. The proposed model has been derived from the baseline EfficientNet model⁴⁶. In contrast to the more complex architectures, the goal is to develop a concise CNN model that can identify picture modification⁴⁰. The layers of the EfficientNet model are modified by replacing the MBConv layer with a Conv2D layer and by updating the number of filters in the layers. Additionally, a dropout layer with a rate of 0.4 is added to regularize the model and reduce overfitting. The proposed architecture is a sequential model, so the layers are added in order to build the CNN architecture. The proposed CNN model contains Conv2D, MaxPooling2D, Dropout, the ReLU activation function, and dense/fully connected layers. The suggested model has nine total layers: three convolutional, three maxpooling, three Relu, two dense, one dropout, and a fully connected layer.

a.
The input image size is 224 × 224 × 3, i.e., the height and width of the image are 224 and the 3 channels are RGB. The first convolution layer (L1), the first layer of the model, takes an input of size 224 × 224 × 3 and has a kernel size of 3 × 3, producing 32 feature maps.
b.
The second convolution layer (L2) has 32 filters with a kernel size of 3 × 3 and produces 32 feature maps.
c.
The third convolution layer (L3) contains 64 filters with a kernel size of 3 × 3 and produces a tensor of 64 feature maps.
d.
To tackle the overfitting problem, the above layers are followed by a dropout layer applied to the 64 feature maps (kernel size 3 × 3) of the preceding layer. The dropout layer is followed by a flatten layer, which converts the data into 1-D form.
e.
Finally, a dense/fully connected layer with 128 units is added, and the efficiency of the model is improved by using ReLU as the activation function, which produces 258 features. This layer produces the output.
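
To illustrate why a model built from the three convolution layers above stays small, the following minimal sketch counts their trainable parameters. It assumes the standard Conv2D parameter formula (one k × k kernel per input/output channel pair plus one bias per filter) and assumes that L2 and L3 each receive the 32 feature maps of the preceding layer; the function name `conv2d_params` is hypothetical and not taken from the paper.

```python
def conv2d_params(kernel: int, in_channels: int, filters: int) -> int:
    """Trainable parameters of a Conv2D layer:
    (kernel * kernel * in_channels) weights plus 1 bias, per filter."""
    return (kernel * kernel * in_channels + 1) * filters


# (kernel size, input channels, filters) for L1, L2 and L3.
# Input channels of L2 and L3 are assumed from the layer order in the text.
layers = [(3, 3, 32), (3, 32, 32), (3, 32, 64)]

counts = [conv2d_params(k, c, f) for k, c, f in layers]
total = sum(counts)

for name, count in zip(("L1", "L2", "L3"), counts):
    print(f"{name}: {count} parameters")
print(f"convolutional total: {total}")
```

Under these assumptions the convolutional part holds fewer than thirty thousand parameters; the dense layers, whose input size depends on pooling settings not stated here, are left out of the count.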

Convolutional layers are used in conjunction with the widely used rectified linear unit (ReLU) activation function, which introduces non-linearities into the network to improve performance and generalization. ReLU sets inputs less than zero to zero, which mitigates the vanishing gradient issue seen with other forms of activation functions. ReLU's key benefit is quicker execution, which shortens computation time. MaxPooling2D is used after each convolution layer to extract the most prominent features. Each layer is described in Table 2, in which Conv2D is the T1 layer, Max_pooling2D the T2 layer, Dropout the T3 layer, Flatten the T4 layer and Fully connected the T5 layer.

Table 2 Layer architecture of proposed model.

The filter applied to the image is represented in Eq. (1), where h is the kernel and f is the input image. The row and column indexes of the resulting matrix are denoted q and r, and the ∑ signs sum over all kernel indexes j and k.

$$R[q,r] =[f*h][q,r]= {\sum }_{j}\sum_{k}h\left[j,k\right] f[q-j,r-k]$$

(1)
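
As an illustration of Eq. (1), the sketch below evaluates the double sum directly in plain Python. It is a minimal version that assumes pixels outside the image are zero (a "full" convolution); the function name `convolve2d` is hypothetical. Note that deep learning convolution layers usually compute the unflipped variant (cross-correlation), which differs from Eq. (1) only in the sign of the index offsets.

```python
def convolve2d(f, h):
    """Full 2-D convolution: R[q][r] = sum_j sum_k h[j][k] * f[q-j][r-k].

    f is the image and h the kernel, both given as lists of rows.
    Pixels outside the image are treated as zero (an assumption).
    """
    f_rows, f_cols = len(f), len(f[0])
    h_rows, h_cols = len(h), len(h[0])
    out_rows, out_cols = f_rows + h_rows - 1, f_cols + h_cols - 1
    result = [[0] * out_cols for _ in range(out_rows)]
    for q in range(out_rows):
        for r in range(out_cols):
            total = 0
            for j in range(h_rows):
                for k in range(h_cols):
                    y, x = q - j, r - k
                    if 0 <= y < f_rows and 0 <= x < f_cols:
                        total += h[j][k] * f[y][x]
            result[q][r] = total
    return result


image = [[1, 2],
         [3, 4]]
kernel = [[0, 1],
          [1, 0]]
feature_map = convolve2d(image, kernel)
print(feature_map)
```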

In this process, the filter is placed over the image and each filter value is multiplied by the corresponding image value. All the products are then summed to generate the feature map. Padding is added to the image so that the output keeps the proper size. Equation (2) gives the padding p, where f here denotes the filter size rather than the input image of Eq. (1).

$$p=\frac{ f-1}{2}$$

(2)

After the Conv2D operation, the process moves to pooling. The output size of the pooling function is defined in Eq. (3). In the pooling process, the maximum or the average is taken, according to the pool type. Pooling is a technique for down-sampling the feature maps: max pooling keeps the features with the highest values as the window slides over the map. Here, s is the stride, p is the padding, f is the window size, n_H is the height, n_w is the width, and n_c is the number of channels.

$$\mathrm{Pooling} = \left(\left\lfloor \frac{{n}_{H}+2p-f}{s}+1\right\rfloor ,\ \left\lfloor \frac{{n}_{w}+2p-f}{s}+1\right\rfloor ,\ {n}_{c}\right),\quad s>0$$

(3)
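
The size arithmetic of Eqs. (2) and (3) can be checked with a short sketch. The helper names `same_padding` and `output_shape` are hypothetical, and the 2 × 2 pooling window with stride 2 used in the example is an assumption, since the pooling settings are not stated in the text.

```python
def same_padding(f: int) -> int:
    """Padding from Eq. (2), p = (f - 1) / 2, for an odd filter size f."""
    return (f - 1) // 2


def output_shape(n_h: int, n_w: int, n_c: int, f: int, s: int = 1, p: int = 0):
    """Output shape from Eq. (3); the channel count n_c is unchanged."""
    if s <= 0:
        raise ValueError("stride s must be positive")
    out_h = (n_h + 2 * p - f) // s + 1
    out_w = (n_w + 2 * p - f) // s + 1
    return out_h, out_w, n_c


# A 3 x 3 filter with the padding of Eq. (2) and stride 1 keeps 224 x 224.
conv_shape = output_shape(224, 224, 3, f=3, s=1, p=same_padding(3))

# Assumed 2 x 2 pooling window with stride 2 and no padding halves the size.
pool_shape = output_shape(224, 224, 32, f=2, s=2, p=0)

print(conv_shape, pool_shape)
```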

For improved results, ReLU is employed as the activation function in each Conv2D layer and max-pooling layer. The function works as in Eq. (4), where x is the input value:

$$Relu(x) = max(0,x)$$

(4)

The output of the convolution and max-pooling layers is then directed to the feed-forward function to calculate the value for the next step; Eqs. (5) and (6) represent this process:

$$Z[I] = W[I] \cdot AF[I-1] + b[I]$$

(5)

$$AF[I] = g[I] (Z[I])$$

(6)

Here, g is the activation function in Eq. (6). First, the value of Z is calculated by multiplying the output AF of the previous layer by the weight tensor W and adding the bias b; the activation function g is then applied to Z.
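
A minimal sketch of Eqs. (5) and (6) for a single layer and a single input vector is given below, with ReLU from Eq. (4) as the activation g. The function names and the small numeric values are illustrative assumptions, not values from the paper.

```python
def relu(x: float) -> float:
    """Eq. (4): Relu(x) = max(0, x)."""
    return max(0.0, x)


def dense_forward(W, a_prev, b, g=relu):
    """Eq. (5): Z = W . AF_prev + b, then Eq. (6): AF = g(Z).

    W is a list of weight rows (one row per output neuron),
    a_prev the activations of the previous layer, b the biases.
    """
    z = [sum(w * a for w, a in zip(row, a_prev)) + bias
         for row, bias in zip(W, b)]
    a = [g(value) for value in z]
    return z, a


# Illustrative (hypothetical) weights, inputs and biases.
W = [[1.0, -2.0],
     [0.5, 0.5]]
a_prev = [3.0, 2.0]
b = [0.0, 1.0]

z, a = dense_forward(W, a_prev, b)
print(z, a)
```

The first neuron has a negative pre-activation, so ReLU sets its output to zero, while the second passes through unchanged.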

After this calculation, we compute the derivatives that are used to update the parameter values, a procedure known as gradient descent. The partial derivatives are defined as

$$dAF\left[I\right]= \frac{\partial L}{\partial AF\left[I\right]},\quad dZ\left[I\right]= \frac{\partial L}{\partial Z\left[I\right]},\quad dW\left[I\right]= \frac{\partial L}{\partial W\left[I\right]},\quad db\left[I\right]= \frac{\partial L}{\partial b\left[I\right]}$$

(7)

dW and db are the gradients with respect to the parameters of the present layer. According to the chain rule, the result is Eq. (8)

$$dZ[I] = dAF[I] * g{\prime}(Z[I] )$$

(8)
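
The backward step can be sketched for one dense layer and one sample as follows. Only the dZ line corresponds to Eq. (8); the expressions for dW, db and the gradient passed to the previous layer are the standard chain-rule results for Z = W · AF_prev + b and are stated here as assumptions, since the text does not spell them out. All names and numbers are illustrative.

```python
def relu_grad(z: float) -> float:
    """Derivative g'(z) of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def dense_backward(dA, z, a_prev, W):
    """Backward pass of one dense layer for a single sample.

    dZ follows Eq. (8): dZ = dAF * g'(Z).
    dW, db and dA_prev are the usual chain-rule consequences (assumed).
    """
    dZ = [da * relu_grad(zv) for da, zv in zip(dA, z)]
    dW = [[dz * a for a in a_prev] for dz in dZ]
    db = list(dZ)
    dA_prev = [sum(W[i][j] * dZ[i] for i in range(len(W)))
               for j in range(len(a_prev))]
    return dZ, dW, db, dA_prev


# Illustrative (hypothetical) values for a layer with two neurons.
W = [[1.0, -2.0],
     [0.5, 0.5]]
a_prev = [3.0, 2.0]
z = [-1.0, 3.5]          # pre-activations from the forward pass
dA = [2.0, 1.0]          # gradient of the loss w.r.t. this layer's output

dZ, dW, db, dA_prev = dense_backward(dA, z, a_prev, W)
print(dZ, dW, db, dA_prev)
```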

After the backpropagation process, hyperparameter tuning is the next step, which involves trying different combinations of parameter values and selecting them based on the performance of the model. The parameters used for tuning are the loss function, learning rate, optimizer and number of neurons.

(a)
Loss function: It is used to compute the model error. The gradients are calculated from the loss function and used to update the weights. To generate output, the suggested model uses a sparse categorical cross-entropy loss function, whose mathematical form is shown in Eq. (9)
$$L= -\sum_{i=1}^{M}{y}_{i} \log({\widehat{y}}_{i})$$
(9)
where $\widehat{y}_{i}$ represents the outcome produced by the model and $y_{i}$ represents the expected outcome.

(b)
Optimizer: The goal of an optimizer is to minimize the loss by adjusting relevant model parameters such as the learning rate and the weights. In the proposed approach, RMSprop is used as the optimizer. RMSprop keeps a moving average of the squared gradient, as represented in Eqs. (10) and (11)
$${w}_{t+1} = {w}_{t}- \frac{{\alpha }_{t}}{{({v}_{t} + \epsilon )}^{1/2}} \left[\frac{\delta l}{\delta {w}_{t}}\right]$$
(10)
$${v}_{t}=\beta {v}_{t-1} + (1-\beta )\left[\frac{\delta l}{\delta {w}_{t}}\right]^{2}.$$
(11)

Here, $\alpha_{t}$ is the learning rate at time t, $\delta l/\delta w_{t}$ is the derivative of the loss function with respect to the weight at time t, $v_{t}$ is the moving average of the squares of past gradients, $\epsilon$ is a small positive constant (10^–8) and β is the moving average parameter (constant, 0.9). The dense layer receives input from all neurons of the previous layer and applies the ReLU activation function. The output returned by the dense layer is represented in Eq. (12)

$$o = g(dot(I, K)+b)$$

(12)

In the above equation, o is the output, g is the activation function, dot represents the NumPy function for the dot product, I is the input, K represents the weight data, and b is the bias used to optimize the model. Figure 3 depicts the study's step-by-step process. In the classification process, a CNN processes input images through convolutional, activation, pooling, and fully connected layers. Training involves optimizing weights via backpropagation to minimize a loss function. The trained model predicts image classes by analyzing learned features.
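
The loss of Eq. (9) and the RMSprop update of Eqs. (10) and (11) can be sketched for a single sample and a single scalar weight as shown below. This is a minimal illustration under stated assumptions: the sum in Eq. (9) is taken over classes with a one-hot target (so only the true-class term survives, which is what the integer "sparse" label encodes), the loss carries the conventional leading minus sign, and the learning rate in the example is an arbitrary illustrative value. The function names are hypothetical.

```python
import math


def sparse_categorical_cross_entropy(probs, label):
    """Eq. (9) for one sample with an integer class label.

    With a one-hot target, the sum over classes reduces to
    -log(probability assigned to the true class).
    """
    return -math.log(probs[label])


def rmsprop_step(w, grad, v, lr, beta=0.9, eps=1e-8):
    """One RMSprop update for a scalar weight.

    Eq. (11): v_t = beta * v_{t-1} + (1 - beta) * grad^2
    Eq. (10): w_{t+1} = w_t - lr / sqrt(v_t + eps) * grad
    beta and eps follow the constants quoted in the text.
    """
    v_new = beta * v + (1.0 - beta) * grad ** 2
    w_new = w - lr / math.sqrt(v_new + eps) * grad
    return w_new, v_new


# Hypothetical predicted probabilities for three classes, true class 0.
loss = sparse_categorical_cross_entropy([0.7, 0.2, 0.1], label=0)

# Hypothetical weight, gradient and learning rate for one update.
w_next, v_next = rmsprop_step(w=1.0, grad=2.0, v=0.0, lr=0.1)

print(round(loss, 4), round(w_next, 4), round(v_next, 4))
```

Because the gradient is divided by the square root of its own running average, the step size stays close to the learning rate regardless of the raw gradient scale.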

Materials and methods

The dataset⁴⁹ utilized in the experiment is configured in two ways. The first configuration is the dataset with data augmentation⁴⁴. The second configuration is the dataset with image enhancement using Gaussian Blur, together with data augmentation and hyperparameter tuning.

Dataset description

The suggested model for detecting COVID-19 illness was evaluated using a dataset of publicly accessible conventional chest X-ray images⁵⁰. The collection includes 3616 COVID-19 positive cases, 10,192 Normal images, 6012 Lung Opacity images, and 1345 viral Pneumonia images. Only two of the four image types were taken into account in our experiments, i.e., COVID-19 positive and viral Pneumonia. Every image is a grayscale image of $299\times 299$ pixels. Figure 4 shows a sample image from each class of the test dataset. A total of 1000 images are taken from the dataset and then divided into three different samples.

Image enhancement

The Gaussian blur feature is derived by blurring an image using a Gaussian function. This technique improves the quality of an image and is helpful in recovering information that is otherwise inadequate for image interpretation⁵¹. Spatial filtering, slicing, stretching, edge sharpening and other methods are used in this technique. The method reduces noise and smooths the image. The process is achieved by convolving the image with a Gaussian kernel. The formula used for the process is shown in Eq. (13)¹⁸:

$${G}_{2D}(x,y,{\sigma }^{2}) = \frac{1}{2\pi {\sigma }^{2}}\,{e}^{-\frac{{x}^{2}+{y}^{2}}{2{\sigma }^{2}}}$$

(13)

Here, σ denotes the standard deviation of the distribution, and x, y are location indices. The value of σ determines the spread of the Gaussian distribution and thus the extent of the blurring effect around each pixel. The COVID and viral pneumonia images before and after Gaussian blur are shown in Figs. 5 and 6.

OpenCV is used to implement Gaussian blur. Three arguments are used in the process: img, the image to be modified; sigma, the standard deviation used in the x and y directions; and truncate, which determines the limits of the approximation. The Gaussian filter takes the neighborhood of the pixel at (x, y) and returns a single number by calculating the weighted average based on the normal distribution.
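
To make the roles of sigma and truncate concrete, the following sketch builds the Gaussian kernel of Eq. (13) in plain Python instead of calling an image library. It is an illustration only: the function name `gaussian_kernel`, the default `truncate=4.0`, and the rule that the kernel radius is `truncate * sigma` rounded to the nearest integer are assumptions, not settings reported in the paper.

```python
import math


def gaussian_kernel(sigma: float, truncate: float = 4.0):
    """Build a normalised 2-D Gaussian kernel from Eq. (13).

    The kernel is cut off at radius = round(truncate * sigma) pixels
    (an assumed convention) and rescaled so its weights sum to 1,
    which compensates for the mass lost by truncation.
    """
    radius = int(truncate * sigma + 0.5)
    coords = range(-radius, radius + 1)
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    kernel = [[norm * math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
               for x in coords]
              for y in coords]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


kernel = gaussian_kernel(sigma=1.0)
size = len(kernel)
centre = kernel[size // 2][size // 2]
print(size, round(centre, 4))
```

Convolving an image with this kernel replaces each pixel by a weighted average of its neighbours, with weights that fall off with distance; a larger sigma spreads the weights further and blurs more strongly.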

Figure 5 shows chest X-ray images of the COVID class with and without image enhancement using the Gaussian Blur¹⁸ technique. Figure 6 shows chest X-ray images of the viral pneumonia class with and without image enhancement using the Gaussian Blur technique. The results in this paper cover experiments on two types of dataset, i.e., the dataset with Gaussian Blur images and the dataset without Gaussian Blur images.

Methodology of Gaussian Blur

The CLAHE⁵² and histogram equalization⁵³ techniques were tried before Gaussian Blur was selected. CLAHE stands for contrast limited adaptive histogram equalization, which refines images with high intensity. Histogram equalization is used to improve the contrast of an image. These two techniques gave lower accuracy than the blur technique⁵¹.

Data augmentation

In order to make training data more generalizable and applicable, data augmentation transforms images in various ways, such as rotating, flipping, and resizing⁵⁴. Figure 7 shows images without and with data augmentation¹⁹. It increases the number of samples in the training set by applying the techniques listed in Table 3, which helps the model to extract features and understand the image. The technique improves performance and is used to reduce overfitting⁴⁵. The data augmentation methods and their parameters are presented in Table 3.

Table 3 Data augmentation methods.

The augmentation algorithms include kernel filters, geometric transformations, random erasing, mixing images, color space augmentations, etc. The above results show that preprocessing images with data augmentation can increase the precision of classification and reduce the overfitting problem.
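
Two of the geometric transformations mentioned above, flipping and rotating, can be sketched on a tiny image stored as a list of rows. This is an illustration only: the 90-degree rotation, the 0.5 probabilities and the function names are assumptions and do not reproduce the parameters of Table 3.

```python
import random


def horizontal_flip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(img, rng):
    """Randomly apply a flip and/or a rotation (assumed probability 0.5 each)."""
    out = [list(row) for row in img]
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    if rng.random() < 0.5:
        out = rotate90(out)
    return out


image = [[1, 2],
         [3, 4]]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
augmented = [augment(image, rng) for _ in range(4)]
for sample in augmented:
    print(sample)
```

Each call yields a variant that contains the same pixels in a different arrangement, which is how augmentation enlarges the training set without collecting new images.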

Evaluation parameters

Based on the confusion matrix, we determine the class-wise performance of our model using the following performance metrics⁵⁵.

1.
True positive (TP): These are instances in which both the predicted and actual classes are true (P).
2.
True negative (TN): These are instances in which both the predicted and actual classes are false (N).
3.
False negative (FN): These are instances in which the data item's real class is true (P), but the classification model wrongly labels it as false (N).
4.
False positive (FP): These are instances when the data item's real class is false (N), but the classification model mistakenly labels it as true (P).
5.
Accuracy: Accuracy is the ratio of correct predictions to total predictions. Equation (14) defines the accuracy formula:
$$Accuracy=\frac{TP+TN}{TP+TN+FP+FN}$$
(14)
6.
Loss: The loss is the difference between the predicted and actual values. Equation (15) depicts the loss formula, where y represents the expected outcome and y hat represents the model's output
$$L= -\left({y}_{i} \log({\widehat{y}}_{i}) +(1- {y}_{i}) \log(1-{\widehat{y}}_{i})\right)$$
(15)
7.
Execution time: The time the model takes from start to finish of execution.
8.
Recall: The percentage of total relevant results properly categorized by the model is referred to as recall.
$$Recall = \frac{TP}{{TP + FN}}$$
(16)
9.
Precision: It is the percentage of relevant results in your results. The formula is as follows:
$$Precision = \frac{TP}{{TP + FP}}$$
(17)
10.
F-Measure: The F-measure represents the harmonic mean of precision and recall. It is determined as follows:
$$F{ - }Measure = \frac{2*Recall*Precision}{{Recall + Precision}}$$
(18)
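
The metrics of Eqs. (14) and (16) to (18) follow directly from the four confusion-matrix counts, as the sketch below shows. The counts used in the example are hypothetical and are not results from the paper; the function name `classification_metrics` is likewise an assumption.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, recall, precision and F-measure from counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)                   # Eq. (14)
    recall = tp / (tp + fn)                                      # Eq. (16)
    precision = tp / (tp + fp)                                   # Eq. (17)
    f_measure = 2 * recall * precision / (recall + precision)    # Eq. (18)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical confusion-matrix counts for a two-class problem.
metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```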

Experimental results and discussion

This section presents and analyzes the results obtained after performing experiments in three different scenarios. All the proposed approaches have been implemented in Python using the TensorFlow and Keras libraries³⁸. For the analysis of results, the dataset is split into different ratios as represented in Table 4.

Table 4 Dataset distribution.
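
The three train/test ratios used in the scenarios below (70:30, 60:40 and 80:20) can be reproduced on a list of 1000 image names with a short sketch. The file names, the fixed seed and the helper `train_test_split` are hypothetical; the sketch only illustrates the split sizes and does not reproduce the exact per-class distribution of Table 4.

```python
import random


def train_test_split(items, train_fraction, seed=0):
    """Shuffle the items and split them into training and testing parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the 1000 selected images.
images = [f"img_{i:04d}.png" for i in range(1000)]

sizes = {}
for name, fraction in (("70:30", 0.7), ("60:40", 0.6), ("80:20", 0.8)):
    train, test = train_test_split(images, fraction)
    sizes[name] = (len(train), len(test))
    print(name, len(train), len(test))
```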

Experimental results without image enhancement

Scenario 1

The following results are based on scenario 1, i.e., a dataset ratio of 70:30, in which 70% of the images belong to the training set and 30% belong to the testing set. After the hyperparameter tuning step, the suggested approach is compared to other models.

Figure 8 represents the confusion matrix for this split of the dataset. Figure 9 represents the accuracy of classifying COVID disease over 500 epochs. The proposed model gives 96% accuracy, which is better than the other models. The bar graph in Fig. 10 sums up the execution time. The proposed approach takes less time than the other models because it is lightweight and uses fewer parameters.

Moving ahead, Table 5 shows the performance metrics obtained in scenario 1, containing the values of F1-score, recall and precision. In this table, the proposed model shows the highest values of F1-score, recall and precision for the COVID and viral pneumonia classes. Figure 10 represents the testing loss of each model over 500 epochs. The loss value shows how much error there is in the model's performance; the resulting graph shows that Resnet50 has a higher loss than the other models. Figure 11 illustrates the testing loss over 500 epochs for all models.

Table 5 Model results with scenario 1.

Scenario 2

The following results are based on scenario 2, with a dataset ratio of 60:40, in which 60% of the images belong to the training dataset and 40% belong to the testing dataset. Figure 12 shows a confusion matrix illustrating that the proposed model and the Inception model have higher true positive values, i.e., 141 and 142, than the other models. In the confusion matrix, the true positive value of the proposed model is 141, which means that 141 COVID images are correctly classified as COVID, and the true negative value is 136, which means that 136 viral pneumonia images are correctly classified.

Table 6 displays the F1-score, recall and precision values of the models. The proposed model outperforms the Inception model in precision and surpasses the Xception model in recall. The proposed model gives the maximum values of precision and recall, which makes it better than the other models.

Table 6 Model results on scenario 2.

Figure 13 represents the rate of correctly classified images of viral pneumonia and COVID. The testing accuracy of Resnet50, Inception and Xception is lower than that of the proposed model. Figure 14 displays a bar graph of the time taken by each model to execute the 500 epochs.

The proposed model was executed in 7460 s in total, which is less than the other transfer learning CNN models, because its architecture has fewer parameters. Figure 15 shows the validation loss of Resnet50, Inception and Xception. Loss indicates how many wrong predictions were made by the model. The proposed model gives a lower loss than the other models.

Scenario 3

The following results are based on scenario 3, with a dataset ratio of 80:20, in which 80% of the images belong to the training dataset and 20% belong to the testing dataset. The confusion matrix, which shows the performance of the models in predicting the true disease classes, is illustrated in Fig. 16.

Figure 16 depicts the confusion matrices, as well as the values related to the performance metrics of the transfer learning models and the suggested model. Table 7 displays the findings in terms of accuracy, recall, and F1-score for Inceptionv3, Resnet50, Xception, and the proposed model in Scenario 3. Figure 17 represents the increasing accuracy of the proposed model with every epoch, compared with the three other models. The testing accuracy of Resnet50 is 0.89, Inception is 0.93, Xception is 0.74 and the proposed model is 0.96.

Table 7 Model results on scenario 3.

Figure 18 shows a bar graph of the time taken by each model to complete 500 epochs with the 80:20 dataset ratio. The proposed model executed in 7114 s in total. Figure 19 shows the rate of wrongly classified images for each model; the graph shows the different peaks of the loss with each epoch. Resnet50 has the highest loss compared to the other models.

Experimental results with image enhancement

The following results were obtained with Gaussian Blur images over $500$ epochs, and a maximum accuracy of $98\%$ was observed. With every epoch, the accuracy of our proposed model with image enhancement improves. Because the model uses smoothed, less noisy images, its accuracy increases from 96 to 98%.