Non-iterative learning machine for identifying CoViD19 using chest X-ray images

CoViD19 is a novel disease that has created panic worldwide by infecting millions of people. The last significant variant of the virus, called omicron, contributed to the majority of cases in the third wave across the globe. Though less severe than its predecessor, the delta variant, this mutation has shown a higher rate of transmission. This novel virus, which presents with symptoms of pneumonia, is dangerous because it is communicable, and it has therefore engulfed the entire world in a very short span of time. With the help of machine learning techniques, the entire process of detection can be automated so that direct contact can be avoided. Therefore, in this paper, experimentation is performed on CoViD19 chest X-ray images using higher order statistics with iterative and non-iterative models. Higher order statistics provide a way of analyzing the disturbances in the chest X-ray images. An accuracy of 96.64% is obtained using a non-iterative model. For fast testing of patients, the non-iterative model is preferred because it has an advantage over iterative models in terms of speed. Comparison with some of the available state-of-the-art methods and some iterative methods demonstrates the efficacy of the work.

their attention towards it [11][12][13]. Everyone understands that the need of the hour is to come up with solutions to contain and control this virus, which has spread across the globe.
The remainder of the paper is organized as follows. Section II gives a brief literature survey of the pandemic and of the research carried out to tackle the problem. Section III describes the role of machine learning in the classification of chest X-ray images and briefly describes different models. Section IV details the proposed technique, explaining the feature extraction method, the classifiers, and the database used to demonstrate the efficacy of machine learning for CoViD19 with the tested methods. Section V discusses the experimental results, followed by a comparison of the proposed method with existing state-of-the-art methods in Section VI. Finally, Section VII concludes the paper and briefly discusses future work in machine learning for situations like this pandemic.

Related work
In the past two years, a lot of research has been conducted on this novel virus. Some studies have stated that CoViD19 is associated with pneumonia and can be cured using a drug related to malaria. This drug, called chloroquine or hydroxychloroquine, has been recommended by the National Health Commission of the People's Republic of China. However, it is able to cure patients who are at low risk 14. As a side-effect of CoViD19, a fungal infection called mucormycosis or black fungus has also been observed in a few patients in India after recovery from CoViD19 15,16. Since the virus spreads through contact, an obvious solution has seemed to be the development of learned machines that can minimize human contact and ease diagnostic procedures for doctors. The novel virus affects the lungs and causes heaviness in breathing, so CT scans are being used by the medical community to monitor the condition of patients from time to time. Automatic detection and classification of CT scans 17 can lift the burden from diagnosticians and speed up the process of identifying potential patients. Apostolopoulos and Mpesiana 18 have evaluated the performance of CNN architectures, which are a popular choice for medical image classification, for automatic detection of Coronavirus disease through X-rays 19. The dataset for their research included 1427 X-ray images covering normal conditions, confirmed Corona cases and common bacterial pneumonia cases. They have applied a transfer learning strategy and CNN for classification. Based on the results, it has been suggested that X-ray imaging with deep learning can be used for diagnosis of Coronavirus. Singh and Bansal 20 have proposed the use of a deep learning model, namely the truncated Visual Geometry Group network from Oxford (truncated VGG16), for screening of CT scans. After extraction of features, they have implemented principal component analysis (PCA) for feature selection. 
Once the useful features are available, they have used four classification models, namely extreme learning machine (ELM), online sequential ELM, deep CNN, and a bagging ensemble with SVM. On a dataset of 208 images, the last classifier has achieved an accuracy of 95.7% and performs better than the other classifiers. The use of multi-objective differential evolution (MODE) and CNN has been advocated by Singh et al. in 21. Instead of using random valued parameters for CNN, the researchers have proposed fine tuning of the initial parameters with the help of MODE. They have compared the results with CNN, adaptive neuro-fuzzy inference systems, and artificial neural networks, and shown that their technique performs with a good accuracy rate. Sethy and Behera 22 have also suggested a deep learning based methodology for classification of X-ray images as Corona positive or negative. They have obtained validated images from Kaggle, Github and Open-i. The ResNet50 plus SVM model proposed by them achieves an accuracy of 95.38%. A deep CNN model, DeTraC (decompose, transfer and compose), has been validated and adapted by Abbas et al. 23 for classification of chest X-rays to identify COVID19 positive cases. They have also discussed the importance of transfer learning in situations where the availability of annotated medical images is limited. Their deep CNN model achieves accuracy, sensitivity and specificity of 95.12%, 97.91% and 91.87% respectively. Detection of the infection has also been carried out on a comprehensive dataset of CT scan and X-ray images collected from different sources, using deep learning and transfer learning algorithms 24. Wu et al. have developed a novel idea to diagnose CoViD19 with the help of a joint classification and segmentation (JCS) system, with which they were able to detect CoViD19 in chest CT scans. 
To develop the JCS system, they have built a large-scale CoViD19 classification and segmentation dataset, with 350 uninfected cases and 144,167 chest CT images of 400 CoViD19 patients. Fine-grained pixel-level marks of opacifications, which are enhanced attenuation of the lung parenchyma, are annotated on 3,855 chest CT images of 200 patients 25. Ahuja et al. detected CoViD19 using transfer learning from CT scan images decomposed to three levels using the stationary wavelet. To improve detection accuracy, a three-phase detection model is proposed, with the following procedures: Phase 1: data augmentation using stationary wavelets, Phase 2: COVID-19 detection using a pre-trained CNN model, and Phase 3: abnormality localization in CT scan images. For the experimental evaluation, this work used well-known pre-trained architectures such as ResNet18, ResNet50, ResNet101, and SqueezeNet 26 29. Some more deep learning models, including pre-trained deep learning networks, have also been introduced by researchers 28,30-32.
Motivation. In the past year, various transfer learning models have been implemented for CoViD19 prediction 33. These include a ResNet 32 based model 34 and other convolutional neural network models for classification of CoViD19 chest X-ray images. All these transfer learning models achieved very good results in terms of accuracy. However, when computational time is taken into consideration, these models take a large amount of time.
Contribution. At the moment, even after so many efforts, the virus is not completely contained. So, there is a need to detect the CoViD19 virus in patients with great accuracy and in less time. Deep learning techniques are efficient but take more time in execution because of their iterative nature. Currently, with new cases being reported in some countries and leading to lockdowns in some cities, there must be a faster solution for the prediction of CoViD19+ve patients. To deal with the situation at hand, methods need to be developed which can help in detecting this virus without coming in contact with the patient, with higher accuracy and in less time. This is possible by using a non-iterative approach for detecting CoViD19.

Machine learning in CoViD19
Machine learning makes it possible to build a system which, once trained with some data, detects and identifies patterns and makes decisions with a minimum level of human intervention 41,42. As an effective vaccine has not yet been invented for the virus, it is essential to detect the virus in patients at an early stage. Since the virus is transmitted by touching and by breathing close to another individual, it is imperative that a person should not come in contact with the patients. Under the given circumstances, machine learning is one of the best solutions for the detection of CoViD19. The preliminaries used during this experimentation are briefly discussed below.

Support vector machine.
Any non-linear problem can be converted into a linearly separable problem in a higher dimensional space. This is the key idea on which support vector machines (SVMs) work. An SVM maps the given input training data to a higher dimensional feature space and then finds a hyperplane which maximizes the margin between the two classes, as shown in Fig. 1. As a result, the decision boundary in the input space is non-linear. By using kernels, the separating hyperplane can be computed without explicitly transforming the input space to a higher dimensional feature space 43. What makes SVM a favourable choice is that it works well even when the available training data is small in size. Further, SVMs are tolerant to imbalance in the number of training samples of the two classes, which usually is the case 44.
Convolutional neural network. The problem with a simple feedforward neural network is that it uses a huge number of neurons for its operation, which results in a large execution time. This problem is resolved with the help of the convolutional neural network (CNN). It is also a type of deep neural network, but it extracts the image features by reducing the dimensions of the image without losing its characteristics 45. A single layer CNN consists of three main sub-layers, i.e. the convolutional layer, the ReLU layer and the max pooling layer. The architecture of an N-layer CNN is shown in Fig. 2.
Convolutional layer. In this layer of the CNN, features are extracted from the image by convolving a filter with a part of the image (having the same size as the filter). This process is repeated over the whole image by sliding the filter with a given stride, and the features are obtained.
ReLU layer. ReLU stands for rectified linear unit. It is an activation function that sets all negative values in the output of the convolutional layer to zero. Its function is shown below:

$$h(y) = \max(0, y) \tag{1}$$

Max pooling layer. Max pooling is the most common non-linear down-sampling process in a CNN. It is a necessary step before applying the next convolutional layer, as it reduces the dimensions of the image and hence the computational cost. In this layer also, a filter 'f' is chosen which is slid over the image with stride 's'. An image field of the filter size is selected, and the maximum value of that portion is obtained by non-overlapping sliding of the filter over the image. An example is shown in Fig. 3. Thus, CNN also helps in dimension reduction while keeping the important information within the resultant image. It is utilized here for binary classification of chest X-ray images.
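As an illustration of the ReLU function h(y) = max(0, y) and of non-overlapping max pooling, a minimal Python sketch is given below. The function names, the 4 × 4 input values and the choice f = s = 2 are assumptions made only for this example; they are not taken from the experiments.

```python
def relu(feature_map):
    """Apply h(y) = max(0, y) element-wise to a 2-D list of numbers."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool(feature_map, f=2, s=2):
    """Slide an f x f window with stride s and keep the maximum of each window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - f + 1, s):
        pooled_row = []
        for c in range(0, cols - f + 1, s):
            window = [feature_map[r + i][c + j] for i in range(f) for j in range(f)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4 x 4 output of a convolutional layer (illustrative values only).
conv_out = [
    [1.0, -2.0, 3.0, 0.5],
    [-1.0, 4.0, -0.5, 2.0],
    [0.0, -3.0, -1.0, -2.0],
    [2.5, 1.0, -4.0, 6.0],
]
activated = relu(conv_out)              # negative entries become 0
pooled = max_pool(activated, f=2, s=2)  # 4 x 4 map reduced to 2 x 2
print(pooled)
```

With f = s = 2 the windows do not overlap, so each 2 × 2 block of the activated map contributes exactly one value to the pooled map.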
Kernel extreme learning machine. Extreme learning machines have gained attention widely because of their simplicity. They have a single hidden layer, which does not need to be tuned. Since their proposal in 2006 46 , they have been extended to adopt a more generalized form, where the neurons in a single layer feed-forward network (SLFFN) need not be alike [47][48][49] .
For a SLFFN, let there be a dataset of $H$ samples $\{s_i, o_i\}$, $i = 1, 2, \ldots, H$, where $s_i$ is the $i$th training input vector of dimension $D$ and $o_i$ is the corresponding class label belonging to one of the classes $\{1, 2, \ldots, L\}$. Hence the dimension of the input space of the SLFFN is $D$ and the dimension of the output layer is $L$. The input space is first mapped to a higher dimensional space whose dimension equals the number of neurons in the hidden layer (here, say $\ddot{H}$). The important features are preserved during this mapping. The next phase is the projection of the high-dimension feature space to a low-dimension feature space so as to map the output from the $\ddot{H}$ hidden neurons to the $L$ output classes 50. Thus, the system can be modelled using Eq. (2) as given below:

$$f(s_j) = \sum_{i=1}^{\ddot{H}} \beta_i \, A\left(\varpi_i^{T} s_j + b_i\right), \quad j = 1, 2, \ldots, H \tag{2}$$

where $A(s)$ is the activation function of the hidden layer, $\varpi_i = [\varpi_{i1}, \varpi_{i2}, \ldots, \varpi_{iD}]^{T}$ is the weight vector connecting the inputs to the $i$th hidden neuron, $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{iL}]^{T}$ is the vector of weights connecting the $i$th hidden layer neuron to the output layer nodes, and $b_i$ is the bias. The output layer is assumed to have a linear activation function. According to Huang et al. 46, a SLFFN with $\ddot{H}$ nodes in the hidden layer and an activation function $A(s)$ can approximate the $H$ samples with zero error such that

$$\sum_{j=1}^{H} \left\lVert f(s_j) - o_j \right\rVert = 0,$$

which can be written compactly as

$$M \beta = O,$$

where $M$ is the $H \times \ddot{H}$ hidden layer output matrix with entries $M_{ji} = A(\varpi_i^{T} s_j + b_i)$, $\beta$ is the $\ddot{H} \times L$ matrix of output weights, and $O$ is the $H \times L$ matrix of target outputs. The minimum norm least square solution of the SLFFN is given by:

$$\hat{\beta} = M' O$$

where $M'$ is the generalized (Moore-Penrose) inverse of $M$. Unlike other machine learning algorithms, what makes extreme learning machines stand out is their non-iterative behaviour. They are highly scalable and have comparatively low computational complexity 51.
Kernel extreme learning machines (KELM) are an extension of ELM. Here, the hidden layer outputs are calculated once and stored permanently in the kernel matrix. The kernel matrix does not depend on the dimensionality L of the output layer, but rather on the dimension of the input data and the number of samples 52,53. It is shown in Fig. 4.
KELM has the capability of approximation as well as classification. A single model can thus be used for a variety of applications. In the proposed scheme, KELM is used as a binary classifier, since CoViD19+ve and CoViD19−ve are the only two classes to be separated.
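The non-iterative character of KELM can be illustrated with a short NumPy sketch. It assumes the commonly used closed-form KELM solution, in which the output weights are obtained by solving (I/C + K) α = T once, with K the RBF kernel matrix of the training samples, C a regularization coefficient and T a matrix of ±1 class targets. The function names, the form exp(−‖u − v‖²/γ) of the kernel, the parameter values and the synthetic data are all assumptions for illustration and do not reproduce the experimental setup.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix exp(-||a_i - b_j||^2 / gamma) (one common parameterisation)."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / gamma)


def kelm_train(x_train, y_train, c=1.0, gamma=1.0):
    """Solve (I / c + K) alpha = T in a single step; no iterative weight updates."""
    n = x_train.shape[0]
    # One column per class (0 and 1), coded as +1 for the true class and -1 otherwise.
    targets = np.where(y_train[:, None] == np.array([0, 1])[None, :], 1.0, -1.0)
    kernel = rbf_kernel(x_train, x_train, gamma)
    return np.linalg.solve(np.eye(n) / c + kernel, targets)


def kelm_predict(x_test, x_train, alpha, gamma=1.0):
    """Score each test sample against the stored training samples and pick a class."""
    scores = rbf_kernel(x_test, x_train, gamma) @ alpha
    return np.argmax(scores, axis=1)


# Synthetic two-class data standing in for bispectrum feature vectors.
rng = np.random.default_rng(0)
x_neg = rng.normal(-1.0, 0.3, size=(20, 5))
x_pos = rng.normal(1.0, 0.3, size=(20, 5))
x_train = np.vstack([x_neg, x_pos])
y_train = np.array([0] * 20 + [1] * 20)

alpha = kelm_train(x_train, y_train, c=1.0, gamma=5.0)
x_test = np.array([[-1.0] * 5, [1.0] * 5])
predictions = kelm_predict(x_test, x_train, alpha, gamma=5.0)
print(predictions)
```

Training reduces to one linear solve whose size is set by the number of training samples, which is why the cost does not depend on repeated passes over the data.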

Proposed technique
This section begins with a detailed discussion of the database used for checking the efficacy of the proposed technique. Next, pre-processing and feature extraction are elaborated, as they are important steps before formation of the dataset for KELM. The steps for implementing the proposed technique are also discussed in this section.
Database used. As CoViD19 is associated with pneumonia, a chest X-ray of an individual can be helpful in detecting whether the person is CoViD19 positive or not. The database consists of 5,856 chest X-ray images, of which 4,273 are CoViD19 positive images and 1,583 are normal images. The X-ray images are gray scale in nature and have different dimensions. Hence, all the images have been resized to 50 × 50. In terms of pixel values, the images are normalized between −1 and +1. Sample images are shown in Fig. 5, in which chest X-rays of the normal (CoViD19−ve) and CoViD19 associated pneumonia (CoViD19+ve) classes are shown. It is a publicly available database 54. A normal chest X-ray is a clear image of the lungs without any areas of opacification. CoViD19 associated pneumonia is detected by an interstitial pattern, as depicted with red arrows. However, this difference may seem insignificant to an untrained person. A scatter plot of all the images is shown in Fig. 6 to illustrate the distribution of the database. Red dots represent the normal chest X-ray images and blue dots denote the CoViD19 associated pneumonia images of the patients. The x-axis and y-axis in Fig. 6 are particular pixel values of all the samples of the database. As all images (samples) have been resized to 50 × 50, there are 2500 pixel values for each image, which are converted from the matrix representation into a vector. Thus, from all the images, a dataset is formed consisting of 5856 rows and 2500 columns. Figure 6 only represents the correlation between column 1 (x-axis) and column 2 (y-axis) of all the samples. Some X-ray images have been added in Fig. 6 along with the red and blue dots to show that the images are indistinguishable for untrained people and may seem similar to them.
Similarly, some more scatter plots are shown in Fig. 7. These scatter plots have been plotted by taking various random combinations of pixel values of all the images. The column numbers are written on the x-axis and y-axis.
Higher order statistics can be defined in terms of moments and cumulants. Cumulants are non-linear combinations of various moments. The motivation behind using higher order statistics is to analyze the disturbances in chest X-ray images of the patients due to the attack of CoViD19. The bispectrum is the Fourier transform of the third-order cumulant 55,56.
The third-order cumulant $K_3(\epsilon_1, \epsilon_2)$ of a signal $z(m)$ is represented as

$$K_3(\epsilon_1, \epsilon_2) = n_3(\epsilon_1, \epsilon_2) - n_1\left[n_2(\epsilon_1) + n_2(\epsilon_2) + n_2(\epsilon_1 - \epsilon_2)\right] + 2 n_1^{3}$$

In this, $n_3(\epsilon_1, \epsilon_2) = E[z(m)\, z(m + \epsilon_1)\, z(m + \epsilon_2)]$ depicts the third-order moment, and $n_1$ and $n_2$ are the first- and second-order moments. $K_3(\epsilon_1, \epsilon_2)$ is the third-order cumulant that explains the skewness of the data and is equal to $n_3(\epsilon_1, \epsilon_2)$ for zero-mean data. Skewness is a measure of the asymmetry of a distribution about its mean. A positive value of skew indicates that the tail on the left side of the data is shorter and thicker than the tail on the right side. In cases where one tail is short but the other tail is thick, skewness does not obey a simple rule. A zero value shows that the tails on both sides of the mean are balanced, which is the case for a symmetric distribution. It can also be true for an asymmetric distribution in which the asymmetries even out, such as one tail being long but thin and the other being short but thick.
If these cumulants are considered in the frequency domain, the corresponding spectrum is obtained by taking their Fourier transform. The Fourier transform of the third-order cumulant is given as:

$$S(\phi_1, \phi_2) = \sum_{\epsilon_1 = -\infty}^{\infty} \sum_{\epsilon_2 = -\infty}^{\infty} K_3(\epsilon_1, \epsilon_2)\, e^{-j(\phi_1 \epsilon_1 + \phi_2 \epsilon_2)} = Z(\phi_1)\, Z(\phi_2)\, Z^{*}(\phi_1 + \phi_2)$$

where $S(\phi_1, \phi_2)$ is the bispectrum of $z(m)$, $K_3(\epsilon_1, \epsilon_2)$ is the third-order cumulant and $Z(\phi)$ is the Fourier transform of $z(m)$. We demonstrate that bispectral analysis has great potential in this new application, where bicoherence magnitude responses of the image are used to identify chest X-ray images. The bispectrum is often used for detecting the existence of quadratic correlation within a signal, as applied in oceanography, EEG signal analysis, manufacturing, non-destructive structural fatigue detection and plasma physics applications 57. We compute the bispectrum magnitude response of the chest X-ray images for the different classes of images and observe that the distributions for the CoViD19+ve images are greater than those of the CoViD19−ve images, as shown in Fig. 8.
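The relation between the third-order cumulant and the bispectrum can be checked numerically with the following NumPy sketch. It is a simplified illustration under stated assumptions: it works on a one-dimensional signal rather than an image, uses circular (periodic) lags, and estimates the cumulant by a sample mean after mean removal. The function names and the random test signal are hypothetical and are not the feature extraction code used in the experiments.

```python
import numpy as np


def third_order_cumulant(z):
    """Circular third-order cumulant estimate of a 1-D signal after mean removal."""
    z = np.asarray(z, dtype=float)
    z = z - z.mean()  # zero mean, so the cumulant equals the third-order moment
    n = z.size
    k3 = np.empty((n, n))
    for e1 in range(n):
        for e2 in range(n):
            # np.roll(z, -e)[m] equals z[(m + e) mod n]
            k3[e1, e2] = np.mean(z * np.roll(z, -e1) * np.roll(z, -e2))
    return k3


def bispectrum(z):
    """Bispectrum as the 2-D Fourier transform of the third-order cumulant."""
    return np.fft.fft2(third_order_cumulant(z))


def bispectrum_direct(z):
    """Same quantity from Z(p1) Z(p2) Z*(p1 + p2), scaled by 1/n to match the estimate."""
    z = np.asarray(z, dtype=float)
    z = z - z.mean()
    n = z.size
    zf = np.fft.fft(z)
    idx = np.arange(n)
    sum_idx = (idx[:, None] + idx[None, :]) % n
    return zf[:, None] * zf[None, :] * np.conj(zf[sum_idx]) / n


rng = np.random.default_rng(1)
signal = rng.normal(size=16)           # stand-in for one row of pixel values
magnitude = np.abs(bispectrum(signal))  # bispectrum magnitude response
print(np.allclose(bispectrum(signal), bispectrum_direct(signal)))
```

The two routes agree up to floating-point error, which is the discrete counterpart of the identity between the Fourier transform of the cumulant and the triple product of Fourier coefficients.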

Steps of implementation.
A basic block diagram of the workflow for the non-iterative approach using KELM is shown in Fig. 9. The steps for implementing the proposed technique are as follows:
1. Select CoViD19 chest X-ray images and resize them to size 50 × 50. The images in the dataset are of variable size, and hence need to be brought to a uniform size and dimension.
2. Enhance the images using adaptive histogram equalization 58. Pre-processing techniques lead to data enhancement and refinement by identifying the affected part in the chest X-ray images 59.
3. Perform feature extraction using third-order cumulants.
4. Apply the fast Fourier transform on the third-order cumulants to obtain the bispectrum of the images. It helps in extracting features that include color and texture information in the images 60.
5. Normalize the data between values −1.0 and +1.0.
6. Divide these normalized features into train and test data. Train data helps in training of the classifier and thereafter, a trained model is obtained. There are multiple available classifiers in machine learning like kNN 61

Experimental results and discussion
For all the experiments, the images of the chest X-ray database have been resized to 50 × 50 and converted to gray scale. All the images have gone through the image enhancement process, and bispectrum features are obtained from all the enhanced images. The size of the feature vector remains the same as the size of the resized image (a 50 × 50 matrix, i.e. a 1 × 2500 vector). The features computed from the enhanced chest X-ray image database are tested with three classifiers, viz. SVM, convolutional neural network (CNN) and kernel extreme learning machine (KELM), which is a non-iterative learning machine model. SVM and CNN are iterative learning approaches, which are utilized here to compare the execution time of iterative learning approaches with that of the non-iterative learning approach. The performance of these classifiers is measured mainly on three parameters based on the confusion matrix. They are briefly explained here.
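A confusion matrix is a matrix with which the correctness of a classifier can be measured. For two classes, it records the number of true positives (TP), false negatives (FN), false positives (FP) and true negatives (TN), and the performance parameters are defined from these counts as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}$$

Results computed with the aforementioned classifiers based on these performance metrics are as follows.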
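For reference, accuracy, sensitivity and specificity can be computed from predicted and true labels as in the Python sketch below. It assumes the standard confusion-matrix definitions of these measures; which class is coded as positive is a choice of the user, and the labels in the example are hypothetical values, not experimental outputs.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, true negatives, false positives and false negatives."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def performance_measures(tp, tn, fp, fn):
    """Standard accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # fraction of positives correctly detected
        "specificity": tn / (tn + fp),  # fraction of negatives correctly detected
    }


# Hypothetical labels: 1 marks the class treated as positive, 0 the other class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred, positive=1)
measures = performance_measures(*counts)
print(counts, measures)
```

Swapping the value passed as `positive` exchanges the roles of sensitivity and specificity, so the class coding should be fixed before values are compared across classifiers.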

Results with SVM classifier
SVM is utilized with train-test ratios of 10-90, 30-70, 50-50 and 70-30, as shown in Table 1. An accuracy of 95.85% is achieved with the 70-30 train-test ratio. SVM gives these results with a quadratic kernel, a penalty parameter of 1, a tolerance of 0.001, and a degree value of 3 during experimentation. SVM performs very well but has a very high execution time because of its iterative learning approach, as can be seen in Table 1.

Results with CNN classifier
Table 2 shows the performance of the 2-Layer CNN on the bispectrum features of CoViD19 chest X-ray images. In both layers of the CNN, a 3 × 3 filter is utilized in the convolutional layer and a 2 × 2 mask in the max pooling layer. The results achieved are excellent in terms of accuracy, as 96.81% accuracy is achieved with the 70-30 train-test ratio. The execution time for this set is 215 s, which is better than that of the previous classifiers used in the experimentation. Table 3 presents the results for the 3-Layer CNN. The results achieved with this classifier are almost the same as those achieved using the 2-Layer CNN. A small disadvantage of using this classifier is the increase in execution time, which is due to the additional CNN layer. The specifications of filter and stride are the same as those used in the 2-Layer CNN. The maximum accuracy achieved with the 3-Layer CNN is 95.90%, with 363 s of execution time.
All the classifiers utilized so far are iterative and hence, their execution time is quite large. In CNN, only two and three layers are used and the execution time is still comparatively large. If transfer learning approaches are utilized, in which there are many more layers, the result is a very large execution time.

Results with KELM classifier
KELM is a faster classifier that is non-iterative in nature. There are various kernel functions which can be used in the kernel-based ELM, namely the linear, polynomial, sigmoid, wavelet, and RBF kernels 65. Any of these kernel functions can be utilized with KELM depending upon the requirement, and hence the kernel-based ELM model is called the kernel extreme learning machine. Table 4 presents the variation in accuracy observed with the various kernel functions. For this, the regularization coefficient and the kernel parameter are set to 1 and 345 respectively. These parameter values were obtained experimentally by optimizing the accuracy using an optimization algorithm. It can be seen from Table 4 that the best accuracy is achieved with the RBF kernel function and hence, the RBF kernel function is utilized here for the results and for comparing them with other state-of-the-art methods. Table 5 shows the classification results for the KELM classifier with the RBF kernel function on the CoViD19 database. For KELM also, the train-test ratio is selected as 10-90, 30-70, 50-50 and 70-30, and accuracies of 90.86%, 95.32%, 96.52% and 96.64% have been obtained respectively. It can be seen from the execution time (Table 5) that the non-iterative nature of KELM gives the advantage of much faster speed.

Comparison and discussion
All the classifiers utilized during experimentation, combined with bispectral magnitude analysis of the chest X-ray images of CoViD19 patients, provide very good results, as has already been seen in the previous sections. In this section, the results obtained from these classifiers are compared with each other and with the existing state-of-the-art methods on the chest X-ray database of CoViD19 (Table 6). A comparison of these classifiers on the basis of their ROC curves is shown in Figs. 10, 11, 12 and 13.
Tables 1, 2, 3 and 5 also show the values of the performance measures sensitivity and specificity. Sensitivity tells about the chest X-ray images which are not affected by the corona virus, signifying CoViD19−ve images, while specificity represents the chest X-ray images having the presence of the corona virus. It can be seen from the tables that KELM performs best among all these classifiers in identifying CoViD19+ve and CoViD19−ve cases based on chest X-ray images. The same is also clearly visible in Figs. 10, 11, 12 and 13, which are plotted using the values of sensitivity and specificity of these classifiers. These figures are plotted for the 10-90, 30-70, 50-50 and 70-30 train-test ratios respectively. Figure 14 plots the classifiers used during experimentation based on their accuracies in classifying CoViD19+ve and CoViD19−ve images. It can be clearly seen that KELM performs best and Decision tree performs worst among these classifiers. The 2-Layer CNN performs as well as KELM, but only when the train data is large.
As CNN is an iterative algorithm, increasing the training data also increases its execution time. A similar trend can be observed for the transfer learning approaches used in existing work, which have many more layers than the 2-Layer CNN and 3-Layer CNN. KELM is non-iterative and hence performs faster, as can be observed from Fig. 15. KELM is the fastest among all the classifiers utilized during the experimentation, followed by the 2-Layer CNN, and kNN is the slowest. The results of these classifiers are compared with other state-of-the-art approaches in Table 6. In 21 and 34, classification was performed on a Corona virus database based on differential evolution based CNN and deep transfer learning respectively. Both these papers have utilized a comparatively smaller number of images in their experimentation. Narin et al. 66 have achieved 98% accuracy in detecting the patients correctly, but they have trained the model with 95% of the total data, which might result in overtraining of the model leading to good results. One advantage that can be observed in our … Table 7. KELM performs classification with high accuracy even if a small amount of data is used to train the classifier, while CNNs require a large amount of training data for good classification results, as seen from Tables 2, 3 and 5. Also, all these CNN based approaches are iterative in nature and take a large amount of time compared to KELM. Hence, KELM, being a non-iterative algorithm, comes out as a very helpful, fast and efficient algorithm in this pandemic (CoViD19).

Conclusion and future scope
The results obtained in this experiment show that machine learning can be very helpful in dealing with this worldwide pandemic. Higher order statistics have provided a clear view in differentiating the CoViD19+ve patients from the normal ones. The various classifiers applied give very good results on the chest X-ray images of the CoViD19 database. The 2-Layer CNN performs the best, with 96.81% accuracy in separating the CoViD19−ve chest X-ray images from the CoViD19+ve images. But the execution time of the 2-Layer CNN for this accuracy is 215 s. When speed is the concern, as it is the prime requirement in the current situation, KELM comes out to be the best. … 23,34. Thus, researchers and scientists working in the labs can focus on machine learning techniques for CoViD19 detection so that proper treatment can be given to the patients at an early stage and with faster speed. With a third wave approaching, such non-iterative machine learning methods can replace the kits utilized these days for testing. Also, as various countries are under lockdown during this pandemic, daily life activities are at a halt, as everything is closed, including shopping malls, cinema halls etc. This is resulting in an increase in anxiety levels among people, mainly young ones, which can further lead to anger, isolation, mood swings, panic attacks, fear, depression etc. This behavior change is evident on social networking sites like Twitter 71, where the usage of words like bored, frustrated, want to get out etc. has increased after lockdown 72. Such words depict the mental status of an individual, and reports state that such mental status leads to suicidal tendencies. Hence, emotion detection can also be possible with the help of machine learning. By detecting such words of emotion on social networking sites, it can be made possible to give any individual appropriate counseling at the right time in this serious pandemic.
Ethics approval. All methods were carried out in accordance with relevant guidelines and regulations.