Combining a convolutional neural network with autoencoders to predict the survival chance of COVID-19 patients

COVID-19 has caused many deaths worldwide. The automation of the diagnosis of this virus is highly desired. Convolutional neural networks (CNNs) have shown outstanding classification performance on image datasets. To date, it appears that COVID computer-aided diagnosis systems based on CNNs and clinical information have not yet been analysed or explored. We propose a novel method, named the CNN-AE, to predict the survival chance of COVID-19 patients using a CNN trained with clinical information. Notably, the required resources to prepare CT images are expensive and limited compared to those required to collect clinical data, such as blood pressure, liver disease, etc. We evaluated our method using a publicly available clinical dataset that we collected. The dataset properties were carefully analysed to extract important features and compute the correlations of features. A data augmentation procedure based on autoencoders (AEs) was proposed to balance the dataset. The experimental results revealed that the average accuracy of the CNN-AE (96.05%) was higher than that of the CNN (92.49%). To demonstrate the generality of our augmentation method, we trained some existing mortality risk prediction methods on our dataset (with and without data augmentation) and compared their performances. We also evaluated our method using another dataset for further generality verification. To show that clinical data can be used for COVID-19 survival chance prediction, the CNN-AE was compared with multiple pre-trained deep models that were tuned based on CT images.

The remaining sections of the paper are organised as follows: "Literature review" reviews the related literature; "Background" briefly sets out the required background; "Description of our clinical dataset" describes our dataset; "Proposed methodology" explains the proposed methodology; "Experiments" presents our experimental results; and "Discussion" and "Conclusions and future works" present our discussion, conclusion and future works.

Literature review
This study sought to predict the survival chance of COVID-19 patients using clinical features. We began by reviewing the COVID-19 detection methods that rely on clinical features and image data. We also reviewed methods for estimating the mortality of infected patients.
To contain the COVID-19 threat as soon as possible, researchers approached this virus from multiple directions. Some focused on the fast and accurate detection of infected patients. For example, Wu et al. 17 extracted 11 vital blood indices using the random forest (RF) method to design an assistant discrimination tool. Their method had an accuracy of 96.97% and 97.95% for the test set and cross-validation set, respectively. The assistant tool was well equipped to perform a preliminary investigation of suspected patients and suggest quarantine and timely treatment. In another study, Rahman et al. 18 reviewed various studies on treatment, complications, seasonality, symptoms, clinical features and the epidemiology of COVID-19 infection to assist medical practitioners by providing necessary guidance for the pandemic. Using a CNN, they tried to detect infected patients to isolate them from healthy patients.
Various hybrid approaches have been adopted to improve COVID-19 diagnosis accuracy. Islam et al. 19 employed a CNN for feature extraction and long short-term memory for the classification of patients based on X-ray images. EMCNet 20 is another hybrid diagnosis approach that uses a CNN for feature extraction and carries out binary classification using a number of learning techniques, including RF and support vector machine (SVM), on X-ray images. Islam et al. 21 also used a CNN for feature extraction but relied on a recurrent neural network (RNN) for classification based on the extracted features. Multiple experiments have been conducted using a combination of architectures, such as VGG19 and DenseNet121, with an RNN. VGG19 + RNN was reported to have the best performance.
In addition to distinguishing between infected and non-infected patients, it is also important to determine whether infected patients have severe conditions. Muhammad et al. 22 relied on data mining to predict the recovery condition of infected patients. Their method was able to determine the age group of high-risk patients who are less likely to recover and those who are likely to recover quickly. It was also able to provide the minimum and the maximum number of days required for a patient's recovery. Chen et al. 23 studied 148 severe and 214 non-severe COVID-19 patients from Wuhan, China, using their laboratory test results and symptoms as features to design an RF. The task of the RF was to classify COVID-19 patients into severe and non-severe types using the features. Using the laboratory results and symptoms as input, the accuracy of their model was over 90%. Some of the key features they identified were lactate dehydrogenase (LDH), interleukin-6, absolute neutrophil count, D-dimer, diabetes, gender, cardiovascular disease, hypertension and age.
Other researchers have focused on the mortality risk prediction of the patients. Gao et al. 24 proposed a mortality risk prediction model for COVID-19 (MRPMC) that applied clinical data to stratify patients by mortality risk and predicted mortality 20 days in advance. Their ensemble framework was based on four machine-learning techniques; that is, a neural network (NN), a gradient-boosted decision tree 25 , an SVM and logistic regression. Their model was able to accurately and expeditiously stratify the mortality risk of COVID-19 patients.
Zhu et al. 26 presented a risk stratification score system as a multilayer perceptron (MLP) with six dense layers to predict mortality. A total of 78 clinical variables were identified, and prediction performance was compared with the pneumonia severity index, the confusion, uraemia, respiratory rate, BP, age ≥ 65 years (CURB-65) score and the COVID-19 severity score. They derived the top five predictors of mortality; that is, LDH, C-reactive protein, the neutrophil to lymphocyte ratio, the Oxygenation Index and D-dimer. Their model proved to be effective in resource-constrained and time-sensitive environments.
The power of the XGBoost algorithm has also been leveraged for mortality risk prediction. For example, Yan et al. 27 collected blood samples of 485 infected patients from China to detect key predictive biomarkers of COVID-19 mortality. They employed an XGBoost classifier that was able to predict the mortality of patients with 90% accuracy more than 10 days in advance. In another study, Bertsimas et al. 28 developed a data-driven mortality risk calculator for in-hospital patients. Laboratory, clinical and demographic variables were collected at the time of hospital admission. Again, they applied XGBoost to predict the mortality of patients. Adopting a different approach, Abdulaal et al. 29 devised a point-of-admission mortality risk scoring system using an MLP for COVID-19 patients. The network exploited patient-specific features, including present symptoms, smoking history, comorbidities and demographics, and predicted the mortality risk based on these features. The mortality prediction model demonstrated a specificity of 85.94%, a sensitivity of 87.50% and an accuracy of 86.25%.
As the symptoms of different viruses may be similar to some extent, there has been an attempt to distinguish different viruses from one another 30 . To this end, multiple classical machine-learning algorithms were trained to classify textual clinical reports into the four classes of severe acute respiratory syndrome (SARS), acute respiratory distress syndrome, COVID-19 and both SARS and COVID-19. Feature engineering was also carried out using report length, bag of words, etc. Multinomial Naïve Bayes and logistic regression outperformed other classifiers with a testing accuracy of 96.2%. A summary of the reviewed works is presented in Table 1.
Most existing studies on COVID-19 have relied on computed tomography (CT) and X-ray images to achieve their research objectives. Al-Waisy et al. 31 proposed COVID-DeepNet, a hybrid multimodal deep-learning system for diagnosing COVID-19 using chest X-ray images. After the pre-processing phase, the predictions from two models (a deep-belief network and a convolutional deep-belief network) were fused to improve diagnosis accuracy. Another fusion of two models (ResNet34 and a high-resolution network model) was proposed in 32 to form the COVID-CheXNet method for COVID-19 diagnosis. Mohammed et al. collected a dataset of X-ray images and made it publicly available. The dataset has been used to benchmark various machine-learning methods for COVID-19 diagnosis 33 . They reported that the ResNet50 model achieved the best performance. In another benchmarking study 34 , 12 COVID-19 diagnostic methods were examined based on 10 evaluation criteria. To this end, multicriteria decision making (MCDM) and the technique for order of preference by similarity to ideal solution (TOPSIS) were employed. The 10 criteria were weighted based on entropy. The SVM classifier was reported to have the best performance among the benchmarked methods.
Slowing down the spread of COVID-19 and supporting infected patients are as important as COVID-19 detection. Several works have investigated the possibility of using existing technologies to benefit infected patients. Rahman et al. 35 proposed a deep-learning architecture to determine whether people are wearing a facial mask. The monitoring was realised via closed-circuit television cameras in public places. Islam et al. 36 reviewed existing technologies that can facilitate the breathing of infected patients. Wearable technologies and how they can be used to provide initial treatment to people have also been investigated 37 . Ullah et al. 38 reviewed telehealth services and the possible ways in which they can be used to provide patients with necessary treatments while keeping the social distance between patients and doctors.
Some works have adopted a broader approach and reviewed various recently developed deep-learning methods with application to COVID-19 diagnosis. For example, Islam et al. 39 reviewed these methods based on X-ray and CT images while the overall application of deep learning for diagnosis purposes to control the pandemic threat has been discussed in 40 .
Based on the review presented above, it is apparent that existing works based on clinical data are rather scarce. Thus, we sought to conduct another study using clinical data for mortality risk assessment. The difference between our method and existing research on mortality risk assessment is twofold. First, we developed a new approach for carrying out the assessment. Second, some of the clinical features that we considered had never been used previously, which is why we have released our dataset publicly. As will be discussed further below, clinical data are more cost effective than CT images, and classifiers trained on clinical data achieve a level of performance that is almost equal to that achieved by classifiers trained on CT images. To justify this claim, we compared the performance of our method trained on clinical data to a standard CNN trained on CT images.

Background
Our proposed method comprises two modules: the classifier and data augmenter. The classification is carried out using a CNN. The data augmentation is realised using 10 AEs. In this section, we briefly review the main concepts of CNNs and AEs.
CNNs. CNNs are widely used in image-based learning applications. Due to the automatic feature extraction mechanism of CNNs, they can discover valuable information from training samples. CNNs are usually designed with several convolutional, pooling and fully connected layers 41 . As Fig. 1 shows, feature extraction is done by convolving the input with convolutional kernels. The pooling layer reduces the computational volume of the network without making a noticeable change in the resolution of the feature map. In CNNs, the size of the pooling layers usually decreases as the number of layers increases. Two of the most popular types of pooling layers are max pooling and average pooling 42 .
AEs. AEs belong to the realm of unsupervised learning, as they do not need labelled data for their training. In brief, an AE compresses input data to a lower dimensional latent space and then reconstructs the data by decompressing the latent space representation. Similar to principal component analysis (PCA), AEs perform dimensionality reduction in the compression phase. However, unlike PCA, which relies on linear transformation, AEs carry out nonlinear transformation using deep neural networks 43 . Figure 2 shows the architecture of a typical AE.
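Returning to the CNN building blocks described above, the following minimal sketch shows a one-dimensional convolution followed by max and average pooling. The input vector, the kernel and the function names `conv1d` and `pool1d` are hypothetical and chosen only for illustration; they are not taken from the architecture used in this paper.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation, as commonly used in CNN layers)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def pool1d(feature_map, size, mode="max"):
    """Non-overlapping pooling with window `size`; a trailing remainder is dropped."""
    reduce = max if mode == "max" else (lambda window: sum(window) / len(window))
    return [
        reduce(feature_map[i:i + size])
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# Hypothetical input vector and edge-detecting kernel.
signal = [0, 1, 1, 0, 1, 0, 0, 1]
kernel = [1, 0, -1]

feature_map = conv1d(signal, kernel)          # feature extraction
max_pooled = pool1d(feature_map, 2, "max")    # max pooling
avg_pooled = pool1d(feature_map, 2, "avg")    # average pooling
```

Pooling halves the length of the feature map here, which is the reduction in computation that the pooling layer provides.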
Information gain. In this section, we review information gain (IG), as it is used to determine the degree to which each feature of our dataset contributes to the patients' deaths (see "Description of our clinical dataset"). IG calculates the entropy reduction that results from splitting a dataset, D, based on a given value, a, of a random variable, A, such that:
$$IG(D, A = a) = H(D) - H(D \mid A = a) \qquad (1)$$
where H(D) and H(D|A = a) are the entropy on dataset D and the conditional entropy on dataset D given that A = a, respectively. Conditional entropy is computed as:
$$H(D \mid A = a) = -\sum_{c} p(c \mid A = a) \log p(c \mid A = a)$$
where, in this standard form, p(c | A = a) is the fraction of the samples with A = a that belong to class c.
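The IG definition above can be illustrated with a short sketch. The toy labels and the `cancer` indicator below are hypothetical and are not drawn from our dataset; the base-2 logarithm and the function names are likewise assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits, an assumed base) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(labels, feature, value):
    """IG(D, A = a) = H(D) - H(D | A = a) for a single value a of feature A."""
    subset = [y for y, x in zip(labels, feature) if x == value]
    return entropy(labels) - entropy(subset)


# Hypothetical toy data: 1 = deceased, 0 = recovered.
labels = [1, 1, 0, 0, 0, 0, 0, 0]
cancer = [1, 1, 1, 0, 0, 0, 0, 0]

# Every patient without cancer recovered, so H(D | A = 0) = 0 and IG equals H(D).
ig = information_gain(labels, cancer, value=0)
```

Averaging H(D | A = a) over the values of A, weighted by their frequencies, gives the expected form of IG that is often used for feature ranking; which of the two forms underlies the values in Fig. 3 is not assumed here.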

Description of our clinical dataset
The dataset we collected in this paper comprised 320 patients (300 cases of recovered patients and 20 cases of deceased patients). The percentage of female cases was 55%. The mean age of patients in the dataset was 49.5 years old, and the standard deviation was 18.5. The patients were referred to Tehran Omid hospital in Iran from 3 March 2020 to 21 April 2020. Ethical approval for the use of these data was obtained from the Tehran Omid hospital. In gathering the data, patients' history (as collected by doctors), questionnaires (as completed by patients), laboratory tests, and vital sign measurements were used. Descriptions of the dataset features are presented in Table 2. Our dataset is publicly available in 44 . Institutional approval was granted for the use of the patient datasets in research studies for diagnostic and therapeutic purposes. Approval was granted on the grounds of existing datasets. Informed consent was obtained from all of the patients in this study. All methods were carried out in accordance with relevant guidelines and regulations.
As our dataset had not been released previously, it was vital to assess the degree to which each dataset feature contributed to patients' deaths. Such an analysis provides researchers with valuable insights into the characteristics of the collected data. Various feature selection methods are available to determine the weight of each feature in the classification of dataset samples. We chose IG 45 , which is one of the most widely used feature selection methods 46 . In Fig. 3, the importance of each feature (i.e., the IG) is shown as a bar. Age had a much larger IG (0.149) than other features. Thus, age was not included in Fig. 3 to make it easier to compare the importance of the other features. According to the bar chart, after age, cancer, heart disease and kidney disease were the second, third and fourth most important features related to patients' deaths, respectively. Thus, it was clear that patients with poor health conditions were more vulnerable to COVID-19. It should be noted that Fig. 3 does not include the features with zero IG.
We also inspected the interplay between the dataset features to determine the potential correlation between them. To this end, the grid in Fig. 4 is presented. Figure 4 can be thought of as a heat map that shows the positive/negative correlation between features. Each cell c(i, j) in the grid of Fig. 4 represents the correlation between the feature in the i-th row and the feature in the j-th column. As the cell colour approaches red, the positive correlation between the feature pair is higher. For example, anosmia (the loss of the ability to smell) and ageusia (the loss of the ability to taste with the tongue) had a high positive correlation, which means they were usually observed simultaneously.
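A correlation grid of this kind can be sketched as follows. The example assumes the Pearson coefficient, which is an assumption on our part for illustration, and uses hypothetical binary symptom indicators for six patients rather than values from our dataset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_grid(columns):
    """Map each (row feature, column feature) pair to its correlation c(i, j)."""
    names = list(columns)
    return {(i, j): pearson(columns[i], columns[j]) for i in names for j in names}


# Hypothetical indicators (1 = symptom present) for six patients.
columns = {
    "anosmia": [1, 1, 0, 0, 1, 0],
    "ageusia": [1, 1, 0, 0, 1, 0],
    "fever": [0, 1, 1, 0, 0, 1],
}
grid = correlation_grid(columns)
```

In this toy example, anosmia and ageusia always co-occur, so their cell would be drawn at the red end of the colour scale, whereas the anosmia and fever cell is mildly negative.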

Proposed methodology
This study investigated the survival chance prediction of COVID-19 patients who were referred to the Omid hospital in Tehran. The classification was based on features obtained from patients' information. In the dataset collected, the number of recovered patients was 300 and the number of deceased patients was 20. The number of recovered patients was clearly much higher than that of the deceased patients. To ensure accurate classification, it was necessary to balance the recovered-to-deceased ratio of the dataset samples. To do this, the number of instances of the minority class was increased, such that the number of samples in both classes was approximately equal. To increase the number of deceased samples, an AE model was used. To carry out the data augmentation, the 20 samples of the deceased class were fed to the AE to undergo the compression and decompression routines. The output of this process comprised 20 reconstructed samples that were similar (but not identical) to the original ones. Thus, a single AE augmented the original 20 samples with 20 reconstructed samples. Training the AE 10 times using different training and validation sets yielded 10 AEs with a similar architecture but different parameters. Each of the 10 AEs generated 20 reconstructed deceased samples, yielding 200 reconstructed samples overall, which were added to the original deceased samples. To provide an insight into the function of the AEs, sample vectors before and after reconstruction are presented in Table 3. For the majority of '1' elements of input vector c, the AE
Table 3. An example of reconstruction performed by an AE: vector c is the original sample, shown with its reconstructed counterpart.
To implement the proposed method, we used the Python language and the Keras library, which has a TensorFlow backend. The dataset contained 320 samples of infected cases, of which 300 were recovered cases and 20 were deceased cases. We also generated 200 reconstructed deceased cases to balance the recovered-to-deceased ratio of our dataset. After the reconstruction phase, our dataset contained 520 cases. We used tenfold cross-validation. In each round, 80% of the samples in the nine training folds were used for training, and the remaining 20% were used for validation. The implementation details of CNN and AE are illustrated in Figs. 6 and 7, respectively.
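The augmentation procedure described above can be sketched as follows. This is a minimal NumPy stand-in, not the Keras implementation used in our experiments: the one-hidden-layer AE, the latent size, the learning rate, the 80% training subset drawn for each AE, and the 12 random binary features are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(x, latent_dim, epochs, lr, rng):
    """Train a one-hidden-layer AE by gradient descent on squared error."""
    n_features = x.shape[1]
    w_enc = rng.normal(0.0, 0.5, (n_features, latent_dim))
    b_enc = np.zeros(latent_dim)
    w_dec = rng.normal(0.0, 0.5, (latent_dim, n_features))
    b_dec = np.zeros(n_features)
    for _ in range(epochs):
        h = np.tanh(x @ w_enc + b_enc)            # compression
        out = sigmoid(h @ w_dec + b_dec)          # decompression
        d_out = (out - x) * out * (1.0 - out)     # gradient at decoder pre-activation
        d_h = (d_out @ w_dec.T) * (1.0 - h ** 2)  # gradient at encoder pre-activation
        w_dec -= lr * h.T @ d_out / len(x)
        b_dec -= lr * d_out.mean(axis=0)
        w_enc -= lr * x.T @ d_h / len(x)
        b_enc -= lr * d_h.mean(axis=0)

    def reconstruct(samples):
        return sigmoid(np.tanh(samples @ w_enc + b_enc) @ w_dec + b_dec)

    return reconstruct


def augment_minority(x_minority, n_autoencoders=10, seed=0):
    """Stack the original minority samples with one reconstruction per AE."""
    rng = np.random.default_rng(seed)
    reconstructed = []
    for _ in range(n_autoencoders):
        # A different random training subset per AE (assumed 80%) yields
        # AEs with the same architecture but different parameters.
        idx = rng.permutation(len(x_minority))
        train = x_minority[idx[: int(0.8 * len(idx))]]
        reconstruct = train_autoencoder(train, latent_dim=4, epochs=200, lr=0.5, rng=rng)
        reconstructed.append(reconstruct(x_minority))
    return np.vstack([x_minority] + reconstructed)


# Hypothetical stand-in for the 20 deceased samples (12 binary features each).
rng = np.random.default_rng(42)
deceased = rng.integers(0, 2, size=(20, 12)).astype(float)
augmented = augment_minority(deceased)  # 20 originals + 10 x 20 reconstructions
```

With 10 AEs, the 20 original samples grow to 220 deceased samples, which together with the 300 recovered samples give the 520 cases mentioned above.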

Experiments
In this section, the experimental results are presented. The implementation details of CNN and AEs are explained in "Experimental details". We report on the performance of the proposed method (CNN-AE) and compare it to a CNN in "Experimental results".    Table 4. To ensure a fair comparison, we used the same CNN architecture in our method. The implementation details of the AEs used in the CNN-AE are presented in Table 5.
In the second phase of our experiments, we compared the CNN-AE trained on clinical data to a standard CNN trained on image data. The CNN architecture is presented in Fig. 8. After multiple trials, we obtained the best set of the CNN hyperparameters (see Table 6).
Table 5. AE implementation details.
Experimental results. We sought to answer two important questions about the proposed method. First, we compared the performance of our method with that of a standard CNN trained on clinical data. This experiment examined the effects of the proposed data augmentation technique using multiple AEs. Second, we trained a standard CNN for the same purpose (to predict patients' survival chance) but used CT images. This experiment sought to determine how well CT images can represent patients' survival chance using a CNN as the predictor.

Examining the data augmentation approach. As mentioned in "Implementation details of CNN-AE", we used 10 AEs to augment the available dataset. Data augmentation is critical to successful training when the number of samples from different classes is unbalanced. Data imbalance can defeat any powerful classifier, even a state-of-the-art CNN, which is why we employed the data augmentation technique.
To investigate the effectiveness of our data augmentation procedure, we trained a CNN on the original dataset and our CNN-AE on an augmented dataset. The original dataset comprised only 20 samples with the deceased label, but had 300 samples with the recovered label. This 300-to-20 ratio represents a severe data imbalance from which the CNN suffered during training (see Table 7). However, using an augmented dataset with 300 recovered samples and 220 deceased samples facilitated the CNN training and improved accuracy (see Table 7). Additionally, the area under the curve (AUC) measure of the CNN-AE was almost twice that of the CNN. The specificity measure of the CNN was almost zero because, with so few deceased samples in the original dataset, the CNN was unable to distinguish between deceased and recovered samples. As Table 7 shows, the CNN-AE training took more time; however, this was due to the time it took to train the 10 AEs required for data augmentation.
In Table 7, the CNN-AE method had an average accuracy of 96.05% and thus outperformed the CNN method, which had an average accuracy of 92.49%. Additionally, due to the augmented data, our method was able to reduce the training/validation loss faster than the CNN (as is evident in Fig. 9). Similarly, the CNN-AE reached higher accuracy faster than the CNN (see the plots in Fig. 10). During training, our method exhibited greater variation in the validation plots than the CNN. This is because the CNN quickly overfitted to the small number of deceased samples, whereas the CNN-AE had to deal with more varied augmented samples. Thus, the training of the CNN-AE was more difficult, but it achieved better overall performance.
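The way a high accuracy can coexist with a near-zero specificity on an imbalanced dataset can be illustrated with a short sketch. It assumes that "recovered" is treated as the positive class and "deceased" as the negative class, which is inferred from the discussion above, and it uses a hypothetical degenerate classifier rather than the trained CNN.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts; undefined ratios return 0.0."""

    def ratio(num, den):
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)
    ppv = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "ppv": ppv,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * ppv * recall, ppv + recall),
    }


# Hypothetical classifier that labels all 320 original samples as "recovered":
# 300 true positives, 20 false positives, and no deceased sample detected.
always_recovered = classification_metrics(tp=300, fp=20, tn=0, fn=0)
```

Such a classifier reaches an accuracy of 93.75% and a recall of 100%, yet its specificity is zero, which is why accuracy alone is not a sufficient measure on the original 300-to-20 dataset.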
Comparisons with existing deep models trained on image data. In this section, we evaluated the performance of various existing deep models that were trained on a dataset of CT images. The CT images were taken from the same patients for whom the clinical dataset was collected. Thus, the results of this section reveal how well deep models trained on CT images perform compared to a CNN trained on clinical data. It should be noted that most of the experiments in the COVID-19 literature revolve around classifying infected and non-infected people using CT images. This section sheds some light on how well deep models can predict the survival chance of already infected patients based on CT images.
The dataset comprised 2822 CT images of recovered patients and 2269 CT images of deceased patients. The CT image dataset size was much greater than the clinical dataset size, as the CT dataset contained multiple images for each patient. As the number of samples of the two classes in the dataset was almost balanced, we did not apply our data augmentation technique to the CT dataset. Additionally, having multiple images for each patient served as a form of data augmentation. This was not the case for the clinical dataset for which each patient had only one value per feature. In Table 8, the performance metrics for the evaluated deep models are presented as 95% confidence intervals (CIs) that have been computed over a tenfold cross-validation. The results in Table 8 show that UNet had the best performance among the evaluated methods, followed by Inception Net V3 and DenseNet121, respectively. Overall, Table 8 suggests that some of the well-known deep models with pre-trained parameters can be tuned via training to predict the survival chance of COVID-19 patients based on CT images. A performance comparison of the deep models (see Table 8) and the CNN-AE (see Table 7) revealed that a CNN trained on clinical data performed on par with various pre-trained deep models that had been tuned via training on CT data. As stated above, the CT image dataset size was almost 10 times that of the clinical dataset size. However, the CNN trained on clinical data performed almost as well as the deep models trained on CT data. Thus, clinical data could be a good replacement for CT training data if the preparation of the CT images is difficult or expensive.
Table 6. Implementation details of the CNN trained on image data.
Hyper-parameters | Values
Number of convolutional kernels of first layer | 64
Table 7. Comparison of the CNN and the CNN-AE using different evaluation metrics based on a tenfold cross-validation.
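A 95% CI over the ten folds of a cross-validation can be sketched as follows. The normal-approximation formula (mean ± 1.96 s/√n) is an assumption made for illustration, as the exact CI computation is not specified in this section, and the per-fold accuracies below are hypothetical rather than values from our tables.

```python
from math import sqrt
from statistics import mean, stdev


def confidence_interval_95(fold_scores):
    """Mean and half-width of a normal-approximation 95% CI over fold scores."""
    half_width = 1.96 * stdev(fold_scores) / sqrt(len(fold_scores))
    return mean(fold_scores), half_width


# Hypothetical per-fold accuracies (%) from a tenfold cross-validation.
fold_accuracy = [95.0, 97.0, 96.0, 94.0, 98.0, 96.0, 95.0, 97.0, 96.0, 96.0]
centre, half_width = confidence_interval_95(fold_accuracy)
```

The result is reported in the form centre ± half-width. With only ten folds, replacing 1.96 with the corresponding Student's t quantile would give a slightly wider interval.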
Comparison with other methods trained on clinical data. In this section, we compare the performance of our CNN-AE with some of the existing works on mortality prediction 23,26,27 . To this end, we implemented the methods of Chen et al. 23 , Zhu et al. 26 and Yan et al. 27 . As mentioned above in the literature review, Chen et al. relied on the RF to assess the severity of COVID-19 patients. For mortality risk prediction, Zhu et al. 26 and Yan et al. 27 used MLP and XGBoost, respectively. These methods were specifically designed to achieve COVID-19-related objectives. For a broader perspective, we also experimented with Naïve Bayes, which is a generic method that can be used regardless of the classification objective. The conducted experiments indicated that our data augmentation approach is generic and benefitted the classification methods evaluated.
Methods' performance. In this section, we present the experimental results for the classification methods mentioned above. We also investigate the effects of using the proposed data augmentation technique during training.
The performance statistics are presented as 95% CIs in Table 9. The CIs are computed based on tenfold cross-validation. First, each method was trained on the original dataset (without augmentation). The training was repeated using the augmented dataset. The proposed data augmentation using AEs was used for this purpose. All of the rows in Table 9 that are related to training on the augmented dataset are marked with a '+ AE' suffix in the 'Methods' column. The last row of Table 9 is identical to the last row of Table 7, which has been reproduced here for ease of reference. An inspection of the results in Table 9 reveals that the proposed CNN-AE method outperformed the other methods in terms of accuracy, recall and AUC. Yan et al. 27 58 , Genetic Algorithm (GA) 59 and Particle Swarm Optimisation (PSO) 60 . Details of the implementation of these methods are available in MEALPY 61 , which is a Python module consisting of meta-heuristic algorithms. In all of the experiments detailed in this section, the meta-heuristic methods were run for 500 epochs with a population size of 100.
The result of running each of the meta-heuristic methods listed above was a set of selected features (see Table 10) that specified a subset of the clinical dataset. The extracted subset was used to train a CNN for survival chance prediction. The training was performed with and without data augmentation. The results of the training are presented in Table 11. In each row of the table, the meta-heuristic method used for feature selection and the classifier are specified. The use of data augmentation is denoted by '-AE'.
As Table 11 shows, regardless of the feature selection method, the CNN-AE trained on the selected features did not outperform the CNN-AE trained on the full dataset (see the last row of Table 7). This is because the CNN already included an automatic feature selection mechanism and could rule out unnecessary features during learning. Discarding some of the features via feature selection only deprived the CNN of the opportunity to choose the features that best fit its objective.
Among the evaluated feature selection methods in Table 11, BOA showed the best performance, followed by the ACO and ABC, respectively. In relation to Table 11, it should be noted that data augmentation after the application of all of the feature selection methods yielded better results. Thus, the proposed data augmentation approach is generic.

Discussion
This paper focused on survival chance prediction for COVID-19 patients. We performed experiments using both a clinical dataset and a CT image dataset. The size of the CT image dataset was almost 10 times that of the clinical dataset. However, the CNN trained on clinical data performed almost as well as the CNN trained on CT data, which supports the use of clinical data as an alternative to CT images.
Another aspect that might encourage the use of clinical training samples relates to data collection costs. Preparing CT data may require high-end facilities, which increase data collection costs. Additionally, the facilities required to prepare CT data may not be available in deprived areas. Conversely, the tools required to measure clinical data, such as blood pressure, fever and C-reactive protein, are generally accessible.
The proposed method can detect the severity of patients' conditions based on clinical data and enable preventive actions to be taken to minimise the mortality rate. As discussed in "Literature review", very few methods have studied mortality rate prediction using clinical data. Additionally, existing methods have used features that differ from the ones we used in our experiments. Thus, the proposed method sheds some light on unexplored aspects of the COVID-19 virus. To implement the proposed system in practice, it must be evaluated by medical experts from medical centres in different regions. After being verified, the system could be used to help experts analyse the severity of the condition of COVID-19 patients. Thus, patients with critical conditions could be given higher treatment priority than non-critical patients. Prioritising the patients' treatment is of the utmost importance when the medical resources available are limited.
In addition to the proposed method, our dataset is itself a contribution of this paper, as it is a good resource for further medical research. The importance of the dataset features and their correlations are shown in Figs. 3 and 4, respectively. Using our dataset, experts can study the relationships between patients' medical conditions (e.g., blood pressure and diabetes) and the likelihood of dying from COVID-19. This will enable medical experts to exercise more caution during the treatment of patients who are more likely to die due to their medical conditions. As the IG values in Fig. 3 suggest, there is a strong relationship between the mortality rate of COVID-19 patients and the presence of other critical diseases, such as cancer, kidney and heart diseases. Conversely, mild symptoms and/or diseases, such as dyspnoea, conjunctivitis and asthma, are less likely to contribute to the mortality rate.
Table 9. Performance metrics for various classification algorithms with and without AE-based data augmentation. Columns: Methods, Rank, Accuracy (%), PPV (%), Recall (%), Specificity (%), F1-score (%), AUC (%).
Table 11. CNN and CNN-AE performance trained on features selected by meta-heuristic methods.
Like any other classification approach, the proposed method has some limitations. Due to the use of multiple AEs in the data augmentation phase, the training time of our method was longer than that of a standard CNN. Further, standard CNNs receive a single image sample as input and perform feature extraction automatically. Conversely, we manually collected multiple clinical features for each patient, and such a process is more difficult to manage. Some of the features in our dataset were gathered directly by asking patients; thus, it is possible that patients provided incorrect information.

Conclusions and future works
In this paper, we investigated the possibility of training a CNN on clinical data to predict the survival chance of COVID-19 patients. To this end, a new dataset consisting of clinical features, such as gender, age, blood pressure and the presence of various diseases, was gathered. The first contribution of this paper relates to our decision to release the collected dataset for public use. We also analysed the dataset features using IG and correlation. Our analysis could aid potential researchers and practitioners with their work on the COVID-19 virus.
To reduce the data imbalance of our dataset, we proposed a novel data augmentation method based on AEs. Our data augmentation approach is generic and applicable to other datasets. Based on the proposed data augmentation approach, a novel survival chance prediction method named CNN-AE was presented, which represents the second contribution of this paper. Using augmented data for training, the 95% CIs for the accuracy, recall and specificity of the CNN-AE were 96.05 ± 1.48%, 98.00 ± 1.33% and 93.13 ± 2.52%, respectively. However, a CNN trained on a dataset without augmentation yielded an accuracy of 92.49 ± 2.75%, a recall of 95.4 ± 0.88% and a specificity of 96.9 ± 3.73%. Thus, it is clear that the CNN-AE benefitted from the data augmentation and outperformed the CNN.
We repeated the CNN training on CT images obtained from the same patients for whom the clinical data had been collected. Comparisons of the performances of the methods trained on clinical data and the methods trained on CT data revealed that clinical data can be used as an alternative to CT images.
In the future, more data needs to be collected to further assess our proposed approach. The use of other data augmentation methods also needs to be investigated and the results compared with our data augmentation method.