Introduction

Coronavirus disease-19 (COVID-19) was initially defined as an atypical pneumonia caused by a zoonotic viral agent, later identified as Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2)1.

In the first wave of the outbreak, most people with COVID-19 developed mild disease (40%), without evidence of viral pneumonia or hypoxia, or moderate disease (40%), with clinical signs of pneumonia (fever, cough, dyspnea, fast breathing) but no signs of reduced oxygen saturation (SpO2 ≥ 90% on room air). Of infected individuals, 15–20% developed severe or critical disease with complications such as respiratory failure, acute respiratory distress syndrome, sepsis and septic shock, thromboembolism, and/or multiorgan failure, including acute kidney injury and cardiac injury. In fatal cases, death was reported two to eight weeks after symptom onset2.

Many countries have faced a second wave of the COVID-19 pandemic. Compared to the first wave, a lower proportion of patients requiring invasive mechanical ventilation and a lower rate of thrombotic events have been observed3,4. Hospitalized patients in the second wave are younger, require fewer days of hospitalization, and have longer survival5,6.

Although COVID-19 is mostly defined by pneumonia, it has been documented that extrapulmonary systemic hyperinflammation plays a crucial role in clinical manifestations7, also contributing to COVID-19-associated coagulopathy8. Peculiar COVID-19 immunophenotypes have also been described7,9. In peripheral blood, a decreased number of basophils and plasmacytoid dendritic cell depletion correlate with disease severity9. Aberrant pathogenic T cells and inflammatory monocytes are rapidly activated and produce a large number of cytokines, thus inducing a so-called “cytokine storm”. Many studies on the first wave of the COVID-19 outbreak have indicated an increase of both pro-inflammatory and anti-inflammatory cytokines, whose levels appear to correlate with severity of disease, both in adults and in children9,10,11,12. Hence, targeted approaches have been envisioned to dampen the COVID-related cytokine storm, particularly IL-6, IL-8, and TNFα13,14,15. However, to date, no available cytokine-based drug or therapy has proven fully effective for patients with COVID-19.

Since March 2020, a large number of studies on the cytokine storm in COVID-19 patients have been published. The main findings often display a high degree of variability and refer to the first wave of the outbreak11. In this study, we provide a cytokine profile of patients with mild and severe symptoms of COVID-19 during two peaks of the epidemic curve in the Campania region (Italy). Moreover, using machine learning methods, we analyzed whether a specific cytokine profile could guide disease diagnosis and prognosis.

Methods

Study design and ethics statement

Between March 2020 and May 2020, 65 consecutive patients with a positive SARS-CoV-2 PCR swab test, admitted to Federico II University Hospital and “Azienda Ospedaliera dei Colli” Hospital of Naples, Italy, were recruited for the study. Forty-nine healthy adult volunteers were also enrolled as a control cohort.

Similarly, during the second wave of the pandemic, from September to October 2020, 36 patients with confirmed SARS-CoV-2 infection and 15 negative controls were included in the study.

The study was approved by the ethical committee of the University of Naples Federico II (prot. n. 140/20/ESCOVID19). All methods involving patients and volunteers were performed in accordance with the Declaration of Helsinki. Informed consent was obtained from all participants.

Sample processing and cytokine assay

Blood samples in serum separator tubes were centrifuged and stored at −80 °C. Serum samples were then screened for the concentration of Interleukin (IL)-1β, IL-1ra, IL-2, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-12(p70), IL-13, IL-15, IL-17, basic Fibroblast Growth Factor (FGF-b), Eotaxin, Granulocyte-Colony Stimulating Factor (G-CSF), Granulocyte–Macrophage Colony Stimulating Factor (GM-CSF), Interferon (IFN)-γ, Interferon gamma-Induced Protein (IP)-10, Monocyte Chemoattractant Protein (MCP)-1, Macrophage Inflammatory Protein 1-alpha/beta (MIP-1α, MIP-1β), Platelet-Derived Growth Factor (PDGF), Regulated on Activation Normal T-cell Expressed and Secreted (RANTES/CCL5), Tumor Necrosis Factor (TNF)-α, and Vascular Endothelial Growth Factor (VEGF), using the Bio-Plex multiplex Human Cytokine and Growth factor kits (Bio-Rad) according to the manufacturer's protocol and as previously described16.

Statistical analysis

A Shapiro–Wilk test was used to evaluate whether the continuous data were normally distributed. According to the results, values were expressed as median and interquartile range and compared using the non-parametric Kruskal–Wallis test followed by the Mann–Whitney U test for pairwise comparisons. The non-parametric Jonckheere–Terpstra test was used to analyse the trend of a continuous variable across the levels of an ordinal independent variable. Categorical values were described by number of occurrences and percentages and compared by chi-square test.
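The trend test described above can be illustrated with a minimal Python sketch of the Jonckheere–Terpstra statistic. This is not the analysis code of the study (which was run in R); it assumes the large-sample normal approximation without tie correction, and the function name `jonckheere_terpstra` and the toy values are hypothetical.

```python
import math

def jonckheere_terpstra(groups):
    """Jonckheere-Terpstra trend statistic for groups listed in their
    hypothesised increasing order (e.g. controls, mild, severe).

    Returns (J, z, one-sided p) using the normal approximation
    without tie correction (an assumption of this sketch).
    """
    # J sums, over every ordered pair of groups, the number of
    # observation pairs in which the later group has the larger value
    # (ties count one half).
    j_stat = 0.0
    for a in range(len(groups)):
        for b in range(a + 1, len(groups)):
            for x in groups[a]:
                for y in groups[b]:
                    if x < y:
                        j_stat += 1.0
                    elif x == y:
                        j_stat += 0.5
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    mean = (n * n - sum(s * s for s in sizes)) / 4.0
    var = (n * n * (2 * n + 3) - sum(s * s * (2 * s + 3) for s in sizes)) / 72.0
    z = (j_stat - mean) / math.sqrt(var)
    p_increasing = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail p value
    return j_stat, z, p_increasing

# Toy, made-up concentrations in three ordered groups.
j_stat, z, p = jonckheere_terpstra([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

A small one-sided p value supports an increasing trend across the ordered groups, which is the Controls ≤ Mild ≤ Severe ordering tested in this study.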

Three machine learning methods were used for prediction of COVID-19: linear discriminant analysis (LDA), classification and regression tree (CART) and neural network (NNET). The performance of the algorithms was computed in terms of sensitivity, specificity and overall accuracy.

The predictive accuracy of the single factors and of the machine learning methods was measured by the area under the receiver operating characteristic (ROC) curve (AUC)17.
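As an illustration of the AUC used here, the sketch below computes it through its equivalence with the Mann–Whitney statistic: the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. The function name `auc` and the values are hypothetical; the study itself used R.

```python
def auc(case_values, control_values):
    """Area under the ROC curve for a single marker.

    Computed as the fraction of (case, control) pairs in which the
    case has the higher value; ties contribute one half.
    """
    wins = 0.0
    for c in case_values:
        for k in control_values:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))

# Toy, made-up marker values (pg/ml).
example_auc = auc([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
```

An AUC of 0.5 means the marker does not separate the two groups, and 1.0 means perfect separation.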

Algorithms were first designed (trained) and then evaluated (tested) on separate sets of data. To avoid overfitting and to robustly evaluate classification performance, a cross-validation approach was used. In detail, one subject was excluded from the training set and used as the test case; the procedure was iterated over all subjects and the average performance was computed. This leave-one-out approach is better suited to small data sets18,19.
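The leave-one-out procedure can be sketched as follows. To keep the example self-contained, a nearest-centroid classifier is used as a stand-in for LDA, CART and NNET; it is not one of the methods of the study, and all names and values below are hypothetical.

```python
from statistics import mean

def fit_centroids(X, y):
    """Stand-in classifier: store one mean vector (centroid) per class."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign x to the class with the closest centroid (squared distance)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda lab: sqdist(centroids[lab]))

def leave_one_out(X, y):
    """Hold out each subject in turn, train on the rest, predict the held-out one."""
    predictions = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit_centroids(X_train, y_train)
        predictions.append(predict(model, X[i]))
    return predictions

# Toy, made-up two-marker profiles for six subjects.
X = [[1.0, 10.0], [2.0, 12.0], [1.5, 11.0],
     [20.0, 100.0], [25.0, 120.0], [22.0, 110.0]]
y = ["control"] * 3 + ["covid"] * 3
predicted = leave_one_out(X, y)
loo_accuracy = sum(p == t for p, t in zip(predicted, y)) / len(y)
```

Because each prediction comes from a model that never saw the held-out subject, the averaged accuracy is less optimistic than accuracy measured on the training data.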

Data from the first wave of the outbreak were used to produce cross-validated classifiers (LDA, CART); those classifiers were then applied to data from the second wave. Performance was evaluated using confusion matrix indices (sensitivity, specificity, overall accuracy).
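For a three-class problem (control, mild, severe), the confusion matrix indices are usually computed one class against the rest. The sketch below shows one way to do this; the function names and the toy labels are hypothetical and do not reproduce the study's results.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Sensitivity and specificity for one class against all others."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    # Assumes each class occurs at least once among the true labels.
    return {"sensitivity": tp / (tp + fn), "specificity": tn / (tn + fp)}

def overall_accuracy(y_true, y_pred):
    """Fraction of subjects assigned to their true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy, made-up labels.
y_true = ["control", "control", "mild", "mild", "severe", "severe"]
y_pred = ["control", "mild", "mild", "mild", "mild", "severe"]
mild_metrics = one_vs_rest_metrics(y_true, y_pred, "mild")
accuracy = overall_accuracy(y_true, y_pred)
```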

Processing and statistical analysis were conducted using R software. Differences were considered statistically significant for p values below 0.05.

Results

Wave 1 cytokine signature

Between March 2020 and May 2020, 65 patients with a positive SARS-CoV-2 PCR test were enrolled. According to the World Health Organization (WHO) eight-point scale for COVID-19 trial endpoints20, patients were classified as “mild” (WHO scores 3–4; N = 46) or “severe” (WHO scores 5–8; N = 19). A cohort of 49 healthy blood donors was enrolled as control. No differences in gender distribution were observed among the three groups (Table 1). Severe COVID-19 patients were significantly older than both mild COVID-19 patients and controls. By contrast, no differences in age were detected between mild COVID-19 patients and controls (Table 1).

Table 1 Serum concentration (pg/ml) of cytokines, chemokines and growth factors (Wave 1). Results are expressed as median and interquartile range [25th percentile; 75th percentile] or number of cases (%). The Jonckheere–Terpstra test was used to assess the trend between groups. The non-parametric Kruskal–Wallis test was applied to assess the difference among the three groups, followed by the Mann–Whitney U test for pairwise comparisons.

A significant increasing trend of IL-1β, IL-1ra, IL-2, IL-4, IL-5, IL-6, IL-7, IL-8, IL-10, IL-12(p70), IL-15, IL-17, FGF-b, G-CSF, GM-CSF, IFN-γ, IP-10, MCP-1, MIP-1α, PDGF, TNF-α, and VEGF was observed across the three groups (Controls ≤ Mild COVID-19 ≤ Severe COVID-19) (Table 1, Fig. 1). Moreover, significantly higher concentrations of these factors were detected in serum of mild and severe COVID-19 patients compared to controls (Table 1). The only exception was FGF-b, which did not differ between severe COVID-19 patients and controls (Table 1). Finally, IL-1ra and IL-6 levels were significantly higher in severe versus mild COVID-19 patients, while PDGF decreased (Table 1).

Figure 1

COVID-19 patients display an increasing trend in circulating cytokines. Box plots denote the median and 25th to 75th percentiles (boxes) and minimum to maximum (whiskers); the Jonckheere–Terpstra trend test was used to analyse the data. The figure reports only factors with statistically significant trends. p values and the number of patients for each group are reported in Table 1.

Thus, patients with COVID-19 displayed higher levels of cytokines and chemokines, as also shown by the star plot in Fig. 2.

Figure 2

Cytokine-based pattern of COVID-19 patients. The star plot, obtained by multivariate analysis of the whole cytokinome of every subject, consists of a sequence of equi-angular spokes (radii), with each spoke representing one cytokine as indicated in the figure legend on the right. The length of a spoke is proportional to the magnitude of the variable for that data point relative to the maximum magnitude of the variable across all data points. A line is drawn connecting the data values on each spoke.
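The spoke scaling described in the caption of Fig. 2 can be sketched as below: each cytokine is divided by its maximum across all subjects, and the scaled values are placed on equally spaced angles. The function names and the concentrations are hypothetical, and the plotting step itself is omitted.

```python
import math

def spoke_lengths(subjects):
    """Scale each cytokine by its maximum across all subjects (0 to 1)."""
    names = list(subjects[0])
    maxima = {n: max(s[n] for s in subjects) for n in names}
    return [{n: (s[n] / maxima[n] if maxima[n] else 0.0) for n in names}
            for s in subjects]

def star_vertices(lengths):
    """Place one subject's scaled values on equi-angular spokes.

    Returns the (x, y) vertices of the polygon drawn for that subject.
    """
    names = list(lengths)
    step = 2 * math.pi / len(names)
    return [(lengths[n] * math.cos(k * step), lengths[n] * math.sin(k * step))
            for k, n in enumerate(names)]

# Toy, made-up concentrations (pg/ml) for two subjects.
subjects = [
    {"IL-6": 2.0, "IL-8": 10.0, "IL-10": 1.0, "IP-10": 100.0},
    {"IL-6": 8.0, "IL-8": 40.0, "IL-10": 4.0, "IP-10": 400.0},
]
scaled = spoke_lengths(subjects)
polygon = star_vertices(scaled[1])
```

With this scaling, a subject whose polygon reaches the outer edge on a spoke has the highest observed value for that cytokine.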

Cytokine-based prediction models

Next, we attempted to define a cytokine-based COVID-19 prediction model. The LDA algorithm classified subjects into the three groups (control, mild COVID-19, severe COVID-19) with an accuracy of 0.96; 95% CI: (0.91, 0.99) (Fig. 3, Supplementary Tables 1–3). ROC analyses revealed that almost all cytokines could achieve high diagnostic discriminative power (Fig. 4). However, IL-6, IL-8, IL-10 and IP-10 showed both diagnostic and prognostic classification performance, with an AUC > 0.95 in at least 2 out of the 3 comparisons (control vs mild + severe COVID-19; mild vs control + severe COVID-19; severe vs control + mild COVID-19) (Supplementary Table 4). Thus, machine learning algorithms (LDA, NNET and CART) were set up using only the concentrations of the four selected cytokines. ROC analyses revealed a high performance of the three classifier algorithms, with an AUC of 0.97 for LDA, 0.81 for NNET and 0.94 for CART (Fig. 5A). The CART algorithm clearly indicated that IL-6 discriminated controls from COVID-19 patients. Moreover, the combination of IL-6 and IL-8 defined disease severity well. Test overall accuracy was 0.85; 95% CI: (0.77, 0.91) (Fig. 5B,C, Supplementary Tables 5, 6).

Figure 3

A 27-cytokine-based algorithm predicts disease state and severity. 2D scatterplot of each subject’s cytokines. The LDA projection is based on two components, LD1 and LD2, whose coefficients are reported in Supplementary Table 3.

Figure 4

Diagnostic relevance of COVID-19 related cytokines. The diagnostic performance of cytokines, chemokines and growth factors was estimated using ROC curve analysis and compared by AUC for Controls versus Mild + Severe COVID-19 patients.

Figure 5

IL-6 and IL-8 performance in discriminating COVID-19 disease and severity. Comparison of classification accuracy of LDA, NNET and CART algorithms. The AUC of the ROC analysis indicates the performance of the three classifier algorithms (A). The scatterplot from CART analysis identifies the groups labelled by their terminal nodes (B). The decision tree shows the rules and split points to estimate COVID-19 disease and severity. In each box, the first number estimates controls, the second number mild COVID-19 patients, and the third number severe COVID-19 patients. The binary decision tree reveals an optimal cut-off of IL-6 > 6.8 pg/ml for predicting COVID-19 disease and of IL-8 > 117 pg/ml for severity (C).
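The two cut-offs reported in Fig. 5C can be read as a pair of human-readable rules. The sketch below is a simplified reading that assumes a two-split tree (IL-6 first, then IL-8 among subjects above the IL-6 cut-off); the full tree structure is not given in the text, so the function `classify_subject` is illustrative only and is not a diagnostic tool.

```python
IL6_CUTOFF = 6.8    # pg/ml, cut-off for COVID-19 disease (from Fig. 5C)
IL8_CUTOFF = 117.0  # pg/ml, cut-off for severity (from Fig. 5C)

def classify_subject(il6, il8):
    """Apply the two reported split points in sequence.

    Assumption of this sketch: subjects at or below the IL-6 cut-off
    are labelled controls, and IL-8 is consulted only for the others.
    """
    if il6 <= IL6_CUTOFF:
        return "control"
    if il8 > IL8_CUTOFF:
        return "severe"
    return "mild"

# Toy, made-up serum concentrations (pg/ml).
labels = [classify_subject(3.0, 50.0),
          classify_subject(20.0, 50.0),
          classify_subject(20.0, 200.0)]
```

Rules of this form are what makes a CART model easy for a clinician to inspect, in contrast with the coefficients of LDA or the weights of a neural network.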

Wave 2 cytokine signature

During September and October 2020 (wave 2), a further 36 patients with confirmed SARS-CoV-2 infection and 15 negative controls were enrolled. Of these patients, 26 were classified as mild and 10 as severe. As for wave 1, significant differences and increasing trends of IL-1β, IL-1ra, IL-2, IL-6, IL-8, IL-10, GM-CSF, IFN-γ, and IP-10 were observed across the three groups (Controls ≤ Mild COVID-19 ≤ Severe COVID-19) (Table 2). In contrast with wave 1, neither a significant trend nor a significant difference was detected for IL-4, IL-5, IL-7, IL-17, FGF-b, G-CSF, PDGF, TNF-α, and VEGF among the three groups (Table 2). Interestingly, in wave 2, only IL-1ra, IL-2, IL-6, IL-8 and IFN-γ concentrations were significantly higher in serum of both mild and severe COVID-19 patients compared to controls (Table 2). Higher concentrations of IL-1β and IL-12(p70) were detected only in serum from mild COVID-19 patients, while IP-10 was higher only in serum of severe COVID-19 patients, compared to controls (Table 2). Reduced levels of eotaxin in mild COVID-19 patients were also observed (Table 2). Notably, unlike in wave 1, only IP-10 concentrations were significantly higher in severe versus mild COVID-19 patients, with no change in IL-1ra and IL-6 levels (Table 2).

Table 2 Serum concentration (pg/ml) of cytokines, chemokines and growth factors (Wave 2). Results are expressed as median and interquartile range [25th percentile; 75th percentile] or number of cases (%). The Jonckheere–Terpstra test was used to assess the trend between groups. The non-parametric Kruskal–Wallis test was applied to assess the difference among the three groups, followed by the Mann–Whitney U test for pairwise comparisons.

Challenge of cytokine-based prediction model

To validate the cytokine-based COVID-19 prediction algorithms defined with data from the first wave (Fig. 5), CART analysis was carried out using data from the second wave as the test sample. Test accuracy was 0.68; 95% CI: (0.54, 0.80), with low sensitivity for the discrimination of severe COVID-19 and low specificity for mild COVID-19 patients (Supplementary Tables 7, 8).

Thus, a prediction model was set up with cytokine data derived only from the control and mild COVID-19 cohorts of wave 1. ROC analyses revealed that IL-5, IL-6, IL-7, IL-8 and IL-10 showed the best discriminative power (Supplementary Table 9). Based on these selected cytokines, the CART algorithm indicated that IL-6 was able to discriminate control and mild COVID-19 patients with an overall test accuracy of 0.92; 95% CI: (0.85, 0.97) (Supplementary Tables 10, 11). Interestingly, challenging this prediction model with data from the second wave achieved an accuracy of 0.83; 95% CI: (0.68, 0.93), sensitivity of 0.88 and specificity of 0.73 (Supplementary Tables 12, 13), indicating IL-6 as the best predictor of COVID-19.
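A confidence interval for an accuracy such as those reported above can be obtained from the exact binomial (Clopper–Pearson) method, sketched below. The text does not state which interval method was used, so this choice is an assumption. The counts in the example (34 correct out of 41) are also a reconstruction, not reported values: they assume that all 26 mild patients and 15 controls of wave 2 were tested and are consistent with a sensitivity of 0.88 (23/26) and a specificity of 0.73 (11/15).

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _root_of_decreasing(f, lo=0.0, hi=1.0, iterations=100):
    """Bisection for the root of a function that decreases on [lo, hi]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a proportion k / n."""
    if k == 0:
        lower = 0.0
    else:
        # Lower bound solves P(X >= k | p) = alpha / 2.
        lower = _root_of_decreasing(
            lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    if k == n:
        upper = 1.0
    else:
        # Upper bound solves P(X <= k | p) = alpha / 2.
        upper = _root_of_decreasing(
            lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper

# Reconstructed (assumed) counts: 34 correct classifications out of 41.
ci_low, ci_high = clopper_pearson(34, 41)
```

With small test sets such as the wave 2 cohort, the interval is wide, which is why the accuracy estimates should be read together with their confidence limits.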

Discussion

COVID-19, caused by SARS-CoV-2, leads to fast activation of innate immune cells, with a profound cytokine response, especially in patients developing severe disease, resembling a hyperinflammatory state21. The identification of specific cytokines as indicators of disease severity might improve the clinical management of COVID-19 patients, with a great impact on diagnostic and therapeutic decision making. However, discrepancies exist on the factors involved in the cytokine storm, and the majority of studies refer only to the first wave of the outbreak.

Here, we have shown that: (1) the second wave of the COVID-19 pandemic is characterized by a less pronounced cytokine storm compared to wave 1; (2) a 27-cytokine-based algorithm predicts disease state and severity with an accuracy of about 96%; (3) IL-6 was significantly associated with COVID-19 diagnosis regardless of the epidemic peak.

Accumulating evidence has clearly indicated that a cytokine storm occurs in patients with COVID-19; however, the different cytokine profiles analyzed have revealed variable results. Consistent with previous studies12,22, results obtained in our population during wave 1 reveal an activation of type 1, type 2 and type 3 immunity. In detail, we found increased levels of many pro-inflammatory and suppressive cytokines, as well as chemokines and growth factors, including IL-1β, IL-1ra, IL-2, IL-4, IL-5, IL-6, IL-7, IL-8, IL-10, IL-12(p70), IL-15, IL-17, FGF-b, G-CSF, GM-CSF, IFN-γ, IP-10, MCP-1, MIP-1α, PDGF, TNF-α, and VEGF. IL-6 and IL-1ra levels further increased in patients who were critically ill. It is widely recognized that IL-6, an important biomarker of inflammation for multiple conditions, has a crucial role in the COVID-19 cytokine storm23,24. Its levels correlate with serum viral load detected by RT-PCR in critically ill COVID-19 patients and with disease outcome23,24,25. IL-1ra has inhibitory roles against pro-inflammatory cytokine activation and T lymphocyte responses26,27. It regulates IL-1, TNF-α and IFN production27, suggesting a potential role in constraining a further increase of these cytokines in severe patients. The simultaneous increase of IL-6 and IL-1ra in critically ill patients suggests an overactive immune response, which may contribute to inflammation-induced tissue damage.

Cytokines display a large interindividual variability, and their functions and release depend on multiple signals, different cell targets, and physiological and lifestyle factors. Thus, it is particularly challenging to evaluate the diagnostic ability of cytokines, owing to the difficulty of setting cytokine cut-off levels28,29. Notably, here we have shown that 27-cytokine profiling could be used to stratify patients with COVID-19. However, at present, a diagnostic tool based on the measurement of the whole cytokinome may be too costly for national health systems.

Thus, we selected IL-6, IL-8, IL-10 and IP-10 as the cytokines with the highest performance in the discrimination of mild COVID-19 patients, severe COVID-19 patients and healthy volunteers. Our results are in agreement with the work by Laing and colleagues, who identified IL-6, IL-10 and IP-10 as a “severity-related triad”9. IL-10 is a cytokine with anti-inflammatory functions. It suppresses macrophage and dendritic cell activation and limits Th1 and Th2 effector responses28. In COVID-19, IL-10 could be involved in counteracting the hyperactive immune response, thereby limiting injury but also boosting infection persistence. IP-10 has versatile biological functions on different cell types, which include chemoattraction of inflammatory cells, but also migration and proliferation of endothelial cells30. IP-10 is commonly secreted in response to IFN-γ. However, it could also be directly induced by virus-related mechanisms9. IL-8 is a potent pro-inflammatory cytokine. It is involved in the recruitment and activation of neutrophils31,32. Thus, its increase may be related to the neutrophilia often detected in patients with COVID-19.

These four cytokines were fed into three machine learning methods: CART, NNET, and LDA. All three methods were able to predict COVID-19 occurrence and severity with high performance. Although LDA achieved a higher AUC than CART, CART has been reported to perform well regardless of sample size, group size ratio, effect size, and type of model. Moreover, compared to the other prediction methods, CART provides the clinician with useful information regarding the relative importance of predictors in group separation, with the advantage of producing human-readable rules33. Thus, we moved on to the CART algorithm and found that IL-6 is the best predictor for COVID-19 disease. The addition of IL-8 defined disease severity well.

However, to validate the algorithm more robustly, we challenged the method with serum cytokine measurements from patients enrolled during a different epidemic peak.

Characteristics of patients with COVID-19 have largely changed over time3,4,5. In Italy, patients who died in the second phase of the epidemic were older, more likely to be women, and had a higher probability of superinfections, a larger comorbidity burden, and longer survival from symptom onset compared to people who died in the first phase (March–May 2020)5. Here, we found that in wave 2 the cytokine storm profile developed at lower levels compared to wave 1. Many cytokines that were increased in serum of COVID-19 patients during wave 1 were indistinguishable between patients and controls during wave 2. For example, IL-4 and IL-5 levels were not increased in wave 2 COVID-19 patients, suggesting a lack of type 2 immunity activation. Moreover, unlike in wave 1, only IP-10 levels were significantly higher in severe versus mild COVID-19 patients, with no change in IL-1ra and IL-6 levels. The IP-10 increase was not paralleled by an IFN-γ increase, suggesting a direct relationship with viral pathogenic mechanisms.

The treatment approach changed over the two periods, as critically ill patients in the second phase were less likely to receive antivirals and/or IL-6R inhibitors and more likely to be treated with steroids and non-steroidal anti-inflammatory drugs (NSAIDs). Thus, the reduced extent of the cytokine storm observed in wave 2 may reflect the different therapeutic strategies adopted in the two epidemic periods. For instance, in wave 2, the loss of IL-6 augmentation in severe COVID-19 may be explained by the treatment approach, which was not based on tocilizumab administration.

The discrepancies we found between cytokine profiles in the two COVID-19 outbreaks have changed the discriminative power of the previously identified algorithm. In particular, challenging the method with the results obtained during the second wave revealed a different pattern of cytokines with the best predictive performance and a reduced ability to classify mild versus severe COVID-19. However, CART analysis was still able to distinguish controls from mild COVID-19 patients with high accuracy, using an algorithm based on IL-6 concentration. We have thereby confirmed that IL-6 remains an excellent predictor and found that it represents a COVID-19 biomarker regardless of the epidemic peak.

In conclusion, it is conceivable that a detailed knowledge of the role of single cytokines in SARS-CoV-2 infection and a prediction model built on cytokine levels might strongly help to foster novel diagnostic tools and to inform innovative therapeutic interventions.