Upregulated type I interferon responses in asymptomatic COVID-19 infection are associated with improved clinical outcome

Understanding key host protective mechanisms against SARS-CoV-2 infection can help improve treatment modalities for COVID-19. We used a blood transcriptome approach to study biomarkers associated with differing severity of COVID-19, comparing severe and mild Symptomatic disease with Asymptomatic COVID-19 and uninfected Controls. There was suppression of antigen presentation but upregulation of inflammatory and viral mRNA translation-associated pathways in Symptomatic as compared with Asymptomatic cases. In severe COVID-19, CD177, a neutrophil marker, was upregulated while interferon-stimulated genes (ISGs) were downregulated. Asymptomatic COVID-19 cases displayed upregulation of ISGs and humoral response genes with downregulation of ICAM3 and TLR8. Compared across the COVID-19 disease spectrum, we found type I interferon (IFN) responses to be significantly upregulated (IFNAR2, IRF2BP1, IRF4, MAVS, SAMHD1, TRIM1) or downregulated (SOCS3, IRF2BP2, IRF2BPL) in Asymptomatic as compared with mild and severe COVID-19, with the dysregulation of an increasing number of ISGs associated with progressive disease. These data suggest that initial early responses against SARS-CoV-2 may be effectively controlled by ISGs. Therefore, we hypothesize that treatment with type I interferons in the early stage of COVID-19 may limit disease progression by limiting SARS-CoV-2 in the host.


Results
Demographic description of study subjects. All COVID-19 cases had a respiratory sample which was positive for SARS-CoV-2 at the time of recruitment. Blood samples were taken within 24-48 h of confirmed COVID-19 diagnosis. COVID-19 patients were classified according to the WHO ordinal score 22 . There were eighteen Asymptomatic COVID-19 cases and eleven Symptomatic COVID-19 cases. Symptomatic cases were further categorized into three with mild and eight with severe disease (Supplementary Table 1). Controls were eighteen uninfected healthy individuals. Controls and Asymptomatic cases were younger than Symptomatic COVID-19 cases (Table 1). However, within the Symptomatic group, the ages of cases with mild disease were comparable with those of Controls and Asymptomatic cases.
Four of eighteen Asymptomatic COVID-19 cases had a positive serum IgG antibody response to Spike protein. As IgG antibodies are shown to develop 5-7 days after SARS-CoV-2 infection 23 , this further confirms that the blood samples from these individuals were taken early in the infection. All sera from Control cases were negative for IgG antibodies to Spike protein. Asymptomatic cases and uninfected Controls were comparable in age (Table 1), but were younger than Symptomatic COVID-19 cases (p = 0.0001). There was no gender-based difference between the Asymptomatic, Symptomatic or uninfected Control groups.
Table 1. Description of study population. a, p value as compared across age groups between Controls, Asymptomatic and Symptomatic COVID-19 cases; '#', all COVID-19 cases had a respiratory sample which was positive for SARS-CoV-2 by PCR within 48 h prior to collection of blood samples; '*', significantly different; NS, not statistically significant. The Kruskal-Wallis test was used for non-parametric statistical comparison between groups.
Differential regulation of genes between COVID-19 cases and uninfected controls. We compared blood transcriptome profiles between Asymptomatic and Symptomatic COVID-19 cases and Controls to identify transcriptional differences between groups. Principal component analysis (PCA) demonstrated clustering of datasets, with Asymptomatic and Control cases segregated together, plotted away from Symptomatic COVID-19 cases (Fig. 1a). Notably, the data points for both mild and severe Symptomatic disease clustered together, although there was an age difference between these COVID-19 cases. A total of 4338 differentially regulated genes (DEGs) were identified between the three data sets based on a > 2-logFC difference.
The greatest number of DEGs were between Symptomatics and uninfected Controls (n = 3609), followed by DEGs between Asymptomatics and Symptomatics (n = 3404), with the least between Asymptomatic and Controls (n = 424) (Fig. 1b).
Coronavirus disease and inflammatory pathway genes are downregulated in asymptomatic COVID-19 cases. We investigated the DEGs between Symptomatic and Asymptomatic COVID-19 cases, with 1939 Up- and 1465 Down-regulated genes. A volcano plot displays these genes, identified using a > 2-logFC cutoff in a two-way ANOVA paired analysis with a false discovery rate (FDR) < 0.05 (Fig. 2a). These data were then run through a KEGG enrichment analysis to rank genes according to the significance of their changes and to group them by biological pathway (Fig. 2b, Supplementary Tables 2 and 3).
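The thresholding described above can be illustrated with a minimal sketch. It assumes that the "> 2-logFC" cutoff means an absolute log fold change above 2 and that each gene carries an FDR-adjusted p value; both are our reading, not something the text specifies. The class name, function names and example values are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    symbol: str     # gene identifier (placeholder names below)
    log_fc: float   # log fold change, Symptomatic vs Asymptomatic
    fdr: float      # FDR-adjusted p value


def classify(gene, fc_cutoff=2.0, fdr_cutoff=0.05):
    """Label one gene as 'up', 'down' or 'ns' (not significant).

    Assumption: the cutoff applies to the absolute log fold change.
    """
    if gene.fdr >= fdr_cutoff:
        return "ns"
    if gene.log_fc > fc_cutoff:
        return "up"
    if gene.log_fc < -fc_cutoff:
        return "down"
    return "ns"


def count_degs(genes, fc_cutoff=2.0, fdr_cutoff=0.05):
    """Count genes per category, as summarized alongside a volcano plot."""
    counts = {"up": 0, "down": 0, "ns": 0}
    for gene in genes:
        counts[classify(gene, fc_cutoff, fdr_cutoff)] += 1
    return counts


# Illustrative values only (not study data).
example = [
    GeneResult("GENE_A", 3.1, 0.001),    # large increase, significant
    GeneResult("GENE_B", -2.6, 0.020),   # large decrease, significant
    GeneResult("GENE_C", 2.8, 0.300),    # large change, fails the FDR cutoff
    GeneResult("GENE_D", 0.4, 0.001),    # significant, change too small
]
print(count_degs(example))
```

Applying both filters together is what restricts the DEG list to genes that sit in the upper left and upper right corners of a volcano plot.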
KEGG enrichment analysis of biological components revealed that genes involved in pathways for antigen processing and presentation were suppressed in Symptomatic as compared with Asymptomatic cases. Further, the most significantly activated genes were those classified under Coronavirus disease, comprising CXCL8, which is highly chemotactic for neutrophils; NF-kB pathway genes (FOS/JUN/NFKBIA); and a number of ribosomal proteins representing small and large sub-unit proteins and accessory proteins found to be affected during viral infection of cells 24 (Supplementary Table 2). Further, IL-17 and inflammatory pathways, such as those in non-alcoholic fatty liver disease (NAFLD), AGE-RAGE signaling and NF-kB pathways, were upregulated (Fig. 2b). Additionally, genes involved in responses to bacterial, parasitic and viral diseases were also activated, indicating activation of innate and adaptive immune response genes.

Differential gene expression between symptomatic cases with severe and mild COVID-19.
To further understand regulatory changes in those with Symptomatic COVID-19, we compared transcriptome data of those with severe and mild disease. We found 241 genes to be differentially regulated between the groups, with 53 upregulated and 183 downregulated genes. As depicted by the volcano plot (Fig. 3), CD177, a member of the Ly-6 gene superfamily involved with neutrophil proliferation, was the most upregulated in severe COVID-19 cases. Additional markers upregulated in severe disease were MAPKAPK2 (MAP kinase-activated protein kinase 2, which regulates inflammatory cytokines), IRF2BP2 (interferon regulatory factor-2 binding protein-2, a transcriptional corepressor for interferon) and CXCL16 (a chemoattractant for activated CD8 T cells, NKT cells and Th1-polarized T cells that express CXCR6). Downregulated genes included HIST1H2BO (a replication-dependent histone gene cluster), and the ISGs, IFIT, IFIT3, OAS1, OAS3, LY6E and MX1.

Activation of inflammatory responses and suppression of T cell immunity in severe COVID-19 cases.
We validated our data analysis using an RNAseq data set from Germany by Aschenbrenner et al., who compared transcriptomes of COVID-19 cases with healthy controls 25 , focusing on severe COVID-19 and Controls only. These were used to run a gene set enrichment analysis (GSEA; gseGO-BP/MF/CC, enrichKEGG) between the two groups. GSEA of the data from the report by Aschenbrenner et al. 25 showed that in COVID-19 cases with severe disease, there was activation of pathways belonging to the humoral immune response, complement activation, and host innate immune responses such as phagocytosis, leucocyte and neutrophil activation (Supplementary Fig. 1). However, there was suppression of ribosomal biogenesis and translation pathways together with downregulation of cytotoxic cell functions associated with both Natural Killer (NK) and T cells. Both T cell activation and differentiation pathways were seen to be suppressed.

Upregulation of immune regulatory genes and interferon pathway genes in asymptomatic COVID-19 cases. We subsequently interrogated transcriptional profiles of Asymptomatic COVID-19 cases and Controls in our Pakistani dataset to identify gene signatures associated with effective viral restriction of SARS-CoV-2. A Gene Set Enrichment (GSE) analysis for Biological Processes identified DEGs in the most affected pathways. This revealed activation of innate immune, defense response to virus, effector immune response and cytokine pathways, which included type I interferon response and type I interferon pathway genes in Asymptomatics (Fig. 4a). Asymptomatic cases displayed the activation of IFN I, II, III and alpha/beta, JAK/STAT pathway, IL-1 mediated Myd88 signaling, IL-2, RIG-I-like receptor pathway and MAPK/ERK signaling pathways. Further, MHC class I and perforin genes were also upregulated as compared with Controls.
To identify specific genes which could be driving the host-protective responses in Asymptomatic cases, we first examined the DEGs between the groups that included 144 Up and 277 Down genes (Fig. 4b). Twenty-four of the most differentially regulated genes were identified using a fold change of > 7 and an FDR −log10 cutoff of 2.5. These comprised eighteen upregulated and six downregulated genes, as visualized through hierarchical clustering in a heat map (Fig. 4c). Further, upregulated genes were the B cell-related IGL5, JCHAIN, LY6E (Thymic Shared Antigen-1, proton ATPase) and CALM1 (calmodulin 1) associated with blood and vasculature. Downregulated genes were ALPL (non-tissue specific alkaline phosphatase), HEMGN (erythroid associated hematopoietic gene) and PI3 (peptidase inhibitor 3).

Discussion
We investigated host transcriptional responses in SARS-CoV-2 positive Asymptomatic and Symptomatic COVID-19 cases comparing them with uninfected Controls to identify biomarkers of host protective responses in individuals who show improved control of viral infection. Our study highlights the upregulation of type I Interferon-driven response genes in Asymptomatic COVID-19 cases.
Our Asymptomatic COVID-19 cases and those with mild Symptomatic COVID-19 were similar in age to the uninfected Control group, but younger than severe Symptomatic COVID-19 cases. The age range of our groups fits the demographics of COVID-19 patients seen through the first two waves of the pandemic in Pakistan during 2020. Over the same period studied, the Sindh Health Department COVID-19 report indicated that 76% of confirmed cases were below 50 years of age 26 , with a similar trend observed in national data. Nasir et al. showed that there were no specific factors associated with hospital related morbidity and mortality of COVID-19 in Pakistan, except for an older age group 27 . These study samples were collected between March and October 2020. In March, S, L and G clade strains were present, with an increasing predominance of the GH clade with the D614G mutation found by October 2020 28 . Importantly, the samples in this study were collected before the emergence of SARS-CoV-2 variants of concern (VoC), which are known to have greater transmissibility and have been found to be more virulent 29 .
In the context of Pakistan, it is difficult to ascertain if the relatively lower morbidity and mortality from COVID-19 observed may be due to factors such as other infections which have led to cross-protective immunity against SARS-CoV-2 11,12,30 . The younger age group may be a factor; however, although our sample size for the mild Symptomatic group was small, we found both mild and severe Symptomatic cases to have transcriptional data segregated away from that of Asymptomatics and Controls.
The overall comparison of Symptomatic and Asymptomatic cases revealed upregulation of TNFα, NFkβ, IL1, HIF1A, ICAM, SOCS3 transcripts, and of type 2 cytokines (IL4, IL6, IL10), but downregulation of antigen processing and presentation pathways in the former group. Raised inflammatory responses in COVID-19 individuals with advanced disease reflect the induction of a cytokine storm; upregulation of IL-6, G-CSF, IL-1RA, and MCP1 has been shown to be associated with severe outcomes leading to mortality in patients with COVID-19 31 . Persistent expression of these inflammatory cytokines may be detrimental as it leads to increased influx of neutrophils, reported to be the source of tissue damage 32 . Heightened Th1, Th2 and Th17 responses along with increased chemokines may result in "cytokine storm" 33 , which can have devastating consequences on host inflammation 34 . Upregulation of DEGs related to protein synthesis genes in Symptomatic cases fits with increased viral mRNA translation in infected individuals, associated with viral replication and spread 35 . Reduced antigen processing responses in Symptomatic cases supports the observation of downregulated adaptive immunity in COVID-19 such as T cell effector responses which are found to post treatment 36 . Overall, the transcriptional changes we observed in Symptomatic COVID-19 cases are in concordance with previous reports 37 .
We only had a few mild Symptomatic cases and these were younger in age than the severe Symptomatic group. However, PCA analysis revealed that they clustered together within the Symptomatic cases and away from the Asymptomatics and Controls. Comparison of severe and mild Symptomatic COVID-19 cases revealed an upregulation of CD177, the neutrophil marker, CXCL16, MAPKAPK2 and IRF2BP2 with downregulation of ISGs; IFIT, IFIT3, OAS1, OAS2, LY6E and MX1 in severe cases. OAS, a family of molecules upregulated in response to ISGs, is known to play a role in early viral clearance by degrading viral RNA in combination with RNase L 38 . LY6E, a GPI-anchored protein, has an impact on cellular receptors for viruses or viral glycoproteins in terms of their expression, kinetics, or biophysical properties, thus affecting their binding, trafficking, and membrane fusion 39 . We compared our data set with the transcriptome data set from Germany published by Aschenbrenner et al. 25 , which showed the upregulation of neutrophil/granulocyte-related processes and interferon responses in both severe and mild COVID-19 cases as compared with Controls 25 . Our analysis of the same data set comparing severe COVID-19 cases with Controls also revealed leucocyte and neutrophil activation (Supplementary Fig. 1). Further, we found downregulation of T cell proliferation and activation markers, as shown previously in COVID-19 disease 36 . Of note, Aschenbrenner et al. did not find any differential activation of ISGs between mild and severe COVID-19 cases 25 . However, this may be because their classification for 'mild' cases was WHO ordinal score '1-4', whilst we compared cases further stratified into Asymptomatic 'score 1-2' and mild Symptomatic cases with a score of 4 only.
Hence, our study was able to clearly focus on transcriptome responses of an Asymptomatic COVID-19 cohort with minimal or no symptoms, representing effective early host immune responses.
In this study we used an RNA microarray chip with > 21,000 transcripts including immune, metabolic and regulatory pathway genes. Unlike RNA sequencing, a microarray identifies only known targets and does not capture novel RNAs. Therefore, there may be differences in the precise genes identified between our data and others using microarray or RNAseq based analysis.
Focus on transcriptional responses of Asymptomatics revealed upregulation of ISGs (such as USP18, IFI27, LY6E, MX1, OAS1, IFI44L, RSAD2, CMPK2), in addition to histone genes and CALM1. CMPK2 has a well described antiviral role in HIV patients 40 . USP18 regulates type 1 IFNs, protecting against severe immune inflammation with calcification and polymicrogyria following viral infection 41 . RSAD2, an anti-viral factor, has been shown to be under-expressed in SARS-CoV-2 42 . Upregulation of IFI44 and IFI44L genes has been shown in early protection against respiratory syncytial virus 43 . The upregulated CALM1 gene in Asymptomatics adds another layer of protection by interacting with ACE-2 and inhibiting shedding of its ectodomain 44 . Raised plasma cytokines reflected increased inflammation in Asymptomatic individuals. IL-6, TNFα and IL1β are all proinflammatory, whilst IL-10 downregulates the response, and the balance of these is essential for maintaining immune regulation against pathogens 45 . SARS-CoV-2 has been shown to induce reduced IFN type I responses and an increased pro-inflammatory cytokine and chemokine profile with down-regulated IL-10 responses 46 . The balance in which IFN responses are required is controlled by immune modulatory cytokines such as IL10 and TGFβ to prevent tissue damage 46 .
Investigating the differential activation of interferon and other cytokine genes across the COVID-19 disease severity spectrum, we identified an activation signature of type I interferon pathway genes and immune response genes, including cytokine signaling pathways, in Asymptomatic as compared with Symptomatic COVID-19 cases.
Further, IFNs activate downstream JAK/STAT pathways to regulate ISGs that induce an anti-viral state in the host 52 . Deficiency in STAT2 has been linked with compromised IFNα responses.
When cytokine and chemokine expression was compared between Asymptomatic, mild and severe COVID-19, it was apparent that there was upregulation of pro-inflammatory cytokine/chemokine signaling pathways with downregulated AREG, IL18, IL18R1, IL18RAP, IL1R1, IL1R2, IL1RAP, IL6, PPBP and VEGFA gene expression. The association of IL18 with acute respiratory distress syndrome has been described in infections with avian influenza virus (H5N1 and H7N9) 53 . Patients with increased IL-6 responses have been shown to be at increased risk of requiring mechanical ventilation and therefore, IL-6 is now recommended to be included in the diagnostic workup of severe COVID-19 cases 54 . Up-regulated VEGFA in severe COVID-19 cases is in line with the findings of a study in which it was indicated as a marker of endothelial dysfunction with significant correlation with disease severity 55 . AREG is an epidermal growth factor ligand which plays an important role in tissue repair and pulmonary fibrosis, and its increased expression in patients with severe COVID-19 disease has been negatively correlated with genes associated with cytotoxic NK cell functions 56 . Hence, the upregulated inflammatory cytokines and downregulation of IL18, IL6, VEGF and AREG in Asymptomatics are all suggestive of improved control of SARS-CoV-2 infection.
Overall, discrepancies observed between studies investigating the role of type I and III interferons in protecting against SARS-CoV-2 may possibly be due to the different stages of COVID-19 disease in cohorts investigated. Our study highlights upregulated type I interferon responses in Asymptomatic COVID-19 cases. Our findings are supported by studies that show increased production of type I IFNs in patients with mild disease and improved clinical outcomes 57 . Other studies have reported the insufficient activation of IFN I responses in moderate to severe cases of SARS-CoV-2 infection, which may explain the failure to restrict viral replication in a timely fashion 20 . Longitudinal studies investigating alterations in immune responses confirmed the late induction of Type I IFNs in peripheral blood of SARS-CoV-2 patients 58 .
IFN-based therapies are used to treat viral infections such as hepatitis C and HIV 59,60 . A recent clinical trial has shown improvement in recovery time when patients with severe COVID-19 were treated with IFN Iβ 61 . In summary, our study provides insights into the role of immune responses which may be associated with early clearance of virus. Therefore, we hypothesize that treatment modalities such as early administration of type I interferon may facilitate clearance of SARS-CoV-2.

Methods
Study subjects. This study received approval from the Ethics Review Committee of the Aga Khan University (AKU). All methods were performed in accordance with the relevant guidelines and regulations. All study subjects were aged over 18 years. Informed consent was taken from all study subjects or their adult next of kin. COVID-19 patients included in the study were recruited in the period April-October 2020. Of the eighteen Asymptomatic COVID-19 cases, two had no symptoms and sixteen had one or two mild symptoms such as a short (1-2 days) history of fever, sore throat or myalgia. These were scored (1-2) as per the WHO ordinal scale. They were identified during screening due to contact with a COVID-19 positive case, or as part of regular screening conducted due to travel from outside of the city into Karachi. Blood samples from Asymptomatic cases were taken within 2-3 days of their positive SARS-CoV-2 PCR result. None from this group required any medical treatment.

SARS-CoV
Symptomatic cases were admitted to AKUH at the time of recruitment. Their clinical details are provided in Supplementary Table 1. Blood samples from Symptomatic cases were drawn within 2 days of their positive SARS-CoV-2 PCR result, prior to the administration of any anti-viral treatment or IL-6 antagonist. Upon admission, cases had WHO ordinal scores between 4 and 7. Symptomatic cases can be further sub-divided into those with 'Mild' (score of 4, n = 3) or 'Severe' (score of 5-7, n = 8) COVID-19.
Uninfected healthy controls did not have any current or prior history of SARS-CoV-2 infection. These included 6 cases recruited in March 2020 and 12 cases from 2018.
IgG to spike protein ELISA. IgG antibodies to Spike protein were measured using an in-house ELISA method 62 . Recombinant Spike protein for ELISA was kindly provided by Dr. Paula Alves, IBET, NOVA University, Portugal. Sera from Asymptomatic COVID-19 cases and uninfected Controls were available for serology testing. Serum for Symptomatic cases was not available.
RNA microarray data. RNA was extracted from whole blood collected in a plasma/EDTA tube using the Qiagen RNA Blood Mini Kit (Qiagen, GmbH, Germany). One hundred nanograms of RNA was used for the preparation of cDNA for use in the Clariom™ S Assay, human (Affymetrix, USA; 902927). The Clariom S Array has hybridization probes for 21,488 genes. The arrays were scanned using an Affymetrix autoloader system. Array data was generated and used for processing. For accession number generation, array output raw files (CEL files) and processed files (CHP) were submitted to the NCBI Gene Expression Omnibus (GEO) and are available as GSE177477. The accession numbers generated are in Supplementary Table 3.
Statistical analysis. Differences between age groups and lab parameters were compared using the Kruskal-Wallis test. CEL files were analysed using the TAC Transcriptome Analysis Software Suite (TACS version 2) using the Summarization Method: Gene Level-SST-RMA Pos vs Neg AUC Threshold: 0.7 against Genome Ver-

Functional enrichment analysis.
To obtain further insights into the function of the differentially expressed genes (DEGs), we performed Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, using clusterProfiler 63 (Fig. 2). The R package clusterProfiler implements statistical methods to analyze and visualize functional profiles (GO and KEGG) of genes and gene clusters 64 . We used two types of functions from clusterProfiler, i.e., the enricher functions (enrichGO, enrichKEGG) for the hypergeometric test and the GSEA functions (gseGO, gseKEGG) for gene set enrichment analysis on user defined data. GO enrichment analysis was carried out with the enrichGO function, which requires a gene list as an input vector. The results were annotated along three ontologies: Molecular Functions, Biological Processes and Cellular Components, with the following parameters: pvalueCutoff = 0.05, pAdjustMethod = "BH" (Benjamini and Hochberg) and qvalueCutoff = 0.05. The enrichKEGG function is simpler: it requires a gene list as input, a pvalueCutoff of 0.05 and the organism of interest (Homo sapiens, "hsa"). Gene set enrichment analysis was performed on GO terms using gseGO, which requires a gene list in the form of an input vector, the organism of interest (database: org.Hs.eg.db), pvalueCutoff = 0.05, minGSSize (minimal size of genes annotated by Ontology term for testing) = 10 and maxGSSize (maximum number of genes annotated for testing) = 800. The gseKEGG function is similar with respect to input parameters (gene list, organism = hsa, minGSSize, maxGSSize and pvalueCutoff), applied on the KEGG database. For visualization of results, related R packages such as GOplot, enrichplot, DOSE and pathview were used to generate pathway maps, dotplots, heatmaps and barplots.
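The two statistical ingredients named above, the hypergeometric test and Benjamini-Hochberg adjustment, can be sketched in a few lines. This is an illustration of the underlying arithmetic only, not the clusterProfiler implementation; the function names and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Over-representation p value P(X >= k).

    N: genes in the background, K: background genes annotated to the pathway,
    n: genes in the DEG list, k: DEGs annotated to the pathway.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative counts only: 8 of 40 DEGs fall in a 100-gene pathway
# drawn from a 2000-gene background.
p = hypergeom_pvalue(k=8, n=40, K=100, N=2000)
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(p < 0.05, adj)
```

A pathway would then be reported only if its adjusted p value falls below the stated cutoff of 0.05.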
Comparison with a published transcriptional data set. Open data published by Aschenbrenner et al. of the German COVID-19 omics initiative was extracted from online resources for analysis 25 . Data of 20 COVID-19 patients with severe disease symptoms and 10 healthy control donors was downloaded. Functional enrichment analysis (gseGO-BP/MF/CC, enrichKEGG) was applied on the sample data and the same thresholds (as in the reference study) were used to calculate the GSEA for KEGG molecular pathways and GO ontology.
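For readers unfamiliar with GSEA, the idea behind the enrichment score can be sketched as a running sum over a ranked gene list. The sketch below uses the classical unweighted form of the statistic as a simplification; it does not reproduce the weighting or the significance estimation performed by the R functions named above, and the function name and toy gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    ranked_genes: genes ordered from most up- to most down-regulated.
    gene_set: collection of genes annotated to one pathway or GO term.
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap part, but not all, of the list")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up on a set member, step down otherwise.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking: set members at the top give a positive score ("activated"),
# set members at the bottom give a negative score ("suppressed").
ranking = ["a", "b", "c", "d"]
print(enrichment_score(ranking, {"a", "b"}))
print(enrichment_score(ranking, {"c", "d"}))
```

The sign of the score corresponds to the "activated" versus "suppressed" labels used when reporting pathways in the Results.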