Looking for pathways related to COVID-19: confirmation of pathogenic mechanisms by SARS-CoV-2–host interactome

In the last months, many studies have clearly described several mechanisms of SARS-CoV-2 infection at cell and tissue level, but the mechanisms of interaction between host and SARS-CoV-2, determining the grade of COVID-19 severity, are still unknown. We provide a network analysis on protein–protein interactions (PPI) between viral and host proteins to better identify host biological responses, induced by both whole proteome of SARS-CoV-2 and specific viral proteins. A host-virus interactome was inferred, applying an explorative algorithm (Random Walk with Restart, RWR) triggered by 28 proteins of SARS-CoV-2. The analysis of PPI allowed to estimate the distribution of SARS-CoV-2 proteins in the host cell. Interactome built around one single viral protein allowed to define a different response, underlining as ORF8 and ORF3a modulated cardiovascular diseases and pro-inflammatory pathways, respectively. Finally, the network-based approach highlighted a possible direct action of ORF3a and NS7b to enhancing Bradykinin Storm. This network-based representation of SARS-CoV-2 infection could be a framework for pathogenic evaluation of specific clinical outcomes. We identified possible host responses induced by specific proteins of SARS-CoV-2, underlining the important role of specific viral accessory proteins in pathogenic phenotypes of severe COVID-19 patients.


Introduction
Whilst COVID-19 predominantly affects the respiratory system, it is a multisystem disease, with a wide spectrum of clinical presentations from asymptomatic, mild and moderate, to severe, fulminant disease 1 . Host conditions and comorbidities (age, obesity, diabetes, hypertension, organ damages, inflammation and coagulation dysfunctionality), represented risk factors for severe and fatal disease courses 2 , but the mechanisms of host-SARS-CoV-2 interaction, activating pathological pathways and influencing severity, are still unknown.
Recently, several studies described many mechanisms of SARS-CoV-2 infection at cell and tissue level. It was observed that the replication of SARS-CoV-2, as well as all +RNA viruses, occurs in the cytoplasm of the host cell, inducing a membrane rearrangement of rough endoplasmic reticulum (ER) membranes into doublemembrane vesicles 3,4 . NSP8, NSP7 and NSP12, yield the RNA polymerase activity of NSP8, and are assembled into the replicase-transcriptase complex, generating antisense (−) RNAs, templates for positive-sense genome (+) and mRNA transcripts 5,6 . During virus entry, a wellknown process is the cleavage of S protein by FURIN on the cell membrane, which lead to the split S protein into two subunits, S1 and S2, which the last can interact with ACE2 (refs. 7,8 ). However, the SARS-CoV-2-host interaction is not restricted to local infection, but it triggers a systemic reaction, including the activation of the Bradykinin Storm, as described in many severe COVID-19 patients 9 . Indeed, SARS-CoV-2 infection causes from one side a decrease of ACE level in the lung cells and, on the other side, an increase of ACE2 level, leading to increase Bradykinin (BK) level 9 . BK is produced from an inactive pre-protein Kininogen-1 (KNG1) through the activation by the serine protease kallikrein 10 . Furthermore, the excess of BK can lead to vasodilatation, hypotension and hypokalaemia 9,11 , which is associated with arrhythmia 12,13 . All these clinical conditions have been widely reported in COVID-19 patients [14][15][16] . Furthermore, microvascular injuries, due to systemic inflammatory response and endothelial dysfunction, were frequently found in severe COVID-19 patients 17 . Moreover, myocardial injuries were found to be linked to the risk of fatal outcome in COVID-19 patients 15,18,19 . 43% of the patients with severe COVID-19 (21% of all COVID-19 patients) presented myocardial injury, showing an increased risk of myocardial injury of 4.74-fold in severe compared with non-severe patients 20 . Although microvascular injuries and microthrombi formation frequently occurred in severe COVID-19 patients, the role of SARS-CoV-2 in this phenotype is not completely explained yet.
Although many aspects of COVID-19 pathogenesis and mechanisms of SARS-CoV-2 have been investigated, only few papers describe the interactions among SARS-CoV-2 proteins by wet experiments. The influence of SARS-CoV-2 on transcriptome, proteome, ubiquitinome and phosphoproteome of a lung-derived human cell line, was described through a multi-omics approach.
Virus-host interactome by the computational approach has been applied to COVID-19 for drug repurposing 21 , allowing the identification of new drug targets 22 and contributing to explain clinical manifestations 23,24 . The structural information on SARS-CoV-2 proteins and their interactions with human proteins and other viral proteins, allowed to better understand the mechanisms of SARS-CoV-2 infection, also comparing it with SARS-CoV 25 .
One of the most interesting models of SARS-CoV-2-host interactions was carried out on Spike-receptor interactions in other Human Coronaviruses (H-CoV). This network-based analysis of H-CoV-host interaction has provided a theoretic model for H-CoV infections applicable to SARS-CoV-2 pathogenesis, revealing biologically and clinically relevant molecular targets of three H-CoV infections 26 .
The interactome based on PPI and gene expression data, have been applied also to uncover the molecular origins of phenotypes of other complex diseases 27,28 , in order to better define the mechanisms of COVID-19 pathogenesis.
The understanding of all mechanisms of SARS-CoV-2 infection also passes by overall visualization of biological reactions and pathways involved in COVID-19 and H-CoV infections [29][30][31] .
In this context, defining the host response induced by specific viral proteins would be of great importance and can guide the identification of functional viral targets, helping to better define the pathologic phenotypes of the infection.
Here, we carry out a network analysis on PPI to better identify host biological response induced by SARS-CoV-2. Furthermore, the interactome analysis was applied to design network of SARS-CoV-2-host proteins that could lead to a Bradykinin Storm. We used three different applications to identify interacting proteins: 200 proteins that interact closely with 28 SARS-CoV-2 proteins, 50 protein interactomes around each SARS-CoV-2 protein and 200 protein associated with KNG1, the pre-cursor of BK. The proteins found in all three approaches were analysed by gene enrichment and pathways associated with proteins were identified analysis ( Fig. 1).

Interactions with the whole proteome of SARS-CoV-2
To define the massive effect of SARS-CoV-2 infection on the host cell, an interactome between the entire set of SARS-CoV-2 proteins and host was carried out. Observing the colours' distribution, it is possible to distinguish three different areas, corresponding to as many subcellular districts, where SARS-CoV-2 proteins seem to be distributed (Fig. 2). In fact, N and some nonstructural proteins (nsp1, nsp4 and nsp15) are posed around nuclear proteins, while S, along with nsp5, nsp10, nsp12, nsp13, nsp14, nsp16, ORF1a and ORF9b, were among cytosolic and membrane proteins, at the bounder of the interactome. Protein M and many accessory and not-structural proteins (NS7b, nsp6, nsp7, ORF3a, ORF7a, ORF8, ORF10, and ORF14) are around mitochondrial and endoplasmic proteins. At first, Gene Enrichment Analysis on Gene Ontology databases highlighted the processes DNA replication, synthesis of RNA primer (GO:0006269) and peptide antigen transport (GO:0046968) were the most enriched (66.33 fold enrichment and False Discovery Rate, FDR < 5%). To further dissect the interactions with the entire proteome of SARS-CoV-2 and to better understand which pathways could be involved, enrichment analysis was carried out on proteins reported in the interactome, using WikiPathways and Kyoto Encyclopaedia of Genes and Genomes (KEGG) databases. WikiPathways gene enrichment analysis revealed biological pathways of DNA replication, ubiquitination, and proteasome, with high significance (FDR < 0.01%). KEGG pathway enrichment analysis revealed DNA and RNA replication pathways as the most significant pathways (FDR < 0.01%), as well as signalling pathways and viral infection pathways ( Supplementary Fig. 1). We described the high values of betweenness centrality and degree for nine host proteins: two proteins multi-organelle amyloid-beta precursor protein (APP) and dual-specificity protein phosphatase (PTEN); the membrane protein Sodium/ potassium-transporting ATPase subunit alpha-1 (ATP1A1); two cytoplasmic proteins, 14-3-3 protein theta (YWHAQ) and Ubiquitin (UBC); two proteins of ER and Golgi apparatus, Sarcoplasmic/ER calcium ATPase 2 (ATP2A2) and Unconventional myosin-VI

Host response to SARS-CoV-2 infection Pathways modulated by specific SARS-CoV-2 protein
Gene Enrichment analysis  Table 1).

Host response induced by the specific protein of SARS-CoV-2
To define the effect of specific viral proteins on host response, interactomes were built, choosing one specific viral protein as seed, imposing to find the 50 closest proteins. These analyses allowed to produce 28 restricted interactomes, which defined the strictest biological interactions associated to seed proteins both among human proteins and other viral proteins. In Supplementary Fig. 3 the interactomes for structural proteins were reported, while in Supplementary Figs. 4 and 5 the interactomes for accessory and non-structural proteins were plotted.
Most of the interactomes have few viral proteins into the reconstruct PPI network (i.e., M, nsp1, and ORF3a), while into other interactomes the viral seed protein shared own human target proteins with other viral proteins or interacts directly with other viral proteins, such as nsp3, nsp10, ORF6 and ORF10.
The lists of proteins for each reduced interactome were submitted to gene enrichment analysis on WikiPathways and KEGG pathway databases (Supplementary Tables 2  and 3, respectively). For each single viral protein interactome, the top 5% with the smallest p values of gene enrichment analysis was selected, reporting for KEGG and WikiPathways in Fig. 3A and B, respectively.
For KEGG database the gene enrichment analysis on interactomes of NS7b, ORF1a, ORF3a, and ORF8 showed pathway clusters highly significant and consistent with possible pathogenic mechanisms, such as the activation of the complement and of the coagulative cascade 32 , and the TGF-β-dominated immune response 33 . In particular, the NS7b interactome revealed SNARE interactions in the vesicular transport pathway, describing the mechanisms of intracellular vesicle trafficking and secretion, as the most significant pathway among all interactome (FDR < 0.00001%). The ORF1a interactome revealed a cluster of pathways, composed by Adherent junctions, Bacteria host infection mechanisms, Regulation of actin cytoskeleton and phagocytosis, all showing the direct involvement of these accessory proteins in a viral entry in host cells (FDR < 0.001%). The ORF8 interactome showed a pathway cluster involving complement and coagulation cascades and cardiovascular pathology (FDR < 0.3%). The ORF3a interactome revealed the TGF-β and Cytokine signalling, endocytosis, and Vasopressin-regulation pathways (FDR < 0.01%). From pathway point of view, Proteasome pathway was involved in the interactome of M protein as well as for ORF7a, ORF8 (FDR < 0.05%) (Fig. 3A). For WikiPathways database the gene enrichment analysis showed results consistent with KEGG gene enrichment for ORF1a, ORF3a and ORF8, and added new pathway cluster for nsp1 (FDR < 0.00001%) with DNA replication and modification pathways (Fig. 3B).

Computational investigation on Bradykinin Storm in COVID-19
As the virus-host interactome has been efficient to describe the response to specific viral proteins, this method was applied to reconstruct the possible involvement of SARS-CoV-2 proteins in triggering the Bradykinin Storm during the infection. In this case, KNG1, precursor of bradykinin made by proteolysis cleavage, was considered as seed for RWR, imposing 200 closest proteins as limit to stop the algorithm. The resulting interactome showed NS7b, ORF3a, ORF8, and S as proximal to the seed protein, suggesting their role in the modulation of biological processes around KNG1. Indeed, it would provide a direct influence of NS7b and ORF3a proteins to activation of BK. In fact, both NS7b and ORF3a interact with the same target, endothelin converter enzyme (ECE1), and cell surface endopeptidase that converts big endothelin-1 to pressure peptide endothelin-1 (ET-1) and inactivates BK (Fig. 4A). The enrichment analysis on WikiPathways and KEGG databases revealed biological pathways of Complement and coagulation cascades (FDR < 0.000001%). Finally, to quantify the direct effect of S, ORF8, ORF3a, NS7b on Complement and coagulation cascades pathways, all proteins that belonged to the most significantly enriched pathways, were highlighted on the interactome: Human Complement System WP2806 for WikiPathways (22 of 97 genes; adj P value 2.9E−21) and Complement and coagulation cascades for KEGG (24 of 79 genes; adj P value 5.96E−27). ORF8 showed the direct interactions with Fibrinogen alpha chain (FGA), described in both pathways, and urokinase-type plasminogen activator (PLAU), tissue-type plasminogen activator (PLAT) and Plasminogen activator inhibitor 1 (SERPINE1).

Discussion
In this study, we built a functional interactome between SARS-CoV-2 and human proteome through RWR algorithm, to identify biological mechanisms and cell responses during SARS-CoV-2 infection and to propose a model of infection contributing to a better understanding of COVID-19 pathogenesis.
SARS-CoV-2-host interactome allowed to mirror likely mechanisms of infections in different subcellular districts.
The high likelihood of our model to in vivo real mechanisms strengthens the representative power of the interactome and the explorative algorithm to simulate biological processes.
Moreover, the proximity among specific viral proteins into the network, such as nsp7, nsp8, nsp9, nsp10, nsp12, nsp13, nsp14 involved in the viral RNA replication, can be a good way to investigate their possible function in viral infection. The closeness among S, nsp5, nsp16, ORF9b in the network is difficult to interpret, but S, nsp5 and ORF9b, along with N, had significant positive responses of IgG antibody in sera of COVID-19 patients 34 . In SARS-CoV infected cell culture, the location of nsp2 in the cytoplasm and to some extent in the nucleus, as well as ORF3a, ORF7b, ORF6 and M in ER, seems to be consistent with their location among nuclear and cytoplasmatic, and ER proteins respectively 35 .
The involvement of proteasome and ubiquitination pathways, along with RNA replication, represent the principal pathways activated for assembly and replication of SARS-CoV-2 (ref. 36 ). In fact, strong increases in RNAmodifying proteins were revealed in cell culture after infection with SARS-CoV-2 (ref. 37 ). The ubiquitinproteasome system deletes viral proteins to control the infection, but the virus can use them for its propagation 38 .
The interactomes, built around a single protein of SARS-CoV-2, allowed to draw effects on cell, involving specific pathways. Vesicular transport mechanism by SNARE interactions, identified for NS7b, is used by pathogens to penetrate host cells through their membranes and in particular in SARS-CoV 39,40 .
The pathways showed in ORF3a and ORF8 interactomes suggested a possible modulation of the coagulation cascade and cardiovascular pathology in COVID-19 and the involvement in Cytokine storm and water homoeostasis in endothelial tissue, respectively.
Although the biological function of the ORF8 protein of SARS-CoV-2 remains unclear, the role of ORF8 in severe COVID-19 outcome might be supported by SARS-CoV-2 variant with a 382-nucleotide deletion (Δ382) found in Singapore in January-February 2020 linked to mild forms of COVID-19. This deletion truncates ORF 7b and locks ORF8 transcription and would be associated to less cytokine releasing during the acute phase of infection 41,42 .
The systemic inflammatory response and the relied endothelial dysfunction might be responsible for the microvascular injuries resulting in the formation of microthrombi, as well as a featuring microvascular thrombosis and haemorrhage pathology, viewed in the lung of COVID-19 patients 43 . The main pathways described in ORF3a and ORF8 interactomes, are consistent with the microvascular injuries and the development of cardiovascular complications, i.e., heart failure, myocarditis, pericarditis, vasculitis and cardiac arrhythmias observed in COVID-19 patients 44 . Such featuring symptoms, suggest specific mechanisms of host-virus interactions, and reinforce the role of ORF3a and ORF8 in the COVID-19 pathogenesis.
To study the pathogenesis of COVID-19 in a systemic context, Bradykinin Storm was investigated by the interactome approach. We showed NS7b and ORF3a interacting with ECE1, which can inactivate BK 45 . Such observation suggests specific mechanisms of host-virus interactions, occurring in severe COVID-19 patients and reinforce the role of ORF3a to enhance the bradykinin dysregulation.
These host-virus interactions could enhance a better viral fitness during the infection, as suggested by the interaction already observed with SARS-CoV between NS7b in ECE1 (ref. 36 ).
This interaction system might suggest a hypothesis that can explain the possible enhancing effect on Bradykinin Storm in COVID-19: the locking of ECE1 activity due to NS7b and especially ORF3a interaction, could reduce the activation of the vasoconstrictor endotelin-1, and amplify the vasodilatation effect of BK.  The ECE1 gene is much more expressed in all the tissues than ACE2 and ACE, especially in the lungs, where ECE1 is 128 fold and 3.8 fold expressed compared to ACE2 and ACE, respectively 46 . Significant downexpression of ECE1 protein was reported in endothelial cells infected by SARS-CoV-2 in vitro, resulting clearly consistent with effect of SARS-CoV-2 infection on BK modulation 47 .
BK activation is associated with coagulation disorders: F12 activates plasma kallikrein (PK), which releases bradykinin by cleavage 48 .
The results of our network-based model for the pathogenesis of SARS-CoV-2 mirror likely clinical manifestations in severe COVID-19 patients, noting that analysis of published data about COVID-19 allows to synthesise and describe complex aspect of pathogenesis, as reported for inflammatory cytokines storm in COVID-19-induced respiratory failure 49 .
This theoretical approach, combined with proteomic analysis in COVID-19 patients, might suggests new targets for specific treatments, as described for interaction between nsp10 and NKRF to mediate IL-8 expression, proposing this molecular mechanism as possible therapeutic target for anti-IL-8 and JUK inhibitor monoclonal antibodies 50 .
In conclusion, we developed a network-based model for SARS-CoV-2 infection, which could be a framework for pathogenic evaluation of specific clinical outcomes.
Here, the PPI interactomes were used to identify mechanistic processes of the viral molecular machinery during SARS-CoV-2 infection, for viral survival and replication within the host. With this knowledge, proteins interactions crucial for pathogenesis could be discerned. We identified different host response induced by specific proteins of SARS-CoV-2, underlining the important role of ORF3a and ORF8 in phenotypes of severe COVID-19 patients. The interactome approach applied to the identification of biological reactions around KNG1, allowed to view NS7b and ORF3a interactions with ECE1, which might have a role to enhance the Bradykinin storm. This network-based model of SARS-CoV-2-host interaction could guide to develop novel treatments against specific viral proteins.

Materials and methods
The virus-host interactome was made by merging SARS-CoV-2-host interaction data (PPI) from Intact 51 , with data from human PPI databases, such as BioGrid, InnateDB-All, IMEx, IntAct, MatrixDB, MBInfo, MINT, Reactome, Reactome-FIs, UniProt, VirHostNet, BioData, obtained by R packages PSICQUIC and biomaRt 52,53 . For SARS-CoV-2-host interaction experimentally obtained, 1407 protein interactors and 2305 interactions were downloads from IntAct at 12 September 2020. 28 -2 proteins are used for all the analyses: E, M,  N, S, nsp1, nsp2, nsp3, nsp4, nsp5, nsp6, nsp7, nsp8, nsp9,  nsp10, nsp12, nsp13, nsp14, nsp15, nsp16, ORF1a, ORF3a,  ORF6, ORF7a, NS7b, ORF8, ORF9b, ORF10, ORF14. To obtain functional information about PPI in vitro, to simulate biological reactions in infected cells and to explore cell response against viral infection, we employed a RWR algorithm, a state-of-the-art guilt-by-association approach 54 . This algorithm allows to establish a proximity network from a given protein (seed), to study its functions, based on the premise that nodes related to similar functions tend to lie close to each other in the network. For this study, two types of interactome were performed: the whole SARS-CoV-2-host interactome and an interactome per single SARS-CoV-2 protein.

SARS-CoV
In the first imputation, every protein of SARS-CoV-2 proteome was used as seeds, with the limit of 200 closest host's proteins to every SARS-CoV-2 protein, every protein of SARS-CoV-2 proteome was chosen as seeds for RWR algorithm, imposing the limit of 200 closest host proteins to every SARS-CoV-2 protein. This analysis allowed to design SARS-CoV-2-host interactome, where the different colours correspond to the localization of the proteins within the cell.
For the second kind of interactome, one SARS-CoV-2 per time was used as seed, lowering to the closest 50 proteins to define the induced biological response as better as possible.
For each node, a score was computed as a measure of proximity to the seed protein 26 . In total, a large PPI interaction database was assembled, including 13334 nodes and 73584 interactions. Graphical representations of networks were performed by GEPHI 0.9.2 (ref. 55 ). To identify hub protein in the SARS-CoV-2-host interactome, the values of betweenness centrality and degree were plotted. Betweenness centrality score measure how a specific node is in-between other nodes and then can be considered a hub, while the degree of node corresponds to number of connections.
Pathways of proteins involved in host response were tested by gene enrichment analysis on KEGG human pathways and WikiPathways databases 56 , the biological processes were identified by PANTHER GO 57 . To allow gathering of results for every running, the R package enrichR was used, an R interface to web-based tool 'Enrichr' for analysing gene sets 58 . The Enrichr analysis was performed using these statistical parameters: p-value (Fisher exact test), q-value (adjusted p-value for False Discovery Rate, FDR). Results for KEGG and WikiPathways were considered significant with a revised p-value < 0.05. To infer pathways involved in a single viral interactome, gene enrichment analyses for each viral interactome were collected along with p-values, as reported in enrichR package output. P-values of every single enrichment analyses were transformed by the function x = −log10 (p-value) and the 5% of x values were plotted on heatmap by R package pheatmap 59 .

Limitations
There are many experimental platforms for deriving such physical interactions, such as affinity purification mass-spectrometry (AP-MS) and yeast-two-hybrid (Y2H), which enable the accurate identification of interactions with a relatively long time.
The scenario reported in this study refers to few experimental data available on public databases and could be different respect to real phenotypes of COVID-19 patients.
The pathways' analysis did not consider tissue and celltype diversity. Finally, the low threshold established for the number of nodes found by RWR (200) limited the reconstruction of the entire pathways. However, this was a software-imposed threshold. Although such a networkbased approach showed great potential in identifying mechanisms not yet observed, experimental tests will be necessary to confirm what we have described. This study does not report clinical data or the outcomes of patients, but our results resulted highly consistent with clinical cases reported in literature 43,44,49 and in vitro studies 21,36,37,47 . Moreover, this approach should be used to design adaptive clinical or experimental trials.