A structural homology approach to identify potential cross-reactive antibody responses following SARS-CoV-2 infection

The emergence of the novel SARS-CoV-2 virus is the most important public-health issue of our time. Understanding the diverse clinical presentations of the ensuing disease, COVID-19, remains a critical unmet need. Here we present a comprehensive listing of the diverse clinical indications associated with COVID-19. We explore the theory that anti-SARS-CoV-2 antibodies could cross-react with endogenous human proteins driving some of the pathologies associated with COVID-19. We describe a novel computational approach to estimate structural homology between SARS-CoV-2 proteins and human proteins. Antibodies are more likely to interrogate 3D-structural epitopes than continuous linear epitopes. This computational workflow identified 346 human proteins containing a domain with high structural homology to a SARS-CoV-2 Wuhan strain protein. Of these, 102 proteins exhibit functions that could contribute to COVID-19 clinical pathologies. We present a testable hypothesis to delineate unexplained clinical observations vis-à-vis COVID-19 and a tool to evaluate the safety-risk profile of potential COVID-19 therapies.

Aligning proteins. The ProBiS algorithm searches for structurally similar sites on a local scale by finding all matches to each query protein chain within the proteome list supplied. The alignment of two protein chains, one from the query protein and one from the human proteome, is based on finding similar regions between the two chains.
The benefit of the ProBiS algorithm to this exercise is that it is focused on local alignments. The algorithm will begin with vertices relating to three amino acid residue surface regions and expand outwards along the backbone.
Drawing protein structures. Graphical protein structures were created using PyMOL v2.3.5 24 .
Quantification and statistical analysis. Each match between a query chain and a proteome chain is scored in four ways:
• Surface vectors angle. A vector orthogonal to the geometric mean of the surface of each protein at the aligned area is calculated, and the angle between these two vectors is computed. If the angle is less than 90° (1.571 rad), the alignment is retained in the results.
• Surface patch root mean squared distance (RMSD). The distance between each set of vertices in the match is calculated, and the RMSD is obtained from the following formula:
$$\mathrm{RMSD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left[(x_{q,i}-x_{h,i})^{2}+(y_{q,i}-y_{h,i})^{2}+(z_{q,i}-z_{h,i})^{2}\right]}$$
where each vertex exists in three-dimensional space and has x, y, and z coordinates; the subscripts q and h denote the query and human vertices of the i-th matched set; and there are n sets of vertices.
• Surface patch size. The algorithm requires that alignments contain at least 10 vertices.
• E-values. E-values for a particular alignment are calculated using the Karlin-Altschul equation 25 , in the same way that the quality of matches is evaluated in a sequence alignment:
$$E = k\,m\,n\,e^{-\lambda S}$$
where: E is the Expect-value of the alignment (E-value); k and λ are constants (k = 0.134 and λ = 0.3176, the often-used values for an ungapped alignment in a structural homology search); m is the number of vertices in the query (SARS-CoV-2) alignment fragment; n is the number of vertices in the library (human protein); and S is the substitution score, calculated as the sum of the scores for each substitution using a BLOSUM62 matrix.
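The scores described above can be illustrated with a minimal Python sketch. It is not the ProBiS implementation: the function names are hypothetical, the RMSD uses the standard definition over n matched vertex pairs, and the E-value uses the Karlin-Altschul form E = k·m·n·e^(−λS) with the constants quoted in the text. The substitution score S is taken as an input rather than computed from a BLOSUM62 matrix.

```python
import math

# Constants quoted in the text for the Karlin-Altschul E-value.
K = 0.134
LAMBDA = 0.3176


def vector_angle(u, v):
    """Angle in radians between two 3D surface vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_theta)


def patch_rmsd(query_vertices, human_vertices):
    """Standard RMSD over matched (x, y, z) vertex pairs."""
    if not query_vertices or len(query_vertices) != len(human_vertices):
        raise ValueError("need two non-empty vertex lists of equal length")
    squared = sum(
        (qx - hx) ** 2 + (qy - hy) ** 2 + (qz - hz) ** 2
        for (qx, qy, qz), (hx, hy, hz) in zip(query_vertices, human_vertices)
    )
    return math.sqrt(squared / len(query_vertices))


def e_value(m, n, score):
    """Karlin-Altschul E-value for query size m, library size n, score S."""
    return K * m * n * math.exp(-LAMBDA * score)


def passes_geometric_filters(angle_rad, n_vertices):
    """Retention rules stated in the text: angle < 90 degrees, >= 10 vertices."""
    return angle_rad < math.pi / 2 and n_vertices >= 10
```

The text gives explicit cut-offs only for the angle and the patch size, so the sketch leaves any RMSD or E-value threshold to the caller.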
GO term and KEGG pathway analysis. Both analyses were performed in R 26 and calculated using the topGO 27 package. The universe of genes for both analyses was the list of proteins for which a PDB file was available rather than the entire human proteome. All p-values were adjusted using the Benjamini-Hochberg method 28 as implemented in R. A threshold of 0.05 on the adjusted p-values was used as the cut-off for significance.
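For readers unfamiliar with the Benjamini-Hochberg adjustment, the following Python sketch reproduces the step-up procedure. The analyses in this study were run in R; this re-implementation and its function name are illustrative assumptions, not the code used here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(p_values, alpha=0.05):
    """Flag terms whose adjusted p-value falls below the threshold."""
    return [p < alpha for p in benjamini_hochberg(p_values)]
```

Each raw p-value is scaled by n divided by its rank, and a running minimum from the largest p-value downwards keeps the adjusted values monotone.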
Identification of human proteins with structural homology to SARS-CoV-2 proteins. A detailed description of the novel computational workflow we used to identify endogenous human proteins with structural homology to SARS-CoV-2 proteins is provided in the "Methods" section. The human and SARS-CoV-2 proteins used in this analysis are listed in Supplementary Tables 1 and 2, respectively. Identifying structural homologies relies on parsing databases of known protein structure files down to non-redundant lists of host organism proteins, splitting those proteins into distinct macromolecular chains, generating surface structures for each chain, and matching endogenous human chains to SARS-CoV-2 protein chains. For this study, homology between a SARS-CoV-2 protein and an endogenous human protein was assessed on four criteria: (i) the angles of the orthogonal vectors of the surface patches, (ii) the root mean squared distances between each set of vertices, (iii) the size of the surface patches, and (iv) the E-value of the match. The rationale for these criteria is built into the software implementation of the ProBiS algorithm 22,23 . Using these criteria, we identified 346 human proteins with structural homology to SARS-CoV-2 proteins. These proteins are listed in Supplementary Table 4. A full record of the matches, including alignment lengths, chains and scores, is provided in Supplementary Table 5. An illustrative example is provided in Fig. 2, showing the structural alignment between SARS-CoV-2 spike protein chain A (Protein Data Bank (PDB) ID: 6LXT) and the human protein complement factor B (PDB ID: 1RTK).
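The chain-splitting step of this workflow can be illustrated with a simplified Python sketch. It assumes fixed-column PDB-format ATOM records (chain identifier in column 22, coordinates in columns 31 to 54) and a hypothetical function name. It is not the parser used in this study and ignores details such as alternate locations, multiple models and HETATM records.

```python
from collections import defaultdict


def split_chains(pdb_text):
    """Group ATOM coordinates from PDB-format text by chain identifier."""
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        # Keep only ATOM records long enough to hold the coordinates.
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        chain_id = line[21]       # column 22
        x = float(line[30:38])    # columns 31-38
        y = float(line[38:46])    # columns 39-46
        z = float(line[46:54])    # columns 47-54
        chains[chain_id].append((x, y, z))
    return dict(chains)
```

Each returned chain is a list of (x, y, z) tuples that a downstream step could use to build a surface representation.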

Clinical indications associated with SARS-CoV-2 infections. A list of COVID-19-related clinical pathologies was compiled following a review of the medical literature. COVID-19-related clinical pathologies were categorized as general (involving multiple organ systems), as involving the immune system (immunopathology), or as specific to an individual organ system, including lungs (pneumopathy), heart (cardiopathy), blood and clotting (hemopathy and coagulopathy), liver (hepatopathy), kidney (nephropathy), gastrointestinal tract, or brain (neuropathy). In Table 1 we provide a comprehensive list of the clinical indications and symptoms associated with SARS-CoV-2 infections, with a numerical code for each indication/symptom. The numerical codes were used for easy cross-referencing to human proteins that could potentially be (i) associated with COVID-19-related clinical pathologies based on their known function and (ii) targeted by anti-SARS-CoV-2 antibodies (see below).

Human structural homologues of SARS-CoV-2 proteins and clinical pathologies associated with COVID-19.
We also assessed which of the 346 human proteins found to have structural homology to SARS-CoV-2 could be associated with reported COVID-19 pathologies. General literature search strategies included the gene name in addition to key terms such as coronavirus, molecular mimicry, autoimmunity, pathology, knock-out/down in some combination within the PubMed search engine. In some instances, no hits were available to provide evidence of connection to COVID-19 pathologies. However, theoretical associations were hypothesized based on protein localization and function. Protein candidates were tiered. Proteins expressed on the plasma membrane surface or secreted into the extracellular space were considered to have the highest probability of being bound and inhibited by a cross-reactive anti-SARS-CoV-2 antibody. Based on this strategy, of the 346 human proteins that are structural homologs of SARS-CoV-2 proteins, 102 were identified as having biological functions that could be associated with COVID-19 pathologies or symptoms upon inhibition by cross-reactive antibody responses. We have tabulated and encoded clinical indications associated with COVID-19. The COVID-19-related clinical indications potentially associated with each of these 102 human proteins are shown in Table 2.
The biology of human structural homologues of SARS-CoV-2. We ...

Binds water, Ca(2+), Na(+), K(+), fatty acids, hormones, bilirubin and drugs (Probable). Its main function is the regulation of the colloidal osmotic pressure of blood (Probable). Major zinc transporter in plasma, typically binds about 80% of all plasma zinc. Major calcium and magnesium transporter in plasma, binds approximately 45% of circulating calcium and magnesium in plasma (By similarity). Potentially has more than two calcium-binding sites and might additionally bind calcium in a non-specific manner (By similarity). The shared binding site between zinc and calcium at residue Asp-273 suggests a crosstalk between zinc and calcium transport in the blood (By similarity). The rank order of affinity is zinc > calcium > magnesium (By similarity). Binds to the bacterial siderophore enterobactin and inhibits enterobactin-mediated iron uptake of E. coli from ferric transferrin and may thereby limit the utilization of iron and growth of enteric bacteria such as E. coli. Does not prevent iron uptake by the bacterial siderophore aerobactin.

Chemoattractant active on T-lymphocytes and monocytes but not neutrophils. Activates the C-X-C chemokine receptor CXCR4 to induce a rapid and transient rise in the level of intracellular calcium ions and chemotaxis. SDF-1-beta(3-72) and SDF-1-alpha show a reduced chemotactic activity. Binding to cell surface proteoglycans seems to inhibit formation of SDF-1-alpha and thus to preserve activity on local sites. Also binds to atypical chemokine receptor ACKR3, which activates the beta-arrestin pathway and acts as a scavenger receptor for SDF-1. Binds to the allosteric site (site 2) of integrins and activates integrins ITGAV:ITGB3, ITGA4:ITGB1 and ITGA5:ITGB1 in a CXCR4-independent manner. Acts as a positive regulator of monocyte migration and a negative regulator of monocyte adhesion via the LYN kinase.
Stimulates migration of monocytes and T-lymphocytes through its receptors, CXCR4 and ACKR3, and decreases monocyte adherence to surfaces coated with ICAM-1, a ligand for beta-2 integrins. SDF1A/CXCR4 signaling axis inhibits beta-2 integrin LFA-1 mediated adhesion of monocytes to ICAM-1 through LYN kinase. Inhibits ...

Receptor tyrosine kinase which binds promiscuously transmembrane ephrin-B family ligands residing on adjacent cells, leading to contact-dependent bidirectional signaling into neighboring cells. The signaling pathway downstream of the receptor is referred to as forward signaling while the signaling pathway downstream of the ephrin ligand is referred to as reverse signaling. Generally has an overlapping and redundant function with EPHB2. Like EPHB2, functions in axon guidance during development regulating for instance the neurons forming the corpus callosum and the anterior commissure, 2 major interhemispheric connections between the temporal lobes of the cerebral cortex.
In addition to its role in axon guidance plays also an important redundant role with other ephrin-B receptors in development and maturation of dendritic spines and the formation of excitatory synapses. Controls other aspects of development through regulation of cell migration and positioning. This includes angiogenesis, palate development and thymic epithelium development for instance. Forward and reverse signaling through the EFNB2/EPHB3 complex also regulate migration and adhesion of cells that tubularize the urethra and septate the cloaca. Finally, plays an important role in intestinal epithelium differentiation segregating progenitor from differentiated cells in the crypt.

Tyrosine-protein kinase that acts as cell-surface receptor for fibroblast growth factors and plays an essential role in the regulation of cell proliferation, differentiation and apoptosis. Plays an essential role in the regulation of chondrocyte differentiation, proliferation and apoptosis, and is required for normal skeleton development. Regulates both osteogenesis and postnatal bone mineralization by osteoblasts. Promotes apoptosis in chondrocytes, but can also promote cancer cell proliferation. Required ...

Key player in the regulation of energy balance and body weight control. Once released into the circulation, has central and peripheral effects by binding LEPR, found in many tissues, which results in the activation of several major signaling pathways. In the hypothalamus, acts as an appetite-regulating factor that induces a decrease in food intake and an increase in energy consumption by inducing anorexinogenic factors and suppressing orexigenic neuropeptides, also regulates bone mass and secretion of hypothalamo-pituitary-adrenal hormones. In the periphery, increases basal metabolism, influences reproductive function, regulates pancreatic beta-cell function and insulin secretion, is pro-angiogenic for endothelial cell and affects innate and adaptive immunity.
In the arcuate nucleus of the hypothalamus, activates by depolarization POMC neurons inducing FOS and SOCS3 expression to release anorexigenic peptides and inhibits by hyperpolarization NPY neurons inducing SOCS3 with a consequent reduction on release of orexigenic peptides. In addition to its known satiety inducing effect, has a modulatory role in nutrient absorption. In the intestine, reduces glucose absorption by enterocytes by activating PKC and leading to a sequential activation of p38, PI3K and ERK signaling pathways which exerts an inhibitory effect on glucose absorption. Acts as a growth factor on certain tissues, through the activation of different signaling pathways increases expression of genes involved in cell cycle regulation such as CCND1, via JAK2-STAT3 pathway, or VEGFA, via MAPK1/3 and PI3K-AKT1 pathways. May also play an apoptotic role via JAK2-STAT3 pathway and up-regulation of BIRC5 expression. Pro-angiogenic, has mitogenic activity on vascular endothelial cells and plays a role in matrix remodeling by regulating the expression of matrix metalloproteinases (MMPs) and tissue inhibitors of metalloproteinases (TIMPs). In innate immunity, modulates the activity and function of neutrophils by increasing chemotaxis and the secretion of oxygen radicals. Increases phagocytosis by macrophages and enhances secretion of pro-inflammatory mediators. Increases cytotoxic ability of NK cells. Plays a pro-inflammatory role, in synergy with IL1B, by inducing NOS2 which promotes the production of IL6, IL8 and Prostaglandin E2, through a signaling pathway that involves JAK2, PI3K, MAP2K1/MEK1 and MAPK14/p38. In adaptive immunity, promotes the switch of memory T-cells towards T helper-1 cell immune responses.
Increases CD4(+)CD25(-) T-cell proliferation and reduces autophagy during TCR (T-cell ...

The JNK-interacting protein (JIP) group of scaffold proteins selectively mediates JNK signaling by aggregating specific components of the MAPK cascade to form a functional JNK signaling module. May function as a regulator of vesicle transport, through interactions with the JNK-signaling components and motor proteins (By similarity). Promotes neuronal axon elongation in a kinesin- and JNK-dependent manner. Activates cofilin at axon tips via local activation of JNK, thereby regulating filopodial dynamics and enhancing axon elongation. Its binding to kinesin heavy chains (KHC), promotes kinesin-1 motility along microtubules and is essential for axon elongation and regeneration. Regulates cortical neuronal migration by mediating NTRK2/TRKB anterograde axonal transport during brain development (By similarity). Acts as an adapter that bridges the interaction between NTRK2/TRKB and KLC1 and drives NTRK2/TRKB axonal but not dendritic anterograde transport, which is essential for subsequent BDNF-triggered signaling and filopodia formation.

Ezrin-radixin-moesin (ERM) family protein that connects the actin cytoskeleton to the plasma membrane and thereby regulates the structure and function of specific domains of the cell cortex. Tethers actin filaments by oscillating between a resting and an activated state providing transient interactions between moesin and the actin cytoskeleton. Once phosphorylated on its C-terminal threonine, moesin is activated leading to interaction with F-actin and cytoskeletal rearrangement. These rearrangements regulate many cellular processes, including cell shape determination, membrane transport, and signal transduction. The role of moesin is particularly important in immunity acting on both T and B-cells homeostasis and self-tolerance, regulating lymphocyte egress from lymphoid organs. Modulates phagolysosomal biogenesis in macrophages (By similarity).
Participates also in immunologic synapse formation.

Phosphoinositide-3-kinase (PI3K) that phosphorylates PtdIns(4,5)P2 (Phosphatidylinositol 4,5-bisphosphate) to generate phosphatidylinositol 3,4,5-trisphosphate (PIP3). PIP3 plays a key role by recruiting PH domain-containing proteins to the membrane, including AKT1 and PDPK1, activating signaling cascades involved in cell growth, survival, proliferation, motility and morphology. Links G-protein coupled receptor activation to PIP3 production. Involved in immune, inflammatory and allergic responses. Modulates leukocyte chemotaxis to inflammatory sites and in response to chemoattractant agents. May control leukocyte polarization and migration by regulating the spatial accumulation of PIP3 and by regulating the organization of F-actin formation and integrin-based adhesion at the leading edge. Controls motility of dendritic cells. Together with PIK3CD is involved in natural killer (NK) cell development and migration towards the sites of inflammation. Participates in T-lymphocyte migration.
Regulates T-lymphocyte proliferation and cytokine production.
Together with PIK3CD participates in T-lymphocyte development. Required for B-lymphocyte development and signaling. Together with PIK3CD participates in neutrophil respiratory burst. Together with PIK3CD is involved in neutrophil chemotaxis and extravasation. Together with PIK3CB promotes platelet aggregation and thrombosis. Regulates alpha-IIb/beta-3 integrins (ITGA2B/ITGB3) adhesive function in platelets downstream of P2Y12 through a lipid kinase activity-independent mechanism. May have also a lipid kinase activity-dependent function in platelet aggregation. Involved in endothelial progenitor cell migration. Negative regulator of cardiac contractility. Modulates cardiac contractility by anchoring protein kinase A (PKA) and PDE3B activation, reducing cAMP levels. Regulates cardiac contractility also by promoting beta-adrenergic receptor internalization by binding to GRK2 and by non-muscle tropomyosin phosphorylation. Also has serine/threonine protein kinase activity: both lipid and protein kinase activities are required for beta-adrenergic receptor endocytosis. May also have a scaffolding role in modulating cardiac contractility. Contributes to cardiac hypertrophy under pathological stress. Through simultaneous binding of PDE3B to RAPGEF3 and PIK3R6 is assembled in a signaling complex in which the PI3K gamma complex is activated by RAPGEF3 and which is involved in ...

Serine/threonine protein kinase that acts as key mediator of the nitric oxide (NO)/cGMP signaling pathway. GMP binding activates PRKG1, which phosphorylates serines and threonines on many cellular proteins. Numerous protein targets for PRKG1 phosphorylation are implicated in modulating cellular calcium, but the contribution of each of these targets may vary substantially among cell types.
Proteins that are phosphorylated by PRKG1 regulate platelet activation and adhesion, smooth muscle contraction, cardiac function, gene expression, feedback of the NO-signaling pathway, and other processes involved in several aspects of the CNS like axon guidance, hippocampal and cerebellar learning, circadian rhythm and nociception. Smooth muscle relaxation is mediated through lowering of intracellular free calcium, by desensitization of contractile proteins to calcium, and by decrease in the contractile state of smooth muscle or in platelet activation. Regulates intracellular calcium levels via several pathways: phosphorylates IRAG1 and inhibits IP3-induced Ca(2+) release from intracellular stores, phosphorylation of KCNMA1 (BKCa) channels decreases intracellular Ca(2+) levels, which leads to increased opening of this channel.

www.nature.com/scientificreports/

Protein kinase which is a key regulator of actin cytoskeleton and cell polarity. Involved in regulation of smooth muscle contraction, actin cytoskeleton organization, stress fiber and focal adhesion formation, neurite retraction, cell adhesion and motility via phosphorylation of DAPK3, GFAP, LIMK1, LIMK2, MYL9/MLC2, TPPP, PFN1 and PPP1R12A. Phosphorylates FHOD1 and acts synergistically with it to promote SRC-dependent non-apoptotic plasma membrane blebbing. Phosphorylates JIP3 and regulates the recruitment of JNK to JIP3 upon UVB-induced stress. Acts as a suppressor of inflammatory cell migration by regulating PTEN phosphorylation and stability (By similarity). Acts as a negative regulator of VEGF-induced angiogenic endothelial cell activation. Required for centrosome positioning and centrosome-dependent exit from mitosis (By similarity). Plays a role in terminal erythroid differentiation. May regulate closure of the eyelids and ventral body wall by inducing the assembly of actomyosin bundles (By similarity). Promotes keratinocyte terminal differentiation.
Involved in osteoblast compaction through the fibronectin fibrillogenesis cell-mediated matrix assembly process, essential for osteoblast mineralization (By similarity).

Catalyzes the phosphorylation of sphingosine to form sphingosine 1-phosphate (SPP), a lipid mediator with both intra- and extracellular functions. Also acts on D-erythro-sphingosine and to a lesser extent sphinganine, but not other lipids, such as D,L-threo-dihydrosphingosine, N,N-dimethylsphingosine, diacylglycerol, ceramide, or phosphatidylinositol. In contrast to proapoptotic SPHK2, has a negative effect on intracellular ceramide levels, enhances cell growth and inhibits apoptosis. Involved in the regulation of inflammatory response and neuroinflammation. Via the product sphingosine 1-phosphate, stimulates TRAF2 E3 ubiquitin ligase activity, and promotes activation of NF-kappa-B in response to TNF signaling leading to IL17 secretion. In response to TNF and in parallel to NF-kappa-B activation, negatively regulates RANTES induction through p38 MAPK signaling pathway. Involved in endocytic membrane trafficking induced by sphingosine, recruited to dilate endosomes, also plays a role on later stages of endosomal maturation and membrane fusion independently of its kinase activity. In Purkinje cells, seems to be also involved in the regulation of autophagosome-lysosome fusion upon VEGFA ...

8.4, 9.0, 9.4, 9.13, 9.10, 9.17

Since the first COVID-19 cases were identified, a diverse array of clinical indications has been reported in association with SARS-CoV-2 infections (Table 1). Although SARS-CoV-2 is primarily a respiratory pathogen, multiple organ systems can be involved. Thus, hematological, cardiovascular, neurological and gastrointestinal complications have been reported in COVID-19 patients [37][38][39][40][41][42][43][44] . Many of these pathologies are difficult to explain on the basis of the route of entry and/or sites of SARS-CoV-2 infection.
In this theoretical study we have explored the hypothesis that cross-reactivity of antibodies that target SARS-CoV-2 proteins with endogenous human proteins may play a role in at least some of the dizzying array of clinical presentations of COVID-19 patients. Clinical data suggest that this is a plausible hypothesis to investigate. A prospective study involving 22 German patients suggests that SARS-CoV-2 infection could elicit organ-specific autoimmunity in susceptible patients and lead to respiratory failure 45 . Similarly, a retrospective study of 21 patients with critical SARS-CoV-2 pneumonia detected autoantibodies related to autoimmune disease 17 . In one study, a high-throughput autoantibody detection technique was applied to 194 SARS-CoV-2-positive subjects 20 . These subjects showed higher levels of autoantibodies to diverse human antigens than SARS-CoV-2-negative controls.
Several in silico approaches have been used to identify and study potential cross-reactivity between pathogen-derived and human proteins 12,13 . These studies primarily rely on identifying amino acid sequence homologies between proteins from the pathogen and endogenous human proteins. In subsequent analyses, some of the studies have endeavored to determine which (if any) of these homologous regions are potential T cell epitopes based on their affinities for HLA class I and class II alleles 30,31 . We, on the other hand, reasoned that anti-SARS-CoV-2 antibodies that cross-react with endogenous human proteins and elicit autoimmune pathologies are more likely to interact with conformational domains. However, using computational tools to compare conformational homologies between host and pathogen proteins is far more challenging than carrying out a sequence homology search. We have limited our search to proteins that have a known structure and PDB file and used the ProBiS algorithm 22,23 to match surface patches on chains of human proteins to surface patches on SARS-CoV-2 proteins [See "Materials and methods"]. Using this method, we have identified 346 human proteins that show structural homology to SARS-CoV-2 viral antigens (Supplementary Table 4).
We concomitantly identified proteins that are linked to clinical conditions or symptoms reported for COVID-19. We carried out an in-depth literature survey to record and classify many clinical manifestations reported in patients diagnosed with COVID-19 (Table 1). We also used a numeric notation for each clinical condition to allow cross-referencing. Our analysis shows that of the 346 human proteins that showed structural homology to SARS-CoV-2, 102 proteins have biological functions which, if disrupted, could result in pathologies associated with COVID-19 (the pathologies and proteins are depicted in Tables 1 and 2). We again emphasize that these 102 human genes have not been experimentally verified but constitute a data set that could serve as a useful resource. This list could be a starting point for carrying out in vitro studies to elucidate the mechanistic basis of clinical observations. We have identified human proteins with structural homology to SARS-CoV-2 proteins that, if functionally inhibited (e.g., by cross-reactive anti-SARS-CoV-2 antibodies), may be mechanistically implicated in the development of severe COVID-19 clinical manifestations. An exhaustive discussion of each candidate human protein and its possible implication in COVID-19 clinical presentation is not possible. However, we discuss several examples of identified candidate proteins and their possible connection to severe COVID-19 illness to illustrate the potential practical utility of our theoretical study and the resultant data sets. Some genes that may be related to severe COVID-19 pathophysiology and are good candidates for experimental investigation include PRKG1, ACE, CFB, CRP, CTNNB1, EGFR, and VEGFA.
The renin-angiotensin system (RAS) is known for its effects on the cardiovascular system and fluid homeostasis 50 . Increased activity of the vasoconstrictive and proliferative axis, such as angiotensin II/angiotensin-converting enzyme (ACE)/AT1, has been reported to be associated with a higher risk of acute thrombosis through destabilization of atherosclerotic plaque and enhancement of platelet activity and coagulation 51 . ACE2 shares 40% identity and 61% similarity with ACE 52 . SARS-CoV-2 infection mediated by the ACE2 and TMPRSS2 proteins is well established 53 . ACE2 is expressed in cells from multiple tissues, including airways, cornea, esophagus, ileum, colon, liver, gallbladder, heart, kidney and testis 54 . Similarly to SARS-CoV 55 , infection with SARS-CoV-2 may downregulate cell surface expression of ACE2 and may result in reduced activity of ACE2 in infected organs. Moreover, binding of ACE2 to SARS-CoV, and most likely to SARS-CoV-2, increases the activity of disintegrin and metalloproteinase domain-containing protein 17 (ADAM17) 57 , which can induce shedding of the ACE2 ectodomain and give rise to detectable soluble ACE2 58 . The shedding of myocardial ACE2 into the circulation and its association with heart disease in preclinical models suggest that the loss of tissue ACE2 plays a pathogenic role in heart disease 59,60 . Varying ACE2 expression might affect disease susceptibility and progression. Generally, ACE2 expression is highest in children, young people, and women, decreases with age, and is lowest in people with underlying conditions such as diabetes and hypertension. Therefore, lower levels of expression of the viral receptor ACE2 are found in those at the highest risk for progression of COVID-19 to a severe disease phenotype 61,61 .
The liver is the major site of complement synthesis. Complement factor B is a protein encoded by the CFB gene. Complement factor B, generally called Factor B, plays a role in the alternative pathway analogous to that of C2 in the classical pathway. Factor B binds to C3b and is activated to form a proteolytic enzyme that cleaves C3. COVID-19-associated thrombosis and overactivation of the complement cascade have recently been reviewed systematically in the literature 56 . A preprint by Gao et al. 62 reported that the SARS-CoV, MERS-CoV and SARS-CoV-2 nucleocapsid (N) proteins bind to MBL-associated serine protease-2 in the lectin pathway of complement activation, resulting in aberrant complement activation and aggravated inflammatory lung injury 63 .
C-reactive protein (CRP) is a normal plasma protein whose level rises during the cytokine-mediated response to most forms of tissue injury, infection and inflammation, and serum CRP values are widely measured in clinical practice as an objective index of disease activity 64 . The upregulation of CRP that has been reported in COVID-19 patients might be an indication of excessive inflammatory stress and contribute to severe illness or even death [65][66][67] . Moreover, it has been shown that elevated CRP levels in COVID-19 patients are strongly associated with venous thromboembolism, acute kidney injury, critical illness, and mortality 68 .
Type 1 interferon production is impaired in severe COVID-19 patients and leads to acute respiratory distress syndrome (ARDS) and coagulopathy. Matsuyama et al. reviewed COVID-19 pathophysiology with respect to the NSP1 and ORF6 proteins, which act via induction of signal transducer and activator of transcription 1 (STAT1) dysfunction and compensatory hyperactivation of STAT3 69 . IFN signaling was inhibited by upregulated EGFR and activated STAT3 70 . This review also emphasized the link between STAT3 and coagulopathy, with the production of tissue factor induced by CRP, which may have been activated by STAT3 and may prime the initial phase of coagulation.
Catenin beta-1 is also known as β-catenin. Activation of β-catenin, the primary mediator of the ubiquitous Wnt signaling pathway, alters the immune system in lasting and harmful ways 71 . It has been demonstrated that the activation of Wnt/β-catenin signaling enhances influenza virus replication 72 . Wnt signaling is a complex set of signal transduction pathways mediated by multiple signaling molecules. These molecules are involved in many disease conditions 73 . Specifically, the Wnt family genes FZD4, FZD5 and CTNNB1 and the downstream targets CCND1, VEGFA and AXIN2 were upregulated in end-stage pulmonary arterial hypertension, a life-threatening disease associated with increased pulmonary pressures and the subsequent development of right-sided heart failure 73 .
This study has limitations. Most importantly, we have used a computational method to compare the conformations of human proteins with SARS-CoV-2 proteins. The underlying postulate is that shared structural homology would result in cross-reactivity. We do not, however, have a direct computational measure of cross-reactivity. Additionally, we rely on conformational similarities and do not weight our scores for protein conformers that may be inaccessible to antibodies. Another limitation is that the available human protein structures represent approximately 35% of the human proteome. Similarly, SARS-CoV-2 variants of concern are important, but we have kept this study focused on the wild type (Wuhan strain) SARS-CoV-2 for 2 reasons: (1) Essentially there have been no major changes in COVID-19-associated disorders/pathologies with the emergence of the new variants. (2) In the absence of reliable literature on pathologies associated with individual variants, the data would be almost impossible to interpret. The final limitation of this study is that although we list 102 human genes with high structural homology to SARS-CoV-2 proteins, these have not been experimentally validated. Hence, we do not claim that these are linked to human disease. Overall, the datasets generated here are "hypothesis generating" and provide a useful resource. Evidence has emerged that SARS-CoV-2 infections are associated with autoantibodies and that these have the potential to elicit autoimmune pathologies. We have developed a novel computational approach to identify human proteins that have conformational features similar to SARS-CoV-2 proteins. Thus, there is a likelihood that these human proteins could be targeted by anti-SARS-CoV-2 antibodies. This method and list of human proteins are a resource that can be used to study the phenomenon of autoimmune pathologies associated with COVID-19.

Conclusions
In this theoretical study we have identified multiple human proteins with strong structural homology to SARS-CoV-2 proteins. Of these, we posit 102 human proteins could potentially be both (i) associated with COVID-19 related clinical pathologies based on their known function and (ii) targeted by anti-SARS-CoV-2 antibodies. The data sets we have generated using novel computational methods present testable hypotheses to elucidate molecular mechanisms that could explain the complex multi-system disorders associated with COVID-19.

Data availability
The datasets generated as a result of this experiment can be obtained from the corresponding author upon reasonable request.