The extremely high contagiousness of SARS CoV-2 indicates that the virus has developed the ability to deceive the innate immune system. The virus could have included in its outer protein domains some motifs that are structurally similar to those that the potential victim's immune system has learned to ignore. The similarity of the primary structures of the viral and human proteins can provoke an autoimmune process. Using the open-access protein database UniProt, we compared the SARS CoV-2 proteome with those of other organisms. In the SARS CoV-2 spike (S) protein molecule, we localized more than two dozen hepta- and octamers homologous to human proteins. They are scattered along the entire length of the S protein molecule, and some of them fuse into sequences of considerable length. Except for one, all these n-mers project from the virus particle and can therefore be involved in mimicry and in misleading the immune system. All hepta- and octamers of the envelope (E) protein that are homologous to human proteins are located in the viral transmembrane domain and form a 28-mer, E14-41 VNSVLLFLAFVVFLLVTLAILTALRLCA. The involvement of the E protein in provoking an autoimmune response (after the destruction of the virus particle) seems highly likely. Some SARS CoV-2 nonstructural proteins may also be involved in this process, namely ORF3a, ORF7a, ORF7b, ORF8, and ORF9b. It is possible that ORF7b is involved in the dysfunction of olfactory receptors, and the S protein in the dysfunction of taste perception.
The interaction of SARS CoV-2 with the host immune system is largely determined by the structural similarities between viral and host proteins. The studies of SARS CoV-2 are still focused on the S protein1.
The extremely high contagiousness of the coronavirus SARS CoV-2 indicates that during its evolution the virus developed the ability to deceive the innate immune system. The simplest way to achieve this would be to incorporate into its membrane proteins that are structurally similar to those which the immune system of the potential victim has learnt to ignore. The virus probably borrowed some n-mers from bats or other mammals. Any motif of any mammalian protein was suitable for borrowing, provided that the immune system regarded it as its own.
The knowledge of the homology between the SARS CoV-2 and human proteins would help understand the mechanisms of mimicry at the moment of infection. The SARS CoV-2 proteins may simulate human proteins, mislead the immune system, and slow down its response.
However, mimicry is not the only process that is determined by the protein homology between the virus and the host organism. After the inevitable destruction of the virus particle, the proteins or protein domains that were inside the virus until then come into contact with the immune system. If they are structurally similar enough to host proteins, a part of the immune response will be directed against the proteins of the host organism, i.e., an autoimmune response will arise.
This study aimed to identify the human proteins which share a significant structural homology with the SARS CoV-2 proteins. We hope this information will be useful to the developers of vaccines against coronavirus.
Joshua Lederberg2 believed that "microbes and their human hosts constitute a superorganism." Accordingly, we treated the concept of "human proteins" as a combination of the human proteome proper and the proteomes of the gut microbiota. We paid particular attention to the proteins involved in three functions that are almost always affected in this disease, namely digestion, olfaction and taste.
Using the open-access protein database UniProt and our original computer program Ouroboros3, we compared the SARS CoV-2 proteome4 with those of other organisms. We also searched a separate database of 75,777 human proteins5. The algorithm compares the primary sequences of SARS CoV-2 and human proteins, written in one-letter code. Proteins were compared by consecutively searching for regions of one protein in the others, which is essentially the standard task of finding a substring in a string. This search is available as a standard method in many programming languages, including Python, in which the main program was written. The URL of the source code is given in the reference list3.
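The substring search described above can be illustrated with a minimal Python sketch. It is not the Ouroboros code: the function name `find_shared_kmers`, its return format, and the two toy "human" sequences are assumptions made for illustration only. The query is the E protein fragment quoted in the abstract, so the reported positions are relative to that fragment rather than to the full protein.

```python
from typing import Dict, List, Tuple


def find_shared_kmers(
    query: str, targets: Dict[str, str], k: int = 7
) -> List[Tuple[int, str, str, int]]:
    """Return (query_start, kmer, target_id, target_start) for each exact match.

    Every k-residue window of `query` is looked up in every target sequence
    with a plain substring search. Positions are 1-based.
    """
    hits = []
    for i in range(len(query) - k + 1):
        kmer = query[i : i + k]
        for target_id, target_seq in targets.items():
            j = target_seq.find(kmer)
            while j != -1:  # collect every occurrence, not only the first
                hits.append((i + 1, kmer, target_id, j + 1))
                j = target_seq.find(kmer, j + 1)
    return hits


# Query: the 28-residue E protein fragment quoted in the text.
query = "VNSVLLFLAFVVFLLVTLAILTALRLCA"

# Hypothetical target sequences, invented for this example (NOT real proteins).
toy_targets = {
    "toy_protein_A": "MKTAYLLFLAFVVGGS",
    "toy_protein_B": "QQLVTLAILTAPPW",
}

hits = find_shared_kmers(query, toy_targets, k=7)
for hit in hits:
    print(hit)
```

With these toy inputs the sketch reports five 7-mer matches, two in the first toy sequence and three in the second. A real run would read the query and target sequences from the proteome files instead of hard-coded strings.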
When assessing the homology between the viral and human proteins, we took into account the presence of common 7-/8-mers and especially their fusion into longer sequences. For example, two 7-mers, one homologous to human protein A and the other to protein B, can overlap or abut at the ends, forming regions of 8 to 14 amino acid residues in length.
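The fusion of neighbouring n-mers into longer regions amounts to merging overlapping or adjacent intervals. The sketch below shows one way to do this in Python. The function `merge_hits` and the list of hit positions are hypothetical and chosen only to demonstrate the idea; treating directly adjacent n-mers as fusable is an assumption based on the stated upper bound of 14 residues for two 7-mers.

```python
from typing import List, Tuple


def merge_hits(
    sequence: str, starts: List[int], k: int = 7
) -> List[Tuple[int, int, str]]:
    """Fuse overlapping or adjacent k-mers into longer regions.

    `starts` holds 1-based start positions of k-mers within `sequence`.
    Returns (start, end, subsequence) tuples, 1-based and inclusive.
    """
    regions: List[Tuple[int, int]] = []
    for start in sorted(set(starts)):
        end = start + k - 1
        if regions and start <= regions[-1][1] + 1:
            # The k-mer overlaps or directly follows the previous region.
            prev_start, prev_end = regions[-1]
            regions[-1] = (prev_start, max(prev_end, end))
        else:
            regions.append((start, end))
    return [(s, e, sequence[s - 1 : e]) for s, e in regions]


# E protein fragment quoted in the text; the hit positions are hypothetical.
fragment = "VNSVLLFLAFVVFLLVTLAILTALRLCA"
toy_starts = [5, 6, 15, 16, 17]

merged = merge_hits(fragment, toy_starts, k=7)
print(merged)  # [(5, 12, 'LLFLAFVV'), (15, 23, 'LVTLAILTA')]
```

In this toy case two 7-mers fuse into an 8-mer and three fuse into a 9-mer. Two 7-mers that merely touch, for example starts 1 and 8, would give a single 14-residue region.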
Results and discussion
S protein, 1273 aa
Hereinafter, regions homologous to human proteins are highlighted in red. Transmembrane tail TM1214-1237 is underlined.
In the S protein molecule, we localized more than two dozen 7-/8-mers homologous to human proteins (Table 1).
Fragments homologous to human proteins are scattered along the entire length of the S protein molecule, and some of them fuse into sequences of considerable length, namely the 10-mer SPRRARSVAS680-689, the 11-mer GLTVLPPLLTD857-867 and two closely spaced 7-mers, NASVVNI1173-1179 and EIDRLNE1182-1188. The octamer RRARSVAS682-689 is located at the junction of the S1 and S2 subunits. All these n-mers project from the virus particle and may be involved in mimicry.
SARS CoV-2 can cause smell and taste dysfunction, as well as muscle injury6.
The 8-mer DEDDSEPV1257-1264, located in the cytoplasmic tail, can be released during the destruction of the virus particle and get involved in orchestrating the immune system’s response, directing a part of it to the homologous 8-mer in human unconventional myosin-XVI1404-1421. The role of this mechanism in muscle dysfunction in coronavirus infection deserves a special investigation.
The 8-mer RRARSVAS682-689 is homologous to the amiloride-sensitive sodium channel subunit alpha201-208, which is involved in salt taste perception7.
With a high degree of probability, it can be argued that the S protein is involved in the process of mimicry. It may also take some part in provoking an autoimmune response.
We checked the S protein homology across 10 species, specifically primates, bats and some other mammals. The results are presented in the table entitled "Similarity of SARS CoV-2 spike glycoprotein structure with some mammalian proteins" in the electronic attachment. The homologous regions common to SARS CoV-2, humans, and bats probably deserve particular attention. The data presented so far do not allow us to derive a more general rule.
Envelope small membrane protein
E protein, 75 aa (transmembrane domain8-38 is underlined)
In the E protein molecule, we localized seven 7-mers and one 8-mer homologous to human proteins (Table 2).
A fragment of the E8-38 protein transmembrane domain can be represented as follows:
The size of the letters (point size) corresponds to the frequency of the viral 7-/8-mers in the human proteome.
The protein E transmembrane domain contains 7-/8-mers, homologous to the proteins of some gut bacteria and even cereals, for example, corn, sorghum, wheat, and barley (Table 3).
The simulation targets may have been the proteins synthesized by a macroorganism itself or by its normal gut microbiota.
All protein E 7-/8-mers homologous to proteins of humans, gut bacteria and cereals are located in the transmembrane domain of the virus and form the 28-mer protein E14-41. A random selection of 28 amino acid residues in a row would require an astronomical number of iterations: 20^28 ≈ 2.7 × 10^36.
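The order of magnitude of this estimate can be checked with a few lines of Python. The calculation assumes that each of the 28 positions is drawn independently and uniformly from the 20 standard amino acids, which is a simplification made for this sketch; the variable names are illustrative.

```python
ALPHABET_SIZE = 20  # standard proteinogenic amino acids
LENGTH = 28         # number of residues in the E14-41 fragment

# Number of distinct sequences of this length under the uniform assumption.
n_sequences = ALPHABET_SIZE ** LENGTH
print(f"{n_sequences:.1e}")  # 2.7e+36
```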
The involvement of the E protein in mimicry is hardly possible, but its implication in provoking an autoimmune response (after the destruction of the virus particle) seems very likely.
As a major target, the viral E protein has usually been used for the development of vaccines, specifically against HIV-19, Dengue virus10, hepatitis B virus11, SARS CoV-212 and many other viruses. A deletion of the SARS-CoV E protein reduces pathogenicity and mortality in laboratory animals13. In the transmembrane domain of the SARS-CoV E protein, specific critical virulence-determining features have been identified14.
Membrane protein, 222 aa
In the M protein molecule, we localized six 7-mers homologous to human proteins (Table 4).
An N-terminal fragment1-19 of the M protein can be represented as follows:
In the M protein, four 7-mers homologous to human proteins are fused into the 10-mer VEELKKLLEQ10-19, whose hydrophilic composition indicates possible contact with the external environment, i.e., with the host's immune system, and involvement in mimicry.
Outside of the 10-mer, we found only two homologous 7-mers. It is unlikely that the M protein is involved in provoking an autoimmune response (after the destruction of the virus particle).
Nucleoprotein, 419 aa
In the N protein molecule, we localized eleven 7-mers homologous to human proteins (Table 5).
The N protein is located completely inside the virus particle and cannot be involved in mimicry. All heptamers homologous to human proteins form several rather long fragments, including the 13-mer SKQLQQSMSSADS404-416 and 10-mer AEGSRGGSQA173-182, which increases the likelihood of the protein involvement in provoking an autoimmune response.
All non-structural proteins of SARS CoV-2 are located completely inside the virus particle and, by definition, cannot be involved in the process of mimicry. It remains to consider the possibility of their implication in provoking an autoimmune process.
ORF3a protein, 275 aa
In the ORF3a protein molecule, we localized five 7-mers homologous to human proteins (Table 6).
The 7-mers are scattered along the entire length of the molecule and do not form long n-mers anywhere. ORF3a does not appear to be involved in provoking an autoimmune response.
ORF7a protein, 121 aa
In the ORF7a protein molecule, we found two 7-mers homologous to human proteins and located in close proximity to each other (Table 7).
It is possible that ORF7a is involved in provoking an autoimmune response.
ORF7b protein, 43 aa
In this polypeptide, we found only one 7-mer homologous to a human protein (Table 8).
ORF7b may be involved in provoking an autoimmune response, contributing to olfactory dysfunction.
ORF8 protein, 121 aa
Due to the fusion of two 7-mers into 10-mer LVFLGIITTV4-13, the ORF8 protein can be involved in provoking an autoimmune response.
ORF9b protein, 97 aa
In the ORF9b protein molecule, we localized six 7-/8-mers, homologous to human proteins (Table 10).
Some of these 7-/8-mers merge into larger n-mers TEELPDEFVV84-93 and LGSPLSLN48-55.
The octamer ELPDEFVV86-93 is homologous to the Maestro heat-like repeat-containing protein family member 2B (Fig. 1), which may play a role in sperm capacitation16. Male reproductive dysfunction has been proposed as a likely consequence of COVID-1917.
After the destruction of the virus particle, ORF9b can take part in provoking an autoimmune response.
Replicase polyprotein RPP 1a
Replicase polyprotein RPP 1a, 4405 aa
The longest n-mers are underlined.
In the RPP 1a molecule, we localized eleven 8-mers (Table 11) and more than a hundred 7-mers homologous to human proteins.
Some of the 8-mers are found in more than one human protein, and some fuse into long n-mers, for example EDIQLLKSAYENFNQH1126-1141, EVEKGVLPQLEQPY55-68 and SVEEVLSEARQHL34-46.
In the RPP 1a molecule, the 7-mers SCGNFKV505-511 and AIFYLIT2785-2791 are homologous to residues 190-196 of human olfactory receptor 52N2 and residues 32-38 of human olfactory receptor 2W1, respectively. The heptamer LKTLLSL1556-1562 is homologous to residues 181-187 of the human bitter taste receptor T2R55 (Fig. 2).
Replicase polyprotein RPP 1ab
This huge molecule (7096 aa; for the primary structure, see ref. 18) contains 210 hepta- and octamers homologous to human proteins. Some of them fuse into long n-mers (more than 15 aa).
The possibility that replicases are involved in provoking an autoimmune response is debatable. Enzymes in general, and cell cycle enzymes in particular, are evolutionarily highly conserved. Fragments homologous to human proteins must be released in huge quantities into the gut lumen during the decay of any microorganism that dies there. It is possible that the interaction of replicases with the host's immune system follows rules different from those that apply to shorter proteins.
ORF6, ORF10, and ORF14
In these polypeptides (61, 38, and 73 aa, respectively), we did not find 7-/8-mers homologous to human proteins.
When assessing the role of SARS CoV-2 proteins in mimicry and in provoking an autoimmune response in humans, we considered the following parameters: (i) the number of homologous n-mers; (ii) the compactness of their arrangement in the SARS CoV-2 protein molecules; (iii) intradomain localization (external, transmembrane, internal) of the SARS CoV-2 proteins, and (iv) physiological functions that involve the homologous human proteins (Table 12).
Analysis of homology between the SARS CoV-2 and human proteins led us to the following conclusions. Some of the SARS CoV-2 proteins can be implicated in mimicry that can delay the response of innate immunity to the invasion of virus particles into a macroorganism, and in provoking an autoimmune process that directs a part of the immune response to the proteins of a macroorganism (after the destruction of virus particles). Mimicry is probably more characteristic of the spike (S) protein, and the provocation of an autoimmune response seems to be a distinctive feature of the envelope (E) protein. The ORF7b protein may be involved in the impairment of olfactory receptors, and the S protein may be involved in taste perception dysfunction.
Drugs aimed at destroying or blocking these and similar regions in proteins of SARS CoV-2 and other viruses could enable the human immune system to resist viral deception and destroy the invader shortly after its penetration into a macroorganism. It should also be borne in mind that drugs affecting such imitation regions can damage native proteins of the human body. Destroying or blocking such regions can weaken the autoimmune response.
The source code of Ouroboros (v. 0.5) is fully available on GitHub: https://github.com/liquidbrainisstrain/ouroboros. Artwork: We used GIMP (Version 2.10.22) to create our artwork. The figures are completely original and have not been published anywhere.
Sanami, S. et al. Design of a multi-epitope vaccine against SARS-CoV-2 using immunoinformatics approach. Int. J. Biol. Macromol. 164, 871–883. https://doi.org/10.1016/j.ijbiomac.2020.07.117 (2020).
Lederberg, J. Infectious history. Science 288(5464), 287–293 (2000).
Terekhov, A. Ouroboros (Version 0.5) [Source code]. https://github.com/liquidbrainisstrain/ouroboros.
Proteomes: Severe acute respiratory syndrome coronavirus 2 (2019-nCoV) (SARS-CoV-2). https://www.uniprot.org/proteomes/UP000464024 SARS-COV-2, accessed 20 Aug 2020.
Proteomes: Homo sapiens (Human). https://www.uniprot.org/proteomes/UP000005640 Homo sapiens, accessed 03 Sept 2020.
Koralnik, I. J. & Tyler, K. L. COVID-19: A global threat to the nervous system. Ann. Neurol. 88(1), 1–11. https://doi.org/10.1002/ana.25807 (2020).
Huang, T. & Stähler, F. Effects of dietary Na+ deprivation on epithelial Na+ channel (ENaC), BDNF, and TrkB mRNA expression in the rat tongue. BMC Neurosci. 10, 19. https://doi.org/10.1186/1471-2202-10-19 (2009).
Mandala, V. S. et al. Structure and drug binding of the SARS-CoV-2 envelope protein transmembrane domain in lipid bilayers. Nat. Struct. Mol. Biol. 27(12), 1202–1208. https://doi.org/10.1038/s41594-020-00536-8 (2020).
Li, S. W. et al. Gene editing in CHO cells to prevent proteolysis and enhance glycosylation: Production of HIV envelope proteins as vaccine immunogens. PLoS ONE 15, e0233866. https://doi.org/10.1371/journal.pone.0233866 (2020).
Rathore, A. S., Sarker, A. & Gupta, R. D. Production and immunogenicity of Fubc subunit protein redesigned from DENV envelope protein. Appl. Microbiol. Biotechnol. 104, 4333. https://doi.org/10.1007/s00253-020-10541-y (2020).
Ho, J. K., Jeevan-Raj, B. & Netter, H. J. Hepatitis B Virus (HBV) subviral particles as protective vaccines and vaccine platforms. Viruses 12, 126. https://doi.org/10.3390/v12020126 (2020).
Abdelmageed, M. I. et al. Design of a multiepitope-based peptide vaccine against the E protein of human COVID-19: An immunoinformatics approach. Biomed. Res. Int. 2020, 2683286. https://doi.org/10.1155/2020/2683286 (2020).
DeDiego, M. L. et al. Inhibition of NF-κB-mediated inflammation in severe acute respiratory syndrome coronavirus-infected mice increases survival. J. Virol. 88, 913–924 (2014).
Regla-Nava, J. A. et al. Severe acute respiratory syndrome coronaviruses with mutations in the E protein are attenuated and promising vaccine candidates. J. Virol. 89, 3870–3887 (2015).
Hassan, S. S. et al. A unique view of SARS-CoV-2 through the lens of ORF8 protein. Comput. Biol. Med. 133, 104380. https://doi.org/10.1016/j.compbiomed.2021.104380 (2021).
MROH2B: Function. https://www.nextprot.org/entry/NX_Q7Z745.
Sansone, A. et al. Addressing male sexual and reproductive health in the wake of COVID-19 outbreak. J. Endocrinol. Invest. 44(2), 223–231. https://doi.org/10.1007/s40618-020-01350-1 (2021).
Replicase polyprotein 1ab [Severe acute respiratory syndrome coronavirus 2]. https://www.ncbi.nlm.nih.gov/protein/P0DTD1.1?report=fasta.
This research is the authors' own initiative and was funded exclusively from their personal resources.
The authors declare no competing interests.
Khavinson, V., Terekhov, A., Kormilets, D. et al. Homology between SARS CoV-2 and human proteins. Sci Rep 11, 17199 (2021). https://doi.org/10.1038/s41598-021-96233-7