Abstract
The extremely high contagiousness of SARS CoV-2 suggests that the virus has developed the ability to deceive the innate immune system. The virus could have included in its outer protein domains some motifs that are structurally similar to those that the potential victim's immune system has learned to ignore. The similarity of the primary structures of the viral and human proteins can provoke an autoimmune process. Using the open-access protein database UniProt, we compared the SARS CoV-2 proteome with those of other organisms. In the SARS CoV-2 spike (S) protein molecule, we localized more than two dozen hepta- and octamers homologous to human proteins. They are scattered along the entire length of the S protein molecule, and some of them fuse into sequences of considerable length. All but one of these n-mers project from the virus particle and can therefore be involved in mimicry and in misleading the immune system. All hepta- and octamers of the envelope (E) protein that are homologous to human proteins are located in the viral transmembrane domain and form the 28-mer E14-41, VNSVLLFLAFVVFLLVTLAILTALRLCA. The involvement of the E protein in provoking an autoimmune response (after the destruction of the virus particle) seems highly likely. Some SARS CoV-2 nonstructural proteins may also be involved in this process, namely ORF3a, ORF7a, ORF7b, ORF8, and ORF9b. It is possible that ORF7b is involved in the dysfunction of olfactory receptors, and the S protein in the dysfunction of taste perception.
Introduction
The interaction of SARS CoV-2 with the host immune system is largely determined by the structural similarities between viral and host proteins. The studies of SARS CoV-2 are still focused on the S protein1.
The extremely high contagiousness of the coronavirus SARS CoV-2 suggests that during its evolution the virus developed the ability to deceive the innate immune system. The simplest way to achieve this would be to incorporate into its membrane proteins that are structurally similar to those which the immune system of the potential victim has learnt to ignore. The virus probably borrowed some n-mers from bats or other mammals. Any motif of any mammalian protein was suitable for borrowing, provided that the immune system recognized it as its own.
The knowledge of the homology between the SARS CoV-2 and human proteins would help understand the mechanisms of mimicry at the moment of infection. The SARS CoV-2 proteins may simulate human proteins, mislead the immune system, and slow down its response.
However, mimicry is not the only process that is determined by the protein homology between the virus and host organism. After the inevitable destruction of the virus particle, the proteins or their domains, which were inside the virus until then, come into contact with the immune system. With some structural similarity, a part of the immune response will be directed against the proteins of the host organism, i.e., an autoimmune response will arise.
This study aimed to identify the human proteins which share a significant structural homology with the SARS CoV-2 proteins. We hope this information will be useful to the developers of vaccines against coronavirus.
Joshua Lederberg2 believed that "microbes and their human hosts constitute a superorganism." Accordingly, we treated "human proteins" as the combination of the human proteome proper and the proteomes of the gut microbiota. We paid particular attention to the proteins involved in three functions that are almost always affected in this disease, namely digestion, olfaction and taste.
Methods
Using the open-access protein database UniProt and our original computer program Ouroboros3, we compared the SARS CoV-2 proteome4 with those of other organisms. We also searched a separate database of 75,777 human proteins5. The algorithm compares the primary sequences of SARS CoV-2 and human proteins, written in one-letter code. Proteins were compared by consecutively searching for regions of one protein in the others, which is essentially the standard task of finding a substring in a string. This search is available as a standard method in many programming languages, including Python, in which the main program was written. The URL of the source code is given in the Code availability section and in ref. 3.
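The substring comparison described above can be illustrated with a minimal Python sketch. It is not the Ouroboros code: the function names kmers and shared_kmers, the dictionary input format and the toy sequences are assumptions made for illustration only, and a search over a full proteome would likely need a faster index than this nested loop.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield (start, k-mer) for every window of length k; start is 1-based."""
    for i in range(len(seq) - k + 1):
        yield i + 1, seq[i:i + k]


def shared_kmers(viral_seq, human_proteins, k=7):
    """Map each viral k-mer to the human proteins that contain it.

    human_proteins is a hypothetical {name: one-letter sequence} dict.
    Returns {(viral_start, k-mer): [(protein_name, human_start), ...]}.
    """
    hits = defaultdict(list)
    for start, word in kmers(viral_seq, k):
        for name, seq in human_proteins.items():
            pos = seq.find(word)  # plain substring search, first match only
            if pos != -1:
                hits[(start, word)].append((name, pos + 1))
    return dict(hits)


# Toy sequences invented for illustration; they are not real proteins.
viral = "MKRRARSVASQ"
human = {"toyA": "GGRRARSVASGG", "toyB": "AAAAAAAAAA"}
result = shared_kmers(viral, human, k=8)
```

With these toy inputs, only the 8-mer RRARSVAS (viral position 3) is found, in toyA at position 3. Exact matching is used here because the text describes a substring search; no substitution matrix or gap handling is implied.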
When assessing the homology between the viral and human proteins, we took into account the presence of common 7-/8-mers and especially their fusion into longer sequences. For example, two viral 7-mers, one homologous to human protein A and the other to human protein B, can overlap or adjoin at their ends, forming a region of 8 to 14 amino acid residues in length.
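The fusion of neighbouring hits into longer regions can be sketched as a simple interval merge. This is an illustrative sketch only: the function name merge_hits and the coordinates below are hypothetical, and it assumes that hits which overlap or directly adjoin are treated as one fused region.

```python
def merge_hits(hits):
    """Merge 1-based inclusive (start, end) intervals that overlap or adjoin."""
    merged = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous region: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]


# Hypothetical coordinates of four 7-mers on a viral protein.
hits = [(5, 11), (9, 15), (16, 22), (40, 46)]
regions = merge_hits(hits)
```

Here the first three 7-mers fuse into one 18-residue region (5-22), while the fourth stays a separate 7-mer (40-46).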
Results and discussion
Structural proteins
Spike glycoprotein
S protein, 1273 aa

Hereinafter, regions homologous to human proteins are highlighted in red. Transmembrane tail TM1214-1237 is underlined.
In the S protein molecule, we localized more than two dozen 7-/8-mers homologous to human proteins (Table 1).
Fragments homologous to human proteins are scattered along the entire length of the S protein molecule, and some of them fuse into sequences of considerable length, namely the 10-mer SPRRARSVAS680-689, the 11-mer GLTVLPPLLTD857-867 and two closely spaced 7-mers, NASVVNI1173-1179 and EIDRLNE1182-1188. The octamer RRARSVAS682-689 is located at the junction of the S1 and S2 subunits. All these n-mers project from the virus particle and may be involved in mimicry.
SARS CoV-2 can cause smell and taste dysfunction, as well as muscle injury6.
The 8-mer DEDDSEPV1257-1264, located in the cytoplasmic tail, can be released during the destruction of the virus particle and get involved in orchestrating the immune system’s response, directing a part of it to the homologous 8-mer in human unconventional myosin-XVI1404-1421. The role of this mechanism in muscle dysfunction in coronavirus infection deserves a special investigation.
The 8-mer RRARSVAS682-689 is homologous to the amiloride-sensitive sodium channel subunit alpha201-208, which is involved in salt taste perception7.
With a high degree of probability, it can be argued that the S protein is involved in the process of mimicry. It may also take some part in provoking an autoimmune response.
We checked the S protein homology across 10 species, specifically primates, bats and some other mammals. The results are presented in the table entitled "Similarity of SARS CoV-2 spike glycoprotein structure with some mammalian proteins" in the electronic attachment. Attention should probably be paid to the homologous regions common to SARS CoV-2, humans, and bats. The data presented so far do not allow us to derive a more general rule.
Envelope small membrane protein
E protein, 75 aa (transmembrane domain8-38 is underlined)

In the E protein molecule, we localized seven 7-mers and one 8-mer homologous to human proteins (Table 2).
A fragment of the E8-38 protein transmembrane domain can be represented as follows:

The size of the letters (point size) corresponds to the frequency of the viral 7-/8-mers in the human proteome.
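One possible way to turn k-mer frequencies into a per-residue weight (which could then be mapped to a letter size) is sketched below. How the figure was actually produced is not stated, so this is an assumption: the function residue_weights, its input format and the counts are all hypothetical.

```python
def residue_weights(seq_len, kmer_counts):
    """Sum, for each residue, the counts of all k-mers that cover it.

    kmer_counts is a hypothetical {(start, end): count} dict with 1-based
    inclusive coordinates; count is the number of human proteins in which
    that viral k-mer was found.
    """
    weights = [0] * seq_len
    for (start, end), count in kmer_counts.items():
        for pos in range(start, end + 1):
            weights[pos - 1] += count
    return weights


# Invented counts for two overlapping 7-mers on a 12-residue fragment.
counts = {(2, 8): 3, (4, 10): 1}
weights = residue_weights(12, counts)
```

Residues covered by both 7-mers (positions 4 to 8) receive the largest weight, 4, and uncovered residues receive 0.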
The protein E transmembrane domain contains 7-/8-mers, homologous to the proteins of some gut bacteria and even cereals, for example, corn, sorghum, wheat, and barley (Table 3).
The simulation targets may have been the proteins synthesized by a macroorganism itself or by its normal gut microbiota.
All protein E 7-/8-mers homologous to proteins of humans, gut bacteria and cereals are located in the transmembrane domain of the virus and form the 28-mer protein E14-41. A random selection of 28 amino acid residues in a row would require an astronomical number of iterations: $20^{28} \approx 2.7 \times 10^{36}$.
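The figure quoted above can be checked with a few lines of Python. The calculation assumes 20 equally likely amino acids chosen independently at each of the 28 positions, which is a simplification; the variable names are illustrative.

```python
AMINO_ACIDS = 20  # standard proteinogenic amino acids

# Protein E residues 14-41, as given in the abstract.
peptide = "VNSVLLFLAFVVFLLVTLAILTALRLCA"

# Number of distinct sequences of the same length.
combinations = AMINO_ACIDS ** len(peptide)
formatted = f"{combinations:.1e}"
print(len(peptide), formatted)
```

This prints a length of 28 and 2.7e+36, in agreement with the value in the text.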
The involvement of the E protein in mimicry is hardly possible, but its implication in provoking an autoimmune response (after the destruction of the virus particle) seems very likely.
Viral envelope (E) proteins have often been used as a major target in vaccine development, for example against HIV-1 (ref. 9), Dengue virus (ref. 10), hepatitis B virus (ref. 11), SARS CoV-2 (ref. 12) and many other viruses. Deletion of the SARS-CoV E protein reduces pathogenicity and mortality in laboratory animals (ref. 13). In the transmembrane domain of the SARS-CoV E protein, specific critical virulence-determining features have been identified (ref. 14).
Membrane protein
Membrane protein, 222 aa

In the M protein molecule, we localized six 7-mers homologous to human proteins (Table 4).
An N-terminal fragment (residues 1-19) of the M protein can be represented as follows:

In the M protein, four 7-mers homologous to human proteins are fused into the 10-mer VEELKKLLEQ10-19, whose hydrophilic composition indicates possible contact with the external environment, i.e., with the host's immune system, and involvement in mimicry.
Outside of the 10-mer, we found only two homologous 7-mers. It is unlikely that the M protein is involved in provoking an autoimmune response (after the destruction of the virus particle).
Nucleoprotein
Nucleoprotein, 419 aa

In the N protein molecule, we localized eleven 7-mers homologous to human proteins (Table 5).
The N protein is located completely inside the virus particle and cannot be involved in mimicry. All heptamers homologous to human proteins form several rather long fragments, including the 13-mer SKQLQQSMSSADS404-416 and 10-mer AEGSRGGSQA173-182, which increases the likelihood of the protein involvement in provoking an autoimmune response.
Nonstructural proteins
All non-structural proteins of SARS CoV-2 are located completely inside the virus particle and, by definition, cannot be involved in the process of mimicry. It remains to consider the possibility of their implication in provoking an autoimmune process.
ORF3a protein
ORF3a protein, 275 aa

In the ORF3a protein molecule, we localized five 7-mers homologous to human proteins (Table 6).
The 7-mers are scattered along the entire length of the molecule and do not form longer n-mers anywhere. ORF3a does not appear to be involved in provoking an autoimmune response.
ORF7a protein
ORF7a protein, 121 aa

In the ORF7a protein molecule, we found two 7-mers homologous to human proteins and located in close proximity to each other (Table 7).
It is possible that ORF7a is involved in provoking an autoimmune response.
ORF7b protein
ORF7b protein, 43 aa

In this polypeptide, we found only one 7-mer homologous to a human protein (Table 8).
ORF7b may be involved in provoking an autoimmune response, contributing to olfactory dysfunction.
ORF8 protein
ORF8 protein, 121 aa

The primary structure of SARS-CoV-2 ORF8 is close to that of bat RaTG13-CoV15. In this polypeptide, there are three 7-mers homologous to human proteins (Table 9).
Due to the fusion of two 7-mers into 10-mer LVFLGIITTV4-13, the ORF8 protein can be involved in provoking an autoimmune response.
ORF9b protein
ORF9b protein, 97 aa

In the ORF9b protein molecule, we localized six 7-/8-mers, homologous to human proteins (Table 10).
Some of these 7-/8-mers merge into larger n-mers TEELPDEFVV84-93 and LGSPLSLN48-55.
The octamer ELPDEFVV86-93 is homologous to the Maestro heat-like repeat-containing protein family member 2B (Fig. 1), which may play a role in sperm capacitation16. Male reproductive dysfunction has been proposed as a likely consequence of COVID-1917.
After the destruction of the virus particle, ORF9b can take part in provoking an autoimmune response.
Replicase polyprotein RPP 1a
Replicase polyprotein RPP 1a, 4405 aa

The longest n-mers are underlined.
In the RPP 1a molecule, we localized eleven 8-mers (Table 11) and more than a hundred 7-mers homologous to human proteins.
Some of the 8-mers are found in more than one human protein, and some fuse into long n-mers, for example EDIQLLKSAYENFNQH1126-1141, EVEKGVLPQLEQPY55-68 and SVEEVLSEARQHL34-46.
In the RPP 1a molecule, the 7-mers SCGNFKV505-511 and AIFYLIT2785-2791 are homologous to the human olfactory receptor proteins 52N2 (residues 190-196) and 2W1 (residues 32-38), respectively. The heptamer LKTLLSL1556-1562 is homologous to the human bitter taste receptor T2R55 (residues 181-187) (Fig. 2).
Replicase polyprotein RPP 1ab
This huge molecule (7096 aa; for the primary structure, see ref. 18) contains 210 hepta- and octamers homologous to human proteins. Some of them fuse into long (more than 15 aa) n-mers.
The possibility that replicases are involved in provoking an autoimmune response is debatable. Enzymes in general, and cell cycle enzymes in particular, are evolutionarily highly conserved. Fragments homologous to human proteins must be released in huge quantities into the gut lumen during the decay of any microorganism that dies there. It is possible that the interaction of replicases with the host's immune system follows rules different from those that apply to shorter proteins.
ORF6, ORF10, and ORF14
In these polypeptides (61, 38, and 73 aa, respectively), we did not find 7-/8-mers homologous to human proteins.
When assessing the role of SARS CoV-2 proteins in mimicry and in provoking an autoimmune response in humans, we considered the following parameters: (i) the number of homologous n-mers; (ii) the compactness of their arrangement in the SARS CoV-2 protein molecules; (iii) intradomain localization (external, transmembrane, internal) of the SARS CoV-2 proteins, and (iv) physiological functions that involve the homologous human proteins (Table 12).
Conclusions
Analysis of homology between the SARS CoV-2 and human proteins led us to the following conclusions. Some of the SARS CoV-2 proteins can be implicated in mimicry that can delay the response of innate immunity to the invasion of virus particles into a macroorganism, and in provoking an autoimmune process that directs a part of the immune response to the proteins of a macroorganism (after the destruction of virus particles). Mimicry is probably more characteristic of the spike (S) protein, and the provocation of an autoimmune response seems to be a distinctive feature of the envelope (E) protein. The ORF7b protein may be involved in the impairment of olfactory receptors, and the S protein may be involved in taste perception dysfunction.
Drugs aimed at destroying or blocking these and similar regions in the proteins of SARS CoV-2 and other viruses could enable the human immune system to resist viral deception and destroy the invader shortly after it enters a macroorganism. It should also be borne in mind that drugs affecting such imitation regions can damage native proteins present in the human body. Destroying or blocking such regions can weaken the autoimmune response.
Data availability
The highest.
Code availability
The source code of Ouroboros (v. 0.5) is fully available on GitHub. URL: https://github.com/liquidbrainisstrain/ouroboros. Artwork: We used GIMP (Version 2.10.22) to create our artwork. The figures are completely original and have not been published anywhere.
References
Sanami, S. et al. Design of a multi-epitope vaccine against SARS-CoV-2 using immunoinformatics approach. Int. J. Biol. Macromol. 164, 871–883. https://doi.org/10.1016/j.ijbiomac.2020.07.117 (2020).
Lederberg, J. Infectious history. Science 288(5464), 287–293 (2000).
Terekhov, A. Ouroboros (Version 0.5) [Source code]. https://github.com/liquidbrainisstrain/ouroboros.
Proteomes: Severe acute respiratory syndrome coronavirus 2 (2019-nCoV) (SARS-CoV-2). https://www.uniprot.org/proteomes/UP000464024 SARS-COV-2, accessed 20 Aug 2020.
Proteomes: Homo sapiens (Human). https://www.uniprot.org/proteomes/UP000005640 Homo sapiens, accessed 03 Sept 2020.
Koralnik, I. J. & Tyler, K. L. COVID-19: A global threat to the nervous system. Ann. Neurol. 88(1), 1–11. https://doi.org/10.1002/ana.25807 (2020).
Huang, T. & Stähler, F. Effects of dietary Na+ deprivation on epithelial Na+ channel (ENaC), BDNF, and TrkB mRNA expression in the rat tongue. BMC Neurosci. 10, 19. https://doi.org/10.1186/1471-2202-10-19 (2009).
Mandala, V. S. et al. Structure and drug binding of the SARS-CoV-2 envelope protein transmembrane domain in lipid bilayers. Nat. Struct. Mol. Biol. 27(12), 1202–1208. https://doi.org/10.1038/s41594-020-00536-8 (2020).
Li, S. W. et al. Gene editing in CHO cells to prevent proteolysis and enhance glycosylation: Production of HIV envelope proteins as vaccine immunogens. PLoS ONE 15, e0233866. https://doi.org/10.1371/journal.pone.0233866 (2020).
Rathore, A. S., Sarker, A. & Gupta, R. D. Production and immunogenicity of Fubc subunit protein redesigned from DENV envelope protein. Appl. Microbiol. Biotechnol. 104, 4333. https://doi.org/10.1007/s00253-020-10541-y (2020).
Ho, J. K., Jeevan-Raj, B. & Netter, H. J. Hepatitis B Virus (HBV) subviral particles as protective vaccines and vaccine platforms. Viruses 12, 126. https://doi.org/10.3390/v12020126 (2020).
Abdelmageed, M. I. et al. Design of a multiepitope-based peptide vaccine against the E protein of human COVID-19: An immunoinformatics approach. Biomed. Res. Int. 2020, 2683286. https://doi.org/10.1155/2020/2683286 (2020).
DeDiego, M. L. et al. Inhibition of NF-κB-mediated inflammation in severe acute respiratory syndrome coronavirus-infected mice increases survival. J. Virol. 88, 913–924 (2014).
Regla-Nava, J. A. et al. Severe acute respiratory syndrome coronaviruses with mutations in the E protein are attenuated and promising vaccine candidates. J. Virol. 89, 3870–3887 (2015).
Hassan, S. S. et al. A unique view of SARS-CoV-2 through the lens of ORF8 protein. Comput. Biol. Med. 133, 104380. https://doi.org/10.1016/j.compbiomed.2021.104380 (2021).
MROH2B: Function. https://www.nextprot.org/entry/NX_Q7Z745.
Sansone, A. et al. Addressing male sexual and reproductive health in the wake of COVID-19 outbreak. J. Endocrinol. Invest. 44(2), 223–231. https://doi.org/10.1007/s40618-020-01350-1 (2021).
Replicase polyprotein 1ab [Severe acute respiratory syndrome coronavirus 2]. https://www.ncbi.nlm.nih.gov/protein/P0DTD1.1?report=fasta.
Funding
This research is an authors’ initiative project funded exclusively from their personal sources.
Author information
Contributions
A.M. and V.K. wrote the main manuscript text. A.T. and D.K. prepared data analysis. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Khavinson, V., Terekhov, A., Kormilets, D. et al. Homology between SARS CoV-2 and human proteins. Sci Rep 11, 17199 (2021). https://doi.org/10.1038/s41598-021-96233-7