Introduction

Multiple sclerosis (MS) is a chronic autoimmune inflammatory disease that affects the central nervous system and is characterized by multifocal demyelinating lesions, axonal loss, and atrophy1. MS is the most common neurological disorder among young adults and its global prevalence is increasing for unclear reasons2,3. The etiology of MS is uncertain, although viruses have long been purported to contribute to the disease, particularly in genetically vulnerable individuals. For example, human herpes viruses (HHV) including Epstein-Barr virus (EBV/HHV4), roseolavirus (HHV6), and varicella zoster virus (VZV/HHV3), as well as human endogenous retroviruses (HERVs), have been commonly implicated in MS4,5,6,7. In addition, human polyoma JC virus (JCV) is associated with MS, particularly with complications stemming from immunosuppressive treatment for MS8,9,10. The primary genetic influence on MS is attributed to human leukocyte antigen (HLA) genes, which are centrally involved in the human immune response to viruses and other foreign antigens and have been implicated in both MS risk and protection11,12,13,14,15,16. In a recent immunogenetic epidemiological study, we evaluated the association between the population frequencies of 127 HLA alleles and the population prevalence of MS across 14 European countries and found a preponderance of negative (i.e., protective) associations between HLA allele frequencies and MS prevalence, particularly for Class I HLA alleles16.
Given the role of HLA in elimination/suppression of viruses and other foreign antigens, we hypothesized that negative (i.e., protective) associations between Class I HLA and MS are likely attributable to superior pathogen elimination afforded by those alleles, and that, conversely, positive (i.e., susceptibility) HLA-MS associations may be attributable to insufficient immunogenetic protection against certain pathogens, thereby hindering their suppression and possibly contributing to downstream effects associated with MS. Here, in an effort to test this hypothesis and bridge separate lines of research implicating exposure to pathogens and HLA in MS, we evaluated the virus-HLA (V-HLA) immunogenicity of viruses implicated in MS with respect to HLA alleles that are positively associated with MS prevalence.

Results

MS-HLA susceptibility scores

The MS-HLA susceptibility scores are epidemiological measures of association between MS prevalence and HLA allele frequency. Of the 69 HLA Class I alleles investigated, 24 had positive scores, indicating a positive association between MS prevalence and allele frequency (Fig. 1). Because the scores of the last 7 of these alleles were practically the same, only the scores of the top 17 alleles (Table 1) were used for further analyses.

Figure 1

MS-HLA susceptibility scores are plotted against their rank. Red, scores of alleles used in further analyses; gray, scores at the tail of the distribution, not used hereafter. The red line demarcates these two groups. See text for details.

Table 1 MS-HLA PScov scores for the 17 susceptibility Class I alleles investigated.

Immunogenicity of viral proteins for HLA Class I alleles

In silico virus-HLA immunogenicity scores (V-HLA scores) are T-cell epitope prediction estimates, indicating the likelihood that the complex between a given epitope and a specific HLA Class I allele will engage the T-cell receptor and, hence, activate CD8+ cytotoxic lymphocytes to kill the infected cell. V-HLA immunogenicity varied appreciably among the 12 viruses studied (Table 2, Fig. 2), being highest for HHV4 (V-HLA = 13.639) and lowest for HHV6A (V-HLA = 2.563), a 5.32× differential. V-HLA was highest for allele C*03:03 (V-HLA = 12.686) and lowest for A*03:01 (V-HLA = 3.486) (Table 3, Fig. 3).

Table 2 Descriptive statistics of V-HLA immunogenicities across the 17 HLA Class I alleles in Table 1 (N = 17).
Figure 2

Mean (± SEM) of V-HLA immunogenicity scores for each one of the 12 viruses investigated (N = 17 alleles in Table 1).

Table 3 Descriptive statistics of V-HLA immunogenicities across the 12 viruses investigated for the 17 Class I alleles in Table 1.
Figure 3

Mean (± SEM) of V-HLA immunogenicity scores for each one of the 17 alleles in Table 1 (N = 12 viruses in Fig. 2).

Association between MS-HLA susceptibility and V-HLA immunogenicity

Overall, MS-HLA susceptibility scores and V-HLA immunogenicity scores were negatively associated, such that MS-HLA susceptibility scores decreased as V-HLA immunogenicity increased (Fig. 4; r = −0.512, P = 0.035, N = 17), indicating a protective effect of viral immunogenicity. To evaluate the association of MS-HLA scores with V-HLA immunogenicity of individual viruses in a robust, uniform, and nonparametric way, correlations were computed between data converted to normal scores using Blom’s formula17. The results are shown in Figs. 5, 6, 7 and 8 as scatterplots of the normalized MS-HLA susceptibility scores vs. normalized V-HLA immunogenicity scores for each of the 12 viruses investigated. All associations were negative, such that MS-HLA susceptibility decreased as V-HLA immunogenicity increased, indicating a protective effect of the latter. The strength of this association differed across viruses (Fig. 9), as reflected in the order of the figures, with Fig. 5 illustrating the case with the strongest association (HHV3), Fig. 8 the case with the weakest association (HPV), and the rest (Figs. 6, 7) in between. Detailed association statistics are given in Table 4, where the strength of the association between MS-HLA susceptibility and V-HLA immunogenicity is formalized as the percent of variance in MS-HLA susceptibility scores explained (PVE) by the corresponding (per allele) V-HLA immunogenicity. HHV3 had the highest PVE (43.56%) and HPV the lowest (5.11%), an 8.52× differential (Table 4, Fig. 9).

Figure 4

The MS-HLA susceptibility scores of the 17 alleles (Table 1) are plotted against the mean of the corresponding (per allele) V-HLA immunogenicity scores (N = 12 viruses). See text for details.

Figure 5

Negative association of MS-HLA susceptibility scores of the 17 alleles (Table 1) vs. corresponding V-HLA immunogenicity scores for the viruses indicated (HHV3, JCV, HHV1). See Table 4 for detailed statistics.

Figure 6

Negative association of MS-HLA susceptibility scores of the 17 alleles (Table 1) vs. corresponding V-HLA immunogenicity scores for the viruses indicated (HHV4, HHV7, HHV5). See Table 4 for detailed statistics.

Figure 7

Negative association of MS-HLA susceptibility scores of the 17 alleles (Table 1) vs. corresponding V-HLA immunogenicity scores for the viruses indicated (HHV8, HHV6A, HHV6B). See Table 4 for detailed statistics.

Figure 8

Negative association of MS-HLA susceptibility scores of the 17 alleles (Table 1) vs. corresponding V-HLA immunogenicity scores for the viruses indicated (HERVW, HHV2, HPV). See Table 4 for detailed statistics.

Figure 9

Percent of MS-HLA susceptibility variance explained by V-HLA immunogenicity of the 12 viruses investigated.

Table 4 Association statistics between MS-HLA susceptibility scores and immunogenicities of the 12 viruses investigated (N = 17 HLA Class I alleles, Table 1).

Discussion

It is largely accepted that MS is a result of complex genetic and environmental interactions. Here we focused on the role of viruses and HLA in MS. Specifically, we evaluated the association between immunogenicity of 12 viruses with respect to 17 HLA Class I alleles that we found to be associated with susceptibility to MS by analyzing population-level epidemiological data. Our findings documented a negative association between the viral V-HLA immunogenicity of all 12 viruses and MS-HLA susceptibility across the 17 MS-HLA Class I susceptibility alleles above. Although the strength of this association varied across viruses, the systematic negative association between viral V-HLA immunogenicity and MS-HLA susceptibility highlights a key role of HLA-mediated virus elimination and/or suppression in influencing MS risk, both at the initial infection and at later relapses caused by reactivation of a latent virus.

MS is presumed to result from exposure to ubiquitous infectious agents in the context of permissive genetic traits18. In addition to Class II HLA alleles that have long been implicated in MS11, the present findings suggest that the interaction between several common viruses including human herpes viruses and JCV with Class I HLA influences MS prevalence. In light of the role of HLA in antigen elimination and virus suppression, the effect of exposure to certain viruses on MS appears to be moderated by a given HLA allele’s ability to bind and eliminate viral antigens that may otherwise contribute to MS or other conditions. Indeed, HHVs have been implicated in a number of human diseases including MS4,5,6,7. Following initial infection, typically in childhood, HHVs establish latency and may be periodically reactivated by various triggers and/or waning immunity. Notably, patterns of reactivation have been shown to correspond to MS relapse19,20. Similarly, JCV persists in a latent state in the brain, is detectable in human brain tissue, and has also been linked to MS7,21,22. The mechanisms underlying the influence of HLA on virus-MS associations are unclear, although several mechanisms including molecular mimicry, persistent viral antigens, bystander activation, superantigen activation, adjuvant effects, epitope spreading, and viral support of autoreactive cell survival have been proposed to explain how viruses might induce autoimmunity in MS17,23,24,25,26. We have suggested that exposure to pathogens in the absence of HLA that can bind and eliminate those antigens results in antigen persistence and deleterious long-term effects including low-grade chronic inflammation and downstream autoimmunity, apoptosis, and atrophy, thereby setting the groundwork for various conditions including MS16,27.

With regard to specific viruses, the strongest effects observed here were for HHV3/VZV, JCV, HHV1/HSV1, HHV4/EBV, HHV7, and HHV5/CMV. Each of these viruses has been previously linked with MS, although the findings have been somewhat inconsistent, even for EBV, which is considered the leading viral candidate for MS7,18,21,25,28,29,30,31,32,33,34,35,36,37,38. For instance, recent evidence demonstrated that although EBV antibodies were higher in MS patients than in controls, neither EBV antibodies nor salivary EBV DNA load were associated with radiological or clinical disease activity in patients with MS39. Like many HHVs, EBV is also commonly detected in the healthy adult population40, suggesting that infection with EBV or other HHVs is insufficient to cause MS in the absence of other factors, including HLA41,42. Furthermore, even among HLA alleles that were positively associated with MS risk in the present study, there was considerable variability in HLA-virus immunogenicities, MS-HLA susceptibility scores, and their associations.

Additional contributions

In addition to the contributions of the Class I HLA-virus immunogenicities to MS susceptibility documented here, there are likely other contributing factors. Class II HLA has been strongly linked to MS risk10; thus, it is likely that HLA Class II alleles, which are involved in formation of antibodies and immunological memory and often form haplotypes with other HLA alleles including those of Class I, contribute to MS and particularly to autoimmunity associated with MS26. Beyond viruses, several other environmental and lifestyle factors also appear to play a role in MS susceptibility including geography, smoking, sun exposure/vitamin D, and adolescent obesity43,44,45. Notably, some of these factors have been shown to interact with HLA to influence MS risk43. For example, smoking has been shown to increase the odds of MS in individuals lacking the protective HLA-A*02:01 allele or in carriers of the high-risk Class II HLA-DRB1*15:01 allele46. Similar interactions have been documented for obesity47. Thus, other HLA x environmental/lifestyle factor interactions not evaluated here may account for some of the unexplained variance in the HLA-MS profile.

Limitations

Our findings provide novel insights highlighting the interaction of viral exposure and host immunogenetics in MS; however, there are several study limitations that must be considered. First, the analyses here are based on MS diagnosis without regard to subtype; as such, it is unclear to what extent the present findings apply to different forms of the disease. Second, the data utilized here were derived from populations of Continental Western European countries and may not extend to other geographic locations given the global variation in HLA48,49, MS prevalence2, and virus-MS associations50,51. Third, it would be informative to evaluate immunogenicity of these viruses with regard to Class II HLA, particularly in light of the extensive literature documenting the relevance of Class II HLA in MS; however, we are not aware of any in silico application that allows for examination of both binding affinity and immunogenicity for Class II alleles akin to the approach we used here for Class I. Finally, we focused exclusively on the role of viruses in MS and, for each virus, on specific viral proteins selected from several possible candidates. The interplay between various environmental factors that have been linked to MS43,44,45 and the HLA-related MS-viral associations remains to be investigated.

Materials and methods

Prevalence of MS

The population prevalence of MS was computed for each of 14 countries in Continental Western Europe (Table 5). For each country, we identified the total number of people with MS in 2019 from the Global Health Data Exchange52, a publicly available catalog of data from the Global Burden of Disease study, divided that value by the total population of the country in 201952, and expressed the prevalence as a percentage.

Table 5 Prevalence of multiple sclerosis in 14 CWE countries in 2019.

HLA alleles

We obtained the population frequency in 2019 of 69 common HLA Class I alleles from 14 Continental Western European Countries (Austria, Belgium, Denmark, Finland, France, Germany, Greece, Italy, Netherlands, Portugal, Norway, Spain, Sweden, and Switzerland)53. The alleles and their mean frequencies (across countries) are given in Table 6.

Table 6 The 69 HLA Class I alleles used and their mean frequencies.

MS-HLA susceptibility scores

We computed the covariance between the prevalence of MS and the population frequency of the 69 HLA Class I alleles of Table 6:

$$\mathrm{MS-HLA \, susceptibility \, score }= \frac{1}{N-1}\sum_{i=1}^{N}({f}_{i}-\overline{f })({p}_{i}-\overline{p })$$
(1)

where \({f}_{i}, {p}_{i}\) denote allele frequency and MS prevalence for the ith country, respectively, \(\overline{f },\overline{p }\) are their means, and N is the number of countries (here, 14). A positive covariance indicates a positive association between MS prevalence and allele frequency, i.e., MS susceptibility.
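As an illustration of Eq. (1), the following minimal Python sketch computes the sample covariance between allele frequency and MS prevalence across countries. The function name and the three-country input values are hypothetical and chosen only for illustration; they are not study data, and the study itself used SPSS rather than this code.

```python
from typing import Sequence


def ms_hla_susceptibility_score(freqs: Sequence[float],
                                prevalences: Sequence[float]) -> float:
    """Sample covariance (Eq. 1) between allele frequency f_i and
    MS prevalence p_i, taken across N countries."""
    if len(freqs) != len(prevalences):
        raise ValueError("freqs and prevalences must have the same length")
    n = len(freqs)
    if n < 2:
        raise ValueError("at least two countries are required")
    f_mean = sum(freqs) / n
    p_mean = sum(prevalences) / n
    cross = sum((f - f_mean) * (p - p_mean)
                for f, p in zip(freqs, prevalences))
    return cross / (n - 1)


# Hypothetical values for three countries (illustration only, not study data).
allele_freq = [0.10, 0.20, 0.30]   # allele frequency per country
ms_prev = [0.15, 0.20, 0.25]       # MS prevalence (%) per country

score = ms_hla_susceptibility_score(allele_freq, ms_prev)
print(f"score = {score:.4f}")      # positive score: susceptibility
```

In this toy example the allele is more frequent where MS is more prevalent, so the score is positive; reversing the prevalence order would flip its sign.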

Viral antigens

For a given allele, we estimated the immunogenicity of typical proteins of 12 viruses: 9 human herpes virus species (HHV1-HHV8, with HHV6A and HHV6B counted separately), human polyoma JC virus (JCV), and human endogenous retrovirus (HERV-W), all of which have been implicated in MS to varying degrees, and human papilloma virus (HPV), which has not been implicated in MS, to our knowledge, and serves as a negative control. Details of the proteins analyzed are given in Table 7 and their amino acid (AA) sequences are given in the Appendix, together with a short description of their function.

Table 7 Viral proteins used.

Determination of immunogenicity of HLA Class I alleles

The INeo-Epp method54 was used for T-cell receptor (TCR) epitope prediction via the INeo-Epp web form interface55. For that purpose, we split a given viral antigen (Table 7) into all possible 9-mer (nonamer) AA residue epitopes using a sliding window approach56,57,58 (Fig. 10) and submitted each epitope to the web application together with a specific HLA allele. More specifically, we paired all epitopes with all alleles and obtained for each pair its percentile rank, a measure of binding affinity of the epitope-HLA allele complex; smaller percentile ranks indicate higher binding affinity. The web application returned a TCR predictive score for pairs with high binding affinities (percentile rank < 2); scores > 0.4 indicated positive immunogenicity and were analyzed further. We computed the following as a comprehensive measure of immunogenicity for quantitative analyses. Let K be the number of nonamers that showed positive immunogenicity (score > 0.4); then K, weighted by their average score \(\overline{w }\), serves as an estimate of the overall effectiveness of a given allele, I, to induce immunogenicity for a given protein:

Figure 10

The sliding nonamer window approach used to determine exhaustively in silico the immunogenicity of all possible consecutive nonamers in a protein, illustrated here for HHV3.

$$\mathrm{V-HLA \, immunogenicity \, score}=\overline{w }K$$
(2)
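The sliding-window step and Eq. (2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the toy 12-residue sequence, and the TCR scores are hypothetical; the scores stand in for values returned by the web application for pairs with percentile rank < 2; and returning 0 when no nonamer exceeds the threshold is an assumption not stated in the text.

```python
from typing import Iterable, List


def nonamers(sequence: str, k: int = 9) -> List[str]:
    """All consecutive k-mers of a protein sequence (sliding window, step 1)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def v_hla_score(tcr_scores: Iterable[float], threshold: float = 0.4) -> float:
    """Eq. (2): K nonamers with positive immunogenicity (score > threshold),
    weighted by their average score w_bar."""
    positive = [s for s in tcr_scores if s > threshold]
    if not positive:
        return 0.0  # assumption: no immunogenic nonamer gives a score of 0
    k = len(positive)
    w_bar = sum(positive) / k
    return w_bar * k


# Toy 12-residue sequence (hypothetical, not one of the study proteins).
toy_protein = "ACDEFGHIKLMN"
epitopes = nonamers(toy_protein)
print(len(epitopes), epitopes[0])

# Hypothetical TCR predictive scores for one allele, one per epitope that
# passed the binding filter (percentile rank < 2).
toy_scores = [0.5, 0.3, 0.7, 0.4]
print(v_hla_score(toy_scores))
```

A protein of length L yields L − 8 nonamers, so the toy sequence yields four. Because \(\overline{w }\) is the mean over the same K nonamers, the product \(\overline{w }K\) equals the sum of the positive scores.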

Association of V-HLA immunogenicities with MS-HLA susceptibility scores

We evaluated the association between MS-HLA susceptibility scores (Eq. 1) and V-HLA immunogenicity scores (Eq. 2) by computing, for each virus, the Pearson correlation between them across the 17 HLA alleles. The correlation coefficient obtained for each virus was squared and multiplied by 100 to provide the percent of MS-HLA susceptibility variance explained (PVE) by the viral protein immunogenicity:

$$PVE=100{r}^{2}$$
(3)
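A minimal Python sketch of this step is given below, including the conversion to normal scores with Blom's formula described in the Results. It assumes the usual form of Blom's formula, \(z=\Phi^{-1}\left((r-3/8)/(n+1/4)\right)\) for rank r among n values, and assigns average ranks to ties; the tie rule, the function names, and the five-allele data are assumptions for illustration only. The analyses in the study were run in SPSS, not with this code.

```python
from statistics import NormalDist
from typing import List, Sequence


def blom_normal_scores(values: Sequence[float]) -> List[float]:
    """Normal scores via Blom's formula: inv_cdf((rank - 3/8) / (n + 1/4)).
    Tied values receive their average rank (assumption)."""
    n = len(values)
    order = sorted(range(n), key=lambda idx: values[idx])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1          # 1-based average rank of the tie run
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def pve(r: float) -> float:
    """Eq. (3): percent of variance explained."""
    return 100.0 * r * r


# Hypothetical scores for five alleles (illustration only, not study data).
v_hla = [1.0, 2.0, 3.0, 4.0, 5.0]        # V-HLA immunogenicity for one virus
ms_hla = [10.0, 8.0, 7.0, 3.0, 1.0]      # MS-HLA susceptibility scores

r = pearson_r(blom_normal_scores(v_hla), blom_normal_scores(ms_hla))
print(f"r = {r:.3f}, PVE = {pve(r):.2f}%")
```

Because the toy MS-HLA scores decrease strictly as V-HLA increases, the normal scores are perfectly negatively correlated and the PVE is at its maximum; real data would give intermediate values.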

Implementation of analysis procedures

The IBM-SPSS statistical package (version 27) was used for implementing standard statistical analyses, including descriptive statistics and measures of associations. Since we were testing explicitly only a negative association between virus immunogenicity and MS-HLA covariance, one-sided P-values were used. We did not correct for multiple comparisons because these were planned comparisons.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.