High cell surface expression and peptide binding affinity of HLA-DQA1*05:03, a susceptible allele of neuromyelitis optica spectrum disorders (NMOSD)

Neuromyelitis optica spectrum disorder (NMOSD) is a relapsing autoimmune disease characterized by the presence of pathogenic autoantibodies, anti-aquaporin 4 (AQP4) antibodies. Recently, HLA-DQA1*05:03 was shown to be significantly associated with NMOSD in a Japanese patient cohort. However, the specific mechanism by which HLA-DQA1*05:03 is associated with the development of NMOSD has yet to be elucidated. In the current study, we revealed that HLA-DQA1*05:03 exhibited significantly higher cell surface expression levels compared to other various DQA1 alleles, and that its expression strongly depended on the amino acid sequence of the α1 domain, with a preference for leucine at position 75. Moreover, in silico analysis indicated that the HLA-DQ encoded by HLA-DQA1*05:03 preferentially presents immunodominant AQP4 peptides, and that the peptide major histocompatibility complexes (pMHCs) are more energetically stable in the presence of HLA-DQA1*05:03 than other HLA-DQA1 alleles. In silico 3D structural models were also applied to investigate the validity of the energetic stability of pMHCs. Taken together, our findings indicate that HLA-DQA1*05:03 possesses a distinct property to play a pathogenic role in the development of NMOSD.

www.nature.com/scientificreports/ Despite these important breakthroughs, little has been known about the genetic factors of NMOSD. In 2018, a whole-genome sequence study conducted on 215 NMOSD cases and 1244 controls of European ancestry identified two independent signals associated with NMO-IgG positivity in the major histocompatibility complex (MHC) regions, one of which is located in the HLA-DQA1 gene 3 . Subsequently, in 2019, Ogawa et al. performed HLA genotyping using next-generation sequencing on 31 Japanese NMOSD patients and 429 healthy controls, and found that HLA-DQA1*05:03 had a significant association with NMOSD 4 .
The human leukocyte antigen (HLA) that are encoded by the MHC gene complex in humans can be classified into class I and class II proteins. Class II molecules, to which HLA-DQ belongs, is a transmembrane protein composed of an α chain consisting of α1 and α2 domains, and a β chain consisting of β1 and β2 domains, and is generally expressed on antigen-presenting cells (APCs). MHC class II binds to a peptide taken up by endocytosis via the peptide binding groove and translocates to the cell surface, thereby transmitting signals to naive CD4+ T cells to promote their differentiation into effector helper T cells.
Although the main pathology of NMOSD is astrocytopathy caused by anti-AQP4 antibodies, it has also been reported that T cells play a pivotal role in NMOSD pathogenesis. Varrin-Doyer et al. reported that NMOSD patients had a higher frequency of AQP4-reactive T cells than healthy controls, and that these cells exhibited Th17 polarization 5 . In addition, it has been reported that anti-AQP4 antibodies belong to the IgG1 subtype, a T cell-dependent Ig subclass 6,7 , and that their production requires AQP4 reactive CD4+ helper T cells.
From these observations, it is reasonable to theorize that the presentation of AQP4 peptide to CD4+ T cells by MHC class II promotes their differentiation from naive T cells to effector T cells, thus subsequently promoting anti-AQP4 antibody production. It can further be hypothesized that NMOSD patients with HLA-DQA1*05:03 will undergo this process more efficiently.
In this current study, we evaluated the cell surface expression of HLA-DQA1*05:03, as well as the binding of HLA-DQA1 alleles to AQP4 peptides and the energetic stability of the resulting complexes, in order to test our above-mentioned hypotheses. Our findings indicate the possibility that HLA-DQA1*05:03 contributes to the pathogenesis of NMOSD through preferentially presenting AQP4 peptides by several distinct mechanisms.

Results
HLA-DQA1*05:03 is highly expressed on the cell surface. Since the relationship between HLA cell surface expression level and disease susceptibility has been clarified in several recent reports, we started by revealing the characteristics of cell surface expression levels of the HLA-DQ molecule encoded by HLA-DQA1*05:03. To investigate whether the cell surface expression levels of HLA-DQ molecules depend on haplotypes, we transiently transfected various HLA-DQ alleles into HEK293 cells, which lack endogenous HLA class II expression, and measured the cell surface expression level of the HLA-DQ molecule by flow cytometry. As for the HLA-DQ expression vectors, we selected nine HLA-DQA1 alleles (DQA1*01:01, DQA1*01:02, DQA1*01:03, DQA1*01:04, DQA1*03:01, DQA1*03:03, DQA1*04:01, DQA1*05:03 and DQA1*05:05), which encompass most HLA-DQA1 alleles in the Japanese population, together with HLA-DQB1*03:01. HEK293 cells were transiently co-transfected with one of the HLA-DQA1 expression vectors and the HLA-DQB1*03:01 expression vector, which is known to be able to pair with most types of DQA1 alleles in the Japanese population. It was observed that HLA-DQ molecules encoded by HLA-DQA1*05, especially HLA-DQA1*05:03, had the highest cell surface expression level (Fig. 1A, B).
To further confirm that HLA-DQ molecules encoded by HLA-DQA1*05:03 show high cell surface expression, we used a pcDNA3.1(+) IRES GFP plasmid vector in order to express GFP as an internal control (Fig. 1C). Co-expression vectors of GFP and HLA-DQA1 were constructed for HLA-DQA1*01:03, DQA1*03:03, and DQA1*05:03, and the HLA-DQB1*03:01 expression vector was transiently transfected in combination with these vectors. The cell surface expression level of HLA-DQ and the expression of cytosolic GFP were determined by flow cytometry. In the GFP-positive cells, the highest expression of HLA-DQ was observed in the HLA-DQA1*05:03 group among all the HLA-DQA1 alleles examined, consistent with the previous results (Fig. 1D, E). For each HLA-DQ allelic pair, we also calculated the mean fluorescence intensity (MFI) of HLA-DQ relative to the MFI of GFP, which indicates the amount of cell-surface HLA normalized to GFP, and observed that HLA-DQ encoded by HLA-DQA1*05:03 had the highest value (Fig. 1F).
Since it was unclear to what extent the peptides bound to these HLA-DQ molecules affected their cell surface expression levels, we evaluated whether the cell surface expression level of HLA-DQ could be altered by binding to AQP4 peptides. We selected five immunodominant AQP4 peptides (p61-80, p91-110, p131-150, p201-220 and p211-230) that reportedly promote the activation and proliferation of T cells with APCs 5,8-11 (Table 1). A previous study demonstrated that ectopically expressed HLA in HEK293 cells could load exogenously added epitope peptides 12 . Therefore, the transfected HEK293 cells were incubated with each of the five AQP4 peptides for 8 h 12,13 . The hierarchy of cell surface expression levels among the HLA-DQA alleles observed in Fig. 1 was maintained even when co-cultured with four out of the five immunodominant AQP4 peptides (Supplemental Fig. S1). These findings demonstrated that the HLA-DQ molecule encoded by the HLA-DQA1*05:03 allele had the highest cell surface expression level irrespective of the presence of immunodominant AQP4 peptides.
Amino acid sequence of the α1 domain determines the expression level of HLA-DQA1*05:03 on the cell surface. Furthermore, to investigate the possibility that the hierarchy is caused by a variation in the amino acid sequence that constitutes the HLA-DQ molecule, we generated combinatorically modified HLA-DQA1 expression vectors in which the first half of the sequences encoding α1 domains and the latter half encoding α2 domains, transmembrane regions (TM) and cytoplasmic regions (CY), were individually exchanged among HLA-DQA1*01:03, DQA1*03:03, and DQA1*05:03 alleles ( Fig. 2A). www.nature.com/scientificreports/ www.nature.com/scientificreports/ There was no significant difference in cell surface expression of HLA-DQ molecules among the modified expression vectors with the same sequence in the first half ( Fig. 2B, C). On the other hand, when comparing vectors with the same sequence in the latter half, a large difference in cell surface expression was observed; when α1 domain was comprised by HLA-DQA1*05:03, the cell surface expression was the highest (Fig. 2D, E). Taken together, these data demonstrated that the amino acid sequence of α1 domain controls the difference in the cell surface expression levels of HLA-DQ molecules.
Furthermore, to determine which position in the amino acid sequence of the α1 domain was particularly important for the cell surface expression of the HLA-DQ molecule, we generated four types of modified vectors, each with one amino acid substituted in the α1 domain from those originally found in HLA-DQA1*05:03 to that in HLA-DQA1*01:03; specifically, V50E, Q53K, F54_R55insG and L75M (Fig. 2F). These four positions, all of which are located in the peptide binding cleft of HLA-DQ, were selected because various polymorphisms were observed in these positions among HLA-DQA1 alleles. Alteration of V50E, Q53K and F54_R55insG did not result in any significant changes in the cell surface expression levels of HLA-DQ, but the mutation L75M showed a significant decrease in the cell surface expression compared to wild-type HLA-DQA1*05:03 (Fig. 2G, H).
Immunodominant AQP4 peptides have high binding affinity with HLA-DQ molecules encoded by the HLA-DQA1*05 allele. To clarify whether HLA-DQA1*05:03, the highest risk allele of NMOSD, efficiently presents immunodominant AQP4 peptides, we compared the binding affinity of various HLA-DQA1 alleles with both immunodominant and non-immunodominant AQP4 peptides using NetMHCIIpan 3.2 14 , an in silico HLA-peptide binding affinity calculation tool.
Since DQA1*05:03 is reportedly a disease-sensitive allele in Japanese patients, the DQA1 and DQB1 alleles analyzed were limited to those known to be present in the Japanese population based on the HLA Laboratory Table 1. AQP4 peptide list. Peptides with a length of 20 amino acids constituting AQP4 molecule (323 amino acids in length) was listed as overlapping by 10 amino acids. AQP4 amino acid sequence of the molecules refers to the UniProt database (AQP4_HUMAN P55087-1). The immunodominant AQP4 peptides are shown in bold type. www.nature.com/scientificreports/ website data (https:// hla. or. jp/ med/ frequ ency_ search/ en/ haplo/). In addition, it is generally reported that DQB1*05 and DQB1*06 alleles form pairs only with the DQA1*01 alleles, and vice versa, so they were excluded from the analysis ( Table 2). It should be noted that the output of this tool is not affected by the 2nd field (formerly 4-digit) of the DQA1 allele, so we excluded its description. The immunodominant AQP4 peptides (7 peptides) and the non-immunodominant AQP4 peptides (24 peptides) are shown in Table 1. First, we calculated binding affinities for all combinations of the seven immunodominant AQP4 peptides (Table 1), the HLA-DQA1 alleles and HLA-DQB1 alleles (Table 2), and subsequently compared them among the HLA-DQA1 alleles (Fig. 3A). In comparison with the other HLA-DQA1 alleles, HLA-DQA1*05 showed no statistically significant difference. Next, focusing on HLA-DQA1*05, we calculated binding affinities for all the combinations of AQP4 peptide (Table 1) and HLA-DQB1 alleles (Table 2), and then compared the calculated results between two groups of immunodominant AQP4 peptides and non-immunodominant AQP4 peptides (Fig. 3B). We found that HLA-DQA1*05 binds more potently to immunodominant AQP4 peptides than to nonimmunodominant AQP4 peptides.
The pMHC complex of DQA1*05:03 with immunodominant AQP4 peptides is energetically stable. Next, using ImmuneScape, a tool that is capable of modeling MHC, peptide, and T cell receptor (TCR) complexes 15 , we evaluated the energetic stability of the pMHC complex. All complexes were modeled using 17 distinct TCRs from the Protein Data Bank (PDB) (see the description in the Methods section). For all combinations of these HLA-DQA1 alleles, AQP4 peptides, and TCRs, the energetic stability of each pMHC complex was evaluated by the EMPIRE (EMpirical Protein-InteRaction Energy) score function 16 and compared among the DQA1 alleles (Fig. 4A). Here, lower EMPIRE score represents higher energetic stabilization of pMHC.
In 4 of the 7 immunodominant AQP4 peptides (p11-30, p91-110, p131-150 and p261-280), the energetic stabilization of pMHC was significantly higher in the order of HLA-DQA1*05: 03, DQA1*03:03 and DQA1*01:03 (Fig. 4B). In addition, when all seven AQP4 peptides were analyzed together, a significant difference in the degree of energetic stabilization was observed in the same order (Fig. 4C). Moreover, 3D structure modeling was performed using ImmuneScape on both p131-150 and p261-280 (Fig. 5A, B). In both models, the volume of the peptide cavity was smaller in HLA-DQA1*01:03 and DQA1*03:03 than in HLA-DQA1*05:03. It is wellestablished that conformational entropy contributes significantly to peptide-protein binding affinity 17,18 . Thus, we speculated that the larger binding cavity provided greater conformational entropy to the bound peptides, contributing to the lower EMPIRE energies. In the case of p131-150, the core peptide differed for each HLA-DQA1 allele. Both HLA-DQA1*03:03 and DQA1*05:03 could bury a hydrophobic side-chain (methionine and valine, respectively) on the C-terminus, but only HLA-DQA1*05:03 could bury a large aromatic side-chain (tyrosine). This probably explains the better binding energy. In the case of p261-280, all three alleles were predicted to bind the same core peptide, and this peptide was predicted to bind with a similar conformation. However, the cavity of HLA-DQA1*05:03 was larger, thus allowing the anchor side-chains to assume different orientations for the N-and C-terminal residues.
In summary, our findings revealed that HLA-DQA1*05:03 is energetically more stable than other HLA-DQA1 alleles when bound to the immunodominant AQP4 peptide, and according to the 3D structure models, it was further rationalized by the large volume of the peptide cavity of DQA1*05:03, possibly contributing favorably to the binding affinity of the pMHC complex.

Discussion
Although several reports have been made on the disease-susceptible HLA alleles of NMOSD in Japanese, Southern Han Chinese, Asian, Latin American and European populations [19][20][21][22][23][24][25] , there was no common diseasesusceptibility allele regardless of race. Meanwhile, a large-scale genome-wide association study was carried out on NMOSD patients of European ancestry in 2018, and the findings revealed that one of the single nucleotide polymorphisms that has the strongest association with NMOSD susceptibility was located in the HLA-DQA1 gene, with DQA1*05:01 reported as one of the highest risk alleles 3 . Moreover, in 2019, Ogawa et al. performed next-generation sequencing on Japanese NMOSD patients and identified HLA-DQA1*05:03 as the most susceptible allele 4 . In Japan, unlike in Europe, the frequency of HLA-DQA1*05:03 in the population is much higher than that of HLA-DQA1*05:01, so the subtle differences in the findings between these two major genetic surveys may possibly have been due to genetic differences. In fact, the amino acid sequences of HLA-DQA1*05:03 and DQA1*05:01 differ by only one amino acid in the α2 domain (i.e., the amino acid sequences of the peptide binding clefts are identical), so it was considered that both of these two alleles might affect the pathophysiology of NMOSD via a similar mechanism.
It should be noted that there are many diseases, including autoimmune diseases, for which specific HLA alleles and haplotypes are reportedly strongly associated with the disease susceptibility or protection. However, few reports have focused on clarifying the specific pathogenic mechanism by which the disease-sensitive allele (or protective allele) promotes the onset of the disease. Moreover, only a few papers have focused on the effect of HLA cell surface expression. Pisapia et al. reported that the DQα1*05 chain and DQ1β*02 chain, which are known as risk alleles for celiac disease and type 1 diabetes, have significantly higher cell surface expression levels in B lymphoblastoid cell lines than non-risk alleles DQα chain and DQβ chain 26 .
It has been reported that the cell surface expression level of HLA molecules changes dramatically depending on the haplotype 27,28 . Miyadera et al. comprehensively measured the cell surface expression level of HLA-DQ molecules, and reported that it differs greatly depending on the combination of α and β chain that make up the HLA-DQ molecule 27 . Although HLA-DQA1*05:03 was not addressed in that study, the hierarchy of cell surface expression observed with other HLA-DQA1 alleles was consistent with our findings. In addition, by identifying the responsible residues that regulate the cell surface expression through mutagenesis studies and association www.nature.com/scientificreports/ analysis, they demonstrated that the hierarchies are mainly formed by polymorphic sites that directly affect the interactions between the subunits or domains constituting the HLA molecules, or between HLA molecules and binding peptides. Furthermore, in a study by Kaur et al., the authors concluded that the high thermal stability of pMHC encoded by HLA-C*05 compared to HLA-C*07 resulting from variations in amino acid residues constituting the peptide binding groove of HLA-C molecules can explain the high cell surface expression of pMHC encoded by HLA-C*05 29 . What these two studies have in common is that the difference in cell surface expression of HLA molecules between alleles is independent of the mRNA level and intracellular total protein level of HLA molecules in HLA-expressing cells. This is consistent with the results of the transfection experiment we obtained in this study using the pcDNA3.1 (+) IRES GFP plasmid vector. Our findings revealed that the cell surface expression of the HLA-DQ molecule encoded by DQA1*05:03 was remarkably high, and that its molecular mechanism was due to the amino acid sequence of the HLA-DQ molecule itself. After quantifying the cell surface expression level of HLA-DQ molecules in vitro, we examined the binding affinity of HLA-DQ and AQP4 peptides and the energetic stability of the complex in silico. Our findings indicated that the binding affinity between the HLA-DQ molecule encoded by DQA1*05:03 and the immunodominant AQP4 peptide was relatively high, and that the energetic stability of the complex was also higher than that of other HLA-DQA1 alleles. In fact, the HLA-DR molecule encoded by the systemic sclerosis with anti-Topoisomerase 1 antibody (ATASSc)-associated alleles (HLA-DRB1*08:02, HLA-DRB1*11:01 and HLA-DRB1*11:04) or suspected allele (HLA-DRB5*01:02) is reported to have a high binding affinity for the immunodominant peptide of topoisomerase 1 self-protein (residues 349-368) 30 . Furthermore, it has been shown that high stability of the peptide-MHC class II complex broadens the clonotypic repertoire of antigen-specific CD4+ T cells, while low stability skews it into a restricted repertoire with high affinity TCRs for pMHC 31 . Based on these facts, the high binding affinity between the HLA-DQ molecule encoded by DQA1*05:03 and immunodominant AQP4 peptides and the stability of the complex may characterize the clonotypic repertoire of CD4+ T cells and induce the production of anti-AQP4 antibodies, thus resulting in the development of NMOSD.
There are certain limitations that should be taken into consideration in future studies. First, the cell surface expression level of HLA molecules was measured by a transient transfection assay on HEK293 cells, which may not reflect the expression profile of HLA-DQ molecules in APCs of actual NMOSD patients. Moreover, only HLA-DQB1*03:01 was paired with DQA1 alleles, so there is no certainty that the results of this experiment can be reproduced with other DQB1 alleles. In addition, we did not evaluate the effects of other alleles that are in a linkage disequilibrium with HLA-DQA1*05:03. Finally, the sensitivity and specificity of the structural models obtained in silico needs to be verified by further in vitro studies.
In conclusion, the findings in this study indicated the pathogenic roles of disease-susceptible HLA alleles in the development of NMOSD, an autoimmune disease characterized by the production of anti-AQP4 autoantibodies. Thus, we propose that the cell surface expression level of the HLA molecule, the binding affinity with the immunodominant AQP4 peptide, and the stability of the complex may play important roles in the pathophysiology of NMOSD.

Methods
Plasmids. All HLA expression vectors were generated at the Research Institute for Microbial Diseases, Osaka University, Suita, Japan. Complementary DNAs (cDNAs) prepared from pooled human peripheral blood mononuclear cells (3H Biomedical AB, Uppsala, Sweden) were cloned into the pME18S vectors. cDNA sequences for HLA class II were based on information contained in the Immunogenetics/HLA Database (http:// www. ebi. ac. uk/ imgt/ hla/ index. html).  32 . Briefly, the fragment encoding the HLA-DQ α chain of each HLA-DQA1 expression vector was amplified by polymerase chain reaction (PCR) using KOD FX (Toyobo Co., Ltd., Osaka, Japan). Nhe1 cleavage sequence was added to the 5'side of the forward primer and the Not1 cleavage sequence to the 5′ side      www.nature.com/scientificreports/ Stimulating AQP4 peptides. We selected the following five immuodominant AQP4 peptides: AQP4 p61-80, p91-110, 131-150, 201-220 and 211-230. These peptides were commercially synthesized by GenScript ® Japan, Inc. (Tokyo, Japan) with purity > 95%. Sequences of these peptides were based on UniProt database (AQP4_HUMAN P55087-1). Each of these synthesized peptides was added to the culture medium of HLA expressing HEK293 cells at a concentration of 10 μg ml −1 , and the cell surface expression level of HLA was evaluated by flow cytometry after culturing for 8 h.

Construction of modified HLA-DQA1 expression vectors with swapped sequences. Each
HLA-DQA1 expression vector (DQA1*01:03, DQA1*03:03 and DQA1*05:03) was cut into two fragments, one of which encoded the α1 domain and the other the α2, TM and CY domain. They were electrophoresed on an agarose gel and their DNA fragments were extracted with the illustra™ GFX™ PCR DNA and Gel Band Purification Kit. In accordance with the manufacturer's instructions, the fragment encoding the α1 domain and the fragment encoding the α2, TM and CY domains were exchanged and ligated respectively using the DNA Ligation Kit Version 2.1.  www.nature.com/scientificreports/ Measurement of binding affinity of AQP4 peptides and HLA-DQ molecules using NetMHCIIpan 3.2. Measurement of binding affinity of AQP4 peptides and HLA-DQ molecules was performed with the artificial neural network-based alignment method NetMHCIIpan 3.2 (http:// www. cbs. dtu. dk/ servi ces/ NetMH CIIpan-3.2/), as described previously. This system was generated by using an extended data set of quantitative MHC-peptide binding affinity data obtained from the Immune Epitope Database 33 covering HLA-DR, HLA-DQ, HLA-DP and H-2 mouse molecules. Trained with these data, this system can predict binding affinity of any peptide for any MHC class II haplotype in silico. The output formats are IC50 and %Rank (percentile rank), and we adopted the latter in this study so that the values could be compared among different HLA haplotypes, as IC50 is defined as 50% binding inhibition concentration for standard peptide and the standard peptide varies depending on the HLA haplotype.

Construction of mutant
Energetic stability analysis and 3D models of pMHC complex. The pMHC complexes were modeled for each combination of 7 immunodominant AQP4 peptides (p11-30, p61-80, p91-110, p131-150, p201-220, p211-230 and p261-280), 3 HLA-DQA1 alleles (DQA1*01:03, DQA1*03:03 and DQA1*05:03), and 1 DQB allele (DQB1*03:01) using a customized version of the ImmuneScape 15 workflow. ImmuneScape requires not only MHC and epitope, but also TCR sequences; the program then returns a 3D model of the TCR-peptide-MHC complex. In order to ensure that the results did not depend on the chosen TCR, all complexes were modeled with 17 different TCRs. These TCRs were based on ImmuneScape's structure template library, which is a subset of experimentally verified structures from the PDB. The TCR sequence pairs (combined alpha and beta chains) were clustered using cd-hit with a similarity cutoff of 0.8 to remove redundancy. The resulting 17 cluster representatives were used for structural modeling, with the obtained EMPIRE energy results averaged across all 17 TCRs. ImmuneScape extracts 9-residue epitope cores from the full-length input peptides using NetMHCIIpan 14 predictions. For each TCR-epitope-MHC model, the binding energies (MHC alpha and beta vs epitope) were evaluated using EMPIRE 16 , which computes a statistics-based protein-protein binding energy.
Statistical analysis. All statistical analyses were performed by the Kruskal-Wallis test followed by Steel-Dwass multiple comparisons using EZR statistical software (Saitama Medical Center, Jichi Medical University, Saitama, Japan), which is a graphical user interface for R commander (The R Foundation for Statistical Computing, Vienna, Austria) designed to add statistical functions frequently used in biostatistics 34 . Data given in a box plot show the median (horizontal line within the box), the 25th and 75th percentile (boundary of the box), and the 10th and 90th percentile (whiskers above and below the box). Data points not included in this range are indicated by dots as outliers.