Polymorphism at rs9264942 is associated with HLA-C expression and inflammatory bowel disease in the Japanese

An expression quantitative trait locus (eQTL) single-nucleotide polymorphism (SNP) at rs9264942 was earlier associated with human leukocyte antigen (HLA)-C expression in Europeans. HLA-C has also been related to inflammatory bowel disease (IBD) risk in the Japanese. This study examined whether the eQTL SNP at rs9264942 regulates HLA-C expression and whether four-SNP haplotypes, comprising the eQTL SNP at rs9264942 and three IBD risk SNPs at rs2270191, rs3132550, and rs6915986 carried on the HLA-C*12:02~B*52:01~DRB1*15:02 haplotype, were associated with IBD in the Japanese. HLA-C expression on CD3e+CD8a+ lymphocytes was significantly higher for the CC or CT genotype than for the TT genotype of rs9264942. The TACC haplotype of the four SNPs was strongly associated with susceptibility to ulcerative colitis (UC) but with protection against Crohn’s disease (CD), as well as with clinical outcome. The CGTT haplotype was significantly protective against UC but not significantly associated with CD susceptibility, whereas the CGCT haplotype showed no significant protection against UC but was significantly associated with CD susceptibility. In conclusion, our findings support that the eQTL SNP at rs9264942 regulates HLA-C expression in the Japanese and suggest that the four SNPs, which are in strong linkage disequilibrium, may be surrogate marker candidates of a particular HLA haplotype, HLA-C*12:02~B*52:01~DRB1*15:02, related to IBD susceptibility and disease outcome.


Results
Comparisons of HLA-C expression on peripheral blood mononuclear cells by the eQTL rs9264942 SNP genotype in healthy Japanese subjects. A total of 32 healthy control subjects were included in the analysis of HLA-C expression on peripheral blood mononuclear cells (PBMC) (Table 1). The gating strategy for PBMC is shown in Fig. 1a. The cell surface expression of HLA-C on CD3e+CD8a+ T lymphocytes (Fig. 1b), as detected by flow cytometry in healthy subjects, was comparable between males and females (Fig. 1e). In contrast, HLA-C expression on CD3e+CD8a+ T lymphocytes (Fig. 1f), macrophages (Fig. 1c,g), and neutrophils (Fig. 1d,h) was significantly higher for the CC or CT genotype than for the TT genotype at rs9264942. Since another SNP at rs2395471 was also reported to be an eQTL SNP of HLA-C 23, we compared HLA-C expression on PBMC between the AA or AG genotype and the GG genotype at rs2395471. HLA-C expression on CD3e+CD8a+ T lymphocytes (Fig. 1i) was significantly higher for the AA or AG genotype than for the GG genotype at rs2395471, while HLA-C expression on macrophages (Fig. 1j) and neutrophils (Fig. 1k) was comparable between the groups.
Association of the eQTL rs9264942 SNP of HLA-C expression with IBD susceptibility. A total of 160 patients with UC and 275 patients with CD along with 325 healthy subjects were enrolled for this association study (Table 2). The C allele frequency of the eQTL SNP at rs9264942, which has been associated with HLA-C expression, was significantly higher in UC patients than in the healthy control group (UC vs. controls: 49.7% vs. 35.5%, odds ratio [OR] 1.79; pc = 9.52 × 10⁻⁴) (Table 3). In terms of clinical findings, there were no significant differences for UC between the CC or CT genotype and the TT genotype (Supplementary Table 1). There was also no remarkable difference in the C allele frequency of the rs9264942 SNP between the CD and healthy control groups (OR 1.00; pc = 1.000) (Table 3).
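As a rough check on the allele-level comparison above, the odds ratio and a Woolf-type 95% confidence interval can be sketched from a 2 × 2 allele count table. The allele counts below are not reported in the text; they are back-calculated from the stated percentages and cohort sizes (160 UC patients and 325 healthy subjects, two alleles each) and should be read as an assumption, as should the function name and the choice of the Woolf interval.

```python
import math


def odds_ratio_with_ci(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 allele table."""
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# ASSUMPTION: counts reconstructed from the reported frequencies, not taken from Table 3.
# UC: 320 alleles, about 49.7% C -> 159 C and 161 T.
# Healthy controls: 650 alleles, about 35.5% C -> 231 C and 419 T.
uc_c, uc_t = 159, 161
hc_c, hc_t = 231, 419

or_uc, ci_low, ci_high = odds_ratio_with_ci(uc_c, uc_t, hc_c, hc_t)
print(f"OR = {or_uc:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

With these reconstructed counts the odds ratio rounds to the reported value of 1.79; the interval is illustrative only, since the exact counts and the interval method used in the study are not given here.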
Association of three SNPs at rs2270191, rs3132550, and rs6915986 with IBD susceptibility. The allele frequencies of three SNPs at rs2270191, rs3132550, and rs6915986 in patients with UC and CD and in healthy and primary biliary cholangitis (PBC) disease controls are shown in Table 3. The frequencies of the T allele at rs2270191 in strong LD with the HLA-C*12:02 allele (r² = 1), the A allele at rs3132550 in strong LD with the HLA-B*52:01 allele (r² = 0.94), and the C allele at rs6915986 in strong LD with the HLA-DRB1*15:02 allele (r² = 0.89) were significantly higher in UC patients than in the healthy group but significantly lower in CD patients than in healthy controls (Table 3). These allele frequencies were comparable between healthy and PBC disease controls (Table 3). Haplotype frequencies were calculated using an expectation-maximization algorithm for the three SNPs (rs2270191, rs3132550, and rs6915986) and compared among UC, CD, PBC, and healthy control groups. The three-SNP haplotype of TAC, which was predicted to be associated with the particular HLA haplotype of HLA-C*12:02~B*52:01~DRB1*15:02, showed a strong susceptibility association for UC versus controls (OR 2.53; pc = 3.92 × 10⁻⁷) but a protective association for CD versus controls (OR 0.50; pc = 0.002) (Table 4). In contrast, the three-SNP haplotype of CGT showed a strong protective association for UC versus controls (OR 0.39; pc = 1.58 × 10⁻⁸) but a significant susceptibility association for CD versus controls (OR 1.90; pc = 1.19 × 10⁻³) (Table 4). The three-SNP haplotypes of CGT and TAC accounted for the vast majority of the cohort, which indicated that the SNPs were in strong LD with each other, as shown in Fig. 2. Moreover, these three-SNP haplotype frequencies were comparable between healthy and PBC disease controls (Table 4).
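The expectation-maximization step used for haplotype frequency estimation can be illustrated with a minimal sketch for two biallelic SNPs. This is not the implementation used in the study (which relied on dedicated software and three or four SNPs); the function name, the 0/1/2 genotype coding, and the toy genotypes are all hypothetical. The key idea is that only double heterozygotes have ambiguous phase, so their haplotype counts are split according to the current frequency estimates.

```python
def em_two_snp_haplotypes(genotypes, n_iter=100, tol=1e-10):
    """Estimate two-SNP haplotype frequencies from unphased genotypes by EM.

    genotypes: list of (g1, g2), where each value is the number (0, 1, or 2)
    of copies of allele "1" carried at SNP 1 and SNP 2.
    Returns a dict mapping haplotype (a1, a2) to its estimated frequency.
    """
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    freq = {h: 0.25 for h in haps}  # uninformative starting point
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = {h: 0.0 for h in haps}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: a double heterozygote is either 11/00 or 10/01.
                cis = freq[(1, 1)] * freq[(0, 0)]
                trans = freq[(1, 0)] * freq[(0, 1)]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                counts[(1, 1)] += w
                counts[(0, 0)] += w
                counts[(1, 0)] += 1 - w
                counts[(0, 1)] += 1 - w
            else:
                # Phase is unambiguous when at most one SNP is heterozygous.
                h1 = (1 if g1 == 2 else 0, 1 if g2 == 2 else 0)
                h2 = (1 if g1 >= 1 else 0, 1 if g2 >= 1 else 0)
                counts[h1] += 1
                counts[h2] += 1
        # M-step: expected haplotype counts become the new frequencies.
        new_freq = {h: c / n_chrom for h, c in counts.items()}
        delta = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if delta < tol:
            break
    return freq


# Hypothetical toy data: the two SNPs always vary together (complete LD).
toy_genotypes = [(0, 0)] * 10 + [(1, 1)] * 5 + [(2, 2)] * 2
toy_freq = em_two_snp_haplotypes(toy_genotypes)
```

In this toy data set the estimated frequencies of the mixed haplotypes (1, 0) and (0, 1) shrink toward zero, which mirrors the situation in which only two haplotypes account for nearly all chromosomes when SNPs are in strong LD.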
Logistic regression analysis of SNPs associated with IBD susceptibility. The strong LD of the three SNPs at rs2270191, rs3132550, and rs6915986 indicated that simultaneous logistic regression analysis to determine which SNP was independently associated with IBD susceptibility was inappropriate due to multi-collinearity. However, since rs9264942 SNP distribution was different from that of the other three SNPs regardless of being within a haplotype block (Fig. 2 and Supplementary Fig. 1 …
Figure 1. Representative HLA-C expression on CD3e+CD8a+ lymphocytes (b), macrophages (c), and neutrophils (d) (red) in relation to the isotype control (gray). Comparisons of geometric mean fluorescence intensity quantification of the cell surface expression of HLA-C on CD3e+CD8a+ lymphocytes between males and females (e) and between the CC or CT genotype and the TT genotype at rs9264942 on CD3e+CD8a+ lymphocytes (f), macrophages (g), and neutrophils (h) as well as between the AA or AG genotype and the GG genotype at rs2395471 on CD3e+CD8a+ T lymphocytes (i), macrophages (j), and neutrophils (k).
Haplotype analysis of the SNPs at rs2270191, rs3132550, rs9264942, and rs6915986. Haplotype frequencies were calculated using an expectation-maximization algorithm for the four SNPs (rs2270191, rs3132550, rs9264942, and rs6915986) and compared among UC, CD, PBC, and healthy control groups. The four-SNP haplotype of TACC, which was predicted to be associated with the particular HLA haplotype of HLA-C*12:02~B*52:01~DRB1*15:02, showed a strong susceptibility association for UC versus controls (OR 2.53; pc = 3.96 × 10⁻⁷), but a protective association for CD versus controls (OR 0.50; pc = 0.002) (Table 4).
The four-SNP haplotype of TATC could not be calculated, which indicated that all subjects carrying the particular HLA haplotype of HLA-C*12:02~B*52:01~DRB1*15:02 had the TACC haplotype, i.e., the SNP at rs9264942 of subjects carrying the HLA haplotype of HLA-C*12:02~B*52:01~DRB1*15:02 was the C allele (CC or CT genotype).    (Table 5).

Discussion
This study clearly demonstrated a significant association of an eQTL SNP of HLA-C with the regulation of HLA-C expression on PBMC in Japanese subjects. Moreover, four-SNP haplotypes consisting of the eQTL SNP of HLA-C and three SNPs of HLA-C*12:02~B*52:01~DRB1*15:02 were significantly related to susceptibility to UC but protection against CD. The cell surface expression of HLA-C is regulated through multiple mechanisms 30. Located 35 kb upstream of the HLA-C gene (Supplementary Fig. 1), the SNP variant at rs9264942 has been associated with HLA-C surface expression in subjects of European descent 24,31,32, but not in an African American population 21, suggesting that the function of the eQTL SNP at rs9264942 in HLA-C expression varies among population groups. We therefore examined this relationship in the Japanese and identified a significant association of the eQTL SNP at rs9264942 with HLA-C expression on PBMC, specifically on CD3e+CD8a+ lymphocytes, macrophages, and neutrophils. HLA-C cell surface expression differed among cell populations, which supported previous findings of relatively low HLA-C expression on lymphocytes 33-35 and high expression on antigen-presenting cells, such as macrophages 36. The increased expression of HLA-C molecules promotes antigen presentation and recognition by CD8a+ cytotoxic T cells and natural killer cells for host protection, but in some instances leads to multiple diseases. HLA-C surface expression is also regulated by microRNA miR-148a binding on the 3′ untranslated region (3′UTR) of the HLA-C gene 37,38. Therefore, interactions between the eQTL SNP of HLA-C at rs9264942 and miR-148a binding on the 3′UTR of the HLA-C gene merit future consideration to uncover the precise genetic and molecular mechanisms of HLA-C expression. We also compared HLA-C expression on PBMC between the AA or AG genotype and the GG genotype at rs2395471, which was reported as another eQTL SNP of HLA-C 23. HLA-C expression on CD3e+CD8a+ T lymphocytes was significantly higher for the AA or AG genotype than for the GG genotype at rs2395471, but no differences were observed on macrophages or neutrophils. Additional molecular mechanisms of HLA-C expression regulation on PBMC subsets by eQTL SNP differences require assessment in future studies.
www.nature.com/scientificreports/
Table 5. Relationship between haplotypes from four SNPs (rs2270191, rs3132550, rs9264942, and rs6915986) and clinical findings in CD. Data are expressed as the number (%) except for age, which is expressed as the median (first-third quartile). SNP single-nucleotide polymorphism, CD Crohn's disease, OR odds ratio, CI confidence interval, IBD inflammatory bowel disease, 5-ASA 5-aminosalicylic acid, PSL prednisolone, CAP cytapheresis, AZA azathioprine, 6-MP 6-mercaptopurine, Tac tacrolimus, CyA cyclosporin.
Okada et al. reported that the HLA-C*12:02~B*52:01~DRB1*15:02 haplotype was associated with susceptibility to UC as well as protection against CD in the Japanese 25. In patients carrying the particular three-SNP haplotype of TAC at rs2270191, rs3132550, and rs6915986, this haplotype showed very similar susceptibility to UC as did the HLA-C*12:02~B*52:01~DRB1*15:02 haplotype 25. Moreover, the TAC haplotype frequency was low in CD patients, which supported a similar finding for the HLA-C*12:02~B*52:01~DRB1*15:02 haplotype frequency in a previous report 25. Our results indicated that three-SNP haplotype assignments could serve as surrogate markers for disease susceptibility or protectivity to IBD instead of HLA ones. Although the HLA-C eQTL SNP at rs9264942 was located on this haplotype, the r² values between the SNPs were relatively high apart from those between the SNP at rs9264942 and the others, which were less than 30. This indicated that LD was broken only by rs9264942 and therefore raised the possibility that the eQTL SNP at rs9264942 had an independent function in terms of IBD susceptibility. Thus, we analyzed the relationship of four-SNP haplotypes with IBD risk. The SNP at rs9264942 was related to CD susceptibility in European patients by the regulation of HLA-C expression 21,24.
We detected an association of the TACC haplotype at rs9264942, but not the TATC haplotype, which suggested that IBD susceptibility might not be related to the eQTL SNP at rs9264942 if patients carried the three SNPs of the TAC haplotype, thereby depending on a particular HLA-C*12:02~B*52:01~DRB1*15:02 haplotype 25. Reciprocally, the CGTT haplotype in patients with the T allele at rs9264942 (low HLA-C expression group) was associated with UC protection, whereas the CGCT haplotype in patients with the C allele at rs9264942 (high HLA-C expression group) exhibited no remarkable association (Table 4). On the other hand, the CGTT haplotype was not associated with CD, although the CGCT haplotype had a significant association with CD susceptibility. This was consistent with a previous report that high HLA-C expression was related to CD susceptibility 21. Attempts to confirm the independent involvement of the SNP at rs9264942 in IBD susceptibility or protection by stratified analysis using the Mantel-Haenszel test were inconclusive (data not shown), which might have been due to a relatively small number of subjects or a stronger association of a particular haplotype. Larger studies are needed to validate our findings. Regarding disease phenotype, it was interesting that CD patients without the TACC haplotype had significantly more intestinal complications (intestinal stenosis, perforation, and fistula) and history of intestinal surgery, suggesting that the HLA-C*12:02~B*52:01~DRB1*15:02 haplotype might impart a stronger effect on clinical results. Based on the above findings, these four SNPs may be simple but useful surrogate markers for predicting HLA-C*12:02~B*52:01~DRB1*15:02 haplotypes as well as disease outcome. Other HLA class II alleles have been associated with IBD onset among populations, such as HLA-DRB1*01:03 with disease predisposition to CD and UC in Europeans, but not in those of East Asian backgrounds 39,40.
Further studies are required to identify the direct mechanisms of involvement of these genetic variants on disease susceptibility and outcome.
The present study had several limitations. First, it evaluated HLA-C expression on the PBMC of healthy controls. As the number of normal controls was too small for a definitive conclusion, a larger validation analysis with more subjects is needed to confirm our results. However, earlier studies have used as few as 30 healthy controls 31. HLA-C expression in tissue samples should also be analyzed to confirm the molecular mechanisms of HLA-C involvement in IBD. Second, this investigation was preliminary in nature because the numbers of cases and controls were limited. However, we adopted a PBC disease control cohort for the association analysis, whereby the IBD cohort was significantly different from healthy controls as well as from PBC patient controls regarding genetic frequency. Additional investigation is required to validate these newly discovered associations in a larger number of individuals; however, power calculations based on the study subjects of 160 UC patients and 275 CD patients and an effect size of 0.28 at rs9264942 showed sufficient detection power (0.99) at the 0.05 level of significance 41. Moreover, the calculated type I and type II errors were 0.001 and 0.004, respectively, which suggested that statistical error could be avoided. Third, the HLA-C*12:02~B*52:01~DRB1*15:02 haplotype should be examined in more detail to strengthen the results of this study. Fourth, this investigation was retrospective in nature and did not include a segregation analysis; prospective longitudinal studies are needed to clarify the associations of genetic polymorphisms with IBD outcomes.
In conclusion, our findings supported that the eQTL SNP at rs9264942 regulated HLA-C expression in the Japanese. Our association study also implicated four SNPs in strong LD of a particular HLA haplotype, HLA-C*12:02~B*52:01~DRB1*15:02, with IBD susceptibility and outcome, which might be potential surrogate markers for the disease.

Study subjects.
A total of 32 healthy control subjects were included for the analysis of HLA-C expression on PBMC (Table 1). The controls were volunteers from hospital staff who had indicated the absence of any major illnesses and no direct familial relations in a standard questionnaire.
A total of 160 patients with UC and 275 patients with CD who were seen between April 2014 and May 2019 at Shinshu University School of Medicine, Matsumoto, Japan, Japanese Red Cross Society Suwa Red Cross Hospital, Suwa, Japan, or Tokyo Yamate Medical Center, Tokyo, Japan, along with 325 healthy subjects including the above-described 32 healthy control subjects were enrolled for this association study (Table 2). A total of 328 PBC patients who were diagnosed as having PBC at Shinshu University Hospital between 1986 and 2015 were also included as a disease control group in this analysis. No participants were direct relatives of each other, and all were of an East-Asian genetic background.
The diagnoses of UC and CD were based on established Japanese Society of Gastroenterology guidelines using endoscopic, histologic, radiographic, and serologic findings 42. We also sub-classified the IBD patients into disease parameters based on the phenotype of disease location (colonic, ileal, ileocolonic, and not determined in CD and proctitis, left-sided colitis, pancolitis, and not determined in UC). … prednisolone (PSL) and anti-TNF-α was recommended by the guidelines for moderate or severe UC or CD 42. Oral or intravenous PSL was advised as a remission induction therapy for left-sided colitis or pancolitis in UC, while anti-TNF-α was advocated for induction or maintenance therapy. A PSL-positive status was designated for patients with a history of PSL administration. A PSL- and anti-TNF-α-positive status was designated for patients who had received both PSL and anti-TNF-α. The diagnosis of PBC was based on the criteria from the Japan Society of Hepatology 45. This sub-cohort has been well characterized in prior genetic and clinical studies 46-54.
Flow cytometry analysis of HLA-C expression on CD3e+CD8a+ lymphocytes. PBMC were obtained one day after the hemolysis of red blood cells in ACK lysing buffer (Thermo Fisher Scientific K.K., Tokyo, Japan) according to the manufacturer's instructions. PBMC were incubated with antibodies that included 7-AAD (BioLegend, San Diego, USA), APC-anti-CD3e (BioLegend), FITC-anti-CD8a (BioLegend), and PE-anti-HLA-C (BD Biosciences) or PE-HLA-C-isotype control by IgG1 κ (BioLegend) for 30 min on ice, washed twice with staining buffer, and then analyzed using a BD FACSCanto™ II flow cytometer (BD Biosciences, New Jersey, USA). HLA-C expression was determined by median fluorescence intensity on CD3e+CD8a+ lymphocytes, CD3e+CD8a− lymphocytes, neutrophils, and macrophages in relation to the isotype control. The gating strategy is shown in Fig. 1a.
Obtained data were analyzed by FlowJo software (BD Biosciences). All samples were simultaneously stained and analyzed in a one-day experiment.
SNP determination. Genomic DNA from all participants was isolated from whole blood samples using Quick Gene-610L assays (Fujifilm, Tokyo, Japan). The DNA concentration was adjusted to 10-15 ng/μL for SNP genotyping experiments. Two eQTL SNPs at rs9264942 and rs2395471 were selected based on previous studies 22,23. Three SNPs at rs2270191, rs3132550, and rs6915986, which were in strong LD with the HLA-C*12:02~B*52:01~DRB1*15:02 haplotype, were also adopted based on an earlier report 22. The genotyping of five SNPs located at rs9264942, rs2395471, rs2270191, rs3132550, and rs6915986 was performed with a TaqMan 5′ exonuclease assay using primers supplied by Applied Biosystems (Foster City, CA, USA). The probe's fluorescence signal was detected with a StepOne Plus Real-Time PCR System (Thermo Fisher Scientific) according to the manufacturer's instructions.
SNP haplotype analysis. Haploview version 4.2 55,56 was used to evaluate the haplotype structure of the four SNPs at rs2270191, rs3132550, rs9264942, and rs6915986 (Fig. 2). Pairwise LD patterns and haplotype frequencies for all SNPs in patients and controls were analyzed using the block definition established by Gabriel et al. 57.
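For orientation, the pairwise LD statistic r² can be sketched from haplotype and allele frequencies. This is a generic textbook formula, not Haploview's internal code; the function name and the example frequencies are hypothetical. With D = p_AB − p_A·p_B, the statistic is r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)).

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Pairwise LD r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


# Hypothetical frequencies: alleles A and B always travel together (complete LD).
r2_complete = ld_r_squared(p_ab=0.3, p_a=0.3, p_b=0.3)

# Hypothetical frequencies: haplotype frequency equals the product of allele
# frequencies, i.e. the two SNPs are independent.
r2_independent = ld_r_squared(p_ab=0.12, p_a=0.3, p_b=0.4)
```

The first call gives a value of essentially 1 and the second a value of essentially 0, the two extremes between which the pairwise values in an LD plot fall.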
Statistical analysis. The significance of associations was evaluated using chi-squared analysis or Fisher's exact test. p values were subjected to Bonferroni's correction by multiplication by the number of different alleles observed in each locus (pc). The Mann-Whitney U-test was used to analyze continuous variables. Stepwise logistic regression analysis with a forward approach was performed to identify independent SNPs associated with IBD susceptibility. A two-sided p value of < 0.05 was considered statistically significant. Association strength was estimated by calculating the OR and 95% CI with StatFlex software (version 7.0.3, Artech, Osaka, Japan) as well as IBM SPSS Statistics version 23.0 (IBM, Chicago, IL, USA).
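The allele-level test and correction described above can be sketched as follows. This is an illustration under stated assumptions rather than the StatFlex or SPSS procedure: it uses a Pearson chi-squared test without continuity correction on a 2 × 2 table, obtains the p value for one degree of freedom from the complementary error function, and multiplies by the number of alleles at the locus (two for a biallelic SNP), capping the result at 1. The function names and counts are hypothetical.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) and p value
    for the 2x2 table [[a, b], [c, d]] with one degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def bonferroni(p_value, n_comparisons):
    """Bonferroni-corrected p value (pc), capped at 1."""
    return min(1.0, p_value * n_comparisons)


# Hypothetical allele counts: cases 30 vs. 70, controls 20 vs. 80.
chi2, p = chi_squared_2x2(30, 70, 20, 80)
pc = bonferroni(p, n_comparisons=2)  # two alleles at a biallelic SNP
print(f"chi2 = {chi2:.3f}, p = {p:.4f}, pc = {pc:.4f}")
```

The cap at 1 explains why a corrected value such as pc = 1.000 can appear for comparisons with no detectable difference.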