Principal contribution of HLA-DQ alleles, DQB1*06:04 and DQB1*03:01, to disease resistance against primary biliary cholangitis in a Japanese population

Identification of the primary allele(s) in HLA class II associated diseases remains challenging because of the tight linkage between alleles of the HLA-DR and -DQ loci. In the present study, we determined the genotypes of seven HLA loci (HLA-A, -B, -DRB1, -DQA1, -DQB1, -DPA1 and -DPB1) for 1200 Japanese patients with primary biliary cholangitis and 1196 controls. Observation of recombination derivatives facilitated an evaluation of the effects of the individual HLA alleles constituting disease-prone/disease-resistant HLA haplotypes. Consequently, a primary contribution of DQB1*06:04 (odds ratio: 0.19, p = 1.91 × 10−22), DQB1*03:01 (odds ratio: 0.50, p = 6.76 × 10−10), DRB1*08:03 (odds ratio: 1.75, p = 1.01 × 10−7) and DQB1*04:01 (odds ratio: 1.50, p = 9.20 × 10−6) was suggested. Epistasis of the protective DQB1*06:04 to the risk conferred by DRB1*08:03 was demonstrated by subpopulation analysis, implying the presence of an active immunological mechanism that alleviates pathogenic autoimmune reactions. Further, the contribution of the aforementioned HLA alleles, as well as an HLA-DP allele, DPB1*02:01, to the association signals of 304 loci among 4103 SNPs in the HLA region at the genome-wide level of significance (p values less than 5 × 10−8) was demonstrated by the stepwise exclusion of the individuals possessing these HLA alleles from the comparison.

As is the case in other autoimmune disorders, PBC has been associated with HLA polymorphisms [8][9][10][11][12][13], and in most of these conditions, the impact of specific HLA alleles on the antigenic repertoire of effector cells has been suggested as a mechanism underlying autoimmunity.
Genome-wide association studies (GWAS) of PBC in different populations have revealed the involvement of genetically determined alterations in certain immunological pathways, such as those related to IL12 signal transduction, TNF/TLR signal transduction, and B cell differentiation to plasma cells. However, most of these genes have not been universally identified in studies thus far, with the exception of polymorphic markers in the HLA region [14][15][16]. One of the outstanding features of HLA genes is that they exhibit the highest degree of polymorphism among human functional genes. Hundreds to thousands of alleles have been identified at the loci encoding HLA class I (HLA-A, -B, and -C) and class II (HLA-DR, -DQ, and -DP) molecules, some of which exist in particular preferential combinations known as "common HLA haplotypes" in a relatively ethnicity-specific manner. In the present study, we examined the effects of HLA polymorphisms on the development of PBC and demonstrated that multiple HLA alleles contribute to the highly significant genome-wide association signals of single-nucleotide polymorphisms (SNPs) in the HLA region.

Results
Clinical characteristics of the study population. This study enrolled 1200 Japanese patients with PBC (Table 1). A female predominance was observed, with a female to male ratio of 7.63. The majority of patients (71.6%) did not progress beyond clinical stage I by the time of their latest clinical evaluation. Patients in the clinical stage III group included 112 cases who had undergone liver transplantation (9.4%) and 8 cases who died of progression to hepatic failure (0.7%). Clinical and histological staging did not differ between genders.
Concomitant autoimmune disorders were generally more prevalent in female patients. Among them, Sjögren's syndrome, systemic sclerosis, and rheumatoid arthritis differed significantly in prevalence between female and male patients. The AMA-positive rate was higher in male than in female patients (93.6% vs. 87.4%), but other autoantibodies that may accompany autoimmune complications were more prevalent in female patients. Aside from the higher prevalence of PBC and autoimmune diseases in females, there were no gender-based [...]
HLA haplotype analysis. Because the linkage between certain HLA alleles is so tight, a high level of linkage disequilibrium (LD) occurs. As a result, a portion of the significant difference in allele or carrier frequencies observed at the loci of interest in patient-control comparisons may be attributable to over- or under-representation of alleles at other loci, which are more likely to be the causative variants. We tried to identify such primary associations among the significant DRB1 and DQB1 alleles by haplotype analysis of the four most significant DRB1-DQB1 combinations ([...]).
Interaction between DR-DQ risk/protective factors. The effect of a given risk/protective factor may enhance or attenuate the action of a second factor beyond the extent anticipated by an independent additive effect model. The interactions between the four most significant primary risk/protective factors identified above were analyzed. For this purpose, augmentation or attenuation of the effect was evaluated by comparing the frequencies of carriers of the factor of interest in subpopulations stratified by the presence or absence of each of the other three factors (Table 4). For example, the effect of DRB1*08:03 was not influenced by the presence of DQB1*03:01 or DQB1*04:01, but was profoundly affected by the presence of DQB1*06:04 (Table 4).
It is noteworthy that the interaction between DRB1*08:03 and DQB1*06:04 was asymmetrical: the protective effect of DQB1*06:04 was not influenced by the disease-promoting effect of DRB1*08:03 (OR = 0.14, p = 0.00417), while the disease-promoting effect of DRB1*08:03 was almost completely negated by the presence of DQB1*06:04 (OR = 1.08, p = 0.92). A similar but inverse asymmetric interaction between DRB1*08:03 and DQB1*03:01 was also demonstrated by stratification analysis: the risk conferred by DRB1*08:03 was evident in the presence of DQB1*03:01 (OR = 2.71, p = 0.000882), but the protective effect conferred by DQB1*03:01 was not observed in the presence of DRB1*08:03 (OR = 0.78, p = 0.41).
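The stratification described above can be illustrated with a minimal Python sketch. It assumes each individual is recorded as a patient/control flag plus a set of carried HLA alleles; the record layout, the function names and the counts are hypothetical and synthetic, chosen only to show the mechanics, and are not data from this study.

```python
def two_by_two(cohort, factor):
    """Count carriers / non-carriers of `factor` among patients and controls."""
    a = sum(1 for p in cohort if p["patient"] and factor in p["hla"])
    b = sum(1 for p in cohort if p["patient"] and factor not in p["hla"])
    c = sum(1 for p in cohort if not p["patient"] and factor in p["hla"])
    d = sum(1 for p in cohort if not p["patient"] and factor not in p["hla"])
    return a, b, c, d

def odds_ratio(a, b, c, d):
    """Cross-product odds ratio; None when it is undefined."""
    return (a * d) / (b * c) if b * c else None

def stratified_effect(cohort, factor, other):
    """Odds ratio of `factor` among carriers (True) and non-carriers (False) of `other`."""
    effects = {}
    for present in (True, False):
        stratum = [p for p in cohort if (other in p["hla"]) == present]
        effects[present] = odds_ratio(*two_by_two(stratum, factor))
    return effects

def people(n, patient, *alleles):
    """Build n identical synthetic records (illustration only)."""
    return [{"patient": patient, "hla": set(alleles)} for _ in range(n)]

RISK, PROTECTIVE = "DRB1*08:03", "DQB1*06:04"
# Synthetic counts, NOT study data: the risk allele is enriched in patients
# only in the stratum that lacks the protective allele.
synthetic = (
    people(30, True, RISK) + people(70, True)
    + people(15, False, RISK) + people(85, False)
    + people(4, True, RISK, PROTECTIVE) + people(16, True, PROTECTIVE)
    + people(10, False, RISK, PROTECTIVE) + people(40, False, PROTECTIVE)
)
effects = stratified_effect(synthetic, RISK, PROTECTIVE)
print(effects)
```

In this toy cohort the odds ratio of the risk allele is about 2.4 among non-carriers of the protective allele and 1.0 among carriers, which is the pattern of attenuation that the stratified comparison is designed to detect.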

SNP association.
For all patients and controls, 4103 SNPs in the HLA region (bounded by rs446198 at position chr 6:29507426 of the GRCh37 assembly and rs367408 at position chr 6:33505746) were genotyped 16,18 and the data were further analyzed for the 1200 patients with PBC and 1196 controls whose HLA data were available. For each SNP locus, two p values were obtained by applying the dominant effect model to either the predominant allele or the less frequent allele, and the smaller of the two was taken as the effect of that locus. By this measure, 305 SNPs showed p values less than 5 × 10−8 (Fig. 1). Among them, rs9268644 near the HLA-DRA locus gave the minimal p value (p = 5.64 × 10−24) with an odds ratio of 0.39 (Table 5). In our first round of GWAS, rs9275175 in the HLA-DQB1 locus was identified as the most significant SNP 16, but it was not as significant as rs9268644 in this setting (OR = 0.41, p = 6.06 × 10−18, Table 5). The 305 nominally significant association signals were distributed from the HLA-A to the HLA-DP loci, in accordance with the results of the HLA association analysis (Fig. 1A).
To evaluate the contribution of HLA alleles to the association signals of these SNPs in the HLA region, individuals carrying the HLA allele of interest were excluded from the dominant effect models. We then calculated the odds ratio and p value as described above and compared the p values before and after the stepwise exclusion of the HLA alleles of interest (Table 5 and Fig. 1B-H). Among the four most significantly associated HLA alleles, DQB1*06:04 showed the strongest impact: in the comparison between 1163 DQB1*06:04-negative patients and 1025 DQB1*06:04-negative controls, only 62 SNPs remained with a p value less than 5 × 10−8 (Table 5 and Fig. 1B). Exclusion of each of the other three alleles also reduced the signals, but to a lesser extent (Fig. 1C-E). Furthermore, when carriers of the two protective alleles DQB1*06:04 and DQB1*03:01 were excluded together, 302 signals fell below the significance threshold (Fig. 1F), while the combination of the two disease-promoting alleles, DRB1*08:03 and DQB1*04:01, had a weaker impact (Fig. 1G), even though the number of remaining patients and controls was larger than in the case of the protective alleles. In the absence of the four HLA-DR/DQ alleles, no SNP reached a nominal genome-wide significance level, but a peak association signal was found in the HLA-DPB1 locus at rs9277509, with p = 2.30 × 10−7 and an odds ratio of 0.49 in the comparison of the four HLA-DR/DQ-negative subpopulations of patients (N = 476) and controls (N = 440) (Table 5 and Fig. 1H). The fact that this SNP association signal remained after excluding the carriers of the four HLA-DR/DQ alleles could be explained by DPB1*02:01, which exhibited a similar p value (3.08 × 10−7) in the same subpopulation analysis (Fig. 1H).
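The exclusion procedure can be sketched in Python as follows. This is a minimal illustration, not the software used in the study: the record layout (`patient`, `hla`, `snp`), the helper names and the cohort are hypothetical, the counts are synthetic, and the test is assumed to be an uncorrected 1-df Pearson chi-square. The synthetic SNP is constructed to tag DQB1*06:04 perfectly, so its signal disappears once carriers are removed.

```python
import math

def chi2_p(a, b, c, d):
    """p value of the 1-df Pearson chi-square test on a 2x2 table."""
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:  # degenerate table: no test is possible
        return 1.0
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / margins
    return math.erfc(math.sqrt(chi2 / 2))

def snp_p_value(cohort):
    """Smaller p value of the two dominant models for one SNP.

    `snp` is the number of copies (0, 1 or 2) of the less frequent allele."""
    p_values = []
    for carries in (lambda g: g >= 1,   # dominant model, less frequent allele
                    lambda g: g <= 1):  # dominant model, predominant allele
        a = sum(1 for p in cohort if p["patient"] and carries(p["snp"]))
        b = sum(1 for p in cohort if p["patient"] and not carries(p["snp"]))
        c = sum(1 for p in cohort if not p["patient"] and carries(p["snp"]))
        d = sum(1 for p in cohort if not p["patient"] and not carries(p["snp"]))
        p_values.append(chi2_p(a, b, c, d))
    return min(p_values)

def exclude_carriers(cohort, alleles):
    """Drop every individual who carries any of the given HLA alleles."""
    return [p for p in cohort if not (p["hla"] & alleles)]

def people(n, patient, hla, snp):
    """Build n identical synthetic records (illustration only)."""
    return [{"patient": patient, "hla": set(hla), "snp": snp} for _ in range(n)]

# Synthetic cohort, NOT study data.
cohort = (
    people(5, True, {"DQB1*06:04"}, 1) + people(95, True, set(), 0)
    + people(30, False, {"DQB1*06:04"}, 1) + people(70, False, set(), 0)
)
p_before = snp_p_value(cohort)
p_after = snp_p_value(exclude_carriers(cohort, {"DQB1*06:04"}))
print(p_before, p_after)
```

A SNP whose association merely reflects linkage with the excluded allele loses its signal in the remaining subpopulation, whereas an independent signal would persist.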

Discussion
The association of certain SNP alleles in the HLA class II region with the development of PBC has been consistently reported as a major finding in several GWAS across different ethnic groups, including Japanese 14-16.
These results suggest that one or more HLA class II-linked genetic factors influence susceptibility to PBC, a theory that has been strongly supported by a number of HLA association studies [8][9][10][11][12][13]. In the present study, we recruited a large number of patients and healthy control individuals and could therefore obtain confirmatory results at genome-wide significance levels (p < 5 × 10−8) for the strongest disease-promoting effect of DRB1*08:03 and the strongest protective effect of DQB1*06:04 in Japanese individuals. The effects of both haplotypes were previously reported by us 10 and others 12. Further, 22 out of the 87 HLA factors examined in the present study were significant after Bonferroni's correction for multiple testing (p < 5.75 × 10−4), which is known to be very conservative. This was the case even when we applied the statistical test for the comparison of carrier frequencies, which is generally less sensitive than the test for allele frequencies but is more relevant for the dominant model of inheritance. Because of LD between the alleles carried by common HLA haplotypes in the ethnic population of interest, some of our findings may result from secondary associations due to LD with the primary allele. Therefore, we performed haplotype association analysis to discern possible interdependencies among them. We identified four major risk/protective factors in the HLA-DR-DQ region for PBC in a Japanese population: DQB1*06:04, DRB1*08:03, DQB1*03:01, and DQB1*04:01. As discussed in previous reports 8,9,11, some of these risk/protective HLA alleles share similarities with clinically important alleles in other populations, such as the risk-increasing DRB1*08:01 allele and the protective DRB1*11 and DRB1*13 alleles in individuals of European descent.
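The Bonferroni threshold quoted above can be checked with a few lines of Python. The family-wise error rate of 0.05 is an assumption here; the text gives only the number of HLA factors tested (87) and the resulting threshold.

```python
alpha = 0.05     # assumed family-wise error rate (not stated explicitly in the text)
n_tests = 87     # number of HLA factors examined, as stated in the text

# Bonferroni: each individual test must pass alpha divided by the number of tests
threshold = alpha / n_tests
print(f"{threshold:.2e}")
```

Under that assumption the per-test threshold rounds to 5.75 × 10−4, in agreement with the value reported in the text.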
As shown in previous studies, the molecular basis underlying the effects of alleles, whether they are risk-promoting or protective, may involve common amino acid residues exclusive to each group 8,12. However, false association errors may have hindered these studies, and determining the principally associated allele is therefore critical for this type of analysis. The identification of multiple independent risk/protective factors within the HLA region prompted us to evaluate the effects of an interaction between them, because this had not previously been done except in a study examining the genotype effect of the HLA-DRB1 locus 9, in which the risk-promoting effect of DRB1*08 and the protective effect of DRB1*11 were independent and competed with each other. Some risk/protective factors behaved differently, being either enhanced or attenuated, depending on the presence or absence of another factor. For example, the major disease-promoting effect associated with DRB1*08:03 disappeared almost completely in the presence of the protective DQB1*06:04. A similar but opposite relationship between the disease-promoting HLA-DR and the protective HLA-DQ factors was observed between DRB1*08:03 and DQB1*03:01. Recently, DQB1*06:04 (and DRB1*13:02) was also reported to be a protective allele against autoimmune thyroid diseases 19; this allele exhibited dominant epistatic effects on HLA risk factors in a fashion similar to that observed in our study. Elucidating the immunological implications of the unrivaled protective effect conferred by DQB1*06:04 may lead to the identification of an active suppressive mechanism for the development of PBC, and could thus identify potential targets for disease prevention.
Table 4. Effect of the risk/protective DRB1/DQB1 alleles in the presence/absence of another risk/protective DRB1/DQB1 allele. The effect of the most significant risk/protective DRB1 and DQB1 alleles was evaluated by comparing the frequencies between patients with PBC and the control population in the presence/absence of another risk/protective allele. Significant differences (p < 0.05) in the comparisons are highlighted in bold. Surviving table fragment: odds ratio (95% CI), p: 0.15 (0.03-0.73), p = 0.0068; 0.21 (0.14-0.30), p = 8.16 × 10−19.
Epigenetic control of gene expression is another factor that could further elucidate the genetic contribution to PBC susceptibility, and it is not covered by studies of genetic polymorphisms such as our HLA analysis or GWAS. Indeed, alterations in DNA methylation patterns in immune cells have been found in patients with PBC 20,21. Furthermore, somatic changes in the genetic material, such as sex chromosome loss leading to monosomy X, have also been reported and may elucidate the mechanism underlying the female predominance of PBC 22,23.
In summary, this study analyzed a large population of patients with PBC and an equivalently sized control group to confirm the presence of multiple disease-promoting and -protective genetic factors in the HLA region. Interactions between these genetic factors will provide a better understanding of the complicated pathogenic mechanisms of PBC.

Methods
Study design. In this case-control study, patient samples were collected at 60 medical institutions in Japan, 32 of which belong to the National Hospital Organization Study Group for Liver Disease in Japan (NHOSLJ). After obtaining written informed consent, we collected patient blood samples for serum and DNA analysis. All study protocols were approved by the institutional review boards of Nagasaki University, NHOSLJ and the other participating institutions according to the Declaration of Helsinki issued by the World Medical Association.
Subject population. All patients met at least two of the following three criteria for the definitive diagnosis of PBC: (i) persistent elevation of serum alkaline phosphatase, an enzyme indicative of cholestasis; (ii) positive AMA test; and (iii) liver biopsy showing non-suppurative inflammation and destruction of the interlobular bile ducts (florid duct lesions), which are characteristics of PBC 3. Patients with positive serological markers for persistent hepatitis B or C virus infection were excluded from this study. Liver biopsy data were available for 857 of the 1280 patients (67.0%). Histological diagnosis and staging were performed according to Scheuer's classification 3. Patients were categorized into three clinical stages based on liver biopsy results and clinical manifestations: clinical stage I, Scheuer's stage 1 or 2 in liver biopsy or unknown histological stage without signs of portal hypertension or liver cirrhosis; clinical stage II, Scheuer's stage 3 or 4 in liver biopsy or any histological stage with signs of portal hypertension or liver cirrhosis but without jaundice (total bilirubin less than 2 mg/dL); clinical stage III, any Scheuer's stage with persistent jaundice (total bilirubin 2 mg/dL or above). Data for clinical staging were provided by the patients' primary caregivers on a fixed case record form.
SNP genotyping. The study population in the present study includes 487 patients and 476 controls who were analyzed in our first round of GWAS 16. Other patients and controls were collected for the second phase of GWAS 18. After collection and cleaning of SNP genotyping data as previously described 16, the genotype data for 4103 SNPs in the HLA region (bounded by rs446198 at position chr 6:29507426 of the GRCh37 assembly and rs367408 at position chr 6:33505746) were evaluated. The subjects' first-, second- and third-degree relatives (parent-offspring, siblings, uncle/aunt-nephew/niece) were excluded from this study based on a test of identity-by-descent using SNP data collected in the GWAS.
Statistical analysis. The association of disease phenotype with HLA carrier status or SNP genotype was evaluated by the odds ratio as calculated by Woolf's formula and examined by the chi-square test with 2 × 2 contingency tables, unless otherwise indicated. HLA-DRB1-DQB1 haplotypes were empirically determined for all genotyped individuals with reference to publicly accessible HLA haplotype frequency data (HLA Laboratory, Kyoto, Japan, http://hla.or.jp/haplo/haplodl.php?lang=en). Haplotype association analysis was applied to compare the relative effect sizes of the HLA-DRB1 and -DQB1 alleles constituting significant haplotypes. To examine the interaction between selected HLA alleles, the effect of HLA carrier status was evaluated in subpopulations stratified by the carrier status of another HLA allele of interest. Statistical tests, including those mentioned above, were performed using STATA Release 12 (StataCorp, College Station, TX, USA).
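As an illustration of the carrier-frequency test described above, the following Python sketch computes the odds ratio with a Woolf (logit) 95% confidence interval and an uncorrected 1-df Pearson chi-square p value for a 2 × 2 table. It is not the STATA procedure used in the study; the function names are hypothetical, and whether a continuity correction was applied is not stated, so none is assumed. The example counts for DQB1*06:04 are inferred by subtracting the DQB1*06:04-negative subpopulation sizes given in the Results (1163 patients, 1025 controls) from the cohort totals (1200 and 1196), which is itself an assumption.

```python
import math

def woolf_odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (logit) confidence interval for a 2x2 table.

    a, b: carriers / non-carriers among patients
    c, d: carriers / non-carriers among controls
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction) and p value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Inferred DQB1*06:04 carrier counts (assumption: totals minus negatives).
a, b = 1200 - 1163, 1163   # patients: carriers, non-carriers
c, d = 1196 - 1025, 1025   # controls: carriers, non-carriers

odds_ratio, ci_low, ci_high = woolf_odds_ratio(a, b, c, d)
chi2, p_value = chi_square_2x2(a, b, c, d)
print(f"OR = {odds_ratio:.2f} ({ci_low:.2f}-{ci_high:.2f}), p = {p_value:.2e}")
```

With these inferred counts the odds ratio rounds to 0.19, matching the value reported for DQB1*06:04, and the p value is of the order of 10−22.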