Original Article

Genes and Immunity (2012) 13, 461–468; doi:10.1038/gene.2012.17; published online 10 May 2012

Classical HLA-DRB1 and DPB1 alleles account for HLA associations with primary biliary cirrhosis

P Invernizzi1,2, M Ransom3, S Raychaudhuri4,5,6, R Kosoy3, A Lleo2, R Shigeta3, A Franke7, F Bossa8, C I Amos9, P K Gregersen10, K A Siminovitch11,12, D Cusi13,14, P I W de Bakker4,6,15, M Podda2, M E Gershwin1, M F Seldin1,3 and The Italian PBC Genetics Study Group

  1. Department of Medicine, Division of Rheumatology, Allergy and Clinical Immunology, University of California at Davis, Davis, CA, USA
  2. Department of Medicine, Center for Autoimmune Liver Diseases, IRCCS Istituto Clinico Humanitas, Milan, Italy
  3. Department of Biochemistry and Molecular Medicine, University of California at Davis, Davis, CA, USA
  4. Department of Medicine, Divisions of Genetics and Rheumatology, Brigham and Women’s Hospital, Boston, MA, USA
  5. Partners HealthCare Center for Personalized Genetic Medicine, Boston, MA, USA
  6. Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  7. Institute of Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany
  8. Division of Gastroenterology, IRCCS-CSS Hospital, San Giovanni Rotondo, Italy
  9. Department of Epidemiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
  10. The Robert S Boas Center for Genomics and Human Genetics, Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, NY, USA
  11. Mount Sinai Hospital, Samuel Lunenfeld Research Institute and Toronto General Research Institute, Toronto, Ontario, Canada
  12. Departments of Immunology and Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
  13. Department of Medicine, Surgery and Dentistry, Università degli Studi di Milano, Milan, Italy
  14. Genomics and Bioinformatics Unit, Fondazione Filarete, Milan, Italy
  15. Department of Medical Genetics, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands

Correspondence: Dr MF Seldin, Department of Biochemistry and Molecular Medicine, University of California, Davis, Davis, CA 95616, USA. E-mail: mfseldin@ucdavis.edu.

16Italian PBC Genetics Study Group members are listed in the appendix.

Received 21 February 2012; Revised 27 March 2012; Accepted 4 April 2012
Advance online publication 10 May 2012



Susceptibility to primary biliary cirrhosis (PBC) is strongly associated with human leukocyte antigen (HLA)-region polymorphisms. To determine whether these associations can be explained by classical HLA determinants, we studied Italian subjects (676 cases and 1440 controls) genotyped with dense single-nucleotide polymorphisms (SNPs), from which classical HLA alleles and amino acids were imputed. Although previous genome-wide association studies and our results show stronger SNP associations near DQB1, we demonstrate that the HLA signals can be attributed to classical DRB1 and DPB1 genes. Strong support for the predominant role of DRB1 is provided by our conditional analyses. We also demonstrate an independent association of DPB1. Specific HLA-DRB1 alleles (*08, *11 and *14) account for most of the DRB1 association signal. Consistent with previous studies, DRB1*08 (P=1.59 × 10−11) was the strongest predisposing allele, whereas DRB1*11 (P=1.42 × 10−10) was protective. Additionally, DRB1*14 and DPB1*03:01 (P=9.18 × 10−7) were predisposing alleles. No signal was observed in the HLA class I or class III regions. These findings better define the association of PBC with HLA and specifically support the role of classical HLA-DRB1 and DPB1 genes and alleles in susceptibility to PBC.


genetic risk; risk allele; imputation; antigen-binding pocket; autoimmune disease



The human major histocompatibility complex, human leukocyte antigen (HLA), has been implicated in the etiopathogenesis of primary biliary cirrhosis (PBC), similar to many other autoimmune diseases. Genome-wide association studies (GWAS) of PBC, including our own, find the strongest association with single-nucleotide polymorphisms (SNPs) within the HLA region.1, 2, 3 In these studies, the peak association signal is between HLA-DQA1 and HLA-DQB1. Multiple studies of PBC also show association with particular classical HLA alleles (reviewed in Invernizzi4). These studies have variably implicated different DRB1 alleles in European populations, with most studies, including all larger cohorts, showing association of DRB1*08.5, 6 Our previous studies in an Italian cohort with PBC showed the association of DRB1*08 as a predisposing allele, and DRB1*11 and DRB1*13 as protective alleles.6 A study using a small cohort (32 German PBC cases and 47 controls) suggested that DPB1 associations may also be present in Europeans.7 However, a comprehensive study of the HLA region associations has not been performed and, as for other autoimmune diseases, it is unclear which determinants are actually causally related to pathogenesis.

To further study the HLA associations in PBC, in the current study, we used the most recent advances in imputation algorithms and sequence information resources, including the 1000 Genomes database, to accurately impute missing SNPs and, importantly, HLA classical alleles. Specifically, our investigation rests on recent developments and resources for imputing HLA classical alleles, including a reference set of European subjects.8 For our study, we used an inference set of SNP genotypes from both GWAS and a designed chip array, the Immunochip,9 which contains a set of SNPs that have been used in multiple studies of HLA.10 We performed a series of conditioning analyses that clarify which HLA genes and alleles underlie the major component of the genetic associations of PBC.



Analyses show strong association of imputed SNPs and HLA determinants

To further define PBC–HLA region associations, we analyzed association using imputed genotypes with high probabilities and information scores (see Materials and methods). These studies utilized genotypes from both GWAS and Immunochip arrays that contained large numbers of SNPs in the major histocompatibility complex region (Table 1, Supplementary Table S1, and see Materials and methods). The peak association (P=9.83 × 10−17) was observed with rs115721871 at position 32653792, distal to DQB1 (Figure 1a, Table 2 and Supplementary Table S2). Although the strongest associations were with non-coding SNPs, multiple classical genes in HLA show strong association with PBC (Table 2). For the classical HLA genes, the strongest association was with DRB1*08 (P=1.59 × 10−11). DQB1*04:02 and DQA1*04:01, which are in tight linkage disequilibrium (LD) with DRB1*08 (r2=0.84 and 0.89, respectively), showed nearly equivalent signals (P=1.38 × 10−10 and 1.90 × 10−10, respectively). Very strong association was also observed for DRB1*11 (P=1.42 × 10−10), with a weaker association for the DQB1 allele (DQB1*03:01, P=6.10 × 10−9) that is in LD (r2=0.75) with DRB1*11. Less strong associations were observed with DRB1*14 and DQB1*05:03 (P=6.89 × 10−7 and 6.21 × 10−7, respectively) and with DPB1*03:01 (P=9.18 × 10−7). DQB1*05:03 is in nearly complete LD with DRB1*14 (r2=0.97). DPB1*03:01 is not in LD with any of the DRB1, DQB1 or DQA1 classical alleles or amino acids (AAs; r2<0.01). There was no association (P>10−4) observed for classical alleles in HLA A, B, C or DPA1.
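The r2 values quoted above summarize LD between pairs of alleles. As a minimal sketch (not the procedure used in this study), r2 can be approximated as the squared Pearson correlation between per-individual allele dosages (0, 1 or 2 copies). The function name and the dosage vectors below are hypothetical and serve only as illustration.

```python
def r_squared(dosage_a, dosage_b):
    """Squared Pearson correlation between two allele-dosage vectors."""
    n = len(dosage_a)
    if n == 0 or n != len(dosage_b):
        raise ValueError("dosage vectors must be non-empty and of equal length")
    mean_a = sum(dosage_a) / n
    mean_b = sum(dosage_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosage_a, dosage_b))
    var_a = sum((a - mean_a) ** 2 for a in dosage_a)
    var_b = sum((b - mean_b) ** 2 for b in dosage_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic marker carries no LD information
    return cov * cov / (var_a * var_b)


# Hypothetical dosages for eight individuals (not study data).
allele_1 = [0, 1, 0, 2, 0, 1, 0, 0]
allele_2 = [0, 1, 0, 2, 0, 0, 0, 0]
print(r_squared(allele_1, allele_2))
```

Dosage-based r2 is a common approximation when phased haplotypes are not available; haplotype-based estimates can differ slightly.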

Figure 1.

Analysis of the HLA region association signals in PBC. In each panel, the symbols show the strength of the association signal (ordinate) for the corresponding position (Mb, HG19) on chromosome 6 (abscissa). For panel (a), the P-value before conditioning is shown. For panels (b–f), the P-values are shown after conditioning on the HLA determinant(s) indicated in the panel. The blue color-coded symbols denote the strongest associated marker with P-value <10−6, and the other markers are color-coded to indicate marker LD with the strongest associated marker: strong LD (r2>0.8; red), moderate LD (r2>0.5; orange), weak LD (r2>0.2; yellow) and little or no LD (open symbols). The SNPs with the strongest associations were rs115721871 at bp 32653792 (panels a and d), rs9277558 at bp 33056711 (panels b and c) and rs9268668 at bp 32413889 (panel f).


As expected from the analysis of classical HLA alleles, PBC also showed strong association with specific AAs in these genes. Most of the HLA AAs showing association signals corresponded to the key residues that distinguish the specific classical alleles, which for DRB1 included leucine (L) at AA74 (DRB1*08), glutamate (E) at AA58 (DRB1*11), alanine (A) at AA57 (DRB1*14) and histidine (H) at AA60 (DRB1*14). Similar results were observed for specific DQB1 and DQA1 AAs that are in strong LD with specific DRB1 alleles and AAs (Supplementary Table S2).

Conditioning studies using classical HLA genes

To examine whether these associations could be explained by known coding differences in genes, we next performed a series of conditional analyses. These studies were done by conditioning on a combination of the alleles from an HLA gene (for example, DRB1) to control for the association that might be attributable to each gene, although some of the effect may not be directly attributable to that gene because of extensive LD across this region. The residual signals after controlling for the effect of various combinations of classical alleles and AA residues in these HLA genes show that both DRB1 and DQB1 could account for most of the association signal (Figure 1b–e, Table 2 and Supplementary Table S3). In addition, the signal in the DPB1 region was only marginally decreased by conditioning on DRB1, DQB1 or DQA1. Conditioning on DPB1 eliminated the signal in the DPB1 region and showed a modest increase in the signal across DRB1, DQA1 and DQB1.
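The logic of a conditional analysis can be sketched as follows: a SNP is tested once on its own and once with a classical allele added as a covariate, and the SNP's beta estimate is compared between the two models. The sketch below uses ordinary logistic regression on hard genotype calls, fitted by Newton-Raphson in pure Python; it is not the SNPTEST score test on genotype probabilities used in this study, and the counts are hypothetical, constructed so that the SNP signal is entirely explained by the allele.

```python
from math import exp


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_logistic(rows, y, n_iter=25):
    """Return [intercept, beta_1, ...] from a Newton-Raphson logistic fit."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            mu = 1.0 / (1.0 + exp(-sum(b * v for b, v in zip(beta, xi))))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


# Hypothetical counts: (snp carrier, allele carrier, cases, controls).
counts = [(0, 0, 20, 80), (1, 0, 5, 20), (0, 1, 10, 10), (1, 1, 40, 40)]
snp, allele, status = [], [], []
for s, a, n_case, n_ctrl in counts:
    snp += [s] * (n_case + n_ctrl)
    allele += [a] * (n_case + n_ctrl)
    status += [1] * n_case + [0] * n_ctrl

marginal = fit_logistic([[s] for s in snp], status)[1]         # SNP alone
conditional = fit_logistic(list(zip(snp, allele)), status)[1]  # SNP given allele
print(marginal, conditional)
```

In this constructed example the SNP has a clear marginal effect that disappears once the allele is in the model, which is the pattern interpreted in the text as the allele "accounting for" the SNP signal.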

To further assess these conditional analyses, we also examined the relative difference of the conditioning by different HLA genes by examining beta estimates and their differences. The beta estimate is the measure of the increase in log-odds that can be attributed to each copy of a given minor allele. The largest effect is from the composite of DRB1 alleles as shown by the residual beta estimates (and odds ratios (ORs)) after conditioning and the mean change in the beta estimates (Table 3). This is most evident examining the SNPs with the strongest signals from association (original signal<5 × 10−8). For example, the DRB1 conditioning had a much larger mean change in beta estimate (−0.424) compared with DQB1 (−0.236) (P-value <10−10, paired t-test). Additional conditional analyses using combinations with DQA1 demonstrated that DQA1 could not substitute for DRB1 or DPB1 in any of the combinations tested (data not shown).
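Because the beta estimate is a per-copy increase in log-odds, it converts to an OR by exponentiation. The helper names below are hypothetical, the before/after beta values are invented for illustration, and the "change" is assumed here to be the estimate after conditioning minus the estimate before. The OR of 0.69 is the value quoted for DRB1*13 later in this article.

```python
from math import exp, log


def odds_ratio_from_beta(beta, copies=1):
    """OR for carrying `copies` copies of an allele under an additive log-odds model."""
    return exp(beta * copies)


def beta_from_odds_ratio(odds_ratio):
    """Per-copy log-odds corresponding to a per-copy OR."""
    return log(odds_ratio)


def mean_beta_change(before, after):
    """Mean of per-SNP changes (after minus before) in beta estimates."""
    if not before or len(before) != len(after):
        raise ValueError("need two non-empty lists of equal length")
    return sum(a - b for b, a in zip(before, after)) / len(before)


beta_drb1_13 = beta_from_odds_ratio(0.69)          # OR quoted for DRB1*13
two_copy_or = odds_ratio_from_beta(beta_drb1_13, copies=2)

# Hypothetical beta estimates for three SNPs before and after conditioning.
before = [0.9, 0.8, 0.7]
after = [0.4, 0.5, 0.3]
print(beta_drb1_13, two_copy_or, mean_beta_change(before, after))
```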

Conditioning on specific HLA alleles

We next examined the effect of conditioning on specific DRB1, DQA1, DQB1 and DPB1 classical alleles and AAs. A clear pattern emerged showing that the association of groups of SNPs was specifically controlled by different alleles. These results are highlighted in Table 4, and in a more complete version in Supplementary Table S3. Individually, the specific SNPs conditioned a part of the association signal largely corresponding to those SNPs in moderate or strong LD (r2>0.5) with the particular classical specificity or AA. Similar effects were observed for specific alleles in one gene in strong LD with another gene. This is particularly evident for DRB1*14 and DQB1*05:03, for which the effects of controlling for either allele were virtually indistinguishable (Supplementary Table S3). Other pairs (for example, DRB1*08 and DQB1*04:02) showed small but consistent differences, in which the DRB1 allele diminished more signal than the DQB1 allele when used in the conditioning analyses (Supplementary Table S3). Notably, the strongest AA association was at position 74, which is in the antigen-binding pocket of DRB1.11, 12

Similar to when we conditioned on genes (‘all’ alleles at a particular gene), the DPB1 region SNP signal was substantially reduced only when DPB1*03:01 (Table 4 and Supplementary Table S3) or specific DPB1 AAs were used in conditioning. The strongest effects were observed for the lysine at AA position 11 and the methionine at AA position 76, which are both among the 16 AAs in the putative antigen-binding pocket of this gene.11

Thus, the vast majority of the HLA region association signal can be accounted for individually by conditioning on one of four specific alleles, three in DRB1 (*08, *11 and *14) and one in DPB1 (*03:01). Combinations of these specific alleles accounted for most of the remaining signal (Supplementary Table S3 and Figure 1f), which is also reflected in the strong reduction in beta estimates (Table 3). However, there are signals from several SNPs in the DRB1–DQB1 region that are not accounted for by these conditioning studies. None of the SNPs with signals P<10−5 after conditioning on DRB1*08, *11 and *14, and DPB1*03:01 was among the more strongly associated SNPs before conditioning (all had original association P-values >10−6). In particular, the strongest associated SNP after conditioning (rs9268668, P=1.67 × 10−7) showed no signal before conditioning (P=0.40). Whether these residual or new signals are also due to other specific classical HLA genes is not clear; however, conditioning on ‘all’ DRB1 and DPB1 alleles ablated all signals, with resulting P-values >10−5 (Figure 1f), suggesting that additional sequence differences (for example, putative regulatory SNPs) do not have to be postulated.

Most of the signal observed for specific AAs was also specifically eliminated when conditioning on the DRB1 or DPB1 classical alleles. However, there were several exceptions in which the association signal was not readily decreased by controlling for single classical HLA alleles. These AAs included DRB1-AA47F, DRB1-AA74A, DQB1-AA26G and DQB1-AA74S. For these AAs, the signal was ablated when conditioning on two DRB1 alleles (DRB1*08 and DRB1*11) (Supplementary Table S3). Conversely, conditioning on these AAs could not account for the association of most of the other SNPs that were not ablated by single classical HLA alleles (Supplementary Table S3). Therefore, it may be less likely that these particular AAs are critical to explaining the association patterns we observed. However, we cannot exclude a specific functional role for these AAs, and it is notable that DRB1-AA47 and DRB1-AA74 are both in the antigen-binding pocket of DRB1,11, 12 and that conditioning on DRB1-AA47F did ablate several of the association signals that were not controlled by individual DRB1 classical alleles.

Genotypic associations

We also examined genotypic associations, including combinations of susceptibility alleles, and combinations of susceptibility and protective alleles. Examining individuals with combinations of risk alleles (DRB1*08 combined with DPB1*03:01, or DRB1*14 combined with DPB1*03:01), we found higher ORs for disease association than when examining only individuals with single susceptibility alleles (Supplementary Table S4). There were insufficient numbers of DRB1*08/DRB1*14 individuals (frequency <1%) to evaluate these heterozygous genotypes. The increased risk from combining DPB1*03:01 with DRB1*08 or DRB1*14 was observed whether DPB1*03:01 was on the same or a different haplotype as the DRB1 risk allele (Supplementary Table S4). When risk alleles were combined with the DRB1*11 protective allele, the ORs were near 1 and there was no significant association with disease.

Finally, we also examined the cumulative combination of predisposing and protective alleles (Table 5 and Supplementary Table S4). The count of predisposing alleles minus protective alleles showed a strong correspondence with the OR for PBC. Individuals with an excess of one, or of two or more, risk alleles showed ORs of 3.05 and 5.25, respectively, and conversely, individuals with an excess of one, or of two or more, protective alleles had ORs of 0.5 and 0.38, respectively (Table 5). All results were compatible with an additive model of action between each of the alleles, similar to our previous observations.6
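The net allele count described above can be sketched in a few lines. The allele sets follow the four alleles highlighted in this article, but treating DRB1*11 as the only protective allele is an assumption made here for illustration, as are the function names and the 2 × 2 counts; none of these values come from Table 5.

```python
PREDISPOSING = {"DRB1*08", "DRB1*14", "DPB1*03:01"}
PROTECTIVE = {"DRB1*11"}  # assumption: only DRB1*11 is counted as protective


def net_risk_count(alleles):
    """Predisposing minus protective allele copies carried by one individual."""
    risk = sum(1 for a in alleles if a in PREDISPOSING)
    protective = sum(1 for a in alleles if a in PROTECTIVE)
    return risk - protective


def odds_ratio(cases_exposed, controls_exposed, cases_ref, controls_ref):
    """Crude OR of an exposed group against a reference group."""
    return (cases_exposed * controls_ref) / (controls_exposed * cases_ref)


# One hypothetical individual: two DRB1 and two DPB1 alleles.
person = ["DRB1*08", "DRB1*11", "DPB1*03:01", "DPB1*04:01"]
print(net_risk_count(person))

# Hypothetical counts for one score group against the score-zero group.
print(odds_ratio(cases_exposed=60, controls_exposed=40, cases_ref=100, controls_ref=200))
```

A crude OR from a 2 × 2 table ignores the covariates (PCs and gender) used in the association tests, so it is only a first approximation of the adjusted values.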



The current study of PBC association with HLA differs from previous investigations by providing the most comprehensive analysis of the entire HLA region while correcting for multiple confounding factors. Our results are consistent with a predominant role for class II genes and, we believe, exclude any substantial effect from either HLA class I or class III genes (there were no residual signals for these genes with P<0.0005 after accounting for class II genes). This contrasts with other autoimmune diseases in which HLA class I or class III has a predominant role (for example, myasthenia gravis13) or strong class I gene effects are observed independent of class II associations (for example, type 1 diabetes14 and multiple sclerosis15, 16).

Our study strongly suggests that the major gene in HLA that underlies susceptibility to PBC is DRB1. Overall, DRB1 alleles show the strongest associations and conditioning studies show that DRB1 could account for almost all (except the DPB1 region) of the association signal. HLA-DQB1 shows association that is only marginally less than that observed for DRB1. However, several points suggest that these associations are secondary to the strong LD between DRB1 and DQB1: (1) the overall strength of association of particular DRB1 alleles is stronger than the corresponding DQB1 allele; (2) conditioning on DRB1 could account for all DQB1 associations; and (3) residual beta-estimates after conditioning showed a substantially stronger DRB1 than DQB1 effect.

In addition, our study provides strong evidence for an independent effect of DPB1. Although previous studies have, as indicated in the introduction, suggested DPB1 associations in PBC, these were based on small subject sets and were difficult to evaluate. Our study demonstrates that the association of DPB1 cannot be accounted for by controlling for other HLA region genes. These results are also consistent with findings in some, but not other, autoimmune diseases in which an independent effect of DPB1 has been reported. These include juvenile idiopathic arthritis,17 type 1 diabetes,18 multiple sclerosis16, 19 and particular autoantibodies in systemic lupus erythematosus.20

We note that our study does not directly address whether DRB3, DRB4, DRB5 or structural variations might have additional independent associations. At present, such studies are challenging because reference sets for imputation are lacking and/or these polymorphisms are difficult to assess; for example, the exclusion of SNPs with missing genotypes (call rates <0.95) or the limited inclusion of SNPs within these genes on available arrays may have precluded their analysis.

This study has also addressed the association of specific HLA-gene alleles. Most of the HLA association with PBC can be attributed to specific associations with DRB1*08, DRB1*11, DRB1*14 and DPB1*03:01. DRB1*08 has the strongest association, followed by DRB1*11, consistent with several previous studies.6, 7 A PBC association with DRB1*14 has not been previously demonstrated; however, this weaker effect is supported by our conditioning studies, which show that this classical allele can control for a set of associated SNPs and AAs that are not strongly influenced by other classical alleles (Table 3 and Supplementary Table S3). The DPB1*03:01 association is consistent with a previous study of a small German cohort.7

In a previous study, we observed that DRB1*13 was a protective allele.6 In the current study, the association of DRB1*13 was weak (P=4.9 × 10−3, OR=0.69, 95% confidence limits=0.53–0.89) compared with the previous study (P=3.6 × 10−6). This may be because of several factors: (1) the previous study did not explicitly control for population substructure; (2) the overlap of subjects with the previous study is <25% and the difference may reflect statistical noise; and (3) the previous study used DNA typing rather than the imputation used in the current study. It may be worth noting that DRB1*13, like DRB1*11, has an alanine at AA position 74 and thus contributes to the protective effect observed for this AA (P=1.33 × 10−11, Supplementary Table S3). Similarly, our previous study6 showed only a marginal association of DRB1*14 (uncorrected P-value=0.004) compared with the strong association (P=6.9 × 10−7) observed in our current study. Here, the conditioning study results, including the effect of controlling SNPs with very strong associations (see group 3, Table 4), provide additional support for the role of DRB1*14.

Finally, we have also considered specific HLA AAs. Most of the associated AAs are both nearly unique to the specific HLA classical alleles discussed above and also correspond to critical residues for the antigen-binding pocket. Thus, associated AAs in DRB1 at AA positions 37, 47, 57, 60, 67, 70 and 74, and in DPB1 at AA positions 9, 11, 76, 84 and 87, are antigen-pocket AAs.11, 12 Consistent with our results, strong associations have been recently observed with serine at position 57 and leucine at position 74 in a Japanese PBC cohort.21 We also note that many of the associated DQB1 AAs are also critical residues for antigen binding (DQB1-AA13, 26, 70, 71, 74).11, 12 Of the associated DRB1 AAs, only AA58 is not in this functional class, whereas for DQB1, several are not in this functional class (DQB1-AA45, 56, 75, 167, 185).

In conclusion, the most parsimonious explanation consistent with the current study is that classical HLA genes and the coding variations within these genes are responsible for the HLA associations with PBC. Although we cannot exclude the possibility that other sequence variations affecting, for example, gene regulation could be important, our data indicate that a limited set of classical DRB1 and DPB1 alleles is sufficient to explain the HLA associations with this disease. We believe the current data provide cogent information for understanding HLA associations in PBC. Studies in other ethnic groups, both within Europe and in other continental groups, will also be important in further defining the role of particular HLA genes and alleles. Lastly, our results provide additional rationale for functional studies examining specific HLA genes and their relative binding to the putative disease-associated epitopes of PDC-E2, the immunodominant autoantigen of PBC.22, 23


Materials and methods

Study population and design

The Italian PBC cases were obtained through a multi-center study and met internationally accepted criteria for the diagnosis of PBC as detailed in a previous study.6 Each of the included cases also met ancestry criteria as defined below (see Ancestry). Controls were derived from several sources and this sample set information is detailed in Supplementary Table S1. After data filtering and ancestry selection, the analyses included 676 Italian PBC cases and 1440 Italian controls. All subjects enrolled in the study provided written informed consent and the study followed ethical guidelines of the most recent revision of the Declaration of Helsinki (Edinburgh, 2000).

All samples were genotyped with either Illumina (San Diego, CA, USA) genome-wide and/or Immunochip SNP platforms, and the participants included the data set from our previous GWAS as well as new samples (see Supplementary Table S1). With the exception of ancestry information and assessment of relatedness, the current study was restricted to genotypes in an approximately 4Mb segment of human chromosome 6 (bps 28911802–33813043, HG19 map). This data set comprised a minimum of 1548 and a maximum of 5489 genotyped SNPs in each individual (Supplementary Table S1) and was used for the SNP and HLA imputations (see Imputation).

Data filtering

We used stringent quality-control criteria to ensure that high-quality data were included in the analyses. We excluded individuals who had >5% missing data, and all individuals with cryptic relatedness and duplicate samples based on identity-by-descent status for genome-wide SNPs (PI_HAT >0.15) using PLINK.24

We included only SNPs with <5% missing data and Hardy–Weinberg equilibrium P-values >10−4 in controls and >10−5 in combined cases and controls (to exclude most genotyping errors), applying these procedures in a stepwise approach separately for each data set. For each of the separately derived control genotyping sets (Supplementary Table S1), SNPs were excluded if they failed the above criteria within the individual control set, in combination with any of the other control groups, or in the complete data set. The Hardy–Weinberg criteria were applied after exclusion of non-European individuals (see Ancestry). Finally, SNPs were excluded if allele frequency differed by >10% between different control subject groups.
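The SNP-level filters above can be sketched as a simple predicate. The Hardy–Weinberg test below is a 1-d.f. chi-square approximation written in pure Python, used here only for illustration; PLINK provides its own Hardy–Weinberg test, and the exact procedure used in this study may differ. Function names and genotype counts are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 d.f.) P-value for deviation from Hardy-Weinberg proportions.

    Expects at least one genotyped individual.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return erfc(sqrt(chi2 / 2.0))  # survival function of chi-square with 1 d.f.


def snp_passes_qc(missing_rate, hwe_p_controls, hwe_p_combined):
    """Apply the thresholds stated in the text to one SNP."""
    return missing_rate < 0.05 and hwe_p_controls > 1e-4 and hwe_p_combined > 1e-5


# Hypothetical genotype counts (AA, AB, BB).
print(hwe_chi2_pvalue(25, 50, 25))   # counts in exact HWE proportions
print(hwe_chi2_pvalue(50, 0, 50))    # no heterozygotes at all
```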


Ancestry

European ancestry was determined using 883 genome-wide SNPs with minimal or no LD (r2<0.1). SNPs were analyzed using the STRUCTURE v2.1 program25 together with subjects of known European, Amerindian, East Asian and West African origin, as previously described.26 We used STRUCTURE to exclude non-European and admixed study participants, as this method allows exclusion/inclusion criteria to be set using reference populations. Subjects with >15% non-European ancestry were excluded from further analysis.

Italian ancestry was defined using principal component (PC) analyses. For subjects with GWAS data, we used the same methods and criteria applied in a previous study with largely the same data set. Briefly, PC analysis was performed using the EIGENSOFT statistical package,27 utilizing 34000 SNPs distributed throughout the genome (r2<0.1) that we have previously used to define population genetic substructure.2 These analyses used an independent set of Italian subjects for establishing membership (±2 s.d. in the first four PCs). In the current study, a substantial portion of the samples did not have GWAS data (Supplementary Table S1). For these samples, we used a set of 12579 SNPs from the Immunochip, which our empirical analyses showed could discern Italian ancestry and exclude other European ethnicities, including Sardinian Italians (Supplementary Figure S1). Using the GWAS-defined individuals as the reference, samples genotyped only on the Immunochip were included if they fell within 2 s.d. in the first four PCs. In addition to the subject selection, we used the eigenvalues from the first four PCs (only the first four PCs were significant based on Tracy–Widom statistics) as covariates in our association analyses.
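The ±2 s.d. membership rule can be sketched as follows. This is an illustrative reading of the criterion, assuming bounds are computed per PC from a reference panel using the sample standard deviation; the function names and PC scores are hypothetical, and the PC scores themselves would come from a tool such as EIGENSOFT.

```python
from statistics import mean, stdev


def pc_bounds(reference_pcs, n_sd=2.0):
    """Per-PC (low, high) bounds: reference mean plus or minus n_sd sample s.d."""
    bounds = []
    for column in zip(*reference_pcs):
        m, s = mean(column), stdev(column)
        bounds.append((m - n_sd * s, m + n_sd * s))
    return bounds


def within_bounds(sample_pcs, bounds):
    """True if every PC score of the sample lies inside its bounds."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(sample_pcs, bounds))


# Hypothetical scores on the first four PCs for a small reference panel.
reference = [
    (0.0, 0.0, 0.0, 0.0),
    (1.0, 1.0, 1.0, 1.0),
    (-1.0, -1.0, -1.0, -1.0),
    (0.5, -0.5, 0.5, -0.5),
    (-0.5, 0.5, -0.5, 0.5),
]
bounds = pc_bounds(reference)
print(within_bounds((0.1, 0.2, -0.3, 0.0), bounds))
print(within_bounds((3.0, 0.0, 0.0, 0.0), bounds))
```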


Imputation

We imputed SNPs, HLA classical alleles and HLA gene AAs using phased reference genotypes from both the 1000 Genomes sequencing project (interim release June 2011) (http://www.1000genomes.org/) and an HLA-defined reference set.8 For the 1000 Genomes imputation, we used IMPUTE version 2 (ref. 28) under default parameters. The reference haplotypes for this imputation were from 1094 subjects, including 381 European subjects and 98 Tuscan Italians. The number of genotyped (inference) SNPs that overlapped with the 1000 Genomes reference set ranged from 1435 SNPs (samples typed by GWAS) and 4386 SNPs (samples typed by Immunochip) to 4981 SNPs (samples typed by GWAS plus Immunochip) (Supplementary Table S1). For subsequent data analyses, we utilized only imputed genotypes with maximum posterior probability scores of >0.90. Using this parameter, our empirical testing (leave-one-out analyses) indicated that the maximum error rate for genotype assignment was <0.05, and the mean error rate was <0.01. To impute classical HLA alleles and corresponding AA determinants, we utilized a separate reference data set collected by the Type 1 Diabetes Genetics Consortium. This reference data set contains genotype data for 2537 SNPs, selected to tag the entire major histocompatibility complex, and classical types for HLA-A, B, C, DRB1, DQA1, DQB1, DPA1 and DPB1 at 4-digit resolution in 2767 unrelated individuals of European descent.29 The Beagle software package30 was used for this imputation under default parameters. The number of inference SNPs that overlapped with this reference data set ranged from 648 SNPs (samples typed by GWAS) and 1444 SNPs (samples typed by Immunochip) to 1610 SNPs (GWAS plus Immunochip) (Supplementary Table S1). Similar to the imputation using 1000 Genomes data, only SNPs with posterior probabilities of >0.90 were included in our final analyses.

For imputed SNPs that overlapped between the two imputation sets and algorithms used (IMPUTE version 2 and Beagle), there was nearly complete concordance of the association-testing results, indicating similar performance of these algorithms for this data set. After imputation and selection of only those markers meeting the posterior probability criterion, this region contained a total of 49885 markers, including the genotyped SNPs, that were included in association test analyses.
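The posterior probability criterion can be illustrated with a small helper that keeps a best-guess genotype only when its posterior probability exceeds 0.90. The function name and the probability triples are hypothetical; imputation software reports such probabilities per individual and marker in its own output formats.

```python
def call_genotype(posteriors, threshold=0.90):
    """Return the most likely genotype (index = allele copies) or None if uncertain.

    `posteriors` holds the probabilities of carrying 0, 1 and 2 copies.
    """
    best = max(range(len(posteriors)), key=lambda g: posteriors[g])
    return best if posteriors[best] > threshold else None


# Hypothetical posterior probabilities for three individuals at one marker.
print(call_genotype((0.02, 0.95, 0.03)))   # confident heterozygote
print(call_genotype((0.50, 0.45, 0.05)))   # too uncertain, treated as missing
print(call_genotype((0.90, 0.10, 0.00)))   # exactly 0.90 does not exceed the cut-off
```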

Association and conditional association tests of imputed SNPs and HLA determinants

SNPTEST V2.0 (ref. 28) was used for the primary association analyses for the imputed genotypes. This software uses the genotype probabilities for the imputed SNPs or determinants and accounts for genotype uncertainty. The first four PC eigenvalue scores were used as continuous variables in the association test together with the gender covariate. Analyses were performed using the SNPTEST v2 Score test algorithm, which enabled both the inclusion of the covariates and conditioning tests, and all of our reported results used an additive model. To minimize potential spurious results, we limited our main and conditioning analyses to markers with information scores (Inf) >0.85. This parameter is a measure of the observed statistical information for the estimate of SNP allele frequency (for additional information see https://mathgen.stats.ox.ac.uk/genetics_software/

Conditioning on multiple markers either separately or together was performed using an additive model. For the HLA region, over 150 conditional analyses were performed using the SNPs and HLA determinants, including all HLA determinants with P-values <10−6.

Nominal P-values after correction for covariates and conditioning are provided throughout the manuscript. The P-values <10−6 would remain significant after conservative (Bonferroni) correction for the number of markers (<50000) tested after imputation.
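The Bonferroni statement above follows from simple arithmetic, sketched below. The family-wise error rate of 0.05 is an assumption made here (the text does not state it); the marker counts are those given in this section, and the P-value is the one reported for DPB1*03:01.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a family-wise error rate `alpha`."""
    return alpha / n_tests


ALPHA = 0.05  # assumed family-wise error rate, not stated in the text

upper_bound = bonferroni_threshold(ALPHA, 50_000)   # fewer than 50000 markers tested
actual = bonferroni_threshold(ALPHA, 49_885)        # markers retained after imputation

p_dpb1_0301 = 9.18e-7  # P-value reported for DPB1*03:01
print(upper_bound, actual, p_dpb1_0301 < upper_bound)
```

Because the number of markers tested is below 50000, the exact threshold is slightly more permissive than 10−6, so any P-value <10−6 passes both.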


Conflict of interest

The authors declare no conflict of interest.



References

  1. Hirschfield GM, Liu X, Xu C, Lu Y, Xie G, Lu Y et al. Primary biliary cirrhosis associated with HLA, IL12A, and IL12RB2 variants. N Engl J Med 2009; 360: 2544–2555.
  2. Liu X, Invernizzi P, Lu Y, Kosoy R, Lu Y, Bianchi I et al. Genome-wide meta-analyses identify three loci associated with primary biliary cirrhosis. Nat Genet 2010; 42: 658–660.
  3. Mells GF, Floyd JA, Morley KI, Cordell HJ, Franklin CS, Shin SY et al. Genome-wide association study identifies 12 new susceptibility loci for primary biliary cirrhosis. Nat Genet 2011; 43: 329–332.
  4. Invernizzi P. Human leukocyte antigen in primary biliary cirrhosis: an old story now reviving. Hepatology 2011; 54: 714–723.
  5. Donaldson PT, Baragiotta A, Heneghan MA, Floreani A, Venturi C, Underhill JA et al. HLA class II alleles, genotypes, haplotypes, and amino acids in primary biliary cirrhosis: a large-scale study. Hepatology 2006; 44: 667–674.
  6. Invernizzi P, Selmi C, Poli F, Frison S, Floreani A, Alvaro D et al. Human leukocyte antigen polymorphisms in Italian primary biliary cirrhosis: a multicenter study of 664 patients and 1992 healthy controls. Hepatology 2008; 48: 1906–1912.
  7. Mella JG, Roschmann E, Maier KP, Volk BA. Association of primary biliary cirrhosis with the allele HLA-DPB1*0301 in a German population. Hepatology 1995; 21: 398–402.
  8. Pereyra F, Jia X, McLaren PJ, Telenti A, de Bakker PI, Walker BD et al. The major genetic determinants of HIV-1 control affect HLA class I peptide presentation. Science 2010; 330: 1551–1557.
  9. Cortes A, Brown MA. Promise and pitfalls of the Immunochip. Arthritis Res Ther 2011; 13: 101.
  10. de Bakker PI, McVean G, Sabeti PC, Miretti MM, Green T, Marchini J et al. A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC. Nat Genet 2006; 38: 1166–1172.
  11. Salamon H, Klitz W, Easteal S, Gao X, Erlich HA, Fernandez-Vina M et al. Evolution of HLA class II molecules: Allelic and amino acid site variability across populations. Genetics 1999; 152: 393–400.
  12. Karp DR, Marthandan N, Marsh SG, Ahn C, Arnett FC, Deluca DS et al. Novel sequence feature variant type analysis of the HLA genetic association in systemic sclerosis. Hum Mol Genet 2010; 19: 707–719.
  13. Vandiedonck C, Beaurain G, Giraud M, Hue-Beauvais C, Eymard B, Tranchant C et al. Pleiotropic effects of the 8.1 HLA haplotype in patients with autoimmune myasthenia gravis and thymus hyperplasia. Proc Natl Acad Sci USA 2004; 101: 15464–15469.
  14. Noble JA, Valdes AM, Varney MD, Carlson JA, Moonsamy P, Fear AL et al. HLA class I and genetic susceptibility to type 1 diabetes: results from the Type 1 Diabetes Genetics Consortium. Diabetes 2010; 59: 2972–2979.
  15. Wu XM, Wang C, Zhang KN, Lin AY, Kira J, Hu GZ et al. Association of susceptibility to multiple sclerosis in Southern Han Chinese with HLA-DRB1, -DPB1 alleles and DRB1-DPB1 haplotypes: distinct from other populations. Mult Scler 2009; 15: 1422–1430.
  16. Field J, Browning SR, Johnson LJ, Danoy P, Varney MD, Tait BD et al. A polymorphism in the HLA-DPB1 gene is associated with susceptibility to multiple sclerosis. PLoS One 2010; 5: e13454.
  17. Runstadler JA, Saila H, Savolainen A, Leirisalo-Repo M, Aho K, Tuomilehto-Wolf E et al. HLA-DRB1, TAP2/TAP1, and HLA-DPB1 haplotypes in Finnish juvenile idiopathic arthritis: more complexity within the MHC. Genes Immun 2004; 5: 562–571.
  18. Varney MD, Valdes AM, Carlson JA, Noble JA, Tait BD, Bonella P et al. HLA DPA1, DPB1 alleles and haplotypes contribute to the risk associated with type 1 diabetes: analysis of the type 1 diabetes genetics consortium families. Diabetes 2010; 59: 2055–2062.
  19. Bergamaschi L, Leone MA, Fasano ME, Guerini FR, Ferrante D, Bolognesi E et al. HLA-class I markers and multiple sclerosis susceptibility in the Italian population. Genes Immun 2010; 11: 173–180.
  20. Sebastiani GD, Galeazzi M, Tincani A, Scorza R, Mathieu A, Passiu G et al. HLA-DPB1 alleles association of anticardiolipin and anti-beta2GPI antibodies in a large series of European patients with systemic lupus erythematosus. Lupus 2003; 12: 560–563.
  21. Umemura T, Joshita S, Ichijo T, Yoshizawa K, Katsuyama Y, Tanaka E et al. Human leukocyte antigen class II molecules confer both susceptibility and progression in Japanese patients with primary biliary cirrhosis. Hepatology 2012; 55: 506–511.
  22. Gershwin ME, Mackay IR, Sturgess A, Coppel RL. Identification and specificity of a cDNA encoding the 70 kd mitochondrial antigen recognized in primary biliary cirrhosis. J Immunol 1987; 138: 3525–3531.
  23. Lleo A, Bowlus CL, Yang GX, Invernizzi P, Podda M, Van de Water J et al. Biliary apotopes and anti-mitochondrial antibodies activate innate immune responses in primary biliary cirrhosis. Hepatology 2010; 52: 987–998.
  24. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007; 81: 559–575.
  25. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 2003; 164: 1567–1587.
  26. Kosoy R, Nassir R, Tian C, White PA, Butler LM, Silva G et al. Ancestry informative marker sets for determining continental origin and admixture proportions in common populations in America. Hum Mutat 2009; 30: 69–78.
  27. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 2006; 38: 904–909.
  28. Marchini J, Howie B. Genotype imputation for genome-wide association studies. Nat Rev Genet 2010; 11: 499–511.
  29. Brown WM, Pierce J, Hilner JE, Perdue LH, Lohman K, Li L et al. Overview of the MHC fine mapping data. Diabetes Obes Metab 2009; 11: 2–7.
  30. Browning BL, Browning SR. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am J Hum Genet 2009; 84: 210–223.



Appendix

The Italian PBC Genetic Study Group

Piero L. Almasio (Gastroenterology and Hepatology Unit, Di.Bi.M.I.S., University of Palermo, Palermo), Domenico Alvaro (Department of Medico-Surgical Sciences and Biotechnologies, Fondazione Eleonora Lorillard Spencer Cenci, University Sapienza of Rome, Rome), Pietro Andreone (Dipartimento di Medicina Clinica, Università di Bologna, Bologna), Angelo Andriulli (IRCCS Casa Sollievo della Sofferenza Hospital, San Giovanni Rotondo), Cristina Barlassina (Department of Medicine, Surgery and Dentistry, Università degli Studi di Milano, Milan), Antonio Benedetti (Università Politecnica delle Marche, Ancona), Francesca Bernuzzi (Center for Autoimmune Liver Diseases, IRCCS Istituto Clinico Humanitas, Rozzano), Ilaria Bianchi (Center for Autoimmune Liver Diseases, IRCCS Istituto Clinico Humanitas, Rozzano), MariaConsiglia Bragazzi (Department of Medico-Surgical Sciences and Biotechnologies, Fondazione Eleonora Lorillard Spencer Cenci, University Sapienza of Rome, Rome), Maurizia Brunetto (Azienda Ospedaliera Universitaria Pisana, Pisa), Savino Bruno (Department of Internal Medicine, Ospedale Fatebene Fratelli e Oftalmico, Milan), Lisa Caliari (Center for Autoimmune Liver Diseases, IRCCS Istituto Clinico Humanitas, Rozzano), Giovanni Casella (Medical Department, Desio Hospital, Desio), Fabiola Civardi (Center for Autoimmune Liver Diseases, IRCCS Istituto Clinico Humanitas, Rozzano), Barbara Coco (Azienda Ospedaliera Universitaria Pisana, Pisa), Agostino Colli (Department of Internal Medicine, AO Provincia di Lecco, Lecco), Massimo Colombo (Fondazione IRCCS Ca’ Granda, Ospedale Maggiore Policlinico, Milan), Silvia Colombo (Treviglio Hospital, Treviglio), Carmela Cursaro (Dipartimento di Medicina Clinica, Università di Bologna, Bologna), Lory Saveria Croce (University of Trieste and Fondazione Italiana Fegato (FIF), Trieste), Andrea Crosignani (San Paolo Hospital Medical School, Università di Milano, Milan), Francesca Donato (Fondazione IRCCS Ca’ Granda, Ospedale Maggiore Policlinico, Milan), Luca Fabris (University of Padova, Padova), Carlo Ferrari (Azienda Ospedaliero-Universitaria di Parma, Parma), Annarosa Floreani (Dept. of Surgical, Oncological and Gastroenterological Sciences, University of Padova, Padova), Andrea Galli (University of Florence, Florence), Ignazio Grattagliano (Italian College of General Practitioners, ASL Bari), Roberta Lazzari (Department of Surgical, Oncological and Gastroenterological Sciences, University of Padova, Padova), Fabio Macaluso (Gastroenterology & Hepatology Unit, Di.Bi.M.I.S., University of Palermo, Palermo), Fabio Marra (University of Florence, Florence), Marco Marzioni (Università Politecnica delle Marche, Ancona), Alberto Mattalia (Santa Croce Carle Hospital, Cuneo), Renzo Montanari (Ospedale di Negrar, Verona), Lorenzo Morini (Magenta Hospital, Magenta), Filomena Morisco (University of Naples, Federico II, Naples), Luca Moroni (Center for Autoimmune Liver Diseases, IRCCS Istituto Clinico Humanitas, Rozzano), Luigi Muratori (Department of Clinical Medicine, University of Bologna, Bologna), Paolo Muratori (Department of Clinical Medicine, University of Bologna, Bologna), Grazia Niro (IRCCS Casa Sollievo della Sofferenza Hospital, San Giovanni Rotondo), Antonio Picciotto (University of Genoa, Genoa), Piero Portincasa (Department of Interdisciplinary Medicine, University Medical School, Bari), Daniele Prati (Ospedale Alessandro Manzoni, Lecco, Fondazione IRCCS Ca’ Granda, Ospedale Maggiore Policlinico, Milan), Cleofe Prisco (Ospedale Niguarda, Milan), Floriano Rosina (Division of Gastroenterology & Hepatology, Center for Predictive Medicine, Gradenigo Hospital, Turin), Sonia Rossi (Department of Internal Medicine, Ospedale Fatebene Fratelli e Oftalmico, Milan), Carlo Selmi (IRCCS Istituto Clinico Humanitas, Rozzano), Giancarlo Spinzi (Azienda Ospedaliera Valduce, Como), Mario Strazzabosco (Yale University, New Haven, Connecticut 06511, USA and University of Milan-Bicocca, Monza), Sonia Tarallo (Division of Gastroenterology and Hepatology, Center for Predictive Medicine, Gradenigo Hospital, Turin), Claudio Tiribelli (University of Trieste and Fondazione Italiana Fegato (FIF), Trieste), Pierluigi Toniutto (University of Udine, Udine), Maria Vinci (Ospedale Niguarda, Milan), Massimo Zuin (San Paolo Hospital Medical School, Università di Milano, Milan).



This study was supported by NIH grants R01 DK056839, R01 DK091823 and K08 AR055688, and by HYPERGENES (European Network for Genetic-Epidemiological Studies, HEALTH-F4-2007-201550).

Supplementary Information accompanies the paper on Genes and Immunity website