Contribution of an Asian-prevalent HLA haplotype to the risk of HBV-related hepatocellular carcinoma

Liver cancer, particularly hepatitis B virus (HBV)-related hepatocellular carcinoma (HCC), is more common in Asians than in Caucasians. This is due, at least in part, to regional differences in the prevalence of exogenous factors such as HBV; however, endogenous factors specific to Asia might also play a role. Such endogenous factors include HLA (human leukocyte antigen) genes, which are considered candidates due to their high racial diversity. Here, we performed a pancancer association analysis of 147 alleles of HLA-class I/II genes (HLA-A, B, and C/DRB1, DQA1, DQB1, DPA1, and DPB1) in 31,727 cases of 12 cancer types (including 1684 liver cancer cases) and 107,103 controls. HLA alleles comprising a haplotype prevalent in Asia were significantly associated with pancancer risk (e.g., odds ratio [OR] for a DRB1*15:02 allele = 1.12, P = 2.7 × 10^-15), and the associations were particularly strong in HBV-related HCC (OR 1.95, P = 2.8 × 10^-5). In silico prediction suggested that the DRB1*15:02 molecule encoded by the haplotype does not bind efficiently to HBV-derived peptides. RNA sequencing indicated that HBV-related HCC in carriers of the haplotype shows low infiltration by NK cells. These results indicate that the Asian-prevalent HLA haplotype increases the risk of HBV-related liver cancer by attenuating immune activity against HBV infection, and by reducing NK cell infiltration into the tumor.

Subsequently, we investigated the association between the eight alleles comprising the Asian-prevalent haplotype and the risk for each of the 12 cancer types. After applying Bonferroni correction, we identified significant associations with four cancer types: liver, stomach, cervical, and lung (Table 2; e.g., the OR of a DRB1*15:02 allele for liver cancer risk = 1.30, P = 3.1 × 10^-7). Notably, three of these cancers (liver, stomach, and cervical) are more prevalent in Asian populations than in Caucasian populations, and are linked to viral or bacterial infections 1. In light of the availability of information regarding cancer subtypes and viral infection, we focused on liver cancer in the subsequent part of this study.
Table 1. Association of HLA alleles with pancancer risk. a Adjusted for age, sex, and the top five major PCA components, which were obtained from the pancancer GWAS. b Alleles constituting an Asian-prevalent haplotype. HLA: human leukocyte antigen; OR: odds ratio; CI: confidence interval.

HBV and HCV infections are the primary etiologies of liver cancer 2,3. Hence, we investigated the differences in the association of HLA alleles in the context of viral infection. Our analysis encompassed cases of HBV-related liver cancer/HCC (n = 128/67), HCV-related liver cancer/HCC (n = 622/299), and virus-negative liver cancer/HCC (NBNC) (n = 277/130). Notably, we found that the ORs for the alleles of seven HLA genes (all except HLA-A) were significantly higher for HBV-related cancers than for HCV-related and NBNC cancers (Table 3; e.g., the OR of a DRB1*15:02 allele for HBV-related liver cancer = 1.95, P = 2.8 × 10^-5). Furthermore, when we limited our analysis to cases with information related to the diagnosis of carcinoma (i.e., HCC), we detected an even more substantial increase in the ORs (Table 3; e.g., the OR of a DRB1*15:02 allele for HBV-related HCC = 2.43, P = 1.7 × 10^-5). Our case-case analysis comparing HBV-related cancer with NBNC cancers (i.e., NBNC cases were used as a reference) also demonstrated significant differences, suggesting that HLA alleles comprising the Asian-prevalent haplotype exhibit a stronger association with the risk for HBV-related liver cancer than for NBNC liver cancer (Supplementary Table 3).

Next, ORs were calculated according to genotype. Homozygosity for the risk alleles exhibited significantly higher ORs for HBV-related liver cancer/HCC risk than heterozygosity, suggesting that risk-associated HLA alleles have synergistic effects (Table 4; the OR of homozygotes of the DRB1*15:02 allele for HBV-related HCC = 9.82, P = 1.2 × 10^-8). By contrast, the ORs for A*24:02 genotypes were not significant, which is consistent with a lack of association with the HLA-A*24:02 allele (Table 3).

www.nature.com/scientificreports/

Affinity of HLA-class II molecules for HBV-derived peptides. The association between HLA-class II alleles and HBV-related liver cancer prompted us to investigate whether the HLA-class II molecules encoded by the risk-associated alleles bind efficiently to HBV-derived peptides, since HLA-class II molecules play a pivotal role in immune responses to viral infection 19,20. Using the antigen-prediction algorithm MARIA 21, we estimated the fraction of peptides efficiently captured by polymorphic HLA-DRB1 proteins among 1551 HBV-derived peptides deposited in the IEDB. Interestingly, the DRB1*15:02 molecule was predicted to bind fewer HBV-derived peptides than other HLA-DRB1 molecules (Fig. 1).
Immune profile of HBV-related HCC in risk allele carriers. Often, HBV-associated HCCs exhibit reduced intratumoral infiltration by activated NK cells 22, which represents an immune profile that could potentially promote tumor development and progression 23. Therefore, to investigate whether the risk-associated Asian-prevalent HLA haplotype plays a role in this phenotype, we analyzed RNA sequencing and whole-exome sequencing data from 160 Japanese HCC samples 24. HCC cases with the HLA-class I and -class II haplotypes were identified from noncancerous tissue whole-exome sequencing data, while intratumoral immune cell fractions were estimated from tumor tissue RNA sequencing data using the CIBERSORTx program 25. Consistent with a previous report 22, we found that the fraction of activated NK cells in HBV-related HCCs was lower than that in NBNC-HCCs (Supplementary Fig. 1), although the difference was not statistically significant (P > 0.05; Mann-Whitney U test). HCCs with the Asian-prevalent HLA haplotype had a lower percentage of activated NK cells than those without (Fig. 2), suggesting that intratumor infiltration by activated NK cells is lower in carriers of the HCC risk allele. By contrast, there was no significant difference in intratumor infiltration by CD8+, CD4+, and dendritic cells (P > 0.05; Mann-Whitney U test; Supplementary Fig. 2).
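The group comparisons above rely on the Mann-Whitney U test. The following is a minimal sketch of that test, assuming a normal approximation without tie or continuity correction; it is not the software used in the study, and the function names and the example fractions are hypothetical values chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U, two-sided P) using a normal approximation.

    Assumption: no tie or continuity correction is applied, so the P value
    is only approximate for small or heavily tied samples.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must contain at least one value")
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, min(p, 1.0)


# Hypothetical activated NK cell fractions (not data from the study).
carriers = [0.01, 0.02, 0.00, 0.03]
noncarriers = [0.04, 0.05, 0.02, 0.06]
u_stat, p_value = mann_whitney_u(carriers, noncarriers)
```

For samples as small as this example, an exact test from a statistics package would be preferable to the approximation sketched here.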

Discussion
Here, we conducted a comprehensive association analysis of HLA alleles to determine their role in pancancer risk and in the risk of developing 12 specific cancer types. The results indicate that HLA-class I and -class II alleles comprising an Asian-prevalent haplotype play a role in the risk of developing Asian-prevalent cancers such as liver, cervical, and stomach cancers. To the best of our knowledge, this is the first investigation that has provided evidence that HLA polymorphisms are an endogenous factor contributing to the risk of Asian-prevalent cancers. Of the 12 cancer types, we focused on liver cancer because it is prevalent in Asia, and information regarding viral infections is readily available. Remarkably, HLA alleles comprising the Asian-prevalent haplotype showed a more pronounced association with the risk of HBV-related HCC than with the risk of HCV-related and NBNC-HCCs, a finding in line with the high incidence rate of HBV-related HCC in Asian countries, including Japan 26,27. These HLA alleles did not deviate significantly from Hardy-Weinberg equilibrium (Supplementary Table 4), and showed linkage disequilibrium (Supplementary Fig. 3) in our study population. Therefore, these alleles, and the haplotype, are a common genetic risk factor shared by the Japanese population.

Two particular findings provide insight into the molecular mechanisms underlying the way in which HLA polymorphisms contribute to the risk of HBV-related HCC. First, in silico predictions suggest that the HLA-DRB1*15:02 molecule encoded by the Asian haplotype does not bind HBV-derived peptides efficiently. In particular, inefficient binding of the HLA-DRB1*15:02 molecule to large envelope protein-derived peptides is plausible, as these proteins comprise the hepatitis B surface antigen (HBsAg). We then used whole-exome sequencing data from 160 Japanese HCC cases to investigate the ability of HLA-class I molecules, specifically those encoded by alleles associated with increased risk (i.e., B*52:01 and C*12:02), to bind efficiently to neoantigens arising from somatic mutations in tumor tissues. We deduced that the number of neoantigens that bound to these HLA-class I molecules was not lower than the number that bound to other HLA-class I molecules (Supplementary Fig. 4A). Notably, a previous study showed that HLA-class II alleles, but not class I alleles, comprising the Asian-prevalent haplotype are associated with an increased risk for chronic hepatitis B infection in Japanese individuals 20. As such, it is possible that this haplotype contributes to the risk of liver cancer by increasing the likelihood of developing a chronic hepatitis B infection, rather than by promoting evasion of immune surveillance by established tumor cells.
The transcriptome data from the 160 HCCs enabled us to investigate the impact of the haplotype on the immunogenic properties of HCC. Chronic HBV infection, the primary risk factor for HCC, modulates expression of inhibitory and activating receptors on NK cells within tumor tissues, leading to a decrease in NK cell activation 23. In fact, HBV-associated HCCs frequently exhibit reduced infiltration by activated NK cells 22, an observation also made in the present study. Moreover, HCCs carrying the risk haplotype had a lower fraction of activated NK cells than those without the haplotype. The mechanisms underlying these observations remain unclear. Previous studies suggest that HLA-class I molecules expressed on tumor cells mediate NK cell suppression 28. Intriguingly, we observed infrequent LOH of the B*52:01 and C*12:02 alleles (which are associated with increased risk) compared with other class I alleles in HCCs of individuals heterozygous for these alleles (Supplementary Fig. 4B). This suggests that retention of these HLA-class I molecules may contribute to NK cell suppression. Nevertheless, further functional investigations are needed to establish a definitive conclusion.
In summary, our investigation highlights that the Asian-prevalent haplotype encompassing HLA-class I and -class II alleles increases susceptibility to Asian-prevalent malignancies, specifically HBV-related HCC, by suppressing antiviral and antitumor immune responses. Nonetheless, certain limitations must be acknowledged. First, although our study was a large-scale association analysis encompassing 12 major cancer types, the sample size for each cancer type, including HBV-related HCC, was small. Additionally, only Japanese patients were included in our analysis; thus, the associations should be confirmed in larger and more diverse cohorts of HCC patients. Second, the transcriptome analysis of HCC focusing on NK cells was performed using only publicly available data, thereby limiting our ability to assess detailed pathological characteristics such as immune cell distribution. Therefore, the relationship between HLA alleles and the intratumoral distribution of immune cells should be evaluated further using immunohistochemical methods coupled with HLA genotype data.

Materials and methods
Patient characteristics. Details of the BBJ subjects, including liver cancer cases, were described previously 29. The BBJ project enrolled participants, including cases diagnosed as liver cancer, from healthcare facilities in Japan; history of HBV or HCV infection (yes, no, or unknown) was obtained from medical records, and from interviews using a standardized questionnaire at enrolment. The histological type of liver cancer was diagnosed on the basis of tissue or cytological samples obtained from biopsies. Liver cancer histology was not

In silico prediction of the binding of polymorphic HLA-DRB1 molecules to HBV peptides. The amino acid sequences of 1551 peptides derived from HBV were obtained from the Immune Epitope Database (IEDB) 32. These peptides comprised fragments of proteins encoded by the HBV genome, including the capsid protein (n = 334), external core antigen (n = 202), large envelope protein (n = 555), protein P (n = 317), and protein X (n = 143). The MARIA (https://maria.stanford.edu/index.php) program, which specifically predicts peptide binding to HLA-DRB1 molecules, was used to infer the potential binding ability of these peptides to polymorphic HLA-DRB1 molecules. Peptides with a predicted score > 0.95 were considered to have positive binding ability, consistent with previous studies 21.
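The thresholding step described above can be sketched as follows. This is an illustration only: it assumes the per-peptide prediction scores have already been exported into a plain mapping from allele name to a list of scores, and the function names, the comparator allele, and the example scores are hypothetical rather than actual MARIA output.

```python
from typing import Dict, List

BINDING_THRESHOLD = 0.95  # cutoff stated in the text (score > 0.95)


def binder_fraction(scores: List[float],
                    threshold: float = BINDING_THRESHOLD) -> float:
    """Fraction of peptides whose predicted score exceeds the threshold."""
    if not scores:
        raise ValueError("no scores supplied")
    return sum(1 for s in scores if s > threshold) / len(scores)


def summarize(scores_by_allele: Dict[str, List[float]]) -> Dict[str, float]:
    """Compute the binder fraction for each HLA-DRB1 allele."""
    return {allele: binder_fraction(s) for allele, s in scores_by_allele.items()}


# Hypothetical scores for four peptides per allele (illustration only).
example_scores = {
    "DRB1*15:02": [0.10, 0.97, 0.50, 0.20],
    "DRB1*01:01": [0.99, 0.96, 0.40, 0.98],
}
fractions = summarize(example_scores)
```

Comparing these per-allele fractions across alleles is the kind of summary reported in Fig. 1.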

Computational identification of HLA-class I molecule-restricted neoantigens.
Whole-exome sequencing data (Fastq files) from HCC and nontumor tissue DNA samples obtained from 160 HCC patients of Japanese descent were procured from the National Bioscience Database Center (NBDC) Human Database (research ID: hum0187.v2) 24. Exome sequencing was conducted on the Illumina HiSeq2000 platform using 2 × 100 bp paired-end reads (resulting in an estimated 100-fold coverage) and the SureSelect Human All Exon Kit V4/V5 (Agilent Technologies, Santa Clara, United States) or a SeqCap EZ HGSC VCRome2.1 design1 kit (Roche, Basel, Switzerland). Basic alignment and sequence quality control were then undertaken in accordance with the GATK4 best practices pipeline 33. The reads obtained were aligned to the UCSC human genome 38 (hg38) reference sequence. Somatic single nucleotide variants and insertion/deletion variants were detected by the Mutect2 program (BROAD Institute; http://www.broadinstitute.org/gatk/). Each nonsynonymous single nucleotide variant was translated into a 17-mer peptide sequence centered on the mutated amino acid. The 17-mers were then used to generate 9-mers through a sliding window approach, followed by prediction of HLA-class I binding to neopeptides by the HLAthena program 34. Neoantigens were selected based on a prediction score of MSi > 0.9 for each patient-specific HLA-class I allele [35][36][37][38].
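The peptide-generation step (a 17-mer centered on the mutated residue, then 9-mers by sliding window) can be sketched as below. This is a minimal illustration, not the pipeline used in the study; the function names and the example protein sequence are hypothetical, and binding prediction itself is not shown.

```python
from typing import List

FLANK = 8   # residues kept on each side of the mutated position (8 + 1 + 8 = 17)
WINDOW = 9  # length of candidate HLA-class I peptides


def centered_17mer(protein: str, mut_index: int, mut_residue: str) -> str:
    """Apply a missense substitution and return the window centered on it.

    Near the protein termini the window is truncated, so the result can be
    shorter than 17 residues.
    """
    if not 0 <= mut_index < len(protein):
        raise ValueError("mutation index outside the protein sequence")
    mutated = protein[:mut_index] + mut_residue + protein[mut_index + 1:]
    start = max(0, mut_index - FLANK)
    end = min(len(mutated), mut_index + FLANK + 1)
    return mutated[start:end]


def sliding_windows(peptide: str, size: int = WINDOW) -> List[str]:
    """Return every contiguous sub-peptide of the given size."""
    return [peptide[i:i + size] for i in range(len(peptide) - size + 1)]


# Hypothetical protein and substitution (R -> W at 0-based index 14).
protein = "ACDEFGHIKLMNPQRSTVWYACDEFGHIK"
peptide = centered_17mer(protein, mut_index=14, mut_residue="W")
nine_mers = sliding_windows(peptide)
```

With a full-length 17-mer, the sliding window yields nine 9-mers, each of which contains the mutated residue.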

Estimation of somatic HLA-class I allele loss in HCC.
The HLA-class I genotypes (comprising four-digit alleles) of the HCC patients were determined from whole-exome sequencing data using the HLA-HD 39 and POLYSOLVER 40 programs. Somatic loss of HLA-class I alleles was estimated by the LOHHLA 41 program using default settings. In short, the allele-specific copy number of each HLA-class I locus was determined by realigning sequence reads to patient-specific HLA reference sequences. Somatic loss of heterozygosity (LOH) was considered positive when the P value for the difference in the log copy ratio between the two HLA alleles (Pval_unique) was less than 0.01, as previously described 42.
RNA sequencing and immune cell profiling. The RNA sequencing data (Fastq files) for tumor tissues from the same cohort of 160 HCC patients were acquired from the NBDC (research ID: hum0187.v2). The polyadenylated RNA libraries were synthesized using the TruSeq Stranded mRNA Library Prep kit (Illumina) and sequenced using the Illumina HiSeq2000 platform, generating 2 × 100 bp paired-end reads. Read alignment was performed using STAR version 2.7.3a 43, with the human genome (GRCh38) and transcriptome data (GENCODE version 31 44) as reference datasets. Transcripts per million (TPM) values were calculated using the StringTie program (version 2.0.4) 45. Levels of immune infiltration were calculated from TPM expression data using the LM22 gene signature and the CIBERSORTx algorithm 25,46. Data were run with 1000 permutations under the LM22 signature. The fraction of CD4+ cells was calculated by summing the fractions of "T cells CD4

Figure 1. Potential inefficient binding of the DRB1*15:02 molecule. HBV-derived peptides cataloged in the IEDB that were predicted to bind HLA-DRB1 molecules with high affinity (i.e., predicted score > 0.95) were identified by the MARIA program. The results for all deposited peptides (left panel, n = 1551) and for peptides that elicited a reaction in a T cell assay (right panel, n = 465) are shown. Colors indicate the protein type from which the binder peptides are derived.

Table 2. Association of HLA alleles comprising an Asian-prevalent haplotype with cancer risk.
a Adjusted for age, sex, and the top five major PCA components, which were obtained from the pancancer GWAS. Results with statistically significant associations are shown in bold. OR: odds ratio; CI: confidence interval. b Includes 2143 multiple primary cancers.
Columns: Type (N); Class I alleles A*24:02, B*52:01, and C*12:02; Class II alleles DRB1*15:02, DQA1*01:03, DQB1*06:01, DPA1*02:01, and DPB1*09:01. For each allele, the OR a (95% CI) and P value are reported.
The association between HLA-class II alleles and HBV-related liver cancer prompted us to investigate whether the HLA-class II molecules encoded by

Table 3. Association analysis of HLA alleles with liver cancer/HCC risk (according to viral infection).
a Adjusted for age, sex, and the top five major PCA components, which were obtained from the pancancer GWAS. Results with statistically significant associations are shown in bold. HCC: hepatocellular carcinoma; HBV: hepatitis B virus; HCV: hepatitis C virus; NBNC: non-B non-C; OR: odds ratio; CI: confidence interval.

Table 4. Association of the Asian-prevalent HLA haplotype with heterozygous or homozygous retention in HBV-related liver cancer.
a Adjusted for age, sex, and the top five major PCA components, which were obtained from the pancancer GWAS. Results with statistically significant associations are shown in bold. HCC: hepatocellular carcinoma; HBV: hepatitis B virus; HCV: hepatitis C virus; NBNC: non-B non-C; OR: odds ratio; CI: confidence interval. b Includes B*52:01, C*12:

Figure 2. Proportion of activated NK cells infiltrating HCC tissues. Fractions of activated NK cells within HCC tissues were determined by the CIBERSORTx algorithm using RNA sequencing data obtained from 160 Japanese patients with HCC. Fractions of activated NK cells were stratified according to the HLA-class I, -class II, and -class I/II genotypes. Statistical significance was determined by the Mann-Whitney U test, and P values < 0.05 were considered significant.

HLA genotyping and association analysis. SNP2HLA 30 and a Japanese HLA reference panel 8 were used to impute the HLA alleles of the study subjects from genotype data covering the MHC region. This enabled identification of four-digit classical class I (HLA-A, B, and C) and class II (HLA-DRB1, DQA1, DQB1, DPA1, and DPB1) HLA alleles. Chi-squared tests and logistic association analyses of case/control data were performed for each HLA allele, and the 12 cancer types, using the PLINK program (version 1.07) 31; age, sex, and the top five principal component scores were used as covariates. Bonferroni correction was applied to account for multiple testing. A total of 147 HLA-class I/II alleles were studied; P < 3.4 × 10^-4 was therefore considered significant (i.e., 0.05/147 ≈ 3.4 × 10^-4). Statistical analyses were performed using the R statistical environment (version 4.1.2), GraphPad Prism 9, or PLINK 1.07.
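The Bonferroni threshold quoted above follows directly from dividing the nominal significance level by the number of alleles tested. The short sketch below reproduces that arithmetic and checks two P values reported in the text against it; the function and variable names are illustrative assumptions, not part of the study's analysis scripts.

```python
ALPHA = 0.05      # nominal significance level
N_ALLELES = 147   # number of HLA-class I/II alleles tested

# Bonferroni-corrected threshold: 0.05 / 147, approximately 3.4e-4.
threshold = ALPHA / N_ALLELES


def is_significant(p_value: float,
                   n_tests: int = N_ALLELES,
                   alpha: float = ALPHA) -> bool:
    """Return True if the P value passes the Bonferroni-corrected threshold."""
    return p_value < alpha / n_tests


# P values for the DRB1*15:02 allele as reported in the text.
reported = {
    "liver cancer": 3.1e-7,
    "HBV-related liver cancer": 2.8e-5,
}
calls = {label: is_significant(p) for label, p in reported.items()}
```

Both reported P values fall below the corrected threshold, in agreement with the associations described as significant in the Results.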