Identification of HLA-A*02:06:01 as the primary disease susceptibility HLA allele in cold medicine-related Stevens-Johnson syndrome with severe ocular complications by high-resolution NGS-based HLA typing

Stevens-Johnson syndrome (SJS) and toxic epidermal necrolysis (TEN) are life-threatening acute inflammatory vesiculobullous reactions of the skin and mucous membranes. These severe cutaneous drug reactions are known to be caused by inciting drugs and infectious agents. Previously, we reported the association of HLA-A*02:06 and HLA-B*44:03 with cold medicine (CM)-related SJS/TEN with severe ocular complications (SOCs) in the Japanese population. However, the conventional HLA typing method (PCR-SSOP) sometimes leaves ambiguity in the final HLA allele determination. In this study, we performed HLA-disease association studies in CM-SJS/TEN with SOCs at the 3- or 4-field level. A total of 120 CM-SJS/TEN patients with SOCs and 817 Japanese healthy controls were HLA genotyped using high-resolution next-generation sequencing (NGS)-based HLA typing of HLA class I genes, including HLA-A, HLA-B, and HLA-C. Among the alleles of HLA class I genes, HLA-A*02:06:01 was strongly associated with susceptibility to CM-SJS/TEN (p = 1.15 × 10⁻¹⁸, odds ratio = 5.46). Four other alleles (HLA-A*24:02:01, HLA-B*52:01:01, HLA-B*46:01:01, and HLA-C*12:02:02) also demonstrated significant associations. HLA haplotype analyses indicated that HLA-A*02:06:01 is primarily associated with susceptibility to CM-SJS/TEN with SOCs. Notably, there were no specific disease-causing rare variants among the high-risk HLA alleles. This study highlights the importance of higher-resolution HLA typing in the study of disease susceptibility, which may help to elucidate the pathogenesis of CM-SJS/TEN with SOCs.

using an Ion PGM™ Template IA 500 reagents kit (Thermo Fisher Scientific). Beads carrying the single-stranded DNA templates were enriched using a OneTouch™ ES instrument (Thermo Fisher Scientific). Sequencing was performed using an Ion PGM™ Sequencing Hi-Q™ kit and 318 Chip kit v2 (Thermo Fisher Scientific). For the AllType kit, IA and enrichment steps were carried out using an Ion S5™ ExT Chef kit (Thermo Fisher Scientific) and Ion Chef™ instrument (Thermo Fisher Scientific), and sequencing was performed using Ion S5™ sequencing reagents (Thermo Fisher Scientific) and an Ion 530™ chip (Thermo Fisher Scientific).
After the sequencing run, all raw data were automatically saved in the Ion Torrent Server and converted to sequence fastq files for each sample. These fastq files were then analyzed, and HLA allele calling was performed using TypeStream™ Visual NGS Analysis software, v.1.1 Hot Fix 1, with the IMGT/HLA 3.29.0 databases.
HLA haplotype estimation. Each HLA class I gene haplotype was estimated using BIGDAWG (Bridging ImmunoGenomic Data-Analysis Workflow Gaps) software, version 2.1, implemented as the bigdawg R package 24.
Statistical analysis. The carrier frequencies of individual HLA alleles in patients and controls were compared based on the dominant model using the χ²-test (R software; R Foundation for Statistical Computing). HLA alleles and haplotypes with frequencies less than 1% in cases and controls were excluded from the association analysis. Fisher's exact test (R software; R Foundation for Statistical Computing) was used when one or more observed counts were less than 5. Significance levels were adjusted for multiple testing by Bonferroni correction based on the number of comparisons. A corrected P value < 0.05 was considered statistically significant.
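The testing procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the code used in the study, which was run in R. The function names and the example counts are hypothetical, and the χ² statistic is computed without a continuity correction, which is an assumption because the text does not say whether one was applied.

```python
import math

def chi2_test_1df(a, b, c, d):
    """Pearson chi-square test (no continuity correction, an assumption)
    on the 2x2 table [[a, b], [c, d]]. Returns (statistic, p_value)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom the chi-square tail probability is erfc(sqrt(x/2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test: sum the probabilities of all tables
    with the same margins that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = math.comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [math.comb(row1, x) * math.comb(row2, col1 - x) / total
             for x in range(lo, hi + 1)]
    p_obs = probs[a - lo]
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))

def carrier_test(case_carriers, n_cases, control_carriers, n_controls, n_tests=1):
    """Dominant-model comparison of carriers vs non-carriers, with Bonferroni."""
    a, b = case_carriers, n_cases - case_carriers
    c, d = control_carriers, n_controls - control_carriers
    if min(a, b, c, d) < 5:          # any observed count below 5: exact test
        method, p = "fisher", fisher_exact_two_sided(a, b, c, d)
    else:
        method, p = "chi-square", chi2_test_1df(a, b, c, d)[1]
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return {"method": method, "p": p,
            "p_corrected": min(1.0, p * n_tests),  # Bonferroni correction
            "odds_ratio": odds_ratio}

# Hypothetical counts for illustration only (not taken from the study).
print(carrier_test(30, 120, 100, 800, n_tests=42))
```

Under the dominant model each individual is counted once as a carrier regardless of zygosity, so the 2x2 table holds individuals rather than chromosomes.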

NGS-based HLA allele typing.
A total of 95 HLA class I allele sequences (23 HLA-A, 47 HLA-B, 25 HLA-C) at the 3-field level were detected for all SJS cases and healthy controls using NGS-based HLA allele typing (Table 1, Supplementary Table 2). For cases in which 3-field level sequences were not registered in the IMGT/HLA database v.3.29.0, 2-field (4-digit) allele assignments (such as C*14:03) were adopted. Newly discovered alleles were validated by Sanger sequencing, and all alleles were correctly sequenced by NGS (Supplementary Fig. 1a-d). The number of alleles subdivided into multiple 3-field alleles is summarized in Table 2.
For some samples (SJS, n = 105; controls, n = 752), 2-field Luminex HLA allele typing results were available from our previous study. The accuracy of NGS-based HLA allele typing was evaluated by assessing the concordance between Luminex-based and NGS-based HLA allele typing results at 2-field resolution. Seven discordant alleles were observed, as shown in Table 3. Additional Sanger sequencing was performed to evaluate the key nucleotide differences between the NGS-based and Luminex-based typing results, which demonstrated that the NGS results were accurate for these 7 alleles (Supplementary Fig. 1e-k). For example, A*02:15N has been detected at an allele frequency of 0.003% in the Japanese population 25, and the sequence differences between A*02:15N and A*02:07 lie in exon 4 of the HLA-A gene, which is not covered by Luminex oligonucleotide probes. Therefore, these two alleles are listed among Luminex HLA ambiguities.
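A concordance check of this kind can be sketched as below, assuming that each sample's genotype for one gene is available as a list of allele names. The helper names are hypothetical, and the example genotypes are illustrative rather than taken from Table 3. NGS calls are truncated to 2-field resolution, keeping any expression suffix such as N, before being compared with the Luminex calls as unordered pairs.

```python
import re
from collections import Counter

def to_two_field(allele):
    """Truncate an HLA allele name to 2-field resolution,
    e.g. 'A*02:06:01' -> 'A*02:06'. An expression suffix such as 'N'
    on the last field is kept, e.g. 'A*02:15N' stays 'A*02:15N'."""
    gene, fields = allele.split("*")
    parts = fields.split(":")
    suffix = ""
    if len(parts) > 2:
        match = re.search(r"[A-Z]$", parts[-1])
        if match:
            suffix = match.group(0)
    return f"{gene}*{':'.join(parts[:2])}{suffix}"

def count_discordant(ngs_alleles, luminex_alleles):
    """Number of NGS alleles (at 2-field level) without a matching
    Luminex allele in the same sample. Allele order is ignored."""
    ngs = Counter(to_two_field(a) for a in ngs_alleles)
    lum = Counter(luminex_alleles)
    return sum((ngs - lum).values())

# Illustrative genotypes for one gene in two samples (hypothetical data).
print(count_discordant(["A*02:06:01", "A*24:02:01"], ["A*24:02", "A*02:06"]))
print(count_discordant(["A*02:15N", "A*24:02:01"], ["A*02:07", "A*24:02"]))
```

Comparing multisets rather than ordered pairs avoids false discordances when the two platforms report the same alleles in a different order.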

Significant associations between HLA alleles and susceptibility to CM-SJS/TEN with SOCs.
Carrier frequencies of HLA alleles were compared between 120 CM-SJS/TEN with SOCs patients and 817 healthy controls. We compared 42 alleles (10 for HLA-A, 18 for HLA-B, 14 for HLA-C) carried by more than 1% of individuals among both patients and controls. In total, 16 alleles (6 for HLA-A, 6 for HLA-B, 4 for HLA-C) demonstrated associations with P values < 0.05, of which the associations of 5 HLA alleles remained significant after correction for multiple testing (Table 4).

HLA haplotype association analyses.
In order to identify the primary associations among HLA class I alleles, haplotype analyses were conducted using BIGDAWG software. We compared 84 haplotypes (32 for A-B, 25 for B-C, 27 for A-C haplotypes) carried by more than 1% of individuals among both patients and controls. Twenty haplotypes (5 for A-B, 7 for B-C, 8 for A-C haplotypes) demonstrated associations with P values < 0.05, and the associations of 7 haplotypes remained significant after correction for multiple testing (Table 4).
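The 1% carrier-frequency filter applied before testing can be illustrated with the sketch below. It assumes genotypes are stored as one pair of allele names per individual; the function names and toy genotypes are hypothetical. The number of retained alleles then sets the Bonferroni divisor, for example 0.05/42 for the 42 alleles compared above.

```python
from collections import Counter

def carrier_counts(genotypes):
    """Count carriers per allele. Each individual counts once per allele,
    even when homozygous (dominant model)."""
    counts = Counter()
    for pair in genotypes:
        counts.update(set(pair))
    return counts

def alleles_to_test(case_genotypes, control_genotypes, min_freq=0.01):
    """Alleles carried by more than min_freq of individuals in BOTH groups."""
    case_c = carrier_counts(case_genotypes)
    ctrl_c = carrier_counts(control_genotypes)
    n_case, n_ctrl = len(case_genotypes), len(control_genotypes)
    return [a for a in sorted(set(case_c) | set(ctrl_c))
            if case_c[a] / n_case > min_freq and ctrl_c[a] / n_ctrl > min_freq]

def per_test_alpha(n_tests, alpha=0.05):
    """Bonferroni-adjusted significance threshold for a single test."""
    return alpha / n_tests

# Toy genotypes (hypothetical, far smaller than the real cohorts).
cases = [("A*02:06:01", "A*02:06:01"), ("A*02:06:01", "A*24:02:01")]
controls = [("A*24:02:01", "A*24:02:01"), ("A*24:02:01", "A*11:01:01")]
print(alleles_to_test(cases, controls))
print(per_test_alpha(42))
```

Requiring the threshold in both groups follows the wording of this section. The methods text could also be read as excluding an allele only when it is rare in both groups, in which case the `and` in the filter would become `or`.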

Discussion
In this study, we investigated potential novel HLA alleles associated with the occurrence of CM-SJS/TEN with SOCs using NGS-based high-resolution HLA allele typing. To our knowledge, there have been no reports published to date regarding case-control association analyses of HLA alleles at the 3-field level using NGS-based HLA typing. Our approach enabled 3-field HLA allele assignment, so that previous associations could be analyzed at higher resolution. In addition, this approach provided full resolution of ambiguous HLA alleles in comparison with traditional HLA allele typing methods such as PCR-SSOP or PCR-SBT. This could ultimately provide a more detailed picture of the relationship between HLA polymorphism profiles and disease susceptibility.
Table 3. Discordance between NGS typing and Luminex typing results (SJS: n = 105, controls: n = 752). Abbreviations: PCR-SSOP, polymerase chain reaction sequence specific oligonucleotide probing; NGS, next generation sequencing. Discordant base position between NGS typing and Luminex typing is shown with the number of bases. "out of target" means the Luminex probe doesn't cover the targeted region (exons 2 and 3). "no probe" means the Luminex probe doesn't cover the discordant base position. "✓(Typing error)" means the Luminex probe covers the discordant base position, but can't distinguish the difference because of some typing error.
Our group previously reported that HLA-A*02:06 was strongly associated with susceptibility to CM-SJS/TEN with SOCs 13. In this study, HLA-A*02:06:01 showed the strongest association at the 3-field level, and all previously typed HLA-A*02:06 alleles were classified as HLA-A*02:06:01 at the 3-field level. A previous study using in silico docking simulations reported that the protein encoded by HLA-A*02:06 is predicted to bind to various ingredients contained in cold medicines, such as acetaminophen, although the predicted binding was not verified experimentally 26. These findings suggest that the association of HLA-A*02:06:01 is mainly attributable to the 2-field rather than the 3-field level; that is, the differences in the amino acid sequences might be directly correlated with CM-SJS/TEN through peptide presentation. A well-known example of this phenomenon involves the specific binding of abacavir, an anti-HIV drug, with the protein encoded by HLA-B*57:01. Abacavir binds with exquisite specificity to the HLA-B*57:01-encoded protein, altering immunologic 'self' with the selection of new endogenous peptides 27,28. This molecular mechanism is one possibility, but whether such reactions occur between the ingredients of cold medicines and the protein product of HLA-A*02:06:01 remains unclear. Considering that the onset of SJS/TEN with SOCs is associated not only with the administration of drugs but also with viral and microbial infections, these factors might interact with each other, leading to alterations of the immune system and subsequent destruction of target cells.
In contrast, B*44:03, which our group previously reported as an independent risk factor in the Japanese population 13, can be divided into two alleles (HLA-B*44:03:01 and HLA-B*44:03:02) at the 3-field level. Although the case group had a higher frequency of B*44:03:01 than controls, the association of B*44:03:01 did not remain significant after correction for multiple testing, probably because of the smaller sample size of that study. However, HLA-B*44:03 appears to be a universal marker of CM-SJS in many populations, including Indian 29, Brazilian 30, and Thai populations 31. Reports from the USA 32 and France 33 indicated that levels of the HLA-B12 (HLA-Bw44) antigen, primarily encoded by HLA-B*44:02 or HLA-B*44:03, are significantly increased in Caucasian SJS patients who had taken NSAIDs as cold medicines. Considering the possibility that the HLA-B*44:03 association can be attributed to specific ingredients in cold medicines, stratification of the case group based on drug ingredients should be examined in future studies. In this study, however, the specific drugs were not known in all patients, so further analyses targeting specific drugs are necessary.
In this study, four HLA class I alleles other than HLA-A*02:06:01 that exhibited new associations with CM-SJS/TEN with SOCs were identified. However, haplotype analyses suggested that the associations of HLA-B and HLA-C were due primarily to strong linkage disequilibrium with significant HLA-A alleles. [...] result of linkage disequilibrium with HLA-A*24:02:01. Similar haplotype associations were also observed for HLA-B*46:01:01, which forms a common haplotype with HLA-C*01:02:01 and HLA-A*02:06:01, and the direction of the odds ratio is the same as that of HLA-A*02:06:01, suggesting that the HLA-B*46:01:01 association can be attributed to the effect of HLA-A*02:06:01. In addition, HLA-A*24:02:01, which exhibited a protective association, is the most frequent HLA-A allele in the Japanese population (the allele frequency of HLA-A*24:02 is 36.1%) 25. Thus, it is possible that the increase in HLA-A*02:06:01 in the case group caused the decrease in HLA-A*24:02:01, leading to the apparent protective association of HLA-A*24:02:01. In order to evaluate the susceptibility effect of HLA-A*02:06:01 and the protective effect of HLA-A*24:02:01, we compared the frequencies of CM-SJS/TEN with SOCs patients and healthy controls carrying both HLA-A*02:06:01 and HLA-A*24:02:01. The percentages of CM-SJS/TEN with SOCs patients and controls carrying both HLA alleles were 11.7% and 5.01%, respectively, indicating that the susceptibility effect of HLA-A*02:06:01 is stronger than the apparent protective effect of HLA-A*24:02:01.
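The comparison of double carriers can be expressed as an odds ratio computed directly from the two reported percentages. The sketch below is only an arithmetic illustration: the odds ratio it produces is not a value reported in the study, and the function names are hypothetical.

```python
def odds(p):
    """Convert a proportion to odds."""
    return p / (1.0 - p)

def odds_ratio_from_proportions(p_cases, p_controls):
    """Odds ratio for carrying a genotype, from the carrier proportions
    in cases and in controls."""
    return odds(p_cases) / odds(p_controls)

# Reported proportions of individuals carrying both HLA-A*02:06:01 and
# HLA-A*24:02:01: 11.7% of patients and 5.01% of controls.
or_both = odds_ratio_from_proportions(0.117, 0.0501)
print(round(or_both, 2))
```

An odds ratio above 1 (here roughly 2.5) among individuals who carry both alleles is consistent with the interpretation that the risk conferred by HLA-A*02:06:01 outweighs the apparent protective effect of HLA-A*24:02:01.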
Notably, NGS of HLA class I genes revealed no specific disease-causing rare variants among the high-risk HLA alleles; that is, HLA genes contribute to the susceptibility to CM-SJS/TEN with SOCs. In previous studies, several immune-related genes other than HLA have been linked to susceptibility to CM-SJS/TEN with SOCs, such as PTGER3 34, IKZF1 35, TSHZ2 35, IL-4R 36, FasL 37, IL-13 38, and TLR3 4,34, suggesting that the combination of multiple gene polymorphisms and their interactions, including with HLA alleles, contributes strongly to the onset of CM-SJS/TEN with SOCs. This hypothesis is supported in part by the findings of this study indicating that HLA genes are associated with disease susceptibility rather than causality.
Although almost all 3-field alleles subdivided from the same 2-field allele belonged to a single 3-field allele (Table 2), some 2-field alleles were divided into 2 subgroups with substantial frequencies (e.g., 77 B*39:01 was divided into 31 B*39:01:01 and 46 B*39:01:03). Although peptide presentation possibly affects disease susceptibility, it has been reported that coding variants affecting regions outside of the peptide-binding groove are strongly associated with diseases such as type 1 diabetes 39. These data support the hypothesis that higher-resolution NGS-based HLA allele typing is useful for disease association studies. In addition, one of the objectives of this study was to investigate the impact of non-coding variants on disease susceptibility. However, we were only able to achieve 3-field level HLA allele assignment and could not determine assignments at the 4-field level for some of the samples. This limitation was due to 2 major factors: (i) lack of amplification of the UTR by the NXType kit for several samples, and (ii) insufficient 4-field HLA allele sequences in the current IMGT/HLA database. Several studies have shown that HLA non-coding SNPs are associated with diseases, such as an association of an SNP in the 3′UTR of HLA-DPB1 with acute graft-versus-host disease 22 and of a specific SNP in intron 2 of HLA-DRB1 with rheumatoid arthritis 40. Therefore, in order to consider the effect of non-coding variants on disease susceptibility, technical improvements in reagents and/or typing software are needed, in addition to an expansion of 4-field HLA reference sequences in public databases.
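The subdivision of 2-field alleles into 3-field alleles, as in the B*39:01 example, can be tabulated with a short sketch like the one below. The function names and the 10% share used to define a "substantial" subgroup are assumptions made for illustration. The B*39:01 counts come from the text, while the A*02:06:01 count is a hypothetical placeholder.

```python
from collections import defaultdict

def group_by_two_field(three_field_counts):
    """Group 3-field allele counts under their 2-field parent allele."""
    groups = defaultdict(dict)
    for allele, count in three_field_counts.items():
        parent = ":".join(allele.split(":")[:2])
        groups[parent][allele] = count
    return dict(groups)

def split_parents(three_field_counts, min_share=0.10):
    """2-field alleles with at least two 3-field subtypes that each make up
    min_share or more of the parent total (threshold is an assumption)."""
    result = {}
    for parent, subs in group_by_two_field(three_field_counts).items():
        total = sum(subs.values())
        major = {a: n for a, n in subs.items() if n / total >= min_share}
        if len(major) >= 2:
            result[parent] = (total, major)
    return result

counts = {
    "B*39:01:01": 31,   # from the text
    "B*39:01:03": 46,   # from the text
    "A*02:06:01": 50,   # hypothetical count, single 3-field subtype
}
print(split_parents(counts))
```

Only parents with two or more sizeable subtypes are returned, which are the alleles for which 3-field typing could in principle reveal an association hidden at 2-field resolution.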
In conclusion, although some technical limitations have to be noted, we successfully identified HLA class I alleles at the 3-field level using high-resolution NGS-based HLA allele typing and demonstrated a strong association between CM-SJS/TEN with SOCs and HLA-A*02:06:01 in the Japanese population. These findings highlight the importance of higher-resolution HLA typing in the study of disease susceptibility, which may help to elucidate the pathogenesis of CM-SJS/TEN with SOCs.