The adverse events of haematopoietic stem cell transplantation are associated with gene polymorphism within human leukocyte antigen region

Adverse reactions may still occur in some patients after haematopoietic stem cell transplantation (HSCT), even when a human leukocyte antigen (HLA)-matched donor is chosen. The adverse reactions of transplantation include disease relapse, graft-versus-host disease (GVHD), mortality and cytomegalovirus (CMV) infection. However, only relapse was discussed in our previous study. Therefore, in this study, we investigated the correlation between gene polymorphisms within the HLA region and post-HSCT adverse reactions in patients with acute leukaemia (n = 176), of whom 72 were diagnosed with acute lymphocytic leukaemia (ALL) and 104 with acute myeloid leukaemia (AML). The candidate single nucleotide polymorphisms (SNPs) were divided into three models (donor, recipient and donor-recipient pairs), and the ALL and AML data were analysed separately. We found 16 SNPs associated with the survival rate, the risk of CMV infection or the grade of GVHD in the donor, recipient or donor-recipient matching models. In the ALL group, rs209132 of TRIM27 in the donor group was related to CMV infection (p = 0.021), rs213210 of RING1 in the recipient group was associated with serious GVHD (p = 0.003), and rs2227956 of HSPA1L in the recipient group correlated with CMV infection (p = 0.001). In the AML group, rs3130048 of BAG6 in the donor-recipient pairs group was associated with serious GVHD (p = 0.048). Moreover, these SNPs were further associated with survival time after transplantation. These results could help in selecting the best donor for HSCT.


Scientific Reports | (2021) 11:1475 | https://doi.org/10.1038/s41598-020-79369-w
... a generally better prognosis 7, and the risk of graft-versus-host disease (GVHD) was lower than in those with an HLA-mismatched donor 8. Unfortunately, even when an HLA-matched donor is selected, disease relapse and other poor outcomes may still occur after HSCT. In 2013, Petersdorf et al. demonstrated that several SNPs in non-classical HLA genes could potentially affect the effectiveness of HSCT 9. We used these SNPs (rs2244546, rs394657, rs429916, rs915654, rs2075800, rs2242656, rs107822, rs209130 and rs2071479) as sourced SNPs to look for candidate SNPs located within the 500 bp flanking genomic regions of the sourced SNPs. In our previous studies, we investigated the correlation between these candidate SNPs and post-HSCT disease relapse, and the results were reported in PeerJ (2018) 10 and Scientific Reports (2019) 11, respectively. These studies supported the view that the outcomes of allo-HSCT might be affected by genes within the HLA system other than HLA-A, -B, -C, -DR and -DQ.
However, besides disease relapse, the adverse reactions of HSCT include mortality, GVHD and cytomegalovirus (CMV) infection. In this study, we extended the two studies mentioned above by enlarging the sample of patients with AML or ALL who received HSCT and by investigating the relationship between these adverse reactions and gene polymorphisms within the HLA region.

Results
The clinical characteristics of the patients are summarised in Table 1. A total of 176 patients were enrolled in this study, including 104 patients with AML and 72 patients with ALL. AML patients were older (30.64 ± 11.43 years) than ALL patients (21.86 ± 14.12 years), and the male-to-female ratio was 1:1. Across all patients, the survival rate was 52%, the relapse rate was 74%, the CMV infection rate was 56% and the rate of severe GVHD was 27%. The most common stem cell source was peripheral blood (70.5%), and the most common conditioning regimen was chemotherapy (59.1%).
Table 2 shows the correlations between the three adverse reactions of HSCT and the candidate SNPs in the ALL group. In the donor group, rs209132 of TRIM27 was associated with the risk of CMV infection (p = 0.021), and rs213210 of RING1 was related to the risk of grade 3-4 GVHD (p = 0.003). In the recipient group, three SNPs correlated with CMV infection: rs9282369 of HLA-DOA (p = 0.014), rs2227956 of HSPA1L (p = 0.001) and rs3130048 of BAG6 (p = 0.035). Moreover, two SNPs were associated with grade 3-4 GVHD: rs213210 of RING1 (p = 0.024) and rs139791445 of TRIM27 (p = 0.044). In the donor-recipient pairs group, only rs209130 of TRIM27 was related to the risk of grade 3-4 GVHD (p = 0.036); pairs with matched rs209130 genotypes were less likely to develop grade 3-4 GVHD than unmatched pairs (OR = 0.333, 95% CI 0.117-0.946).
In the AML group (Table 3), two SNPs of HLA-DOB were significantly correlated with adverse reactions in the donor group: one was associated with CMV infection (rs11244, p = 0.019) and the other with the survival rate (rs17220087, p = 0.047). Additionally, rs209131 of TRIM27 was associated with CMV infection (p = 0.034), rs2518028 of HCP5 was related to the survival rate (p = 0.026) and rs1536215 of TRIM27 was related to grade 3-4 GVHD (p = 0.021). In the recipient group, rs209132 and rs209131 of TRIM27 (p = 0.022 and 0.042, respectively) and rs2070120 and rs17213693 of HLA-DOB (p = 0.026 and 0.043, respectively) were associated with the risk of CMV infection. Moreover, rs79327197 of HLA-DOA was related to the survival rate, with a higher mortality frequency in patients carrying the minor allele (G allele; p = 0.008, OR = 0.217, 95% CI 0.065-0.720); here, the allele with the higher frequency in the ALL or AML population was defined as the major allele and the allele with the lower frequency as the minor allele. In the donor-recipient pairs group, whether the genotypes of rs209131 of TRIM27 (p = 0.020) and rs17213693 of HLA-DOB (p = 0.044) were matched or unmatched was associated with CMV infection. Patients had lower survival when the rs107822 genotype in the RING1 promoter region was matched in donor-recipient pairs (p = 0.016, OR = 0.368, 95% CI 0.161-0.838). Furthermore, patients had a lower risk of severe GVHD when the genotype of rs3130048 in the intron region of BAG6 was matched in donor-recipient pairs (p = 0.048, OR = 0.302, 95% CI 0.099-0.920).
Survival curves were then examined for the factors and SNPs in Tables 2 and 3 to investigate whether CMV infection, severe GVHD, CMV-related SNPs and GVHD-related SNPs influenced overall survival. Survival time was calculated from the date of HSCT, and follow-up ended at the time of the patient's death or loss of contact.
Table 2. SNPs significantly associated with the three adverse reactions in the ALL group. D: dominant model (AA vs. Aa + aa); R: recessive model (AA + Aa vs. aa); A: additive model (AA vs. Aa vs. aa), in which 'A' denotes the higher-frequency allele and 'a' the lower-frequency allele. Fisher's exact test was used when more than 20% of cells had an expected count of less than 5 for the Chi-square test. For the odds ratios, the unmatched group was used as the reference; in other words, the odds of CMV infection, severe GVHD and survival for matched genotypes in donor-recipient pairs are compared with those for unmatched pairs.
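As a rough illustration of the statistics behind Tables 2 and 3, the following Python sketch collapses genotype counts under the dominant, recessive and additive models, applies the expected-count rule for switching from the Chi-square test to Fisher's exact test, and computes an odds ratio with the unmatched group as the reference. The function names and all counts are hypothetical, and the Wald-type 95% confidence interval is an assumption, since the text does not state how the intervals were derived.

```python
import math


def collapse(counts, model):
    """Collapse (n_AA, n_Aa, n_aa) genotype counts under a genetic model.

    'A' is the higher-frequency allele and 'a' the lower-frequency allele.
    """
    n_AA, n_Aa, n_aa = counts
    if model == "dominant":    # AA vs. Aa + aa
        return [n_AA, n_Aa + n_aa]
    if model == "recessive":   # AA + Aa vs. aa
        return [n_AA + n_Aa, n_aa]
    if model == "additive":    # AA vs. Aa vs. aa
        return [n_AA, n_Aa, n_aa]
    raise ValueError(f"unknown model: {model}")


def needs_fisher(table):
    """Return True when more than 20% of cells have an expected count below 5."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    expected = [r * c / total for r in row_totals for c in col_totals]
    small = sum(1 for e in expected if e < 5)
    return small / len(expected) > 0.20


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio for matched vs. unmatched pairs with a Wald-type interval.

    a, b: matched pairs with / without the outcome
    c, d: unmatched pairs with / without the outcome (reference group)
    The Wald interval is an assumption made for this sketch.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical genotype counts (AA, Aa, aa) for patients with and without an outcome.
with_outcome = (10, 4, 1)
without_outcome = (12, 3, 0)
table = [collapse(with_outcome, "recessive"), collapse(without_outcome, "recessive")]
print("use Fisher's exact test:", needs_fisher(table))

# Hypothetical matched/unmatched counts for a donor-recipient pairs comparison.
print("OR and 95% CI:", odds_ratio_wald(5, 20, 15, 20))
```

An odds ratio below 1 in this layout means that the outcome is less frequent among matched pairs than among unmatched pairs, which is how the matched-versus-unmatched results above are read.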
Regardless of the presence or absence of CMV infection, the survival rate was similar in the early stage in both the ALL and AML groups, and deaths among patients with CMV infection gradually increased from approximately 10 months after transplantation (Fig. 1). However, CMV infection was significantly correlated with overall survival only in the AML group (p = 0.049; Fig. 1B). In the survival curves of CMV-related SNPs in patients with ALL, rs209132 of TRIM27 in the donor group showed a significant difference, with patients carrying the minor allele (A allele) having better overall survival (p = 0.040; Fig. 2A). Furthermore, patients carrying the G allele of rs2227956 of HSPA1L had a better prognosis (p = 0.009; Fig. 2B).
In the ALL group, the highest overall survival was seen in patients with chronic GVHD, and the highest mortality rate in patients with grade 3-4 GVHD (Fig. 3A). In the AML group, patients without GVHD had the highest survival rate, followed by those with chronic GVHD. The highest mortality rate was again seen in patients with grade 3-4 GVHD, whereas patients with chronic GVHD survived throughout the first four years (48 months) (Fig. 4A). Moreover, the grade of GVHD in AML patients was related to overall survival (p = 0.011; Fig. 4A). Taking Table 2 and Fig. 3B together, the rs213210 polymorphism of RING1 was related to both the grade of GVHD and overall survival (p = 0.012). Moreover, patients had better overall survival when the genotype of rs3130048 was matched in donor-recipient pairs (Fig. 4B).
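The survival curves in Figs. 1 to 4 are Kaplan-Meier estimates. A minimal sketch of the product-limit calculation is given below; it assumes the usual convention in which death is the event and loss of contact is a censored observation, and the follow-up times are invented for illustration rather than taken from the study.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times:  follow-up time of each patient (e.g. months since HSCT)
    events: 1 if the patient died at that time, 0 if censored (assumed coding)
    Returns a list of (time, survival probability) at each death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Illustrative follow-up data only (months, event indicator).
months = [2, 4, 4, 6, 8]
died = [1, 1, 0, 0, 1]
for t, s in kaplan_meier(months, died):
    print(f"month {t}: estimated survival {s:.2f}")
```

Comparing two such curves, for example carriers and non-carriers of a minor allele, would additionally require a log-rank test, which is not shown here.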
In the linkage disequilibrium plot of donor SNPs (Fig. S5), 13 SNPs of the HCP5 and NOTCH4 genes were included. One block contained rs2523676 and rs4713466, which were in high linkage disequilibrium (LD) (D′ = 0.96) in the HCP5 gene region, and another block contained rs2256594, rs394657 and rs429853 of the NOTCH4 gene with [...]. In the plot shown in Fig. S7, 17 SNPs and three blocks were included. Block 1 contained rs209131 and rs209130 of the TRIM27 gene. Moreover, two pairs of significant SNPs in the TRIM27 gene were in high LD but were not assigned to a block: rs1536215 was in high LD with both rs139791445 and rs209130 (D′ = 1). Block 2 contained rs2844464 and rs2242656 of the BAG6 gene; rs3130048, a statistically significant SNP, was in high LD with rs2844464 and rs2242656 (D′ = 0.90) but was not assigned to a block with either SNP. In the RING1 gene, rs107822 was in high LD with rs213210 (D′ = 0.96); both were significant in the SNP analysis, and together they formed block 3. The complete genetic analysis data for all SNPs other than the outcome-related SNPs are shown in Tables S1 to S3.

Discussion
Our data indicated that 16 candidate SNPs from the nine sourced genes were significantly correlated with the adverse reactions of HSCT. Among these 16 SNPs, the HCP5, BAG6 and HSPA1L genes each had one SNP, the RING1 and HLA-DOA genes each had two SNPs, the HLA-DOB gene had four SNPs and the TRIM27 gene had five SNPs. Additionally, the presence of CMV infection and severe GVHD affected overall survival in AML patients after HSCT. Of the nine sourced genes, seven had HSCT outcome-related SNPs in our study; the exceptions were LTA and NOTCH4. The functions of these genes are associated with immunity. HCP5 (HLA complex P5) is a long non-coding RNA (lncRNA) located in the HLA class I region; it is involved in innate and adaptive immune responses and is associated with the occurrence of certain autoimmune diseases and cancers 12. The SNP rs2518028 was associated with the survival rate in AML patients. Moreover, we previously found that the genotype of rs2518028 was related to the risk of relapse in our study of unrelated cord blood transplantation (CBT) 10. HLA-DO is a heterodimer of an alpha and a beta chain, and it is a highly conserved molecule 13. Compared with other HLA class II molecules, HLA-DO rarely shows genetic sequence variation, especially at the protein level 14. Expression of the HLA-DO molecule could ensure the normal development of CD4 memory T cells 15. Additionally, growing evidence shows that CD4+ T cells play an important role in regulating CMV infection, reactivation and vertical transmission 16. The SNPs rs79327197 of HLA-DOA, rs1721369 of HLA-DOB and rs2070120 of HLA-DOB were also found to be related to relapse in our previous HSCT study 11. HSPA1L, located in the HLA class III region, encodes a 70 kDa heat shock protein (HSP70) 17. HSP70 inhibits cell apoptosis and participates in immune regulation by regulating protein degradation and antigen presentation 18. Moreover, BAG6 regulates the activity of T cells 19 and NK cells 20, and it decreases the expression of MHC class II on antigen-presenting cells 21. RING1, located in the HLA class II region, can act as a transcriptional repressor and influence gene expression 22. Notably, rs107822 and rs213210 in the RING1 gene region were in high LD and were considered a haplotype block. This means that the statistical significance of one of these SNPs might be due to its LD with the other, so further verification is needed to determine which of them is the risk SNP. TRIM27 is a member of the TRIM protein family, whose members are widely involved in biological processes, including cell proliferation, differentiation, development, morphological changes and apoptosis; the expression of TRIM proteins is regulated by interferons (IFNs) 23. These sourced genes have been related to cancer and to adverse reactions after transplantation 10,11,24-27. Several of the SNPs were associated with relapse in our previous studies 10,11 and were related to the grade of GVHD, the survival rate and the risk of CMV infection in this study. This suggests that these SNPs might play an important role in bone marrow transplantation.
The effect of a SNP on gene function depends on its location. Genetic variation in the promoter and 3′UTR regions may influence the gene expression level 28,29, while variation in intron regions may affect alternative splicing of mRNA and lead to proteins with impaired function 30,31. Moreover, variation in exon regions may alter codons and cause changes in protein structure or premature stop codons. For example, the nucleotide change from G to A at rs2227956 results in an amino acid replacement from Met to Thr, which changes the property of HSP70 from hydrophobic to hydrophilic and alters its biological function 32. Furthermore, this change could influence the substrate specificity and chaperone activity of HSP70 33. However, a SNP that is statistically related to disease may not itself be pathogenic; the association may instead be due to strong LD with a pathogenic gene. Therefore, the biological function of each SNP should be explored by functional analysis in the future to confirm the mechanism by which it contributes to the adverse reactions of HSCT.
The presence or absence of CMV infection and the grade of GVHD were related to overall survival in AML patients but not in ALL patients. Cytomegalovirus might mainly target a population of myeloid lineage progenitor cells derived from bone marrow 34. This may explain why the presence or absence of CMV infection was not significant in the ALL group, even though CMV infection is an important cause of non-relapse mortality after allogeneic HSCT 35. We found that patients with grade 3-4 GVHD had the highest mortality rate in both the ALL and AML groups, but the difference was significant only in the AML group. However, other studies have found that GVHD affects the overall survival of both AML and ALL patients after allo-HSCT 36,37. The lack of statistical significance in the ALL group might therefore be due to our limited sample size. In the survival curves of CMV-related and GVHD-related SNPs, most of these SNPs were associated only with HSCT outcomes and not with overall survival. Thus, although these SNPs were associated with the risk of CMV infection or the grade of GVHD, they were not major factors affecting patients' survival time. Only four HSCT outcome-related SNPs (rs209132, rs2227956, rs3130048 and rs213210) were correlated with overall survival. This suggests that these four SNPs might play a vital role in GVHD or CMV infection, which in turn affects patient survival.
Although CBT is one type of HSCT, its requirement for HLA compatibility is less rigorous than that of bone marrow or peripheral blood HSCT owing to the immunological immaturity of cord blood; therefore, CBT was not included in this study. We will further explore the correlation between the adverse reactions of CBT and SNPs of HLA-related genes. In conclusion, the correlation between SNPs of non-HLA genes and the effectiveness of HSCT has been revisited repeatedly. In this study, we found that several SNPs were related to the adverse reactions of HSCT, and some of them were associated with overall survival. However, the biological functions of these SNPs need further investigation to confirm whether they directly affect gene expression or protein function and thereby lead to a poor prognosis.

Materials and methods
Patients and laboratory tests. The Institutional Review Board of Chang Gung Memorial Hospital reviewed and approved the study (approval ID 201304949B0). All study subjects provided written informed consent, and the study was performed according to the relevant ethical requirements and regulations. A total of 176 patients who received HSCT at Chang Gung Memorial Hospital participated in this study, comprising 104 patients with AML and 72 patients with ALL. Before transplantation, the HLA-A, -B, -C, -DRB1 and -DQB1 alleles of donors and recipients were typed with the LABType SSO Typing Test (Thermo Fisher, Waltham, MA), a sequence-specific oligonucleotide probe-based method. High-resolution HLA typing was then performed with the SeCore kit (Thermo Fisher, Waltham, MA) to acquire more detailed allele information. To resolve allele ambiguity in the SeCore typing, the Micro SSP Allele Specific Typing Tray (Thermo Fisher, Waltham, MA), a sequence-specific primer-based method, was used. The characteristics of the patients are shown in Table 1.
Selection of SNPs. The nine sourced SNPs originated from the study by Petersdorf et al. 9, which showed that these SNPs were associated with the risk of relapse, GVHD, CMV infection and the survival rate. A total of 41 candidate SNPs were selected from the regions 500 bp upstream and downstream of the nine sourced SNPs. The selected SNPs were divided into three groups according to Petersdorf's study. That is, if Petersdorf's study showed that a SNP in donors was related to the effectiveness of allogeneic HSCT, we focused on the relationship between the adverse reactions and that SNP in donors and assigned it to the 'donor group'. The classification criteria for the other two groups were analogous (Table 4).
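The 500 bp upstream and downstream selection rule can be sketched as a simple position filter. The variant identifiers and coordinates below are placeholders invented for illustration, not real genomic positions, and the function name is hypothetical.

```python
def candidates_in_window(sourced_pos, variants, flank=500):
    """Return IDs of variants lying within `flank` bp of a sourced SNP.

    sourced_pos: position of the sourced SNP on its chromosome
    variants:    mapping of variant ID -> position on the same chromosome
    The sourced SNP itself is kept if it appears in `variants`.
    """
    return sorted(
        vid for vid, pos in variants.items() if abs(pos - sourced_pos) <= flank
    )


# Placeholder variants around a hypothetical sourced SNP at position 10,000.
variants = {
    "var_a": 10_100,
    "var_b": 10_500,   # exactly 500 bp away: inside the window
    "var_c": 10_501,   # 501 bp away: outside the window
    "var_d": 9_620,
    "var_e": 9_400,
}
print(candidates_in_window(10_000, variants))
```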
PCR and sequencing. Genomic DNA was extracted from peripheral blood samples with the QIAamp DNA Blood Mini Kit (Qiagen, Valencia, CA). The purpose was to look for genetic variation within 500 base pairs upstream and downstream of the nine sourced SNPs. Each 50 μl PCR mixture contained 1× reaction buffer, 10 nmol of dNTP, 6 pmol of forward and reverse primers, 300 ng of genomic DNA and 1 μl of Pfu Turbo Hotstart DNA Polymerase (Agilent, Santa Clara, CA). The PCR programme was 1 cycle of 94 °C for 4 min; 30 cycles of 94 °C for 30 s, 58 °C for 30 s and 72 °C for 45 s; and a final cycle of 72 °C for 10 min. After PCR, 5 μl of each PCR product was loaded onto a 2% agarose gel and visualised under UV illumination. The remaining PCR product was sequenced directly with the Big Dye Terminator Cycle Sequencing kit (Thermo Fisher, Waltham, MA) on an ABI PRISM Genetic Analyzer (Thermo Fisher, Waltham, MA) according to the manufacturer's instructions.
The candidate SNPs in Table 4 were divided into the donor, recipient and donor-recipient pairs groups and were analysed against the survival rate, GVHD and CMV infection in both the AML and ALL groups. SPSS (SPSS Inc. Released 2008. SPSS Statistics for Windows, Version 17.0. Chicago, USA) was used to analyse genotype frequencies and survival curves. Genotype frequencies were analysed with Chi-square and Fisher's exact tests under three genetic models (dominant, recessive and additive), in which the allele with the higher frequency in the ALL or AML population was defined as the major allele and the allele with the lower frequency as the minor allele. Survival curves were analysed by Kaplan-Meier analysis and the log-rank test. Haploview 4.2 (https://www.broadinstitute.org/haploview/haploview) was used to analyse LD, a measure of the non-random association of alleles at two or more loci in the general population, for the SNPs in the donor, recipient and donor-recipient pairs groups. D′ is the normalised measure of LD obtained by contrasting the observed and expected frequencies of a haplotype formed by alleles at different loci. A block was defined as a region with little evidence of historical recombination, and blocks were generated by Haploview's default algorithm, taken from Gabriel et al., Science, 2002.
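To make the definition of D′ concrete, the sketch below computes it for two biallelic loci from haplotype counts. It assumes that phased haplotype counts are already available, whereas LD software typically has to estimate haplotype frequencies from unphased genotypes first, so this is not a reimplementation of the Haploview procedure. The counts and the function name are illustrative.

```python
def d_prime(n_AB, n_Ab, n_aB, n_ab):
    """Absolute D' for two biallelic loci from the four haplotype counts.

    D is the observed frequency of haplotype AB minus the frequency expected
    under independence (p_A * p_B); D' rescales |D| by its maximum possible
    value given the allele frequencies.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    d = p_AB - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return abs(d) / d_max


# Illustrative haplotype counts only.
print(round(d_prime(40, 10, 10, 40), 2))   # partial association
print(round(d_prime(48, 2, 0, 50), 2))     # one haplotype never observed
```

When one of the four haplotypes is never observed, D′ equals 1 even if the two loci are not perfectly correlated, which is one reason a high D′ between a significant SNP and its neighbour does not identify which of them is the risk variant.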