Single nucleotide polymorphisms within HLA region are associated with the outcomes of unrelated cord blood transplantation

Cord blood transplantation (CBT) provides a treatment option for hematologic diseases and leukemia in both children and adults. However, adverse reactions and transplantation-related death may still occur in patients receiving CBT even when the donor and recipient are fully matched in high-resolution HLA typing. Single nucleotide polymorphisms (SNPs) of HLA-related and unrelated genes are known to be associated with the disease status of patients undergoing unrelated stem cell transplantation. In this study, the genomic regions ranging from 500 base pairs upstream to 500 base pairs downstream of the eight SNPs that were reported as transplantation determinants by Petersdorf et al. were analyzed to evaluate whether genetic variants were associated with the survival status of patients and the risk for severe (grades 3–4) graft-versus-host disease (GVHD) or cytomegalovirus (CMV) infection/reactivation. The analyses were performed in three modes: recipient genotype, donor genotype, and recipient-donor mismatching. By analysis of sixty-five patients and their HLA-matched unrelated donors, we found that five SNPs were associated with patient survival: in the recipient genotype, rs107822 in the RING1 gene and rs2070120, rs17220087, and rs17213693 in the HLA-DOB gene; and in recipient-donor mismatching, rs9282369 in the HLA-DOA gene and rs2070120, rs17220087, and rs17213693 in the HLA-DOB gene. Five SNPs were associated with the risk for severe GVHD: in the donor genotype, rs213210 and rs2523675; in the recipient genotype, rs9281491 in the HCP5 gene; and in recipient-donor mismatching, rs209130 in the TRIM27 gene and rs986522 in the COL11A2 gene. Six SNPs were related to the risk for CMV infection/reactivation: in the donor genotype, rs435766, rs380924, and rs2523957; and in recipient-donor mismatching, rs2070120, rs17220087, and rs17213693 in the HLA-DOB gene and rs435766 and rs380924 in the MICD gene. This study provides the basis for larger analyses and, if the results are confirmed, a way of selecting better unrelated CBT candidate donors.


Results
Patient characteristics and study design. Patients (n = 65) with hematological disorders (mostly transfusion-dependent thalassemia) or other tumor diseases who received unrelated CBT from HLA-matched donors were recruited to this study (Table 1). To analyze whether any SNPs within the HLA region are associated with the occurrence of adverse reactions and the survival of patients, the genomic regions 500 bp upstream and downstream of the 8 sourced SNPs (Table 2) were amplified from donor and recipient DNA by PCR using the forward and reverse primers (Table 3). PCR amplicons were sequenced and analyzed to investigate whether there were candidate SNPs related to the survival of patients and the occurrence of adverse reactions. Three analysis modes were used to correlate specific SNPs with the clinical outcomes post-transplantation: donor genotype analysis, recipient genotype analysis, and mismatch between donor-recipient pair (defined by having a specific combination of SNP alleles between the donor and recipient).

Association of SNPs with adverse reactions and survival post-CBT. Five SNPs were associated with the occurrence of adverse reactions post-CBT in the donor genotype analysis (Table 4). Three SNPs located in the MICD gene were related to the risk for CMV infection/reactivation. When the donor had the AA genotype at rs435766, the recipient had a higher risk for CMV infection/reactivation (p = 0.031, OR = 4.667, 95% CI = 1.251-17.409). Moreover, rs380924 and rs2523957 with the GG genotype were also associated with a higher risk for CMV infection/reactivation (p = 0.031, OR = 0.214, 95% CI = 0.057-0.799). On the other hand, two SNPs were related to the occurrence of severe GVHD (grades 3-4): rs213210, located upstream of the RING1 gene (p = 0.028), and rs2523675, located 2.4 kb telomeric of the HCP5 gene (p = 0.016). No SNP was statistically correlated with the survival of recipients in the donor genotype analysis.
In the recipient genotype analysis, six SNPs were related to the survival of recipients or the occurrence of severe GVHD (grades 3-4) (Table 5). The (CC + CT) genotypes at rs107822, located 2.0 kb upstream of the RING1 gene, were associated with higher survival of recipients (p = 0.017, OR = 4.909, 95% CI = 1.229-19.606). Three SNPs (rs2070120, rs17220087, and rs17213693) located in the intron or 3'-UTR of HLA-DOB were associated with the survival of recipients under the dominant model of analysis (p = 0.027, OR = 0.178, 95% CI = 0.040-0.783). On the other hand, two SNPs located 2.2 kb or 2.3 kb telomeric of the HCP5 gene were related to the occurrence of severe GVHD (grades 3-4). Patients with the AA genotype at rs9281491 had a higher risk for severe GVHD (p = 0.013, OR = 10.889, 95% CI = 1.729-68.576), while patients carrying the T allele at rs4713466 had a lower risk for severe GVHD than those with the CC genotype (p = 0.013, OR = 0.092, 95% CI = 0.015-0.578). No SNP was statistically correlated with CMV infection/reactivation in the recipient genotype analysis.
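The odds ratios and 95% confidence intervals reported in these analyses can be illustrated with a minimal sketch. It assumes a 2x2 table of genotype group by outcome and the Woolf (log-based) interval; the function name `odds_ratio_ci` and the counts are hypothetical, and the interval method used by the authors' software is not stated in the text.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: genotype group 1 with the event,  b: group 1 without the event,
    c: genotype group 2 with the event,  d: group 2 without the event.
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only (not taken from the study).
odds_ratio, lower, upper = odds_ratio_ci(10, 5, 8, 20)
print(f"OR = {odds_ratio:.3f}, 95% CI = {lower:.3f}-{upper:.3f}")

# Swapping the reference group inverts the odds ratio and the interval.
inv_or, inv_lower, inv_upper = odds_ratio_ci(8, 20, 10, 5)
```

Swapping the two genotype groups gives the reciprocal odds ratio with reciprocal interval bounds. The pair reported for the MICD SNPs in the donor analysis, OR = 4.667 (1.251-17.409) and OR = 0.214 (0.057-0.799), are numerically reciprocals of each other, which is what one would see if the same table were read with opposite reference genotypes.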
For the mismatch between donor-recipient pair genotype analysis, nine SNPs were related to the survival of patients or the occurrence of adverse reactions (Table 6). Among the nine SNPs, four SNPs were related to the survival of patients, six SNPs were associated with CMV infection/reactivation, and two SNPs were correlated with the occurrence of severe GVHD. Three SNPs (rs2070120, rs17220087, and rs17213693) were related to more than one outcome post-CBT. When the genotype of rs9282369, located 2.3 kb upstream of the HLA-DOA gene, was matched between donor and recipient, the recipients had a less favorable survival (p = 0.005, OR = 0.181, 95% CI = 0.052-0.628). When the genotypes of rs2070120, rs17220087, and rs17213693, located in the intron or 3'-UTR of the HLA-DOB gene, were not matched between the donors and recipients, the recipients had a less favorable survival (p = 0.014, OR = 7.667, 95% CI = 1.569-37.458). The same three SNPs in the HLA-DOB gene (rs2070120, rs17220087, and rs17213693) were also associated with the risk for CMV infection/reactivation. When the genotype of these SNPs was matched between donor and recipient, the recipients had a lower risk for CMV infection/reactivation (p = 0.044, OR = 0.200, 95% CI = 0.042-0.946). However, when the genotypes were matched between donor and recipient for the three SNPs located in the MICD gene, the recipients had a higher risk for CMV infection/reactivation (rs435766, p = 0.009; rs380924, p = 0.014; and rs1264813, p = 0.042). When the genotype of rs209130 located 3.0 kb downstream of TRIM27 gene was matched between donor and
Linkage disequilibrium analysis. Pair-wise linkage disequilibrium (LD) analysis of the forty-one SNPs was performed to determine whether there was non-random association of alleles at two or more loci in a general population. D' is the normalized measure of LD, obtained by comparing the observed and expected frequencies of a haplotype composed of alleles at different loci. Our data revealed that the seven SNPs in the MICD gene were all in high LD (pair-wise D' ranged from 0.88 to 1) and formed a haplotype block, implying that these SNPs are genetically linked (Figure S1). The other three haplotype blocks each comprised two SNPs, located in the RING1, BAG6 and HCP5 genes, respectively.
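The LD measures discussed here can be written out in a short sketch. It assumes the haplotype frequency and the two allele frequencies are already known; HaploView estimates haplotype frequencies from unphased genotypes, a step this sketch does not reproduce. The function name `ld_measures` and the frequencies are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    # D compares the observed haplotype frequency with its expectation
    # under independence of the two loci.
    d = p_ab - p_a * p_b
    # D' normalizes |D| by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2

# Hypothetical frequencies: allele A (30%) always travels with allele B (50%).
d_prime, r2 = ld_measures(p_ab=0.3, p_a=0.3, p_b=0.5)
print(f"D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```

In this example D' reaches its maximum while r² stays well below 1, which illustrates why a block defined by high D' does not by itself mean that the SNPs are interchangeable as markers.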
Temporal effects of outcome-related SNPs in CBT. In addition to the endpoint analysis, the association of outcome-related SNPs with the occurrence of adverse reactions and the survival of recipients was analyzed and confirmed with the time to event taken into account. Event-free duration was defined as the time from CBT to the occurrence of an adverse reaction or death. By Kaplan-Meier analysis, the association of CMV-related SNPs with CMV infection/reactivation (Fig. 1) was confirmed (in donor genotype analysis: rs435766, p = 0.004; rs380924, p = 0.004; and rs2523957, p = 0.004; in mismatch between donor-recipient pair genotype analysis:
Table 3. The primer sequences for amplifying the genomic regions flanking the sourced SNPs. F, forward primer; R, reverse primer.
Only rs1264813 of the MICD gene failed to demonstrate an association with CMV reactivation (p > 0.05). All GVHD-related SNPs except rs4713466 (Fig. 2) were associated with the occurrence of severe GVHD when the time-dependent effect was included in the analysis (in donor genotype analysis: rs213210, p = 0.034; and rs2523675, p = 0.006; in recipient genotype analysis: rs9281491, p = 0.005; in mismatch between donor-recipient pair genotype analysis: rs209130, p = 0.030; and rs986522, p = 0.016). All survival-related SNPs (Fig. 3) were related to the overall survival of patients when the time-dependent effect was taken into account (in recipient genotype analysis: rs107822, p = 0.013; rs2070120, p = 0.020; rs17220087, p = 0.020; and rs17213693, p = 0.020; in mismatch between donor-recipient pair genotype analysis: rs9282369, p = 0.004; rs2070120, p = 0.008; rs17220087, p = 0.008; and rs17213693, p = 0.08). These data confirm the association of specific SNP genotypes with the occurrence of adverse reactions and the survival of patients.
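The event-free curves behind a Kaplan-Meier analysis can be sketched as a product-limit estimate. This is an illustration only: the study used SPSS, the function name `kaplan_meier` is hypothetical, and the durations below are invented. A log-rank comparison between genotype groups is not included.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the event-free probability.

    times:  follow-up durations (time from transplantation to event or censoring)
    events: True if the event was observed, False if the patient was censored
    Returns a list of (time, event_free_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # events observed exactly at time t
        n_leaving = 0  # patients leaving the risk set at time t (events + censored)
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve

# Hypothetical durations in days; False marks patients without an event at last follow-up.
curve = kaplan_meier([5, 8, 8, 12, 20], [True, True, False, True, False])
for t, s in curve:
    print(f"day {t}: event-free probability {s:.2f}")
```

Censored patients contribute to the risk set up to their last follow-up but never lower the curve, which is the main difference from the endpoint comparison of proportions.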

Discussion
CMV infection/reactivation, severe GVHD, and relapse commonly occur after transplantation. In our previous study, we revealed the association of four HLA-related SNPs in the donor group (rs2523675, rs2518028, rs2071479, and rs2523958) and three HLA-related SNPs in the recipient group (rs9276982, rs435766, and rs380924) with disease relapse in CBT cases 9 . In the present study, we expanded the analysis and found that five SNPs in the HLA region were associated with the survival of patients (rs107822, rs2070120, rs17220087, rs17213693, and rs9282369), six SNPs were associated with CMV infection/reactivation (rs435766, rs380924, rs2523957, rs2070120, rs17220087, and rs17213693), and five SNPs were associated with the development of severe GVHD (rs213210, rs2523675, rs9281491, rs209130, and rs986522). Moreover, rs2070120, rs17220087, and rs17213693, located in the HLA-DOB gene, correlated with more than one outcome post-CBT. These outcome-related SNPs were confirmed by both end-point and Kaplan-Meier analyses. The outcome-related SNPs revealed in this study represent novel molecular markers related to the occurrence of adverse reactions and the survival of patients receiving unrelated CBT. The outcome-related SNPs are mainly located in or adjacent to the MICD, RING1, HCP5, HLA-DOA, HLA-DOB, TRIM27, and COL11A2 genes, implying that these SNPs may be related to the function or expression of these genes. MICD has been considered a pseudogene located within the MHC class I region. However, recent studies indicate that the start sequence of the HLA complex group 9 (HCG9) corresponds with part of the MICD sequence 10 . How the SNPs in the MICD gene relate to the effectiveness of transplantation and to disease mechanisms is not clear. Rather than being merely a pseudogene, MICD is likely to play a role in basic physiology and disease progression 11,12 . 
This is consistent with the notion that pseudogenes may encode proteins under certain circumstances 13 . The possibility that MICD has an unknown functional impact on the outcome of CBT cannot be ruled out. In this regard, rs380924, the SNP adjacent to the MICD gene that we showed to be associated with CMV infection/reactivation, has been found to relate to the risk for psoriasis 14 . The findings of this and previous studies suggest that rs435766, rs380924, and rs2523957 in the MICD gene are crucial to the susceptibility to various diseases, including CMV infection/reactivation.
www.nature.com/scientificreports/
Our data revealed that three SNPs (rs2070120, rs17220087, and rs17213693) located in the intron or 3'-UTR of HLA-DOB are associated with the survival of patients in the recipient and donor-recipient matching groups, as well as with CMV infection/reactivation in the donor-recipient matching group. The rs9282369 in the upstream promoter region of HLA-DOA, a paralogue of HLA-DOB, is also associated with the survival of patients in the donor-recipient matching group. These data imply that both HLA-DOA and -DOB are important in modulating the effectiveness of transplantation. HLA-DO belongs to the HLA class II homologous genes. HLA-DO is a heterodimer composed of an alpha chain and a beta chain, encoded by HLA-DOA and HLA-DOB, respectively. Compared with the typical HLA class II molecules, HLA-DO has a limited number of polymorphisms 15,16 . HLA-DO interferes with the presentation of MHC-bound epitopes by modulating the function of HLA-DM 17 . The three SNPs in the region of the HLA-DOB gene may cause excessive HLA-DOB expression, leading to less effective antigen presentation and a lower immune response. On the other hand, DNA polymorphism in the promoter region is known to regulate gene expression 18 . It is worth investigating whether rs9282369, located in the HLA-DOA promoter region, regulates HLA-DOA expression and thereby changes patient survival post-CBT.
For the GVHD-related SNPs, the polymorphism of rs213210 was related to GVHD in the donor genotype analysis. The rs213210 is considered not only a SNP in the promoter region of the RING1 gene, but also a SNP within the pre-miR-219 gene. Wu et al. indicated that this gene was related to gastric cancer and showed that the nucleotide change from T to C in rs213210 increases the expression of miR-219-1 19 . Moreover, specific miRNAs could serve as biomarkers for acute GVHD 20 . Based on our results, the relationship between miR-219 and acute GVHD can be further explored in the future. There are 2 SNPs (rs2523675 and  21 . IFN-γ is an important cytokine in the proliferation and differentiation of T cells 22 . It is likely that patients with different SNP genotypes respond differently to IFN-γ, with differential effects on preventing acute GVHD. The rs209130 located downstream of TRIM27 and the rs986522 in the intron of the COL11A2 gene were associated with GVHD in the donor-recipient matching group. The association of rs209130 with severe GVHD may be caused by the abnormal regulation of CD4 + T cells 23 , leading to the patient's own tissue cells being attacked by over-reconstructed CD4 + T cells 24 . On the other hand, the strong linkage disequilibrium 25 between the SNPs in COL11A2 and HLA-DP, together with the risk of GVHD related to HLA-DPB1 mismatching in unrelated HSCT 26 , may provide an explanation for the association of rs986522 with GVHD.
We noted that the number of cases in this study is limited, although the specimens were collected over a period of more than ten years, with an average of five to ten cases of CBT per year available in our hospital. This represents a limitation of this study. Enrolling more donor-recipient pairs in this study or analyzing another large cohort may further validate and confirm the association of these SNPs with the outcomes of unrelated CBT. 
We also noted that 23% of patients had fully matched CB grafts, and the remainder had one or two mismatches. Cord blood units with up to two allele mismatches between the donor and recipient can be used without an increased risk of allograft rejection in most CBT cases 27 . With the limited number of donor-recipient pairs, further analysis is not feasible, and it remains unclear whether the number of HLA mismatches has any impact on the association of SNPs with clinical outcomes. A cohort comprising a large number of patients with fully matched HLA and with one or two HLA mismatches to the donors is required to address this issue. Moreover, multiple factors likely contribute to the occurrence of adverse reactions and the survival of patients. Multivariate analyses including more variables are worth performing in future studies.
In conclusion, this study provides a foundation for creating a screening panel of SNPs for identifying a suitable donor before CBT. The success rate of CBT may be improved by selecting appropriate SNPs in the donor genotype, recipient genotype, or mismatch between donor-recipient pair to avoid the occurrence of adverse reactions and improve the survival of patients. Because the genes that carry these SNPs are related to immunological functions or to susceptibility to immunological disorders, clarifying the effects of these SNPs on the biological functions of these genes and the underlying mechanisms should further explain the association of these SNPs with the outcomes of unrelated CBT.

Materials and methods
Patients and HLA typing. This study was approved by the Institutional Review Board of Chang Gung Memorial Hospital with the approval ID of 102-4949B. All methods were performed in accordance with the relevant guidelines and regulations. A total of sixty-five donor-recipient pairs undergoing unrelated CBT were recruited at Chang Gung Memorial Hospital. The clinical characteristics and indicated diseases are shown in Table 1. All participants provided written informed consent to participate in this study.
HLA typing of HLA-A, -C, -B, -DRB1, and -DQB1 alleles for donors and recipients was performed prior to the transplantation. First, the LABType SSO Typing Test (Thermo Fisher, Waltham, MA) was employed along with sequence-specific oligonucleotide probes. Second, the SeCore kit (Thermo Fisher, Waltham, MA) was used for high-resolution HLA typing to obtain more detailed allele information. Finally, the MicroSSP Allele Specific Typing Tray (Thermo Fisher, Waltham, MA) was used with sequence-specific primers to resolve ambiguous alleles from the SeCore typing.
Definition. The diagnosis of CMV infection/reactivation was based on the detection of CMV antigen or DNA in the peripheral blood of the patient after transplantation. CMV antigen in white blood cells was determined by the CMV Antigenemia Assay (MONOFLUO™, Bio-Rad). This method can detect CMV viremia earlier and with higher sensitivity than traditional virus culture or shell vial assay. The test was considered positive when more than 2 polymorphonuclear leukocytes (PMN) were positive for CMV antigen in a total of 50,000 PMN. The CMV DNA Quantitative Amplification test is a real-time quantitative PCR assay (COBAS® AmpliPrep/COBAS® TaqMan® CMV Test, Roche). The nucleic acid test was considered positive when the Ct was < 37. These two assays can assist clinicians in monitoring the status of CMV infection or reactivation before and after transplantation.
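The two positivity cut-offs stated above can be combined into a single rule, sketched here for illustration. The function name `cmv_event` and the convention that `None` means a test was not performed are assumptions; only the cut-offs (more than 2 positive PMN per 50,000, Ct < 37) come from the text.

```python
from typing import Optional

ANTIGENEMIA_CUTOFF = 2   # positive when MORE than 2 PMN per 50,000 are antigen-positive
CT_CUTOFF = 37.0         # positive when the real-time PCR Ct is BELOW 37

def cmv_event(positive_pmn_per_50k: Optional[int] = None,
              ct: Optional[float] = None) -> bool:
    """Return True if either assay meets its positivity criterion.

    None is treated as "test not performed" (an assumption of this sketch).
    """
    antigen_positive = (positive_pmn_per_50k is not None
                        and positive_pmn_per_50k > ANTIGENEMIA_CUTOFF)
    dna_positive = ct is not None and ct < CT_CUTOFF
    return antigen_positive or dna_positive

print(cmv_event(positive_pmn_per_50k=3))   # antigenemia above the cut-off
print(cmv_event(ct=37.0))                  # Ct exactly at the cut-off is not positive
```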
GVHD was defined in accordance with the National Institutes of Health (NIH) Consensus. Acute GVHD (aGVHD) was defined as the syndrome occurring within 100 days after transplantation. Otherwise, it was classified as chronic GVHD. According to the clinical characteristics of the affected organs, aGVHD was divided into 4 grades. Grade 1: only mild symptoms on the skin; grade 2: slightly serious symptoms on the skin and mild symptoms in the liver and gastrointestinal tract; grade 3: symptoms on more than half of the skin and severe symptoms in the liver and gastrointestinal tract; grade 4: organs are not able to function properly 28 . In our analysis, GVHD was dichotomized as grades 3-4 vs. else, because grades 3-4 were considered severe GVHD. Patients who were still alive at the end of this study were classified as survivors. Death was defined as death from any cause after transplantation. The event-free duration was defined as the duration from transplantation to the occurrence of the event (CMV infection/reactivation, GVHD grades 3-4, or death).
Selection of SNPs. The 8 SNPs (rs2244546, rs986522, rs2244546, rs2523957, rs429916, rs2071479, rs107822, and rs209130) within the HLA region have been shown to be associated with the risk of mortality, disease-free survival, transplant-related mortality, relapse and GVHD in patients with HSCT 8 . These SNPs were selected as the sourced SNPs in this study.
The genomic regions ranging from 500 bp upstream to 500 bp downstream of these 8 sourced SNPs were amplified and sequenced to search for candidate SNPs that were associated with the adverse reactions and survival of patients in unrelated CBT. A total of 41 SNPs were present in these regions (Table 2). The association between these SNPs and the risk for adverse reactions or the survival of patients was analyzed by donor SNP (mode of donor genotype analysis), recipient SNP (mode of recipient genotype analysis) or mismatch of donor-recipient pair SNP (mode of mismatch between donor-recipient pair, defined by having a specific combination of different SNP alleles between the donor and recipient) as described previously 14 .
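The three analysis modes can be sketched as a per-pair coding step. This is a simplified illustration: it treats a pair as mismatched whenever the unordered donor and recipient genotypes differ, whereas the study defines mismatch by a specific combination of alleles that is not spelled out here. The function names `normalize` and `pair_features` are hypothetical.

```python
def normalize(genotype: str) -> str:
    """Return an order-independent genotype code, e.g. 'TC' and 'ct' both become 'CT'."""
    return "".join(sorted(genotype.upper()))

def pair_features(donor_genotype: str, recipient_genotype: str) -> dict:
    """Code one SNP for one donor-recipient pair under the three analysis modes."""
    donor = normalize(donor_genotype)
    recipient = normalize(recipient_genotype)
    return {
        "donor": donor,                   # donor genotype analysis
        "recipient": recipient,           # recipient genotype analysis
        "mismatch": donor != recipient,   # simplified mismatch flag (assumption)
    }

print(pair_features("CT", "TC"))
print(pair_features("AA", "AG"))
```

Each coded column can then be cross-tabulated against a dichotomized outcome, one SNP at a time.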
PCR and sequencing. The genomic DNA from the peripheral blood of recipients was isolated by using the QIAamp DNA Blood mini Kit (Qiagen, Valencia, CA). The genomic DNA from the donors was extracted from a small segment of blood in the infusion tube connected to the blood bag by using the same DNA extraction method. A total of 8 different primer pairs (Table 3) were used to amplify the DNA fragments covering from 500 bp upstream to 500 bp downstream of the 8 sourced SNPs by PCR as described previously 9 . Briefly, PCR was performed in a reaction volume of 50 μl containing 1X reaction buffer, 10 nmol of dNTPs, 6 pmol of forward and reverse primers, 300 ng of genomic DNA, and 1 μl of Pfu Turbo Hotstart DNA Polymerase (Agilent, Santa
Statistical analysis. The Hardy-Weinberg equilibrium (HWE) test was performed in the control group to examine the quality of experiments for the tested SNPs. SNPs that violated HWE were eliminated from analysis. The frequencies of alleles and genotypes were computed and compared between the groups of each of the three dichotomized outcomes: (1) survived vs. deceased; (2) GVHD grades 3-4 vs. else; and (3) having CMV infection/reactivation vs. not. The Cochran-Armitage test for trend was performed to evaluate the additive effect of risk alleles that each SNP had on the outcome. The correlation of specific SNP genotypes with different outcomes was investigated by genotypic test. The association between each outcome and the mismatch status of SNP genotypes in the donor-recipient matching group was examined using the chi-square and Fisher's exact tests under three types of models: additive (AA vs. Aa vs. aa), dominant (AA vs. Aa + aa), and recessive (AA + Aa vs. aa) 29 . Kaplan-Meier analysis and log-rank test were performed by using SPSS ver 17.0. 
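The dominant and recessive groupings named above, followed by Fisher's exact test, can be sketched as follows. The helper names `collapse` and `fisher_exact_two_sided` and the genotype counts are hypothetical; the two-sided p-value uses the common convention of summing all tables no more probable than the observed one, which may differ from the option used in the authors' software.

```python
from math import comb

def collapse(counts, model):
    """Collapse genotype counts into a 2x2 table (a, b, c, d).

    counts maps 'AA', 'Aa', 'aa' to (with_event, without_event).
    """
    hom_major, het, hom_minor = counts["AA"], counts["Aa"], counts["aa"]
    if model == "dominant":        # AA vs. Aa + aa
        g1 = hom_major
        g2 = (het[0] + hom_minor[0], het[1] + hom_minor[1])
    elif model == "recessive":     # AA + Aa vs. aa
        g1 = (hom_major[0] + het[0], hom_major[1] + het[1])
        g2 = hom_minor
    else:
        raise ValueError("model must be 'dominant' or 'recessive'")
    return g1[0], g1[1], g2[0], g2[1]

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def pmf(x):
        # Hypergeometric probability of x events in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = pmf(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p = sum(pmf(x) for x in range(lowest, highest + 1)
            if pmf(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical genotype counts: (patients with event, patients without event).
counts = {"AA": (3, 0), "Aa": (0, 2), "aa": (0, 1)}
for model in ("dominant", "recessive"):
    table = collapse(counts, model)
    print(model, table, round(fisher_exact_two_sided(*table), 3))
```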
The pair-wise linkage disequilibrium (LD) measures D' and r² for the SNPs in the donor group, recipient group, and donor-recipient matching group were determined by using HaploView 4.2 (https://www.broadinstitute.org/haploview/haploview) 30 . LD refers to the non-random association of alleles at two or more loci in a general population.
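The Hardy-Weinberg check used as a quality filter in the statistical analysis can be sketched with a one-degree-of-freedom chi-square goodness-of-fit test. The text does not say which HWE test or threshold was applied, so the chi-square form, the function name `hwe_chi_square`, and the counts are assumptions made for illustration.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed counts of the two homozygotes and the heterozygote.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of the first allele
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical genotype counts for one SNP.
chi2, p_value = hwe_chi_square(25, 50, 25)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

With small cohorts an exact HWE test is often preferred over the chi-square approximation, since expected counts for the rare homozygote can be very low.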