Significance of genetic modifiers of hemoglobinopathies leading towards precision medicine

Hemoglobinopathies, though monogenic disorders, show phenotypic variability. Hence, understanding the genetics underlying the heritable sub-phenotypes of hemoglobinopathies, specific to each population, would be prognostically useful and could inform personalized therapeutics. This study aimed to evaluate the role of genetic modifiers leading to higher HbF production, along with the cumulative impact of these modifiers on disease severity. A total of 200 patients (100 β-thalassemia homozygotes, 100 sickle cell anemia) and 50 healthy controls were recruited. Primary screening followed by molecular analysis was performed to confirm the β-hemoglobinopathy. Co-existing α-thalassemia and the polymorphisms located in 3 genetic loci linked to HbF regulation were screened. The most remarkable result was the association of SNPs with clinically relevant phenotypic groups. The γ-globin gene promoter polymorphisms [− 158 C → T, + 25 G → A], BCL11A rs1427407 G → T, and the HBS1L-MYB polymorphisms rs66650371 (− 3 bp) and rs9399137 T → C were correlated with higher HbF in the group with a lower disease severity score (P < 0.00001), milder clinical presentation, and a significant delay in the age of the first transfusion. Our study emphasizes the complex genetic interactions underlying the disease phenotype, which may serve as prognostic markers for predicting clinical severity and assist in disease management.

Primary screening and molecular analysis. Primary screening involved complete blood count analysis and the concentration of different hemoglobin fractions was quantified on BioRad Variant II high-performance liquid chromatography.
Molecular analysis of the β-globin gene was first carried out to confirm the hemoglobinopathy status in the patient samples by covalent reverse dot blot hybridization (CRDB), amplification refractory mutation system-polymerase chain reaction (ARMS-PCR), or direct DNA sequencing 6 . α-globin gene deletions were detected by multiplex PCR 7 . The patients were clinically evaluated and the disease severity score was calculated based on the detailed clinical history of the patient 9,10 . Linkage disequilibrium analysis was performed using Haploview software (https://www.broadinstitute.org/haploview/haploview).
Statistical analysis. Statistical analysis of the data was performed using GraphPad Prism version 6.01 software (GraphPad Software Inc., California, U.S.A.). The hematological indices among different patient groups and normal controls are represented as mean ± standard deviation (SD). Fisher's exact test was used to compare the polymorphism distribution among the patient and control groups. The comparison of the quantitative variables among the groups and between differing genotypes was carried out by the unpaired non-parametric Mann-Whitney U test. A P-value ≤ 0.05 was considered statistically significant. Generalized Multifactor Dimensionality Reduction (GMDR) software version beta 0.9 was used to analyse the interaction among the SNPs in different patient groups. Kaplan-Meier survival curve analysis was performed to determine the age of presentation by considering the transfusion-free survival among the patient groups.
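As an illustration of the allele-distribution comparison described above, the following is a minimal sketch of a two-sided Fisher's exact test on a 2 × 2 table of allele counts. It is not the GraphPad implementation used in the study; the function name `fisher_exact_two_sided` and the example counts are hypothetical and chosen only to show the calculation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be two patient groups and columns the variant / wild-type
    allele counts. Returns the P-value.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Hypothetical allele counts (NOT taken from the study):
# group 1: 30 variant / 70 wild-type; group 2: 12 variant / 88 wild-type.
p_value = fisher_exact_two_sided(30, 70, 12, 88)
significant = p_value <= 0.05  # threshold stated in the text
```

The test conditions on the table margins, which makes it suitable for the small cell counts that can arise when a minor allele is rare in one group.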
Ethics approval. The study was approved by the National Institute of Immunohaematology-Institutional Ethics Committee. Consent to participate. Informed consent was obtained from all individual participants included in the study.

Results
On the basis of clinical history, the β-thalassemia patients were classified into a severe group (50 Thalassemia major: TM) and milder group (50 Thalassemia Intermedia: TI).
As co-inheritance of α-thalassemia is a well-known disease modifier of β-thalassemia and SCA, the presence of α-globin gene deletions was screened in the patient groups. A much higher prevalence of single α-globin gene deletions was observed in SCA patients (51.0%). Among the β-thalassemia homozygotes, the β-thalassemia intermedia group showed a higher prevalence (26.0%) of α-globin gene deletions as compared to β-thalassemia major (20.0%), although this difference was not statistically significant (P: 0.47) (Supplementary Table 2).
The second powerful modifier of disease severity in hemoglobinopathy patients is an elevated HbF level. Hence, the polymorphisms located in the three loci linked to raised HbF levels (the γ-globin promoter region, BCL11A and the HBS1L-MYB intergenic region) were analysed in this study.
In the A γ-globin promoter region, the + 25 (G → A) (HBG1:c.-29 G > A) variation was detected and the A allele was found to be the variant allele. The A allele was found to be significantly more frequent in the TI group as compared to TM (P: 0.005). Also, the A allele in β-thalassemia intermedia patients was significantly associated with increased HbF levels (79.9% ± 28.6, P: 0.03). In SCA patients, 94% of the patients were homozygous for the A allele, an observation similar to the XmnI polymorphism (Fig. 1B). Among the 5 intronic polymorphisms screened in the BCL11A gene, the mutant T allele of the rs1427407 (G → T) polymorphism was significantly more frequent in the β-thalassemia intermedia group as compared to the thalassemia major group (P: 0.002, OR 5.6, 1.84-17.22). In SCA patients, the T allele was found to be significantly associated with raised HbF levels (HbF > 17.4%; P: 0.003, OR 3.14, 1.46-6.75) as compared to the other group. The T allele was also found to be significantly associated with HbF levels in both the patient groups (P < 0.05) (Fig. 1C).
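The odds ratios quoted above can be illustrated with a short sketch that computes an odds ratio and a Wald (Woolf) confidence interval from a 2 × 2 table of allele counts. This is an assumption for illustration: the text does not state how the intervals were computed, and the range given after each OR is assumed here to be a 95% confidence interval. The function name `odds_ratio_ci` and the counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for [[a, b], [c, d]].

    a, b: variant / wild-type allele counts in the first group
    c, d: variant / wild-type allele counts in the second group
    z:    normal quantile (1.96 corresponds to a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical allele counts (NOT taken from the study).
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

An interval that excludes 1 corresponds to an association that is significant at the matching level, which is how ranges such as 1.84-17.22 are usually read.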
In the sickle cell anemia patients, the C allele of the rs11886868 C → T polymorphism was found to be significantly associated with increased HbF levels (P: 0.02, HbF: 20.9% ± 8.8) (Fig. 1D). Among the HBS1L-MYB intergenic polymorphisms, the deletional allele of the rs66650371 (intact TAC → 'TAC' deletion) polymorphism and the C allele of rs9399137 (T → C) were found to be significantly more frequent in the milder β-hemoglobinopathy patients. As reported in earlier studies, these 2 polymorphisms were found to be in complete linkage disequilibrium. The minor alleles of these polymorphisms were found to be significantly associated with the HbF levels (Fig. 1E,F). Table 3 gives a detailed analysis of the allelic frequencies of these polymorphisms in the patient and control groups.
The linkage disequilibrium plot showed that the + 25 (G → A) polymorphism in A γ-globin gene and the XmnI polymorphism in the G γ-globin are highly linked (Linkage Disequilibrium coefficient D′: 93). Also, the SNPs rs11886868 (C → T) and rs7557939 (A → G) in the BCL11A intronic region, are strongly linked (D′: 88) with each other (Fig. 2).
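For readers unfamiliar with the D′ statistic, the sketch below computes it from phased haplotype counts for two biallelic sites. This is a simplification: Haploview estimates haplotype frequencies from unphased genotypes, whereas the function here (the hypothetical `d_prime`) assumes the four haplotype counts are already known. It is also assumed that the plotted values (93, 88) are D′ multiplied by 100, as is common in LD plots; all counts below are illustrative.

```python
def d_prime(n_AB, n_Ab, n_aB, n_ab):
    """Normalized linkage disequilibrium coefficient D' for two biallelic loci.

    Arguments are counts of the four phased haplotypes (A/a at locus 1,
    B/b at locus 2). Returns |D| / D_max, a value between 0 and 1.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    # Raw disequilibrium: observed haplotype frequency minus the
    # frequency expected if the two loci were independent.
    d = p_AB - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return abs(d) / d_max


# Hypothetical haplotype counts (NOT taken from the study).
example = d_prime(40, 10, 10, 40)
```

A value of 1 (complete linkage) arises whenever at least one of the four possible haplotypes is absent from the sample.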
Further, the best SNP models, defined by the lowest prediction error (testing balance accuracy), the highest cross-validation consistency (CVC), and a significant P-value, were identified. The results revealed that the cumulative effect of the mutant alleles of the 3 SNPs − 158(C → T), rs11886868 (C → T), and rs1427407 (G → T) was significantly higher in β-thalassemia intermedia patients, and this combination formed the best SNP model, with a testing balance accuracy of 74.9% and a cross-validation consistency of 9/10. Further gene-gene interaction studies suggested that a synergistic effect may exist among these SNPs in elevating the HbF level (Fig. 3A). Similarly, among the SCA patients, the gene-gene interaction between the mutant alleles of rs66650371 and rs1427407 was found to be significantly higher in patients with HbF levels > 17.4%, with a testing balance accuracy of 66.0% and a cross-validation consistency of 10/10 (Fig. 3B). The generation of GMDR models for determining the most influential SNPs among the 9 SNPs studied in the patient groups is shown in Supplementary Table 3.
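The testing balance accuracy reported for the GMDR models is, in its usual definition, the mean of sensitivity and specificity on the held-out fold. The sketch below shows that arithmetic under this assumption; it does not reproduce the GMDR software, and the function name `balanced_accuracy` and the confusion-matrix counts are hypothetical.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Balanced accuracy = (sensitivity + specificity) / 2.

    tp, fn: patients in the milder / high-HbF class predicted correctly / incorrectly
    tn, fp: patients in the other class predicted correctly / incorrectly
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical confusion-matrix counts (NOT taken from the study).
score = balanced_accuracy(tp=40, fn=10, tn=35, fp=15)
```

Averaging the two rates keeps the measure informative when the two classes differ in size, which plain accuracy does not.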
The presence of ameliorating alleles may significantly delay the age of presentation and transfusion requirement in β-hemoglobinopathy patients. Hence, for the analysis we included both primary modifiers (the type of β-globin gene mutation in β-thalassemia patients) and secondary modifiers: the α-globin genotype and the HbF modulators [γ-globin promoter variations, BCL11A, MYB and KLF1 variations (the KLF1 data are from our previously published paper)] 11 . Among the SCA patients, a strong negative correlation was observed between the HbF levels and the disease severity score (Pearson correlation coefficient r: − 0.7, P < 0.00001). The patients inheriting higher numbers of modulating alleles showed significantly elevated HbF levels (mean HbF: 21.9% ± 9.8) and also showed a significant delay in the age of first transfusion as compared to the other group (Fig. 4A,B). The β-thalassemia intermedia patients inheriting a greater number of disease-ameliorating alleles showed elevated HbF levels (mean HbF: 75.1% ± 29.9) with a reduced disease severity score (mean DSS: 5.6) as compared to patients with lower numbers of disease severity modulating alleles, who had lower HbF levels (mean HbF: 54.1% ± 36.9). However, no significant difference in the age of first transfusion was observed between the 2 groups (Fig. 4C,D). Supplementary Table 4 shows the median transfusion-free survival and hazard ratio in both the patient groups. Both the number of modulating alleles and the transfusion-free survival were inversely associated with the hazard ratio.
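The transfusion-free survival analysis can be sketched with a plain Kaplan-Meier estimator, where the event is the first transfusion and the time is the patient's age at that event (or at last follow-up for patients not yet transfused). This is a minimal illustration and not the software used in the study; the function names `kaplan_meier` and `median_survival` and the ages below are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of transfusion-free survival.

    times:  age at first transfusion or at last follow-up
    events: 1 if the first transfusion was observed at that age, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical ages in years (NOT taken from the study).
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
median_age = median_survival(curve)
```

Computing one such curve per group (for example, patients with higher versus lower numbers of modulating alleles) gives the kind of comparison shown in survival plots; a formal comparison would additionally need a test such as the log-rank test, which is not sketched here.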

Discussion
Though β-thalassemia and sickle cell disease are single-gene disorders with prototypical Mendelian inheritance patterns, both disorders display a wide spectrum of clinical phenotypes. Thus, the search for genetic modifiers was triggered, as 5-10% of β-thalassemia homozygous patients with the same β-globin gene mutation, as well as sickle cell anemia patients, showed a variable pattern of clinical expression 12 .
In this study, we first classified the β-thalassemia patients according to clinical severity and then studied the influence of the genetic modifiers. Modell and Berdoukas 13 reported that 60% of β-thalassemia homozygous patients presented in the first year of life and were classified as β-thalassemia major, whereas 9% of the β-thalassemia homozygous patients presented after 2 years of age with intermediate clinical severity and were classified as β-thalassemia intermedia 13,14 . A similar observation was made in our study, in which the β-thalassemia major patients presented early, at 9.2 ± 2.7 months, and the patients in the β-thalassemia intermedia group had a delayed mean age of presentation of 4.3 ± 3.3 years. The β-thalassemia intermedia patients also showed a significantly higher mean baseline hemoglobin of 7.8 ± 1.4 g/dL as compared to thalassemia major patients. Similarly, another study showed that in 63 β-thalassemia intermedia patients, the hemoglobin values ranged between 7 and 9 g/dL, with an occasional transfusion regimen and splenomegaly 15 . In our study as well, pronounced hepatosplenomegaly was observed in β-thalassemia intermedia patients as compared to β-thalassemia major. Mpalampa et al. 16 , considering a mean HbF cut-off of 10% in 216 sickle cell anemia patients, observed a significant negative correlation of HbF levels with the total number of transfusions (r = − 0.181, P: 0.004) and the hospitalisation rate (r = − 0.173, P: 0.006), and a significant positive correlation with the age at diagnosis (r = 0.151, P: 0.013) 16 . In the Indian context, Nayak et al. 17 studied 60 sickle cell anemia patients and observed fewer episodes of painful crises in children with a high baseline HbF level as compared to children with a low HbF level 17 . Correspondingly, in our study, the mean age of diagnosis among 100 SCA patients was found to be 6.3 ± 5.2 years, which is considerably delayed as compared to the patient cohort studied by Mpalampa et al. 16 .
This observation could be due to inherently elevated HbF levels in Indian patients, mainly attributable to the Arab-Indian haplotype, which is a major determinant of HbF levels in Indian SCA patients 18 . Further, it was observed that the patients with a higher HbF level had a delayed age of presentation (7.1 ± 5.5 years), with a lower transfusion requirement and sporadic painful crises, compared to patients with an HbF level ≤ 17.4% (age of presentation: 5.2 ± 4.9 years).
As the β-thalassemia alleles inherited by the patient act as a primary modulator of the disease severity in β-thalassemia, Colah et al. 19 observed that the milder mutations are prevalent in the β-thalassemia intermedia group as compared to severe β-thalassemia major patients 19 . Similarly, in our study, the presence of milder β-thalassemia alleles was significantly higher in β-thalassemia intermedia as compared to β-thalassemia major 20 . A similar observation was seen in our patients, which suggested the presence of other genetic factors that may play a synergistic role in modifying the disease severity of β-thalassemia. In a previous study by Nadkarni et al. 21 , the associated α-thalassemia was found to be significantly higher in the thalassemia intermedia group (37%) as compared to the β-thalassemia major group (5%) (P < 0.025) 21 . A study by Pandey et al. 22 revealed that 32% of sickle cell anemia patients had a co-existing α-globin gene deletion and showed a relatively milder clinical course, with improved hematological indices and a reduced transfusion history 22 . Similarly, Rumaney et al. 23 observed that in Cameroonian sickle cell disease patients, co-inheritance of α-thalassemia was associated with improved hematological indices and a better survival rate 23 . In this study too, we observed that the co-inheritance of α-thalassemia was higher in the milder β-thalassemia patient group as compared to the other group, and 51% of SCA patients also showed the presence of α-thalassemia. Conversely, excess α-globin chains play a significant role in the pathophysiology of homozygous β-thalassemia. The coexistence of a triplicated α-globin gene is found to exacerbate the phenotypic severity of β-thalassemia by causing greater globin chain imbalance and thus severe anemia 24 . The effect of the genetic modifiers of fetal hemoglobin was also analysed in this study.
A study in Egyptian β-thalassemia patients showed that 83.3% of β-thalassemia intermedia cases were heterozygous for the XmnI polymorphism as compared to β-thalassemia major (57.6%), and that β-thalassemia intermedia patients with a single T allele of XmnI showed a delayed age of diagnosis, raised HbF levels and a milder disease phenotype as compared to patients negative for the XmnI polymorphism 25 . In another study, it was also determined that the patients homozygous for the mutant T allele of the XmnI polymorphism showed significantly higher mean HbF levels (85.5 ± 6.8%) as compared to the thalassemia intermedia patients homozygous for the XmnI CC genotype (19.5% ± 29.3) 26 . A similar result was observed in our patient group, wherein the β-thalassemia intermedia patients homozygous for the variant T allele showed a significantly higher HbF level. The + 25 G → A polymorphism in the A γ-globin promoter region was found to be significantly associated with elevated HbF levels in the β-thalassemia intermedia group. This polymorphism was first reported by Bianchi et al. 27 , and a strong linkage of this polymorphism with the − 158 C → T (XmnI) polymorphism was observed in their study as well 27 . It has been reported that the + 25 G → A polymorphism reduces the binding efficacy of the LYAR transcription factor (a repressor of γ-globin gene expression) and abolishes the binding of 2 negative epigenetic regulators [DNA methyltransferase 3 alpha (DNMT3A) and protein arginine methyltransferase 5 (PRMT5)] to this promoter region 27,28 . Thus, it could be speculated that the mutant alleles of both the XmnI polymorphism (T allele) and the + 25 G → A polymorphism (A allele) have a cumulative effect, synergistically elevating the HbF levels.
The association of BCL11A polymorphisms with elevated HbF levels and their effect on amelioration of the disease phenotype was studied by Uda et al. 29 in Sardinian β-thalassemia homozygous patients 29 . They showed that the mutant C allele of rs11886868 (C → T) formed the major allele in Sardinian population and was significantly associated with elevated HbF levels in β-thalassemia intermedia patient group. Similarly, in Indian patients Dadheech et al. 30 , determined that the C allele was significantly associated with the raised HbF levels and delayed the age of presentation in both thalassemia homozygous and SCA groups 30 . In our study, the mutant CC genotype was found to be significantly associated with HbF levels only in the sickle cell anemia patients.
Similarly, Indonesian HbE-β-thalassemia patients inheriting variant alleles of rs11886868 and rs766432 in the BCL11A gene showed higher HbF levels and reduced disease severity as compared to patients with wild-type alleles 31 . The second SNP that was found to be significantly associated with the HbF levels is the rs1427407 (G → T) polymorphism in the BCL11A gene. Our results were consistent with the earlier report by Bhanushali et al. 32 , who showed a similar distribution of the allelic frequency of rs1427407 in Indian SCA patients 32 . Studies have demonstrated that patients with the mutant T allele of rs1427407 (G → T) show a significantly higher HbF level, which is concordant with our study and suggests a crucial role of this SNP in modulating the HbF levels 32,33 .
Similarly, Chaouch et al. 34 observed that co-inheritance of the mutant C allele of rs11886868 and the mutant A allele of rs4671393 ameliorated the clinical phenotype of SCA patients 34 . In our study, though the A allele of rs4671393 was found to be more frequent in the TI group, no significant difference in the allelic frequencies between the milder and severe groups could be observed. Studies have identified a restricted 14 kb region in BCL11A intron 2 to be associated with H3 acetylation and RNA pol II activity, as well as a strong GATA-1 and TAL-1 binding site, all of which indicate the presence of a regulatory sequence in this region 35,36 . This suggests that the presence of a polymorphism could potentially alter the recruitment of transcription factors to this region.
In our population, a 100% linkage was observed between the rs66650371 and rs9399137 polymorphisms. A similar observation was made in Tanzanian SCD patients, where both these polymorphisms in the HMIP 2A block were strongly associated with HbF levels and showed a strong linkage 37 . Similarly, Lai et al. 38 showed in β-thalassemia intermedia patients that the mutant alleles of rs9376090 (NC_000006.12:g.135090090T > C), rs7776054 (NC_000006.12:g.135097778A > G), rs9399137, rs9389268 (NC_000006.12:g.135098493A > G) and rs9402685 (NC_000006.12:g.135098550T > C) in the HBS1L-MYB intergenic region, and rs189984760 in the BCL11A locus, were significantly associated with a high HbF level 38 . Bioinformatic characterization of the 3 bp deletion polymorphism showed that this region acts as a binding site for 4 transcription factors (TAL1/SCL, E47, GATA-2 and RUNX1/AML1), all of which are important for erythroid differentiation and erythropoiesis. The presence of the mutant allele may disrupt the expression of the MYB gene, which is a negative regulator of γ-globin gene expression 39,40 .
In our study, it was observed that the prevalence of both the T allele of rs1427407 (G → T) and the 3 bp deletional allele of rs66650371 was significantly higher in the SCA patients with higher HbF levels (HbF > 17.4%) as compared to the SCA patients with lower HbF levels (HbF ≤ 17.4%). A similar result was shown by Adeyemo et al. 41 , wherein patients with the mutant alleles of these 2 polymorphisms had a milder form of the disease, with improved hemoglobin levels 41 .
Similarly, in β-thalassemia homozygous patients, the cumulative effect of the 3 HbF-associated mutant alleles of the SNPs − 158(C → T), rs11886868 (C → T) and rs1427407 (G → T) was significantly higher in the β-thalassemia intermedia group as compared to the β-thalassemia major group. A comparable result was reported by Allawi et al. 42 , who determined that the main factors leading to milder phenotypes were the attenuated β-thalassemia alleles, the T allele of the XmnI polymorphism and the minor allele of BCL11A rs10189857 42 . Cardoso et al. 43 studied the influence of three known major loci on the HbF trait (HBG2, rs748214; BCL11A, rs4671393; and HBS1L-MYB, rs28384513, rs489544 and rs9399137) in north Brazilian SCA patients and showed that the raised HbF trait was primarily influenced by mutant alleles of BCL11A 43 .
Further, to predict the disease severity in the presence of these genetic modifiers, Badens et al. 5 studied 5 genetic modifiers of β-thalassemia. By regression analysis, all 5 types of favorable alleles were found to be significantly associated with the thalassemia intermedia phenotype. The β-globin gene mutations and the XmnI polymorphism were the most influential modifiers of the disease severity 5 . A similar observation was reported by Danjou et al. 44 , who further evaluated the age of the first transfusion with respect to the inheritance of HbF-boosting alleles and observed that the age of first transfusion was delayed in the presence of a greater number of HbF-inducing alleles 44 . Similarly, in our study, 8% of thalassemia intermedia patients (HbF: 82.7% ± 22.6) had inherited more than 10 disease-ameliorating alleles, as compared to none in thalassemia major. These observations suggest that the presence of an increased number of ameliorating alleles may help in reducing the disease severity in hemoglobinopathy patients, mainly by correcting the globin chain imbalance. The precise identification of the polymorphisms associated with elevated HbF levels may help in developing a molecular chip that may assist in predicting the disease severity. Validation of our results needs to be carried out in a larger cohort.

Conclusions
The present study expands the knowledge of the frequency of the genetic modifiers (primary and secondary modifiers) and the independent effect of individual predictor genes on HbF levels in hemoglobinopathy patients. The analysis of the cumulative effect of the HbF modulators may help in identifying the genes with the strongest influence on the HbF level in both β-thalassemia and sickle cell anemia patients in the population. Predictions based on genetic modifiers may thus help foresee the severity of β-thalassemia and SCA. This study may assist clinicians in predicting the clinical phenotype of hemoglobinopathy patients at an early stage and may thus help in the efficient management of the disease. It may also contribute towards understanding the molecular mechanisms of HbF regulation and the development of therapeutic approaches for β-hemoglobinopathies.