Introduction

Hemoglobin E (HbE; β26Glu->Lys) is the most common variant of the β-globin gene (HBB; MIM 141900) in Southeast Asia. The HbE variant causes a structural defect in hemoglobin, while the HbE disease (homozygotes for HbE) shows milder clinical severity than hemoglobin disorders caused by HbS (HbS; β6Glu->Val) and HbC (HbC; β6Glu->Lys). In addition, the HbE trait (heterozygotes for HbE) is clinically benign with no anemia. The severity is considered to be proportional to the degree of the structural defect and the expression level of the variant chromosome. The βE chain is inefficiently synthesized compared to the normal βA chain so that heterozygotes for HbE express less variant β chain than the HbS and HbC heterozygotes (Traeger et al. 1980). This lower expression of the βE chain seems to be one reason why the HbE disease and trait are mostly benign.

The HbE variant has a point mutation at codon 26 that changes a glutamic acid in the normal β chain to a lysine. In addition to creating a different hemoglobin structure, this exon mutation also activates a cryptic 5′ splicing site at codon 25 of the HbE pre-mRNA. The mRNA produced by the aberrant splicing is nonfunctional, and the amount of correctly spliced HbE mRNA is decreased (Orkin et al. 1982), resulting in the low concentration of βE chain compared to βA chain in HbE heterozygotes.

The expression of HBB is also known to be regulated by many critical cis-acting sequences (Grosveld et al. 1993). There is a binding site for the repressor protein BP1 located upstream of HBB at position −530 (Berg et al. 1989; Chase et al. 2002). The BP1 binding site contains a tandem (AT)x(T)y repeat, alterations in which change the binding affinity for BP1, thereby influencing the transcription of HBB. To date, several alleles with different numbers of repeats have been found in various populations. The (AT)9(T)5 allele has been reported to be associated with silent β-thalassemia (Semenza et al. 1984; Murru et al. 1990). In addition, compared to other repeats on the HbS chromosome, the (AT)9(T)5 repeat binds BP1 tightly, and, in Indians with the HbS trait, the HbS is in linkage disequilibrium with (AT)9(T)5 (Elion et al. 1992). This may explain why the expression level of HbS is lower and the disease is milder in Indians than in Africans with the sickle cell trait (Elion et al. 1992). More recently, transient expression experiments have revealed that the (AT)9(T)5 repeat affects the promoter activity in an erythroid environment (Kalotychou et al. 2002). These results strongly suggest that the expression level of the β chain coded by the chromosome with the (AT)9(T)5 repeat is reduced compared to other chains coded by the chromosome without the (AT)9(T)5 repeat.
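As a minimal illustration of how an (AT)x(T)y allele can be named from a sequence read, the following Python sketch counts the AT dinucleotide repeats and the run of T that follows them. The function names `call_at_t_allele` and `allele_name` are hypothetical, and the flanking bases used in the usage note are invented; this is not the genotyping procedure used in the study.

```python
import re

# One or more AT dinucleotides immediately followed by a run of T.
_REPEAT = re.compile(r"((?:AT)+)(T+)")


def call_at_t_allele(seq):
    """Return (x, y) for the longest (AT)x(T)y tract in seq, or None."""
    best = None
    for match in _REPEAT.finditer(seq.upper()):
        # Keep the longest tract so that a short stray "ATT" elsewhere
        # in the read is not mistaken for the repeat itself.
        if best is None or len(match.group(0)) > len(best.group(0)):
            best = match
    if best is None:
        return None
    x = len(best.group(1)) // 2  # number of AT units
    y = len(best.group(2))       # length of the T run
    return x, y


def allele_name(seq):
    """Format the call in the (AT)x(T)y notation used in the text."""
    call = call_at_t_allele(seq)
    return None if call is None else "(AT)%d(T)%d" % call
```

For example, `allele_name("GGC" + "AT" * 9 + "T" * 5 + "CGG")` names the tract in this invented read as (AT)9(T)5.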

The HbE variant was reported to be on the chromosome with (AT)7(T)7 or (AT)9(T)5 in the Chinese population (Zhou et al. 1995). However, it is uncertain whether or not the HbE in the Thai population is also in linkage disequilibrium with these (AT)x(T)y alleles because HbE in Southeast Asia has been suggested to have multiple origins (Antonarakis et al. 1982). In Thailand, the overall frequency of HbE is estimated to be 13%, and in some areas, the frequency of HbE equals that of HbA (Wasi et al. 1980). The HbE variant is thought to have arisen recently by a single mutation and rapidly increased its population frequency in Thailand (Ohashi et al. 2004). Thus, it is interesting to examine which (AT)x(T)y allele is in linkage disequilibrium with such a major HbE variant. In this study, we investigated variations in the (AT)x(T)y repeat in three HbE homozygous, 22 HbE heterozygous, and 32 normal (HbA) homozygous individuals living in Thailand. For the genotyping of this complicated repeat, we developed an efficient polymerase chain reaction–single-strand conformation polymorphism (PCR-SSCP) method.

Materials and methods

Subjects and polymorphisms

For this study, we recruited Thai patients with Plasmodium falciparum malaria living in Suan Pung, Thailand, near the border with Myanmar. The diagnosis of mild malaria is described elsewhere (Ohashi et al. 2002). The HbE genotypes of these malaria patients were determined in our previous study (Ohashi et al. 2004). In the present study, we analyzed five polymorphic sites, including the (AT)x(T)y repeat at the putative BP1 binding site, in 57 subjects. This study was approved by the institutional review board of the Faculty of Tropical Medicine, Mahidol University, and informed consent was obtained from all participants.

PCR amplification and sequencing

PCR was performed for the putative BP1 binding site using the sense primer BP1-F (5′-gcatgcatgagcaaattaaga-3′) and the antisense primer BP1-R (5′-caggacagaatggatgaaaactc-3′). The positions of detected polymorphisms and primers on NG_000007.2 are shown in Fig. 1. PCR was carried out in a GeneAmp PCR system 9700 thermal cycler (Perkin-Elmer, Applied Biosystems) using an initial denaturation for 10 min at 96°C, followed by 40 cycles of denaturation for 30 s at 96°C, annealing for 30 s at 58°C, and extension for 30 s at 72°C. PCR products were sequenced for all samples on both strands using an ABI Prism 3100 automated sequencer (Perkin-Elmer, Applied Biosystems).
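The primer sequences and cycling program given above can be checked with a few lines of Python. The sketch below reports primer length and G+C count, a rough melting temperature from the Wallace rule of thumb (2°C per A/T plus 4°C per G/C), and the total programmed hold time. The Wallace rule is an assumption introduced here for illustration; the text does not say how the annealing temperature was chosen, and the hold time ignores ramping between temperatures.

```python
# Primer sequences as given in the text (5' to 3').
BP1_F = "gcatgcatgagcaaattaaga"
BP1_R = "caggacagaatggatgaaaactc"


def gc_count(primer):
    """Number of G and C bases in the primer."""
    p = primer.lower()
    return p.count("g") + p.count("c")


def wallace_tm(primer):
    """Rough Tm in degrees C: 2 x (A+T) + 4 x (G+C). A rule of thumb only."""
    gc = gc_count(primer)
    at = len(primer) - gc
    return 2 * at + 4 * gc


def programmed_hold_minutes(initial_s=600, cycles=40, step_s=(30, 30, 30)):
    """Total hold time of the cycling program, excluding ramp times."""
    return (initial_s + cycles * sum(step_s)) / 60.0


for name, seq in (("BP1-F", BP1_F), ("BP1-R", BP1_R)):
    print(name, len(seq), "nt, G+C =", gc_count(seq),
          ", Wallace Tm =", wallace_tm(seq))
print("hold time (min):", programmed_hold_minutes())
```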

Fig. 1 Five polymorphisms detected in this study. The position of each polymorphism was described based on the GenBank sequence entry NG_000007.2.

PCR-SSCP analysis

PCR-SSCP analysis for the amplified fragment including the (AT)x(T)y repeat was performed for all samples. The amplified fragment included five polymorphic sites (Table 1 and Fig. 1). One microliter of solution containing the PCR product was mixed with 7 μl of denaturing solution (95% formamide, 20 mM EDTA, 0.05% bromophenol blue, and 0.05% xylene cyanol FF). The mixture was denatured at 96°C for 5 min and immediately cooled on ice. One microliter of the mixture was applied to a 10% polyacrylamide gel (acrylamide:bisacrylamide = 49:1) containing 10% glycerol. Electrophoresis was carried out at 20 mA/gel for 70 min at 10°C in a minigel electrophoresis unit (ATTO, Tokyo, Japan) and in 0.5X TBE (45 mM Tris-borate, pH 8.0, 1 mM EDTA). The separated single-stranded DNA fragments in the gel were visualized by silver staining (Daiichi Pure Chemicals, Tokyo, Japan). PCR-SSCP of the fragments showed seven different banding patterns (Fig. 2). To obtain the reference samples for the electrophoresis, each silver-stained band showing a unique position was cut out and subjected to direct PCR-based sequencing. Accordingly, we observed seven haplotypes corresponding to the unique seven banding patterns (Table 1). The results from direct PCR-based sequencing and PCR-SSCP were consistent.
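The loading step above involves a simple dilution, which the following Python sketch works through: 1 μl of PCR product is mixed with 7 μl of denaturing solution, and 1 μl of that mixture is loaded. The helper name `final_concentration` is hypothetical, and the calculation assumes that the PCR product contributes no formamide or EDTA of its own, which the text does not state.

```python
def final_concentration(stock_conc, stock_volume_ul, added_volume_ul):
    """Concentration after diluting a stock with another solution.

    Assumes the added solution contains none of the component.
    """
    total_ul = stock_volume_ul + added_volume_ul
    return stock_conc * stock_volume_ul / total_ul


PCR_UL = 1.0         # PCR product added to the mix
DENATURANT_UL = 7.0  # denaturing solution added to the mix
LOADED_UL = 1.0      # volume of the mix applied to the gel

# Components of the denaturing solution after mixing.
formamide_pct = final_concentration(95.0, DENATURANT_UL, PCR_UL)
edta_mM = final_concentration(20.0, DENATURANT_UL, PCR_UL)

# Volume of the original PCR product contained in the loaded aliquot.
pcr_product_loaded_ul = LOADED_UL * PCR_UL / (PCR_UL + DENATURANT_UL)

print(formamide_pct, edta_mM, pcr_product_loaded_ul)
```

Under these assumptions, each lane receives one eighth of a microliter of the original PCR product.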

Table 1 Haplotypes including the BP1 binding site detected in this study

Fig. 2 Polymerase chain reaction–single-strand conformation polymorphism (PCR-SSCP) analysis of five polymorphic sites at the BP1 binding site. The sample genotype (pair of haplotypes) is shown at the top of each lane (the numbers from 1 to 7 represent the haplotypes A1 to A7).

Statistical analysis

Because our SSCP analysis could distinguish the seven haplotypes, we treated the SSCP haplotypes as alleles at a single polymorphic site. The phase (i.e., the pair of haplotypes) of each of the 57 subjects (28 HbE and 86 HbA chromosomes in total) was inferred using the PHASE program version 2.0.2 (Stephens et al. 2001; Stephens and Donnelly 2003) with the default settings. In this study, we report the number of haplotypes based on the most likely haplotype pair for each subject.
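The chromosome counts quoted above follow directly from the genotype counts (three HbE homozygotes, 22 HbE heterozygotes, and 32 HbA homozygotes). A short Python sketch of the arithmetic, with hypothetical variable and function names:

```python
# Genotype counts as given in the text.
genotype_counts = {"EE": 3, "AE": 22, "AA": 32}


def chromosome_counts(counts):
    """Return (HbE chromosomes, HbA chromosomes) from genotype counts."""
    hbe = 2 * counts["EE"] + counts["AE"]  # two per homozygote, one per heterozygote
    hba = 2 * counts["AA"] + counts["AE"]
    return hbe, hba


n_subjects = sum(genotype_counts.values())
n_hbe, n_hba = chromosome_counts(genotype_counts)

print(n_subjects, "subjects,", n_hbe, "HbE and", n_hba, "HbA chromosomes,",
      n_hbe + n_hba, "in total")
```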

Results

In this study, we investigated variations in the (AT)x(T)y repeat in three HbE homozygous, 22 HbE heterozygous, and 32 normal (HbA) homozygous individuals living in Thailand. Five polymorphisms, 69758T/C, (AC)n, (AT)x, (T)y, and 69840T/C were genotyped successfully using both direct PCR-based sequencing and PCR-SSCP, allowing determination of the phase or pair of haplotypes in each individual. In the studied population, we observed seven haplotypes defined by these five polymorphisms (Table 1 and Fig. 1). Four alleles were detected at (AT)x whereas only two alleles were observed at other sites. To our knowledge, this is the first study reporting an (AC)n polymorphism adjacent to the (AT)x(T)y repeat.

The frequencies of haplotypes consisting of both HbA/HbE polymorphism and SSCP haplotype were estimated based on the group of 57 subjects (Table 2). Haplotypes with low estimated frequencies may not exist in the studied subjects (e.g., HbE—A2) because the results are based on statistical inference from a small sample. SSCP haplotypes A1, A2, A4, and A6 were common, and the (AT)9(T)5 and (AT)7(T)7 alleles were predominant, in the Thai population. The HbE variant was in strong linkage disequilibrium with haplotype A1 (i.e., (AT)9(T)5 allele) whereas HbA was on various chromosomes with different SSCP haplotypes.
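To make the notion of linkage disequilibrium concrete, the sketch below computes the standard coefficient D and its normalized form D′ from a two-by-two table of haplotype counts (HbE or HbA against A1 or any other SSCP haplotype). The function name `ld_stats` is hypothetical, and the counts in the usage note are invented for illustration only; they are not the values in Table 2, although their margins were chosen to match the 28 HbE and 86 HbA chromosomes studied.

```python
def ld_stats(n_e_a1, n_e_other, n_a_a1, n_a_other):
    """Return (D, D') for two biallelic sites from four haplotype counts.

    n_e_a1    : HbE chromosomes carrying haplotype A1
    n_e_other : HbE chromosomes carrying any other SSCP haplotype
    n_a_a1    : HbA chromosomes carrying haplotype A1
    n_a_other : HbA chromosomes carrying any other SSCP haplotype
    """
    total = n_e_a1 + n_e_other + n_a_a1 + n_a_other
    p_e = (n_e_a1 + n_e_other) / total  # frequency of HbE
    p_a1 = (n_e_a1 + n_a_a1) / total    # frequency of A1
    p_e_a1 = n_e_a1 / total             # frequency of the HbE-A1 haplotype

    d = p_e_a1 - p_e * p_a1
    if d >= 0:
        d_max = min(p_e * (1 - p_a1), (1 - p_e) * p_a1)
    else:
        d_max = min(p_e * p_a1, (1 - p_e) * (1 - p_a1))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime


# Invented counts, for illustration only (not the data in Table 2).
d, d_prime = ld_stats(27, 1, 20, 66)
print("D = %.4f, D' = %.4f" % (d, d_prime))
```

A D′ close to 1 indicates that nearly all chromosomes carrying one allele also carry the other, which is the pattern described in the text for HbE and haplotype A1.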

Table 2 Estimated numbers of haplotypes among 57 subjects (114 chromosomes)

Discussion

We investigated the (AT)x(T)y repeat on HbE and HbA chromosomes in the Thai population. The (AT)9(T)5 allele is thought to be associated with decreased β chain synthesis (Semenza et al. 1984; Murru et al. 1990; Elion et al. 1992) and to affect HBB expression in an erythroid environment (Kalotychou et al. 2002). Our results show strong linkage disequilibrium of HbE with (AT)9(T)5, implying that this repeat may contribute to the low concentration of βE chain in individuals with HbE. However, some results contradict this possibility. For example, the expression level of β chain in normal individuals is not significantly affected by a polymorphism in the (AT)x(T)y repeat (Galanello et al. 1993), and phenotypic diversity is not associated with (AT)9(T)5 (Wong et al. 1989). Because the low concentration of βE chain can also be explained by the aberrant splicing of HbE mRNA (Orkin et al. 1982), further functional studies are needed to determine whether the (AT)9(T)5 repeat influences HbE expression.

Hemoglobin E was found to be in strong linkage disequilibrium with (AT)9(T)5 in this study. The same haplotype has been reported in Chinese (Zhou et al. 1995). Thus, HbE variants in the studied Thai population and Chinese may have the same origin (i.e., the HbE mutation may have occurred on a chromosome containing the (AT)9(T)5 repeat and rapidly spread into Southeast Asia) although HbE variants with different origins are suggested to exist in Thailand (Fucharoen et al. 2002). Assuming that the present relative allele frequencies of the (AT)x(T)y repeats on HbA chromosomes correspond to the frequencies of these repeats at the time of HbE mutation, (AT)9(T)5 would have been one of the dominant alleles. Although the functional effect of this repeat on the HbE chromosome still remains unclear, HbE may not have been such a common variant in Southeast Asia if the HbE mutation occurred on a chromosome with another repeat not affecting β-chain production.

We found the (AC)n polymorphism adjacent to the (AT)x(T)y repeat in the Thai population. The frequency of (AC)3(AT)7(T)5 was low in Thais, whereas it was 11.3% in Japanese (our unpublished data). Thus, the (AC)3(AT)7(T)5 allele is likely to have arisen in common ancestors of Asian populations. Because the (AC)n polymorphism seems to affect the binding affinity of BP1 (Elion et al. 1992), it may help reveal the functional significance of the (AC)n(AT)x(T)y repeat in the regulation of HBB expression by the BP1 repressor. Given the appreciable frequency of (AC)3(AT)7(T)5 in Japanese, future studies, at least in Asians, should genotype the whole (AC)n(AT)x(T)y repeat rather than (AT)x(T)y alone. The present PCR-SSCP analysis allowed us to distinguish seven haplotypes including the (AC)n(AT)x(T)y repeat. Furthermore, some additional Japanese haplotypes different from these seven could also be distinguished (data not shown). Thus, although it is necessary to initially perform direct PCR-based sequencing of the reference bands cut out of the SSCP gels, the PCR-SSCP method is very useful for rapidly and simply determining the phase or pair of haplotypes in a subject.

The BP1 binding site has been predicted to be part of a putative recombination hotspot located upstream of the HBB coding sequence (Chakravarti et al. 1984; Smith et al. 1998; Schneider et al. 2002; Wall et al. 2003). To examine whether the BP1 site is included in the hotspot, a larger sample group or population must be studied. The PCR-SSCP method presented here should be helpful for such a study because it can be used to determine the phase, or the combination of haplotypes, of each sample (Wall et al. 2003).