Introduction

Polymorphisms of drug-metabolizing enzymes (DMEs) significantly affect the outcomes of treatment and are reportedly associated with risks of developing various diseases such as cancer and neurodegenerative disease (Bandman et al. 2000; Hein 2006; Lilla et al. 2006; Wikman et al. 2001). An early example of polymorphism in a phase II DME is the enzyme arylamine N-acetyltransferase 2, which metabolizes several drugs, such as isoniazid, sulfamethazine, procainamide and dapsone, in addition to chemicals and carcinogens (Butcher et al. 2002; Gross et al. 1999).

This enzyme is encoded by NAT2, which is located on the short arm of chromosome 8 (8p21.3–23.1) and contains an 870-base pair (bp) intronless protein-coding region (Blum et al. 1990; Straka et al. 2006). Genetic polymorphisms in NAT2 have been described, and 36 alleles have been registered at http://www.louisville.edu/medschool/pharmacology/NAT.html.

NAT2 variants are also associated with drug-induced hepatotoxicity in tuberculosis and rheumatoid arthritis patients (Hiratsuka et al. 2002; Huang et al. 2002; Soejima et al. 2007). Impaired drug metabolism can lead to serious adverse drug reactions, including peripheral neuritis, fever and hepatic toxicity (Huang et al. 2002; Hiratsuka et al. 2002), whereas individuals with wild-type or rapid metabolism may need a higher dose for disease management (Kubota et al. 2007). Therefore, determination of acetylator status may be essential to optimize individual drug dosing, predict toxicity and improve clinical outcomes of therapy (Weinshilboum 2003).

N-acetylation activity has been investigated in various populations and has been classified into bimodal (rapid or slow) and trimodal (rapid, intermediate or slow) distributions (Hein et al. 2000; Kilbane et al. 1990; Kinzig-Schippers et al. 2005; Parkin et al. 1997). Genotype frequencies of NAT2 vary considerably among different populations (Evans et al. 2001a). The most frequent NAT2 allele in Asians is NAT2*4, compared to NAT2*5 in Caucasian and African populations (Bell et al. 1993; Lin et al. 1993, 1994). The frequency of the slow acetylator phenotype is low in Asian populations (5–30%), but high in African and Caucasian populations (40–90%) (Butcher et al. 2002; Lee et al. 2002; Woolhouse et al. 1997).

The molecular mechanisms underlying the slow phenotype have been reviewed (Butcher et al. 2002). Studies in bacterial and yeast expression systems have shown that generally, nucleotide substitution in the coding region of NAT2 results in low activity, decreased expression and unstable proteins with the slow acetylator allele (Blum et al. 1991; Hein et al. 1994; Leff et al. 1999).

Genotype/phenotype relationships in NAT2 have been extensively studied, but the relationship between NAT2 genotype and phenotype is still not fully understood. Although most studies showed high concordance between genotypes and phenotypes, none showed 100% concordance. NAT2*4, NAT2*12A and NAT2*13 are related to rapid acetylator phenotypes, whereas NAT2*5, NAT2*6 and NAT2*7 are related to slow acetylator phenotypes (Butcher et al. 2002; Cascorbi et al. 1995; Hickman and Sim 1991; Rihs et al. 2007; Woolhouse et al. 1997). The highest discordance (86%) between genotypes and phenotypes was reported in the Hmong population (Southeast Asian immigrants from Laos since 1975) in Minnesota (Straka et al. 2006). Diet, physiological conditions, bacterial or viral infections affecting the metabolic activity of NAT2, and polymorphisms in the non-coding regulatory regions of NAT2 were proposed as potential sources of this discordance (Straka et al. 2006).

Indonesia is an archipelagic country located in Southeast Asia, with more than 230 million inhabitants according to 2006 estimates. The major Indonesian ethnic groups are the Javanese and Sundanese, who together account for 59% of the population.

Most studies of NAT2 have been based on genotyping nucleotide variants in the coding region and have ignored the potential role of promoter polymorphisms in determining NAT2 phenotypes. In addition, the functional elements of the promoter region and promoter activity have not been defined. Moreover, no report has sufficiently examined NAT2 polymorphisms in the Indonesian population. The present study therefore identified NAT2 polymorphisms in both the promoter and coding regions in the Indonesian population and analyzed the haplotypes in both regions and the relationships among them.

Materials and methods

Subjects

A group of 212 unrelated healthy Indonesian subjects (96 men, 116 women; mean age, 33.28 years; range, 18–68 years) from the general population was subjected to NAT2 genotyping. Subjects were interviewed about their ethnic background over the three preceding generations, and only those of Javanese or Sundanese ethnicity were recruited for the study. Written informed consent in the Indonesian language was obtained from each subject prior to enrollment in this study. All study protocols were approved by the research ethics committees of Yarsi University, Jakarta, Indonesia and the Faculty of Medicine, University of Tokyo, Japan.

DNA extraction

Peripheral blood was collected into ethylenediamine tetraacetic acid (EDTA) containing tubes, and DNA was extracted from whole blood samples using the QIAamp DNA blood mini kit (Qiagen, Germany).

NAT2 genotyping

PCR was applied to amplify the promoter and coding regions of NAT2. The coding region and 2 kb of the promoter region were determined based on database searches (http://genome.ucsc.edu/) and a previous report (Husain et al. 2007).

Direct sequencing was used for variation screening using 48 samples to detect all NAT2 polymorphisms/variations and for genotyping of the rest of the samples. The forward primer X2F and reverse primer X2R were used to amplify the coding region, and forward primers P1F and P2F and reverse primers P1R and P2R were used to amplify promoter regions a and b, respectively (Table 1, Fig. 1). These specific primers were designed using Primer3 free software (http://frodo.wi.mit.edu/) and by considering homology of the promoter regions of NAT1, NATP and NAT2. For direct sequencing of the coding region, we used reverse primers X2aR, X2bR, X2cR and X2R for variation screening, and finally only X2bR and X2cR primers were used to genotype all remaining samples, whereas in the promoter region, P1aR, P1bR, P2aR and P2R primers were used (Table 1, Fig. 1).

Table 1 Primers used for amplification and direct sequencing of NAT2
Fig. 1 Gene structure, polymorphic sites and positions of all primers and PCR products in promoter and exon 2 regions of NAT2. a Open box indicates exon 1 (101 bp), shaded box indicates exon 2 (1,216 bp). b Arrows a–p indicate primer positions. c Promoter region (2 kb) was determined from the transcription start site according to Husain et al. 2007. d Dashed lines show PCR products. Pa and Pb represent PCR products in promoter regions a and b, respectively

To amplify the coding region, 25 μl of reaction mixture including 5 ng of genomic DNA, 5 pmol of primers, 2 mM of each dNTP, 5 μl of 10× PCR buffer containing 2 mM MgCl2, and 1 U of FastStart Taq DNA Polymerase (Roche Applied Science, Mannheim, Germany) was used. Thermal cycling parameters were as follows: initial denaturation for 4 min at 96°C, 40 cycles of 30 s at 96°C, 30 s at 50°C and 2 min at 72°C, ending with final elongation for 5 min at 72°C. Afterwards, the NAT2 PCR product was verified by electrophoresis on a 2% agarose gel prepared in TAE buffer (40 mM Tris–acetate, 1 mM ethylenediamine tetraacetate; pH 8.0), stained with ethidium bromide and inspected under ultraviolet light. To amplify the promoter region, the same method as described for the coding region was applied with an annealing temperature of 55°C. Direct sequencing was performed using a commercial kit (BigDye Terminator ver 3.1, Applied Biosystems) with an automated sequencer (ABI 3100). Allele-specific PCR was applied using ASX2F1, ASX2F2 and ASX2R primers to confirm the inferred haplotype in the coding region using the same method as described for coding region amplification.

Statistical analysis

Data were analyzed using the χ² test to determine whether genotype frequencies were in Hardy–Weinberg equilibrium. Values of p < 0.05 were considered statistically significant. Linkage disequilibrium (LD) between alleles at two different loci was evaluated, and haplotype frequencies were inferred, using the Haploview 3.32 free software (http://www.broad.mit.edu/mpg/haploview/index.php) and SNPAlyze 3.1 software (Dynacom, Tokyo, Japan) based on the expectation-maximization (EM) algorithm.
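The Hardy–Weinberg test described above can be illustrated with a minimal sketch. It is not the software used in this study: the function name hwe_chi_square and the genotype counts are hypothetical, and the p-value assumes a χ² distribution with 1 degree of freedom for a biallelic site.

```python
import math


def hwe_chi_square(n_major_hom, n_het, n_minor_hom):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts at one biallelic site and returns
    (chi-square statistic, p-value with 1 degree of freedom).
    """
    n = n_major_hom + n_het + n_minor_hom
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Allele frequencies estimated from the genotype counts.
    p = (2 * n_major_hom + n_het) / (2 * n)
    q = 1.0 - p
    observed = (n_major_hom, n_het, n_minor_hom)
    # Expected counts under Hardy-Weinberg proportions p^2, 2pq, q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for illustration only (not data from this study).
chi2, p_value = hwe_chi_square(25, 50, 25)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

When the expected count of a genotype class is very small, as for a rare variant, the χ² approximation is unreliable and an exact test is generally preferable.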

Results

In total, 23 polymorphisms/variations were found in the promoter and coding regions of NAT2 in the Indonesian population samples. In the promoter region, eight single nucleotide polymorphisms (SNPs) with minor allele frequencies of >10%, seven infrequent single nucleotide substitutions, and two insertion polymorphisms were identified, and these included six polymorphisms/variations and two insertions that were newly described. In contrast, the coding region displayed six polymorphic sites with minor allele frequencies >9%, all of which had already been registered in the database (Table 2; Fig. 1). Observed genotype frequencies of all polymorphisms/variations found in this study were consistent with Hardy–Weinberg equilibrium except for the SNP rs4646245, which deviated slightly because only one minor-allele homozygote and one heterozygote were found.

Table 2 Allele frequencies of observed polymorphisms/variations in promoter and exon 2 regions of NAT2

Pairwise |D′| and r² values and haplotype frequencies were estimated for polymorphisms with minor allele frequencies ≥5% (Table 3). Of the 23 polymorphisms found, 16 were selected for analysis. High |D′| values were observed throughout the entire region of the gene, and no clear boundary of LD structure was identified between promoter and exon 2 regions.
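For readers unfamiliar with the two LD measures, the following is a minimal sketch of how |D′| and r² are conventionally computed for one pair of biallelic sites from haplotype and allele frequencies. The function name pairwise_ld and the input frequencies are hypothetical; the sketch uses the standard definitions and is not the implementation in Haploview or SNPAlyze.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (|D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D is normalized by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


# Hypothetical frequencies for illustration only (not data from this study):
# the rarer allele A always occurs together with the more common allele B.
d_prime, r2 = pairwise_ld(p_ab=0.1, p_a=0.1, p_b=0.4)
print(f"|D'| = {d_prime:.2f}, r2 = {r2:.2f}")
```

The example shows why the two measures can differ: |D′| reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two sites also have identical allele frequencies.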

Table 3 Pairwise LD values between the 16 polymorphisms in the promoter and coding regions of NAT2. The upper triangle shows pairwise |D′| and the lower triangle shows pairwise r². Each value is shaded according to the strength of the LD parameter. Numbers 4–23 represent IDs of polymorphisms used in the present study

Table 4 shows the haplotypes in the promoter and coding regions; 13 different combined haplotypes were identified in this study. In the promoter region, seven haplotypes composed of ten polymorphisms with minor allele frequency ≥10% were inferred and designated U1, U2, U3, U4, U5, U6 and U7. The most frequent were U1, U2, U3 and U4, with frequencies of 38.9, 33.5, 16 and 9.7%, respectively. In the coding region, six haplotypes composed of six polymorphisms with minor allele frequency ≥10% were inferred and designated NAT2*4, NAT2*6A, NAT2*7B, NAT2*5B, NAT2*12A and NAT2*13 according to conventional NAT2 nomenclature (http://louisville.edu/medschool/pharmacology/NAT.html). When both haplotype groups were analyzed together, 13 combined haplotypes were inferred, accounting for 99.8% of all haplotypes estimated in the studied population. NAT2*4.U2 and NAT2*6A.U1 were predominant with frequencies >30%, followed by NAT2*7B.U3 and NAT2*5B.U4 with frequencies >8%. These four most frequent combined haplotypes together comprised 93.9% of all haplotypes.

Table 4 Frequencies of promoter haplotypes, coding region haplotypes and combined haplotypes of NAT2 in the Indonesian population

NAT2 genotypes were determined from both NAT2 alleles according to conventional NAT2 nomenclature. Frequencies of NAT2 genotypes and of predicted phenotypes, based on both bimodal and trimodal distributions, are summarized in Table 5.

Table 5 Frequencies of NAT2 genotypes and predicted phenotypes in the Indonesian population

Discussion

This is the first complete report of NAT2 genotyping in the Indonesian population. In addition, to the best of our knowledge, this is also the first study to report combined haplotypes of the promoter and coding regions of NAT2.

A total of eight new polymorphisms/variations were found in the promoter region and registered in NCBI dbSNP. Using the EM algorithm, 13 different combined haplotypes comprising six different haplotypes (NAT2*4, *5B, *6A, *7B, *12A, *13) in the coding region and seven different haplotypes (U1, U2, U3, U4, U5, U6, U7) in the promoter region were inferred (Table 4). Only a limited number of allele combinations (haplotypes) were observed in the promoter region, owing to LD among promoter polymorphisms. In addition, most polymorphisms in the promoter and coding regions of NAT2 showed maximal LD with |D′| = 1. Those polymorphisms seem to have rarely been separated by recombination or recurrent mutation during the history of the Indonesian population. Several pairs of polymorphisms in the promoter and coding regions also showed complete LD with an r² value of 1. These results provide useful information for further studies to simplify the genotyping process for NAT2 in the Indonesian population.
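The idea behind EM-based haplotype inference can be shown with a deliberately simplified sketch for two biallelic sites, where the only phase-ambiguous genotype is the double heterozygote. This is an illustration under stated assumptions, not the multi-locus procedure implemented in Haploview or SNPAlyze; the function name em_two_locus, the genotype coding (number of minor alleles, 0 to 2, at each site) and the example data are all hypothetical.

```python
from collections import Counter


def em_two_locus(genotypes, n_iter=200, tol=1e-10):
    """Estimate two-locus haplotype frequencies from unphased genotypes by EM.

    genotypes: list of (g1, g2), the minor-allele counts (0, 1 or 2) at each locus.
    Returns a dict of frequencies for haplotypes "00", "01", "10", "11"
    (first character = allele at locus 1, second = allele at locus 2).
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")
    if any(g not in (0, 1, 2) for pair in genotypes for g in pair):
        raise ValueError("genotypes must be coded 0, 1 or 2")
    counts = Counter(genotypes)
    n_dh = counts.pop((1, 1), 0)  # double heterozygotes: phase is ambiguous
    base = {"00": 0.0, "01": 0.0, "10": 0.0, "11": 0.0}
    for (g1, g2), n in counts.items():
        # At most one locus is heterozygous here, so the two haplotypes are known.
        a1 = "1" * g1 + "0" * (2 - g1)
        a2 = "1" * g2 + "0" * (2 - g2)
        for hap in (a1[0] + a2[0], a1[1] + a2[1]):
            base[hap] += n
    total = 2 * len(genotypes)
    freqs = {hap: 0.25 for hap in base}
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote carries 00/11 rather than 01/10.
        cis = freqs["00"] * freqs["11"]
        trans = freqs["01"] * freqs["10"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        expected = dict(base)
        expected["00"] += n_dh * w
        expected["11"] += n_dh * w
        expected["01"] += n_dh * (1 - w)
        expected["10"] += n_dh * (1 - w)
        # M-step: update frequencies from the expected haplotype counts.
        new = {hap: c / total for hap, c in expected.items()}
        delta = max(abs(new[hap] - freqs[hap]) for hap in new)
        freqs = new
        if delta < tol:
            break
    return freqs


# Hypothetical genotypes for illustration only (not data from this study).
sample = [(0, 0)] * 10 + [(1, 1)] * 10 + [(2, 2)] * 5
print(em_two_locus(sample))
```

In this toy data set the two sites never appear in discordant genotypes, so the estimated frequencies of the "01" and "10" haplotypes shrink towards zero, which mirrors the limited number of allele combinations seen under strong LD.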

Haplotypes in the promoter region showed limited combinations with haplotypes in the coding region. Based on this result, we propose a new nomenclature for NAT2 haplotypes, as described in Table 4. Interestingly, the same haplotype in the promoter region could be associated with different haplotypes in the coding region, and different haplotypes in the promoter region could be associated with the same haplotype in the coding region. The common haplotype NAT2*4 was associated with haplotypes U1, U2, U3, U5, U6 and U7 in the promoter region, NAT2*7B with haplotypes U2 and U3, and NAT2*5B with haplotypes U1 and U4 (Table 4). Conversely, haplotype U1 was associated with NAT2*4, NAT2*6A, NAT2*5B and NAT2*13, haplotype U2 with NAT2*4 and NAT2*7B, haplotype U3 with NAT2*4 and NAT2*7B, and haplotype U4 with NAT2*5B and NAT2*12A. Each of the four major coding-region haplotypes NAT2*4, NAT2*6A, NAT2*7B and NAT2*5B was predominantly associated with one promoter region haplotype, U2, U1, U3 and U4, respectively, together accounting for 93.9% of all haplotypes, whereas the other combined haplotypes represented 5.9%. Accordingly, one important question for future work is whether different haplotypes in the promoter region display different levels of promoter activity, leading to different phenotypes for the same coding-region haplotype.

Frequencies of NAT2 alleles, genotypes and predicted phenotypes based on polymorphisms in the coding region are summarized in Table 5. Major alleles found in the Indonesian population were NAT2*4 (36.9%) as wild-type, NAT2*6A (36.8%), NAT2*7B (14.9%) and NAT2*5B (9.0%), whereas NAT2 alleles with frequencies <2% were NAT2*12A and NAT2*13. The last two alleles are also rarely found in other populations. The major NAT2 genotypes with frequencies >10% were, in descending order, NAT2*4/*6A (28.8%), NAT2*4/*4 (13.2%), NAT2*6A/*6A (12.7%), NAT2*4/*7B (12.3%) and NAT2*6A/*7B (10.8%). Each of the remaining genotypes had a frequency <10%.

Predicted phenotypes deduced from genotypes were classified into bimodal or trimodal distributions based on polymorphisms in the coding region. According to the trimodal distribution, phenotypes were predicted as slow, rapid or intermediate acetylator if a genotype comprised two slow alleles, two rapid alleles, or one slow and one rapid allele, respectively. When the bimodal distribution was considered, rapid acetylator (RA) status was predicted if the genotype comprised at least one rapid allele, and slow acetylator (SA) status was predicted if the genotype comprised two slow alleles. In the trimodal distribution, frequencies of predicted rapid, intermediate and slow acetylators were 13.6, 50.8 and 35.6%, respectively, while in the bimodal distribution, frequencies of predicted rapid and slow acetylators were 64.4 and 35.6%, respectively (Table 5).
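The classification rule described above can be written as a short sketch. The function name predict_phenotype and the set names are hypothetical; the assignment of alleles to the rapid and slow groups follows the Introduction (NAT2*4, NAT2*12A and NAT2*13 as rapid; NAT2*5, NAT2*6 and NAT2*7 alleles as slow) and is limited to the six alleles observed in this study.

```python
# Allele groups limited to the six coding-region haplotypes observed in this study.
RAPID_ALLELES = {"NAT2*4", "NAT2*12A", "NAT2*13"}
SLOW_ALLELES = {"NAT2*5B", "NAT2*6A", "NAT2*7B"}


def predict_phenotype(allele1, allele2, model="trimodal"):
    """Predict acetylator status from a NAT2 genotype.

    model="trimodal": rapid, intermediate or slow
    model="bimodal":  rapid (at least one rapid allele) or slow
    """
    if model not in ("trimodal", "bimodal"):
        raise ValueError(f"unknown model: {model}")
    for allele in (allele1, allele2):
        if allele not in RAPID_ALLELES and allele not in SLOW_ALLELES:
            raise ValueError(f"unclassified allele: {allele}")
    n_slow = sum(allele in SLOW_ALLELES for allele in (allele1, allele2))
    if n_slow == 2:
        return "slow"
    if model == "bimodal":
        return "rapid"
    return "rapid" if n_slow == 0 else "intermediate"


print(predict_phenotype("NAT2*4", "NAT2*6A"))             # trimodal call
print(predict_phenotype("NAT2*4", "NAT2*6A", "bimodal"))  # bimodal call
```

Under this rule the bimodal rapid class is simply the union of the trimodal rapid and intermediate classes, which is why 13.6% and 50.8% sum to the 64.4% reported for rapid acetylators in the bimodal distribution.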

As reported by Hiratsuka et al. (2002) and Huang et al. (2002), slow acetylators are prone to develop more severe hepatotoxicity than rapid acetylators. In contrast, Kubota et al. (2007) suggested that rapid acetylators should receive an isoniazid dose 1.5 times higher than currently recommended to achieve the proper effect. Because the prevalence of slow acetylators was substantial (35.6%) in the Indonesian population, the probability of hepatic injury should be considered in tuberculosis patients on isoniazid treatment. On the other hand, the 13.6% of individuals who are rapid acetylators may be receiving a lower dose than they require.

The distribution of NAT2 alleles in various human populations is shown in Table 6. The frequency of predicted slow acetylators in our studied population resembles that in other Southeast Asian populations. Frequencies of slow acetylators in Indonesian and other Southeast Asian populations are higher than those in Northeast Asian populations including Chinese, Japanese and Koreans (Lee et al. 2002; Chen et al. 2006; Shishikura et al. 2000), but lower than those in Caucasians and Africans (Butcher et al. 2002; Loktionov et al. 2002; Patin et al. 2006).

Table 6 Distribution of NAT2 alleles in various human populations

The investigation of NAT2 polymorphisms and deduced acetylator status in this population may provide useful information for tuberculosis treatment policy, to prevent and control the incidence of drug-induced hepatotoxicity and related adverse effects of chemotherapy, particularly among slow acetylators. Conversely, rapid acetylator status might also affect response to antituberculosis treatment. Further study to investigate the most appropriate dose of antituberculosis treatment in this population is thus warranted.

Conclusion

Our data are in line with published data from other Southeast Asian populations. In addition, we examined polymorphisms in the promoter region of NAT2, which have not previously been reported. We suggest that the NAT2 phenotype associated with each promoter haplotype we found should be examined to provide more accurate NAT2 phenotype predictions than the conventional approach. The present results will also be helpful for future epidemiological or clinical studies and for understanding the genetic basis of acetylation polymorphisms in Indonesians. A new, more cost-effective method of genotyping NAT2 that identifies all mutations in the promoter and coding regions should be developed.