Variations in the frequencies of polymorphisms in the CYP2C9 gene in six major ethnicities of Pakistan

Genetic variation in cytochrome P450 (CYP) 2C9 is known to cause significant inter-individual differences in drug response and adverse effects. The frequencies of CYP2C9*2 and CYP2C9*3, both of which reduce the activity of the enzyme, are not known in the Pakistani population. Therefore, we screened various ethnic groups residing in Pakistan for these polymorphisms. A total of 467 healthy human volunteers were recruited from six major ethnicities of Pakistan after providing written informed consent. Our results indicate that about 20% of the Pakistani population has a genotype containing at least one low activity allele. Ethnic Punjabi and Pathan populations had the highest frequencies of wild type genotypes, while Urdu, Seraiki, and Sindhi populations showed higher rates of both low activity genotypes. The Baloch population showed the highest rates of low activity genotypes, with less than 50% of the samples showing wild type genotypes, suggesting that more than half of the Baloch population possesses low activity genotypes. The frequencies found in various ethnic groups in Pakistan were comparable with those of ethnicities in the South Asian region, except for the Baloch population. These results suggest that pharmacogenetic screening for low activity genotypes may be a helpful tool for clinicians prescribing medications metabolized by CYP2C9.


Scientific Reports (2020) 10:19370 | https://doi.org/10.1038/s41598-020-76366-x

at position 430 (c.430C > T, p.Arg144Cys). CYP2C9*3 causes an amino acid substitution from Ile to Leu as a result of a transversion, A > C, in the CYP2C9 gene (c.1075A > C, p.Ile359Leu) [7]. The activity of CYP2C9 is significantly reduced as a result of these polymorphisms, and in the Caucasian population, they are responsible for a majority of decreased CYP2C9 activity phenotypes [7]. Both of these variations decrease the rate of phenytoin hydroxylation [8]. Hydroxylation of S-warfarin is impaired by CYP2C9*2 [6], while the metabolism of tolbutamide is decreased by CYP2C9*3 [9]. Benzo[a]pyrene, an important lung carcinogen, is also metabolized by CYP2C9, and therefore, SNPs in this gene may also carry a risk of lung cancer [10]. Thus, these SNPs not only affect drug response and adverse effects but are also associated with certain disease phenotypes. Interpopulation differences in drug responses are well known, and in some cases, they correspond to differences in the frequency of associated genetic markers, especially CYP genes. For this reason, differences in CYP2C9 allele distribution have been described for various populations. Pakistan is a culturally diverse country, but little is known about the distribution of CYP2C9 genetic polymorphisms in this country of over 200 million people. Therefore, we set out to determine the frequencies of these polymorphisms in the Pakistani population, with samples drawn from six of its most populous ethnic groups. We examined the frequencies of CYP2C9*1, CYP2C9*2, and CYP2C9*3 in samples from these ethnic populations and then compared them with previous findings in other populations.
Comparison with worldwide populations. Comparison with worldwide and regional populations revealed significant differences in the frequencies of CYP2C9*2. Colombian, Puerto Rican, Spanish, and Italian populations showed significantly higher frequencies, while Han Chinese, Bengali, and Indian Telugu populations displayed significantly lower frequencies of CYP2C9*2. Frequencies of this allele in Mexican, Peruvian, Finnish, British, Gujarati Indian, and Sri Lankan Tamil populations were not statistically different from Pakistani frequencies (Table 3). Allelic frequencies of CYP2C9*3 in Peruvian and Chinese Dai populations were lower than in the Pakistani population, while Bengali and Gujarati Indian populations showed significantly higher frequencies (Table 4). CYP2C9*3 frequencies in Colombian, Mexican, Puerto Rican, Han and Southern Han Chinese, Japanese, Vietnamese, Finnish, British, Spanish, Italian, Sri Lankan Tamil, and Indian Telugu populations were not statistically different from the Pakistani frequencies observed in our investigation (Table 4). Frequencies of CYP2C9*2 and CYP2C9*3 in Pakistanis from Lahore were also in agreement with our study (

Discussion
Pakistan is one of the most populous countries in the world, with an estimated population of over 220 million people. Pakistan boasts a relatively young population that comes from diverse cultural and ethnic backgrounds.
Despite Pakistan being home to one of the biggest populations in the world, studies investigating genetic variations responsible for drug response in this population are scarce. There are several dozen ethnic groups in Pakistan. However, the six ethnicities we selected for our study represent more than 94% of the Pakistani population. The biggest ethnic group in Pakistan is the Punjabi, followed by the Pathan, Sindhi, Seraiki, Urdu, and Baloch ethnic groups. A geographical map indicating the regions where the selected ethnicities primarily reside and the distribution of CYP2C9 genetic frequencies in those ethnicities is shown in Fig. 1.
The allelic frequencies of CYP2C9*2 and CYP2C9*3 observed in the present study were found to agree with previously reported frequencies around the world (Tables 3 and 4). The frequency of CYP2C9*1 in Pakistan was closest to the one found in America. However, the frequency of CYP2C9*2 was higher in the American population, and CYP2C9*3 was slightly higher in the Pakistani population [11]. In South Asia, Bangladesh was found to […] (Tables 3 and 4). Frequencies of CYP2C9*2 were relatively high in the Pakistani population compared to many Asian populations, such as Japanese, Korean, and Chinese Taiwanese, in which this allele was absent [7,13,14]. Many regional populations, such as Indians and Sri Lankans, did show significant differences in frequencies of the CYP2C9*2 allele [15-18]. However, the Pakistani population displayed slightly lower frequencies of this allele compared to regional populations such as Bengalis and Gujarati Indians. Among the European populations, Swedish, Turkish [19-21], Spanish, and Italian populations had higher frequencies (Table 3), while Finnish and British populations, although displaying higher frequencies, were not statistically different from the Pakistani population. The frequencies of the CYP2C9*3 allele in the Pakistani population were found to be similar to those of many European populations, including British, Finnish, Spanish, and Italian (Table 4). However, Peruvian and Chinese Dai populations showed statistically lower frequencies. Frequencies of CYP2C9*3 in some regional populations, such as Bengalis and Gujarati Indians, were higher, while those in others, such as Indian Telugu and Sri Lankan Tamil, were in agreement with our results (Tables 3 and 4).
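The text does not state which test was used to decide whether an allele frequency in another population was "statistically different" from the Pakistani one. As a minimal sketch of one common choice, the Python snippet below runs a Pearson chi-squared test (1 degree of freedom, no continuity correction) on a 2x2 table of variant versus other allele counts. The function name and the counts are hypothetical and are not taken from the study or from Tables 3 and 4.

```python
import math

def chi2_two_populations(variant_a, total_a, variant_b, total_b):
    """Pearson chi-squared test (1 df, no continuity correction) on a 2x2 table
    of variant vs. other allele counts in two populations."""
    a, b = variant_a, total_a - variant_a
    c, d = variant_b, total_b - variant_b
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom the chi-squared survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical allele counts (chromosomes, i.e. 2 x individuals), not study data.
chi2, p = chi2_two_populations(10, 100, 30, 100)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Counts are in alleles rather than individuals because each person contributes two chromosomes. For small expected counts, an exact test would usually be preferred over this approximation.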
Among the different ethnicities, Punjabi and Pathan populations had the highest frequencies of the CYP2C9*1 allele, and their CYP2C9*2 allele frequencies were also in a similar range. However, CYP2C9*3 frequencies differed between these two ethnicities, with the Pathan population showing much greater frequencies than the Punjabi population. Urdu and Seraiki populations had slightly lower frequencies of the wild type allele compared to Punjabi and Pathan populations. However, the allelic frequency of CYP2C9*2 was higher in the Urdu speaking population, while that of CYP2C9*3 was higher in the Seraiki population. Baloch population samples showed results very different from those of any other ethnic population. The Baloch population had the lowest frequency of the wild type allele, while its frequency of CYP2C9*2 was the highest among Pakistani populations. Similarly, the Baloch frequencies of CYP2C9*3 were also the highest among Pakistani ethnicities. The pattern in the Sindhi population was similar to that in the Urdu and Seraiki populations.
In the analysis of genotype frequencies, Punjabi and Pathan population samples showed similar frequencies of the wild type genotype, CYP2C9*1*1. However, unlike Punjabi population samples, Pathan population samples lacked the CYP2C9*2*3 genotype (Table 2). Urdu and Seraiki population samples, although having similar frequencies of the wild type allele, had different frequencies of the wild type genotype. The genotype data imply that roughly 30% of the Urdu speaking population has a CYP2C9 genotype with at least one low activity allele. The same was true for the Sindhi population, in which the frequency of CYP2C9*1*1 was 70%. In the Baloch population, the wild type CYP2C9 genotype was found in only 46% of samples, which indicates that more than half of the population may possess at least one low activity allele (Table 2). This represents a significant fraction of the Baloch population with a potentially variable response and/or enhanced adverse effects when drugs metabolized by CYP2C9 are administered.
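The step from a wild type genotype frequency to a carrier fraction is a simple complement: everyone who is not CYP2C9*1*1 carries at least one low activity allele. The short Python sketch below works this out for the two wild type genotype frequencies quoted in the paragraph above (70% for Sindhi, 46% for Baloch). The function name is illustrative only.

```python
def carrier_fraction(wild_type_genotype_freq):
    """Fraction of individuals with at least one low activity allele,
    i.e. everyone whose genotype is not CYP2C9*1*1."""
    return 1.0 - wild_type_genotype_freq

# Wild type (CYP2C9*1*1) genotype frequencies as quoted in the text.
wild_type = {"Sindhi": 0.70, "Baloch": 0.46}

for ethnicity, freq in wild_type.items():
    print(f"{ethnicity}: {carrier_fraction(freq):.0%} carry at least one low activity allele")
```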
The Pakistani population is a heterogeneous mixture of Asian, Middle Eastern, and European populations, partly because of the Arab invasion of the eighth century and the British invasions of the eighteenth and nineteenth centuries, and partly owing to its high geographic and ethnic diversity [22]. The genetic structure of various Pakistani populations has been analyzed, and several distinct variants have been identified among different ethnicities by global projects such as the 1000 Genomes Project and the Human Genome Diversity Project [12,23]. Some studies indicate that the genetic structure of these ethnicities is closely related to both South Indian and European populations [24], while others suggest that Pakistani ethnicities are similar to European populations [25,26]. The extreme differences observed in the Balochi population may be due to their diverse ancestry, belonging to Aryan, Arab, Persian, Turkish, Kurdish, Dravidian, Sewais, and black African lineages [27]. Genetic information about a patient's CYP2C9 gene is likely to help physicians prescribe the most suitable and safest drug based on the patient's genetic make-up. With roughly 13% of clinically available drugs metabolized by the CYP2C9 enzyme [28] and over 2.6 billion unit doses of drugs dispensed in Pakistan annually, the number of unit doses metabolized by the CYP2C9 enzyme in Pakistan annually is over 332 million. Our study shows that about 20% of Pakistan's population has a CYP2C9 genotype that contains at least one low activity allele. These results indicate that over 66 million doses of drugs dispensed annually in Pakistan may not have the desired effects because the patients receiving them possess a low activity CYP2C9 allele. In patients receiving a drug that requires activation through CYP2C9, a lack of response could be expected. Conversely, if a drug is inactivated by CYP2C9, then increased frequency and severity of adverse effects would be a more likely outcome.
With CYP2C9 genotype information at hand, physicians will have a choice to change the drug or dose of the drug to provide maximum therapeutic benefit to the patient and/or prevent the undesired and excessive adverse effects.
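The dose estimate in the preceding paragraph can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes, as the estimate implicitly does, that doses are spread evenly between carriers and non-carriers, and the variable names are illustrative. Note that 13% of 2.6 billion is about 338 million, so the 332 million figure quoted in the text presumably comes from unrounded inputs.

```python
# Figures quoted in the text above.
cyp2c9_doses_per_year = 332e6          # unit doses metabolized by CYP2C9 each year
low_activity_carrier_fraction = 0.20   # share of people with at least one low activity allele

# Assumption: carriers receive doses at the same rate as everyone else.
affected_doses = cyp2c9_doses_per_year * low_activity_carrier_fraction
print(f"{affected_doses / 1e6:.1f} million doses per year go to low activity allele carriers")
```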
To our knowledge, this is the first study to report frequencies of CYP2C9 gene polymorphisms in various ethnicities of the Pakistani population. Although a few studies from Pakistan have reported frequencies of CYP2C9*2 and CYP2C9*3 [29-31], all of them involved patients with different diseases and were therefore unable to capture the actual frequency of these polymorphisms in the general Pakistani population. The frequencies reported in these publications are 5.1% for CYP2C9*2 and 15.4% for CYP2C9*3 in breast cancer patients [29], 4.45% for CYP2C9*2 and 22.8% for CYP2C9*3 in cardiovascular patients taking warfarin [31], and 12.1% for CYP2C9*2 and 14.1% for CYP2C9*3 in heart valve replacement patients taking warfarin [30]. These frequencies vary significantly from one study to another and also differ from the ones we have reported for the healthy Pakistani population in this study. Frequencies of both low activity alleles were significantly higher in these studies than in ours. For example, the frequency of the CYP2C9*3 allele was four times higher in one of these studies and more than twice as high in the other two. This may be because some polymorphisms are associated with certain diseases, and therefore, their frequencies in patient groups would differ from those in a normal healthy population. Large differences in sample size among these studies could also partly explain the variations observed in allelic frequencies. Another publication reporting CYP2C9 gene polymorphisms in the Pakistani population also enrolled heart valve replacement patients taking warfarin [32]. However, the allelic frequencies reported in that study were in agreement with ours, although the frequencies of CYP2C9*2 were slightly lower. That study was carried out in patients of Punjabi ethnicity only.
Furthermore, patient samples reported in these studies were obtained from a single geographical location and, therefore, may not represent all of Pakistan, which is a large country with a population of over 220 million people from varied ethnic backgrounds.
In conclusion, both the CYP2C9*2 and CYP2C9*3 allelic variants are found in the Pakistani population, and CYP2C9*3 was slightly more common than CYP2C9*2. One limitation of our study is that the genotyping method we employed could not identify the true CYP2C9*1 allele: individuals were genotyped as CYP2C9*1 when neither CYP2C9*2 nor CYP2C9*3 was detected. Most of the polymorphisms demonstrated in our study were heterozygous. No CYP2C9*3*3 homozygosity was seen in our study, and only 3 individuals (less than 1%) were homozygous for CYP2C9*2 (CYP2C9*2*2). This suggests that homozygosity for these polymorphisms is rare in the Pakistani population. The frequency of these polymorphisms differed only slightly among the ethnic populations of Pakistan, except for the Baloch population samples, which showed an unusually high frequency of these polymorphisms. We recommend genotyping of the CYP2C9 gene in patients on drugs such as warfarin and phenytoin, as it may help to avoid drug toxicity, choose the right alternative, and guide therapeutic drug monitoring.

Methods
The study was approved by the Institutional Review Board and Ethics Committee of Shifa Tameer-e-Millat University, Islamabad, Pakistan, through approval number IRB#990-265-2018. Informed written consent was obtained from all participating individuals. All experiments were performed in accordance with relevant guidelines and regulations. A total of 467 unrelated individuals from a healthy population were recruited for the present study. The study cohort consisted of six major ethnicities of Pakistan: Punjabi, Pathan, Sindhi, Balochi, Seraiki, and Urdu speaking. Ethnicity was self-reported. Five milliliters of venous blood was drawn into a sterile tube containing EDTA as an anti-coagulant and was stored at 4 °C. Genomic DNA was isolated using Gene Jet Genomic DNA extraction Kit (ThermoScientific) and was quantified using 1% agarose gel electrophoresis. Isolated genomic DNA was stored at -20 °C until further processing [33].
Genotyping. CYP2C9*2 and *3 were genotyped using ARMS-PCR (Amplification Refractory Mutation System-Polymerase Chain Reaction) with a pair of outer primers and a pair of inner primers as described previously [34]. PCR for both SNPs was performed in a single tube with a total reaction volume of 25 µl containing 12.5 µl of 2X Dream Taq Master mix (ThermoScientific), 0.5 pM of 2C9*2 wild type reverse primer, 1.5 pM of 2C9*2 mutant reverse primer, 3.0 pM of common forward primer, 1.0 pM of 2C9*3 wild type forward primer, 2.0 pM of 2C9*3 mutant forward primer, 3.0 pM of common reverse primer, and 3 µl of template DNA (20-50 ng/µl). The thermal profile was as follows: initial denaturation at 95 °C for 10 min, followed by 37 cycles of denaturation at 95 °C for 45 s, primer annealing at 58 °C for 45 s, and extension at 72 °C for 45 s, and a final extension at 72 °C for 7 min. For visualization, 12 µl of PCR product was directly loaded onto a 4% agarose gel.
The PCR products for 2C9*2 had 105 bp fragment for the wild type allele and 114 bp fragment for the mutant allele, whereas 2C9*3 had
Statistical analysis. Allelic data were compiled according to genotype, and allele frequencies were estimated from the observed numbers of each specific allele. The frequency of each allele and genotype in our samples is given together with the 95% confidence interval. The confidence interval for proportions was calculated using the formula CI = p ± (1.96 × SE), where SE = sqrt[p(1 - p)/n], p is the proportion, and n is the sample size. Chi-squared tests and p values were calculated using observed and expected frequencies as per the Hardy-Weinberg equation.
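The calculations described in this paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the interval is clipped to the range 0 to 1 (a detail the text does not mention), and the genotype counts in the usage lines are made up.

```python
import math

def wald_ci(p, n, z=1.96):
    """95% interval p ± z * SE with SE = sqrt(p(1 - p) / n), clipped to [0, 1]."""
    se = math.sqrt(p * (1 - p) / n)
    return max(0.0, p - z * se), min(1.0, p + z * se)

def allele_frequencies(genotype_counts):
    """genotype_counts maps (allele, allele) tuples to numbers of individuals."""
    totals = {}
    for (a, b), count in genotype_counts.items():
        totals[a] = totals.get(a, 0) + count
        totals[b] = totals.get(b, 0) + count
    n_alleles = sum(totals.values())  # two alleles per individual
    return {allele: c / n_alleles for allele, c in totals.items()}

def hardy_weinberg_chi2(genotype_counts):
    """Chi-squared statistic of observed genotype counts against
    Hardy-Weinberg expectations (p^2 for homozygotes, 2pq for heterozygotes)."""
    n = sum(genotype_counts.values())
    freqs = allele_frequencies(genotype_counts)
    alleles = sorted(freqs)
    chi2 = 0.0
    for i, a in enumerate(alleles):
        for b in alleles[i:]:
            if a == b:
                expected = n * freqs[a] ** 2
                observed = genotype_counts.get((a, a), 0)
            else:
                expected = n * 2 * freqs[a] * freqs[b]
                observed = genotype_counts.get((a, b), 0) + genotype_counts.get((b, a), 0)
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical genotype counts for illustration only, not study data.
counts = {("*1", "*1"): 80, ("*1", "*2"): 10, ("*1", "*3"): 8,
          ("*2", "*2"): 1, ("*2", "*3"): 1}
freqs = allele_frequencies(counts)
n_alleles = 2 * sum(counts.values())
for allele in sorted(freqs):
    low, high = wald_ci(freqs[allele], n_alleles)
    print(f"{allele}: {freqs[allele]:.3f} (95% CI {low:.3f} to {high:.3f})")
print(f"Hardy-Weinberg chi-squared: {hardy_weinberg_chi2(counts):.2f}")
```

With three alleles the statistic has three degrees of freedom, so a value above about 7.81 would indicate deviation from Hardy-Weinberg equilibrium at the 0.05 level. The simple interval used here becomes unreliable for proportions near 0 or 1, which is relevant for rare genotypes.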