Introduction

The concept of choosing the right medicine for the right person is not new. However, pharmacogenomic research has enabled us to predict adverse outcomes of administering a medication that would formerly have been judged generally safe and effective1. Owing to the initiative of the Clinical Pharmacogenetics Implementation Consortium and others2, many drugs in the USA are now dispensed with FDA-advised pharmacogenetic warning labels. A detailed list of pharmacogenetic markers is available online at the FDA website (www.fda.gov/Drugs/ScienceResearch/ucm572698.htm). Drug regulatory agencies such as the European EMA are following this lead. However, such data stem mainly from Western populations and may not be applicable to other parts of the world.

Genetic variability of drug metabolizing enzymes and drug transporters has been associated with interindividual differences in pharmacokinetics and pharmacodynamics. Such differences may result in variation in efficacy, safety and treatment outcomes for a number of frequently prescribed drugs3. A notable example is the Ashkenazi Jewish population, in which pharmacogenetic peculiarities with important therapeutic implications have been reported, such as a VKORC1 gene polymorphism necessitating warfarin dose adjustment4. Hence, genetic differences within as well as between ethnic groups are considered an important contributory factor to the variability of drug responses5. In this study, we characterized single nucleotide variants (SNVs) of select phase I enzymes (CYPs and ALDHs), phase II enzymes (GSTs, UGTs, TPMTs and NATs) and transporters involved in drug metabolism and transport in a population of 155 Karachiites in Pakistan, because no such studies have been reported for this population. Further, we compared the variant allele frequencies with those reported for major ethnic groups in the HapMap database and report the differences between our population and each of those representative groups.

It is estimated that 75–80% of prescribed drugs are metabolized by oxidizing phase I cytochrome P450 enzymes such as CYP3A4 and CYP3A5, CYP2D6, CYP2C19 or CYP2C9, with CYP3A4/5 metabolizing more than half of currently prescribed drugs6. In addition, phase II enzymes catalyze the conjugation of xenobiotic metabolites with various hydrophilic molecules to render them less toxic and more polar, thus favouring their excretion from the body. Such reactions are catalyzed by different enzyme groups, such as GSTs (glutathione S-transferases), UGTs (UDP-glucuronosyl transferases), TPMTs (thiopurine S-methyl transferases) and NATs (N-acetyl transferases)7. GSTs also support detoxification reactions8 and play an important role in preventing oxidative stress9.

Additionally, ALDHs (aldehyde dehydrogenases) are phase I xenobiotic metabolizing enzymes with diverse functions, such as neutralization of toxic aldehydes during lipid peroxidation10, coenzyme Q synthesis11, prevention of tobacco smoke-induced respiratory epithelial cytotoxicity12, and metabolism of cyclophosphamide8 and ethanol13, among others. Thus, altered ALDH function could predispose individuals to numerous medical conditions such as atherosclerosis, dementia, infertility and cancers. This becomes further complicated if the biological role shows a gene-dose effect, as we previously reported for ALDH3A18.

ATP binding cassette (ABC) transporters are a group of membrane transporters that carry many xenobiotics, including drugs, into and out of various cells. In fact, some of them were called multi-drug resistance proteins because of this role. Of notable interest are the ABCB1 and ABCC2 transporters. Their substrates include many drugs, such as anticancer drugs, HIV protease inhibitors, antibiotics, beta blockers, statins, anticonvulsants and opiates14.

As outlined above, ethnic differences exist in the prevalence of genetic variants of the enzymes and transporters15. Hence, genetic characterization of patients may prove valuable in predicting therapeutic outcomes16,17,18. We reported previously19 that significant differences exist in the frequencies of polymorphic genes involved in metabolism and cellular transport of breast cancer chemotherapy among breast cancer patients from Karachi, as compared to some ethnic groups reported in HapMap database (https://www.ncbi.nlm.nih.gov/snp). However, that study was limited because of the small sample size used and the absence of data from a healthy population.

To the best of our knowledge, there is no comprehensive study on this topic involving South Asia and adjoining regions. Hence, we designed this study to explore the genotype profiles among healthy adults from different ethnicities living in Karachi, allowing us to compare them with those reported for other major ethnic groups in the HapMap database.

Results

Table 1 shows the baseline characteristics of the study population. A total of 155 healthy Pakistani adults (98 females and 57 males) with a median age of 19 years (range: 18–70 years) were included in this study. Participants were from all districts of Karachi and belonged to various major ethnic groups within Pakistan. Ethnicity was classified according to mother tongue, including Balochi, Gujrati, Pashtun, Punjabi, Seraiki, Sindhi, and other minor groups. As expected, the local Urdu-speaking community with heterogeneous Indian ancestry, collectively described as Muhajir (Arabic/Urdu; immigrants), was the most represented group in our sample. Since consanguineous marriages are common in Pakistan20, we asked participants about parental consanguinity. Most individuals declared that their parents were not related to each other. Some participants, labelled as ‘mixed lineage’, had grandparents from different ethnicities.

Table 1 Baseline characteristics of study participants.

Table 2 shows the frequency distribution of SNVs and genotypes. Genotypes were in Hardy-Weinberg equilibrium. Some of the samples could not be genotyped completely, apparently due to low DNA quantity or quality. Haplotype and diplotype analyses were carried out where applicable. Table 2 shows that in our population the percent frequency of the wild type genotype was as follows:

(a) Phase I enzymes: CYP1A1 42% (heterozygous 46%; homozygous variant 12%), CYP2B6 20% (heterozygous 54%; homozygous variant 26%), CYP2C9 71% (heterozygous 26%), CYP2C19 27% (heterozygous 48%; homozygous variant 25%), CYP2D6 extensive metabolizers 74% (intermediate metabolizers 25%; poor metabolizers 1%), CYP3A4 98% (heterozygous 2%), CYP3A5 1% (heterozygous 38%; homozygous variant 61%), ALDH3A1 8% (heterozygous 50%; homozygous variant 42%).

(b) Phase II enzymes: GSTA1 49% (heterozygous 41%; homozygous variant 10%), GSTM1 null 59%.

(c) ABC transporters: ABCB1 wild type haplotype 10%, ABCC2 wild type haplotype 50%.

Table 2 Allele and diplotype frequencies of SNVs in drug metabolizing enzymes and ABC transporters (n = 155 healthy adults).
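The genotype percentages listed above can be converted into variant allele frequencies by simple allele counting. The following is a minimal sketch, assuming a biallelic locus (or a two-state haplotype, as for GSTA1) and complete genotype data; the function name `variant_allele_frequency` is illustrative and is not part of the study's analysis pipeline.

```python
def variant_allele_frequency(wild_pct, het_pct, hom_var_pct):
    """Return the variant allele frequency (0-1) from genotype percentages.

    Assumes a biallelic locus: heterozygotes carry one variant allele,
    homozygous variant individuals carry two.
    """
    total = wild_pct + het_pct + hom_var_pct
    if total <= 0:
        raise ValueError("genotype percentages must sum to a positive value")
    return (het_pct + 2 * hom_var_pct) / (2 * total)


# Genotype percentages quoted in the list above
aldh3a1 = variant_allele_frequency(8, 50, 42)
gsta1 = variant_allele_frequency(49, 41, 10)
print(f"ALDH3A1 variant allele frequency: {aldh3a1:.3f}")
print(f"GSTA1 variant allele frequency: {gsta1:.3f}")
```

Under these assumptions the two values come out at 0.670 and 0.305, in line with the 67% and 30.5% variant allele frequencies quoted for ALDH3A1 and GSTA1 in the Discussion.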

Table 3 and Fig. 1 show variant allele frequencies. This study compared the Pakistani population with major global ethnic groups (African of Yoruba Nigerian ancestry, Caucasian of Northern and Western European ancestry, Chinese of Han ancestry) as well as a subset from the neighbouring area of India (Gujrati Indians in Houston, Texas), all taken from the HapMap database. Variant allele frequencies were compared using Chi-square or Fisher exact tests. The results show that, as compared to ethnicities in the HapMap database, there were significant differences in the prevalence of variant alleles of (a) ALDH3A1, (b) CYP1A1*2A, CYP2B6*4, CYP2B6*6, CYP2C19*2, CYP3A5*3, (c) GSTA1, and (d) ABCB1 2677G > T/A and ABCC2 1249G > A in our population. The GSTM1 null genotype was more frequent than in other reported ethnicities. CYP2D6*3 was absent in our study sample. All other SNVs showed intermediate or similar prevalence of variant alleles as compared to other ethnicities.

Table 3 Comparison of variant allele frequency with other ethnic groups. The Chi square value was computed with df = 1.
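A Chi-square comparison with df = 1, as used for Table 3, can be sketched for a 2 x 2 table of variant and reference allele counts in two populations. This is an illustrative sketch only: it uses the Pearson statistic without continuity correction (whether the original analysis applied a correction is not stated), the function name `chi_square_2x2` is hypothetical, and the counts in the usage lines are invented for demonstration rather than taken from the study.

```python
import math


def chi_square_2x2(var_a, ref_a, var_b, ref_b):
    """Pearson chi-square (no continuity correction) for a 2x2 allele-count table.

    Rows are populations A and B; columns are variant and reference alleles.
    Returns (chi2, p_value) with df = 1.
    """
    table = [[var_a, ref_a], [var_b, ref_b]]
    row_totals = [var_a + ref_a, var_b + ref_b]
    col_totals = [var_a + var_b, ref_a + ref_b]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                raise ValueError("expected count of zero; use an exact test instead")
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For df = 1 the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts, for illustration only (not study data)
chi2, p = chi_square_2x2(20, 80, 40, 60)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Counting alleles rather than individuals doubles the sample size per person, so the test assumes the two alleles of each individual are independent, which is reasonable when genotypes are in Hardy-Weinberg equilibrium.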
Figure 1

Variant allele frequencies (percent) of drug metabolizing enzymes and ABC transporters in healthy Pakistanis as compared to the HapMap Database (http://www.ncbi.nlm.nih.gov/SNP/). KHI, Karachi Pakistan (current study); CHIN, Chinese of Han ancestry; CAUC, Caucasians of Northern and Western European ancestry; AFR, African of Yoruba Nigerian ancestry; GUJ, Gujrati Indian ancestry living in Houston, Texas, USA. Green highlighted row shows current study sample and yellow shaded areas show significant difference from KHI samples computed through chi-square or Fisher exact test. The missing values indicate absence of data in HapMap database for that particular SNV.
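When a cell count is very small, for example when a variant allele is absent from one sample, the Chi-square approximation becomes unreliable and Fisher's exact test is the usual alternative. The sketch below computes a two-sided p-value from the hypergeometric distribution using only the standard library. It is an illustration under stated assumptions: the function name `fisher_exact_two_sided` is hypothetical, the two-sided rule used (summing all tables no more probable than the observed one) is one common convention, and the counts in the usage lines are invented.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables as or less probable than the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical allele counts, for illustration only (not study data):
# population A: 0 variant and 310 reference alleles;
# population B: 5 variant and 221 reference alleles.
p = fisher_exact_two_sided(0, 310, 5, 221)
print(f"two-sided p = {p:.4f}")
```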

Discussion

This study is the first comprehensive pharmacogenetic report from Pakistan. Previously we had shown that the SNV prevalence of a select group of phase I and phase II drug metabolizing enzymes and ABC transporters in a breast cancer population sample differed significantly from various ethnicities in the HapMap database19. In this study, we again identified several important differences in allele and genotype frequencies compared to other populations. Interestingly, the differences were similar to those reported previously for the breast cancer population19, suggesting that real differences might exist. It is important to understand the implications of such differences in this population as compared to others as a first step towards precision medicine globally. For example, an altered gene function can lead to unfavourable therapeutic outcome(s) in acute care or chronic management of various disorders. Recent examples where gene variants necessitate adjusted dosing of a drug include clopidogrel in the case of CYP2C19 SNVs21, warfarin in the case of CYP2C9 or VKORC1 SNVs22 and tamoxifen in the case of CYP2D6 SNVs23.

This study has several strengths. Pakistan is a populous multi-ethnic country with more than 200 million inhabitants. We included people from various major ethnic groups including Urdu-speaking, Balochi, Gujrati, Pashtun, Punjabi, Seraiki and Sindhi. The presence of a substantial proportion of Urdu-speaking participants allows us to extend the relevance of our results to neighbouring India, which has a population of over 1300 million. Overall, their relevance could be extended to approximately 20–25% of the world’s population, which is historically underrepresented in pharmacogenomic studies. Hence, these results provide an important window to a largely unstudied population. Nevertheless, our study has some limitations. Approximately one third of our study population represented an inbred cohort due to consanguineous marriages, a widespread practice in this region. Further, our study cohort did not include sufficient numbers from other Pakistani ethnic groups, such as Baloch, Pashtun, Punjabi and Sindhi, for sub-group analysis and robust conclusions regarding these groups. Hence, we recommend replicating the study to target these groups across the country and region.

The following section discusses in depth the implications of various significant observations in our study sample, comparing it with other ethnicities documented in the HapMap database, including African, Caucasian, Chinese and Gujrati. For further research and analysis, detailed web-based accounts of substrates, inducers and inhibitors of various drug metabolizing enzymes, and of the updated clinical application of pharmacogenetics (CPIC guidelines), can be found at https://www.pharmgkb.org/, http://bioinformatics.charite.de/transformer/ and https://cpicpgx.org/guidelines/.

CYP1A1

Cytochrome P450 1A1 metabolizes xenobiotics such as polycyclic aromatic hydrocarbons (PAHs) found in tobacco smoke, atmospheric pollutants and industrial waste, and generates carcinogens from several substrates24. Hence, CYP1A1 is considered a link in gene-environment interactions in the etiology of various cancers, such as head and neck cancers among smokers25. Our sample shows a higher prevalence of variant alleles (58% of diplotypes carried at least one variant allele) than Caucasians, which could potentially confer an elevated disease risk, although this needs further validation. The difference in prevalence of SNVs could also affect the therapeutic outcome of drugs; the effect may be favourable, as in the case of the antineoplastic docetaxel26 or first-line antiepileptics27, but unfavourable for the antiemetic granisetron28.

CYP2B6

Our results show a significantly higher prevalence of the variant alleles CYP2B6*4 (48%) and CYP2B6*6 (36%) as compared to other ethnicities reported in the HapMap database. Both alleles are associated with lower CYP2B6 activity, with pharmacogenetic implications for many drugs including the antidepressant bupropion29, the antiretroviral efavirenz30, and the anti-tuberculosis rifamycins and ethionamide31, among others. Pakistan has a high prevalence of tuberculosis, and HIV prevalence is on the rise, especially in high-risk groups such as sex workers and intravenous drug users32. Hence, compromised CYP2B6 function in the population could lead to an elevated risk of side effects and drug-drug interactions. Further research is needed to evaluate the situation in this respect.

CYP2C9

This enzyme metabolizes many drugs, such as warfarin33, phenytoin34 and the non-steroidal anti-inflammatory drugs diclofenac and ibuprofen35. Our results show an intermediate prevalence of the CYP2C9*2 variant in KHI (6.8%) as compared to Caucasian (10.4%) and African (0%) populations in the HapMap database. However, the CYP2C9*3 variant is more frequent (9.93%) than in those population groups. Haplotype analysis suggests that approximately 30% of the population has some degree of compromised CYP2C9 function. The potential effects of this observation should be explored, especially for warfarin and phenytoin because of their narrow therapeutic index.

CYP2C19

While CYP2C19*2 leads to complete loss of function, CYP2C19*17 is associated with gain of function. Several widely used drugs, such as the antiplatelet clopidogrel36, the antifungal voriconazole37 and the antidepressant citalopram38, are metabolized by CYP2C19. Because of problems in efficacy and pharmacokinetics, the US FDA and other such agencies include pharmacogenetic information in some drug labels, such as that of clopidogrel, to optimize the use of these drugs. Our data show that only 27% of the Pakistani population had the normal phenotype (CYP2C19*1/*1). Thus, further studies are required to elucidate the pharmacogenetics in this population, especially regarding drugs used in acute emergencies, such as clopidogrel in acute coronary syndrome.

CYP2D6

This highly polymorphic enzyme is involved in the metabolism of more than 20% of drugs. Notable examples include the antidepressant paroxetine38, the SERM tamoxifen39, the antipsychotic clozapine40, and the adrenoceptor antagonists metoprolol and carvedilol, among others41. Our data show that the minor allele frequency was approximately 30%, whereas 26% of the population had genotypes associated with some degree of functional loss.

ALDH3A1

Aldehyde dehydrogenases are phase I metabolizing enzymes which exist as different isoenzymes. Our focus was ALDH3A1, which is involved in a broad spectrum of physiological activities, including the protection of oral and respiratory tract mucosa from damage caused by cigarette smoke12, food and air pollutants42, and ionizing radiation43. Additionally, it is involved in preventing ultraviolet light induced corneal damage44, detoxification of 4-HNE (4-hydroxynonenal; a by-product of lipid peroxidation)9, generation of NO from organic nitrates36, metabolism of oxazaphosphorines such as cyclophosphamide8, and synthesis of coenzyme Q11. Thus, ALDH3A1 takes part in drug metabolism and reduction of oxidative stress. We had previously shown that the prevalence of the ALDH3A1 (985C > G) variant allele differs significantly among various ethnicities in the HapMap database and was much more prevalent (62.5%) in Pakistani breast cancer patients, 40% of whom were homozygous for the variant allele19. In this study, we have shown that it is similarly prevalent in the healthy population (67% variant allele; 42% homozygous variant genotype). So far, however, there is a lack of concrete evidence that non-functional ALDH3A1 is associated with increased disease risk.

GSTA1

Glutathione S-transferase A1 is the most abundant form of GST in human liver, kidney, adrenal gland and testis, where it appears to scavenge electrophiles and reduce oxidative stress45. It also appears to regulate other functions. For example, a recent in vitro study suggested that GSTA1 may facilitate nicotine-induced lung cancer metastasis46. Another study suggested a role in the metabolism of the anticancer drug busulfan47. We had previously reported that loss of GSTA1 is a major determinant of neutropenia among breast cancer patients receiving standard dose FAC (5-fluorouracil, doxorubicin, cyclophosphamide) chemotherapy8. Our current results show that the prevalence of the variant allele is lower (30.5%) in the Pakistani population than in others in the HapMap database (range: 58.4–89.5%), though in absolute terms it is still high.

GSTM1

Glutathione S-transferase M1 is another GST believed to eliminate oxidative intermediates generated in the alimentary tract by dietary toxins. Evidence for the GSTM1 null genotype as a susceptibility factor for various carcinomas is conflicting, although a large meta-analysis comprising 198 studies revealed an association between lung cancer and the GSTM1 null genotype48. Other studies have suggested that the GSTM1 null genotype is associated with the pathogenesis of chronic obstructive pulmonary disease49, or with an increased likelihood of toxicity of cyclophosphamide50 and oxaliplatin51. Our results show a high prevalence of the putative “at risk” null genotype (59%). A recent study from Pakistan observed elevated levels of carcinogenic 1-hydroxypyrene in GSTM1 null carriers52, making further molecular epidemiological studies necessary in the Pakistani population.

ABCB1

The ATP-binding cassette transporter B1, also called MDR1 (multi-drug resistance protein 1) or P-gp (permeability glycoprotein), is a membrane transporter located at many interfaces in the body53. It actively transports various xenobiotics and toxins across cell membranes and has been implicated in antineoplastic drug resistance54. Certain drugs, such as amiodarone, clarithromycin, omeprazole, and calcium channel blockers, can inhibit this protein, leading to drug-drug interactions55. Our results show a prevalence of 54–65% for variant alleles of ABCB1 (1236C > T, 2677G > T/A, 3435C > T; rs1128503, rs2032582, rs1045642 respectively). These frequencies are not significantly different from those of most populations except the African population.

ABCC2

ATP-binding cassette transporter C2, also called MRP2 (multidrug resistance-associated protein 2) or CMOAT (canalicular multispecific organic anion transporter), is an active efflux transporter identified at apical or biliary canalicular surfaces of hepatocytes and in the kidney. There is mounting evidence that, by promoting efflux in target cells, this protein is involved in resistance to several drugs, such as antiepileptics56, antiretroviral drugs57, antineoplastic drugs58, and statins, among others59. Conversely, its decreased function may lead to increased drug toxicity. Our data (Tables 2 and 3) suggest that a substantial proportion of the population has diplotypes with some degree of functional deficit, with the prevalence of variant alleles ranging from 15% to 37%. Thus, the effects of this finding should be explored in terms of drug efficacy and toxicity.

In conclusion, this study showed that in our sample, compared with other ethnic populations, there was a generally higher prevalence (p < 0.05) of variant alleles of ALDH3A1, CYP1A1*2A, CYP2B6*4, CYP2B6*6, CYP2C19*2, CYP3A5*3, ABCB1 2677G > T/A and ABCC2 1249G > A. Further, the GSTM1 null genotype also had a higher frequency. There was a lower prevalence of variant alleles of GSTA1 and ABCC2 3972C > T as compared to other ethnicities. As mentioned above, these results are not significantly different from those we previously reported in Pakistani female breast cancer patients19, suggesting real differences between our sample and other ethnicities in the HapMap database, namely African, Caucasian, and Chinese. Hence, further research in other population cohorts within the country and the region would be beneficial for a more complete understanding of the pharmacogenetic landscape in a region which is underrepresented in genetic studies. This is an important step forward in achieving widespread and cost-effective implementation of personalized medicine in the community60.

Methods

The study was conducted at Jinnah Medical and Dental College (JMDC), Karachi, Pakistan from July 2013 to December 2015 after approval by The Ethics Committee of Jinnah Medical & Dental College, in accordance with relevant guidelines. The study cohort included students and employees of JMDC who were invited to be volunteers and gave written informed consent.

Saliva was the source of genomic DNA. The saliva samples were collected and stored in Oragene® DNA collection kits (DNA Genotek Inc., Canada) according to the manufacturer’s recommendations. DNA was extracted using the proprietary extraction kit provided with the collection kits. The extracted DNA was air-shipped to the Institute of Experimental and Clinical Pharmacology, Christian-Albrechts University, Kiel, Germany for genotyping.

For primers and experimental method details, see Supplementary Tables 1 and 2. Briefly, genotyping was performed by restriction fragment length polymorphism (RFLP) for CYP1A1 SNVs rs1048943 (g.3798C > T, *2A and *2B), rs1799814 (g.2454A > G, *2B) and rs4646903 (g.2452C > A, *4); for CYP2B6 SNVs rs2279343 (c.785A > G, *4), rs3211371 (c.1459C > T, *5) and rs3745274 (c.516G > T, *6); for CYP2C9 SNVs rs1799853 (430C > T, *2) and rs1057910 (1075A > C, *3); for CYP2C19*2 SNV rs4244285 (681G > A); for CYP2D6 SNVs rs5030655 (g.1707delT, *6), rs3892097 (g.1846C > T, *4), rs35742686 (g.2549delA, *3), rs5030656 (g.2615-g.2617delAAG, *9) and rs1065852 (g.100C > T, *10); and for CYP3A5 SNV rs776746 (g.6986G > A, *3). The CYP2D6 deletion (*5) and gene duplication could not be determined in the DNA retrieved from saliva specimens. All homozygous variant genotypes detected through RFLP were repeated to ensure accuracy.

A PSQ HS 96 (Qiagen, Hilden, Germany) was used for pyrosequencing (PSQ) of SNVs for GSTA1 −69C > T and −52G > A (rs3957357 and rs3957356 respectively; representing the GSTA1*A and GSTA1*B haplotypes), for ALDH3A1 (985C > G; rs2228100), and for CYP3A4 (15389C > T; rs35599367). Novel PSQ methods were established for CYP2C19*17, ABCB1 SNVs (rs1128503, g.1236C > T; rs2032582, g.2677G > T/A; rs1045642, g.3435C > T), and for ABCC2 (rs717620, g.−24C > T; rs2273697, g.1249G > A; rs3740066, g.3972C > T). For all the PCR reactions, a GeneAmp PCR 9700 Thermocycler (Applied Biosystems, Darmstadt, Germany) was used.

The data were analysed using SPSS® version 19.0 software (IBM, Ehningen, Germany). Results were recorded as frequencies and percentages, and 95% confidence intervals for proportions were calculated. All genotype frequencies were tested and found to be within Hardy-Weinberg equilibrium. Allele frequency data were compared using the χ2 or Fisher’s exact test where applicable. A p-value < 0.05 was considered significant.
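Two of the calculations named above, the Hardy-Weinberg equilibrium check and the 95% confidence interval for a proportion, can be sketched as follows. This is a minimal illustration and not a reproduction of the SPSS procedures used in the study: the function names are hypothetical, the equilibrium test is the standard goodness-of-fit Chi-square with df = 1 for a biallelic locus, and the interval is the simple Wald (normal approximation) interval, which is an assumption because the interval method is not specified. The genotype counts in the usage lines are invented.

```python
import math


def hardy_weinberg_chi_square(n_wild, n_het, n_hom_var):
    """Goodness-of-fit chi-square for Hardy-Weinberg equilibrium (biallelic, df = 1)."""
    n = n_wild + n_het + n_hom_var
    if n == 0:
        raise ValueError("no genotypes supplied")
    q = (n_het + 2 * n_hom_var) / (2 * n)  # variant allele frequency
    p = 1 - q                              # reference allele frequency
    observed = [n_wild, n_het, n_hom_var]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # df = 3 genotype classes - 1 - 1 estimated allele frequency = 1
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def wald_ci(successes, total, z=1.96):
    """Wald (normal approximation) confidence interval for a proportion."""
    prop = successes / total
    half_width = z * math.sqrt(prop * (1 - prop) / total)
    return max(0.0, prop - half_width), min(1.0, prop + half_width)


# Hypothetical genotype counts, for illustration only (not study data)
chi2, p_value = hardy_weinberg_chi_square(60, 70, 25)
print(f"HWE chi2 = {chi2:.3f}, p = {p_value:.3f}")
low, high = wald_ci(25, 155)
print(f"95% CI for homozygous variant proportion: {low:.3f} to {high:.3f}")
```

A p-value above 0.05 in the first function is consistent with equilibrium. For proportions close to 0 or 1, such as a rare genotype, the Wald interval is known to perform poorly and a Wilson or exact interval would be a more cautious choice.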