Whole genome sequencing of drug resistant Mycobacterium tuberculosis isolates from a high burden tuberculosis region of North West Pakistan


Tuberculosis (TB), caused by Mycobacterium tuberculosis bacteria, is a leading infectious cause of mortality worldwide, including in Pakistan. Drug resistant M. tuberculosis is an emerging threat for TB control, making it important to detect the underlying genetic mutations, and thereby inform treatment decision making and prevent transmission. Whole genome sequencing has emerged as the new diagnostic to reliably predict drug resistance within a clinically relevant time frame, and its deployment will have the greatest impact on TB control in highly endemic regions. To evaluate the mutations leading to drug resistance and to assess for evidence of the transmission of resistant strains, 81 M. tuberculosis samples from Khyber Pakhtunkhwa province (North West Pakistan) were subjected to whole genome sequencing and standard drug susceptibility testing for eleven anti-TB drugs. We found the majority of M. tuberculosis isolates were the CAS/Delhi strain-type (lineage 3; n = 57; 70.4%) and multi-drug resistant (MDR; n = 62; 76.5%). The most frequent resistance mutations were observed in the katG and rpoB genes, conferring resistance to isoniazid and rifampicin respectively. Mutations were also observed in genes conferring resistance to other first and second-line drugs, including in pncA (pyrazinamide), embB (ethambutol), gyrA (fluoroquinolones), rrs (aminoglycosides), rpsL, rrs and gid (streptomycin) loci. Whilst the majority of mutations have been reported in global datasets, we describe unreported putative resistance markers in katG, ethA (ethionamide), gyrA and gyrB (fluoroquinolones), and pncA. Analysis of the mutations revealed that acquisition of rifampicin resistance often preceded isoniazid in our isolates. We also observed a high proportion (17.6%) of pre-MDR isolates with fluoroquinolone resistance markers, potentially due to unregulated anti-TB drug use.
Our isolates were compared to previously sequenced strains from Pakistan in a combined phylogenetic tree analysis. The presence of lineage 2 was only observed in our isolates. Using a cut-off of less than ten genome-wide mutation differences between isolates, a transmission analysis revealed 18 M. tuberculosis isolates clustering within eight networks, thereby providing evidence of drug-resistant TB transmission in the Khyber Pakhtunkhwa province. Overall, we have demonstrated that drug-resistant TB isolates are circulating and transmitted in North West Pakistan. Further, we have shown the usefulness of whole genome sequencing as a diagnostic tool for characterizing M. tuberculosis isolates, which will assist future epidemiological studies and disease control activities in Pakistan.


Tuberculosis (TB), caused by Mycobacterium tuberculosis bacteria, is a global public health problem responsible for 10 million new cases and 1.6 million deaths worldwide in 20171. M. tuberculosis drug resistance is making disease control more difficult, with 490,000 TB cases identified as resistant to both rifampicin (RIF) and isoniazid (INH) (multi-drug resistant, “MDR-TB”) in 2017. Five countries (India, China, Indonesia, the Philippines and Pakistan) contribute more than half (56%) of the total TB global burden1. Pakistan has an estimated 518,000 TB cases each year, including ~15,000 MDR-TB patients. The estimated proportion of MDR-TB in Pakistan is ~4% in new cases and ~17% in patients who have previously been treated2. Khyber Pakhtunkhwa province (population size 35.5 million; 11.9% of the total population) is situated in North West Pakistan and shares a border with Afghanistan. The province contains the semi-autonomous federally administered tribal areas inhabited by the Pashtun people. The province has been affected by recent military action and accommodates the majority of the 1.4 million Afghan refugees currently in Pakistan. Khyber Pakhtunkhwa has an estimated 270 TB cases per 100,000 population2. Sputum smear microscopy is used as a primary screening test for the diagnosis of TB at local clinics, while GeneXpert MTB/RIF assays are employed for the rapid detection of rifampicin resistant TB at the district level3. Laboratory culture and drug susceptibility testing are available at the provincial level. Treatment of drug-susceptible TB lasts six months, while treatment of MDR-TB lasts nearly two years4. M. tuberculosis may become extensively drug resistant (XDR-TB), which is MDR-TB with additional resistance to fluoroquinolones (e.g. ofloxacin) and at least one of the second line injectable aminoglycoside drugs (e.g. kanamycin, amikacin or capreomycin)5. In Pakistan, of all TB cases, ~5% are MDR-TB, and of these, ~5% are XDR-TB6,7.

The global emergence and rise in the prevalence of MDR-TB and XDR-TB cases in the past decade has made it imperative to detect drug resistance rapidly and accurately8. Drug resistance in M. tuberculosis is almost exclusively due to mutations (including single nucleotide polymorphisms (SNPs), insertions and deletions (indels)) in genes coding for drug-targets or drug-converting enzymes9,10. Putative compensatory mechanisms have been described to overcome the fitness impairments that arise during the accumulation of resistance conferring mutations9,11. Efflux pumps are also thought to have a role in resistance9,12. M. tuberculosis culture and drug susceptibility testing is the gold standard technique, but this can take several weeks. Molecular tests, such as GeneXpert and line probe assays, can be used to detect M. tuberculosis directly from clinical samples and identify some mutations underlying MDR-TB. Whole genome sequencing (WGS) provides higher resolution9, and can be used to identify SNPs and indels in loci linked to drug resistance13. Known and putative markers of drug resistance have been identified using phylogenetic tree-based and genome-wide association study approaches9. Libraries of informative resistance mutations are leading to the development of informatic tools to rapidly profile samples for their drug susceptibility to aid clinical decision making10,14.

The M. tuberculosis complex has seven lineages that are endemic in different locations around the globe, leading to the hypothesis that the strain-types are specifically adapted to people of different genetic backgrounds15,16. The lineages vary in their geographic distribution and spread, with lineage 2 being particularly mobile with evidence of recent spread from Asia to Europe and Africa15. Lineage 4 is common in Europe and southern Africa, with regions of high TB incidence and high levels of HIV co-infection. Lineage 3, which includes the Central Asian (CAS) strains, is common in South Asia. The lineages may vary in propensity to transmit and severity of disease16, but there is considerable inter-strain variation within lineages12. A set of SNPs has been identified that can be used to barcode sub-lineages15, leading to informatic tools that position sequenced samples within a global phylogeny17. Similarly, SNPs have been used to construct transmission networks, where samples from different individuals that have near identical genome sequences are most likely to be due to a transmission event. Analysis of genome-wide SNPs characterized in M. tuberculosis DNA sourced from a highly endemic TB region in Malawi has shown striking differences by lineage in the proportion of disease due to recent transmission and in transmissibility, highest in lineage 2 (East-Asian), and lowest in lineage 1 (Indo-Oceanic)18.

There have been few WGS TB studies in Pakistan, and none have focused on the tribal and migrant populations of the North West. One study characterized drug resistance mutations across 42 XDR-TB isolates from the Aga Khan University (Karachi) strain bank (years 2004–2009), which were sourced from 4 provinces (Sindh (21), Punjab (16), Khyber Pakhtunkhwa (4), Baluchistan (1))19. These isolates were predominantly CAS lineage 3 strains19, in keeping with previous genotyping-based studies20. The Karachi study found that most rifampicin resistance was attributable to SNPs in the rpoB hot-spot region, and isoniazid resistance was most commonly associated with the katG (codon 315) and inhA (S94A) mutations. Beyond MDR-TB, the study found that only 43% of pyrazinamide resistance could be explained by pncA SNPs, fluoroquinolone resistance was mostly explained by gyrA (91–94 codon) mutations, and resistance to aminoglycoside injectables was associated with rrs mutations. The concordance between phenotypic and genotypic testing was highest for rifampicin and isoniazid (>90%), and lowest for pyrazinamide (<50%)19. Follow-up work with the XDR-TB isolates revealed SNPs in efflux pump genes, which may influence drug resistance12. In our study, we performed WGS on 81 drug resistant M. tuberculosis isolates from the Khyber Pakhtunkhwa province, which is endemic for TB across its tribal and migrant populations, but where public health surveillance systems are not strong. We characterize the underlying M. tuberculosis resistance mutations, identifying novel drug-resistance conferring mutations and, in a combined Pakistan WGS data analysis, reveal potential MDR-TB transmission chains. Our methods and findings will assist future WGS and drug resistance mapping studies and inform disease control efforts in Pakistan and neighbouring Afghanistan.

Materials and Methods

The samples and whole genome sequencing

A total of 81 predominantly drug resistant isolates were randomly selected from 8,220 archived M. tuberculosis samples collected between June 2016 and June 2017 at the Provincial TB Reference Laboratory, Hayatabad Medical Complex Peshawar, Khyber Pakhtunkhwa province of Pakistan. Demographic data (e.g. age, sex) were collected from each TB patient that contributed sputum, alongside drug regimen and treatment outcome. The sputum samples were digested and decontaminated using the N-acetyl-L-cysteine sodium hydroxide (NALC-NaOH) method, and the M. tuberculosis cultured in modified 7H9 Middlebrook media in the Mycobacterium Growth Indicator Tube (MGIT) system. Positive cultures were confirmed using the BD MGIT TBc identification (TBc ID) or Capilia chromatographic tests21. All laboratory work involving the culture of live bacteria (from sputum) was performed under category level 3 bio-containment facilities and protocols. DNA samples were extracted using the CTAB method22. Before sequencing, the DNA was RNase-treated, quantified and quality assessed by NanoDrop One spectrophotometer and Qubit 2.0 fluorometer using the Qubit dsDNA BR Assay Kit (ThermoFisher Scientific). The samples were sequenced on the Illumina MiSeq and HiSeq2000 platforms using 200 bp paired end runs at the London School of Hygiene and Tropical Medicine and Genome Institute of Singapore genomic facilities. The confirmed M. tuberculosis MGIT cultured isolates underwent standard drug susceptibility testing against isoniazid (critical concentration 0.1 μg/ml), rifampicin (1.0 μg/ml), ethambutol (5.0 μg/ml), streptomycin (1.0 μg/ml), moxifloxacin (2.5 μg/ml), amikacin (1.0 μg/ml), kanamycin (2.5 μg/ml), capreomycin (2.5 μg/ml) and ofloxacin (2.0 μg/ml). Pyrazinamide susceptibility testing was performed using an established protocol23.

Bioinformatic analysis

Sequence reads were inspected using fastQC as a primary assessment of data quality. The reads were trimmed using trimmomatic software24 to remove low quality sequences, and then mapped against the H37Rv reference genome (AL123456) using the BWA-mem alignment package25. SNPs were called using the BCF/VCF software suite26, and those in non-unique regions of the genome (e.g. ppe genes) were excluded. SNPs were converted into a FASTA format alignment, which was used as input to RAxML (v8.0.0) software27 to reconstruct a phylogenetic tree. The tree was annotated and visualized using iTOL28. Drug resistance profiles and lineages were predicted in-silico using TB-Profiler (v2.4)10,14, using a library of established mutations. SpolPred software29 was used to predict spoligotypes in-silico.
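The step in which SNP calls are converted into a FASTA alignment for tree building can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function name `snps_to_fasta`, the dictionary-based input layout and the toy positions and bases are all assumptions made for illustration, and real data would normally be read from VCF files.

```python
from typing import Dict, List


def snps_to_fasta(calls: Dict[str, Dict[int, str]], reference: Dict[int, str]) -> str:
    """Build a FASTA alignment that contains variable sites only.

    calls:     sample name -> {genome position: alternate base} (hypothetical layout)
    reference: genome position -> reference base, for every position seen in calls
    """
    # Variable sites are the union of positions called in any sample.
    positions: List[int] = sorted({pos for sample in calls.values() for pos in sample})
    records = []
    for name in sorted(calls):
        # Use the alternate base where one was called, else the reference base.
        seq = "".join(calls[name].get(pos, reference[pos]) for pos in positions)
        records.append(f">{name}\n{seq}")
    return "\n".join(records) + "\n"


# Toy data (invented positions and bases, for illustration only).
example_calls = {
    "isolate_A": {200: "T", 300: "G"},
    "isolate_B": {300: "G"},
    "isolate_C": {100: "C"},
}
example_reference = {100: "A", 200: "C", 300: "C"}
alignment = snps_to_fasta(example_calls, example_reference)
print(alignment)
```

Restricting the alignment to variable sites keeps the input to the tree-building software small; positions in excluded regions (e.g. ppe genes) would simply be dropped from `calls` before this step.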

Ethical approval

This study was approved by the ethical committees of the Kohat University of Science and Technology, Kohat, and the Provincial TB Reference Lab, Hayatabad Medical Complex Peshawar, Pakistan. Informed consent was given by all patients who contributed sputum.


Clinical isolates and phylogeny

The 81 M. tuberculosis isolates were predominantly sourced from TB hospitals and clinics in Peshawar, the largest city of Khyber Pakhtunkhwa province (Table 1; Supplementary Fig. 1). The patients contributing M. tuberculosis samples had a median age of 28, and there was gender parity (male: n = 41, 50.6%). In-silico predictions of resistance across the eleven anti-TB drugs revealed that the majority were MDR-TB (n = 62; 76.5%), with the others being pan-susceptible (n = 1; 1.2%), XDR-TB (n = 5; 6.2%) and non-MDR-/XDR-TB drug-resistant (n = 13; 16.0%). There were high levels of rifampicin (93.8%), isoniazid (84.0%), ethambutol (75.3%) and fluoroquinolone (63.0%) drug resistance and low levels of aminoglycoside resistance (3.7%). Mapping of the raw M. tuberculosis sequence data (Supplementary Table 1) led to high average genome-wide coverage across the clinical isolates (median: 227-fold; range: 74- to 288-fold). Across the isolates, 18,667 unique SNPs were identified; a high proportion (37%) of these were observed in single isolates. Isolates were classified predominantly into lineage 3 (CAS) strain types (70.4%) (Table 1), but lineages 1 (3.7%), 2 (11.1%; all Beijing strain-types) and 4 (14.8%) were also present. As expected, these lineages form clusters in a genome-wide SNP-based phylogenetic tree (Fig. 1; Supplementary Fig. 2).
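The resistance categories used above follow the definitions given in the introduction (MDR-TB: resistance to both rifampicin and isoniazid; XDR-TB: MDR-TB plus resistance to a fluoroquinolone and at least one second-line injectable). A minimal Python sketch of that rule is shown below; the function name `classify` and the drug-name strings are illustrative assumptions, not the TB-Profiler implementation.

```python
from typing import Set

# Drug groupings taken from the definitions in the text; the names are illustrative.
FLUOROQUINOLONES = {"ofloxacin", "moxifloxacin"}
INJECTABLES = {"amikacin", "kanamycin", "capreomycin"}


def classify(resistant_to: Set[str]) -> str:
    """Assign a resistance category from the set of drugs an isolate resists."""
    mdr = {"rifampicin", "isoniazid"} <= resistant_to
    fluoroquinolone = bool(FLUOROQUINOLONES & resistant_to)
    injectable = bool(INJECTABLES & resistant_to)
    if mdr and fluoroquinolone and injectable:
        return "XDR-TB"
    if mdr:
        return "MDR-TB"
    if resistant_to:
        return "drug-resistant (non-MDR/XDR)"
    return "pan-susceptible"


print(classify({"rifampicin", "isoniazid", "ofloxacin", "kanamycin"}))
print(classify({"ofloxacin"}))
```

Under this rule an isolate resistant to a fluoroquinolone but to only one (or neither) of rifampicin and isoniazid falls into the non-MDR/XDR group.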

Table 1 Mycobacterium tuberculosis samples (N = 81).
Figure 1

Phylogenetic tree of M. tuberculosis strains (n = 81), with their lineages and drug resistance profiles. The phylogenetic tree was created using a maximum likelihood approach implemented in RAxML27. The tree was annotated using iTOL28. The first vertical band to the right of the tree denotes the lineage. The second vertical band denotes the drug resistance phenotype. The circles show the drug resistance profiles, with filled circles representing the presence of a resistance mutation. Some profiles are examples of isolates that are both pre-MDR and fluoroquinolone resistant. MDR = multi-drug resistant TB, XDR = extensively drug resistant TB; drug resistant = non-MDR/XDR resistant; based on drug susceptibility testing and TB-Profiler prediction14.

Mutations underlying drug-resistance

Resistance mutations14 were assessed and compared to the phenotypic drug susceptibility test results for 11 anti-TB drugs. There was perfect concordance between the phenotypic result and in-silico prediction for rifampicin resistance. This resistance was predominantly associated with known mutations at codon 450 in the rpoB gene (conferred by 3 mutations; in 59/76 resistant isolates; including S450L 56/76; Table 2; Supplementary Table 2), but also ten other putative mutations in the rifampicin-resistance-determining region (RRDR) and two mutations outside this region (L430R, L430P). A minority of rifampicin resistant isolates had putative compensatory rpoC mutations (I491T 7/76 (all Beijing), I885V 1/76 (CAS/Delhi)), which all had rpoB S450L background mutations. Mutations identified in the katG gene (S315T 61/68) and the Rv1482c-fabG1 intergenic region (7/68) were most likely to be responsible for isoniazid resistance, but additional mutations in oxyR’-ahpC (1/68) and katG (5/68) were found in single isolates (Table 2; Supplementary Table 2). The phenotypic results and in-silico predictions were identical except for three isolates. These isolates had distinct previously uncharacterized frameshift mutations in the katG gene (Supplementary Table 2), most likely leading to resistance due to a loss of function of the isoniazid activating enzyme.

Table 2 High frequency drug resistance mutations. Mutations have been ordered by frequency.

Ethambutol resistance was conferred by mutations in the embCAB operon, including embB (M306 45/61; G497 9/61; G406 7/61). By assuming the laboratory phenotypic test result as the gold standard, the sensitivity of the in-silico prediction was high (97.2%), but the specificity was much lower (42.2%). This differential is most likely due to the large number of isolates predicted to be resistant based on the presence of mutations in codon 306 of the embB gene (n = 10). These mutations have been shown to confer resistance, albeit at a moderate level30. Pyrazinamide resistance is typically associated with the pncA gene, and we identified 31 non-synonymous mutations in that locus, with the most frequent being pncA-Rv2044c −11A > G (4/43), V180P (4/43), and H71A (3/43). The comparison of the pncA allele frequencies in our study to those in a global WGS dataset14 revealed four indels and three SNPs to be novel (Supplementary Fig. 3, Supplementary Table 2). Further, one isolate contained a 405 bp deletion that removed a large proportion of the pncA gene (Supplementary Fig. 3, Supplementary Table 2). Mutations in other genes associated with pyrazinamide resistance (rpsA and panD) were not detected. There was a moderate level of concordance between the phenotypic drug susceptibility test results and in-silico predictions for pyrazinamide (sensitivity 74.1%, specificity 70.4%). However, this discrepancy may be explained by the known difficulty in performing drug susceptibility testing for pyrazinamide resistance and its resulting high variability.
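The sensitivity and specificity figures quoted here treat the phenotypic test as the gold standard and the in-silico prediction as the test under evaluation. The calculation can be sketched in Python as follows; the function name and the toy input vectors are illustrative assumptions and do not reproduce the study data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    phenotype: Sequence[bool], prediction: Sequence[bool]
) -> Tuple[float, float]:
    """Compare in-silico predictions against phenotypic results (True = resistant)."""
    if len(phenotype) != len(prediction):
        raise ValueError("phenotype and prediction must have the same length")
    pairs = list(zip(phenotype, prediction))
    tp = sum(1 for p, g in pairs if p and g)          # resistant, predicted resistant
    fn = sum(1 for p, g in pairs if p and not g)      # resistant, predicted susceptible
    tn = sum(1 for p, g in pairs if not p and not g)  # susceptible, predicted susceptible
    fp = sum(1 for p, g in pairs if not p and g)      # susceptible, predicted resistant
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one resistant and one susceptible phenotype")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example: 4 phenotypically resistant and 4 susceptible isolates.
toy_phenotype = [True, True, True, True, False, False, False, False]
toy_prediction = [True, True, True, False, True, False, False, False]
sens, spec = sensitivity_specificity(toy_phenotype, toy_prediction)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A low specificity, as reported for ethambutol, corresponds to many isolates that carry a resistance mutation but test susceptible in the laboratory (the `fp` count above).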

As expected for the samples selected for sequencing, mutations conferring resistance to second-line anti-TB drugs were also detected. Streptomycin resistance is typically related to mutations in gid, rpsL and rrs loci, which are related to low, low and intermediate, and high levels of resistance, respectively9. The mutations occurring in the gid gene were a frameshift (102del, 6/42) and A80P (2/42). The most common resistance conferring mutations were in the rrs (514a > c; 12/42) and rpsL (L43A 12/42) genes. One of the isolates had mutations in both gid (G352 → GC) and rpsL (K88R). Resistance to ethionamide is associated with the ethA locus, and nine mutations were identified in that gene, each present in single resistant isolates (Supplementary Table 2). Eight of these mutations were not present in a large global resistance database14, and may be novel resistance conferring mutations. The most frequent mutation was in the fabG1 promoter region (−15C > T; 7/15), which is also associated with isoniazid resistance9 (Table 2).

Resistance to second-line fluoroquinolones and aminoglycosides

Mutations in the gyrA and gyrB loci associated with fluoroquinolone resistance were observed, the majority of which were in gyrA (D94 40/51; S91P 5/51; A90V 4/51). Other mutations in gyrA (1/51) and gyrB (3/51) were present in single isolates, but absent in a global resistance database14, suggesting that these may be novel fluoroquinolone resistance-conferring mutations. Two isolates had multiple resistance mutations in the gyrA gene (A94G-S91P and A90V-DA94G). The consistency between the drug susceptibility phenotypic results and in-silico predictions was variable for the two fluoroquinolones tested (moxifloxacin and ofloxacin). Whilst the sensitivities for both fluoroquinolones were 100%, the specificities were dissimilar (ofloxacin 85.7%; moxifloxacin 40.8%), which may be explained by differences in the critical concentration used for the drug susceptibility testing (moxifloxacin 2.5 μg/ml, ofloxacin 2.0 μg/ml).

Resistance across the aminoglycoside injectable drugs was associated with the rrs A1401G, which was linked with amikacin (3/3), capreomycin (3/3) and kanamycin (3/5) resistance. Kanamycin resistance was also observed with a mutation in the eis locus (−14C > T, 2/5). This mutation has been found to be related to low levels of kanamycin resistance9 but not associated with resistance to other aminoglycosides. A number of discrepancies were observed between the phenotypic test results and in-silico predictions for resistance. Three isolates had a resistant drug susceptibility test result for amikacin but did not have any known resistance mutations. Of these, one isolate did not have any mutations in known resistance genes, but the other two isolates had a (878 g > a) mutation in the rrs gene. This mutation has previously been reported to confer resistance to capreomycin31, and therefore potentially amikacin resistance too. Two isolates had resistant drug susceptibility test results for kanamycin but did not have any mutations in known resistance genes.

The presence of fluoroquinolone resistance mutations (n = 51; 63.0%) was overwhelmingly more common than aminoglycoside (amikacin/capreomycin) resistance mutations (n = 3; 3.7%), which is in contrast to other settings where resistance to second-line injectables was more common32. Inspection of the in-silico predictions revealed a significant proportion of isolates with fluoroquinolone resistance mutations to be pre-MDR-TB (n = 9; 17.6%). In particular, five isolates were resistant to fluoroquinolones and rifampicin but not isoniazid, one was resistant to fluoroquinolones and isoniazid but not rifampicin, and three were resistant to fluoroquinolones but sensitive to both rifampicin and isoniazid.

Mutations in efflux pumps

Across the 81 isolates, we characterized fifty-five mutations in thirteen efflux pump genes (Rv0194, Rv1217, Rv1218, drrA, drrB, Rv1258, Rv1634, Rv2688, Rv1273, Rv1819, Rv1458, Rv1877 and Rv1250) (Supplementary Table 3). These mutations included twelve identified in previous work in Pakistan12. Mutation variants were observed in all thirteen efflux pump genes, with both SNPs and indels present (Supplementary Table 3). Mutations in efflux genes were present in both susceptible and drug-resistant isolates, demonstrating that their potential role in resistance may be complex, and mechanisms may involve transcriptional or epigenetic effects that we did not consider.

Evidence of transmission

We combined our study WGS data (n = 81) with those from a published Karachi study19 (n = 42). Potential transmission clusters were found by calculating the pairwise SNP distance between the 123 Pakistan isolates and using an established cut-off of <10 mutation differences18 (Supplementary Fig. 2). Eight clusters were found, with a maximum size of three isolates. Of these eight clusters, four contained only MDR-TB strains, three only XDR-TB strains, and one both MDR-TB and XDR-TB strains. Five clusters belonged to lineage 3, two to lineage 4, and one to lineage 1. Three of the clusters (the two lineage 4 clusters and a lineage 3 cluster) involved isolates from Khyber Pakhtunkhwa province. Overall, these data suggest ongoing transmission of MDR-TB and XDR-TB in Pakistan.
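The cluster detection described above can be sketched in Python: compute pairwise SNP distances and link any two isolates whose distance falls below the cut-off. The single-linkage grouping (via union-find), the function names and the toy sequences are assumptions made for illustration; the study does not state how networks were assembled from the pairwise distances, and this sketch ignores missing or ambiguous bases.

```python
from itertools import combinations
from typing import Dict, List


def snp_distance(a: str, b: str) -> int:
    """Number of differing sites between two equal-length aligned sequences."""
    return sum(1 for x, y in zip(a, b) if x != y)


def transmission_clusters(seqs: Dict[str, str], cutoff: int = 10) -> List[List[str]]:
    """Group isolates linked by a pairwise distance below `cutoff` (single linkage)."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) < cutoff:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    # Report only clusters with at least two isolates.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Toy alignment with a cut-off of 2, scaled down from the <10 threshold in the text.
toy_seqs = {
    "s1": "AAAAAA",
    "s2": "AAAAAT",
    "s3": "TTTTTT",
    "s4": "TTTTTA",
    "s5": "ACGTAC",
}
clusters = transmission_clusters(toy_seqs, cutoff=2)
print(clusters)
```

With single linkage, two isolates can share a cluster without being within the cut-off of each other, provided a chain of close intermediates connects them.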


WGS is being used increasingly as a tool to assist epidemiological investigations and clinical and control program decision making in infectious diseases. However, most applications of WGS take place in developed countries, where the bacterial disease burden tends to be lower. Our study is the largest WGS analysis of drug resistant M. tuberculosis isolates from a high-burden TB region in Pakistan. In particular, the isolates were collected in Khyber Pakhtunkhwa province in North West Pakistan, a region that has been affected by recent armed conflict, social upheaval and refugee migration, and where operating an effective public health surveillance program has been difficult. CAS/Delhi was the predominant strain-type identified in our analysis, and this is consistent with previous reports of lineage 3 strains dominating in South Asian populations12,15,33,34. The Beijing strain-type (lineage 2) was present in our study (11%), and although absent from previous WGS studies in Pakistan12, studies using spoligotyping have observed this strain-type in Pakistan before (e.g. 6% in33). It is unclear if the differences in the Beijing frequencies are due to the effects of increased prevalence, isolate selection, or geographical region. In general, Beijing strains are highly virulent and mobile, and their presence in Pakistan (and likely Afghanistan) is a concern for public health surveillance. Interestingly, in pre-MDR-TB isolates, the presence of rifampicin resistance mutations was more common than isoniazid resistance conferring mutations. This observation potentially indicates that resistance to rifampicin arises before isoniazid in this region, making this setting unique35. The high levels of fluoroquinolone (particularly ofloxacin) resistance in Pakistan have been observed by others2. This resistance, in some cases without evidence of rifampicin or isoniazid resistance, is worrisome and may be due to extensive and unregulated use of fluoroquinolones2.

The most frequent drug resistance conferring mutations identified in this study were already known, including in katG (e.g. S315T) and the Rv1482c-fabG1 intergenic region for isoniazid, rpoB (e.g. S450L and others in the RRDR) for rifampicin, embB (e.g. M306) for ethambutol, gyrA (e.g. D94) for fluoroquinolones, rrs (e.g. 1401a > g) for aminoglycosides, and rpsL, rrs and gid for streptomycin. The similarity of mutations observed and their frequency with previous Pakistan XDR-TB WGS data19 and global collections9,14 suggests that extensions of line probe assays and genotyping arrays to account for these mutations may be useful for disease control. However, we did identify novel potential resistance conferring mutations, including polymorphisms in katG (isoniazid), ethA (ethionamide), gyrA and gyrB (fluoroquinolones), and pncA (pyrazinamide). These mutations should be investigated and validated experimentally to determine their impact on drug minimum inhibitory concentrations and regimen efficacy. Our analysis has also revealed the potential transmission of drug resistant M. tuberculosis in Pakistan. However, a larger sample size and denser sampling frame will be required to fully characterize the degree of XDR-/MDR-TB transmission alongside genetic and non-genetic risk factors.

Overall, our work reveals the utility of WGS for the prediction of antimicrobial drug resistance, epidemiology and control activities in the Pakistan setting. The WGS data generated will serve as a baseline reference for future TB clinical, surveillance and control activities in Pakistan and the wider region.


The application of WGS for TB clinical management and disease control will have the greatest benefit in complex community outbreaks in endemic regions, where epidemiological data availability may be sparse. Our study in the Khyber Pakhtunkhwa province in Pakistan has provided a baseline characterization of circulating known and putative drug resistance mutations, and identified potential MDR-TB transmission chains. These insights will assist future proactive TB patient management, and the deployment of anti-TB drug regimens and surveillance activities.

Data availability

The accession codes for the raw sequence data are available in Supplementary Table 2.


  1. Global tuberculosis report 2018 (2018).

  2. Tahseen, S. et al. Use of Xpert® MTB/RIF assay in the first national anti-tuberculosis drug resistance survey in Pakistan. Int. J. Tuberc. Lung Dis. 20, 448–455 (2016).

  3. Onozaki, I. et al. National tuberculosis prevalence surveys in Asia, 1990–2012: an overview of results and lessons learned. Trop. Med. Int. Heal. 20, 1128–1145 (2015).

  4. Falzon, D. et al. World Health Organization treatment guidelines for drug-resistant tuberculosis, 2016 update. Eur. Respir. J. 49, 1602308 (2017).

  5. Mishra, G. Current updates in tuberculosis. npj Prim. Care Respir. Med. 27, 1 (2017).

  6. Ayaz, A. et al. Characterizing Mycobacterium tuberculosis isolates from Karachi, Pakistan: drug resistance and genotypes. Int. J. Infect. Dis. 16, e303–e309 (2012).

  7. Hasan, R. et al. Extensively Drug-Resistant Tuberculosis, Pakistan. Emerg. Infect. Dis. 16, 1473 (2010).

  8. Gandhi, N. R. et al. Multidrug-resistant and extensively drug-resistant tuberculosis: a threat to global control of tuberculosis. Lancet 375, 1830–1843 (2010).

  9. Coll, F. et al. Genome-wide analysis of multi- and extensively drug-resistant Mycobacterium tuberculosis. Nat. Genet. 50, 307–316 (2018).

  10. Coll, F. et al. Rapid determination of anti-tuberculosis drug resistance from whole-genome sequences. Genome Med. 7, 51 (2015).

  11. Perdigão, J. et al. Unraveling Mycobacterium tuberculosis genomic diversity and evolution in Lisbon, Portugal, a highly drug resistant setting. BMC Genomics 15, 991 (2014).

  12. Kanji, A. et al. Single nucleotide polymorphisms in efflux pumps genes in extensively drug resistant Mycobacterium tuberculosis isolates from Pakistan. Tuberculosis 107, 20–30 (2017).

  13. Witney, A. A. et al. Clinical application of whole-genome sequencing to inform treatment for multidrug-resistant tuberculosis cases. J. Clin. Microbiol. 53, 1473–83 (2015).

  14. Phelan, J. E. et al. Integrating informatics tools and portable sequencing technology for rapid detection of resistance to anti-tuberculous drugs. Genome Med. 11, 41 (2019).

  15. Coll, F. et al. A robust SNP barcode for typing Mycobacterium tuberculosis complex strains. Nat. Commun. 5, 4812 (2014).

  16. Gagneux, S. Host-pathogen coevolution in human tuberculosis. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 367, 850–9 (2012).

  17. Benavente, E. D. et al. PhyTB: Phylogenetic tree visualisation and sample positioning for M. tuberculosis. BMC Bioinformatics 16, 155 (2015).

  18. Guerra-Assunção, J. et al. Large-scale whole genome sequencing of M. tuberculosis provides insights into transmission in a high prevalence area. Elife 4 (2015).

  19. Ali, A. et al. Whole Genome Sequencing Based Characterization of Extensively Drug-Resistant Mycobacterium tuberculosis Isolates from Pakistan. PLoS One 10, e0117771 (2015).

  20. Tanveer, M. et al. Genotyping and drug resistance patterns of M. tuberculosis strains in Pakistan. BMC Infect. Dis. 8, 171 (2008).

  21. Watanabe Pinhata, J. M. et al. Use of an immunochromatographic assay for rapid identification of Mycobacterium tuberculosis complex clinical isolates in routine diagnosis. J. Med. Microbiol. 67, 683–686 (2018).

  22. Belisle, J. T., Mahaffey, S. B. & Hill, P. J. Isolation of Mycobacterium Species Genomic DNA. In Methods in molecular biology (Clifton, N.J.) 465, 1–12 (2009).

  23. Woods, G. L. et al. Susceptibility testing of mycobacteria, nocardiae, and other aerobic actinomycetes. M24-A2 31 ( 5 ) (2011).

  24. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

  25. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. (2013).

  26. Li, H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27, 2987–2993 (2011).

  27. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).

  28. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v4: recent updates and new developments. Nucleic Acids Res. 47, W256–W259 (2019).

  29. Coll, F. et al. SpolPred: rapid and accurate prediction of Mycobacterium tuberculosis spoligotypes from short genomic sequences. Bioinformatics 28, 2991–2993 (2012).

  30. Plinke, C., Walter, K., Aly, S., Ehlers, S. & Niemann, S. Mycobacterium tuberculosis embB codon 306 mutations confer moderately increased resistance to ethambutol in vitro and in vivo. Antimicrob. Agents Chemother. 55, 2891–6 (2011).

  31. Malinga, L., Brand, J., Olorunju, S., Stoltz, A. & van der Walt, M. Molecular analysis of genetic mutations among cross-resistant second-line injectable drugs reveals a new resistant mutation in Mycobacterium tuberculosis. Diagn. Microbiol. Infect. Dis. 85, 433–437 (2016).

  32. Falzon, D. et al. Resistance to fluoroquinolones and second-line injectable drugs: impact on multidrug-resistant TB outcomes. Eur. Respir. J. 42, 156–68 (2013).

  33. Hasan, Z. et al. Spoligotyping of Mycobacterium tuberculosis isolates from Pakistan reveals predominance of Central Asian Strain 1 and Beijing isolates. J. Clin. Microbiol. 44, 1763–8 (2006).

  34. Singh, U. B. et al. Predominant tuberculosis spoligotypes, Delhi, India. Emerg. Infect. Dis. 10, 1138–42 (2004).

  35. Manson, A. L. et al. Genomic analysis of globally diverse Mycobacterium tuberculosis strains provides insights into the emergence and spread of multidrug resistance. Nat. Genet. (2017).


A.J. was funded by an International Research Support Initiative Program award from the Higher Education Commission Pakistan (1-8/HEC/HRD/2017/8034). T.G.C. is funded by the Medical Research Council UK (Grant Nos MR/M01360X/1, MR/N010469/1, MR/R025576/1, and MR/R020973/1) and BBSRC (Grant No. BB/R013063/1). S.C. is funded by the Medical Research Council UK (Grant Nos MR/M01360X/1, MR/R025576/1, and MR/R020973/1) and BBSRC (Grant No. BB/R013063/1). S.J.W. is funded by the Wellcome Trust (Grant No. 204538/Z/16/Z). We thank the Scientific Computing Group at the Genome Institute of Singapore for data management and computing infrastructure. The MRC (HDR-UK) eMedLab computing resource was used for bioinformatics and statistical analysis.

Author information


A.J., S.J.W. and T.G.C. conceived and directed the project. A.J. and S.J.W. coordinated sample collection. A.J., S.A., H.R., D.M.C., L.M.W. and S.N.K. undertook sample collection, processing, DNA extraction, library building and sensitivity testing. A.J., P.F.D.S., S.C. and T.G.C. coordinated sequencing. A.J. and J.E.P. performed bioinformatic and statistical analyses under the supervision of S.J.W. and T.G.C. A.J., J.E.P., S.J.W. and T.G.C. interpreted the results. A.J. and T.A.K. wrote the first draft of the manuscript. All authors commented on and edited successive versions of the draft manuscript and approved the final manuscript. A.J., J.E.P., S.J.W. and T.G.C. compiled the final manuscript.

Corresponding authors

Correspondence to Abdul Jabbar or Taane G. Clark.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit

About this article

Cite this article

Jabbar, A., Phelan, J.E., de Sessions, P.F. et al. Whole genome sequencing of drug resistant Mycobacterium tuberculosis isolates from a high burden tuberculosis region of North West Pakistan. Sci Rep 9, 14996 (2019).
