DNA replication fidelity in Mycobacterium tuberculosis is mediated by an ancestral prokaryotic proofreader


The DNA replication machinery is an important target for antibiotic development in increasingly drug-resistant bacteria, including Mycobacterium tuberculosis¹. Although blocking DNA replication leads to cell death, disrupting the processes used to ensure replication fidelity can accelerate mutation and the evolution of drug resistance. In Escherichia coli, the proofreading subunit of the replisome, the ɛ exonuclease, is essential for high-fidelity DNA replication²; however, we find that the corresponding subunit is completely dispensable in M. tuberculosis. Rather, the mycobacterial replicative polymerase DnaE1 itself encodes an editing function that proofreads DNA replication, mediated by an intrinsic 3′–5′ exonuclease activity within its PHP domain. Inactivation of the DnaE1 PHP domain increases the mutation rate by more than 3,000-fold. Moreover, phylogenetic analysis of DNA replication proofreading in the bacterial kingdom suggests that E. coli is a phylogenetic outlier and that PHP domain–mediated proofreading is widely conserved and indeed may be the ancestral prokaryotic proofreader.

Figure 1: The M. tuberculosis DnaE1 polymerase encodes an intrinsic proofreading capability.
Figure 2: Inactivation of DnaE1 proofreading results in a mutator phenotype in vivo.
Figure 3: Conservation of PHP domain–mediated DNA replication proofreading.
Figure 4: Inactivation of the PHP domain renders mycobacteria sensitive to nucleoside analogs.


  1. Robinson, A., Causer, R.J. & Dixon, N.E. Architecture and conservation of the bacterial DNA replication machinery, an underexploited drug target. Curr. Drug Targets 13, 352–372 (2012).

  2. Kunkel, T.A. & Bebenek, K. DNA replication fidelity. Annu. Rev. Biochem. 69, 497–529 (2000).

  3. Mizrahi, V. & Andersen, S.J. DNA repair in Mycobacterium tuberculosis. What have we learnt from the genome sequence? Mol. Microbiol. 29, 1331–1339 (1998).

  4. Springer, B. et al. Lack of mismatch correction facilitates genome evolution in mycobacteria. Mol. Microbiol. 53, 1601–1609 (2004).

  5. Ford, C.B. et al. Use of whole genome sequencing to estimate the mutation rate of Mycobacterium tuberculosis during latent infection. Nat. Genet. 43, 482–486 (2011).

  6. Farhat, M.R. et al. Genomic analysis identifies targets of convergent positive selection in drug-resistant Mycobacterium tuberculosis. Nat. Genet. 45, 1183–1189 (2013).

  7. Cole, S.T. et al. Massive gene decay in the leprosy bacillus. Nature 409, 1007–1011 (2001).

  8. McHenry, C.S. DNA replicases from a bacterial perspective. Annu. Rev. Biochem. 80, 403–436 (2011).

  9. Stano, N.M., Chen, J. & McHenry, C.S. A coproofreading Zn2+-dependent exonuclease within a bacterial replicase. Nat. Struct. Mol. Biol. 13, 458–459 (2006).

  10. Wing, R.A., Bailey, S. & Steitz, T.A. Insights into the replisome from the structure of a ternary complex of the DNA polymerase III α-subunit. J. Mol. Biol. 382, 859–869 (2008).

  11. Barros, T. et al. A structural role for the PHP domain in E. coli DNA polymerase III. BMC Struct. Biol. 13, 8 (2013).

  12. Sassetti, C.M., Boyd, D.H. & Rubin, E.J. Comprehensive identification of conditionally essential genes in mycobacteria. Proc. Natl. Acad. Sci. USA 98, 12712–12717 (2001).

  13. Malshetty, V.S., Jain, R., Srinath, T., Kurthkoti, K. & Varshney, U. Synergistic effects of UdgB and Ung in mutation prevention and protection against commonly encountered DNA damaging agents in Mycobacterium smegmatis. Microbiology 156, 940–949 (2010).

  14. Lee, H., Popodi, E., Tang, H. & Foster, P.L. Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing. Proc. Natl. Acad. Sci. USA 109, E2774–E2783 (2012).

  15. Fijalkowska, I.J. & Schaaper, R.M. Mutants in the Exo I motif of Escherichia coli dnaQ: defective proofreading and inviability due to error catastrophe. Proc. Natl. Acad. Sci. USA 93, 2856–2861 (1996).

  16. Denamur, E. & Matic, I. Evolution of mutation rates in bacteria. Mol. Microbiol. 60, 820–827 (2006).

  17. Dalrymple, B.P., Kongsuwan, K., Wijffels, G., Dixon, N.E. & Jennings, P.A. A universal protein-protein interaction motif in the eubacterial DNA replication and repair systems. Proc. Natl. Acad. Sci. USA 98, 11627–11632 (2001).

  18. Haft, D.H. et al. TIGRFAMs: a protein family resource for the functional identification of proteins. Nucleic Acids Res. 29, 41–43 (2001).

  19. Timinskas, K., Balvočiūtė, M., Timinskas, A. & Venclovas, Č. Comprehensive analysis of DNA polymerase III α subunits and their homologs in bacterial genomes. Nucleic Acids Res. 42, 1393–1413 (2014).

  20. Jordheim, L.P., Durantel, D., Zoulim, F. & Dumontet, C. Advances in the development of nucleoside and nucleotide analogues for cancer and viral diseases. Nat. Rev. Drug Discov. 12, 447–464 (2013).

  21. Long, M.C. et al. Structure-activity relationship for adenosine kinase from Mycobacterium tuberculosis. Biochem. Pharmacol. 75, 1588–1600 (2008).

  22. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  23. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).

  24. DePristo, M.A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).

  25. Ford, C.B. et al. Mycobacterium tuberculosis mutation rate estimates from different lineages predict substantial differences in the emergence of drug-resistant tuberculosis. Nat. Genet. 45, 784–790 (2013).

  26. Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).

  27. Katoh, K., Misawa, K., Kuma, K.-I. & Miyata, T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002).

  28. Nawrocki, E.P. & Eddy, S.R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).

  29. Price, M.N., Dehal, P.S. & Arkin, A.P. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).

  30. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).

  31. Letunic, I. & Bork, P. Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Res. 39, W475–W478 (2011).

  32. Casali, N. et al. Evolution and transmission of drug-resistant tuberculosis in a Russian population. Nat. Genet. 46, 279–286 (2014).

  33. Zhang, H. et al. Genome sequencing of 161 Mycobacterium tuberculosis isolates from China identifies genes and intergenic regions associated with drug resistance. Nat. Genet. 45, 1255–1260 (2013).

  34. Comas, I. et al. Out-of-Africa migration and Neolithic coexpansion of Mycobacterium tuberculosis with modern humans. Nat. Genet. 45, 1176–1182 (2013).

  35. Franzblau, S.G. et al. Rapid, low-technology MIC determination with clinical Mycobacterium tuberculosis isolates by using the microplate Alamar Blue assay. J. Clin. Microbiol. 36, 362–366 (1998).

  36. Studier, F.W. Protein production by auto-induction in high-density shaking cultures. Protein Expr. Purif. 41, 207–234 (2005).

  37. Toste Rêgo, A., Holding, A.N., Kent, H. & Lamers, M.H. Architecture of the Pol III–clamp-exonuclease complex reveals key roles of the exonuclease subunit in processive DNA synthesis and repair. EMBO J. 32, 1334–1343 (2013).

  38. Eswar, N. et al. Comparative protein structure modeling using MODELLER. Curr. Protoc. Protein Sci. Chapter 2, Unit 2.9 (2007).

  39. Bailey, S., Wing, R.A. & Steitz, T.A. The structure of T. aquaticus DNA polymerase III is distinct from eukaryotic replicative DNA polymerases. Cell 126, 893–904 (2006).

  40. Larkin, M.A. et al. Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947–2948 (2007).

  41. Gouet, P. ESPript/ENDscript: extracting and rendering sequence and 3D information from atomic structures of. Nucleic Acids Res. 31, 3320–3323 (2003).



We thank E. Rubin, B. Bloom, D. Boyd, J. McKenzie, D. Warner and B. Javid for comments, B. Jacobs (Albert Einstein College of Medicine) and M. Wilmans (European Molecular Biology Laboratory) for bacterial strains, and T. Baker (University of Auckland) for plasmids. This work was supported by a Helen Hay Whitney fellowship to J.M.R., US National Institutes of Health Director's New Innovator Award 1DP20D001378, subcontracts from National Institute of Allergy and Infectious Diseases (NIAID) U19AI076217 and AI109755-01, the Doris Duke Charitable Foundation under grant 2010054 to S.M.F. and a UK Medical Research Council grant to M.H.L. (MC_U105197143).

Author information


J.M.R., U.F.L., M.H.L. and S.M.F. designed the project and wrote the manuscript. M.R.C. performed phylogenetic analyses. C.B.F. and E.R.G. made strains and measured mutation rates. R.G., M.C. and S.G. contributed sequencing data.

Corresponding authors

Correspondence to Sarah M Fortune or Meindert H Lamers.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Mycobacterium tuberculosis (Mtb) and Mycobacterium smegmatis (Msmeg) contain two ɛ (dnaQ) exonuclease homologs.

(a) Sequence alignment of the ɛ-exonuclease homologs from four different species. Conserved catalytic residues of E. coli ɛ are indicated by blue triangles below the sequences. The clamp-binding motif of E. coli ɛ is boxed in green. (b) Active site of the E. coli ɛ exonuclease. (c) Computational model of the active site of Mtb Rv3711c. (d) Computational model of the active site of Mtb Rv2191.

Supplementary Figure 2 Mycobacterium tuberculosis Rv3711c (Rv3711cMTB) is a 3′–5′ DNA exonuclease but does not form a stable complex with DnaE1MTB.

(a) Coomassie-stained gel showing purified E. coli ɛ (ɛEC), DnaE1MTB and Rv3711cMTB. (b) Gel showing a 3′–5′ exonuclease activity assay with ɛEC, DnaE1MTB and Rv3711cMTB. (c) Analytical size exclusion chromatography shows that E. coli PolIIIα (PolIIIαEC) and ɛEC form a stable complex at concentrations as low as 1.5 μM. (d) In contrast, DnaE1MTB and Rv3711cMTB do not show any interaction, even at 10 μM protein concentration (all equimolar amounts).

Supplementary Figure 3 PHP active sites in bacterial replicative DNA polymerases.

(a) Alignment of the PHP domain sequences from replicative DNA polymerases. Conserved metal-binding residues of the PHP domain are indicated by blue triangles below the sequences. Cyan squares indicate residues in E. coli that deviate from the consensus metal-binding motif. (b) Computational model of the Mtb DnaE1 PHP domain based on the crystal structure of T. aquaticus PolIIIα (shown in c). Black circles indicate residues mutated for the experiments performed in this study. (c) The PHP domain active site of T. aquaticus PolIIIα. (d) The PHP domain active site of E. coli PolIIIα. Underlines indicate residues in E. coli that deviate from the consensus metal-binding motif.

Supplementary Figure 4 Mycobacterium tuberculosis DnaE1 wild-type (DnaE1MTB WT) and PHP mutants are properly folded.

(a) SDS-PAGE analysis of purified proteins. Each lane contains 3.5 pmol (~0.5 μg) protein. The gel was stained with Coomassie Brilliant Blue. (b) Purified proteins do not show any aggregation, as judged by size exclusion chromatography. The arrow indicates the void volume (at 0.8 ml). For clarity, graphs are shifted vertically by 150 mAU. (c) Circular dichroism spectra show that DnaE1MTB and PHP mutants are properly folded. (d) Thermal denaturation curves show that WT and mutant DnaE1 have similar melting temperatures of 45–50 °C. (e) Time course of exonuclease activity on single-stranded DNA. DnaE1MTB WT shows robust 3′–5′ exonuclease activity but not 5′–3′ exonuclease activity.

Supplementary Figure 5 Primer extension from mismatched substrates requires exonuclease activity.

(a) Primer extension from mismatched substrates by Mycobacterium tuberculosis DnaE1 wild-type (DnaE1MTB WT) and E. coli PolIIIα (PolIIIαEC) + ɛEC is blocked by a phosphorothioate linkage (denoted by -S-) that is resistant to exonuclease activity. In contrast, a matched primer with a terminal phosphorothioate linkage can be extended normally. (b) Addition of ɛEC exonuclease in trans allows DnaE1MTB PHP mutants to extend from mismatched DNA substrates.

Supplementary Figure 6 The per–base pair mutation rate of Mycobacterium smegmatis estimated from fluctuation analysis.

(a) Fluctuation analysis was used to determine the rate at which wild-type M. smegmatis acquired resistance to rifampicin. Circles represent the mutant frequency (number of rifampicin-resistant mutants per cell plated in a single culture). The red bar represents the estimated mutation rate (mutations conferring rifampicin resistance per generation), with error bars representing the 95% confidence interval (CI). (b) The number of mutations in rpoB (Ms1367) that confer rifampicin resistance in our fluctuation analysis was determined by sequencing 150 independent rifampicin-resistant isolates. This analysis identified ten unique mutations. The per–base pair mutation rate, μ(in vitro), was determined by dividing μ(rifampicin) by the target size.
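The calculation described in this legend can be illustrated with a minimal Python sketch. The legend does not say which fluctuation-analysis estimator was used, so the classic p0 method stands in here as an assumption. The culture counts, the number of cells per culture, and the use of the ten unique rpoB mutations as the target size are hypothetical illustrations, not data from the study.

```python
import math


def p0_mutation_rate(mutant_counts, cells_per_culture):
    """Estimate the phenotypic mutation rate with the p0 method.

    mutant_counts: resistant colonies observed in each parallel culture.
    cells_per_culture: final number of cells per culture (assumed equal).
    """
    n_cultures = len(mutant_counts)
    n_zero = sum(1 for c in mutant_counts if c == 0)
    if n_zero == 0 or n_zero == n_cultures:
        raise ValueError("p0 method needs some, but not all, cultures without mutants")
    p0 = n_zero / n_cultures
    m = -math.log(p0)  # expected number of mutational events per culture
    return m / cells_per_culture


def per_base_pair_rate(phenotypic_rate, target_size):
    """Divide the phenotypic rate by the number of resistance-conferring mutations."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return phenotypic_rate / target_size


# Hypothetical example values, chosen only to show the arithmetic.
example_counts = [0, 0, 1, 0, 3, 0, 0, 12, 0, 2]
example_cells = 1e8
example_target_size = 10  # assumption: one target per unique rpoB mutation

mu_rif = p0_mutation_rate(example_counts, example_cells)
mu_bp = per_base_pair_rate(mu_rif, example_target_size)
print(f"phenotypic rate: {mu_rif:.3e}, per-base-pair rate: {mu_bp:.3e}")
```

The p0 method only uses the fraction of cultures without mutants, so it is simple but loses information when most cultures contain mutants; maximum-likelihood estimators are a common alternative.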

Supplementary Figure 7 Loss-of-function mutations in the dnaE1 PHP domain are rarely found in clinical Mycobacterium tuberculosis isolates.

(a) dnaE1 (Rv1547) PHP domain SNPs observed in clinical Mtb isolates. SNP prevalence refers to the number of clinical strains containing the indicated SNP as compared to the total number of clinical strains analyzed. See Supplementary Table 1 for additional information. (b) Fluctuation analysis was used to determine the rates at which the indicated M. smegmatis strains acquired resistance to rifampicin. With the exception of wild-type M. smegmatis, these strains harbor a deletion of the endogenous dnaE1 (Ms3178) gene and have been complemented with the indicated M. tuberculosis dnaE1 (Rv1547) gene. Circles represent the mutant frequency (number of rifampicin-resistant mutants per cell plated in a single culture). The red bar represents the estimated mutation rate (mutations conferring rifampicin resistance per generation), with error bars representing the 95% confidence interval (CI). *P < 0.05 in comparison of mutant frequencies by Wilcoxon rank-sum test.
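The comparison of mutant frequencies in (b) can be sketched in Python as follows. This is an illustration under stated assumptions: the legend does not say whether an exact or approximate test was used, so the sketch uses the two-sided normal approximation with a tie correction and no continuity correction. The function names and the example counts are hypothetical and do not come from the study.

```python
import math


def mutant_frequency(resistant_colonies, cells_plated):
    """Mutant frequency: resistant mutants per cell plated in one culture."""
    return resistant_colonies / cells_plated


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, approximate two-sided P value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        midrank = (i + 1 + j) / 2.0    # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0                  # all values identical: no evidence of a shift
    z = (u - mean_u) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical colony counts per culture, each plated from 1e8 cells.
reference = [mutant_frequency(c, 1e8) for c in (1, 2, 3, 4, 5)]
variant = [mutant_frequency(c, 1e8) for c in (6, 7, 8, 9, 10)]
u_stat, p_value = rank_sum_test(reference, variant)
print(f"U = {u_stat}, approximate P = {p_value:.4f}")
```

A rank-based test suits fluctuation data because mutant frequencies are heavily skewed by occasional jackpot cultures, which would distort a comparison of means.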

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–7 and Supplementary Tables 5–7. (PDF 1068 kb)

Supplementary Table 1

dnaE1 (Rv1547) PHP domain SNPs in clinical M. tuberculosis isolates. (XLSX 13 kb)

Supplementary Table 2

Protein sequences used in phylogenetic analysis. (XLS 4806 kb)

Supplementary Table 3

HMMER comparison of ε homologs to TIGR01406 (dnaQ_proteo). (XLSX 1091 kb)

Supplementary Table 4

Drug minimum inhibitory concentrations (μg/ml). (XLSX 8 kb)

Cite this article

Rock, J., Lang, U., Chase, M. et al. DNA replication fidelity in Mycobacterium tuberculosis is mediated by an ancestral prokaryotic proofreader. Nat Genet 47, 677–681 (2015).
