Introduction

The drug resistance of Mycobacterium tuberculosis, the causative agent for tuberculosis (TB), is a growing problem. Diagnosis and treatment of TB are both challenging tasks, the latter lasting for 6 months or more1. Although the WHO declared TB a global public health emergency in 1993, difficult and often incomplete treatment led to the emergence of resistant strains2,3. Over the years, M. tuberculosis has developed mechanisms of resistance against the known first-line “multi-drug resistant TB” and second line “extensively drug-resistant TB” antitubercular agents4. This development ultimately led to totally drug resistant TB (TDR-TB), resistant to all known first- and second-line antitubercular agents, observed first in 2008 in Iran and in 2012 in India5,6. To treat patients infected with TDR-TB, the development of drugs with new modes of action is urgently needed.

The 2-C-methyl-d-erythritol 4-phosphate (MEP)-pathway offers seven new target enzymes for the development of anti-TB drugs, which should, with a new mode of action, break the resistance of TDR-TB7,8. For many bacteria, this pathway is the only source of the terpene building blocks dimethylallyl diphosphate (DMADP) and isopentenyl diphosphate (IDP), essential for the biosynthesis of secondary metabolites (Fig. 1). The IspC inhibitor fosmidomycin is already used in combination therapy for the treatment of malaria, validating this pathway for the development of new drugs9. Several projects to develop further antibiotic drugs are currently running, targeting different enzymes of the MEP pathway10,11,12,13,14,15. The homologues of M. tuberculosis are also addressed. A series of lipophilic phosphonates that target the IspC enzyme from E. coli and M. tuberculosis was designed and synthesized. The corresponding co-crystal structure confirms a binding mode similar to that of the natural inhibitor fosmidomycin16. Aryl bis-sulfonamides were investigated to inhibit the downstream enzymes of the pathway: the IspF enzyme of Plasmodium falciparum and, with slightly lower activity, also the homologue from M. tuberculosis17. However, to the best of our knowledge, no qualified lead compound targeting MEP-pathway enzymes from M. tuberculosis was selected for further development.

Figure 1
figure 1

An overview of the MEP pathway and the mechanism of the reaction catalyzed by DXS. Fosmidomycin and its derivative FR-900098 are inhibitors of the DXPS downstream enzyme IspC, validating the pathway as target for the development of new antibiotics.

In our drug design project, we chose to focus on 1-deoxy-d-xylulose 5-phosphate synthase (DXPS), the first and rate-limiting enzyme of the MEP-pathway18. DXPS catalyzes the ThDP-dependent decarboxylation of pyruvate and subsequent carboligation with d-glyceraldehyde-3-phosphate (D-GAP), the second substrate, following a preferred-order, random-sequential reaction mechanism (Fig. 1)19,20. Targeting this enzyme has the additional benefit of not only inhibiting the biosynthesis of the terpene precursors, but also the biosynthesis of the vitamins B1 and B6, which are synthesized from the product of DXPS, the branch-point metabolite 1-deoxy-d-xylulose 5-phosphate (DOXP)21. A previous ligand-based approach of our group to design ThDP-competitive inhibitors of DXPS, using the Deinococcus radiodurans DXPS (drDXPS) crystal structure and an M. tuberculosis DXPS (MtDXPS) homology model, resulted in compounds showing activity against D. radiodurans, but with significantly less activity against MtDXPS22.

The modern drug-design process benefits greatly from structural information of the enzymatic target. The knowledge of the molecular interactions between an inhibitor and its target protein sets the stage for structure-based optimization and therefore holds the potential to speed up the optimization of a hit or lead compound23,24. Most enzymes of the MEP pathway are structurally well-characterized, with structures of the enzymes IspC to IspH from multiple species available, both apo and in complex with inhibitors (Table S1), highlighting the interest in structure-based drug design studies for enzymes of this pathway. In contrast, the DXPS enzyme is less explored—with just five protein structures published to date, four from D.radiodurans and one from E. coli25,26,27. The low number of crystal structures may be a consequence of the enzyme´s susceptibility to proteolytic degradation25,27,28 or conformational changes during the catalytic cycle19,29—both properties that can make it difficult to obtain a homogeneous protein sample for crystallization. We recently published a modified drDXPS protein with improved crystallographic properties and speculated that our approach should be applicable to all DXPS homologues27.

In this report, we describe the application of our truncation strategy on the DXPS homologue of M. tuberculosis, as a representative pathogen in the focus of several drug-design projects30. Following the truncation approach, we were able to obtain protein crystals and report herein the first pathogenic DXPS structures: a holo structure with bound ThDP and a structure containing the enamine reaction intermediate (Fig. 1). We also docked MtDXPS inhibitors found in the literature and provide a structural rationale for their inhibitory activity. In addition to providing a molecular model of this important anti-infective target, this also shows that our truncation approach is transferable between organisms and has the potential to facilitate the determination of structures from other DXPS homologues in the future.

Results

Application of the truncation strategy

In our recent article, we showed that the truncation of a non-conserved loop not visible in the structures of the DXPS enzyme of D. radiodurans (drDXPS) improved the crystallization of this target27. We hypothesized that this result could also be transferable to other homologues. The DXPS enzyme from M. tuberculosis (MtDXPS) was chosen to test this hypothesis. This homologue is of particular interest, as it is from a pathogenic organism and inhibitors targeting the enzyme have been investigated, utilizing a computed homology model due to the lack of an MtDXPS structure22,40.

To apply the truncation strategy, we searched the previously published multiple sequence alignment (MSA) for the protein sequence of M. tuberculosis and compared it with the sequence of D. radiodurans27. The corresponding non-conserved loop from MtDXPS has a length of 45 amino acids, comprising amino acids 190–234 (Fig. S1). We replaced this loop with a linker of seven glycines that we visually estimated to be able to compensate for the removed amino acids, according to a homology model obtained for the MtDXPS homologue22. The resulting truncated sequence (SI: Sequences) of MtDXPS (ΔMtDXPS) was then obtained as a synthetic gene.

Characterization of the truncated enzyme

The ΔMtDXPS was recombinantly expressed as a soluble protein in good yields of > 50 mg/L, whereas we could express the native protein in a range of 0.5 mg/L in LB-medium. To analyze other effects of the truncation, we compared the wild-type MtDXPS and the truncated ΔMtDXPS, using several biophysical techniques.

ΔMtDXPS integrity was analyzed by LC–MS. The sample showed a single protein with a high purity and a mass of 65,901.8 Da (Fig. S2). This mass is 705 Da lighter than calculated for the full length ΔMtDXPS protein, and the full mass could not be observed. The weight of 705 Da corresponds to the molecular weight of the last six amino acids of the protein, suggesting either incomplete translation, a common issue in protein expression, or degradation of the sample during preparation. The terminal amino acids are often not resolved in protein crystal structures due to their flexibility, so the lack of the last six residues is unlikely to be a concern.

One of the truncation goals was to reduce the degradation of the wild-type MtDXPS enzyme. Over an incubation period of up to 5 days at RT, degradation of ΔMtDXPS could not be detected on SDS-PAGE gels. In contrast, for full length MtDXPS bands corresponding to calculated degradation products of 20 and 40 kDa started to appear after 4 days (Fig. S3). This observation becomes more apparent when the time period is extended to 7 days, where a substantial decrease in band intensity of full length MtDXPS protein can be seen, while ΔMtDXPS is still stable. Therefore, similarly to drDXPS, the loop truncation reduced the tendency for protein degradation, overall increasing the stability. As an additional method to assess protein stability, the melting point (Tm) was determined using a thermal shift assay (TSA). The Tm obtained for both wild-type and truncated MtDXPS was 45.4 ± 0.4 °C and 52.7 ± 0 °C, respectively. The large difference in melting temperature between the two proteins (∆Tm = 7.3 °C) strongly suggests that the truncation improved protein stability, which is often beneficial for protein crystallization31.

Similarly to drDXPS, the truncation had only small effects on the enzymatic activity of ΔMtDXPS, as shown in Table 1. The ΔMtDXPS enzyme showed a slightly reduced affinity for the substrates and a 2–3 times higher activity, which could originate from a more accessible active site with a smaller degree of conformational variability during the catalytic cycle. Nevertheless, the kinetic parameters are still within the same magnitude and the retained enzymatic activity seems to indicate that no catalytically important residues are impacted by the truncation.

Table 1 Comparison of the enzyme kinetics from MtDXPS and ΔMtDXPS.

MtDXPS crystal structure

We previously performed extensive crystallization screening of the wild-type MtDXPS enzyme, but were unable to obtain protein crystals. In contrast, the ΔMtDXPS protein crystallized within 2 days in several conditions. From the best condition, we were able to determine the crystal structure of the holo protein of ΔMtDXPS to a resolution of 1.85 Å (PDB ID: 7A9H) (Fig. 2a). The numbering of amino acids discussed in the following text is according to the MtDXPS sequence with the UniProt ID: P9WNS3 (SI: Sequences).

Figure 2
figure 2

Overview of the holo structure of ΔMtDXPS. (a) Overview of the ΔMtDXPS structure with highlighted domains. Domain I (res 1–312) in red, domain II (res 313–483) in green and domain III (484–638) in cyan. (b) Active site with important residues for catalysis shown in stick representation. (c) Structure colored according to Cα-RMSD to the drDXPS structure (PDB ID: 2O1X), insertion regions are colored in yellow. (d) Hydrogen-bonding network that stabilizes the different H426 conformation and key residues in the network. Water molecules are shown as red spheres and hydrogen bonds as black, dashed lines with distances indicated in Ångström.

The asymmetric unit of ΔMtDXPS contains a homodimer with each subunit consisting of three distinct domains and a ThDP molecule. Domain I (res 1–312) and Domain II (res 313–483) contribute to the active site, where ThDP is bound. Domain III (res 484–638) makes extensive contacts at the dimer interface (Fig. 2a). Overall, the structure and domain arrangement of MtDXPS show high similarity to the known structures of E. coli and D. radiodurans DXPS that were described in detail by S. Xiang et al. with a Cα-RMSD of 1.390 and 1.207 Å, respectively25. The main difference between the structures is the linker from amino acids 475–489, located at the surface which adopts a different conformation (Fig. 2c). From a single example of an MtDXPS structure we cannot conclude that this variation is not a result of the crystallization process.

Three inserts in the MtDXPS sequence were found according to a multiple sequence alignment (MSA) of other bacterial DXPS enzymes (Fig. S1). The first insert comprising amino acids 216–220, is in an evolutionary diverse and disordered loop, which is part of the truncated loop in ΔMtDXPS. This region can also not be observed in the wild-type protein crystal structure of other homologues25,26,27. Indeed, the high sequence variability and disorder in the known crystal structures were the reasons to truncate this region for a more stable and more easily crystallizable protein25,26,27.

The other two inserts concern amino acids 498–502 and 523–528 and are located at the protein’s solvent-accessible surface (Fig. 2c). Both insertions occur in all Mycobacterium strains, but are in a region of high variability with a pairwise sequence identity of 13.7% and 8.6%, compared to the MSA of 498 other bacterial DXPS, respectively25. The first is a loop extension of a ß-sheet turn, while the second is 30 amino acids downstream and forms an α-helix, both are located within domain III, responsible for the extensive contacts with the dimeric interface.

As anticipated, the DXPS active site is highly conserved, and most residues of MtDXPS share the same position across homologues, including key residues shown in mutational studies to be important in catalysis and substrate binding, such as Glu365, Tyr387, Arg415, Asp422, His40 and His71 (Fig. 2b)25,32,33. Three seemingly important differences, however, could be identified near the active site of MtDXPS: Lys473, His426 and Ser112. Lys473 plays a significant role in the catalysis and substrate binding, as mutational studies of the corresponding residue on ecDXPS have shown loss of enzymatic activity25. Most bacterial DXPS have an arginine in the corresponding position of Lys473. However, the residue is in a conformation similar to the corresponding Arg480/Arg478 of drDXS and ecDXPS structures and the observed Km for D-GAP is in the same order of magnitude as for these, indicating that the residues could be interchanged without causing major structural changes in the protein32,34,35.

His426 is conserved in all bacterial DXPS but adopts a different conformation in the MtDXPS structure. Unlike most bacteria, M. tuberculosis has a serine at position 112 instead of a glycine or alanine. The serine at this position creates a steric restriction that prevents His426 from adopting its previously reported conformation, while at the same time enabling the coordination of a water molecule that stabilizes His426 in its new position. The water is highly coordinated, making hydrogen bonds with His426, Ser112, Tyr98 and other water molecules. This network extends to the aminopyrimidine N4’ and the backbone and side chains of Ser425 (Fig. 2d). Despite the overall similarities with other homologues, the structural differences close to the active site provide new information that could be specific for M. tuberculosis and will be instrumental for future structure-based drug design endeavors.

MtDXPS with enamine intermediate

In order to further elucidate the structural conformations during catalysis, we also attempted to obtain crystal structures of reaction intermediates. It has previously been shown that the lactyl-conformation of ThDP (LThDP) is stable, while the last steps of the reaction, the decarboxylation of LThDP and the addition of D-GAP, are proceeding fast after binding of D-GAP20.

A structure of ΔMtDXPS simultaneously soaked with pyruvate and D-GAP was solved to a resolution of 1.9 Å (PDB ID: 7A9G) in space group P21 (Fig. 3). The asymmetric unit contains a homodimer, which adopts a highly similar fold to the ΔMtDXPS holo structure, with an overall RMSD of 0.2 Å. Residues 289–295 are visible in the structure whilst they could not be modeled in the holo structure. In the active site, density could be observed near the C2 of ThDP, consistent with an enamine or acetyl intermediate adduct (Fig. 3)26. This decarboxylated form of LThDP shows well-defined electron density with a B-factor of 25 Å2 (average in crystal is 22 ± 5 Å2). The B-factor is similar to that of the surrounding residues and we can assume that nearly all ThDP ligands in the crystal exist in the enamine or acetyl-form. Since we could not determine whether the enamine has converted to a less reactive acetylThDP state, we will refer to the modeled molecule as the enamine-intermediate. This is the first time the enamine-intermediate was observed in a DXPS structure without growing the crystals in an oxygen-free environment, as was necessary for the first structure of a pyruvate adduct26. This is likely the result of the very short time between soaking the crystals with pyruvate and D-GAP and the subsequent flash-cooling, capturing the reaction in an intermediate state.

Figure 3
figure 3

Active site of ∆MtDXPS with enamine-ThDP intermediate. Polder map density contoured at 3σ in gray wire mesh for the enamine-ThDP intermediate (PDB ID: 7A9G), calculated using Phenix. Key residues involved in the interaction with the enamine are shown and labeled. Distances are indicated in Ångström.

We found that the enamine–ThDP intermediate is engaged in several unique interactions with ΔMtDXPS not seen in other structures of DXPS reaction intermediates. Glu293 makes hydrogen bonds with the diphosphate moiety of ThDP and the C2α-hydroxyl forms hydrogen bonds with Ser112 and His426 through the highly coordinated water also found in the holo structure (Figs. 2d and 3). In line with other reaction intermediate structures from other DXPS homologues, the N4’ of the aminopyrimidine hydrogen bonds with the C2ɑ-hydroxyl.

During the catalytic cycle, ThDP reacts with pyruvate to form stabilized LThDP. Rapid decarboxylation can occur upon D-GAP binding, leading to the formation of the enamine-intermediate. In ∆MtDXPS, the enamine intermediate seems to be stabilized via a highly coordinated water molecule also found in the holo structure. Similarly, the water molecule is coordinated to the enamine intermediate, Ser112, Tyr98 and His426, through a network that further extends to Ser425 and the backbone of Ser112. This network of hydrogen bonds might help delocalize the negative charge of the carbanion during the reaction (Fig. 4a). When the enamine is not present (holo structure, PDB ID: 7A9H), the position of the enamine-hydroxyl group is occupied by a second water atom that maintains the hydrogen-bond network (Fig. 2d).

Figure 4
figure 4

Comparison of drDXPS and MtDXPS enamine intermediate active sites. (a) Ser-network/H-bonds stabilizing the enamine intermediate. Residues from MtDXPS are colored in cyan, while the corresponding residues from drDXPS (PDB ID: 6OUW) are colored in yellow. Hydrogen bonds are shown as dashed lines with distances in Ångström. (b) “Fork” motif. The “fork”-like motif found in the intermediate structure is highlighted in cyan, the similar motif in drDXS is colored in yellow. The key difference is the presence of Glu293 in the MtDXPS structure. (c) ESSH sequence motif. The ESSH sequence motif found in the ∆MtDXPS structures is shown in pink. The table shows the sequence alignment of bacterial DXPS, with several sequences bearing the ESSH sequence motif shown for representation, highlighted with a pink box.

In protein structures of drDXPS and ecDXPS, this hydrogen bond (H-bond) network cannot be observed and the enamine-intermediate stabilization is directly maintained through His434 (His426 in MtDXPS) (Fig. 4a). As previously stated, both enzymes have a glycine instead of a serine at position 112 (121 in E. coli, 123 in D. radiodurans), and consequently no H-bond network can be formed.

The Ser112 is unique for certain bacteria, a multiple sequence alignment with 498 bacterial DXPS sequences showed that only 33 have this amino acid (Table S3). Interestingly, most bacteria that have a serine at this position also have a conserved sequence pattern. The pattern starts with Glu110, followed by Ser111, Ser112 and His113, which we called ESSH sequence motif (Fig. 4c). His 113 is essential, with 100% conservation, but the rest of the sequence is in a variable region with only 43.8% similarity. All Mycobacteria, together with the closely related Corynebacteria and a few other species, bear this sequence motif, highlighting that this different mechanism could be specific for those bacteria (Table S3).

Residues near the active site set MtDXPS apart from other DXPS and could account for its lower activity

Two crystal structures from D. radiodurans were recently published by Drennan and coworkers, in which a “fork” and a “spoon” motif were introduced. These motifs adopt different conformations during the catalytic cycle26,29. The “fork” and “spoon” motifs are two loops adjacent to each other, in positions 292–306 and 307–319 (drDXPS). The corresponding amino acids in MtDXPS are 283–299 and 300–312, for the “fork” and “spoon”-motif, respectively. The “fork” motif in the MtDXPS structure seems to adopt a slightly different fold and does not include His304 (drDXS) (Fig. 4b). In both structures of MtDXPS, Histidine 296, which corresponds to His304 in drDXPS, could not be observed. This His304 is part of the active site and has previously been identified by mutational studies as important for the catalysis through a stabilization of both LThDP and the closed conformation of the enzyme19,26. Experiments have demonstrated a 90% reduction of DXPS activity when His304 is mutated to alanine32. In both structures of MtDXPS, no histidine making similar interactions was observed. In chain A of our intermediate structure (PDB ID: 7A9G, chain A), part of the residues that would correspond to the fork motif orient in a similar, but slightly different fold, resulting in the position of catalytically important His304 in drDXPS being occupied by Glu293, which forms hydrogen bonds with the diphosphate group of ThDP (Fig. 4b). These residues are disordered in chain B and the holo structure. Since His304 (drDXPS) stabilizes the LThDP by interacting with its carboxylate group, the presence of Glu293 in that location in MtDXPS and its interaction with the diphosphate may lead to a lower stabilization of the LThDP intermediate and the closed conformation of the protein, potentially contributing to the lower catalytic activity.

GAP binding site

Observation of the second substrate D-GAP was more difficult in the protein crystals soaked with both substrates. This was expected, as we did not trap the enzyme in a catalytic state for instance, by a dead-end substrate, but instead exposed the protein crystals to a solution allowing catalytic activity. This resulted in different catalytic states in the same protein crystal. However, it was possible to identify the electron density of a phosphate-like moiety and model it with an elevated B-factor of 49.3 Å2 (Fig. 5a). While the C3-body of D-GAP cannot be built with confidence, the crystallization conditions do not contain free phosphate, indicating that the observed phosphate density could be provided by a partially disordered D-GAP molecule. The electron density for this moiety can be observed clearly in chain B of the homodimer, while it is more diffuse in chain A. The phosphate is located in close proximity to the enamine intermediate and forms hydrogen bonds with Tyr387, Arg415, His426 and Lys473 with distances between 2.8 and 3.4 Å. The interacting residues correspond to the residues identified to make up the D-GAP binding site in previous studies, determined by molecular docking or kinetic studies and alanine-scanning20,35.

Figure 5
figure 5

MtDXPS D-GAP binding site with modeled phosphate and docked D-GAP. (a) ∆MtDXPS structure (PDB ID: 7A9G) with polder map density contoured at 3σ in gray wire mesh for the phosphate. Key residues involved in the interaction with the phosphate are shown in stick representation and labeled. Hydrogen bonds are shown as black, dashed lines. (b) D-GAP molecule docked into the D-GAP binding site of ∆MtDXPS after removal of the phosphate. The polder density map of the phosphate present in the crystal structure is shown to highlight the similar orientation and interactions of the 3-phosphate moiety of D-GAP to the modeled phosphate. The residues predicted to interact with the docked D-GAP are shown in stick representation and are labeled. Predicted hydrogen bonds are shown as black, dashed lines, while the distance between the D-GAP C3 and the enamine intermediate 2Cα are shown as blue, dashed lines.

Irrespective of the presence of a fully ordered D-GAP, the density of the putative phosphate group provides a reasonable starting point to model and dock the complete D-GAP substrate (Fig. 5b). The substrate molecule docks well into the binding site, with its 3-phosphate moiety superimposing with the observed density and making the same interactions with Tyr387, Arg415, His426 and Lys473 as the phosphate visible in the electron density. The carbonyl group interacts with Tyr98 and His40, which helps to orient the C3-body in an ideal position for the nucleophilic attack of the C2-carbanion of the enamine intermediate (Fig. 5b).

Docking of MtDXPS inhibitors from the literature

Hydroxybenzaldoximes

With the first protein crystal structure of a pathogenic DXPS homologue it is now possible to understand the binding of inhibitors of DXPS by in-silico docking. As an example, the class of trihydroxybenzaldoximes was chosen. This compound class is one of very few reported D-GAP-competitive inhibitors and does not compete with pyruvate36. With the first crystal structure that resembles the conformation of the active site during the D-GAP addition, and the additional hint for the D-GAP position, we were interested in the interactions of this compound class with the protein.

The class of trihydroxybenzaldoximes was published first as inhibitors of DXPS by Bartee et al.36. The inhibitors were developed based on the similarity with the substrates of DXPS. One side was kept as analogue of pyruvate, the first and specific substrate of DXPS, the other side of the chain was varied with aryl residues, based on previous findings that DXPS has a high substrate promiscuity and is able to accept several different acceptor substrates37,38. During inhibitor development, Bartee et al. discovered the high activity of compound 1, which had no pyruvate substitute, but was rather a symmetrical oxime (Fig. 6).

Figure 6
figure 6

Example structure of the literature known DXS inhibitor class of hydroxybenzaldoximes 136.

We docked the reported class of hydroxybenzaldoximes to our protein structure, using the docking software SeeSAR. The compound with the highest biological activity (Ki 1 ± 0.2 µM) against DXPS from D. radiodurans, compound 1, docks well into the active site (Fig. 7). The docking resulted in 20 poses ranked using affinity scores, as calculated by HYDE. The docking pose with the highest affinity score was chosen and shows interactions of the hydroxyl groups from one part of the molecule with the amino acids of the D-GAP binding site, while the linker between the two similar warheads is spanning over the active site. Besides hydrogen bonds with the protein, the second part can form chelating interactions with the bound Mg2+-ion. This metal interaction is remarkable, as it allows the compound to interact as the sixth coordination partner of the cation, while not competing with the diphosphate group of the ThDP cofactor, but instead additionally forming an H bond with the terminal β-phosphate. When docked to the protein structure with the bound intermediate, the inhibitor adopts a similar pose, shielding the enamine-ThDP from the solvent. This enclosure of the cofactor or intermediate is a possible explanation of the observed distinctive inhibition mode, competitive to D-GAP and noncompetitive to pyruvate36.

Figure 7
figure 7

Docking of the hydroxybenzaldoxime 1 to ∆MtDXPS. (a) Hydroxybenzaldoxime 1 docked in the active site of ∆MtDXPS with cofactor ThDP bound (PDB ID: 7A9H). The pose of the docked molecule with the highest affinity score, as calculated by HYDE, is shown in pink and the predicted interacting residues are shown in stick presentation and are labeled. Hydrogen bonds are shown as black, dotted lines with distances indicated in Ångström. (b) 2-Dimensional ligand interaction plot of Hydroxybenzaldoxime 1 with ∆MtDXPS.

Discussion

DXPS is one of the most important enzymes of the 2-C-methyl-d-erythritol 4-phosphate (MEP)-pathway and is a key target for the development of new anti-infective drugs to treat infections with multidrug-resistant bacteria. Here, we have applied our previously reported truncation approach and removed a disordered internal loop, to successfully crystallize the DXPS from an important anti-infective target, namely M. tuberculosis.

The ΔMtDXPS structures show a high degree of similarity to known structures. However, our work proposes a different stabilization mechanism of the enamine intermediate in MtDXPS when compared to the enamine-intermediate structure of drDXPS. We propose that the presence of the highly coordinated water, found in both the holo and enamine-intermediate structures, interacting with Ser112, His426 and Y98, and the different conformation of H426 might help delocalize the carbanion charge, yielding a more stable intermediate. Even though mutational studies with E. coli and D. radiodurans have shown that the corresponding residues of H426 (H434 in drDXPS and H431 in ecDXPS) are not essential for catalysis and may therefore mainly contribute to substrate binding affinity, the different conformation of this residue in the ΔMtDXPS structures suggests that it plays a different role for M. tuberculosis DXPS32.

Compared with other DXPS enzymes, which are reported to have kcat values in the range of 0.5–25 s–1, the MtDXPS possesses the slowest reaction kinetics for a DXPS homologue described to date, with a kcat value in the range of 0.005 s–1 8,39. The distinct enamine-intermediate stabilization mechanism found in ΔMtDXPS might explain its slower reaction rate. The Ser112 present in all Mycobacterium could be an important factor that explains this difference in the stabilization mechanism, however, there are no catalytic data from any other homologue containing a serine in this position reported to date. Obtaining kinetic information of such homologues could support these observations. In addition, to confirm whether the serine has any impact on the observed difference in kcat, the position should be mutated in the MtDXPS. The same holds true for the other amino acids involved in the hydrogen-bond network of the water molecule (Tyr 98, His426 and the ESSH sequence motif). Furthermore, these positions could be inserted into other, well characterized DXPS enzymes, such as drDXPS or ecDXPS, in order to obtain valuable information about their potential effect on the reaction rate.

Independent of the direct effect of the serine or the sequence motif on the catalytic activity, species with these features are more likely to have a slightly larger active site, due to the different rotameric state of His426 (Fig. 4a). By targeting this pocket or introducing a group that can interact with the H-bonding network in a similar manner to the intermediate, it might be possible to develop DXPS inhibitors selectively targeting M. tuberculosis over other species.

Distinct from previous studies performed on drDXPS, which found a fork-spoon motif that is positioned to open and close the active site based on substrate binding19,26, a similar but divergent structural feature was found for MtDXPS. This ‘fork-like’ motif utilizes a different residue for the interaction with the cofactor, potentially leading to a lesser degree of fixation of the active site conformation.

As of yet, we could not obtain a structure of MtDXPS with a fully ordered spoon-fork motif. This leaves the question open whether His296 interacts, in any conformation, with the active site of MtDXPS. If His296 is permanently located outside of the active site and the corresponding position is taken by Glu293, which makes different interactions with the diphosphate moiety of ThDP, this could also explain the lower enzymatic activity of MtDXPS. As mentioned, the mutation of His304 to alanine in drDXPS leads to a 90% reduction in activity32. Similar mutational studies in MtDXPS could be performed to evaluate their impact on the enzymatic activity, such as mutating Glu293 and His296 to alanine, as these might be crucial for the observed differences in the spoon-fork motif. Since drDXPS and ecDXPS do not have a corresponding amino acid in this position (Fig. S1), inserting it in their sequences could grant further insight into its role. A MtDXPS structure containing the LThDP mimic PLThDP could provide valuable insights into the conformation and interactions of the fork-like motif pre-decarboxylation and its potential effect on the catalytic activity of MtDXPS. Unfortunately, thus far, we were unable to obtain such a structure. However, the availability of high-resolution crystals of MtDXPS allows a more complete exploration of the molecular mechanisms behind the lower DXPS activity of MtDXPS, which will be the subject of future research.

Furthermore, this was the first time that phosphate-like electron density could be observed near the predicted D-GAP binding site in a DXPS structure, providing new evidence of the putative binding site and residues involved in substrate stabilization. While the observed density and docking of the D-GAP substrate do not provide definitive evidence of D-GAP binding, these results fit well with expectations of D-GAP substrate binding20,35. Chain A of the intermediate structure shows a partially folded, fork-like motif with Glu293 interacting with ThDP and only weak density in the D-GAP binding site. The fork-like motif in chain B is disordered but phosphate-like density can be observed in the D-GAP binding site. It could be postulated that the enzyme can adopt both the open and closed conformation post-decarboxylation depending on the presence of D-GAP/phosphate. Bound D-GAP/phosphate would induce an open conformation whilst the fork-like motif closes in the absence of D-GAP.

As mentioned, our group attempted a ligand-based approach for the discovery of inhibitors of DXPS using a drDXPS crystal structure and a homology model of MtDXPS derived from that structure. This yielded compounds that showed activity against drDXPS but less so against MtDXPS. The homology model active site differs from the mtDXPS crystal structure active site in the same manner as the drDXPS active site differs from the MtDXPS, which can be explained by the fact that it was modeled using the drDXPS crystal structure. The conformation of the spoon-fork motif, the orientation of His426, the absence of the coordinated water and the structural position of Glu293 of the homology model match the drDXPS crystal structure, whilst the MtDXPS crystal structure shows that these are distinct (data not shown). Since these amino acids were proposed to form several key interactions with the compounds designed in our previous paper22, it could explain why they did not have the expected activity against MtDXPS. This highlights the importance of obtaining crystal structures as computationally generated models do not always reflect the real conformations of proteins, leading to inhibitor design using incorrect data.

In summary, the successful use of the loop truncation for the crystallization of DXPS from a different species, provides a solid platform that can extend the structural study of DXPS homologues. All structural differences obtained, such as the ones found for MtDXPS, can then assist in the development of new antibiotics with a high specificity for species that are already drug-resistant, such as M. tuberculosis. Additionally, our results provide a new opportunity to investigate and improve previously identified inhibitors of MtDXPS by docking them into the experimentally determined structures, allowing for the elucidation of SARs and facilitating the design and hit-to-lead optimization of TB-specific inhibitors. The next step will be to obtain co-crystal structures with inhibitors bound to the enzyme.

Methods

Expression and purification of MtDXPS

The design of a pET22b plasmid hosting the native MtDXPS gene with an N-terminal His-Tag is reported elsewhere40. The plasmid was transformed into E. coli BL21 (DE3) cells, which were grown at 37 °C. At an OD600 of ~ 0.6, protein expression was induced by the addition of isopropyl β-d-thiogalactopyranoside (IPTG) to a final concentration of 0.1 mM, and the cells were incubated at 18 °C overnight. The bacteria were harvested by centrifugation and resuspended in lysis buffer [50 mM Tris–HCl pH 8.0, 300 mM NaCl, 10 mM imidazole, 5 mM β-mercaptoethanol (β-ME), 5 mM MgCl2, 2.5 U/mL benzonase, 1 tablet/200 mL Complete Protease inhibitor, EDTA free (Roche)]. The cells were lysed using a Microfluidizer M-110P, and the lysate clarified by centrifugation for 45 min at 18,000g. The lysate was filtered using a 0.45 µm syringe filter, loaded onto a metal-affinity HisTrap HP 5 mL column (GE Healthcare) and eluted using a linear gradient of imidazole from 10 to 500 mM. Protein-containing fractions were pooled and concentrated using VivaSpin ultrafiltration devices with a MWCO of 30 kDa. The concentrated sample was further purified by gel filtration on a HiLoad 16/600 Superdex 200 pg column equilibrated in 20 mM Bis–Tris-Propane, pH 7.5; 300 mM NaCl, 1 mM MgCl2, 0,5 mM TCEP-HCl.

Expression and purification of ∆MtDXPS

The truncated MtDXPS gene was obtained commercially as a synthetic gene and cloned into a pETM-11 plasmid using the NcoI and HindIII restriction sites. Expression, lysis and the initial IMAC purification were performed as above for full-length MtDXPS. The protein-containing fractions were subsequently combined and diluted with a low-salt buffer consisting of 50 mM HEPES pH 8.0, 5% glycerol, 5 mM dithiothreitol (DTT) and 100 µM MgCl2 to a conductivity of 8 mS/cm. The solution was then loaded on a Resource Q anion exchange column and eluted with a linear NaCl gradient from 0 to 1 M. The protein-containing fractions were pooled and purified by gel filtration on a HiLoad 16/600 Superdex 200 pg column equilibrated with 20 mM HEPES pH 8.0, 250 mM NaCl, 5% glycerol, 5 mM DTT. The protein-containing fractions were concentrated to 5.5 mg/mL using a Vivaspin centrifugal concentrator (MWCO 30 kDa, Sartorius), and the His-tag was cleaved by TEV-protease digestion at 10 °C overnight. Removal of the tag and protease was achieved by reversed IMAC chromatography, and the protein was purified again by gel filtration on a HiLoad 16/600 Superdex 200 pg column using 20 mM MOPS pH 7.50, 200 mM NaCl, 5% Glycerol, 2 mM DTT as buffer. The purified protein was concentrated to 10 mg/mL in a Vivaspin centrifugal concentrator (MWCO 10 kDa, Sartorius).

Protein crystallization and structure determination

A high-throughput crystallization robot (Mosquito, SPT Labtech) was used to perform sitting-drop vapor-diffusion screenings for crystallization conditions for ΔMtDXPS. Prior to screening, 1 mM of ThDP and 2 mM MgCl2 were added to the protein solution (10 mg/mL) and then incubated at 4 °C for 3 h. The mixture was then screened against commercially available sparse-matrix screening kits (PACT premier, JSCG plus and MORPHEUS; Molecular Dimensions) at 18 °C. 200 nL of protein solution and 200 nL of crystallization buffer were equilibrated against 50 μL of reservoir solution. Initial crystals were obtained after 2 days in MORPHEUS conditions B2 (0.1 M MES/imidazole pH 6.5, 10% w/v PEG 8000, 20% v/v ethylene glycol, 0.3 M sodium fluoride, 0.3 M sodium bromide and 0.3 M sodium iodide) and C2 (0.1 M MES/imidazole pH 6.5, 10% w/v PEG 8000, 20% v/v ethylene glycol, 0.3 M sodium nitrate, 0.3 M disodium hydrogen phosphate and 0.3 M ammonium sulfate) as well as PACT premier D2 (0.1 M MMT-buffer pH 5.0, 25% PEG1500). The structures described in this manuscript were obtained from crystals grown in condition PACT premier D2. The crystals were cryo-protected using reservoir solution supplemented with 30% (v/v) PEG 400 and flash-cooled in liquid nitrogen.

To obtain ΔMtDXPS with reaction intermediate, crystals grown in the same condition were harvested, soaked for one minute in the above-mentioned cryo-protecting solution supplemented with 1 mM of D-GAP and 1 mM of pyruvate and subsequently flash-cooled in liquid nitrogen. X-ray diffraction data for the ΔMtDXPS structures were collected on beamline P13 operated by EMBL Hamburg at the PETRA III storage ring (DESY, Hamburg, Germany) at 100 K. Data indexing, integration and scaling was performed using XDSAPP41 and AIMLESS42 from the CCP443 software package. The structure was solved using the MOLREP44 software in CCP4 using the previously solved drDXPS homologue (PDB ID: 2O1X) as the reference model. The resulting models were then subjected to iterative cycles of model building and refinement with COOT45 and REFMAC546.

The structures were deposited in the PDB with accession codes 7A9H and 7A9G, corresponding to the holo and reaction intermediate structures, respectively. Table S2 contains the data collection and refinement statistics.

Enzymatic assay

The DXPS activity was analyzed at RT as previously reported, with minor modifications40,47. A continuous kinetic photometric assay was used to measure DXPS activity. NADPH depletion by the downstream IspC enzyme was determined in a microplate reader (PHERAstar, BMG Labtech) by monitoring the decrease in absorbance at 340 nm. Total assay volume was 60 µL, containing 200 mM HEPES pH 8.0, 2 mM DTT, 1 mM MgCl2, 0.3 mM NADPH and 1.5 µM IspC (from E.coli, expressed and purified in-house according to a literature procedure)48. The amount of DXPS used in the assays was determined experimentally by a dilution series of the enzyme. These were 5 µM and 2 µM for ∆MtDXPS and MtDXPS, respectively, as they showed the highest linear reaction velocity without observable substrate depletion over a time range of 30 min. The reaction was monitored at RT for 30 min after addition of the substrate(s) and 1 min of centrifugation (2000 rpm). To determine the corresponding Km values, the compounds ThDP, pyruvate and D-GAP were used in varying concentrations. If a substrate or cofactor was kept constant, a concentration of 0.2 mM was used for ThDP, 0.5 mM for pyruvate and 2 mM for D-GAP.

Blank correction and linear fitting of the absorption data was performed using the program Origin 2019 (OriginLab). The initial velocities obtained were plotted against the substrate concentrations, and the Km values were determined by nonlinear curve fitting using the Michaelis–Menten model of the enzyme kinetics add-on of Origin2019.

Thermal shift assay (TSA)

Thermal shift analyses were performed using an ABI StepOneplus RT-PCR instrument. The samples were measured in white 96-well plates. Denaturation was achieved using a continuous heating rate of 1 °C/min from 20 to 95 °C. The total sample volume was 25 µL, consisting of 20 µL TSA buffer (20 mM Tris–HCl, pH 8.0; 100 mM NaCl, 5 mM MgCl2), 2.5 µL protein solution and 2.5 µL dye (Sypro Orange, Sigma-Aldrich). The optimal concentrations were experimentally determined. A final concentration in the plate of 1.5 µM protein and 5× SYPRO Orange yielded the best signal-to-noise ratio. All measurements were performed in duplicate.

LC–MS

All ESI–MS-measurements were performed on a Dionex Ultimate 3000 RSLC system using an Aeris Widepore XB-C8, 150 × 2.1 mm, 3.6 µm dp column (Phenomenex, USA). Separation of 1 µL sample was achieved by a linear gradient from (A) H2O + 0.1% formic acid (FA) to (B) ACN + 0.1% FA at a flow rate of 300 µL/min and 45 °C. The gradient was initiated by a 0.5 min isocratic step at 2% B, followed by an increase to 75% B in 10 min to end with a 3 min step at 75% B before re-equilibration with initial conditions. UV spectra were recorded by a DAD in the range from 200 to 600 nm. The LC flow was split to 75 µL/min before entering the maXis 4G hr-ToF mass spectrometer (Bruker Daltonics, Bremen, Germany), using the standard Bruker ESI source. In the source region, the temperature was set to 200 °C, the capillary voltage was 4000 V, the dry-gas flow was 5.0 L/min and the nebulizer was set to 1.0 bar. Mass spectra were acquired in positive ionization mode ranging from 600 to 1800 m/z at 2.5 Hz scan rate. Protein masses were deconvoluted by using the Maximum Entropy algorithm (Copyright 1991–2004 Spectrum Square Associates, Inc.).

Modeling and docking

The computer program SeeSAR, version 10.3.1 from BioSolveIT was used to generate the docking poses and calculate binding affinities. The software uses the FlexX docking algorithm for the placement of ligands49. The affinities are estimated using the HYDE scoring function, which calculates the binding affinities based on the hydration differences between the bound and unbound state of the molecule50,51. The binding site was chosen around the ThDP ligand, extending to the residues Tyr387, Arg415 and Lys473, which bind the D-GAP substrate. Sequence numbering is based on the Uniprot sequence file with the code P9WNS3. Analysis and visualization of the results were done using the program StarDrop, version 6.6.7.25378 from Optibrium.