Crystal structures of Mycobacterium tuberculosis HspAT and ArAT reveal structural basis of their distinct substrate specificities

Aminotransferases of subfamily Iβ, which include histidinol phosphate aminotransferases (HspATs) and aromatic amino acid aminotransferases (ArATs), are structurally similar but possess distinct substrate specificities. This study, encompassing structural and biochemical characterisation of HspAT and ArAT from Mycobacterium tuberculosis, demonstrates that the residues lining the substrate binding pocket and N-terminal lid are the primary determinants of their substrate specificities. In mHspAT, hydrophilic residues in the substrate binding pocket and N-terminal lid allow the entry and binding of its preferential substrate, Hsp. On the other hand, the hydrophobic nature of both the substrate binding pocket and the N-terminal lid of mArAT is responsible for the discrimination against a polar substrate such as Hsp, while facilitating the binding of Phe and other aromatic residues such as Tyr and Trp. In addition, the present study delineates the ligand-induced conformational rearrangements, providing insights into the plasticity of aminotransferases. Furthermore, the study also demonstrates that the adventitiously bound ligand 2-(N-morpholino)ethanesulfonic acid (MES) is indeed a specific inhibitor of HspAT. These results suggest that previously untapped morpholine-ring scaffold compounds could be explored for the design of new anti-TB agents.

mHspAT and mArAT exhibit topology similar to homologues belonging to subfamily Iβ aminotransferases. To decipher the structural basis of the distinct substrate specificities of mHspAT and mArAT, we elucidated their three-dimensional (3D) structures. The apo and liganded forms of mHspAT crystallized in hexagonal and orthorhombic space groups, respectively, whereas both the mArAT-succinate and mArAT-Phe complexes crystallized in orthorhombic space groups. Both classes of enzymes display the canonical aminotransferase fold of the PLP-dependent Iβ subfamily. Their tertiary structure can be described as analogous to a curved left hand with clustering of three distinct structural motifs in the palm, thumb and fingers positions (Fig. 2a). The PLP-binding domain in the palm position (palm domain) encompasses a major portion of the polypeptide chain and consists of a seven-stranded β-sheet sandwiched between two bundles, each with three α-helices. The C-terminal domain resembling the fingers (fingers domain) forms a roof-shaped three-helix bundle resting over a small β-sheet. Helix 9 of the thumb domain connects the palm and fingers domains. An N-terminal 'lid', consisting of an approximately 40-residue-long loop, protrudes from the thumb domain and closes over the PLP-binding domain (Fig. 2a). Typically, the biologically active form of Iβ aminotransferases is a homodimer whose monomers are related by a pseudo 2-fold symmetry. Both mHspAT and mArAT adopt a similar functional unit whose protomers are aligned in an inverted manner with the axis of the molecular dyad passing through the interface (Fig. 2b). Residues that form the interface of the dimer protrude from the palm and thumb domains as well as the N-terminal lid of both protomers. The dimers of both mHspAT and mArAT contain two active sites in similar positions approximately 25 Å apart, with residues from both chains lining the cleft (Fig. 2b,c).
The active site cavity is lined by residues mainly contributed by the palm domain. The fingers domain forms the roof of the cleft and the N-terminal lid shields the active site from the solvent in the ligand-bound form. The crystallographic dimerization involves the burial of approximately a fifth of the surface area (17,000 Å²) of an individual monomer (Supplementary Table S2). The two structures align to each other with a root mean square deviation (rmsd) of 2.25 Å over 543 Cα atom pairs.
Figure 3. (a) Electron density (ED) maps ($2|F_o| - |F_c|$ (blue) and $|F_o| - |F_c|$ (red) at 1σ and 2.5σ contour levels, respectively) for the holo form show MES bound near PLP. ED maps ($2|F_o| - |F_c|$ (blue) and $|F_o| - |F_c|$ (red) at 1σ and 2.5σ contour levels, respectively) are shown for PLP-Phe (b) and PMP and succinate molecules (c) bound in the active site of mArAT complexes. The two distinct substrates are recognized by mArAT using the "arginine switch" mechanism, in which Arg322 (yellow) moves away from the active site to accommodate the neutral phenyl ring of Phe.
Cofactor binding triggers large conformational changes in mHspAT. Analyses of the ED maps of the holo form showed MES bound near PLP (Fig. 3a). Also observed was the binding of PLP as an internal aldimine via a Schiff's base with Lys232 Nζ. The orientation of the morpholine-ring compound is similar to that of Hsp in substrate-bound E. coli HspAT (eHspAT; 31% sequence identity; Fig. 4) 11,12, suggesting that MES binds in the active site of mHspAT. Superimposed structures of the native and ligand-bound forms of mHspAT show differences in the regions of the palm, thumb and fingers domains. They also reveal that the N-terminal lid is restructured upon binding of ligand (Fig. 5a-e), leading to a 'closed' conformation of the enzyme necessary for the binding and probable catalysis of the substrate. The lid pivots about Arg35, with an rmsd between its 'open' and 'closed' forms of 4.25 Å (over 41 Cα atom pairs) (Fig. 5a).
The closing of the lid upon ligand binding causes Tyr25 to sweep into the active site region and interact with the ligand. The backbone of the C-terminal domain undergoes a subtle lateral shift (rmsd of 0.28 Å over 78 aligned Cα atoms) with only three residues undergoing conformational changes in their side chains. This is unlike the previously reported homologous structures, where the C-terminal domain residues undergo major conformational changes upon transition between 'open' and 'closed' conformations 13-16. In mHspAT, Arg337, Arg346 and Val339 in the C-terminal domain undergo a change in the conformation of their side chains. The conformational changes of the two Arg residues are also implicated in ligand binding, as described later.
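The rmsd values quoted for the lid and the C-terminal domain compare matched Cα positions in two conformations. The following is a minimal sketch of that quantity, not the procedure used in this study (the superposition software is not named here). It assumes the two coordinate sets are already superposed and paired in the same order; the function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired, pre-superposed atoms.

    coords_a, coords_b: sequences of (x, y, z) tuples in the same order.
    The result has the same length unit as the input (e.g. Å).
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical Cα positions (Å) for three residues in two conformations.
open_lid = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
closed_lid = [(0.0, 0.0, 0.0), (3.8, 3.0, 0.0), (7.6, 0.0, 4.0)]

print(round(rmsd(open_lid, closed_lid), 2))
```

Because the sum runs only over the matched pairs, an rmsd is meaningful only together with the number of Cα pairs it was computed over, which is why both are reported in the text.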
The 'open' and 'closed' conformations of mHspAT monomers also impact the packing of the homodimer. The apo dimeric form of the enzyme adopts a relatively compact organization of the monomers compared to the typical dimeric arrangement seen for the liganded form of aminotransferases. This results in a shift of 6.5 Å between the centres of mass (CoMs) of the monomers of the apo and holo forms of the mHspAT dimer (Supplementary Fig. S4). Furthermore, other changes in the vicinity of the active site upon interaction with ligands include helix formation in two loop regions and displacement of two helices (Fig. 5b-e). The two loop regions, Ser126-Thr138 and Phe236-Arg240, become ordered to form helices, which reorients Tyr127 and Arg240 and enables these residues to interact with the ligands (Fig. 5b,d). Concurrently, two helices (Fig. 5c,e) undergo concerted displacement, facilitating the interaction of the residues Asn103 and Arg337 with the ligands. Changes in the interactions of the residues at the interface are also observed, which include reshuffling of the four salt bridges between residues. The binding of the ligand thermodynamically stabilizes the protein, as is exemplified by the increase in the number of H-bonds (from 20 to 31) and non-bonded interactions (from 306 to 390).
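A change in dimer packing can be summarised by the distance between the centres of mass of the two monomers. The sketch below illustrates one way to compute such a shift. How the 6.5 Å value was measured is not specified in the text, so the comparison of inter-monomer separations used here is an assumption, and the function names and toy atoms are hypothetical.

```python
import math


def centre_of_mass(atoms):
    """Mass-weighted mean position of atoms given as (mass, x, y, z) tuples."""
    atoms = list(atoms)
    total_mass = sum(atom[0] for atom in atoms)
    if total_mass <= 0:
        raise ValueError("total mass must be positive")
    return tuple(
        sum(atom[0] * atom[axis] for atom in atoms) / total_mass
        for axis in (1, 2, 3)
    )


def com_separation(monomer_a, monomer_b):
    """Distance between the centres of mass of two monomers."""
    return math.dist(centre_of_mass(monomer_a), centre_of_mass(monomer_b))


# Toy two-atom "monomers" (mass in Da, coordinates in Å); purely illustrative.
chain_a = [(12.0, 0.0, 0.0, 0.0), (12.0, 2.0, 0.0, 0.0)]
chain_b_apo = [(12.0, 30.0, 0.0, 0.0), (12.0, 32.0, 0.0, 0.0)]
chain_b_holo = [(12.0, 35.0, 0.0, 0.0), (12.0, 37.0, 0.0, 0.0)]

shift = com_separation(chain_a, chain_b_holo) - com_separation(chain_a, chain_b_apo)
print(shift)
```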
Active site of mArAT can accommodate a dicarboxylate and an aromatic amino acid. The binding of Phe and succinate (Fig. 3b,c) in the active site of mArAT, as observed in the crystal structures of the mArAT-PLP-Phe and mArAT-PMP-succinate complexes, revealed the molecular basis of the substrate selectivity of mArAT. The two liganded structures are virtually identical (an rmsd of 0.28 Å over 305 Cα atom pairs). In the substrate-bound mArAT complex, Phe is present in the active site of both monomers of the homodimer, though it exists as a PLP-Phe external aldimine intermediate complex in one of the monomers and as free Phe in the other monomer. The carboxylate group of the free form of Phe is held in place through interactions with Asn157, and the PLP-bound form makes an additional interaction with Arg330. Utilizing the structural information gained from the two structures, we traced the plausible conformational changes that mArAT undergoes, from accommodating a dicarboxylate to creating a suitable cavity for Phe entry and binding (Supplementary Movie S1). The N-terminal lid regulates the entry and binding of Phe by rearranging the side chains of Tyr15, along with Glu111, Leu112 and Arg322. A stretch of residues of the lid exhibits plasticity (Leu9 to Ala24, rmsd 2.9 Å over 16 aligned Cα pairs) by adopting different conformations in the free and Phe-bound forms. Tyr15 forms an H-bond with Arg322 when Phe is covalently linked to the cofactor in one monomer. In the other monomer, the carboxylate group of 'free' Phe forms an H-bond with Arg322. In the dicarboxylate-bound form, succinate exists as a non-covalently linked moiety, forming a 'twisted' bidentate H-bond/ion pair with Arg322 and Arg330 of the C-terminal domain. Succinate essentially mimics L-glutamate, an amino donor (Supplementary Fig. S3c,d).
Rearrangement of the H-bond network in ArAT from Paracoccus denitrificans 17,18 and the "arginine switch" in an engineered TyrAT from E. coli 19 facilitate dual substrate recognition and binding. The present structures clearly demonstrate the role of the "arginine switch" mechanism in the binding of the carboxylate group of Phe, in which Arg322 moves away from the active site to allow the access of the bulky head group of Phe (Fig. 3b,c). However, as our study suggests, the preferential binding of Phe and discrimination against Hsp in mArAT is also impacted by the presence of specific active site residues.

Mode of PLP binding is conserved in both mHspAT and mArAT. The unifying factor amongst transaminases is the requirement of PLP as a cofactor. The cofactor PLP/PMP is lodged into the active sites of mHspAT and mArAT through a number of interactions which are conserved across the family of Iβ aminotransferases 2. In mHspAT, these include as many as nine H-bonds and a salt bridge formed between the active site Arg240 and the phosphate moiety of PLP (Fig. 6a). The pyridine ring of PLP bound in the active site of mHspAT is further stabilized by π-π stacking with Tyr127 and covalent linkage of its C4A atom with the Nζ of the active site Lys232. Similarly, H-bonds, a salt bridge and π-π stacking with the active site Phe110 stabilize the PLP in mArAT (Fig. 6b). Since PLP-binding is largely conserved, the basis of substrate specificity of different classes of aminotransferases can only be attributed to the chemical environment of their substrate binding pocket.
Hydropathicity of active site pockets dictates substrate specificities of mHspAT and mArAT. A structural alignment of the MES-bound form of mHspAT with eHspAT bound to PLP and Hsp shows that MES in mHspAT binds in a position similar to that of Hsp in eHspAT (Fig. 4). In order to chart out the interactions that Hsp utilizes to bind in the active site of mHspAT, Hsp was docked into the PLP-mHspAT complex. Analysis of the mHspAT-MES and mHspAT-Hsp structures suggests that the two ligands have similar interactions in the active site pocket (Fig. 6a,c). In particular, three active site residues, Tyr25, Asn103 and Tyr127, are involved in H-bonding with the imidazole ring and amino group of Hsp. Also, Met129 and Pro260 provide van der Waals interactions stabilizing the substrate in the binding pocket. The phosphate moiety of Hsp is held in place by Arg337, Arg346 and Asn176. To determine the molecular basis of the substrate specificities of mHspAT and mArAT, we compared their active site residues. The chemical environment of their active site pockets, particularly the substrate binding regions, differs markedly (Fig. 7a). mHspAT possesses a bowl-shaped hydrophilic pocket which is lined by Tyr25, Asn103, Tyr127, Met129, Tyr67* and Tyr261* (* indicates a residue protruding from the adjacent protomer). Such a hydrophilic pocket provides a favourable niche for the binding of Hsp and similar scaffolds such as the morpholine ring of MES (Fig. 7b).
In contrast, in mArAT, residues Val86, Phe110, Leu112 and Phe246* form a hydrophobic substrate binding pocket (Fig. 7c) which provides an energetically favourable environment for the specific binding of hydrophobic substrates such as Phe, Trp and the less polar Tyr. Notably, such a hydrophobic pocket would be unfavourable for the binding of a polar substrate such as Hsp, corroborating the biochemical result that mArAT shows no HspAT activity. These findings suggest that the nature of the residues lining the substrate binding pockets of mHspAT and mArAT is the primary determinant of their distinct substrate specificities.
Furthermore, analysis of the amino acid composition of their N-terminal lids also shows distinct patterns. Like the substrate binding pocket, the N-terminal lid of mHspAT comprises more hydrophilic amino acids, including four Arg, one Lys and three Val. Comparatively, the N-terminal lid of mArAT is hydrophobic, having two Arg, two Lys, four Val and one Ile.
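The contrast in hydropathicity between the two pockets and lids can be illustrated numerically. The sketch below averages Kyte-Doolittle hydropathy values over the residues named in the text. The choice of the Kyte-Doolittle scale is an assumption made for illustration and is not stated to be the measure used in this study, and the lid lists contain only the residue counts quoted above, not the full lid sequences. The function name and variable names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the residue types named in the text
# (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "Arg": -4.5, "Lys": -3.9, "Asn": -3.5, "Tyr": -1.3,
    "Met": 1.9, "Phe": 2.8, "Leu": 3.8, "Val": 4.2, "Ile": 4.5,
}


def mean_hydropathy(residues):
    """Average Kyte-Doolittle value over a list of three-letter residue names."""
    if not residues:
        raise ValueError("residue list is empty")
    return sum(KYTE_DOOLITTLE[name] for name in residues) / len(residues)


# Pocket residues as listed in the text.
# mHspAT: Tyr25, Asn103, Tyr127, Met129, Tyr67*, Tyr261*
mhspat_pocket = ["Tyr", "Asn", "Tyr", "Met", "Tyr", "Tyr"]
# mArAT: Val86, Phe110, Leu112, Phe246*
marat_pocket = ["Val", "Phe", "Leu", "Phe"]

# Lid residue counts as quoted in the text (partial composition only).
mhspat_lid = ["Arg"] * 4 + ["Lys"] + ["Val"] * 3
marat_lid = ["Arg"] * 2 + ["Lys"] * 2 + ["Val"] * 4 + ["Ile"]

for label, residues in [
    ("mHspAT pocket", mhspat_pocket),
    ("mArAT pocket", marat_pocket),
    ("mHspAT lid (listed residues)", mhspat_lid),
    ("mArAT lid (listed residues)", marat_lid),
]:
    print(f"{label}: {mean_hydropathy(residues):+.2f}")
```

On this scale the listed mHspAT residues average below zero and the listed mArAT residues above zero, in line with the qualitative description of a hydrophilic versus a hydrophobic environment.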
Structural and sequence similarity of mHspAT and mArAT highlights conservation of active site residues. To explore the conservation of active site residues of Iβ aminotransferases across species, we performed a structure- and sequence-based alignment of mArAT and mHspAT with homologs from a spectrum of organisms including archaea, bacteria, fungi, plants and mammals 6,8,9,18,20-24. The residues that are conserved across homologues include those present at positions corresponding to Gly84, Asn157, Pro158, Asp184, Tyr187, Lys217 (forms internal aldimine with PLP), Arg225, Gly227 and Arg330 (involved in anchoring of the phosphate, sulphate and carboxylate moieties of Hsp, MES and Phe, respectively) in mArAT (Fig. 8a). Most of these residues line the active site pocket. The position corresponding to Val86 of mArAT is occupied by the hydrophilic residues Ser, Thr, Asp and Asn in the other homologous sequences (Fig. 8b). This suggests that the substrate binding pocket of mArAT is relatively more hydrophobic compared to its counterparts in other organisms. However, the position corresponding to the active site Phe110 of mArAT is mostly conserved and is occupied by an aromatic amino acid in other Iβ homologs (Fig. 8b). The nature of the aromatic amino acid at this position nonetheless plays a major role in deciding the ligand specificity of the aminotransferase, as is discussed further.
Scientific Reports | 6:18880 | DOI: 10.1038/srep18880
Inhibition of mHspAT and mArAT. Carboxylic acids are known inhibitors of aminotransferases 25, as they mimic the amino donor and acceptor groups of aminotransferases, i.e., L-glutamate and α-ketoglutarate (α-KG). As seen in one of the mArAT crystal structures, a dicarboxylic acid, succinate, was bound in the active site pocket. Thus, the first choice of molecules for inhibition studies were the dicarboxylate molecules succinic acid and maleic acid, the latter being a well-studied inhibitor of aminotransferases. However, both dicarboxylates failed to inhibit mHspAT and mArAT.
Morpholine-ring compounds such as MES and 4-morpholine propanesulfonic acid (MOPS) have been shown to inhibit enzymes such as metallo-β-lactamase from Bacteroides fragilis 26. We therefore investigated whether the integrally bound MES in mHspAT has any inhibitory effect on its enzymatic activity and also whether MES is a specific inhibitor of mHspAT. We observed a drastic reduction in the aminotransferase activity of mHspAT for Hsp in the presence of 50 mM MES (Supplementary Fig. S5a). Further exploration also revealed a concentration-dependent inhibition of mHspAT-Hsp activity by MES (Fig. 9). However, appreciable inhibition by MES was not seen for the aminotransferase activity of mArAT for Phe (Supplementary Fig. S5b), thereby suggesting a specific, albeit weak, inhibitory property of MES. Previously suggested carboxylate derivatives as inhibitors against Staphylococcus aureus HspAT 27 targeted only the phosphate binding residues of PLP and Hsp. We report the first inhibitor which specifically interacts with the hydroxyl group of Tyr127, a residue which is involved in amino-group recognition of Hsp. Thus morpholine-ring based inhibitors may differentiate between enzymes having a Phe in the active site, thereby making this class of molecules a more specific and promising inhibitor of HspATs.

Mutating a hydrophilic residue to a hydrophobic one affects the recognition of Hsp by mHspAT.
Taking a cue from the structural and inhibition studies, we probed the essentiality of a hydrophilic environment for the binding of Hsp in mHspAT. For this, we mutated Tyr127 into a hydrophobic residue, Phe, the equivalent residue in most ArATs. The mutant, Y127F, retained only one fifth of the activity of the native enzyme for Hsp, irrespective of the concentration of substrate used. However, no significant loss of activity was observed for Phe, the second preferential substrate of mHspAT (Supplementary Fig. S6), thus supporting the structural and biochemical results that the interaction of a hydrophilic residue with MES/Hsp is important for mHspAT activity.

Discussion
Worldwide efforts to develop new drugs to combat drug-resistant tuberculosis (TB) have been under way for a long time. In spite of these efforts, TB continues to infect the world population, causing between 2 and 3 million deaths every year. With the availability of the Mtb genome sequence in 1998, the rational approach to designing anti-TB inhibitors by targeting proteins essential for Mtb growth and survival in the host macrophages is gaining momentum. Mounting evidence suggests that many enzymes of the amino acid biosynthesis pathways could be important drug targets for the rational design of anti-TB agents 28. Aminotransferases are one such class of enzymes, which are involved in the biosynthesis of a number of metabolites in the cell. The importance of these enzymes is substantiated by the fact that many of them have been targeted for the development of drugs. Examples of human aminotransferases as targets include ornithine aminotransferase for the treatment of hyperammonemias 29, γ-aminobutyric acid aminotransferase as an anti-epileptic drug target 30 and kynurenine aminotransferase for the treatment of cognitive impairment associated with various psychiatric disorders 31,32. Moreover, a recent study shows that the TyrAT of Leishmania infantum is a potential molecular target for the development of anti-leishmanial drugs 33. Thus, the structural and functional characterization of aminotransferases of important infectious organisms opens new avenues for the development of species-specific drugs. Our study on structural and biochemical aspects of two important mycobacterial enzymes, mHspAT and mArAT, is thus relevant for enzyme-specific inhibitor design.
Our functional assays clearly showed the inability of mArAT to use Hsp as a substrate, but it exhibited broad specificity for aromatic amino acids. mHspAT showed high affinity for Hsp and a moderate affinity for the aromatic residues. Crystal structures of mHspAT and mArAT showed an overall structural similarity with each other, and a structure-based sequence alignment of mHspAT and mArAT with homologous members of subfamily Iβ also revealed a largely conserved backbone fold. A closer look at the active site architecture of mHspAT docked with Hsp unveiled the presence of a tetrad (Asn103, Tyr127, Met129 and Tyr261*) forming a hydrophilic cleft into which the imidazole ring and amino group of the Hsp molecule could fit snugly (Fig. 6c). The substrate binding site of mArAT differs starkly in its residue composition. It consists of a hydrophobic pocket with an equivalent tetrad being formed by Val86, Phe110, Leu112 and Phe246*. Out of these four residues, Phe110 is mostly conserved in ArATs across species, whereas it is replaced by a Tyr in HspATs (Tyr127 in mHspAT). The role of the N-terminal lid has remained unexplored as far as the aminotransferases are concerned. On the basis of findings from the present study, we suggest that this lid plays a crucial role in regulating the entry and exit of the substrate. Furthermore, our studies point to the existence of 'open' and 'closed' structures of mHspAT, which are defined largely by the movement of the N-termini.
The serendipitous binding of MES to mHspAT prompted us to explore its inhibitory property, if any, against mHspAT, as this enzyme has been proposed as a potential drug target 34. In addition to being the first report of the inhibitory property of MES for an aminotransferase, the present study also suggests that the binding of the morpholine ring is specific for mHspAT. Therefore, our data lay a foundation to explore MES-like molecules as specific inhibitors of HspATs. Given that amino acids are required in various stages of Mtb growth, survival, and defense 35-37, that many enzymes of amino acid metabolic pathways are potential drug targets 28 and that humans do not synthesize His, mHspAT could be an important target for the design of anti-TB agents. We also compared the closely related human aminotransferase structures to explore their structural similarity to mHspAT. Among the structures available in the PDB, all aminotransferases except human kynurenine aminotransferase II showed a significant variation from mHspAT in terms of the active site composition. Even the closely related hTyrAT showed that its active site is not conducive to the binding of MES (Supplementary Fig. S7). We propose that an MES-based scaffold can serve as a platform for developing more potent Mtb-specific inhibitors which do not target any of the human aminotransferases.
We also report the experimentally determined structure of an mArAT-aromatic amino acid complex. The structural studies on mArAT suggest that the residues lining the substrate binding pocket dictate the preference for aromatic amino acids such as Phe, in addition to the "arginine switch" mechanism proposed in earlier studies 19. The side chains of hydrophobic residues aid in the binding of Phe in mArAT and repel polar substrates such as Hsp.
In a nutshell, the structural and functional characterization of the two Iβ aminotransferases from Mtb augments the current understanding of His and aromatic amino acid metabolism in Mtb and of the differences in their aminotransferase active sites.

Materials and Methods
Enzyme preparation, crystallization and data collection. The details of enzyme preparation, crystallization and preliminary X-ray characterization of both apo mHspAT and mArAT in complex with succinate have been reported previously 38,39. Briefly, rv1600 and rv3772 were cloned in the M. smegmatis/E. coli shuttle expression vector pYUB1062 and over-expressed in M. smegmatis strain mc²4517. The proteins were purified to homogeneity by Ni-NTA affinity and gel filtration chromatography. The apo form of recombinant mHspAT was crystallized in PEG MME 2,000, whereas its PLP complex was prepared by co-purification with PLP (50 μM) in the purification buffer and crystallized in a condition containing MES monohydrate (0.1 M) pH 6.5, ammonium sulphate (0.2 M) and PEG monomethyl ether (MME) 5,000 (30%). mArAT was crystallized in PEG MME 5,000 and the Phe-mArAT complex was obtained by soaking the crystals for 5 min in Phe (2 mM) solution prepared in the mother liquor. X-ray diffraction data from crystals of various forms of both enzymes were collected using an in-house facility as well as a synchrotron beamline and were processed using HKL2000 40.
Structure solution and refinement. The structures of both mHspAT and mArAT were solved by molecular replacement using the program PHASER 41 of the CCP4 suite 42. The structure of a homologous aminotransferase from C. glutamicum (PDB ID: 3CQ5), which shares 59% sequence identity with mHspAT, was used as the search model to solve the structure of mHspAT. The structure of mArAT was solved using the crystal structure of its Listeria innocua counterpart (PDB ID: 3FFH), with which it shares 29% sequence identity. Both structures were refined in a similar manner using the program REFMAC5 of CCP4. To start with, the model was subjected to 50 cycles of rigid body refinement. Subsequently, 100 cycles of restrained coordinate refinement were carried out using a maximum likelihood target function. At this stage, the Mtb-specific amino acids were incorporated/substituted into the electron density using the model-building program COOT 43. After every round of model building, positional and isotropic B-factor refinements were carried out. Water molecules were incorporated in the model based on the peak heights ($2|F_o| - |F_c|$ at 1σ and $|F_o| - |F_c|$ at 3σ contour level) in the electron density maps. In the active site of mArAT, indigenously bound PMP and succinate were modelled based on the Fourier electron density maps. The mHspAT-PLP-MES complex structure was determined using the same template used for apo mHspAT structure determination. The refined coordinates of the mArAT-Suc complex were used as the template for determining the Phe-bound mArAT structure. The ligand molecules were incorporated into their respective positions on the basis of the difference electron density map ($|F_o| - |F_c|$). Subsequently, the complex structures were refined in a manner similar to that employed for the apo structures. The data collection, data processing, and refinement statistics are tabulated in Table 2. The stereochemical acceptability of the structures was validated using the program PROCHECK 44.
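The refinement statistics in Table 2 rest on the crystallographic R-factor defined in its footnote as the summed absolute difference between observed and calculated structure-factor amplitudes divided by the summed observed amplitudes, with R free computed on a 5% test set. The sketch below implements that ratio directly. It is an illustration only: refinement programs also apply scaling between observed and calculated amplitudes, which is omitted here, and the function names, flags and amplitudes are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over paired reflections."""
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty amplitude lists of equal length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, is_free):
    """Split reflections by a test-set flag and return (R_work, R_free)."""
    work_obs = [fo for fo, free in zip(f_obs, is_free) if not free]
    work_calc = [fc for fc, free in zip(f_calc, is_free) if not free]
    free_obs = [fo for fo, free in zip(f_obs, is_free) if free]
    free_calc = [fc for fc, free in zip(f_calc, is_free) if free]
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)


# Hypothetical amplitudes; the last reflection is flagged as test-set data.
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 25.0, 30.0]
is_free = [False, False, False, True]

r_work, r_free = r_work_and_r_free(f_obs, f_calc, is_free)
print(round(r_work, 4), round(r_free, 4))
```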
Table 2. Data collection and refinement statistics. a $R_{\mathrm{sym}}(I) = \sum_{hkl}\sum_{i} |I_i(hkl) - \langle I(hkl)\rangle| / \sum_{hkl}\sum_{i} I_i(hkl)$ for n independent reflections and i observations of a given reflection. $\langle I(hkl)\rangle$ is the average intensity of the i observations. b CC* 52 was calculated using PHENIX 53. c $R_{\mathrm{work}}$ and $R_{\mathrm{free}} = \sum_{h} ||F(h)_o| - |F(h)_c|| / \sum_{h} |F(h)_o|$, where $F(h)_o$ and $F(h)_c$ are the observed and calculated structure-factor amplitudes, respectively. $R_{\mathrm{free}}$ was calculated using 5% of data. Values in the parentheses are for the highest resolution range.
Enzyme kinetics. The aminotransferase activity of mHspAT was determined using a two-step assay which involves glutamate dehydrogenase (GDH) 45 (Supplementary Fig. S1). The assay is based on the transamination of the substrate Hsp or other amino acids in the presence of α-KG, resulting in the α-elimination of the amino group. The rate of formation of the 2-oxo acid was monitored spectrophotometrically. The final reaction mixture contained triethanolamine buffer (200 mM, pH 8.4), PLP (0.02 mM), α-KG (2 mM), GDH (5 units), NAD (1 mM) and the substrate in varying concentrations. The enzymatic activity was measured at 37 °C by monitoring the reduction of NAD at 340 nm. All reactions were performed in triplicate. Control experiments lacking enzyme or substrate were taken as an estimate of the basal level of detection. Glu was excluded from the analysis as it is a by-product of the reaction. Met showed a significant amount of absorbance even in the control reaction without the enzyme and hence was excluded from the analysis. The final activities of the enzymes were calculated using a molar extinction coefficient of 6220 M⁻¹ cm⁻¹ for NADH at 340 nm. For inhibition studies, mHspAT and mArAT activities were measured in the presence of MES (0-250 mM) with the substrates Hsp (1 mM) and Phe (2 mM), respectively.
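The conversion from an absorbance change at 340 nm to a reaction rate follows the Beer-Lambert law with the stated extinction coefficient of 6220 M⁻¹ cm⁻¹ for NADH. The sketch below shows that arithmetic, together with the residual activity used to express inhibition. The 1 cm path length, the subtraction of a no-enzyme control slope, the function names and the example numbers are all assumptions made for illustration and are not taken from the study.

```python
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1, molar extinction coefficient of NADH at 340 nm


def nadh_rate_um_per_min(delta_a340_per_min, path_length_cm=1.0):
    """Rate of NADH formation in µM/min from the slope of A340 versus time.

    Beer-Lambert: A = epsilon * c * l, so dc/dt = (dA/dt) / (epsilon * l).
    The 1 cm default path length is an assumption.
    """
    if path_length_cm <= 0:
        raise ValueError("path length must be positive")
    molar_per_min = delta_a340_per_min / (EPSILON_NADH_340 * path_length_cm)
    return molar_per_min * 1e6  # convert M/min to µM/min


def residual_activity_percent(rate_with_inhibitor, rate_without_inhibitor):
    """Activity remaining in the presence of inhibitor, as a percentage."""
    if rate_without_inhibitor <= 0:
        raise ValueError("uninhibited rate must be positive")
    return 100.0 * rate_with_inhibitor / rate_without_inhibitor


# Hypothetical slopes (absorbance units per minute).
sample_slope = 0.0722
control_slope = 0.0100  # reaction lacking enzyme or substrate
net_rate = nadh_rate_um_per_min(sample_slope - control_slope)

print(round(net_rate, 3), "µM NADH per min")
print(residual_activity_percent(2.0, 10.0), "% activity remaining")
```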