Structural analysis of mycobacterial homoserine transacetylases central to methionine biosynthesis reveals druggable active site

Mycobacterium tuberculosis is the cause of the world’s most deadly infectious disease. Efforts are underway to target the methionine biosynthesis pathway, as it is not part of the host metabolism. The homoserine transacetylase MetX converts l-homoserine to O-acetyl-l-homoserine at the committed step of this pathway. In order to facilitate structure-based drug design, we determined the high-resolution crystal structures of three MetX proteins, including M. tuberculosis (MtMetX), Mycolicibacterium abscessus (MaMetX), and Mycolicibacterium hassiacum (MhMetX). A comparison of homoserine transacetylases from other bacterial and fungal species reveals a high degree of structural conservation amongst the enzymes. Utilizing homologous structures with bound cofactors, we analyzed the potential ligandability of MetX. The deep active-site tunnel surrounding the catalytic serine yielded many consensus clusters during mapping, suggesting that MtMetX is highly druggable.

biosynthesis of methionine and threonine (Fig. 1) 6 . It catalyzes the conversion of l-homoserine to O-acetyl-l-homoserine (OAHS) by transfer of an acetyl group from acetyl-CoA to the γ-hydroxyl of homoserine 7 . OAHS is an essential precursor to methionine as well as other bacterial metabolites, such as S-adenosyl-l-methionine (SAM). The modification of l-homoserine is a committed step in methionine and SAM synthesis.
Bacterial studies have shown that deletion of the met2 gene, which encodes an analogous HTA, is lethal without methionine supplementation 8,9 . Further studies using immunocompetent and immunocompromised mice demonstrate that deletion of metX generates auxotrophic mutants unable to establish infection 10 . If starved of threonine and methionine in vitro, Mtb ΔmetX dies quickly. Furthermore, the ΔmetX mutant strain is unable to proliferate inside human macrophages. It has also recently been demonstrated that metX is required for maintaining bacterial survival during chronic Mtb infection 11 . Together, these data suggest that Mtb is unable to scavenge biosynthetic intermediates from the host for methionine synthesis, making this a uniquely exploitable vulnerability for the development of antibacterial agents 12 .
Efforts are already underway to inhibit HTA in Cryptococcus neoformans for use as an antifungal agent 8 . Targeting of a similar pathway for aspartate production has already yielded some promising selective inhibitors for Streptococcus pneumoniae and Vibrio cholerae 13 . A similar structurally guided approach to discovering specific inhibitors is an attractive alternative to traditional antibiotic killing of MDR-TB and XDR-TB.
Here we report the structures of three homologous MetX enzymes from Mycobacterium tuberculosis (MtMetX), Mycolicibacterium abscessus (MaMetX), and Mycolicibacterium hassiacum (MhMetX) and compare them to previously solved structures of HTAs in order to begin development of selective inhibitors of MtMetX via a structure-based approach. Using the MtMetX structure as a guide, we also elucidate the druggability of the enzyme and propose that it is an excellent candidate for small molecule drug discovery.

Results
Overview of mycobacterial MetX structures. The three solved MetX structures include residues 15-70 and 77-372 from MhMetX (Fig. 2A), 10-379 from MaMetX (Fig. 2B), and 7-372 from MtMetX (Fig. 2C). Two copies of each monomer exist in the asymmetric unit of all three structures. MetX can be divided into two distinct structural domains: the catalytic domain and the lid domain.
The organization of the catalytic domain's fold marks MetX as a member of the α/β-hydrolase superfamily, a highly diverse family that includes proteases, lipases, and esterases, among many others [14][15][16] . A canonical 8-stranded β-sheet fold with twisted, parallel topology forms the core of α/β-hydrolases 17 . Several α-helices flank (Fig. 2F). The catalytic domain contains the active site tunnel with its canonical catalytic triad.
Assembly occurs at an anti-parallel four-helix bundle motif (αL1 and αL3) in the lid domain. The total interface area of the dimer is ~1700 Å² 18 . Two additional helices (αL4 and αL5) help to strengthen the interaction with hydrogen bonds and van der Waals contacts. Like other previously studied HTAs, MetX forms solution dimers at this interface. These dimers have been shown to be physiologically relevant in other HTAs and are likely important for MetX as well 19 . The lid domain comprises residues 184-285 of both MhMetX and MtMetX, and residues 187-292 of MaMetX, between β8 and α5 (Fig. 3A).
The space between the catalytic and lid domains forms a deep active-site tunnel. At the end of this tunnel sits the nucleophilic serine residue. The tunnel is lined with polar residues highly conserved among other known HTA structures. Thr61/61/64, Arg227/227/230, Tyr234/234/237, and Asp351/351/358 (for MhMetX, MtMetX, and MaMetX, respectively) all surround the active site to help facilitate the binding of acetyl-CoA and homoserine (Fig. 3B) 20 .
Catalytic mechanism. The catalytic triad of nucleophile-His-acid is the α/β-hydrolase family's most conserved feature. Just as in other known HTA structures, MtHTA, MhHTA, and MaHTA contain a serine, aspartic acid, and histidine in the active site. In HTAs, these residues are a serine between β7 and α3, an aspartic acid on the loop between β9 and α6, and a histidine on α7. For MtHTA and MhHTA, Ser157, Asp320, and His350 comprise the active site; MaHTA's triad comprises Ser160, Asp327, and His357 (Fig. 4A). The catalytic serine sits at the end of a deep catalytic tunnel (Fig. 4B).
Studies in H. influenzae and Schizosaccharomyces pombe suggest that the mechanism of HTAs is based on a "ping-pong" reaction 21 . In the proposed mechanism, acetate is first transferred to serine from acetyl-CoA in the
www.nature.com/scientificreports
hydrogen-bonding distance for activation of the serine (Fig. 4). The active sites of LiHTA and SaHTA show the most deviation. In the structure, His344 of LiHTA exists in two different conformations with equivalent occupancy 22 . In the SaHTA structure, His296 is more disordered with a high B-factor (B = 72 Å²) for its imidazole ring 23 .
Structural comparisons to other HTAs. The overall fold is similar to previously solved HTAs, specifically MsHTA (PDB entry 6IOG) 24 , HiHTA (PDB entry 2B61) 25 , SaHTA (PDB entry 4QLO) 23 , and LiHTA (PDB entry 2PL5) 22 . The three solved MetXs are most similar in length to MsHTA (374 residues), with LiHTA (366 residues) and SaHTA (322 residues) being smaller proteins than MetX and HiHTA (377 residues) being slightly longer. The significant differences lie in the loop lengths and a few short secondary structural elements. The most notable secondary structure difference occurs in the length of β1 and β2, which are both extended in the MtMetX, MhMetX, MaMetX, and MsHTA structures when compared to the other related HTA structures. In LiHTA, HiHTA, and SaHTA, these strands are both subdivided by loops, whereas they are continuous in MtMetX, MhMetX, MaMetX, and MsHTA. The extended loop between β3 and β4 shows differences in secondary structure content, length, and orientation. Both MhMetX and HiHTA contain a short sequence with helical propensity adjacent to β4. MaHTA and LiHTA have no secondary structure as assigned by DSSP in this same region 26,27 , while SaHTA is unique in substituting a longer helix for β4 and β5.
As the lid domains are the least conserved among the six HTA structures and within the α/β-hydrolase family as a whole, it is not surprising that this region has some of the most substantial structural deviations. The loop between αL4 and αL5 appears unique in each structure. MtMetX and MsHTA both contain a small helical region (αL1′); MaHTA features a unique pair of short anti-parallel sheets (β1′, β2′). HiHTA merely contains an unstructured loop, while SaHTA omits the majority of the residues entirely, with only a short linker between αL4 and αL5. LiHTA is perhaps the most variant, as its loop affects the length and orientation of αL5. Due to the high degree of variability, this loop is likely not critical for MetX's function or assembly.
The polar residues that line the active site tunnel are conserved between variants of HTA. Additionally, the motifs surrounding each are nearly identical. Notably, Thr61 (substituted for Ser in LiHTA) is in the middle of a HALTGD motif, and Asp351 sits next to the conserved region, adjacent to the catalytic His, in a GHD(G/A)FL motif. While the lid domain shows much less structural conservation globally, the stretch of highest convergence appears in αL3, which contains both Arg227 and Tyr234 and directly forms the other side of the catalytic tunnel. αL1, the other partner in the four-helix bundle with αL3, also shows a fair degree of sequence conservation. Interestingly, SaHTA stands as an outlier when comparing tunnel dimensions, being much narrower and more restricted when compared to the other four HTA structures.
Alignment of MtMetX, MaMetX, and MhHTA structures using the FATCAT algorithm 3 demonstrates their high degree of structural similarity (Table 1). Across the examined monomers, all are significantly similar to one another (P < 0.05). Structural alignment RMSDs ranged from 0.52 to 3.04 Å, and sequence similarity was in the range of 38.2-88.5%. MhMetX, MaMetX, and MtMetX ranked in the list of closest structural neighbors currently available on the PDB when applying FATCAT to all related structures. Interestingly, MsHTA is as good if not better an approximation to MtMetX as either MhMetX or MaMetX, suggesting that it may function as a good analog in assays where MtMetX cannot be directly utilized. However, when assaying MtMetX to determine its viability for crystallography, the Tm was found to be between 40 and 41 °C by differential scanning fluorimetry (DSF) between pH 6.9 and 8.5 in Tris-HCl. Because of MtMetX's relative thermal stability under physiological conditions, we have chosen to focus on MtMetX when investigating future drug development.
Potential ligandability of MetX. Druggability of a protein is understood to be a measure of the relative ease of developing a small molecule that will effectively modulate its activity in vivo 28,29 . Druggability depends on the pharmacodynamics and pharmacokinetics of the host and the pathogen, making it difficult to predict computationally. Ligandability is a necessary condition for druggability, but is a more easily quantified metric for the development of inhibitors 30 . Understanding the ligandability is a critical first step before embarking on drug discovery. An estimated 60% of small-molecule searches fail due to the target site not being sufficiently druggable, which is positively correlated with not being sufficiently ligandable 31,32 . The availability of high-resolution structures for MetX opens up the possibility of performing direct therapeutic target discovery, provided it proves to be sufficiently ligandable.
The fundamental principle of drug discovery is that biologically active ligands are complementary in molecular features and shape to the receptor. These features can include physicochemical properties such as hydrophobicity and size, as well as the enclosure of the binding pocket and its promiscuity 33,34 . Although a predominantly hydrophobic pocket might indicate a promiscuous binding site that can accommodate a wide variety of ligands in different modes, some hydrophobic patches are nevertheless ideal for ligand design. All three solved MetX structures have a similar, conserved hydrophobic patch (Fig. 5A-C) running along the inside of the active site tunnel. Binding models for homoserine (Fig. 5D) and acetyl-CoA (Fig. 5E) were created by hybridizing the MtMetX structures with available MsHTA structures that have been co-crystallized with both substrates 24 in order to better understand their coordination within the binding pocket. While the hydrophobic patch does not appear critical to the recognition of either substrate, its proximity to the catalytic site residues makes it an ideal feature to leverage for designing hydrophobic ringed small molecules with flexible tail groups that could orient inside the cleft similarly to acetyl-CoA.
In order to further understand the active site's ligandability, FTMap was utilized [35][36][37] . The FTMap algorithm uses a set of small organic probes to computationally sample a protein surface for binding hotspots. Each organic probe is first rigidly docked before being energy minimized. Areas on the protein's surface where multiple probes bind are termed clusters. Binning these clusters based on their members' average free energy yields a consensus site (CS), a location on the protein's surface where small molecules are likely to bind. A CS strength (S) is defined as the number of probe clusters within the consensus cluster. A cluster of S > 16 represents a site targetable by a ligand 36,37 . A second CS should be located somewhere within 8 Å of the primary cluster. Of the eleven CS identified by FTMap for MtMetX, eight reside somewhere within the active site tunnel (Fig. 6). The CS closest to the catalytic Ser have S = 19, S = 16 and S = 13. One CS with S = 13 forms across from the active site on the lid domain, but no other high-strength CS exists near it, making it an unlikely site for drug binding. MaMetX shows a similar pattern, with the top clusters appearing nearly on top of those of MtMetX with S = 26, S = 13 and S = 10. These results, when overlaid with the hybrid substrate-binding models, suggest that MetX is ligandable.
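The two criteria stated above (a primary CS with S > 16 and at least one further CS within 8 Å of it) can be expressed as a short check. The following Python sketch is illustrative only and is not part of FTMap: the `ConsensusSite` class, the `assess_hot_spot` function, and all coordinates are hypothetical, and only the strengths 19, 16 and 13 are taken from the MtMetX results reported here.

```python
import math
from dataclasses import dataclass


@dataclass
class ConsensusSite:
    label: str
    strength: int   # S: number of probe clusters in the consensus site
    center: tuple   # (x, y, z) in Å; hypothetical values for illustration


def assess_hot_spot(sites, strength_cutoff=16, max_distance=8.0):
    """Return (primary, neighbors) if the strongest site has S above the
    cutoff and at least one other site lies within max_distance of it;
    otherwise return None."""
    if not sites:
        return None
    primary = max(sites, key=lambda s: s.strength)
    if primary.strength <= strength_cutoff:
        return None
    neighbors = [
        s for s in sites
        if s is not primary
        and math.dist(s.center, primary.center) <= max_distance
    ]
    if not neighbors:
        return None
    return primary, neighbors


# Strengths follow the text; the positions are invented placeholders.
sites = [
    ConsensusSite("CS1", 19, (0.0, 0.0, 0.0)),
    ConsensusSite("CS2", 16, (4.0, 3.0, 0.0)),     # 5 Å from CS1
    ConsensusSite("CS3", 13, (6.0, 0.0, 0.0)),     # 6 Å from CS1
    ConsensusSite("CS_lid", 13, (20.0, 0.0, 0.0)), # isolated site on the lid
]
result = assess_hot_spot(sites)
```

Under these assumed positions, the tunnel sites satisfy both criteria, while an isolated S = 13 site evaluated on its own would not.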

Discussion
All three MetX structures show a high degree of overall similarity to previously studied HTAs from both bacteria and fungi. Differences in enzyme size are accounted for by the length of loop regions. Differences in secondary structural elements arise primarily from variations within these loops. All three solved structures crystallized as dimers at the expected interface and in the orientation that corresponds to the predicted physiologically active assembly. Beyond the shared overall domain and fold arrangement, the conserved position of the Ser-His-Asp catalytic triad within the active site tunnel may help protect the enzyme from covalent modification and deactivation. Engineered β-lactones with hydrophobic tails have already been shown to inhibit the activity of HiHTA in vitro through the formation of adducts. There may be therapeutic value in modifying their structure to enhance specificity towards Mtb 38 . However, their lack of in vivo inhibition of HiHTA suggests that a more efficient method for disrupting methionine biosynthesis lies in small molecule inhibitors. These would be less prone to bacterial inactivation and less prone to exclusion by Mtb's complex lipid cell wall. The high degree of structural similarity with previously solved homologs provides an excellent foundation for in silico compound screening and structure-driven drug design methodologies.
The FTMap cluster data also support a fragment-based drug discovery (FBDD) approach. Previous research has shown that promising core fragments typically bind in the highest-strength CS 39 . While the fragment molecules are too small on their own to have a useful affinity, neighboring CS probe structures can then be linked to the core fragment to build up a high-affinity ligand. Furthermore, virtually linked fragments could be screened in silico against existing chemical homologs using a tool such as ROCS 40 .
M. abscessus MetX was initially chosen for study due to its high similarity to the Mtb variant. However, the structure and CS data argue that it might make an excellent secondary target for drug development. Many compounds that inhibit MtMetX are also likely to affect MaMetX. M. abscessus is an emerging public-health threat, primarily implicated in pulmonary infections 41 . Cross-species gene transfer has helped to create multidrug-resistant strains, some even showing resistance to TB drugs such as rifampin 42,43 .
In summary, by reporting the first medically relevant MtMetX and MaMetX crystal structures, we hope that new avenues of structure-based drug design will be open for developing targeted and effective therapeutics.
pCDF-NT is a modified pCDF-Duet1 plasmid (Novagen) encoding a His6 tag followed by a tobacco etch virus (TEV) protease cleavage site. The pCDF-NT:His6-MetX plasmids were transformed into Escherichia coli Rosetta (DE3) competent cells (Novagen). Constructs were grown in LB to an OD600 of 0.6 at 37 °C in the presence of streptomycin and chloramphenicol. Cultures were then cooled in an ice bath to 18 °C before the addition of 200 µM isopropyl β-D-thiogalactopyranoside (IPTG) and 2% v/v ethanol. After inducing overnight for 16 hours, cultures were centrifuged at 5,000 rpm and lysed by two passes through a Microfluidizer (Microfluidics). Cell debris was removed via centrifugation at 18,000 rpm for 1 hour at 4 °C. Protein was purified by passage over a Ni-affinity column containing His-Trap chelating resin from GE Healthcare Life Sciences. The column was washed with a buffer containing 20 mM Tris pH 8.0, 300 mM NaCl, and 10 mM imidazole. The protein was then eluted using the same buffer with 250 mM imidazole. The elution fraction was dialyzed overnight at 4 °C into 20 mM Tris pH 8.0, 150 mM NaCl alongside TEV protease to release the His6 tag. Passage of the protein back over the same His-Trap column removed the TEV protease and the cleaved tag before a polishing pass was performed over a Superdex 200 gel filtration column (GE Healthcare Life Sciences) in 20 mM Tris pH 7.5, 100 mM NaCl buffer.
Data and structural determination. Crystals of each MetX construct were flash-frozen in liquid nitrogen after cryoprotection by transfer into the corresponding crystallization solutions supplemented with 20-25% glycerol. Data from MhMetX and MaMetX crystals were collected at the Stanford Synchrotron Radiation Lightsource beamline 9-2 using a Dectris Pilatus 6M detector at 1.000 Å wavelength. Data from the MtMetX crystal were collected at the SER-CAT beamline (22-ID) at the Advanced Photon Source using an Eiger 16M detector at 1.000 Å wavelength.
Data were indexed, integrated, and scaled using XDS and XSCALE 44 .

Crystallization of
The structure of MhMetX was determined by molecular replacement using Phaser 45 search model. Model building was performed using Coot 46 ; iterative refinement was done via phenix.refine 47,48 . Data and refinement statistics are summarized in Table 2.
Differential scanning fluorimetry (DSF). The assay was carried out in a range of buffers containing 50 mM Tris-HCl pH 6.9-8.5 and 100-300 mM NaCl, with a final MtMetX concentration of 0.2 mg/mL. 20 µL of protein was loaded with a 2x final concentration of SYPRO Orange dye (Thermo Fisher) and run on a CFX96 Touch qPCR system (Bio-Rad). A linear thermal ramp of 1 °C/min from 20 °C to 90 °C was run with an excitation wavelength of 512-535 nm and a detection wavelength of 560-580 nm. Tm was calculated as the minimum of the first derivative plot of the unfolding transition in the CFX Maestro software.
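The Tm readout described above can be illustrated with a minimal Python sketch. It assumes that the first derivative plot is the negative derivative of fluorescence with respect to temperature, -dF/dT, so that its minimum marks the steepest rise in signal; this is an assumption about the plot convention, not a description of the CFX Maestro implementation. The `melting_temperature` function and the synthetic two-state curve (midpoint set to 40.5 °C purely for illustration) are hypothetical.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the temperature at the minimum of -dF/dT, estimated with
    central differences over the interior points of the melt curve."""
    best_temp, best_value = None, float("inf")
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (
            temps[i + 1] - temps[i - 1]
        )
        neg_derivative = -slope
        if neg_derivative < best_value:
            best_temp, best_value = temps[i], neg_derivative
    return best_temp


# Synthetic sigmoidal unfolding curve sampled every 0.5 °C from 20 to 90 °C.
# The midpoint (40.5 °C) and width (1.5 °C) are illustrative placeholders.
temps = [20 + 0.5 * i for i in range(141)]
signal = [1000 / (1 + math.exp((40.5 - t) / 1.5)) for t in temps]

tm = melting_temperature(temps, signal)
print(f"Estimated Tm: {tm:.1f} °C")
```

Real melt curves are noisy and typically show a post-transition decrease in fluorescence, so smoothing and restricting the search to the transition region would likely be needed in practice.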
Structural comparison and fragment-based hot spot detection. The Flexible structure AlignmenT by Chaining Aligned fragment pairs allowing Twists (FATCAT) server (http://fatcat.burnham.org) was used to compare the different available HTA structures. All pairwise alignments were done using the flexible alignment model with chain A. The database search for close homologs was performed with a P-value cutoff of 0.05. The FTMap server (http://ftmap.bu.edu) was used to map chain A of the MtMetX structure and assay ligandability. Cluster strength was determined by the number of probes in each consensus site. The same run was also performed on the FTFlex server (https://ftflex.bu.edu) to assay whether or not sidechain flexibility would significantly alter the results, but observed trends in cluster locations were not significantly different between the two servers.
The MtMetX substrate models were created using MsHTA structures that have been solved in the presence of acetyl-CoA (PDB entries 6IOH and 6IOI) 24 . Monomers from each structure were positioned by rigid-body alignment using Chimera's MatchMaker 49 . Energy minimization on the hybrid structure was then performed using the Molecular Modeling Toolkit and Dock Prep with the AMBER ff14SB forcefield 50 .
Table 2. Data collection and refinement statistics. a All data were collected from single crystals. b Values in parentheses are for the highest-resolution shell. c CC1/2, correlation coefficient between intensities from two random half-data sets.