Analysis of mutations leading to para-aminosalicylic acid resistance in Mycobacterium tuberculosis

Thymidylate synthase A (ThyA) is the key enzyme involved in the folate pathway in Mycobacterium tuberculosis. Mutation of key residues of ThyA enzyme which are involved in interaction with substrate 2′-deoxyuridine-5′-monophosphate (dUMP), cofactor 5,10-methylenetetrahydrofolate (MTHF), and catalytic site have caused para-aminosalicylic acid (PAS) resistance in TB patients. Focusing on R127L, L143P, C146R, L172P, A182P, and V261G mutations, including wild-type, we performed long molecular dynamics (MD) simulations in explicit solvent to investigate the molecular principles underlying PAS resistance due to missense mutations. We found that these mutations lead to (i) extensive changes in the dUMP and MTHF binding sites, (ii) weak interaction of ThyA enzyme with dUMP and MTHF by inducing conformational changes in the structure, (iii) loss of the hydrogen bond and other atomic interactions and (iv) enhanced movement of protein atoms indicated by principal component analysis (PCA). In this study, MD simulations framework has provided considerable insight into mutation induced conformational changes in the ThyA enzyme of Mycobacterium.


5, 10 methylenetetrahydrofolate dUMP dTMP dihydrofolate
(1) − + = + ThyA also requires 5, 10-methylenetetrahydrofolate (MTHF) for dTMP synthesis both as a carbon donor and as a reductant in the methylation reaction 21 . Therefore, dTMP is important for DNA synthesis and loss of function mutations in the thyA gene causes PAS resistance in Mtb 13 .
Understanding of how point mutations induce structural changes in ThyA enzyme will provide opportunities for tackling PAS resistance. Therefore, we selected ThyA mutations (causing PAS resistance) correlated with minimum inhibitory concentration (MIC) values. Comparative analysis of the binding free energy, secondary structure elements, and free energy landscape of the wild-type and mutants was carried out using Molecular Dynamics (MD) simulations. The results of study showed that the wild-type ThyA enzyme formed strong interaction with dUMP (substrate) and MTHF (cofactor) mainly by hydrogen and hydrophobic interactions while, mutations apparently disturb the ThyA protein structure, making it dysfunctional.

Results
The emergence of mutation causing antibiotic resistance in the Mycobacterium necessitates inspecting the molecular mechanism underlying resistance. Extensive MD simulations were carried out to differentiate systematic changes in the wild-type and mutant ThyA enzyme structures and provide substantial insight into the mechanism underlying PAS resistance.
Selection of mutants. The mutants were selected based on MIC value ≥128 µg/ml 14 . Mutated residues were reported to be involved in interaction with substrate dUMP (R127L), catalytic site (C146R), MTHF cofactor (L143P, L172P and V261G), and completely buried in the hydrophobic core (A182P) of the wild-type ThyA enzyme 14 , as mentioned in Table 1. The selected mutations were also reported to destabilize the structure based on their positive folding free energy (∆∆G (kcal/mol) 14 (Table 1). Therefore, structural transitions due to point substitution in ThyA enzyme was discussed specifically using atomistic studies.
Molecular dynamics (MD) simulation analysis. All systems were simulated for 50 ns that helped in achieving much better conformational coverage for the wild-type and mutants. The homo-2-mer structure of the ThyA enzyme has 4 polypeptide chains, A and B chains with 260 amino acid residues and C and D with 261 residues. We selected dimer A-C for the present study, and chain A of the dimer showed interaction with the dUMP substrate and MTHF cofactor, respectively. MD simulation trajectories representing all-atomic coordinates as a function of time were analyzed for all seven systems: wild-type, R127L, L143P, C146R, L172P, A182P, and V261G. Furthermore, the average root-mean-square deviations (RMSDs) for wild-type, R127L, L143P, C146R, L172P,  Fig. 1(a,b)).
The RMSD values for wild-type and mutants were <3 Å, which is acceptable, implying that the systems were converged 22 . However, higher RMSD values were observed for R127L, L143P, and A182P complexes than the wild-type, which represented large conformational change during the simulation run. R127L and L143P were unstable at the beginning and attained stability between 35 and 45 ns. However, the A182P complex remained stable up to 25 ns, and then become highly unstable until the end of the simulations. The root mean square fluctuation (RMSF) plot explains flexibility of each amino acid residue providing insight into flexible and rigid regions of the protein. The RMSF plot of the wild-type and mutants for chain A and C are shown in Fig. 1(c-f). The average RMSF values for the wild-type for chain A and C were 0.88 Å and 1.04 Å respectively. Furthermore, the average RMSF for mutants ranged from 0.81-1.07 Å and 0.80-0.92 Å for chain A and C, respectively ( Table 2). The highest fluctuation peaks (>3 Å) for wild-type and R127L were located from 15-25 residue position in chain A. In chain C, higher RMSF values (>3 Å) were shown by wild-type, R127L, C146R, and V261G complexes, in residues ranging from 15-25 respectively. From the RMSF plot and values, we observed that   mutation positions in the respective mutants showed lower flexibility (relative to wild-type), which may cause improper binding of the substrate and cofactor, which is required for proper enzymatic activity (Table S1). The radius of gyration (Rg) was computed for all systems which reflected a change in global protein compactness due to point mutation 23 . The Rg was computed using following equation where r is the position of atom i, and r i − r cm is the distance between atom i and the center of mass of the molecule. The average Rg values for the wild-type, R127L, L143P, C146R, L172P, A182P, and V261G were 18.33, 18.34, 18.58, 18.25, 18.40, 18.52, and 18.38 Å, respectively. All mutants showed greater Rg values than the wild-type, with the exception of C146R, suggesting tight packing of the secondary structure (α, β, α + β, α/β) into the three-dimensional structure of wild-type ThyA enzyme (Table 2; Fig. 1(g,h)).
Determination of the residue-wise solvent-accessible surface area (SASA) reflected the change in the surface organization of the mutants compared with the wild-type. In R127L, leucine substituted natively placed arginine which may decrease protein stability due to leucine's aversion to water and a reduced SASA value from 372.07 (wild-type) to 348.26 (R127L) was observed. Substitution of the non-polar group (leucine, alanine and valine) by proline (hydrophobic side chain) and glycine (without side chain) led to slight decreases in the SASA values in L143P, L172P, A182P, and V261G mutants compared with the wild-type. Furthermore, a change from the polar uncharged group to the polar charged group increased the SASA value at the 146 position from 259.44 (wild-type) to 369.41 (C146R) (Fig. 2(a-d)).
To estimate the contribution of hydrogen bonds (H-bonds) to ThyA enzyme complex stability, the total number of H-bonds was computed during MD simulations. A total of five H-bonds were observed between dUMP and ThyA for wild-type, L143P, L172P, and V261G, respectively, as compared with four for the R127L, C146R, and A182P complexes. However, a single H-bond was observed between MTHF and ThyA in wild-type, L172P, and V261G, respectively. No H-bonds were detected in MTHF bound R127L and L143P complexes, while C146R-MTHF complex showed the maximum number of H-bonds during the 50 ns simulation time period (Table 2; Fig. 2(e-h)). www.nature.com/scientificreports www.nature.com/scientificreports/ Binding pocket analysis. Binding pocket analysis was carried out to interpret the variation in the volume and area of the protein with respect to the simulation time series. Initially, the structure conformation (at 0 ns) of the wild-type, R127L, L143P, C146R, L172P, A182P, and V261G mutants remained same. The volume of mutants R127L, L143P, C146R, L172P, and V261G (representative structures) was found to increase when compared with the wild-type, except the A182P mutant ( Table 3).
The drastic increase in the binding pockets could have led to an increase formation of H-bond between dUMP and mutants (L143P, L172P, and V261G) and a reduction for MTHF, which was in agreement with the H-bond analysis with respect to time. Interestingly, the representative structure of the A182P mutant showed an extreme decrease in its binding volume and area compared with its initial structure and wild-type representative structure and, accordingly, showed reduced H-bond occupancy for dUMP and an increase for the MTHF cofactor (Table 3). It was observed that even a single mutation could lead to too many differences in the binding affinity of enzyme substrate and cofactor.
Secondary structure analysis. The secondary structure analysis (helix, β-strand, and coil) provided information about the conformation of each amino acid in the representative structure of ThyA wild-type and mutants. The secondary structure elements (SSEs) are of two types: rigid (α-helices and β-sheet) and flexible (coil and turns). The wild-type was dominated by turns (28.85%) in chain A and α-helices (29.73%) in chain C ( Table 4). The C146R, A182P, and V261G mutants showed increases in coils in chain A and in A182P and V261G in chain C, respectively, as compared with the wild-type. Chain A showed increase in α-helices in R127L, L143P, C146R, L172P, A182P, and V261G mutants, whereas minimal deviation was observed in α-helix SSEs in chain C for all mutants with respect to the wild-type. In case of β-sheet, all mutants showed increases in β-sheet with the  Table 3. Analysis of enzyme binding cavity for ThyA wild-type and mutants.  www.nature.com/scientificreports www.nature.com/scientificreports/ exception of L172P in chain A, and R127L, C146R, and A182P showed increases in β-sheet in chain C compared with the wild-type. Both L143P and L172P mutants showed an increase in turn structure in chain A while all the mutants showed increases in turn in chain C relative to the wild-type. Furthermore, decreases in the percentages of 3 10 -helix SSEs were observed in chain A and C in all mutants compared with the wild-type (Table 4).

Coil (C)% α-Helix (H)% β-Sheet (E)% Turn (T)% 3 10 -Helix (G)%
In the wild-type, residues at positions 127, 143, 146, 172, 182, and 261 located in α-helix in both chain A and C. The R127L, L143P, C146R, and L172P mutants showed transformation into turn compared with the wild-type in both chain A and chain C. The A182P mutant retained its SSE, whereas V261G mutants showed transformation from turn to coil in chain C. It was observed from the results that mutated residue showed a change from a rigid to a flexible secondary structure (Table S2).

Residue network analysis.
A residue network graph of the ThyA enzyme bound with dUMP and MTHF was visualized for analyzing and depicting impact of mutation on interacting residues. Residue 127 formed van der Waals interactions in R127L, C146R, L172P, A182P, and V261G complexes, respectively, and single H-bonds in A182P and V261G complexes (Table 5). Three van der Waals interactions were formed by residue 143 in the wild-type, A182P, and V261G; four in C146R and L172P complexes; and two in R127L and L143P, respectively. Van der Waals Interaction 143-80 was found to be lost in L143P and 143-93 interaction in A182P and V261G complexes. In the wild-type and mutants of ThyA, a single H-bond was formed between 146 and 172 residues. Residue 182 formed a single H-bond in the wild-type and mutant forms of ThyA (except A182P), while two in C146R. In addition, all the complexes were extensively stabilized by the van der Waals forces formed by residue 182. Residue 261 stabilized only the L172P complex through the formation of a single van der Waals interaction (261-211) ( Table 5). In addition, complete list of ThyA residues interacting with the dUMP and MTHF in the interaction network of wild-type and mutants are listed in Table S3.
Electrostatic potential and binding affinity analysis. The estimation of electrostatic properties based on Generalized Born radii provided insight into the effect of point mutation on the ThyA enzyme structure and stability. Figure 3 shows the electrostatic potential surface for the wild-type and mutants. Furthermore, changes in the enzyme stability upon mutation were computed in terms of total energy, where a positive value signifies an unstable structure and a negative value signifies a stable structure. The wild-type exhibited total energy of −33486.94 kcal/mol, whereas the R127L, L143P, C146R, L172P, A182P, and V261G mutants showed total energy ranging from −33245.12 to −33452.05 kcal/mol, respectively, higher than that of the wild-type. The results were in accordance with the previous studies that have reported positive folding free energy (∆∆G (kcal/mol)) ranging from +0.18 to +3.94 for R127L, L143P, C146R, L172P, A182P, and V261G mutations ( Fig. 3; Table 1) 14 .
In addition, binding energies of the ThyA enzyme-substrate and cofactor complexes were calculated before and after MD simulations through the Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) approach. In the pre-MD ThyA-dUMP complexes, the highest binding affinity was displayed in the wild-type complex (−15.37 kcal/mol), followed by the A182P mutant complex (−10.47 kcal/mol), while the lowest binding affinity was observed in the R127L and V261G mutant complexes (Table 6). However, in the post-MD ThyA-dUMP complexes, dUMP showed the strongest binding affinity for the wild-type (−15.11 kcal/mol), followed by the L143P (−8.06 kcal/mol) and C146R (−7.53 kcal/mol) complexes. Conversely, the L172P and V261G mutant complexes showed positive binding free energy, suggesting weak interaction between dUMP and the ThyA enzyme ( Fig. 4; Table 6).  Table 5. Interaction of key residues in ThyA wild-type and mutant complexes. *"VDW": van der Waals, "-": not present.
Consequently, at the point of mutation, the intensity of hydrophobic interactions remained unchanged while the H-bond number reduced drastically, indicating conformational changes in the mutant structures. www.nature.com/scientificreports www.nature.com/scientificreports/ Interaction analysis using MD trajectories for ThyA-MTHF interaction. In the wild-type, MTHF formed three H-bonds with Ile79, Lys49, and Leu172 with occupancies of 33%, 33%, and 31%, along with five hydrophobic interactions. R127L-MTHF formed two H-bonds with Lys49 (48%) and His51 (77%), and Val50, Phe176, and Ala256 residues formed hydrophobic interactions ( Table 8). The L143P complex was stabilized by H-bonds formed by Ile79 and Lys48 (60%) and large numbers of residues, including Lys48, Trp80, Trp83, Phe171, Leu172, Gly173, Phe176, Ala256, Ile257, Lys258, stabilizing the complex by hydrophobic interactions. The interaction between C146R and MTHF was stabilized by H-bonds formed by Lys48 (31%), Gly173 (62%), His51 (84%), Lys49 (56%), Gln165 (36%), Arg146 (54%), Glu58 (50%), Leu172 (43%), and Ile257 (32%), and hydrophobic interactions were contributed by Val50, Ile79, Trp80, Trp83, Phe171, Pro175, and Phe176 residues. Likewise, the L172P mutant complex was stabilized by four residues -Lys48 (51%), Lys49 (99%), Ile79 (40%), and Ile257 (46%) -and eight residues contributed hydrophobic interactions. Amino acid residues involved in the hydrogen and hydrophobic interactions with MTHF in A182P and V261G mutant complexes are listed in Table 8. The interaction analysis revealed that mutation at the 127 and 143 positions enforced large structural changes, leading to shifts in the MTHF away from its binding pocket. principle component and free energy analysis. The understanding of the correlation of atomic movement provides insight into enzyme and substrate interaction, particularly that arising from collective motion of the atoms, which are controlled by the secondary structure element of the proteins 24 . Covariance values can be positive or negative, which provides information about the cooperativity of motion. All diagonal elements of the symmetric 3N × 3N covariance matrix were summed and termed as trace value, which provides information about the measure of total variance (Fig. 5) Table 6. Binding energy analysis in the pre-and post-MD simulation complexes in the ThyA wild-type and mutants. www.nature.com/scientificreports www.nature.com/scientificreports/ modes of motion of a corresponding system 25 . PC analysis was applied to the backbone atoms of the wild-type and mutant ThyA enzymes bound with cofactor and substrate. PC1 must be denoted as the most important as it accounts for maximum variability in terms of internal motion of proteins in the enzyme conformation during binding of substrate and cofactor, while PC2 accounts for the remaining variability. As shown in Fig. 6, it was clearly observed that R127L and A182P mutants of ThyA occupy larger subspaces corresponding to their higher trace values. Percentages of variance against eigenvalues of the covariance matrix resulting from the simulations are shown in Fig. 7. Gibbs free energy landscape (FEL) plot was generated using PC1 and PC2 coordinates in which blue, green, and cyan represented metastable conformation with a low-energy state, while red signified high-energy protein conformation. The ∆G value wild-type, R127L, L143P, C146R, L172P, A182P, and V261G ranged from 11.5 to 15.4 kJ/mol (Fig. 8).   Table 8. Residues participating in the interaction in the ThyA enzyme-MTHF complex in wild-type and mutants. www.nature.com/scientificreports www.nature.com/scientificreports/

Discussion
The increased drug resistance in the Mycobacterium is considered to be a major problem faced in the treatment of TB 26 . The folate pathway is reported to be very important for the growth and survival of the bacteria. In M. tuberculosis, functional ThyA catalyzes the reductive methylation of 2′-deoxyuridine-5′-monophosphate (dUMP) to thymidine-5′-monophosphate (TMP) while utilizing N5, N10-methylene-5, 6, 7, 8-tetrahydrofolate (MTHF) in folate metabolism 16 . To understand the PAS resistance mechanism, five genes involved in folate pathways (thyA, dfrA, folC, folP1, and folP2) and an alternative thymine biosynthesis gene (thyX) were studied in PAS-resistant Mtb clinical isolates and spontaneous mutants, respectively 14 . Among several folate pathway genes in Mtb, thyA was the first gene reported to be associated with PAS resistance. Previous studies have identified mutation in thyA gene in one-third of the PAS resistance clinical isolates and laboratory mutants 14,15 . Moreover, clinical isolates from tuberculosis patients from northern China from 2006 to 2012 showed that, among 208 isolates, 54 (26.0%) showed 24 mutations in the thyA gene 15 . The identified mutations were correlated with antibiotic potency in terms of the MICs value, and a template-based (X-ray structure of the E. coli ThyA) three-dimensional structure of M. tuberculosis ThyA was constructed 14 . Further, mapping of the 24 identified point mutations in the thyA gene onto an M. tuberculosis ThyA homology-derived homodimer structure revealed that these mutations were in the dUMP substrate binding site, catalytic site, and folate cofactor binding site 14 . These point mutations also led to thermodynamic instability of the ThyA enzyme, confirmed with positive folding free energy value and also predicted to produce dysfunctional and unfolded polypeptide 14 .
Several studies have been conducted to investigate the mechanism of drug resistance in M. tuberculosis using structural dynamics and bioinformatics studies. Kumar and Sobhia (2015) demonstrated that inhA mutants reduced the interaction between NADH and binding site residues, causing isoniazid resistance in Mtb 27 . Similarly, various studies reported that mutation in rpoB gene leads to rifampicin resistance in Mtb [28][29][30] . Likewise, the mechanism of pyrazinamide resistance in the pncA gene mutants has been studied using a computational approach 31 .
Clear elucidation of the PAS resistance mechanism and ThyA conformational change is yet to be addressed specifically. Availability of the crystal structure of the Mtb ThyA enzyme provided us the opportunity to gain insight into the mechanism of emergence of drug resistance against PAS. Molecular dynamics simulations  were carried out for the selected mutations with MIC values >128 µg/ml for 50 ns time period. The binding free energy of the wild-type ThyA enzyme bound with substrate and cofactor was lower (more negative) than in the mutant complexes. Accordingly, the wild-type complex representative structure clearly indicates that the Asp167, Asp166, Tyr94, His207 and Asn177 (from chain A) and Arg126 and Arg127 (from chain C) residues from the ThyA enzyme were directly involved in dUMP binding and the Ile79 and Lys49 residues were directly involved in MTHF binding. Furthermore, substantial conformational changes were observed in the L143P and V261G mutants, which would affect substrate binding affinity, which is well supported by RMSD and RMSF analysis and also Fivian-Hughes et al. 16 reported that V261G mutation leads to non-functional ThyA enzyme in Mtb 16 . It is, however, worth mentioning that mutations significantly influenced the dynamic behavior of complexes, which was reflected by major disruption in the interaction network and altered motions in the mutants. Simulation results explained the reason for the strong link between PAS resistance and thyA gene mutations, such as wild-type, R127L, L143P, C146R, L172P, A182P, and V261G. Altogether, the results reported in the study will provide greater understanding of ThyA mutation-associated PAS resistance and will pave a path for facilitating therapeutic strategies against tuberculosis.

Materials and Methods
protein structure preparation and mutant generation. ThyA protein model complex with substrate and cofactor (ThyA-dUMP-MTHF) was generated by superposition of chains A and C from 3QJ7 on the crystal structure of Mtb ThyA (PDB ID: 4FOG) bound with 5-Fluoro-dUMP and 5-Methyl-5,6,7,8-Tetrahydrofolic acid (which corresponds exactly to 5,10-methylenetetrahydrofolate) 15 . In the optimized PDB file, point mutation was inserted at selected positions using Schrödinger's Protein Preparation Wizard 32 .
Atomistic simulation of wild-type and mutants. MD simulations of the wild-type and six mutants were carried out using Desmond software from Schrödinger 33 . The systems were first processed and optimized using Schrödinger's Protein Preparation Wizard 32 . The optimized systems were solvated in a cubic box with dimensions of 15Å × 15Å × 15Å using an SPC water model. Charges on the system were neutralized by adding corresponding counter ions such as Cl − /Na + for the accurate prediction of the physical properties of a realistic system. Furthermore, all systems were relaxed and energy minimized under an NPT ensemble at 300 degrees Kelvin and pressure of 1 bar using a default protocol. Finally, well equilibrated systems were subjected to a production run of 50 nanoseconds (ns) and internal energy was recorded for every 1.2 ns. The final MD simulation run was completed in eight stages. Energy minimization using the steepest descent method was carried out in two stages, while the equilibration phase occurred in four stages before the final (step 8) production time. Desmond trajectory analysis. The trajectory analysis was carried out using inbuilt python script in the Desmond software package. More intense analysis was carried out by converting Desmond trajectories into Gromacs trajectories using Visual Molecular Dynamics (VMD) software 34 . Furthermore, the representative structure for all the systems was also used to perform secondary structure analysis using the VMD package as described in our previous study 35 . Binding pocket and residue network analysis. A cavity with specific volume and area usually located on the outer surface or interior of the protein has suitable properties for ligand binding and is referred to as a binding pocket 36 . Study of the protein binding pocket dynamics reveals significant information about the interaction specificity. To measure the influence of point mutation on the protein binding pocket, comparative binding pocket analysis for all protein systems was carried out on the final representative structure using a CASTp server at intervals of 10ns 37 .
Furthermore, residue network analysis for all seven systems was carried out using a RING 2.0 server 38 . The RING server generated networks using complex empirical re-parameterization of distance thresholds performed on the PDB, in which nodes are residue and arcs are inter-and intra-chain interactions present in the protein structure. The output xml file of the RING server was visualized using the RINalyzer and structureViz packages in Cytoscape 39 . electrostatic properties calculations. The electrostatic potential differences in the wild-type and mutants were assessed through a Bluues server 40 using a solvent probe radius, salt radius, and minimum atomic radius of 1.4 Å, 2.0 Å, and 1.0 Å, respectively at 0.15 M ionic strength and a temperature of 298.15 K. Moreover, inner and outer dielectric constants were set at 4.0 and 78.5, respectively. MM/GBSA binding free energy. The difference in free energy of the binding of a ligand to an enzyme and a mutant enzyme was calculated using MM/GBSA 41 . In MM-GBSA, free energy is treated as the sum of conformational energy term (i.e., MM) and solvation free energy. The MM, which, as mentioned earlier, stands for (2019) 9:13617 | https://doi.org/10.1038/s41598-019-48940-5 www.nature.com/scientificreports www.nature.com/scientificreports/ molecular mechanics, refers to the type of energy function used to calculate the potential energy of a molecular structure. These functions, usually called force fields, are classical potentials including terms describing covalent bonding, van der Waals interactions, etc. The other part, the solvation term, can be further expressed as the sum of a polar component and a non-polar contribution 42 . The latter is usually assumed to depend linearly on the solvent-accessible surface area. The binding energies between ThyA and dUMP and ThyA and MTHF in the pre-and post-MD simulated complex were computed using the MM-GBSA approach via Schrödinger Prime software 43 . Analyzing atomic motion and fluctuation on MD trajectories. MD simulations generate large amount of data in terms of MD trajectories. PCA is a statistical approach for extracting and analyzing the functional motion of a protein 44 . Initially, a covariance matrix of atomic fluctuation was generated using gmx covar module of gromacs. Diagonalization of the matrix yielded eigenvector and corresponding eigenvalues that describe correlation properties of the collective motion in protein from atomic fluctuation. The eigenvector corresponding to the eigenvalues is called PCA and gives direction and relative amplitude for each atomic displacement in trajectories 45 . The trajectory of an individual eigenvector displays the projections as a function of time using the gmx anaeig command. Gromacs gmx sham package were used to identify significant conformational changes in the wild-type and mutant enzyme free energy landscape.