Structural and free energy landscape of novel mutations in ribosomal protein S1 (rpsA) associated with pyrazinamide resistance

Resistance to key first-line drugs is a major hurdle to achieve the global end tuberculosis (TB) targets. A prodrug, pyrazinamide (PZA) is the only drug, effective in latent TB, recommended in drug resistance and susceptible Mycobacterium tuberculosis (MTB) isolates. The prodrug conversion into active form, pyrazinoic acid (POA), required the activity of pncA gene encoded pyrazinamidase (PZase). Although pncA mutations have been commonly associated with PZA resistance but a small number of resistance cases have been associated with mutationss in RpsA protein. Here in this study a total of 69 PZA resistance isolates have been sequenced for pncA mutations. However, samples that were found PZA resistant but pncA wild type (pncAWT), have been sequenced for rpsA and panD genes mutation. We repeated a drug susceptibility testing according to the WHO guidelines on 18 pncAWT MTB isolates. The rpsA and panD genes were sequenced. Out of total 69 PZA resistant isolates, 51 harbored 36 mutations in pncA gene (GeneBank Accession No. MH46111) while, fifteen different mutations including seven novel, were detected in the fourth S1 domain of RpsA known as C-terminal (MtRpsACTD) end. We did not detect any mutations in panD gene. Among the rpsA mutations, we investigated the molecular mechanism of resistance behind mutations, D342N, D343N, A344P, and I351F, present in the MtRpsACTD through molecular dynamic simulations (MD). WT showed a good drug binding affinity as compared to mutants (MTs), D342N, D343N, A344P, and I351F. Binding pocket volume, stability, and fluctuations have been altered whereas the total energy, protein folding, and geometric shape analysis further explored a significant variation between WT and MTs. In conclusion, mutations in MtRpsACTD might be involved to alter the RpsA activity, resulting in drug resistance. Such molecular mechanism behind resistance may provide a better insight into the resistance mechanism to achieve the global TB control targets.

conserved among MTB strains and fully capable with POA binding. Amino acids at positions, F307, F310, H322, D352, and Arg357 are present in RNA binding sites of fourth S1 domain 11 . PZA resistance emerges due to mutations at MtRpsA CTD of mycobacterial species, causing conformational changes in the POA binding site 3 . Residues, Lys303, Phe307, Phe310, and Arg357, also known as tmRNA binding site, have been identified, involving in the interactions with two POA molecules 3,12 . In Mycobacterium smegmatis, Alanine deletion at the C-terminal (RpsA ΔA438 ) results in the lack of binding with RpsA. MtRpsA CTD is the POA binding site where tmRNA bind each other to form a complex for initiation of translation 12 . Protein structure may have drastic effects due to amino acid substitution, altering the protein structure and function, especially in the active site or binding pockets [13][14][15] . Mutation may also produces effects at a long-range position 16 . Exploring the mechanism of such changes behind a mutation, has been investigated for better understanding of a particular phenomenon. However, these are time-consuming and very expensive investigations when addressed by experimental procedure alone.
Molecular dynamic (MD) simulations is a method of choice that have been applied widely in exploring the mechanisms of conformational changes in protein, especially in drug resistance mechanisms caused by mutations. MD simulation studies of ligand-protein interactions are widely applied approach to explain the mechanisms of drug resistance due to mutations especially in target protein, which is one of the major causes behind resistance. In vivo experimental research, the crystal structure is analyzed for drug resistance while in comparison with experimental approach, MD simulation has a particular advantage to explain the mechanisms of drug resistance at molecular level 14,17 . Further, the structural dynamics of protein complexes and other residues level information can be accessed through MD which have been considered difficult by experimental procedures [18][19][20][21] .
In our recent studies, we identified different mutations in pncA gene 22 (GeneBank Accession No. MH46111) and RpsA 23 , whose molecular mechanism of resistance have been investigated through MD simulation [24][25][26] . Here, we analyzed the effect of our novel mutations, D342N, D343N, A344P, and I351F, on RpsA activity which have been detected in the conserved region in our previous study 23 among PZA-resistance isolates. We have investigated the possible changes in the RpsA dynamics, that results due to mutations in MtRpsA CTD which may provide useful information behind the drug resistance.

Results
The repeating DST results demonstrated that the isolates are resistant to PZA. Further, the resistant samples were also analyzed manually where growth was occurred against the critical concentration of PZA. In the previous study, out of total 69 PZA resistant isolates, 51 harbored 36 mutations in pncA gene (GeneBank Accession No. MH46111). The remaining 18 isolates were pncA WT PZA resistance isolates. Out of 18 PZA resistants but pncA WT isolates, 11 (61%) samples have fifteen non-synonymous mutations, while seven isolates were RpsA WT (Table 1). We did not detect any mutation in panD genes. Mutations, S324F, E325K, G341R, D342N, D343N, A344P, and I351F, were present in the conserved region (292-363) of the rpsA gene ( Table 1). Mechanism of PZA resistance behind mutations S324F, E325K, and G341R in rpsA have been already investigated in our previous study. Here, we analyzed the mechanism of resistance behind mutation, Asp342Asn, Asp343Asn, Ala344Pro, and Ile351Phe, which have been involved in the conformational changes that might be associated with RpsA activity.
Binding pocket and shape complementarity. Any diviation from pocket volume may effect the interaction with drug. Compared with WT (4638.493 Å), MTs Asp342Asn (4622.668 Å), Asp343Asn (4623.892 Å), Ala344Pro (4618.064 Å), and Ile351Phe (4623.253 Å), have been altered the binding pocket volume. Shape complementarity score of WT was also found higher (3862) than the MT (Table 2). Further, the RMSD of superimposed MT and WT has a significance difference showing the effect of mutation effect on protein structure (Fig. 1).  Free energy landscape and Gibbs free energy. Gibbs free energy (GFE) is a measure of work of a closed system when exchanging heat with the surroundings. The differences in GFE values may have importance in the stability calculation. A protein with native structure has the minimum GFE. WT exhibited a significant difference in GFE than MT, depicted in (Fig. 7). The color (red) in both states of the MT is more unstable compared to the MT.
The ΔG bind of WT (−6.88 kcal/mol) is very low in comparison with ΔG bind of D342N (−3.408 kcal/mol), D343N (−3.605 kcal/mol), A344P (−4.620 kcal/mol), and I351F (−3.937 kcal/mol) ( Table 2), showing a very low binding affinity and stability. The results of Gibbs free energy and ΔG bind is showing that these mutations might be involved in no or low binding affinity with POA, causing resistance.
Distance matrix. The average distance of WT RpsA and drug is almost constant during the simulation period as compared to MTs, where a high degree of fluctuation ( Fig. 8) representing the variation in distance between drug and protein during the whole simulation period. The MT, D343N and I351F, exhibited a high average distance between target and drug, signifiying the effect of mutations on proteins binding affinity. The total energy of WT RpsA (−76000 kJ/mol) was significantly lower than MTs (74500kj/mol, 73000kj/mol, 71500kj/mol, and 70500kj/mol) throughout simulation as shown in Fig. 9.  www.nature.com/scientificreports www.nature.com/scientificreports/

Discussion
Drug resistance is the main obstacle towards the WHO end TB 2030. Investigating the prevalence of PZA resistance in TB high burden countries 23,27 and insight into drug resistance mechanisms is vital for better supervision of global TB control. PZA is the only drug that kills persister MTB under latent state. According to the previous investigations, the emergence of PZA resistance is associated with mutations in pncA, but not in all cases, where resistance may be emerged due to mutations in the target RpsA 23,28-30 . In our previous study 23 , we reported novel mutations in conserved region called MtRpsA CTD (Table 1). However, the molecular mechanism behind resistance was still unknown. Here, we explored the inside mechanism of resistance due to RpsA mutations which have  www.nature.com/scientificreports www.nature.com/scientificreports/ ribosomal protein S1 (RpsA) WT (black) retained a constant RMSD from 25 ns to 100 ns. RMSD of D342N seems to be inconsistent as it rises at 100 ns. In apo state, D343N exhibited a continuous rise in RMSD from 0 ns to 78 ns. However, a little fall have been shown at 95 ns-100 ns. While in complex state, the RMSD is little stable but seems, still rising at 83 ns-100 ns. RMSD of MT A344P is rising at 80 ns-100 ns in apo state, however, it seems stable in complex state. MT I351F attained a little higher RMSD in both, apo and complex state. However, it seems highly unstable in apo state, rising continuously 20 ns-100 ns. www.nature.com/scientificreports www.nature.com/scientificreports/ been associated in-vivo PZA resistance. MD simulations provide the interaction mechanisms at the molecular level 17,31,32 along with structural and dynamical information, which is difficult to be determined by experimental procedures. Protein's structure and function are maintained by conserved region residues where mutations have  www.nature.com/scientificreports www.nature.com/scientificreports/ been reported, leading to conformational changes or a loss of function [33][34][35] . To explore these mechanistic effects behind the PZA resistance, we analyzed multiple characteristics of protein affected by conserved region mutations in MtRpsA CTD .
A higher RMSD and RMSF values of MT, D342N, D343N, A344P, and I351F signifies the effect of point mutations when compared with WT, affecting the activity of RpsA to interact with the POA. Stability and flexibility are essential properties maintaining the activity of biomolecules 36 . Mutations in drug target often affect these properties, making them a vulnerable target to interact with drugs. Increases in residue flexibility have a significant effect on activity. The earlier reports 32,37-39 have also found the effect of mutations in a specific location, resulting in the loss of activity. Thus RMSD and RMSF may be as a measure of effect behind mutations in targets, resulting in drug resistance or disease where stability is a fundamental property, affecting the function, activity, and regulation of biomolecules. Changes in stability and flexibility of targets proteins may be the loss of thermodynamic  www.nature.com/scientificreports www.nature.com/scientificreports/ stability and protein folding 40 . Destabilization in folding and thermodynamic stability may affect the total energy of biomolecules (Fig. 9).
Further, proteins folding stability may also be measured by Radius of gyration (Rg) which is a degree of compactness, commonly measured in MD simulation by a ratio of the accessible surface area to the surface area of the ideal sphere of the same volume, plotted against time. A stable Rg value signifies the proteins correct folding (Fig. 5), while any deviation is regarded as folding instability [41][42][43] . In comparison with MTs, a more stable Rg attanied by WT is an indication of correct folding during simulations (Fig. 5).
Oftenly, mutations may occurr far from the active site, causing some drastic changes in binding pocket volume. These distant sites may have a good communication with enzyme function in signal transmissions from one functional site to far site through a series of pathways during biological function 44,45 . Any change to the binding pocket may lead to the loss of interaction with inhibitors. Resistance to the drug may be developed due to change in the binding pocket 38,46,47 . The binding affinity of RpsA to POA may also be lost due to alteration in pocket size. In the binding pocket, the primary interactions accounted with ligand are, hydrogen bonds, van der Waals and electrostatic forces 36 . The three-dimensional structure of a protein has hydrogen and hydrophobic interactions playing a vital role in interactions. Weak interactions have fewer hydrogen and hydrophobic interactions. WT protein has more and MT, D342N, D343N, A344P, and I351F, has significantly fewer interactions indicating the strength of protein folding andbinding stability. Hydrogen bonds support the core, which is comprised of α-helices and β-sheets [48][49][50] . In spite of mutations, a low level of resistance may be developed due to efflux or influx 51 . Fore more better understanding of drug resistance, the role of the efflux needs to be specified.
We analyzed the consequences of our novel mutations D342N, D343N, A344P, and I351F on RpsA dynamics by comparing them with WT in PZA-resistant pncA WT MTB strains. These mutations have been involved, changing the RpsA activity by alltering the total energy, flexibility, folding and stability, thereby affecting the interactions with POA. In comparison with WT, interaction of POA and RpsA seemed to be altered due to variations in the binding pocket of MTs D342N, D343N, A344P, and I351F. The overall investigations supports the hypothesis that mutaions in the conserved region of the rspA gene might be involved in PZA-resistance. To the best of our knowledge, we presented a first comprehensive investigation of such kind where multiple characteristics have been investigated for better insight into mutations affecting RpsA activity and caused PZA resistance.

Samples collection. Provincial Tuberculosis Reference Laboratory (PTRL) is the Central laboratory of
Khyber Pakhtunkhwa (KPK) province of Pakistan. We sequenced 69 PZA resistance isolates for pncA gene mutations, out of which 51 has 36 different mutations in pncA gene 22 while 18 isolates were detected as pncA WT . We collected all the 18 isolates, previously detected as resistant to PZA (PZA-R) but pncA WT . To screen mutation in rpsA and panD genes, all these samples were grown on 7H9 media in the mycobacterium growth indicator tube (MGIT) 960 system for confirmation as MTB 52 . Approximately 100 µl of the sample of the positive tube was added to TBc ID device that indicates a positive MTB by the emergence of pink to red at the test and control location, confirming the antigen, MPT64 in the sample 53,54 . All the positive MTB tubes were subjected for repetition of drug susceptibility testing (DST). www.nature.com/scientificreports www.nature.com/scientificreports/ PZA DST. MTB Drug susceptibility testing (DST) is now routinely performed through automated BACTEC MGIT 960 system 55 . The samples were repeated for PZA DST along with positive (ATCC 25618/H37Rv) and negative controls (Mycobacterium bovis). A sample was marked as PZA-resistance when growth occurred at 100 μg/ml of PZA critical concentration 55 . Interpretation ofsci drug susceptibility testing. The test completion is interpreted when the growth unit (GU) value of control (GC) attained as 400 or more. An inventory report is printed as "S" for sensitive isolate and "R" for resistance. The growth unit (GU) value was less than 100 for sensitive isolates and greater than 100 for resistants 55-57 . rpsA and panD gene sequencing. DNA was isolated from the PZA resistance samples using the sonication method 58,59 . Forward and reverse primers, F-5′CGGAGCAACCCAACAATA-3′ and R-5′ GTGGACAGCAACGA CTTC-3′) 60  Crystal structure retrieval. A crystal structure of MTB RpsA 3 (PDB ID 4NNI) was retrieved from Brookhaven Raster Display (BRAD) protein data bank (PDB) 63 . Mutant structures of RpsA were generated by inducing mutations at D342N, D343N, A344P, and I351F residues using PYMOL 64 . The POA structure was retrieved from PubChem database (PubChem CID: 1047) 65 and energy minimized.

Molecular docking.
Structures were loaded into Chimera 66,67 , selenomethionines were changed into methionine and hydrogen atoms were corrected. Shape complementarity measuring score of drug and protein structures was performed using PatchDock server 68 , which is a kind of geometric matching, where drug and protein features are compared for the docking purpose. Molecular complexes have been found with good interactions are oftenly exhibited good shape complementarity. Mutations often cause conformational changes, affecting the interactions with a drug 69,70 . Pocket volumes of WT and mutant D342N, D343N, A344P, and I351F were compared through CastP server 71 .

Molecular dynamics (MD) simulation. Molecular dynamics (MD) simulations provide plenteous dynam-
ical structural and energetic information about the target protein and drug interactions. The essence of receptor and drug interaction can be analyzed accurately as MD simulation provides useful information in understanding the structure-function relationship, guiding the drug discovery and designing practice 47 . MD simulation was performed on all the structures using AMBER while topological parameters were generated using the antechamber tool (. frcmod). A transferable intermolecular potential with 3 points (TIP3P) water box was used to solvate the proteins and ligand coordinates with a distance of 8.0 Å 72 . Sodium and chloride ions were added to neutralize the systems, as these ions possess high electrostatic potential in the replacement of water molecules. The steepest descent algorithm was used to minimize the energy.
Constant volume and temperature (NVT) ensemble and pressure and temperature (NPT) ensemble was equilibrated and set out at 1-bar pressure for 100 ps. Temperature and pressure were maintained constant using Berendsen thermostat 73 and Parrinello-Rahman barostat 74 methods respectively. Linear constraint solver (LINCS) algorithm 75 was used for bond length rectification and CPPTRAJ was used for the post-dynamics simulation analysis such as RMSD, RMSF, radius of gyration (Rg), and total energy. Further, a free energy landscape calculation methodology, implemented in Gromacs was used to calculate the energy changes and metastable states during the course of simulation. Essential dynamics. To obtained internal motion of the system, Principal Component Analysis (PCA) of trajectory was performed on the mass-weighted cartesian coordinates, where rotation and its translation were removed. Low modes are recognize by long dynamics in proteins 76,77 . Further, PCA reduces the motion in a trajectory [78][79][80] where A set of variables z1, z2…, zp, transformed, called principal components (PCs) are generated. Free Energy Landscape (FEL) 81,82 is applied to calculate the energies of macromolecule conformations sets. The first two PCs (PC1 vs. PC2) of PCA computed, give the trajectories on the two principal components of motion 83 . The amplitude of fluctuations in proteins were captured through PCA and the first two components (PC1 and PC2) were plotted to analyze their fluctuation boundries 84,85 .
The lowest energy stable state is represented by the free energy landscape (FEL). The stable state with minimal energy on a plot is shown by deep valleys while boundaries between the deep valleys represent the intermediate conformations 86 . FEL was plotted, using g_sham module while PC1 and PC2 were used to analyze the FEL using the equation: www.nature.com/scientificreports www.nature.com/scientificreports/ The Gibbs and total binding free energy. Gibbs free energy (G), is a single value, combining enthalpy and entropy while the change in free energy (ΔG) is the sum of the enthalpy plus the product of entropy and the temperature of the system.
The G is minimized to the chemical equilibrium state of the system and it is a thermodynamic potential at constant pressure and temperature, where a reduction is a vital state for processes. The G (Sugita & Kitao, 1998) was plotted against WT. The binding free energy was measured from 500 snapshots at the end of 10-ns trajectory. The free energy of each component was estimated using the following equation: ∆G bind represents the total binding free energy The free energy of each component was estimated as: bond ele v dW pol n pol G bond , represents bonded, G ele is electrostatic, and G vdW is van der Waals interactions while Gpol and G npol represent polar and nonpolar to the total solvated free energies.