Potential Inhibitors of Galactofuranosyltransferase 2 (GlfT2): Molecular Docking, 3D-QSAR, and In Silico ADMETox Studies

A strategy in the discovery of anti-tuberculosis (anti-TB) drugs involves targeting the enzymes involved in the biosynthesis of the Mycobacterium tuberculosis (Mtb) cell wall. One of these enzymes is Galactofuranosyltransferase 2 (GlfT2), which catalyzes the elongation of the galactan chain of the Mtb cell wall. Studies targeting GlfT2 have so far produced compounds showing minimal inhibitory activity. To address the current challenge of designing potential GlfT2 inhibitors with high inhibition activity, computational methods such as molecular docking, receptor-ligand mapping, molecular dynamics, and Three-Dimensional Quantitative Structure-Activity Relationship (3D-QSAR) were utilized to deduce the interactions of the reported compounds with the target enzyme and to enable the design of more potent GlfT2 inhibitors. Molecular docking studies showed that the synthesized compounds have binding energy values between −3.00 and −6.00 kcal mol−1. Two compounds, #27 and #31, registered binding energy values of −8.32 ± 0.01 and −8.08 ± 0.01 kcal mol−1, respectively. These compounds were synthesized as UDP-Galactopyranose mutase (UGM) inhibitors and could possibly inhibit GlfT2. Interestingly, the analogs of the known disaccharide substrate, compounds #1–4, have binding energies in the range of −10.00 to −19.00 kcal mol−1. The synthesized and newly designed compounds were subjected to 3D-QSAR to further design compounds with effective interaction within the active site. Results showed improved binding energy from −6.00 to −8.00 kcal mol−1. A significant increase in binding affinity was observed when modifying the aglycon part instead of the sugar moiety. Furthermore, these top hit compounds were subjected to in silico ADMETox evaluation. Compounds #31, #70, #71, #72, and #73 were found to pass the ADME evaluation and, throughout the screening, only compound #31 passed the predicted toxicity evaluation.
This work could pave the way in the design and synthesis of GlfT2 inhibitors through computer-aided drug design and can be used as an initial approach in identifying potential novel GlfT2 inhibitors with promising activity and low toxicity.

is one of the two naturally-occurring sulfonium ion glycosidase inhibitors, known as salacinol 30 . This compound was found to have an inhibitory activity towards α-glycosidase. Compounds #23 and #25 (Fig. 3) are both salacinol-derivatives replacing sulfur atom with nitrogen and selenium, respectively 31,32 .
With regard to the charged-sulfur containing compounds as GlfT2 inhibitors, a series of galactofuranosyl N,N'-dialkyl sulfenamides and sulfonamide were synthesized. Here, it was found that compounds with shorter N,N'-dihexyl chains showed low inhibition activity compared with compounds having longer dioctyl and didecyl chains 34 .
The aforementioned potential GlfT2 inhibitors have low inhibition activities with the enzyme. As such, it is timely to shift to another strategy for designing new GlfT2 inhibitors. In silico studies such as molecular docking and 3D-QSAR are now possible with the availability of the GlfT2 crystal structure (PDB ID: 4FIX) 20 . Molecular dynamics (MD) simulations were performed to obtain an ensemble of protein structures to be used for molecular docking studies. Also, 3D-QSAR was used as a guide in designing new GlfT2 inhibitors. Moreover, the top hit compounds were screened using in silico ADMETox.

Results and Discussion
Ensemble docking. The 100 different protein conformations were obtained from the entire 100 ns simulation using the clustering analysis of the cpptraj 35 module of the AMBERTools 15 package 36 . Previously synthesized and functionalized compounds were prepared using the MarvinSketch software 37 . Subsequently, molecular docking was performed using AutoDock Vina [38][39][40] . Here, the inhibition constant (Ki) was obtained from the binding energy (ΔG) using the formula: Ki = exp(ΔG/RT), where R is the universal gas constant (1.985 × 10−3 kcal mol−1 K−1) and T is the temperature (298.15 K). The previously synthesized and functionalized compounds were docked to obtain the binding energy of the complexes formed between the receptor and the ligands. The natural acceptor substrate was found to have a binding energy of −6.63 ± 0.02 kcal mol−1, with Y236, D256, W309, K369, D372, W399, and Q409 as the key amino acids interacting with it. The binding energy of the natural donor substrate will be used as the reference value here.
First-line anti-TB drugs. Several anti-TB drugs developed were categorized according to target. One classification is the first-line anti-TB drugs (Fig. 4), which inhibit the synthesis of the bacterial cell wall essential for its pathogenicity 41 and virulence 42 . The first-line anti-TB drug compounds were subjected to molecular docking studies. The resulting binding energy values indicate that these compounds bind weakly to the active site. The majority of the first-line anti-TB drugs, such as ethambutol, isoniazid and pyrazinamide, were found to have no interaction with the key GlfT2 active site amino acids (Table 2). Overall, the observed low binding energies of these compounds originated from weak or absent interactions with the amino acids within the active site (Table 1).
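The conversion from binding energy to inhibition constant stated in the ensemble docking paragraph, Ki = exp(ΔG/RT), can be illustrated with a minimal Python sketch. The constants R and T are the values quoted in the text; the function name `inhibition_constant` and the choice of example compounds are illustrative assumptions and are not part of the original workflow.

```python
import math

# Constants as quoted in the text (R is the value stated by the authors).
R_KCAL_PER_MOL_K = 1.985e-3  # kcal mol^-1 K^-1
T_KELVIN = 298.15            # K


def inhibition_constant(delta_g, temperature=T_KELVIN, gas_constant=R_KCAL_PER_MOL_K):
    """Return Ki from a binding energy in kcal/mol using Ki = exp(dG / RT).

    Hypothetical helper for illustration; Ki is in molar units relative
    to a 1 M standard state.
    """
    return math.exp(delta_g / (gas_constant * temperature))


if __name__ == "__main__":
    # Binding energies reported in the text.
    examples = [
        ("natural acceptor substrate", -6.63),
        ("compound #27", -8.32),
        ("compound #31", -8.08),
    ]
    for label, delta_g in examples:
        ki = inhibition_constant(delta_g)
        print(f"{label}: dG = {delta_g:.2f} kcal/mol, Ki ~ {ki:.2e} M")
```

Because Ki depends exponentially on ΔG, a difference of about 1.4 kcal mol−1 at 298.15 K corresponds to roughly a tenfold change in the predicted Ki.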

Current GlfT2 inhibitors in the literature.
Recently, one of the strategies used to synthesize GlfT2 inhibitors is by mimicking the identified donor, acceptor, and transition-state substrates [43][44][45] .
Mimicking GlfT2's donor substrate as GlfT2 inhibitor. Compounds having a structure similar to GlfT2's donor substrate could be used as potential GlfT2 inhibitors. In a previous study, donor substrate mimics such as compounds #49-52 (Fig. 5) were examined using a spectrophotometric assay with GlfT1 and GlfT2. It was found that these compounds inhibit GlfT2 except for compound #50 16 . Docking results showed that the binding energy values of compounds #50 and #52 (Fig. 5) were relatively lower than the reference value (Table 11). On the other hand, with respect to the reference value, compounds #49 and #51 (Fig. 5) exhibited lower binding energy values of −6.79 ± 0.03 and −6.72 ± 0.10 kcal mol−1, respectively (Table 11). Furthermore, compounds #49, #51, and #52 were shown to interact with Y236, W348, P167, G232, and D371 (Fig. 6 and Table 12). Intriguingly, compound #50 has lost its hydrophobic interaction with D256, I368, and T168. In contrast, these amino acids were found to interact with compounds #49, #51 and #52 (Table 12). The absence of these amino acids interacting with compound #50 could be the reason for its weak binding affinity with GlfT2 as described previously 16 .
As discussed previously, compounds #15-19 were found to be UGM inhibitors and GlfT2 donor-substrate mimics (Fig. 4). Thus, all these compounds were subjected to molecular docking studies to probe whether the UGM inhibitors could also act as GlfT2 inhibitors. Results showed that the binding energy values of compounds #15, #16 and #19 were relatively higher than the reference value (Table 3). Compound #17 was found to have a lower binding energy value of −6.36 ± 0.01 kcal mol−1 compared with the reference value. Compound #18, an analog of compound #17, was found to have a binding energy value of −5.02 ± 0.01 kcal mol−1, much lower than that of compound #17. The absence of a UDP moiety in compound #18, which is present in compound #17, could be the reason for its low binding affinity. The linker seems to increase the interaction of the ligand with the amino acid residues within the active site through hydrogen bonding. The interactions among the linker's phosphate group, Y236, and G232 were also observed with compounds #15-17 and #19. The presence of this kind of interaction could be the reason for the stabilization of these compounds within the active site (Fig. 7 and Table 4).
Table 4. Interacting amino acids with Imino-Sugars as GlfT2 inhibitors.
Compound        Interacting amino acids
(not legible)   D256, W348, P167, R171, G232, W347, I368
16              Y236, W348, K369, D372, P167, R171, G232, F169, D371
17              Y236, D256, K369, D372, P167, G232, W347, I368, F169, D371, T168, G231
18              Y236, D256, W348, K369, D372, P167, G232, W347, D371
19              Y236, D256, W348, K369, D372, P167, R171, G232, W347, I368, F169, D371, G231, Q200
Mimicking GlfT2's acceptor substrate as GlfT2 inhibitor. Compounds #20 (galactofuranosyl N,N'-didecyl sulfenamide), #21 (galactofuranosyl dioctyl thioglycoside), and #22 (sulfone derivative of compound #21 via the oxidation of sulfur) (Fig. 8) were screened for inhibitory effect using a disk susceptibility assay 46 . Results revealed that compound #20 exhibited an inhibitory effect comparable to that of the compounds with shorter dialkyl chains. On the other hand, compounds #21 and #22 were found to have an inhibitory effect on GlfT2 at a concentration less than 5 μM 46 . These compounds were subjected to molecular docking studies and showed lower binding energy values compared with the reference value (Table 5). Among these three, compound #22 (Fig. 8) has the highest binding energy due to the additional oxygen (sulfone) that was observed to interact with D371 and D372 (Fig. 9). The dialkyls were observed to have hydrophobic interactions with R171, Y236, D371, D372, G232, W347, W348, and D256 (Table 6). Compounds #53-56 (Fig. 2) were synthesized to evaluate the specificity of GlfT 16 . Using kinetic characterization of GlfT with these compounds, it was found that trisaccharides were better substrates than disaccharides 21 . Docking studies were performed on these compounds and the results showed that the trisaccharides, compounds #55 and #56, have lower binding energy values (−6.24 ± 0.04 and −6.19 ± 0.04 kcal mol−1, respectively) compared with the disaccharides, compounds #53 and #54 (−6.16 ± 0.03 and −6.11 ± 0.03 kcal mol−1, respectively) (Table 13). The additional sugar moiety of a trisaccharide compared to a disaccharide extended the long alkyl chain, allowing interaction with W408 via hydrophobic interaction (Fig. 10 and Table 14). This could be the origin of the enhanced interaction observed in both the experimental and docking studies.
UDP-Galactopyranose mutase (UGM) substrate as GlfT2 inhibitor. Another strategy used to inhibit GlfT2 was by repurposing compounds that were proposed to be inhibitors of enzymes involved in the galactan chain synthesis. One of these enzymes is UDP-Galactopyranose mutase (UGM). In the absence of galactofuranose in mammals 33 , UGM catalyzes the conversion of UDP-galactopyranose to UDP-galactofuranose, which will eventually be the donor substrate of GlfT2 26,47 . Here, some of the proposed UGM inhibitors were subjected to molecular docking studies with GlfT2.
Compound #14 (Fig. 1) was reported to be a weak inhibitor of UDP-Galp mutase 47 . Upon molecular docking investigation, the binding energy was found to be much lower, −4.18 ± 0.01 kcal mol−1, compared with the reference value (Table 3). Furthermore, among the key GlfT2 active site amino acids that were previously discussed, Y236, D256 and W348 were the only amino acids observed to be interacting with compound #14. The decrease in the number of interacting amino acids with compound #14 is one of the reasons for its poor binding affinity. A recent study has discussed the development of a microtiter plate-based assay to screen uridine-based compounds against UGM 48 . Among the compounds in the uridine-based library used in the assay, only compound #27 (Fig. 12) was found to be a weak inhibitor (IC50 = 6.0 μM). Its binding energy value (−8.32 ± 0.01 kcal mol−1) was observed to be lower than the reference value (Table 9). The interactions of the compound with P167, G232, W347, I368, D371, T168, W408, M285 and H296 may account for its improved binding affinity (Fig. 13).
Aside from being a known antibacterial agent for urinary tract infection treatment, compound #28 (Fig. 12) was further used as a UGM inhibitor and found to moderately inhibit the enzyme 49 . Docking this with GlfT2 resulted in a lower binding energy compared with the reference value (Table 9). The absence of the interaction of this compound with W347, I368 and T168 may account for this lower binding affinity compared with compound #27.
Compounds #30 and #31 showed promising activity towards UGM (IC50 = 1.6 μM) 50,51 . These compounds differ in the type and number of halides they contain. Compound #30 has only bromide, whereas compound #31 has two chloride atoms and one iodide atom present in the 5-arylidene-2-thioxo-4-thiazolidinone (ATT) core. Results of the molecular docking studies showed that compound #31 has a lower binding energy compared with compound #30 (Table 9). The observed interaction of Q200 with compound #31's chloride atom and of K369 with its iodide atom may account for the difference in the binding energy values between the two compounds (Table 10).
Compound #29 (Fig. 12) showed no inhibition or poor inhibitory activity with UGM 51 . Molecular docking studies found that it registered a binding energy of only −3.96 ± 0.01 kcal mol−1, which was much lower compared with the reference value. The derivative of the sugar moiety may have occupied only the sugar binding region of the active site and does not interact with the residues within the UDP binding region. This suggests that the residues interacting with UDP contribute to the binding of UDP-Galf, and the absence of UDP, or of any replacement moiety that could mimic it, may account for the compound's low binding affinity.
Among the reported synthesized GlfT2 inhibitors that were included in this study, compounds #27 and #31 (Fig. 12) were found to have binding energies of −8.32 ± 0.01 kcal mol−1 and −8.08 ± 0.01 kcal mol−1, respectively (Tables 9 and 10). The observed higher binding energy values compared with GlfT2's natural substrate seem to originate from the interaction of W347, W348, and W408 (only for compound #27) with the aromatic rings present in the compounds. The presence of tryptophan within the active site could stabilize the inhibitors via π-π interaction with the aromatic rings.
The positively charged moiety of these compounds was found to interact with D371 and the other interacting amino acids as shown in Fig. 11 and Table 8. Still, lower binding energy values were observed compared with the reference value (Table 7). The structural motif and binding energy values of compounds #23-25 were comparable with those of compound #18. These compounds only have a sugar moiety and a short linker (Figs. 1 and 3). As previously discussed, the absence of a moiety that could interact with the amino acids of the UDP binding region could account for the low binding affinity of a molecule within GlfT2's active site.
Newly Designed Sugar Furanosides as GlfT2 Inhibitors. In silico drug design draws attention among researchers because it is time-saving and cheap. Computer-aided drug design uses computational tools to discover, develop, and analyze drugs 52 . One technique used for drug design is ligand-based computer-aided drug design, which involves ligands that are known to interact with the target receptor and accounts for the binding strength of a given molecule by knowing the nature of the interactions 53 .
Functionalizing a drug with an azide group has been largely used in the pharmaceutical industries 54,55 . It was then recognized as a novel pharmacophore in medicinal chemistry, especially with the emergence of zidovudine, an anti-retroviral drug for the treatment of Acquired Immuno-Deficiency Syndrome (AIDS). Also, the azide group was used in tumor-labelling 56 for cancer treatment. It is noteworthy that azido-substituted drugs have high affinity towards the target receptor 57 . It was found that tetrahydroimidazobenzodiazepinthiones (TIBO) or thiourea derivatives are potential drugs for treating TB 58,59 . Guanidine derivative drugs also play a promising role in medicinal chemistry because of their anticancer 60,61 , antiviral 62 , and antibacterial 63 properties. Streptomycin, a known anti-TB drug, has two guanidino groups. The aforementioned functional groups, namely the azido, thiourea and guanidino groups, were used here in designing new GlfT2 inhibitors because of their antibacterial property and high receptor affinity.
The design was an analog of the known GlfT2 substrate disaccharide, octyl β-D-galactofuranosyl-(1→5)β-L-arabino-furanoside. Substrates with longer-chain aglycons were previously found to be better substrates of glycosyltransferases in Mycobacterium species 64 . Compounds #2 and #3 (Fig. 14) were both trans-2-tridecen-1-yl glycosides having a modification in the non-reducing end wherein the 6-OH position was replaced with azido and thiourea functional groups, respectively. On the other hand, compound #1 was obtained via the oxidation of the double bond in the aglycon of compound #3. It was found that compound #3 has a higher binding energy value of −10.32 ± 0.02 kcal mol−1 compared with compound #2, which has a binding energy value of −11.08 ± 0.02 kcal mol−1. It is proposed that the azido group confers a higher binding affinity than the thiourea group, which is evident from the binding energies presented.
From the binding energies of compounds #1 and #3, it can be observed that compound #1 has a lower binding energy of −14.67 ± 0.04 kcal mol−1 compared with compound #3 having −10.32 ± 0.02 kcal mol−1 (Table 15). There were observed interactions among D256, Y236, and thiourea and another set of interactions among D371, D372, the hydroxyl groups of the disaccharide, and the aglycon of compound #1 (Table 16). The addition of two hydroxyl groups on the aglycon effectively increased the enzyme-substrate interaction within the active site.
Compound #4 (Fig. 14), a glyceryl glycoside, has a modification in the non-reducing end wherein the 5-OH and 6-OH positions were replaced with azido and guanidino functional groups, respectively. As observed, D256 and I368 interact with the hydroxyl groups of the glyceryl aglycon (Table 16). Y344, P167 and D258 were observed to interact with the guanidino group and D372 was observed to interact with the azide group (Fig. 15 and Table 16). These residues interact with the inhibitor through hydrogen bonding and seem to be the origin of the observed high binding affinity. Among the newly designed sugar-based inhibitors, compound #4 is the most promising compound with a binding energy of −19.23 ± 0.05 kcal mol−1.
Redesigned Sugar Furanosides as GlfT2 Inhibitors using 3D-QSAR. Structures of the synthesized GlfT2 inhibitors were subjected to 3D-QSAR to improve the structural motif of the inhibitors. 3D-QSAR is an important tool for providing substantial information about the molecular attributes essential for the biological activity of compounds 65,66 .

Results showed that the Pearson coefficient is R² = 0.99, which signifies the reliability of the test and training sets used. The structures of the presented synthesized GlfT2 inhibitors were aligned along their respective molecular field points. Figure 20 shows negative steric field points (green field), which indicate that steric groups should be avoided on that particular part of the molecule, whereas positive steric field points (yellow field) indicate that steric groups should be added on that particular part of the molecule. Moreover, negative electrostatic field points indicate that an electrostatic contributor, i.e. a negatively charged group/hydrogen-bond acceptor, should be added on that particular part of the molecule, whereas positive electrostatic field points indicate that an electrostatic contributor, i.e. a positively charged group/hydrogen-bond donor, should be added on that particular part of the molecule.
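As an illustration of the fit statistic quoted above, the following minimal Python sketch computes the squared Pearson correlation between observed and model-predicted activities. The function name and the activity values are hypothetical placeholders chosen for illustration; they are not data from this study, and the 3D-QSAR software used here may compute its statistic differently.

```python
def pearson_r_squared(observed, predicted):
    """Return the squared Pearson correlation between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    # Sums of products of deviations from the means.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant data")
    return (cov * cov) / (var_obs * var_pred)


if __name__ == "__main__":
    # Hypothetical activities (e.g. binding energies in kcal/mol), not from the paper.
    observed = [-6.1, -6.4, -7.0, -7.6, -8.1]
    predicted = [-6.0, -6.5, -7.1, -7.5, -8.2]
    print(f"R^2 = {pearson_r_squared(observed, predicted):.3f}")
```

A value close to 1 for the training set alone does not establish predictive power, which is why the agreement for a separate test set is usually reported alongside it.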
The 5-OH and 6-OH positions of the non-reducing end of the substrates were functionalized with azido and guanidino groups, respectively, for compounds #58 and #59 (Figs. 16 and 17). For compounds #66-75 (Figs. 16 and 17), the 5-OH and 6-OH positions of the non-reducing end of the substrates were functionalized with amine and methyl groups. For compounds #60-63 and #65 (Figs. 16 and 17), only the 6-OH position of the non-reducing end of the substrates was functionalized with a methyl group. Lastly, for compound #64 (Fig. 16), the 6-OH position of its non-reducing end was functionalized with a thiourea group. (Tables 17 and 19).
Results show that modifying the aglycon instead of the sugar moiety leads to a significant increase in the binding energy of the designed compounds.
Additional amino acids were found to be interacting with these compounds as shown in Tables 18 and 20. It can be seen that compounds #63-65 were found to be interacting with both M286 and K402 through hydrogen bonding (Figs. 18 and 19). These interactions were also observed between compound #59 and K402, and between compounds #68 and #75 and M286 (Figs. 18 and 19).

Structure-activity relationship representation.
Various designs of the possible GlfT2 inhibitors are summarized in Fig. 21 using a Structure-Activity Relationship (SAR) representation. This shows the effect of the different R-groups added to the pharmacophore of the QSAR-based, donor substrate-based and acceptor substrate-based compounds on their activity. To design possible GlfT2 inhibitors, the compound should have at least one sugar moiety (D-Galf), provided that the 5-OH and/or 6-OH positions have R-groups that could disrupt the compound's hydrogen-bond interaction with D372 (Figure 18). In addition, the presence of a UDP or UDP-like moiety (long alkyl chain or steroidal aglycon) could possibly increase the binding affinity of the compound.
ADMETox evaluation of the best candidates. The predicted ADME part of this study was carried out using an online server, SwissADME 67 , that gives values for lipophilicity, water solubility, drug-likeness, and medicinal chemistry (i.e. lead-likeness, and PAINS and Brenk alerts). In silico toxicity evaluation was likewise carried out using an online server, ProTox-II 68 , that gives predicted oral toxicity values, predicted cytotoxicity, mutagenicity, carcinogenicity, hepatotoxicity, and immunotoxicity. In addition, ProTox-II also gives an overview of whether the compounds being analyzed will bind to the proteins known to produce adverse reactions to drugs.
Lead-likeness of a compound is predicted using parameters such as MW (250 g mol−1 ≤ MW ≤ 350 g mol−1), octanol/water partition coefficient (XLOGP ≤ 3.5) and number of rotatable bonds (# rotatable bonds ≤ 7). Results showed that none of the top hit compounds fall within the set criteria. To quantify the complexity of the molecular structure, synthetic accessibility was assessed. The results showed that the scores for the compounds
Absorption, distribution, metabolism and excretion properties evaluation of the top hit compounds. Solubility is one of the major properties influencing absorption. Both the aqueous and non-aqueous solubility of a compound are important, from the drug development process through to oral intake 67 . Lipophilicity is the effective solubility of a compound in a non-aqueous medium and is correlated with various models of drug properties such as absorption, distribution, metabolism and toxicity 70 . Five available predictive models, i.e. iLOGP (implicit log P o/w), XLOGP3 (enhanced atomic/hybrid log P o/w 3), WLOGP (Wildman and Crippen log P o/w), MLOGP (quantitative-structure log P o/w) and SILICOS-IT, were used to evaluate the lipophilicity of the compounds. The mean of the lipophilicity values predicted by these methods is termed the consensus log P o/w. A molecule is more soluble if its consensus log P o/w value is more negative 67 . Results showed that compounds #4, #27, and #65 were soluble in non-aqueous medium (Table 22).
Some drugs have to be highly water soluble to deliver a sufficient amount of the active ingredient. Three models were used by SwissADME to predict water solubility, i.e. ESOL (Estimated SOLubility), Ali and SILICOS-IT (SwissADME in-house solubility predictor). Solubility is estimated qualitatively according to the log S scale: insoluble < −10 < poorly soluble < −6 < moderately soluble < −4 < soluble < −2 < very soluble < 0 < highly soluble 67 .
Based on these predictive models, only compound #65 is predicted to be soluble. Compounds #1, #3, and #4 are predicted to be water soluble while compounds #27, #61, #63, #73 and #75 are predicted to be moderately water soluble. The remaining top hit compounds are predicted to be water insoluble (Table 22).
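The lead-likeness criteria and the qualitative solubility scale described above can be expressed as simple threshold checks. The following Python sketch is illustrative only: the function names are hypothetical, the log S class boundaries (−10, −6, −4, −2 and 0) follow one reading of the scale quoted in the text, the handling of values exactly on a boundary is an assumption, and the example descriptor values are placeholders rather than results from Table 22.

```python
def is_leadlike(mol_weight, xlogp, rotatable_bonds):
    """Check the three lead-likeness criteria stated in the text."""
    return (
        250.0 <= mol_weight <= 350.0   # g/mol
        and xlogp <= 3.5
        and rotatable_bonds <= 7
    )


def consensus_log_p(predictions):
    """Arithmetic mean of the individual log P o/w predictions."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    return sum(predictions) / len(predictions)


def solubility_class(log_s):
    """Map a predicted log S value to a qualitative solubility class.

    Assumed boundaries: insoluble < -10 < poorly < -6 < moderately
    < -4 < soluble < -2 < very < 0 < highly.
    """
    if log_s < -10:
        return "insoluble"
    if log_s < -6:
        return "poorly soluble"
    if log_s < -4:
        return "moderately soluble"
    if log_s < -2:
        return "soluble"
    if log_s < 0:
        return "very soluble"
    return "highly soluble"


if __name__ == "__main__":
    # Placeholder descriptor values for a hypothetical compound.
    print(is_leadlike(mol_weight=512.6, xlogp=4.2, rotatable_bonds=15))
    print(consensus_log_p([2.9, 3.4, 3.1, 2.2, 3.8]))
    print(solubility_class(-5.3))
```

Keeping each criterion as a separate function makes it straightforward to report which specific rule a compound violates rather than only an overall pass or fail.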
As the drug is absorbed by the system, it encounters diverse membrane barriers such as the hepatocyte membrane, gastrointestinal epithelial cells, the blood capillary wall, the glomerulus, restrictive organ barriers (e.g. the blood-brain barrier), and the target cell 70 . A molecule is said to be less skin permeant if the value of log K p is more negative 67,72 . From the predicted results, compounds #4, #27 and #65 are found to be the least skin permeant (Table 23). Moreover, the absorption and distribution of these drugs can also be assessed through human intestinal absorption (HIA) or gastrointestinal (GI) absorption data. These data show that compounds #31, #70, #71, #72, and #73 are predicted to be well-absorbed, whereas compounds #31, #70, #71, #72, and #73 are predicted as non-brain penetrants (Table 23). None of the top hit compounds was predicted to be blood-brain-barrier (BBB) permeant. This means that the compounds being proposed here have a relatively large size and cannot pass the blood-brain barrier. Also, a compound being non-blood-brain permeant lowers the possibility of causing harmful toxicants in the brain and blood stream when metabolized. The remaining compounds were predicted to be neither absorbed nor brain penetrant.
After being distributed throughout the organism's system, these drugs are metabolized and eventually safely excreted. Metabolism plays an important role in the bioavailability of drugs as well as in drug-drug interactions. It is also important to understand whether a certain compound is a substrate or non-substrate of the permeability glycoprotein (P-gp). This protein belongs to the ATP-binding cassette transporters, which are important in assessing active efflux through biological membranes. It is also essential to have knowledge of the interaction of molecules with cytochrome P450 (CYP) enzymes, as they are involved in drug elimination through metabolic transformation 73 . It has been suggested that CYP and P-gp can process small molecules synergistically to enhance the protection of tissues and organisms 74 . Inhibition of these isoenzymes may result in pharmacokinetics-related drug-drug interactions that could lead to unwanted adverse side-effects by lowering the solubility and the accumulation of the drug or its metabolites. To better understand the mechanism of drug deposition, efficacy and toxicity, the top hit compounds were evaluated to determine whether each compound can act as a substrate or an inhibitor of P-gp and CYPs. All compounds are found to be substrates of P-gp except for compounds #1, #2, #4, #61 and #69. Moreover, the top hit compounds presented were found to be substrates of CYP1A2, CYP2C19 and CYP2D6. All compounds are predicted to be CYP2C9 substrates except compounds #31 and #73, whereas, for CYP3A4, compounds #27, #58, #59, #63, #64, #65, #68, and #72 were found to be potential substrates (Table 24).
In silico toxicity evaluation of top hit compounds. Investigating the ADMET properties of a compound is a critical step for drug development. If a drug passes this step, subsequent toxicity tests are warranted. However, toxicity tests are time consuming and expensive, especially if there is a significant number of candidate compounds 75,76 . To keep up with increasing demand from the pharmaceutical industries, in silico toxicity evaluation is initially used to determine the compound's toxicity as a fast and inexpensive method to reduce the number of compounds to be sent later for further testing. In silico toxicity evaluation cannot serve as a definitive answer for the compound's toxicity evaluation 75 . Thus, it should always be accompanied by in vitro and in vivo experiments to verify the biological activities beyond the capability of these computational approaches.
Here, the top hit compounds were subjected to an in silico toxicity evaluation using ProTox-II. The LD50 is defined as the median lethal dose of a compound, i.e. the dose at which 50% of the test subjects die upon exposure to it. The toxicity class ranges from 1 to 6, 1 being fatal if ingested and 6 being non-toxic 77 . The results showed that the top hit compounds #3, #4, #63, #65, #66, #67, and #73 were predicted to be orally toxic (range between toxicity class 1 to 3) (Table 25).
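The mapping from a predicted LD50 to a toxicity class can be sketched as a threshold lookup. The text above gives only the range of classes (1 to 6) and the cut-off for "orally toxic" (classes 1 to 3); the LD50 boundaries used below (5, 50, 300, 2000 and 5000 mg/kg, following the Globally Harmonized System scheme) are an assumption about how the server assigns classes, and the function names and example doses are hypothetical.

```python
import bisect

# Assumed upper LD50 bounds (mg/kg body weight) for classes 1 to 5;
# anything above the last bound falls into class 6.
_CLASS_UPPER_BOUNDS_MG_PER_KG = [5, 50, 300, 2000, 5000]


def toxicity_class(ld50_mg_per_kg):
    """Return the oral toxicity class (1 = most toxic, 6 = non-toxic)."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    # bisect_left counts how many bounds are strictly below the LD50,
    # so a value equal to a bound stays in the more toxic class.
    return bisect.bisect_left(_CLASS_UPPER_BOUNDS_MG_PER_KG, ld50_mg_per_kg) + 1


def is_predicted_orally_toxic(ld50_mg_per_kg):
    """Classes 1 to 3 are treated as orally toxic, as in the text."""
    return toxicity_class(ld50_mg_per_kg) <= 3


if __name__ == "__main__":
    # Hypothetical LD50 values, not taken from Table 25.
    for ld50 in (3, 120, 1500, 7000):
        print(ld50, toxicity_class(ld50), is_predicted_orally_toxic(ld50))
```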
The ProTox-II online server 68 also predicts four toxicological endpoints, namely cytotoxicity, mutagenicity, carcinogenicity, and immunotoxicity. Results suggested that all the top hit compounds were predicted to be immunotoxic except for compound #31 (Table 26). Immunotoxic chemicals are known to alter the correct functioning of the immune system by B cell growth inhibition 68,77 . Moreover, organ toxicity, specifically hepatotoxicity, was predicted to evaluate whether a compound will cause liver dysfunction 68,77 . Results showed that the top hit compounds were predicted to be non-hepatotoxic. Moreover, compound #4 was predicted to be a mutagenic compound (Table 26). This means that it can possibly cause alteration of genetic material, such as the DNA of an organism.
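The endpoint screening described above amounts to filtering for compounds that are inactive across every predicted endpoint. The Python sketch below encodes only the rows of the toxicity endpoint table that are fully legible in this text (compounds #1 to #4, #27, #31 and #58 to #62); the data layout and function names are illustrative assumptions.

```python
ENDPOINTS = (
    "hepatotoxicity",
    "carcinogenicity",
    "immunotoxicity",
    "mutagenicity",
    "cytotoxicity",
)

# True = predicted active; one tuple per compound in ENDPOINTS order.
# Only rows that are complete in the extracted table are included.
PREDICTIONS = {
    "1": (False, False, True, False, False),
    "2": (False, False, True, False, False),
    "3": (False, False, True, False, False),
    "4": (False, False, True, True, False),
    "27": (False, False, True, False, False),
    "31": (False, False, False, False, False),
    "58": (False, False, True, False, False),
    "59": (False, False, True, False, False),
    "60": (False, False, True, False, False),
    "61": (False, False, True, False, False),
    "62": (False, False, True, False, False),
}


def active_endpoints(compound):
    """Return the names of the endpoints predicted active for a compound."""
    flags = PREDICTIONS[compound]
    return [name for name, active in zip(ENDPOINTS, flags) if active]


def passes_all_endpoints(compound):
    """A compound passes when no endpoint is predicted active."""
    return not active_endpoints(compound)


if __name__ == "__main__":
    passing = [c for c in PREDICTIONS if passes_all_endpoints(c)]
    print("Pass all endpoints:", passing)
    print("Compound #4 active endpoints:", active_endpoints("4"))
```

Within the rows encoded here, compound #31 is the only one with no active endpoint, which matches the statement in the text.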
Lastly, toxicity of the compounds depends on the different metabolic mechanisms. Several enzymes could either metabolize the drug therapeutically or lead to the formation of toxic metabolites. Below are the possible targets defined according to Novartis that are linked with adverse drug reactions: Adenosine A2A receptor (AA2AR), Adrenergic beta        78 .
The results showed that the top hit compounds are non-binders of these proteins except for compounds #4 and #65, which were predicted as binders of Prostaglandin G/H synthase 1 (Table 27).

Conclusion
Tuberculosis is still a worldwide health problem due to the emergence of strains of M. tuberculosis that are resistant to existing anti-TB drugs. There is now a growing interest in targeting GlfT2, the enzyme responsible for the growth of the galactan chain, an important part of the cell wall. To obtain insights into the different interactions of the synthesized compounds with GlfT2, we performed ensemble molecular docking studies, and the binding energy values of the synthesized compounds ranged from −3.00 to −6.00 kcal mol−1. Two compounds, #27 and #31, registered binding energy values of −8.32 ± 0.01 and −8.08 ± 0.01 kcal mol−1, respectively. These compounds were synthesized as UGM inhibitors and could possibly inhibit GlfT2. Compounds #1-4 are analogs of a known substrate disaccharide modified at the 6-OH and 5-OH positions of the non-reducing end. Docking studies showed that these are promising compounds with binding energy values of −10.00 to −19.00 kcal mol−1. The synthesized and designed compounds were subjected to 3D-QSAR to improve their structural scaffolds and effective interactions with the GlfT2 active site. Here, 18 newly designed compounds were produced considering all steric and electrostatic descriptors. Furthermore, these 18 compounds were all subjected to molecular docking and showed increased binding energy values from −6.00 to −8.00 kcal mol−1. Also, a significant increase in the binding energy value was observed when modifying the aglycon part instead of the sugar moiety. Thus, it is suggested that modification of the aglycon could be a better way to design GlfT2 inhibitors. The drug development process includes ADMETox evaluation to determine whether a proposed drug can be absorbed or can be toxic; thus, the top hit compounds were subjected to in silico ADMETox. Compounds #31 and #70-73 are predicted to be well-absorbed and non-blood-brain-barrier permeant.
Moreover, compounds #31 and #73 were considered CYP2C9 inhibitors, which could lead to adverse side effects. Compounds #70, #71, and #72 passed the ADME evaluation. Predicted toxicity evaluation showed that only compound #31 was non-toxic and passed all the toxicity endpoints.

Methods
Molecular dynamics simulation. Two GlfT2 crystal structures are available in PDB. One is bound with UDP-Galf (PDB ID: 4FIY) and the other one is unbound (PDB ID: 4FIX) 20 . The difference in the binding affinity of the natural acceptor substrate with or without the donor substrate in the active site is statistically insignificant. Thus, for system simplification, the unbound GlfT2 crystal structure (PDB ID: 4FIX) was used for 100 ns all-atom
Compound   Hepatotoxicity   Carcinogenicity   Immunotoxicity   Mutagenicity   Cytotoxicity
1          Inactive         Inactive          Active           Inactive       Inactive
2          Inactive         Inactive          Active           Inactive       Inactive
3          Inactive         Inactive          Active           Inactive       Inactive
4          Inactive         Inactive          Active           Active         Inactive
27         Inactive         Inactive          Active           Inactive       Inactive
31         Inactive         Inactive          Inactive         Inactive       Inactive
58         Inactive         Inactive          Active           Inactive       Inactive
59         Inactive         Inactive          Active           Inactive       Inactive
60         Inactive         Inactive          Active           Inactive       Inactive
61         Inactive         Inactive          Active           Inactive       Inactive
62         Inactive         Inactive          Active           Inactive       Inactive
63         Inactive         Inactive          Active           Inactive