Introduction

Tuberculosis (TB) remains one of the deadliest diseases worldwide1,2,3. The causative agent of this disease is a pathogenic microorganism, Mycobacterium tuberculosis (Mtb), a rod-shaped mycobacterium4. The World Health Organization (WHO) recently reported that one-third of the world’s population5 has TB. The number of infected individuals has declined only slowly in recent years6, indicating that fighting this disease remains challenging. In the Philippines, TB has been identified as the sixth leading cause of morbidity and mortality, and the country ranks sixth among the top 22 TB-burdened countries worldwide7. In addition, the Philippines is considered one of the countries with the highest number of cases of multi-drug resistant tuberculosis (MDR-TB)8. Despite active efforts in administering the latest medical treatment to prevent the spread of TB, this pulmonary illness remains a global threat.

There are two kinds of drug resistance in TB: (a) mono-resistance and (b) poly-resistance. The latter is divided into two: (a) multi-drug resistant (MDR-TB), resistant to at least two of the standard or first-line anti-TB drugs, and (b) extensively/extreme-drug resistant (XDR-TB), resistant to at least two of the first-line anti-TB drugs and also to the second-line anti-TB drugs. Inconsistent TB treatment of patients has led to a new strain that is totally drug-resistant (TDR-TB), that is, resistant to both the first-line and second-line anti-TB drugs9,10. There is currently no known treatment for TDR-TB; this opens up the challenge of developing new strategies for the design of new anti-TB drugs.

The cell wall of Mycobacterium species acts as a rigid scaffold that hinders the penetration of antibiotics. Some commercially available anti-TB drugs (e.g. ethambutol or isoniazid) are used to disrupt cell wall biosynthesis11,12,13, in combination with other anti-TB drugs (e.g. rifampicin or streptomycin) that have intracellular targets14. Current anti-TB treatment, i.e. DOTS15, targets the integrity of the Mtb cell wall and allows the facile permeation of antibiotics into the organism. The microorganism’s ability to adapt and develop resistance to these drugs is still a challenge. This continuing battle is evident in the emergence of resistant strains responsible for MDR-, XDR-, and TDR-TB in some regions of the world. Hence, a potential strategy for TB treatment is the search for novel compounds that can interfere with the biosynthesis of the Mtb cell wall complex.

Mtb’s cell wall efficiently protects the mycobacteria from detrimental factors during the infection stage; at the same time, it could also be the weak spot of the organism16. One component of the Mtb cell wall is the arabinogalactan (AG) complex. This is a unique structure that is composed of D-galactofuranosyl and L-arabinofuranosyl monosaccharides17. The AG complex, like other carbohydrate polymers, is essential for mycobacterial viability18,19. Thus, the enzymes involved in its biosynthesis might serve as putative therapeutic targets. One of the enzymes involved in the synthesis of this complex is galactofuranosyltransferase 2 (GlfT2), an enzyme that catalyzes the transfer of galactofuranosyl residues from UDP-Galf to the growing galactan chain20,21. This chain is composed of ~30 D-galactofuranose (Galf) residues that are linked via alternating β-(1 → 5) and β-(1 → 6) linkages, with the reducing end covalently attached to a linker disaccharide consisting of rhamnose and N-acetylglucosamine22.

Several research efforts have focused on GlfT2 inhibitors as anti-TB drugs. Current strategies in the design of GlfT2 inhibitors involve mimicking either the donor or the acceptor substrate of the enzyme16,21,23. Uridine diphosphate-galactofuranose (UDP-Galf) is considered the donor substrate of GlfT2. The approach used is to design a glycomimetic compound with a disrupted hydrogen-bonding interaction with the enzyme’s catalytic site, specifically with D372, through modification of the 6-OH position16. With this strategy, four compounds were synthesized: UDP-6-fluoro-α-D-Galf (compound #49), UDP-β-L-Araf (compound #50), UDP-5-deoxy-α-D-Galf (compound #51), and UDP-6-deoxy-α-D-Galf (compound #52). Compounds #50 and #52 were found to inhibit the production of the glycolipids, and compounds #49 and #51 effectively reduced the length of the galactolipid produced16.

The drawback of the aforementioned compounds is their inability to pass through the cell membrane because of the polar sugar moiety and the charged diphosphate group. Thus, another strategy was proposed in which the diphosphate group of UDP-Galf was replaced with amino acids such as lysine, glutamine, tryptophan, and histidine. Also, L-Araf was used instead of D-Galf. Among the four sugar-amino acid-nucleosides, those with tryptophan and histidine as the replacement for the diphosphate moiety showed 30% and 37% inhibition of GlfT2, respectively23.

Another series of GlfT2 donor-mimicking inhibitors are compounds #15–18 (Fig. 1). These are structurally described as imino-galactofuranose sugar moieties with uridine as an aglycon. These compounds differ only in linker composition: an amide (compound #15)24, a phosphate and a double bond (compound #16)25,26, and a phosphate and a hydroxyl group (compound #17)27. Compound #18 (Fig. 1) is considered an analog of compound #1728. Furthermore, compound #19 (Fig. 4) is an imino-galactofuranose with UDP as an aglycon. These probes were previously used as UDP-galactopyranose mutase (UGM) inhibitors26. It was found that only compound #18 exhibited less than 35% inhibition activity against UGM26.

Figure 1 Structures of Imino-Sugars as GlfT2 inhibitors.

A previous study synthesized compounds #53–56 (Fig. 2) to evaluate GlfT’s specificity16. These compounds are octyl di- and tri-saccharides. Using kinetic characterization of GlfT with these compounds, it was found that trisaccharides were better substrates than disaccharides. The current strategy for designing acceptor-substrate mimics is based on these results.

Figure 2 Structures of Synthetic Acceptor Substrates as GlfT2 inhibitors.

Aside from mimicking the donor and acceptor substrates to inhibit GlfT2, mimicking the positive character of the transition state has also been explored. A recent study synthesized 14 sulfonium ions with varying side chains. These are expected to mimic the transition state of the glycosyl transfer reaction. Among these compounds, the sulfonium ion with a 12-hydroxydodecyl side chain exhibited the highest inhibition activity (60%)29.

Because both glycosidases and galactofuranosyltransferases carry out glycosyl transfer reactions, the proposed inhibitors of glycosidases may also serve as GlfT2 inhibitors29. Compound #24 (Fig. 3), known as salacinol, is one of the two naturally occurring sulfonium ion glycosidase inhibitors30. This compound was found to have inhibitory activity towards α-glycosidase. Compounds #23 and #25 (Fig. 3) are salacinol derivatives in which the sulfur atom is replaced with nitrogen and selenium, respectively31,32.

Figure 3 Structures of the Transition State Mimics as GlfT2 inhibitors.

With regard to charged-sulfur-containing compounds as GlfT2 inhibitors, a series of galactofuranosyl N,N’-dialkyl sulfenamides and sulfonamides was synthesized. Here, it was found that compounds with shorter N,N’-dihexyl chains showed lower inhibition activity than compounds with longer dioctyl and didecyl chains34.

The aforementioned potential GlfT2 inhibitors show low inhibition activity against the enzyme. As such, it is timely to look for another strategy for designing new GlfT2 inhibitors. In silico studies such as molecular docking and 3D-QSAR are now possible with the availability of the GlfT2 crystal structure (PDB ID: 4FIX)20. Molecular dynamics (MD) simulations were performed to obtain an ensemble of protein structures to be used for molecular docking studies. Also, 3D-QSAR was used as a guide in designing new GlfT2 inhibitors. Moreover, the top hit compounds were screened using in silico ADMETox.

Results and Discussion

Ensemble docking

A set of 100 different protein conformations was obtained from the entire 100 ns simulation using the clustering analysis of the cpptraj35 module of the AmberTools 15 package36. Previously synthesized and functionalized compounds were prepared using the MarvinSketch software37. Subsequently, molecular docking was performed using AutoDock Vina38,39,40. Here, the inhibition constant (Ki) was obtained from the binding energy (ΔG) using the formula Ki = exp(ΔG/RT), where R is the universal gas constant (1.985 × 10−3 kcal mol−1 K−1) and T is the temperature (298.15 K). The previously synthesized and functionalized compounds were docked to obtain the binding energy of the complexes formed between the receptor and the ligands. The natural acceptor substrate was found to have a binding energy of −6.63 ± 0.02 kcal mol−1, with Y236, D256, W309, K369, D372, W399, and Q409 as the key interacting amino acids. The binding energy of the natural donor substrate will be used as the reference value here.
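The conversion from binding energy to inhibition constant can be illustrated with a short Python sketch. It simply evaluates Ki = exp(ΔG/RT) with the values of R and T quoted above; the function name and the assumption that Ki is expressed in mol/L (1 M standard state) are ours and are not stated in the text.

```python
import math

# Constants as quoted in the text (R in kcal mol^-1 K^-1, T in K).
R_KCAL = 1.985e-3
T_KELVIN = 298.15


def inhibition_constant(delta_g_kcal, temperature=T_KELVIN, gas_constant=R_KCAL):
    """Return Ki = exp(dG / RT) for a binding energy given in kcal/mol.

    The unit of Ki (mol/L, 1 M standard state) is an assumption of this sketch.
    """
    return math.exp(delta_g_kcal / (gas_constant * temperature))


# Binding energy of -6.63 kcal/mol quoted in the text for the natural substrate.
ki_reference = inhibition_constant(-6.63)
print(f"Ki(reference) = {ki_reference:.3e}")
```

Because ΔG appears in the exponent, a more negative binding energy always yields a smaller Ki, so ranking compounds by ΔG or by Ki gives the same order.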

First line Anti-TB drug

Anti-TB drugs are categorized according to their target. One class is the first-line anti-TB drugs (Fig. 4), which inhibit the synthesis of the bacterial cell wall that is essential for pathogenicity41 and virulence42. The first-line anti-TB drug compounds were subjected to molecular docking studies. The binding energy values indicate that these compounds bind weakly to the active site. The majority of the first-line anti-TB drugs, such as ethambutol, isoniazid and pyrazinamide, were found to have no interaction with the key GlfT2 active site amino acids (Table 2). Overall, the weak binding affinities of these compounds originate from weak or absent interactions with the amino acids within the active site (Table 1).

Figure 4 Structures of the first-line Anti-TB drugs.

Table 1 Binding affinities and Inhibition constant (T = 298.15 K) of first-line Anti-TB drugs.
Table 2 Interacting amino acids with First-line anti-TB drugs.

Current GlfT2 inhibitors in the literature

Recently, one of the strategies used to design GlfT2 inhibitors has been to mimic the identified donor, acceptor, and transition-state substrates43,44,45.

Mimicking GlfT2’s donor substrate as GlfT2 inhibitor

Compounds with a structure similar to GlfT2’s donor substrate could be used as potential GlfT2 inhibitors. In a previous study, donor substrate mimics such as compounds #49–52 (Fig. 5) were examined using a spectrophotometric assay with GlfT1 and GlfT2. It was found that, except for compound #50, these compounds are GlfT2 inhibitors16. Docking results showed that the binding energy values of compounds #50 and #52 (Fig. 5) were relatively lower than the reference value (Table 11). On the other hand, compounds #49 and #51 (Fig. 5) exhibited binding energy values of −6.79 ± 0.03 and −6.72 ± 0.10 kcal mol−1, respectively, which are more negative than the reference value (Table 11). Furthermore, compounds #49, #51, and #52 interact with Y236, W348, P167, G232, and D371 (Fig. 6 and Table 12). Intriguingly, compound #50 has lost its hydrophobic interaction with D256, I368, and T168. In contrast, these amino acids were found to interact with compounds #49, #51 and #52 (Table 12). The absence of these interactions for compound #50 could be the reason for its weak binding affinity with GlfT2, as described previously16.

Figure 5 Structures of UDP-furanoses as GlfT2 inhibitors.

Figure 6 Three-dimensional plot of the interaction of compound #49 with GlfT2’s active site.

As discussed previously, compounds #15–19 were found to be UGM inhibitors and GlfT2 donor-substrate mimics (Fig. 4). Thus, all these compounds were subjected to molecular docking studies to probe whether the UGM inhibitors could also act as GlfT2 inhibitors. Results showed that the binding energy values of compounds #15, #16 and #19 were relatively higher than the reference value (Table 3). Compound #17 was found to have a binding energy value of −6.36 ± 0.01 kcal mol−1, which is less negative than the reference value. Compound #18, the analog of compound #17, was found to have a binding energy value of −5.02 ± 0.01 kcal mol−1, considerably less negative than that of compound #17. The absence of a UDP moiety in compound #18, which is present in compound #17, could be the reason for its low binding affinity. The linker seems to increase the interaction of the ligand with the amino acid residues within the active site through hydrogen bonding. Interactions among the linker’s phosphate group, Y236, and G232 were also observed with compounds #15–17 and #19. The presence of this kind of interaction could be the reason for the stabilization of these compounds within the active site (Fig. 7 and Table 4).

Table 3 Binding affinities and Inhibition constant (T = 298.15 K) of Imino-Sugars as GlfT2 inhibitors.
Figure 7 Three-dimensional plot of the interaction of compound #16 with GlfT2’s active site.

Table 4 Interacting amino acids with Imino-Sugars as GlfT2 inhibitors.

Mimicking GlfT2’s acceptor substrate as GlfT2 inhibitor

Compounds #20 (galactofuranosyl N,N’-didecyl sulfenamide), #21 (galactofuranosyl dioctyl thioglycoside), and #22 (the sulfone derivative of compound #21, obtained via oxidation of the sulfur) (Fig. 8) were screened for their inhibition effect using a disk susceptibility test assay46. Results revealed that compound #20 exhibited an inhibitory effect comparable to that of the compounds with shorter dialkyl chains. On the other hand, compounds #21 and #22 were found to have an inhibitory effect on GlfT2 at a concentration of less than 5 μM46.

Figure 8 Structures of Sulfenamide and Sulfonamides as GlfT2 inhibitors.

These compounds were subjected to molecular docking studies and showed lower binding energy values compared with the reference value (Table 5). Among these three, compound #22 (Fig. 8) has the highest binding energy, owing to the additional oxygen atoms of the sulfone, which were observed to interact with D371 and D372 (Fig. 9). The dialkyl chains were observed to have hydrophobic interactions with R171, Y236, D371, D372, G232, W347, W348, and D256 (Table 6).

Table 5 Binding affinities and Inhibition constant (T = 298.15 K) of Sulfenamide and Sulfonamides as GlfT2 inhibitors.
Figure 9 Three-dimensional plot of the interaction of compound #22 with GlfT2’s active site.

Table 6 Interacting amino acids with Sulfenamide and Sulfonamides as GlfT2 inhibitors.

Compounds #53–56 (Fig. 2) were synthesized to evaluate GlfT’s specificity16. Using kinetic characterization of GlfT with these compounds, it was found that trisaccharides were better substrates than disaccharides21. Docking studies were performed on these compounds, and the results showed that the trisaccharides, compounds #55 and #56, have more negative binding energy values (−6.24 ± 0.04 and −6.19 ± 0.04 kcal mol−1, respectively) than the disaccharides, compounds #53 and #54 (−6.16 ± 0.03 and −6.11 ± 0.03 kcal mol−1, respectively) (Table 13). The additional sugar moiety of a trisaccharide, compared with a disaccharide, extends the long alkyl chain and allows a hydrophobic interaction with W408 (Fig. 10 and Table 14). This could be the origin of the enhanced interaction observed in both the experimental and docking studies.

Figure 10 Three-dimensional plot of the interaction of compound #55 with GlfT2’s active site.

UDP-Galactopyranose mutase (UGM) substrate as GlfT2 inhibitor

Another strategy used to inhibit GlfT2 is to repurpose compounds that were proposed as inhibitors of other enzymes involved in galactan chain synthesis. One of these enzymes is UDP-galactopyranose mutase (UGM). Galactofuranose is absent in mammals33; UGM catalyzes the conversion of UDP-galactopyranose to UDP-galactofuranose, which GlfT2 eventually uses as its donor substrate26,47. Here, some of the proposed UGM inhibitors were subjected to molecular docking studies with GlfT2.

Compound #14 (Fig. 1) was reported to be a weak inhibitor of UDP-Galp mutase47. In the molecular docking investigation, its binding energy (−4.18 ± 0.01 kcal mol−1) was found to be much less negative than the reference value (Table 3). Furthermore, among the key GlfT2 active site amino acids discussed previously, only Y236, D256 and W348 were observed to interact with compound #14. The smaller number of amino acids interacting with compound #14 is one of the reasons for its poor binding affinity.

A recent study described the development of a microtiter plate-based assay to screen uridine-based compounds against UGM48. Among the compounds in the uridine-based library used in the assay, only compound #27 (Fig. 12) was found to be a weak inhibitor (IC50 = 6.0 μM). Its binding energy value (−8.32 ± 0.01 kcal mol−1) was observed to be more negative than the reference value (Table 9). The interactions of the compound with P167, G232, W347, I368, D371, T168, W408, M285 and H296 may account for its improved binding affinity (Fig. 13).

Figure 11 Three-dimensional plot of the interaction of compound #25 with GlfT2’s active site.

Figure 12 Structures of UDP-Galactopyranose mutase (UGM) inhibitors as GlfT2 inhibitors.

Table 7 Binding affinities and Inhibition constant (T = 298.15 K) of Transition State Mimics as GlfT2 inhibitors.
Table 8 Interacting amino acids with Transition State Mimics as GlfT2 inhibitors.
Table 9 Binding affinities and Inhibition constant (T = 298.15 K) of Synthesized Halogenated GlfT2 inhibitors.
Figure 13 Three-dimensional plot of the interaction of compound #27 with GlfT2’s active site.

Aside from being a known antibacterial agent for urinary tract infection treatment, compound #28 (Fig. 12) was further used as a UGM inhibitor and found to moderately inhibit the enzyme49. Docking this with GlfT2 resulted in a lower binding energy compared with the reference value (Table 9). The absence of the interaction of this compound with W347, I368 and T168 may account for this lower binding affinity compared with compound #27.

Compounds #30 and #31 showed promising activity towards UGM (IC50 = 1.6 μM)50,51. These compounds differ in the type and number of halogens they contain. Compound #30 has only a bromine atom, whereas compound #31 has two chlorine atoms and one iodine atom in the 5-arylidene-2-thioxo-4-thiazolidinone (ATT) core. Results of the molecular docking studies showed that compound #31 has a lower binding energy compared with compound #30 (Table 9). The observed interactions of Q200 with a chlorine atom of compound #31 and of K369 with its iodine atom may account for the difference in the binding energy values between the two compounds (Table 10).

Table 10 Interacting amino acids with synthesized halogenated compounds as GlfT2 inhibitors.
Table 11 Binding affinities and Inhibition constant (T = 298.15 K) of Synthetic UDP-furanoses.

Compound #29 (Fig. 12) showed no or poor inhibitory activity against UGM51. Molecular docking studies found that it registered a binding energy of only −3.96 ± 0.01 kcal mol−1, which is much less negative than the reference value. The sugar-derived moiety may have occupied only the sugar binding region of the active site, without interacting with the residues within the UDP binding region. This suggests that the residues interacting with UDP contribute to the binding of UDP-Galf, and that the absence of UDP, or of any moiety that could mimic it, may account for the compound’s low binding affinity.

Among the reported synthesized GlfT2 inhibitors included in this study, compounds #27 and #31 (Fig. 12) were found to have binding energies of −8.32 ± 0.01 kcal mol−1 and −8.08 ± 0.01 kcal mol−1, respectively (Tables 9 and 10). Their stronger binding compared with GlfT2’s natural substrate seems to originate from the interaction of W347, W348, and W408 (the latter only for compound #27) with the aromatic rings present in the compounds. The tryptophan residues within the active site could stabilize the inhibitors via π-π interactions with the aromatic rings.

Mimicking GlfT2’s transition-state substrate as GlfT2 inhibitor

Aside from mimicking the GlfT2 donor and acceptor substrates, mimicking the GlfT2 transition-state substrate (positively charged moiety) is another interesting strategy. Since both α-glycosidase and galactofuranosyltransferase carry out glycosyl transfer reactions, it was proposed that α-glycosidase inhibitors could also be potential GlfT2 inhibitors29. Accordingly, molecular docking studies using compounds #23–25 (Fig. 3)31 generated binding energies of −4.99 ± 0.01, −5.31 ± 0.01, and −5.60 ± 0.01 kcal mol−1, respectively (Table 7).

The positively charged moieties of these compounds were found to interact with D371 and the other interacting amino acids shown in Fig. 11 and Table 8. Still, the binding energy values were less negative than the reference value (Table 7). The structural motif and binding energy values of compounds #23–25 were comparable with those of compound #18. These compounds have only a sugar moiety and a short linker (Figs. 1 and 3). As previously discussed, the absence of a moiety that could interact with the amino acids of the UDP binding region could account for the low binding affinity of a molecule within GlfT2’s active site.

Newly Designed Sugar Furanosides as GlfT2 Inhibitors

In silico drug design attracts attention among researchers because it saves time and cost. Computer-aided drug design uses computational tools to discover, develop, and analyze drugs52. One technique used for drug design is ligand-based computer-aided drug design, which starts from ligands that are known to interact with the target receptor and accounts for the binding strength of a given molecule from the nature of its interactions53.

Functionalizing a drug with an azide group has been widely used in the pharmaceutical industry54,55. The azide group was recognized as a novel pharmacophore in medicinal chemistry, especially with the emergence of zidovudine, an anti-retroviral drug for the treatment of Acquired Immuno-Deficiency Syndrome (AIDS). The azide group has also been used in tumor-labelling56 for cancer treatment. It is noteworthy that azido-substituted drugs have high affinity towards the target receptor57. It was found that tetrahydroimidazobenzodiazepinthiones (TIBO) or thiourea derivatives are potential drugs for treating TB58,59. Guanidine derivative drugs also play a promising role in medicinal chemistry because of their anticancer60,61, antiviral62, and antibacterial63 properties. Streptomycin, a known anti-TB drug, has two guanidino groups. The aforementioned azido, thiourea and guanidino functional groups were used here in designing new GlfT2 inhibitors because of their antibacterial property and high receptor affinity.

The design was an analog of the known GlfT2 substrate disaccharide, octyl β-D-galactofuranosyl-(1→5)-β-L-arabinofuranoside. It was previously found that substrates with longer-chain aglycons were better substrates of glycosyltransferases in Mycobacterium species64. Compounds #2 and #3 (Fig. 14) are both trans-2-tridecen-1-yl glycosides modified at the non-reducing end, where the 6-OH position was replaced with azido and thiourea functional groups, respectively. Compound #1, on the other hand, was obtained via oxidation of the double bond in the aglycon of compound #3. Compound #3 has a binding energy value of −10.32 ± 0.02 kcal mol−1, which is less negative than that of compound #2 (−11.08 ± 0.02 kcal mol−1). It is proposed that the azido group confers a higher binding affinity than the thiourea group, as reflected in the binding energies presented.

Figure 14 Structures of Newly Designed Acceptor Substrates as GlfT2 inhibitors.

From the binding energies of compounds #1 and #3, it can be observed that compound #1 has a more negative binding energy (−14.67 ± 0.04 kcal mol−1) than compound #3 (−10.32 ± 0.02 kcal mol−1) (Table 15). Interactions were observed among D256, Y236, and the thiourea group, along with another set of interactions among D371, D372, the hydroxyl groups of the disaccharide, and the aglycon of compound #1 (Table 16). The addition of two hydroxyl groups on the aglycon effectively increased the enzyme-substrate interaction within the active site.

Table 12 Interacting amino acids with Synthetic UDP-furanoses as GlfT2 inhibitors.
Table 13 Binding affinities and Inhibition constant (T = 298.15 K) of Synthetic Acceptor Substrates.
Table 14 Interacting amino acids with Synthetic Acceptor Substrates as GlfT2 inhibitors.
Table 15 Binding affinities and Inhibition constant (T = 298.15 K) of newly designed GlfT2 inhibitors.
Table 16 Interacting amino acids with Newly Designed Acceptor Substrates as GlfT2 inhibitors.

Compound #4 (Fig. 14), a glyceryl glycoside, is modified at the non-reducing end, where the 5-OH and 6-OH positions were replaced with azido and guanidino functional groups, respectively. As observed, D256 and I368 interact with the hydroxyl groups of the glyceryl aglycon (Table 16). Y344, P167 and D258 were observed to interact with the guanidino group, and D372 was observed to interact with the azide group (Fig. 15 and Table 16). These residues interact with the inhibitor through hydrogen bonding and seem to be the origin of the observed high binding affinity. Among the newly designed sugar-based inhibitors, compound #4 is the most promising, with a binding energy of −19.23 ± 0.05 kcal mol−1.

Figure 15 Three-dimensional plot of the interaction of compound #4 with GlfT2’s active site.

Redesigned Sugar Furanosides as GlfT2 Inhibitors using 3D-QSAR

Structures of the synthesized GlfT2 inhibitors were subjected to 3D-QSAR to improve the structural motif of the inhibitors. 3D-QSAR is an important tool for providing substantial information about the molecular attributes essential for the biological activity of compounds65,66.

Results showed a Pearson coefficient of R2 = 0.99, which supports the reliability of the test and training sets used. The structures of the presented synthesized GlfT2 inhibitors were aligned along their respective molecular field points. Figure 20 shows negative steric field points (green field), which indicate that steric groups should be avoided on that particular part of the molecule, whereas positive steric field points (yellow field) indicate that steric groups should be added there. Moreover, negative electrostatic field points indicate that an electrostatic contributor, i.e. a negatively charged group or hydrogen-bond acceptor, should be added on that particular part of the molecule, whereas positive electrostatic field points indicate that a positively charged group or hydrogen-bond donor should be added.

From this, insights for designing the top 18 hit compounds were acquired. Guided by 3D-QSAR, the flexible long-chain aglycon was replaced with cholesterol (compound #58), tocopherol (compound #62), retinol (compound #61), cholesterol derivatives, i.e. calciferol, calcitriol (compound #63) and cholecalciferol (compounds #59, #60 and #64), and cholecalciferol derivatives (compounds #66–75), which are more rigid and sterically hindered.

For compounds #58 and #59 (Figs. 16 and 17), the 5-OH and 6-OH positions of the non-reducing end of the substrates were functionalized with azido and guanidino groups, respectively. For compounds #66–75 (Figs. 16 and 17), the 5-OH and 6-OH positions of the non-reducing end were functionalized with amine and methyl groups. For compounds #60–63 and #65 (Figs. 16 and 17), only the 6-OH position of the non-reducing end was functionalized, with a methyl group. Lastly, for compound #64 (Fig. 16), the 6-OH position of the non-reducing end was functionalized with a thiourea group.

Figure 16 Structures of the 3D-QSAR designed compounds.

Figure 17 3D-QSAR Designed Possible Inhibitors.

These 18 newly designed 3D-QSAR structures, compounds #58–75 (Figs. 16 and 17), were subjected to ensemble docking to reassess the binding affinity of the compounds. The compounds registered binding energy values of approximately −8.00 kcal mol−1, compared with approximately −6.00 kcal mol−1 before the redesign (Tables 17 and 19).

Table 17 Binding affinities and Inhibition constant (T = 298.15 K) of 3D-QSAR designed compounds.
Table 18 Interacting amino acids with 3D-QSAR designed compounds.
Table 19 Binding affinities and Inhibition constant (T = 298.15 K) of 3D-QSAR designed compounds.

Results show that modifying the aglycon, rather than the sugar moiety, led to a significant improvement in the binding energy of the designed compounds.

Additional amino acids were found to be interacting with these compounds as shown in Tables 18 and 20. It can be seen that compounds #63–65 were found to be interacting with both M286 and K402 through hydrogen bonding (Figs. 18 and 19). These interactions were also observed between compound #59 and K402, and between compounds #68 and #75 and M286 (Figs. 18 and 19).

Table 20 Interacting amino acids with 3D-QSAR designed compounds.
Figure 18 Three-dimensional plot of the interaction of compound #60 with GlfT2’s active site.

Figure 19 Three-dimensional plot of the interaction of compound #74 with GlfT2’s active site.

Structure-activity relationship representation

Various designs of the possible GlfT2 inhibitors are summarized in Fig. 21 using a Structure-Activity Relationship (SAR) representation. This shows the effect on activity of the different R-groups added to the pharmacophore of the QSAR-based, donor substrate-based and acceptor substrate-based compounds. To design possible GlfT2 inhibitors, the compound should have at least one sugar moiety (D-Galf), provided that the 5-OH and/or 6-OH positions have R-groups that could disrupt the compound’s hydrogen-bond interaction with D372 (Fig. 18). In addition, the presence of a UDP or UDP-like moiety (long alkyl chain or steroidal aglycon) could possibly increase the binding affinity of the compound.

Figure 20 3D-QSAR model for the synthesized GlfT2 compounds (R2 = 0.99). Yellow isosurface represents the positive steric field, green isosurface represents the negative steric field.

Figure 21 Structure-Activity Relationship (SAR) representation of the QSAR-based design, donor substrate-based design and acceptor substrate-based design compounds.

ADMETox evaluation of the best candidates

The ADME prediction part of this study was carried out using an online server, SwissADME67, which gives values for lipophilicity, water solubility, drug-likeness, and medicinal chemistry parameters (i.e. lead-likeness, and PAINS and Brenk alerts). The in silico toxicity evaluation was also carried out using an online server, ProTox-II68, which gives predicted oral toxicity values as well as predicted cytotoxicity, mutagenicity, carcinogenicity, hepatotoxicity, and immunotoxicity. In addition, ProTox-II indicates whether the compounds being analyzed are likely to bind to proteins known to produce adverse drug reactions.

Drug-likeness, bioavailability, synthetic accessibility and alerts for PAINS and Brenk filters

Drug-likeness is a quantitative parameter that measures a compound’s oral bioavailability. The Abbott bioavailability score predicts the chance of a compound having at least 10% oral bioavailability in rat or measurable permeability in a Caco-2 cell line experiment. This permeability experiment uses Caco-2 cells as a model for human intestinal absorption of drugs69. The parameters considered to measure the score are lipophilicity (−0.7 < XLOGP3 < 5.0), molecular weight (MW) (150 g mol−1 < MW < 500 g mol−1), polarity (20 Å2 < TPSA < 130 Å2), solubility (0 < log S (ESOL) < 6), saturation (0.25 < Fraction Csp3 < 1) and flexibility (0 < number of rotatable bonds < 9). This semi-quantitative rule-based score assigns compounds to four probability score classes (11%, 17%, 55% and 85%)69,70. The acceptable probability score is 55%, which indicates that the compound passed the rule of five. Among the top hits, compounds #31, #60, #61, #62, #66, #67, #71, #72, #73, #74 and #75 showed a score of 55%, indicating good bioavailability.
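As an illustration of how the parameter ranges listed above can be applied, the following Python sketch checks a set of descriptors against those open intervals and reports which ones fall outside. It is only a range check, not a reimplementation of the bioavailability score or of SwissADME; the dictionary keys, the function name and the example descriptor values are hypothetical and do not correspond to any compound in this study.

```python
# Ranges as quoted in the text; the keys are hypothetical labels chosen for this sketch.
PARAMETER_RANGES = {
    "xlogp3": (-0.7, 5.0),          # lipophilicity
    "mw": (150.0, 500.0),           # molecular weight, g/mol
    "tpsa": (20.0, 130.0),          # polarity, square angstroms
    "log_s_esol": (0.0, 6.0),       # solubility, written as in the text
    "fraction_csp3": (0.25, 1.0),   # saturation
    "rotatable_bonds": (0, 9),      # flexibility
}


def out_of_range(descriptors):
    """Return the names of descriptors lying outside the open intervals above."""
    failures = []
    for name, (low, high) in PARAMETER_RANGES.items():
        if not (low < descriptors[name] < high):
            failures.append(name)
    return failures


# Made-up descriptor values, for illustration only.
example = {
    "xlogp3": 2.1,
    "mw": 420.5,
    "tpsa": 95.0,
    "log_s_esol": 3.2,
    "fraction_csp3": 0.6,
    "rotatable_bonds": 11,
}
print(out_of_range(example))
```

Returning the list of failing descriptors, rather than a single pass/fail flag, makes it easier to see which property would need to be adjusted in a redesign.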

The PAINS (pan-assay interference compounds) and Brenk71 filters are used to identify potentially problematic molecular fragments that could give false-positive biological activity output65,69. The PAINS and Brenk screening flagged compounds containing the following functional groups: (1) imine and azo fragments, i.e. compounds #1, #2, #4, #27, #58, and #59, (2) isolated alkene fragments, i.e. compounds #31, #58, and #62, (3) thiocarbonyl fragments, i.e. compounds #3 and #64, and (4) a polyene fragment, i.e. compound #61 (Table 21). The remaining compounds showed no problematic chemical fragments.

Table 21 Drug-likeness parameter values for the top hit compounds.

The lead-likeness of a compound is predicted using parameters such as MW (250 g mol−1 ≤ MW ≤ 350 g mol−1), octanol/water partition coefficient (XLOGP ≤ 3.5) and number of rotatable bonds (# rotatable bonds ≤ 7). Results showed that none of the top hit compounds falls within the set criteria. Synthetic accessibility was assessed to quantify the complexity of the molecular structures. The scores for the compounds were in the range of 3.92–8.96 (Table 21), which suggests that the compounds have complex synthetic routes.
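The lead-likeness criteria above amount to a simple rule-based filter. The following is a minimal sketch in Python, assuming the three descriptors have already been computed elsewhere (for example, exported from SwissADME); the `Descriptors` class, the function name, and the example values are hypothetical and are not taken from Table 21.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    """Hypothetical container for precomputed descriptors of one compound."""
    mw: float        # molecular weight in g/mol
    xlogp: float     # XLOGP octanol/water partition coefficient
    rot_bonds: int   # number of rotatable bonds


def leadlikeness_violations(d: Descriptors) -> List[str]:
    """Return the lead-likeness criteria (as stated in the text) that are violated."""
    violations = []
    if not (250.0 <= d.mw <= 350.0):   # 250 <= MW <= 350 g/mol
        violations.append("MW")
    if d.xlogp > 3.5:                  # XLOGP <= 3.5
        violations.append("XLOGP")
    if d.rot_bonds > 7:                # rotatable bonds <= 7
        violations.append("rotatable bonds")
    return violations


# Illustrative values only; these are not compounds from this study.
print(leadlikeness_violations(Descriptors(mw=300.0, xlogp=2.0, rot_bonds=5)))
print(leadlikeness_violations(Descriptors(mw=512.4, xlogp=4.1, rot_bonds=9)))
```

A compound is lead-like under these criteria only when the returned list is empty; returning the violated criteria rather than a single boolean makes it easier to see why a compound fails.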

Absorption, distribution, metabolism and excretion properties evaluation of the top hit compounds

Solubility is one of the major properties influencing absorption. A compound’s solubility in both aqueous and non-aqueous media is important throughout the drug development process, up to oral intake67. Lipophilicity is the effective solubility of a compound in a non-aqueous medium and is correlated with various models of drug properties such as absorption, distribution, metabolism and toxicity70. Five available predictive models, i.e. iLOGP (implicit log Po/w), XLOGP3 (enhanced atomic/hybrid log Po/w 3), WLOGP (Wildman and Crippen log Po/w), MLOGP (quantitative-structure log Po/w) and SILICOS-IT, were used to evaluate the lipophilicity of the compounds. The mean of the lipophilicity values predicted by these methods is termed the consensus log Po/w. A molecule is more soluble if its consensus log Po/w value is more negative67. Results showed that compounds #4, #27, and #65 were soluble in non-aqueous medium (Table 22).
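Since the consensus log Po/w is described as the mean of the five model predictions, it can be reproduced with a few lines of Python. This is a minimal sketch assuming an unweighted arithmetic mean; the function name and the example values are hypothetical and are not taken from Table 22.

```python
from statistics import mean
from typing import Mapping

# The five lipophilicity models named in the text.
LOGP_MODELS = ("iLOGP", "XLOGP3", "WLOGP", "MLOGP", "SILICOS-IT")


def consensus_log_p(predictions: Mapping[str, float]) -> float:
    """Arithmetic mean of the five log Po/w predictions for one compound."""
    missing = [m for m in LOGP_MODELS if m not in predictions]
    if missing:
        raise ValueError("missing predictions for: " + ", ".join(missing))
    return mean(predictions[m] for m in LOGP_MODELS)


# Illustrative values only; these are not predictions from this study.
example = {"iLOGP": 1.0, "XLOGP3": 2.0, "WLOGP": 3.0, "MLOGP": 4.0, "SILICOS-IT": 5.0}
print(consensus_log_p(example))
```

Requiring all five predictions, rather than averaging whatever is available, keeps the consensus values comparable across compounds.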

Table 22 Predicted absorption parameters in ADME evaluation of top hit compounds.

Some drugs have to be highly water soluble to deliver a sufficient amount of the active ingredient. Three models were used by SwissADME to predict water solubility, i.e. ESOL (Estimated SOLubility), Ali and SILICOS-IT (SwissADME in-house solubility predictor). Solubility is estimated qualitatively according to the following log S scale: insoluble < −10 < poorly soluble < −6 < moderately soluble < −4 < soluble < −2 < very soluble < 0 < highly soluble67. Based on these predictive models, only compound #65 is predicted to be soluble. Compounds #1, #3, and #4 are predicted to be water soluble, while compounds #27, #61, #63, #73 and #75 are predicted to be moderately water soluble. The remaining top hit compounds are predicted to be water insoluble (Table 22).
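The qualitative log S classes can be expressed as a threshold lookup. The sketch below assumes the class boundaries as documented by SwissADME (insoluble < −10 < poorly soluble < −6 < moderately soluble < −4 < soluble < −2 < very soluble < 0 < highly soluble); how a value lying exactly on a boundary is assigned is an assumption made here, and the function name is hypothetical.

```python
def solubility_class(log_s: float) -> str:
    """Map a predicted log S value to a qualitative solubility class.

    Boundaries follow the SwissADME log S scale; assigning a boundary
    value to the more soluble class is an assumption of this sketch.
    """
    if log_s < -10:
        return "insoluble"
    if log_s < -6:
        return "poorly soluble"
    if log_s < -4:
        return "moderately soluble"
    if log_s < -2:
        return "soluble"
    if log_s < 0:
        return "very soluble"
    return "highly soluble"


# Illustrative values only; these are not predictions from this study.
for value in (-11.2, -5.0, -3.1, 0.4):
    print(value, solubility_class(value))
```

Because ESOL, Ali and SILICOS-IT can place the same compound in different classes, one option is to classify each model's prediction separately and report the classes side by side.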

As the drug is absorbed by the system, it encounters diverse membrane barriers such as the hepatocyte membrane, gastrointestinal epithelial cells, the blood capillary wall, the glomerulus, restrictive organ barriers (e.g. the blood-brain barrier), and the target cell70. A molecule is said to be less skin permeant if the value of log Kp is more negative67,72. From the predicted results, compounds #4, #27 and #65 are found to be the least skin permeant (Table 23). The absorption and distribution of these drugs were also assessed through human intestinal absorption (HIA), or gastrointestinal (GI) absorption, data. These data show that compounds #31, #70, #71, #72, and #73 are predicted to be both well absorbed and non-brain penetrant (Table 23). None of the top hit compounds was predicted to be blood-brain barrier (BBB) permeant. This suggests that the compounds proposed here are relatively large and cannot pass the blood-brain barrier. A compound that is not BBB permeant is also less likely to give rise to harmful toxicants in the brain and bloodstream when metabolized. The remaining compounds were predicted to be neither well absorbed nor brain penetrant.

Table 23 Predicted distribution parameters in ADME evaluation of top hit compounds.

After being distributed through the organism’s system, the drugs are metabolized and eventually excreted safely. Metabolism plays an important role in the bioavailability of drugs as well as in drug-drug interactions. It is also important to understand whether a certain compound is a substrate or non-substrate of the permeability glycoprotein (P-gp). This protein belongs to the ATP-binding cassette transporters, which are important in assessing active efflux through biological membranes. It is also essential to know how molecules interact with cytochrome P450 (CYP) enzymes, as these are involved in drug elimination through metabolic transformation73. It has been suggested that CYP and P-gp can process small molecules synergistically to enhance the protection of tissues and organisms74. Inhibition of these isoenzymes may result in pharmacokinetics-related drug-drug interactions that could lead to unwanted adverse side effects by lowering the solubility and the accumulation of the drug or its metabolites. To better understand the mechanism of drug disposition, efficacy and toxicity, the top hit compounds were evaluated to determine whether each compound can act as a substrate or an inhibitor of P-gp and CYPs. All compounds are found to be substrates of P-gp except for compounds #1, #2, #4, #61 and #69. Moreover, the top hit compounds were found to be substrates of CYP1A2, CYP2C19 and CYP2D6. All compounds are predicted to be CYP2C9 substrates except compounds #31 and #73, whereas for CYP3A4, compounds #27, #58, #59, #63, #64, #65, #68, and #72 were found to be potential substrates (Table 24).

Table 24 Predicted metabolism parameters in ADME evaluation of top hit compounds.

In silico toxicity evaluation of top hit compounds

Investigating the ADMET properties of a compound is a critical step in drug development. If a drug passes this step, subsequent toxicity tests are warranted. However, toxicity tests are time consuming and expensive, especially if there is a significant number of candidate compounds75,76. To keep up with increasing demand from the pharmaceutical industry, in silico toxicity evaluation is used initially as a fast and inexpensive way to estimate a compound’s toxicity and reduce the number of compounds sent for further testing. In silico toxicity evaluation cannot serve as a definitive answer on a compound’s toxicity75. Thus, it should always be accompanied by in vitro and in vivo experiments to verify biological activities beyond the capability of these computational approaches.

Here, the top hit compounds were subjected to an in silico toxicity evaluation using ProTox-II. The LD50 is defined as the median lethal dose, i.e. the dose of a compound at which 50% of the test subjects die upon exposure to it. The toxicity class ranges from 1 to 6, with 1 being fatal if ingested and 6 being non-toxic77. The results showed that the top hit compounds #3, #4, #63, #65, #66, #67, and #73 were predicted to be orally toxic (toxicity classes 1 to 3) (Table 25).
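The relationship between a predicted LD50 and its toxicity class can be sketched as a threshold lookup. The thresholds below are an assumption: they follow the GHS-based oral toxicity classes as documented for ProTox-II (LD50 in mg/kg: class 1 ≤ 5, class 2 ≤ 50, class 3 ≤ 300, class 4 ≤ 2000, class 5 ≤ 5000, class 6 > 5000) and are not stated in this text. The function names are hypothetical.

```python
import bisect

# Assumed upper LD50 bounds (mg/kg) for toxicity classes 1 to 5;
# any LD50 above the last bound falls in class 6.
_CLASS_UPPER_BOUNDS = [5, 50, 300, 2000, 5000]


def toxicity_class(ld50_mg_per_kg: float) -> int:
    """Return the oral toxicity class (1 to 6) for a predicted LD50."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    # bisect_left keeps a value equal to a bound in the lower (more toxic) class.
    return bisect.bisect_left(_CLASS_UPPER_BOUNDS, ld50_mg_per_kg) + 1


def is_orally_toxic(ld50_mg_per_kg: float) -> bool:
    """Classes 1 to 3 are treated as orally toxic, as in the text."""
    return toxicity_class(ld50_mg_per_kg) <= 3


# Illustrative values only; these are not predictions from this study.
for ld50 in (4, 250, 1500, 6000):
    print(ld50, toxicity_class(ld50), is_orally_toxic(ld50))
```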

Table 25 Predicted LD50 and Toxicity class of the top hit compounds.

The ProTox-II online server68 also predicts four toxicological endpoints: cytotoxicity, mutagenicity, carcinogenicity, and immunotoxicity. Results suggested that all the top hit compounds were predicted to be immunotoxic except for compound #31 (Table 26). Immunotoxic chemicals are known to alter the correct functioning of the immune system by B cell growth inhibition68,77. Organ toxicity, specifically hepatotoxicity, was also predicted to evaluate whether a compound would cause liver dysfunction68,77. Results showed that the top hit compounds were predicted to be non-hepatotoxic. Moreover, compound #4 was predicted to be mutagenic (Table 26), meaning that it could alter genetic material, such as the DNA of an organism.

Table 26 Predicted activity of the top hit compounds on toxicity endpoints.

Lastly, the toxicity of the compounds depends on different metabolic mechanisms. Several enzymes could either metabolize the drug therapeutically or lead to the formation of toxic metabolites. Below are the possible targets, defined according to Novartis, that are linked with adverse drug reactions: Adenosine A2A receptor (AA2AR), Adrenergic beta 2 receptor (ADRB2), Androgen receptor (ANDR), Amine oxidase (AOFA), Dopamine D3 receptor (DRD3), Estrogen receptor 1 (ESR1) and 2 (ESR2), Glucocorticoid receptor (GCR), Histamine H1 receptor (HRH1), Nuclear receptor subfamily 1 group I member 2 (NR1I2), Opioid receptor κ (OPRK), Opioid receptor μ (OPRM), cAMP-specific 3′,5′-cyclic phosphodiesterase 4D (PDE4D), Prostaglandin G/H synthase 1 (PGH1), and Progesterone receptor (PRGR)78. The results showed that the top hit compounds are non-binders of these proteins, except for compounds #4 and #65, which were predicted to be binders of Prostaglandin G/H synthase 1 (Table 27).

Table 27 Predicted activity of the top hit compounds towards the panel of protein toxicity targets.

Conclusion

Tuberculosis is still a worldwide health problem due to the emergence of strains of M. tuberculosis that are resistant to existing anti-TB drugs. There is now a growing interest in targeting GlfT2, the enzyme responsible for the growth of the galactan chain, an important part of the cell wall. To obtain insights into the different interactions of the synthesized compounds with GlfT2, we performed ensemble molecular docking studies, and the binding energy values of the synthesized compounds ranged from −3.00 to −6.00 kcal mol−1. Two compounds, #27 and #31, registered binding energy values of −8.32 ± 0.01 and −8.08 ± 0.01 kcal mol−1, respectively. These compounds were synthesized as UGM inhibitors and could possibly inhibit GlfT2. Compounds #1–4 are analogs of a known substrate disaccharide modified at the 6-OH and 5-OH positions of the non-reducing end. Docking studies showed that these are promising compounds, with binding energy values of −10.00 to −19.00 kcal mol−1. The synthesized and designed compounds were subjected to 3D-QSAR to improve their structural scaffolds and their interactions with the GlfT2 active site. Here, 18 newly designed compounds were produced considering all steric and electrostatic descriptors. Furthermore, these 18 compounds were all subjected to molecular docking and showed increased binding energy values from −6.00 to −8.00 kcal mol−1. Also, a significant increase in the binding energy value was observed when modifying the aglycon part instead of the sugar moiety. Thus, it is suggested that modification of the aglycon could be a better way to design putative GlfT2 inhibitors.

The drug development process includes ADMETox evaluation to determine whether a proposed drug can be absorbed or may be toxic; thus, the top hit compounds were subjected to in silico ADMETox evaluation. Compounds #31 and #70–73 are predicted to be well absorbed and non-blood-brain barrier permeant. Moreover, compounds #31 and #73 were considered CYP2C9 inhibitors, which could lead to adverse side effects. Compounds #70, #71, and #72 passed the ADME evaluation. Predicted toxicity evaluation showed that only compound #31 was non-toxic and passed all the toxicity endpoints.

Methods

Molecular dynamics simulation

Two GlfT2 crystal structures are available in the PDB. One is bound with UDP-Galf (PDB ID: 4FIY) and the other is unbound (PDB ID: 4FIX)20. The difference in binding affinity of the natural acceptor substrate with or without the donor substrate in the active site is statistically insignificant. Thus, to simplify the system, the unbound GlfT2 crystal structure (PDB ID: 4FIX) was used for a 100 ns all-atom MD simulation using the NAMD software package version 2.1079. The protein was parameterized using the AMBER ff14SB force field. The system was solvated with the TIP3P water model in a box of 15 Å on all sides. Counter ions were added to neutralize the system. The system was simulated in the NVT ensemble at a temperature of 300 K and with an output interval of 2 fs80. Long-range interactions were evaluated using the particle mesh Ewald method81. Bond constraints were applied using the SHAKE algorithm82.