Quinoxalines against Leishmania amazonensis: SAR study, proposition of a new derivative, QSAR prediction, synthesis, and biological evaluation

Neglected tropical diseases, such as leishmaniasis, impose serious limitations on the affected societies. In this work, a structure–activity relationship (SAR) study was carried out on a series of quinoxaline derivatives active against the promastigote forms of Leishmania amazonensis. As a result, a new quinoxaline derivative was designed and synthesized. In addition, a quantitative structure–activity relationship (QSAR) model was obtained [pIC50 = −1.51 − 0.96 (EHOMO) + 0.02 (PSA); N = 17, R2 = 0.980, R2Adj = 0.977, s = 0.103, and LOO-cv-R2 (Q2) = 0.971]. The activity of the newly synthesized compound was estimated (pIC50 = 5.81) and compared with the experimental result (pIC50 = 5.70), which confirmed the good predictive capacity of the model.


Computational chemistry
The three-dimensional (3D) structures of the quinoxaline derivatives (2a-2i, 3a-3i, and 4a-4d) (Table 1) were constructed using the Spartan'10 software (Wavefunction, Inc.) [15]. Each structure was submitted to a full geometry optimization by molecular mechanics, using the Merck molecular force field (MMFF) available in Spartan. Each optimized structure was then submitted to the default systematic conformational analysis in Spartan, using the same force field. The lowest-energy conformer of each quinoxaline derivative was submitted to a full geometry optimization (energy minimization) at the semiempirical level, using the Austin Model 1 (AM1) Hamiltonian in Spartan. Each optimized conformer was then submitted to a single-point energy calculation by density functional theory (DFT), using the B3LYP hybrid functional with the 6-311++G(d,p) basis set in Spartan. For each resulting DFT structure, the following thirteen physicochemical properties were obtained: total energy (ET, au), energy of the highest occupied molecular orbital (EHOMO, eV), energy of the lowest unoccupied molecular orbital (ELUMO, eV), HOMO-LUMO energy gap (GAP, eV), dipole moment (μ, Debye), base-10 logarithm of the partition coefficient (LogP), surface area (SA, Å²), molecular volume (MV, Å³), molecular weight (MW, amu), polarizability (P, 10⁻³⁰ m³), number of hydrogen bond donors (HBD), number of hydrogen bond acceptors (HBA), and polar surface area (PSA, Å²).
A linear cross-correlation matrix was constructed with the thirteen calculated physicochemical properties and used as a criterion to exclude at least one property from each highly correlated pair, generating a subset of properties for the construction of the QSAR equations. The calculated values of the selected properties were then set as the independent variables (X) of the QSAR equations, and the biological activity values as the dependent variable (Y). Before the equations were generated, the activities were converted from IC50 (μM) (Table 1) to the corresponding pIC50 (M) values. The QSAR equations were obtained by multiple linear regression (MLR) analysis, using the Microsoft Excel® program (Microsoft Inc.).
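The two preprocessing steps described above (IC50 to pIC50 conversion and removal of highly correlated descriptors) can be sketched as follows. This is a minimal illustration, not the procedure used in the original work, which relied on Spartan and Excel. The cutoff of 0.8, the greedy keep-first ordering, the function names, and the toy descriptor values are all assumptions introduced for the example.

```python
import math

def pic50_from_ic50_um(ic50_um):
    """Convert an IC50 given in micromolar to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_um * 1e-6)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant column: correlation is undefined, treated as 0 here
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(descriptors, threshold=0.8):
    """Keep a descriptor only if |r| with every already-kept descriptor
    is below the threshold (assumed cutoff, not taken from the paper)."""
    kept = []
    for name, values in descriptors.items():
        if all(abs(pearson_r(values, descriptors[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Toy values for illustration only; they are not data from Table 2.
toy = {
    "MW": [1.0, 2.0, 3.0, 4.0],
    "MV": [2.0, 4.0, 6.0, 8.1],      # almost collinear with MW
    "dipole": [3.0, 1.0, 4.0, 1.0],  # weakly correlated with MW
}
selected = drop_correlated(toy)
```

With these toy columns, `MV` is discarded because it is almost perfectly correlated with `MW`, while `dipole` is retained. In practice the choice of which member of a correlated pair to keep is a judgment call, which this greedy sketch does not capture.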
In addition, the toxicity risks of the quinoxaline derivatives were evaluated in silico using the OSIRIS Property Explorer server [16], and the fragment-based drug-likeness score was calculated on the same server.
The 1H NMR spectra of all intermediates and of the final product were recorded on a Bruker ARX-400 spectrometer (400 MHz).

In vitro growth inhibition assay
Promastigote (1 × 10⁶ cells/mL) cultures were inoculated in a 24-well plate in the absence or presence of different concentrations of the quinoxaline derivatives (0.1 and 100 μM). The inhibitory activity was evaluated after 72 h. The cell density for each concentration was determined by counting in a hemocytometer (Improved Double Neubauer). The concentration that inhibited cell growth by 50% (IC50) was determined by nonlinear regression analysis [11].

SAR analysis of the quinoxaline derivatives and design of a new derivative
Many descriptors reflect simple molecular properties and can therefore give insight into the physicochemical nature of the observed biological activity [17].
Table 2 shows the physicochemical descriptor values calculated at the DFT(B3LYP)/6-311++G(d,p) level of theory for the quinoxaline derivatives (2a-2i, 3a-3i, and 4a-4d). All the most active quinoxaline derivatives (IC50 < 3 μM, i.e., pIC50 from 5.54 to 6.70, compounds 3a-3i, see Table 1) presented a number of hydrogen bond acceptors (HBA) ranging from 5 to 7, polar surface area (PSA) values ranging from 46 to 74 Å², and LUMO energy (ELUMO) values more negative than −2.5 eV. In addition, their LogP values range from 1.6 to 3.5, and their HOMO energy (EHOMO) values are more negative than −5.9 eV. Unfortunately, the fragment-based drug-likeness values predicted by the OSIRIS server for these compounds are negative, as are those of most Fluka chemicals, whereas 80% of the commercial drugs have a positive drug-likeness value. Toxicity was also predicted by OSIRIS, and compounds 3g and 4a-d showed alerts of mutagenic risk. On the other hand, 3d showed the highest drug-score value (0.63). The drug-score index combines drug-likeness, cLogP (lipophilicity), LogS (water solubility), MW, and toxicity risks in one value used to predict the compound's overall potential as a drug.
Table 1. Chemical structures of the quinoxaline derivatives (2a-2i, 3a-3i, and 4a-4d) and the corresponding in vitro inhibitory activities (IC50, μM) against the promastigote forms of Leishmania amazonensis [11].
Lipinski's rule-of-five [18] proposes that poor absorption or cell permeability of a drug is likely when its chemical structure fulfils more than one of the following criteria: the molecular weight (MW) is greater than 500 Daltons; the LogP is greater than 5; there are more than five hydrogen bond donors (HBD); and there are more than ten hydrogen bond acceptors (HBA). In addition, according to Veber's rule, for good oral availability, the PSA value must be less than or equal to 140 Å². The physicochemical properties calculated for the studied compounds fit these parameters, except the LogP values for 4a-d (Table 2).
Table 2. Physicochemical descriptors calculated at the DFT(B3LYP)/6-311++G(d,p) level of theory for the quinoxaline derivatives (2a-2i, 3a-3i, and 4a-4d) using the Spartan'10 software. Data for the most active compounds (3a-3i) (IC50 < 3 μM, see Table 1) are highlighted in italic.
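The screening criteria can be expressed as a small filter. The sketch below assumes the commonly cited rule-of-five thresholds (MW > 500, LogP > 5, more than five H-bond donors, more than ten H-bond acceptors) together with the PSA limit of 140 Å² quoted in the text. The function names and the example values are hypothetical and do not correspond to any compound in Table 2.

```python
def lipinski_violations(mw, logp, hbd, hba):
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [mw > 500.0, logp > 5.0, hbd > 5, hba > 10]
    return sum(checks)

def passes_oral_filters(mw, logp, hbd, hba, psa):
    """Poor absorption is flagged when more than one rule-of-five criterion
    is violated; the PSA must additionally be <= 140 square angstroms."""
    return lipinski_violations(mw, logp, hbd, hba) <= 1 and psa <= 140.0

# Hypothetical descriptor values, for illustration only.
ok = passes_oral_filters(mw=320.0, logp=2.1, hbd=0, hba=5, psa=60.0)
too_lipophilic_and_heavy = passes_oral_filters(mw=620.0, logp=5.8, hbd=1, hba=6, psa=60.0)
too_polar = passes_oral_filters(mw=320.0, logp=2.1, hbd=0, hba=5, psa=150.0)
```

Here only the first hypothetical compound passes: the second violates two rule-of-five criteria, and the third exceeds the PSA limit. A single violation, such as LogP alone above 5, is tolerated by the rule-of-five as stated.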
In order to improve these parameters, structural modifications on the studied compounds were proposed to design an antileishmanial agent with higher chances to become a drug.
Cogo and co-workers [11] noticed that replacing hydrogen at the R1 position (Table 1) by a halogen (Cl or Br) increases the activity, whereas substitution at the R2 position (Table 1) has little influence on the activity. The methylsulfonyl group is present in all the most active compounds studied in this work (Table 1), and literature data also indicate that it is one of the main groups at the 3-position of quinoxaline derivatives responsible for the observed activity against Trypanosoma cruzi and Leishmania amazonensis [11].
Based on this SAR analysis, several structural modifications were proposed, and their synthetic viability as well as the OSIRIS Property Explorer risk alerts were evaluated. Some of the designed compounds were then selected for structural optimization and calculation of the corresponding physicochemical properties. Considering the properties related to the biological activity, compound 5 was proposed as a potential antileishmanial agent (Fig. 3).
It fulfilled all the requirements, presenting physicochemical descriptors in line with those of the most active compounds of the studied series: LUMO energy of −2.79 eV, five H-bond acceptors, polar surface area of 53.18 Å², LogP equal to 1.74, and HOMO energy of −6.52 eV.
Besides, according to the OSIRIS Property Explorer server, compound 5 (Fig. 3) seems to have low toxicity risks (green color), and its drug-likeness and drug-score indexes improved to 0.88 and 0.82, respectively, when compared to the other compounds of the series. It is also important to mention that compound 5 follows Lipinski's rule-of-five and Veber's rule related to the PSA range of drug candidates.

QSAR model construction and validation
A QSAR model was built to predict the activity value of compound 5. Firstly, the degree of correlation between all pairs among the thirteen descriptors (Table 2) was verified by constructing a cross-correlation matrix. After removing multicollinear descriptors, seven of them were selected (ET, EHOMO, ELUMO, dipole moment, LogP, MW, and PSA). Equations describing the dependency relationship between the independent (X, properties or descriptors) and dependent (Y, biological activity) variables were then obtained following the work of Hansch and Unger [20], who suggest that, for each independent variable included in the QSAR model, there must be at least five observations (i.e., compounds), thus avoiding chance correlation [21].
Therefore, the calculated values of those seven descriptors (Table 2) were set as the independent (X) variables used to calculate the QSAR equations, along with the values of the dependent (Y) variable (i.e., biological activities), which were converted from IC50 (μM) (Table 1) to the corresponding pIC50 (M) values before the QSAR equations were generated.
Among the main methods used to select the independent variables in QSAR, we applied the systematic search method, which consists of combining the available independent variables to build and analyze all possible linear regression equations. In QSAR, compounds are generally divided into a training set and a test set: compounds from the training set are used to construct the QSAR equations, and compounds from the test set are used for validation. Since there are 22 compounds (Table 1) and part of them (~20% of the total number of compounds) should be removed from the model as a test group, we allowed a maximum of three independent variables in each equation, considering N = 18 for the training set and N = 4 for the test set (namely, compounds 2i, 3g, 3h, and 4d).
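The systematic search can be illustrated by enumerating every descriptor subset up to the allowed size. The sketch below assumes the limit on the number of terms follows from requiring five training compounds per independent variable (18 // 5 = 3). The descriptor labels and function names are shorthand introduced here, not identifiers from the original work.

```python
from itertools import combinations

# Shorthand labels for the seven selected descriptors.
DESCRIPTORS = ["ET", "EHOMO", "ELUMO", "dipole", "LogP", "MW", "PSA"]

def max_terms(n_training, obs_per_variable=5):
    """Largest number of independent variables allowed when each variable
    must be supported by a fixed number of training compounds."""
    return n_training // obs_per_variable

def enumerate_models(descriptors, n_training):
    """Map each allowed model size k to all k-descriptor combinations."""
    limit = max_terms(n_training)
    return {k: list(combinations(descriptors, k)) for k in range(1, limit + 1)}

models = enumerate_models(DESCRIPTORS, n_training=18)
counts = {k: len(subsets) for k, subsets in models.items()}
total = sum(counts.values())
```

An exhaustive enumeration of this kind gives 7 one-term, 21 two-term, and 35 three-term subsets, 63 in total. The number of equations actually tabulated for a given size may be smaller if some combinations are discarded, for instance because they pair descriptors that remain correlated.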
The systematic search generated 63 regression equations: seven equations with one independent variable, 20 equations with two independent variables, and 34 equations with three independent variables. Tables 3, 4 and 5 list the previously selected independent variables included in the linear equations and the following statistical parameters of each equation, calculated with the Microsoft Office Excel® program (Microsoft Inc.): correlation coefficient (R), coefficient of determination (R2), adjusted coefficient of determination (R2Adj), standard error (s), and F-test.
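For reference, the statistical parameters listed above can be computed from observed and fitted values with the standard least-squares definitions. The sketch below is a generic illustration, not a reproduction of the Excel output. The function names and the toy data are assumptions, and the formulas presuppose a linear model with an intercept and an imperfect fit (R2 < 1).

```python
import math

def adjusted_r2(r2, n, k):
    """Adjusted R2 for n observations and k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def regression_stats(y_obs, y_calc, k):
    """R, R2, adjusted R2, standard error s, and F for a fitted linear model
    with an intercept and k independent variables (assumes R2 < 1)."""
    n = len(y_obs)
    mean_y = sum(y_obs) / n
    sse = sum((o - c) ** 2 for o, c in zip(y_obs, y_calc))  # residual sum of squares
    sst = sum((o - mean_y) ** 2 for o in y_obs)             # total sum of squares
    dof = n - k - 1                                         # residual degrees of freedom
    r2 = 1.0 - sse / sst
    return {
        "R": math.sqrt(max(r2, 0.0)),
        "R2": r2,
        "R2Adj": adjusted_r2(r2, n, k),
        "s": math.sqrt(sse / dof),
        "F": (r2 / k) / ((1.0 - r2) / dof),
    }

# Toy observed/fitted values for illustration only.
stats = regression_stats([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.0, 4.1, 4.9], k=1)
```

As a consistency check on the adjusted-R2 formula, `adjusted_r2(0.980, 17, 2)` rounds to 0.977, which agrees with the R2 and R2Adj values reported for the final two-descriptor model.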
Since the literature indicates that only models validated externally, after internal validation, can be considered reliable and applicable for external prediction and regulatory purposes [22,23], the model was applied to external molecules.
pIC50 = −1.51 − 0.96 (EHOMO) + 0.02 (PSA)    (1)
An external validation confirmed the robustness of the proposed model (Eq. 1). A set of four compounds (2i, 3g, 3h, and 4d) was used as the external test set, representing about 20% of the observations (N = 22). The observed and calculated pIC50 values, residuals, and percentage deviations of the test group are shown in Table 8, where it can be seen that all of them deviate by no more than 5% from the experimentally observed biological activity value.

Synthesis of the new derivative and activity prediction by the QSAR model
The previously unreported compound 5 was synthesized and characterized by NMR, and its biological activity against the promastigote form of Leishmania amazonensis was evaluated. The built and validated QSAR model, corresponding to Eq. 1, was used to predict the activity of this new derivative.
Therefore, the descriptors present in Eq. 1 were calculated for the new compound 5 (EHOMO = −6.52 eV and PSA = 53.19 Å²), and a value of 5.81 was predicted for its biological activity (pIC50) against Leishmania amazonensis. Comparison with the experimental result (IC50 = 2.0 ± 1.2 μM and pIC50 = 5.70) shows that the QSAR model proposed here (Eq. 1) has a good predictive capacity, with a deviation of 1.93%. The model is therefore useful to drive the synthesis of new quinoxaline derivatives, saving time and resources that would otherwise be spent on synthesis and testing of biological activity.
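The prediction for compound 5 can be reproduced directly from Eq. 1 and the two descriptor values quoted above. The short sketch below uses the rounded coefficients as printed, so it is only as precise as those coefficients. The function names are introduced here for illustration.

```python
def predict_pic50(e_homo_ev, psa):
    """Eq. 1 with the rounded coefficients reported in the text."""
    return -1.51 - 0.96 * e_homo_ev + 0.02 * psa

def percent_deviation(predicted, observed):
    """Absolute deviation of the prediction, as a percentage of the observed value."""
    return abs(predicted - observed) / observed * 100.0

# Descriptor values of compound 5 as quoted in the text.
predicted = round(predict_pic50(e_homo_ev=-6.52, psa=53.19), 2)  # 5.81
deviation = round(percent_deviation(predicted, 5.70), 2)         # 1.93
```

Because EHOMO is negative and its coefficient is also negative, the HOMO term contributes about +6.26 to the predicted pIC50, while the PSA term contributes about +1.06.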

Conclusions
SAR studies of a series of quinoxaline derivatives were carried out, and a new quinoxaline derivative was proposed as a potential antileishmanial agent. This previously unreported compound was synthesized and tested against Leishmania amazonensis promastigotes. A new QSAR model was built, and it was able to predict the activity of the new compound, making it useful to drive the synthesis of further derivatives.

Table 4. Statistical data for the 20 QSAR equations with two terms (N = 18 and p = 0.05) generated by systematic combination of seven theoretical physicochemical descriptors.