Quantitative structure–activity relationship modeling for predication of inhibition potencies of imatinib derivatives using SMILES attributes

Hamzehali, Hamideh; Lotfi, Shahram; Ahmadi, Shahin; Kumar, Parvin

doi:10.1038/s41598-022-26279-8

Published: 15 December 2022

Scientific Reports volume 12, Article number: 21708 (2022)

Abstract

Chronic myelogenous leukemia (CML), which results from the BCR-ABL tyrosine kinase (TK) chimeric oncoprotein, is a malignant clonal disorder of hematopoietic stem cells. Imatinib is used as an inhibitor of BCR-ABL TK in the treatment of CML patients. The main objective of the present work is to construct quantitative structure–activity relationship (QSAR) models for the prediction of inhibition potencies of a large series of imatinib derivatives against BCR-ABL TK. Herein, the inbuilt Monte Carlo algorithm of CORAL software is employed to develop the QSAR models. The SMILES notations of chemical structures are used to compute the descriptor of correlation weights (DCW). QSAR models are established using the balance of correlation method with the index of ideality of correlation (IIC). The data set of 306 molecules is randomly divided into three splits. In QSAR modeling, the numerical values of R², Q², and IIC for the validation set of splits 1 to 3 are in the ranges 0.7180–0.7755, 0.6891–0.7561, and 0.4431–0.8611, respectively. The result ${CR}_{p}^{2}$ > 0.5 for all three constructed models in the Y-randomization test supports the reliability of the established models. The promoters of increase/decrease for pIC₅₀ are identified and used for the mechanistic interpretation of structural attributes.

Introduction

The BCR-ABL tyrosine kinase (TK) oncoprotein is present in 95% of patients suffering from chronic myeloid leukemia (CML). Therefore, tyrosine kinase inhibitors (TKIs), such as imatinib, the first drug against the BCR-ABL TK, have been used in the therapy of most CML patients. Imatinib competitively targets the ATP-binding site in the TK domain of the BCR-ABL oncoprotein and reduces the activity of BCR-ABL. Due to point mutations in the BCR-ABL kinase domain, some patients, particularly in the advanced phases of CML, develop imatinib resistance. Therefore, to overcome imatinib resistance, novel analogues of imatinib such as ponatinib, nilotinib, dasatinib, bosutinib, etc., have been developed as TKIs and tested in patients with BCR-ABL positive CML. Hence, the design and development of more potent BCR-ABL TKIs, specifically imatinib derivatives, is a matter of great importance and would help in the therapeutic treatment of CML patients^1,2,3,4,5.

Quantitative structure–activity relationship (QSAR) is an approach that can be applied to the construction of pharmacophore models, new drug discovery, and assessment of the activity/behavior of compounds^6,7,8. QSAR is also a predictive and diagnostic process employed for finding quantitative relationships between chemical structures and a biological activity or property. QSAR is the concluding outcome of computational methods that begin with an appropriate description of the molecular structure and conclude with some interpretation, assumption, and judgment on the behaviour of molecules in the biological and physicochemical system under examination^9,10. Finding a class of molecular descriptors that captures variations in the structural properties of the molecule is the main goal of QSAR model development.

The Monte Carlo algorithm of CORrelation And Logic (CORAL) software has been applied for QSAR modeling of different endpoints^{11,12,13,14,15}. Random distribution of the dataset into training and validation subsets, production of optimal descriptors of correlation weights (DCW), and the construction of predictive models using the physicochemical conditions of corresponding experiments are unique options available in the CORAL software for the development of QSAR models^{16,17,18,19,20,21,22}. The literature survey shows that the Index of Ideality of Correlation (IIC) has been applied to improve the statistical results of QSAR models^{23,24,25,26,27,28}. In addition, most descriptors used in common QSAR models do not have physical meaning and cannot be associated with a mechanistic interpretation. In contrast, QSAR models developed with CORAL software use SMILES-based molecular descriptors that allow mechanistic interpretation and can be associated with molecular fragments.

The objective of the present work is to apply the inbuilt Monte Carlo algorithm of CORAL software to build QSAR models that predict the inhibition potencies (pIC₅₀) of 306 imatinib derivatives against BCR-ABL tyrosine kinase (TK). The balance of correlation method with IIC is used to develop the QSAR models. The reliability and predictability of the designed QSAR models are assessed with three random splits.

Method

Data

Zin et al.²⁹ extracted the inhibition potential of 306 compounds for the human BCR-ABL tyrosine kinase from the ChEMBL v23 (2017) database³⁰. The inhibition potential of the compounds was defined as the half maximal inhibitory concentration in mol/L (IC₅₀). The experimental inhibition data were then transformed to the negative logarithm (pIC₅₀). The endpoint pIC₅₀ was taken as the dependent parameter for constructing QSAR models. The pIC₅₀ values ranged from 4.03 to 9.37. Three splits were created from the dataset (n = 306), and the compounds of each split were randomly divided into the training (34%), invisible training (35%), calibration (15%), and validation (16%) sets. The SMILES notations, split distribution, experimental pIC₅₀, predicted pIC₅₀, and applicability domain of each compound are listed in Table S1. The task of each set in developing the QSAR models has already been described in the literature^31,32.
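As a minimal sketch of the data preparation described above, the following Python snippet converts an IC₅₀ value in mol/L to pIC₅₀ and draws one random split with the stated proportions. The function names, the fixed seed, and the rounding of set sizes are illustrative assumptions; the splitting procedure actually used in CORAL may differ in detail.

```python
import math
import random


def pic50_from_ic50(ic50_molar: float) -> float:
    """Return pIC50 = -log10(IC50), with IC50 given in mol/L."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)


def random_split(ids, fractions=(0.34, 0.35, 0.15, 0.16), seed=0):
    """Shuffle the compound ids and cut them into four consecutive sets.

    The fractions follow the proportions quoted in the text; the last set
    receives whatever remains after rounding (an assumption of this sketch).
    """
    names = ("training", "invisible_training", "calibration", "validation")
    ids = list(ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n = len(ids)
    # Cumulative cut points for the first three sets.
    bounds = [round(n * sum(fractions[:i])) for i in range(1, len(fractions))]
    starts = [0] + bounds
    ends = bounds + [n]
    return {name: ids[s:e] for name, s, e in zip(names, starts, ends)}


if __name__ == "__main__":
    print(pic50_from_ic50(1e-9))  # a 1 nM inhibitor
    split = random_split(range(306), seed=1)
    print({name: len(members) for name, members in split.items()})
```

Calling `random_split` three times with different seeds would give three independent splits of the same 306 compounds.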

Optimal SMILES-based descriptors

In the CORAL software, three types of optimal descriptors i.e. SMILES-based, graph-based and hybrid descriptors (combination of SMILES and Graph) can be employed to develop QSAR models.

The optimal descriptor is a mathematical function of so-called correlation weights (CW). Correlation weights are numerical coefficients associated with various molecular features extracted from SMILES symbols. In other words, the univariate models investigated in this research are based on the “descriptors of correlation weights” (DCW). The Monte Carlo algorithm was used to calculate the DCW. In the present research, the SMILES-based descriptor was employed to build the QSAR models. The optimal descriptors used to build the pIC₅₀ models are calculated as follows:

$$\mathrm{DCW}\left({\mathrm{T}}^{*}, {\mathrm{N}}^{*}\right)={}^{\mathrm{SMILES}}\mathrm{DCW}\left({\mathrm{T}}^{*}, {\mathrm{N}}^{*}\right)$$

(1)

$${}^{\mathrm{SMILES}}\mathrm{DCW}\left({\mathrm{T}}^{*}, {\mathrm{N}}^{*}\right)=\sum \mathrm{CW}\left({\mathrm{SSS}}_{\mathrm{K}}\right)+\mathrm{CW}\left(\mathrm{HALO}\right)+\mathrm{CW}\left(\mathrm{NOSP}\right)+\mathrm{CW}\left(\mathrm{HARD}\right)+\mathrm{CW}\left(\mathrm{PAIR}\right)+\mathrm{CW}\left({\mathrm{C}}_{\mathrm{max}}\right)+\mathrm{CW}\left({\mathrm{N}}_{\mathrm{max}}\right)+\mathrm{CW}\left({\mathrm{O}}_{\mathrm{max}}\right)$$

(2)

Here, T is the threshold and N is the number of epochs. T is an integer used to split the SMILES attributes (i.e. S_k, SS_k, and SSS_k) into two classes, active and rare. If a molecular attribute A occurs fewer than T times in the SMILES of the training set, it is classified as rare and omitted from the construction of the model, i.e. its correlation weight is CW(A) = 0. T* and N* are the values of T and N that yield the best statistical result of a model for the calibration set.

The notation used in Eq. (2) is as follows. SSS_k, a local SMILES attribute, is a combination of three SMILES atoms. NOSP, HALO, and BOND are global SMILES attributes: NOSP indicates the presence or absence of nitrogen (N), oxygen (O), sulfur (S), and phosphorus (P); HALO indicates the presence or absence of fluorine, chlorine, and bromine; BOND indicates the presence or absence of double (‘=’), triple (‘#’), and stereochemical (‘@’ or ‘@@’) bonds. PAIR denotes the combination of BOND and NOSP; HARD denotes the presence or absence of NOSP, HALO, and BOND; C_max represents the maximum number of rings; N_max and O_max are the total numbers of nitrogen and oxygen atoms in the molecular structure. CW(A) denotes the correlation weight of a SMILES attribute A, e.g. SSS_k, NOSP, BOND, HALO, PAIR, C_max, N_max, and O_max. These correlation weights are calculated using the Monte Carlo optimization^{33,34,35,36,37}.

The obtained numerical data in terms of DCW is used to determine the inhibition potential for Imatinib derivatives (pIC₅₀) by the least square method using the following one-variable model:

$${pIC}_{50}={\mathrm{C}}_{0}+{\mathrm{C}}_{1}\times \mathrm{DCW}\left({\mathrm{T}}^{*}, {\mathrm{N}}^{*}\right)$$

(3)
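To make Eqs. (2) and (3) more concrete, the sketch below computes only the local SSS_k part of the descriptor for one SMILES string and then applies the one-variable model. The tokenizer, the function names, and the example correlation weight are hypothetical; the global attributes (HALO, NOSP, HARD, PAIR, C_max, N_max, O_max) are omitted, and the exact attribute encoding used by CORAL is not reproduced here.

```python
import re

# Simplified SMILES tokenizer (an assumption of this sketch): bracket atoms,
# two-letter halogens, '@@', and two-digit ring closures are kept as single
# "SMILES atoms"; every other character is its own token.
_TOKEN = re.compile(r"\[[^\]]+\]|Cl|Br|@@|%\d{2}|.")


def smiles_tokens(smiles):
    """Split a SMILES string into SMILES atoms."""
    return _TOKEN.findall(smiles)


def sss_attributes(smiles):
    """Return all windows of three consecutive SMILES atoms (SSS_k)."""
    tokens = smiles_tokens(smiles)
    return [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]


def dcw(smiles, correlation_weights):
    """Sum the correlation weights of the SSS_k attributes.

    Attributes missing from the dictionary are treated as rare, CW = 0.
    """
    return sum(correlation_weights.get(a, 0.0) for a in sss_attributes(smiles))


def predict_pic50(dcw_value, c0, c1):
    """One-variable model of Eq. (3): pIC50 = C0 + C1 * DCW."""
    return c0 + c1 * dcw_value


if __name__ == "__main__":
    cw = {("c", "c", "c"): 0.5}  # hypothetical weight, not taken from the paper
    value = dcw("c1ccccc1", cw)
    print(value, predict_pic50(value, c0=3.6679, c1=0.2889))
```

The intercept and slope in the usage example are those reported for split 1 in the Results section; the weight itself is invented for illustration, so the printed pIC₅₀ has no chemical meaning.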

Monte Carlo optimization

In the present research, a modified target function (TF_m), i.e. the balance of correlation with IIC, was employed to compute the DCW³². The following mathematical relationships are used to compute TF_m:

$$TF={R}_{training}+{R}_{invTraining}-\left|{R}_{training}-{R}_{invTraining}\right|\times Const$$

(4)

$${TF}_{m}=TF+{IIC}_{CAL} \times Const$$

(5)

Here, R_training and R_invTraining indicate the correlation coefficients for the training and invisible training sets, respectively. The empirical constant (Const) is usually fixed.

The index of ideality of correlation for the calibration set (IIC_CAL) is calculated using the following equations:

$${\mathrm{IIC}}_{\mathrm{CAL}}={\mathrm{R}}_{\mathrm{CAL}}\times \frac{\mathrm{min}({}^{-}{\mathrm{MAE}}_{\mathrm{CAL}}, {}^{+}{\mathrm{MAE}}_{\mathrm{CAL}})}{\mathrm{max}({}^{-}{\mathrm{MAE}}_{\mathrm{CAL}}, {}^{+}{\mathrm{MAE}}_{\mathrm{CAL}})}$$

(6)

$${}^{-}{\mathrm{MAE}}_{\mathrm{CAL}}=\frac{1}{{}^{-}\mathrm{N}}\sum_{k=1}^{{}^{-}\mathrm{N}} \left|{\Delta }_{\mathrm{k}}\right|, \quad {\Delta }_{\mathrm{k}}<0, \; {}^{-}\mathrm{N \, is \, the \, number \, of } \, {\Delta }_{\mathrm{k}}<0$$

(7)

$${}^{+}{\mathrm{MAE}}_{\mathrm{CAL}}=\frac{1}{{}^{+}\mathrm{N}}\sum_{k=1}^{{}^{+}\mathrm{N}}\left|{\Delta }_{\mathrm{k}}\right|, \quad {\Delta }_{\mathrm{k}}\ge 0, \; {}^{+}\mathrm{N \, is \, the \, number \, of } \, {\Delta }_{\mathrm{k}}\ge 0$$

(8)

$${\Delta }_{\mathrm{k}}={\mathrm{Observed}}_{\mathrm{k}}-{\mathrm{Calculated}}_{\mathrm{k}}$$

(9)

Here, k is the index (1, 2, …, N), and Observed_k and Calculated_k are the observed and calculated values of the endpoint.
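The target function and the IIC can be sketched in a few lines of Python. This is an illustrative reading of Eqs. (4) to (9), not CORAL's implementation: the function names are hypothetical, the two occurrences of Const are exposed as separate parameters whose defaults follow the dR_weight and IIC_weight values quoted in the Results section, and returning 0 when all residuals share one sign is an assumption of this sketch.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def iic(observed, calculated):
    """Index of ideality of correlation, Eqs. (6) to (9)."""
    r = pearson_r(observed, calculated)
    deltas = [o - c for o, c in zip(observed, calculated)]
    negative = [abs(d) for d in deltas if d < 0]
    non_negative = [abs(d) for d in deltas if d >= 0]
    if not negative or not non_negative:
        return 0.0  # assumption: undefined ratio is reported as 0
    mae_neg = sum(negative) / len(negative)
    mae_pos = sum(non_negative) / len(non_negative)
    return r * min(mae_neg, mae_pos) / max(mae_neg, mae_pos)


def modified_target_function(r_training, r_inv_training, iic_cal,
                             dr_weight=0.1, iic_weight=0.2):
    """TF_m of Eqs. (4) and (5) with separate weights for the two terms."""
    tf = r_training + r_inv_training - abs(r_training - r_inv_training) * dr_weight
    return tf + iic_cal * iic_weight


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]
    calc = [1.5, 1.5, 3.5, 3.5]
    print(iic(obs, calc))
    print(modified_target_function(0.9, 0.8, iic(obs, calc)))
```

In the toy data the positive and negative residuals have the same mean absolute size, so the IIC equals the correlation coefficient; unbalanced residuals would pull it below R.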

Applicability domain

According to the third OECD principle, a QSAR model should have a defined applicability domain (AD). The AD is the physicochemical, structural, or biological space, knowledge, or information on which the training set of the model was built and within which the model can be used to make predictions for new compounds^38,39.

In Monte Carlo-based QSAR with the CORAL program, the distribution of SMILES attributes across the training, invisible training, and calibration sets is used to define the AD^40,41. If a substance does not fall within the AD, it is identified as an outlier and cannot be associated with a reliable prediction.

In CORAL, a compound is recognized in the scope of AD if the following inequality is fulfilled, otherwise, it is recognized as an outlier:

$${\mathrm{Defect}}_{\mathrm{molecule}} <2\times {\overline{\mathrm{Defect}} }_{TRN}$$

(10)

where ${\overline{\mathrm{Defect}} }_{\mathrm{TRN}}$ is the average statistical defect (D) over the compounds of the training set.

The statistical defect (D) of a molecule is the sum of the statistical defects of all attributes present in its SMILES notation:

$${\mathrm{Defect}}_{\mathrm{Molecule}}=\sum_{\mathrm{k}=1}^{N{\mathrm{A}}}{\mathrm{Defect}}_{{\mathrm{A}}_{\mathrm{K}}}$$

(11)

NA is the number of active SMILES attributes for the given compounds.

The “statistical defect,” Defect(A) for an attribute of SMILES can be defined by the following mathematical equation:

$${\mathrm{Defect}}_{{\mathrm{A}}_{\mathrm{K}}}=\begin{cases}\dfrac{\left|{\mathrm{P}}_{\mathrm{TRN}}({\mathrm{A}}_{\mathrm{K}})-{\mathrm{P}}_{\mathrm{CAL}}({\mathrm{A}}_{\mathrm{K}})\right|}{{\mathrm{N}}_{\mathrm{TRN}}({\mathrm{A}}_{\mathrm{K}})+{\mathrm{N}}_{\mathrm{CAL}}({\mathrm{A}}_{\mathrm{K}})} & \mathrm{if } \; {\mathrm{A}}_{\mathrm{K}}>0\\ 1 & \mathrm{if } \; {\mathrm{A}}_{\mathrm{K}}=0\end{cases}$$

(12)

${P}_{TRN}({A}_{K})$ and ${P}_{CAL}({A}_{K})$ are the probabilities of an attribute A_K in the training and calibration sets; ${N}_{TRN}({A}_{K})$ and ${N}_{CAL}({A}_{K})$ are the numbers of occurrences of A_K in the training and calibration sets, respectively.
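The defect-based AD check of Eqs. (10) to (12) might be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the probability of an attribute is taken as its count divided by the number of compounds in the set, and the condition A_K = 0 is read as "the attribute occurs in neither set". All function names are hypothetical.

```python
def attribute_defect(n_trn, n_cal, size_trn, size_cal):
    """Statistical defect of one SMILES attribute, Eq. (12).

    n_trn, n_cal: occurrences of the attribute in the training and
    calibration sets; size_trn, size_cal: number of compounds per set.
    """
    if n_trn + n_cal == 0:
        return 1.0  # attribute never seen: maximal defect
    p_trn = n_trn / size_trn
    p_cal = n_cal / size_cal
    return abs(p_trn - p_cal) / (n_trn + n_cal)


def molecule_defect(attributes, counts_trn, counts_cal, size_trn, size_cal):
    """Sum of the attribute defects of one molecule, Eq. (11)."""
    return sum(
        attribute_defect(counts_trn.get(a, 0), counts_cal.get(a, 0),
                         size_trn, size_cal)
        for a in attributes
    )


def in_applicability_domain(defect_molecule, mean_defect_training):
    """Inequality of Eq. (10): inside the AD if Defect < 2 * mean defect."""
    return defect_molecule < 2 * mean_defect_training


if __name__ == "__main__":
    counts_trn = {"c...c...c...": 20}  # hypothetical counts
    counts_cal = {"c...c...c...": 5}
    d = molecule_defect(["c...c...c...", "unseen"], counts_trn, counts_cal, 100, 50)
    print(d, in_applicability_domain(d, mean_defect_training=0.3))
```

In the usage example the unseen attribute contributes a defect of 1, which alone is enough to push the molecule outside the domain for the chosen mean defect.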

Validation of the model

The statistical quality of the QSAR models created for the pIC₅₀ of imatinib derivatives is evaluated using three methodologies: (i) internal validation or cross-validation by determining R², IIC, CCC, Q², and the F-test on the training set; (ii) external validation by determining Q²F₁, Q²F₂, Q²F₃, CRp², s, MAE, r̅_m², and Δr_m² for the test set substances; and (iii) data randomization or Y-scrambling (Table 1). The mathematical relationships for these statistical parameters have been provided in the literature^{42,43,44,45,46}. In Table 1, Y_obs is the observed endpoint; Y_prd is the predicted endpoint; R² and ${R}_{0}^{2}$ are the squared correlation coefficients between the observed and predicted endpoints with and without intercept, respectively; and ${R}_{r}^{2}$ is the mean squared correlation coefficient of the randomized models.
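For orientation, three of the metrics named above can be written out in Python using their commonly cited literature definitions. That these match the expressions in Table 1 exactly is an assumption, as are the function names and the guard against a negative argument under the square root.

```python
import math


def q2_f1(y_obs, y_prd, training_mean):
    """Q2_F1 = 1 - sum((obs - prd)^2) / sum((obs - mean of training set)^2)."""
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_prd))
    ss = sum((o - training_mean) ** 2 for o in y_obs)
    return 1 - press / ss


def ccc(y_obs, y_prd):
    """Concordance correlation coefficient between observed and predicted."""
    n = len(y_obs)
    mo, mp = sum(y_obs) / n, sum(y_prd) / n
    sxy = sum((o - mo) * (p - mp) for o, p in zip(y_obs, y_prd))
    so = sum((o - mo) ** 2 for o in y_obs)
    sp = sum((p - mp) ** 2 for p in y_prd)
    return 2 * sxy / (so + sp + n * (mo - mp) ** 2)


def crp2(r2, r2_randomized):
    """cRp^2 = R * sqrt(R^2 - R_r^2), with R_r^2 from Y-randomization."""
    # Guard (assumption): clip at zero if the randomized models happen to fit better.
    return math.sqrt(r2) * math.sqrt(max(r2 - r2_randomized, 0.0))


if __name__ == "__main__":
    obs, prd = [1.0, 2.0, 3.0], [1.0, 2.0, 4.0]
    print(q2_f1(obs, prd, training_mean=2.0))
    print(ccc(obs, prd))
    print(crp2(0.81, 0.17))
```

A cRp² value close to R² indicates that the randomized models explain almost nothing, which is the situation a non-chance model should show.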

Table 1 Mathematical equations of the statistical benchmarks of predictive potential for CORAL models.


Results and discussion

QSAR models

Using the data described in the “Data” section, three splits were generated randomly. Each split was further divided into four sets, namely the training, invisible training, calibration, and validation sets. To establish the QSAR models, the balance of correlation with the IIC technique was employed. The values of IIC_weight (weight of IIC) and dR_weight (weight for dR in the balance of correlations) were 0.2 and 0.1, respectively. The preferred values of T* and N* were 1 and 15, respectively, for all splits. With these values of T* and N*, the pIC₅₀ (endpoint) for each split was computed, and the developed QSAR models are as follows:

$$\mathrm{Split }1\quad {pIC}_{50}=3.6679\left(\pm 0.0196\right)+0.2889(\pm 0.0016)\times DCW(1, 15)$$

(13)

$$\mathrm{Split }2\quad {pIC}_{50 }=1.5438\left(\pm 0.0259\right)+0.2660(\pm 0.0017)\times DCW(1, 15)$$

(14)

$$\mathrm{Split }3\quad {pIC}_{50 }=3.4165\left(\pm 0.0126\right)+0.2696(\pm 0.0010)\times DCW(1, 15)$$

(15)

The statistical characteristics of the QSAR models given by Eqs. 13–15 are listed in Table 2. The results in Table 2 show that all generated QSAR models are statistically sound and meet the requirements of the various validation criteria. The robustness of the established QSAR models was demonstrated by R² and Q² values that were more than 0.5 and 0.7^47,48. In addition, the R²m metric for the validation set of all designed QSAR models was satisfactory and met the criteria suggested by Roy et al.⁴⁹. The ${\overline{R} }_{m}^{2}$-scaled and ${\Delta R}_{m}^{2}$-scaled metrics, introduced by Roy et al. as modified R²m metrics, were also computed⁵⁰; these values were 0.6928 and 0.0216, 0.6878 and 0.0929, and 0.7339 and 0.1230 for splits 1 to 3, respectively. The trustworthiness of the constructed QSAR models was also confirmed by the Y-randomization test.

Table 2 The summary statistical characteristics and criteria of predictability of the QSAR models for three random splits.


In the Y-randomization test, new random models were developed over several repetitions, and their R² values were below 0.1 (see Table S2 in the supplementary information). These results indicate that the correlation between pIC₅₀ and the molecular attributes is not due to chance. Moreover, CR²p was greater than 0.75 for all three splits, which confirms that the developed models are not chance correlations⁵¹.

The AD status of each compound in models 1 to 3, based on the defect value, is shown in Table S1. The percentages of compounds within the AD of the models were 81, 83, and 87% for splits 1–3, respectively. Thus, more than 80% of the compounds fell within the AD of each of the three prediction models.

Figures 1 and 2 show experimental versus predicted pIC₅₀ and residual versus predicted pIC₅₀ for the three models. As can be seen in Fig. 1, there is good agreement between the experimental and predicted data for the suggested models. Fig. 2 shows that the residuals are dispersed around the horizontal zero line. All these results confirm that the constructed QSAR models are robust and well fitted.
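The idea behind the Y-randomization test can be illustrated with a simplified Python sketch: the endpoint values are shuffled, the one-variable model is refitted, and the resulting R² values are averaged. In the actual CORAL workflow the correlation weights would presumably be re-optimized for every shuffled endpoint; here the DCW values are held fixed, which is a simplification, and all names are hypothetical.

```python
import random


def r_squared(x, y):
    """R^2 of a least-squares line y = a + b*x (squared Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def mean_randomized_r2(dcw_values, endpoint, n_trials=100, seed=0):
    """Average R^2 over models fitted to randomly shuffled endpoint values."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        shuffled = list(endpoint)
        rng.shuffle(shuffled)  # break the structure-activity link
        total += r_squared(dcw_values, shuffled)
    return total / n_trials


if __name__ == "__main__":
    dcw_values = [float(i) for i in range(50)]    # toy descriptor values
    endpoint = [2.0 * v + 1.0 for v in dcw_values]  # toy, perfectly correlated
    print(r_squared(dcw_values, endpoint))
    print(mean_randomized_r2(dcw_values, endpoint, n_trials=200, seed=1))
```

For a model that is not a chance correlation, the randomized R² stays far below the R² of the original data, as in the toy example.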

Interpretation of the QSAR model

Mechanistic interpretation of models helps in understanding the effect of descriptors on the predicted endpoint. The mechanistic interpretation of QSAR models built with the CORAL program is based on the correlation weights (CW) of SMILES attributes obtained from several runs (probes) of the Monte Carlo optimization. Across the runs of a model, the CW of a SMILES attribute may be positive in all runs, negative in all runs, or positive in some runs and negative in others. Attributes with positive CW in all runs are promoters of increase of pIC₅₀, and attributes with negative CW in all runs are promoters of decrease of pIC₅₀. If a structural attribute has both positive and negative CW values across the runs, it is undefined. Table 3 lists the structural features identified as promoters of increase or decrease of pIC₅₀ in three probes of the Monte Carlo optimization with optimum T* and N*, along with the interpretation of the promoters (NT is the number of attributes in the training set, NiT is the number of attributes in the invisible training set, and NC is the number of attributes in the calibration set). According to the results, the SMILES-based descriptors acting as promoters of increase of pIC₅₀ were c…c…c…, c…c…1… and Cmax.3……, and the promoter of decrease of pIC₅₀ was C…(…(….
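The classification rule described above is simple enough to state as code. The following sketch assumes that the correlation weights of one attribute from the individual runs are available as a list of numbers; the function name, the handling of zero weights, and the example values are hypothetical.

```python
def classify_promoter(correlation_weights):
    """Classify one SMILES attribute from its CW values over several runs.

    Returns "increase" if all weights are positive, "decrease" if all are
    negative, and "undefined" otherwise (mixed signs or zero weights, the
    latter being an assumption of this sketch).
    """
    weights = list(correlation_weights)
    if not weights:
        raise ValueError("at least one correlation weight is required")
    if all(w > 0 for w in weights):
        return "increase"
    if all(w < 0 for w in weights):
        return "decrease"
    return "undefined"


if __name__ == "__main__":
    # Invented weights for three runs of the optimization.
    print(classify_promoter([0.42, 0.31, 0.55]))
    print(classify_promoter([-0.20, -0.11, -0.37]))
    print(classify_promoter([0.15, -0.08, 0.22]))
```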

Table 3 List of structural attributes (SAk) as promoters of increase/decrease extracted from three splits of the constructed model.


Comparison with prior reports

Kyaw Zin and colleagues²⁹ reported QSAR models for the same data set, relying on deep neural nets (DNN) and hybrid sets of 2D/3D/MD descriptors, to predict the inhibition potencies of 306 imatinib derivatives. The dataset was divided into a training set (260 compounds) and a test set (46 compounds). They built multiple DNN and random forest (RF) regressors with hybrid 2D/3D/MD descriptors and, through rigorous validation tests, reported that their DNN regression models gave excellent external prediction performance for the pIC₅₀ data set. The R² of the training and validation sets was 0.99 and 0.68, respectively, and the MAE of the training and test sets was 0.08 and 0.67, respectively.

A comparison of the present QSAR models with the previous study shows that the earlier models required the structure, physicochemical parameters, or prior calculation of chemical descriptors, whereas CORAL requires only a text file containing the SMILES notations of the compounds and the endpoint. Here, three splits were used to establish three QSAR models using four sets (training, invisible training, calibration, and validation), whereas the previously constructed models used a single split with two sets (training and test). In the present research, the molecular features responsible for the increase/decrease of the endpoint were also identified for mechanistic interpretation.

In terms of statistical characterization, the QSAR models proposed here with CORAL for the prediction of pIC₅₀ were superior to the reported model. The statistical parameters ${Q}_{F1}^{2}$, ${Q}_{F2}^{2}$, ${Q}_{F3}^{2}$, ${CR}_{p}^{2}$, CCC, and IIC were not reported in the previous study. The R² of the training and validation sets for splits 1 to 3 is in the ranges 0.76–0.85 and 0.71–0.78, respectively, and the MAE of the training and validation sets for splits 1 to 3 is in the ranges 0.41–0.54 and 0.46–0.54, respectively. Therefore, the QSAR models established here are more reliable and have better predictability.

Conclusion

In this work, QSAR models were created using the Monte Carlo method to predict the pIC₅₀ of 306 imatinib derivatives and were validated with several parameters. The QSAR models were established using a modified target function (TF_m). The statistical quality of the constructed models was assessed using internal and external validation metrics such as R², IIC, CCC, Q², ${Q}_{F1}^{2}$, ${Q}_{F2}^{2}$, ${Q}_{F3}^{2}$, F, s, MAE, RMSE, $\overline{{R }_{m}^{2}}$, $\overline{{\Delta R }_{m}^{2}}$, scaled-$\overline{{\mathrm{R} }_{\mathrm{m}}^{2}}$, scaled-$\overline{\Delta {\mathrm{R} }_{\mathrm{m}}^{2}}$, ${CR}_{p}^{2}$, and the Y-randomization test. In the constructed QSAR models, the numerical values of R², Q², and IIC for the validation set of splits 1 to 3 were in the ranges 0.7180–0.7755, 0.6891–0.7561, and 0.4431–0.8611, respectively. The applicability domain (AD) was used to identify the outliers in the generated QSAR models. The structural features acting as promoters of pIC₅₀ increase/decrease were also identified.

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

References

Demizu, Y. et al. Development of BCR-ABL degradation inducers via the conjugation of an imatinib derivative and a cIAP1 ligand. Bioorg. Med. Chem. Lett. 26, 4865–4869 (2016).
Yang, M., Xi, Q., Jia, W. & Wang, X. Structure-based analysis and biological characterization of imatinib derivatives reveal insights towards the inhibition of wild-type BCR-ABL and its mutants. Bioorg. Med. Chem. Lett. 29, 126758 (2019).
Li, Y.-T. et al. Syntheses and biological evaluation of 1, 2, 3-triazole and 1, 3, 4-oxadiazole derivatives of imatinib. Bioorg. Med. Chem. Lett. 26, 1419–1427 (2016).
An, X. et al. BCR-ABL tyrosine kinase inhibitors in the treatment of Philadelphia chromosome positive chronic myeloid leukemia: A review. Leuk. Res. 34, 1255–1268 (2010).
San Juan, A. A. Structural investigation of PAP derivatives by CoMFA and CoMSIA reveals novel insight towards inhibition of Bcr-Abl oncoprotein. J. Mol. Graph. Model. 26, 482–493 (2007).
Azimi, A., Ahmadi, S., Kumar, A., Qomi, M. & Almasirad, A. SMILES-based QSAR and molecular docking study of oseltamivir derivatives as influenza inhibitors. Polycyclic Arom. Compds. 42, 1–21 (2022).
Ghasedi, N., Ahmadi, S., Ketabi, S. & Almasirad, A. DFT based QSAR study on quinolone-triazole derivatives as antibacterial agents. J. Receptors Signal Transduct. 42, 1–11 (2021).
Ahmadi, S., Mardinia, F., Azimi, N., Qomi, M. & Balali, E. Prediction of chalcone derivative cytotoxicity activity against MCF-7 human breast cancer cell by Monte Carlo method. J. Mol. Struct. 1181, 305–311 (2019).
Shukla, S., Kouanda, A., Silverton, L., Talele, T. T. & Ambudkar, S. V. Pharmacophore modeling of nilotinib as an inhibitor of ATP-binding cassette drug transporters and bcr-abl kinase using a three-dimensional quantitative structure–activity relationship approach. Mol. Pharm. 11, 2313–2322 (2014).
Muhammad, U., Uzairu, A. & Ebuka Arthur, D. Review on: Quantitative structure activity relationship (QSAR) modeling. J. Anal. Pharm. Res. 7, 240–242 (2018).
Toropova, A. P. & Toropov, A. A. Application of the monte carlo method for the prediction of behavior of peptides. Curr. Protein Pept. Sci. 20, 1151–1157 (2019).
Toropov, A. A., Toropova, A. P., Raitano, G. & Benfenati, E. CORAL: Building up QSAR models for the chromosome aberration test. Saudi J. Biol. Sci. 26, 1101–1106 (2019).
Kumar, P., Kumar, A., Sindhu, J. & Lal, S. QSAR models for nitrogen containing monophosphonate and bisphosphonate derivatives as human farnesyl pyrophosphate synthase inhibitors based on Monte Carlo method. Drug Res. 69, 159–167 (2019).
Ahmadi, S. Mathematical modeling of cytotoxicity of metal oxide nanoparticles using the index of ideality correlation criteria. Chemosphere 242, 125192 (2020).
Lotfi, S., Ahmadi, S. & Zohrabi, P. QSAR modeling of toxicities of ionic liquids toward Staphylococcus aureus using SMILES and graph invariants. Struct. Chem. 31, 2257–2270 (2020).
Jafari, K., Fatemi, M. H., Toropova, A. P. & Toropov, A. A. Correlation intensity index (CII) as a criterion of predictive potential: Applying to model thermal conductivity of metal oxide-based ethylene glycol nanofluids. Chem. Phys. Lett. 754, 137614 (2020).
Toropova, A. P., Toropov, A. A., Roncaglioni, A. & Benfenati, E. The system of self-consistent models for vapour pressure. Chem. Phys. Lett. 790, 139354 (2022).
Kumar, P. & Kumar, A. Correlation intensity index (CII) as a benchmark of predictive potential: Construction of quantitative structure activity relationship models for anti-influenza single-stranded DNA aptamers using Monte Carlo optimization. J. Mol. Struct. 1246, 131205 (2021).
Kumar, P., Kumar, A. & Singh, D. CORAL: Development of a hybrid descriptor based QSTR model to predict the toxicity of dioxins and dioxin-like compounds with correlation intensity index and consensus modelling. Environ. Toxicol. Pharmacol. 93, 103893 (2022).
Kumar, P. et al. CORAL: Quantitative structure retention relationship (QSRR) of flavors and fragrances compounds studied on the stationary phase methyl silicone OV-101 column in gas chromatography using correlation intensity index and consensus modelling. J. Mol. Struct. 1265, 133437 (2022).
Kumar, A., Kumar, P. & Singh, D. QSRR modelling for the investigation of gas chromatography retention indices of flavour and fragrance compounds on Carbowax 20 M glass capillary column with the index of ideality of correlation and the consensus modelling. Chemom. Intell. Lab. Syst. 224, 104552 (2022).
Duhan, M. et al. Quantitative structure activity relationship studies of novel hydrazone derivatives as α-amylase inhibitors with index of ideality of correlation. J. Biomol. Struct. Dyn. 40, 4933–4953 (2022).
Toropov, A. A. & Toropova, A. P. The index of ideality of correlation: A criterion of predictive potential of QSPR/QSAR models?. Mutation Res./Genet. Toxicol. Environ. Mutagenesis 819, 31–37 (2017).
Toropov, A. A. & Toropova, A. P. Use of the index of ideality of correlation to improve predictive potential for biochemical endpoints. Toxicol. Mech. Methods 29, 43–52 (2019).
Kumar, P., Kumar, A. & Sindhu, J. Design and development of novel focal adhesion kinase (FAK) inhibitors using Monte Carlo method with index of ideality of correlation to validate QSAR. SAR QSAR Environ. Res. 30, 63–80 (2019).
Kumar, P. & Kumar, A. Unswerving modeling of hepatotoxicity of cadmium containing quantum dots using amalgamation of quasiSMILES, index of ideality of correlation, and consensus modeling. Nanotoxicology 15, 1199–1214. https://doi.org/10.1080/17435390.2021.2008039 (2021).
Kumar, A. & Kumar, P. Prediction of power conversion efficiency of phenothiazine-based dye-sensitized solar cells using Monte Carlo method with index of ideality of correlation. SAR QSAR Environ. Res. 32, 817–834. https://doi.org/10.1080/1062936X.2021.1973095 (2021).
Kumar, A. & Kumar, P. Cytotoxicity of quantum dots: Use of quasiSMILES in development of reliable models with index of ideality of correlation and the consensus modelling. J. Hazard Mater 402, 123777. https://doi.org/10.1016/j.jhazmat.2020.123777 (2021).
Kyaw Zin, P. P., Borrel, A. & Fourches, D. Benchmarking 2D/3D/MD-QSAR models for imatinib derivatives: How far can we predict?. J. Chem. Inf. Model. 60, 3342–3360 (2020).
Gaulton, A. et al. ChEMBL: A large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107 (2012).
Kumar, A. & Kumar, P. Identification of good and bad fragments of tricyclic triazinone analogues as potential PKC-θ inhibitors through SMILES-based QSAR and molecular docking. Struct. Chem. 32, 149–165 (2021).
Ahmadi, S., Ketabi, S. & Qomi, M. CO 2 uptake prediction of metal–organic frameworks using quasi-SMILES and Monte Carlo optimization. New J. Chem. 46, 8827–8837 (2022).
Toropova, A. P. & Toropov, A. A. QSPR and nano-QSPR: What is the difference?. J. Mol. Struct. 1182, 141–149 (2019).
Toropova, A. P., Toropov, A. A., Benfenati, E., Leszczynska, D. & Leszczynski, J. Prediction of antimicrobial activity of large pool of peptides using quasi-SMILES. BioSystems 169, 5–12 (2018).
Kumar, P. & Kumar, A. CORAL: QSAR models of CB1 cannabinoid receptor inhibitors based on local and global SMILES attributes with the index of ideality of correlation and the correlation contradiction index. Chemometr. Intelligent Lab. Syst. 200, 103982 (2020).
Lotfi, S., Ahmadi, S. & Kumar, P. A hybrid descriptor based QSPR model to predict the thermal decomposition temperature of imidazolium ionic liquids using Monte Carlo approach. J. Mol. Liq. 338, 116465 (2021).
Lotfi, S., Ahmadi, S. & Kumar, P. The Monte Carlo approach to model and predict the melting point of imidazolium ionic liquids using hybrid optimal descriptors. RSC Adv. 11, 33849–33857 (2021).
Jaworska, J., Nikolova-Jeliazkova, N. & Aldenberg, T. QSAR applicability domain estimation by projection of the training set in descriptor space: A review. Altern. Lab. Anim. 33, 445–459 (2005).
Toropov, A. A. & Toropova, A. P. The correlation contradictions index (CCI): Building up reliable models of mutagenic potential of silver nanoparticles under different conditions using quasi-SMILES. Sci. Total Environ. 681, 102–109 (2019).
Ahmadi, S. & Akbari, A. Prediction of the adsorption coefficients of some aromatic compounds on multi-wall carbon nanotubes by the Monte Carlo method. SAR QSAR Environ. Res. 29, 895–909 (2018).
Ahmadi, S., Lotfi, S. & Kumar, P. A Monte Carlo method based QSPR model for prediction of reaction rate constants of hydrated electrons with organic contaminants. SAR QSAR Environ. Res. 31, 935–950 (2020).
Roy, K., Das, R. N., Ambure, P. & Aher, R. B. Be aware of error measures. Further studies on validation of predictive QSAR models. Chemom. Intell. Lab. Syst. 152, 18–33 (2016).
Chirico, N. & Gramatica, P. Real external predictivity of QSAR models: how to evaluate it? Comparison of different validation criteria and proposal of using the concordance correlation coefficient. J. Chem. Inf. Model. 51, 2320–2335 (2011).
Ahmadi, S., Lotfi, S. & Kumar, P. Quantitative structure–toxicity relationship models for predication of toxicity of ionic liquids toward leukemia rat cell line IPC-81 based on index of ideality of correlation. Toxicol. Mech. Methods 32, 302–312 (2022).
Kumar, P. & Kumar, A. Nucleobase sequence based building up of reliable QSAR models with the index of ideality correlation using Monte Carlo method. J. Biomol. Struct. Dyn. 38, 3296–3306. https://doi.org/10.1080/07391102.2019.1656109 (2020).
Ahmadi, S., Toropova, A. P. & Toropov, A. A. Correlation intensity index: Mathematical modeling of cytotoxicity of metal oxide nanoparticles. Nanotoxicology 14, 1118–1126 (2020).
Sokolović, D. et al. Monte Carlo-based QSAR modeling of dimeric pyridinium compounds and drug design of new potent acetylcholine esterase inhibitors for potential therapy of myasthenia gravis. Struct. Chem. 27, 1511–1519 (2016).
Golbraikh, A. & Tropsha, A. Beware of q²! J. Mol. Graph. Model. 20, 269–276 (2002).
Roy, P. P. & Roy, K. QSAR studies of CYP2D6 inhibitor aryloxypropanolamines using 2D and 3D descriptors. Chem. Biol. Drug Des. 73, 442–455 (2009).
Roy, K. et al. Some case studies on application of “${r}_{m}^{2}$” metrics for judging quality of quantitative structure–activity relationship predictions: Emphasis on scaling of response data. J. Comput. Chem. 34, 1071–1082 (2013).
Ojha, P. K. & Roy, K. Comparative QSARs for antimalarial endochins: Importance of descriptor-thinning and noise reduction prior to feature selection. Chemom. Intell. Lab. Syst. 109, 146–161 (2011).

Author information

Authors and Affiliations

Department of Chemistry, East Tehran Branch, Islamic Azad University, Tehran, Iran
Hamideh Hamzehali
Department of Chemistry, Payame Noor University (PNU), Tehran, 19395-4697, Iran
Shahram Lotfi
Department of Pharmaceutical Chemistry, Faculty of Pharmaceutical Chemistry, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
Shahin Ahmadi
Department of Chemistry, Kurukshetra University, Kurukshetra, Haryana, 136119, India
Parvin Kumar

Contributions

H.H.: Drawing of structures, writing of the original draft. S.L.: Writing of the original draft, funding acquisition, supervision. S.A.: Visualization, model building, interpretation of models. P.K.: Writing, review and editing.

Corresponding author

Correspondence to Shahin Ahmadi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Table S1.

Supplementary Table S2.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Hamzehali, H., Lotfi, S., Ahmadi, S. et al. Quantitative structure–activity relationship modeling for predication of inhibition potencies of imatinib derivatives using SMILES attributes. Sci Rep 12, 21708 (2022). https://doi.org/10.1038/s41598-022-26279-8

Received: 11 July 2022
Accepted: 13 December 2022
Published: 15 December 2022
DOI: https://doi.org/10.1038/s41598-022-26279-8

This article is cited by

QSAR and molecular docking studies of isatin and indole derivatives as SARS 3CLpro inhibitors
- Niousha Soleymani
- Shahin Ahmadi
- Ali Almasirad
BMC Chemistry (2023)
In-silico activity prediction and docking studies of some flavonol derivatives as anti-prostate cancer agents based on Monte Carlo optimization
- Faezeh Tajiani
- Shahin Ahmadi
- Ali Almasirad
BMC Chemistry (2023)
Genetic descriptor search algorithm for predicting hydrogen adsorption free energy of 2D material
- Jaehwan Lee
- Seokwon Shin
- Youngdoo Son
Scientific Reports (2023)
