Abstract
Several machine learning approaches have been proposed for predicting new uses of existing drugs. Although these methods have introduced new usage(s) of some medications, more efficient methods can lead to more accurate predictions. To this end, we propose a novel machine learning method based on a new optimization algorithm named Trader. To show the capabilities of the proposed algorithm, which can be applied to different fields of science, it was compared with ten other state-of-the-art optimization algorithms on standard and advanced benchmark functions. Next, a multi-layer artificial neural network was designed and trained by Trader to predict drug-target interactions (DTIs). Finally, the functionality of the proposed method was investigated on some DTI datasets and compared with other methods. The results obtained by Trader show that it eliminates the disadvantages of different optimization algorithms, resulting in a better outcome. Further, the proposed machine learning method achieved a significant level of performance compared with other popular and efficient approaches in predicting unknown DTIs. All the implemented source codes are freely available at https://github.com/LBBSoft/Trader.
Introduction
A drug is a substance, other than a nutrient, that imposes a temporary and/or long-term physiological impact on the body. Based on their mechanisms of action and therapeutic properties, drugs can be categorized under schemes such as the anatomical therapeutic chemical (ATC) classification and the biopharmaceutics classification system (BCS). Because of their importance and critical efficacies, many researchers have proposed various methods for the design of a drug1. Nonetheless, the design of a new drug is a very costly and time-consuming process, which takes over 15 years. Also, many drug discovery and development projects fail, in large part because of the rigorous controls during the drug development phases. Hence, researchers have sought other approaches for the treatment of diseases, such as drug repurposing, a cost- and time-effective strategy that offers new uses for existing drugs. Several computational approaches have been suggested for the repurposing of medications. They can be categorized into the following classes:
- i) Molecular docking methods: These methods look for ligands that can bind to proteins based on their multi-dimensional structures and are the most popular approaches in the drug repositioning field2. However, they cannot be used if the multi-dimensional structure of a protein or a ligand is unknown.
- ii) Metabolic pathway-based methods: These procedures are usually used for treating orphan or rare diseases. First, the metabolic pathways related to the disease are identified. Next, drugs that can affect these pathways are investigated3 and, if they qualify, introduced as treatments. Since the metabolic pathways of many orphan and rare diseases have not been determined, these methods have a low success rate.
- iii) Connectivity-MAP (CMAP) methods: These approaches, which deal with large amounts of genomic data, are used to discover relationships between diseases and genes4. Their main limitation is that differences in cell lines, platforms, etc. make the data inconsistent.
- iv) Data-mining methods: These methods, which include procedures such as text mining and machine learning, are the most powerful ones for finding novel uses of drugs. Because they act on existing data, they increase the success rate of drug repositioning, and many researchers have adopted them5. Nevertheless, the validity of the acquired results remains a primary challenge.
The existing machine learning methods might achieve acceptable results. However, more effective approaches yield better predictions. To this end, we propose an improved machine learning method that predicts drug-target interactions (DTIs) and falls into the fourth of the categories above. The proposed method (called ANNTR) is a multi-layer artificial neural network trained by a novel optimization algorithm called “Trader”, which yields a model with a higher predictive ability. Besides introducing an improved machine learning approach for predicting DTIs, two other facts motivated us to introduce the Trader optimization algorithm. First, an efficient algorithm that eliminates the limitations of existing optimization algorithms and can be applied to different fields such as engineering, biology, and computer science is useful and essential. Second, a comprehensive comparison of optimization algorithms can determine their actual performance in real-world use.
Related Works
Our proposed method, which combines an artificial neural network with the Trader optimization algorithm (ANNTR), falls into the data-mining class of drug repositioning and focuses on predicting DTIs. This section reviews the related literature from the data-mining viewpoint. The studies are categorized into six classes, as follows:
- i) Learner-based methods: In these studies, learners such as deep learning6,7, support vector machines8,9,10,11, regression algorithms12, K-nearest neighbors13, rotation forest11, and relevance vector machines14 find the relationships between input and output using labeled datasets. The acquired model is evaluated and applied to predict unknown DTIs. Since every learner uses a different method for separating samples, their results differ from each other. The biggest weakness of these works is that they generate negative datasets and build a model on them. The error rate therefore rises, because a drug-target pair in the generated negative dataset may in fact interact. To tackle this restriction, one-class classification approaches can be used15. Although the results reported in the related literature are acceptable, their accuracy remains limited. To enhance the prediction accuracy, we introduce an efficient machine learning method based on a new optimization algorithm, called “Trader”, and an artificial neural network.
- ii) Network-based methods: These works model drugs and their various targets (genes, proteins, enzymes, metabolic pathways, etc.) as a network and then analyze it to obtain new information. In a series of related works, the designed network is examined by algorithms such as random walk16,17 and random forest18. Unlike the first class of related works, which depends on a negative dataset19, the second class only considers existing information. As a result, the error of the second class is lower than that of the first. Nevertheless, the performance of the first class is higher than that of the second.
- iii) Prioritization-based methods: These studies calculate drug-drug, network-network, or target-target similarities. The candidates are ranked by the acquired scores, and the top-ranked drugs are suggested for treating diseases. To compute the scores, chemical information of drugs, topological information of networks, and sequence information of targets are examined20. Different studies indicate that similarity is not the only determinant factor in drug repositioning. Hence, the false positive rates of prioritization-based methods are high. To overcome this restriction, some studies integrate different types of information before calculating the similarity scores21.
- iv) Mathematics and probabilistic-based methods: These studies formulate the problem as a graph and then mine it to obtain new information22. They run into difficulties when there are orphan nodes in the generated graph. To deal with this constraint, a matrix regulation and factorization method may be useful23.
- v) Ensemble-based methods: A proper combination of machine learning methods has been shown to usually lead to better results in computer science problems. Inspired by this idea, some researchers have predicted DTIs using a combination of the above-mentioned classes24,25,26. Although these methods enhance the separability power of a drug-target predictor, they increase the error rate and inherit the disadvantages of the combined methods.
- vi) Review-based approaches: A large number of drug-target prediction studies are review articles that investigate the problem from various viewpoints such as applied tools27, methods28, databases, and software applications29. These articles usually discuss the advantages and disadvantages of proposed methods and give directions for future work30.
Methods and Materials
Preparing the datasets
We integrate chemical and genomic spaces and gather information about drugs and targets as a dataset, similar to the work carried out by Yamanishi et al.31. The targets are divided into four classes: enzymes (EN), ion channel proteins (IC), G-protein coupled receptors (GP), and nuclear receptor proteins (NR). The datasets are prepared in the following steps:
- i) The chemical information on drugs and ligands is obtained from the KEGG DRUG and KEGG LIGAND databases32. Then, the similarity scores between drugs are calculated by Eq. (1) 14. For this purpose, the pharmacological effects of medications on 17109 molecular properties are taken into consideration.
$${\rm{SIM}}(D,D^{\prime})=\frac{\sum_{i=1}^{n}W_{i}F_{i}F^{\prime}_{i}}{\sqrt{\sum_{i=1}^{n}W_{i}F_{i}^{2}}\sqrt{\sum_{i=1}^{n}W_{i}{F^{\prime}_{i}}^{2}}}$$ (1)
where n is the total number of molecular properties (17109), Fi and F′i are the ith molecular features of the drugs D and D′, SIM is the similarity score between D and D′, and Wi is the weight of Fi, calculated by Eq. (2) 14.
$$W_{i}=\exp (-d_{i}^{2}/(\sigma^{2}h^{2}))$$ (2)
where di is the frequency of the ith feature, σ is the standard deviation of dk (k = 1 through n), and h is a constant set to 0.1. Using Eq. (1), a matrix of effect similarity scores for every pair of drugs is created.
- ii) Amino acid sequences of protein targets are obtained from the DrugBank33 and KEGG GENE databases. Further, we have developed an integrated relational database named DrugR+34 (http://www.drugr.ir), which contains all data of DrugBank and some data of KEGG. Next, the similarity score between every pair of targets is computed by the normalized Smith-Waterman alignment scoring method35, and a matrix of target-target similarity scores is generated.
- iii) The interaction information between drugs and targets is obtained from the DrugR+ database.
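The drug-drug similarity of Eqs. (1) and (2) can be sketched in a few lines of Python. This is only an illustration, not the authors' MATLAB implementation: the function names `feature_weights` and `drug_similarity` are hypothetical, the standard deviation is taken as the population standard deviation, and the fallback for a zero standard deviation is an assumption that the text does not specify.

```python
import math

def feature_weights(freq, h=0.1):
    """Eq. (2): W_i = exp(-d_i^2 / (sigma^2 * h^2)), with d_i the feature frequency."""
    n = len(freq)
    mean = sum(freq) / n
    # Population variance of the frequencies (assumption: the text does not say
    # whether the population or the sample standard deviation is meant).
    sigma2 = sum((d - mean) ** 2 for d in freq) / n
    if sigma2 == 0:
        return [1.0] * n  # degenerate case, handled by assumption
    return [math.exp(-(d * d) / (sigma2 * h * h)) for d in freq]

def drug_similarity(f, f_prime, w):
    """Eq. (1): weighted cosine similarity between two drug feature vectors."""
    num = sum(wi * a * b for wi, a, b in zip(w, f, f_prime))
    den = (math.sqrt(sum(wi * a * a for wi, a in zip(w, f)))
           * math.sqrt(sum(wi * b * b for wi, b in zip(w, f_prime))))
    return num / den if den > 0 else 0.0
```

With these weights, a feature that occurs rarely (small di) contributes more to the similarity than a frequent one, which is the usual motivation for this kind of weighting.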
For every target type, a dataset is created by the pseudocode presented in Fig. 1. These datasets can be used as gold standard datasets by researchers who want to predict the interactions between drugs and targets using machine learning approaches. The attributes of the generated datasets are shown in Table 1.
The machine learning approach
Our proposed method, whose framework is depicted in Fig. 2, creates a prediction model using a multi-layer perceptron (MLP) artificial neural network (ANN) with two hidden layers. The generated datasets are divided into two sets, including (i) training and (ii) testing sets. For all the generated datasets, the ANN is trained by Trader in which every candidate solution consists of 38 variables. There are 8, 3, 2, and 1 neurons in the input, the first hidden, the second hidden, and the output layers, respectively. In the ANN, all the neurons of a layer are connected to all the neurons of the next layer, and hence, the total number of synapses or ANN’s edges is 8*3 + 3*2 + 2*1 = 32. Moreover, since there are six biases which are specified in Fig. 2, the total number of variables will be 32 + 6 = 38 in a potential answer. In this problem, the objective function is considered as root mean square error (RMSE) which is computed by Eq. (3):
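$${\rm{RMSE}}=\sqrt{\frac{1}{S}\sum_{i=1}^{S}{(P_{i}-O_{i})}^{2}}$$ (3)
where S is the total number of samples, and Pi and Oi are the predicted and real-world values of the ith sample, respectively.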
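To make the variable count and the objective function concrete, the following minimal Python sketch decodes a 38-variable candidate solution into an 8-3-2-1 network and evaluates the RMSE. Several details are assumptions, since the text does not state them: the sigmoid activation, the ordering of weights and biases inside the vector, and the names `n_variables`, `predict`, and `rmse`.

```python
import math

LAYERS = (8, 3, 2, 1)  # input, first hidden, second hidden, output

def n_variables(layers=LAYERS):
    """Edges between consecutive layers plus one bias per non-input neuron."""
    edges = sum(a * b for a, b in zip(layers, layers[1:]))
    biases = sum(layers[1:])
    return edges + biases

def predict(cs, x, layers=LAYERS):
    """Forward pass; cs stores, per layer, all weights and then all biases (assumed layout)."""
    pos, out = 0, list(x)
    for a, b in zip(layers, layers[1:]):
        nxt = []
        for j in range(b):
            w = cs[pos + j * a: pos + (j + 1) * a]
            z = sum(wi * oi for wi, oi in zip(w, out)) + cs[pos + a * b + j]
            z = max(-60.0, min(60.0, z))            # avoid overflow in exp
            nxt.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid (assumption)
        pos += a * b + b
        out = nxt
    return out[0]

def rmse(predicted, observed):
    """Root mean square error over S samples, as in Eq. (3)."""
    s = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / s)
```

An optimizer such as Trader would then minimize `rmse([predict(cs, x) for x in X], y)` over the 38 entries of `cs`.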
Trader optimization algorithm
Our proposed algorithm, Trader, is inspired by the intelligent behavior of traders who seek more profit and property through operations such as retailing, importing, exporting, and many other activities. The flowchart of Trader is shown in Fig. 3. Trader consists of several steps, described as follows:
- i) Creating the first population of candidate solutions: Like other optimization algorithms, Trader starts with some potential answers, each of which consists of several variables and can be considered as an array. Equation (4) shows a candidate solution (CS) with n variables:
$${\rm{Variable}}=\{v_{1},\,v_{2},\,\ldots ,\,v_{n},\,G\}$$ (4)
where vi is the ith variable and G is the group of the CS, which belongs to a trader. The groups are not specified at the beginning of the algorithm. For the drug repurposing problem, a CS determines the weights of the ANN’s edges and its biases, and each variable corresponds to one of them. Therefore, the total number of variables equals the number of edges plus the number of biases of the ANN (38 in this work).
- ii) Calling the objective function: After creating the first population of CSs, the worthiness (fitness) of each CS is calculated by an objective function (OF), which is defined according to the nature of the problem. For example, when training an artificial neural network, the fitness of a CS is the error value computed by Eq. (3).
- iii) Grouping the candidate solutions: The groups are constituted based on the number of traders and their properties. At the start of the algorithm, all the traders have the same property, which is updated during the algorithm’s iterations. Equation (5) is used to calculate the number of CSs devoted to a specific trader (a group):
$${\rm{NB}}_{i}=2+{\rm{round}}\left(\frac{P_{i}}{\sum_{j=1}^{T}P_{j}}\times (C-2\times T)\right)$$ (5)
where NBi is the number of CSs assigned to the ith trader (group), Pi is the property of the ith trader, C is the number of existing CSs, and T is the number of traders. The constant 2 guarantees that no trader or group is eliminated during the iterations and that at least two CSs remain in every group. Figure S2 illustrates an example of the competition among traders for getting the CSs.
- iv) Changing the candidate solutions: After the candidate solutions are grouped, the best CS of each group, named the Master CS, is selected, and its variable values are distributed to the other CSs of the group, named Slave CSs, using Eq. (6):
$$\sum_{j=1}^{C_{k}}\left(\sum_{i=1}^{R}\left(CS\_slave\_j(rand(n))=CS\_master\_k(rand(n))\right)\right)$$ (6)
where n is the total number of variables in a CS, R is a random integer in [1, n], Ck is the number of CSs of the kth group, CS_slave_j is the jth Slave CS of the kth group, and CS_master_k is the Master CS of the kth group. If the changes made by Eq. (6) increase the value of the OF (RMSE), they are discarded; otherwise, they are accepted. In addition to Eq. (6), which helps the Slave CSs improve their OF value, another operator changes the Slave CSs based on their own contents, using Eq. (7):
$$\sum_{i=1}^{R}\left(CS_{slave(M)}=CS_{slave(M)}+k\times rand(CS_{slave(M)})\right)$$ (7)
where R is a random integer between 1 and n/10, M is a random integer between 1 and n, CSslave is a Slave CS, and k is chosen arbitrarily as either 1 or −1. As with the previous operator, the changes are accepted only if they improve the value of the OF. Unlike Eqs (6) and (7), which only change Slave CSs, Eq. (8) alters Master CSs by exchanging variable values among them: some values of the Master CS of another group are randomly chosen and imported into the selected Master CS.
$$\sum_{j=1}^{T}\sum_{i=1}^{R}\left(CS\_master\_j(rand(n))=CS\_master\_k(rand(n))\right)$$ (8)
where R is a random integer between 1 and n, j and k indicate the importer and exporter groups, CS_master_j is the Master CS of the importer group, and CS_master_k is the Master CS of the exporter group. The value of k is calculated by Eq. (9):
$${\rm{K}}=\{a\,|\,a\ne j\ {\rm{and}}\ a\ {\rm{is\ a\ random\ integer\ value\ in}}\ 1\le a\le n\}$$ (9)
Like the other operators of Trader, the changes induced by Eq. (8) are accepted only if the imported values improve the value of the OF. Through Eqs (6) to (8), the weights of the ANN’s edges are altered and a new drug-target predictor is acquired; the weight changes are admitted only if the new predictor reduces the RMSE (Eq. (3)). Figures S3 through S5 illustrate how the changes are applied to the CSs.
- v) Updating property: The operators of Trader, given by Eqs (6) through (8), may change the CSs. Hence, the total OF value of a group, computed by Eq. (10), varies, and the property of the groups must be updated accordingly.
$${\rm{Property}}_{i}=\left\{\sum_{j=1}^{B}OF(j)\,|\,CS(j,\,G)=G_{i}\right\}$$ (10)
where Propertyi is the property of the ith trader (group), B is the number of CSs, Gi is the ith group, and CS(j, G) is the group to which the jth CS belongs.
- vi) Termination condition: Like other optimization algorithms, Trader can use any of the following as its termination condition: (i) reaching a predefined number of iterations; (ii) reaching a specified accuracy or error value; (iii) exceeding a certain amount of time; (iv) stabilization of the best answer over recent iterations. For training the ANN, a predefined number of iterations was selected as the termination condition.
- vii) Selecting the best answer: When the termination condition is satisfied, the CS with the best OF value is selected and introduced as the solution to the problem. For the DTI prediction problem, the CS with the minimum RMSE is chosen to forecast unknown DTIs. Figure 4 shows the pseudocode of Trader.
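The steps above can be condensed into a minimal Python sketch for a minimization problem. It is a reading of Eqs. (5) through (10) under stated assumptions, not the authors' MATLAB code. The names `allocate` and `trader` are hypothetical, and the assumptions are: CSs are assigned to groups at random; both `rand(n)` calls in Eqs. (6) and (8) refer to the same variable position; Eq. (7) perturbs a variable by a random fraction of its current value; the exporter group of Eq. (9) is drawn among the T groups; rounding drift in Eq. (5) is repaired by adjusting the largest group; the property follows Eq. (10) literally; and the OF is non-negative, as RMSE is.

```python
import random

def allocate(props, C):
    """Eq. (5): number of CSs per trader, at least two each (requires C >= 2*T)."""
    T, total = len(props), sum(props)
    shares = [p / total if total > 0 else 1.0 / T for p in props]
    nb = [2 + round(s * (C - 2 * T)) for s in shares]
    while sum(nb) != C:  # repair rounding drift (assumption)
        nb[nb.index(max(nb))] += 1 if sum(nb) < C else -1
    return nb

def trader(of, n, lo, hi, C=30, T=5, iters=100, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for _ in range(n)] for _ in range(C)]
    fit = [of(cs) for cs in pop]
    props = [1.0] * T  # all traders start with the same property

    def accept_if_better(i, cand):
        f = of(cand)
        if f < fit[i]:  # changes are kept only if they improve the OF
            pop[i], fit[i] = cand, f

    def import_values(dst, src, r):
        cand = pop[dst][:]
        for _ in range(r):
            v = rng.randrange(n)
            cand[v] = pop[src][v]
        accept_if_better(dst, cand)

    for _ in range(iters):
        order = list(range(C))
        rng.shuffle(order)  # random group membership (assumption)
        groups, start = [], 0
        for size in allocate(props, C):
            groups.append(order[start:start + size])
            start += size
        masters = [min(g, key=lambda i: fit[i]) for g in groups]
        for g, m in zip(groups, masters):
            for j in g:
                if j == m:
                    continue
                import_values(j, m, rng.randint(1, n))  # Eq. (6)
                cand = pop[j][:]                        # Eq. (7)
                for _ in range(rng.randint(1, max(1, n // 10))):
                    v = rng.randrange(n)
                    cand[v] += rng.choice((-1, 1)) * rng.random() * cand[v]
                accept_if_better(j, cand)
        for j in range(T):                              # Eq. (8)
            k = rng.choice([a for a in range(T) if a != j])  # Eq. (9)
            import_values(masters[j], masters[k], rng.randint(1, n))
        props = [sum(fit[i] for i in g) for g in groups]     # Eq. (10)
    best = min(range(C), key=lambda i: fit[i])
    return pop[best], fit[best]
```

For example, `trader(lambda v: sum(x * x for x in v), n=5, lo=-5.0, hi=5.0)` minimizes a five-dimensional sphere function. Because every operator accepts a change only when it lowers the OF, the best value in the population never gets worse from one iteration to the next.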
Results
The proposed machine learning approach has been implemented in the MATLAB programming language, and all the implemented source codes are available at https://github.com/LBBSoft/Trader. This section contains three categories of results, as follows:
Trader in comparison with the other optimization algorithms
Besides Trader, ten state-of-the-art optimization algorithms (PSO36, WCC37, TGA38, TE39, EPO40, ION41, VIR42, DVBA43, HTS44, and CEFOA45) were implemented. These algorithms were then applied to 20 benchmark functions used in the studies that introduced them. These standard test functions, listed in Table S1 (Supplementary File), are categorized into unimodal, multimodal, fixed-dimension, expanded, penalized, and hybrid categories. Since optimization algorithms produce variable results across executions, it is recommended that they be executed at least 30 times on a given problem and that the best result obtained be reported46. Hence, each of the above-mentioned algorithms was executed 50 times independently on the high-dimensional benchmark functions. Further, the algorithms were executed under similar conditions, such as the number of iterations and the number of OF calls, and their parameters were set so as to maximize their performance. The algorithms were evaluated by criteria such as convergence and stability of the acquired results. Figure 5 shows the convergence of the algorithms on the test functions for their best result over the 50 executions. For functions with similar convergence behavior, the average outcomes were drawn; for example, the results of F11 and F12 were merged into one plot. The convergence of the algorithms on each individual benchmark function is presented in Fig. S6 (Supplementary File).
The acquired results show the following:
- i) Trader, TGA, TE, and EPO converge faster than the others and can obtain better results. However, EPO converges more slowly than TGA, Trader, and TE in the early steps; its convergence depends on the fourth quarter of its iterations, in which the range of variables becomes smaller and smaller (Fig. 5a), so it gains convergence speed in the last quarter. TE, TGA, and ION use a method similar to that of EPO and narrow the range of variables as the iterations proceed; they therefore produce better results for some special problems such as F1 through F9.
- ii) EPO, TGA, and ION cannot produce the desired results for some problems such as F11 and F12 (Fig. 5b,c). The other algorithms outperform these three and acquire better results when their number of iterations is increased.
- iii) For the small-sized benchmark functions such as F17 through F20, the algorithms perform similarly, and all of them can obtain the optimal solution (Fig. 5d).
- iv) VIR, HTS, WCC, and CEFOA converge more slowly than the other algorithms on some of the test functions (Fig. 5a). Nonetheless, they can acquire acceptable results when the allocated time or the number of iterations is increased, whereas EPO, TE, and TGA cannot, because they fall into local optima.
For an accurate evaluation of the algorithms, we summarized their results over 50 distinct executions in Tables S2 through S4 (Supplementary File), to two decimal places, using the one-way ANOVA test. We also provide Table 2, which lists the P-values of the algorithms compared with Trader as the baseline and shows that the null hypothesis can be strongly rejected. These P-values were obtained with the Wilcoxon rank sum test, which indicates how similar the generated results are47.
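As an illustration of how such a P-value can be obtained from two sets of run results, the following standard-library Python sketch implements the Wilcoxon rank sum test with the normal approximation. It is a simplified version: tied values receive average ranks, no tie correction is applied to the variance, and the function name `rank_sum_test` is hypothetical. It is not the code used to produce Table 2.

```python
import math

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank sum test (normal approximation, no tie correction)."""
    pooled = sorted((v, g) for g, s in enumerate((a, b)) for v in s)
    rank_sum_a, i = 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of the ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_a - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided P-value
    return z, p
```

Here `a` and `b` would hold, for example, the final OF values of Trader and of one competing algorithm over their 50 executions on a benchmark function; a small P-value indicates that the two sets of results are unlikely to come from the same distribution.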
From the viewpoint of the average and standard deviation (Table 3), Trader performs well, but its results are close to those of EPO, TGA, and TE for the test functions F1 through F9. However, these three algorithms are only suitable for problems whose optimal answer is 0, because of the nature of their operators. For this reason, their performance is the same for all of the benchmark functions. From the standard deviation (STD) aspect, HTS is the best algorithm and the best option when the range of variables in a problem is small.
The proposed machine learning method against the others
In the second part of this section, the performance of the proposed method (ANNTR) is evaluated on four gold standard datasets48 and compared against three state-of-the-art methods: the rotation forest-based drug-target (RFDT) predictor11, the Bayesian (BAY) ranking-based method22, and a relevance vector machine-based method14 (RVM). The datasets, named Enzyme, Ion channel, G-protein, and Nuclear receptor, consist of 4,449, 2,029, 1,268, and 168 DTI samples, respectively. The samples are marked with positive and negative labels, which show whether a given drug and target interact or not. The acquired results, shown in Table 4, indicate that the proposed method outperforms the other methods overall. For every criterion on the datasets, the best outcome is marked in boldface.
Figures 6 and 7 show the receiver operating characteristic (ROC) and precision-recall (PR) curves, respectively, based on a 5-fold cross-validation test. These figures also report the area under the curve (AUC), which compares the performance of the methods on the datasets. Except on the enzyme dataset, ANNTR achieves better results than the three other methods. The proposed method obtains average AUC values of 0.9457 and 0.9708 for the ROC and PR curves, respectively, whereas RFDT, RVM, and BAY obtain average AUC values of 0.8736, 0.9216, and 0.9215 for the ROC and 0.9248, 0.9581, and 0.9634 for the PR, respectively (Figs 6 and 7).
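For readers unfamiliar with the ROC AUC, it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The short Python sketch below computes it directly from this definition; the function name `roc_auc` and the 0/1 label encoding are assumptions, and the quadratic-time loop is meant for clarity rather than for large datasets.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # A correctly ordered pair counts 1, a tie counts 0.5, a reversed pair counts 0.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

A value of 1.0 means the predictor ranks every interacting pair above every non-interacting pair, and 0.5 corresponds to random ranking.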
The acquired results on the generated datasets
In the third part of the results, we investigated the performance of Trader on the generated DTI datasets (Table 1). On the testing datasets, the proposed method correctly predicted 751 of the 800 DTI samples. For a detailed evaluation, we compared the proposed method with three other popular and efficient classification methods: the support vector machine (SVM), the decision tree (DT), and the artificial neural network trained by the error back propagation method (ANNEBP)15. The acquired results are shown in Table 5. Since the datasets relate to known DTIs, the problem is considered to be a one-class classification problem. Thus, true positive and false positive rates are reported in Table 5.
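As a quick check of what the reported count implies, and assuming that all 800 test samples are known (positive) interactions, so that every missed sample is a false negative, the overall true positive rate is:

$${\rm{TPR}}=\frac{TP}{TP+FN}=\frac{751}{800}\approx 0.939$$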
As reported in Table 5, ANNTR displays a higher DTI detection capability than the others. For a comprehensive evaluation of the methods, we used a 10-fold cross-validation test in which a dataset is divided into ten distinct sets. In each of the 10 iterations, nine sets are used as the training set and the remaining one is used as the test set. The convergence behavior of Trader on the generated datasets is shown in Fig. 8.
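A k-fold cross-validation split of this kind can be sketched as follows. In each iteration one fold serves as the test set and the remaining k − 1 folds form the training set. The function name `k_fold_indices`, the shuffling, and the seed are illustrative assumptions rather than details taken from the original implementation.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

Averaging a metric such as the true positive rate over the k test folds gives the cross-validated estimate.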
Furthermore, we generated the dataset of all potential drug-target interactions using the pseudocode in Fig. 1, resulting in 119,743 records. We then applied the obtained models to this dataset, and ANNTR predicted 47 new DTIs (Table 6).
Discussion
The proposed machine learning method, ANNTR, which is based on the new optimization algorithm, was developed and compared with well-known and efficient machine learning methods, and the acquired results were analyzed. Although many optimization algorithms have been proposed, they suffer from some limitations. Our proposed algorithm, which eliminates the shortcomings of other algorithms, behaves much more stably than the others and trains the ANN appropriately. The findings also show that Trader, VIR, HTS, DVBA, CEFOA, ION, PSO, and WCC perform better than TGA, TE, and EPO. The main reasons are as follows: (i) Their operators make no assumptions about the optimal answer to a problem, whereas TGA, TE, and EPO include operators that make the range of variables smaller and smaller and can therefore reach optimal answers faster for some of the problems. (ii) Their behavior is almost the same across the different benchmark functions, whereas TE, EPO, and TGA fall into local optima on some test functions.
As a case study, we applied the proposed ANNTR method to some biological datasets and to all the potential DTI data to find drugs that may affect targets, and 47 DTIs were discovered. The predicted results can be used in three ways. First, they propose some unknown DTIs. If a disease is due to the intended target, the related drug can be introduced as a treatment option. Further, the side effects of drugs can be determined by investigating the predicted relation between a drug and a target. Second, they show that some medications, such as D00086 and D00145, have an identical predicted target; they may therefore have similar functionality and could be used as alternatives. These results can also be useful to chemical pharmacists who look for novel potential efficacies of drugs and to researchers who want to validate their predicted outcomes. Third, the predicted DTIs might reveal the real mechanisms of action (MOA) of drugs49, which show the pharmacological effect(s) of a drug.
For example, we predicted that diazoxide interacts with the angiotensin-I-converting enzyme (ACE). An investigation of the clinical impact of diazoxide and ACE, together with some previous studies, reveals that diazoxide can be used for the treatment of severe hypertension50, while ACE is responsible for controlling blood pressure51. Therefore, for the first time, we have shown that diazoxide can affect ACE, whereas the MOA of diazoxide has been reported differently by others52. Similarly, ketotifen, which is used to reduce the effects of allergic conjunctivitis, can also interact with ACE. Likewise, in a study conducted by Sanchez-Patan et al.53, ketotifen was shown to decrease hypertension in rats.
Another example is erlotinib, which is used for treating epithelial lung cancer. Our proposed method predicted that erlotinib interacts with muscle, skeletal, receptor tyrosine kinase (MuSK), whose antibodies are found in neuromuscular diseases. Such diseases lead to various phenotypes such as less eye involvement, weakness, and pain in the neck. Some studies on the side effects of erlotinib54 may validate the predicted interaction.
Conclusion
A new optimization algorithm, named Trader, was introduced and compared with ten state-of-the-art optimization algorithms based on various statistical criteria. The results show that Trader outperforms the other optimization algorithms and eliminates their limitations. As an empirical evaluation, we examined the performance of Trader in training a multi-layer perceptron artificial neural network to discover potential DTIs on the gold-standard and generated datasets. Under 5-fold cross-validation, the predictive model obtained with Trader achieved average accuracy, sensitivity, specificity, and precision of 94.62%, 94.24%, 94.80%, and 94.10%, respectively. These values are better than the results acquired from the other methods. Furthermore, the proposed method predicted 47 potential DTIs. We envision that the outcomes obtained by the proposed model may be used for managing possible side effects of medications, understanding the MOA of drugs, and finding new research opportunities. Taken together, this study may pave the way for de novo applications of computer-aided methods in drug discovery and development.
Data Availability
All the source codes and the datasets are available in the following link: https://github.com/LBBSoft/Trader.
References
1. Csermely, P., Korcsmáros, T., Kiss, H. J., London, G. & Nussinov, R. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacology & therapeutics 138, 333–408 (2013).
2. Luo, H., Mattes, W., Mendrick, D. L. & Hong, H. Molecular docking for identification of potential targets for drug repurposing. Current topics in medicinal chemistry 16, 3636–3645 (2016).
3. Wu, Z., Wang, Y. & Chen, L. Network-based drug repositioning. Molecular BioSystems 9, 1268–1281 (2013).
4. Qu, X. A. & Rajpal, D. K. Applications of Connectivity Map in drug discovery and development. Drug discovery today 17, 1289–1298 (2012).
5. Zhang, M., Luo, H., Xi, Z. & Rogaeva, E. Drug repositioning for diabetes based on 'omics' data mining. PloS one 10, e0126082 (2015).
6. You, J., McLeod, R. D. & Hu, P. Predicting Drug-Target Interaction Network Using Deep Learning Model. Computational Biology and Chemistry (2019).
7. Xie, L., He, S., Song, X., Bo, X. & Zhang, Z. Deep learning-based transcriptome data classification for drug-target interaction prediction. BMC genomics 19, 667 (2018).
8. Ho, Q.-T., Phan, D.-V. & Ou, Y.-Y. Using word embedding technique to efficiently represent protein sequences for identifying substrate specificities of transporters. Analytical Biochemistry (2019).
9. Song, D. et al. Similarity-based machine learning support vector machine predictor of drug-drug interactions with improved accuracies. Journal of clinical pharmacy and therapeutics 44, 268–275 (2019).
10. Keum, J. & Nam, H. Self-blm: Prediction of drug-target interactions via self-training svm. PloS one 12, e0171839 (2017).
11. Wang, L. et al. Rfdt: A rotation forest-based predictor for predicting drug-target interactions using drug structure and protein sequence information. Current Protein and Peptide Science 19, 445–454 (2018).
12. Cichonska, A. et al. Computational-experimental approach to drug-target interaction mapping: a case study on kinase inhibitors. PLoS computational biology 13, e1005678 (2017).
13. Cai, C. et al. In silico prediction of ROCK II inhibitors by different classification approaches. Molecular diversity 21, 791–807 (2017).
14. Meng, F.-R., You, Z.-H., Chen, X., Zhou, Y. & An, J.-Y. Prediction of drug–target interaction networks from the integration of protein sequences and drug chemical structures. Molecules 22, 1119 (2017).
15. Masoudi-Sobhanzadeh, Y., Motieghader, H. & Masoudi-Nejad, A. FeatureSelect: a software for feature selection based on machine learning approaches. BMC bioinformatics 20, 170 (2019).
16. Lee, I. & Nam, H. Identification of drug-target interaction by a random walk with restart method on an interactome network. BMC bioinformatics 19, 208 (2018).
17. Yan, X.-Y., Zhang, S.-W. & He, C.-R. Prediction of drug-target interaction by integrating diverse heterogeneous information source with multiple kernel learning and clustering methods. Computational biology and chemistry 78, 460–467 (2019).
18. He, L. et al. Patient-customized drug combination prediction and testing for t-cell prolymphocytic leukemia patients. Cancer research 78, 2407–2418 (2018).
Zheng, Y. et al. Predicting adverse drug reactions of combined medication from heterogeneous pharmacologic databases. BMC bioinformatics 19, 517 (2018).
Lu, Y., Guo, Y. & Korhonen, A. Link prediction in drug-target interactions network using similarity indices. BMC bioinformatics 18, 39 (2017).
Ji, X., Freudenberg, J. M. & Agarwal, P. In Computational Methods for Drug Repurposing 203–218 (Springer, 2019).
Peska, L., Buza, K. & Koller, J. Drug-target interaction prediction: A Bayesian ranking approach. Computer methods and programs in biomedicine 152, 15–21 (2017).
Ezzat, A., Zhao, P., Wu, M., Li, X.-L. & Kwoh, C.-K. Drug-target interaction prediction with graph regularized matrix factorization. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB) 14, 646–656 (2017).
Gu, W., Xie, X., He, Y. & Zhang, Z. Drug-target protein interaction prediction based on AdaBoost algorithm. Sheng wu yi xue gong cheng xue za zhi = Journal of biomedical engineering = Shengwu yixue gongchengxue zazhi 35, 935–942 (2018).
Ezzat, A., Wu, M., Li, X. & Kwoh, C.-K. In Computational Methods for Drug Repurposing 239–254 (Springer, 2019).
Sharma, A. & Rani, R. BE-DTI’: Ensemble framework for drug target interaction prediction using dimensionality reduction and active learning. Computer methods and programs in biomedicine 165, 151–162 (2018).
Rifaioglu, A. S. et al. Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Brief. Bioinform 10 (2018).
Ding, Y., Tang, J. & Guo, F. The computational models of drug-target interaction prediction. Protein and peptide letters (2019).
Lai, H.-Y. et al. A Brief Survey of Machine Learning Application in Cancerlectin Identification. Current gene therapy 18, 257–267 (2018).
Zhang, W. et al. Recent Advances in the Machine Learning-Based Drug-Target Interaction Prediction. Current drug metabolism (2019).
Yamanishi, Y., Araki, M., Gutteridge, A., Honda, W. & Kanehisa, M. Prediction of drug–target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24, i232–i240 (2008).
Kanehisa, M. & Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic acids research 28, 27–30 (2000).
Wishart, D. S. et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic acids research 34, D668–D672 (2006).
Masoudi-Sobhanzadeh, Y., Omidi, Y., Amanlou, M. & Masoudi-Nejad, A. DrugR+: A comprehensive relational database for drug repurposing, combination therapy, and replacement therapy. Computers in Biology and Medicine (2019).
Smith, T. F. & Waterman, M. S. Comparison of biosequences. Advances in applied mathematics 2, 482–489 (1981).
Coello Coello, C. & Lechuga, M. In Proc., Evolutionary Computation, 2002. CEC'02. Proceedings of the 2002 Congress on. 1051–1056.
Masoudi-Sobhanzadeh, Y. & Motieghader, H. World Competitive Contests (WCC) algorithm: A novel intelligent optimization algorithm for biological and non-biological problems. Informatics in Medicine Unlocked 3, 15–28 (2016).
Cheraghalipour, A., Hajiaghaei-Keshteli, M. & Paydar, M. M. Tree Growth Algorithm (TGA): A novel approach for solving optimization problems. Engineering Applications of Artificial Intelligence 72, 393–414 (2018).
Kaveh, A. & Dadras, A. A novel meta-heuristic optimization algorithm: thermal exchange optimization. Advances in Engineering Software 110, 69–84 (2017).
Dhiman, G. & Kumar, V. Emperor Penguin Optimizer: A Bio-inspired Algorithm for Engineering Problems. Knowledge-Based Systems (2018).
Javidy, B., Hatamlou, A. & Mirjalili, S. Ions motion algorithm for solving optimization problems. Applied Soft Computing 32, 72–79 (2015).
Jaderyan, M. & Khotanlou, H. Virulence Optimization Algorithm. Applied Soft Computing 43, 596–618 (2016).
Topal, A. O. & Altun, O. A novel meta-heuristic algorithm: Dynamic Virtual Bats Algorithm. Information Sciences 354, 222–235 (2016).
Patel, V. K. & Savsani, V. J. Heat transfer search (HTS): a novel optimization algorithm. Information Sciences 324, 217–246 (2015).
Han, X., Liu, Q., Wang, H. & Wang, L. Novel fruit fly optimization algorithm with trend search and co-evolution. Knowledge-Based Systems 141, 1–17 (2018).
Mernik, M., Liu, S.-H., Karaboga, D. & Črepinšek, M. On clarifying misconceptions when comparing variants of the Artificial Bee Colony Algorithm by offering a new implementation. Information Sciences 291, 115–127 (2015).
Abdi, Y. & Seyfari, Y. Search Manager: A Framework for Hybridizing Different Search Strategies. International journal of advanced computer science and applications 9, 525–540 (2018).
Yamanishi, Y., Kotera, M., Kanehisa, M. & Goto, S. Drug-target interaction prediction from chemical, genomic and pharmacological data in an integrated framework. Bioinformatics 26, i246–i254 (2010).
Baker, N. C., Ekins, S., Williams, A. J. & Tropsha, A. A bibliometric review of drug repurposing. Drug discovery today (2018).
Sridharan, K. & Sequeira, R. P. Drugs for treating severe hypertension in pregnancy: a network meta-analysis and trial sequential analysis of randomized clinical trials. British journal of clinical pharmacology 84, 1906–1916 (2018).
Tartibian, B., Botelho Teixeira, A. M. & Baghaiee, B. Moderate Intensity Exercise is Associated With Decreased Angiotensin-converting Enzyme, Increased β2-adrenergic Receptor Gene Expression, and Lower Blood Pressure in Middle-Aged Men. Journal of aging and physical activity 23, 212–220 (2015).
Altszuler, N., Hampshire, J. & Moraru, E. On the mechanism of diazoxide-induced hyperglycemia. Diabetes 26, 931–935 (1977).
Sánchez-Patán, F. et al. Mast cell inhibition by ketotifen reduces splanchnic inflammatory response in a portal hypertension model in rats. Experimental and Toxicologic Pathology 60, 347–355 (2008).
Celik, T. & Kosker, M. Ocular side effects and trichomegaly of eyelashes induced by erlotinib: a case report and review of the literature. Contact Lens and Anterior Eye 38, 59–60 (2015).
Author information
Contributions
Yosef Masoudi-Sobhanzadeh: Conceptualization, implementation, formal analysis, investigation, writing, editing, and revising the manuscript. Yadollah Omidi: Results analysis, validation, Conceptualization, writing, editing, and revising the manuscript. Massoud Amanlou: Validation, data analysis, Editing-manuscript. Ali Masoudi-Nejad: Conceptualization, Supervision, Project administration, writing, editing, and revising the manuscript. All authors have read and approved the manuscript.
Ethics declarations
Competing Interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Masoudi-Sobhanzadeh, Y., Omidi, Y., Amanlou, M. et al. Trader as a new optimization algorithm predicts drug-target interactions efficiently. Sci Rep 9, 9348 (2019). https://doi.org/10.1038/s41598-019-45814-8