Trader as a new optimization algorithm predicts drug-target interactions efficiently

Several machine learning approaches have been proposed for predicting new benefits of existing drugs. Although these methods have introduced new usages for some medications, more efficient methods can lead to more accurate predictions. To this end, we propose a novel machine learning method based on a new optimization algorithm named Trader. To demonstrate the capabilities of the proposed algorithm, which can be applied across different fields of science, it was compared with ten other state-of-the-art optimization algorithms on standard and advanced benchmark functions. Next, a multi-layer artificial neural network was designed and trained by Trader to predict drug-target interactions (DTIs). Finally, the functionality of the proposed method was investigated on several DTI datasets and compared with other methods. The results show that Trader eliminates the disadvantages of different optimization algorithms, leading to a better outcome. Further, the proposed machine learning method achieved a significant level of performance compared with other popular and efficient approaches in predicting unknown DTIs. All the implemented source codes are freely available at https://github.com/LBBSoft/Trader.

www.nature.com/scientificreports

The pharmacological effects of medications on 17109 molecular properties are taken into consideration. Here, n, F_i, SIM, and W_i are the total number of molecular properties (17109), the i-th molecular feature, the similarity score between two drugs such as D and D', and the weight of F_i calculated by Eq. (2) 14, respectively.
Here, d_i, σ, and h are the frequency of the i-th feature, the standard deviation of d_k (k = 1 through n), and a constant value of 0.1, respectively. Using Eq. (1), a matrix of effect similarity scores for every pair of drugs is created. ii) Amino acid sequences of protein targets are obtained from the DrugBank 33 and KEGG GENE databases. Further, we have developed an integrated relational database named DrugR+ 34 (http://www.drugr.ir), which contains all data of DrugBank and some data of KEGG. Next, the similarity score between every pair of targets is computed by the normalized Smith-Waterman alignment scoring method 35, and a matrix of target-target similarity scores is generated. iii) The interaction information between drugs and targets is obtained from the DrugR+ database.
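The exact forms of Eqs (1) and (2) were not recoverable from the extracted text, but the description above (frequency-based Gaussian feature weights combined into a pairwise drug similarity) can be sketched as follows. This is an illustrative Python reading, not the authors' MATLAB implementation; the Gaussian weighting form and the weighted cosine similarity are assumptions consistent with the stated variables.

```python
import numpy as np

def feature_weights(d, h=0.1):
    """Hypothetical Gaussian weighting of feature frequencies (one common
    choice consistent with the text: d_i = frequency of feature i,
    sigma = std of all d_k, h = constant 0.1)."""
    sigma = d.std()
    return np.exp(-d**2 / (2 * (h * sigma)**2 + 1e-12))

def drug_similarity(fa, fb, w):
    """Weighted cosine similarity between two binary molecular-property
    vectors; a stand-in for the paper's SIM(D, D') of Eq. (1)."""
    num = np.sum(w * fa * fb)
    den = np.sqrt(np.sum(w * fa**2)) * np.sqrt(np.sum(w * fb**2))
    return num / den if den > 0 else 0.0

# toy example with n = 5 molecular properties instead of 17109
d = np.array([3.0, 1.0, 4.0, 1.0, 5.0])        # feature frequencies d_i
w = feature_weights(d)
fa = np.array([1, 0, 1, 1, 0], dtype=float)     # drug D
fb = np.array([1, 1, 1, 0, 0], dtype=float)     # drug D'
s = drug_similarity(fa, fb, w)
```

Computing this score for every pair of drugs fills the drug-drug similarity matrix described above.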
For every type of target, a dataset is created by the pseudocode presented in Fig. 1. These datasets can be used as gold-standard datasets by researchers who want to predict interactions between drugs and targets using machine learning approaches. The attributes of the generated datasets are shown in Table 1.

The machine learning approach. Our proposed method, whose framework is depicted in Fig. 2, creates a prediction model using a multi-layer perceptron (MLP) artificial neural network (ANN) with two hidden layers. The generated datasets are divided into two sets: (i) a training set and (ii) a testing set. For all the generated datasets, the ANN is trained by Trader, in which every candidate solution consists of 38 variables. There are 8, 3, 2, and 1 neurons in the input, first hidden, second hidden, and output layers, respectively. In the ANN, all the neurons of a layer are connected to all the neurons of the next layer; hence, the total number of synapses or ANN edges is 8*3 + 3*2 + 2*1 = 32. Moreover, since there are six biases, which are specified in Fig. 2, the total number of variables in a potential answer is 32 + 6 = 38. In this problem, the objective function is the root mean square error (RMSE), computed by Eq. (3):

RMSE = sqrt((1/N) Σ_{i=1}^{N} (y_i − ŷ_i)²)   (3)

where N is the number of training samples, y_i is the known label of the i-th sample, and ŷ_i is the ANN's output for it.

Trader optimization algorithm. Our proposed algorithm, Trader, has been inspired by the intelligent behavior of traders who seek more profit and property through operations such as retailing, importing, exporting, and many other activities. The flowchart of Trader is shown in Fig. 3. Trader consists of several steps, described as follows: i) Creating the first population of candidate solutions: Like other optimization algorithms, Trader starts with some potential answers, each of which consists of several variables and can be considered as an array.
Equation (4) shows a candidate solution (CS) with n variables:

CS = (G, v_1, v_2, …, v_n)   (4)

where G determines the group of the CS, which belongs to a trader, and v_i shows the i-th variable. The groups are not specified at the beginning of the algorithm. For the drug repurposing problem, a CS determines the weights and biases of the ANN, so the total number of variables equals the number of the ANN's edges plus its biases. ii) Calling the objective function: After creating the first population of CSs, the worthiness of each CS is calculated by an objective function (OF), which is defined based on the nature of the problem. For example, in the problem of training an artificial neural network, the fitness of a CS is computed by the value of the error (Eq. (3)).
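The encoding of a candidate solution and the RMSE objective of Eq. (3) can be sketched in a few lines. This is an illustrative Python sketch (the authors' implementation is in MATLAB); the sigmoid activation is an assumption, since the activation function is not stated in the text.

```python
import numpy as np

# Layer sizes from the paper: 8 inputs, hidden layers of 3 and 2, 1 output.
SIZES = [8, 3, 2, 1]
N_WEIGHTS = sum(a * b for a, b in zip(SIZES, SIZES[1:]))   # 8*3 + 3*2 + 2*1 = 32
N_BIASES = sum(SIZES[1:])                                  # 3 + 2 + 1 = 6
N_VARS = N_WEIGHTS + N_BIASES                              # 38 variables per CS

def decode(cs):
    """Unpack a 38-variable candidate solution into per-layer (W, b) pairs."""
    layers, pos = [], 0
    for a, b in zip(SIZES, SIZES[1:]):
        W = cs[pos:pos + a * b].reshape(a, b); pos += a * b
        bias = cs[pos:pos + b]; pos += b
        layers.append((W, bias))
    return layers

def forward(cs, X):
    """Forward pass through the MLP; sigmoid activation is assumed."""
    h = X
    for W, bias in decode(cs):
        h = 1.0 / (1.0 + np.exp(-(h @ W + bias)))
    return h.ravel()

def rmse(cs, X, y):
    """Objective function of Eq. (3): root mean square error."""
    return np.sqrt(np.mean((forward(cs, X) - y) ** 2))
```

Trader then searches over these 38-dimensional vectors, calling `rmse` as the OF for each candidate.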

Figure 2.
The framework of the proposed method for drug repurposing. After generating the datasets, Trader trains the ANN using them. When the ANN is appropriately trained, the model is generated and then applied to the prediction of unknown drug-target interactions. IN, H, D, and T denote neurons of the input layer, neurons of the hidden layers, a drug, and a target, respectively. The proposed optimization algorithm starts with some candidate solutions, each of which determines the weights of the ANN. Next, they are placed into several groups and are improved by Eqs (6) through (8) (see the text for details). The steps of Trader are repeated until the termination condition is satisfied. As the algorithm proceeds through its steps, the value of RMSE is reduced and a suitable predictor model is acquired.
iii) Grouping the candidate solutions: The groups are constituted based on the number of traders and their properties. At the start of the algorithm, all the traders have the same property, which is updated during the algorithm's iterations. Equation (5) is used to calculate the number of CSs devoted to a specific trader (a group), where NB_i, P_i, C, and T are the total number of CSs assigned to the i-th trader or group, the property of the i-th trader, the number of existing CSs, and the number of traders, respectively. The constant value of 2 indicates that none of the traders or groups is eliminated during the algorithm's iterations, and at least two CSs remain in every group. Figure S2 illustrates an example of the competition among traders for acquiring CSs. iv) Changing the candidate solutions: After grouping the candidate solutions, the best CS of each group, named the Master CS, is first selected, and then its variable values are distributed to another CS, named a Slave CS, using Eq. (6), where n is the total number of variables in a CS, R is a random integer value in [1, n], C_k is the number of CSs of the k-th group, CS_slave_j is the j-th Slave CS of the k-th group, and CS_master_k is the Master CS of the k-th group. In case Eq. (6) increases the value of the OF (RMSE), the changes are discarded; otherwise, they are accepted. In addition to Eq. (6), which helps the Slave CSs improve their OF values, there is another operator that changes the Slave CSs based on their own contents. These changes are applied to the Slave CSs using Eq. (7).
Here, R is a random integer value between 1 and n/10, M is a random integer value between 1 and n, CS_slave is a Slave CS, and k is an arbitrary value chosen as either 1 or −1. Like the previous operator, the changes are accepted only if they improve the value of the OF. Unlike Eqs (6) and (7), which operate within a group, Eq. (8) exchanges information between two groups, where R is a random integer value between 1 and n, j and k indicate the importer and exporter groups, CS_master_j shows the Master CS of the importer group, and CS_master_k shows the Master CS of the exporter group. The value of k is calculated by Eq. (9).
K = {a | a ≠ j and a is a random integer value, 1 ≤ a ≤ n}   (9)

Like the other operators of Trader, the changes induced by Eq. (8) are accepted only if they improve the value of the OF.
v) Updating the traders' properties: In Eq. (10), property_i is the property of the i-th trader or group, and B, G_i, and G_j are the number of CSs, the i-th group, and the group to which the j-th CS belongs, respectively. vi) Termination condition: Like other optimization algorithms, each of the following options can be considered as the termination condition of Trader: (i) running the algorithm's steps for a predefined number of iterations; (ii) reaching a determined value of accuracy or error; (iii) elapsing a certain amount of time; (iv) stabilization of the best answer over recent iterations. For training the ANN, a predefined number of iterations was selected as the termination condition. vii) Selecting the best answer: When the termination condition is satisfied, the CS having the best value of the OF is selected and introduced as the solution to the problem. For the DTI prediction problem, the CS with the minimum value of RMSE is chosen as the solution for forecasting unknown DTIs. Figure 4 shows the pseudocode of Trader.
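Steps i through vii can be sketched as a single loop. This is a minimal Python skeleton, not the authors' MATLAB code: the grouping and master/slave operators below are simplified stand-ins for Eqs (5) through (8), keeping the thread the paper makes explicit, namely that a change to a candidate solution is kept only if it lowers the objective.

```python
import numpy as np

def trader(objective, n_vars, n_cs=40, n_groups=4, iters=200, seed=0):
    """Skeleton of Trader's main loop (steps i-vii), with simplified operators."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1, 1, size=(n_cs, n_vars))      # step i: first population
    fit = np.array([objective(cs) for cs in pop])      # step ii: objective calls
    for _ in range(iters):                             # step vi: iteration budget
        # step iii (simplified): split the sorted population into groups
        groups = np.array_split(np.argsort(fit), n_groups)
        for g in groups:
            master = g[0]                              # best CS of the group
            for slave in g[1:]:                        # step iv: change the slaves
                trial = pop[slave].copy()
                r = rng.integers(0, n_vars)
                trial[r] = pop[master][r]              # copy a master variable (cf. Eq. (6))
                f = objective(trial)
                if f < fit[slave]:                     # accept only improvements
                    pop[slave], fit[slave] = trial, f
    best = np.argmin(fit)                              # step vii: best answer
    return pop[best], fit[best]

# usage: minimize the sphere function in 5 dimensions
w, f = trader(lambda x: float(np.sum(x**2)), n_vars=5, iters=100)
```

For the DTI problem, `objective` would be the RMSE of Eq. (3) evaluated over the training set.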

Results
The proposed machine learning approach has been implemented in the MATLAB programming language, and all the implemented source codes are available at https://github.com/LBBSoft/Trader. This section contains three categories of results, as follows.

Trader in comparison with the other optimization algorithms. Besides Trader, ten state-of-the-art optimization algorithms (PSO, TGA, TE, EPO, ION, VIR, HTS, WCC, DVBA, and CEFOA 36-45) were implemented. Then, these algorithms were applied to 20 benchmark functions used in the various studies in which the above-mentioned optimization algorithms were introduced. These standard test functions, listed in Table S1 (Supplementary File), are categorized into unimodal, multimodal, fixed-dimension, expanded, penalized, and hybrid categories. Since optimization algorithms produce variable results in different executions, it is recommended that they be executed at least 30 times on an intended problem, with the final best-obtained result reported as the answer 46. Hence, all of the above-mentioned algorithms were run over 50 individual executions on the determined benchmark functions with high dimensions. Further, the algorithms were executed under similar conditions, such as the number of iterations per execution and the number of OF calls, and their parameters were set so as to maximize their performance. To evaluate the optimization algorithms, criteria such as convergence and stability of the acquired results were considered. Figure 5 shows the convergences of the algorithms on the test functions, corresponding to their best result over 50 individual executions. For similar convergence behaviors, the average outcomes were drawn; for example, the results of F11 and F12 were merged into one. Besides, the convergences of the algorithms on each of the benchmark functions are presented in Fig. S6 (Supplementary File).
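The multi-run protocol above (50 independent executions, then summary statistics) can be sketched generically. This Python sketch is illustrative; `random_search` is a hypothetical stand-in optimizer, not one of the compared algorithms.

```python
import numpy as np

def benchmark(algorithm, objective, runs=50, seed0=0):
    """Run a stochastic optimizer repeatedly and summarize its final
    objective values, as done for the 50-execution comparisons;
    `algorithm` is any callable returning (solution, fitness) for a seed."""
    finals = np.array([algorithm(objective, seed=seed0 + r)[1] for r in range(runs)])
    return {"best": finals.min(), "mean": finals.mean(), "std": finals.std(ddof=1)}

def random_search(objective, seed, n_vars=5, evals=200):
    """Toy optimizer (pure random search on [-1, 1]^n), for illustration only."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(-1, 1, size=(evals, n_vars))
    fits = np.array([objective(p) for p in pts])
    i = fits.argmin()
    return pts[i], fits[i]

stats = benchmark(random_search, lambda x: float(np.sum(x**2)), runs=10)
```

The per-run finals collected this way are exactly what the mean, standard deviation, and significance tests in the following tables operate on.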
The acquired results show the following findings: i) Trader, TGA, TE, and EPO have higher convergence speed than the others and can get better results. However, the convergence speed of EPO is lower than that of TGA, Trader, and TE in the early steps and depends on the fourth quarter of its iterations, in which the range of variables becomes smaller and smaller (Fig. 5a); therefore, EPO gains more convergence speed in the last quarter. TE, TGA, and ION use a method similar to EPO's and limit the range of variables as the algorithm's steps proceed; therefore, they produce better results for some special problems such as F1 through F9. ii) The EPO, TGA, and ION algorithms cannot produce the desired results for some problems such as F11 and F12 (Fig. 5b,c). The other algorithms outperform these three as the number of iterations increases and can acquire better results. iii) For small-sized benchmark functions such as F17 through F20, the algorithms have similar performance, and all of them can obtain the optimal solution (Fig. 5d). iv) The convergence of VIR, HTS, WCC, and CEFOA is slower than that of the other algorithms for some of the test functions (Fig. 5a). Nonetheless, they can acquire acceptable results by increasing the allocated time or the number of iterations, but EPO, TE, and TGA cannot, because of falling into local optima.
For an accurate evaluation of the algorithms, we summarized their findings over 50 distinct executions in Tables S2 through S4 (Supplementary File) with two decimal digits of accuracy using the one-way ANOVA test. We also provide Table 2, which includes the P-values of the algorithms compared with Trader as a test base and shows that the null hypothesis can be strongly rejected. For this purpose, the Wilcoxon rank-sum test, which assesses whether two sets of generated results come from the same distribution 47, was performed.
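The rank-sum comparison behind Table 2 can be sketched as follows. This is a minimal normal-approximation version of the Wilcoxon rank-sum test (SciPy's `scipy.stats.ranksums` implements the same statistic); the two 50-run result sets below are hypothetical, not the paper's data.

```python
import numpy as np

def rank_sum_z(a, b):
    """Wilcoxon rank-sum statistic for sample `a` vs `b`, as a z-score
    under the normal approximation (assumes no ties, large samples)."""
    a, b = np.asarray(a), np.asarray(b)
    n1, n2 = len(a), len(b)
    ranks = np.argsort(np.argsort(np.concatenate([a, b]))) + 1
    w = ranks[:n1].sum()                        # rank sum of the first sample
    mu = n1 * (n1 + n2 + 1) / 2                 # mean of w under the null
    sigma = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mu) / sigma

rng = np.random.default_rng(0)
trader_runs = rng.normal(0.10, 0.02, size=50)   # hypothetical 50-run finals
other_runs  = rng.normal(0.25, 0.05, size=50)
z = rank_sum_z(trader_runs, other_runs)
# |z| > 1.96 rejects, at the 5% level, the null hypothesis that both
# result sets come from the same distribution
```

A strongly negative z here indicates the first algorithm's errors rank consistently below the second's.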
Based on the average and standard deviation points of view (Table 3), Trader has proper functionality, but its results are close to the outcomes of EPO, TGA, and TE for the test functions F1 through F9. However, those three are only suitable for problems whose optimal answer is 0, because of the nature of their operators; for this reason, their performance is the same across all of the benchmark functions. From the STD aspect, HTS is the best algorithm and the best option when the range of variables in a problem is small.

The proposed machine learning method against the others. In the second part of this section, the performance of the proposed method (ANNTR) is evaluated based on four gold-standard datasets 48 and then compared against three state-of-the-art methods, including the rotation forest-based drug-target (RFDT) predictor method 11, the Bayesian (BAY) ranking-based method 22, and a relevance vector machine-based method 14 (RVM). The datasets, which are named Enzyme, Ion channel, G-protein, and Nuclear receptor, consist of 4,449, 2,029, 1,268, and 168 DTI samples, respectively. Further, the samples have been marked with positive and negative labels, which show whether an intended drug and target interact or not. The acquired results, which show that the proposed method outperforms the other methods in the overall state, are presented in Table 4. For every criterion on the datasets, the best-acquired outcome is indicated in boldface. Figures 6 and 7 show the receiver operating characteristic (ROC) and precision-recall (PR) curves based on the 5-fold cross-validation test, respectively. Besides, these figures (Figs 6 and 7) provide information about the area under the curve (AUC), which compares the performance of the methods on the datasets. Except for the Enzyme dataset, ANNTR achieves better results than the three others.
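The ROC AUC values behind Fig. 6 can be computed without tracing the full curve, via the rank (Mann-Whitney) formulation: the AUC equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. This is a minimal Python sketch with toy scores, not the paper's evaluation code; it assumes untied scores.

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC AUC via the rank formulation (labels are 0/1, scores untied)."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    ranks = np.argsort(np.argsort(scores)) + 1        # ranks 1..N, ascending
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

labels = np.array([1, 1, 1, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.4, 0.6, 0.3, 0.1])     # one positive ranked low
auc = roc_auc(scores, labels)                          # 8 of 9 pos/neg pairs ordered correctly
```

Library routines such as scikit-learn's `roc_auc_score` compute the same quantity with tie handling.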
Furthermore, the proposed method obtains average AUC values of 0.9457 and 0.9708 for the ROC and PR curves, respectively, which are better than the three others.

The acquired results on the generated datasets. In the third part of the results, we investigated the performance of Trader on the generated DTI datasets (Table 1). For the testing datasets, a total of 751 (of 800 samples) DTIs were correctly predicted by the proposed method. For a detailed evaluation and comparison, we compared the proposed method with three other popular and efficient classification methods, including the support vector machine (SVM), the decision tree (DT), and the artificial neural network trained by the error back-propagation method (ANNEBP) 15. The acquired results are shown in Table 5. Since the datasets relate to known DTIs, the problem is considered a one-class classification problem; thus, true positive and false positive rates are reported in Table 5.
As reported in Table 5, ANNTR displays a higher DTI detection capability than the others. For a comprehensive evaluation of the methods, we used a 10-fold cross-validation test in which a dataset is divided into ten distinct sets. In each of the 10 iterations, nine sets are considered as the training set and the remaining one is used as the test set. The convergence behavior of Trader on the generated datasets is also shown (Fig. 8).
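The 10-fold protocol above can be sketched as an index generator. This Python sketch is illustrative of the standard k-fold scheme (rotate one held-out fold, train on the other nine), not the authors' MATLAB code.

```python
import numpy as np

def kfold_indices(n_samples, k=10, seed=0):
    """k-fold cross-validation split: in each of k iterations, k-1 folds
    train and the remaining fold tests, so every sample is tested once."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

# usage: 100 samples, 10 folds of 10 test samples each
splits = list(kfold_indices(100, k=10))
```

Reported metrics such as the true and false positive rates are then averaged over the k test folds.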
Furthermore, we generated the dataset of all potential drug-target interactions using the pseudocode (Fig. 1), resulting in 119,743 records. Then, we applied the obtained models to this dataset of potential DTIs. For the whole dataset of possible DTIs (119,743 samples), ANNTR predicted 47 new DTIs (Table 6).
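Scoring every candidate pair with the trained model and keeping the high-scoring ones can be sketched as follows. The decision threshold and the toy model below are assumptions for illustration (the paper does not state its cut-off); the real model would be the Trader-trained ANN.

```python
import numpy as np

def predict_new_dtis(model, candidate_pairs, threshold=0.5):
    """Score every potential drug-target pair with a trained model and keep
    pairs above an assumed decision threshold on a [0, 1]-valued output."""
    scores = np.array([model(x) for x in candidate_pairs])
    keep = np.nonzero(scores >= threshold)[0]
    return keep, scores[keep]

# toy stand-in: mean of the 8 input features in place of the trained ANN
pairs = np.array([[0.9] * 8, [0.1] * 8, [0.7] * 8])
idx, sc = predict_new_dtis(lambda x: float(x.mean()), pairs)
```

Applied to all 119,743 candidate records, this kind of filtering yields the shortlist of novel DTIs reported in Table 6.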

Discussion
The proposed ANNTR machine learning method, which is based on the new optimization algorithm, was developed and compared with well-known and efficient machine learning methods, and the acquired results were analyzed. Although many optimization algorithms have been proposed, they suffer from some limitations. Our proposed algorithm, which eliminates the shortcomings of other algorithms, shows much more stable behavior than the others do and trains the ANN appropriately. The findings also show that Trader, VIR, HTS, DVBA, CEFOA, ION, PSO, and WCC have better performance in comparison with TGA, TE, and EPO. The main reasons for such performance are as follows: (i) Their operators make no assumptions about the optimal answer to a problem, whereas TGA, TE, and EPO include operators that make the range of variables smaller and smaller and can therefore reach optimal answers faster for some problems. (ii) Their behaviors are almost the same on the different benchmark functions, whereas TE, EPO, and TGA fall into local optima for some test functions.
As a case study, we applied the proposed ANNTR method to some biological datasets and all the potential DTI data to find drugs which may affect targets, with the result that 47 DTIs were discovered. The predicted results can be used in three ways. First, they propose some unknown DTIs. In case a disease is due to the intended target, the related drug can be introduced as an option for treating the disease. Further, the side effects of drugs can be determined by investigating the predicted relation between a drug and a target. Second, they show that some medications, like D00086 and D00145, have an identical predicted target. There is a possibility that they have similar functionality and can be used interchangeably. Also, these results can be useful to chemical pharmacists who look for novel potential efficacies of drugs and to researchers who want to validate their predicted outcomes. Third, the predicted DTIs might reveal the real mechanisms of action (MOA) of drugs 49, which show the pharmacological effect(s) of a drug. For example, we predicted that diazoxide interacts with the angiotensin-I-converting enzyme (ACE). An investigation into the clinical impact of diazoxide and ACE, together with some previous studies, reveals that diazoxide can be used for the treatment of severe hypertension 50, while ACE is responsible for controlling blood pressure 51. Therefore, for the first time, we have shown that diazoxide can affect ACE, while the MOA of diazoxide has been reported differently by others 52. Also, similar to diazoxide, ketotifen, used to reduce the effects of allergic conjunctivitis, can also interact with ACE. Likewise, in a study conducted by Sanchez-Patan et al. 53, ketotifen was shown to decrease hypertension in rats.
Another example is erlotinib, which is used for treating epithelial lung cancer. Our proposed method predicted that erlotinib interacts with the muscle, skeletal receptor tyrosine kinase (MuSK), antibodies against which are found in neuromuscular diseases. The disease leads to various phenotypes such as less eye involvement, weakness, and pain in the neck. Some research related to the side effects of erlotinib 54 may validate the predicted interaction.

Conclusion
A new optimization algorithm, named Trader, was introduced and compared with ten state-of-the-art optimization algorithms based on various statistical criteria. The results show that Trader outperforms the other optimization algorithms and eliminates their limitations. As an empirical yet practical evaluation, we examined the performance of Trader in training a multi-layer perceptron artificial neural network to discover potential DTIs on the gold-standard and generated datasets. The prediction model obtained from Trader achieved average 5-fold cross-validation values of 94.62%, 94.24%, 94.80%, and 94.10% for the accuracy, sensitivity, specificity, and precision of the model, respectively. These values appeared to be better than the results acquired from the other methods. Furthermore, the proposed method predicted 47 potential DTIs. We envision that the outcomes obtained by the proposed model may be used for managing possible side effects of medications, understanding the MOA of drugs, and finding new research opportunities. Taken together, this study may pave the way for de novo applications of computer-aided methods in drug discovery and development.

Data Availability
All the source codes and datasets are available at the following link: https://github.com/LBBSoft/Trader.