Large-scale Direct Targeting for Drug Repositioning and Discovery

Zheng, Chunli; Guo, Zihu; Huang, Chao; Wu, Ziyin; Li, Yan; Chen, Xuetong; Fu, Yingxue; Ru, Jinlong; Ali Shar, Piar; Wang, Yuan; Wang, Yonghua

doi:10.1038/srep11970

Published: 09 July 2015


Scientific Reports volume 5, Article number: 11970 (2015)


Abstract

A system-level identification of direct drug-target interactions is vital to drug repositioning and discovery. However, identifying them experimentally on a large scale remains challenging and expensive. The available computational models mainly focus on predicting indirect interactions, or direct interactions on a small scale. To address these problems, we developed a novel algorithm termed weighted ensemble similarity (WES) to identify direct drug targets based on a large-scale data set of 98,327 drug-target relationships. WES includes: (1) identifying the key ligand structural features that are highly related to the pharmacological properties in an ensemble framework; (2) determining a drug’s affiliation with a target by evaluating the overall similarity to the ligand ensemble rather than to a single ligand; and (3) integrating the standardized ensemble similarities (Z score) by a Bayesian network and a multivariate kernel approach to make predictions. Together, these steps allow WES to predict direct drug targets with external and experimental test accuracies of 70% and 71%, respectively. This shows that the WES method provides a potential in silico model for drug repositioning and discovery.


Introduction

A system-level understanding of the relationships between drugs and their targets, especially direct targets¹, is vital to address the efficacy and safety-related issues of compounds in the later stages of drug discovery and development^2,3 and, thus, to reduce the high attrition rates in clinical trials⁴. Various experimental methods are available for identifying drug targets^5,6,7, but detection on a large scale remains challenging and expensive. The main obstacle is the time and cost of pharmacological experiments that can accurately recapitulate the target response for diverse drugs⁸.

Recently, many experiment-based approaches, including high-density microarrays and cell-based assays, have been proposed to investigate the indirect or direct features of drug–target interactions^8,9. However, the most reliable evidence of a direct interaction is the co-crystallization of the target protein with the drug in solution¹⁰. Recent developments in biotechnology have increased the amount of high-throughput data for drugs and targets at the omics level, which are valuable sources for recognizing unknown drug-target interactions¹¹. These data have also accelerated the development of a variety of in silico approaches for predicting potential targets. A simple way to assess direct interactions is molecular docking simulation¹², but it is limited by the availability of reliable three-dimensional (3D) structures of target proteins¹³. Thus, it remains important to develop efficient computational methods for predicting drug targets that are independent of protein structures.

Our previous work developed a chemogenomic model based on chemical, genomic and pharmacological information for characterizing the complicated interactions between ligands and targets¹⁴. However, owing to the limitations of the database used, this model could not discriminate direct from indirect interactions. The more recently developed similarity ensemble approach (SEA) is capable of detecting direct interactions based on the chemical similarity of ligand sets, and has been demonstrated to be an effective conceptual and methodological breakthrough in this field¹⁵.

In this work, we propose a novel weighted ensemble similarity (WES) algorithm, an extension of the SEA method, to predict direct drug-target interactions. Here, the term ensemble is a concept borrowed from statistical physics. Each protein (receptor) has several ligands; these ligands form a set, and this set is treated as an ensemble. The concept rests on the following considerations: (1) if the ligand set consists of structurally similar compounds, the ensemble average covers a narrow chemical space, so comparing a compound with the ensemble average or with any single compound in the set may give similar results; (2) in most cases, however, the ligands of a receptor such as P-glycoprotein¹⁶ or COX2³ are diverse and may fall into several smaller sub-clusters. If the prediction for a compound is still based on its similarity to one particular compound in the training set, it will not give reliable results. A more reasonable approach is therefore to compare the compound with the overall features of the ensemble (set).

Here, the WES model was built on a large data set of 98,327 drug-target relations drawn from the BindingDB¹⁷ (http://www.bindingdb.org/bind/index.jsp, access time: January 16, 2014), DrugBank¹⁸ (http://www.drugbank.ca/, access time: January 16, 2014) and PDB¹⁹ (http://www.rcsb.org/pdb/, access time: January 16, 2014) databases and from GoPubMed (http://www.ncbi.nlm.nih.gov/, access time: January 30, 2014). The efficiency of the model was also compared with other published models and further validated by pharmacological experiments.

Results

WES—an algorithm for predicting direct interactions of drugs and targets

The algorithm works in three phases. (1) Identifying the key ligand structural and physicochemical features (CDK and Dragon) that are highly related to the pharmacological properties in an ensemble framework. We assembled the feature matrix for the ligand set of each protein based on statistical tests (the non-parametric Wilcoxon rank-sum test for Dragon features; the one-sided Fisher’s exact test for CDK features). (2) Determining a drug’s affiliation with a target by evaluating its overall similarity to the ensemble rather than to a single ligand. Because the resulting raw score does not discriminate relevant similarities from random ones and depends on the number of ligands in each set, it is not by itself a reliable measure of overall similarity. The overall similarities were therefore converted into normalized values that are free of set-size bias, so that relevant similarities can be separated from random ones. (3) Integrating the standardized ensemble similarities (Z scores) by a Bayesian network to make predictions.

Model performance

Feature analysis.

To investigate the effects of different structural features of the ligands on model performance, we built models using the Chemistry Development Kit (CDK) features, the Dragon features and the CDK-Dragon hybrid features, respectively (see Methods for details). Table 1 reports the results in terms of precision and recall rates. The hybrid model outperforms both the CDK and Dragon models in recovering the negative links. Notably, in leave-one-out cross-validation (LOOCV) the hybrid model performs well in predicting both binding (sensitivity, SEN, 85%) and non-binding (specificity, SPE, 71%) patterns, with an accuracy (ACC) of 78%, a precision (PRE) of 74% and an area under the receiver operating characteristic curve (AUC) of 0.85. Unless otherwise specified, the score thresholds used to make predictions in this work (Z score for the CDK and Dragon models, likelihood for the CDK-Dragon hybrid model) are those at which the models achieve the highest F1 score in cross-validation (see Methods for details). The ROC curves (Fig. 1) show that all three models capture sufficient information to detect interactions at high true-positive rates and low false-positive rates. With its higher AUC on the complete dataset, the hybrid model is better at identifying known drug-target links, suggesting that introducing more chemical and pharmacological information into the models yields better predictive performance.

Table 1 Performance of the WES method.


To investigate how feature weighting contributes to WES performance, we compared two kinds of input: weighted and non-weighted features. Table S1 shows that WES based on the weighted hybrid features outperforms the model based on non-weighted features, with an ACC of 78%, a PRE of 74% and an AUC of 0.85. This indicates that weighting and selecting features, which reduces the dimensionality of the descriptor set, contributes to the good performance.

We also checked the effectiveness of integrating the standardized ensemble similarities (Z scores) by the Bayesian network. The integrated WES model performs better than the non-integrated one in predicting both binding (SEN 85%) and non-binding (SPE 71%) patterns (Table S1). These results indicate that the integration step improves the prediction performance of WES.

External data validation

To assess the reliability of the WES model, we further carried out an external validation. The external dataset includes both binding data (positive samples) and non-binding data (negative samples), as follows: (1) the positive samples were extracted from PDB as ligand-protein pairs with half-maximal inhibitory concentrations (IC₅₀) < 10 μM. Interactions that overlap with the training set used for model construction were manually deleted, leaving 649 interactions; (2) the negative samples were obtained from BindingDB with the filter criterion IC₅₀ > 500 μM, yielding 3,172 ligand-target non-binding pairs. The hybrid model achieves a prediction ACC of 71% (458/649) for the positive samples and 70% (2,209/3,172) for the negative samples. These results show that the weighted hybrid WES performs consistently on data from different sources.
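As a quick arithmetic check of the percentages quoted above, the following Python sketch recomputes them from the reported counts. The helper name `accuracy_percent` is introduced here for illustration only and is not part of the original work.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return the share of correct predictions as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# Counts reported in the text for the external validation set.
positive_acc = accuracy_percent(458, 649)     # binding pairs (PDB)
negative_acc = accuracy_percent(2209, 3172)   # non-binding pairs (BindingDB)

print(f"positive samples: {positive_acc:.1f}%")
print(f"negative samples: {negative_acc:.1f}%")
```

Rounded to whole percentages, the two values agree with the 71% and 70% given in the text.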

Target class prediction

The performance of the WES method was further tested on five pharmaceutical target classes: enzymes (n = 761), ion channels (n = 78), membrane proteins (n = 275), transporters (n = 50) and transcription factors (n = 39). Figure 1 and Table 1 show the AUC, SEN, SPE, PRE and ACC of the models. WES displays the highest prediction ability for transcription factors (ACC = 0.80), followed by membrane proteins and transporters (ACC = 0.79 each), enzymes (ACC = 0.78) and ion channels (ACC = 0.75).

We also compared the performance of the optimal WES model for target class prediction with other published models (enzymes, 664; ion channels, 204; membrane proteins, 95; nuclear receptors, 26), including the nearest profile, weighted profile and bipartite graph learning methods, using the same criteria⁵. Table 2 indicates that all the methods have quite high AUC and SPE but low SEN values. The WES and bipartite graph models outperform the other two models (nearest profile and weighted profile). It should be noted, however, that the WES model was constructed with a larger dataset exhibiting more molecular and pharmacological diversity, so WES may generalize better when making predictions.

Table 2 Statistics of the prediction performance.


Comparison of WES with 1NN

In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a widely used non-parametric method. When k-NN is used for classification, an object is assigned by a majority vote of its neighbors to the class most common among its k nearest neighbors (k is a positive integer, typically small). WES was compared to a one-nearest-neighbor (1NN) model (Fig. 2), which judges the probability of a drug targeting a protein based only on the maximum similarity to the reference ligands of the target. For close analogs, with Tanimoto coefficients (Tc) > 0.65, the fraction of true positives was comparable between 1NN and WES (Fig. 2). Across most other similarity thresholds, however, WES substantially outperforms 1NN. Notably, 4,319 of the correct drug-target predictions made by WES show low similarity (Tc < 0.4) to the ligand sets of their respective targets, whereas the corresponding proportion for 1NN is zero. These results indicate that WES is better able to predict drug targets for structurally diverse chemicals.

Evaluation of ligand scaffold hopping

To further assess the ligand scaffold hopping (LSH) ability of the WES model, we compared the predicted ligands with the known ligands for the same targets. The predicted ligands show diversified structural scaffolds (Tables S2 and S3). This indicates that WES captures relatively complete drug-binding features of a protein at the ensemble level, rather than from a single ligand as the 1NN method does. For example, the drug Hydrocortamate, which is predicted to modulate Enpp2 (Fig. 3), is only marginally similar to the known ligand set (Tc value 0.47; Fig. 3). Compounds similar to known ligands are, as expected, also identified by WES. For example, Saquinavir, which closely resembles the ligand set of REN (Tc value 0.91; Fig. 3), is predicted to regulate REN (Fig. 3). The LSH analysis supports the specificity of WES predictions, which is important for repositioning known drugs in pharmaceutical research.

Experimental validation

To validate the practicability of the WES model, we randomly selected five inflammation-related targets (Enpp2, Faah, PTGS2, PPARG and REN) and predicted their direct ligand-target interactions. The 24 top-scoring (hybrid WES), commercially available drug-target interactions (Table 3) were tested by ligand-binding assays.

Table 3 IC₅₀ values for the 24 top-scored direct interactions.


Here, the ligand-target affinities are measured by IC₅₀ values, and the ligands were classified as strong (IC₅₀ < 1 μM), moderate (1 μM ≤ IC₅₀ < 10 μM) or weak (10 μM ≤ IC₅₀ < 100 μM) binders, or as non-binders (IC₅₀ ≥ 100 μM), according to Salvat et al.²⁰. In building the training dataset for this work, binders were defined by IC₅₀ ≤ 10 μM. This criterion is strict; we believe that a stricter strategy helps to reduce noise in data that were collected from various sources. For the experimental test, binders of all strengths, from weak to strong, were counted, resulting in a prediction ACC of 71% (17/24) for the interactions predicted by the hybrid WES.
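The affinity classes above can be written as a small lookup. The following Python sketch is illustrative only: the function names and the example IC₅₀ values are hypothetical and are not taken from Table 3.

```python
def classify_binder(ic50_um: float) -> str:
    """Map an IC50 value in micromolar to the affinity class used in the text."""
    if ic50_um < 0:
        raise ValueError("IC50 must be non-negative")
    if ic50_um < 1:
        return "strong"
    if ic50_um < 10:
        return "moderate"
    if ic50_um < 100:
        return "weak"
    return "non-binder"


def hit_rate(ic50_values_um) -> float:
    """Fraction of tested pairs that bind at any strength (IC50 < 100 uM)."""
    classes = [classify_binder(v) for v in ic50_values_um]
    return sum(c != "non-binder" for c in classes) / len(classes)


# Hypothetical IC50 values (uM) chosen to hit each class boundary.
example = [0.4, 1.0, 25.0, 100.0]
print([classify_binder(v) for v in example])
print(hit_rate(example))
```

Applied to the 24 tested pairs, a count of 17 binders gives 17/24, which is about 71%.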

Perhaps the most compelling results come from testing drugs against targets to which they were not previously known to bind, so-called drug repositioning (Table 3). By direct binding assay, we find that Desmopressin is a new 1 μM antagonist of REN, which was not reported previously. Similarly, Treprostinil is newly found to antagonize PPARG in the micromolar concentration range. Intriguingly, Esmolol is also observed to modulate PPARG, although it has been reported to act on ADRB1²¹.

Discussion

Decoding the direct targets of drugs is of great importance in drug repositioning and discovery, but it is laborious and costly. Hence, a reliable computational approach for direct drug target prediction would be of significant value. In this study, we propose a new WES algorithm that shows reasonable reliability in discriminating direct interactions from non-interactions, with good specificity and sensitivity (AUC = 0.85) and internal, external and experimental test accuracies of 78%, 70% and 71%, respectively.

Two steps in the construction of the WES algorithm deserve particular attention. First, the bulk of the features have little to do with the pharmacological properties of a ligand. To identify the pharmacology-related features, we weighted the structural features based on statistical tests and optimization analysis in an ensemble framework. This step not only reduces the dimensionality of the descriptor set, but also eliminates data noise.

Second, most ligands are dissimilar to each other even when they target the same protein. Thus, traditional methods based on single-molecule similarity may be insufficient to predict complex drug-target interactions. Here, we introduced the ensemble concept so that the model predicts a compound’s activity not from its similarity to one particular compound in the training set, but from its similarity to the overall features of an ensemble. Compared with the 1NN model, which judges the probability of a drug targeting a protein based only on the maximum similarity to a reference ligand, the WES algorithm generalizes better in predicting scaffold-hopping ligands.

Methods

Data sets

We obtained 822,643 protein-ligand pairs (PLPs) with inhibition constants (K_i), IC₅₀ values and protein sequences from the BindingDB database, covering 5,311 proteins and 490,282 ligands. K_i is the concentration of an inhibitor that is required to decrease the maximal rate of the reaction by half. IC₅₀ is a measure of the effectiveness of a substance in inhibiting a specific biological or biochemical function. To obtain a reliable data set, we filtered the PLPs with the following steps: (1) deleting redundant PLPs based on the protein sequences and the ligand InChIKeys; (2) removing PLPs for which the K_i and IC₅₀ values are unavailable or their average is larger than 10 μM; (3) when one protein shares more than 60% of its ligands with another protein, removing the protein with the smaller ligand set; (4) excluding ligands whose Tanimoto similarity to another ligand in the ligand set of the same protein is larger than 0.75; (5) deleting proteins with fewer than 5 ligands. As a result, 1,788 proteins and 68,777 ligands, constituting 98,327 PLPs, were obtained as the positive set. The negative set was constructed by randomly generating the same number of relations that do not overlap with the positive interactions. The two datasets were then used for training the models. All the data can be downloaded from our website related to this work (http://lsp.nwsuaf.edu.cn/tcmsp.php).
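Two of the data-preparation steps lend themselves to a short illustration: the 10 μM affinity filter of step (2) and the random generation of the negative set. The Python sketch below is a minimal reading of those steps, not the authors' code. The function names are hypothetical, and drawing proteins and ligands uniformly from those that occur in the positive set is an assumption, since the text only says the negatives were generated at random.

```python
import random


def is_positive_pair(measurements_um) -> bool:
    """Filter step (2): keep a pair if the mean of its available
    K_i / IC50 values (uM) is at most 10 uM."""
    values = [v for v in measurements_um if v is not None]
    return bool(values) and sum(values) / len(values) <= 10.0


def sample_negative_pairs(positives, seed=0):
    """Draw as many random (protein, ligand) pairs as there are positives,
    none of which overlaps with a positive pair.

    Assumption: proteins and ligands are sampled uniformly from those
    that appear in the positive set.
    """
    positives = set(positives)
    proteins = sorted({p for p, _ in positives})
    ligands = sorted({l for _, l in positives})
    # Every positive lies in the protein x ligand grid, so the number of
    # unobserved pairs is grid size minus the number of positives.
    if len(proteins) * len(ligands) < 2 * len(positives):
        raise ValueError("not enough unobserved pairs to sample from")
    rng = random.Random(seed)
    negatives = set()
    while len(negatives) < len(positives):
        pair = (rng.choice(proteins), rng.choice(ligands))
        if pair not in positives:
            negatives.add(pair)
    return negatives


# Toy positive set with hypothetical identifiers.
toy_positives = {("P1", "a"), ("P2", "b"), ("P3", "c")}
toy_negatives = sample_negative_pairs(toy_positives)
print(sorted(toy_negatives))
```

Fixing the seed keeps the sampled negative set reproducible between runs.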

Construction of feature matrix

CDK Fingerprint matrix

Ligands were represented by 1,024-bit hashed chemical fingerprints, which were computed using the CDK with default 2D parameters. The CDK is a scientific, LGPL-licensed library for bioinformatics, cheminformatics and computational chemistry written in Java. For the ligand set of a protein j consisting of n_j ligands, an initial matrix P = {F^(j)} (n_j × 1024) was generated to represent the protein, where each row is the binary fingerprint vector of a ligand k. To determine which bits of the fingerprint contribute most to distinguishing one protein from the others, we weighted each feature by the significance (P-value from a one-sided Fisher’s exact test) of its overrepresentation in the ligand set of the respective protein against the background incidence of the feature. The P-values were adjusted to control for multiple hypothesis testing, yielding q-values. The weight for each feature was then computed using the following formula:

where N is the total number of proteins in the training set. We used q = 0.05, the generally accepted threshold of statistical significance, as it ensures a reasonable discrimination of the feature weights (Figure S1).
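The statistical part of this step can be sketched in a few lines of Python. The sketch computes the one-sided Fisher P-value for the overrepresentation of one fingerprint bit in a ligand set as a hypergeometric tail, and then adjusts a list of P-values. The Benjamini-Hochberg procedure used here is an assumption, since the text says only that the P-values were adjusted to q-values, and the function names are hypothetical. The weight formula itself is not reproduced here.

```python
from math import comb


def fisher_overrepresentation_p(hits_in_set, set_size, hits_total, n_total):
    """One-sided Fisher's exact test: P(X >= hits_in_set) for
    X ~ Hypergeometric(n_total, hits_total, set_size).

    hits_in_set: ligands of the protein that have the bit set
    set_size:    ligands in the protein's ligand set
    hits_total:  ligands in the whole background that have the bit set
    n_total:     ligands in the whole background
    """
    upper = min(set_size, hits_total)
    favourable = sum(
        comb(hits_total, x) * comb(n_total - hits_total, set_size - x)
        for x in range(hits_in_set, upper + 1)
    )
    return favourable / comb(n_total, set_size)


def benjamini_hochberg(p_values):
    """Convert P-values to q-values (assumed adjustment method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Toy case: all 3 ligands of a set carry a bit that 3 of 6 background ligands carry.
p = fisher_overrepresentation_p(3, 3, 3, 6)
print(p)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Features whose q-value falls below the chosen cut-off (0.05 in the text) would then be treated as significantly overrepresented.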

Dragon Fingerprint matrix

In addition, ligands were also represented by 1,664 Dragon descriptors (http://www.talete.mi.it/index.htm). Dragon is a professional software package that calculates molecular descriptors frequently used to evaluate molecular structure-activity relationships. For the ligand set of a protein j consisting of n_j ligands, an initial matrix P = {D^(j)} (n_j × 1664) was generated to represent the protein, where each row holds the descriptor values d_k,i of a ligand k. All d_k,i were standardized as (d_k,i − μ_i)/σ_i, where μ_i and σ_i are the mean and standard deviation of descriptor i, respectively. To recognize the features that significantly differentiate the proteins, we weighted each feature based on the non-parametric Wilcoxon rank-sum test. The P-values were adjusted to control for multiple hypothesis testing, yielding q-values. The weight for each feature was then computed using equation (1).
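The descriptor standardization can be illustrated with a column-wise z-score. This Python sketch assumes that the mean and standard deviation are taken per descriptor column and that the population standard deviation is used; neither detail is stated explicitly in the text, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def standardize_columns(matrix):
    """Z-score each descriptor column of a ligands x descriptors matrix.

    Constant columns (standard deviation 0) are mapped to 0.0 to avoid
    division by zero; this handling is an assumption of the sketch.
    """
    columns = list(zip(*matrix))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) for col in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, mus, sigmas)]
        for row in matrix
    ]


# Two ligands, two hypothetical descriptors; the second is constant.
print(standardize_columns([[1.0, 10.0], [3.0, 10.0]]))
```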

Model building

First, for a protein j, we selected the m_j1 and m_j2 highest-weighted features from the CDK and Dragon descriptors, respectively. The protein j was then represented by the feature matrices P = {F^(j)} (n_j × m_j1) and P = {D^(j)} (n_j × m_j2). Finally, the fingerprint-based and Dragon-based weighted similarity scores between two ligands (l₁, l₂) were expressed as

where ∧ denotes the Boolean operator “AND” and ∨ denotes the Boolean operator “OR”.

In equation (3), <·,·> denotes the inner product and |·| denotes the modulus.
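The operators named above (AND and OR for the binary fingerprints, inner product and modulus for the Dragon descriptors) suggest a weighted Tanimoto coefficient and a weighted cosine similarity. The Python sketch below shows one plausible form of each. Exactly where the feature weights enter is an assumption of this sketch, and it should not be read as the authors' equations (2) and (3).

```python
from math import sqrt


def weighted_tanimoto(bits_a, bits_b, weights):
    """Sum of weights of bits set in both ligands (AND) divided by the
    sum of weights of bits set in either ligand (OR)."""
    shared = sum(w for a, b, w in zip(bits_a, bits_b, weights) if a and b)
    either = sum(w for a, b, w in zip(bits_a, bits_b, weights) if a or b)
    return shared / either if either > 0 else 0.0


def weighted_cosine(x, y, weights):
    """Cosine similarity with each descriptor scaled by its weight
    inside the inner product and the norms."""
    dot = sum(w * a * b for a, b, w in zip(x, y, weights))
    norm_x = sqrt(sum(w * a * a for a, w in zip(x, weights)))
    norm_y = sqrt(sum(w * b * b for b, w in zip(y, weights)))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


# Toy vectors and weights (hypothetical values).
print(weighted_tanimoto([1, 1, 0], [1, 0, 1], [2.0, 1.0, 1.0]))
print(weighted_cosine([1.0, 2.0], [2.0, 4.0], [1.0, 3.0]))
```

With all weights equal to 1, the first function reduces to the ordinary Tanimoto coefficient and the second to the ordinary cosine similarity.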

The number of features m (CDK and Dragon) used for the ligand set of a protein was determined by the optimization model (equation 4).

To obtain a good estimate of the overall similarity of a query ligand to the ligand set (ensemble), we first defined a raw score for this ligand by summing its weighted similarities S_i to the ligands in the set of protein j, keeping only those with S_i ≥ S_cut.

where

The threshold S_cut was determined by retrospective cross-fold analysis. SEA chooses S_cut so that the random Z scores follow a consistent, BLAST-like background probability distribution. WES takes a different route: by sampling across the range of S_cut choices, we chose the similarity threshold that leads to the highest ROC AUC. Scores below the threshold were discarded and do not contribute to the overall similarity.
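The raw score and the choice of S_cut by ROC AUC can be sketched as follows. This Python example is a simplified illustration with hypothetical function names: it computes the AUC directly from pairwise comparisons and scans a caller-supplied grid, whereas the cross-fold procedure used in the study is not described in enough detail to reproduce.

```python
def raw_score(similarities, s_cut):
    """Sum the similarities of a query ligand to a ligand set,
    keeping only those at or above the cut-off."""
    return sum(s for s in similarities if s >= s_cut)


def roc_auc(positive_scores, negative_scores):
    """AUC as the probability that a positive outscores a negative
    (ties count one half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def choose_s_cut(positive_sims, negative_sims, grid):
    """Return the cut-off from `grid` that gives the highest AUC."""
    def auc_at(s_cut):
        pos = [raw_score(sims, s_cut) for sims in positive_sims]
        neg = [raw_score(sims, s_cut) for sims in negative_sims]
        return roc_auc(pos, neg)
    return max(grid, key=auc_at)


# Toy data: each inner list holds one query ligand's similarities to a set.
positive_sims = [[0.9, 0.8, 0.1], [0.7, 0.2, 0.2]]
negative_sims = [[0.3, 0.3, 0.3, 0.3]]   # many weak similarities add up
print(choose_s_cut(positive_sims, negative_sims, [0.01, 0.5]))
```

In the toy data, a low cut-off lets many weak similarities accumulate into a misleadingly high raw score for the negative example, which is the effect the threshold is meant to suppress.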

Then, a model of the distribution of random raw scores was developed and fitted. Random raw scores were calculated by comparing a randomly selected ligand set (size = 50) to the ligand set of each protein. This gives the mean (μ) and standard deviation (σ) of the 50 random raw scores. The normalized raw score, denoted the Z score, is given by equation (6):

$$Z = \frac{\text{Raw score} - \mu}{\sigma} \qquad (6)$$

The Z score is calculated as follows:

1. For a protein j, choose 50 ligands at random from all ligands and calculate the mean and standard deviation of the raw scores at different similarity thresholds (S_cut) with step size 0.01, where 0 < S_cut < 1. Store all calculated mean values (μ_j = {μ_j1,…, μ_j100}) and standard deviation values (σ_j = {σ_j1,…, σ_j100}), along with the ligand set size of protein j.
2. For each S_cut, plot the ligand set size of each protein against the μ_j(S_cut) and σ_j(S_cut) values, respectively, and apply linear regression to determine the equations for μ_j and σ_j. Typically, the equations y_μ = α₁x + β₁ and y_σ = α₂x + β₂ are appropriate for standardizing the raw scores. Using the normalization in equation (6), calculate the Z score. If a new drug–target interaction has a Z score above a threshold, it is treated as a direct interaction. The threshold at which the highest F1 score was achieved in LOOCV was used to make predictions (equation 7).

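$$F_1 = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \qquad (7)$$

where precision is the ratio of the number of true positives to the number of predicted positives, and recall is the ratio of the number of true positives to the number of actual positives.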
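The two steps above can be sketched in Python: fit straight lines for the random-score mean and standard deviation against ligand set size, use them to turn a raw score into a Z score, and pick the decision threshold that maximizes F1. The function names are hypothetical, the least-squares fit is the ordinary closed form, and treating a score at or above the threshold as a predicted interaction is a simplification of this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def z_score(raw, set_size, mu_line, sigma_line):
    """Normalize a raw score with the size-dependent mean and deviation."""
    mu = mu_line[0] * set_size + mu_line[1]
    sigma = sigma_line[0] * set_size + sigma_line[1]
    if sigma <= 0:
        raise ValueError("fitted standard deviation must be positive")
    return (raw - mu) / sigma


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels):
    """Return (threshold, F1) for the observed score that maximizes F1."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        f1 = f1_score(tp, fp, fn)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Toy random-score statistics for three hypothetical set sizes.
sizes = [10, 20, 30]
mu_line = fit_line(sizes, [1.0, 2.0, 3.0])
sigma_line = fit_line(sizes, [0.5, 1.0, 1.5])
print(z_score(5.0, 20, mu_line, sigma_line))
print(best_threshold([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

Fitting the mean and standard deviation as functions of set size is what removes the dependence of the raw score on the number of ligands in each set.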

Z score integration

To express the likelihood that a ligand binds to a specific protein, we integrated the Z scores into a likelihood value by the Bayesian network method; this is the hybrid model in this work. The likelihood was defined as:

where P(Z = z₁,z₂|C = c) denotes the probability of observing the Z scores z₁ and z₂ in class c, and z₁ and z₂ represent the CDK and Dragon Z scores, respectively.

We evaluated the conditional probability by multivariate kernel density estimation, a nonparametric technique for density estimation, through the following formula:

where K is the Gaussian kernel; d is the dimensionality of the vector X (d = 2); n is the number of data samples in class c; and H is the d × d bandwidth (or smoothing) matrix, which is symmetric and positive definite. A ligand is considered to bind to a protein when the likelihood L is greater than a threshold θ, which is selected in the same way as the Z score threshold.
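A two-dimensional Gaussian kernel density estimate of this kind can be sketched in plain Python. The sketch makes two assumptions that the text does not confirm: the bandwidth matrix is taken to be diagonal, H = diag(h₁², h₂²), and the two class-conditional densities are combined as a ratio of the binding density to the non-binding density. All function names are hypothetical.

```python
from math import exp, pi


def kde_density(point, samples, bandwidth):
    """Gaussian KDE at `point` for 2D samples, with a diagonal bandwidth
    matrix H = diag(h1**2, h2**2) given as bandwidth = (h1, h2)."""
    h1, h2 = bandwidth
    norm = 1.0 / (2.0 * pi * h1 * h2 * len(samples))
    total = 0.0
    for x, y in samples:
        u = (point[0] - x) / h1
        v = (point[1] - y) / h2
        total += exp(-0.5 * (u * u + v * v))
    return norm * total


def likelihood_ratio(point, binders, non_binders, bandwidth, eps=1e-300):
    """Density under the binding class over density under the
    non-binding class (assumed form of the integrated likelihood)."""
    numerator = kde_density(point, binders, bandwidth)
    denominator = kde_density(point, non_binders, bandwidth)
    return numerator / max(denominator, eps)


# Toy (CDK Z score, Dragon Z score) pairs, hypothetical values.
binders = [(3.0, 3.0), (4.0, 2.5)]
non_binders = [(0.0, 0.0), (0.5, -0.5)]
print(likelihood_ratio((3.5, 3.0), binders, non_binders, (1.0, 1.0)))
```

A query whose pair of Z scores lies close to the binding samples gets a ratio well above 1, which is then compared with the threshold θ.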

Performance evaluation

The WES model was evaluated and verified with LOOCV. In detail, the WES algorithm is applied once for each interaction, using all other interactions as the training set and the selected interaction as a single-item test set. Four parameters, ACC (equation 10), SEN (equation 11), SPE (equation 12) and PRE (equation 13), were used to measure the overall accuracy, the accuracy of positive prediction, the accuracy of negative prediction and the positive predictive value of the model, respectively.

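$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (10)$$

$$\mathrm{SEN} = \frac{TP}{TP + FN} \qquad (11)$$

$$\mathrm{SPE} = \frac{TN}{TN + FP} \qquad (12)$$

$$\mathrm{PRE} = \frac{TP}{TP + FP} \qquad (13)$$

Here, TP, TN, FP and FN represent the numbers of true positives, true negatives, false positives and false negatives, respectively.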
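The four measures follow directly from the confusion-matrix counts. In the Python sketch below, the counts are hypothetical and were chosen only to mimic a balanced set with 85% sensitivity and 71% specificity; they are not the study's actual counts.

```python
def _ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and precision from counts."""
    return {
        "ACC": _ratio(tp + tn, tp + tn + fp + fn),
        "SEN": _ratio(tp, tp + fn),
        "SPE": _ratio(tn, tn + fp),
        "PRE": _ratio(tp, tp + fp),
    }


# Hypothetical balanced example: 100 positives and 100 negatives.
metrics = classification_metrics(tp=85, tn=71, fp=29, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On a balanced set, accuracy is the mean of sensitivity and specificity, so 85% and 71% give 78%, and the precision in this toy example comes out close to the reported 74%.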

Comparison to a 1NN model

We evaluated two 1NN models, using either CDK or Dragon fingerprints. Each drug was compared to all known ligands of a target, and the highest Tc value between the query drug and the known ligands was assigned to the drug-target pair. For each drug, we identified the lowest Tc value that yielded valid WES predictions using the respective fingerprint and collected all drug-target pairs with Tc scores above that threshold. We calculated an adjusted hit rate (equation 14):

The additional count for both numerator and denominator distinguishes cases where no predictions were confirmed.
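The 1NN score is simply the maximum Tanimoto coefficient between the query and the known ligands of a target. The Python sketch below represents fingerprints as sets of on-bit indices. The adjusted hit rate is written as (confirmed + 1) / (predicted + 1), which is an assumed reading of the "additional count" described above and not a confirmed form of equation (14); the function names are hypothetical.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0


def one_nn_score(query: set, ligand_set) -> float:
    """1NN score: highest Tc between the query and any known ligand."""
    return max(tanimoto(query, ligand) for ligand in ligand_set)


def adjusted_hit_rate(confirmed: int, predicted: int) -> float:
    """Assumed form: add one to both numerator and denominator, so that
    zero confirmations out of few or many predictions are distinguishable."""
    return (confirmed + 1) / (predicted + 1)


# Toy fingerprints (hypothetical on-bit indices).
query = {1, 2, 3}
known_ligands = [{2, 3, 4}, {1, 2, 3, 5}]
print(one_nn_score(query, known_ligands))
print(adjusted_hit_rate(0, 5), adjusted_hit_rate(0, 50))
```

Under this reading, zero confirmed hits out of 5 predictions scores higher than zero out of 50, whereas the unadjusted rate would be 0 in both cases.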

External data validation for binding and non-binding data

To examine the generalization ability of WES, we manually collected the direct binding data in PDB and non-binding data in BindingDB (see details in Results).

Experimental validation

Molecules including Bleomycin, Pasireotide, Fingolimod, Hydrocortamate, Vancomycin, Alpha-Linolenic Acid, Pentagastrin, Roxatidine acetate, Mupirocin, Rimonabant, Pravastatin, Treprostinil, Esmolol, Cetrorelix, Carfilzomib, Saquinavir, Lopinavir, Indinavir, Ritonavir, Desmopressin and Felypressin were purchased from Yitai Technology Ltd. (Wuhan, China). Assay kits for Enpp2 (Autotaxin Inhibitor Screening Assay Kit), Faah (FAAH Inhibitor Screening Assay Kit), PTGS2 (COX Inhibitor Screening Assay Kit), PPARG (PPARγ Ligand Screening Assay Kit) and REN (Renin Inhibitor Screening Assay Kit) were purchased from Cayman Chemical (Ann Arbor, MI, USA). All drugs were dissolved in DMSO and freshly prepared because activity is lost under long-term storage. Target activity was measured according to the manufacturer’s instructions. IC₅₀ values were determined using the Bliss method from eight data points per drug. Each drug-target interaction was measured independently three times to obtain a mean IC₅₀ value and its standard deviation.

Additional Information

How to cite this article: Zheng, C. et al. Large-scale Direct Targeting for Drug Repositioning and Discovery. Sci. Rep. 5, 11970; doi: 10.1038/srep11970 (2015).

References

1. Günther, S. et al. SuperTarget and Matador: resources for exploring drug-target relationships. Nucleic Acids Res 36, D919–D922 (2008).
2. Huang, C. et al. Systems pharmacology in drug discovery and therapeutic insight for herbal medicines. Brief Bioinform 15, 710–733 (2014).
3. Zheng, C. et al. System-level multi-target drug discovery from natural products with applications to cardiovascular diseases. Mol Divers 18, 621–635 (2014).
4. Kola, I. & Landis, J. Can the pharmaceutical industry reduce attrition rates? Nat Rev Drug Discov 3, 711–716 (2004).
5. Yamanishi, Y., Kotera, M., Kanehisa, M. & Goto, S. Drug-target interaction prediction from chemical, genomic and pharmacological data in an integrated framework. Bioinformatics 26, i246–i254 (2010).
6. Takarabe, M., Kotera, M., Nishimura, Y., Goto, S. & Yamanishi, Y. Drug target prediction using adverse event report systems: a pharmacogenomic approach. Bioinformatics 28, i611–i618 (2012).
7. Mei, J.-P., Kwoh, C.-K., Yang, P., Li, X.-L. & Zheng, J. Drug–target interaction prediction by learning from local information and neighbors. Bioinformatics 29, 238–245 (2013).
8. Kuruvilla, F. G., Shamji, A. F., Sternson, S. M., Hergenrother, P. J. & Schreiber, S. L. Dissecting glucose signalling with diversity-oriented synthesis and small-molecule microarrays. Nature 416, 653–657 (2002).
9. Haggarty, S. J., Koeller, K. M., Wong, J. C., Butcher, R. A. & Schreiber, S. L. Multidimensional chemical genetic analysis of diversity-oriented synthesis-derived deacetylase inhibitors using cell-based assays. Chem Biol 10, 383–396 (2003).
10. Stewart, L., Clark, R. & Behnke, C. High-throughput crystallization and structure determination in drug discovery. Drug Discov Today 7, 187–196 (2002).
11. Yamanishi, Y. et al. DINIES: drug–target interaction network inference engine based on supervised analysis. Nucleic Acids Res 42, W39–W45 (2014).
12. Cheng, A. C. et al. Structure-based maximal affinity model predicts small-molecule druggability. Nat Biotechnol 25, 71–75 (2007).
13. Fan, Y.-N., Xiao, X., Min, J.-L. & Chou, K.-C. iNR-Drug: Predicting the interaction of drugs with nuclear receptors in cellular networking. Int J Mol Sci 15, 4915–4937 (2014).
14. Yu, H. et al. A systematic prediction of multiple drug-target interactions from chemical, genomic and pharmacological data. PLoS One 7, e37608 (2012).
15. Keiser, M. J. et al. Relating protein pharmacology by ligand chemistry. Nat Biotechnol 25, 197–206 (2007).
16. Wang, Y.-H., Li, Y., Yang, S.-L. & Yang, L. Classification of substrates and inhibitors of P-glycoprotein using unsupervised machine learning approach. J Chem Inf Model 45, 750–757 (2005).
17. Liu, T., Lin, Y., Wen, X., Jorissen, R. N. & Gilson, M. K. BindingDB: a web-accessible database of experimentally determined protein–ligand binding affinities. Nucleic Acids Res 35, D198–D201 (2007).
18. Law, V. et al. DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res 42, D1091–D1097 (2014).
19. Berman, H., Henrick, K., Nakamura, H. & Markley, J. L. The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data. Nucleic Acids Res 35, D301–D303 (2007).
20. Salvat, R. S., Parker, A. S., Choi, Y., Bailey-Kellogg, C. & Griswold, K. E. Mapping the pareto optimal design space for a functionally deimmunized biotherapeutic candidate. PLoS Comput Biol 11, e1003988 (2015).
21. Muszkat, M. et al. The common Arg389gly ADRB1 polymorphism affects heart rate response to the ultra-short-acting β1 adrenergic receptor antagonist esmolol in healthy individuals. Pharmacogenet Genom 23, 25–28 (2013).

Acknowledgements

Funding: This work was supported by the Fund of Northwest A & F University and was financially supported by the National Natural Science Foundation of China [Grant number 31170796, 81373892] and New Century Excellent Talents in University of Ministry of Education of China.

Author information

Chunli Zheng, Zihu Guo and Chao Huang contributed equally to this work.

Authors and Affiliations

Bioinformatics Center, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, 712100, China
Chunli Zheng, Zihu Guo, Chao Huang, Ziyin Wu, Xuetong Chen, Yingxue Fu, Jinlong Ru, Piar Ali Shar & Yonghua Wang
Department of Materials Science and Chemical Engineering, Dalian University of Technology, Dalian, Liaoning, 116000, China
Yan Li
Department of Pathology and MCW Cancer Center, Medical College of Wisconsin, Milwaukee, WI, 53226, USA
Yuan Wang

Authors

Chunli Zheng
Zihu Guo
Chao Huang
Ziyin Wu
Yan Li
Xuetong Chen
Yingxue Fu
Jinlong Ru
Piar Ali Shar
Yuan Wang
Yonghua Wang

Contributions

Yonghua Wang formulated the idea of the paper and supervised the research. Zihu Guo and Chao Huang performed the research. Ziyin Wu ran the experiments. Yan Li, Xuetong Chen, Yingxue Fu, Jinlong Ru, Piar Ali Shar and Yuan Wang prepared the tables and figures. Chunli Zheng wrote the paper. All authors reviewed the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Electronic supplementary material

Supplementary Information

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Zheng, C., Guo, Z., Huang, C. et al. Large-scale Direct Targeting for Drug Repositioning and Discovery. Sci Rep 5, 11970 (2015). https://doi.org/10.1038/srep11970

Received: 13 November 2014
Accepted: 12 June 2015
Published: 09 July 2015
DOI: https://doi.org/10.1038/srep11970

