Identification of potential inhibitors based on compound proposal contest: Tyrosine-protein kinase Yes as a target

Chiba, Shuntaro; Ikeda, Kazuyoshi; Ishida, Takashi; Gromiha, M. Michael; Taguchi, Y-h.; Iwadate, Mitsuo; Umeyama, Hideaki; Hsin, Kun-Yi; Kitano, Hiroaki; Yamamoto, Kazuki; Sugaya, Nobuyoshi; Kato, Koya; Okuno, Tatsuya; Chikenji, George; Mochizuki, Masahiro; Yasuo, Nobuaki; Yoshino, Ryunosuke; Yanagisawa, Keisuke; Ban, Tomohiro; Teramoto, Reiji; Ramakrishnan, Chandrasekaran; Thangakani, A. Mary; Velmurugan, D.; Prathipati, Philip; Ito, Junichi; Tsuchiya, Yuko; Mizuguchi, Kenji; Honma, Teruki; Hirokawa, Takatsugu; Akiyama, Yutaka; Sekijima, Masakazu

doi:10.1038/srep17209

Download PDF

Article
Open access
Published: 26 November 2015

Identification of potential inhibitors based on compound proposal contest: Tyrosine-protein kinase Yes as a target

Shuntaro Chiba¹,
Kazuyoshi Ikeda²,
Takashi Ishida^1,3,
M. Michael Gromiha⁴,
Y-h. Taguchi⁵,
Mitsuo Iwadate⁶,
Hideaki Umeyama⁶,
Kun-Yi Hsin⁷,
Hiroaki Kitano^7,8,9,
Kazuki Yamamoto¹⁰,
Nobuyoshi Sugaya¹¹,
Koya Kato¹²,
Tatsuya Okuno¹³,
George Chikenji¹²,
Masahiro Mochizuki¹⁴,
Nobuaki Yasuo^1,3,
Ryunosuke Yoshino^15,16,
Keisuke Yanagisawa^1,3,
Tomohiro Ban^1,3,
Reiji Teramoto¹⁷,
Chandrasekaran Ramakrishnan⁴,
A. Mary Thangakani¹⁸,
D. Velmurugan¹⁸,
Philip Prathipati¹⁹,
Junichi Ito¹⁹,
Yuko Tsuchiya¹⁹,
Kenji Mizuguchi¹⁹,
Teruki Honma²⁰,
Takatsugu Hirokawa^21,22,
Yutaka Akiyama^1,3,21,22 &
…
Masakazu Sekijima^1,3,15,22

Scientific Reports volume 5, Article number: 17209 (2015) Cite this article

9404 Accesses
30 Citations
10 Altmetric
Metrics details

Subjects

Abstract

A search of broader range of chemical space is important for drug discovery. Different methods of computer-aided drug discovery (CADD) are known to propose compounds in different chemical spaces as hit molecules for the same target protein. This study aimed at using multiple CADD methods through open innovation to achieve a level of hit molecule diversity that is not achievable with any particular single method. We held a compound proposal contest, in which multiple research groups participated and predicted inhibitors of tyrosine-protein kinase Yes. This showed whether collective knowledge based on individual approaches helped to obtain hit compounds from a broad range of chemical space and whether the contest-based approach was effective.

RNAi-based drug design: considerations and future directions

Article 03 April 2024

Qi Tang & Anastasia Khvorova

Antibody drug conjugate: the “biological missile” for targeted cancer therapy

Article Open access 22 March 2022

Zhiwen Fu, Shijun Li, … Yu Zhang

Opportunities and challenges in design and optimization of protein function

Article 02 April 2024

Dina Listov, Casper A. Goverde, … Sarel Jacob Fleishman

Introduction

Novel drug discovery is generally considered to be a time-consuming and expensive process. A typical drug discovery process takes 12–14 years and costs approximately one billion dollars^1,2. Various approaches have been developed to explore promising drug candidates while reducing the financial and time burdens imposed in acquiring new molecular entities. Techniques such as combinatorial chemistry and high-throughput screening have been used in traditional drug development^3,4. Since the 1960s, the available scientific knowledge has been used to guide drug discovery and computer-aided drug discovery (CADD) is currently a highly efficient technique in achieving these objectives. In the post-genomic era, CADD can be combined with data from large-scale genomic amino acid sequences, three-dimensional (3D) protein structures and small chemical compounds and can be used in various drug discovery steps, from target protein identification and hit compound discovery to the prediction of absorption, distribution, metabolism, excretion and toxicity (ADMET) profiles^5,6,7. The use of CADD is expected to cut drug development costs by 50%⁸. CADD approaches are divided into two major categories: protein structure-based (SB) and ligand-based (LB) methods. The SB approach is generally chosen when high-resolution structural data such as X-ray structures are available for the target protein. The LB approach is used to predict ligand activity based on its similarity to known ligand information^9,10.

In SB, molecular docking is widely used, but other techniques are often used in combination, such as homology modeling, which models the target 3D structure when no X-ray structure is available¹¹ and molecular dynamics, which searches for a binding site that is not found in the X-ray structure^12,13. In LB, machine learning is used when active ligands and inactive ligands are known^14,15,16 and similarity search^17,18 or pharmacophore modeling^19,20,21 is used when only active ligands are known. Although these techniques are theoretically expected to be useful for the discovery of promising novel drug candidates, recent studies have shown that the gold standard remains to be established. von Korff et al.²² conducted a validation study using the Directory of Useful Decoys (DUD)²³ as a validation set and in multiple methods of SB and LB, they demonstrated that different methods proposed compounds in different chemical spaces as hit molecules for the same target protein. In other words, the use of multiple different search methods is expected to yield hit molecules in a broader range of chemical space than the use of specific SB or LB methods alone. This study aimed at using multiple CADD methods through open innovation to achieve a level of hit molecule diversity that is not achievable with any particular single method. Therefore, we held a compound proposal contest, in which multiple research groups participated and predicted inhibitors of a target protein. This showed whether collective knowledge based on individual approaches helped to obtain hit compounds from a broad range of chemical space and whether the contest-based approach was effective.

In the contest, a library of 2.2 million commercially available compounds was provided and proposed compounds from each participant group were purchased; these were then tested with an inhibitory assay. Such a large number aimed to simulate a real CADD-oriented procedure. We also chose the tyrosine-protein kinase Yes, a member of the Src family, as a target protein because it has an established inhibitory activity assay protocol. Its inhibition is also clinically important and participants of the contest could employ either SB and/or LB approaches for this target.

Yes has a multi-domain structure, which consists of SH3 (residues 97–144), SH2 (residues 97–144) and tyrosine kinase (residues 277–526) domains. The Tyr416 residue of the tyrosine kinase domain is phosphorylated during activation of the kinase. Hence, the tyrosine kinase domain located at the C-terminal region of Yes is a direct target for predicting hit compounds. The 3D structures of several kinase proteins have been determined and stored in the Protein Data Bank (PDB)²⁴. However, the structure of Yes has not yet been determined (as of April 2014). Our prior-to-the-contest analysis showed that Yes has a high sequence identity with many other protein kinases (e.g., PDBID: 1Y57²⁵, 2SRC²⁶, 1FMK²⁷), of which structures were determined at high resolution. This indicates that homology modeling can be effectively used to obtain the 3D structure of Yes.

On the ligand point of view various open source drug discovery databases such as BindingDB²⁸, ChEMBL²⁹, DrugBank³⁰ and PubChem³¹ contain medicinal chemistry data on a number of drug compounds, active and inactive compounds and targets. For example, DrugBank is useful for searching known small-compound drugs of Yes and it contains data on a number of Food and Drug Administration (FDA)-approved small-compound drugs including dasatinib, which has been approved as an anticancer drug that targets mainly Abl and Src family kinases. Kinase SARfari³², a satellite database of ChEMBL that contains data on >4000 compounds targeting Src family kinases, is useful for structure–activity relationship (SAR) analysis. Specifically, it contains data on 188 bioactive compounds that inhibit Yes, 30 of which have a half maximal inhibitory concentration (IC₅₀) of <50 nM. The availability of bioactivity data aids realistic identification of potential hit compounds.

The compound proposal contest was organized by the Initiative for Parallel Bioinformatics (IPAB). It started on January 7 2014 and ended on March 20 2014. Ten groups participated in the contest. Any groups could participate in the contest if its members agreed that all proposed compounds and methods were to be made public. The participants were asked to propose a prioritized set of 120 compounds. We selected the top 50 compounds from each group and 118 additional compounds via clustering analysis of the submitted compounds, for a total of 600 unique compounds. In an inhibitory activity assay, 24 of the 600 compounds showed inhibition at various ranges and seven were identified as potential hit compounds. Among the potential hits IC₅₀ of three compounds with a novel structure were estimated. The salient features of the methods, experimental validations and potential inhibitors are discussed below.

Details of the contest

Compound library

Enamine Ltd provided a collection of approximately 2.2 million small compounds that are commonly used in high-throughput screening (HTS) studies to identify potential hit compounds. The compounds were readily available in the Enamine library; therefore, we used them for screening. Enamine libraries are available at http://www.enamine.net/.

Computational methods

Different methods have been adopted to identify potential inhibitors of Yes and they can be roughly classified into the following two categories: the protein structure-based method (SB) and the ligand-based method (LB). Here, we define SB as a docking simulation or a geometric hashing technique that utilizes protein structure. LB involves screening techniques based on structural similarity comparison to known inhibitors or SAR derived from known inhibitors. The comparative analysis of the methods used by the 10 different groups is presented in Table 1. Some groups employed a multi-step approach, where LB was employed to screen the Enamine library and SB was applied to the resultant compounds (denoted by LB → SB). Others used LB and SB simultaneously to screen the Enamine library (denoted by LB&SB).

Table 1 Details of Yes structure and processing methods of Enamine library by different groups to select 120 compounds.

Full size table

Protein structure-based method (Groups 1, 2, 3, 5, 7, 9 and 10)

In this approach, the structure of the target protein is the main focus for identifying potential inhibitors. Initially, the 3D structure of the target protein is obtained via homology modeling using Modeller^33,34,35, FAMS^36,37,38, etc. Potential inhibitors are then identified using docking, SAR and molecular dynamics (MD) simulations. The SB method was used by seven of the competing groups, 1, 2, 3, 5, 7, 9 and 10. Group 1 (G1): Based on a BLAST search of PDB, homologs of Yes that contains ligands were searched and 25 structures were identified. The Tanimoto indices between the 25 ligands and the Enamine library compounds were calculated and 1241 compounds with indices >0.55 were chosen for screening. The protein structure used in the docking simulation was created using a template structure with the smallest P-value. Finally, the 1241 compounds were docked using ChooseLD based on FPAScore function. Group 2 (G2): A series of sequence alignments and binding site investigations were performed to create a protein structure used in subsequent docking simulation and SAR studies. The following three approaches were employed separately: docking, SAR and similarity comparison with known inhibitors. The docking method used two machine learning systems³⁹ based on three docking simulation packages. The SAR model was created using PubChem BioAssay data⁴⁰ (AID 686947) and was applied to the Enamine library. For similarity comparison, a small number of known inhibitors of Yes were selected from the PubChem BioAssay data and the Enamine library was searched for similar compounds. Finally, the two-dimensional (2D)/3D structures of those compounds as well as their binding poses were compared with those of the native inhibitor found in the target structure. Group 3 (G3): Known inhibitors were selected from the literature^{41,42,43,44,45,46} and PubChem BioAssay data⁴⁰ (AID 686946). The Enamine library compounds were screened by similarity search using the known inhibitors. Then, docking simulation of the screened compounds was conducted. The model protein structure was selected based on the docking poses that reproduced those of the known ligands PP2 and dasatinib. Finally, the compounds selected via LB and SB were screened using the pseudo-consensus method⁴⁷. In total, 53 compounds from LB, 53 from SB and 14 from the pseudo-consensus method were included in the final list. Group 5 (G5): A multiple-template ligand was modeled using 70 protein-ligand complexes that had a protein sequence identity with Yes of >70%. Proteins and their bound ligands were superimposed by the protein structure alignment program MICAN⁴⁸ against a modeled Yes structure. The Enamine library compounds were compared with the multiple-template model using the geometric hashing technique⁴⁹, where scoring was defined as the number of coincident atoms minus the protein-compound crash penalty, to identify potential hit compounds⁵⁰. Group 7 (G7): MD simulation and fragment molecular orbital calculation of the Src-dasatinib complex structure were performed to identify residues that interact with dasatinib with high retention or interaction energy, respectively. This information was utilized to define constraints for docking simulation, i.e., to specify specific residues that the docked compounds should interact with. The protein structure for the docking simulation was created using homology modeling. Group 9 (G9): Data on active ligands were collected from BindingDB⁵¹ and their physicochemical characteristics were computed and compared with the set of the 2.2 million Enamine library compounds for primary screening. Homology modeling and MD studies were performed to select the best structural orientations and the resultant eight structures (one homology and seven MD structures) were independently subjected to docking simulation of the screened compounds, active inhibitors and decoys. The screened compounds with high docking scores were considered only when the protein structures used could supersede those of the active inhibitors and decoys in terms of scores. Group 10 (G10): The modeled 3D structure of Yes was validated by analyzing its binding poses with known ligands (dasatinib, saracatinib and bosutinib). These predicted binding poses were captured as a consensus SB pharmacophore model, which was used to screen the Enamine library. To further prioritize the compounds, an enriched substructure filter, which was derived using Src family kinase inhibitors retrieved from BindingDB⁵¹, was applied to the screened compounds. This list of 2000 potential hit compounds was clustered. Clusters were prioritized after visual inspection and a representative or the best hit of each cluster was chosen for the final list.

Ligand-based method (Groups 4, 6 and 8)

In the LB approach, potential hit compounds were primarily identified using the activity data for available kinases. Three of the groups, 4, 6 and 8, used this approach. Group 4 (G4): The IC₅₀s of tyrosine kinase inhibitors were downloaded from Kinase SARfari⁵² and relevant indices (pIC₅₀, ligand lipophilic efficiency⁵³, binding efficiency index⁵⁴ and surface efficiency index (SEI)⁵⁴) were calculated. Indices were related with physicochemical properties. The experimental indices; physicochemical properties such as hydrophobicity, volume and pI of the 36 amino acid residues surrounding the ATP binding sites (ABS36)⁵⁵; and compound descriptors were trained with support vector regression and three models (SEI OETree, SEI MACCS and SEI OBFP2) were proven to predict experimental values better than other models. These models were applied to the Enamine library to predict the active compounds. It should be noted that Group 4 focused on identifying compounds with good SEI rather than good inhibition activity. Group 6 (G6): PubChem BioAssay data of 858 compounds⁵⁶ of Yes (AID 686947) were downloaded and their inhibition rates were normalized at a concentration of 15 μM using linear interpolation between the nearest neighbors of actual measured activities. The activities and a set of molecular descriptors were trained using a random forest model⁵⁷ and the model was utilized for predicting the potential hit compounds. Group 8 (G8): PubChem BioAssay data of 858 compounds⁵⁶ (AID 686947) of Yes were downloaded. Compounds with an activity <1 μM were defined as active inhibitors and the rest were classified as inactive. In addition to the inactive inhibitors, some compounds in the Enamine library were defined as inactive to exploit large-scale inactive compounds for training in the SAR model. The SAR model was developed by comparing the activity data with 772 descriptors using the balanced random forest method^57,58 and was applied to the Enamine library to predict active compounds. An imbalance in the numbers of active and inactive compounds was addressed during training⁵⁹.

Selection of compounds for experimental inhibitory assay

Initially, we selected the topmost 50 compounds from each of the 10 groups to obtain a total of 482 unique compounds. In addition, following cluster analysis, 118 additional compounds were selected using a scoring procedure and manual inspection. We used k-means clustering to classify the compounds into 10 clusters and subsequently computed the similarity score relative to structures of known inhibitors of Src family kinases deposited in ChEMBL using the Tanimoto principle based on MACCS fingerprint⁶⁰. We selected compounds with a maximum similarity score of <0.72 to identify novel inhibitors. We defined a consensus number in each cluster, based on a number of different groups that proposed any compounds to each cluster. From the consensus, we chose 118 compounds, for a total of 600 compounds to be tested by the inhibitory activity assay.

Experimental procedures

We outsourced the inhibitory activity assay of the proposed compounds to Bienta (http://bienta.net/). Bienta utilized HTS in order to estimate a percentage inhibition rate at 10 μM of each compound. All HTS procedures were performed in accordance with the Promega Technical Manual for ADP-Glo™ Kinase Assay (Fitchburg, WI, USA. Catalog number: V9102). Human recombinant Yes (NCBI reference sequence: NP_005424.1) was purchased from BPS Bioscience (San Diego, CA, USA. Catalog number: 40488). Staurosporine, a well-known pan-kinase inhibitor typically used as a reference in kinase assays, was selected as an active inhibitor for Yes. The copolymer of Glu and Tyr (Glu:Tyr = 4:1, Sigma Aldrich, St. Louis, MO, USA. Catalog number: 81357) was used as a generic tyrosine kinase substrate. The final assay reagent concentrations were 5.5 nM Yes, 0.013 mM ATP and 0.2 mg/mL substrate. Each compound, at a final concentration of 10 μM, was dispensed in four wells of a 384-well plate. The average of the four values was used to calculate the inhibition rate for each compound. We evaluated each compound for the following three criteria based on the inhibition rate:

A
The compound’s inhibition rate was higher than the average inhibition rate of all compounds in the same plate plus three-fold of the standard deviation of inhibition rates in the same plate. In this calculation, the inhibition rates for positive and negative controls were not considered.
B
The compound’s inhibition rate was >25% (except for the compounds classified in “A”). This condition was implemented to eliminate false negatives.
C
The compound’s inhibition rate was the highest in its group (except for the compounds classified in “B”).

The compounds that identified with any of these criteria proceeded to the secondary assay conducted on one 384-well plate. Each compound was tested in six wells and the average of the six values was used to determine the percentage inhibition rate. We used an individual inhibition rate of >30% to identify the compounds that could potentially serve as inhibitors to Yes.

Results and Discussion

Common compounds identified by different methods

Ten groups each submitted 120 compounds for a total of 1200 compounds. The analysis of the submitted compounds showed that 17 compounds overlapped between two groups and one compound was the same in three groups. In total, 75% of overlapping compounds were proposed mainly by groups that utilized known ligand information directly or indirectly (G2, G3, G6 and G8). The higher overlapping rate would be attributed to the same information that these LB methods employed, i.e PubChem BioAssay AID 686947. All of the overlapping compounds were selected for the inhibitory activity assay. The behaviors of these compounds were distinct: two were identified as potential hit compounds and the others did not show any inhibition in assay experiments.

Inhibition rates of selected compounds

We conducted inhibition assay experiments at 10 μM for all of the selected 600 compounds to measure the percentage of inhibition. Among them, 24 compounds satisfied our primary hit conditions and were tested in the secondary assay. The secondary assay was conducted on a single plate and the results are shown in Table 2. Critical evaluation of these 24 compounds identified seven compounds as probable inhibitors of Yes and their structures along with their inhibition rates are presented in Table 3. These seven compounds were again tested from their fresh powders to confirm the inhibition rates. Compounds with inhibition rates >50% in the fresh powder assay were Z1546610485 (56.3%, G2, G6 and G8), Z820655914 (89.0%, G5), Z1546616191 (95.4%, G6), Z1157725083 (65.0%, G8) and Z653349554 (66.7%, G10).

Table 2 Activity details of the 24 compounds that show minimum 25% inhibition at a concentration of 10 μM.

Full size table

Table 3 Potential hit compounds in validation assay (from fresh powder).

Full size table

Compound Z1546610485 was identified by three groups that employed the LB method (LB or LB&SB in Table 1). The compound is known as gefitinib and is listed in the PubChem Bioactive database⁵⁶ (AID686946 and AID686947) as a tyrosine kinase inhibitor, which may explain why it was independently proposed by three groups. This shows that the LB methods used by these groups could correctly identify a hit compound. On the other hand, Z1546616191, a known tyrosine kinase inhibitor named sunitinib, also listed in the database, was proposed only by G6. It is unclear why the other LB groups did not propose it in their lists of 120 compounds. The number of compounds tested in this contest (a minimum of 50 per group) is insufficient to derive conclusive insight.

According to the PubChem BioAssay data, the inhibition rates of gefitinib and sunitinib at 8.6 μM with 4 nM Yes, 0.1 mM ATP and 0.3 mg/mL substrate (poly Glu:Tyr = 4:1) were 72.1% and 93.9%, respectively, indicating consistency of our data with the literature. The inhibition rates of a few compounds exceeded that of gefitinib in this study.

Among the 118 compounds that were added by the clustering analysis, there were no potential hit molecules. This might be because the compounds were selected so that their similarity score to known inhibitors was <0.72.

Chemical space diversity of submitted compounds

To examine the diversity of submitted compounds we conducted the principal component (PC) analysis of 211546 compounds’ MACCS fingerprint⁶⁰, which were 10% of the compound library and randomly sampled, followed by the projection of the sampled compounds (Random in Fig. 1A), assayed compounds of each group (G1−G10) and the seven potential hits (seven Hits) onto PC1 and PC2. The cumulative variances of PC1 and PC2 were 26% and 50%, respectively, indicating that PC1 and PC2 could well account for the chemical space of the compound library. Figure 1A shows that compounds submitted by the same group show a tendency to gather in the chemical space, i.e., the chemical space covered by compounds submitted only by one group tends to be small. To quantify the coverage, we divided the chemical space into 13 for both PC1 and PC2, as shown in Fig. 1A and counted a number of grids that contains at least one compound of a group concerned. The coverage numbers for all the groups were as follows: Random: 124, G1: 13, G2: 12, G3: 26, G4: 3, G5: 14, G6: 19, G7: 27, G8: 18, G9: 18, G10: 18, G1−10: 54 (G1−10 contains all the compounds that all the groups submitted) and Known: 89. These values show that the chemical space coverage submitted only by one group tends to be small. On the other hand, the coverage of the merged compounds, G1−10, was comparable to the chemical space of known Src inhibitors. Because the seven potential hits distributed over chemical space that could not be covered only by one group and were different from each other (see structures in Table 3), the contest-based approach can enhance diverse sampling. The coverages of G3 and G7 were relatively high because G3 employed three different approaches and made a compounds list from the three methods. G7 employed the more SB-oriented method that utilized information of lesser known inhibitors.

Figure 1B–D show the number density of the compound library, Src known inhibitors and assayed compounds, respectively, in the chemical space. The density number map of assayed compounds are not similar to that of the compound library but are similar to Src known inhibitors, indicating that the assayed compounds were not just randomly chosen but enriched toward Src inhibitors.

Characteristic features of submitted and assayed compounds

We analyzed the characteristic features of the 1200 submitted compounds (1180 unique compound structures) using their chemical properties: molecular weight (MW), ALogP, number of hydrogen bond acceptors (HBA), number of hydrogen bond donors (HBD), number of aromatic rings (AROM) and number of rotatable bonds (ROTB) using Canvas Version 2.2.013⁶¹. In addition, we analyzed the structures of 3528 known Src family kinase inhibitors retrieved from ChEMBL and BindingDB and the randomly selected 211546 structures from the Enamine library for comparison. Figure 2 shows the distribution of the six chemical properties for these three sets of compounds. We observed that for four of the six considered properties, there was a marked difference between the average values of the submitted compounds (AlogP: 3.2, HBA: 3.6, HBD: 1.4, AROM: 3.6) and the Enamine library compounds (AlogP: 2.1, HBA: 3.1, HBD: 1.0, AROM: 2.6) and the properties were biased toward the average values of Src family kinase inhibitors (AlogP: 3.8, HBA: 4.5, HBD: 2.3, AROM: 4.0). Notably, the average ROTB value of the submitted compounds (ROTB: 4.7) was smaller than that of the Enamine library (ROTB: 5.5), which is closer to the average ROTB value of the Src family kinase inhibitors (ROTB: 6.4), although the average ROTB value of the potential hit compounds was 5.7. In addition, the average MW of submitted compounds (MW: 361) was similar to that of the Enamine library (MW: 365), although the average MW of the potential hit compounds was 391, which is closer to the MW of the Src family kinase inhibitors (MW: 455). This analysis suggests that the prediction methods could be improved by considering ROTB and MW. In particular, for docking simulation, special consideration would be necessary when known inhibitors have a large ROTB value, because it is more difficult to cover the conformational space of compounds.

Further, we have surveyed the novelty of the submitted compounds. Figure 3 shows a distribution of maximum Tanimoto similarity coefficients for the submitted compounds compared with the known Src family inhibitors. We also measured the difference between the average similarity scores of different approaches to understand the effect of methodologies for selecting compounds. The 10 methods used in the contest are classified into two filter types (see the “Computational methods” section): SB: G1, G2, G3, G5, G7, G9 and G10; and LB: G4, G6 and G8. The average similarity scores for SB and LB are 0.765 and 0.767, respectively, indicating no apparent difference. Because G4′s method was modeled to identify compounds with good SEI, it may affect the average similarity score. The average similarity scores for LB without the inclusion of G4′s compounds is 0.824, indicating that compounds proposed by SB were more novel than those proposed by LB.

Comparison of different approaches for identifying the potential hit compounds

The systematic comparison of various methods for identifying potential hit compounds can provide insight for a deeper understanding of the concepts of drug design. Among the seven potential hit compounds, six were proposed by groups that either mainly or partly adopted an LB screening process or a ligand template (pharmacophore) derived from known inhibitors. The other compound was proposed by a group that also utilized such information to discriminate a good protein structure for docking from several model structures that are able to discriminate between known active and inactive compounds with respect to docking scores. Therefore, this study indicates that the usage of experimental binding affinity or binding poses is necessary to identify potential inhibitors. This concept reveals the importance of analyzing specific interactions to select potential hit compounds. The application of machine learning techniques helped to map the input features with binding affinity. Further, SB methods combined with pharmacophore modeling and docking could be useful in identifying potential hit compounds. Overall, the comparison of methods indicates the importance of balancing between LB and SB methods to identify inhibitors. Furthermore, we observed that the inclusion of visualization and detailed analysis are important for identifying potential hit compounds. With respect to speed, LB methods are faster than SB models and machine learning techniques could aid successful prediction.

As for the novelty of the potential hit compounds, the LB methods identified compounds similar to the known Src family kinase inhibitors (e.g., similarity scores of Z1157725083, Z1546616191 and Z1546610485 relative to known inhibitors were 0.80, 1.0 and 1.0, respectively). On the other hand, the SB methods predicted compounds with relatively lower similarity to the known Src family kinase inhibitors (e.g., similarity scores of Z820655914, Z126204226 and Z653349554 were 0.75, 0.77 and 0.79, respectively). When novelty is of interest, an SB method with the aid of known inhibitor information and/or docking poses is a good choice.

In addition, we calculated two different ligand efficiency indices: inhibition rate (%) divided by MW or topological polar surface area (TPSA), as shown in the Supporting Information (Supplementary Fig. S1). Two compounds (Z1095352660 and Z993990690) proposed by an LB approach (G4) are plotted in the upper-left corner of Supplementary Fig. S1. These compounds are small (MW of 151 and 255 for Z1095352660 and Z993990690, respectively) but have relatively high ligand efficiencies compared to their sizes (inhibition rate/MW: 0.26 and 0.10, respectively; inhibition rate/TPSA: 2.5 and 2.0, respectively). G4′s method focused on SEI and successfully identified compounds with high ligand efficiency comparable to those of known inhibitors.

Comparison between the potential hit and negative compounds

The list of compounds that did not show any inhibition of Yes is presented in Supplementary Table S1. These compounds can be used as decoys for docking and other studies. We have analyzed the characteristic features of the seven potential hit compounds and performed a comparison with the negative compounds. The physicochemical features of all 1180 proposed compounds, 24 active compounds and seven selected compounds, along with the 574 decoys, are shown in Fig. 4. Because there are no significant differences between the selected compounds and the rest of the submitted compounds, the negative compounds could be good decoys and may be helpful to further refine active inhibitors.

The strategy of using different approaches together

The contest based approach is the outcome of ten individual methods (Table 1), which are independent to each other on various perspectives: (i) different templates to obtain the target in SB approach, (ii) database of actives and decoys in LB method, (iii) variations in software packages for identifying the hits and docking and (iv) scoring procedures for ranking the hit compounds. Although the main objective of each method is to identify the lead compounds by covering a large chemical space and utilizing standard procedures none of them is able to identify all the hit compounds, which have been observed experimentally. We anticipated that all the methods could identify few hit compounds and most of them are not overlapping with each other. Hence, we have used the strategy of collecting the top ranked compounds in each method for verifying the hit compounds using experimental techniques.

The advantages of using different approaches together

Each prediction method utilized advanced techniques and reliable procedures reported in the literature for identifying potential hit compounds. The overlapping compounds are minimal among different methods and all methods provided diverse list of compounds with a strong basis for understanding the activity. Further, no single method is efficient to identify the hit compounds and it is not possible for a single group to perform all computational methods. The outcome of each method is complimenting with each other and hence the combination of methods could help to identify the hit compounds realistically. Interestingly, the hit compounds identified by experiments have been proposed by different groups participated in the contest. The contest based approach made it possible to narrow down the experiments from 2.2 million to 600 compounds and 24 of them are identified as hits.

Suggestions for future based on the experience gained in this contest

The outcome of the contest based approach provide several insights for future directions: (i) comparative performance of structure based and ligand based approaches for identifying the hits, (ii) list of actives and decoys for the target cYes kinase, which could be used to refine the methods and validating new methods, (iii) probable interaction and binding modes for target based drug design, (iv) utilizing efficient, reliable and wide range of information for identifying lead compounds and (v) combination of methods to identify and rank potential compounds. Looking back into known experimental data on several ligands it is possible to predict additional compounds with better affinity and understand the mechanism.

Conclusions

We conducted a contest-based approach to identify various inhibitors of the tyrosine-protein kinase Yes. In total, 10 groups participated in the contest and tackled the challenge using their own methods. The proposed compounds from all the groups collectively had a more diverse chemical space than compounds proposed only by each group, indicating that a contest-based approach can supply the early stage of drug discovery with various initial inhibitors. The contest was also successful in identifying 24 compounds with inhibition activity and seven potential hit compounds. The IC₅₀ evaluation of Z820655914, Z653349554 and Z1157725083 by the 8-point curve showed that the values of Z820655914 and Z1157725083 were >100 μM. The values for Z653349554 suggested that it had been reacted with a reagent. The potential hit compounds can be further considered for the next phase of drug design. Our study revealed that using information about known inhibitors or their docking poses was necessary for both the LB and the SB approaches.

Additional Information

How to cite this article: Chiba, S. et al. Identification of potential inhibitors based on compound proposal contest: Tyrosine-protein kinase Yes as a target. Sci. Rep. 5, 17209; doi: 10.1038/srep17209 (2015).

References

Paul, S. M. et al. How to improve R&D productivity: the pharmaceutical industry’s grand challenge. Nat Rev Drug Discov 9, 203–214, doi: 10.1038/nrd3078 (2010).
Article CAS PubMed Google Scholar
Morgan, S., Grootendorst, P., Lexchin, J., Cunningham, C. & Greyson, D. The cost of drug development: a systematic review. Health Policy 100, 4–17, doi: 10.1016/j.healthpol.2010.12.002 (2011).
Article PubMed Google Scholar
Haggarty, S. J. et al. Dissecting cellular processes using small molecules: identification of colchicine-like, taxol-like and other small molecules that perturb mitosis. Chem Biol 7, 275–286 (2000).
Article CAS Google Scholar
Young, K. et al. Identification of a calcium channel modulator using a high throughput yeast two-hybrid screen. Nat Biotechnol 16, 946–950, doi: 10.1038/nbt1098-946 (1998).
Article CAS PubMed Google Scholar
Egan, W. J., Merz, K. M., Jr. & Baldwin, J. J. Prediction of drug absorption using multivariate statistics. J Med Chem 43, 3867–3877 (2000).
Article CAS Google Scholar
Jorgensen, W. L. & Duffy, E. M. Prediction of drug solubility from structure. Adv Drug Deliv Rev 54, 355–366 (2002).
Article CAS Google Scholar
Sliwoski, G., Kothiwale, S., Meiler, J. & Lowe, E. W., Jr. Computational methods in drug discovery. Pharmacol Rev 66, 334–395, doi: 10.1124/pr.112.007336 (2014).
Article CAS PubMed PubMed Central Google Scholar
Tan, J. J. et al. Therapeutic strategies underpinning the development of novel techniques for the treatment of HIV infection. Drug Discov Today 15, 186–197, doi: 10.1016/j.drudis.2010.01.004 (2010).
Article CAS PubMed PubMed Central Google Scholar
Ou-Yang, S. S. et al. Computational drug discovery. Acta Pharmacol Sin 33, 1131–1140, doi: 10.1038/aps.2012.109 (2012).
Article CAS PubMed PubMed Central Google Scholar
Jorgensen, W. L. The many roles of computation in drug discovery. Science 303, 1813–1818, doi: 10.1126/science.1096361 (2004).
Article CAS ADS PubMed Google Scholar
Chen, L. et al. From laptop to benchtop to bedside: structure-based drug design on protein targets. Curr Pharm Des 18, 1217–1239 (2012).
Article CAS Google Scholar
Gohlke, H. & Klebe, G. Approaches to the description and prediction of the binding affinity of small-molecule ligands to macromolecular receptors. Angew Chem Int Ed Engl 41, 2644–2676, doi: 10.1002/1521-3773(20020802)41:15<2644::aid-anie2644>3.0.co;2-o (2002).
Lamb, M. L. & Jorgensen, W. L. Computational approaches to molecular recognition. Curr Opin Chem Biol 1, 449–457 (1997).
Article CAS Google Scholar
Ma, J., Sheridan, R. P., Liaw, A., Dahl, G. E. & Svetnik, V. Deep neural nets as a method for quantitative structure-activity relationships. J Chem Inf Model 55, 263–274, doi: 10.1021/ci500747n (2015).
Article CAS PubMed Google Scholar
Wang, F. et al. Computational screening for active compounds targeting protein sequences: methodology and experimental validation. J Chem Inf Model 51, 2821–2828, doi: 10.1021/ci200264h (2011).
Article CAS PubMed Google Scholar
Khamis, M. A., Gomaa, W. & Ahmed, W. F. Machine learning in computational docking. Artif Intell Med 63, 135–152, doi: 10.1016/j.artmed.2015.02.002 (2015).
Article PubMed Google Scholar
Muchmore, S. W., Edmunds, J. J., Stewart, K. D. & Hajduk, P. J. Cheminformatic tools for medicinal chemists. J Med Chem 53, 4830–4841, doi: 10.1021/jm100164z (2010).
Article CAS PubMed Google Scholar
Maldonado, A. G., Doucet, J. P., Petitjean, M. & Fan, B. T. Molecular similarity and diversity in chemoinformatics: from theory to applications. Mol Divers 10, 39–79, doi: 10.1007/s11030-006-8697-1 (2006).
Article CAS PubMed Google Scholar
Schuster, D. et al. The discovery of new 11beta-hydroxysteroid dehydrogenase type 1 inhibitors by common feature pharmacophore modeling and virtual screening. J Med Chem 49, 3454–3466, doi: 10.1021/jm0600794 (2006).
Article CAS PubMed Google Scholar
Wittayanarakul, K. et al. Insights into saquinavir resistance in the G48V HIV-1 protease: quantum calculations and molecular dynamic simulations. Biophys J 88, 867–879, doi: 10.1529/biophysj.104.046110 (2005).
Article CAS PubMed Google Scholar
Yoshino, R. et al. Pharmacophore modeling for anti-chagas drug design using the fragment molecular orbital method. PLoS One 10, e0125829, doi: 10.1371/journal.pone.0125829 (2015).
Article CAS PubMed PubMed Central Google Scholar
von Korff, M., Freyss, J. & Sander, T. Comparison of ligand- and structure-based virtual screening on the DUD data set. J Chem Inf Model 49, 209–231, doi: 10.1021/ci800303k (2009).
Article CAS PubMed Google Scholar
Huang, N., Shoichet, B. K. & Irwin, J. J. Benchmarking sets for molecular docking. J Med Chem 49, 6789–6801, doi: 10.1021/jm0608356 (2006).
Article CAS PubMed PubMed Central Google Scholar
Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Research 28, 235–242, doi: 10.1093/nar/28.1.235 (2000).
Article CAS PubMed PubMed Central Google Scholar
Cowan-Jacob, S. W. et al. The crystal structure of a c-Src complex in an active conformation suggests possible steps in c-Src activation. Structure 13, 861–871, doi: 10.1016/j.str.2005.03.012 (2005).
Article CAS PubMed Google Scholar
Xu, W., Doshi, A., Lei, M., Eck, M. J. & Harrison, S. C. Crystal structures of c-Src reveal features of its autoinhibitory mechanism. Mol Cell 3, 629–638 (1999).
Article CAS Google Scholar
Xu, W., Harrison, S. C. & Eck, M. J. Three-dimensional structure of the tyrosine kinase c-Src. Nature 385, 595–602, doi: 10.1038/385595a0 (1997).
Article CAS ADS PubMed Google Scholar
Liu, T., Lin, Y., Wen, X., Jorissen, R. N. & Gilson, M. K. BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Res 35, D198–201, doi: 10.1093/nar/gkl999 (2007).
Article CAS PubMed Google Scholar
Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Research 40, D1100–D1107, doi: 10.1093/nar/gkr777 (2012).
Article CAS PubMed Google Scholar
Knox, C. et al. DrugBank 3.0: a comprehensive resource for ‘omics’ research on drugs. Nucleic Acids Res 39, D1035–1041, doi: 10.1093/nar/gkq1126 (2011).
Article CAS PubMed Google Scholar
Li, Q., Cheng, T., Wang, Y. & Bryant, S. H. PubChem as a public resource for drug discovery. Drug Discov Today 15, 1052–1057, doi: 10.1016/j.drudis.2010.10.003 (2010).
Article CAS PubMed PubMed Central Google Scholar
Bellis, L. J. et al. Collation and data-mining of literature bioactivity data for drug discovery. Biochem Soc Trans 39, 1365–1370, doi: 10.1042/BST0391365 (2011).
Article CAS PubMed Google Scholar
Sali, A. Comparative protein modeling by satisfaction of spatial restraints. Mol Med Today 1, 270–277 (1995).
Article CAS Google Scholar
Marti-Renom, M. A. et al. Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 29, 291–325, doi: 10.1146/annurev.biophys.29.1.291 (2000).
Article CAS PubMed Google Scholar
Fiser, A. & Sali, A. Modeller: generation and refinement of homology-based protein structure models. Methods Enzymol 374, 461–491, doi: 10.1016/S0076-6879(03)74020-8 (2003).
Article CAS PubMed Google Scholar
Umeyama, H. & Iwadate, M. FAMS and FAMSBASE for protein structure. Curr Protoc Bioinformatics Chapter 5, Unit5 2, doi: 10.1002/0471250953.bi0502s04 (2004).
Ogata, K. & Umeyama, H. An automatic homology modeling method consisting of database searches and simulated annealing. J Mol Graph Model 18, 258–272 305-256 (2000).
Article CAS Google Scholar
Takaya, D. et al. Bioinformatics based Ligand-Docking and in-silico screening. Chem Pharm Bull (Tokyo) 56, 742–744 (2008).
Article CAS Google Scholar
Hsin, K. Y., Ghosh, S. & Kitano, H. Combining machine learning systems and multiple docking simulation packages to improve docking prediction reliability for network pharmacology. PLoS One 8, e83922, doi: 10.1371/journal.pone.0083922 (2013).
Article CAS ADS PubMed PubMed Central Google Scholar
Patel, P. R. et al. Identification of potent Yes1 kinase inhibitors using a library screening approach. Bioorg Med Chem Lett 23, 4398–4403, doi: 10.1016/j.bmcl.2013.05.072 (2013).
Article CAS PubMed PubMed Central Google Scholar
Hirsch, A. J. et al. The src family kinase c-Yes is required for maturation of West Nile virus particles. Journal of virology 79, 11943–11951, doi: Doi 10.1128/Jvi.79.18.11943-11951.2005 (2005).
Article CAS PubMed PubMed Central Google Scholar
Georghiou, G., Kleiner, R. E., Pulkoski-Gross, M., Liu, D. R. & Seeliger, M. A. Highly specific, bisubstrate-competitive Src inhibitors from DNA-templated macrocycles. Nat Chem Biol 8, 366–374, doi: Doi 10.1038/Nchembio.792 (2012).
Article CAS PubMed PubMed Central Google Scholar
Yeung, C. L. et al. Loss-of-function screen in rhabdomyosarcoma identifies CRKL-YES as a critical signal for tumor growth. Oncogene 32, 5429–5438, doi: 10.1038/onc.2012.590 (2013).
Article CAS PubMed PubMed Central Google Scholar
Zhang, X., Meyn, M. A. & Smithgall, T. E. c-Yes Tyrosine Kinase Is a Potent Suppressor of ES Cell Differentiation and Antagonizes the Actions of Its Closest Phylogenetic Relative, c-Src. ACS chemical biology 9, 139–146, doi: DOI 10.1021/cb400249b (2014).
Article CAS PubMed Google Scholar
Anbalagan, M. et al. KX-01, a novel Src kinase inhibitor directed toward the peptide substrate site, synergizes with tamoxifen in estrogen receptor alpha positive breast cancer. Breast Cancer Res Treat 132, 391–409, doi: 10.1007/s10549-011-1513-3 (2012).
Article CAS PubMed Google Scholar
Blake, R. A. et al. SU6656, a selective Src family kinase inhibitor, used to probe growth factor signaling. Mol Cell Biol 20, 9018–9027, doi: 10.1128/Mcb.20.23.9018-9027.2000 (2000).
Article CAS PubMed PubMed Central Google Scholar
Omagari, K., Mitomo, D., Kubota, S., Nakamura, H. & Fukunishi, Y. A method to enhance the hit ratio by a combination of structure-based drug screening and ligand-based screening. Adv Appl Bioinform Chem 1, 19–28 (2008).
CAS PubMed PubMed Central Google Scholar
Minami, S., Sawada, K. & Chikenji, G. MICAN: a protein structure alignment algorithm that can handle Multiple-chains, Inverse alignments, C(alpha) only models, Alternative alignments and Non-sequential alignments. BMC Bioinformatics 14, 24, doi: 10.1186/1471-2105-14-24 (2013).
Article CAS PubMed PubMed Central Google Scholar
Kinnings, S. L. & Jackson, R. M. LigMatch: a multiple structure-based ligand matching method for 3D virtual screening. J Chem Inf Model 49, 2056–2066, doi: 10.1021/ci900204y (2009).
Article CAS PubMed Google Scholar
Okuno, T., Kato, K., Terada, T. P., Sasai, M. & Chikenji, G. VS-APPLE: A Virtual Screening Algorithm Using Promiscuous Protein-Ligand Complexes. J Chem Inf Model 55, 1108–1119, doi: 10.1021/acs.jcim.5b00134 (2015).
Article CAS PubMed Google Scholar
Liu, T. Q., Lin, Y. M., Wen, X., Jorissen, R. N. & Gilson, M. K. BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Research 35, D198–D201, doi: 10.1093/nar/gkl999 (2007).
Article CAS PubMed Google Scholar
Bellis, L. J. et al. Collation and data-mining of literature bioactivity data for drug discovery. Biochem Soc T 39, 1365–1370, doi: 10.1042/Bst0391365 (2011).
Article CAS Google Scholar
Arnott, John A. & Lipophilicity, R. K. A. S. L. P. Indices for Drug Development. Journal of Applied Biopharmaceutics and Pharmacokinetics 1, 31–36 (2013).
Google Scholar
Abad-Zapatero, C. et al. Ligand efficiency indices for an effective mapping of chemico-biological space: the concept of an atlas-like representation. Drug Discov Today 15, 804–811, doi: 10.1016/j.drudis.2010.08.004 (2010).
Article CAS PubMed Google Scholar
Sugaya, N. Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach. J Chem Inf Model 53, 2525–2537, doi: 10.1021/ci400240u (2013).
Article CAS PubMed Google Scholar
Patel, P. R. et al. Identification of potent Yes1 kinase inhibitors using a library screening approach. Bioorganic & medicinal chemistry letters 23, 4398–4403, doi: 10.1016/j.bmcl.2013.05.072 (2013).
Article CAS Google Scholar
Breiman, L. Random forests. Mach Learn 45, 5–32, doi: 10.1023/A:1010933404324 (2001).
Article MATH Google Scholar
Svetnik, V. et al. Random forest: a classification and regression tool for compound classification and QSAR modeling. J Chem Inf Comput Sci 43, 1947–1958, doi: 10.1021/ci034160g (2003).
Article CAS PubMed Google Scholar
Hido, S., Kashima, H. & Takahashi, Y. Roughly balanced bagging for imbalanced data. Statistical Analysis and Data Mining 2, 412–426, doi: 10.1002/sam.10061 (2009).
Article MathSciNet Google Scholar
Durant, J. L., Leland, B. A., Henry, D. R. & Nourse, J. G. Reoptimization of MDL keys for use in drug discovery. J Chem Inf Comput Sci 42, 1273–1280 (2002).
Article CAS Google Scholar
Duan, J., Dixon, S. L., Lowrie, J. F. & Sherman, W. Analysis and comparison of 2D fingerprints: insights into database screening performance using eight fingerprint methods. J Mol Graph Model 29, 157–170, doi: 10.1016/j.jmgm.2010.05.008 (2010).
Article CAS PubMed Google Scholar

Download references

Acknowledgements

We gratefully acknowledge the financial support of Schrödinger KK, Namiki Shoji Co., Ltd., NEC, NVIDIA, Research Organization for Information Science and Technology (RIST), AXIOHELIX Co. Ltd., Accelrys, HPCTECH Corporation, Information and Mathematical Science and Bioinformatics Co. Ltd., DataDirect Networks, DELL and Leave a Nest Co. Ltd., which made it possible to complete our contest. We are deeply grateful to New Energy and Industrial Technology Development Organization (NEDO), Japan Bioindustry Association (JBA), Japan Pharmaceutical Manufacturers Association (JPMA), Japanese Society of Bioinformatics (JSBi) and Chem-Bio Informatics (CBI) Society. Y.h.T, M.I. and H.U thank Dr. Katsuichiro Komatsu for assistance with in silico drug screening using choose LD and finantial support by the Chuo University Joint Research Grant. We would like to offer our special thanks to Dr. K. Ohno and Ms. K. Ozeki.

Author information

Authors and Affiliations

Education Academy of Computational Life Sciences (ACLS), Tokyo Institute of Technology, 4259 Nagatsutacho, Midori-ku, 226-8501, Yokohama, Japan
Shuntaro Chiba, Takashi Ishida, Nobuaki Yasuo, Keisuke Yanagisawa, Tomohiro Ban, Yutaka Akiyama & Masakazu Sekijima
Level Five Co. Ltd., Shiodome Shibarikyu Bldg., 1-2-3 Kaigan, Minato-ku, 105-0022, Tokyo, Japan
Kazuyoshi Ikeda
Department of Computer Science, Tokyo Institute of Technology, 2-12-1, Ookayama, Meguro-ku, 152-8550, Tokyo, Japan
Takashi Ishida, Nobuaki Yasuo, Keisuke Yanagisawa, Tomohiro Ban, Yutaka Akiyama & Masakazu Sekijima
Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
M. Michael Gromiha & Chandrasekaran Ramakrishnan
Department of Physics, Chuo University, 1-13-27 Kasuga, Bunkyo-ku, 112-8551, Tokyo, Japan
Y-h. Taguchi
Department of Biological Sciences, Chuo University, 1-13-27 Kasuga, Bunkyo-ku, 112-8551, Tokyo, Japan
Mitsuo Iwadate & Hideaki Umeyama
Okinawa Institute of Science and Technology Graduate University, 1919-1 Tancha, Onna-son, Kunigami, 904-0495, Okinawa, Japan
Kun-Yi Hsin & Hiroaki Kitano
The Systems Biology Research Institute, Falcon Building 5F, 5-6-9 Shirokanedai, Minato-ku, 108-0071, Tokyo, Japan
Hiroaki Kitano
Center for Integrative Medical Sciences, RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, 230-0045, Kanagawa, Japan
Hiroaki Kitano
Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, 153-8904, Tokyo, Japan
Kazuki Yamamoto
PharmaDesign Inc., 2-19-8, Hatchobori, Chuo-ku, 104-0032, Tokyo, Japan
Nobuyoshi Sugaya
Department of Computational Science and Engineering, Nagoya University, Furocho, Chikusa, 464-8603, Nagoya, Japan
Koya Kato & George Chikenji
Division of Neurogenetics, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, 466-8550, Nagoya, Japan
Tatsuya Okuno
Information and Mathematical Science and Bioinformatics Co., Ltd., Level 6 OWL TOWER, 4-21-1 Higashi-Ikebukuro, Toshima-ku, 170-0013, Tokyo, Japan
Masahiro Mochizuki
Global Scientific Information and Computing Center, Tokyo Institute of Technology 2-12-1, Ookayama, Meguro-ku, Tokyo, 152-8550, Japan
Ryunosuke Yoshino & Masakazu Sekijima
Department of Biotechnology, The University of Tokyo, 1-1-1 Yayoi, Nunkyo-ku, 113-8657, Tokyo
Ryunosuke Yoshino
Forerunner Pharma Research, Co., Ltd., Yokohama Bio Industry Center, 1-6 Shuehiro-cho, Tsurumi-ku, 230-0045, Yokohama, Japan
Reiji Teramoto
Centre of Advanced Study in Crystallography and Biophysics and Bioinformatics Infrastructure Facility (DBT Funded), University of Madras, Chennai, 600025, Tamilnadu, India
A. Mary Thangakani & D. Velmurugan
National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito-Asagi, Ibaraki, 567-0085, Osaka, Japan
Philip Prathipati, Junichi Ito, Yuko Tsuchiya & Kenji Mizuguchi
Center for Life Science Technologies, RIKEN, 6-7-3 Minatojima-minamimachi, Chuo-ku, Kobe-shi, 650-0047, Hyogo, Japan
Teruki Honma
Molecular Profiling Research Center for Drug Discovery, National Institute of Advanced Industrial Science and Technology, 2-4-7 Aomi, Koto-ku, 135-0064, Tokyo, Japan
Takatsugu Hirokawa & Yutaka Akiyama
Initiative for Parallel Bioinformatics, Level 14 Hibiya Central Building, 1-2-9 Nishi-Shimbashi Minato-Ku, Tokyo, 105-0003, Japan
Takatsugu Hirokawa, Yutaka Akiyama & Masakazu Sekijima

Authors

Shuntaro Chiba
View author publications
You can also search for this author in PubMed Google Scholar
Kazuyoshi Ikeda
View author publications
You can also search for this author in PubMed Google Scholar
Takashi Ishida
View author publications
You can also search for this author in PubMed Google Scholar
M. Michael Gromiha
View author publications
You can also search for this author in PubMed Google Scholar
Y-h. Taguchi
View author publications
You can also search for this author in PubMed Google Scholar
Mitsuo Iwadate
View author publications
You can also search for this author in PubMed Google Scholar
Hideaki Umeyama
View author publications
You can also search for this author in PubMed Google Scholar
Kun-Yi Hsin
View author publications
You can also search for this author in PubMed Google Scholar
Hiroaki Kitano
View author publications
You can also search for this author in PubMed Google Scholar
Kazuki Yamamoto
View author publications
You can also search for this author in PubMed Google Scholar
Nobuyoshi Sugaya
View author publications
You can also search for this author in PubMed Google Scholar
Koya Kato
View author publications
You can also search for this author in PubMed Google Scholar
Tatsuya Okuno
View author publications
You can also search for this author in PubMed Google Scholar
George Chikenji
View author publications
You can also search for this author in PubMed Google Scholar
Masahiro Mochizuki
View author publications
You can also search for this author in PubMed Google Scholar
Nobuaki Yasuo
View author publications
You can also search for this author in PubMed Google Scholar
Ryunosuke Yoshino
View author publications
You can also search for this author in PubMed Google Scholar
Keisuke Yanagisawa
View author publications
You can also search for this author in PubMed Google Scholar
Tomohiro Ban
View author publications
You can also search for this author in PubMed Google Scholar
Reiji Teramoto
View author publications
You can also search for this author in PubMed Google Scholar
Chandrasekaran Ramakrishnan
View author publications
You can also search for this author in PubMed Google Scholar
A. Mary Thangakani
View author publications
You can also search for this author in PubMed Google Scholar
D. Velmurugan
View author publications
You can also search for this author in PubMed Google Scholar
Philip Prathipati
View author publications
You can also search for this author in PubMed Google Scholar
Junichi Ito
View author publications
You can also search for this author in PubMed Google Scholar
Yuko Tsuchiya
View author publications
You can also search for this author in PubMed Google Scholar
Kenji Mizuguchi
View author publications
You can also search for this author in PubMed Google Scholar
Teruki Honma
View author publications
You can also search for this author in PubMed Google Scholar
Takatsugu Hirokawa
View author publications
You can also search for this author in PubMed Google Scholar
Yutaka Akiyama
View author publications
You can also search for this author in PubMed Google Scholar
Masakazu Sekijima
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

All authors made substantial contributions to this study and article. Y.A., T.I. and M.S. developed the concept. S.C, T.I., Y.A. and M.S. organized and operated the contest. K.I., T.M. and T.H. evaluated data. Y.h.T., M.I., H.U., K.Y.H., H.K., K.Y., N.S., K.K., T.O., G.C., M.M., N.Y., R.Y., K.Y., T.B., R.T., C.R., A.M.T., D.V., M.M.G., P.P., J.I., Y.T. and K.M. participated the contest and predicted hit compound for target protein by their method. S.C., K.I., M.M.G. and M.S. wrote the main manuscript text. All authors approve this version to be published.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Electronic supplementary material

Supplementary Information

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Reprints and permissions

About this article

Cite this article

Chiba, S., Ikeda, K., Ishida, T. et al. Identification of potential inhibitors based on compound proposal contest: Tyrosine-protein kinase Yes as a target. Sci Rep 5, 17209 (2015). https://doi.org/10.1038/srep17209

Download citation

Received: 07 August 2015
Accepted: 27 October 2015
Published: 26 November 2015
DOI: https://doi.org/10.1038/srep17209

This article is cited by

MERMAID: an open source automated hit-to-lead method based on deep reinforcement learning
- Daiki Erikawa
- Nobuaki Yasuo
- Masakazu Sekijima
Journal of Cheminformatics (2021)
Identification of key interactions between SARS-CoV-2 main protease and inhibitor drug candidates
- Ryunosuke Yoshino
- Nobuaki Yasuo
- Masakazu Sekijima
Scientific Reports (2020)
Molecular Dynamics Simulation reveals the mechanism by which the Influenza Cap-dependent Endonuclease acquires resistance against Baloxavir marboxil
- Ryunosuke Yoshino
- Nobuaki Yasuo
- Masakazu Sekijima
Scientific Reports (2019)
QEX: target-specific druglikeness filter enhances ligand-based virtual screening
- Masahiro Mochizuki
- Shogo D. Suzuki
- Yutaka Akiyama
Molecular Diversity (2019)
A prospective compound screening contest identified broader inhibitors for Sirtuin 1
- Shuntaro Chiba
- Masahito Ohue
- Masakazu Sekijima
Scientific Reports (2019)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Introduction

Details of the contest

Compound library

Computational methods

Protein structure-based method (Groups 1, 2, 3, 5, 7, 9 and 10)

Ligand-based method (Groups 4, 6 and 8)

Selection of compounds for experimental inhibitory assay

Experimental procedures

Results and Discussion

Common compounds identified by different methods

Inhibition rates of selected compounds

Chemical space diversity of submitted compounds

Characteristic features of submitted and assayed compounds

Comparison of different approaches for identifying the potential hit compounds

Comparison between the potential hit and negative compounds

The strategy of using different approaches together

The advantages of using different approaches together

Suggestions for future based on the experience gained in this contest

Conclusions

Additional Information

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Ethics declarations

Competing interests

Electronic supplementary material

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Comments

Search

Quick links