Introduction

Cyclin-dependent kinases (CDKs) are serine/threonine protein kinases with key roles in regulating cell cycle progression, transcription and neuronal function in eukaryotic cells1,2,3. Thus far, 21 CDK isoforms have been identified2. The active holoenzyme formed by CDK4 and its positive regulators (D-type cyclins) is critical for regulating the transition through the G1/S phase of the cell cycle1. Overexpression of CDK4 has been identified in a wide variety of cancers4,5,6. In contrast, overexpression occurs less frequently for other CDKs. Thus, CDK4 is a potentially druggable anti-cancer target, more so than other CDKs. Malumbres et al7 have reported that tumorigenesis may be suppressed by knockdown of CDK4 in mammary tumor cells. Moreover, most human cancers arising from tumor suppressor mutations are linked to the loss of function of p16INK4, an endogenous CDK4 and CDK6 negative regulator8,9. Thus, we hypothesize that selective inhibition of CDK4 activity may result in effective cancer suppression. For these reasons, developing potent and selective CDK4 inhibitors would be a valuable approach in cancer chemotherapy, as the resulting compounds would be expected to have fewer off-target effects and to be generally less cytotoxic.

However, because of the high sequence identity and the common folding pattern of the ATP binding pocket, it is not easy to improve the selectivity of CDK inhibitors. In the case of CDK2 and CDK4, the active binding sites are predicted to be very similar because the amino acid sequence identity between these two kinases is 72%10. How, then, can CDK4-specific inhibitors be obtained from such minor differences in the active binding site? McInnes et al11 hypothesized that inhibitors containing groups that are positively charged at physiological pH would be electrostatically attracted to the negatively charged Asp99 and Glu144 of CDK4. These same groups would concurrently be electrostatically repelled by the positively charged Lys89 of CDK2, hence giving rise to enhanced CDK4 selectivity. Indeed, the selectivity of the CDK4 inhibitor PD018381212,13 may be attributed to the presence of a positively charged nitrogen atom in the molecule.

In this report, Comparative Molecular Field Analysis (CoMFA)14 was used to establish the quantitative structure-activity and structure-selectivity relationships of a series of novel positively charged thieno[2,3-d]pyrimidin-4-yl hydrazone analogues that were previously reported to be potent CDK4 inhibitors with marked selectivity for CDK4 over CDK2. The contribution of the positively charged groups to CDK4 selectivity was investigated in detail. In addition, steric and electrostatic effects on the CDK4 binding affinity and specificity of these compounds were analyzed to guide future drug design efforts.

Materials and methods

Data sets

The thieno[2,3-d]pyrimidin-4-yl hydrazones investigated in this report were synthesized by Horiuchi and co-workers15,16,17. Of the original 68 reported compounds, 11 were discarded because of their low and indeterminate potencies (IC50 (CDK4)>20 μg/mL) and/or indeterminate selectivity. The remaining 57 compounds were randomly divided into a training set (48 compounds) and a test set (9 compounds) for the derivation of CoMFA models. The IC50 values of these 57 compounds (in μmol/L) were converted to pIC50 as a measure of CDK4 potency, and CDK4 selectivity was represented by the index log[IC50 (CDK2)/IC50(CDK4)] in the CoMFA analysis. Structures and experimental values of these inhibitors are listed in Table 1.
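The two dependent variables and the random split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: pIC50 is taken to be the negative base-10 logarithm of the IC50 expressed in mol/L (the text does not spell out the conversion), the function names and the random seed are hypothetical, and the numerical inputs are invented rather than taken from Table 1.

```python
import math
import random

def pic50_from_micromolar(ic50_um):
    """Assumed conversion: pIC50 = -log10(IC50 in mol/L), with IC50 given in umol/L."""
    return 6.0 - math.log10(ic50_um)

def selectivity_index(ic50_cdk2, ic50_cdk4):
    """log10[IC50(CDK2)/IC50(CDK4)]; larger values mean greater CDK4 selectivity."""
    return math.log10(ic50_cdk2 / ic50_cdk4)

def split_compounds(ids, n_test, seed=0):
    """Randomly split compound IDs into a training set and a test set."""
    rng = random.Random(seed)          # the seed is an arbitrary, hypothetical choice
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[n_test:]), sorted(shuffled[:n_test])

# Invented IC50 values (umol/L), for illustration only.
potency = pic50_from_micromolar(0.1)
selectivity = selectivity_index(10.0, 0.1)

# 57 retained compounds split into 48 training and 9 test compounds.
train_ids, test_ids = split_compounds(range(1, 58), n_test=9)
```

Because both IC50 values enter the selectivity index as a ratio, the index is independent of the concentration unit as long as the two values share it.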

Table 1 Structures and actual pIC50 (CDK4) and log[IC50 (CDK2)/IC50 (CDK4)] values of thieno[2,3-d]pyrimidin-4-yl hydrazone analogues.

Molecular docking

No crystal structure of CDK4 co-crystallized with a ligand is available. To place an inhibitor in the active site of CDK4, the X-ray structure of CDK2 complexed with a pyrazolo[4,3-h]quinazoline-3-carboxamide inhibitor (PDB ID: 2WXV) was aligned with the CDK4 crystal structure (PDB ID: 2W9F) using Align Structures in the Homology module of SYBYL 6.918. The CDK2 inhibitor was then merged into the CDK4 active site and modified to compound 47, one of the CDK4 inhibitors listed in Table 1. Molecular docking was performed with GOLD (Genetic Optimisation for Ligand Docking), version 5.019. The active binding site in GOLD docking was defined by selecting residues within 6 Å of the ligand. Examination of the binding site indicated that the side chains of Ile12 and Lys35 of CDK4 assumed sterically unfavorable conformations that would hinder docking of the inhibitors. To address this issue, the side chains of Ile12 and Lys35 were set to flexible mode, whereas the other active site residues were kept rigid. The binding mode predicted by Horiuchi et al16 indicated that the hydrazone NH and the nitrogen atom in the pyrimidine ring of these compounds formed two backbone hydrogen bonds with Val96 (CDK4), as is commonly found in kinase-inhibitor interactions. Therefore, two hydrogen-bond constraints, one between the pyrimidine N and the backbone NH of Val96 and the other between the hydrazone NH and the backbone O of Val96, were used in GOLD docking. The genetic algorithm (GA) was applied to identify potentially active binding conformations of the ligand, with the GoldScore function as the fitness function.
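The distance-based definition of the binding site can be illustrated with a small sketch. It is not GOLD's own implementation: it simply assumes that a residue belongs to the site when at least one of its atoms lies within the cutoff of any ligand atom. The function name, the coordinates and the residue label Arg200 are hypothetical.

```python
import math

def residues_near_ligand(protein_atoms, ligand_coords, cutoff=6.0):
    """Return residues with at least one atom within `cutoff` angstroms of any ligand atom.

    protein_atoms: iterable of (residue_label, (x, y, z)) pairs
    ligand_coords: iterable of (x, y, z) tuples
    """
    selected = set()
    for residue, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_coords):
            selected.add(residue)
    return sorted(selected)

# Toy coordinates in angstroms, invented for illustration.
protein_atoms = [
    ("Val96", (1.0, 0.0, 0.0)),    # close to the ligand
    ("Val96", (9.0, 0.0, 0.0)),    # a second, more distant atom of the same residue
    ("Asp99", (0.0, 5.5, 0.0)),    # just inside the 6 angstrom cutoff
    ("Arg200", (20.0, 0.0, 0.0)),  # hypothetical distant residue, excluded
]
ligand_coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]

site = residues_near_ligand(protein_atoms, ligand_coords)
```

A residue is kept as soon as one of its atoms satisfies the cutoff, so Val96 is selected even though its second atom is far from the ligand.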

Molecular preparation and alignment

The 3D structures of the remaining 56 compounds were constructed from the best docked conformation of compound 47 using the SYBYL 6.9 software. Gasteiger-Hückel charges were calculated. Energy minimization was carried out with the standard Tripos force field, a distance-dependent dielectric function and the Powell method. A gradient convergence criterion of 0.005 kcal/mol·Å was set, and the maximum number of iterations was 1000. Molecular alignment is crucial for deriving statistically reliable QSAR and QSSR models. Because of the relative rigidity of the compounds, a ligand-based alignment rule was employed in the present work to derive the CoMFA models. All molecules were aligned on their maximum common substructure using the Fit-atom protocol. The superposition of all 57 aligned molecules is shown in Figure 1.

Figure 1

Molecular alignment of 57 thieno[2,3-d]pyrimidin-4-yl hydrazone analogues.

CoMFA analysis

To generate the CoMFA models, the aligned molecules were placed in a 3D cubic lattice with a grid spacing of 2 Å. CoMFA steric and electrostatic fields were calculated using an sp3 carbon probe atom with a charge of +1. The computed field energies, truncated at the standard cutoff of 30 kcal/mol, were used as independent variables. The pIC50(CDK4) values and log[IC50 (CDK2)/IC50(CDK4)] values from the training set were used as dependent variables to derive the 3D-QSAR and 3D-QSSR models, respectively. The leave-one-out (LOO) procedure with a column filtering of 2.0 was employed in the cross-validation to calculate q2 (the cross-validated coefficient). The optimum number of components (ONC) determined from the q2 was then used in the non-cross-validated analysis. External validation of model predictivity was performed with the test set. The predictive correlation coefficient r2pred was calculated as described in reference20.
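The statistical quantities named above can be made concrete with a short sketch. It assumes the definitions commonly used in CoMFA work rather than the exact formulas of reference 20: q2 = 1 - PRESS/SS over the leave-one-out predictions, and r2pred = 1 - PRESS/SD, where SD is the sum of squared deviations of the test-set values from the training-set mean. The column filter is approximated here with the population standard deviation of each lattice-point column. All function names and numbers are hypothetical, and the PLS fitting itself is not shown.

```python
def truncate_energy(value, cutoff=30.0):
    """Clamp a field energy (kcal/mol) to the range [-cutoff, +cutoff]."""
    return max(-cutoff, min(cutoff, value))

def column_filter(columns, min_sigma=2.0):
    """Return indices of lattice-point columns whose spread across compounds
    (population standard deviation, an assumption) is at least min_sigma."""
    kept = []
    for j, col in enumerate(columns):
        mean = sum(col) / len(col)
        sigma = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5
        if sigma >= min_sigma:
            kept.append(j)
    return kept

def q2(y_obs, y_loo_pred):
    """Cross-validated q2 = 1 - PRESS/SS from leave-one-out predictions."""
    mean = sum(y_obs) / len(y_obs)
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_loo_pred))
    ss = sum((o - mean) ** 2 for o in y_obs)
    return 1.0 - press / ss

def r2_pred(y_test, y_test_pred, train_mean):
    """Predictive r2 = 1 - PRESS/SD, with SD taken about the training-set mean."""
    press = sum((o - p) ** 2 for o, p in zip(y_test, y_test_pred))
    sd = sum((o - train_mean) ** 2 for o in y_test)
    return 1.0 - press / sd

# Invented numbers, for illustration only.
clamped = [truncate_energy(e) for e in (120.5, -45.0, 3.2)]
kept_columns = column_filter([[0.0, 0.1, 0.2], [-5.0, 0.0, 5.0]])
q2_value = q2([6.0, 7.0, 8.0], [6.5, 7.0, 7.5])
r2_pred_value = r2_pred([6.0, 8.0], [6.0, 7.0], train_mean=7.0)
```

In the toy data the first column varies by far less than 2.0 across compounds and is dropped, which mirrors the purpose of column filtering: lattice points that carry almost no variance are excluded before the PLS analysis.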

Results and discussion

Molecular docking

The top-ranked docked conformation of compound 47 in the CDK4 active site, which had the highest GoldScore value, is presented in Figure 2. The CDK4 active site residues are shown as a van der Waals surface (Figure 2A) to demonstrate that no steric conflict existed between the ligand and the side chains of Ile12 and Lys35. The hydrogen bond interaction mode is represented in Figure 2B. The ligand was anchored in the ATP-binding pocket by two hydrogen bonds with Val96 in the CDK4 hinge region. One hydrogen bond was formed between the pyrimidine N and the backbone NH of Val96, whereas the other was formed between the hydrazone NH and the backbone oxygen of Val96. This docked conformation was used as the template in the CoMFA analysis.

Figure 2

Binding mode of compound 47 in the CDK4 active site. The ATP binding pocket of CDK4 is shown as a van der Waals surface and compound 47 is represented in stick form (A). Hydrogen bond interaction mode of compound 47 with CDK4 (B). The ligand is anchored in the pocket by two hydrogen bonds formed with Val96, which are represented by red dashed lines.

PLS statistical results

PLS analysis results are reported in Table 2. The derived 3D-QSAR and 3D-QSSR models had conventional r2 values of 0.965 and 0.923, respectively, and q2 of 0.724 and 0.742, respectively. The contributions of the steric and electrostatic fields to the QSAR model were 0.548 and 0.452, respectively. The corresponding values for the QSSR model were 0.444 and 0.556, respectively. External validation with the test set gave r2pred values of 0.945 and 0.863, for QSAR and QSSR respectively. The predicted pIC50 (CDK4) and the log[IC50 (CDK2)/IC50(CDK4)] values from these two models are listed in Table 3. The actual versus predicted values for the two models are shown in Figure 3.

Table 2 The summary analysis of 3D-QSAR and 3D-QSSR models.
Table 3 Predicted pIC50 (CDK4) and log[IC50 (CDK2)/IC50 (CDK4)] values and residuals from 3D-QSAR and 3D-QSSR models.
Figure 3

Correlation between the actual values and predicted values of the 3D-QSAR model (A) and the 3D-QSSR model (B).

3D contour map analyses

The 3D contour plots that graphically interpret the CoMFA models were generated using the standard STDEV*COEFF field type. MOLCAD surfaces were calculated for CDK4 with the Fast Connolly method to display the solvent-accessible surface and the electrostatic potential surface. The CoMFA contour maps were projected onto the MOLCAD-generated surface maps and the CDK4 active binding site residues to aid in the elucidation of the structure-activity and structure-selectivity relationships, which are presented in Figures 4 and 5, respectively. Compound 47 was used as the representative compound for visualization purposes.
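As a rough illustration of what the STDEV*COEFF field type represents, the value assigned to each lattice point can be understood as the PLS coefficient of that point multiplied by the standard deviation of the field energies at that point across the compounds. The sketch below assumes this definition and uses the sample standard deviation; the function name and all numbers are hypothetical, and the selection of contour levels is not shown.

```python
import statistics

def stdev_coeff(field_columns, coefficients):
    """Multiply each lattice point's PLS coefficient by the standard deviation
    of that point's field energies across the compounds."""
    return [
        statistics.stdev(column) * coefficient
        for column, coefficient in zip(field_columns, coefficients)
    ]

# Two lattice points, three compounds each; invented values.
field_columns = [
    [1.0, 2.0, 3.0],   # energies vary between compounds
    [0.0, 0.0, 0.0],   # no variation, so the product is zero
]
coefficients = [0.5, 2.0]

contour_values = stdev_coeff(field_columns, coefficients)
```

A lattice point with a large coefficient but no variation in field energy contributes nothing to the map, so the contours highlight regions where differences between compounds are associated with differences in the modeled property.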

Figure 4

3D-QSAR electrostatic contour plots projected onto the MOLCAD electrostatic potential surface of the CDK4 active binding site for compound 47 (A). Comparison between the electrostatic contours and the active site residues (B). Steric contour plots projected onto the MOLCAD solvent-accessible surface map (C). Comparison between the steric contours and the active site residues (D).

Figure 5

The match of 3D-QSSR electrostatic contours with the CDK4 active site residues (A). The match of steric contours with the active site residues (B).

3D-QSAR contour map analyses

In Figure 4, blue and red contours represent regions where electropositive groups are favored and disfavored for inhibitory activity, respectively. Blue and red colors on the electrostatic potential MOLCAD surface map represent the most negative and most positive electrostatic potentials, respectively. One bulky blue contour surrounded the positively charged piperidine moiety of compound 47 and coincided with a negative region of the active site (Figure 4A), supporting the notion that cationic groups at the R4 position near the side chains of Asp99, Thr102 and Glu144 (Figure 4B) were favorable. In fact, many compounds containing a protonated N in the pyridinyl, phenyl and thiazole rings at this position (Table 1) exhibited higher activities than compounds without electropositive groups. Three extensive red contours near the piperidine ring, close to Ile12, indicated that electropositive groups were not desired in these regions.

Green and yellow contours indicate regions where large groups favored and disfavored inhibition, respectively. As with the electrostatic fields, there was a parallel between the steric fields and the CDK4 active site (Figure 4C, 4D). A bulky yellow contour located on the hydrogen atoms of the piperidine moiety of compound 47 was close to Ile12 (Figure 4D) and coincided with the extensive red contours in the electrostatic contour maps (Figure 4B). The implication was that bulky substituents near the backbone of Ile12 were undesirable. In the case of compound 47, the piperidine moiety had an adequate shape to occupy this site. However, compounds 21, 26, 30, 32, 43 and 44, whose alkyl groups extended into the vicinity of this sterically and electrostatically disfavored region, were less active. One yellow contour on the left, located in the solvent-inaccessible region of the ATP binding pocket (Figure 4C), indicated that bulky groups at the R3 position near the backbone of Asp97 and Gln98 were unfavorable (Figure 4D). In fact, compound 6, with a methyl group at the R3 position, was much less active than most compounds with a hydrogen at the same position. The match of the green contour maps with the cavity of the CDK4 active binding site is displayed in Figure 4C and 4D. One green contour was located in the solvent-accessible region at the entrance of the ATP binding pocket, another was surrounded by the side chains of Val20 and Lys35, and the last was surrounded by the backbone of Asn145 and Asp158. In fact, compounds with a cyclohexyl ring (compound 53) or a 2,3-dimethylbutyl side chain (compound 54) at the R1 position, which were oriented toward the latter two green contours inside the binding pocket, were more active than compounds 49, 50, 51 and 52, which had smaller groups at the R1 position.

3D-QSSR contour map analyses

Electrostatically favorable regions (blue contours) and unfavorable regions (red contours) for CDK4 selectivity are shown in Figure 5A. One large blue contour located at the positively charged piperidine nitrogen of compound 47 coincided with the acidic region surrounded by the side chains of Asp99 and Glu144 in CDK4. In contrast, such interactions were absent in CDK2, as Glu144 (CDK4) was replaced by Gln131 at the CDK2 binding site. In fact, most compounds containing a cationic group at R4 had greater CDK4 selectivity than compounds without a protonated N at the same position. This is attributed to the favorable electrostatic interaction between the positive groups of CDK4-selective inhibitors and Glu144 (CDK4) and the accompanying increase in Coulombic energy11.

Two bulky red contours in the vicinity of this large blue contour (Figure 5A) may explain the weak CDK4 selectivity of compounds 7 and 8, which had their t-butyl and cyclopropyl groups at the R4 position oriented toward these electrostatically disfavored regions. Compound 5, with an ethyl group at the R2 position near these two red contours, was less selective than most of the compounds with H at the same position, which indicated that electropositive groups were not favored at the R2 position.

One blue contour located on the left side of the hydroxyl group of Thr102 (CDK4) suggested that positive groups were desired in this region. In the case of CDK2, the ε-amino group of Lys89 projected into this region11. In fact, compounds with a negatively charged carboxylate (compound 23) and a hydroxymethyl group (compound 24) at R4 were more selective for CDK2 than for CDK4. This indicated that negatively charged groups were poorly tolerated at this position and would not serve to enhance CDK4 selectivity. Conversely, compounds with positively charged substituents at the same position of the phenyl ring exhibited greater CDK4 selectivity, as seen with compound 28, which had an aminomethyl side chain at R4. Very likely, the unfavorable electrostatic repulsion between the NH3+ groups of compound 28 and Lys89 (CDK2), together with the favorable H bond formed between the NH3+ group of compound 28 and the hydroxyl group of Thr102 (CDK4), enhanced CDK4 selectivity.

Sterically favorable regions (green contours) and unfavorable regions (yellow contours) for CDK4 selectivity are shown in Figure 5B. One green contour surrounding the piperidine moiety of compound 47 suggested that large substituents were favored in the vicinity of the side chain of Asp99 and the backbone of Glu144. This would explain the greater CDK4 selectivity of compounds 38, 40, 44, 46 and 49–56, which had alkyl amino groups extending into this green region. In contrast, compounds 35, 36 and 37, which do not have alkyl amino substituents on the thiazole ring, were poorly selective for CDK4. One bulky yellow contour positioned on the left side near Thr102 indicated that large substituents were unfavorable in this region. This would account for the lower selectivity of the inhibitors with carboxyl (compound 23) and hydroxymethyl (compound 24) groups oriented toward this contour.

In conclusion, we have reported statistically reliable 3D-QSAR and 3D-QSSR models derived from thieno[2,3-d]pyrimidin-4-yl hydrazone analogues by CoMFA analysis. These models had high predictive power (QSAR model: q2=0.724; QSSR model: q2=0.742) and were validated with an external test set. The CoMFA contour maps, which were in good agreement with the CDK4 active site, identified the vital interactions between key residues and the major substituent groups on the inhibitors. The structural relationships revealed by these models and the docking studies are summarized in Figure 6. Bulky groups at the R1 position would improve the inhibitory activity, but large groups at the R3 position would have the opposite effect. Appropriately bulky, positively charged groups at the R4 position would increase not only the activity but also the specificity for CDK4. The insight obtained from these models should assist and guide drug design strategies aimed at uncovering potent and selective CDK4 inhibitors.

Figure 6

Structural relationships as revealed by 3D-QSAR and 3D-QSSR models and docking studies.

Author contribution

Dr Hai-xiao JIN and Prof Xiao-jun YAN designed the research and revised the manuscript; Bao-qin CAI conducted the research and wrote the manuscript; Peng ZHU and Gui-xiang HU helped write the paper.