Synthesis, docking, MD simulation, ADMET, drug likeness, and DFT studies of novel furo[2,3-b]indol-3a-ol as promising Cyclin-dependent kinase 2 inhibitors

A new series of furo[2,3-b]indol-3a-ol derivatives was synthesized to investigate their potential as inhibitors of the cyclin-dependent kinase 2 (CDK2) enzyme. CDK2 is a serine/threonine protein kinase belonging to a family of kinases involved in the control of the cell cycle. Clinical studies have shown that overexpression of CDK2 may play a role in the development of cancer. To identify highly effective derivatives, in silico screening was carried out to investigate protein-ligand interactions and to assess the stability of the most favorable conformation. The methods included molecular docking, density functional theory (DFT) calculations at the B3LYP/6-31++G(d,p) level of theory in the gas phase, molecular dynamics (MD) simulation, and the evaluation of drug-likeness scores. The results revealed that compound 3f had excellent binding energies. The pharmacokinetic and drug-likeness properties of the novel furo[2,3-b]indol-3a-ol derivatives suggest that these compounds are viable candidates for future development as anticancer drugs.


The 1H-decoupled 13C NMR spectrum of 3a exhibited 11 discernible resonances, confirming the proposed structure. The N-CH3 and C-OH carbons displayed distinctive signals at δ 26.5 and 115.8 ppm, respectively. Three chemical shifts were observed at δ 116.2, 153.9, and 161.3 ppm, corresponding to the C-NO2, C=N, and C-N groups, respectively. The mass spectrum of compound 3a showed a molecular-ion peak at m/z 324, consistent with the suggested structure. The IR spectrum of this compound revealed the presence of hydroxyl (OH) and amino (NH) groups as broad bands at 3434 cm−1 and 3272 cm−1, respectively. Additionally, stretching vibrations of the CH groups were detected at 2925 cm−1 and 2859 cm−1. Other notable bands were observed at 1693 cm−1, 1534 cm−1, 1376 cm−1, and 1260 cm−1, which were associated with groups including C=N, C-N, NO2, C-O, and C-N.
A suggested mechanism for the generation of furo[2,3-b]indol-3a-ol 3 is illustrated in Scheme 1. The carbonyl group of isatin 1 is protonated in the first step. Nucleophilic attack of N-methyl-1-(methylthio)-2-nitroethenamine 2 on the protonated carbonyl group of 1 via an aza-ene reaction affords compound 4, which is converted to 5 by imine-enamine tautomerization. Subsequently, intramolecular annulation of compound 6 results in the formation of intermediate 7. Finally, intermediate 7 is converted into the target product 3 by elimination of methanethiol.

Quantum chemistry via density functional theory calculation
The values corresponding to each of the selected compounds 3a-f are presented in Table 2. The geometries were first optimized in the gas phase at the B3LYP/6-31++G(d,p) level of theory using the Gaussian 09W software package, with GaussView used for visualization 14. No imaginary frequencies were found, and the geometries converged to low energy gradients, indicating that all compounds were true local minima. Figure 3 shows the optimized structures of the selected compounds.

Table 1. Optimization of the reaction conditions for the synthesis of 3a. Reagents and conditions: 1a (0.5 mmol), 2 (0.5 mmol), catalyst (0.05 mmol), solvent (4.0 mL). nr = no reaction.

www.nature.com/scientificreports/

Molecular orbital (MO) analysis is a pivotal tool in quantum chemistry for the elucidation of chemical events. The highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO) are used to explain chemical characteristics such as reactivity, stability, and kinetics. The HOMO tends to donate electrons, whereas the LUMO tends to accept them, so these orbitals can be used to assess charge-transfer phenomena. The energy of the HOMO is related to the ionization potential, while the energy of the LUMO is related to the electron affinity. The frontier molecular orbitals (FMOs) of the synthesized compounds are illustrated in Fig. 4. The positive and negative phases of the molecular orbital wave function are shown in red and green, respectively.
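The link between frontier orbital energies, ionization potential, and electron affinity is commonly turned into global reactivity descriptors through Koopmans-type approximations. The sketch below is a minimal illustration of those textbook relations, not necessarily the exact procedure used in this work; the function name and the orbital energies in the example are hypothetical and were chosen only so that the gap equals the 4.114 eV reported for 3a. Note that softness is defined as 1/η here, while some authors use 1/(2η).

```python
def reactivity_descriptors(e_homo: float, e_lumo: float) -> dict:
    """Global reactivity descriptors (in eV) from frontier orbital energies (in eV)."""
    ionization_potential = -e_homo   # Koopmans-type approximation: I ~ -E_HOMO
    electron_affinity = -e_lumo      # A ~ -E_LUMO
    gap = e_lumo - e_homo            # HOMO-LUMO energy gap
    hardness = gap / 2.0             # eta = (I - A) / 2
    softness = 1.0 / hardness        # S = 1 / eta (assumed convention)
    electronegativity = (ionization_potential + electron_affinity) / 2.0
    chemical_potential = -electronegativity
    electrophilicity = chemical_potential ** 2 / (2.0 * hardness)  # omega = mu^2 / (2 eta)
    return {
        "gap": gap,
        "hardness": hardness,
        "softness": softness,
        "electronegativity": electronegativity,
        "chemical_potential": chemical_potential,
        "electrophilicity": electrophilicity,
    }


# Hypothetical orbital energies giving a 4.114 eV gap.
example = reactivity_descriptors(e_homo=-7.000, e_lumo=-2.886)
print(round(example["hardness"], 3))
```

With these relations the hardness is half the energy gap, which agrees with the pair of values reported for 3a (gap 4.114 eV, hardness 2.057 eV).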
The results of the gas-phase calculations are presented in Table 3, encompassing descriptors including HOMO and LUMO energies, energy gaps, chemical hardness (η), softness, chemical potential, electronegativity (χ), and electrophilicity index. The parameters were computed using well-established approaches described in the literature, from the frontier orbital energies of the HOMO and LUMO given in eV. The energy difference between the HOMO and the LUMO serves as a direct measure of chemical reactivity: a large HOMO-LUMO gap signifies heightened stability and diminished chemical reactivity. Based on the results, compound 3a has a low HOMO-LUMO energy gap of 4.114 eV, indicating a high degree of chemical reactivity. The compounds can be ordered in terms of ΔE gap as follows: 3d > 3e > 3f > 3c > 3b > 3a. Furthermore, the maximum HOMO delocalization spans between 10 and 12 atoms in compounds 3a-3e, whereas it is notably greater in compound 3f. Conversely, the primary regions of LUMO delocalization in compounds 3a-3e correspond to the furo, methylamino, and nitro groups. In compound 3f, almost all other groups contribute to delocalization, whereas the nitro and methylamino groups play minimal roles. In addition, compound 3a has the highest softness among all compounds, as evidenced by its lowest hardness value of 2.057 eV and its highest polarizability. Compound 3f exhibits a higher electronegativity value (5.231) than the other compounds, suggesting an enhanced ability to attract electrons and improved performance as an electrophile (electrophilicity index 6.351).

Molecular docking studies
CDK2 plays an important role in controlling the progression of the eukaryotic cell cycle. It is commonly known that monomeric CDK2 lacks inherent regulatory activity. Instead, its regulatory function is activated by positive regulators such as cyclins E and A or through phosphorylation on the catalytic section. These activation processes elicit notable alterations in the three-dimensional configuration of the kinase, particularly in the activation section. The CDK2 protein consists of a single polypeptide chain with 306 amino acids. CDK2 has been identified as a significant contributor to cell proliferation in prostate cancer 15 and non-small cell carcinoma 16. It has also been shown to have an important role in the malignant transformation of breast epithelial cells. Consequently, suppressing CDK2 activity has been shown to effectively impede the growth of cancer cells 17. Considering the importance of CDK2, we conducted in silico investigations. The crystallographic structure of CDK2 (Protein Data Bank (PDB) ID 6GUH 18) was obtained at a resolution of 1.50 Å. The amino acid residues that play a crucial role in the catalytic site were identified and subjected to docking studies with the furo[2,3-b]indol-3a-ol derivatives. The interactions with the amino acid residues in the active pocket were evaluated to determine the binding affinities and binding scores of the synthesized derivatives. The results revealed that a significant proportion of the compounds exhibited robust binding scores and remarkable binding affinities. Notably, compound 3f exhibited the best binding energy, −6.89 kcal/mol. The best configuration of 3f was selected, followed by a comprehensive analysis of both bonding and non-bonding interactions. The docking results of the most effective compounds, along with their corresponding interactions, are presented in Table 4.
To validate the docking protocol, a re-docking procedure was conducted using the co-crystallized ligand. The resulting root-mean-square deviation (RMSD) of 0.40 Å suggests that the docking protocol is reliable 19 (Fig. 6).
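The re-docking RMSD described above can be sketched as a direct comparison of matched atom coordinates between the crystallographic pose and the re-docked pose, without any superposition, since both poses sit in the same receptor frame. The function name and the toy coordinates are hypothetical; a value below about 2.0 Å is a commonly used rule of thumb for a successful re-docking, which is an assumption of this sketch rather than a criterion stated in the text.

```python
import math


def pose_rmsd(reference, pose):
    """RMSD (same units as the input) between two poses with matched atom order.

    No fitting is performed: both poses are assumed to be in the receptor frame.
    """
    if not reference or len(reference) != len(pose):
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared_sum = 0.0
    for ref_atom, pose_atom in zip(reference, pose):
        squared_sum += sum((a - b) ** 2 for a, b in zip(ref_atom, pose_atom))
    return math.sqrt(squared_sum / len(reference))


# Toy two-atom ligand (hypothetical coordinates in angstroms):
# the re-docked pose is shifted by 0.4 along z.
crystal_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
redocked_pose = [(0.0, 0.0, 0.4), (1.0, 0.0, 0.4)]
print(round(pose_rmsd(crystal_pose, redocked_pose), 2))
```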

Molecular dynamics simulation
The molecular dynamics (MD) simulation of the CDK2-ligand complex was run over a time scale of 100 ns to elucidate the dynamic behavior and stability of the complex. To this end, the structural alterations induced by the most effective compound, 3f, were examined. The RMSD was monitored over the 100 ns of MD simulation to assess the stability of the protein-ligand complex. As illustrated in Fig. 7, the left Y-axis represents the RMSD of the protein and the right Y-axis displays the RMSD profile of the ligand aligned on the protein backbone. The frames obtained from the 100 ns trajectory were aligned with the backbone of the reference frame. The RMSD plot demonstrates that the CDK2-ligand complex is stable after 5 ns, relative to the reference frame at 0 ns. Nevertheless, a slight elevation in the RMSD of the protein-bound ligand was observed at 81 ns. This deviation might be attributed to a conformational change in the rotatable bonds of the ligand.
Root mean square fluctuation (RMSF) quantifies the average deviation of each atom's position from its mean position over a simulation or an ensemble of structures. It provides insight into the flexibility or mobility of each residue, with a higher RMSF value signifying greater flexibility. The RMSF analysis indicated that the majority of amino acid residues exhibited a notable level of stability. Some residues, however, show greater RMSF values than others, suggesting that they are more flexible and have been considerably affected. For example, amino acid 51 has the highest RMSF value in the dataset, 3.86, showing that this residue is very flexible. Conversely, residues with low RMSF values are less flexible: amino acid 195 has an RMSF value of 0.42, one of the lowest in the dataset, meaning that it is relatively rigid. Notably, important residues of the protein consistently maintained contact with compound 3f. Amino acid residues involved in interactions with the ligand are indicated by green lines in Fig. 8, while the remaining residues are shown without such lines. No discernible pattern emerged when the RMSF values of residues in contact with the ligand were compared with those of residues not in contact. Some residues engaged with the ligand exhibited high RMSF values, while others displayed low values; the same was true of residues that did not contact the ligand.
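The RMSF definition given above can be written out directly: for each atom, average its position over all frames, then take the root of the mean squared distance from that average. The sketch below assumes the trajectory has already been aligned to a reference frame and is stored as nested lists; the function name and the toy trajectory are hypothetical.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF from an aligned trajectory.

    trajectory[t][i] is the (x, y, z) position of atom i in frame t.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement from that mean position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations


# Toy trajectory (hypothetical): atom 0 oscillates along x, atom 1 is fixed.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
print(rmsf(toy_trajectory))
```

Per-residue values such as those discussed above are usually obtained by applying the same calculation to one representative atom per residue (for example the alpha carbon) or by averaging over the atoms of each residue; which choice was made here is not stated in the text.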
The interactions between 3f and the CDK2 active site pocket that occurred for more than 30% of the simulation are shown in Fig. 10. To summarize, the interactions may be described as follows: (1) A hydrogen bond formed between Gln139 and the hydroxyl group of the 2-(methylamino)-3,5-dinitro-3aH-furo[2,3-b]
The contributing energy components of non-covalent interactions during the simulation are shown in Fig. 11. The X-axis lists the active-site residues that interact with the ligand, while the Y-axis gives the fraction of simulation time for which each interaction is maintained. The stacked bar charts are normalized over the entire trajectory. As depicted in Fig. 11, Ile18 engaged in hydrophobic interactions with the ligand for approximately 20% of the simulation. Moreover, for at least 35% of the simulation, Leu91, Asp94, and Gln139 formed hydrogen-bond interactions with the ligand. Notably, Lys97 showed a variety of interactions with the ligand, including ionic, water-bridged, and hydrogen-bond interactions, and therefore remained in contact with the ligand throughout the simulation.
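The "fraction of simulation time" used above can be illustrated with a simple occupancy count: for each residue, the number of frames in which it contacts the ligand divided by the total number of frames. This is a minimal sketch under the assumption that contacts have already been detected per frame; the function names and the ten-frame toy data are hypothetical, and only the 30% cutoff comes from the text. Unlike a stacked bar chart, it counts a residue at most once per frame regardless of how many interaction types it makes.

```python
from collections import Counter


def interaction_fractions(frames):
    """Fraction of frames in which each residue contacts the ligand."""
    counts = Counter(residue for frame in frames for residue in set(frame))
    return {residue: n / len(frames) for residue, n in counts.items()}


def persistent_contacts(frames, threshold=0.30):
    """Residues whose contact fraction exceeds the threshold, sorted by name."""
    fractions = interaction_fractions(frames)
    return sorted(residue for residue, f in fractions.items() if f > threshold)


# Hypothetical per-frame contact lists for a 10-frame toy trajectory.
toy_frames = [
    ["Gln139", "Ile18"], ["Gln139"], ["Gln139", "Ile18"], ["Gln139"], [],
    [], [], [], [], [],
]
print(interaction_fractions(toy_frames))
print(persistent_contacts(toy_frames))
```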
To assess the stability of ligand 3f in the CDK2 receptor throughout the 100 ns simulation, six parameters were examined (Fig. 12) 20. The maximum RMSD of 3f during the simulation was 0.75 Å. Fluctuations were observed in the initial stage, from 0 to 20 ns, followed by a stable RMSD for the remainder of the simulation. The radius of gyration fluctuated until 100 ns, after which a stable conformation was attained; over the 100 ns simulation, the radius of gyration of compound 3f varied between 3.18 and 3.38 Å. Strong intramolecular H-bond interactions indicated that compound 3f possessed a potent inhibitory capacity. The SASA plot exhibited a variable pattern for the first 42 ns, followed by a period of stability until the simulation concluded. The MolSA and PSA plots provided evidence of the stability of ligand 3f throughout the simulation. Figure 13 displays a 2D schematic of compound 3f with color-coded rotatable bonds. Each rotatable bond torsion is accompanied by a radial plot and a bar plot of the same color. The radial plot represents time radially outwards from its center, which corresponds to the start of the simulation. The bar plots show the probability density of the torsion angle and thus summarize the data presented in the radial plots. The Y-axis of the bar plots depicts the rotational bond potential, expressed in units of kcal/mol. Together, the radial and bar plots describe the torsional potential and the conformational strain that compound 3f undergoes to maintain its protein-bound conformation.
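Of the six ligand properties above, the radius of gyration is the easiest to state explicitly: it is the mass-weighted root-mean-square distance of the atoms from their center of mass. The sketch below is a generic illustration with hypothetical names and a toy two-atom system; simulation packages may differ in details such as whether masses are included.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one conformation.

    coords: list of (x, y, z) tuples; masses: list of atomic masses.
    """
    total_mass = sum(masses)
    # Center of mass, one Cartesian component at a time.
    center = [
        sum(m * atom[k] for atom, m in zip(coords, masses)) / total_mass
        for k in range(3)
    ]
    weighted_sq = sum(
        m * sum((atom[k] - center[k]) ** 2 for k in range(3))
        for atom, m in zip(coords, masses)
    )
    return math.sqrt(weighted_sq / total_mass)


# Toy system (hypothetical): two equal masses 2 units apart.
print(radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0]))
```

Evaluating this quantity frame by frame gives the time series from which a range such as the one reported for 3f can be read.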

Drug-likeness prediction
Drug-likeness is the degree to which a compound resembles well-known drugs. It rests on a delicate balance between molecular and structural characteristics. The assessment of drug-likeness includes the evaluation of several molecular attributes, such as hydrophobicity, electronic distribution, hydrogen bonding, molecular weight, pharmacophore entity, bioavailability, reactivity, toxicity, and metabolic stability 21. Lipinski's rule of five is commonly used to assess the solubility and permeability of compounds and thereby to predict their viability as prospective drug candidates. According to this rule, compounds that violate it are more likely to show inadequate absorption or permeation. The derivatives were examined using the SwissADME web server 22. None of the compounds 3a-f violates the Lipinski rule, as their values fall within the acceptable range and indicate satisfactory absorption properties. Moreover, compounds 3a-f occupy a favorable region of physicochemical space, which supports their classification as potential lead compounds. The pharmacokinetic analysis indicated that compounds 3a-e exhibit favorable absorption in the gastrointestinal tract following oral administration. Additionally, these compounds were found to be susceptible to efflux by P-glycoprotein (P-gp). Conversely, compound 3f had limited absorption via the gastrointestinal tract. This observation can be attributed to the higher molar refractivity (MR) value of 3f in comparison to the MR values of the other compounds examined. Pan-assay interference compounds (PAINS) structural alerts are used in pharmaceutical chemistry to identify regions within a compound's structure that are prone to instability, reactivity, and toxicity 23,24. None of the compounds 3a-f triggers a PAINS alert, supporting their potential as promising therapeutic candidates. The synthetic accessibility score (SA score) is a quantitative measure of the difficulty of synthesizing drug-like compounds. All of the compounds have a favorable SA score, suggesting that they can be synthesized readily (Table 5).
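Lipinski's rule of five, as applied above, amounts to counting how many of four thresholds a compound exceeds: molecular weight above 500 Da, logP above 5, more than 5 hydrogen-bond donors, and more than 10 hydrogen-bond acceptors. The sketch below is a minimal illustration of that count; the function name is hypothetical, and in the example only the molecular weight (324, from the molecular-ion peak of 3a) comes from the text, while the remaining descriptor values are placeholders rather than reported results.

```python
def lipinski_violations(mol_weight, log_p, h_bond_donors, h_bond_acceptors):
    """Number of Lipinski rule-of-five violations for one compound."""
    checks = [
        mol_weight > 500,        # molecular weight in Da
        log_p > 5,               # octanol-water partition coefficient
        h_bond_donors > 5,       # N-H and O-H groups
        h_bond_acceptors > 10,   # N and O atoms
    ]
    return sum(checks)


# Molecular weight from the mass spectrum of 3a; other values are placeholders.
print(lipinski_violations(mol_weight=324.0, log_p=1.2, h_bond_donors=2, h_bond_acceptors=7))
```

Orally active drugs are generally expected to show no more than one violation, so a count of zero places a compound comfortably inside the rule.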

ADMET properties
In therapeutic drug development, a sound understanding of pharmacology and toxicology is crucial. This knowledge both shortens the development period and improves the success rate. ADMET indices, comprising Absorption, Distribution, Metabolism, Excretion, and Toxicity, are frequently used to assess the characteristics of a compound. The parameters for the furo[2,3-b]indol-3a-ol derivatives were obtained using the ADMETlab 2.0 web server 25. Caco-2 cells, which are derived from human colon epithelial cells, are widely used to assess the absorption of pharmaceutical substances in the human digestive tract. Madin-Darby canine kidney (MDCK) cells, on the other hand, are of particular value for rapid permeability screening of drug molecules, as they have a shorter growth time than Caco-2 cells 26. The Caco-2 permeability values obtained for the synthesized compounds fell within a satisfactory range, indicating favorable membrane permeability. All furo[2,3-b]indol-3a-ol derivatives exhibited favorable MDCK cell permeability, suggesting a heightened likelihood of renal cell-mediated removal. All compounds were predicted to be both inhibitors and substrates of P-glycoprotein (P-gp). The computed values for human intestinal absorption (HIA) indicate that all compounds have a high likelihood of being effectively absorbed through the intestinal membrane. Plasma protein binding (PPB) is a crucial determinant of the safety profile of a medication. Drugs with a high PPB value (> 90%) often exhibit a narrow therapeutic index, indicating a smaller margin of safety. Conversely, drugs with a low PPB value are generally considered to be safer. In the current study, all compounds 3a-f had low PPB values, indicating a wide therapeutic index for these compounds. Compounds with CBrain/CBlood values greater than 1 are categorized as having central nervous system (CNS) activity, whereas compounds with CBrain/CBlood values below 1 are considered to lack CNS activity. Compounds with CNS activity can cross the blood-brain barrier (BBB) and may induce adverse effects on the central nervous system 27. Based on the data in Table 6, the CBrain/CBlood values of all the compounds are less than 1, suggesting that they do not cross the BBB. As a result, the synthesized compounds are not expected to be neurotoxic.
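The two cutoffs used above (PPB above 90% and a CBrain/CBlood ratio above 1) can be expressed as simple classification rules. The sketch below only encodes the thresholds as stated in the text; the function names and the example values are hypothetical and do not reproduce any entry of Table 6.

```python
def ppb_category(ppb_percent):
    """Classify plasma protein binding using the 90% cutoff stated in the text."""
    return "high" if ppb_percent > 90.0 else "low"


def is_cns_active(brain_to_blood_ratio):
    """A CBrain/CBlood ratio above 1 is taken to indicate CNS activity."""
    return brain_to_blood_ratio > 1.0


# Hypothetical predicted values for a single compound.
print(ppb_category(72.5), is_cns_active(0.35))
```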

Conclusion
In the present investigation, a series of furo[2,3-b]indol-3a-ol derivatives was synthesized and characterized by IR, mass, 1H NMR, and 13C NMR spectroscopy. DFT calculations proved effective in predicting the structural geometry. Furthermore, molecular docking was performed on all of the compounds under consideration. The results indicated that all compounds had binding affinity for the CDK2 protein. Among these compounds, 3f showed the most favorable binding energy. The ligand-protein complex was subjected to MD simulation to assess its stability through RMSD and RMSF values. In addition, in silico ADMET studies predicted favorable drug-likeness properties for the synthesized derivatives. This comprehensive exploration offers a promising avenue for further investigation into their efficacy as CDK2 inhibitors in the context of drug development. Further in vitro and in vivo studies would be necessary to validate the findings of this chemoinformatics investigation.

General information
All solvents and reagents were purchased from Aldrich and Merck Chemical Co. NMR spectra were recorded in DMSO-d6 and acetone-d6 on a Bruker spectrometer (400 MHz for 1H and 100 MHz for 13C). Melting points were measured using an Electrothermal 9100 apparatus. Mass spectra were recorded at 70 eV on an Agilent 5975C VL MSD with a Triple-Axis detector. IR spectra were measured using a Bruker Tensor 27.

General method for synthesizing compounds 3a-f
A mixture of N-methyl-1-(methylthio)-2-nitroethenamine (0.5 mmol), the isatin derivative (0.5 mmol), and sulfamic acid (0.05 mmol) was magnetically stirred in EtOH/H2O (1:3, 4.0 mL) at reflux for 24 h. The reaction was monitored by TLC using ethyl acetate/n-hexane (1:1) as the eluent. After completion of the reaction, the precipitated product was filtered and washed with EtOH to yield compounds 3a-f.

Figure 5. The 3D and 2D binding modes of compound 3f in the active site of CDK2.

Figure 6. Superimposition of the docked ligand (red) and the original ligand (green).

Figure 7. RMSD values of the protein and ligand during the 100 ns MD simulation.

Figure 8. The residue-wise fluctuations of CDK2 in complex with compound 3f.

Figure 10. Schematic of detailed ligand atom interactions with the protein residues.

Figure 2. Molecular structures and percent yields of the final compounds 3a-f.

Compounds 3b and 3c, possessing energy gap values of 4.178 eV and 4.233 eV, respectively, exhibited notable reactivity subsequent to compound

Table 2. Geometric parameters of the compounds

Table 3. Energy parameters for compounds 3a

Table 4. Docking scores and interactions for compounds 3a-f.