Novel computational and drug design strategies for inhibition of human papillomavirus-associated cervical cancer and DNA polymerase theta receptor by Apigenin derivatives

The present study deals with the advanced in-silico analyses of several Apigenin derivatives to explore human papillomavirus-associated cervical cancer and DNA polymerase theta inhibitor properties by molecular docking, molecular dynamics, QSAR, drug-likeness, PCA, a dynamic cross-correlation matrix and quantum calculation properties. The initial literature study revealed the potent antimicrobial and anticancer properties of Apigenin, prompting the selection of its potential derivatives to investigate their abilities as inhibitors of human papillomavirus-associated cervical cancer and DNA polymerase theta. In silico molecular docking was employed to streamline the findings, revealing promising energy-binding interactions between all Apigenin derivatives and the targeted proteins. Notably, Apigenin 4′-O-Rhamnoside and Apigenin-4′-Alpha-l-Rhamnoside demonstrated higher potency against the HPV45 oncoprotein E7 (PDB ID 2EWL), while Apigenin and Apigenin 5-O-Beta-d-Glucopyranoside exhibited significant binding energy against the L1 protein in humans. Similarly, a binding affinity range of − 7.5 kcal/mol to − 8.8 kcal/mol was achieved against DNA polymerase theta, indicating the potential of Apigenin derivatives to inhibit this enzyme (PDB ID 8E23). This finding was further validated through molecular dynamic simulation for 100 ns, analyzing parameters such as RMSD, RMSF, SASA, H-bond, and RoG profiles. The results demonstrated the stability of the selected compounds during the simulation. After passing the stability testing, the compounds underwent screening for ADMET, pharmacokinetics, and drug-likeness properties, fulfilling all the necessary criteria. QSAR, PCA, dynamic cross-correlation matrix, and quantum calculations were conducted, yielding satisfactory outcomes. Since this study utilized in silico computational approaches and obtained outstanding results, further validation is crucial. Therefore, additional wet-lab experiments should be conducted under in vivo and in vitro conditions to confirm the findings.

www.nature.com/scientificreports/Cervical cancer contributed to 604,127 cases (6.5% of all cancer cases) each year.It inflicted 341,831 deaths (7.7% of all cancer deaths) in 2020 globally, placing it as the fourth most lethal type of cancer in the female population 1 .The worldwide cancer burden is quickly growing as a consequence of continuing demographic and epidemiological shifts, and it is anticipated that this will contribute to a large percentage (> 4,74,000) of deaths among women by the year 2030 2 .Cervical cancer is primarily spread between affluent and less affluent nations, as evidenced by this worldwide cancer burden.evidence shows that viral infections are responsible for 15-20% of all human malignancies.Different phases of cancer development may be accelerated by infection with oncogenic viruses.There are many other forms of Human Papilloma Virus (HPV), but around 15 have been related to cancer.Even if screening procedures are very successful, cervical cancer is still a significant public health issue 3 .
HPV is a small, non-enveloped, icosahedral, double-stranded DNA virus that may transmit through sexual activities.It infects different parts of the body's organs, such as squamous epithelia, including the skin and upper respiratory and anogenital tract mucous membranes.About 100 various other forms of HPV and around 40 of them are known to infect the anogenital region 4 .Besides, it has been associated with several different malignancies, the most significant of which is cervical cancer.It is also reported that infection with HPV is the root cause of almost all cases of cervical cancer.The occurrence of cervical cancer has been linked to 18 different kinds of human papillomavirus, with types HPV-16 and HPV-18, in particular, being considered high-risk/oncogenic.Low-risk/non-oncogenic HPV strains cause genital warts, especially types HPV-6 and HPV-11.Within a year to two years after infection, cell-mediated immunity typically clears or suppresses most cervical HPV infections 5 .
The HPV may often be confirmed in clinical specimens using hybrid capture or polymerase chain reaction of HPV genomes 6 .HPV genotyping, on the other hand, is accomplished through hybridization with type-specific oligonucleotide probes.All probe assays rely on identifying target nucleic acid patterns by complementing probe nucleic acid patterns that may be replicated through PCR 7 .This identification can be accomplished in a variety of ways.The target DNA is subjected to several cycles of denaturation, primer hybridization, and primer extension in the PCR, which results in the specific to an individual of the target DNA.After 30 cycles of PCR, a proportion of target DNA equal to or more than one million duplicates is created.After that, either a standard dot blot or a Southern blot is used to determine the results of the amplified DNA.One sort of test that uses a modulation scheme to identify DNA or RNA substrates is called a hybrid capture technique 8 .The Hybrid capture method employs a hybridization solution consisting of RNA tags with DNA targets for HPV identification.This is complemented by an immunologically oriented back-end test comparable to an ELISA.Regarding the detection of HPV, the PCR approach has a better analytic sensitivity than the hybrid capture approach; nevertheless, the hybrid capture method may be more efficient in determining women who have concomitant squamous epithelial tumors 9 .
Secondly, a particular kind of genetic recombination known as homologous recombination (HR) occurs when two identical or comparable double-stranded or single-stranded nucleic acid molecules exchange genetic information.The biological system accurately repairs DNA breaks (both strands) through HR where a specific process known as homologous recombinational repair (HRR) contributes 10 .HR deficiency has emerged as a crucial biomarker for multiple types of cancers such as ovarian, breast, pancreatic, and prostate 11 .Mutation in two particular genes, BRCA1 and BRCA2 (combinedly addressed as BRCA1/2), is responsible for HR-mediated DNA repair deficiency leading to the development and initiation of several types of tumors 12 .Interestingly, Poly (ADP-ribose) polymerase inhibitors (PARPi) have shown excellent sensitivity against BRCA1/2-mutated tumors, and many PARPi has been approved in recent decades for clinical use 13,14 .But the main challenge to these medications' clinical success in patients with HR-deficient tumors is the rapid development of resistance.Therefore, across numerous therapeutic situations, including primary chemotherapy, neoadjuvant treatment, and combination therapy with immunotherapies, the effectiveness of PARPi is now being assessed [15][16][17][18] .
Recent research has suggested the synthetic lethality of HR deficiency with DNA polymerase theta making an emerging novel drug target for treating HR-deficient tumors 19 .The DNA polymerase theta contains three domains (N-terminus containing a helicase-like ATPase domain, central domain, and C terminus containing a nuclease domain) and a nuclease domain).It differs from other polymerases in structure and function as it suppresses mitotic crossovers for preserving genomic integrity [20][21][22] .Any patients suffering from breast and ovarian tumors with HR deficiency have high expression of DNA polymerase theta, which is a backup HR mediator in the DNA double-strand break repair process 23 .As DNA polymerase theta is synthetic and lethal with HR, inhibition of DNA polymerase theta in patients with defective HR can induce tumor cell death 23,24 .
Moreover, the DNA polymerase theta inhibition approach harmonizes with PARPi activity in HR-deficient tumor elimination 23,24 .Synthetic lethality among HR deficiency and DNA polymerase theta inhibition depends on several mechanisms through which DNA polymerase theta operates to keep the genome stable and stop tumorigenesis 21 .Besides, the DNA polymerase theta is critical in mutagenic microhomology-mediated endjoining (MMEJ), an significant DNA double-strand break-repairing mechanism 25 .However, the mechanism by which the DNA polymerase theta maintains synthetic lethality with HR-deficient tumors remains unclear, but this evidence suggests that the versatile functionality of the DNA polymerase theta is vital for HR-deficient tumor survival.However, The DNA polymerase theta demonstrates distinct characteristics of drug ability, offering a compelling case for creating DNA polymerase theta inhibitors 26 .Therefore, developing an inhibitor targeting DNA polymerase theta should be a rational option for curing HR-deficient tumor cells.
There is currently no therapy available for HPV 27 .So, potential treatment or drug is highly needed to manage HPV and its related cancer.But, developing an effective medication with a high degree of potentiality is a very time-consuming matter, and required huge research funds.Besides, during the developing phases, many drugs fail, and can't go final stages due to unwanted effects 28 .Resulting, the research community may lose huge amounts of resources and costs.But, in the modern era of drug development, this huge cost could be minimized by early investigation of physiochemical, and toxicity prediction 29 .So, in this investigation, the most popular in silico

Development of HPV in cervical cancer
The process of cervical cancerogenesis, in which HPV gene integration occurs between other cellular alterations and epigenetic factors, is a complicated mechanism of unregulated cellular proliferation.Mutations in the DNA caused by the cell's environment and HPV infection may allow the virus' DNA to integrate with the host's DNA synthesis machinery and cause replication of the virus.Thus, the virus can bypass cellular and immunological defenses while encouraging cell growth and blocking apoptosis 3 .The Two primary oncogenic protein products E6 and E7 control the cell cycle by maintaining the process of normal apoptosis and they play an essential role in promoting oncogenesis in cells.By duplicating their genetic material (DNA), viruses may produce cells to exhibit characteristics of cancer, including uncontrol growth, angiogenesis, invasion, metastasis, and resistance to apoptosis and growth suppressors 34 .

Role of DNA polymerase theta in disease development
DNA polymerase theta is a family of DNA polymerases (Pol θ) that has been essential to maintaining DNA repair and damage tolerance.When any problematic condition occurs in the double strand of DNA, it helps to repair it.Some studies have reported that when the DNA polymerase theta is overexpressed in cancer cells, it may promote the resistance of chemotherapeutic or cancerous agents.As a result, it makes it difficult to treat cancer.So, the DNA polymerase theta might be a potential target receptor for the discovery of a drug to treat breast and ovarian cancers, etc. 35,36 .

Apigenin pharmacological evidence
Apigenin is a flavonoid that may be found in a wide variety of fruits, vegetables, and plants used in traditional Chinese Medicine.This multifunctional molecule has several different biological functions, including antiinflammatory, antioxidant, antibacterial, and antiviral properties.As a result, Apigenin has a long history of usage in the context of alternative and conventional medical practices.Apigenin has been connected to having an antitumor effect against a broad spectrum of cancers.This effect is thought to be achieved through apoptosis and autophagy stimulation, cell cycle arrest, inhibition of cell migration and invasion, and an increase in immunological response 37,38 , and it may demonstrated to be capable of preventing, inhibiting, or reversing the effects of chemically induced genotoxicity in vitro cell models, in vivo investigations, and AMES tests employing bacterial models.This anti-mutagenic effect has been demonstrated and proven.The anti-carcinogenic function of Apigenin has received a lot of attention recently and has been discovered to be protective against different kinds of cancer, including breast cancer, cervical cancer, prostate cancer, skin cancer, thyroid cancer, colon cancer, leukemia, lung cancer, endometrial cancer, neuroblastoma, and adrenocortical cancer 39,40 .As Apigenin is composed of a multifunctional role against a number of diseases.So, this compound is chosen in our current computational experiment as a target biomolecule.

Ligand preparation and molecular optimization
The technique of optimization was worked out to achieve the best possible outcomes with regard to the performance of the molecular docking approach as well as the arrangement of molecules in a three-dimensional framework.In the outset, a three-dimensional framework of Apigenin analogs was retrieved by obtaining from the PubChem database (https:// pubch em.ncbi.nlm.nih.gov/) (Fig. 1) 41 .This allowed for the structure to be observed in three dimensions.www.nature.com/scientificreports/Prior to molecular docking, the ligands were optimized using the Gaussian09 software with the DFT/B3LYP-6311G method.However, no aqueous operation optimization was conducted as it is not necessary for ligand optimization.The purpose of these optimizations was to prepare the ligands for molecular docking.The molecular docking method was employed to select poses with the highest binding affinity (measured in kcal/mol) between apigenin and apigenin 5-O-Beta-D Glucopyranoside ligands and the L1 protein of human papillomavirus (PDB ID 6L31).Subsequently, the best-selected poses underwent molecular dynamics simulation.No separate ligand optimization was performed for the ligands in both complex structures.The topology files of the complex structures were prepared using Charmm-gui, eliminating the need for a separate optimization for the ligands.

Protein preparation and molecular docking study
The primary objective of the molecular docking technique is to make an informed prediction regarding the composition of the ligand-receptor complex through the utilization of computational approaches.The docking process encompasses two distinct yet interdependent steps.Firstly, it involves sampling several conformations that the ligand can adopt when bound to the protein's active site.Secondly, it entails the classification of these residues based on a performance index.By executing these intricately woven steps, molecular docking sheds between ligands and receptors are performed, and unraveling the prediction of their interactions and aiding drug discovery and design endeavors 42 .The PyRx application was employed to accomplish molecular docking 43 .Before that, the crystal structure of HPV45 oncoprotein E7 (PDB ID 2EWL), the L1 protein of human papillomavirus (PDB ID 6L31), and DNA polymerase theta (PDB ID 8E23) were collected from the RCSB protein data bank 36,44,45 .All of the crystal solvent constituents were removed, together with the native agonist and any extra compounds from the crystal structure, and prepared for docking study (Fig. 2 showing the targeted structure).During molecular docking studies, the grid box coordinates were strategically set to cover both the entire protein and the site of interest, ensuring accurate ligand placement.The grid center points were set to X = − 34.

Determination of ADMET, and pharmacokinetics, and drug-likeness
During the stage of drug discovery and development, it is completely obvious that the pharmacokinetic (PK) characteristics of potential therapies, and more specifically ADMET (absorption, distribution, metabolism, excretion, and toxicity), are critical factors to consider 46 .Consequently, the Apigenin derivatives that were being explored were initially placed through a critical drug-likeness screening using the SwissADME and pkCSM webservers 47 .This testing was based on calculated physicochemical and ADMET-related parameters.The SMILES strings of the ligands that were utilized as the input for the molecular markers for both websites.In point of fact, unfavorable pharmacokinetic features of potential drug candidates are a significant contributor to the rate of failure throughout clinical trials 48 .Because of this, it is vital to conduct pre-clinical evaluations of the pharmacokinetic characteristics and drug-likeness of proposed medications.

Molecular dynamics simulation (MDs) protocol
To understand the behavioral changes that occur in the protein-ligand complex when it is exposed to a dynamic environment on an atomic level, MD is a good computer simulation method currently receiving a lot of attention in drug development research 49 .It is also an indispensable instrument for determining the intra-or interatomic interaction stability of the protein-ligand complex over a user-specified period 50 .The best-scoring docking models of the most potential Apigenin, and Apigenin-5-O-beta-D-glucopyranoside, apo form L1 protein of human papillomavirus (PDB ID 6L31), were chosen as the starting points for a 100-ns all-atom molecular dynamics simulation using the GROMACS-2019 software (GNU, General Public License; http:// www.groma cs.org).The CHARMM36 force field and the Gromacs version 2019 software package were employed to conduct simulations running for 100 ns within a periodic water box 51,52 .The force field for Apigenin and Apigenin-5-O-beta-D-glucopyranoside, as well as the apo form L1 protein of human papillomavirus, was generated using the CHARMM-GUI server.Each complex was placed inside a rectangular box with a buffer distance of 10 in each direction and solvated with TIP3P water molecules.To neutralize the system's charge for the Apigenin ligand, 5 Na + ions and 0 Cl-ions were added.Similarly, 5 Na + ions and 0 Cl-ions were added to neutralize the system's charge for the Apigenin-5-O-beta-d-glucopyranoside ligand, while 33 Na + ions and 0 Cl-ions were added for the L1 protein of human papillomavirus.Following this, 0.0 M of NaCl was added to create an environment similar to cellular conditions for each complex.The complexes were subjected to structure minimization using the CHARMM36 force field.Each system was then equilibrated at a temperature of 310 K for 5000 steps (10 PS) in the NPT ensemble during the production run, which lasted for 100 s.The position, velocity, and energy of the system were recorded every 10 ps.Hydrogen atoms were constrained using the Lincs technique.A switching method of 12-14 was employed to calculate van der Waals forces, with a cutoff value of 14 53,54 .Long-range electrostatic interactions were calculated using the particle mesh Ewald (PME) approach, with a maximum grid spacing of 1.2.The PME computations were conducted at each step, without the use of a multiple-time stepping scheme.The temperature was maintained at 310 K, and the system size changes in the barostat were targeted at 1 bar.The integration time step was set to 2 fs (0.002 = dt), and nsteps = 50,000,000 (50,000,000*2 = 100,000,000 ps = 100 ns).Post-simulation, the trajectories were analyzed using the VMD (University of Illinois at Urbana-Champaign, Urbana, IL, USA) program, Bio3D, www.nature.com/scientificreports/and QTGRACE, respectively, following re-centering of the simulation output 55 .After each MD simulation was completed, its trajectory was analyzed to determine a variety of characteristics, such as the root mean square deviation (RMSD), root mean square fluctuation (RMSF), the radius of gyration (Rg), solvent-accessible surface area (SASA), hydrogen bonds (H-bonds), principal component analysis (PCA), and dynamics cross-correlation map (DCCM).The resulting files were analyzed and visualized using xmgrace (https:// plasma-gate.weizm ann.ac.il/ Grace/), Bio3D, and VMD software 56,57 .

Binding free energy calculation
Molecular mechanics/Poisson-Boltzmann surface area (M-PBSA) methodology provides a comprehensive analysis of the quantitative assessment of the interaction mechanism between proteins and ligand molecules.The current investigation utilizes the MM-PBSA technique to evaluate the binding affinity of a complex consisting of a protein and a ligand, to obtain a deeper understanding of the fundamental binding mechanism.The calculation of the van der Waals energy, electrostatic energy, polar solvation energy, and binding energy was conducted for the Apigenin and Apigenin 5-O-Beta-D-Glucopyranoside complexes to forecast the overall ΔGbind.The calculation of the binding free energy for the protein-ligand reaction was performed in the following manner: The equation ΔGbind = G complex-(G protein + G ligand) is utilized to determine the total binding energy of the protein-ligand complex, where G protein denotes the binding energy of the protein and G ligand represents the binding energy of Apigenin and Apigenin 5-O-Beta-D-Glucopyranoside, as examined in this investigation.

Density functional theory (quantum mechanics)
Density functional theory was used to conduct a quantum mechanical calculation on the top twelve compounds (hits) from the virtual screening.The Gaussian 09W program was used for the calculations by optimizing the compounds' geometries at DFT/B3LYP/6-31G (d'p') levels 58,59 .The compounds were analyzed to determine their electron acceptor and electron donor properties by calculating the frontier orbital energies, including the highest occupied molecular orbital (HOMO) and the lowest occupied molecular orbital (LUMO), as well as the energy gap and molecular electrostatic potential.These characteristics also provide information regarding the chemical reactivity and stability of compounds 60 .

Lipinski rule and pharmacokinetics
The complex balance of all molecular and structural characteristics (molecular weight, lipophilicity, rotatable bonds, surface area, number of hydrogen bond acceptors and donors, bioavailability), as determined by the specific evolution of various computational filters developed by Lipinski rule and it is included in the concept of drug-likeness.Our reported compound has a minimum violation of Lipinski rule except for two compounds (CID 5319,484 and 129861,756) and these compounds have minimum bioavailability scores while others have very good bioavailability scores and most of them are about 0.55 or 55% bioavailability (showing in Table 1).

Molecular docking analysis against targeted receptor
Initially, ADMET, PK, and drug-likeness were assessed, and it has been documented that most of the molecules had accepted the guidelines of the Lipinski rule, PK, or the ADMET calculations.As a consequence of this, these compounds were molecularly docked and subjected to further screening.The binding affinity is determined to measure how tightly inhibited or bonded the drugs with targeted protein are during the formation of complex structure 61 .It is said that binding affinities of molecules greater than -6.0 kcal/mol should be potential drug candidates 62 .After molecular docking, the result has been documented that the majority of the compounds have been shown to have potent interactions and greater binding energies with both target proteins.In Table 2, the most effective compounds' interactions with the HPV45 oncoprotein E7 (PDB ID 2EWL) protein are reported www.nature.com/scientificreports/Apigenin 4'-O-Rhamnoside, and Apigenin-4'-Alpha-L-Rhamnoside along with their docking scores − 6.9 kcal/ mol, and − 6.7 kcal/mol against HPV45 oncoprotein E7 (PDB ID 2EWL).Besides, the binding affinities of the L1 protein of human papillomavirus (PDB ID 6L31) ranged from − 7.7 kcal/mol to − 9.3 kcal/mol, whereas molecules Apigenin and Apigenin 5-O-Beta-d-Glucopyranoside showed significant binding energy against L1 protein of human.Secondly, DNA polymerase plays an essential role in the therapeutic strategy of cancer and some other disease by blocking or inhibiting DNA polymerases.Many hyperproliferative conditions, such as cancer, autoimmune diseases, and viral infections, are treated with drugs that block DNA synthesis.So, the DNA polymerase theta is also included in this investigation, and perform molecular docking to determine the capability of whether the reported Apigenin derivatives can inhibit the DNA polymerase theta or not and how much binding affinity is produced during binding with each other.This time, the binding affinity range is achieved − 7.5 kcal/mol to − 8.8 kcal/mol which represents that mentioned Apigenin derivatives should be inhibited the DNA polymerase theta (PDB ID 8E23).Besides, the vast majority of the Apigenin derivatives revealed more potent interactions and also exhibited strong interactions and optimum binding affinity with the target.So, the Apigenin derivatives suggested performing wet-lab synthesis and then evaluated on a biological or practical value, to establish as potential drug candidates.

Molecular docking pose and active site analysis
Molecular docking pose and active site analysis has been done by using discovery studio and Chimera X application.It helps to understand and visualize the specific amino acid residue where the ligand binds and formed a drug-protein in the complex (Fig. 3).In this study, the best two complexes are visualized based on maximum binding energy.The first one is drug protein complex of HPV45 oncoprotein E7 (PDB ID 2EWL) with Apigenin

Theoretical ADMET data analysis
The predicted ADMET data of reported compounds are available in Table 3.Few chemicals showed excellent human intestinal absorption and water solubility.This could help substances have an increased blood concentration for the best biological activity.Additionally, these substances showed low blood-brain barrier (BBB) penetration, indicating a free from the creating CNS or neurotoxicity.Most of the reported ligands were not causes of inhibitors of cytochrome P450 enzyme which indicate compounds are perfectly metabolized by cytochrome enzyme which is found in lever.
Many mechanisms, primarily the liver and kidney, are used to excretion of drug compounds from the body.Smaller drug molecules (< 300) were excreted from bile whereas bigger drug molecules (> 500) were removed from urine.Between 300 and 500 molecular weights are excreted from bile and urine both 63 .For determining the excretion level of drug compounds, the total clearance rate of specific compounds is shown in Table 3. Organic cation transporter 2 or OCT2 is another parameter that help to renal clearance of compounds.Where thus compounds did not expect to substrate on renal OCT2.www.nature.com/scientificreports/Unanticipated drug toxicity is a crucial factor in the failure of successful drug candidates and the withdrawal of marketed treatments.So, toxicity prediction of drug is one of the first requirements for the development and discovery of the drugs.Therefore Table 4 present several toxicity parameters as: AMES toxicity, skin sensitization and hepatotoxicity.In this research reported every article show positive result in ADMET prediction, which are non-toxic drug compound except one hepatotoxic compound.

QSAR and pIC50 calculation
Multiple linear regression (MLR) analysis was used in quantitative structure-activity relationship (QSAR) investigations to determine the influence of compounds on pharmacological activity.The relationship between biological activities and structural activities of chemical compounds has been calculated using the quantitative structure activities relationship (QSAR) method.It is discovered that various compounds have distinct QSAR and pIC 50 values, and the total value of the QSAR and pIC 50 investigation fits all the requirements.The range of QSAR and pIC50 is determined to be between 5.09 and 4.58, with 5.09 being the higher value and 4.58 being the lower value.According to Table 5, the predicted pIC50 indicates that these described compounds may have biological significance against human papilloma virus.Following equation is applied which developed by another publication 64 . H

Molecular dynamics simulation analysis
The substantial root mean square deviation (RMSD) value can be attributed to the protein's considerable size and its composition of five distinct chains.During the molecular dynamics' simulation, an energy minimization procedure was conducted at a time scale of 100 ns.Subsequently, the system was brought to a state of equilibrium.For energy minimization all three system, Using the steepest descent algorithm, the energy of each system was minimized until the maximal force was less than 1,000,000 kj/mol/nm.This was performed to eliminate any steric conflicts within the system.An isothermal-isochoric ensemble NVT (constant number of particles, volume, and temperature) and an isothermal-isobaric ensemble NPT (constant number of particles, pressure, and temperature) were used to equilibrate each system.At 310 K and 1 bar pressure, the two types of ensemble equilibration methods stabilized the three systems.After the molecular dynamics simulation at 100 ns was completed, the PBC effect was eliminated by using the code gmx trjconv -f step5.xtc-o new.xtc-s step5.tpr-pbc mol -center -n index.ndx-ur compact in the Gromacs software, and then it was checked in the VMD program and it was seen that there was no jump.

Root-mean-square deviation (RMSD) analysis
To gain a better understanding of the dynamic behavior and stability, the results of MD simulations for both the apo form and the ligand complex are investigated on a time scale of 100 ns.The MD simulation is carried out for a total of one hundred nanoseconds, and the trajectories for the RMSD plot are displayed in Fig. 4. The colors in the figure are those associated with Apo protein at a time scale of 100 ns when it is complexed with Apigenin and Apigenin-5-O-beta-d-glucopyranoside.The root means square deviation (RMSD) provides an interpretation regarding the extent to which a group of atoms deviates from the appropriate original reference structure of a protein, ligand, or even a ligand-protein complex.A substantial amount of instability, which is related to changes within the conformation of the molecule being researched, can be correlated with having high RMSD values.
It was determined that the average RMSD for the Apigenin-6L31 and Apigenin-5-O-beta-d-glucopyranoside -protein systems were 3.186 Å and 3.236 Å, respectively.The ligand-free protein, or apoprotein, has an RMSD value of 4.21 Å on average.For the apoprotein, we observed an abrupt increase in RMSD at the outset, followed by a sudden decrease 2.5 ns later.Following the trajectories, Apo form increased progressively for 20 ns, after which the value exceeded the RMSD of Apigenin, which may have occurred due to the higher occupancy of flexible loops in the C-terminal region and remained relatively stable until the end of the simulation.İn the case of the Apigenin complex system, a comparable deviation pattern was observed.The RMSD of apoprotein was greater during the 20 ns and 40 ns, and for the 40-50 ns, the value was very close to that of the apoprotein RMSD value.The RMSD of the Apigenin-6L31 system gradually increased and showed more fluctuations for 55 ns, after which the value attained its maximum at roughly 60 ns and declined to a lower value at 78.7 ns of the trajectory.After this time, the observed slight increased and became stable at 100 ns with negligible fluctuations.Whereas, the RMSD for the backbone atom of the Apigenin-5-O-beta-d-glucopyranoside complex concerning initial position quickly increased within the 20 ns run time and maintained a slightly non-significant fluctuation around 20 and 60 ns, where the following trajectories proceeded to drop slightly values till the end of the MD simulation.The average RMSD of the Apigenin-5-O-beta-D-glucopyranoside -6L31 is 0.82 greater than that of apoproteins; however, after 65 ns, the average RMSD value is less than that of apoproteins.In agreement with the above observation, it can be concluded that Apigenin complexes are very stable, as seen by the low RMSD value of Apigenin-5-O-beta-D-glucopyranoside − 6L31 after 60 ns and the constant RMSD value of Apigenin being very near to that of the apoprotein throughout the 100 ns MD simulation.Consequently, the RMSD of the Apigenin-6L31 and Apigenin-5-O-beta-D-glucopyranoside − 6L31 systems may suggest that they did not endure substantial conformational changes during the MD simulation.The RMSD histograms provided further evidence that the stability of the protein and ligand in the simulated system was seen and confirmed (Fig. 5A-C).For understanding the residues that participated in the causative factors for fluctuations, the RMSF plot is provided in Fig. 6.The RMSF value was utilized to identify the protein's hard and flexible regions.This validation criterion for structural variability in the ligand-protein complex highlights the importance of specific protein residues in these structural shifts.Using a timescale of 100 ns, the amino acid at each position is calculated for its deviation value.
For apoprotein, the amino acid position from 407 to 416 exhibits a deviation of up to 15 Å, whereas other amino acids exhibit deviations of 1 to 5 Å.The deviation that occurs in the 407-416th position and 1-5 A values may be the functional reason for the drift in apoprotein RMSD at 80th and 20 ns.In the process of comparing the values of the apo RMSF to those of the complex protein RMSF, it has been observed that Apigenin-6L31 demonstrates greater deviations than the other ligand complex.The amino acid positions from 55 to 57th, 131st to 137th, 173rd to 178th, 347th to 354th, and 417th to 418th have deviations ranging from 5 to 10. Apigenin-6L31's RMSD deviates between the 55th and 60th nanoseconds as a result of these positional amino acid fluctuations.Similarly, another compound, Apigenin-5-O-beta-D glucopyranoside-6L31 also shows fluctuations in the same regions, only the amino acids between 277 and 280 exhibited higher RMSF value, which confirms the greater

Radius of gyration (Rg) analysis
During the entirety of the simulation, the Rg parameter determines how compact a structure is.An increase in RoG values suggests a reduction in the compactness of the protein structure, indicating increased flexibility and decreased stability.The Rg-time fluctuations were observed to be nearly constant within the acceptable range, primarily maintained between 2.8 A and 2.6 A, indicating that the protein-ligand complexes undergo stable conformational changes.Compared to Apigenin-6L31 and apo_6L31, the radius of gyration was smallest for the Apigenin-5-O-beta-D glucopyranoside-6L31 complex (Fig. 7).According to the findings that were obtained, the Apigenin-5-O-beta-D glucopyranoside -6L31 complex was able to maintain a higher degree of stability during the simulation and bind successfully with the ligand.The trajectory of both proteins was used to produce a plot known as a solvent-accessible surface area (SASA), which was then used to research the proportion of each system's surface area that can be reached by the solvent (Fig. 7).

Solvent accessible surface area (SASA) analysis
The information obtained from SASA will be useful for analyzing whether the ligand is kept inside the shallow binding pocket or whether it is expelled from the binding cavity.From Fig. 8, the SASA for the Apo_6L31,  www.nature.com/scientificreports/Apigenin-6L31 complex, and Apigenin-5-O-beta-d glucopyranoside − 6L31 complex, where the average SASA for the native protein was calculated to be 265.49,261.037, and 255.149 nm2, was determined to be 265.49,261.037, and 255.149 nm2, respectively.It has been shown that the Apigenin-5-6L31 complex displays a lower SASA in comparison to the Apo_6L31 and Apigenin-6L31 complexes; this indicates that the Apigenin-5-O-beta-D glucopyranoside -6L31 complex is responsible for inducing conformational alterations.

Hydrogen bond analysis
Throughout the simulation, and in addition to the RMSD and SASA analyses, we also examined the stability of the hydrogen bonds (H-bonds) that are present in protein-ligand complexes.Understanding the connections between biomolecules necessitates a geometrical analysis of hydrogen bonding.Hydrogen bonds are an important interaction in maintaining the structural integrity of biomolecules.Also, During MD modeling, the creation of H-bonds is an essential component in maintaining the stability of the complexes, Throughout the entirety of the course of the MD simulation, it was discovered that the number of H-bonds that were present in the ligand-bound states was constantly changing, as shown in Fig. 9.The total number of H-bonds that were found between Apigenin and the protein during the MD simulation was 19, whereas the number of H-bonds that were found between Apigenin-5-O-beta-D glucopyranoside and the protein was a total of 8.It is evident from the graph that the Apigenin complex has more hydrogen bonds throughout the duration of the simulation, whereas the Apigenin-5-O-beta-d glucopyranoside -6L31 complex has fewer hydrogen bonds.In the instance of the Apigenin-5-O-beta-d glucopyranoside-6L31 mutant, it was discovered that there was a decrease in the number of hydrogen bonds when compared to Apigenin.Greater binding affinity is correlated with an increase in the number of hydrogen bonds formed and their duration.In addition, utilizing H-bond occupancy allowed for the identification of vital residues that were involved in the creation of H-bonds for ligand recognition.Using the VMD "Hydrogen bonds" tool, it was useful to explore the established ligand-protein hydrogen bond interactions and their relative frequencies 65 .The cut-off values for hydrogen bond (Donor H. Acceptor) distance and angle were assigned at 3.0 Å and 20°, respectively 66 .Apigenin did not maintain all of the H-bonds that were detected in the docked complex, except Asn289; however, it did form additional contacts with Ile220, Asp215, Thr199, Cys225, Tyr147, Thr223, Thr224, Arg259, Gly264, Val267, and Glu265.Apigenin-5 maintained only its interactions with Asn289 and developed interactions with Tyr287, Phe201, Gly200, Trp165, and Tyr227 (Fig. 9), and Table 6 is displayed H-bond occupancy.

Contact frequency (CF) analysis
To, Fig. 10 depicts the results of a contact frequency (CF) analysis performed with the contact Freq module on VMD and a cut-off of 4 to further evaluate the binding between 6L31 and the ligands.Phe206, Phe201, Met204, Val216, Asp215, Arg259, Cys225, Val288, Ile220, Thr199, Met145, Gly202, Val267, Thr224, Leu271, Pro268 and Pro266, Hsd164, Tyr147, and Val263 had the highest CF during the simulation.Apigenin exhibited a higher overall contact frequency with these residues than Apigenin-5.Moreover, the CFs of Phe201, Val288, and Gly202 with Apigenin and Apigenin-5 were quite distant.The hydrogen bonds formed between protein and apigenin ligand were not the same as those formed between apigenin-5 and protein, except asn289.In addition, van der Waals interactions were observed between Phe206, Phe201, Val216, Ile220, and Val288 with both ligands, all of which were identified in the presented high CF.Based on the H-bond analysis, we can conclude that Apigenin complex binds to protein active sites more efficiently and tightly than Apigenin-5-6L31.Additionally, the presence of hydrogen bonds between 6L31 and Apigenin derivatives has helped to strengthen the binding, which has contributed to the simulation's success in maintaining its stability (Fig. 10).total variations, respectively.The Apigenin complex was found to have the highest PC1 value (44.6%), which suggests that the complex has been subjected to a greater number of conformational changes.In contrast, the Apigenin-5 complex exhibits less PC1 (31.29%), indicating that it has undergone a smaller conformational change.Moreover, the PC1 of the Apo structure was 38.9%, which is greater than the Apigenin-5-O-beta-D glucopyranoside complex, indicating that the binding of Apigenin-5-O-beta-d glucopyranoside stabilizes the Apo's conformational changes.Figure 11(a, b and c) shows the conformational state of the three systems in the

Dynamic cross-correlation matrix (DCCM) analysis
To investigate the effect of Apigenin derivatives on the conformational motions of the 6L31 protein, DCCM analyses were undertaken on all C atoms in the 6L31 apo, the Apigenin complex, and the Apigenin-5-O-beta-D glucopyranoside complex system using 100 ns simulated trajectories (Fig. 12a, b and c).The DCCM exhibited a comprehensive correlation, encompassing a range of values from − 1.0 to 1.0, with the former indicating a dark purple hue and the latter indicating a dark blue hue.It was determined that different shades of color correspond to varying degrees of correlation between residues, with the deeper the color indicating a larger degree of association.The observed correlation coefficient, ranging from − 1 to 1, indicated that residues exhibited either a positive or negative relationship in their movements.A positive correlation indicated that residues moved in the same direction, while a negative correlation indicated that residues moved in opposite directions.Upon analyzing the DCCM diagrams of the three systems, it was observed that the correlated movements exhibited by each system were notably distinct.In contrast to the Apigenin complex system, the collective movements that exhibit positive correlation in the entire Apigenin-5-O-beta-D glucopyranoside complex remained relatively stable, while the movements that display negative correlation experienced a notable increase.The correlated movements of the Apigenin-5-O-beta-d glucopyranoside complex exhibit significant changes upon ligand binding, particularly in marked areas denoted by black dashed boxes.

The binding free energy estimation
The MM/PBSA approach is a noteworthy technique utilized for the computation of the binding free energy of protein-ligand complexes.The MM-PBSA method was employed to determine the binding free energy of the final 20 compounds based on the molecular dynamics (MD) trajectories.The value of ΔG is determined by the collective impact of diverse protein-ligand interactions, including but not limited to van der Waals energy (ΔEvdW), electrostatic energy (ΔEele), and EPB (electrostatic contribution to solvation-free energy by Poisson-Boltzmann) energy (Fig. 13).The binding energies of the Apigenin complex is − 29.61 kj/mol found in whereas for Apigenin-5-Obeta-d glucopyranoside complex is − 0.13 kJ/mol.The Apigenin, exhibited ∆VDW (− 35.52 kcal/mol), ∆EEL (− 23.39 kcal/mol), and ∆EGB (32.49kcal/mol), while compound Apigenin-5 11 reflected ∆VDW (− 0.40 kcal/ mol), ∆EEL (0.09 kcal/mol) and ∆EGB (0.29 kcal/mol) energies of completely different.The MM-PBSA analysis yielded findings indicating that Apigenin exhibited robust binding energy and greater stability.The validation of the outcomes of the molecular docking and MD simulations were carried out through the binding free energy calculation.

Dynamic behavior and confirmational change of protein-ligand complex
To understand the dynamic structural evolution of the L1 protein of human papillomavirus (PDB ID 6L31)ligand throughout the 100 ns simulation time frame, nine snapshots at every 10 ns have been taken.Additionally, it has been shown that the ligands remain entirely attached to the inhibitory site without undergoing any structural alteration, suggesting that they are quite stable (Fig. 14).

Frontier molecular orbital analysis (FMO)
The present investigation employed the Density Functional Theory (DFT) methodology based on quantum mechanics to compute the HOMO and LUMO energy of twelve compounds.The outcome of this analysis is depicted in Fig. 8.The frontier orbitals, specifically the highest occupied molecular orbital (HOMO) and the lowest occupied molecular orbital (LUMO) can be utilized to characterize the reactivity of chemical species 68 .The highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) are utilized to characterize the electron-donating and accepting properties of chemical compounds.An additional parameter that warrants consideration is the energy gap, denoting the disparity between the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO) energies.This differential is indicative of intramolecular charge transfer and kinetic stability.Compounds possessing a significant energy gap exhibit reduced chemical reactivity and heightened kinetic stability.In contrast, individuals possessing a narrow energy gap exhibit heightened reactivity and diminished kinetic stability.In this study, the HOMO and LUMO energies of twelve compounds were calculated using the quantum mechanical Density Functional Theory (DFT) method, and the result is shown in Fig. 15.Based on the findings presented in Table 7, it can be observed that the HOMO-LUMO gaps of all the chemicals under investigation fall within the range of 3.960 eV to 4.079 eV.Furthermore, the data indicates that the order of the energy gap follows a descending pattern as follows: 10 > 9 > 6 > 2 > 7 > 1 > 8 > 5 > 3 > 4 > 12 > 11.The value of the softness is displayed in Table 7.It is essential to keep in mind that the disintegration time required for an element will be shorter and that it will deteriorate at a faster pace than that of other elements if its softness level is larger than a tiny value.On the other hand, the property of hardness is a fundamental characteristic of a substance, and its quantification serves as an indicator of its durability.Typically, compounds with higher hardness values exhibit greater resistance to alterations in electron configuration at the molecular level.Our reported molecules have shown 11 > 12 > 04 > 03 > 08 > 05 > 01 > 07 > 02 > 06 > 09 > 10, which means compounds 11, 12, and 04 will more rapidly disintegrate compared to the other molecules.Again, the hardness is always opposite to softness, and the hardness values are reported as 10 > 09 > 06 > 02 > 07 > 01 > 05 > 08 > 03 > 04 > 12 > 11 in our studies, which indicates that the compounds 11, 12 and 04 are lower hardness and ultimately disintegrate quickly.

Conclusion
The effectiveness of Apigenin derivatives has been utilized in this current investigation as the proposed compounds for the novel treatments for HPV-associated cervical cancer and the DNA polymerase theta since there is no targeted therapy for them.This research gap encourages our research team to develop an urgent search for the potential molecules against them with novel modes of action.So, this in silico study has been performed to screen potential drug candidates from a series of Apigenin derivatives with significant pharmacological properties.This current investigation also includes the pharmacokinetic properties, drug-likeness, ADMET profiles, molecular docking, molecular dynamic simulation, PCA, DCCM DFT, and QSAR.The molecular dynamics simulation, and molecular docking, methods were employed to prove the binding affinities against targeted receptors and the stability of the compounds.The results showed that all the derivatives of the Apigenin molecule are drug-like, and promising hydrogen bonding was reported, exhibiting remarkably inhibitory capability for each of the targeted receptors and favorable binding energies.The favorable rate-determining binding affinities across targeted protein ranges are − 7.1 kcal/mol to − 9.3 kcal/mol.It is also noted that Apigenin 5-O-Beta-d-Glucopyranoside was reported maximum affinities (− 9.3 kcal/mol) against the L1 protein of human papillomavirus (PDB ID 6L31).Finally, our investigation found that the Apigenin derivatives should be suggested as a novel compound against HPV-associated cervical cancer and the DNA polymerase theta.It is kept in mind that these advanced computational studies are provided the potential activity theoretically.Further wet lab experiments should be conducted to validate this effect in vitro, in vivo, pre-clinical, and clinical trials.

Limitations of the study
It is a theoretical investigation; to validate this investigation, and develop newer and safer drugs from the synthetic sources, these derivatives must be carried out from computational to (in vitro and in vivo), preclinical and clinical trials, to find out their practical value. https://doi.org/10.1038/s41598-023-43175-x

Figure 2 .
Figure 2. Three-dimensional protein structure of the targeted receptor.

Figure 3 .
Figure 3. Docking interactions between the proposed compound.

Figure 4 .
Figure 4. RMSD analysis of Apo and the ligand complex (C-Alpha) in molecular dynamics simulations for the time scale of 100 ns.The black and red colors represented the Apigenin, rand Apigenin-5-O-beta-d glucopyranoside complexes with L1 protein of human papillomavirus (PDB ID 6L31) the green is represented the Apo protein (PDB ID 6L31).

Figure 6 .
Figure 6.Displays the amino acid positional variation using RMSF analysis of Apo and ligand complexes in 100 ns molecular dynamics simulations.

Figure 7 .
Figure 7. Represents the ROG values of the Apo-protein and protein-ligand complexes to the protein backbone for 100 ns.RoG of Apigenin-6L31, Apigenin-5-O-beta-D glucopyranoside -6L31, and Apo_6L31 are shown in black, red, and green respectively.

Figure 9 .
Figure 9. Represents the number of hydrogen bonds responsible for the stability of the complexes (Apigenin-6L31 and Apigenin-5-6L31) throughout the 100 ns.

Figure 10 .
Figure 10.Contact frequency analysis during the MD simulation.A contact frequency plot of 6L31 residues interacting with Apigenin (black) and Apigenin-5-O-beta-d glucopyranoside antigen (red).

Figure 11 .
Figure 11.Principal component analysis of (a) Apigenin, (b) Apigenin-5, and (c) Apo protein.Each point represents the protein's conformation on the X and Y axes.The chromatic distribution of blue and red dots was utilized to depict the extent of conformational alterations in the simulation.The color gradient ranging from blue to white to red was indicative of the duration of the simulation.The color blue designates the initial timestep, while the color white represents the intermediate timestep, and the color red signifies the final timestep.

Figure 14 .
Figure 14.Time frame analysis at every 10 ns of Apigenin complex with L1 protein of human papillomavirus (PDB ID 6L31).

Figure 15 .
Figure 15.Diagram of the frontier molecular orbital.

Table 2 .
Binding affinity against targeted protein.

Table 3 .
ADMET data of reported ligand.

Table 4 .
Binding energy results.

Table 5 .
Data of QSAR and pIC50 data.

Table 7 .
Chemical reactivity and molecular properties data.