QSPR/QSAR study of antiviral drugs modeled as multigraphs by using TI’s and MLR method to treat COVID-19 disease

P, Ugasini Preetha; Suresh, M.; Tolasa, Fikadu Tesgera; Bonyah, Ebenezer

doi:10.1038/s41598-024-63007-w


Article
Open access
Published: 07 June 2024


Ugasini Preetha P¹, M. Suresh¹, Fikadu Tesgera Tolasa² & Ebenezer Bonyah³

Scientific Reports volume 14, Article number: 13150 (2024)


Abstract

The ongoing COVID-19 pandemic continues to pose significant challenges worldwide, despite widespread vaccination. Researchers are actively exploring antiviral treatments to assess their efficacy against emerging virus variants. The aim of this study is to employ the M-polynomial and neighborhood M-polynomial approaches together with QSPR/QSAR analysis to evaluate specific antiviral drugs, namely Lopinavir, Ritonavir, Arbidol, Thalidomide, Chloroquine, Hydroxychloroquine, Theaflavin and Remdesivir. Applying degree-based and neighborhood degree sum-based topological indices to molecular multigraphs gives insight into the physicochemical properties of these drugs. Properties such as polar surface area, polarizability, surface tension, boiling point, enthalpy of vaporization, flash point, molar refraction and molar volume are crucial in predicting their efficacy against viruses. These properties influence the solubility, permeability and bioavailability of the drugs, which in turn affect their ability to interact with viral targets and inhibit viral replication. In the QSPR analysis, molecular multigraphs yield correlation coefficients exceeding those from simple graphs: molar refraction (MR) (0.9860), polarizability (P) (0.9861), surface tension (ST) (0.6086) and molar volume (MV) (0.9353) using degree-based indices, and flash point (FP) (0.9781) and surface tension (ST) (0.7841) using neighborhood degree sum-based indices. QSAR models, constructed through multiple linear regression (MLR) with a backward elimination approach at a significance level of 0.05, exhibit promising predictive capability for the biological activity $IC_{50}$ (half maximal inhibitory concentration). Notably, the agreement between the observed and predicted values for Remdesivir (observed $pIC_{50} = 6.01$, predicted $pIC_{50} = 6.01$, where $pIC_{50}$ is the negative logarithm of $IC_{50}$) underscores the accuracy of multigraph-based QSAR analysis.
The primary objective is to showcase the valuable contribution of multigraphs to QSPR and QSAR analyses, offering crucial insights into molecular structures and antiviral properties. The integration of physicochemical applications enhances our understanding of factors influencing antiviral drug efficacy, essential for combating emerging viral strains effectively.

Introduction

Graph theory has seen a surge in its application to pharmacology and medicine, with chemical graph theoreticians computing topological indices of drug structures to gain insights into molecular properties and aid drug development. SARS-CoV-2, a single-stranded RNA virus, causes COVID-19, the first major pandemic of the twenty-first century. In 2003, SARS, caused by a new coronavirus strain, led to 916 deaths globally. Similarly, COVID-19 emerged in December 2019 in Wuhan, China, and was declared a global public health emergency by the WHO in January 2020¹. Years later, the world is still facing the coronavirus pandemic. As of May 12, 2024, 10:39am CEST, the World Health Organization (WHO) has reported a global total of 775,379,864 confirmed COVID-19 cases, with 7 million recorded fatalities. For the latest statistics, refer to https://covid19.who.int/.

Our research extends prior studies suggesting that explicitly accounting for double bonds can improve correlation results in molecular modeling. It is inspired by the observation of Kier et al.² in “Medicinal Chemistry: A Series of Monographs” that double-edge counts provide a more accurate representation of double bonds. Recent work by Simon et al. also indicated improved correlations for molecules with weighted Wiener indices compared to traditional Wiener indices for simple graphs, while Zakharov et al. proposed a novel approach using multigraphs for enhanced statistical QSAR model building^3,4. Building on these insights, we conducted a comparative analysis between simple-graph and multigraph models to investigate the impact of double bonds on property estimation accuracy. Topological indices analyze the structure-property relationships in chemical compounds, providing numerical parameters for QSPR and QSAR studies. Research on TI’s has led to the development of over 3000 indices, reflecting the structural properties of the graphs used for their calculation. Most recently, Sakander Hayat et al. explored the use of temperature-based topological indices, valency-based descriptors, distance-based graphical indices, and eigenvalues-based indices to predict physicochemical and thermodynamic properties of polycyclic aromatic hydrocarbons and benzenoid hydrocarbons^5,6,7,8,9,10. Recently, QSPR/QSAR analyses of antiviral drugs, corona drugs and anticancer drugs have been carried out using degree, reverse degree, distance and neighborhood based topological descriptors^{11,12,13,14,15,16}. Zaman et al.^{17,18,19,20,21,22,23,24,25,26} investigate diverse applications of analytical and theoretical studies in chemistry and related fields, focusing on structural analysis, topological characterization, and mathematical modeling of various nanostructures, biochemical networks, and metal-organic models. Their work explores the relationships between molecular topology, irregular molecular descriptors, and novel topological indices, offering insights into the structural properties of complex materials and nanostructures.

This article represents chemical structures using hydrogen suppressed molecular multigraphs with the inclusion of double bonds. A multigraph is a graph containing multiple edges, where multiple edges indicate more than one connection between two vertices, and loops represent edges connecting the same vertex at both ends²⁷. Marrero Ponce in²⁸ discusses the application of QSPR/QSAR analysis for pseudo-graphs (graphs with loops and parallel edges), with considerations for hetero-atoms using the Valence delta concept²⁹. This study compares multigraph and simple graph modeling approaches using topological structure descriptors to estimate physicochemical and biological activity through QSPR/QSAR analysis. Multiple linear regression techniques validate correlation values, aiding in understanding estimators and identifying potential drugs. Notably, no previous literature directly compares multigraph and simple graph efficacy in this context, making this study’s contribution novel and original.

In this study, multigraphs are employed to establish correlations between the physicochemical properties and biological activity of the antiviral drugs. Our QSAR model, utilizing multigraphs, demonstrates a stronger association between the studied biological activity $(pIC_{50})$ and the topological indices than the QSAR model proposed by Kirmani et al.¹¹. The scientific literature has introduced several graph polynomials to aid in the calculation of various graph indices. Distance-based polynomials such as the Hosoya polynomial, PI polynomial, Schultz polynomial, and modified Schultz polynomial have been suggested in previous studies^30,31,32. In addition, Deutsch and Klavžar (2015)³³ developed the M-polynomial as a means to compute different degree-based TI’s.

The M-polynomial of a graph $\mathscr {G}$ is defined as follows:

$$\begin{aligned} M(\mathscr {G};x,y) = \sum _{j \le k}^{} m_{jk}(\mathscr {G})x^jy^k \end{aligned}$$

(1)

In this context, $m_{jk}(\mathscr {G})$ is the number of edges $uv \in E(\mathscr {G})$ with $(d_u, d_v) = (j, k)$, where $d_u$ and $d_v$ are the degrees of vertices u and v, respectively. The NM-polynomial, akin to the M-polynomial, is designed specifically for neighborhood degree sum-based indices³⁴. It serves a similar purpose and is defined as follows:

$$\begin{aligned} NM^{*}(\mathscr {G};x,y) = \sum _{j \le k}^{} nm^{*}_{jk}(\mathscr {G})x^jy^k \end{aligned}$$

(2)

Here $nm^{*}_{jk}(\mathscr {G})$ is the number of edges $uv \in E(\mathscr {G})$ with $(nd^{*}_u, nd^{*}_v) = (j, k)$, where $nd^{*}_u$ and $nd^{*}_v$ denote the neighborhood degree sums of the vertices u and v in the graph, respectively. The objective of this research is to create reliable QSPR/QSAR models that can effectively forecast the physicochemical and biological properties of drugs targeting COVID-19. Throughout the article, the abbreviations ‘NBD’ (neighborhood degree sum-based indices) and ‘D’ (degree-based indices) are used in specific sections for convenience.
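The two definitions above can be illustrated with a minimal Python sketch that tabulates the coefficients $m_{jk}$ and $nm^{*}_{jk}$ from an edge list. The function name `polynomial_coefficients` and the four-vertex example are hypothetical. The sketch also assumes that, in a multigraph, the neighborhood degree sum adds the degree of the other endpoint once per incident edge, a convention the text does not spell out.

```python
from collections import Counter

def polynomial_coefficients(edges):
    """Return (M, NM) coefficient maps {(j, k): count} with j <= k.

    `edges` is a list of (u, v) pairs; a double bond is listed twice.
    Loops are not handled in this sketch.
    """
    # Degree of a vertex = number of incident edges, counted with multiplicity.
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Assumed convention: the neighborhood degree sum adds the degree of the
    # other endpoint once per incident edge (a double bond contributes twice).
    nbd_sum = Counter()
    for u, v in edges:
        nbd_sum[u] += degree[v]
        nbd_sum[v] += degree[u]

    m_poly, nm_poly = Counter(), Counter()
    for u, v in edges:
        m_poly[tuple(sorted((degree[u], degree[v])))] += 1
        nm_poly[tuple(sorted((nbd_sum[u], nbd_sum[v])))] += 1
    return dict(m_poly), dict(nm_poly)

# Hypothetical chain a-b=c-d (not one of the drugs in this study).
toy_edges = [("a", "b"), ("b", "c"), ("b", "c"), ("c", "d")]
m_coeffs, nm_coeffs = polynomial_coefficients(toy_edges)
print(m_coeffs)   # coefficient of x^j y^k in M, keyed by (j, k)
print(nm_coeffs)  # coefficient of x^j y^k in NM*, keyed by (j, k)
```

For this toy chain the sketch gives $M = 2xy^{3} + 2x^{3}y^{3}$ and, under the stated convention, $NM^{*} = 2x^{3}y^{7} + 2x^{7}y^{7}$.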

Material and method

In our study, we used algebraic polynomials to determine the topological indices of the structures of several antiviral drugs, and our analysis yielded important findings in this regard. Table 1 presents the relationship between the different TI’s and the M-polynomial and NM-polynomial from which they are derived; the evaluation at x = 1 and y = 1 used in Table 1 is proved by Klavžar in³³. Neighborhood degree sum-based topological indices, as discussed in references^35,36, demonstrate a remarkable capability to predict various physicochemical properties with high accuracy. Furthermore, a parallel effort has led to the construction of several other neighborhood degree sum-based topological indices, along with their corresponding classical degree-based topological indices, as detailed in references^37,38,39. Mondal et al. conducted a study²⁸ to assess the efficacy of four antiviral drugs in the treatment of COVID-19 patients, employing the M-polynomial and NM-polynomial methods for the evaluation. Additionally, Kirmani et al.¹¹ recently developed QSPR/QSAR models using linear and multiple linear regression to relate the physicochemical and biological properties of potential antiviral drugs to their TI’s in the context of COVID-19 treatment.

To model the antiviral activity of drugs investigated for COVID-19 treatment, a combination of ten ‘D’ and ten ‘NBD’ based TI’s was employed alongside eight physicochemical properties: polar surface area, polarizability, surface tension, boiling point, enthalpy of vaporization, flash point, molar refraction and molar volume. The study focused on the drugs Hydroxychloroquine, Theaflavin, Lopinavir, Ritonavir, Arbidol, Chloroquine and Remdesivir. Thalidomide was excluded from the QSAR study due to insufficient available data on its antiviral activity. Fig. 1 displays the chemical structures of these drugs, which were drawn with ChemSketch. Within this article, the QSAR model uses the biological activity $IC_{50}$ (half maximal inhibitory concentration) to predict the antiviral activity of the mentioned drugs, with multiple linear regression (MLR) as the statistical technique. $IC_{50}$ is a widely used measure in drug development to assess the strength of potential drug candidates and compare their efficacy. It is also used in biochemical studies to understand the properties of proteins and enzymes. $pIC_{50}$ represents the negative logarithm of $IC_{50}$. The physicochemical properties and biological activity data of the antiviral drugs are presented in Table 2. The physicochemical values were sourced from ChemSpider, while the half-maximal inhibitory concentrations ($IC_{50}$) of antiviral activity were collected from the scientific literature^{11,40,41,42,43} and converted to their negative logarithmic scale ($pIC_{50}$) to facilitate data analysis and interpretation.
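The conversion from $IC_{50}$ to $pIC_{50}$ described above can be sketched in a few lines of Python. The text does not state the concentration unit, so this sketch assumes the common convention of taking the base-10 logarithm of $IC_{50}$ expressed in mol/L; the function names and the 1 µM input are hypothetical illustrations, not data from Table 2.

```python
import math

def pic50_from_molar(ic50_molar):
    """Negative base-10 logarithm of an IC50 given in mol/L (assumed unit)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)

def pic50_from_micromolar(ic50_micromolar):
    """Convenience wrapper for an IC50 reported in micromolar."""
    return pic50_from_molar(ic50_micromolar * 1e-6)

# Hypothetical input: an IC50 of 1 micromolar corresponds to a pIC50 of about 6.
print(round(pic50_from_micromolar(1.0), 4))
```

Under this convention a tenfold drop in $IC_{50}$ raises $pIC_{50}$ by one unit, so larger $pIC_{50}$ values indicate more potent compounds.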

Table 1 Description and derivation of various TI’s obtained from the M-polynomial and NM-polynomial where $D_x = x\bigg (\frac{\partial (z(x,y))}{\partial x}\bigg )$, $D_y = y\bigg (\frac{\partial (z(x,y))}{\partial y}\bigg )$, $S_x = \int _{0}^{x} \frac{z(t,y)}{t}dt$, $S_y = \int _{0}^{y} \frac{z(x,t)}{t}dt$, $J(z(x,y)) = z(x,x)$, $Q_{\alpha }(z(x,y)) = x^{\alpha }z(x,y)$ For D based TI’s: $\Delta (u)=d_u$, $\Delta (v) = d_v$, $z(x,y) = M(\mathscr {G};x,y)$ for NBD based TI’s: $\Delta (u)=nd^{*}_u$, $\Delta (v) = nd^{*}_v$, $z(x,y) = NM^{*}(\mathscr {G};x,y)$.


Table 2 Physicochemical properties/biological activity of several antiviral drugs.


Results and discussions

Computation of M-polynomial and NM-polynomial of Lopinavir

In this section, we present the main computational findings of our study. We first analyze the molecular multigraph of Lopinavir and derive its M-polynomial and NM-polynomial, as described in the theorem below. We then extend the analysis to the seven additional molecular drug structures, computing the M-polynomial and NM-polynomial of each; the corresponding index values can be found in Table 3. Only the computation for Lopinavir is shown in detail. Fig. 2 shows the molecular multigraph of Lopinavir, and Fig. 3 shows 3D plots of its M-polynomial and NM-polynomial. The differences in the surface patterns imply that the degree-based and neighborhood degree sum-based topological indices derived from these polynomials will also differ in their numerical values and interpretations. Determining whether one index is superior to another requires further analysis, such as comparing their performance in QSPR/QSAR models, evaluating their correlation coefficients with experimental data, and assessing their ability to discriminate between different molecular structures.

Theorem 1

Let $\mathscr {L}$ be the molecular multigraph of Lopinavir. Then we have,

$$\begin{aligned} M(\mathscr {L};x,y)= & {} 3xy^{3}+2xy^{4}+4x^{2}y^{2}+7x^{2}y^{3}+13x^{2}y^{4}+18x^{3}y^{3}+11x^{3}y^{4}+3x^{4}y^{4}\\ NM^{*}(\mathscr {L};x,y)= & {} 2x^{3}y^{5}+x^{3}y^{6}+x^{4}y^{4}+x^{4}y^{5}+3x^{4}y^{6}+4x^{4}y^{7}+2x^{4}y^{8}+x^{5}y^{9}+x^{5}y^{10}+10x^{6}y^{6}+14x^{6}y^{7}+x^{6}y^{10}+3x^{7}y^{7}+11x^{7}y^{8}+x^{7}y^{9}+x^{7}y^{10}+3x^{8}y^{10}+x^{9}y^{10} \end{aligned}$$

Proof

Consider $\mathscr {L}$ as the molecular multigraph representing Lopinavir (refer to Fig. 2). It comprises a total of 61 edges. Let $\Gamma _{(j,k)}$ denote the set of edges whose endpoints have degrees j and k, respectively, i.e., $\Gamma _{(j,k)} = \{uv \in E(\mathscr {L}): \Delta (u) = j, \Delta (v) = k \}$, and let $m_{(j,k)}$ be the number of edges in $\Gamma _{(j,k)}$. From Fig. 2 it is clear that $m_{(1,3)} = 3, m_{(1,4)} = 2, m_{(2,2)} = 4, m_{(2,3)} = 7, m_{(2,4)} = 13, m_{(3,3)} = 18, m_{(3,4)} = 11, m_{(4,4)} = 3$. To derive the M-polynomial of $\mathscr {L}$, we use Eq. (1).

$$\begin{aligned} \begin{aligned} M(\mathscr {L};x,y)&= \sum _{j \le k}^{} m_{(j,k)}x^jy^k \\&= m_{(1,3)}x^{1}y^{3}+ m_{(1,4)}x^{1}y^{4}+ m_{(2,2)}x^{2}y^{2} + m_{(2,3)}x^{2}y^{3}+ m_{(2,4)}x^{2}y^{4} + m_{(3,3)}x^{3}y^{3} + m_{(3,4)}x^{3}y^{4}+ m_{(4,4)}x^{4}y^{4}. \end{aligned} \end{aligned}$$

By using the values of $m_{(j,k)}$, we get

$$\begin{aligned} M(\mathscr {L};x,y) = 3xy^{3}+2xy^{4}+4x^{2}y^{2}+7x^{2}y^{3}+13x^{2}y^{4}+18x^{3}y^{3}+11x^{3}y^{4}+3x^{4}y^{4}. \end{aligned}$$

Let $\Gamma ^{*}_{(j,k)}$ denote the set of edges whose endpoints have neighborhood degree sums j and k, respectively, i.e., $\Gamma ^{*}_{(j,k)} = \{uv \in E(\mathscr {L}): nd^{*}_u = j, nd^{*}_v = k \}$, and let $nm^{*}_{(j,k)}$ be the number of edges in $\Gamma ^{*}_{(j,k)}$. From Fig. 2 it is clear that $nm^{*}_{(3,5)} = 2, nm^{*}_{(3,6)} = 1, nm^{*}_{(4,4)} = 1, nm^{*}_{(4,5)} = 1, nm^{*}_{(4,6)} = 3, nm^{*}_{(4,7)} = 4, nm^{*}_{(4,8)} = 2, nm^{*}_{(5,9)} = 1, nm^{*}_{(5,10)} = 1, nm^{*}_{(6,6)} = 10, nm^{*}_{(6,7)} = 14, nm^{*}_{(6,10)} = 1, nm^{*}_{(7,7)} = 3, nm^{*}_{(7,8)} = 11, nm^{*}_{(7,9)} = 1, nm^{*}_{(7,10)} = 1, nm^{*}_{(8,10)} = 3, nm^{*}_{(9,10)} = 1$. To derive the NM-polynomial of $\mathscr {L}$, we use Eq. (2).

$$\begin{aligned} NM^{*}(\mathscr {L};x,y)= & {} \sum _{j \le k}^{} nm^{*}_{(j,k)}x^jy^k \\= & {} nm^{*}_{(3,5)}x^{3}y^{5}+ nm^{*}_{(3,6)}x^{3}y^{6}+ nm^{*}_{(4,4)}x^{4}y^{4}+ nm^{*}_{(4,5)}x^{4}y^{5}+ nm^{*}_{(4,6)}x^{4}y^{6} + nm^{*}_{(4,7)}x^{4}y^{7} + nm^{*}_{(4,8)}x^{4}y^{8} \\{} & {} +nm^{*}_{(5,9)}x^{5}y^{9}+nm^{*}_{(5,10)}x^{5}y^{10}+nm^{*}_{(6,6)}x^{6}y^{6}+nm^{*}_{(6,7)}x^{6}y^{7}+nm^{*}_{(6,10)}x^{6}y^{10} +nm^{*}_{(7,7)}x^{7}y^{7} + nm^{*}_{(7,8)}x^{7}y^{8} \\{} & {} +nm^{*}_{(7,9)}x^{7}y^{9}+nm^{*}_{(7,10)}x^{7}y^{10}+nm^{*}_{(8,10)}x^{8}y^{10}+nm^{*}_{(9,10)}x^{9}y^{10}. \end{aligned}$$
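As a quick consistency check on the two edge partitions, setting $x = y = 1$ in either polynomial sums its coefficients, which should recover the 61 edges of $\mathscr {L}$ stated at the start of the proof:

$$\begin{aligned} M(\mathscr {L};1,1)&= 3+2+4+7+13+18+11+3 = 61, \\ NM^{*}(\mathscr {L};1,1)&= 2+1+1+1+3+4+2+1+1+10+14+1+3+11+1+1+3+1 = 61. \end{aligned}$$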

The M-polynomial and NM-polynomial are computed to derive a range of ’D’ and ’NBD’ TI’s for the molecular multigraph representing Lopinavir. These findings are summarized in the following theorem. $\square$

Theorem 2

Let $\mathscr {L}$ be the molecular multigraph of Lopinavir. Then the values of its topological indices given in Table 3 hold.

Proof

Initially, we determine the degree-based indices by referring to Table 1. Let $M(\mathscr {L};x,y) = t(x,y) = 3xy^{3}+2xy^{4}+4x^{2}y^{2}+7x^{2}y^{3}+13x^{2}y^{4}+18x^{3}y^{3}+11x^{3}y^{4}+3x^{4}y^{4}$. Then we have,

1. $M_1(\mathscr {L}) = (D_x+D_y)t(x,y)|_{x=y=1} = \left. \left( 12xy^{3}+10xy^{4}+16x^{2}y^{2}+35x^{2}y^{3}+78x^{2}y^{4}+108x^{3}y^{3}+77x^{3}y^{4}+24x^{4}y^{4}\right) \right| _{x=y=1} = 360$
2. $M_2(\mathscr {L}) = (D_xD_y)t(x,y)|_{x=y=1} = \left. \left( 9xy^{3}+8xy^{4}+16x^{2}y^{2}+42x^{2}y^{3}+104x^{2}y^{4}+162x^{3}y^{3}+132x^{3}y^{4}+48x^{4}y^{4}\right) \right| _{x=y=1} = 521$
3. $mM_2(\mathscr {L}) = S_xS_yt(x,y)|_{x=y=1} = \left. \left( xy^{3}+\frac{2}{4}xy^{4}+x^{2}y^{2}+\frac{7}{6}x^{2}y^{3}+\frac{13}{8}x^{2}y^{4}+\frac{18}{9}x^{3}y^{3}+\frac{11}{12}x^{3}y^{4}+\frac{3}{16}x^{4}y^{4}\right) \right| _{x=y=1} = 8.3958$
4. $ReZG_3(\mathscr {L}) = D_xD_y(D_x+D_y)t(x,y)|_{x=y=1} = \left. \left( 36xy^{3}+40xy^{4}+64x^{2}y^{2}+210x^{2}y^{3}+624x^{2}y^{4}+972x^{3}y^{3}+924x^{3}y^{4}+384x^{4}y^{4}\right) \right| _{x=y=1} = 3254$
5. $F(\mathscr {L}) = (D_x^{2}+D_y^{2})t(x,y)|_{x=y=1} = \left. \left( 30xy^{3}+34xy^{4}+32x^{2}y^{2}+91x^{2}y^{3}+260x^{2}y^{4}+324x^{3}y^{3}+275x^{3}y^{4}+96x^{4}y^{4}\right) \right| _{x=y=1} = 1142$
6. $SDD(\mathscr {L}) = (S_xD_y+S_yD_x)t(x,y)|_{x=y=1} = \left. \left( \frac{30}{3}xy^{3}+\frac{34}{4}xy^{4}+\frac{32}{4}x^{2}y^{2}+\frac{91}{6}x^{2}y^{3}+\frac{260}{8}x^{2}y^{4}+\frac{324}{9}x^{3}y^{3}+\frac{275}{12}x^{3}y^{4}+\frac{96}{16}x^{4}y^{4}\right) \right| _{x=y=1} = 139.0833$
7. $H(\mathscr {L}) = 2S_xJt(x,y)|_{x=1} = \left. 2\left( \frac{7}{4}x^{4}+\frac{9}{5}x^{5}+\frac{31}{6}x^{6}+\frac{11}{7}x^{7}+\frac{3}{8}x^{8}\right) \right| _{x=1} = 21.3262$
8. $I(\mathscr {L}) = S_xJD_xD_yt(x,y)|_{x=1} = \left. \left( \frac{25}{4}x^{4}+\frac{50}{5}x^{5}+\frac{266}{6}x^{6}+\frac{132}{7}x^{7}+\frac{48}{8}x^{8}\right) \right| _{x=1} = 85.4405$
9. $A(\mathscr {L}) = S_x^{3}Q_{-2}JD_x^{3}D_y^{3}t(x,y)|_{x=1} = \left. \left( 42.125x^{2}+60.7407x^{3}+309.0313x^{4}+152.064x^{5}+56.8889x^{6}\right) \right| _{x=1} = 620.8499$
10. $R_{\alpha }(\mathscr {L}) = D_x^{\alpha }D_y^{\alpha }t(x,y)|_{x=y=1} = 3(3)^{\alpha }+2(4)^{\alpha }+4(4)^{\alpha }+7(6)^{\alpha }+13(8)^{\alpha }+18(9)^{\alpha }+11(12)^{\alpha }+3(16)^{\alpha } = 22.1114$ for $\alpha = -\frac{1}{2}$

Next, we compute the neighborhood degree sum-based indices by taking $NM^{*}(\mathscr {L};x,y) = t(x,y) = 2x^{3}y^{5}+x^{3}y^{6}+x^{4}y^{4}+x^{4}y^{5}+3x^{4}y^{6}+4x^{4}y^{7}+2x^{4}y^{8}+x^{5}y^{9}+x^{5}y^{10}+10x^{6}y^{6}+14x^{6}y^{7}+x^{6}y^{10}+3x^{7}y^{7}+11x^{7}y^{8}+x^{7}y^{9}+x^{7}y^{10}+3x^{8}y^{10}+x^{9}y^{10}$. Applying the operators of Table 1 to this polynomial, built from the edge partition $\Gamma ^{*}_{(j,k)}$, yields the neighborhood degree sum-based indices in the same way, thus concluding the proof. The obtained values of the ‘D’ and ‘NBD’ indices, calculated using the M-polynomial and NM-polynomial, are displayed in Tables 3 and 4, respectively. $\square$
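The degree-based values in the proof can be cross-checked numerically. Applying each operator to a single monomial $m\,x^{j}y^{k}$ and evaluating at 1 gives a closed form in j and k, so every index reduces to a weighted sum over the edge partition of Theorem 1. The Python sketch below uses those closed forms; the function name `degree_indices` is hypothetical, and $\alpha = -1/2$ is assumed for $R_{\alpha}$ because that choice reproduces the value 22.1114.

```python
# Edge partition m_(j,k) of the Lopinavir multigraph, copied from Theorem 1.
LOPINAVIR_M = {(1, 3): 3, (1, 4): 2, (2, 2): 4, (2, 3): 7,
               (2, 4): 13, (3, 3): 18, (3, 4): 11, (4, 4): 3}

def degree_indices(partition, alpha=-0.5):
    """Evaluate the ten indices as sums of m_(j,k) * f(j, k)."""
    def total(f):
        return sum(m * f(j, k) for (j, k), m in partition.items())

    return {
        "M1": total(lambda j, k: j + k),
        "M2": total(lambda j, k: j * k),
        "mM2": total(lambda j, k: 1 / (j * k)),
        "ReZG3": total(lambda j, k: j * k * (j + k)),
        "F": total(lambda j, k: j**2 + k**2),
        "SDD": total(lambda j, k: (j**2 + k**2) / (j * k)),
        "H": total(lambda j, k: 2 / (j + k)),
        "I": total(lambda j, k: j * k / (j + k)),
        # The augmented Zagreb term is undefined for (1, 1) edges; none occur here.
        "A": total(lambda j, k: (j * k / (j + k - 2)) ** 3),
        "R_alpha": total(lambda j, k: (j * k) ** alpha),
    }

lopinavir = degree_indices(LOPINAVIR_M)
for name, value in lopinavir.items():
    print(f"{name}: {value:.4f}")
```

The same function can be applied to the NM-polynomial coefficients to obtain the corresponding ‘NBD’ indices, since only the meaning of (j, k) changes.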

Table 3 Selected antiviral drugs with degree based TI’s.


Table 4 Selected antiviral drugs with neighborhood degree sum based TI’s.


QSPR analysis of selected antiviral drugs with its target properties

Regression analyses

Table 5 Correlation coefficients (r) of degree based indices and the physicochemical properties of antiviral drugs modeled as molecular multigraphs using linear regression model.


Table 6 Correlation coefficients (r) between ‘NBD’ and the physicochemical properties of antiviral drugs, modeled as molecular multigraphs using a linear regression model.


To clarify the physical significance of our results, we have included concise discussions on the effectiveness of the computed topological indices. These quantitative measures reveal key structural attributes, with higher values indicating enhanced stability and lower reactivity, and lower values suggesting potential reactivity sites. Our study validates the predictive power of these indices by demonstrating strong correlations with experimental properties, supporting their use in understanding structure-property relationships and guiding drug design and development. We highlight the practical applications in drug delivery and material design while acknowledging the need to consider molecular context and explore advanced methods for improved accuracy. The correlations between the ‘D’ and ‘NBD’ based TI’s and the physicochemical properties of the antiviral drugs (COVID-19 drugs) are given in Tables 5 and 6. From Table 5 we observe that the inverse sum indeg index (estimator) shows a strong positive relationship with boiling point (outcome variable), as depicted in Fig. 4.

From Fig. 5 we observe that the highest correlation coefficients r for surface tension (ST), molar refractivity (MR), molar volume (MV) and polarizability (P) are higher for the multigraph representation than for the simple graph representation of the selected antiviral drugs. The existence of a double bond in a molecule can greatly impact its properties, including polarity, conjugation, and reactivity. These changes, in turn, can impact the molecule’s solubility, stability, and biological activity. For example, a double bond introduces regions of different electron density, resulting in a shift in polarity; it can make the molecule more or less polar depending on the surrounding atoms and functional groups. Molecular multigraphs can therefore provide a more detailed and nuanced representation of the chemical structure. For comparison, the highest correlation coefficients reported in¹¹ for the simple graph representation of seven drugs using degree-based indices are r = 0.9709 for MR, 0.9710 for P, 0.5115 for ST and 0.9108 for MV. The corresponding values for molecular multigraphs appear in Table 5 as bold values with an asterisk (*). Similarly, from Table 6 we observe that the neighborhood inverse sum indeg index (NI) (predictor variable) shows a strong positive relationship with boiling point (outcome variable), as depicted in Fig. 6.

From Fig. 7 we observe that the highest correlation coefficients r for flash point (FP) and surface tension (ST) are higher for the multigraph representation than for the simple graph representation of the selected antiviral drugs. For comparison, the highest correlation coefficients reported in¹¹ for the simple graph representation of seven drugs using neighborhood degree sum-based indices are r = 0.9629 for FP and r = 0.6682 for ST. The corresponding values for molecular multigraphs appear in Table 6 as bold values with an asterisk (*).

Note: For some properties, the highest correlation values for the multigraph are nearly identical to those for the simple graph, for both ‘D’ and ‘NBD’ based indices. For example, the simple graph representation in¹¹ gives 0.9920 for BP and 0.9887 for E, whereas the multigraph gives 0.9864 for BP and 0.9827 for E. The differences are small, and for other properties the multigraph values are higher than those of the simple graph. However, when there is a low correlation between chemical structure descriptors and a target property, it suggests that additional factors may play a more significant role in determining that property. Further analysis or experimentation might be necessary to identify and understand those factors.
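The correlation coefficients r discussed in this section are Pearson correlations between one topological index and one physicochemical property across the drugs. A minimal Python sketch of that calculation follows; the function name `pearson_r` and the numbers are hypothetical placeholders, not values from Tables 3 to 6.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical index values and property values for four compounds.
index_values = [120.0, 180.0, 240.0, 360.0]
property_values = [410.0, 520.0, 600.0, 790.0]
print(f"r = {pearson_r(index_values, property_values):.4f}")
```

Repeating this for every index and property pair would fill a table of the same shape as Tables 5 and 6.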

QSAR analyses of biological activity $pIC_{50}$ versus degree-based and neighborhood degree sum-based indices as predictors

Within this section, we employed IBM SPSS Statistics Version 27.0.1.0 (available at https://www.ibm.com/support/pages/downloading-ibm-spss-statistics-27010) to carry out multiple linear regression analyses. $pIC_{50}$ values were used as the dependent variable and several ‘D’ and ‘NBD’ based indices (see Table 1) were used as independent variables. $IC_{50}$, also known as half maximal inhibitory concentration, is a parameter that measures the effectiveness of a drug or compound in inhibiting a specific biological or biochemical process. It represents the concentration at which the drug blocks the target protein’s function by 50%. $pIC_{50}$ is a transformed version of $IC_{50}$, where the “p” stands for the negative logarithm (base 10) of the $IC_{50}$ value. $pIC_{50}$ is used in regression analyses in preference to $IC_{50}$ because it is more nearly linearly related to drug potency. The selection of the optimal multiple linear regression model was based on these statistical criteria: Fisher ratio (F), squared multiple correlation coefficient $(R^2)$, adjusted squared correlation coefficient $(R^{2}_{adj})$, Durbin–Watson value (DW), variance inflation factor (VIF), tolerance value and significance (Sig). The main difference between QSPR and QSAR is the type of property being predicted. QSPR models use statistical and mathematical methods to link the molecular structure of compounds to their physicochemical properties. QSAR models, on the other hand, use statistical and machine learning techniques to correlate the molecular structure of compounds with their biological activities.

MLR model and MLR analyses

Multiple linear regression (MLR)⁵⁵ is a statistical technique that explores the linear relationship between a dependent (target) variable Y, here $pIC_{50}$, and multiple independent (predictor) variables X, here 2D descriptors. Its purpose is to find the best-fitting regression equation, the one that minimizes the differences between the predicted and actual values of the dependent variable. Through least squares fitting, MLR estimates the regression coefficients of the model, and the squared correlation coefficient $(r^2)$ measures the quality of the fit. This approach yields a linear equation that represents the overall data points. The regression equation is formulated as follows:

$$\begin{aligned} Y = b_1 *I_1 + b_2 *I_2 + b_3 *I_3 + c \end{aligned}$$

(3)

In the regression equation, the dependent variable is represented as Y, and the regression coefficients ‘b’ correspond to the independent variables ‘I’. The intercept or regression constant is denoted as ‘c’⁵⁶. Kirmani et al.¹¹ conducted a QSAR analysis on antiviral drugs represented as simple graphs, which suggested a weak association between biological activity $(pIC_{50})$ and TI’s. Inspired by their approach, we applied a similar analysis using molecular multigraphs for our selected drugs and obtained a well-fitting QSAR model by the backward elimination method, which is elaborated in the upcoming section.
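To make the least squares step concrete, the following Python sketch fits an equation of the form of Eq. (3) by solving the normal equations with Gauss-Jordan elimination. It is an illustration only: the study itself used SPSS, the function name `fit_mlr` is hypothetical, and the data are synthetic values generated from known coefficients so that the fit can be checked.

```python
def fit_mlr(rows, targets):
    """Least-squares fit of Y = b1*I1 + ... + bp*Ip + c via the normal equations."""
    design = [list(row) + [1.0] for row in rows]  # last column carries the intercept c
    p = len(design[0])
    # Augmented normal-equation system (X^T X | X^T y).
    system = [
        [sum(r[a] * r[b] for r in design) for b in range(p)]
        + [sum(r[a] * y for r, y in zip(design, targets))]
        for a in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(system[r][col]))
        if abs(system[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        system[col], system[pivot] = system[pivot], system[col]
        for r in range(p):
            if r != col:
                factor = system[r][col] / system[col][col]
                system[r] = [a - factor * b for a, b in zip(system[r], system[col])]
    solution = [system[i][p] / system[i][i] for i in range(p)]
    return solution[:-1], solution[-1]

# Synthetic predictors (I1, I2, I3); targets follow Y = 2*I1 - I2 + 0.5*I3 + 3 exactly.
rows = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
        (1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (2.0, 1.0, 3.0)]
targets = [2 * a - b + 0.5 * c + 3 for a, b, c in rows]
coefficients, intercept = fit_mlr(rows, targets)
print([round(b, 6) for b in coefficients], round(intercept, 6))
```

With strongly correlated predictors the normal equations become ill-conditioned, which is one practical reason the multicollinearity checks discussed in this article matter.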

Multicollinearity and VIF⁵⁷

Multicollinearity refers to high correlation among independent variables, which can result in unstable and unreliable regression coefficient estimates. The variance inflation factor (VIF), commonly reported by tools such as SPSS, is a measure used to evaluate the presence of multicollinearity in regression analysis. It is defined as $VIF = \frac{1}{1-R^2}$, where $R^2$ is obtained by regressing one independent variable on the others, so VIF cannot fall below 1. VIF values from 1 to 10 are taken to indicate no serious multicollinearity, while values above 10 suggest its presence. Our regression models showed signs of multicollinearity, as some independent variables had correlation coefficients near 1 and corresponding VIF values above 10. This implies that the model may struggle to accurately estimate the individual effects of these correlated variables. Hence, it is crucial to address this issue to ensure the reliability and accuracy of our regression results.
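The VIF formula can be evaluated directly, as in the short Python sketch below. The function name and the sample $R^2$ values are hypothetical; $R^2$ here is understood as the coefficient of determination from regressing one predictor on the remaining predictors.

```python
def variance_inflation_factor(r_squared):
    """VIF = 1 / (1 - R^2) for a predictor regressed on the other predictors."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)

# Hypothetical R^2 values, from an uncorrelated predictor to a nearly redundant one.
for r2 in (0.0, 0.5, 0.9, 0.99):
    print(f"R^2 = {r2:.2f} -> VIF = {variance_inflation_factor(r2):.1f}")
```

A VIF of 10 corresponds to $R^2 = 0.9$, which is why a pairwise correlation near 1 between two indices pushes the VIF well past the threshold of 10.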

QSAR model for $pIC_{50}$

The correlation matrix is a helpful tool for detecting multicollinearity in regression models. It displays the pairwise correlations between multiple variables, indicating the strength and direction of their relationships. By examining the matrix for high correlations between independent variables, we can identify multicollinearity and take appropriate measures to address it. Supplementary Table S1 presents the correlation matrix between the various ‘D’ and ‘NBD’ based indices. In QSAR analysis, one of the primary goals is to identify the molecular descriptors or features most strongly correlated with the target property. When there are numerous molecular descriptors, including all of them in the model may not be practical. Variable selection techniques are therefore used to identify the most significant descriptors, which helps improve the predictive performance of the model. Stepwise regression is one such variable selection method commonly used in QSAR analysis. It iteratively adds or removes descriptors based on their statistical significance in predicting the target property. The process continues until no more significant descriptors remain, resulting in an effective model.

We began constructing simple linear regression models using topological indices that had the lowest correlation (specifically, 0.1170 between $NDe_3$ and $NmM_2$). This led to the development of two mono-parameter models. However, both models demonstrated a weak correlation with $pIC_{50}$.

$$\begin{aligned} pIC_{50} = 6.183921-0.48734(\pm 0.502904)NmM_2 \end{aligned}$$

(Model 1)

$n=7, r=0.3976, R^2=0.1581, R_A^{2} = -0.01026, SE=0.4512, F=0.9390, PE=0.2121$

Here n is the number of drugs used, r (R) is the simple (multiple) correlation coefficient, $R_A^{2}$ is the adjusted $R^{2}$, F is Fisher’s statistic and PE is the probability error.

By employing stepwise regression analysis, various combinations of two topological indices were examined. The following bi-parametric model shows significantly improved statistical measures in comparison to its mono-parametric counterpart (Model 1).

$$\begin{aligned} pIC_{50} = 6.782221-1.9\times 10^{-5}(\pm 1.06\times 10^{-5})NDe_3-0.39912(\pm 0.422226)NmM_2 \end{aligned}$$

(Model 2)

$n=7, r=0.7292, R^2=0.5317, R_A^{2}=0.2976, SE=0.3762, F=2.2711, PE= 0.1179$.

To improve the statistical parameters of the models, trials were conducted to determine the correlation between three combined TI’s and the biological activity $pIC_{50}$. However, the resulting model exhibited only marginal improvements in its statistical measures.

$$\begin{aligned} pIC_{50} = 5.76991-0.00392(\pm 0.001944)S+0.000313(\pm 0.000165)NDe_3+1.170587(\pm 0.84103)NmM_2 \end{aligned}$$

(Model 3)

$n=7, r=0.8950, R^2=0.8011, R_A^{2}=0.6022, SE=0.2831, F=4.0282, PE= 0.0501$.

By applying successive Stepwise regression, a tetra-parametric model was derived, showcasing notable enhancements in the statistical parameters.

$$\begin{aligned} pIC_{50}&= 6.945062 + 0.001272(\pm 0.000599)NF - 0.00388(\pm 0.00132)S \\&\quad + 0.000167(\pm 0.00131)NDe_3 - 0.58105(\pm 1.003055)NmM_2 \end{aligned}$$

(Model 4)

$n=7, r=0.9689, R^2=0.9389, R_A^{2}=0.8167, SE=0.1921, F=7.6844, PE= 0.0154$.

After a further round of stepwise regression, a penta-parametric model was obtained. It has a higher r and $R^2$ than Model 4, but a lower adjusted $R^2$ and F-value.

$$\begin{aligned} pIC_{50}&= 6.274774 + 0.030819(\pm 0.036622)NM_2 - 0.01093(\pm 0.014519)NF \\&\quad - 0.01637(\pm 0.014921)S + 0.000939(\pm 0.000928)NDe_3 + 0.726002(\pm 1.8948)NmM_2 \end{aligned}$$

(Model 5)

$n=7, r=0.9819, R^2=0.9642, R_A^{2}=0.7854, SE=0.2079, F=5.3922, PE= 0.0090$.

In the QSAR models above, the F-value is the ratio between the variability accounted for by the model and the residual variability ascribed to error. It is used as an indicator of the model’s statistical significance, with a higher F-value suggesting a greater probability of statistical significance. The probable error of the correlation coefficient is computed as $PE = \frac{2(1-r^2)}{3\sqrt{n}}$⁵⁶; a smaller PE indicates a more reliable correlation. The p-value is the probability of observing outcomes at least as extreme as those obtained if the null hypothesis is true, and so quantifies the strength of the evidence against the null hypothesis. A predetermined significance level (alpha), commonly set at 0.05, is the accepted probability of a type I error and serves as the threshold for deciding whether to reject the null hypothesis.

In our QSAR models the results were not significant, because the p-values were greater than 0.05. Although selecting the least correlated variables reduces the problem of pairwise correlation (the correlation between two variables), it does not account for higher-order correlations among the variables (multicollinearity). Since all p-values exceeded 0.05, we discarded the predictor variables included in these models. To mitigate this problem, we used the backward elimination method. The objective was to identify a subset of predictor variables that exhibited the most robust association with the response variable $(pIC_{50})$ while avoiding over-fitting the model through an excessive number of predictors.
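As a consistency check, the summary statistics reported for Models 1 to 5 can be recomputed from $R^2$, the number of drugs n and the number of predictors k. The sketch below assumes the standard textbook formulas for the adjusted $R^2$ and the F-statistic, which the text does not state explicitly, together with the PE formula quoted above. The function names are illustrative.

```python
from math import sqrt


def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F-statistic: explained variance per predictor over residual variance."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def pe_statistic(r2, n):
    """PE = 2(1 - r^2) / (3 sqrt(n)), the formula quoted in the text."""
    return 2.0 * (1.0 - r2) / (3.0 * sqrt(n))


# (R^2, number of predictors k) as reported for Models 1-5; n = 7 drugs.
reported = {1: (0.1581, 1), 2: (0.5317, 2), 3: (0.8011, 3),
            4: (0.9389, 4), 5: (0.9642, 5)}

for model, (r2, k) in reported.items():
    print(
        f"Model {model}: "
        f"adjusted R^2 = {adjusted_r2(r2, 7, k):.4f}, "
        f"F = {f_statistic(r2, 7, k):.4f}, "
        f"PE = {pe_statistic(r2, 7):.4f}"
    )
```

Under these assumptions the recomputed values agree with the reported ones to about three decimal places. The small differences are consistent with $R^2$ having been rounded to four decimals before being reused here.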

Backward elimination method and validation

Backward elimination is a feature selection method used in statistical modeling and machine learning. It aims to identify the most relevant subset of features (independent variables) for a given predictive model. The method starts with a full model that includes all available features and iteratively eliminates features that are found to be non-significant. For a QSAR study that uses TI’s with the backward elimination method, see ref.⁵⁸. By conducting a 2D-QSAR analysis on the biological activity $pIC_{50}$ of antiviral drugs, we generated multiple QSAR models. During the stepwise regression process, we identified and eliminated five independent variables that exhibited insignificant associations with $pIC_{50}$. Our study initially encompassed 18 independent (predictor) variables; after removing the insignificant ones, 13 predictors remained: $M_1$, F, $M_2$, H, SDD, $mM_2$, A, NH, I, $NM_1$, $ReZG_3$, $NDe_5$ and NI. Backward elimination started from these 13 predictors, with the aim of identifying the subset of independent variables that displayed the strongest association with $pIC_{50}$. The best linear model for $pIC_{50}$ contains three topological indices: $ReZG_3$, $NDe_5$ and NH. This model, model 3 in Table 7, showed the best combination of predictors based on various statistical parameters.
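The elimination loop described above can be sketched as follows. This is a minimal illustration, not the authors' procedure or software: it assumes ordinary least squares with an intercept, and it tests each coefficient by comparing |t| with the two-sided 5% critical value of Student's t, which is equivalent to requiring p < 0.05. The data are synthetic and the names `x1`, `x2`, `x3` are placeholders.

```python
import numpy as np

# Two-sided 5% critical values of Student's t for 1..10 residual degrees
# of freedom (the sketch therefore only covers small data sets).
T_CRIT_05 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def t_statistics(X, y):
    """Fit OLS with an intercept; return slope t-statistics and residual dof."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - A.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    return (beta / se)[1:], dof


def backward_elimination(X, y, names):
    """Drop the least significant predictor until all pass the 5% t-test."""
    names = list(names)
    cols = list(range(X.shape[1]))
    while cols:
        t, dof = t_statistics(X[:, cols], y)
        weakest = int(np.argmin(np.abs(t)))
        if abs(t[weakest]) >= T_CRIT_05[dof]:
            break  # every remaining predictor is significant at the 5% level
        del cols[weakest]
    return [names[c] for c in cols]


# Synthetic example: y depends on x1 only; the noise vector is constructed
# to be orthogonal to every column, so x2 and x3 get coefficients near zero.
x1 = np.arange(1.0, 9.0)
x2 = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
x3 = np.array([1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0])
noise = np.array([0.1, -0.1, -0.1, 0.1, -0.1, 0.1, 0.1, -0.1])
y = 2.0 + 3.0 * x1 + noise

kept = backward_elimination(np.column_stack([x1, x2, x3]), y, ["x1", "x2", "x3"])
print(kept)
```

With only seven drugs, a model with five predictors leaves a single residual degree of freedom, so the critical value is very large (12.706). This is one reason why few predictors survive elimination on small data sets.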

Table 7 Backward elimination: QSAR models.

Validation: Durbin–Watson statistics and tolerance⁵⁹

The Durbin–Watson (DW) statistic measures autocorrelation in regression residuals. Autocorrelation occurs when residuals are correlated with one another (for example, over time), violating the assumption of independence. The statistic ranges from 0 to 4: a value of 2 indicates no autocorrelation, a value below 2 indicates positive autocorrelation, and a value above 2 suggests negative autocorrelation. A DW value close to 2 therefore indicates no significant autocorrelation in the residuals, which suggests that the model effectively represents the relationship between the variables. In our final QSAR model 3, the DW value is around 2, indicating that the errors are uncorrelated. Tolerance is employed as an indicator of multicollinearity, measuring the correlation among the independent variables in a model. It ranges from 0 to 1: a tolerance near 1 indicates a low degree of correlation among the predictor variables and hence reduced multicollinearity, whereas a tolerance close to 0 indicates high correlation among predictors and a potential multicollinearity problem.
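Both diagnostics are simple to compute. The sketch below uses the standard definitions: DW is the sum of squared successive residual differences divided by the sum of squared residuals, and the tolerance of predictor j is $1 - R_j^2$, where $R_j^2$ comes from regressing that predictor on the others, with VIF as its reciprocal. The residual sequences and the value $R_j^2 = 0.8$ are made-up illustrations, not results from the paper.

```python
def durbin_watson(residuals):
    """DW = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den


def tolerance(r2_j):
    """Tolerance of predictor j; r2_j is R^2 from regressing it on the others."""
    return 1.0 - r2_j


def vif(r2_j):
    """Variance inflation factor, the reciprocal of the tolerance."""
    return 1.0 / tolerance(r2_j)


# Made-up residual sequences (illustrative only).
trending = [0.5, 0.5, -0.5, -0.5]     # neighbours alike: positive autocorrelation
alternating = [0.5, -0.5, 0.5, -0.5]  # neighbours opposed: negative autocorrelation

print(durbin_watson(trending))     # below 2
print(durbin_watson(alternating))  # above 2
print(tolerance(0.8), vif(0.8))    # tolerance near 0.2 corresponds to VIF near 5
```

Under these definitions the criterion $VIF < 5$ is the same as requiring a tolerance above 0.2.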

Discussion

Backward elimination typically uses a significance threshold (p-value) to determine whether a predictor should be removed from the model. A predictor that already exceeds the threshold at the outset is considered non-significant and is excluded directly without further evaluation. In our analysis, 8 of the 13 predictors did not meet the required statistical criteria (p-values, VIF and tolerance values) and were excluded from further analysis; the statistical parameters indicated that they did not contribute significantly to the model and may have exhibited multicollinearity. The remaining 5 predictors were carried forward to backward elimination, the results of which are presented in Table 7. Among these models, model 3 is the best for predicting the biological activity $pIC_{50}$ according to the following criteria: $VIF < 5$, tolerance values not close to zero, DW = 1.850, and all p-values below 0.05.

$$\begin{aligned} pIC_{50} = 6.745589-0.133408(NH)+0.009435(NDe_5)-0.000442(ReZG_3) \end{aligned}$$

(Model 3)

Ordinary residuals or regular residuals⁵⁹

Regular residual $=$ observed value − predicted value. In other words, a residual is the difference between the observed value of the dependent variable and the value estimated by the regression model. It represents the variability that the model was unable to explain, and corresponds to the vertical distance between an observed data point and the regression line or curve. The comparison between the observed and predicted values of the biological activity $pIC_{50}$ for seven antiviral drugs is presented in Table 8. Figure 8 illustrates the linear relationship between the observed $pIC_{50}$ values and the $pIC_{50}$ values predicted by model 3 for these drugs.
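The residual definition can be applied to Model 3 directly. The sketch below encodes the Model 3 coefficients given in the Discussion. The descriptor inputs (10, 100, 1000) are arbitrary placeholders used only to show the call and are not index values of any drug in the study. The single observed and predicted pair is the one quoted for Remdesivir in the abstract.

```python
def predict_pic50_model3(nh, nde5, rezg3):
    """Evaluate Model 3 using the coefficients quoted in the Discussion."""
    return 6.745589 - 0.133408 * nh + 0.009435 * nde5 - 0.000442 * rezg3


def residual(observed, predicted):
    """Regular residual: observed value minus predicted value."""
    return observed - predicted


# Arbitrary placeholder descriptor values (NOT taken from the paper's tables).
example_prediction = predict_pic50_model3(nh=10.0, nde5=100.0, rezg3=1000.0)
print(round(example_prediction, 6))

# Remdesivir, as quoted in the abstract: observed 6.01, predicted 6.01.
print(residual(6.01, 6.01))
```

A residual of zero at the reported two-decimal precision means only that the observed and predicted values agree after rounding.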

Table 8 Comparison between predicted and observed values of model 3 for the validation of $pIC_{50}$ of the respective drugs.

Conclusion

This study delves into the evaluation of various antiviral drugs for treating COVID-19, utilizing molecular multigraphs to analyze their chemical structures. Through edge partition techniques, M-polynomial and NM-polynomial expressions were derived, leading to the computation of ’D’ and ’NBD’ based indices. The research also involved a thorough QSPR investigation focusing on antiviral drugs as multigraphs, showcasing the predictive power of computed topological indices (TI’s) in determining physicochemical properties. Notably, the inverse sum indeg and neighborhood inverse sum indeg indices exhibited a strong positive correlation with boiling point (BP), surpassing other indices.

Further, the biological activity $pIC_{50}$ of these antiviral drugs was estimated by QSAR analysis using multiple linear regression in conjunction with a backward elimination approach. The validation criteria were designed to assess the accuracy and predictive capability of the MLR model. The results show that the MLR model is an effective tool for estimating $pIC_{50}$, with specific TI’s such as NH, $NDe_5$ and $ReZG_3$ showing significant predictive potential. In addition, the observed and predicted $pIC_{50}$ values of the drugs for the best model, evaluated using cross validation techniques, show only minor variation, resulting in low residuals.

The study highlights the importance of considering multigraphs as graph models, offering a novel perspective on drug connectivity analysis. By diverging from conventional approaches focused on simple graphs, the research has provided insights into optimizing the drug selection process. An open challenge remains in incorporating chemometric methods (statistical and mathematical techniques for analyzing chemical data) to further refine these models. Using these techniques, researchers can advance our understanding of drug behavior and improve strategies for enhancing drug effectiveness.

Data availability

The paper includes the information used to verify the study’s findings.

References

Pillaiyar, T., Manickam, M., Namasivayam, V., Hayashi, Y. & Jung, S. H. An overview of severe acute respiratory syndrome-coronavirus (SARS-COV) 3cl protease inhibitors: Peptidomimetics and small molecule chemotherapy. J. Med. Chem. 59, 6595–6628 (2016).
Hite, G. Medicinal Chemistry: A Series of Monographs: By George deStevens 1st edn. (Academic Press, 1964).
Brezovnik, S., Tratnik, N. & Žigert Pleteršek, P. Weighted Wiener indices of molecular graphs with application to alkenes and alkadienes. Mathematics 9, 153 (2021).
Zakharov, A. B., Tsarenko, D. K. & Ivanov, V. V. Topological characteristics of iterated line graphs in the QSAR problem: A multigraph in the description of properties of unsaturated hydrocarbons. Struct. Chem. 32, 1629–1639 (2021).
Hayat, S., Alanazi, S. J. & Liu, J. B. Two novel temperature-based topological indices with strong potential to predict physicochemical properties of polycyclic aromatic hydrocarbons with applications to silicon carbide nanotubes. Phys. Scr. 99, 055027 (2024).
Hayat, S., Mahadi, H., Alanazi, S. J. & Wang, S. Predictive potential of eigenvalues-based graphical indices for determining thermodynamic properties of polycyclic aromatic hydrocarbons with applications to polyacenes. Comput. Mater. Sci. 238, 112944 (2024).
Hayat, S. & Liu, J. B. Comparative analysis of temperature-based graphical indices for correlating the total $\uppi$-electron energy of benzenoid hydrocarbons. Int. J. Mod. Phys. B 2550047 (2024).
Hayat, S., Khan, A., Ali, K. & Liu, J. B. Structure-property modeling for thermodynamic properties of benzenoid hydrocarbons by temperature-based topological indices. Ain Shams Eng. J. 15, 102586 (2024).
Hayat, S. Distance-based graphical indices for predicting thermodynamic properties of benzenoid hydrocarbons with applications. Comput. Mater. Sci. 230, 112492 (2023).
Hayat, S., Suhaili, N. & Jamil, H. Statistical significance of valency-based topological descriptors for correlating thermodynamic properties of benzenoid hydrocarbons with applications. Comput. Theor. Chem. 1227, 114259 (2023).
Kirmani, S. A. K., Ali, P. & Azam, F. Topological indices and QSPR/QSAR analysis of some antiviral drugs being investigated for the treatment of Covid-19 patients. Int. J. Quantum Chem. 121, e26594 (2021).
Bokhary, S. A. U. H., Siddiqui, M. K. A. & Cancan, M. On topological indices and QSPR analysis of drugs used for the treatment of breast cancer. Polycycl. Arom. Compds. 42, 6233–6253 (2022).
Shirakol, S., Kalyanshetti, M. & Hosamani, S. M. QSPR analysis of certain distance-based topological indices. Appl. Math. Nonlinear Sci. 4, 371–386 (2019).
Shanmukha, M. C., Basavarajappa, N. S., Shilpa, K. C. & Usha, A. Degree-based topological indices on anticancer drugs with QSPR analysis. Heliyon 6 (2020).
Kirmani, S. A. K., Ali, P., Azam, F. & Alvi, P. A. On ve-degree and ev-degree topological properties of hyaluronic acid-anticancer drug conjugates with QSPR. J Chem. 2021, 1–23 (2021).
Arockiaraj, M., Greeni, A. & Kalaam, A. Linear versus cubic regression models for analyzing generalized reverse degree based topological indices of certain latest corona treatment drug molecules. Int. J. Quantum Chem. 123, e27136 (2023).
Zaman, S., Jalani, M., Ullah, A. & Saeedi, G. Structural analysis and topological characterization of sudoku nanosheet. J. Math. (2022).
Ullah, A., Zaman, S., Hamraz, A. & Saeedi, G. Network-based modeling of the molecular topology of fuchsine acid dye with respect to some irregular molecular descriptors. J. Chem. (2022).
Ullah, A., Zaman, S. & Hamraz, A. Zagreb connection topological descriptors and structural property of the triangular chain structures. Phys. Scr. 98, 025009 (2023).
Zaman, S., Jalani, M., Ullah, A., Ali, M. & Shahzadi, T. On the topological descriptors and structural analysis of cerium oxide nanostructures. Chem. Pap. 77, 2917–2922 (2023).
Zaman, S., Jalani, M., Ullah, A., Ahmad, W. & Saeedi, G. Mathematical analysis and molecular descriptors of two novel metal-organic models with chemical applications. Sci. Rep. 13, 5314 (2023).
Ullah, A., Bano, Z. & Zaman, S. Computational aspects of two important biochemical networks with respect to some novel molecular descriptors. J. Biomol. Struct. Dyn. 42, 791–805 (2024).
Hakeem, A., Ullah, A. & Zaman, S. Computation of some important degree-based topological indices for γ-graphyne and zigzag graphyne nanoribbon. Mol. Phys. 121, e2211403 (2023).
Zaman, S., Salman, M., Ullah, A., Ahmad, S. & Abdelgader Abas, M. Three-dimensional structural modelling and characterization of sodalite material network concerning the irregularity topological indices. J. Math. 1–9 (2023).
Zaman, S., Ullah, A. & Shafaqat, A. Structural modeling and topological characterization of three kinds of dendrimer networks. Eur. Phys. J. E 46, 36 (2023).
Ullah, A., Zaman, S., Hussain, A., Jabeen, A. & Belay, M. Derivation of mathematical closed form expressions for certain irregular topological indices of 2D nanotubes. Sci. Rep. 13, 11187 (2023).
Trudeau, R. J. Introduction to Graph Theory (Courier Corporation, 2013).
Marrero-Ponce, Y. Linear indices of the “molecular pseudograph’s atom adjacency matrix’’: Definition, significance-interpretation, and application to qsar analysis of flavone derivatives as hiv-1 integrase inhibitors. J. Chem. Inf. Comput. Sci. 44, 2010–2026 (2004).
Kier, L. & Hall, L. Molecular connectivity VII: Specific treatment of heteroatoms. J. Pharmaceut. Sci. 65, 1806–1809 (1976).
Stevanović, D. Hosoya polynomial of composite graphs. Discrete Math. 235(1–3), 237–244 (2001).
Khadikar, P. On a novel structural descriptor PI. Natl. Acad. Sci. Lett. 23, 113–118 (2000).
Schultz, H. P. Topological organic chemistry. 1. Graph theory and topological indices of alkanes. J. Chem. Inf. Comput. Sci. 29.
Deutsch, E. & Klavžar, S. M-polynomial and degree-based topological indices. arXiv preprint arXiv: 1407.1592 (2014).
Mondal, S., De, N. & Pal, A. On some general neighborhood degree based topological indices. Int. J. Appl. Math. 32, 1037 (2019).
Shanmukha, M. C., Basavarajappa, N. S., Usha, A. & Shilpa, K. C. Novel neighbourhood redefined first and second Zagreb indices on carborundum structures. J. Appl. Math. Comput. 66, 263–276 (2021).
Ghorbani, M. & Hosseinzadeh, M. A. Computing abc4 index of nanostar dendrimers. Optoelectron. Adv. Mater. Rapid Commun. 4, 1419–1422 (2010).
Graovac, A., Ghorbani, M. & Hosseinzadeh, M. A. Computing fifth geometric-arithmetic index for nanostar dendrimers. J. Discrete Math. Appl. 1, 33–42 (2011).
Mondal, S., De, N. & Pal, A. On some new neighbourhood degree based indices. Acta Chem. Iasi 27, 31–46 (2019).
Mondal, S., Siddiqui, M. K., De, N. & Pal, A. Neighborhood m-polynomial of crystallographic structures. Biointerface Res. Appl. Chem. 11.
Pizzorno, A. et al. In vitro evaluation of antiviral activity of single and combined repurposable drugs against SARS-COV-2. Antiviral Res. 181, 104878 (2020).
Fan, S. et al. Research progress on repositioning drugs and specific therapeutic drugs for SARS-COV-2. Future Med. Chem. 12, 1565–1578 (2020).
Jang, M. et al. Tea polyphenols EGCG and theaflavin inhibit the activity of SARS-COV-2 3cl-protease in vitro. Evid.-Based Complem. Altern. Med. (2020).
Cicka, D. & Sukhatme, V. P. Available drugs and supplements for rapid deployment for treatment of covid-19. J. Mol. Cell Biol. 13, 232–236 (2021).
Gutman, I. & Trinajstic, N. Graph theory and molecular orbitals: Total pi-electron energy of alternant hydrocarbons. Chem. Phys. Lett. 17, 535–538 (1972).
Miličević, A., Nikolić, S. & Trinajstić, N. On reformulated Zagreb indices. Mol. Divers. 8, 393–399 (2004).
Ranjini, P. S., Lokesha, V. & Usha, A. Relation between phenylene and hexagonal squeeze using harmonic index. Int. J. Graph Theory 1, 116–121 (2013).
Ghorbani, M. & Hosseinzadeh, M. The third version of Zagreb index. Discrete Math. Algorithms Appl. 5, 1350039 (2013).
Furtula, B. & Gutman, I. A forgotten topological index. J. Math. Chem. 53, 1184–1190 (2015).
Randic, M. Characterization of molecular branching. J. Am. Chem. Soc. 97, 6609–6615 (1975).
Favaron, O., Mahéo, M. & Saclé, J. F. Some eigenvalue properties in graphs (conjectures of graffiti-II). Discrete Math. 111, 197–220 (1993).
Vukičević, D. & Gašperov, M. Bond additive modelling 1. Adriatic indices. Croatica Chem. Acta 83, 243–260 (2010).
Fajtlowicz, S. On conjectures of graffiti-II. Congr. Numer. 60, 187–197 (1987).
Furtula, B., Graovac, A. & Vukičević, D. Augmented Zagreb index. J. Math. Chem. 48, 370–380 (2010).
Hosamani, S. M. Computing Sanskruti index of certain nanostructures. J. Appl. Math. Comput. 54, 425–433 (2017).
Cohen, J., Cohen, P., West, S. G. & Aiken, L. S. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (Routledge, 2013).
Devillers, J. Neural Networks in QSAR and Drug Design (Academic Press, 1996).
Johnson, R. A. & Wichern, D. W. Applied Multivariate Statistical Analysis (2002).
Esmaeili, E. & Shafiei, F. QSAR study on the physico-chemical parameters of barbiturates by using topological indices and MLR method. Bulgar. Chem. Commun. 50, 44–49 (2018).
James, G., Witten, D., Hastie, T. & Tibshirani, R. An Introduction to Statistical Learning Vol. 112 (Springer, 2013).

Author information

These authors contributed equally: Ugasini Preetha P, M. Suresh, Fikadu Tesgera Tolasa and Ebenezer Bonyah.

Authors and Affiliations

Department of Mathematics, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, 603203, India
Ugasini Preetha P & M. Suresh
Department of Mathematics, Dambi Dollo University, Oromia, Ethiopia
Fikadu Tesgera Tolasa
Department of Mathematics Education, Akenten Appiah Menka University of Skills Training and Entrepreneurial Development, Kumasi, Ghana
Ebenezer Bonyah

Contributions

M. Suresh introduced the parameter and helped with proofreading. Ugasini Preetha P analyzed, calculated and computed the main results. Fikadu Tesgera Tolasa helped in providing drug properties and in the overall management of the article. Ebenezer Bonyah helped in providing software tools and with the graphical work. Overall, the authors contributed equally to the manuscript.

Corresponding authors

Correspondence to M. Suresh or Fikadu Tesgera Tolasa.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Table S1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

P, U.P., Suresh, M., Tolasa, F.T. et al. QSPR/QSAR study of antiviral drugs modeled as multigraphs by using TI’s and MLR method to treat COVID-19 disease. Sci Rep 14, 13150 (2024). https://doi.org/10.1038/s41598-024-63007-w

Received: 09 April 2024
Accepted: 23 May 2024
Published: 07 June 2024
DOI: https://doi.org/10.1038/s41598-024-63007-w
