NMGMDA: a computational model for predicting potential microbe–drug associations based on minimize matrix nuclear norm and graph attention network

Liang, Mingmin; Liu, Xianzhi; Chen, Qijia; Zeng, Bin; Wang, Lei

doi:10.1038/s41598-023-50793-y

Download PDF

Article
Open access
Published: 05 January 2024

NMGMDA: a computational model for predicting potential microbe–drug associations based on minimize matrix nuclear norm and graph attention network

Mingmin Liang¹,
Xianzhi Liu¹,
Qijia Chen¹,
Bin Zeng¹ &
…
Lei Wang^1,2

Scientific Reports volume 14, Article number: 650 (2024) Cite this article

603 Accesses
Metrics details

Subjects

Abstract

The prediction of potential microbe–drug associations is of great value for drug research and development, especially, methods, based on deep learning, have been achieved significant improvement in bio-medicine. In this manuscript, we proposed a novel computational model named NMGMDA based on the nuclear norm minimization and graph attention network to infer latent microbe–drug associations. Firstly, we created a heterogeneous microbe–drug network in NMGMDA by fusing the drug and microbe similarities with the established drug–microbe associations. After this, by using GAT and NNM to calculate the predict scores. Lastly, we created a fivefold cross validation framework to assess the new model NMGMDA's progressiveness. According to the simulation results, NMGMDA outperforms some of the most advanced methods, with a reliable AUC of 0.9946 on both MDAD and aBioflm databases. Furthermore, case studies on Ciprofloxacin, Moxifoxacin, HIV-1 and Mycobacterium tuberculosis were carried out in order to assess the effectiveness of NMGMDA even more. The experimental results demonstrated that, following the removal of known correlations from the database, 16 and 14 medications as well as 19 and 17 microbes in the top 20 predictions were validated by pertinent literature. This demonstrates the potential of our new model, NMGMDA, to reach acceptable prediction performance.

A novel microbe-drug association prediction model based on stacked autoencoder with multi-head attention mechanism

Article Open access 06 May 2023

Multitask joint strategies of self-supervised representation learning on biomedical networks for drug discovery

Article Open access 24 April 2023

Reusability report: Uncovering associations in biomedical bipartite networks via a bilinear attention network with domain adaptation

Article 04 April 2024

Introduction

Microorganisms are a class of widely dispersed germs that include bacteria, viruses, fungus, and other species that are both helpful and hazardous to humans^1,2. Numerous human organs contain and are covered in human microbes³. In addition to promoting food absorption and maintaining intestinal health by managing the balance of the gut microbiota, they can control the host's mucosal and systemic immune systems^4,5. In the intestinal environment, these bacteria depend on one another and benefit one another. When the microbiota is out of balance, several diseases, include obesity⁶, inflammatory bowel disease⁷, and cancer⁸ can result. Additionally, numerous studies have demonstrated that while utilizing pharmaceuticals to cure diseases, there is a definite influence between bacteria and drugs^1,9,10. Therefore, understanding the relationship between microbes and medications becomes essential for the treatment of disease.

Humans have discovered certain relationships between drugs and microbes through investigations into biology, but because biological experiments demand a substantial amount of human, material, and time resources, their further advancement may be constrained. To address the limitations of biological studies, an increasing number of computational methods have been presented during the past few years due to the rapid development of relevant research tools. These methods aim to anticipate the relationship between drugs and microbes¹¹. In parallel, databases of microbe–drug associations that have undergone experimental validation, such as MDAD¹² and aBioflm¹³, have also been established. For instance, Zhu et al.¹⁴ presented HMDAKATZ, which uses the KATZ measure to identify microbe–drug associations. By integrating a network embedding approach with microbe–drug association prediction, Long et al.¹⁵ introduced the HNERMDA method. To predict probable microbe–drug associations, Ma et al.¹⁶ introduced the generalized Matrix decomposition method WHGMF based on weighted hypergraph learning. To infer new microbe–drug relationships, Yang et al.¹⁷ suggested the multi-core fusion model MKGNN based on Graph Convolutional Network(GCN). A deep neural network-based prediction model for microbe–drug associations called NNAN was created by Zhu et al.¹⁸ A contrastive learning model called SCSMDA was created by Tian et al.¹⁹ to forecast the connection between microbes and drugs. In order to anticipate probable microbe–drug correlations, Tan et al.²⁰ developed a computation technique termed GSAMDA based on the graph attention network and the sparse autoencoder. Yang et al.²¹ suggest a model, called MKGCN, for inferring microbe–drug associations based on Multiple Kernel Fusion on Graph Convolutional Network. Ma et al.²² designed a microbe–drug prediction model based on graph attention network (GAT) and convolutional neural networks (CNN).

As mentioned above, it is easy to know that these neural network-based methods are frequently used in hiding random association prediction works, and among them, CNN-based approaches adopt the method of parameter sharing to effectively prevent overfitting, however, the pooling layer will lose a significant amount of important data during processing. As for the GCN-based approaches, although the non-matrix organized data will be more applicable, however, the scalability and flexibility are still quite limited. As for the GAT-based methods, although the clustering performance of graph neural networks can be significantly improved, but the clustering of higher-order neighborhoods is still a challenging task. Hence, it is clear that better prediction results can be obtained by combining these above prediction methods organically.

In this study, we introduced a novel calculating approach called NMGMDA to predict latent associations between microbes and drugs, which is based on the nuclear norm minimization²³ and the graph attention network²⁴. Figure 1 depicts the NMGMDA structure. These are our primary contributions, in brief:

A novel heterogeneous network made up of microbes and drugs has been created by combining the microbe similarity network, drug similarity network, and existing microbe–drug relationships.
To get projected scores for potential microbe–drug associations, we used both the nuclear norm minimization (NNM) approach and the GAT-based auto-encoder. And then weighted averaged these two predicted scores to get the final results.
Experimental results and case studies demonstrated the significant prediction performance of NMGMDA on both the MDAD and the aBioflm Databases.

Materials and methods

Data sources

In this study, we assessed NMGMDA on the following two databases in order to show its efficacy.

MDAD database is a database of microbe–drug associations that was assembled and arranged by Sun et al.¹² in 2018 from a variety of drug-related databases, including TTD and DrugBank, as well as a substantial body of literature. After superfluous data is eliminated, 1373 drugs and 173 microbes were found to have 2470 microbe–drug associations.

ABiofilm database was created by Rajput et al.¹³, which includes 5027 antifungal drugs that target 140 microbes that were identified between 1988 and 2017. Following the removal of redundant data, 140 microbes and 1720 drugs were included in 2884 microbe–drug associations.

Table 1 provides specific statistics of microbes-drugs associations in the MDAD and aBioflm.

Table 1 The specific statistics of microbes-drugs associations in the MDAD and aBioflm.

Full size table

Methods

Microbe–drug adjacency matrix

We initially create an adjacency matrix $A\in {R}^{{n}_{d}\times {n}_{m}}$, where ${n}_{d}$ and ${n}_{m}$ represent the number of drugs and microbes, respectively, based on these microbe–drug associations. ${A}_{ij}$ equals 1 if there is a known relationship between the drug ${d}_{i}$ and microbe ${m}_{j}$, else it equals 0.

$${A}_{ij}=\left\{\begin{array}{l}1, \,\,if \,\,{d}_{i} \,\,associats \,\,with\,\, {m}_{j}\\ 0, \,\,otherwise\end{array}\right.$$

(1)

Drug/microbe Gaussian kernel similarity

The following formula will be used to determine the Gaussian kernel similarity ${D}_{GIP}\left({d}_{i},{d}_{j}\right)\in {R}^{{n}_{d}\times {n}_{d}}$ between ${d}_{i}$ and ${d}_{j}$, assuming that ${d}_{i}$ and ${d}_{j}$ are two drugs.

$${D}_{GIP}=exp\left(-{{\gamma }_{d}\Vert A\left({d}_{i}\right)-A\left({d}_{j}\right)\Vert }^{2}\right)$$

(2)

where $\Vert A\left({d}_{i}\right)-A\left({d}_{j}\right)\Vert $ is the Euclidean distance between two drugs. Since ${\gamma }_{d}$ is a regular parameter, it is easier to group together similar feature points the greater ${\gamma }_{d}$. And the definition of ${\upgamma }_{{\text{d}}}$ is as follows:

$${\gamma }_{d}=1/\left(\frac{1}{{n}_{d}}\sum_{i=1}^{{n}_{d}}{\Vert A\left({d}_{i}\right)\Vert }^{2}\right)$$

(3)

Similarly, we would calculate the Gaussian kernel similarity ${M}_{GIP}\left({m}_{i},{m}_{j}\right)\in {R}^{{n}_{m}\times {n}_{m}}$ between two microbes:

$${M}_{GIP}=exp\left(-{{\gamma }_{m}\Vert A\left({m}_{i}\right)-A\left({m}_{j}\right)\Vert }^{2}\right)$$

(4)

$${\gamma }_{m}=1/\left(\frac{1}{{n}_{m}}\sum_{i=1}^{{n}_{m}}{\Vert A\left({m}_{i}\right)\Vert }^{2}\right)$$

(5)

Microbe/Drug functional similarity

In the STRING²⁵ database, we can find many gene functional networks connected to microbes. A matrix ${M}_{F}\in {R}^{{n}_{m}\times {n}_{m}}$ can be produced by the Kamneva²⁶ tool, which determines microbe functional similarity based on microbial gene families.

The SIMCOMP2 tool²⁷ uses the chemical and molecular formula structures of drugs to determine how similar their structures are. To create a drug functional similarity matrix ${D}_{F}\in {R}^{{n}_{d}\times {n}_{d}}$, we adopt the similarity scores.

Drug/microbe integrated similarities

It is important to note that not every drug can determine functional similarity. As a result, using the drug structural similarity and the drug Gaussian kernel similarity, we were able to construct a new matrix $D\in {R}^{{n}_{d}\times {n}_{d}}$ of integrated drug similarities.

$$D=\left\{\begin{array}{c}\left({D}_{GIP}+{D}_{F}\right)/2, if {D}_{F}\ne 0\\ {D}_{GIP}, if {D}_{F}=0\end{array}\right.$$

(6)

where ${D}_{GIP}$ is the drug Gaussian kernel similarity, and ${D}_{F}$ is the drug functional similarity.

Similarly, the microbe integrated similarities matrix $M\in {R}^{{n}_{m}\times {n}_{m}}$ was calculated as follows:

$$M=\left\{\begin{array}{c}\left({M}_{GIP}+{M}_{F}\right)/2, if {M}_{F}\ne 0\\ {M}_{GIP}, if {M}_{F}=0\end{array}\right.$$

(7)

where ${M}_{GIP}$ is the drug Gaussian kernel similarity, and ${M}_{F}$ is the drug functional similarity.

Constructing the heterogeneous network ${\varvec{N}}$

The microbe–drug adjacency matrix, drug integrated similarities matrix and microbe integrated similarities matrix can be joined together to form a whole matrix $N\in {R}^{\left({n}_{d}+{n}_{m}\right)\times \left({n}_{d}+{n}_{m}\right)}$:

$$N=\left[\begin{array}{cc}D& A\\ {A}^{T}& M\end{array}\right]$$

(8)

where ${A}^{T}$ represents $A{\prime}$ s transposition.

Predicting microbe–drug associations by NNM

Currently, the convex optimization model includes nuclear norms, which are applied in many fields²⁸. It has a globally optimal solution¹¹. Therefore, the nuclear norm minimization of the heterogeneous network N can be expressed as:

$$\underset{N}{{\text{min}}}{\Vert N\Vert }_{*} \quad s.t. {N}_{ij}={A}_{ij}, \left(i,j\right)\in \Omega $$

(9)

where ${\Vert N\Vert }_{*}$ represents the nuclear norm of $N$, $\Omega $ is a set of known positions of elements.

We need to add restrictions to the model to make sure that the unknown elements fall within the range [0,1] since predicted scores for microbe–drug associations should be between [0,1]. This forecasting method is:

$$\underset{N}{{\text{min}}}{\Vert N\Vert }_{*} s.t. {\Vert {p}_{\Omega }\left(N\right)-{p}_{\Omega }\left(A\right)\Vert }_{F}\le \varepsilon $$

(10)

$${\left({p}_{\Omega }\left(N\right)\right)}_{ij}=\left\{\begin{array}{c}{N}_{ij}, if\left(i,j\right)\in \Omega \\ 0, otherwise\end{array}\right.$$

(11)

They are $\varepsilon $, which stands for measurement noise, ${\Vert \cdot \Vert }_{F}$, which stands for the Frobenius norm, and ${p}_{\Omega }$, which stands for the orthogonal mapping acting on $\Omega $. Then substituting regularized models for inequality constrained models:

$$\underset{N}{{\text{min}}}{\Vert N\Vert }_{*}+\frac{\alpha }{2}{\Vert {P}_{\Omega }\left(N\right)-{p}_{\Omega }\left(A\right)\Vert }_{F}^{2} \quad s.t. 0<{N}_{ij}<1$$

(12)

where $\mathrm{\alpha }$ is a variable that is learnable. The model can be optimized in the manner shown below by introducing the auxiliary matrix X, which was inspired by literature²⁹:

$$\underset{N}{{\text{min}}}{\Vert N\Vert }_{*}+\frac{\alpha }{2}{\Vert {P}_{\Omega }\left(X\right)-{p}_{\Omega }\left(A\right)\Vert }_{F}^{2} \quad s.t. N=X, 0<{X}_{ij}<1$$

(13)

Then, minimize the enhanced Lagrange function to solve the problem:

$$\zeta \left(X,N,Y,\alpha ,\beta \right)={\Vert N\Vert }_{*}+\frac{\alpha }{2}{\Vert {P}_{\Omega }\left(X\right)-{p}_{\Omega }\left(A\right)\Vert }_{F}^{2}+{T}_{\gamma }\left({Y}^{T}\left(N-X\right)\right)+\frac{\beta }{2}{\Vert N-X\Vert }_{F}^{2}$$

(14)

where $Y$ is the Lagrange multiplier and $\beta >0$ is the penalty factor.

Following that, implement iterative solution. The matrix ${X}_{k+1}$ must first be calculated:

$${X}_{k+1}=arg\underset{0\le X\le 1}{{\text{min}}}\zeta \left(X,{N}_{k},{Y}_{k},\alpha ,\beta \right)$$

$$=arg\underset{0\le X\le 1}{{\text{min}}}\frac{\alpha }{2}{\Vert {P}_{\Omega }\left(X\right)-{p}_{\Omega }\left(A\right)\Vert }_{F}^{2}+{T}_{\gamma }\left({Y}^{T}\left(N-X\right)\right)+\frac{\beta }{2}{\Vert N-X\Vert }_{F}^{2}$$

(15)

The best answer to the Eq. (15) for $arg\underset{0\le X\le 1}{{\text{min}}}\zeta \left(X,{N}_{k},{Y}_{k},\alpha ,\beta \right)$ is ${X}^{*}$. Think of restrictions for the interval [0,1] that:

$${\left({X}_{k+1}\right)}_{ij}=\left\{\begin{array}{c}1, {X}_{ij}^{*}>1\\ {X}_{ij}^{*}, 0\le {X}_{ij}^{*}\le 1\\ 0, {X}_{ij}^{*}<0\end{array}\right\}$$

(16)

Update the matrix ${N}_{k+1}$ and correct other variables:

$${N}_{k+1}=arg\underset{N}{{\text{min}}}\zeta \left({X}_{k+1},N,{Y}_{k},\alpha ,\beta \right)$$

$$=arg\underset{N}{{\text{min}}}{\Vert N\Vert }_{*}+\frac{\beta }{2}{\Vert N-\left({X}_{k+1}-\frac{{Y}_{k}}{\beta }\right)\Vert }^{2}={\vartheta }_{\frac{1}{\beta }}\left({X}_{k+1}-\frac{{Y}_{k}}{\beta }\right)$$

(17)

$${\vartheta }_{\tau }(x)={\sum }_{i=1}^{{\theta }_{i}\ge \tau }\left({\theta }_{i}-\tau \right){\mu }_{i}{\nu }_{i}^{T}$$

(18)

where ${\vartheta }_{\tau }(x)$ is singular value contraction operator, ${\theta }_{i}$ is the singular values of $X$ which is larger than $\tau $, while ${\mu }_{i}$ and ${\nu }_{i}$ are the left and right singular vectors corresponding to ${\theta }_{i}$.

We can update the Lagrange multiplier ${Y}_{k+1}$ as follows by adjusting other variables:

$${Y}_{k+1}={Y}_{k}+\beta \left({N}_{k+1}-{X}_{k+1}\right)$$

(19)

Finally, the following information can be found in the prediction matrix ${A}_{1}$ for microbe–drug associations:

$${X}_{k}\to \left[\begin{array}{cc}{D}_{1}& {A}_{1}\\ {A}_{1}^{T}& {M}_{1}\end{array}\right]\to {A}_{1}$$

(20)

Predicting latent microbe–drug associations by GAT

With the introduction of an attention-based design, the graph spatial network GAT performs node categorization for graph-structured data²⁴. To determine the matrix $N$'s structure, we created a GAT model. First determines the attention score between any two nodes in the matrix $N$:

$${\alpha }_{ij}=\frac{exp\left({e}_{ij}\right)}{{\sum }_{k\in {N}_{i}}exp\left({e}_{ik}\right)}$$

(21)

$${e}_{ij}=LeakyRelu\left(a\left[W{h}_{i}||W{h}_{j}\right]\right)$$

(22)

$$LeakyRelu\left(x\right)=\left\{\begin{array}{l}x, \quad x>0\\ \mu x, \quad otherwise\end{array}\right.$$

(23)

where ${N}_{i}$ stands for the total number of nodes,$a$ is an attention coefficient, $W$ is a learnable linear transformation, and ${h}_{i}$ represents the feature vector of the node $i$, $\mu $ is the hypermeter and $||$ denotes the concatenation.

Consequently, each node's ultimate output feature is:

$${h}_{i}=relu\left(\sum_{j\in {N}_{i}}{\alpha }_{ij}W{h}_{j}\right)$$

(24)

The activation function, $relu$, is defined as follows:

$$relu\left(x\right)=\left\{\begin{array}{l}x, \quad x>0\\ 0, \quad otherwise\end{array}\right.$$

(25)

A low dimensional structural matrix $X=\left[\begin{array}{c}{X}_{d}\\ {X}_{m}\end{array}\right]\in {R}^{\left({n}_{d}+{n}_{m}\right)\times l}$ is produced by substituting $N$ into the previously mentioned GAT model, where ${X}_{d}$ and ${X}_{m}$, respectively, stand in for the drug nodes and microbial nodes in $N$. After a number of testing, we ultimately decided on MSE loss as the loss function for optimizing our model.

An improved random walk with restart (RWR) is implemented on D in response to literature²⁰, allowing us to obtain a new matrix. Below is how the RWR was described:

$${s}_{i}^{l+1}=\lambda X{s}_{i}^{l}+\left(1-\lambda \right){\varepsilon }_{i}$$

(26)

$${\varepsilon }_{ij}=\left\{\begin{array}{l}1, \,\,if\, i=j\\ 0, \,\,otherwise\end{array}\right.$$

(27)

where ${\varepsilon }_{i}$ is the initial probability vector, $X$ is the matrix of transition probabilities, and $\uplambda $ is the restart probability. Similar to that, we might produce a novel matrix ${M}_{Z}$ by using the enhanced RWR on $M$.

As a result, by combining the drug matrix ${X}_{d}$, ${D}_{F}$, ${D}_{Z}$ and adjacency matrix $A$, influenced by literature²², we could create a new drug feature matrix ${Z}_{d}$ that looked like this:

$${Z}_{d}=\left[{X}_{d},{D}_{F},A,{D}_{Z},A\right]$$

(28)

Similarly, we could create the following new microbe feature matrix:

$${Z}_{m}=\left[{X}_{m},{A}^{T},{M}_{F},{{A}^{T},M}_{Z}\right]$$

(29)

Finally, we employ dot product to derive a microbe–drug association predictive score ${A}_{2}$:

$${A}_{2}=swich\left({Z}_{d}\left({d}_{i}\right)\cdot{{Z}_{m}\left({m}_{j}\right)}^{T}\right)$$

(30)

$$swich=xSigmoid\left(\beta x\right)$$

(31)

where $swich$ is an activation function, $\beta $, a learnable parameter, which is typically set it to 1, ${Z}_{d}\left({d}_{i}\right)$ indicates the ${i}_{th}$ row of ${Z}_{d}$ and ${Z}_{m}\left({m}_{j}\right)$ represents the ${j}_{th}$ row of ${Z}_{m}$.

Final predicted score of microbe–drug associations

The weighted arithmetic mean approach can be used to combine the prediction matrix ${A}_{1}$ acquired through NNM and the prediction matrix ${A}_{2}$ generated through GAT, resulting in the following final forecast matrix ${A}^{*}$ of microbe–drug associations:

$${A}^{*}=\lambda {A}_{1}+\left(1-\lambda \right){A}_{2}$$

(32)

where $\lambda $ is the weight value.

Experiments and results

In this section, we first carried out sensitive parameter analysis to get the optimum performance out of the model. Then, six state-of-the-art methods would be picked to contrast with NMGMDA. Finally, in order to confirm the validity of our model, we have chosen two typical microbes and drugs, respectively.

Parameter sensitivity analysis

Three pieces make up the NMGMDA model. $\alpha $ and $\beta $ in formula (14) are two crucial parameters in NNM. Dimension $l$ and learning rate ${l}_{r}$ are the two most important factors in GAT.The weight value $\lambda $ is an important parameter in the final prediction formula (32). In this section, to find the appropriate settings and ensure the independence of the training sets and test sets, we initially Randomly picked 20% of the associations are known and 20% are unknown for the training sets, with the remaining sets being test sets. Next, we utilized fivefold CV experiments with the MDAD database and ensure each of the experiments is independent.

In NNM, we decided to conduct joint tests and altered $\alpha $ and $\beta $ from $\left\{\mathrm{0.1,1},\mathrm{10,100,1000}\right\}$ and conduct joint experiments.Then, using a fivefold CV experiment, we determined the area under curve (AUC) and the area under the precision-recall curve (AUPR) of these parameter combinations. The findings are displayed in Table 2. Table 2 shows that the AUC and AUPR outcomes obtained by NMGMDA are both at their best when $\alpha $ and $\beta $ have values of 100 and 1, respectively.

Table 2 The AUC and AUPR values on different $\alpha $ and $\beta $ on MDAD database.

Full size table

In GAT, we decided to adjust the dimension $l$ changed from $\left\{\mathrm{16,32,64,128}\right\}$ and the learning rate ${l}_{r}$ changed from $\left\{\mathrm{0.1,0.05,0.01,0.005,0.001}\right\}$.

Figure 2 makes it clear that no substantial changes to the outcome were caused by changing any particular factors. We choose 32 as the dimension of node topological representation $l$ since it has a little better AUPR value than 64 or 128 dimensions. In line with typical learning models, the learning rate ${l}_{r}$ was set at 0.01.

Finally, the results are displayed in Fig. 3 for parameter $\lambda $ in formula (32), where we estimate the impact of the $\lambda $ altered from $\left\{\mathrm{0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9}\right\}$ for the fivefold on MDAD. Which makes it clear that NMGMDA, with $\lambda $ set to 0.7, may get the maximum AUC and AUPR values.

After comparing the performance on different hyperparameters by testing, the final parameters we selected are $\alpha $ = 100, $\beta $ = 1, $l$ = 32, ${l}_{r}$ = 0.01 and $\lambda $ = 0.7.

Comparison with advanced methods

In this case, taking into account the dearth of microbial drug association prediction methods, we would first contrast NMGMDA with a few standard approaches for link prediction issues, such as HMDAKATZ¹⁴, HMDA-Pred³⁰, LAGCN³¹, MNNMAD³² and GSAMDA²⁰, etc.

Here, considering the limited availability of microbial drug association prediction methods, we would first compare NMGMDA with some representative methods for link prediction problems such as HMDAKATZ¹⁴, HMDA-Pred³⁰, LAGCN³¹, MNNMAD³² and GSAMDA²⁰, etc. One of them, HMDAKATZ, predicted the association between microbes and drugs using the KATZ algorithm as a foundation. For the prediction of microbe–disease associations, HMDA-Pred is a novel computer model based on multi-data integration and network consistency projection. LAGCN is a complete end-to-end graph based deep learning method, which forecast the associations between drugs and diseases. By using a Matrix Nuclear Norm approach on data on known microbes and diseases, MNNMAD is a method for predicting microbe–disease relationships. Based on a graph attention network and sparse auto-encoder, GSAMDA offered a unique computer model for forecasting probable microbe–drug interactions.

We tested these techniques using their default settings and compared them using the fivefold CV experiment. AUC and AUPR values are used as indicators to evaluate the performance of NMGMDA, and the database we utilize is MDAD and aBioflm. The outcome was displayed in Table 3 and Fig. 4. Our suggested NMGMDA model has the greatest prediction performance of all the methods.

Table 3 The AUCs and AUPRs of compared methods based on databases MDAD and aBioflm under fivefold CV.

Full size table

Case study

To test the NMGMDA model's real prediction power, we chose two well-known drugs—Ciprofloxacin and Moxifloxaxin—as well as two common microbes—Human immunodeficiency virus type 1 and Mycobacterium tuberculosis—for case studies.

Ciprofloxacin is an organic molecule with excellent bactericidal effect and broad-spectrum antibacterial activity³³. It has shown to be a successful treatment for both acute and chronic urinary tract infections, as well as a variety of systemic infections³⁴. Staphylococcus aureus³⁵, Haemophilus influenzae³⁶ and Stenotrophomonas maltophilia³⁷ are all susceptible to its antibacterial properties. Based on the predicted score, ranked the Ciprofloxacin-related microbes scores from highest to lowest, and chose the top 20 microbes for validation after deleting the 10 associations that are currently on MDAD database. As indicated in Table 4, 19 of the top 20 anticipated microbes connected to Ciprofloxacin have been verified by published research in PubMed. Moreover, Moxifloxacin³⁹ belongs to the quinolone drugs class, which mostly used to treat infections of the skin and soft tissues in adults as well as upper and lower respiratory tract infections^38,39. According to the literature⁴⁰, Moxifloxacin is an effective treatment for Stenotrophomonas maltophilia keratitis. As indicated in Tables 5, after removing the 4 known associations on MDAD database, we discovered 17 microbes that had been verified by PubMed literature among the top 20 predicted microbes associated with moxifloxacin.

Table 4 The top 20 Ciprofloxacin associated candidate microbes on MDAD.

Full size table

Table 5 The top 20 Moxifoxacin associated candidate microbes on MDAD.

Full size table

Regarding microbes, the first microbe is Human immunodeficiency virus type 1 (HIV-1), which is a virus capable of attacking the immune system in humans, and causes AIDS, an extremely dangerous infectious illness⁴¹. HIV-1 has been widely studied in relation to various medicines. Saquinavir, for instance, has been shown to be an effective treatment for HIV-1-infected individuals who have diarrhea and/or wasting syndrome by Hervé Trout⁴². According to literature⁴³, the first-line protease inhibitor that is generally suggested in the initial treatment regimen for people with HIV-1 infection is lopinavir/ritonavir. After removing the 26 known associations on MDAD database, we discovered 16 (see Table 6) drugs that had been validated by PubMed literatures among the top 20 anticipated microbes associated with Human immunodeficiency virus type 1. Mycobacterium tuberculosis is the second microbes used in the case study. Mycobacterium tuberculosis is the pathogen that causes tuberculosis⁴⁴, and many microbes, including ciprofloxacin⁴⁵ and triclosan⁴⁶, have been shown to be associated with it. After removing the 14 known associations on MDAD database, Table 7 indicates that of the top 20 candidate drugs, 14 were linked to Mycobacterium tuberculosis.

Table 6 The top 20 Human immunodeficiency virus type 1 associated candidate drugs on MDAD.

Full size table

Table 7 The top 20 Mycobacterium tuberculosis associated candidate drugs on MDAD.

Full size table

In conclusion, these two sets of case studies further demonstrate how the NMGMDA model may anticipate the association between microbes and drugs.

Discussion and conclusion

The association between drugs and microbes has enormous significance for the treatment of diseases, according to biomedical studies. Consequently, a powerful computational prediction model could help researchers find additional microbe–drug associations and improve illness therapy.

By combining the NNM and GAT frameworks, we suggested a unique model in this study called NMGNAD to forecast potential microbe–drug associations. In NMGNAD, we first combined the drug similarity network, the known microbe–drug associations, and the similarity and association information between nodes to create a new microbe–drug heterogeneous network. The correlation scores between microbes and drugs were then predicted using the NNM model and the GAT model. In order to get the forecast results, we weighted average these two anticipated scores. According to experimental findings, NMGMAD outperformed state-of-the-art methods and produced acceptable case study outcomes.

Although NMGMDA can produce good prediction performance, there are still certain restrictions. First off, a few drug names in both databases are not accessible now, and the fact that they are no longer being updated will reduce the number of known connections that are available and have an impact on how the model is used in practice. Thus, we might think about creating a microbe–drug database that is more extensive. Then, to increase the precision of model predictions, we can think about adding more biological data to enhance the characteristics of drugs and microbes, such as data on drug side effects, data on the relationship between germs and diseases, and data on the association between drugs and diseases.

Data availability

The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding authors.

References

Huttenhower, C., Gevers, D., Knight, R., Badger, J. H., Human Microbiome Project Consortium. Structure, function and diversity of the healthy human microbiome. Nature 486, 207–214. https://doi.org/10.1038/nature11234 (2012).
Article ADS CAS Google Scholar
Liang, C., Changlu, Q., He, Z., Tongze, F. & Xue, Z. gutMDisorder: A comprehensive database for dysbiosis of the gut microbiota in disorders and interventions. Nucleic Acids Res. 48, 554–560. https://doi.org/10.1093/nar/gkz843 (2020).
Article Google Scholar
Gill, S. R. et al. Metagenomic analysis of the human distal gut microbiome. Science 312, 1355–1359. https://doi.org/10.1126/science.1124234 (2006).
Article ADS CAS PubMed PubMed Central Google Scholar
Ventura, M. et al. Genome-scale analyses of health-promoting bacteria: Probiogenomics. Nat. Rev. Microbiol. 7, 61–71. https://doi.org/10.1038/nrmicro2047 (2009).
Article CAS PubMed Google Scholar
Sommer, F. & Bäckhed, F. The gut microbiota-masters of host development and physiology. Nat. Rev. Microbiol. 11, 227–238. https://doi.org/10.1038/nrmicro2974 (2013).
Article CAS PubMed Google Scholar
Ley, R. E., Turnbaugh, P. J., Klein, S. & Gordon, J. I. Human gut microbes associated with obesity. Nature 444, 1022–1023. https://doi.org/10.1038/4441022a (2006).
Article ADS CAS PubMed Google Scholar
Durack, J. & Lynch, S. V. The gut microbiome: Relationships with disease and opportunities for therapy. Rockefeller Univ. Press 216, 20–40. https://doi.org/10.1084/jem.20180448 (2019).
Article CAS Google Scholar
Schwabe, R. F. & Jobin, C. The microbiome and cancer. Nat. Rev. Cancer 13, 800–812. https://doi.org/10.1038/nrc3610 (2013).
Article CAS PubMed PubMed Central Google Scholar
McCoubrey, L. E., Gaisford, S., Orlu, M. & Basit, A. W. Predicting drug-microbiome interactions with machine learning. Biotechnol. Adv. 54, 107797. https://doi.org/10.1016/j.biotechadv.2021.107797 (2022).
Article CAS PubMed Google Scholar
Zimmermann, M., Zimmermann-Kogadeeva, M., Wegmann, R. & Goodman, A. L. Mapping human microbiome drug metabolism by gut bacteria and their genes. Nature 570, 462–467. https://doi.org/10.1038/s41586-019-1291-3 (2019).
Article CAS PubMed PubMed Central Google Scholar
Wang, L., Tan, Y., Yang, X., Kuang, L. & Ping, P. Y. Review on predicting pairwise relationships between human microbes, drugs and diseases: From biological data to computational models. Brief. Bioinform. 23, 1–25. https://doi.org/10.1093/bib/bbac080 (2022).
Article Google Scholar
Sun, Y. Z. et al. MDAD: A special resource for microbe–drug associations. Front. Cell. Infect. Microbiol. 8, 424. https://doi.org/10.3389/fcimb.2018.00424 (2018).
Article CAS PubMed PubMed Central Google Scholar
Akanksha, R., Anamika, T., Shivangi, S. & Manoj, K. aBioflm: A resource of anti-bioflm agents and their potential implications in targeting antibiotic drug resistance. Nucleic Acids Res. 46, 894–900. https://doi.org/10.1093/nar/gkx1157 (2018).
Article CAS Google Scholar
Zhu, L., Duan, G., Yan, C. & Wang, J . Prediction of microbe–drug associations based on KATZ measure. In 2019 IEEE International Conference on Bioinformatics and Biomedicine. https://doi.org/10.1109/BIBM47256.2019.8983209 (2019).
Long, Y. & Luo, J. Association mining to identify microbe drug interactions based on heterogeneous network embedding representation. IEEE J. Biomed. Health Inform. 25, 266–275. https://doi.org/10.1109/jbhi.2020.2998906 (2021).
Article PubMed Google Scholar
Ma, Y. & Liu, Q. Generalized matrix factorization based on weighted hypergraph learning for microbe–drug association prediction. Comput. Biol. Med. 145, 105503. https://doi.org/10.1016/j.compbiomed.2022.105503 (2022).
Article PubMed Google Scholar
Yang, H., Ding, Y., Tang, J. & Gao, F. Inferring human microbe–drug associations via multiple kernel fusion on graph neural network. Knowl. Based Syst. 238, 107888. https://doi.org/10.1016/j.knosys.2021.107888 (2022).
Article Google Scholar
Zhu, B. et al. NNAN: Nearest neighbor attention network to predict drug–microbe associations. Front. Microbiol. 13, 846915. https://doi.org/10.3389/fmicb.2022.846915 (2022).
Article PubMed PubMed Central Google Scholar
Tian, Z., Yu, Y., Fang, H., Xie, W. & Guo, M. Predicting microbe–drug associations with structure-enhanced contrastive learning and self-paced negative sampling strategy. Brief. Bioinform. 24, 634. https://doi.org/10.1093/bib/bbac634 (2023).
Article Google Scholar
Tan, Y. et al. GSAMDA: A computational model for predicting potential microbe–drug associations based on graph attention network and sparse autoencoder. BMC Bioinform. 23, 492. https://doi.org/10.1186/s12859-022-05053-7 (2022).
Article Google Scholar
Yang, H., Ding, Y., Tang, J. & Guo, F. Inferring human microbe–drug associations via multiple kernel fusion on graph neural network. Knowl.-Based Syst. 28, 107888. https://doi.org/10.1016/j.knosys.2021.107888 (2022).
Article Google Scholar
Ma, Q., Tan, Y. & Wang, L. GACNNMDA: A computational model for predicting potential human microbe–drug associations based on graph attention network and CNN-based classifier. BMC Bioinform. 24, 35. https://doi.org/10.1186/s12859-023-05158-7 (2023).
Article Google Scholar
Yoo, S., Wang, Z. & Seo, J. Adaptive weighted nuclear norm minimization for removing speckle noise from optical coherence tomography images. In 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). https://doi.org/10.1109/embc.2019.8857208 (2019).
Veličković, P., et al. Graph attention networks. arXiv:1710.10903. https://doi.org/10.48550/arXiv.1710.10903 (2017).
Szklarczyk, D. et al. STRING v11: Protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, 607–613. https://doi.org/10.1093/nar/gky1131 (2019).
Article CAS Google Scholar
Kamneva, O. K. Genome composition and phylogeny of microbes predict their co-occurrence in the environment. PLoS Comput. Biol. 13, 1005366. https://doi.org/10.1371/journal.pcbi.1005366 (2017).
Article ADS CAS Google Scholar
Hattori, M., Tanaka, N., Kanehisa, M. & Goto, S. SIMCOMP/SUBCOMP: Chemical structure search servers for network analyses. Nucleic Acids Res. 38, 652–656. https://doi.org/10.1093/nar/gkq367 (2010).
Article CAS Google Scholar
Yang, M., Luo, H., Li, Y. & Wang, J. Drug repositioning based on bounded nuclear norm regularization. Bioinformatics 35, 455–463. https://doi.org/10.1093/bioinformatics/btz331 (2019).
Article CAS Google Scholar
Candès, E. & Recht, B. Simple bounds for recovering low-complexity models. Math. Program. 141, 577–589. https://doi.org/10.1007/s10107-012-0540-0 (2012).
Article MathSciNet Google Scholar
Fan, Y., Chen, M., Zhu, Q. & Wang, W. Inferring disease-associated microbes based on multi-data integration and network consistency projection. Front. Bioeng. Biotechnol. 8, 831. https://doi.org/10.3389/fbioe.2020.00831 (2020).
Article PubMed PubMed Central Google Scholar
Yu, Z., Huang, F., Zhao, X., Xiao, W. & Zhang, W. Predicting drug-disease associations through layer attention graph convolutional network. Brief. Bioinform. 22, bbaa243. https://doi.org/10.1093/bib/bbaa243 (2021).
Article CAS PubMed Google Scholar
Haiyan, L. et al. MNNMDA: Predicting human microbe–disease association via a method to minimize matrix nuclear norm. Comput. Struct. Biotechnol. J. 21, 1414–1423. https://doi.org/10.1016/j.csbj.2022.12.053 (2023).
Article CAS Google Scholar
Terp, D. K. & Rybak, M. J. Ciprofoxacin. Drug Intell. Clin. Pharm. 35, 373–447. https://doi.org/10.1177/1060028087021007-801 (1988).
Article Google Scholar
Campoli-Richards, D. M. et al. Ciprofloxacin. A review of its antibacterial activity, pharmacokinetic properties and therapeutic use. J. Drugs 35, 373–447. https://doi.org/10.2165/00003495-198835040-00003 (1988).
Article CAS Google Scholar
Warraich, A. A. et al. Evaluation of anti-biofilm activity of acidic amino acids and synergy with ciprofloxacin on Staphylococcus aureus biofilms. J. Sci. Rep. 10, 9021. https://doi.org/10.1038/s41598-020-66082-x (2020).
Article ADS CAS Google Scholar
Kosikowska, U., Andrzejczuk, S., Plech, T. & Malm, A. Inhibitory effect of 1,2,4-triazole-ciprofloxacin hybrids on Haemophilus parainfluenzae and Haemophilus influenzae biofilm formation invitro under stationary conditions. Res. Microbiol. 167, 647–654. https://doi.org/10.1016/j.resmic.2016.05.009 (2016).
Article CAS PubMed Google Scholar
Ba, B. B. et al. Activities of ciprofloxacin and moxifloxacin against Stenotrophomonas maltophilia and emergence of resistant mutants in an in vitro pharmacokinetic-pharmacodynamic model. Antimicrob. Agents Chemother. 48, 946–953. https://doi.org/10.1128/aac.48.3.946-953.2004 (2004).
Article CAS PubMed PubMed Central Google Scholar
Balfour, J. A. & Wiseman, L. R. Moxifoxacin. Drugs 59, 115–139. https://doi.org/10.2165/00003495-199957030-00007 (1999).
Article Google Scholar
Jacobsen, F. et al. Efficacy of topically delivered moxifloxacin against wound infection by Pseudomonas aeruginosa and methicillin-resistant Staphylococcus aureus. Antimicrob. Agents Chemother. 55, 2325–2334. https://doi.org/10.1128/aac.01071-10 (2011).
Article CAS PubMed PubMed Central Google Scholar
Felix, G. et al. Efficacy of antibiotic treatment of implant-associated Staphylococcus aureus infections with moxifloxacin, flucloxacillin, rifampin, and combination therapy: An animal study. Drug Design Dev. Ther. 11, 1729–1736. https://doi.org/10.2147/dddt.s138888 (2017).
Article CAS Google Scholar
Spector, S. A. Human immunodefciency virus type-1. Ref. Module Biomed. Sci. 11, 1–12. https://doi.org/10.1016/B978-0-12-801238-3.00088-X (2014).
Article Google Scholar
Hervé, T. et al. Enhanced saquinavir exposure in human immunodeficiency virus Type 1-infected patients with diarrhea and/or wasting syndrome. Antimicrob. Agents Chemother. 48, 538–545. https://doi.org/10.1128/aac.48.2.538-545.2004 (2004).
Article Google Scholar
Kaplan, S. S. & Hicks, C. B. Safety and antiviral activity of lopinavir/ritonavir-based therapy in human immunodeficiency virus type 1 (HIV-1) infection. J. Antimicrob. Chemother. 56, 273–276. https://doi.org/10.1093/jac/dki209 (2005).
Article CAS PubMed Google Scholar
Koch, A. & Mizrahi, V. Mycobacterium tuberculosis. Trends Microbiol. 26, 555–556. https://doi.org/10.1016/j.tim.2018.02.012 (2018).
Article CAS PubMed Google Scholar
Chen, C. H., Shih, J. F., Lindholm-Levy, P. J. & Heifets, L. B. Minimal inhibitory concentrations of rifabutin, ciprofloxacin, and ofloxacin against Mycobacterium tuberculosis isolated before treatment of patients in Taiwan. Am. Rev. Respir. Dis. 140, 987–989. https://doi.org/10.1164/ajrccm/140.4.987 (1989).
Article CAS PubMed Google Scholar
Parikh, S. L., Xiao, G. & Tonge, P. J. Inhibition of InhA, the enoyl reductase from Mycobacterium tuberculosis, by triclosan and isoniazid. Biochemistry 39, 7645–7650. https://doi.org/10.1021/bi0008940 (2000).
Article CAS PubMed Google Scholar

Download references

Acknowledgements

The authors thank the referees for suggestions that helped improve the paper substantially.

Funding

This work was partly sponsored by the National Natural Science Foundation of China (No. 62272064) and the Natural Science Foundation of Hunan Province (No. 2023JJ60185).

Author information

Authors and Affiliations

School of Information Engineering, Hunan Vocational College of Electronic and Technology, Changsha, 410000, China
Mingmin Liang, Xianzhi Liu, Qijia Chen, Bin Zeng & Lei Wang
Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
Lei Wang

Authors

Mingmin Liang
View author publications
You can also search for this author in PubMed Google Scholar
Xianzhi Liu
View author publications
You can also search for this author in PubMed Google Scholar
Qijia Chen
View author publications
You can also search for this author in PubMed Google Scholar
Bin Zeng
View author publications
You can also search for this author in PubMed Google Scholar
Lei Wang
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

M.L., X.L. and L.W. produced the main ideas, and did the modeling, computation and analysis and also wrote the manuscript. Q.C., L.W. and B.Z. provided supervision and effective scientific advice and related ideas, research design guidance, and added value to the article through editing and contributing completions. All authors contributed to the article and approved the submitted version.

Corresponding authors

Correspondence to Qijia Chen, Bin Zeng or Lei Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Liang, M., Liu, X., Chen, Q. et al. NMGMDA: a computational model for predicting potential microbe–drug associations based on minimize matrix nuclear norm and graph attention network. Sci Rep 14, 650 (2024). https://doi.org/10.1038/s41598-023-50793-y

Download citation

Received: 18 September 2023
Accepted: 26 December 2023
Published: 05 January 2024
DOI: https://doi.org/10.1038/s41598-023-50793-y

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

A novel microbe-drug association prediction model based on stacked autoencoder with multi-head attention mechanism

Multitask joint strategies of self-supervised representation learning on biomedical networks for drug discovery

Reusability report: Uncovering associations in biomedical bipartite networks via a bilinear attention network with domain adaptation

Introduction

Materials and methods

Data sources

Methods

Microbe–drug adjacency matrix

Drug/microbe Gaussian kernel similarity

Microbe/Drug functional similarity

Drug/microbe integrated similarities

Constructing the heterogeneous network \({\varvec{N}}\)

Predicting microbe–drug associations by NNM

Predicting latent microbe–drug associations by GAT

Final predicted score of microbe–drug associations

Experiments and results

Parameter sensitivity analysis

Comparison with advanced methods

Case study

Discussion and conclusion

Data availability

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Additional information

Publisher's note

Rights and permissions

About this article

Cite this article

Share this article

Comments

Search

Quick links