Exploring the mechanism of Yixinyin for myocardial infarction by weighted co-expression network and molecular docking

Yixinyin, a traditional Chinese medicine prescription, replenishes righteous qi and promotes blood circulation to eliminate blood stasis. It is often used to treat patients with acute myocardial infarction (MI). The purpose of this study was to explore the key components and targets of Yixinyin in the treatment of MI. We analyzed gene expression data and clinical information from 248 samples of MI patients in the GSE34198, GSE29111 and GSE66360 data sets. By constructing a weighted gene co-expression network, we obtained gene modules related to MI and mapped them onto the protein-protein interaction (PPI) network of Yixinyin. By integrating the differentially expressed genes of the healthy/MI and unstable angina/MI comparisons, we screened the key targets of Yixinyin for the treatment of MI. We validated the key targets with external data sets. Gene set enrichment analysis (GSEA) was used to identify the biological processes in which the key targets are involved. Through molecular docking, we screened the components of Yixinyin that can bind to the key targets. The key targets of Yixinyin in the treatment of MI are ALDH2, C5AR1, FOS, IL1B, TLR2 and TXNRD1. External data sets showed that their expression differs between healthy and MI samples (P < 0.05). GSEA revealed that they are mainly involved in pathways associated with MI, such as viral myocarditis, the VEGF signaling pathway and type I diabetes mellitus. The docking results showed that the components of Yixinyin that can bind to the key targets are Supraene, prostaglandin B1, isomucronulatol-7,2′-di-O-glucosiole, angusifolin B, linolenic acid ethyl ester and Mandenol. They may therefore be active ingredients of Yixinyin in treating MI. These findings provide a basis for preliminary research on MI therapy in traditional Chinese medicine and offer ideas for the design of related drugs.

Myocardial infarction (MI) is one of the most common diseases in clinical practice, and its incidence has increased significantly in recent years 1. Plaque rupture and subsequent thrombosis are the main causes of symptoms 2. Factors that contribute to plaque vulnerability include the size of the lipid core, the thickness and cell structure of the fibrous cap, and the severity of the inflammatory response. Excessive inflammation may aggravate cardiac remodeling and cause heart failure 3. The arterial wall response to injury can be induced by a variety of mechanisms, including infection, reactive oxygen species (ROS), oxidized low-density lipoprotein, and hemodynamic shear stress. In addition, smoking, diabetes, hypercholesterolemia and hypertension are recognized risk factors for the occurrence and development of MI 4. Clinical methods such as thrombolytic drugs and coronary stent implantation are commonly used to treat MI; they can relieve symptoms to a certain extent but cannot repair necrotic myocardial tissue 5.
Traditional Chinese medicine theory holds that Qi deficiency and blood stasis underlie the pathogenesis of MI. Therefore, treatment should be based on regulating healthy Qi and promoting blood circulation. As research on traditional Chinese medicine has deepened, more and more prescriptions have been used to treat patients with MI 6. Yixinyin promotes blood circulation and removes blood stasis, and it is used clinically in the treatment of patients with MI 7. Studies have shown that after patients are treated with Yixinyin, the plasma hsCRP, Fib and D-dimer levels are lower than those of the control group,  Table S1 and Fig. 2B. We identified outliers based on the sample clustering distance. The cutHeight was set to 85,000 to eliminate three obvious outliers (GSM843945, GSM843912, GSM843889) (Fig. 2A). Subsequently, the remaining 94 samples were background-corrected, standardized and aggregated to obtain a gene expression matrix. The soft threshold should be the smallest integer at which the scale-free fit coefficient R² reaches 0.9, so that the constructed gene co-expression network conforms to the characteristics of a scale-free network. We chose β = 12 as the soft threshold for this study (Fig. 2C). The dynamic tree cutting algorithm and the hclust function were used to merge highly similar modules, which yielded a clustering dendrogram (Fig. 2D). A total of 21 gene modules were identified in this study. The gray module contains the genes that could not be assigned to any other module. The correlation between these modules and the phenotypes (gender, age, BMI, SBP, DBP, diabetes status, and so on) was calculated from the eigengene (characteristic vector) of each module, as shown in Fig. 2E. The heat map shows that the black, midnightblue and yellow modules are significantly negatively correlated with gender
A total of 169 component molecules and 717 Yixinyin targets were obtained from the Chinese medicine database.
Afterwards, the PPIs between the targets were imported into Cytoscape to build the Yixinyin PPI network. The network contains 581 nodes and 4137 edges, as shown in Fig. S1. The figure was drawn with Cytoscape 3.7.0 (https://cytoscape.org/). The weighted gene co-expression network of MI was mapped onto the Yixinyin PPI network to obtain the PPI network of Yixinyin for MI treatment, which contains 64 nodes and 249 edges (Fig. 3).

Analysis of differentially expressed genes.
A total of 390 DEGs were obtained in the GSE66360 data set. Compared with the control samples, 310 mRNAs were up-regulated and 80 mRNAs were down-regulated in MI samples (Fig. 4A,B). In addition, a total of 277 DEGs were screened out in the GSE29111 data set. Compared with the UA group, 121 mRNAs were up-regulated and 156 mRNAs were down-regulated in MI samples (Fig. 4C,D). The figure was drawn with the Sangerbox platform (http://sangerbox.com/Tool) 24. We drew volcano plots for the two data sets separately. Genes and samples with the same or similar expression behavior were clustered and drawn as heat maps. The DEGs in the two data sets were analyzed with Metascape. The results are shown in Figure 5. Up-regulated genes in GSE66360 are significantly enriched in myeloid leukocyte activation, inflammatory response, response to bacterium, leukocyte migration, and regulation of cytokine production. Down-regulated genes in GSE66360 are significantly enriched in endoderm formation, chemokine receptors bind chemokines, PID IL12 2PATHWAY, and alpha-beta T cell activation. Up-regulated genes in GSE29111 are significantly enriched in phagocytosis, engulfment, platelet degranulation, serotonergic synapse, retina homeostasis, and positive regulation of reproductive process. Down-regulated genes in GSE29111 are significantly enriched in Class A/1 (Rhodopsin-like receptors), cornification, regulation of leukocyte migration, GABA receptor signaling, and Arp2/3 complex-mediated actin nucleation. The detailed results are shown in Table S2.
Identification of key genes. We intersected the mapped PPI network module with the DEGs of GSE29111 and GSE66360, and selected the genes in the overlapping regions as key genes. The procedure is shown as a Venn diagram in Fig. 6. Seven genes lie in the intersection between the mapped network module and GSE66360: ALDH2, IL1B, TLR2, C5AR1, FOS, THBD and ACSL1. Two genes lie in the intersection between the mapped network module and GSE29111: E2F2 and TXNRD1. These nine genes were therefore considered key genes.
Verification of key genes based on external data sets. We downloaded the GSE97320 data set from the GEO database to verify the key genes. As shown in Fig. 7, the expression of ALDH2, IL1B, TLR2, C5AR1, FOS, THBD, ACSL1, E2F2 and TXNRD1 in MI samples was significantly higher than in control samples. This is consistent with the results of our analysis of the GSE34198, GSE29111 and GSE66360 data sets.
www.nature.com/scientificreports/
GSEA analysis of key genes. We divided the samples into high expression groups and low expression groups based on the expression of key genes. Key genes are up-regulated in GSE29111, GSE66360, and GSE97320. Therefore, we used GSEA analysis to obtain pathways related to the high expression group. There are 8 pathways regulated by the high expression group of key genes. They are viral myocarditis, type I diabetes mellitus, O glycan biosynthesis, VEGF signaling pathway, FC gamma r mediated phagocytosis, natural killer cell mediated cytotoxicity, N glycan biosynthesis, glycosaminoglycan biosynthesis chondroitin sulfate (Fig. 8).

Identification of potentially active compounds.
A total of 169 astragalus components were obtained through the TCMSP database, and the component information is shown in Table S3. Only the proteins corresponding to ALDH2, C5AR1, IL1B, FOS, TLR2, and TXNRD1 have crystal structures that can be used, so we selected the above 6 targets for molecular docking. The crystal structures with PDB codes 3INL, 5O9H, 5R88, 1FOS, 2Z7X, 2J3N were selected as the docking models of ALDH2, C5AR1, IL1B, FOS, TLR2, TXNRD1 (Table 1 and Fig. 9A-F). The figure is drawn using the SYBYL-X 2.0 software (Tripos, St. Louis, MO) 25 . All components  Table S4.
Mechanism network of Yixinyin in treating MI. As shown in Fig. 10, we selected only the top ten components for each target and drew the Yixinyin-component-target-pathway network diagram. The figure was drawn with Cytoscape 3.7.0 (https://cytoscape.org/). Fig. 9A-F shows the interaction between each target and its highest-scoring compound. The highest-scoring compound is Supraene for ALDH2, prostaglandin B1 for C5AR1, isomucronulatol-7,2′-di-O-glucosiole for c-Fos, angusifolin B for IL1 beta, linolenic acid ethyl ester for TLR2, and Mandenol for TRXR1. Some ingredients also bind well to multiple targets, such as senkyunone, isomucronulatol-7,2′-di-O-glucosiole, Supraene, 1-Monolinolein, linolenic acid ethyl ester and ethyl linoleate. According to the network diagram, every Chinese medicine in Yixinyin has active components that bind to key targets. This result suggests that the ingredients in the prescription act synergistically in the treatment of MI.

Discussion
In this study, we analyzed the gene expression data of 248 MI and control samples. Weighted gene co-expression network analysis yielded 21 co-expressed gene modules, which were mapped onto the PPI network to obtain the gene modules through which Yixinyin may treat MI. By integrating the DEGs of the healthy/MI and unstable angina/MI comparisons, we obtained the potential targets of Yixinyin for the treatment of MI (ALDH2, C5AR1, FOS, IL1B, TLR2, TXNRD1). External data sets also verified the significance of these targets. GSEA indicates that the high expression groups of ALDH2, C5AR1, FOS, IL1B, TLR2 and TXNRD1 jointly participate in MI-associated pathways, namely viral myocarditis, the VEGF signaling pathway and type I diabetes mellitus. Toll-like receptor 2 (TLR2) is an essential protein in non-specific immunity. Studies indicate that, by activating nuclear factor κB and up-regulating interleukin-1β (IL1B), TLR2 can induce cardiomyocyte hypertrophy and the proliferation of fibroblasts and vascular endothelial cells. MiR-499 and other miRNAs are involved in ischemic myocardial protection. This protection may be achieved by inhibiting TLR2 and reducing the release of inflammatory cytokines (including IL-1β and IL-6), which ultimately reduces the infarct area 26. Therefore, regulating TLR2 signaling may provide a new treatment strategy for heart failure. Mitochondrial aldehyde dehydrogenase (ALDH2) is a key enzyme that protects the heart during ischemia and reperfusion. Experiments have shown that both protein kinase C ε (PKCε) agonists and ethanol can increase ALDH2 during cardiac ischemia in mice. Therefore, its high expression in patients with MI may reflect a protective response of the human body 27. Activation of ALDH2 can reduce the differentiation of cardiac fibroblasts. This may be a promising strategy for alleviating myocardial fibrosis, and it motivates the development of cardioprotective drugs such as ALDH2 activators 28. Alda-1 was the first ALDH2 activator discovered, and its cardioprotective effect in vivo is widely recognized 29. FOS is the gene encoding c-Fos. After coronary artery ligation in rats, the expression of miR-101a/b in the peri-infarct area decreased. c-Fos was found to be a target of miR-101a, and cardiac function improved notably after c-Fos was silenced 30. Some clinical trials have shown that postprandial hyperglycemia contributes to the occurrence and development of MI. Hyperglycemia can induce c-fos gene expression, so inhibiting the expression of c-fos may become a new treatment strategy for MI 31,32. Studies have shown that, in the absence of C5aR1, the plasma levels of IL-1 beta and IL-6 are reduced in mice with cardiac insufficiency. IL-1 beta and IL-6 are closely related to the development of MI 33. Hence, inhibiting the expression of C5aR1 may help reduce the risk of MI. There has been no report of Yixinyin treating MI through these targets.
We discovered some potentially active compounds in Yixinyin through docking. Studies have shown that squalene (a synonym of Supraene) protects rats against isoproterenol-induced MI. Its cardioprotective effect is attributed to its ability to reduce the release of hydrolases or to strengthen the myocardial membrane through its antioxidant action against free radicals 34. Linolenic acid ethyl ester is a derivative of linolenic acid and has similar physiological effects. Studies have shown that linolenic acid is associated with a lower risk of acute myocardial infarction. Eating vegetable oils rich in linolenic acid can have an important protective effect on the cardiovascular system 35. Ethyl linoleate is a derivative of linoleic acid and has similar physiological effects. The combined administration of linoleic acid and nitrite can exert a cardioprotective effect in MI. It can reduce the level of hydrogen peroxide and significantly change the activity of myocardial mitochondrial respiration and the electron transport chain 36. 1-Monolinolein was found to be a possible key compound for the hemostatic effect of Cortex Moutan. It is involved in pathways related to complement, the coagulation cascade and platelet activation 37. At present, there is no research on compounds such as angusifolin B, isomucronulatol-7,2′-di-O-glucosiole, Mandenol and senkyunone in the treatment of MI. They may be potentially active compounds, which needs to be verified further through pharmacological experiments.
In this study, we obtained gene modules related to MI by constructing a weighted gene co-expression network, so the resulting modules are more coherent in biological function. We then mapped the gene modules onto the PPI network of Yixinyin and screened out the key genes. These key genes are not only potential targets of Yixinyin but are also closely related to MI. Compared with directly constructing the Yixinyin PPI network, this method better integrates the transcriptome data of the disease and makes the results more focused 38. Admittedly, our approach has limitations. Due to factors such as experimental cost and ethical constraints, the number of patient samples included in this study is very limited 39. In addition, our research lacks validation in animal or cell experiments. In future research, we will perform pharmacodynamic analysis of the potential ingredients and biologically validate the potential key genes through in vivo and in vitro experiments.

Materials and methods
Data download and preprocessing. To discover genes related to MI, we downloaded the GSE66360 data set, which contains samples from MI patients and healthy controls. In addition, recurrent unstable angina (UA; clinical symptoms of cardiac ischemia without myocardial necrosis) is a common stage in the development of MI (clinical symptoms of cardiac ischemia with myocardial necrosis). Therefore, we also downloaded GSE29111, which contains UA and MI samples. Further, to explore the relationship between the clinical characteristics of MI patients and gene modules, we downloaded the GSE34198 data set, which contains complete clinical information. All three data sets come from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) 40.
Construction of weighted gene co-expression network. The GSE34198 data set, which has relatively complete clinical information, was selected for weighted gene co-expression network analysis (WGCNA). The top 25% of genes ranked by variance were used for subsequent calculations (i.e., the genes whose expression varies most across samples were selected). The gene expression matrix was processed for missing values (genes with many missing values were deleted) and outlier samples were eliminated. The gene expression data of the remaining samples were used to construct a weighted gene co-expression network. The appropriate weighting factor β was selected according to the scale-free network criterion. This step can be implemented with the pickSoftThreshold and softConnectivity functions in the WGCNA package. The similarity between genes was calculated from the topological overlap matrix (TOM). After that, Dynamic Tree Cut was performed to divide the genes into modules. The hclust function was used to merge highly similar modules in the clustering dendrogram. The minimum module size was set to 10, and a gene dendrogram was drawn 41.
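The two calculations described above, choosing β and computing topological overlap, can be sketched in plain Python. This is only an illustration and not the WGCNA R implementation: the function names, the unsigned adjacency |cor|^β, the R² values and the three-gene correlation matrix are assumptions introduced here.

```python
from typing import Dict, List

Matrix = List[List[float]]


def pick_soft_threshold(fit_r2: Dict[int, float], cutoff: float = 0.9) -> int:
    """Return the smallest candidate power whose scale-free fit R^2 reaches the cutoff."""
    passing = [beta for beta, r2 in fit_r2.items() if r2 >= cutoff]
    if not passing:
        raise ValueError("no candidate power reaches the requested R^2 cutoff")
    return min(passing)


def soft_adjacency(cor: Matrix, beta: int) -> Matrix:
    """Unsigned soft-threshold adjacency a_ij = |cor_ij| ** beta with a zero diagonal."""
    n = len(cor)
    return [[0.0 if i == j else abs(cor[i][j]) ** beta for j in range(n)]
            for i in range(n)]


def topological_overlap(adj: Matrix) -> Matrix:
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    connectivity = [sum(row) for row in adj]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Connection strength shared through every third gene u.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            denom = min(connectivity[i], connectivity[j]) + 1.0 - adj[i][j]
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / denom
    return tom


# Hypothetical scale-free fit values per candidate power (not taken from the study).
example_fit = {10: 0.84, 11: 0.88, 12: 0.91, 14: 0.93}
beta = pick_soft_threshold(example_fit)

# Hypothetical correlation matrix for three genes.
example_cor = [[1.0, 0.9, 0.8],
               [0.9, 1.0, 0.7],
               [0.8, 0.7, 1.0]]
example_tom = topological_overlap(soft_adjacency(example_cor, beta))
```

In practice the dissimilarity 1 - TOM would be passed to hierarchical clustering before modules are cut from the tree.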
Each gene module was then associated with the clinical features, and the significance (p value) was calculated to obtain the gene significance (GS), which represents the correlation between genes and phenotypes. For each module, we defined module membership (MM) as the correlation between the module eigengene and the gene expression profile. We selected modules that are highly correlated with the phenotypic characteristics of MI.  42 . The oral bioavailability (OB) and drug-likeness (DL), which are related to ADME, were used to screen the active ingredients of Yixinyin. The screening criteria were OB ≥ 30% and DL ≥ 0.18 43. In addition, the target data of Yixinyin come from the TCMSP database, the Encyclopedia of Traditional Chinese Medicine (ETCM, http://www.nrc.ac.cn:9090/ETCM/) and the BATMAN database (BATMAN, http://bionet.ncpsb.org/batman-tcm/) 44,45. The PPI data between Yixinyin targets come from the STRING database (https://string-db.org) 46. The PPI data with a confidence level greater than 0.7 were imported into Cytoscape 3.7.0 (https://cytoscape.org/) to build the Yixinyin PPI network 47.
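The two screening thresholds stated above (OB ≥ 30% with DL ≥ 0.18, and PPI confidence greater than 0.7) amount to simple filters. The Python sketch below illustrates them under stated assumptions: the record layout, the compound and target names, and all numeric values are hypothetical placeholders, and the confidence score is assumed to be on a 0 to 1 scale (STRING download files typically report it on a 0 to 1000 scale, where the same cutoff would be 700).

```python
from typing import Dict, List, Tuple

OB_MIN = 30.0   # oral bioavailability threshold in percent
DL_MIN = 0.18   # drug-likeness threshold
PPI_MIN = 0.7   # interaction confidence cutoff (assumed 0-1 scale)


def screen_compounds(compounds: List[Dict]) -> List[str]:
    """Keep compounds that meet both ADME criteria (thresholds are inclusive)."""
    return [c["name"] for c in compounds
            if c["ob"] >= OB_MIN and c["dl"] >= DL_MIN]


def filter_ppi(edges: List[Tuple[str, str, float]]) -> List[Tuple[str, str]]:
    """Keep interactions whose confidence is strictly greater than the cutoff."""
    return [(a, b) for a, b, score in edges if score > PPI_MIN]


# Hypothetical compound records; none of these values come from TCMSP.
example_compounds = [
    {"name": "compound_A", "ob": 45.2, "dl": 0.25},
    {"name": "compound_B", "ob": 28.0, "dl": 0.40},  # fails OB
    {"name": "compound_C", "ob": 60.1, "dl": 0.10},  # fails DL
    {"name": "compound_D", "ob": 30.0, "dl": 0.18},  # passes at the boundary
]

# Hypothetical target pairs with confidence scores.
example_edges = [
    ("TARGET_1", "TARGET_2", 0.95),
    ("TARGET_1", "TARGET_3", 0.70),  # not strictly greater than 0.7
    ("TARGET_2", "TARGET_3", 0.42),
]

active = screen_compounds(example_compounds)
network_edges = filter_ppi(example_edges)
```

The retained edge list is the kind of table that would then be imported into Cytoscape as a network.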
Taking the PPI network of Yixinyin as the background, we mapped the gene modules obtained from the MI WGCNA onto it. The UniProt database (https://www.uniprot.org/) was used to convert between gene names and target names 48. In this way, the co-expression data from WGCNA automatically divide the PPI network into multiple modules.
key genes of Yixinyin in treating MI. To further clarify the genetic differences between patients with MI and the control group, we used the limma package in R 3.5.0 to identify differentially expressed genes between MI and control samples in the GSE66360 data set. Heat maps and volcano plots were drawn with the Sangerbox platform (http://sangerbox.com/Tool) 24. Genes with |log2(fold change)| > 1 and P < 0.05 were considered differentially expressed genes (DEGs). Among them, genes with log2(fold change) > 1 were marked as up-regulated and genes with log2(fold change) < −1 as down-regulated. Genes that did not meet these criteria were not considered DEGs. Similarly, we screened the DEGs between UA and MI in the GSE29111 data set. To identify the biological functions of the DEGs, we submitted the genes to the Metascape website (https://metascape.org/gp/index.html#/main/step1) for enrichment analysis 49. Entries with P < 0.05 were considered statistically significant.
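The DEG screening rule described above can be written as a small classifier. The sketch below is in Python rather than R and does not reproduce limma; it only applies the stated cutoffs to a table of log2 fold changes and P values. The function names and the example values are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple


def classify_gene(log2_fc: float, p_value: float,
                  fc_cutoff: float = 1.0, p_cutoff: float = 0.05) -> str:
    """Label a gene as 'up', 'down' or 'not_deg' using the stated thresholds."""
    if p_value < p_cutoff and log2_fc > fc_cutoff:
        return "up"
    if p_value < p_cutoff and log2_fc < -fc_cutoff:
        return "down"
    return "not_deg"


def summarize(results: Dict[str, Tuple[float, float]]) -> Counter:
    """Count how many genes fall into each class."""
    return Counter(classify_gene(fc, p) for fc, p in results.values())


# Hypothetical (log2 fold change, P value) pairs; not taken from the data sets.
example_results = {
    "GENE_A": (2.3, 0.001),   # up-regulated
    "GENE_B": (-1.5, 0.010),  # down-regulated
    "GENE_C": (0.4, 0.001),   # fold change too small
    "GENE_D": (3.0, 0.200),   # not significant
}
counts = summarize(example_results)
```

Both inequalities are strict, so a gene with a log2 fold change of exactly 1 is not counted as a DEG.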
Based on the co-expression between genes, we divided the Yixinyin PPI network into modules. Specifically, the mapped gene module reflects the main PPI mechanism of Yixinyin in treating MI. If nodes in the module network are differentially expressed between patients and healthy samples, or between UA and MI, they may be the primary regulatory targets of Yixinyin for the treatment of MI. Therefore, to screen for key genes, we took the intersection of the mapped network module and the DEGs, and selected the genes in the overlapping region as the key genes.
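The intersection step can be illustrated with Python sets. In this sketch the overlapping gene symbols are the ones reported in the Results, while the entries named MODULE_ONLY_* and DEG_ONLY_* are hypothetical placeholders standing in for the many genes that appear in only one list. The function name is also an assumption.

```python
from typing import Dict, Set


def screen_key_genes(module: Set[str],
                     deg_sets: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Intersect the mapped network module with each DEG list."""
    return {name: module & degs for name, degs in deg_sets.items()}


# Mapped network module: reported key genes plus placeholder members.
module_genes = {
    "ALDH2", "IL1B", "TLR2", "C5AR1", "FOS", "THBD", "ACSL1",
    "E2F2", "TXNRD1", "MODULE_ONLY_1", "MODULE_ONLY_2",
}

# DEG lists: reported overlaps plus placeholder non-overlapping genes.
deg_lists = {
    "GSE66360": {"ALDH2", "IL1B", "TLR2", "C5AR1", "FOS", "THBD",
                 "ACSL1", "DEG_ONLY_1"},
    "GSE29111": {"E2F2", "TXNRD1", "DEG_ONLY_2"},
}

overlaps = screen_key_genes(module_genes, deg_lists)
# Key genes are the union of the two overlaps.
key_genes = set().union(*overlaps.values())
```

Taking the union of the per-data-set overlaps, rather than a three-way intersection, matches the description that a node qualifies if it is differentially expressed in either comparison.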