Study of the active ingredients and mechanism of Sparganii rhizoma in gastric cancer based on HPLC-Q-TOF–MS/MS and network pharmacology

Sparganii rhizoma (SL) has potential therapeutic effects on gastric cancer (GC), but its main active ingredients and possible anticancer mechanism are still unclear. In this study, we used HPLC-Q-TOF–MS/MS to comprehensively analyse the chemical components of the aqueous extract of SL. On this basis, a network pharmacology method incorporating target prediction, gene function annotation, and molecular docking was performed to analyse the identified compounds, thereby determining the main active ingredients and hub genes of SL in the treatment of GC. Finally, the mRNA and protein expression levels of the hub genes of GC patients were further analysed by the Oncomine, GEPIA, and HPA databases. A total of 41 compounds were identified from the aqueous extract of SL. Through network analysis, we identified seven main active ingredients and ten hub genes: acacetin, sanleng acid, ferulic acid, methyl 3,6-dihydroxy-2-[(2-hydroxyphenyl) ethynyl]benzoate, caffeic acid, adenine nucleoside, azelaic acid and PIK3R1, PIK3CA, SRC, MAPK1, AKT1, HSP90AA1, HRAS, STAT3, FYN, and RHOA. The results indicated that SL might play a role in GC treatment by controlling the PI3K-Akt and other signalling pathways to regulate biological processes such as proliferation, apoptosis, migration, and angiogenesis in tumour cells. In conclusion, this study used HPLC-Q-TOF–MS/MS combined with a network pharmacology approach to provide an essential reference for identifying the chemical components of SL and its mechanism of action in the treatment of GC.

Gastric cancer (GC) is one of the leading causes of cancer-related death worldwide, and its incidence ranks sixth among cancers 1 . At present, surgery, chemotherapy, and other traditional therapies are the main treatments. However, the incidence of local recurrence and distant metastasis after gastric cancer surgery is high, and chemotherapy is associated with toxicity and side effects; thus, it is challenging for these treatments to mediate a long-term antitumour effect. Therefore, it is necessary to explore new strategies for the treatment of this disease. In China, traditional Chinese medicine (TCM) is widely used in the treatment of GC and has shown advantages with its multipathway, multitarget, and multilink characteristics, few side effects, and significant efficacy. Sparganii rhizoma (SL), a traditional Chinese medicine, is the dried tuber of the Sparganiaceae plant Sparganium stoloniferum (Buch.-Ham. ex Graebn.) Buch.-Ham. ex Juz. It has pungent, bitter, and flat attributes and enters the liver and spleen meridians. Its effects include tonifying the blood and promoting qi, removing stagnant food, and alleviating pain. It is included in the Pharmacopoeia of the People's Republic of China (2015 Edition) 2 . Previous experiments by our research team suggested that the "Sparganii rhizoma-Curcuma zedoary-Salvia chinensis" herb pair, with SL as one of the main components, had growth-inhibitory effects on both regular and resistant GC cells, and the inhibitory effect increased with increasing concentration 3 . Modern pharmacological studies have also shown that SL has an apparent inhibitory effect on the proliferation of GC cells and can promote tumour cell apoptosis 4 . In addition, some studies have found that the combination of traditional Chinese medicine preparations mainly composed of SL and chemotherapy can prolong the progression-free survival [...] Table S3).
www.nature.com/scientificreports/
GO analysis identified 310 biological processes (BP), 50 cellular components (CC), and 93 molecular functions (MF). In BP, the targets mainly involve positive regulation of transcription from RNA, negative regulation of the apoptotic process, positive regulation of cell proliferation, and positive regulation of cell migration, angiogenesis, and the MAPK cascade. In CC, the targets mainly involve the nucleus, plasma membrane, cytoplasm, extracellular exosomes, integral components of the plasma membrane, and mitochondria. In MF, the targets mainly involve protein binding, ATP binding, enzyme binding, identical protein binding, protein kinase activity, and protein homodimerization activity. A total of 101 pathways were identified by KEGG pathway analysis, and the targets were closely related to pathways in cancer, the PI3K-Akt signalling pathway, proteoglycans in cancer, microRNAs in cancer, focal adhesion, the Rap1 signalling pathway, the Ras signalling pathway, the cAMP signalling pathway, the HIF-1 signalling pathway, and the MAPK signalling pathway. This suggests that SL may play a role in the treatment of GC through these pathways. Among them, the PI3K-Akt signalling pathway involves 47 potential targets, including most of the hub genes, and may be the key pathway. The top 20 results of the enrichment analysis, ranked in descending order by the number of enriched genes, are visualized in Fig. 5. These results indicate that the biological processes involving the anticancer targets of SL's main chemical components are diverse and distributed across different metabolic pathways, reflecting its multipathway characteristics.
Compound-target-pathway network analysis. A compound-target-pathway network was constructed from the targets included in the top 20 pathways obtained from KEGG pathway analysis and the chemical components corresponding to those targets (Fig. 6). The network contained 181 nodes (29 representative components, 132 representative targets, and 20 representative pathways) and 886 edges. The compound-target-pathway network shows that the targets of the SL active components are distributed across different pathways, coordinate with each other, and act together in the treatment of GC, which embodies the multicomponent, multitarget, and multipathway characteristics of traditional Chinese medicine.

Molecular docking analysis.
We performed molecular docking between the seven major active ingredients, whose node degree and betweenness centrality were both greater than the average in the compound-target network, and the core targets with the ten highest degrees in the PPI network. The original ligands of the potential protein targets were also analysed. After docking with AutoDock Vina, the obtained data were displayed as a heat map (Fig. 7). It is generally believed that the lower the energy of a stable ligand-receptor binding conformation, the greater the possibility of action. In this study, the binding energies of almost all active ingredients with the core target proteins were less than −5.0 kcal/mol, which indicated that the SL active ingredients had good binding activity with the core targets. We selected the docking results of the compound that bound best to the target proteins (acacetin) for display (Fig. 8).
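The screening of docking scores against the −5.0 cut-off can be sketched as follows. This is a minimal illustration only: the compound and protein labels and all energy values are invented placeholders, not the data behind Fig. 7, and the helper name `summarize` is hypothetical.

```python
# Placeholder binding energies in kcal/mol; invented for illustration,
# NOT the values reported in the heat map.
energies = {
    ("compound_1", "protein_A"): -9.1,
    ("compound_1", "protein_B"): -8.4,
    ("compound_2", "protein_A"): -6.3,
    ("compound_3", "protein_B"): -4.8,
}

THRESHOLD = -5.0  # cut-off quoted in the text


def summarize(energies, threshold=THRESHOLD):
    """Return the pairs scoring below the threshold and the lowest-energy pair."""
    passing = {pair: e for pair, e in energies.items() if e < threshold}
    best_pair = min(energies, key=energies.get)
    return passing, best_pair


passing, best_pair = summarize(energies)
print(f"{len(passing)}/{len(energies)} pairs below {THRESHOLD}")
print("lowest-energy pair:", best_pair, energies[best_pair])
```

Lower (more negative) scores are treated as better, so the best pair is found with `min` rather than `max`.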

External validation of hub genes. mRNA expression levels of hub genes. We used the Oncomine database to analyse the differential expression of hub genes between GC tissues and normal tissues. The following thresholds were set: p-value, 0.01; fold change, 2; gene rank, top 10%; data type, mRNA. The results showed that the mRNA expression of MAPK1 and STAT3 was significantly upregulated in GC tissues, whereas the mRNA levels of the other hub genes did not differ significantly between GC and normal gastric tissues (Fig. 9a). Subsequently, further validation with the GEPIA database showed that the mRNA levels of MAPK1 and HSP90AA1 were significantly upregulated in GC specimens compared with normal gastric specimens (P < 0.01) (Fig. 9b).
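The three Oncomine thresholds can be read as a joint filter on each differential-expression record. The sketch below assumes inclusive bounds and a percentile-style gene rank, neither of which is stated in the text. The record type `DiffExpr`, the function `passes`, and every gene name and number are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DiffExpr:
    gene: str
    p_value: float
    fold_change: float      # tumour versus normal tissue
    rank_percentile: float  # 1.0 means the gene ranks in the top 1%


def passes(rec, p_max=0.01, fc_min=2.0, rank_max=10.0):
    """True when a record meets all three thresholds (inclusive bounds assumed)."""
    return (rec.p_value <= p_max
            and rec.fold_change >= fc_min
            and rec.rank_percentile <= rank_max)


# Placeholder records; gene names and numbers are invented for illustration.
records = [
    DiffExpr("GENE_A", 0.0004, 2.6, 3.0),
    DiffExpr("GENE_B", 0.0300, 2.9, 5.0),  # fails the p-value threshold
    DiffExpr("GENE_C", 0.0010, 1.4, 8.0),  # fails the fold-change threshold
]
upregulated = [r.gene for r in records if passes(r)]
print(upregulated)
```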
In addition, we analysed the relationship between hub gene mRNA levels and the pathological stage of GC.
The results showed that the levels of PIK3R1 and HSP90AA1 changed significantly with pathological stage and increased significantly in stage III (Fig. 9c). These results suggested that the expression levels of these two genes might be correlated with GC progression.
Protein expression levels of hub genes. Additionally, we analysed the immunohistochemical staining images in the HPA database to observe the protein expression levels of the hub genes in GC. The results showed that, except for HSP90AA1, the other nine hub genes were expressed to different degrees in normal gastric tissues. Compared with normal gastric tissues, the expression levels of SRC, MAPK1, HSP90AA1, STAT3, and FYN were increased in GC tissues, while the expression of RHOA was decreased in GC tissues (Fig. 10).

Discussion
In this study, HPLC-Q-TOF-MS/MS technology was used to rapidly and comprehensively analyse the chemical components of SL, and 41 compounds were identified. The identified compounds were then studied by network pharmacology. We found seven main active ingredients in the drug, namely acacetin, sanleng acid, ferulic acid, methyl 3,6-dihydroxy-2-[(2-hydroxyphenyl) ethynyl] benzoate, caffeic acid, adenine nucleoside, and azelaic acid; moreover, we identified PIK3R1, PIK3CA, SRC, MAPK1, AKT1, HSP90AA1, HRAS, STAT3, FYN, and RHOA as hub genes. Molecular docking showed that the active ingredients had good affinity for the hub gene proteins. These seven active ingredients may be the material basis for the therapeutic efficacy of SL in GC. Modern pharmacological studies have shown that acacetin, a natural flavonoid, acts against tumours through multiple links, pathways, and targets and is effective in most tumour cell lines. It can inhibit the proliferation of tumour cells, induce the autophagy and apoptosis of tumour cells, inhibit the invasion and migration of tumour cells and angiogenesis, regulate immunity, and reverse multidrug resistance 31 [...] inhibit tumour cell proliferation 32,33 . Ferulic acid can also induce the apoptosis of GC cells by upregulating the tumour suppressor transcription factor p53 and downregulating the mRNA and protein expression levels of the apoptosis inhibitory proteins Survivin and XIAP 34,35 . Caffeic acid can also cause apoptosis of SCM1 human GC cells 36 . Sanleng acid and azelaic acid are organic acid compounds. Sanleng acid was the first organic acid component identified in SL, but no specific mechanism of action has been reported yet. Azelaic acid can disrupt mitochondrial respiration and inhibit cell synthesis; thus, it has good antiproliferative and cytotoxic effects on various cultured tumour cell lines and can be used as a potential anticancer drug 37 .
Although our screening identified methyl 3,6-dihydroxy-2-[(2-hydroxyphenyl) ethynyl] benzoate and adenine nucleoside as main active ingredients in GC treatment, their antitumour effects have not yet been clearly reported, and their potential mechanisms of action deserve further study. An increasing number of studies have shown that TCM is a multitarget drug. Among the ten hub genes identified in this study, PIK3R1 and PIK3CA were identified as PI3K/protein kinase B (Akt) signalling pathway regulators. Studies have shown that abnormal upregulation of PIK3R1 and PIK3CA expression enhances the catalytic activity of PI3K and then activates the PI3K-Akt signalling pathway, causing GC cells to overproliferate and increasing the migration and invasion abilities of GC cells [38][39][40] . The proto-oncogene c-SRC, a member of the SRC family of kinases (SFKs), is one of the earliest nonreceptor-dependent tyrosine protein kinases found to be closely related to human diseases 41 . Current studies have shown that SRC can promote tumour cell proliferation and tumour angiogenesis, inhibit apoptosis, participate in cancer cell adhesion and invasion, and coregulate tumour growth through the interaction of growth factor receptors and growth factors 42,43 . Mitogen-activated protein kinase 1 (MAPK1) has been confirmed as an essential oncogene in the progression of GC; its level is elevated in GC tissues and cells, and it can promote the proliferation, migration, and invasion of GC cells [44][45][46] . Heat shock protein 90 (HSP90) is overexpressed in many malignant tumours, and members of the HSP90 gene family are essential for cell cycle regulation, survival, and apoptosis. Studies have shown that the expression of HSP90AA1 is associated with poor prognosis in GC 47,48 .
STAT3, a key transcription factor in tumorigenesis, is a focal point of multiple signalling pathways involved in cell proliferation, carcinogenesis, and apoptosis, and can promote the growth, proliferation, angiogenesis, metastasis, and immune response of tumour cells 49,50 . Similar to SRC, FYN is an SFK that is overexpressed in GC, is positively correlated with metastasis, and may promote gastric cancer metastasis by activating STAT3-mediated epithelial-mesenchymal transition 51 . In addition, our study also showed that SRC, MAPK1, STAT3, HSP90AA1, PIK3R1, and FYN were overexpressed in GC patients, which may be associated with the poor prognosis of GC patients. The AKT1 signalling pathway plays a vital role in regulating tumour cell growth, proliferation, apoptosis, and metabolism. Its positive expression rate in GC tissues is significantly higher than that in adjacent tissues, and it participates in the occurrence and development of GC [52][53][54] . HRAS belongs to the RAS gene family, which regulates RAF-MEK-ERK, PI3K/AKT, and other signalling pathways related to cell survival and proliferation by binding to GTP/GDP and the RAS protein to act as a molecular switch 55,56 . HRAS mutations are closely associated with the occurrence of various tumours. The expression of RHOA, a member of the RAS homologue family, is related to the development of certain tumours; however, its prognostic value in GC remains controversial. Some studies have found that the RHOA signalling pathway plays a vital role in the occurrence, invasion, metastasis, immune escape, and multidrug resistance mechanisms of gastric cancer 57,58 . Nevertheless, other studies have shown that the overall prevalence of RHOA-mutant GC is low and that it usually presents with a lower T stage and no distant metastasis 59 . Our external validation also showed that RHOA was expressed at a low protein level in GC tissues; therefore, further study of this gene is necessary.
To better understand the molecular mechanism of SL in the treatment of GC, we performed GO and KEGG pathway analyses on the targets. The GO analysis results showed that the target genes were mainly related to biological processes such as positive regulation of transcription from RNA, negative regulation of the apoptotic process, positive regulation of cell proliferation, positive regulation of cell migration, angiogenesis, and similar processes. In CC, the nucleus accounted for the largest proportion. In MF, protein binding, ATP binding, and enzyme binding were the main components. KEGG pathway analysis showed that the action of SL in the treatment of GC was most related to the PI3K-Akt signalling pathway. Additionally, it involved the Ras signalling pathway, the MAPK signalling pathway, and other signalling pathways. Most of the hub genes, such as HRAS, AKT1, HSP90AA1, PIK3CA, PIK3R1, MAPK1, and RHOA, play roles in these signalling pathways, which is consistent with the results of modern pharmacological studies.
In conclusion, the analytical method based on HPLC-Q-TOF-MS/MS technology used in this study can identify the chemical components in SL accurately, efficiently, rapidly, and comprehensively. In parallel, the network pharmacology method was used to explore its potential active ingredients and the mechanism by which it treats GC, providing more scientific theoretical guidance for the future improvement of quality control standards and the clinical application of SL. In our study, we found that SL is a multitarget anticancer drug. We predicted that the primary mechanism of action of SL in the treatment of GC is as follows: mediating [...]

Preparation of the test solution of SL. An appropriate amount of SL medicinal material was crushed and sieved through a 60-mesh sieve, and 1 g of powder was precisely weighed. The powder was soaked ten times in double-distilled water for 30 min and then extracted twice under reflux, the first time for 30 min and the second for 20 min. The two filtrates were combined, evaporated on a rotary evaporator at 70 ℃, and then reconstituted with absolute ethanol in a 10-ml volumetric flask.

Identification of compounds.
According to the multistage mass spectrum fragment information and the precise relative molecular mass provided by high-resolution mass spectrometry, the molecular formula was fitted by Peakview 1.2 software with a mass deviation range (δ) ≤ 5 × 10⁻⁶, and the compounds were preliminarily predicted. The predictions were then confirmed by comparing the retention time and the mass spectrum fragment information with those provided by the SciFinder database and related references, so that the compounds could be identified accurately.
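The mass deviation criterion δ ≤ 5 × 10⁻⁶ corresponds to 5 ppm and can be checked as in the sketch below. The function names are hypothetical. The theoretical [M − H]⁻ m/z of ferulic acid (C10H10O4) is computed from monoisotopic masses, and the two "measured" values are invented purely to show one accepted and one rejected match.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Relative mass deviation in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tol_ppm=5.0):
    """True when the absolute deviation is at most tol_ppm (5 ppm = 5e-6)."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


# Theoretical [M - H]- of ferulic acid from monoisotopic masses.
FERULIC_ACID_MZ = 193.0506

# Invented measurements, for illustration only.
for measured in (193.0511, 193.0530):
    err = ppm_error(measured, FERULIC_ACID_MZ)
    print(f"{measured:.4f}: {err:+.2f} ppm, accepted={within_tolerance(measured, FERULIC_ACID_MZ)}")
```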

Network pharmacology research. Prediction of potential targets of compounds and collection of disease targets. SwissTargetPrediction (http://www.swisstargetprediction.ch/) 60 is a web tool for ligand-based target prediction of any small biologically active molecule. We converted the compounds identified by mass spectrometry into canonical SMILES using the PubChem (https://pubchem.ncbi.nlm.nih.gov/), Chemical Book (https://www.chemicalbook.com/ProductIndex.aspx), and ChemSpider (http://www.chemspider.com/) databases. We then imported the SMILES into SwissTargetPrediction to predict all potential targets of the compounds. The species was set to "Homo sapiens", with probability > 0 as the screening condition.
Then, the predicted targets of the chemical components of SL were mapped with the targets of GC, and the intersection of the two was taken to obtain the target set of SL for the treatment of GC.
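The screening and mapping steps above amount to a probability filter followed by a set intersection. The sketch below assumes the predictions are available as (compound, target, probability) rows. The function name and every compound, target, and probability shown are hypothetical placeholders.

```python
def predicted_targets(predictions, min_probability=0.0):
    """Collect targets whose predicted probability is strictly above the cut-off."""
    return {target for _compound, target, prob in predictions if prob > min_probability}


# Placeholder prediction rows; invented for illustration.
predictions = [
    ("compound_1", "TARGET_1", 0.62),
    ("compound_1", "TARGET_2", 0.00),  # dropped: probability is not > 0
    ("compound_2", "TARGET_3", 0.11),
    ("compound_2", "TARGET_4", 0.05),
]

# Placeholder disease target set.
gc_targets = {"TARGET_1", "TARGET_3", "TARGET_9"}

# Targets shared by the herb's compounds and the disease.
sl_gc_targets = predicted_targets(predictions) & gc_targets
print(sorted(sl_gc_targets))
```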
Construction of compound-target network. The chemical components of SL and its therapeutic targets in GC were imported into Cytoscape (Version 3.8.0) (https://cytoscape.org/) 65 to construct the compound-target network. The "network analysis" function was used to analyse the topological parameters of the network. The "degree" of a node is the number of nodes connected to it in the network; the greater the degree of a node, the more critical it is in the network. The "betweenness centrality" reflects the importance of a node in transmitting information through the network; the greater the betweenness centrality of a node, the more critical it is in the network. The core network was screened based on the node topological parameters "degree" and "betweenness centrality" to obtain the main active ingredients of SL for the treatment of GC.
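The two topological parameters and the above-average screening can be sketched in plain Python. This is not Cytoscape's implementation: betweenness is computed with Brandes' algorithm for an unweighted, undirected graph and left unnormalized, and the averages are taken over all nodes, which is an assumption. The toy network and all function names are hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from (node, node) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes, unweighted, undirected)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0.0)  # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1.0, 0
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):  # accumulate dependencies back towards s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {n: b / 2.0 for n, b in bc.items()}  # each pair was counted twice


# Toy compound-target network (C = compound, T = target); invented.
edges = [("C1", "T1"), ("C1", "T2"), ("C1", "T3"),
         ("C2", "T3"), ("C2", "T4"), ("C3", "T4")]
adj = build_adjacency(edges)
deg = {n: len(nb) for n, nb in adj.items()}
bc = betweenness(adj)
mean_deg = sum(deg.values()) / len(deg)
mean_bc = sum(bc.values()) / len(bc)
selected = sorted(n for n in adj
                  if n.startswith("C") and deg[n] > mean_deg and bc[n] > mean_bc)
print(selected)
```

Because the screening compares each node with the network mean, normalizing the betweenness values by a constant factor would not change which compounds are selected.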
Construction of PPI network. The targets of SL for the treatment of GC were imported into the STRING Database (Version 11.0) (https://string-db.org/) 66 , and the correlation between target proteins was analysed. "Organism" was set as "Homo sapiens". The PPI network was constructed with a "combined score" ≥ 0.9 as the screening condition. The visualization process was carried out with Cytoscape (Version 3.8.0), and targets with a high degree of connectivity were selected as hub genes.
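The confidence filter and degree-based hub selection can be sketched as below, assuming the interactions are available as (protein, protein, combined score) rows with scores on a 0 to 1 scale. The function name, the alphabetical tie-break, and all proteins and scores are hypothetical.

```python
from collections import Counter


def hub_genes(interactions, min_score=0.9, top_n=10):
    """Keep edges with combined score >= min_score and return the top_n nodes by degree."""
    degree = Counter()
    for a, b, score in interactions:
        if score >= min_score:
            degree[a] += 1
            degree[b] += 1
    # Sort by descending degree; ties are broken alphabetically (an assumption).
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    return ranked[:top_n]


# Placeholder interactions; invented for illustration.
interactions = [
    ("P1", "P2", 0.95), ("P1", "P3", 0.92), ("P1", "P4", 0.99),
    ("P2", "P3", 0.91), ("P3", "P4", 0.40), ("P4", "P5", 0.85),
]
print(hub_genes(interactions, top_n=2))
```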
Gene function annotation and construction of the compound-target-pathway network. The Database for Annotation, Visualization and Integrated Discovery (DAVID) (Version 6.8) (https://david.ncifcrf.gov/) 67,68 provides systematic and comprehensive biological function annotation for large numbers of genes and can identify the most significantly enriched biological annotations. We imported the target set of SL for GC treatment into DAVID (Version 6.8) and defined the species as "Homo sapiens" for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses. To annotate the biological functions of the genes more comprehensively and to better understand the molecular mechanism of SL in treating GC, GO describes the nature of genes in three categories: cellular component (generally used to describe the location of gene action), molecular function (which describes activity at the molecular level), and biological process. P < 0.01 was used as the screening condition. Enrichment analysis bubble maps were plotted using the R language. Based on the results of KEGG pathway analysis, the GC-related pathways ranking in the top 20 by number of enriched genes were identified. Then, Cytoscape (version 3.8.0) was used to construct the compound-target-pathway network.
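The idea behind the enrichment screening can be illustrated with a plain hypergeometric over-representation test. This is a simplification: DAVID reports a modified Fisher exact statistic, so its P values will differ from those computed here. All function names, pathway names, and numbers below are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the pathway."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def top_pathways(results, p_max=0.01, top_n=20):
    """Keep pathways with P < p_max and rank them by enriched gene count."""
    kept = [r for r in results if r[2] < p_max]
    return sorted(kept, key=lambda r: -r[1])[:top_n]


# Toy case: 20 background genes, 5 in the pathway, 4 target genes.
print(hypergeom_pvalue(3, 4, 5, 20))  # 3 of the 4 targets fall in the pathway
print(hypergeom_pvalue(4, 4, 5, 20))  # all 4 targets fall in the pathway

# Placeholder (name, gene count, P value) rows; invented for illustration.
results = [("pathway_A", 12, 0.0002), ("pathway_B", 30, 0.0400), ("pathway_C", 25, 0.0010)]
print([name for name, _count, _p in top_pathways(results)])
```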
Molecular docking between active ingredients and hub genes. To further validate the reliability of the target prediction results, molecular docking was performed on the selected active ingredients and hub genes. The 3D structures of the active ingredients were downloaded as SDF files from the PubChem database, imported into Chem3D for optimization, and saved in mol2 format. The hub gene proteins were searched in the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, https://www.rcsb.org/) 69,70 , where the best protein crystal structure was selected (human protein, with ligands, relatively complete structure, smaller resolution value), and its PDB format file was downloaded. Before docking, the original crystal ligand and water molecules in the protein-ligand complex were removed using PyMol 71 . Hydrogens and charges were then added to the protein and ingredients, which underwent other preparation steps in AutoDockTools and were converted into PDBQT format files. AutoDock Vina 72 was used to perform molecular docking between the processed ingredients and proteins, and the docking results were visualized using PyMol software.
External validation of hub genes. Analysis of mRNA expression level. Oncomine 4.5 (https://www.Oncomine.org) 73 is a cancer gene expression profile database and integrated data-mining platform designed to facilitate discovery from genome-wide expression analyses. Through the Oncomine database, we compared the differential expression of hub genes in GC tissues and normal gastric tissues.
Gene Expression Profiling Interactive Analysis (GEPIA, http://gepia.cancer-pku.cn/index.html) 74 is a newly developed interactive web server for analysing the RNA sequencing expression data of 9736 tumours and 8587 normal samples from the TCGA and GTEx projects using a standard processing pipeline. The GEPIA database can further verify the differential expression of hub genes between GC and normal gastric tissues, and it can also analyse them according to pathological stages.
Analysis of protein expression level. The Human Protein Atlas (Version 19.3) (HPA, https://www.proteinatlas.org/) 75 is an extensive proteome database based mainly on immunohistochemical analysis. The protein expression levels of hub genes in GC tissues and normal gastric tissues were compared according to the staining intensity and percentage of stained cells in the tissues, and representative immunohistochemical staining pictures were obtained.

Data availability
All data generated or analysed during this study are included in this published article and its "Supplementary Information" files.