Potential therapeutic targets of gastric cancer explored under endogenous network modeling of clinical data

Improvement in the survival rate of gastric cancer, a prevalent global malignancy and the leading cause of cancer-related mortality calls for more avenues in molecular therapy. This work aims to comprehend drug resistance and explore multiple-drug combinations for enhanced therapeutic treatment. An endogenous network modeling clinic data with core gastric cancer molecules, functional modules, and pathways is constructed, which is then transformed into dynamics equations for in-silicon studies. Principal component analysis, hierarchical clustering, and K-means clustering are utilized to map the attractor domains of the stochastic model to the normal and pathological phenotypes identified from the clinical data. The analyses demonstrate gastric cancer as a cluster of stable states emerging within the stochastic dynamics and elucidate the cause of resistance to anti-VEGF monotherapy in cancer treatment as the limitation of the single pathway in preventing cancer progression. The feasibility of multiple objectives of therapy targeting specified molecules and/or pathways is explored. This study verifies the rationality of the platform of endogenous network modeling, which contributes to the development of cross-functional multi-target combinations in clinical trials.

The core endogenous network of gastric cancer is constructed based on the hierarchical structure of cell regulatory networks and feedback from the surrounding environment.We first select important modules responsible for basic cellular functions and modules related to cancer (such as cell cycle, apoptosis, metabolism and stress response, immune response, and other fundamental functional modules and pathways), and then determine the key factors for each module.After agents are selected and integrated, we conducted extensive literature searches on interactions between the agents and carried out rigorous experimental validation to construct the network.It is important to note that network construction is an iterative process.Taking Network1 -the initial version of the network (see Excel -Network1.xlsx)as an example, we applied K-Means clustering to the attractors obtained from Network1 to generate a steady-state landscape graph, as illustrated in Figure S1A.All stable points were mixed without clear classification, which was inconsistent with the clinical data landscape graph from Ruijin Hospital (see Figure 4B in the main text).Furthermore, the metabolism module exhibited high expression in all attractors (see Figure S1B), which contradicted the experimental data.By introducing inhibitory factors of the metabolism module from other elements in the network and conducting multiple iterations, the final version of the network (see Table S2) was obtained.At this stage, K-Means clustering was used to categorize the attractors, resulting in 3 clusters, along with the expression patterns at the molecular level for each cluster, which were then compared with clinical data.The condition for network matching is that the consistency rate at the module level between the constructed model and experimental data should be 100%.Considering the data quality, the consistency rate at the molecular level should be around 70%, not less than 60%.Table S2 Literature sources for the interactions between factors in the network

B. From network topology to dynamical systems
In accordance with the equations (1)-( 5) outlined in the main text, equations 1-55 are employed to depict the temporal dynamics of the concentration/activation levels of the 55 factors within the network.Additionally, equations 56-110 are utilized to represent the random distribution parameters  for these 55 factors in the network.It is worth noting that the parameter n is assigned a value of 3, while the parameter a is set to 10.

C. Analyze steady states and microarray data
Steady State Data and Ruijin Hospital Samples 1: In the absence of network intervention, the simulation yield steady states for the network, target:0.

Analysis Process for the four Data Sets
We obtained microarray data from samples collected at Ruijin Hospital.To facilitate analysis, the data underwent Z-score normalization, transforming it into a standardized form with a mean of 0 and a standard deviation of 1.This normalization method eliminates inherent size differences between genes, enabling meaningful comparisons on a common scale.It ensures compatibility for comparing initially dissimilar and non-directly comparable values.This normalization step establishes the foundation for subsequent comparisons with gene data from Ruijin Hospital.

Analysis of All Steady States:
To analyze the simulation data, we standardized all the data with a mean of 0 and a standard deviation of 1.Using Principal Component Analysis (PCA), we identified the top contributing components, PC1 and PC2 (as illustrated in Figure S2).Subsequently, we performed hierarchical clustering on the stable points of the network, leading to the identification of three distinct clusters: A1, A2, and A3 (as depicted in Figure 3A in the main text).This finding suggests the presence of three major degenerate attractor basins.K-Means clustering was used to obtain expression profiles for center points of each steady state type (see Table S3).
Table S3: Expression profiles of the 55 endogenous agents corresponding to the centroids of each attractor obtained from K-Means clustering.Explore the classification of samples using hierarchical clustering and K-Means clustering methods.Next, hierarchical clustering was performed on the experimental data from Ruijin Hospital to cluster the samples.The clustering results are shown in main text Figure 4.The results indicate that using unsupervised hierarchical clustering, all samples from Ruijin Hospital were also clustered into three groups, with the normal samples forming one group and the abnormal tumor samples forming two types.This suggests that these tumor samples can be classified into two different subtypes of gastric cancer, which is consistent with the computational results obtained from the network constructed in this study.
The classification of samples was explored using hierarchical clustering and K-Means clustering methods.Firstly, hierarchical clustering was performed on the experimental data obtained from Ruijin Hospital to group the samples.The clustering results are presented in Figure 4 of the main text.The findings indicate that through unsupervised hierarchical clustering, all samples from Ruijin Hospital were effectively clustered into three groups.Specifically, the normal samples formed one distinct group, while the abnormal tumor samples were divided into two subtypes.This observation suggests that these tumor samples can be classified into two different subtypes of gastric cancer, which aligns with the computational results obtained from the constructed network in this study.Then, K-Means clustering is performed on the samples to validate the results of hierarchical clustering, as shown in Figure 4B(Main text).The expression profiles of each cluster center are listed in Table S4.Comparing the expression profiles of each cluster obtained from hierarchical clustering (Main text Figure 4A) with that obtained from K-Means clustering (Table S4), it is evident that the leftmost cluster from hierarchical clustering, which represents the normal samples, is consistent with the expression profile of the third cluster center obtained from K-Means clustering.The middle and rightmost clusters from hierarchical clustering, representing tumor samples, are consistent with that of the first and second cluster centers from K-Means clustering respectively.Thus the chip data from gastric cancer and normal gastric mucosa samples from Ruijin Hospital can be divided into three major groups (attractors).One represents the normal samples, and two from tumor samples map to the two subtypes of gastric cancer.
Subsequently, K-Means clustering was conducted on the samples to validate the results obtained from hierarchical clustering.The outcomes of K-Means clustering are illustrated in Figure 4B(Main text).The expression profiles of each cluster center are provided in Table S4.By comparing the expression profiles of each cluster derived from hierarchical clustering (Figure 4A) with those obtained from K-Means clustering (Table S4), it becomes evident that the leftmost cluster from hierarchical clustering, representing the normal samples, corresponds to the expression profile of the third cluster center obtained from K-Means clustering.Likewise, the middle and rightmost clusters from hierarchical clustering, representing tumor samples, correspond to the first and second cluster centers from K-Means clustering, respectively.Consequently, the chip data obtained from gastric cancer and normal gastric mucosa samples at Ruijin Hospital can be categorized into three major groups (attractors).One group represents the normal samples, while the other two groups, derived from tumor samples, correspond to the two subtypes of gastric cancer.
K-Means clustering was used to obtain expression profiles for center points of each samples type (see Table S4).The data is linearly scaled to [0, 1] and represented by a red-blue gradient indicating high/low expression levels.The first column represents the gene, while S1, S3, and S2 represent the centroids of the first, second, and third attractor basins, respectively.
The expression profiles of the attractor centers computed by the network model were compared with the expression profiles of the attractor centers obtained from the samples collected at Ruijin Hospital.The comparison revealed conformity rates of 96.36%, 60.00%, and 89.09% for the three attractors, as presented in Table S5.It is important to consider several factors in evaluating the degree of agreement between the model and clinical data at the molecular level.Firstly, the simplicity of the core network, which consists of only 55 nodes, highlights the robustness and efficiency of the model.Secondly, it is crucial to acknowledge potential errors in the measurement of gene expression data, as these can introduce variability in the results.Lastly, the presence of heterogeneity among cancer samples further emphasizes the need for careful interpretation of the findings.Given these considerations, the level of concordance observed between the model and clinical data warrants further simulation and analysis.This approach can serve as a valuable "dry experiment" platform for exploring molecular-targeted therapy in cancer, significantly reducing the search space for subsequent wet experiments.3. Analysis of gastric cancer and normal samples from TCGA Database.
The same method was adopted for analysis TCGA data, the final comparison results are shown here.The first column represents the 55 factors in the network.The second column displays the expression profile of the center for the first attractor domain, predicted by the model.The third column shows the expression profile of the attractor center for the second attractor, using samples from TCGA.The fourth column represents the absolute difference between the second and third columns, with values greater than 0.5 marked in red (indicating non-matching) and values less than 0.5 marked in blue (indicating matching).The seventh and tenth columns represent the absolute differences in the expression profiles of two additional pairs of attractor centers between the two datasets.The last row represents the matching rates of each attractor at the molecular level.
4. Analysis of all samples from GEO Database.
We compared the calculated results of VEGF=0 with a set of experimental data from GEO Datasets (GSE160613), Gene expression data of human gastric cancer MKN45 cells xenograft tumors treated with anti-VEGFR2 and anti-VEGF-A antibodies, the final comparison results are shown here.
Table S7: A comparison of simulation results of anti-VEGF with GEO data.
D. Simulation results after perturb the network.

Figure
Figure S1: A K-Means clustering of Network1.B HeatMap of Network1, The x-axis represents different stable points, while the y-axis represents different genes.
Figure S2: A comparison of steady-state dimension reduction before and after PCA analysis.The x-axis represents different steady states, while the y-axis represents genes.
a): Hierarchical clustering of gastric cancer and normal samples b): PCA dimensionality reduction results of gastric cancer and normal samples Figure S3: Dimensionality reduction of different samples collected in the clinical setting.The x and y coordinates in the figure represent different samples and genes, respectively.The green color bar represents normal gastric mucosa samples, and the red color bar represents tumor samples, the same applies to subsequent figures.

Figure
Figure S4 Hierarchical clustering heatmap of anti-VEGF treatment.After inhibiting VEGF, the left and middle clusters show apoptosis modules activated and growth factor modules opened, indicating that cells are still able to evade apoptosis and have unrestricted cell proliferation.

Figure
Figure S5 Hierarchical clustering heatmap of inhibiting Caspase3 and activating NF-κB.After inhibiting Caspase3 and activating NF-κB, the right cluster shows the apoptosis module open and the immune module closed, indicating that cells cannot evade apoptosis and have restricted cell proliferation, and the immune response is weakened. )

Table S4 :
Expression profiles corresponding to the centroids in the samples via K-means clustering.

Table S5 :
A comparison of self-emerged steady states of the network to the gastric cancer and normal samples collected from Ruijin Hospital at the molecular level.

Table S6 :
A comparison of self-emerged steady states of the network to the gastric cancer and normal samples collected from TCGA data.