Introduction

Hepatocellular carcinoma (HCC) remains the third leading cause of tumor-related death worldwide1. Chronic hepatitis B-induced cirrhosis leading to HCC is the most common progression pattern in liver cancer. The other three leading risk factors for HCC are alcohol consumption, chronic hepatitis C and non-alcoholic fatty liver disease2. Although great strides have been made in early diagnosis, surgical technology3, targeted treatment4,5 and immunotherapy6,7, the high rates of recurrence and mortality remain a challenge. This is because most patients present with unresectable lesions or distant metastasis at the time of diagnosis. In such cases, prognosis is poor: the 5-year overall survival is only 10–18%8,9,10. Therefore, a novel prognostic prediction model is needed to aid patient evaluation and treatment optimization and, possibly, to improve patient outcomes.

Matrix metalloproteinases (MMPs), a family of zinc-dependent endoproteases, are significantly associated with extracellular matrix degradation through protein denaturation, which plays a vital role in apoptosis, angiogenesis, and immune response11,12,13,14 in the tumor microenvironment. MMP-2 and MMP-9 are the two most common progression markers correlated with invasion and metastasis in various tumors, especially HCC15,16. Although MMP1 has been more commonly reported to be expressed in non-neoplastic liver tissues17, it has also been shown to be associated with invasion and migration in HCC through extracellular matrix (ECM) degradation during the epithelial-mesenchymal transition (EMT)18. Under normal conditions, MMP1 is expressed at a low positive rate in a wide range of cells, including stromal fibroblasts, macrophages, endothelial cells and epithelial cells. However, its expression can be elevated in malignant tumors with poor prognosis (such as ovarian, liver, lung, gastric, colorectal, and prostate cancers)19,20,21,22,23,24. While some studies have reported a relationship between MMP1 and HCC, its specific role in prognosis and in the associated tumor immunity is still unclear.

Furthermore, tumor-infiltrating immune cells (TIICs) and the tumor-associated immune microenvironment are currently key areas of interest for researchers25,26. Immune-related cells and genes may respond to tumor progression and metastasis through a multitude of pathways and interactions in HCC. Suppression of the HCC immune microenvironment facilitates immune tolerance and escape through a number of mechanisms27. MMPs play an important role in promoting bladder cancer metastasis through the B cell-induced signaling pathway28, and their upregulation by tumor-associated macrophages contributes to tumor infiltration and metastasis in various carcinomas29,30,31, indicating their potential involvement in the tumor-immune microenvironment. However, which immune cells MMP1 influences, and how it shapes the underlying microenvironment, still needs to be explored.

In this study, we carried out a comprehensive investigation of the prognostic potential of MMP1 and its relationship with immune-related cells and genes in HCC.

Materials and methods

Genome structure analysis and overview of the mechanisms

We obtained the genome annotations of the MMP1 gene from the University of California Santa Cruz (UCSC) genome browser (http://genome.ucsc.edu/) on the Human Dec. 2013 (GRCh38/hg38) assembly32.

After searching through the relevant literature, we summarized and outlined the pathologic pathways and mechanisms mediated by MMP1 in different disorders and cancers based on present cell- or animal-experimental evidence.

Gene expression analysis

The Tumor Immune Estimation Resource (TIMER), version 2.0 database (http://timer.comp-genomics.org), incorporating 10,009 samples across 23 cancer types from TCGA, is a comprehensive web resource for the systematic evaluation of differential gene correlation and the clinical relevance of tumor-immune infiltrates33,34,35. We selected the “Gene_DE” module to explore the difference in MMP1 expression level between tumor and adjacent normal tissues for an array of carcinomas or their sub-types.

We downloaded the liver hepatocellular carcinoma (LIHC) data from the TCGA database and conducted a differential expression analysis of MMP1 based on 50 paired samples.

The Gene Expression Profiling Interactive Analysis (GEPIA) database (http://gepia.cancer-pku.cn/index.html), a public web server for tumor and normal gene expression profiling and interactive analysis based on data from the Genotype-Tissue Expression (GTEx) and TCGA databases36, was used to analyze differences in MMP1 expression in unpaired samples and across the pathologic stages of LIHC. Expression values were log-scaled as log2(TPM + 1), where TPM is transcripts per million.
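
For illustration only, the log-scaling described above can be sketched in Python. The function name and the example values are hypothetical and are not taken from the study, which relied on GEPIA's built-in scaling rather than custom code.

```python
import math


def log2_tpm_plus_one(tpm_values):
    """Return log2(TPM + 1) for each transcripts-per-million value.

    The +1 pseudocount keeps genes with TPM = 0 defined (they map to 0).
    """
    transformed = []
    for tpm in tpm_values:
        if tpm < 0:
            raise ValueError("TPM values cannot be negative")
        transformed.append(math.log2(tpm + 1.0))
    return transformed


# Hypothetical TPM values chosen so the results are exact powers of two.
example_tpm = [0.0, 1.0, 3.0, 1023.0]
print(log2_tpm_plus_one(example_tpm))
```

The pseudocount compresses the dynamic range while leaving unexpressed genes at zero, which is why this scale is common for expression box plots.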

We included two gene expression datasets related to HCC (GSE14520 (n = 488) and GSE25097 (n = 557)) to conduct differential expression validation analysis of MMP1.

Genetic alteration analysis

Utilizing cBioPortal37,38,39, which is an open platform supporting multidimensional cancer genomics data, we performed a visualized genetic alteration analysis of MMP1. We selected “Quick select: TCGA PanCancer Atlas Studies” on the home page and submitted a query for the genetic variation characteristics of “MMP1”. Data on alteration frequency, structural variants, mutations and copy number alterations (CNA) were extracted. Next, we obtained a summary of the mutated sites of MMP1, shown in the pattern chart and the three-dimensional (3D) plot of protein structure, via the “Mutations” module. We then re-selected “Liver Hepatocellular Carcinoma (TCGA, PanCancer Atlas)” on the home page to conduct a survival analysis of overall survival (OS), disease-free survival (DFS), progression-free survival (PFS) and disease-specific survival (DSS) with/without MMP1 gene variation via the “Comparison/Survival” module. The expression difference between wild-type (WT) and mutated MMP1 was compared via the “Gene_Mutation” module of TIMER 2.0.

DNA methylation and gene enrichment analysis

MEXPRESS40,41 provides open visualization of DNA methylation, expression and clinical data, as well as statistical analysis using the Pearson correlation coefficient and the Benjamini–Hochberg method. Using this tool, we analyzed DNA methylation at numerous probes of the MMP1 gene (e.g., cg25320665, cg14543953, etc.) in LIHC.

We used the STRING database42, which contains data on functional protein association networks, to construct an MMP1-related protein–protein interaction (PPI) network. The main parameters were set as follows: protein name (“MMP1”), organism (“Homo sapiens”), meaning of network edges (“evidence”), active interaction sources (“Experiments”), minimum required interaction score [“low confidence (0.150)”] and max number of interactors to show [“1st shell: no more than 50 interactors”].

Through differential expression analysis, we acquired data for MMP1-related/similar genes using the DESeq2 package43 (version 1.26.0) based on TCGA-LIHC in R (version 3.6.3)44. Next, we selected the genes with log2FoldChange (log2FC) > 2 or < -2 and p value < 0.05 for Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. The results were visualized using the “ggplot2” (version 3.3.3) and “clusterProfiler” (version 3.14.3) packages in R45. Functional analysis combined with log2FC was performed via R’s “GOplot” package (version 1.0.2)46. In detail, cnetplots (node_label = T, colorEdge = T, circular = F) were generated separately for biological process (BP), cellular component (CC) and molecular function (MF).
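
The thresholding step can be illustrated with a short Python sketch. This is not the authors' pipeline, which ran in R. The gene labels, the dictionary keys and the function name below are hypothetical, and only the two cutoffs (log2FC beyond ±2 and p value < 0.05) come from the text.

```python
def filter_degs(rows, lfc_cutoff=2.0, p_cutoff=0.05):
    """Keep rows whose |log2FC| exceeds lfc_cutoff and whose p value is below p_cutoff.

    Each row is a dict with the (assumed) keys "gene", "log2FC" and "pvalue".
    """
    kept = []
    for row in rows:
        strong_change = abs(row["log2FC"]) > lfc_cutoff
        significant = row["pvalue"] < p_cutoff
        if strong_change and significant:
            kept.append(row)
    return kept


# Hypothetical differential-expression results, for illustration only.
results = [
    {"gene": "GENE_A", "log2FC": 3.1, "pvalue": 0.001},   # kept (up)
    {"gene": "GENE_B", "log2FC": -2.4, "pvalue": 0.020},  # kept (down)
    {"gene": "GENE_C", "log2FC": 1.5, "pvalue": 0.001},   # change too small
    {"gene": "GENE_D", "log2FC": 4.0, "pvalue": 0.200},   # not significant
]
print([row["gene"] for row in filter_degs(results)])
```

Both strict inequalities are an assumption about how boundary values were handled; the text does not say whether a log2FC of exactly 2 was retained.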

Clinical correlation and survival analysis

Clinical data of 374 cases were extracted from TCGA-LIHC and cleansed. Based on MMP1 expression, we divided the patients into two groups (low expression (0–50%) vs high expression (50–100%)). We conducted a clinical correlation analysis between MMP1 expression and numerous clinical indexes (e.g., vascular invasion, pathologic stage, tumor status, etc.). We also evaluated the baseline characteristics of patients from TCGA-LIHC. A logistic regression model of the related characteristics was fitted and odds ratios (OR) were calculated.
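
A minimal Python sketch of the grouping and of an odds ratio is given below, under stated assumptions. The study fitted logistic regression models in R; for a single binary characteristic, the odds ratio from such a model equals the cross-product ratio of the 2 × 2 table, which is what the helper computes. All names and counts are hypothetical, and placing values equal to the median in the low group is also an assumption.

```python
from statistics import median


def split_by_median(expression):
    """Label each sample "high" if above the cohort median, else "low"."""
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio of a 2 x 2 table.

    a: characteristic present, high expression
    b: characteristic present, low expression
    c: characteristic absent, high expression
    d: characteristic absent, low expression
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical expression values and counts, for illustration only.
print(split_by_median([1.0, 5.0, 2.0, 8.0]))
print(odds_ratio(30, 20, 15, 35))
```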

We retrieved the corresponding prognostic data47 as supplements to conduct prognostic analysis of OS, DSS and progression-free interval (PFI). The prognostic data for PFS were obtained from the Kaplan–Meier Plotter database48 (http://kmplot.com/analysis), which assembles gene microarray and RNA-seq data from the Gene Expression Omnibus (GEO)49, European Genome-phenome Archive (EGA)50 and TCGA public databases. We conducted a series of subgroup OS analyses to identify the high-risk factors related to MMP1 expression and prognosis. All Kaplan–Meier curves, hazard ratios (HR) and confidence intervals (CI) were obtained via R’s “survival” and “survminer” packages.
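
To make the Kaplan–Meier estimate concrete, the product-limit calculation can be sketched in plain Python. This is an illustrative reimplementation, not the R code used in the study, and the follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time of each patient
    events: 1 if the event (e.g. death) occurred at that time, 0 if censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((current_time, survival))
        # Events and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve


# Hypothetical cohort: the patient followed to time 2 is censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored patients contribute to the risk set until their last follow-up but never lower the curve themselves, which is the property that distinguishes this estimate from a simple proportion of survivors.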

We also performed univariate and multivariate analyses of the OS, DSS and PFI of TCGA-LIHC using the log-rank test or the Cox regression model.

Establishment of MMP1-related prognostic model

Based on the TCGA-LIHC data, we initially evaluated the diagnostic potential of MMP1 via a receiver operating characteristic (ROC) curve. According to the previous analysis, we established an MMP1-related nomogram prognostic model involving 6 clinical indicators (tumor status, T stage, M stage, pathologic stage, age and histologic grade) to predict the 2-, 3- and 4-year OS probability of HCC patients. Next, we conducted time-dependent ROC curve analysis, decision curve analysis (DCA) and prognostic calibration analysis to verify the reliability and accuracy of the model.
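
The area under a diagnostic ROC curve has a simple rank interpretation: it is the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample, with ties counted as one half. The Python sketch below computes it that way. It only illustrates the quantity and is not the authors' R workflow; the scores are hypothetical.

```python
def roc_auc(scores_positive, scores_negative):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical expression scores for tumor (positive) and normal (negative) samples.
tumor_scores = [3.0, 4.0, 5.0]
normal_scores = [1.0, 2.0, 3.0]
print(round(roc_auc(tumor_scores, normal_scores), 3))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which gives a scale for reading reported values.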

Experimental validation of MMP1 expression

We conducted Western blotting (WB) and real-time quantitative PCR (RT-qPCR) to determine MMP1 expression level in HCC. The procedures were as follows:

Sample inclusion: 108 patients with pathologically diagnosed HCC at Ningbo University affiliated Lihuili hospital, Ningbo, from 2012 to 2020, were included in this study. We obtained paired samples of tumor and adjacent (normal) tissues from HCC patients by surgical resection and stored them in liquid nitrogen.

WB: We randomly selected 20 of the samples. Cells were lysed in ice-cold radioimmunoprecipitation assay (RIPA) cell lysis buffer supplemented with phenylmethanesulfonyl fluoride (PMSF) (Beyotime Biotechnology, Shanghai, China). Protein was extracted and quantified using a BCA protein assay kit (Beyotime Biotechnology, Shanghai, China). Equal amounts of protein were separated by 10% sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). The target bands were then cut and electro-transferred to polyvinylidene difluoride (PVDF) membranes (Millipore, MA, USA). Membranes were blocked with 5% fat-free milk in Tris-buffered saline containing 0.05% Tween-20 at room temperature for 1 h, followed by incubation with anti-MMP1 (1:5000; cat. no. ab38929; Abcam) and anti-GAPDH (1:10,000; cat. no. ab8245; Abcam) overnight at 4 °C. Corresponding secondary antibodies were co-incubated at 4 °C for 1 h, followed by dilution in PBS. The protein bands were photographed and semi-quantified with the AlphaView analysis system (ProteinSimple, CA, USA).

RT-qPCR: Total RNA was extracted from samples using TRIzol® LS (Invitrogen, CA, USA). cDNA was synthesized with the PrimeScript™ RT reagent kit with gDNA Eraser (cat#RR047A; TAKARA-bio) and amplified with the SYBR® Premix Ex Taq™ II kit (cat#RR820A; TAKARA-bio) on the ABI PRISM® 7500 Sequence Detection System (Applied Biosystems, CA, USA), according to the manufacturers’ protocols. The primer sequences (all purchased from Sangon Biotech (Shanghai) Co., Ltd.) were as follows: MMP1 (RefSeq: NM_002421.4): Fwd 5’-ATGCGAACAAATCCCTTCTACC-3’; Rev 5’-TTTCCTCAGAAAGAGCAGCATCG-3’; β-actin (RefSeq: NM_001101.5): Fwd 5’-CCTTCCTGGGCATGGAGTCCTG-3’; Rev 5’-GGAGCAATGATCTTGATCTTC-3’. The mixture was incubated at 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s and 60 °C for 34 s. The 2^(-ΔΔCT) method51 was applied for semiquantitative gene expression analysis, with β-actin as the normalization control.
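
The 2^(-ΔΔCT) calculation can be written out in a few lines of Python. This is a sketch for illustration: the function name and the Ct values are hypothetical, and it assumes that the adjacent normal tissue serves as the calibrator and β-actin as the reference gene, as the procedure above suggests.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of the target gene by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference), computed per tissue
    ddCt = dCt(sample) - dCt(calibrator)
    fold = 2 ** (-ddCt)
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: tumor (sample) vs adjacent normal tissue (calibrator).
fold_change = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,           # dCt = 6
    ct_target_calibrator=27.0, ct_ref_calibrator=18.0,   # dCt = 9
)
print(fold_change)  # ddCt = -3, so the target is 2**3 = 8-fold higher in tumor
```

The formula assumes roughly 100% amplification efficiency for both primer pairs, so that each cycle corresponds to a doubling.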

Clinical verification of MMP1-related prognostic model

Clinical data of the 108 patients described above were retrieved. We evaluated the baseline characteristics and conducted survival analysis to determine OS and PFS. Using the same clinical indicators, we conducted univariate and multivariate analyses of OS and built the MMP1-related nomogram prognostic model. Corresponding validation by time-dependent ROC curve, DCA and prognostic calibration analysis was subsequently performed.

Correlation analysis of MMP1 and tumor-immune microenvironment in LIHC

We conducted a correlation analysis of MMP1 expression with 24 immune-related cells52, as well as their infiltration levels in LIHC, using Spearman’s test53. A similar analysis of other TIICs was visualized via TIMER 2.0 with different algorithms. We also explored the clinical relevance of tumor immune subsets. Relationships between MMP1 and the expression of three kinds of immunomodulators54 across different cancers based on TCGA were visualized using heatmaps. After filtering out the genes with p > 0.05, we conducted Lasso regression and prognostic risk factor prediction for immuno-inhibitors, immune-stimulators and major histocompatibility complex (MHC) molecules.
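
Spearman's rank correlation is the Pearson correlation computed on ranks, with tied values sharing their average rank. A plain-Python sketch follows. It is illustrative only and is not the implementation used in the study; the example vectors are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gene expression vs immune-cell enrichment scores.
print(spearman([1.0, 2.0, 3.0], [1.0, 3.0, 2.0]))
```

Because only ranks enter the calculation, the coefficient is insensitive to monotone transformations such as log-scaling of expression values.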

Evaluation of therapy

We investigated the correlation between MMP1 expression levels and the tumor mutational burden (TMB) and microsatellite instability (MSI) across cancers using Sangerbox tools (http://www.sangerbox.com), a free online platform for data analysis. Immunotherapy analysis and drug evaluation were performed via the tumor-immune system interaction database (TISIDB)55, which integrates multiple heterogeneous data types. The immunophenotyping score (IPS) was included to further estimate the correlation between MMP1 and immunotherapy.

Statistical analysis

All statistical analyses and graphing were performed using R (version 3.6.3). Normally distributed variables were analyzed using the t-test and one-way ANOVA. Non-normally distributed variables were analyzed using nonparametric tests. The log-rank test and Cox regression were used for survival analysis, and Pearson’s correlation and Spearman’s rank correlation tests for correlation analysis. A p value < 0.05 was considered statistically significant. Correlation strength was defined as follows: 0.00–0.10 (negligible), 0.10–0.39 (weak), 0.40–0.69 (moderate), 0.70–0.89 (strong), 0.90–1.00 (very strong)56.
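
The strength categories can be expressed as a small lookup, sketched here in Python. Two points are assumptions rather than statements from the text: the categories are applied to the absolute value of the coefficient, and a value exactly on a shared boundary (e.g. 0.10) is assigned to the stronger category.

```python
def classify_correlation(r):
    """Map a correlation coefficient to the verbal strength category."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    strength = abs(r)  # assumption: sign is ignored when grading strength
    if strength < 0.10:
        return "negligible"
    if strength < 0.40:
        return "weak"
    if strength < 0.70:
        return "moderate"
    if strength < 0.90:
        return "strong"
    return "very strong"


# Hypothetical coefficients, for illustration only.
for coefficient in (0.05, -0.25, 0.45, 0.95):
    print(coefficient, classify_correlation(coefficient))
```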

Ethical approval

All procedures in the study involving human participants were approved by the research ethics committee of Lihuili hospital affiliated to Ningbo University (Approval no. KY2021PJ082), and written informed consent was obtained from each patient prior to enrollment. This study was performed in compliance with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Results

Overview of MMP1

The study design is detailed in the flow chart in Fig. 1. The study aimed to perform a multidimensional assessment of the tumorigenic role and prognostic potential of human MMP1 (NM_002421.4 for mRNA or NP_002412.1 for protein, Fig. 2A). We refined and summarized the MMP1 mechanisms and pathways published in recent years (Fig. 2). MMP1 is a zinc-dependent endopeptidase that degrades multiple substrates (e.g., type I, II, III and VII collagen, pro-TNF, pro-MMP2, etc.) in the tumor microenvironment57 (Fig. 2B). According to the studies reported so far, most experimentally inferred mechanisms and pathways indicate that up-regulation of MMP1 expression leads to tumor progression, involving the down-regulation of circ DLC158, capicua (CIC)59 and MTAP60, and the up-regulation of 14-3-3σ61, MPP362 and c-Jun63. Conversely, Praeruptorin A was reported to reduce metastasis of HCC via the ERK/MMP1 signaling pathway64 (Fig. 2C).

Figure 1

Study design flow chart. TCGA, the cancer genome atlas; qPCR, quantitative Polymerase Chain Reaction; WB, western blot; ROC, receiver operating characteristic curve.

Figure 2

MMP1 gene information, mechanisms and pathways. (A) Genomic location and structural characteristics of human MMP1; (B) Potential mechanisms and pathways of MMP1; (C) The reported pathogenic pathways mediated by MMP1 in multiple cancers with relevant references included.

Comparative analysis of MMP1 expression

We compared MMP1 expression levels between tumor and normal tissues in all cancer types of TCGA via TIMER2.0. As shown in Fig. 3A, the MMP1 expression level in the tumor tissues of LIHC was much higher than in the adjacent normal tissues (P < 0.001). We proceeded to conduct a Wilcoxon signed-rank test on 50 matched sample pairs in TCGA (Fig. 3B) and included data on normal tissues from the GTEx database as controls (with batch effects corrected via the TOIL method) to further evaluate the difference in MMP1 expression between tumor and normal tissues of LIHC (Fig. 3C). MMP1 expression was consistently higher in tumor than in normal tissues in LIHC (P < 0.001). Based on the GEO database, it was further verified that MMP1 expression was higher in tumor tissues than in normal tissues in GSE14520 (P < 0.001) (Fig. S1A) and GSE25097 (P < 0.05) (Fig. S1B). We also investigated the differential expression level of MMP1 in each pathological stage of LIHC using the “pathological stage plot” module of GEPIA2.0. Significant differences were observed (F value = 7.37, P = 8.48e-05) (Fig. 3D).

Figure 3

MMP1 expression levels in different tumors, tissues and pathological stages. (A) Analysis of MMP1 expression level in different tumors or their subtypes and corresponding normal tissues via TIMER2.0. (B) Paired comparison of MMP1 expression level in LIHC. (C) Unpaired comparison of MMP1 expression level in LIHC by including the relevant normal tissues of the GTEx database as controls. (D) Analysis of MMP1 expression in different pathological stages of LIHC. LIHC, liver hepatocellular carcinoma. *P < 0.05, **P < 0.01, ***P < 0.001.

Genetic alteration analysis

Using cBioPortal, we conducted a comprehensive analysis of the genetic alteration status of MMP1 across different cancers in the TCGA database. Although the alteration frequency of MMP1 in LIHC was lower than 2%, amplification was the dominant alteration type (Fig. 4A). Mutation was the only other type of MMP1 genetic variation observed in LIHC. Detailed information on the mutation sites, types and frequencies of MMP1 is shown in Fig. 4B. Missense mutation was the most common form (82/94, 87.23%), and the P412S/H/L mutation in the hemopexin domain was detected in 2 cases of skin cutaneous melanoma (SKCM), 1 case of lung adenocarcinoma (LUAD) and 1 case of head and neck squamous cell carcinoma (HNSC) (Fig. 4B). This missense change replaces proline (P) with serine (S), histidine (H) or leucine (L) at residue 412 of the MMP1 protein. The 3D structure of MMP1 with the mutated site is presented in Fig. 4C. Subsequently, we assessed the expression difference between wild-type (WT) and mutated MMP1 (P = 0.6, Fig. 4D), as well as the relationship between MMP1 alteration and prognosis in LIHC. As demonstrated in Fig. 4E, there was no statistically significant difference in OS, DSS, PFS or DFS between LIHC patients with and without MMP1 alteration (all P > 0.05).

Figure 4

Genetic variation features of MMP1 in different cancers based on TCGA via cBioPortal. (A) The mutation frequency and mutation types in diverse cancers; (B) Potential sites of mutation; (C) Mutation site with the highest frequency (P412S/H/L) in the 3D structure of MMP1 and related carcinomas; (D) Comparison of expression level between mutated and WT MMP1; (E) Impact of MMP1 alteration on OS, DSS, DFS, and PFS in liver hepatocellular carcinoma (LIHC) via cBioPortal. WT, wild type; OS, overall survival; DSS, disease-specific survival; DFS, disease-free survival; PFS, progression-free survival; *P < 0.05.

DNA methylation and gene enrichment analysis

With the aid of MEXPRESS, we investigated the potential correlation between MMP1 DNA methylation and tumorigenesis in LIHC. Despite the limited methylation data, we were still able to observe significant correlations at several probes. A significant negative correlation at probe cg14543953 in the promoter region (r = -0.144, P < 0.01) and a positive correlation at probe cg25320665 in the non-promoter region (r = 0.173, P < 0.001) were observed (Fig. 5A).

Figure 5

MMP1 DNA methylation in TCGA-LIHC and gene enrichment analysis. (A) Correlation between MMP1 DNA methylation and LIHC expressed as beta value, Pearson correlation coefficients (R) and Benjamini-Hochberg-adjusted P value; (B) MMP1-related functional protein association network with experimental determination via the STRING database; (C) KEGG pathway analysis of MMP1-related genes; (D) GO analysis of MMP1-related genes; (E) Cnetplots for molecular function (MF), cellular component (CC) and biological process (BP) data.

To further explore the molecular mechanism of MMP1 in tumorigenicity, we constructed MMP1-related PPI networks and conducted enrichment analyses of the signaling pathways. A total of 50 MMP1-binding proteins supported by experimental evidence were acquired, and their interaction network was charted via the STRING tool (Fig. 5B). By screening out genes with low correlations, we obtained 300 highly correlated genes (log2FC > 2 or < -2 and p value < 0.05) for KEGG and GO enrichment analysis. We acquired 1 dataset of KEGG, 9 datasets of MF, 1 dataset of CC and 18 datasets of BP under the threshold conditions (adjusted P < 0.05 and q value < 0.2). The KEGG plot suggested that “neuroactive ligand-receptor interaction” (P < 0.001) might be the main pathway involved in MMP1 tumorigenicity, as well as “catenin complex” (P = 0.004) of CC. Based on the GO enrichment analysis, “hormone secretion”, “cell–cell adhesion mediated by cadherin” and “cell–cell adhesion via plasma-membrane” (all P = 0.006) of BP and “receptor ligand activity” (P = 0.001), “hormone activity” (P = 0.001) and “neurotransmitter receptor activity” (P = 0.005) of MF were predicted to be closely connected to MMP1 (Fig. 5C). We included the log2FC in a conjoint analysis to draw a chord diagram showing highly related genes in the KEGG/GO datasets. Most of them had a significant positive correlation with MMP1, while four had a negative correlation: INS-IGF2, WNT3A, SAA2 and BMP10 (Fig. 5D). Upon drilling down into the GO analysis, several potential signaling pathways were also predicted, such as digestion, collagen catabolic process and extracellular matrix disassembly of BP, zymogen granule membrane of CC and serine-type endopeptidase activity of MF (Fig. 5E).

Clinical correlation and Survival analysis

A total of 374 cases of LIHC from TCGA were divided equally into two groups according to the expression level of MMP1 (low (0–50%) vs high (50–100%)). The patients’ background and baseline characteristics are summarized in Table 1. There were significant differences between the groups with respect to T stage (P < 0.001), pathologic stage (P = 0.012), tumor status (P = 0.013), histologic grade (P = 0.01) and vascular invasion (P = 0.007). There was no significant difference in other clinical characteristics (Table 1). We evaluated the correlation between MMP1 expression and clinical indicators one by one and identified 6 associated risk factors. MMP1 expression was higher in patients with AFP > 400 ng/ml than in those with AFP ≤ 400 ng/ml, as well as in patients with vascular invasion than in those without (all P values < 0.05). As for histologic grade, higher expression of MMP1 was only observed in G3&G4 compared to G1 (P < 0.05). There was a significant difference between pathologic stage III and stage I (P < 0.001). Tumor-bearing patients had a higher MMP1 expression level than tumor-free patients (P < 0.001). T stage might be a highly sensitive clinical indicator correlated with MMP1, since there was a significant difference in expression level between each pair of stages (T1 vs T2, P < 0.05; T1 vs T3, P < 0.01; T1 vs T4, P < 0.001; T2 vs T4, P < 0.01; T3 vs T4, P < 0.05) except T2 vs T3 (Fig. 6A). Furthermore, we included these 6 indicators in a logistic regression analysis. T stage (Odds Ratio (OR) = 1.985 (1.316–3.009), P = 0.001), vascular invasion (OR = 1.952 (1.225–3.127), P = 0.005), histologic grade (OR = 2.072 (1.151–3.833), P = 0.017), tumor status (OR = 1.754 (1.149–2.688), P = 0.009) and pathologic stage (OR = 1.862 (1.221–2.853), P = 0.004) were determined to be risk factors associated with MMP1 in LIHC patients (Table 2).

Table 1 Baseline characteristics of patients (TCGA-LIHC). Data are presented as n (%).
Figure 6

Clinical correlation and prognosis analysis of MMP1 expression in TCGA-LIHC. (A) Correlation analysis between multiple clinical indicators and MMP1 expression; (B) Prognosis analysis of MMP1 expression in TCGA-LIHC; (C) A series of subgroup analyses on OS of MMP1 expression in TCGA-LIHC. OS, overall survival, *P < 0.05, **P < 0.01, ***P < 0.001.

Table 2 Logistic regression model of MMP1 (TCGA-LIHC).

We used data from TCGA and the Kaplan–Meier plotter to investigate the prognostic potential of MMP1 expression in LIHC. We found that higher expression of MMP1 was associated with a poorer prognosis in patients with LIHC (OS: HR = 1.83, 95%CI = 1.29–2.61, P = 0.001, n = 373; PFS: HR = 1.96, 95%CI = 1.46–2.63, P < 0.001, n = 370; DSS: HR = 2.36, 95%CI = 1.48–3.76, P < 0.001, n = 365; PFI: HR = 1.46, 95%CI = 1.09–1.95, P = 0.011, n = 373; Fig. 6B). We subsequently conducted subgroup survival analysis for an in-depth evaluation of the correlation between MMP1 expression and various clinicopathological factors (Table 3). Elevated MMP1 expression was associated with a worse OS in T1 stage (HR = 2.14, 95%CI = 1.17–3.19, P = 0.013), M0 stage (HR = 1.62, 95%CI = 1.05–2.51, P = 0.029), tumor-bearing (HR = 1.57, 95%CI = 1.00–2.46, P = 0.049), pathologic stage I (HR = 2.21, 95%CI = 1.17–4.17, P = 0.014), males (HR = 2.23, 95%CI = 1.41–3.52, P = 0.001), age ≤ 60 (HR = 2.07, 95%CI = 1.21–3.57, P = 0.008) or age > 60 (HR = 1.61, 95%CI = 1.01–2.56, P = 0.046), BMI ≤ 25 (HR = 2.14, 95%CI = 1.26–3.63, P = 0.005), R0 resection (HR = 1.70, 95%CI = 1.16–2.50, P = 0.006), histologic G1 (HR = 3.45, 95%CI = 1.26–9.46, P = 0.016)/G2 (HR = 1.96, 95%CI = 1.15–3.32, P = 0.013)/G3&G4 (HR = 2.20, 95%CI = 1.21–3.97, P = 0.009), AFP ≤ 400 (HR = 2.19, 95%CI = 1.29–3.70, P = 0.004), albumin ≥ 3.5 (HR = 1.86, 95%CI = 1.14–3.03, P = 0.013), prothrombin time ≤ 4 (HR = 2.30, 95%CI = 1.30–4.07, P = 0.004), Child–Pugh A (HR = 1.88, 95%CI = 1.11–3.20, P = 0.019), Ishak fibrosis score of 5/6 (HR = 2.84, 95%CI = 1.07–7.54, P = 0.036) and absence of vascular invasion (HR = 1.71, 95%CI = 1.02–2.87, P = 0.041) (Fig. 6C).

Table 3 Correlation of MMP1 expression and OS in hepatocellular carcinoma patients with different clinicopathological factors via Kaplan–Meier plotter.

Establishment and evaluation of an MMP1 based prognosis prediction model

According to the TCGA data, we first investigated the diagnostic potential of MMP1 in LIHC and obtained a ROC curve that showed an above-average performance (area under the curve (AUC) = 0.769, CI = 0.703–0.835) (Fig. 7A). Next, we included all variables in the univariate analysis with respect to OS, DSS and PFI. T3&T4 stage (P < 0.001), M1 stage (P = 0.017), pathologic stage III&IV (P < 0.001), tumor-bearing status (P < 0.001) and MMP1 expression (P < 0.001) were significantly correlated with OS (Table 4). T3&T4 stage (P < 0.001), M1 stage (P = 0.024), pathologic stage III&IV (P < 0.001), prothrombin time > 4 (P = 0.031), Child–Pugh B (P = 0.047) and MMP1 expression (P < 0.001) were significantly correlated with DSS (Table 5). T3&T4 stage (P < 0.001), M1 stage (P = 0.035), pathologic stage III&IV (P < 0.001), tumor-bearing status (P < 0.001), vascular invasion (P = 0.003) and MMP1 expression (P < 0.001) were significantly correlated with PFI (Table 6). In the multivariate analysis, tumor-bearing status (HR = 1.819, 95%CI = 1.137–2.911, P = 0.013) and MMP1 expression (HR = 1.236, 95%CI = 1.037–1.473, P = 0.018) were independent factors impacting the OS of patients with LIHC (Table 4). Child–Pugh B (HR = 3.667, 95%CI = 1.135–11.845, P = 0.030) and MMP1 expression (HR = 1.424, 95%CI = 1.020–1.987, P = 0.038) were independent factors for DSS (Table 5), as were tumor-bearing status (HR = 14.440, 95%CI = 8.484–24.580, P < 0.001), vascular invasion (HR = 1.536, 95%CI = 1.009–2.337, P = 0.045) and MMP1 expression (HR = 1.204, 95%CI = 1.003–1.446, P = 0.047) for PFI (Table 6).

Figure 7

Establishment of an MMP1 expression based prognostic model in TCGA-LIHC. (A) Diagnostic ROC curve of MMP1 expression in TCGA-LIHC; (B) An MMP1 expression-based nomogram predicting risk of TCGA-LIHC; (C) Time dependent ROC curve for verifying the utility of nomogram; (D) Decision curve analysis for evaluating the prognostic model; (E) Nomogram calibration curve.

Table 4 The univariate and multivariate analysis for the OS (TCGA-LIHC).
Table 5 The univariate and multivariate analysis for the DSS.
Table 6 The univariate and multivariate analysis for the PFI.

According to the univariate and multivariate Cox regression analyses above, MMP1 was deemed an important independent prognostic factor in addition to clinical factors. These results showed its potential to serve as a reliable and innovative biomarker for patients with LIHC. More importantly, to predict the prognosis of LIHC patients, we constructed a nomogram incorporating MMP1 and multiple clinicopathological characteristics based on the above analysis. As shown in Fig. 7B, the scores for the individual variables can be calculated and combined to predict the prognosis of patients with LIHC (C-Index = 0.680, 95%CI = 0.647–0.713). As a supplement, we employed time-dependent ROC curves to assess the accuracy of MMP1 for predicting OS, DSS and PFI in LIHC patients. The AUC values were obtained for OS (2-year: 0.642, 3-year: 0.664, 4-year: 0.658), DSS (2-year: 0.677, 3-year: 0.706, 4-year: 0.684) and PFI (2-year: 0.645, 3-year: 0.647, 4-year: 0.673) (Fig. 7C). DCA (Fig. 7D) was performed to determine the clinical utility of the nomogram (C-Index = 0.614, 95%CI = 0.577–0.650). Nomogram calibration curves (Fig. 7E) showed good predictive accuracy of the model.

Validation of clinical dataset in HCC patients

Tumor/normal samples and clinicopathological data of 108 patients diagnosed with HCC and treated with surgical resection were retrieved and analyzed. MMP1 expression was higher in tumor tissues at the protein level (Fig. 8A). Consistent with the WB result, higher expression of MMP1 in tumor than in normal tissues (P < 0.001) was corroborated by qPCR (Fig. 8B).

Figure 8

Clinical verification of the MMP1 expression-based prognostic model in HCC patients. (A) Western blot analysis of MMP1 protein expression in paired samples from HCC patients (the original uncropped images and the clipping process are shown in the supplementary materials); (B) qPCR analysis of MMP1 expression in HCC patients; (C) Prognosis analysis of MMP1 expression in HCC patients; (D) MMP1 expression-based risk prediction nomogram for HCC patients; (E) Decision curve analysis of the prognostic model; (F) Time-dependent ROC curve for assessing the utility of the nomogram; (G) Nomogram calibration curve. HCC, hepatocellular carcinoma. *P < 0.05, **P < 0.01, ***P < 0.001.

The 108 cases were divided into two groups according to MMP1 expression level: low (0–50%) and high (50–100%) expression groups. The patients’ background and baseline characteristics are summarized in Table 7. Unlike the TCGA-LIHC baseline, there were significant differences between the groups with respect to pathologic stage (P = 0.019), AFP (P = 0.020), albumin (P = 0.026) and Child–Pugh grade (P = 0.002) in the clinical HCC cohort. Consistent with the above results, the Kaplan–Meier survival curves confirmed that the low-expression group was associated with a better OS (HR = 5.41, 95%CI = 2.91–10.06, P < 0.001) and PFS (HR = 6.61, 95%CI = 3.88–11.25, P < 0.001) (Fig. 8C).

Table 7 Baseline characteristics of patients (Clinical HCC). Data are presented as n (%).

Based on univariate logistic regression, age > 60 (P = 0.038), tumor-bearing status (P = 0.002), albumin < 3.5 (P = 0.004), Child–Pugh B (P < 0.001) and MMP1 (P < 0.001) were significantly correlated with OS (Table 8). Tumor-bearing status (HR = 3.594, 95%CI = 1.429–9.036, P = 0.007) and MMP1 expression (HR = 1.524, 95%CI = 1.238–1.875, P < 0.001) were independent factors associated with poor OS, while age > 60 (HR = 0.501, 95%CI = 0.268–0.939, P = 0.031) and R1&2 resection (HR = 0.226, 95%CI = 0.073–0.701, P = 0.010) were independent protective factors. Based on the above, we constructed an MMP1-based nomogram composed of these variables to predict the 2-, 3- and 4-year survival probability of HCC patients (Fig. 8D). The model showed good accuracy for the patients in the cohort (C-Index = 0.797, 95%CI = 0.766–0.828). DCA curves showed consistent results (C-Index = 0.759, 95%CI = 0.727–0.792) (Fig. 8E). The AUC of the time-dependent ROC curve likewise showed good performance of the model (0.802, 0.752 and 0.741 at 2, 3 and 4 years, respectively) (Fig. 8F). Nomogram calibration curves are shown in Fig. 8G.
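
As background for the decision curve analysis, the quantity plotted in a decision curve is the net benefit at each threshold probability. The sketch below computes it for a binary outcome at a fixed time horizon and assumes no censoring, which is a simplification of the survival setting used here; the function name `net_benefit` and the toy values are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    predicted_risk: model-predicted probability of the event per patient
    outcome: 1 if the event occurred by the time horizon, else 0
    threshold: threshold probability, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcome)
    true_pos = sum(1 for p, y in zip(predicted_risk, outcome)
                   if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(predicted_risk, outcome)
                    if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


# Hypothetical toy cohort of four patients.
risks = [0.9, 0.8, 0.3, 0.2]
events = [1, 0, 1, 0]
for pt in (0.25, 0.5):
    print(pt, net_benefit(risks, events, pt))
```

A decision curve repeats this calculation over a range of thresholds and compares the model with the "treat all" and "treat none" strategies.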

Table 8 The univariate and multivariate analysis for the OS (Clinical HCC).

Correlation analysis between MMP1 and tumor-immune microenvironment in HCC

Tumor-infiltrating immune cells (TIICs) play an important role in the complex tumor-immune microenvironment and have been shown to influence the progression of a variety of tumors. We therefore investigated the relationship between MMP1 expression and TIICs in HCC. Twenty-four immune-related cell types were first included in the correlation analysis (Fig. 9A). As a supplement, a circle heatmap of 9 additional immune-related cell types was generated via TIMER2.0 (Fig. 9B). A variety of algorithms (e.g. XCELL, ssGSEA, MCPcounter, TIDE, EPIC and CIBERSORT) were applied for cross-verification (Fig. S2–S4, S5A). As shown in Fig. 9C and 9D, high MMP1 expression was negatively correlated with dendritic cells (DC) (r = -0.187, P < 0.001), T gamma delta cells (Tgd) (r = -0.143, P = 0.006), Th17 cells (r = -0.250, P < 0.001), common myeloid progenitors (CMP) (r = -0.133, P < 0.05), endothelial cells (EC) (r = -0.283, P < 0.001), granulocyte-monocyte progenitors (GMP) (r = -0.214, P < 0.001) and hematopoietic stem cells (HSC) (r = -0.375, P < 0.001), and positively correlated with activated DC (aDC) (r = 0.166, P = 0.001), NK CD56bright cells (r = 0.141, P = 0.006), macrophages (r = 0.144, P = 0.005), Th1 cells (r = 0.154, P = 0.003), T follicular helper cells (TFH) (r = 0.168, P = 0.001), T helper cells (r = 0.214, P < 0.001), Th2 cells (r = 0.399, P < 0.001), CD4+ T cells (r = 0.116, P < 0.05), monocytes (r = 0.342, P < 0.001), cancer-associated fibroblasts (CAF) (r = 0.227, P < 0.001), common lymphoid progenitors (CLP) (r = 0.31, P < 0.001) and myeloid-derived suppressor cells (MDSCs) (r = 0.421, P < 0.001).
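
The r values above are rank correlations between MMP1 expression and an estimated infiltration score. A minimal sketch of Spearman's coefficient is given below for illustration; it is not the code used in this study, the helper names `average_ranks` and `spearman_rho` are hypothetical, and the P value calculation is omitted.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: MMP1 expression vs. an infiltration score.
mmp1 = [1, 2, 3, 4, 5]
infiltration = [10, 20, 15, 40, 50]
print(spearman_rho(mmp1, infiltration))  # approximately 0.9
```

Because only ranks enter the calculation, the coefficient is insensitive to monotone transformations of the expression values, such as a log transform.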

Figure 9

Multiple immune-related cells—MMP1 expression correlation analysis. (A) Lollipop plot of MMP1 expression and immune infiltration cells correlation in TCGA-LIHC; (B) Circle heatmap of MMP1 expression—immune-related factors correlation via TIMER2.0; (C) MMP1 expression and some types of immune-related cells; (D) Correlation analysis of other immune-related cells via TIMER2.0. *P < 0.05, **P < 0.01, ***P < 0.001.

Furthermore, survival analysis of covariates comprising MMP1 expression and several immune factors was performed. There were significant differences between the groups defined by gene expression and immune cell infiltration. Patients with high MMP1 expression plus high macrophage, MDSC or CD4+ T cell infiltration (all P < 0.001) had poorer OS (Fig. S5B).

In addition, we conducted correlation analysis between MMP1 and immunomodulators across all TCGA cancers and screened out the prominent risk factors correlated with MMP1 that affect the prognosis of HCC patients (Fig. 10). For immune inhibitors, the correlation between MMP1 and immune-related genes is shown in a heatmap (Fig. 10A). We then conducted lasso regression analysis including the genes with P < 0.05 and screened out the 8 most correlated risk genes. The high-risk group might be associated with worse prognosis and correlated positively with HAVCR2, IL10RB, LGALS9, TGFB1 and TGFBR1, and negatively with KDR (Fig. 10B). Although most MHC molecules were significantly related to MMP1 (Fig. 10C), no risk factors predictive of prognosis were identified among them (Fig. 10D). For immune stimulators, correlation (Fig. 10E) and lasso regression (Fig. 10F) analyses were performed and 7 risk genes were screened out. RAET1E, TNFRSF14, TNFRSF4 and TNFSF4 appeared to be positively correlated with the high-risk group, which had a worse prognosis, while CD27, TNFRSF13C and TNFRSF17 were positively correlated with the low-risk group.

Figure 10

Analysis of correlation between MMP1 expression and multiple immune-related genes. (A) Immune-inhibitor genes—MMP1 expression heatmap; (B) Lasso analysis and risk factor prediction for immune suppressor genes; (C) MHC molecule genes—MMP1 expression correlation heatmap; (D) Lasso analysis for MHC molecules; (E) Immune-stimulator genes—MMP1 expression correlation heatmap; (F) Lasso analysis and risk factor prediction for immune-stimulating genes; MHC, major histocompatibility complex, *P < 0.05, **P < 0.01, ***P < 0.001.

MMP1-related therapy

Studies have increasingly reported that TMB65 and MSI66 can be used as predictive biomarkers for cancer immunotherapy, and they are among the most popular measures for predicting the therapeutic efficacy of immunotherapy in carcinomas. Therefore, we investigated the correlation between MMP1 expression and TMB/MSI in 32 types of cancer via SangerBox. There was no significant correlation between MMP1 expression and TMB/MSI in HCC patients (Fig. 11A–B). Although a response to MMP1-related immunotherapy was reported in melanoma (all group, P = 0.0458; MAPKi group, P = 0.0355) and urothelial cancer (all group, P = 0.0168; smoking group, P = 0.0132), whether such a correlation exists in HCC has not been reported in any cohort study thus far (Fig. 11C). As a supplement, we performed correlation analysis between MMP1 and IPS based on The Cancer Immunome Atlas (TCIA) database, which provides results of comprehensive immunogenomic analyses of next generation sequencing (NGS) data for 20 solid cancers67. Although the low-risk group had a higher score among PD1-negative and CTLA4-negative patients (P < 0.05), there was no significant difference among patients who were PD1-positive, CTLA4-positive or both (Fig. S5C). To further explore MMP1-related drugs, we constructed a network diagram. Currently, drugs targeting MMP1 remain at the experimental stage, and Marimastat is the only broad-spectrum MMP inhibitor with oral activity (Fig. 11D).

Figure 11

MMP1 immunotherapies and relevant drugs. (A) Correlation between MMP1 expression and tumor mutational burden (TMB); (B) Correlation between MMP1 expression and microsatellite instability (MSI); (C) Evaluation of immunotherapies associated with MMP1; (D) Prediction of MMP1-related drugs. *P < 0.05, **P < 0.01, ***P < 0.001.

Discussion

Although the incidence and mortality of HCC have decreased in South-Eastern Asia owing to progress in hepatitis vaccination, it is still one of the leading causes of cancer-related deaths worldwide1. A lack of efficient prognostic factors leads to delayed diagnosis and intervention, which in turn contributes considerably to poor patient survival. In this study, using bioinformatic analysis of public database resources together with experimental verification, we demonstrated that MMP1 expression was elevated in HCC and was significantly correlated with worse prognosis. Most importantly, we established an innovative MMP1-related prognostic model that predicts the survival probability of patients with HCC with good accuracy. MMP1 expression had different relationships with the corresponding TIICs and various immune-related genes in HCC. Together, these findings point to an underlying role of MMP1 expression in remodeling the tumor-immune microenvironment and in immune escape. To our knowledge, our study is the first comprehensive analysis of MMP1 in hepatocellular carcinoma that establishes a novel prognostic model based on MMP1 expression and evaluates MMP1-related immunomodulator risk factors in HCC.

MMP1, as a member of the MMPs, participates in EMT, which has been identified as a strictly programmed shift that plays a crucial role in tumor invasion and metastasis68. MMPs can auto-activate and trigger a cascade of mutual activation that enhances their influence on EMT20. In invasive HCC, overexpression of MMP1 has been confirmed to correlate with an increased capacity for invasion and migration of HCC cells. The most likely mechanism is ECM degradation promoting the transmembrane migration of tumor cells18. This speculation, combined with the biological process (BP) of cell–cell adhesion via plasma-membrane adhesion molecules (Fig. 5C), would to some extent explain the high expression of MMP1 in tumor tissues and the poor prognosis of HCC patients. On the other hand, MMP1-mediated tumor progression could be regulated negatively by circDLC158, CIC59 and MTAP60 and positively by 14-3-3σ61, MPP362 and c-Jun63, indicating multiple pathological pathways of MMP1 carcinogenesis in HCC.

Although MMP1 expression has been reported to be high in malignant tumors and correlated with a poor prognosis in various cancers (ovarian, liver, lung, gastric, colorectal, and prostate), there has been no comprehensive survival analysis relating MMP1 expression to prognosis, nor a precise MMP1-based prognostic prediction model, in HCC. Consistent with previous studies, high expression of MMP1 was closely related to poor OS, PFS, DSS and PFI. However, high expression of MMP1 was not uniformly correlated with poor OS across all clinicopathological characteristics. In the subgroup survival analysis, we found no significant correlation between MMP1 expression and OS in the subgroups with T2–T4 stage, M1 stage, pathologic stage II–IV, R1&2 resection, AFP > 400, albumin < 3.5, prothrombin > 4, Child–Pugh B&C, vascular invasion or Ishak fibrosis score 0–4 (all due to small sample size), nor in the tumor-free, female and BMI > 25 subgroups. Apart from tumor-free status, in which MMP1 was rarely expressed, the relationship of MMP1 with female gender and obesity needs further investigation.

Based on the univariate and multivariate analysis results, we suggest that monitoring MMP1 expression can be of significant value in the early detection and mitigation of early HCC recurrence. Therefore, we developed a new predictive model that incorporates multiple clinical indicators and MMP1 expression to predict patient prognosis. We validated the model using TCGA-LIHC and the clinical HCC dataset from our center. Multi-center validation is still required.

According to recent studies, the integration of clinicopathological characteristics and TIICs can serve as a clinical predictive model for the efficacy of immunotherapy26,69. The genesis and development of tumors can involve large numbers of immune infiltrating cells and inflammatory mediators. Although MMP1 is involved in the tumor-immune-related progression of some carcinomas, there are barely any studies on the interaction between MMP1 and TIICs in HCC proliferation and migration. Our study presents the relationships between MMP1 expression and different TIICs or immunomodulators, signifying a close connection between MMP1 and immune infiltration in HCC patients. Although the function of TIICs in carcinogenesis is still controversial, a cluster of studies have reported that MMP1 alongside TIICs plays a vital role in tumor progression70,71,72. To further investigate the relationship between MMP1 expression and immune-related cell infiltration in HCC, we analyzed the data using the Spearman test, ssGSEA and other statistical algorithms (Figs. 9, S2–4, S5A). As reported in prior studies, high MMP1 expression may promote the production of tumor-killing immune cells (aDC, NK CD56bright cells, macrophages, Th1/2, TFH, T helper cells and CD4+ T cells) and cause regional inflammation and fibrosis (monocytes and CAF). The positive correlation between MMP1 and MDSC could probably be explained by MDSC suppressing the ability of immune cells to respond, thus leading to tumor progression. In the same way, the negative correlation between MMP1 and DC/GMP may result from the depletion caused by continuous activation of DC and monocytes. The poor survival associated with high MMP1 expression plus high macrophage, MDSC or CD4+ T cell infiltration also indicated that these risk factors might contribute to tumor progression synergistically.

However, it is difficult to explain the negative correlation between MMP1 and Th17 (which induce the immune response to bacteria and fungi), Tgd (adjuvant tumor killing), EC, CMP and HSC, as well as the positive correlation with CLP. The tumor-immune microenvironment is extremely complex and involves a multitude of undisclosed mechanisms. All the correlations above need to be validated and explored in more depth. The results of the TMB/MSI evaluation suggested that patients with HCC might not readily benefit from PD-1 treatment, necessitating the exploration of inhibitors targeting new immunosuppressive sites. Consistent with this, the IPS correlation analysis did not reveal further usable immunotherapeutic information for LIHC patients. Hence, one or more HCC-related immunotherapy cohorts are vital for future research.

Understanding the mechanisms of the tumor-immune microenvironment has been at the frontier of research aimed at screening out related genes that can serve as biomarkers for diagnosis and prognosis or as therapeutic targets73. Using lasso regression analysis, we screened out several potential immune-related genes significantly correlated with MMP1.

Although we conducted numerous analyses, our study has several notable limitations. First, because we used many databases and statistical methods in attempting to elaborate the role of MMP1 in tumorigenesis and prognosis in HCC, a cask effect appeared when harmonizing the data. This was mainly a result of data updates being out of sync or of a source serving only a single function. We conducted experiments and analyzed clinical data to provide a more concrete basis for the conclusions, but did not conduct experiments on immune-related cells/genes, since these require a large amount of data and are not feasible at a single center. The correlation between MMP1 and TIICs was detectable but not strong, and some contradictory results were difficult to explain. This is an important direction and terra incognita for further research. Since this was a bioinformatic analysis, batch effects across samples and platforms, as well as differences in sample sizes, sample types and data processing methods among the various databases, were common and difficult to eliminate. Although we applied unified data cleaning and batch effect correction, cross-platform batch effects may still exist.

The internal validation of the MMP1-based nomogram relied on a relatively small dataset; therefore, external validation with larger sample sizes is required.

In conclusion, our study revealed a close relationship between high MMP1 expression and poor prognosis in HCC, as well as the involvement of MMP1 in tumor-immune cell infiltration and immunomodulators. We also propose a model that predicts prognosis in HCC patients with good accuracy. Although some mechanisms associated with MMP1 remain unknown, we have reason to believe that MMP1 is a promising prognostic biomarker in HCC. The potential mechanisms linking MMP1 to the tumor immune microenvironment, and relevant immunotherapy cohorts, can be the focus of future research.