Introduction

Lung cancer is the most commonly diagnosed cancer and the leading cause of cancer-related deaths worldwide [1]. Non-small cell lung cancer (NSCLC), which consists mainly of lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), accounts for 80–85% of all lung cancer cases [2, 3]. In the past decade, remarkable progress has been made in the treatment of NSCLC in patients whose tumors harbor targetable somatic mutations, and EGFR, ALK, PD-L1, or CTLA-4 inhibitors have proven more effective than conventional chemotherapy [4,5,6,7]. Despite these advances in the screening and treatment of NSCLC, only a small proportion of patients benefit from these approaches, and the 5-year survival rate remains <20% [2]. Markers that predict outcomes and responses to specific therapies in NSCLC are still limited.

The TGF-β/Smad signaling pathway is one of the most commonly altered cellular pathways in human cancers and plays a dual role in tumorigenesis [8]. TGF-β serves as a tumor suppressor by inhibiting proliferation and accelerating apoptosis during initiation and early progression of the tumor, but as the disease progresses, it promotes tumor formation by facilitating migration, invasion, angiogenesis, and evasion of the immune system [9, 10]. The loss of sensitivity to TGF-β is frequently observed in human cancers, and inactivation of Smad family members is a prominent mechanism in disruption of the TGF-β pathway [11].

SMAD4 (also called deleted in pancreatic carcinoma 4) was initially identified by Hahn et al. [12] as a tumor suppressor gene in a homozygously deleted region on human chromosome 18q21.1 in pancreatic ductal adenocarcinoma, and it encodes the only co-Smad. It has been confirmed that the loss of SMAD4 predicts a worse prognosis and the development of metastatic disease in patients with pancreatic cancer [13, 14]. In combination with other biomarkers, the loss of SMAD4 on immunohistochemical staining is often used to suggest a pancreaticobiliary primary site in the evaluation of metastatic adenocarcinoma with an unknown primary site [15]. Recently, several studies have shown that, in addition to pancreatic cancer, mutations and deletions of SMAD4 are detected in colon, ovarian, appendiceal, esophageal, and lung cancers [16,17,18,19,20,21,22]. Although mutation and loss of SMAD4 have been described in NSCLC, where they are associated with lymph node metastases, increased angiogenesis, and more aggressive cellular behavior in vitro [22, 23], detailed documentation of the prognostic role of SMAD4 and its association with other molecular parameters in a large patient cohort is still lacking. In this study, we detected 24 SMAD4-mutated cases in a 963-case Chinese NSCLC cohort using next-generation sequencing (NGS). The correlation between SMAD4 mutation and the clinico-molecular features of the patients was further evaluated. We found that the mutation of SMAD4, as well as loss or reduction of its expression, was linked to the progression of NSCLC and patient survival.

Materials and methods

Patient characteristics

A total of 6564 patients diagnosed with NSCLC who underwent surgical resection were identified at Fudan University Shanghai Cancer Center (FUSCC) between June 2017 and July 2019. Of them, 1087 patients underwent NGS. Patients whose tumor samples were unavailable or missing (n = 74) and patients with incomplete follow-up information (n = 50) were excluded. Finally, 963 patients from FUSCC who provided fully informed consent were included in the analysis (Fig. 1A). For all patients, the following data were extracted from the electronic medical record: gender, age, pathological grade, and TNM stage. Tumor stage was determined according to the American Joint Committee on Cancer tumor staging system (eighth edition, 2017). The median follow-up of the patients was 23 months (IQR, 10–30 months), and the last date of follow-up was June 1, 2020. In addition, RNA-seq expression data for 1135 NSCLC cases were available from The Cancer Genome Atlas (TCGA). Cases with unknown SMAD4 expression (n = 113), incomplete pathological or clinical information (n = 86), and unknown survival outcomes or <30 days of follow-up time (n = 26) were removed, leaving 910 TCGA cases for analysis (Fig. 1B).
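The exclusion counts reported above can be checked with simple arithmetic. The following is a minimal sketch; the function name remaining_after_exclusions is hypothetical and is not part of the study's pipeline.

```python
def remaining_after_exclusions(start, exclusions):
    """Subtract each (reason, count) exclusion from the starting cohort size."""
    remaining = start
    for reason, count in exclusions:
        if count > remaining:
            raise ValueError(f"exclusion '{reason}' removes more cases than remain")
        remaining -= count
    return remaining


# Counts taken from the text; the reason labels are paraphrased.
fuscc = remaining_after_exclusions(
    1087,
    [("tumor sample unavailable or missing", 74),
     ("incomplete follow-up information", 50)],
)
tcga = remaining_after_exclusions(
    1135,
    [("unknown SMAD4 expression", 113),
     ("incomplete pathological or clinical information", 86),
     ("unknown survival or <30 days of follow-up", 26)],
)
print(fuscc, tcga)  # 963 910
```

Both totals agree with the cohort sizes used in the analyses (963 FUSCC cases and 910 TCGA cases).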

Fig. 1: Flow chart of case selection.

Cases from FUSCC (a) and TCGA (b).

Next-generation sequencing

Genomic DNA was extracted from the FFPE tissue samples using the QIAamp DNA Mini Kit, and targeted deep sequencing of mutational hotspots was conducted using a capture-based targeted sequencing panel, which included all exons of 68 genes and selected introns in ALK, RET, and ROS1 for the detection of translocation events. The library DNAs were prepared by amplifying the targeted regions using multiplex polymerase chain reaction, followed by adapter DNA ligation. Multiplexed sequencing was performed using the Illumina HiSeq 2500 platform. Mutations and variants, including indels, substitutions, rearrangements, and amplifications, were identified using Illumina suite software.

Immunohistochemistry

For histopathological analysis, tissue samples were fixed for 48 h in 4% paraformaldehyde, decalcified with 20% EDTA, and embedded in paraffin. Next, 4-μm sections were cut and incubated in sodium citrate (pH 6.0) at 100 °C for 15 min. After endogenous peroxidase activity was quenched with 3% H2O2, the sections were blocked with 1% PBS. A primary antibody specific for SMAD4 (1:100; ab40759, Abcam) was applied at 4 °C overnight. After the samples were washed with PBS, horseradish peroxidase-conjugated anti-rabbit/mouse secondary antibodies (Gene Technology, Shanghai, China) were applied and incubated for 1 h at room temperature. Normal epithelium adjacent to the tumor served as the internal positive control. Immunohistochemical staining was scored semiquantitatively, incorporating both the intensity and the distribution of specific staining, as follows. The intensity of specific staining was graded as not present (0), weak (1+), distinct (2+), or very strong (3+) based on the average expression in both nucleus and cytoplasm. The percentage of positive tumor cells per slide (0 to 100%) was multiplied by the dominant intensity pattern of staining (Fig. 2). The H-score (0–300) was calculated according to the following formula:

$$\text{H-score} = 1 \times (\%\,\text{cells}\ 1+) + 2 \times (\%\,\text{cells}\ 2+) + 3 \times (\%\,\text{cells}\ 3+).$$

Fig. 2: Immunohistochemical analysis of Smad4 in NSCLC specimens.

Each sample was assigned an intensity score (0–3) based on the staining signal in both nucleus and cytoplasm. A final H-score (0–300) was obtained by weighting the percentage of stained cells by the staining intensity. Panels show representative staining: “0” in (a), “1+” in (b), “2+” in (c), and “3+” in (d). e A sample composed mainly of cells with negative or low Smad4 expression; its H-score was about 85, and it was classified into the low Smad4 group.

An H-score of ≥75 was the threshold value for a “positive” immunohistochemical assay [24]. Smad4 was considered lost if the H-score was <75, in order to minimize confounding effects introduced by varying degrees of Smad4 downregulation. The median score of positive expression was used to stratify patients into low Smad4 (75–187) and high Smad4 (188–300) groups.
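The scoring rule and the thresholds above can be written out as a short sketch. The function names and the example percentages are hypothetical; the handling of fractional scores between 187 and 188 (assigned here to the low group) is an assumption, since the text gives only integer ranges.

```python
def h_score(pct_1plus, pct_2plus, pct_3plus):
    """H-score = 1*(% cells 1+) + 2*(% cells 2+) + 3*(% cells 3+), range 0-300."""
    percentages = (pct_1plus, pct_2plus, pct_3plus)
    if any(p < 0 for p in percentages) or sum(percentages) > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * pct_1plus + 2 * pct_2plus + 3 * pct_3plus


def classify_smad4(score):
    """Map an H-score to the groups used in the text: loss, low, or high."""
    if not 0 <= score <= 300:
        raise ValueError("H-score must lie between 0 and 300")
    if score < 75:
        return "loss"
    if score < 188:  # assumption: fractional scores in (187, 188) count as low
        return "low"
    return "high"


# Hypothetical slide: 50% of cells 1+, 10% 2+, 5% 3+, 35% negative.
example = h_score(50, 10, 5)
print(example, classify_smad4(example))  # 85 low
```

The hypothetical slide was chosen so that it yields an H-score of 85, the value reported for the sample in Fig. 2e, which falls in the low Smad4 group.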

Gene Ontology (GO) enrichment analysis

The original RNA-seq read counts of NSCLC cases were acquired from the TCGA website (https://portal.gdc.cancer.gov) using the gdc-client tool. The expression matrix was normalized with the R function normalizeBetweenArrays. The cases were divided into two groups according to the median mRNA expression level of SMAD4. Differentially expressed genes (DEGs) were identified with the limma R package. The cutoff values for DEGs were an expression fold change >2 and a p value < 0.05. The Database for Annotation, Visualization, and Integrated Discovery (DAVID; http://www.david.niaid.nih.gov), which integrates GO, KEGG, UniProt, and DrugBank, among other authoritative data sources, was used to annotate genes in terms of cellular components, molecular functions, and biological processes. Point plots were generated using R software.
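The grouping and cutoff steps described above can be illustrated with a small sketch. It does not reproduce limma's moderated statistics; it assumes that per-gene log2 fold changes and p values have already been computed. The function names and gene labels are hypothetical. Two further assumptions are made: samples exactly at the median are placed in the low group, and "fold change >2" is applied in both directions (|log2 fold change| > 1).

```python
import math
from statistics import median


def split_by_median(expression):
    """Split samples into (low, high) groups by the median expression level.

    expression: dict mapping sample ID to SMAD4 expression.
    Assumption: samples exactly at the median go to the low group.
    """
    cutoff = median(expression.values())
    low = sorted(s for s, v in expression.items() if v <= cutoff)
    high = sorted(s for s, v in expression.items() if v > cutoff)
    return low, high


def filter_degs(results, fold_change_cutoff=2.0, p_cutoff=0.05):
    """Keep genes whose fold change exceeds the cutoff (either direction) and p < cutoff.

    results: iterable of (gene, log2_fold_change, p_value) tuples.
    """
    min_abs_log2fc = math.log2(fold_change_cutoff)
    return [
        gene
        for gene, log2fc, p_value in results
        if abs(log2fc) > min_abs_log2fc and p_value < p_cutoff
    ]


# Hypothetical inputs for illustration only.
low, high = split_by_median({"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0})
degs = filter_degs([
    ("GENE_A", 1.5, 0.01),    # passes both cutoffs
    ("GENE_B", -2.0, 0.001),  # passes (downregulated)
    ("GENE_C", 0.4, 0.0001),  # fold change too small
    ("GENE_D", 3.0, 0.2),     # p value too large
])
print(low, high, degs)
```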

Gene set enrichment analysis (GSEA)

GSEA is a computational method that determines whether a predefined gene set or pathway shows statistically significant, concordant differences between two biological states. In the present study, GSEA was used to identify the biological processes and pathways associated with SMAD4 expression. Genomic expression data and contrast (group) information were supplied as input, and the analysis was carried out using R software.
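To make the idea concrete, the running-sum enrichment score at the core of GSEA can be sketched as follows. The text does not state which R implementation or parameters were used, so this is an illustration of the classic weighted running-sum statistic under assumed inputs, not the study's analysis; the permutation step that assigns significance is omitted, and the gene names are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Return the signed maximum deviation of the GSEA running sum from zero.

    ranked_genes: genes ordered by their ranking metric (highest first).
    scores: ranking metric for each gene, in the same order.
    gene_set: set of gene names being tested.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")

    hit_weights = [abs(s) ** weight if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit_weight = sum(hit_weights)
    if total_hit_weight == 0:
        raise ValueError("all genes in the set have a zero ranking metric")

    running, best = 0.0, 0.0
    for w, hit in zip(hit_weights, in_set):
        # Step up (weighted) on a gene in the set, step down uniformly otherwise.
        running += w / total_hit_weight if hit else -1.0 / n_misses
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list: g1 is most associated with one state, g6 with the other.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(genes, metric, {"g1", "g2"})
es_bottom = enrichment_score(genes, metric, {"g5", "g6"})
print(es_top, es_bottom)
```

A gene set concentrated at the top of the ranking gives a positive score, and one concentrated at the bottom gives a negative score.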

Statistical analysis

Frequencies were compared using Fisher’s exact and Pearson’s χ2 tests, as appropriate. Overall survival (OS) was measured from the date of surgical resection to the date of death or last follow-up. Disease-free survival (DFS) was measured from the date of surgery to cancer recurrence, death, or last follow-up, whichever occurred first. Progression-free survival (PFS) was measured from the date of surgery to progression as determined by the treating physician based on radiologic, biochemical, and/or clinical criteria. Survival curves were generated using the Kaplan–Meier method and compared using the log-rank test. Univariate and multivariate analyses were performed to identify prognostic factors. Analyses were performed using R software (version 3.6.3), GraphPad Prism (version 7.0), and IBM SPSS software (version 24.0). p values < 0.05 were considered significant in all analyses.
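As an illustration of how the Kaplan–Meier curves mentioned above are constructed, the following minimal sketch computes the product-limit estimate from follow-up times and event indicators. The study itself used R, GraphPad Prism, and SPSS; this function and the example data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up durations (e.g. months).
    events: 1 if the event (e.g. death) was observed, 0 if censored.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    events_at = Counter(t for t, e in zip(times, events) if e)
    leaving_at = Counter(times)  # events and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving_at):
        d = events_at.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving this time.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving_at[t]
    return curve


# Hypothetical follow-up data (months); 0 marks a censored observation.
curve = kaplan_meier([6, 7, 10, 15, 19, 25], [1, 0, 1, 1, 0, 1])
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk up to their last follow-up but do not produce a step in the curve.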

Results

Patient characteristics and clinicopathological variables

Of the 963 NSCLC cases from FUSCC included in the study, 24 (2.5%) had SMAD4 mutations, including 23 adenocarcinomas and 1 adenosquamous carcinoma. The patients were divided into two groups according to the NGS results: mutated (n = 24) and wild-type (n = 939) SMAD4. The demographic and baseline clinicopathologic data, including gender, age at diagnosis, pathological grade, histological type, TNM stage, and driver mutation status, are shown in Table 1. Most patients in this study had adenocarcinoma (91.5%), and the vast majority were at an early stage (85.8%). The patients with SMAD4 gene mutations were more likely to be at T stage III or IV (p = 0.015, Table 1) and to show poorer tumor differentiation (p = 0.044, Table 1) than the patients without mutations. In contrast, SMAD4 mutation status was not significantly associated with gender, age, N stage, M stage, histological type, or mutations in dominant cancer driver genes, such as EGFR, ALK, and KRAS.

Table 1 Clinicopathologic characteristics in patients with mutated and wild-type SMAD4 from FUSCC.

Genetic variations of SMAD4 in NSCLC

SMAD4 alterations across cancer types were analyzed using cBioPortal. Genetic variations of SMAD4 occurred in 5% of NSCLC cases and consisted mainly of mutations and copy number deletions (Fig. 3A). A major limitation of NGS approaches is their limited ability to detect long copy number variations, which probably explains part of the difference in alterations between the FUSCC cohort and the public cases. To further assess the genetic variations of SMAD4, we examined SMAD4 alterations in 819 NSCLC cases from five studies (University of Turin [25]; MSKCC [26]; TRACERx [27, 28]; MSK [29]; MSK [30]). The alteration frequency of SMAD4 in NSCLC was 4% (Fig. 3B), and mutation was the most common type, comprising more than half of all genetic variations (Fig. 3C). Therefore, mutation might be one of the main mechanisms by which SMAD4 becomes dysfunctional in NSCLC. The SMAD4 mutation sites were also mapped onto the peptide sequence in lollipop plots. Four types of mutations were identified (Fig. 3E): missense mutations (66.7%), deletions (20.8%), nonsense mutations (8.3%), and frameshift mutations (4.2%). The mutational hotspots were D351 in the NSCLC cases from cBioPortal (Fig. 3D) and R361 in the Chinese cohort from FUSCC (Fig. 3E). R361 is located at the SMAD4 homotrimer interaction interface, where it normally stabilizes homo- or heterotrimer formation. Mutation at R361 in SMAD4 correlates with metastasis and decreased survival in colon cancer [31]. Such mutations could have widespread effects, because SMAD4 is a binding partner in all Smad-dependent transcriptional regulation. The SMAD family peptide chains share conserved sequences with certain structural similarities in their amino- and carboxyl-terminal regions, known as Mad homology (MH) domains 1 and 2, respectively [13]. Despite the different hotspots, missense mutations were the most common type, and mutations were mostly located in the MH2 domain in both data sets.

Fig. 3: Analyses of genetic variations in SMAD4 for NSCLC.

a Alteration frequencies of SMAD4 across cancer studies. b OncoPrint overview of the genetic variations of SMAD4. c Alteration frequencies of SMAD4 in different studies. d Distribution of SMAD4 mutations from cBioPortal. e Distribution of SMAD4 mutations from FUSCC.

Patterns of SMAD4 expression in NSCLC

To determine the correlation between the mutation and expression of SMAD4 in NSCLC, we randomly selected 45 wild-type SMAD4 NSCLC cases with sufficient tissue to investigate SMAD4 expression and distribution by tissue microarray. Immunohistochemistry was performed on the 45 wild-type SMAD4 and 24 mutated SMAD4 samples. The prevalence of SMAD4 expression in the patients with SMAD4 mutations differed significantly from that in the matched group (Table 2). Wild-type SMAD4 samples tended to show high IHC expression, and loss of SMAD4 was rarely seen. Mutated cases showed loss of SMAD4 expression in 10 to 100% of the tumor area, and most therefore showed low SMAD4 expression. This result demonstrated that patients harboring SMAD4 mutations tended to show deficient SMAD4 expression. Further analysis showed that the staining intensity was only weakly affected by synonymous mutations, whereas tissues with missense mutations, deletions, and especially frameshift mutations showed apparent SMAD4 deficiency. These data indicate that individual discrepancies and heterogeneous expression exist. Given the presence of different types of mutations and possible contamination by SMAD4-positive mesenchymal cells, the specific governing mechanisms remain to be elucidated.

Table 2 Comparison between SMAD4 gene and SMAD4 IHC statuses.

To further explore the expression of SMAD4 in NSCLC, we mined data from the NSCLC cohort in the TCGA database using cBioPortal. The mRNA expression levels of wild-type SMAD4 were much higher than those of mutated SMAD4 (Fig. 4A), while the correlation between mRNA and protein expression levels was weak (Fig. 4B). These correlations seem to provide some evidence for SMAD4 deficiency with significant heterogeneity in IHC. Moreover, the top 50 genes positively correlated with SMAD4 expression in LUAD and LUSC were identified using LinkedOmics (Fig. 4C). The heatmap showed that SMAD2, another member of the SMAD family, was highly correlated with SMAD4, which was verified in cBioPortal and ENCORI (Fig. 4D, E). Phosphorylated SMAD2 and SMAD3 form a heteromeric complex with SMAD4 and are transported to the nucleus, where the complex binds with other DNA-binding transcription factors and consequently regulates the transcription of TGF-β target genes [8, 13]. Abundant evidence suggests that depletion of SMAD2 results in enhanced cell invasion, metastasis [10, 14], and angiogenesis [32,33,34] in skin squamous carcinoma, NSCLC, and breast cancer. There seems to be a synergy between SMAD2 and SMAD4 in inhibiting tumors.

Fig. 4: Patterns of SMAD4 expression in NSCLC.

Correlations between mutations and mRNA expression levels (a), mRNA and protein expression levels (b). c The top 50 genes positively correlated to SMAD4 expression. d, e Correlations between SMAD4 and SMAD2. *p < 0.05; **p < 0.01; ***p < 0.001.

SMAD4 is significantly reduced in all stages of NSCLC

To assess the relationship between SMAD4 expression and lung cancer, we first analyzed the mRNA expression profile of SMAD4 in 910 NSCLC cases from TCGA. SMAD4 mRNA levels were significantly lower in the NSCLC patients than in the healthy individuals (p < 0.001; Fig. 5A) and lower in LUAD than in LUSC (p < 0.001; Fig. 5B). No significant differences were observed by age (p = 0.330; Fig. 5C) or gender (p = 0.211; Fig. 5D). Moreover, SMAD4 expression was reduced in all stages of NSCLC compared with that in normal tissues (Fig. 5E). Analysis with GEPIA likewise showed that SMAD4 was expressed at lower levels in the LUAD and LUSC tissues than in the adjacent normal tissues (Fig. 5F). These results suggest that there may be a link between SMAD4 expression and the occurrence and development of NSCLC.

Fig. 5: SMAD4 was significantly downregulated in NSCLC cases from TCGA.

The expression of SMAD4 at the mRNA level in 910 NSCLC cases from TCGA. Box plots show the association of SMAD4 expression with normal/tumor tissues (a), histological types (b), age (c), gender (d), and TNM stage (e). f GEPIA was used to examine the expression of SMAD4 in the TCGA NSCLC data set. *p < 0.05; **p < 0.01; ***p < 0.001.

Mutation or reduced expression of SMAD4 predicts poor survival and chemotherapy resistance

The low expression of SMAD4 in NSCLC compared with adjacent normal tissues is consistent with a role as a tumor suppressor. To assess the relationship between SMAD4 expression and clinical outcomes of NSCLC patients, we investigated OS in LUAD and LUSC using the online tool ENCORI. However, no significant association of SMAD4 expression with OS was found (Fig. 6A). To further examine its prognostic value, we stratified 910 NSCLC patients from TCGA into two groups using optimal cutoff values determined by the “surv_cutpoint” function of the “survminer” R package: high expression (high, n = 555) and low expression (low, n = 355) of SMAD4. The results showed no significant association of SMAD4 expression with OS (HR = 0.95, p = 0.24; Fig. 6B), but DFS was significantly longer when SMAD4 expression was high (HR = 0.84, p = 0.027; Fig. 6B). Of the 332 patients who received drug treatment, the overwhelming majority (n = 322, 97.0%) received platinum compounds, such as cisplatin and carboplatin. We therefore split these 322 patients into two groups by the same method to explore the relationship between SMAD4 expression and the effects of platinum-based chemotherapy: high expression (high, n = 200) and low expression (low, n = 122) of SMAD4. Low SMAD4 expression was associated with poor OS (HR = 0.83, p = 0.038; Fig. 6C) as well as poor DFS (HR = 0.85, p = 0.048; Fig. 6C). Thus, reduced SMAD4 expression in NSCLC identified patients with poor DFS and resistance to platinum-based chemotherapy. The PFS curves of SMAD4-altered (n = 28) and wild-type (n = 791) cases were analyzed using cBioPortal, and no significant association was found (p = 0.410; Fig. 6D). To further analyze the impact of SMAD4 mutations on the survival of NSCLC patients, we downloaded the follow-up and gene mutation data of the five studies mentioned above. Among patients with driver mutations (EGFR, ALK, KRAS, and MET), SMAD4-mutated cases (n = 14) were more likely to progress than wild-type SMAD4 cases (n = 147) (p = 0.042; Fig. 6E). We also compared survival between LUAD and LUSC patients (Fig. S1).
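The group comparisons in this section rest on the log-rank test. A minimal two-group sketch is given below to show what the reported p values measure; it does not reproduce the survminer cutpoint selection or the cBioPortal and ENCORI analyses, and the function name and example data are hypothetical. The p value uses the chi-square distribution with one degree of freedom, computed through the complementary error function.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p value).

    times_*: follow-up durations; events_*: 1 if the event occurred, 0 if censored.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({u for u, e, _ in data if e}):
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk, group B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        # Expected events in group A under the null hypothesis of equal hazards.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))

    if variance == 0:
        raise ValueError("not enough information to compare the two groups")
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p_value


# Hypothetical data: group A progresses early, group B late (all events observed).
chi2_diff, p_diff = logrank_test([1, 2, 3, 4], [1, 1, 1, 1],
                                 [10, 11, 12, 13], [1, 1, 1, 1])
# Identical groups should show no difference.
chi2_same, p_same = logrank_test([1, 2, 3], [1, 1, 1], [1, 2, 3], [1, 1, 1])
print(chi2_diff, p_diff, chi2_same, p_same)
```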

Fig. 6: Kaplan–Meier survival analysis according to SMAD4 mutation or expression.

a The overall survival curves analyzed using ENCORI. b The overall and disease-free survival curves of NSCLC cases from TCGA. c The overall and disease-free survival curves of chemotherapy-treated patients from TCGA. d The progression-free survival curves of altered and unaltered SMAD4 cases analyzed using cBioPortal. e The progression-free survival curves of mutated and wild-type SMAD4 cases both with driver mutation from cBioPortal.

Of the 24 SMAD4-mutated cases from FUSCC, the 18 (75%) that also harbored EGFR mutations were selected (Table 1). To further investigate the effect of SMAD4 and EGFR mutations on survival outcomes, we randomly matched them, according to TNM stage, with 72 cases with EGFR mutations and wild-type SMAD4. Univariable and multivariable Cox proportional hazards models for OS and PFS of these 90 NSCLC patients were fitted (Table 3). SMAD4 status was an independent prognostic factor for OS in NSCLC (p = 0.038, Table 3). TNM stage (p < 0.001, Table 3), smoking status (p = 0.025, Table 3), and SMAD4 status (p = 0.002, Table 3) were independent prognostic factors for PFS of NSCLC patients according to the univariate and multivariate regression models. These analyses demonstrated that SMAD4 mutation was strongly associated with poor prognosis. Patients with coexisting mutations in the EGFR and SMAD4 genes had significantly worse OS and PFS than those with EGFR mutations alone.

Table 3 Univariable and multivariable analyses for OS and PFS in NSCLC patients from FUSCC.

Gene Ontology analysis and GSEA

To explore the potential functions and molecular mechanisms of SMAD4 in NSCLC progression, we conducted GO enrichment analysis and GSEA on NSCLC cases from the TCGA database. The GO terms in which the DEGs were significantly enriched included integral components of nucleoplasm, cytoplasm, regulation of apoptotic process and gene expression, immune response, etc. (Fig. 7A). GSEA indicated enrichment of gene sets including cell cycle, focal adhesion, adherens junction, and pyrimidine metabolism in the samples with low SMAD4 expression (Fig. 7B), and of the KEGG JAK-STAT, Toll-like receptor, T-cell receptor, and B-cell receptor signaling pathways in the samples with high SMAD4 expression (Fig. 7C).

Fig. 7: GO and GSEA of SMAD4-related enrichment gene sets in NSCLC cases from TCGA.

a GO analysis of DEGs based on SMAD4 expression. b Enrichment gene sets in the samples with low SMAD4 expression by GSEA. c Enrichment gene sets in the samples with high SMAD4 expression by GSEA.

Discussion

SMAD family members are downstream effectors of the TGF-β signaling pathway that play a critical role in tumor progression. Smad2 and Smad3 proteins are phosphorylated by activated TGF-β receptors and then combine with the common Smad4 to form a complex, and the resulting trimer translocates into the nucleus, where it functions as a gene expression regulator [8]. Smad4 protein has been shown to be an important IHC marker for suggesting a pancreaticobiliary primary site [15] and an independent prognostic factor for both DFS and OS in advanced colorectal cancer [35]. Varying SMAD4 mutations have been detected in different types of cancers by large-scale sequencing. Compared with 35% in pancreatic cancer and 12% in colon cancer cases [36,37,38,39,40], a significantly lower frequency of SMAD4 mutations has been found in other types of cancers. In previous cohort studies, point mutations of SMAD4 were identified in 0.21%, 2.24%, 2.46%, and 8.86% of kidney, lung, esophageal, and biliary tract cancers, respectively [40,41,42]. In this study, the frequency of SMAD4 mutation in the Chinese NSCLC cohort was 2.5%, which is broadly consistent with the previously reported 2.24%, and the mutations were mostly located in the MH1 and MH2 domains. Although point mutations are rare, protein loss is common in cancers owing to various other influencing factors, such as alternative splicing. MH2 has been reported to play a key role in transcriptional activation and homo- and hetero-oligomer formation, while MH1 functions as a negative regulatory domain [13, 43]. These two domains are thought to influence each other, because MH1 disturbs the MH2 interaction with phosphorylated R-Smads, and the connection between MH1 and cDNAs is prevented by MH2. The linker region that maintains the interaction between MH1 and MH2 is often deleted during alternative splicing. Wan et al. [44] found several alternatively spliced variants associated with exon deletion in the linker region that showed significant effects on EMT and regulated E-cad and VIM protein expression.

In our study, the expression level of SMAD4 gradually decreased as the disease progressed, except for an increase in stage IV. We speculate that this was due to the very small number of stage IV samples. Another possible reason is that SMAD4 is involved in the dual function of the TGF-β signaling pathway. TGF-β serves as a tumor suppressor by inhibiting proliferation and accelerating apoptosis during initiation and early progression of the tumor, whereas in the late stage it can promote tumor formation by facilitating migration, invasion, and angiogenesis [9, 10]; the specific molecular mechanism of this switch has not been fully elucidated. It is possible that inactivation of SMAD4 is one of the mechanisms involved. In addition, TGFBR2 serves as an initial regulator of the TGF-β signaling pathway, and its loss has been implicated as a mechanism for lung cancer formation, especially LUSC formation. It has been reported that dysregulated TGFBR2/ERK-Smad4/SOX2 signaling drives LUSC formation [45].

Although SMAD4 mutation is not deemed a tumor-initiating event in most cases, loss or mutation of SMAD4 is thought to enhance tumor progression, for example, after KRAS mutational activation in pancreatic adenocarcinoma [42]. Researchers found that low expression of SMAD4 could promote A549 cell growth [46, 47], which indicated that the deletion of SMAD4 was associated with apoptosis by affecting the Bcl-2/Bax balance in NSCLC, thereby contributing to carcinogenesis. Kawaguchi et al. [48] showed that, after resection of colorectal liver metastases, double or triple mutations in RAS, TP53, and SMAD4 are associated with worse survival and earlier recurrence than mutations in only one or none of these genes. Several previous studies have shown that SMAD4 depletion in a head and neck squamous cell carcinoma cell line induces cetuximab resistance and results in worse survival in an orthotopic mouse model in vivo [49]. Wasserman et al. [47] also suggested that SMAD4 loss correlates with worse clinical outcome and resistance to chemotherapy. In our study, we first analyzed the prognostic role of SMAD4 mutation and its association with other molecular parameters in a large Chinese NSCLC cohort. SMAD4 mutation or loss, as well as reduced expression, was associated with poor DFS and platinum-based chemotherapy resistance. Coexisting mutations in the EGFR and SMAD4 genes conferred significantly worse OS and PFS than EGFR mutations alone. However, Ziemkea et al. [50] found that NSCLC with reduced SMAD4 expression is more sensitive to treatment regimens containing DNA topoisomerase inhibitors than NSCLC without reduced expression. To identify further potential mechanisms of SMAD4 in NSCLC progression, we conducted GO analysis and GSEA in NSCLC cases from the TCGA database and found that reduced SMAD4 expression might promote NSCLC progression by promoting proliferation and adhesion. We also found that increased SMAD4 expression (vs. reduced expression) was associated with immune response. Indeed, SMAD4 participates in the expression of the anti-inflammatory cytokine IL-10 in TH17 cells [51, 52]. A TGFβ–SKI–Smad4 pathway that critically and specifically directs resident CD103+ CD8+ T-cell generation for protective immunity against primary and secondary viral infection was revealed recently [53]. The correlations and mechanisms linking SMAD4 and the immune microenvironment warrant further investigation.

This study has several limitations. First, although our cohort of patients with NSCLC was large, the number of SMAD4-mutated cases was modest. Second, only patients diagnosed with NSCLC who underwent surgical resection within the past 2 years were included, so the follow-up time was limited. A long-term study is needed to confirm our findings. Third, our study was limited by the inherent issues of retrospective studies, including potential selection bias. This should be considered when generalizing our findings to other populations.

In conclusion, we found that SMAD4 alteration was associated with poor survival and resistance to platinum-based chemotherapy, suggesting that SMAD4 alteration might be a predictive marker or therapeutic target in NSCLC. Further prospective studies are needed to validate these findings and to establish the clinical significance of SMAD4 alteration in NSCLC.