High expression of SOX30 is associated with favorable survival in human lung adenocarcinoma

In our previous study, we identified SOX30 as a novel tumor suppressor that acts through direct regulation of p53 transcription in human lung cancer. Here, we sought to determine the clinical relevance of SOX30 expression in a series of surgically resected non-small cell lung cancer (NSCLC) patients. Analysis of SOX30 expression and clinico-pathologic features reveals a significant correlation of SOX30 expression with histological type (n = 220, P = 0.008) and clinical stage (n = 220, P = 0.024). Kaplan-Meier analysis indicates an association of high SOX30 expression with better prognosis in NSCLC patients (n = 220, P = 0.007). In multivariate Cox-regression analysis, SOX30 expression is an independent prognostic factor for overall survival (OS) of NSCLC patients (n = 220, P = 0.014, hazard ratio (HR) = 0.816). In particular, SOX30 is a favorable and independent prognostic factor in one main subtype of NSCLC, lung adenocarcinoma (ADC) (n = 150, P = 0.000, HR = 0.405), but not in the other main subtype, squamous cell carcinoma (SCC). Furthermore, high expression of SOX30 represents a favorable and independent prognostic factor for ADC patients at clinical stage II (P = 0.013), with positive lymph nodes (P = 0.003), or at histological grade 2 (P = 0.000) or grade 3 (P = 0.025). In summary, SOX30 expression represents an important prognostic factor for survival time in ADC patients.

Here, we report on the possible role of SOX30 as a prognostic marker for NSCLC patients. In this study, we sought to determine the clinical relevance of SOX30 expression by immunohistochemistry (IHC) on tissue microarrays (TMA) in 150 ADC and 70 SCC patients. SOX30 expression was significantly correlated with histological type and clinical stage of NSCLC patients. Univariate and multivariate analyses revealed that high SOX30 expression was significantly associated with better OS in NSCLC patients. Above all, SOX30 had a favorable prognostic impact in lung ADC patients, but not in SCC patients. Furthermore, SOX30 expression might represent a key prognostic factor for ADC patients at clinical stage II, with positive lymph nodes, or at histological grade 2 or grade 3. Additionally, functional analysis revealed that SOX30 had no effect on tumor cell proliferation, cell cycle or apoptosis in lung SCC, unlike in lung ADC.

SOX30 expression is significantly correlated with histological type and clinical stage of NSCLC patients.
To determine SOX30 expression in NSCLC patients, we conducted IHC on a TMA containing 220 cancers. After IHC, we used a scoring system that combines staining intensity and the percentage of positively stained cells. Based on the scores, positive staining of tumor cells was classified into three groups: high (Fig. 1A), medium (Fig. 1B) and low (Fig. 1C). When we examined possible associations between SOX30 expression and clinico-pathologic features of the patients, we found that SOX30 expression was significantly correlated with histological type (n = 220, P = 0.008) and clinical stage (n = 220, P = 0.024) of NSCLC patients (Table 1). The incidence of SOX30 over-expression was 31.33% (47/150) in ADC, 14.29% (10/70) in SCC, 29.35% (27/92) in clinical stage I, 26.19% (11/42) in clinical stage II and 13.64% (9/66) in clinical stage III + IV of NSCLC patients (Table 1). However, SOX30 expression was not correlated with age (P = 0.535), gender (P = 0.052), histological grade (P = 0.516), tumor size (P = 0.086) or lymph node status (P = 0.533) (Table 1). In addition, SOX30 expression was correlated with clinical stage in ADC patients (n = 150, P = 0.036), and the incidence of SOX30 over-expression was 35.29% (24/68) in clinical stage I, 29.17% (7/24) in clinical stage II and 15.79% (6/38) in clinical stage III + IV of ADC patients (Table S1).
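As a rough check on the association with histological type, the reported over-expression counts (47/150 in ADC, 10/70 in SCC) can be put through a 2 × 2 Pearson chi-square test. The sketch below is illustrative only. It collapses expression into "high" versus "not high", which is an assumption made here rather than the grouping stated for Table 1, so it should not be expected to reproduce the reported P = 0.008 exactly. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Counts reported in the text: over-expression in 47 of 150 ADC and 10 of 70 SCC.
# Assumption: all remaining cases are treated as "not high".
adc_high, adc_total = 47, 150
scc_high, scc_total = 10, 70

statistic, p_value = chi_square_2x2(
    adc_high, adc_total - adc_high,
    scc_high, scc_total - scc_high,
)
print(f"chi-square = {statistic:.3f}, p = {p_value:.4f}")
```

The same function can be applied to any other two-category comparison in Table 1, provided the counts are first collapsed into a 2 × 2 layout.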
Increased SOX30 expression is significantly associated with better OS of NSCLC patients.
We then combined the low and intermediate scores to obtain two groups of SOX30 expression, low and high, and analyzed these groups with respect to patient survival data. Kaplan-Meier survival analysis with the log-rank test revealed poorer OS in NSCLC patients with low SOX30 expression than in patients with high SOX30 expression (P = 0.014) (Fig. 2A). This result was confirmed by survival analysis of the three original groups: low, intermediate and high (P = 0.007) (Fig. 2B). To account for potential confounding in the univariate analysis, SOX30 expression and other parameters were examined in a multivariate Cox-regression analysis (adjusted for age, clinical stage, gender, histological grade, tumor size and lymph node status). In addition to age (hazard ratio (HR) = 1.057, P = 0.000) and clinical stage (HR = 1.858, P = 0.000), SOX30 expression was found to be an independent prognostic factor (HR = 0.816, P = 0.027 for two groups; HR = 0.736, P = 0.014 for three groups) for the OS of NSCLC patients (Fig. 2C,D, Table S2).
To further confirm our results, we analyzed the clinical significance of SOX30 mRNA expression in human lung cancers with Kaplan-Meier Plotter (http://kmplot.com/analysis/index.php?p=background), an online tool that allowed us to evaluate the correlation of SOX30 expression (207678_s_at) with lung cancer prognosis in over 1400 clinical patients. Higher mRNA expression of SOX30 was linked to markedly longer OS in lung cancer (Figure S1A, HR = 0.76, p = 0.00065). In a multivariable analysis adjusted for histology, grade, stage, AJCC stage T, AJCC stage N, gender, age and smoking history, patients with higher mRNA expression of SOX30 also had a better prognosis (Figure S1B, HR = 0.4, p = 0.0024). Because the gene expression data in the KMplotter database come from different microarray analysis platforms, we also analyzed the clinical significance of SOX30 mRNA expression with Kaplan-Meier Plotter in three selected cohorts: the TCGA dataset (Figure S1C, HR = 0.31, p = 0.0075), the GSE19188 dataset (Figure S1D, HR = 0.4, p = 0.0023) and the GSE4573 dataset (Figure S1E, HR = 0.41, p = 0.0028). In each of these cohorts, higher mRNA expression of SOX30 was again linked to markedly longer OS (Figure S1C-E).
High expression of SOX30 suggests favorable survival outcomes in ADC patients.
To investigate the correlation between SOX30 expression and survival in ADC and SCC patients separately, the prognostic significance of SOX30 was analyzed in each subtype. In ADC patients, Kaplan-Meier analysis indicated that patients with high levels of SOX30 expression had significantly prolonged OS compared to those with low levels of SOX30 expression (p = 0.000, Fig. 3A). To account for potential confounding in the univariate analysis, multivariate Cox regression analysis was performed, and SOX30 expression was identified as an independent prognostic factor (HR = 0.405, p = 0.000), in addition to age (HR = 1.056, p = 0.001) and clinical stage (HR = 1.962, p = 0.000) in ADC patients (Fig. 3B and Table 2). In SCC patients, univariate analysis demonstrated that patients with high SOX30 expression had poorer OS than those with low SOX30 expression (p = 0.022, Fig. 3C). Moreover, multivariate Cox regression analysis showed that SOX30 expression was also an independent prognostic factor (HR = 1.283, p = 0.046), in addition to age (HR = 1.067, p = 0.024) in SCC patients (Fig. 3D, Table S3). These findings revealed that high SOX30 expression was a favorable and independent prognostic factor for ADC patients, but not for SCC patients.
Figure 2. Survival analysis of SOX30 expression in 220 NSCLC patients. (A) Kaplan-Meier survival analysis with patients split into two groups. Tissue array analysis was performed for 220 cases of NSCLC patients with survival information. Compared to patients with high SOX30 expression, patients with low SOX30 expression had an inferior OS rate. Low, staining weak and moderate; High, staining strong. (B) Kaplan-Meier survival analysis of SOX30 expression in 220 NSCLC patients split into three groups. Longer OS was observed in the higher SOX30 group as compared to the lower SOX30 group. Low, staining weak; Medium, staining moderate; High, staining strong. (C) Survival analysis of SOX30 expression in 220 NSCLC patients split into two groups by multivariate Cox regression. SOX30 expression was determined to be an independent prognostic factor. (D) Cox regression survival analysis of SOX30 expression in 220 NSCLC patients split into three groups. Longer OS was determined for the higher SOX30 group as compared to the lower SOX30 group. SOX30 protein expression was an independent prognostic factor of survival.
To further validate the results above, we also analyzed the clinical significance of SOX30 mRNA expression with Kaplan-Meier Plotter in lung ADC and SCC separately. The public dataset results are consistent with our study in lung ADC (Figure S1F, HR = 0.60, p = 0.00067), but not in lung SCC. Unfortunately, in the three selected cohorts, we could not analyze the clinical significance of SOX30 mRNA expression in lung ADC and SCC separately because of the small number of cases.
High expression of SOX30 indicates better survival of clinical stage II or lymph node-positive ADC patients.
To determine the influence of clinical stage and lymph node status on the correlation between SOX30 expression and OS of ADC patients, we stratified patients by SOX30 expression and clinical stage or lymph node status, and then analyzed the survival data. SOX30 expression was not statistically associated with OS of ADC patients at clinical stage I (P = 0.215), stage III (P = 0.156) or stage III + IV (P = 0.223), or with negative lymph nodes (P = 0.224). However, high SOX30 expression was significantly correlated with better OS of stage II (univariate analysis: P = 0.015; multivariate analysis: P = 0.013) or lymph node-positive (P = 0.000; P = 0.003) ADC patients (Fig. 4A-D). These findings suggested that high SOX30 expression was a favorable and independent prognostic factor for stage II or lymph node-positive ADC patients.
High expression of SOX30 predicts better outcome of histological grade 2 or grade 3 ADC patients.
Next, we sought to determine the impact of SOX30 expression on patient survival while considering histological grade. After stratifying patients by SOX30 expression, we analyzed patient survival data within each histological grade. High SOX30 expression was significantly correlated with better survival in histological grade 2 (univariate analysis: P = 0.000; multivariate analysis: P = 0.000) or grade 3 (P = 0.036; P = 0.025) ADC patients (Fig. 5A-D). However, SOX30 expression was not correlated with survival in histological grade 1 ADC patients (P = 0.755). These results indicated that high SOX30 expression was a favorable and independent prognostic factor for histological grade 2 or grade 3 ADC patients.
SOX30 induces cancer cell apoptosis and inhibits proliferation in lung ADC, but not in lung SCC.
Our previous studies have shown that SOX30 functions as a tumor suppressor mainly by promoting tumor cell apoptosis and inhibiting proliferation in lung ADC 13. To explore the potential role of SOX30 in lung SCC, we generated gain-of-function cell models by transfecting a SOX30-expressing construct into the human SCC NCI-H520 (H520) cell line (Fig. 6A). We then examined the effect of SOX30 over-expression on cell proliferation and viability. Five-day growth curve analysis showed that over-expression of SOX30 did not affect proliferation of H520 cells (Fig. 6B). To confirm this result, we measured the percentage of cells in each cell cycle phase and conducted Annexin V-APC/7-amino-actinomycin D double staining followed by flow cytometry analysis. The results showed that SOX30 also did not affect the H520 cell cycle or apoptosis (Fig. 6C,D). Taking our previous study into account 13, the functional data reveal that SOX30 is a tumor suppressor gene in lung ADC, but has no effect on tumor cell proliferation, cell cycle or apoptosis in lung SCC.

Discussion
In the present study, we investigated for the first time the clinical relevance of SOX30 expression in NSCLC patients. Our data show that SOX30 expression is significantly correlated with histological type and clinical stage of NSCLC patients. Kaplan-Meier analysis reveals that patients with increased expression levels of SOX30 have prolonged OS compared to patients with low SOX30 expression. Multivariate Cox-regression analysis suggests that SOX30 is an independent prognostic factor for OS in NSCLC patients. In particular, SOX30 is a favorable and independent prognostic factor in lung ADC patients, but not in lung SCC patients. Furthermore, SOX30 expression represents a favorable and independent prognostic factor for ADC patients at clinical stage II (see Table S4 (ADC-150-Sox30) in the Supplementary data (Excel) for details of the staging system used), with positive lymph nodes, or at histological grade 2 or grade 3. These data suggest that SOX30 protein expression is an independent predictor of favorable prognosis in ADC patients.
Accumulated evidence has indicated that different histologic subtypes of NSCLC show different molecular events during tumorigenesis 14 abnormalities in those tumor types separately. In our study, SOX30 expression is significantly correlated with histological type of NSCLC patients, suggesting a different clinicopathological significance of SOX30 in ADC and SCC patients. Our further univariate and multivariate analyses demonstrate that high SOX30 expression significantly correlates with better outcome in ADC patients, but not in SCC patients. However, this result was not well validated by analyses of the public dataset. There may be two reasons: 1) the gene expression data of the KMplotter database are from different microarray analysis platforms; 2) the SOX30 expression data of the dataset are mRNA levels, whereas SOX30 expression in our data is measured at the protein level, and changes in mRNA level do not always reflect changes in protein level.
Previous studies have shown that histological type may be a major cause of heterogeneity of NSCLC patients in different studies 16. Recently, specific genetic abnormalities have been identified in NSCLC, which show distinct features according to the histologic types. In general, activating mutations of EGFR, ALK and KRAS are mainly identified in ADC patients, whereas p53 mutation appears to be more frequent in SCC patients 17. These dysregulations influence apoptosis signaling pathways in NSCLC patients. In particular, our previous study revealed that SOX30 acts as a tumor suppressor by promoting cancer cell apoptosis through direct regulation of p53 transcription in ADC patients 13   promotes cell apoptosis through the direct regulation of p53 transcription in SCC patients with p53 mutation. Therefore, an explanation for the different prognostic role of SOX30 may be that different functional roles of SOX30 cause different prognostic effects in ADCs and SCCs. To explore the potential role of SOX30 in SCCs, we generated gain-of-function cell models, and then examined the effect of SOX30 over-expression on cell proliferation, cell cycle and apoptosis. The results show that SOX30 does not affect NCI-H520 cell proliferation, cell cycle or apoptosis (Fig. 6A-D). Taken together, our previous and present studies indicate that SOX30 is a tumor suppressor in ADC, but has no effect on tumor cell proliferation, cell cycle or apoptosis in SCC. This is probably the reason for the opposite associations of SOX30 expression with survival outcome in ADC and SCC.
As the efficacy of treatment options for cancer varies in different patient subgroups, identifying useful predictive and prognostic markers is necessary for improving patient survival 18,19. At present, many factors, such as age, tumor size and lymph node status, have been used to predict the outcome of cancer patients 20,21. However, these factors are unable to determine the individual risk of a cancer patient. Thus, identification of new clinically relevant prognostic markers is still of great importance. In the present study, we describe the association between SOX30 expression and survival with regard to clinical stage, lymph node status and histological grade in ADC patients. Our data suggest that high SOX30 expression is correlated with a favorable prognosis for ADC patients at clinical stage II, with positive lymph nodes, or at histological grade 2 or grade 3. From these results, we propose SOX30 expression as an independent predictor of favorable prognosis for clinical stage II, lymph node-positive, histological grade 2 or grade 3 lung ADC patients.
In our study, the five-year overall survival rate is over 40% for NSCLC patients 22-25, as most patients who require surgery are diagnosed at an early clinical stage (I and II). However, the survival curves in Fig. 3 show a different follow-up period. After a careful check, we found that this difference could be attributed to missing information in certain cases, and to multivariate analyses that were negatively affected by incomplete patient information. In our data, significantly better OS for high SOX30 expression was found only in the stage II subset and not in stage I, which may seem confusing. Because the number of stage II patients is rather small, this finding needs to be validated in a larger series. Previous studies have indicated that the number of events per variable should be at least 10 26. In the present study, several factors had a lower events-per-variable value. Therefore, we suggest revalidation of some of the results in a study with a larger sample size. Additionally, although our data identify SOX30 expression as an independent predictor of favorable prognosis in ADC patients, its predictive role for treatment response should be further explored in upcoming clinical trials.
In conclusion, SOX30 expression is significantly associated with histologic type and clinical stage in NSCLC patients, and high SOX30 expression represents an independent prognostic factor for longer survival in ADC patients, including those at clinical stage II, with positive lymph nodes, or at histological grade 2 or grade 3.

Materials and Methods
Patient samples.
Samples from a total of 220 primary NSCLC patients, including 150 ADCs and 70 SCCs, who had undergone surgical resection with curative intent between 2004 and 2007, were obtained from the Southwest Hospital in Chongqing, China. The clinico-pathologic information was retrieved from the patients' electronic medical records and included age, gender, lymph node status (negative or positive), tumor size, histological grade, clinical stage defined according to the AJCC (American Joint Committee on Cancer) 7th edition and the new TNM classification 27, and follow-up information (5-8 years) for OS rates. This study was approved by the ethics committee of the Southwest Hospital Affiliated to Third Military Medical University, and all experiments were carried out in accordance with the approved guidelines of Third Military Medical University. Informed consent was signed by all of the recruited patients.
Tissue microarray (TMA) generation.
All samples from NSCLC patients were reviewed histologically by hematoxylin and eosin staining. To construct the TMA slides, two cores were taken from each representative tumor and adjacent noncancerous tissue (within a distance of 20 mm). The adjacent noncancerous tissues were compared with normal tissue, stained with hematoxylin-eosin, and then reviewed histologically by at least two pathologists. Duplicate cylinders from intratumoral and peritumoral areas were obtained. Finally, the TMAs were constructed (in collaboration with Shanghai Biochip Company Ltd, Shanghai, China) 28.
Immunohistochemical analysis. IHC was performed using SOX30 antibody (1:100; Santa Cruz Biotechnology) as described previously 12 . Tumor cell staining was evaluated and considered positive when immunoreactivity was greater than or equal to 10%. Based on IHC, positive staining was quantified and classified into 5 categories: < 10% positive cells for 0 (score); 10% to 25% for 1; 26% to 50% for 2; 51% to 75% for 3 and ≥ 76% for 4. Staining intensity was graded as negative (scored as 0), weak (1), moderate (2) or strong (3). All core biopsies were independently reviewed by two pathologists, and expression levels were defined by the sum of the grades for the percentage of positive staining and intensity.
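The scoring scheme described above can be sketched in a few lines. This is a minimal illustration, not the authors' actual procedure or software: the function names and the intensity labels used as dictionary keys are hypothetical, and the handling of fractional percentages that fall between the published bands (for example 25.5%) is an assumption.

```python
# Staining intensity grades as described in the text.
INTENSITY_GRADES = {"negative": 0, "weak": 1, "moderate": 2, "strong": 3}


def percentage_grade(percent_positive):
    """Map the percentage of positive tumor cells (0-100) to a grade of 0-4."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if percent_positive < 10:
        return 0
    # Assumption: fractional values between the published bands
    # (e.g. 25.5%) are assigned to the higher band.
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:
        return 3
    return 4


def ihc_score(percent_positive, intensity):
    """Sum of the percentage grade (0-4) and the intensity grade (0-3)."""
    return percentage_grade(percent_positive) + INTENSITY_GRADES[intensity]


# Example: 30% positive cells with moderate staining.
print(ihc_score(30, "moderate"))
```

Under this scheme the combined score ranges from 0 to 7. The cut-offs that separate the low, medium and high expression groups are not given in the text, so they are left out of the sketch.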
Construction of SOX30 expression vector, cell transfection, MTS and flow cytometry assays.
Construction of the SOX30 expression vector, cell transfection, the MTS assay and the flow cytometry assay were performed as previously described 13.
Statistical analysis.
Statistical analyses were performed using SPSS 16.0 software (SPSS, Inc., Chicago, IL). Differences in categorical variables were analyzed by the Chi-square test and Linear-by-Linear Association (2-sided). OS was calculated according to the Kaplan-Meier method and evaluated by the log-rank test. Cox regression was used for multivariate analysis of prognostic predictors. SOX30 expression was categorized as high or low using the median score 29. A p value of less than 0.05 was taken as statistically significant.
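For readers unfamiliar with the survival methods named above, the following is a minimal pure-Python sketch of the Kaplan-Meier estimator and a two-group log-rank test. It is not the SPSS procedure used in this study. The function names are hypothetical, and the follow-up times and event indicators are invented toy values for illustration, not patient data.

```python
import math
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up durations; events: 1 = death observed, 0 = censored.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    survival, curve = 1.0, []
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value) with 1 d.f."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)  # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)  # at risk in group B
        d_a = sum(1 for u, e in zip(times_a, events_a) if e and u == t)
        d_b = sum(1 for u, e in zip(times_b, events_b) if e and u == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi_square = (observed_a - expected_a) ** 2 / variance
    # With 1 degree of freedom, the chi-square upper tail is erfc(sqrt(x / 2)).
    return chi_square, math.erfc(math.sqrt(chi_square / 2))


# Toy data (months of follow-up, invented for illustration only).
low_times, low_events = [6, 12, 18, 24, 30], [1, 1, 1, 1, 0]
high_times, high_events = [20, 36, 48, 60, 60], [1, 0, 1, 0, 0]

print(kaplan_meier(low_times, low_events))
chi_square, p_value = logrank_test(low_times, low_events, high_times, high_events)
print(f"log-rank chi-square = {chi_square:.3f}, p = {p_value:.4f}")
```

The sketch assumes at least one event occurs while both groups still have patients at risk; otherwise the variance term is zero and the test statistic is undefined. Multivariate Cox regression is not covered here, since it requires iterative fitting that is better left to a statistics package.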