Proteomic analysis of hypopharyngeal and laryngeal squamous cell carcinoma sheds light on differences in survival

The link between differences in molecular expression and survival in advanced laryngeal (LSCC) and hypopharyngeal squamous cell carcinoma (HPSCC) remains unclear. Here, we used data from the Surveillance, Epidemiology, and End Results (SEER) program, isobaric tags for relative and absolute quantitation (iTRAQ) with liquid chromatography-tandem mass spectrometry (LC–MS/MS) proteomics, and The Cancer Genome Atlas (TCGA) to explore possible disparities between HPSCC and LSCC. Our results showed significantly worse 5-year overall survival in HPSCC than in LSCC, both before and after adjusting for clinical parameters. A total of 240 differentially expressed proteins were enriched in molecular networks of cytoskeleton remodeling and antigen presentation. Moreover, HPSCC had fewer T central memory cells, T follicular helper cells, and CD4+ T memory resting cells and a lower TGF-β response, but a higher wound healing score, than LSCC. Furthermore, the expression of 9 mRNAs was significantly and independently correlated with overall survival in 126 HPSCC and LSCC patients, which was further validated in another cohort of head and neck cancers. These findings suggest that immunity signatures, as well as pathway networks that include cytoskeleton remodeling and antigen presentation, may contribute to the observed differences in survival between HPSCC and LSCC.

Although LSCC and HPSCC are similar in histomorphology, studies using immunohistochemistry or RNA sequencing have shown differences between them in several mRNAs (such as SCEL, CRNN, KRT4, SPINK5, and TGM3) and proteins (such as Ki67 and P53) 5−8 . These differences in molecular expression may be one factor contributing to the difference in survival between HPSCC and LSCC 4 . It has been suggested that protein heterogeneity may contribute to the observed differences in survival and likelihood of metastasis between HPSCC and LSCC 4,9−16 . However, other reports indicate that observed differences in survival between HNSCCs from sub-anatomic locations might not reflect differences in molecular profiles between subsites 7,17 . These conflicting conclusions may result from patient variability in age, clinical stage, tumor size, lymph node status, and other factors, as well as from methodological limitations such as comparing a small number of biomarkers or using low-resolution methods. Therefore, highly sensitive, global protein detection in HPSCC and LSCC patients with similar clinicopathological characteristics is required for a more rigorous analysis. Isobaric tags for relative and absolute quantitation (iTRAQ) can simultaneously label and quantify global proteins using labeled peptides, which can be identified by sensitive mass spectrometers 18 . Thus, iTRAQ has significant advantages over some conventional proteomics techniques and is widely used, with proven value in the discovery of global protein expression 19 . In this study, a proteomic strategy using iTRAQ with liquid chromatography and tandem mass spectrometry (LC-MS/MS) was applied to 10 pairs of cancerous and normal mucosa samples from advanced HPSCC and LSCC patients with similar clinical characteristics.
Further analyses based on the Cancer Genome Atlas (TCGA) 20 datasets support our hypothesis that the significant differentially expressed proteins and their related networks may contribute to the observed difference in survival between hypopharyngeal and laryngeal cancers.

Survival differences between hypopharyngeal and laryngeal cancer patients in SEER.
After excluding cases with distant metastasis and missing or unknown TNM information, 29,783 of 35,023 cases from the SEER database were included in this study. The clinical characteristics of these patients are summarized in Table 1. The hypopharyngeal and laryngeal cancer groups differed significantly in T and N classification, as well as in AJCC clinical stage. Hypopharyngeal cancer patients had a larger proportion of T3 or T4 tumors than laryngeal cancer patients (52.7% vs. 33.7%, p < 0.001) and a larger proportion with node-positive disease (67.8% vs. 25.5%, p < 0.001). The majority of hypopharyngeal cancer patients presented with stage IV disease (61.3%) compared to only 25.3% in the laryngeal cancer group, while only 4.9% of the hypopharyngeal cancer patients presented with stage I compared to 38.9% in the laryngeal group (p < 0.001).
It was thought that any difference in survival between patients with hypopharyngeal and laryngeal cancer might be confounded by the differences in clinical characteristics shown in Table 1. Therefore, propensity score matching was applied to match hypopharyngeal and laryngeal cancer patients and reduce bias from age, gender, T classification, N classification, and AJCC clinical stage (Table 1). After adjustment for these clinical parameters, Kaplan-Meier survival analysis still showed that patients with hypopharyngeal cancer had worse survival than those with laryngeal cancer (Fig. 1).
The relative expression of global proteins in HPSCC and LSCC patients. A differential protein profile between hypopharyngeal (HPM) and laryngeal (LM) normal mucosa was first examined to establish a reference for baseline differences and to account for variation between individual samples. The abundance of 613 proteins was altered in HPM compared to LM, with 515 proteins overexpressed and 98 proteins underexpressed (Table S1, Supplementary Data). Thirty proteins did not show differential expression in HPM vs. LM (Table S2, Supplementary Data). In HPSCC vs. LSCC samples, 424 proteins were found to be differentially expressed. To account for the baseline variations found in the analysis of normal mucosa samples, 184 differentially expressed proteins that were common to both tumor and normal mucosa analyses were excluded from the 424-protein list. The remaining 240 proteins were considered tumor-related differentially expressed proteins and included Vimentin, β-catenin, HLA-A/B/C, and MICA, among others (Table S3, Supplementary Data).
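The baseline-exclusion step described above amounts to a set difference between the tumor and normal-mucosa lists. The following is a minimal sketch of that arithmetic only; the protein identifiers are hypothetical placeholders generated for illustration (the actual lists are in Tables S1 and S3), and the function name `tumor_specific` is an assumption, not part of the original analysis.

```python
def tumor_specific(tumor_diff, normal_diff):
    """Return proteins that are differential in tumors but not in normal mucosa."""
    return set(tumor_diff) - set(normal_diff)


# Hypothetical placeholder identifiers, sized to match the counts in the text.
# 424 proteins differential in HPSCC vs. LSCC:
tumor_diff = {f"P{i:03d}" for i in range(424)}
# 613 proteins differential in HPM vs. LM, of which 184 are shared with the tumor list:
normal_diff = {f"P{i:03d}" for i in range(184)} | {f"N{i:03d}" for i in range(429)}

remaining = tumor_specific(tumor_diff, normal_diff)
print(len(remaining))  # 240 = 424 - 184
```

Using sets makes the exclusion independent of list order and guards against duplicated identifiers.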
GO Enrichment Maps and Networks for significantly altered proteins. To annotate the above 240 differentially expressed proteins, we applied the MetaCore mapping tool to generate an enrichment analysis of pathway maps and cellular processes (Fig. 2A-B, respectively). MetaCore can generate both pathway maps, which are manually created based on known canonical pathway data, and networks, which are automatically created by the tool based on existing protein interactions within a database. The Gene Ontology analysis indicated that the most significantly enriched pathway networks were involved in cytoskeleton remodeling and antigen presentation.
The landscape of cancer immunity in hypopharyngeal and laryngeal cancer. The proportion of cancer-infiltrating immune cells was deconvoluted by the CIBERSORT method in hypopharyngeal and laryngeal cancer patients with matched clinical stage, gender, and age from the TCGA public dataset 21,22 (Fig. 4A). The proportions of CD4+ T memory resting cells and mast cells appeared to be drastically different between laryngeal and hypopharyngeal cancer. Moreover, we constructed a heatmap for known immune checkpoint biomarkers (Fig. 4B), which indicated the differential expression of immune checkpoints such as CTLA-4, LAG-3, and TIGIT.
Considering that the differentially expressed proteins in the above iTRAQ results were enriched in the cytoskeleton remodeling and immune response pathways, immune infiltration and wound healing were further compared in 10 paired hypopharyngeal and laryngeal cancer patients (Table S4, Supplementary Data). As shown in Fig. 4C-J, hypopharyngeal cancer patients had fewer T central memory (Tcm) cells, T follicular helper (Tfh) cells, and CD4+ T memory resting cells and a lower TGF-beta response, but a higher wound healing score, than laryngeal cancer patients (p < 0.05). To investigate whether the differentially expressed proteins may contribute to differences in prognosis between hypopharyngeal and laryngeal cancer, we analyzed the mRNAs corresponding to these 240 differentially expressed proteins in hypopharyngeal and laryngeal cancers from the TCGA dataset. On univariate Cox regression analysis, mRNA expression of 53 genes was significantly associated with overall survival in 126 hypopharyngeal and laryngeal cancer patients (Table 3). Moreover, Kaplan-Meier analysis indicated that high or low expression of genes, including RALY, TSTA3, and HLA-A, was associated with significant differences in survival in hypopharyngeal and laryngeal cancer patients (p < 0.05, Fig. 5). To test whether these findings extend to other head and neck cancers, univariate and multivariate Cox proportional hazards regression analyses were performed to assess the correlations between the 53 genes and overall survival in 519 HNSCC patients across all subsites, including the oral cavity and oropharynx. Nine of the 53 genes were confirmed in Table 3, and the associated survival curves were visualized and compared by Kaplan-Meier analysis (Fig. 6).
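The Kaplan-Meier comparisons above rest on the product-limit estimator, with follow-up capped at 60 months as stated in the methods. The following is a minimal pure-Python sketch of that estimator; it is not the 'survival' package implementation used in the study, and the function name `kaplan_meier` and the toy follow-up data are assumptions made for illustration.

```python
def kaplan_meier(times, events, max_time=60.0):
    """Return [(time, survival probability)] at each distinct death time.

    times    : follow-up in months
    events   : 1 if death was observed, 0 if censored
    max_time : follow-up beyond this is truncated and treated as censored
    """
    # Apply administrative censoring at max_time.
    data = sorted((max_time, 0) if t > max_time else (t, e)
                  for t, e in zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy cohort (months, event indicator), not study data.
toy_times = [6, 12, 12, 30, 48, 72, 90]
toy_events = [1, 1, 0, 1, 0, 1, 1]
for month, prob in kaplan_meier(toy_times, toy_events):
    print(month, round(prob, 3))
```

In the toy cohort, the two deaths recorded after 60 months do not lower the curve, because those patients are treated as censored at 60 months.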

Discussion
HNSCC is a heterogeneous group of cancers arising in various anatomic locations 1,23 . A recent EURO-CARE population-based study indicated that the 5-year relative survival rate was lowest for hypopharyngeal cancer (25%) and highest for laryngeal cancer (59%) 24 , which is consistent with our findings. Most interestingly, these drastically different survival rates occur in cancers that arise in anatomically adjacent locations 24,25 . Some clinical parameters, such as clinical stage, age, and gender, are known to influence the survival time of cancer patients 26 . One argument as to why patients with hypopharyngeal cancer often do worse than laryngeal cancer patients is that hypopharyngeal cancer is usually diagnosed at a more advanced stage. Only 4.9% of the hypopharyngeal cancer patients presented with stage I cancer, compared to 38.9% in the laryngeal group in this study. However, using propensity score matching, we compared the survival of hypopharyngeal and laryngeal cancer patients with similar clinical parameters, including clinical stage, T stage, age, gender, and neck node status. Our findings demonstrate that even among clinical stage I and T1-2 tumors, hypopharyngeal cancer portends a worse prognosis. Our study suggests that HPSCC may demonstrate more aggressive biology irrespective of age, gender, T classification, N classification, and AJCC clinical stage. To the best of our knowledge, this is the first study to compare the survival of HPSCC and LSCC while controlling for these clinical factors in a large population. Another explanation for the worse survival in HPSCC is the abundant vascular supply and lymphatic drainage of the hypopharynx. However, the oropharynx is very similar to the hypopharynx in terms of its tissue structure, and also has a robust vascular supply with ample lymphatic tissue. Despite this, oropharyngeal squamous cell carcinoma patients tend to have better survival than HPSCC patients.
Therefore, the anatomic structure alone in HPSCC does not sufficiently explain the higher likelihood of a worse prognosis. As such, we turned our attention towards a molecular characterization of HPSCC to investigate its possible contribution to its worse outcomes.
There have been several studies examining the transcriptomic and genomic heterogeneity in HNSCC across various subsites, including oral cavity, tonsil, and oropharynx 27−29 . These works reached conflicting conclusions when comparing genomes and their expression 7,17 . However, no studies have specifically compared hypopharyngeal and laryngeal cancer. In this work, we employed iTRAQ (2D) LC-MS/MS proteomic analysis to measure global protein expression in HPSCC and LSCC. Traditional methods for detecting protein expression, such as western blotting, usually limit the number of proteins that can be investigated simultaneously. In contrast, our proteomic method can yield information on hundreds of proteins from a relatively small amount of tissue. Furthermore, iTRAQ covers more peptides and provides more sensitive quantification than traditional methods such as DIGE 30 .
Although this study had a limited number of sample pairs for iTRAQ proteomic analysis, we employed a strict method of analysis to increase the confidence in our results. To eliminate intra-patient variation, normal mucosa from the same patients was analyzed simultaneously with tumor tissue. Additionally, our proteomic samples from HPSCC and LSCC were selected for similar clinical parameters, such as clinical stage, to decrease confounding bias. Two hundred forty differentially expressed proteins and 208 non-altered proteins were found in the comparison of HPSCC and LSCC, which indicates that both molecular differences and similarities exist. The 208 similarities may represent histological and anatomical commonalities between HPSCC and LSCC. Regarding the differences, it would be both interesting and clinically meaningful if the observed differences in survival were related to, or even explained in part by, the distinct protein expression patterns seen.
The differentially expressed proteins that we identified were analyzed using MetaCore, a comprehensive pathway analysis and knowledge mining tool that delivers high-quality biological systems content in context 31 . Cytoskeleton remodeling (intermediate filaments, integrin-mediated cell-matrix adhesion, and cytoskeleton rearrangement) and phagosomes in antigen presentation were the top networks enriched by these 240 proteins. Cytoskeleton remodeling in the cancer cell has been linked with increased cell mobility and facilitated metastasis, which may be linked to unfavorable survival time in some patients 32 . Strikingly, we found that many of these proteins were enriched in pathways related to antigen presentation by MHC Class I and phagosomal machinery. Given the crucial effect of antigen presentation on immune-mediated cancer clearance, these data suggest the need for further investigation into the role of innate or adaptive immunity-related differences in HPSCC and LSCC. TCGA is a public landmark cancer genomics program that aims to catalog and discover major genomic alterations to create a comprehensive "atlas" of 11,000 primary cancers 33 . The immunity signatures based on mRNA expression data from TCGA were obtained from previously published work 21,22 . Here, we showed that hypopharyngeal cancer had fewer T central memory cells, T follicular helper cells, and CD4+ T memory resting cells and a lower TGF-beta response, but a higher wound healing score, than laryngeal cancer. These findings suggest that hypopharyngeal cancers display an altered immune response that may affect survival relative to laryngeal cancer. In addition, taken together with the differentially expressed immune checkpoint-related proteins found between HPSCC and LSCC, these findings may have clinical implications with regard to the response to checkpoint inhibition for these cancers in clinical trials 34 .
We also found that expression of 53 out of 240 genes was significantly associated with overall survival on univariate Cox regression analysis in 126 HPSCC and LSCC patients using the TCGA dataset. Similar correlations of survival with 9 of these 53 genes were observed when including all 519 HNSCC patients in the TCGA cohort across all subsites, including the oral cavity and oropharynx (Fig. 6). These data support the idea that the observed survival differences among HPSCC and LSCC patients may be related to differential expression of these 9 proteins, including RALY, TSTA3, and HLA-A. RALY, a member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family, is thought to be involved in mRNA splicing and metabolism. Its role in tumorigenesis and development remains unclear. However, it has been reported as an oncogene in hepatocellular carcinoma 35 , clear cell renal cell carcinoma 36 , triple-negative breast cancer 37 , and non-small-cell lung cancer 38 . TSTA3, also known as GFUS, participates in pathways of transport to the Golgi apparatus as well as metabolism 39 , and is considered an oncogene in many cancers, including esophageal squamous cell carcinoma 40 . To the best of our knowledge, this is the first study to report the correlation of RALY and TSTA3 with overall survival in head and neck cancer. Finally, high expression of HLA-A or HLA-C is a favorable factor for head and neck cancer patients, as it correlates with increased antigen presentation leading to immune-mediated tumor clearance.
There are several limitations in our study. While a main advantage of our study is the selection of patients with similar clinical characteristics to minimize their effect as confounding factors, we cannot eliminate bias that may arise from the development of the individual tumors themselves. Specifically, without fully characterizing the timeline of these protein aberrations with regard to tumor development, it is not known whether these differences in protein expression are a cause or a result of tumor development. Additionally, while we performed our proteomic analysis using a small sample of patients from our own institution, we performed our validation using mRNA expression data from the TCGA. While mRNA transcript and protein levels often covary closely within the cell, one is not a perfect surrogate for the other. Ideally, future experiments using the same proteomic analysis in a larger cohort should be performed to validate our findings. Additionally, in vivo experiments, such as gene knockout experiments in mice, may be more informative regarding the clinical implications of the genes discovered in this exploratory analysis. Finally, an important limitation of our study is the lack of information regarding the type of treatment patients received. It is plausible that some bias is introduced, especially with regard to survival outcomes, depending on whether a patient received surgery, chemotherapy, radiation, or any combination of these. Unfortunately, a majority of the patients included in our study had treatment information that was either missing or incomplete. In this setting, including treatment as a covariate would create additional bias from the exclusion of such a large percentage of our sample population, and would significantly reduce the power of our study.
By controlling for cancer stage, however, it was thought that the risk of bias from treatment effect would be somewhat minimized, as patients with similar stage, especially later stage, are more likely to undergo a similar course of treatment with multimodality therapy.

Conclusion
We showed that hypopharyngeal carcinoma patients had significantly poorer survival than laryngeal carcinoma patients, independent of age, gender, tumor size, neck lymph node status, and AJCC clinical stage. A total of 240 proteins were differentially expressed between hypopharyngeal carcinoma and laryngeal carcinoma, and these were enriched in the networks of cytoskeleton remodeling and antigen presentation. Nine of these 240 molecules correlated with overall survival time in HPSCC/LSCC and HNSCC. Hypopharyngeal carcinoma had fewer Tcm cells, Tfh cells, and CD4+ T memory resting cells and a lower TGF-β response, but a higher wound healing score, than laryngeal carcinoma.
In addition to diagnosis at a relatively late clinical stage and abundant vascular supply and lymphatic drainage, our data suggest that differentially expressed proteins may be among the reasons why hypopharyngeal cancer has a poorer prognosis than laryngeal cancer. This study provides a comprehensive view of the disparities between HPSCC and LSCC, which should be kept in mind when treating patients or designing clinical trials.

Materials and methods
Data sources. All experiments in this study followed the principles of the 1983 Declaration of Helsinki. Informed consent was obtained from all subjects or, if subjects were under 18, from a parent and/or legal guardian. Informed consent was obtained from all patients before iTRAQ proteomic analysis. The iTRAQ (2D) LC-MS/MS experiments were performed previously 43,44 .
Briefly, normal mucosa and primary tumor samples of 10 HPSCC and 10 LSCC patients were collected under the approval of the Medical Research Ethics Committee of Xiangya Hospital (Changsha, China). The hypopharyngeal normal mucosa and primary tumor specimens were labeled IT118 and IT121, while the laryngeal normal and tumor specimens were labeled IT115 and IT113, respectively. The relative global protein expression in HPSCC and LSCC samples was analyzed using Protein Pilot v3.0 software (Applied Biosystems) according to the human International Protein Index (IPI) database v3.45.
To minimize the false positive rate, a strict cutoff for protein identification was used based on the following criteria: unused ProtScore > 1.3 and more than one peptide with 95% confidence per repetition. In total, 43,673 spectra, 19,882 peptides, and 853 proteins were identified and quantified at a 5% global false discovery rate. Protein relative expression ratios were based on the peak area ratios of the peptides from the same protein. The resulting dataset was auto bias-corrected to eliminate any variability due to unequal mixing of the differently labeled digests. A fold change in protein expression greater than 1.2 or less than 0.8 was considered significant, with values in between considered as similar expression. MetaCore (Gene-Go; St. Joseph, MI, USA) from Clarivate Analytics, an integrated program with manually curated databases and practical algorithms for functional analysis, was applied to annotate the functions of the differentially expressed proteins 45 .
Software and statistics. All statistical analyses were performed using RStudio (version 3.5.3, https://cran.r-project.org/). Briefly, Kaplan-Meier survival analysis was performed using R packages ('survival', 'survminer'). The "surv_cutpoint" command was used to identify the best cutoff for 'High expression' or 'Low expression' in Kaplan-Meier survival analysis.
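The fold-change criterion stated in the proteomic methods (a ratio greater than 1.2 or less than 0.8 counts as differential, anything in between as similar) can be sketched as a simple classifier. This is an illustrative sketch only; the function name `classify_fold_change`, the labels, and the example ratios are assumptions, and treating values of exactly 1.2 or 0.8 as similar is one reading of the stated rule.

```python
def classify_fold_change(ratio, upper=1.2, lower=0.8):
    """Label an expression ratio using the thresholds stated in the methods."""
    if ratio > upper:
        return "up"
    if ratio < lower:
        return "down"
    return "similar"


# Hypothetical example ratios (tumor group A / tumor group B), not study data.
example_ratios = {"protein_a": 1.85, "protein_b": 0.62, "protein_c": 1.05}
labels = {name: classify_fold_change(r) for name, r in example_ratios.items()}
print(labels)
```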
The log-rank test was used to analyze differences in survival, with a p-value of less than 0.05 considered statistically significant. Overall survival was censored at a maximum time of 60 months. Univariable Cox proportional hazards regression models were used to correlate survival with mRNA expression. Violation of the proportional hazards assumption was tested using the 'cox.zph' function in the "survival" package. Propensity score matching (PSM) was performed to match clinical parameters for patients in the SEER and TCGA datasets using the R package "MatchIt" 46 .
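For readers unfamiliar with the log-rank test named above, the two-group statistic can be sketched in a few lines. This is a minimal pure-Python illustration, not the R 'survival' implementation used in the study; the function name `logrank_test` and the toy data are assumptions. The p-value uses the identity that, for a chi-square variable with 1 degree of freedom, the upper tail probability equals erfc(sqrt(x / 2)).

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value) with 1 df."""
    # Distinct times at which at least one death occurred in either group.
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the death count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical toy groups (months of follow-up, all deaths observed).
chi2, p = logrank_test([2, 3, 4, 5, 6], [1] * 5, [20, 25, 30, 35, 40], [1] * 5)
print(round(chi2, 2), p < 0.05)
```

The function takes raw follow-up times, so any cap such as the 60-month censoring would need to be applied to the inputs beforehand.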