Exploring ALDH2 expression and immune infiltration in HNSC and its correlation of prognosis with gender or alcohol intake

The aldehyde dehydrogenase 2 point mutation (ALDH2*2) is a common human gene variant, especially in East Asians. However, the expression and mechanism of action of ALDH2 in HNSC remain unknown. The present study explored the clinical significance and immune characteristics of ALDH2 in HNSC. The receiver operating characteristic curve was analysed to assess the diagnostic value of ALDH2 expression. ALDH2 expression in normal tissues and HNSC tissues was evaluated by IHC, and we also analysed ALDH2 gene expression in 4 HNSC cell lines. ALDH2 expression was significantly reduced in HNSC tissues compared to normal tissues (p < 0.05). HNSC patients with high ALDH2 expression had a better prognosis than patients with low ALDH2 expression (p < 0.05). GSEA indicated that the ALDH2-correlated gene sets were associated with signalling pathways, including the JAK-STAT signalling pathway. Unexpectedly, we found a significant prognostic effect of ALDH2 for HNSC based on alcohol consumption and the male sex. The correlation between ALDH2 expression and immune inhibitors suggested a role for ALDH2 in modifying tumour immunology in HNSC and pointed to a possible mechanism by which ALDH2 regulates the functions of T cells in HNSC. In addition, we developed a prognostic nomogram for HNSC patients, which suggested that low ALDH2 expression indicated poor prognosis in HNSC patients who were males and alcoholics.

Head and neck squamous carcinoma (HNSC) is the eighth most common malignancy according to the information reported in Global Cancer Statistics 2021 1. The most promising method to reduce mortality is early cancer diagnosis, as early detection is correlated with a more favourable prognosis for almost all types of cancer 2. Currently, there is a lack of accurate biomarkers for the detection of HNSC. With the development of high-throughput sequencing technology, many important diagnostic genes are constantly being discovered. However, key biomarkers that can be used to enhance the prognosis of patients with HNSC remain to be confirmed.
Acetaldehyde dehydrogenase 2 (ALDH2) is a main enzyme for acetaldehyde metabolism during alcohol metabolism. ALDH2 is acknowledged for its alcohol oxidation among many aldehyde dehydrogenase genes, and approximately 30% to 40% of Asians have genetic defects in this enzyme. Individual exposure to large amounts of the catalytic active form of acetaldehyde may also make the individual more susceptible to many types of cancer. ALDH2 gene defects are correlated with an increased risk of hepatocellular carcinoma in patients with hepatitis B cirrhosis due to excessive alcohol consumption 3 . Jin et al. showed that ALDH2 plays a key role as a cancer suppressor by sustaining the stability of the liver genome, and the common human ALDH2 mutation may be an important risk factor for liver cancer 4 . A recent study has found a significant positive dose-response correlation between DFS and drinking history in HNSC patients with ALDH2 Glu/Glu 5 . However, the function of ALDH2 in HNSC remains unclear.
In the present study, we explored ALDH2 expression in numerous neoplasms using The Cancer Genome Atlas (TCGA) and its association with HNSC patient prognosis. Gene set enrichment analysis (GSEA) was applied to further evaluate the biological functions of the ALDH2 regulatory network correlated with HNSC pathogenesis. Because infiltration of immune cells is vital for the prognosis of HNSC patients, we also analysed the connection
The expression of ALDH2 at the transcriptional level and its diagnostic function. The ALDH2 expression from TCGA pan-cancer data was analysed. Compared to normal tissues, the expression level of ALDH2 in cancer tissues was significantly reduced (Fig. 1A). To verify the expression of ALDH2 in HNSC patients, we used 3 related GEO datasets, which showed ALDH2 expression differences similar to those in TCGA-HNSC (Fig. 1B-D). ALDH2 expression differences between nonpaired samples were statistically significant, as shown in Fig. 1E. Among 36 pairs of matched tissues, the expression of ALDH2 in tumour tissues and paraneoplastic tissues was also significantly different (Fig. 1F). To evaluate the diagnostic efficacy of ALDH2, we performed ROC curve analysis on the expression data from tumour and normal tissues. The area under the ROC curve was 0.833 [95% confidence interval (CI): 0.793-0.872] (Fig. 1G).
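The area under the ROC curve reported above can be read as the probability that a randomly chosen tumour sample is ranked as more tumour-like than a randomly chosen normal sample. The following is a minimal sketch of that calculation, not the authors' pipeline: the function name `roc_auc` and all expression values are hypothetical, and the confidence interval is not reproduced. Because ALDH2 is lower in tumour tissue, expression is negated so that a higher score means more tumour-like.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs in which the case
    scores higher than the control; ties count as half a win."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical ALDH2 expression values (illustrative only, not TCGA data).
tumour_expr = [2.1, 3.0, 2.7, 3.9, 2.4]
normal_expr = [4.2, 3.8, 5.0, 3.5]

# Negate expression: low ALDH2 should map to a high "tumour" score.
auc = roc_auc([-v for v in tumour_expr], [-v for v in normal_expr])
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, so the reported 0.833 indicates that ALDH2 expression separates tumour from normal tissue reasonably well.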
Expression of ALDH2 in tissues and cell lines. Immunohistochemistry (IHC) staining showed the differential expression of ALDH2 in HNSC tissues. The staining intensity and quantity were significantly downregulated in HNSC compared to normal tissue (Fig. 2A). We also examined the ALDH2 alteration frequency across HNSCs (Fig. 2B). Importantly, ALDH2 expression levels differed among HNSC cell lines, with the highest expression in CAL33 cells and the lowest expression in YD15 cells (Fig. 2C).
Network analysis of the differentially expressed genes correlated with ALDH2 in HNSC. The mRNA sequencing data of 502 HNSC patients from TCGA were analysed by the functional module of LinkedOmics. As shown in the volcano plot (Fig. 4A), red dots indicate a significant positive correlation with the ALDH2 gene, whereas green dots represent a significant negative correlation with the ALDH2 gene (false discovery rate [FDR] < 0.01). The interacting gene network was predicted using GENEMANIA (Fig. 4B). The heatmap shows 50 gene sets that were significantly positively or negatively correlated with ALDH2 (Fig. 4C).
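The correlation screen described above combines a per-gene Pearson correlation with a false discovery rate threshold. The sketch below illustrates both steps under stated assumptions: the text does not say how the FDR was computed, so the Benjamini-Hochberg adjustment is assumed here as a common choice, and the function names, expression values and p-values are all hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical expression of ALDH2 and one other gene across five samples.
aldh2 = [5.1, 4.3, 6.0, 3.8, 5.5]
gene_a = [2.0, 1.6, 2.5, 1.2, 2.2]
print(f"r(ALDH2, gene_a) = {pearson_r(aldh2, gene_a):.3f}")

# Hypothetical raw p-values for five genes; keep those with FDR < 0.01.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.6]
fdr = benjamini_hochberg(raw_p)
significant = [i for i, q in enumerate(fdr) if q < 0.01]
print(fdr, significant)
```

In a volcano plot such as Fig. 4A, the sign of the correlation would determine the colour of each dot and the adjusted p-value would determine whether it passes the significance threshold.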
Function and enrichment analyses. The clusterProfiler R software package was used to analyse highly correlated gene groups to explore possible functional pathways. GO functional enrichment analysis indicated that ALDH2 was correlated primarily with immune cell proliferation-related pathways, including T cell proliferation and regulation of leukocyte proliferation (Fig. 5A-C). GSEA was used to search the Reactome pathway and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. The KEGG results indicated that the JAK-STAT signalling pathway, transcriptional misregulation in cancer and microRNAs in cancer were significantly enriched (Fig. 5D). Reactome pathway analysis revealed significant enrichment in the VEGFA-VEGFR2 pathway, death receptor signalling and adaptive immune system pathways (Fig. 5E). These results indicated that ALDH2 expression is correlated with complicated oncogenic pathway hyperactivation in head and neck squamous carcinoma, especially signalling that correlates with cell proliferation and the immune system.
Survival analysis suggests prognostic significance of ALDH2. Disease-specific survival, overall survival and progression-free survival were analysed using the Kaplan-Meier method. Interestingly, high ALDH2 expression correlated with good prognosis in HNSC patients (Fig. 7A-C). A multivariable Cox proportional hazard model is shown in Fig. 7D, which indicated that age (P < 0.01) and stage (P < 0.05) were important covariates in predicting survival. Surprisingly, our subgroup analysis showed significant differences in overall survival across the alcohol history and sex subgroups (Fig. 8A-D), underlining the importance of ALDH2 in predicting the outcomes of male patients who are regular drinkers. Finally, we established a nomogram combining prognostic information from clinicopathologic data and ALDH2 expression to predict the prognosis of HNSC patients (Fig. 8E). The findings indicated that the ALDH2 expression level has important implications for the survival prediction of HNSC patients.

Discussion
ALDH2, located on chromosome 12q24.12, belongs to the aldehyde dehydrogenase family of proteins 7 . Although previous studies have shown that significant differences in the ALDH2 genotype lead to different prognoses in cancer patients 8,9 , its biological roles and prognostic value in HNSC have rarely been characterized. To our knowledge, this is the first study to assess the influence of alcohol intake combined with ALDH2 expression on clinical survival in HNSC patients. Our findings contribute to existing knowledge, strengthen treatment design and improve the prognostic classification for HNSC patients.
In the present study, we found that ALDH2 was downregulated in HNSC cancer tissues compared to normal tissues and that high ALDH2 expression indicated a good prognosis and was associated with lower tumour stage. The results of univariate and multivariate Cox analyses indicated that ALDH2 might be a prospective independent biomarker for the prognosis of HNSC. We also explored the regulatory networks and genes significantly correlated with ALDH2. Finally, a correlation analysis between immune signatures or immune infiltration and ALDH2 was conducted. The results showed that ALDH2 was correlated with most immune marker genes and that T cell infiltration may be an important prognostic factor. The findings of the present study will guide research on HNSC in the future.
To explore the mechanism by which ALDH2 prevents the occurrence and progression of HNSC, DEGs were screened through correlation analysis, and functional annotation and pathway analyses were performed. A gene network was established, and the functional annotation and pathway analysis of the genes in the main gene clusters were explored. The results suggested that ALDH2 mainly affects the occurrence and progression of HNSC through the JAK-STAT signalling pathway (Fig. 5).
The JAK-STAT pathway is one of the most significant pathways in HNSC. In head and neck carcinoma, STATs are initiated through a variety of signal transduction pathways, including epidermal growth factor receptor (EGFR), erythropoietin receptors and interleukin (IL) receptor pathways 10 . The functional enrichment analysis found that ALDH2 was related to the JAK-STAT pathway, indicating that the possible mechanism of the differential expression and prognostic value of ALDH2 in HNSC is related to the JAK-STAT pathway. A recent study has confirmed the inhibitory effect of aldehyde on the JAK2 signalling pathway through in vitro experiments 11 , further suggesting the possibility that ALDH2 participates in the JAK pathway. However, further experiments are needed to confirm the conclusions of this research. Correlation analysis (Supplemental Fig. 1) indicated that ALDH2 was positively correlated with JAK family genes, thus providing a direction for future research. In the future, in vivo and in vitro experiments should be used to explore the regulatory mechanism of ALDH2 in the JAK/STAT pathway.
GSEA also indicated that the expression of ALDH2 was related to the death receptor signalling pathway. Previous studies have suggested that death receptor signalling is related to antigen-independent drug resistance in leukaemia by inducing CAR-T cell dysfunction 12. However, it remains to be further explored whether ALDH2 modulates HNSC progression via the death receptor pathway.
Immune cells in the tumour microenvironment are important components that regulate the progression behaviours of tumour cells 13-16. Another significant feature of this research was the correlation between ALDH2 expression and diverse levels of immune infiltration in HNSC. Our results showed that ALDH2 expression was related to the infiltration level of macrophages, neutrophils, CD8+ T cells, DCs, CD4+ T cells, and B cells in HNSC (Fig. 6A, B). In addition, the correlation between immune inhibitors and ALDH2 expression suggested that ALDH2 regulates tumour immunity in HNSC. These correlations suggested a potential mechanism by which ALDH2 regulates T cell function in HNSC, indicating that ALDH2 plays a key role in the recruitment and regulation of HNSC immune-infiltrating cells. Because ALDH2 is closely related to the immune system, it is worth investigating the role of ALDH2 in cancer immunotherapy. Recently, several researchers have found that ALDH2 mediates the immune evasion induced by alcohol in colorectal cancer by stabilizing the expression of PD-L1 17. It is also worthwhile to examine whether ALDH2 has this effect in HNSC immunotherapy. Although we used TCGA data to explore the immune infiltration of ALDH2 in HNSC, such statistical inferences may only be suggestive. Future research should involve the collection of large-scale case data for immunological testing to promote the application of ALDH2 in HNSC immunotherapy.
Levels of alcohol drinking are closely related to various types of cancer 18. ALDH2 dysfunction initiates numerous diseases, such as cardiovascular diseases and cancer 19. Previous studies have not highlighted the independent prognostic ability of ALDH2, but it was confirmed in our work. Unexpectedly, we found a significant prognostic effect of ALDH2 for HNSC patients based on alcohol consumption and the male sex. Cox analysis suggested that ALDH2 may be a potential independent biomarker for the prognosis of HNSC. In the multivariable Cox proportional hazard model, only the Stage 4 and age subgroups were statistically significant. In the prognostic analysis of subgroups (Supplemental Fig. 2), ALDH2 had greater prognostic significance in the advanced groups (i.e., T3, T4, N1, N2, N3, Stage III and Stage IV), suggesting that the prognostic value of ALDH2 in early HNSC is limited. Therefore, when using ALDH2 for prognostic prediction, attention should be given to the priority groups, that is, alcohol consumption, men, and advanced stage.
In summary, medical and biological research on ALDH2 has received increasing attention 7. Additionally, Alda-1, which restores the activity of the ALDH2*2 enzyme, has the potential to reverse some of the ALDH2*2 population. Therefore, Alda-1 may be used to improve the normal function of ALDH2 and improve the prognosis of HNSC patients 20,21. This pilot study had several limitations. For example, the present study was a retrospective study. The role and prognosis of the ALDH2 gene in HNSC require a prospective and functional study to provide more accurate information.

Conclusion
This is the first study describing the correlation between the prognosis of alcohol drinkers and ALDH2 gene expression levels. This research provided comprehensive evidence for the function of ALDH2 in the progression of HNSC and its potential as a prognostic predictor and biotherapy target. These findings can be used to forecast the survival prognosis of patients, especially HNSC patients who are men and alcoholics.

Materials and methods
TCGA RNA-sequencing data. In total, 502 HNSC cases and 44 normal samples were included in the gene sequence expression data. Detailed information on the clinicopathological data was downloaded from TCGA data portal (https://www.cancer.gov/). The RNA-Seq gene expression data and clinicopathological data of 502 patients were processed and further analysed (Table 1). This research was conducted under the provisions of the Declaration of Helsinki (revised in 2013).
Differentially expressed genes (DEGs) and gene functional analysis. We performed Pearson correlation analysis to explore the genes significantly correlated with the expression of ALDH2, and the DEGs were retrieved using the heatmap package of R software. The clinical and mutation information for HNSC patients was obtained from cBioPortal 22. Gene interactions were predicted using GENEMANIA 23. LinkedOmics, an open portal, includes multiple sets of data for all 32 cancer types from TCGA 24. By using the "LinkFinder" function of LinkedOmics, we performed the Pearson's test for

Cox proportional hazard model and nomograms.
The Cox proportional hazard model was developed using the TIMER survival module 28 . The covariates consisted of clinical factors (sex, age, tumour stage and ethnicity) and gene expression. The survival model enables researchers to study the clinical relevance of subassemblies from the tumour immune system. The results from TIMER were uploaded to R (version 3.6.3).
The meta-analysis and forest plot were generated by the ggplot2 package. Nomograms are widely used to predict the prognosis of cancer patients. In this study, a nomogram was constructed using gene expression and clinical factors 29 . The multivariable model nomograms were generated by the rms package (6.2.0) 30 in R software (3.6.3).
Immunohistochemistry and evaluation of immunostaining intensity. ALDH2 expression was analysed using the Human Protein Atlas 31 . Immunostaining was performed using a rabbit anti-ALDH2 antibody (HPA051065). IHC staining was graded as high, medium, low or not detected. The IHC intensity was graded as strong, moderate, weak or negative. The quantity of IHC intensity was graded as > 75%, 75-25%, < 25% and none.
Statistical analysis. R software (3.6.3) was used for all statistical analyses. The relationship between clinicopathological characteristics was assessed by logistic regression and the Wilcoxon signed rank test. Clinicopathological characteristics associated with overall survival (OS) in HNSC patients from TCGA were analysed by Kaplan-Meier methods and Cox regression. The receiver operating characteristic (ROC) curve was analysed using Wilson's method. Univariate logistic regression was used to analyse the correlation between clinicopathological characteristics and ALDH2 expression. Univariate and multifactorial Cox analyses were used to assess the effect of ALDH2 expression on survival with other clinicopathological characteristics (e.g., grading, staging, lymph node status, distant metastatic status and age). The ALDH2 expression cut-off value was set as the median, and HNSC patients were divided into two groups.
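The median cut-off and Kaplan-Meier analysis described above can be illustrated as follows. This is a minimal sketch rather than the R code used in the study: the function names are hypothetical, the data are invented, and assigning values equal to the median to the low group is an assumption, because the text does not state how ties were handled.

```python
import statistics


def median_split(expression):
    """Label each sample 'high' or 'low' relative to the median.
    Values equal to the median go to 'low' (an assumption)."""
    cutoff = statistics.median(expression)
    return ["high" if v > cutoff else "low" for v in expression]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each event time.
    events: 1 = death observed, 0 = censored."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Everyone with this follow-up time leaves the risk set.
        at_risk -= leaving
    return curve


# Hypothetical cohort: ALDH2 expression, follow-up (months), death indicator.
expr = [3.0, 1.0, 4.0, 2.0, 5.0, 0.5]
times = [30, 8, 42, 12, 50, 5]
events = [0, 1, 1, 1, 0, 1]

groups = median_split(expr)
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curve = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
    print(label, curve)
```

Comparing the two resulting curves, typically with a log-rank test, is what underlies a statement such as high ALDH2 expression correlating with better survival.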
Ethical approval. The authors are responsible for all aspects of the work, including ensuring that issues related to the accuracy or completeness of any part of the work are properly investigated and resolved. This research was conducted under the provisions of the Declaration of Helsinki (revised in 2013). All data used in this research were publicly available, and approval from the local Ethics Committee was not required.