Immune, metabolic landscapes of prognostic signatures for lung adenocarcinoma based on a novel deep learning framework

Qin, Shimei; Sun, Shibin; Wang, Yahui; Li, Chao; Fu, Lei; Wu, Ming; Yan, Jinxing; Li, Wan; Lv, Junjie; Chen, Lina

doi:10.1038/s41598-023-51108-x

Article
Open access
Published: 04 January 2024

Shimei Qin¹^na1,
Shibin Sun¹^na1,
Yahui Wang¹^na1,
Chao Li¹,
Lei Fu¹,
Ming Wu¹,
Jinxing Yan¹,
Wan Li¹,
Junjie Lv¹ &
…
Lina Chen¹

Scientific Reports volume 14, Article number: 527 (2024)

Abstract

Lung adenocarcinoma (LUAD) is a malignant tumor with high lethality, and the aim of this study was to identify promising biomarkers for LUAD. Using the TCGA-LUAD dataset as a discovery cohort, we proposed a novel joint framework, VAEjMLP, based on a variational autoencoder (VAE) and a multilayer perceptron (MLP). The Shapley Additive Explanations (SHAP) method was introduced to evaluate the contribution of feature genes to the classification decision, which helped us to develop a biologically meaningful biomarker potential scoring algorithm. Nineteen potential biomarkers for LUAD were identified, which were involved in the regulation of immune and metabolic functions in LUAD. A prognostic risk model for LUAD was constructed from the biomarkers HLA-DRB1, SCGB1A1, and HLA-DRB5, which were screened by Cox regression analysis, and was used to divide the patients into high-risk and low-risk groups. The prognostic risk model was validated with external datasets. The low-risk group was characterized by enrichment of immune pathways and higher immune infiltration compared to the high-risk group. In contrast, the high-risk group showed increased metabolic pathway activity. There were significant differences between the high- and low-risk groups in metabolic reprogramming of aerobic glycolysis, amino acids, and lipids, as well as in angiogenic activity, epithelial-mesenchymal transition, tumorigenic cytokines, and inflammatory response. Furthermore, high-risk patients were predicted by the pRRophetic algorithm to be more sensitive to Afatinib, Gefitinib, and Gemcitabine. This study provides prognostic signatures capable of revealing the immune and metabolic landscapes of LUAD, and may shed light on the identification of biomarkers for other cancers.

Introduction

Cancer remains a major global public health problem¹. According to the latest global cancer statistics estimates, lung cancer remains the leading cause of cancer death and the second most commonly diagnosed cancer, accounting for approximately 20% of cancer-related mortality and 10% of incidence². Non-small cell lung cancer (NSCLC) accounts for approximately 85% of lung cancer cases³, and lung adenocarcinoma (LUAD) is its main histologic subtype, with an incidence that is still increasing⁴. Despite significant advances in diagnosis and treatment, the 5-year survival rate of LUAD is only 4–17%⁵. Therefore, screening for lung adenocarcinoma and improving its poor prognosis remain ongoing challenges. Accumulating evidence suggests that biomarker identification and application are of major importance for timely diagnosis and accurate prognosis of cancer^6,7.

Machine learning has been widely used in biomarker discovery studies for cancer^8,9,10. For example, based on breast cancer gene expression datasets, Zare et al.¹¹ identified 59 novel inflammatory breast cancer-specific gene signatures using a random forest approach. Zhang et al.¹² used a support vector machine algorithm to identify an m6A target miRNA diagnostic signature for cancer detection and successfully applied it to lung cancer. Machine learning algorithms have also been used to develop tumor-infiltrating immune cell associated RNA signatures that predict survival outcomes in LUAD patients¹³. Deep learning has gained increasing attention in research on cancer signature identification^14,15. The denoising autoencoder, an unsupervised deep learning algorithm, has been successfully applied to LUAD molecular signature mining¹⁶. Divate et al.¹⁷ presented a robust neural network model providing gene signatures for accurate classification of cancers based on gene expression data. Three feed-forward neural networks were applied to RNA-seq samples from 18 solid tumor types and recognized transcriptome signatures that were consistent across tumors¹⁸. Compared to common machine learning techniques, deep learning methods are capable of identifying highly complex patterns in large datasets, which helps to efficiently identify molecular signatures associated with cancer¹⁹.

Variational autoencoder (VAE) is a deep generative model based on variational Bayesian inference designed to learn nonlinear latent representations of high-dimensional data²⁰. VAE has shown encouraging results in capturing biologically meaningful low-dimensional representations of multi-omics data. Daniel et al.²¹ utilized VAE to learn a generalized latent representation of large-scale metabolomics data, and VAE representation outperformed both linear and nonlinear principal component analysis. VAE models trained on gene expression data have good ability to identify generalizable biological representations²². In addition, VAE has been applied in multi-omics representation learning such as proteomics and epigenomics^23,24. Evidently, VAE is a powerful method for dimensionality reduction. Neural networks are commonly employed in the current deep learning field, among which multilayer perceptron (MLP) is widely used in cancer diagnosis research. Lorencin et al.²⁵ combined MLP and Laplacian edge detector to achieve bladder cancer detection. Deep learning methods including MLP have revealed salivary glycopatterns as biomarkers for the diagnosis of papillary thyroid cancer²⁶. MLP has also been applied to screening for breast, colorectal, and prostate cancers^27,28,29. Shapley Additive Explanations (SHAP) is derived from the Shapley value in cooperative game theory, which aims to tackle the problem of lack of interpretability faced by machine learning³⁰. In a study by Chakraborty et al.³¹, SHAP facilitated the identification of prognostic factors for breast cancer by enhancing the interpretability of the extreme gradient boosting model. Multiple studies have utilized SHAP to interpret the output of machine learning models to screen out important features^32,33,34. SHAP is considered a state-of-the-art machine learning interpreter³⁵. The magnitude and direction of the influence of features on the output variables can be assessed by calculating SHAP values.

Here, we developed a LUAD biomarker potential scoring algorithm based on a novel deep learning framework, variational autoencoder joint multilayer perceptron (VAEjMLP), which combined the feature dimensionality reduction and classification prediction tasks and efficiently evaluated the importance of each feature gene by applying SHAP. This work successfully identified 19 LUAD biomarkers (LABs) and constructed a LUAD prognostic risk model. The functions mediated by the biomarkers in LUAD were investigated. The metabolic and immune landscapes of the prognostic risk model and its associations with angiogenesis, epithelial-mesenchymal transition (EMT), tumorigenic cytokines, and inflammation were explored. Our study provided reliable biomarkers for LUAD and may help to more accurately determine the survival of LUAD patients.

Materials and methods

LUAD data sources and preprocessing

The RNA-seq expression data and clinical information for LUAD were retrieved from The Cancer Genome Atlas (TCGA) (https://tcga-data.nci.nih.gov/tcga/) data portal. In constructing the prognostic risk model, the samples were filtered by clinical information including age, sex, survival time, overall survival (OS) status, pathologic T, N, M, stage, and history of previous cancer diagnosis, and a total of 482 patients with LUAD were included in the study. The GSE72094 derived from The Gene Expression Omnibus (GEO)³⁶ (http://www.ncbi.nlm.nih.gov/geo) database was used as a validation set for the prognostic risk model, in which 393 LUAD samples with complete clinical information were included in this study. Single-cell RNA sequencing (scRNA-seq) data for 11 cases of distal normal lung tissue and 15 cases of primary LUAD were acquired from the GEO database, with accession number GSE131907. The GSE131907 dataset was used to analyze the expression of prognostic risk factors at the single-cell transcriptome level.

Methods

Identifying reliable biomarkers is crucial for improving the prognosis of LUAD. Deep learning has been widely used in the exploration of biological problems. In this study, we proposed a novel deep learning framework based on VAE, MLP, and SHAP to identify biomarkers and construct a prognostic risk model for LUAD (the workflow of our investigation is shown in Fig. 1).

LUAD biomarker identification framework

To mine biomarkers for LUAD, it is first necessary to identify genes that are highly altered between cancer and normal states. Based on the TCGA LUAD expression data consisting of read counts, the differentially expressed genes between normal and cancer samples were identified using the negative binomial distribution model of the R package “edgeR” (version 3.40.2), with thresholds of |log2(fold change)| > 1 and p-value < 0.05³⁷. To identify more stable differentially expressed genes, normal and cancer samples were randomly sampled in a 1:6 ratio 1000 times, with all normal samples included in each sampling. Differential expression analysis was conducted on each of the 1000 sets of samples, and the number of sets in which each gene was differentially expressed (its frequency) was counted. Three feature sets were then used in turn as input features: all differentially expressed genes, genes with a frequency not less than 950, and genes with a frequency not less than 750. The joint framework VAEjMLP was developed by combining the VAE model with an MLP to perform the feature dimensionality reduction and classification prediction tasks simultaneously, allowing efficient representation learning with the help of the classification task. A VAE consists of an encoder and a decoder and is a generative model well suited to learning feature representations. In this study, the encoder of the VAE is an inference model modelled by a neural network with the variational parameter φ as a learnable parameter. It defines an approximate posterior distribution, which is further assumed to be Gaussian:

$$\begin{array}{c}z=encoder\left(x\right) \sim {q}_{\varphi }\left(z|x\right)\end{array}$$

(1)

where $z$ is the latent variable and $x$ is the original input. The encoder outputs the mean, $\mu$, and standard deviation, $\sigma$, of the latent distribution, allowing for sampling from the latent variable space. Since this sampling operation is non-differentiable, sampling was performed by re-parameterization to make the model trainable:

$$\begin{array}{c}z=\mu +\sigma \epsilon \end{array}$$

(2)

where $\epsilon \sim N\left(\mathrm{0,1}\right)$. The decoder, also modelled by a neural network, uses $z$ as input to reconstruct the original data and generate ${x}^{\prime}$:

$$\begin{array}{c}{x}^{\prime}=decoder\left(z\right)\sim {p}_{\theta }\left(x|z\right)\end{array}$$

(3)

where $\theta$ is the learnable parameter of the decoder and ${p}_{\theta }\left(x|z\right)$ follows a Bernoulli distribution. The objective function of the VAE is given by:

$$\begin{array}{c}{L}_{VAE}=BCE\left(x,{x}^{\prime}\right)+{D}_{KL}\left(N\left(\mu ,\sigma \right)\Vert N\left(\mathrm{0,1}\right)\right)\end{array}$$

(4)

where BCE is the binary cross entropy used to compute the distance between the original input $x$ and the reconstructed data ${x}^{\prime}$, and ${D}_{KL}$ is the Kullback-Leibler divergence between the learned distribution and the standard Gaussian distribution. The VAE maps the original input to a low-dimensional latent space, which achieves feature dimensionality reduction. The latent space representation was then used as the input to the MLP to perform the classification task. The MLP adopted the cross entropy as its loss function, which can be expressed as:

$$\begin{array}{c}{L}_{MLP}=CE\left(y,{y}^{\prime}\right)\end{array}$$

(5)

where $y$ is the true label and ${y}^{\prime}$ is the predicted label. Ultimately, the loss function of the VAEjMLP can be expressed as:

$$\begin{array}{c}{L}_{VAE-MLP}=\lambda {L}_{VAE}+{L}_{MLP}\end{array}$$

(6)

where $\lambda$ is the hyperparameter that balances the training weights of the VAE and the MLP and was set to 0.001. The training objective of VAEjMLP is to minimize the loss function ${L}_{VAE-MLP}$. To train VAEjMLP, we randomly split the samples into an 80% training set and a 20% test set. The trained VAEjMLP was then passed to SHAP to quantify the contribution of each feature to the prediction results. SHAP interprets the predicted value ${y}^{\left(i\right)}$ of sample ${x}^{\left(i\right)}$ as the sum of the attributions of each feature in the sample, where the attributions are the SHAP values, which satisfy the following formula:

$$\begin{array}{c}{y}^{\left(i\right)}={y}^{base}+\sum_{j=1}^{N}f\left({x}_{j}^{\left(i\right)}\right)\end{array}$$

(7)

where ${y}^{base}$ is the model's baseline, which is the mean prediction over all training samples, $N$ is the number of feature genes, and $f\left({x}_{j}^{\left(i\right)}\right)$ is the SHAP value of the j-th feature of the sample ${x}^{\left(i\right)}$. A value $f\left({x}_{j}^{\left(i\right)}\right)>0$ indicates that the feature contributes positively to the prediction, and a value below 0 indicates that it contributes negatively. SHAP outputs the contribution rate of each feature gene in each sample. Based on the positive or negative impact of feature genes on the predicted output, this study proposed a biomarker potential scoring algorithm to assess the potential of the feature genes as markers for LUAD. First, the distribution of the contribution rates of all feature genes was examined, and a threshold for significant contribution was determined from this distribution. In this study, a positive contribution rate indicates that the feature gene pushes the prediction towards the cancer class, whereas a negative contribution rate pushes it towards the normal class. For each feature gene $g$ with significant contribution, ${C}_{P}$ is the number of cancer samples in which it makes a significant positive contribution and ${C}_{N}$ is the number of normal samples in which it makes a significant negative contribution. With ${N}_{tumor}$ cancer samples and ${N}_{normal}$ normal samples, the proportion of cancer samples in which feature gene $g$ makes a significant positive contribution is ${P}_{tumor}=\frac{{C}_{P}}{{N}_{tumor}}$, and the proportion of normal samples in which it makes a significant negative contribution is ${P}_{normal}=\frac{{C}_{N}}{{N}_{normal}}$. The biomarker potential score ${S}_{core}\left(g\right)$ for feature gene $g$ is then calculated as follows,

$$\begin{array}{c}{S}_{core}\left(g\right)=\frac{2\times {P}_{tumor}\times {P}_{normal}}{{P}_{tumor}+{P}_{normal}}\end{array}$$

(8)
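Eq. 8 is the harmonic mean of the two proportions, so a gene scores highly only when it contributes significantly in both cancer and normal samples. A minimal sketch in plain Python is given below; the function name, the zero-score convention when both proportions vanish, and the example counts are illustrative assumptions, not part of the published implementation.

```python
def biomarker_potential_score(c_p, c_n, n_tumor, n_normal):
    """Eq. 8: harmonic mean of P_tumor and P_normal for one feature gene."""
    p_tumor = c_p / n_tumor    # share of cancer samples with a significant positive contribution
    p_normal = c_n / n_normal  # share of normal samples with a significant negative contribution
    if p_tumor + p_normal == 0:
        return 0.0             # assumption: the score is taken as 0 when both proportions are 0
    return 2 * p_tumor * p_normal / (p_tumor + p_normal)


# Hypothetical counts for one gene in a sample set of 354 cancer and 59 normal samples
score = biomarker_potential_score(c_p=177, c_n=59, n_tumor=354, n_normal=59)
print(round(score, 4))
```

With these hypothetical counts, P_tumor = 0.5 and P_normal = 1.0, so the score is pulled towards the weaker of the two proportions, unlike an arithmetic mean.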

The biomarker potential scores of all feature genes with significant contributions were calculated, and the 25th percentile of all scores in descending order was used as the lower bound to screen for potential LABs. The framework for identifying LABs was implemented in Python using the PyTorch and SHAP software packages^38,39.
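To make the joint objective concrete, the re-parameterization of Eq. 2 and the losses of Eqs. 4 to 6 can be sketched for a single sample in plain Python. This is an illustrative sketch rather than the PyTorch implementation used in the study: the function names are hypothetical, $\sigma$ is treated as a standard deviation, the losses are summed (not averaged) over features, and the closed-form Gaussian KL divergence is assumed.

```python
import math
import random


def reparameterize(mu, sigma, rng):
    """Eq. 2: z = mu + sigma * eps with eps ~ N(0, 1), element-wise."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def bce(x, x_rec, tiny=1e-12):
    """Binary cross entropy summed over features; values must lie in [0, 1]."""
    return -sum(a * math.log(b + tiny) + (1 - a) * math.log(1 - b + tiny)
                for a, b in zip(x, x_rec))


def kl_to_standard_normal(mu, sigma):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return sum(0.5 * (m * m + s * s - 1.0) - math.log(s) for m, s in zip(mu, sigma))


def cross_entropy(y_true, probs, tiny=1e-12):
    """Eq. 5 for one sample: negative log-probability of the true class."""
    return -math.log(probs[y_true] + tiny)


def joint_loss(x, x_rec, mu, sigma, y_true, probs, lam=0.001):
    """Eq. 6: lambda * L_VAE + L_MLP, with L_VAE taken from Eq. 4."""
    l_vae = bce(x, x_rec) + kl_to_standard_normal(mu, sigma)
    return lam * l_vae + cross_entropy(y_true, probs)


# Toy values (hypothetical): two features, two latent dimensions, two classes
rng = random.Random(0)
mu, sigma = [0.0, 0.0], [1.0, 1.0]
z = reparameterize(mu, sigma, rng)
loss = joint_loss(x=[1.0, 0.0], x_rec=[0.9, 0.2], mu=mu, sigma=sigma,
                  y_true=1, probs=[0.1, 0.9])
```

With $\lambda$ = 0.001, the classification term dominates the joint loss, so the reconstruction and KL terms act mainly as a regularizer on the latent representation.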

Enrichment analysis of LABs

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed using the R package “clusterProfiler”⁴⁰. Biological pathways of target mitochondrial genes were annotated using the MitoCarta 3.0 (http://www.broadinstitute.org/mitocarta)⁴¹ database. Gene set variation analysis (GSVA) is an unsupervised method that calculates enrichment scores for specific gene sets in each sample⁴². We used GSVA to assess and compare differences in enrichment of functional gene sets across samples, with reference gene sets from the KEGG database resource (http://www.genome.jp/kegg/)⁴³. Gene set enrichment analysis (GSEA) is used to assess the distribution trend of genes in a predefined gene set in a gene list ranked by phenotypic relatedness to determine their contribution to the phenotype⁴⁴. In this study, GSEA was applied to identify biological functions associated with LUAD. Results with |NES| > 1 (NES, normalized enrichment score), permutation test p-value < 0.05, and false discovery rate (FDR) < 0.25 were considered statistically significant.
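The three GSEA significance criteria act jointly, which can be written as a simple filter. The sketch below is illustrative only: the function name and the result records are hypothetical placeholders, not output of the study.

```python
def is_significant_gsea(nes, p_value, fdr):
    """Apply the thresholds stated above: |NES| > 1, p-value < 0.05 and FDR < 0.25."""
    return abs(nes) > 1 and p_value < 0.05 and fdr < 0.25


# Hypothetical GSEA result records (placeholders, not results from the paper)
results = [
    {"gene_set": "set_A", "nes": -1.8, "p": 0.004, "fdr": 0.03},
    {"gene_set": "set_B", "nes": 0.9, "p": 0.010, "fdr": 0.10},   # fails |NES| > 1
    {"gene_set": "set_C", "nes": 1.4, "p": 0.030, "fdr": 0.40},   # fails FDR < 0.25
]
significant = [r["gene_set"] for r in results
               if is_significant_gsea(r["nes"], r["p"], r["fdr"])]
print(significant)
```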

Constructing a prognostic risk model for LUAD

Univariate Cox regression analysis was performed on each LAB to screen for genes significantly associated with OS in LUAD. The screened genes were then further filtered by biological function, and the regression coefficients of the retained genes were calculated using multivariate Cox regression analysis. A prognostic risk score was generated for each patient using the following formula,

$$\begin{array}{c}RiskScore=\sum_{i=1}^{M}Coe{f}_{Gene\left(i\right)}\times Ex{p}_{Gene\left(i\right)}\end{array}$$

(9)

where $M$ represents the number of genes in the model, $Coe{f}_{Gene\left(i\right)}$ represents the regression coefficient of $Gene\left(i\right)$ and $Ex{p}_{Gene\left(i\right)}$ represents the expression level of $Gene\left(i\right)$. The samples were divided into high-risk and low-risk groups based on the median risk score.
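As a worked illustration of Eq. 9 and the median split, the sketch below uses the three regression coefficients reported for the final model in the Results. The patient identifiers and expression values are hypothetical, and assigning scores equal to the median to the low-risk group is an assumption.

```python
from statistics import median

# Regression coefficients of the final model, as reported in the Results
COEF = {"HLA-DRB1": -7.36e-05, "SCGB1A1": -1.51e-03, "HLA-DRB5": -7.43e-04}


def risk_score(expression, coef=COEF):
    """Eq. 9: sum over model genes of coefficient x expression level."""
    return sum(coef[gene] * expression[gene] for gene in coef)


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score."""
    cutoff = median(scores.values())
    # Assumption: a score exactly equal to the median is assigned to the low-risk group
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression levels for four patients
patients = {
    "P1": {"HLA-DRB1": 1000, "SCGB1A1": 100, "HLA-DRB5": 200},
    "P2": {"HLA-DRB1": 5000, "SCGB1A1": 800, "HLA-DRB5": 900},
    "P3": {"HLA-DRB1": 200, "SCGB1A1": 10, "HLA-DRB5": 50},
    "P4": {"HLA-DRB1": 3000, "SCGB1A1": 400, "HLA-DRB5": 600},
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(groups)
```

Because all three coefficients are negative, the hypothetical patients with lower expression of these genes (P1 and P3) receive the higher risk scores.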

Enrichment analysis of the high- and low-risk groups

GO enrichment analysis was performed for differentially expressed genes between the high- and low-risk groups. Functional differences between the high- and low-risk groups were further analyzed by calculating enrichment scores for the hallmark and immunologic gene sets between the two groups using the GSVA method. The hallmark gene sets and immunologic signature gene sets were obtained from “h.all.v2023.1” and “c7.immunesigdb.v2023.1” of The Molecular Signatures Database (MSigDB) (https://www.gsea-msigdb.org/gsea/msigdb)⁴⁴, respectively.

Characterization of immune, metabolic, and other tumor-related factors

The relative infiltration abundance of 22 immune cell types in each sample was estimated based on gene expression data using the CIBERSORTx⁴⁵ online tool. The R package “estimate” was used to calculate stromal and immune scores, which estimate the abundance of stromal cells and the level of immune cell infiltration in the tumor microenvironment (TME) of each sample, and to combine the two scores to infer tumor purity⁴⁶. Metabolic reprogramming is recognized as one of the key features of malignant tumors, with reprogramming of glycolysis, amino acid metabolism and lipid metabolism meeting the tremendous energy demand of cancer cell proliferation⁴⁷. Angiogenesis, EMT, tumorigenic cytokines and inflammation play crucial roles in tumor growth and progression^48,49,50,51. Angiogenesis, EMT and tumorigenic cytokines-related genes were collected from the published literature of Qiu et al.⁵². Genes related to aerobic glycolysis, metabolism of glutamine, serine, glycine, arginine, methionine, tryptophan, fatty acids and sphingolipids, and inflammatory response were downloaded from the MSigDB (https://www.gsea-msigdb.org/gsea/msigdb) database. Reference collections include “h.all.v2023.1”, “c2.cp.wikipathways.v2023.1”, “c2.cp.reactome.v2023.1”, “c2.cp.kegg.v2023.1” and “c5.go.bp.v2023.1”. Metabolic reprogramming, angiogenic activity, EMT, tumorigenic cytokines and inflammatory response scores were calculated for each sample by single-sample gene set enrichment analysis (ssGSEA) using the R package “GSVA”⁴².

Prediction of immunotherapy response

The R package “EaSIeR” was used to predict the likelihood of patient response to immune checkpoint inhibitors (ICI). “EaSIeR” employs five types of features describing the immune TME to construct a model and uses transcriptome-based immune response scores (cytolytic activity (CYT), Roh immune score (Roh_IS), chemokines, Davoli immune signature (Davoli_IS), IFNy signature (IFNy), Expanded immune signature (Ayers_expIS), T-cell inflamed signature (Tcell_inflamed), Repressed immune resistance signature (RIR), Tertiary lymphoid structures signature (TLS)) as learning targets to predict patients' ICI response⁵³. Higher scores indicate a higher likelihood that a patient will respond to ICI therapy.

Analysis of prognostic risk factors

The expression of prognostic risk factors and their correlations with immune-related genes and immune checkpoints were analyzed. Immune-related genes were downloaded from the TISIDB (http://cis.Hku.hk/TISIDB/) database⁵⁴ and 79 immune checkpoints were from Hu et al.⁵⁵.

Statistical analysis

All statistical analyses were performed in R (version 4.2.1). The Kolmogorov-Smirnov test was used to test normality. A two-tailed Student's t-test was used to estimate the difference between two groups when the data followed a normal distribution; otherwise, the Wilcoxon rank-sum test was performed. Spearman's rank correlation analysis was employed to explore the correlation between variables. Unless otherwise stated, p < 0.05 was considered statistically significant. Unless specifically labelled, statistical significance was marked using the following convention: “ns”: p > 0.05; “*”: p <= 0.05; “**”: p <= 0.01; “***”: p <= 0.001; “****”: p <= 0.0001.
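The significance marking convention maps each p-value to the most stringent threshold it satisfies. A minimal sketch is shown below; the function name is hypothetical, and the thresholds are those listed in the convention above.

```python
def significance_symbol(p):
    """Map a p-value to its significance marker, testing the strictest threshold first."""
    if p <= 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


for p in (0.2, 0.05, 0.0024, 0.00055):
    print(p, significance_symbol(p))
```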

Results

Potential biomarkers for LUAD

The TCGA database contains 539 cancer samples and 59 normal samples corresponding to LUAD. To identify stable differentially expressed genes, 1000 random samplings were performed between normal and cancer samples in a ratio of 1:6 (59:354). A total of 8836 differentially expressed genes were screened in the 1000 sample sets. Among them, 80.24% of the genes were differentially expressed in all 1000 sample sets, 7471 genes had a frequency not less than 950, and 7723 genes had a frequency not less than 750 (Fig. 2A). For each of the 1000 sample sets, the three groups of differentially expressed genes (all genes and the genes passing each of the two frequency thresholds) were used in turn as feature inputs for VAEjMLP. VAEjMLP used a VAE, whose encoder and decoder were each modelled by a neural network, to achieve feature dimensionality reduction, and fed the reduced features into an MLP for classification prediction. In VAEjMLP, the encoder of the VAE consisted of four hidden layers, with 1024 neurons in each of hidden layers 1 and 2, 512 in hidden layer 3, and 128 in hidden layer 4, and the decoder mirrored the structure of the encoder (Fig. 2B). The 128-dimensional latent vector obtained from the VAE was passed into a three-layer MLP with 128 neurons in the first layer, 64 in the second layer, and the third layer serving as the output layer (Fig. 2B). Both the VAE and the MLP used the Leaky ReLU activation function. The number of epochs was set to 250, with the first 50 epochs for unsupervised training of the VAE, epochs 51–100 for classification training of the MLP, and epochs 101–250 for joint training of VAEjMLP. The area under the curve (AUC) of the trained VAEjMLP model exceeded 0.999, and the accuracy, precision, and recall exceeded 0.99, 0.955, and 0.994, respectively.
For each set of input samples, three groups of significant contributing genes were screened using ±5%, ±1%, and ±0.5% of the distribution formed by all the contribution values of the SHAP output as thresholds, respectively. Using the biomarker potential scoring algorithm, each significant contributing gene received a score in each set of samples. The quartiles of the descending biomarker potential scores of the three groups of significantly contributing genes were used as lower bounds to screen the genes, respectively, and the intersection was taken as the candidate set. When all differentially expressed genes, genes with a frequency not less than 750, and genes with a frequency not less than 950 were used as features, the number of candidate genes identified in the 1000 sample sets was 127–183, 109–137, and 92–132, respectively (Fig. 2C). Eventually, the 19 genes shared by all candidate sets were identified as potential LABs (Fig. 2D).
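The resampling scheme used at the start of this subsection (all 59 normal samples plus 354 randomly drawn cancer samples, repeated 1000 times, followed by frequency thresholds of 950 and 750) can be sketched as follows. This is an illustrative sketch only: the differential expression step is replaced by a hypothetical stand-in function, since the study used edgeR for that step, and sampling without replacement is an assumption.

```python
import random
from collections import Counter


def resample_de_frequency(tumor_ids, normal_ids, call_de_genes,
                          n_rounds=1000, ratio=6, seed=0):
    """Count in how many resampled sets each gene is called differentially expressed."""
    rng = random.Random(seed)
    freq = Counter()
    n_tumor = ratio * len(normal_ids)  # 1:6 normal-to-cancer ratio, e.g. 59:354
    for _ in range(n_rounds):
        # All normal samples are kept; cancer samples are drawn without replacement
        tumor_subset = rng.sample(tumor_ids, n_tumor)
        freq.update(set(call_de_genes(tumor_subset, normal_ids)))
    return freq


def stable_genes(freq, min_count):
    """Genes whose differential expression frequency is not less than min_count."""
    return {gene for gene, count in freq.items() if count >= min_count}


def toy_call_de_genes(tumor_subset, normal_ids):
    """Hypothetical stand-in for a real differential expression analysis."""
    called = ["GENE_A"]              # called in every sample set
    if "T0" in tumor_subset:         # called only when one particular sample is drawn
        called.append("GENE_B")
    return called


tumor_ids = [f"T{i}" for i in range(539)]
normal_ids = [f"N{i}" for i in range(59)]
freq = resample_de_frequency(tumor_ids, normal_ids, toy_call_de_genes)
print(freq["GENE_A"], sorted(stable_genes(freq, 950)))
```

In this toy setting the sample-dependent gene is called in only a fraction of the sets and is removed by the frequency threshold, which is the behaviour the resampling step is designed to achieve.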

Association analysis of LABs with LUAD

LABs consist of 15 nuclear and 4 mitochondrial genes. The biological functions of nuclear genes were explored using GO and KEGG enrichment analysis. The results demonstrated that nuclear genes were mainly enriched in immune pathways, such as MHC protein complex assembly, Th1 and Th2 cell differentiation, antigen processing and presentation, regulation of T cell activation, etc. (Fig. 3A,B). In addition, four mitochondrial genes (MT-ND6, MT-ND5, MT-CO3, MT-CO2) were all involved in driving mitochondrial oxidative phosphorylation (OXPHOS) (Fig. 3C). The OXPHOS system serves as the center of cellular metabolism and is critical for energy production in eukaryotic cells⁵⁶. Besides, the GSVA results showed that immune pathways, such as platelet activation, leukocyte transendothelial migration, natural killer cell mediated cytotoxicity, and Th1 and Th2 cell differentiation, were mostly enriched in the normal group as compared to the LUAD group (Fig. 3D). Simultaneously, the normal and LUAD groups presented significant differences in the enrichment of metabolic pathways (e.g., fructose and mannose metabolism, arginine and proline metabolism, and cholesterol metabolism) (Fig. 3E). The GSEA was also employed to evaluate the signaling pathways involved in nuclear genes. The results indicated that nuclear genes of LABs were negatively linked to immune pathways (Th17 cell differentiation, Th1 and Th2 cell differentiation, Phagosome, intestinal immune network for IgA production, and hematopoietic cell lineage) in LUAD (Fig. 3F–J). These results illustrated that LABs were implicated in the regulation of immune and metabolic functions in LUAD.

Construction and validation of a prognostic risk model

The univariate Cox regression analysis showed that HLA-DRB1, SCGB1A1, SFTPB, RNASE1, SLC34A2, and HLA-DRB5 among the LABs were significantly associated with LUAD survival (Fig. 4A). Further analysis was conducted using survival-related biomarker sets that were enriched in the same GO entries. There were four sets of genes enriched in the same GO entries (Fig. 4B): HLA-DRB1/HLA-DRB5, HLA-DRB1/SCGB1A1/HLA-DRB5, HLA-DRB1/SCGB1A1, and HLA-DRB1/SFTPB/HLA-DRB5. The Kaplan-Meier survival curve analysis for all four gene sets showed significant associations with LUAD survival. More specifically, the p-values for the HLA-DRB1/HLA-DRB5 (Fig. 4C), HLA-DRB1/SCGB1A1 (Fig. 4D), and HLA-DRB1/SFTPB/HLA-DRB5 (Fig. 4E) sets were 0.01, 0.0024, and 0.0014, respectively. The HLA-DRB1/SCGB1A1/HLA-DRB5 set presented the most significant difference in survival rates (p-value = 0.00055) (Fig. 4F). Ultimately, the set HLA-DRB1/SCGB1A1/HLA-DRB5, which was most strongly associated with LUAD survival, was selected to construct the LUAD prognostic risk model, and risk scores for patients were calculated. The regression coefficients for HLA-DRB1, SCGB1A1, and HLA-DRB5 in the LUAD prognostic risk model were −7.36e−05, −1.51e−03, and −7.43e−04, respectively. Patients with higher risk scores had poorer OS (Fig. 4F). The univariate Cox regression analysis suggested that pathological T, N, M and stage were significantly associated with survival in LUAD patients (Fig. 4G). A multivariate Cox regression analysis was performed to explore the predictive independence of the prognostic risk model by combining these survival-related clinicopathological factors with the risk score. The results demonstrated that the risk score (HR = 2.33, 95% CI = 1.38–3.9, P = 1e−3) served as an independent risk factor for OS in patients with LUAD (Fig. 4H). The GSE72094 dataset derived from the GEO database validated the accuracy of the prognostic risk model (Fig. 4I,J).
The association of the risk score with survival status and prognostic risk factors was further explored (Fig. 4K). Overall, a higher risk score was associated with a higher number of deaths, indicating a poorer prognosis and higher risk of death in the high-risk group. Compared to the low-risk group, the prognostic risk factors SCGB1A1, HLA-DRB1, HLA-DRB5 were all down-regulated in the high-risk group.

Enrichment analysis of the prognostic risk model

Genes differentially expressed in the high- and low-risk groups were identified, and 459 up-regulated genes and 352 down-regulated genes were recognized in the high-risk group (Fig. 5A). The up-regulated genes were mainly enriched in various metabolic pathways, secretion and transportation of substances, and pathways related to development (Fig. 5B). The down-regulated genes were predominantly enriched in various immune-related pathways (Fig. 5C). To further explore the differences in biological functions of different risk groups, we conducted GSVA analysis. Among the 50 cancer hallmark gene sets, significant enrichment differences were observed in 37 gene sets between the high and low-risk groups (Fig. 5D). Compared with the low-risk group, the high-risk group was mainly accompanied by the enrichment of cell proliferation pathways (such as E2F targets, G2M checkpoint, MYC targets, etc.) and metabolic pathways (such as MTORC1 signaling, glycolysis, oxidative phosphorylation, etc.) (Fig. 5D). Also of concern was the significant down-regulation of the activity of immune-related pathways in the high-risk group compared to the low-risk group, such as IL2/STAT5 signaling, IL6/JAK/STAT3 signaling, inflammatory response, and other pathways (Fig. 5D). Both GO and GSVA analyses illustrated that metabolism-related pathways were primarily enriched in the high-risk group and immune-related pathways were mostly enriched in the low-risk group. The characteristics of unlimited tumor proliferation often require alterations in energy metabolism⁵⁷, and studies have indicated that increased expression of glycolytic enzymes correlates with poor prognosis in lung cancer patients^58,59. The immune system plays a dual role of both promotion and inhibition of cancer development⁶⁰, and the signaling pathways IL2/STAT5 and IL6/JAK/STAT3 were closely associated with the prognosis of pancreatic ductal adenocarcinoma and olfactory neuroblastoma^61,62. 
The immunologic signature gene sets derived from the MSigDB database represent signatures of immune states and perturbations. Therefore, we used GSVA analysis to compare the differences in the enrichment of immunologic signature gene sets between the high- and low-risk groups. Among the 4,872 immunologic signature gene sets, there were differences in enrichment in 3,574 gene sets between the two risk groups. Specifically, the high-risk group exhibited significantly lower enrichment scores in 2,585 immunologic signature gene sets compared to the low-risk group. Moreover, the top 100 gene sets with the most significant differences all showed down-regulation of activity in the high-risk group (Fig. 5E presents the top 20 gene sets with the most significant differences).

Immune landscapes for the prognostic risk model

The high-risk group showed significant down-regulation of immune pathway activity compared to the low-risk group. Therefore, the CIBERSORT algorithm was employed to estimate immune infiltration in the high-risk and low-risk groups, while the ESTIMATE algorithm was used to assess the immune TME in both groups. The infiltration levels of 22 immune cell types in LUAD patients were estimated using CIBERSORT. The infiltration proportions of 13 immune cell types differed significantly between the two risk groups. The infiltration proportions of B cells naive, eosinophils, macrophages M0, mast cells activated, plasma cells, and T cells CD4 memory activated were higher in the high-risk group than in the low-risk group (Fig. 6A). The risk score was significantly and positively correlated with the infiltration level of these six immune cell types (Fig. 6B). The infiltration levels of B cells memory, dendritic cells resting, macrophages M2, mast cells resting, monocytes, T cells CD4 memory resting, and T cells regulatory (Tregs) were higher in the low-risk group (Fig. 6A). The infiltration proportions of these seven immune cell types were significantly and negatively correlated with the risk score (Fig. 6B). Several LUAD survival studies have reported higher infiltration of macrophages M0, mast cells activated and T cells CD4 memory activated in high-risk groups, and higher infiltration of B cells memory, dendritic cells resting, mast cells resting, monocytes, T cells CD4 memory resting and T cells regulatory (Tregs) in low-risk groups^63,64,65. This aligns with the findings of our study. ESTIMATE was then used to compute, for the different risk groups, stromal scores (stromal cell abundance), immune scores (extent of immune cell infiltration), and tumor purity in the TME.
The results indicated that patients in the high-risk group had higher tumor purity than those in the low-risk group (p < 2.2e−16) (Fig. 6C). The risk score was positively correlated with tumor purity in LUAD patients (R = 0.55, p < 2.2e−16) (Fig. 6D). Compared to the low-risk group, patients in the high-risk group had significantly lower stromal scores (p = 3.2e−16), immune scores (p < 2.2e−16), and ESTIMATE scores (p < 2.2e−16) (Fig. 6E,G,I). The risk score showed negative correlations with the stromal score (R = −0.43, p < 2.2e−16), the immune score (R = −0.59, p < 2.2e−16), and the ESTIMATE score (R = −0.55, p < 2.2e−16) (Fig. 6F,H,J). Taken together, these results suggest that the immune response in the high-risk group may be weaker than that in the low-risk group. Multiple studies that constructed survival models for LUAD have consistently found that patients with high risk scores exhibit lower immune scores, stromal scores, and ESTIMATE scores, together with higher tumor purity^64,66,67,68. This supports our findings.
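
The correlations reported above relate a per-patient risk score to a per-patient ESTIMATE-derived score. As a minimal illustration of how such a coefficient can be obtained, the sketch below computes a Spearman rank correlation using only the Python standard library. The choice of Spearman rather than Pearson is an assumption (the correlation method is not stated in this section), and the patient values are made up for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values for six hypothetical patients (not data from the study).
risk_scores = [0.2, 0.5, 0.9, 1.4, 1.8, 2.3]
immune_scores = [2100.0, 1800.0, 1900.0, 1200.0, 900.0, 400.0]
rho = spearman(risk_scores, immune_scores)
print(f"Spearman R = {rho:.2f}")
```

A negative coefficient here mirrors the direction reported for the immune score. The sketch omits the p-value, which a statistics package would normally supply.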

Metabolism landscapes and tumor-related factors analysis for the prognostic risk model

Cancer cells reprogram their metabolic pathways to meet the needs of tumor initiation and progression^69,70. GO and GSVA analyses revealed significant metabolic differences between the high- and low-risk groups. To investigate these differences further, we calculated the reprogramming scores of glycolysis, amino acid metabolism, and lipid metabolism in LUAD patients. Aerobic glycolysis is a key metabolic hallmark of cancer, promoting the survival and proliferation of cancer cells. The preference of cancer cells for glycolysis is often associated with poorer clinical outcomes⁷¹. The risk score was positively correlated with the aerobic glycolysis reprogramming score (R = 0.29, p = 1.1e−10) (Fig. 7A). Patients in the high-risk group had significantly higher aerobic glycolysis reprogramming scores (p = 2.9e−06) than those in the low-risk group (Fig. 7A). Amino acids are pivotal nutrients supporting cancer cell growth, and the metabolism of glutamine, methionine, serine, glycine, arginine, and tryptophan is deregulated in many cancers^72,73,74,75,76. The risk score was positively correlated with the metabolic reprogramming scores of glutamine (R = 0.58, p < 2.2e−16) and methionine (R = 0.15, p = 1.2e−3) (Fig. 7B,C). The metabolic reprogramming scores of glutamine (p < 2.2e−16) and methionine (p = 2.4e−2) were both higher in the high-risk group than in the low-risk group (Fig. 7B,C). Studies have shown that the metabolism of glutamine and methionine not only drives tumor cell growth, but also creates a TME that favors tumor immune escape^77,78,79. Meanwhile, significant positive correlations were observed between the risk score and the metabolic reprogramming scores of serine (R = 0.37, p < 2.2e−16) and glycine (R = 0.39, p < 2.2e−16) (Fig. 7D,E). The high-risk group presented higher metabolic reprogramming scores of serine (p = 3e−09) and glycine (p = 3.6e−13) (Fig. 7D,E).
The upregulation of serine/glycine metabolism is associated with poor clinical outcomes in several cancers⁸⁰. In contrast, the patients' metabolic reprogramming scores for arginine (R = −0.32, p = 8e−13) and tryptophan (R = −0.24, p = 5.8e−08) were negatively correlated with the risk score (Fig. 7F,G). The metabolic reprogramming scores of arginine (p = 3.2e−08) and tryptophan (p = 1.3e−4) were lower in the high-risk group (Fig. 7F,G). Lipids are central players in cancer biology⁸¹, and metabolic imbalances in fatty acids and sphingolipids play a crucial role in cancer development^82,83. There were significant negative correlations between the risk score and the metabolic reprogramming scores of fatty acids (R = −0.15, p = 8.2e−4) and sphingolipids (R = −0.28, p = 2.8e−10) (Fig. 7H,I). Patients with higher metabolic reprogramming scores for fatty acids (p = 5e−2) and sphingolipids (p = 8.9e−07) tended to have lower risk scores (Fig. 7H,I).

In addition, angiogenesis, EMT, tumorigenic cytokines, and inflammation play important and complex roles in the progression of malignant tumors. Therefore, the angiogenic activity score, EMT score, tumorigenic cytokine score, and inflammatory response score were each evaluated for the prognostic risk model. The angiogenesis activity score in the high-risk group was significantly higher than that in the low-risk group (p = 2.9e−08) (Fig. 7J). The risk score was significantly and positively correlated with the angiogenesis activity score (R = 0.32, p = 9.5e−13) (Fig. 7J). Angiogenesis is associated with poor prognosis in several cancers, including NSCLC, and anti-angiogenic drugs are being developed^84,85. Patients in the high-risk group had significantly lower EMT scores (p = 2e−05), tumorigenic cytokine scores (p = 9.4e−08) and inflammatory response scores (p < 2.2e−16) than those in the low-risk group (Fig. 7K–M). The risk score was negatively correlated with the EMT score (R = −0.2, p = 6.5e−06), tumorigenic cytokine score (R = −0.3, p = 3.9e−11), and inflammatory response score (R = −0.51, p < 2.2e−16) (Fig. 7K–M).

In conclusion, patients with poor prognosis experienced significant changes in metabolic reprogramming, angiogenesis, EMT, tumorigenic cytokines, and inflammation.

Characterization of prognostic risk factors

We further analyzed the prognostic risk factors HLA-DRB1, SCGB1A1 and HLA-DRB5. All three were expressed at significantly lower levels in the high-risk group than in the low-risk group (Fig. 8A–C). They were also expressed at lower levels in LUAD tissues than in normal lung tissues (Fig. 8A–C). The scRNA-seq dataset GSE131907 was used to explore the expression of the prognostic risk factors in normal lung tissue and primary LUAD. After data preprocessing and quality control, a total of 92,951 single cells were retained and clustered into nine cell types, including endothelial cells, epithelial cells, and monocytes. Similarly, at the single-cell transcriptome level, the expression of all three prognostic risk factors was significantly lower in the disease state than in the normal state (Fig. 8D–F). Differential expression of the prognostic risk factors between the normal and LUAD states was also compared within individual cell types. HLA-DRB1 was differentially expressed between normal and disease states in all cell types except mast cells (Fig. 8G). In the LUAD state, HLA-DRB1 expression was significantly higher than in the normal state in all cell types except B cells and monocytes (Fig. 8G). SCGB1A1 expression was significantly higher in the normal state than in the LUAD state in all cell types (Fig. 8H). HLA-DRB5 showed significant expression differences between normal and disease states in B cells, dendritic cells, endothelial cells, epithelial cells, macrophages, mast cells, and T cells (Fig. 8I).

Recently, targeting TME has emerged as a promising method for cancer treatment⁸⁶. Immune-related genes, as important components of TME, play a crucial role in tumorigenesis and TME homeostasis⁸⁷. In parallel, immune checkpoints can modulate the degree of immune activation in TME⁸⁸. Co-expression analysis was performed to explore the correlation of the prognostic risk factors HLA-DRB1, SCGB1A1 and HLA-DRB5 with immune-related genes (chemokines, chemokine receptors, MHC, immunostimulatory genes, immunosuppressive genes) and immune checkpoints. The prognostic risk factors HLA-DRB1 and HLA-DRB5 are immune checkpoints, of which HLA-DRB1 is also an MHC gene. The prognostic risk factors HLA-DRB1 (Fig. 9), SCGB1A1 (Supplementary Fig. S1), and HLA-DRB5 (Supplementary Fig. S2) were positively correlated with the majority of immune-related genes and immune checkpoints. PD-1 (PDCD1), PD-L1 (CD274), CTLA4 and LAG3 are common immune checkpoints focused on by researchers. HLA-DRB1 was positively correlated with those four immune checkpoints (PD-1: R = 0.36, p = 2.22e−16; PD-L1: R = 0.43, p = 0; CTLA4: R = 0.42, p = 0; LAG3: R = 0.27, p = 1.13e−09), as was HLA-DRB5 (PD-1: R = 0.30, p = 8.75e−12; PD-L1: R = 0.38, p = 0; CTLA4: R = 0.36, p = 8.88e−16; LAG3: R = 0.20, p = 1.44e−05), whereas SCGB1A1 showed a positive correlation with CTLA4 (R = 0.13, p = 4.6e−3). The findings of Aram et al.⁸⁹ showed that HLA-DRB1 was associated with anti-PD-1/PD-L1 therapy response in NSCLC. A recent study demonstrated the ability of fucosylation of HLA-DRB1 to enhance the efficacy of immune checkpoint blockade anti-PD1 drugs in melanoma⁹⁰.

Prediction of immunotherapy response and analysis of therapeutic benefit in different risk groups

Immune checkpoint blockade therapy has made revolutionary progress in the treatment of human cancers, including LUAD⁹¹. Hence, we employed “EaSIeR” to predict the response scores to ICI therapy in the high-risk and low-risk groups. The ICI treatment response score of the low-risk group was significantly higher than that of the high-risk group (p < 2e−16) (Fig. 10A), and the score was significantly negatively correlated with the risk score (R = −0.52, p < 2.2e−16) (Fig. 10B). Additionally, “EaSIeR” provided nine indicators (CYT, Roh_IS, chemokines, Davoli_IS, IFNy, Ayers_expIS, Tcell_inflamed, TLS, and RIR) for the evaluation of the immune response. All of these scores except the RIR score were higher in the low-risk group than in the high-risk group, and these eight scores were negatively correlated with the risk score (Fig. 10C–K). This indicated a stronger immune response in the low-risk group. Inhibitors targeting the common immune checkpoints PD-1, PD-L1, CTLA4 and LAG3 have already received regulatory approval or are undergoing clinical trials⁹². The expression levels of PD-1 (p = 3.1e−08), PD-L1 (p = 9.7e−09), CTLA4 (p = 1.4e−08), and LAG3 (p = 8e−05) were higher in the low-risk group (Fig. 10L–O). These results suggest that patients in the low-risk group may benefit more from immune checkpoint blockade.

The pRRophetic algorithm was applied to evaluate sensitivity differences between the high- and low-risk groups to drugs in the Cancer Genome Project database, in which eight NSCLC therapeutic drugs were included⁹³. The results showed that the high-risk group had significantly lower IC50 values than the low-risk group for Afatinib (p = 1.9e−2), Gefitinib (p = 7.5e−06), and Gemcitabine (p = 5e−2), suggesting that high-risk patients may benefit more from these drugs (Fig. 11A–C). Crizotinib (p = 2.8e−2) and Erlotinib (p = 2e−04) had lower IC50 values in the low-risk group, indicating that these drugs may be more suitable for low-risk patients (Fig. 11D–E). However, there was no significant difference in sensitivity to Docetaxel, Paclitaxel, and Vinorelbine between the two groups (Fig. 11F–H).
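
The group comparison of predicted IC50 values can be illustrated with a two-sample rank test. The sketch below implements a two-sided Wilcoxon rank-sum (Mann-Whitney U) test with a normal approximation, using only the Python standard library. The choice of this test is an assumption, since the statistical test behind the reported p-values is not stated in this section, and the IC50 values are made up for illustration.

```python
from math import erfc, sqrt


def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Uses the normal approximation without tie or continuity correction,
    so the p-value is only approximate for small or heavily tied samples.
    """
    # U counts the pairs in which a value from group_a exceeds one from group_b.
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    n_a, n_b = len(group_a), len(group_b)
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return u, z, p_value


# Made-up predicted IC50 values (arbitrary units), not results from the study.
ic50_high_risk = [1.0, 1.2, 1.5, 1.7, 2.0]
ic50_low_risk = [2.5, 2.8, 3.0, 3.3, 3.6]
u, z, p_value = rank_sum_test(ic50_high_risk, ic50_low_risk)
print(f"U = {u}, z = {z:.2f}, p = {p_value:.3g}")
```

A negative z with a small p-value means the first group tends to have lower IC50 values, which is read as higher predicted drug sensitivity.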

Discussion

LUAD is a cancer with a poor prognosis⁹⁴, and the search for available LUAD biomarkers is important to achieve effective medical management of patients and improve their prognosis. In this study, we proposed an interpretable deep learning framework based on VAE, MLP, and SHAP that maps a large number of genes to a low-dimensional latent representation, enabling efficient feature selection. Because SHAP values reflect both the influence of each feature gene and whether that influence is positive or negative, a potential biomarker scoring algorithm was developed on this basis and 19 LUAD biomarkers were identified. These biomarkers were enriched in metabolic and immune pathways, suggesting that LUAD patients present significant immune alterations and metabolic disturbances.

Given the importance of prognostic prediction in guiding precise therapeutic decisions, a 3-gene prognostic risk model for LUAD was developed through functional profiling and survival significance testing of the identified biomarkers. The high- and low-risk groups defined by the risk score showed differences in the immune and metabolic landscapes. The low-risk group featured higher immune pathway activity, immune infiltration, and ICI treatment response scores. This indicates that high-risk patients were in an immunosuppressive state and that low-risk patients may be more suitable for ICI therapy. Interestingly, the three prognostic risk factors were positively correlated with the majority of immune-related genes. They were differentially expressed between the high-risk and low-risk groups and were enriched in immune-related pathways such as regulation of T cell activation and leukocyte cell-cell adhesion; two of them, HLA-DRB1 and HLA-DRB5, are immune checkpoints. Anti-PD-1/PD-L1 therapy has yielded significant benefits for patients with NSCLC by inhibiting immune checkpoint activity, and a previous study demonstrated that HLA-DRB1 enhances the efficacy of anti-PD-1 therapy⁹⁰. These findings suggest that the screened prognostic risk factors may be promising targets for immunotherapy of LUAD. In contrast, the high-risk group presented significant enrichment of metabolic pathways. The high-risk patients had a higher propensity for metabolic reprogramming of aerobic glycolysis, glutamine, methionine, serine, and glycine, and a lower propensity for metabolic reprogramming of arginine, tryptophan, fatty acids, and sphingolipids. Aerobic glycolysis and the metabolism of serine and glycine have been related to poorer prognosis in cancers^71,80. The close relationship between the prognostic risk model and metabolic reprogramming implies that targeting metabolism may contribute to improving the clinical prognosis of LUAD patients.
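
To make the idea of a 3-gene risk score concrete, the following sketch computes a linear risk score as a weighted sum of the expression of HLA-DRB1, SCGB1A1, and HLA-DRB5 and splits patients at the median score. Everything numeric here is an assumption for illustration: the coefficients are hypothetical placeholders (the fitted Cox coefficients are not given in this section), the median cut-off is assumed rather than taken from the paper, and the patients are made up.

```python
from statistics import median

# Hypothetical coefficients, NOT the fitted values from the study. They are
# negative only to reflect that lower expression of these genes was observed
# in the high-risk group.
COEFFICIENTS = {"HLA-DRB1": -0.10, "SCGB1A1": -0.05, "HLA-DRB5": -0.08}


def risk_score(expression):
    """Linear risk score: sum of coefficient * expression over the three genes."""
    return sum(COEFFICIENTS[gene] * expression[gene] for gene in COEFFICIENTS)


def split_by_median(scores):
    """Label each patient 'high' if the score is above the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Made-up expression values for four hypothetical patients.
patients = {
    "P1": {"HLA-DRB1": 10.0, "SCGB1A1": 8.0, "HLA-DRB5": 9.0},
    "P2": {"HLA-DRB1": 4.0, "SCGB1A1": 2.0, "HLA-DRB5": 3.0},
    "P3": {"HLA-DRB1": 7.0, "SCGB1A1": 5.0, "HLA-DRB5": 6.0},
    "P4": {"HLA-DRB1": 2.0, "SCGB1A1": 1.0, "HLA-DRB5": 1.0},
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(groups)
```

With negative coefficients, patients with lower expression of the three genes receive higher risk scores, which matches the direction of the expression differences described for the risk groups.
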
Notably, increasing evidence suggests that altered metabolic patterns not only provide a survival advantage for tumor cells, but are also critical in modulating anti-tumor immune responses^95,96. Methionine depletion and metabolic byproducts of aerobic glycolysis lead to impaired antitumor immunity^79,97. Blocking glutamine metabolism has been shown to enhance anticancer immunity⁹⁸. The study by Omkar et al.⁹⁹ revealed that modification of fatty acid metabolism in CAR-T cells can improve their anti-tumor immune capacity. These findings indicate that the immune system and metabolism influence each other in cancer progression.

Angiogenesis, EMT, tumorigenic cytokines, and inflammation are correlates of tumor development. Of the risk groups defined by the risk score, the high-risk group presented higher angiogenic activity and lower EMT, tumorigenic cytokine, and inflammatory response scores. Studies have shown that angiogenesis is a poor prognostic indicator for survival in NSCLC^100,101. The significant variations of these factors between the high- and low-risk groups may provide ideas for the treatment of LUAD. Of note, EMT can stimulate cancer cells to produce proinflammatory factors, while inflammation can in turn induce EMT¹⁰². EMT is also closely related to the anti-tumor immune response¹⁰³. In addition, researchers have found that tumor vessels and the immune system affect each other's function, and that the combination of anti-angiogenesis therapies and immunotherapies may produce better therapeutic outcomes¹⁰⁴. Aerobic glycolysis is associated with angiogenesis, a phenomenon closely associated with tumors¹⁰⁵. Researchers have also found that production of the tumorigenic cytokine IL-5 is associated with an enhanced response to immune checkpoint blockade in breast cancer patients¹⁰⁶. Processes such as immunity, metabolism, EMT, inflammation, and angiogenesis may crosstalk with each other, and combinatory interventions targeting them may be promising strategies with potential synergy.

To further explore the diagnostic potential of the identified biomarkers, five common classification models were used: a modified version of the ensemble model AdaBoost.M1³⁹, the generalized linear model GLM¹⁰⁷, the L2-regularized linear support vector machine with class weights SVMLinearWeights2¹⁰⁸, the penalized logistic regression model PLR¹⁰⁹, and the extreme gradient boosting model XGBDART¹¹⁰. The five classification models were trained and evaluated with the leave-one-out method on the TCGA dataset and on an independent GEO dataset, GSE81089 (108 LUAD samples and 19 normal samples). The models were trained using the R package “caret”, and the AUC was used as the evaluation metric. For the TCGA LUAD dataset, the AUC values of the AdaBoost.M1, GLM, SVMLinearWeights2, PLR, and XGBDART models were 0.9012, 0.875, 0.8929, 0.891, and 0.8918, respectively (Supplementary Fig. S3A). For the GSE81089 dataset, the AUC values of the five classification models were 0.8855, 0.9769, 0.9242, 0.9454, and 0.8684, respectively (Supplementary Fig. S3B). To better understand the performance of the models, accuracy, precision, recall, and F1-score were also calculated. The five classification models achieved accuracy, precision, recall, and F1-score of over 94% on both the TCGA dataset and the GSE81089 dataset (Supplementary Table S1). The excellent performance of multiple classification models on different datasets reflected the diagnostic superiority of the biomarkers for LUAD.
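
The leave-one-out evaluation with AUC described above can be sketched in a few lines. The example below is not any of the five caret models. It substitutes a simple nearest-centroid scorer as a stand-in so that the evaluation loop stays self-contained, and it uses made-up expression values for two hypothetical genes. Each sample is scored by a model fitted on all other samples, and the pooled held-out scores are summarized by a rank-based AUC.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def leave_one_out_scores(features, labels):
    """Score each sample with centroids fitted on all remaining samples."""
    scores = []
    for held_out in range(len(features)):
        train = [(x, y) for i, (x, y) in enumerate(zip(features, labels)) if i != held_out]
        tumor_centroid = centroid([x for x, y in train if y == 1])
        normal_centroid = centroid([x for x, y in train if y == 0])
        sample = features[held_out]
        # Higher score: closer to the tumor centroid than to the normal centroid.
        scores.append(squared_distance(sample, normal_centroid)
                      - squared_distance(sample, tumor_centroid))
    return scores


# Made-up expression of two hypothetical genes; label 1 = LUAD, 0 = normal.
features = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [0.9, 1.1],
            [3.0, 3.2], [2.8, 3.1], [3.1, 2.9]]
labels = [1, 1, 1, 1, 0, 0, 0]
loo_scores = leave_one_out_scores(features, labels)
loo_auc = auc(labels, loo_scores)
print(f"Leave-one-out AUC = {loo_auc:.3f}")
```

Pooling the held-out scores across folds is one common way to obtain a single AUC under leave-one-out, because each fold contains only one test sample and cannot yield an AUC on its own.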

Conclusion

Collectively, this study proposed a novel deep learning framework and identified 19 biomarkers and 3 prognostic signatures of LUAD. The immune and metabolic landscapes, as well as alterations in angiogenesis, EMT, tumorigenic cytokines, and inflammation, were investigated on the basis of the prognostic signatures. In addition, the prognostic signatures may be indicators for predicting response to immunotherapy, but further research is needed to clarify this point. Our study may provide new insights for screening genomic markers for LUAD, support clinical judgment of LUAD prognosis, and be applicable to other cancers.

Data availability

All data generated or analysed during this study are included in this published article (and its Supplementary Information files).

References

Mullard, A. Addressing cancer’s grand challenges. Nat. Rev. Drug Discov. 19, 825–826. https://doi.org/10.1038/d41573-020-00202-0 (2020).
Sung, H. et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 71, 209–249. https://doi.org/10.3322/caac.21660 (2021).
Zhang, N. et al. Circular RNA circSATB2 promotes progression of non-small cell lung cancer cells. Mol. Cancer 19, 101. https://doi.org/10.1186/s12943-020-01221-6 (2020).
Shukla, S. et al. Development of a RNA-seq based prognostic signature in lung adenocarcinoma. J. Natl. Cancer Inst. https://doi.org/10.1093/jnci/djw200 (2017).
Hirsch, F. R. et al. Lung cancer: Current therapies and new targeted treatments. Lancet 389, 299–311. https://doi.org/10.1016/S0140-6736(16)30958-8 (2017).
Sharma, A. K. Emerging trends in biomarker discovery: Ease of prognosis and prediction in cancer. Semin. Cancer Biol. 52, iii–iv. https://doi.org/10.1016/j.semcancer.2018.05.008 (2018).
Wolrab, D. et al. Lipidomic profiling of human serum enables detection of pancreatic cancer. Nat. Commun. 13, 124. https://doi.org/10.1038/s41467-021-27765-9 (2022).
Specogna, A. V. & Sinicrope, F. A. Defining colon cancer biomarkers by using deep learning. Lancet 395, 314–316. https://doi.org/10.1016/S0140-6736(20)30034-9 (2020).
Huang, L. et al. Machine learning of serum metabolic patterns encodes early-stage lung adenocarcinoma. Nat. Commun. 11, 3556. https://doi.org/10.1038/s41467-020-17347-6 (2020).
Shi, R. et al. Identification and validation of hypoxia-derived gene signatures to predict clinical outcomes and therapeutic responses in stage I lung adenocarcinoma patients. Theranostics 11, 5061–5076. https://doi.org/10.7150/thno.56202 (2021).
Zare, A., Postovit, L. M. & Githaka, J. M. Robust inflammatory breast cancer gene signature using nonparametric random forest analysis. Breast Cancer Res. 23, 92. https://doi.org/10.1186/s13058-021-01467-y (2021).
Zhang, B. et al. m(6)A target microRNAs in serum for cancer detection. Mol. Cancer 20, 170. https://doi.org/10.1186/s12943-021-01477-6 (2021).
Zhang, N. et al. An artificial intelligence network-guided signature for predicting outcome and immunotherapy response in lung adenocarcinoma patients based on 26 machine learning algorithms. Cell Prolif. 56, e13409. https://doi.org/10.1111/cpr.13409 (2023).
Zhang, Z. et al. Deep learning identifies a T-cell exhaustion-dependent transcriptional signature for predicting clinical outcomes and response to immune checkpoint blockade. Oncogenesis 12, 37. https://doi.org/10.1038/s41389-023-00482-2 (2023).
Pei, G., Hu, R., Dai, Y., Zhao, Z. & Jia, P. Decoding whole-genome mutational signatures in 37 human pan-cancers by denoising sparse autoencoder neural network. Oncogene 39, 5031–5041. https://doi.org/10.1038/s41388-020-1343-z (2020).
Wang, J. et al. Denoising autoencoder, a deep learning algorithm, aids the identification of a novel molecular signature of lung adenocarcinoma. Genom. Proteom. Bioinform. 18, 468–480. https://doi.org/10.1016/j.gpb.2019.02.003 (2020).
Divate, M. et al. Deep learning-based pan-cancer classification model reveals tissue-of-origin specific gene expression signatures. Cancers https://doi.org/10.3390/cancers14051185 (2022).
Jha, A. et al. Identifying common transcriptome signatures of cancer by interpreting deep learning models. Genome Biol. 23, 117. https://doi.org/10.1186/s13059-022-02681-3 (2022).
Zou, J. et al. A primer on deep learning in genomics. Nat. Genet. 51, 12–18. https://doi.org/10.1038/s41588-018-0295-5 (2019).
Kingma, D. P. & Welling, M. Auto-encoding variational bayes. arXiv:1312.6114 (2013). <https://ui.adsabs.harvard.edu/abs/2013arXiv1312.6114K>.
Gomari, D. P. et al. Variational autoencoders learn transferrable representations of metabolomics data. Commun. Biol. 5, 645. https://doi.org/10.1038/s42003-022-03579-3 (2022).
Way, G. P., Zietz, M., Rubinetti, V., Himmelstein, D. S. & Greene, C. S. Compressing gene expression data using multiple latent space dimensionalities learns complementary biological representations. Genome Biol. 21, 109. https://doi.org/10.1186/s13059-020-02021-3 (2020).
Ding, X., Zou, Z. & Brooks III, C. L. Deciphering protein evolution and fitness landscapes with latent space models. Nat. Commun. 10, 5644. https://doi.org/10.1038/s41467-019-13633-0 (2019).
Arslan, E., Schulz, J. & Rai, K. Machine learning in epigenomics: Insights into cancer biology and medicine. Biochim. Biophys. Acta Rev. Cancer 1876, 188588. https://doi.org/10.1016/j.bbcan.2021.188588 (2021).
Lorencin, I., Andelic, N., Spanjol, J. & Car, Z. Using multi-layer perceptron with Laplacian edge detector for bladder cancer diagnosis. Artif. Intell. Med. 102, 101746. https://doi.org/10.1016/j.artmed.2019.101746 (2020).
Ren, X. et al. Machine learning reveals salivary glycopatterns as potential biomarkers for the diagnosis and prognosis of papillary thyroid cancer. Int. J. Biol. Macromol. 215, 280–289. https://doi.org/10.1016/j.ijbiomac.2022.05.194 (2022).
Ellis, M. et al. Development and validation of a method for using breast core needle biopsies for gene expression microarray analyses. Clin. Cancer Res. 8, 1155–1166 (2002).
Battista, A., Battista, R. A., Battista, F., Iovane, G. & Landi, R. E. BH-index: A predictive system based on serum biomarkers and ensemble learning for early colorectal cancer diagnosis in mass screening. Comput. Methods Programs Biomed. 212, 106494. https://doi.org/10.1016/j.cmpb.2021.106494 (2021).
Hu, X. et al. Artificial neural networks and prostate cancer–tools for diagnosis and management. Nat. Rev. Urol. 10, 174–182. https://doi.org/10.1038/nrurol.2013.9 (2013).
Lundberg, S. & Lee, S.-I. A unified approach to interpreting model predictions. arXiv:1705.07874 (2017). <https://ui.adsabs.harvard.edu/abs/2017arXiv170507874L>.
Chakraborty, D. et al. Explainable artificial intelligence reveals novel insight into tumor microenvironment conditions linked with better prognosis in patients with breast cancer. Cancers https://doi.org/10.3390/cancers13143450 (2021).
Rynazal, R. et al. Leveraging explainable AI for gut microbiome-based colorectal cancer classification. Genome Biol. 24, 21. https://doi.org/10.1186/s13059-023-02858-4 (2023).
Cheng, J. et al. Multimodal disentangled variational autoencoder with game theoretic interpretability for glioma grading. IEEE J. Biomed. Health Inform. 26, 673–684. https://doi.org/10.1109/JBHI.2021.3095476 (2022).
Ma, M. et al. Predicting the molecular subtype of breast cancer and identifying interpretable imaging features using machine learning algorithms. Eur. Radiol. 32, 1652–1662. https://doi.org/10.1007/s00330-021-08271-4 (2022).
Xu, G. et al. Decoding river pollution trends and their landscape determinants in an ecologically fragile karst basin using a machine learning model. Environ. Res. 214, 113843. https://doi.org/10.1016/j.envres.2022.113843 (2022).
Barrett, T. et al. NCBI GEO: Archive for functional genomics data sets–update. Nucleic Acids Res. 41, D991–D995. https://doi.org/10.1093/nar/gks1193 (2013).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140. https://doi.org/10.1093/bioinformatics/btp616 (2010).
Paszke, A. et al. Pytorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 32 (2019).
Freund, Y. & Schapire, R. E. Experiments with a new boosting algorithm. In ICML 148–156 (Citeseer, 1996).
Wu, T. et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation 2, 100141. https://doi.org/10.1016/j.xinn.2021.100141 (2021).
Rath, S. et al. MitoCarta3.0: An updated mitochondrial proteome now with sub-organelle localization and pathway annotations. Nucleic Acids Res. 49, D1541–D1547. https://doi.org/10.1093/nar/gkaa1011 (2021).
Hanzelmann, S., Castelo, R. & Guinney, J. GSVA: Gene set variation analysis for microarray and RNA-seq data. BMC Bioinform. 14, 7. https://doi.org/10.1186/1471-2105-14-7 (2013).
Kanehisa, M., Furumichi, M., Sato, Y., Kawashima, M. & Ishiguro-Watanabe, M. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res. 51, D587–D592. https://doi.org/10.1093/nar/gkac963 (2023).
Subramanian, A. et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. USA 102, 15545–15550. https://doi.org/10.1073/pnas.0506580102 (2005).
Newman, A. M. et al. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat. Biotechnol. 37, 773–782. https://doi.org/10.1038/s41587-019-0114-2 (2019).
Yoshihara, K. et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat. Commun. 4, 2612. https://doi.org/10.1038/ncomms3612 (2013).
Li, Z. & Zhang, H. Reprogramming of glucose, fatty acid and amino acid metabolism for cancer progression. Cell Mol. Life Sci. 73, 377–392. https://doi.org/10.1007/s00018-015-2070-4 (2016).
Liu, Z. L., Chen, H. H., Zheng, L. L., Sun, L. P. & Shi, L. Angiogenic signaling pathways and anti-angiogenic therapy for cancer. Signal Transduct. Target. Ther. 8, 198. https://doi.org/10.1038/s41392-023-01460-1 (2023).
Huang, Y., Hong, W. & Wei, X. The molecular mechanisms and therapeutic strategies of EMT in tumor progression and metastasis. J. Hematol. Oncol. 15, 129. https://doi.org/10.1186/s13045-022-01347-8 (2022).
Lytle, N. K., Barber, A. G. & Reya, T. Stem cell fate in cancer growth, progression and therapy resistance. Nat. Rev. Cancer 18, 669–680. https://doi.org/10.1038/s41568-018-0056-x (2018).
Qian, B. Z. Inflammation fires up cancer metastasis. Semin. Cancer Biol. 47, 170–176. https://doi.org/10.1016/j.semcancer.2017.08.006 (2017).
Qiu, C. et al. Identification of molecular subtypes and a prognostic signature based on inflammation-related genes in colon adenocarcinoma. Front. Immunol. 12, 769685. https://doi.org/10.3389/fimmu.2021.769685 (2021).
Lapuente-Santana, O., van Genderen, M., Hilbers, P. A. J., Finotello, F. & Eduati, F. Interpretable systems biomarkers predict response to immune-checkpoint inhibitors. Patterns 2, 100293. https://doi.org/10.1016/j.patter.2021.100293 (2021).
Ru, B. et al. TISIDB: An integrated repository portal for tumor-immune system interactions. Bioinformatics 35, 4200–4202. https://doi.org/10.1093/bioinformatics/btz210 (2019).
Hu, F. F., Liu, C. J., Liu, L. L., Zhang, Q. & Guo, A. Y. Expression profile of immune checkpoint genes and their roles in predicting immunotherapy response. Brief Bioinform. https://doi.org/10.1093/bib/bbaa176 (2021).
Vercellino, I. & Sazanov, L. A. The assembly, regulation and function of the mitochondrial respiratory chain. Nat. Rev. Mol. Cell Biol. 23, 141–161. https://doi.org/10.1038/s41580-021-00415-0 (2022).
Yu, T. et al. CircRNAs in cancer metabolism: A review. J. Hematol. Oncol. 12, 90. https://doi.org/10.1186/s13045-019-0776-8 (2019).
Cruz-Bermudez, A. et al. PGC-1alpha levels correlate with survival in patients with stage III NSCLC and may define a new biomarker to metabolism-targeted therapy. Sci. Rep. 7, 16661. https://doi.org/10.1038/s41598-017-17009-6 (2017).
Majem, B., Nadal, E. & Munoz-Pinedo, C. Exploiting metabolic vulnerabilities of non small cell lung carcinoma. Semin. Cell Dev. Biol. 98, 54–62. https://doi.org/10.1016/j.semcdb.2019.06.004 (2020).
Schreiber, R. D., Old, L. J. & Smyth, M. J. Cancer immunoediting: Integrating immunity’s roles in cancer suppression and promotion. Science 331, 1565–1570. https://doi.org/10.1126/science.1203486 (2011).
Katsuta, E., Rashid, O. M. & Takabe, K. Fibroblasts as a biological marker for curative resection in pancreatic ductal adenocarcinoma. Int. J. Mol. Sci. https://doi.org/10.3390/ijms21113890 (2020).
Romani, C. et al. Gene expression profiling of olfactory neuroblastoma helps identify prognostic pathways and define potentially therapeutic targets. Cancers https://doi.org/10.3390/cancers13112527 (2021).
Zhang, C. et al. Comprehensive molecular analyses of a TNF family-based signature with regard to prognosis, immune features, and biomarkers for immunotherapy in lung adenocarcinoma. EBioMedicine 59, 102959. https://doi.org/10.1016/j.ebiom.2020.102959 (2020).
Zhang, Z. et al. A novel basement membrane-related gene signature for prognosis of lung adenocarcinomas. Comput. Biol. Med. 154, 106597. https://doi.org/10.1016/j.compbiomed.2023.106597 (2023).
Zhang, A., Yang, J., Ma, C., Li, F. & Luo, H. Development and validation of a robust ferroptosis-related prognostic signature in lung adenocarcinoma. Front. Cell Dev. Biol. 9, 616271. https://doi.org/10.3389/fcell.2021.616271 (2021).
Li, Q. et al. Combining autophagy and immune characterizations to predict prognosis and therapeutic response in lung adenocarcinoma. Front. Immunol. 13, 944378. https://doi.org/10.3389/fimmu.2022.944378 (2022).
Ouyang, W. et al. A prognostic risk score based on hypoxia-, immunity-, and epithelial-to-mesenchymal transition-related genes for the prognosis and immunotherapy response of lung adenocarcinoma. Front. Cell Dev. Biol. 9, 758777. https://doi.org/10.3389/fcell.2021.758777 (2021).
Qi, C., Ma, J., Sun, J., Wu, X. & Ding, J. The role of molecular subtypes and immune infiltration characteristics based on disulfidptosis-associated genes in lung adenocarcinoma. Aging (Albany NY) 15, 5075–5095. https://doi.org/10.18632/aging.204782 (2023).
Martinez-Reyes, I. & Chandel, N. S. Cancer metabolism: Looking forward. Nat. Rev. Cancer 21, 669–680. https://doi.org/10.1038/s41568-021-00378-6 (2021).
Counihan, J. L., Grossman, E. A. & Nomura, D. K. Cancer metabolism: Current understanding and therapies. Chem. Rev. 118, 6893–6923. https://doi.org/10.1021/acs.chemrev.7b00775 (2018).
Chen, X. et al. Monomethyltransferase SET8 facilitates hepatocellular carcinoma growth by enhancing aerobic glycolysis. Cell Death Dis. 10, 312. https://doi.org/10.1038/s41419-019-1541-1 (2019).
Altman, B. J., Stine, Z. E. & Dang, C. V. From Krebs to clinic: Glutamine metabolism to cancer therapy. Nat. Rev. Cancer 16, 619–634. https://doi.org/10.1038/nrc.2016.71 (2016).
Wang, Z. et al. Methionine is a metabolic dependency of tumor-initiating cells. Nat. Med. 25, 825–837. https://doi.org/10.1038/s41591-019-0423-5 (2019).
Locasale, J. W. Serine, glycine and one-carbon units: Cancer metabolism in full circle. Nat. Rev. Cancer 13, 572–583. https://doi.org/10.1038/nrc3557 (2013).
Morris, S. M. Jr. Recent advances in arginine metabolism: Roles and regulation of the arginases. Br. J. Pharmacol. 157, 922–930. https://doi.org/10.1111/j.1476-5381.2009.00278.x (2009).
Xue, C. et al. Tryptophan metabolism in health and disease. Cell Metab. 35, 1304–1326. https://doi.org/10.1016/j.cmet.2023.06.004 (2023).
Oh, M. H. et al. Targeting glutamine metabolism enhances tumor-specific immunity by modulating suppressive myeloid cells. J. Clin. Invest. 130, 3865–3884. https://doi.org/10.1172/JCI131859 (2020).
Sanderson, S. M., Gao, X., Dai, Z. & Locasale, J. W. Methionine metabolism in health and cancer: A nexus of diet and precision medicine. Nat. Rev. Cancer 19, 625–637. https://doi.org/10.1038/s41568-019-0187-8 (2019).
Bian, Y. et al. Cancer SLC43A2 alters T cell methionine metabolism and histone methylation. Nature 585, 277–282. https://doi.org/10.1038/s41586-020-2682-1 (2020).
Amelio, I., Cutruzzola, F., Antonov, A., Agostini, M. & Melino, G. Serine and glycine metabolism in cancer. Trends Biochem. Sci. 39, 191–198. https://doi.org/10.1016/j.tibs.2014.02.004 (2014).
Butler, L. M. et al. Lipids and cancer: Emerging roles in pathogenesis, diagnosis and therapeutic intervention. Adv. Drug Deliv. Rev. 159, 245–293. https://doi.org/10.1016/j.addr.2020.07.013 (2020).
Carracedo, A., Cantley, L. C. & Pandolfi, P. P. Cancer metabolism: Fatty acid oxidation in the limelight. Nat. Rev. Cancer 13, 227–232. https://doi.org/10.1038/nrc3483 (2013).
Ogretmen, B. Sphingolipid metabolism in cancer signalling and therapy. Nat. Rev. Cancer 18, 33–50. https://doi.org/10.1038/nrc.2017.96 (2018).
Cox, G., Jones, J. L., Walker, R. A., Steward, W. P. & O’Byrne, K. J. Angiogenesis and non-small cell lung cancer. Lung Cancer 27, 81–100. https://doi.org/10.1016/s0169-5002(99)00096-3 (2000).
Herbst, R. S., Onn, A. & Sandler, A. Angiogenesis and lung cancer: Prognostic and therapeutic implications. J. Clin. Oncol. 23, 3243–3256. https://doi.org/10.1200/JCO.2005.18.853 (2005).
Xiao, Y. & Yu, D. Tumor microenvironment as a therapeutic target in cancer. Pharmacol. Ther. 221, 107753. https://doi.org/10.1016/j.pharmthera.2020.107753 (2021).
Hu, C. et al. Comprehensive profiling of immune-related genes in soft tissue sarcoma patients. J. Transl. Med. 18, 337. https://doi.org/10.1186/s12967-020-02512-8 (2020).
Pardoll, D. M. The blockade of immune checkpoints in cancer immunotherapy. Nat. Rev. Cancer 12, 252–264. https://doi.org/10.1038/nrc3239 (2012).
Musaelyan, A. A. et al. Inflammatory and autoimmune predictive markers of response to anti-PD-1/PD-L1 therapy in NSCLC and melanoma. Exp. Ther. Med. 24, 557. https://doi.org/10.3892/etm.2022.11495 (2022).
Lester, D. K. et al. Fucosylation of HLA-DRB1 regulates CD4(+) T cell-mediated anti-melanoma immunity and enhances immunotherapy efficacy. Nat. Cancer 4, 222–239. https://doi.org/10.1038/s43018-022-00506-7 (2023).
Morad, G., Helmink, B. A., Sharma, P. & Wargo, J. A. Hallmarks of response, resistance, and toxicity to immune checkpoint blockade. Cell 184, 5309–5337. https://doi.org/10.1016/j.cell.2021.09.020 (2021).
Zhou, F., Qiao, M. & Zhou, C. The cutting-edge progress of immune-checkpoint blockade in lung cancer. Cell Mol. Immunol. 18, 279–293. https://doi.org/10.1038/s41423-020-00577-5 (2021).
Geeleher, P., Cox, N. & Huang, R. S. pRRophetic: An R package for prediction of clinical chemotherapeutic response from tumor gene expression levels. PLoS One 9, e107468. https://doi.org/10.1371/journal.pone.0107468 (2014).
Ma, L. et al. The essential roles of m(6)A RNA modification to stimulate ENO1-dependent glycolysis and tumorigenesis in lung adenocarcinoma. J. Exp. Clin. Cancer Res. 41, 36. https://doi.org/10.1186/s13046-021-02200-5 (2022).
Xia, L. et al. The cancer metabolic reprogramming and immune response. Mol. Cancer 20, 28. https://doi.org/10.1186/s12943-021-01316-8 (2021).
Wang, W. & Zou, W. Amino acids and their transporters in T cell immunity and cancer therapy. Mol. Cell 80, 384–395. https://doi.org/10.1016/j.molcel.2020.09.006 (2020).
Reinfeld, B. I., Rathmell, W. K., Kim, T. K. & Rathmell, J. C. The therapeutic implications of immunosuppressive tumor aerobic glycolysis. Cell Mol. Immunol. 19, 46–58. https://doi.org/10.1038/s41423-021-00727-3 (2022).
Huang, M. et al. Targeting glutamine metabolism to enhance immunoprevention of EGFR-driven lung cancer. Adv. Sci. 9, e2105885. https://doi.org/10.1002/advs.202105885 (2022).
Kawalekar, O. U. et al. Distinct signaling of coreceptors regulates specific metabolism pathways and impacts memory development in CAR T cells. Immunity 44, 380–390. https://doi.org/10.1016/j.immuni.2016.01.021 (2016).
Giatromanolaki, A. et al. Prognostic value of angiogenesis in operable non-small cell lung cancer. J. Pathol. 179, 80–88. https://doi.org/10.1002/(SICI)1096-9896(199605)179:1%3c80::AID-PATH547%3e3.0.CO;2-X (1996).
Cai, S. et al. Integrative analysis and experiments to explore angiogenesis regulators correlated with poor prognosis, immune infiltration and cancer progression in lung adenocarcinoma. J. Transl. Med. 19, 361. https://doi.org/10.1186/s12967-021-03031-w (2021).
Suarez-Carmona, M., Lesage, J., Cataldo, D. & Gilles, C. EMT and inflammation: Inseparable actors of cancer progression. Mol. Oncol. 11, 805–823. https://doi.org/10.1002/1878-0261.12095 (2017).
Xiao, Z., Cai, Z., Deng, D., Tong, S. & Zu, X. An EMT-based risk score thoroughly predicts the clinical prognosis, tumor immune microenvironment and molecular subtypes of bladder cancer. Front. Immunol. 13, 1000321. https://doi.org/10.3389/fimmu.2022.1000321 (2022).
Tian, L. et al. Mutual regulation of tumour vessel normalization and immunostimulatory reprogramming. Nature 544, 250–254. https://doi.org/10.1038/nature21724 (2017).
Zhang, D. et al. Metabolic regulation of gene expression by histone lactylation. Nature 574, 575–580. https://doi.org/10.1038/s41586-019-1678-1 (2019).
Blomberg, O. S. et al. IL-5-producing CD4(+) T cells and eosinophils cooperate to enhance response to immune checkpoint blockade in breast cancer. Cancer Cell 41, 106–123. https://doi.org/10.1016/j.ccell.2022.11.014 (2023).
Nelder, J. A. & Wedderburn, R. W. Generalized linear models. J. R. Stat. Soc. Ser. A Stat. Soc. 135, 370–384 (1972).
Fan, R.-E., Chang, K.-W., Hsieh, C.-J., Wang, X.-R. & Lin, C.-J. LIBLINEAR: A library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008).
Park, M. Y. & Hastie, T. Penalized logistic regression for detecting gene interactions. Biostatistics 9, 30–50. https://doi.org/10.1093/biostatistics/kxm010 (2008).
Vinayak, R. K. & Gilad-Bachrach, R. in Artificial Intelligence and Statistics. 489–497 (PMLR).

Funding

This study was funded by the National Natural Science Foundation of China (61702141; 81627901), the Natural Science Foundation of Heilongjiang Province (LH2021F043) and the Heilongjiang Postdoctoral Funds for Scientific Research Initiation (LBH-Q17132).

Author information

These authors contributed equally: Shimei Qin, Shibin Sun and Yahui Wang.

Authors and Affiliations

College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150000, China
Shimei Qin, Shibin Sun, Yahui Wang, Chao Li, Lei Fu, Ming Wu, Jinxing Yan, Wan Li, Junjie Lv & Lina Chen

Authors

Shimei Qin
Shibin Sun
Yahui Wang
Chao Li
Lei Fu
Ming Wu
Jinxing Yan
Wan Li
Junjie Lv
Lina Chen

Contributions

L.C. and J.L. contributed to conceptualization, project administration, and writing (review and editing). S.Q. contributed to data curation, formal analysis, investigation, methodology, validation, and writing (original draft, review and editing). S.S. contributed to data curation, formal analysis, investigation, and methodology. Y.W. contributed to data curation, investigation, and formal analysis. C.L., L.F., M.W., J.Y. and W.L. carried out data curation and validation. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Junjie Lv or Lina Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Qin, S., Sun, S., Wang, Y. et al. Immune, metabolic landscapes of prognostic signatures for lung adenocarcinoma based on a novel deep learning framework. Sci Rep 14, 527 (2024). https://doi.org/10.1038/s41598-023-51108-x

Received: 17 October 2023
Accepted: 30 December 2023
Published: 04 January 2024
DOI: https://doi.org/10.1038/s41598-023-51108-x
