Artificial intelligence reveals dysregulation of osteosarcoma and cuproptosis-related biomarkers, PDHA1, CDKN2A and neutrophils

Jiang, Jie; Zhan, Xinli; Wei, Jianxun; Fan, Qie; Li, Haowen; Li, Hao; Li, Shuzhen; Zhao, Yong; Yin, Guodong; Tang, Lin; Wu, Yongxiang; Lan, Mindong; Qin, Yijue; Guo, Quan; Xu, Weicheng; Lu, Ling; Yang, Yanwei; Zhang, Yitian; Qu, Haishun

doi:10.1038/s41598-023-32195-2

Article
Open access
Published: 26 March 2023

Jie Jiang ORCID: orcid.org/0000-0002-4599-1544¹,
Xinli Zhan ORCID: orcid.org/0000-0002-6936-481X²,
Jianxun Wei¹,
Qie Fan¹,
Haowen Li¹,
Hao Li¹,
Shuzhen Li¹,
Yong Zhao¹,
Guodong Yin¹,
Lin Tang¹,
Yongxiang Wu¹,
Mindong Lan¹,
Yijue Qin³,
Quan Guo³,
Weicheng Xu³,
Ling Lu³,
Yanwei Yang³,
Yitian Zhang³ &
Haishun Qu³

Scientific Reports volume 13, Article number: 4927 (2023)

Abstract

At present, the role of cuproptosis-related genes in osteosarcoma (OS) is largely unknown. Genome-wide expression data for osteosarcoma and controls were downloaded from three databases, and cuproptosis-associated diagnostic models for osteosarcoma were constructed using three artificial intelligence methods: support vector machines, random forest and LASSO regression. Differences in immune cell composition were examined using routine blood data from 25,665 cases. Differential expression was examined using immunohistochemistry and PCR. PDHA1 and CDKN2A were identified as specific cuproptosis-related biomarkers for osteosarcoma by the artificial intelligence analysis. PDHA1, CDKN2A and neutrophils differed between the OS and control groups. PDHA1 and CDKN2A are significantly dysregulated in OS and can serve as biomarkers of OS.

Introduction

Osteosarcoma (OS) is the most common primary malignant bone tumor. It mainly affects children and adolescents, and the 5-year survival rate is less than 20% once it metastasizes¹,². Immune dysregulation has been shown to be associated with the development of osteosarcoma. Although treatment has improved in recent years, improving the survival of patients with osteosarcoma has long been considered challenging, and immune checkpoint inhibition may enhance the therapeutic effect on OS by relieving immunosuppression and improving the tumor microenvironment³. It is increasingly recognized that osteosarcoma cells, immune cells, osteoblasts and osteoclasts coexist and interact with each other in the microenvironment of osteosarcoma⁴. Because of the poor prognosis of OS, there is an urgent need to find more accurate biomarkers to guide early diagnosis.

Artificial intelligence (AI) is an emerging technology, and as it advances, more and more AI methods are being used in the medical field⁵. Some studies have shown that AI tools can increase the detection of precancerous polyps during colonoscopy, an advance that may help prevent colon cancer in the long term⁶. Mirelman et al. applied machine learning features and classification algorithms to differentiate between 332 Parkinson's disease patients (Hoehn and Yahr scale I–III) and 100 healthy controls, and found that the machine learning approach had higher discriminatory value in motor disease segments⁵. However, research on and application of AI in OS is still inadequate, so researchers need to apply AI technology to the early diagnosis of OS.

Copper is an essential cofactor in all organisms, but it becomes toxic once its concentration exceeds the threshold maintained by homeostatic mechanisms. In human cells, copper-dependent death occurs through direct binding of copper to lipoylated components, which leads to aggregation of lipoylated proteins and loss of iron-sulfur cluster proteins, ultimately leading to cell death⁷. The balance of copper is crucial, as dysregulation of intracellular copper bioavailability induces cytotoxicity and oxidative stress. In both prokaryotes and eukaryotes, copper homeostasis is finely regulated, mainly to prevent excessive accumulation of copper ions in the cell, which would threaten cell survival⁸. It is now clear that the genes CDKN2A, FDX1, DLD, DLAT, LIAS, GLS, LIPT1, MTF1, PDHA1 and PDHB are closely associated with cuproptosis as a form of cell death⁷,⁹. There is still a gap in the research on this novel form of cell death in OS.

The main objective of this study is to identify cuproptosis-related and immune-related diagnostic biomarkers of OS through an artificial intelligence approach, in order to guide early clinical diagnosis and immunotherapy of this disease more accurately.

Materials and methods

Data download and preliminary processing

In this study, gene expression data of osteosarcoma for the training set were downloaded from the UCSC Xena database (https://xena.ucsc.edu/), and skeletal muscle samples from the GTEx database (https://www.gtexportal.org/home/) were used as normal controls for the training set. Both datasets were normalized and corrected for inter-batch differences. The GSE42352 dataset (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE42352) from the GEO database was selected as the validation set for this study¹⁰,¹¹. Subsequently, the probe IDs of the training and validation sets were converted into gene symbols. All statistical analyses and graphing were performed in the programming language R x64 (version 4.1.3), and Strawberry Perl (https://strawberryperl.com/, version: v5.32.1) was used for the conversion of probes into gene symbols and for data processing.

Differential expression analysis

In this study, we first performed differential expression analysis of genome-wide mRNA expression in the training set using the “limma” package, with cut off values set to |logFC| ≥ 1.5 and adjusted-p < 0.05. We then used the “pheatmap” package to visualize the top 100 differentially expressed genes as a heat map, and the “dplyr”, “ggplot2” and “ggrepel” packages to visualize all the differentially expressed genes as a volcano map. To investigate the role of cuproptosis-related genes in OS, 10 cuproptosis genes were extracted from the genome-wide mRNA expression matrix, and pairwise correlations between these genes were computed using the “corrplot” and “circlize” packages and visualized as correlation heat maps.
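The cut off rule above can be illustrated with a short sketch. This is not the authors' R/limma code; it only shows how the two thresholds (|logFC| ≥ 1.5 and adjusted-p < 0.05) combine. The gene names and statistics below are hypothetical placeholders, not results from the study, and the names `GeneStat` and `filter_degs` are introduced here for illustration.

```python
from typing import List, NamedTuple


class GeneStat(NamedTuple):
    symbol: str    # gene symbol
    log_fc: float  # log fold change (OS vs control)
    adj_p: float   # multiple-testing adjusted p value


def filter_degs(stats: List[GeneStat],
                min_abs_log_fc: float = 1.5,
                max_adj_p: float = 0.05) -> List[GeneStat]:
    """Keep genes with |logFC| >= min_abs_log_fc and adjusted p < max_adj_p."""
    return [g for g in stats
            if abs(g.log_fc) >= min_abs_log_fc and g.adj_p < max_adj_p]


# Hypothetical values for illustration only (not data from the study).
example = [
    GeneStat("GENE_A", 2.1, 0.001),   # passes both thresholds
    GeneStat("GENE_B", -1.8, 0.01),   # passes: the sign of logFC does not matter
    GeneStat("GENE_C", 0.9, 0.0001),  # fails: effect size too small
    GeneStat("GENE_D", 3.0, 0.20),    # fails: not significant after adjustment
]

selected = filter_degs(example)
print([g.symbol for g in selected])
```

Both conditions must hold at the same time, so a gene with a large fold change but a non-significant adjusted p value is excluded, and vice versa.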

Artificial intelligence of randomForest analysis

In order to obtain more accurate early diagnostic biomarkers, a further screening was performed using the randomForest method. Random forest is a supervised learning method in which multiple prediction models are generated simultaneously and their results are aggregated to improve accuracy¹². Random forest has several advantages over other AI analysis methods: it can handle a large number of input variables, assess the importance of variables, and handle mixed data¹³,¹⁴. We screened the cuproptosis-associated genes again to obtain the genes for the final diagnostic model.

Artificial intelligence machine learning-support vector machine recursive feature elimination (SVM-RFE) analysis

Here, we used the SVM-RFE approach to analyze all cuproptosis-related genes in order to obtain the most accurate biomarkers for early diagnosis of OS.

SVM is a class of generalized linear classifiers that perform binary classification of data in a supervised learning manner. The decision boundary is the maximum-margin hyperplane fitted to the training samples, i.e., the separating hyperplane that correctly divides the data set with the largest geometric margin. SVM-RFE is used to reduce the features and find the optimal number of features, which can remove confounding factors very efficiently to obtain high accuracy¹⁵. As an advanced artificial intelligence screening, we analyzed cuproptosis genes using the “e1071”, “kernlab” and “caret” packages of the programming language R to obtain the optimal diagnostic genes.

Analysis of least absolute shrinkage and selection operator (LASSO) of artificial intelligence

Subsequently, we used LASSO regression analysis for further screening of cuproptosis genes. The LASSO method achieves variable selection by shrinking the coefficients of non-significant variables to zero¹⁶. Therefore, we used this method to obtain the most streamlined and accurate genes for constructing diagnostic models.
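How LASSO drives coefficients to exactly zero can be seen in the soft-thresholding operator. The sketch below is an illustration under a simplifying assumption, not the authors' analysis: for an orthonormal design matrix and the objective (1/2)·||y − Xb||² + λ·||b||₁, the LASSO estimate is the soft-thresholded ordinary least squares coefficient. The coefficient values and the penalty `lam` are hypothetical.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; values with |z| <= lam become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coefficients for illustration only.
ols_coefficients = {"feature_1": 2.0, "feature_2": -2.0, "feature_3": 0.3}
lam = 0.5  # hypothetical penalty strength

lasso_coefficients = {name: soft_threshold(value, lam)
                      for name, value in ols_coefficients.items()}
selected = [name for name, value in lasso_coefficients.items() if value != 0.0]

print(lasso_coefficients)
print(selected)
```

Variables whose unpenalized coefficient is smaller in magnitude than the penalty are dropped, which is why LASSO performs selection and shrinkage at once. In practice the penalty is usually chosen by cross-validation rather than fixed by hand.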

Differentially expressed gene enrichment analysis and protein–protein interaction network (PPI)

In this study, we performed GO enrichment analysis and KEGG pathway enrichment analysis¹⁷,¹⁸,¹⁹ (https://www.kegg.jp/kegg/kegg2.html) for these two genes using GSEA enrichment analysis in order to analyze the role of the differentially expressed genes. The “limma”, “org.Hs.eg.db”, “clusterProfiler” and “enrichplot” packages were used for enrichment analysis of these two genes. Subsequently, to obtain a more accurate PPI network map, we increased the cut off value to 3.2 for another screening, and imported the screened genes into the STRING database (https://cn.string-db.org/) to obtain the relationships among all genes. We then identified genes in the PPI network that were associated with the diagnostic genes from the screen and constructed radar maps using the “fmsb” package to visualize their correlation.

Intersection of all screening results and construction of receiver operating characteristic (ROC) diagnostic curves

In this study, we took the intersection of all the screening results to obtain the most accurate and precise genes for the diagnostic model of OS. We used the “VennDiagram” package to take the intersection of the genes screened by the five different methods and visualize it as a Venn diagram. Two genes, Pyruvate Dehydrogenase E1 Subunit Alpha 1 (PDHA1) and Cyclin Dependent Kinase Inhibitor 2A (CDKN2A), were identified in the intersection as genes for the diagnostic model. We used ROC diagnostic curves to examine the diagnostic efficacy of these two genes for OS, constructing ROC curves for the training set and the validation set in turn.
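The area under an ROC curve has a direct interpretation: it is the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The sketch below computes it that way in plain Python. It is not the authors' R code, the function name `roc_auc` is introduced here, and the expression values are hypothetical.

```python
from typing import Sequence


def roc_auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs where the case scores higher.

    Ties contribute 0.5. This equals the area under the empirical ROC curve.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical expression values for illustration only.
cases = [3.0, 4.0, 5.0]
controls = [1.0, 2.0, 3.0]
print(roc_auc(cases, controls))
```

For a marker that is lower in cases than in controls, this value falls below 0.5; the marker is then equally informative with the direction reversed, and the reported AUC is one minus this value.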

Analysis of tumor immune cell infiltration and correlation between genes and immune cells

In this study, we performed quantitative immune cell analysis of OS samples and normal control samples from the training set using CIBERSORT software²⁰ in order to explore in depth the relationship between cuproptosis genes and tumor immune cell infiltration. Tumor immune cell infiltration refers to the migration of immune cells from the blood into the tumor tissue, where they exert their effects; infiltrating immune cells can be isolated from the tumor tissue. We used the “e1071”, “parallel” and “preprocessCore” packages to analyze immune cell infiltration in the OS and control groups. Subsequently, we used the “limma”, “reshape2”, “ggpubr” and “ggExtra” packages to correlate the 2 genes used to construct the diagnostic model with immune cells.

Big data blood test immune cell composition

In this study, we collected samples from a healthy control group and from a group diagnosed with osteosarcoma at the First Affiliated Hospital of Guangxi Medical University from January 2012 to January 2022, in order to test the accuracy of the immune cell results obtained from the CIBERSORT analysis. We recorded absolute neutrophil values, neutrophil percentages, absolute lymphocyte values and lymphocyte percentages in a total of 25,665 routine blood test records, comprising 1727 OS cases and 23,938 healthy control cases. These four measures were compared between the OS group and the healthy control group using a two independent samples t test and visualized as box plots.
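A two independent samples t test compares the difference in group means with its standard error. The sketch below computes the test statistic and degrees of freedom. The text does not say whether equal variances were assumed, so the unequal-variance (Welch) form is used here as an assumption. The function name `welch_t` and the sample values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(sample_a: Sequence[float], sample_b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical measurements for illustration only.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, df = welch_t(group_a, group_b)
print(t_stat, df)
```

The p value is then read from a t distribution with `df` degrees of freedom. With group sizes as unequal as 1727 and 23,938, the choice between the pooled and Welch forms can affect the result when the group variances differ.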

Immunohistochemical specific staining analysis

In the present study, we used immunohistochemistry to examine differences in the expression of the two genes used to construct the diagnostic model between OS tissue and paracancerous tissue. The immunohistochemical analysis of tissue samples from anonymous patients was reviewed and approved by the ethics department of the First Affiliated Hospital of Guangxi Medical University, and a waiver of patient informed consent was requested. Informed consent was waived by the review committee of the First Clinical Affiliated Hospital of Guangxi Medical University. The pathological tissue sections used for immunohistochemistry were obtained from OS tissue samples and paracancerous tissue samples excised during surgery at the First Clinical Affiliated Hospital of Guangxi Medical University. Antibodies for immunohistochemical staining were purchased from Proteintech (PDHA1, https://www.ptgcn.com/products/PDHA1-Antibody-18068-1-AP.htm, item no. 18068-1-AP) and Bioss (CDKN2A, http://www.bioss.com.cn/prolook_03.asp?id=AF08169606000664&pro37=1, item no. bs-0740R). We first dewaxed the pathological sections by immersing them in xylene I for 5 min; then 3 times in xylene for 5 min each; then in 95% ethanol for 5 min; in 80% ethanol for 5 min; in 75% ethanol for 5 min; and then rinsing them in running water for 2 min. The sections were then subjected to steps such as blocking, antigen retrieval, and blocking of endogenous peroxidase. The stained sections were observed under an inverted microscope and images were collected. We found that PDHA1 expression was higher in paracancerous tissues than in osteosarcoma, and that the expression of CDKN2A was higher in osteosarcoma than in the control group. This is consistent with the results of our analysis.

Osteosarcoma cell line culture and real-time quantitative-PCR (RT-qPCR)

In this study, all cells used for experimental studies were sourced from the ATCC cell bank (ATCC: The Global Bioresource Center). The purchased cells were first resuscitated, and all cell lines were then passaged. Subsequently, PCR primers were designed and synthesized, total RNA was extracted from the experimental and control cells and reverse transcribed, and real-time fluorescence quantitative PCR was performed. The raw Ct values obtained by RT-qPCR were substituted into the relative quantification formula (2^−∆∆Ct) for data analysis.
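The relative quantification formula 2^−∆∆Ct can be written out step by step. The sketch below assumes the usual form of the method, in which the target gene Ct is first normalized to a reference (housekeeping) gene in each sample. The text does not name the reference gene, so the parameter names and Ct values here are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_reference_sample: float,
                        ct_target_control: float, ct_reference_control: float) -> float:
    """Fold change of the target gene in the sample relative to the control.

    delta Ct       = Ct(target) - Ct(reference), computed per group
    delta delta Ct = delta Ct(sample) - delta Ct(control)
    fold change    = 2 ** (-delta delta Ct)
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values for illustration only.
fold_change = relative_expression(
    ct_target_sample=24.0, ct_reference_sample=18.0,    # delta Ct = 6
    ct_target_control=26.0, ct_reference_control=18.0,  # delta Ct = 8
)
print(fold_change)
```

A lower Ct means more template, so a sample whose normalized Ct is two cycles lower than the control corresponds to a fourfold higher relative expression. The formula assumes amplification efficiency close to 100% for both genes.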

Ethical disclosure

This study was approved by the Ethics Review Committee of the First Affiliated Hospital of Guangxi Medical University and was in accordance with the provisions of the Declaration of Helsinki of the World Medical Association.

Results

Results of data download and preliminary processing

A total of 88 OS cases were downloaded from the UCSC Xena database and 396 normal skeletal muscle samples were downloaded from the GTEx database. The data from these two databases were normalized and corrected for inter-batch differences before being merged into a single dataset as the training set. The GSE42352 dataset downloaded from the GEO database contained a total of 118 samples, including 84 osteosarcoma samples; 3 osteoblast samples were selected as their control group, and the data set of these 87 samples was used as the validation set. We used Perl scripts to convert the probe IDs of the training and validation sets into gene symbols.

Results of differential expression analysis

In this study, with the cut off values set to |logFC| ≥ 1.5 and adjusted-p < 0.05, we obtained a total of 4811 differentially expressed genes from the 54,751 genes in the whole genome. We visualized the top 100 differentially expressed genes as a heat map (Fig. 1A) and the differentially expressed genes as a volcano map (Fig. 1B). We found that a portion of genes were highly expressed in the OS group and a portion of genes were highly expressed in the control group. Subsequently, we extracted the expression of cuproptosis genes and analyzed the expression relationship between each pair of genes, visualized as a correlation heat map (Fig. 1C). In the correlation heat map, a red line between two genes indicates synergistic high expression and a green line indicates synergistic low expression.

Results of artificial intelligence randomForest analysis, SVM-RFE analysis and LASSO regression analysis

In this study, we profiled OS using an artificial intelligence approach in order to obtain more precise diagnostic biomarkers associated with cuproptosis. In the randomForest analysis results (Fig. 2A,B), PDHA1 and CDKN2A ranked in the top 2 positions among the screened genes, i.e. these two genes give the best efficacy for the diagnosis of OS. In the SVM-RFE analysis (Fig. 2C), the best efficacy was obtained when the number of variables was 4, yielding the genes PDHA1, PDHB, CDKN2A and DLD. Subsequently, the genes of the diagnostic model were refined using LASSO regression analysis, and Fig. 2D shows that the best diagnostic efficacy was obtained when the number of variables was 6. Finally, we took the intersection of all differential genes, differentially expressed genes associated with cuproptosis, genes obtained by randomForest analysis, genes obtained by SVM-RFE analysis and genes obtained by LASSO regression analysis, and obtained PDHA1 and CDKN2A as the best diagnostic genes associated with cuproptosis in OS (Fig. 2E).
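The intersection step can be sketched with plain sets. Only two of the five gene lists are stated in full in the text (the 10 cuproptosis genes and the 4 SVM-RFE genes), so the example below intersects just those two; the helper name `intersect_all` is introduced here for illustration, and the remaining lists would be passed as further arguments in the same way.

```python
from typing import Iterable, Set


def intersect_all(*gene_sets: Iterable[str]) -> Set[str]:
    """Return the genes present in every one of the supplied gene lists."""
    return set.intersection(*(set(genes) for genes in gene_sets))


# The 10 cuproptosis-related genes named in the Introduction.
CUPROPTOSIS_GENES = ["CDKN2A", "FDX1", "DLD", "DLAT", "LIAS",
                     "GLS", "LIPT1", "MTF1", "PDHA1", "PDHB"]
# The 4 genes reported for the SVM-RFE screen.
SVM_RFE_GENES = ["PDHA1", "PDHB", "CDKN2A", "DLD"]

common = intersect_all(CUPROPTOSIS_GENES, SVM_RFE_GENES)
print(sorted(common))
```

Each additional screen can only shrink the result, which is how five screens narrow the candidates down to the two genes reported.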

Results of differentially expressed gene enrichment analysis and construction of PPI

We used the GSEA enrichment analysis method to analyze the 2 genes that construct the OS cuproptosis-related diagnostic model in order to obtain the GO entries and KEGG pathways in which the differentially expressed genes are located. The GO entries of PDHA1 (Fig. 3A) were mainly enriched in chromatin assembly or disassembly, mitochondrial gene expression and mitochondrial translation. The KEGG pathways of PDHA1 (Fig. 3B) were mainly enriched in the ERBB signaling pathway, pyrimidine metabolism, ribosome and spliceosome. The GO entries of CDKN2A were mainly enriched in phagocytosis, regulation of translation initiation, translation initiation, plasma membrane signaling receptor complex and T cell receptor complex (Fig. 3C). The KEGG pathways of CDKN2A (Fig. 3D) were mainly enriched in natural killer cell-mediated cytotoxicity, the NOD-like receptor signaling pathway and the T cell receptor signaling pathway. We constructed PPI networks for these two genes and found that PDHA1 and CDKN2A are each closely linked to multiple genes (Fig. 3E,F).

Results of ROC diagnostic curve

In this study, we constructed ROC diagnostic curves in order to test the diagnostic efficacy of the constructed OS cuproptosis-related diagnostic model. In the training set (Fig. 4A), the area under the ROC curve of PDHA1 for diagnosing OS reached 100%, while the diagnostic efficacy of PDHA1 in the validation set reached 88.5% (Fig. 4C). CDKN2A diagnosed OS with an area under the curve of 93.9% in the training set (Fig. 4B), while the diagnostic efficacy of CDKN2A in the validation set reached 68.7% (Fig. 4D). This result illustrates that the OS diagnostic model we constructed is accurate.

Results of the analysis of tumor immune cell infiltration and correlation between genes and immune cells

In this study, we performed immune cell infiltration and immune cell differential analysis using CIBERSORT software in order to analyze the relationship between OS, a malignant tumor, and immune cell infiltration. The immune cell composition graph (Fig. 5A) shows that the immune cell composition of the experimental group (Treat) differs in general from that of the control group (Con). The immune cell correlation analysis shows the correlation between each pair of immune cells, with red squares indicating a trend of synergistic high expression between the two and blue squares indicating a trend of synergistic low expression between the two (Fig. 5B). Most importantly, the differential analysis of immune cell infiltration showed that multiple immune cell types differed between OS and normal controls, such as Neutrophils, Eosinophils, B cells naïve, T cells CD4 memory resting, T cells CD8 and T cells CD4 naïve (Fig. 5C).

Results of gene and immune cell correlation analysis

In the present study, we performed a correlation analysis in order to analyze the relationship between immune cells and the two genes that construct the OS diagnostic model. We found (Fig. 6A–F) that the gene expression of PDHA1 showed a significant positive correlation with Neutrophils, Eosinophils, B cells naïve, T cells CD8 and T cells CD4 naïve, i.e. the higher the expression of PDHA1, the higher the level of these immune cells. The gene expression of CDKN2A showed a significant positive correlation with T cells CD4 memory resting and a significant negative correlation with Neutrophils, Eosinophils, B cells naïve, T cells CD8 and T cells CD4 naïve (Fig. 7A–F). This result provides a new reference for immunotherapy of OS as a malignant tumor.

Big data to test the results of immune cell differential analysis

In this study, we analyzed absolute neutrophil values, neutrophil percentages, absolute lymphocyte values, and lymphocyte percentages from routine blood data of 25,665 cases (Fig. 8). Our results showed that all four measures were significantly different between OS and healthy controls. The percentage of neutrophils was significantly higher in the OS group than in the healthy control group. In addition, the absolute value of lymphocytes and the percentage of lymphocytes were significantly lower in osteosarcoma than in the healthy control group. This further illustrates the accuracy of our immune cell infiltration and immune cell differential analysis.

Results of immunohistochemistry

We performed immunohistochemical specific staining analysis of PDHA1 and CDKN2A, the two genes used to construct the diagnostic model (Fig. 9A1–D2). The results showed that the expression of PDHA1 was significantly higher in the paracancerous tissues than in the OS group, whereas the expression of CDKN2A was significantly higher in osteosarcoma than in paracancerous tissues. This was consistent with the results of our bioinformatics analysis, in which CDKN2A was also highly expressed in OS while PDHA1 was highly expressed in paracancerous tissues. This further supports the reliability of our analysis.

Osteosarcoma cell line culture and results of real-time quantitative-PCR (RT-qPCR)

We measured the expression of the two genes used to construct the diagnostic model in OS cells and control cells after performing laboratory steps such as resuscitation, passaging, cellular RNA extraction, primer design, and real-time quantitative PCR on cells purchased from the ATCC cell bank. The results showed that the expression of PDHA1 in control cells hFOB1.19 was significantly lower than that in OS cell lines HOS and MG63, which is consistent with our analysis (Fig. 10). This further illustrates the accuracy of our analysis.

Discussion

In this study, we obtained differentially expressed genes by comparing genome-wide expression between OS and controls, and we performed functional enrichment analysis of these genes. We found that the KEGG pathways of PDHA1 were mainly enriched in the ERBB signaling pathway, pyrimidine metabolism, ribosome and spliceosome. The GO entries of CDKN2A were mainly enriched in phagocytosis, translation initiation and T-cell receptor complex entries. It has been shown that the metabolic reprogramming of beloved signaling promotes cancer development and progression, and notably, one of the causes of cell death is the depletion of pyrimidines²¹. More importantly, macrophages are important mediators of tissue homeostasis, and tumors can distort this propensity to stimulate cell proliferation, metastasis and angiogenesis, in which their phagocytosis also plays an important role²². This is consistent with our findings. Our study also showed that the two genes, PDHA1 and CDKN2A, were enriched in the corresponding pathways.

Pyruvate Dehydrogenase E1 Subunit Alpha 1, abbreviated as PDHA1, is a protein-coding gene, and diseases associated with PDHA1 include pyruvate dehydrogenase E1-alpha deficiency and sudden infant death syndrome²³. Previous studies have shown that PDHA1 sustains prostate cancer development in human xenograft tumor models by affecting lipid synthesis²⁴. In addition, it has also been shown that lysine acetylation of PDHA1 and PDP1 is very common in both epidermal growth factor (EGF)-stimulated cells and various human cancer cells, and that acetylation of K202 inhibits PDP1 by dissociating its substrate PDHA1, both of which play a major role in promoting glycolysis and tumor development in cancer cells²⁵. Gonçalves et al. showed that the enzymatic activity of PDHA1 is inhibited by phosphorylation in cells with FH defects, which in turn limits the entry of carbon atoms from glucose into the tricarboxylic acid cycle, and that phosphorylation of PDHA1 is present in tumor cells with FH defects²⁶. This is consistent with our findings. Here, by exploring novel diagnostic genes for OS by means of artificial intelligence, we found that two cuproptosis-associated genes, one of which is PDHA1, could be used to diagnose OS with high efficiency and accuracy. On the other hand, it has also been shown that reducing pyruvate activity by depleting mitochondrial pyruvate carrier 2 (MPC2) or PDHA1 stimulates and enhances NLRP3 inflammasome activation²⁷. This is similar to the results of our study. We analyzed immune cell infiltration from gene expression in OS and controls using CIBERSORT software and found significant differences in neutrophils and lymphocytes between these two groups, and we tested this with routine blood data from 25,665 cases. Our study provides a new reference for immunotherapy of OS as a malignancy.

Cyclin Dependent Kinase Inhibitor 2A (CDKN2A) is a protein-coding gene, and diseases associated with CDKN2A include melanoma, cutaneous malignancies 2 and melanoma-pancreatic cancer syndrome²⁸. As early as 2016, mutation of the CDKN2A gene was reported to be an important factor in pancreatic tumorigenesis²⁹. A study by Hayward et al. also showed that CDKN2A is significantly mutated in cutaneous melanoma³⁰. In gliomas, hypermutation and acquired CDKN2A deletion are closely associated with an increase in tumor cells at the time of recurrence, and these changes reflect the active growth state of the tumor cells³¹. This is similar to the results of our study. In the present study, CDKN2A showed significant differential expression between OS and controls, and CDKN2A, one of the two cuproptosis-related genes identified, can serve as a diagnostic marker for OS as a malignancy and can guide early diagnosis. On the other hand, the control of cancer by adaptive immunity involves some well-defined clearance and death mechanisms: tumor necrosis factor in combination with interferon-gamma (IFN-γ) drives cancer into senescence by inducing permanent growth arrest in the G1/G0 phase and activation of CDKN2A, as shown in cancers driven by the simian virus 40 large T antigen (Tag) expressed under the control of the rat insulin promoter³². This is similar to our findings. Our findings suggest that there is a significant dysregulation of neutrophils and lymphocytes in OS and that this may be a key factor in OS.

Here, we screened cuproptosis-related diagnostic genes by three artificial intelligence methods in an attempt to find early biomarkers of OS as a malignant disease and mechanisms related to immune cell infiltration, providing a new basis for early diagnosis and immunotherapy. PDHA1 and CDKN2A were identified as early diagnostic genes for OS, and both had high diagnostic efficacy. The differential expression of these two genes in OS and control groups was examined by two methods, immunohistochemistry and PCR. The differences in immune cell infiltration were also examined using routine blood data from 25,665 cases.

Of course, our study, like all other studies, has its limitations. First, the sample size was inadequate: although we used 88 OS samples from the UCSC Xena database and 396 normal controls from GTEx as the training set and 87 samples from the GEO database as the validation set, this is still small compared with large-sample studies. Second, the laboratory validation was insufficient: we only used immunohistochemistry, PCR and large-scale clinical data for validation.

Conclusion

PDHA1, CDKN2A, and neutrophils were significantly different between OS and normal controls, and PDHA1 and CDKN2A can serve as specific biomarkers for OS.

Data availability

The datasets used and/or analyzed in the current study and supporting the conclusions of this article are publicly available in the UCSC Xena database (http://xena.ucsc.edu/), the GTEx database (https://www.gtexportal.org/home/) and GEO datasets (https://www.ncbi.nlm.nih.gov/gds/).

Acknowledgements

Special thanks to Prof. Xinli Zhan and Dr. Chong Liu from the First Affiliated Hospital of Guangxi Medical University for their kind support to this study.

Funding

This study was supported by the self-funded scientific research project of the Guangxi Zhuang Autonomous Region Health Department, Contract No. Z2016632.

Author information

Authors and Affiliations

Orthopedics, Guangxi Academy of Medical Sciences, Guangxi Zhuang Autonomous Region People’s Hospital, Nanning, 530016, People’s Republic of China
Jie Jiang, Jianxun Wei, Qie Fan, Haowen Li, Hao Li, Shuzhen Li, Yong Zhao, Guodong Yin, Lin Tang, Yongxiang Wu & Mindong Lan
Spinal Orthopedic Surgery, The First Affiliated Hospital of Guangxi Medical University, Nanning, 530021, People’s Republic of China
Xinli Zhan
Department of Traditional Chinese Medicine, Guangxi Academy of Medical Sciences, Guangxi Zhuang Autonomous Region People’s Hospital, Nanning, 530016, People’s Republic of China
Yijue Qin, Quan Guo, Weicheng Xu, Ling Lu, Yanwei Yang, Yitian Zhang & Haishun Qu

Contributions

J.J., H.Q. and X.Z. designed the study. J.W., H.L., H.L., S.L., Y.Y., L.L., Y.Z. and G.Y. analyzed the data. L.T., M.L., Y.Z., Q.G., Y.Q., W.X. and Y.W. performed the digital visualization. X.Z. and J.J. collected the routine blood data. J.J. wrote and revised the manuscript. H.Q. and X.Z. revised the manuscript. All co-authors participated in the laboratory operation. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Haishun Qu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Jiang, J., Zhan, X., Wei, J. et al. Artificial intelligence reveals dysregulation of osteosarcoma and cuproptosis-related biomarkers, PDHA1, CDKN2A and neutrophils. Sci Rep 13, 4927 (2023). https://doi.org/10.1038/s41598-023-32195-2

Received: 24 July 2022
Accepted: 23 March 2023
Published: 26 March 2023
DOI: https://doi.org/10.1038/s41598-023-32195-2
