Genomic and transcriptomic characterization of pre-operative chemotherapy response in patients with osteosarcoma

Osteosarcoma is a heterogeneous disease with regard to its chemotherapy response and clinical outcomes. This study aims to investigate the genomic and transcriptomic characteristics related to pre-operative chemotherapy response. Samples from 25 osteosarcoma patients were collected to perform both whole exome and transcriptome sequencing. Osteosarcoma had significant amount of chromosomal copy number variants (CNVs). Chemotherapy responders showed the higher chromosomal CNV burden than non-responders (p = 0.0775), but the difference was not significant. The percentage of COSMIC signature 3, associated with homologous recombination repair deficiency, was higher in responders (56%) than in non-responders (45%). Transcriptomic analysis suggested that 11 genes were significantly up-regulated in responders and 18 genes were up-regulated in non-responders. Both GSEA and KEGG enrichment analysis indicted that four pathways related to cardiomyopathy were up-regulated in responders, while neuroactive ligand − receptor interaction was up-regulated in non-responders. Finally, a previously published chemoresistant model was validated using our dataset, with the area under the curve of 0.796 (95% CI, 0.583–1.000). Osteosarcoma had the heterogeneous mutational profile with frequent occurrence of CNVs. Transcriptomic analysis identified several signaling pathways associated with chemotherapy responsiveness to osteosarcoma. Transcriptomic signatures provides a potential research direction for predicting the chemotherapy response.


Introduction
Osteosarcoma is the most common primary malignant tumor of bone.Although it is a rare disease in all ages, primary osteosarcoma commonly occurs in children and adolescents aged 10-24 years.According to the statistical analyses of 5 016 osteosarcoma patients from the Surveillance, Epidemiology, and End Results program from 1975 to 2017 1 , the age-adjusted incidence of osteosarcoma was 3.3 / 1 000 000 in all age groups and 7.2 / 1 000 000 in population aged 10-24 years.Pre-operative (neoadjuvant) chemotherapy followed by de nite surgery and postoperative (adjuvant) chemotherapy is the mainstay of therapy for osteosarcoma, signi cantly improving the survival and prognosis in approximately twothirds of patients with localized disease 2 .
Despite the remarkable e cacy of chemotherapy, there is a lack of uniform and objective criteria for evaluating the chemotherapy response in osteosarcoma patients.How to evaluate the effectiveness of chemotherapy, especially pre-operative chemotherapy, is crucial for the adjustment of treatment plans and patient prognosis assessment.Currently, the commonly applied evaluation method to determine the chemotherapy response and tumor prognosis is based on the tumor cell necrosis rate 3,4 .However, the evaluation and application of tumor cell necrosis rate have certain limitations: the procedure of tumor necrosis rate analysis is complex; heterogeneous tumors are di cult to carry out; the evaluation time required is long, resulting in a late outcome which is not conducive to subsequent treatment.Huvos 5 rst proposed a histological evaluation method for osteosarcoma necrosis rate in the 1970s, which has been used for more than 40 years.However, this method cannot be applied to evaluate the chemotherapy response before surgery, and it does not comprehensively consider clinical and imaging aspects.Therefore, it is necessary to understand the molecular characteristics related to chemotherapy response and nd a new effective evaluation method for chemotherapy response in osteosarcoma.
The rapid development and widespread application of high throughput sequencing techniques promote the understanding of tumor development and progression from the molecular level, and thus change the clinical treatment modes and survival outcomes of patients in various of cancers, especially in non-small cell lung cancer and colorectal cancer 6,7 .In addition to supporting treatment decisions for cancer patients, next-generation sequencing has also been utilized in multiple clinical settings, including detecting tumor susceptibility genes, early diagnosis and screening, evaluating patient prognosis and monitoring minimal residual diseases and tumor progression [8][9][10][11] .This study aims to investigate the genomic and transcriptomic characteristics of osteosarcoma using DNA-and RNA-based next-generation sequencing techniques, nd clinical and molecular factors related to chemotherapy response and try to construct a machine learning-based classi er to predict chemotherapy response.

Clinicopathologic characteristics and evaluation of chemotherapy response
The clinicopathologic features of 25 patients with pathologic diagnoses of osteosarcoma were summarized in Supplementary Table S2.The median age at diagnosis was 15 years (range 8-32 years) and 75% patients were male.The most common site of osteosarcoma in this study was femur (60%), followed by tibia (20%), humerus (12%), bula (4%) and multifocal (4%).Majority of the patients were conventional osteosarcoma (88%) and SSS stage IIB (92%).The tumor volume ranged from 24.633 to 14705.856cm 3 with the median of 371.57cm 3 .Pre-operative chemotherapy and surgery were conducted in 84% and 92% of patients, respectively.Pathological evaluation suggested that the tumor necrosis rate was higher than 90% in 36% of patients.Increased ossi cation, clear margin, decreased blood supply and decreased circumference after chemotherapy were observed in 52%, 36%, 40% and 36%, respectively.
To evaluate the chemotherapy response, we designed an integrated rating scale based on the pathological and imaging ndings (Supplementary Table S3).A patient was considered as a chemotherapy responder when the sum of scores for all indicators was greater than the upper quartile.As a result, among 21 patients who received pre-operative chemotherapy, 9 patients were classi ed as chemotherapy responders and 12 patients were non-responders.Tumor necrosis rate (p < 0.0001), change of margin (p = 0.0158) and circumference (p = 0.0071) were signi cantly related to the results of integrated evaluation (Table 1).Two patients without tumor necrosis rate information were evaluated as one responder and one non-responder according to the results of other indicators.

Relationship between genomic features and chemotherapy response
To explore the relationship between genomic characteristics and chemotherapy response, sher's exact test was used to nd the differentially mutated genes between responders and non-responders with signi cance, and the result was negative, which may be explained by the low mutation rate in osteosarcoma (data not shown).Additionally, no difference in TMB between two groups was observed (median TMB for responders vs. non-responders: 0.5111 vs. 0.3778 muts/Mb, p = 0.3013) (Fig. 2A).Notably, the chromosomal CNV burden of chemotherapy responders appeared to be higher than that of non-responders with the median of 7 and 3.5, respectively, but the difference was not statistically signi cant (p = 0.0775) (Fig. 2B).The composition of COSMIC mutational signatures was also compared between chemotherapy responders and non-responders (Fig. 2C).Mutational signatures 1, 3, 6, 17, 20, 22 and 28 were present in both groups.Apparently, the percentage of signature 3, which was associated with failure of DNA double-strand break-repair by homologous recombination, was higher in chemotherapy responders (56%) than in non-responders (45%), which suggested that signature 3 may be related to chemotherapy response.By contrast, the percentage of two signatures of unknown etiology (signature 17 and 28) was higher in non-responders (Non-responders vs. Responders: 12% vs. 4% for signature 17; 6% vs. 1% for signature 28).

Differentially expressed genes (DEGs) and pathway enrichment analysis
The raw expression counts for all genes were executed for the differentially expressed gene analysis between chemotherapy responders and non-responders.The thresholds for adjusted p (padj) and |log2FC| were set as < 0.05 and ≥ 1, respectively.As a result, the volcano plot showed that 11 genes were signi cantly up-regulated in responders and 18 genes were signi cantly up-regulated in non-responders with the adjusted p value smaller than 0.05 (Fig. 3A).GSEA enrichment analysis was performed to identify the up-regulated pathways in chemotherapy responders and non-responders.Four pathways related to cardiomyopathy were up-regulated in responders (Fig. 3B), while neuroactive ligand − receptor interaction and olfactory transduction pathways were up-regulated in non-responders (Fig. 3C).
Due to the relative small number of DEGs, genes with p < 0.05 and |log2FC| ≥ 1 were included to perform conventional KEGG pathway enrichment analysis and PPI analysis.The signi cant KEGG pathways (adjusted p < 0.05) of 309 signi cantly expressed genes in responders included calcium signaling pathway, motor proteins, cGMP − PKG signaling pathway, circadian entrainment, arrhythmogenic right ventricular cardiomyopathy, as well as 4 pathways identi ed by GSEA enrichment analysis (Fig. 4A).In non-responders, neuroactive ligand-receptor interaction pathway was the only signi cant KEGG pathway with the adjusted p value smaller than 0.05 (Fig. 4B).Genes involved in these pathways were summarized in Supplementary Table S4.The PPI analysis was performed using STRING database and visualized by Cytoscape (Supplementary Fig. 1).A total of 709 nodes and 1888 edges were included in the network.ACTN2, TTN and MYH6 were considered as potential hub genes with the node degree score of 45, 43 and 39, respectively.ACTN2, TTN and MYH6 were overexpressed in chemotherapy responders with the |log2FC| of 1.8487, 3.6915, 2.1881 and p value of 0.0103, 0.0001, 0.0200, respectively (data not shown).In addition, all of them were involved in pathways enriched in chemotherapy responders (Supplementary Table S4).

Development of random forest classi er to predict chemotherapy response
To discriminate chemotherapy response, we developed a random forest-based classi er using 23 DEGs.
The importance of genes in the classi er was illustrated in Supplementary Fig. 2A, which showed that UST, CTNNA2 and DAPL1 was the top 3 important genes.Among 21 patients with prior chemotherapy treatment and RNA sequencing, the classi er classi ed 12 non-responders into 11 non-responders and 1 responder, and classi ed 9 responders into 6 responders and 3 non-responder, with the out-of-bag estimate of error rate of 19.05%.The classi er had the area under the curve of 0.843 (95% con dence interval, 0.654-1.000)to classify chemotherapy response (Supplementary Fig. 2B).

Discussion
Osteosarcoma is a heterogeneous disease with regard to its histology, chemotherapy response and clinical outcomes 12,13 .In this study, whole exome and transcriptome sequencing were conducted in 25 patients with osteosarcoma.A clinicopathologic integrated rating scale was developed to evaluate the chemotherapy response.In addition, clinical, genomic and transcriptomic features relevant to chemotherapy response were explored and a chemotherapy response classi er was constructed.To our knowledge, this is the rst study to comprehensively investigate the clinical and molecular characteristics of chemotherapy response in osteosarcoma using multi-omics techniques, which expands our knowledge of this complex disease.
In multiple previous studies [14][15][16] , osteosarcoma was mainly characterized by diverse variants, recurrent structural variants, high frequencies of TP53 and RB1 mutations.Similarly, our study revealed that the genetic variants in osteosarcoma were dispersedly distributed.In terms of mutation type, single nucleotide variants and small insertions and deletions were atypical events, whereas CNVs, especially ampli cation, were common in osteosarcoma.PHOX2B mutations were rstly identi ed in osteosarcoma in our study, which encodes a transcription factor participating in the development of the peripheral nervous system and is known to be related to neuroblastoma and congenital central hypoventilation syndrome 17 .Our study also found that osteosarcoma had the extremely low level of TMB and relatively high level of chromosomal CNV burden.In addition, it seemed that patients with high chromosomal CNV burden had a nonsigni cantly better response to chemotherapy, which required to be veri ed in further studies.Consistent with the previous study 15 , mutations in osteosarcoma were mainly caused by the dysfunctional DNA damage repair in this study.Moreover, this study suggested that chemotherapy responders had more homologous recombination repair de ciency -induced mutations, which may be explained by the administration of platinum-based chemotherapy regimens in osteosarcoma and the favorable response to platinum in defective homologous recombination repair ovarian cancer 18 .
In this study, 11 genes were signi cantly up-regulated in responders and 18 genes were signi cantly upregulated in non-responders.Both KEGG and GSEA enrichment analyses suggested that 4 pathways relative to cardiomyopathy were signi cantly enriched in responders and neuroactive ligand-receptor interaction pathway was signi cant in non-responders.The recommended neoadjuvant chemotherapy regimen for osteosarcoma included high-dose methotrexate, doxorubicin, cisplatin and ifoshamide.Doxorubicin can induce clinically symptomatic cardiac toxicity in 1.5%-1.7% of patients with osteosarcoma and the incidence of subclinical cardiac abnormalities may be higher 19 .The enrichment of cardiomyopathy-related pathways in chemotherapy responders may be related to the treatment of doxorubicin, which reminds clinicians of the occurrence of cardiac toxicity in chemotherapy responders.
In a previous study, transcriptomic analysis was performed in breast cancer to investigate biomarkers related to chemotherapy sensitivity.KEGG analysis indicated that a great number of DEGs in neuroactive ligand-receptor interaction and the number of DEGs in neuroactive ligand-receptor interaction pathway subexpressed in chemotherapy resistant group was more than overexpressed genes.However, whether the entire pathway was up-regulated or down-regulated was not analyzed 20 .In another study in breast cancer, the relationship between neuroactive ligand-receptor interaction pathway and pathological complete response to chemotherapy was not clearly elucidated 21 .One study used the paclitaxel-induced peripheral neuropathy rat model to understand the transcriptomic level of the dorsal root ganglia neurons and found that neuroactive ligand-receptor interaction was majorly involved in sensory neurons of rats with paclitaxel-induced peripheral neuropathy 22 .Therefore, the enrichment of neuroactive ligand-receptor interaction pathway in chemotherapy non-responder group in our study might be related to chemotherapy-induced peripheral neuropathy.PPI analysis identi ed ACTN2, TTN and MYH6 as hub genes resulting in chemotherapy responsiveness.ACTN2 encoded protein links the anti-parallel actin laments, contributes to sarcomere stability and is related to cardiomyopathy 23 .TTN encoded protein plays an important role in skeletal and in heart muscles 24 .MYH6 gene provides instructions for making a protein known as the cardiac alpha (α)-myosin heavy chain, which forms part of type II myosin in cardiac muscles 25 .All of them were overexpressed in chemotherapy responders and involved in pathways enriched in chemotherapy responders, which con rmed the pathway enrichment ndings.
In clinics, multiple methods are used to evaluate the neoadjuvant chemotherapy response in osteosarcoma, including symptoms and signs, laboratory tests, imaging examinations and evaluation of tumor necrosis rate.Our study suggested that changes of tumor margin and circumference after chemotherapy were more related to chemotherapy response.Zeng et.al.developed a chemoresistant risk model using markers ontained from bulk RNA and single-cell RNA sequencing with the area under the curve of 0.82 in the TARGET-OS training cohort and 0.84 in the GSE33382 validation cohort 26 .Our study used the limited internal data to develop a random forest-based classi er, and performance of the model was comparative to the previous study.Due to the inaccessible clinical information in TARGET-OS dataset and limited overlapping genes between GSE33382 dataset and our internal dataset, no suitable external dataset could be used to validate our classi er.A potential limitation of this study was the small amount number of participants, which may reduce the representativeness of certain ndings and need to be further veri ed in large-scale studies.
In conclusion, multi-omics sequencing techniques help us better understand the molecular characteristics of osteosarcoma.Osteosarcoma is a highly heterogeneous disease with frequent occurrence of CNVs at the genomic level.Transcriptomic analysis identi ed several signaling pathways associated with chemotherapy responsiveness to osteosarcoma, including pathways related to cardiomyopathy and neuroactive ligand-receptor interaction.Additionally, the performance of chemotherapy response classi er was acceptable, which requires to be validated using more data before utilizing in clinics.

Materials and Methods
Patients and samples.A total of 25 patients with the de nite diagnosis of osteosarcoma in Beijing Jishuitan Hospital (Beijing, China) from December 2021 to November 2022 were prospectively enrolled in this study.Clinical information including demographics, pathologic diagnoses, treatment history and imaging examinations were collected.Tumor tissue samples (including 6 fresh tissues, 18 formalin immersed tissues and 1 formalin xed para n-embedded tissue) and 2 ml matched peripheral blood were collected from all participants to perform the whole exome and transcriptome sequencing.All procedures were conducted in accordance with the Declaration of Helsinki.This study was approved by the Ethics Committee of Beijing Jishuitan Hospital (Approval No. K2023013-00) and written informed consent was obtained from all participants.
Pathologic and imaging evaluation of chemotherapy response.The cell necrosis rate was analyzed using the tumor tissues collected from tumor resection and separated into two groups: greater than 90% and less than 90%.The circumferences of the tumor site on the affected limb and the same site on the healthy limb were measured, and their difference was calculated before and after chemotherapy.Enhanced CT examinations on the tumor site were performed before and after chemotherapy to evaluate the degree of ossi cation, tumor margin and blood supply of the tumor.
DNA-and RNA-based next-generation sequencing.Whole exome and transcriptome sequencing were performed in Geneplus-Beijing (Beijing, China) as previously described [27][28][29] .Brie y, genomic DNA and RNA were extracted from the tumor samples using FirePureTM FFPE gDNA Extraction Kit for genomic DNA and RNeasy FFPE Kit for RNA or AllPrep DNA/RNA FFPE Kit for both DNA and RNA (QIAGEN, Germany).Genomic DNA from leukocytes was extracted using the CWE9600 Blood DNA Kit (Cwbiotech, Taizhou, China).Sequencing libraries of genomic DNA and mRNA were prepared using KAPA DNA Library Preparation Kit (Kapa Biosystems, Wilmington, MA, USA) and NEB Next Ultra™ RNA Library Prep Kit (Illumina, Inc., San Diego, CA, USA), respectively.The DNA and RNA sequencing were performed using the DNBSEQ-T7RS High-throughput Sequencing platform (MGI, Shenzhen, China).All experimental procedures followed the manufacturer's instructions.Whole transcriptome sequencing cannot be performed for one patient due to the severe degradation of RNA.The detailed quality control data of DNA and RNA sequencing was provided in Supplementary Table S1.
Bioinformatics analysis.After removal of terminal adaptor sequences and low-quality reads, the clean sequencing reads were aligned to the reference human genome (hg19) using BWA (version 0.7.10) and HISAT (version 2.0.4) for DNA and RNA sequencing, respectively.Genomic single nucleotide variants, small insertions and deletions, copy number variants (CNVs) and structural variants were detected using MuTect (version 1.1.4)/ NChot, GATK (version 3.4-46-gbc02625) and CONTRA (version 2.0.8),respectively.Transcript assembly was performed using StringTie (version 1.2.3).
Tumor mutation burden was evaluated as the number of non-synonymous variants with the mutant allele frequency greater than 5% per megabase in the coding region.Chromosomal CNV burden represented the total level of ampli cations or deletions at the chromosome level, which was calculated as previously described 30 .Gistic 2.0 was used to detect the signi cantly recurrent regions with ampli cation or deletion.The mutational landscape was portrayed using R package 'maftools' (version 2.14.0).

Figure 1 The
Figure 1

Figure 2 The
Figure 2

Table 1
Comparison of single indicators and integrated evaluation of chemotherapy response Statistical analysis.Z-score normalization of gene expression data was performed using the scale function in R package 'base' (version 4.2.2) taking the raw count matrix as an input.Spearman correlation analysis was conducted to study the correlation between copy number and Z-score of CNVs.Differences of variables between groups were assessed using Mann-Whitney test for continuous variables and Fisher's exact test or Chi-square test for categorical variables, with P < 0.05 considered as statistically signi cant.