Introduction

Osteosarcoma (OS) is a prevalent form of bone cancer that primarily affects children and older adults1. Long-term survival in OS patients has not improved despite multi-agent chemotherapy, surgical resection, radiation, and recent immune and targeted therapies. The NCI-funded Pediatric Preclinical Testing Program/Consortium (PPTP/PPTC), now named Pediatric in vivo Testing (PIVOT), has undertaken a concerted effort to test potential new treatment regimens in panels of patient-derived xenograft (PDX) models of various pediatric diseases, including osteosarcoma2. PDX models offer distinct advantages over cell lines and genetically engineered mouse models. The in vivo implantation of patient tumors into mice provides insights into how tumors grow within complex and dynamic microenvironments3. PDX models also show response rates to therapies similar to those of unselected patient populations, which allows for the selection of predictive biomarkers4. This approach is particularly significant in pediatric oncology, where the rarity of the disease requires careful testing to prioritize agents for early-phase clinical trials. Molecular validation of these PDX models allows for selection of those that best reflect disease manifestation for therapeutic testing3.

The genomic characterization of PDX models through next-generation sequencing plays a crucial role in assessing model fidelity, identifying captured cancer types and subtypes, and determining the recovery of druggable driver mutations. Recently, the PPTC has employed various profiling techniques, including whole exome sequencing (WES), RNA sequencing, and chromosomal microarray analysis, to gain a better understanding of the genomic alterations and gene expression patterns in the osteosarcoma PDX models2. These investigations have unveiled potential avenues for choosing therapeutic targets for this aggressive cancer, such as the TP53 classifiers that may become biomarkers of response2. However, the osteosarcoma genome exhibits substantial complexity and instability, characterized by a high prevalence of rearrangements, chromothripsis, and copy number alterations, necessitating the use of whole-genome sequencing (WGS) for a comprehensive characterization5,6. In addition, the activities of common cancer pathways have not been assessed at the protein level. Therefore, we conducted genome and protein analyses on these original 17 PDX models and on four new osteosarcoma PDX models, and we elaborated on their relatedness to patient findings.

Results

Our cohort comprised 21 PDX models, predominantly derived from children and young adult patients (Table 1). Seventeen of these correspond to the PDX models used in the PPTC studies2. The four additional models are OS21, OS39R-SJ, OS58-SJ, and OS53-SJ.

Table 1 PDX model sample identification.

OS PDX copy number landscape based on WGS

Most genetic alterations observed in OS are attributed to changes in copy number and genome rearrangements5,6,7. Accurate determination of copy number profiles is thus crucial for identifying genes and pathways that are implicated in OS genome instability. WGS provides greater genome coverage than the single nucleotide polymorphism (SNP) array, enabling more precise copy number profiles. The SNP array indicates where breakpoints occur; however, split reads within sequencing data provide evidence of the discontinuous chromosome regions that are aberrantly joined together. For the 17 PDX models that were analyzed previously by the PPTC using SNP array2, we first conducted a comparative analysis of mutations and copy number profiles obtained from WGS and PPTC SNP array data.

We observed a high correlation between the copy number profiles generated by WGS and SNP array data for all samples except two models (OS43 and OS51) (Fig. 1, Supplementary Figs. S1, S2). When we compared the mutation calls from the OS43 exome with those from the genome, there was no overlap, indicating that this may be an entirely different sample or a separate lineage (Supplementary Fig. S3). We are unable to comment on OS51 because no exome is available.

Figure 1

Pair-wise Pearson correlation of whole genome copy number profile from whole genome sequencing and SNP array data.
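As a minimal sketch of the comparison summarized in Fig. 1, the Pearson correlation between two copy number profiles can be computed as follows. This assumes both platforms have already been summarized onto the same genomic bins; the function name and the example values are hypothetical and are not taken from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)

# Hypothetical log2 copy number ratios for one model on six shared bins.
wgs_log2 = [0.0, 0.6, 0.7, -0.8, -0.9, 0.1]
snp_log2 = [0.1, 0.5, 0.8, -0.7, -1.0, 0.0]
r = pearson(wgs_log2, snp_log2)
print(f"Pearson r = {r:.3f}")
```

A low coefficient for a given model, as reported for OS43 and OS51, would flag that pair of profiles for closer inspection.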

We also conducted a deep comparative analysis of the copy number profiles of TP53 and RB1 between the WGS and SNP array data from the 17 PDX models (Fig. 2). WGS revealed breakpoints between exons 1 and 2 of TP53 in six out of the 17 samples (OS1, 17, 31, 36, 43, and 58), whereas the SNP array detected this alteration only in one sample (OS2) (Fig. 2a). While the SNP array detected a deep deletion across most of TP53 in sample OS33, WGS identified copy number gain in this same model. Similarly, we also detected divergent copy number profiles of RB1 when comparing WGS and SNP array data (Fig. 2b). WGS identified deep deletions in RB1 in seven models, whereas the SNP array only detected this alteration in two of these.

Figure 2

Copy number profiles from SNP array (top panel) and WGS data (bottom panel) for a TP53; and b RB1. Integrative Genomics Viewer panels where red indicates copy number gains while blue indicates copy number loss.

We also compared the copy number profiles of these PDX models with our published genomes from MD Anderson Cancer Center (MDACC) OS patient samples (Fig. 3, Supplementary Table S1)7. In general, the complexity of the copy number profiles in the PDX models was greater than that of the OS patient samples, likely due to the lack of matched germline samples for the PDX models. We applied GISTIC to identify recurrent focal copy number events based on the copy number profiles of both PDX samples and MDACC OS patient samples (Fig. 4). In this analysis, both PDX and patient samples had gains in chr8q and losses in chr17.

Figure 3

Copy number profiles of OS PDX models (a) and patient samples (b). Gains are displayed in red and losses displayed in blue.

Figure 4

GISTIC plots using SNP array (top panels) and WGS data (bottom panels). (a) amplification; (b) deletion.

Structural rearrangements detected in PDX models by WGS

OS is characterized by a highly complex genome instability and a high level of genetic heterogeneity5,6,7. Most genetic alterations observed in OS are associated with copy number changes and genome rearrangements. Whole-genome sequencing (WGS) offers a distinct advantage over techniques like whole exome sequencing (WES) and SNP array analysis, as it can detect structural rearrangements that would otherwise go undetected.

In our study, we employed the BReakpoint AnalySiS (BRASS) method to identify structural rearrangements from aligned WGS data. Because we did not have matched germline DNA for the PDX models, many of the detected rearrangements were highly recurrent across all PDX models and were likely false positives. TP53 rearrangements were identified in 12 models (Supplementary Table S2), all of which exhibited breakpoints between exons 1 and 2. Remarkably, using WGS we discovered TP53 rearrangements in two models (OS-34 and OS-43) that were not previously detected using WES and SNP array analysis in the PPTC studies2.

Several methods have been developed to identify gene fusions and chimeric RNAs from RNA-seq data. However, these approaches often yield a significant number of false positives, posing a challenge in identifying genuine driver fusions8,9. In the PPTC studies, a high-confidence fusion annotation pipeline utilizing four algorithms was employed, resulting in the identification of 925 unique high-confidence fusions based solely on RNA-seq data2. Among these fusions, 220 were found in the PDX models, with 9 of them involving TP532. To minimize false positives in fusion detection, we integrated the rearrangements that we identified in WGS with the 220 fusions identified in the PPTC studies2, as recommended by Zhang et al.10. Only 7 fusions were detected in both the PPTC RNA-seq data and our WGS data (Supplementary Table S3), indicating that most structural rearrangements are not expressed. Notably, none of the 9 TP53 fusions detected in the PPTC RNA-seq data were identified in our WGS data. Given that most TP53 rearrangements are associated with loss of TP53 transcription11, it is plausible that the TP53 fusions detected using RNA-seq data in the PPTC studies are false positives.
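The integration step can be illustrated with a small sketch that keeps only RNA-seq fusions supported by a WGS rearrangement in the same model. Matching on an unordered gene pair is an assumption made here for illustration; the text does not specify the matching rule, and the function names, model labels, and gene names below are hypothetical.

```python
def gene_pair(gene_a, gene_b):
    """Order-independent key for an event joining two genes."""
    return frozenset((gene_a.upper(), gene_b.upper()))

def supported_fusions(rna_fusions, wgs_rearrangements):
    """Return RNA-seq fusions whose gene pair also appears in the WGS calls
    of the same model. Each call is a (model, gene_a, gene_b) tuple."""
    wgs_keys = {(model, gene_pair(a, b)) for model, a, b in wgs_rearrangements}
    return [
        (model, a, b)
        for model, a, b in rna_fusions
        if (model, gene_pair(a, b)) in wgs_keys
    ]

# Hypothetical calls, not taken from the study.
rna = [("M1", "GENE1", "GENE2"), ("M1", "TP53", "GENE3"), ("M2", "GENE4", "GENE5")]
wgs = [("M1", "GENE2", "GENE1"), ("M2", "GENE4", "GENE6")]
kept = supported_fusions(rna, wgs)
print(kept)
```

In practice, breakpoint coordinates and orientation would likely also be compared; gene-level matching is the simplest possible criterion.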

WGS provides an accurate representation of the genomic alteration landscape of OS PDX models

By integrating non-silent mutations with copy number and rearrangement data, we constructed a comprehensive genome landscape of several recurrently mutated genes across the 21 PDX models (Fig. 5). Our analysis revealed that 20 of the 21 samples exhibit TP53 alterations, while 19 samples exhibit RB1 alterations. Previous PPTC studies utilized a classifier trained on RNA expression data from The Cancer Genome Atlas (TCGA), which predicted non-functional TP53 in all OS PDX models with available RNA-seq data2. The authors were unable to attribute this observation to genetic alterations using WES and SNP array analyses for OS-34-SJ, OS-43-TPMX, and OS-51-CHLX. In contrast, the WGS data here identified TP53 alterations through structural rearrangements for OS-34-SJ and OS-43-TPMX, and a missense mutation in OS-51-CHLX. This underscores the comprehensive nature of WGS in characterizing the precise genome profiles of OS PDX models.

Figure 5

Landscape of recurrently altered genes in PDX models as detected by WGS data. Each of the 21 samples shown has at least one alteration in the genes listed (altered in 100% of the samples shown). The top panel indicates the tumor mutation burden (TMB) for each model. The bottom panel illustrates the top recurrent somatic mutations along with an indication of the presence of chromothripsis (N = No, Y = Yes).

Chromothripsis detected in a majority of PDX models

Chromothripsis refers to a genomic process in which extensive rearrangements occur in a single catastrophic event, leading to significant alterations in the genome12. This phenomenon has the potential to generate genetic drivers of oncogenesis through DNA copy number gain and loss, as well as rearrangements like translocations. Studies on OS have reported the presence of chromothripsis at varying frequencies (20–89%) across patient samples of different ages6,7. Notably, previous research revealed that younger OS patients exhibit clustered rearrangements associated with chromothripsis to a greater extent than older patients7. These findings suggest that catastrophic chromothripsis events may play a more prominent role in driving oncogenesis among young OS patients compared to older adults. Outside of the context of this study, investigators may interrogate these results in the context of extensive publications describing drug testing across these models.

In this study, we utilized ShatterSeek13 to identify chromothripsis events in our cohort of 21 OS PDX samples (Fig. 6) (see Methods). Supplementary Fig. S4 presents illustrative examples from samples OS17 and OS33, demonstrating the characteristic features of chromothripsis, including oscillations in copy number between two or three states and clusters of intermingled structural variations. It is worth noting that most of the patients represented in our sample cohort are children and young adults (Supplementary Table S1), and of the 21 samples, only two (OS-46-SJ and OS-58-SJ) did not exhibit detectable chromothripsis events in their genomes (Fig. 6a). Thus, our findings in PDX models are consistent with our previous investigations on patient samples7. Furthermore, we observed recurrent occurrences of chromothripsis in specific chromosomes, such as chromosomes 1, 3, and 5 (Fig. 6b).

Figure 6

Chromothriptic regions in the PDX models. (a) Integrative Genomics Viewer display of the chromothriptic regions called by ShatterSeek. High confidence 1: at least 6 interleaved intrachromosomal structural variants, 7 contiguous segments oscillating between 2 copy number states, the fragment joins test, and either the chromosomal enrichment or the exponential distribution of breakpoints test. High confidence 2: at least 3 interleaved intrachromosomal structural variants and 4 or more interchromosomal structural variants, 7 contiguous segments oscillating between 2 copy number states, and the fragment joins test. Low confidence: at least 6 interleaved intrachromosomal structural variants, 4, 5 or 6 adjacent segments oscillating between 2 copy number states, the fragment joins test, and either the chromosomal enrichment or the exponential distribution of breakpoints test. (b) Frequency plot of the chromosome regions in OS PDX models that have undergone chromothripsis. The frequency indicates the number of patients with chromothripsis. Left panel: High confidence 1 calls. Right panel: High confidence 2 calls.
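The three confidence tiers in the caption can be expressed as a small decision function. This is a sketch of the stated criteria only, not ShatterSeek's own code: the class and field names are hypothetical, "7 contiguous segments" is read as "at least 7", and each statistical test is passed in as a precomputed pass/fail flag.

```python
from dataclasses import dataclass

@dataclass
class RegionSummary:
    """Per-chromosome summary statistics (hypothetical field names)."""
    intra_svs: int             # interleaved intrachromosomal structural variants
    inter_svs: int             # interchromosomal structural variants
    oscillating_segments: int  # adjacent segments oscillating between 2 CN states
    fragment_joins_ok: bool    # fragment joins test passed
    enrichment_ok: bool        # chromosomal enrichment test passed
    exponential_ok: bool       # exponential distribution of breakpoints test passed

def classify(region):
    """Assign a confidence tier following the criteria in the caption."""
    if not region.fragment_joins_ok:
        return "none"  # every tier requires the fragment joins test
    breakpoint_test = region.enrichment_ok or region.exponential_ok
    if region.intra_svs >= 6 and region.oscillating_segments >= 7 and breakpoint_test:
        return "high confidence 1"
    if region.intra_svs >= 3 and region.inter_svs >= 4 and region.oscillating_segments >= 7:
        return "high confidence 2"
    if region.intra_svs >= 6 and 4 <= region.oscillating_segments <= 6 and breakpoint_test:
        return "low confidence"
    return "none"

print(classify(RegionSummary(8, 0, 9, True, True, False)))
```

Checking the tiers in this order means a region that satisfies more than one rule is reported at the highest tier it meets.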

Two major subgroups of osteosarcoma by proteomic analyses

Reverse phase protein arrays (RPPA) are large-scale proteomic surveys that measure expression levels of hundreds of proteins in a single batch using high-quality antibodies14. These antibodies include those that detect post-translationally modified forms (e.g., phosphorylated proteins), which enables inference of the activation levels of common cancer pathways14. For each of the osteosarcoma PDX models, we measured the signal from 377 antibodies (Supplementary Table S4, Supplementary Fig. S5).

First, we used these RPPA proteomic profiles to determine the level of similarity between osteosarcoma patients and the PDX models. To do this, we kept the 217 antibody signals that overlapped between the osteosarcoma patients from Wu et al.7 and the PDX models and removed any batch effects. Following the methodology of Liu et al.15, we then categorized the patients and PDX models into groups across various scales, generating co-clustering scores for each pair (see Methods). The patients and the PDX models clustered into two groups, A and B, based on these co-clustering scores (Fig. 7a). In this heatmap, the models OS34-SJ, OS29, OS9, and OS39R-SJ were the most representative of group A, while models OS60-SJ, OS53-SJ, and OS56-SJ most resembled group B. When we used the Student’s t-test to compare the levels of antibody binding between groups A and B across both patients and PDX models, using thresholds of 1.5-fold change and adjusted p-value < 0.05, we found that ATM, Jak2, mTOR, Cyclin B1, 53BP1, and ATRX were present at statistically higher levels in group A than in group B (Supplementary Table S5).

Figure 7

RPPA analysis of osteosarcoma. a Heatmap of co-clustering scores between the RPPA data in OS PDX models and OS patients. b Heatmap of the pathway scores in the PDX models. The passage number of the PDX model is indicated after the PDX model name.

One routine analysis for RPPA data is to use established groupings of the antibodies into biological pathways based on manual literature curation. Pathway scores were generated according to the positive or negative contribution of each protein to the overall activation of the pathway (see Methods, Fig. 7b, Supplementary Table S6)14. We used the Student’s two-sample t-test to compare each pathway score between group A and group B, using the batch-corrected data for both osteosarcoma patients and PDX models (Supplementary Table S7). At a false discovery rate of 0.05, the “breast reactive” pathway score was statistically higher in group B than in group A, which was due to higher levels of Caveolin-1, Myosin-11, and Ras-related protein Rab-11A, and lower levels of catenin beta-1 (β-catenin), GAPDH, and RNA-binding protein 15 (RBM15) in group B as compared to group A (Supplementary Table S5). Gender, primary tumor status (primary or metastatic), and differentiation (osteoblastic, fibroblastic, chondroblastic) did not distinguish groups A and B.

We then attempted to determine the utility of these RPPA findings by comparing the protein levels of AKT, Jak2, mTOR, and tyrosine kinases with published drug response data for inhibitors of these targets (Supplementary Tables S8–S14). The first challenge in comparing drug effect with these protein profiles was that most drugs did not have any impact on osteosarcoma PDX lines. Beyond this, there was no correlation between the target protein levels and drug response, except for rapamycin, which targets mTOR (Pearson correlation coefficient = 0.83, linear regression R² = 0.6971, Supplementary Table S9). Therefore, there is no strong indication that protein levels and pathway scores can be used to predict either the Kaplan-Meier estimate of median days to event or the overall response.

Discussion

The genome of osteosarcoma (OS) exhibits considerable complexity and instability, characterized by a high frequency of rearrangements, chromothripsis, and somatic whole genome copy number alterations5,6,7. Given this complexity, a comprehensive characterization of OS necessitates the use of whole-genome sequencing (WGS). In this study, we employed WGS to thoroughly examine 21 OS PDX models, out of which 17 had previously undergone characterization in the PPTC studies utilizing WES, RNA-seq, and SNP array analyses. By utilizing WGS in this project, our aim was to provide a valuable resource to the OS research community, facilitating advancements in our understanding and treatment of osteosarcoma.

The copy number profiles derived from WGS data largely resembled those obtained from SNP array data. Among the differences observed, breakpoints between exons 1 and 2 of TP53 were found in six PDX samples, whereas this alteration was seen in only one sample within the SNP array data. This alteration leads to the functional loss of TP53 and has been previously identified in approximately 20% of OS patient cases using WGS5,11. Breakpoints of this type in intron 1 were confirmed in a recent study, in which they were more prevalent in pediatric cases16. Since breakpoints within intron 1 of TP53 are expected in OS, the higher coverage in WGS may provide a more accurate representation of the copy number landscape of OS PDX models as compared to SNP array analysis. Other discrepancies were associated with the copy number calls, including deeper deletions in genes such as TP53 and RB1 within the WGS data that were absent in the SNP array data. The presence of deeper deletions in the WGS data indicates that purity was not a primary reason for the differences observed between SNP array and WGS calls. Instead, these deletions may reflect the inherent genomic instability of OS that would give rise to tumor heterogeneity and progression as the PDX lines are expanded over time.

Structural rearrangements were found in all 21 PDX samples, including those involving TP53 in 12 samples with breakpoints between exons 1 and 2, and those involving RB1 in 2 samples. Integrated analysis of WGS and RNA-seq data revealed that very few of these structural rearrangements are expressed, suggesting a low potential level of neoantigens in OS PDX models. This result aligns with our previous studies of OS patient samples7, which proposed that the limited presence of expressed genomic alterations may be associated with low immune infiltrate in OS.

Chromothripsis was identified in 19 of the 21 samples, all of which were derived from children and young adult patients. Therefore, our findings in these PDX models are consistent with our previous investigations on patient samples7, which revealed that younger OS patients exhibit clustered rearrangements associated with chromothripsis to a greater extent than older patients. The locations of chromosomal breaks have been associated with tandem-rich repeats, G-rich sequences, transposable elements, non-B DNA forming motifs, and fragile sites in cancers and osteosarcoma7,17,18.

Omic-informed treatment strategies have been demonstrated in other osteosarcoma PDX models. Those with amplification of MYC (> 4 copies) had decreased tumor growth when treated with CDK inhibitors19,20. Our preliminary analysis suggests that both target protein levels and pathway scores derived from RPPA have little correlation with drug response. The only exception was the high correlation between the levels of mTOR protein, including its phosphorylated form, and the median time to event for the mTOR inhibitor rapamycin. This may be promising given that mTOR inhibitors were among the classes of drugs with the greatest efficacy in an independent set of OS PDX models21. In this same study, the predictive value of RNA expression levels for the resulting EC50 (µM) was also unclear. Therefore, future approaches to improve the value of RPPA and RNA sequencing data in predicting response include implementing machine learning that incorporates gene networks, pathway crosstalk, and pathway interactions. These future studies may also identify combination therapies to be pursued.

In summary, our use of WGS demonstrated that the 21 PDX models effectively capture the genomic complexity and instability features of OS, including chromothripsis, rearrangements, and precise copy number profiles. These WGS data for the 21 PDX models represent a valuable resource for better understanding the genomic features of OS and can aid in the identification of translational biomarkers and therapeutic targets for the treatment of this disease.

Materials and methods

PDX sample collection

PDX models were generated as described22,23,24. They are available through the PPTC, a National Cancer Institute funded program to evaluate novel agents against pediatric solid tumor and leukemia preclinical models. We used 21 osteosarcoma PDX models for the WGS study: 17 models (OS1, OS2, OS9, OS17, OS29, OS31, OS33, OS34-SJ, OS36-SJ, OS42-SJ, OS43-SJ, OS46-SJ, OS49-SJ, OS51-SJ, OS55-SJ, OS56-SJ, OS60-SJ) correspond to the PDX models used in the PPTC studies2, and 4 are new models (OS21, OS39R-SJ, OS53-SJ, OS58-SJ).

Patient collection

This is a previously published cohort where samples were obtained with informed consent and approved by the University of Texas MD Anderson Cancer Center Institutional Review Board7.

Whole genome sequencing

Genomic DNA (gDNA) was extracted with the QIAamp DNA Mini kit (Qiagen, Germantown, MD) and used for high depth paired-end whole genome sequencing. Whole genome sequencing data were generated at Baylor College of Medicine – Human Genome Sequencing Center (BCM-HGSC) using established library preparation and sequencing methods. Libraries were prepared using KAPA Hyper PCR-free library reagents (KK8505, KAPA Biosystems Inc.) on Beckman robotic workstations (Biomek FX and FXp models). Briefly, DNA (750 ng) was sheared into fragments of 200–600 bp using the Covaris E220 system (96-well format, Covaris, Inc. Woburn, MA), followed by purification of the fragmented DNA using AMPure XP beads. A double size selection step was employed, with different ratios of AMPure XP beads, to select a narrow size band of sheared DNA molecules for library preparation. DNA end-repair and 3’-adenylation were then performed in the same reaction, followed by ligation of the Illumina unique dual barcoded adapters (Illumina TruSeq UD Indexes, #20040870) to create PCR-free libraries. The libraries were run on the Fragment Analyzer (Advanced Analytical Technologies, Inc., Ames, Iowa) to assess library size and the presence of remaining adapter dimers. This step was followed by a qPCR assay using the KAPA Library Quantification Kit (KK4835) with its SYBR® FAST qPCR Master Mix to quantify the libraries. WGS libraries were sequenced on the Illumina NovaSeq 6000 instrument using the NovaSeq 6000 S4 Reagent Kit v1.5 (Catalog #20028312) and XP 4-Lane Kit v1.5 (Catalog #20043131). Libraries were loaded at an average concentration of 240 pM to generate 150 bp paired-end reads. On average, 133 GB of unique aligned sequence was obtained per sample. The average insert sizes in these samples were 439 bp (median) and 418 bp (mode). For each sample, the reads were mapped to the hg19 reference genome using BWA-MEM (version 0.7.15), followed by the downstream analyses detailed below.

Somatic point variants were called from aligned WGS data using MuTect25. High quality variants were defined as those with a tumor read depth ≥ 15, a matched normal read depth ≥ 7, and alternate allele frequencies in the tumor and normal of ≥ 0.05 and ≤ 0.01, respectively. Known single nucleotide polymorphisms from the 1000 Genomes and ExAC were removed. Indel variants were called using Pindel26. Pindel raw calls were further filtered to select for calls with score > 30, ESP6500 and 1000G population minor allele frequencies > 0.01, and not intronic. Structural rearrangements were detected from aligned WGS data using the bespoke algorithm BReakpoint AnalySiS (BRASS) (https://github.com/cancerit/BRASS). Total copy number calls were determined by applying HMMcopy27, with log2 scores above 0.5 indicating gains and log2 scores below -0.5 indicating losses. The weighted genome instability index (WGII)28 was utilized to quantify the proportion of the genome with copy number alterations as a measure of chromosomal instability. Focal recurrent copy number alterations were identified at a 95% confidence level using GISTIC 2.029. Gene-level data were derived from the segmented total copy number data using the R package CNTools30. Chromothripsis was detected by analyzing copy number profiles and filtered rearrangements with ShatterSeek (https://github.com/parklab/ShatterSeek)13. Chromothripsis calls were assigned to one of three confidence levels. High confidence 1: at least 6 interleaved intrachromosomal structural variants, 7 contiguous segments oscillating between 2 copy number states, the fragment joins test, and either the chromosomal enrichment or the exponential distribution of breakpoints test. High confidence 2: at least 3 interleaved intrachromosomal structural variants and 4 or more interchromosomal structural variants, 7 contiguous segments oscillating between 2 copy number states, and the fragment joins test. Low confidence: at least 6 interleaved intrachromosomal structural variants, 4, 5 or 6 adjacent segments oscillating between 2 copy number states, the fragment joins test, and either the chromosomal enrichment or the exponential distribution of breakpoints test.
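The numeric thresholds stated for point variants and copy number calls can be written as two small predicates. This is a minimal sketch using only the cutoffs given in the text; the function and argument names are hypothetical, and it does not reproduce MuTect or HMMcopy themselves.

```python
def is_high_quality(tumor_depth, normal_depth, tumor_vaf, normal_vaf):
    """Apply the stated read-depth and allele-frequency cutoffs to one variant."""
    return (
        tumor_depth >= 15
        and normal_depth >= 7
        and tumor_vaf >= 0.05
        and normal_vaf <= 0.01
    )

def copy_number_state(log2_ratio):
    """Label a segment from its log2 score: above 0.5 is a gain, below -0.5 a loss."""
    if log2_ratio > 0.5:
        return "gain"
    if log2_ratio < -0.5:
        return "loss"
    return "neutral"

# Hypothetical values for illustration.
print(is_high_quality(tumor_depth=40, normal_depth=12, tumor_vaf=0.22, normal_vaf=0.0))
print([copy_number_state(v) for v in (0.9, 0.1, -1.3)])
```

Segments whose log2 score falls exactly on a threshold are treated as neutral here, because the text defines gains and losses with strict inequalities.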

RPPA data analysis

The reverse phase protein array (RPPA) was processed as previously described, and the previously annotated pathways were also used in the analysis31. We compared the RPPA data from the PDX lines with those obtained from 38 osteosarcoma patients (previously reported14). To ensure consistency, we selected only the proteins that were common to both datasets, resulting in 217 proteins for subsequent analysis. To address batch effects between these two datasets, we utilized the ComBat method32.

For assessing pairwise independence between patients, PDX models, and patient-PDX relationships, we employed the distance correlation measure33. A value of 0 indicates independence between two samples, while a higher value suggests a stronger linear or nonlinear association.
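For readers unfamiliar with the measure, the sample distance correlation between two profiles can be computed from double-centered distance matrices. The sketch below is a plain-Python version of the standard estimator for two equal-length numeric vectors; it is illustrative only and is not the implementation used in the study.

```python
from math import sqrt

def _centered_distances(values):
    """Pairwise absolute distances, double-centered by row, column and grand mean."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]  # matrix is symmetric: column means match
    grand_mean = sum(row_mean) / n
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]

def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar2_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        return 0.0  # at least one profile is constant
    return sqrt(max(dcov2, 0.0) / denom)

# A purely nonlinear relationship: the Pearson correlation of these vectors is zero.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]
print(round(distance_correlation(x, y), 3))
```

The quadratic example shows why the measure suits this comparison: it detects an association that a linear correlation coefficient would miss.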

Following the methodology of Liu et al.15, we categorized the patients and PDX models into different groups across various scales at false discovery rate = 0.2. Larger scales yielded fewer groups with larger sizes, whereas smaller scales resulted in a greater number of smaller communities. The co-clustering score employed adjusted clustering methods based on NetDrugMatch (https://github.com/bayesrx/Software/blob/master/NetDrugMatch/NetDrugMatch_allFunctions.R) and quantified the extent to which each patient-PDX pair shared the same community at different scales. Additionally, we utilized the adjusted co-clustering score, which is weighted by the inverse variation of the community structure. A higher score indicates greater similarity between two samples. Final unsupervised clustering of the co-clustering scores was conducted using ComplexHeatmap and revealed two groups, A and B, as described in the Results34. The Shapiro-Wilk test was used to determine whether each antibody result was normally distributed. To compare groups A and B in both patient and PDX samples, the batch-corrected RPPA data were analyzed with the two-sided Student’s t-test (for antibodies that were normally distributed) or the Wilcoxon test (for antibodies that were not normally distributed). The false discovery rate (FDR) was used to adjust the p-values.
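The text does not name the FDR procedure. Assuming the commonly used Benjamini-Hochberg adjustment, the step from raw to adjusted p-values can be sketched as follows; the function name and example values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the result.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four antibodies.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

An antibody would then be reported as differential only if its adjusted p-value is below 0.05 and its fold change also meets the stated threshold.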

RPPA pathway scores were generated for both patient and PDX models. First, member proteins that were used as pathway predictors were selected by literature review as previously described14. Briefly, median-centered and normalized RPPA data were used as the relative protein levels. The pathway scores were then calculated by taking the sum of the relative protein levels of all positive regulatory pathway members minus that of the negative regulatory pathway members. We averaged antibodies targeting different phosphorylated forms of the same protein with ρ > 0.85 (Pearson’s correlation). The pathway scores are relative values, and no absolute cutoff has been found to group samples based on the scores. Smaller scores indicate that the pathway has less activity relative to those samples with larger scores. Scores were combined for patient and PDX groups, and then groups A and B were compared using the two-sided Student’s t-test with FDR correction.
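The pathway score arithmetic described above can be sketched in a few lines. The protein names, member lists, and values below are hypothetical, and the normalization step is reduced to median centering for brevity.

```python
from statistics import median

def median_center(values):
    """Subtract the median so that levels are expressed relative to the cohort."""
    med = median(values)
    return [v - med for v in values]

def pathway_score(levels, positive, negative):
    """Sum of relative levels of positive regulators minus that of negative regulators."""
    return sum(levels[p] for p in positive) - sum(levels[n] for n in negative)

# Hypothetical relative protein levels for one sample.
sample_levels = {"POS_A": 0.8, "POS_B": 0.3, "NEG_A": -0.4}
score = pathway_score(sample_levels, positive=["POS_A", "POS_B"], negative=["NEG_A"])
print(round(score, 2))

# Median centering of one antibody across five hypothetical samples.
print(median_center([2.0, 3.5, 1.0, 4.0, 3.0]))
```

Because the inputs are centered, a score is meaningful only in comparison with the scores of other samples, which matches the statement that no absolute cutoff exists.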