Introduction

Post-radiation sarcomas (also known as secondary sarcomas or radiation-induced sarcomas) represent a small fraction of all sarcomas (<5%); these second malignancies arise from previous radiation therapy [1,2,3]. First established in 1948 [4] and reviewed in 1971 [5], post-radiation sarcomas are defined by three criteria: a latency period of at least 3 years before the newly diagnosed sarcoma, a sarcoma located within the radiation field of the previous cancer, and a histological difference between the primary tumor and the sarcoma. Though this secondary tumor is generally diagnosed after a decade, a shorter latency period (6 months) has also been proposed if the other criteria are fulfilled [6].

Despite the incontestable benefits of radiation therapy on cancer, a small fraction of patients develops a secondary tumor (<1%) [7]. Survival studies show that post-radiation sarcomas are more aggressive than sporadic ones [6, 8,9,10,11]. This can be explained by several factors: unfavorable tumor locations, poor treatment responses, late diagnoses, elderly patients, necrosis, and frequent metastatic outcomes [12, 13]. In addition, these groups have distinct clinical characteristics. For example, undifferentiated pleomorphic sarcomas, osteosarcomas, and angiosarcomas have higher incidences as post-radiation compared to sporadic tumors [9,10,11, 13, 14].

Ionizing radiation is known to alter DNA [15]. It creates genomic alterations and disables DNA-repair mechanisms. Post-radiation sarcomas predominantly develop in zones exposed to high doses (50 Gy, normally inducing cell death) rather than low doses (30 Gy, generating genomic instability) [8, 9]. On the other hand, carcinomas are observed in areas exposed to low doses, even though a link between radiation dose and cumulative incidence rate has been suggested [16].

To date, few genetic differences have been observed between sporadic and post-radiation tumors. For example, post-radiation angiosarcomas harbor recurrent MYC amplifications (>50%; a well-known proto-oncogene) [17] and KDR mutations (10%; coding for the vascular endothelial growth factor receptor 2) [18], unlike sporadic angiosarcomas. This suggests that the oncogenic mechanism may be different in these two models. However, other post-radiation subtypes do not harbor specific genomic alterations, which could be related to different biological pathways from sporadic sarcomas, whose oncogenesis remains unclear. Post-radiation sarcomas with complex genetics have similar genomic alterations to sporadic ones, such as RB1 loss and p53 pathway inactivation [19,20,21]. Nevertheless, a sarcoma-based gene expression signature, representing chronic oxidative stress, has been proposed to distinguish the two entities [22]. In addition, several gene-sets have been proposed as indicators of irradiation in various human models [23,24,25,26,27,28].

We consequently studied the genome and transcriptome characteristics of post-radiation sarcomas to better understand the biology of these tumors. We used DNA-arrays to measure their genomic alterations and performed RNA sequencing (RNA-seq) to explore their transcriptome profiles.

Material and methods

Cohorts

To highlight differences between post-radiation and sporadic sarcomas, we compared our results to an independent cohort of sporadic tumors. Clinical characteristics are available in Table 1. Additional information about patient recruitment, histological review, molecular profiling, and data processing/filtering can be found in the Supplementary Methods.

Table 1 Clinical characteristics in the two independent cohorts: sporadic and post-radiation sarcomas

Data accession

The 93 sporadic sarcomas are part of cohort #2 (95 cases) previously published [29], except cases N001 and S915 for which patients had previously reported cancers. Gene Expression Omnibus (GEO) accession: GSE71119 and Sequence Read Archive (SRA) accession: SRP057793. Data for the 77 post-radiation sarcomas are available at GEO accession: GSE102055 and SRA accession: SRP113755.

Results

Genetic specificity of post-radiation angiosarcomas

As previously reported, we observed a high frequency of MYC amplifications in all but one (96%) post-radiation angiosarcomas. Besides MYC alterations, there were a few other genomic alterations (Sup. Fig. 1). Since recurrent KDR (coding for the VEGFR-2 protein) mutations were also reported in post-radiation angiosarcomas, we screened variations of this gene in our 24 cases. Two of them (8%) harbor missense KDR variants: p.S691C (NM_002253 exon 14, c.2072C>G; case AA6201; Sup. Fig. 2A) and p.D1312H (NM_002253 exon 30, c.3934G>C; case AA6175; Sup. Fig. 2B). These two variations involve the extracellular and cytoplasmic regions of the VEGFR-2 protein, respectively. To date, no domain has been reported for D1312 while S691 is located in the immunoglobulin-like C2-type 7 domain, which has not been identified as an essential component for VEGF-A or VEGF-C interaction [30].

Genomic characterization of post-radiation and sporadic sarcomas with complex genetics

To characterize genomic alterations in these groups, we performed DNA-arrays on 11 post-radiation leiomyosarcomas and 20 post-radiation undifferentiated pleomorphic sarcomas and compared the profiles to those of 34 sporadic leiomyosarcomas and 22 sporadic undifferentiated pleomorphic sarcomas (respectively, including 7 and 5 cases from the reference cohort of 93 sporadic tumors analyzed by RNA-seq). We considered both gains and losses observed in the different groups (leiomyosarcomas vs. undifferentiated pleomorphic sarcomas and/or sporadic vs. post-radiation sarcomas) with penetrance plots (Fig. 1). Globally, sporadic and post-radiation sarcomas had the same levels of genomic alterations, including those that typically occur in sarcomas with complex genetics: losses of chromosomes 10, 13, 16 and gains of chromosomes 1, 9, 14 with high frequencies [31,32,33]. In particular, RB1 (13q14.2) was targeted at the same frequency in both sporadic and post-radiation tumors (nearly 75% for sporadic and post-radiation leiomyosarcomas; 85% and 75% for sporadic and post-radiation undifferentiated pleomorphic sarcomas, respectively; Sup. Fig. 3: upper panel), in agreement with the literature.

Fig. 1

Penetrance plots showing recurrent copy number variations in leiomyosarcomas (LMS; upper panel: 34 sporadic and 11 post-radiation tumors), undifferentiated pleomorphic sarcomas (UPS; intermediate panel: 22 sporadic and 20 post-radiation tumors), and pleomorphic sarcomas (LMS + UPS; lower panel). Each panel displays gains upwards and losses downwards for chromosomes 1–22 and includes cytogenetic bands on top

However, a difference can be observed in the 9p21.3 region (including CDKN2A and CDKN2B, coding for p14/p16 and p15 proteins, respectively; Sup. Fig. 3: lower panel). This region was lost in 35% of sporadic leiomyosarcomas vs. 64% of post-radiation ones and in 45% of sporadic undifferentiated pleomorphic sarcomas vs. 75% of post-radiation ones. Grouped together (leiomyosarcomas and undifferentiated pleomorphic sarcomas), this region was significantly lost with a higher frequency in post-radiation than in sporadic sarcomas (71% and 39%, respectively; Fisher’s P = 6.92e−3). In sporadic tumors, CDKN2A loss co-occurred with RB1 loss in 86% of tumors (19 cases out of 22). In post-radiation tumors, this intersection was lower and represented 68% of cases (15 cases out of 22; binomial probability P(X ≤ 15) = 2.3e−2), while 7 tumors out of 31 (23%) lost CDKN2A but not RB1. Taken together, RB1 or CDKN2A losses occurred in 91% of sporadic sarcomas (48 cases out of 53) and 97% of post-radiation ones (30 cases out of 31).
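The binomial probability quoted above can be reproduced with a short calculation. The sketch below assumes that the co-occurrence rate in sporadic tumors (19/22) serves as the null proportion and that the 22 post-radiation tumors with CDKN2A loss are the trials; this reading is an assumption, and the function name is illustrative.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Assumption: null proportion = RB1/CDKN2A co-occurrence rate in sporadic tumors.
p_null = 19 / 22
# Observed: 15 of the 22 post-radiation tumors with CDKN2A loss also lost RB1.
p_value = binomial_cdf(15, 22, p_null)
print(f"P(X <= 15) = {p_value:.2g}")
```

Under these assumptions the result should land close to the reported 2.3e−2.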

Mutational specificity of post-radiation sarcomas

An initial pool of 310,641 unfiltered variants was detected in all the available cases processed by RNA-seq: 93 sporadic and 77 post-radiation sarcomas (see the section “Material and methods”). After the filtration step, we obtained 5594 variants covering 3925 genes.

For each gene, we reported the number of altered sporadic and post-radiation tumors. Then, gene variations having an incidence difference of <5% in the two cohorts were removed. A total of 75 variations corresponding to 21 genes was identified with minimum and maximum incidence differences of 5.19% and 11.38%, respectively (Table 2). Among these 21 genes, MUTYH was the only one reported in the Cancer Gene Census database (2018/03 version) [34] in the context of colon cancer as an alternative to APC mutations in familial adenomatous polyposis. Though no variant was detected for this gene in sporadic sarcomas, four missense variants were detected in post-radiation ones (cases: AA6216, AA6209, AA6233, and AA6182) in different exons (16, 13, 11, and 10, respectively; NM_001048171 transcript). These sarcomas were, respectively, two undifferentiated pleomorphic sarcomas, one extraskeletal osteosarcoma and one undifferentiated spindle cell sarcoma. Their respective locations were the vagina, chest wall, shoulder girdle, and orbit.
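The incidence filter described above can be sketched as follows. Cohort sizes and the MUTYH counts (0 sporadic and 4 post-radiation tumors) come from the text; `GENE_X` and the helper names are hypothetical placeholders added for illustration.

```python
N_SPORADIC = 93          # sporadic sarcomas analyzed by RNA-seq
N_POST_RADIATION = 77    # post-radiation sarcomas analyzed by RNA-seq
MIN_DIFF_PERCENT = 5.0   # genes below this incidence difference are removed


def incidence_difference(n_sporadic_altered: int, n_post_altered: int) -> float:
    """Absolute difference (in percentage points) of altered tumors between cohorts."""
    sporadic = 100 * n_sporadic_altered / N_SPORADIC
    post_radiation = 100 * n_post_altered / N_POST_RADIATION
    return abs(post_radiation - sporadic)


# gene -> (altered sporadic tumors, altered post-radiation tumors)
# MUTYH counts are from the text; GENE_X is a made-up placeholder.
counts = {"MUTYH": (0, 4), "GENE_X": (3, 3)}

kept = {
    gene: round(incidence_difference(*c), 2)
    for gene, c in counts.items()
    if incidence_difference(*c) >= MIN_DIFF_PERCENT
}
print(kept)  # {'MUTYH': 5.19}
```

With four altered tumors out of 77 and none out of 93, MUTYH yields a difference of 5.19%, which coincides with the minimum difference reported for the 21 retained genes.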

Table 2 Genes for which variants were detected in post-radiation and sporadic sarcomas with a difference in incidence of more than 5%

We then looked for differences in mutational patterns among the filtered variants depending on the radiation context. First, no difference in terms of mutation types (missense, nonsense, and stop loss) or base substitutions (transversions or transitions) was detected (Fisher’s P-values are 1.86e−1 and 1e−1, respectively). Second, taking into account the nucleotides immediately 5′ and 3′ of each variant, we obtained similar profiles for sporadic and post-radiation tumors, corresponding to the signature of spontaneous deamination of 5-methylcytosine. This mutational signature has been observed in many cancer types and represents an aging signature, since this process occurs naturally throughout life [35].
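As an illustration of how such mutational profiles are commonly tabulated, the sketch below classifies single-base substitutions as transitions or transversions and builds a pyrimidine-normalized trinucleotide context. It is not the pipeline used in the study; the function names and toy variants are assumptions made for the example.

```python
from collections import Counter

PURINES = {"A", "G"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Transition if both bases are purines or both are pyrimidines, else transversion."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    return "transition" if (ref in PURINES) == (alt in PURINES) else "transversion"


def trinucleotide_context(five_prime: str, ref: str, alt: str, three_prime: str) -> str:
    """Return a pyrimidine-normalized context such as 'A[C>T]G'."""
    if ref in PURINES:
        # Describe the change on the opposite strand (reverse complement).
        five_prime, ref, alt, three_prime = (
            COMPLEMENT[three_prime], COMPLEMENT[ref], COMPLEMENT[alt], COMPLEMENT[five_prime],
        )
    return f"{five_prime}[{ref}>{alt}]{three_prime}"


# Toy variants (5' base, ref, alt, 3' base); illustrative only, not study data.
toy_variants = [("A", "C", "T", "G"), ("C", "G", "A", "T"), ("T", "C", "A", "A")]
classes = Counter(substitution_class(ref, alt) for _, ref, alt, _ in toy_variants)
contexts = Counter(trinucleotide_context(*v) for v in toy_variants)
print(classes)
print(contexts)
```

A C>T change followed by a G (for example `A[C>T]G`) is the context typically associated with deamination of 5-methylcytosine at CpG sites.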

Fusion genes in secondary sarcomas

A total of 25,142 and 9163 fusion genes were predicted for sporadic and post-radiation sarcomas, respectively (see the section “Material and methods”). After the filtration step, 2480 and 515 fusion genes were investigated in these groups, respectively. Among the 515 fusion genes detected in post-radiation sarcomas, we kept those that recurred in this group (possibly with different partners) and were not detected in sporadic sarcomas. Twelve genes fulfilled these conditions, but sequence reanalysis excluded 10 of them as false positives due to sequence homology. Two genes were subsequently investigated by in silico analyses: RUFY1 and TRAM1. RUFY1 is either fused with CMTM6 (CMTM6 exon 1-RUFY1 exon 12; t(3;5); case AA6176) or deleted (RUFY1 5′ UTR-intergenic; case AA6206). TRAM1 is either fused with RAVER1 (TRAM1 exon 10-RAVER1 exon 1; t(8;19); case AA6218) or produces an inter-chromosomal transcript (TRAM1 exon 6-intergenic; t(8;12); case AA6230). Since the partners were not identical and each gene was involved in only two cases, fusion genes were ruled out as a main mechanism for the oncogenesis of post-radiation sarcomas compared to sporadic ones.

Transcriptome analysis of post-radiation vs. sporadic sarcomas

To analyze homogeneous sub-groups and maximize transcriptomic differences, we focused on leiomyosarcomas and undifferentiated pleomorphic sarcomas. Unsupervised methods (t-SNE and clustering) comparing sporadic (22 leiomyosarcomas plus 31 undifferentiated pleomorphic sarcomas) and post-radiation (11 leiomyosarcomas plus 20 undifferentiated pleomorphic sarcomas) sarcomas failed to identify any transcriptomic differences, suggesting that these groups share similar transcriptomic patterns (Fig. 2). In addition, the seven published gene expression signatures related to irradiation failed to categorize sporadic and post-radiation tumors (Fisher’s P with Benjamini–Hochberg adjustment >0.05; Sup. Fig. 4).
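For readers unfamiliar with the Benjamini–Hochberg adjustment mentioned above, a minimal plain-Python sketch follows. The input P-values are hypothetical and are not those of the seven signatures.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values, for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
print([round(q, 3) for q in benjamini_hochberg(raw)])  # [0.02, 0.04, 0.04, 0.02]
```

A signature would be considered discriminant only if its adjusted value fell below the chosen threshold (0.05 here).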

Fig. 2

Unsupervised clustering methods applied to sporadic and post-radiation pleomorphic sarcomas (t-SNE and agglomerative hierarchical clustering in the left and right panels, respectively) to segregate tumors based on their transcriptomic profiles. The two groups are intermixed, indicating that they share global transcriptomic similarities

Supervised differential expression analysis between sporadic and post-radiation sarcomas identified 529 up-regulated genes in sporadic tumors and 198 in post-radiation tumors (Sup. Tables 1 and 2). The main gene ontology associated with sporadic tumors is cation transmembrane transport (Q = 9.9e−3). The main gene ontologies associated with post-radiation tumors are responses to oxygen-containing compounds and organic substances (Q = 3.19e−5 and 2.73e−4, respectively), positive regulation of lipid storage (Q = 8.36e−3), and inflammatory response (Q = 1.28e−2). No enrichment for the response to radiation ontology (or associated sub-terms) was detected.

Characterization of secondary angiosarcomas

We then analyzed the entire cohort of secondary sarcomas, including diverse histological subtypes (Table 1), but excluding sporadic tumors. Though most post-radiation tumors showed no distinct transcriptomic pattern, angiosarcomas carried a distinct gene expression profile: all 24 cases clustered together (Fig. 3). Accordingly, we performed gene set enrichment analysis on known MYC targets and observed that they were up-regulated in angiosarcomas but not in other types (P = 1.5e−2; Sup. Fig. 5).

Fig. 3

Unsupervised clustering methods applied to all available post-radiation sarcomas (t-SNE and principal component analysis in the left and right panels, respectively) to segregate tumors based on their transcriptomic profiles. Angiosarcomas show a distinct transcriptomic pattern compared to other histological subtypes. Principal component analysis highlights two angiosarcoma sub-groups, termed ASg1 and ASg2

These 24 angiosarcomas also displayed two different transcriptomic patterns, corresponding to a first group of eight tumors termed ASg1 and a second group of 16 tumors termed ASg2 (Fig. 3). Though 75% of patients in ASg1 died (6 of 8) compared to 37.5% in ASg2 (6 of 16), overall survival did not differ significantly between the groups (P = 1.09e−1; hazard ratio = 2.57 [0.78–8.49]), and the two groups also shared similar clinical characteristics (vital status, age at diagnosis, tumor size, site, cell differentiation, necrosis, number of mitoses, and tumor grade; all P > 0.05; Sup. Table 3). Supervised differential expression analysis between the two groups identified 1554 up-regulated genes in ASg1 and 2211 in ASg2 (Sup. Tables 4 and 5). The main gene ontology associated with ASg1 was protein translation (and associated sub-terms; Q < 1e−10) through the overexpression of most (71%) of the ribosomal proteins (both small and large subunits, including mitochondrial ribosomal proteins). The main gene ontologies associated with ASg2 were cell adhesion, immune response, and regulation of cell communication and migration (all Q < 1e−10). MYC was not differentially expressed between ASg1 and ASg2 (P = 7.03e−1; log2-fold change = −0.19).

The gene ontologies associated with ASg2 prompted a more detailed immune characterization. We took advantage of the CIBERSORT approach (RNA-seq-based) to estimate immune populations in the post-radiation angiosarcomas [36]. Among the 22 immune populations analyzed by CIBERSORT, none was significantly different between ASg1 and ASg2 (Sup. Fig. 6). We consequently studied the immune infiltrate at the cellular level with histochemistry (Sup. Table 3). The ASg2 group was characterized by higher nuclear atypia (P = 1.08e−2), a higher intensity of inflammation (P = 2.45e−2), and a higher immune infiltrate at the tumor periphery (P = 3.95e−3). No significant difference was observed in the immune infiltrate either in the tumor stroma or in direct contact with tumor cells. Finally, we characterized the immune infiltrate at the cell type level.

From a chromosomal standpoint, these two groups had few genomic imbalances except for the 17q24.2-17qter region, which was gained significantly more often in ASg2 (12 cases, 75%) than in ASg1 (two cases, 25%; Fisher’s P = 3.24e−2; Sup. Fig. 7). This specific genomic imbalance did not modify gene expression in this region (among 212 listed protein-coding genes, 21 and 23 were overexpressed in ASg1 and ASg2, respectively).
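The Fisher’s exact test for the 17q gain can be checked from the counts given above (12 of 16 ASg2 tumors vs. 2 of 8 ASg1 tumors). The sketch below is a standard-library implementation of the two-sided test, written for illustration; the function name is an assumption and this is not necessarily the software used in the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables as or less likely than the observed one.
    return sum(
        prob(x) for x in range(lowest, highest + 1) if prob(x) <= observed * (1 + 1e-9)
    )


# Rows: ASg2, ASg1; columns: 17q24.2-17qter gained, not gained.
p_value = fisher_exact_two_sided(12, 4, 2, 6)
print(f"Fisher's P = {p_value:.3g}")  # about 0.0324
```

The result agrees with the reported P = 3.24e−2.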

Discussion

Post-radiation sarcomas represent a clinically challenging group of tumors. They are very aggressive neoplasms with few effective therapeutic options. However, identifying recurrent molecular characteristics holds promise for effective targeted therapy. For example, as KDR encodes one of the VEGF receptors, KDR-altered post-radiation angiosarcomas might be more sensitive to tyrosine kinase inhibitors like sunitinib or sorafenib compared to wild-type KDR tumors [18]. Nevertheless, a better understanding of cancer genomics is needed, particularly in sarcomas with complex genetics, which are associated with massive genome reshuffling and poorly understood oncogenesis [37]. We thus performed a genome-wide characterization of post-radiation sarcomas and related it to its consequences on their transcriptomes to better understand the biology of these tumors.

In sarcomas with complex genetics, we observed that sporadic and post-radiation sarcomas share similar genetic alterations, including RB1, which is lost in nearly 75% of all such tumors. Interestingly, CDKN2A and CDKN2B are lost at a significantly higher frequency in post-radiation than in sporadic sarcomas (71% and 39%, respectively; Fisher’s P = 6.92e−3). The p16 protein (CDKN2A) inhibits the cyclin D/Cdk4 complex (cell cycle pathway) that phosphorylates pRb (RB1), allowing cells to enter the S phase [38]. Consequently, CDKN2A inactivation could substitute for RB1 loss in post-radiation sarcomas, since all but one tumor harbored at least one of the two alterations; it might therefore play a role in the oncogenesis of such tumors by destabilizing cell cycle progression.

In sarcomas with complex genetics, we did not observe any specific transcriptomic pattern in post-radiation sarcomas compared to sporadic ones. First, no recurrent driver fusion genes have been characterized in these two groups [37, 39]. Second, a supervised approach with differential gene expression analyses did not highlight a strong oncogenic pathway in either of the two cohorts. Third, unsupervised approaches with clustering and t-SNE were unable to categorize whole transcriptome profiles either in terms of radiation context or the seven radiation-related signatures [22,23,24,25,26,27,28]. However, the latter two points come with a technical caveat: because the sequencing strategy differed between cohorts, we restricted these analyses to non-overlapping coding genes (see the section “Material and methods”). Nevertheless, more than 16,500 (87%) protein-coding genes were considered in these analyses, without any evidence of batch effects.

We validated the previously described alterations in post-radiation angiosarcomas in our independent cohort. Two cases (8%) presented KDR variants, in accordance with the reported 10% [18]. All but one case (96%) presented MYC amplifications, in accordance with the reported high frequencies [14, 17]. In the two groups of post-radiation angiosarcomas, no association was found with clinical characteristics. Though the overall survival analysis was non-significant (P = 1.09e−1), probably due to the cohort size, ASg1 were highly aggressive sarcomas (6 out of 8 patients died; 25% overall survival) compared to ASg2 (6 out of 16 patients died; 62.5% overall survival). The transcriptomic difference between these two groups led us to characterize the immune infiltrate.

The CIBERSORT approach did not identify differences in terms of cell populations. However, this result is probably biased by the lack of immune cells within the tumor stroma of both groups. Instead, histochemistry revealed a higher intensity of inflammation and a higher immune infiltrate at the tumor periphery (P = 2.45e−2 and 3.95e−2, respectively) in ASg2, the less aggressive group. Consequently, in these two groups of post-radiation angiosarcomas, the involvement of the immune system could explain the differences in terms of vital status. Because the present cohort lacks the statistical power to accurately measure the contribution of the immune system to the clinical course, a larger cohort may provide a better understanding of such sub-groups.

DNA-array is a useful technique for understanding cancer genomics [40]. In particular, high-resolution arrays, such as the CytoScan HD (Affymetrix) used in this analysis, can identify gains and losses of a few kilobases. Whole genome sequencing (DNA-seq) offers the opportunity to precisely detect a wide spectrum of genomic alterations at single base pair resolution. We did not observe a mutational signature specific to post-radiation sarcomas using RNA-seq, so either coding point mutations are secondary events that do not depend on the mechanisms initiating oncogenesis, or these mechanisms are identical in sporadic and post-radiation tumors. This assumption should nevertheless be taken with caution, since our analysis was restricted to point variants (not all of which are actual cancer mutations) in expressed genes (whereas DNA-seq covers the whole genome). Mutational signatures have been identified in secondary malignancies, namely angiosarcomas, osteosarcomas, spindle cell sarcomas, and breast tumors [41]. These tumors (n = 12), compared to a reference cohort of primary cancers (n = 319), harbored significantly more deletions relative to insertions and balanced inversions. Consequently, DNA-seq is a promising tool to better understand the consequences of radiation in the context of cancer therapy by measuring changes in DNA that may clarify their oncogenesis.

Radiation therapies cause DNA damage to normal cells surrounding tumor cells. It is tempting to postulate that such DNA damage may directly alter oncogenic drivers and lead to cancer development. However, secondary tumors rarely arise after such frequently administered treatment. Thus, most DNA damage may consist of passenger mutations, while the driver ones may occur after the cell enters an unstable state. In this context, constitutional variants that allow oncogenic alterations to occur would be good candidates for promoting tumor initiation. A study of constitutional variants in two groups of patients, with and without secondary tumors, using DNA or exome sequencing would address this point.