Globally altered epigenetic landscape and delayed osteogenic differentiation in H3.3-G34W-mutant giant cell tumor of bone

The neoplastic stromal cells of giant cell tumor of bone (GCTB) carry a mutation in H3F3A, leading to a mutant histone variant, H3.3-G34W, as a sole recurrent genetic alteration. We show that in patient-derived stromal cells H3.3-G34W is incorporated into the chromatin and associates with massive epigenetic alterations on the DNA methylation, chromatin accessibility and histone modification level, that can be partially recapitulated in an orthogonal cell line system by the introduction of H3.3-G34W. These epigenetic alterations affect mainly heterochromatic and bivalent regions and provide possible explanations for the genomic instability, as well as the osteolytic phenotype of GCTB. The mutation occurs in differentiating mesenchymal stem cells and associates with an impaired osteogenic differentiation. We propose that the observed epigenetic alterations reflect distinct differentiation stages of H3.3 WT and H3.3 MUT stromal cells and add to H3.3-G34W-associated changes.

T he discovery of mutated histone genes in aggressive cancers raised a lot of interest in the cancer research community due to their ability to globally alter the epigenomic landscape 1 . A frequently mutated histone is the non-canonical histone variant H3.3 2,3 . In contrast to canonical H3.1 and H3.2, the incorporation of histone variant H3.3 is replication-independent and its turnover occurs throughout the cell cycle 4 . Deposition occurs either via the histone regulator A (HIRA) chaperone complex at sites of gene activation or through alpha-thalassemia/mental retardation Xlinked-death domain associated protein (ATRX-DAXX) into heterochromatic regions 5 and the silent allele of imprinted genes 6 . Mouse embryonic stem cells require H3.3 for correct establishment of H3K27me3 patterns at bivalent promoters of developmentally regulated genes 7,8 . While two human genes, H3 histone family member 3A (H3F3A) and 3B (H3F3B), encode for an identical H3.3 protein, oncogenic mutations occur gene-specifically in different tumor types. Lysine-27-to-methionine (H3.3-K27M) and glycineto-arginine or valine substitution (H3.3-G34R/V) in pediatric gliomas 9 , as well as glycine-34-to-tryptophan or leucine (H3.3-G34W/L) substitutions in giant cell tumor of bone (GCTB) 10 have been described, all due to mutations in H3F3A. For H3F3B, mutations leading to a lysine-36-to-methionine (H3.3-K36M) substitution were reported in chondroblastomas 10,11 . The molecular consequence of the H3.3-K27M mutation is a global loss of the repressive chromatin mark H3K27me3 through inactivation of the PRC2 complex [12][13][14] . Similarly, the H3.3-K36M mutation suppresses the deposition of the H3K36me3 mark by interference with histone methyltransferases nuclear receptor binding SET domain protein 2 (NSD2) and SET domain containing 2 (SETD2) 11,15 . Recent findings suggested reduced levels of H3K36me3 and increased levels of H3K27me3 in cis in HeLa cells overexpressing H3.3-G34W 16 . However, the detailed effects of this mutant histone variant on the epigenome are yet to be determined and to be analyzed in patients. GCTB, where this mutation was shown as the sole alteration, offers a unique system to study these effects in primary patient material.
GCTB is a rare locally aggressive bone neoplasm that typically affects the meta-epiphyseal regions of long bones in young adults 17 . These tumors consist of three major cell types: stromal cells originating from mesenchymal stem cells (MSC), multinuclear giant cells and mononuclear histiocytic cells 18 . The GCTB stromal cells show incidence of H3.3-G34W in more than 90% of cases and display markers of both MSC and pre-osteoblast cell populations 10,19 . The neoplastic stromal cell population secretes high levels of the receptor activator of NF-κB ligand (RANKL) and reduced levels of its decoy receptor, osteoprotegerin (OPG), thereby attracting and activating surrounding monocytes. Upon activation, the recruited monocytes fuse to form multinucleated giant cells, which resemble osteoclasts and lead to massive bone destruction 17 .
Here we investigate the effects of H3.3-G34W on global epigenomic patterns in patient samples from four different centers. We find epigenetic distortions that contribute to the phenotypes of GCTB, stochastic genomic instability and increased osteolysis. Furthermore, we demonstrate that neoplastic and non-neoplastic GCTB stromal cells represent distinct stages of osteogenic differentiation. Differentiation-related epigenetic differences add to the overall picture of H3.3-G34W-associated global epigenetic alterations, whereas the differentiation delay is potentially driven by the direct effects of H3.3-G34W. Our findings collectively suggest that the single-residue alteration of H3.3 induces epigenomic changes with implications for the development of stromal cells and the tumorigenic process.

Results
H3K36me is unaltered in H3.3-G34W-expressing stromal cells. Recent biochemical studies have shown that G34 substitutions in H3.3, including G34W, inhibit the activity of the histone methyltransferase SETD2, which is responsible for H3K36me3, in cis (on the same histone tail) 16 . We verified this effect in HEK293T cells stably overexpressing H3.3-G34W or wild-type H3.3 as a control and confirmed in cis effects on H3.3-G34W K36 trimethylation levels ( Supplementary Fig. 1a). We did not observe any in trans effects on endogenous H3 modifications ( Supplementary Fig. 1a), as found for other mutant histones such as H3-K36M.
To specifically test whether these biochemical findings apply to patient samples, we obtained access to GCTB biopsies from four different cohorts (Supplementary Data 1). For the initial characterization of 30 GCTB samples (Supplementary Data 1), we performed immunohistochemical analysis with a H3.3-G34W-specific antibody. Positive staining was observed and validated in 29 of 30 cases (Fig. 1a, b and Supplementary Fig. 1b). The H3.3-G34W-negative case (unified patient identifier, UPI-13) carried a H3F3A (c.103_104GG>TT) mutation encoding a H3.3-G34L substitution that has already been described for GCTB 10 (Fig. 1b, Supplementary Fig. 1c). We established both, neoplastic, H3.  Fig. 1d). In total, we collected 96 tissue samples from 95 different GCTB patients from four different cohorts (Supplementary Data 1) and were able to establish 26 stromal cell lines from 24 different GCTB patients from two cohorts (Fig. 1c). In addition to H3.3 WT cells, we analyzed bone marrow-derived primary MSCs from non-GCTB patients from here on referred to as nontumoral stromal cells (nt-SC) (Supplementary Data 1). We verified the mutational status of the cells using Sanger resequencing, ultra-deep resequencing on the MiSeq platform or wholegenome sequencing (Supplementary Data 1, Fig. 1b, Supplementary Fig. 1c, e). In addition to the common mutation leading to the G34W alteration in H3.3, whole-genome sequencing of seven patient-derived primary H3.3 MUT cell lines (Supplementary Data 1) did not reveal any recurrent genetic alterations (Fig. 1d, Supplementary Fig. 1e, f). In contrast to other malignancies carrying H3F3A mutations, as for example pediatric glioblastoma with co-occurring mutations in tumor protein 53 (TP53) and ATRX 9,20 (Fig. 1e, Supplementary Fig. 1g), and to other bone tumors, e.g., osteosarcoma (Fig. 1f, Supplementary Fig. 1h, i), GCTB showed an extremely low overall mutation frequency for H3.3 MUT and H3.3 WT cells (Fig. 1f, Supplementary Fig. 1e, f). This places GCTB in a unique position to uncover epigenomic alterations linked to H3.3-G34W.
To understand how H3.3-G34W exerts its function in tumor cells, we analyzed protein fractions by Western Blot and found H3.3-G34W incorporated into chromatin ( Supplementary  Fig. 2a). To analyze in cis effects of H3.3-G34W in patient derived cells, we used a specific and verified antibody to map H3.3-G34W and identify sites of H3.3-G34W enrichment in two independent patient cell lines (Supplementary Data 1). We identified high-confidence genomic regions showing H3.3-G34W enrichment ( Supplementary Fig. 2b, c, Supplementary Data 2) and profiled H3K36me3 along with several other histone modifications using chromatin immunoprecipitation (ChIPmentation). Changes of H3K36me3 or H3K27me3 between H3.3 WT and H3.3 MUT stromal cells at sites of H3.3-G34W enrichment were minute and did not recapitulate the strong loss of H3K36me3 in cis observed in HEK293T cells ( Supplementary  Fig. 2b, c). Nucleosomes enriched by H3.3-G34W-ChIP most probably contain H3.3-G34W, as well as wild-type H3.3 potentially masking in cis effects on the H3.3-G34W N-terminal tail. Using Western Blot analysis of whole cell lysates we did not reveal any changes in the total amount of K36 methylated histone H3 indicating that the G34W substitution does not affect methylation of K36 on other histones lacking the mutation, which is in line with the analysis in HEK293T cells in Supplementary Fig. 1a (Supplementary Fig. 2d).
The epigenome of H3.3 MUT cells is globally altered. As the effects of H3.3-G34W on the global level of H3.3K36me3 were insignificant, we tested whether H3.3-G34W might exert its tumorigenic effects through other epigenetic mechanisms. We analyzed several H3. 3 Fig. 1 Initial characterization of GCTB patient samples. a Immunohistochemical staining of primary GCTB tumor resections with a H3.3-G34W-specific antibody (Active Motif) (red). UPI, unified patient identifier. The scale bar exemplarily shown for UPI-30 indicates 500 µm. b Quantification of the mutation at position 103 in the H3F3A gene (c.103G>T) leading to the H3.3 G34W substitution in tumor resections (dark gray) and derived stromal cell lines (light gray) using deep targeted resequencing. VAF, variant allele frequency. c Overview of GCTB tissues and derived stromal cell lines (H3F3A wt in blue, mut in red, non-tumoral stromal cells (nt-SC) in violet) analyzed within this study. IHC, immunohistochemistry; WGS, whole genome sequencing; WGBS, whole genome bisulfite sequencing; 450 K, DNA methylation array. d-f Circos plot of recurrent structural variants in GCTB (d), H3.3-G34R-baring pediatric glioblastoma (PGBM, e) and bone cancer (BOCA-UK, f), cohorts based on whole genome sequencing data. Green lines represent translocations, blue lines deletions, red lines duplications, and black lines inversions. The variant recurrences are represented by bar plots. The outermost layer represents functional small variants (SNVs and small indels). The middle layer represents copy number variations. The innermost layer represents structural variations. All layers are normalized to the compared cohort size. Osteosarcoma cohort was sub-sampled at random to the size of the other two cohorts. See high-resolution versions in Supplementary Fig. 1f mutation (Fig. 1b). This observation prompted us to restrict all our subsequent analyses to H3.3 MUT and H3.3 WT cells in order to obtain a clean view of H3.3-G34W-associated epigenetic alterations in pure cell populations. To further a detailed analysis of the epigenome, we profiled DNA methylation at single CpG resolution using whole genome bisulfite sequencing (WGBS), analyzed chromatin accessibility with the assay for transposaseaccessible chromatin using sequencing (ATAC-seq) and analyzed global distribution of several histone marks (H3K4me1, H3K4me3, H3K9me3, H3K27me3, H3K27ac, H3K36me3) to investigate their potential redistribution. Since the initial methylation analysis showed high homogeneity within the mutant and wild-type groups and culturing primary tumor cells did not yield sufficient material for a comprehensive omics profiling of each line, we analyzed H3. In order to identify loci with more pronounced epigenetic changes, we used a genome-wide approach to stratify the observed differences from all epigenetic layers using the ENCODE functional annotation of the MSC epigenome 21 . We observed profound differences in several groups of chromatin states including heterochromatic and Polycomb-repressed states, as well as bivalent domains (Fig. 2f). Heterochromatic regions were noticeably hypomethylated while bivalent domains accumulated a whole range of alterations, including gain of DNA methylation, loss of chromatin accessibility, as well as loss of several histone modifications (H3K27me3, H3K4me3 etc). Collectively, we conclude that the epigenome of H3.3 MUT cells shows significant and reproducible differences to that of H3.3 WT stromal cells.
Hypomethylation in megabase domains and genomic instability. We first followed up changes associated with loss of DNA methylation. In contrast to the homogeneity of DNA methylation profiles observed within each of the two groups ( Supplementary Fig. 3a), DNA methylation was non-uniformly altered between H3.3 WT and H3.3 MUT cells across the entire genome (Fig. 3a). The majority of significantly changed loci showed profound reduction of DNA methylation in H3.3 MUT cells, while a minority was hyper-methylated ( Fig. 3a, Supplementary Fig. 3b). Overall, H3.3 MUT cells exhibited a 20% genome-wide reduction of DNA methylation (Fig. 3b, Supplementary Fig. 3c). To systematically characterize the abundant DNA methylation changes, we segmented the genome into large methylation domains (LMD I to IV) with sizes >20 kb (Supplementary Fig. 3d, Supplementary Data 2) using a combination of breakpoint analysis and clustering (see Methods for details). The majority of regions (predominantly LMDs III and IV) matched the criteria for partially methylated domains i.e., megabase-scale domains of predominantly repressive chromatin with low gene density 22 . Each LMD had a distinct level of DNA methylation, a discrete pattern of histone marks and gene density, suggesting different functional roles (Fig. 3c-e). For instance, LMD III showed enrichment of H3K27me3, while LMD IV was associated with H3K9me3, suggesting an association with facultative and constitutive heterochromatin (Fig. 3e). Furthermore, LMD III domains were often detected as flanking to LMD IV. LMDs I to IV showed decreasing average methylation levels starting from over 0.75 down to less than 0.25 (Fig. 3c). Lowly methylated LMDs III and IV which could be characterized as heterochromatic ( Fig. 3e) showed the most pronounced demethylation comparing H3.3 MUT cells to H3.3 WT cells. We concluded that global DNA methylation alterations in GCTB are non-uniformly distributed along the genome, and that the most pronounced global changes take place in large-scale domains associated with facultative and constitutive heterochromatin, roughly corresponding to LMDs III and IV.
We next sought to associate hypomethylation with other epigenetic alterations. Consistent with the global loss of DNA methylation, we observed increased chromatin accessibility in the H3.3 MUT cells with approximately 1.5-times more gained than lost differential ATAC-seq peaks (Fig. 3f, Supplementary Data 3). ATAC peaks gained in H3.3 MUT cells were overrepresented in the LMDs III and IV (Fig. 3g) indicating a more open state in respective genomic segments. These LMDs also showed strong reduction in the heterochromatic histone mark H3K9me3 (Fig. 3h). Examples of changes in different LMDs can be seen in Fig. 3i. The ATAC peaks gained in H3.3 MUT showed a significant overlap with repetitive elements such as long intersperse nuclear elements (LINE), short interspersed nuclear elements (SINE) and long terminal repeat (LTR) elements which are normally silenced by DNA methylation ( Supplementary  Fig. 3e, Supplementary Data 3). Looking at other known repetitive regions, such as centromeres and telomers, we found them to be affected by the genome-wide hypomethylation ( Supplementary Fig. 3f). Multicolor fluorescence in situ hybridization (m-FISH) analysis found H3.3 WT cells to have a normal karyotype, while H3.3 MUT cells in contrast displayed different, non-recurrent centromeric fusions which could be a potential consequence of the heterochromatin defects described above ( Supplementary Fig. 3g). We conclude that H3.3-G34W associates with heterochromatic defects that potentially contribute to a genomic instability previously described as characteristic for GCTB 23 .
Isogenic cells recapitulate H3.3 MUT DNA hypomethylation. As H3.3 MUT stromal cells showed the strongest epigenetic differences to H3.3 WT stromal cells on the DNA methylation level, we expected DNA methylation to play a major role in stromal cell transformation leading to GCTB. To verify that the observed epigenetic changes were dependent on H3.3-G34W expression, we aimed to recapitulate the findings in an unrelated cell line system. To this end, we introduced the H3.3-G34W encoding mutation into HeLa cells by targeting the endogenous H3F3A locus as earlier described ( Supplementary Fig. 4) 24 . Individual iso-H3.3-WT and iso-H3.3-G34W clones were isolated, of which four iso-H3.3-WT and four iso-H3.3-G34W monoclonal lines and one parental HeLa cell line were subjected to HumanMethylationEPIC DNA methylation analysis (Fig. 4a). The iso-H3.3-G34W samples showed similar alterations as those found when comparing H3.3 MUT and H3.3 WT stromal cells of GCTB ( Fig. 4a-e). Principal component analysis of the most variable methylation probes clustered iso-H3.3-WT clones together while iso-H3.3-G34W clones dispersed off, indicating changes in DNA methylation in iso-H3.3-G34W but not WT isogenic cell lines (Fig. 4a). Similar to primary GCTB cells, the largest principal component (PC1, 31% of variance explained) captured widespread hypomethylation in iso-H3.3-G34W accompanied by focal hypermethylation events (Fig. 4a).
Clustering analysis further enforced the distinct difference between iso-H3.3-WT and G34W clones (Fig. 4b). A scatter plot of all probes indicated that iso-H3.3-G34W cells showed predominantly hypomethylation as also shown for H3.3 MUT cells from GCTB (Fig. 4c). We found 9047 differentially methylated probes (Δ β-value>0.2 and FDR < 0.05) of which 5688 (63%) were hypomethylated in iso-H3.3-G34W cells (Fig. 4d). These data recapitulate the bias towards hypomethylation found in H3.3 MUT stromal cells in comparison to H3.3 WT stromal cells (compare Fig. 3a, b), and confirm that the changes in DNA methylation are associated with H3.3-G34W. CpG sites that lost methylation were more frequently localized in intergenic regions and promoter-proximal exons, while hypermethylated ones were in addition moderately enriched at promoters and depleted at  Fig. 3b) observed in H3.3 MUT cells. A specific example for a hypomethylated region in iso-H3.3-G34W cells is the RANKL locus showing hypomethylation at exon 3, a potential alternative promoter element (Fig. 4f). This is in line with the methylation differences observed between H3.3 WT and H3.3 MUT stromal cells at the same locus (Fig. 5a). RANKL signaling has been extensively studied in GCTB 25 . The expression of RANKL, a master regulator of osteoclast differentiation 26 , has been shown to be upregulated in GCTB stromal cells causing an osteolytic phenotype 25 . We verified the increased expression of RANKL (TNFSF11) in H3.3 MUT cells (Fig. 5b, Supplementary  Fig. 5a) and additionally observed decreased expression and secretion of its decoy receptor Osteoprotegerin (OPG, TNFRSF11B) (Fig. 5c, Supplementary Fig. 5a, b). Similar expression patterns were already described by us earlier 24 . The OPG locus showed decreased levels of the active histone marks H3K4me3 and H3K27ac potentially indicating a missing activation by a transcription factor (Supplementary Fig. 5c). One known transcription factor of OPG is the early B-cell factor 2 (EBF2) 27,28 which is a bivalent gene (Fig. 5f) and belongs to the key regulators of osteogenic differentiation in mice 29 . OPG became downregulated after siRNA mediated EBF2 knockdown, confirming a role of EBF2 in OPG expression in stromal cells (Fig. 5d, Supplementary   Fig. 5d). The EBF family is a conserved group of four transcription factors. Our RNA-seq analysis found EBF2 and EBF3 to be differentially expressed between H3.3 WT and H3.3 MUT stromal cells (Fig. 5e, Supplementary Fig. 5e). EBF3, previously reported as a tumor suppressor in glioblastoma 30,31 , showed reduced expression in H3.3 MUT cells whereas EBF2 expression was almost completely lost. We found the EBF2 locus to be hypomethylated with a focal hypermethylation around the promoter region (Fig. 5f). Increased H3K9me3 and H3K27me3 levels and decreased levels of H3K27ac supported a repressed state of EBF2. Lost ATAC signals in H3.3 MUT cells indicated differentially closed chromatin. These findings link the H3.3 G34Wassociated epigenetic dysregulation of EBF2 expression to the osteolytic phenotype of GCTB.
Taken together, we showed that the stable introduction of H3.3-G34W into a GCTB unrelated cell line recapitulates the DNA hypomethylation trend seen in GCTB stromal cells. Furthermore, we could detect epigenetic alterations that directly and indirectly affected the expression of key regulators of bone metabolism shedding novel light on the emergence of the osteolytic phenotype in GCTB.  (Fig. 2f), which are well known sites of H3.3 deposition 5,32 . Changes included gain of DNA methylation which coincide with decreased accessibility, decrease of H3K27me3 and increased levels of H3K36me3 (Fig. 2f). Most promoterassociated ATAC peaks that were lost in H3.3 MUT cells overlapped with bivalent promoters (Supplementary Fig. 6a). As a consequence of epigenetic disturbance, bivalent genes comprised a significant portion of differentially expressed genes both upregulated and downregulated in H3.3 MUT cells ( Supplementary  Fig. 6b) (Fig. 6a). Accordingly, an overlap of the differentially expressed genes from both experiments showed that the majority of matches were upregulated during differentiation and downregulated in H3.3 MUT cells (Supplementary Fig. 6f). Examples included insulin growth factor 2 (IGF2) and leptin (LEP) previously implicated in osteogenic differentiation 34,35 (Fig. 6a). A global principal component analysis confirmed a less ALP activity whereas only some H3.3 MUT cells showed ALP activity (Fig. 6c). Quantification of the ALP activity relative to viability as a surrogate for cell count confirmed reduced ALP activity for H3.3 MUT stromal cells (Fig. 6d). Impaired differentiation of GCTB stromal cells was suggested earlier based on the analysis of histological markers 36 and transcriptomic profiling 37 . We performed an osteogenic differentiation of H3.3 MUT   (Fig. 6c, d), we noticed that H3.3 MUT stromal cells lagged behind and did not achieve the level of ALP activity reached by H3.3 WT stromal cells (Fig. 6c, d). We concluded that H3.3-G34W associates with changes in bivalent regions and that their deregulation potentially has an effect on osteogenic differentiation. H3.3 WT and H3.3 MUT stromal cells therefore differ in their state of osteogenic differentiation in GCTB patients and this difference contributes to the herein found epigenetic alterations (Fig. 6e).

Discussion
Mutant epigenetic regulator proteins, including histones, have been described in multiple cancer types 2 . Despite the fact that oncogenic mutations in histones influence the epigenetic landscape, deep insight into the mechanistic ramifications to cancer initiation is still incomplete. Several studies investigated the mechanism of how the G34W substitution in histone variant H3.3 affects the epigenetic machinery responsible for its posttranslational modifications. To this end, reduced H3K36 methylation and increased H3K27me3 in cis were described in model systems and verified by us in this work 16,38 . However, whether these effects can be found in patients and whether epigenetic changes are involved in tumorigenesis of GCTB was not known so far. In this paper, we therefore analyzed GCTB tissue, as well as patient-derived primary stromal cells from four different centers to investigate H3.3-G34W-associated epigenetic changes and their contribution to the GCTB neoplasia. The high incidence of H3.3-G34W in GCTB in a largely unaltered genomic context leaves GCTB as a suitable system to study histone mutationdriven tumorigenesis. We found that, despite that H3.3-G34W is incorporated into chromatin, its presence did not lead to changes in the global amount of H3K36me3 as shown for other substitutions affecting H3.3 lysine 27 and lysine 36 11,13,39 . This ruled out a possible in trans effects of H3.3-G34W on histone posttranslational modifications. Using ChIP sequencing, we identified high confidence H3.3-G34W enrichment sites in patient cell lines and analyzed common histone marks at these regions. No pronounced differences were found that could recapitulate the in cis effects on H3.3 lysine methylation observed in HEK293T cells. The absence of clear-cut in cis effects upon chromatin modifications could be explained by an inability to distinguish wild-type and mutant H3.3 in GCTB samples by western blot analysis. In ChIP analysis, nucleosomes of mutant cells could include both, wild-type, as well as mutated H3.3 obliterating possible in cis effects. Moreover, antibodies against histone modifications cannot distinguish between H3 variants and the overall small fraction of the mutant H3.3 (25% of overall H3.3) in the total H3 pool makes analysis of H3.3 modifications challenging.
We show that H3.3-G34W, the sole recurrent alteration of GCTB, associates with large-scale differences in multiple epigenetic marks. H3.3 MUT stromal cells show heterochromatic defects, specifically reduced genome-wide levels of DNA methylation and gained accessibility at heterochromatic regions which is in line with previous reports on pediatric glioblastomas with the G34R substitution in H3.3 39 . These changes could potentially contribute to the genomic instability described for GCTB 23 . We furthermore observed localized changes at bivalent regions, some of the same sites previously described as targets of H3.3 deposition 5,7 . DNA methylation changes could be recapitulated using an isogenic system verifying effects on the methylome to be H3.3-G34W-specific. Moreover, we show H3.3-G34W-associated methylome changes to directly or indirectly affect the expression of the main players of bone metabolism, RANKL and OPG. These results connect the H3.3-G34W-associated epigenetic dysregulation to the osteolytic phenotype, a hallmark of GCTB.
While the epigenomic differences between H3.3 WT and H3.3 MUT stromal cells can be partially ascribed to direct effects of H3.3-G34W, other alterations observed can be explained by the fact that H3.3 MUT and H3.3 WT stromal cells represent distinct differentiation stages. Assuming epigenetic reprogramming during differentiation as demonstrated in recent studies 40,41 , this could also contribute to the epigenetic differences seen between H3.3 WT and H3.3 MUT cells. While expression profiles from H3.3 MUT cells resemble precursor states of osteoblasts, the H3.3 WT cells resemble more mature osteoblasts, suggesting impaired or delayed differentiation of H3.3 MUT cells as already suggested by 42 . A weak differentiation signature is in line with previous reports suggesting that neoplastic GCTB stromal cells are expressing markers of an early osteoblastic differentiation 43 . Genes such as IGF2 34 , LEP, and stathmin-like 2 (STMN2), found to be downregulated in H3.3 MUT stromal cells, were described to play a role in mesenchymal stem cell differentiation and osteoclastogenesis in mice, in promoting bone differentiation 35 and as a marker for osteogenesis 44 , respectively. Consistently, H3.3 was found to be closely associated with differentiation processes 7,45,46 . The impairment of differentiation of H3.3 MUT stromal cells could be mediated by H3.3-G34W-associated downregulation or prevention of induction of genes upregulated during osteogenic differentiation through a yet unknown mechanism. Epigenetic repression of bivalent genes and dysregulation of PRC2 targets described in this study could potentially contribute to the observed phenotype. H3.3 MUT stromal cells are still able to undergo osteogenic differentiation, as it was also found in studies not separating H3.3-mutant neoplastic from H3.3-wild-type stromal cells of GCTB tissue 47,48 .
Direct effects of H3.3-G34W upon the epigenome could be confirmed in isogenic HeLa cells. The effects on DNA methylation were less pronounced in HeLa cells compared to the alterations observed comparing H3.3 WT and H3.3 MUT stromal cells, potentially reflecting the relatively small passage number after the introduction of the H3F3A mutation (approximately 20 passages) in the HeLa cells compared to primary cells. Another explanation may be that DNA methylation differences between H3.3 WT and MUT stromal cells largely reflect differentiationassociated epigenomic alterations shown to occur during osteogenesis 41 as discussed above. A detailed analysis of H3.3-G34Winduced effects on the epigenome and their connection to impeded osteogenic differentiation will require systems capable of reproducing the differentiation context in vitro or in vivo, such as MSC-derived immortalized cell lines or animal models. Alternatively, detailed transcriptomic and epigenomic maps of the osteogenic differentiation can be instrumental for delineating H3.3-G34W-induced and differentiation-related alterations. Furthermore, meticulous computational approaches, e.g., 49 , may help disentangle differentiation-related from cancer-specific epigenetic alterations.
Taken together, our results suggest that the molecular mechanism behind H3.3-G34W-induced epigenomic alterations and neoplastic transformation in GCTB is different to the action of other mutant histone variants such as H3.3-K27M and H3.3-K36M, where the mutated amino acid is a direct target for histone modification 11,13,15 . This suggests that the H3.3-G34W substitution alters the N-terminal interactome as previously described 24 , recruiting novel binders or disrupting interactions with binders. The described DNA methylation changes can be indicative of H3.3-G34W affecting either the binding or the activity of enzymes writing (DNA methyltransferases, DNMTs) or erasing (ten eleven translocators, TETs) this modification. On the structural side it is known that DNMT3a/b recognize H3K36me3 with their PWWP-domains 50,51 . The G34W substitution might interfere with binding of DNTM3a/b PWWP preventing proper establishment of DNA methylation patterns. The thereby induced remodeling of the epigenome associated with H3.3-G34W, in particular changes at heterochromatic and bivalent regions, could contribute to an impaired osteogenic differentiation and the phenotypes of GCTB: stochastic chromosomal rearrangements and increased RANKL signaling. Precise sequences of molecular events leading to these phenotypes will be a matter of further studies. Isolation of nt-SC cell lines from the bone marrow. Nt-SCs were isolated from fresh bone marrow samples derived from the iliac crest under the approval of the Ethics Committee of the University of Heidelberg. Bone marrow cells were purified on a Ficoll-Paque™ Plus density gradient (GE Healthcare, München, Germany), washed in PBS and treated with erythrocyte lysis buffer (0.154 M NH 4 Cl, 10 mM KHCO 3 , 0.1 mM EDTA) to remove erythrocytes. The nt-SC-enriched fraction was seeded and cultured in DMEM high-glucose supplemented with 12.5% FCS (Biochrom), 2 mM L-glutamine, 100 U ml −1 penicillin, 100 μg ml −1 streptomycin (Lonza), 4 ng/ml basic fibroblast growth factor (Merck Chemicals GmbH, Darmstadt, Germany, 50 μM 2−mercaptoethanol and 1% non-essential amino acids (Invitrogen, Karlsruhe, Germany). After 48 h, cultures were washed with PBS to remove non-adherent material. During expansion, medium was replaced twice a week.
Immunohistochemistry. For immunohistochemical detection of the H3.3-G34W mutation formalin fixed paraffin embedded GCT tissue sections were deparaffinized in Roti-Histol (Carl Roth GmbH, Karlsruhe, Germany) and rehydrated in isopropanol. Antigen retrieval was performed using Dako target retrieval solution pH 6 (Dako, Hamburg, Germany) for 5 min at 121°C in a pressure cooker. Sections were blocked for 30 min at room temperature using PBS supplemented with 5% bovine serum albumin (BSA). The primary rabbit anti-H3G34W antibody (Active Motiv, Carlsbad, USA) was diluted 1:1000 in PBS/1% BSA and incubated over night at 4°C. The signal was amplified using the BrightVision +Poly-AP kit (VWR, Darmstadt, Germany) according to the manufacturer´s instructions. Samples were counterstained with hematoxylin (Carl Roth GmbH) and mounted using Neo-Mount (Merck, Darmstadt, Germany). All used antibodies are listed in Supplementary Table 1.
Lineage verification by flow-cytometric analysis. Stromal cells were harvested by trypsination and transferred to 1,5 ml Eppendorf in PBS. Cells were washed with PBS 2% FCS and centrifuged for 5 min at 250×g at 4°C. The antibody staining cocktail was prepared in PBS 2% FCS and cells were stained 20 min at 4°C (all antibodies are specified in the Supplementary Table 1; dilutions: anti-CD45, anti-CD235, anti-CD105 1:50; anti-CD90 1:100). Afterwards cells were washed with PBS 2% FCS to remove unbound antibodies and centrifuged at 250×g for 5 min. To exclude dead cells during flow cytometric analysis, propidium iodide was added prior to flow analysis. Data were acquired on a BD FACSAria Fusion (Beckton, Dickinson and Company (BD)) and data were analyzed using FlowJo (BD). Prism (GraphPad Software) was used to generate bar graphs. Hematopoietic/erythroid cells were defined as CD45+CD235+, CD45−CD235− CD105+and CD45 −CD235− CD90+ cells were defined as mesenchymal cells. Freshly frozen iliac crest bone marrow aspirates from healthy donors were used as healthy controls.
OPG expression quantification with ELISA. 5000 cells were seeded into a 96-well plate and cultivated in 100 µl media for two days. Enzyme-linked immunosorbent assay (ELISA) was performed with the Abcam human Osteoprotegerin ELISA Kit following the manufacturer´s instructions.
Detection of the H3F3A-G34W mutation by mutation-specific PCR. Genomic DNA was isolated using the Quick DNA Miniprep kit (Zymo research, Freiburg, Germany) according to the manufacturer's protocol. PCR amplification was performed using H3F3A wild-type and H3F3A-G34W specific primer, respectively. The reaction consisted of 2 U Platinum Taq polymerase (Thermo Fisher Scientific, Dreieich, Germany), 0.6 µl MgCl 2 (50 mM), 0.4 µl dNTPs (10 mM each), 0.5 µl of each primer (10 µM) and 100 ng genomic DNA as template in a total volume of 20 µl. Samples were incubated at 94°C for 3 min followed by 34 cycles of denaturation at 94°C for 15 s, annealing at 66°C for 20 s and extension at 72°C for 30 s and a final extension step at 72°C for 7 min. PCR products were separated on a 1.6% agarose gel, visualized by Midori Green (Biozym, Hessisch Oldendorf, Germany) and imaged. All primers are listed in Supplementary Data 5.
Deep targeted resequencing using MiSeq. Deep resequencing of H3F3A amplicons was performed according to 52 . DNA was extracted using the Qiamp Mini Kit (Qiagen, Hilden, Germany), according to the manufacturer´s instructions. The PCR reaction to amplify the mutation spanning region of the H3F3A gene consisted of 0.35U HotStart Q5 polymerase (NEB, Ipswich, USA), 0.6 µl dNTPs (10 mM each, Fermentas, St. Leon-Rot, Germany), 0.3 µl of each primer (10 µM, Sigma-Aldrich, Taufkirchen, Germany) and 25 ng genomic DNA as template in a total volume of 25 µl. Samples were incubated at 98°C for 1 min followed by 33 cycles of denaturation at 98°C for 10 s, annealing at 61°C for 30 s and extension at 72°C for 20 s, and a final extension step at 72°C for 2 min. Primer sequences are listed in Supplementary Data 5. Primers included a sequence complementary to the primers used for library preparation. Samples were separated on a 1.2% agarose gel and DNA was visualized by Ethidium Bromide. Gel extraction was performed using the Gel extraction Kit (Qiagen, Hilden, Germany). Libraries were prepared using 12. Slides were denatured in 70% formamide/1× SSC/15% dextran sulfate for 2 min at 72°C. Hybridization mixture consisting of 50% formamide, 2× SSC, Cot1-DNA, and labeled DNA probes was denatured for 7 min at 75°C, preannealed for 20 min at 37°C, and hybridized to the denatured metaphase preparations. After 48 h incubation at 37°C slides were washed at room temperature in 2× SSC, 3 × 5 min, followed by 2 × 5 min in 0.2% SSC/0.2% Tween-20 at 56°C. For indirect labeled probes, a immunofluorescence detection was carried out. Biotinylated probes were visualized using three layers of antibodies: (1)  Chromatin fractionation. Cell pellets were resuspended in lysis buffer (10 mM HEPES pH 7.6, 10 mM KCl, 0.05% NP40) with protease inhibitors and incubated on ice for 30 min. Samples were centrifuged at 16,000×g at 4°C for 10 min and the supernatant was taken as cytosolic fraction. Leftover pellet was further lysed with low salt buffer (10 mM Tris HCl pH 7.5, 3 mM MgCl 2 ) supplemented with 1% Triton X-100 for 15 min on ice and centrifuged. Supernatant was taken as nuclear proteins. Leftover pellet was resuspended in 0.2 M HCl, incubated on ice for 20 min and centrifuged. The supernatant was neutralized with 1 M Tris HCl buffer (pH8) and used as the chromatin fraction.
Quantification of ALP activity. 50,000 cells were seeded in 24-well plates. When confluent media was changed to differentiation media (compare Differentiation and alkaline phosphatase staining) and the first measurement was performed. After CTB as described above, wells were PBS washed and fixated for 60 s with 4% PFA in PBS. After PBS washing 500 µl Alkaline Phosphatase Yellow (pNpp) Liquid Substrate (System for ELISA, Sigma-Aldrich, USA) was added and incubated at 37°C for 8 min. Absorbance was measured at 405 nm.
Whole-genome sequencing and analysis. Read pairs were mapped to the human reference genome (build 37, version hs37d5), using bwa mem (v. 0.7.8) 53 with minimum base quality threshold set to zero [-T 0] and remaining settings left at default values, followed by coordinate-sorting with bamsort (with compression option set to fast REF1) and marking duplicate read pairs with bammarkduplicates (with compression option set to best 56 ); both are part of biobambam package (v.0.0.148) 57 . Somatic SNVs were identified with the DKFZ SNV-calling workflow 58 . Somatic SNVs and indels in matched tumor normal pairs were identified using the DKFZ core variant calling workflows of the ICGC Pan-cancer Analysis of Whole Genomes (PCAWG) project (https://dockstore.org/containers/quay.io/ pancancer/pcawg-dkfz-workflow). Tumor and matched control samples were analyzed by Platypus (v. 1.0) 59 to identify indel events. SNVs and indels from all samples were annotated using ANNOVAR (v. 2017Jul16) 56 according to GEN-CODE gene annotation (v. 19) and overlapped with variants from dbSNP10 (build 141) and the 1000 Genomes Project database. Genomic structural rearrangements were detected using SOPHIA (v.34.0) (https://bitbucket.org/utoprak/sophia/src) as described in ref. 60 . Briefly, SOPHIA uses supplementary alignments as produced by bwa mem as indicators of a possible underlying SV. SV candidates are filtered by comparing them to a background control set of sequencing data obtained using normal blood samples from a background population database of 3261 patients from published TCGA and ICGC studies and both published and unpublished DKFZ studies, sequenced using Illumina HiSeq 2000 (100 bp), 2500 (100 bp), and HiSeq X (151 bp) platforms and aligned uniformly using the same workflow as in this study. An SV candidate is discarded if (i) it has more than 85% of read support from low quality reads; (ii) the second breakpoint of the SV was unmappable in the sample and the first breakpoint was detected in 10 or more background control samples; (iii) an SV with two identified breakpoints had one breakpoint present in at least 98 control samples (3% of the control samples); or (iv) both breakpoints have less than 5% read support. Statistics over SVs for 9 samples with matched control and integrated variant analysis over all samples were based on SOPHIA calls. Allele-specific copy-number aberrations were detected using ACEseq (v. 1.0) 61 . SVs called by SOPHIA were incorporated to improve genome segmentation.
Whole-genome sequencing data of bone cancer and glioblastoma. WGS data of 73 bone cancer patients (BOCA cohort) were obtained via the Pan-cancer Analysis of Whole Genomes Project 62 . Pediatric glioblastoma data (PGBM cohort) were obtained from PedBrain Tumor Project of the International Cancer Genome Consortium 20 . The processing and variant calling (SNVs, SVs, CNVs) was performed consistently with the GCTB data as described above. To balance sample sizes for the analysis of recurrent genetic events, in addition to analyzing the complete BOCA cohort ( Supplementary Fig. 1i), we draw several random 10patient subsets, one of which is shown in Fig. 1d. From the PGBM cohort only 10 samples with H3.3-G34R mutation were selected for the analysis. Patient identifiers of all WGS samples are given in Supplementary Data 1.
ATAC-sequencing and analysis. Libraries for ATAC-sequencing were prepared in accordance with the original protocol with minor modifications 63 . 50,000 cells were lysed by 1% NP40 and PBS washed. After centrifugation at 1200×g for 10 min, tagmentation at 55°C for 8 min in a reaction mix with 2.5 µl of TDE1 (Nextera Illumina DNAKit), 25 µl Tagmentation buffer (Nextera Illumina DNA Kit) and 22 µl water took place. Reaction was stopped by adding 10 µl Guanidium (5 M) and samples were purified using 72 µl Ampure Beads. Libraries were generated using NEBNext High Fidelity PCR Mix and sequenced on the Illumina HiSeq 2000 platform. Sequencing reads were adapter-trimmed using cutadapt (v. 1.10) 64 . Genomic alignments were performed against the human reference genome (hg19, NCBI build 37.1) using Bowtie2 (v. 2.3.0) 65 . The non-default parameters -q 20 -s were used. PCR duplicates were removed by Picard MarkDuplicates (v. 1.125). Signal tracks were generated using deepTools (v. 2.3.3) 66 . A compatible CWL-based ATAC-seq data processing workflow is available at https://github.com/ CompEpigen/ATACseq_workflows. Peaks were called using Macs2 (v. 2.1.1.) 67 with the parameters -nomodel -shift -50 -extsize 100 -qvalue 0.01. All peaks were merged to create a common bed file with read counts before differential analysis using edgeR (v. 0.3.16) 68 . Gene annotations were made using ChIPpeakAnno (v. 3.18.0) 69 . Transcription binding motif analysis was performed using HOMER (v. 4.9) 70 . Motifs with a P-value <0.01 and a ratio of motif to background above 1.1 were defined as significantly enriched.
ChIP-sequencing and analysis. We used the ChIP-mentation protocol 71 to map the genomic distribution of WT and G34W H3.3, total H3, as well as 6 histone modifications (H3K4me1, H3K4me3, H3K9me3, H3K27ac, H3K27me3, and H3K36me3) in a subset of samples (see details in Supplementary Data 1). To confirm the specificity of the H3.3 and H3.3 G34W antibody used for ChIP-Seq analysis, we performed a validation experiment with a Histone code peptide array (JPT, Berlin, Germany) containing short peptides that densely cover most of the known histones and their modifications. We did not observe any significant binding of the G34W-speicifc antibody to H3.3 peptides and a highly specific binding of the H3.3 antibody. Cells of a confluent T175 flask were harvested by TrypLE incubation After centrifugation for 5 min at 1000 xg, cells were washed in PBS, distributed into 750.000 cell aliquots and pelleted again. The cell pellet fixed in 8 ml 1% FA in PBS at room temperature for 10 min. Crosslinking was stopped by the addition of 400 μl 2.5 M glycine. Cells were pelleted, resuspended in 1 ml ice cold PBS supplemented with 1x protease inhibitor and again pelleted at 3000×g for 3 min at 4°C. The supernatant was taken off and the pellet was resuspended in 130 μl FL buffer supplemented with 2× protease inhibitor and transferred to an AFA Covaris 130 μl tube. For nuclei isolation, cells were sheared with a duty factor of 2.5, 200 burst and 40 watts for 3-10 min using the Covaris M220. The nuclei were spinned down for 3 min at 4°C. The nuclei pellet was resuspended in 130 μl shearing buffer and shearing with a duty factor of 5, 75 watt and 200 burst was performed for 10-12 min with the Covaris M220. 100-200 ng sheared DNA were filled up to 200 μl with dilution buffer and mixed with 1 µg antibody and rotated over night at 4°C. The storage solution of 20 μl Protein A beads per IP was taken off using a magnet and the beads were washed with 500 μl PBS supplemented with 0.1% BSA. To block the beads, they were resuspended in 150 μl PBS with 0.1% BSA and also rotated over night at 4°C. The next day, the PBS was taken off from the beads with the help of a magnet and beads were taken up in 20 μl dilution buffer. The bead-dilution buffer mixture was added to the sheared DNA-antibody mixture and rotated for 2 h at 4°C. The supernatant of the mixture was taken off with the help of a magnet and the bead-bound chromatin antibody conjugate was washed twice with 500 μl WB1, once with WB2 and once with WB3. Two more washing steps with 10 mM Tris HCl ph8 took place. Beads were then resuspended in 30 μl tagmentation mix (15 μl tagmentation buffer, 14 μl water, 1 μl TDE1 (Nextera Illumina DNA Kit)) and incubated for exactly 10 min at 37°C. The tagmentation mix was removed and beads were washed twice with 500 μl Tn5 buffer. DNA elution took place by incubation of the beads in 100 μl ChIP-mentation elution buffer supplemented with 2 μl proteinase K for 2 h at 55°C and 8 h at 65°C. The supernatant containing the DNA was mixed with 200 μl AMpure beads for purification. For elution, beads were mixed with 26 μl water and incubated for 3 min at room temperature. Beads were collected with a magnet and the supernatant was collected as it contained the DNA with added adapters. Barcode amplification was performed with 24 μl sample. 0.8 μl universal Tn5 fwd primer and 0.8 μl barcode primer, both complementary to the transposase added adapters, as well as 25 μl NEB Next High Fidelity 2× mix and 0.3 μl 100× Sybr green. Samples were incubated at 72°C for 5 min for gap repair followed by 30 s incubation at 98°C. Cycles of denaturation at 98°C for 10 s, annealing for 30 s at 63°C and extension at 72°C for 30 s were repeated until the amplification curve almost reached saturation but for maximum 16 cycles. 70 μl AMpure beads were added to the PCR mixture for purification. For elution, beads were mixed with 20 μl elution buffer and incubated for 3 min at room temperature. Beads were collected with a magnet and the supernatant was collected as it contained the library. The detailed composition of all buffers is described in the original publication. Up to 4 libraries were equimolarly pooled and sequenced in a 125 bp paired end run on a HiSeq machine. Sequence reads were preprocessed using cutadapt (v. 1.10) 64 and aligned with Bowtie2 (v. 2.3.0) 65 with the default command line options. A compatible CWLbased ChIP-seq data processing workflow is available at https://github.com/ CompEpigen/ChIPseq_workflows. We used deepTools (v. 2.3.3) 66 with non-default options -binSize 10 -extendReads 400 -normalizeTo1x 2451960000 -ignoreForNormalization chrX to quantify genomic coverage in fixed-size window intervals for meta-plots and heatmaps.
Calling of H3.3-G34W enriched regions. We used the Poisson test-based binarization module of the ChromHMM software (v. 1.18) 72 to generate 200 bp windows with statistically significant enrichment of the wild-type H3.3 or H3.3-G34W signal over a simulated background. Adjacent windows were merged to generate primary enrichment regions. The regions were filtered against a union of the ENCODE ChIP-seq blacklists.
Whole-genome bisulfite sequencing and analysis. WGBS Libraries were prepared using the TruSeq DNA PCR-Free LT Library Preparation Kits with partially modified steps in fragmentation and bead clean up/size selection. Briefly, 2 µg genomic DNA (diluted in nuclease-free water) was fragmented to 150-200 bp using a Covaris ultrasonicator (Covaris, Inc.) and quality checked using Agilent TapeStation (Agilent Technologies). The fragmented DNA sample was diluted with water and split into two aliquots of~1 µg in 50 µl, end-repaired and purified using 1.6X sample purification beads. Adenylation of 3′ ends and ligation of TruSeq LT adapters were performed as described in manufactures protocol. Then, adapterligated fragment libraries were treated with bisulfite using the EpiTect Bisulfite Kit (Qiagen) following the instructions in the Illumina WGBS for Methylation Analysis Guide. After bisulfite conversion the fragment libraries were enriched with 8 cycles of PCR using KAPA HiFi Uracil+ DNA Polymerase with customized primer (Supplementary Data 5) and an annealing temperature of 69°C according to the settings for PE libraries in the technical Data Sheet (KAPA HiFi HotStart Uracil+ Ready Mix, KR0413-version 1.12, peqlab). Amplified libraries were purified with 1× Agencourt AMPure XP beads (Beckman Coulter Inc.) and quantified using Qubit fluorometer (Life Technologies-Invitrogen). Then both aliquots of one sample were pooled, validated using Agilent TapeStation and again quantified using Qubit fluorometer. The final libraries were pooled and clustered on the cBot (Illumina) according to the manufacturer's instructions with a final concentration of 250 pM, spiked with 5% PhiX control v3. Paired-end 150 bp sequencing on HiSeq X was performed using standard Illumina protocols. Basic statistics about the sequencing results is given in Supplementary Data 1. Raw reads were processed using Trimmomatic (v. 0.36) 73 and aligned against reference sequence of the Genome Research Consortium (v. 37) using bwameth (https://github.com/brentp/ bwa-meth) wrapping bwa mem (v. 0.7.8) 53 with default parameters, except for invoking-T 0. After alignment duplicates were marked by applying Picard Mark-Duplicates (v. 1.125). Methylation calling was performed with MethylDackel (v. 0.3.0). A compatible CWL-based data processing workflow for bisulfite sequencing is available at https://github.com/CompEpigen/WGBS_workflows (BWA_meth_-start_with_trimmed.cwl). Subsequently and prior to DMR calling, BSmooth was used (v. 1.4.0) with default parameters to smooth the processed methylation profiles in all samples 74 . We then used we DSS (v. 2.27.0) 75 to call DMRs for pairwise comparison between H3.3 WT and H3.3 MUT cells. Regions with at least 3 CpGs, a minimum length of 50 bp and a Benjamin-Hochberg corrected P value <0.05 were selected. All DMRs were filtered requiring a minimal mean methylation-value difference of 0.1.
HumanMethylation450 and HumanMethylationEPIC analysis. Methylation analysis using HumanMethylation450 arrays was performed by the Genomics and Proteomics Core Facility according to the manufacturer's instructions. Profiling of isogenic HeLa cell lines with HumanMethylationEPIC arrays was conducted at Korean Cancer Center according to the manufacturer's instructions. Unnormalized signals (IDAT files) were loaded into R using RnBeads software (v. 2.2.0) 76 and subjected to preprocessing with default option settings. 10,000 sites most variable across all samples were used for both, Principal Component Analysis and clustering analysis, visualized as a heatmap.
RNA-sequencing and analysis. Poly-A RNA sequencing of GCTB samples was performed according to the standard protocol 77 . Total RNA was prepared for each cell line by using the RNeasy Mini kit (Qiagen, Hilden, Germany) and library preparation was done using TruSeq Stranded mRNA Kit (Illumina), according the manufacturer's instruction. Paired-end 125 bp sequencing runs were performed on Illumina Hiseq2000 v4 machines. Raw sequence reads were preprocessed using cutadapt (v. 1.10) 64 to remove sequencing primers and adapters. Reads were aligned to the GRCh37 human reference genome with HISAT2 (v. 2.0.4) 78 with additional non-default parameters -max-intronlen 20000 -no-unal -dta. Transcripts were assembled and quantified with StringTie (v. 1.3.3) 79 with the GRCh37 transcript database. The RNA-seq processing workflow is available from https:// github.com/CompEpigen/RNASeq_GCTB. Differential expression analysis was performed using DeSeq2 (v. 1.18.1) 80 . Genes were called differentially expressed at FDR 0.05 and the absolute log-fold difference of greater or equal to one.
RNA-Seq of the differentiation samples. RNA-seq of the MSC differentiation samples was performed according to the following protocol. 50,000 cells were seeded in 6-well plates in maintenance medium: MesenPRO-RS™ (Thermo Fisher Scientific, Massachusetts, USA). After 3 days, medium was replaced with osteogenic differentiation medium: DMEM high glucose supplemented with 10% FBS, 1× non-essential amino acids (NEAA), 2 mM L-glutamine, 0.28 mM ascorbic acid, 10 mM β glycerophosphate, and 10 nM dexamethasone (Sigma-Aldrich, USA). At each time-point: 0, 0.5, 1, 2, 4, 6,8,12,16,24,48,72,125,168,336, and 504 h during osteogenesis, total RNA was isolated using TRIzol (Invitrogen™ Life Technologies, Carlsbad, USA) with the Direct-zol RNA kit (ZymoResearch, USA) according to the manufacturer's instruction. RNA-seq library preparation was carried out using the NEBNext Poly(A) mRNA Magnetic Isolation Module and NEBNext Ultra Directional RNA Library Prep Kit for Illumina (New England Biolabs, USA) according to the manufacturer's instruction. The quantity and quality of the cDNA library were assessed using the Agilent 2200 tapestation