The ETS transcription factor ETV6 constrains the transcriptional activity of EWS–FLI to promote Ewing sarcoma

Transcription factors (TFs) are frequently mutated in cancer. Paediatric cancers exhibit few mutations genome-wide but frequently harbour sentinel mutations that affect TFs, which provides a context to precisely study the transcriptional circuits that support mutant TF-driven oncogenesis. A broadly relevant mechanism that has garnered intense focus involves the ability of mutant TFs to hijack wild-type lineage-specific TFs in self-reinforcing transcriptional circuits. However, it is not known whether this specific type of circuitry is equally crucial in all mutant TF-driven cancers. Here we describe an alternative yet central transcriptional mechanism that promotes Ewing sarcoma, wherein constraint, rather than reinforcement, of the activity of the fusion TF EWS–FLI supports cancer growth. We discover that ETV6 is a crucial TF dependency that is specific to this disease because it, counter-intuitively, represses the transcriptional output of EWS–FLI. This work discovers a previously undescribed transcriptional mechanism that promotes cancer.

Article https://doi.org/10.1038/s41556-022-01059-8
[…] via multimerization and recruitment of chromatin-modifying complexes, which in turn lead to an altered gene expression programme 20,32. Efforts to establish key dependencies in Ewing sarcoma have prioritized the identification of specific gene targets of EWS-FLI. Studies have described cell-type-specific TFs that are activated by, and cooperate with, EWS-FLI to reinforce oncogenic programmes 23,24,32-39, including in CRCs 40. Unbiased and systematic approaches are needed, however, to reveal crucial disease mechanisms specific to Ewing sarcoma.
Here we describe the results of a genome-scale CRISPR-Cas9 screen revealing that the wild-type ETS TF ETS variant 6 (ETV6; also known as TEL) is a crucial Ewing-sarcoma-selective TF dependency. We validate this dependency in vitro and in vivo. In contrast to selective TF dependencies that reinforce the oncogenic programmes of mutant TFs in other cancer types, the repressive activity of ETV6 constrains EWS-FLI gene activation at 5′-GGAA-3′ repeat enhancers to promote Ewing sarcoma growth. We therefore discover a previously undescribed mechanism promoting cancer: competition on chromatin between an oncogenic fusion TF and a 'restraining' inhibitory TF.
We validated an ETV6 dependency in three cell lines of Ewing sarcoma, A673, EW8 and TC32, via CRISPR-Cas9 disruption. Loss of ETV6 reduced cell growth in vitro (Fig. 1b and Extended Data Fig. 1f) and reduced anchorage-independent growth in methylcellulose (Fig. 1c and Extended Data Fig. 1g). We established a biochemical dTAG approach 46,47 to perturb ETV6 abundance with precise temporal control and without eliciting acute DNA damage. FKBP12^F36V-tagged proteins can be acutely degraded following exposure to the dTAG small molecule dTAG^V-1, which recruits the von Hippel-Lindau E3 ligase to ubiquitinate FKBP12^F36V (ref. 46). In the Ewing sarcoma cell lines A673 and EW8, we exogenously expressed ETV6 carboxy-terminally tagged with FKBP12^F36V and a human influenza haemagglutinin (HA) epitope (Fig. 1d). Simultaneously, we knocked out endogenous ETV6 such that FKBP12^F36V-tagged ETV6 constituted the dominant form of ETV6 protein. ETV6-FKBP12^F36V degradation reduced anchorage-independent growth (Fig. 1e,f and Extended Data Fig. 1h). Degradation of ETV6 (Fig. 1g and Extended Data Fig. 2a) as well as CRISPR-Cas9-mediated knockout of endogenous ETV6 in parental A673 cells (Extended Data Fig. 2b) led to G1/G0 cell cycle arrest but did not induce apoptosis (Extended Data Fig. 2c and Supplementary Fig. 1).
In vivo, CRISPR-Cas9-mediated knockout of ETV6 reduced the growth of subcutaneous TC32 tumours (Extended Data Fig. 2d). Using an orthotopic-like mouse model, in which A673 Ewing sarcoma cells implanted intramuscularly in the hindlimb are capable of metastasis 48, we observed that ETV6 loss reduced primary tumour growth (Fig. 1h). ETV6 loss reduced metastasis to liver tissues (Fig. 1i, left), and lung tissues displayed the same trend in one out of two ETV6 knockout conditions (Fig. 1i, right).
Next we asked whether the DNA-binding domain (DBD) of ETV6 was crucial to its function. We knocked out endogenous ETV6 and exogenously expressed wild-type ETV6 or mutant ETV6 bearing a C-terminal DBD deletion, which precluded ETV6 binding to chromatin and partially impeded its nuclear localization. This result is consistent with the report that the nuclear localization signal of ETV6 protein lies in its C terminus 49 (Extended Data Fig. 2e). Whereas wild-type ETV6 expression rescued ETV6 knockout, expression of the mutant ETV6 did not (Extended Data Fig. 2f), which suggests that the specific activity of ETV6 on chromatin is crucial to its function in Ewing sarcoma.

ETV6 and EWS-FLI co-occupy loci genome-wide
ETV6 and EWS-FLI harbour the ETS family DBD, which recognizes consensus 5′-GGA(A/T)-3′ motifs. We therefore asked whether they co-localized on chromatin. We profiled endogenous ETV6 binding sites in parental A673 cells using cleavage under targets and tagmentation (CUT&Tag) 50 and profiled ETV6-FKBP12^F36V-HA binding sites in ETV6-dTAG cells using anti-HA chromatin immunoprecipitation with sequencing (ChIP-seq). These analyses defined a consensus list of ETV6-binding sites (Extended Data Fig. 3a and Fig. 2a). dTAG^V-1 treatment reduced ETV6 abundance on chromatin in both dTAG models (Fig. 2b and Extended Data Fig. 3b). In parental Ewing sarcoma cells, we performed histone H3 lysine 27 acetylation (H3K27ac) ChIP-seq and analysed public histone H3 lysine 4 trimethylation (H3K4me3) ChIP-seq data 26 to annotate ETV6-binding sites. The results showed that these sites occurred at active promoters and enhancers (Fig. 2c and Extended Data Fig. 3c). We performed ChIP-seq for EWS-FLI in A673 and EW8 parental cells by immunoprecipitating the C-terminal FLI1 domain. This is an accepted approach to identify EWS-FLI-binding sites because wild-type FLI1 typically is not expressed in Ewing sarcoma cells 39,43. EWS-FLI bound ubiquitously at ETV6-binding sites in both models (Fig. 2c and Extended Data Fig. 3c), although co-occupied binding sites constituted only a small proportion of total EWS-FLI-binding sites (Fig. 2d and Extended Data Fig. 3d). EWS-FLI pioneers closed chromatin at GGAA repeat microsatellites 27, including at repeats of four or more 26 […] at a higher frequency in Ewing sarcoma than in B lymphocytes or in K-562 leukaemia cells 51,52, which express ETV6 (P < 2.2 × 10^−16) (Fig. 2e and Extended Data Fig. 3e). […] Fig. 2f). At 72 h, alterations were more dynamic, exhibiting both increases and decreases (Fig. 2f).
We categorized loci by whether they gained or lost EWS-FLI binding at 6 h and whether they occurred at transcription start sites (TSSs) or at H3K27ac-defined enhancers (Fig. 2g,h). Regions that lost EWS-FLI binding did not change to as great a degree as regions that gained binding (Fig. 2i). Thus, the loss of ETV6 led acutely and predominantly to increased EWS-FLI binding, which provides support for the hypothesis that these TFs compete for binding. Additionally, ChIP-seq of H3K27ac at 6 h in both models (Fig. 2g,h) demonstrated a modest increase in H3K27ac abundance at enhancer regions that gained EWS-FLI binding (Extended Data Fig. 3f). Differential EWS-FLI binding was highly dynamic at tandem 5′-GGAA-3′ repeats (Extended Data Fig. 3g). Notably, genomic regions that gained EWS-FLI binding were more likely to contain shorter tandem repeats of 2, 3 or 4 motifs compared with regions that lost EWS-FLI binding (P = 6.974 × 10^−15). Consistent differences were not observed for single GGAA motifs or >4 GGAA repeats.
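As an illustration of the repeat classes compared above, the following minimal Python sketch bins a sequence by its longest run of consecutive GGAA motifs (single, 2-4 or >4). It is not the authors' pipeline, which is not described in this section. The function names, the choice to scan both strands (GGAA and its reverse complement TTCC) and the use of the longest run as the classifier are all assumptions made for illustration.

```python
import re


def longest_ggaa_run(seq: str) -> int:
    """Return the largest number of consecutive GGAA motifs on either strand."""
    seq = seq.upper()
    best = 0
    # GGAA on the forward strand; TTCC is its reverse complement.
    for motif in ("GGAA", "TTCC"):
        for match in re.finditer(f"(?:{motif})+", seq):
            best = max(best, len(match.group(0)) // 4)
    return best


def repeat_class(seq: str) -> str:
    """Bin a sequence into the repeat classes discussed in the text."""
    n = longest_ggaa_run(seq)
    if n == 0:
        return "none"
    if n == 1:
        return "single"
    if n <= 4:
        return "short (2-4)"
    return "long (>4)"
```

Under these assumptions, a peak sequence containing `GGAAGGAAGGAA` would fall in the short (2-4) class, whereas a microsatellite of six or more motifs would be classed as long.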

ETV6 is a transcriptional repressor in Ewing sarcoma
We next characterized genes regulated by ETV6, a reported transcriptional repressor 53-57. We performed RNA sequencing (RNA-seq) in both dTAG models at 6, 24 and 72 h following treatment with dimethylsulfoxide (DMSO) or dTAG^V-1 (Fig. 1e and Extended Data Fig. 4a). Globally, the expression profiles of each of the engineered dTAG cell lines approximated that of their corresponding parental cell lines (Extended Data Fig. 4b). At 6 h, the majority of differentially expressed genes were upregulated, which suggests that ETV6 acts predominantly as a transcriptional repressor in Ewing sarcoma (Fig. 3a). Strongly ETV6-repressed genes increased in expression over time following ETV6 degradation (Fig. 3b). We observed concordance in regulated genes between dTAG models (Extended Data Fig. 4c) and identified a common set of 85 ETV6-repressed genes (Fig. 3c and Supplementary Table 6). We performed RNA-seq on parental A673 cells transduced with ETV6 CRISPR knockout (Fig. 3d and Extended Data Fig. 4d). The results showed that most of the 85 genes were also repressed by endogenous levels of wild-type ETV6 (P = 2.66 × 10^−20). Consistent with the localization of ETV6 at active promoters and enhancers, ETV6-repressed genes were expressed and not completely silenced (Extended Data Fig. 4e). Additionally, ETV6-binding sites were enriched in ETV6-regulated genes (Fig. 3e and Extended Data Fig. 4f).
ETV6 is a master TF implicated in the normal development of neural and mesenchymal lineages 58,59 . Developmental lineage-specific gene sets were enriched in ETV6-repressed genes ( Fig. 3f and Supplementary Tables 6-11) and in ETV6-activated genes (Extended Data Fig. 4g and Supplementary Tables 12-17). ETV6-repressed genes, but not activated genes, were strongly enriched for genes regulated by histone deacetylases (HDACs), which may reflect the ability of ETV6 to recruit HDACs 54,60-63 . We also observed strong enrichment of EWS-FLI-regulated genes in ETV6-regulated genes (Fig. 3f), consistent with their co-localization on chromatin.

ETV6 functions similarly in clinically relevant Ewing sarcoma models
Well-established cancer cell lines may use biological mechanisms distinct from those of primary tumour cells. We therefore tested the relevance of our findings from cell lines in two newly derived Ewing sarcoma cell lines: CCLF_PEDS_0009_T (PEDS0009) and CCLF_PEDS_0010_T (PEDS0010) 64. ETV6 knockout impaired cell growth in vitro and colony formation in methylcellulose (Fig. 5a,b and Extended Data Fig. 6a,b). Additionally, we tested cells from a minimally passaged Ewing sarcoma patient-derived xenograft (PDX): ES-PDX-001 (refs. 65,66). Again, knockout of ETV6 impaired cell growth in vitro (Extended Data Fig. 6c). In PEDS0009 cells, we observed ETV6 and EWS-FLI binding at previously defined EWS-FLI consensus binding sites (Fig. 5c). Concordant with our cell line data, ETV6 bound to GGAA microsatellites (Fig. 5d), and ETV6 loss resulted in increased EWS-FLI binding at the same loci that exhibited increased EWS-FLI occupancy in cell lines (Fig. 5e-g and Extended Data Fig. 6d,e). Genomic regions that gained EWS-FLI binding were more likely to contain shorter GGAA repeats of 2, 3 or 4 compared with regions that lost EWS-FLI binding (Extended Data Fig. 6f) (P = 5.186 × 10^−11). These observations in minimally passaged cells were concordant with the data from well-established cell lines.

ETV6 and EWS-FLI antagonism at SOX11 is functional
We next asked whether the antagonistic relationship between EWS-FLI and ETV6 is responsible for the dependency of Ewing sarcoma cells on ETV6. Almost half of the gene sets enriched in ETV6-repressed genes were related to developmental pathways (Extended Data Fig. 7a and Supplementary Table 18), and 46 of these included SOX11 (Supplementary Table 19). SOX11 expression exerts context-dependent effects on cancer cell survival, growth and metastasis 67,68 . SOX11 acts as an oncogene in mantle cell lymphoma 69 and promotes metastasis in breast cancer 70,71 . Conversely, it also reduces proliferation and metastasis in prostate cancer 72 and induces differentiation of glioma cells 73 . In Ewing sarcoma cells, the exogenous expression of SOX11 impaired cell growth, whereas the expression of a DBD-deleted mutant did not (Extended Data Fig. 7b). These results provide support for a tumour-suppressive role for SOX11 activity.
We observed differential EWS-FLI binding at a distal enhancer that mapped to SOX11 as the nearest expressed gene (Fig. 6a, left). RNA-seq data from the Cancer Cell Line Encyclopedia 74 show that the neighbouring genes, SILC1 and LOC400940, are not expressed in Ewing sarcoma. This enhancer occurred at tandem GGAA repeats and exhibited increased EWS-FLI binding, H3K27ac abundance and chromatin accessibility following ETV6 loss (Fig. 6a, right). RNA-seq confirmed that SOX11 is repressed by ETV6-FKBP12^F36V in dTAG cells and by endogenous ETV6 in parental A673 cells (Extended Data Fig. 7c). EWS-FLI was required for SOX11 upregulation after ETV6 loss (Fig. 6b,c). Knockout of SOX11 in A673 ETV6-dTAG cells (Fig. 6d) rescued the effects of ETV6 degradation (Fig. 6e). Additionally, knockout of SOX11 in A673 and TC32 cells (Fig. 6f and Extended Data Fig. 7d) rescued ETV6 knockout (Fig. 6g and Extended Data Fig. 7e,f). In vivo, we observed rescue in TC32 cells grown as subcutaneous tumours in mice (Fig. 6h). These findings support the hypothesis that ETV6 dependency is specific to Ewing sarcoma cells because ETV6 constrains EWS-FLI activation of SOX11 expression.
Finally, we asked whether co-regulation at SOX11 by ETV6 and EWS-FLI could be recapitulated with ectopic expression of EWS-FLI. In rhabdomyosarcoma RD cells, we exogenously expressed wild-type EWS-FLI or the R340N DNA-binding mutant of EWS-FLI, which cannot bind to DNA 75 . SOX11 protein expression was induced by wild-type EWS-FLI but not the mutant (Extended Data Fig. 7g). Knockout of ETV6 further upregulated SOX11 abundance in the setting of wild-type EWS-FLI but not in the context of mutant EWS-FLI expression (Extended Data Fig. 7g). These findings demonstrate that the DBD of EWS-FLI is required for its activation of SOX11 expression, an activity that is repressed by ETV6.

Discussion
In this study, we discovered an oncogenic mechanism underlying the paediatric cancer Ewing sarcoma. We demonstrated that the ETS TF ETV6 is a selective dependency in Ewing sarcoma because it antagonizes the transcriptional activity of EWS-FLI at ETS motifs. To our knowledge, this report constitutes the first description of transcriptional constraint of a fusion TF on chromatin as a crucial driver of tumour growth.
Although previous studies have described specific TFs as dependencies that reinforce the EWS-FLI transcriptional programme in Ewing sarcoma 32-35,37, including in CRCs 40, these targets were not identified in DepMap screening as selective gene dependencies. Instead, our discovery that ETV6 constrains EWS-FLI activity highlights a distinct, but equally central, epigenetic mechanism that drives tumour growth and reveals an unexpected contrast between Ewing sarcoma and other paediatric tumours in which CRCs are functionally dominant.
Fig. 6 legend (fragment): […] versus sgSOX11 dTAG^V-1, P adjusted < 0.0001). Right, bottom: relative median intensity comparing dTAG^V-1-treated with DMSO-treated wells (two-tailed t-test, n = 3, P < 0.0001). Represents two independent experiments. f, Western blot of TC32 cells transduced with CRISPR-Cas9 constructs in combination. Represents one experiment. g, Line graph depicting mean viability in vitro (n = 6 biological replicates, s.e.m. bars too small to depict) of cells in f. ETV6 knockout alone (red) reduced viability compared to control (black) (two-way ANOVA, Tukey's multiple comparisons, P adjusted < 0.0001). Simultaneous ETV6 and SOX11 knockout (blue star) did not reduce viability compared to SOX11 knockout alone (grey) (NS, P adjusted = 0.8847) and exhibited greater viability than ETV6 knockout alone (red) (P adjusted < 0.0001). h, Left: line graph depicting mean subcutaneous tumour volume (mm^3) ± s.e.m. (n = 6 tumours, biological replicates) formed by cells shown in f. ETV6 knockout alone reduced tumour volume (two-way ANOVA, Tukey's multiple comparisons, P adjusted < 0.0001). Simultaneous ETV6 and SOX11 knockout did not reduce tumour growth (NS, P adjusted = 0.9892) and exhibited greater growth than ETV6 knockout alone (P adjusted < 0.0001). Right: representative tumours from each condition.
Cancer cells frequently co-opt mechanisms that underlie normal development 76. The competition between EWS-FLI and ETV6 in Ewing sarcoma bears resemblance to a mechanism of ETS TF competition governing cell-fate decisions in developing Drosophila. Pointed, the activating orthologue of human Ets-1, competes for binding at ETS motifs within specific enhancers with Yan, the repressive orthologue of ETV6, to regulate the expression of key differentiation genes in distinct tissues 77-81. Here we described a similar mechanism that has been co-opted in cancer to regulate the transcriptional output of a fusion TF.
The epigenetic activity of ETS TFs other than EWS-FLI may contribute to the phenotype of ETV6 loss. Notably, ETV7, the homologue of ETV6, is not expressed in Ewing sarcoma cells (Supplementary Table 20), and we did not observe strong changes in the expression of other ETS TFs with ETV6 loss. The maximum change exhibited by one gene was roughly threefold, and only five genes displayed a significant alteration in expression across the models evaluated. Moreover, none of the genes that displayed a change in expression were scored as dependencies or tumour suppressors in DepMap in Ewing sarcoma. Although most human TF families contain paralogues that are co-expressed within distinct cell types 82-84, an understanding of their interactions at shared motifs is lacking. We began to unravel key cis regulatory principles that distinguish the specific functions of ETV6 and EWS-FLI, the antagonism of which on chromatin frequently occurred at shorter 5′-GGAA-3′ repeats. As the pathogenesis of EWS-FLI is typically associated with its activity at longer repeats or true microsatellites, we highlight a previously undescribed cis regulatory role for shorter GGAA repeats in this disease, which facilitates ETV6 fine-tuning of EWS-FLI. Even though the reconstitution of EWS-FLI for biochemical assays has been a challenge for the field, future work is needed to delineate the precise GGAA repeat code that determines the activities of each TF. Similarly, ETV6 and wild-type FLI1 proteins can engage in an inhibitory heterodimer 85, an interaction mediated by the amino-terminal Pointed (PNT) domain of ETV6, and further studies are needed to determine whether ETV6 and EWS-FLI engage in a protein-protein interaction. Notably, however, our experiments using an ETS DBD-deleted mutant of ETV6, with an intact PNT domain, demonstrated that the DNA-binding activity of ETV6 is crucial to its function in Ewing sarcoma.
ETV6 is a master TF in normal development and is recurrently mutated in cancer. ETV6 mutations include deletions and chromosomal translocations involving 30 distinct gene partners 53. Germline and somatic loss-of-function mutations frequently occur in pre-malignant disorders and leukaemias. For example, in B cell acute lymphoblastic leukaemia, ETV6 deletions frequently co-occur with ETV6-RUNX1 rearrangements, which result in biallelic loss of the ETV6 protein 86,87. Chromosomal translocations also fuse the N terminus of ETV6 with the tyrosine kinase domain from a number of receptor tyrosine kinases, which facilitate constitutive autophosphorylation and growth signalling 53. ETV6, however, has not been reported as recurrently mutated in Ewing sarcoma 42-44. Furthermore, ETV6 is not regulated by EWS-FLI (Supplementary Table 4) and does not exhibit a marked pattern of expression specific to this cancer type (Extended Data Fig. 1d). Nonetheless, we discovered its role as a crucial tumour-type-selective dependency in regulating EWS-FLI activity. As such, this report reaffirms the importance of performing unbiased functional screens at scale to reveal oncogenic mechanisms sustained by wild-type proteins.
Our findings suggest that a hallmark of Ewing sarcoma biology may involve the reliance on mechanisms constraining EWS-FLI activity to promote tumour growth. Indeed, we previously described mechanisms mediated by an E3 ligase (TRIM8) and cohesin that restrain EWS-FLI activity 48,66,88 . Here we discovered a distinct mechanism in support of an EWS-FLI Goldilocks phenomenon 66 that is operative on chromatin. Future translational efforts could ultimately seek to modulate the activity of this pharmacologically challenging protein, either by decreasing or paradoxically increasing its activity.
In conclusion, we discovered the oncogenic role of TF competition on chromatin between a mutant TF and a wild-type paralogue. Our work contributes to an understanding of the dysregulated epigenetic mechanisms that can promote cancer, raising the possibility that similar mechanisms are relevant in other disease contexts.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41556-022-01059-8.
[…] Fig. 6c, a minimally passaged cell line previously derived in our laboratory 66 from a previously characterized Ewing sarcoma PDX 65 (HSJD-ES-PDX-001) was studied. As such, this experiment was performed in vitro and did not involve the use of animals. As previously described 65, this PDX originated from a biopsy in a 21.7-year-old patient whose sex was not reported 65. It was collected with informed consent without compensation under an Institutional Review Board-approved protocol at Sant Joan de Déu Hospital (HSJD, Barcelona, Spain), protocol number HSJD 135/11 (ref. 65).

CRISPR-Cas9 screen dependency analysis
All genome-scale dependency data are available at the DepMap portal website: https://depmap.org. DepMap AVANA 21Q1 dependency data were used (18,333 genes in 808 cell lines, https://figshare.com/articles/dataset/public_21q1/13681534). Twelve cell lines were not included in the analyses: four cell lines are classified as engineered lines; the origin of one cell line, CHLA57, is unknown as it is incorrectly identified as Ewing sarcoma; seven cell lines are listed as commonly misidentified cell lines in the ICLAC Register of Misidentified Cell Lines (https://iclac.org/databases/cross-contaminations/). Therefore, dependency data for 796 cell lines were examined. CERES gene effect scores were calculated as previously described 41,90. A lower CERES gene effect score indicates an increased likelihood that a specific gene is required for viability in that cell line. A CERES score of 0 indicates that gene deletion exhibited no effect on growth, whereas a score of −1 is comparable with the median of all commonly essential genes, that is, genes that were essential for growth in nearly every cell line across the entire screen. Tumour-type-enriched 'selective' dependencies were determined by performing a two-class comparison between the gene effect scores for cell lines of each tumour type (in-group) and the remainder of all other cell lines in the screen (out-group) for a specific gene as previously described 41. In brief, effect size was calculated as the difference in the mean gene effect dependency score in the in-group compared with that in the out-group. In addition to two-sided P values, one-sided P values were generated to test whether the in-group exhibited, on average, greater or lesser dependency on a specific gene than the out-group. All P values were corrected for multiple hypothesis testing using the Benjamini-Hochberg correction and reported as q values.
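The effect-size and multiple-testing steps of this two-class comparison can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the published analysis code. The function names are hypothetical, the P values are taken as given because the underlying statistical test is not named in this section, and the selection rule (q < 0.05 with a negative effect size) follows the criterion stated for tumour-type-enriched dependencies.

```python
from statistics import mean


def effect_size(in_group, out_group):
    """Mean gene effect score of the in-group minus that of the out-group."""
    return mean(in_group) - mean(out_group)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that q values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def selective_dependencies(genes, effects, p_values, alpha=0.05):
    """Keep genes with q < alpha and a negative effect size."""
    q_values = benjamini_hochberg(p_values)
    return [g for g, e, q in zip(genes, effects, q_values) if q < alpha and e < 0]
```

A negative effect size here means that the in-group mean gene effect score is lower than the out-group mean, that is, the tumour type is on average more dependent on the gene.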
Tumour-type-enriched dependencies were identified in each tumour type as those with a q value of <0.05 and with a negative effect size (the mean of dependency gene effect score was more negative in the in-group than in the out-group). The same analyses were performed on the genome-scale CRISPR-Cas9 screens using the Broad Institute GeCKO library (18,

Cancer cell line and primary tumour gene expression
RNA-seq gene expression data from the Cancer Cell Line Encyclopedia 74 were downloaded (19,177 genes in 1,376 cell lines) from the 21Q1 DepMap portal website (https://depmap.org). Tumour-type-enriched expression for each gene was calculated by performing a two-class comparison between the log2(transcripts per million (TPM) + 1) gene expression for cell lines of each tumour type (in-group) and the remainder of all other cell lines profiled (out-group). All P values were corrected for multiple hypothesis testing using the Benjamini-Hochberg correction and reported as q values. RNA-seq gene expression data for primary tumours were downloaded from the Treehouse Childhood Cancer Initiative 45 (UCSC Genomics Institute, https://treehousegenomics.soe.ucsc.edu/public-data).

Cell samples and culture
All cell lines were genotyped by short tandem repeat analysis and tested for Mycoplasma. Whole-exome sequencing and RNA-seq were performed to validate cell line identity 43 . The A673 cell line was purchased from the American Type Culture Collection (ATCC, CRL-1598). EW8 (originally derived by P. Houghton 91 ) and TC32 (originally derived by T. Triche 92 ) cell lines were obtained from the Golub Lab. A673 and EW8 cells were grown in Dulbecco's modified Eagle's medium (DMEM) (Thermo Fisher Scientific, MT10013CM), supplemented with 10% FBS (Sigma-Aldrich, F2442) and 1% penicillin-streptomycin (Life Technologies, 15140163). TC32 cells were grown in Roswell Park Memorial Institute (RPMI)-1640 medium (Thermo Fisher Scientific, MT10040CM), supplemented with 10% FBS and 1% penicillin-streptomycin. The PEDS0009 and PEDS0010 cell lines were obtained from the Cancer Cell Line Factory (Broad Institute) and were derived as previously described 64 . They were cultured in RPMI-1640 supplemented with 10% FBS and 1% penicillin-streptomycin. The Ewing sarcoma PDX (HSJD-ES-PDX-001) was provided by J. Mora (HSJD) 65 . To generate the minimally passaged cell line (ES-PDX-001), PDX tumours were processed as previously described 66 . The RD cell line (ATCC, CRL-7731) was a gift from the DepMap group at the Broad Institute. RD cells were cultured in RPMI-1640 supplemented with 10% FBS.
One million Ewing sarcoma cells were seeded per well of a 6-well plate and spin-infected using 2 ml of virus and 8 µg ml^−1 polybrene (Santa Cruz Biotechnology, SC-134220) at 37 °C at 1,190 r.c.f. for 30 min. The following day, fresh medium containing 1 µg ml^−1 puromycin (InvivoGen, ant-pr-1) was added. Cells were selected for at least 48 h. In experiments requiring knockout of two genes, the cells were co-transduced with constructs encoding two distinct sgRNAs, each conferring resistance to either puromycin or blasticidin. Cells were selected with 1 µg ml^−1 puromycin and 5 µg ml^−1 blasticidin (Life Technologies, A1113903) for at least 5 days. Separate samples of non-infected cells treated with drug were used to confirm cell death.
The dTAG^V-1 molecule was provided by the Gray Laboratory (Dana-Farber Cancer Institute, Boston, MA) and used at a stock concentration of 10 mM suspended in DMSO. For 6, 24 and 72 h RNA-seq time points, A673 ETV6-dTAG cells were seeded at 1 million cells per 6 cm dish, 0.5 million cells per 6 cm dish and 0.5 million cells per 10 cm dish, respectively. EW8 ETV6-dTAG cells were seeded at 0.75 million cells per 6 cm dish, 0.5 million cells per 10 cm dish and 0.2 million cells per 10 cm dish, respectively. For each time point, three separate dishes were seeded and treated per DMSO or dTAG^V-1 condition. Cells were collected for total RNA extraction and western blot validation (described below). For 6 and 72 h ChIP-seq time points, A673 ETV6-dTAG cells were seeded at 5 million cells per 15 cm dish and 1 million cells per 15 cm dish, respectively. EW8 ETV6-dTAG cells were seeded at 7.7 million cells per 15 cm dish and 1 million cells per 15 cm dish, respectively. Twenty-four hours after seeding, existing medium was exchanged for DMSO or dTAG^V-1-containing medium. For all experiments, dTAG^V-1 was used at a final concentration of 1 µM. Equivalent volumes of DMSO were used as control.
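The 10 mM stock and 1 µM final concentration stated above imply a 1:10,000 dilution. The short sketch below works through that arithmetic using C1 × V1 = C2 × V2. The function name and the 10 ml medium volume in the usage note are hypothetical, because medium volumes per dish are not given in this section.

```python
def stock_volume_ul(final_um: float, stock_mm: float, medium_ml: float) -> float:
    """Volume of stock (in µl) needed to bring medium_ml of medium to final_um.

    Applies C1 * V1 = C2 * V2 and ignores the negligible volume of added stock.
    """
    stock_um = stock_mm * 1000.0    # convert mM to µM
    medium_ul = medium_ml * 1000.0  # convert ml to µl
    return final_um * medium_ul / stock_um


def dilution_factor(final_um: float, stock_mm: float) -> float:
    """Fold dilution from stock to final concentration."""
    return stock_mm * 1000.0 / final_um
```

For example, under the assumed 10 ml of medium, `stock_volume_ul(1.0, 10.0, 10.0)` gives 1 µl of 10 mM stock, and the matched control would receive the same volume of DMSO.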

Relative viability studies
Cells transduced with lentivirally packaged CRISPR-Cas9 constructs were seeded in 384-well plates at densities of 3,500 (TC32), 2,000 (A673), 250 (EW8) and 1,000 (PEDS0009, PEDS0010 and ES-PDX-001) cells per well suspended in 40-50 µl of medium per well containing 0.5-1 µg ml^−1 puromycin. Cells from each condition were grown separately in 6-8 wells per plate across 4 plates, which corresponded to day 0, 3, 5 and 7 time points. Wells at plate edges were filled with 50 µl of PBS to maintain humidity. To measure cell viability, 10 µl of CellTiter-Glo reagent (Promega, G7573), which luminesces at an intensity proportional to ATP abundance, was added to each well, and plates were shaken at room temperature for 15 min. Luminescence was measured using a FLUOstar Omega microplate reader (BMG LabTech).
Relative viability was calculated by dividing the luminescence measurement of each well on day 7 by luminescence at day 0 using Microsoft Excel 16.50. In parallel, whole cell lysate was collected on day 7 for western blotting to confirm ETV6 knockout. Statistics shown compare mean relative viability between conditions at day 7, analysed using GraphPad Prism 9.0.0.
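The relative viability calculation can be written out as a short Python sketch. The authors performed this step in Microsoft Excel, so the code below is only an illustration. The function name is hypothetical, and normalizing each day-7 well to the mean of the day-0 wells for the same condition is an assumption, because the text does not specify whether wells were paired or averaged.

```python
from statistics import mean


def relative_viability(day7_wells, day0_wells):
    """Divide each day-7 luminescence reading by the mean day-0 reading.

    Both arguments are lists of raw luminescence values for one condition.
    """
    baseline = mean(day0_wells)
    if baseline <= 0:
        raise ValueError("day-0 luminescence must be positive")
    return [value / baseline for value in day7_wells]
```

The mean of the returned values for each condition would then be the quantity compared between conditions at day 7.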

Anchorage-independent growth
A 16-gauge blunt-end needle was used to transfer 12 ml of semi-solid methylcellulose-based medium (Stemcell Technologies, 03814) to a 50 ml conical tube, where it was combined with 3 ml of cell suspension containing 15,000 (A673 and TC32), 5,000 (EW8) or 20,000 (PEDS0009, PEDS0010 and ES-PDX-001) cells. The mixture was vortexed and left at room temperature for 10-15 min until bubbles dissolved. A blunt-end needle was used to transfer 3 ml of the mixture to separate 6 cm dishes, which were placed inside a 15 cm plate containing a PBS-filled 6 cm dish used to maintain humidity. Colonies were stained 7 days later by adding 1 ml of a 1:1 mixture of PBS and MTT dye (Roche Diagnostics, 11465007001) per dish and incubating for 30-45 min at 37 °C. Colonies in each dish were imaged using an ImageQuant LAS 4000 imager (GE Healthcare) and quantified using ImageQuant TL 8.2 software (Cytiva). In parallel, whole cell lysate was collected from cultured cells for western blotting.

Flow cytometry and cell cycle analysis
Cell cycle analysis was performed using Click-iT Plus EdU Alexa Fluor 647 Flow Cytometry Assay kits (Life Technologies, C10424) per kit instructions with minor modifications. Cells were seeded and cultured separately before being pulsed with 10 µM of the modified nucleotide analogue 5-ethynyl-2′-deoxyuridine (EdU) for 90 min at 37 °C. Around 1-2 million cells per sample were trypsinized, washed, fixed, permeabilized and then treated with a reaction cocktail containing Alexa Fluor-647-conjugated picolyl azide to label incorporated EdU. Cells were stained with an RNase-containing propidium iodide solution (Cell Signaling, 4087S) for 45 min at 37 °C. Cells were analysed by flow cytometry at 5,000-10,000 cells per sample on a BD FACSCelesta instrument. Live cells were gated using FSC-A and SSC-A. The data were analysed using FlowJo v.10.6.1 software. Cells were collected from each sample for western blotting.

Mouse studies
All mouse studies were approved by the Dana-Farber Cancer Institute Institutional Animal Care and Use Committee (Animal Welfare Assurance number: D16-00010 (A3023-01)) and were performed in accordance with NIH guidelines for the humane care and use of animals. The intramuscular mouse xenograft experiment (Fig. 1h,i) studied immunodeficient NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ (NSG) mice ordered from Jackson Laboratory in a semi-orthotopic manner as previously described 48. A673 cells were lentivirally transduced to express luciferase and CRISPR-Cas9 constructs targeting ETV6. These cells were intramuscularly implanted in the hindlimbs of 7-week-old female mice. On the day of implantation, cells were suspended in a 1:1 mixture of PBS and Matrigel (Thermo Fisher Scientific, CB40230C) and injected directly into the hindlimb cranial thigh muscle, away from the sciatic nerve, at a concentration of 50,000 cells per mouse in 50 µl. Five mice per condition (sgChr2.2, sgLacZ, sgETV6-1 and sgETV6-2) were implanted. Disease progression was monitored by serial bioluminescence imaging of the whole body. Bioluminescence was measured 10 min following subcutaneous injection of luciferin using a PerkinElmer IVIS Spectrum (exposure time, 0.5-180 s; binning, 2-16; luminescent, 25,000) to determine the maximum bioluminescence exhibited by each mouse. Mice in each condition were imaged at the same time. Mice were euthanized at the end point. Lung and liver tissue samples were collected following euthanasia and placed in a 6-well dish for bioluminescence imaging. Subcutaneous mouse xenograft experiments were conducted in Jackson NSG mice (Extended Data Fig. 2d) and CrTac:NCR-Foxn1<nu> (nude) mice from Taconic Biosciences (Fig. 6h). In the former study, 12-week-old males were used. In the latter study, 6-8-week-old females were used.
Cells were suspended in a 7:3 mixture of culture medium and Matrigel and injected bilaterally subcutaneously into sublethally irradiated mice at 3 million cells in 100 µl. Three to four mice per condition received transplants. Tumours were measured with calipers serially twice weekly.
Animals were euthanized when tumours reached a maximum of 2 cm in at least one dimension or when a humane end point such as ulceration or reduced mobility was reached, in adherence to the NIH/NCI guidelines on limits of tumour size (equal to or less than 2.0 cm per tumour in any one dimension). This limit was not exceeded. Randomization was not appropriate in any study as drug treatments were not used. Mice from the same conditions were kept in different cages to minimize confounding environmental factors. Mice were housed with strictly controlled temperature and humidity and kept on 12-h light and dark cycles. No statistical methods were used to predetermine sample sizes, but sample sizes were similar to those reported in previous publications in which statistical significance was achieved 48,66. Data distribution was assumed to be normal, but this was not formally tested, with the exception of data shown in Fig. 1i, for which the data were not normal (Shapiro-Wilk P < 0.05) and were thus log-transformed. Data collection and analysis were not performed blind to the conditions of the experiments. No animals or data points were excluded from the analyses.

Crystal violet staining and quantification
Cell samples were cultured separately and re-seeded at normalized cell densities (50,000 cells per well in a 6-well plate) every 5 days with refreshed DMSO or 1 µM dTAGV-1. On day 20, each well was incubated with 1 ml of crystal violet stain, composed of 20% methanol and 1% w/v crystal violet powder (Sigma Aldrich, C6158), at room temperature for 20 min. Wells were washed with 3 ml of deionized H₂O five times and dried at room temperature. Plates were imaged using an ImageQuant LAS 4000 imager (GE Healthcare). The median intensity of stain in each well was quantified using ImageQuant TL 8.2 image analysis software (Cytiva).

Western blotting
Cells were lysed using cell lysis buffer (Cell Signaling Technology, 9803S), supplemented with protease inhibitor (Sigma Aldrich, 11836170001) and phosphatase inhibitor (Sigma Aldrich, 04906837001). Protein quantification of whole cell lysate was measured using a Bradford-based colorimetric assay (Bio-Rad, 5000006). Around 50-60 µg of whole cell lysate was mixed with loading buffer (Life Technologies, NP0007), reducing buffer (Life Technologies, NP0009) and water and heated to 75 °C for 10 min. Samples were loaded onto 4-12% bis-tris 10-well gels (Life Technologies, NP0335BOX) and run at 100 V for 30 min followed by 150 V for 90 min using MOPS buffer (Life Technologies, NP0001). Gels were transferred to polyvinylidene difluoride membranes (Thermo Fisher Scientific, IPVH00010) at 100 V for 90 min using transfer buffer (Boston BioProducts, BP-190-1L) at 4 °C. Membranes were blocked in milk (Cell Signaling Technology, 9999S) for 60 min at room temperature. Membranes were rocked overnight at 4 °C in a solution of Tris-buffered saline and Tween-20 (TBST; Cell Signaling Technology, 9997S) containing 5% w/v BSA (Research Products International, A30075-1000.0), 0.02% sodium azide (Santa Cruz Biotechnology, SC-208393) and primary antibody. The following day, membranes were washed in TBST five times, 5 min per wash. For a subset of western blots, membranes were rocked for 1 h at room temperature in milk containing 1:5,000 dilution of HRP-conjugated secondary antibody against mouse (Cell Signaling Technology, 7076S) or rabbit (Cell Signaling Technology, 7074S). Membranes were then washed in TBST three times and immersed in a solution containing chemiluminescent substrate (Life Technologies, 34076), allowed to develop for 1 min, then imaged using film (Thermo Fisher Scientific, PI34091). Other western blots were imaged using a LI-COR system.
Membranes were rocked in a TBST solution containing a 1:10,000 dilution of secondary antibody against mouse (LI-COR Biosciences, 926-32210) and rabbit (LI-COR Biosciences, 926-68071) and 1:10,000 dilution of 10% SDS solution (Life Technologies, 15553027) for 1 h at room temperature. Membranes were washed in TBST three times and then briefly rinsed in PBS and imaged on an Odyssey CLx machine at medium resolution (ImageStudioLite 5.2.5).

SOX11 overexpression
Complementary DNAs of wild-type SOX11 and SOX11 mutants harbouring a deletion of the DBD (H48-R119) were synthesized as gBlocks fragments (IDT), and then cloned into a pLX_TRC307 lentiviral expression vector co-expressing a puromycin resistance gene (obtained from the Genetic Perturbation Platform at the Broad Institute) using a Gibson Assembly Cloning kit (New England Biolabs E5510S). Constructs were lentivirally delivered to cells as described above.

Rescue of ETV6 knockout with wild-type and ETS-deleted ETV6 overexpression
DNA fragments encoding codon-optimized ETV6 wild-type (ETV6-WT) and mutant ETV6 harbouring deletion of the ETS domain (ETV6-ΔETS) were purchased as gBlocks fragments (IDT) and cloned into pDONR-221 via BP gateway cloning. Constructs were further cloned into pINDUCER20 (Addgene, 44012) by LR cloning and lentivirally packaged as described above. A673 and EW8 cells were transduced with lentivirus encoding either ETV6-WT or ETV6-ΔETS and incubated with 100 ng ml⁻¹ of doxycycline or vehicle for 24 h. Subcellular fractionation was performed according to the manufacturer's protocol (Thermo Fisher, PI78840). Western blotting and cell viability experiments were performed as described above.

qPCR
Total RNA was extracted from cells using an extraction kit with column-based genomic DNA removal (Qiagen, 74134). RNA was reverse transcribed to cDNA using an iScript kit (Bio-Rad Laboratories, 1708841) and diluted 1:7 with H₂O. For sgFLI rescue experiments, A673 ETV6-dTAG cells were transduced with sgChr2.2 or sgFLI CRISPR-Cas9 constructs and treated separately with DMSO or dTAGV-1 in duplicate. All qPCR reactions were performed using a TaqMan system (Thermo Fisher Scientific) with technical triplicates. Probes were selected to span exon-exon junctions when possible. Specific probes were as follows: GAPDH: Hs02758991_g1; FAS: Hs00236330_m1; ACTA2: Hs00426835_g1; TRIB1: Hs00179769_m1; SEMA5B: Hs00400720_m1; BCL11B: Hs01102259_m1; and SOX11: Hs00846583_s1. In each qPCR reaction, the gene of interest was measured using FAM dye, whereas GAPDH control was measured using VIC dye. Samples were analysed in 384-well plate format using 5 µl TaqMan gene expression master mix (Thermo Fisher Scientific, 4369016), 0.5 µl of FAM-emitting probe, 0.5 µl of VIC-emitting GAPDH probe and 4 µl of diluted cDNA for a total of 10 µl per reaction. qPCR plates were analysed using a QuantStudio 6Flex Real-Time PCR machine and the accompanying QuantStudio Real-Time PCR software v.1.7 (Thermo Fisher Scientific). The delta-threshold cycle number (ΔCt) was calculated as the difference in threshold cycle number (Ct) between the gene of interest and GAPDH. The ΔΔCt was calculated as the difference between the ΔCt of a particular sample and the average ΔCt of the DMSO-treated, sgChr2.2 control samples. Fold increase in gene expression (after the loss of ETV6) was calculated as the ratio of 2^(−ΔΔCt) in dTAGV-1-treated cells to the average 2^(−ΔΔCt) in DMSO-treated cells, in either the sgChr2.2 or the sgFLI conditions.
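The ΔΔCt arithmetic described above can be sketched in Python as follows. This is a minimal illustration rather than the analysis code used in the study; the function names and the Ct values are hypothetical.

```python
from statistics import mean


def delta_ct(ct_gene, ct_gapdh):
    """ΔCt: Ct of the gene of interest minus Ct of the GAPDH control."""
    return ct_gene - ct_gapdh


def relative_expression(sample, reference_dct):
    """2^(-ΔΔCt), where ΔΔCt = ΔCt(sample) - reference ΔCt."""
    ct_gene, ct_gapdh = sample
    return 2.0 ** -(delta_ct(ct_gene, ct_gapdh) - reference_dct)


def fold_increase(treated, dmso, reference_dct):
    """Ratio of 2^(-ΔΔCt) in each treated sample to the DMSO average."""
    dmso_average = mean(relative_expression(s, reference_dct) for s in dmso)
    return [relative_expression(s, reference_dct) / dmso_average for s in treated]


# Hypothetical (Ct gene, Ct GAPDH) pairs for one sgRNA condition.
dmso_samples = [(25.0, 18.0), (25.0, 18.0)]
treated_samples = [(23.0, 18.0), (23.0, 18.0)]

# Reference ΔCt: average ΔCt of the DMSO-treated, sgChr2.2 control samples
# (here the same hypothetical DMSO samples are reused as that control).
reference = mean(delta_ct(*s) for s in dmso_samples)
fold = fold_increase(treated_samples, dmso_samples, reference)
print(fold)
```

In this toy example, a drop of two cycles in ΔCt after treatment corresponds to a fourfold increase in expression.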

RNA-seq
All RNA-seq experiments were performed using total RNA extracted using a column-based kit (Qiagen, 74104) and treated with DNAse digestion. The Life Technologies external RNA control consortium (ERCC) RNA spike-in samples were added to each sample for normalization per kit instructions (Thermo Fisher Scientific, 4456740). For all RNA-seq experiments, except the A673 sgETV6 CRISPR-Cas9 experiments, RNA-seq library preparation and sequencing were performed by Novogene (https://en.novogene.com) at a depth of roughly 20 million reads per sample. Per Novogene correspondence, the quality control for the RNA samples was performed using Qubit fluorometric quantitation (Thermo Fisher Scientific) and a Bioanalyzer instrument (Agilent). Libraries were then prepared using a New England Biolabs NEBNext Ultra II non-directional RNA Library Prep kit. Library quality and concentrations were assessed using Labchip (Perkin Elmer) and qPCR. Libraries were sequenced in 150-bp paired-end fashion on a Novaseq6000 instrument (Illumina). For the A673 sgETV6 CRISPR-Cas9 experiments, polyA-tailed mRNA was isolated from 1 µg total RNA using a magnetic bead-based kit per kit instructions (New England Biolabs, E7490S). RNA-seq library preparation was performed using a NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (New England Biolabs, E7760S). Libraries were quantified using a Qubit dsDNA high sensitivity assay (Q32851). The distribution of DNA fragment sizes was measured using a High Sensitivity D1000 assay (Agilent, ScreenTape, 5067-5584; reagents, 5067-5585). The molarity of each library was calculated and normalized to 4 nM. Libraries were pooled and sequenced on a Nextseq 500 instrument (Illumina) (single-end; 75 cycles at a depth of roughly 40 million reads per sample) using a Nextseq 500 sequencing kit (Illumina, 20024906).
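The library molarity calculation and normalization to 4 nM mentioned above can be sketched as follows. The conversion uses the common approximation of 660 g mol⁻¹ per base pair of double-stranded DNA; the function names, the 20 µl final volume and the example concentration are assumptions for illustration and are not taken from the study.

```python
def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp, bp_mass=660.0):
    """Convert a dsDNA concentration (ng/µl) to molarity (nM).

    Uses the approximation of 660 g/mol per base pair of dsDNA.
    """
    if mean_fragment_bp <= 0:
        raise ValueError("mean fragment size must be positive")
    return conc_ng_per_ul * 1e6 / (bp_mass * mean_fragment_bp)


def dilution_volumes(molarity_nm, target_nm=4.0, final_volume_ul=20.0):
    """Return (library µl, diluent µl) needed to reach the target molarity."""
    if molarity_nm < target_nm:
        raise ValueError("library is already below the target molarity")
    library_ul = target_nm * final_volume_ul / molarity_nm
    return library_ul, final_volume_ul - library_ul


# Hypothetical library: 2.64 ng/µl (Qubit) with a 400 bp mean fragment size.
molarity = library_molarity_nm(2.64, 400)
library_ul, diluent_ul = dilution_volumes(molarity)
print(molarity, library_ul, diluent_ul)
```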

CUT&Tag
CUT&Tag was performed as previously described 50 with slight modifications by the Lessnick Laboratory (Nationwide Children's Hospital, Columbus, OH). About 250,000 cells per CUT&Tag condition were bound to BioMag Plus Concanavalin A-coated magnetic beads (Bangs Laboratories, BP531) and incubated with primary antibodies (ETV6 rabbit, Bethyl A303-674, 1:50; ETV6 mouse, Sigma WH0002120M1, 1:50; rabbit anti-mouse, Abcam, ab46540, 1:50) overnight at 4 °C, and secondary antibodies (guinea pig anti-rabbit IgG, Antibodies-Online ABIN101961, 1:100; rabbit anti-mouse, Abcam ab46540, 1:100) for 1 h at room temperature. Adapter-loaded protein A-Tn5 fusion protein was added at a dilution of 1:250 and incubated for 1 h at room temperature. To activate Tn5, tagmentation buffer containing MgCl 2 was added and samples were incubated for 1 h at 37 °C. Reactions were stopped by addition of EDTA, and DNA was solubilized with SDS and proteinase K for 1 h at 50 °C. Total DNA was purified using phenol-chloroform extraction followed by ethanol precipitation. CUT&Tag libraries were prepared using NEBNext HiFi 2× PCR master mix (NEB, M0541S) and indexed primers 97 using a combined annealing-extension step at 63 °C for 10 s and 15 cycles followed by a 1.1× post-amplification AMPure XP (Beckman Coulter, A63880) bead clean-up. Libraries were pooled and sequenced (2 × 150 bp paired end) on an Illumina HiSeq4000 platform (Nationwide Children's Hospital Institute for Genomic Medicine). Two independent replicates of each CUT&Tag sample were prepared.

ChIP-seq
Antibodies were conjugated to magnetic beads. For each immunoprecipitation (IP), 100 µl of protein A Dynabeads (Thermo Fisher Scientific, 10002D) were washed three times in 1 ml BSA blocking solution (0.5% w/v sterile-filtered BSA in H 2 O) and resuspended in 250 µl. Beads were then rotated overnight at 4 °C with antibody, using 5 µg to target H3K27ac (Abcam, 4729) or 10 µg to target TFs (anti-HA, Abcam, ab9110; anti-FLI1, Abcam, ab15289). For comparative studies (that is, comparing the relative binding of EWS-FLI), 2 µg of spike-in antibody recognizing a Drosophila-specific histone variant was added (Active Motif, 61686). The following morning, the antibody-conjugated beads were washed four times in 1 ml BSA blocking solution and then resuspended in 100 µl of the solution and stored at 4 °C.
To prepare sheared chromatin, Ewing sarcoma cells (20 million cells per ChIP reaction) were collected in a 1.5 ml tube and washed twice in 1 ml PBS. Cells were then crosslinked by resuspension in 1 ml PBS containing 1% methanol-free formaldehyde (Thermo Fisher Scientific, 28906) and rotated for 10 min at room temperature at 12 r.p.m. The reaction was quenched with 100 µl of 1.25 M glycine (Sigma Aldrich, G7126) and 100 µl 1 M Tris-HCl pH 8.0 (Thermo Fisher Scientific, 15568025). Cell pellets were washed twice with 1 ml PBS at room temperature and resuspended in 1 ml of SDS lysis buffer (0.5% SDS, 5 mM EDTA, 50 mM Tris-HCl pH 8.0) supplemented with protease inhibitor cocktail (Thermo Fisher Scientific, PI78429) and incubated at room temperature for 2 min with gentle agitation. Lysates were centrifuged at 15,000g for 10 min at 4 °C. The nuclear pellet was re-suspended in 950 µl of ChIP IP buffer (2 parts SDS lysis buffer and 1 part Triton dilution buffer, which was composed of 100 mM Tris-HCl pH 8.0, 100 mM NaCl, 5 mM EDTA, 0.2% NaN₃ and 5% Triton X-100) supplemented with protease inhibitor and transferred to a milliTUBE (Covaris, 520130). Sonication was performed on an E220 Focus Ultra sonicator (Covaris) at 5% duty cycle, 140 W peak power, 200 cycles per burst, at 4 °C for 30 min per milliTUBE. Sheared chromatin was transferred to a 1.5 ml tube and centrifuged at 15,000g for 10 min at 4 °C. The supernatant of sheared chromatin was transferred to a new reaction tube. To prepare the ChIP DNA input sample, 5 µl of sheared chromatin was transferred to a PCR strip-tube and mixed with 40 µl de-crosslinking buffer (100 mM NaHCO₃ and 1% SDS buffer), 1 µl RNAse A (Thermo Fisher Scientific, 12091021) and 1 µl proteinase K (Thermo Fisher Scientific, AM2546). The tube was incubated for 2 h at 65 °C in a thermal cycler to de-crosslink DNA-protein covalent bonds.
DNA was isolated using Agencourt AMPure XP bead-based purification at a 1.2× ratio (Beckman Coulter, A63881), eluted in 50 µl H 2 O and stored at −20 °C. The remaining sheared chromatin was divided or pooled according to the target of interest; at least 5 million cells were used for IP of histone marks and 40 million cells for TFs. Each IP reaction was brought up to a total volume of at least 1 ml with ChIP IP buffer. Pooled reactions were conducted in 2 ml or 5 ml reaction tubes. 50 ng or 20 ng of Drosophila spike-in chromatin was added for each H3K27ac or TF ChIP reaction, respectively. The 100 µl conjugated bead-antibody solution was then added to the sheared chromatin. IP reactions were rotated overnight at 4 °C.
ChIP-seq libraries were prepared using a SMARTer ThruPLEX single-index DNA-Seq kit (Takara Bio, R400674, R400695). H3K27ac and TF samples were PCR-amplified 4 and 10 cycles, respectively. Libraries were prepared as described above and sequenced in 37-bp paired-end fashion for 75 cycles (Illumina, 20024906) at a depth of roughly 30 million reads per sample on the NextSeq 500.

ChIP-seq data analysis
The raw Illumina sequencer output was converted to fastq format using the program bcl2fastq (v.2.17). Sequencing read quality was examined using FastQC (http://www.bioinformatics.babraham.ac.uk) (v.0.11.9). Trimming of low-quality reads and clipping of sequencing adapters was done using the program Trimmomatic (v.0.36) 101, and all reads shorter than 40 bp after trimming were discarded. Reads were aligned to the human genome (hg19) using Bowtie2 (v.2.3.5) 102,103 with the '--very-sensitive' preset collection of parameters. File conversion of .sam to .bam was done using SamTools (v.1.9q) 104, and duplicate reads were removed using Picard-tools (v.2.19.0) (http://picard.sourceforge.net). ChIP-seq peaks were called using MACS2 (ref. 105) with a false discovery rate (FDR) q < 0.01 unless otherwise stated. The MACS2 algorithm utilizes a dynamic Poisson distribution to capture local biases in the genomic sequence, which allows for a sensitive and robust prediction of peaks. Unless otherwise noted, peaks were assigned to the closest gene within ±400 kb using the ChIPseeker package in R 106. Visualizations of the ChIP-seq data tracks were produced with the R Bioconductor Gviz package 107.
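To illustrate the idea of assigning each peak to the closest gene within ±400 kb, the following minimal Python sketch measures the distance from a peak midpoint to each transcription start site (TSS) on the same chromosome. It is not the ChIPseeker implementation, which was used in the study and defines distances in its own way; the function name, the midpoint-based distance and the gene coordinates are assumptions for this example.

```python
def assign_nearest_gene(peak, tss_list, max_distance=400_000):
    """Return (gene, distance) for the closest TSS within max_distance.

    peak: (chromosome, start, end)
    tss_list: iterable of (gene name, chromosome, TSS position)
    Returns None when no TSS on the same chromosome is close enough.
    Assumption: distance is measured from the peak midpoint to the TSS.
    """
    chrom, start, end = peak
    midpoint = (start + end) // 2
    best = None
    for gene, gene_chrom, tss in tss_list:
        if gene_chrom != chrom:
            continue
        distance = abs(tss - midpoint)
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (gene, distance)
    return best


# Hypothetical TSS annotations and peaks, not coordinates from the study.
tss_annotations = [
    ("GENE_A", "chr1", 1_000_000),
    ("GENE_B", "chr1", 2_000_000),
    ("GENE_C", "chr2", 1_050_000),
]
near_peak = ("chr1", 1_100_000, 1_100_200)
far_peak = ("chr1", 5_000_000, 5_000_200)

print(assign_nearest_gene(near_peak, tss_annotations))
print(assign_nearest_gene(far_peak, tss_annotations))
```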

CUT&RUN data analysis
CUT&RUN FLI1 data for the PEDS0009 sample were processed using a pipeline based on the bulk-level method outlined in CUT&RUNTools 2.0 (ref. 111) that is largely the same as the ChIP-seq pipeline. The changes to the ChIP-seq pipeline are an extra adapter trimming step run after Trimmomatic using kseq from CUT&RUNTools and the addition of the '--dovetail' flag to the Bowtie2 command. CUT&RUN samples also included an E. coli spike-in for sample normalization, which was aligned to the E. coli (Escherichia_coli_K_12_DH10B NCBI 2008-03-17) genome.

Differential ChIP-seq binding
Differential binding analysis in ETV6-dTAG ChIP-seq samples was performed with the R Bioconductor package CSAW 89. CSAW uses a sliding window approach to count reads across the genome from sorted and indexed .bam files, for which each window is tested for significant differences between libraries using statistical methods from the edgeR package. Differential CSAW analysis was performed on A673 and EW8 ETV6-dTAG at 6 and 72 h in FLI1 and H3K27ac. The differential analysis performed here normalized samples based on Drosophila spike-in values, the reads of which were aligned to the dm6 version of the Drosophila genome. The differential ChIP-seq analysis procedure generally followed the approach outlined in the CSAW introductory usage tutorial as follows. The .bam files were read in allowing a maximum fragment length of 800, a minimum q = 20 and discarding any reads that fell in the hg19 or dm6 ENCODE blacklist files. A window size of 150 bases was used for analysis and tiled across the genome in 50 base steps. The ChIP-seq input control samples were used to help filter out regions containing just background reads by binning input control reads into 10,000 base blocks with a threshold of minimum prior counts of 2. The binned input reads were then compared with the ChIP-seq binding across all regions, and all ChIP-seq regions with a fold change of less than 3 over input were filtered out. After filtering, adjacent and overlapping 150 base regions were merged together to reduce the number of hypotheses tested (for example, A673 6 h ETV6-dTAG FLI1 had an average merged window width of 494 bases). Drosophila spike-in control reads were processed similarly to the human reads except, as there was no input control for the spike-in control, the spike-in reads were filtered using a global filtering method that required regions to be threefold above background.
The counts for all enriched spike-in regions were used to calculate the normalization factors by applying the trimmed mean of M-values method on these counts via the function normFactors. Differential binding is tested for significance using the quasi-likelihood framework in the edgeR package, whereby edgeR models the counts using a negative binomial distribution that accounts for over-dispersion between biological replicates. To account for multiple hypothesis testing, CSAW converts per-window statistics into a P value for each region and then applies the Benjamini-Hochberg method to calculate the corrected FDR.
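The final multiple-testing step described above, the Benjamini-Hochberg correction, can be illustrated with a short Python sketch. This covers only the correction of per-region P values and not the earlier step in which CSAW combines per-window statistics into a P value for each region; the function name and the P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical per-region P values.
region_p = [0.01, 0.04, 0.03, 0.005]
region_fdr = benjamini_hochberg(region_p)
print(region_fdr)
```

Each adjusted value is the smallest FDR threshold at which the corresponding region would be called significant.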

ChIP-seq heatmaps
ChIP-seq heatmaps were produced using the following functions from the deeptools package (v.3.3.0) 112: computeMatrix, plotProfile and plotHeatmap. All heatmaps were made using data in .bigWig files generated by deeptools bamCompare, which compares a ChIP-seq sample .bam file to its corresponding input (from the same cell line and same batch) while simultaneously normalizing for sequencing depth. The function computeMatrix was then used to calculate scores for genome regions and to prepare an intermediate file that can be used with plotHeatmap and plotProfile. Unless otherwise stated, the genome regions were regions defined by a BED file corresponding to ETV6 or FLI1 peaks. For Fig. 2a-c and Extended Data Fig. 3a-c, computeMatrix was used with multiple .bigWig score files and two BED region files, in which the ETV6 peaks are split into two groups depending on whether the ETV6 peak overlapped with a region defined by gene TSSs ± 2.5 kb according to UCSC hg19 refGene transcript definitions. Figure 2g,h used regions defined by differential FLI1 regions from P < 0.05 CSAW, whereby regions not intersecting with a TSS were further divided into two groups according to whether the region intersects with a H3K27ac ChIP-seq peak from MACS2 with q < 0.01 in the parental A673 or EW8 cell line.

GGAA repeat frequency at peak locations
Stacked bar plots were created in R using frequencies of overlap from the function summarizePatternInPeaks from R Bioconductor package ChIPpeakAnno (v.3.9) 113 . The function summarizePatternInPeaks was used to calculate the frequency of overlap of regions of the standard hg19 reference genome with GGAA repeats (from a single GGAA up to five consecutive GGAA sequences without any gaps) with peaks in FLI1 and ETV6 as called by MACS2. The ENCODE datasets analysed were from the Gene Expression Omnibus: GSE96274 (B lymphocyte) and GSE95877 (K-562).
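As a rough illustration of counting consecutive GGAA repeats within peak sequences, the following Python sketch finds the longest gap-free run of GGAA in each sequence and tallies peaks by run length, capped at five. It is not the summarizePatternInPeaks function used in the study; the function names and sequences are hypothetical, and only the forward strand is searched, which is a simplifying assumption.

```python
import re
from collections import Counter

# One or more GGAA units with no gaps between them.
GGAA_RUN = re.compile(r"(?:GGAA)+")


def longest_ggaa_run(sequence):
    """Return the largest number of consecutive GGAA units in a sequence."""
    runs = [len(match.group(0)) // 4 for match in GGAA_RUN.finditer(sequence.upper())]
    return max(runs, default=0)


def tally_by_repeat_count(sequences, cap=5):
    """Count sequences by their longest GGAA run, capped at `cap` units."""
    return Counter(min(longest_ggaa_run(seq), cap) for seq in sequences)


# Hypothetical peak sequences, not sequences from the study.
peak_sequences = [
    "ACGGAAGGAAGGAATT",   # three consecutive repeats
    "ACGTACGT",           # no GGAA
    "GGAATGGAAGGAA",      # runs of one and two repeats
    "GGAA" * 7,           # seven repeats, counted in the capped bin
]
tally = tally_by_repeat_count(peak_sequences)
print(sorted(tally.items()))
```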

Differential ATAC-seq regions
Processing of ATAC-seq data (that is, Fig. 4c) used the same pipeline as the ChIP-seq data, although an extra step was added after Bowtie2 alignment that used samtools to remove mitochondrial reads (ChrM). CSAW was used for the differential analysis of ATAC-seq data in the same manner as for the ChIP-seq data, except that there was no input control for filtering or spike-in control for sample normalization. In the absence of a matching input control, CSAW region filtering was performed by requiring regions to be threefold above the local background, whereby local background was defined using wider windows of 2,000 bases around each region. Within CSAW, ATAC-seq samples were normalized to the background using 10,000 base windows to calculate compositional biases of samples.

RNA-seq data analysis
Gene expression values were derived from paired-end RNA-seq data, except for the A673 sgETV6 CRISPR-Cas9 RNA-seq experiment, which was sequenced in single-end fashion. The RNA-seq processing pipeline was roughly modelled on the GTEx pipeline (https://github.com/broadinstitute/gtex-pipeline/) 114. FastQC was used to evaluate read quality on raw RNA-seq reads. Reads were aligned to the human genome (hg19) using STAR 115. Transcript-level quantifications were calculated using RSEM (v.1.3.1) 116. Gene counts from STAR were then used to quantify differentially expressed genes between the experimental and control conditions using the R Bioconductor package DESeq2 (ref. 108) using the approximate posterior estimation for GLM coefficients (apeglm) method for effect size. Normalized expression values for individual samples were obtained from RSEM log2(TPM) values with the RSEM log2(TPM + 1) values used for GSEA and producing RNA-seq heatmap plots.
The RNA-seq samples included the ERCC spike-in control mix 117 . Sequences for the ERCC transcripts were added to the hg19 reference for the STAR transcript alignment, and the fold changes of ERCC probes were examined in the DESeq2 output. Fold changes for ERCC probes were typically very small between the conditions in the ETV6-dTAG sample sets (for example, average fold change for 24 h A673 ETV6-dTAG of 0.995 between conditions). As such, ERCC spike-ins were not used to perform sample normalization.

Gene set pathway enrichment analysis
Gene set pathway enrichment analysis was performed with signatures from v.6.0 of the Broad Institute's molecular signature database (MSigDB) (http://www.broadinstitute.org/gsea/msigdb/index.jsp) using the c2 curated gene sets from various sources such as online pathway databases, the biomedical literature and knowledge of domain experts. These pathway enrichment results are shown in Fig. 3f and Extended Data Fig. 4g. Pathway enrichment analysis was performed in R using the clusterProfiler package, which provides the enricher function for a hypergeometric test of over-representation of pathway genes in a set of user-defined genes. Figure 3f shows a combined enrichment plot of the top MSigDB c2 pathways enriched in the ETV6-repressed genes at 6, 24 and 72 h common to both A673 and EW8 (genes up in ETV6 dTAGV-1 treatment RNA-seq). The plot shows a selected subset of the top enriched c2 gene sets, and the complete set of enriched sets is shown in Supplementary Tables 7-11. The dot size corresponds to the number of genes in the gene set out of the total number of significantly ETV6-repressed genes at 6, 24, and 72 h (85, 251 and 832 genes, respectively). The colour corresponds to the gene set grouping. Missing points at times along the x axis represent times at which the enrichment was not significant with P < 0.05. The pathways are ordered first by the gene group and then by the average gene ratio (count of repressed genes in a pathway/number of repressed genes) across the three time points. Extended Data Fig. 4g shows a combined enrichment plot of the top MSigDB c2 pathways enriched in the ETV6-activated genes at 6, 24 and 72 h common to both A673 and EW8 (genes down in ETV6 dTAGV-1 treatment RNA-seq). The plot shows a selected subset of the top enriched c2 gene sets, and the complete set of enriched sets is shown in Supplementary Tables 12-17.
The dot size corresponds to the number of genes in the gene set out of the total number of significantly ETV6-activated genes at 6, 24 and 72 h (33, 130 and 543 genes, respectively). The colour corresponds to the gene set grouping. Missing points at times along the x axis represent times at which the enrichment was not significant. The pathways are ordered first by the gene group and then by the average gene ratio (count of repressed genes in a pathway/number of repressed genes) across the three time points. Extended Data Fig. 7a shows a pie chart of the top 100 enriched c5 gene sets, ranked by significance, in A673 ETV6-dTAG cells at 24 h. Each c5 gene signature was assigned to one of the categories listed; a complete list is shown in Supplementary Table 19.
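The hypergeometric over-representation test and the gene ratio described in this section can be sketched in Python as follows. This is a generic illustration of the test, not the clusterProfiler enricher implementation; the function names, the background size and the overlap counts are hypothetical, and only the list size of 85 genes echoes a number given in the text.

```python
from math import comb


def hypergeom_enrichment_p(overlap, pathway_size, query_size, background_size):
    """P(X >= overlap) for a hypergeometric draw without replacement.

    background_size: number of genes in the background universe
    pathway_size: number of background genes that belong to the pathway
    query_size: number of genes in the user-defined list
    overlap: number of query genes that belong to the pathway
    """
    total = comb(background_size, query_size)
    upper = min(pathway_size, query_size)
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def gene_ratio(overlap, query_size):
    """Count of listed genes in a pathway divided by the size of the list."""
    return overlap / query_size


# Hypothetical example: 17 of 85 listed genes fall in a 200-gene pathway
# drawn from an assumed background of 20,000 genes.
p_value = hypergeom_enrichment_p(17, 200, 85, 20_000)
ratio = gene_ratio(17, 85)
print(p_value, ratio)
```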

GSEA
The GSEA algorithm 118,119 was used to evaluate the association of gene sets with ETV6 regulation. GSEA was run with 2,500 permutations of the phenotype using signal-to-noise to rank genes. This GSEA algorithm was used for Fig. 3e to test enrichment and generate enrichment plots of ETV6-bound genes in ETV6-regulated genes. The A673 ETV6 peak locations are defined by the peaks that overlap in all three A673 ETV6 samples (two A673 ETV6 CUT&Tag samples from two ETV6 antibodies and one untreated A673 ETV6 dTAG HA sample) and the EW8 ETV6 peak locations are defined by peaks in the EW8 ETV6 HA sample. ETV6-bound genes were identified by mapping the peaks to their nearest genes using the R package ChIPseeker.
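The signal-to-noise metric used to rank genes can be illustrated with a minimal Python sketch: the difference between the class means divided by the sum of the class standard deviations. The GSEA software applies additional adjustments to the standard deviations that are omitted here, so this is a simplified sketch; the function name and expression values are hypothetical.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """(mean_A - mean_B) / (sd_A + sd_B) using sample standard deviations.

    Assumption: no minimum-variance adjustment is applied, unlike in the
    GSEA software itself.
    """
    noise = stdev(class_a) + stdev(class_b)
    if noise == 0:
        raise ValueError("both classes have zero variance")
    return (mean(class_a) - mean(class_b)) / noise


# Hypothetical log2(TPM + 1) values for two genes in two phenotype classes.
expression = {
    "GENE_X": ([10.0, 12.0, 14.0], [4.0, 6.0, 8.0]),
    "GENE_Y": ([5.0, 6.0, 7.0], [5.5, 6.5, 7.5]),
}
scores = {gene: signal_to_noise(a, b) for gene, (a, b) in expression.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```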

Statistics and reproducibility
Further information is available in the Nature Portfolio Reporting Summary linked to this article.

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
CRISPR-Cas9 screen data and the genomic characterization of cancer cell lines (RNA-seq) used in this study are publicly available at https://depmap.org. Gene expression data from the Treehouse Childhood Cancer Initiative characterizing primary tumours are publicly available at https://treehousegenomics.soe.ucsc.edu/public-data/. The Broad Institute's MSigDB is publicly available at http://www.broadinstitute.org/gsea/msigdb/index.jsp. Genomics data shown in this study have been deposited in the Gene Expression Omnibus under accession code GSE181554. Source data are provided with this paper.