Site- and allele-specific polycomb dysregulation in T-cell leukaemia

T-cell acute lymphoblastic leukaemias (T-ALL) are aggressive malignant proliferations characterized by high relapse rates and great genetic heterogeneity. TAL1 is amongst the most frequently deregulated oncogenes. Yet, over half of the TAL1+ cases lack TAL1 lesions, suggesting unrecognized (epi)genetic deregulation mechanisms. Here we show that TAL1 is normally silenced in the T-cell lineage, and that the polycomb H3K27me3-repressive mark is focally diminished in TAL1+ T-ALLs. Sequencing reveals that >20% of monoallelic TAL1+ patients without previously known alterations display microinsertions or RAG1/2-mediated episomal reintegration in a single site 5′ to TAL1. Using ‘allelic-ChIP’ and CRISPR assays, we demonstrate that such insertions induce a selective switch from H3K27me3 to H3K27ac at the inserted but not the germline allele. We also show that, despite considerable mechanistic diversity, the mode of oncogenic TAL1 activation, rather than expression levels, impacts clinical outcome. Altogether, these studies establish site-specific epigenetic desilencing as a mechanism of oncogenic activation.

That the insertion occurred outside a cryptic site is surprising considering the many functional cryptic RSSs surrounding the insertion region (Fig. 4), some of which are well documented to be involved in RAG-mediated STIL-TAL1 deletion or t(1;14) translocation.

2. RAG-mediated transposition 7,8: double-ended RAG-mediated transposition is an alternative mechanism of episomal insertion which does not involve an RSS at the insertion site. This mechanism generates specific imprints at the breakpoints, namely a short (~5 bp) duplication of the insertion site, due to the asymmetric nucleophilic attack by the episomal SEs, which leaves a staggered opening of the genomic target that is subsequently filled in during ligation/repair. No such transposition marks were apparent at the breakpoints in patient OC (Fig. 4).

3. End-donation 9: end-donation is another mechanism of episomal insertion which does not involve an RSS at the insertion site. Intermediate coding or signal 10 (RSS) ends are erroneously repaired with chromosomal broken ends, and breakpoint features include N-nucleotide addition in the junction; the presence of short deletions or duplications on the broken-end side depends on whether the initial break generated a 3' overhang, blunt or 3' recessed ends. End-donation is by far the most frequent mechanism of V(D)J-mediated translocations in human B- and T-cell neoplasia (also called "type 2" translocation), and a frequent mechanism of episomal reinsertion in lymphoid cells from mouse models (estimated to occur once every ~50,000 V(D)J recombinations) 11,12. Breakpoint features in patient OC were compatible with end-donation. Intriguingly, this case corresponds to a SCID-X1 patient who developed leukaemia secondary to another insertion (a retroviral-induced insertion in front of LMO2, a well-known TAL1-cooperating oncogene, following gene therapy).
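The transposition imprint described above (a short direct duplication of the target site on both sides of the insert) can be checked mechanically once the two junction flanks are known. The following is a minimal sketch, not the procedure used in this study: the function name `target_site_duplication`, the length window (3 to 9 bp) and the example sequences are all hypothetical and chosen only for illustration.

```python
def target_site_duplication(left_flank, right_flank, min_len=3, max_len=9):
    """Return the longest direct repeat flanking an insertion, or '' if none.

    left_flank:  genomic sequence immediately 5' of the insert (ends at the junction).
    right_flank: genomic sequence immediately 3' of the insert (starts at the junction).
    The 3-9 bp window is an illustrative assumption, not a value from the text.
    """
    left = left_flank.upper()
    right = right_flank.upper()
    longest = min(max_len, len(left), len(right))
    # Try the longest candidate repeat first, then shorter ones.
    for k in range(longest, min_len - 1, -1):
        if left[-k:] == right[:k]:
            return left[-k:]
    return ""


# Hypothetical junctions: the first carries a 5 bp duplication (transposition-like),
# the second has none (as expected for end-donation without duplication).
print(target_site_duplication("TTGACCGTA", "CCGTAGGCA"))
print(target_site_duplication("TTGACCGTA", "GGCATTC"))
```

A junction without such a repeat does not by itself prove end-donation; N-nucleotide additions and short deletions at the broken end would need to be inspected separately.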

Supplementary Figure 6. Identification of TAL1 transcription start site in Patient OC. A. A putative mechanism in which transcription initiates from the episome and generates a >7 kb-long fusion transcript encompassing TAL1 (illustrated in the bottom lane) was tested. A 5' Rapid Amplification of cDNA Ends (5' RACE) assay was performed from TAL1 exon 4 and exon 6. STIL-TAL1 cell lines (CCRF-CEM and RPMI 8402), in which transcription initiates from STIL promoters, were used as controls and gave rise to complete STIL-TAL1 fusion transcripts (top lanes). In OC, a single transcript corresponding to the oncogenic p4 TAL1 variant was obtained from exon 6 (starting from p4 and comprising part of exon 4 and full exons 5 and 6); accordingly, no RACE product could be obtained from exon 4 (middle lanes). RACE primers are indicated by black arrows. B. An RT-PCR exon-walking assay was also performed in and between various TAL1 exons, across the episomal breakpoint, and in the episome to detect potential splice variants. Expression is normalized to ABL. In line with the RACE data, no amplification of TAL1 exons or breakpoints was observed upstream of p4 (walking primers are pictured as blue arrowheads). We conclude that TAL1 overexpression in OC was not initiated from a transcriptional start site located in the inserted episome.

Supplementary Figure 7. Dysregulation of PcG complex in bi-allelic TAL1 cases. Considering the involvement of PcG in insertional mutagenesis leading to mono-allelic TAL1 expression, we investigated whether PRC2 mutations might similarly lead to trans-activation (and therefore bi-allelic expression) of TAL1. We first tested whether bi-allelic TAL1 cases were associated with a general decrease in the expression levels of one or several of the main (co-)factors of the PRC2 complex (EZH2, SUZ12, EED, AEBP2). Although a large individual variability in expression levels was apparent, no statistically significant differences (Mann-Whitney test) could be observed in EZH2, SUZ12, EED, or AEBP2 expression levels according to either the mono- vs. bi-allelic status (A.) or TAL1 expression levels (B.). This suggested the absence of a unifying mechanism targeting PRC2 expression in bi-allelic TAL1 cases.
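For readers unfamiliar with the test named above, the following is a minimal sketch of a two-sided Mann-Whitney comparison between two groups. It is not the software used in this study: the function `mann_whitney_u` is hypothetical, the p-value uses the normal approximation without tie or continuity correction (an assumption that is only reasonable for moderately large groups), and the expression values are invented placeholders.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided p) for samples x and y (normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j (ties share it)
        for k in range(i, j):
            ranks[k] = avg
        i = j
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail of the standard normal
    return u, p


# Hypothetical normalized expression values for one PRC2 factor in two groups.
mono_allelic = [0.8, 1.1, 0.9, 1.4, 1.0, 1.2]
bi_allelic = [0.7, 1.3, 1.0, 0.95, 1.5, 0.85, 1.15]
u_stat, p_value = mann_whitney_u(mono_allelic, bi_allelic)
print(u_stat, p_value)
```

For small groups or heavily tied data, an exact or tie-corrected implementation from a statistics package would be preferable to this approximation.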
To further test whether loss-of-function of some components of the PRC2 complex might sporadically lead to cases of bi-allelic TAL1 activation, we screened for mutations (by direct sequencing) and/or copy number variation (i.e. deletion, by CGH-array and quantitative genomic PCR) of EZH2 and SUZ12, two PRC2 complex components in which mutations were previously reported. Mutations in EZH2 and SUZ12 were only observed in bi-allelic cases (3/47 bi-allelic cases tested vs 0/28 mono-allelic; C. and Table S2). Large deletions in SUZ12 were also observed in 6/47 bi-allelic and 2/28 mono-allelic cases (D. and Table S2). Of note, the latter corresponded to the two largest deletions observed and are therefore likely to include genes other than SUZ12. None of the mono-allelic cases with insertion contained mutations or deletions in EZH2 or SUZ12.
Because we could not exhaustively screen all possible mutations in all (co-)factors of the PRC2 and PRC1 complexes (most of which are actually unknown in humans), we cannot evaluate the extent to which direct or indirect PcG loss-of-function alterations might add up to contribute globally to TAL1 trans-activation in bi-allelic cases. Nevertheless, within the limits of this small screen, our data suggest that this may be a relatively infrequent event which cannot account for the large fraction of bi-allelic TAL1 expression.
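The mutation and deletion counts given above for panels C and D (3/47 vs 0/28, and 6/47 vs 2/28) can be placed in 2x2 tables and compared with a two-sided Fisher's exact test. The sketch below is offered only as an illustration of how such small counts might be compared; it is not an analysis reported in this study, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k col1-events in row 1, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum all tables at least as unlikely as the observed one (small float tolerance).
    return sum(p for p in (prob(k) for k in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Rows: bi-allelic, mono-allelic. Columns: altered, not altered (counts from the text).
p_mutations = fisher_exact_two_sided(3, 44, 0, 28)
p_deletions = fisher_exact_two_sided(6, 41, 2, 26)
print(p_mutations, p_deletions)
```

With so few altered cases, any such test has little power, which is consistent with the cautious interpretation given in the legend.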