SCL/TAL1 cooperates with Polycomb RYBP-PRC1 to suppress alternative lineages in blood-fated cells

During development, it is unclear if lineage-fated cells derive from multilineage-primed progenitors and whether active mechanisms operate to restrict cell fate. Here we investigate how mesoderm specifies into blood-fated cells. We document temporally restricted co-expression of blood (Scl/Tal1), cardiac (Mesp1) and paraxial (Tbx6) lineage-affiliated transcription factors in single cells, at the onset of blood specification, supporting the existence of common progenitors. At the same time-restricted stage, absence of SCL results in expansion of cardiac/paraxial cell populations and increased cardiac/paraxial gene expression, suggesting active suppression of alternative fates. Indeed, SCL normally activates expression of co-repressor ETO2 and Polycomb-PRC1 subunits (RYBP, PCGF5) and maintains levels of Polycomb-associated histone marks (H2AK119ub/H3K27me3). Genome-wide analyses reveal ETO2 and RYBP co-occupy most SCL target genes, including cardiac/paraxial loci. Reduction of Eto2 or Rybp expression mimics Scl-null cardiac phenotype. Therefore, SCL-mediated transcriptional repression prevents mis-specification of blood-fated cells, establishing active repression as central to fate determination processes.

In embryonic development, early cell fate decisions occur at gastrulation when epiblast cells migrate through the primitive streak (PS) and specify into germ layers [1][2][3][4] . A characteristic of cells undergoing specification is their developmental plasticity, as revealed by their ability to respond to changing environmental or intrinsic cues and adopt different fates, suggesting multipotency 3 . Cellular identity becomes locked after egression from the PS. Therefore, acquisition of lineage-specific features is likely to coincide with progressive loss of the ability to generate alternative lineages through restriction of cell potential. Whether emerging tissue-specific transcriptional regulators driving lineage specification also actively control these restriction processes is unclear.
Specification of blood cells from FLK1 + mesodermal progenitors is a model of lineage development. The initial steps of this process are under control of the transcription factor (TF) SCL/TAL1, placing it at the apex of the haematopoietic transcriptional hierarchy 5,6 . Interestingly, absence of SCL not only leads to complete block in haematopoiesis 7,8 , but also to expansion of the heart field in zebrafish embryos and ectopic cardiomyocyte production from mouse yolk sac and ES cell-derived endothelium, revealing latent cardiac potential in blood/endothelial progenitors [9][10][11] . Conversely, forced expression of Scl mRNA expands blood and endothelial tissues at the expense of myocardial tissues in vivo 12 and in vitro 13 . This highlights a close developmental relationship between blood and cardiac lineages and supports the notion of plasticity.
However, it is unclear if common, multilineage-primed blood/cardiac mesodermal progenitors exist and whether active repression mechanisms are established in blood-fated cells to prevent development of the cardiac lineage. Two recent studies propose contrasting mechanisms. Molecular analyses of ES cell-derived FLK1 + cells show that SCL occupies a subset of enhancers regulating cardiac-specific genes, suggesting this makes these enhancers unavailable for activation by cardiac-specific TFs 11 . In contrast, single cell analyses from mouse Scl −/− embryos failed to detect increased cardiac gene expression in Scl −/− FLK1 + cells, questioning the role of SCL in suppressing the cardiac fate 14 . However, it is unclear if the two studies were conducted at similar developmental time points and examined functionally equivalent FLK1 + cells.
Chromatin remodelling proteins, like repressive Polycomb (PcG) complexes, play critical functions in early development. PcG complexes control pluripotency and differentiation of embryonic stem (ES) cells and, in vivo, are required for survival and organogenesis 16 . Two PcG complexes (PRC1/PRC2) usually work in concert. Their activities are associated with distinct histone modifications: H2AK119 monoubiquitination (H2AK119ub, PRC1) and H3K27 trimethylation (H3K27me3, PRC2). Several PcG complexes exist that all contain enzymatic activities (PRC1 ubiquitin ligases; PRC2 methyltransferases), but vary in their overall composition. PRC1 complexes include ubiquitin ligase modules (RING1A/1B and PCGF1-6) and CBX or RYBP/YAF2 proteins in a mutually exclusive manner 17 . PcG complexes commonly bind CpG islands at gene promoters 18 .
To get further insight into the mechanisms underlying blood specification, we used murine ES cell differentiation cultures to follow production of mesoderm-derived blood-fated cells. We report a series of molecular events that occur over a restricted, one-day developmental time-window, at the onset of blood specification. We first document multi-lineage (blood/cardiac/paraxial) priming in single mesodermal cells. We then show that absence of SCL leads to rapid conversion of blood-fated cells into functional cardiac and paraxial cells, in agreement with the notion of cellular plasticity. To suppress alternative lineages, SCL activates expression of select repressors (ETO2 and PRC1 members) and creates a global repressive epigenetic environment, in parallel to activating blood/endothelial-related genes to promote haematopoietic specification. These processes form the basis of lineage selection and highlight the prevalence of active transcriptional repression in cell fate choices.

Results
Transient co-expression of distinct lineage-affiliated TFs. Mouse ES cell/embryoid body (EB) differentiation cultures recapitulate major embryonic developmental processes 19 (Fig. 1a, top). Following production of Fgf5 + epiblast-like cells, Brachyury + mesoderm develops from day 2.5 (Fig. 1a, right). From day 3, expression of VEGFA receptor, Flk1, marks the emergence of mesodermal progenitors at the origin of the endothelial and blood lineages 20 . Day 3/3.5 sees robust expression of tissue-specific regulators of distinct mesoderm-derived lineages: Scl (haematopoietic 5 ), Mesp1 (cardiac 21 ) and Tbx6 (paraxial 22 ) (Fig. 1a, bottom). This stage corresponds to the development of nascent/posterior mesoderm in the primitive streak of day E7/7.5 mouse embryos (Supplementary Fig. 1) and marks the onset of lineage specification in the ES/EB model.
To test if multilineage-primed mesodermal progenitors exist, we asked if Scl, Mesp1 and Tbx6 were co-expressed in the same cells by single molecule mRNA (smRNA) FISH. We designed probe libraries for each mRNA species, co-stained day 3 to day 4.5 EB cells and quantitated the number of single mRNA molecules (foci) in individual cells (Fig. 1b). The average foci number/cell for each mRNA target (Fig. 1c) followed the expression pattern of the corresponding mRNA species in cell populations (Fig. 1a, bottom). When assessing co-expression of the three markers, we observed triple (Scl/Mesp1/Tbx6) and double (Scl/Mesp1, Scl/Tbx6) positive cells (Fig. 1d, e, Supplementary Fig. 2a-c). The proportion of these cells amongst Scl-expressing cells increased from day 3 to day 3.5 (20-35% of total Scl-positive cells, Fig. 1d), but decreased thereafter (9 and 3% at days 4 and 4.5, Fig. 1d). Because Mesp1 broad expression in early gastrulating embryos suggests it could label mesodermal lineages other than cardiac 23,24 , data obtained with Mesp1 were confirmed with another cardiac-defining marker, Gata4 (Supplementary Fig. 2d). The high proportion of Scl/Mesp1/Gata4-coexpressing cells amongst day 3.5 Scl/Mesp1-positive cells (87%) validated Mesp1 as a marker of the cardiac lineage in Scl-expressing cells, at the developmental timepoint examined. Thus, Scl, Mesp1 and Tbx6 are co-expressed over a tight developmental time-window. At a single cell level, the decrease in double and triple positive cells in day 4/4.5 EBs was accompanied by increased numbers of Scl-expressing cells (45 and 51% of total EB cells).
In conclusion, co-expression of Scl/Mesp1/Tbx6, Scl/Mesp1 and Scl/Tbx6 preferentially occurs at low mRNA levels in a subset of Scl-expressing cells at day 3/3.5, consistent with multi-lineage priming of distinct mesodermal lineages. Subsequently, as Scl mRNA levels increase, reflecting consolidation of the blood programme, expression of alternative lineage-specific regulators is lost.
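The classification of single cells into Scl-only, double positive and triple positive classes described above can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the detection threshold `MIN_FOCI`, the function names and the foci counts are all hypothetical and serve only to show how a co-expression fraction amongst Scl-positive cells could be derived from per-cell foci counts.

```python
from collections import Counter

# Hypothetical threshold: a cell counts as expressing a gene if it carries at
# least this many smRNA FISH foci. The source does not state the cut-off used.
MIN_FOCI = 1


def classify_cell(foci, min_foci=MIN_FOCI):
    """Return the sorted tuple of markers detected in one cell."""
    return tuple(sorted(gene for gene, n in foci.items() if n >= min_foci))


def coexpression_fraction(cells, anchor="Scl", min_foci=MIN_FOCI):
    """Fraction of anchor-positive cells that also express another marker."""
    classes = Counter(classify_cell(cell, min_foci) for cell in cells)
    anchor_pos = sum(n for markers, n in classes.items() if anchor in markers)
    co_pos = sum(
        n for markers, n in classes.items()
        if anchor in markers and len(markers) > 1
    )
    return co_pos / anchor_pos if anchor_pos else 0.0


# Made-up foci counts for five cells (illustration only, not study data).
cells = [
    {"Scl": 3, "Mesp1": 2, "Tbx6": 1},  # triple positive
    {"Scl": 5, "Mesp1": 0, "Tbx6": 0},  # Scl only
    {"Scl": 2, "Mesp1": 1, "Tbx6": 0},  # Scl/Mesp1 double positive
    {"Scl": 0, "Mesp1": 4, "Tbx6": 0},  # Scl negative, ignored in the ratio
    {"Scl": 8, "Mesp1": 0, "Tbx6": 0},  # Scl only
]

fraction = coexpression_fraction(cells)
print(f"{fraction:.0%} of Scl-positive cells co-express Mesp1 and/or Tbx6")
```

Restricting the denominator to Scl-positive cells mirrors how the proportions are reported in the text (percentage of total Scl-positive cells), so cells lacking Scl do not dilute the estimate.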
Having established the presence of multi-lineage primed cells, we next investigated how lineage potentials resolve. We studied blood fate selection in day 3.5 EBs, when 35% of Scl-expressing cells co-express Mesp1 and/or Tbx6 mRNA. To identify lineage output, we monitored expression of cell surface markers FLK1 and PDGFRα that separate mesodermal populations with distinct functional potential [25][26][27][28] (Fig. 2a). Multiple mesodermal potentials are initially found in the FLK1 + PDGFRα + (double positive, DP) compartment. FLK1 single positive (F-SP) cells contain lateral plate mesoderm with haematopoietic/endothelial potential and PDGFRα single positive (P-SP) cells contain paraxial and cardiac potential.
To test this hypothesis, we isolated F-SP cells from Scl −/− day 3.5 EBs (Fig. 2d) and subjected them to the same re-aggregation assay employed with WT cells. After 24 h, Scl −/− F-SP cells exhibited a markedly different immunophenotypic pattern (Fig. 2c, left, Scl −/− panels). The majority of Scl −/− F-SP cells acquired PDGFRα expression and were DP or P-SP, consistent with the hypothesis that they may have acquired non-blood fates.
A key observation from the smRNA FISH experiments was the tight temporal control of transcriptional priming of blood, cardiac and paraxial lineages, peaking at day 3.5 and greatly reduced by day 4.5 (Fig. 1d, Supplementary Fig. 2b). This suggested that cells could alter mesodermal fates maximally at day 3.5. Indeed, when compared to day 3.5 Scl −/− F-SP cells, a greatly decreased number of day 4.5 Scl −/− F-SP cells acquired low level PDGFRα expression in the re-aggregation assay (Fig. 2c, right panels). Therefore, temporal restriction of multi-lineage priming correlates with the tight temporal window in which F-SP cells acquire PDGFRα expression.
Absence of SCL increases adoption of alternative fates. We next asked if gain in PDGFRα expression in absence of SCL reflected increased numbers of cells adopting non-blood cell fates. We first replated day 3.5 WT and Scl −/− mesodermal populations (P-SP/DP/F-SP) in conditions promoting cardiac development. Cardiac cell output was assessed by immunofluorescence staining for cardiac troponin (cTNT, Fig. 3a) and mRNA analysis (Tnnt2/Tnni3/Myh6, Fig. 3b). Strikingly, significantly more cardiac cells were detected in the Scl −/− vs. WT P-SP populations, showing a rapid functional conversion of Scl −/− cells. We confirmed ectopic cardiomyocyte production in vivo from Scl −/− mouse embryos. Obvious foci of cardiac cTNT + cells were detected in day E9.5 Scl −/− , but not WT, yolk sacs (Fig. 3c). Next, we tested the paraxial potential of WT and Scl −/− cells by replating day 3.5 fractionated or whole EB cells in chondrogenic conditions. Cultures from Scl −/− P-SP/DP populations exhibited increased formation of chondrogenic-specific glycosaminoglycan clusters (Alcian blue staining, Fig. 3d), and increased expression of the early chondrogenic marker, Sox9 33 , (Fig. 3e), when compared to WT cells. Similarly, cultures from Scl −/− EB cells showed more cells positive for COLLAGEN IIA, a major downstream target of Sox9 34 , and increased ColIIa mRNA expression (Fig. 3f). Increased chondrogenic potential was confirmed in vivo from Scl-null yolk sacs (Fig. 3g). As controls, haematopoietic cells developed only from WT, but not Scl −/− , F-SP/DP populations, in blast colony assays 35 (Fig. 3h). Therefore, in absence of SCL, functional cardiac and paraxial PDGFRα + populations expand in day 3.5 EBs.
Finally, to further validate the hypothesis that only Scl low expressing cells can adopt a non-blood fate, we purified day 3.5 Scl:mCherry low and Scl:mCherry high cells and tested their developmental potential in re-aggregation and cardiac assays. As expected, 24 h after purification, the majority of the mCherry low cells became mCherry high and FLK1/PDGFRα DN, pursuing their normal hematopoietic differentiation trajectory (Fig. 3i, Supplementary Fig. 3g). Remarkably, however, day 3.5 Scl:mCherry low , but not Scl:mCherry high , cells generated a PDGFRα-positive population in the re-aggregation assay (7.9%, Fig. 3i). Confirming the adoption of alternative fates by cells expressing low levels of Scl, cTNT-positive cardiomyocytes were exclusively observed from Scl:mCherry low cells in cardiac cultures (Fig. 3j). Therefore, cell fate decisions critically depend on Scl mRNA thresholds.
In conclusion, our data identify a temporally restricted conversion of day 3.5 Scl −/− F-SP cells into functional cardiac cells, in line with previous data showing ectopic emergence of PDGFRα cells with a cardiac expression programme in Scl −/− yolk sac 10 . We extend these observations by showing that paraxial lineage cells are also produced from day 3.5 Scl −/− F-SP cells. Critically, the tight time-window during which non-blood lineages are produced in absence of SCL coincides with the restricted developmental period revealing multilineage-primed mesodermal cells.
SCL controls expression of transcriptional repressors. We next defined the molecular mechanism leading to acquisition of non-blood cell fates in absence of SCL. RNA sequencing revealed 2036 differentially expressed genes (DEGs) between day 3.5 WT and Scl −/− FLK1 + cells (Fig. 4a). Consistent with the functional cellular output (Fig. 3a-h), the 1038 genes normally activated by SCL (upregulated in WT versus Scl −/− cells) were associated with haematopoietic/endothelial differentiation, as revealed by Gene Ontology (GO) terms and GSEA analyses (Fig. 4b, examples of genes in Fig. 4c). In contrast, the 998 genes upregulated in Scl −/− cells were associated with mesoderm, cardiac and paraxial lineage development (Fig. 4b, examples of genes in Fig. 4c). Interestingly, genes encoding transcriptional repressors were also activated in WT cells (Fig. 4b, c), for example SCL's partner ETO2 (CBFA2T3) 36 .
SCL ChIP-seq from day 3.5 WT FLK1 + cells revealed 9836 SCL-bound loci. GO terms associated with the genes nearest to the peaks were related to haematopoiesis, vasculature and cardiac development (Fig. 4d). We next examined the 100 DEG-associated peaks showing the strongest SCL binding (greatest number of mapped reads from ChIP-seq data) (Fig. 4g, Supplementary Data 2). Remarkably, 97 of these peaks were attributed to genes normally activated by SCL. Whilst 22 peaks were associated with haematopoiesis-related genes and 10 with endothelium-related genes, 27 were associated with 4 genes encoding blood-related transcriptional repressors: Zfpm1 and Gse1 (Fig. 4f), Eto2 and Runx1t1 (Eto). In addition to these 4 genes, another 7 transcriptional repressors, either expressed in blood cells (Zfpm2/Gfi1/Gfi1b/Ikzf1/Ikzf2) or members of the general repressive complex ncPRC1 (Rybp and its partner Pcgf5), were SCL direct targets (Supplementary Data 1). Of note, the number of SCL peaks was higher in loci normally activated (Runx1/Fli1/Zfpm1/Gse1) than in those normally repressed (Gata6/Mesp1/Tbx6) (Fig. 4f; Wilcoxon rank-sum test confirmed the difference in peak distribution between all 514 downregulated and 264 upregulated direct DEGs at a p value of 5.8 × 10^-8). This suggests mechanistic differences in SCL-mediated transcriptional activation versus repression.
In summary, SCL normally activates a blood/endothelial programme in FLK1 + cells and represses cardiac and paraxial programmes. Unexpectedly, SCL strongly binds to and activates expression of genes encoding transcriptional repressors. This suggests that SCL may suppress cardiac/paraxial programmes in blood-fated cells through transcriptional activation of repressors.
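The logic of calling a gene a direct SCL target (differentially expressed in Scl −/− cells and associated with an SCL ChIP-seq peak) amounts to a set intersection. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the named genes are those cited in the text as bound and activated or repressed, whereas `HypotheticalGeneA` and `HypotheticalGeneB` are invented placeholders for DEGs without an SCL peak.

```python
# Genes named in the text as normally activated by SCL (down in Scl-null cells).
# HypotheticalGeneA is an invented placeholder for an unbound, indirect DEG.
activated_by_scl = {"Runx1", "Fli1", "Zfpm1", "Gse1", "HypotheticalGeneA"}

# Genes named in the text as normally repressed by SCL (up in Scl-null cells).
# HypotheticalGeneB is an invented placeholder for an unbound, indirect DEG.
repressed_by_scl = {"Gata6", "Mesp1", "Tbx6", "HypotheticalGeneB"}

# Genes with at least one SCL ChIP-seq peak attributed to them.
scl_bound = {"Runx1", "Fli1", "Zfpm1", "Gse1", "Gata6", "Mesp1", "Tbx6"}


def direct_targets(degs, bound):
    """Direct targets are DEGs that also carry an SCL peak."""
    return degs & bound


direct_activated = direct_targets(activated_by_scl, scl_bound)
direct_repressed = direct_targets(repressed_by_scl, scl_bound)
print(sorted(direct_activated))
print(sorted(direct_repressed))

# Consistency check on the gene counts reported in the text.
total_degs = 1038 + 998        # activated + repressed DEGs
total_direct_degs = 514 + 264  # direct activated + direct repressed DEGs
print(total_degs, total_direct_degs)
```

The final two sums reproduce the totals quoted in the text (2036 DEGs and 778 direct DEGs), which is a quick way to confirm that the activated and repressed subsets partition each list.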
SCL controls levels of histone marks linked to repression. To further understand SCL-driven mechanisms of transcriptional regulation, we surveyed histone marks associated with activated and repressed genes (H3K27ac/H3K4me3 and H3K27me3/H2AK119ub, respectively) by quantitative ChIP-seq (ChIP-Rx 37 ) in mCherry + cells isolated from Scl:mCherry and Scl Δ/Δ :mCherry day 4 EB cells. H3K27ac, the only mark detected at SCL genomic binding sites, was analysed at all 9836 SCL ChIP peaks (Fig. 5a) and on SCL peaks associated with SCL's 778 direct DEGs (Fig. 5b). As all these histone modifications mark promoter regions, we examined their distribution at transcriptional start sites (TSSs) of the 5512 SCL-bound loci, whole genome TSSs and TSSs of SCL's 778 direct DEGs (Fig. 5c-f).
H3K27ac was higher at the 9836 SCL peaks in WT cells (Scl:mCherry) than in Scl-null cells (Scl Δ/Δ :mCherry) (Fig. 5a), consistent with SCL's ability to recruit histone acetyltransferase P300 38 . H3K27ac at SCL peaks located in SCL direct DEG genomic loci was higher in the 514 activated genes compared to the 264 repressed genes (Fig. 5b, top) in an SCL-dependent manner (Fig. 5b, bottom). In contrast, H3K27ac and H3K4me3 levels at TSSs of the 5512 loci bound by SCL (of which only 422 are bound by SCL at TSSs) and TSSs genome-wide were SCL-independent (Fig. 5c). As expected, the level of these two histone marks was higher at TSSs of the 514 activated genes compared to the 264 repressed genes and SCL-independent (Fig. 5d). Extending these observations, we noticed a broad and punctate H3K27ac pattern at activated gene loci, not just restricted to SCL peaks and fully SCL-dependent (Fli1/Mafb, Supplementary Fig. 4a). SCL-independent H3K4me3 distribution at TSSs was visualised at both activated (Fli1/Mafb) and repressed (Gata6/Irx3) genes (Supplementary Fig. 4a, b). In summary, levels of H3K27 acetylation are higher and SCL-dependent at SCL peaks associated with genes normally activated. At TSSs, both H3K27ac and H3K4me3 marks are more prominent at genes normally activated and are SCL-independent.
We next studied histone marks associated with repression, H3K27me3 and H2AK119ub. In contrast to H3K27ac and H3K4me3, we observed a striking, global SCL-dependent reduction of these marks, at TSSs of the 5512 SCL-bound loci and TSSs genome-wide (Fig. 5e). This reduction was confirmed by western blot (Fig. 5g, Supplementary Fig. 8a). Furthermore, whereas SCL-dependent levels of H3K27me3 were confined to ~2.5 kb around TSSs genome-wide, SCL-dependent levels of H2AK119ub extended for large domains (at least 100 kb) either side of the TSSs (Fig. 5h). Focussing on the 778 SCL direct DEGs, H3K27me3 levels were higher at TSSs of repressed, rather than activated, genes (Fig. 5f, top), and showed SCL-dependence at repressed and, to a lesser extent, activated genes (Fig. 5f, bottom; Gata6/Irx3/Mafb, Supplementary Fig. 4). Surprisingly, the level of the H2AK119ub mark did not vary between TSSs of genes repressed and activated by SCL (Fig. 5f, top), and, in both cases, was SCL-dependent (Fig. 5f, bottom, Supplementary Fig. 4).
Taken together, these data show that SCL mediates H3K27 acetylation specifically at SCL-bound cis-regulatory elements of genes it normally activates. SCL also regulates the level of repression-associated histone marks, H3K27me3 and H2AK119ub, not only at TSSs of genes it normally represses but also at some (H3K27me3) or all (H2AK119ub) genes it activates. This suggests that inappropriate gene activation may be restrained by SCL-mediated repressive environment. Regulation of repression, as surveyed by histone modifications, constitutes an important facet of SCL function at the onset of blood specification.
ETO2, RYBP and SCL are functionally linked. To investigate SCL-regulated active repression of gene expression in mesoderm patterning, we studied two transcriptional repressors and direct SCL target genes in day 3.5 FLK1 + cells, ETO2 and RYBP. ETO2 is a known SCL partner that represses premature megakaryocytic and erythroid gene activation later in development 36,39,40 . RYBP, a member of non-canonical PRC1 (ncPRC1) complexes, modulates H2AK119 ubiquitination levels through stabilisation of RING1B recruitment and enhancement of its enzymatic activity [41][42][43] . Given the role of PcG complexes in development and the changes in ubiquitination levels observed in Scl −/− cells, we reasoned that RYBP may mediate some of the SCL-regulated repression mechanisms.
Day 4 ETO2 ChIP-seq analysis showed co-localisation with SCL at 1949 peaks, which represent ~20% of all SCL peaks and 60% of all ETO2 peaks (Fig. 6d, Supplementary Fig. 6a). The biological processes related to the genes associated with these 1949 peaks were enriched for blood, vascular, cardiac and muscle development (Supplementary Fig. 6b). We then considered the 1725 gene loci co-occupied by SCL and ETO2. These corresponded to 31% of SCL-bound genes (1725/5512) and 79% of ETO2-bound genes (1725/2186) (Fig. 6e). These genes contained 352 out of SCL's 778 direct DEGs (45%). The strength of SCL and ETO2 binding across these 352 DEGs was the same for both proteins: higher in the genes activated by SCL vs. those repressed (Supplementary Fig. 6c). Therefore, SCL and ETO2 often colocalise on activated and repressed genes and most likely cooperate in multi-protein complexes.
Fig. 3 Scl-null cells acquire cardiac and paraxial potentials. a-c Cardiac assays. a Day 3.5 WT and Scl −/− P-SP, DP, and F-SP populations were replated in cardiac condition for 7 days and cTNT (cardiac troponin) expression monitored by immunofluorescence (IF); scale bar, 100 μm. b RT-qPCR analysis of cardiac gene expression (Tnnt2, Tnni3 and Myh6) relative to Gapdh in cultures shown in a; n = 3-5. c Day E9.5 WT and Scl −/− mouse yolk sacs replated in cardiac assay for 7 days. IF reveals cardiomyocytes (cTNT, green), endothelium (CD31, red) and nuclei (DAPI, blue). d-g Chondrogenic assays. d Day 3.5 WT and Scl −/− P-SP, DP and F-SP populations were replated in chondrogenic condition for 21 days. Alcian blue staining reveals glycosaminoglycan clusters; e RT-qPCR analysis of Sox9 expression relative to Gapdh in cultures shown in d, n = 2. f Collagen IIa (ColIIa) expression following culture of day 3.5 WT and Scl −/− EB cells in chondrogenic condition (left, IF: COLIIA green, DAPI blue; right, RT-qPCR analysis) n = 2. g Alcian blue staining of day 18 chondrogenic cultures from day E9.5 WT, Scl +/− and Scl −/− mouse yolk sacs. OP9, no yolk sac cells. c, g Number of embryos presenting the phenotype shown is indicated for each genotype. h Blast colony assay showing number of endothelial/haematopoietic colonies obtained from day 3.5 WT and Scl −/− purified mesodermal populations (P-SP, DP, F-SP); n = 3. i Left: day 3.5 Scl:mCherry WT cells were FACS-sorted according to the level of mCherry (and therefore SCL) expression into low and high fractions. Right: mCherry high and mCherry low cells were re-aggregated for 24 h, and mCherry (left) and FLK1/PDGFRα (right) expression re-assessed. Note that only the day 3.5 mCherry low fraction produced a PDGFRα + population (bottom right panel, orange events). At day 3.5 + 24 h, the majority of the PDGFRα + cells have lost mCherry expression (bottom left panel, orange events). j Day 3.5 Scl:mCherry high and Scl:mCherry low FACS-sorted cells were replated in cardiac assay. IF reveals cardiomyocytes (cTNT, green), endothelium (CD31, red) and nuclei (DAPI, blue). Mean ± SD is shown
In bold, transcriptional regulators. d The most significant GO biological processes associated with SCL-bound loci (GREAT analysis). e Integration of SCL ChIP-seq and RNA-seq data reveals 778 SCL direct differentially expressed target genes. Below, GO terms attributed to DEG-associated peaks (GREAT analysis). f Left, SCL ChIP-seq tracks of selected direct DEGs; FC = NS, fold-change in expression in Scl −/− cells is not significant (RNA-seq data). Tbx6 locus: the SCL peak was attributed to Tbx6, the closest DEG. Right, RT-qPCR gene expression analysis relative to Gapdh from WT and Scl −/− day 3.5 FLK1 + cells. n = 3, mean ± SD; Student's t-test, *p < 0.05, **p < 0.01. Colour code, same as in c. g Biological functions attributed to the 100 strongest DEG-associated SCL ChIP-seq peaks. In brackets, numbers of genes in each category. See also Supplementary Data 1, 2
Similarly, SCL binding was detected at the loci encoding Rybp and its partner Pcgf5, in areas associated with open chromatin and SCL-binding DNA motifs (Fig. 6f, Supplementary Fig. 5b). Strikingly, expression of Rybp and Pcgf5 was significantly downregulated in days 3.5/4 Scl −/− EB cells (Fig. 6g), corresponding to the narrow time-window of transcriptional co-priming of blood/cardiac/paraxial lineages and developmental plasticity of F-SP cells (Figs. 1d, 2c). None of the other members of PRC1/PRC2 complexes we examined, including enzymes responsible for deposition of PcG-associated histone marks (Ring1a/1b, Ezh1/2), or members of the KAT and TrxG activating complexes, were direct targets of SCL (Fig. 6f, g, Supplementary Fig. 7), highlighting the unique relationship between SCL and RYBP/PCGF5. Finally, in contrast to the Eto2 locus, binding of SCL to Rybp and Pcgf5 loci was not seen in maturing blood cells (foetal liver pro-erythroblasts) 44 (Supplementary Fig. 5a, b), suggesting specific requirements for these two repressors during early blood specification processes.
In conclusion, ETO2's expression entirely relies on SCL throughout EB differentiation and full expression of RYBP and PCGF5 requires SCL within a narrow developmental window. ETO2 and RYBP interact with SCL. When considering gene loci, ETO2 co-occupies nearly half and RYBP the majority of SCL direct DEGs.
Overall, SCL and these repressors bind to overlapping and distinct functional categories of activated and repressed target genes.
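The overlap percentages quoted in this section follow directly from the peak and gene counts given in the text and figure legends. A short worked check is sketched below; the variable and function names are illustrative only, and every count is taken from the article.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


scl_peaks = 9836          # all SCL ChIP-seq peaks
shared_peaks = 1949       # peaks co-occupied by SCL and ETO2
scl_genes = 5512          # SCL-bound gene loci
eto2_genes = 2186         # ETO2-bound gene loci
shared_genes = 1725       # gene loci co-occupied by SCL and ETO2
direct_degs = 778         # SCL direct DEGs
eto2_direct_degs = 352    # direct DEGs co-occupied by ETO2
rybp_direct_degs = 710    # direct DEGs co-occupied by RYBP

overlaps = {
    "shared peaks / SCL peaks": pct(shared_peaks, scl_peaks),
    "shared genes / SCL-bound genes": pct(shared_genes, scl_genes),
    "shared genes / ETO2-bound genes": pct(shared_genes, eto2_genes),
    "ETO2-bound direct DEGs": pct(eto2_direct_degs, direct_degs),
    "RYBP-bound direct DEGs": pct(rybp_direct_degs, direct_degs),
}

for label, value in overlaps.items():
    print(f"{label}: {value}%")
```

The last two ratios make the statement that ETO2 co-occupies nearly half, and RYBP the majority, of SCL direct DEGs quantitative.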
Loss of Eto2 and Rybp phenocopies SCL-null cardiac phenotype. Given the aggregate molecular and biochemical data that SCL, ETO2 and RYBP could function in concert at the time of blood specification, we asked if loss of ETO2 or RYBP could phenocopy loss of SCL in developing EBs (Fig. 7a). We first knocked down expression of Eto2 and Rybp RNA and protein by siRNA, by 40-70%, at day 3.5 (siEto2 and siRybp, Fig. 7b, c, Supplementary Fig. 8b). The level of reduction of Rybp mimicked that seen in absence of SCL during the window of mesodermal lineage specification (Fig. 6g) and, strikingly, led to decreased levels of H2AK119ub similar to that observed in Scl Δ/Δ :mCherry cells (Figs. 5g, 7c, Supplementary Fig. 8c). Importantly, Eto2 and Rybp knock-down cells produced a 4-6-fold increase in cTNT-positive cells in cardiac assays, similar to that observed in absence of SCL (Fig. 7d, e). Similarly, induction of Rybp deletion in Rybp fl/fl :Cre-ERT2 45 EBs resulted in complete loss of Rybp expression (Fig. 7f) and expansion of cTNT-positive cardiac cells (Fig. 7g). Importantly, reduction of Eto2 and Rybp expression did not alter blood lineage specification assayed by blast colony assays (Fig. 7h).
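As an illustration of how a knock-down percentage can be derived from RT-qPCR data expressed relative to Gapdh, the sketch below uses the common 2^-ΔCt calculation. This is an assumption: the text states only that expression is reported relative to Gapdh, not which formula was applied, and the calculation presumes near-100% primer efficiency. The Ct values and function names are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target gene relative to a reference gene (2^-dCt)."""
    return 2.0 ** -(ct_target - ct_reference)


def knockdown_percent(control, treated):
    """Percent reduction in relative expression, treated versus control.

    Each argument is a (ct_target, ct_reference) tuple.
    """
    ratio = relative_expression(*treated) / relative_expression(*control)
    return 100.0 * (1.0 - ratio)


# Hypothetical Ct values (target, Gapdh); not measurements from the study.
control_sample = (24.0, 18.0)
sirna_sample = (25.0, 18.0)

reduction = knockdown_percent(control_sample, sirna_sample)
print(f"Knock-down: {reduction:.0f}%")
```

With these made-up values, a one-cycle increase in the target Ct at constant Gapdh Ct halves the relative expression, which falls within the 40-70% range reported for the siRNA experiments.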
To directly test the function of H2AK119ub in cardiac development, we treated EBs with PRC1 ubiquitin ligase inhibitor PRT4165 46 . This achieved up to 84% decrease in ubiquitination levels (Fig. 7i, Supplementary Fig. 8d) and led to increased cardiac output, as judged by cTNT staining and Tnnt2 mRNA expression (Fig. 7j), suggesting that H2AK119 ubiquitination mediates the cardiac phenotype observed from Rybp knock-down and Scl-null cells.
Finally, we asked if reduction in H3K27me3 could have a similar effect. We treated EBs with PRC2 inhibitor UNC1999 targeting EZH1/2 function 47 , to achieve near complete loss of H3K27me3 at day 3.5 (Fig. 7k, Supplementary Fig. 8e). Here, there was no increase in cardiac output assayed by cTNT staining and Tnnt2 mRNA expression (Fig. 7l). Altogether, these data suggest specific roles for ETO2 and RYBP and, by association, the PRC1 complex, most likely through H2AK119 ubiquitination, to suppress alternative fates in FLK1 + cells in collaboration with SCL.
Fig. 6 Genome-wide binding of SCL, ETO2 and RYBP in FLK1 + cells. a SCL ChIP-seq track of Eto2 locus in day 3.5 FLK1 + EB cells. Blue arrows, SCL peaks associated with Eto2; red rectangles, ATAC peaks; green rectangles, SCL-bound cis-elements further detailed in Supplementary Fig. 5. b RT-qPCR analyses of Eto2 mRNA expression from WT and Scl −/− EB differentiation kinetics (day 2-day 6) relative to Gapdh. n = 3-5, mean ± SD; Student's t-test *p < 0.05. c Western blot analysis of day 4.5 WT EB nuclear extracts immunoprecipitated (IP) with anti-SCL antibodies. Members of SCL complex are detected as indicated. n = 3. d Overlap between SCL and ETO2 ChIP-seq peaks. e Overlap between SCL-bound and ETO2-bound genes. 352 of these are SCL direct DEGs. f SCL ChIP-seq track of Rybp, Pcgf5 and Ring1b loci in day 3.5 FLK1 + EB cells. Blue arrows, SCL peaks associated with the genes; red rectangles, ATAC peaks; green rectangles, SCL-bound cis-elements further detailed in Supplementary Fig. 5. g RT-qPCR analyses of Rybp, Pcgf5 and Ring1b mRNA expression from WT and Scl −/− EB differentiation kinetics (day 2-day 6) relative to Gapdh. n = 3-5, mean ± SD; Student's t-test *p < 0.05. h Western blot analysis of day 4.5 WT EB nuclear extracts immunoprecipitated (IP) with anti-SCL antibodies (top), -RYBP (bottom, left) and -RING1B (bottom, right) antibodies. Members of PRC1 complex (RING1B, RYBP) and SCL complex (LMO2, ETO2) are detected as indicated. Asterisk (*) indicates heavy or light IgG chain. White arrow indicates SCL band in RING1B IP. i Overlap between SCL and RYBP ChIP-seq peaks. j Overlap between SCL-bound and RYBP-bound genes. Seven hundred and ten of these are SCL direct DEGs. k Overlap between SCL-bound, ETO2-bound, and RYBP-bound 778 SCL direct DEGs.

Discussion
We have studied how lineage commitment is executed during development by focusing on how ES cell-derived FLK1 + mesodermal cells segregate into the blood lineage and suppress alternative cardiac and paraxial lineage programmes. We show that, at a narrowly defined time-point, a single TF, SCL, not only promotes haemopoiesis in FLK1 + cells, but concurrently suppresses mis-specification of these cells to alternative lineages (Fig. 8). A series of genome-wide molecular and biochemical studies demonstrates that SCL represses alternative gene expression in collaboration with ETO2 and the PRC1 complex at the precise time-point when multi-lineage transcriptional priming and cellular plasticity are detected. Whether this occurs in FLK1 + SCL + haematopoietic progenitors giving rise to primitive and/or definitive blood cells is not known as, to date, these two types of progenitors cannot be distinguished immunophenotypically or by gene expression.
Our findings have broad implications for our understanding of lineage determination. First, low level co-expression of Scl, Mesp1 and Tbx6 (required to specify blood, cardiac and paraxial lineages, respectively) confirms multi-lineage priming and suggests that progenitors can potentially adopt multiple fates. Existence of multilineage-primed cells agrees with studies showing that epiblast cells exhibit multipotency 48 and with observations of "transcriptional noise" in day E6.5 mouse epiblast cells 49 . Importantly, multi-lineage priming needs to be distinguished from multi-lineage potency. Even though Mesp1 and Tbx6 are coexpressed with Scl in blood-fated FLK1 + cells, and, in absence of SCL, these cells can make cardiac and paraxial lineages, it is unclear if these cells are multi-potent at a clonal level in steady state, as multi-potency may only manifest in perturbed circumstances. One approach to address these open questions is to conduct in vivo fate tracing studies at appropriate developmental stages. This will characterise the extent of multi-potency and establish developmental relationships between mesodermal lineages.
Second, our data reveal the tight developmental window determining functional plasticity. Multi-lineage primed cells were most abundant in day 3.5 EBs and dramatically reduced at day 4.5. FLK1+ Scl−/− cells gave rise to PDGFRα+ cardiac/paraxial cells at day 3.5, but not at day 4.5. Finally, SCL activated Rybp and Pcgf5 expression at days 3.5/4 only. In wild-type cells, plasticity is detected in cells expressing low levels of Scl, as only Scl:mCherry low cells can adopt a cardiac fate at day 3.5, likely to be elicited by higher Mesp1 levels. This supports the notion of threshold-dependent fate determination, as recently reported for the TF ETV2 in hemangiogenic fate specification 50. At earlier stages of development, the trophectoderm and the inner cell mass also exhibit time-restricted plasticity 51,52. This is believed to establish flexible gene regulatory networks based on heterogeneity of gene expression 53. Plasticity may therefore be a general property of cells undergoing lineage determination.
Third, our study reveals the importance of active repression of alternate fates in lineage-fated cells. To the best of our knowledge, we provide the first example of a single lineage-specific TF (SCL) directly regulating expression of PcG members during development. In so doing, SCL regulates PRC1 function, as shown by SCL-mediated global H2AK119 ubiquitination. Though PRC1 exhibits repressive roles during early embryonic development 41,45, somatic cell reprogramming 54 and haematopoietic development 55, a function in lineage selection in germ layers has not been previously reported.
SCL activates expression of two specific PRC1 members, RYBP and PCGF5, both necessary for the function of ubiquitin ligase RING1B and full H2AK119 ubiquitination levels 42,56. Thus, it is unsurprising that ubiquitination levels depend on SCL in blood-fated cells. Whilst our biochemical data indicate interaction between RYBP and SCL and co-localisation on a few genes, the mechanisms controlling RYBP recruitment and H2AK119ub deposition for most genes remain unclear. Interestingly, RYBP-PRC1 target genes are usually expressed at higher levels than targets of canonical PRC1 complexes 42. This suggests that ncPRC1 may cause incomplete gene expression silencing, thus permitting low-level gene expression and allowing lineage-fated cells to retain plasticity. Thereafter, ncPRC1 complexes could be replaced by more potent complexes, causing irreversible gene silencing and loss of plasticity.
Our work also reveals a complex pattern of gene regulation by SCL, ETO2 and RYBP. Indeed, ChIP-seq shows locus-specific binding of ETO2 and RYBP on SCL's 778 activated and repressed genes. Why would co-repressors bind to genes normally activated? One possibility is that co-repressors finely modulate gene expression levels, rather than simply repressing gene expression 57. This transcriptional flexibility could allow for rapid changes in transcription patterns to adapt to changing environments, and could underlie cellular plasticity and transcriptional priming. What mechanisms dictate whether a gene is bound by any combination of SCL/ETO2/RYBP and whether it is activated or repressed remain important questions. Detailed analyses of specific gene loci are required.
Finally, our work highlights some of the relationships between histone marks, TF/co-factor binding and gene expression. Although the significance of SCL-dependent H2AK119 ubiquitination over extended genomic regions remains unclear, this observation is reminiscent of the wide "blanket" of Polycomb recruitment on the inactive X chromosome 58, a paradigm of transcriptional repression. Based on the hierarchical model that PRC1 recruits PRC2 18, H2AK119ub marks may facilitate recruitment of PRC2 and maintain high levels of H3K27me3 at genes normally repressed in lowly acetylated chromatin. Genome-wide, high levels of ubiquitination also suggest that the default transcriptional state may be gene expression silencing. It is noteworthy that genes activated by SCL usually harbour multiple SCL binding sites, whilst repressed genes show far fewer binding events. Multiple SCL occupancy sites may be required to robustly recruit activating epigenetic complexes to overcome a globally repressive environment.
In conclusion, this study unveils the importance of transcriptional repression by a single tissue-specific regulator in multilineage primed cells and over tightly defined developmental windows during which lineage-specific gene expression programmes are initiated. Complex, gene-specific patterns of transcriptional regulation promote and repress lineage-affiliated gene expression to allow single lineage selection. These mechanisms are likely to be shared by other lineage-specific regulators in development and differentiation.
Microscopy and data analysis. To avoid bleaching, smRNA FISH images were focused and taken based on DAPI stain only. For each image, a Z-stack of 53 images was acquired for all channels (DAPI, Alexa-488, Alexa-594 and Alexa-647). Image volumes were analysed using a Fiji/ImageJ macro script that automated the analysis (available upon request). First, image volumes were Z-maximum-projected through a sub-range of the original slices; the sub-range was pre-chosen to best encapsulate the signal-containing range of the cells within a dataset. Next, background fluorescence in the DAPI channel was removed from images by smoothing a copy of the maximum-projected image with a coarse Gaussian filter (sigma = 80) and subtracting this from the unsmoothed image. The filtered image was then thresholded to extract the nuclear regions. From this binary image, a Voronoi filter was applied to evenly split the image into regions based on the location of the nuclear centres. Each region was then assigned to the nucleus it encapsulated. Within each nucleus-associated region, the corresponding cell autofluorescence in the 488 nm channel was used to identify the cell. Segmentation was performed by thresholding the cellular fluorescence using an Otsu algorithm to convert the image into a binary representation, and then using the "Watershed" algorithm to split any larger objects. These cellular regions were then dissected using the Voronoi regions to split them up based on the distribution of nuclear centres. Regions of fluorescence within the thresholded auto-fluorescence channel and containing a corresponding DAPI signal were retained for further analysis; all other areas were discarded. This methodology provided an accurate means of obtaining cell regions even when two or more cells were touching, and also removed artefactually stained non-cellular areas. Finally, within each cell, the channels containing the foci (Mesp1-A488, Scl-A594 or Tbx6-A647) were processed sequentially.
All channels were smoothed with a Gaussian filter (sigma = 0.5) to reduce the impact of signal noise. Following this processing step, the signal intensities were identified and measured. The foci locations were identified using the ImageJ "Find Maxima" function with a simple filtering step to remove maxima that were too large to be foci (>40 total pixel area). To avoid false positives, a cut-off for positivity of 6 foci per cell was chosen after performing smRNA FISH on cells derived from ES cells knocked-out for the mRNAs of interest. Within these cells, the number of foci and the average intensity level were measured from regions (5 × 5 pixel) centred on each focus. In addition, as a comparison, the intensity was also measured in regions surrounding the foci to estimate the background noise in these samples. These regions were selected by taking the coordinates of each true focus and then randomly perturbing the coordinates to find a region nearby (within 10 pixels) which did not overlap with any of the true foci locations. Data were exported to Excel for subsequent analysis and statistical testing.
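The per-cell foci counting and background sampling described above can be illustrated with a minimal Python sketch. It is not the authors' Fiji/ImageJ macro. It assumes that foci are assigned to the nearest nuclear centre (the Voronoi rule, without the additional autofluorescence segmentation), that a cell is positive at 6 or more foci (the text gives the cut-off of 6 but not whether it is inclusive), and that "within 10 pixels" means an offset of at most 10 pixels along each axis. All function names are hypothetical.

```python
import math
import random

FOCI_CUTOFF = 6   # cut-off from the text; treating it as inclusive (>=) is an assumption
MAX_OFFSET = 10   # background regions are sought within 10 pixels of a focus
HALF_WINDOW = 2   # a 5 x 5 pixel region extends 2 pixels either side of its centre


def nearest_nucleus(point, nuclear_centres):
    """Return the index of the nuclear centre closest to point (Voronoi rule)."""
    return min(range(len(nuclear_centres)),
               key=lambda i: math.dist(point, nuclear_centres[i]))


def count_foci_per_cell(foci, nuclear_centres):
    """Count the foci that fall in the Voronoi region of each nucleus."""
    counts = [0] * len(nuclear_centres)
    for focus in foci:
        counts[nearest_nucleus(focus, nuclear_centres)] += 1
    return counts


def is_positive(n_foci, cutoff=FOCI_CUTOFF):
    """Call a cell positive for a transcript when it reaches the foci cut-off."""
    return n_foci >= cutoff


def windows_overlap(a, b, half=HALF_WINDOW):
    """True if the square windows centred on a and b share at least one pixel."""
    return abs(a[0] - b[0]) <= 2 * half and abs(a[1] - b[1]) <= 2 * half


def background_region(focus, all_foci, rng, max_tries=1000):
    """Randomly perturb a focus coordinate until the window overlaps no true focus."""
    for _ in range(max_tries):
        candidate = (focus[0] + rng.randint(-MAX_OFFSET, MAX_OFFSET),
                     focus[1] + rng.randint(-MAX_OFFSET, MAX_OFFSET))
        if not any(windows_overlap(candidate, f) for f in all_foci):
            return candidate
    return None  # no free region was found near this focus
```

Passing a seeded `random.Random` instance as `rng` makes the background sampling reproducible between runs.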
Images shown in Fig. 1 and Supplementary Fig. 2 are maximum Z-stack projections of fluorescence images spanning the extent of the cells, in inverted black/white for Scl, Mesp1 and Tbx6. Within each subfigure, images from the same channel were adjusted equally, for direct comparison. Before compilation of experiments, a Stratified Kruskal-Wallis test was performed to check for significant differences, and none were found. To test for significant increases or decreases in cellular sub-types, a two-tailed Fisher's exact test was applied.
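For readers unfamiliar with the two-tailed Fisher's exact test mentioned above, the following Python sketch computes its p-value for a 2 × 2 table of cell counts (for example, cells of a given sub-type versus all other cells in two conditions). It is an illustration under a stated assumption: the two-tailed p-value is taken as the sum of the probabilities of all tables no more likely than the observed one, which is one common convention. The text does not say which software or convention the authors used.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is no more likely than the observed table;
    # the small tolerance guards against floating-point ties.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-7))
    return min(1.0, p_value)
```

The four arguments are the counts in the table read row by row; any counts supplied when trying the function are the reader's own and are not taken from this study.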
Scl:mCherry cell lines. gRNAs (sequences are shown in Supplementary Table 1) were inserted into GFP-expressing pX458 or a derivative of pX330 (Addgene), by BbsI digestion and ligation. ES cells were transfected with gRNAs and a homology donor (sequences available upon request) using Nucleofector 2b (Lonza) with programme A-013. 48 h after transfection, single GFP-positive cells were sorted, clones expanded and checked for insertion by Sanger sequencing. The heterozygote Scl:mCherry cell line was generated from WT J1 cells using mCherry_gRNAs, and harbours the mCherry insertion on one allele. The SclΔ/Δ:mCherry cell line was engineered from the Scl:mCherry cell line with mCherryΔgRNA1 and mCherryΔgRNA2, and is homozygous for deletion of the bHLH domain.
Histone ChIP-sequencing-Rx. Histone ChIPSeq-Rx was carried out according to published protocols 37. Briefly, day 4 mCherry-positive sorted cells were mixed in a 2:1 ratio with Drosophila S2 cells. Samples were prepared and sequencing was performed as above. Data were mapped using an in-house pipeline. Reads were aligned to Mus musculus (mm9) and Drosophila melanogaster (dm3) genomes. The number of reads mapping to the genomes as well as the derived normalisation factor for each histone mark are provided in Supplementary Data 4. Peak calling was performed using MACS2 (version 2.0.10) using parameters for broad peak calling and respective inputs as reference. Tag directories were created using Homer and the generated BedGraphs were visualised on UCSC Genome Browser as custom tracks, after scaling the tag densities by the normalisation factor for each histone mark. Histograms and heatmaps were generated using the annotatePeaks.pl command with up to −100 kb to 100 kb around the indicated genomic regions separated into 25 equally sized bins; the ChIP fragment coverage was scaled by the derived normalisation factor.
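The spike-in scaling and binning steps can be sketched in a few lines of Python. This is an illustration only. It assumes the normalisation factor is the reciprocal of the number of Drosophila-mapped reads in millions, a common ChIP-Rx convention; the factors actually used are those listed in Supplementary Data 4, and the exact formula is not restated in this section. The function names are hypothetical, and the bin layout simply divides a −100 kb to 100 kb window into 25 equal bins.

```python
def spike_in_factor(drosophila_reads):
    """Normalisation factor: reciprocal of spike-in reads in millions (assumed convention)."""
    if drosophila_reads <= 0:
        raise ValueError("spike-in read count must be positive")
    return 1_000_000 / drosophila_reads


def scale_coverage(coverage, factor):
    """Scale per-bin ChIP coverage values by the spike-in normalisation factor."""
    return [value * factor for value in coverage]


def bin_edges(start=-100_000, end=100_000, n_bins=25):
    """Edges of n_bins equally sized bins spanning the window [start, end]."""
    width = (end - start) / n_bins
    return [start + i * width for i in range(n_bins + 1)]
```

Under this convention, a sample with more spike-in reads receives a smaller factor, so that coverage is comparable across samples with different global levels of a histone mark.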
Code availability. All code used in this study can be accessed from the GEO database (accession number GSE104883).

Data availability
All ChIP-seq and RNA-seq datasets are deposited at the NCBI GEO database under the accession number GSE104883. All other data supporting the results of this study are available within the article and its supplementary information files and from the corresponding author upon request. A reporting summary is available as a Supplementary Information file.