Nuclear interacting SET domain protein 1 inactivation impairs GATA1-regulated erythroid differentiation and causes erythroleukemia

The nuclear receptor binding SET domain protein 1 (NSD1) is recurrently mutated in human cancers including acute leukemia. We show that NSD1 knockdown alters erythroid clonogenic growth of human CD34+ hematopoietic cells. Ablation of Nsd1 in the hematopoietic system of mice induces a transplantable erythroleukemia. In vitro differentiation of Nsd1−/− erythroblasts is markedly impaired despite abundant expression of GATA1, the transcriptional master regulator of erythropoiesis, and is associated with impaired activation of GATA1-induced targets. Retroviral expression of wild-type NSD1, but not of a catalytically inactive NSD1 N1918Q SET-domain mutant, induces terminal maturation of Nsd1−/− erythroblasts. Despite similar GATA1 protein levels, exogenous NSD1 but not NSD1 N1918Q significantly increases the occupancy of GATA1 at target genes and their expression. Notably, exogenous NSD1 reduces the association of GATA1 with the co-repressor SKI, and knockdown of SKI induces differentiation of Nsd1−/− erythroblasts. Collectively, we identify the NSD1 methyltransferase as a regulator of GATA1-controlled erythroid differentiation and leukemogenesis.

Loss of function mutations of NSD1 occur in blood cancers. Here, the authors report that NSD1 loss blocks erythroid differentiation, which leads to an erythroleukemia-like disease in mice by impairing GATA1-induced target gene activation.

Steady-state erythropoiesis is primarily controlled by erythropoietin (EPO) and other hormones including stem cell factor and glucocorticoids. Different pathways translate external signals to the activation of transcription factors and coregulators that drive expression programs that define erythroid identity 1. Erythroid differentiation is mainly regulated by a relatively small number of transcriptional regulators, including GATA1, SCL/TAL1, LMO2, LDB1, KLF1, and GFI1b, that dynamically form multiprotein complexes. However, it remains poorly understood how distinct complexes interact and activate or repress specific gene expression programs 2.
The best studied erythroid transcription factor is the GATA1 zinc-finger protein. GATA1 was shown to activate its target genes by complexing with SCL/TAL1, the bHLH protein E2A, and the LIM domain containing factors LMO2 and LDB1. GATA1-mediated repression was proposed to be executed by complexes containing FOG1, GFI1b, and/or Polycomb repressive complex 2 (PRC2) proteins 2,3. Inactivation studies in mice revealed that GATA1 is an essential master regulator of erythropoiesis, as Gata1-null embryos died in utero from anemia 4. Moreover, some adult female mice that are heterozygous for the targeted disruption of the X chromosome-linked Gata1 promoter region displayed reduced Gata1 gene expression (Gata1 1.05/X allele) and developed an early onset erythroleukemia-like disease 5. This mouse model suggested that reduced Gata1 activity contributes to leukemogenesis by preventing proper erythroid differentiation. Acute erythroleukemia (AEL) is a rare form of human acute myeloid leukemia (AML) generally associated with poor outcome 6. Recent studies have started to unravel the genetic landscape of AEL, but the molecular mechanisms that control the erythroid identity of the tumor cells remain poorly understood 7.
The nuclear receptor binding SET domain protein 1 (NSD1) histone methyltransferase was identified as a protein interacting with several nuclear receptors 8,9. Mono- and di-methylation of histone 3 lysine 36 (H3K36) and lysine 168 of linker histone 1.5 have been proposed to be the major cellular NSD1 substrates 10,11. Multiple studies suggest that NSD1 can act as a tumor suppressor gene. First, the NSD1 gene locus is subject to recurrent putative loss-of-function mutations in hematological malignancies and solid cancers 12-16. Second, the CpG island promoter of the NSD1 locus has also been reported to be frequently hyper-methylated in certain human cancers, thereby epigenetically silencing the allele 17,18. Third, heterozygous germline point mutations in NSD1 are the molecular correlate for Sotos syndrome, an overgrowth syndrome with learning disabilities and increased cancer risk 19,20. Finally, NSD1 was identified as a putative cancer predisposition gene mediated by rare germline variants and somatic loss-of-heterozygosity (LOH) 21. However, the mechanism by which NSD1 protects different cell types from malignant transformation remains unknown.
We study the role of NSD1 in steady-state hematopoiesis and leukemia. We observe that reduced NSD1 expression alters the clonogenic growth of erythroid progenitor cells derived from human CD34 + hematopoietic cells. Targeted Nsd1 gene inactivation during late fetal hematopoiesis in mice leads to malignant accumulation of erythroblasts phenocopying human acute erythroleukemia. Complementation experiments reveal that the NSD1-SET domain is critical for in vitro erythroblast terminal differentiation. In addition, our work suggests that NSD1 controls target gene activation by the erythroid master regulator GATA1, most likely through regulated association with the transcriptional co-repressor SKI. Collectively, we identify NSD1 as a co-regulator of GATA1-controlled terminal erythroid maturation and leukemogenesis.

Results
NSD1 knockdown in human CD34 + hematopoietic cells. To address the role of NSD1 in hematopoiesis, we first optimized lentiviral shRNA-mediated knockdown in human CD34 + hematopoietic cells (Supplementary Fig. 1a-d). We identified three NSD1 shRNAs that reduced the numbers of colonies grown in methylcellulose (MC) containing growth factors including EPO (Fig. 1a, b, Supplementary Fig. 1a, b). Interestingly, whereas very few colonies were generated upon replating of Ctrl-shRNA-transduced cells, cells transduced with NSD1-shRNA "372" or "353" formed abundant, relatively dense reddish colonies (Fig. 1b, c, Supplementary Fig. 1c). These colonies were mostly composed of CD45 low cells expressing the transferrin receptor (CD71) and glycophorin-A (GPA) and presenting with a proerythroblast-like morphology (Fig. 1d, e, Supplementary Fig. 1d). The cells could, however, not be further expanded in MC or in liquid cultures. Very similar results were obtained with human cord blood-derived cells (Supplementary Fig. 1e-h). Collectively, these data suggest that NSD1 regulates clonogenic erythroid differentiation of fetal and adult human CD34 + hematopoietic cells in vitro.
Ablation of Nsd1 induces erythroleukemia in mice. To address its role in steady-state hematopoiesis, we inactivated Nsd1 in mice 22. Nsd1 fl/fl ;Vav1-iCre tg/+ transgenic mice (hereafter referred to as Nsd1 −/−) efficiently excised both alleles in cells from different lineages, leading to almost undetectable levels of Nsd1 exon 5 mRNA and protein expression (Supplementary Fig. 2a-g). At the age of 6-25 weeks (median 91 days, n = 24) all Nsd1 −/− mice developed signs of distress, significant hepatosplenomegaly with extensive cellular multi-organ infiltrations, reduced red blood cell (RBC) counts and hemoglobin levels, reticulocytosis, and severe thrombocytopenia (Fig. 2a-h, Supplementary Fig. 2h, Supplementary Table 1). White blood cell (WBC) counts were mostly within the normal range, but "unclassified leukocytes" were detected and erythroblast-like cells were seen on peripheral blood smears (Fig. 2i, Supplementary Fig. 2i-k). Transplantation of BM cells from symptomatic Nsd1 −/− mice (alone or 1:1 in competition with normal cells) rapidly induced the same disease in lethally irradiated wild-type recipients, after a latency of 33 and 42 days, respectively, characterized by hepatosplenomegaly, multi-organ infiltration, anemia, thrombocytopenia, and erythroblasts in the periphery (Fig. 2j, Supplementary Fig. 2l, Supplementary Table 2).
BM and spleen cells from diseased mice expressed modest levels of the transferrin receptor (CD71) and variable amounts of Kit and FcγRII/III, but were negative for CD34, B220, and Sca-1 (Fig. 3a, Supplementary Fig. 3a, b). Erythroid differentiation was defined by staining of CD71 and Ter119 progressing from immature CD71 low Ter119 low ("R0") to CD71 low Ter119 high ("R4") cells (Supplementary Fig. 3c) 23. Whereas a decrease of the R4 fraction was mostly evident in the spleens, all diseased Nsd1 −/− mice significantly accumulated CD71 dim /Ter119 low cells in BM and spleen (Fig. 3b, Supplementary Fig. 3d). BM cells of diseased Nsd1 −/− mice formed reduced numbers of colonies in MC, with a significant reduction of CFU-GM and BFU-E colonies accompanied by sometimes large and abnormally dense, reddish, benzidine-positive and serially replatable "BFU-E-like" colonies composed of myeloid and erythroid progenitors (Fig. 3c-e).
As Vav1-promoter driven Cre expression resulted in significant reduction of Nsd1 expression as early as E13.5 of development, we also analyzed the impact of Nsd1 inactivation during fetal liver hematopoiesis (Supplementary Fig. 3g, h) 24. Here, we observed clusters of large cells with a dark-blue cytoplasm on E19.5 fetal liver sections (Fig. 3g). MC cultures did not display any significant differences in total colony number; however, E19.5 Nsd1 −/− fetal liver cells formed dense colonies of mostly CD71 + cells (Fig. 3h, i, Supplementary Fig. 3i) resembling those formed by diseased adult BM (Fig. 3d).
Collectively, these data show that inactivation of Nsd1 in the hematopoietic system induces an erythroleukemia-like disease in mice 26 .
Aberrant regulation of GATA1 in Nsd1 −/− erythroblasts. To elucidate the role of Nsd1 in erythroleukemia, we first established culture conditions for primary erythroblasts that maintain cytokine dependency as well as differentiation potential towards enucleated erythrocytes (Fig. 4a) 27. Growth of fetal liver (FL)-derived Nsd1 −/− erythroblasts did not significantly differ from littermate controls in maintenance medium ("MM", containing dexamethasone, hIGF1, cholesterol, and hEPO). In contrast, differentiation of Nsd1 −/− cells was significantly impaired while control cells completely matured in mSCF and hEPO containing differentiation-inducing medium ("DM") (Fig. 4b-

Erythropoiesis is controlled by the transcriptional master regulator GATA1 28. While Nsd1 −/− erythroblasts expressed reduced Gata1 mRNA levels in DM, GATA1 protein expression remained abundant in maintenance medium and during induced differentiation (Fig. 5a, b, Supplementary Fig. 4e, f).

To study the impact of NSD1 on GATA1 transcription factor activity we compared the expression of previously proposed GATA1 target genes. Overexpression of GATA1 induced expression of several genes in Nsd1 −/− cells, including Hba-A, Hbb-B, and Bcl2l1, which are normally activated by GATA1 during differentiation (Fig. 5f, Supplementary Fig. 4h, i). In contrast, exogenous Gata1 did not affect expression of Spi1 but further reduced expression of Kit and Gata2, which are known to be downregulated by GATA1 during normal erythroid differentiation (Fig. 5g, Supplementary Fig. 4j, k). Together, these data suggest that the activation of GATA1-controlled target genes during erythroid differentiation is modulated by NSD1 29.
NSD1-SET is essential for in vitro erythroblast maturation. To address how NSD1 controls erythroid differentiation we compared the effects of expression of wild type (Nsd1) or a catalytically inactive SET domain mutant (Nsd1 N1918Q ) in Nsd1 −/− erythroblasts (Fig. 6a) 30 . Expression of Nsd1 but not Nsd1 N1918Q significantly rescued terminal maturation as illustrated by cellular morphology, a shift of CD71 and Ter119 surface expression, formation of reddish cell pellets, reduced proliferation, and reduced colony formation in MC ( Fig. 6b-f, Supplementary  Fig. 5a-c).

To address the molecular mechanisms, we measured differentially expressed genes (DEGs) and total proteome expression in BM-derived Nsd1 −/− erythroblasts retrovirally expressing Nsd1 or Nsd1 N1918Q expanded in MM and kept for 24 h in DM (Fig. 7a). After Table 3) 37,38. Collectively, these observations suggest that the catalytic activity of NSD1 is essential for terminal erythroid maturation and regulation of GATA1 targets.
Nsd1 regulates GATA1 chromatin binding and protein interactions. As expression of wild type or mutant Nsd1 did not overtly change GATA1 protein levels in Nsd1 −/− erythroblasts kept for 2 days in DM, we compared chromatin binding and putative protein interactions of GATA1 by ChIP and IP-MS after 24 h of induced differentiation (Fig. 8a-c). Here, we found increased occupancy of GATA1 at over 3000 sites in the genome overlapping with 1362 genes (p < 0.01) in cells expressing Nsd1 in comparison to the catalytically inactive Nsd1 N1918Q mutant (Fig. 8d). Of the genes with significantly increased binding of GATA1, 731 had promoter regions decorated by H3K27 ac, while H3K36 me3 marks overlapped with 1179 gene bodies (Supplementary Data 6). Hence, while global levels of GATA1 remain constant, reintroduction of Nsd1 resulted in increased DNA binding to available GATA1 sites in promoter regions, similarly reflected in changes in H3K36 me3 and H3K27 ac at the genomic coordinates. Interestingly, changes in gene expression aligned with H3K27 ac around TSS, confirming that these epigenetic marks are directly regulating the downstream transcriptional programming (Fig. 8e). However, we could not detect any gene loci with a statistically significant increase of all three, GATA1, H3K36 me3, and H3K27 ac (Supplementary Data 7 and 8), which could be a matter of temporal distance along the activation pathway. Nevertheless, Nsd1-induced regulation of several erythroid regulators was associated with simultaneously changed GATA1 binding, H3K27 ac and H3K36 me3 marks. The Pklr gene locus, encoding the liver-red cell pyruvate kinase linked to erythroid differentiation, and Art4, encoding the developmentally regulated Dombrock blood group glycoprotein, were both more highly expressed in Nsd1 −/− cells expressing wild-type Nsd1, associated with a narrow GATA1 peak in the promoter region within a broader decoration of H3K27 ac, followed by gene body-wide H3K36 me3 marks (Fig. 8f) 39,40. The opposite was observed for Fgf2 (encoding fibroblast growth factor 2, which has been associated with inhibition of efficient erythroid differentiation), which appeared more highly expressed in Nsd1 N1918Q- than in Nsd1-expressing cells (Supplementary Fig. 6a) 41. Immunoblot and mass spectrometry analysis revealed globally reduced mono-, di-, and tri-methylated H3K36 in Nsd1 −/− erythroblasts expressing the inactive Nsd1 N1918Q mutant compared to those expressing wild-type Nsd1 (Supplementary Fig. 6b, c).
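The overlap counts reported for the ChIP data (genes with increased GATA1 occupancy that also carry H3K27ac at the promoter or H3K36me3 over the gene body) amount to set intersections of gene lists. The following is a minimal sketch of that bookkeeping, not the authors' pipeline: the function name `overlap_summary` is hypothetical, and apart from Pklr and Art4, which are named in the text, the gene identifiers are placeholders.

```python
def overlap_summary(gata1_up, h3k27ac_promoter, h3k36me3_body):
    """Count how genes with increased GATA1 occupancy intersect two histone-mark gene sets."""
    gata1_up = set(gata1_up)
    with_k27ac = gata1_up & set(h3k27ac_promoter)
    with_k36me3 = gata1_up & set(h3k36me3_body)
    return {
        "gata1_up": len(gata1_up),
        "with_h3k27ac": len(with_k27ac),
        "with_h3k36me3": len(with_k36me3),
        "with_both": sorted(with_k27ac & with_k36me3),
    }


# Toy input: GeneA, GeneB and GeneC are placeholders, not genes from the study.
gata1_up = {"Pklr", "Art4", "GeneA", "GeneB"}
h3k27ac_promoter = {"Pklr", "Art4", "GeneA", "GeneC"}
h3k36me3_body = {"Pklr", "Art4", "GeneB"}

summary = overlap_summary(gata1_up, h3k27ac_promoter, h3k36me3_body)
print(summary)
```

In a real analysis the three inputs would come from peak-to-gene annotation of the respective ChIP-seq tracks; how peaks are assigned to promoters or gene bodies is not specified here and would have to be chosen explicitly.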
To address whether impaired chromatin binding and transactivation of GATA1 in the absence of Nsd1 might be associated with altered protein interactions, we immunoprecipitated GATA1 followed by mass spectrometry in Nsd1 −/− cells either expressing wild-type Nsd1 or the inactive Nsd1 N1918Q mutant kept for 24 h in DM. We identified 413 differentially expressed proteins (p < 0.05) (Supplementary Data 9), of which the most significant ones included known interactors of GATA1 such as MBD2, RBBP4, ZFPM1, RUNX1, and TAL1, suggesting functionality of the assay (Fig. 8g) 42. Interestingly, mass spectrometry analysis revealed that differentiation of Nsd1-expressing Nsd1 −/− erythroblasts was associated with a highly significant reduction (logFC = −1.96; p < 1.08 × 10 −7) of the transcriptional repressor protein SKI, previously proposed to interact with and inhibit GATA1 activation, most likely in cooperation with the nuclear co-repressor (NCoR) complex (Fig. 8g, h, Supplementary Fig. 6d) 43,44. Notably, several members of the NCoR complex (NCOR1, HDAC3, TBLXR1) co-appeared with SKI as differentially regulated (Fig. 8g, Supplementary Data 9).
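Selecting proteins with reduced GATA1 association from such an IP-MS comparison comes down to thresholding on logFC and p-value, as in a volcano plot. The sketch below illustrates this under stated assumptions: the class `Enrichment`, the function `reduced_interactors`, and the logFC cutoff of −1.0 are hypothetical; the p < 0.05 cutoff and the SKI values (logFC = −1.96, with the reported p-value bound used as the value) are taken from the text, and the other two rows are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Enrichment:
    protein: str
    log_fc: float   # log fold change, Nsd1 versus Nsd1 N1918Q
    p_value: float


def reduced_interactors(rows, max_log_fc=-1.0, max_p=0.05):
    """Return proteins depleted from GATA1 pulldowns, strongest reduction first."""
    hits = [r for r in rows if r.log_fc <= max_log_fc and r.p_value < max_p]
    return sorted(hits, key=lambda r: r.log_fc)


rows = [
    Enrichment("SKI", -1.96, 1.08e-7),    # logFC and p-value bound quoted in the text
    Enrichment("ProteinX", -0.40, 0.20),  # placeholder row
    Enrichment("ProteinY", 1.20, 0.01),   # placeholder row
]

hits = reduced_interactors(rows)
print([h.protein for h in hits])
```

The logFC cutoff is a free parameter; a stricter or looser value changes which co-repressor complex members pass the filter.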
SKI knockdown differentiates Nsd1 −/− erythroblasts. To functionally explore reduced GATA1-SKI association upon Nsd1 expression, we asked whether experimental shRNA-mediated reduction of SKI might be sufficient to initiate maturation of Nsd1 −/− erythroblasts (Fig. 9a). We found that SKI knockdown significantly increased in vitro induced terminal maturation of erythroblasts from three independent Nsd1 −/− mice, as shown by cellular morphology, flow cytometry (CD71/Ter119/Kit), and proliferation ( Fig. 9b-d, Supplementary Fig. 7a). SKI knockdown did not alter GATA1 protein levels (Fig. 9e). Notably, prolonged culture in DM was associated with a general reduction of SKI levels suggesting a role for SKI during initiation rather than terminal differentiation. SKI knockdown also significantly reduced clonogenic growth, total number of cells, and Kit + expression of the cells in MC (Fig. 9f-h, Supplementary Fig. 7b). Collectively, these data suggest that in the absence of Nsd1, terminal erythroid maturation is blocked as a consequence of impaired GATA1 transactivation dependent on its association with the transcriptional repressor SKI.

Discussion
The observations that reduced expression of NSD1 altered erythroid clonogenic growth of human CD34 + cells and significantly impaired terminal erythroid maturation leading to an erythroleukemia-like disease in mice characterize NSD1 as a regulator of erythroid differentiation. Mechanistically, we found

or Nsd1 N1918Q were analyzed during expansion in maintenance medium (0 h) and after 24 h in differentiation medium by RNA-seq and global proteome analysis. b Heatmap of the top 100 differentially expressed genes (corresponding to FDR < 1.06 × 10 −9) of Nsd1 −/− BM-derived erythroblasts expressing Nsd1 (brown squares) and Nsd1 N1918Q (black squares) in maintenance medium, and after 24 h in differentiation medium (Nsd1, red squares; Nsd1 N1918Q, gray squares). Column clustering was done by Ward's linkage on correlations. c Gene set enrichment analysis (GSEA) (weighted Kolmogorov-Smirnov-like statistics, two-sided, with adjustment for multiple comparisons) of differential expression between Nsd1 −/− BM-derived erythroblasts expressing Nsd1 before and after 24 h in differentiation medium. d GSEA (weighted Kolmogorov-Smirnov-like statistics, two-sided, with adjustment for multiple comparisons) of differential expression between

erythroleukemia cells were able to differentiate into mature erythrocytes when complemented with full-length Gata1 (ref. 45). In contrast to Gata1 1.05/+ mice, Nsd1 −/− mice developed a fully penetrant erythroleukemia-like phenotype after a shorter latency (Fig. 2). The best-studied in vivo erythroleukemia model is Friend's virus complex induced erythroblastosis, in which viral integration results in aberrant expression of the Spi1 gene encoding PU.1 (ref. 46). Similar to Nsd1 −/− mice, Spi1 transgenic mice develop anemia, thrombocytopenia, and multi-organ infiltration of erythroblasts, progressing from an EPO-dependent stage to EPO-independence by acquisition of activating mutations in the c-kit receptor tyrosine kinase 47,48. However, very similar to Nsd1 −/− erythroblasts, Friend's virus erythroblastosis-derived MEL cells constitutively expressed GATA1 protein that could not be explained by the interaction with PU.1 (Fig. 5) 49. In addition, conditional activation of exogenous Gata1 was also reported to induce erythroid differentiation in some MEL cell lines 49,50.
Nsd1 controls GATA1 protein interaction and activation of erythroid regulators. To study the mechanism of Nsd1-controlled erythroid differentiation we faced the problem that primary erythroblast cultures can contain significant fractions of myeloid cells, which are not present in Nsd1 −/− cultures. Therefore, we chose to virally express WT or a previously reported catalytically inactive Nsd1 N1918Q SET-mutant in Nsd1 −/− erythroblasts 30. However, not only did the large size of the Nsd1 ORF drastically impair gene transfer efficacy, but transduced cells also did not tolerate high levels of exogenous Nsd1, which limited the generation of stably expressing cells in time and numbers. Nevertheless, low-level mRNA expression resulted in detectable Nsd1 protein expression sufficient to restore terminal maturation of Nsd1 −/− erythroblasts in a methyltransferase activity-dependent manner (Fig. 6). Interestingly, Nsd1 expression was associated with increased GATA1 binding to and transactivation of a large number of previously proposed GATA1 target genes, associated with changes in H3K27 ac and H3K36 me3 marks (Fig. 8). These observations led us to speculate that in the absence of Nsd1, GATA1 might be functionally trapped in some saturated interactions that may limit its transactivation potential, which can be overcome by expression of additional "free" GATA1.
Characterization of putative GATA1 interactions by immunoprecipitation and mass spectrometry suggested that in the absence of Nsd1, GATA1 associates with potent transcriptional repressors (Fig. 8). Notably, expression of wild type but not the inactive SET Nsd1 mutant resulted in a highly significant reduction of the association of GATA1 with the transcriptional co-repressor SKI. SKI is well known for its role as a regulator of the TGF-beta/Smad signaling pathway 51,52. SKI was also found to be overexpressed in AML and proposed to repress retinoic acid receptor and RUNX1-mediated signaling 53-55. In addition, SKI was reported to control HSC fitness in myelodysplastic syndromes (MDS) 56. Most importantly, SKI was shown to physically interact with GATA1, to repress GATA1-mediated transactivation, and to block erythroid differentiation by blocking the interaction of GATA1 with DNA 43. In addition, SKI-mediated repression seemed to be NCoR dependent, and several NCoR complex proteins were altered in GATA1 pulldowns upon expression of Nsd1 (Fig. 8) 44. These observations suggest that the methyltransferase activity of NSD1 controls the interaction of GATA1 with SKI or other, yet to be defined mediators.
Very recent studies using quantitative proteomics revealed that co-repressors are dramatically more abundant than co-activators in erythroblasts 57. How the lack of Nsd1 directly regulates differential interaction of GATA1 co-activators and co-repressors remains to be elucidated. One can hypothesize that Nsd1-mediated H3K36 methylation provides the anchors for ultimate accumulation of sufficient co-activators on critical target gene loci.
Recent work suggested that H3K36 methylation is critical for normal erythroid differentiation. A conditional H3K36M mutation severely affected the murine hematopoietic system resulting in defects that partially phenocopy those observed in the Nsd1 −/− mice. H3K36M transgenic mice also developed anemia, thrombocytopenia, and splenomegaly. Most notably, these mice also showed a dramatic increase in early Ter119 − erythroid progenitor cells in the BM but also in the periphery 58. Another study found that reduced H3K36 me2 by Nsd1 inactivation in ES cells resulted in re-localization of the DNMT3A DNA methyltransferase, which interacts with the H3K36 me3 through its PWWP domain. This led to hypomethylation of euchromatic intergenic regions as observed in Sotos patients having NSD1 loss of function mutations 59. Interestingly, normal erythroid maturation, and particularly the transition from CFU-E to proerythroblasts, was found to correlate with activation of a significant number of genes associated with gained DNA methylation on selective genes including numerous GATA1 targets 60. Together, these findings suggest that the loss of H3K36 methylation and redistribution of DNMT3A could be directly responsible for impaired binding of GATA1 and its coactivators. The fact that we pulled down DNMT3A by immunoprecipitation of GATA1 in Nsd1 −/− cells (Supplementary Data 10 and 11) suggests that GATA1 binding could indeed not only be dependent on H3K36me but also on DNMT3A-mediated DNA methylation.

NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16179-8 ARTICLE

Fig. 8 Nsd1 expression increases GATA1 chromatin binding and changes GATA1 protein interaction partners during induced differentiation of Nsd1 −/− cells. a Relative Gata1 mRNA expression levels (1/dCt) in Nsd1 −/− BM-derived erythroblasts virally expressing Nsd1 (red bars) or Nsd1 N1918Q (gray bars) expanded in maintenance medium (day 0) or after 1 and 2 days in differentiation medium. Values were normalized to Gapdh (n = 4 per group). b Western blot showing GATA1 protein expression in 1 × 10 6 Nsd1 −/− BM-derived erythroblasts expressing Nsd1 or Nsd1 N1918Q upon expansion in maintenance medium (day 0), and after 1 and 2 days in differentiation medium. Actin was used as a loading control. These data represent one of two experiments. c Experimental setup of the ChIP-seq and IP-MS experiment. d Heatmaps of genome-wide ChIP-seq signals in Nsd1 −/− BM-derived erythroblasts expressing Nsd1 (left column) or Nsd1 N1918Q (right column) after 24 h in differentiation medium for GATA1, H3K27 ac, and H3K36 me3. All heatmaps are sorted decreasingly according to read coverage around transcriptional start sites (TSS) of GATA1 (leftmost). Input denotes sheared non-immunoprecipitated DNA (rightmost), serving as visual control. Density plots above each heatmap depict the corresponding averaged binding around TSS. e One-dimensional heatmap of logFC between gene expression of Nsd1 −/− BM-derived erythroblasts expressing Nsd1 or Nsd1 N1918Q after 24 h in differentiation medium (as presented in Fig. 5h, j) sorted according to read coverage around TSS for H3K27 ac ChIP (data as shown in panel c, sorted independently; only overlapping genes are displayed). f Integrated genome viewer (IGV) representation of GATA1, H3K27 ac, and H3K36 me3 ChIP peaks in the Pklr (top panel) and Art4 gene locus (lower panel) from Nsd1 −/− BM-derived erythroblasts either expressing Nsd1 or Nsd1 N1918Q after 24 h in differentiation medium. Right panels show Pklr and Art4 mRNA relative expression levels (1/dCt) in Nsd1 −/− BM-derived erythroblasts expressing Nsd1 or Nsd1 N1918Q in maintenance medium (day 0) and after 24 h in differentiation medium. Values are shown as relative expression normalized to Gapdh (n = 4). g Volcano plot of differential protein enrichments by GATA1 immunoprecipitation (GATA1-IP) in Nsd1 −/− BM-derived erythroblasts either expressing Nsd1 or Nsd1 N1918Q kept for 24 h in differentiation medium; each group is normalized to IgG control (n = 2). Significantly reduced GATA1-SKI association (indicated by a black arrow) was observed upon expression of Nsd1 compared to Nsd1 N1918Q (FDR < 0.05). h Western blot analysis showing SKI protein expression in 1 × 10 6 BM-derived Nsd1 −/− erythroblasts either expressing Nsd1 or Nsd1 N1918Q during expansion in maintenance medium (day 0), and after 1 and 2 days in differentiation medium. Actin was used as a loading control. These data represent one of two experiments. Values are presented as individual points, bar graphs represent the mean value of biological replicates, and error bars represent the standard error of the mean. Statistical significance in a, f was tested with a paired two-tailed t-test.
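The legend of Fig. 8 reports mRNA levels as 1/dCt normalized to Gapdh, summarized as mean with standard error of the mean and compared with a paired t-test. A minimal sketch of these calculations follows, assuming dCt = Ct(target) − Ct(Gapdh); the function names and all Ct values are hypothetical and are not data from the study. Only the t statistic and degrees of freedom are computed; converting them to a two-tailed p-value requires the t distribution, for example from a statistics package.

```python
import math
import statistics


def inverse_dct(ct_target, ct_reference):
    """Relative expression as 1/dCt, with dCt = Ct(target) - Ct(reference)."""
    dct = ct_target - ct_reference
    if dct == 0:
        raise ValueError("dCt is zero; 1/dCt is undefined")
    return 1.0 / dct


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b."""
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [y - x for x, y in zip(a, b)]
    mean_d, sem_d = mean_and_sem(diffs)
    return mean_d / sem_d, len(diffs) - 1


# Hypothetical Ct values for four biological replicates (illustration only).
ct_gapdh = [18.0, 18.0, 18.0, 18.0]
ct_day0 = [28.0, 26.0, 23.0, 22.0]
ct_day1 = [23.0, 22.0, 22.0, 20.0]

day0 = [inverse_dct(t, r) for t, r in zip(ct_day0, ct_gapdh)]
day1 = [inverse_dct(t, r) for t, r in zip(ct_day1, ct_gapdh)]
t_stat, dof = paired_t(day0, day1)
print(mean_and_sem(day0), mean_and_sem(day1), t_stat, dof)
```

Because 1/dCt is a nonlinear transform of the Ct difference, it preserves the ranking of expression levels but is not a fold change.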
NSD1, SKI, and human erythroleukemia. Nsd1 gene inactivation during late stage fetal liver hematopoiesis induced a fully penetrant lethal disease that phenocopied several aspects of acute erythroleukemia, a rare form of human AML 6. Putative loss of function missense or frameshift NSD1 mutations have been found in various human cancers including AML (https://cancer.sanger.ac.uk/cosmic/gene/analysis?ln=NSD1). Interrogation of the cancer cell line encyclopedia (CCLE) revealed that only very few human cancer cell lines express negligible levels of NSD1 mRNA and protein, among them F-36P, a cell line established from a patient with acute erythroleukemia (https://portals.broadinstitute.org/ccle/page?gene=NSD1) 61. Notably, very recent work revealed several cases of childhood acute erythroleukemia that harbored fusion genes involving NSD1 7. Based on our findings one can speculate that in such cases the fusion may either act in a dominant-negative manner to NSD1 expressed from the non-rearranged allele, or that LOH reduces NSD1 activity as recently reported for a significant number of solid cancers 21. Interestingly, we also found aberrantly high SKI expression levels in tumor cells of some erythroleukemia patients, and in vivo overexpression of a SKI ORF in BM-derived HSPC resulted in an erythroleukemia-like disease in mice, suggesting that SKI expression may not only be critical for impaired erythroid differentiation in Nsd1 −/− mice but also a driver of the human disease 62. Collectively, our observations suggest that impaired NSD1 activity functionally interferes with lineage-associated transcriptional master regulators such as GATA1, resulting in impaired cellular differentiation as a first step to malignant transformation.

Methods
Western blotting. For protein detection, total cell extracts were isolated from 1 × 10 6 freshly cultured cells using 60 μl of Laemmli sample buffer containing 20% SDS. Following 5 min boiling at 100°C, samples were centrifuged at 4°C for 10 min, and the supernatant was placed in a new tube. Nuclear protein lysates were prepared by resuspending cells in hypotonic lysis buffer (10 mM HEPES pH 7.9, 10 mM KCl, 0.1 mM EDTA, 0.1 mM EGTA, 1 mM DTT) for 15 min on ice, followed by treatment with 0.1% NP-40 and 15 s vortexing. Nuclei were spun down at 14,000 r.p.m. for 2 min at 4°C and the supernatant containing the cytoplasmic fraction was kept for analysis. Pellets were resuspended in nuclear lysis buffer (20 mM HEPES pH 7.9, 0.4 M NaCl, 1 mM EDTA, 1 mM EGTA, 1 mM DTT). In addition, pellets were sonicated for three cycles (30 s sonication, 30 s pause) on a Bioruptor pico sonicator (Diagenode, Seraing, Belgium) and left for 20 min on ice before spinning down at 14,000 r.p.m. for min at 4°C. Lysates were kept for analysis of nuclear proteins and the remaining pellets were used for histone extraction in 0.2 N HCl and beta-mercaptoethanol. Lysis buffers were supplemented with Complete Mini protease inhibitors (Cat. 11836153001; Roche). Proteins were quantified by Bradford assay (Biorad, München, Germany) and loading was adjusted. Samples were prepared in 4× Laemmli buffer (Biorad, München, Germany) and boiled for 10 min at 95°C before loading on pre-cast (BioRad) or hand-cast gels of different percentages. For the NSD1 blot, 50 μg of nuclear extract was loaded on a 5% running gel. Wet transfer was done overnight at 4°C in 5% methanol/0.1% SDS/tris-base-bicine buffer on 0.45 µm nitrocellulose membranes. For blotting GATA1, 10 μg of nuclear extract was loaded on 10% gels and semi-dry transfer was done for 30 min on 0.2 µm nitrocellulose (Biorad, München, Germany). For SKI, whole lysate from 1 × 10 6 cells was boiled and loaded on 6-7.5% gels. Wet transfer was carried out for 3 h at 4°C. Membranes were blocked in 5% non-fat milk (NFM) in PBS-1% Tween for 2 h at room temperature. Blots were probed overnight with antibody at 4°C in 2.5% NFM/PBS-1% Tween, washed three times for 15 min in PBS-1% Tween and probed with a secondary antibody in 2.5% NFM/PBS-1% Tween. Again, blots were washed three times for 15 min in PBS-1% Tween and then probed with Supersignal West Femto Max substrate (Thermo Scientific, Reinach, Switzerland). Carestream Biomax Kodak films were used for development (Sigma, New York, USA). Uncropped original scans of the western blot membranes shown in Figs. 5b, 6c, 8b, h are provided in Supplementary Fig. 9. Information regarding the antibodies used can be found in Supplementary Table 7.
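Preparing the hypotonic lysis buffer from concentrated stocks follows the dilution relation C1 × V1 = C2 × V2. The sketch below computes the stock volumes for a chosen final volume; the final concentrations are those given in the Western blotting section, whereas the stock concentrations, the names `stock_volume_ul` and `recipe`, and the 10 ml batch size are assumptions for illustration. NP-40 and protease inhibitors, which are added separately, are not included.

```python
def stock_volume_ul(final_mM, stock_mM, final_volume_ml):
    """Volume of stock in microliters, from C1 * V1 = C2 * V2."""
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the final buffer")
    return final_mM * final_volume_ml * 1000.0 / stock_mM


# Final concentrations (mM) as stated in the text.
HYPOTONIC_BUFFER_mM = {"HEPES pH 7.9": 10, "KCl": 10, "EDTA": 0.1, "EGTA": 0.1, "DTT": 1}
# Stock concentrations (mM) are assumptions, not taken from the text.
ASSUMED_STOCKS_mM = {"HEPES pH 7.9": 1000, "KCl": 1000, "EDTA": 500, "EGTA": 500, "DTT": 1000}


def recipe(final_volume_ml):
    """Stock volumes (in microliters) plus water needed to reach the final volume."""
    volumes = {
        name: stock_volume_ul(conc, ASSUMED_STOCKS_mM[name], final_volume_ml)
        for name, conc in HYPOTONIC_BUFFER_mM.items()
    }
    volumes["water"] = final_volume_ml * 1000.0 - sum(volumes.values())
    return volumes


plan = recipe(10.0)  # hypothetical 10 ml batch
for component, microliters in plan.items():
    print(f"{component}: {microliters:.1f} µl")
```

The same helper applies to the nuclear lysis buffer by swapping in its final concentrations and suitable stocks.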
Retroviral gene transfer. A full-length cDNA for murine Nsd1 (pSG5) was obtained from R. Losson (Strasbourg). Wild type (Nsd1) and a catalytically inactive (Nsd1 N1918Q) mutant ORF were cloned into the murine stem cell virus (pMSCV) expression vector and sequence verified. A retrovirus (pLMP) encoding an SKI-specific mir-shRNA was a gift from M. Hayman (Buffalo, NY). A full-length cDNA for murine Gata1 was obtained from T. Mercher (Paris), cloned into pMSCV, and sequence verified. Retroviral stocks were produced by transient cotransfection of packaging vectors (pIPAK6) and the respective plasmids using Turbofect or Jetprime transfection reagent (Life Technologies, Paisley, UK) in HEK293T-LX cells kept in DMEM (Gibco, Lubio, Thermo Fisher Scientific, Reinach, Switzerland) with 10% FCS and 1% P/S. Viral supernatants were harvested 48 and 72 h after transfection, concentrated 10× with Vivaspin 20 columns (Sartorius, Göttingen, Germany) at 4000 r.p.m. for 40 min at 4°C, snap frozen in liquid nitrogen, and stored at −80°C until use. Cells were spin-infected either in StemSpan SFEM supplemented with 50 ng/ml hTPO (Peprotech, London, UK) and 50 ng/ml mSCF, or in the maintenance medium used for erythroblast culture as described above, in the presence of 5 µg/ml polybrene (Sigma Aldrich, Buchs, Switzerland) with virus for 90 min at 2500 r.p.m. and 30°C. Four hours after spin infection, the cells were washed with PBS and plated in maintenance medium. Two days after spin infection, the cells were selected with 2 µg/ml puromycin (Gibco, Thermo Fisher Scientific, Reinach, Switzerland) or EGFP + cells were FACS enriched as described before. Information regarding the plasmids used can be found in Supplementary Table 8.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The raw RNA expression data are accessible under accession number GSE136811. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD017657 (ref. 65).