An Erg-driven transcriptional program controls B cell lymphopoiesis

B lymphoid development is initiated by the differentiation of hematopoietic stem cells into lineage committed progenitors, ultimately generating mature B cells. This highly regulated process generates clonal immunological diversity via recombination of immunoglobulin V, D and J gene segments. While several transcription factors that control B cell development and V(D)J recombination have been defined, how these processes are initiated and coordinated into a precise regulatory network remains poorly understood. Here, we show that the transcription factor ETS Related Gene (Erg) is essential for early B lymphoid differentiation. Erg initiates a transcriptional network involving the B cell lineage defining genes, Ebf1 and Pax5, which directly promotes expression of key genes involved in V(D)J recombination and formation of the B cell receptor. Complementation of Erg deficiency with a productively rearranged immunoglobulin gene rescued B lineage development, demonstrating that Erg is an essential and stage-specific regulator of the gene regulatory network controlling B lymphopoiesis. B cell development is tightly regulated in a stepwise manner to ensure proper generation of repertoire diversity via somatic gene rearrangements. Here, the authors show that a transcription factor, Erg, functions at the earliest stage to critically control two downstream factors, Ebf1 and Pax5, for modulating this gene rearrangement process.

Transcription factors are critical for controlling the expression of genes that regulate B-cell development. The importance of specific B-lymphoid transcription factors is highlighted by the phenotype of gene knockout models. Failure of B-cell lineage specification from multi-potential progenitors occurs with deletion of Ikzf1 1 and Spi1 (Pu.1) 2 , while deletion of Tcf3 (E2A) 3 and Foxo1 4 results in failure of B-cell development from common lymphoid progenitors (CLPs). Developmental arrest later in B lymphopoiesis is observed with deletion of Ebf1 and Pax5 at the pre-proB and proB stages, respectively 5,6 . This sequential pattern of developmental arrest associated with the loss of gene function, along with ectopic gene complementation studies 2 , gene expression profiling 7 and analysis of transcription factor binding to target genes, supports models in which transcription factors are organised into hierarchical gene regulatory networks that specify B-lymphoid lineage fate, commitment and function 8 .
Two transcription factors that have multiple roles during B-cell development are Ebf1, a member of the COE family, and Pax5, a member of the PAX family. While Ebf1 and Pax5 have been shown to bind to gene regulatory elements of a common set of target genes in a co-dependent manner during later stages of B lineage commitment 9 , both manifest distinct roles during different developmental stages. Ebf1 has been proposed to form a transcriptional network with E2A and Foxo1 in CLPs that appears important in early B-lymphoid fate determination 10 , while during later stages of B lymphopoiesis, Ebf1 acts as a pioneer transcription factor that regulates chromatin accessibility at a subset of genes co-bound by Pax5 11 as well as at the Pax5 promoter itself 12 . Pax5, in contrast, regulates B-cell genomic organisation 13 including the Immunoglobulin heavy chain (Igh) locus during V(D)J recombination, co-operating with factors such as CTCF 14 , as well as transactivating 15 and facilitating the activity of the recombinase activating gene (Rag) complex 16 .
It is unclear, however, how these various functions of Ebf1 and Pax5 are co-ordinated during different stages of B-lymphoid development. In particular, it would be important to ensure coordinated Ebf1 and Pax5 co-expression before the pre-BCR checkpoint, such that Ebf1 and Pax5 co-regulated target genes required for V(D)J recombination and pre-B-cell receptor complex formation are optimally expressed 9 .
Here we show that the ETS-related gene (Erg), a member of the ETS family of transcription factors, plays this vital role in B lymphopoiesis. Deletion of Erg from early lymphoid progenitors resulted in developmental arrest at the early pre-proB-cell stage and loss of V H -to-DJ H recombination. Gene expression profiling, DNA-binding analysis and complementation studies demonstrated Erg to be a transcriptional regulator that lies at the apex of an Erg-dependent Ebf1 and Pax5 gene regulatory network commencing in pre-proB cells. This co-dependent transcriptional network directly controls expression of the Rag1/Rag2 recombinase activating genes and the Lig4 and Xrcc6 DNA repair genes required for V(D)J recombination, as well as expression of components of the pre-BCR complex such as CD19, Igll1, Vpreb1 and Vpreb2. Taken together, we define an essential Erg-mediated transcription factor network required for regulation of Ebf1 and Pax5 expression that is exquisitely stage specific during early B-lymphoid development.

Results
Erg is required for B-cell development. To build on prior work defining the role of the transcription factor Erg in regulation of hematopoietic stem cells (HSCs) 17 and megakaryocyte-erythroid specification 18 , we sought to identify whether Erg played roles in other hematopoietic lineages. Erg expression in adult hematopoiesis was first examined by generating mice carrying the Erg tm1a(KOMP)wtsi knock-first reporter allele (Erg KI ) (Fig. 1a). Consistent with the known role for Erg in hematopoiesis 17-21 , significant LacZ expression driven by the endogenous Erg promoter was observed in HSCs and multi-potential progenitor cells, as well as in granulocyte-macrophage and megakaryocyte-erythroid progenitor populations, with declining activity accompanying erythroid maturation (Fig. 1b with definitions of cells examined provided in Supplementary Table 1 and representative flow cytometry plots in Supplementary Fig. 1). In other lineages, transcription from the Erg locus was evident in CLP, all lymphoid and B-cell-biased lymphoid progenitor cells, as well as in B lineage committed pre-proB, proB and preB cells and double-negative thymic T-lymphoid cell subsets, with a reduction in transcription with later B- and T-cell maturation (Fig. 1b, c). We confirmed these findings with RNA-sequencing (RNA-seq) analysis that showed significant Erg RNA in pre-proB, proB and preB cells (Fig. 1d). This detailed characterisation of Erg expression raised the possibility that Erg has a stage-specific function at early developmental stages of the lymphoid lineages.
To determine whether Erg had a role in lymphoid development, mice carrying floxed Erg alleles (Erg fl/fl , Fig. 1a) were interbred with Rag1Cre transgenic mice that efficiently delete floxed alleles in CLPs and T- and B-committed progenitor cells 22 , but have normal lymphoid development (Supplementary Fig. 2a). The resulting Rag1Cre T/+ ;Erg Δ/Δ mice specifically lack Erg throughout lymphopoiesis (Fig. 1e, Supplementary Fig. 2b). While numbers of red blood cells, platelets and other white cells were normal, Rag1Cre T/+ ;Erg Δ/Δ mice displayed a deficit in circulating lymphocytes (Supplementary Table 2). This was due to a specific absence of B cells; the numbers of circulating T cells and thymic progenitors were not decreased (Fig. 1f, Supplementary Fig. 2c).
B cells are produced from bone marrow progenitor cells that progress through regulated developmental stages. B-lymphoid development was markedly compromised in Rag1Cre T/+ ;Erg Δ/Δ mice, with proB, preB, immature B and mature recirculating B cells (Hardy fractions C-F, defined in Supplementary Table 1) markedly reduced in number or virtually absent (Fig. 1f). A B-lymphoid developmental block was clearly evident at the pre-proB (Hardy fraction A-to-B) stage, with excess numbers of these cells present in the bone marrow.
Erg deficiency perturbs V H -to-DJ H recombination. To further characterise the developmental B lineage block in Rag1Cre T/+ ;Erg Δ/Δ mice, B220 + bone marrow cells were examined for Igh somatic recombination. Unlike cells from control Erg fl/fl mice, B220 + cells from Rag1Cre T/+ ;Erg Δ/Δ mice had not undergone significant V H -to-DJ H immunoglobulin heavy chain gene rearrangement, although D H -to-J H recombination was relatively preserved (Fig. 2a).
We next investigated the abnormalities underlying Igh recombination in greater detail. We first undertook fluorescence in situ hybridisation (FISH) at the Igh locus to measure the intra-chromosomal distance between distal V H J558 and proximal V H 7183 V H family genes, as cell stage-specific contraction of the Igh locus is essential for efficient V(D)J recombination 23 . This revealed that pre-proB cells from Rag1Cre T/+ ;Erg Δ/Δ mice had reduced locus contraction compared with Erg fl/fl controls (Fig. 2b). To assess whether other structural perturbations across the Igh locus were also present, high throughput chromatin conformation capture (in situ Hi-C) was performed. Differential analysis of the data revealed a reduction of long-range interactions across the Igh locus in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells when compared with Erg fl/fl and C57BL/6 controls (Fig. 2c). As these findings were also observed in Pax5 deficient cells 13,23 reflecting a direct role for Pax5 in co-ordinating the structure of the Igh locus 14 , we mapped Erg binding sites across the Igh locus by ChIP-seq. Unlike well-defined Pax5 binding to Pax5- and CTCF-associated intergenic regions (PAIR domains) 14,16 , Erg binding to V H families was not identified across the locus (Fig. 2c, Supplementary Fig. 3a). Thus, a structural role for Erg in maintaining the multiple long-range interactions and V H -to-DJ H recombination in normal cells is unlikely and cannot account for the absence of these in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells. Analysis of Igh locus accessibility by ATAC-seq did not reveal any significant difference between Erg-deficient pre-proB, proB and preB cells and control cells (Supplementary Fig. 3a), suggesting that loss of locus accessibility, whether by chromatin regulation 24 or by peripheral nuclear positioning with lamina-associated domain silencing 25 , could not adequately explain the reduced Igh locus contraction, reduction of long-range interactions, and loss of V H -to-DJ H recombination in the absence of Erg.
NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16828-y ARTICLE
The iEμ enhancer is an activating element located in the intronic region between the Igh joining region (J H ) and constant region (Cμ) implicated in efficient V H -to-DJ H recombination and Igh chain transcription 26 . It is proposed to nucleate a three-loop domain at the 3′ end of Igh interacting with the V H region to juxtapose 5′ and 3′ ends of the heavy chain locus 27 . Erg and its closest related ETS family member, Fli1, were shown to bind to the μA element and transactivate iEμ co-operatively with a bHLH transcription factor in vitro 28 . We therefore sought to determine whether the lack of Erg, and Erg binding in particular to the μA site of iEμ, could account for the loss of V H -to-DJ H recombination observed in Rag1Cre T/+ ;Erg Δ/Δ mice in vivo. While ChIP-PCR demonstrated Erg binding to the iEμ enhancer containing the μA element (Supplementary Fig. 3b), mice in which the μA region was deleted (μA Δ/Δ ) had preserved numbers of circulating mature B cells compared with cEμ Δ/+ controls (Fig. 2d) and intact V H -to-DJ H recombination (Supplementary Fig. 3c). This was in contrast to cEμ Δ/Δ mice, in which a core 220 bp element of iEμ was deleted and a marked reduction of circulating mature IgM + IgD + B cells was evident in peripheral blood, in keeping with previous models 29 (Fig. 2d). Importantly, ChIP-seq did not demonstrate Erg binding to other iEμ enhancer regions in μA Δ/Δ proB cells (Supplementary Fig. 3d). Together these data show that while Erg can bind to the μA region of the iEμ in vivo, deletion of this region did not result in significant perturbation of B lymphoid development. It is therefore unlikely that Erg binding to the μA element of iEμ could account for the loss of V H -to-DJ H recombination in particular, or the Rag1Cre T/+ ;Erg Δ/Δ phenotype in general.
Rearranged IgH allele permits Erg-deficient B lymphopoiesis. Given the loss of V H -to-DJ H recombination associated with structural perturbation of the Igh locus in Erg-deficient pre-proB cells, we sought to complement the loss of formation of a functional Igh μ transcript and in doing so, determine whether failure to form a pre-BCR complex was a principal reason for the developmental block in Rag1Cre T/+ ;Erg Δ/Δ mice 30 . Complementation with a functionally rearranged Igh allele in models of defective V H -to-DJ H recombination such as deletion of Rag1, Rag2, or components of DNA-dependent protein kinase (DNA-PK) that mediate V H -to-DJ H recombination, can overcome the pre-BCR developmental block 31-34 .
Erg-deficient pre-proB cells do not express Ebf1 and Pax5. To define the mechanism by which Erg regulates V H -to-DJ H recombination and pre-BCR formation, we undertook gene expression profiling of Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells. Differential gene expression and gene-ontology analysis of differentially expressed genes in Rag1Cre T/+ ;Erg Δ/Δ pre-proB compared with Erg fl/fl pre-proB cells demonstrated deregulated expression of multiple B lymphoid genes (Fig. 4a). These included genes encoding cell surface or adhesion receptors and core components of the pre-BCR complex (CD19, CD22, Igll1, Vpreb1, Vpreb2, CD79a and CD79b), genes required for Igh recombination such as Rag1 and Rag2, components of the non-homologous end-joining repair complex associated with V(D)J recombination (Xrcc6 (Ku70) and Lig4) and, importantly, transcription factors implicated in B-cell development (Ebf1, Pax5, Tcf3, Bach2, Irf4, Myc, Pou2af1, Lef1, Myb) (Fig. 4b).
Ebf1 and Pax5 are critical for B lineage specification 5 and maintenance 36,37 and act co-operatively to regulate a gene network in early B-cell fates 9 . Because loss of Erg reduced expression of several critical B lineage genes previously identified to be controlled by Ebf1 and/or Pax5, for example CD19, Vpreb1, and Igll1 (Fig. 4a), we speculated that Erg may play an important role in regulating the expression of these two essential transcription factors and their targets. To determine if Erg bound Ebf1 and/or Pax5 gene regulatory regions and directly regulated their expression, we undertook ChIP-seq analysis in wild-type proB cells and ATAC-seq to assess locus accessibility at the Ebf1 and Pax5 loci in the absence of Erg in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells and proB and preB cells rescued with the IgH VH10tar knock-in allele. This demonstrated direct Erg binding to the proximal (β) promoter region of Ebf1 38 as well as to the Pax5 promoter and Pax5 lymphoid-specific intron 5 enhancer 12 (Fig. 4c, Supplementary Fig. 4b). Direct Erg binding to these regulatory regions, together with the absence of Ebf1 and Pax5 transcription in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells and the loss of Ebf1 and Pax5 protein in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells by western blot (Fig. 4d), demonstrated that Erg was a direct transcriptional regulator of Ebf1 and Pax5. Importantly, the loss of Ebf1 and Pax5 expression occurred while expression of other known regulators of Ebf1 expression, namely, Foxo1, Spi1, Tcf3 and Ikzf1, was maintained (Supplementary Fig. 4a), and both Ebf1 and Pax5 loci remained accessible by ATAC-seq in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells (Fig. 4c). Reinforcing the observation that Erg, Ebf1 and Pax5 may form a co-ordinated transcriptional network, the Erg promoter region was directly bound by Pax5, and the Erg enhancer region was bound by Pax5 and Ebf1 (Fig. 4c).
To better understand the roles of Erg, Ebf1 and Pax5 in the B-cell lineage trajectory, single-cell RNA-seq of CLP, pre-proB and CD19 + proB and preB populations was examined (GSE114793, Fig. 4e). Consistent with our other analysis (Fig. 1b), an increase in Erg expression in CLPs, pre-proB and proB cells was observed (Fig. 4e) with the identity of proB and preB populations confirmed with analysis of additional B lineage genes (Supplementary Fig. 5). Importantly, Erg expression preceded the expression of Ebf1 and Pax5 in the B lineage trajectory, with Ebf1 and Pax5 expression increasing during the later proB and preB stages. Taken together, these data strongly supported an apical role for Erg in initiating Ebf1 and Pax5 expression during early B-cell development.
An Erg, Ebf1 and Pax5 co-dependent gene regulatory network. As we observed Ebf1 and Pax5 binding to cis-regulatory regions of the Erg locus (Fig. 4c), we determined whether Ebf1 and Pax5 could regulate Erg gene expression in B-cell progenitors by examining a publicly available dataset in which Ebf1 (Ebf1 Δ/Δ ) or Pax5 (Pax5 Δ/Δ ) had been deleted (Fig. 5a). Deletion of either Ebf1 or Pax5 resulted in reduced Erg expression (Fig. 5b), with Ebf1 appearing to be the stronger influence. We next compared gene expression changes in Ebf1 Δ/Δ pre-proB cells and Pax5 Δ/Δ proB cells to those genes regulated by Erg in pre-proB cells. As would be predicted if Erg, Ebf1 and Pax5 were components of a co-dependent gene regulatory network, this analysis showed a highly significant correlation in gene expression changes observed with Ebf1 or Pax5 deletion in pre-proB and proB cells and those observed with Erg deletion in pre-proB cells. This was noted for downregulated genes in Erg, Ebf1 and Pax5 deficient cells in particular (Fig. 5c).
Finally, to confirm that Ebf1 and Pax5 were transcriptional regulators downstream of Erg, transduction of Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells with MSCV-driven constructs for constitutive overexpression of Ebf1 and Pax5 was performed. This experiment demonstrated rescue of B220 expression with Ebf1 or Pax5 overexpression in Erg deficient cells (Fig. 5d). Notably, partial rescue of CD19 expression and V H -to-DJ H recombination was observed with Ebf1 overexpression, while no rescue was observed with Pax5 overexpression (Fig. 5d, e). RNA-seq analysis of Erg-deficient cells transduced with Ebf1 or Pax5 expression vectors demonstrated that Ebf1 overexpression could rescue the expression of several target genes of the transcriptional network including Pax5 itself, genes involved in pre-BCR signalling (such as Vpreb1, Vpreb2, CD79a, CD79b, CD22 and CD19), genes involved in V H -to-DJ H recombination (such as Rag1, Rag2), as well as transcription from the Igh locus (Ighv1-5, Ighv1-7, Ighv1-4). In the absence of Ebf1, Pax5 overexpression alone induced the expression of a much more limited set of these target genes (Fig. 5f). These data therefore suggest that Pax5 lies downstream of Ebf1 and support a model in which Ebf1 facilitates the role of Pax5 in B-cell development 11 . These findings were also in keeping with a hierarchical model of Erg, Ebf1 and Pax5 forming a co-dependent transcriptional network that co-regulates critical target genes required for V H -to-DJ H recombination and pre-BCR signalling.
Erg co-binds common Ebf1 and Pax5 target genes. Because expression of multiple B-cell genes was deregulated in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells, including those to which Ebf1 and Pax5 had been shown to directly bind and regulate, we investigated the possibility that Erg co-bound common target genes to reinforce the Ebf1 and Pax5 gene network using a genome-wide motif analysis of Erg DNA-binding sites in proB cells. As expected, the most highly enriched motif underlying Erg binding was the ETS motif. However, significant enrichment of Ebf1-, E2A-, Pax5- and Foxo1-binding motifs was also identified within 50 bp of Erg-binding sites (Fig. 6a), suggesting that Erg acts co-operatively with other transcription factors to regulate target gene expression in a co-dependent gene network. Analysis of the binding of each of Erg, Ebf1 and Pax5 to regulatory regions of genes that were differentially expressed in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells was then undertaken. This analysis identified significant overlap of Erg-, Ebf1- and Pax5-binding sites within 5 kb of the transcriptional start site (TSS) of genes differentially expressed in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells compared with control pre-proB cells (Fig. 6b). Taken together, these data provided further compelling evidence for a gene regulatory network in which Erg is required for initiating and maintaining expression of Ebf1 and Pax5 from the pre-proB cell stage of development, as well as reinforcing expression of target genes within the network by co-operative binding and co-regulation of target genes with Ebf1 and Pax5.
To further delineate the directly regulated target genes in an Erg-dependent Ebf1 and Pax5 transcriptional network, we undertook mapping of ChIP-seq binding of Erg, Ebf1 and Pax5 to Erg-dependent genes at the pre-proB cell stage of development. The majority of these target genes demonstrated direct combinatorial binding of Erg, Ebf1 and/or Pax5 to annotated promoter regions, gene body enhancer/putative enhancer regions or putative distal enhancer regions of these genes (Fig. 6c). Detailed examination of several key target genes for which expression was completely dependent on Erg in pre-proB cells identified direct binding of Erg to the promoter and enhancer regions for several pre-BCR components, including CD19, Igll1, Vpreb1 and CD79a. This occurred with co-ordinate binding of Ebf1 and Pax5 to the regulatory regions of these genes 15 (Fig. 6d). In addition, indirect regulation by Erg at the Rag1/Rag2 locus was also identified, with downregulation of expression of transcription factors that bind and regulate the Rag2 promoter such as Pax5, Lef1 and c-Myb in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells (Fig. 4b) 39 , as well as direct binding of Erg to the conserved B-cell specific Erag enhancer 40 (Supplementary Fig. 4a, b). Importantly, the loss of Rag1 and Rag2 expression in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells occurred while expression of Foxo1, a positive regulator of the locus 41 , was relatively maintained (Supplementary Fig. 4a).
An Erg-Ebf1-Pax5 mediated gene regulatory network was then mapped using each target gene, expression of which was perturbed in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells, and that was directly bound by Erg, Ebf1 and/or Pax5 at promoter, proximal or distal gene regions, to provide a comprehensive representation of this gene network (Fig. 6e). This highlights the interdependent roles of these transcription factors in multiple cellular processes required for B lymphopoiesis.
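The peak-to-gene annotation behind such a network (see Methods, Gene network analysis) can be illustrated with a minimal Python sketch. It is not the code used in this study. The function names, the toy coordinates, the single-chromosome simplification and the order in which the rules are applied (TSS overlap first, then gene body, then distance to the TSS) are all assumptions made for illustration; only the 10 kb window, the 3 kb threshold and the four category labels come from the Methods.

```python
from typing import Dict, List, Optional, Tuple


def classify_peak(peak_start: int, peak_end: int, tss: int,
                  gene_start: int, gene_end: int,
                  window: int = 10_000, promoter_flank: int = 3_000) -> Optional[str]:
    """Label one ChIP-seq peak relative to one gene (all on the same chromosome).

    Assumed precedence: TSS overlap, then gene body, then distance to the TSS.
    Returns None when the peak lies further than `window` bp from the TSS.
    """
    if peak_start <= tss <= peak_end:
        return "promoter"
    # Distance from the nearest peak edge to the TSS.
    distance = min(abs(peak_start - tss), abs(peak_end - tss))
    if distance > window:
        return None
    if gene_start <= peak_start and peak_end <= gene_end:
        return "proximal"
    if distance < promoter_flank:
        return "putative promoter"
    return "putative distal"


def build_edges(peaks_by_factor: Dict[str, List[Tuple[int, int]]],
                genes: Dict[str, Tuple[int, int, int]]) -> List[Tuple[str, str, str]]:
    """Return (factor, gene, label) edges for every peak that can be assigned.

    `genes` maps a gene name to (tss, gene_start, gene_end); in a real analysis
    only differentially expressed genes would be passed in.
    """
    edges = []
    for factor, peaks in peaks_by_factor.items():
        for gene, (tss, gene_start, gene_end) in genes.items():
            for peak_start, peak_end in peaks:
                label = classify_peak(peak_start, peak_end, tss, gene_start, gene_end)
                if label is not None:
                    edges.append((factor, gene, label))
    return edges
```

Each returned tuple corresponds to one directed edge from a transcription factor to a target gene, annotated with the class of regulatory region that is bound.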
An important observation arising from our data was that the B-lymphoid developmental block arising in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells could be overcome with the provision of a rearranged functional Igh VH10tar allele. This suggested that once the pre-BCR checkpoint was bypassed, Erg was no longer critical for further B-cell development and function, including V L J L recombination of the Igl and BCR formation (Fig. 3c, d). Indeed, beyond the pre-BCR checkpoint, re-emergence of Ebf1 and Pax5 expression occurred (Fig. 4c) as well as expression of target genes of the Ebf1 and Pax5 network (Fig. 6d, Supplementary Fig. 4a) in Erg-deficient Rag1Cre T/+ ;Erg Δ/Δ ; IgH VH10tar/+ proB and preB cells rescued with a VH10tar allele. This was in keeping with the expression pattern of Erg in the B lineage trajectory (Figs. 1b-d and 4e) and defines the role of Erg as an exquisitely stage-specific regulator of early B-cell development.

Discussion
In this study we explored the role of the transcription factor Erg in B lymphopoiesis. Our studies suggest that Erg expression from the CLP stage of development initiates a transcriptional network comprised of Erg, Ebf1 and Pax5 in pre-proB and proB cells to regulate V H -to-DJ H Igh recombination and pre-BCR signaling (Fig. 1b, Fig. 4e).
This important role for Erg in B-cell development was demonstrated in mice in which Erg had been deleted throughout lymphopoiesis, which exhibited a developmental block at the pre-proB cell stage that was associated with profound defects in V H -to-DJ H recombination, Igh locus organisation and transcriptional changes in multiple B-cell genes, including loss of expression of Ebf1 and Pax5. Combining RNA-seq, ChIP-seq and gene complementation studies, we were able to define a co-dependent transcriptional network between Erg, Ebf1 and Pax5, with direct Erg binding to the proximal (β) Ebf1 promoter, to which Pax5, Ets1 and Pu.1 also co-operatively bind 38 , as well as Erg binding to the Pax5 promoter and potent intron 5 enhancer region, two critical regulatory elements required for correct transcriptional initiation of Pax5 in early B-cell development 12 . These data support a model (Fig. 6f) in which increased Erg expression from CLPs is required to initiate and maintain Ebf1 and Pax5 expression in pre-proB cells and proB cells, to establish an inter-dependent B-lymphoid gene regulatory network.
Together Erg, Ebf1 and Pax5 directly co-regulated the expression of multiple genes that had previously been identified as direct transcriptional targets of Ebf1 and Pax5 (Fig. 6c-e). Direct Erg binding to promoters of the pre-BCR signalling complex genes such as Igll1, VpreB and CD79a establishes Erg as a transcriptional regulator of target genes in this network. In addition to Rag1 and Rag2, we also identified network regulation of expression of Xrcc6, the gene encoding the Ku70 subunit of DNA-dependent protein kinase holoenzyme (DNA-PK) that binds DNA double strand breaks during V(D)J recombination 42 , and Lig4, encoding the XRCC4-associated DNA-ligase that is required for DNA-end joining during V(D)J recombination 43 (Supplementary Fig. 4a, b). Along with direct Erg promotion of expression of Pax5, a structural regulator of the Igh locus, these findings are sufficient to explain the Rag1Cre T/+ ;Erg Δ/Δ phenotype in which V H -to-DJ H recombination was lost. Together with loss of expression of components of the pre-BCR complex, we can conclude that B-cell development was blocked as a consequence of Erg deletion due to the collapse of the Erg-mediated transcriptional network.
Fig. 4 Gene expression in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells and Erg DNA binding. a Differentially expressed genes in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells compared with Erg fl/fl controls, manually curated according to function based on GO term analysis (see Supplementary Data 1) with the number of genes for each functional category shown by the horizontal axis and selected genes highlighted in boxes. b Differential expression of transcription factors 77 between Erg fl/fl and Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells, ordered by logFC, with selected B lineage transcription factors highlighted in red. c RNA-seq for Ebf1, Pax5 and Erg loci, with ChIP-seq for Erg binding in C57BL/6 proB cells and thymic Rag1Cre T/+ ;Erg Δ/Δ Erg knockout cells (Erg KO) to control for sites of non-Erg ChIP binding to DNA (see also Supplementary Fig. 4b). Ebf1, Pax5, H3K4me3 promoter mark, H3K27ac promoter and enhancer mark in proB cells by ChIP-seq, and ATAC-seq in Erg fl/fl pre-proB cells (pre-proB), Rag1Cre T/+ ;Erg Δ/Δ pre-proB (Erg KO pre-proB), and Erg-deficient proB (Rescue proB) and preB (Rescue preB) cells in Rag1Cre T/+ ;Erg Δ/Δ ;IgH VH10tar/+ mice that develop with a functionally rearranged Igh allele as shown. The asterisk (*) indicates Erg binding to the promoter region (shaded blue) of Ebf1 and Pax5. Erg binding to intragenic enhancer regions of Pax5 (shaded pink) with intron number as indicated. The delta (Δ) indicates Pax5 binding to Erg promoter region. Ebf1 and Pax5 binding to Erg intragenic enhancer regions (shaded pink) with intron number as indicated (see also Supplementary Fig. 4b). d Western blot for Erg, Ebf1, Pax5 and β-actin in Rag1Cre T/+ , C57BL/6 and Erg fl/fl proB and Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells (two samples of each genotype are shown). Representative of two independent experiments. e Single-cell RNA-seq analysis (3297 cells, GSE114793) showing t-distributed stochastic neighbour embedding (tSNE) plots of CLP, pre-proB and CD19 + proB and preB populations demonstrating the imputed expression of Erg, Ebf1 and Pax5 in single cells in the B lineage trajectory (see also Supplementary Fig. 5). Source data are provided in the Source data file.
Importantly, re-emergence of Ebf1 and Pax5 expression beyond the pre-BCR checkpoint in Igh-rescued Rag1Cre T/+ ;Erg Δ/Δ ;IgH VH10tar/+ proB and preB cells was observed, along with expression of target genes of Ebf1 and Pax5. This demonstrates that Erg is a stage-specific regulator of early B-cell development, with emergence of an Erg-independent Ebf1 and Pax5 gene network during the later stages of B-cell development, once clones have transitioned through the pre-BCR checkpoint. This would allow IgL chain V L to J L recombination and BCR formation to proceed in preB cells in which endogenous Erg expression is also reduced (Figs. 1b, c and 4e). The transcriptional regulators of Ebf1 and Pax5 expression during these later stages of B-cell development remain to be defined.
Erg, however, is critical for initiating and maintaining Ebf1 and Pax5 expression in pre-proB and proB cells (Fig. 4e), orchestrating a transcriptional network required for early B-cell development. In this role, Erg not only co-ordinates the transcriptional functions of Ebf1 and Pax5, but also directly binds and activates critical target genes required for transition through the pre-BCR checkpoint.

Methods
Mice. Mice carrying the Erg tm1a(KOMP)wtsi knock-first reporter allele 44 (Erg KI , KOMP Knockout Mouse Project) were generated by gene targeting in ES cells. Mice with a conditional Erg knockout allele (Erg fl ) from which the IRES-LacZ cassette was excised were generated by interbreeding Erg KI mice with Flpe transgenic mice 45 . Rag1Cre mice 46 , in which Cre recombinase is expressed during lymphopoiesis from the CLP stage 22 , were interbred with Erg fl mice to generate mice lacking Erg in lymphopoiesis (Rag1Cre T/+ ;Erg Δ/Δ ) and Rag1Cre +/+ ;Erg fl/fl (Erg fl/fl ) controls. Mice carrying the rearranged immunoglobulin heavy chain IgH VH10tar allele 47 were a gift from Professor Robert Brink. The cEμ Δ/Δ and μA Δ/Δ mice were generated by the MAGEC laboratory (Walter and Eliza Hall Institute of Medical Research) 48 on a C57BL/6J background. To generate cEμ Δ mice, 20 ng/μl of Cas9 mRNA, 10 ng/μl of sgRNA (GTTGAGGATTCAGCCGAAAC and ATGTTGAGTTGGAGTCAAGA) and 40 ng/μl of oligo donor [...] Twenty-four hours later, two-cell stage embryos were transferred into the oviducts of pseudo-pregnant female mice. Viable offspring were genotyped by next-generation sequencing. Non-commercial unique materials are subject to Materials Transfer Agreements. Mice were co-housed in a barrier facility and analysed from 6 to 18 weeks of age. Male and female mice were used. The primers and PCR conditions used for genotyping are provided in Supplementary Table 3. This study was performed in accordance with the Australian Code for the Care and Use of Animals for Scientific Purposes, published by the Australian National Health and Medical Research Council. Euthanasia was performed by CO 2 induction or cervical dislocation. Experimental procedures were approved by the Walter and Eliza Hall Institute of Medical Research Animal Ethics Committee.
Primary cell culture. B-cell progenitors were obtained from bone marrow that was lineage depleted using biotinylated Ter119, Mac1, Gr1, CD3, CD4, and CD8 antibodies, anti-biotin microbeads and LS columns (Miltenyi Biotec) and cultured on OP9 stromal cells in Iscove's Modified Dulbecco's Medium (Gibco, Invitrogen) supplemented with 10% (v/v) foetal calf serum (Gibco, Invitrogen), 50 μM β-mercaptoethanol as well as murine interleukin-7 (10 ng/mL) at 37°C in 10% CO 2 for 7 days. Splenic B cells were purified by negative selection using a B-cell isolation kit (Miltenyi Biotec) 49 and purity was confirmed by flow cytometry prior to labelling with Cell Trace Violet (CTV; Life Technologies) as per the manufacturer's instructions. Labelled cells were seeded at 5 × 10 4 cells per well and cultured for 90 h.
Haematology. Blood was collected into tubes containing EDTA (Sarstedt) and analysed on an Advia 2120 analyser (Bayer).
… See Supplementary Table 4 for antibody dilutions and catalogue numbers for commercial antibodies. FACS-Gal analysis was performed using warm hypotonic loading of fluorescein di β-D-galactopyranoside (Molecular Probes) on single cells 50 followed by immunophenotyping using relevant surface antigens as defined in Supplementary Table 1. Cells were analysed using an LSR II or FACS Canto flow cytometer (Becton Dickinson) or sorted using a FACSAria II (Becton Dickinson) flow cytometer after antibody staining and lineage selection or depletion using anti-biotin beads and LS columns (Miltenyi Biotec). Data were analysed using FlowJo software (Version 8.8.7, Tree Star).
Analysis of publicly available RNA-seq datasets. FASTQ files containing RNA-seq profiles of pre-proB cells from Ebf1 Δ/Δ mice (GSM2879293, GSM2879294, GSM2879295), pro-B cells from Pax5 Δ/Δ mice (GSM2879296, GSM2879297, GSM2879298) and control populations from wild-type mice (GSM2879299, GSM2879300, GSM2879301) were obtained. Reads were aligned to the mm10 genome using Rsubread's align function and read counts were summarised at the gene level as for the primary samples (see Supplementary Methods) 52 . Genes were filtered from downstream analysis using edgeR's filterByExpr function and library sizes were TMM normalised. Counts were transformed to log2-CPM and the mean-variance relationship was estimated using the voom function in limma 53 . Heatmaps were generated using the heatmap.2 function in gplots. Genes were tested for differential expression using linear modelling in limma 3.38.2 54 . Gene set testing was performed using camera 55 and barcode plots were generated with limma.
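The log2-CPM transformation mentioned above can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis was run in R with limma) and assumes the offsets that voom applies by default: 0.5 added to each count and 1 added to each library size. The example counts and library sizes are hypothetical.

```python
import math

def log2_cpm(counts, lib_sizes):
    """Return log2 counts-per-million for a genes x samples count table.

    counts    : list of rows (one per gene), each a list of per-sample counts
    lib_sizes : effective library size of each sample (assumed to be already
                scaled by the TMM normalisation factors)
    """
    return [
        [math.log2((c + 0.5) / (n + 1.0) * 1e6) for c, n in zip(row, lib_sizes)]
        for row in counts
    ]

# Hypothetical toy data: two genes, two samples.
counts = [[10, 20], [0, 5]]
lib_sizes = [999_999, 1_999_999]
logcpm = log2_cpm(counts, lib_sizes)
```

The small offsets keep the transformation defined for zero counts, so a gene with no reads in a sample receives a finite, low log2-CPM value rather than minus infinity.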
Gene network analysis. All Ebf1-, Pax5- and Erg-ChIP-seq peaks mapping to differentially expressed genes in Rag1Cre T/+ ;Erg Δ/Δ pre-proB cells within 10 kb of the TSS were identified. Peaks inside the gene body were annotated as "proximal targets", peaks overlapping the TSS were labelled as promoter regulated targets, peaks less than 3 kb upstream or downstream of the TSS were labelled as putative promoter regulated targets, and peaks more than 3 kb upstream or downstream of the TSS were labelled as putative distal targets. Gene Ontology (GO) annotation of differentially expressed genes was performed and underwent expert manual curation. The network was constructed using … 62 .
Hi-C analysis. For in situ Hi-C analysis 13,65 , primary immune cell libraries were generated in biological duplicates for each genotype. An Illumina NextSeq 500 was used to sequence libraries with 80 bp paired-end reads to produce libraries of sizes between 42 million and 100 million valid read pairs. Each sample was aligned to the mm10 genome using the diffHic package v1.14.0 66 , which utilises cutadapt v0.9.5 67 and bowtie2 v2.2.5 58 for alignment. The resultant BAM file was sorted by read name, the FixMateInformation command from the Picard suite v1.117 (https://broadinstitute.github.io/picard/) was applied, duplicate reads were marked and the file was then re-sorted by name. Read pairs mapping to the same chromosome were classed as dangling ends and removed if they were inward-facing and separated by less than 1000 bp or outward-facing and separated by less than 6000 bp. Read pairs with fragment sizes above 1000 bp were removed. An estimate of the alignment error rate was obtained by comparing the mapping location of the 3′ segment of each chimeric read with that of the 5′ segment of its mate. A mapping error was determined to be present if the two segments were not inward-facing and separated by less than 1000 bp, and around 1-2% were estimated to have errors.
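The distance-based read-pair filter described above can be sketched in Python as follows. This is an illustration only, not the diffHic implementation: the function name, the use of a single coordinate per read and the strand encoding ("+" or "-") are assumptions made for the example, while the 1000 bp and 6000 bp thresholds are taken from the text.

```python
def keep_read_pair(chrom1, pos1, strand1, chrom2, pos2, strand2,
                   min_inward=1000, min_outward=6000):
    """Return True if a read pair passes the distance filter.

    A pair on the same chromosome is discarded when it is inward-facing
    (left read on "+", right read on "-") with a gap below min_inward, or
    outward-facing (left read on "-", right read on "+") with a gap below
    min_outward. Positions are simplified to one coordinate per read.
    """
    if chrom1 != chrom2:
        return True  # inter-chromosomal pairs are not affected by this filter
    # Order the two reads by coordinate so orientation can be judged.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pos1, strand1), (pos2, strand2)]
    )
    gap = right_pos - left_pos
    if left_strand == "+" and right_strand == "-":
        return gap >= min_inward
    if left_strand == "-" and right_strand == "+":
        return gap >= min_outward
    return True  # same-strand pairs are kept in this sketch

# Hypothetical read pairs.
inward_close = keep_read_pair("chr1", 100, "+", "chr1", 600, "-")
outward_close = keep_read_pair("chr1", 100, "-", "chr1", 3100, "+")
inward_far = keep_read_pair("chr1", 100, "+", "chr1", 5000, "-")
```

Short-range inward-facing and outward-facing pairs are the orientations most likely to arise from unligated or self-ligated fragments, which is why they receive separate thresholds.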
Differential interactions (DIs) between the three different groups were detected using the diffHic package 66 . Read pairs were counted into 100 kbp bin pairs. Bins were discarded if they were on sex chromosomes, had a count of less than 10, contained blacklisted genomic regions as defined by ENCODE for mm10 68 or were within a centromeric or telomeric region. Filtering of bin pairs was performed using the filterDirect function, where bin pairs were only retained if they had average interaction intensities more than 5-fold higher than the background ligation frequency. The ligation frequency was estimated from the inter-chromosomal bin pairs from a 500 kbp bin-pair count matrix. The counts were normalised between libraries using a loess-based approach. Tests for DIs were performed using the quasi-likelihood (QL) framework 69 of the edgeR package. The design matrix was constructed using a layout that specified the cell group to which each library belonged and the mouse sex. A mean-dependent trend was fitted to the negative binomial dispersions with the estimateDisp function. A generalised linear model (GLM) was fitted to the counts for each bin pair 70 , and the QL dispersion was estimated from the GLM deviance with the glmQLFit function. The QL dispersions were then squeezed towards a second mean-dependent trend, using a robust empirical Bayes strategy 71 . A P value was computed against the null hypothesis for each bin pair using the QL F test. P values were adjusted for multiple testing using the Benjamini-Hochberg method. A DI was defined as a bin pair with a false discovery rate (FDR) below 5%. DIs adjacent in the interaction space were aggregated into clusters using the diClusters function to produce clustered DIs. DIs were merged into a cluster if they overlapped in the interaction space, to a maximum cluster size of 1 Mbp. The significance threshold for each bin pair was defined such that the cluster-level FDR was controlled at 5%.
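As an illustration of the multiple-testing step, the Benjamini-Hochberg adjustment and the 5% FDR cut-off for individual bin pairs can be written as a short Python sketch. The P values below are hypothetical, and the function is a generic implementation of the standard procedure rather than the edgeR or diffHic code used in the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical P values for four bin pairs.
pvals = [0.01, 0.04, 0.03, 0.5]
fdr = benjamini_hochberg(pvals)
# Bin pairs called as differential interactions at FDR below 5%.
di_index = [i for i, q in enumerate(fdr) if q < 0.05]
```

In this toy example only the first bin pair passes the threshold, because the adjustment scales each P value by the number of tests divided by its rank.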
Cluster statistics were computed using the csaw package v1.16.0 72 . Overlaps between unclustered bin pairs and genomic intervals were performed using the InteractionSet package 73 . Plaid plots were constructed using the contact matrices and the plotHic function from the Sushi R package 74 . The colour palette was inferno from the viridis package (https://github.com/sjmgarnier/viridis, accessed 30 March 2018) and the range of colour intensities in each plot was scaled according to the library size of the sample. The plotBedpe function of the Sushi package was used to plot the unclustered DIs as arcs, where the z-score shown on the vertical axis was calculated as -log10(P value). These data have been deposited in the Gene Expression Omnibus database (accession number GSE133246).
Fluorescence in situ hybridisation. Cultured B-cell progenitors were resuspended in hypotonic 0.075 M KCl solution and warmed to 37°C for 20 min. Cells were pelleted and resuspended in 3:1 (vol/vol) methanol:glacial acetic acid fixative. Fixed cells were dropped onto coated Shandon™ polysine slides (ThermoFisher Scientific) and air dried. The cells were hybridised with FISH probes (Creative Bioarray) at 37°C for 16 h beneath a coverslip sealed with Fixogum (Marabu) after denaturation at 73°C for 5 min. Cells were washed at 73°C in 0.4× SSC/0.3% NP-40 for 2 min followed by 2× SSC/0.1% NP-40 for less than 1 min at room temperature, air dried in the dark and coverslipped. Images of nuclei were captured on an inverted Zeiss LSM 880 confocal using a 63×/1.4 NA oil immersion objective. Z-stacks of images were then captured using the lambda scan mode, a 405 and a multi-band pass beam splitter (488/561/633). The following laser lines were used: 405, 488, 561 and 633 nm. Spectral data were captured at 8 nm intervals. In all cases, images were set up with a pixel size of 70 nm and an interval of 150 nm for z-stacks. Single dye controls using the same configuration were captured and spectra imported for spectral unmixing using the Zen software (Zen 2.3, Zeiss Microscopy). Unmixed data were then deconvolved using the batch express tool in Huygens Professional software (Scientific Volume Imaging). Images were analysed using TANGO software 75 after linear deconvolution. Nuclear boundaries were extracted in TANGO using the background nuclear signal in the Aqua channel. A 3D median filter was applied and the 3D image projected with maximum 2D image projection for nuclei detection using the Triangle method for automated thresholding in ImageJ 76 . Binary image holes were filled and a 2D procedure implemented to separate touching nuclei using ImageJ 2D watershed implementation.
The 2D boundaries of the detected nuclei were expanded in 3D and, inside each 3D delimited region, Triangle thresholding was applied to detect the nuclear boundary in the 3D space. Acquired images from immunofluorescent probes were first filtered using 3D median and 3D top-hat filters to enhance spot-like structures, followed by application of the "spotSegmenter" TANGO plugin, with only the four spots of brightest intensity kept for analysis. The spots identified by TANGO were manually verified against the original immunofluorescent image to identify and record the correct distance computed by TANGO between the aqua and 5-Rox immunofluorescent probes for both Igh alleles within a nucleus.
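For orientation, converting a pair of spot coordinates into a physical separation can be sketched in Python as below. This is not TANGO's implementation: it assumes centre-to-centre Euclidean distance between spot centroids given in voxel indices, and uses the 70 nm pixel size and 150 nm z-interval stated for image acquisition. The function name and example coordinates are hypothetical.

```python
import math

def spot_distance_nm(spot_a, spot_b, xy_nm=70.0, z_nm=150.0):
    """Euclidean distance in nm between two spot centroids.

    spot_a, spot_b : (x, y, z) centroids in voxel units
    xy_nm, z_nm    : voxel dimensions in nm (lateral pixel size, z-step)
    """
    dx = (spot_a[0] - spot_b[0]) * xy_nm
    dy = (spot_a[1] - spot_b[1]) * xy_nm
    dz = (spot_a[2] - spot_b[2]) * z_nm
    return math.sqrt(dx * dx + dy * dy + dz * dz)

# Hypothetical centroids of two probe signals on one allele.
lateral = spot_distance_nm((0, 0, 0), (3, 4, 0))
axial = spot_distance_nm((0, 0, 0), (0, 0, 2))
```

Because the z-step is roughly twice the lateral pixel size, the axial voxel dimension has to be scaled separately, otherwise separations along the optical axis would be underestimated.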
Statistical analysis. Student's unpaired two-tailed t tests were performed using GraphPad Prism (GraphPad Software) unless otherwise specified. Unless otherwise stated, a P value of <0.05 was considered significant.
Details of reagents and software packages used are provided in Supplementary Table 4.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.