Dynamics of the 4D genome during lineage specification, differentiation and maturation in vivo

Mammalian gene expression patterns are controlled by regulatory elements, which interact within Topologically Associating Domains (TADs). The relationship between activation of regulatory elements, formation of structural chromatin interactions and gene expression during development is unclear. We developed Tiled-C, a low-input Chromosome Conformation Capture (3C) approach, to study chromatin architecture at high spatial and temporal resolution through in vivo mouse erythroid differentiation. Integrated analysis of matched chromatin accessibility and single-cell expression data shows that regulatory elements gradually become accessible within pre-existing TADs during early differentiation. This is followed by structural re-organization within the TAD and formation of specific contacts between enhancers and promoters. In contrast to previous reports, our high-resolution data show that these enhancer-promoter interactions are not established prior to gene expression, but formed gradually during differentiation, concomitant with progressive upregulation of gene activity. Together, these results provide new insight into the close, interdependent relationship between chromatin architecture and gene regulation during development.


Introduction
Enhancers are non-coding regulatory elements required for precise control of gene expression during mammalian development. The interaction of active enhancer elements with their target genes is constrained by Topologically Associating Domains (TADs), ~0.1-1 Mb self-interacting regions that are usually delimited by convergent binding sites for the zinc-finger protein CTCF, so-called boundary elements [1][2][3] . However, the relationship between enhancer activation, formation of intra-TAD chromatin interactions and gene activation is not completely understood [4][5][6] . To better understand how genome structure relates to function, it is important to characterize the three-dimensional nuclear architecture of the genome at higher resolution, and to determine how it changes during differentiation and development.
It has been shown that TAD boundaries are generally established early in development and remain relatively stable during differentiation [7][8][9][10] . By contrast, interactions within TADs are extensively restructured in differentiating cells, which involves the formation of specific interactions between enhancers and promoters [11][12][13][14][15] . It has been suggested that this reorganization precedes gene activity and that enhancer-promoter interactions are formed prior to gene expression [16][17][18] . However, due to limitations in temporal and/or spatial resolution in previous studies, it is not known precisely when such interactions are formed during development. The detailed order of events and precise relationship between chromatin architecture and activation of regulatory elements and genes therefore remain unclear. For example, it is possible that enhancers and promoters form limited interactions prior to gene activation due to changes in the general self-interactivity of the TAD during differentiation, but that strong upregulation of gene expression is associated with more specific, subtle changes in conformation that will only be detected in data with sufficient resolution and sensitivity.
To better understand the relationship between chromatin architecture and gene expression, it is crucial to characterize chromatin structures at high resolution in pure, primary cell populations representing relevant developmental stages. This has been hampered by the lack of high-resolution 3C methods that are suitable for the analysis of limited numbers of primary cells. Therefore, we developed Tiled-C, a new 3C-based approach 19 , which can generate high-resolution contact matrices of selected regions of interest from as few as 2,000 cells and thereby allows for the analysis of cell populations that have previously been inaccessible.
We have used Tiled-C to study the chromatin architecture of key erythroid genes through sequential stages of in vivo erythroid differentiation in the mouse, including highly purified hematopoietic stem and progenitor cells. In addition, we have generated matched chromatin accessibility and single-cell expression data. We examined six loci, including the a-globin, Slc25a37, Tal1, Cd47, Cpeb4 and Btg2 genes, and focused our analyses on the a-globin genes, because the regulatory elements in this locus are extremely well-characterized. We find that the TAD encompassing the a-globin genes is already present in hematopoietic stem cells. We also find that the first steps in gene activation occur in early committed erythroid progenitors and involve opening of the a-globin enhancers, which become accessible prior to both chromatin reorganization and activation of a-globin RNA expression. Subsequent chromatin reorganization involves the appearance of smaller self-interacting domains within the larger TAD, in which specific interactions between enhancers and promoters are formed. In contrast to the current literature [16][17][18] , we find that these enhancer-promoter interactions do not precede upregulation of gene activity, but are formed gradually and concomitantly with progressive activation of a-globin expression. Importantly, we find a similar order of events at other erythroid gene loci.
Therefore, our data demonstrate that, at this improved level of resolution, chromatin architecture and gene activation are more tightly linked than previously appreciated. Together, these findings provide new insights into the mechanisms contributing to the establishment of tissue-specific chromatin structures during development.

Tiled-C generates high-resolution contact matrices from small numbers of cells.
We developed Tiled-C, a 3C technique which generates deep, high-resolution contact matrices of genomic regions of interest. Tiled-C maximizes library complexity by employing a single-tube protocol for 3C library preparation, which minimizes losses during the procedure 20 . This is combined with enrichment derived from the efficient Capture-C technology, which allows for up to a million-fold enrichment of restriction fragments of interest 21,22 . While Capture-C targets individual restriction fragments as viewpoints, Tiled-C uses a panel of capture oligonucleotides tiled across all restriction fragments of specified genomic regions to efficiently enrich for contacts within this region. This allows for deep, targeted sequencing of chromatin interactions within regions of interest and thus for the generation of high-resolution contact matrices at unprecedented depth, across multiplexed samples and genomic regions. Tiled-C therefore combines the ability of all vs all methods such as Hi-C 23 to map large-scale chromatin structures including TADs, and the ability of one vs all methods such as 4C 24,25 and Capture-C 21,22 to robustly identify enhancer-promoter interactions within TADs in detail (Supplementary Figure 1,2). To validate the Tiled-C approach, we compared Tiled-C data to the deepest currently available in situ Hi-C datasets (mouse ES cells 9 ; Figure 1a). Tiled-C data at this region were ~28-fold higher in depth and required ~19-fold less sequencing (Supplementary Table 1 Table 2). These methods require millions of cells per sample. In contrast, as Tiled-C is specifically designed to maximize the efficiency of the experimental procedure, its increased sensitivity allows for the generation of reproducible, high-resolution contact matrices from as few as 2,000 cells (Table 3), thereby enabling the analysis of previously intractable primary cell types.
This is critical for the investigation of 4D (3D structure through developmental time) genome organization, as cell numbers become extremely limiting at early stages of development.
We used Tiled-C to study changes in chromatin structure associated with gene activation in primary cells during in vivo mouse erythropoiesis. We initially focused on the a-globin cluster because the a-globin genes and their regulatory elements are extremely well-characterized. The mouse a-globin cluster comprises the duplicated adult a-globin genes Hba-1 and Hba-2, as well as the embryonic gene Hba-x, and two genes of unknown function Hbq-1 and Hbq-2. These genes are located in a TAD, which also contains five additional genes upstream of the a-globin cluster: Nprl3, Mpg, Rhbdf1, Snrnp25 and Il9r. The a-globin genes are regulated by five erythroid-specific enhancer elements (R1-R4 and Rm), which classify as a super-enhancer 29 . In terminally differentiating erythroblasts these enhancers interact with the gene promoters in a sub-TAD flanked by multiple CTCF-binding elements, which are predominantly in a convergent orientation 30,31 (Supplementary Figure 1).

Isolation of primary cell populations from sequential stages of in vivo erythropoiesis.
Using fluorescence-activated cell sorting (FACS), we isolated cells at sequential stages of erythroid differentiation directly from mouse fetal livers (Figure 3a,b). This allowed us to analyze highly purified primary cells through in vivo erythropoiesis. The S0-low cell population consists of early progenitors, predominantly Burst-Forming Unit-Erythroid (BFU-E) cells. S0-medium consists primarily of early Colony-Forming Unit-Erythroid (CFU-E) cells, while S1 and S2 contain the last CFU-E cell division before terminal differentiation 32,33 . S3 through S5 consist of terminally differentiating erythroblasts in progressively more mature states. Because erythroid cells enucleate in the final stages of differentiation, we have focused our analyses on stages S0 through to S3. In vitro, differentiation from S0 cells to S1 cells takes about 10 hours, and differentiation to S3 cells takes an additional 10 hours.

seq.
We used Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq) 34 , a variant of single-cell RNA-seq, to characterize gene expression in the isolated cell populations, generating the first dataset to include the full course of in vivo erythroid differentiation through to terminal differentiation in the mouse (Supplementary Figure 5). This dataset shows that a-globin is expressed at basal levels in the S0 populations. Expression of a-globin dramatically increases during the S2 stage and plateaus at S3; however, the earliest cells showing elevated expression of a-globin are found in the S1 stage (Figure 3c). To validate that erythroid-specific a-globin upregulation begins at S1, we performed RNA-FISH to detect nascent transcription in FACS-sorted primary cells (Figure 3d). We detect a small increase in nascent transcription from S0-low to S0-medium cells and confirm a robust increase in expression from S0 to S1 cells (P < 0.005 by paired t-test; Figure 3e).

Upregulation of gene expression is associated with progressive changes in chromatin architecture.
We used the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) 35 to profile chromatin accessibility in these stages. Interestingly, we find that both enhancer and promoter elements are accessible prior to the onset of erythroid-specific gene expression, and that the degree of accessibility gradually increases, concomitant with upregulation of a-globin expression (Figure 4, Supplementary Figure 6). Tiled-C shows that a TAD structure encompassing the a-globin locus is present at the earliest stage (S0-low), prior to the formation of weak enhancer-promoter interactions in the S0-medium stage (Figure 4, Supplementary Figures 7,8). These enhancer-promoter interactions strengthen in the subsequent S1 and S2 stages, accompanied by increases in a-globin expression and accessibility. In the S3 stage, where chromatin accessibility and expression reach their maximum levels, we observe a further increase in enhancer-promoter interactions in a sub-compartmentalized chromatin structure similar to that observed in primary erythroblasts derived from mature spleen tissue (Figure 1, Supplementary Figures 1,2). This smaller sub-compartmentalized structure, which forms within the pre-existing TAD, is delimited by convergent CTCF-binding elements that flank the a-globin enhancers and genes (Figure 4, Supplementary Figure 1). We have previously shown that these CTCF-binding elements are functionally important to restrict the interactions of the a-globin enhancers and prevent other genes within the TAD, but outside of the sub-compartmentalized structure, from being upregulated 31,36 . This suggests that this smaller erythroid-specific domain is likely formed by similar CTCF-dependent mechanisms as TADs, although it is smaller in size (~70 kb) and has very high internal interaction frequencies compared to typical TADs.
Since both accessibility and the encompassing TAD structure are present prior to erythroid-specific a-globin expression, we purified early hematopoietic progenitor populations to investigate when in differentiation these features are established. Interestingly, we find that the pre-existing TAD containing the a-globin locus is already present in hematopoietic stem cells, despite four out of five enhancers and both promoters being inaccessible at this stage (Supplementary Figure 9).
To examine whether a similar order of events operates at other gene loci, we examined the chromatin architecture of five additional erythroid gene loci through erythropoiesis: Slc25a37, Tal1, Cd47, Cpeb4 and Btg2. In each of these loci, we find that regulatory elements are accessible prior to gene activation and gradually increase in accessibility as gene expression is upregulated (Supplementary Figures 10-12). We also find that these elements interact at basal levels in a pre-existing TAD structure prior to gene expression. However, specific interactions between the regulatory elements progressively increase during differentiation as erythroid-specific gene activity increases (Figure 5). These results confirm that specific regulatory interactions do not precede gene activation, but are formed gradually and concomitant with upregulation of gene expression.

Discussion
The mouse a-globin cluster has a long history as a model locus for studying gene regulation during differentiation and development 37 . Previous analysis of transcription factor binding at the a-globin locus has shown that lineage commitment and differentiation are driven by sequential appearance of key transcription factors 38 , consistent with the gradual increase in chromatin accessibility at the regulatory elements we describe here. In addition, previous 3C studies of both the a- and b-globin loci in erythroid cell lines have demonstrated that interactions between enhancers and promoters are tissue-specific 39,40 . More recently, the tissue-specific conformation of the a-globin locus has also been described based on super-resolution imaging of two stages of ex vivo erythroid differentiation 41 .
A limitation of the current body of work on chromatin organization during differentiation and development, both at the globin clusters and at other gene loci, is that experiments have been performed at low spatial and temporal resolution and predominantly in vitro. Previous studies have therefore not been able to identify at what point in differentiation tissue-specific interactions between enhancers and promoters are established, nor how the formation of these interactions relates to changes in gene expression. Progress in this area has been limited by a lack of techniques capable of generating high-resolution interaction data from the small numbers of cells available in developmentally relevant primary cell populations.
To overcome this hurdle, we have developed Tiled-C, a new 3C-based approach which can generate high-resolution interaction data from as few as 2,000 cells and thus enables analysis of cell populations that have previously been inaccessible. We have used Tiled-C, in combination with ATAC-seq and single-cell RNA-seq, to study the dynamic chromatin architecture and expression of the a-globin cluster through in vivo erythropoiesis. Our data show that the boundaries of the TAD containing the a-globin cluster are established in hematopoietic stem cells, prior to activation of the regulatory elements and genes within the domain, and maintained during further differentiation. This is consistent with previous reports which have shown that TADs are relatively stable during differentiation and development [7][8][9][10] . In contrast to the current literature [16][17][18] , however, the higher resolution of our data has allowed us to show that sub-compartmentalization of the large TAD into smaller domains and the subtle structural changes that strengthen specific enhancer-promoter interactions both occur gradually during terminal erythroid differentiation, concomitant with progressive upregulation of gene activity (Figure 6). Interestingly, initial chromatin accessibility is detectable at the regulatory elements of the a-globin locus prior to conformational change and gene expression activation, although accessibility also increases gradually during differentiation. In addition to the a-globin cluster, we demonstrate the same order of events across five other erythroid gene loci.
It is likely that the formation of specific sub-domains within TADs early in differentiation facilitates interactions between enhancers and promoters to prime loci for gene activation. It has recently been shown for the globin loci that gene activation is associated with the formation of higher-order hub-like structures, in which multiple enhancers and promoters form simultaneous, specific interactions 30,42 . Our data suggest that these structures may only be formed in the final stages of differentiation, when chromatin accessibility and interactions between enhancers and promoters are strongest, and may be important to achieve maximal gene expression. This is further supported by recent live imaging experiments in Drosophila, in which gene activation only occurred upon the formation of tight associations between enhancers and a gene promoter, and not after induced enhancer-promoter proximity resulting from interactions between insulator elements 43 .
This model implies that there are multiple processes contributing to the formation of specific chromatin structures associated with gene activation. A pre-existing TAD encompassing the a-globin locus is formed prior to and thus independent of activation of the regulatory elements within the domain. This is likely driven by tissue-invariant loop extrusion mediated by cohesin and constitutive CTCF-binding elements 44 . During differentiation, chromatin accessibility increases, and a smaller sub-domain is formed within this TAD. We have previously shown that deletion of the CTCF-binding sites at the base of this sub-domain causes it to expand and leads to aberrant expression of the neighboring genes 31,36 .
This indicates that sub-compartmentalization is dependent on these CTCF-binding sites and implies that its formation is mediated by loop extrusion. Since the CTCF-binding sites are constitutively occupied, erythroid-specific compartmentalization is likely driven by increased rate or processivity of loop extrusion in this region during differentiation. As we have previously observed erythroid-specific accumulation of cohesin at the a-globin enhancers 31 , it is possible that this is mediated by increased cohesin recruitment at the activated regulatory elements. This is further supported by studies showing that cohesin co-localizes with transcription factors across the genome 45,46 .
The initial appearance of chromatin accessibility at a-globin regulatory elements occurs early in differentiation and significantly precedes the onset of specific enhancer-promoter interactions. This indicates that chromatin opening can occur independently of larger scale chromatin reorganization. However, further increases in accessibility do occur alongside the establishment and progressive strengthening of enhancer-promoter interactions, suggesting only a partial decoupling. These observations also suggest that active regulatory elements are not required for the establishment of TADs, consistent with our previous work showing that deletion of the a-globin enhancers has no impact on the formation of the a-globin TAD.
These deletions do affect specific enhancer-promoter interactions 29,41 , reinforcing that regulatory elements do play a role in the formation of tissue-specific chromatin structures, possibly mediated by interactions between the multi-protein complexes bound at these elements.
In conclusion, our dissection of the chromatin architecture of a well-understood gene locus through in vivo erythroid differentiation demonstrates that chromatin architecture and gene activation are tightly linked during development and provides new insights into the distinct mechanisms contributing to the establishment of tissue-specific chromatin structures. Importantly, Tiled-C provides an approach that enables such detailed analysis in cell types that were previously intractable.

Mature erythroid cells
Mature primary Ter 119+ erythroblasts were obtained from spleens of female C57BL/6 mice treated with phenylhydrazine as previously described 22 .

Mouse ES cells
Mouse ES cells were cultured and harvested as previously described 22 .

Erythroid progenitors
Primary erythroid progenitor cells were isolated from fetal livers, which were freshly isolated at e12.5-

Replicates
The presented Tiled-C data derived from mature splenic erythroblasts and ES cells represent biological triplicates produced from separate mice or culture flasks, respectively. The presented Tiled-C data derived from hematopoietic and erythroid progenitor populations represent biological duplicates, with the exception of the S1 stage, for which we used a single biological replicate to generate technical duplicates. The presented ATAC data derived from hematopoietic and erythroid progenitor populations represent biological triplicates for the S0-low, S0-medium and S1 populations, biological duplicates for the S2 and S3 populations, and single replicates for the hematopoietic progenitor populations. The presented RNA-FISH data represent biological triplicates except for the brain and no-primary-antibody negative control, which represent biological duplicates.

Ethics
All protocols were approved through the Oxford University Local Ethical Review process and all experimental procedures were performed in accordance with European Union Directive 2010/63/EU and/or the UK Animals (Scientific Procedures) Act, 1986.

Rationale
Tiled-C is a hybrid of the all vs all 3C methods, such as Hi-C 23 , and the one vs all methods, such as 4C 24,25 and Capture-C 21,22 . Tiled-C generates all vs all contact matrices of specified genomic regions and thus combines an unbiased all vs all view with the ability to target regions of interest, without the need to sequence chromatin interactions genome-wide. Tiled-C has similarities to 5C 48 and Targeted Chromatin Capture (T2C) 26 . However, Tiled-C allows for reliable PCR duplicate filtering based on random sonication ends and uses more efficient capture enrichment, and is therefore able to generate data at higher resolution and depth. Tiled-C also has similarities to Capture Hi-C 27 and HYbrid Capture Hi-C (Hi-C²) 28 , which use oligonucleotide capture to enrich Hi-C libraries for regions of interest (Supplementary Table 2). The main differences are that these methods enrich a biotinylated Hi-C library, while Tiled-C enriches a 3C library generated with an optimized procedure to retain maximal library complexity, which is critical for the analysis of small cell numbers. In addition, Tiled-C uses an efficient capture oligonucleotide design targeted directly to the ends of all restriction fragments present in the region of interest and an efficient capture enrichment procedure enabling up to a million-fold enrichment of restriction fragments of interest. The combination of high library complexity and efficient enrichment in Tiled-C enables high-resolution data generation at great depth and makes the method suitable for the analysis of small cell numbers. Moreover, enriching for targeted regions of interest substantially decreases sequencing costs. We should note, however, that the synthesis of large amounts of capture oligonucleotides can also be expensive. We therefore believe that Tiled-C is particularly
We ordered panels of double-stranded capture oligonucleotides from Twist Bioscience (Custom probes for NGS target enrichment). As recommended by Twist, we used 13.67 fmol of each individual oligonucleotide per enrichment reaction.
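The capture design described above, with oligonucleotides targeted to the ends of every restriction fragment in the region of interest, can be illustrated with a minimal sketch. It assumes DpnII digestion (recognition site GATC, as used in the experimental procedure). The 70-nt oligonucleotide length and the function names `fragment_bounds` and `tile_oligos` are hypothetical choices made for illustration and do not come from the published design.

```python
DPNII_SITE = "GATC"  # DpnII recognition sequence


def fragment_bounds(sequence, site=DPNII_SITE):
    """Return (start, end) coordinates of fragments flanked by two sites.

    Each fragment runs from the first base of one site to the last base
    of the next site, so both ends carry the full recognition sequence.
    """
    seq = sequence.upper()
    positions = []
    idx = seq.find(site)
    while idx != -1:
        positions.append(idx)
        idx = seq.find(site, idx + 1)
    return [(a, b + len(site)) for a, b in zip(positions, positions[1:])]


def tile_oligos(sequence, oligo_len=70):
    """Place one oligo at each end of every fragment (assumed length 70 nt).

    Fragments shorter than the oligo length get a single oligo spanning
    the whole fragment.
    """
    oligos = set()
    for start, end in fragment_bounds(sequence):
        if end - start <= oligo_len:
            oligos.add((start, end))
        else:
            oligos.add((start, start + oligo_len))
            oligos.add((end - oligo_len, end))
    return sorted(oligos)


# Hypothetical toy sequence with three GATC sites
toy = "AAGATC" + "T" * 200 + "GATC" + "A" * 20 + "GATCAA"
oligo_coords = tile_oligos(toy)
```

A real design would additionally filter oligonucleotides for repeat content and uniqueness; that step is omitted here.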

Experimental procedure
For samples containing 100,000 cells or fewer, we followed a low-input 3C library preparation protocol 49 . Cells were sorted into 1 ml medium and fixed with 2% formaldehyde for 10 minutes. After
When cell numbers were not limiting, we used aliquots of ~10^7 cells to prepare 3C libraries, following the efficient Capture-C protocol 22 . Cells were fixed with 2% formaldehyde for 10 minutes, after which the reaction was quenched with glycine. The fixed cells were washed in cold PBS and resuspended in cold lysis buffer. After incubation on ice for 20 minutes, the cells were snap frozen. Prior to digestion, the cells were resuspended in restriction buffer, Dounce homogenized on ice, and treated with SDS and Triton X-100. The chromatin was digested with DpnII, using 3 aliquots of 1500 U DpnII restriction enzyme, which were added several hours apart over a total incubation time of 16-24 hours at 37 °C.
The digestion reaction was heat-inactivated and digested chromatin was ligated overnight with 720 U of T4 DNA ligase at 16 °C. The ligated DNA was reverse crosslinked and treated with proteinase K overnight at 65 °C. After RNase treatment, DNA was purified using phenol-chloroform extraction and precipitation with ethanol and sodium acetate at −80 °C. The resulting 3C libraries were resuspended in PCR-grade water. Aliquots of 5-6 μg of 3C library were sonicated to ~200 bp fragments using a Covaris S220 Focused Ultrasonicator (six cycles of 60s; duty cycle: 10%; intensity: 5; cycles per burst: 200). Illumina TruSeq adaptors were added using NEBNext DNA Library Prep reagents according to the manufacturer's protocol. The libraries were indexed and amplified using Agilent Herculase II reagents in a 6-cycle PCR reaction. DNA clean-up steps were performed using AMPure XP beads in a 1.8:1 bead:sample ratio. Where possible, we processed 2 aliquots of each sample in parallel and ligated the DNA with the same index to generate maximum library complexity. 1-1.5 μg of indexed material was used during subsequent capture-based enrichment.
For enrichment using single-stranded oligonucleotides, we used the Nimblegen SeqCap EZ reagents and followed the SeqCap EZ Library SR User's Guide (Chapters 5-7). We multiplexed up to 6 samples per enrichment reaction in a single tube, and multiplied the volumes described in the protocol by the μl PCR tube and heated to 47 °C in a PCR thermocycler. After denaturation, the 3C library mixture was added to the biotinylated oligonucleotides without removing them from the heating block in the thermocycler. The hybridization reaction was incubated in the thermocycler at 47 °C for 64-72 h with a heated lid at 57 °C. After incubation was complete, we enriched for the captured DNA fragments using M270 streptavidin beads and the Nimblegen SeqCap EZ wash buffers, following the procedure described in the manufacturer's protocol. After the washing steps, the beads with captured material were resuspended in 40 μl PCR-grade water. Captured DNA was amplified directly off the beads using the KAPA master mix provided in the SeqCap EZ accessory kit v2, in 2 separate reactions of ~10 cycles, as described in the protocol. AMPure XP beads were used in a 1.8:1 bead:sample ratio to clean up the amplification reaction and DNA was eluted in 30 μl PCR-grade water. To increase enrichment, a second round of oligonucleotide capture was performed following the same procedure, using up to 2 μg of enriched material in a single hybridization reaction (even if multiplexed in the first round) of 20-24 hours.
For enrichment using double-stranded oligonucleotides, we used Twist Bioscience reagents and followed the Twist Custom Panel Protocol (Steps 4-7). To multiplex samples, we used 375-500 ng indexed library per sample, mixed up to 1.5 μg (in exact 1:1 ratio) in a single tube, and used single reaction volumes as described in the protocol. We processed multiple tubes simultaneously if required.
Streptavidin C1 beads were used to enrich the hybridized DNA and the washed material was amplified using 10-12 cycles of PCR. AMPure XP beads were used in a 1.8:1 bead:sample ratio to clean up the amplification reaction and DNA was eluted in 30 μl PCR-grade water. To increase enrichment, a second round of oligonucleotide capture was performed following the same procedure, using up to 1.5 μg of enriched material in a single hybridization reaction (even if multiplexed in the first round) of 20-24 hours.
The enriched Tiled-C libraries were assessed using the Agilent Bioanalyzer or D1000 Tapestation and quantified using KAPA Library Quantification reagents, before sequencing using the Illumina NextSeq platform. In high-quality libraries, sequencing 3-5 million reads per enriched Mb per sample is sufficient for data at 5 kb resolution.
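The sequencing guideline above (3-5 million reads per enriched Mb per sample for 5 kb resolution) translates into a simple depth estimate. The sketch below is only that arithmetic; the function name `reads_required` and the example values (a 1.5 Mb captured region and 6 multiplexed samples) are hypothetical.

```python
def reads_required(enriched_mb, n_samples, reads_per_mb=(3e6, 5e6)):
    """Estimate the total read range for a multiplexed Tiled-C run.

    enriched_mb  : total size of the captured regions in Mb
    n_samples    : number of samples sequenced together
    reads_per_mb : (low, high) reads per enriched Mb per sample
    """
    if enriched_mb <= 0 or n_samples <= 0:
        raise ValueError("enriched_mb and n_samples must be positive")
    low, high = reads_per_mb
    total_mb = enriched_mb * n_samples
    return round(low * total_mb), round(high * total_mb)


# Hypothetical example: 1.5 Mb of tiled regions, 6 samples
low_reads, high_reads = reads_required(1.5, 6)
```

This estimate applies to high-quality libraries; lower-complexity libraries would yield fewer unique contacts for the same depth.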

Analysis
The most straightforward way to analyze Tiled-C data is to use the HiC-Pro pipeline 50 with the options for Capture-Hi-C analysis. We have also adjusted our pipelines for Capture-C analysis to be compatible with Tiled-C data. This pipeline is designed to analyze deep, targeted 3C data and provides very stringent filtering, especially regarding PCR-related artefacts. All data presented in the paper have been analyzed using a combination of this CCseqBasic pipeline (https://github.com/Hughes-Genome-Group/CCseqBasicF/releases), custom scripts and ICE normalization 51 .
When Tiled-C data are compared between different cell types (Figures 1b, 1c, 4
Quantification of enhancer-promoter interactions of interest was performed based on interaction counts in the corresponding bins after normalizing for the total number of counts in the matrix. We used the following coordinates for quantification of regulatory elements of interest (highlighted in Supplementary Figures 10-12
We identified TADs based on insulation indices using TADtool 52 .
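The quantification of enhancer-promoter interactions described above (counts in the bin pair of interest, normalized for the total number of counts in the matrix) can be sketched as follows. This is an illustrative sketch rather than the authors' scripts. It assumes a square contact matrix stored as a list of rows and takes the total over all entries of that matrix. The function names and example coordinates are hypothetical.

```python
def bin_index(position, region_start, bin_size):
    """Map a genomic coordinate to its bin index within the tiled region."""
    if position < region_start:
        raise ValueError("position lies upstream of the tiled region")
    return (position - region_start) // bin_size


def normalized_interaction(matrix, region_start, bin_size, pos_a, pos_b):
    """Contact count between the bins holding pos_a and pos_b,
    divided by the sum of all counts in the matrix."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("matrix contains no contacts")
    i = bin_index(pos_a, region_start, bin_size)
    j = bin_index(pos_b, region_start, bin_size)
    return matrix[i][j] / total


# Hypothetical 3-bin matrix at 2 kb resolution, region starting at 1,000
toy_matrix = [
    [10, 4, 1],
    [4, 8, 5],
    [1, 5, 7],
]
freq = normalized_interaction(toy_matrix, 1000, 2000, 1500, 5500)
```

Because the value is a fraction of all contacts in the matrix, it can be compared between samples sequenced to different depths.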
To examine the reproducibility of Tiled-C in low-input samples, we used HiCRep 53 to calculate stratum-adjusted correlation coefficients, considering a maximum distance of 100,000 bp.
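To make the idea of a stratum-adjusted correlation concrete, the sketch below computes a simplified version. Contacts are grouped by genomic distance (matrix diagonals), a Pearson correlation is computed per stratum, and the strata are combined as a weighted average. It is not the HiCRep implementation: it omits HiCRep's smoothing step and weights each stratum by its size times the product of the raw standard deviations, which is an assumption made for brevity. All names are hypothetical.

```python
from math import sqrt
from statistics import mean


def _moments(x, y):
    """Population covariance and variances of two equal-length lists."""
    mx, my = mean(x), mean(y)
    n = len(x)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    return cov, var_x, var_y


def stratum_adjusted_correlation(m1, m2, bin_size, max_distance=100_000):
    """Weighted average of per-diagonal Pearson correlations (simplified)."""
    n = len(m1)
    max_offset = min(n - 1, max_distance // bin_size)
    numerator = denominator = 0.0
    for k in range(max_offset + 1):
        x = [m1[i][i + k] for i in range(n - k)]
        y = [m2[i][i + k] for i in range(n - k)]
        if len(x) < 2:
            continue  # a correlation needs at least two points
        cov, var_x, var_y = _moments(x, y)
        if var_x == 0 or var_y == 0:
            continue  # a constant stratum carries no information
        # weight * rho = (N * sd_x * sd_y) * cov / (sd_x * sd_y) = N * cov
        numerator += len(x) * cov
        denominator += len(x) * sqrt(var_x * var_y)
    if denominator == 0:
        raise ValueError("no informative strata")
    return numerator / denominator


# Hypothetical replicate matrices at 5 kb resolution
rep1 = [[9, 5, 2, 1], [5, 8, 4, 2], [2, 4, 7, 6], [1, 2, 6, 10]]
rep2 = [[2 * v for v in row] for row in rep1]
scc = stratum_adjusted_correlation(rep1, rep2, bin_size=5000)
```

Stratifying by distance prevents the strong distance decay of contact frequency from dominating the correlation, which is the motivation for this type of metric.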

Hi-C
We compared Tiled-C data in mouse ES cells to the deepest currently available Hi-C data in mouse ES cells 9 . We explored the data using HiGlass 54 and downloaded and re-analyzed the Hi-C data using the HiC-Pro pipeline 50 with default options and ICE normalization 51 .
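For readers unfamiliar with ICE normalization, the sketch below shows the core of iterative correction on a small symmetric matrix: row sums are repeatedly equalized by dividing each entry by the relative coverage of its row and column. This is a simplified illustration in plain Python, not the implementation used by HiC-Pro. The function name, the fixed iteration count and the toy matrix are assumptions.

```python
def iterative_correction(matrix, n_iter=100):
    """Balance a symmetric contact matrix so that all non-empty rows
    have (approximately) equal sums. Returns (balanced, bias)."""
    n = len(matrix)
    m = [[float(v) for v in row] for row in matrix]
    bias = [1.0] * n
    for _ in range(n_iter):
        sums = [sum(row) for row in m]
        covered = [s for s in sums if s > 0]
        if not covered:
            break
        mean_sum = sum(covered) / len(covered)
        # Relative coverage of each bin; empty bins are left untouched
        factors = [s / mean_sum if s > 0 else 1.0 for s in sums]
        for i in range(n):
            for j in range(n):
                m[i][j] /= factors[i] * factors[j]
        bias = [b * f for b, f in zip(bias, factors)]
    return m, bias


# Hypothetical raw matrix: distance decay multiplied by per-bin biases
true_bias = [1.0, 2.0, 0.5, 3.0]
raw = [
    [true_bias[i] * true_bias[j] / (1 + abs(i - j)) for j in range(4)]
    for i in range(4)
]
balanced, estimated_bias = iterative_correction(raw)
row_sums = [sum(row) for row in balanced]
```

After correction, differences in per-bin coverage (for example from restriction site density or capture efficiency) no longer affect the comparison of contact frequencies between bins.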

ATAC-seq
Experimental procedure
For FACS-sorted erythroid progenitors from fetal liver, either 2 or 3 replicates of ~50,000 cells each were processed for each sorted population. ATAC-seq was performed as previously described 35 .
For FACS-sorted hematopoietic stem and progenitor cells from adult bone marrow, 1 replicate of between 5,000 and 20,000 cells was processed for each population. Cells were spun at 500 g for 10 minutes at 4 °C. The supernatant was discarded and cells were resuspended directly in Dig-transposition buffer (25 μl 2x TD Buffer [Illumina], 2.5 μl Tn5 transposase, 0.5 μl 1% digitonin and 22 μl H2O) before incubating at 37 °C for 30 minutes with agitation at 600 rpm. After the transposition step, samples were processed as previously described 35 .

Analysis
Reads were mapped to the mouse mm9 genome and PCR duplicates removed using NGseqBasic 55 .
Technical replicates were merged and peaks called using MACS2 56 . Peaks were merged and the number of reads in each sample overlapping each peak was calculated using BEDTools merge and multicov 57 .
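The peak merging and read counting steps can be pictured with a small pure-Python sketch. It mimics what BEDTools merge and multicov compute on half-open intervals, but it is only an illustration of the logic and not a replacement for those tools. The function names and toy coordinates are hypothetical.

```python
def merge_peaks(peaks):
    """Merge overlapping or directly adjacent (chrom, start, end) intervals."""
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def count_overlaps(peaks, reads):
    """Count, for each peak, the reads overlapping it by at least 1 bp."""
    counts = []
    for chrom, start, end in peaks:
        hits = sum(
            1
            for r_chrom, r_start, r_end in reads
            if r_chrom == chrom and r_start < end and r_end > start
        )
        counts.append(hits)
    return counts


# Hypothetical peaks pooled from several samples, and reads from one sample
pooled = [("chr11", 100, 200), ("chr11", 150, 300),
          ("chr11", 400, 500), ("chr2", 10, 20)]
reads = [("chr11", 120, 170), ("chr11", 290, 340),
         ("chr11", 350, 390), ("chr2", 15, 60)]
union = merge_peaks(pooled)
per_peak = count_overlaps(union, reads)
```

Counting every sample against the same merged peak set gives one count per peak per sample, so accessibility can be compared between stages.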
For visualization, bedgraph files were generated using BEDTools genomecov with a scaling factor of 1e6 / (total number of reads in peaks). All analysis scripts are available at https://github.com/rbeagrie/alpha-tiledc.
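The scaling factor used for visualization, 1e6 / (total number of reads in peaks), can be computed and passed to BEDTools as in the sketch below. The helper names and the BAM file name are hypothetical. The command is only assembled as an argument list and is not executed here.

```python
def scale_factor(reads_in_peaks):
    """Scaling factor of 1e6 divided by the total number of reads in peaks."""
    if reads_in_peaks <= 0:
        raise ValueError("reads_in_peaks must be positive")
    return 1e6 / reads_in_peaks


def genomecov_command(bam_path, reads_in_peaks):
    """Build a bedtools genomecov call that writes a scaled bedgraph."""
    return [
        "bedtools", "genomecov",
        "-ibam", bam_path,
        "-bg",
        "-scale", f"{scale_factor(reads_in_peaks):.6f}",
    ]


# Hypothetical sample with 2.5 million reads falling in peaks
cmd = genomecov_command("S3_rep1.bam", 2_500_000)
```

Scaling by reads in peaks rather than by total mapped reads makes tracks comparable between samples that differ in background signal.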

Single-cell RNA-seq
Experimental procedure
Fetal livers were harvested and pooled from 7 e13.5 C57BL/6 mouse embryos and processed as above.
Cells were first stained with 2.5 μg/ml biotin-conjugated anti-Ter119 (BD 553672) and 2.5 μg/ml
Antibodies were conjugated to streptavidin as previously described 34 and mixed with biotinylated custom oligonucleotides. Cells were processed for single-cell RNA-seq using the 10x Genomics Single Cell 3' v2 kit.
Analysis
cDNA reads were mapped to the mouse mm10 assembly and barcodes assigned to cells using Cell Ranger v2.1.1 (10x Genomics). ADT reads were mapped to cell/antibody barcodes using CITE-seq-Count (https://github.com/Hoohm/CITE-seq-Count). Potential doublet cells were removed using Scrublet 58 . Further analysis was performed using Seurat v2. Low-quality cells with fewer than 300 or more than 5,000 identified genes, or with more than 9% mitochondrial reads, were also removed.
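The cell-level quality filter described above can be written as a simple predicate. The thresholds are taken from the text. The function names and the example cells are hypothetical, and the actual analysis was performed in Seurat rather than with this code.

```python
def passes_qc(n_genes, mito_fraction,
              min_genes=300, max_genes=5000, max_mito=0.09):
    """Keep cells with 300-5,000 detected genes and at most 9% mitochondrial reads."""
    return min_genes <= n_genes <= max_genes and mito_fraction <= max_mito


def filter_cells(cells):
    """cells maps a barcode to (number of detected genes, mitochondrial fraction)."""
    return [
        barcode
        for barcode, (n_genes, mito) in cells.items()
        if passes_qc(n_genes, mito)
    ]


# Hypothetical per-cell metrics
example_cells = {
    "cell_a": (250, 0.02),   # too few genes
    "cell_b": (1200, 0.05),  # passes
    "cell_c": (5200, 0.03),  # too many genes, possible doublet
    "cell_d": (2000, 0.12),  # high mitochondrial fraction
}
kept = filter_cells(example_cells)
```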
Clusters were identified using the "FindClusters" function and UMAP projection was generated using the "RunUMAP" function, both with the first 16 principal components. Seurat clusters were annotated using marker genes and by reference to previously published data 33 . Seurat identified two clusters corresponding to committed erythroid progenitors (CEP) and four clusters corresponding to cells undergoing erythroid terminal differentiation (ETD). Average gene expression for populations matching those obtained by FACS sorting was generated by using the "SubsetData" function to select cells with low levels of cell-surface barcodes corresponding to lineage markers, and appropriate levels of barcodes corresponding to CD71 and Ter119. All analysis scripts are available at https://github.com/rbeagrie/alpha-tiledc.

RNA-FISH
Experimental procedure
Standard RNA-FISH was carried out as previously described 59 . Sorted cells from mouse fetal liver were placed back into culture for 6 hours to allow nascent transcription to be re-established. Samples were hybridized with digoxigenin-labelled oligonucleotide probes directed to a-globin introns (30 ng per slide) and visualized using FITC-conjugated antibodies (primary: sheep anti-DIG FITC [Roche] 1:50, secondary: rabbit anti-sheep FITC [Vector] 1:100). Two negative controls were also included: brain tissue from a male, adult CD1 mouse and Ter119+ (i.e. mature) fetal liver erythroid cells that were probed with secondary antibody but no primary antibody. Magnetically purified but not FACS-purified

Image analysis
Image analysis was blinded by renaming image files from all experiments with random character strings and processing them together. Images were manually examined, and each cell was scored for the presence of active nascent-transcription foci. Analysis scripts are available at https://github.com/rbeagrie/alpha-tiledc.
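The blinding step described above, renaming image files with random character strings, can be sketched as follows. This is an illustrative sketch and not the script in the linked repository. The function name, the 12-character token length and the file names are assumptions. The sketch only builds the renaming key and does not touch any files.

```python
import random
import string
from pathlib import Path


def blind_filenames(paths, name_length=12, seed=None):
    """Return a dict mapping each original file name to a random blinded name.

    The file extension is preserved so images still open normally.
    The returned key must be stored and consulted only after scoring.
    """
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + string.digits
    key = {}
    used = set()
    for path in paths:
        while True:
            token = "".join(rng.choice(alphabet) for _ in range(name_length))
            if token not in used:
                break
        used.add(token)
        key[str(path)] = token + Path(path).suffix
    return key


# Hypothetical image files pooled from several experiments
images = ["S0_low_rep1.tif", "S1_rep1.tif", "brain_control.tif"]
blinding_key = blind_filenames(images, seed=1)
```

Pooling the renamed files from all experiments before scoring means the scorer cannot infer the stage or control status of a cell from its file name or folder.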

Data availability
Tiled-C, ATAC-seq and single-cell RNA-seq data generated in this study have been deposited in the Gene Expression Omnibus (GEO) under accession code GSE137477. All RNA-FISH image files are

Tiled-C contact matrices of 500 kb spanning the mouse a-globin locus in sequential stages of in vivo erythroid differentiation at 2 kb resolution. Contact frequencies represent normalized, unique