Cohesin-mediated loop anchors confine the locations of human replication origins

DNA replication occurs through an intricately regulated series of molecular events and is fundamental for genome stability1,2. At present, it is unknown how the locations of replication origins are determined in the human genome. Here we dissect the role of topologically associating domains (TADs)3–6, subTADs7 and loops8 in the positioning of replication initiation zones (IZs). We stratify TADs and subTADs by the presence of corner-dots indicative of loops and the orientation of CTCF motifs. We find that high-efficiency, early replicating IZs localize to boundaries between adjacent corner-dot TADs anchored by high-density arrays of divergently and convergently oriented CTCF motifs. By contrast, low-efficiency IZs localize to weaker dotless boundaries. Following ablation of cohesin-mediated loop extrusion during G1, high-efficiency IZs become diffuse and delocalized at boundaries with complex CTCF motif orientations. Moreover, G1 knockdown of the cohesin unloading factor WAPL results in gained long-range loops and narrowed localization of IZs at the same boundaries. Finally, targeted deletion or insertion of specific boundaries causes local replication timing shifts consistent with IZ loss or gain, respectively. Our data support a model in which cohesin-mediated loop extrusion and stalling at a subset of genetically encoded TAD and subTAD boundaries is an essential determinant of the locations of replication origins in human S phase. A study shows that the three-dimensional conformation of the human genome influences the positioning of DNA replication initiation zones, highlighting cohesin-mediated loop anchors as essential determinants of their precise location.

The interphase human genome folds into TADs and nested subTADs. TADs were originally defined in first-generation Hi-C and 5C data as megabase (Mb)-scale, self-interacting chromatin segments in which DNA sequences exhibit substantially higher contact frequency within domains than between domains 3-6 . Molecular and computational advances over the past decade have resulted in ultrahigh-resolution genome folding maps with substantially improved signal-to-noise ratios 8-11 . Such technical advances have enabled the discovery of fine-grained A/B compartments 8 , nested subTADs within TADs 7 , punctate dot structures indicative of long-range looping interactions 8 , and stripes indicative of loop extrusion 12-14 . In light of the critical importance of dissecting the link between specific higher-order chromatin architectural features and genome function, a leading challenge is to classify subtypes of TADs/subTADs in Hi-C maps by their fine-grained structural features. Clearly defining structural classes of TADs/subTADs can in turn facilitate the careful dissection of each boundary's molecular composition, organizing principles and unique cause-and-effect relationship across a range of genome functions.
Here we ascertain the functional link between distinct structural classes of TADs/subTADs and DNA replication. Replication initiates from tens of thousands of origins licensed in excess across the human genome in telophase and throughout G1 (refs. 1,2 ). A small proportion of licensed origins subsequently fire in orchestrated temporal waves during S phase 2 . It is established that origins fire at one or more sites chosen stochastically within ≈40 kb regions (IZs) [15][16][17] . Nevertheless, a consensus sequence encoding origin or IZ placement has not been definitively identified in humans. Waves of early and late replication correlate with A and B compartments, respectively, and the temporal transitions from early to late replication can in some cases align with TAD boundaries 3,18,19 . However, the role of fine-scale genome folding patterns during interphase (such as loops, subTADs and TADs detectable in high-resolution Hi-C data) in the genomic placement of initiated origins following entry into S phase is not known.
We recently developed a high-resolution Repli-seq method to identify the placement of IZs across the genome at 50-kb resolution 16 . We first compared the genomic locations of IZs replicating across early, early-mid and late S phase to our high-resolution Hi-C data developed in the 4D Nucleome Consortium from H1 human embryonic stem (ES) cells 11 . We noticed that high-efficiency, early-S-phase IZs colocalize to strongly insulated boundaries demarcated by corner-dot TADs/sub-TADs on one or both sides ( Fig. 1a

By contrast, low-efficiency IZs that fire late in S phase can colocalize with boundaries between TADs/subTADs devoid of corner-dots ( Fig. 1a and Extended Data Figs. 2b and 3). Our qualitative observations suggest that early and late IZs are enriched at genomic locations serving as boundaries of corner-dot and dotless TAD/subTADs, respectively. To quantify the link between TAD/subTAD boundaries and IZ genomic placement, we identified a total of 23,851 chromatin domains genome-wide in Hi-C data for human ES cells using our graph-theory-based method 3DNetMod 20 (Supplementary Methods  and Supplementary Table 1). We also applied statistical methods developed by our laboratory and others to identify dot-like structures representative of bona fide looping interactions 8,21,22 . We identified 16,922 dots genome wide in ensemble Hi-C maps of human ES cells. Such dots represent punctate groups of adjacent pixels with significantly higher contact frequency compared to the surrounding local chromatin domain structure (Fig. 1a, Table 3). We stratified boundaries into three groups, including those that are structurally demarcated by: adjacent corner-dot TADs/ subTADs on both sides (double-dot boundaries, n = 6,318); corner-dot TADs/subTADs on only one side and dotless on the other (single-dot boundaries, n = 2,163); and adjacent dotless TADs/subTADs on both sides (dotless boundaries, n = 1,089) (Supplementary Table 4). By applying a range of parameter stringencies and methods for dot calling, we could modify the proportion of boundaries classified as double-dot, single-dot and dotless, but the colocalization of dot boundaries with IZs was evident regardless of statistical methodology (Supplementary Methods and Extended Data Fig. 4). We combined all double-dot and single-dot boundaries into dot boundaries, as they showed similar IZ localization patterns (Supplementary Table 4).
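The three-way stratification of boundaries described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names and the representation of each boundary as a pair of booleans (whether the flanking TAD/subTAD on each side has a corner-dot) are assumptions made for the example.

```python
from collections import Counter

def dot_status(left_has_corner_dot: bool, right_has_corner_dot: bool) -> str:
    """Label a boundary by how many of its two flanking domains carry a corner-dot."""
    n_dots = int(left_has_corner_dot) + int(right_has_corner_dot)
    return {2: "double-dot", 1: "single-dot", 0: "dotless"}[n_dots]

def is_dot_boundary(status: str) -> bool:
    """Double-dot and single-dot boundaries are pooled into 'dot boundaries'."""
    return status in ("double-dot", "single-dot")

# Hypothetical toy input: (left flank has dot, right flank has dot) per boundary.
toy_boundaries = [(True, True), (True, False), (False, True), (False, False)]
status_counts = Counter(dot_status(left, right) for left, right in toy_boundaries)
n_dot_boundaries = sum(is_dot_boundary(dot_status(l, r)) for l, r in toy_boundaries)
```

In this toy input, three of the four boundaries would be pooled as dot boundaries and one would remain dotless.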
Cohesin is essential for the formation of TADs/subTADs through loop extrusion and stalling against boundaries insulated by the architectural protein CTCF 12,13,[23][24][25] . We reasoned that the density and orientation of CTCF-binding sites might reveal an architectural protein signature at boundaries linked to placement of active origins that fire in S phase. We observed a substantially higher density of co-bound CTCF + cohesin-binding sites at dot boundaries overlapping early IZs compared to those that do not overlap any IZs (Fig. 1b and Supplementary Tables 5 and 6). We also examined sites that bind only cohesin, as they can earmark CTCF-independent enhancer-promoter interactions 7,23 , but we did not see a notable difference in the number of sites that bind only cohesin across dot versus dotless TAD/subTAD boundaries (Fig. 1b). Together, our data indicate that boundaries colocalizing with human early-S-phase IZs exhibit enriched occupancy of motifs co-bound by CTCF and cohesin, but not cohesin alone, thus confirming and substantially expanding on observations in previous reports linking cohesin generally to a small subset of replication origins in Drosophila 26 and humans 27 .
Recent reports have uncovered that convergently oriented CTCF motifs anchor long-range looping interactions formed by cohesin-mediated extrusion 12,14,23,28,29 . We observed that most dot boundaries are marked by two or more CTCF + cohesin-bound motifs arranged in a convergent or divergent orientation (hereafter called complex motif orientation; Fig. 1c), and this molecular signature was further enriched when dot boundaries colocalize with early replicating IZs. By contrast, nearly all dotless boundaries have only one or no CTCF + cohesin-bound motifs (Fig. 1c). Dotless boundaries colocalized with late IZs were most often anchored by one CTCF motif. We therefore establish six boundary classes by stratifying dot (classes 1-3) and dotless (classes 4-6) boundaries into those localized with CTCF + cohesin-bound motifs in a complex orientation (classes 1 and 4), tandem or single-motif orientation (classes 2 and 5), or no bound motifs (classes 3 and 6; Fig. 1d).
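The six boundary classes can be written as a small decision rule. The sketch below assumes each boundary is summarized by a dot flag and a list of strand symbols ('+' or '-') for its CTCF + cohesin co-bound motifs. It also assumes that a 'complex' orientation means at least two motifs with both strands represented. These encodings are hypothetical conveniences, not the authors' implementation.

```python
def motif_orientation(strands):
    """Summarize the motif arrangement at one boundary.

    strands: list of '+' / '-' symbols, one per CTCF + cohesin co-bound motif.
    """
    if not strands:
        return "none"
    if len(strands) >= 2 and len(set(strands)) == 2:
        # Both strands present: a convergent or divergent pair exists.
        return "complex"
    return "tandem_or_single"

def boundary_class(has_dot, strands):
    """Return class 1-3 for dot boundaries and 4-6 for dotless boundaries."""
    offset = {"complex": 0, "tandem_or_single": 1, "none": 2}[motif_orientation(strands)]
    return (1 if has_dot else 4) + offset

# Toy usage: a dot boundary with mixed-strand motifs falls in class 1.
example_class = boundary_class(True, ["+", "-", "+"])
```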
We next formulated a statistical test to quantify IZ enrichment at boundaries compared to the background expectation across autosomes (Supplementary Methods and Supplementary Table 7). Consistent with our qualitative observations, high-efficiency IZs firing in early S phase were significantly enriched at dot boundaries marked by CTCF + cohesin-binding sites in complex orientations compared to a null distribution of random intervals matched by size and A/B compartment distribution (class 1; Fig. 1d,e, Extended Data Fig. 5b-d and Supplementary Methods). By contrast, low-efficiency IZs firing in late S phase were depleted at dot boundaries and significantly enriched at dotless boundaries with tandem + single CTCF + cohesin-bound motifs or no bound motifs (classes 5 and 6; Fig. 1d,e, Extended Data Fig. 5b-d and Supplementary Methods). We note that our null distribution was created with random intervals matched to real IZs by their size and compartment distribution, reinforcing that the enrichment reflects a strong localization at boundaries above the known link between early and late replication and A and B compartments, respectively (Supplementary Methods).
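To make the logic of such a test concrete, the following sketch builds a null distribution from random intervals matched to each IZ by size and compartment label, then computes a one-sided empirical P value for enrichment. It is only an illustration under stated assumptions: the overlap-count statistic, the single-chromosome coordinates, the add-one P value and all function names are choices made for this example, and the actual test is described in the Supplementary Methods.

```python
import random

def overlaps_any(interval, boundaries):
    """True if a half-open interval (start, end) overlaps any boundary interval."""
    start, end = interval
    return any(start < b_end and b_start < end for b_start, b_end in boundaries)

def count_overlaps(intervals, boundaries):
    return sum(overlaps_any(iv, boundaries) for iv in intervals)

def matched_random_interval(iz, compartments, rng):
    """Draw an interval with the same length and compartment label as the IZ."""
    start, end, label = iz
    length = end - start
    candidates = [(s, e) for s, e, lab in compartments if lab == label and e - s >= length]
    if not candidates:
        raise ValueError("no compartment segment can hold this interval")
    # Weight segments by the number of valid start positions they offer.
    weights = [e - s - length + 1 for s, e in candidates]
    seg_start, seg_end = rng.choices(candidates, weights=weights, k=1)[0]
    new_start = rng.randint(seg_start, seg_end - length)
    return (new_start, new_start + length)

def enrichment_p_value(izs, boundaries, compartments, n_perm=1000, seed=0):
    rng = random.Random(seed)
    observed = count_overlaps([(s, e) for s, e, _ in izs], boundaries)
    null_counts = []
    for _ in range(n_perm):
        shuffled = [matched_random_interval(iz, compartments, rng) for iz in izs]
        null_counts.append(count_overlaps(shuffled, boundaries))
    # One-sided empirical P value with the add-one correction.
    p_value = (1 + sum(c >= observed for c in null_counts)) / (n_perm + 1)
    return observed, p_value

# Hypothetical toy data on one chromosome (coordinates in bp).
compartments = [(0, 1_000_000, "A"), (1_000_000, 2_000_000, "B")]
boundaries = [(200_000, 210_000), (600_000, 610_000)]
izs = [(190_000, 230_000, "A"), (590_000, 630_000, "A"), (195_000, 235_000, "A")]
observed, p_value = enrichment_p_value(izs, boundaries, compartments)
```

Because the random intervals are drawn within the same compartment type as the real IZs, any enrichment that survives cannot be attributed solely to the compartment-level association with replication timing.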
We sought to independently verify our observed link between IZs and boundaries with an orthogonal technique for assaying replication origin activity. Small nascent strand sequencing (SNS-seq) identifies approximately 10 origins per 100 kb of the genome and enriches for high-efficiency origins localized in early replicating regions 30 . A previous report using ENCODE (Encyclopedia of DNA Elements) phase I pilot microarray data of 1% of the human genome reported enrichment of the cohesin subunit RAD21 at approximately 300 replication origins 27 . Here, using genome folding features from high-resolution Hi-C data, we find that SNS-seq data from human ES cells 30 exhibits heightened origin enrichment specifically at class 1 dot boundaries (Extended Data Fig. 5e). Thus, through two independent replication mapping techniques, we observe a strong enrichment of high-efficiency, early-S-phase IZs at a subset of genetically encoded corner-dot TAD/ subTAD boundaries. The colocalization of IZs with TAD boundaries generally has been further confirmed recently with super-resolution imaging 31 .
Transcription correlates with origin placement and efficiency 15,17,[32][33][34][35] . To ascertain whether transcription at boundaries could explain our results, we stratified dot boundaries with a complex CTCF orientation (class 1), dotless boundaries with a complex CTCF orientation (class 4) and dotless boundaries with no CTCF occupancy (class 6) into those that also had transcribed genes and those that were devoid of genes or had only inactive genes (Extended Data Fig. 6 and Supplementary Table 8). Boundaries with transcribed genes in the absence of the dot features (Extended Data Fig. 6b) or in the absence of CTCF + cohesin (Extended Data Fig. 6c) did not exhibit precise localization of high-efficiency early IZs. These results are consistent with the literature, as a large proportion of active promoters are not sites of efficient replication initiation, suggesting that further distinguishing features encode human origins 36 . It is also particularly noteworthy that we see enrichment of early IZs at dot boundaries with a complex CTCF motif orientation only when transcribed genes were also present (Extended Data Fig. 6a). Our data suggest that transcription alone is not sufficient to localize high-efficiency early IZs at boundaries. Transcription may cooperate with CTCF and cohesin-based loop extrusion to position high-efficiency IZs replicating in early S phase.
To understand whether cohesin and TAD/subTAD structural integrity are functionally necessary for origin placement in S phase, we examined IZs after global genome folding disruption using wild-type HCT116 cells engineered to degrade the cohesin subunit RAD21 within hours using a degron 23 . Such a system is uniquely suited to test the role of cohesin-mediated extrusion on IZs decoupled from transcription, as only hours of RAD21 degradation results in genome-wide ablation of nearly all loops with minimal short-term effect on transcription 23 . We synchronized HCT116 RAD21-mAID cells in mitosis, degraded RAD21 with auxin throughout G1, and then assessed replication initiation across S phase (Extended Data Fig. 7 and Supplementary Methods). We identified the same dot and dotless TADs/subTADs and boundary classes in Hi-C from wild-type HCT116 (untreated HCT116 RAD21-mAID) cells as in human ES cells ( Fig. 2a and Supplementary Tables 9-14). Consistent with previous reports 23 , our observations show that nearly all dot and dotless boundaries were destroyed following short-term cohesin knockdown in HCT116 cells (Fig. 2b,d and Extended Data Fig. 8). Therefore, although the molecular composition of boundaries influences their structural features of insulation strength and corner-dot presence, most are dependent on cohesin.
Fig. 2 legend (partial): c,e, High-resolution 16-fraction Repli-seq data in wild-type HCT116 (WT; untreated HCT116 RAD21-mAID; c) and HCT116 RAD21-knockdown (KD; auxin-treated HCT116 RAD21-mAID; e) cells. Each row represents a temporal fraction from S phase, with 16 rows/fractions in total. The Repli-seq signal plotted represents an average across all boundaries in a particular class for that fraction (y-axis) in 50-kb bins across a ±750-kb genomic distance centred on the midpoint of the boundaries (x-axis). Sample sizes for each class are shown in a. f, ORM data for wild-type (untreated HCT116 RAD21-mAID; black) and RAD21-knockdown (auxin-treated HCT116 RAD21-mAID; red) cells.
Previous studies have reported that replication timing domains are not globally altered following genome-wide disruption of cohesin-mediated loops [37][38][39] . Analyses in these studies relied on the log ratio of DNA synthesized in the first or second halves of S phase (two-fraction early/late Repli-seq) 40 .

Repli-seq signals were often quantile normalized 37,39 , which obscures the localized disruption in IZ placement and timing shifts at specific TAD/subTAD boundaries. We generated and analysed high-resolution 16-fraction Repli-seq data (Fig. 2c,e and Supplementary Table 15), as well as single-molecule optical replication mapping (ORM) data 17 (Fig. 2f), in both wild-type and cohesin-knockdown HCT116 cells (Extended Data Fig. 7 and Supplementary Methods). As in human ES cells, we observed that 16-fraction Repli-seq data exhibit focal enrichment of high-efficiency/early IZs specifically at dot boundaries marked by CTCF + cohesin co-bound motifs in a complex orientation in wild-type HCT116 cells (class 1; Fig. 2c). Enrichment of early IZs occurs only at boundaries that colocalize with cohesin (Extended Data Fig. 9). Moreover, as in human ES cells, low-efficiency, late IZs were enriched at weak dotless boundaries in wild-type HCT116 cells (Fig. 2c). Using single-molecule ORM data, which can directly assess IZ efficiency as the percentage of molecules that initiate within a particular IZ, we detected enriched origin initiation specifically at class 1 boundaries (Fig. 2f).
Together, our single-molecule and ensemble replication initiation data indicate that early-S-phase IZs fire at a key subset of genetically encoded dot boundaries.
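The boundary-centred averaging of 16-fraction Repli-seq signal (50-kb bins across ±750 kb around each boundary midpoint) can be sketched as a simple pileup. The data layout below, one list of per-bin values per S-phase fraction for a single chromosome, and the decision to skip boundaries whose window runs off the chromosome are assumptions for illustration only.

```python
BIN_SIZE = 50_000   # 50-kb bins
FLANK = 750_000     # +/- 750 kb around each boundary midpoint

def pileup(signal, midpoints, bin_size=BIN_SIZE, flank=FLANK):
    """Average a fractions-by-bins signal over windows centred on boundary midpoints.

    signal: list of rows, one per S-phase fraction; each row holds per-bin values.
    midpoints: boundary midpoints in bp on the same chromosome.
    Returns a fractions-by-window matrix (31 columns for the default settings).
    """
    half = flank // bin_size
    width = 2 * half + 1
    n_bins = len(signal[0])
    sums = [[0.0] * width for _ in signal]
    used = 0
    for mid in midpoints:
        centre = mid // bin_size
        lo, hi = centre - half, centre + half
        if lo < 0 or hi >= n_bins:
            continue  # window extends past the chromosome; skip this boundary
        used += 1
        for f, row in enumerate(signal):
            for k in range(width):
                sums[f][k] += row[lo + k]
    if used == 0:
        raise ValueError("no boundary has a complete window")
    return [[value / used for value in row] for row in sums]

# Toy signal: 16 fractions, 100 bins, with early (fraction 0) signal at two boundaries.
toy_signal = [[0.0] * 100 for _ in range(16)]
toy_signal[0][30] = 1.0
toy_signal[0][60] = 1.0
toy_midpoints = [30 * BIN_SIZE + 25_000, 60 * BIN_SIZE + 25_000]
average = pileup(toy_signal, toy_midpoints)
```

A focal IZ at a boundary class then appears as a narrow peak in the central column of the early fractions, whereas a delocalized signal spreads across neighbouring columns.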
Following ablation of cohesin-mediated boundaries (Fig. 2b,d and Extended Data Fig. 8), we observe severe disruption of high-efficiency early-S-phase IZs specifically at class 1 boundaries, as evidenced by a diffuse and delocalized Repli-seq signal (class 1; Fig. 2c,e). Consistent with our qualitative observations, early wave IZs were less numerous and increased in width specifically at dot boundaries with a complex CTCF motif orientation after loss of cohesin (Extended Data Fig. 10 and Supplementary Table 16). We also noticed that low-efficiency IZs shift to replicating at the end of S phase (fractions 14-16) at dotless boundaries following cohesin knockdown (classes 4-6, Fig. 2c,e and Extended Data Fig. 10). Independently conducted ORM analyses confirmed our observations of IZ disruption by cohesin removal (Fig. 2f). Cell cycle progression and 5-bromodeoxyuridine incorporation were not substantially affected by RAD21 knockdown 39 (Extended Data Fig. 7). Together, our ensemble and single-molecule IZ data demonstrate that disruption of cohesin-mediated loops during G1 alters the genomic placement where origins or clusters of origins fire during early S phase.
On the basis of our observations, we reason that a failure of cohesin to unload, and therefore the creation of new long-range loops due to more cohesin molecules stalled at complex CTCF boundaries in G1 phase, might result in an increased number of high-efficiency origins or a narrowing of their genomic placement in S phase. Recently, it was reported that knockdown of the gene encoding the cohesin unloading factor WAPL results in increased long-range loops 41 . We examined the genomic placement of IZs in S phase with 16-fraction Repli-seq in wild-type HCT116 cells engineered with an improved degron system (AID2) to degrade WAPL throughout G1 phase 42 . First, we created Hi-C libraries in wild-type and WAPL-knockdown HCT116 cells (Fig. 3a,b and Extended Data Fig. 7).
Consistent with published results, our observations show that dots indicative of loops are more numerous, and traverse a longer genomic distance, compared with those in wild-type HCT116 cells (Fig. 3a,b). The gain-of-looping phenotype following WAPL knockdown occurs most strongly at dot boundaries with a complex CTCF motif orientation (class 1; Fig. 3c). At class 1 boundaries, we observe that early IZs become significantly narrower following WAPL knockdown (Fig. 3d,e and Supplementary Table 17). We note that IZs tighten and refine following gain of looping in the WAPL-knockdown condition at the same boundaries where IZs grow more diffuse following cohesin knockdown (Fig. 3a,b and Extended Data Fig. 10). Together, the findings from our gain and loss of structural boundary experiments further support a model in which cohesin-based loop extrusion in interphase deterministically informs the placement of the subset of origins that fire during S phase.
We finally sought to understand whether specific boundaries are necessary and sufficient to regulate IZ firing. We used targeted CRISPR-Cas9 genome editing to delete an 80-kb section of the genome containing a complex array of more than 10 CTCF + cohesin-binding sites with complex motif orientations anchoring a long-range chromatin loop that separates late from early replication timing domains (Fig. 4a). The loop anchor was chosen because it also partially overlaps an early-S-phase IZ, but does not encompass the full IZ, thus allowing us to ablate the loop while keeping much of the IZ intact. We observed a striking local delay of replication timing from early to late following deletion of the 80-kb loop anchor, consistent with the loss of an early IZ (Fig. 4a,c(i)). As a negative control, we deleted a different 30-kb loop anchored by two tandemly oriented CTCF-binding sites within an adjacent late replication timing domain, but not overlapping an IZ (Fig. 4b).
Deletion of this 30-kb loop anchor disrupted the dot boundary but preserved the timing and genomic location of DNA replication (Fig. 4b,c(ii)). The direct overlap of IZs with boundaries precludes our ability to fully decouple them, and overlap of functional elements remains a technical challenge for functional perturbative studies in the genome biology field at large. Nevertheless, our data provide evidence that replication at a specific early IZ can undergo a striking shift to late S phase following ablation of a boundary. These data are consistent with our cohesin-knockdown observations and our model in which boundaries marked by a complex CTCF motif orientation inform the precise placement of high-efficiency IZs.
As the direct overlap of IZs with boundaries is not amenable to clean, single-variable 'loss-of-structure' perturbative experiments, we also examined a 'gain-of-structure' approach in which we assessed whether the introduction of an engineered ectopic boundary was sufficient to induce changes in replication initiation. We mapped replication with two-fraction Repli-seq in published HAP1 cell lines in which we have previously demonstrated a gain in boundary following insertion of an established 2 kb-sized cell-type-invariant boundary element 43 . We observed a striking shift from late to early replication directly at the location of the engineered boundary (Fig. 4d), consistent with the possibility that boundaries can be sufficient for de novo early IZ firing. Together, our data reveal that both global and local gain and loss of structural boundaries can deterministically influence the placement of IZs.
It is well established that the initiation of DNA replication involves two mutually exclusive steps 1,2 . The first step, origin licensing, begins in telophase with the loading of two copies of the mini-chromosome maintenance (MCM2-7) complex 2,44 . MCM2-7 is initially loaded in excess at tens of thousands of sites across the human genome in an inactive form as a double hexamer that encircles double-stranded DNA (yellow double hexamers in Fig. 4e). The second step, origin activation, occurs at the onset of S phase. Origin activation involves mechanisms that both prevent further MCM loading and recruit multiple extra factors to initiate the unwinding of the double helix and DNA synthesis 2,44 . In mammalian systems, a critical mystery remains regarding the mechanisms that govern the selection of a subset of MCM-bound, licensed origins for activation in S phase.
Here we propose a model in which cohesin-mediated loop extrusion and stalling at dot boundaries marked by CTCF + cohesin-binding sites oriented in convergent and divergent directions is required for the positioning of high-efficiency replication origins (Fig. 4e). We propose two possible models to explain the strong localization of high-efficiency IZs to a subset of cohesin-dependent, genetically encoded boundaries: cohesin could directly push licensed MCM double hexamers or other origin activation cofactors along the genome before stalling at high-density arrays of CTCF + cohesin-bound motifs in complex orientations; alternatively, cohesin might pass over many licensed, MCM-bound origins and selectively participate in the activation of those already loaded at boundaries. We also posit that low-efficiency IZs might fire at weaker dotless boundaries later in S phase because cohesin only temporarily pauses during its traversal along the genome, and thus cannot aggregate initiation activity (Fig. 4e). In the cell types from our study, cohesin-mediated loop extrusion is required for IZ placement, and the changes in replication timing are subtle and indirect owing to the altered distance of nearby genomic regions to the nearest initiation site. We note that although we do not see evidence for a dominant role for cohesin on the larger replication timing program, we cannot rule out that cohesin knockdown might have a more profound effect on the replication timing program in other cell types, species and experimental designs.
Previous studies using mass spectrometry and co-immunoprecipitation have reported the direct binding of cohesin to DNA replication factors, such as MCM7, MCM6, MCM4, RFC1 and DNA polymerase α 27,45 . The MCM complex has the ability to slide after loading and can be pushed by polymerase during transcription [46][47][48] . However, the extent and rate at which this occurs on chromatin in the presence of nucleosomes (≈11 nm) is still an open question. The internal diameter of cohesin is 40 nm, whereas the MCM2-7 double hexamer is only 15 nm. The findings of a recent Hi-C and imaging study suggest that, despite their small size, MCM complexes could also serve as boundaries to block cohesin-based loop extrusion 49 . TAD boundaries and loops persist through S phase 50 , but MCMs are removed from chromatin after IZs fire 1,2 . Therefore, we favour a model in which cohesin pushes licensed MCMs in G1, leading to the localization and activation of a key subset of origins at boundaries with a complex CTCF motif orientation in S phase (Fig. 4e). Nevertheless, both proposed models remain exciting areas for future mechanistic dissection.
Understanding the structure-function relationship of the human genome remains a major challenge for human geneticists and chromatin biologists. Here we stratify TADs and subTADs by their structural and molecular features. We conduct global and local perturbative studies to reveal that genetically encoded TAD/subTAD boundaries formed by cohesin-mediated loop extrusion in G1/pre-S functionally inform genome function in the case of the initiation of DNA replication in S phase. Our work sheds light on the question of whether and how the location of fired origins is deterministically encoded in humans by the genome, epigenome and higher-order chromatin folding.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-022-04803-0.

Last updated by author(s): 2/4/2022
Reporting Summary
Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency in reporting. For further information on Nature Research policies, see our Editorial Policies and the Editorial Policy Checklist.

Software and code
Policy information about availability of computer code

Data collection
Juicebox (juicer_tools_0.7.5.jar) was used for extracting Hi-C chromosome matrices from .hic files with conversion to .npz sparse matrices.
The cooler package in Python (cooler 0.8.10) was used for extraction of Hi-C chromosome matrices from .cool files with conversion to .npz sparse matrices.

Data analysis
Domain calls were made with 3DNetMod from https://bitbucket.org/creminslab/cremins_lab_tadsubtad_calling_pipeline_11_6_2021/. Loops in Hi-C data were called with custom Python code based on the HiCCUPS approach from Aiden and colleagues, available at https://bitbucket.org/creminslab/cremins_lab_loop_calling_pipeline_11_6_2021/. bowtie version 0.12.7 and samtools version 1.2 were used for ChIP-seq and CUT&RUN fastq processing. MACS2 2.1.1.20160309 was used for peak calls. bedtools 2.15.0 was used for intersection of peaks with domain boundaries. opencv-python 4.2.0.32 (cv2 4.2.0) was used for APA pileup of domains. The linalg module of NumPy (numpy 1.16.6) was used for eigenvector decomposition and compartment calling in Hi-C matrices. Custom scripts in Python were used to access domain layer hierarchy and domain intersection with compartments and dots, and for subsequent boundary classification as well as the IZ-to-closest-boundary permutation test. For Repli-seq, the BIRCH clustering algorithm was used for IZ determination.
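As a rough illustration of compartment calling by eigenvector decomposition, the sketch below extracts the leading eigenvector of a bin-by-bin correlation matrix and splits bins by the sign of their entries. The source reports using numpy's linalg routines. This example instead swaps in a dependency-free power iteration so that it runs with the standard library alone, and it assumes the correlation matrix has already been computed. The function names are hypothetical. The sign of an eigenvector is arbitrary, so deciding which group is A and which is B requires external information such as gene density.

```python
import math

def leading_eigenvector(matrix, n_iter=200):
    """Approximate the dominant eigenvector of a symmetric matrix by power iteration."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-zero starting vector
    for _ in range(n_iter):
        product = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0:
            raise ValueError("starting vector lies in the null space of the matrix")
        vec = [x / norm for x in product]
    return vec

def compartment_signs(correlation):
    """Return +1 / -1 per bin; bins sharing a sign fall in the same compartment group."""
    return [1 if x >= 0 else -1 for x in leading_eigenvector(correlation)]

# Toy correlation matrix with two anticorrelated blocks of bins.
toy_correlation = [
    [1, 1, -1, -1],
    [1, 1, -1, -1],
    [-1, -1, 1, 1],
    [-1, -1, 1, 1],
]
signs = compartment_signs(toy_correlation)
```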