Topoisomerase IIβ targets DNA crossovers formed between distant homologous sites to induce chromatin opening

Type II DNA topoisomerases (topo II) flip the spatial positions of two DNA duplexes, called G- and T-segments, by a cleavage-passage-resealing mechanism. In living cells, these DNA segments can be derived from distant sites on the same chromosome. Owing to the lack of suitable methodology, however, no direct evidence for this has been described so far. The beta isoform of topo II (topo IIβ) is essential for transcriptional regulation of genes expressed in the final stage of neuronal differentiation. Here we devise a genome-wide mapping technique (eTIP-seq) for topo IIβ target sites that can measure the genomic distance between G- and T-segments. It revealed that the enzyme operates in two distinct modes, termed proximal strand passage (PSP) and distal strand passage (DSP). PSP sites are concentrated around transcription start sites, whereas DSP sites are heavily clustered in a small number of hotspots. While PSP sites represent the conventional topo II targets that remove local torsional stress, DSP sites have not been described previously. Most remarkably, DSP is driven by pairing between homologous sequences or repeats located at a large distance from each other. A model-building approach suggested that topo IIβ acts on crossovers to unknot the intertwined DSP sites, leading to chromatin decondensation.

treated with sarkosyl-CsCl. To obtain experimental evidence for the claim shown in Fig. 1D, we designed a model experiment in which tag-purified topo IIβ was immobilized on magnetic beads and supercoiled plasmid DNA was used as a substrate. Flag-tagged topo IIβ was expressed in the human embryonic kidney cell line HEK293E transfected with the pFlag-top2β plasmid encoding full-length rat topo IIβ, and was purified by immunoprecipitation with Dynabeads Protein G that had been pre-coated with anti-Flag antibody, as described previously 2 . The plasmid harbored an insert originating from an actual topo IIβ target site 3 , which is an AT-rich region containing a Ts2 toposite and a LINE sequence ( No DNA was detected in the unbound fraction (U) (Fig. S2C To test whether the circularity of form Ir DNA is required for its retention, DNA in the S/C-altered complex was treated with Bam HI in situ to separate it into the insert and vector portions, followed by separation into bound and unbound fractions (Fig. S2D, lanes 3 and 4). The agarose gel bands were quantified by densitometry (Fig. S2D, right panel).
Comparison between lanes 1 and 2 suggested that the S/C treatment induced a breakage-resealing cascade in enzyme-bound DNAs: dsb (form III) to ssb (form II), and ssb to no break (form Ir). This is also direct evidence that the religation activity and the dimeric structure of topo IIβ are preserved in the presence of sarkosyl.
Results are visualized schematically in Fig. S2E. Note that enzyme-bound dsb fragments are not detectable after Bam HI treatment because of their heterogeneous size, so the bands in lane 5 are exclusively attributable to the ssb fragments. The insert to vector ratio in lanes 5 and 6 of Fig. S2E suggests that the resealing efficiency of the enzyme bound to the insert portion is significantly higher than that of the enzyme bound to the vector portion. Crossovers between the insert and vector portions are probably rare, because the I/V ratio of high salt-released DNA would be unity in that case. 2) as etoposide is significantly diluted-out once the cells were lysed.
2) The enzyme's resealing activity and its association with the resealed DNA are preserved even in the presence of sarkosyl. The existence of Ts2 toposites in eTIPa-seq depends solely on the reversal of the gap on G-segments. T-segments may also be clamped to the partially denatured enzyme, probably after being transferred through the gap.

Critical factors in eTIP-seq procedure
In conventional immuno-capturing methods to separate the topo II-DNA cleaved complex, sarkosyl in nuclear run-on assays 5,6 . Therefore, topo IIβ may also retain noncovalently bound DNA and a certain level of enzymatic activity in 1% sarkosyl. Since etoposide is diluted away after cell lysis, the residual topo II activity may reseal the breaks in the G-segment to some extent.
In the practical procedure, the cell lysate in 1% sarkosyl was supplemented with 0.5 M CsCl to further remove materials interacting nonspecifically with topo IIβ. In the final step of eTIPa-seq, significant amounts of DNA are eluted in the P2 fraction by treatment with 0.5 M NaCl, implying that the positively charged residues of topo IIβ that interact with the phosphoryl groups of DNA are displaced by Na+ ions but not by Cs+ ions. This is consistent with the fact that the accumulation of alkali metal cations near the dsDNA phosphoryl groups varies inversely with their ionic size 7 . Thus, Na+ is a stronger agent than Cs+ for eluting DNA fragments ionically bound to topo IIβ. The DNA yield in P2 is highly dependent on etoposide (Fig. S1D). This implies that the DNA fragments released by 0.5 M NaCl are not irrelevant contaminants but genuine components of topo IIβ reaction intermediates.
Etoposide-induced dsb mediated by topo IIβ might be repaired by microhomology-mediated end-joining (MMEJ) or by other homologous end-joining mechanisms. During this process, two distally located homologous sites would be somehow brought into proximity and processed into chimeric ligation products, namely DSP chimeras.
However, this is unlikely to occur for a number of reasons. i) The topo IIβ molecule must be removed from the dsb ends before the repair end-joining process can start, and this process requires multiple protein factors. There is little possibility that all these factors remain associated with the topo IIβ-DNA complex after cell lysis and can execute the expected reaction. ii) MMEJ usually operates between short homologous DNA segments in the vicinity of the dsb site 8 , not between segments positioned kilobases apart from each other as in DSP chimeras. iii) The DSP chimera is derived from ligation between two DNA fragments associated with topo IIβ. The ligation occurs between artificial adaptors attached to the ends of randomly sheared DNA, and the adaptor serves as a signature for the isolation of chimeras. iv) MMEJ-mediated ligation would not produce the biased read orientations observed in DSP chimeras (Fig. 4F).

Other genome-wide mapping for correlative analyses
We performed additional genome-wide analyses together with topo IIβ-targeted sites. Most attention was given to the binding sites of hnRNP U/SAF-A/SP120 as determined by ChIP-seq.
While any DNA, with a certain sequence preference, can be a substrate for topo II in vitro, accessibility of the enzyme to DNA in vivo is restricted significantly by bound chromatin proteins. Therefore, FAIRE-seq assays were employed to probe the local chromatin accessibility to evaluate potential topo IIβ targets.
These mapping results were displayed on custom tracks of the UCSC genome browser together with toposites. Using this display, a number of correlative analyses were performed as described in the text.

Analysis of Ts1/PSP sites
The length distribution of Ts1/PSP showed that the median length is around 700-800 bp (Fig. S3A).
The location of Ts1/PSP chimeras in the classified genomic regions was examined to assess their functional relevance (Fig. S3B). The remarkable enrichment of Ts1/PSP in the TSS zone suggests that it is involved in the control of transcriptional initiation. We divided Ts1/PSP into two groups: those associated and those not associated with the TSS zone (Fig. S3C). Analysis of the length distribution showed that TSS-associated Ts1/PSPs are significantly longer than those in the other group (Fig. S3D). For the sequences around the TSS (±4 kb) extracted from rat RefSeq genes, Ts1/PSP and other features were plotted in an aggregation plot (Fig. S3E).
Peak regions covering each genomic position were counted and expressed as relative numbers of RefSeq genes. SP120 sites and CpG islands peak around the TSS, whereas Ts1 toposites form twin peaks located ~1,300 bp apart that flank the other peaks. These results are consistent with the model shown in Fig. S3F. As depicted in the figure, the left peak of the Ts1 toposite very likely corresponds to the cleavage site of topo IIβ in action. The enzyme can also approach the duplex crossover from the other side, which should create the right peak.
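The counting step behind such an aggregation plot can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name `aggregate_profile`, the tuple layouts, and the half-open interval convention are all assumptions made for the example. For every TSS, it marks which strand-aware offsets within the flank are covered by at least one peak region, then reports the fraction of genes covered at each offset.

```python
from collections import defaultdict

def aggregate_profile(tss_sites, peaks, flank=4000):
    """Fraction of genes with a peak region covering each offset from the TSS.

    tss_sites: list of (chrom, tss_position, strand) with strand "+" or "-".
    peaks:     list of (chrom, start, end), half-open intervals (assumed layout).
    Returns a list of length 2*flank + 1; index `flank` is the TSS itself.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))

    width = 2 * flank + 1
    counts = [0] * width
    for chrom, tss, strand in tss_sites:
        covered = [False] * width  # each gene contributes at most 1 per offset
        for start, end in by_chrom[chrom]:
            lo = max(start, tss - flank)
            hi = min(end, tss + flank + 1)
            for pos in range(lo, hi):
                # Flip the axis for minus-strand genes so that negative
                # offsets always mean "upstream of the TSS".
                offset = pos - tss if strand == "+" else tss - pos
                covered[offset + flank] = True
        for i, flag in enumerate(covered):
            if flag:
                counts[i] += 1

    n = len(tss_sites)
    return [c / n for c in counts] if n else counts
```

Calling `aggregate_profile(tss_sites, ts1_toposites)` and plotting the result against offsets from `-flank` to `+flank` would give one curve of the plot; repeating the call with SP120 sites or CpG islands would give the others.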
Taken together, these results strongly suggest that topo IIβ complexed with SP120 recognizes a right-handed crossover formed between positions ~300 bp upstream of the TSS and ~1 kb downstream into the gene. In an average gene, therefore, topo IIβ/SP120 acts on a positively supercoiled loop of ~1,300 bp formed around the TSS, whose size matches well with the chimera size of ~1,200 bp (Fig. S3D). After strand passage, the loop is converted to a negative loop, which may facilitate the initiation step of transcription 9 . Ts1/PSPs that are not associated with the TSS but are enriched in genic regions may represent topo IIβ involved in the relaxation of positive supercoils generated by ongoing transcription 10 . In this case the loop also contains interaction sites for SP120, but the loop size is smaller (800-900 bp).
The hotspot Chr20-2 is composed of two  16 , which is consistent with the fact that RT1 genes are coded by opposite strands in these domains. The connection between the read orientation and the coding strand observed here is additional strong evidence for the rule shown in Fig. 4E. In contrast to RT1-A/CE genes, other RT1 genes and MHC class II genes are devoid of Ts3/DSP.
(C) The hotspot Chr18-2 is a Ts3-hotspot that encompasses about 100 kb of the pericentromeric region adjacent to the p-q boundary of the cytoband. All the Ts3/DSP reads in this region (in either RF or FR orientation) coincide with the positions of Ts3 toposites, SP120 sites, FAIRE-seq peaks, and Sat I sequences with high SW scores from RepeatMasker. Sat I is a rat-specific repetitive element 17 . Multiple copies of Sat I repeats (unit length ~370 bp) aligned in tandem constitute the centromeric region spanning megabases on most chromosomes in the rat 18 . Similar Sat I clusters were located in the pericentromeric region of Chr13, Chr17 and Chr18, as well.
(D) The Ts2-hotspot Chr3-5, encompassing about 300 kb, is a typical hotspot enriched with Ts2 toposites and Ts2/DSP chimeras with RF/FR read orientation. The chimera ends overlap with LINE repeats in the RepeatMasker track.

Comparison to END-seq data
To investigate the relationship between the present study and other mapping techniques for topo IIβ-generated DSBs, we compared the peak positions of Ts1 toposites with those of DSBs obtained by END-seq 19 . The END-seq data for mouse cortical neurons (END_seq_NRN_ETO) were downloaded from NCBI SRA, converted to FASTQ format, and mapped to the mouse genome mm10 with Bowtie 1.1.2. A total of 57,961 peaks were identified by a peak-calling algorithm (MACS 1.4.3). Using the LiftOver tool of the UCSC genome browser, 26,560 peaks (46%) were successfully converted onto the rat genome rn4. The positions of these END-seq peaks were compared with those of 121,478 Ts1 toposites (listed in Supplementary Table S1). We found that 12,098 of the converted END-seq peaks (46%) were located within 500 bp of Ts1 toposites.
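The final proximity count can be illustrated with a short sketch. The text does not state how the distance between a peak and a toposite was measured, so this example assumes midpoint-to-midpoint distance; the function name `count_peaks_near` and the `(chrom, start, end)` tuple layout are likewise hypothetical. Toposite midpoints are sorted per chromosome so that the nearest one to each peak can be found by binary search.

```python
import bisect
from collections import defaultdict

def count_peaks_near(peaks, toposites, max_dist=500):
    """Count peaks whose midpoint lies within max_dist bp of a toposite midpoint.

    peaks, toposites: lists of (chrom, start, end) on the same genome assembly.
    Midpoint-to-midpoint distance is an assumption made for this sketch.
    """
    mids = defaultdict(list)
    for chrom, start, end in toposites:
        mids[chrom].append((start + end) // 2)
    for chrom in mids:
        mids[chrom].sort()

    near = 0
    for chrom, start, end in peaks:
        centers = mids.get(chrom, [])
        if not centers:
            continue  # no toposite on this chromosome
        mid = (start + end) // 2
        i = bisect.bisect_left(centers, mid)
        # The nearest midpoint is either just left or just right of position i.
        candidates = centers[max(i - 1, 0): i + 1]
        if any(abs(mid - c) <= max_dist for c in candidates):
            near += 1
    return near
```

With the counts reported above, the corresponding fraction is 12,098 / 26,560 ≈ 0.46, and the LiftOver conversion rate is 26,560 / 57,961 ≈ 0.46.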

Analysis of gene expression by mRNA-seq
Total cellular RNA from D1, D5, and D5+ cells was prepared as described previously 11 .