Nucleosome binding by the pioneer transcription factor OCT4

Transcription factor binding to genomic DNA is generally prevented by nucleosome formation, in which the DNA is tightly wrapped around the histone octamer. In contrast, pioneer transcription factors efficiently bind their target DNA sequences within the nucleosome. OCT4 has been identified as a pioneer transcription factor required for stem cell pluripotency. To study the nucleosome binding by OCT4, we prepared human OCT4 as a recombinant protein, and biochemically analyzed its interactions with the nucleosome containing a natural OCT4 target, the LIN28B distal enhancer DNA sequence, which contains three potential OCT4 target sequences. By a combination of chemical mapping and cryo-electron microscopy single-particle analysis, we mapped the positions of the three target sequences within the nucleosome. A mutational analysis revealed that OCT4 preferentially binds its target DNA sequence located near the entry/exit site of the nucleosome. Crosslinking mass spectrometry consistently showed that OCT4 binds the nucleosome in the proximity of the histone H3 N-terminal region, which is close to the entry/exit site of the nucleosome. We also found that the linker histone H1 competes with OCT4 for the nucleosome binding. These findings provide important information for understanding the molecular mechanism by which OCT4 binds its target DNA in chromatin.

Scientific Reports | (2020) 10:11832 | https://doi.org/10.1038/s41598-020-68850-1 | www.nature.com/scientificreports/
recognizes its target DNA sequence, which is located at the entry/exit site of the nucleosome. A crosslinking mass spectrometry analysis revealed that the OCT4 bound to the nucleosome is located in the proximity of the N-terminal region of histone H3, which is also near the entry/exit site of the nucleosome. Finally, we found that the linker histone H1 competes with OCT4 for nucleosome binding. These new findings provide novel insights toward understanding the molecular mechanism by which OCT4 binds chromatin and regulates the pluripotency of cells.

Results
The DNA fragment containing the LIN28B distal enhancer region forms a positioned nucleosome in vitro. To test the OCT4 binding to its target DNA sequence in the nucleosome, we reconstituted the nucleosome with a 162 base-pair DNA fragment containing the human LIN28B distal enhancer region (LIN28B nucleosome). Three potential OCT4 target sequences (site 1, site 2, and site 3) exist in the 162 base-pair DNA fragment (Fig. 1a). In human fibroblast cells, a positioned nucleosome is reportedly present in the genomic DNA region containing the 162 base-pair DNA segment 10 .
We reconstituted the LIN28B nucleosome by the salt dialysis method in vitro, and tested its nucleosome positioning by the chemical probing assay. To do so, the LIN28B nucleosome was reconstituted with the human canonical histones H2A, H2B, H3.1, and the histone H4 S47C mutant, in which the Ser47 residue was replaced by Cys (Fig. 1b and Fig. S1). In the nucleosome, the H4 Cys residue introduced at position 47 is located near the DNA backbones around the nucleosomal dyad, and the DNA strands are chemically cleaved by the Fenton reaction 19,20 (Fig. 1b). Therefore, the nucleosomal dyad position can be mapped from the cleavage sites produced by the chemical reaction. A control experiment with the stably positioned nucleosome containing the Widom 601 sequence (147 base pairs) yielded two major bands of 72 and 75 nucleotides, confirming that the nucleosomal DNA strands were precisely cleaved around the nucleosomal dyad (Fig. 1c). We then performed the chemical probing assay with the nucleosome containing the human LIN28B distal enhancer region (162 base pairs). The LIN28B nucleosome was mainly positioned at the center of the 162 base-pair DNA fragment with linker DNAs on both sides (as revealed by 80- and 82-nt DNA fragments), although other positions were also observed as smeared bands (Fig. 1d,e). Therefore, the DNA sequence of the human LIN28B distal enhancer region (162 base pairs) intrinsically possesses a nucleosome positioning property.
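The arithmetic behind this mapping can be sketched as follows. This is a minimal illustration rather than the analysis actually used for Fig. 1: it assumes that a cleavage fragment length approximates the distance from a DNA end to the dyad, and that the nucleosome core wraps about 147 base pairs. The function names are hypothetical.

```python
CORE_BP = 147  # assumed length of DNA wrapped in the nucleosome core


def central_dyad_position(dna_length_bp):
    """Distance (bp) from either DNA end to the dyad of a centrally positioned nucleosome."""
    return dna_length_bp / 2


def linker_lengths(dna_length_bp, dyad_from_left_bp, core_bp=CORE_BP):
    """Linker DNA (bp) left and right of the core, given the dyad position."""
    half_core = core_bp / 2
    left = dyad_from_left_bp - half_core
    right = (dna_length_bp - dyad_from_left_bp) - half_core
    return left, right


# 147 bp Widom 601 control: the midpoint lies between the 72 and 75 nt bands.
print(central_dyad_position(147))

# 162 bp LIN28B fragment: the midpoint lies between the 80 and 82 nt bands.
dyad = central_dyad_position(162)
print(dyad, linker_lengths(162, dyad))
```

Under these assumptions, a centered nucleosome on the 162 base-pair fragment leaves a short linker of equal length on each side, which matches the description of linker DNAs on both sides.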
Cryo-EM structure of the LIN28B nucleosome. We then determined the structure of the LIN28B nucleosome by cryo-electron microscopy. The OCT4-LIN28B nucleosome complex was prepared by sucrose gradient ultracentrifugation, and cryo-EM images were collected (Fig. 2a,b). Since the extra volume corresponding to OCT4 was not obvious, probably because of the weak or flexible association of OCT4 with the LIN28B nucleosome, the cryo-EM structure of the LIN28B nucleosome without OCT4 was reconstructed, selected, and refined to 3.6 Å resolution (Fig. 2c, Fig. S2, and Table S1). In the structure, the DNA sequence could not be oriented, because the reconstructed nucleosome structure may be a mixture of two opposite orientations. The putative positions of the three possible OCT4 binding sites, site 1, site 2, and site 3, were mapped around superhelical location (SHL) 7, SHL 4.5, and SHL 1.5, respectively, in the LIN28B nucleosome (Fig. 2c).
OCT4 specifically binds to its target DNA sequence located at the entry/exit site of the nucleosome. To test the OCT4 binding to the LIN28B nucleosome, we purified the His6-tagged human OCT4 (5CS) protein (Fig. S3), and performed an electrophoretic mobility shift assay. In the 5CS mutant, all 5 Cys residues in the POUS·POUHD domain (C185, C198, C221, C252, and C279) are replaced with Ser to stabilize the protein 21 . Consistent with the previous result 10 , OCT4 efficiently bound to the LIN28B nucleosome, and appeared as a discrete band corresponding to the OCT4-nucleosome complex (Fig. 3a, lanes 1-3). We then mutated one of the three OCT4 target sequences 22 (site 1-3 mutants), and reconstituted the mutant LIN28B nucleosomes (Fig. S4). Interestingly, OCT4 efficiently formed the specific complexes when the site 2 and site 3 target sequences were each disrupted (Fig. 3a, lanes 7-12, and Fig. S5a). In contrast, the OCT4 binding was drastically reduced when the site 1 target sequence was disrupted (Fig. 3a, lanes 4-6, and Fig. S5a). The site 1 target sequence is located at the entry/exit site of the LIN28B nucleosome (Fig. 2c). Therefore, these results indicated that OCT4 preferentially binds its target DNA sequence located at the entry/exit site of the nucleosome. Note that a weak, but clear, band corresponding to the specific OCT4-nucleosome complex was observed when site 1 was mutated (Fig. 3a, lanes 4-6, and Fig. S5a). This suggests that OCT4 also binds to site 2 and/or site 3.
We next tested whether the position of the OCT4 target sequence in the nucleosome affects the OCT4 binding. To this end, the site 1 target sequence was moved five (+5) and ten (+10) base pairs toward the nucleosomal dyad (Fig. 3b and Fig. S4). Since site 2 is located 17 base pairs away from site 1, the +5 and +10 sites are positioned between site 1 and site 2 (Fig. S4). As compared to the wild-type LIN28B nucleosome, the OCT4 binding to the LIN28B (+5) nucleosome was reduced (Fig. 3b, lanes 1-3 and lanes 4-6, and Fig. S5b). The OCT4-nucleosome binding was also suppressed when the LIN28B (+10) nucleosome was used as a template (Fig. 3b, lanes 1-3 and lanes 7-9, and Fig. S5b). In the LIN28B (+10) nucleosome, the additional OCT4 binding sequence is expected to have a rotational setting, relative to the histone surface, similar to that in the wild-type LIN28B nucleosome. Therefore, the translational positioning at the entry/exit site of the nucleosome, but not the rotational setting, of the target DNA sequence in the nucleosome may play an important role in efficient OCT4 binding to the nucleosome.
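The rotational argument can be illustrated with a short calculation. This is a sketch under a stated assumption: the helical repeat is taken as a round 10 base pairs per turn (the `helical_repeat_bp` parameter is hypothetical, and the repeat of nucleosomal DNA is usually quoted as slightly larger), so a +10 shift restores the rotational setting only approximately.

```python
def rotational_offset_deg(shift_bp, helical_repeat_bp=10.0):
    """Rotation (degrees) of a site around the DNA axis after a translational shift.

    helical_repeat_bp is an assumed round value, not a measured one.
    """
    return (shift_bp % helical_repeat_bp) / helical_repeat_bp * 360.0


for shift in (0, 5, 10):
    print(f"+{shift} bp -> {rotational_offset_deg(shift):.0f} degrees")
```

With this assumption, the +5 site faces roughly the opposite side of the DNA helix relative to the histone surface, whereas the +10 site returns to nearly the original face. Since binding is reduced in both cases, the rotational setting alone does not account for the difference.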
OCT4 interacts with histone H3 in the nucleosome. We next performed the crosslinking mass spectrometry (XL-MS) analysis, to map the OCT4 binding site relative to the core histones in the nucleosome. The lysine residues located close to each other in the OCT4-nucleosome complex were crosslinked with disuccinimidyl suberate (DSS)-H12/D12. The crosslinked lysine residues between OCT4 and histones were then detected by mass spectrometry (Fig. 3c and Fig. S6). Intriguingly, we found that the POUS·POUHD domain of OCT4 was predominantly crosslinked with the N-terminal region of histone H3 (Fig. 3c,d). In the nucleosome, the N-terminal region of histone H3 is located near the entry/exit DNA sites (Fig. 2c). Therefore, the XL-MS analysis consistently showed that OCT4 predominantly binds to its site 1 target DNA sequence located near the entry/exit site of the nucleosome. It should be noted that a few N-terminal residues of H2A and H4 also crosslinked with OCT4 (Fig. 3c,d). These interactions may be responsible for OCT4 binding to the other nucleosomal regions, such as site 2 and site 3.
Linker histone H1 competes with OCT4 for nucleosome binding. Linker histones, such as histones H1 and H5, preferentially bind to the entry/exit DNA regions of the nucleosome 23-26 , and convert the relaxed chromatin conformation into the condensed form 26,27 . In pluripotent cells, the chromatin conformation is largely relaxed, allowing the nucleosome binding of the pioneer transcription factors, including OCT4 28,29 . Interestingly, citrullination of the H1 DNA-binding site reportedly induces H1 displacement and chromatin decondensation in pluripotent cells 30 . The transition of linker histone types also occurs during early embryogenesis 31 . This may play an important role in the global conformational change of chromatin to promote the transcriptionally active somatic state during developmental processes. In these processes, the OCT4 bound to the nucleosome may be replaced by linker histones, which would prevent the cells from reverting to the pluripotent state. Therefore, we tested whether the OCT4 bound to the nucleosome could be replaced by H1, by performing a competitive nucleosome binding assay (Fig. 4a). In this assay, we used the nucleosome reconstituted with the 193-base-pair 601 sequence DNA, which contains the OCT4 target sequence at its entry/exit site (Fig. S7). We then prepared the H1-nucleo-
Figure 2. … (c) Cryo-EM iso-potential map of the LIN28B nucleosome, contoured at 4.75 sigma above mean density. Histones H2A, H2B, H3, and H4 are colored yellow, red, light blue, and green, respectively. Three OCT4 target DNA sequences are colored magenta, dark green, and blue, respectively. In the left panel, the H3 N-terminal region disordered in the structure is shown by the dashed line.

Discussion
In the early stage of cell reprogramming, pioneer TFs bind to their specific target DNA sequences and induce conformational changes of the chromatin architecture. In differentiated cells, the target sites for pioneer TFs are considered to be buried in the closed (condensed) chromatin, which is generally inaccessible to DNA binding proteins. Therefore, pioneer TFs somehow bind their target DNA in chromatin to induce the reprogramming. In the present study, we focused on the pioneer TF, OCT4, and studied its specific binding to the LIN28B nucleosome containing its natural binding DNA sequence, the LIN28B distal enhancer DNA sequence.
Previous studies have reported that the LIN28B distal enhancer DNA sequence contains target DNA sequences for pioneer TFs, including OCT4, SOX2, and KLF4, which are known to promote the generation of induced pluripotent stem cells (iPSCs) 7,8,10 . We performed the OCT4 binding assay with the reconstituted LIN28B nucleosomes containing the OCT4 target sequence in various positions. Our chemical probing and cryo-EM analyses revealed that the LIN28B distal enhancer DNA fragment forms a positioned nucleosome, and is properly wrapped around the histone octamer (Figs. 1 and 2). We then found that OCT4 stably and preferentially binds its target DNA sequence located at the entry/exit site of the nucleosome (Fig. 3a,b). Our XL-MS data supported the favored entry/exit binding of OCT4 in the nucleosome (Fig. 3c). Unexpectedly, OCT4 binding is substantially suppressed when the target DNAs are located in other nucleosomal regions (Fig. 3a,b). Importantly, the suppression of the OCT4 binding may not depend on the rotational setting of the target DNA sequence on the nucleosome surface (Fig. 3a,b). Therefore, OCT4 may efficiently bind its target DNA sequence if the DNA can be peeled from the histone surface at the entry/exit site of the nucleosome. These findings are consistent with the previous study showing that POU-family TFs prefer the ends of nucleosomal DNA 32 . Recently, the cryo-EM structures of the OCT4-SOX2-nucleosome complexes with designed DNA sequences were reported 33 . In the structures, SOX2 peels the DNA end from the histone surface, and facilitates OCT4 binding to its target site (Fig. S9) 33 . These results, along with our previous findings, suggest that OCT4 prefers to bind the DNA loosely associated with or detached from the histone surface in the nucleosome. In contrast, a single molecule analysis suggested that OCT4 does not discriminate between end-positioned and dyad-positioned target sequences in the nucleosome 34 . This is probably due to the addition of an excess amount of OCT4 relative to the nucleosome, which would favor nonspecific OCT4 binding. It is also possible that OCT4 specifically binds to the internal OCT4 sites of the nucleosome with low affinity, as shown in Fig. 3.
The DNA peeling from the histones could avoid steric clashes between the POUS·POUHD domain and core histones. In the previous crystal structures of the POUS·POUHD-DNA complex, the POUS and POUHD domains bind the DNA from opposite sides and the steric clash with the histones is unavoidable in the nucleosome 10 . Therefore, OCT4 may not efficiently bind the target DNA, if it is located at a region tightly associated with core histones in the nucleosome. In cells, when the OCT4 target DNA sequence is entirely wrapped in the nucleosome, it may not be targeted by OCT4 without repositioning via a nucleosome remodeling mechanism, which may be induced by the other pioneer TFs and nucleosome remodelers. Intriguingly, the cryo-EM OCT4-SOX2-nucleosome structures revealed that OCT4 binds to the nucleosomal target site with assistance from SOX2, which may induce the DNA peeling from the histone surface of the nucleosome 33 . Similar to SOX2, SOX11 also induces the DNA peeling of the nucleosomal DNA ends 35 . Therefore, in the nucleosome, the SOX-family proteins may function to enhance the OCT4 binding by peeling the DNA around the OCT4 targeting sites.
Our XL-MS analysis revealed that OCT4 crosslinks with the H3 N-terminal region, which is located near the DNA entry/exit sites (Fig. 3c,d). Intriguingly, the OCT4-H3 crosslinks are predominantly observed in the POUS domain, but not in the POUHD domain. This may happen because the POUS domain mainly binds to the nucleosomal DNA at the entry/exit site of the nucleosome. This is consistent with the cryo-EM structure, in which the POUS domain, but not the POUHD domain, of OCT4 binds to the nucleosomal target site 33 . The POUS domain may be a primary recognition module for the nucleosomal target site, and the POUHD domain may have a distinct function in later stages of gene regulation.
We also found that upon nucleosome binding, the linker histone H1 releases the OCT4 bound to the nucleosome (Fig. 4). This is consistent with another pioneer TF, HNF3 (FOXA), which competes with H1 binding on the nucleosome 36 . HNF3 contains a DNA-binding motif with a winged-helix structure, which is similar to that of linker histone H1 37 . In contrast, the POU domain of OCT4 is structurally different from a winged-helix structure 14,15 . Therefore, OCT4 may bind to the nucleosome with a different mode from the winged-helix proteins, such as linker histones and HNF3, and may be evicted from nucleosomes when the winged-helix types of linker DNA-binding proteins are produced in cells.
The OCT4 removal by somatic types of linker histones may function in cellular differentiation. In the early stage of the developmental process, the somatic types of linker histones are expressed at low levels in cells 38,39 . However, the levels of the somatic types of H1s apparently increase progressively upon differentiation 38 . In ES cells, H1 is reportedly more loosely bound to chromatin than in differentiated cells 28,30 . Under these conditions, OCT4 may form a complex with nucleosomes, if the OCT4 target DNA sequences are properly positioned in the nucleosome. This OCT4 binding may accompany the developmental stages of cells. The strong binding of somatic linker histones to chromatin may compete with the OCT4 binding to chromatin, and may contribute On the other hand, during the reprogramming process, since OCT4 may not bind to the nucleosome complexed with a linker histone, the linker histone must be evicted from the chromatin before OCT4 can bind to its target DNA. The H1 bound to chromatin reportedly exchanges rapidly in vitro and in vivo 40,41 . This process may be stimulated by histone chaperones, such as NAP1, and modifications of linker histones, which can remove or destabilize the linker histone bound to the nucleosome 30,42 . Further studies are required to clarify the mechanism by which OCT4 modulates the chromatin conformation with linker histones and regulates cell fates.

Materials and methods
Preparation of DNA fragments. The 162 base-pair LIN28B distal enhancer DNA fragment 10
Purification of histones and histone complexes. The human histones H2A, H2B, H3.1, and H4 were purified as recombinant proteins, as described previously 43 . Using the purified, lyophilized histones, the H2A-H2B and H3-H4 complexes were reconstituted and isolated 43 . The complexes were flash frozen in liquid nitrogen, and stored at −80 °C.
Reconstitution and purification of nucleosomes. The nucleosomes were prepared as described previously 43 . Briefly, a DNA fragment was mixed with the H2A-H2B and H3-H4 complexes in high-salt buffer, and the nucleosomes were reconstituted by the salt dialysis method 43 . To prepare the nucleosome for the chemical probing assay, H4 S47C, in which the Ser47 residue of H4 was replaced by Cys, was used instead of wild-type H4. The resulting nucleosomes were further purified by non-denaturing polyacrylamide gel electrophoresis, using a Prep Cell apparatus (Bio-Rad). The nucleosomes were collected in buffer F [20 mM Tris-HCl (pH 7.5) and 1 mM DTT], and were concentrated using a Millipore centrifugal filter. The nucleosome samples were stored at 4 °C.
Figure 3. (a) … (lanes 2, 5, 8, and 11) and 0.2 µM (lanes 3, 6, 9, and 12) of OCT4, and analyzed by non-denaturing polyacrylamide gel electrophoresis with ethidium bromide staining. Replicated experiments confirmed the reproducibility of the results (Fig. S5a). (b) Gel-shift assay using the +5 and +10 mutants of the LIN28B nucleosome. The site 1 target (colored magenta) sequence was moved five (+5) and ten (+10) base pairs toward the nucleosomal dyad (upper illustration). The nucleosomes (0.1 µM) were mixed with 0 µM (lanes 1, 4, and 7), 0.1 µM (lanes 2, 5, and 8) and 0.2 µM (lanes 3, 6, and 9) of OCT4, and analyzed by non-denaturing polyacrylamide gel electrophoresis with ethidium bromide staining. Replicated experiments confirmed the reproducibility of the results (Fig. S5b). (c) Peptide sequences identified by XL-MS. The peptide sequences are sorted by their Ld-scores, which are the linear discriminant scores calculated by xQuest/xProphet. The 'Position' corresponds to the position of the crosslinked lysine in the protein. (d) OCT4-histone interactions in the nucleosome, determined by crosslinking mass spectrometry. OCT4, H2A, H3.1, and H4 are represented by pink, yellow, blue, and green rectangles, respectively. The OCT4 POUS·POUHD domains are colored light pink. The interactions between OCT4 and histones (H2A, H3.1, and H4) are shown by lines. Numbers shown above or below the rectangles represent the amino acid residues crosslinked between OCT4 and histones.
Cryo-electron microscopy. The OCT4-LIN28B nucleosome complex for the cryo-EM analysis was purified by sucrose gradient ultracentrifugation. A gradient was formed with buffer G [10 mM HEPES-NaOH (pH 7.5), 20 mM NaCl, 1 mM DTT, and 5% sucrose] and buffer G containing 15% sucrose, using a gradient maker. For complex formation, the LIN28B nucleosome (0.96 µM) was mixed with OCT4 (nucleosome:OCT4 = 1:6 molar ratio) in buffer H [10 mM Tris-HCl (pH 7.5), 20 mM NaCl, and 1 mM DTT], and was incubated for 1 h on ice. The sample was applied on the top of a gradient, and was centrifuged at 27,000 rpm at 4 °C for 16 h, using a Beckman SW41Ti rotor. The fractions were analyzed by non-denaturing polyacrylamide gel electrophoresis, and the peak fractions were dialyzed against buffer F. The sample was then concentrated using a Millipore centrifugal filter. Tween-20 was added to a final concentration of 0.00074%, and a 2 µl portion of the sample (0.6 mg/ml) was applied to a glow-discharged Quantifoil holey carbon grid (R1.2/1.3 200-mesh Cu). The grids were blotted for 6 s under 100% relative humidity at 16 °C, and were immediately plunged into liquid ethane, using a Vitrobot Mark IV (Thermo Fisher). Cryo-EM images were collected by the EPU auto acquisition software on a Talos Arctica cryo-electron microscope (Thermo Fisher), operated at 200 kV at a nominal magnification of ×100,000, which renders a pixel size of 1.32 Å at the object scale. Images were recorded under low-dose conditions with 10-s exposure times, using a K2 Summit direct electron detector and a GIF Quantum energy filter (slit width 20 eV) (Gatan) in the counting mode, retaining a total of 40 frames with a total dose of ~50 electrons per Å².
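A few quantities follow directly from the acquisition and sample parameters quoted above. The sketch below only restates that arithmetic; the function names are hypothetical, and the Nyquist limit is taken as twice the pixel size, which is the standard sampling bound rather than a value reported in this study.

```python
PIXEL_SIZE_A = 1.32          # pixel size at the object scale (Å)
TOTAL_DOSE = 50.0            # approximate total dose (electrons per Å^2)
N_FRAMES = 40                # frames retained per movie
EXPOSURE_S = 10.0            # exposure time per movie (s)
NUCLEOSOME_UM = 0.96         # nucleosome concentration in the complex mix (µM)
OCT4_PER_NUCLEOSOME = 6      # nucleosome:OCT4 = 1:6 molar ratio


def nyquist_limit_A(pixel_size_A):
    """Finest resolvable spacing (Å) for a given pixel size."""
    return 2.0 * pixel_size_A


def per_frame(total, n_frames):
    """Split a per-movie total (dose or time) evenly across frames."""
    return total / n_frames


def partner_concentration_uM(base_uM, molar_ratio):
    """Concentration of a binding partner added at a given molar ratio."""
    return base_uM * molar_ratio


print("Nyquist limit (Å):", nyquist_limit_A(PIXEL_SIZE_A))
print("Dose per frame (e/Å^2):", per_frame(TOTAL_DOSE, N_FRAMES))
print("Time per frame (s):", per_frame(EXPOSURE_S, N_FRAMES))
print("OCT4 (µM):", round(partner_concentration_uM(NUCLEOSOME_UM, OCT4_PER_NUCLEOSOME), 2))
```

The reported 3.6 Å resolution is coarser than the sampling limit implied by the 1.32 Å pixel size, so the pixel size does not constrain the final map.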
Image processing. In total, 1,877 movies of the LIN28B nucleosome were aligned and integrated using MOTIONCOR2 (https://emcore.ucsf.edu/ucsf-motioncor2) 44 , with dose weighting. The contrast transfer function (CTF) was estimated by CTFFIND4 (https://grigoriefflab.janelia.org/ctf) 45 from the digital micrographs, with dose weighting. In total, 1,274 images were selected based on the CTF fit correlation to approximately 5 Å resolution. RELION 3.0 (https://www2.mrc-lmb.cam.ac.uk/relion/index.php/Main_Page) 46 was used for all subsequent image processing operations. Subsequently, 325,272 particles of the LIN28B nucleosome were picked automatically, with a box size of 150 × 150 pixels. Two-dimensional classification to remove bad particles resulted in the selection of 264,297 particles. The crystal structure of a canonical nucleosome (PDB: 3LZ0), low-pass filtered to 60 Å, was used as the initial three-dimensional reference. The best classes containing 150,721 particles, in which the nucleosomal DNA was fully wrapped around the histone octamer, were selected from the three-dimensional classification. The three-dimensional refinement of the LIN28B nucleosome was performed, followed by particle polishing and two rounds of CTF refinement. The final three-dimensional map of the LIN28B nucleosome was sharpened with an exponential B-factor (−50.7 Å²).
Gel mobility shift assay. Each reaction was performed in a total volume of 10 µl. The LIN28B nucleosome
Crosslinking mass spectrometry. Crosslinking mass spectrometry was performed as described previously 51,52 . The LIN28B nucleosome (4.3 μM) was mixed with OCT4 in a nucleosome:OCT4 = 1:4 molar ratio, and was incubated on ice for 1 h. The sample was then crosslinked with 9.6 mM DSS-H12/D12 (Creative Molecules) at 30 °C for 30 min. The reaction was stopped by adding 48 mM ammonium bicarbonate, and then incubated at 30 °C for 15 min. The sample was reduced, alkylated, and then digested by sequencing-grade endopeptidase Trypsin/Lys-C Mix (Promega), at an enzyme-substrate ratio of 1:50 wt/wt. The digested sample was applied to a Superdex 30 Increase 3.2/300 (GE Healthcare) column, using buffer containing 25% acetonitrile and 0.1% TFA. The eluted fractions (150 μl) were collected, dried, and re-dissolved in 0.1% TFA. The samples were analyzed by liquid chromatography tandem mass spectrometry (LC-MS/MS), using an LTQ-Orbitrap Velos mass spectrometer (Thermo Fisher Scientific) equipped with a Zaplous Advance nano UHPLC HTS-PAL xt System (AMR). The crosslinked peptides were identified using the xQuest/xProphet software (https://proteomics.ethz.ch/) 51 , and the crosslinks were visualized using the webserver xVis (https://xvis.genzentrum.lmu.de/login.php) 53 . The mass spectrometry raw data used in this study have been deposited to the ProteomeXchange Consortium via the jPOST repository (PXD019160, https://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD019160) 54 .
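As background to the use of the isotope-coded DSS-H12/D12 reagent, crosslinked peptides are expected to appear as light/heavy signal pairs separated by the mass of twelve hydrogen-to-deuterium substitutions. The sketch below computes that spacing from standard monoisotopic masses. It is an illustration only: the function names are hypothetical, and the settings actually used in xQuest/xProphet for this study are not given in the text.

```python
MASS_H = 1.00782503  # monoisotopic mass of 1H (Da)
MASS_D = 2.01410178  # monoisotopic mass of 2H (Da)


def isotope_shift_da(n_deuterium=12):
    """Mass difference (Da) between heavy (D12) and light (H12) crosslinker."""
    return n_deuterium * (MASS_D - MASS_H)


def mz_spacing(charge, n_deuterium=12):
    """Spacing (m/z) between light and heavy peaks for a given charge state."""
    return isotope_shift_da(n_deuterium) / charge


print(f"Light/heavy mass shift: {isotope_shift_da():.4f} Da")
for z in (2, 3, 4):
    print(f"charge {z}+: m/z spacing {mz_spacing(z):.4f}")
```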

Data accessibility
The cryo-EM map of the LIN28B nucleosome has been deposited in the Electron Microscopy Data Bank, with the EMDB ID code EMD-30070. The mass spectrometry raw data have been deposited to the ProteomeXchange Consortium via the jPOST repository, with the ID code PXD019160.