Nucleosome eviction along with H3K9ac deposition enhances Sox2 binding during human neuroectodermal commitment

Neuroectoderm is an important neural precursor. However, chromatin remodeling and its epigenetic regulatory roles during the differentiation of human neuroectodermal cells (hNECs) from human embryonic stem cells (hESCs) remain largely unexplored. Here, we obtained hNECs through directed differentiation from hESCs, and determined chromatin states in the two cell types. Upon differentiation, H2A.Z-mediated nucleosome depletion leads to an open chromatin structure in promoters and upregulates expression of neuroectodermal genes. Increase in H3K9ac signals and decrease in H3K27me3 signals in promoters result in an active chromatin state and activate neuroectodermal genes. Conversely, decrease in H3K9ac signals and increase in H3K27me3 signals in promoters repress pluripotency genes. Moreover, H3K9ac signals facilitate the pluripotency factor Sox2 binding to target sites unique to hNECs. Knockdown of the acetyltransferase Kat2b erases H3K9ac signals, disrupts Sox2 binding, and fails the differentiation. Our results demonstrate a hierarchy of epigenetic regulation of gene expression during the differentiation of hNECs from hESCs through chromatin remodeling.

Chromatin remodeling offers the epigenetic basis for transcriptional regulation and has a pivotal role in many biological processes such as cell differentiation, embryonic development, and so on. Nucleosome is the fundamental repeating structural unit of chromatin. 1 Cell differentiation involves extensive nucleosome reorganization that alters chromatin structure. Lineage-specified cells have a more condensed chromatin structure compared to pluripotency stem cells. 2 Nucleosomes are evicted upstream near transcription start sties (TSSs) or in distant enhancers to activate genes required for lineage commitment during differentiation of stem cells. 3,4,5 Genome-wide comparisons of chromatin structure between mammalian pluripotent stem cells and differentiated cell types reveal that regions of difference in nucleosome occupancy are enriched in the loci associated with pluripotency and differentiation. 6 Thus, dynamics of nucleosome positioning is important to cell fate commitment.
Histone modifications (HMs) are another key impact factor for chromatin structure. HMs exert their epigenetic regulatory roles through changing chromatin state or recruiting non-histone proteins to chromatin. 7 Global changes in HMs during multilineages differentiation of human embryonic stem cells (ESCs) have been reported in previous studies. 8,9 Comparisons between human pluripotent stem cells and donor fibroblasts identified lots of regions with differences in H3K4me3 and H3K27me3 signals. 10 Profiling of HMs of mouse ESCs, neural progenitor cells (NPCs), and embryonic fibroblasts revealed that H3K4me3 and H3K27me3 effectively distinguish genes with different expression levels and reflect lineage potential. 11 A recent HPLC-MS-based quantitative proteomics identified dozens of HM sites in mouse ESCs and NPCs and elaborated the effect of combinational HMs on the differentiation of mouse ESCs to NPCs. 12 The above studies all mainly focused on profiling of HMs and the global changes from ESCs to NPCs. However, the role of HM changes in the differentiation largely remains unclear.
Transcription factors' (TFs) binding to the target sites is closely correlated with HMs for a precise regulation of gene expression. Some TFs contain a protein domain that recognizes specific type of HM. Consequently, the TFs may have an affinity for the specific HM. For example, bromodomain PHD finger TF (BPTF) has a higher affinity to H3K4me3 than H3K4me2 and discriminates against H3K4me1. 13 It was reported that the core pluripotency TF Oct4 binds to the target sites with primed epigenetic signatures during mouse somatic cell reprogramming. 14 Motifs of distinct TFs are enriched in the regions with stage-specific transition during neural differentiation. 15 However, it is unclear which HM(s) have the dominant role and how it regulates TFs' binding. It is also under debate at most time as whether HMs are causal or a consequence of TFs' binding.
Neuroectoderm specification is of major interest to developmental biology. Neuroectoderm (neuroepithelia) is a very important neural precursor. It develops into neural plate, providing the source of the central nervous system. 16 Moreover, neuroectoderm has potentials to differentiate into neurons, astrocytes, and oligodendrocytes. Therefore, the differentiation of ESCs to neuroectoderm is of great significance for understanding of neural disease, for example, primitive neuroectodermal tumors. Epigenetic regulation has a critical role in the differentiation. For example, a subset of poised enhancers gained active chromatin signatures (H3K27ac) during differentiation of hESCs to neuroepithelium. 17 However, to date, the epigenetic regulatory mechanisms underlying neuroectoderm specification remains elusive.
To unveil the patterns and roles of chromatin remodeling during the differentiation of human ESCs (hESCs) to human neuroectodermal cells (hNECs), we employed the differentiation method of hESCs to a synchronized population of hNECs. 18,19 This differentiation process resembles in vivo neural development and serves a model to study the mechanisms underlying human neuroectoderm specification. 20 For example, Pax6 was identified as a transcriptional determinant of human neuroectoderm through this neural differentiation tool. 21 In this study, we profiled transcriptome, nucleosome occupancy, core histone modifications, and genomic bindings of TF Sox2 for hESCs and hNECs. The integrative analyses of these data show that extensive nucleosome repositioning and histone modification changes occur upon differentiation. H2A. Z-mediated nucleosome disassembly in promoters leads to an open chromatin structure to activate transcription of neuroectodermal genes. Moreover, H3K9ac signals deposited by acetyltransferase Kat2b are essential to the differentiation through enhancing recruitment neuroectodermal factors Sox2 and Pax6 to their target sites.

Results
Expression profiles upon differentiation. We employed a chemically defined protocol to make directed differentiation of hESCs to nearly pure hNECs. 18,19 The synchronized   Figure 1a). The qPCR results show that the mRNA levels of core pluripotency genes Oct4 (also known as Pou5f1) and Nanog are significantly decreased in hNECs whereas the hNEC determinant factor Pax6 is opposite (Supplementary Figure S1a). Immunofluorescence assay confirmed the similar results at protein levels ( Figure 1b). Of note, Sox2 has a retention of high protein levels in both hESCs and hNECs. This is consistent with the critical roles of Sox2 in both ESCs and NPCs. 22,23 We'll discuss more about the functions of Sox2 on the differentiation of hNECs hereinafter.
Gene set enrichment analysis (GSEA) 24 revealed that hESC-specific marker genes are significantly repressed during the differentiation, whereas the genes associated with neurogenesis and central nervous system development are significantly upregulated (Figure 1c  H2A.Z-mediated nucleosome eviction. We scanned the genome with a 10-kb window and calculated nucleosome occupancy for each window. Nucleosome occupancy of the biological replicates is highly reproducible (Supplementary Figure S2a). There occur global extensive changes in nucleosome occupancy in the differentiation with significant occupancy increase in most genomic regions in hNECs (Supplementary Figure S2b). We obtained the same results with different offsets and window sizes (data not shown here). Moreover, the nucleosome positioning becomes more fixed in hNECs than in hESCs ( Figure 2a). As a result, chromatin structure becomes more condensed and stable in hNECs. This is consistent with the chromatin structures in pluripotent stem cells and differentiated cells, respectively. 2 We next predicted nucleosome positions using nucleosome prediction tool GeneTrack. 25 Comparisons found that only 1% of nucleosomes are fixed without changing their positions upon differentiation, 6% are disassembled or reassembled, and the rest shift. Further analysis show that nucleosome loss is enriched in promoters (Supplementary Figure S2c). This finding suggests that there are extensive nucleosome repositioning during the differentiation.
Nucleosome organization in promoters has a critical role in regulation of gene expression. 26 There exists a canonical nucleosome organization of − 1, nucleosome depletion region (NDR), +1, +2, et al. nucleosomes around TSS in both cell types. The average nucleosome occupancy in promoters is lower in hESCs than hNECs. Surprisingly, there is a small peak in the NDR in hESCs indicating that NDRs of some genes are occupied by nucleosomes in hESCs but free of nucleosomes in hNECs (Figure 2b). The nucleosome occupancy in the NDRs (−150-50 bp of TSS) is negatively correlated with gene expression levels (Supplementary Figure S2d). This suggests the key roles of NDRs in regulating differentiation.
To address the role of NDR formation in the differentiation of hNECs, we identified genes containing an NDR in hNECs that is occupied by a nucleosome in hESCs. Nucleosome arrangement around TSS of these genes is similar in both cell types except for an occupied NDR in hESCs and an unoccupied NDR in hNECs ( Figure 2c). Intriguingly, this group of genes include many neuroectodermal genes such as Lhx2, Pax6, Meis2 and so on. As a result, their expression levels are significantly increased. ChIP-qPCR results confirmed the significantly reduced nucleosome occupancy in the NDRs (Figure 2d). GSEA results further show that 'nervous system development' gene set is activated upon differentiation ( Figure 2e). Remarkably, H2A.Z nucleosomes are enriched on the evicted nucleosomes that form the NDRs in hNECs ( Figure 2f). This suggests that H2A.Z nucleosomes mediate NDR formation in promoters of neuroectodermal genes to facilitate the differentiation of hNECs. As contrast, we collected pluripotency genes and examined nucleosome organization around TSS. Unexpectedly, NDRs retain unoccupied during the differentiation (Figures 2g and h; Supplementary Figure S2e). However, their expression levels are significantly decreased (Figure 2g). This indicates that other factors other than nucleosome occupancy in NDRs regulate their activity.
We further analyzed the distribution of evicted nucleosomes and H2A.Z nucleosomes around TSS. The results show that evicted nucleosomes are enriched in the canonical − 1, NDR, +1, +2 nucleosomes. Consistently, H2A.Z nucleosomes are also enriched in this region (Supplementary Figure S2f). In addition, only 1.2% of evicted nucleosomes and 10.4% of H2A.Z nucleosomes are located around TSS, respectively. These findings suggest that nucleosome eviction is not restricted to the promoter regions on the genome scale.
Establishment of a permissive chromatin state in promoters. To uncover how HMs change and regulate gene expression during the differentiation, we generated genomewide occupancy maps of core HMs using high-throughput ChIP-seq technology with high reproducibility (Supplementary Figure S3a). We next categorized promoters by HM marks. The majority of promoters are marked by H3K4me3/H3K9ac or H3K4me3/H3K9ac/H3K27me3 in both cell types. The prominent difference is that more promoters are marked by H3K9ac only or H3K9ac/H3K27me3 in hNECs than hESCs whereas promoters marked by H3K4me3/H3K9ac/H3K27me3 are opposite (Supplementary Figure S3b). This implies that H3K9ac likely has a critical role in the differentiation of hNECs. Analysis of HM dynamics shows that the majority of promoters remain their original chromatin state in the differentiation of hNECs from hESCs ( Figure 3a and Supplementary Figure S3c). Moreover, the increased promoters marked by H3K9ac only in hNECs are from promoters marked by H4K3me3/H3K9ac in hESCs. The increased promoters marked by H3K9ac/H3K27me3 in hNECs are from promoters marked by H4K3me3/H3K9ac/H3K27me3 in hESCs. As a result, gene expression levels increase as active HM signals increase and/or repressive HM signals decrease, vice versa ( Figure 3b and Supplementary Figure S3d).
We next specifically examined HM dynamics in the promoters with NDR formation during the differentiation. The active HM signals largely increase around TSS from hESCs to hNECs, especially H3K9ac levels are increased to approximately twofolds. Conversely, the repressive H3K27me3 is decreased ( Figure 3c). There is also enrichment of RNA Pol II around the TSS and of H3K36me3 across the gene body that indicates active transcription of these genes ( Figure 3d). This suggests that nucleosome eviction together with HM dynamics set up an open chromatin structure in the promoter regions in neuroectodermal genes for transcription to facilitate the differentiation of hNECs from hESCs.
H3K9ac and H3K27me3 dynamics accurately regulates gene activity of distinct lineages. To address which HM(s) have a dominant role in control of gene activity in the differentiation of hNECs, we collected three sets of genes: pluripotency genes for hESCs, neuroectodermal genes for hNECs, and other lineage-specific genes for comparison, and examines HM dynamics in promoters and gene expression change. The results show that H3K4me3 levels largely decrease in pluripotency genes whereas H3K27me3 levels increase. H3K9ac levels change very little. Concomitantly, the gene expressions are downregulated (Figures 4a and b). This suggests that H3K27me3 and H3K4me3 have a pivotal role in regulation of pluripotency genes. Contrarily, H3K27me3 signals greatly decrease in neuroectodermal genes, whereas H3K9ac signals increase. H3K4me3 signals remain unchanged. Consequently, the gene expressions are upregulated (Figures 4a and b). Moreover, H3K9ac and H3K27me3 levels are the best predictive for gene activity in hNECs among the core HMs ( Figure 4c). This indicates that H3K27me3 and H3K9ac have a dominant role in regulation of neuroectodermal genes. In contrast,there is very little change in both H3K27me3 and H3K9ac levels in other lineagespecific genes in concordance with unchanged gene expression levels (Figures 4a and b). This implies that other lineagespecific genes remain unchanged expression levels under the context of unchanged chromatin structure in promoters. ChIP-qPCR results confirmed the aforementioned distinct HM dynamics in promoters of the representative genes (Supplementary Figure S4a).
We further quantified change in H3K9ac and H3K27me3 signals in promoters upon differentiation by calculating the change in levels of the two HMs: log 2 (fold change of H3K9ac hNECs/hESCs ) − log 2 (fold change of H3K27me3 hNECs/hESCs ) that is defined as Histone Modification Index (HMI). The distribution of HMI in genes is a normal distribution (Supplementary Figure S4b). The net effect of HMI ⩽ − 1 is that increase in H3K9ac in promoters of these genes is equal to or less than two folds of increase in H3K27me3. Consistently, the expression levels of genes with HMI ⩽ − 1 significantly decrease in hNECs. Conversely, the net effect of HMI ⩾ 1 is that increase in H3K9ac in promoters of these genes is equal to or more than two folds of increase in H3K27me3. The expression levels of genes with HMI ⩾ 1 significantly increase in hNECs. In contrast, there is no significant change in the expression levels of the rest of genes (Supplementary Figure S4c). This suggests that the combinational H3K9ac and H3K27me3 marks serve an appropriate predictive of global gene activity and have a critical role in the differentiation of hNECs from hESCs.  Figure S4d). Interestingly, CRRs are enriched in promoters (Figures 4d and e; Supplementary Figure S4e). Moreover, CRRs contain binding motifs of many TFs associated with neural differentiation and development, including Pax6, Six6, and Sox family (Figure 4f). Collectively, H3K9ac and H3K27me3 execute their epigenetic regulatory functions during the differentiation through chromatin remodeling mainly in promoters and consequently likely controlling accessibility of binding motifs of key TFs.
Sox2 binds to hNEC-specific sites upon the differentiation. TF Sox2 has important roles in both ESCs and progenitors of multiple lineages. 22,23 How Sox2 executes its distinct functions, especially in the differentiation of hNECs, remains largely unknown. Our highly reproducible Sox2 ChIPseq data (Supplementary Figure S5a) Figure S5b). This suggests that our Sox2 ChIP-seq data capture its bona fide target sites. De novo motif discovery further identified motifs of other pluripotency TFs, including Oct4, Nanog, and so on, in hESC-specific Sox2-binding sites. In contrast, motifs of other neural TFs, including Nkx2.1, Sox9, Pax6, and so on, are enriched in hNEC-specific Sox2-binding sites (Figure 5b). This together indicates that Sox2 exerts its distinct functions in pluripotency maintenance and multilineage progenitor commitment by binding to distinct target sites with involvement of other corresponding co-factors.
Kat2b deposits H3K9ac to recruit Sox2 to its target sites. To understand how chromatin remodeling regulates Sox2 binding to its target sites upon differentiation, we examined nucleosome occupancy around hNEC-specific Sox2-binding sites and found no change during the differentiation (Supplementary Figure S5c). However, H3K9ac levels are largely increased on the sites whereas H3K27me3 levels are decreased (Figure 5c). Moreover, Sox2-binding signals are enriched in CRRs (Figure 5d). This implies that H3K9ac and H3K27me3 are a dominant factor for recruitment of Sox2 to the target sites other than nucleosome occupancy. Acetyltransferase Kat2b is the key enzyme responsible for H3K9ac deposition. Coincidently, Kat2b is significantly upregulated upon differentiation. Thus, we hypothesized that Kat2b may be the key factor to regulate Sox2 binding by depositing H3K9ac upon differentiation. To test this, we knocked down Kat2b by shRNAs (Supplementary Figure S5d). As knockdown of Kat2b, H3K9ac levels are significantly decreased at Sox2binding sites, and Sox2 occupancy is also significantly reduced. Similarly, Kat2b knockdown also significantly reduces H3K9ac levels at Pax6 target sites and consequently Pax6 binding signals are precipitously decreased (Figure 5e). As a result, Kat2b knockdown fails the differentiation of hNECs from hESCs (Supplementary Figure S5e).

Discussion
Chromatin remodeling has a critical role in cell differentiation. In this study, the results show there occur extensive nucleosome repositioning and HM dynamics during the differentiation of hNECs from hESCs. H2A.Z mediates nucleosome eviction to form NDRs in the promoters of neuroectodermal genes. Acetyltransferase Kat2b deposits H3K9ac in the genes in concomitance with decrease in repressive H3K27me3 levels. The resultant open chromatin structure in the promoters activates neuroectodermal gene transcription. Meanwhile, the increased H3K9ac signals help recruit Sox2 to its target sites unique to hNECs. As a result, hESCs are directly differentiated to hNECs ( Figure 6). These findings reveal the dynamics of chromatin structure and its epigenetic regulatory roles in the differentiation of hNECs from hESCs. This indicates that the fine-tuning of chromatin structure is critical to lineage-commitment gene regulation.
Dynamic changes in chromatin modifications have crucial roles in cell differentiation. A previous study illustrated that H3K4me3 at promoters remains largely invariant during hESCs differentiating to a mesendodermal lineage. In contrast, switch between H3K27ac and H3K27me3 at promoters is critical to the differentiation by repressing pluripotency genes and activating mesendodermal genes. 28 Intriguingly, our findings suggest that dynamic changes in H3K9ac and H3K27me3 at promoters are important to the differentiation of hNECs from hESCs. Consistently, H3K9ac levels at promoters are critical to glial and neuronal commitment in Drosophila embryonic neural development. 4,5,29 Collectively, marked changes in distinct combinational histone modifications in cell fate commitment tend to be cell-type specific.
Nucleosome occupancy is fundamental for chromatin structure. Nucleosome eviction leads to an open chromatin structure increasing accessibility of DNA sequences and therefore has an important role in the differentiation process. It was reported that chromatin remodeling complexes SWI/SNF and INO80 participate nucleosome depletion in promoters during differentiation toward the endoderm/hepatic fate. 3 Similarly, we observed H2A. Z-mediated nucleosome eviction in promoters upon differentiation of hNECs than significantly increase transcription levels of neuroectodermal genes. However, the detailed mechanism of which chromatin remodeling complexes orchestrate to regulate nucleosome eviction requires future studies.
Interplay between TFs and histone post-translational modifications precisely regulates transcription. Chromatin modifications enhance recruitment of TFs to certain genomic regions. For example, TFs and co-activators favor binding to transcriptional enhancers defined by H3K4me1 enrichment. Extended H3K27ac domains are abundant of motifs for master TFs in the respective cell types. 30,31 Here, we reveal that H3K9ac marks at the hNEC-specific binding sites of neuroectodermal factors Sox2 and Pax6 are essential for their binding. Knockdown of acetyltransferase Kat2b largely reduces H3K9ac levels, disrupts bindings of Sox2 and Pax6, and consequently fails neuroectodermal differentiation. This greatly improves our understanding of regulatory roles of chromatin remodeling in the differentiation of hNECs from hESCs.

Materials and Methods
Differentiation of hNECs from hESCs. Human ESCs (H9, passages [18][19][20][21] were differentiated into hNECs as described previously. 19 . 13256-029). Then, the colonies were detached from MEFs and formed aggregates to initiate the neural differentiation procedure by withdrawing bFGF. After 4 days of suspension culture, the ESC growth medium was replaced with neural induction medium (Sterilely combine 489 ml of DMEM/F12, 5 ml N2 supplement, 5 ml MEM nonessential amino acids solution, and 1 ml of 1 mg/ml Heparin) to guide neuroectodermal specification. Then transferred the suspension cells to the 6-well plates without feeder on the sixth day. After another 4 days of adherent growth, we got hNECs on day 10.
Immunofluorescence staining. Cells were fixed by 4% paraformaldehyde at room temperature for 10 min, followed by 0.1% Triton X-100 permeabilization for 10 min and 1% BSA blocking for 1 h. They were then incubated with primary antibodies against Oct4  Table S1.
MNase-seq and ChIP-seq for histone modifications. Overall, 2 × 10 7 cells were crosslinked with 1% formaldehyde for 10 min at room temperature, and crosslinking was terminated by addition of 125 mM glycine. Cells were harvested for lysis to isolate nuclei. Suspend nuclei in 500 μl of MNase digestion buffer (10 mM Tris-HCl (pH 7.5), 15 mM NaCl, 60 mM KCl, 1 mM CaCl 2 , 0.15 mM spermine, 0.5 mM spermidine, 1 × EDTA free protease inhibitor cocktail). Then digest nuclei with 2.5 μl of micrococcal nuclease (NEB# M0247S, 2000 gel units/μl) at 37°C for 20 min. Then put the samples on ice and add EDTA to a final concentration of 10 mM to stop MNase digestion. Crosslinking was then reversed for at least 6 h at 65°C along with proteinase K digestion. Nucleosomal DNA was extracted using phenol-chloroform and purified on 1.5% agarose gels.
The purified mononucleosomal DNA by MNase digestion and ChIP'ed by HM antibodies were subjected to massively parallel DNA sequencing on Illumina HiSeq2000 platform (San Diego, CA, USA) using 49 bp single end protocol.
ChIP-seq for Sox2 and RNA Pol II. Genomic DNA was crosslinked and extracted as described above MNase-seq. The purified genomic DNA was suspended in 500 μl of sonication buffer (1XPBS, 1% NP-40, 0.5% sodium deoxycholate, 0.1% SDS, 1 × EDTA free protease inhibitor cocktail), and sonicated with 6 rounds of 30-s on and 30-s off on ice using XL-2000 Misonix sonicator with power output of 7 Watts. It is critical that the average length of the sheared chromatin is about 250 bp, with length ranging from 150-500 bp. The fragmented DNA was immunoprecipitated with 3~5 μg of specified antibody (anti-Sox2: Abcam ab59776, anti-Pol II: Abcam ab817) as described in the above ChIP-seq for HMs. The ChIP'ed DNA was subjected to massively parallel DNA sequencing on Illumina HiSeq2000 platform (San Diego, CA, USA) using 49 bp single end protocol.
ChIP-qPCR. ChIP-qPCR was conducted according to user manual of ChampionChIP PCR Array from SABiosciences (Germantown, MD, USA). Briefly, DNA fragments were prepared as above MNase-seq or ChIP-seq. Relative signal abundance in regions of interest in sample DNA was measured by qPCR using Power SYBR. Fragmented genomic DNA or IgG-immunoprecipitated DNA was used as control sample. Relative signal enrichment was calculated using ΔΔCt method by normalizing Ct values against control sample. The ChIP-qPCR primers were were listed in the Supplementary Table S2.  Microarray analysis. Gene expression microarray data of the two cell types were downloaded from ArrayExpress database (accession number E-MEXP-2426, Affymetrix HG-U133A). 33 The data contained 3 replicates. The.CEL files were analyzed using the affy package within R/Bioconductor. 34 Robust multichip averaging (RMA) was used to correct for background, normalize and generate expression data. 35 The Limma (linear models for microarray data) package was then used to identify differentially expressed genes with the adjusted P-valueo0.05 and expression fold change ⩾ 2. 36 Functional annotation of gene sets. GO analysis for gene sets was performed using the tool PANTHER. 37 The tool Gene Set Enrichment Analysis (GSEA v2.1.0) 24 was used to identify any priori defined set of genes that share common biological function or regulation and show statistically significantly, concordantly different expression levels between two samples. GSEA was performed using the gene sets of gene ontology (C5) and user-defined HESC SPECIFIC MARKERS gene set that was collected from published paper. 38 Metric for ranking genes was set to Signal2Noise. Gene ontology gene sets were curated by MSigDB database v4.0. Other parameters used default values.
Nucleosome prediction and analysis of nucleosome positioning dynamics. Sequencing reads were aligned to Homo sapiens reference genome (hg19) using Bowtie-1.0.0 (ref. 39) with up to two mismatches. Only the uniquely mapped reads were retained for nucleosome prediction. Because only the border of mononucleosomal DNA was sequenced, each read was moved 73 bp interior to its end to represent nucleosome dyad. The peak-calling tool GeneTrack 25 was used to smooth the clustered distribution of reads and predict nucleosomes using an exclusion zone of 147 bp and sigma of 20 bp. Nucleosomes that were detected six or more times (that is, read count ⩾ 6) were further analyzed, although patterns were identical when all nucleosomes were analyzed. Nucleosome fuzziness was calculated as the standard deviation of the coordinates of all reads defining the same nucleosome as described previously. 40 It measures how spread out a nucleosome position is. Each nucleosome was assigned to promoter, genic or intergenic regions depending on the location of the nucleosome midpoint.
Two nucleosomes from the two cell types are fixed if their midpoints locate at the same position. The nucleosome is gain or loss if the distance between two midpoints is ⩾ 107 bp. That is, the two nucleosomes overlap less than 20 bp. The rest of nucleosomes shift upon differentiation.
Nucleosome distribution profile around TSS. The original composite distribution of nucleosome around TSS was calculated by aggregating nucleosomal read count at each distance relative to TSS as follows: each read represents a nucleosome by extending toward 3′ end to a length of 147 bp. The midpoint of extended read defines the nucleosome position. We summed total read counts at each site within ± 2 kb of TSS. The nucleosomal read count was further normalized by the uniquely mapped total reads. We further binned the nucleosome occupancy by a 10-bp interval of nucleosome distance to TSS, and smoothed it with 5-bin moving average and 1-bin step size.
Nucleosome organization in the regions − 0.3 kb to +1 kb of TSS was plotted as heat map by the tool seqMINER 41 without reads extension. Nucleosome positions predicted by GeneTrack was used as input with nucleosome width shrunk to 127 bp for better visualization. Heat map for nucleosomes with H3K36me3 modifications was drawn in the same manner. Heat map for RNA Pol II was drawn in the similar way except that its peaks predicted by MACS-1.4.2 (ref. 42) with default settings were used.
Chromatin state in promoters. HM ChIP-seq reads were aligned to Homo sapiens reference genome (hg19) as above nucleosomal reads. Similarly, only the uniquely mapped reads were retained for prediction of nucleosome with HM by GeneTrack using the same parameters as above nucleosome prediction. Occupancy of nucleosomes with HM were normalized as Reads Per Kilobase per Million mapped reads (RPKM). Nucleosomes with HM that had occupancy as RPKM41 were retained for following analysis. Promoters (±500 bp of TSS) were categorized to eight classes (H3K4me3+, H3K27me3+, H3K9ac+, H3K4me3+/H3K27me3+, H3K4me3 +/H3K9ac+, H3K9ac+/H3K27me3+, H3K4me3+/H3K27me3+/H3K9ac+, none) depending on the composition of nucleosomes with modifications located in promoters. For example, H3K4me3+ promoters contained only nucleosomes with H3K4me3, H3K4me3+/H3K27me3+ contained only both nucleosomes with H3K4me3 and nucleosomes with H3K27me3, and so on.
Change in HMs in promoters. Occupancy of a HM was measured as its read counts in promoters that were normalized as reads per kilobase per million mapped reads (RPKM). HM occupancy ratio of hNECs to hESCs was calculated, transformed to log2, and represented as heat map.
We further used ROC curve to identify which HM(s) were the best classifier for the significantly DEGs. In first case, upregulated DEGs were treated as actual positives, the rest were actual negatives. The fold changes of HM occupancy in promoters were calculated as above and sorted descendingly. Each fold change value was used as a threshold. For each threshold, the genes, whose fold changes of HM occupancy in promoters were greater than the threshold, were predictive positives. Otherwise, the genes are predictive negatives. Then, the true positive rate (TPR, that is, sensitivity) and false positive rate (FPR, that is, 1-specificity) were calculated for each threshold. The ROC curve was thus TPR as a function of FPR. The ROC curve for downregulated DEGs was plotted in the same manner except that downregulated DEGs were treated as actual positives, the rest were actual negatives.
Chromatin remodeling regions. The regions with significant increase in H3K9ac upon differentiation were detected by the tool MACS-1.4.2 (ref. 42) with a P-value cutoff of 10 − 5 using hNECs as treatment sample and hESCs as background. The regions with significant decrease in H3K27me3 upon differentiation were detected similarly using as hESCs treatment sample and hNECs as background. We retained the regions with H3K9ac or H3K27me3 occupancy 41 RPKM. We combined these regions as chromatin remodeling regions. De novo motif discovery in these regions was done by the tool MEME 43 as follows: the DNA sequences of CRRs were retrieved as input to MEME for motif finding with E-value cutoff of 0.05 and motif width of 6-16 bp. The other parameters used default values. To compare the predicted motifs to the known ones, the TOMTOM motif comparison tool (Version 4.11.3) from the MEME suite was used to search a database of vertebrates (in vivo and in silico) using the default parameters (E-valueo10).
Profiles of Sox2 binding and motif finding in Sox2 peaks. As above HM sequencing data analysis, only Sox2 reads uniquely mapped to Homo sapiens reference genome were retained for peak calling by the tool MACS-1.4.2 (ref. 42) with a P-value cutoff of 10 − 4 and fold change of 10. The de novo motif finding in the Sox2 peak sequences was done by Cistrome Analysis Pipeline 27 as follows: we chose the SeqPos motif tool in the section Integrative Analysis from the Cistrome toolbox. The coordinates of Sox2 peaks in BED format was the input file. The 'de novo motif search' motif database from the 'curated cistrome motif database' was used. The species 'Homo sapien or Mus musculus' was used. The rest of parameters were default. The tool HOMER was applied to find enriched motifs of co-factors. 44 Only the motifs with the highest alignments to known transcription factors, nonredundant matrixes and non-repetitive sequences were retained. Sox2 occupancy in peaks was normalized as RPKM. Browser tracks for Sox2 occupancy were generated using the tool IGV 2.3. 45 Accession numbers. The accession numbers for the MNase-seq data and the ChIP-seq data reported in this paper are GEO: GSE76084. The H2A.Z ChIPseq data was downloaded from GSM1003579. Pax6 binding peaks in hNECs was kindly provided by the author from the published work. 46 This work is licensed under a Creative Commons