A Core Regulatory Circuit in Glioblastoma Stem Cells Links MAPK Activation to a Transcriptional Program of Neural Stem Cell Identity

Glioblastoma, the most common primary malignant brain tumor, harbors a small population of tumor-initiating cells (glioblastoma stem cells) that have many properties similar to neural stem cells. To investigate common regulatory networks in both neural and glioblastoma stem cells, we subjected both cell types to in-vitro differentiation conditions and measured global gene-expression changes using gene expression microarrays. Analysis of enriched transcription factor DNA-binding sites in the promoters of differentially expressed genes was used to reconstruct regulatory networks involved in differentiation. Computational predictions, which were biochemically validated, show an extensive overlap of regulatory circuitry between cell types including a network centered on the transcription factor KLF4. We further demonstrate that EGR1, a transcription factor previously shown to be downstream of the MAPK pathway, regulates KLF4 expression and that KLF4 in turn transcriptionally activates NOTCH as well as SOX2. These results demonstrate how known genomic alterations in glioma that induce constitutive activation of MAPK are transcriptionally linked to master regulators essential for neural stem cell identity.

A number of different studies have used global gene expression changes and other genome-wide assays during NSC differentiation to identify master regulators and pathways 13,14. In the most comprehensive analysis of NSC cis-regulatory elements to date, Mateo et al. 15 combined DNase-seq with an analysis of histone modifications to study cis-regulatory elements (CREs) in neural stem cell biology, revealing a core network of transcription factor motifs including bHLH, NFI, SOX, and FOX. An in-depth analysis of the bHLH transcription factor Olig2 using ChIP-Seq confirmed the genomic analysis and computational predictions.
In a similar way, two recent efforts used global gene expression changes during GSC differentiation to identify 5 functionally important transcription factors: NOTCH1 16 , SOX2, SALL1, POU3F, and Olig2 17 that act to block differentiation in GSC. Carro et al. 18 also used global gene expression data from clinical glioma samples to infer the activity of master regulators STAT3 and CEBPB in the Mesenchymal subtype of glioblastoma and showed that these transcription factors could reprogram a glioma cell line towards a mesenchymal lineage 18 .
Given the importance of the proneural clinical subtype of glioblastoma, we asked whether differentiation-induced global gene expression changes in 5 GSC lines derived from 5 parental proneural tumors could reveal type-specific master regulators. Additionally, we sought to compare regulatory networks between proneural GSC and neural stem cells with the aim of elucidating how pathways affected by genomic alterations specific to glioma could act to induce constitutive activation of master regulators important for neural stem cell identity and self-renewal. Identification of a network of neural stem-related regulators in GSCs and how these regulators are connected to oncogenic pathways in glioma could potentially lead to new treatment strategies designed to induce differentiation in GSCs and halt tumor progression.

Results
Inferring Regulatory Networks During In-Vitro Differentiation. We developed a strategy to reconstruct regulatory networks in GSC and NSC from in-vitro differentiation experiments. Affymetrix microarrays were used to measure global gene expression changes in five different GSC lines and one embryonic murine neural stem cell line (E14) in NBE media (day 0) and after a switch to FBS/RA media (day 3 and day 10). Markers of differentiation (GFAP) and the undifferentiated state (SOX2 and Nestin) showed changes across the 10 days of the experiment (Fig. 1).
To predict which transcription factors were active during differentiation, we next examined promoter regions of differentially expressed genes for the presence of known DNA-binding motifs of transcription factors (Fig. 2). Significant enrichment of these motifs over what would be expected by chance was used to infer activity of the cognate transcription factors. Results show a substantial overlap of predicted transcription factor activity between the 5 GSC lines and the NSC E14 during differentiation. Common transcription factors include those known to be important in NSC biology such as SOX2, KLF4, EGR1, HES1 (activated by NOTCH), and OLIG1/2. Further analysis of 3 neural stem cell lines derived from primary human brain tissue agreed with the findings from NSC E14 in the identification of a core set of transcription factor binding motifs representing SOX, EGR1, HES1, and KLF4 ( Fig. 3) 19 . The transcription factor motif analysis results also overlapped with the results of a recent computational analysis of NSC cis regulatory elements (Mateo et al. 15 ) including SOX, FOX, E2F, ETS, and HLH families as well as CTCF, MAX, Tfap2a, TCF, Sp1, Rest, and Zic.
Transcription factors specific to the GSCs include known oncogenes such as STAT3 (downstream of PI3K) and SRF (downstream of RAS) as well as suspected oncogenes such as FOXD1. Transcription factors specific to the NSCs included known tumor suppressors such as P53 and suspected tumor suppressors such as YY1. Several key regulators, including SOX2, KLF4, EGR1, and HES1, were shown by western blot to exhibit protein expression in the nucleus that decreases with differentiation (Figs 1b and 2d). Established glioma cell lines U87 and U251, which are cultured in serum-containing media, showed minimal gene expression changes under the differentiation stimulus, consistent with their non-stem cell nature (Fig. 2b).
To gain insight into the regulation of signaling pathways during GSC differentiation, we performed a geneset enrichment analysis of canonical pathways curated from the Biocarta and KEGG databases (Fig. 2c). Both NSC and GSC show changes in pathways related to the differentiation stimulus (RA/FBS), including retinoic acid signaling, downregulation of G1/G2 pathways, consistent with cell-cycle arrest, and a number of genesets associated with receptor tyrosine kinase signaling, such as EGFR, PDGFR, MAPK, ERK, AKT, and MTOR. Importantly, both cell types show downregulation of pathways associated with NSCs, including NOTCH, WNT, and Neurotrophin signaling. In addition, GSCs show changes in regulation to pathways previously associated with glioma, including VEGF, NF-kB, and TGF-Beta.
To connect the inferred activity of transcription factors to changes in signaling pathways and other regulators, a regression model of significantly enriched motifs vs. differentially expressed genes was created. Briefly, an N × M matrix of transcription factor binding motif scores was used to fit a multiple regression model against the set of N gene expression values that were found previously to be differentially expressed during in-vitro differentiation. Motif-target gene interactions defined by the regression model were identified as significant using an empirical null distribution created by random permutation of both the motif scores and target genes in the model (see Methods).
A Core Transcriptional Circuit Links EGR1 to KLF4, NOTCH, and SOX2. Given the importance of the transcription factor KLF4 in NSC biology, we examined the resulting network of TF-Target-Gene interactions for both upstream and downstream regulators of KLF4 (Fig. 4). Upstream transcriptional activators of KLF4 predicted by the model included both STAT3 (a transcription factor previously shown to control KLF4 expression in embryonic stem cells) and EGR1 (a transcription factor and so-called immediate early gene activated downstream of RAS/MAPK). The large decrease in EGR1 levels during GSC differentiation shown by both microarray (Fig. 4b) and by western blot (Fig. 2d) and the significant positive correlation between KLF4 and EGR1 gene expression levels in clinical GBM samples led us to examine the relationship between EGR1 and KLF4 protein levels. We constructed GSC827 and GSC923 cell lines stably expressing two different shRNAs directed against EGR1 and demonstrated significantly reduced levels of KLF4 (Fig. 5c). Binding of EGR1 to the upstream promoter region of KLF4 (−1 kb to 0 kb relative to the TSS) was confirmed by ChIP-PCR of DNA purified from nuclear lysate in GSC827 incubated with an anti-EGR1 antibody (Fig. 5b). In addition, overexpression of EGR1 in U87 cells augmented the expression of KLF4 and vice versa (Supplementary Figure 2). No changes in neural stem cell markers were observed in U87, consistent with its non-stem cell nature.
Scientific Reports | 7:43605 | DOI: 10.1038/srep43605
RAS/MAPK activation in glioma has been shown to result both from constitutive activation of receptor tyrosine kinases EGFR, PDGFRA, and c-MET as well as by an inactivating mutation of NF1, a suppressor of RAS GTPase activity. Given the well-established relationship between MAPK activation and EGR1 expression, we hypothesized that the transcriptional regulation of KLF4 by EGR1 in GSCs could act as a conduit through which known genomic alterations in glioma could be linked to the activation of master regulators of NSC identity and self-renewal. To test this hypothesis, we next examined the regulatory relationship in GSCs between KLF4 and its predicted target genes. ChIP-PCR of DNA extracted from nuclear lysate by an antibody against KLF4 was used to identify promoters of genes bound by KLF4 in GSCs (Fig. 5a). Transcriptional activation of SOX2, DLL1 (Notch ligand), and NOTCH1 predicted by the regression model was consistent with ChIP-PCR results in-vitro. Overexpression of KLF4 in GSCs also increased SOX2, DLL1, and NOTCH1 protein levels in-vitro as demonstrated by western blot (Fig. 5d). First, we demonstrated that knockdown of KLF4 by shRNA reduced the phosphorylation of ERK (MAPK) by western blot (Fig. 5c). Since upstream regulators of RAS/MAPK activation including EGFR, PDGFRA, and HRAS were down-regulated following differentiation (Fig. 4b), we explored the regulatory relationship between KLF4 and the expression of these potential target genes. Binding of KLF4 to the upstream promoters (−1 kb to 0 kb relative to the TSS) of EGFR, HRAS, PDGFRA, and MET was confirmed by ChIP-PCR in GSC923 (Fig. 5a). However, overexpression of KLF4 induced only the upregulation of HRAS by western blot (Fig. 5d).

Genome-Wide Binding of KLF4 in Glioblastoma Stem Cells. KLF4 is known to play a major role in determining pluripotency in embryonic stem cells and was identified as one of the four original Yamanaka transcription factors. However, transcription factor binding can vary based on cell lineage and disease state. To study KLF4 binding in GSC, we performed a ChIP-Seq experiment to determine genome-wide binding. Out of 1528 genes with promoter binding sites identified, 246 overlap with KLF4 sites in human embryonic stem cells (hESC) (Fig. 6b). A functional analysis of these peaks was performed using the annotation software GREAT 20 (Fig. 6c). Peaks common to both GSC and hESC showed enrichment for pathways involved in neurogenesis including Regulation of Nervous System Development, Neuron Migration, and Positive Regulation of Neurogenesis. Pathways enriched specifically in GSC included terms relevant to glioma including Activation of Kinase Activity, Negative Regulation of Apoptosis, and several pathways involved in the positive regulation of angiogenesis. Figure 6a shows KLF4 binding peaks near the promoter regions of the genes SOX2, NOTCH1, DLL1, and HRAS.

Increased KLF4 expression blocks GSC differentiation in-vitro and is associated with increased tumor aggressiveness in-vivo.
To test the effect of KLF4 on GSC differentiation, we next stably overexpressed KLF4 in GSC lines 923 and 1228 and subjected GSCs to the differentiation stimulus (RA/FBS) for 12 days (Fig. 7a). Compared to the control (empty vector), KLF4 overexpression induced a blockade of differentiation, as seen by lower intensity staining of GFAP (marker of differentiation) and higher intensity staining of nestin (marker of the undifferentiated state) after incubation with RA/FBS for 10 days. These results were confirmed by western blot using an inducible Flag-tagged KLF4 expression system (Fig. 7b).
Interrogation of gene-expression microarray data for 49 clinical GBM samples with the same molecular subtype as the 5 GSCs used in the current study (Proneural GCIMP-) revealed that KLF4 expression significantly predicted survival (Fig. 8a). KLF4 expression levels correlated negatively with survival using the Pearson correlation coefficient (n = 49, r = −0.34, p ≤ 0.01). A Kaplan-Meier model was fit to compare high- and low-KLF4-expressing tumors and showed a significant effect on survival (p < 0.002, n = 28). In agreement with our observations in-vitro, EGR1 expression levels in these clinical samples were significantly correlated with KLF4 expression levels (n = 49, r = 0.32, p < 0.01), as they were across all clinical subtypes (n = 256, r = 0.34, p < 0.001).
Next, we examined pathway activation in high vs. low KLF4 expressing samples (n = 28) using 186 KEGG genesets from the GSEA collection. In support of our in-vitro findings, samples with the highest levels of KLF4 showed upregulation of MAPK (p < 0.001, FDR < 0.01) and NOTCH (p < 0.05, FDR < 0.1) genesets compared to the lowest KLF4 expressors. Further, the highest KLF4 expressors showed downregulation of CELL_CYCLE (p < 0.001, FDR < 0.01) and DNA_REPLICATION (p < 0.001, FDR < 0.01) genesets, consistent with a partial reduction in cellular proliferation and increase in anchorage-independent growth seen with in-vitro KLF4 overexpression (Fig. 7c,d). These results suggest a similar effect of increased KLF4 levels to what has previously been observed in multiple myeloma, in which higher KLF4 levels reduce proliferation in-vitro, but produce tumors that are associated with decreased patient survival in-vivo due to increased resistance to apoptosis by alkylating chemotherapy agents 21 . The role of KLF4 in this process is further supported by our ChIP-Seq data that indicate binding near the promoters of genes in GSC involved in the negative regulation of apoptosis (Fig. 6c).
A difference in MAPK activation between the two groups of clinical samples suggests that genomic alterations to genes in the RTK/Ras signaling axis upstream of MAPK might vary between the two groups. Indeed, we observed that 90% of high KLF4 expressing tumors in the GCIMP-Proneural subtype showed alterations in RTK/Ras genes (EGFR, PDGFRA, MET, NF1, RAS) compared to only 50% in the low KLF4 expressing tumors (p < 0.05, two-tailed Fisher's exact test). Differences between the 2 groups of clinical samples for genomic alterations in the PI3K/AKT signaling axis (AKT1, AKT2, AKT3, PIK3R1, PIK3CA, PTEN) were not statistically significant (Fig. 9).
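To make the comparison of alteration frequencies concrete, the following is a minimal Python sketch of a two-tailed Fisher's exact test on a 2 × 2 table. It is not the code used in this study. The function name and the counts are hypothetical placeholders chosen only to mirror the reported proportions (90% vs. 50%); because the actual group sizes are not given here, the resulting p-value should not be compared with the reported one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    # Integer comparison avoids floating-point ties between equally likely tables.
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= observed)
    return extreme / comb(total, col1)

# Hypothetical counts (NOT from the study): altered vs. unaltered RTK/Ras genes.
high_klf4 = (9, 1)   # 9 of 10 tumors altered (90%)
low_klf4 = (5, 5)    # 5 of 10 tumors altered (50%)
p_value = fisher_exact_two_sided(*high_klf4, *low_klf4)
print(f"two-sided p = {p_value:.4f}")
```

The same function can be applied to the PI3K/AKT alteration counts once the per-group numbers are available.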

Conclusion
By conducting a computational analysis of global gene expression changes in GSC and NSC, we have shown that both cell types show a substantial overlap in regulatory circuitry during differentiation. Our data identify both previously characterized regulators in NSC/GSC biology such as SOX2, OLIG2, DLL, NOTCH, and HES1, and reveal a new role for the transcription factor KLF4 in GSC differentiation. As reported by Qin et al. 22 , overexpression of KLF4 in neural stem cells both blocks in-vitro differentiation and reduces proliferation 22 . We have observed an identical effect of KLF4 overexpression in GSC. Reduction of proliferation and increase of anchorage-independent growth is consistent with the promotion of stem-like phenotype, and we suggest that the role of KLF4 in glioblastoma mirrors that in Multiple Myeloma, where increased KLF4 levels are associated with an overall reduction of proliferation, but greater resistance to alkylating chemotherapy agents 21 .
Our data support a model in which KLF4 is activated in GSC downstream of the MAPK pathway in part through EGR1, and we further demonstrate that EGR1 acts as a transcriptional regulator of KLF4 in GSCs in-vitro, and its expression level positively correlates with KLF4 in clinical GBM tumor samples. KLF4 acts as a  raises the possibility that KLF4 may also act to indirectly suppress PTEN activity, a known deletion target in approximately forty percent of human GBMs. The positive regulation of the PI3K/AKT signaling axis through suppression of PTEN by HES transcription factors has also been documented in other cancers such as T-ALL 23 . This relationship may be the focus of future investigation.
Given that both KLF4 and EGR1 play roles in the differentiation of normal neural progenitors 21,24 , what might allow them to promote a persistent undifferentiated state in glioblastoma stem cells? Both transcription factors have been shown to function as tumor suppressors or oncogenes depending on cellular context 25 . The activity of KLF4 is normally constrained in part by the activation of p21, which induces cell-cycle arrest 26 . The repression of p21 by MYC, previously demonstrated in GSCs, may allow KLF4 to escape this restriction. In a similar way, EGR1 transcriptionally activates both PTEN and P53, and these two tumor suppressors act to limit its oncogenic capacity. The genomic loss of PTEN/P53 activity, frequently seen in glioblastoma, removes these critical negative feedback points (including p21 activity), allowing EGR1 to act potentially without restraint on downstream target genes 26 . Calogero et al. 27 reported that EGR1 levels are suppressed in glioblastoma (n = 31) compared to normal brain tissue, whereas Mittelbronn et al. 28 found that EGR1 was upregulated in all grades of astrocytomas compared to normal glial cells in situ. Using a larger number of samples (n = 256), we show that EGR1 mRNA levels vary greatly between different tumors and are highly expressed in many samples.
Future work may define how KLF4 is involved in other aspects of GBM biology such as its cerebral invasive nature, as suggested by a previous study that showed EGR1 transcriptionally regulates KLF4 to induce in-vitro cell-scattering, in part, by suppressing the expression of E-Cadherin in 2 different established cell-lines 29 .

Methods
Tissue Culture. Fifteen-cm tissue culture dishes were coated with poly-ornithine (Sigma P4957) for 1 hr at 37 °C and washed 3 times with PBS. Cells were plated at 1E6 cells per dish in NBE media (Tumor stem cells derived from glioblastomas cultured in bFGF and EGF more closely mirror the phenotype and genotype of primary tumors than do serum-cultured cell lines). After 3 to 5 days in NBE media, when cells reached 70% confluence, cells were either processed for RNA collection as the control sample at Day 0, or the media was changed to DMEM (Invitrogen)/5% FBS (Invitrogen)/2 μM all-trans retinoic acid, RA (Sigma R2625). Cells treated with RA were processed on Day 3 and Day 10. RNA was collected using the Qiagen RNeasy Kit (74106). Cells were lysed directly on the dishes and total RNA was isolated following the manufacturer's protocol. Media was changed approximately every 2-3 days to maintain a constant dose of RA. All experiments were done in triplicate.
Total RNA extraction. RNA was prepared using TRIzol® reagent and PureLink® RNA Mini Kit (Life Technologies, Carlsbad, CA). RNA quantity was determined using the NanoDrop® ND-8000 spectrophotometer and quality was assessed using the Agilent Bioanalyzer 2100 system.
Affymetrix GeneChip® Human Genome U133 Plus 2.0 Array. 500 ng of total RNA was prepared for hybridization using the Affymetrix 3′ IVT Express Kit. Arrays were processed following manufacturer's recommendations using the Affymetrix GeneChip® Hybridization Oven 640, Fluidics Station 450 and Scanner 3000 7G.
DNA extraction. High molecular weight genomic DNA was prepared using a QIAamp DNA Kit (Qiagen, Valencia, CA) following the manufacturer's instructions. DNA quantity was determined using the NanoDrop® ND-8000 spectrophotometer and quality was assessed by electrophoresis on a 2% agarose gel.
Analysis of Gene-Expression Microarray Data. Affymetrix U133 Plus 2.0 CEL files were processed using the MAS5 algorithm and probesets were converted to Refseq Transcript IDs using a custom Chip Description File (CDF). Differentially expressed genes were computed for both 0-3 days and 0-10 days for each cell-line using an independent two-tailed Student's t-test. P-values were corrected using the method of Benjamini and Hochberg for conversion to a False Discovery Rate. For the 5 cell-lines that showed evidence of differentiation based on protein markers, a two-way analysis of variance was used to determine common genes that were changing significantly across days 0, 3, and 10. ANOVA p-values were corrected using the method of Benjamini and Hochberg.
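As an illustration of the Benjamini and Hochberg correction used throughout these analyses, a minimal Python sketch of the step-up adjustment is given below. This is not the code used in this study; the function name and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four genes (illustrative only).
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
print(fdr)
```

Genes would then be called differentially expressed when their adjusted value falls below the chosen FDR threshold.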
Transcription Factor DNA-Binding Motif Enrichment Analysis. Transcription factor DNA-binding motifs were taken from the SwissRegulon database (version 3). SwissRegulon motifs were originally curated from both the TRANSFAC and JASPAR public motif databases as well as from selected publicly available ChIP-Chip and ChIP-Seq data sources. The SwissRegulon database has the additional advantage that motifs are filtered by evolutionary conservation among 7 placental mammals using the MotEvo algorithm. Evolutionary filtering increases the likelihood that any given motif shows functional binding to its cognate transcription factor. In addition, the use of evolutionarily conserved motifs makes the comparison of regulatory networks between species (Human and Mouse) less problematic, as regulatory binding motifs can experience evolutionary drift. Motifs associated with each Refseq transcript (Hg18) were selected that occurred between −500 and +100 relative to the transcription start-site of each gene as annotated in the Refgene.txt file from the UCSC genome browser database. Although important regulatory binding events do occur outside the proximal promoter region of each gene, this region was chosen as a reasonable compromise between sensitivity and selectivity to represent the relationship between transcription factor binding and gene-expression, an approach that was previously used in the successful reconstruction of regulatory networks in a myeloid leukemia cell-line 30 .
Motif enrichment for the set of differentially expressed genes was computed by first creating a motif score for the promoter of each Refseq transcript. This score was created as the sum of the posterior probabilities for each motif for each gene and reflects both the number of motifs in each promoter as well as their evolutionary conservation. A composite score for each motif type in the set of differentially expressed genes was then computed from these per-promoter scores, where n is the total number of differentially expressed genes, m is the number of motifs in the promoter of each gene (−500 to +100 relative to the TSS) and s is the evolutionarily conserved motif score.
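The composite score formula itself is not reproduced in this text. One form consistent with the definitions of n, m, and s above, offered only as an assumption, is a double sum of the conserved motif scores over all motif occurrences in all differentially expressed promoters. The subscripts i and j and the symbol S are introduced here for illustration, and whether the original score was additionally normalized (for example, by n) is not stated.

```latex
% Assumed form of the composite motif score (reconstruction, not the authors' formula):
%   n      = number of differentially expressed genes
%   m_i    = number of occurrences of the motif in the promoter of gene i
%   s_{ij} = evolutionarily conserved score (posterior probability) of occurrence j in gene i
S_{\mathrm{motif}} = \sum_{i=1}^{n} \sum_{j=1}^{m_i} s_{ij}
```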
Significance of the composite score for each motif was computed by permutation. 10,000 same-sized samples of genes were chosen randomly from the dataset and motif enrichment scores were computed for each randomly chosen set of genes. P-values were computed from the empirical cumulative distribution function (ECDF) of this null distribution and converted to a false discovery rate using the method of Benjamini and Hochberg.
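The permutation procedure can be sketched in Python as follows. This is a simplified illustration rather than the code used in this study: the function names, the dictionary of per-gene motif scores, and the add-one adjustment that keeps the empirical p-value above zero are all assumptions made for the example.

```python
import random

def composite_score(gene_set, motif_scores):
    """Sum the per-promoter motif scores over a set of genes (0 if absent)."""
    return sum(motif_scores.get(gene, 0.0) for gene in gene_set)

def permutation_p_value(gene_set, motif_scores, all_genes, n_perm=10000, seed=0):
    """Empirical upper-tail p-value for the composite score of gene_set.

    The null distribution is built from n_perm random gene sets of the same
    size drawn from all_genes (a list of every gene in the dataset).
    """
    rng = random.Random(seed)
    observed = composite_score(gene_set, motif_scores)
    size = len(gene_set)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_genes, size)
        if composite_score(sample, motif_scores) >= observed:
            hits += 1
    # Add-one adjustment (an assumption here) avoids reporting p = 0.
    return (hits + 1) / (n_perm + 1)
```

In practice one p-value would be computed per motif and the resulting set corrected with the Benjamini and Hochberg method.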
Geneset Enrichment Analysis. For both the GSCs and the murine NSC line E14, the F-test statistic from a two-way ANOVA for the effect of treatment day (0, 3, 10) was used to rank genes from the microarray. Genesets representing canonical pathways (1452) were taken from the C2 database from the GSEA website (Broad Institute). The geneSetTest function of the limma R package was then used to compute a p-value for the mean rank of each geneset, which uses the Wilcoxon two-sample rank test according to the mean-rank gene-set enrichment method developed by Michaud et al. 31 . P-values were corrected for multiple hypothesis testing by the method of Benjamini and Hochberg.
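The idea behind a mean-rank geneset test can be sketched in Python with a Wilcoxon rank-sum statistic and a normal approximation. This is a simplified stand-in and not the limma implementation: the function names are hypothetical, only an upper-tail p-value is returned, and no tie or continuity correction is applied to the variance.

```python
import math

def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mean_rank_test(gene_stats, in_set):
    """Upper-tail p-value that genes flagged in in_set rank higher than the rest.

    gene_stats: one statistic per gene (e.g. an ANOVA F statistic).
    in_set: booleans of the same length marking geneset membership.
    Requires at least one gene inside and one gene outside the set.
    """
    n = len(gene_stats)
    n_in = sum(in_set)
    n_out = n - n_in
    ranks = average_ranks(gene_stats)
    rank_sum = sum(r for r, flag in zip(ranks, in_set) if flag)
    u = rank_sum - n_in * (n_in + 1) / 2
    mean_u = n_in * n_out / 2
    sd_u = math.sqrt(n_in * n_out * (n + 1) / 12)
    z = (u - mean_u) / sd_u
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))
```

A geneset whose members are concentrated among the largest statistics yields a small p-value, while a geneset scattered uniformly through the ranking yields a p-value near 0.5.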
Gene Regulatory Network Reconstruction. The relationship between transcription-factor binding motifs and gene-expression changes was used to reconstruct a gene-regulatory network. To accomplish this goal, a multivariate regression model was fit between TF DNA-binding motifs and differentially expressed genes using the Random Forest algorithm (randomForest R package). Random Forest implements an ensemble of regression trees, and has the advantage over multiple linear regression that it can model the multiplicative variable interactions and non-linear relationships that can occur between transcription factors 32 .
Links between TF DNA-binding motifs and target genes were established using the case-wise Variable Importance metric from Random Forest. This metric measures the drop in the performance (measured by mean-squared-error) of the regression model when individual input variables (TF DNA-binding motifs) are permuted. A significance threshold for motif and target gene relationships was established by randomly permuting the input variables and output variables and fitting a model to establish a null distribution of Variable Importance values that would be expected by chance. P-values were generated using the empirical cumulative distribution function (ECDF) of this null distribution and corrected for multiple hypothesis testing using the method of Benjamini and Hochberg.
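The logic of a permutation-based importance measure can be illustrated with a short, model-agnostic Python sketch. This is only an analogue of the Random Forest metric: it permutes a column on the full dataset rather than on out-of-bag samples, and the function names, the toy predictor, and the toy data are hypothetical.

```python
import random

def mean_squared_error(predict, rows, targets):
    """Mean squared error of predict(row) against the observed targets."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(predict, rows, targets, column, n_repeats=20, seed=0):
    """Mean increase in MSE when the values of one input column are shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with only the chosen column replaced.
        permuted = [r[:column] + [v] + r[column + 1:]
                    for r, v in zip(rows, shuffled)]
        increases.append(mean_squared_error(predict, permuted, targets) - baseline)
    return sum(increases) / n_repeats

# Toy data (hypothetical): column 0 drives the response, column 1 is irrelevant.
toy_rows = [[float(i), float((i * 7) % 5)] for i in range(30)]
toy_targets = [2.0 * r[0] for r in toy_rows]

def toy_predict(row):
    return 2.0 * row[0]

print(permutation_importance(toy_predict, toy_rows, toy_targets, column=0))
print(permutation_importance(toy_predict, toy_rows, toy_targets, column=1))
```

A null distribution of importance values can be obtained in the same spirit by refitting the model after shuffling the response, as described above.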
Mean of Squared Residuals for Random Forest:

$$\mathrm{MSE}_{\mathrm{OOB}} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i^{\mathrm{OOB}} \right)^2$$

where $n$ is the number of observations, $y_i$ is the observed response, and $\hat{y}_i^{\mathrm{OOB}}$ is the mean of the Out of Bag predictions for the ith observation. The percent variance explained by the Random Forest model is then:

$$100 \times \left( 1 - \frac{\mathrm{MSE}_{\mathrm{OOB}}}{\hat{\sigma}_y^2} \right)$$

where $\hat{\sigma}_y^2$ is the variance of the observed response values.
Limiting Dilution and Proliferation Assay. In vitro NBE-cultured GSCs were dissociated into single-cell suspensions. For the limiting dilution assay, cells were plated into 96-well plates at various seeding densities (2, 5, and 10 cells per well) and were incubated at 37 °C for 1-2 weeks. Each well was examined for the formation of tumor spheres. For the proliferation assay, cells were plated into 6-well plates with 2 × 10^5 cells per well and incubated at 37 °C for 2 and 4 days. At the time of quantification, cells in each well were counted using Vi-CELL XR (BCM).
Western Blot. Subcellular fractions were prepared with the ProteoExtract subcellular proteome extraction kit (Calbiochem) following the manufacturer's instructions. Nuclear and total soluble cell-fraction lysates were quantified using the Pierce BCA protein assay (Thermo Scientific). Proteins were separated by SDS-PAGE on a NuPAGE mini gel (Invitrogen). Proteins were transferred to a PVDF membrane and blocked with 5% nonfat dry milk or Bovine Serum Albumin (Sigma) in TBST for 1 hr at RT; primary antibody was then added and incubated O/N at 4 °C. The blot was then washed 3X in TBST, incubated in 5% nonfat dry milk in TBST with secondary HRP-conjugated antibody for 1 hr at RT, washed 3X in TBST, developed with SuperSignal West Pico chemiluminescence kit (Thermo Scientific) and exposed to Biomax MR film (Kodak). Western blots were performed with the following antibodies: anti-beta actin (Sigma), anti-EGR1 and anti-Hes1 (Santa Cruz), anti-GFAP and anti-Histone H3, anti-KLF4, anti-SOX2 (R&D), anti-nestin (Covance), anti-beta-tubulin (Sigma).
ChIP-Seq of KLF4. Following the Active Motif ChIP-IT Express kit with modifications, cells were seeded on 15 cm polyornithine-coated plates. When they reached 70-80% confluency (approximately twelve million cells/plate), cells were fixed with a formaldehyde solution for 8 min at 37 °C, reactions were stopped with glycine solution for 5 min at RT, and cells were harvested by scraping plates on ice with cold PBS and collecting cell pellets by centrifugation. Cells were sonicated in Active Motif shearing buffer with fresh protease inhibitors and PMSF using the following Covaris SonoLab 7.1 protocol: peak incident power 240, duty factor 20, cycles/burst 200, duration 300 sec. To assess the efficiency of DNA shearing and determine initial DNA concentration, an aliquot of sheared chromatin was subjected to crosslink reversal and treated with proteinase K. DNA was purified by phenol/chloroform extraction and EtOH precipitation. An aliquot of the DNA was separated by electrophoresis through a 1.5% agarose gel to determine the shearing efficiency. For the chromatin immunoprecipitation reaction, 100 μg sheared chromatin, 10 μg biotinylated anti-human KLF4 antibody (or biotinylated normal IgG) (R&D), and 40 μl streptavidin magnetic beads (R&D) were mixed in Active Motif ChIP buffer 1. Samples were incubated with end-over-end shaking for 24 hr at 4 °C, beads were washed 3X, and chromatin was eluted, subjected to crosslink reversal and treated with proteinase K. DNA was purified by phenol/chloroform extraction and EtOH precipitation. Finally, DNA pellets were resuspended in 45 μl of TE, quantified using High Sensitivity DNA Bioanalyzer chips (Agilent Technologies), and analyzed by DNA sequencing.
ChIP-PCR. After the chromatin immunoprecipitation reaction, 5 μl of each DNA sample was added per well in a 96-well plate for RT-qPCR. Each sample was run in triplicate and a standard curve was prepared for each primer pair using input DNA. Input DNA refers to the purified DNA after sonication, but before the ChIP reaction. EpiTect ChIP qPCR primers for specific promoters were purchased from Qiagen. Fast SYBR green master mix was from ABI. Fold enrichment was calculated by finding the slope of the standard curve, solving for the DNA quantity of each sample, and finally determining the fold enrichment of the ChIP sample relative to the IgG sample.
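The fold-enrichment calculation described above can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not the analysis code used in this study: the standard curve is assumed to be a least-squares line of Ct against log10 of input DNA quantity, triplicate Ct values are averaged before conversion, and all function names are hypothetical.

```python
import math

def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def quantity_from_ct(ct, slope, intercept):
    """Solve the standard curve for the DNA quantity at a given Ct."""
    return 10 ** ((ct - intercept) / slope)

def fold_enrichment(chip_cts, igg_cts, slope, intercept):
    """Ratio of ChIP to IgG DNA quantity, using the mean Ct of each triplicate."""
    chip_ct = sum(chip_cts) / len(chip_cts)
    igg_ct = sum(igg_cts) / len(igg_cts)
    return (quantity_from_ct(chip_ct, slope, intercept)
            / quantity_from_ct(igg_ct, slope, intercept))
```

Because a lower Ct corresponds to more template, a ChIP sample that amplifies earlier than its IgG control gives a fold enrichment greater than one.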