A G-quadruplex structure at the 5′ end of the H19 coding region regulates H19 transcription

The H19 gene, one of the best known imprinted genes, encodes a long non-coding RNA that regulates cell proliferation and differentiation. H19 RNA is widely expressed in embryonic tissues, but its expression is restricted to only a few tissues after birth. However, regulation of H19 gene expression remains poorly understood outside the context of genomic imprinting. Here we identified evolutionarily conserved guanine (G)-rich repeated motifs at the 5′ end of the H19 coding region that are consistent with theoretically deduced G-quadruplex sequences. Circular dichroism spectroscopy and electrophoretic mobility shift assays with G-quadruplex-specific ligands revealed that the G-rich motif, located immediately downstream of the transcription start site (TSS), forms a G-quadruplex structure in vitro. By using a series of mutant forms of H19 harboring deletions or G-to-A substitutions, we found that the H19 G-quadruplex regulates H19 gene expression. We further showed that the transcription factors Sp1 and E2F1 associate with the H19 G-quadruplex to suppress or promote H19 transcription, respectively. Moreover, H19 expression during differentiation of mouse embryonic stem cells appears to be regulated by a genomic H19 G-quadruplex. These results demonstrate that the G-quadruplex structure immediately downstream of the TSS functions as a novel regulatory element for H19 gene expression.


Statistical analysis.
All data are representative of at least three independent experiments. P values were calculated by applying Dunnett's multiple-comparison test or two-tailed t-test. Data are presented as the mean ± standard error of the mean.
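As an illustration of the summary statistics named above, the following is a minimal sketch of how a mean ± standard error of the mean and an unpaired t statistic can be computed. The function names and the replicate values are hypothetical, and the choice of an unpaired, equal-variance (Student's) test is an assumption, since the text does not specify which form of the t-test was used.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for replicate values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate the SEM")
    mean = statistics.mean(values)
    # SEM = sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


def student_t(a, b):
    """Return (t statistic, degrees of freedom) for an unpaired,
    equal-variance two-sample t-test (an assumed test form)."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb)
    )
    return t, na + nb - 2


# Hypothetical relative expression values from three independent experiments.
control = [1.0, 1.1, 0.9]
mutant = [2.0, 2.2, 1.8]

for name, values in (("control", control), ("mutant", mutant)):
    mean, sem = mean_sem(values)
    print(f"{name}: {mean:.2f} ± {sem:.2f}")

t, df = student_t(mutant, control)
print(f"t = {t:.2f}, df = {df}")
```

Converting the t statistic into a two-tailed P value requires the t distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) can be used for that step.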

Results
The G-rich motifs located immediately downstream of the H19 TSS form a G-quadruplex structure. H19 harbors G-rich sequences immediately downstream of the TSS, which are conserved in mammalian species (Fig. 1A). The G-rich sequence in the region between + 14 and + 39 of mouse H19 displays a G-score of 84, calculated by QGRS Mapper (Max length: 30, Min G-group: 2, Loop size: 0 to 36), software that provides information on the composition of putative G-quadruplex-forming G-rich sequences 50, suggesting that this region forms a G-quadruplex structure. To assess this possibility, we prepared single-stranded oligonucleotides corresponding to the regions + 1 to + 78, + 12 to + 42, + 42 to + 74 and + 1 to + 27 of mouse H19, and analyzed them by circular dichroism (CD) spectroscopy. The CD spectrum of the oligonucleotide for the region + 1 to + 78 was characteristic of parallel G-quadruplex structures 51  ) support the notion that WT-(+ 1 to + 78) and Mut3-(+ 1 to + 78), but not Mut1-(+ 1 to + 78) or Mut2-(+ 1 to + 78), form G-quadruplex structures. In addition, the oligonucleotide for the region + 12 to + 42 [Figs 1B and 2D, WT-(+ 12 to + 42)] exhibited a spectrum characteristic of parallel G-quadruplex structures. In contrast, the oligonucleotide for the region + 42 to + 74, as well as the mutant oligonucleotides with G-to-A substitutions in this region, barely formed G-quadruplex structures [Figs 1B and 2D, WT-(+ 42 to + 74) and Mut7-(42 to + 72)]. The oligonucleotide for the region + 1 to + 27 partially exhibited a spectrum characteristic of parallel G-quadruplex structures [Figs 1B and 2D, WT-(+ 1 to + 27)]. These results suggest that the region between + 12 and + 42 within the H19 gene forms a DNA G-quadruplex structure.
The region between + 1 and + 27 is capable of forming a G-quadruplex structure in the short oligonucleotides, but the region between + 1 and + 12 is dispensable for G-quadruplex structure formation in the longer oligonucleotides.
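To make the idea of a sequence-based G-quadruplex prediction and its disruption by G-to-A substitution concrete, the following is a minimal sketch of a motif search for four G-tracts separated by three loops within a maximum length. It is not the QGRS Mapper algorithm and does not compute a G-score; the function names, the fixed tract length, and the loop bound derived from the maximum length are all assumptions made for illustration. The example sequence is the human telomeric repeat, used as a well-known stand-in rather than the H19 sequence.

```python
import re


def find_putative_g4(seq, tract=2, max_len=30):
    """Return (start, motif) pairs for four G-tracts of length `tract`
    separated by three loops, with total length at most `max_len`.
    Simplified illustration only; not the QGRS Mapper scoring scheme."""
    seq = seq.upper()
    max_loop = max_len - 4 * tract  # loops cannot exceed the remaining length
    if max_loop < 0:
        raise ValueError("max_len is too short for four G-tracts")
    unit = "G{%d}" % tract
    loop = "[ACGT]{0,%d}?" % max_loop
    # The lookbehind reports one hit per G-run; the lookahead lets the
    # search examine every start position without consuming characters.
    pattern = re.compile(
        "(?<!G)(?=(%s))" % (unit + loop + unit + loop + unit + loop + unit)
    )
    return [
        (m.start(), m.group(1))
        for m in pattern.finditer(seq)
        if len(m.group(1)) <= max_len
    ]


def g_to_a(seq, positions):
    """Return `seq` with G replaced by A at the given 0-based positions."""
    chars = list(seq.upper())
    for i in positions:
        if chars[i] != "G":
            raise ValueError("position %d is not a G" % i)
        chars[i] = "A"
    return "".join(chars)


# Stand-in sequence (human telomeric repeat), not the H19 sequence.
wild_type = "TTAGGGTTAGGGTTAGGGTTAGGG"
# Hypothetical mutant: the second G-tract is replaced by A residues.
mutant = g_to_a(wild_type, [9, 10, 11])

print(find_putative_g4(wild_type))
print(find_putative_g4(mutant))
```

In this toy case the wild-type sequence yields a candidate motif, whereas the mutant with one G-tract substituted does not, which mirrors the logic of using G-to-A substitution mutants as negative controls for G-quadruplex formation.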
The H19 G-quadruplex regulates H19 gene transcription. To determine whether the G-quadruplex structure at the H19 TSS regulates H19 gene transcription, we constructed a series of plasmids encoding mouse WT-H19, Mut2-G4-H19, Mut3-G4-H19, or ∆ G4-H19, in which the + 1 to + 56 region of H19 was deleted, under the control of the EF1α promoter (Fig. 4A). 293T cells were transfected with each plasmid and the expression levels of H19 RNA were analyzed by qRT-PCR. PCR analysis confirmed comparable transfection efficiency of the plasmids (Fig. 4C). The results showed that the expression levels of H19 RNA were significantly higher in the cells transfected with ∆ G4-H19 and Mut2-G4-H19 than in the WT-H19-transfected cells (Fig. 4B) at various plasmid concentrations (Supplementary Figure 2). Mut3-G4-H19 exhibited a similar level of H19 RNA to WT-H19 (Fig. 4B), suggesting that the G-rich sequences within the region between + 43 and + 78 are dispensable for regulating H19 gene transcription. Next, we constructed luciferase assay vectors, in which the H19 promoter element together with the G-quadruplex sequence (− 840 to + 84) (H19 pro-G4-Luc) or the H19 promoter element alone (− 840 to + 14) (H19 pro-Luc) was fused to the luciferase-coding sequence (Fig. 4D). We transfected these vectors into 293T or EpH4 cells, and found that luciferase activity was much higher in the cells transfected with H19 pro-Luc than in the H19 pro-G4-Luc or control plasmid-transfected cells (Fig. 4E and F). These results indicate that the H19 G-quadruplex sequence in the region between + 1 and + 42 functions to suppress H19 gene transcription.
Scientific Reports | 7:45815 | DOI: 10.1038/srep45815
Identification of proteins associated with the H19 G-quadruplex. To gain insight into the molecular mechanisms, we next determined the H19 G-quadruplex-associated proteins. To this end, we performed a pull-down assay using biotinylated WT- and Mut2-oligonucleotides. As the region between + 43 and + 78 was dispensable for suppressing H19 gene transcription (see Fig. 4B), we used biotinylated WT- and Mut2-oligonucleotides for the region between + 1 and + 43 of H19 (Fig. 5A, bio-WT and bio-Mut2, respectively). We confirmed that both single- and double-stranded bio-WT, but not bio-Mut2, exhibit mobility-shift bands in EMSA in the presence of 100 mM KCl (Fig. 5B) or L1BOD-7OTD (Fig. 5C). We incubated the single- or double-stranded biotinylated oligonucleotides with the cell lysates in the presence of 100 mM KCl. By using whole cell lysates from HeLa cells and mouse embryonic stem cells (mESCs), we examined the association of the oligonucleotides with proteins that have been reported to interact with G-quadruplexes, including Sp1 52-54, Nucleolin (NCL) 52,53,55, and Poly(ADP-ribose) polymerase-1 (PARP1) 56-58. We found that Sp1 and NCL bound to bio-WT, but not bio-Mut2 (Fig. 5D). On the other hand, PARP1 bound to both bio-WT and bio-Mut2 double-stranded oligonucleotides, but not single-stranded oligonucleotides (Fig. 5D), indicating that PARP1 associates with double-stranded oligonucleotides in a DNA sequence-independent manner. ChIP-qPCR analysis showed binding of endogenous Sp1, but not NCL, to the genomic region of the H19 G-quadruplex (Fig. 5E and F). These observations indicate that Sp1 is associated with the H19 G-quadruplex both in vitro and in vivo. Notably, Sp1 bound to bio-WT more efficiently in lysates prepared from G1/S phase-synchronized cells than in those from M phase-synchronized cells (Supplementary Figure 3B), suggesting a cell cycle-dependent association of Sp1 with the H19 G-quadruplex.
We found that the ectopic expression of Sp1 in HeLa cells resulted in downregulation of the H19 RNA level (Fig. 5G). Conversely, knockdown of Sp1 by siRNA resulted in upregulation of the H19 RNA level (Fig. 5H), indicating that Sp1 suppresses H19 gene transcription. We further found that E2F1, which is reported to regulate H19 gene transcription 23, also bound to single- and double-stranded bio-WT, but not bio-Mut2, in a pull-down assay (Fig. 5D). Importantly, ectopically expressed E2F1 in HeLa cells bound to the genomic region of the H19 G-quadruplex (Fig. 5I) and increased the endogenous H19 RNA level (Fig. 5G). The ectopic expression of NCL had no effect on the H19 RNA level (Fig. 5G). Taken together, these results indicate that, through binding to the H19 G-quadruplex, Sp1 and E2F1 regulate H19 transcription in opposite ways: Sp1 suppresses whereas E2F1 promotes H19 gene transcription.
Genomic H19 G-quadruplex regulates H19 transcription during mESC differentiation. It has been reported that the expression level of H19 RNA increases during differentiation of mESCs 8,59-61. Consistently, we observed that levels of H19 RNA increased during neural differentiation of mESCs by the SFEBq method on Day 5 and Day 7, while expression of a pluripotency gene, Oct4, decreased and expression of a neural-progenitor-specific gene, Sox1, transiently increased (Fig. 6A). We found that addition of the compound L1H1-7OTD, which can bind and stabilize G-quadruplex structures 62, to the SFEBq differentiation medium significantly decreased H19 RNA levels, without affecting the expression levels of Oct4 or Sox1 on Day 5 (Fig. 6B). L1H1-7OTD also decreased H19 RNA levels in HeLa cells and U2OS cells (Supplementary Figure 4). This indicates that the H19 G-rich sequence forms a functional G-quadruplex structure in the genome. Similarly, when mESCs were differentiated into the three germ layers by the EB culture method, H19 RNA levels were increased at differentiation Day 6, and this increase was significantly attenuated when L1H1-7OTD was added to the differentiation medium (Fig. 7A). To investigate the functional relevance of the genomic H19 G-quadruplex structure, we established mESC lines in which the genomic H19 G-quadruplex sequence was replaced by the H19-Mut2-G4 sequence (Mut2-G4 cells) (see Fig. 1B). The Mut2-G4 cells proliferated as efficiently as WT-G4 cells in the mESC maintenance medium (data not shown) and were capable of differentiating into all three germ layers, including nestin-expressing ectodermal cells, α-fetoprotein (α-FP)-expressing endodermal cells, and α-smooth muscle actin (α-SMA)-expressing mesodermal cells (Supplementary Figure 5A), indicating that Mut2-G4 cells retain self-renewal ability and pluripotency. Mut2-G4 cells also properly underwent neural differentiation by the SFEBq method (Supplementary Figure 5B).
We found that in the Mut2-G4 cells, the L1H1-7OTD-induced downregulation of the H19 RNA level was attenuated during differentiation (Fig. 7B), indicating that L1H1-7OTD suppresses H19 gene transcription through binding to the genomic H19 G-quadruplex structure. Seemingly at odds with this, however, H19 RNA levels in Mut2-G4 cells were significantly lower than those in WT-G4 cells both in EB culture on Day 6  (Fig. 7D), suggesting an H19 transcription-promoting function of the H19 G-quadruplex sequence. Consistent with this notion, the level of E2F1, which promotes H19 gene transcription (see Fig. 5G), increased, whereas that of Sp1, which suppresses H19 gene transcription (see Fig. 5G and H), decreased during neural differentiation of both WT- and Mut2-G4 cells. Furthermore, ectopic expression of E2F1 significantly increased the endogenous H19 RNA level in WT-G4 cells, but not in Mut2-G4 cells, in the mESC maintenance medium. These results demonstrate that the genomic H19 G-quadruplex structure immediately downstream of the TSS regulates H19 transcription during mESC differentiation in two opposing ways.

Discussion
The H19 gene is located 200 kb downstream of the Insulin-like growth factor 2 (Igf2) gene on chromosome 7 in mice and at 11p15.5 in humans 63. The H19-Igf2 locus is under the control of genomic imprinting, whereby H19 is expressed from the maternal allele and Igf2 is expressed from the paternal allele. The mechanism of monoallelic expression of H19 through epigenetic modification within the differentially methylated region (DMR) is well established, and the methylation pattern within the DMR is generally maintained indefinitely. However, the mechanism that explains the differential expression of H19 among cell types or tissues, which would be relevant to the state of cell differentiation, remains unclear.
In this report, we show that H19 gene transcription is regulated by a G-quadruplex located immediately downstream of the H19 TSS. H19 expression increases during mESC differentiation, and this increase is attenuated by the G-quadruplex-stabilizing compound L1H1-7OTD in WT-G4 mESCs but not in Mut2-G4 mESCs, indicating that a functional H19 G-quadruplex mediates the regulation of H19 transcription. It has been reported that the monoallelic expression of H19 is maintained during ESC differentiation 64. Therefore, the H19 G-quadruplex-mediated regulation of H19 transcription during mESC differentiation seems to be independent of genomic imprinting. In Mut2-G4 cells, the H19 expression level is lower than that in WT-G4 cells, suggesting a promoting role of the G-quadruplex in H19 transcription. Consistently, our results show that E2F1 binds to the H19 G-quadruplex and promotes H19 transcription. Notably, however, the H19 expression level was partially upregulated in the Mut2-G4 cells during differentiation (see Fig. 7B and D). Therefore, in addition to the G-quadruplex-mediated mechanism, H19 gene transcription would also require H19 promoter activation during mESC differentiation. On the other hand, our results show that Sp1 also binds to the H19 G-quadruplex and suppresses H19 transcription. It is worth noting that the expression levels of E2F1 and Sp1 are increased and decreased, respectively, during mESC differentiation (see Fig. 7E and F). Therefore, the balance of E2F1 and Sp1 expression levels would determine the effect of the H19 G-quadruplex on H19 gene transcription.
How E2F1 and Sp1 regulate H19 gene transcription through the H19 G-quadruplex remains an open question. Sp1 is known to recruit a large number of proteins, including the transcription initiation complex and transcription repressor complexes 65. Our data show that Sp1 acts as an H19 transcription repressor in conjunction with the H19 G-quadruplex. Although we could not determine whether Sp1 recognizes the G-quadruplex structure or an Sp1 target sequence within the H19 G-rich motif, it is possible that Sp1 recruits transcription repressor complexes to the H19 G-quadruplex to suppress H19 transcription. It has previously been shown that E2F1 binds to the H19 promoter region 23. We show that E2F1 is associated with the H19 G-quadruplex downstream of the H19 TSS. It would be interesting to determine whether one of these regions, or both, plays a pivotal role in promoting H19 transcription.
This study describes a previously undescribed regulatory mechanism for H19 gene transcription via a G-quadruplex. Putative G-quadruplex sequences are distributed throughout the genomic regions of non-coding RNAs; therefore, these G-quadruplex structures may function as regulatory elements of transcription. H19 RNA is highly expressed in various cancers and has a proto-oncogenic function in several tumors. Therefore, our findings imply that G-quadruplex-mediated transcriptional regulation of the H19 gene could be an effective target for anti-cancer agents.