Introduction

CCCTC-binding factor (CTCF) is an 82-kDa protein with 11 zinc fingers.1 It was first identified as a transcriptional repressor of the chicken c-myc gene, a regulatory gene that encodes the c-myc transcription factor.2 CTCF is ubiquitously expressed and highly conserved in eukaryotes.1, 3 It consists of three domains: an N-terminal domain, a C-terminal domain and a central domain with 11 zinc fingers.4 CTCF uses these zinc fingers cooperatively to bind the genome, and the zinc finger domain is especially highly conserved, highlighting its importance in CTCF function.1 All three domains are subject to distinct post-translational modifications.5 CTCF binds between 55 000 and 65 000 sites in mammalian genomes;6 of these sites, ~50% are intergenic, 35% are intragenic and the rest are promoter proximal,7 indicating that CTCF can shape chromosomal architecture by binding at diverse types of sites. CTCF also binds the nuclear matrix and stabilizes nuclear architecture.8 One notable role of CTCF is its insulator function. Insulators are short nucleotide sequences that set boundaries between nearby genomic domains.9 When CTCF binds to an insulator sequence, communication between an enhancer and a gene promoter is impeded and transcription of the gene is blocked.10 In studying gene regulation, it is important to consider regulation at both the locus and genomic levels because gene activity is largely influenced by the spatial positioning of the genome and the stability of genomic architecture.11 This idea emphasizes the importance of CTCF’s ability to bind to a wide range of sequences and to control gene expression via the activation or repression of promoters, the insulation of enhancers and the regulation of distant chromatin interactions.3
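As a rough illustration of the binding-site distribution quoted above, the following sketch converts the stated percentages into approximate site counts. The promoter-proximal share is not given directly in the text; it is assumed here to be the remainder after the intergenic and intragenic shares. The names `SITE_RANGE`, `PERCENT` and `sites_per_class` are hypothetical and serve only this example.

```python
# Figures quoted in the text: 55 000 to 65 000 sites; ~50% intergenic, 35% intragenic.
SITE_RANGE = (55_000, 65_000)
PERCENT = {"intergenic": 50, "intragenic": 35}

# Assumption: "the rest" of the sites are promoter proximal, i.e. the remainder.
PERCENT["promoter_proximal"] = 100 - sum(PERCENT.values())


def sites_per_class(total_sites: int) -> dict:
    """Approximate number of CTCF sites per class for a genome-wide total."""
    return {name: total_sites * pct // 100 for name, pct in PERCENT.items()}


if __name__ == "__main__":
    # Print the estimate for the lower and upper bound of the quoted range.
    for total in SITE_RANGE:
        print(total, sites_per_class(total))
```

Under these assumptions, the promoter-proximal class accounts for about 15% of sites, or roughly 8000 to 10 000 sites across the quoted range.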

Role of CTCF as an insulator protein

As briefly mentioned above, an insulator is a DNA sequence that can block the actions of cis-acting elements, such as enhancers, and prevent gene activation.12 Enhancer-mediated activation is a core mechanism of gene regulation in eukaryotes, and enhancers can activate transcription upon activator binding, even when they are positioned far upstream or downstream of the promoter.12, 13 Two loci were important in the discovery of CTCF’s functions: the β-globin locus and the imprinted H19/Igf2 locus. After discovering that the chicken β-globin locus can block enhancer activity,14 Bell et al. identified a 42-bp fragment of the locus that is responsible for the enhancer-blocking activity. They showed that this sequence is a binding site for CTCF and that CTCF has a role in the insulator activity. Chromosome conformation capture experiments later showed that CTCF helps form chromatin loops that encompass elements such as the β-globin gene and the locus control region.15 When loop formation repositions these β-globin sites in a way that blocks the enhancer signals, gene transcription is repressed.

Another important locus is the H19/Igf2 locus. The H19 and Igf2 genes are separated from each other by the imprinting control region (ICR), which can be conditionally methylated.16 Methylation of the ICR is a decisive factor in CTCF binding because CTCF only binds the unmethylated ICR on the maternal chromosome. This CTCF binding prevents communication between the H19-proximal enhancer and the Igf2 promoter, and consequently, Igf2 remains silent on the maternal chromosome. Conversely, CTCF cannot bind the paternal ICR because it is methylated,17 and thus, the enhancer is able to activate Igf2 transcription from the paternal chromosome (Figure 1). Upon CTCF binding, various chromatin loops encompassing specific alleles with enhancers and promoters can be formed at the maternal ICR.18 This locus illustrates the basis of how CTCF contributes to insulator activity. Taken together, studies of these two loci showed how CTCF serves as a position-dependent insulator element to block inappropriate enhancer signals and protect against spurious gene activation. This position-dependent enhancer-blocking activity of CTCF seems to be functionally conserved, as CTCF-binding sites in the insulator region are found in diverse vertebrate species.19 Table 1 lists example sequences that have been tested for insulator function.

Figure 1

H19 and Igf2 gene expression is controlled by CTCF binding at the ICR. CTCF fails to bind to the methylated paternal ICR; thus, the enhancer (E) can induce Igf2 gene expression. In contrast, CTCF bound to the unmethylated maternal ICR acts as a barrier between the Igf2 gene and the enhancer, so that the H19 gene is expressed instead. CTCF, CCCTC-binding factor; ICR, imprinting control region.
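The allele-specific logic summarized in Figure 1 can be sketched as a small rule set. This is a deliberately simplified toy model, not a description of the underlying biochemistry: it assumes that ICR methylation alone determines CTCF binding and that CTCF binding alone determines which gene the enhancer activates. The names `AlleleState` and `allele_state` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleState:
    ctcf_bound: bool
    igf2_expressed: bool
    h19_expressed: bool


def allele_state(icr_methylated: bool) -> AlleleState:
    """Toy rules: CTCF binds only an unmethylated ICR and then insulates Igf2."""
    ctcf_bound = not icr_methylated
    return AlleleState(
        ctcf_bound=ctcf_bound,
        # The enhancer reaches the Igf2 promoter only without the CTCF barrier.
        igf2_expressed=not ctcf_bound,
        # Simplification taken from the Figure 1 caption: H19 is expressed
        # when CTCF is bound at the ICR.
        h19_expressed=ctcf_bound,
    )


# Maternal ICR is unmethylated; paternal ICR is methylated.
maternal = allele_state(icr_methylated=False)
paternal = allele_state(icr_methylated=True)
print("maternal:", maternal)
print("paternal:", paternal)
```

In this sketch, the maternal allele binds CTCF and expresses H19 but not Igf2, whereas the paternal allele does the reverse.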

Table 1 Target genes of CTCF as an insulator

CTCF binding and remodeling of the three-dimensional structure of the genome

Accumulating evidence shows that CTCF aids long-range chromosomal interactions via looping. For example, using chromosome conformation capture and fluorescence in situ hybridization, Hoffman and colleagues showed that CTCF helps form interchromosomal interactions that co-localize the Igf2/H19 locus on chromosome 7 with the Wsb1/Nf1 locus on chromosome 11.20 Deletion of CTCF led to loss of this interchromosomal association and, consequently, to changes in Wsb1/Nf1 gene expression.20 This finding provides an example of CTCF’s role in the regulation of chromatin structure and gene expression. CTCF also helps chromatin attach to the nuclear matrix and form functionally distinct regions called topological domains.21 Chromatin immunoprecipitation (ChIP) experiments showed that CTCF binds numerous genomic sites,22 many of which are conserved among different cell types.23 CTCF binds a CpG-rich consensus sequence that is usually unmethylated, consistent with its preference for unmethylated elements such as those at the H19/Igf2 locus.23 CTCF-binding sites are located at both active and inactive domain boundaries,24 and some are also located at the borders of lamina-associated domains, where transcriptional activity is low. Van Steensel and colleagues created a high-resolution interaction map of the human genome showing that the nuclear lamina interacts with specific genomic regions and organizes the chromosomes into distinct domains.25 They found that CTCF binding is enriched at lamina-associated domain boundaries, suggesting that CTCF has a role in shaping three-dimensional chromatin organization. Overall, the characteristics of the identified binding sites influence CTCF’s conformation, its interaction with other proteins and, ultimately, genome regulation.

Because chromatin interactions are a crucial part of transcriptional regulation, Chromatin Interaction Analysis with Paired-End Tag sequencing (ChIA-PET), a developing technology, is very useful for studying CTCF’s functions. ChIA-PET allows the analysis of long-range chromatin interactions mediated by a specific protein.26 In other words, it reveals which DNA regions are brought together through a given DNA-associated protein.27 For CTCF, ChIA-PET revealed ~1500 intrachromosomal and 300 interchromosomal interactions.28 Handoko et al.28 were the first to present a high-resolution CTCF-mediated chromatin interactome map by applying ChIA-PET sequencing in mouse embryonic stem cells. Their data showed that CTCF can create local clusters of genes, direct communication between promoters and regulatory elements, and define the boundaries of chromatin compartments, such as those associated with the nuclear lamina. Because CTCF was confirmed to make both intra- and inter-chromosomal interactions,20 it is considered a key organizer that works on the genome at a global scale. Several studies have revealed that sequences proximal to CTCF-binding sites interact more often with sequences on the same chromosome than do CTCF-distal sites.29 Additional evidence for CTCF as a genome organizer is provided by the observation that its binding sites, together with housekeeping genes, are enriched at the boundaries of topological domains. Ren and colleagues first identified and named the ‘topological domains,’ which are stably inherited in mammalian genomes and mainly act as barriers to the spread of heterochromatin.21 At these domains, CTCF and the cohesin complex hold long-range interactions together and form chromatin loops that define the topological domains. Hadjur and colleagues found that in post-mitotic nuclei lacking properly functioning cohesin and CTCF, topological domains become loosened.30 This result suggests that CTCF cooperates with other proteins as an organizer and helps determine which sequences are brought together and which other binding proteins are recruited. Thus, the roles of CTCF extend beyond its original roles as a transcriptional regulator and an insulator; this multivalent protein also orchestrates interactions between distal sequences by binding to chromatin and creating chromatin loops.15

CTCF and the cohesin complex in transcriptional regulation

A number of studies have demonstrated the co-localization of CTCF and cohesin on chromosomes, suggesting their functional cooperation. Cohesin is a ring-shaped complex whose subunits include SMC1, SMC3 and SCC3.31 It is known to stabilize chromatin loops between enhancers and promoters and also to promote the binding of transcription factors at enhancers.32, 33 Cohesin does not directly bind to DNA, but instead associates with CTCF through its subunit SCC3 at specific loci.34 In fact, CTCF leads cohesin to its binding sites, and cohesin is required for CTCF to carry out its insulator function.35 Thus, sequence-specific cohesin binding depends on the presence of CTCF, whereas CTCF binding does not depend on cohesin.36 These results support the view that CTCF and cohesin work together to assist long-range interactions (Figure 2). CTCF and the cohesin complex maintain higher-order chromatin structures by co-localizing at many sites across the genome. One example is the CFTR locus, which has CTCF-binding insulator sequences. The CFTR locus is activated by enhancers that reach the active promoter through looping. When CTCF or RAD21, a component of the cohesin complex, was knocked down by siRNA, the chromatin structure of the CFTR locus was disturbed, leading to increased gene expression and alterations in histone modifications.37 CTCF also regulates activation of the APP gene, which has highly conserved promoter sequences that share many transcription factor-binding sites. Here, CTCF acts as a transcription factor that binds the GC-rich -93/-82 promoter region (APBβ) and activates transcription.38

Figure 2

CTCF and the cohesin complex can lead to transcriptional activation or repression in a binding site-dependent manner. For example, when CTCF and cohesin bind to their binding sites and create a chromatin loop that encompasses the enhancer and promoter, transcriptional activation occurs. Conversely, if CTCF and cohesin form a chromatin loop that prevents the enhancer from reaching the promoter, gene expression is repressed. CTCF, CCCTC-binding factor.
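The binding site-dependent outcome described in Figure 2 can be illustrated with a toy positional model. This sketch assumes a single linear coordinate axis and a single loop defined by two CTCF/cohesin anchor positions; the function names `inside` and `loop_outcome`, the coordinates and the "unaffected" case for a loop that contains neither element are assumptions made for illustration, not results from the studies cited.

```python
def inside(position: int, loop: tuple) -> bool:
    """Return True if a position lies between the two loop anchors."""
    start, end = sorted(loop)
    return start <= position <= end


def loop_outcome(loop: tuple, enhancer: int, promoter: int) -> str:
    """Classify the effect of one CTCF/cohesin loop on an enhancer-promoter pair."""
    enhancer_in = inside(enhancer, loop)
    promoter_in = inside(promoter, loop)
    if enhancer_in and promoter_in:
        # The loop encompasses both elements, bringing them together.
        return "activation"
    if enhancer_in != promoter_in:
        # The loop separates the enhancer from the promoter.
        return "repression"
    # Assumption: a loop containing neither element leaves the pair unchanged.
    return "unaffected"


# Hypothetical coordinates (arbitrary units) for illustration only.
print(loop_outcome((100, 500), enhancer=150, promoter=450))
print(loop_outcome((100, 500), enhancer=50, promoter=300))
```

The same pair of anchors therefore yields opposite outcomes depending on where the enhancer and promoter sit relative to the loop, which is the point the figure makes.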

Regulation of CTCF activity

As observed for the interaction between CTCF and cohesin, CTCF’s functions are strongly affected by its binding partners. Whole-genome analyses of CTCF-binding sites identified many co-localized proteins, including YY1, Oct4, RNA polymerases and the thyroid hormone receptor (TR).39, 40 Table 2 lists some CTCF-associated proteins and their descriptions. Yin Yang 1 (YY1) is a ubiquitous transcription factor with four zinc fingers that was found to bind along with CTCF to the Tsix region of the X chromosome. Renkawitz and colleagues report that CTCF forms a functional complex with YY1-Oct4-Tsix to control X-chromosome inactivation. CTCF associated with RNA polymerase II carries out transcriptional activation, CTCF associated with USF and RNA polymerase I carries out rDNA spacer transcription, and CTCF associated with TR causes hormone-sensitive enhancer blocking.39

Table 2 CTCF-binding partners

Moreover, CTCF activity, especially its DNA-binding activity, can be altered by various factors, including DNA methylation. As seen in the H19/Igf2 ICR example, methylation is a critical factor that dictates whether CTCF binds at a given locus. Furthermore, Pedone and colleagues were the first to show that only 4 of the 11 CTCF zinc fingers are essential for strong DNA-binding activity and that each zinc finger contributes differently to the interaction with a specific locus.41 They also found that cytosine methylation inhibits DNA binding by zinc finger 7, thus significantly weakening CTCF’s binding affinity.41 At CTCF-binding elements lacking a CpG, CTCF binding was found to be affected by nucleosome repositioning. Lefevre et al.42 showed that proinflammatory stimuli, such as lipopolysaccharide treatment, induce lysozyme gene transcription, which causes nucleosome remodeling. This remodeling eventually leads to the removal of CTCF and its partner cohesin, reversing CTCF-induced gene repression.42 In summary, neighboring factors help CTCF execute its diverse functions by coordinating the binding of its zinc fingers to various sequences, and CTCF’s binding activity is itself regulated by factors such as DNA methylation and nucleosome positioning.

Discussion

CTCF is a highly conserved zinc finger protein that performs various regulatory roles in the cell. In this review, we described CTCF’s roles as an insulator protein, a chromatin remodeler and a transcription factor. In addition, we discussed the association of CTCF with cohesin, a protein complex that co-localizes with CTCF on the genome. CTCF also functions as an architectural protein. Because the eukaryotic genome is packaged through several levels of organization, its three-dimensional organization is closely related to how genes are expressed. CTCF changes higher-order chromatin structure and controls the distance between associating domains within and among chromosomes. In doing so, it organizes the genome in ways that alter topological domain interactions and ultimately regulate gene expression, which gives it an unusually broad role in controlling gene expression. Since the first discovery of its role as a transcriptional repressor, many additional studies have revealed CTCF’s various roles and binding sites across the genome. However, many questions remain about the exact mechanisms by which CTCF carries out these roles.