• An Erratum to this article was published on 06 September 2011

This article has been updated


Many genomic alterations associated with human diseases localize in noncoding regulatory elements located far from the promoters they regulate, making it challenging to link noncoding mutations or risk-associated variants with target genes. The range of action of a given set of enhancers is thought to be defined by insulator elements bound by the 11 zinc-finger nuclear factor CCCTC-binding protein (CTCF). Here we analyzed the genomic distribution of CTCF in various human, mouse and chicken cell types, demonstrating the existence of evolutionarily conserved CTCF-bound sites beyond mammals. These sites preferentially flank transcription factor–encoding genes, often associated with human diseases, and function as enhancer blockers in vivo, suggesting that they act as evolutionarily invariant gene boundaries. We then applied this concept to predict and functionally demonstrate that the polymorphic variants associated with multiple sclerosis located within the EVI5 gene impinge on the adjacent gene GFI1.


Change history

  • 03 June 2011

    In the version of this article initially published, the affiliation for authors at the Department of Molecular and Cellular Biology, Centro Nacional de Biotecnología, Madrid, Spain, was incomplete. The full affiliation is "Department of Molecular and Cellular Biology, Centro Nacional de Biotecnología, CSIC, Madrid, Spain." The error has been corrected in the HTML and PDF versions of the article.


Gene Expression Omnibus




This research was supported by the following grants: BFU2007-60042/BMC, BFU2010-14839, Petri PET2007_0158, CONSOLIDER CSD2007-00008 (Spanish Ministerio de Ciencia e Innovación (MICINN)) and Proyecto de Excelencia CVI-3488 (Junta de Andalucía) to J.L.G.-S.; BFU2009-07044 (MICINN) and Proyecto de Excelencia CVI 2658 (Junta de Andalucía) to F.C.; FIS PI081636 (ISCIII) to F.M.; PN-SAF2009-11491 (MICINN) and Proyecto de Excelencia P07-CVI-02551 (Junta de Andalucía) to A.A.; BFU2008-00838, CONSOLIDER CSD2007-00008 (MICINN), Regional Government of Madrid (CAM S-SAL-0190-2006) and the Pro-CNIC Foundation to M.M.; BFU2006-12185 and BIO2009-12697 (MICINN) to L.M.; Dirección General de Asuntos del Personal Académico, Universidad Nacional Autónoma de México (IN209403, IN214407 and IN203811) and Consejo Nacional de Ciencia y Tecnología, México (CONACyT: 42653-Q, 58767 and 128464) to F.R.-T.; Intramural Research Program of the US NCBI (NIH) to I.O. and BIO2006-03380, CONSOLIDER CSD2007-00050 (MICINN) and RETICS RD07/0067/0012 (Spanish MICINN) to R.G. L.M. thanks A. Fernández for technical assistance and L. Barrios for statistical analysis. F.R.-T. thanks G.G. Avendaño for technical assistance.

Author information

Author notes

    Cristina Pantoja, Ana Fernández Miñán, Christian Valdes-Quezada, Eduardo Moltó, Fuencisla Matesanz & Ozren Bogdanović

    These authors contributed equally to this work.


  1. Center for Genomic Regulation, Universitat Pompeu Fabra, Barcelona, Spain.

    David Martin & Roderic Guigó
  2. Centro Nacional de Investigaciones Oncológicas, Madrid, Spain.

    Cristina Pantoja, Orlando Domínguez, María A Blasco & Manuel Serrano
  3. Centro Andaluz de Biología del Desarrollo, Consejo Superior de Investigaciones Científicas (CSIC)–Universidad Pablo de Olavide (UPO), Seville, Spain.

    Ana Fernández Miñán, Ozren Bogdanović, Elisa de la Calle-Mustienes, Fernando Casares & José Luis Gómez-Skarmeta
  4. Instituto de Fisiología Celular, Departamento de Genética Molecular, Universidad Nacional Autónoma de México, México DF, México.

    Christian Valdes-Quezada, Mayra Furlan-Magaril & Félix Recillas-Targa
  5. Department of Molecular and Cellular Biology, Centro Nacional de Biotecnología, CSIC, Madrid, Spain.

    Eduardo Moltó & Lluís Montoliu
  6. Centro de Investigación Biomédica en Red de Enfermedades Raras, Instituto de Salud Carlos III (ISCIII), Madrid, Spain.

    Eduardo Moltó & Lluís Montoliu
  7. Instituto de Parasitología y Biomedicina Lopez-Neyra, CSIC, Granada, Spain.

    Fuencisla Matesanz, Antonio Alcina & María Fedetz
  8. National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, USA.

    Leila Taher & Ivan Ovcharenko
  9. Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain.

    Susana Cañón & Miguel Manzanares
  10. Instituto de Biologia Molecular e Celular (IBMC), Universidade do Porto, Oporto 4150-180, Portugal.

    Paulo S Pereira




J.L.G.-S. and F.C. conceived the study, designed the experiments, interpreted results and wrote the manuscript. D.M. devised bioinformatics methods, carried out data analysis and wrote the manuscript. C.P., M.S. and M.A.B. conducted mouse ChIP experiments. C.V.-Q., M.F.-M. and F.R.-T. carried out chicken ChIP experiments. E.C.-M., E.M. and L.M. carried out insulator assays. A.F.M. conducted the 3C experiments. F.M., A.A. and M.F. provided PBMCs isolated from blood and carried out qRT-PCR, the CNRA/CNRB luciferase reporter activity assays, quantification of relative GFI1 expression in 108 PBMC samples, genotyping of EVI5 rs11804321 and statistical analysis. O.D. carried out the high-throughput sequencing. O.B., L.T., I.O., S.C. and P.S.P. carried out data analysis. M.M. and R.G. contributed to the experimental design, the discussion of results and the writing of the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Fernando Casares or José Luis Gómez-Skarmeta.

Supplementary information

PDF files

  1. Supplementary Text and Figures: Supplementary Figures 1–7 and Supplementary Methods

Excel files

  1. Supplementary Table 1: Genes flanking CTCF sites in human, mouse and chicken

  2. Supplementary Table 2: Gene Ontology of genes associated with CTCF sites (Biological Processes)

  3. Supplementary Table 3: Gene Ontology of genes associated with CTCF sites (Molecular Function)

  4. Supplementary Table 4: Primers used to amplify the human CTCF bound regions and in the 3C experiments
