The ENCODE project

de Souza, Natalie

doi:10.1038/nmeth.2238


Research Highlights
Published: 06 November 2012

Genomics


Nature Methods volume 9, page 1046 (2012)


The second, genome-wide phase of the Encyclopedia of DNA Elements (ENCODE) project is being reported.

The function of most of the human genome is unknown. Protein-coding genes account for only a small fraction (about 3%) of the total genome sequence; most functional genomic sequences are likely to have regulatory roles. Understanding human gene organization and regulation and their impact on normal and disease phenotypes requires that functional elements be mapped and annotated across the genome. This is the goal of the ENCODE project.

The initial 5-year pilot phase of the project focused on 1% of the human genome sequence. The second 5-year phase of ENCODE, which began in 2007 and is now coming to fruition, has extended the analysis of functional elements genome wide. A functional element as defined by ENCODE is a genomic sequence that either encodes a particular product (for instance, a protein or noncoding RNA) or has a consistent biochemical property (for instance, being bound by protein or having a particular biochemical mark).

The laboratories in the ENCODE Project Consortium have developed and applied a huge range of sequencing-based techniques to map functional elements across the genome. To put it succinctly, the ENCODE project has mapped chromatin state and structure, three-dimensional genome organization, DNA methylation, transcription factor binding, RNA transcription and protein expression genome wide. Experiments were conducted in multiple cell types, with the highest priority given to widely studied cell lines but with the list also including a human embryonic stem cell line and, in some cases, primary cells.

It is striking that a large fraction (80%) of the genome overlaps with at least one ENCODE-defined functional element in at least one examined cell type; an even larger fraction (99%) lies within 1.7 kilobases of such an element. An examination of previously identified disease-associated single-nucleotide polymorphisms shows that they are enriched in ENCODE-annotated regions, suggesting hypotheses about the functional consequences of these polymorphisms that can be further tested.
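To make the coverage figures concrete, the sketch below shows one way a fraction of this kind could be computed: merge overlapping element intervals, then divide the covered bases by the genome length. The toy coordinates, the function names merge_intervals and covered_fraction, and the pad parameter (used here to mimic "within 1.7 kilobases") are illustrative assumptions, not the ENCODE Project Consortium's pipeline or data.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), in base pairs


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(
    intervals: List[Interval], genome_length: int, pad: int = 0
) -> float:
    """Fraction of [0, genome_length) lying within `pad` bases of any interval."""
    # Grow each interval by `pad` on both sides, clipped to the genome bounds.
    padded = [
        (max(0, start - pad), min(genome_length, end + pad))
        for start, end in intervals
    ]
    covered = sum(end - start for start, end in merge_intervals(padded))
    return covered / genome_length


# Hypothetical toy "genome" of 10,000 bp with three annotated elements
# (made-up coordinates, chosen only to exercise the functions).
toy_elements = [(1000, 3000), (2500, 4000), (7000, 8000)]
direct = covered_fraction(toy_elements, 10_000)
nearby = covered_fraction(toy_elements, 10_000, pad=1700)
print(f"overlapping an element: {direct:.0%}")
print(f"within 1.7 kb of an element: {nearby:.0%}")
```

Merging before summing matters because elements from different assays and cell types overlap heavily; counting each interval separately would count shared bases more than once.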

The data generated by ENCODE are vast and can be only very briefly summarized here. The collected ENCODE papers may be examined at http://www.encodeproject.org/ENCODE/pubs.html or explored with a dedicated visualization tool at http://www.nature.com/ENCODE/.

References

The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).


About this article

Cite this article

de Souza, N. The ENCODE project. Nat Methods 9, 1046 (2012). https://doi.org/10.1038/nmeth.2238

Issue Date: November 2012

