When the human genome was first fully sequenced, it was often described as the recipe for making a person. In reality, the genome is more like an entire cookbook that can produce hundreds of different cell types and a staggering range of cell functions depending on which genes are switched on and off. That switching is accomplished using a vast suite of epigenetic marks — molecular and structural modifications to DNA that do not change the underlying sequence but ensure that the right genes are expressed at the right time.

This week, the Roadmap Epigenomics Project, a US$170-million effort to identify and map those marks — known collectively as the human epigenome — begins its first comprehensive data release. Although it is not the only such effort worldwide (see Nature 463, 596–597; 2010), the US National Institutes of Health (NIH) epigenomics project is one of the most ambitious. The newly released data include more than 300 maps of epigenetic changes in 56 cell and tissue types, and represent a significant step towards the complete epigenome — the full picture of all the ways in which DNA can be modified, thus revealing the influence of epigenetics on cell development and its role in complex diseases (see graphic).

Various epigenetic mechanisms regulate gene expression. These include different types of modification on the histone proteins around which genomic DNA winds; attachment of methyl groups to the nucleotide cytosine in DNA, an alteration that is thought to switch off genes; sites of high sensitivity to an enzyme called DNase I, which cleaves accessible DNA and marks the location of gene regulatory regions; and RNA transcription, which, although not a DNA mark, is one measure of the global epigenetic state, indicating how actively a particular gene is expressed in different cells. The NIH project developed a standardized protocol for measuring these four factors, and four designated centres around the United States have been charged with making reference maps of each type of modification in embryonic stem cells, induced pluripotent cells and in hundreds of primary adult and fetal tissues.

The project, slated to run for another five years, aims to produce maps for "a broad swathe of cell types that would be useful to disease research, fund work on specific diseases and develop novel technologies," says John Satterlee, a behavioural geneticist at the National Institute on Drug Abuse in Rockville, Maryland, and one of the coordinators of the project.

A variable response

Some scientists have been wary of the mapping component's 'big science' approach, fearing it will churn out data without ties to the biological questions it is meant to address. Others have questioned the idea that reference maps can be useful to scientists who study specific diseases. Researchers would still have to make their own maps using cells from people without disease, because most studies compare patients to healthy controls who are matched for factors such as age or sex, says John Greally at the Albert Einstein College of Medicine in New York. His projects on epigenomic factors that affect the developing fetus and that cause kidney disease receive funding from the NIH initiative.

Moreover, adds Greally, whereas the four US mapping centres use highly sophisticated techniques to produce their reference maps, individual labs mostly use simpler, cheaper methods to determine epigenetic marks, so comparing their data to the reference maps may be tricky. "Having the [mapping] information is valuable in itself," he says, "but the focus has got to be on how you use this to understand disease."

So far, the wider community of researchers has largely been unaware of the effort's existence. "We've sort of been in stealth mode," says Joseph Ecker, a plant and molecular biologist at the Salk Institute for Biological Studies in La Jolla, California, whose lab is working with a mapping centre to produce reference maps of DNA methylation. The newly released data should push the project to a point where the information can be widely used by researchers in different fields, he says.

Satterlee notes that the mapping component is just one arm of the project, making up about $57 million of its total budget. The rest of the money is going to individual investigator grants, 53 of which have already been awarded. Indeed, says cell biologist Benjamin Tycko of Columbia University in New York, whose work on the role of DNA methylation in Alzheimer's disease is funded by the project, "they've actually adopted a small lab approach" by funding research in labs with expertise on different diseases.
