Manel Esteller's phone did not stop ringing for weeks. It was summer 2005, and he and his team at the Spanish National Cancer Centre in Madrid had just published a study comparing the activity of DNA in identical twins. The anxious callers were invariably twins whose sibling had developed a serious disease such as cancer or diabetes. Could the study help predict whether they too would succumb, they asked. Did the identical DNA sequence they shared with their afflicted twin mean they had the same genetic predisposition to illness?

Surprisingly, the answer to the second question is ‘not necessarily’. Researchers have known for years that, despite their common genes, identical twins can have very different physical constitutions and develop different diseases. The traditional explanation for this is that our environment somehow interacts with our genes to produce our physical attributes, or phenotype, but no one knew exactly how.

The study by Esteller and his team (ref. 1) showed that the missing link between nature and nurture could lie in a phenomenon known as epigenetics: a cryptic chemical and physical code written over our genome's DNA sequence. The term ‘epigenetics’ was coined in the 1940s by British embryologist and geneticist Conrad Waddington, to describe “the interactions of genes with their environment, which bring the phenotype into being” (ref. 2). The term now refers to the extra layers of instructions that influence gene activity without altering the DNA sequence.

By studying 80 pairs of identical twins, ranging in age from 3 to 74, Esteller's team found that epigenetic differences were hardly detectable in the youngest twins, but increased markedly with age. These changes had a striking effect on gene activity: the number of genes that differed in activity between 50-year-old twins was more than three times that in pairs aged 3. “So we are more than our genes,” says Esteller. “Not only is the DNA sequence important but also how gene activity is regulated in response to environment. This might explain why many identical twins have different susceptibility to disease.”

Spot the difference

As well as offering answers to identical twins, deciphering this epigenetic code promises to dramatically alter our understanding of disease in the wider population (see ‘Tagged for disease’). Many cancers might be triggered by epigenetic faults, for example. It should also fill some big gaps in our grasp of how the environment affects a creature's constitution: epigenetic changes explain how simply altering the diet of a pregnant mouse, for example, can completely change the coat colour of her pups (ref. 3), or even alter their response to stress (ref. 4).

“It will illuminate some of the most profound questions in biology,” says Stephan Beck, an immunologist at the Wellcome Trust Sanger Institute, Cambridge, UK, who worked on the Human Genome Project. How a given cell executes its unique genomic programme in time and space could shed fresh light not only on development and disease but also on what makes us human, he says.

The complete epigenetic code of our genome, its ‘epigenome’, has increasingly been the focus of research over the past decade, and scientists are now embarking on an ambitious attempt to crack it. The International Human Epigenome Project, or IHEP, first suggested by Beck and colleagues in 1999, is the logical next step after the Human Genome Project, which published the draft sequence of the human genome's 3 billion DNA letters in 2001. But the IHEP faces daunting challenges. The sequence of the human genome is the same in all our cells, whereas the epigenome differs from tissue to tissue, and changes in response to the cell's environment. Can researchers really hope to pin down this vast, complex and ever-changing code in a meaningful way?

Clever packaging

If the DNA sequence of the genome is like the musical score of a symphony, then the epigenome is like the key signatures, phrasing and dynamics that show how the notes of the melody should be played. Epigenetic control of gene expression occurs in two main ways: either the DNA itself is chemically altered, or the proteins that package DNA into chromatin (the main component of chromosomes) are modified. These proteins, called histones, determine whether the chromatin is tightly packed, in which case gene expression is shut down (or silenced), or relaxed, in which case gene expression is active.

The first kind of alteration takes the form of methyl groups added to the DNA — frequently to the base cytosine when it is immediately followed by guanine — by a process known as DNA methylation (see graphic). The methyl group can be sensed by proteins that turn gene expression on or off through regulating chromatin structure. The second, more complex kind of alteration involves changes to the histones around which chromosomal DNA is wrapped. Each histone has a protruding ‘tail’ to which more than 20 chemical tags can attach, like charms on a bracelet. Some of these tags, or certain combinations of them, dubbed the histone code, give rise to relaxed chromatin; others have the opposite effect.
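
For readers who think in code, the sequence context described above can be illustrated with a minimal Python sketch. The function name find_cpg_sites and the example sequence are hypothetical, chosen purely for illustration. The sketch only locates candidate positions (a cytosine immediately followed by a guanine); whether a methyl group is actually attached at any of them cannot be read from the sequence and has to be measured experimentally.

```python
def find_cpg_sites(sequence: str) -> list[int]:
    """Return the 0-based positions of every cytosine (C) that is
    immediately followed by a guanine (G) in a DNA sequence."""
    seq = sequence.upper()  # accept lower-case input as well
    return [
        i
        for i in range(len(seq) - 1)
        if seq[i] == "C" and seq[i + 1] == "G"
    ]


# Hypothetical 11-letter sequence, not taken from any real genome.
example = "ATCGGCGTACG"
print(find_cpg_sites(example))  # prints [2, 5, 9]
```

Each position in the result marks a site where, in principle, DNA methylation could occur; an epigenome map would record which of those sites actually carry the tag in a given tissue.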

Epigenetic codes are much more subject to environmental influences than the DNA sequence. “This could explain how lifestyle and toxic chemicals affect susceptibility to diseases,” says Vardhman Rakyan, a researcher at the Sanger Institute. “Up to 70% of the contribution to a particular disease can be non-genetic.” Indeed, one key finding of Esteller and his team's study was that epigenetic profiles of twins who had been raised apart or had noticeably distinct lifestyles differed more than those who had lived together for a while or shared similar environments and experiences (ref. 1). Rakyan himself is studying a cohort of identical twins, where one twin has type 1 diabetes and the other does not, with the aim of finding epigenetic changes associated with the disease.

Although individual labs around the world, such as Rakyan's, are already pursuing their own studies, several researchers argue that it is time for a coordinated effort. In recent months, a series of international workshops and expert and government reports have addressed the value and scope of an international human epigenome project. The ultimate goal of such a project would be to identify all the chemical modifications of DNA and histone proteins for all chromosomes in all types of normal human tissue.

As with the Human Genome Project (HGP), an international consortium would set priorities, coordinate research efforts, centralize materials and resources, create the necessary technologies and monitor research progress.

Piece by piece

“Epigenomics is at a stage where genomics was 30 years ago, when everyone was working on their part of the puzzle,” remarks cancer biologist Peter Jones at the University of Southern California, Los Angeles. Jones was formerly president of the American Association for Cancer Research (AACR), which is based in Philadelphia. “We need to see the bigger picture. It takes concerted efforts on an international scale. And this is how the IHEP would make a difference.”

Although a number of funding bodies — such as the Wellcome Trust, the AACR, the US National Cancer Institute (NCI), and the US National Human Genome Research Institute (NHGRI) — have shown interest by taking part in the discussion, funding agencies have yet to commit to financing and leading the project.

All aboard

Since the completion of the Human Genome Project, there have been many multi-centre schemes, each of which costs millions or even billions of dollars. Some of these initiatives, such as the US National Institutes of Health Human Cancer Genome Atlas (which aims to identify and catalogue genetic mutations in human cancers), have prompted arguments over scale and cost-effectiveness. A key question for funding bodies is whether the IHEP would be yet another multi-million-dollar project. Proponents say no. “The goal of the IHEP is not to create another big enterprise, but to make things as cost effective as possible, to interface with wonderful projects that are under way and to fund important pilot projects,” says Andrew Feinberg, director of the Centre for Epigenetics of Common Human Disease at the Johns Hopkins University in Baltimore, Maryland.

A number of smaller-scale multi-centre epigenome projects are already under way or under discussion in Europe, the United States, India and Japan (see External links, below). Most prominent is that set up by the European Human Epigenome Project (HEP) Consortium in 2000. Following the publication of a pilot project in 2004 (ref. 6), the European HEP Consortium will soon make its data on the epigenetics of the whole of chromosomes 6, 20 and 22 publicly available.

Although few people doubt the importance of an international human epigenome project, how to go about it remains a subject of debate. A key challenge is defining what the epigenome entails and what cell types to study. Some researchers argue that the project should first tackle blood cells, because they are easy to collect and work with, and are our main ‘window’ into the epigenome of both healthy and diseased individuals. Once a high-resolution blood epigenome has been determined, it would serve as a reference with which other epigenomes, including those of diseased or ageing tissues, could be compared.

But the diversity of epigenomes in different cell types means that it may not make sense to restrict pilot projects to one single tissue, or to a particular time in a tissue's development. After intense discussion in three recent international workshops (refs 5, 7, 8, 9), researchers in the epigenetic community now agree that initially eight to ten tissues, including the blood, should be studied simultaneously. Ultimately, the epigenome of all tissues, including embryonic stem cells, will be mapped out.

Another question is whether to study cells grown in the lab or biopsies of tissues taken from people. Biopsies contain different cell types, which would muddy the picture, but lab-grown cells might contain abnormal epigenetic tags. At the moment, some biologists are leaning towards lab-grown cells as being the lesser of two evils, but exactly how different the epigenomes of cell lines are compared with normal tissues remains to be seen. The inclusion of cell lines in some pilot studies in the proposed IHEP should be able to resolve this issue.

Final frontier

Perhaps the greatest challenges facing the IHEP are technological: mass-production-style tools must be developed to decode the epigenome, and the morass of data will have to be stored and analysed. At the moment, the main method used to determine DNA methylation sites is reliable, but extremely expensive, and the technology used to study histone marks is prone to problems with accuracy and reproducibility.

Scientists hope to tackle these problems by linking the IHEP to projects on the epigenomes of lab workhorses, such as the yeast, fruitfly and mouse, for which techniques are more advanced. Computational scientists are also developing the sophisticated bioinformatics tools needed to store and analyse multi-dimensional epigenome data.

Given these technological challenges, it is only natural to question whether the research community is ready for such an enormous undertaking. Drawing on the experience of the early days of planning for the HGP, researchers working on epigenetics are unanimous in thinking they can do it. They have drawn up a plan of how to manage the international assets available for the IHEP (refs 5, 7, 8, 9) and say that, like the HGP, the IHEP will catalyse its own development. “One can never be 100% ready. We have 60% of the technology to go for the real thing,” says Thomas Jenuwein, a molecular biologist at the Research Institute of Molecular Pathology at the Vienna Biocentre, Austria. “The rest will happen once the momentum is built up. We should have that vision to go in big.”