Published online 22 December 2010 | Nature | doi:10.1038/news.2010.687


Genome 'census' reveals hidden riches

Fruitfly and nematode studies could advance understanding of the human genome.

A genome-wide map is helping researchers to decode patterns of gene regulation. Credit: Getty

A sweeping study of fruitfly and nematode genomes has uncovered thousands of new genes, providing a better understanding of how the complex genetic networks needed to guide an animal through development are generated.

The study, reported today in four papers in Science [1,2] and Nature [3,4] as well as a suite of publications in other journals, is the first fruit of a project called modENCODE (Model Organism ENCyclopedia Of DNA Elements), which aims to map all of the functional elements in the genome, including those that regulate gene expression. ModENCODE teams generated nearly 1,000 new data sets, including thorough tallies of the RNA molecules produced at different stages of development, as well as maps of the DNA-binding sites used by proteins called transcription factors, which regulate gene expression.

Altogether, the modENCODE team has uncovered 100,000 new elements in the fruitfly genome that serve as a template for RNA molecules, says Susan Celniker, a geneticist at the Lawrence Berkeley National Laboratory (LBNL) in Berkeley, California, and a leader of one of the ten modENCODE teams. Among these new features are 1,938 previously unrecognized genes.

"It's quite amazing," says Mike Cherry, a computational biologist at Stanford University in California who was not a co-author on the papers, but who is an external adviser to the modENCODE project. "ModENCODE has made a great step in determining the topology of the chromosome by taking a census of its elements."

Combinatorial clues

In 2007, a related project called ENCODE released an in-depth characterization of 1% of the human genome [5]. Although it was a landmark moment for the field, focusing on such a small portion of the genome made it difficult to uncover patterns of gene regulation, says Manolis Kellis, a computational biologist at the Massachusetts Institute of Technology in Cambridge.

That year, modENCODE was born. The project focuses on two widely used models of animal development: the fruitfly Drosophila melanogaster and the nematode Caenorhabditis elegans. But researchers hope that the results will also inform their efforts to map the dark recesses of the human genome, particularly those mysterious stretches of DNA that do not seem to code for protein, and which make up 99% of the genome.

Within those vast expanses of DNA are elements that are important for regulating gene expression. But those elements can be difficult to find based on sequence alone. To help crack the regulatory code, the modENCODE consortium also looked for patterns in the many chemical modifications, or 'marks', carried by some proteins that interact with DNA. Those marks were known to be important for regulating gene expression, but previous studies had analysed the distribution of a single type of chemical modification at a time, says Gary Karpen, who studies epigenetics at the LBNL.

Karpen and his colleagues decided to look for patterns in the combinations of different marks. They tracked 18 different chemical modifications to DNA-associated proteins called histones, and were able to pinpoint nine predominant patterns that are associated with differences in the level at which associated genes are expressed. The team then used this information to identify additional genetic elements that regulate gene expression.

The human touch

Such information could be particularly useful as the era of personalized genomics draws near, says Mark Gerstein, a professor of biomedical informatics at Yale University in New Haven, Connecticut, and a member of the modENCODE consortium. Most of the variation among genomes will reside in those spacious noncoding regions, he notes.


Furthermore, efforts to map DNA sequences associated with disease have sometimes yielded hits in noncoding regions with unknown function, says Kellis. "It's a major question right now: how are we going to be able to interpret all of these disease variants sitting in the middle of these large, intergenic noncoding regions?" he asks.

While the modENCODE consortium has ploughed through its data, the ENCODE team has also tackled a full genome analysis. ENCODE, which has received US$161.9 million from the US National Human Genome Research Institute, hopes to publish the work in the next year, says Gerstein. And modENCODE, which has thus far received $69.3 million, is not finished either: it still has over a year of funding, and discussions about a possible second phase of the project are underway, says Celniker.

Ultimately, the goal is to use these analyses to determine how the genome guides an animal as it develops and responds to the environment. "That's a bigger challenge," says Karpen. "But we're many steps closer now that we have a map." 

  • References

    1. Gerstein, M. B. et al. Science 330, 1775-1787 (2010).
    2. The modENCODE Consortium et al. Science 330, 1787-1797 (2010).
    3. Kharchenko, P. V. et al. Nature doi:10.1038/nature09725 (2010).
    4. Graveley, B. R. et al. Nature doi:10.1038/nature09715 (2010).
    5. The ENCODE Project Consortium. Nature 447, 799-816 (2007).



  • #60757

    An inherent assumption in the hypothesis that the Human Genome Project results could be used to diagnose, prevent, and treat most human diseases is that disease is related to one's genetic composition. While this approach garnered huge sums of money for scientific research, it could only solve those health problems directly influenced by genes. If just a few of the genome-wide association studies have correlated a specific disease to a particular gene or set of genes, we would be wise to reassess the validity of the assumption. I think this reawakens the question of how nature and nurture contribute to health.
