Along with the excitement over the completed genome sequences of several model organisms came the realization that sequence alone was not enough to understand how DNA translates into complex multicellular organisms.

To address this gap, the US National Human Genome Research Institute started the Encyclopedia of DNA Elements (ENCODE) project in 2003, with the goal of finding functional elements in the human genome. In 2007 the institute launched a complementary program, model organism (mod)ENCODE. The aim was to harness the power of large-scale genome-wide tools, already available for Drosophila melanogaster and Caenorhabditis elegans, to improve the annotation of their functional elements.

The modENCODE Consortium collected over 200 genome-wide data sets for C. elegans (Gerstein et al., 2010) and over 700 for D. melanogaster (modENCODE Consortium et al., 2010), profiling transcripts, transcription factor binding sites and chromatin structure across several developmental stages.

In D. melanogaster, the teams identified nine chromatin states, each indicative of specific functions (Kharchenko et al., 2010). To define these states, the researchers used chromatin immunoprecipitation with antibodies to 18 histone modifications, followed by array analysis of the enriched DNA. The resulting maps clearly distinguish functional features among genes, cell types and developmental stages.

The teams working on C. elegans used high-throughput sequencing of RNA derived from four major developmental stages to generate a comprehensive annotation of protein-coding genes, adding more than 1,600 new genes. The distribution of 19 histone modifications and two key histone variants revealed clear boundaries between high gene activity in the central regions of chromosomes and marks of repression along the peripheral arms.

It will be of immense value to compare these data sets with genome-wide data for other organisms, as those become available, to arrive at a functional annotation of these genomes.