In 2008, five years after the US National Institutes of Health launched the monumental effort that was ENCODE (Encyclopedia of DNA Elements), it initiated the Roadmap Epigenomics Project.

The goal for ENCODE was to characterize functional features in DNA (epigenetic marks such as histone modifications, chromatin accessibility and DNA methylation) and RNA expression in close to 150 cell lines; the Epigenomics Roadmap was launched to complement this resource and look at these features in stem cells and ex vivo tissues that are relevant for disease. Although the sheer amount of resulting data can seem intimidating to navigate, researchers should be encouraged to take the plunge and see what is there—and what is still missing and may need their input.

The Epigenomics Roadmap Consortium has three main goals: to create a public resource of disease-relevant human epigenomic data, to design and improve technologies that allow researchers to study epigenetic marks more efficiently, and to understand how epigenetic marks are established and what their biological functions are.

Initially, the participants focused on technologies, sample selection and preparation, and data flow within the consortium. Advances in technology, such as high-throughput sequencing, and in biological methods, such as the cultivation and differentiation of stem cells, helped to generate valuable data. Within the first year of the project, it became apparent that one big challenge lay in disseminating the data to the public. In response, several epigenomics repositories, such as the Human Epigenome Atlas and the WashU Epigenome Browser, were created. Integrative analytical approaches that allow comparative epigenome analysis followed.

With over 120 epigenomes now characterized, the consortium has made substantial progress toward its first two goals.

Key findings and resources are being published in February and March in Nature and several Nature research journals (for example, Nature Biotechnology, Nature Communications and Nature Protocols, as well as on p230 and p265 in this issue of Nature Methods). A special website (http://www.nature.com/epigenomeroadmap/) presents the papers as well as 'threads' highlighting topics that are otherwise covered only in subsections of individual papers.

How can researchers navigate and use the data? A good place to start is the Roadmap Epigenomics landing page (http://roadmapepigenomics.org/), which houses data, protocols and links to browsers. Users can, for example, select a particular cell type or tissue, home in on the epigenetic marks of interest and compare them across all cell types. As one user describes it, “I think the biggest compliment I can pay the resource is that, many times, it's saved us from having to do an experiment—instead, we could test the hypothesis computationally. Of course, experimental follow-up was still needed eventually, but the point is, we are now performing better, more targeted experiments.”

Amid the current, rather dire, state of research funding (Nat. Methods 11, 1077, 2014), some may ask whether the investment of close to $200 million in the project is the best use of money allocated for research. The answer should be an emphatic yes. Epigenomic regulation is increasingly recognized as a contributor to health or susceptibility to disease. Unlike for the human genome, it is impossible to generate one reference to which epigenomes from diseased cells can be compared. The epigenome of a stem cell will differ from that of a differentiated cell, and thus a reference library representing every cell type relevant for a disease is needed. A resource at that scale cannot be generated by individual investigators or even small groups of them. In the long run, the ability to test some of their hypotheses in silico will save researchers time and money.

Tools to tackle the third goal of the project, understanding the biological function of epigenetic marks, are only now emerging. Although many marks have been correlated with different transcriptional processes, it has been difficult to establish causation in most cases. The recent development of the clustered, regularly interspaced, short palindromic repeats (CRISPR)-Cas9 editing system and its promise of easy genome manipulation should enable researchers to target and rewrite particular epigenetic marks and should go a long way toward assessing their functional impact.

The Epigenomics Roadmap is not the only project seeking to comprehensively profile epigenomes. The European Union–funded Blueprint initiative, for example, focuses on the epigenomic landscape in healthy and malignant hematopoietic cells. Both projects will feed into the International Human Epigenome Consortium, which has the goal of deciphering 1,000 epigenomes in the next ten years.

As these projects grow, researchers will need to become more comfortable with 'big data', both to use them to their advantage and to fill in the blanks the data cannot address.