The human intestinal microbiota is composed of 10¹³ to 10¹⁴ microorganisms that have a collective genome, the microbiome, which contains at least 100 times as many genes as our own genome. Most of these cells reside in our colon, where densities approach 10¹¹ to 10¹² cells per ml, the highest density recorded for any microbial habitat. These microbial communities, however, remain largely unstudied, and we have little understanding of their influence on human development, physiology, immunity and nutrition.

In the past, shedding light on human-associated microbial communities has been limited by traditional microbiological approaches that focus on the study of individual species as isolated units. Most microbial species have not been successfully isolated as viable specimens for analysis, and for those species that have been isolated, analyses of genetic make-up, gene-expression patterns and metabolic pathways have rarely extended to inter-species or microbial–host interactions. However, the recent development of techniques that allow the genomic analysis of an assemblage of microorganisms, including those species that cannot be cultured by standard techniques, is now allowing researchers to address fundamental questions about microbial communities in their natural environments. To take advantage of these recent advances, the United States (US) National Institutes of Health (NIH) Roadmap has initiated the Human Microbiome Project (HMP), with the mission of “generating resources enabling comprehensive characterization of the human microbiota and analysis of its role in human health and disease”.

By leveraging both metagenomic and traditional approaches to genomic DNA sequencing, it is envisaged that the HMP will lay the foundation for further studies of human-associated microbial communities. Key questions that will form the focus of this initiative include: determining whether individuals share a core human microbiome; understanding whether changes in the human microbiome can be correlated with changes in human health; and developing the new technological and bioinformatic tools that will be needed to support these objectives. To achieve these goals, the NIH plans to award a total of US$115 million to researchers over the next 5 years.

The HMP initiative will begin by developing a reference set of microbial genome sequences and performing a preliminary characterization of the human microbiome, a task that will involve sequencing up to 600 bacterial genomes from isolates that are associated with different human body sites, including the gastrointestinal and female urogenital tracts, the oral cavity, the nasal and pharyngeal tracts, and the skin. To this end, US$8.2 million has been awarded to 4 US-based sequencing centres, with objectives that include sequencing 200 human-isolated microbial genomes and performing preliminary 16S ribosomal DNA gene metagenomic sequence analyses to estimate the complexity of the microbiota at different body sites. As this groundwork is being laid, a second initiative will aim to determine the relationship between human health and changes in the human microbiome. One obvious obstacle that could impede progress is the difficulty of analysing the vast quantities of data that will be generated from the start of the initiative. Consequently, development of the appropriate analysis tools and establishment of a coordinated data-analysis resource will be crucial if meaningful conclusions are to be reached.

The HMP initiative will be welcomed for several reasons, not least of which is the assumption that the techniques and technologies pioneered by this project will not be limited to studies of human health, but will also be applicable to the study of microorganisms in a wide range of environments. More importantly, however, there is an inherent need for such an initiative, given the scale and complexity of the task. Not only can a large number of different microbial species contribute to the human microbiota, but there is probably enormous variation between the microbial communities that are resident in different hosts, as well as considerable temporal and spatial microbial variation within an individual host. In addition to these considerations, we have no understanding of the potential alterations to these communities that contribute to, and result from, human disease.

For these reasons, a collaborative, well-funded initiative that embraces ongoing international efforts — not unlike the Human Genome Project — is the only realistic path to ensure a rich and comprehensive data set that is publicly available for use by all investigators in their efforts to understand and improve human health. We wish them well.