Even as next-generation sequencing platforms have delivered rapid and accurate genomic sequencing to scientists, the process of annotation and functional characterization of genes has lagged behind, and the gap continues to widen.

“Every single-cell or environmental genomic project has added a huge number of putative enzymes, the functions of which are often unknown and at best deduced from sequence comparison,” says Manuel Ferrer of the Institute for Catalysis in Madrid. “For example, from the 27 million open reading frames found in the Global Ocean Sampling Project—coding 5.7 million nonredundant protein sequences—only a few dozen have been characterized.”

In an effort to decouple the science of metabolomics—the full characterization of metabolic processes taking place in a biological specimen—from genomic analysis, Ferrer and colleagues, including Peter Golyshin of Bangor University in the UK, developed an assay based on enzyme-substrate interactions. They built what is essentially a metabolite microarray, in which a large collection of known metabolites and enzymatic substrates are tethered onto a slide with a specially designed linker and then exposed to lysate derived from cells of interest. Each target molecule incorporates a Cy3 dye label that is subject to intramolecular quenching; upon enzymatic binding and catalysis, the substrate is released, and the quenching is relieved. “The fluorescent signal obtained for each spot provides a quantitative measure of the enzyme activity present in [a cellular] lysate, and bioinformatic analysis of the array results provides a global overview of the metabolic network of the cell—the 'reactome'—at the moment of sampling,” explains Ferrer. The substrate linkers are also designed to physically trap reactive enzymes so that array results of interest can be investigated further via mass spectrometric analysis of proteins captured by a metabolite or subset of metabolites.
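As a rough illustration of how a per-spot fluorescent signal might be turned into a list of active substrates, the Python sketch below subtracts a background reading from each spot and keeps the spots that exceed a threshold. It is a minimal sketch, not the team's analysis: the function name `call_active_substrates`, the background and threshold values, and the readings are all hypothetical.

```python
from typing import Dict


def call_active_substrates(
    spot_signal: Dict[str, float],
    background: float,
    threshold: float,
) -> Dict[str, float]:
    """Return the background-corrected signal for every spot above the threshold.

    Hypothetical helper for illustration; not taken from the study.
    """
    # Subtract the background reading and clamp negative values to zero.
    corrected = {
        name: max(signal - background, 0.0)
        for name, signal in spot_signal.items()
    }
    # Keep only spots whose corrected signal exceeds the chosen threshold.
    return {name: value for name, value in corrected.items() if value > threshold}


# Made-up readings in arbitrary fluorescence units, for illustration only.
readings = {"pyruvate": 1250.0, "citrate": 310.0, "succinate": 980.0}
active = call_active_substrates(readings, background=300.0, threshold=100.0)
print(sorted(active))  # ['pyruvate', 'succinate']
```

In practice the background and threshold would have to come from control spots and replicate measurements; the fixed numbers here only keep the example short.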

A key advantage of this array approach is that, by focusing on biochemically relevant interactions, it becomes essentially species-independent. “The molecules that the array contains are universal for all forms of life; in fact, the metabolites collectively represent the central metabolic pathways of all cellular systems,” says Ferrer. “For this reason, it may be useful for any kind of sample, ranging from single cells, to environmental samples, to tissues, blood samples and so on.”

In an initial trial, the team characterized the 'reactome' of the bacterium Pseudomonas putida and observed a strong correlation between the measured enzyme-substrate profile and predictions based on this species' annotation in the Kyoto Encyclopedia of Genes and Genomes (KEGG), the source for many of the substrates used on their array. Beyond confirming existing annotations, they could also directly assign functions to dozens of hypothetical proteins whose existence had been predicted based solely on sequence data, and refine the functional annotations of many previously characterized enzymes.
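One simple way to picture the comparison between an observed profile and an annotation-based prediction is as an overlap between two sets of substrates. The Python sketch below uses set overlap and a Jaccard index in place of whatever correlation measure the team actually applied. The function names and the substrate lists are hypothetical and are not drawn from KEGG or from the study.

```python
from typing import Dict, Set


def compare_profiles(observed: Set[str], predicted: Set[str]) -> Dict[str, Set[str]]:
    """Split substrates into confirmed, observed-only and predicted-only groups."""
    return {
        "confirmed": observed & predicted,        # active and predicted by annotation
        "observed_only": observed - predicted,    # active but not predicted
        "predicted_only": predicted - observed,   # predicted but not detected
    }


def jaccard(observed: Set[str], predicted: Set[str]) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    union = observed | predicted
    return len(observed & predicted) / len(union) if union else 1.0


# Made-up substrate sets, for illustration only.
observed = {"pyruvate", "succinate", "toluene"}
predicted = {"pyruvate", "succinate", "citrate"}

groups = compare_profiles(observed, predicted)
score = jaccard(observed, predicted)
print(sorted(groups["observed_only"]), score)  # ['toluene'] 0.5
```

Under this framing, the "observed only" group is where activities missing from the annotation would show up, which is the kind of result that could point to unannotated or hypothetical proteins.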

Investigating the ecosystem in an environmental sample increases the metabolomic challenge by orders of magnitude. “One gram of soil can contain millions of reactions, enzymes and microbes,” says Ferrer. However, because of its species-independence, the reactome chip proved useful for culling informative metagenomic data from diverse environments. For example, the team found that a mineral-rich geothermal pool predominantly contained species with heightened enzymatic capacity for iron and sulfur oxidation, whereas a sample of heavily polluted water from the Barents Sea was highly enriched for species with adaptations for efficient petroleum degradation and hydrocarbon use.

Other potential applications for this technology could include characterization of physiological changes resulting from cancer or infection, or analysis of metabolic alterations in transgenic plants, and Ferrer is keen to forge new collaborations. “We believe that the technology offers many possibilities for doing both basic and applied research,” he says.