Abstract
Until recently, nutrigenomics was mainly about transcriptomics-related data. That already confronted us with overwhelming analytical problems. We learned to treat genome-wide expression studies, and studies directed at gene expression regulation, mathematically and statistically. Nutrigenomics researchers had to become bilingual, speaking English and R [1], and learned to think about co-expression, clusters and false discovery rates. The latter in fact proved to be a trap: removing all the false positives made us lose the information we were really interested in. To understand the results of our genomics experiments we often had to confront what we were measuring with what we already knew. After all, false positives are not likely to all be related to the same meaningful biological process. That called for the development of new analytical tools like Cytoscape for network analysis and PathVisio for pathway analysis. More importantly, we had to structure what we know. Text mining and data mining helped us to do that, but what was really needed was mobilization of all the knowledge that is present in the heads of the scientific community. WikiPathways was our contribution to the rapidly emerging field of community curation. Thus we became able to integrate different types of technologies that span the full gene expression pipeline and to understand the results in their biological context.
Today the story repeats itself. Genome-wide genetics is becoming real. We can do genome-wide association studies, and soon we will be able to sequence individual genomes in relation to food intake and phenotypic responses. And then what? How can we deal with that new avalanche of data? The oversampling problems will be a few orders of magnitude larger; after all, there can be hundreds of SNPs in every gene. There will simply be too many to tell which SNPs are important from the data alone. We will again have to relate them to the biological processes. But is that enough? I think not.
We will only understand the outcome of those large-scale genetics studies if we do more than attribute the SNPs to genes and thereby to pathways. We will also have to consider the actual sequences and see what functional effect each SNP causes. Is it likely to influence transcription factor binding, miRNA effects, or protein-protein interactions? This calls for new types of data integration, for which we already have the tools. And it calls for new creative ways to do that. What we really need is teams of creative minds. Some new initiatives seem to show that these are already being formed.
1: http://www.r-project.org
Cite this article
Evelo, C. Using a data triangle to understand molecular nutrition. Nat Prec (2011). https://doi.org/10.1038/npre.2011.5689.1
DOI: https://doi.org/10.1038/npre.2011.5689.1