Nature Biotechnology
21, 247 - 254 (2003)
doi:10.1038/nbt0303-247
A systematic approach to modeling, capturing, and disseminating proteomics experimental data
Chris F. Taylor1, 2, Norman W. Paton2, Kevin L. Garwood2, Paul D. Kirby1, 2, David A. Stead3, Zhikang Yin3, Eric W. Deutsch4, Laura Selway3, Janet Walker3, Isabel Riba-Garcia5, Shabaz Mohammed5, Michael J. Deery7, Julie A. Howard8, Tom Dunkley8, Ruedi Aebersold4, Douglas B. Kell5, Kathryn S. Lilley8, Peter Roepstorff9, John R. Yates III10, Andy Brass1, 2, Alistair J.P. Brown3, Phil Cash3, Simon J. Gaskell5, Simon J. Hubbard6
& Stephen G. Oliver1
1 School of Biological Sciences, University of Manchester, Oxford Road, Manchester M13 9PL, UK.
2 Department of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, UK.
3 Department of Molecular & Cell Biology, Institute of Medical Science, University of Aberdeen, Aberdeen AB25 2ZF, UK.
4 Institute for Systems Biology, 1441 N 34th St., Seattle, Washington 98103.
5 Department of Chemistry, UMIST, PO Box 88, Manchester M60 1QD, UK.
6 Department of Biomolecular Sciences, UMIST, PO Box 88, Manchester M60 1QD, UK.
7 Inpharmatica Ltd, 60 Charlotte Street, London, UK.
8 Department of Biochemistry, University of Cambridge, Building O, Downing Site, Cambridge CB2 1QW, UK.
9 Department of Biochemistry & Molecular Biology, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark.
10 Department of Cell Biology, Scripps Clinic & Research Institute, La Jolla, California 92037.
Correspondence should be addressed to Stephen G. Oliver (steve.oliver@man.ac.uk)

Both the generation and the analysis of proteome data are becoming
increasingly widespread, and the field of proteomics is moving incrementally
toward high-throughput approaches. Techniques are also increasing in complexity
as the relevant technologies evolve. A standard representation of both the
methods used and the data generated in proteomics experiments, analogous to
that of the MIAME (minimum information about a microarray experiment)
guidelines for transcriptomics, and the associated MAGE (microarray gene
expression) object model and XML (extensible markup language) implementation,
has yet to emerge. This hinders the handling, exchange, and dissemination of
proteomics data. Here, we present a UML (unified modeling language) approach to
proteomics experimental data, describe XML and SQL (structured query language)
implementations of that model, and discuss capture, storage, and dissemination
strategies. These make explicit what data might be most usefully captured about
proteomics experiments and provide complementary routes toward the
implementation of a proteome repository.