Both the generation and the analysis of proteome data are becoming increasingly widespread, and the field of proteomics is moving incrementally toward high-throughput approaches. Techniques are also increasing in complexity as the relevant technologies evolve. A standard representation of both the methods used and the data generated in proteomics experiments, analogous to that of the MIAME (minimum information about a microarray experiment) guidelines for transcriptomics, and the associated MAGE (microarray gene expression) object model and XML (extensible markup language) implementation, has yet to emerge. This hinders the handling, exchange, and dissemination of proteomics data. Here, we present a UML (unified modeling language) approach to proteomics experimental data, describe XML and SQL (structured query language) implementations of that model, and discuss capture, storage, and dissemination strategies. These make explicit what data might be most usefully captured about proteomics experiments and provide complementary routes toward the implementation of a proteome repository.

Special thanks go to Francesco Brancia, Jenny Ho, and Sandy Yates for their critical appraisal of the Schema at various stages. This work was supported by a grant from the Investigating Gene Function (IGF) Initiative of the Biotechnology & Biological Sciences Research Council to S.G.O., N.W.P., A.B., S.G., S.H., P.C., and A.J.P.B. for the COGEME (Consortium for the Functional Genomics of Microbial Eukaryotes) program. D.B.K. thanks the BBSRC for financial support, also under the IGF initiative. K.L.G. is supported by the North West Regional e-Science centre (ESNW), within the UK eScience Programme. Many people have contributed their advice and expertise to the design of PEDRo, at various meetings formal and otherwise, notably attendees at the 2002 Proteomics Standards Initiative meeting of the Human Proteome Organisation at the European Bioinformatics Institute.

  1. School of Biological Sciences, University of Manchester, Oxford Road, Manchester M13 9PL, UK.

    • Chris F. Taylor, Paul D. Kirby, Andy Brass & Stephen G. Oliver

  2. Department of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, UK.

    • Chris F. Taylor, Norman W. Paton, Kevin L. Garwood, Paul D. Kirby & Andy Brass

  3. Department of Molecular & Cell Biology, Institute of Medical Science, University of Aberdeen, Aberdeen AB25 2ZF, UK.

    • David A. Stead, Zhikang Yin, Laura Selway, Janet Walker, Alistair J.P. Brown & Phil Cash

  4. Institute for Systems Biology, 1441 N 34th St., Seattle, Washington 98103.

    • Eric W. Deutsch & Ruedi Aebersold

  5. Department of Chemistry, UMIST, PO Box 88, Manchester M60 1QD, UK.

    • Isabel Riba-Garcia, Shabaz Mohammed, Douglas B. Kell & Simon J. Gaskell

  6. Department of Biomolecular Sciences, UMIST, PO Box 88, Manchester M60 1QD, UK.

    • Simon J. Hubbard

  7. Inpharmatica Ltd, 60 Charlotte Street, London, UK.

    • Michael J. Deery

  8. Department of Biochemistry, University of Cambridge, Building O, Downing Site, Cambridge CB2 1QW, UK.

    • Julie A. Howard, Tom Dunkley & Kathryn S. Lilley

  9. Department of Biochemistry & Molecular Biology, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark.

    • Peter Roepstorff

  10. Department of Cell Biology, Scripps Clinic & Research Institute, La Jolla, California 92037.

    • John R. Yates III


Correspondence to Stephen G. Oliver.

