Nature Biotechnology
22, 1459 - 1466 (2004)
Published online: 4 November 2004 | doi:10.1038/nbt1031
A common open representation of mass spectrometry data and its application to proteomics research
Patrick G A Pedrioli1, Jimmy K Eng1, Robert Hubley1, Mathijs Vogelzang1, Eric W Deutsch1, Brian Raught1, Brian Pratt2, Erik Nilsson2, Ruth H Angeletti3, Rolf Apweiler4, Kei Cheung5, Catherine E Costello6, Henning Hermjakob4, Sequin Huang6, Randall K Julian Jr7, Eugene Kapp8, Mark E McComb6, Stephen G Oliver9, Gilbert Omenn10, Norman W Paton11, Richard Simpson8, Richard Smith12, Chris F Taylor4, Weimin Zhu4 & Ruedi Aebersold1
1 Institute for Systems Biology, 1441 North 34 Street, Seattle, Washington 98103-8904, USA.
2 Insilicos LLC, 4509 Interlake Avenue North, no. 223, Seattle, Washington 98103-6773, USA.
3 Albert Einstein College of Medicine, LMAP Room 405, Ullman Bldg., 1300 Morris Park Avenue, Bronx, New York 10461, USA.
4 EMBL Outstation European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
5 Center for Medical Informatics, Department of Anesthesiology, Yale University School of Medicine, PO Box 208009, New Haven, Connecticut 06520, USA.
6 Boston University School of Medicine, 715 Albany Street, R-806, Boston, Massachusetts 02118-2526, USA.
7 Lilly Research Laboratories, One Lilly Corporate Center, Indianapolis, Indiana 46285, USA.
8 Joint Proteomics Laboratory, Ludwig Institute For Cancer Research & The Walter and Eliza Hall Institute of Medical Research, Royal Melbourne Hospital, Parkville, Victoria, Australia 3050.
9 School of Biological Sciences, University of Manchester, The Michael Smith Building, Oxford Road, Manchester, M13 9PT, UK.
10 The University of Michigan Medical School, 1150 W. Medical Center Drive, Ann Arbor, Michigan 48109-0656, USA.
11 Department of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK.
12 Biological Sciences Division and Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, PO Box 999, Richland, Washington 99352, USA.
Correspondence should be addressed to Ruedi Aebersold (raebersold@systemsbiology.org).
A broad range of mass spectrometers are used in mass spectrometry (MS)-based proteomics research. Each type of instrument possesses a unique design, data system and performance specifications, resulting in strengths and weaknesses for different types of experiments. Unfortunately, the native binary data formats produced by each type of mass spectrometer also differ and are usually proprietary. The diverse, nontransparent nature of the data structure complicates the integration of new instruments into preexisting infrastructure, impedes the analysis, exchange, comparison and publication of results from different experiments and laboratories, and prevents the bioinformatics community from accessing data sets required for software development. Here, we introduce the 'mzXML' format, an open, generic XML (extensible markup language) representation of MS data. We have also developed an accompanying suite of supporting programs. We expect that this format will facilitate data management, interpretation and dissemination in proteomics research.