TY - JOUR
T1 - A common open representation of mass spectrometry data and its application to proteomics research
AU - Pedrioli, Patrick G.A.
AU - Eng, Jimmy K.
AU - Hubley, Robert
AU - Vogelzang, Mathijs
AU - Deutsch, Eric W.
AU - Raught, Brian
AU - Pratt, Brian
AU - Nilsson, Erik
AU - Angeletti, Ruth H.
AU - Apweiler, Rolf
AU - Cheung, Kei
AU - Costello, Catherine E.
AU - Hermjakob, Henning
AU - Huang, Sequin
AU - Julian, Randall K.
AU - Kapp, Eugene
AU - McComb, Mark E.
AU - Oliver, Stephen G.
AU - Omenn, Gilbert
AU - Paton, Norman W.
AU - Simpson, Richard
AU - Smith, Richard
AU - Taylor, Chris F.
AU - Zhu, Weimin
AU - Aebersold, Ruedi
N1 - Funding Information: This project was funded in part by federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, under contract no. N01-HV-28179 and by grant no. 1R33CA93302 from the National Cancer Institute. The Institute for Systems Biology is supported by a generous gift from Merck and Co. We are grateful to SourceForge for hosting the project and Eugene Yi for providing the seven-protein mix data set. We would also like to acknowledge the following for endorsing the mzXML format: Philip C. Andrews, Tom Blackwell, Daniel Burns, Jayson Falkner, Panagiotis Papoulias, Abhik Shah, Peter Ulintz, Al Burlingame, Robert Chalkley, Karl Clauser, Bruno Domon, James Eddes, Robert Moritz, Daniel Figeys, Barry L. Karger, William Hancock, Tomas Rejtar, Peter James, Matthias Mann, Sanford Markey, Matthias Wilm, Ken Williams and Kratos Analytical Limited (a Shimadzu Group Company).
PY - 2004/1/1
Y1 - 2004/1/1
AB - A broad range of mass spectrometers are used in mass spectrometry (MS)-based proteomics research. Each type of instrument possesses a unique design, data system and performance specifications, resulting in strengths and weaknesses for different types of experiments. Unfortunately, the native binary data formats produced by each type of mass spectrometer also differ and are usually proprietary. The diverse, nontransparent nature of the data structure complicates the integration of new instruments into preexisting infrastructure, impedes the analysis, exchange, comparison and publication of results from different experiments and laboratories, and prevents the bioinformatics community from accessing data sets required for software development. Here, we introduce the 'mzXML' format, an open, generic XML (extensible markup language) representation of MS data. We have also developed an accompanying suite of supporting programs. We expect that this format will facilitate data management, interpretation and dissemination in proteomics research.
UR - http://www.scopus.com/inward/record.url?scp=8344284323&partnerID=8YFLogxK
U2 - 10.1038/nbt1031
DO - 10.1038/nbt1031
M3 - Review article
C2 - 15529173
AN - SCOPUS:8344284323
SN - 1087-0156
VL - 22
SP - 1459
EP - 1466
JO - Nature Biotechnology
JF - Nature Biotechnology
IS - 11
ER -