TY - JOUR
T1 - The HUPO proteomics standards initiative-mass spectrometry controlled vocabulary
AU - Mayer, Gerhard
AU - Montecchi-Palazzi, Luisa
AU - Ovelleiro, David
AU - Jones, Andrew R.
AU - Binz, Pierre Alain
AU - Deutsch, Eric W.
AU - Chambers, Matthew
AU - Kallhardt, Marius
AU - Levander, Fredrik
AU - Shofstahl, James
AU - Orchard, Sandra
AU - Vizcaíno, Juan Antonio
AU - Hermjakob, Henning
AU - Stephan, Christian
AU - Meyer, Helmut E.
AU - Eisenacher, Martin
PY - 2013/12/1
Y1 - 2013/12/1
N2 - Controlled vocabularies (CVs), i.e. a collection of predefined terms describing a modeling domain, used for the semantic annotation of data, and ontologies are used in structured data formats and databases to avoid inconsistencies in annotation, to have a unique (and preferably short) accession number and to give researchers and computer algorithms the possibility for more expressive semantic annotation of data. The Human Proteome Organization (HUPO)-Proteomics Standards Initiative (PSI) makes extensive use of ontologies/CVs in their data formats. The PSI-Mass Spectrometry (MS) CV contains all the terms used in the PSI MS-related data standards. The CV contains a logical hierarchical structure to ensure ease of maintenance and the development of software that makes use of complex semantics. The CV contains terms required for a complete description of an MS analysis pipeline used in proteomics, including sample labeling, digestion enzymes, instrumentation parts and parameters, software used for identification and quantification of peptides/proteins and the parameters and scores used to determine their significance. Owing to the range of topics covered by the CV, collaborative development across several PSI working groups, including proteomics research groups, instrument manufacturers and software vendors, was necessary. In this article, we describe the overall structure of the CV, the process by which it has been developed and is maintained and the dependencies on other ontologies.
UR - http://www.scopus.com/inward/record.url?scp=84879329259&partnerID=8YFLogxK
U2 - 10.1093/database/bat009
DO - 10.1093/database/bat009
M3 - Article
C2 - 23482073
AN - SCOPUS:84879329259
SN - 1758-0463
VL - 2013
JO - Database
JF - Database
M1 - bat009
ER -