TY - JOUR
T1 - State of the human proteome in 2014/2015 as viewed through PeptideAtlas
T2 - Enhancing accuracy and coverage through the AtlasProphet
AU - Deutsch, Eric W.
AU - Sun, Zhi
AU - Campbell, David
AU - Kusebauch, Ulrike
AU - Chu, Caroline S.
AU - Mendoza, Luis
AU - Shteynberg, David
AU - Omenn, Gilbert S.
AU - Moritz, Robert L.
N1 - Publisher Copyright: © 2015 American Chemical Society.
PY - 2015/9/4
Y1 - 2015/9/4
N2 - The Human PeptideAtlas is a compendium of the highest quality peptide identifications from over 1000 shotgun mass spectrometry proteomics experiments collected from many different laboratories, all reanalyzed through a uniform processing pipeline. The latest 2015-03 build contains substantially more input data than past releases, is mapped to a recent version of our merged reference proteome, and uses improved informatics processing and the development of the AtlasProphet to provide the highest quality results. Within the set of ∼20 000 neXtProt primary entries, 14 070 (70%) are confidently detected in the latest build, 5% are ambiguous, 9% are redundant, leaving the total percentage of proteins for which there are no mapping detections at just 16% (3166), all derived from over 133 million peptide-spectrum matches identifying more than 1 million distinct peptides using AtlasProphet to characterize and classify the protein matches. Improved handling for detection and presentation of single amino-acid variants (SAAVs) reveals the detection of 5326 uniquely mapping SAAVs across 2794 proteins. With such a large amount of data, the control of false positives is a challenge. We present the methodology and results for maintaining rigorous quality along with a discussion of the implications of the remaining sources of errors in the build.
KW - Human Proteome Project
KW - PeptideAtlas
KW - observed proteome
KW - repositories
KW - shotgun proteomics
KW - tandem mass spectrometry
UR - http://www.scopus.com/inward/record.url?scp=84941031116&partnerID=8YFLogxK
U2 - 10.1021/acs.jproteome.5b00500
DO - 10.1021/acs.jproteome.5b00500
M3 - Article
C2 - 26139527
AN - SCOPUS:84941031116
SN - 1535-3893
VL - 14
SP - 3461
EP - 3473
JO - Journal of Proteome Research
JF - Journal of Proteome Research
IS - 9
ER -