The state of the human proteome in 2012 as viewed through PeptideAtlas

Terry Farrah, Eric W. Deutsch, Michael R. Hoopmann, Janice L. Hallows, Zhi Sun, Chung Ying Huang, Robert L. Moritz

Research output: Contribution to journal > Article > peer-review

107 Citations (Scopus)

Abstract

The Human Proteome Project was launched in September 2010 with the goal of characterizing at least one protein product from each protein-coding gene. Here we assess how much of the proteome has been detected to date via tandem mass spectrometry by analyzing PeptideAtlas, a compendium of human-derived LC-MS/MS proteomics data from many laboratories around the world. All data sets are processed with a consistent set of parameters using the Trans-Proteomic Pipeline and subjected to a 1% protein FDR filter before inclusion in PeptideAtlas. Therefore, PeptideAtlas contains only high-confidence protein identifications. To increase proteome coverage, we explored new comprehensive public data sources for data likely to add new proteins to the Human PeptideAtlas. We then folded these data into a Human PeptideAtlas 2012 build and mapped it to Swiss-Prot, a protein sequence database curated to contain one entry per human protein-coding gene. We find that this latest PeptideAtlas build includes at least one peptide for each of ∼12500 Swiss-Prot entries, leaving ∼7500 gene products yet to be confidently cataloged. We characterize these PA-unseen proteins in terms of tissue localization, transcript abundance, and Gene Ontology enrichment, and propose reasons for their absence from PeptideAtlas and strategies for detecting them in the future.

Original language: English
Pages (from-to): 162-171
Number of pages: 10
Journal: Journal of Proteome Research
Volume: 12
Issue number: 1
Publication status: Published - 4 Jan 2013
Externally published: Yes

Keywords

  • Database
  • Human Proteome Project
  • LC-MS/MS
  • PeptideAtlas
  • Protein inference
