Originally published In Press as doi:10.1074/mcp.M700558-MCP200 on November 12, 2008.
Molecular & Cellular Proteomics 8:547-557, 2009.
© 2009 by The American Society for Biochemistry and Molecular Biology, Inc.
Research
Bayesian Nonparametric Model for the Validation of Peptide Identification in Shotgun Proteomics
Jiyang Zhang¶, Jie Ma¶, Lei Dou, Songfeng Wu, Xiaohong Qian, Hongwei Xie, Yunping Zhu||, and Fuchu He**
From the State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, China; the School of Mechanical Engineering and Automatization, National University of Defense Technology, Changsha 410073, China; and the ** Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, China
Tandem mass spectrometry combined with database searching allows high throughput identification of peptides in shotgun proteomics. However, although many solutions have been proposed, the validation of database search results still leaves room for improvement in the sensitivity, specificity, and generalizability of the validation algorithms. Here we developed a Bayesian nonparametric (BNP) model for the validation of database search results that incorporates several popular techniques in statistical learning: compression of the feature space with a linear discriminant function, flexible nonparametric estimation of probability density functions to accommodate the variable probability structure of complex problems, and a Bayesian method to calculate the posterior probability. Importantly, the BNP model is naturally compatible with the popular target-decoy database search strategy. We tested the BNP model on standard protein and real, complex sample data sets from multiple MS platforms and compared it with PeptideProphet, the cutoff-based method, and a simple nonparametric method that we proposed previously. The BNP model outperformed the other methods in sensitivity and generalizability on all data sets searched. Some high quality matches that had been filtered out by other methods were detected and assigned high probabilities by the BNP model. Thus, the BNP model can validate database search results effectively and extract more information from MS/MS data.
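The three ingredients named above (a linear discriminant score, nonparametric density estimation, and a Bayesian posterior tied to a target-decoy search) can be illustrated with a minimal sketch. It is not the authors' implementation. The following are all assumptions made for illustration: the two simulated features, fitting the Fisher discriminant between target and decoy matches, the Gaussian kernel with a Silverman bandwidth, and the estimate of the incorrect-match prior as the decoy-to-target count ratio (which presumes target and decoy databases of equal size). All function names are hypothetical.

```python
import math
import random

def fisher_weights(a_rows, b_rows):
    """Two-feature Fisher discriminant: w = Sw^-1 (mean_a - mean_b)."""
    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(2)]
    def scatter(rows, m):
        s = [[0.0, 0.0], [0.0, 0.0]]
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        return s
    ma, mb = mean(a_rows), mean(b_rows)
    sa, sb = scatter(a_rows, ma), scatter(b_rows, mb)
    # Pooled within-class scatter matrix [[a, b], [c, d]] and its 2x2 inverse.
    a, b = sa[0][0] + sb[0][0], sa[0][1] + sb[0][1]
    c, d = sa[1][0] + sb[1][0], sa[1][1] + sb[1][1]
    det = a * d - b * c
    dm = (ma[0] - mb[0], ma[1] - mb[1])
    return [(d * dm[0] - b * dm[1]) / det, (-c * dm[0] + a * dm[1]) / det]

def score(w, x):
    """Compress a feature vector to a single discriminant score."""
    return w[0] * x[0] + w[1] * x[1]

def silverman(xs):
    """Rule-of-thumb bandwidth for a Gaussian kernel."""
    n = len(xs)
    m = sum(xs) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

def kde(xs, h, s):
    """Gaussian kernel density estimate of the sample xs, evaluated at s."""
    z = 1.0 / (len(xs) * h * math.sqrt(2.0 * math.pi))
    return z * sum(math.exp(-0.5 * ((s - x) / h) ** 2) for x in xs)

def posterior_correct(s, target_scores, decoy_scores, h):
    """P(correct | s) = 1 - pi0 * f_incorrect(s) / f_target(s), clipped to [0, 1]."""
    pi0 = min(1.0, len(decoy_scores) / len(target_scores))  # assumed prior
    f_all = kde(target_scores, h, s)
    if f_all <= 0.0:
        return 0.0
    f_bad = kde(decoy_scores, h, s)  # decoy matches model incorrect matches
    return min(1.0, max(0.0, 1.0 - pi0 * f_bad / f_all))

# Simulated data (purely illustrative): target matches mix correct and
# incorrect identifications; decoy matches are incorrect by construction.
rng = random.Random(0)
def draw(center, n):
    return [(rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)) for _ in range(n)]

target = draw((4.0, 4.0), 500) + draw((0.0, 0.0), 500)
decoy = draw((0.0, 0.0), 500)

w = fisher_weights(target, decoy)
t_scores = [score(w, x) for x in target]
d_scores = [score(w, x) for x in decoy]
h = silverman(d_scores)  # one shared bandwidth keeps the density ratio consistent

p_high = posterior_correct(score(w, (4.0, 4.0)), t_scores, d_scores, h)
p_low = posterior_correct(score(w, (0.0, 0.0)), t_scores, d_scores, h)
print(p_high, p_low)
```

In this sketch a match scoring near the simulated correct-match cluster receives a posterior close to 1, whereas a match scoring near the decoy cluster receives a low posterior. Because the densities are estimated from the data rather than fitted to a fixed parametric form, the same procedure can in principle follow score distributions whose shape varies between data sets.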
|| To whom correspondence may be addressed: Beijing Proteome Research Center, 33 Life Science Park Rd., Changping District, Beijing 102206, China. Tel.: 86-10-80705225; E-mail: zhuyp{at}hupo.org.cn
 To whom correspondence may be addressed: Beijing Proteome Research Center, 33 Life Science Park Rd., Changping District, Beijing 102206, China. Tel.: 86-10-68171208; E-mail: hefc{at}nic.bmi.ac.cn
