Submitted on December 16, 2004
Revised on March 14, 2005
Accepted on March 16, 2005
Improving protein identification using complementary fragmentation techniques in Fourier transform mass spectrometry
Michael L. Nielsen, Mikhail M. Savitski, and Roman A. Zubarev
Laboratory for Biological and Medical Mass Spectrometry, Uppsala University, Uppsala 75123
Corresponding Author: Michael.Lund-Nielsen{at}bmms.uu.se
Identification of proteins by tandem mass spectrometry (MS/MS) is performed by matching experimental mass spectra against calculated spectra of all possible peptides in a protein database. The search engine assigns each spectrum a score indicating how well the experimental data agree with the expected spectrum; a higher score means greater confidence in the identification. One problem is false-positive identifications, which arise from incomplete data as well as from misleading ions in experimental mass spectra caused by gas-phase reactions, stray ions, contaminants and electronic noise. We employed a novel technique for reducing false positives, based on the combined use of two orthogonal fragmentation techniques, electron capture dissociation (ECD) and collisionally activated dissociation (CAD). Since ECD and CAD exhibit many complementary properties, their combined use greatly increased the specificity of the analysis, which was further strengthened by the high mass accuracy (1 ppm) afforded by Fourier transform mass spectrometry. The utility of this approach is demonstrated on a whole-cell lysate from Escherichia coli, analyzed in data-dependent acquisition mode. Complementary sequence information was extracted prior to the database search using software written in-house: only masses involved in complementary pairs in the MS/MS spectrum, from the same or from orthogonal fragmentation techniques, were submitted to the search. ECD/CAD identified twice as many proteins at a fixed, statistically significant confidence level, with Mascot scores 64% higher on average. The confidence in protein identification was thereby increased by more than one order of magnitude. The combined ECD/CAD searches were on average 20% faster than CAD-only searches. A specially developed test with scrambled MS/MS data revealed that the number of false-positive identifications was dramatically reduced by the combined use of CAD and ECD.
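The pre-search filtering step can be illustrated with a minimal sketch. It is not the in-house software itself, whose details are not given here. It assumes a deisotoped, charge-deconvoluted list of singly protonated fragment masses, and it covers only the simplest case, complementary b/y pairs within one CAD spectrum, whose masses sum to the neutral precursor mass plus two proton masses. The function name `complementary_pairs`, the parameter names and the example peak list are hypothetical; other ion-type combinations (for example ECD fragments) would need a different `target_sum` supplied by the caller.

```python
PROTON = 1.007276  # proton mass in Da


def complementary_pairs(masses, target_sum, tol_ppm=1.0):
    """Return (low, high) mass pairs whose sum matches target_sum within tol_ppm.

    Assumption: `masses` are singly protonated, deisotoped fragment masses.
    A two-pointer scan over the sorted list is used; each peak is paired
    at most once, which is a simplification for crowded spectra.
    """
    ordered = sorted(masses)
    tol = target_sum * tol_ppm * 1e-6  # absolute tolerance in Da
    pairs = []
    lo, hi = 0, len(ordered) - 1
    while lo < hi:
        total = ordered[lo] + ordered[hi]
        if abs(total - target_sum) <= tol:
            pairs.append((ordered[lo], ordered[hi]))
            lo += 1
            hi -= 1
        elif total < target_sum:
            lo += 1
        else:
            hi -= 1
    return pairs


# Illustrative data only: fragments of the tripeptide Gly-Ala-Lys
# (b1, b2, y1, y2) plus two unrelated "noise" peaks.
precursor_neutral = 274.164095
peaks = [58.028736, 101.5, 129.065846, 147.112801, 175.119, 218.149911]

# For b/y ions: m(b) + m(y) = M + 2 * proton mass.
pairs = complementary_pairs(peaks, precursor_neutral + 2 * PROTON, tol_ppm=1.0)

# Only masses that take part in a complementary pair are kept for the search.
filtered = sorted(m for pair in pairs for m in pair)
print(pairs)
print(filtered)
```

In this toy spectrum the two noise peaks have no complementary partner and are dropped, so only the four sequence-related masses would be passed on to the search engine. The tight tolerance mirrors the 1 ppm mass accuracy quoted above; a real implementation would have to choose it from the measured accuracy of both fragments.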