Submitted on September 4, 2007
Revised on February 13, 2008
Accepted on February 25, 2008
Post-experiment monoisotopic mass filtering and refinement (PE-MMR) of tandem mass spectrometric data increases accuracy of peptide identification in LC/MS/MS
Byunghee Shin, Hee-Jung Jung, Seok-Won Hyung, Hokeun Kim, Dongkyu Lee, Cheolju Lee, Myeong-Hee Yu, and Sang-Won Lee
Chemistry, Korea University, Seoul 136-701
Corresponding Author: sw_lee{at}korea.ac.kr
Methods for processing tandem mass spectrometric (MS/MS) data to achieve accurate peptide identification are the subject of much current research. In this paper we describe a new method for filtering MS/MS data and refining precursor masses that provides highly accurate analyses of massive sets of proteomic data. This method, termed post-experiment monoisotopic mass filtering and refinement (PE-MMR), consists of several data processing steps: 1) generation of lists of all monoisotopic masses observed in a whole LC/MS experiment; 2) clustering of the monoisotopic masses of a peptide into unique mass classes (UMCs) based on their masses and LC elution times; 3) matching of the precursor masses of the MS/MS data to the representative mass of a UMC; and 4) filtering of the MS/MS data based on the presence of corresponding monoisotopic masses, and refinement of the precursor ion masses using the UMC mass. PE-MMR increases the throughput of proteomic data analysis by efficiently removing garbage MS/MS data prior to database searching, and improves the mass measurement accuracies [i.e., 0.05 ± 1.49 ppm for yeast data (from 4.46 ± 2.81 ppm) and 0.03 ± 3.41 ppm for glycopeptide data (from 4.8 ± 7.4 ppm)] for an increased number of identified peptides. In proteomic analyses of glycopeptide-enriched samples, PE-MMR processing greatly reduces false glycopeptide identifications by correctly assigning the monoisotopic masses of the precursor ions prior to database searching. By applying this technique to analyses of proteome samples of varying complexity, we demonstrate that PE-MMR is an effective and accurate method for treating massive sets of proteomic data.
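The processing steps listed in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Feature`, `UMC`, `cluster_umcs`, and `filter_and_refine` are hypothetical, the greedy clustering rule is a simple stand-in, the mean is only one possible choice of representative UMC mass, and the tolerances (10 ppm, 0.5 min) are assumed values chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Feature:
    """One monoisotopic mass observed in an LC/MS scan (step 1 output)."""
    mass: float  # monoisotopic mass in Da
    rt: float    # LC elution time in minutes


@dataclass
class UMC:
    """Unique mass class: features of one peptide grouped by mass and time."""
    members: list

    @property
    def mass(self):
        # Representative mass; the mean is an assumption made for this sketch.
        return mean(f.mass for f in self.members)

    @property
    def rt_range(self):
        rts = [f.rt for f in self.members]
        return min(rts), max(rts)


def ppm_error(observed, reference):
    """Signed mass error in parts per million."""
    return (observed - reference) / reference * 1e6


def cluster_umcs(features, mass_tol_ppm=10.0, rt_gap=0.5):
    """Step 2 (simplified): greedily group features by mass and elution time."""
    umcs = []
    # Features are visited in elution order, so each new feature elutes
    # no earlier than any member already placed in a UMC.
    for f in sorted(features, key=lambda f: (f.rt, f.mass)):
        for u in umcs:
            close_mass = abs(ppm_error(f.mass, u.mass)) <= mass_tol_ppm
            close_rt = f.rt - u.rt_range[1] <= rt_gap
            if close_mass and close_rt:
                u.members.append(f)
                break
        else:
            umcs.append(UMC([f]))
    return umcs


def filter_and_refine(precursors, umcs, mass_tol_ppm=10.0, rt_tol=0.5):
    """Steps 3 and 4: match each precursor to a UMC, drop unmatched spectra,
    and replace the precursor mass with the UMC mass.

    `precursors` is an iterable of (scan_id, precursor_mass, rt) tuples.
    """
    refined = []
    for scan_id, mass, rt in precursors:
        best = None
        for u in umcs:
            lo, hi = u.rt_range
            if lo - rt_tol <= rt <= hi + rt_tol:
                err = abs(ppm_error(mass, u.mass))
                if err <= mass_tol_ppm and (best is None or err < best[0]):
                    best = (err, u)
        if best is not None:  # spectra with no matching UMC are filtered out
            refined.append((scan_id, best[1].mass))
    return refined
```

In this sketch, a spectrum whose precursor matches no UMC within the tolerances is discarded before database searching, and each surviving spectrum carries the UMC mass in place of its originally recorded precursor mass.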