Investigating MS2/MS3 Matching Statistics

Improvements in ion trap instrumentation have made n-dimensional mass spectrometry more practical. The overall goal of the study was to describe a model for making use of MS2 and MS3 information in mass spectrometry experiments. We present a statistical model for adjusting peptide identification probabilities based on the combined information obtained by coupling peptide assignments of consecutive MS2 and MS3 spectra. Using two data sets, a mixture of known proteins and a complex phosphopeptide-enriched sample, we demonstrate an increase in discriminating power of the adjusted probabilities compared with models using MS2 or MS3 data only. This work also addresses the overall value of generating MS3 data as compared with an MS2-only approach with a focus on the analysis of phosphopeptide data.

Molecular & Cellular Proteomics 6:71-87, 2007.
Advances in mass spectrometer design continue to propel proteomics research. One of the most widely used mass analyzers for protein work has historically been the ion trap, and a large proportion of the data from current mass spectrometry-based proteomics experiments are generated on such instruments. This trend continues with current generation "linear trap" instruments that are characterized by increased ion capacity and thus improved resolution and sensitivity (1,2). Standard proteomics approaches are based on the predictable fragmentation of peptides in the collision cell of the mass spectrometer and the subsequent interpretation of the resulting spectra to infer amino acid sequence, referred to as tandem mass spectrometry (MS/MS or MS2) (3-7). In practice, however, acquired MS/MS spectra are often noisy, contain only a small number of fragment ions due to incomplete peptide fragmentation, or reflect unanticipated instrumental or chemical artifacts. As a result, in a typical analysis of MS/MS spectra generated in a large scale experiment, only a small fraction of the spectra can be successfully interpreted and assigned a peptide sequence with high confidence (8,9).
Newer instrumentation supports alternative techniques for data generation that have the potential to improve peptide and protein identification. One such technique is three-stage mass spectrometry (MS3) in which peptide ions in an ion trap or ICR mass spectrometer are subjected to an additional stage of isolation and fragmentation. The faster acquisition times of newer linear trap instruments such as the LTQ provide the option of collecting MS3 spectra of abundant MS2 peaks with overall cycle times similar to those of normal MS2 cycles on older three-dimensional trap instruments. As a result, a number of researchers are choosing to routinely collect MS3 spectra during LC-MS/MS runs, as these spectra have the potential to provide additional information useful for peptide identification and characterization. This is deemed particularly important in the case of proteins identified by single peptides (10,11) and for the analysis of phosphopeptides, the spectra of which are frequently dominated by a major fragment ion representing neutral loss of the phosphate group from the precursor peptide. Therefore, phosphopeptides have been analyzed by automated data-dependent triggering of MS3 acquisition whenever the dominant neutral loss ion of the appropriate mass is detected in an MS2 spectrum (12-14). Fragmentation of the neutral loss ion typically provides significantly increased structural information via increased peptide bond cleavage. Similar approaches may be applied to other major neutral loss ions (e.g. loss of 64 Da from peptides containing methionine sulfoxide) and to excessive prolyl- or aspartyl-directed fragmentation. MS3 spectra have also proven useful in top-down analysis, both for protein identification and for characterization of specific sites of posttranslational modification (15,16).
Generally speaking, there are several ways of combining MS2 and MS3 spectra from the same peptide to improve peptide identification. One strategy involves integrating matching MS2 and MS3 spectra directly at the spectrum level, generating an "intersection spectrum" that contains only one type of ion, thus allowing simplified de novo sequencing of the peptide. This approach has been described by Zhang and McElvain (17), who demonstrated the usefulness of the technique in protein sequencing. Olsen and Mann (11) describe a custom scoring algorithm for MS3 spectra: their final score for a peptide is the product of the Mascot-generated MS2 and the custom MS3 scores. In glycoproteomics, it is frequently the case that MS2 and MS3 provide complementary structural information on a glycopeptide: information on the structure of side-chain carbohydrate moieties is generally obtained from the MS2 spectrum, whereas amino acid sequence information is more readily obtained in the MS3 (18). In the top-down technique described by Zabrouskov et al. (16), sequence tags are extracted from MS3 spectra using a de novo algorithm and used to complement correlated MS2 spectral data in a "hybrid" database search strategy implemented in the ProSight PTM search engine (19).
Related to the problem of MS2/MS3 spectrum integration, de novo sequencing-based algorithms have been described for combining pairs of spectra corresponding to unmodified and modified versions of the same peptide or pairs of spectra corresponding to the same peptide tagged with a light or heavy version of a labeling reagent (9,20,21,23). However, although de novo sequencing approaches are promising, no computational tools are currently available that can be robustly applied in a high throughput environment. As a result, analysis of MS2 and MS3 data is still largely carried out with a conventional database search approach using commercially available programs such as SEQUEST, Mascot, Spectrum Mill, Phenyx, Paragon, or open source programs X! Tandem, Open Mass Spectrometry Search Algorithm (OMSSA), InsPecT, or ProbID (24-29).
Although all existing database search tools can be used to identify peptides from both MS2 and MS3 spectra, automated analysis of those different types of spectra may not be identical. This often leads to the requirement that MS2 and MS3 spectra be separated for processing. The main reason for this is that the measured precursor mass associated with MS3 spectra will not always correspond to the mass of an appropriate database peptide calculated using the same conventional rules that are applied in the case of MS2 spectra. For example, in phosphopeptide analyses variable modifications of −18 Da due to loss of phosphoric acid from Ser or Thr residues need to be specified for MS3, whereas the normal +80-Da phosphorylation modification on Ser, Thr, and Tyr is used for MS2. It is computationally inefficient, and an unnecessary source of false positive identifications, to perform a combined search that permits both the −18-Da loss for MS2 spectra and the +80-Da addition for MS3 spectra.
Searching MS3 spectra separately from their parent MS2 spectra essentially decouples the two sets of scans. Intuitively, if analysis of successive MS2 and MS3 scans results in matching peptide sequences, there is an increased confidence in both identifications. The work described here attempts to provide a general, statistically sound assessment of the confidence achieved by combining the search results of MS2 and MS3 spectra from the same peptide. In contrast to the aforementioned work, we assume a work flow in which the MS2 and MS3 spectra are searched independently using a common search engine (namely SEQUEST in this work) and are independently statistically validated using PeptideProphet. We then recouple matching consecutive MS2 and MS3 scans and adjust the peptide probabilities initially computed by PeptideProphet to account for the new "linked" MS2/MS3 information. We describe a model that produces an adjusted probability of peptide identification and demonstrate, using a data set of MS2 and MS3 spectra generated using a control protein mixture, that such a correction can be used to better discriminate between correct and incorrect database search results. We also investigate ways to combine the adjusted MS2 and MS3 probabilities to compute a single confidence measure for their corresponding unique peptide. We then further demonstrate the utility of our method using a phosphopeptide-enriched data set generated from Drosophila melanogaster samples on an LTQ linear ion trap instrument. Finally we compare runs in which both MS2 and MS3 spectra are generated with an MS2-only method to address the overall benefit of generating MS3 data.

Sample Preparation and Mass Spectrometry
Two experimental data sets of MS/MS spectra were used in this work to evaluate the statistical model and to investigate its utility in the analysis of phosphopeptide-enriched samples. All spectra were acquired using an ESI linear ion trap tandem mass spectrometer (Thermo Electron's LTQ).

Phosphopeptide Sample
This sample is a trypsin-digested, IMAC-enriched D. melanogaster whole cell lysate. The preparation of the phosphopeptide samples is described in detail in Bodenmiller et al. (30). Several mass spectrometry analyses of this sample were conducted both for analysis of performance of the probability model and to test the value of generating MS3 data.
Peptides were loaded on a capillary (BGB Analytik, Böckten, Switzerland) reverse phase C18 column (75-μm inner diameter and 11 cm of bed length with Magic C18 AQ 5-μm, 200-Å resin (Michrom BioResources, Auburn, CA)) and then eluted from the capillary column at a flow rate of 200-300 nl/min to the mass spectrometer through an integrated electrospray emitter tip. Peptides were eluted for each analysis with a gradient from 12 to 33% acetonitrile, during which the ions were detected, isolated, and fragmented in a completely automated fashion. The exact settings for MSn acquisition were as follows.
Nine-protein Mixture. In the first scan event, all peptides eluting from the column were recorded in MS mode. The most intense ion was selected for product ion spectrum (MS2) in the second event. An MS3 spectrum of the most intense peak in the MS2 spectrum was automatically selected in the third scan event. The second and third events are then repeated two more times in the cycle, for the second and third most abundant MS1 ions, for a total cycle of seven events. A threshold of 5000 ion counts was used for triggering an MS2 attempt. Wide band activation was enabled for all MS2 and MS3 scan events. MS2 isolation width was set to 2.0 m/z, and MS3 isolation width was set to 4 m/z. For triggering an MS3 event the most intense ion had to be above 50 ion counts. No further restrictions were made for the selection of the MS3 precursor.
Phosphopeptide Sample. All peptides eluting from the column were recorded in MS mode in the first scan event. The most intense ion was selected for product ion spectrum (MS2) in the second event. An MS3 spectrum of the most intense peak in the MS2 spectrum, which for the phosphopeptide-containing sample is in most cases the neutral loss peak (of 98 Da) from a serine/threonine phosphopeptide, was automatically selected in the third scan event. These three events form one complete cycle. A threshold of 20,000 ion counts was used for triggering an MS2 attempt. Wide band activation was enabled for all MS2 and MS3 scan events. MS2 isolation width was set to 2 m/z, and MS3 isolation width was set to 3 m/z. For triggering an MS3 event the most intense ion had to be above 500 ion counts. No further restrictions were made for the selection of the MS3 precursor.
Phosphopeptide Sample: Additional Data Sets for Comparison of MS2-only with MS2/MS3 Methods. For the MS2/MS3 data set the data-dependent MSn spectra were acquired as follows. In the first scan event, all peptides eluting from the column were recorded in MS mode, and then the most intense ion was selected for product ion spectrum (MS2) in the second event. In the third event an MS3 spectrum was triggered specifically in the event of a phosphate neutral loss (−98 Da for singly, −49 Da for doubly, and −32.66 Da for triply charged peptides) in the MS2 event. The second and third events are then repeated two more times in the cycle, for the second and third most abundant MS1 ions, for a total cycle of seven events. For the MS2-only data set the data-dependent MSn spectra were acquired as follows. In the first scan event, all peptides eluting from the column were recorded in MS mode, and then the three most intense ions were consecutively selected for product ion spectrum (MS2) for a total cycle of four events. Further settings for these samples were as follows: wideband activation was enabled for all MS2 and MS3 scan events, MS2 isolation width was set to 2 m/z, and MS3 isolation width was set to 4 m/z. For triggering an MS3 event in the MS2/MS3 data set the most intense ion had to be above 50 ion counts. No further restrictions were made for the selection of the MS3 precursor.
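The charge-dependent neutral loss offsets quoted above follow from dividing the nominal 98-Da loss by the precursor charge. The following Python sketch illustrates that arithmetic; the function names and the 0.5 m/z tolerance are hypothetical choices for illustration and are not instrument settings reported in this study.

```python
# Nominal mass (Da) of the phosphate neutral loss, as quoted in the text.
NEUTRAL_LOSS_DA = 98.0


def neutral_loss_mz_shift(charge: int, loss_da: float = NEUTRAL_LOSS_DA) -> float:
    """Return the (negative) m/z shift caused by a neutral loss at a given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return -loss_da / charge


def is_neutral_loss_peak(precursor_mz: float, peak_mz: float, charge: int,
                         tol: float = 0.5) -> bool:
    """Check whether a peak sits at the expected neutral loss position.

    The tolerance `tol` (in m/z units) is an assumed value for this sketch.
    """
    expected = precursor_mz + neutral_loss_mz_shift(charge)
    return abs(peak_mz - expected) <= tol


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(z, round(neutral_loss_mz_shift(z), 2))
```

For a triply charged precursor the computed shift is 98/3, or about −32.67 m/z, which the text quotes as −32.66.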

Database Searching and Analysis of Results
mzXML files were generated from ThermoFinnigan *.raw files using the ReAdW tool available in the Trans-Proteomic Pipeline (TPP) platform (31-33). MS2 and MS3 peak list files in *.dta format were extracted separately from the mzXML files using the mzXML2Other tool with the -level option. For the 9-Mix data set, a custom fasta sequence file was constructed consisting of sequences corresponding to the proteins in the mixture and common contaminants appended to a reversed version of the International Protein Index human data set. Resulting *.dta files for the 9-Mix data set were searched with SEQUEST using the following parameters: peptide tolerance of 3.0 Da; b- and y-ion series; partial trypsin digestion, allowing for one missed cleavage site; a fixed modification of 57.02 Da was specified for cysteine; and a variable post-translational modification (PTM) of 16.0 Da was specified for methionine. MS3 data sets were searched using identical parameters. Note that partial trypsin specificity is required for searching MS3 spectra corresponding to the fragmentation of a selected y- or b-ion from the MS2 spectrum. If sufficient computational resources are available, searching MS2 spectra allowing for partially tryptic peptides can often be beneficial and result in additional identifications. However, doing so requires that the results are properly analyzed with a tool that accommodates tryptic termini information in the statistical model, such as PeptideProphet. In addition, a subset of MS3 spectra from this data set was also searched allowing for the C-terminal variable modification of −18.0 Da to accommodate the possibility that the MS3 precursor is a b-ion (11). The results indicated that including this modification does not significantly alter the overall performance; in fact, accommodating the variable modification decreases the number of identifications slightly (due to loss of a number of true peptide assignments because of increases in search space).
Based on this, the C-terminal modification was not used in the final analysis of data presented in this study. The resulting data set contained 76,873 peptide assignments, counting 2+/3+ duplicates: 48,921 MS2 (554 singly charged, 24,233 doubly charged, and 24,134 triply charged) and 27,952 MS3 (4582, 11,700, and 11,670 singly, doubly, and triply charged, respectively). Note that because of the charge state ambiguity (in the case of low mass accuracy data such as the data sets used in this work, the charge state of a multiply charged peptide ion cannot be reliably determined), most of the multiply charged spectra were searched twice, assuming a 2+ or 3+ charge state. Furthermore, due to the relatively small number of singly charged MS2 spectra, all such spectra were left out of the subsequent analysis.
The database for the phosphopeptide-enriched samples consisted of all D. melanogaster sequences exported from the UniProt database (34), 26,311 entries total, to which the reversed set of sequences was appended. Parameters for the MS2 search were as follows: peptide tolerance of 3.0 Da; partial trypsin digestion with one possible missed cleavage; fixed modification of 57.02 Da for cysteine; variable modifications of 80 Da were specified for Ser, Thr, and Tyr; and a maximum of four PTMs per peptide. The MS3 spectra were searched with the same set of parameters except that variable modifications of −18 Da on Ser and Thr (instead of +80 Da) were specified to accommodate loss of phosphoric acid leading to a dehydroalanine or dehydrobutyric acid, respectively. SEQUEST database searching for the primary phosphopeptide data set (excluding the MS2/MS3 to MS2-only comparisons) resulted in 28,865 peptide assignments, counting 2+/3+ duplicates: 16,647 MS2 (143 singly charged, 8483 doubly charged, and 8021 triply charged) and 12,218 MS3 (547, 5895, and 5776 singly, doubly, and triply charged, respectively).

Processing of MS2 and MS3 Search Results
Search results for each LC-MS/MS run were generated by first producing an html result file using the out2summary tool, exporting one result file for each MS level, for each run: a total of six files for the 9-Mix data set and two files for the phospho data set. The html results were then converted into pepXML format (31) using Sequest2XML. PeptideProphet (32) was run on each result set, generating probability scores for each search result that are added to the pepXML documents. For the phospho data sets, PeptideProphet was run with the "-l" option, which results in alternate processing of ΔCn scores marked with "*", i.e. results for which the top and second highest ranked peptide assignments to a spectrum have homologous sequences (>70% sequence identity). With this option on, PeptideProphet will use the Xcorr score of the first non-homologous lower scoring peptide match when computing the ΔCn score of the best scoring peptide. This option is beneficial in the event that the search returns several identical results that differ only by modification site for a sequence, as often occurs in phosphorylated peptide identifications. Resulting files were parsed and processed to generate all matching statistics using a custom set of scripts implemented in Python. Certain subsets of data were also exported into a local MySQL database to facilitate generation of specific statistics.

Linking MS2 and MS3 Scans and Search Results
The spectra in these experiments were generated in an interlaced manner, i.e. the scan cycle on the instrument followed the format MS1 → MS2 → MS3 → MS2 → MS3 → MS2 → MS3 or MS1 → MS2 → MS3, with the MS2 scans triggered in a data-dependent manner from the MS1 and the MS3 scans triggered from the preceding MS2. As a result, a set of linked MS2/MS3 scans was generated based on consecutive scan numbers. In the resulting data set, MS2 scans with no consecutive MS3 were retained and designated as linked to a null MS3 identification. MS3 scans without preceding MS2 scans should not occur physically but do appear in these data for several reasons. In particular, MS2 peak lists that produced no database search result are typically not reported. Also, some spectra containing only a few peaks may be filtered out by the data conversion software. The small number of these "orphaned" MS3 scans invariably result in incorrect peptide identifications and are eliminated from subsequent analysis.
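The linking rule described above (an MS3 scan is linked to the MS2 scan with the immediately preceding scan number; unlinked MS2 scans receive a null MS3; MS3 scans without a preceding MS2 are orphans) can be sketched as follows. This is a minimal illustration, not the scripts used in this study; the input format, a list of (scan number, MS level) tuples, is an assumption.

```python
from typing import List, Optional, Tuple

Scan = Tuple[int, int]  # (scan_number, ms_level); assumed input format


def link_scans(scans: List[Scan]) -> Tuple[List[Tuple[int, Optional[int]]], List[int]]:
    """Link each MS2 scan to the MS3 scan with the next scan number, if any.

    Returns (links, orphans): links holds (ms2_scan, ms3_scan_or_None) pairs,
    orphans holds MS3 scan numbers with no MS2 scan directly before them.
    """
    level = {number: ms_level for number, ms_level in scans}
    links: List[Tuple[int, Optional[int]]] = []
    orphans: List[int] = []
    for number in sorted(level):
        if level[number] == 2:
            nxt = number + 1
            # A missing or non-MS3 successor yields a link to a null MS3.
            links.append((number, nxt if level.get(nxt) == 3 else None))
        elif level[number] == 3 and level.get(number - 1) != 2:
            orphans.append(number)
    return links, orphans


if __name__ == "__main__":
    demo = [(1, 1), (2, 2), (3, 3), (4, 2), (5, 2), (6, 3), (8, 3)]
    print(link_scans(demo))
```

In the demonstration input, scan 4 is an MS2 with no consecutive MS3 and scan 8 is an MS3 whose preceding MS2 is absent from the results.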
Due to uncertainty in the charge state, each multiply charged scan was searched twice (in both the 2+ and 3+ charge states), resulting in multiple search results for each scan. Consideration needs to be given to potential links between MS2 and MS3 search results for any pair of scan numbers. A +1 MS2 search result may only be linked to an MS3 search result that is +1, and a +2 MS2 scan may produce a link to a search result with either a +1 or +2 charge state. The double and triple charged SEQUEST search duplication, however, creates a situation in which a +3 MS2 search result may produce two possible links to +2 and +3 MS3 search results for any pair of scan numbers. After generating all possible links, one pair of search results among all possible pairs for any two scan numbers (designated as the "unique pair") is selected based on whether the sequences of the two peptide identifications composing a pair are matching. Two sequences are defined here as matching if they are equal or if one is a subsequence of the other. For non-matching pairs and scan sets with more than one pair with matching sequences, the pair with the highest summed PeptideProphet probability is designated as the unique pair. A schematic of all matching possibilities and selection of a unique pair is shown in supplemental Fig. 1.
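The selection of a unique pair can be sketched in Python as below. This is an illustrative reading of the rule, not the authors' implementation: the `Hit` record and function names are hypothetical, "subsequence" is interpreted as a contiguous substring, and the charge constraint is simplified to requiring that the MS3 charge not exceed the MS2 charge.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Hit:
    """One search result for a scan (hypothetical record for this sketch)."""
    sequence: str
    charge: int
    probability: float  # PeptideProphet probability


def sequences_match(a: str, b: str) -> bool:
    """True if the sequences are equal or one is a contiguous part of the other."""
    return a in b or b in a


def select_unique_pair(ms2_hits: List[Hit],
                       ms3_hits: List[Hit]) -> Optional[Tuple[Hit, Hit]]:
    """Pick one MS2/MS3 pair among all charge-compatible combinations."""
    # Assumed charge rule: the MS3 charge cannot exceed the MS2 charge.
    candidates = [(m2, m3) for m2, m3 in product(ms2_hits, ms3_hits)
                  if m3.charge <= m2.charge]
    if not candidates:
        return None
    matching = [p for p in candidates
                if sequences_match(p[0].sequence, p[1].sequence)]
    # Matching pairs take precedence; ties and non-matching sets are
    # resolved by the highest summed probability.
    pool = matching or candidates
    return max(pool, key=lambda p: p[0].probability + p[1].probability)
```

Note that a matching pair is preferred even when a non-matching pair has a higher summed probability, which follows the precedence described in the text.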

RESULTS AND DISCUSSION
Overview of the Probability Adjustment Method. The overall methodology for our approach is outlined in Fig. 1. Data generated by the mass spectrometer are processed via the TPP following normal procedures and using SEQUEST, Mascot, or X! Tandem database search tools for peptide identification (the tools currently supported by TPP) up through generation of peptide probabilities from PeptideProphet (32). Analyses in this early stage of processing are conducted separately for MS2 and MS3 data. To calculate an adjusted probability for all assignments, successive scans must be linked as described under "Experimental Procedures." The multiple potential matches resulting from the charge state ambiguity are reduced in the processing, and only the most probable matching pair for any two scan numbers is retained.
Based on the sequence of the highest scoring peptide produced by the database search tool for each scan, consecutive MS2/MS3 pairs may then be classified as to whether or not they match the same peptide sequence. This classification forms the basis for the adjusted probability score (see below), which functions to reward assignments with matching sequences. Only the top ranked peptide sequence for each spectrum is used in this analysis; accommodation of lower ranking results, although potentially useful, is not considered for simplicity. The result of the probability correction procedure is a data set of linked MS2 and MS3 peptide identifications with adjusted probability scores.

FIG. 1. Overview of methodology. MS2 and MS3 spectra are extracted from the raw data, and the spectra are assigned peptides using sequence database searching (SEQUEST or similar programs). The resulting peptide assignments are statistically validated using PeptideProphet, which calculates for each assignment in the data set a probability of being correct (applied separately for MS2 and MS3 data). MS2 and MS3 scan results are correlated based on scan number in which an MS3 spectrum is linked to an MS2 if its scan number is consecutive. Based on the overall matched data set, a Bayesian probability correction is applied to linked scan results individually for MS2 and MS3 spectra, resulting in adjusted probability scores. In the final step, the MS2 and MS3 scan results are combined, and a final probability is calculated for each scan number as representative of the peptide identification.
Linking MS2 and MS3 Data: a Case Study of the 9-Mix Data Set. This analysis is carried out using a mixture of purified proteins (nine-protein mixture data set) in which it is possible to confidently label peptide identifications as "correct" or "incorrect." Because this data set was searched against a database consisting of the sequences of the mixture proteins appended with a much larger reversed human protein sequence database, each spectrum could be assigned a correctness label based on whether the top SEQUEST hit for the spectrum was to one of the known protein entries. The method used was simply to label as incorrect any assignment of a peptide from a known incorrect database entry (reversed human protein sequence entries in this case), whereas all assignments of peptides to one of the sample proteins can be considered correct (32).
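The labeling rule can be illustrated with a short Python sketch. The "rev_" prefix for reversed entries and the "unlabeled" fallback for entries that are neither sample proteins nor reversed sequences (for example, contaminants) are assumptions made for this example; the text does not specify either.

```python
from typing import Set


def label_assignment(protein_id: str, sample_proteins: Set[str],
                     decoy_prefix: str = "rev_") -> str:
    """Label a top-ranked assignment by the database entry it maps to.

    `decoy_prefix` is a hypothetical naming convention for reversed entries.
    """
    if protein_id.startswith(decoy_prefix):
        return "incorrect"  # hit to a reversed human sequence
    if protein_id in sample_proteins:
        return "correct"    # hit to one of the known mixture proteins
    return "unlabeled"      # assumed handling of any other entry


if __name__ == "__main__":
    mixture = {"TRFE_BOVIN"}  # one mixture protein named in the text
    print(label_assignment("TRFE_BOVIN", mixture))
    print(label_assignment("rev_IPI00000001", mixture))  # made-up decoy identifier
```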
The procedure begins by linking consecutive MS2 and MS3 scans using their scan numbers. Overall there were 48,921 MS2 spectra and 27,952 MS3 spectra generated for the 9-Mix data set. Due to the uncertainty in the precursor charge state for LTQ spectra, many spectra are redundant; for any pair of consecutive MS2/MS3 scan numbers, there may be one or two SEQUEST search results generated for each MS level as described under "Experimental Procedures." Consequently an MS2 search result may be linked to more than one MS3 search result. For the 9-Mix data set, there are 16,140 unique linked pairs in which the MS3 is not null. Among these, 89 have MS2/MS3 charge states of +1/+1, eight of which match correct protein sequences in the database (either one or both of the sequences match). For doubly charged MS2 pairs, 3761 are +2/+2 and 4043 are +2/+1, of which 878 and 2020 are correct, respectively. For triply charged MS2, for +3/+3 there are 4020 pairs of which 631 are correct, for +3/+2 there are 3777 pairs of which 1177 are correct, and for +3/+1 there are 450 pairs of which 111 are correct. In all, linked pairs in which the MS3 has one less charge than the MS2 are more likely to be correct. However, linked pairs for which the MS3 is the same charge state as MS2 account for 36% of the correct identifications.
Neutral loss of amino acids from the N and C termini is a common phenomenon and has been described previously (35,36). Selecting linked pairs in which both MS 2 Table 1.
After linking consecutive scans and selecting a unique linked pair, the peptide assignments are binned into sequence match categories dependent on whether a consecutive scan exists and, if so, whether the top scoring SEQUEST sequence results of the successive scans match (Table I). Sequence match categories (referred to as match categories or simply "Match" later in the text) are defined as follows: 0, no consecutive scan; 1, consecutive scans, but MS2 and MS3 sequences do not match; 2, consecutive scans, and MS3 sequence is a subset of the MS2 sequence; 3, consecutive scans, and MS3 sequence is identical to the MS2 sequence; and 4, consecutive scans, and MS2 sequence is a subset of MS3 sequence. In the data set of unique pairs, 69% of all MS2 spectra (16,140) produced consecutive MS3 spectra. Of those consecutive pairs, 1458 (9%) had matching sequences in which the MS3 sequence was a subset of the MS2 sequence. 116 MS3 spectra were orphaned because they did not have a preceding MS2 scan and were discounted. We note that there were no instances of identical sequence matches between MS2 and MS3 top scoring hits in the 9-Mix data set, as may occur for neutral loss events in which only a side-chain moiety is lost from the otherwise intact peptide backbone (e.g. a phosphate). These losses are observed in other similar data sets, however, and do occur in the phospho-enriched data sets described later.
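The five sequence match categories can be expressed as a small classification function. The Python sketch below assumes that "subset" means a contiguous substring of the peptide sequence and that a missing MS3 is represented by None; both are assumptions for illustration.

```python
from typing import Optional


def match_category(ms2_seq: str, ms3_seq: Optional[str]) -> int:
    """Return the sequence match category (0-4) for a linked pair."""
    if ms3_seq is None:
        return 0  # no consecutive MS3 scan
    if ms3_seq == ms2_seq:
        return 3  # identical sequences
    if ms3_seq in ms2_seq:
        return 2  # MS3 sequence is contained in the MS2 sequence
    if ms2_seq in ms3_seq:
        return 4  # MS2 sequence is contained in the MS3 sequence
    return 1      # consecutive scans, sequences do not match


if __name__ == "__main__":
    print(match_category("CLMEGAGDVAFVK", None))
    print(match_category("CLMEGAGDVAFVK", "GDVAFVK"))
    print(match_category("CLMEGAGDVAFVK", "KGDVAFVK"))
```

The identity test must come before the containment tests, since identical strings also satisfy the substring check.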
For a small number of linked pairs, the top scoring MS3 sequence appears to be a superset of the MS2 sequence, binned as sequence match category 4. Clearly such pairs are not physically possible. Detailed analysis indicated that most of those cases can be explained as resulting from misidentification of the true peptide sequence from either the MS2 or MS3 scan. For example, in some of these instances, the sequence corresponding to the +2 MS2 is a subsequence of both the +3 MS2 sequence and the +2 MS3 sequence, with the +2/+2 MS2/MS3 pair selected as the unique pair. In those cases, the peptide assignment to the +3 MS3 peak list (with +3 being the true charge state of the peptide ion) scored lower than the assignment of a shorter peptide (a subsequence of the true peptide) to the +2 MS3 peak list. Other examples involved cases of a high scoring assignment of a longer partially tryptic peptide sequence when the true peptide was a post-translationally modified tryptic peptide missed due to the restricted nature of the database search. Similarly, several cases were observed where an MS3 scan acquired on a doubly charged b-ion fragment from the parent MS2 spectrum resulted in a match of a longer sequence to the +3 MS3 peak list and no match in the case of the correct +2 charge state. In any event, as can be seen from Table I, match category 4 represents a small number of special case instances. For simplicity, this category is dropped from subsequent analysis.
Using the labeling of the data, the accuracies and sensitivities of the probability calculations could be determined. Toward this end, each linked pair of spectra can also be assigned a truth category based on the correctness of the peptide assignments to the MS2 and MS3 scans. The truth category is a label indicating whether neither, both, or one of the matching scans has a correct label. The total numbers of scans in each truth category are shown in Table II. The number of unique pairs of search results in which both sequences were correctly assigned is 1509, corresponding to 6.4% of the total number of unique pairs of scans. A greater number of linked pairs (3316 total, 14.2%) have either the MS2 only assigned correctly (2029) or only the MS3 assigned correctly (1287).
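A truth category can be derived directly from the two correctness labels of a linked pair. The sketch below is a hypothetical illustration of that bookkeeping; the "+/-" string encoding (MS2 label first, MS3 label second) is an assumed convention, and pairs with a null MS3 are outside its scope.

```python
from collections import Counter
from typing import Iterable, Tuple


def truth_category(ms2_correct: bool, ms3_correct: bool) -> str:
    """Return '+/+', '+/-', '-/+', or '-/-' (MS2 label first, MS3 second)."""
    return f"{'+' if ms2_correct else '-'}/{'+' if ms3_correct else '-'}"


def tally_truth(pairs: Iterable[Tuple[bool, bool]]) -> Counter:
    """Count linked pairs per truth category."""
    return Counter(truth_category(ms2, ms3) for ms2, ms3 in pairs)


if __name__ == "__main__":
    demo = [(True, True), (True, False), (False, True), (False, False)]
    print(tally_truth(demo))
```

Applied to the 9-Mix counts quoted above, the "+/-" and "-/+" bins together hold 2029 + 1287 = 3316 pairs.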
When comparing the counts in the sequence match category bins (Table I) with the truth category bins (Table II), there appear to be several (34) more +/+ truth matches than expected from the number of entries in the sequence match bin categories 2 and 4. These entries are the result of sequence match category 1 entries contributing to the +/+ truth bin. There are a number of cases in which the top scoring MS2 and MS3 sequences both match one of the sample mixture proteins, but the proteins are different or the match is to different peptides from the same protein. Most of the instances are examples of the latter case: a homologous sequence in the protein TRFE_BOVIN results in two different peptides (CLMEGAGDVAFVK and KGDVAFVK) being identified in the joined pairs. One of the commercially obtained proteins in the mixture, TRFE_BOVIN, was also contaminated with the homologous TRFL_BOVIN, which exhibits 59% sequence identity. As a result, homologous but not identical peptide sequences between the two proteins are identified in the joined pairs. For four cases, however, although both MS2 and MS3 identifications in the pair are labeled correct in that individually their sequences match one of the sample proteins, there is no similarity between the matching sequences. These can be considered as chance matches to one of the sample mixture proteins incorrectly labeled as correct (the observed number of such chance matches is consistent with the expected number given the relative sizes of the 9-Mix and the reversed human protein sequence database). In all such cases, either the MS2 or the MS3 was a high probability result with the other joined probability very low.
Probability Adjustment Calculation-In automated analysis of mass spectrometry data, one of the most important tasks is the calculation of accurate and discriminative confidence measures for each peptide assignment to a spectrum produced by a database search tool. Toward that end, we seek to calculate a correction to the probability score that accommodates the increase in confidence resulting from matching MS2 and MS3 spectra. The fact that matched consecutive MS2 and MS3 spectra are more likely to be correct forms the basis for adjusting the probabilities of these spectra.
Calculation of probabilities for each peptide assignment in the data set, performed independently for MS2 and MS3 data, represents the starting point in this analysis. PeptideProphet computes a probability for a peptide, designated here as p(+|D), by using the mixture model expectation maximization algorithm to model the distributions of various discriminant spectrum-level parameters, collectively represented here as D. The spectrum-level information D typically includes the discriminant database search score (a linear combination of the renormalized search scores reported by the database search tool used), the number of termini consistent with the specificity of the enzyme used to digest proteins, the number of missed internal cleavage sites, and the difference between the measured and the calculated precursor ion mass. In certain cases, additional parameters are included in the model such as the peptide pI value (37) or the presence of certain residues or sequence motifs in the sequence of the assigned peptide (e.g. the presence of a cysteine in the case of ICAT experiments or the NX(S/T) motif in the case of experiments using glycopeptide enrichment strategies). PeptideProphet probabilities are reasonably accurate for both MS2 and MS3 spectra. A plot displaying probability accuracies of PeptideProphet results for the 9-Mix data is provided in supplemental Fig. 2.
TABLE II. ... scan pairs into truth categories
A "+" in the truth category column descriptors indicates a correct match, "-" indicates an incorrect match, and "null" indicates the lack of a consecutive MS3 for an MS2 scan.

The approach used to accommodate the additional sequence matching information is similar to the method described previously (33) for adjusting probabilities to account for additional protein level information using the number of sibling peptides. The MS2/MS3 sequence match information is not available at the initial data analysis step but can be used to adjust the initial probabilities p(+|D) after linking the corresponding MS2 and MS3 scans. Again, the adjustment is performed separately for MS2- and MS3-level data. Given the sequence match category (Match) assignments for all linked spectra, the adjusted probability of a linked peptide assignment from a certain sequence match category, p(+|D, Match), may be calculated as

$$p(+\mid D, \mathrm{Match}) = \frac{p(+\mid D)\,p(\mathrm{Match}\mid +)}{p(+\mid D)\,p(\mathrm{Match}\mid +) + p(-\mid D)\,p(\mathrm{Match}\mid -)} \qquad \text{(Eq. 1)}$$

where p(-|D) = 1 - p(+|D), and p(Match|+) and p(Match|-) represent the empirically derived probabilities of observing a peptide assignment in each match category among all (MS2 or MS3) correct and incorrect peptide assignments in the data set, respectively. Note that this calculation assumes that the information derived from linking consecutive scans is independent of the identification information generated by a search engine. This is largely true. Normalized PeptideProphet SEQUEST discriminant score distributions for correct and incorrect peptide assignments to MS2 spectra of doubly charged precursor ions, plotted separately for peptide assignments to MS2 spectra belonging to different match categories, are shown in supplemental Fig. 3; the score distributions are similar for all values of the Match parameter, justifying the assumption of independence between the discriminant database search score and the Match parameter.
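The adjustment described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual Bayes form, in which the initial probability is reweighted by the two Match likelihoods; the function name `adjust_probability` and the likelihood values in the example are hypothetical and are not taken from Table III.

```python
def adjust_probability(p_initial, p_match_correct, p_match_incorrect):
    """Reweight an initial probability by the Match-category likelihoods.

    p_initial         -- initial probability p(+|D) for one peptide assignment
    p_match_correct   -- p(Match|+) for the assignment's match category
    p_match_incorrect -- p(Match|-) for the same category
    """
    numerator = p_initial * p_match_correct
    denominator = numerator + (1.0 - p_initial) * p_match_incorrect
    if denominator == 0.0:
        # Category carries no information in either class; leave unchanged.
        return p_initial
    return numerator / denominator


# Hypothetical likelihoods: a category favoring correct assignments boosts
# the probability, a category favoring incorrect ones lowers it.
boosted = adjust_probability(0.5, 0.30, 0.05)
penalized = adjust_probability(0.5, 0.40, 0.50)
```

When the two likelihoods are equal, the function returns the initial probability unchanged, which is the expected behavior for an uninformative match category.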
The probability distribution p(Match|+) may be calculated for each match category k as

$$p(\mathrm{Match}=k\mid +) = \frac{1}{N\,p(+)} \sum_{i:\,\mathrm{Match}_i = k} p(+\mid D_i, \mathrm{Match}_i) \qquad \text{(Eq. 2)}$$

where N is the total number of (MS2 or MS3) peptide assignments in the data set, and the sum is over all peptides i in each match category. The term p(Match|-) is calculated in a similar way. The overall proportion p(+) of correct assignments in the data set may be calculated as follows.

$$p(+) = \frac{1}{N} \sum_{i} p(+\mid D_i, \mathrm{Match}_i) \qquad \text{(Eq. 3)}$$

The probabilities in Equation 1 and the Match parameter distributions in Equation 2 can be determined by starting with the initial PeptideProphet probability for each assignment, p(+|D_i), and the overall proportion, p(+). The probabilities and Match distributions can then be updated in an iterative manner. However, a single iteration was deemed to be sufficient for the data set used in this work.
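The estimation step described above can be sketched as follows. This is an illustrative reading of the text rather than the authors' implementation: the current probability of each assignment (the initial PeptideProphet value on the first pass) is treated as a fractional count of "correct", and its complement as a fractional count of "incorrect". The function name and the toy inputs are assumptions.

```python
from collections import defaultdict


def estimate_match_distributions(probabilities, categories):
    """Estimate p(+), p(Match=k|+) and p(Match=k|-) from current probabilities.

    probabilities -- current probability of being correct, one per assignment
    categories    -- sequence match category of each assignment
    """
    n = len(probabilities)
    prior_correct = sum(probabilities) / n
    if not 0.0 < prior_correct < 1.0:
        raise ValueError("need a mixture of likely correct and incorrect assignments")

    correct_mass = defaultdict(float)
    incorrect_mass = defaultdict(float)
    for p, k in zip(probabilities, categories):
        correct_mass[k] += p
        incorrect_mass[k] += 1.0 - p

    p_match_correct = {k: m / (n * prior_correct) for k, m in correct_mass.items()}
    p_match_incorrect = {k: m / (n * (1.0 - prior_correct))
                         for k, m in incorrect_mass.items()}
    return prior_correct, p_match_correct, p_match_incorrect


# Toy data: two confident assignments in category 2, two weak ones in category 1.
prior, given_correct, given_incorrect = estimate_match_distributions(
    [0.9, 0.8, 0.1, 0.2], [2, 2, 1, 1])
```

Each returned distribution sums to 1 over the observed categories. Feeding the adjusted probabilities back into the same function would give the iterative update mentioned in the text.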
Application of the Probability Adjustment Method to the 9-Mix Data Set-Table III lists the p(Match|+) and p(Match|-) distributions calculated using Equation 2 for the 9-Mix data set for both MS2 and MS3 scans. It can be seen that, in the case of MS2 spectra, a larger fraction of incorrect assignments have no consecutive matching scan. For all instances, the most likely sequence match category is category 1, corresponding to the case in which consecutive scans occur but with no matching sequence. This is perhaps intuitive in the sense that it might frequently be the case that either the MS2 or the MS3 will produce an identifiable sequence but not both. The most obvious discriminating measure is the fact that for 30% of the correctly assigned MS2 spectra (the top row in Table III) the linked MS3 spectrum was assigned a peptide sequence that is a subset of the MS2 sequence as opposed to a 5% incidence for incorrect MS2 identifications. If sequence matches are observed, identifications are thus much more likely to be correct; the same argument applies for MS3 scans preceded by MS2 scans. Also noteworthy is the fact that for match category 1 pairs the probability of a correct identification is less than the probability of an incorrect identification. This will result in a probability penalty for consecutively linked scans without matching sequences. The penalty is small in this case, much smaller than the boost due to a consecutive matching scan, but is nevertheless an effect of the model.
It should be noted that in addition to classifying peptide match pairs into bins as a function of sequence matching they can also be classified into various precursor charge state pairs. Significant differences exist between the precursor charge state distributions of correct and incorrect matches. An expansion of the sequence match category probabilities into charge category bins is provided in supplemental Fig. 4 for each of the four posterior Match probability distributions of Table III as well as total counts of the number of matches falling into each bin for the 9-Mix data set. The charge state information would likely provide additional discriminative power. However, further subclassification of the data into charge state pairs requires a larger amount of data and complicates the model. Thus, the charge state information has not been utilized in the model at this time.
TABLE III. Posterior probabilities of observing a correctly (+) or incorrectly (-) matching peptide to an MS2 or MS3 scan among peptides from the four most frequently observed sequence match categories in the 9-Mix data set
0, no consecutive scan; 1, consecutive scan, no matching sequence; 2, consecutive scan, MS3 sequence is a subset of MS2 sequence; 3, consecutive scan, MS3 sequence identical to MS2 sequence.

An example of the probability adjustment procedure described above is illustrated in Fig. 2a using a pair of matching scans from the 9-Mix data set. MS2 spectrum A06_7233_c.18651.18651 is first paired to MS3 spectrum A06_7233_c_18652.18652 by consecutive scan number. The MS2-assigned peptide sequence TLNFNAEGEPELLMLANWRPAQPLK is then compared with the MS3 sequence GEPELLMLANWRPAQPLK. Because the MS3 sequence represents a fragment of the MS2 sequence, the linked pair is assigned to sequence match category 2. The adjusted probabilities are then calculated for each spectrum using Equation 1. In this instance, the initial PeptideProphet probability of 0.712 is adjusted to 0.995 for the MS2 spectrum, and 0.832 is adjusted to 0.989 for the MS3 spectrum. A combined probability may then optionally be calculated for the linked pair as a new discriminating measure as discussed later in the text.
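The category assignment in this example can be sketched as a simple string comparison. The sketch assumes that "subset" means a contiguous fragment (substring) of the other sequence and that modifications are ignored; the function name is hypothetical, and category 4 follows the definition used elsewhere in the text (MS2 sequence a subset of the MS3 sequence).

```python
def sequence_match_category(ms2_sequence, ms3_sequence):
    """Classify a linked MS2/MS3 pair by comparing the assigned sequences.

    ms3_sequence is None when the MS2 scan has no consecutive MS3 scan.
    """
    if ms3_sequence is None:
        return 0  # no consecutive scan
    if ms3_sequence == ms2_sequence:
        return 3  # identical sequences
    if ms3_sequence in ms2_sequence:
        return 2  # MS3 sequence is a fragment of the MS2 sequence
    if ms2_sequence in ms3_sequence:
        return 4  # MS2 sequence is a fragment of the MS3 sequence
    return 1      # consecutive scans, no matching sequence


# The pair of sequences from the example in the text.
example = sequence_match_category("TLNFNAEGEPELLMLANWRPAQPLK",
                                  "GEPELLMLANWRPAQPLK")
```

The equality test comes before the substring tests because an identical sequence is also trivially a substring and would otherwise be placed in category 2.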
Also indicated in Fig. 2 are examples of fragmentation patterns from other charge state pairs. These examples are provided here to illustrate both differences in the relative extent of fragmentation that can occur as a function of charge and also the presence of redundant ions appearing in both the MS2 and MS3 spectra. Fig. 2, b-d, contain examples from the phospho data set, specific features of which will be discussed in more detail later in the text. It should be noted that many identical ions can be observed between matching MS2 and MS3 spectra.
In the development of the model, several (match category 2) cases were observed where both paired spectra had a low initial probability of being correct, but their probabilities became intermediate or even high values after adjustment. For example, the initial probabilities for peptide assignments to linked scans A06_7232_c.4362.4362.3 (MS2 scan) and A06_7231_c.4363.4363.2 (MS3 scan) of 0.077 and 0.319 would get boosted to 0.827 and 0.830, respectively, if the probabilities were adjusted using the Match parameter distributions shown in Table III. Boosting such low probability assignments may be undesirable regardless of their match category. To address this, several approaches were investigated, including the introduction of probability-dependent match categories. A very simple constraint that worked well in the case of the 9-Mix data set was to avoid any probability adjustment for category 2 matches if both initial MS2 and MS3 probabilities were below a specified threshold, 0.5 in the case of these data. This was an optional feature that was investigated using the 9-Mix data set but not utilized for the phosphopeptide data sets as it was deemed a minor adjustment that did not significantly affect the overall results; specifically, the number of entries in the 9-Mix data set that were affected by this exception was only 24 of a total of 23,367 unique matches.
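The optional constraint described above amounts to a small guard applied after the adjustment. The following sketch is illustrative only; the function name and argument layout are assumptions, and the probabilities in the example are the ones quoted in the text.

```python
def guarded_adjustment(initial_ms2, initial_ms3, adjusted_ms2, adjusted_ms3,
                       category, threshold=0.5):
    """Return the (MS2, MS3) probabilities to report for one linked pair.

    For category 2 pairs, the adjustment is skipped when both initial
    probabilities fall below the threshold (0.5 for the 9-Mix data).
    """
    if category == 2 and initial_ms2 < threshold and initial_ms3 < threshold:
        return initial_ms2, initial_ms3
    return adjusted_ms2, adjusted_ms3


# Pair with two low initial probabilities: the boost is suppressed.
low_pair = guarded_adjustment(0.077, 0.319, 0.827, 0.830, category=2)
# Pair with high initial probabilities: the adjusted values are kept.
high_pair = guarded_adjustment(0.712, 0.832, 0.995, 0.989, category=2)
```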
The improved discriminatory power of the adjusted probabilities, calculated using the p(Match|+) and p(Match|-) distributions shown in Table III (after the empirical correction described above), is indicated in Fig. 3, which shows receiver-operator characteristic curves for the data. The performance of the model is evaluated separately for MS2 and MS3 spectra. The false positive error rate is plotted as a function of the sensitivity attainable by selecting a variable probability threshold. Sensitivity in this case is defined as the ratio of the number of correct peptide assignments to MS2 (Fig. 3a) or MS3 scans (Fig. 3b) with a probability greater than or equal to a specific probability threshold and the total number of correct assignments to MS2 (4870) or MS3 (1256) spectra, respectively. Similarly, the false positive error rate is calculated as the fraction of incorrect matches in the total number of spectra above each probability threshold. Note that there is redundancy between the MS2 and MS3 peptide assignments, so summing the total possible number of correct peptide identifications from both MS2 and MS3 scans would not reflect the total number of unique identifications.

FIG. 2. ... c, a +3/+1 identification; the y8 ion is selected for MS3. d, an example of a +2/+2 loss of the phosphate moiety in which the most abundant MS2 peak selected for MS3 is the doubly charged y13 - 98 Da.

FIG. 3. Performance of MS2 and MS3 scores with probability adjustment.
Error rates of MS2 (a) and MS3 (b) scores are shown as a function of sensitivity for initial (dashed) and adjusted (solid) probabilities. Inset panels are enlarged areas of the plots for the 0-10% error rate range.

For both the MS2 and MS3 scans, the adjusted probability provides a better performance profile, achieving greater sensitivity at an equivalent error rate as compared with the initial data. For example, at a 0.9 probability threshold, the initial MS2 probability results in the selection of 4072 correct peptide assignments at the expense of 67 incorrect ones. Using the adjusted probabilities, selecting the same number of correct identifications results in only 38 incorrect peptide assignments. The improvement in MS3 discrimination is even more pronounced, especially in the optimal region of the curve. Using initial probabilities, 1350 correct and 19 incorrect assignments to MS3 spectra pass the 0.9 threshold. Using the adjusted probabilities, it becomes possible to select the same number of correct peptide assignments with the inclusion of only one false positive.
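The two quantities plotted in Fig. 3 can be computed from labeled data as in the sketch below, which follows the definitions given in the text (sensitivity relative to all correct assignments, error rate relative to all assignments passing the threshold). The function name and the toy inputs are assumptions.

```python
def sensitivity_and_error_rate(probabilities, is_correct, threshold):
    """Sensitivity and false positive error rate at one probability threshold.

    probabilities -- computed probability for each peptide assignment
    is_correct    -- True/False truth label for each assignment
    """
    total_correct = sum(1 for c in is_correct if c)
    selected = [c for p, c in zip(probabilities, is_correct) if p >= threshold]
    n_selected = len(selected)
    n_selected_correct = sum(1 for c in selected if c)

    sensitivity = n_selected_correct / total_correct if total_correct else 0.0
    error_rate = ((n_selected - n_selected_correct) / n_selected
                  if n_selected else 0.0)
    return sensitivity, error_rate


# Toy example: three assignments pass a 0.5 threshold, one of them incorrect.
sens, err = sensitivity_and_error_rate(
    [0.99, 0.95, 0.60, 0.30], [True, True, False, True], threshold=0.5)
```

Applied to the counts quoted above for the initial MS2 probabilities at the 0.9 threshold (4072 correct and 67 incorrect assignments, 4870 correct in total), these definitions give a sensitivity of 4072/4870, about 0.836, and an error rate of 67/4139, about 0.016.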
Combining MS2 and MS3 Probabilities-The result of the probability adjustment procedure described above is now two adjusted probabilities for each unique linked pair of scans, one each for MS2 and MS3. Possibilities for best utilizing both of these scores in the selection of correct and incorrect identifications are now explored. Ideally, a combined scoring approach would provide greater discriminatory power for selecting correct and incorrect identifications than a subsequent counting of unique matches based on MS2 and MS3 taken individually. Two possibilities for utilizing both scores are examined,

$$p_{\mathrm{comb}} = 1 - (1 - p_{\mathrm{MS2}})(1 - p_{\mathrm{MS3}})$$

$$p_{\max} = \max(p_{\mathrm{MS2}},\, p_{\mathrm{MS3}})$$

where p_MS2 and p_MS3 are the adjusted probabilities for the MS2 and MS3 scans, respectively, for the same linked pair. The first option is appropriate when the two probabilities can be considered independent and has been utilized (in a different context, i.e. for combining the evidence from different peptides) for the protein identification problem (33,38). p_comb reflects the probability that at least one of the two peptide assignments, either to the MS2 or to the MS3 spectrum, is correct. However, MS2 and MS3 spectra, and therefore the probability scores p_MS2 and p_MS3 of those spectra, are not fully independent measurements of a peptide in that identical ions will be measured in both spectra. An alternative approach is to select the assignment with the highest probability, p_max, thus reducing the likelihood of possible overestimation of the final probability. p_max has been used in other similar situations, e.g. in selecting among several alternative equivalent peptides (assignments of the same peptide to multiple MS/MS spectra) in the ProteinProphet protein probability score (33) and in Mascot protein-level scoring (24).

FIG. 4. Discriminating power and accuracy of computed probabilities. a, total number (num) of correct peptide assignments is plotted as a function of minimum (min) probability threshold for MS2 and MS3 spectra alone, both initial and adjusted, and both p_max and p_comb scores. b, same as a, enlarged in the region of minimum probability threshold 0.9-1.0. c, number of correct peptide assignments as a function of the number of incorrect assignments plotted separately for MS2 (green) and MS3 (blue) initial (dashed) and adjusted (solid) probabilities as well as the combined p_max (red) and p_comb (purple). d, probability accuracy of the adjusted MS2, MS3, p_max, and p_comb probabilities.

Fig. 4, a and b, show the results of counting the number of correct peptide assignments above specified probability thresholds, utilizing all possible scores calculated for a linked pair as the discriminating measure: initial MS2, initial MS3, adjusted MS2, adjusted MS3, p_max, and p_comb. Displayed are the results on the set of all unique linked pairs. A comparison of the initial and adjusted probability results for MS2 and MS3 again demonstrates an increase in the number of selectable correct peptide assignments at any probability threshold as a result of the probability adjustment. Both p_max and p_comb scores perform similarly and provide improved discrimination as compared with the individual measures. The primary reason for the performance increase is that the combined score permits the possibility of selecting either the MS2 or the MS3 for any linked pair, thus permitting a pair to be selected as correct if either probability is above threshold. At the 99% probability threshold, for example, the adjusted MS2, adjusted MS3, p_max, and p_comb probabilities correspond to 3141, 1050, 3775, and 3807 correct peptide identifications, respectively. Fig. 4c provides a measure of the rate of false positives on these data for the most interesting thresholds. The same performance trends are evident: including roughly 40 false positives, specifically 40, 41, 39, and 39 for adjusted MS3, adjusted MS2, p_max, and p_comb measures, respectively, results in selection of 1806, 4139, 4594, and 4762 correct identifications. In all, p_comb provides the most discriminative measure.
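The two combined scores discussed in this section can be sketched as follows, taking p_comb as the probability that at least one of the two assignments is correct under an independence assumption and p_max as the larger of the two adjusted probabilities. The function name is hypothetical; the example values are the adjusted probabilities quoted earlier for the Fig. 2a pair.

```python
def combine_probabilities(p_ms2, p_ms3):
    """Return (p_comb, p_max) for one linked MS2/MS3 pair.

    p_comb assumes the two probabilities are independent, which the text
    notes is only approximately true for consecutive MS2 and MS3 spectra.
    """
    p_comb = 1.0 - (1.0 - p_ms2) * (1.0 - p_ms3)
    p_max = max(p_ms2, p_ms3)
    return p_comb, p_max


# Adjusted probabilities of the example pair (0.995 for MS2, 0.989 for MS3).
p_comb, p_max = combine_probabilities(0.995, 0.989)
```

Because p_comb is never smaller than p_max, it is the score more exposed to overestimation when the two spectra share evidence, which is the concern the text raises.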
In addition to analyzing the discriminative power of computed probabilities, one must also assess their accuracy. Probability accuracy plots for the adjusted and combined measures are shown in Fig. 4d. The adjusted probability scores still provide an accurate representation of true probabilities and fit the 45° line well. The p_comb and p_max measures perform similarly well. Interestingly, p_comb does not overestimate probabilities as one might expect given the dependence between MS2- and MS3-level spectra in this data set. Additional analysis would be necessary to determine whether this is a general characteristic.
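A probability accuracy check of this kind can be sketched by binning assignments on their computed probability and comparing the mean computed probability in each bin with the observed fraction of correct assignments. The binning scheme (ten equal-width bins) and the function name are assumptions; the text does not state how Fig. 4d was constructed.

```python
def probability_accuracy(probabilities, is_correct, n_bins=10):
    """Return (mean computed probability, observed fraction correct, count)
    for every non-empty probability bin."""
    sums = [0.0] * n_bins
    hits = [0] * n_bins
    counts = [0] * n_bins
    for p, correct in zip(probabilities, is_correct):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        sums[index] += p
        hits[index] += 1 if correct else 0
        counts[index] += 1
    return [(sums[i] / counts[i], hits[i] / counts[i], counts[i])
            for i in range(n_bins) if counts[i]]


# Toy labeled data: points on the 45-degree line would have equal
# computed and observed values in each bin.
toy = probability_accuracy([0.95, 0.92, 0.98, 0.91, 0.15],
                           [True, True, True, False, False])
```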
Phosphopeptide Data Set Results-One of the main motivating factors in collecting MS2/MS3 data is to increase the confidence levels and the total number of phosphopeptide identifications. The identification of phosphopeptides from MS2 spectra is challenging because spectra recorded using an ion trap mass spectrometer often exhibit one or more dominant neutral loss peaks of 98 Da, whereas the occurrence and intensity of the other fragment ions (containing peptide sequence information) may be impaired. To investigate potential improvement in discrimination as a result of the probability adjustment on a phosphopeptide-enriched data set, a data set of MS spectra from a single LTQ injection of an IMAC-enriched D. melanogaster sample was selected for detailed analysis in this work. The data were acquired in a data-dependent mode with MS3 scans triggered for the most abundant peak of the MS2 spectra that in the case of this sample mostly corresponds to the neutral loss peaks: -98.00 (-116.00), -49.00 (-58.00), and -32.60 (-36.66) Da from the precursor as explained under "Experimental Procedures." Because the sample in this case is a complex protein mixture, a precise labeling of peptide identifications as correct or incorrect is not possible. Instead, only the composite false discovery rates (FDRs) (a single measure for each filtering threshold) can be estimated by counting the number of matches to reversed sequences.
The methodology for generating adjusted probability scores for this data set is analogous to that used for the 9-Mix data set. Top scoring MS2 and MS3 SEQUEST peptide assignments are linked based on consecutive scan numbers, and the top scoring pair for consecutive scans is selected. Note that if MS3 spectra are triggered based on neutral loss peaks, charge state ambiguity between matching pairs can potentially be reduced. This fact is not exploited in our analysis; rather, we maintain the same procedure for allowing all possible charge pairs in a match. The match pairs are then classified into sequence match categories as described above. The same four sequence match categories are used: 0, no consecutive match; 1, consecutive match but no matching sequence; 2, matching sequences with MS3 sequence a subset of MS2 sequence; and 3, matching sequences with MS3 sequence identical to MS2. In this data set, there were only two instances of scans that would correspond to sequence match category 4, matching sequences with MS2 sequence a subset of MS3 sequence. Again, this category was eliminated for simplicity. We note that the additional constraints imposed by the data-dependent triggering of these data and the resultant database searching provisions would allow us to generate additional useful sequence match categories, corresponding to whether the site of modification of a match is identical between the two sequences. We observed a number of instances in these data where the sequences matched but the sites of modification of the match did not, indicating ambiguity in the localization of the modified residues. A larger data set would allow a more rigorous analysis of these types of results (39,40).
SEQUEST searching of this data set produced 16,647 and 12,218 results for the MS2 and MS3 data sets, respectively, corresponding to 7547 unique matching pairs of searched results. Of these, 6270 had non-null MS3 assignments. Counts for the four sequence match categories are shown in Table IV. Most significant is the fact that the sequence match category corresponding to neutral loss-only pairs (match category 3) is no longer null; rather, it is the more abundant category among the two representing matching sequences with 313 unique matches. Corresponding posterior probabilities were calculated for the sequence match categories and then used to calculate the final adjusted probability for each unique pair. These numbers are shown in Table IV. The frequencies of observing a correct or incorrect assignment to an MS2 scan with no matching MS3 sequence (match category 1) are relatively close; only a small probability correction occurs for these instances. MS3 category 1 probabilities are penalized as are MS2 instances that lack a corresponding MS3 result. A probability boost is received for pairs in categories 2 and 3 with a greater correction given to the latter.
Although a true sensitivity measure for these data is impossible, it is possible to evaluate the relative performance of the various probability measures by examining the number of reversed database matches. The decoy database method is increasingly being used as an effective means of estimating false positive rates in database searching when other methods of error rate estimation cannot be readily performed (41,42). At any given probability threshold, the number of matches to reversed sequences can be calculated and compared with the total number of peptide assignments above that threshold to derive an estimate of the FDR (42). A measure of the performance of the various model probabilities on these data is shown in Fig. 5a. The figure plots the estimated number of correct identifications as a function of FDR. These data are generated by ranking all peptide assignments in order of decreasing probability. The number of assignments of peptides from the forward database (n_f) having a probability equal to or greater than the probability of the nth top ranking reverse entry (n_r) is counted, and the estimated false discovery rate is determined as n_r/n_f. The estimated number of correct assignments is similarly measured as n_f - n_r. This analysis is done separately for each of the initial and adjusted probability measures: MS2 and MS3 initial and adjusted as well as the combined probability measures p_comb and p_max. A version of these data in table form is provided in supplemental Table 2, which presents estimated false positive percentages and the number of forward match counts for inclusion of one, two, five, 10, 50, and 100 reversed matches as well as the number of those forward entries that are identified as containing phosphorylation sites.
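The ranking procedure described above can be sketched directly from its definition: for the nth highest-scoring reverse match, count the forward matches at or above its probability, then report n_r/n_f and n_f - n_r. The function name, the return layout, and the toy probabilities are assumptions made for illustration.

```python
def decoy_fdr_points(forward_probabilities, reverse_probabilities):
    """Return one (n_r, n_f, estimated FDR, estimated correct) tuple for
    each reverse-database match, in order of decreasing probability."""
    points = []
    ranked_reverse = sorted(reverse_probabilities, reverse=True)
    for n_r, threshold in enumerate(ranked_reverse, start=1):
        # Forward matches scoring at least as high as the nth reverse match.
        n_f = sum(1 for p in forward_probabilities if p >= threshold)
        fdr = n_r / n_f if n_f else float("inf")
        points.append((n_r, n_f, fdr, n_f - n_r))
    return points


# Toy data: five forward matches and two reverse matches.
curve = decoy_fdr_points([0.99, 0.98, 0.90, 0.80, 0.40], [0.85, 0.30])
```

The quadratic counting loop keeps the sketch close to the wording of the text; for large result sets, sorting the forward probabilities once and using a binary search would be the more practical choice.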
As can be seen from Fig. 5a, at equivalent false discovery rates, the adjusted probability measures for MS2 and MS3 data provide a small but distinguishable improvement in the number of correct entries that can be selected, particularly for MS3. The bigger benefit comes with the combined p_comb and p_max scores, which provide a much higher selection rate of forward matches than the initial MS2 and MS3 probabilities. For example, by filtering the data using p_max instead of the initial MS2 probability, it becomes possible to extract 204 more forward matching identifications without allowing any reverse database matches (1703 peptide identifications versus 1499). At a roughly 5% FDR, the initial MS2 probability estimates 1893 correct peptides, whereas the p_max measure selects 2093. It is interesting that p_comb is much more discriminative than the p_max probability measure on these data, selecting 2328 correct peptides at the 5% FDR. Overall, the acquisition of MS3 spectra does appear to increase the total number of phosphopeptide identifications by 10-25% in this data set, depending on the specific combined probability score used for comparison.
The results discussed above for this sample have focused on the total number of identifications, the majority of which are phosphopeptides. An equivalent plot of the results, but including only ranked non-phosphorylated identifications from the phosphopeptide data set, is shown in Fig. 5b. In general, the same trends can be seen; the model improves the assignment scores of unmodified peptides as well.
FIG. 5. Performance of probability scores on the phosphopeptide data set. The number (num) of correct identifications estimated using the decoy database method is plotted as a function of FDR estimated using the decoy database search method. a, results for the phosphopeptide data set; b, results for the non-phosphorylated identifications only in the phosphopeptide data set. For MS2 and MS3 results, dashed lines indicate initial and solid lines indicate corrected probability scores.

Example MS2 and MS3 Spectra from the Phosphopeptide Data Set-To understand the underlying reasons for improved identification confidence, it is informative to briefly revisit the example shown in Fig. 2. These spectra are representative illustrations of matched MS2 and MS3 phosphopeptide spectra of various precursor charge states. Several spectral features are of interest. Fig. 2b shows an example of a +2/+1 match pair. The threonine in position 3 of the sequence matching the MS2 spectrum is phosphorylated. The large y12 peak corresponding to a fragmentation N-terminal to a double proline was selected by the instrument for MS3. This is a general characteristic of the singly charged spectra corresponding to correct identifications in these data: the majority are proline-directed with a Pro identified in the first position. Although the fragmentation is reasonable in this MS3 spectrum, a large fraction of singly charged spectra exhibit poor fragmentation with one or two major peaks corresponding to Pro, Asp, or occasionally Glu cleavage dominating. This is not surprising due to the relatively low energy imparted to singly charged ions via CID in a trap instrument; typically the most facile fragments are the most readily observable. As can be seen, many of the same ions occur in both spectra. However, the shorter sequence and the absence of the phosphorylated residue in the MS3 simplify the spectrum and increase confidence in the identification. Fig. 2c shows a +3/+1 phosphopeptide example. +3/+1 instances are rarer than the +2/+1 (see supplemental Fig. 3), and the same trends occur. The MS3 spectrum shown is a proline-directed fragmentation event with Asp-directed fragmentation peaks dominating the spectrum. Fig. 2d is an example of a +2/+2 phosphopeptide ion. The peak selected for MS3 corresponds to the doubly charged y13 peak with a -98-Da loss of the phosphate moiety. Although many identical ions are identified in both spectra, there is a significant difference in the fragmentation pattern with several ions observable in MS3 that are not readily observable in MS2.
Data Set Dependence of Probability Adjustment-Because the two primary data sets used in this work differ significantly in terms of sample complexity, it is also informative to compare these two data sets with respect to the MS2/MS3 matching statistics and the degree to which the initial peptide probabilities are adjusted to account for the sequence match information. The Match parameter distributions p(Match|+) and p(Match|-) vary between the data sets, reflecting the differences in the sample complexity and data set size. This is illustrated in Fig. 6, which plots the logarithm of the ratio p(Match|+)/p(Match|-) for each match category k for both data sets. A ratio greater than 1 (log ratio greater than 0) indicates the region where the probabilities are boosted after adjustment for Match information, whereas a ratio less than 1 (log ratio below 0) indicates that the Match adjustment reduces the probability that a peptide assignment is correct. Although the overall trend is similar for both data sets, significant differences exist in the amount of adjustment. For example, the penalty applied to a peptide assignment to an MS2 spectrum with no subsequent MS3 spectrum (match category 0) is approximately twice as high in the case of the phosphopeptide-enriched data set as in the 9-Mix data set. On the other hand, the amount of probability boost for peptide assignments in the Match = 2 category is higher in the case of the 9-Mix data set. A better understanding of these results requires analysis of the MS2/MS3 linking statistics for a larger data set. However, it is clear that the amount of probability adjustment in each sequence match category is data set-dependent. Thus, it is advantageous to use statistical methods for combining MS2- and MS3-level data that can learn the appropriate amount of probability adjustment from the data themselves, such as the method presented in this work.
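The quantity plotted in Fig. 6 can be computed per match category as in the sketch below. The base of the logarithm is not stated in the text, so base 10 is assumed here; the function name is hypothetical, and the category 1 values in the example are invented for illustration. The category 2 values are the rounded 30% and 5% frequencies quoted for MS2 spectra in the 9-Mix data set.

```python
import math


def log_match_ratios(p_match_correct, p_match_incorrect):
    """Return log10[p(Match=k|+) / p(Match=k|-)] for every category k
    with non-zero probability in both distributions."""
    ratios = {}
    for k, p_correct in p_match_correct.items():
        p_incorrect = p_match_incorrect.get(k, 0.0)
        if p_correct > 0.0 and p_incorrect > 0.0:
            ratios[k] = math.log10(p_correct / p_incorrect)
    return ratios


# Positive values mean a boost, negative values mean a penalty.
ratios = log_match_ratios({1: 0.40, 2: 0.30}, {1: 0.50, 2: 0.05})
```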
Comments on the Overall Merit of Generating MS3 Data-This study describes a method for utilizing MS2 and MS3 information for cases in which such data have been generated. A fundamental question arises, however, as to whether the benefits of generating MS3 data justify the additional cycle time on the instrument or whether the additional MS2 spectra that would be generated in that time would offset the potential advantage. It has recently been suggested (e.g. Ref. 43) that the overall benefit of generating MS3 information for phosphopeptide experiments may be limited. Although a comprehensive analysis of the merits of MS3 data generation is beyond the scope of this work, the situation is explored here by comparing sets of mass spectrometry runs on identical samples utilizing both methods: the MS2/MS3 cycle discussed above and an MS2-only method.
LC-MS/MS analysis was performed on two additional IMAC-enriched whole cell D. melanogaster tryptic digests using a Thermo LTQ as described under "Experimental Procedures." Each sample was separated into two equal fractions that were run individually using the MS2/MS3 run method or the MS2-only method. MS2 and MS3 peak lists were extracted from the raw data file and searched separately using SEQUEST. Final SEQUEST reports were then combined into two final result sets for each pair of experiments, one set for the MS2/MS3 and one for the MS2-only data. These four result sets were then analyzed using Peptide/ProteinProphet.
FIG. 6. Degree of probability score adjustment by sequence match category for the 9-Mix and phosphopeptide data sets.

To compare results at both the peptide and protein levels, individual identifications for each of the two final result sets were grouped based either on unique peptide sequence or protein accession numbers. The union, intersection, and differences between the MS2/MS3 and MS2-only runs were calculated. The results are displayed as Venn diagrams in Fig. 7 for both pairs of experiments. Given that there was significant variation between the number of peptide and protein identifications of the same run method, the two pairs of experiments were not combined to reduce the effect of instrument sampling rate variability in peptide identification, providing a fairer assessment of differences between the two methods. The top pair of Venn diagrams indicates the number of unique proteins identified by each method. Proteins were included in a set if they participated in an identified protein group (see Ref. 33) with a group probability of at least 0.95. Proteins from the same group (indistinguishable proteins given the sequences of identified peptides) were counted as a single entry. The lower set of Venn diagrams shows unique peptide identifications. Peptides were included in these sets if their modified sequences were unique, i.e. two peptides with any modification or sequence differences were considered two unique peptides for the main figure. PeptideProphet probability scores of 0.95 or above were required for inclusion. Peptide uniqueness can be defined by a number of standards, however, and the number of identifications listed in each area of the Venn diagram may be overestimated depending on the definition. The breakout boxes for each of the peptide sets indicate the number for each region of the Venn diagram under four alternative definitions of peptide uniqueness. Under the Type 1 definition, peptides identified from consecutive MS2 and MS3 scans that differ only by the loss of one or more phosphate groups on one of the residues (i.e. MS3 was triggered on the neutral loss) were considered identical and counted as one. Under the Type 2 definition, peptides that differ at the N or C terminus by one or more amino acid residues (e.g. due to a missed cleavage) were considered identical, e.g.

FVS+80EGDGGHVKPTTF
FVS+80EGDGGHVKPTTFTMR
FVS+80EGDGGHVKPTTFTMRD
where S+80 indicates a phosphorylated Ser residue. Under the Type 3 definition, peptides were counted as identical if they had the same sequence but the modification site was ambiguous (residues identified as being phosphorylated are within three amino acid residues of each other), e.g.

KES+80NSEDELEYDPSLYPQR
KESNS+80EDELEYDPSLYPQR
Under the Type 4 definition, peptides were counted as unique based on the sequence alone; e.g.

KKES+80NS+80EDELEYDPSLYPQR
KKES+80NSEDELEYDPSLYPQR
KKESNS+80EDELEYDPSLYPQR
KKESNS-18EDELEYDPSLYPQR

were considered identical sequences. Although these four definitions do not include all possible types and permutations that occur, using them to count peptides allows a more comprehensive comparison between the data sets.
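As an illustration, the counting under the strictest (modified sequence) and loosest (Type 4, sequence only) definitions can be sketched in Python. This is a minimal sketch and not the code used in this work: the function names are hypothetical, modifications are assumed to be written as a signed integer mass directly after the residue (e.g. S+80), and the probabilities in the usage example are invented for illustration only.

```python
import re

# Assumed notation: a signed integer mass offset follows the modified
# residue, e.g. "S+80" (phosphorylation) or "S-18".
_MOD = re.compile(r"[+-]\d+")


def bare_sequence(peptide):
    """Type 4 key: drop every modification mass, keep only the residues."""
    return _MOD.sub("", peptide)


def select(identifications, min_prob=0.95, key=lambda pep: pep):
    """Return the set of uniqueness keys for assignments passing min_prob.

    identifications: iterable of (peptide_string, probability) pairs.
    key: maps a peptide string to the value that defines uniqueness.
    """
    return {key(pep) for pep, prob in identifications if prob >= min_prob}


def venn_counts(ms2_ms3, ms2_only):
    """Sizes of the three Venn regions and of the union for two sets."""
    return {
        "ms2_ms3_only": len(ms2_ms3 - ms2_only),
        "both": len(ms2_ms3 & ms2_only),
        "ms2_only_only": len(ms2_only - ms2_ms3),
        "union": len(ms2_ms3 | ms2_only),
    }


# Hypothetical result sets (probabilities are made up for the example).
run_a = [
    ("KKES+80NS+80EDELEYDPSLYPQR", 0.99),
    ("KKES+80NSEDELEYDPSLYPQR", 0.97),
    ("FVS+80EGDGGHVKPTTF", 0.50),  # fails the 0.95 threshold
]
run_b = [
    ("KKESNS+80EDELEYDPSLYPQR", 0.96),
    ("FVS+80EGDGGHVKPTTFTMR", 0.98),
]

by_modified = venn_counts(select(run_a), select(run_b))
by_sequence = venn_counts(
    select(run_a, key=bare_sequence), select(run_b, key=bare_sequence)
)
```

With these inputs, the two phosphorylated forms in `run_a` count as two peptides under the modified sequence definition but collapse to a single entry shared with `run_b` under Type 4. Types 1-3 would need additional grouping rules (scan pairing, terminal extensions, and site distance) that are not shown here.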
The results indicate that for these data there are potential advantages to both techniques. At the protein level, the majority of proteins were identified by both methods. However, in one pair of runs the MS 2 -only method identified 42 more unique proteins than the MS 2 /MS 3 method. At the peptide level, the MS 2 /MS 3 method was able to identify more phosphorylated peptide forms in both sets of runs under most of the criteria in which modifications were considered unique (Types 1-3). In terms of the number of unique peptides identified by sequence alone (Type 4), not taking into account modification state, the MS 2 -only set identifies more peptides in one of the runs. This suggests that, at least for certain conditions, sequence coverage may be better with the MS 2 -only method.
Overall, these results indicate that generation of MS 3 data may result in a decrease in the number of unique peptide and protein identifications. However, several additional comments are necessary for a more objective evaluation of the benefits of acquiring MS 3 data. First, the probabilities used in the comparison presented above (Fig. 7) were the original probabilities generated by the PeptideProphet and ProteinProphet tools. The probability correction procedure described in this work should permit the selection of a greater number of peptides (and therefore proteins) at a fixed FDR, which would potentially mitigate the loss of sequence coverage. Furthermore, if the goal of the study is to identify as many unique modification states as possible, MS 3 data may improve the results. It should also be mentioned that the phosphopeptide data sets used in this work were of high quality (high degree of phosphopeptide enrichment), resulting in a sufficiently intense MS signal for phosphopeptide ions and relatively good MS 2 fragmentation. On the other hand, it is possible that in other data sets (e.g. no or poor phosphopeptide enrichment), the relatively low abundance of phosphorylated peptides would lead to a less intense MS signal and less interpretable MS 2 spectra, thus making the benefits of acquiring MS 3 data more apparent.
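The point about selecting more peptides at a fixed FDR can be made concrete with a short Python sketch. It assumes that each probability is an accurate posterior probability that the assignment is correct, so that the expected number of false identifications in an accepted set is the sum of (1 - p) over that set. The function names and the example probabilities are hypothetical and are not results from this study.

```python
def estimated_fdr(probs):
    """Model-based FDR of an accepted set: the mean of (1 - p)."""
    probs = list(probs)
    if not probs:
        return 0.0
    return sum(1.0 - p for p in probs) / len(probs)


def count_at_fdr(probs, max_fdr=0.01):
    """Largest number of top-ranked assignments with estimated FDR <= max_fdr."""
    ranked = sorted(probs, reverse=True)
    expected_false = 0.0
    best = 0
    for k, p in enumerate(ranked, start=1):
        expected_false += 1.0 - p
        if expected_false / k <= max_fdr:
            best = k
    return best


# Hypothetical probabilities for the same five assignments before and
# after an adjustment that raises corroborated matches and lowers others.
original = [0.99, 0.98, 0.90, 0.60, 0.20]
adjusted = [0.999, 0.995, 0.97, 0.90, 0.05]

n_original = count_at_fdr(original, max_fdr=0.05)
n_adjusted = count_at_fdr(adjusted, max_fdr=0.05)
```

In this toy example more assignments pass the same FDR cutoff after adjustment because the adjusted probabilities separate likely correct and likely incorrect assignments more sharply. Whether and how much this happens on real data depends on the data set.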
Concluding Remarks-The generation of MS 3 information is common in directed areas of proteomics such as phosphopeptide identification. Whether generation of MS 3 information is the best strategy depends in part on the overall goals of the experiment. Data generated from a complex phosphopeptide-enriched sample suggest that generation of MS 3 spectra can potentially result in an increased number of unique phosphorylation site identifications. On the other hand, the cycle time spent on generation of MS 3 data does appear to detract from the overall number of unique peptides (by sequence only) and proteins identified in such an experiment. Also, although MS 2 spectra in which neutral loss peaks are dominant are still observed in current generation trap instruments, these spectra appear to frequently contain better backbone fragmentation than older equivalents due to the increased ion capacity of the trap. Nevertheless, in experiments in which MS 3 data have been generated, MS 2 /MS 3 matching information from the entire experiment can be used to adjust the probabilities of the individual peptide assignments, which has the effect of compensating for the reduced number of MS 2 spectra.
In cases in which very high certainty in a mapped phosphorylation site is needed, MS 3 experiments are highly valuable, as exemplified in the mapping of phosphorylation sites for which biological follow-up experiments are performed. Also, in cases in which neither measurement time nor the amount of phosphopeptide sample is a limiting factor, the measurement of MS 3 spectra is advantageous. In fact, in an experimental setup that aims to maximize the number of identified phosphorylation sites from a complex sample, one efficient strategy is to first perform MS 2 experiments and then specifically target the unidentified phosphopeptide ions using MS 2 /MS 3 measurements (22,44).
Generally speaking, much of proteomics data analysis relies on the scores and probabilities produced by automated search algorithms. It is thus important that any probability measure is accurate and makes use of all available information, particularly in situations where the targeted peptide identifications are rare, e.g. for phosphopeptides and/or when proteins are identified by a reduced number of peptides (such as an analysis in which N-terminal peptides are enriched). Here we have described methods for translating the additional information obtained by matching coupled peptide assignments to MS 2 and MS 3 spectra into a combined probability score, improving the ability to discriminate between true positive and false positive identifications. Using a mixture of known standard proteins, we have demonstrated that the adjusted probabilities increase sensitivity and correspondingly decrease the error rate when selecting correct identifications. We also applied the method to a complex phosphopeptide-enriched data set, demonstrating improved discrimination between correct and incorrect peptide assignments for that sample.
The goal of this study was to describe a relatively simple but valid mechanism for adjusting probabilities of peptide identifications in scenarios in which standard database searching has been performed on MS 2 /MS 3 data sets. An alternative computational strategy for accommodating MS 3 information is to merge MS 2 and MS 3 spectra into a single spectrum prior to database searching. A full investigation of the relative merits of pre-database search spectral merging approaches versus a post-database search probability adjustment procedure such as the one discussed here is beyond the scope of this work but is the subject of current investigation. Other methodologies, such as merging spectra from differently charged precursors of the same peptide, could likely be utilized to improve peptide identification as well.
As instrumentation continues to improve the speed and accuracy of tandem MS measurements, the ability to generate complementary information such as MS 3 spectra for any given ion will become increasingly practical. Methods for accommodating this information are consequently useful and can significantly improve the quality of the results generated by automated processing of mass spectrometry data.
Data and Code Availability-mzXML and raw data files and processed unique linked pair data for both the 9-Mix and phospho samples are available on line via the Tranche system (ProteomeCommons). The software used in this work was developed in Python. Python modules were implemented making use of the code library available with the InsPecT software package by the University of California San Diego Computational Mass Spectrometry Research Group (28). All