Reference-facilitated Phosphoproteomics

Recent advances in instrument control and enrichment procedures have enabled us to quantify large numbers of phosphoproteins and record site-specific phosphorylation events. An intriguing problem that has arisen with these advances is to accurately validate where phosphorylation events occur, if possible, in an automated manner. The problem is difficult because MS/MS spectra of phosphopeptides are generally more complicated than those of unmodified peptides. For large scale studies, the problem is even more evident because phosphorylation sites are based on single peptide identifications in contrast to protein identifications where at least two peptides from the same protein are required for identification. To address this problem we have developed an integrated strategy that increases the reliability and ease for phosphopeptide validation. We have developed an off-line titanium dioxide (TiO2) selective phosphopeptide enrichment procedure for crude cell lysates. Following enrichment, half of the phosphopeptide fractionated sample is enzymatically dephosphorylated, after which both samples are subjected to LC-MS/MS. From the resulting MS/MS analyses, the dephosphorylated peptide is used as a reference spectrum against the original phosphopeptide spectrum, in effect generating two peptide spectra for the same amino acid sequence, thereby enhancing the probability of a correct identification. The integrated procedure is summarized as follows: 1) enrichment for phosphopeptides by TiO2 chromatography, 2) dephosphorylation of half the sample, 3) LC-MS/MS-based analysis of phosphopeptides and corresponding dephosphorylated peptides, 4) comparison of peptide elution profiles before and after dephosphorylation to confirm phosphorylation, and 5) comparison of MS/MS spectra before and after dephosphorylation to validate the phosphopeptide and its phosphorylation site. 
This approach to phosphopeptide identification represents a major improvement over identifications based only on single MS/MS spectra and probability-based database searches. We investigated the applicability of this method to crude cell lysates and demonstrate its application to the large scale analysis of phosphorylation sites in differentiating mouse myoblast cells.

Reversible protein phosphorylation at specific serine, threonine, and tyrosine residues is a key determinant in many fundamental cellular functions such as survival, differentiation, structural organization, and stress responses (1-4). Significant progress has been made in phosphopeptide identification at the femtomole level, and phosphoproteomics now permits rapid and effective identification and quantification of a large number of phosphoproteins (5-18). Large scale phosphoproteomics analysis generally depends on selective proteolytic digestions, selective phosphopeptide enrichment methods, and sensitive and informative LC-MS/MS. However, there are several unsolved issues regarding the use of these techniques in biological applications. Although recent improvements of phosphopeptide enrichment methods have raised the expectations for the field of phosphoproteomics (5, 9, 17, 19-21), the technologies used are still not sufficiently reliable or easy to use, and the general level of knowledge is insufficient for global in vivo analysis of phosphopeptides. In particular, reliable site-specific phosphopeptide identification and validation by MS/MS is more difficult than identification of unmodified peptides.
Current large scale proteomics analyses based on MS and database searches most likely include a substantial number of false positive identifications (22, 23). Currently, reliable protein identification requires the identification of at least two unique peptides, of significant quality or high probability, derived from each protein. In large scale phosphorylation studies, the likelihood of false positive data is much greater because the identification of phosphorylation is based on a single peptide identification, and MS/MS spectra of phosphopeptides are typically more complicated than those of unmodified peptides. Thus, there is a great risk of a large number of false positive identifications of phosphorylation sites. In this study, we show by examples that in many cases the phosphopeptides can be found, but the exact site of phosphorylation is ambiguous or difficult to pinpoint.
To pinpoint the site of phosphorylation one must use one or more of a number of reliable methods available for validation of phosphorylation sites. However, these are not adequate when applied to large scale phosphoproteomics data because the number of identifications can overwhelm cost and personnel resources in most laboratories. Fig. 1 presents the dilemma in scaling up: more putative identifications but less confidence in what is correct. Therefore we have developed a relatively fast and reliable phosphopeptide validation procedure. To validate phosphopeptide identification, phosphopeptides are enzymatically dephosphorylated, after which the dephosphorylated peptide is used as a reference product against the original phosphopeptide. Comparison of phosphopeptide and corresponding dephosphorylated peptide is achieved after performing a two-step LC-MS/MS procedure, with a tandem-in-space mass spectrometer (in our case a hybrid Q-TOF tandem mass spectrometer), on both the phosphopeptide sample and the corresponding dephosphorylated sample. To succeed with this strategy, a significantly improved off-line phosphopeptide enrichment method compatible with LC-MS was developed. This step in the overall procedure eliminates corresponding non-phosphorylated peptides from the original sample prior to the following enzymatic dephosphorylation step of the phosphopeptide-enriched fraction. The enriched phosphopeptide fraction is then dephosphorylated, thereby creating a pool of peptides that had once been phosphorylated. Two common phosphopeptide enrichment procedures were further developed, IMAC (16, 17, 24) and titanium dioxide (TiO2)¹ chromatography (20, 25-27), with marked optimization especially of the latter technique (our recent report (24) already described the optimization of IMAC). After the development of the phosphopeptide validation method using caseins we investigated the applicability of this method with differentiating mouse myoblast cells.

EXPERIMENTAL PROCEDURES
Tryptic Digests of Standard Phosphoproteins-Commercially available phosphoproteins, 1 nmol of bovine α- and β-casein (Sigma), were separately digested with 1 µg of modified trypsin (Promega, Madison, WI) in 900 µl of 50 mM NH4HCO3 at 37°C for 20 h. Tryptic digests were slightly acidified with HCOOH and adjusted to 1 ml with H2O (1 pmol/µl). A tryptic digest of a non-phosphorylated protein, BSA (Sigma), was also prepared at a final concentration of 10 pmol/µl. To optimize the elution condition in TiO2 chromatography, 10 µl of the tryptic casein mixture consisting of 5 pmol of both tryptic α- and β-casein was diluted 10-fold with 0.1% TFA, 80% ACN, which was initially used as the TiO2 washing solution, and then subjected to TiO2 chromatography (see below). To optimize washing conditions in IMAC and TiO2 chromatography, 10 µl of the tryptic casein and BSA mixture consisting of 500 fmol of both tryptic α- and β-caseins and 25 pmol of BSA was diluted 10-fold with one of the tested washing solutions and then subjected to IMAC or TiO2 chromatography (see below).
Optimization of TiO2 Procedure-The TiO2 chromatography procedure used in this study was as follows. The Sachtopore-NP TiO2 beads (20 µm, 300 Å; ZirChrom, Anoka, MN) were suspended in 0.1% TFA, 30% ACN to a final concentration of 10 mg of TiO2/ml. Twenty microliters of this TiO2 suspension was loaded onto a constricted GELoader tip (Eppendorf, Hamburg, Germany). Note that an end cap filter, such as Empore C8 membrane (20), was not needed to pack the TiO2 beads into the constricted GELoader tip. After equilibration of the TiO2 microcolumn with 10 µl of one of the tested washing solutions, the tryptic casein mixture or the tryptic casein and BSA mixture diluted with the washing solution was loaded onto the column. Three washes of the column were performed with 10 µl of the washing solution followed by an additional wash with 10 µl of 0.1% TFA. The retained peptides were eluted with 10 µl of one of the tested eluants and immediately acidified with HCOOH if the eluant used was basic. For MALDI-MS, a volatile eluant was removed by evaporation followed by two repeat evaporations after addition of 50 µl of H2O. The residue was then dissolved in 10 µl of 0.1% HCOOH for optimization of the elution condition or in 5 µl for optimization of the washing condition. Non-volatile eluants, however, were desalted with a C18 microcolumn. The TiO2 eluates were diluted 10-fold with 0.1% HCOOH; loaded onto a microcolumn that was made with a piece of Empore C18 disk (3M, St. Paul, MN) packed into a 200-µl pipette tip (28); washed three times with 10 µl of 0.1% HCOOH; eluted with 5 µl of 0.1% HCOOH, 80% ACN; and then diluted with 5 µl of 0.1% HCOOH.
Optimization of IMAC Procedure-The IMAC procedure used in this study was as follows. The Poros 20 MC beads (Applied Biosystems, Framingham, MA) were charged with FeCl3 and reconstituted to a final concentration of 10 mg of Poros beads/ml of 70 mM CH3COOH, 30% ACN solution. Twenty microliters of this Fe3+-Poros slurry was loaded onto a constricted GELoader tip (29). After equilibration of the IMAC microcolumn with 10 µl of one of the tested washing solutions (such as 500 mM CH3COOH, 30% ACN), the tryptic casein and BSA mixture diluted with the washing solution was loaded onto the column. Three washes of the column were performed with 10 µl of the washing solution, and then the retained peptides were eluted with 10 µl of the previously optimized IMAC eluant, 5% H3PO4, 50% ACN (24). For MALDI-MS, the IMAC eluates were applied to the C18 microcolumn; washed with 0.1% HCOOH; eluted with 2.5 µl of 0.1% HCOOH, 80% ACN; and then diluted with 2.5 µl of 0.1% HCOOH.
¹ The abbreviations used are: TiO2, titanium dioxide; AP, alkaline phosphatase; CE, collision energy; DHB, 2,5-dihydroxybenzoic acid; IF, intermediate filament; XIC, extracted ion chromatogram; Cdk5, Cyclin-dependent kinase 5.
FIG. 1. ... (54, 55) with amino acid mutations or comparison with synthetic phosphopeptides as reference materials; however, they are time-consuming and not applicable to hundreds or thousands of identifications, which could contain a considerable number of false positive identifications, e.g. identifications with at least 95% probability could contain at most 5% false positive identifications. There was no standard validation method for the large scale phosphoproteomics data. Therefore, we have developed a fast and reliable phosphopeptide validation method using phosphatase treatment and LC-ESI-Q-TOF MS/MS.
MALDI-MS-MALDI-MS was performed using an Ultraflex TOF/TOF tandem mass spectrometer (Bruker Daltonics, Bremen, Germany). Aliquots (0.5 µl) of the IMAC and TiO2 eluates with or without the C18 treatment were loaded onto an AnchorChip MALDI probe (400 µm; Bruker Daltonics) together with 0.5 µl of a matrix solution consisting of 5 g/liter 2,5-dihydroxybenzoic acid (DHB; Aldrich), 0.3% H3PO4 (30, 31). All spectra were obtained from m/z 640 to m/z 4000 by 3000 laser shots in positive reflector mode. The washing and elution efficacies of IMAC and TiO2 chromatography were investigated using the reported tryptic phosphopeptides derived from α- and β-caseins (Supplemental Table 1) (20, 31).
Alkaline Phosphatase Treatment Followed by LC-MS/MS-One picomole of the tryptic casein mixture was enriched for phosphopeptides using our optimized TiO2 chromatography, which consisted of washing with 750 mM TFA, 80% ACN and elution with 5% NH4OH (pH 12.0). The TiO2 eluate was acidified with HCOOH, evaporated, and then dissolved in 20 µl of 100 mM NH4HCO3. One-half was enzymatically dephosphorylated with 10 µl of 0.1 unit/µl alkaline phosphatase (AP; Promega, calf intestinal) in 100 mM NH4HCO3 at 37°C for 1 h. After desalting with the C18 microcolumn, aliquots of the resulting TiO2 and TiO2-AP samples corresponding to 80 fmol of caseins were subjected to LC-MS/MS, Mascot database search, and the data comparison before and after AP treatment. Note that 20 units/µl Promega AP containing 50% glycerol was used in this study; however, complete removal of glycerol with C18 material was not achieved, and consequently (even small amounts of) glycerol affected the LC-MS/MS analysis. We now use and advise use of an alternative AP reagent containing no glycerol (Roche Applied Science, calf intestinal, enzyme immunoassay (EIA) grade).
Reference-facilitated Phosphoproteomics Analysis of Differentiating Mouse Myoblasts-C2C12 mouse myoblast cells were grown at 37°C in a 7% CO2 atmosphere in proliferation medium consisting of Dulbecco's modified Eagle's medium (Invitrogen) supplemented with 10% (v/v) fetal bovine serum, 2 mM glutamine, penicillin G, and streptomycin in a 150-mm cell culture dish. When the cells were grown to 80% confluence, differentiation was induced by replacing the proliferation medium with differentiation medium (Dulbecco's modified Eagle's medium supplemented with 2% fetal calf serum and 4 mM glutamine) and further incubation at 37°C. After 48 h in differentiation medium, the cells (usually around 2 × 10^7 per dish) were directly lysed in two aliquots of 500 µl of Laemmli SDS-PAGE sample buffer (total, 1 ml) and boiled for 5 min. A quarter of the cell lysate was separated by SDS-PAGE in two 10% acrylamide gels (8-cm width × 7-cm height; 12 of 20 lanes were used).
Ten even bands of the gels were cut out from top to bottom (bands 1-10). Each gel band was washed with 50 mM NH4HCO3, 50% ACN; reduced with 10 mM DTT in 50 mM NH4HCO3; alkylated with 50 mM iodoacetamide in 50 mM NH4HCO3; washed with 50 mM NH4HCO3, 50% ACN three times and with ACN; evaporated; and then in-gel digested with 1 µg of modified trypsin in 400 µl of 50 mM NH4HCO3 at 37°C for 16 h. Tryptic digests were eluted with 0.1% HCOOH, 50% ACN twice and with ACN, evaporated, and subjected to TiO2 chromatography. Half of each TiO2 eluate (bands 1-10) was dephosphorylated by AP treatment. After desalting with the C18 microcolumn, the resulting TiO2 and TiO2-AP samples were subjected to LC-MS/MS, Mascot database search (below), and the data comparison before and after AP treatment. The procedure used for this analysis is summarized in Fig. 2.
LC-ESI-Q-TOF MS/MS-LC-MS/MS was performed with a nanoflow LC system (Famos, SwitchosII, and Ultimate; LC Packings, Sunnyvale, CA) coupled to a QSTAR Pulsar ESI-hybrid Q-TOF tandem mass spectrometer (Applied Biosystems/MDS Sciex, Toronto, Canada). The program Analyst QS (version 1.1; Applied Biosystems) was used for data acquisition and instrument control. Dynamic exclusion was applied that prevented the same m/z from being selected for 2 min after its acquisition.
Mascot Database Search-Data were analyzed with the Analyst QS software, and peak lists were generated using the Mascot.dll script of the software. MS/MS spectra were centroided and deisotoped. The peak intensity threshold was 0.1% of the base peak. Data of caseins and mouse myoblasts were searched using Mascot (version 2.1; Matrix Science, Boston, MA) against the Swiss-Prot database (version 48.5, other Mammalia, 6907 sequences) on December 9, 2005 and the National Center for Biotechnology Information non-redundant (NCBI nr) database (version 20051120, Mus musculus, 96,500 sequences) on November 26, 2005, respectively. One missed cleavage site was allowed, carbamidomethylation was searched as a fixed modification, and methionine oxidation and phosphorylation of serine/threonine/tyrosine were allowed as variable modifications. Mass tolerance in MS and MS/MS modes was 0.2 Da. Peptide matches with a Mascot expectation value of less than 0.05 were taken to indicate identity; this threshold corresponded roughly to a Mascot score of more than 25 against the Swiss-Prot database (other Mammalia) and more than 37 against the NCBI nr database (M. musculus). Although molecular weight suggested by SDS-PAGE was used as supporting information, proteins shown in this study were singled out as examples from the multiple candidates searched by Mascot without any other supporting information. Therefore, the results of this study should be regarded as phosphopeptide identifications rather than precise protein identifications.

Optimization of IMAC and TiO2 Chromatography for Phosphopeptide Enrichment-The efficacy of several washing and elution conditions in off-line Fe3+-charged IMAC and TiO2 chromatography was evaluated by MALDI-MS with the reported phosphopeptides (20, 31) derived from tryptic digests of commercially available phosphoproteins, bovine α- and β-caseins. First the phosphopeptide elution condition of TiO2 chromatography was optimized. Tryptic digests of 5 pmol of caseins were loaded onto the TiO2 microcolumn followed by washing with 0.1% TFA, 80% ACN. Phosphopeptides were eluted with the following eluants: NH4OH, Na2HPO4, H3PO4/ACN, and NH4H2PO4/ACN. Aliquots (5%, v/v) of eluates desalted with a C18 microcolumn were analyzed and compared by MALDI-MS with DHB/H3PO4 matrix (30, 31). Many reported phosphopeptides (~18) were observed even after washing with the high concentration of ACN (Supplemental Table 1). The eluant 5% NH4OH (pH 12.0) showed a greater number of phosphopeptides (15 phosphopeptides), especially multiply phosphorylated peptides, with more intensity and better signal-to-noise ratio than other tested eluants. The order of the elution efficacy was 5% NH4OH (pH 12.0) > 0.6% NH4OH (pH 10.5) ~ 100 mM Na2HPO4 > 5% H3PO4, 50% ACN > 100 mM NH4H2PO4, 50% ACN (Supplemental Fig. 1). The reported eluant NH4OH (20) also was most efficient in our experiments (32). Although NH4OH (pH 10.5) has been reported previously as the optimal eluant with other TiO2 beads (Titansphere, 5 µm; GL Sciences, Tokyo, Japan) where a more basic eluant did not result in further improvement (20), here NH4OH (pH 12.0) was more efficient as an eluant using the Sachtopore-NP TiO2 beads (20 µm; ZirChrom).
Subsequently the washing condition of TiO2 chromatography was optimized. Tryptic digests of 500 fmol of caseins and an excess of 25 pmol of BSA were loaded onto the TiO2 microcolumn followed by washing with each solution (such as DHB/TFA/ACN, TFA/ACN, and CH3COOH/ACN). Aliquots (10%, v/v) of 5% NH4OH eluates were analyzed and compared by MALDI-MS. The volatile NH4OH was removed from the eluates by evaporation, instead of the C18 treatment, prior to MALDI-MS analysis. The order of the washing efficacy was 750-1500 mM TFA, 80% ACN > 375 mM TFA, 80% ACN > 150 mM TFA, 80% ACN >> 0.1% TFA, 80% ACN ~ 20 g/liter DHB, 0.1% TFA, 80% ACN > 100-1000 mM CH3COOH, 80% ACN (Supplemental Fig. 2). The efficacy correlated with TFA concentration, and washing with the high concentration of TFA significantly decreased nonspecific binding of peptides, whereas the casein phosphopeptides were still retained by the TiO2 microcolumn (32). The optimized TFA concentration of 750 mM is ~6% (v/v) TFA. The reagents used in this optimized TiO2 chromatography can be removed by either evaporation or C18 microcolumn. After the evaporation, almost all peptides containing a methionine residue appeared to be oxidized.
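The equivalence between 750 mM TFA and roughly 6% (v/v) can be checked with a short calculation. The following is a minimal sketch only: the molar mass and density of neat TFA are assumed handbook values that do not appear in the text, the function name is hypothetical, and volume changes on mixing are ignored.

```python
# Assumed handbook values for neat trifluoroacetic acid (not given in the text).
TFA_MOLAR_MASS_G_PER_MOL = 114.02
TFA_DENSITY_G_PER_ML = 1.489


def tfa_mM_to_percent_v_v(conc_mM: float) -> float:
    """Convert a TFA concentration in mM to % (v/v), ignoring mixing effects."""
    grams_per_liter = conc_mM / 1000.0 * TFA_MOLAR_MASS_G_PER_MOL
    ml_per_liter = grams_per_liter / TFA_DENSITY_G_PER_ML
    return ml_per_liter / 10.0  # ml per 1000 ml expressed as a percentage


if __name__ == "__main__":
    # TFA concentrations tested as washing solutions in the text.
    for conc in (150, 375, 750, 1500):
        print(f"{conc} mM TFA is about {tfa_mM_to_percent_v_v(conc):.1f}% (v/v)")
```

Under these assumptions 750 mM comes out slightly below 6% (v/v), in line with the approximate figure quoted above.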
Optimization of the elution condition in Fe3+-charged IMAC has already been achieved in our previous study, in which the H3PO4/ACN eluant was more efficient than previously known IMAC eluants such as DHB/H3PO4/ACN, Na2HPO4, NH4H2PO4, and NH4OH (24). The washing condition of IMAC was therefore also investigated (Supplemental Fig. 3). Washing with a high concentration of CH3COOH decreased nonspecific binding of peptides; however, neither TFA nor a high concentration of ACN resulted in sufficient phosphopeptide retention. Finally the IMAC and TiO2 chromatography optimized in this study were compared, and the TiO2 chromatography proved more selective for the phosphopeptides than the IMAC (Fig. 3).
Comparison of Tryptic Casein Phosphopeptides by LC-ESI-Q-TOF MS/MS before and after Enzymatic Dephosphorylation-To develop the validation method of phosphopeptide identification, enzymatic dephosphorylation and comparative LC-MS/MS analysis were evaluated. After TiO2 chromatography of 1 pmol of tryptic caseins, half of the resulting eluate was treated with AP. Aliquots of both casein samples, TiO2 and TiO2-AP, corresponding to 80 fmol were subjected to LC-ESI-Q-TOF MS/MS, and consequently four singly phosphorylated peptides (TiO2) and eight non-phosphorylated peptides (TiO2-AP) were identified (Supplemental Table 1). In the TiO2 sample only phosphorylated peptides were found, and in the TiO2-AP sample only dephosphorylated peptides were found. All methionine-containing peptides were identified only in oxidized form. Four of the eight dephosphorylated peptides corresponded to the four phosphopeptides. Phosphopeptide enrichment with TiO2 chromatography and ensuing enzymatic dephosphorylation were more stringently evaluated by comparison of extracted ion chromatograms (XICs; ±0.2 Da) of TiO2 and TiO2-AP samples. Only phosphopeptides and the corresponding dephosphorylated peptides with the same oxidation states and charge states were used for all the comparative analyses in this study. Phosphopeptide ion peaks were clearly observed in the XIC of the TiO2 sample; however, no ion peaks of the corresponding non-phosphorylated peptides (Δm used for HPO3, 79.9663 Da) were observed (Fig. 4). As expected, phosphopeptide peaks were not observed in the XIC of the TiO2-AP sample; however, ions for the corresponding dephosphorylated peptides were clearly observed. All four corresponding dephosphorylated peptides showed slightly earlier retention times than the singly phosphorylated peptides, consistent with previous findings (33, 34).
It should be noted that the peak areas of the corresponding dephosphorylated peptides were 2-5 times larger than those of the phosphopeptides. One of the reasons for that could be false negative identification of multiply phosphorylated peptides in the TiO2 sample. These results indicated that the identified phosphopeptides were enzymatically dephosphorylated; therefore, phosphorylation was not only suggested by the database search of MS/MS spectra but also confirmed by the enzymatic reaction.
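The XIC comparison rests on a simple mass relationship: removing one phosphate lowers the peptide mass by 79.9663 Da (HPO3), so the m/z of an ion of charge z drops by 79.9663/z. The following minimal sketch shows how the target m/z and the ±0.2-Da extraction window for the dephosphorylated counterpart could be derived. The function names are hypothetical and are not part of any software used in the study.

```python
HPO3_MASS_DA = 79.9663      # mass difference per phosphate, as quoted in the text
XIC_TOLERANCE_DA = 0.2      # XIC extraction window half-width, as quoted in the text


def dephospho_mz(phospho_mz: float, charge: int, n_phospho: int = 1) -> float:
    """Expected m/z of the dephosphorylated peptide at the same charge state."""
    return phospho_mz - n_phospho * HPO3_MASS_DA / charge


def xic_window(target_mz: float, tol: float = XIC_TOLERANCE_DA) -> tuple:
    """Lower and upper m/z bounds used to extract an ion chromatogram."""
    return (target_mz - tol, target_mz + tol)


def in_window(observed_mz: float, target_mz: float,
              tol: float = XIC_TOLERANCE_DA) -> bool:
    """True if an observed m/z falls inside the XIC window around target_mz."""
    low, high = xic_window(target_mz, tol)
    return low <= observed_mz <= high


if __name__ == "__main__":
    # Doubly charged casein phosphopeptide ion at m/z 741.80 (value from the text).
    target = dephospho_mz(741.80, charge=2)
    print(f"dephosphorylated target m/z: {target:.3f}")
    print("extraction window:", xic_window(target))
```

Because the pair is compared at the same charge state and oxidation state, the phosphate is the only expected mass difference between the two traces.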
Although four of the singly phosphorylated tryptic casein peptides were identified by the Mascot database search of the MS/MS spectra, some of them had multiple candidates of phosphorylation sites with high probability; for example a phosphopeptide derived from α-S2-casein (CAS2_BOVIN), TVDoMEpSTEVFTK (residues 153-164), where oM is oxidized methionine and pS is phosphoserine ([M + 2H]2+ m/z 741.80), was identified with two phosphorylation site candidates, Ser(P)-158 and Thr(P)-159, with Mascot scores of 64.0 and 50.8, respectively (Fig. 5A). To identify the exact phosphorylation site, fragment ions between the candidates must be annotated; however, they are often uncertain due to poor ion signals and/or other misleading ions. By comparing MS/MS spectra of the phosphopeptides and the corresponding dephosphorylated peptides, we found with our setup that both showed highly similar fragmentation patterns in two respects: the same fragment ions were present, and their intensity distributions were similar (Fig. 5A).
FIG. 3. ... Figs. 1-3). To compare the optimized IMAC and TiO2 chromatography, tryptic digests of 500 fmol of caseins and an excess (25 pmol) of BSA were loaded onto both microcolumns, and after the washing, aliquots (10%, v/v) of the eluates were analyzed by MALDI-MS. Consequently the TiO2 chromatography was more highly selective for the phosphopeptides than the IMAC. The reagents used in the optimized TiO2 chromatography can be removed by either evaporation or C18 treatment. Almost all peptides containing a methionine residue appeared to be oxidized after the evaporation. Reported casein phosphopeptides (20, 31) and their oxidized forms are indicated by a red asterisk and blue pound sign, respectively (Supplemental Table 1 ...
Although full estimation of detectable fragment ions is generally not easy or practical, using corresponding dephosphorylated peptides as reference products provides a new and informative fragmentation analysis of these phosphopeptides based on the presence of fragment ions and the similarity of their intensities.
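The precursor m/z quoted for the casein example can be reproduced from the sequence. The sketch below is illustrative only: it uses standard monoisotopic residue and modification masses, which are reference values rather than numbers taken from the text, and the function name is hypothetical.

```python
# Standard monoisotopic residue masses in Da (reference values, not from the text).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565      # H2O added for the peptide termini
PROTON = 1.007276      # charge carrier
OXIDATION = 15.994915  # methionine oxidation (+O)
PHOSPHO = 79.966331    # phosphorylation (+HPO3)


def precursor_mz(sequence: str, charge: int,
                 n_oxidized_met: int = 0, n_phospho: int = 0) -> float:
    """Monoisotopic m/z of a peptide ion with the given modifications."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    neutral += n_oxidized_met * OXIDATION + n_phospho * PHOSPHO
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    seq = "TVDMESTEVFTK"  # alpha-S2-casein residues 153-164
    phos = precursor_mz(seq, charge=2, n_oxidized_met=1, n_phospho=1)
    dephos = precursor_mz(seq, charge=2, n_oxidized_met=1, n_phospho=0)
    print(f"phosphopeptide [M+2H]2+:   {phos:.2f}")
    print(f"dephosphorylated [M+2H]2+: {dephos:.2f}")
```

For the oxidized, singly phosphorylated peptide the computed value falls well within the 0.2-Da search tolerance of the m/z 741.80 quoted above.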
MS/MS using CID usually generates the dominant neutral loss of H3PO4 (−97.98 Da) from the labile phosphate group on the serine and threonine residues, resulting in dehydroalanine and 2-aminodehydrobutyric acid, respectively. In this study with the casein phosphopeptides, mainly the dominant neutral loss ions were annotated by the Mascot database search as the fragment ions originally phosphorylated at serine residues, and fragment ions still containing a phosphate group were also annotated. Consequently the fragment ions containing the phosphorylated serine residues mainly showed −18.01 Da and also a small amount of +79.97 Da in comparison with the corresponding fragment ions derived from the dephosphorylated peptide. The similarity of fragmentation patterns between the phosphopeptides and the corresponding dephosphorylated peptides was based on overall similarity of relative intensities of the corresponding fragment ion pairs (Fig. 5 and Supplemental Fig. 5). Furthermore identification of the dephosphorylated peptide should be much more reliable than that of the phosphopeptide because of less complicated MS/MS spectra and fewer candidates with high probability on the database search; therefore, fragment ion annotation of the dephosphorylated peptide should be much more reliable than that of the phosphopeptide, especially annotation of the fragment ions containing phosphorylation site candidates.
FIG. 5. Validation of a phosphopeptide and its phosphorylation site. A, MS/MS spectra of a tryptic phosphopeptide, TVDoMEpSTEVFTK (residues 153-164), and the corresponding dephosphorylated peptide were compared. The spectra were centroided. They showed highly similar fragmentation patterns for not only the mass of fragment ions but also their intensity distributions. This feature can be used to quickly validate phosphopeptide sequence identification based on the more reliable identification of the dephosphorylated peptide. In this case, two phosphorylation site candidates, Ser(P)-158 and Thr(P)-159, were suggested as highly probable with Mascot scores of 64.0 and 50.8, respectively. Fragment ion annotations of the phosphopeptide were validated as stated below. Consequently the weak but validated y6 ion between the two candidates resolved the ambiguous phosphorylation site candidates to the known site, Ser(P)-158. B, fragment ion annotations are validated by the comparison of MS/MS spectra. The fragment ion containing the phosphorylated serine/threonine residues mainly shows −18.01 Da (dominant neutral loss of H3PO4) and also a small amount of +79.97 Da (original fragment ion) in comparison with the corresponding fragment ion derived from the dephosphorylated peptide. This corresponding fragment ion pair shows similarity of relative intensities as indicated in the figure. The fragment ion pair containing no phosphorylation also shows similarity of relative intensities without mass shift. The fragment ion annotations of phosphopeptides are validated based on the comparison with the corresponding fragment ion annotations of the dephosphorylated peptide, which should be more reliable. C, in the case of tyrosine phosphorylation, the fragment ion containing the phosphorylated tyrosine residue does not show the dominant neutral loss, but it does show the original fragment ion. Although caseins did not contain phosphorylated threonine/tyrosine, these phosphorylation sites were theoretically considered and then confirmed in the following C2C12 experiment. pT, phosphothreonine; pS, phosphoserine.
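The mass shifts used to pair fragment ions follow from simple arithmetic: a fragment that keeps its phosphate is 79.97 Da heavier than the reference fragment, and one that has lost H3PO4 is 79.97 − 97.98 = −18.01 Da relative to it. The following minimal sketch classifies a fragment ion pair accordingly. The function name, the 0.2-Da tolerance applied at the fragment level, and the example m/z values (theoretical y ions of the casein peptide computed from standard monoisotopic masses) are assumptions for illustration, not the authors' software or measured data.

```python
HPO3 = 79.9663    # Da, phosphate retained on the fragment
H3PO4 = 97.9769   # Da, neutral loss from phosphoserine/phosphothreonine
NEUTRAL_LOSS_SHIFT = HPO3 - H3PO4  # about -18.01 Da versus the reference


def classify_shift(phos_mz: float, ref_mz: float,
                   charge: int = 1, tol: float = 0.2) -> str:
    """Classify a phosphopeptide fragment against its dephosphorylated reference."""
    delta = (phos_mz - ref_mz) * charge  # convert the m/z difference to mass
    if abs(delta) <= tol:
        return "no phosphorylation site in fragment"
    if abs(delta - NEUTRAL_LOSS_SHIFT) <= tol:
        return "phosphorylated, neutral loss of H3PO4"
    if abs(delta - HPO3) <= tol:
        return "phosphorylated, phosphate retained"
    return "unexplained shift"


if __name__ == "__main__":
    # Theoretical singly charged y ions of TVDoMEpSTEVFTK (illustrative values).
    pairs = {
        "y6": (724.388, 724.388),  # TEVFTK, excludes Ser-158
        "y7": (793.409, 811.420),  # STEVFTK after loss of H3PO4
        "y7 (intact)": (891.386, 811.420),
    }
    for name, (phos, ref) in pairs.items():
        print(name, "->", classify_shift(phos, ref))
```

For phosphotyrosine only the retained-phosphate branch is expected to apply, since the text notes that such fragments do not show the dominant neutral loss.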
Detected fragment ions derived from the casein phosphopeptides were compared with the corresponding fragment ions derived from the dephosphorylated peptides with similar relative intensities (Fig. 5A). Using this strategy the identification of the phosphorylation sites was quickly validated without manual annotation or sequencing. However, if no fragment ion is detected between phosphorylation candidates, the identification cannot be validated.
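Which fragment ions lie "between" two candidate sites can be enumerated directly from the peptide length and the candidate positions. The sketch below is a hypothetical helper, not part of the published workflow; it lists the b and y ions that contain exactly one of the two candidates and can therefore distinguish them.

```python
def discriminating_ions(peptide_length: int, site_a: int, site_b: int) -> list:
    """Return b and y ions that contain exactly one of two candidate sites.

    Sites are 1-based positions within the peptide. A b_n ion covers residues
    1..n; a y_m ion covers the last m residues.
    """
    low, high = sorted((site_a, site_b))
    # b_n contains the N-terminal candidate only when low <= n < high.
    b_ions = [f"b{n}" for n in range(low, high)]
    # The complementary y ion (m = length - n) contains only the C-terminal one.
    y_ions = [f"y{peptide_length - n}" for n in range(low, high)]
    return b_ions + y_ions


if __name__ == "__main__":
    # TVDoMEpSTEVFTK spans residues 153-164, so Ser-158 and Thr-159
    # are positions 6 and 7 of the 12-residue peptide.
    start = 153
    ions = discriminating_ions(12, 158 - start + 1, 159 - start + 1)
    print(ions)
```

For adjacent candidates such as Ser-158 and Thr-159 only one b/y pair qualifies, which is why a single weak but validated y6 ion decides the assignment in the casein example; if none of the listed ions is detected, the site remains ambiguous.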
Reference-facilitated Phosphoproteomics Analysis of Differentiating Mouse Myoblast C2C12 Cells-To evaluate the use of the validation strategy with larger sample sizes, we performed phosphorylation analysis of crude cell extracts from differentiating mouse myoblast C2C12 cells (Fig. 2). A crude lysate extracted from 5 × 10^6 C2C12 cells was separated by SDS-PAGE (distributed over 10 bands), in-gel digested with trypsin, and then subjected to TiO2 chromatography phosphopeptide enrichment. Half of each resulting eluate was treated with AP. Subsequently both the TiO2 and TiO2-AP samples were analyzed by data-dependent LC-MS/MS. Phosphorylation analysis of the TiO2 samples using Mascot yielded 297 phosphopeptide candidates. The results from this experiment are summarized in Fig. 6. Phosphopeptides of identical amino acid sequence that differed only in the location or number of phosphorylations were counted as one peptide. Fifty-eight of 68 methionine-containing peptides were oxidized, and three of the 68 were found in both oxidized and non-oxidized forms, indicating that ~90% of the methionine-containing peptides (61 of 68) were identified in oxidized form. The Mascot search of the TiO2-AP samples resulted in 512 peptides with a high probability, of which none were phosphopeptides; 135 dephosphorylated peptides corresponded to the phosphopeptides identified in the TiO2 samples from the same SDS-PAGE bands with the same oxidation states and charge states (above and Fig. 6), but three dephosphorylated peptides had different charge states. Only three of the 135 were also identified in the TiO2 samples as non-phosphorylated peptides, indicating the specificity of the TiO2 enrichment.
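The pairing rule described above (same SDS-PAGE band, same sequence, same oxidation state, same charge state) amounts to a keyed set intersection. The following is a minimal sketch under stated assumptions: the record type, field names, and example entries are hypothetical and do not reflect the actual Mascot export format or real identifications.

```python
from typing import NamedTuple


class PeptideID(NamedTuple):
    band: int            # SDS-PAGE band number (1-10)
    sequence: str        # unmodified amino acid sequence
    n_oxidized_met: int  # number of oxidized methionines
    charge: int          # precursor charge state
    n_phospho: int       # number of phosphate groups assigned


def pairing_key(p: PeptideID) -> tuple:
    """Key on which a phosphopeptide and its reference peptide must agree."""
    return (p.band, p.sequence, p.n_oxidized_met, p.charge)


def match_pairs(tio2_ids: list, tio2_ap_ids: list) -> list:
    """Keys of phosphopeptides that have a dephosphorylated counterpart."""
    phospho_keys = {pairing_key(p) for p in tio2_ids if p.n_phospho > 0}
    reference_keys = {pairing_key(p) for p in tio2_ap_ids if p.n_phospho == 0}
    return sorted(phospho_keys & reference_keys)


if __name__ == "__main__":
    # Made-up records for illustration only.
    tio2 = [
        PeptideID(3, "AAASPEK", 0, 2, 1),
        PeptideID(3, "AAASPEK", 0, 2, 2),  # same sequence, counted once
        PeptideID(5, "MTSPLK", 1, 2, 1),
    ]
    tio2_ap = [
        PeptideID(3, "AAASPEK", 0, 2, 0),
        PeptideID(5, "MTSPLK", 1, 3, 0),   # different charge state, no pair
    ]
    print(match_pairs(tio2, tio2_ap))
```

Keying on the unmodified sequence means that singly and multiply phosphorylated forms of one peptide collapse onto a single pair, consistent with how peptides are counted in this analysis.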
XICs of the 135 phosphopeptide and dephosphorylated peptide pairs from the TiO2 and TiO2-AP samples were compared (data not shown). To estimate retention time shifts, 88 of the non-phosphorylated peptides identified from both TiO2 and TiO2-AP were used as markers. The comparison of XICs of the 135 peptide pairs clearly showed the decrease of phosphopeptide ions and increase of dephosphorylated peptide ions. Although the three peptide pairs showed peaks of the non-phosphorylated peptides in the XICs of the TiO2 samples, these peaks were relatively small and increased markedly after the AP treatment, as expected for dephosphorylated peptides. Consequently phosphorylations on the 135 phosphopeptides identified by the Mascot search were confirmed by enzymatic dephosphorylation.
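The XIC comparison can be sketched in two steps: estimating the retention time shift from the shared marker peptides, and then checking that the phosphopeptide signal falls while the dephosphorylated signal rises. This is a minimal illustration under stated assumptions, not the procedure used here: the use of the median, the fold-change threshold `min_fold`, and all function and parameter names are hypothetical.

```python
from statistics import median

def estimate_rt_shift(marker_rt_tio2, marker_rt_ap):
    """Median retention-time shift (TiO2-AP minus TiO2) over shared markers.

    Both arguments map a marker peptide name to its retention time.
    Using the median is an assumption of this sketch.
    """
    shared = marker_rt_tio2.keys() & marker_rt_ap.keys()
    if not shared:
        raise ValueError("no shared marker peptides")
    return median(marker_rt_ap[k] - marker_rt_tio2[k] for k in shared)

def dephosphorylation_confirmed(phospho_before, phospho_after,
                                dephospho_before, dephospho_after,
                                min_fold=5.0):
    """True if the phosphopeptide XIC area drops and the dephosphorylated
    peptide XIC area rises by at least min_fold after AP treatment.

    min_fold is an arbitrary illustrative threshold, not a value from the study.
    """
    eps = 1e-9  # avoids division by zero when a peak is absent
    drop = phospho_before / (phospho_after + eps)
    rise = dephospho_after / (dephospho_before + eps)
    return drop >= min_fold and rise >= min_fold
```

A peptide that already shows a small non-phosphorylated peak before AP treatment can still pass this check, provided the peak increases sufficiently afterwards.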
The MS/MS spectra of the phosphopeptides and the corresponding dephosphorylated peptides were compared (Fig. 7). The comparison of the fragmentation patterns provided 134 similar pairs (one was not similar (Fig. 7C)). Furthermore one of the 134 pairs, which were identified as singly and also doubly phosphorylated peptides, showed fragmentation pattern similarity between the singly phosphorylated peptide and the corresponding dephosphorylated peptide but not between the doubly phosphorylated peptide and the dephosphorylated peptide (data not shown). These phosphopeptides without the similarities were suggested as false positive identifications even with high Mascot scores for both the phosphopeptides (scores of 38 and 46) and the dephosphorylated peptides (scores of 81 and 73), respectively. The 134 phosphopeptides should contain 138 phosphorylation sites based on the Mascot database search and XIC comparisons. Subsequently fragment ions between the phosphopeptides and the dephosphorylated peptides were compared to validate their annotation. The resulting annotated fragment ions indicated 116 precise phosphorylation sites (Supplemental Table 2 and Supplemental Figs. 6-9). Note that six of these were not the first candidate provided by the Mascot database search (Fig. 7B). An additional 22 phosphorylation sites were still ambiguous in this analysis. Although manual validation of the MS/MS spectra could lower the number of ambiguous peptides, they have been left as "ambiguous." The 116 site-specific identifications consisted of 108 serine, six threonine, and two tyrosine phosphorylations. As stated above, all of the validated phosphopeptides and the corresponding dephosphorylated peptides showed the fragmentation pattern similarities even with a 10-fold difference of absolute intensities and for doubly phosphorylated peptides. Examples can be found in Supplemental Fig. 4, A-E.
FIG. 6. Reference-facilitated phosphoproteomics applied on differentiating C2C12 cells. The Mascot database search of the TiO2 samples of the differentiating C2C12 cells resulted in 297 phosphopeptides and also 199 non-phosphorylated peptides with high probability at expectation value <0.05, which roughly corresponded to a Mascot score >37 (NCBI nr database, M. musculus). Also the Mascot database search of the TiO2-AP samples resulted in 512 peptides with high probability; this did not include any phosphopeptide. The 512 peptides included 135 dephosphorylated peptides that corresponded to the 135 phosphopeptides identified in the TiO2 samples from the same SDS-PAGE bands with the same oxidation states and charge states. The 135 phosphopeptides were validated by comparison with the corresponding dephosphorylated peptides. Consequently 134 phosphopeptides, which contained 116 precise phosphorylation sites (108 serine, six threonine, and two tyrosine residues) and 22 ambiguous sites, were confirmed. Six of the 116 sites were not suggested as the first candidates by the Mascot database search.
pT, phosphothreonine; pS, phosphoserine.
DISCUSSION
We developed a reliable TiO2 chromatography method that shows high specificity for "single phosphoproteins" (α- and β-caseins) and highly complex mixtures. To investigate efficacy of the TiO2 chromatography against a complex mixture we evaluated the integrated procedure, starting with SDS-PAGE followed by in-gel digestion, TiO2 chromatography, and LC-MS/MS, using an extremely crude extract of differentiating myoblast C2C12 cells. The result was that 297 phosphopeptides were found using the Mascot search program. Because one of our interests is in the biological role of intermediate filaments (IFs), such as vimentin (Fig. 7A) and nestin (Supplemental Fig. 4A), both relatively insoluble proteins, we used the most powerful protein extraction method, direct extraction with SDS-PAGE sample buffer from the cell culture dish, to obtain all IF proteins.
Phosphopeptide selectivity of the optimized TiO2 chromatography was still effective against the extremely crude cell extract. As expected the TiO2 chromatography provided a markedly selective and sensitive phosphopeptide enrichment prior to MS analysis.
We observed oxidation of methionine in our samples. Methionine residues of peptides are readily oxidized during sample preparation, and both oxidized and non-oxidized methionine are quite often detected in proteomics analysis. As stated above, almost all methionine was oxidized after the TiO2 treatment in this study. This methionine oxidation might be caused by the photocatalytic oxidation effect of TiO2 (35,36). In the optimized TiO2 chromatography, we evaporated the eluate immediately after elution; therefore, if a small amount of TiO2 leaked or eluted from the microcolumn, it might have been concentrated, and the oxidizing effect of TiO2 might consequently have been enhanced during the evaporation. Indeed, C18 treatment of the TiO2 eluate immediately after elution did not induce methionine oxidation (Supplemental Fig. 1). Fortunately, because nearly uniform oxidation reduces sample complexity, this effect should increase the number of peptide identifications; in addition, pairs of a phosphopeptide and the corresponding dephosphorylated peptide in the same oxidation state were mostly available for the MS/MS spectra comparison.
The strategy we have developed uses enrichment of phosphopeptides by TiO2 chromatography and subdivision into two portions. One portion is directly analyzed by LC-MS/MS on an ESI-Q-TOF mass spectrometer, and the other portion is analyzed after enzymatic dephosphorylation using a phosphatase. Phosphatase treatment has been used to confirm phosphopeptides (37,38); however, its main use has been limited to relatively simple MALDI-MS analysis (7,29,31,39,40), with the exception of LC-MS/MS analysis of dephosphorylated peptides to estimate possible phosphopeptides (5,41). In our strategy, the disappearance of a phosphopeptide and the appearance of the corresponding dephosphorylated peptide after the AP treatment render phosphorylation of the identified peptide plausible. However, site specificity of a phosphorylation event is not attained by this evidence alone. We have found that comparison of the MS/MS spectra of the phosphopeptides and the corresponding dephosphorylated peptides provides unique and useful information to validate the identifications of the peptide sequences and their phosphorylation sites. A mass spectrum has two axes, m/z and intensity; however, generally only the m/z values are used for peptide identification by database search. With the efficient linear acceleration of ions from the QSTAR collision cell, we found that the intensity values (relative intensities) can also be used to discriminate small differences between phosphopeptide fragmentation spectra. Consequently, the phosphopeptide validation method developed in this study, based on the combination of enzymatic dephosphorylation and LC-ESI-Q-TOF MS/MS, should substantially decrease false positive data in large scale phosphoproteomics studies. The procedure is summarized as follows.
1. Enrichment of phosphopeptides by TiO2 chromatography.
2. Dephosphorylation of half the sample.
3. Identification of phosphopeptides and corresponding dephosphorylated peptides by MS/MS (ESI-Q-TOF).
4. Comparison of extracted precursor ion chromatograms before and after dephosphorylation to confirm phosphorylation (Fig. 4).
5. Comparison of MS/MS spectra before and after dephosphorylation to validate phosphopeptide and determine site specificity of phosphorylation (Fig. 5).
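Step 5 of the procedure can be sketched computationally. The example below is a minimal illustration and not the software used in this study. It pairs each fragment of the dephosphorylated (reference) spectrum with a phosphopeptide fragment that is either unshifted or shifted by the mass of one phosphate group (HPO3, 79.96633 Da monoisotopic), and then scores the agreement of relative intensities. The function names, the m/z tolerance `tol`, and the choice of cosine similarity as the score are assumptions made for this sketch.

```python
import math

HPO3 = 79.96633  # monoisotopic mass added by one phosphate group (Da)

def match_fragments(reference, phospho, tol=0.1, charge=1):
    """Pair each reference peak with a phosphopeptide peak.

    Each spectrum is a list of (m/z, intensity) tuples. A reference peak may
    reappear unshifted (fragment without the site) or shifted by HPO3/charge
    (fragment carrying the site). If several peaks qualify, the most intense
    one is kept. Returns (ref_mz, ref_int, phos_int, shifted) tuples.
    The tolerance of 0.1 m/z units is an illustrative assumption.
    """
    matches = []
    for ref_mz, ref_int in reference:
        best = None
        for shifted, delta in ((False, 0.0), (True, HPO3 / charge)):
            for mz, inten in phospho:
                if abs(mz - (ref_mz + delta)) <= tol:
                    if best is None or inten > best[2]:
                        best = (ref_mz, ref_int, inten, shifted)
        if best is not None:
            matches.append(best)
    return matches

def relative_intensity_similarity(matches):
    """Cosine similarity of matched intensities (1.0 means identical pattern).

    The score is scale invariant, so spectra that differ only in absolute
    intensity still score 1.0. Unmatched peaks are ignored in this sketch.
    """
    dot = sum(r * p for _, r, p, _ in matches)
    norm_r = math.sqrt(sum(r * r for _, r, _, _ in matches))
    norm_p = math.sqrt(sum(p * p for _, _, p, _ in matches))
    if norm_r == 0 or norm_p == 0:
        return 0.0
    return dot / (norm_r * norm_p)
```

Within one ion series, the phosphorylation site is bracketed by the last unshifted fragment and the first shifted fragment; if no fragment is detected between two candidate residues, the `shifted` flags cannot distinguish them and the site remains ambiguous.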
One of the most prominent phosphoproteins identified in this study was the IF protein nestin (Supplemental Fig. 4A and Supplemental Table 2). IFs are cytoskeletal polymers that maintain the structural and mechanical integrity of cells and tissues (42). IFs are composed of a wide family of over 65 different proteins that vary considerably with respect to their sequences, expression patterns, and abundance in different tissues (42). Nestin is specifically expressed during the early stages of development in the central nervous system (43) and in myogenic tissue (44,45). Upon differentiation nestin is down-regulated and replaced by other tissue-specific IF proteins such as desmin, which is the characteristic IF protein of fully differentiated muscle (46). Although IFs have a crucial role in maintaining the cellular architecture, they are by no means static structures but rather highly dynamic cellular constructions that undergo rearrangements during cell division, migration, stress, and differentiation. Structural alterations are a result of phosphorylation, which is the key regulator of IF polymerization, subcellular distribution, and dynamics (47). Our group has previously described that nestin is phosphorylated by cyclin-dependent kinase 5 (Cdk5) upon myogenic differentiation (48). Cdk5 is a multifunctional kinase that, despite its name, does not regulate the cell cycle but is a crucial modulator of development and differentiation of neuronal (49,50) and muscle tissue (51). The fusion of undifferentiated myoblasts to long, multinucleated myotubes is associated with massive reorganization of nestin filaments. We have provided indications that Cdk5-mediated phosphorylation of nestin might be one of the underlying causes of this rearrangement process (48). Cdk5 has been demonstrated to specifically target serines and threonines in a proline-directed manner (49).
However, so far we have not been able to determine whether each SP sequence identified in this analysis is a target site for Cdk5 or contributes to nestin reorganization. In fact, recent work has emphasized the role of IFs as a scaffold for various signaling molecules and, thereby, as organizers of signal transduction pathways (42,52). Phosphorylation has been shown to be involved in regulating the interactions between IFs and various IF-associated molecules (47), indicating that myogenesis-associated nestin phosphorylation might also serve to establish specific protein-protein interactions.
In this study, we used only MS/MS spectra produced by an ESI-Q-TOF tandem-in-space mass spectrometer with CID for the comparison between phosphopeptides and corresponding dephosphorylated peptides. Other tandem-in-space mass spectrometers with a similar CID efficiency, such as triple quadrupole and hybrid Q-linear ion trap mass spectrometers, could also be used for this purpose. In the case of ion trap tandem-in-time mass spectrometers with CID, MS/MS spectra of a tyrosine phosphopeptide and the dephosphorylated peptide could show similar fragmentation patterns; however, labile serine/threonine phosphorylation generally produces a dominant neutral loss of H3PO4 from the precursor ion and fewer sequence-specific fragment ions, and therefore this MS/MS spectrum should differ from the MS/MS spectrum of the dephosphorylated peptide. Similarity might be found by comparing the MS/MS/MS spectrum of a serine/threonine phosphopeptide after the neutral loss of H3PO4 with the MS/MS spectrum of the dephosphorylated peptide; however, the MS/MS/MS spectrum should show much lower ion intensity. MS/MS spectra produced by ion trap mass spectrometers with electron transfer dissociation or by FT-ICR-MS with electron capture dissociation might also show the similarity. We will investigate the applicability of reference-facilitated phosphoproteomics to these different types of mass spectrometers.
Accurate mass analysis (<5 ppm for precursor ions) with the latest mass spectrometers will improve the reliability of phosphopeptide identifications but will not exclude false identifications. Other developments for phosphopeptide identification, including phosphorylation site determination by way of statistical analysis (53), will also provide useful information, but as stand-alone technologies they fall short of providing dynamic information. The concept of reference-facilitated phosphoproteomics is the comparison of XICs and MS/MS spectra between a phosphopeptide and the corresponding dephosphorylated peptide. Disadvantages of our method are the loss of half of the phosphopeptide sample to produce the dephosphorylated reference peptides, the doubled analysis time, and the incomplete coverage of phosphopeptide identifications by dephosphorylated peptide identifications. However, the use of the corresponding dephosphorylated peptide as the reference has distinct advantages over existing approaches: the evidence of phosphorylation based on the enzymatic dephosphorylation, the confirmation of phosphopeptide sequence identification based on the fragmentation pattern similarity with the more reliably identified dephosphorylated peptide, and the validation of the fragment ion annotation, especially of uncertain fragment ions, based on the mass shift and similar relative intensity. Perhaps the most important advantage is that phosphopeptide identification is not fully dependent on a single MS/MS spectrum and its associated probability score. With the method proposed here, two peptides represent the same sequence. Other possible validation methods include 32P labeling (54,55) with amino acid mutations or comparison with synthetic phosphopeptides (or the usual manual validation of MS/MS spectra) (Fig. 1), but validation of a large data set by these methods is still time-consuming.
Our concept and strategy are simple and will soon be applicable to automated analysis, enabling easier validation and reducing both the overall analysis time and the amount of ambiguous phosphorylation site information. We are currently writing a software tool that will be platform-independent and made available as open source.