Large Scale Identification and Quantitative Profiling of Phosphoproteins Expressed during Seed Filling in Oilseed Rape

Seed filling is a dynamic, temporally regulated phase of seed development that determines the composition of storage reserves in mature seeds. Although the metabolic pathways responsible for storage reserve synthesis such as carbohydrates, oils, and proteins are known, little is known about their regulation. Protein phosphorylation is a ubiquitous form of regulation that influences many aspects of dynamic cellular behavior in plant biology. Here a systematic study has been conducted on five sequential stages (2, 3, 4, 5, and 6 weeks after flowering) of seed development in oilseed rape (Brassica napus L. Reston) to survey the presence and dynamics of phosphoproteins. High resolution two-dimensional gel electrophoresis in combination with a phosphoprotein-specific Pro-Q Diamond phosphoprotein fluorescence stain revealed ∼300 phosphoprotein spots. Of these, quantitative expression profiles for 234 high quality spots were established, and hierarchical cluster analyses revealed the occurrence of six principal expression trends during seed filling. The identity of 103 spots was determined using LC-MS/MS. The identified spots represented 70 non-redundant phosphoproteins belonging to 10 major functional categories including energy, metabolism, protein destination, and signal transduction. Furthermore phosphorylation within 16 non-redundant phosphoproteins was verified by mapping the phosphorylation sites by LC-MS/MS. Although one of these sites was postulated previously, the remaining sites have not yet been reported in plants. Phosphoprotein data were assembled into a web database. Together this study provides evidence for the presence of a large number of functionally diverse phosphoproteins, including global regulatory factors like 14-3-3 proteins, within developing B. napus seed.

The storage reserves found in most plant seeds consist of carbohydrates, oils, and proteins (1-5). They contribute up to 90% or more of the dry seed weight and are necessary for seed viability, early seed germination, and seedling growth (6). In nature, the relative proportions of the stored components in seeds vary drastically among different plant species (1,7). Variation also exists within plant races. For example, a recent survey of oil content in the seed of ∼360 known Arabidopsis ecotypes revealed a range from 34 to 46% of seed dry weight (8). Extensive studies on seed development have firmly established that the components of storage reserve begin to accumulate and their relative levels are determined in the mature seed during a particular phase of seed development (for reviews, see Refs. 5, 9, and 10), referred to as seed filling (1,4,11,12). The seed filling phase involves cell division, cell expansion, and the early maturation stage (1,2).
Numerous studies including two global approaches, transcriptomics and proteomics, conducted to date on seed filling, in particular in oilseed plants such as oilseed rape (Brassica napus L.), soybean, and Arabidopsis, continue to produce new paradigms and refine our understanding of the existing biosynthetic pathways responsible for accumulation of seed storage reserves (5, 9, 10, 12-20). A microarray of developing Arabidopsis seed provided insight into primary transcriptional networks that coordinate the metabolic responses to seed developmental programs and lead to the distribution of carbon among carbohydrate, oil, and protein reserves (12). Proteomics studies have also been performed in Medicago truncatula (18), pea (19), and soybean (20) using high resolution two-dimensional gel electrophoresis (2-DGE) in combination with either MALDI-TOF-MS or LC-MS/MS. Of these studies, the soybean investigation was the most systematic with respect to quantitative expression profiling and protein identification. In that study, 679 and 422 protein spots were profiled and identified, respectively, representing 216 non-redundant proteins and 14 functional classes (20). Despite these investigations, including the large data sets of genes and proteins, the specific underlying regulatory mechanisms that control the levels of storage reserves in the seed remain largely unknown.
Understanding the underlying regulatory mechanism(s) of the networks by identifying and characterizing their regulatory components will undoubtedly broaden our knowledge on how the levels of stored components in the seed are fine tuned. Reversible phosphorylation is a major post-translational mechanism by which cells transduce cellular, developmental, and environmental signals and thereby control a myriad of biological processes in diverse organisms including plants (for reviews, see Refs. 21-23). Previous studies have shown that several intermediary and primary metabolic enzymes are regulated by reversible phosphorylation in plants, including sucrose synthase (24), sucrose-phosphate synthase (Ref. 25; for a review, see Ref. 26), trehalose-6-phosphate synthase (27,28), pyruvate kinase (29), acetyl-CoA carboxylase (30), phosphoenolpyruvate carboxylase (31), nitrate reductase (32), and the mitochondrial pyruvate dehydrogenase complex (33).
Hence understanding the dynamics of this post-translational modification in response to cellular as well as environmental cues can lead to the identification of candidate regulatory proteins (protein kinases) and their substrates and thereby can help in dissecting the signaling and metabolic networks. This emerging new area of systems biology is termed phosphoproteomics (34-36). Many advances in phosphoproteomics technologies, including enrichment, detection, phosphorylation site mapping, and quantification of phosphoproteins, have made the large scale study of phosphoproteins a feasible task (23, 34-39). One of the major developments in this area is the detection of phosphoproteins using a unique fluorescence dye, Pro-Q Diamond phosphoprotein stain (Pro-Q DPS; Ref. 40). Two major advantages of this stain are: 1) it can be used for global quantitative analysis of phosphoproteins as it binds directly to the phosphate moiety of phosphoproteins with high sensitivity and linearity regardless of phosphoamino acid and 2) the stain is fully compatible with other staining methods and modern MS. Despite these recent advancements, phosphoproteins in plants have rarely been studied on a large scale basis (for a review, see Ref. 41). One example of a large scale phosphoproteomics study is the identification of more than 300 phosphorylation sites from Arabidopsis plasma membrane proteins (42). To our knowledge, no large scale study of phosphoproteins has been carried out on developing plant seeds.
We have embarked on a large scale phosphoproteomics study during seed filling in B. napus with the following objectives: (i) to obtain quantitative expression profiles of phosphoproteins through seed development, (ii) to generate a 2-DGE phosphoprotein reference map, (iii) to determine the phosphorylation sites of phosphoproteins, and (iv) to begin building resources for dissecting biological processes that might be regulated by reversible phosphorylation, including metabolism. A high throughput 2-DGE approach in combination with Pro-Q DPS and LC-MS/MS was applied to five sequential stages, 2, 3, 4, 5, and 6 weeks after flowering (WAF), covering the majority of seed filling. This study reports major achievements toward fulfilling these objectives by the establishment of a high resolution 2-DGE phosphoprotein reference map comprising 234 quantitative expression profiles, 103 identified phosphoproteins, and a map of phosphorylation sites in 16 non-redundant phosphoproteins. This study also extends the oilseed proteomics web-based database with a "Brassica phosphoproteomics" resource for the plant community.
Plant Materials and Growth Conditions-B. napus (var. Reston) seeds were grown in soil (Promix, Quakertown, PA) in a growth chamber (light/dark cycles of 16 h (23°C)/8 h (20°C), 48% humidity, and light intensity of 8000 lux). Plants were fertilized at 2-week intervals (all purpose fertilizer, 15:30:15, nitrogen-phosphorus-potassium). Flowers were tagged immediately after opening of buds (between 1 and 3 p.m.). Harvesting of developing seeds was also performed between 1 and 3 p.m. during the daytime at precisely 2, 3, 4, 5, and 6 WAF. At each developmental stage, the dry weight and total protein content of whole seeds were also measured. Total protein was quantified in triplicate using a dye-binding protein assay kit (Bio-Rad) and chicken γ-globulin as standard.
Two-dimensional Gel Electrophoresis-Total seed protein was prepared from each developmental stage according to a modified phenol-based procedure as described previously (20). The protein pellet, obtained from the above extraction procedure, was suspended in IEF extraction buffer (8 M urea, 2 M thiourea, 2% (w/v) CHAPS, 2% (v/v) Triton X-100, 50 mM DTT) and vortexed at low speed for 30 min at room temperature followed by centrifugation at 14,000 rpm for 15 min to remove insoluble material. Supernatant was used for measuring protein concentration and 2-DGE analysis. 2-DGE was carried out using IPG strips (pH 3-10 or 4-7, 24 cm; GE Healthcare) as described previously (20). One milligram of total protein was mixed with 2.25 µl of IPG buffer (pH 3-10 or 4-7; GE Healthcare) and IEF extraction buffer to bring the volume up to 450 µl, vortexed for 30 s, and subjected to IEF followed by SDS-PAGE. A total of six to eight high resolution 2-D gels were run for each developmental stage using protein isolated from four independent biological samples to finally select four gels for further analysis. Additionally "reference gels" were also run for the purpose of spot matching and their downstream analysis (20) and for the development of a reference map of phosphoproteins. A reference gel is defined as 0.2 mg of total protein from each of the five developmental stages pooled and resolved by 2-DGE.
SDS-PAGE for Phosphorylation Site Mapping-SDS-PAGE was performed by standard methods utilizing 4% T, 2.6% C stacking gels, pH 6.8, and 12% T, 2.6% C separating gels, pH 8.8 (43). The % T is the total monomer concentration expressed in grams/100 ml, and % C is the percentage of cross-linker. The stacking and separating gel buffer concentrations were 0.125 M Tris-HCl, pH 6.8, and 0.375 M Tris-HCl, pH 8.8, respectively. The reservoir buffer concentration was 0.025 M Tris, 0.192 M glycine, pH 8.3. All gel and reservoir buffers contained SDS to a final concentration of 0.1% (w/v). SigmaMarker or protein samples were heated for 5 min at 75°C in SDS loading buffer (5% (v/v) glycerol, 60 mM SDS, 100 mM DTT, 0.03 mM bromphenol blue, and 60 mM Tris-HCl, pH 6.8) and were cooled to room temperature before loading to 10 and 15% gels. A total of 500 µg was loaded per lane, and electrophoresis was conducted overnight in a Hoefer vertical SE600 electrophoresis unit (GE Healthcare catalog number 80-6171-96) at room temperature until the dye reached the bottom of the gel. Gels were stained with colloidal Coomassie Brilliant Blue (CBB) to detect protein.
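The % T and % C notation can be turned into reagent masses with simple arithmetic. The sketch below assumes the common convention that % C is the cross-linker mass expressed as a percentage of total monomer; the function name and the 100-ml example volume are illustrative and are not taken from the protocol.

```python
def gel_monomer_grams(percent_t, percent_c, volume_ml):
    """Return (acrylamide_g, crosslinker_g) for a gel of the given volume.

    Assumes %T = grams of total monomer per 100 ml and
    %C = cross-linker as a percentage of total monomer.
    """
    total_monomer = percent_t * volume_ml / 100.0
    crosslinker = total_monomer * percent_c / 100.0
    return total_monomer - crosslinker, crosslinker


# Separating gel from the text (12% T, 2.6% C), hypothetical 100-ml batch.
acrylamide_g, crosslinker_g = gel_monomer_grams(12, 2.6, 100)
print(f"acrylamide: {acrylamide_g:.3f} g, cross-linker: {crosslinker_g:.3f} g")
```

Under this convention, the 4% T stacking gel and the 12% T separating gel share the same cross-linker fraction (2.6% C) and differ only in total monomer.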
Detection of Phosphoproteins and Proteins-For phosphoprotein detection, 2-D gels were stained with a modified protocol using Pro-Q DPS (44). Briefly all gels were treated with fixation solution (2 × 30 min), washed with deionized water (2 × 15 min), stained with 3-fold diluted Pro-Q DPS in deionized water (120 min), destained with destaining solution (4 × 30 min) to remove gel-bound nonspecific Pro-Q DPS, and washed again with deionized water (2 × 5 min). Following scanning of Pro-Q DPS-stained gels, the same gels were then overstained with colloidal CBB G-250 to detect proteins (45). All gels in the incubation solution were constantly shaken on an orbital shaker (GeneMate, ISC Bioexpress) at room temperature at 35 rpm.
Image Analysis of 2-D Gels-Following the Pro-Q DPS procedure, gels were imaged using an FLA 5000 laser scanner (Fuji Medical Systems, Stamford, CT) with 532-nm excitation and a 550-nm bandpass emission filter. Data were collected as 100-µm resolution, 16-bit TIFF files using the Image Gauge Analysis software (Fuji Medical Systems). With this software, fluorescent protein signals in 2-D gels were displayed as dark spots. To quantify phosphoprotein spots in profile mode, 2-D gels were analyzed using ImageMaster 2D Platinum software version 5 (hereafter called ImageMaster software; GE Healthcare) as described by Hajduch et al. (20). Under the stringent criteria applied to select phosphoprotein spots for quantification and expression profiling, only those spots were analyzed that were present in all four gels derived from independent biological samples of one developmental stage and expressed in at least two of five developmental stages. The relative volumes of high quality phosphoprotein spots were determined followed by establishment of their expression profiles. To assess the variation of each expressed phosphoprotein spot volume across all four selected replicates of each developmental stage, the coefficient of variation (CV) was calculated using a formula in which x̄ is the average of relative volumes (x) of spots in the biological quadruplicate analysis and n is the sample size (four in the case of biological quadruplicates).
It is important to mention that we used two types of protein markers, PeppermintStick standards and SigmaMarker, in SDS-PAGE. The PeppermintStick standards carry two phosphorylated (ovalbumin and bovine β-casein of 45.0 and 23.6 kDa, respectively) and four non-phosphorylated (β-galactosidase, bovine serum albumin, avidin, and lysozyme of 116.25, 66.2, 18.0, and 14.4 kDa, respectively) proteins. SigmaMarker contains one phosphorylated protein (ovalbumin, 45.0 kDa) of 13 protein markers. Detected phosphoprotein spots on 2-D gels were normalized against positive and negative phosphoprotein markers to eliminate false positive spots. Colloidal CBB-stained gels were imaged using a Scan Maker 9800XL densitometer (Microtek, Carson, CA). Digitized 2-D gel images (300 dpi, 16-bit grayscale pixel depth) were analyzed using ImageMaster software as described previously (20).
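As an illustration of the CV calculation, the sketch below uses the conventional definition: the sample standard deviation (n - 1 denominator) divided by the mean and expressed as a percentage. Whether the study used n or n - 1, or a percentage scale, is an assumption here, and the replicate values are invented for the example.

```python
import math


def coefficient_of_variation(volumes):
    """Return the CV (%) of replicate relative spot volumes.

    Assumption: sample standard deviation (n - 1) over the mean, times 100.
    """
    n = len(volumes)
    if n < 2:
        raise ValueError("need at least two replicates")
    mean = sum(volumes) / n
    if mean == 0:
        raise ValueError("mean relative volume is zero; CV is undefined")
    variance = sum((x - mean) ** 2 for x in volumes) / (n - 1)
    return 100.0 * math.sqrt(variance) / mean


# Hypothetical relative volumes of one spot in four replicate gels.
replicates = [0.42, 0.38, 0.45, 0.40]
print(f"CV = {coefficient_of_variation(replicates):.1f}%")
```

Because the CV is scaled by the mean, it can be compared between spots of very different abundance, which the standard deviation alone does not allow.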

Hierarchical Cluster Analysis of Expression Profiles-The expression profiles of phosphoproteins established by ImageMaster software were subjected to hierarchical cluster analysis using SAS statistical software (SAS Institute Inc., Cary, NC). The program accomplishes the cluster analysis of expression profiles in two steps. The first step is to establish the number of classes best suited to the present data set. The "CLUSTER" keyword is used with options "STANDARD METHOD = AVERAGE CCC PSEUDO" as the command for step 1 in which "STANDARD" means to normalize the variables, "AVERAGE" selects average linkage clustering rather than one of the other 10 methods included in SAS IDE, and "CCC" and "PSEUDO" are both options for calculating statistics that are used to determine the class number. The second step is to cluster expression profiles into each of the established classes. Expression profile data are also normalized as follows: any zero between two non-zero points is replaced with the average of the values of its two neighbors, and a linear transformation is used to normalize expression profiles of different spots to a uniform scale. The SAS program used procedure "FASTCLUS" for the actual clustering with the maximum number of clusters established earlier. SAS also generates the variable distance parameter for each spot.
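The profile normalization described above can be sketched in a few lines. The zero-filling rule follows the text directly; the linear transformation is assumed here to be min-max scaling onto [0, 1], which is only one possible choice, and the function names and example profile are illustrative.

```python
def fill_interior_zeros(profile):
    """Replace a zero flanked by two non-zero values with their average."""
    filled = list(profile)
    for i in range(1, len(profile) - 1):
        left, right = profile[i - 1], profile[i + 1]
        if profile[i] == 0 and left != 0 and right != 0:
            filled[i] = (left + right) / 2
    return filled


def scale_to_unit_range(profile):
    """Linearly map a profile onto [0, 1] (assumed form of the transformation)."""
    low, high = min(profile), max(profile)
    if high == low:
        return [0.0 for _ in profile]
    return [(value - low) / (high - low) for value in profile]


# Hypothetical relative volumes at 2, 3, 4, 5, and 6 WAF.
profile = [0.10, 0.0, 0.30, 0.20, 0.0]
print(scale_to_unit_range(fill_interior_zeros(profile)))
```

Zeros at either end of a profile, or zeros next to another zero, are left untouched, so a spot that is genuinely absent at early or late stages keeps that shape.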
In-gel Trypsin Digestion-Phosphoprotein spots were excised from a 2-D reference gel, transferred to polypropylene 96-well filter bottom plates, and digested with sequencing grade modified trypsin (Promega, Madison, WI) according to a previous procedure (20). In the case of SDS-PAGE, the colloidal CBB-stained gel was cut into 54 bands per protein sample, diced into 1-mm cubes, and transferred into sterile 1.5-ml microcentrifuge tubes. The gel pieces were washed with water, destained, and in-gel digested with sequencing grade modified trypsin as described previously (20).

Identification of Phosphoprotein 2-DGE Spots by LC-MS/MS-
was added to each sample well to reconstitute dried peptides. Ten microliters were used for mass spectral analysis using a configuration termed high throughput protein identification on an LTQ ProteomeX linear ion trap LC-MS/MS instrument. Briefly on-line capillary LC included two polymeric PSDVB-based peptide traps (2-µg capacity each; MicroBioresource) and a fast equilibrating C18 capillary column (Micro-Tech Scientific, Cousteau Ct. Vista, CA; packed with C18, 5 µm, 300 Å, 150-µm inner diameter × 10 cm). The method alternated between loading/equilibration and elution using the two peptide traps to reduce the time required for sample analysis on the LC-MS/MS instrument. Sample was loaded onto peptide traps for concentration and desalting prior to final separation by C18 column using an acetonitrile gradient (0-80% solvent B in solvent A for a duration of 20 min; solvent A = 0.1% (v/v) FA in water; solvent B = 100% acetonitrile containing 0.1% FA). The peptide trap and C18 column were then reset for 2 min and re-equilibrated for 10 min with 100% of solvent A before the sample already loaded onto the second trap was eluted. The m/z ratios of eluted peptides and fragmented ions from the fused silica PicoTip emitter (12 cm, 360-µm outer diameter, 75-µm inner diameter, 30-µm tip; New Objective, Ringoes, NJ) were analyzed in the data-dependent positive acquisition mode on the LC-MS/MS instrument. Following each full scan (400-1600 m/z), a data-dependent triggered MS/MS scan for the most intense parent ion was acquired. The heated fused silica PicoTip emitter was held at an ion spray voltage of 1.7 kV and a flow rate of 250 nl/min.
Phosphorylation Site Mapping-To map phosphorylation site(s), 10 µl of the reconstituted tryptic peptides with 0.1% (v/v) FA in water was subjected to LC-MS/MS using the ProteomeX-based configuration, termed phosphorylation site mapping method. This configuration utilizes two Zorbax 300SB C18 traps (5 µm, 5 × 0.3 mm; Agilent) and one C18 PicoFrit capillary column (10 cm, 360-µm outer diameter, 75-µm inner diameter, 15-µm tip; packed with 5-µm Biobasic C18; Thermo-Finnigan). Peptides were loaded onto one of the peptide traps via autosampler, concentrated, desalted, and eluted into the C18 PicoFrit capillary column with an acetonitrile gradient (0-80% solvent B in solvent A for a duration of 72 min; solvent A = 0.1% (v/v) FA in water; solvent B = 100% acetonitrile containing 0.1% FA). The peptide trap and C18 PicoFrit capillary column were then reset for 2 min and re-equilibrated for 15 min with 100% of solvent A. Following re-equilibration, the sample already loaded onto the second trap was eluted and analyzed. The m/z ratios of eluted peptides and fragmented ions from the C18 PicoFrit capillary column were analyzed in the data-dependent neutral loss MS/MS/MS mode. In this mode, each full scan (400-1600 m/z) was followed by three data-dependent MS/MS scans (isolation width, 2 amu; normalized collision energy, 35%; minimum signal threshold, 500 counts; dynamic exclusion (repeat count, 2; repeat duration, 30 s; exclusion list size, 50; exclusion duration, 180 s)) on the top three intense ions from that scan. An MS/MS/MS scan was automatically performed when the most intense peak from the MS/MS spectrum corresponded to a neutral loss event of 98 (+1), 49 (+2), and 32.7 (+3) m/z ± 1.0 Da. The heated C18 PicoFrit tip was held at an ion spray voltage of 1.7 kV and a flow rate of 250 nl/min. The total run time was 98 min per sample.
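The neutral loss values quoted above follow from dividing the nominal 98-Da loss of phosphoric acid by the charge state of the precursor. The following sketch illustrates that trigger logic outside the instrument; it is a simplification that tests all three charge states rather than using the precursor charge assigned by the acquisition software, and the function names and example m/z values are hypothetical.

```python
PHOSPHATE_LOSS_DA = 98.0  # nominal neutral loss of phosphoric acid, as quoted in the text
TOLERANCE = 1.0           # matching window quoted in the text


def expected_neutral_loss_mz(charge):
    """m/z spacing between precursor and neutral-loss peak for a given charge."""
    return PHOSPHATE_LOSS_DA / charge


def neutral_loss_charge(precursor_mz, base_peak_mz, charges=(1, 2, 3)):
    """Return the charge whose neutral-loss spacing matches, or None.

    base_peak_mz is the most intense peak of the MS/MS spectrum.
    """
    observed = precursor_mz - base_peak_mz
    for charge in charges:
        if abs(observed - expected_neutral_loss_mz(charge)) <= TOLERANCE:
            return charge
    return None


for z in (1, 2, 3):
    print(z, round(expected_neutral_loss_mz(z), 1))

# Hypothetical doubly charged precursor whose base peak sits 49 m/z lower.
print(neutral_loss_charge(800.0, 751.0))
```

A match would correspond to the condition under which the MS/MS/MS scan is acquired; a None result means no additional scan would be triggered under this simplified rule.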
Database Customization and Indexing-The National Center for Biotechnology Information (NCBI; ftp.ncbi.nih.gov/blast/) non-redundant database (as of March 2005) was used for querying all data. The FASTA database utilities and indexer of the BioWorks 3.1SR1 software allowed us to create a plant database (keywords Arabidopsis, Oryza sativa, Zea mays, Medicago, Brassica, and Glycine) extracted from the NCBI non-redundant database and to index it against trypsin enzyme and static and differential modifications, respectively. Two types of indexed databases were created using the plant database. The indexed database I carried cysteine (carboxyamidomethylation; +57 Da) and methionine (oxidation; +16 Da) as static and differential modifications, respectively. The indexed database II carried a static modification of +57 Da on cysteine and differential modifications of +16 Da on methionine and +80 Da on serine, threonine, and tyrosine residues.
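To illustrate the difference between static and differential modifications, the sketch below enumerates the nominal mass shifts a peptide could carry under the indexed database II settings. It is not the BioWorks indexer: the cap on differential modifications per peptide (max_differential) is a hypothetical parameter, the peptide is invented, and nominal integer shifts from the text are used instead of exact masses.

```python
from itertools import combinations

STATIC_SHIFTS = {"C": 57}                                   # applied to every cysteine
DIFFERENTIAL_SHIFTS = {"M": 16, "S": 80, "T": 80, "Y": 80}  # each site may or may not be modified


def candidate_mass_shifts(peptide, max_differential=3):
    """Return the sorted set of total nominal mass shifts (Da) for a peptide."""
    static_total = sum(STATIC_SHIFTS.get(residue, 0) for residue in peptide)
    sites = [DIFFERENTIAL_SHIFTS[r] for r in peptide if r in DIFFERENTIAL_SHIFTS]
    shifts = set()
    for count in range(min(max_differential, len(sites)) + 1):
        for chosen in combinations(sites, count):
            shifts.add(static_total + sum(chosen))
    return sorted(shifts)


# Hypothetical tryptic peptide with one Met, one Cys, and one Ser.
print(candidate_mass_shifts("MCSK"))
```

The static shift is always present, whereas each differential site doubles the number of modification states to be searched, which is why the phosphorylation search space grows quickly for serine- and threonine-rich peptides.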
Database Search-For identification of phosphoprotein spots, LC-MS/MS (high throughput configuration) data were searched against the indexed database I using the SEQUEST algorithm (46,47) as part of the BioWorks 3.1SR1 software suite. The search parameters for this database were set as follows: enzyme, trypsin; number of internal cleavage sites, 2; mass range, 400-4000 Da; threshold, 500; minimum ion count, 35; and peptide mass tolerance, 1.5 Da. Matching peptides were filtered for correlation scores (Xcorr at least 1.5, 2.0, and 2.5 for +1, +2, and +3 charged ions, respectively). The Xcorr value represents the overlap correlation between experimental and theoretical MS/MS spectra produced by candidate peptides in the database. For all protein assignments, a minimum of two unique peptides was required. Assignments annotated as "unknown" were BLASTP searched against the NCBI non-redundant database (as of March 2005) to further query their homology.
The indexed database II was used for analysis of raw data (MS/MS spectra) collected using the configuration "phosphorylation site mapping method." The search parameters for this database were the same as those for the indexed database I except that the peptide mass tolerance was 2 Da. Xcorr values of at least 1.9, 2.5, and 3.3 for +1, +2, and +3 charged ions, respectively, were applied for database matches. MS/MS/MS spectra were also searched against the database with a variable modification of −18 to serine and threonine residues (dehydroalanine). Resulting spectra were inspected manually to verify phosphorylation events.
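The acceptance rules for protein identification (charge-dependent Xcorr thresholds and a minimum of two unique peptides) can be expressed as a small filter. This is a sketch rather than the BioWorks implementation; the tuple layout of a match and the example accessions and sequences are hypothetical.

```python
from collections import defaultdict

MIN_XCORR = {1: 1.5, 2: 2.0, 3: 2.5}  # thresholds for +1, +2, +3 ions (indexed database I)
MIN_UNIQUE_PEPTIDES = 2


def accepted_proteins(matches):
    """Map each accepted protein to its sorted unique peptides.

    matches: iterable of (protein, peptide, charge, xcorr) tuples.
    """
    peptides = defaultdict(set)
    for protein, peptide, charge, xcorr in matches:
        threshold = MIN_XCORR.get(charge)
        if threshold is not None and xcorr >= threshold:
            peptides[protein].add(peptide)
    return {
        protein: sorted(found)
        for protein, found in peptides.items()
        if len(found) >= MIN_UNIQUE_PEPTIDES
    }


# Hypothetical search results.
matches = [
    ("protein_A", "LVEELK", 2, 2.4),
    ("protein_A", "AGFAGDDAPR", 2, 3.1),
    ("protein_A", "LVEELK", 3, 2.9),      # same peptide, counted once
    ("protein_B", "TTPSYVAFTDTER", 2, 1.8),  # below the +2 threshold
    ("protein_B", "DAGVIAGLNVLR", 3, 2.7),
]
print(accepted_proteins(matches))
```

Replacing MIN_XCORR with {1: 1.9, 2: 2.5, 3: 3.3} would mirror the stricter thresholds applied to the phosphorylation site mapping data.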
Database Construction-The annotated high resolution 2-DGE reference gel image and associated experimental, predicted, and other data presented in this study can be freely accessed via the oilseed proteomics server (oilseedproteomics.missouri.edu) under the links "Phosphoproteomics of B. napus seed filling." Data are viewable through 2-D gel reference map viewer and a protein identification table (Table I). The spots on the reference gel are hyperlinked to display expression profile and protein identification data.

Physiological Characterization of Seed Development to Determine Seed Filling Phase-A detailed study of B. napus has shown previously that the process of embryo development occurs over a period of 11-12 weeks beginning shortly after pollination through seed desiccation (1). The aqueous soluble components and starch constitute up to 80% of the dry matter within the first 4 WAF; this is followed by a marked rise in the lipid and protein content coinciding with an equivalent decrease in the aqueous and starch fraction between 4 and 6 WAF (1,5,11). Based on these studies, flowers were tagged immediately after opening of buds, and siliques were collected precisely at stages 2, 3, 4, 5, and 6 WAF to conduct the initial physiological study to ensure complete coverage of seed filling.
Seeds collected at different stages and embryos dissected from these seeds are shown in Fig. 1, top panel. A visual observation of seeds and embryos indicated that seeds at 2 WAF are largely occupied with liquid endosperm because the embryos occupied only a small portion of the total seed mass. As seed development progressed, the color of seeds turned green, and the embryo mass increased. The light pale green color of seeds changed to deep green and then part of the seed coat changed to dark red at 4 and 6 WAF, respectively. A dramatic expansion in embryo size occurred between 2 and 4 WAF and continued at a much slower rate until 6 WAF, indicating consumption of aqueous soluble constituents and sugars by the embryo. This is consistent with a previous finding where a shift from starch to oil accumulation during the early stages of cotyledon filling (i.e. embryo development) was shown (48); that is, embryo size increases with reserve deposition. Furthermore a decrease in deep green color followed by an increase in dark red color at 6 WAF indicates the initiation of the maturation phase around 5 WAF. Another interesting observation was that embryos acquired a degree of photosynthetic capacity by 4 WAF, as apparent from dark green embryos, followed by a change to light green suggesting a decrease in photosynthesis (Fig. 1, Embryos).
Two physiological parameters of seed, fresh seed and embryo mass and total protein content in seed, were also measured at each seed developmental stage to support our above observations (Fig. 1, lower panel). The fresh seed weight reached a maximum at 5 WAF (6.1 mg/seed) followed by a sudden decline at 6 WAF (3.6 mg/seed). A similar profile for embryo was also found except for only a slight increase in embryo mass at 3 WAF. The protein content increased gradually until 4 WAF and then sharply from 4 to 6 WAF. Therefore, to include most processes occurring during seed filling, seed materials at these studied stages were ideal for a systematic phosphoproteomics study.

FIG. 1. Changes in fresh seed weight, embryo development, and total protein during seed filling in B. napus. Dissected seeds and embryos, collected at 2, 3, 4, 5, and 6 WAF, are shown in the top panel. Fresh seed and embryo mass measured at given stages are graphically presented as mg per seed or embryo, whereas total protein content in seed is presented as mg per seed. Standard deviation represents the average value of four independent experiments and a total of 10 seeds.
The pI Range 4-7 Resolves Approximately 90% of Phosphoproteins-B. napus seed proteins were initially analyzed using pH 3-10 and pH 4-7 24-cm IPG strips to evaluate their suitability for developing a high resolution reference map of phosphoproteins. Analysis of pH 3-10 2-D gels revealed that the vast majority of phosphoprotein spots (∼90%) were concentrated in the region between pH 4 and 7, and many spots in this region overlapped with each other (data not shown).
Overlapping spots cause problems in downstream analysis, such as precise volume quantification and identification by MS. In contrast, analysis of pH 4-7 gels showed distinct and defined spots; a slight spot overlap was noticed near the pH 4 and pH 7 ends and around the region of 84 kDa. Because quantitative expression profiling of phosphoproteins and establishment of a reference map were the main aims of this study, we decided to use pH 4-7 IPG strips exclusively.
As mentioned under "Experimental Procedures," high resolution (24-cm) 2-D gels of 2, 3, 4, 5, and 6 WAF were developed along with their reference gel. In the reference gel, 300 phosphoprotein spot groups were assigned by ImageMaster software. The dynamic range of spot volume of all Pro-Q DPS-detected spots ranged from 1 to 3.2 × 10^4. Perfunctory analysis of 2-D gels revealed dynamic changes of phosphoprotein spots throughout seed filling (Fig. 2); gel sections presented are representative of four high quality 2-D gels obtained for each stage. The phosphoprotein spot numbers were assigned by ImageMaster software after image analysis of the gels (described below). For example, spot numbers 669 and 679 in area I were expressed at all the stages analyzed with maximum accumulation at 3 WAF, whereas spot numbers 792 and 868 in area II were initially detected at 5 WAF followed by a decrease in their expression levels. In area III, spot numbers 831 and 833 were also detected at 5 WAF, and their abundance increased dramatically at 6 WAF.
After visualization of phosphoprotein spots on 2-D gels and their image acquisition, all gels were subsequently stained with colloidal CBB in an attempt to directly correlate phosphoprotein with total protein expression in the same gel. Approximately 1000 high quality protein spots were defined by ImageMaster software in the reference gel. The dynamic range of spot volume of all visualized protein spots ranged from 1 to 4.5 × 10^6. When the phosphoprotein and total protein 2-D images were merged together, only 3% of the spots were unequivocally matched (supplemental figure). This suggests that the high sensitivity of Pro-Q DPS (minimum, 1-16 ng; Refs. 40 and 44) compared with colloidal CBB (minimum, 50-100 ng; Ref. 49) precludes direct spot matching between these two stains. Comparison of these images also indicated that highly abundant protein spots were not stained with Pro-Q DPS, suggesting high specificity of Pro-Q DPS to phosphoproteins.
Quantitative Analysis of Phosphoprotein Spots Established 234 Developmental Expression Profiles-A strategy for analysis of acquired 2-D images is schematically presented in Fig. 3. Selection of spots and their matching were carried out using ImageMaster software as described under "Experimental Procedures." As shown at the left-hand side, four 2-D gels derived from biological quadruplicates were processed to select high quality spots. To generate high quality quantitative expression profiles, the following threshold criteria were applied: first, the spot should be present in all four biological replicate gels, and second, these spots should be detected in at least two developmental stages. Under these conditions, ImageMaster software established 234 developmental expression profiles (supplemental table); the matched spots and their volume were also manually validated. The variance of acquired quantification data on protein spots was calculated using two different statistical measures, standard deviation (STDEV) and CV, which are presented in the supplemental table. The standard deviation represented the biological and technical variations arising from four independent sample harvesting events followed by four independent protein extractions. The CV values allowed direct spot-to-spot comparison of significance levels in acquired quantitative data; therefore, the statistical significance of differentially expressed phosphoprotein is inversely proportional to the CV value. The dynamic range in spot volume for these spots varied from 4.5 × 10^2 to 3.2 × 10^4. An example of a quantitative expression profile for spot number 718 is shown (Fig. 3, upper panel).

FIG. 3. Schematic depiction of experimental design for quantitative expression analysis, MS and database analyses, and database construction.
Four 2-D gels from four independent biological samples of each developmental stage were analyzed together with gels from other developmental stages and a reference gel using the ImageMaster 2D Platinum software version 5 to select those phosphoprotein spots present in all four gels analyzed from one stage and also in at least two different developmental stages. An example for 6 WAF and reference gel is given where the numbers at the left corner represent the bar code assigned for each IPG strip by the manufacturer. High quality spot groups were automatically processed to calculate relative spot volumes, which were then used to generate expression profiles as shown in Step 1 for spot number 718. For MS analysis, phosphoprotein spots were excised from the reference gel, trypsin-digested, and analyzed by LC-MS/MS as shown in Step 2. LC-MS/MS data were analyzed by SEQUEST, and generated data were assembled to construct a database termed Brassica phosphoproteomics. Data are available at the web site oilseedproteomics.missouri.edu/links.php.

FIG. 4. Occurrence of six major expression trends during seed filling.
The dynamic phosphorylation profiles of 234 phosphoprotein spots were analyzed by hierarchical clustering using SAS statistics software. The total number of phosphoprotein spots grouped together for each profile is listed in parentheses. Values on the y axes are relative volumes of spots in that cluster.
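The two threshold criteria used to select spots for expression profiling can be written as a simple predicate. The sketch below reflects one reading of those criteria, namely that a stage counts only when the spot is detected in all four replicate gels of that stage and that at least two such stages are required; the data layout and the example volumes are hypothetical.

```python
def detected_in_stage(volumes, replicates=4):
    """True when the spot has a positive volume in every replicate gel of a stage."""
    return len(volumes) == replicates and all(v is not None and v > 0 for v in volumes)


def is_high_quality(spot, replicates=4, min_stages=2):
    """spot maps a stage label to the list of replicate volumes for that stage."""
    stages = [stage for stage, volumes in spot.items()
              if detected_in_stage(volumes, replicates)]
    return len(stages) >= min_stages


# Hypothetical spot: fully detected at 2 and 3 WAF, missing from one gel at 4 WAF.
spot = {
    "2 WAF": [0.21, 0.25, 0.19, 0.22],
    "3 WAF": [0.40, 0.37, 0.44, 0.41],
    "4 WAF": [0.10, 0.0, 0.12, 0.09],
    "5 WAF": [0.0, 0.0, 0.0, 0.0],
    "6 WAF": [0.0, 0.0, 0.0, 0.0],
}
print(is_high_quality(spot))
```

Requiring detection in every replicate trades coverage for reproducibility, which is consistent with roughly 300 detected spots yielding 234 quantitative profiles.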
To characterize the expression trends, hierarchical cluster analysis was performed on all 234 phosphoprotein expression profiles. A total of six expression clusters were identified from these data (Fig. 4). In clusters 1, 2, and 3, the relative volume of phosphoprotein spots increased as seed development progressed and reached a maximum at 3, 4, and 5 WAF, respectively, followed by a decrease in abundance. Phosphoproteins predominantly expressed in the early phase of seed filling grouped with clusters 4, 5, and 6, and their abundance generally decreased with seed development. Of particular interest, the abundance of phosphoprotein spots in cluster 4 dropped to a minimum at 3 WAF and then gradually increased, reaching a level slightly higher than their initial abundance at 6 WAF. Comparison of clusters 5 and 6 indicated that the abundance of phosphoprotein spots in cluster 6 declined sharply, reaching almost undetectable levels at 4 WAF, and remained nearly at the same level until 6 WAF, whereas phosphoprotein spots in cluster 5 were expressed until 6 WAF. The total number of phosphoprotein expression profiles grouped together in each cluster was 15, 32, 99, 8, 34, and 46 in clusters 1, 2, 3, 4, 5, and 6, respectively (Fig. 4). Therefore, the most abundant group is cluster 3, which represents about 42% (99 of 234) of the expressed phosphoproteins during seed filling.
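The grouping of expression profiles into trends can be sketched with a small agglomerative (bottom-up) hierarchical clustering routine. This is only an illustration: the analysis reported here was run in SAS, and the distance metric and linkage rule are not stated in this section, so the Euclidean distance, average linkage, and max-scaling used below are assumptions. The six profiles are hypothetical relative volumes at 2, 3, 4, 5, and 6 WAF.

```python
from itertools import combinations
from statistics import mean


def normalise(profile):
    """Scale a profile so that its largest relative volume equals 1."""
    top = max(profile)
    return [v / top for v in profile]


def distance(a, b):
    """Euclidean distance between two profiles of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    return mean([distance(profiles[i], profiles[j]) for i in c1 for j in c2])


def cluster(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], profiles),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical spot profiles (relative volume at 2, 3, 4, 5, 6 WAF)
raw = [
    [10, 30, 12, 8, 5],   # peaks at 3 WAF
    [12, 33, 15, 9, 6],   # peaks at 3 WAF
    [40, 20, 8, 4, 2],    # declines from 2 WAF
    [50, 22, 10, 6, 3],   # declines from 2 WAF
    [5, 8, 15, 30, 12],   # peaks at 5 WAF
    [4, 9, 14, 28, 10],   # peaks at 5 WAF
]
groups = cluster([normalise(p) for p in raw], n_clusters=3)
print(groups)
```

Scaling each profile to its own maximum makes the clustering respond to the shape of the time course rather than to absolute spot volume, which matches the intent of grouping spots by expression trend.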

LC-MS/MS and Database Analyses Identified 70 Non-redundant Phosphoproteins-Using LC-MS/MS, 103 phosphoprotein spots were identified (Table I), of which 70 were unique phosphoproteins (i.e. independent accession numbers). Of the 103 identified spots, 7, 10, 44, 4, 17, and 21 belonged to clusters 1, 2, 3, 4, 5, and 6 in Fig. 4, respectively. The expression cluster number for each phosphoprotein is listed in Table I.
... develop the "B. napus phosphoproteomics" database (Fig. 3). An interactive high resolution 2-DGE phosphoprotein reference map serves as the portal for expression and identification data (Fig. 5). Placing the cursor on any active spot automatically shows the experimental molecular weight, pI, and phosphoprotein name (if identified by LC-MS/MS). Each of the active spots is hyperlinked to another web page displaying the expression profile.
Phosphoprotein Verification by Mapping the Phosphorylation Sites in 16 Non-redundant Phosphoproteins-To map the phosphorylation sites in identified phosphoproteins, we first determined whether the phosphorylation site mapping method on the LTQ linear ion trap LC-MS/MS instrument was valid. The instrument method associated with this configuration applies data-dependent acquisition of MS/MS/MS spectra for improved phosphopeptide identification. Under this condition, if an MS/MS spectrum of a peptide yields a neutral loss of 98 amu, an MS/MS/MS analysis is triggered. Application of this concept was recently demonstrated for characterization of HeLa cell nuclear phosphoproteins using the LCQ DECA XP ion trap mass spectrometer (51).
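The data-dependent trigger described above can be sketched as a simple rule: compute the m/z at which the precursor would appear after losing 98 amu, and request an MS/MS/MS scan if a peak at that position is among the most intense fragment ions. This is a hedged illustration and not the instrument's actual control logic; the `top_n` and `tol` parameters, the function names, and the example spectrum are hypothetical.

```python
NEUTRAL_LOSS_AMU = 98.0  # nominal loss of phosphoric acid, as quoted in the text


def neutral_loss_mz(precursor_mz, charge):
    """m/z expected for the precursor ion after the 98 amu neutral loss."""
    return precursor_mz - NEUTRAL_LOSS_AMU / charge


def should_trigger_ms3(precursor_mz, charge, peaks, top_n=3, tol=0.5):
    """Return True if a neutral-loss peak is among the top_n MS/MS peaks.

    peaks: list of (mz, intensity) tuples from the MS/MS spectrum.
    top_n and tol are illustrative assumptions, not published settings.
    """
    target = neutral_loss_mz(precursor_mz, charge)
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    return any(abs(mz - target) <= tol for mz, _ in strongest)


# Hypothetical doubly charged precursor and two hypothetical MS/MS spectra
phospho_like = [(982.9, 1000.0), (500.3, 200.0), (745.4, 150.0)]
unmodified = [(500.3, 900.0), (745.4, 650.0), (612.2, 300.0)]
print(should_trigger_ms3(1031.9, 2, phospho_like))
print(should_trigger_ms3(1031.9, 2, unmodified))
```

Because the loss is divided by the charge state, a doubly charged phosphopeptide shows the loss 49 m/z units below the precursor, and a triply charged one about 32.7 units below.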
In-gel digested tryptic peptides of BSA and bovine β-casein (derived from PeppermintStick standards) were used as non-phosphorylated and phosphorylated protein markers, respectively, to evaluate the above mentioned configuration and instrument method. As expected, no phosphopeptides were detected in the BSA trypsin-digested sample. In contrast, a monophosphorylated peptide (FQS*EEQQQTEDELQDK; the asterisk indicates phosphorylation at the serine residue) was detected in the bovine β-casein sample with high confidence (Fig. 7A).
To determine the phosphorylation sites in the identified phosphoproteins (Table I), the same reconstituted tryptic peptides of 2-D gel spots previously used for identification were analyzed. In repeated attempts, we failed to detect phosphopeptides in these samples. A number of reasons could account for this failure (54), most notably the low abundance of peptide analyte eluted from faint 2-D gel spots. A recent phosphoproteomics study of developing brain used 6 mg of protein to identify protein phosphorylation sites, reasoning that this amount is required to successfully identify phosphorylation events at 10% stoichiometry (55). The authors of that study fractionated total proteins by preparative SDS-PAGE and analyzed the strong cation exchange-enriched tryptic phosphopeptides by reverse-phase HPLC in combination with LC-MS/MS. Therefore, we applied an SDS-PAGE approach as schematically presented in Fig. 7B; however, unlike the previous study, no phosphopeptide enrichment step was included. In this approach, total proteins (0.5 mg) from 2, 4, and 6 WAF seed were separated by SDS-PAGE and stained with colloidal CBB. Excised bands were trypsin-digested and analyzed by LC-MS/MS. Among the identified phosphopeptides, 21 corresponded to 16 non-redundant phosphoproteins identified from 2-D gels: 14 were singly phosphorylated, and seven were multisite-phosphorylated (Table II). These phosphoproteins are also highlighted in Table I.

FIG. 6. B. napus seed phosphoproteins belong to 10 major functional classes.
Of 103 identified phosphoprotein spots, 70 phosphoproteins were found to be non-redundant. The pie chart shows the distribution of these non-redundant phosphoproteins into their functional classes in percentages. Functional classification was performed according to Bevan et al. (50).

FIG. 7. Mapping of phosphorylation site.
A, the phosphopeptide mapping workflow on the LC-MS/MS instrument was evaluated using BSA (protein) and β-casein (phosphoprotein). A phosphopeptide was detected only in β-casein, and its phosphorylation site was mapped with high confidence (Xcorr = 5.2). An MS/MS spectrum of a phosphopeptide shows a typical neutral loss of phosphoric acid from the most intense peak (precursor ion) of a single MS scan. The triggered MS/MS/MS spectrum is then acquired from the neutral loss precursor ion in MS/MS. Abundant peptide bond fragmentation allowed the unambiguous identification of the peptide (FQS*EEQQQTEDELQDK) from β-casein and mapping of the phosphorylation site on the serine residue marked by an asterisk. B, a preparative SDS-PAGE approach for identification of phosphorylation sites. A flow chart depicts the steps used to identify phosphorylation sites using trypsin digestion and LC-MS/MS analysis coupled to database analysis.

DISCUSSION
Phosphoproteomics is still an emerging area in plant biology. The only systematic study available to date in plants is the identification of more than 300 phosphorylation sites from plasma membrane proteins of suspension-cultured Arabidopsis cells (42). However, a quantitative assessment was not provided for either the phosphopeptides or phosphoproteins. As protein phosphorylation can be either static or dynamic, quantitation of phosphorylation events is necessary to distinguish between the two possibilities to elucidate regulatory networks. Phosphoprotein assignment depends mostly on a single phosphopeptide, and so the analysis is assisted by sequenced and annotated genomes to provide some level of confidence. Ostensibly, in plants such large scale experiments are currently possible only in Arabidopsis and rice (O. sativa L.). However, because B. napus is closely related to Arabidopsis, it seemed plausible that protein identification and phosphopeptide mapping are achievable with this crop.
A phosphoprotein stain, Pro-Q DPS, was applied to detect and quantify phosphoproteins resolved by 2-DGE. As phosphorylation is a dynamic process, quantification of phosphoproteins is one major objective of this study in addition to the identification of phosphoproteins and their phosphorylation sites. To achieve this objective, we recently investigated some of the inherent limitations of Pro-Q DPS and established a modified protocol to reproducibly detect phosphoproteins on high resolution 2-D gels (44). Using this protocol, we detected ∼300 phosphoprotein spots on 2-D gels and provided quantitative expression profiles for 234 spots expressed in at least two seed developmental stages (Fig. 3 and supplemental table). Hierarchical clustering of expression profiles further revealed the occurrence of six major expression trends (Fig. 4). It appeared that phosphoproteins in each cluster are regulated in a stage-specific and coordinated manner; that is, the abundance of phosphoproteins in clusters 1, 2, and 3 reached a maximum at 3, 4, and 5 WAF, respectively. Identification of 70 non-redundant phosphoproteins and their classification into 10 functional categories (energy and metabolism are the largest groups) suggested that protein phosphorylation might be involved in the regulation of storage reserve synthesis. As these phosphoproteins have diverse functions ranging from signal transduction to protein synthesis and destination and are known to operate in different signaling and metabolic pathways, it is highly likely that phosphoproteins are linked to each other either through kinase function or common pathways.
The paucity of systematic studies on phosphoproteins in plants precludes us from discussing relational information about the phosphoproteins identified here, particularly because many of these phosphoproteins are novel. However, one can extract important information from the given inventory of phosphoproteins and their dynamic behavior. A few examples are as follows. Analyses of phosphoproteins belonging to different functional categories and their dynamic expressions revealed that the majority of phosphoproteins (75%) involved in signal transduction accumulated predominantly at 2 WAF followed by a sharp decrease in their abundance as shown for expression clusters 5 and 6 in Fig. 4. This is in contrast to expression profiles of ∼70% of phosphoproteins belonging to either the energy or metabolism category, which group with clusters 2 and 3 (Fig. 4 and Table I). Such opposite expression patterns of phosphoproteins suggest a tight coordination between the function of phosphoproteins associated with signal transduction and energy/metabolism. The 14-3-3 proteins have been shown previously to regulate both the signal transduction and metabolic pathways (for reviews, see Refs. 21 and 56-58). The 14-3-3 proteins typically contain two 14-3-3 protein family signature motifs, one each near the N- and C-terminal regions and one annexin binding motif (for reviews, see Refs. 56-58). 14-3-3γ has been shown to interact with and be phosphorylated by multiple protein kinase C isoforms in platelet-derived growth factor-stimulated human vascular smooth muscle cells (59). However, there is no direct evidence for in vivo phosphorylation of 14-3-3 proteins in plants. This study identified two 14-3-3 phosphoproteins (Group IDs 720 and 867) and two annexin proteins (Group IDs 658 and 699) (Fig. 5 and Table I). In addition, among the known potent targets of 14-3-3 proteins, H+-ATPase and heat shock proteins were also found as phosphoproteins (Fig. 5 and Table I).
A recent proteomics study of soybean has also identified four 14-3-3 isoforms highly expressed during seed development (20). These findings suggest that 14-3-3 proteins might be involved in signaling and metabolic pathways within developing seed. The overall inventory of phosphoproteins also indicated the presence of several proteins that have not yet been reported as phosphoproteins, including stearoyl-acyl-carrier-protein desaturase (stearoyl-ACP desaturase) (Table I). Stearoyl-ACP desaturase is a key regulator of unsaturated fatty acids and has been implicated in the regulation of cell growth and development and the defense/stress responses (60,61).
By mapping the phosphorylation sites in at least 16 non-redundant phosphoproteins, including the 14-3-3 and annexin, this study provided a further level of confidence in the identified phosphoproteins (Table II). The criteria applied for selection of these phosphopeptides are based on the seminal large scale phosphoproteomics studies of the developing mouse brain and HeLa cell nuclear phosphoproteins (51,55). A survey of the published literature on phosphoproteomics showed that the criteria for assignment of phosphorylation sites and selection of phosphopeptides are still evolving (42,51,54,55). Nevertheless, among the applied criteria, an Xcorr value of at least 1.9, 2.5, and 3.3 for +1, +2, and +3 charged ions, respectively, appears to be widely accepted. This Xcorr criterion has been applied along with the data-dependent MS/MS/MS strategy to accurately assign the phosphorylation sites in this study. However, in our experience and as discussed by Beausoleil et al. (51), a data-dependent MS/MS/MS strategy may be unnecessary for large scale phosphoprotein analyses. It was previously noted that only 96 of 2002 phosphorylation sites were determined from an MS/MS/MS spectrum (51). Of the mapped phosphorylation sites in this study, one site at the serine residue of the 14-3-3 phosphoprotein was previously postulated based on phosphorylation of human 14-3-3γ and on the presence of serine in all species (59,62). Two serine residues, including the one previously postulated, and one threonine residue were identified as potential candidate sites for phosphorylation (Table II).

FIG. 8. Phosphorylation sites mapped within the 14-3-3 phosphoprotein are conserved across organisms.
A, amino acid residues ranging from 184 to 245 in the sequence of the H. sapiens 14-3-3 protein isoform are shown. These sequences encompass helices 8 and 9 (boxed regions filled with a gray color). Block 5 is the last block of the sequence that shows conserved identity in 99% of all sequences. The protein signature motif 2 of the 14-3-3 protein includes helix 9 and a few surrounding residues. The amino acid motif known as the nuclear export signal (NES) is also shown. Bold amino acids represent the full tryptic peptide identified by LC-MS/MS. B, the 14-3-3 proteins were selected from diverse organisms, such as mammals, insects, worm, fungi, and plants, as representatives, and their sequences were aligned using the ClustalW program (European Bioinformatics Institute). For clarity, only those amino acid sequences that match the tryptic peptide identified by LC-MS/MS are shown (see Table II). The source, accession number, total length, and homology of the respective 14-3-3 sequences are also given. Because the H. sapiens 14-3-3 protein is well studied (62), the sequence of this protein was used as a reference to determine the percentage of homology of other 14-3-3 proteins. Amino acids shaded in a gray color were found to be phosphorylated. Of the three phosphorylation sites identified in this study, serine is the only one (marked by an asterisk in A) previously postulated for phosphorylation (59,62). aa, amino acids.
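The charge-dependent Xcorr cutoffs quoted in the discussion (1.9, 2.5, and 3.3 for +1, +2, and +3 ions) amount to a simple lookup filter over peptide-spectrum matches. The sketch below shows one way to express it; the function name, the tuple layout, the decision to reject charge states outside +1 to +3, and the example hits are assumptions made for illustration and do not reproduce the Sequest output format.

```python
# Minimum Xcorr per precursor charge state, as quoted in the text
XCORR_MIN = {1: 1.9, 2: 2.5, 3: 3.3}


def passes_xcorr(charge, xcorr):
    """Return True if a match meets the cutoff for its charge state.

    Assumption: charge states without a listed cutoff are rejected.
    """
    threshold = XCORR_MIN.get(charge)
    if threshold is None:
        return False
    return xcorr >= threshold


# Hypothetical (peptide label, charge, Xcorr) matches
hits = [
    ("peptide_a", 1, 2.1),
    ("peptide_b", 2, 2.4),
    ("peptide_c", 3, 3.3),
    ("peptide_d", 4, 5.0),
]
accepted = [name for name, z, score in hits if passes_xcorr(z, score)]
print(accepted)
```

Higher charge states receive stricter cutoffs because their spectra tend to yield higher cross-correlation scores even for incorrect matches.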
Considering 14-3-3 proteins as ubiquitous regulatory factors, the 14-3-3 proteins were selected from diverse organisms, and their amino acid sequences were aligned to determine the functional significance of the mapped phosphorylation sites (Fig. 8). For this purpose, the Homo sapiens 14-3-3 protein isoform sequence was used as a reference because the crystal structure of this isoform has been determined (62). As highlighted with a gray color in Fig. 8B, each putative phosphorylation site is conserved from plants to mammals and is located within the signature motif 2. Helix 9 of the C-terminal tail is within the signature motif 2, and this helix contains seven residues that are directly involved in peptide binding (62,63). Therefore, it is likely that phosphorylation at the identified sites affects ligand binding or induces conformational changes, which eventually might be important for specificity and selectivity of binding partners. This idea is consistent with increasing evidence that 14-3-3 post-translational modifications act as mechanisms for regulating interactions (64). In particular, the direct phosphorylation of 14-3-3 proteins, mediated by a variety of protein kinases, alters the binding affinity and dimerization properties of different 14-3-3 isoforms (64).
Tubulin β-8 chain, annexin, and phosphoglycerate kinase (PGK) were among the identified phosphoproteins in which the phosphorylation sites were also determined (Tables I and II). Tubulin β-subunit and annexin have previously been implicated in cell division during embryogenesis and in a number of environmental and developmental signals (for reviews, see Refs. 65-68; Ref. 69). The tubulin β-subunit (Group ID 512) and annexin (Group IDs 658 and 699) phosphoprotein spots were expressed predominantly at 2 WAF followed by a decrease in their abundance, and this matched well with the period of cell division in B. napus (i.e. before 3 WAF; Refs. 1 and 5). The functional significance of these phosphoproteins in seed development is supported by the detected phosphopeptides and their mapped phosphorylation sites. Sequences of the detected phosphopeptides of tubulin β-subunit (S*T*VCDIPPTGLKMASTFIGNSTSIQEMFR) and annexin (AVMLWT*LDPPER) were located at the C and N termini of their amino acid sequences, respectively. In mammals, the C-terminal region of tubulin β-subunits and the N-terminal region of annexins have been shown previously to be targets for phosphorylation (66,67,70). The threonine residue of annexin is also located within the second annexin repeat of the N-terminal region. In addition to tubulin β-subunits and annexins, PGK could also play a key role in seed development because it catalyzes the conversion of 1,3-bisphosphoglycerate into 3-phosphoglycerate. Along with PGK, this study identified a total of five glycolytic enzymes as phosphoproteins (Table I). However, a phosphopeptide could be detected only for PGK (MSHIST*GGGASLELLEGK). As for the 14-3-3 phosphopeptide, sequence alignment of PGK proteins selected from diverse organisms showed that the mapped threonine residue is also conserved from plants to mammals, suggesting the importance of this residue in PGK function.
It is possible that in vivo phosphorylation of tubulin ␤-subunits, annexins, PGK, and other phosphoproteins identified in this study might alter their activity and stability to properly regulate the cellular processes of seed filling in B. napus.
In conclusion, we present the first quantitative global analysis of phosphoproteins during seed filling in B. napus. Phosphoprotein-specific Pro-Q DPS coupled to a 2-DGE and LC-MS/MS strategy led to the generation of 234 quantitative expression profiles of phosphoproteins and to the identification of 70 non-redundant phosphoproteins. Equally important was the establishment of a high resolution 2-D gel phosphoprotein reference map, which could be used for comparative functional proteomics. Identification of in vivo 14-3-3 phosphoproteins and their potential phosphorylation sites implied a potential role for 14-3-3 proteins in seed development as well. This collective data set is a step forward in the analysis of the seed filling phosphoproteome in B. napus and toward an understanding of the regulatory mechanisms behind embryo development and seed reserve deposition. Future studies will be directed toward investigating the function of phosphoproteins that mediate multiple signaling and metabolic pathways, such as the 14-3-3 phosphoproteins, and toward developing a suitable high throughput system to map phosphorylation sites in Pro-Q DPS-detected phosphoprotein spots on 2-D gels. The latter will help determine whether the changes in expression of phosphoproteins are due to total phosphoprotein expression or phosphorylation status (i.e. phosphorylation stoichiometry).