Large Scale Comparative Proteomics of a Chloroplast Clp Protease Mutant Reveals Folding Stress, Altered Protein Homeostasis, and Feedback Regulation of Metabolism*

The clpr2-1 mutant is delayed in development due to reduction of the chloroplast ClpPR protease complex. To understand the role of Clp proteases in plastid biogenesis and homeostasis, leaf proteomes of young seedlings of clpr2-1 and wild type were compared by large scale mass spectrometry-based quantification using an LTQ-Orbitrap and spectral counting, with significance determined by G-tests. Virtually only chloroplast-localized proteins were significantly affected, indicating that the molecular phenotype was confined to the chloroplast. A comparative chloroplast stromal proteome analysis of fully developed plants was used to complement the data set. Chloroplast unfoldase ClpB3 was strongly up-regulated in both young and mature leaves, suggesting widespread and persistent protein folding stress. The importance of ClpB3 in the clpr2-1 mutant was demonstrated by the observation that a CLPR2 and CLPB3 double mutant was seedling-lethal. The observed up-regulation of chloroplast chaperones and protein sorting components further illustrated destabilization of protein homeostasis. Delayed rRNA processing and up-regulation of a chloroplast DEAD box RNA helicase and polynucleotide phosphorylase, but no significant change in accumulation of ribosomal subunits, suggested a bottleneck in ribosome assembly or RNA metabolism. Strong up-regulation of a chloroplast translational regulator TypA/BipA GTPase suggested a specific response in plastid gene expression to the distorted homeostasis. The stromal proteases PreP1,2 were up-regulated, likely constituting compensation for reduced Clp protease activity and possibly shared substrates between the ClpP and PreP protease systems. The thylakoid photosynthetic apparatus was decreased in the seedlings, whereas several structural thylakoid-associated plastoglobular proteins were strongly up-regulated.
Two thylakoid-associated reductases involved in isoprenoid and chlorophyll synthesis were up-regulated reflecting feedback from rate-limiting photosynthetic electron transport. We discuss the quantitative proteomics data and the role of Clp proteolysis using a “systems view” of chloroplast homeostasis and metabolism and provide testable hypotheses and putative substrates to further determine the significance of Clp-driven proteolysis.

Intracellular proteolysis is important for regulation of metabolic and signaling pathways as well as protein homeostasis and viability of cells and organelles. Chloroplasts contain multiple soluble and membrane-bound proteases and processing peptidases (1), presumably with partially overlapping substrates. These include stromal processing peptidase (2) and stromal PreP1,2 involved in degradation of cleaved transit peptides (3); various amino peptidases (4,5); the thylakoid processing peptidases cTPA (6), TPP (7), and thylakoid/envelope signal peptidase I (8); and thylakoid-bound proteases SppA (9) and Egy1 (10) as well as stromal and thylakoid members of the Deg, FtsH, and Clp families (11-13). Together with several chaperone systems, including CPN60/CPN10, HSP70/DnaJ, HSP90, and ClpB3 (14), these proteases are part of the chloroplast protein homeostasis network. Importantly the connectivity and overlap of proteins within this homeostasis network are poorly understood; in particular it is unclear how protease substrates are recognized by the different proteolytic systems. Several suppressors of variegated FtsH protease mutants in Arabidopsis have elegantly demonstrated that the balance between protein synthesis and degradation plays an important role in chloroplast homeostasis (15-17). Comparative proteome analysis of chloroplast homeostasis mutants will provide insights into this homeostasis network, as we recently showed for a protein sorting mutant (18), and it will identify candidate protease substrates.
The Clp proteins form the largest plastid-localized protease family with five serine-type ClpP (P1,P3-6) proteases, four non-catalytic ClpR (R1-4) proteins, three Clp AAA+ chaperones (C1, C2, and D), and several additional members (ClpT1,T2,S) with unknown functions (1,13,19). We note that we renamed Arabidopsis ClpS1,S2 and ClpT as ClpT1,T2 and ClpS to be consistent with the Escherichia coli nomenclature for ClpS (20). The ClpR proteins lack the three catalytic amino acid residues that are conserved across ClpP proteins (21). All proteins of the Arabidopsis Clp proteolytic system have been identified by mass spectrometry (13), including a potential substrate affinity regulator, ClpS. Recent evidence shows that the Clp proteolytic system plays a critical role in plant growth, development, and protein homeostasis. ClpP1 is plastid-encoded and was shown to be essential for shoot development in tobacco (22,23). Down-regulation of the plastid-encoded CLPP1 gene in the green alga Chlamydomonas reinhardtii suggested that ClpP1 is involved in the degradation of the thylakoid-bound subunits of cytochrome b₆f and photosystem II (PSII) complex (24,25). Arabidopsis mutant clpr1-1 carries a premature stop codon in the CLPR1 gene and showed a virescent phenotype and delayed chloroplast development and differentiation (26). Maturation of 23 and 4.5 S chloroplast ribosomal RNA (rRNA) is delayed in clpr1-1 (26), but it is not clear how this is related to the loss of ClpR1 protein. Phenotypes of Arabidopsis antisense lines against CLPP4 (27) and CLPP6 (28) also showed delayed chloroplast and plant development as well as reduced accumulation of other ClpP,R subunits. Based on two-dimensional gel analysis, several chloroplast proteins were suggested to be substrates of the Clp machinery (28-30), but direct evidence is lacking.
A null mutant for the CLPC1 chaperone (also named HSP93-V) resulted in reduced plant growth, chloroplast development, and protein import rates, but homozygous plants are autotrophic and seeds are viable (31-33). A null mutant for chaperone CLPC2 has no visible phenotype, whereas lack of both CLPC1 and CLPC2 prevents embryogenesis (34). Interestingly ClpC1 is also involved in accumulation of chlorophyll a oxygenase, which is responsible for conversion of chlorophyll a to chlorophyll b (35).
In a previous study, we identified and characterized a T-DNA-tagged Arabidopsis thaliana mutant with reduced expression of CLPR2; this mutant was named clpr2-1 (36). Accumulation of the assembled 325-kDa ClpPRT complex was 2-3-fold reduced and resulted in delayed chloroplast and plant development with small chloroplasts and a pale green phenotype. The clpr2-1 mutant shows the strongest visible phenotype when seedlings are young. To better understand the role of the Clp machinery in chloroplast biogenesis and homeostasis and to discover potential Clp substrates, a comprehensive proteome analysis at different points in leaf development of the clpr2-1 mutant is presented in the current study. The methods to quantitatively analyze differences in protein accumulation have greatly improved over the last decade and have shifted from gel image-based quantification to quantification within the mass spectrometer (37-39). Taking advantage of these new developments and opportunities, we compared the leaf proteome of clpr2-1 and wt seedlings early in development using spectral counting. This was complemented with a comparative analysis of the chloroplast soluble proteome of fully developed leaf rosettes. The seedling proteome analysis showed that the strongest effects occurred within the chloroplast. The functional significance of one of the most up-regulated proteins, ClpB3, was confirmed by additional mutant analysis. Putative substrates for the Clp system suggested in recent studies (28-30, 35) are reviewed in the context of our findings. This study provides testable hypotheses to further determine the significance of Clp-driven proteolysis and provides new insights into the plastid protein homeostasis network and how secondary metabolism is intertwined with photosynthetic capacity.
We show that a systems view of chloroplast biogenesis and proteome homeostasis is needed to identify putative protease substrates and to understand the role of proteolysis in chloroplast biology. Finally we believe that the experimental setup described in this study provides an attractive template for comparative proteome analysis of other (chloroplast) mutants.

EXPERIMENTAL PROCEDURES
Plant Growth of clpr2-1 for Proteome Analysis-A. thaliana clpr2-1 and wt (Col-0 background) plants were grown side by side on soil under short day (10/14-h light/dark) with 120 µmol photons·m⁻²·s⁻¹ light at 23/19°C in controlled growth chambers (Conviron). For the seedling analysis, seeds were directly placed on soil for germination and never transplanted to ensure homogeneity of the seedlings.
Label-free Shotgun Analysis by 1-D SDS-PAGE and Nano-LC-LTQ-Orbitrap Mass Spectrometry of Seedlings-Total proteins from leaf rosettes in growth stage 1.07 and 1.14 of wt and clpr2-1 were extracted by grinding 200 mg of fresh leaves in liquid N₂ into a fine powder. Three independent biological replicates for each genotype were obtained (two for stage 1.07 and one for stage 1.14). 1 ml of extraction buffer (5% SDS, 125 mM Tris-HCl, pH 8.8, 5 mM EDTA, 5 mM tributylphosphine, 50 µg/ml Pefabloc) was then added, and the pestle was used to solubilize the material. Cellular debris was removed by a 1-min spin at 14,000 × g at 4°C followed by an additional spin for 10 min at 14,000 × g at 4°C to remove unsolubilized material. Proteins were precipitated in 75% acetone at −80°C for at least 30 min, and the resulting protein pellet was solubilized in 2% SDS, 50 mM Tris-HCl, pH 8.25. Protein concentrations were determined using the bicinchoninic acid method (41). Finally proteins were reduced and further solubilized in Laemmli solubilization buffer. 400 µg of total leaf protein of both growth stages of clpr2-1 and wt each were run out on a 1-D SDS-PAGE gel (12% acrylamide). Each of the gel lanes was cut into 12 or 14 bands followed by reduction, alkylation, and in-gel digestion with trypsin as described previously (42).
The peptides extracted from these gel bands were analyzed in duplicate (technical replicates) by data-dependent MS/MS using an on-line LC-LTQ-Orbitrap (Thermo Electron Corp.). Peptide extracts were automatically loaded on a guard column (LC Packings MGU-30-C18PM) via an autosampler followed by separation on a PepMap C₁₈ reverse-phase nanocolumn (LC Packings nan75-15-03-C18PM) using 90-min gradients with 95% water, 5% ACN, 0.1% FA (solvent A) and 95% ACN, 5% water, 0.1% FA (solvent B) at a flow rate of 200 nl/min. To minimize variation, the samples were run in a specific order as follows. clpr2-1 and wt samples of similar growth stage were run after one another, starting with the gel band at the top of the gel from wt followed by two blanks and then the gel sample at the top of the gel from clpr2-1. After two blanks, the subsequent gel slice of wt was run followed by two blanks and then the equivalent clpr2-1 sample, etc.
The acquisition cycle consisted of a survey MS scan in the Orbitrap with a set mass range from 350 to 1800 m/z at the highest resolving power (100,000) followed by five data-dependent MS/MS scans acquired in the LTQ. Dynamic exclusion was used with the following parameters: exclusion size, 500; repeat count, 2; repeat duration, 30 s; exclusion time, 180 s; exclusion window, ±6 ppm. Target values were set at 5 × 10⁵ and 10⁴ for the survey and tandem MS scans, respectively, and the maximum ion accumulation times were set at 200 ms in both cases. Regular scans were used both for the precursor and tandem MS with no averaging. The precursor isolation window was set at 2 m/z with monoisotopic peak selection, and the FTMS preview option was used. Peak lists (.mgf format) were generated using DTASuperCharge (v1.19) software (SourceForge, Inc.) and searched with Mascot v2.2 (Matrix Science). For off-line calibration, first a preliminary search was conducted with the precursor tolerance window set at ±30 ppm. Peptides with ion scores above 33 were chosen as benchmarks to determine the offset for each LC-MS/MS run. This particular ion score value, 33, was chosen in accordance with the results of the search against the target-decoy database (see further below). This offset was then applied to adjust precursor masses in the peak lists of the respective .mgf file for recalibration using a Perl script. The recalibrated peak lists were searched against The Arabidopsis Information Resource A. thaliana database v8, including sequences for known contaminants (e.g. keratin and trypsin) (33,013 entries total) concatenated with a decoy database where all the sequences were in reverse orientation.
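The off-line recalibration step can be illustrated with a minimal Python sketch. It is not the Perl script used in this study; the function names are hypothetical, and the use of the median error as the per-run offset is an assumption, because the text does not state how the offset was summarized.

```python
from statistics import median

def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error of one precursor, in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def run_offset_ppm(benchmarks, min_ion_score=33.0):
    """Estimate the systematic offset of one LC-MS/MS run.

    benchmarks: iterable of (observed_mz, theoretical_mz, ion_score) from
    the preliminary +/-30 ppm search. Only peptides with ion scores above
    the threshold are used. Taking the median is an assumption.
    """
    errors = [ppm_error(obs, theo) for obs, theo, score in benchmarks
              if score > min_ion_score]
    if not errors:
        raise ValueError("no benchmark peptides above the ion score threshold")
    return median(errors)

def recalibrate(observed_mz, offset_ppm):
    """Remove the run-specific offset from an observed precursor m/z."""
    return observed_mz / (1.0 + offset_ppm * 1e-6)
```

After recalibration, the residual errors of a well-behaved run should be centered near 0 ppm, which is what makes the narrow ±6 ppm search window practical.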
Each of the peak lists was searched using Mascot v2.2 (maximum p value of 0.01) for full tryptic peptides using a precursor ion tolerance window set at ±6 ppm, variable methionine oxidation, fixed cysteine carbamidomethylation, and a minimal ion score threshold of 33; this yielded a peptide false discovery rate below 1% with peptide false positive rate calculated as 2 × (decoy_hits/total_hits). The false protein identification rate of proteins identified with two or more peptides was zero. To reduce the false protein identification rate of proteins identified by one peptide, the Mascot search results were further filtered as follows: the ion score threshold was increased to 35, and mass accuracy on the precursor ion was required to be within ±3 ppm. All filtered results were uploaded into the Plant Proteomics Database (PPDB) (43).
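The target-decoy estimate above, 2 × (decoy_hits/total_hits), can be written as a short Python sketch. The record layout (dictionaries with `score`, `ppm`, and `is_decoy` keys) and the function names are hypothetical and chosen only for illustration; they do not correspond to Mascot output fields.

```python
def peptide_fdr(decoy_hits, total_hits):
    """Peptide false positive rate as 2 x (decoy_hits / total_hits)."""
    if total_hits == 0:
        return 0.0
    return 2.0 * decoy_hits / total_hits

def filter_psms(psms, min_score=33.0, max_ppm=6.0):
    """Keep matches at or above the ion score threshold and within the
    precursor mass tolerance (in ppm)."""
    return [p for p in psms
            if p["score"] >= min_score and abs(p["ppm"]) <= max_ppm]

def fdr_of(psms):
    """Estimate the false positive rate of a filtered list of matches
    searched against a concatenated target-decoy database."""
    decoys = sum(1 for p in psms if p["is_decoy"])
    return peptide_fdr(decoys, len(psms))
```

Calling `filter_psms(psms, min_score=35.0, max_ppm=3.0)` would mimic the stricter filter applied to proteins identified by a single peptide.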
For quantification by spectral counting, each accession was scored for total spectral counts (SPCs), unique SPC (uniquely matching to an accession), and adjusted SPC. The latter assigns shared peptides to accessions in proportion to their relative abundance using unique spectral counts for each accession as a basis. Proteins that shared more than 80% of their matched peptides with other proteins across the complete data set were grouped into families. For many Arabidopsis genes more than one protein model is predicted. For our quantitative analysis, protein models with the highest total protein MOWSE score were used (summed across all experiments); if the protein models did not differ in total MOWSE score, protein model 1 was selected. To increase the robustness and significance of the data set, we removed all proteins (or families) for which the total number of adjusted SPCs within each pairwise clpr2-wt comparison was less than 10. The particular cutoff of 10 for adjusted SPC was chosen based on the results of the confidence testing by G-test; namely we found that given the absence of protein in one condition 10 or more SPCs are required in the other condition to claim significant overaccumulation with 95% confidence. The quantified proteins, along with details of identification, are listed in supplemental Table 1. The clpr2-1/wt ratios for each protein were calculated within each technical replicate. Proteins identified in only one genotype were assigned a fixed and arbitrary clpr2/wt ratio of 10.1 (absent in wt) or [1/10.1] (absent in clpr2). This assignment simplified further calculations and analysis of ratios, making all the values of numerical type. The median value for the clpr2-1/wt protein ratios (averaged for the three biological replicates) was 1.38. To determine significance (95% confidence) of up- and down-regulated proteins, first the G-test was used for each biological replicate (14,44).
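As an illustration of the ratio assignment and the G-test on spectral counts, the following Python sketch may be helpful. The exact form of the G statistic used in this study is not spelled out in the text, so the sketch assumes a two-category goodness-of-fit G-test (one degree of freedom, no continuity correction) with expected counts proportional to optional sample totals; all function names are hypothetical. It is not meant to reproduce the cutoff of 10 SPCs, which also depends on the multiple testing correction.

```python
import math

ABSENT_RATIO = 10.1  # fixed, arbitrary ratio for proteins seen in one genotype only

def mutant_wt_ratio(spc_mut, spc_wt):
    """clpr2-1/wt ratio with the fixed values for one-sided detections."""
    if spc_mut == 0 and spc_wt == 0:
        raise ValueError("protein not detected in either genotype")
    if spc_wt == 0:
        return ABSENT_RATIO
    if spc_mut == 0:
        return 1.0 / ABSENT_RATIO
    return spc_mut / spc_wt

def g_test(spc_mut, spc_wt, total_mut=1.0, total_wt=1.0):
    """Return (G, p) for a two-category goodness-of-fit G-test.

    Expected counts split the observed sum in proportion to the sample
    totals (equal by default; an assumption). The p value is the
    chi-square survival function for one degree of freedom, which equals
    erfc(sqrt(G / 2)).
    """
    n = spc_mut + spc_wt
    if n == 0:
        raise ValueError("no spectral counts to test")
    exp_mut = n * total_mut / (total_mut + total_wt)
    exp_wt = n - exp_mut
    g = 0.0
    for obs, exp in ((spc_mut, exp_mut), (spc_wt, exp_wt)):
        if obs > 0:  # a zero count contributes nothing to G
            g += 2.0 * obs * math.log(obs / exp)
    g = max(g, 0.0)  # guard against tiny negative rounding errors
    return g, math.erfc(math.sqrt(g / 2.0))
```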
The p values determined by G-testing were corrected for multiple hypotheses testing using the Benjamini-Hochberg method (45). Briefly p values were ranked from 1 to n where 1 corresponded to the lowest and n corresponded to the highest p value; a protein with a p value of rank k is said to pass the G-test at 95% significance if the inequality p value < (k/n) × 0.05 holds. Subsequently accessions were deemed significantly up- or down-regulated only if 1) they passed the G-test in two or three biological replicates at 95% confidence and 2) they were either consistently up or down in all biological replicates. The clpr2-1/wt ratios and results of the G-tests are provided in supplemental Table 1. The sequence coverage (in %) and unique matched sequences of all proteins quantified in this study are provided in supplemental Table 5.
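The correction and the replicate rule can be sketched in Python as follows. The sketch implements the standard Benjamini-Hochberg step-up procedure (every p value ranked at or below the largest passing rank is accepted), which may differ slightly from a literal rank-by-rank reading of the inequality above; the function names are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p value: True if it passes at level alpha.

    Standard step-up procedure: find the largest rank k for which
    p_(k) < (k / n) * alpha, then accept all p values of rank <= k.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] < rank / n * alpha:
            max_k = rank
    passed = [False] * n
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            passed[idx] = True
    return passed

def significant_across_replicates(passed, ratios, min_pass=2):
    """Apply the two-part rule to one protein.

    passed: per-replicate outcome of the corrected G-test.
    ratios: per-replicate clpr2-1/wt ratios.
    The protein must pass in at least min_pass replicates and change in
    the same direction (all above 1 or all below 1) in every replicate.
    """
    consistent = all(r > 1 for r in ratios) or all(r < 1 for r in ratios)
    return consistent and sum(passed) >= min_pass
```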
Quantitative Comparative Analysis of Chloroplast Stroma from Fully Developed Plants-Intact chloroplasts were isolated from fully developed rosette leaves of wt (42 days old) and clpr2-1 (90 days old), and stroma was collected as described previously (46). ICAT-based comparative analysis of clpr2-1 and wt was carried out according to Rutschow et al. (18) and was as follows. 200 µg of purified stromal proteins from wt and from clpr2-1 were denatured and reduced using 50 mM tris(2-carboxyethyl)phosphine. All cysteine residues were labeled with the light (containing only ¹²C stable isotopes) or heavy (¹³C) ICAT reagent for 2 h at 37°C according to the manufacturer's instructions (Applied Biosystems). After labeling, the samples were combined, and proteins were separated by one-dimensional SDS-PAGE (12% acrylamide) followed by Coomassie Brilliant Blue (R-250) staining. The gel lane containing the ICAT-labeled samples was completely cut into 12 slices. Each slice was cut into smaller pieces, washed, and digested with modified trypsin (Promega), and the resulting peptides were eluted as in Shevchenko et al. (42). After vacuum concentration, 30 µl of 25% ACN, 5 mM KH₂PO₄, 350 mM KCl, pH 2.7 and 500 µl of affinity loading buffer (pH 7.2) were added to each sample as described previously (47). The biotin-tagged peptides were purified on avidin columns as instructed by the manufacturer (Applied Biosystems). Protein digests were qualitatively analyzed on a MALDI-TOF mass spectrometer in linear mode (Voyager DE-STR, Applied Biosystems) to confirm total digestion. Peptides were then loaded on a guard column (LC Packings MGU-30-C18PM) followed by separation on a PepMap C₁₈ reverse-phase nanocolumn (LC Packings nan75-15-03-C18PM) using 90-min gradients with 95% water, 5% ACN, 0.1% FA (solvent A) and 95% ACN, 5% water, 0.1% FA (solvent B) at a flow rate of 200 nl/min. The labeled peptide mixtures were analyzed by LC-ESI-MS/MS (Q-TOF; Waters).
Each ICAT-labeled sample was run in duplicate under the same chromatographic settings as follows.
(i) For protein quantification the samples were analyzed in MS mode, and the peak areas of the peptides were calculated using MassLynx 4.0 SP1. (ii) The second run was set up for data-dependent MS/MS acquisition for protein identification. The flow-through fractions from the 12 avidin column purifications (containing non-labeled peptides; i.e. peptides without cysteines) were also collected and analyzed by MS/MS (peptides not usable for quantification but valuable for protein identification) after off-line desalting on C₁₈ microcolumns (48). All quantified peptides and details are listed in supplemental Table 2. Proteins were identified by searching the MS data against The Arabidopsis Information Resource A. thaliana database v8 including sequences for known contaminants (e.g. keratin and trypsin) (33,013 entries total) and concatenated with a decoy database where all sequences were in reversed orientation using Mascot v2.2 (Matrix Science) with a significance threshold of 0.01. The maximum mass error tolerance for precursor and product ions was set at 1.2 and 0.8 Da, respectively. Search criteria were as follows: full tryptic peptides only, variable methionine oxidation, and a variable ICAT modification. Proteins that were identified with only one amino acid sequence and that were not quantified in the LTQ data sets were removed as there is an increased likelihood that they represent false positive identifications.
In addition, a comparative proteome analysis was carried out by spectral counting using the nano-LC-ESI-Q-TOF instrument. Similar to the LTQ-Orbitrap analysis of total leaf proteome, 400 µg of isolated chloroplast stromal proteins (wt or clpr2-1) were separated by SDS-PAGE, and each gel lane was cut in 12 gel slices and manually processed and digested with trypsin followed by MS/MS analysis. Mass spectrometry and nano-LC settings for the spectral counting experiment were the same as described in detail above (see "Label-free Shotgun Analysis by 1-D SDS-PAGE and Nano-LC-LTQ-Orbitrap Mass Spectrometry of Seedlings"). Quantification of clpr2-1/wt ratios was similar to the seedling analysis, and details for quantified proteins are provided in supplemental Table 3. The sequence coverage (in %) and unique matched sequences of all proteins quantified in this study are provided in supplemental Table 5.
Western Blot Analysis-Total protein extracts of wt and clpr2-1 seedlings were obtained by grinding the seedlings with a mortar and pestle in liquid nitrogen followed by addition of 2% SDS, 50 mM Tris-HCl, pH 8.8, 5 mM EDTA and a mixture of protease inhibitors (49). Unsolubilized material was removed by passing the material over a spin frit column (30-µm pore size; Pierce) by centrifugation for 1 min at 10,000 rpm. Protein concentrations were determined by the bicinchoninic acid assay (41). Proteins were separated by SDS-PAGE (12% acrylamide) and transferred to nitrocellulose membrane. Proteins were immunodecorated with primary antiserum, and bound antiserum was detected with secondary antiserum conjugated to horseradish peroxidase (Sigma) and enhanced chemiluminescence.
The Plastid Proteomics Database-Mass spectrometry-based information of all identified proteins was extracted from the Mascot search pages and filtered for significance (e.g. minimum ion scores, etc.), ambiguities, and shared spectra as described previously (14). This information includes MOWSE scores, number of matching peptides, number of matched MS/MS spectra (counts), number of unique and adjusted counts, highest peptide score, highest peptide error (in ppm), lowest absolute error (ppm), sequence coverage, and tryptic peptide sequences. This information is available in the PPDB (43) by using the search function "Proteome Experiments" and selecting the desired output parameters; this search can be restricted to specific experiments. Alternatively information for specific accessions (either individually or a group) can be extracted using the search function "Accessions," and if desired, this search can be limited to specific experiments. Finally information for a particular accession can also be found on each "protein report page." The MapMan bin system (50) was used for functional assignment, and proteins were reassigned to other bins if needed.

Large Scale Leaf Protein Quantification of wt and clpr2-1 Seedlings by Spectral Counting-To determine the effect of reduced CLPR2 expression on the developing leaf proteome, clpr2-1 and wt plants were grown side by side, and seedlings were harvested at growth stages 1.07 and 1.14 (stages are as defined in Ref. 51) when plants have, respectively, 7 and 14 leaves (Fig. 1A). Because clpr2-1 is delayed in its development, a comparison of identical growth stages rather than plant age has most biological relevance (51,52) as was also demonstrated in our proteome analysis of the developmentally delayed cpSRP54 mutant (18). Because we were interested in those clpr2-1 responses that are consistently present in young seedlings with rapidly expanding leaves and to ensure that we recorded the most significant response to reduced ClpR2 accumulation, we treated the clpr2-1/wt comparisons for stages 1.07 and 1.14 as equivalent biological replicates. Total leaf proteomes of wt and clpr2-1 plantlets were extracted with SDS, and 400 µg of each proteome were run out on an SDS-PAGE gel and stained by Coomassie Brilliant Blue (Fig. 1B). Each gel lane was cut in slices followed by in-gel trypsin digestion and protein identification by on-line nanoliquid chromatography-tandem mass spectrometry (nano-LC-MS/MS) with an LTQ-Orbitrap (Fig. 1B).
Three biological replicates were analyzed (two from stage 1.07 and one from stage 1.14) with two technical replicates for each sample. The technical replicates served to (i) increase the number of SPCs per protein, (ii) ensure that each run had no abnormalities by comparing the two technical runs, and (iii) reduce variation associated with the LC-MS/MS acquisition. The significance G-tests were done on the sum of the technical replicates. If a biological replicate passed the G-test but the clpr2-1/wt ratio was inconsistent in direction (above or below 1) between the two technical replicates, such a data point was considered not significant. Thus the technical replicates increased the stringency of the analysis. More than 180,000 spectra were matched to ~2800 identified proteins (at <1% false positive peptide identification rate), corresponding to an average of 64 spectra per protein (not shown, but see PPDB).
It has been shown for LC-MS-based analyses of digested proteomes that the number of matching MS/MS spectra (SPCs) correlates with protein abundance if there are sufficient SPCs per protein (14, 53-56). To determine differential protein accumulation between clpr2-1 and wt, we used SPC adjusted for shared peptides (see "Experimental Procedures"). Proteins for which the SPC was less than 10 within a pairwise clpr2-wt comparison were excluded from the quantitative analysis because G-tests determined that this is not sufficient for accurate quantification; this removed 72% of the identified proteins with 768 proteins remaining (supplemental Table 1). Evaluation of predicted and annotated subcellular locations (from PPDB (43)) showed that 50% of the 768 quantified proteins are chloroplast-localized, 6-9% of the quantified proteins are localized to mitochondria, and 9% have a predicted signal peptide for targeting to the endoplasmic reticulum (Table I). The remaining proteins are localized in the cytosol or have no known location. Proteome homeostasis functions (synthesis, folding, and degradation) are the largest quantified functional category (205 of 768) followed by the photosynthetic light reaction (59 of 768) and amino acid metabolism (46 of 768). 115 quantified proteins have either no known function or a miscellaneous function. An overview of the functional distribution of the quantified proteins is provided in supplemental Table 4.
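The adjustment for shared peptides and the cutoff of 10 SPCs can be illustrated with a minimal Python sketch. It assumes that spectra from a shared peptide are divided among the accessions that share it in proportion to their unique SPCs, and that they are split equally when none of those accessions has a unique spectrum (the latter is an assumption not stated in the text). The function and argument names are hypothetical.

```python
def adjusted_spc(unique_spc, shared_spc):
    """Distribute shared spectral counts in proportion to unique counts.

    unique_spc: dict mapping accession -> unique spectral count.
    shared_spc: list of (accessions, count) pairs, where accessions is a
                tuple of the accessions matched by one shared peptide.
    Returns a dict mapping accession -> adjusted spectral count.
    """
    adjusted = {acc: float(count) for acc, count in unique_spc.items()}
    for accessions, count in shared_spc:
        weights = [unique_spc.get(acc, 0) for acc in accessions]
        total = sum(weights)
        for acc, weight in zip(accessions, weights):
            # Equal split when no sharing accession has unique spectra.
            share = weight / total if total > 0 else 1.0 / len(accessions)
            adjusted[acc] = adjusted.get(acc, 0.0) + count * share
    return adjusted

def quantifiable(adj_mut, adj_wt, min_total=10):
    """True if a pairwise comparison has enough adjusted SPCs in total."""
    return adj_mut + adj_wt >= min_total
```

For example, with 30 unique spectra for accession A, 10 for accession B, and 8 spectra from a peptide shared by both, A receives 6 of the shared spectra and B receives 2.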
The clpr2-1/wt ratios for the 768 proteins were calculated within each replicate. Proteins identified in only one genotype were assigned a clpr2-1/wt ratio of 10.1 (absent in wt) or [1/10.1] (absent in clpr2-1). To determine whether proteins were significantly up- or down-regulated in either developmental stage, a G-test with Benjamini-Hochberg correction was carried out for each biological replicate. Only those proteins that showed significant (95% confidence after correcting for multiple testing) and consistent changes across at least two of the three biological replicates were considered significantly affected (supplemental Table 1). 25 proteins were up-regulated, whereas 18 proteins were significantly down-regulated (Table II). All but two of the affected proteins were chloroplast-localized, indicating that the dominating effect of reduced accumulation of the ClpPR complex is within the chloroplast. Most of the up-regulated proteins were involved in protein homeostasis and secondary metabolism, whereas most of the down-regulated proteins were involved in the dark and light reactions of photosynthesis (Table II and supplemental Table 4) as will be discussed in more detail below. All quantified proteins are shown for each biological replicate in Fig. 2, which cross-correlates the number of spectral counts for each protein in wt and clpr2-1. Significantly changed proteins that passed the G-test are marked. The plots clearly show that as the number of spectral counts increases smaller fold changes can be detected. For example, in the range of 30-60 counts, fold changes of 2.5 or more are significant, whereas in the range of 60-500 spectral counts 2-fold changes are frequently significant. For highly abundant proteins, such as AtpA, AtpB, and Rubisco large subunit, changes as low as 10% could be detected.
Comparative Proteome Analysis of wt and clpr2-1 Stroma Extracted from Isolated Chloroplasts of Mature Plants-In fully developed leaves, chloroplast biogenesis is complete, and the need for Clp proteolysis is likely to be reduced. To better define the role of the Clp complex under these steady-state conditions, we analyzed the chloroplast stromal proteome of clpr2-1 and wt in fully developed leaves (both in growth stage 3.90) using differential stable isotope labeling of cysteine residues with cleavable ICAT (57) and by quantification using spectral counting; these were independent and complementary experiments.

ClpR2 Protease in A. thaliana
The ICAT analysis was carried out similarly as described for the Arabidopsis cpSRP54 chloroplast sorting mutant (18), and for more details see "Experimental Procedures." 310 ICAT-labeled peptides were manually quantified matching to 129 non-redundant proteins or small gene families; additional proteins were identified but not quantified due to lack of cysteine-containing peptides (see "Discussion"). Supplemental Table 2 provides a complete list of these labeled peptides with their sequence, retention time, integrated peak area, and other details. A number of closely related chloroplast proteins were not quantified individually but as closely related family members. 70 proteins or protein families were quantified by one peptide pair, and 64 proteins were quantified by two or more peptide pairs (supplemental Table 2). We did not detect unlabeled cysteine-containing peptides (even after searching for acrylamide adducts to Cys), indicating that the ICAT labeling was saturating. Consistently the ratio of total peak volume of all heavy and all light ICAT-labeled peptides was close to unity (1.08). The coefficient of variation (CV) of the clpr2-1/wt peptide ratios for proteins identified by at least two peptides was on average 0.13, indicating highly consistent quantifications between peptides matching to the same protein.
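The per-protein coefficient of variation reported above can be computed as in the following Python sketch. The use of the sample standard deviation is an assumption, because the text does not state which estimator was used, and the function names are hypothetical.

```python
from statistics import mean, stdev

def ratio_cv(peptide_ratios):
    """Coefficient of variation (sample SD / mean) of the clpr2-1/wt
    ratios of the peptide pairs assigned to one protein."""
    if len(peptide_ratios) < 2:
        raise ValueError("at least two peptide ratios are required")
    return stdev(peptide_ratios) / mean(peptide_ratios)

def mean_cv(protein_to_ratios):
    """Average CV over all proteins quantified by two or more peptide
    pairs; proteins with a single peptide pair are skipped."""
    cvs = [ratio_cv(ratios) for ratios in protein_to_ratios.values()
           if len(ratios) >= 2]
    return mean(cvs)
```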
For the label-free approach, 400 µg of each stromal proteome were run out on SDS-PAGE and processed in a similar fashion as for the total seedling extracts but using a nano-LC-Q-TOF mass spectrometer with comparable chromatography. Only those proteins with more than 10 SPCs were used for quantification (supplemental Table 3). Four proteins that are not chloroplast-localized as well as seven lumenal and peripheral thylakoid membrane proteins were removed from the stromal analysis. 95 stromal proteins could thus be quantified, and all proteins except ClpB3 (only found in clpr2-1 with 12 peptides) were found in both wt and clpr2-1 stroma (supplemental Table 3), showing that the wt and clpr2-1 stromal proteomes of fully developed plants are similar and that our experimental analysis was robust.
The ICAT-based and spectral counting-based quantifications showed a good linear correlation (not shown, but see Table II). Proteins quantified by both methods covered in particular the Calvin cycle (15 proteins), protein homeostasis (13 proteins), starch metabolism (four proteins), and nitrogen metabolism (three proteins) (Table II). Most of the up-regulated proteins were chaperones (CPN60, cpHSP70, cpHSP90, and ClpB3) and elongation factors (EF-TU-G; EF-TU-1, EF-TU-TypA/BipA), results very similar to those of the seedling analysis (Table II) as will be discussed below.
FIG. 1. Comparative proteome analysis of wt and clpr2-1 and CLPB3 mutant analysis. A, outline of the comparative proteome analysis of clpr2-1 and wt. Total leaves were harvested at growth stages 1.07 and 1.14 at, respectively, 18 and 23 days for wt and 34 and 46 days for clpr2-1 for the comparison of total cellular proteomes. Soluble stromal proteins were collected from chloroplasts isolated from fully developed wt and clpr2-1 plants at growth stage 3.90 at 42 and 90 days, respectively, for wt and clpr2-1. The seedling proteomes were compared using the spectral counting technique from data obtained by nano-LC-ESI-LTQ-Orbitrap, whereas the stromal proteomes were compared using spectral counting and ICAT using data obtained by nano-LC-ESI-Q-TOF. B, example of 1-D electrophoresis gels of extracted seedling proteomes at stage 1.07 and stage 1.14. Gels were stained with Coomassie Brilliant Blue, and each lane was cut in 12 slices followed by manual in-gel digestion and extraction of peptides. Samples were analyzed in duplicate on the LC-LTQ-Orbitrap following the gradient and injection scheme as depicted. Two blanks were injected after each sample to prevent carryover between samples. C, comparison of the wt, clpr2-1, clpb3-1, and clpb3-1xclpr2-1 (b3-1 x r2-1) mutants grown on Murashige and Skoog medium + 2% sucrose for 10 weeks under a 10-h light/14-h dark cycle at 60 µmol photons·m⁻²·s⁻¹. DDA, data-dependent acquisition.

Chaperones and Isomerases-CPN60-α,β (GroEL homologues) and cpHSP70-1,2 (DnaK homologues) are the chloroplast chaperones that provide the central folding activity in the chloroplast (58). They were all ~2-fold increased in clpr2-1 both in young seedlings and mature chloroplasts. HSP90 proteins are typically involved in the late folding steps of proteins (59), and an Arabidopsis cpHSP90 mutant, cr88, was delayed in chloroplast development in cotyledons and young leaves (60). cpHSP90 was on average 2.3-fold up-regulated in young clpr2-1 seedlings and also up-regulated in chloroplasts isolated from mature clpr2-1 leaves (Fig. 3A and Table II). Chaperones ClpC-1,2 are members of the HSP100 superfamily of ATP-dependent chaperones, and ClpC1,2 are believed to deliver substrates to the ClpPR core complex; they also have a function in chloroplast protein import through association with the central chloroplast envelope import component, Tic110 (see the Introduction) (61). Accumulation of ClpC-1,2 significantly increased on average 1.7-fold in clpr2-1 in the seedlings and 2-fold in mature chloroplasts (Fig. 3A). Chloroplast ClpB3 is also a member of the HSP100 family, but ClpB3 does not contain the conserved tripeptide domain (IGF) needed for interaction with the ClpPR core (13); hence ClpB3 is highly unlikely to deliver any substrate to the ClpPR core.
The main function of ClpB in E. coli is protein disaggregation aided by HSP70/GrpE (62). ClpB3 was on average 5.5-fold up-regulated in the seedlings and was also strongly up-regulated (>4-fold) in mature chloroplasts (Fig. 3A and Table II). ROC4 is a very abundant stromal peptidyl-prolyl isomerase (14,63) with in vitro rotamase activity (64). Its activity is regulated by redox reagents (21,22), and it is a regenerator of chloroplast peroxiredoxins (22). Recently two additional functions were suggested, namely the folding of stromal serine acetyltransferase, thereby enabling the cysteine-based thiol biosynthesis pathway to adjust to light and stress conditions (65), and repair of damaged photosystem II (66). ROC4 was 2-fold up-regulated in the clpr2-1 seedlings but returned to wt levels in mature leaves (Fig. 3A and Table II). Thus within the groups of chaperones and isomerases, ClpB3 showed the most pronounced up-regulation.
FIG. 2. Quantification of clpr2-1 and wt seedling proteomes using spectral counting. Cross-correlation between the number of spectral counts in clpr2-1 and wt and significant changes for the 768 quantified proteins using log-scaled scatter plots are shown. A, 711 proteins quantified in the 1.07 growth stage seedlings, biological replicate 1. B, 711 proteins quantified in the 1.07 growth stage seedlings, biological replicate 2. C, 537 proteins quantified in the 1.14 growth stage. Horizontal and vertical coordinates correspond to the SPCs measured in clpr2-1 and wt, respectively. Closed dark circles correspond to proteins that showed significant changes in accumulation levels as determined by the G-test (95% confidence) and passed the requirement of consistent direction of change across both technical replicates. Open gray circles correspond to the quantified proteins that either showed no statistical evidence for differential accumulation or failed the consistency requirement. All proteases are indicated by squares with black filled squares indicating those that are significant. Two gray lines marking 2 and 0.5 clpr2-1/wt ratios are shown. Names of selected proteins are indicated. Abbreviations not provided in the main text are: MDAR, monodehydroascorbate reductase; P5CS A, Δ1-pyrroline-5-carboxylate synthetase A; EF1Bα2, elongation factor 1β subunit α-2; EDGP, extracellular dermal glycoprotein; NDH-H, NADPH dehydrogenase complex subunit H; PPR, pentapeptide repeat protein At5g46580; CYS-synth, cysteine synthase; RBCL, Rubisco large subunit; PC, plastocyanin.
Protein Sorting and Translocation-17 proteins involved in protein sorting and translocation were quantified; nine of them were chloroplast-localized (supplemental Table 1). Among those, only Tic110, a central component of the protein import machinery in the chloroplast inner envelope (61,67), was significantly and strongly (>3-fold) increased in the clpr2-1 seedlings (Table II and Fig. 3A).
Proteolysis-We quantified 38 proteins in the seedlings that were involved in protein degradation (supplemental Table 1); two of them were significantly affected (ClpC-1,2 and PreP1,2), and they were each chloroplast-localized (Table II and Fig. 3A). ClpP1,4,5,6, ClpR3,4, and ClpT1,2 were each quantified in all three biological replicates, but they were not significantly changed (supplemental Table 1). The up-regulation of chaperones ClpC-1,2 was already mentioned, and in addition to a role in import, they are believed to deliver substrate to the ClpPR cores. Stromal Zn2+-proteases PreP1,2 (also named ZnMP1,2) were suggested to be involved in degradation of cleaved chloroplast transit peptides (3,68,69) and were on average 4-fold up-regulated in the seedlings, but they were unchanged in the mature chloroplast as determined by ICAT and in agreement with the colorless native PAGE analysis of mature clpr2-1 (36).
FIG. 3. Bar diagram displaying the clpr2-1/wt protein accumulation ratios (in log scale) for the significantly affected proteins involved in protein homeostasis (A), photosynthetic electron transport located in the thylakoid (B), and primary and secondary metabolism as well as proteins located in plastoglobules (C). The bars represent quantifications for the different biological replicates for the seedlings at growth stages 1.07 and 1.14 and for chloroplast stroma isolated from mature leaves. The standard deviations show the variation between the technical replicates. Bars marked with an asterisk (*) indicate a significant change for that biological replicate as determined by the G-test with 95% confidence. RBCL, Rubisco large subunit; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; Fd-GOGAT, ferredoxin-dependent glutamate synthase; RBCS, Rubisco small subunit; LHC, light-harvesting complex; GGDR, geranylgeranyl reductase; WDK, water dikinase.
Chloroplast Protein Synthesis and Ribosome Assembly-A large group of 117 proteins involved in protein synthesis was quantified; many of them belong to 80 S cytosolic or 70 S chloroplast ribosomes (49 and 37 proteins, respectively) (supplemental Table 1). Three of these proteins, all localized in the chloroplast, were significantly up-regulated in both young and mature leaves: DEAD box RNA helicase RH3, elongation factor EF-TU-1, and the GTPase TypA/BipA. EF-TU-1 is a general elongation factor, but the maize homologue was suggested to also serve as a chaperone in particular during heat stress (70,71). GTPase TypA/BipA is a specialized ribosome-associated translation factor (72) and was on average 5-fold up-regulated, whereas EF-TU-1 was 1.6-fold up-regulated. No significant change in the chloroplast ribosomal protein population was observed even though many ribosomal proteins were quantified. Several chloroplast biogenesis mutants show defects or delays in the processing and maturation of chloroplast rRNA molecules (73-75). These rRNAs are encoded by the chloroplast genome, and they are transcribed as a polycistronic transcript that undergoes extensive splicing and processing. In the case of a subset of biogenesis mutants such as dal (76), rnr1 (77,78), and clpr1-1 (26), a specific delay in maturation of the dicistronic 23 S-4.5 S intermediate was observed. To investigate possible defects in chloroplast rRNA processing in clpr2-1, we separated total leaf ribosomes on sucrose density gradients and analyzed the rRNA population by gel separation and ethidium bromide staining (Fig. 4A). The ribosome profiles of wt and clpr2-1 were very similar except that two distinct unprocessed higher molecular mass rRNA molecules accumulated in clpr2-1 but not in wt, indicating defects in rRNA processing (Fig. 4A).
To determine the exact nature of these additional rRNA species, total RNA was extracted from wt and clpr2-1 seedlings at stages 1.08 and 1.14 and analyzed by Northern blots using specific probes against each of the chloroplast rRNA species (Fig. 4B). This showed that the 5 and 16 S rRNA molecules were unaffected, whereas both 4.5 S rRNA and 23 S rRNA processing were partially affected with higher molecular mass species accumulating (Fig. 4B and data not shown).

The Photosynthetic Electron Transport Chain and Associated Plastoglobule Particles-59 thylakoid proteins of the photosynthetic electron transport chain (PSI, PSII, cytochrome b6f, and ATP synthase) were quantified in the seedlings (supplemental Table 1). 21 of them were significantly down-regulated and include proteins of both photosystems and the ATP synthase (Fig. 3B and Table II). This is consistent with the delayed greening shown in Fig. 1A. We note that the lumenal OEC proteins were among the most significantly down-regulated proteins, possibly suggesting competition for thylakoid protein translocation capacity similar to that observed for secretory proteins in E. coli when membrane proteins are overexpressed (79). Although lumenal isomerase TLP40 and lumenal DegP1 protease also had low clpr2-1/wt ratios, several other lumenal proteins were not down-regulated (e.g. HCF136).
Plastoglobules (PGs) are thylakoid-associated lipoprotein particles that play a role in quinone, tocopherol, and carotenoid metabolism and storage and also play a role in stress defense and chlorophyll and thylakoid membrane turnover (80). Microscopy analysis showed that PG size and number increased in the clpr2-1 mutant (36). Fibrillins, also named plastoglobulins, are structural proteins of the PG and likely serve to stabilize the PG and control their size. We quantified nine of the 13 known fibrillins, but only FIB1A and FIB1B were significantly up-regulated (4-5-fold) (Fig. 3C and Table II). The other fibrillins were generally not much affected (clpr2-1/wt ratios between 1 and 2) except for FIB8 that seemed enriched in clpr2-1 (supplemental Fig. 1).
Primary and Secondary Metabolism-12 proteins involved in primary and secondary metabolism were found to be significantly affected in clpr2-1 (Fig. 3C and Table II). All proteins, except for a GDSL-like lipase/acylhydrolase with unknown location, were chloroplast-localized. Three of these proteins were down-regulated and are involved in the Calvin cycle, whereas the up-regulated proteins are involved in different metabolic pathways, including metabolism of isoprenoids, tetrapyrroles, thiamine (vitamin B1), and starch.
The Calvin Cycle-We quantified 20 Calvin cycle enzymes (or pairs of homologues); most of them were in all three biological seedling replicates and also in the stroma of mature leaf chloroplasts (supplemental Fig. 2 and supplemental Tables 1-3). Three of the Calvin cycle enzymes (Rubisco large subunit, Rubisco small subunit, and glyceraldehyde-3-phosphate dehydrogenase A1) were slightly (20-45%), but significantly, decreased in the seedlings (Fig. 3C and Table II).
Starch Metabolism-10 proteins involved in starch synthesis (two ADP-glucose pyrophosphorylase subunits) and degradation (kinases, dikinases, phosphorylases, amylases, and transglucosidase) were quantified in the seedlings (supplemental Table 1). One of them, the chloroplast-localized glucan water dikinase 1 (GWD1) (also known as SEX1 or the R protein), was significantly up-regulated (~6-fold) in the seedlings but not in the mature leaves (Fig. 3C and Table II). A second dikinase (phosphoglucan water dikinase) was quantified and was similarly increased, but due to lower spectral counts (35 in total), the change was not significant (supplemental Table 2). These dikinases are involved in control of phosphorolytic starch degradation during the night (81), and their up-regulation in young, but not mature, leaves likely relates to the lower starch accumulation levels observed in the young clpr2-1 leaves (36).
Thiamine Biosynthesis-Thiamine pyrophosphate (also named vitamin B1) is synthesized in plastids and is an essential cofactor for several enzymes in central carbon metabolism, such as pyruvate dehydrogenase, transketolase, pyruvate decarboxylase, and α-ketoglutarate dehydrogenase (82). We quantified two key enzymes in thiamine biosynthesis, hydroxymethylpyrimidine phosphate synthase (THIC) and hydroxyethylthiazole synthase (THI1), in the seedlings (supplemental Table 1). Both were on average 5-fold up-regulated, and the up-regulation of THI1 was statistically significant (Fig. 3C and Table II). Supplementation of the growth medium with different concentrations of thiamine hydrochloride did not result in any phenotypic complementation, indicating that lack of thiamine does not cause the pale green clpr2-1 (and clpb3-1) phenotype and developmental delay (data not shown).
Isoprenoid Biosynthesis and Tetrapyrrole Metabolism-Isoprenoids are central in plant development, and the universal precursors are isopentenyl diphosphate and its isomer dimethylallyl diphosphate. They are derived from acetyl-CoA in the cytosol through the mevalonate pathway or from
FIG. 4. rRNA processing is affected in clpr2-1 seedlings. To determine whether the clpr2-1 mutant showed any defects in 70 S chloroplast rRNA processing and ribosome assembly, rRNA populations from stages 1.07 or 1.08 and stage 1.14 clpr2-1 and wt plants were determined. A, sucrose density centrifugation and RNA analysis by ethidium bromide staining. The two cytosolic 18 and 25 S rRNAs are clearly visible in 40 and 60 S particles and assembled 80 S particles. Two additional, high molecular mass bands (between the 18 and 25 S bands) are visible in the clpr2-1 seedlings and represent unprocessed rRNA species. B, Northern blot analysis of total RNA extracted from wt and clpr2-1 seedling leaves. RNA was blotted to membrane and stained by methylene blue or analyzed with specific probes against chloroplast ribosomal RNA molecules derived from genes rrn23, rrn4.5, rrn16, and rrn5 as indicated above each panel. RNA species accumulating in clpr2-1 are indicated by asterisks and arrows.
In the seedling analysis, we quantified five enzymes in the plastid isoprenoid pathway, and they are 1-deoxy-D-xylulose-5-phosphate reductoisomerase, 4-hydroxy-3-methylbutyl diphosphate synthase (HDS; also named GcpE or CLB4) (both in the MEP pathway), 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase involved in the conversion of isopentenyl diphosphate into dimethylallyl diphosphate (84), geranyldiphosphate synthase (AtSPS2) (85), and geranylgeranyl reductase (GGR) (86) (Fig. 5A). HDS was significantly and strongly up-regulated (on average 9-fold) in the seedlings and was also 50% increased in fully developed plants as determined by both ICAT and spectral counting (Fig. 3C and Table II). HDS is thylakoid-associated and can accept electrons directly from the photosynthetic machinery in the light (87). Although it did not pass our significance test, 1-deoxy-D-xylulose-5-phosphate reductoisomerase levels were ~2-fold higher in the 1.07 seedlings but down-regulated in the 1.14 seedlings and in the chloroplasts of mature plants (ICAT data).
Western blot analysis of 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase, immediately downstream of 1-deoxy-D-xylulose-5-phosphate synthase and three steps upstream of HDS, showed an increase of ~2-fold in clpr2-1 stage 1.07 (Fig. 5B) as well as stage 1.14 (not shown). GGR, downstream of the MEP pathway, was on average 1.9-fold up-regulated in the seedlings. GGR is involved in reduction of geranylgeranylchlorophyll into chlorophyll a as well as the reduction of free geranylgeranyl diphosphate into phytyl diphosphate used for chlorophyll, tocopherol, and phylloquinone synthesis (86, 88) (Fig. 5A). GGR also associates with the thylakoid membrane, and electrons required for GGR function are likely generated by the photosynthetic electron transport chain, similar to HDS. We quantified 17 proteins in tetrapyrrole synthesis and two proteins involved in chlorophyll degradation in the seedlings; they showed a median clpr2-1/wt ratio of 1.5 (supplemental Fig. 3). Only Mg2+-protoporphyrin IX chelatase subunits H (genome uncoupled mutant 5 (GUN5)) and I-1 were found to be significantly up-regulated (>5-fold) (Fig. 3C and Table II); Mg2+-chelatase subunit D was higher, but it did not pass the significance test (Fig. 3C and Table II).
Other Metabolic Functions-We found three additional metabolic enzymes that were significantly up-regulated: these are chloroplast-localized pyruvate kinase-2 involved in the ATP-generating step in the production of pyruvate during glycolysis (>3-fold up), ferredoxin-dependent glutamate synthase involved in nitrogen assimilation (1.7-fold up), and GDSL-like lipase/acylhydrolase (1.6-fold up) with unknown location (Fig. 3C and Table II). The up-regulation of pyruvate kinase-2 fits well with the limited amount of photosynthesis and the contribution of glycolysis in generating extra ATP to compensate for reduced electron transport and coupled ATP production. The increase of ferredoxin-dependent glutamate synthase is likely a consequence of limited electron transport and may reflect feedback regulation to better compete for electrons needed for nitrogen assimilation.
Genetic Interaction between ClpR2 and ClpB3-The comparative proteomics analysis showed that in particular ClpB3 was strongly increased in clpr2-1. ClpB3 is the chloroplast homologue of the bacterial ClpB protein (89), which unfolds aggregated proteins in the bacterial cytosol, aided by the DnaK chaperone system (homologous to cpHSP70) (90,91). This suggests that the clpr2-1 mutant suffers from protein aggregation. To determine the functional significance of these increased levels of the ClpB3 protein further, we obtained a T-DNA-tagged mutant in CLPB3 (clpb3-1) with a T-DNA insertion in the first of the nine exons similar to that characterized previously (92). We confirmed that homozygous clpb3-1 did not express any CLPB3 transcript (supplemental Fig. 4A). The homozygous clpb3-1 null mutant was seedling-lethal under autotrophic conditions (not shown) but could be rescued when supplied with sucrose and moderate light (Fig. 1C). These heterotrophically grown plants showed a virescent phenotype that partially recovered. Once seedlings were established under these heterotrophic conditions for ~4 weeks, they could be transferred to soil, and they produced viable seeds (not shown). clpb3-1 was crossed with clpr2-1, and double homozygous mutants were recovered in the F2/F3 generation (Fig. 1C and supplemental Fig. 4B). The double mutant was blocked in development after emergence of the cotyledons under autotrophic conditions (not shown) but could be partially rescued under the heterotrophic conditions (2% sucrose) (Fig. 1C). Under these heterotrophic conditions, the double mutant developed and greened slowly (Fig. 1C) and produced flowers but no seeds (not shown). The synergistic effect between the lack of ClpB3 and reduced accumulation of ClpR2 is consistent with the observed increase in ClpB3 protein in clpr2-1 and supports the idea that reduced Clp levels lead to increased protein aggregation.
This also suggested that the increase in ClpB3 is not because it is a substrate of the Clp protease.
DISCUSSION
In our previous study, we showed that the clpr2-1 mutant has a 2-3-fold reduced accumulation of the 325-kDa ClpPRT complex in mature leaves (36). The clpr2-1 mutant is highly suitable to better determine the role of the Clp machinery in chloroplast biogenesis and homeostasis and to possibly discover potential Clp substrates because clpr2-1 is a stable T-DNA insertion line with a clear, but moderate, phenotype, and plants can grow autotrophically on soil. In the current study, we compared the leaf proteome of clpr2-1 and wt seedlings early in development when the clpr2-1 phenotype is most visible using the spectral counting technique. This was complemented with a comparative analysis of the chloroplast proteome of fully developed rosette leaves using spectral counting and ICAT labeling. After a brief discussion of the applied methodologies and the importance for chloroplast research, we discuss the quantitative data in the context of substrates of the Clp protease system, the chloroplast protein homeostasis network, and feedback on metabolic pathways.
Spectral Counting of Total Leaf Proteomes Is an Excellent Tool to Study Chloroplast Mutants-Many proteins are involved in chloroplast biogenesis, and their loss of expression often leads to developmental delays with the strongest phenotypes during early leaf development. It is therefore important to study the chloroplast proteome in young, small seedlings rather than in larger, fully developed plants. Although it might be possible to isolate intact chloroplasts from these young seedlings, it is far more practical if one can assess the chloroplast without actual fractionation with the added benefit that the cellular response outside the chloroplast can also be assessed. The current study demonstrates that the MS-based spectral counting technique, using a fast and accurate MS instrument, does facilitate comparative analysis of unfractionated leaf proteomes. Furthermore it is a successful strategy to study the response of the chloroplast proteome because chloroplast proteins are relatively abundant, and more than 1000 Arabidopsis proteins have been assigned with confidence to the plastid (14,43), making chloroplast isolation generally unnecessary for protein localization assignment. Because quantification is done by counting matching peptides, it is important to keep the false positive peptide identification rate low (e.g. <1%) and to acquire as many MS/MS spectra (counts) as possible. Therefore, using a fast and accurate MS instrument, such as the LTQ-Orbitrap, is optimal to maximize sampling depth of the complex proteome. We note that we used the Orbitrap at its highest resolution (100,000) because our pilot experiments showed that for the high complexity Arabidopsis protein mixtures and the peptide amounts used in our experimental setup this was beneficial in two ways.
1) The total amount of MS/MS data recorded was lower, but the quality of the data was higher as also reflected by a higher percentage of matched MS/MS spectra of the total acquired spectra, and 2) it resulted in a higher mass accuracy in particular for ions with higher charge states, allowing us to use lower peptide ion score cutoffs (33) to arrive at a 1% false positive rate.
As a "rule of thumb," the more total spectral counts that are observed for a given protein, the smaller the -fold changes that can be detected by a statistical test. This was evident from Fig. 2, which cross-correlated the number of spectral counts for each protein in wt and clpr2-1. A further advantage of spectral counting is the straightforward implementation of statistical tests of significance (55,93). The G-test of independence becomes the method of choice to determine significant changes in protein levels in pairwise comparisons when only a few biological replicates are available, yet the number of simultaneously tested hypotheses is large (55). Previously we implemented the G-test to assess reproducibility of spectral counting between technical and biological replicates of LC-MS analysis of the wild type chloroplast proteome (14). Here we applied G-tests, corrected for multiple testing (see "Experimental Procedures"), to determine significance (at the 95% level) of protein level changes for each of the biological replicates analyzed.
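The rule of thumb above can be illustrated with a minimal sketch of a G-test of independence on spectral counts. It assumes a 2 x 2 contingency table per protein (counts for the protein versus all other counts, in wt versus clpr2-1) and a chi-square approximation with one degree of freedom. The function name, the example counts, and the per-sample totals are hypothetical, and the multiple-testing correction described under "Experimental Procedures" is not reproduced here.

```python
import math


def g_test_spectral_counts(spc_wt, spc_mut, total_wt, total_mut):
    """G-test of independence on a 2 x 2 table of spectral counts.

    Rows: the protein of interest vs. all other proteins.
    Columns: wt vs. mutant. Returns (G, p), with p taken from a
    chi-square distribution with 1 degree of freedom.
    All names and inputs are illustrative assumptions.
    """
    observed = [
        [spc_wt, spc_mut],
        [total_wt - spc_wt, total_mut - spc_mut],
    ]
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_sums)

    g = 0.0
    for i in range(2):
        for j in range(2):
            obs = observed[i][j]
            if obs == 0:
                continue  # the term O * ln(O / E) tends to 0 as O -> 0
            expected = row_sums[i] * col_sums[j] / grand_total
            g += 2.0 * obs * math.log(obs / expected)

    g = max(g, 0.0)  # guard against tiny negative rounding errors
    # Survival function of chi-square with 1 d.f.: erfc(sqrt(G / 2))
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Same 2-fold ratio at two sampling depths (hypothetical numbers)
g_low, p_low = g_test_spectral_counts(5, 10, 10_000, 10_000)
g_high, p_high = g_test_spectral_counts(50, 100, 10_000, 10_000)
print(f"5 vs 10 counts:   G = {g_low:.2f}, p = {p_low:.3g}")
print(f"50 vs 100 counts: G = {g_high:.2f}, p = {p_high:.3g}")
```

With these hypothetical inputs, the same 2-fold ratio does not reach the 95% level for the low-count protein but does for the high-count protein, which is the behavior the rule of thumb describes.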
Comparative Proteome Analysis of Isolated Stroma by ICAT and Spectral Counting-In the original study in which we described the isolation of the clpr2-1 mutant, we analyzed the soluble stromal proteome of isolated chloroplasts from fully developed plants by native gel analysis (colorless native PAGE) followed by a second dimension by SDS-PAGE and subsequent image analysis (36). This showed that the stromal proteome of clpr2-1 was generally qualitatively and quantitatively similar to wt with the exception of up-regulated CPN60, but the gel-based quantification lacked the accuracy and/or resolution needed to detect smaller changes and/or resolve mixtures of specific proteins. To obtain a more accurate overview and to determine to what extent proteome homeostasis recovered in these older plants as compared with the young seedlings, we applied two MS-based techniques, ICAT and spectral counting, and used a Q-TOF hybrid MS instrument. The soluble stromal proteome is at least 10-fold less complex than the total leaf proteome, and we expected that this reduced complexity would compensate for the lower MS/MS sampling rate of the Q-TOF as compared with the LTQ instrument, in particular if similar sample loading and on-line chromatography were used. Indeed the average number of counts per identified protein per wt + clpr2-1 replicate was very similar for the LTQ and Q-TOF analyses (on average 19 counts/protein for Q-TOF and 10-20 counts/protein for LTQ replicates), whereas the number of identified proteins per replicate was 5-6-fold higher for the total leaf proteome analysis than the stromal proteome analysis. We also chose to use ICAT labeling because we observed low CVs and good accuracy in a separate study on a different chloroplast (sorting) mutant (18). Indeed also in our current study we obtained an excellent consistency for ICAT quantification from different peptides matching to the same protein as reflected by the low CV of 0.13.
Comparison of the ICAT data with the spectral counting data showed a good linear correlation with the ICAT analysis giving consistently slightly more pronounced clpr2-1/wt protein accumulation ratios. One known drawback of the ICAT technique is that labeling requires cysteine residues in the protein. Because we also analyzed the unlabeled peptides (lacking cysteine residues) from the flow-through of the avidin columns, we were able to explore why a significant number of identified proteins were not quantified: a strong positive correlation was observed between quantification success rate and the theoretical number of cysteine-containing tryptic peptides (from in silico digestion) (supplemental Fig. 5) as well as protein abundance as measured by the MOWSE protein score (supplemental Fig. 5, inset). Thus ICAT and spectral counting techniques are complementary, and as such, both are valuable tools in quantitative proteomics.
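The dependence of ICAT quantification on cysteine content can be illustrated with a small in silico digestion sketch. It assumes the commonly used trypsin rule (cleavage C-terminal to K or R unless the next residue is P) with no missed cleavages; the function names, the `min_length` parameter, and the toy sequence are hypothetical and do not reproduce the digestion settings used for supplemental Fig. 5.

```python
def tryptic_peptides(sequence, min_length=1):
    """Cleave after K or R unless followed by P (no missed cleavages).

    The cleavage rule and min_length are assumptions for illustration.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        at_end = i == len(sequence) - 1
        next_is_proline = (not at_end) and sequence[i + 1] == "P"
        if at_end or (residue in "KR" and not next_is_proline):
            peptide = sequence[start:i + 1]
            if len(peptide) >= min_length:
                peptides.append(peptide)
            start = i + 1
    return peptides


def count_cys_peptides(sequence, min_length=1):
    """Number of tryptic peptides that contain at least one cysteine."""
    return sum("C" in pep for pep in tryptic_peptides(sequence, min_length))


toy_protein = "ACKPGCRMMKCAR"  # hypothetical sequence, not a real protein
print(tryptic_peptides(toy_protein))
print(count_cys_peptides(toy_protein))
```

A protein with few or no cysteine-containing tryptic peptides offers few or no ICAT-labeled peptides, so counting them per protein gives a rough predictor of whether ICAT quantification can succeed.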
Summarizing Overview for the Response of the Chloroplast Proteome to Reduced ClpR2 Accumulation: the Importance of a Systems View of the Chloroplast-The consequences of reduced ClpR2 accumulation on the chloroplast proteome are summarized in Fig. 6; significantly changed processes or proteins are indicated by numbers with details provided in the figure legend. As indicated in Fig. 6, the chloroplast proteome requires expression, sorting, and assembly of nucleus- and chloroplast-encoded proteins. Accumulation of these two groups is co-regulated through (i) chloroplast-nucleus signaling (94,95), (ii) the role of nucleus-encoded chloroplast proteins in the biogenesis of chloroplast proteins (96,97), and (iii) proteolysis of chloroplast- and nucleus-encoded proteins (1,19). Chloroplast biogenesis and protein homeostasis are also functionally intertwined with primary and secondary metabolism. Therefore interpretation of the comparative proteomics data must consider these connectivities, in particular if one aims to understand the substrates and general role of the Clp protease system in chloroplasts. Although overaccumulation of a protein in the clpr2-1 mutant (or in protease mutants in general) could indicate that the protein is a substrate for proteolysis, such increase could also represent an indirect, compensatory response as appears to be the case for e.g. ClpB3. In other words, when evaluating the clpr2-1 molecular phenotype, one should consider the chloroplast as one system that also communicates with the rest of the cell. We will discuss our findings in the context of such a systems view of the chloroplast.
clpr2-1 Shows an Increased Need for Protein Import, Folding, and Unfolding Capacity-An important part of the molecular phenotype was the consistent up-regulation of proteins involved in import (Tic110), folding and maturation (CPN60/cpHSP70/cpHSP90), and unfolding (ClpB3) with the response of ClpB3 by far the strongest. Our working hypothesis is that reduced Clp protease activity leads to accumulation of unwanted/damaged proteins that accumulate as protein aggregates. ClpB3 and cpHSP70 are up-regulated in an effort to unfold and reactivate these aggregates. Disaggregation by the bacterial ClpB3 homologue followed by protein refolding has been shown to be critical for cell viability in E. coli (98). The engagement of cpHSP70 in refolding activity of aggregated proteins likely reduces the capacity for folding of newly imported nucleus-encoded proteins or newly synthesized chloroplast-encoded proteins and thus further contributes to destabilized proteome homeostasis. The up-regulation of both CPN60 and cpHSP90 suggests an extra demand for ATP-dependent folding activity possibly because of less favorable/efficient folding conditions, for instance a limited availability of ATP or because of prolonged engagement of these chaperones with unwanted proteins. The Tic110 increase suggests a bottleneck in protein import into the chloroplast; we speculate that this is due to delayed release from the import channel and Tic110 because chaperones, including ClpC, are engaged in unproductive stabilization of proteins rather than aiding in completing the import process. The increased levels of HSP70, CPN60, and ClpC in the mature chloroplasts correspond with our Western blot analysis (36) and suggest that even when chloroplast biogenesis is complete chloroplasts continue to experience folding stress.
Specific Effects on Plastid Gene Expression-Whereas chloroplast gene expression is clearly not blocked in the clpr2-1 mutant (as evidenced by accumulation patterns of chloroplast-encoded proteins), the strong and persistent up-regulation of the BipA/TypA translation factor and to a lesser degree EF-TU-1 suggests specific effects on plastid translation. E. coli BipA interacts with ribosomes at the same site as EF-G (99), whereas the homologue in the plant species Suaeda salsa was suggested to play a critical role in the development of oxidative stress tolerance perhaps as a translational regulator of chloroplast-encoded stress-responsive proteins (72). Analysis of the Arabidopsis chloroplast stromal proteome by gel filtration and mass spectrometry suggests interaction of BipA with translating 70 S ribosomes. 1 No significant change in the accumulation of chloroplast ribosomal proteins was observed in the seedlings although 37 ribosomal proteins were quantified, indicating that synthesis of chloroplast ribosomal proteins is not affected. However, two lines of evidence in our study suggest a delay in ribosome assembly and/or defect in RNA metabolism. First of all, we observed delayed processing of the dicistronic 23 S-4.5 S rRNA in the chloroplasts of young but not mature plants indicative of a temporal problem in RNA processing. Although several chloroplast biogenesis mutants, such as ppi1 (73)
The nucleus-encoded proteins are imported through the Toc/Tic translocon in the chloroplast envelope followed by removal of the N-terminal chloroplast transit peptide (cTP), folding, and assembly (61,117). Proteins destined for the thylakoid membrane system, including the thylakoid lumen, are targeted following different pathways and involving different sorting components (118). The chloroplast-encoded proteins are synthesized on 70 S ribosomes either in the chloroplast stroma or at the thylakoid surface (118).
Proteins that were significantly up-regulated in clpr2-1 seedlings are: Tic110 (1); PreP1,2 (2); cpHSP70 and CPN60 (3); cpHSP90 (4); ClpB3 (5); EF-BipA, EF-TU, and DEAD box helicase RH3 (6); FIB1a,B (8); GGR (9); HDS (10); THI1 (11); GUN5 and CHLI-1 (12); and water dikinase (WDK)/SEX1 (13). Proteins that were significantly down-regulated in clpr2-1 seedlings are components of PSII, PSI, and ATP synthase (7). Chl, chlorophyll; PPP, triphosphate; GG, geranylgeranyl; GGPP, geranylgeranyl diphosphate; vit, vitamin.
tant such as dal (76), rnr1 (77,78), and clpr1-1 (26). Second, we observed a very strong increase in the RH3 DEAD box RNA helicase and an increase in the exoribonuclease polynucleotide phosphorylase, which is also named RIF10. RH3 DEAD box RNA helicase shows, together with ClpB3 and EF-TU BipA, the strongest up-regulation in clpr2-1 (5-13-fold); this helicase has not been studied in plants, but members of this helicase family are often involved in ribosome maturation and/or rRNA processing and stability (100,101). This helicase might overaccumulate in clpr2-1 young and mature plants as a compensatory response to overcome a bottleneck in ribosome assembly or other aspects of plastid gene expression or, alternatively, because it is a substrate of the Clp protease system. Polynucleotide phosphorylase (RIF10) has been shown to be indispensable for 3′-end maturation of 23 S rRNA transcripts and the efficiency of 3′-end processing of mRNAs and polyadenylation as well as the degradation of mRNA and tRNA (102-104). We quantified the polynucleotide phosphorylase in both biological replicates of the seedlings in stage 1.07, including each of the four technical replicates (clpr2-1/wt values were 3.0, 4.5, 4.4, and 2.0). The average clpr2-1/wt ratio across the biological replicates was 3.5; although this did not pass the significance tests (due to the relatively low number of spectral counts), the consistency suggests an increase in polynucleotide phosphorylase levels. Processing of 23 S rRNA depends on ribosome assembly, and the increase in polynucleotide phosphorylase suggests that the delayed rRNA processing is not due to a lack of polynucleotide phosphorylase but reflects a compensatory reaction to other rate-limiting steps, such as ribosomal protein assembly. Collectively, these data suggest that an important molecular consequence of reduced ClpPR activity is delayed ribosome maturation.
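The average clpr2-1/wt ratio quoted above for polynucleotide phosphorylase can be checked with a few lines of Python. This is only a minimal sketch, assuming the reported average is the plain arithmetic mean of the four listed ratios; the geometric mean is added purely for comparison, as a common alternative for summarizing ratios, and is not a calculation taken from the study.

```python
from statistics import mean, geometric_mean

# clpr2-1/wt spectral-count ratios for polynucleotide phosphorylase,
# as listed in the text for stage 1.07 seedlings.
ratios = [3.0, 4.5, 4.4, 2.0]

# Assumption: the reported average is a simple arithmetic mean.
arithmetic = mean(ratios)

# Shown only for comparison; not a value reported in the study.
geometric = geometric_mean(ratios)

print(f"arithmetic mean ratio: {arithmetic:.2f}")
print(f"geometric mean ratio:  {geometric:.2f}")
```

The arithmetic mean comes out at roughly 3.5, in line with the value given in the text, and the geometric mean is slightly lower because it is less influenced by the two highest ratios.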
Studies are in progress to determine whether the up-regulation of the DEAD box helicase is a direct or indirect effect of reduced ClpR2 accumulation. 4
The Connection between Plastid Gene Expression, the Thylakoid Photosynthetic Electron Transport Chain, the MEP Pathway, and Chlorophyll Biosynthesis-Thylakoid proteins involved in photosynthetic electron transport were reduced in the young seedlings, but Western blot analysis showed that this was not the case in mature plants, which is consistent with other phenotypic parameters (36). There are several explanations for the reduced thylakoid accumulation. (i) The shortage of available chaperones leads to reduced targeting efficiency of nucleus-encoded proteins to the thylakoid membrane. (ii) Photosynthetic thylakoid proteins cannot be stabilized due to a rate limitation in chlorophyll synthesis, resulting from reduced MEP pathway activity and/or rate-limiting GGR activity. (iii) Reduced chloroplast translation rates lead to signaling to the nucleus, resulting in transcriptional down-regulation of genes encoding photosynthetic proteins. We favor a combination of these three scenarios because they are tightly intertwined (see further below). The increase in fibrillins 1a,b and increased levels of plastoglobule particles as observed earlier by electron microscopy (36) are indicative of a thylakoid membrane homeostasis problem (80).
Chloroplast gene expression is essential for formation of photosynthetic capacity; in turn, photosynthetic electron transport affects primary and secondary metabolism through production of ATP and NADPH. Two thylakoid-interacting enzymes in the isoprenoid (the MEP) and tetrapyrrole pathways (HDS and GGR) were both up-regulated in young clpr2-1 seedlings but not significantly in mature clpr2-1 plants; in particular HDS was strongly up-regulated (5-10-fold). Because these two enzymes are both associated with thylakoids and because they are both reductases, it is most likely that these reductases are up-regulated to better compete for free electrons generated by photosynthetic electron transport.
An inverse relationship between chloroplast translation rates and accumulation of MEP enzymes was observed in two pale green chloroplast mutants, rif1 and rif10. rif1 was shown to be mutant for a GTPase (YqeH) (29,105), which in bacteria is required for ribosome assembly, whereas rif10 lacks the polynucleotide phosphorylase discussed above. Plastid gene expression and accumulation of chlorophylls and carotenoids were reduced in both mutants, whereas expression of enzymes in the MEP pathway was increased in young seedlings; these effects could be mimicked by chloroplast translational inhibitors (29,105). It was suggested that the MEP protein levels were regulated through turnover by Clp protease (29). Based on our analysis of clpr2-1, we suggest that the connectivity between plastid translation and the up-regulation of the MEP pathway is best explained by the reduced formation of the photosynthetic apparatus and consequent reduction in photosynthetic electron flow, which in turn creates a bottleneck in the MEP pathway because of its strict requirement for NADPH and ATP.
Up-regulation of Subunits of the Mg2+-Chelatase Complex and Plastid to Nucleus Signaling-Mg2+-chelatase subunits I-1 and H (GUN5) were strongly and significantly up-regulated in the young seedlings but not in mature plants. Subunit D was also severalfold higher (supplemental Table 1), but this increase passed the G-test (95% confidence) for only one biological replicate and not the others because of insufficient spectral counts. The Mg2+-chelatase complex operates at the branch point of chlorophyll and heme biosynthesis (Fig. 5A), and its up-regulation could indicate increased competition with Fe2+-chelatase (for heme production) between the two branches. Previous studies on light regulation have shown that GUN5 is the exclusive target for Mg2+-chelatase gene regulation (106). Moreover, GUN5 (but not the other Mg2+-chelatase subunits) was previously shown to be a key component in plastid to nucleus retrograde signaling (107); recently it was also identified as one of the abscisic acid hormone receptors (108). The multiple functions of GUN5 explain why the response was stronger than the increase of the other chelatase subunits. The actual signal transduction pathway from the plastid to the nucleus is still unknown (109,110), but there is consensus that chloroplast translation and redox state, as well as the GUN1 and GUN5 proteins, contribute collectively to the plastid signal (94,95). Thus the disruption of chloroplast biogenesis in the clpr2-1 mutant likely leads to signaling from the chloroplast to the nucleus to adjust nuclear gene expression. This will help to coordinate plastid and nuclear gene expression and also to make appropriate adjustments to the chloroplast proteome, e.g. by up-regulation of chaperones and up- or down-regulation of metabolic enzymes. Thus the up- and down-regulation of nucleus-encoded chloroplast proteins in clpr2-1 is at least in part due to transcriptional regulation as a result of plastid signaling.
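The remark that an increase can fail the G-test simply because of insufficient spectral counts can be illustrated with a small Python sketch. It is a simplified illustration, not the procedure used in the study: it assumes a two-sample goodness-of-fit G-test against an equal split of counts (that is, counts already normalized between samples), applies no continuity or Williams correction, and uses hypothetical spectral counts chosen only to show the effect. The function names are likewise hypothetical.

```python
import math

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_DF1_95 = 3.841


def g_statistic(count_a, count_b):
    """G statistic for two counts tested against an equal split.

    Assumption: both samples are normalized so that, under the null
    hypothesis, each is expected to contribute half of the total counts.
    """
    total = count_a + count_b
    if total == 0:
        return 0.0
    expected = total / 2
    g = 0.0
    for observed in (count_a, count_b):
        if observed > 0:  # a zero count contributes nothing to the sum
            g += observed * math.log(observed / expected)
    return 2 * g


def is_significant(count_a, count_b):
    """True if the G statistic exceeds the 95% critical value (df = 1)."""
    return g_statistic(count_a, count_b) > CHI2_CRIT_DF1_95


# Hypothetical wt vs. mutant spectral counts with the same 4-fold ratio.
for wt, mutant in [(1, 4), (10, 40)]:
    g = g_statistic(wt, mutant)
    print(f"wt={wt:2d} mutant={mutant:2d} G={g:.2f} significant={is_significant(wt, mutant)}")
```

With these hypothetical numbers the 4-fold difference is not significant at 1 versus 4 spectral counts but is clearly significant at 10 versus 40, because the G statistic scales linearly with the total number of counts at a fixed ratio.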
Identification of Putative Clp Protease Substrates and Overlapping Substrates between Different Protease Systems: the Chloroplast Protease Network-Based on its relatively high abundance in chloroplasts and its central function in bacteria, it is expected that many chloroplast proteins can be substrates for the Clp protease system (19). The challenge is to obtain direct evidence for substrates of the chloroplast Clp system (and of other chloroplast proteases). This is a challenge because the Clp protease substrates do not have easily recognizable sequence motifs or identifiable tags such as ubiquitination as in the case of substrates for the proteasome. E. coli and many other bacterial species have a unique trans-translation tagging system resulting in the attachment of a C-terminal peptide tag, named ssrA, that is then recognized by the Clp protease (111). However, no such ssrA system has (yet) been identified in plants.
In the last 2 years, several proteins have been suggested to be Clp substrates. A dozen proteins were assigned as putative substrates based on increased accumulation as detected by image analysis of two-dimensional electrophoresis gels from chloroplasts of an antisense CLPP6 mutant (28) or the clpr1-1 mutant (30). These proteins were mostly highly abundant stromal proteins involved in e.g. protein synthesis or folding. Based on our current analysis, it is not clear that they indeed are up-regulated because they are substrates; rather they seem to represent an indirect response, similar to the increase in fibrillin 1a and the increases in the chaperones HSP90, HSP70, and CPN60. As mentioned in the previous section, MEP pathway proteins have been suggested to be direct substrates of the Clp machinery based on Western blot analysis of 5-day-old seedlings of the rif1 and rif10 mutants impaired in plastid gene expression and/or RNA metabolism (29). Although this is certainly possible, an indirect link between MEP pathway protein levels and Clp activity seems quite likely for two reasons. 1) The MEP pathway enzymes are under strong developmental control (with highest expression in the youngest leaves), and a comparison between very young mutant and wt seedlings without correction for the developmental delay can explain the relatively high levels of MEP pathway proteins in the mutants. 2) Our current study provides a scenario for up-regulation of MEP pathway proteins under conditions where photosynthetic electron transport is impaired.
Despite the strong increase of THI1 in young clpr2-1 seedlings, seedlings could not be chemically complemented by supplementation with thiamine. This suggests that a loss of thiamine biosynthesis does not contribute to the mutant phenotype. Therefore, the increase in THI1 levels could be a direct effect of reduced turnover (without negative consequences for cell viability) and make THI1 a putative Clp substrate. Cellular thiamine concentrations and several enzymes in thiamine biosynthesis were shown to increase under oxidative stress in maize seedlings (112), and the observed up-regulation of THI1 could also reflect a direct response to oxidative stress experienced in the clpr2-1 mutant. A promising candidate Clp protease substrate is the chlorophyll a oxygenase protein, which is involved in chlorophyll a to b conversion (35). A screen for Arabidopsis mutants that overaccumulated a chlorophyll a oxygenase-green fluorescent protein fusion protein identified a CLPC1 mutant (35). It was suggested that the Clp machinery is involved in regulating chlorophyll b biosynthesis through the destabilization of chlorophyll a oxygenase protein in response to the accumulation of chlorophyll b. The decreased chlorophyll a/b ratio in clpr2-1 that we observed (36) is consistent with this scenario. However, a demonstration of the involvement of the ClpPR protease, rather than the ClpC1 chaperone with diverse functions, will be needed to further support chlorophyll a oxygenase as a substrate of the Clp protease system.
As outlined in the Introduction, the chloroplast contains multiple protease systems, and it is quite likely that there are a number of proteins that can be degraded by more than one protease system. This has been demonstrated in bacteria for the Lon and Clp systems (113). Moreover degradation of substrates can also involve sequential proteolytic steps involving two or more protease systems operating in series. Examples are degradation of the thylakoid D1 protein involving both the FtsH and DegP system (114,115) and bacterial proteolysis in which oligopeptidase A degrades cleavage products generated by the Clp complex (116). We quantified 38 proteases in the seedlings (Fig. 2, marked as squares) located inside or outside of the chloroplast. Only two chloroplast proteases, PreP and ClpC, showed a significant response in young but not in mature clpr2-1 leaves. The increase of PreP could well reflect overlapping substrates with the Clp system that would be most needed under conditions of chloroplast biogenesis and rapid protein import.
The Importance of a Systems View of Chloroplast Protein Homeostasis-This comparative proteomics study demonstrates that for understanding of chloroplast protease function it is important to take a "systems view" of the chloroplast. Moreover impaired proteolysis can have direct or indirect consequences on metabolism, leading to changes in cellular proteome composition and effects on growth, development, and even aging. Large scale comparative proteomics of Arabidopsis chloroplast single and double mutants is clearly feasible when using the latest generation of mass spectrometers and will provide further insight into the chloroplast protein homeostasis networks.