In Vivo Stable Isotope Labeling of Fruit Flies Reveals Post-transcriptional Regulation in the Maternal-to-zygotic Transition

An important hallmark of embryonic development is the maternal-to-zygotic transition (MZT), in which zygotic transcription is activated by a maternally controlled environment. Post-transcriptional and translational regulation is critical for this transition and has been investigated in considerable detail at the gene level. We used a proteomics approach based on metabolic labeling of Drosophila to quantitatively assess changes in protein expression levels before and after the MZT. By combining stable isotope labeling of fruit flies in vivo with high accuracy quantitative mass spectrometry we could quantify 2,232 proteins of which about half changed in abundance during this process. We show that ∼350 proteins increased in abundance, providing direct evidence of the identity of proteins as a product of embryonic translation. The group of down-regulated proteins is dominated by maternal factors involved in translational control of maternal and zygotic transcripts. Surprisingly, a direct comparison of transcript and protein levels showed that the mRNA levels of down-regulated proteins remained relatively constant, indicating a translational control mechanism specifically targeting these proteins. In addition, we found evidence for post-translational processing of cysteine proteinase-1 (Cathepsin L), which became activated during the MZT as evidenced by the loss of its N-terminal propeptide. Poly(A)-binding protein was shown to be processed at its C-terminal tail, thereby losing one of its protein-interacting domains. Altogether this quantitative proteomics study provides a dynamic profile of known and novel proteins of maternal as well as embryonic origin. This provides insight into the production, stability, and modification of individual proteins, whereas discrepancies between transcriptional profiles and protein dynamics indicate novel control mechanisms in genome activation during early fly development.

In many organisms, the first few hours of development are controlled by maternal proteins and mRNAs, which are deposited into the egg during oogenesis. After fertilization, the primary roles of these factors are to facilitate zygotic transcription and to establish the initial body framework. In Drosophila, zygotic transcription is initiated after approximately 2 h of development when the first 13 synchronous nuclear divisions give rise to the formation of the syncytial blastoderm. This is also referred to as the maternal-to-zygotic transition (MZT). The first developmental processes controlled by zygotic factors are mitotic cycle 14 and the cellularization of the blastoderm, which together mark the midblastula transition. Although the existence of this process has been known for a long time (1–5), the molecular mechanisms regulating the transition from mother to zygote are only beginning to be unraveled.
The mother transfers a large number of mRNAs to the oocyte, estimated at ∼7,000 transcripts (6,7). The bulk of these are degraded (8), whereas a selected set needs to be stabilized, allowing translation to sustain development of the embryo. This is achieved by the combined effects of mRNA localization, (de)stabilization by maternal and zygotic proteins, and translational repression and activation. Over the years it has become clear that in Drosophila multiple mechanisms act simultaneously to achieve protein expression at the right dose, at the right time, and at the right location. One way of localizing a particular protein is to stabilize and localize its mRNA transcript prior to translation, ensuring high levels of protein at restricted, well defined cytoplasmic positions (9,10). This complements mechanisms suppressing activation of untranslated transcripts, which have been shown to aggregate in specific cytoplasmic granules known as P bodies (11,12).
One of the important questions in the activation of the zygotic genome relates to the origin of proteins either by deposition in the oocyte by the mother or by transcriptional and translational activity in the embryo. Although recent proteomics studies aimed to define the Drosophila proteome (13–15), they investigated a different developmental event, or they did not specifically focus on fly development. In a number of recent studies genomics techniques were used to distinguish maternal from zygotic gene expression. Lécuyer et al. (25) used high resolution fluorescent in situ hybridization assuming that maternal and zygotic transcripts localize in the cytoplasm and nucleus, respectively. De Renzis et al. (6) addressed a similar question by investigating chromosome-ablated mutants to discriminate between transcriptional and post-transcriptional regulation of gene expression, and it was estimated that ∼20% of the transcripts at cycle 14 were of zygotic origin. Although the presence and precise localization of transcripts are crucial to understand developmental activation of the embryo, they do not necessarily allow extrapolation to protein expression. Notably, multiple mechanisms shown to determine mRNA stability and translational activity (e.g. dependent on or independent of deadenylation, targets of RNA silencing, or trans-acting factors) provide an additional level of regulation (16). The result of the combined effect of these post-transcriptional processes can only be captured by determining expression levels of individual proteins before and after MZT.
Therefore, we used a proteomics approach quantifying the relative protein expression levels before (1.5 h after oviposition, embryonic stages 1-3) and after MZT (4.5 h after oviposition, embryonic stages 6-9). By combining in vivo labeling of fruit flies through the incorporation of stable isotope-coded nitrogen (15N) (17) with LC-MS/MS, more than 1,700 proteins could be quantitated in two biologically independent experiments. About half of these changed in abundance of which ∼350 proteins increased, providing for the first time direct evidence of the identity of proteins as a product of embryonic translation in a large scale approach. Although these up-regulated proteins represent a wide variety of functional classes, maternal proteins were among the most dramatically down-regulated proteins including trans-acting factors involved in regulation of mRNA stability (including maternal expression at 31B (ME31B), Smaug (SMG), and a number of proteins interacting with these). Moreover, specific down-regulation of these proteins appears to be governed by a post-transcriptional mechanism as evidenced by direct comparison of protein and transcript levels in the same samples. In addition, evidence was found that a limited number of proteins, including poly(A)-binding protein (PABP) and cysteine proteinase-1 (CP1), were subject to post-translational processing leading to truncation, possibly resulting in an altered function of these proteins. Altogether this study provides a dynamic profile of known and novel proteins of maternal as well as embryonic origin associated with embryonic development in Drosophila.

EXPERIMENTAL PROCEDURES
Fly Stock, Labeling, and Embryo Collection-Wild-type OregonR flies were maintained by standard methods at 25°C and were labeled as described previously (17). Briefly, larvae were grown in boxes containing 15N-labeled or unlabeled yeast and kept at 25°C throughout larval and pupal developmental stages. Hatched flies were transferred to fly cages and kept on 15N-labeled or unlabeled yeast. Embryos were collected on agarose-agar plates supplemented with a small amount of 15N-labeled or unlabeled yeast that were removed from the fly cage after 90 min. Unlabeled stage 6-9 embryos were obtained by aging the 0-90-min embryos using standard methods for another 180 min, whereas 15N-labeled embryos were processed immediately. Embryos were washed in water and dechorionated by incubation in 2.5% sodium hypochlorite for 90 s followed by another wash and then kept at −20°C. A biological duplicate, independent experiment was performed with swapped labels (i.e. 0-90 min unlabeled and 180-270 min 15N-labeled).
Sample Preparation-Equal amounts of labeled and unlabeled embryos were combined and lysed in 8 M urea and 50 mM ammonium bicarbonate. Cellular debris was pelleted by centrifugation at 20,000 × g for 20 min. Prior to digestion, proteins were reduced with 1 mM DTT and alkylated with 2 mM iodoacetamide. The mixture was diluted 4-fold to 2 M urea using 250 µl of 50 mM ammonium bicarbonate and 50 µl of 0.1 mg/ml trypsin solution and incubated overnight at 37°C.
Strong Cation Exchange-Strong cation exchange was performed using a Zorbax BioSCX-Series II column (0.8-mm inner diameter × 50-mm length, 3.5 µm), a FAMOS autosampler (LC Packings, Amsterdam, The Netherlands), and a Shimadzu LC-9A binary pump and a SPD-6A UV detector (Shimadzu, Tokyo, Japan). Prior to strong cation exchange (SCX) chromatography, protein digests were desalted using a small plug of C18 material (3M Empore C18 extraction disk) packed into a GELoader tip (Eppendorf) similar to what has been described previously (18) onto which ∼10 µl of Aqua C18 (5 µm, 200 Å; Phenomenex) material was placed. The eluate was dried completely and subsequently reconstituted in 20% acetonitrile and 0.05% formic acid. After injection, a linear gradient of 1% min⁻¹ solvent B (500 mM KCl in 20% acetonitrile and 0.05% formic acid, pH 3.0) was performed. A total of 28 SCX fractions (1 min each, i.e. 50-µl elution volume) were manually collected and dried in a vacuum centrifuge.
Nanoflow HPLC-MS-Dried residues were reconstituted in 50 µl of 0.1 M acetic acid and were analyzed by nanoflow liquid chromatography using an Agilent 1100 HPLC system (Agilent Technologies) coupled on line to a 7-tesla LTQ-FT-ICR mass spectrometer (Thermo Electron). The liquid chromatography part of the system was operated in a setup essentially as described previously (19). Aqua C18, 5-µm (Phenomenex) resin was used for the trap column, and ReproSil-Pur C18-AQ 3-µm (Dr. Maisch GmbH) resin was used for the analytical column. Peptides were trapped at 5 µl/min in 100% solvent A (0.1 M acetic acid in water) on a 2-cm trap column (100-µm inner diameter, packed in house) and eluted to a 40-cm analytical column (50-µm inner diameter, packed in house) at ∼100 nl/min in a 150-min gradient from 10 to 40% solvent B (0.1 M acetic acid in 8:2 (v/v) acetonitrile/water). The eluent was sprayed via standard coated emitter tips (New Objective) butt-connected to the analytical column. The mass spectrometer was operated in data-dependent mode, automatically switching between MS and MS/MS. Full scan mass spectra (from m/z 300 to 1,500) were acquired in the FT-ICR cell with a resolution of 100,000 at m/z 400 after accumulation to a target value of 500,000. The three most intense ions at a threshold above 5,000 were selected for collision-induced fragmentation in the linear ion trap at normalized collision energy of 35% after accumulation to a target value of 15,000.
Peptide Identification-All MS2 spectra were converted to single DTA files using Bioworks 3.1 (Thermo) with default parameters and merged into a Mascot generic format file that was searched twice to identify both unlabeled and 15N-labeled peptides using an in-house licensed Mascot v2.1.0 search engine (Matrix Science) against a concatenated database containing both forward and reversed entries from an Integr8 D. melanogaster database (version 20060806) consisting of 32,508 sequences. Carbamidomethylcysteine was set as a fixed modification; oxidized methionine, protein N-acetylation, and N-terminal pyroglutamate were set as variable modifications. Trypsin was specified as the proteolytic enzyme, and up to two missed cleavages were allowed. The mass tolerance of the precursor ion was set to 15 ppm, and that of fragment ions was set to 0.8 Da. Both search results (14N and 15N identifications) were merged into one HTML result page using an in-house developed Perl script for qualitative and quantitative analysis. A false-positive discovery rate of <1% was estimated (20) and accomplished by using a peptide cutoff score of 20 (horizontal line in supplemental Fig. 2), a minimum of two peptides per protein, and a protein cutoff score of 60.
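As a minimal sketch of how a false-positive rate can be estimated from a search against concatenated forward and reversed sequences, the snippet below counts reversed ("decoy") and forward ("target") hits above a score cutoff. The function name, the input layout, and the decoy/target estimator are assumptions made for illustration; the text does not specify the exact formula used in (20), and other conventions (for example 2 × decoy / total) exist.

```python
def estimate_fp_rate(hits, cutoff):
    """Estimate the false-positive rate at a given score cutoff.

    hits   -- iterable of (score, is_reversed) tuples, one per peptide hit
              (hypothetical layout, not the Mascot output format)
    cutoff -- minimum score a hit must reach to be accepted

    Returns reversed hits / forward hits among accepted hits
    (an assumed estimator; the source does not state which one was used).
    """
    forward = 0
    reversed_hits = 0
    for score, is_reversed in hits:
        if score < cutoff:
            continue  # hit rejected by the score cutoff
        if is_reversed:
            reversed_hits += 1
        else:
            forward += 1
    if forward == 0:
        return 0.0  # nothing accepted, so no rate can be estimated
    return reversed_hits / forward


if __name__ == "__main__":
    # Toy data only; these scores are invented for the example.
    toy_hits = [(50, False), (40, False), (30, False), (25, True),
                (22, False), (15, True), (10, False)]
    print(estimate_fp_rate(toy_hits, cutoff=20))
```

Raising the cutoff trades identifications for a lower estimated rate, which is the trade-off behind choosing a peptide score of 20 together with the protein-level filters.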
Protein Quantitation-An in-house developed 15N version of MSQuant (24) was used to quantify relative protein levels. MSQuant was modified in such a way that the position of the partner peptide can be detected in the case of metabolic 15N labeling (i.e. the position of the 15N-labeled peptide in the case when the unlabeled (14N) peptide was identified and vice versa), and no modifications were made to the quantitation algorithm. Briefly, extracted ion chromatograms (XICs) for both unlabeled and labeled peptides are calculated and summed over consecutive MS cycles for the duration of their respective LC-MS peaks using monoisotopic peaks only. All full-scan mass spectra were manually verified for sufficient signal and absence of interference by other signals. A minimum XIC threshold of 3,400 was used to exclude low intensity peptides. The total XIC intensity of labeled peptides was corrected for incomplete nitrogen enrichment as described before (22). Briefly, using the average 15N enrichment level of 98.2% and the chemical formula of the peptide, the theoretical isotope pattern based on natural abundances of the elements as well as the pattern based on the 15N enrichment level were calculated and used to sum the intensities of the "−1" and "−2" isotopes. Effectively this adds the intensities of the 14N-containing isotope peaks to the monoisotopic 15N peak. Relative peptide levels were computed by dividing the labeled (corrected) total XIC by the unlabeled total XIC. Relative protein expression levels (log2) with standard deviation were obtained by averaging individual peptide log2 ratios that identified the same protein with a minimum of two quantitated peptides per protein. Protein ratios were normalized to the average ratio of all proteins. Finally, the protein ratios from the two independent experiments were averaged.
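The enrichment correction can be illustrated with a simplified sketch. It assumes that each nitrogen position is independently 15N with probability 0.982, so the share of molecules carrying zero, one, or two residual 14N atoms follows a binomial distribution. Unlike the procedure described above, it ignores natural-abundance isotopes of the other elements; the function names are hypothetical and this is not the MSQuant implementation.

```python
from math import comb


def nitrogen_fractions(n_nitrogen, enrichment=0.982, max_missing=2):
    """Fraction of labeled molecules with k residual 14N atoms, k = 0..max_missing.

    Binomial model: each of the n_nitrogen positions is 15N with
    probability `enrichment` (simplifying assumption).
    """
    q = 1.0 - enrichment
    return [comb(n_nitrogen, k) * q ** k * enrichment ** (n_nitrogen - k)
            for k in range(max_missing + 1)]


def correct_labeled_xic(mono_xic, n_nitrogen, enrichment=0.982):
    """Scale the monoisotopic 15N XIC so it also accounts for the
    "-1" and "-2" peaks predicted by the binomial model."""
    fractions = nitrogen_fractions(n_nitrogen, enrichment)
    return mono_xic * sum(fractions) / fractions[0]


if __name__ == "__main__":
    # A peptide with 15 nitrogen atoms; the XIC value is invented.
    print(nitrogen_fractions(15))
    print(correct_labeled_xic(1000.0, 15))
```

The correction grows with the number of nitrogen atoms, which is why the chemical formula of each peptide enters the calculation.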
The variation between the data sets was calculated by dividing the standard deviation by the average ratio of each protein and taking the average of all 1,737 proteins.
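The ratio calculation described above can be sketched as follows. The sketch assumes that normalization is done by subtracting the mean log2 ratio of all proteins and that the between-experiment variation is computed on linear ratios; the text leaves both details open, and all function names are hypothetical.

```python
from math import log2
from statistics import mean, stdev


def protein_log2_ratio(peptide_xics):
    """Mean and S.D. of peptide log2 ratios for one protein.

    peptide_xics -- list of (labeled_xic, unlabeled_xic) pairs
    Returns None when fewer than two peptides were quantitated.
    """
    if len(peptide_xics) < 2:
        return None
    logs = [log2(labeled / unlabeled) for labeled, unlabeled in peptide_xics]
    return mean(logs), stdev(logs)


def normalize(protein_log2):
    """Center protein log2 ratios on the average of all proteins
    (assumed to be what 'normalized to the average ratio' means)."""
    center = mean(protein_log2.values())
    return {name: value - center for name, value in protein_log2.items()}


def between_experiment_cv(ratio_a, ratio_b):
    """S.D. divided by the mean of one protein's ratio in two experiments."""
    return stdev([ratio_a, ratio_b]) / mean([ratio_a, ratio_b])


def average_variation(ratio_pairs):
    """Average of the per-protein variation over all proteins."""
    return mean(between_experiment_cv(a, b) for a, b in ratio_pairs)


if __name__ == "__main__":
    # Invented XIC values for a single protein with two peptides.
    print(protein_log2_ratio([(200.0, 100.0), (400.0, 100.0)]))
    print(normalize({"protA": 1.0, "protB": 3.0}))
    print(average_variation([(1.0, 1.2), (0.5, 0.5)]))
```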
Supporting Information-The peptide identifications have been made publicly available in the proteomics identifications database PRIDE and can be found under experiment accession numbers 8170 and 8171 in the project "In vivo stable isotope labeling of fruit flies reveals post-transcriptional regulation in the maternal-to-zygotic transition." Because PRIDE is not set up to handle quantitative data, these have been parsed into a database along with annotated spectra of all identified peptides that can be viewed upon request. In addition, quantitative and qualitative results of all 2,232 quantitated proteins, including individual peptide ratios, sequence coverage, and protein information, can be found in supplemental Table 9.
RNA Extraction and Affymetrix Microarray Hybridization-RNA was extracted using the SV Total RNA Isolation System (Promega) and tested on an Agilent BioAnalyzer (Agilent). Samples with RNA integrity numbers >8 were selected. Labeling, hybridization, washes, and staining of microarrays were performed according to Affymetrix specifications.
Microarray Statistical Analysis-Statistical analysis of the microarray data was performed using R and Bioconductor free software as described previously (23). Gene expression indexes were calculated using the robust multichip average algorithm implemented in the Bioconductor affy package. The distribution of the expression indexes is bimodal; therefore we applied the minimum covariance determinant algorithm implemented in the rrcov R package to filter non-expressing genes. A total of 4,657 genes were selected for further analysis. Details of the statistical analysis and R scripts will be provided upon request.
Western Blot Analysis-About 2 mg of early and late embryos and 18 mg of adult flies were lysed in buffer containing 30 mM Tris, 7 M urea, 2 M thiourea, 50 mM DTT, and 54 mM CHAPS and sonicated for 30 s. Lysates were centrifuged at 14,000 × g to pellet cellular debris. Protein concentration was determined by a 2D-Quant kit (Amersham Biosciences/GE Healthcare) using the standard protocol. For each experiment, 15 µg of protein was used as starting material, subjected to SDS-PAGE, and blotted. After transfer, the membrane was stained with Ponceau red to verify equal loading of the lysates. Subsequently the nitrocellulose membrane was blocked with 5% Protifar plus (Nutricia) and then incubated with antibodies against PABP, Rack1, and eIF3-S9 followed by a horseradish peroxidase-conjugated secondary antibody. The membranes were subjected to detection by enhanced chemiluminescence. Membranes were subsequently stripped at 50°C in buffer containing 62.5 mM Tris, 2% SDS, and 0.1 M β-mercaptoethanol; blocked with 5% Protifar plus; and then incubated with anti-α-tubulin antibody followed by the same procedure as described above.

Strategy for High Throughput Proteomics-The strategy used to identify and differentially quantify proteins in embryos in Bownes' stages 1-3 and 6-9 is shown schematically in Fig. 1. We started our approach with ∼10 ml of 15N-labeled and unlabeled flies and collected "heavy" and "light" embryos for 90 min. Light embryos were allowed to develop for another 180 min to reach stage 6-9. A second biologically independent experiment with reversed labels ("label swap") was conducted as well. Embryos were visually inspected after harvesting to confirm their intended developmental stage (supplemental Fig. 1, A-C). Incorporation of 15N was verified by mass spectrometric analysis of labeled embryos (supplemental Fig. 1D). Labeled and unlabeled samples were combined (Fig. 1A), and proteins were extracted, digested using trypsin, and subjected to SCX as the first separation step (Fig. 1B). One-minute SCX fractions were collected with peptides eluting in fractions 5-33. Each of these 28 fractions was subjected to the second dimension nano-LC MS/MS. A typical chromatogram of one fraction is shown in Fig. 1C. Extended column length (40 cm) and gradient times (2.5 h) were used to obtain optimal peptide separation over more than 2 h (Fig. 1C). Analysis of these fractions led to an average of 10,000 fragment spectra per SCX fraction. Altogether more than 250,000 spectra were searched against a Drosophila protein database consisting of "forward" and "reversed" protein sequences for protein identifications and an estimation of the false-positive (FP) discovery rate. To determine an optimal FP rate, peptide scores from both parts of the database were plotted (supplemental Fig. 2). With a peptide cutoff score of 20 (horizontal line in supplemental Fig. 2), a peptide FP rate of ∼4% resulted in 79,196 peptide identifications. In addition to the peptide cutoff score, we used a minimum of two peptides per protein and a protein cutoff score of 60. This led to 2,736 protein identifications with a false-positive discovery rate of <1%. The complete list of all proteins identified in stages 1-3 as well as stages 6-9 is given in supplemental Table 1. Subsequently proteins were quantified by integrating all MS peak areas of identified peptides using an in-house adapted version of MSQuant (24) (Fig. 1C). Using a minimum of two quantifiable peptides per protein, a total of 2,232 proteins were quantified (Fig. 1D). Peptides that could not be quantified represent mainly low abundance peptides that disappear in the noise or produce an insufficient number of data points for proper integration (each peptide was manually verified). Qualitative as well as quantitative data were parsed into a searchable PostgreSQL database (Fig. 1D) (available upon request).
FIG. 1. Experimental strategy used to analyze the MZT. A, labeled and unlabeled yeast was used to grow embryos that were harvested before (right) and after (left) the MZT. Embryos were combined, lysed, digested, and subjected to SCX fractionation (B). Each of the 28 SCX fractions was analyzed by reversed-phase LC-MS (C), and the resulting MS and MS/MS spectra were used to, respectively, quantify and identify peptides and stored in a PostgreSQL database (D).
Protein Identification and Quantification-Protein identifications were obtained by stringent filtering criteria, and therefore the compendium of the proteins found in stages 1-3 and 6-9 (supplemental Table 1) provides a valuable resource for further exploration. To accurately assess the quantitative difference between early and late embryos, thereby uncovering proteins associated with the maternal-to-zygotic transition, protein ratios were based on the quantitative results of two biologically independent experiments where labels were swapped (unlabeled early embryos and 15N-labeled late embryos). A total of 1,737 proteins were quantified in both experiments. Fig. 2A shows a scatter plot of the protein ratios of both experiments where a correlation coefficient of 0.831 indicates that protein ratios are in good agreement. An overview of the relative expression patterns of these 1,737 proteins is shown in Fig. 2B including error bars that represent the S.D. between the two protein ratios. Protein quantitation was of high quality, reflected by the small average error in each of the two individual data sets (that is, the error between the peptides of the same protein) of 18 and 11%, respectively. Furthermore the average variation between proteins quantitated in both data sets was 11%. Proteins identified in only one replicate most probably suffered from undersampling during mass spectrometric data acquisition, but given the high correlation between the replicates and the small standard deviations, most of them are likely to be plausible. Yet to increase confidence, only the proteins that were quantitated in both data sets were taken for further analysis.
The accuracy of protein quantitation in our approach was evidenced further by the abundance ratios of ribosomal proteins. Among the expression ratios of the 125 subunits identified in this study, a remarkable observation can be made indicating a subtle but clear distinction between cytoplasmic and mitochondrial ribosomal proteins (supplemental Table 2). A total of 92% of the cytoplasmic ribosomal subunits (n = 66) were down-regulated, whereas this was true for only 42% (n = 22) of the mitochondrial subunits. This is a clear distinction, which was caused by remarkably small differences in absolute changes in expression levels: the average log2 ratio of cytoplasmic ribosomal proteins was −0.08 compared with 0.01 for mitochondrial ribosomal proteins. This suggests that our method for protein quantitation is highly accurate, permitting the distinction of small differences. Such subtle changes may reflect biologically relevant regulation, but obviously more conservative criteria need to be applied to classify individual proteins in the entire data set as being up- or down-regulated. Therefore, to make a selection of proteins from the entire data set that changed in abundance before and after MZT, we applied a cutoff value of at least 1 times the highest average coefficient of variation (18%), a log2 value of 0.238. In addition, only proteins were selected that had a coefficient of variation of less than 50% between the two experiments. Of the 580 proteins meeting these criteria, 348 went up in abundance, whereas 232 went down (Fig. 2B). Both of these groups were analyzed in more detail looking for evidence that could biologically explain these observations.
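The selection criteria can be written out as a short sketch. It assumes that the cutoff is applied symmetrically to increases and decreases and that the 18% coefficient of variation translates to log2(1.18), which is about 0.239 and close to the 0.238 quoted above. The function and constant names are hypothetical.

```python
from math import log2

# 1 x the highest average coefficient of variation (18%) on a log2 scale.
CUTOFF_LOG2 = log2(1.0 + 0.18)
# Maximum coefficient of variation allowed between the two experiments.
MAX_CV = 0.50


def classify(mean_log2_ratio, cv_between_experiments):
    """Label one protein as 'up', 'down', 'unchanged', or 'excluded'.

    Applying the cutoff symmetrically to negative ratios is an assumption.
    """
    if cv_between_experiments >= MAX_CV:
        return "excluded"  # ratios disagree too much between experiments
    if mean_log2_ratio >= CUTOFF_LOG2:
        return "up"
    if mean_log2_ratio <= -CUTOFF_LOG2:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    print(round(CUTOFF_LOG2, 3))
    # The average cytoplasmic ribosomal ratio of -0.08 falls inside the cutoff.
    print(classify(-0.08, 0.10))
    print(classify(-2.8, 0.10))
```

Under these criteria the small shift seen for ribosomal proteins as a group would not by itself mark any single subunit as regulated, which matches the more conservative intent of the cutoff.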
Of the 348 proteins that were up-regulated (supplemental Table 3), six were found exclusively in the later time point and not in the "maternal" sample, permitting the conclusion that these are purely zygotic proteins. This is partly in concordance with data from De Renzis et al. (6) who classified transcripts of four of these six proteins (Amalgam, Neurotactin, Ptc-related, and Bangles and Beads) as purely zygotic (i.e. no mRNA expression at 0-1 h). However, De Renzis et al. (6) found maternal transcripts for the other two proteins (Hiiragi and Frazzled) but annotated them as primary zygotic because these two genes are more than 3-fold down-regulated compared with similarly staged wild-type embryos after ablation of the corresponding chromosome, indicating that they must be transcribed zygotically (6). Taken together, the only interpretation of the observed expression profile is that these are the product of zygotic transcriptional and translational activity that starts after approximately 2 h of development even though Hiiragi and Frazzled seem to have a maternal contribution that is post-transcriptionally silenced during the first few hours of development. The up-regulation of the other 342 proteins indicates that these are present at some level at stages 1-3 and that their expression level is increased because of zygotic translation. Alternatively these expression profiles could be the result of specifically timed translation of maternal mRNA. Altogether, for simplicity we refer to this set of 348 proteins as zygotic proteins.
Using the same criteria as for the up-regulated proteins, the second group includes 232 proteins that were down-regulated (supplemental Table 4). These can be interpreted as maternal proteins that are degraded over time, although it cannot be excluded that the expression ratio is a combined effect of zygotic translation and (stronger) degradation of maternal products. Yet for simplicity we will refer to this set as "maternal proteins."

Comparing Identity of Zygotic Proteins with Previous Data Sets-Previously genomics techniques were used to infer maternal or zygotic origin by interpreting localization of transcripts (cytoplasmic or nuclear, respectively) (25) and by chromosomal ablation to discriminate between transcriptional and post-transcriptional regulation of gene expression (6). To extend this to translational activity in the zygote, we believe this is proven best by the higher abundance of a given protein in late compared with early embryos. In other words, there is no other explanation for a protein to accumulate during the MZT than that it is synthesized in the zygote. We therefore propose to define the proteins up-regulated in our study as the first proteins to be expressed by the embryo. By comparing these proteins to the transcripts described by Lécuyer et al. (25) and De Renzis et al. (6) as being zygotic, we found that the agreement between the data sets is rather low. In Fig. 3 a Venn diagram shows the identified zygotic products (transcripts and proteins, respectively) from the three data sets. Only 13 proteins were found to be present in all three experiments, whereas 26 and 16 proteins were found to be overlapping with the De Renzis et al. (6) and the Lécuyer et al. (25) data sets, respectively. In supplemental Table 5 Fig. 3); however, the products of these transcripts were found to be significantly down-regulated in our data set. It is difficult to explain these seemingly contradicting results not only because highly different techniques were used in each of the studies but also because they bypass any regulation that may occur at the post-transcriptional level, particularly translational activation or inhibition of transcripts.
Functional Classification of Zygotic and Maternal Proteins-Both zygotic and maternal proteins were clustered in functional groups based on gene ontology (GO) classification (26). The list of all enriched GO terms (EASE score (26) below 0.05) can be found in supplemental Table 6, whereas the most prominent GO classes for both zygotic and maternal proteins are shown in Table I. Classification of zygotic proteins by biological process led to a great number of enriched GO terms as might be expected from the dramatic events that occur in the embryo during this period of development. Further interpretation of these results is not straightforward given the large number of proteins in some classes and the widely differing abundance ratios of individual proteins in each class.
Some of the most dramatically down-regulated (maternal) proteins are those involved in translational control of RNAs (Table II). These include Oskar (OSK), a key player in germ line as well as abdominal development, that was found to be more than 9-fold down-regulated. Other examples include Bicaudal C, Vasa, and CUP, which are involved in translational repression of oskar mRNA and which were down-regulated in a fashion similar to that of OSK itself (log2 ratios of −4.8, −1.6, and −5.4, respectively). In addition, CUP can be recruited by SMG (log2 ratio of −2.8) to translationally repress nanos mRNA. These results suggest that release of translational control by a variety of maternal proteins is one of the most prominent events during the MZT.
(25) data, respectively, were claimed to be zygotic, but proteins were found to be down-regulated in our work, suggesting maternal expression.
Post-transcriptional Regulation-From our findings above it is evident that a large proportion of the proteins identified here change in abundance due to both translation and degradation. Some of these proteins are involved in translationally repressing, localizing, and stabilizing mRNAs. Our quantitative proteomics approach provides an opportunity to link expression profiles of proteins to their cognate transcript levels, which could reveal mechanisms of post-transcriptional regulation. To directly correlate protein and transcript expression levels, microarray experiments were conducted on identical samples (including isotope labels) used for proteomics analyses. We could obtain mRNA expression levels for 4,657 transcripts that could be correlated to 1,548 entries in the quantitative proteomics data set. Direct comparison of mRNA and protein expression levels is shown in Fig. 4. Spearman's rank correlation coefficient between these data sets is weak (0.39), indicating that mRNA levels are poor indicators for protein expression levels. Although the majority of the data points are clustered around the center of the plot (i.e. not changing in either direction), a number of more extreme values permit some interesting observations. For instance, a cluster of data points can be found on the y axis (Fig. 4, encircled) in the region where transcript levels show decreased abundance, although protein levels do not change appreciably. Apparently transcripts are degraded, whereas proteins remain stable even though transcription (and translation) ceases. Strikingly, 37 of the 41 proteins fall into the GO molecular function category "catalytic activity" (supplemental Table 7). A possible explanation could be that these proteins are stabilized by an unknown mechanism to prevent degradation. Alternatively this could be supplemented by a mechanism regulating enzyme activity. It is conceivable that this would provide a mechanism to engage or fine tune enzymatic activity in metabolic processes without the need to newly synthesize proteins. In another segment in Fig. 4 protein levels are down-regulated, although mRNA levels do not change appreciably or are even up-regulated. Strikingly, several prominent maternal proteins fall in this group, such as Smaug and ME31B. Moreover, a number of proteins known to interact with these proteins are observed here (CUP, Ypsilon Schachtel, poly(A)-binding protein, Bicaudal C, and Trailer Hitch) as well as Tudor and Pathetic. This pattern could be the result of post-transcriptional or translational silencing where translation is blocked and protein levels drop, although transcript levels remain constant or even increase. Because this specific correlation of transcript and protein levels seems to be restricted to a functionally defined class of proteins (i.e. those involved in silencing of maternal transcripts), it is tempting to speculate that a dedicated, functionally relevant but as yet unknown mechanism is involved. Along with the above described proteins, another protein (CG1943) is located in the same region in Fig. 4. Despite its homology to HN1-like proteins, no function is known for this protein, but its expression is likely to be regulated post-transcriptionally.
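For readers unfamiliar with the rank correlation used above to compare transcript and protein ratios, a minimal self-contained sketch follows. It computes Spearman's coefficient as the Pearson correlation of average ranks; the function names and the toy values are illustrative and are not the study's data or scripts.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Invented log2 ratios for five transcript/protein pairs.
    mrna = [0.1, -1.2, 0.4, 2.0, -0.3]
    protein = [-0.2, -0.1, 0.3, 0.9, -2.5]
    print(spearman(mrna, protein))
```

Because only ranks enter the calculation, the coefficient is insensitive to the different dynamic ranges of microarray and mass spectrometric measurements.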
Post-translational Protein Processing. During the quantitation and validation process, some proteins attracted attention because their coefficient of variation (CV) was much higher than the average CV. Of note, this was observed in both biological replicates (i.e. forward and reverse labeling), where average CVs were 18 and 11%, respectively. A high deviation is indicative of large differences in individual peptide ratios within a single protein. For proteins quantified with a CV higher than 40%, we examined individual peptides to establish whether the variation originated from errors in the quantitation process itself or possibly from a biological cause such as post-translational processing.
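The CV-based screen described above can be illustrated with a minimal sketch. It assumes that the CV is computed as the sample S.D. divided by the mean of the linear peptide ratios of a protein; the text does not state whether linear or log-transformed ratios were used, so this is an assumption. The function names, protein names, and peptide ratios are hypothetical; only the 40% threshold is taken from the text.

```python
from statistics import mean, stdev


def coefficient_of_variation(ratios):
    """Return the CV in percent (sample S.D. / mean) of linear peptide ratios."""
    if len(ratios) < 2:
        raise ValueError("need at least two peptide ratios")
    return 100.0 * stdev(ratios) / mean(ratios)


def flag_high_cv(protein_to_ratios, threshold=40.0):
    """Return the sorted names of proteins whose CV exceeds the threshold (percent)."""
    return sorted(
        name
        for name, ratios in protein_to_ratios.items()
        if len(ratios) >= 2 and coefficient_of_variation(ratios) > threshold
    )


# Hypothetical peptide ratios, for illustration only.
example = {
    "consistent_protein": [0.95, 1.00, 1.05, 1.02],
    "variable_protein": [0.10, 0.12, 0.90, 0.85],
}
flagged = flag_high_cv(example)
```

Proteins returned by such a screen would then be inspected at the peptide level, as was done here, to separate quantitation errors from possible biological causes.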
Low signal-to-noise ratios are one explanation for poor consistency in quantitation. This appeared to apply to six proteins that were off-on regulated (BNB, PTR, HRG, AMA, FRA, and NRT). For these proteins, the intensity of the labeled (down-regulated) peptides was at noise level, resulting in considerable variation of abundance ratios and thus a high S.D. for the protein. Note that this does not affect our conclusion that these proteins are among the most strongly affected proteins (off-on), but the exact ratio by which this occurs cannot be determined with high accuracy.
A more interesting example is CP1, a Cathepsin L-like cysteine proteinase involved in the process of intra- and extracellular protein degradation and turnover. The protein ratio in both experiments was similar (log2 ratios of −0.84 and −0.95) as was the high S.D. (1.28 and 1.58, respectively). It is known that these proteases are activated by removal of their propeptide, comprising 150 amino acids in the N terminus (27). Indeed, in both experiments the same two N-terminal peptides exactly fitting the propeptide of CP1 displayed a more dramatic down-regulated pattern (log2 ratios −3.01 and −3.74) compared with the other peptides (−0.22 and −0.16) (supplemental Table 8). It can therefore be concluded that CP1 is activated during the MZT.
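The peptide-level comparison used for CP1 can be sketched as follows: peptides are partitioned by their position relative to a sequence boundary, and the mean log2 ratios of the two groups are compared. This is a minimal illustration, not the analysis pipeline used in this study. The peptide coordinates and ratios are hypothetical, and the boundary of 150 residues is the approximate propeptide length mentioned in the text.

```python
from statistics import mean


def split_by_boundary(peptides, boundary):
    """Partition (start, end, log2_ratio) peptides around a residue boundary.

    Returns three lists of log2 ratios: peptides ending at or before the
    boundary, peptides starting after it, and peptides spanning it.
    """
    before, after, spanning = [], [], []
    for start, end, log2_ratio in peptides:
        if end <= boundary:
            before.append(log2_ratio)
        elif start > boundary:
            after.append(log2_ratio)
        else:
            spanning.append(log2_ratio)
    return before, after, spanning


# Hypothetical peptides: (first residue, last residue, log2 ratio).
peptides = [
    (30, 45, -3.1),
    (90, 104, -3.5),
    (200, 215, -0.3),
    (260, 272, -0.1),
    (310, 325, -0.2),
]
PROPEPTIDE_END = 150  # approximate propeptide length taken from the text

before, after, spanning = split_by_boundary(peptides, PROPEPTIDE_END)
# A strongly negative difference means the N-terminal peptides are
# depleted relative to the remainder of the protein.
difference = mean(before) - mean(after)
```

A large difference between the two groups, reproduced in the label swap experiment, is the pattern that points to processing rather than to a change in abundance of the whole protein.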
Another example is the polyadenylate-binding protein PABP (log2 ratio −1.93, S.D. 2.05). A closer look at the sequence of this protein and the individual peptides that were quantified (Fig. 5) revealed that specifically C-terminal peptides exhibited a ratio deviating strongly from the other peptides. The average log2 ratio of the 13 identified peptides (Fig. 5, indicated in red) until amino acid residue 410 is −0.94, whereas the C-terminal peptides (four in total; Fig. 5, blue) showed a log2 ratio of −5.17, i.e. a 32-fold down-regulation. This indicates that PABP is C-terminally processed. This is unlikely to be the result of nonspecific proteolytic cleavage during sample handling because (i) this was the only protein in the entire data set for which we observed this phenomenon and (ii) ratios were confirmed in the label swap experiment (log2 ratio −0.82 for the N-terminal and −1.94 for the C-terminal peptides). Furthermore, this observation was validated by Western blot analysis where the presence of PABP was tested in embryos before and after MZT as well as in adult flies using three different rabbit polyclonal antibodies against PABP (Fig. 6A). Full-length PABP was present in early embryos and adult flies, but no signal was observed for full-length PABP in late embryos. Truncation of PABP as indicated by mass spectrometry would yield a product of ~46 kDa, visible as a band shift in the blot. Unfortunately, none of the three antibodies produced a signal at this position (data not shown). A likely explanation could be that these three antibodies are specific for the (possibly immune-dominant) C-terminal part of the protein. This part is heavily down-regulated (32-fold), which possibly is below the detection level of this Western blot analysis. Other proteins that were semiquantified by Western blot analysis were Rack1 and eIF3-S9 (Fig. 6B). Mass spectrometry indicated that these proteins were not regulated during MZT (log2 ratios of −0.06 and 0.02, respectively), which was confirmed by Western blot analysis (Fig. 6B).
Moreover, C-terminal truncation of PABP may serve a functional role. PABP contains four RNA recognition motifs located on the N-terminal part of the protein with the last domain ending at amino acid 362 (Fig. 5, highlighted in gray). In addition, there is a polyadenylate motif at the C-terminal end of this protein (ranging from amino acid 552 to 629; Fig. 5, highlighted in orange). Through these motifs, PABP facilitates the formation of the "closed loop" structure of the mRNA-protein complex that is essential to stimulate translation. More specifically, the RNA recognition motifs of PABP interact with poly(A) tails of transcripts, whereas the C-terminal domain is used to interact with factors regulating polyadenylation, deadenylation, and translational activities (28). It is this latter domain that is more down-regulated than the other part of the protein containing the RNA recognition motifs, indicating that the polyadenylate domain is eliminated by post-translational processing. Potential biological implications of this event are discussed below.

DISCUSSION
Because of the notion that early embryonic development is determined to a large extent by maternal mRNA deposited in the oocyte, there has been great interest in investigating the relationship between gene expression and establishment of embryonic organization. In the first place, this concerns the identity of the gene products involved in governing these processes but extends to the mechanism whereby gene expression is controlled in space and time. It has been known for a long time that embryos contain a large number of maternally derived mRNAs during the initial stages of development (29), whereas early proteomics studies showed that not all transcripts were translated, indicating that they must be controlled at the translational level (30). Since the introduction of genomics and imaging techniques it has become possible to study this in far greater detail, and now for the first time, we complement these studies by a large scale quantitative proteomics approach.
To successfully investigate the MZT in Drosophila embryos at the proteome level, we applied an approach that allows the identification and accurate quantitation of multiple proteins in a single experiment. This proteomics approach includes the stable isotope labeling of all proteins in fruit flies in vivo by the incorporation of heavy nitrogen (15N) (17). Two-dimensional peptide separation using SCX and reversed-phase chromatography combined with high accuracy mass spectrometry resulted in the identification of 2,736 proteins with a false-positive rate of <1%. Of these proteins, 1,737 proteins were quantified in two independent experiments where isotope labels were swapped. These data now allow us to distinguish and to relatively quantify proteins that are expressed before and after the MZT. This thereby provides a strong indication of which proteins are of maternal origin and which ones are (also) expressed by the zygote. Especially the latter class of proteins is of primary interest because they define the earliest products of zygotic translation, initiating processes further downstream. In our data set, these proteins are defined by their elevated expression after the MZT.
We classified a total of 342 proteins whose abundance increased during the MZT, indicating that, at least in part, they were translated in the zygote. We cannot determine to what extent this compensates for protein degradation that may occur, yet we can conclude these proteins are the product of zygotic translational activity. Also, we cannot discriminate whether this occurs from zygotic transcripts or from maternal transcripts that are silenced and stabilized during the first few hours of development and are expressed after the MZT. Another six proteins were identified exclusively after the MZT, indicating that these are purely zygotic and have no maternal protein contribution. It cannot be excluded, however, that maternal transcripts, such as Hiiragi and Frazzled, are silenced during the first few hours of development. Previous genomics studies revealed a large number of zygotic genes that were classified either based on cellular localization (nuclear or cytoplasmic) (25) or based on the location on chromosome-ablated mutants compared with wild-type embryos (6). Only 13 products were classified as zygotic in all three studies, whereas 26 and 16 overlap between our study and those by De Renzis et al. (6) and Lécuyer et al. (25), respectively (Fig. 3). Several reasons could account for the small overlap, related primarily to the fact that widely different techniques were used. Proteomics studies, as presented here, tend to be biased toward abundant proteins unlike genomics experiments such as microarrays. Some transcripts classified as zygotic by De Renzis et al. (6) (eight transcripts) and Lécuyer et al. (25) (four transcripts) were found to be down-regulated in our study, suggesting maternal expression. Alternatively, the decrease in protein level could be the combined effect of zygotic translation and (stronger) protein degradation. Finally, it is important to note that no direct relation may be expected from the genomics studies (6,25) and ours because transcript and protein levels do not necessarily correlate (see Fig. 4).
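The kind of overlap comparison between the three studies summarized in Fig. 3 can be expressed with simple set operations. The sketch below is illustrative only: the gene identifiers are placeholders rather than data from any of the studies, and the function name is hypothetical.

```python
def study_overlaps(ours, study_a, study_b):
    """Count zygotic gene products shared between our study and two others."""
    return {
        "ours_and_a": len(ours & study_a),
        "ours_and_b": len(ours & study_b),
        "all_three": len(ours & study_a & study_b),
    }


# Placeholder identifiers, for illustration only.
ours = {"g1", "g2", "g3", "g4"}
study_a = {"g2", "g3", "g5"}
study_b = {"g3", "g4", "g6"}

overlap = study_overlaps(ours, study_a, study_b)
```

Note that such counts depend on how identifiers are mapped between protein and transcript data sets, which is one further reason why overlaps between proteomics and genomics studies can be small.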
Functional classification of both maternal and zygotic proteins covered a wide variety of functional categories (Table I).
A group of proteins enriched in early embryos and found in the category "protein-RNA complex assembly" were RNA-binding proteins. An interesting example is Oskar, which is responsible for setting up body axes and abdominal development together with mediators that are involved in regulating silenced localization and expression of Osk mRNA (31). These factors include CUP, Ypsilon Schachtel (YPS), and ME31B, all of which were identified and strongly down-regulated in this study (Table II). Once localized to and translated at the posterior pole, OSK has the ability to interact with the RNA binding domain of Smaug (32,33) preventing Smaug from binding nanos (nos) mRNA and subsequent deadenylation. This results in nos stability and translation at the posterior pole (34), whereas unlocalized and silenced nos in the bulk cytoplasm of the embryo is subjected to degradation (8,35).
The translational regulators discussed above (YPS, ME31B, TRAL, CUP, EIF4E, PABP, and SMG) are among the most severely down-regulated proteins in our study, suggesting shutdown of the translational silencing mechanism. Remarkably, their respective transcript levels all remain relatively constant (Fig. 4). This observation may be explained by a post-transcriptional regulatory mechanism that stabilizes transcripts while inhibiting translation. This could be mediated by miRNAs, which bind to the 3′ untranslated regions of specific mRNAs and inhibit their translation. Gene silencing by miRNAs is active in Drosophila and is indicated in this study by the distinct up-regulation of Argonaute-1, which is critically involved in miRNA-mediated cleavage of target mRNAs. Nakahara et al. (36) searched for miRNA targets in Drosophila oocytes by comparing wild-type oocytes to null dicer-1 oocytes in which the biogenesis of miRNAs is blocked. Forty-one proteins were found (of which 22 were identified), and 18 candidates were presented whose expression was elevated in the dicer mutant, suggesting they are targets of miRNAs. Of these targets, 15 were identified in our study as well, of which only ME31B was down-regulated. Although it cannot be excluded that, apart from ME31B, other translational regulators are also miRNA targets, there is no direct evidence that this is indeed the case.
In this study we identified at least two proteins that were post-translationally processed, CP1 and PABP. We found that CP1, or Cathepsin L, lost its N-terminal propeptide up to residue 153, indicating that this protease was activated during the MZT. Cathepsin L is a lysosomal protease (37) but has also been shown to localize to the nucleus (38). A Cathepsin L isoform that is devoid of a signal peptide localizes to the nucleus in S phase and processes the CDP/Cux transcription factor (38). Interestingly it was shown very recently that activated Cathepsin L proteolytically cleaves the N-terminal tail of Histone 3 during differentiation of mouse embryonic stem cells, indicating that Cathepsin L plays a role in development (39). It is an intriguing possibility that this may be the case in fly development as well.
A second protein that was processed during the MZT was PABP, eliminating the conserved C-terminal protein interaction domain. PABP is critically involved in mRNA translation and stabilization in stress granules. This is achieved through domains in its N-terminal part that bind poly(A) tails of target transcripts and a C-terminal motif that can interact with factors regulating translation initiation and termination, polyadenylation, and deadenylation (28,40). By these specific binding properties, PABP bridges 3′ poly(A) tails of target mRNAs with components of the translational initiation machinery (e.g. EIF4G) at the 5′ end and thereby forms so-called "closed loop" structures that stimulate translation (28,41). The striking observation indicating the loss of the C-terminal protein interaction domain in PABP was further tested by Western blotting, which showed the complete disappearance of full-length PABP (Fig. 6). Because mass spectrometry had indicated only partial loss of the protein, we used three different antibodies in an attempt to show the formation of a truncated form. However, we failed to observe a band shift for any of the antibodies. Apparently, they all recognize epitopes in the C-terminal part of PABP. Although the Western blots suggest total disappearance of PABP, which would be an exciting observation in itself, we can now draw a more subtle conclusion based on mass spectrometric data indicating that PABP is C-terminally processed, in fact leaving the largest part of the protein unchanged.
Given the variety of proteins recruited by the C-terminal domain, the functional consequences of its deletion are hard to predict, although one could envisage effects on translational activity, heterogeneous nuclear ribonucleoprotein integrity, mRNA stability, or a combination of these. Further studies should reveal whether truncation of PABP would abolish interaction with associated proteins, whether the mRNA loop structure would be maintained, and whether occupation of the poly(A) tails of mRNA would be preserved. This could result in stalled translation but could also affect mRNA degradation by preventing interaction with the CCR4-POP2-NOT deadenylase complex and hence shortening of the poly(A) tail. Clearly, it would require detailed mechanistic studies to fully resolve this.
Apart from a functional role for truncated PABP, another question that remains is how this product is formed. Remarkably, proteolytic cleavage of PABP is not unprecedented because a number of viral proteases (poliovirus 2A and 3C proteases and coxsackievirus B3 2A protease) have been shown to specifically cleave PABP in the C-terminal domain (21,42). Interestingly, proteolytic cleavage was accompanied by loss of translational activity (42), suggesting that translation could be inhibited in Drosophila as well if a similar endogenous mechanism indeed exists. Although we cannot exactly pinpoint the site of cleavage, it is restricted to a well defined region (between residues Arg-378 and Phe-411) based on quantitative data of individual peptides in PABP (Fig. 5). It is striking to note that the viral proteases cleave exactly in this region (21), suggesting a large degree of conservation. We have searched for proteins homologous to the viral proteases in the Drosophila genome, but obvious candidates have not been found. Altogether, it remains speculative whether cleavage of PABP has a specific role in the expression profiles of SMG, ME31B, CUP, YPS, TUD, and PATH (both at the level of transcript and protein; Fig. 4) or in the context of disintegration of granules or even developmental progression in general.
In conclusion, our quantitative proteomics approach using in vivo labeling of fruit flies with stable isotopes combined with extensive analysis by LC-MS/MS has permitted the relative quantitation of thousands of proteins during early embryonic development. The method proved highly accurate and reproducible and has revealed detailed information giving insight that cannot be obtained from genomics approaches. This clearly demonstrates the contribution of quantitative proteomics to our understanding of early fly development. Although we focused our study on early fly development, the quantitative proteomics technique used is applicable to any other process where changes in protein expression are to be studied (development, environmental condition, or mutation). Furthermore, the method for metabolic labeling should be easily adoptable by many fly laboratories because it requires only minor adaptations compared with routine protocols for growing flies. Therefore, we expect this approach to find broad application in fly developmental biology and beyond.
Acknowledgments. We thank Dr. Kent Duncan for providing antibodies against PABP, eIF3-S9, and Rack1 and Prof. Sonenberg for the kind gift of the anti-PABP antibody.