Quantitative Comparison of Caste Differences in Honeybee Hemolymph

The honeybee, Apis mellifera, is an invaluable partner in agriculture around the world both for its production of honey and, more importantly, for its role in pollination. Honeybees are largely unexplored at the molecular level despite a long and distinguished career as a model organism for understanding social behavior. Like other eusocial insects, honeybees can be divided into several castes: the queen (fertile female), workers (sterile females), and drones (males). Each caste has different energetic and metabolic requirements, and each differs in its susceptibility to pathogens, many of which have evolved to take advantage of the close social network inside a colony. Hemolymph, arthropods’ equivalent to blood, distributes nutrients throughout the bee, and the immune components contained within it form one of the primary lines of defense against invading microorganisms. In this study we have applied qualitative and quantitative proteomics to gain a better understanding of honeybee hemolymph and how it varies among the castes and during development. We found large differences in hemolymph protein composition, especially between larval and adult stage bees and between male and female castes but even between adult workers and queens. We also provide experimental evidence for the expression of several unannotated honeybee genes and for the detection of biomarkers of a viral infection. Our data provide an initial molecular picture of honeybee hemolymph, to a greater depth than previous studies in other insects, and will pave the way for future biochemical studies of innate immunity in this animal.

vitellogenin, a major nutrient storage protein in females thought to serve as a nutrient reserve and lipid carrier (4,5). The fluid also contains components of the innate immune system, such as macrophage-like cells (hemocytes) (6), antimicrobial peptides (7), and prophenoloxidase for the encapsulation of pathogens (8).
The innate immune system of insects holds many similarities to innate immunity in mammals, yet the study of hemolymph has lagged far behind that of mammalian serum, largely because of the enormous effort and resources invested in the search for disease biomarkers in humans. In the last 4 years only a handful of studies have used mass spectrometry to characterize hemolymph peptides and proteins in insects. Drosophila melanogaster is the best studied among them, as researchers have worked to elucidate the hemolymph proteomes of healthy flies (9-11) as well as postinjury (12) and immune-challenged flies (13-16). Mass spectrometry-based hemolymph studies have also been conducted for species as diverse as Helicoverpa armigera (moth) (17), Bombyx mori (silkworm) (18), Amblyomma hebraeum (tick) (19), and Anopheles gambiae (mosquito) (20,21).
Through its immense impact on flower pollination, Apis mellifera (honeybee) has arguably the biggest impact on humanity of any member of class Insecta, even more than A. gambiae, the malarial vector. A study from Cornell University estimated that the yearly value of pollination by honeybees for the United States alone is 14.6 billion dollars (22). Domestic consumption and export of honey is a profitable industry, worth more than 250 million dollars per year and on the rise (2003 statistics, United States Department of Agriculture) (23). Although the social behavior of honeybees has been well studied, little is known about their biochemistry. On a systems biology level, recent transcriptome analyses (24-27) are the first inroads to studying the expression patterns of numerous low abundance proteins in A. mellifera but have been applied specifically to brain or whole body and not to hemolymph.
The honeybee genome sequencing and annotation effort is nearing completion, making effective proteomic analysis of this animal possible. Partial proteomes of royal jelly and pollen (28) and of bee venom (29) have been analyzed mainly by two-dimensional gel separation followed by MALDI-TOF, resulting in three and nine unique protein identifications, respectively. Biochemical analysis of honeybee hemolymph over the past 40 years has focused on amino acid content, sugar metabolism, and the correlation of juvenile hormone to vitellogenin protein content, but beyond these two polypeptides there is very limited knowledge about the protein content of this fluid. The overall aim of this study was to expand our knowledge of honeybee hemolymph proteins beyond the very small number that are already known as a prelude to future biochemical studies of innate immunity in bees. Furthermore we measured caste- and stage-specific protein differences in this critical fluid, providing an initial molecular understanding of the susceptibility of specific stages or castes to major honeybee diseases.
Hemolymph Collection-Drones, hive workers, worker larvae, and three queens were collected from colonies maintained at the University of British Columbia, Canada. An additional queen was acquired from a feral colony in Vancouver, Canada. Samples were taken in early autumn and spring. Additional worker larvae samples were collected at the Agriculture Canada Research Station in Beaverlodge, Alberta, Canada. For collection from all adult castes, the bee was held between soft forceps, trapping its wings against the thorax, while a glass disposable 5-µl Microcap pipette was inserted dorsally into the bee between the second and third abdominal terga from the thorax. Clear and slightly yellow hemolymph was drawn out by capillary action. If cloudy yellow intestinal contents were taken, the sample was discarded. Hemolymph was dispensed into a 1.5-ml microcentrifuge tube containing 100 µl of PBS with EDTA-free protease inhibitor mixture kept chilled on ice. For collection of larval hemolymph, 5-day-old (estimated) larvae were taken and washed in 10 ml of PBS to reduce contamination from worker jelly. A small incision was then made two-thirds of the way down one side of the larva, and milky white hemolymph was drawn as described above. Inspection of larval hemolymph by light microscopy revealed that the cloudiness of the liquid was due to a high number of hemocytes and prohemocytes. The microcentrifuge tubes were centrifuged at 16,100 × g in a microcentrifuge (Eppendorf) for 10 min at 4°C to pellet these cells and other debris, which were discarded. For protein identification, hemolymph from three to six workers or drones was pooled, while for quantitation, three samples of two bees from each caste were analyzed independently. Hemolymph from the queens was not pooled at all. The protein concentration of clarified hemolymph was assayed using Coomassie Plus Protein Assay reagent (Pierce) according to the manufacturer's instructions, and the hemolymph was stored at −20°C.
Electrophoresis-Hemolymph proteins from adult queens, drones, workers, and worker larvae were resolved on precast 4-12% NuPAGE gels in reducing conditions with MES buffer according to manufacturer's instructions (30 or 60 µg/lane). Blue-silver stain (30) was used to visualize protein bands, and then each lane was cut into between 20 and 30 slices of approximately equal staining.
Proteolytic Digestion-For in-solution digestion of hemolymph samples, proteins were precipitated in ethanol/sodium acetate (31) and resuspended in 6 M urea, 2 M thiourea, 20 mM Tris-HCl, pH 8.0 (urea/thiourea solution was deionized prior to buffering and then stored in aliquots at −80°C). In-gel (32) and in-solution (31) digestions were performed essentially as described previously. After acidification of peptides with 1% trifluoroacetic acid, 0.5% acetic acid, and 3% acetonitrile (Sample Buffer), a volume corresponding to 2 µg of total protein was purified and concentrated on STop And Go Extraction (STAGE) tips (33); eluted in 80% acetonitrile, 0.5% acetic acid; dried in a vacuum concentrator (Eppendorf); and resuspended in Sample Buffer.
Liquid Chromatography-Tandem Mass Spectrometry-Peptide samples from in-gel digestions were separated by HPLC using a C18 column (Dionex) coupled to a Q-TOF hybrid instrument (QSTAR, Pulsar i, Applied Biosystems, Foster City, CA). Larval samples collected in Beaverlodge were analyzed on a linear trapping quadrupole-Fourier transform mass spectrometer (LTQ-FT, Thermo Electron, Bremen, Germany), and all quantitative analysis was performed on an LTQ-Orbitrap (Thermo Electron). The QSTAR was on-line coupled to a Dionex UltiMate capillary flow HPLC instrument using a nanospray ionization source (Proxeon Biosystems, Odense, Denmark). Precolumns were only used with the QSTAR, and the analytical column used with the QSTAR was 15-cm-long, 75-µm-inner diameter fritted fused silica prepacked with 3-µm C18 beads (Dionex UltiMate). For QSTAR analyses buffer A consisted of 0.1% formic acid, and buffer B consisted of 0.1% formic acid in acetonitrile. Gradients were run from 2% B to 30% B over 45 min and then ramped up steeply to 80% B for 10 min to wash the column before reconditioning with 2% B. Data-dependent acquisition settings were set as described previously (34). The LTQ-FT and LTQ-Orbitrap systems were on-line coupled to Agilent 1100 Series nanoflow HPLC instruments using nanospray ionization sources (Proxeon Biosystems) holding columns packed into 15-cm-long, 75-µm-inner diameter fused silica emitters (8-µm-diameter opening, pulled on a P-2000 laser puller from Sutter Instruments) using 3-µm-diameter ReproSil Pur C18 beads. For both instruments buffer A consisted of 0.5% acetic acid, and buffer B consisted of 0.5% acetic acid and 80% acetonitrile. Gradients were run from 6% B to 30% B over 60 min, then 30% B to 80% B in the next 10 min, held at 80% B for 5 min, and then dropped to 6% B for another 15 min to recondition the column.
The LTQ-FT was set to acquire a full range scan at 25,000 resolution in the FT from which the three most intense multiply charged ions per cycle were isolated for fragmentation in the LTQ. At the same time selected ion monitoring scans in the FT were carried out on each of the same three precursor ions exactly as described previously (35). The LTQ-Orbitrap was set to acquire a full-range scan at 60,000 resolution from 350 to 1500 Th in the Orbitrap and to simultaneously fragment the top five peptide ions in each cycle in the LTQ. Because Orbitrap data were to be used for quantitation, blank gradients where buffer B was injected were interspersed between analytical gradients to eliminate carryover.
Analysis of Mass Spectrometry Data-Centroided fragment peak lists were processed to Mascot generic format using the vendor-provided Extract_MSN.exe (Thermo Electron) for LTQ-FT/Orbitrap data or Mascot.dll (Applied Biosystems) for QSTAR data. Monoisotopic peak and charge states in LTQ-FT/Orbitrap data were corrected using DTA Supercharge, part of the MSQuant suite of software (msquant.sourceforge.net). Using Mascot version 2.1 (Matrix Science), peak lists were searched against a protein database composed of predicted proteins from the Amel_2.0 genome assembly (9759 sequences) compiled by the National Center for Biotechnology Information (ftp.ncbi.nih.gov/genomes/Apis_mellifera/special_requests/NCBI-prot.fa.gz) as well as sequences for human keratins, porcine trypsin, and lysyl endopeptidase. The following criteria were used in the Mascot search for QSTAR data: trypsin cleavage specificity with up to one missed cleavage (35), cysteine carbamidomethyl fixed modification, no variable modifications, ±0.2-Da peptide tolerance and MS/MS tolerance, and ESI-Q-TOF fragmentation scoring. For LTQ-Orbitrap and LTQ-FT spectra, peptide tolerance and MS/MS tolerance were set at 10 ppm and 0.8 Da, respectively, and the scoring scheme used was ESI-TRAP. MSQuant version 1.4.0a17 (36) was used for parsing Mascot files and iterative mass recalibration. Another Mascot search was conducted against the MSDB database (Viridae taxonomy) for potential honeybee disease biomarkers using the same conditions as above. For data from in-gel digested samples, additional searches were made with X!Tandem (37) using the same criteria as used for Mascot searches with the Q-TOF (100 ppm) as the predefined scoring method.
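A relative tolerance such as the 10 ppm peptide tolerance above corresponds to a different absolute window at each mass value. The following is a minimal sketch of that conversion; the function names `ppm_to_da` and `within_tolerance` are hypothetical helpers for illustration and are not part of Mascot or MSQuant.

```python
def ppm_to_da(mass: float, ppm: float) -> float:
    """Absolute window (same unit as `mass`) for a relative ppm tolerance."""
    return mass * ppm / 1e6


def within_tolerance(observed: float, theoretical: float, ppm: float) -> bool:
    """True if the observed value lies within +/- ppm of the theoretical value."""
    return abs(observed - theoretical) <= ppm_to_da(theoretical, ppm)


if __name__ == "__main__":
    # Width of a 10 ppm window at a few illustrative mass values
    for mass in (500.0, 1000.0, 2000.0):
        print(f"{mass:7.1f}: +/- {ppm_to_da(mass, 10):.4f}")
```

At a mass of 1000 the 10 ppm window is ±0.01, which is 20 times narrower than the ±0.2-Da window used for the QSTAR data.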
After removing peptides from keratins, trypsin, and Lys-C, peptides from all honeybee samples were compiled, redundancies were removed, and then a BLAST analysis was performed (38) (word size, 2; scoring matrix, PAM30; maximum expectation value, 20,000) against the same honeybee protein database as used above to arrive at a final, non-redundant protein list (39). In-house scripts were used to extract protein matches from the BLAST output file with a minimum identity of 99.9% and to find the percent sequence coverage of each protein. At least two peptides over seven residues in length and with a Mascot score greater than 25 or an X!Tandem expectation value less than 0.1 were required to consider a protein hit significant. To estimate the frequency of false-positive identifications using these criteria, all spectra were searched again by Mascot against a database in which all honeybee protein sequences were reversed.
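The acceptance criteria above can be expressed as a simple filter. This is a sketch only and not the in-house scripts used in the study: the `PeptideHit` record and both function names are hypothetical, and "over seven residues" is read here as strictly more than seven.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class PeptideHit:
    """Hypothetical record for one peptide-spectrum match."""
    protein: str
    sequence: str
    mascot_score: Optional[float] = None    # None if not found by Mascot
    xtandem_expect: Optional[float] = None  # None if not found by X!Tandem


def peptide_passes(hit: PeptideHit) -> bool:
    """More than seven residues and Mascot score > 25 or X!Tandem expectation < 0.1."""
    if len(hit.sequence) <= 7:
        return False
    mascot_ok = hit.mascot_score is not None and hit.mascot_score > 25
    tandem_ok = hit.xtandem_expect is not None and hit.xtandem_expect < 0.1
    return mascot_ok or tandem_ok


def significant_proteins(hits: Iterable[PeptideHit], min_peptides: int = 2) -> Set[str]:
    """Proteins supported by at least `min_peptides` distinct passing peptides."""
    peptides = defaultdict(set)
    for hit in hits:
        if peptide_passes(hit):
            peptides[hit.protein].add(hit.sequence)
    return {prot for prot, seqs in peptides.items() if len(seqs) >= min_peptides}
```

Counting distinct sequences rather than spectra keeps a protein from qualifying on repeated observations of a single peptide.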
To identify unannotated peptides in the honeybee genome, all fragment spectra that did not match to a peptide in the initial Mascot searches described above were collected and used again to search against all six translated open reading frames in the Amel_2.0 genome assembly (228,567,597 base pairs, ftp.ensembl.org/pub/current_apis_mellifera/data/fasta/dna/Apis_mellifera.AMEL2.0.apr.dna.contig.fa.gz). Only those spectral matches with scores greater than 33 (the 99% confidence limit calculated by Mascot) were considered further. To access mapped peptides, go to the Ensembl main page for honeybees (www.ensembl.org/Apis_mellifera/index.html) and in the "Region" box type in a region of choice in the genome, which will be displayed as "ContigView." Under the "detailed view" window, click on "DAS Sources" in the title bar and select "Manage sources . . . ". On the left side of the page, click on "Add Data Source." In the "DAS Server URL" dropdown menu, select "das.ensembl.org/das/," and then in the box directly below enter "hydraeuf_00001563" for DAS source name and click "Next." Choose "Next" and "Finish" on the following two pages. Upon viewing the genome again in "ContigView," the newly added "hydraeuf_00001563" data can be selected in the "DAS Sources" drop-down menu. Once the page reloads, mapped peptides will be displayed in their own separate track titled "hydraeuf_00001563."
MSQuant was used to correlate elution times between parallel analyses of in-solution digested hemolymph samples and then to calculate peptide ion volumes (in Th·s) as described previously (40) with manual validation of each value. To ensure the reliability of measured values, only proteins having at least three quantifiable peptides in each of three experiments were considered.
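The genomic search of unmatched spectra described above depends on translating each contig in all six reading frames (three per strand). In this study the search itself was run with Mascot, so the following is only an illustrative sketch of the translation step using the standard genetic code; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna: str) -> dict:
    """Return translations keyed '+1'..'+3' (forward) and '-1'..'-3' (reverse strand)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

Stop codons are kept as `*` so that each frame can later be split into open reading frames before in silico digestion.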
Gene Ontology-Gene ontology (GO) (41) assignments were made using Blast2GO (42). The BLASTp searches were done against the non-redundant database with an expectation value maximum of 1e-3 and high scoring segment pair length cutoff of 33. Annotation was made using the following criteria: pre-eValue-Hit-Filter, 1e-6; pre-Similarity-Hit-Filter, 15; Annotation Cutoff, 55; GO Weight, 5. Directed acyclic graphs were constructed using a sequence filter of 7, score α of 0.6, and node score filter of 0. From these graphs, proteins were categorized under Molecular Function and Biological Process using terms from level 3 of the graph.

RESULTS
Hemolymph Extraction-The most straightforward and well established procedure for extracting hemolymph from adult bees is to insert a small glass capillary under the second abdominal segment (Fig. 1a) (43). This method appeared to yield hemolymph relatively uncontaminated by environmental proteins (see below). On the other hand, the equivalent procedure on larvae too often resulted in fecal contamination, especially in older larvae, so instead an incision was made on the left-hand side of the larvae, and the resulting hemolymph was collected in a glass capillary (Fig. 1b). Initial hemolymph extracts from worker larvae were heavily contaminated with major royal jelly proteins (MRJPs) to the extent that even vitellogenin, the protein that should be most abundant, was difficult to detect (results not shown). MRJPs were likely present because bee larvae are immersed in jelly from the time the egg hatches until pupation. Several variations of washing conditions were explored, but simply washing the larvae in PBS for 2 min effectively reduced the MRJP contamination to levels seen in adult workers and drones (see "Experimental Procedures"). Adult workers and drones likely retain some MRJPs on their exterior from their larval and pupal stages and may acquire some through their diet (44), but interestingly no MRJPs were detected in queen samples. This may reflect the constant grooming of the queen by attendant workers (45) that would clear away any residual proteins. Alternatively the longer lifespan of the queen may simply allow more time for such proteins to be removed. Finally because these proteins were found on worker larvae it suggests that "major royal jelly proteins" are more generic than their name suggests because "royal jelly" implies it is fed only to queen larvae.
Qualitative Analysis of Caste and Stage Differences in Hemolymph Proteins-Our initial hypothesis when starting these experiments was that we would only find very subtle differences in protein profiles of different castes, the reason being that sera from male and female humans are essentially indistinguishable by normal proteomics methods until one drills far down to the abundance levels of hormones. Straightforward gel electrophoresis of hemolymph proteins from different honeybee castes and worker larvae disproved this hypothesis immediately (Fig. 1c). Intensely stained bands were likely to be the main previously characterized proteins in insect hemolymph, vitellogenin, apolipophorin, and hexamerins, but the entire lane of each sample was excised and divided into slices for analysis by LC-MS/MS in parallel with in-solution digests of each sample. The data were analyzed using two search engines and a cascade of successive searches against various databases as summarized in Fig. 1d. All data obtained from the QSTAR were searched against the latest honeybee protein sequences using both Mascot and X!Tandem to obtain greater coverage. It has been reported by others (46) that Mascot reported more protein hits than X!Tandem for a given set of spectra, but the two methods were largely congruent (Fig. 2) (47). By visual inspection, those spectra assigned by only one of the search engines (see "Experimental Procedures") appeared to be good matches, so the reasons they were not picked up by the other engine were not clear. In total we identified 3657 unique peptides (Supplemental Table S1) satisfying the criteria for significance (see "Experimental Procedures"). After excluding peptides for contaminant keratins and digestion enzymes as well as matches to highly identical proteins, 324 unique A. mellifera proteins were considered identified.
Using the reversed database method (48), the probability of false-positive protein hits using these criteria was calculated at 0.5%, or 1.7 (~2) proteins. The sequence coverage of each protein hit (Supplemental Table S2) gives a semiquantitative measure of the level of each protein in hemolymph (49,50), and in agreement with band intensities from one-dimensional gel electrophoresis, proteins with the greatest coverage are vitellogenin, apolipophorin, and hexamerins. The overall overlap between proteins expressed in adult queens, drones, and workers (Fig. 3) was only 42%, and fewer proteins were found in hemolymph from female castes (183 for queens and 204 for workers) than from drones (252). All proteomics data described here can be found in on-line supplemental material and on our Website at www.proteomics.ubc.ca/foster/data.php.
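The reversed database estimate can be sketched as follows. This is an illustration of the general approach and not the exact calculation behind the figures above: the function names are hypothetical, and the rate is simply taken as decoy hits divided by target hits.

```python
def reverse_database(proteins: dict) -> dict:
    """Build a decoy database by reversing every protein sequence."""
    return {f"REV_{name}": seq[::-1] for name, seq in proteins.items()}


def decoy_false_positive_rate(decoy_hits: int, target_hits: int) -> float:
    """Fraction of target hits expected to be false, estimated from decoy hits."""
    if target_hits <= 0:
        raise ValueError("target_hits must be positive")
    return decoy_hits / target_hits


def expected_false_positives(n_proteins: int, rate: float) -> float:
    """Expected number of false-positive proteins at a given rate."""
    return n_proteins * rate


if __name__ == "__main__":
    # A rate of 0.5% applied to the 324 identified proteins
    print(expected_false_positives(324, 0.005))
```

With the rounded rate of 0.5%, the 324 identified proteins give about 1.6 expected false positives, in line with the roughly two proteins reported.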
Honeybee Biomarkers-Proteomics holds great promise for identifying indicator proteins that would allow the easy diagnosis of a disease state. To date efforts toward identifying such biomarkers in mammalian systems have fallen flat (51), but there is no reason to expect that biomarkers might not be found in other species. Toward this end, fragment spectra from hemolymph samples were also searched against all viral sequences, and one drone sample returned a very strong match for the deformed wing virus (DWV) polyprotein (AAP49283), a virus that can infect the honeybee asymptomatically. Although the protein has a nominal mass of 332 kDa, peptides were detected across nearly the entire length of the gel lane, consistent with reports of autocleavage (52). Seven peptides meeting our acceptance criteria ("Experimental Procedures" and Supplemental Table S3) were fragmented, accounting for 2% sequence coverage, one of which (Fig. 4) was critical to differentiate DWV from a virus with 95% identity and an identical length: the Varroa destructor virus (VDV) (AAP51418 (53)). VDV infects a honeybee mite, Varroa jacobsoni (54), and not honeybees directly so it was unlikely that VDV would be in the hemolymph sample, but this discriminating peptide solidifies this conclusion.
Gene Ontology-The assignment of putative function to proteins by GO helps the interpretation of large proteomics datasets, especially for an organism such as A. mellifera for which knowledge about individual proteins is limited. GO terms for hemolymph proteins were assigned using Blast2GO, resulting in 73% of the proteins being assigned at least one GO term. This effort helped rationalize the presence of proteins with specialized functions in relation to caste-specific behaviors, growth and development, and immune functions (Supplemental Table S4).
We used Blast2GO to organize the Molecular Function and Biological Process GO term assignment(s) of each hemolymph protein into a directed acyclic graph, which groups GO terms that have highly specific meanings under terms with broader meanings. We selected terms at the third level of these graphs and plotted the number of proteins under each term (Fig. 5, a and b, and Supplemental Table S5). Because GO assignments to Cellular Component could not be made for the majority of proteins, this ontology was not explored further. Approximately 62% of all proteins could be incorporated into at least one of these graphs. Third level terms that are most represented are cellular physiological process (GO:0050875) and metabolism (GO:0008152), although more than three-fourths of the proteins in each are found under both terms. Adult drones and worker larvae appear to have an excess of proteins with these biological functions, but this may be due to the overall greater protein variety detected in these bees. Drones also seemed to have many more proteins with hydrolase activity (GO:0016787). The reduced quantity of major hemolymph proteins such as vitellogenin and apolipophorin in drones effectively narrowed the dynamic range of protein concentration, which likely resulted in increased sensitivity to lower abundance proteins.
FIG. 2. X!Tandem versus Mascot. Fragment spectra acquired on the QSTAR were searched against the A. mellifera protein database by both programs using a peptide significance cutoff of ion score 25 and "require bold red" for Mascot (right) and log(e) ≤ 1 for X!Tandem (left). A minimum of two peptides per protein from either search was required for a significant hit.
Quantitative Differences in Hemolymph-Electrophoretic separation of hemolymph proteins revealed some striking differences between castes and between adults and larvae. For proteins that are too low in abundance to be visualized by gel staining methods, sequence coverage can serve as a useful estimation of their levels. For example, we were able to see interesting differences in the expression of proteins associated with pheromone response between castes. The antennal-specific protein 3c (NP_001011583) is an odorant-binding protein that we identified in all bees but with far greater sequence coverage in drones and workers (48%) compared with queens (8.5%). Another odorant-binding protein (XP_624854) also appeared to be more abundant in drones (22% sequence coverage) and workers (20%) than in queens (5%). We also found that the honeybee antibacterial peptide hymenoptaecin (NP_001011615 (55)) was expressed in lower amounts in the larvae compared with the adults (sequence coverage: larvae, 10%; adult queens and workers, 30%; drones, 16%).
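Percent sequence coverage, used above as a rough proxy for abundance, is the fraction of a protein's residues that fall within at least one identified peptide. A minimal sketch, assuming exact substring matches of peptides to the protein sequence (the function name is hypothetical and this is not the in-house script used in the study):

```python
from typing import Iterable


def sequence_coverage(protein: str, peptides: Iterable[str]) -> float:
    """Percentage of residues in `protein` covered by at least one peptide."""
    if not protein:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        # Mark every occurrence of the peptide, including overlapping ones
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)
```

Residues are counted once even when peptides overlap, so coverage can never exceed 100%.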
However, to more rigorously quantify differences in protein abundance between castes and stages, in-solution digests of hemolymph were analyzed using the extremely high resolving power and accuracy of an LTQ-Orbitrap system. Relative peptide ion intensities in parallel LC-MS/MS analyses were the basis for quantitation as we have reported previously (39,40). Normalization of results to a housekeeping protein, such as tubulin α-1 used here, was critical to control for potentially biased quantitation in a complex protein mixture with one or two dominant proteins. We were able to compare relative expression levels for 63 worker adult and larva proteins (Fig. 6a) and 37 proteins in the adult castes (Fig. 6b) using this method. The data show that vitellogenin is expressed 20 times more highly in queens than workers, 5 times less in drones than workers, and that adult workers have at least 50-fold more vitellogenin than larvae. These values are in general agreement with a Northern blot study on vitellogenin mRNA in honeybee abdomen (56). All hexamerins (NP_001011600, XP_392869, and XP_624041) were more than 50-fold more abundant in worker larvae than in worker adults, which is similar to results from a Coomassie-stained SDS-polyacrylamide gel electrophoresis of hemolymph from workers of different ages (57). The angiotensin-converting enzyme (XP_393561) involved in a highly conserved pathway to control blood pressure (58) was in 36-fold excess in worker adults relative to larvae. This method of protein quantitation was particularly effective at comparing protein levels in different developmental stages within the worker caste, likely because they share a similar complement of proteins. We were also able to quantify a number of immunity-related proteins, such as prophenoloxidase-activating factor (XP_623150) and prophenoloxidase itself (NP_001011627), which are more abundant in adult workers compared with larvae by about 30- and at least 50-fold, respectively.
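The normalization step can be illustrated with a short sketch. It assumes one summed intensity per protein per sample, which simplifies the peptide-level ion volumes used in the study; the function names, protein keys, and numbers below are hypothetical.

```python
def normalize(intensities: dict, reference: str) -> dict:
    """Divide every protein intensity by that of the reference protein."""
    ref = intensities[reference]
    if ref <= 0:
        raise ValueError("reference intensity must be positive")
    return {protein: value / ref for protein, value in intensities.items()}


def fold_changes(sample_a: dict, sample_b: dict, reference: str) -> dict:
    """Normalized ratio a/b for proteins quantified in both samples."""
    a = normalize(sample_a, reference)
    b = normalize(sample_b, reference)
    return {p: a[p] / b[p] for p in a.keys() & b.keys() if b[p] > 0}


if __name__ == "__main__":
    # Made-up intensities; "tubulin" stands in for the housekeeping protein
    queen = {"tubulin": 2.0, "vitellogenin": 400.0}
    worker = {"tubulin": 1.0, "vitellogenin": 10.0}
    print(fold_changes(queen, worker, "tubulin")["vitellogenin"])  # 20.0
```

Dividing by the reference first removes differences in total protein loaded, so the ratio reflects relative abundance rather than sample amount.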
The β-1,3-glucan recognition protein (XP_395368) was also found to be at least 50 times more abundant in adult workers than in larvae.
Unassigned Spectra-The A. mellifera genome was only sequenced a year before this study was conducted, and there are relatively few groups working on the gene annotation of this organism. The annotation that has been done has largely been based on homology to D. melanogaster combined with modeling against honeybee expressed sequence tags, insect cDNAs, and some ab initio predictions. Despite these efforts, there are likely to be hundreds if not thousands of unannotated genes left to be identified in the honeybee genome. To this end we used all spectra that Mascot was not able to match to any peptides in the existing protein database to search the genomic sequences directly. This resulted in 958 previously unpredicted peptides, which we have now mapped onto the genome and have made available to the community through Ensembl's Distributed Annotation System (DAS, www.ensembl.org/info/data/external_data/das/index.html), accessible from Ensembl's honeybee website (see "Experimental Procedures" for instructions; www.ensembl.org/Apis_mellifera/index.html).
[Figure legend, partial] Bars represent the number of proteins in each term found in each type of bee: all bees (black, with number of proteins for that term), queen (horizontal lines), drone (white), worker adult (dark gray), and larvae (light gray).

DISCUSSION
Honeybees and ants are the two classic model systems for investigating social behavior, but fruit flies, and to a lesser extent mosquitoes, have been the insects of choice for molecular studies. The vast majority of genes in A. mellifera are only predicted, meaning that most of the gene products have never been directly observed. The data presented here provide solid experimental validation for a large number of these proteins and go further to identify other peptides outside of annotated open reading frames that appear to be expressed.
Currently there are only a handful of honeybee proteins with reports of expression levels during development or in different castes, and the proteomics data presented here support these previous findings. Vitellogenin, a largely female-specific glucolipoprotein (59), was presented as an intensely stained 180-kDa protein band in adult queens and workers and worker larvae (4, 5) but was also detected by mass spectrometry in drones (60,61). Lower levels of vitellogenin in worker larvae relative to the adult female castes agree with previous reports of its expression (61). On the other hand, hexamerins are a family of nutrient storage proteins that make up the bulk of larval hemolymph (62,63), and the queen was the only caste to retain hexamerin 70b expression during adulthood in quantities comparable to larva, also confirming previous studies (63). Apolipophorin was highly expressed to approximately equal levels in all castes.
The extremely high levels of vitellogenin, apolipophorin, and hexamerins in females are a considerable hindrance to effective MS analysis of less abundant proteins, analogous to the problems posed by albumin in human serum proteomics (64). In support of this, the protein variety detected in samples of hemolymph from females was always lowest. From a behavioral standpoint, the number of tasks accomplished by females far exceeds that of male honeybees, whose sole purpose is mating; therefore females might be expected to express a greater variety of proteins than males. Consequently the lower protein diversity detected in females compared with males is most likely explained by the insufficient dynamic range of mass spectrometers and not by a biological phenomenon. Despite the presence of a few highly abundant proteins that significantly reduced the number of protein hits, 324 proteins in total were identified in this study, which is one of the largest numbers to date for A. mellifera. Clearly, however, more work is needed to bring proteomics research for this species on par with Drosophila, where about 150 hemolymph proteins have been identified from healthy Drosophila larvae alone (9-11).
GO term mappings allowed putative assignment of immunity-related functions to A. mellifera hemolymph proteins for which there are only a few studies for the major proteins. For example, drone-specific protein (AmeLGUn_WGA-1241_2.510469.510469.p) with predicted hydrolase activity (GO:0016787) is a homolog of prostasin, a protein found in human seminal fluid (65) and the Tenebrio molitor (mealworm beetle) spermatophore (66), an organ involved in the transfer of sperm to females. Queen hemolymph was expected to include proteins related to reproduction that are not present in other caste members. One such protein (XP_395884) was predicted to be involved in chromatin silencing (GO:0006342) and oocyte maturation (GO:0001556). Both adult queen and worker larva hemolymph contained a protein (XP_392479) with a role in oocyte microtubule cytoskeleton polarization (GO:0048129). As expected, worker larvae hemolymph contained far more enzymes involved in making simple biomolecules than adult workers, including those needed for the synthesis of thymidylate (XP_624530, GO:0006233, GO:0006235), steroids (XP_397214, GO:0006694), fatty acids (XP_396268, GO:0006633), and arginine and glutamine (XP_393888, GO:0006541, GO:0006526). Perhaps a better indication of rapid tissue growth is the presence of an elongation factor (XP_623682, GO:0006414) sequenced only in larva hemolymph. Synthesis of these basic building blocks of life would be required throughout all stages of development so these proteins may exist in adult bees as well but just at lower levels. Sequence coverage is directly correlated with peptide fragmentation efficiency, which in turn depends on the amino acid sequence of the peptides, so this method is, at best, a semiquantitative measure of abundance (50).
As long as sampling of different conditions is consistent, however, then sequence coverage is a valid method for comparing the relative abundance of proteins in different samples especially because it is available to all tandem mass spectrometers without any additional preparative work to the samples or complex data analysis software. Relative peptide ion intensities, on the other hand, infer protein quantities by directly comparing identical peptides for a given protein found between samples, relying on high mass accuracy of a parent peptide ion obtained using the LTQ-Orbitrap and highly reproducible LC to find the elution peak for a peptide in a parallel analysis even if it is not sequenced. The precision of this method suffers somewhat because the ionization environment for a given peptide in two different samples will not be identical, and so the MS detector response may vary. Therefore, we only considered proteins for quantitation by this method when at least three peptides were observed in each analysis, ignoring all other proteins that did not meet this criteria.
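The two measures described above can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study: the function names, the input layout, the use of the median to combine per-peptide ratios, and the reading of the three-peptide rule as "at least three peptides quantified in both runs" are all assumptions made for illustration.

```python
from statistics import median


def sequence_coverage(protein_length, peptide_spans):
    """Fraction of residues covered by at least one identified peptide.

    peptide_spans holds (start, end) residue positions, 1-based and inclusive.
    Overlapping peptides are counted once per residue.
    """
    covered = set()
    for start, end in peptide_spans:
        covered.update(range(start, end + 1))
    return len(covered) / protein_length


def intensity_ratio(run_a, run_b, min_peptides=3):
    """Relative abundance (A/B) of one protein from peptide ion intensities.

    run_a and run_b map peptide sequence -> ion intensity for the same protein
    in two LC-MS runs. Only peptides with a positive intensity in both runs are
    compared. Returns None when fewer than min_peptides are shared
    (assumed interpretation of the three-peptide criterion).
    """
    shared = [p for p in run_a if run_b.get(p, 0) > 0 and run_a[p] > 0]
    if len(shared) < min_peptides:
        return None
    # Median of per-peptide ratios: an assumed, outlier-tolerant summary.
    return median(run_a[p] / run_b[p] for p in shared)
```

With these hypothetical inputs, a protein of 100 residues identified by peptides spanning residues 1-10 and 5-20 has a coverage of 0.2, and a protein with only two shared peptides is excluded from ion intensity quantitation.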
Proteomics-based biomarker hunts in mammalian systems have yet to produce any major breakthroughs (51), so we were surprised at the relative ease with which we were able to identify a bona fide "biomarker" in one drone hemolymph sample, and doubly so because the bees had no outward indications that they carried DWV (i.e. their wings were not deformed). Several highly similar viruses in the DWV family can infect bees, but the seven peptides identified in the DWV polypeptide allowed us to diagnose this virus specifically. This positive result suggests that hemolymph might be a useful fluid for the diagnosis of other honeybee pathogens with a much larger impact on apiculture, such as American foulbrood. In reality, mass spectrometers are far too expensive, and their use is too specialized, for them to become commonplace tools even in large apiaries, but they could be used in large testing centers such as those run by various governmental departments of agriculture.
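The logic of distinguishing one virus from close relatives by its peptides can be sketched as follows. This is an illustrative assumption about how such a check might be done, not the procedure used here: the function name is hypothetical, and the sequences in the usage note are invented toy strings, not real viral polyproteins.

```python
def discriminating_peptides(peptides, candidates):
    """Return the peptides that occur in exactly one candidate sequence.

    peptides:   iterable of identified peptide sequences
    candidates: dict mapping a (hypothetical) virus name -> polyprotein sequence
    The result maps each discriminating peptide to the single candidate
    that contains it; peptides shared by several candidates are dropped.
    """
    unique = {}
    for pep in peptides:
        hits = [name for name, seq in candidates.items() if pep in seq]
        if len(hits) == 1:
            unique[pep] = hits[0]
    return unique
```

For example, with toy candidates `{"virus_A": "MKTLLVAGRSEQ", "virus_B": "MKTLLVAGKSEQ"}`, the peptide `MKTLL` is shared by both and is uninformative, whereas `VAGRSEQ` matches only `virus_A` and therefore supports a specific assignment.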
An initial goal of this study was to identify the molecular components of the honeybee immune system, which is expected to be highly similar to that of D. melanogaster because both insects belong to the superorder Endopterygota. Indeed honeybee hemolymph was found to contain at least two proteins of the prophenoloxidase pathway (prophenoloxidase-activating factor and prophenoloxidase), which results in melanin synthesis that leads to black pigmentation and is a crucial part of the encapsulation response against microorganisms and parasites (67), a defense mechanism common to insects and other arthropods. The lower levels of prophenoloxidase in larvae reported here may be explained by the incomplete development of the larval immune system and may be at least partially responsible for the complete lack of black coloration in all larval tissues. A homolog of the Drosophila β-1,3-glucan recognition protein, a cell membrane protein anchored by a glycosylphosphatidylinositol group (68, 69) and responsible for binding glucan on the surface of yeast, bacteria, and fungi (70), was also found in A. mellifera (XP_395368). Furthermore two proteins (AmeLGUn_WGA635_2.509004.509004.p509004 and XP_395941) were found to be homologous to the Drosophila peptidoglycan recognition protein, whose binding to the surface of Gram-positive bacteria is necessary for Toll activation (71). Neither protein could be quantitated, but interestingly the latter protein was not identified at all in worker larva hemolymph. Hymenoptaecin (NP_001011615), an antibacterial polypeptide (55), is expressed in lower amounts in larvae than in adults. One protein in our dataset, α2-macroglobulin (XP_392454), showed homology to Drosophila thioester-containing protein-2 (CAB87808) (72) and to A. gambiae ENSANGP00000019522 (AAG00600) (73), which is a known component of the insect complement system (74).
The honeybee α2-macroglobulin is a large protein (1777 residues) but was identified by only three peptides, hinting that it is probably not very abundant. This suggests that other members of the thioester-containing protein family may exist in the hemolymph but are difficult to detect. In general, it is apparent that adult workers have more immunity-associated proteins than larvae; this is unsurprising given the difference in developmental stage, and it suggests a likely cause for the vulnerability of larvae to pathogens that do not infect adults, such as Ascosphaera apis (chalkbrood) and Paenibacillus larvae (American foulbrood). To confirm such speculations, proteomics could be used to compare hemolymph from healthy versus immune-challenged honeybees. Similar studies have already been done in D. melanogaster using a variety of bacterial, fungal, and other challenges such as lipopolysaccharide (LPS) and peptidoglycan (13-16); in B. mori (silkworm) after LPS challenge (18); and in A. gambiae in response to LPS, Escherichia coli, and several small proteins (21).
Now that the honeybee genome has been sequenced, an avalanche of data similar to that seen in other model systems will likely follow for bees. Our efforts here to identify the protein repertoire of hemolymph are among the first direct applications of these vast and largely unexplored genomics data. Through quantitative analysis of different castes and life stages, our data pose several plausible and testable hypotheses to explain behavior at a molecular level. The current study provides an in-depth view of the composition of honeybee hemolymph that will open new avenues for biochemical analysis of this most beneficial insect.