* This work was supported by a University of Queensland Postdoctoral Fellowship (to A. A. W.), Australian Research Council Discovery Early Career Researcher Award DE160101142 (to E. U.), Linkage Grant LP140100832 (to G. F. K.), and Australian National Health and Medical Research Council Principal Research Fellowship (to G. F. K.). This article contains supplemental material.
Assassin bugs (Hemiptera: Heteroptera: Reduviidae) are venomous insects, most of which prey on invertebrates. Assassin bug venom has features in common with venoms from other animals, such as paralyzing and lethal activity when injected, and a molecular composition that includes disulfide-rich peptide neurotoxins. Uniquely, this venom also has strong liquefying activity that has been hypothesized to facilitate feeding through the narrow channel of the proboscis—a structure inherited from sap- and phloem-feeding phytophagous hemipterans and adapted during the evolution of Heteroptera into a fang and feeding structure. However, further understanding of the function of assassin bug venom is impeded by the lack of proteomic studies detailing its molecular composition.
By using a combined transcriptomic/proteomic approach, we show that the venom proteome of the harpactorine assassin bug Pristhesancus plagipennis includes a complex suite of >100 proteins comprising disulfide-rich peptides, CUB domain proteins, cystatins, putative cytolytic toxins, triabin-like protein, odorant-binding protein, S1 proteases, catabolic enzymes, putative nutrient-binding proteins, plus eight families of proteins without homology to characterized proteins. S1 proteases, CUB domain proteins, putative cytolytic toxins, and other novel proteins in the 10–16-kDa mass range, were the most abundant venom components. Thus, in addition to putative neurotoxins, assassin bug venom includes a high proportion of enzymatic and cytolytic venom components likely to be well suited to tissue liquefaction. Our results also provide insight into the trophic switch to blood-feeding by the kissing bugs (Reduviidae: Triatominae). Although some protein families such as triabins occur in the venoms of both predaceous and blood-feeding reduviids, the composition of venoms produced by these two groups is revealed to differ markedly. These results provide insights into the venom evolution in the insect suborder Heteroptera.
Venoms are chemical arsenals injected by one animal into another to disrupt the homeostasis of the injected animal in ways that assist predation, defense, or feeding by the injecting animal (
). Typically, venoms are composed of multiple toxins, including peptides, enzymes, and small molecules, such as polyamines, that bind to and affect the function of multiple molecular targets in the injected animal. Because of their key role governing life-or-death interactions between animals, venom toxins are subject to selection pressures that have resulted in unique evolutionary patterns such as massive duplication and accelerated evolution of toxin-encoding genes (
). In addition, the properties that ensure that toxins confer a fitness advantage to the animals that produce them, including high stability and potency, make them well suited for use as insecticides, therapeutics, and pharmacological tools (
Approaching the golden age of natural product pharmaceuticals from venom libraries: an overview of toxins and toxin-derivatives currently involved in therapeutic or diagnostic applications.
). However, our understanding of the factors shaping venom evolution and our ability to repurpose venom toxins for biotechnological use are limited by the current focus of research on a small number of prominent groups of venomous animals: the scorpions, spiders, snakes, and cone snails. Studies on neglected taxa (
Proteomics and deep sequencing comparison of seasonally active venom glands in the platypus reveals novel venom peptides and distinct expression profiles.
) are essential to gain a more general understanding of how venom systems evolve and how venom evolution is influenced by factors such as geographical, trophic, and morphological constraints.
Assassin bugs (family Reduviidae) are a large and diverse group of insects consisting of ∼6800 species in 25 subfamilies distributed over all continents except Antarctica (
). Like other hemipterans such as cicadas and aphids, which feed on plants, reduviids have mouthparts that are extensively elongated and modified to form a proboscis that is specialized for piercing and sucking. Reduviids, however (together with the majority of other heteropteran families), use their piercing-sucking mouthparts to inject venom into, and feed from, prey. Exceptions are the blood-feeding kissing bugs (Reduviidae: Triatominae) that use their venom to facilitate acquisition of blood meals from vertebrates, including humans (
). The venom apparatus of reduviids includes morphologically complex paired secretory glands within the thorax/abdomen, a muscle-driven pump within the head, and a devoted venom channel formed by interlocking maxillary stylets through which venom can be injected into the prey (
Insecticidal activity of venomous saliva from Rhynocoris fuscipes (Reduviidae) against Spodoptera litura and Helicoverpa armigera by microinjection and oral administration.
J. Venom. Anim. Toxins Incl. Trop. Dis. 2011; 17: 486-490
) determined the primary structure of three disulfide-rich peptides in assassin bug venom and showed that one of these, Ptu1, was neurotoxic by virtue of its ability to inhibit the voltage-gated calcium channel CaV2.2. Ptu1 was shown to have the inhibitor cystine knot (ICK) fold (
Utilizing the assassin bug, Pristhesancus plagipennis (Hemiptera: Reduviidae), as a biological control agent within an integrated pest management programme for Helicoverpa spp. (Lepidoptera: Noctuidae) and Creontiades spp. (Hemiptera: Miridae) in cotton.
), the detailed composition of assassin bug venom remains unknown. Here, we elucidate the protein composition of venom from the harpactorine assassin bug Pristhesancus plagipennis using a combined transcriptomic/proteomic approach. We show that this venom contains more than a hundred individual components, including putative neurotoxins, cytolytic toxins, digestive enzymes, and members of eight novel protein families. Our data reveal convergent evolution between assassin bugs and other venomous animals, as well as unique differences in reduviid venom that may be related to the dual requirement of assassin bug venom to both paralyze and liquefy prey.
EXPERIMENTAL PROCEDURES
Insects and Venom Collection
Assassin bugs (P. plagipennis) were collected in Brisbane, Australia, fed on crickets (Acheta domesticus; Pisces Live Food, Brisbane, Australia), and housed in individual containers to avoid cannibalism. Venom was harvested from adults of both sexes by electrostimulation using a non-lethal protocol. Bugs were restrained on a foam platform with a rubber band over the thorax, the proboscis was gently inserted into a P200 pipette tip, and electrostimulation (30 V, 5-ms pulses, 5 Hz) was applied to the thorax using an S48 square pulse stimulator (Grass Technologies, Warwick, RI) with electrodes installed on tweezers (supplemental Video 1). Venom was immediately transferred to a tube on dry ice and stored at −80 °C until analysis.
Experimental Design and Statistical Rationale
The purpose of this study was to determine the venom proteome of P. plagipennis. Therefore, we employed methods to maximize the number and diversity of proteins identified, including strategies to identify peptides (short minimum contig length for RNA-Seq assemblies; LC-MS/MS of HPLC fractions) and low abundance proteins (combining multiple RNA-Seq assemblies, and LC-MS/MS of HPLC fractions and 2D gel electrophoresis spots). Overall, our LC-MS/MS analysis included 62 HPLC fractions, 54 1D SDS-PAGE bands, and 156 2D SDS-PAGE spots. The eventual dataset includes 17 technical replicates of selected HPLC fractions, two technical replicates of reduced and alkylated but undigested venom, and two replicates of reduced, alkylated, and digested venom. The remaining samples are neither biological nor technical replicates but are subsets of venom components fractionated to maximize the potential of LC-MS/MS for protein identification. Initially, spectra were compared by Paragon searches (
The paragon algorithm, a next generation search engine that uses sequence temperature values and feature probabilities to identify peptides from tandem mass spectra.
) against a database containing all ORFs with a length of >30 amino acids in our three venom gland transcriptomes. Data from MS samples were pooled for a single search wherever possible to facilitate identification of the optimal data set of proteins with minimum redundancy using the Paragon/ProtGroup algorithms in ProteinPilot. In practice, it was necessary to run a separate Paragon search for 2D gel spots, as they were alkylated with iodoacetamide (all other samples were alkylated with iodoethanol). The proteins and peptides identified by these two searches were reviewed manually, and poor-quality identifications were excluded to yield a draft dataset of 130 proteins. To check this manual process, we re-validated protein identifications using further Paragon searches against a database containing just these 130 sequences, leading us to further discard three sequences with ProteinPilot Unused values below the threshold of 1.3 (corresponding to >95% confidence at protein level). After establishing the proteome, we performed an extra Paragon search of each of our 277 experimental samples individually against the venom proteome, providing insights into assignment of individual gel spots and HPLC peaks.
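The relationship between the Unused threshold of 1.3 and the quoted confidence level can be illustrated with a minimal sketch. It assumes the ProteinPilot convention that a ProtScore is the negative base-10 logarithm of (1 - confidence), so that 1.3 corresponds to approximately 95% and 2.0 to 99%. The function names and the example scores are hypothetical and are not taken from this study.

```python
import math

def protscore_to_confidence(score: float) -> float:
    """Convert a ProtScore to a confidence fraction, assuming
    score = -log10(1 - confidence)."""
    return 1.0 - 10.0 ** (-score)

def confidence_to_protscore(confidence: float) -> float:
    """Inverse conversion; confidence must be below 1.0."""
    return -math.log10(1.0 - confidence)

def keep_protein(unused_score: float, threshold: float = 1.3) -> bool:
    """Retain a protein only if its Unused score reaches the threshold."""
    return unused_score >= threshold

# Hypothetical Unused scores for three candidate identifications.
example_scores = {"protein_A": 24.6, "protein_B": 1.3, "protein_C": 0.8}
retained = [name for name, s in example_scores.items() if keep_protein(s)]
```

Under this assumption, a protein supported only by evidence already claimed by higher-ranked proteins receives a low Unused score and is discarded, which is consistent with the removal of superfluous sequences described above.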
Transcriptomics
For RNA extraction, venom glands were harvested from two adult female and two final instar bugs after anesthesia with CO2 for ∼5 min. The main gland posterior lobe, main gland anterior lobe, and accessory gland were removed and stored separately in >10× the glandular volume of RNAlater (Ambion, Austin, TX). Total RNA was extracted using a DNeasy kit (Qiagen, Mississauga, Canada), and mRNA was isolated using a Dynabeads mRNA Direct kit (Ambion) according to the manufacturer's instructions. This process yielded 3120, 680, and 119 ng of mRNA from the main gland posterior lobe, main gland anterior lobe, and accessory gland, respectively. RNA sequencing (RNA-Seq) was performed on 340 ng from each lobe of the main gland and 119 ng from the accessory gland on a NextSeq instrument (Illumina, San Diego, CA) at the IMB Sequencing Facility. After TruSeq library preparation, each sample was run on four lanes of a 150-cycle mid-output run to generate 150-bp paired-end reads (main gland posterior lobe, 95,700,337 reads; main gland anterior lobe, 78,592,255 reads; accessory gland, 54,480,493 reads).
For each gland region, eight assemblies were constructed using CLC Genomics Workbench (CLC Bio, Aarhus, Denmark) and Trinity (
). For CLC assemblies, reads with q-scores below 30 were trimmed and the reads assembled using minimum contig length of 150 bp, minimum similarity to join contig 0.95, and word (k-mer) sizes of 21, 24, 29, 34, 44, 54, and 64. For Trinity, which employs a fixed k-mer method, reads were assembled using the default trimming parameters and minimum contig length of 150 bp. For each gland compartment, contigs from the Trinity and CLC assemblies were pooled and clustered using CD-HIT (
; threshold 95%) and then re-imported into CLC Genomics Workbench where trimmed reads were re-mapped and used to update the final contigs. Contigs from all compartments were then pooled, and a total of 149,776 open reading frames (ORFs) of 90 bp and greater were extracted using GetORF (
), to which 155 common LC-MS/MS contaminant sequences were added to produce the final sequence database for searching.
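The ORF extraction step can be sketched in a few lines. This is an illustration only, not the GetORF implementation: it assumes that an ORF is any stop-to-stop region in one of the six reading frames and that the standard genetic code applies, neither of which is stated in the text. The function names are hypothetical; the 90-bp minimum (30 amino acids) is the value given above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )

def extract_orfs(contig: str, min_nt: int = 90) -> list[str]:
    """Return translations of all stop-to-stop regions of at least
    min_nt nucleotides across the six reading frames."""
    contig = contig.upper()
    min_aa = min_nt // 3
    orfs = []
    for strand in (contig, reverse_complement(contig)):
        for frame in range(3):
            protein = translate(strand[frame:])
            # Stop codons are translated as '*', so splitting on '*'
            # yields the candidate ORF translations for this frame.
            orfs.extend(seg for seg in protein.split("*") if len(seg) >= min_aa)
    return orfs
```

A short minimum length such as this one retains small peptide-encoding ORFs at the cost of many spurious translations, which is why the resulting library is described as a set of possible protein sequences to be tested against the MS data.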
Mass Spectrometry
For MS analysis, venom was centrifuged (12,000 rcf, 4 °C) to remove particulate matter and prepared for MS from crude venom or fractionated using 1D and 2D gel electrophoresis or HPLC. For 1D electrophoresis, 1, 2, 4, 8, or 16 μg (A280 eq) of venom was run on a 12.5% Tris-glycine polyacrylamide gel after denaturation (5 min, 95 °C) in sample buffer with or without the reducing agent dithiothreitol (DTT). Gels were stained with Coomassie Brilliant Blue R-250. For 2D electrophoresis, 400 μg (A280 eq) of venom was lyophilized and re-suspended in 125 μl of resuspension buffer (8 M urea, 4% CHAPS) supplemented with 10 mM DTT and 1% ampholyte solution (Life Sciences catalog no. 17-6000-88; GE Healthcare, Little Chalfont, UK) and then absorbed into a Readystrip immobilized pH gradient strip (pH 3–10, nonlinear, Life Sciences catalog no. 163-2005) for 16 h. Isoelectric focusing was then conducted under mineral oil on an Ettan IPGphor3 isoelectric focuser (GE Healthcare) using a program of 5900 total V-h. The strip was then incubated in reducing reagent (50 mM Tris-HCl, pH 8.8, 6 M urea, 2% SDS, 30% glycerol, and 1.5% DTT) for 10 min and then alkylation reagent (50 mM Tris-HCl, pH 8.8, 6 M urea, 2% SDS, 30% glycerol, and 2% iodoacetamide) for 20 min. Second dimension SDS-PAGE was then run using a 15% Tris-glycine polyacrylamide gel alongside a Precision Plus ladder (catalog no. 161037, Bio-Rad, Hercules, CA) and the gel was stained with colloidal Coomassie stain (
). Protein bands and spots were then excised with a scalpel, incubated for 2 h at 37 °C in reduction/alkylation buffer (50 mM ammonium carbonate, pH 11.0, 1% iodoethanol, 0.025% triethylphosphine in 48.5% acetonitrile), and then incubated for 4–16 h at 37 °C in 5–20 μl of digestion reagent (20 ng/μl trypsin (catalog no. 7575, Sigma-Aldrich, St. Louis, MO; cleaves on carboxylic side of Arg and Lys residues) in 40 mM ammonium bicarbonate, pH 8.0, 5% acetonitrile). Samples were then incubated for 30 min at room temperature in extraction reagent (50% acetonitrile, 5% formic acid) with occasional vortexing.
For HPLC, ∼1 mg of venom (A280 eq) was diluted >5-fold in loading buffer consisting of 95% solvent A (0.05% TFA) and 5% solvent B (0.043% TFA, 90% acetonitrile). After centrifugation (10 min, 17,000 rcf, 4 °C), the supernatant was fractionated using a 250 × 10-mm Jupiter C4 column (10-μm particle size, 300 Å pore size, catalog no. 00G-4168-NO, Phenomenex, Torrance, CA) using a linear gradient from 5 to 75% solvent B in solvent A over 56 min and a flow rate of 3 ml/min, yielding 64 fractions. After lyophilization and resuspension in 30 μl of milliQ water, each fraction was analyzed by MALDI-TOF MS and LC-MS/MS. For MALDI-TOF MS, each fraction was diluted in MALDI solvent (70% acetonitrile, 1% formic acid), spotted together with the same volume of α-cyano-4-hydroxycinnamic acid (5 mg/ml in MALDI solvent), and analyzed on a 4700 MALDI TOF/TOF Proteomics Analyzer (AB SCIEX, Washington, D.C.) operated in reflectron mode with a laser power of 3400–3800 V. For LC-MS/MS preparation, 3 μl of each resuspended HPLC fraction was incubated with 7 μl of reduction/alkylation buffer (2 h, 37 °C), lyophilized, and incubated in 5 μl of digestion reagent (16 h, 37 °C) before the reaction was terminated by addition of 20 μl of 5% formic acid. For crude venom samples, 5 μg of protein (A280 eq) was processed using the same protocol as for HPLC fractions, except that processing was terminated at the appropriate stage to produce samples that were either native, reduced and alkylated but undigested, or reduced, alkylated, and digested.
Peptide digests from gel spots and bands, HPLC fractions, and crude venom preparations were resuspended in 1% formic acid, 2.5% acetonitrile and analyzed by LC-MS/MS. Liquid chromatography was performed using either a Nexera X2 LC system (Shimadzu, Kyoto, Japan) with a 100 × 2.1-mm Zorbax 300SB-C18 column (1.8 μm particle size, 300-Å pore size, catalog no. 858750–902, Agilent, Santa Clara, CA) or a Shimadzu Nano LC system coupled to a 150 × 0.1-mm Zorbax 300SB-C18 column (3.5 μm particle size, 300 Å pore size, Agilent catalog no. 5065-9910). The LC outflow was coupled to a 5600 Triple TOF mass spectrometer (AB SCIEX) equipped with a Turbo V ion source. Peptides were eluted over 14- or 25-min gradients of 1–40% solvent B (90% acetonitrile, 0.1% formic acid) in solvent A (0.1% formic acid) at a flow rate of 0.2 ml/min. MS1 scans were collected between 350 and 1800 m/z, and precursor ions in the range m/z 350–1500 with charge +2 to +5 and signal >100 counts/s were selected for analysis, excluding isotopes within 2 Da. MS/MS scans were acquired with an accumulation time of 250 ms and a cycle time of 4 s. The “Rolling collision energy” option was selected in Analyst, allowing collision energy to be varied dynamically based on m/z and z of the precursor ion. Up to 20 similar MS/MS spectra were pooled from precursor ions differing by less than 0.1 Da. The resulting mass spectra in WIFF format were then compared with a library of ORFs extracted from transcriptomes generated from RNA-Seq experiments (together with a list of common MS contaminants) using the Paragon 4.0.0.0 algorithm implemented in ProteinPilot 4.0.8085 software (AB SCIEX). A mass tolerance of 50 mDa, which is the ProteinPilot default value for data obtained on an AB SCIEX 5600 mass spectrometer, was used for both precursor and MS/MS ions. MS contaminants considered are included as a FASTA file “MS Contaminants.fa” in the PRIDE submission associated with this study (PXD004804).
For all searches, the options “Biological modifications,” “Amino acid substitutions,” and “Thorough ID” were selected. The Biological modifications option allows detection of 232 modifications in addition to the 59 default modifications considered by the Paragon algorithm in ProteinPilot, which are described in the file “Modifications Catalogue and Translations.xls” in the same PRIDE submission. The Amino acid substitutions option allows detection of substitutions of all standard amino acids. In addition to searching predicted cleavage products of protein sequences against precursor ions, the Thorough ID option runs an algorithm to match sequence tag information to amino acid sequences independently of expected cleavage sites (
The paragon algorithm, a next generation search engine that uses sequence temperature values and feature probabilities to identify peptides from tandem mass spectra.
). For the 2D gel spot search, the option “Gel based ID” was also selected, which computationally prioritizes oxidative modifications that are common artifacts of electrophoresis. The results of the pooled-sample Paragon searches were reviewed manually. Low quality identifications were excluded by only reporting proteins for which three or more peptides were observed (p > 95%) or one or more peptides with p > 95% plus a secretion signal sequence with a D-score > 0.65 according to SignalP 4.1 (
Proteins identified by MS were compiled in a spreadsheet and annotated using BlastP results with E < 0.05 against GenBankTM nr database, HMMER domain predictions with E < 0.05 against the Pfam database, signal sequence probability, and mass of predicted mature toxin (supplemental Table 1). For Ptu1 and lipocalin/triabin family proteins, further family members not identified by MS were recovered from transcriptomes using MS-identified Ptu1 or lipocalin/triabin family proteins as bait for BLAST searches with E < 0.05 against the whole-gland database of possible protein sequences. For phylogenetic analysis of protein families, the most closely related proteins to those from P. plagipennis transcriptomes were retrieved from the GenBankTM nr database using BlastP with E < 0.05. Further lipocalin/triabin protein family members from blood-sucking reduviids for which functional data has been published were retrieved manually. Proteins were aligned using MAFFT (
) using lset rates = invgamma with prset aamodelpr = mixed.
RESULTS
Determination of Venom Proteome Using RNA Sequencing and Proteomics
Assassin bugs (P. plagipennis) kept in captivity readily envenomated and killed house crickets (A. domesticus; Fig. 1A; supplemental Video 1). We developed a method to harvest venom non-lethally from P. plagipennis by electrostimulation, which typically yielded 2–10 μl of venom of ∼50–250 mg/ml (A280 eq; Fig. 1B; supplemental Video 2).
Fig. 1. A, assassin bug P. plagipennis envenomating feeder cricket (A. domesticus). B, extraction of venom from P. plagipennis using electrostimulation.
To investigate the protein composition of P. plagipennis venom, we constructed a library of possible protein sequences produced by the venom gland complex. Because the glandular source of assassin bug venom has not been characterized in detail, we sequenced mRNA from each of the three compartments of the labial gland complex, comprising the main gland posterior lobe, main gland anterior lobe, and accessory gland. The resulting reads were assembled and ORF translations extracted to produce a library of possible protein sequences (see “Experimental Procedures”).
Mass spectra for protein identification were obtained by LC-MS/MS analysis of 277 protein samples derived from venom, comprising crude venom (
; Fig. 2B). A full list of samples analyzed (file “Sample numbering.xlsx”), along with raw data files and protein and peptide summaries, is included in the PRIDE submission accompanying this study (PXD004804). To determine the venom proteome of P. plagipennis, spectra from multiple samples were combined into two Paragon searches against our ORF database “Pp123x.fasta”. One search, “all_iodoethanol_samples”, included crude venom samples, HPLC fractions, and 1D gel bands, whereas the other search, “2Dgelspots”, included all 2D SDS-PAGE spots. Although it would have been ideal to combine all samples into a single Paragon search, two searches were performed because of the differing alkylating agents used for these two sample sets. The all_iodoethanol_samples and 2Dgelspots searches resulted in 175 and 174 protein identifications, respectively, at a protein confidence level of >95%. Removal of low quality identifications, decoy database hits, and contaminants, followed by manual comparison of the results from these two searches, resulted in a draft proteome consisting of 130 protein sequences. In some cases, identified protein sequences that appeared to be incomplete were compared by BLAST search against individual RNA-Seq assemblies to retrieve additional sequences. Final identification statistics were produced by Paragon search of each of the all_iodoethanol_samples and 2Dgelspots sample sets against a database containing the draft proteome plus contaminants (“Pp_electrostim.fasta”). Three protein sequences were further discarded at this stage due to low ProteinPilot Unused values (indicating they were superfluous), resulting in a final proteome of 127 sequences (supplemental Table S1). Of these 127, four were identified from a single detected peptide, spectra for which are shown in supplemental Fig. S1. Re-examination of FDR analyses indicated these 127 sequences are ranked within the region of <1% global FDR in searches against the ORF database.
Fig. 2. Proteins detected by LC-MS/MS of 2D SDS-PAGE spots and HPLC fractions showing abundant proteases, CUB-domain proteins, and heteropteran venom family 1 proteins. A, 2D SDS-polyacrylamide gel of crude venom, showing protein families identified by LC-MS/MS of gel spots. B, HPLC trace of venom fractionation, showing protein families identified by LC-MS/MS of collected fractions.
)), trypsin-like proteases, various catabolic enzymes, serpins, a triabin-like protein, bacterial permeability increasing-like protein, and novel protein families (Table 1; Figure 3). Putative enzymes detected in the venom were protein kinase, inositol-phosphate phosphatase, M12A-like metalloproteases, cathepsin B, peptidase S10, hexosaminidase, and nuclease. Seventeen proteins without any inferred putative function were present and were classified into eight families (heteropteran venom protein families 1–8). Proteins in families 3 and 5–8 showed homology to uncharacterized and predicted protein sequences from hemipterans and other insects, whereas families 1, 2, and 4 were novel.
To gain additional information about protein abundance, we then re-analyzed mass spectra from individual 2D gel spots (Fig. 2A), HPLC fractions (Fig. 2B), and other samples, comparing them using the Paragon algorithm to our venom proteome database Pp_electrostim. Many of the individual samples, including low molecular weight gel spots and HPLC fractions, contained peptides from multiple larger proteins, suggesting that the venom underwent partial autoproteolysis before analysis, consistent with its high protease content. Nevertheless, it was possible in many cases to assign spots and fractions to particular proteins based on the number of peptides confidently detected (>95% confidence), the number of precursor signals counted as a proportion of total non-contaminant precursor counts, and (in the case of gel spots) the observed versus expected molecular weight. S1 proteases alone accounted for 65 of the 127 identified proteins, 62 of which were identified from spots in the 25–40 kDa range of a 2D SDS-polyacrylamide gel. Aside from proteases, the major venom proteins present are in the 10–16 kDa range and of unknown function. Intensely staining gel spots in this range (spot numbers 104, 62, 109, and 103) were attributed to CUB domain proteins 1 and 2 and venom family 1 proteins 1 and 2, respectively, each of which accounted for >90% of confidently identified peptides and >96% of precursor counts in its spot (supplemental Table 2). In addition, each of these proteins is associated with a major peak in the HPLC trace (fractions 50, 44, 52, and 56, respectively). In each case, the identified protein accounts for >69% of precursor counts detected (with the exception of CUB domain protein 1, which accounted for 34% of precursor counts from the fraction collected at the highest UV peak).
Further details of detected proteins are available as supplemental data (protein and nucleotide sequences, identification statistics, and annotation, see supplemental Table S1; alignments of selected protein families, see supplemental Fig. S2; files included in the PRIDE submission accompanying this study, see supplemental Table S2; biological modifications, amino acid substitutions, and deviation from expected cleavage patterns detected may be most conveniently viewed by opening ProteinPilot Peptide Summary files in the PRIDE submission using Microsoft Excel).
Ptu1 Family Proteins
We detected five Ptu1-like peptides in P. plagipennis venom using LC-MS/MS (Pp1a, Pp1b, Pp2, Pp4, and Pp5). To determine if transcripts encoding further members of the Ptu1 family are produced in the labial gland complex, we performed a BlastP search against our library of possible protein sequences using P. plagipennis Ptu1 venom peptides and previously described assassin bug Ptu1 family venom peptides as queries. This strategy revealed a further 10 Ptu1 family peptide sequences (KX752811 to KX752820). One of these, Pp3, was not identified in LC-MS/MS experiments but has a predicted mature mass (3581.4 Da) closely matching an observed venom component (3581.0 Da) that undergoes a mass shift of 270.7 Da upon alkylation with iodoethanol, close to the theoretical value expected for alkylation of the six Cys residues in the Pp3 sequence (270.4 Da). Pp3 was therefore classified as a putative venom peptide in further analyses. All identified Ptu1 family peptides have six conserved Cys residues that are homologous to those responsible for formation of the ICK fold of Ptu1 (Fig. 4A). Based on the A280 of crude venom and the quantity of peptides recovered after HPLC (judged by A280 calibrated with calculated extinction coefficients from identified peptide sequences), Ptu1-like peptides were estimated to account for ∼1–3% of venom by dry weight.
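The theoretical mass shift quoted for Pp3 can be reproduced with a short calculation. The sketch below assumes that the shift is measured relative to the native, disulfide-bonded peptide, so that each Cys gains one hydrogen on reduction and a net C2H4O (hydroxyethyl in place of the thiol hydrogen) on alkylation with iodoethanol. Average atomic masses are used, and the function names are hypothetical.

```python
# Average atomic masses in Da.
AVG_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

def shift_per_cys() -> float:
    """Mass gained per Cys, assuming the starting point is a
    disulfide-bonded (oxidized) peptide."""
    reduction = AVG_MASS["H"]  # S-S -> SH adds one H per Cys
    # Alkylation replaces the thiol H with CH2CH2OH: net gain of C2H4O.
    alkylation = 2 * AVG_MASS["C"] + 4 * AVG_MASS["H"] + AVG_MASS["O"]
    return reduction + alkylation

def expected_shift(n_cys: int) -> float:
    """Total expected shift for a peptide with n_cys cysteines,
    all of them in disulfide bonds before treatment."""
    return n_cys * shift_per_cys()

theoretical = expected_shift(6)   # six Cys residues in Pp3
observed = 270.7                  # value reported above
difference = observed - theoretical
```

Under these assumptions the six-Cys value rounds to 270.4 Da, in agreement with the theoretical figure given in the text and about 0.3 Da below the observed shift.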
Fig. 4. Evolution of Ptu1-like peptides isolated from P. plagipennis venom. Signal sequences predicted by SignalP4.1 are shown in lowercase; lines above the text show the disulfide connectivity for Ptu1. Sequence labels in blue indicate peptides detected in venom using proteomics. POI, phenoloxidase inhibitor; AMP, antimicrobial peptide. Node labels indicate posterior probabilities. Comparison sequences are remipede (S. tulumensis) agatoxin-like peptide (
The first venomous crustacean revealed by transcriptomics and functional morphology: Remipede venom glands express a unique toxin cocktail dominated by enzymes and a neurotoxin.
To investigate the relationship between P. plagipennis Ptu1 family sequences and previously described proteins, we aligned them together with homologous sequences from the nr database retrieved by a BlastP search (E < 0.05; Fig. 4A). We then performed a Bayesian inference of phylogeny using a mixed-evolution model and rooted the tree with a putative ICK-forming venom peptide from Remipedia, the sister group to Hexapoda (
). According to this analysis (Fig. 4B), Pp1a is monophyletic with the two previously described harpactorine assassin bug venom peptides, Ado1 and Iob1 (posterior probability, p = 0.92). Pp6a was more closely related to the bee peptide OCLP1 (p = 0.99), a non-venom peptide expressed throughout the body, and to related sequences from ants (
). Another clade (p = 0.73) is formed by Pp8 and Pp10–Pp13 and includes a putative antimicrobial peptide from a phytophagous hemipteran, the whitefly Bemisia tabaci (GenBankTM accession number O81338.1). The relationships between the remaining peptides, including all peptides that were detected in venom by MS apart from Pp1, were poorly resolved. Overall, these data are consistent with Ptu1/OCLP1 family peptides being widespread among insects where they perform a non-venom role, for example in the immune system or as phenol oxidase inhibitors, with independent recruitment as venom peptides in the orders Hemiptera and Hymenoptera.
Fractions from which Ptu1 family peptides were detected by LC-MS/MS were examined using MALDI-TOF MS to determine the masses of mature Ptu1 family peptides. For Pp1a and Pp3, the major mass detected in the venom fraction matched perfectly with that calculated for the mature toxin sequence predicted from the transcriptomic data, after removal of the 20–23-residue signal peptides predicted by SignalP (supplemental Fig. S3). This suggests that post-translational processing of Ptu1-like venom peptides includes neither removal of a propeptide sequence nor enzymatic modification of mature toxin residues as is often the case for the venom peptides from cone snails and to a lesser extent spiders (
CUB domains occur widely in multidomain proteins where they perform roles in protein recognition and as adaptor domains, and they are especially well represented in extracellular and developmentally regulated proteins (
), and 19 of the 67 S1 proteases detected in P. plagipennis venom. In contrast to this pattern, we observed that two of the most abundant proteins (CUB domain proteins 1 and 2) in P. plagipennis venom consist of a solitary CUB domain; three further CUB domain proteins (3–5) were detected from HPLC fractions and 1D SDS-polyacrylamide gels. Each CUB domain protein contains 129–142 residues and is stabilized by two conserved disulfide bonds (Fig. 5A). To rule out the possibility that detection of CUB domain proteins is an artifact of proteolytic cleavage of CUB domains from S1 proteases and/or incomplete or wrongly assembled contigs, we examined each CUB domain protein coding sequence. Each contig encoded a CUB domain protein that included multiple stop codons in all reading frames downstream of the stop codon that terminates translation, without additional open reading frames homologous to S1 proteases or other proteins. To further rule out assembly errors, we examined >10 assemblies produced using either CLC Genomics Workbench or Trinity (
), and we found that all five contigs encoding CUB-only domains were present and complete in every assembly. In addition, CUB domain proteins 1 and 2 were very confidently identified from 2D gel spots with an apparent molecular mass that closely matched the predicted mature mass of these proteins (12.6–12.7 kDa). Thus, our data strongly suggest that proteins consisting of a single CUB domain are abundant components of the venom of P. plagipennis. A CUB domain-only protein has previously been reported from a transcriptomic study of the venom glands of another cimicomorphan heteropteran, the minute pirate bug Orius laevigatus (Anthocoridae; Ref.
), suggesting CUB domain proteins may be present in the venoms of phylogenetically diverse heteropterans.
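The contig check described above, in which each CUB domain coding sequence is followed by stop codons in every reading frame, can be illustrated with a minimal Python sketch. This is not the procedure used in the study: the function names and the toy downstream sequence are hypothetical, and only the three forward frames are scanned.

```python
# Minimal sketch (assumed helper names, toy data): look for stop codons in
# each of the three forward reading frames of a sequence lying downstream
# of the stop codon that terminates translation.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_positions_by_frame(downstream):
    """Map each forward frame (0, 1, 2) to the offsets of its stop codons."""
    seq = downstream.upper()
    result = {}
    for frame in range(3):
        hits = []
        # Step through complete codons only, starting at the frame offset.
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                hits.append(i)
        result[frame] = hits
    return result


def all_frames_interrupted(downstream):
    """True if every forward frame contains at least one stop codon."""
    return all(stop_positions_by_frame(downstream).values())


# Hypothetical downstream sequence: TAA in frame 0, TAG in frame 1, TGA in frame 2.
toy_downstream = "TAACTAGCTGA"
print(stop_positions_by_frame(toy_downstream))
print(all_frames_interrupted(toy_downstream))
```

In this toy sequence each forward frame contains one stop codon, so the check passes. A fuller check might also scan the reverse strand and search any remaining open reading frames for homology to S1 proteases.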
Fig. 5. Evolution of CUB domain proteins. Sequence names in blue indicate P. plagipennis sequences determined in this study. A, alignment of CUB domain venom proteins from P. plagipennis and the anthocorid O. laevigatus (
According to BlastP searches, CUB domain sequences were most similar to N-terminal CUB domains of S1 proteases from insects and crustaceans. We examined the phylogenetic relationships between reduviid and anthocorid CUB domain proteins and arthropod CUB-S1 proteases using Bayesian inference of phylogeny, rooting the tree with a CUB-S1 protease from the olfactory organ of the spiny lobster Panulirus argus (Fig. 5B). According to this analysis, heteropteran CUB domain venom proteins diverged anciently from a heteropteran CUB-S1 protease but are likely to be monophyletic (p = 0.75). This result suggests that CUB domain proteins were recruited into venom prior to the split of Reduviidae and Anthocoridae ∼190 mya (
). Their conservation over this extended period of time and their high abundance in the venom of P. plagipennis suggest that they serve an important but as yet unknown function.
Redulysins
Eight venom proteins showed homology to trialysin, a previously described Lys-rich cytolytic toxin isolated from venom of the blood-feeding triatomine reduviid Triatoma infestans. Because blood-feeding reduviids do not lyse red blood cells during feeding, it has been suggested that trialysin has an antimicrobial function that protects the venom gland from colonization by parasites (
). Regardless of its biological function, the cytolytic activity of trialysin is well established experimentally. Purified trialysin forms voltage-dependent channels in lipid bilayers (
), whereas synthetic peptides corresponding to an internal helical domain of trialysin are capable of lysing mammalian and bacterial cells. Martins et al. (
) used fluorophore quenching to demonstrate that trialysin is activated by proteolytic removal of a negatively charged N-terminal motif that exposes the positively charged helical domain. The same researchers used NMR to show that cytolytic peptides derived from the helical domain of trialysin form amphipathic α-helices with hydrophobic and positively charged Lys residues on opposite sides of the helix, a pattern thought to underlie the ability of trialysin to bind negatively charged phospholipid headgroups and hydrophobic tails and therefore disrupt biological membranes (
). Because all eight trialysin homologs in P. plagipennis venom feature a conserved motif homologous to the cytolytic motif of trialysin (Fig. 6), we have classified them as putative cytolytic or pore-forming toxins and named them redulysins 1–8. The redulysins are 233–458-residue proteins that are rich in Lys (14–17%) and consist of an N-terminal negatively charged motif, a cytolytic motif, and a C-terminal domain stabilized by a pattern of eight conserved Cys residues. The putative cytolytic motif consists of 33 residues predicted to form an α-helix by PSIPRED, ∼40% of which are Lys residues arranged with conserved periodicity (Fig. 6). In contrast to the case for triatomine bugs, there exists a clear biological reason for predaceous reduviids to possess cytolytic toxins, as they might contribute to prey capture, pain induction in defensive envenomation, and/or liquefaction. Moreover, cytolytic and liquefying activities have previously been demonstrated experimentally in reduviid venoms (
). Taken together, the combined data suggest that redulysins are a cytolytic toxin family present in reduviid venoms, at least one member of which was retained by some triatomine bugs despite their shift to a blood diet 25–30 mya (
Fig. 6. Conserved sequence features in redulysin proteins. A, amino acid sequence alignment of redulysins with the cytolytic domain of trialysin (AAL82381.1). The consensus sequence of the redulysin proteins is shown below. Lys residues are shown in bold black, positively charged residues in red, and hydrophobic residues in blue. Residues predicted by PSIPRED to form α-helices are highlighted in gray. B, helical wheel diagram of the predicted helical portion of the consensus of redulysin sequences showing clustering of Lys residues and hydrophobic residues on opposite sides of the helix.
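The helical wheel logic behind this kind of diagram can be sketched in a few lines of Python. The sketch assumes ideal α-helix geometry (3.6 residues per turn, i.e. 100° per residue) and uses an idealized toy peptide, not a redulysin or trialysin sequence; the hydrophobic residue set and all function names are illustrative assumptions.

```python
import math

# Ideal alpha-helix: 3.6 residues per turn, so each residue is rotated 100 degrees.
DEGREES_PER_RESIDUE = 100.0
HYDROPHOBIC = "AILMFVW"  # assumed residue set, for illustration only


def wheel_angles(seq):
    """Angular position (degrees) of each residue on a helical wheel."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(len(seq))]


def mean_direction(seq, residues):
    """Circular mean (degrees) of the wheel positions of the chosen residues."""
    x = y = 0.0
    found = False
    for aa, angle in zip(seq, wheel_angles(seq)):
        if aa in residues:
            found = True
            x += math.cos(math.radians(angle))
            y += math.sin(math.radians(angle))
    if not found:
        raise ValueError("none of the requested residues occur in the sequence")
    return math.degrees(math.atan2(y, x)) % 360.0


def face_separation(seq, group_a, group_b):
    """Smallest angle (0 to 180 degrees) between the mean directions of two groups."""
    diff = mean_direction(seq, group_a) - mean_direction(seq, group_b)
    return abs((diff + 180.0) % 360.0 - 180.0)


# Hypothetical 18-residue peptide with Lys on one face and Leu on the other.
toy = "KLLKKLLKKLKKLLKKLL"
print(round(face_separation(toy, "K", HYDROPHOBIC), 1))
print(toy.count("K") / len(toy))
```

A separation close to 180° indicates that the two residue classes sit on opposite faces of the helix, which is the amphipathic pattern described for the trialysin helical domain. The Lys fraction of the toy peptide is set by its construction and is not meant to match the ∼40% reported for the redulysin motif.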
In blood-feeding reduviids, the lipocalin/triabin family has radiated to produce proteins with a wide variety of functions, including inhibition of coagulation factors, inhibition of platelet aggregation through sequestration of ADP, sequestration of biological amines, and carriage of nitric oxide (
). In many triatomine bugs, lipocalin/triabin family proteins account for the majority of venom proteins both in number and weight; they also account for the majority of the functionally characterized components of triatomine venom. We identified one lipocalin/triabin family protein in the venom of P. plagipennis by LC-MS/MS, a 178-residue protein with a predicted mature mass of 19.8 kDa. To further recover P. plagipennis lipocalin/triabin family proteins for phylogenetic analysis, we performed a BlastP search of our library for possible protein sequences using a range of lipocalin/triabin family proteins from blood-feeding reduviids as queries. This strategy recovered a further 11 lipocalin/triabin family proteins (KX752800 to KX752810).
The relationships between lipocalin/triabin family proteins of triatomine reduviids have recently been examined in the context of understanding how triabins evolved to occupy their current roles as functionally diverse facilitators of blood-feeding (
). Because blood-feeding triatomines evolved from predaceous assassin bugs, an understanding of how triatomine triabins are related to those of predaceous assassin bugs is desirable but has not been previously possible. To investigate such relationships, we performed Bayesian inference of phylogeny using the 14 P. plagipennis lipocalin/triabins, their closest homologs according to a BlastP search against the GenBank™ nr database, and a range of functionally characterized triatomine triabin family proteins (Fig. 7). All triabin family proteins identified from triatomine venom and a subset of P. plagipennis sequences (including triabin-like venom protein 1 identified in venom by LC-MS/MS) clustered together with high probability (Fig. 7, clade labeled “triabin-like proteins”; p = 0.92). Another subset of lipocalin/triabin family proteins from P. plagipennis, sequences from heteropterans such as the pentatomomorphan Halyomorpha halys, the cimicomorphan bedbug Cimex lectularius, and a representative of the “cockroach triabin” Bla g 4 (
) were found not to be members of the triabin-like protein clade. We suggest that membership of this clade, which includes venom proteins from both predaceous and blood-feeding reduviids but no protein known to have a non-venom function and no non-reduviid proteins, constitutes a natural delineation for the “triabin” family.
Fig. 7. Radiation of triabin/lipocalin superfamily in Reduviidae and allied insects calculated by Bayesian inference. Branches of Pristhesancus, Triatoma, and Rhodnius proteins are colored orange, blue, and pink, respectively; node labels indicate posterior probabilities. Accessions of sequences used in this analysis are as follows: putative BABP (Triatoma), D1MX91; putative nitrophorin (Triatoma), A0A023F6B8; Triafestin, AB292809.1; RPAI, JA76747.1; Triplatin, BAE96121.1; Infestilin, AAZ38958.1; Pallidipin, AAA30329.1; Dimiconin, BAI50848.1; Triabin, CAA56540.1; Procalin, AEM97970.1; Triatin, AAZ38956.1; ApoD-like (Halyomorpha), XP_014291570.1; Isoallergen 1 (Blattella), C3RWZ4; ApoD-like (Cimex), XP_014256007.1; Uncharacterized (Cimex), XP_014250080.1; Lazarillo-like (Cimex), XP_014249890; Lipocalin (Homo), CAA47889.1.
According to our analysis, currently known triabins are divided into three clades. The clade highlighted in green (Fig. 7; p = 0.87) contains members of the nitrophorin/BABP family, which are most prominent in venom from triatomine bugs from tribe Rhodniini (
) and the harpactorine P. plagipennis. Surprisingly, the nitrophorin/BABP family proteins from the triatomine tribe Triatomini are more closely related (p = 0.97) to P. plagipennis proteins than to nitrophorin/BABP family members from the triatomine tribe Rhodniini (p = 0.97), suggesting that at least two gene loci encoding members of this family existed in the common ancestor of Harpactorinae and Triatominae before they diverged ∼75 mya (
). In the blue clade (Fig. 7), P. plagipennis triabin-like venom protein 1 and a related P. plagipennis protein form the sister group to all remaining triabin family proteins (excluding the nitrophorin/BABP family) that have so far been documented in triatomine venom. The final clade highlighted in tan (Fig. 7) contains only P. plagipennis proteins. These results illuminate the evolutionary history of the triabin family, suggesting that at least four genes encoding members of the triabin family existed in the common ancestor of Harpactorinae and Triatominae and that triabins subsequently radiated in both lineages.
DISCUSSION
In this study we have provided the first holistic analysis of a venom proteome of a predaceous assassin bug through the combination of transcriptomic and proteomic approaches. Our analysis revealed that P. plagipennis produces highly complex venom containing at least 127 peptide and protein components. Many of these venom components have structural and inferred functional similarity to peptides and proteins involved in neurotoxicity (Ptu1 family), membrane disruption and cytolysis (redulysins, bacterial-permeability-increasing peptide), enzymatic catabolism (proteases, phosphatases, nucleases), and nutrient dissemination (transferrin). Thus, the suite of proteins present in the venom of P. plagipennis appears well suited to facilitating the dual activities of paralysis and tissue liquefaction previously observed to result from injection of reduviid venom (
The venom used in this study was obtained by electrostimulation to avoid contamination by glandular tissue. Venom obtained from other species of assassin bug by electrostimulation has been shown to have rapidly paralyzing and lethal effects on both vertebrates and invertebrates (
). Thus, reduviid venom obtained by electrostimulation shows the biological activities expected of prey-capture venom. Although we argue that the venom analyzed in this study is likely to perform dual roles in paralysis and liquefaction, we note that it is possible that the labial gland complex produces more than one kind of secretion specialized for prey capture, defense, or feeding, as suggested previously (
). We are currently conducting studies to establish the glandular origin of venom obtained by electrostimulation and the biological roles of the different parts of the labial gland complex.
The biochemical composition of venom from predaceous reduviids is unique in comparison with venom from blood-feeding reduviids as well as predatory arthropods such as spiders, scorpions, and centipedes. The venoms of most marine and terrestrial invertebrate predators are dominated by highly diverse, often Cys-rich, peptides with a molecular mass of less than 10 kDa (
). In contrast, we found that the putatively neurotoxic components in P. plagipennis venom, the Ptu1 family of small disulfide-rich peptides, make up a small proportion (1–3%) of venom, whereas the majority of venom proteins are proteases. Possibly, this is because some of the abundant proteins with unknown function, such as CUB domain proteins or venom family 1 proteins, confer the main neurotoxicity of the venom. The high proportion of enzymes compared with peptide neurotoxins may be related to the different feeding biology of assassin bugs compared with other venomous predators. Both spiders and scorpions, like assassin bugs, rely primarily on EOD for feeding (
). However, both spiders and scorpions practice “refluxing” EOD in which enzymes from the gut, and not from the stinger or fangs, are regurgitated onto or into prey to facilitate digestion. In contrast, heteropterans are one of the few arthropods that practice non-refluxing EOD (
); the sole source of proteins injected into prey by assassin bugs is the labial glands, not the gut. Snakes and cone snails are similar to assassin bugs in the sense that their organs of envenomation have evolved from oral structures, but both these animals are capable of swallowing prey whole and digesting them internally. Although proteases and other enzymes in the venoms of snakes and spiders have often been proposed to have roles in the digestion of prey, evidence that they are essential for digestion is relatively weak (
). For assassin bugs, which can only ingest liquid food, and which use the same anatomical structure for injecting venom for both prey capture and EOD, the requirement for digestive activity in venom is likely to be much stronger. This fundamental difference in feeding physiology may explain the relative abundance of proteases compared with peptides in assassin bug venom. Supporting this notion, the molecular weight distribution of P. plagipennis venom toxins is similar to that of the remipede Speleonectes tulumensis, another arthropod whose venom is thought to have a role in EOD as well as prey capture (
The first venomous crustacean revealed by transcriptomics and functional morphology: Remipede venom glands express a unique toxin cocktail dominated by enzymes and a neurotoxin.
The function of many P. plagipennis venom proteins is unknown, but our results provide some clues as to the evolutionary history and function of some protein families. For example, according to our phylogenetic analysis, the CUB domain protein family probably evolved from the CUB domains of S1 proteases through loss of the proteolytic domain prior to 190 mya. Thus, CUB domain proteins likely originated shortly after the divergence of Heteroptera and its sister group Coleorrhyncha (moss bugs) around 240 mya (
). We suggest that the most plausible account of the evolution of venoms in Heteroptera is that, between the divergence of Heteroptera and Coleorrhyncha and the last common ancestor of Heteroptera, the protease-rich salivary secretions used by both predaceous and phytophagous heteropterans for EOD (
Alomar O. Zoophytophagous Heteroptera: Implication for Life History and Integrated Pest Management. Thomas Say Publications in Entomology,
Lanham, MD, 1996: 1-7
) evolved additional toxic and paralytic activities in response to selection for efficient prey capture. This might have occurred either through the recruitment of toxins from proteins expressed elsewhere in the body or via the evolution of salivary enzymes into venom toxins. The latter pattern is exemplified in another orally derived venom, the phospholipase A2 (PLA2) family of snake venoms. PLA2 enzymes are themselves powerful toxins that function by cleaving phospholipids into toxic signaling molecules, but one group (group II snake venom PLA2 with 49Lys) subsequently lost crucial active-site residues and acquired diverse non-enzymatic activities, including neurotoxicity, myotoxicity, and modulatory activity on coagulation and platelet aggregation pathways (
). CUB domain proteins apparently result from a similar evolutionary process, i.e. the neofunctionalization of a digestive enzyme present in the saliva of ancestral heteropterans into a non-enzymatic venom component. Although the function of CUB domain proteins is unknown, their conservation over 190 million years and high abundance in P. plagipennis venom suggest that they perform an important, non-enzymatic function in prey capture and/or feeding.
Functionally, the venoms of predaceous reduviids are very different to the comparatively well characterized venoms produced by blood-sucking triatomine reduviids. Instead of paralysis and liquefaction, triatomine venom works to disrupt host hemostatic systems (
). Reflecting these different actions, we found that the venom of P. plagipennis was markedly different in protein composition compared with triatomine bug venom. Insofar as the venom proteome of P. plagipennis can be generalized to other predaceous reduviids, our results indicate that the shift to blood-feeding by triatomines is likely to have been accompanied by a strong decrease in the expression of proteases, cytolytic toxins, and Ptu1 family peptides, with a concomitant increase in the number and expression of triabins accompanied by the recruitment of Kazal domain proteins and 5′-nucleotidase-type apyrases. Regardless of their major differences in composition, there are many protein families with representatives in the venom of both predaceous and blood-sucking reduviids, including proteases, redulysin/trialysins, inositol phosphate phosphatase, cystatin domain proteins, serpins, and triabins.
This study highlights the power of combining transcriptomic and proteomic approaches for providing a holistic overview of venom proteomes (
). Our results provide a solid foundation for understanding the role played by individual venom components in prey capture and liquefaction in predaceous assassin bugs and provide key insights into the different pathways of venom evolution in predaceous and hematophagous heteropterans.
Nucleic acid and amino acid sequences were deposited in GenBank™ with accession numbers KX459564–KX459693 and KX752800–KX752820.
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository (
We thank Alun Jones at the Queensland Bioscience Precinct Mass Spectrometry Facility for help with proteomics experiments, Greg Baillie and Angelika Christ from the IMB Sequencing Facility, Max Rosenthal and Caleb Stewart for help with experimental procedures, and the many people who kindly donated assassin bugs to this project.
Approaching the golden age of natural product pharmaceuticals from venom libraries: an overview of toxins and toxin-derivatives currently involved in therapeutic or diagnostic applications.
Proteomics and deep sequencing comparison of seasonally active venom glands in the platypus reveals novel venom peptides and distinct expression profiles.
Insecticidal activity of venomous saliva from Rhynocoris fuscipes (Reduviidae) against Spodoptera litura and Helicoverpa armigera by microinjection and oral administration.
J. Venom. Anim. Toxins Incl. Trop. Dis. 2011; 17: 486-490
Utilizing the assassin bug, Pristhesancus plagipennis (Hemiptera: Reduviidae), as a biological control agent within an integrated pest management programme for Helicoverpa spp. (Lepidoptera: Noctuidae) and Creontiades spp. (Hemiptera: Miridae) in cotton.
The paragon algorithm, a next generation search engine that uses sequence temperature values and feature probabilities to identify peptides from tandem mass spectra.
Author contributions: A.A.W., E.A.U., and G.F.K. designed the research; A.A.W., B.M., and J.J. performed the research; B.G.F. contributed new reagents or analytic tools; A.A.W. analyzed the data; and A.A.W., B.M., J.J., E.A.U., B.G.F., and G.F.K. wrote the paper.