Proteomic Studies of the Singapore Grouper Iridovirus

The Singapore grouper iridovirus (SGIV) genome consists of a double-stranded circular DNA of 140,131 base pairs with 162 predicted open reading frames. Our earlier study using peptide mass fingerprints generated from MALDI-TOF MS led to the identification of 26 viral proteins. The present investigation aimed to achieve a more comprehensive and precise identification of the SGIV viral proteome by two workflows: one-dimensional gel electrophoresis (1-DE) separation followed by protein identification by MALDI-TOF/TOF MS/MS (1-DE-MALDI workflow) and shotgun proteomics in which the whole virus was digested by trypsin and the resulting peptides were separated by nano-LC and analyzed by MALDI-TOF/TOF MS/MS (LC-MALDI workflow). In total, 44 viral proteins were identified, 25 of which were reported for the first time. Fourteen proteins were uniquely identified by the 1-DE-MALDI workflow, whereas another 10 proteins were only identified by the LC-MALDI workflow with 20 proteins found by both approaches. Moreover 13 proteins were found to have acetylated N termini. Twenty-three proteins identified contain predicted transmembrane domains, accounting for 52.3% of the total proteins identified. RT-PCR confirmed the transcription products of all the identified viral proteins. A large number of proteins identified by both the 1-DE-MALDI and the LC-MALDI workflows from this study have significantly enhanced the coverage of the SGIV proteome. The SGIV proteome is at present the only iridoviral proteome that has been extensively characterized. Our results should provide further insights into the biology of SGIV and other iridoviruses.

Iridoviruses are animal viruses that infect only invertebrates and poikilothermic vertebrates (1). Since the first discovery of an iridovirus in 1954 (2), more than 100 iridoviruses have been isolated and classified into four genera within the family Iridoviridae based on common characteristics such as the sources of host organisms, genetic properties, and morphological evidence (1). The Singapore grouper iridovirus (SGIV) is a species of the genus Ranavirus. Its complete genome sequence was first determined by our laboratory (3) and more recently reported by Tsai et al. (4). The entire viral genome consists of a double-stranded circular DNA of 140,131 base pairs with 162 predicted ORFs. To date, nine iridoviral genomes have been completely sequenced, with sizes varying between 105 and 212 kbp (Supplemental Table I).
Despite the abundance of genomic information, studies on iridoviruses have been limited to ORF prediction and cross-species genome comparison. The authenticity of an ORF as a functional entity needs to be verified experimentally. However, very little work has been carried out on the biological functions of these iridoviral genes except for a few highly conserved genes such as the major capsid protein and the ATPase. Several iridoviral genes, i.e. immediate-early, delayed-early, and late genes, were identified in frog virus 3 (FV3), but their biological functions have yet to be characterized (5-8).
The rapid development of the proteomic technologies including protein separation techniques and MS has greatly facilitated the identification and characterization of proteins, taking advantage of the availability of many genome sequences (9,10). Previously we reported the genome sequence and the proteomic analysis of SGIV (3). To our knowledge, it is the only iridoviral proteome that has been characterized. Twenty-six proteins were identified by peptide mass fingerprints (PMFs) whose gene sequences were further confirmed by RT-PCR and DNA sequencing of their respective RT-PCR products (3).
Although PMF is a fast and simple method for protein identification, there are several inherent limitations in identifying low abundance proteins, proteins of extreme pI values and molecular masses, and proteins from a mixture (11). These limitations can be overcome by MS/MS, which provides highly specific information of peptide fragmentation pattern as well as de novo sequence information. Thus, it has an advantage of correctly identifying proteins from a protein complex or mixture (12). The development of the 4700 Proteomics Analyzer MALDI-TOF/TOF mass spectrometer (Applied Biosystems) has allowed for a fast and accurate acquisition of both PMF and MS/MS data. Therefore, we utilized this instrument to re-examine the SGIV proteome, aiming to achieve a more comprehensive identification of the viral proteins. Besides using traditional one-dimensional gel electrophoresis to separate the viral proteome (1-DE-MALDI workflow), we also established an off-line LC-MALDI-TOF/TOF MS/MS workflow (LC-MALDI workflow). Both workflows in combination, as reported in the present study, greatly expanded the number of proteins identified in the SGIV proteome.
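The PMF principle mentioned above can be illustrated with a minimal Python sketch: a protein sequence is cleaved in silico by trypsin, the singly protonated monoisotopic mass of each peptide is computed, and observed masses are compared against that list. The example sequence, the 50 ppm tolerance, and all function names are assumptions made for illustration only; they are not taken from this study, and this is not the scoring used by any search engine.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
CARBAMIDOMETHYL = 57.02146  # mass added to Cys by iodoacetamide alkylation


def tryptic_peptides(sequence):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide, alkylated=True):
    """Monoisotopic [M+H]+ of a peptide, optionally with alkylated Cys."""
    mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON
    if alkylated:
        mass += CARBAMIDOMETHYL * peptide.count("C")
    return mass


def count_matches(observed, theoretical, tol_ppm=50.0):
    """Count observed masses within tol_ppm of any theoretical mass."""
    return sum(
        any(abs(o - t) / t * 1e6 <= tol_ppm for t in theoretical)
        for o in observed
    )


# Hypothetical sequence, not an SGIV ORF.
example_peptides = tryptic_peptides("AKRPGKM")
example_masses = [mh_plus(p) for p in example_peptides]
```

A fingerprint built this way depends only on peptide masses, which is why it degrades for mixtures and for small proteins that yield few peptides; MS/MS adds fragment-level evidence for each individual peptide.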

1-DE and In-gel Digestion-Preparation of the SGIV virion was described previously (3). The purified SGIV proteins were separated by 1-DE followed by Coomassie Blue staining. Reduction and alkylation were performed on the whole gel strip using 10 mM DTT and 55 mM iodoacetamide, respectively. After reduction and alkylation, the gel strip was cut into 64 consecutive gel pieces of ~2 mm in width. In-gel digestion was performed using a ZipPlate with a vacuum manifold apparatus (Millipore). Briefly, the gel pieces were further diced into smaller pieces of about 1 mm³ in size and placed into the ZipPlate wells followed by washing sequentially with 100 µl each of 25 mM NH4HCO3, 5% ACN and 25 mM NH4HCO3, 50% ACN for 30 min. The gel pieces were then dehydrated by 200 µl of ACN, and 15 µl of sequencing grade porcine trypsin (Promega) (12.5 ng/µl) was added to each well. The ZipPlate was incubated at 37°C overnight. The reverse-phase resins at the bottom of ZipPlate wells were wetted by adding 8 µl of ACN directly onto them and further incubated at 37°C for 15 min. The tryptic peptides were extracted by adding 130 µl of 0.2% TFA and incubated for 30 min at room temperature. Vacuum was applied to allow for the extract to pass through and the tryptic peptides to bind to the resins. The resins were washed twice with 100 µl each of 0.2% TFA, and the peptides were eluted with 20 µl of 0.1% TFA, 50% ACN and collected into a 96-well plate. After vacuum drying, the peptides were redissolved in 1 µl of matrix solution containing 5 mg/ml of α-cyano-4-hydroxycinnamic acid in 0.1% TFA, 50% ACN and spotted onto a 192-well stainless steel MALDI target plate (Applied Biosystems).
In-solution Digestion and LC Separation-Approximately 5 volumes of 0.1% SDS, 50 mM Tris, pH 8.5 were added to the purified SGIV virion, and the mixture was incubated at 100°C and vortexed occasionally till the viscous lump disappeared. After centrifugation at 13,000 rpm for 10 min, the supernatant was saved, and the protein concentration was determined by the RC DC protein assay kit (Bio-Rad) using bovine γ-globulin as the standard. The extracted proteins (200 µg) were further diluted to 160 µl with 0.1% SDS, 50 mM Tris, pH 8.5, and 4 µl of 50 mM triscarboxyethylphosphine were added. The mixture was incubated at 100°C for 10 min. After cooling down, 40 µl of 50 mM iodoacetamide were added, and the sample was incubated at 37°C for 2 h in the dark. Milli-Q water (200 µl) containing 20 µg of sequencing grade porcine trypsin was added, and the sample was further incubated at 37°C overnight.
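The reagent amounts in the in-solution digestion can be checked with a short calculation. The sketch below derives the working concentrations of the reducing and alkylating agents, the enzyme-to-substrate ratio, and the final protein concentration from the volumes stated above. It assumes that volumes are simply additive and that nothing is lost during the 100°C incubations; neither assumption is stated in the protocol, and the variable names are illustrative.

```python
# Quantities transcribed from the protocol (µg, µl, mM).
protein_ug = 200.0
sample_ul = 160.0
tcep_ul, tcep_stock_mM = 4.0, 50.0    # triscarboxyethylphosphine
iaa_ul, iaa_stock_mM = 40.0, 50.0     # iodoacetamide
trypsin_ug, trypsin_ul = 20.0, 200.0


def diluted_mM(stock_mM, added_ul, total_ul):
    """Concentration after adding a stock to reach total_ul (additive volumes)."""
    return stock_mM * added_ul / total_ul


# Assumption: volumes add up and no evaporation occurs.
vol_after_tcep = sample_ul + tcep_ul        # 164 µl
vol_after_iaa = vol_after_tcep + iaa_ul     # 204 µl
final_ul = vol_after_iaa + trypsin_ul       # 404 µl

tcep_mM = diluted_mM(tcep_stock_mM, tcep_ul, vol_after_tcep)
iaa_mM = diluted_mM(iaa_stock_mM, iaa_ul, vol_after_iaa)
enzyme_to_substrate = trypsin_ug / protein_ug      # w/w ratio
protein_ug_per_ul = protein_ug / final_ul

print(f"TCEP during reduction: {tcep_mM:.2f} mM")
print(f"Iodoacetamide during alkylation: {iaa_mM:.2f} mM")
print(f"Trypsin:protein = 1:{1 / enzyme_to_substrate:.0f} (w/w)")
print(f"Final protein concentration: {protein_ug_per_ul:.3f} µg/µl")
```

Under these assumptions the digest runs at roughly 1.2 mM reductant, 9.8 mM iodoacetamide, and a 1:10 (w/w) trypsin-to-protein ratio.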
Digested peptide mixture was separated using an Ultimate™ LC system (Dionex-LC Packings) equipped with a Probot™ MALDI plate spotting device. Three runs with different gradients were performed. For each run, ~10 µg of peptide mixture were injected and captured onto a 0.3 × 1-mm trap column (3-µm C18 PepMap™, 100 Å) (Dionex-LC Packings) and then eluted onto a 0.075 × 150-mm analytical column (3-µm C18 PepMap™, 100 Å) (Dionex-LC Packings). The flow rate of the system was 0.4 µl/min. The LC gradients used for the runs were: first run, from 100% buffer A (2% ACN, 0.05% TFA) to 60% buffer B (80% ACN, 0.04% TFA) over 3 h and then 60-90% B in 1 min and kept at 90% B for 5 min; second run, from 100% A to 20% B in 10 min, then 20-60% B over 3 h, and 60-100% B in 1 min and kept at 100% B for 5 min; third run, from 100% A to 30% B in 10 min, then 30-80% B over 3 h, and 80-100% B in 1 min and kept at 100% B for 5 min. The LC fractions were mixed with MALDI matrix (7 mg/ml α-cyano-4-hydroxycinnamic acid and 130 µg/ml ammonium citrate in 75% ACN) at a flow rate of 0.8 µl/min through a 25-nl mixing tee (Upchurch Scientific) before spotting onto the 192-well stainless steel MALDI target plates (Applied Biosystems) with a speed of one well per 30 s.
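The three gradient programs and the spotting parameters can be written down as data, which makes their differences easier to compare. The sketch below encodes each run as (start %B, end %B, minutes) segments and derives the program length, the ACN content at a given %B, and the liquid volume deposited per well. It assumes ideal volumetric mixing of buffers A and B and, for the spot count, that the eluate is spotted over the entire program; the text does not state the latter, so that figure is only an upper bound. All names are illustrative.

```python
# (start %B, end %B, minutes) segments transcribed from the text.
GRADIENTS = {
    "run 1": [(0, 60, 180), (60, 90, 1), (90, 90, 5)],
    "run 2": [(0, 20, 10), (20, 60, 180), (60, 100, 1), (100, 100, 5)],
    "run 3": [(0, 30, 10), (30, 80, 180), (80, 100, 1), (100, 100, 5)],
}
ACN_A, ACN_B = 2.0, 80.0          # % ACN in buffers A and B
LC_FLOW, MATRIX_FLOW = 0.4, 0.8   # µl/min
SECONDS_PER_WELL = 30


def run_minutes(segments):
    """Total programmed gradient time in minutes."""
    return sum(minutes for _, _, minutes in segments)


def acn_percent(percent_b):
    """ACN content (%) at a given %B, assuming ideal volumetric mixing."""
    return ACN_A + (ACN_B - ACN_A) * percent_b / 100.0


def spot_volume_ul():
    """Eluate plus matrix deposited in one well."""
    return (LC_FLOW + MATRIX_FLOW) * SECONDS_PER_WELL / 60.0


def max_spots(segments):
    """Upper bound on wells used if the whole program were spotted."""
    return run_minutes(segments) * 60 // SECONDS_PER_WELL


for name, segments in GRADIENTS.items():
    main_start, main_end, _ = max(segments, key=lambda s: s[2])
    print(
        f"{name}: {run_minutes(segments)} min, main ramp "
        f"{acn_percent(main_start):.1f}-{acn_percent(main_end):.1f}% ACN, "
        f"at most {max_spots(segments)} spots"
    )
print(f"volume per well: {spot_volume_ul():.1f} µl")
```

Expressed in ACN, the long ramps of the three runs cover different windows of organic content, which is consistent with the stated aim of separating peptides of different hydrophobicity.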
MALDI-TOF/TOF MS/MS Analyses-Samples on the MALDI target plates were analyzed using an ABI 4700 Proteomics Analyzer MALDI-TOF/TOF mass spectrometer (Applied Biosystems). For MS analyses, typically 1000 shots were accumulated for each sample. MS/MS analyses were performed using nitrogen at collision energy of 1 kV and a collision gas pressure of ~3.0 × 10⁻⁷ torr. A stop condition was used so that 2,000-10,000 shots were combined for each spectrum depending on the quality of the data.
MASCOT search engine (version 2.0; Matrix Science) was used to search all of the tandem mass spectra. GPS Explorer™ software version 3.0 (Applied Biosystems) was used to create and search files [...] (3). Gene-specific primers used to amplify the target genes are listed in Supplemental Table II.

Identification of SGIV Proteins-The 1-DE profile of the SGIV is shown in Fig. 1. The MS and MS/MS data from each of the in-gel digested samples were combined for the MASCOT database search. When the SGIV ORF database (162 ORFs) alone was used for the search, low scoring hits were reported as statistically significant matches even though the MS/MS spectra were poor. This was due to the small size of the database. Therefore, we combined the SGIV ORF database with the International Protein Index human database version 3.07 for the search to avoid random matches. Thirty-four proteins were identified with high confidence with sequence coverage from 6 to 70%. All the identified proteins contained at least one matched MS/MS spectrum with a MASCOT expect value less than 0.05 that was also the best match for that spectrum (Table I and Supplemental Document 1).
For the LC-MALDI workflow, the tryptic peptide mixture of the SGIV proteins was separated by a reverse-phase nano-LC column. Three separate LC-MALDI runs were performed with various ACN gradients to ensure good separation of peptides with different hydrophobicity. To maximize the sensitivity and reliability of the protein identification, all the MS/MS spectra from the three runs were combined for the database search. As a result, 30 proteins with high confidence were identified. The cut-off threshold for the MASCOT protein identification was the same as mentioned above (Table II and Supplemental Document 2).
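The acceptance rule applied in both workflows (at least one MS/MS match with an expect value below 0.05 that is also the top-ranked candidate for its spectrum) can be expressed as a simple filter. The sketch below is only an illustration of that rule: the record layout, the field names, and the example values are hypothetical and do not reflect the MASCOT or GPS Explorer export format or any result from this study.

```python
from dataclasses import dataclass


@dataclass
class PeptideMatch:
    orf: str        # protein/ORF the peptide maps to (placeholder names below)
    expect: float   # expect value reported for the peptide-spectrum match
    rank: int       # 1 = best-scoring candidate for that spectrum


def accepted_orfs(matches, max_expect=0.05):
    """ORFs with at least one rank-1 match whose expect value is below the cut-off."""
    return sorted(
        {m.orf for m in matches if m.rank == 1 and m.expect < max_expect}
    )


# Hypothetical matches, for illustration only.
matches = [
    PeptideMatch("ORF_A", 0.001, 1),   # passes both criteria
    PeptideMatch("ORF_A", 0.30, 1),    # extra weak match does not matter
    PeptideMatch("ORF_B", 0.20, 1),    # best match, but not significant
    PeptideMatch("ORF_C", 0.01, 2),    # significant, but not the best match
]
print(accepted_orfs(matches))
```

Because the rule is applied per protein, a single qualifying spectrum is sufficient; pooling spectra from the three LC runs before searching therefore raises the chance that each protein contributes at least one such match.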
It is interesting to note that both workflows produced complementary results (Fig. 2). There were 14 proteins identified only by 1-DE-MALDI and 10 others identified only by LC-MALDI with 20 proteins identified by both workflows. Of the 26 proteins identified previously by PMF, 19 were confirmed by MS/MS. We were also able to identify 25 additional proteins of SGIV. In total, 44 proteins were identified with high confidence. Fig. 3 shows the molecular mass distributions of the predicted ORFs and the detected proteins. For the proteins identified by PMF, we only included the 19 proteins that were confirmed by MS/MS for comparison. The MS/MS approach is not biased against small molecular mass proteins, and there is no significant difference between 1-DE-MALDI and LC-MALDI. Two proteins less than 10 kDa were identified, accounting for 4.5% of the total proteins identified. Another seven proteins identified were between 10 and 20 kDa, accounting for 15.9% of the total proteins identified. These are comparable to 11.1% (18 ORFs of <10 kDa) and 29.0% (47 ORFs of 10-20 kDa) of the predicted ORFs. On the other hand, the PMF approach only identified one protein less than 20 kDa and no proteins less than 10 kDa; thus the PMF approach is strongly biased against low molecular mass proteins. We also checked the pI distributions of the proteins identified (data not shown), and none of these methods are biased against proteins with extreme pI values.
In total, 13 proteins were found to have acetylated N termini (Table III). ORF069L, ORF082L, and ORF090L were acetylated at the N-terminal methionine residues. The remaining 10 proteins were acetylated at the second amino acids, indicating that the first N-terminal methionines of these ORFs were removed after translation.
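The workflow overlap and the mass-distribution percentages reported above follow from a few counts, and a short calculation confirms that they are mutually consistent. The sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Counts stated in the text.
only_1de, only_lc, both = 14, 10, 20
pmf_confirmed, newly_identified = 19, 25
predicted_orfs = 162

total = only_1de + only_lc + both     # all proteins identified
total_1de = only_1de + both           # proteins seen by 1-DE-MALDI
total_lc = only_lc + both             # proteins seen by LC-MALDI


def pct(part, whole):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100.0 * part / whole, 1)


print(f"total identified: {total}")
print(f"1-DE-MALDI: {total_1de}, LC-MALDI: {total_lc}")
print(f"confirmed + new: {pmf_confirmed + newly_identified}")
print(f"<10 kDa: {pct(2, total)}% of identified vs "
      f"{pct(18, predicted_orfs)}% of predicted ORFs")
print(f"10-20 kDa: {pct(7, total)}% of identified vs "
      f"{pct(47, predicted_orfs)}% of predicted ORFs")
```

The per-workflow totals of 34 and 30 agree with the numbers reported for the 1-DE-MALDI and LC-MALDI searches, and the 19 confirmed plus 25 new proteins account for all 44 identifications.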
RT-PCR Confirmation of the Gene Expression-To gain additional information about the expression of the viral genes, RT-PCR was used to detect the existence of mRNAs of the identified ORFs. Of the 44 identified ORFs, the mRNA expression of 15 of them was confirmed previously (3). The mRNA expression of the remaining ORFs was further examined by RT-PCR after 48 h of infection. All the genes were completely amplified (Fig. 4, A-D). Sequencing of those RT-PCR products confirmed their respective identities.

DISCUSSION
Of the 44 proteins identified from this investigation, 30 of them showed homology to predicted or verified iridoviral proteins (as indicated in Tables I and II). Some of them are putative functional proteins such as enzymes required for DNA replication and transcription and proteins involved in the regulation of cell apoptosis, indicating the importance of these proteins in viral infection and replication. Transcription of iridoviral DNA is a coordinated sequential process involving the production and regulation of mRNAs that can be classified into immediate-early, delayed-early, and late genes according to their temporal synthesis upon infection (1). ORF006R is homologous to a delayed-early 31-kDa protein in FV3 that is related to DNA replication (7,14). ORF086R is homologous to a putative immediate-early protein in FV3 that is involved in gene transcription (5). It is interesting to note that these two proteins are assembled into the mature virions. Elucidation of the functions of the above two early genes should be useful to uncover the infective mechanism of the SGIV.
Of the 14 proteins not showing any homology to known iridoviral proteins, ORF045L is homologous to an unnamed protein product of a freshwater pufferfish, Tetraodon nigroviridis. ORF112R shows homology to a tail fiber protein of Escherichia coli. It is uncertain whether this gene encodes a fibrillar structure surrounding the outer capsid. Such a structure was reported in the Chilo iridescent virus (15).
Control over the death machinery of the cell is important for virus survival. Many viruses produce antiapoptotic proteins to prevent premature death of the host cells to facilitate virus production or a persistent infection. On the other hand, some viruses promote host cell apoptosis to spread virus progeny to neighboring cells while evading host immune responses (16). The SGIV viral particle contains several apoptosis-related proteins (ORF025L, ORF115R, and ORF146L), suggesting that the manipulation of cellular apoptotic pathways might be an important mechanism for the propagation of this virus.
ORF025L contains a SAP motif (named after SAF-A/B, Acinus, and PIAS (protein inhibitor of activated STAT)), which binds specifically to DNA elements called scaffold/matrix attachment regions. SAF-A and Acinus are targets of caspase cleavage during apoptosis followed by chromatin degradation typical of programmed cell death (17). During apoptosis, SAF-A is cleaved in a caspase-dependent way. The cleavage occurs within the bipartite DNA-binding domain, resulting in the loss of the DNA binding activity and the concomitant detachment of SAF-A from nuclear structural sites. It may be inferred that the detachment of SAF-A, caused by the apoptotic proteolysis of its DNA-binding domain, could contribute to the nuclear breakdown during the host cell apoptosis (18). Whether the viral protein ORF025L might inhibit apoptosis by providing extra substrate for caspase deserves further investigation. ORF115R is homologous to the apoptotic Bcl-2 family of proteins. Members of the Bcl-2 family play an important role in tissue homeostasis, embryogenesis, and immune response through their actions as either inhibitors or promoters of apoptosis (19). The Bcl-2 family members can be subdivided into antiapoptotic members (e.g. Bcl-2, Bcl-xL, Bcl-w, and Mcl-1) and proapoptotic members (e.g. Bax and Bak), which are characterized by the presence of three conserved domains designated as BH1, BH2, and BH3. The additional BH4 domain is unique among the antiapoptotic members (20). These domains are necessary for the formation of homodimers and heterodimers through which the proteins exert their biological functions in vivo. The BH3 domains are conserved in human and fish Bcl-2 family members. However, in viruses they are poorly conserved compared with other eukaryotic Bcl-2 homologs (Fig. 5). It has been reported that adenovirus and herpesviruses produce Bcl-2 analogs to inhibit apoptosis and thus prevent premature death of the host cells, which would hamper virus production (16). Elucidation of the biological function of the SGIV Bcl-2 homolog will allow us to specifically target this protein for drug design and antiviral therapy in aquaculture.
ORF146L contains a ubiquitin ligase E3 conserved domain, a C-terminal RING domain (Conserved Domain Database accession number KOG1814). Ubiquitination is a protein modification process involving a group of enzymes to transport the ubiquitin as a tag by which the protein transport machinery delivers a protein to the proteasome for degradation (21). It is known to play important roles in cell apoptosis. A group of proteins known as the inhibitors of apoptosis proteins, some of which contain the C-terminal RING domain, function as ubiquitin ligase E3 to promote ubiquitination and proteasome degradation of key apoptosis initiator and effector caspases as well as Smac (second mitochondria-derived activator of caspases) (22,23). It will be interesting to see whether the viral protein ORF146L is involved in the manipulation of the host cell apoptotic process.
Besides the above mentioned proteins that assembled into the SGIV viral particle, the virus genome also encodes proteins that are homologous to tumor necrosis factor (TNF)-α (ORF136R) and TNF receptors (TNFRs) (ORF050L, ORF051L, and ORF096R). Several members of the TNFR superfamily are potent inducers of apoptosis (24). Some viruses, such as the rabbit poxvirus, produce TNFR orthologs as decoy receptors to neutralize the TNF signal and thus suppress host cell apoptosis. It is likely that the TNFR homologs of SGIV might have a similar function. Elucidation of the biological functions of the above apoptosis-related genes should shed light on the infective mechanism of SGIV and virus-host interaction.

CONCLUSIONS
Applying 1-DE-MALDI and LC-MALDI workflows, we achieved a more comprehensive identification of SGIV viral proteins. Both workflows were equally effective and complementary to each other. Nineteen of 26 proteins previously identified by PMF were confirmed, and an additional 25 viral proteins were identified. More than half of the identified proteins are predicted to be membrane proteins. Therefore, we have established a platform and strategy suitable for the characterization of other viral proteomes as well as membrane proteomes. Confirmation of the viral gene products by MS should provide a better insight into the biology of SGIV and other iridoviruses.