Rice Proteomics

The technique of proteome analysis with two-dimensional PAGE has the power to monitor global changes that occur in the protein expression of tissues and organisms and/or expression that occurs under stresses. In this study, the catalogues of the rice proteome were constructed, and a functional characterization of some of these proteins was examined. Proteins extracted from tissues of rice and proteins extracted from rice under various kinds of stress were separated by two-dimensional PAGE. An image analyzer was used to reveal a total of 10,589 protein spots on 10 kinds of two-dimensional PAGE gels stained by Coomassie Brilliant Blue. The separated proteins were electroblotted onto a polyvinylidene difluoride membrane, and the N-terminal amino acid sequences of 272 of 905 proteins were determined. The internal amino acid sequences of 633 proteins were determined using a protein sequencer or mass spectrometry after enzyme digestion of the proteins. Finally, a data file of rice proteins that included information on amino acid sequences and sequence homologies was constructed. The major proteins involved in the growth and development of rice can be identified using the proteome approach. Some of these proteins, including a calcium-binding protein that turned out to be calreticulin and a gibberellin-binding protein, which is ribulose-1,5-bisphosphate carboxylase/oxygenase activase in rice, have functions in the signal transduction pathway. The information thus obtained from the rice proteome will be helpful in predicting the function of the unknown proteins and will aid in their molecular cloning.

to study alterations in cellular protein expression in response to various stimuli or as a result of differentiation and development (3). The latter approach further allows cDNA cloning from the resultant sequence(s).
The Rice Genome Research Project is a joint project of the National Institute of Agrobiological Sciences and the Institute of the Society for Techno-innovation of Agriculture, Forestry, and Fisheries. In addition, partial support comes from the Genome Research Program of the Japanese Ministry of Agriculture, Forestry, and Fisheries. The program started in October 1991, and the first phase continued through 1997, resulting in the establishment of some of the basic tools of rice genome analysis. The Rice Genome Research Project was reorganized into a national project in 1998; objectives of the organization are to completely sequence the entire rice genome and to pursue integrated goals in functional genomics, genome informatics, and applied genomics. Two important objectives in rice proteome research are 1) to determine whether the cDNA encoding particular proteins from the cDNA library constructed from rice can be identified by a computer search of an amino acid sequence homology and 2) to predict the function of the proteins and study the physiological significance of functional proteins in rice.
Sequencing of a protein separated by two-dimensional PAGE became possible with the introduction of protein electroblotting methods that allow the efficient transfer of a sample from the gel matrix onto a support that is suitable for gas-phase sequencing or related techniques (4). Proteins can also be recognized by their amino acid composition, their exact molecular weight as determined by mass spectrometry (MS), or their partial amino acid sequence. In conjunction with automated gel scanning and computer-assisted analysis, two-dimensional PAGE has contributed greatly to the development of a protein data base (5)(6)(7)(8).
Gel-separated proteins can be identified rapidly by MS, and if genomic information is also available, such analyses permit the systematic identification of the protein complement of a genome (9). In addition, MS is a powerful tool for the analysis of isoforms, secondary modifications of proteins such as glycosylation and phosphorylation, and proteolysis, and it requires only small amounts (picomoles to attomoles) of protein (10).
Such systematic analyses of protein populations are summarized by the term proteomics. Thus, proteomics bridges the gap between genomic sequence information and the actual protein population in a specific tissue, cell, or cellular compartment.
Concerning the rice plant, some well known studies have dealt with the construction of proteomes from complex origins, such as the leaf, embryo, endosperm, root, stem, shoot, and callus proteome (11)(12)(13)(14)(15). Proteomic studies to date have mainly focused on those changes in genome expression that are triggered by environmental factors. Examples of descriptive proteomes include the global comparison of green and etiolated rice shoots (13) and an analysis of defense-associated responses in the rice leaf and leaf sheath following a jasmonic acid treatment (16). One major advantage of the rice two-dimensional PAGE data base, in which most known proteins are recorded, is the wealth of new proteins on which experiments can be conducted at the biochemical and molecular levels. In addition to facilitating the identification of known proteins, these sequences can be used to prepare oligodeoxyribonucleotides, which are essential for cloning the corresponding cDNA. The aim of this study was to separate proteins from rice, to determine their relative molecular weights and isoelectric points, and to perform N-terminal and internal amino acid sequence analysis using a protein sequencer and MS. Finally, an attempt was also made to study the physiological significance of some proteins thus identified from rice.
Gel Electrophoresis-A portion of the rice tissues was removed, homogenized with a lysis buffer (2), and centrifuged. The supernatant was subjected to two-dimensional PAGE (2). Isoelectric focusing or immobilized pH gradient was carried out in a glass tube with a length of 13 cm and a diameter of 3 mm. SDS-PAGE in the second dimension was performed with 15% separation gels. The isoelectric point and relative molecular weight of each protein were determined using the two-dimensional PAGE standard (Bio-Rad). The localization sites of individual proteins on the gels were evaluated automatically with Image Master 2D Elite software (Amersham Biosciences).
N-terminal Amino Acid Sequence Analysis-Following separation by two-dimensional PAGE, the proteins were electroblotted onto a polyvinylidene difluoride membrane (ProBlott, Applied Biosystems, Foster City, CA) and detected by Coomassie Brilliant Blue (CBB) (11). The spots were excised from the polyvinylidene difluoride membrane and applied to the upper glass block of the reaction chamber in a gas-phase protein sequencer (Procise, Applied Biosystems, Foster City, CA). Edman degradation was performed according to the standard program supplied by Applied Biosystems (Foster City, CA).
Internal Amino Acid Sequence Analysis-The proteins were separated by two-dimensional PAGE and stained with CBB. Gel pieces containing protein spots were removed, and the protein was electroeluted from the gel pieces using an electrophoretic concentrator (ISCO, Lincoln, CA) at 2 watts of constant power for 2 h. After electroelution, the protein solution was dialyzed against deionized water for 2 days and lyophilized. The protein dissolved in the SDS sample buffer (pH 6.8) was applied to a sample well in SDS-PAGE. The sample solution was overlaid with a solution containing Staphylococcus aureus V8 protease (Pierce, Rockford, IL). Electrophoresis was performed until the sample and protease were stacked in the upper gel and interrupted for 30 min to digest the protein (17). Electrophoresis was then continued, and the separated digests were electroblotted onto the polyvinylidene difluoride membrane and subjected to gas-phase sequencing (18).
Homology Search of Amino Acid Sequences-The amino acid sequences obtained were compared with those of proteins in the amino acid sequence data base (MPSRCH-pp protein-protein data base, University of Edinburgh, Edinburgh, Scotland, UK).
Analysis Using Mass Spectrometry-The CBB-stained protein spots were excised from a gel, washed with 25% (v/v) methanol and 7% (v/v) acetic acid for 12 h at room temperature, and destained with 50 mM NH4HCO3 in 50% (v/v) methanol for 1 h at 40°C. The proteins were reduced with 10 mM dithiothreitol in 100 mM NH4HCO3 for 1 h at 60°C and incubated with 40 mM iodoacetamide in 100 mM NH4HCO3 for 30 min at room temperature. The gel pieces were minced, allowed to dry, and then rehydrated in 100 mM NH4HCO3 with 1 pmol of trypsin at 37°C overnight. The digested peptides were extracted from the gel slices with 0.1% trifluoroacetic acid in 50% (v/v) acetonitrile/water three times. The peptide solution thus obtained was dried, reconstituted with 30 µl of 0.1% trifluoroacetic acid in 5% acetonitrile/water, and then desalted with ZipTip C18 pipette tips (Millipore, Bedford, MA). Matrix-assisted laser desorption ionization (MALDI) MS was performed using a Voyager Elite XL time-of-flight mass spectrometer (Applied Biosystems, Framingham, MA). The above peptide solution was mixed with the matrix solution, the supernatant of a 50% acetonitrile solution saturated with α-cyano-4-hydroxycinnamic acid, and then air-dried on the flat surface of a stainless steel plate. Calibrations were carried out using a standard peptide mixture (19). The mass spectra were subjected to a sequence data base search with Mascot software (Matrix Science Ltd., London, UK).
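The principle behind the peptide mass fingerprint search can be illustrated with a minimal Python sketch. It is not the Mascot scoring algorithm; it only digests a sequence in silico with the usual trypsin rule (cleavage after K or R, but not before P), computes monoisotopic [M+H]+ values with carbamidomethylated cysteine (the product of iodoacetamide treatment), and lists observed peaks that fall within a mass tolerance. The function names, the 100 ppm tolerance, and the example sequence are assumptions made for illustration and do not come from this study.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
CARBAMIDOMETHYL = 57.02146  # mass added to Cys by iodoacetamide alkylation


def tryptic_peptides(sequence):
    """Split a protein sequence after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, Cys alkylated."""
    mass = sum(RESIDUE_MASS[aa] for aa in peptide)
    mass += CARBAMIDOMETHYL * peptide.count("C")
    return mass + WATER + PROTON


def match_peaks(observed, sequence, tol_ppm=100.0):
    """Return (observed mass, peptide) pairs that agree within tol_ppm."""
    theoretical = {p: mh_plus(p) for p in tryptic_peptides(sequence)}
    hits = []
    for obs in observed:
        for peptide, mass in theoretical.items():
            if abs(obs - mass) / mass * 1e6 <= tol_ppm:
                hits.append((obs, peptide))
    return hits


if __name__ == "__main__":
    protein = "MKWVTFISLLRPAGCKDEGHR"  # hypothetical sequence, not a rice protein
    for pep in tryptic_peptides(protein):
        print(pep, round(mh_plus(pep), 4))
```

A real search engine additionally handles missed cleavages, variable modifications, and statistical scoring of the match, none of which is attempted here.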
cDNA Microarray Analysis-A cDNA microarray containing independent rice cDNA clones of 9,600 expressed sequence tags was used. One microgram of mRNA sample prepared from lamina treated with 1 µg of brassinolide (BL) for 48 h or water as a control was reverse-transcribed in a 20-µl volume containing 1 mM Cy5 dCTP (Amersham Biosciences), anchored oligo(dT)25, random nonamer, dithiothreitol, dNTP, and SuperScript II (Invitrogen). After incubation at 42°C for 2 h, the reaction was stopped, and RNA was degraded by first heating at 94°C for 3 min and then treating with NaOH at 37°C for 15 min. Fluorescently labeled probes were purified using a QIAquick PCR purification kit (Qiagen, Hilden, Germany). Probe hybridization and scanning of the hybridized microarray slide were conducted. Data were analyzed using Array Vision (Imaging Research Inc., Ontario, Canada) (20).
Transformation of Rice Mediated by Agrobacterium-Agrobacterium tumefaciens strain EHA101 (a gift from Dr. E. Hood) has been described previously (21). Plasmids were introduced into this strain by electroporation. A. tumefaciens was grown on AB medium (1 g/liter NH4NO3, 296 mg/liter MgSO4, 10 mg/liter CaCl2, 1.3 g/ml NaH2PO4, 3 g/liter K2HPO4, 150 mg/liter KCl, 2.5 mg/liter FeSO4, 5 g/liter glucose, 15 g/liter bacto agar) at 28°C. The rice gene was connected to the 35S promoter in pIG121-Hm (a gift from Dr. K. Nakamura), a binary vector that contains a kanamycin resistance gene (npt), a hygromycin resistance gene (hpt), and the intron-gus in the T-DNA region. The transformation was done as reported previously (22). The regenerated rice was eventually transferred to soil in pots and grown to maturity in a greenhouse.

Cataloguing of the Rice Proteome Using Two-dimensional Electrophoresis/Edman Sequencing/MS Analysis
Enlargement of the Rice Protein Data File-The technique of proteome analysis with two-dimensional PAGE has the power to monitor global changes that occur in the protein expression of tissues and organisms whether or not they are under stress. In this study, proteins extracted from endosperm, embryo, root, callus, green shoot, etiolated shoot, leaf sheath, leaf blade, lamina joint, and panicle were separated by two-dimensional PAGE (Fig. 1).
Using an image analyzer, a total of 10,589 protein spots were detected on 10 kinds of two-dimensional PAGE gels stained by CBB. The separated proteins were electroblotted onto a polyvinylidene difluoride membrane, and the N-terminal amino acid sequences of 272 of 905 proteins were determined. The N-terminal regions of the remaining proteins could not be sequenced, and it was concluded that they had a blocking group at the N terminus. The internal amino acid sequences of 633 proteins were determined using the gas-phase protein sequencer or MS after an enzyme digestion of proteins. Finally, a data file of rice proteins was constructed that included information on amino acid sequences and sequence homologies (Table I). The rice cDNA catalogue contains about 39.6% of the genes in the entire genome. The cDNA encoding particular proteins could be screened with a 40% probability using a computer search for sequence homology.
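The spot and sequence counts quoted above can be cross-checked with a few lines of Python. The sketch assumes that the 633 internally sequenced proteins are the N-terminally blocked remainder of the 905 proteins analyzed, which is how the numbers add up; the variable names are illustrative only.

```python
total_spots = 10_589      # CBB-stained spots detected on the 10 gel types
proteins_analyzed = 905   # proteins subjected to sequencing
n_terminal = 272          # N-terminal sequences obtained
internal = 633            # internal sequences obtained after digestion

# The two sequencing routes together account for every analyzed protein.
assert n_terminal + internal == proteins_analyzed

frac_n_terminal = n_terminal / proteins_analyzed
frac_sequenced = proteins_analyzed / total_spots

print(f"N-terminally sequenced: {frac_n_terminal:.1%} of analyzed proteins")
print(f"Sequenced proteins:     {frac_sequenced:.1%} of detected spots")
```

Under this reading, roughly three in ten analyzed proteins yielded an N-terminal sequence directly, and the sequenced set covers under a tenth of all detected spots.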
Comparison of Proteome and cDNA Microarray Techniques to Monitor Changes in Protein and Gene Expression-Brassinosteroids (BRs) are a group of naturally occurring plant steroids with structural similarities to insect and animal steroid hormones (23). Exogenous application of BRs to plant tissues evokes various growth responses such as cell elongation, proliferation, differentiation, organ bending, and a number of other physiological processes (24). It is believed that BR affects plant growth through the regulation of gene expression. However, only a few BR-regulated genes have been identified. The changes of gene expression caused by BR were systematically analyzed in rice seedlings using a combination of proteome and cDNA microarray approaches. The bending of the second leaf and its leaf sheath in rice seedlings is very sensitive to BRs, and this unique characteristic of rice leaves has been used as a quantitative bioassay for BRs (25). We adopted this model system and found that 1 µM BL treatment caused the maximum bending under these experimental conditions (Fig. 2).
First, proteins were extracted from lamina joints treated with 1 µM BL for 48 h or water as a control and analyzed by two-dimensional PAGE. A systematic comparison of protein patterns showed that six proteins were increased in the lamina joint treated with BL compared with the water control (Fig. 2). Sequence analysis revealed that two protein spots (LJ258 and LJ262) were homologous to the ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit, and one protein (LJ133) showed homology to glyceraldehyde-3-phosphate dehydrogenase, which is a key enzyme in glycolysis. The other three protein spots (LJa, LJb, and LJ195) did not display any significant homology to proteins in the data bases searched.
FIG. 1. Strategy to determine the amino acid sequence in rice proteome analysis. The proteins were separated by two-dimensional PAGE with isoelectric focusing in the first dimension and SDS-PAGE in the second dimension according to their isoelectric point (pI) and then according to their molecular mass, resulting in a two-dimensional gel. The spots were visualized by CBB staining and then scanned for image analysis. To obtain sequence tags by Edman sequencing, we stained gels with CBB before blotting to increase the sensitivity and to allow easier matching of the gels. The spots contained 5-20 pmol. For MS, individual protein spots were then selected, excised from the gel, and digested with the site-specific protease trypsin, resulting in a set of tryptic peptides. The peptides were extracted, and their masses were measured by MALDI-TOF MS. The list of measured peptide masses was compared with the masses of the predicted tryptic peptides for each entry in the sequence data base.
Second, a cDNA microarray containing 1,265 independent rice genes randomly selected from 9,600 expressed sequence tags was used to analyze differential gene expression caused by BL. The arrays were hybridized with Cy5 fluorescently labeled probes of lamina joint sample treated with 1 µM BL or water as a control (Fig. 2). Fluorescent signal differences greater than 2-fold between the control and BL-treated samples were considered to be significant. Data analysis showed that the expression of 12 genes was enhanced by the BL treatment (Fig. 2). Among them, five genes had homologies based on a search in the GenBank data base using the BLAST program. A vacuolar H+-transporting ATPase homologue (Element 0145) showed higher expression in the BL-treated lamina joint. Element 0550 was homologous to Bromheadia finlaysoniana mRNA for extensin, suggesting a role in BR-mediated cell expansion. Elements 0250 and 1029 showed homologies to the Arabidopsis ubiquitin-conjugating enzyme and Helianthus annuus mRNA for the 1-aminocyclopropane-1-carboxylic acid oxidase-related protein, respectively. Element 0654 was a rice photosystem II oxygen-evolving complex protein I. The other seven had no significant homologies in the data base.
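The 2-fold significance criterion applied to the array signals can be expressed as a short Python sketch. The element identifiers and intensity values below are hypothetical and serve only to show the rule; normalization and background correction, which the analysis software would normally perform, are assumed to have been done already.

```python
def fold_change_calls(control, treated, threshold=2.0):
    """Classify array elements whose treated/control signal ratio
    differs by more than `threshold`-fold in either direction."""
    enhanced, reduced = [], []
    for element, control_signal in control.items():
        treated_signal = treated[element]
        # Skip elements without a usable positive signal in both channels.
        if control_signal <= 0 or treated_signal <= 0:
            continue
        ratio = treated_signal / control_signal
        if ratio > threshold:
            enhanced.append(element)
        elif ratio < 1.0 / threshold:
            reduced.append(element)
    return enhanced, reduced


if __name__ == "__main__":
    # Hypothetical normalized intensities (arbitrary units).
    control = {"E001": 100.0, "E002": 100.0, "E003": 100.0}
    treated = {"E001": 250.0, "E002": 150.0, "E003": 40.0}
    print(fold_change_calls(control, treated))
```

With a fixed ratio cutoff of this kind, elements with weak signals are the most likely to cross the threshold by chance, which is one reason replicate hybridizations are generally advisable.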
FIG. 2. Comparison of proteome (A) and cDNA microarray (B) techniques to monitor changes in protein and gene expression caused by brassinolide in rice lamina joint. A, proteome analysis of protein expression after brassinolide treatment on lamina joint. B, microarray analysis of gene expression after brassinolide treatment on lamina joint. ACC, 1-aminocyclopropane-1-carboxylic acid; LSU, large subunit; IEF, isoelectric focusing.
By using proteome analysis of differential protein expression and cDNA microarray analysis of differential gene expression, we identified some changes at the transcription and translation levels caused by BL in the lamina joint. However, we did not find any overlap between the results of the two approaches in the present study. This can be explained in the following manner. 1) The amount of some proteins is far below the detection limit of CBB staining on two-dimensional PAGE. 2) The cDNA microarray that was used in the present study contains 1,265 genes, which account for only about 4% of the total number of genes predicted in rice, and the changed proteins detected in the proteome are not represented on this microarray.
Identification of Protein Sequences Using Protein Sequencer and Mass Spectrometry-Fifty-four proteins of leaf sheath from rice seedlings were analyzed by Edman sequencing and MS. For Edman sequencing, most of the proteins were N-terminally blocked. Using MS, all proteins were identified by matching them to proteins from rice and other species (Table II). The matched proteins include spot LS083, homologous to calreticulin, a calcium-binding protein located in the endoplasmic reticulum (26); spot LS261, matched to Bowman-Birk trypsin inhibitor; spot LS317, identified as Cu,Zn-SOD 1 (superoxide dismutase), a cytoplasmic protein that destroys radicals that are normally produced within the cells and are toxic to biological systems (27); spot LS322, found to be a C97454 rice callus cDNA clone; spot LS332, identified as a chloroplast Cu,Zn-SOD (28); and spot LS346, matched to the chloroplast RuBisCO small subunit, a component of the enzyme that catalyzes the first reaction in the Calvin cycle (29).
The N-terminal sequences of three proteins, spots LS327, LS328, and LS329, were identical to each other and to the sequence of Cu,Zn-SOD 2 (accession number P28757) from rice (30). Using MALDI-TOF MS, these proteins were identified as SOD proteins: spots LS327 and LS328 were homologous to Cu,Zn-SOD 2, and spot LS329 was homologous to Fe-SOD (accession number JG0179). Spots LS347, LS349, and LS351 had the same N-terminal sequences and were homologous to salt stress protein (31). In MS analysis, the three proteins displayed different peptide mass fingerprints and were matched to different salt-induced proteins. Two proteins (LS347 and LS351) were matched to the mannose-binding rice lectin (accession number BAA25369), and another (LS349) was matched to a salT gene product protein (accession number AAB53810).
Regulation of rice plant height is important for lodging resistance and for increasing the grain yield. Since the leaf sheath is the tissue responsible for elongation of rice plants, the plant length could be altered if the tissue growth is regulated. However, the mechanism of leaf sheath elongation remains largely unknown. The profile shows that calreticulin (LS083) and calmodulin-related protein (LS344) are expressed in the leaf sheath. Calreticulin also regulates Ca2+-dependent protein kinase and the regeneration of rice cultured suspension cells (26), suggesting that the Ca2+ signal is important for stem growth and development. The proteome data file includes many hypothetical proteins (32). Determination of the functions of these proteins is important because the proteins might be related to elongation or Ca2+ signaling pathways, and such knowledge would advance the understanding of leaf sheath elongation mechanisms.
Twenty-seven proteins were identified by both methods, and 19 of them (19/27 = 70%) gave the same homology in the rice data bases (Table II), whether the protein species analyzed were post-translationally modified forms or isoforms. Sometimes an MS-identified result needs to be further confirmed by sequence tags. However, with the combination of these two techniques, it is possible to identify the corresponding genes.

Functional Analysis
A Calcium-binding Protein That Turned Out to Be Calreticulin-Plant cultured cells are useful for transformation and recombination in crop improvements. The capacity for regeneration is thus essential in plant cultured cells. Short term cultured cells of rice can regenerate, but they do so to a lesser extent in long term cultures (Fig. 3). A phosphoprotein involved in the regeneration of rice cultured cells was identified using an in vitro phosphorylation assay (Fig. 3). This protein was purified, and the N-terminal and internal amino acid sequences were determined (22).
Using the proteome data file of rice cultured cells, this protein was identified as a calcium-binding protein, which is homologous to calreticulin of maize. Degenerate PCR primers, designed on the basis of the amino acid sequence of the protein, were used for PCR. A cDNA library was screened using the PCR fragment as a probe. Positive clones were isolated and analyzed by Southern blot. A full-length cDNA insert, CRO1, was sequenced (26).
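When degenerate primers are designed from an amino acid sequence, a common first step is to find a stretch of residues encoded by as few alternative codons as possible. The Python sketch below illustrates that idea using the codon counts of the standard genetic code. The peptide is a made-up example, not the sequence determined for the rice protein, and the function names are hypothetical. Practical primer design also weighs melting temperature, codon usage, and the 3' terminal wobble position, which are not considered here.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that can encode the peptide."""
    total = 1
    for aa in peptide:
        total *= CODON_COUNT[aa]
    return total


def least_degenerate_window(peptide, length):
    """Return (start index, window, degeneracy) for the window of the
    given length with the lowest degeneracy; the first one wins ties."""
    if length < 1 or length > len(peptide):
        raise ValueError("window length must be between 1 and len(peptide)")
    best = None
    for start in range(len(peptide) - length + 1):
        window = peptide[start:start + length]
        d = degeneracy(window)
        if best is None or d < best[2]:
            best = (start, window, d)
    return best


if __name__ == "__main__":
    example_peptide = "LRSMWKDEFYG"  # hypothetical sequence for illustration
    print(least_degenerate_window(example_peptide, 6))
```

Windows rich in Met and Trp, each encoded by a single codon, give the least degenerate primers, whereas Leu, Arg, and Ser (six codons each) are best avoided.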
To determine the precise function of calreticulin in rice tissues, the full-length cDNA for calreticulin was introduced into rice cells in the sense and antisense orientation. In our previous research, calreticulin negatively affected the rice callus regeneration and growth (22). In the present study, the overexpression of calreticulin inhibited plant regeneration from callus and seedling growth (Fig. 3). These results suggest that calreticulin is a negative regulator of differentiation and development. These stages of rice are affected by plant hormones. Calreticulin might play a significant role in plant hormone signal transduction.
A Gibberellin-binding Protein, Which Is RuBisCO Activase-Gibberellins (GAs) are a class of plant hormones that regulate growth and development, stem elongation, flowering, and seed germination (Fig. 4). The cellular responses to GAs are thought to be mediated by GA receptors. A few GA-binding proteins have been identified as candidates for GA receptors by using a variety of techniques. A GA-binding protein in rice was identified using a ligand binding assay (33). The GA-binding protein was purified, and N-terminal and internal amino acid sequences were determined.
Using the proteome data file of rice leaf sheath and leaf blade, this protein was identified as a homologue of the RuBisCO activase of barley. Degenerate PCR primers, designed on the basis of the amino acid sequence of RuBisCO activase, were used for PCR. A cDNA library was screened using the PCR fragment as a probe. Positive clones were isolated and analyzed by Southern blot. Two full-length cDNA inserts, OsrcaA1 and OsrcaA2, were sequenced (34).
To determine the precise function of the RuBisCO activase in rice tissues, the full-length cDNA for RuBisCO activase was introduced into the rice cells in the sense and antisense orientation. The overexpression of OsrcaA1 promoted seedling growth better than the overexpression of OsrcaA2 and vector control-transformed rice. Plant regeneration from callus and seedling growth were repressed in antisense Osrca transgenic rice. It is known that light activation of RuBisCO is one of the first processes adversely affected by elevated temperature. In spinach, the larger form of RuBisCO activase is more thermostable than the smaller form (35). In Arabidopsis, at physiological ratios of ADP/ATP, the larger isoform has minimal ATP hydrolysis and RuBisCO activation activity in comparison with the smaller isoform (36). Recently the function of the RuBisCO activase larger isoform has been identified. However, the precise function of the smaller isoform of RuBisCO activase is still not clear. The above results suggest that the small isoform might have a unique role in the regulation of leaf sheath elongation (Fig. 4II). The GA3 ligand binding assay shows that RuBisCO activase is a receptor for the GA signal transduction pathway(s). Stem and leaf growth were repressed in the antisense transformant, suggesting that the GA signal pathway is essential for rice development.

CONCLUSION
Two-dimensional PAGE separation and analysis provide a convenient way to study the various proteins that are present or induced in rice plants under different growth conditions, i.e. normal and stress. Knowing which proteins are being synthesized in specific tissues and at different developmental stages of rice under a variety of physiological conditions can lead to identifying the roles for these proteins. The study of partial amino acid sequence analysis of plant proteins will contribute greatly to the field of molecular biology for the identification of proteins through homology search.
The information thus obtained from the amino acid sequence of these proteins will be helpful in predicting the function of the proteins and will aid in their molecular cloning in future experiments.