Originally published In Press as doi:10.1074/mcp.M500312-MCP200 on February 2, 2006.

Molecular & Cellular Proteomics 5:811-823, 2006.
© 2006 by The American Society for Biochemistry and Molecular Biology, Inc.


Research

Proteome Analysis of an Aerobic Hyperthermophilic Crenarchaeon, Aeropyrum pernix K1*,S

Syuji Yamazaki‡, Jun Yamazaki, Keiko Nishijima, Rie Otsuka, Miyako Mise, Hanako Ishikawa, Kazumi Sasaki, Shin-ichi Tago and Katsumi Isono

From the Department of Biotechnology, Genome Analysis Division, National Institute of Technology and Evaluation (NITE), 2-49-10 Nishihara, Shibuya, Tokyo 151-0066, Japan


    ABSTRACT
We analyzed the proteome of a crenarchaeon, Aeropyrum pernix K1, by using the following four methods: (i) two-dimensional PAGE followed by MALDI-TOF MS, (ii) one-dimensional SDS-PAGE in combination with two-dimensional LC-MS/MS, (iii) multidimensional LC-MS/MS, and (iv) two-dimensional PAGE followed by amino-terminal amino acid sequencing. These methods were found to be complementary to each other, and biases in the data obtained in one method could largely be compensated by the data obtained in the other methods. Consequently a total of 704 proteins were successfully identified, 134 of which were unique to A. pernix K1, and 19 were not described previously in the genomic annotation. We found that the original annotation of the genomic data of this archaeon was not adequate in particular with respect to proteins of 10–20 kDa in size, many of which were described as hypothetical. Furthermore the amino-terminal amino acid sequence analysis indicated that surprisingly the translation of 52% of their genes starts with TTG in contrast to ATG (28%) and GTG (20%). Thus, A. pernix K1 is the first example of an organism in which TTG is the most predominant translational initiation codon.


The rapid accumulation of genomic sequence data has produced a huge set of new protein sequences, most of which have been derived from the translation of predicted ORFs with no accompanying experimental evidence for their expression and function. Functional prediction is often made by comparison with sequences already in the public databases.

Many microorganisms, in particular Archaea, live in extreme habitats such as high temperature, acidic pH, high salt concentration, etc., and many of them possess unusual cellular and molecular properties. They are collectively termed "extremophiles" and could potentially serve as valuable resources for novel biotechnological applications. Nonetheless there are few existing industrial applications in which either archaeal biomass or archaeal enzymes are used. This is partly due to the lack of data for the expression of individual genes predicted from genome analysis. Such data are best obtained by proteome analysis.

Aeropyrum pernix K1 is an aerobic hyperthermophilic crenarchaeon isolated in 1993 from a coastal solfataric thermal vent in Kodakara-jima Island of Kagoshima, Japan. It grows optimally at 90–95 °C (1). Many of the thermostable enzymes of this archaeon are expected to be useful for a variety of industrial applications. The complete genomic sequence of A. pernix K1 was established in 1999, and ~2,700 ORFs were predicted from the sequence of nearly 1.67 Mb in size. The data were made available to the public through DDBJ/EMBL/GenBank™ as well as the "Database of the Genomes Analyzed at NITE" (DOGAN). About 1,600 of the predicted 2,700 ORFs were hypothetical (2). Moreover the number of predicted ORFs is much larger than those of other Archaea and bacteria with similar genome sizes, casting doubt over the authenticity of the predicted ORFs. Natale et al. (3) reannotated the A. pernix K1 genome using the Clusters of Orthologous Groups of Proteins database and reported the total number of its protein-coding genes to be 1,871. Similarly the current RefSeq contains an annotation reported by Pruitt et al. (4) in which 1,841 proteins were predicted in the A. pernix genome, and Guo et al. (5) re-evaluated the A. pernix K1 annotation and inferred a total of 1,610 ORFs as potential protein-coding genes. The confusion concerning the annotation of the A. pernix K1 genome is one of the factors that might have hindered widespread utilization of A. pernix K1 enzymes, many of which are expected to possess excellent thermostability.

There is an additional problem: from the genomic and proteomic analyses performed to date, ATG is the most common initiation codon, and GTG and TTG are used in less than 10% of bacterial genes (6, 7). In contrast, of some 2,700 ORFs predicted in the genome of A. pernix K1, 43% were deduced to be initiated with ATG and 57% were deduced to be initiated with GTG, which differs greatly from other species. Furthermore in A. pernix K1, genes initiated with TTG were reported (8) even though TTG had not been reported as an initiation codon in other organisms.

The problems described above can only be experimentally clarified by performing proteome analysis. For this purpose, we adopted four methods to maximize the number of detected proteins. Consequently we were able to identify 704 proteins, including 19 that were derived from the genomic regions in which no ORFs were predicted previously (2). The results suggest at the same time that the number of predicted ORFs in the current version of DOGAN is largely overestimated due to the inclusion of ORFs for non-conserved hypothetical proteins with molecular mass of 10–20 kDa. Furthermore amino-terminal amino acid sequences of 134 proteins were determined from which we were able to establish that surprisingly TTG is the most predominant initiation codon in A. pernix K1.


    EXPERIMENTAL PROCEDURES
Strain and Culture
A. pernix strain K1 deposited at NITE Biological Resource Center (NBRC 100138) was cultured at 90 °C in 400 ml of jamarine-yeast extract-trypticase peptone medium (1) for 20 h, cooled down on ice, and harvested by centrifugation at 5,000 x g at 4 °C for 10 min. Cellular pellets were resuspended in 3.5% NaCl and recentrifuged.

Protein Preparation
Two-dimensional (2D)-PAGE and 1D-SDS-PAGE-LC-MS/MS—
A. pernix K1 cells were suspended in an extraction buffer (67% acetic acid containing 33 mM MgCl2) and disrupted by sonication at 4 °C. Cell debris was removed by centrifugation, and 4 volumes of 20 mM DTT in acetone were added to the supernatant. The mixture was stored at –20 °C, and the protein precipitates were collected by centrifugation and dried.

MD-LC-MS/MS—
A. pernix K1 cells were suspended in distilled water and lysed by homogenization in S-203 (AS ONE, Osaka, Japan) for 30 s on ice.

2D-PAGE
Protein Separation by 2D-PAGE—
IEF was performed on either 180-mm IPG strips with the pH range of 3–10 (Amersham Biosciences) or IPG ReadyStrips with the pH range of 3–6 or 5–8 (Bio-Rad). Protein samples were dissolved in a lysis buffer containing 7 M urea, 2 M thiourea, 4% CHAPS, 50 mM DTT, 40 mM Tris, and 0.2% carrier ampholyte and incubated at room temperature for 1 h. The first dimensional separation was performed on an IPGphor IEF apparatus (Amersham Biosciences). IPG strips loaded with 100 µg of protein were electrofocused first at 200 V for 1 h, then at a linear gradient of 200–4,000 V for 6 h, and finally at 8,000 V to achieve a total of 60 kV-h. After IEF, the strips were equilibrated with an equilibration buffer containing 6 M urea, 30% glycerol, 2% SDS, 50 mM Tris-HCl (pH 6.8), and 1% DTT for 30 min. SDS-PAGE was then carried out on 12 or 16% polyacrylamide gels (20 x 20 x 0.1 cm). Proteins were visualized by staining with Coomassie Brilliant Blue R-250 (CBB) (Nacalai Tesque, Kyoto, Japan).
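The three-step focusing program can be checked with simple arithmetic. The following is a minimal sketch that estimates how long the final 8,000 V step must run to reach the stated total of 60 kV-h. It assumes an idealised program in which the set voltage is always reached and the ramp is perfectly linear; the function name is hypothetical and the result is not a value reported in this study.

```python
def final_step_hours(target_kvh: float = 60.0) -> float:
    """Estimate the duration of the final 8,000 V step (idealised program)."""
    # Step 1: constant 200 V (0.2 kV) for 1 h.
    step1_kvh = 0.2 * 1.0
    # Step 2: linear ramp from 200 V to 4,000 V over 6 h.
    # For a linear ramp, the integral equals mean voltage times duration.
    step2_kvh = (0.2 + 4.0) / 2 * 6.0
    # Step 3: whatever is left of the target is delivered at 8 kV.
    remaining_kvh = target_kvh - step1_kvh - step2_kvh
    return remaining_kvh / 8.0


if __name__ == "__main__":
    print(f"Final step: about {final_step_hours():.1f} h at 8,000 V")
```

In practice the instrument may limit current, so the voltage actually applied can be lower than the set point and the real run time longer.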

Radical-free and Highly Reducing (RFHR)-2D-PAGE—
The method of Wada (9) was mainly followed. Protein samples were dissolved in a lysis buffer containing 8 M urea and 0.2 M mercaptoethanol and incubated at 40 °C for 30 min. Sample charging electrophoresis was carried out with 100 µg of protein on an 8% polyacrylamide gel containing 8 M urea, 40 mM KOH, and 0.37% acetic acid at 100 V for 30 min on an NA1450 apparatus (Nihon Eido, Tokyo, Japan). Subsequently the first dimensional separation was performed on an 8% polyacrylamide gel containing 8 M urea, 400 mM Tris, 500 mM boric acid, and 21.5 mM EDTA-2Na at 100 V for 15 h on an NA1460 apparatus (Nihon Eido). The second dimensional separation was then carried out on an 18% polyacrylamide gel containing 8 M urea, 50 mM KOH, and 5% acetic acid (16 x 16 x 0.2 cm) at 100 V for 30 h. Proteins were visualized with CBB as described above.

Enzymatic Digestion for 2D-PAGE-MALDI-TOF MS—
In-gel digestion with modified trypsin (sequencing grade, Promega, Madison, WI) and sample spotting for MALDI-TOF MS were performed with the Investigator ProPrep automatic digestion and spotting system (Genomic Solutions, Huntingdon, UK) according to the manufacturer’s protocols with some modifications. The CBB-stained protein spots were excised from the gel and washed with 25 mM NH4HCO3 and acetonitrile at room temperature. The proteins were reduced with 10 mM DTT in 25 mM NH4HCO3 at 60 °C for 10 min and alkylated with 40 mM iodoacetamide in 25 mM NH4HCO3 at room temperature for 35 min. The dried gel pieces were rehydrated and incubated in 25 mM NH4HCO3 containing modified trypsin at 37 °C for 4 h. Formic acid (3%) was added to stop the enzymatic reaction, and the resultant peptides were concentrated, desalted by passing through a µ-C18 ZipTip (Millipore, Billerica, MA), mixed with a matrix solution of 50% acetonitrile saturated with α-cyano-4-hydroxycinnamic acid (Sigma), and air-dried on the target plate.

Mass Spectrometry of 2D-PAGE-MALDI-TOF MS—
The resulting peptide mixture was subjected to analysis on an Auto-Flex instrument (Bruker Daltonics, Bremen, Germany) with α-cyano-4-hydroxycinnamic acid as the matrix and operated in the reflector mode. Calibration was performed in the external mode using a peptide calibration standard kit (Bruker Daltonics). For peptide assignment the mass spectrum data were analyzed using the MASCOT database search program (Matrix Science Ltd., London, UK) in the peptide mass fingerprinting mode against the database of putative proteins of A. pernix K1 containing the data for 2,694 ORFs as well as against the translation of the entire genomic sequence in all phases.

1D-SDS-PAGE-LC-MS/MS Analysis
Protein Separation by 1D-SDS-PAGE—
Protein samples were dissolved in a lysis buffer containing 7 M urea, 2 M thiourea, 4% CHAPS, 50 mM DTT, and 40 mM Tris and incubated at room temperature for 1 h. Subsequently a 6x concentrated electrophoresis loading buffer containing 0.35 M Tris-HCl (pH 6.8), 10% SDS, 30% glycerol, and 9.3% DTT was added. 1D-SDS-PAGE was performed on 10% separating polyacrylamide gels with a Tris-Tricine running buffer containing 0.1 M Tris, 0.1 M Tricine, and 0.1% SDS.

Enzymatic Digestion for 1D-SDS-PAGE-LC-MS/MS—
After CBB staining, the gels were sliced into 5-mm-thick pieces from the top band (>116 kDa) to the bottom line (4.4 kDa). In-gel digestion with modified trypsin was performed using Investigator ProPrep for 6 h and stopped with 3% formic acid. The resulting peptide mixtures were eluted from the gel and dried by evaporation. The peptides were diluted with 0.02% formic acid containing 0.005% heptafluorobutyric acid (HFBA) and 2% acetonitrile.

Mass Spectrometry of 1D-SDS-PAGE-LC-MS/MS—
To analyze peptides, 2D-LC was combined with nano-ESI-MS/MS. The analysis was performed with a Finnigan LCQ DECA XP Plus ion trap mass spectrometer (ThermoElectron, San Jose, CA) coupled with an LC MAGIC 2002 system (Michrom Bioresources, Auburn, CA) through a nanoelectrospray ion source (AMR Inc., Tokyo, Japan) (10). The system was fitted with a strong cation exchange peptide trap column of 1.0-mm inner diameter and 8-mm length (Michrom Bioresources) for the first dimensional chromatography, a Peptide CapTrap (Michrom Bioresources) for desalting and concentration, a C18 reverse phase column (50-mm length and 0.2-mm inner diameter; Michrom Bioresources) for the second dimensional chromatography, and a Pico Tip (New Objective, Woburn, MA) as the electrosprayer. The solvents used for strong cation exchange were 0.02% formic acid containing 0.005% HFBA and 2% acetonitrile with either 0, 25, 50, 75, 100, 150, 250, or 500 mM HCOONH4 (used in eight steps). Desalting was performed with a mixture of 0.1% trifluoroacetic acid, 2% acetonitrile, and 98% water. For reverse phase chromatography, Buffer A (0.1% formic acid, 0.005% HFBA, 2% acetonitrile, and 98% water) and Buffer B (0.1% formic acid, 0.005% HFBA, 90% acetonitrile, and 10% water) were used to form a gradient of 5–65% of Buffer B in 20 min at a flow rate of 1 µl/min.

The mass spectrometer was operated in data-dependent MS/MS mode with dynamic exclusion at 450–2000 m/z ranges, and the ions were selected for CID with automatic data-dependent settings. The MS/MS spectra were converted into peak list files with SEQUEST™ Browser (ThermoElectron), and these files were searched with the MASCOT database search program in the MS/MS mode against the A. pernix K1 genomic data. The criteria adopted for protein identification were either 1) that at least three peptides with ion score 20 or higher match or 2) that at least one peptide with ion score 40 or higher matches.
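The two acceptance criteria can be written as a short filter. The sketch below is illustrative only: it assumes that each protein is represented by a list of MASCOT ion scores, one per distinct matched peptide, and the function name is hypothetical rather than part of MASCOT or of the pipeline used here.

```python
from typing import Iterable


def protein_identified(ion_scores: Iterable[float]) -> bool:
    """Apply the two identification criteria to one protein's peptide matches.

    ion_scores: one ion score per distinct matched peptide (assumption).
    """
    scores = list(ion_scores)
    # Criterion 1: at least three peptides with ion score 20 or higher.
    n_score_20 = sum(1 for s in scores if s >= 20)
    # Criterion 2: at least one peptide with ion score 40 or higher.
    n_score_40 = sum(1 for s in scores if s >= 40)
    return n_score_20 >= 3 or n_score_40 >= 1


if __name__ == "__main__":
    print(protein_identified([22, 31, 27]))  # three moderate matches
    print(protein_identified([18, 45]))      # one strong match
    print(protein_identified([25, 30]))      # neither criterion met
```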

MD-LC-MS/MS Analysis
Protein Separation by MD-LC—
Proteins of a whole cell lysate were separated by off-line 2D-LC and 2D-LC-nano-ESI-MS/MS. The first dimensional chromatography was performed with a self-packed strong anion exchange (SAX) column prepared in a glass chromatography tube of 8-mm inner diameter and 100-mm length. Trimethylaminopropyl-bonded silica gel (BONDESIL-SAX, 40 µm, Varian, Palo Alto, CA) was used to fill the column. Buffers used were 20 mM Tris-HCl, pH 7.0 (Buffer C), and 20 mM Tris-HCl, pH 7.0, with 1 M NaCl (Buffer D). Proteins were eluted with Buffer C from 0 to 5 min, with a linear gradient of Buffer C to Buffer D from 6 to 25 min, and with Buffer D from 26 to 32 min. The flow rate was 2 ml/min, and the eluate was collected into eight 8-ml fractions. The fractions were concentrated to 0.5 ml by partial lyophilization (EYELA FD-81, Tokyo Rikakikai, Tokyo, Japan). The second dimensional chromatography was performed with a gel permeation chromatography (GPC) column (Bioassist G2SWXL, TOSOH, Tokyo, Japan). 200 µl of each of the concentrated SAX fractions were successively injected into a column connected with a guard column (TOSOH) and two GPC columns. Elution was performed with Buffer E (0.1 M sodium phosphate, pH 7.0) at a flow rate of 0.5 ml/min, and fractions were collected every 3.5 min starting from 18 min until 60 min (12 fractions). In this way a total of 96 fractions (8 x 12) were obtained.
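The fraction counts follow directly from the run times, flow rate, and collection intervals given above. The sketch below reproduces that arithmetic, assuming that the entire 32-min SAX eluate was collected; the combined label format (for example S3-G1) is a hypothetical convention chosen here for illustration.

```python
# First dimension (SAX): 32-min run at 2 ml/min, collected in 8-ml fractions.
SAX_RUN_MIN = 32
SAX_FLOW_ML_PER_MIN = 2
SAX_FRACTION_ML = 8

# Second dimension (GPC): one fraction every 3.5 min from 18 to 60 min.
GPC_START_MIN = 18
GPC_END_MIN = 60
GPC_FRACTION_MIN = 3.5

sax_fractions = SAX_RUN_MIN * SAX_FLOW_ML_PER_MIN // SAX_FRACTION_ML
gpc_fractions = round((GPC_END_MIN - GPC_START_MIN) / GPC_FRACTION_MIN)


def fraction_labels() -> list[str]:
    """Label every SAX x GPC combination (label format is hypothetical)."""
    return [
        f"S{s}-G{g}"
        for s in range(1, sax_fractions + 1)
        for g in range(1, gpc_fractions + 1)
    ]


if __name__ == "__main__":
    print(sax_fractions, gpc_fractions, len(fraction_labels()))
```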

Enzymatic Digestion for MD-LC-MS/MS—
Samples in each fraction were reduced by incubating in 5 mM DTT at 60 °C for 30 min, alkylated with 15 mM iodoacetamide in the dark and at room temperature for 30 min, and digested with modified trypsin at 37 °C for 1 h. The samples were adjusted to pH 4 with trifluoroacetic acid, desalted with a C18 reverse phase column, and evaporated. The digests were dissolved in a mixture of 0.02% formic acid, 0.005% HFBA, 2% acetonitrile, and 98% water prior to MS analysis.

In-column Enzymatic Digestion—
Proteins retained on the SAX column were treated with 0.1% RapiGest (Waters, Milford, MA) in 5 mM DTT and incubated at 60 °C for 30 min. The proteins were alkylated and digested with modified trypsin at 37 °C for 15 h. The digests were eluted with a mixture of 0.1% trifluoroacetic acid, 5% methanol, and 94.9% water; desalted with a C18 reverse phase column; and evaporated. The digests were dissolved in 0.02% formic acid with 0.005% HFBA, 2% acetonitrile, and 98% water prior to MS analysis.

Amino-terminal Amino Acid Sequence Analysis
Protein spots on 2D-PAGE were electroblotted onto a PVDF membrane (Sequi-Blot PVDF membrane, Bio-Rad) with a semidry blotting apparatus (Bio Craft, Tokyo, Japan). The blotted membrane was stained with CBB. Singly stained spots were excised from the PVDF membrane and applied to a protein sequencer (model Procise 491cLC, Applied Biosystems, Foster City, CA) if the staining intensity appeared to be strong enough for sequencing. For weakly stained spots, two to six excised spots were combined by repeating 2D-PAGE and then applied to the protein sequencer. Edman reactions were performed according to the manufacturer’s instructions. To identify each protein, the amino acid sequences obtained were compared with the predicted amino acid sequence data translated from the genomic sequence of A. pernix K1.
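The comparison between an Edman-derived sequence and the translated genome can be illustrated as follows. This is a minimal sketch and not the procedure used in this study: it assumes that the DNA segment is the coding strand and already in frame, that ATG, GTG, and TTG are the candidate initiation codons, that an initiating GTG or TTG is decoded as Met, and that the initiator Met may have been removed from the mature protein. All names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered T, C, A, G at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
START_CODONS = {"ATG", "GTG", "TTG"}


def find_start_codon(dna: str, n_term: str):
    """Return (nucleotide offset, start codon) for the first start codon
    consistent with the sequenced amino terminus, or None if none fits."""
    if not n_term:
        return None
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    protein = "".join(CODON_TABLE.get(c, "X") for c in codons)
    for i, codon in enumerate(codons):
        if codon not in START_CODONS:
            continue
        # Case 1: initiator Met retained; the start codon itself reads as Met.
        if n_term[0] == "M" and protein[i + 1:i + len(n_term)] == n_term[1:]:
            return 3 * i, codon
        # Case 2: initiator Met removed; the peptide begins at the next codon.
        if protein[i + 1:i + 1 + len(n_term)] == n_term:
            return 3 * i, codon
    return None


if __name__ == "__main__":
    segment = "CCCTTGGCTAAAGAC"  # made-up sequence: Pro, TTG start, Ala-Lys-Asp
    print(find_start_codon(segment, "MAKD"))
    print(find_start_codon(segment, "AKD"))
```

A real analysis would also have to scan both strands and all reading frames, which this sketch deliberately leaves out.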

Miscellaneous
The genomic sequence data of A. pernix K1 along with the data for 2,694 annotated ORFs were downloaded from DOGAN (www.bio.nite.go.jp/dogan/Top). Additional A. pernix K1 data for 1,610 annotated ORFs were downloaded from the home page of Tianjin University BioInfomatics Centre (TUBIC) (tubic.tju.edu.cn/Aper/).


    RESULTS AND DISCUSSION
2D-PAGE Followed by TOF MS and Amino-terminal Analysis—
About 500 protein spots were well separated from others by 2D-PAGE with IEF in the first dimension (Fig. 1). Also about 70 basic protein spots, many of which were ribosomal proteins, were observed on RFHR gels. They were individually cut out and digested with trypsin as described under "Experimental Procedures," and the resultant peptide mixtures were analyzed by MALDI-TOF MS with 400–3000 m/z ranges. The mass spectra obtained were then examined by using the MASCOT database search program. In addition, the protein spots were transferred to a PVDF membrane and subjected to amino-terminal amino acid sequencing.


FIG. 1. A typical 2D-PAGE pattern. 100 µg of proteins were loaded and separated either by IEF or on a 180-mm-wide 3-10 non-linear immobilized pH gradient strip in the first dimension (horizontal) that was subsequently placed onto a polyacrylamide gel and electrophoresed vertically in the second dimension. The example shown indicates that the first dimensional separation was carried out by IEF with the pI range of 3–10, and the second dimensional separation was by 12% SDS-PAGE in the molecular mass range shown at right. The identified protein spots are indicated by red plus (+) signs.

 
Consequently a total of 300 proteins were identified (Supplemental Table 1). Of them, 187 (62.3%) corresponded to those annotated in other organisms, 80 (26.7%) corresponded to conserved hypothetical proteins, and 33 (11.0%) corresponded to non-conserved hypothetical proteins. However, six of the identified proteins matched the genomic regions of A. pernix K1 in which no ORF had been assigned previously. Of these 300 proteins, 134 proteins were found to have unblocked amino termini, and their 6–21 amino-terminal amino acid residues were successfully sequenced enabling determination of the corresponding genomic regions encoding them. These included the six proteins mentioned above that were derived from the regions without assigned ORFs (Table I). Of the 134 proteins, 73 (54.5%) corresponded to annotated, 45 (33.6%) corresponded to conserved hypothetical, and 16 (11.9%) corresponded to non-conserved hypothetical proteins.


TABLE I Proteins whose amino-terminal amino acid sequence was determined

Difference column shows the difference between the previously assigned start position (shown in DOGAN) and the newly identified start position. A blank indicates that no difference was detected, and N.D. indicates that the start position could not be assigned. — indicates no corresponding data because it is a newly identified ORF that was not predicted in DOGAN. ORFs newly identified were named by adding "a" to the upstream ORF names of Ref. 2.

 
1D-SDS-PAGE-LC-MS/MS Analysis—
In this method, proteins prepared from A. pernix K1 cells were first resolved by SDS-PAGE, and the gel was sliced and analyzed as described under "Experimental Procedures" (Fig. 2). A total of 630 proteins were identified accordingly (Supplemental Table 1). Of them, 357 (56.7%) corresponded to annotated, 161 (25.6%) corresponded to conserved hypothetical, and 112 (17.8%) corresponded to non-conserved hypothetical proteins. Also 14 proteins matched the genomic regions without previously assigned ORFs.


FIG. 2. A typical 1D-SDS electropherogram. 10 µg of proteins were loaded and separated on a 10% polyacrylamide gel, which was then sliced into 5-mm-thick pieces as schematically shown at right with their median molecular mass values calculated from the molecular markers. Subsequently proteins were extracted and analyzed.

 
MD-LC-MS/MS Analysis—
In this method, the whole cell lysate was applied to a SAX column for the first dimensional separation and then to a GPC column for the second dimensional separation. In the first dimension, proteins were separated according to their pI values into eight fractions, whereas in the second dimension, proteins in each of the SAX fractions were separated by their molecular mass into 12 fractions. As a consequence, a total of 96 fractions were obtained as exemplified in Fig. 3. Proteins in the resultant fractions were then digested with trypsin and analyzed by 2D-LC-MS/MS. In addition, proteins retained on the SAX column were treated similarly. A total of 404 proteins were thus identified (Supplemental Table 1), 235 (58.2%) of which corresponded to annotated, 94 (23.3%) of which corresponded to conserved hypothetical, and 75 (18.6%) of which corresponded to non-conserved hypothetical proteins. Ten proteins corresponded to the genomic regions without previously assigned ORFs.


FIG. 3. Separation of proteins by MD-LC. Proteins were separated into S1–S8 fractions in the first dimension SAX column chromatography (A). Proteins were subsequently separated by gel permeation chromatography in the second dimension. In B, a chromatogram of proteins of fraction S3 obtained in A is shown that were subjected to second dimensional separation into G1 through G12.

 
By combining the results obtained with the four methods mentioned above, a total of 704 proteins were successfully identified in A. pernix K1. The proteins identified by each method are listed in Supplemental Table 1. Of them, 382 (54.3%) were found to correspond to proteins annotated in other organisms, 188 (26.7%) were found to correspond to conserved hypothetical proteins, and 134 (19.0%) were found to correspond to non-conserved hypothetical proteins. Also 19 proteins were found to correspond to the genomic regions in which no ORFs were previously assigned. Of the 300, 630, and 404 proteins identified, respectively, by 2D-PAGE, 1D-SDS-PAGE-LC-MS/MS, and MD-LC-MS/MS, 204 proteins were common to all methods (Fig. 4). Conversely, proteins uniquely identified by each method were also recognized as shown. Typical images of 2D- and 1D-PAGE as well as multidimensional chromatograms along with a list of all proteins identified in these studies will be made available on the DOGAN web site (www.bio.nite.go.jp/dogan/Top).
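The category percentages and the kind of overlap counting summarized in Fig. 4 can be sketched as below. The category counts are those given above; the per-method ORF sets are made-up placeholders (the real lists are in Supplemental Table 1), so only the set operations, not the toy results, carry over to the actual data.

```python
def breakdown(counts: dict[str, int]) -> tuple[int, dict[str, float]]:
    """Return the total and the percentage share of each category."""
    total = sum(counts.values())
    return total, {k: round(100 * v / total, 1) for k, v in counts.items()}


# Category counts for the combined data set, as stated in the text.
combined = {
    "annotated": 382,
    "conserved hypothetical": 188,
    "non-conserved hypothetical": 134,
}
total, percentages = breakdown(combined)

# Hypothetical toy identifications per method (placeholder ORF names).
by_method = {
    "2D-PAGE": {"ORF_A", "ORF_B", "ORF_C"},
    "1D-SDS-PAGE-LC-MS/MS": {"ORF_A", "ORF_B", "ORF_D"},
    "MD-LC-MS/MS": {"ORF_A", "ORF_D", "ORF_E"},
}
all_identified = set().union(*by_method.values())       # non-redundant list
common_to_all = set.intersection(*by_method.values())   # centre of the Venn diagram

if __name__ == "__main__":
    print(total, percentages)
    print(len(all_identified), sorted(common_to_all))
```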


FIG. 4. The number of proteins identified by the three methods used. The number of proteins identified by each of the three methods, i.e. 2D-PAGE, 1D-SDS-PAGE-LC-MS/MS, and MD-LC-MS/MS, is schematically illustrated along with the number of proteins redundantly identified by two or by all three methods as indicated in the respective overlapping regions.

 
Several features of the identified proteins such as molecular mass, pI, hydropathy, protein class, and codon usage were then compared with those of the ORFs predicted in the genome of A. pernix K1. If the statistical distribution of these values is not similar between the observed and predicted proteins, then the annotation of the genomic data needs to be appropriately corrected.

Molecular Mass Distribution—
The molecular mass distribution of proteins predicted from the genomic sequence of A. pernix K1 and those experimentally identified by 2D-PAGE, 1D-SDS-PAGE-LC-MS/MS, and MD-LC-MS/MS was compared in groups of 10 kDa up to and higher than 120 kDa (Fig. 5A). Although 49.4% of the proteins were predicted in the molecular mass range of 10–20 kDa in the genome analysis, only 22.9% were actually observed in the same range. Consequently the number of predicted proteins in this molecular mass range in the current version of DOGAN appears to be overrepresented. The average molecular mass of the proteins identified by 2D-PAGE was 32.4 kDa, whereas it was 34.8 kDa by 1D-SDS-PAGE-LC-MS/MS or MD-LC-MS/MS. With the latter two methods, it is possible to identify proteins harboring a larger molecular mass value, whereas such is not the case with 2D-PAGE as the separation of larger proteins becomes poorer. In any event, it is obvious that about half of the proteins predicted from the genomic data in the 10–20-kDa range appear to be incorrectly assigned.
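The grouping into 10-kDa classes can be sketched as follows. This is an illustration under stated assumptions: masses are given in kDa, everything at or above 120 kDa falls into a single top class, and the example masses are invented rather than taken from the data set.

```python
from collections import Counter


def mass_bin(mass_kda: float, width: int = 10, top: int = 120) -> str:
    """Assign a molecular mass to a 10-kDa class, with one open top class."""
    if mass_kda >= top:
        return f">={top}"
    lower = int(mass_kda // width) * width
    return f"{lower}-{lower + width}"


def bin_fractions(masses_kda: list[float]) -> dict[str, float]:
    """Fraction of proteins falling into each mass class."""
    counts = Counter(mass_bin(m) for m in masses_kda)
    n = len(masses_kda)
    return {label: c / n for label, c in counts.items()}


if __name__ == "__main__":
    example_masses = [12.0, 18.0, 35.0, 130.0]  # invented values
    print(bin_fractions(example_masses))
```

Computing these fractions separately for the predicted ORFs and for the experimentally identified proteins gives the two distributions that are compared in Fig. 5A.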


FIG. 5. Distribution of the four parameters of proteins identified by the three methods as well as of those predicted from the genomic sequence. A, distribution of the proteins according to their molecular mass values was calculated and illustrated. DOGAN and TUBIC indicate, respectively, the proteins annotated and released to the public through DOGAN and those taken from the TUBIC home page (tubic.tju.edu.cn/Aper/). B, distribution of the proteins according to their pI values. C, distribution of the hydropathy indices expressed as the GRAVY score of proteins. The GRAVY score was calculated as the sum of the hydropathy indices of the individual amino acid residues divided by the number of residues. Hydrophilic proteins distribute to the left, and hydrophobic proteins distribute to the right. D, distribution of proteins according to their protein class.

 
This could be due in part to incorrect prediction of ORFs in the genome analysis of A. pernix K1 because of the poorer quality computer software used at that time to assign ORFs. The percentages of ORFs identified by our proteome analysis were 35, 47, 51, 57, and 55%, respectively, in the molecular mass ranges of 20–40, 40–60, 60–80, 80–100, and >100 kDa. Therefore, proteins of larger sizes could in general be more frequently identified in our analysis. Of the proteins of 80 kDa or higher, 14% of those identified by our analysis were predicted to possess a transmembrane domain, whereas the value was much higher for those that could not have been identified in our analysis in which case 61% were predicted to possess a transmembrane domain. Likewise of the proteins smaller than 80 kDa, 9% of identified and 32% of unidentified were predicted to possess a transmembrane domain. From these data, it appears that many of the proteins not identified in our proteome analysis, in particular those with higher molecular mass, are likely to be membrane proteins.

The largest five of the protein-coding ORFs of A. pernix K1 are APE0620, APE0609, APE0057, APE1340, and APE1213. Of them, the products of APE1340 and APE0609 were identified in our proteome analysis. The former has homology to the reverse gyrase of Pyrococcus furiosus that was recently experimentally proven to be necessary for the growth of this organism at high temperature (11). The APE0609 protein is similar to a surface layer protein of Staphylothermus marinus. The surface layer protein of S. marinus forms a complex with a protease that is likely to play a role in taking up external peptides and proteins. A protein similar to the protease of this complex is encoded by APE0607 that was identified in our analysis. This protein is likely to have a similar function in A. pernix.

The remaining proteins were not identified in our analysis most likely because they are membrane proteins as they appear to possess a transmembrane domain. APE1213 is a paralogue of APE0607, but its expression might be different from the latter. The function of APE0620 and APE0057 remains to be investigated.

Isoelectric Point Distribution—
The pI values of the identified and predicted proteins were compared with each other in the pI range from 3 to 13 (Fig. 5B). The average pI values of identified proteins were 7.25 (2D-PAGE), 7.55 (1D-SDS-PAGE-LC-MS/MS), and 7.70 (MD-LC-MS/MS), whereas the value for the predicted proteins was calculated to be 8.68. In the pI range between 5 and 7, the proteins predicted from the genomic data are much fewer than those identified by proteome analysis, whereas proteins in the high pI range (>10) show an opposite distribution pattern. Also proteins identified by 2D-PAGE were much more likely to be distributed in the pI range of 5–7 than those identified in other methods, although the reason for this is not clear.

Hydropathy Distribution—
The GRAVY score indicates the hydrophilicity or hydrophobicity of a protein (12); it is calculated as the arithmetic mean of the hydropathy indices of all amino acid residues of a protein. About 70% of the predicted proteins were concentrated in the neutral range (–0.4 to 0.4). On the other hand, 85% of experimentally identified proteins were found to be distributed in the same range regardless of the identification methods used. The results indicated, therefore, that a large portion of proteins of A. pernix are in the neutral GRAVY score range (Fig. 5C). The averages of the GRAVY score for identified proteins were –0.15 (2D-PAGE), –0.13 (1D-SDS-PAGE-LC-MS/MS), and –0.16 (MD-LC-MS/MS), whereas that of the predicted proteins was –0.02.
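The GRAVY calculation can be sketched in a few lines. The example assumes the Kyte-Doolittle hydropathy scale, on which the GRAVY score is conventionally defined; whether exactly this scale and implementation were used here is an assumption, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy indices (assumed scale), one-letter amino acid codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: sum of residue indices / number of residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def is_neutral(sequence: str, low: float = -0.4, high: float = 0.4) -> bool:
    """True if the GRAVY score lies in the neutral range used in the text."""
    return low <= gravy(sequence) <= high


if __name__ == "__main__":
    print(gravy("IR"), is_neutral("IR"))   # hydrophobic + charged residue cancel
    print(gravy("AAAA"), is_neutral("AAAA"))
```

Negative scores indicate hydrophilic proteins and positive scores hydrophobic ones, which matches the left-to-right ordering described for Fig. 5C.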

Protein Class Distribution—
Proteins predicted from the genome of A. pernix K1 as well as those identified by 2D-PAGE, 1D-SDS-PAGE-LC-MS/MS, and MD-LC-MS/MS were grouped into six functional classes (Fig. 5D) and compared. More experimentally identified proteins were found to be categorized in "metabolism" and "genetic information processing" than those predicted from the genome analysis, whereas a distinctly large proportion of predicted proteins were categorized in "non-conserved hypothetical proteins." With respect to the distribution pattern of proteins in the six protein classes, differences among the methods used were marginal.

Codon Usage Pattern—
It is known that a characteristic bias in codon usage exists in each species (13). To examine whether and to what extent codon usage differs between the experimentally identified ORFs and the ORFs predicted from the A. pernix K1 genomic sequence, codon usages in individual ORFs were plotted against the categories of proteins described above, namely molecular mass, pI, hydropathy, and protein class.

An example is shown in Fig. 6: in A, the codon usage patterns of proteins categorized by their molecular mass are shown, and in B, similar patterns of proteins categorized by their protein class are shown. As described above, a large proportion of predicted proteins were classified in the molecular mass range of 10–20 kDa. Indeed many of them were found to deviate from the average use of TCC, whereas predicted proteins larger than 40 kDa appear to match well with those of experimentally confirmed proteins. Therefore, it seems that the usage patterns of various codons will serve as good tools to evaluate whether a particular ORF predicted from the genomic sequence is likely to be a true gene or not. Indeed this is one of the bases on which algorithms for the prediction of genes/ORFs in the genomic sequence data rely. A similar analysis was performed with respect to protein classes as shown in Fig. 6B. The patterns of experimentally identified versus predicted ORFs were found to be quite different when proteins categorized as "non-conserved hypothetical proteins" were analyzed. Interestingly such a clear difference shown in Fig. 6, A and B, was not observed when a similar analysis was performed with proteins categorized by their pI and hydropathy values.
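
The per-ORF codon frequency underlying a plot such as Fig. 6 could be computed roughly as follows. This is a sketch only: the exact normalization used for Fig. 6 is not stated in the text, so the code assumes that the frequency of a codon is its count divided by the total number of in-frame codons in the ORF. The names `codon_counts` and `codon_frequency` are hypothetical.

```python
from collections import Counter


def codon_counts(orf: str) -> Counter:
    """Count in-frame codons of an ORF given as a DNA string (5'->3')."""
    seq = orf.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def codon_frequency(orf: str, codon: str) -> float:
    """Fraction of the ORF's codons equal to `codon` (e.g. "TCC").

    Assumption: normalized by all codons in the ORF, not by synonymous codons.
    """
    counts = codon_counts(orf)
    total = sum(counts.values())
    return counts[codon.upper()] / total if total else 0.0
```

Applied to every predicted and every experimentally identified ORF, values of `codon_frequency(orf, "TCC")` could then be binned by molecular mass or protein class to compare the two sets.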


FIG. 6. Codon usage patterns of the ORFs encoding proteins of classified categories. A, frequency distribution of the codon TCC in individual ORFs encoding proteins of different categories was calculated as shown. The distribution pattern in ORFs predicted from the genomic data (DOGAN) is expressed in red, whereas that in ORFs experimentally identified is expressed in blue. B, frequency distribution of another codon, CCT, was similarly calculated as shown.

 
Complementarity of the Methods Used—
For the proteome analysis of A. pernix K1, we adopted the high resolving power of 2D-PAGE (14) including RFHR-2D-PAGE (9) and combined it with MALDI-TOF MS for "peptide mass fingerprinting." However, 2D-PAGE-MALDI-TOF MS has limitations in the detection of less abundant or hydrophobic proteins as well as proteins of extremely large sizes. Introduction of improved chaotropes and development of novel zwitterionic detergents (15) were found to improve the situation to some extent.

An alternative approach was to omit the first dimensional separation and apply an enriched membrane protein fraction directly to 1D-SDS-PAGE-LC-MS/MS. Most of the bands on the 1D-SDS-PAGE gel consisted of multiple proteins, but HPLC coupled with ESI tandem mass spectrometry is powerful enough to analyze a mixture of derived peptides, so conventional tryptic digestion of the proteins followed by mass spectrometric analysis led to the identification of each protein in the mixture. This method is called the "shotgun method" (16), and it gives a considerable advantage in the characterization of membrane proteins: because separation is targeted at the peptides rather than the proteins, the solubility problems often encountered with hydrophobic proteins can largely be alleviated. A disadvantage of the shotgun method is that intensive computational analysis of the entire data set is always required and that no information regarding the charge of the intact protein can be obtained.

MD-LC-MS/MS is a third alternative method we adopted in which proteins were separated by MD-LC (10, 17, 18), digested with a specific enzyme, and ionized with ESI, and then their mass spectra were measured. In regular MD-LC systems proteins and peptides are separated according to a variety of their properties, such as pI, relative molecular mass, and hydrophobicity (17). However, a disadvantage of the MD-LC-MS/MS method is that not all of the peptide fragments could be detected, and their quantity is low. Also the sensitivity of detection will progressively decrease as the number of fractions increases.

Comparison with the Results Obtained by Other Researchers—
Guo et al. (5) examined the genomic data of A. pernix K1 and reported that 1,610 ORFs can be recognized as such (tubic.tju.edu.cn/Aper/). Therefore, their data were compared with the proteins experimentally identified. Of the 704 identified proteins, 692 were included as ORFs predicted by Guo et al. (5), but the remaining 12 were not. The molecular mass distribution of the proteins derived from TUBIC ORFs is very similar to the identified proteins as shown in Fig. 5A. However, their other characteristics slightly but significantly deviate from those of the proteins we experimentally characterized (Fig. 5, B–D).

Assignment of the Codons for Translation Initiation—
Of the 134 proteins whose amino-terminal sequences were experimentally determined, 50 were found to possess Met at their amino terminus. Comparison of the nucleotide sequences corresponding to the amino-terminal Met showed that seven of them possessed ATG and 14 others contained GTG. In addition, to our surprise, 29 others were found to contain TTG at the position of the amino-terminal Met. Subsequently we looked for candidate initiation codons based on the amino-terminal amino acid sequence data of the remaining 84 proteins that did not possess Met at their amino terminus. For 80 of them, a putative initiation codon was found immediately upstream: 39 had TTG, 29 had ATG, and 12 had GTG. Because A. pernix K1 possesses an ORF encoding a protein homologous to methionine aminopeptidase, we interpreted the results to indicate that the amino-terminal Met of these 80 proteins was removed post-translationally by the putative methionine aminopeptidase.
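
The assignment logic described above can be sketched as follows, assuming that the coding-strand sequence and the position of the codon for the first observed residue are already known. The function name `assign_start_codon` and its arguments are hypothetical and do not come from the original analysis.

```python
START_CODONS = ("ATG", "GTG", "TTG")


def assign_start_codon(genome: str, first_codon_pos: int, first_residue: str):
    """Return (codon, position) of the putative initiation codon, or None.

    genome          -- coding-strand DNA, 5'->3' (hypothetical input)
    first_codon_pos -- 0-based index of the codon encoding the first residue
                       observed by amino-terminal sequencing
    first_residue   -- one-letter code of that residue
    """
    seq = genome.upper()
    if first_residue.upper() == "M":
        pos = first_codon_pos      # Met retained: check the codon at the terminus
    else:
        pos = first_codon_pos - 3  # Met removed: check the codon immediately upstream
    if pos < 0:
        return None
    codon = seq[pos:pos + 3]
    return (codon, pos) if codon in START_CODONS else None
```

A return value of `None` corresponds to the cases in which no candidate initiation codon is found immediately upstream, for example when that codon is ATA.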

With the remaining four proteins, however, candidate initiation codons were not found in the immediate upstream. Of these, APE0079 has homology to S-adenosylmethionine decarboxylase proenzyme 2 that is known to be post-translationally processed into an α and a β chain. Similarly APE0521 has homology to a protease subunit of the proteasome of Methanococcus jannaschii, the amino-terminal region of which is likely to be processed. APE2072 has homology to the thermosome β subunit of Thermoplasma acidophilum. By a "shotgun" mass spectrometry analysis, a peptide containing the detected amino terminus of APE2072 as well as 13 others matching the upstream region were detected (data not shown). Therefore, the detected amino terminus of APE2072 is likely to be that of a processed protein, although the nature of the processing remains to be clarified further. The genomic nucleotide sequence present in the immediate upstream of APE2493 is ATA. However, because ATA is not likely to serve as a translational initiator, it may be that the amino-terminal amino acid sequence corresponding to APE2493 was similarly processed, although the processing has not been elucidated yet.

To summarize the data for translational initiation codons corresponding to the 130 sequences other than the four proteins mentioned above, TTG was found to be most frequent (52%), whereas ATG and GTG, respectively, were found in 28 and 20% of the cases. Of the 130 ORFs, six proteins were found to be derived from the region in which no ORFs were previously assigned. Of the remaining 124 ORFs, the initiation codons of 89 (72%) were different from the positions that were assigned previously (2).
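
The percentages quoted above follow directly from the counts reported for the two groups of proteins (those retaining the amino-terminal Met and those from which it had been removed). A short worked check in Python, with variable names chosen here for illustration:

```python
# Counts reported in the text for the 130 proteins with an assignable start codon.
with_met = {"ATG": 7, "GTG": 14, "TTG": 29}      # Met present at the amino terminus (50)
met_removed = {"ATG": 29, "GTG": 12, "TTG": 39}  # Met removed post-translationally (80)

totals = {codon: with_met[codon] + met_removed[codon] for codon in with_met}
n_proteins = sum(totals.values())
percent = {codon: round(100 * count / n_proteins) for codon, count in totals.items()}

# Share of the 124 previously assigned ORFs whose start position changed.
reassigned_percent = round(100 * 89 / 124)
```

Here `totals` comes to 68 TTG, 36 ATG, and 26 GTG out of 130, which rounds to the 52, 28, and 20% given in the text, and 89 of 124 rounds to 72%.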

Characteristics of the Region Upstream of the Putative Initiation Codons—
The mechanism of transcriptional initiation in Archaea has been speculated to be more closely related to that of eukaryotes (19, 20). However, three groups of transcription-associated proteins have been identified in Archaea: one group more similar to prokaryotes, another group more similar to eukaryotes, and a third group more similar to both prokaryotes and eukaryotes. Several homologues of bacterial transcriptional factors (21–24) have been identified in Archaea, and Tolstrup et al. (25) have shown that the translation process of internal genes of operons in Archaea was similar to that in bacteria.

To characterize the genomic regions likely to function in translational initiation in A. pernix K1, the nucleotide frequency of the sequences surrounding the ORFs for the 130 proteins mentioned above was analyzed according to the method of Xiu-Feng et al. (26). The ORFs were categorized into two groups: in Group 1 the ORFs in question are not well separated from their immediate upstream neighbor, whereas in Group 2 they are more than 50 bp away from it. In the region preceding the 130 ORFs, there is a G box at the position –10 upstream of the initiation codon with a typical sequence of GGTG regardless of the ORF category, whereas ORFs of Group 1 harbor in addition an AT box at the –42 position upstream of the initiation codon and a weak C box at the –35 position (Fig. 7, A and B).
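
A simple position-by-position nucleotide frequency profile of the kind plotted in Fig. 7 might be computed as sketched below. This is a plain frequency count rather than the clustering method of Xiu-Feng et al. (26), and it assumes that each upstream region has the same length and ends immediately before the initiation codon. The names `orf_group` and `upstream_frequencies`, and the treatment of a gap of exactly 50 bp as Group 1, are assumptions made for illustration.

```python
def orf_group(distance_to_upstream_orf_bp: int) -> int:
    """Group 2 if the ORF is more than 50 bp from its upstream neighbor, else Group 1."""
    return 2 if distance_to_upstream_orf_bp > 50 else 1


def upstream_frequencies(upstream_seqs):
    """Per-position nucleotide frequencies for equal-length upstream regions.

    The last base of each sequence is taken as position -1 relative to the
    initiation codon. Returns {position: {"A": f, "C": f, "G": f, "T": f}}.
    """
    seqs = [s.upper() for s in upstream_seqs]
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all upstream regions must have the same length")
    profile = {}
    for i in range(length):
        column = [s[i] for s in seqs]
        position = i - length  # runs from -length to -1
        profile[position] = {base: column.count(base) / len(seqs) for base in "ACGT"}
    return profile
```

A G box would then show up as a run of positions around –10 at which the `"G"` frequency is well above the genomic background.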


FIG. 7. Frequency of occurrence of each nucleotide in the region preceding the initiation codon. A, frequency of each of the four nucleotides occurring in Group 1 ORFs whose protein products were identified by amino-terminal amino acid analysis was calculated as shown. B, the value in Group 2 ORFs identified by amino-terminal amino acid analysis was similarly calculated.

 
Mechanisms of Translational Initiation—
In view of the finding mentioned above, all the identified ORFs of A. pernix were reassigned by taking the presence of three initiation codons and of a G box into consideration. Consequently TTG was found to be the most predominant initiation codon (38% of all ORFs) followed by ATG (33%) and GTG (29%). After the reassignment, the frequency of occurrence of each nucleotide in the upstream region was plotted. There is a distinct G box at the region surrounding the –10 position upstream of the initiation codon harboring GGTG as a typical sequence as in the case of the 130 genes for the experimentally characterized proteins mentioned above. In addition, Group 1 ORFs possess a weak AT box at the –42 position and a weak C box at the –35 position, whereas the weak C boxes were not so clear in Group 2 ORFs (data not shown).

For the experimental identification of translational initiation codons, a large scale amino-terminal sequencing of Synechocystis sp. strain PCC6803 was performed by Sazuka and Ohara (27, 28) in which amino-terminal sequences of 234 protein spots were analyzed. The initiation codons in Synechocystis sp. were thus identified, suggesting that ATG was most predominant (88%) followed by GTG (7%) and TTG (3%). A similar set of data for Aquifex aeolicus, Archaeoglobus fulgidus, Bacillus subtilis, Borrelia burgdorferi, Chlamydia trachomatis, Escherichia coli, Haemophilus influenzae, Helicobacter pylori, M. jannaschii, Methanobacterium thermoautotrophicum, Mycobacterium tuberculosis, Mycoplasma genitalium, Mycoplasma pneumoniae, Rickettsia prowazekii, Synechocystis sp., and Treponema pallidum was summarized by Rocha et al. (29) based on the genomic sequences of individual organisms as shown in Table II along with the data for Aeropyrum pernix K1, Corynebacterium efficiens, Pyrococcus horikoshii OT3, Staphylococcus aureus N315, Streptomyces avermitilis, and Sulfolobus tokodaii that were taken from DOGAN.


TABLE II The percentage of start codons and G + C content in various species

 
It has been reported that TTG is the most plausible initiation codon for many mitochondrial protein genes in two nematodes, Ascaris suum and Caenorhabditis elegans (30). In addition, ACG was reported as an initiation codon in two eukaryotic viral genes (31, 32). In E. coli, initiation at ATG was more efficient than at GTG or TTG (33). It has been generally believed that ATG is the most predominant and efficient translational initiation codon in many other organisms as well. The results shown here, however, are different, and TTG is most predominant in A. pernix K1, although it is not clear whether it is the most efficient initiation codon in A. pernix K1 or not. Archaea are known to possess a eukaryote-like positive regulator in the transcription apparatus that consists of a cognate bacterial-type regulator facilitating recruitment of the TATA-binding protein for transcriptional activation (34). Furthermore two types of translational initiation mechanisms have been reported in Archaea, namely leadered and leaderless translation. The former has been shown to occur in internal genes (25), which possess a G-rich region in their 5' flanking region that is likely to play a role in ribosomal binding, whereas the latter involves scanning for the first initiation codon along the transcripts (35). N-Formylmethionyl tRNA, deformylase, and methionyl aminopeptidase have been proven to play roles in translational initiation in E. coli (36–38), in particular in combination with ATG. A. pernix K1 possesses no homologues for N-formylmethionyl tRNA and deformylase, and the Met-tRNA genes are present in triplicate, although there is no sequence similarity between them, and two of them have an intron (2). These might possibly be related, at least to some extent, to the less frequent translational initiation at ATG in A. pernix K1.


    ACKNOWLEDGMENTS
 
We thank T. Sekigawa of NITE for valuable assistance in the computational analysis of the genomic data of A. pernix K1.


   FOOTNOTES
 
Received, September 22, 2005

Published, January 17, 2006

Published, MCP Papers in Press, February 2, 2006, DOI 10.1074/mcp.M500312-MCP200

1 The abbreviations used are: DOGAN, Database of the Genomes Analyzed at NITE; NITE, National Institute of Technology and Evaluation; 2D, two-dimensional; 1D, one-dimensional; MD, multidimensional; SAX, strong anion exchange; GPC, gel permeation chromatography; HFBA, heptafluorobutyric acid; RFHR, radical-free and highly reducing; CBB, Coomassie Brilliant Blue R-250; Tricine, N-[2-hydroxy-1,1-bis(hydroxymethyl)ethyl]glycine; TUBIC, Tianjin University BioInformatics Centre.

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

S The on-line version of this article (available at http://www.mcponline.org) contains supplemental material.

‡ To whom correspondence should be addressed. Tel.: 81-3-3481-1936; Fax: 81-3-3481-8951; E-mail: yamazaki-shuji@nite.go.jp


    REFERENCES
 

  1. Sako, Y., Nomura, N., Uchida, A., Ishida, Y., Morii, H., Koga, Y., Hoaki, T., and Maruyama, T. (1996) Aeropyrum pernix gen. nov., sp. nov., a novel aerobic hyperthermophilic archaeon growing at temperatures up to 100°C. Int. J. Syst. Bacteriol. 46, 1070–1077

  2. Kawarabayasi, Y., Hino, Y., Horikawa, H., Yamazaki, S., Haikawa, Y., Jin-no, K., Takahashi, M., Sekine, M., Baba, S., Ankai, A., Kosugi, H., Hosoyama, A., Fukui, S., Nagai, Y., Nishijima, K., Nakazawa, H., Takamiya, M., Masuda, S., Funahashi, T., Tanaka, T., Kudoh, Y., Yamazaki, J., Kushida, N., Oguchi, A., Aoki, K., Kubota, K., Nakamura, Y., Nomura, N., Sako, Y., and Kikuchi, H. (1999) Complete genome sequence of an aerobic hyper-thermophilic crenarchaeon, Aeropyrum pernix K1. DNA Res. 6, 83–101, 145–152

  3. Natale, D. A., Shankavaram, U. T., Galperin, M. Y., Wolf, Y. I., Aravind, L., and Koonin, E. V. (2000) Towards understanding the first genome sequence of crenarchaeon by genome annotation using clusters of orthologous group of proteins (COGs). Genome Biol. 1, RESEARCH0009

  4. Pruitt, K. D., Tatusova, T., and Maglott, D. R. (2003) NCBI Reference Sequence project: update and current status. Nucleic Acids Res. 31, 34–37

  5. Guo, F., Wang, J., and Zhang, C. (2004) Gene recognition based on nucleotide distribution of ORFs in a hyper-thermophilic crenarchaeon, Aeropyrum pernix K1. DNA Res. 11, 361–370

  6. Blattner, F. R., Plunkett, G., III, Bloch, C. A., Perna, N. T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J. D., Rode, C. K., Mayhew, G. F., Gregor, J., Davis, N. W., Kirkpatrick, H. A., Goeden, M. A., Rose, D. J., Mau, B., and Shao, Y. (1997) The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1462

  7. Kunst, F., Ogasawara, N., Moszer, I., Albertini, A. M., Alloni, G., Azevedo, V., Bertero, M. G., Bessieres, P., Bolotin, A., Borchert, S., Borriss, R., Boursier, L., Brans, A., Braun, M., Brignell, S. C., Bron, S., Brouillet, S., Bruschi, C. V., Caldwell, B., Capuano, V., Carter, N. M., Choi, S. K., Codani, J. J., Connerton, I. F., et al. (1997) The complete genome sequence of the gram positive bacterium Bacillus subtilis. Nature 390, 249–256

  8. Helianti, I., Morita, Y., Yamamura, A., Murakami, Y., Yokoyama, K., and Tamiya, E. (2001) Characterization of native glutamate dehydrogenase from an aerobic hyperthermophilic archaeon Aeropyrum pernix K1. Appl. Microbiol. Biotechnol. 56, 388–394

  9. Wada, A. (1986) Analysis of Escherichia coli ribosomal proteins by an improved two-dimensional gel electrophoresis. I. Detection of four new proteins. J. Biochem. 100, 1583–1594

  10. Fujii, K., Nakano, T., Hike, H., Usui, F., Bando, Y., Tojo, H., and Nishimura, T. (2004) Fully automated online multi-dimensional protein profiling system for complex mixtures. J. Chromatogr. A 1057, 107–113

  11. Atomi, H., Matsumi, R., and Imanaka, T. (2004) Reverse gyrase is not a prerequisite for hyperthermophilic life. J. Bacteriol. 186, 4829–4833

  12. Kyte, J., and Doolittle, R. F. (1982) A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105–132

  13. Grocock, R. J., and Sharp, P. M. (2002) Synonymous codon usage in Pseudomonas aeruginosa PA01. Gene (Amst.) 289, 131–139

  14. O’Farrell, P. H. (1975) High resolution two-dimensional electrophoresis of proteins. J. Biol. Chem. 250, 4007–4021

  15. Fountoulakis, M., and Takacs, B. (2001) Effect of strong detergents and chaotropes on the detection of proteins in two-dimensional gels. Electrophoresis 22, 1593–1602

  16. Simpson, R. J., Connolly, L. M., Eddes, J. S., Pereira, J. J., Moritz, R. L., and Reid, G. E. (2000) Proteomic analysis of the human colon carcinoma cell line (LIM 1215): development of a membrane protein database. Electrophoresis 21, 1707–1732

  17. Wang, H., and Hanash, S. (2003) Multi-dimensional liquid phase based separation in proteomics. J. Chromatogr. B 787, 11–18

  18. Kawakami, T., Anyoji, H., and Nishimura, T. (2002) Development of proteome analysis systems for efficient expression profiling: towards high throughput identification and quantitation of disease target proteins. J. Mass Spectrom. Soc. Jpn. 3, 135–141

  19. Soppa, J. (1999) Transcription initiation in archaea: facts, factors and future aspects. Mol. Microbiol. 31, 1295–1305

  20. Soppa, J. (1999) Normalized nucleotide frequencies allow the definition of archaeal promoter elements for different archaeal groups and reveal base-specific TFB contacts upstream of the TATA box. Mol. Microbiol. 31, 1589–1592

  21. Kyrpides, N. C., and Ouzounis, C. A. (1995) The eubacterial transcriptional activator Lrp is present in the archaeon Pyrococcus furiosus. Trends Biochem. Sci. 20, 140–141

  22. Vierke, G., Engelmann, A., Hebbeln, C., and Thomm, M. (2003) A novel archaeal transcriptional regulator of heat shock response. J. Biol. Chem. 278, 18–26

  23. Lee, S. J., Engelmann, A., Horlacher, R., Qu, Q., Vierke, G., Hebbeln, C., Thomm, M., and Boos, W. (2003) TrmB, a sugar-specific transcriptional regulator of the trehalose/maltose ABC transporter from the hyperthermophilic archaeon Thermococcus litoralis. J. Biol. Chem. 278, 983–990

  24. Bell, S. D., Cairns, S. S., Robson, R. L., and Jackson, S. P. (1999) Transcriptional regulation of an archaeal operon in vivo and in vitro. Mol. Cell 4, 971–982

  25. Tolstrup, N., Sensen, C. W., Garrett, R. A., and Clausen, I. G. (2000) Two different and highly organized mechanisms of translation initiation in the archaeon Sulfolobus solfataricus. Extremophiles 4, 175–179

  26. Xiu-Feng, W., Susan, M. B., and John, A. B. (2004) Revealing gene transcription and translation initiation patterns in archaea, using interactive clustering model. Extremophiles 8, 291–299

  27. Sazuka, T., and Ohara, O. (1996) Sequence features surrounding the translation initiation sites assigned on the genome sequence of Synechocystis sp. strain PCC6803 by amino-terminal protein sequencing. DNA Res. 3, 225–232

  28. Sazuka, T., and Ohara, O. (1997) Towards a proteome project of cyanobacterium Synechocystis sp. strain PCC6803: linking 130 protein spots with their respective genes. Electrophoresis 18, 1252–1258

  29. Rocha, E., Danchin, A., and Viari, A. (1999) Translation in Bacillus subtilis: roles and trends of initiation and termination, insights from a genome analysis. Nucleic Acids Res. 27, 3567–3576

  30. Okimoto, R., Macfarlane, J. L., and Wolstenholme, D. R. (1990) Evidence for the frequent use of TTG as the translation initiation codon of mitochondrial protein genes in the nematodes, Ascaris suum and Caenorhabditis elegans. Nucleic Acids Res. 18, 6113–6118

  31. Becerra, S. P., Koczot, F., Fabisch, P., and Rose, J. A. (1988) Synthesis of adeno-associated virus structural proteins requires both alternative mRNA splicing and alternative initiations from a single transcript. J. Virol. 62, 2745–2754

  32. Curran, J., and Kolakofsky, D. (1988) Ribosomal initiation from an ACG codon in the Sendai virus P/C mRNA. EMBO J. 7, 245–251

  33. Ringquist, S., Shinedling, S., Barrick, D., Green, L., Binkley, J., Stormo, G. D., and Gold, L. (1992) Translation initiation in Escherichia coli: sequences within the ribosome-binding site. Mol. Microbiol. 6, 1219–1229

  34. Ouhammouch, M., Dewhurst, R. E., Hausner, W., Thomm, M., and Geiduschek, E. P. (2003) Activation of archaeal transcription by recruitment of the TATA-binding protein. Proc. Natl. Acad. Sci. U. S. A. 100, 5097–5102

  35. Kozak, M. (1999) Initiation of translation in prokaryotes and eukaryotes. Gene (Amst.) 234, 187–208

  36. Schulman, L. H., and Her, M. O. (1973) Recognition of altered E. coli formylmethionine transfer RNA by bacterial T factor. Biochem. Biophys. Res. Commun. 51, 275–282

  37. Mazel, D., Pochet, S., and Marliere, P. (1994) Genetic characterization of polypeptide deformylase, a distinctive enzyme of eubacterial translation. EMBO J. 13, 914–923

  38. Ben-Bassat, A., Bauer, K., Chang, S. Y., Myambo, K., Boosman, A., and Chang, S. (1987) Processing of the initiation methionine from proteins: properties of the Escherichia coli methionine aminopeptidase and its gene structure. J. Bacteriol. 169, 751–757





