Molecular & Cellular Proteomics

Research

Simultaneous Qualitative and Quantitative Analysis of the Escherichia coli Proteome

A Sweet Tale

Jeffrey C. Silva, Richard Denny, Craig Dorschel, Marc V. Gorenstein, Guo-Zhong Li, Keith Richardson, Daniel Wall and Scott J. Geromanos
Molecular & Cellular Proteomics April 1, 2006, First published on January 5, 2006, 5 (4) 589-607; https://doi.org/10.1074/mcp.M500321-MCP200

Abstract

We describe a novel LCMS approach to the relative quantitation and simultaneous identification of proteins within the complex milieu of unfractionated Escherichia coli. This label-free, LCMS acquisition method observes all detectable, eluting peptides and their corresponding fragment ions. Postacquisition data analysis methods extract both the chromatographic and the mass spectrometric information on the tryptic peptides to provide time-resolved, accurate mass measurements, which are subsequently used for quantitation and identification of constituent proteins. The response of E. coli to carbon source variation is well understood, and it is thus commonly used as a model biological system when validating an analytical method. Using this LCMS approach, we characterized proteins isolated from E. coli grown in glucose, lactose, and acetate. The change in relative abundance of the corresponding proteins was measured from peptides common to both conditions. Protein identities were also determined for those peptides that were unique to each condition, and these identities were found to be consistent with the underlying biochemical restrictions imposed by the growth conditions. The relative change in abundance of the characterized proteins ranged from 0.1- to 90-fold among the three binary comparisons. The overall coverage of the characterized proteins ranged from 10 to 80%, consisting of one to 34 peptides per protein. The quantitative results obtained from our study were comparable to other existing proteomic and transcriptional profiling approaches. This study illustrates the robustness of this novel LCMS approach for the simultaneous quantitative and comprehensive qualitative analysis of proteins in complex mixtures.

Escherichia coli is a microbial symbiote found in the colon and large intestine of most warm-blooded animals that plays a critical role in vertebrate anabolism and catabolism. The environment in which E. coli lives is subject to rapid changes in the availability of the carbon and nitrogen compounds necessary to provide its energy and primary building blocks. E. coli survival hinges on the ability to successfully control the expression of genes coding for enzymes and proteins required for growth in response to environmental changes. Because of its simple cellular structure and its relative ease of maintenance and manipulation in the laboratory, E. coli has become the “workhorse host” for most research in molecular biology and microbiology. As a result, it is regarded as one of the most completely characterized organisms in all biology. The ease with which recombinant proteins can be expressed in E. coli has made this bacterium useful in the study of many basic biological processes as well as in the production of heterologous proteins for research and therapeutic purposes. For these reasons, E. coli has become a model system for testing new analytical technologies. For example, the relatively small genome size and prevalent laboratory use made the E. coli genome one of the first to be completely sequenced (1). Likewise E. coli genome microarrays were among the first to be commercially available with sequences for the complete set of open reading frames as well as intergenic regions (2). The origins of proteomics can also be traced back to E. coli when pioneering two-dimensional gel electrophoresis experiments enabled the investigation of proteins on an organism-wide scale (3). Resources such as CyberCell Database (4) and EchoBASE (5) have been designed as central repositories of biochemical and genetic data from E. coli generated by a wide range of sources. These databases are periodically updated and annotated to facilitate a comprehensive understanding of this model organism. The knowledge gained through this organized effort can be applied to the understanding of other organisms for the development of antibiotics and/or antifungal agents.

The availability of fully sequenced genomes has allowed construction of microarrays that are used to detect and quantify all postulated gene products by determining the levels of the corresponding transcribed mRNA. A study by Zimmer et al. (6) demonstrated the use of this method to identify those genes in E. coli whose expression is activated when replacing a preferred nitrogen source with a non-preferred nitrogen source. In a separate study, Oh et al. (7) performed a similar analysis where E. coli were grown on different carbon sources. These studies not only identified those genes known to be associated with the specified metabolic pathway but also revealed many genes that had not been linked previously with the metabolic pathway under study.

Although the measurement of transcribed mRNA by hybridization techniques has led to the discovery of molecular markers and the elucidation of biologic mechanisms, this technique is not sufficient for the complete characterization of biologic systems. The detection of a particular gene product in a microarray experiment does not confirm the presence or absence of the resulting protein product or related post-translationally modified isoforms. It is also understood that quantitative differences in the transcript of a particular gene or set of genes may not necessarily correlate with the corresponding protein abundance. This failure was illustrated by a study involving the effect of carbon source perturbation on steady-state gene expression in Saccharomyces cerevisiae. The authors reported that growing S. cerevisiae on either galactose or ethanol resulted in significant differences between the abundance ratio of the mRNA and the corresponding protein products (8). Several other studies have demonstrated the poor correlation between the relative abundance of a transcript and the corresponding protein (9, 10). To fully understand the cellular physiology of a particular organism or disease state, a comprehensive analytical survey of the cell must be completed. The information gathered by compiling data gained from multiple bioanalytical approaches (i.e. transcript, protein, and metabolite levels to name a few) on an organism in a variety of physiological states is the basis of the discovery science approach referred to as systems biology. Combining data for such a systems analysis leads to a degree of understanding in which “the whole is greater than the sum of the parts.”

Many studies involving analysis of complex protein mixtures have been accomplished by combining the well established separation capabilities of two-dimensional (2D)1 PAGE with mass spectrometry-based sequence identification of selected, semipurified proteins (11). Although this technique is often applicable to comparative proteomics, 2D PAGE is notoriously insensitive to proteins that are not soluble during the isoelectric focusing stage of the separation. Moreover the staining methods required to visualize the proteins impose restraints on dynamic range and detection limits. Despite two-dimensional separation of the intact proteins, individual gel spots often contain many proteins, affecting the resulting quantitative analysis. This problem is exacerbated by the varying degrees of post-translational modifications that a particular protein may undergo, resulting in protein components appearing in multiple locations on the two-dimensional image. The development of automated, data-dependent ESI MS/MS in conjunction with microcapillary LC and database searching has significantly increased the sensitivity and speed of identification of gel-separated proteins. Alternative methods have subsequently been developed to maximize the duty cycle of the mass spectrometer with a concomitant increase in sensitivity that use a parallel (“broad band” acquisition) rather than a serial approach for the collision-induced dissociation of peptides (12–14). This method enhances the run-to-run reproducibility and yields high mass accuracy for both intact peptides and fragments, thereby improving sensitivity.

A traditional approach to determine the relative quantities of peptides (or proteins) in a complex mixture involves using stable isotope-labeled peptides. This technique allows direct correlation of the naturally occurring peptide to its stable isotope-labeled analog (15–17). In these studies, an amino acid labeling strategy is incorporated into the protocol in which one of the biological samples is treated with the light isotope form of the chemical labeling reagent, and the other sample is treated with the heavy isotope form of the labeling reagent. When the samples are mixed and analyzed by LCMS, labeled peptide pairs from the two samples can be differentiated in the mass spectrometer by the virtue of their mass difference. The ratio of the signal intensities of the light to heavy peptide derivatives of the peptide pairs reflects the abundance ratio for the originating protein in the two different biological samples. Although this quantitative strategy is a useful method for determining the relative abundance of proteins between different samples, it can involve complex chemistry and require expensive reagents, and it is not particularly amenable to large scale relative quantitation studies.

In this study, we used a simple, gel-free, label-free LCMS approach for qualitative and quantitative proteomic analysis (13). This investigation involved the study of E. coli grown with single, specific carbohydrates. This approach provides an excellent model system to study subtle differences in the microbial proteome because there is a controlled environment in which only one parameter is varied. Using E. coli to better understand metabolic pathways and characterize previously unknown proteins helps validate this methodology and could lead to the discovery of novel antibiotics when applied to related virulent microbes. The results of this study correlate well with the known carbon source biochemistry and molecular biology of E. coli. The ease of use and efficiency of this new technique is demonstrated by the comparability of the results with those obtained from existing gene profiling and more traditionally obtained proteomic data available in the literature (7, 18).

MATERIALS AND METHODS

Media and Growth Conditions—

Frozen E. coli (ATCC10798, K-12) cell stocks were streaked onto Luria-Bertani (LB) plates and grown at 37 °C. An individual colony was subsequently streaked onto M9 minimal medium plates supplemented with 0.5% sodium acetate and incubated at 37 °C. Seed cultures were generated by transferring single colonies into flasks of M9 minimal medium supplemented with 0.5% sodium acetate. Seed culture flasks were shaken at 250 rpm at 37 °C until midlog phase (A600 = 0.9–1.1). The seed culture was diluted 1 ml to 500 ml into separate M9 minimal media supplemented with one of three carbon sources (0.5% glucose, 0.5% lactose, or 0.5% sodium acetate). Flasks were shaken at 250 rpm at 37 °C until midlog phase (A600 = 0.9–1.1) and then harvested by centrifugation (5,000 × g for 15 min). Culture medium was discarded, and the cells were frozen at −80 °C until needed for protein extract preparation.

Protein Extract Preparation—

Frozen cells were suspended in lysis buffer (Dulbecco’s phosphate-buffered saline + 1/100 protease inhibitor mixture (Sigma catalog number 8340)) at 5 ml per 1 g of biomass in a 50-ml Falcon tube. The cells were lysed by sonication in a Microson XL ultrasonic cell disrupter (Misonix, Inc.) at 4 °C. The cell debris was removed by centrifugation at 15,000 × g for 30 min at 4 °C. The resulting soluble protein extract was dispensed into 1.0-ml cryotubes and stored at −80 °C for subsequent analysis.

SDS-PAGE Analysis of Protein Extracts—

Each protein sample was denatured and reduced using a standard PAGE loading buffer mixture containing 1.0% SDS and 10 mM DTT. The denatured protein samples were run in a Bio-Rad Criterion gel apparatus into a 12% polyacrylamide gel at 160 V for 1 h. The polyacrylamide gel was stained with Coomassie Blue using standard protocols.

Protein Digest Preparation—

Approximately 250 μg of total E. coli protein was suspended in 100 μl of 50 mM ammonium bicarbonate (pH 8.5) containing 0.05% Rapigest (19). Protein was reduced in the presence of 10 mM dithiothreitol at 60 °C for 30 min. The protein was alkylated in the dark in the presence of 30 mM iodoacetamide at room temperature for 30 min. Proteolytic digestion was initiated by adding modified trypsin (Promega) at a concentration of 50:1 (E. coli protein to trypsin) and incubated at 37 °C overnight. Tryptic digestion was terminated by diluting 1:1 with water and freezing immediately at −80 °C. The tryptic peptide solution (1.25 μg/μl total protein) was centrifuged at 10,000 × g for 10 min, and the supernatant was transferred into an autosampler vial for peptide analysis via LCMS.
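The quantities in this protocol can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are not from the original methods, the 50:1 ratio is assumed to be a mass ratio, and the small volumes contributed by the reagents are neglected.

```python
# Cross-check of the digest quantities; variable names are illustrative.
protein_ug = 250.0          # total E. coli protein in the digest
digest_volume_ul = 100.0    # ammonium bicarbonate / Rapigest volume
protein_to_trypsin = 50.0   # 50:1, assumed here to be a mass ratio

trypsin_ug = protein_ug / protein_to_trypsin

# The 1:1 dilution with water doubles the volume; the small volumes added
# with DTT, iodoacetamide, and trypsin are neglected (assumption).
final_volume_ul = 2 * digest_volume_ul
final_conc_ug_per_ul = protein_ug / final_volume_ul

print(f"trypsin added: {trypsin_ug:.1f} ug")
print(f"peptide solution: {final_conc_ug_per_ul:.2f} ug/ul")
```

Under these assumptions the computed concentration agrees with the 1.25 μg/μl stated for the tryptic peptide solution.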

HPLC Configuration—

Capillary liquid chromatography of tryptic peptides was performed with a Waters CapLC system equipped with a Waters NanoEase™ Atlantis™ C18, 300-μm × 15-cm reverse phase column. The aqueous mobile phase (mobile phase A) contained 1% acetonitrile in 0.1% formic acid. The organic mobile phase (mobile phase B) contained 80% acetonitrile in 0.1% formic acid. Samples (5-μl injection, digested equivalent to 6.25 μg of total protein) were loaded onto the column with 6% mobile phase B. Peptides were eluted from the column with a gradient of 6–40% mobile phase B over 100 min at 4.4 μl/min followed by a 10-min rinse of 99% of mobile phase B. The column was immediately re-equilibrated at initial conditions (6% mobile phase B) for 20 min. The lock mass, [Glu1]fibrinopeptide at 100 fmol/μl, was delivered from the auxiliary pump of the CapLC system at 1 μl/min to the reference sprayer of the NanoLockSpray™ source. All samples were analyzed in triplicate.
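The gradient program described above can be written as a simple piecewise function. This is a minimal sketch, assuming that time zero is the start of the elution gradient and that the steps to the rinse and to re-equilibration are instantaneous; the function names are hypothetical and not part of any instrument software.

```python
def percent_b(t_min: float) -> float:
    """Return % mobile phase B at t_min minutes after the gradient starts."""
    if t_min < 0 or t_min > 130:
        raise ValueError("time outside the 130-min program")
    if t_min <= 100:                      # linear gradient, 6 -> 40% B
        return 6.0 + (40.0 - 6.0) * t_min / 100.0
    if t_min <= 110:                      # 10-min rinse at 99% B
        return 99.0
    return 6.0                            # 20-min re-equilibration


def percent_acetonitrile(b: float) -> float:
    """Acetonitrile content of the mixed eluent (A = 1% ACN, B = 80% ACN)."""
    return (100.0 - b) / 100.0 * 1.0 + b / 100.0 * 80.0


# On-column load: 5-ul injection of the 1.25 ug/ul digest.
injected_ug = 5.0 * 1.25

print(percent_b(50.0), round(percent_acetonitrile(percent_b(50.0)), 2))
print(injected_ug)
```

The last line reproduces the 6.25 μg of total protein stated for each injection.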

Mass Spectrometer Configuration—

Mass spectrometry analysis of tryptic peptides was performed using a Waters/Micromass Q-TOF Ultima API system. For all measurements, the mass spectrometer was operated in V-mode with typical resolving power of at least 10,000. All analyses were performed using positive mode ESI using a NanoLockSpray source. The lock mass channel was sampled every 30 s. The mass spectrometer was calibrated with a [Glu1]fibrinopeptide solution (100 fmol/μl) delivered through the reference sprayer of the NanoLockSpray source. Accurate mass LCMS data were collected in an alternating, low energy (MS) and elevated energy (MSE) mode of acquisition. The spectral acquisition time in each mode was 1.85 s with a 0.15-s interscan delay. In low energy MS mode, data were collected at a constant collision energy of 10 eV. In MSE mode, collision energy was ramped from 28 to 35 eV during each 1.85-s data collection cycle. One cycle of MS and MSE data was acquired every 4.0 s. The radio frequency applied to the quadrupole mass analyzer was adjusted such that ions from m/z 300 to 2000 were efficiently transmitted, ensuring that any ions observed in the LC/MSE data less than m/z 300 were known to arise from dissociations in the collision cell.
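The timing of the alternating acquisition follows directly from the scan and interscan times given above. The sketch below works through that arithmetic; the 30-s chromatographic peak width is an assumed value chosen only to illustrate sampling density and is not reported in the text.

```python
# Timing of the alternating low/elevated energy acquisition.
scan_s = 1.85        # spectral acquisition time per mode
interscan_s = 0.15   # interscan delay
modes = 2            # low energy (MS) and elevated energy (MSE)

cycle_s = modes * (scan_s + interscan_s)

gradient_min = 100   # elution gradient length from the HPLC configuration
cycles_per_gradient = gradient_min * 60 / cycle_s

# Sampling density across a peak; the 30-s width is an assumed example.
assumed_peak_width_s = 30
points_per_peak = assumed_peak_width_s / cycle_s

print(f"cycle time: {cycle_s:.1f} s")
print(f"MS/MSE cycles over the gradient: {cycles_per_gradient:.0f}")
print(f"points per channel across a 30-s peak: {points_per_peak:.1f}")
```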

Data Processing and Protein Identification—

The continuum LCMSE data were processed and searched using ProteinLynx Global Server (PLGS) version 2.2. The resulting peptide and protein identifications were evaluated by the software using statistical models similar to those described by Skilling et al. (20). Results from replicate injections were collated for quantitative analysis to determine the relative -fold change using the glucose condition as the control experiment. Protein identifications were assigned by searching an E. coli protein database using the precursor and fragmentation data afforded by the LCMS acquisition method. The search parameter values for each precursor and associated fragment ion were set by the software using the measured mass error and intensity error obtained from processing the raw continuum data. The mass error tolerance values were typically under 5 ppm. Peptide identifications were restricted to tryptic peptides with no more than one missed cleavage and cysteine carbamidomethylation. The ion detection, clustering, and normalization were processed using PLGS as described earlier (13). Additional data analysis was performed with Spotfire Decision Site 7.2 and Microsoft Excel.

Due to the nature of the alternate scanning acquisition method, fragment ions produced from any given precursor will have the same chromatographic profile and apex retention time as the originating precursor ion. The data processing software produces an inventory of the measured monoisotopic mass of each detected precursor and fragment ion. The chromatographic peak area, chromatographic peak shape, combined charge state, and the apex retention time are also provided for each corresponding precursor and fragment ion. The chromatographic peak area is determined from the combined intensity of all the isotopes for all of the charge states associated with each precursor. Fragment ions are assigned to a parent precursor only if their apex retention times are within plus or minus the time associated with one acquisition scan (i.e. alternate scanning cycle). In these experiments, because the alternate scanning cycle time was 2 s, the ions found in the elevated energy channel within ±0.05 min of a given precursor were assigned as associated fragments.
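The retention time rule for associating fragments with precursors can be sketched as follows. This is a simplified illustration, not the PLGS implementation: the `Ion` class, the function name, and all mass and retention time values are hypothetical, and only the ±0.05-min tolerance comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Ion:
    mass: float      # monoisotopic mass
    rt_min: float    # apex retention time in minutes


def assign_fragments(precursors, fragments, rt_tol_min=0.05):
    """Map each precursor index to the fragments whose apex retention
    time lies within rt_tol_min of that precursor's apex."""
    assignments = {i: [] for i in range(len(precursors))}
    for frag in fragments:
        for i, prec in enumerate(precursors):
            if abs(frag.rt_min - prec.rt_min) <= rt_tol_min:
                assignments[i].append(frag)
    return assignments


# Hypothetical detections: two co-eluting precursors and one isolated one.
precursors = [Ion(1500.75, 30.00), Ion(1612.80, 30.04), Ion(980.50, 45.00)]
fragments = [Ion(401.20, 30.02), Ion(575.30, 30.08), Ion(262.10, 45.20)]
result = assign_fragments(precursors, fragments)

for i, frags in result.items():
    print(precursors[i].mass, [f.mass for f in frags])
```

In this example the first fragment falls within tolerance of both co-eluting precursors and is therefore listed under each; time alignment alone does not resolve such cases.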

A qualitative analysis of a protein mixture may produce instances where more than one precursor ion can be found at the same apex retention time. In this instance, the fragmentation data associated with a specific moment in time is shared among more than one co-eluting precursor ion; however, it is important to remember that precursor and fragment ion data are acquired at high mass accuracy (±5 ppm). At 5-ppm mass accuracy, there is enough mass specificity to resolve associated fragment ions with their appropriate precursor ion for a subsequent accurate mass stringent database search. With this level of mass accuracy and the ability to obtain time-resolved mass measurements, confident identifications can be made in the instances of co-eluting peptides.
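To give a sense of scale for the ±5 ppm figure, the sketch below converts a ppm tolerance into an absolute mass window and computes the ppm error of a measurement. The function names and the example mass of 1500 are illustrative assumptions.

```python
def ppm_error(measured: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def mass_window(theoretical: float, tol_ppm: float = 5.0):
    """Lower and upper mass limits for a given ppm tolerance."""
    delta = theoretical * tol_ppm * 1e-6
    return theoretical - delta, theoretical + delta


low, high = mass_window(1500.0)          # hypothetical peptide mass
print(f"5 ppm window at 1500: {low:.4f} to {high:.4f}")
print(f"error of 1500.0030: {ppm_error(1500.0030, 1500.0):.1f} ppm")
```

At a mass of 1500, a 5-ppm tolerance corresponds to a window of only ±0.0075 mass units, which is what allows co-eluting fragments to be matched to candidate sequences with some specificity.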

In instances where data from multiple injections of the same sample have been collected, the methodology utilizes chromatographic and analytical reproducibility to help confidently assign fragments to co-eluting precursors. A more thorough description of how the algorithms are used to “clean” the fragmentation data will be provided in future work.2

Peptide Clustering and Data Normalization—

Identical peptides from each of the replicate injections for all conditions were clustered by mass precision (typically <10 ppm) and a retention time tolerance (typically <0.25 min) using the PLGS clustering software. The clustered peptide data set was exported from PLGS and further evaluated with Excel and Spotfire. For each condition, those ion detections that occurred only in one of the three replicate injections were considered as noise and discarded from further analysis. The LCMS data were normalized to peptides originating from TUFA prior to determining the relative quantitation of identified proteins across the various conditions. The normalization strategy is described in greater detail under “Results and Discussion.”

RESULTS AND DISCUSSION

A standard PAGE analysis was performed on the soluble protein extracts from E. coli grown on three different carbon sources. The protein loading was controlled to ensure that an equal amount of total protein from each condition was applied onto the gel. Two aliquots of total protein were loaded to obtain better resolution of the most abundant proteins. The protein profile patterns illustrated in Fig. 1A reveal similar patterns for the glucose and lactose growth conditions but a distinct pattern difference between the acetate growth condition and the other two growth conditions.

Fig. 1.

An overview of the analysis of E. coli. A, an SDS-PAGE analysis of the soluble protein generated from E. coli grown in minimal media with acetate (ACE), lactose (LAC), and glucose (GLU). Lane 4 contains the following protein molecular mass markers: 250, 150, 75, 50, 40, 25, 20, 10, and 5 kDa. Lanes 1–3 illustrate 7.5 μl of the whole-cell protein extract directly after sonication. Lanes 5–7 and lanes 8–10 illustrate 7.5 and 2.5 μl of soluble protein after removing the cell debris by centrifugation, respectively. B, the BPI of a single alternate scanning LCMS acquisition (LCMSE) of E. coli from each growth condition. Each LCMSE experiment contains a low energy (LE) function for the intact peptides and an elevated energy (EE) function for the associated fragment ions. C, an overlay of the deisotoped and charge state-reduced monoisotopic mass (0–5000 amu) and apex retention time (10–100 min) of the extracted peptides obtained from the acetate (yellow), lactose (red), and glucose (blue) experiments. The average monoisotopic mass and retention time are plotted for the replicate analysis of each condition. D, an overlay plot of the monoisotopic mass (1450–1675 amu) and apex retention time (29.5–32.0 min) of the extracted peptides from each condition represented within the gray box in Fig. 1C.

The soluble protein extracts from each of the three growth conditions were treated with trypsin, and the resulting peptide mixtures were analyzed by LCMSE. Fig. 1B illustrates the BPI and the total ion chromatograms from the low energy and elevated energy MS data acquisition for each condition, respectively. Inspection of the BPI chromatograms indicates that the similarity between the glucose and lactose conditions is also observed at the level of the tryptic peptides. In addition, the distinction between the acetate condition and the other two carbon sources is evident. The LCMSE data from each condition were processed using the Protein Expression software to produce an inventory of peptides that can be used to determine the relative abundance of peptides/proteins across multiple conditions. The complexity of the samples is illustrated in Fig. 1C, which displays ∼8000 observed monoisotopic masses for each of the extracted peptide components (MH+) as a function of the observed retention time for each condition. The alternate scanning mode of the LCMS data acquisition is configured to detect the precursor peptides in the low energy channel while simultaneously obtaining the data from associated fragments for subsequent structural determination of each precursor.

Ion Detection—

Replicate injections of tryptic peptides from soluble E. coli protein preparations were processed with the Protein Expression software, creating an inventory of the peptides obtained from the low energy data acquisition for each growth condition. Table I lists the number of peptide detections and summed ion intensities for each injection of the acetate, lactose, and glucose growth conditions. An average of 8102, 7959, and 8437 peptides were found in the acetate, lactose, and glucose growth conditions, respectively. The relative standard deviations (RSDs) for the number of peptide detections and the intensity sums ranged from 1.3 to 8.5% and from 3.1 to 5.7%, respectively. These RSDs indicate an acceptable degree of reproducibility of the data for the replicate injections of the three different growth conditions.
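The relative standard deviation used as the reproducibility measure can be computed as shown below. The replicate counts are hypothetical and are not the values in Table I, and the use of the sample standard deviation (n − 1 denominator) is an assumption, because the text does not state which form was used.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) in percent."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical peptide counts for three replicate injections.
replicate_counts = [8000, 8100, 8200]
print(f"RSD: {rsd_percent(replicate_counts):.2f}%")
```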

Table I

Summary of extracted peptides from E. coli grown in acetate, lactose, and glucose

As a quality control measure, an external standard of a five-protein mixture (bovine albumin, bovine hemoglobin, yeast enolase, yeast alcohol dehydrogenase, and rabbit phosphorylase B) was analyzed at the beginning, in the middle, and at the end of the E. coli sample analysis. These proteins were injected at ∼750 fmol. The standard protein mixture is incorporated as a performance check during the data acquisition portion of the study to help verify that the LC and MS systems are performing within acceptable specifications. Significant deviations in retention time reproducibility, signal intensity, mass resolution, and mass accuracy indicate that the LCMS system should be inspected for faults. Repeated injections of this standard protein mixture have set expected criteria for the mass spectrometer and the ion detection software. In a single injection of this simple protein mixture, ∼240 accurate mass, retention time detections (AMRTs) are obtained. Approximately 75% of the 240 time-resolved mass measurements can be assigned to tryptic fragments of the five standard proteins, constituting ∼85% of the total detected intensity. These identifications do not take into account any post-translational modifications other than carbamidomethyl cysteine; such modifications, if present, may account for a few of the unidentified AMRTs. Upon analysis of a lower level of the protein standard (∼5-fold dilution), 40 AMRTs are observed from the peak detection software, 36 of which are a subset of the 240 AMRTs found in the higher concentration standard protein mixture. Approximately 77% of the 40 time-resolved mass measurements in the lower concentration sample can be assigned to tryptic fragments of the five standard proteins, constituting ∼85% of the total detected intensity. By extension, it may be assumed that a similar proportion of ions detected from any mixture of tryptic peptides represents actual peptide detections. The statistics associated with the accounting of the peptide detections (“ion accounting”) observed from the analysis of the protein standard described above illustrate the robustness of the peak detection software and the lack of carryover from previous injections. The list of precursors from the standard protein mixture has been provided in the supplemental data (Supplemental Tables 1 and 2).
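The two “ion accounting” figures quoted above, the fraction of detections assigned and the fraction of intensity assigned, can be computed as in the sketch below. The function name and the four example detections are hypothetical; they are chosen only to show why the intensity fraction usually exceeds the count fraction when the unassigned detections are weak.

```python
def ion_accounting(amrts):
    """amrts: list of (intensity, assigned) pairs, where assigned is True
    when the detection matches a tryptic peptide of a known protein.
    Returns (fraction of detections assigned, fraction of intensity assigned).
    """
    total_count = len(amrts)
    total_intensity = sum(intensity for intensity, _ in amrts)
    assigned_count = sum(1 for _, assigned in amrts if assigned)
    assigned_intensity = sum(i for i, assigned in amrts if assigned)
    return assigned_count / total_count, assigned_intensity / total_intensity


# Hypothetical detections: three assigned, one weak unassigned.
example = [(100.0, True), (50.0, True), (30.0, True), (20.0, False)]
count_fraction, intensity_fraction = ion_accounting(example)
print(f"{count_fraction:.0%} of detections, {intensity_fraction:.0%} of intensity")
```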

Table II

Summary of the replication of detected peptides from E. coli grown in acetate, lactose, and glucose

The peak detection software is designed to interrogate both the low and elevated energy channels and produce an ion list for only those m/z detections that produce a chromatographic apex. Any m/z detections that occur at a constant background level (i.e. solvent-related ions or column matrix-associated ions) do not appear in the resulting ion lists and will therefore not interfere with subsequent protein identification and/or quantitation.

Clustering Peptide Components across Multiple Conditions—

After obtaining the inventory of the detected peptides from the replicates of each growth condition, the individual peptide lists from each of the E. coli samples were organized into a single matrix such that identical peptides were grouped across the entire experiment (replicate injections of multiple conditions) for subsequent quantitative analysis. The clustering algorithm utilizes the mass precision of the mass spectrometer and retention time reproducibility obtained from the chromatography to cluster the identical peptides across the entire experiment. The details of the clustering algorithm can be found in the work of Silva et al. (13).
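The idea of grouping identical peptides by mass and retention time can be illustrated with a simple greedy procedure. This is a simplified stand-in and should not be read as the PLGS clustering algorithm described by Silva et al. (13); the function name and example detections are hypothetical, and the default tolerances are the typical values quoted under “Materials and Methods.”

```python
def cluster_detections(detections, ppm_tol=10.0, rt_tol_min=0.25):
    """Greedy clustering of (mass, rt_min, injection_id) tuples.

    A detection joins the first cluster whose reference (its first member)
    lies within both tolerances; otherwise it starts a new cluster.
    """
    clusters = []
    for det in sorted(detections):
        mass, rt, _ = det
        for cluster in clusters:
            ref_mass, ref_rt, _ = cluster[0]
            within_mass = abs(mass - ref_mass) / ref_mass * 1e6 <= ppm_tol
            within_rt = abs(rt - ref_rt) <= rt_tol_min
            if within_mass and within_rt:
                cluster.append(det)
                break
        else:
            clusters.append([det])
    return clusters


# Hypothetical detections from four injections.
detections = [
    (1500.000, 30.00, "glu_1"),
    (1500.006, 30.10, "glu_2"),   # 4 ppm and 0.10 min from the first
    (1500.004, 35.00, "lac_1"),   # same mass, different retention time
    (1600.000, 30.00, "ace_1"),
]
clusters = cluster_detections(detections)
print(len(clusters), [len(c) for c in clusters])
```

The pairwise comparison scales quadratically with the number of detections, so a production implementation would likely index by mass first; that refinement is omitted here for brevity.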

After clustering the peptides from the entire study, the replication of each peptide within each condition was determined. Those peptides that only occurred once in each replicate analysis set (one of three) were regarded as background ions and were removed from consideration. Only those peptides that were found in at least two of the three injections were used for analysis. For the acetate condition, the discarded components corresponded to approximately 19% of the total ion detections (peptides) but only represented 4% of the total detected intensity from the acetate condition (Table II). These statistics are consistent with the notion that these discarded peptides are among the low intensity detections that occur at nearly the limit of detection of the instrument. Because the peptides observed at this detection range are more likely to cause spurious quantitative results and provide less structural information, they are discarded from further analysis. Keeping in mind that this particular analysis is not dependent upon a peptide enrichment strategy, it is unlikely that we are losing important qualitative information because there are many tryptic peptides available for the subsequent identification of the constituent proteins. Similar results were observed for the other two growth conditions.
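The replication filter amounts to keeping a clustered peptide only when it is detected in at least two of the three injections of a condition. A minimal sketch follows; the data layout (a dictionary of per-injection intensities with `None` for a missing detection), the function names, and the example values are all assumptions made for illustration.

```python
def replicate_filter(peptides, min_replicates=2):
    """Split peptides into kept and dropped groups by replication.

    peptides maps a peptide id to its intensities in each injection,
    with None marking an injection where it was not detected.
    """
    kept, dropped = {}, {}
    for peptide_id, intensities in peptides.items():
        n_detected = sum(v is not None for v in intensities)
        target = kept if n_detected >= min_replicates else dropped
        target[peptide_id] = intensities
    return kept, dropped


def total_intensity(group):
    """Sum of all detected intensities in a group of peptides."""
    return sum(v for vals in group.values() for v in vals if v is not None)


# Hypothetical clustered peptides for one growth condition.
peptides = {
    "A": [100.0, 110.0, 90.0],   # three of three
    "B": [50.0, None, 55.0],     # two of three
    "C": [None, 5.0, None],      # one of three, treated as background
}
kept, dropped = replicate_filter(peptides)
dropped_fraction = total_intensity(dropped) / total_intensity(peptides)
print(sorted(kept), sorted(dropped), f"{dropped_fraction:.1%} of intensity dropped")
```

As in the acetate data described above, the discarded detections in this toy example make up a much smaller share of the intensity than of the detection count.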

Data Normalization—

Normalization of the data is required for meaningful quantitative results. This can be accomplished in a variety of ways. In instances where not many proteins are affected by a given perturbation, an autonormalization routine is an appropriate means of normalizing the data across many different samples. In this type of normalization routine the data are normalized to the intensity of the many qualitatively matched proteins (or peptides) that are found through statistical analysis not to change between the two conditions. However, in instances where there are dramatic qualitative and quantitative changes such as those observed between glucose and acetate or lactose and acetate, autonormalization may not be the best normalization strategy. The dramatic changes due to the various conditions are illustrated later in Fig. 3 (D, F, and H). Comparing the glucose and lactose conditions, the histogram of the intensity ratios of the matched peptides indicates that not many peptides, or originating proteins, change between the two conditions. Approximately 4000 AMRTs are found within the center four bins of the histogram (Fig. 3D). However, in the case of either acetate versus glucose or acetate versus lactose, there are far fewer AMRTs (∼800, Fig. 3, F and H) that are found within the center four bins of the histogram, indicating that many proteins are changing between the two conditions. Given the dramatic changes observed among the three different growth conditions, we opted to normalize to a single protein that did not change among the three conditions prior to determining the relative protein changes among the different conditions. Considering the apparent consistency of the peptide levels of protein chain elongation factor Ef-Tu (TUFA) in the three samples and the substantial number of identified peptides to the protein (∼60% sequence coverage), it was selected as the target protein for normalization across the three different conditions. 
Before normalizing the samples across the entire set of experiments, the intensity measurements from the raw data indicated that the relative intensity ratios of the TUFA peptides varied by less than 30%. After normalization, this variability was reduced to below 20%. All the observed intensity measurements were scaled to the summed intensity of the TUFA peptides found to be common to each condition. Using this normalization strategy, we were able to correct for injection variability within each condition and also for variation in protein load among all conditions. The monoisotopic masses and retention times of the peptides used for normalization were: 1027.5585 (37.21 min), 1171.6598 (41.97 min), 1187.5300 (33.78 min), 1214.6366 (47.77 min), 1303.7826 (70.87 min), 1780.9388 (64.19 min), 1795.9577 (13.71 min), 1803.8827 (32.27 min), 1964.9721 (63.57 min), and 2117.1521 (91.80 min). The validated mass spectrum for three of these peptides is provided in the supplemental data (Supplemental Fig. 1).
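A minimal sketch of scaling to the summed intensity of a set of reference peptides is given below. The choice of the mean reference sum as the common target, the data layout, and the example intensities are assumptions; only the idea of scaling every measurement by the summed TUFA peptide intensity comes from the text.

```python
def normalize_to_reference(intensities, reference_ids):
    """Scale each injection so its summed reference intensity is equal.

    intensities maps an injection name to {peptide_id: intensity}.
    reference_ids lists peptides present in every injection (for
    example, the TUFA peptides). The target is the mean reference sum,
    which is an assumption of this sketch.
    """
    ref_sums = {
        inj: sum(peps[r] for r in reference_ids)
        for inj, peps in intensities.items()
    }
    target = sum(ref_sums.values()) / len(ref_sums)
    return {
        inj: {pid: val * target / ref_sums[inj] for pid, val in peps.items()}
        for inj, peps in intensities.items()
    }


# Made-up intensities; the two reference peptides are keyed by mass.
raw = {
    "glc_1": {"1027.5585": 1000.0, "1171.6598": 3000.0, "other": 500.0},
    "ace_1": {"1027.5585": 800.0, "1171.6598": 2400.0, "other": 900.0},
}
refs = ["1027.5585", "1171.6598"]
normalized = normalize_to_reference(raw, refs)
```

After scaling, the summed reference intensity is identical in both injections, so the remaining differences reflect changes relative to the reference protein.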

Analytical Reproducibility—

Before conducting the relative protein profiling analysis among the three different growth conditions, a variety of quality control measures were performed on the replicates of each condition to determine the analytical reproducibility of the analysis. The final results from the clustering algorithm were exported from the Protein Expression software as a comma-delimited text file containing all of the mass spectrometric and chromatographic attributes for each peptide component along with all the statistical calculations generated after the clustering process. This clustered data file was imported into Microsoft Excel to determine a number of data quality control measures and into Spotfire Decision Site to visualize the reproducibility of the analysis of each sample.

The parameters used for the clustering of identical peptides throughout an experiment rely on the inherent reproducibility of the instruments used to obtain the data. Specifically the clustering algorithm utilized the analytical reproducibility of the mass measurement and the reproducibility associated with the chromatographic retention time measurement of each peptide. The mass precision error obtained from the extracted peptide components was typically within ±5 ppm of the mean mass measurement. This is illustrated in Fig. 2A and demonstrates the robustness of the ion detection software and the stability of the mass measurement instrumentation. The variability of the quantitative intensity measurements among the replicate injections obtained from the Protein Expression software is summarized in Fig. 2B. These results indicate that the average and median RSD values among the replicate injections were 15.6 and 20.2%, respectively. Fig. 2C illustrates the reproducibility of retention times during this study where the RSD was typically less than 1.3%. These observations are consistent with Protein Expression results reported previously (13, 21). The low analytical variability associated with the replicate injections demonstrates the robustness of the method and provides the measure of confidence needed to proceed to the comparison of the paired conditions for quantitative protein profiling analysis.
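The two per-peptide statistics discussed here, the mass error relative to the cluster mean (in ppm) and the relative standard deviation (RSD), can be computed as sketched below. Using the sample standard deviation is an assumption, and the replicate values are invented for illustration.

```python
import statistics


def ppm_errors(masses):
    """Deviation of each replicate mass from the cluster mean, in ppm."""
    mean = statistics.fmean(masses)
    return [(m - mean) / mean * 1e6 for m in masses]


def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) as a percentage."""
    return statistics.stdev(values) / statistics.fmean(values) * 100.0


# Invented replicate measurements for a single clustered peptide.
masses = [1890.0200, 1890.0220, 1890.0240]
retention_times = [57.10, 57.14, 57.18]
intensities = [9500.0, 11000.0, 10200.0]

mass_ppm = ppm_errors(masses)
rt_rsd = rsd_percent(retention_times)
intensity_rsd = rsd_percent(intensities)
```

Histograms of these values over all clustered peptides would correspond to the kinds of plots summarized in Fig. 2.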

Fig. 2.

Analytical statistics from the replicate LCMSE analysis of E. coli grown in acetate. A, a histogram plot of the mass precision of the clustered peptide components from the replicate LCMSE analysis of the tryptic peptides generated from the soluble protein of E. coli grown in minimal medium with acetate as the sole carbon source. B, a histogram plot of the relative standard deviation of the measured signal intensity (coefficient of variation of replicate intensity measurements) for those clustered peptide components that replicated in at least two of the three injections of the tryptic peptides generated from the soluble protein of E. coli grown in minimal medium with acetate as the sole carbon source. C, a histogram plot of the relative standard deviation of the measured retention time obtained from the clustered peptide components that replicated in at least two of the three injections of the tryptic peptides generated from the soluble protein of E. coli grown in minimal medium with acetate as the sole carbon source.

The intensity variation of this analytical method can be assessed by conducting binary comparisons of the intensity measurements of the matched peptide components for each replicate injection. Fig. 3A illustrates the scatter plot of the binary comparison of two replicate injections from the acetate condition. Fig. 3B presents the same data as a histogram plot of the intensity ratios of the matched peptides demonstrating that the majority of the matched peptides have intensity ratios close to unity. Ideally the binary comparisons would lie on a horizontal line (ln(ratio) = 0) with minimal deviation throughout the signal detection range. The data do produce a close fit to this horizontal line with the smallest deviation between matched peptides of higher abundance. These plots are useful because they illustrate what one would expect to see if there were no apparent changes between any two conditions. The standard deviation associated with the intensity ratios of the matched peptides between two replicate injections from the acetate growth condition was determined to be 0.22. Similar standard deviations were observed from replicate injections of the other two conditions and are indicative of the variability associated with the analytical method.
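A binary comparison of this kind reduces to the natural logarithm of the intensity ratio for each peptide matched in both injections, followed by the standard deviation of those values. The sketch below uses invented intensities and assumes the sample standard deviation.

```python
import math
import statistics


def ln_ratios(numerator, denominator):
    """ln(intensity ratio) for peptides found in both inputs.

    Both arguments map peptide ids to intensities; peptides present in
    only one of them are ignored here.
    """
    shared = numerator.keys() & denominator.keys()
    return {pid: math.log(numerator[pid] / denominator[pid]) for pid in shared}


# Invented intensities for two replicate injections.
injection_1 = {"p1": 1000.0, "p2": 480.0, "p3": 2100.0, "p4": 55.0}
injection_2 = {"p1": 1050.0, "p2": 500.0, "p3": 2000.0, "p5": 40.0}

ratios = ln_ratios(injection_1, injection_2)
spread = statistics.stdev(ratios.values())
```

For replicate injections the ln(ratio) values scatter around zero, and their standard deviation serves as a baseline against which comparisons between different conditions can be judged.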

Fig. 3.

Differential protein expression of E. coli grown on glucose, lactose, and acetate from replicate LCMSE experiments. Scatter plots (A, C, E, and G) and corresponding histogram plots (B, D, F, and H) of the natural logarithm of the average intensity ratios of matching peptides among the three different growth conditions. Multiple peptides corresponding to a subset of identified proteins are highlighted in the scatter plot for each binary comparison. The standard deviation (StDev) for the ln(average intensity ratio) of the matched peptides is indicated in the histogram plot for each corresponding binary comparison. A, the ln(average intensity ratio) of matched peptides between two replicate injections of E. coli grown in acetate. C, lactose versus glucose: red, LACZ; blue, GALT; yellow, GALD; black, GALM; green, DGAL; and pink, TUFA. E, acetate versus glucose: green, ACEA; yellow, ACS; red, SUCC; blue, SUCD; and black, THRC. G, acetate versus lactose: yellow, ACEA; black, MDH; blue, GALM; and red, GALE. Unique peptides to each condition are indicated at the extreme values in both the scatter plot (top = numerator, bottom = denominator) and histogram (left = numerator, right = denominator) plots. Ace, acetate; Lac, lactose; Glu, glucose.

Clustering Peptides by Mass, Retention Time, and -Fold Change—

After grouping identical peptides by their observed mass and retention time within a given condition, the peptide intensity ratios from any two conditions can be displayed to reflect the relative quantitative difference (-fold change) observed between those two conditions. Fig. 3, C, E, and G, display the relative -fold change observed for the matching peptide components among the three different growth conditions. Given these plots, those peptides whose intensities change significantly between conditions can be quickly identified. Fig. 3, C, E, and G, clearly show the large variation between the acetate and the other two growth conditions as compared with the small variation observed between glucose and lactose. The dramatic effect can be explained by the overall metabolic adjustments that E. coli must implement to utilize the three different carbon sources (Fig. 4). The similarity in peptide expression levels between glucose and lactose can be rationalized by the nature of the two carbon sources. Lactose is a disaccharide of glucose and galactose. Growth on lactose requires that E. coli express a series of proteins to transport and hydrolyze the disaccharide to its corresponding monosaccharides. Further a series of enzyme-catalyzed reactions are required to activate and epimerize galactose to glucose, a preferred carbon source for E. coli. Glucose is catabolized through glycolysis and the citric acid cycle to provide the primary metabolites for essential building blocks and production of energy. Acetate is a simple carbon source that initially bypasses glycolysis and enters into a modified version of the citric acid cycle, the glyoxylate shunt, to provide the necessary primary metabolites and energy to support growth.

Fig. 4.

Utilization of glucose, lactose, and acetate by E. coli. Illustration of the biochemical pathways and their corresponding enzymes required for utilization of glucose, lactose, and acetate by E. coli. Lactose catabolism includes those enzymes between lactose and β-d-glucose. Glycolysis includes those enzymes between β-d-glucose and acetyl-CoA. The citric acid cycle and glyoxylate shunt include those enzymes after acetyl-CoA. Acetate utilization includes those enzymes between acetate and acetyl-CoA. The gene/protein name and Blattner number are provided throughout the biosynthetic scheme.

The ability to cluster identical peptide components and quantitatively compare them across multiple conditions provides an analytical means to group related peptides within a given protein profiling experiment. A set of peptides within a determined -fold change range should originate from a limited subset of the proteins in the natural proteome. If the relative abundance of a protein changes between two conditions, it then follows that the relative abundance of tryptic peptides originating from the same protein will reflect the same degree of differential expression between two conditions. This methodology does not require isotope-labeled affinity tags or any other enrichment or labeling strategy. In fact, it is desirable not to enrich for specific peptides because one can take advantage of the multiple peptide measurements for any particular protein when determining its relative abundance. This quantitative comparison allows one to execute a peptide mass fingerprint (PMF) search with a ±5-ppm mass tolerance using a subset of the peptides matched between two conditions for subsequent peptide/protein identification. The quantitative information provided by this analysis in conjunction with a PMF search is a powerful approach to help identify proteins from such complicated LCMS data. Because the data have been acquired in alternate scanning mode, each tryptic peptide has associated fragmentation data that can be used for independent identification and validation of this complementary approach. The details of how peptide identifications are generated from the associated fragmentation data are discussed later in the text.

Fig. 3C is a scatter plot of the 5983 matched peptides between the glucose and lactose growth conditions of E. coli. Six sets of tryptic peptides have been highlighted to illustrate a subset of the proteins that have been identified by PLGS using both precursor and fragmentation data afforded by the LCMSE analysis. The standard deviation associated with the intensity ratios, ln(ratio), of the matched peptides between the two conditions was shown to be 0.25, slightly higher than that observed for two replicate injections of the acetate sample (0.22, Fig. 3A). A total of nine peptides lie within a -fold change range of 1.04 ≤ ln(ratio) ≤ 1.73. When these peptides are submitted for an accurate mass PMF search against the entire E. coli protein database, allowing for one missed cleavage, the results return six peptides (highlighted in green) matching to DGAL (d-galactose-binding periplasmic protein) within a 5-ppm mass error tolerance and providing 29% protein sequence coverage. Because DGAL is up-regulated in the lactose growth condition, one would expect to find additional tryptic peptides unique to that condition. An additional 13 peptides are found to DGAL that are unique to the lactose condition, increasing the final protein sequence coverage to 65%. The alternate scanning data acquisition mode (LCMSE) provides supporting sequence information from the associated fragment ions collected in the elevated energy function experiment to provide structural validation of a majority of the 19 peptides. The validated mass spectrum for three of these peptides is provided in the supplemental data (Supplemental Fig. 2).
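The two-step selection described above (restrict to a -fold change window, then match the surviving masses against theoretical tryptic peptide masses within a ppm tolerance) can be sketched as follows. The protein names, the theoretical masses, and the observed values are all invented; a real search would use an in silico digest of the E. coli protein database.

```python
def in_window(ln_ratio, low, high):
    """True if ln_ratio lies in the closed interval [low, high]."""
    return low <= ln_ratio <= high


def ppm_match(observed, theoretical, tol_ppm=5.0):
    """True if the observed mass is within tol_ppm of the theoretical mass."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


def pmf_hits(observed_masses, digest, tol_ppm=5.0):
    """Count observed masses matching each protein's theoretical peptides.

    digest maps a protein name to a list of theoretical peptide masses
    (a hypothetical stand-in for an in silico tryptic digest).
    """
    return {
        protein: sum(
            any(ppm_match(obs, theo, tol_ppm) for theo in peptides)
            for obs in observed_masses
        )
        for protein, peptides in digest.items()
    }


# Invented (mass, ln(ratio)) pairs for matched peptides.
matched = [(1000.5000, 1.20), (1500.7500, 1.50), (2000.0000, 0.10), (1200.6000, 1.70)]
selected = [mass for mass, lr in matched if in_window(lr, 1.04, 1.73)]

# Invented theoretical masses for two placeholder proteins.
digest = {"PROT_A": [1000.5020, 1500.7480, 1800.9000], "PROT_B": [1200.7000]}
hits = pmf_hits(selected, digest)
```

Restricting the search to peptides that share a -fold change window keeps the candidate list short, which is what makes an accurate mass PMF search against a whole proteome practical in this setting.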

The data plotted in Fig. 3E indicate a greater degree of dissimilarity between the two conditions (acetate and glucose), reflected in the higher standard deviation value (0.98). Here for example, of the 13 peptides in the range 4.10 ≤ ln(ratio) ≤ 5.03, there are six peptides that are identified to ACEA (isocitrate lyase, 18% protein sequence coverage) within a mass error tolerance of 5 ppm. An additional 15 peptides to ACEA were unique to the acetate condition, increasing the final protein sequence coverage to 61%. Again the sequences of the majority of the peptides were validated by the elevated energy data acquired in the alternate scanning mode (Supplemental Fig. 3).

Fig. 3G illustrates the matched peptides of the acetate and lactose growth conditions and highlights those peptides identified to isocitrate lyase, malate dehydrogenase, UDP-glucose-4-epimerase, and aldolase-1-epimerase (ACEA, MDH, GALE, and GALM, respectively). The standard deviation associated with the intensity ratios of the matched peptides between these two conditions was determined to be 1.01, similar to that observed in the comparison of the peptides common between the acetate and glucose growth conditions. The treatment of these data is similar to the approach of Conrads et al. (22); however, the method in this study uses a single experiment to provide both the qualitative and quantitative information for each of the constituent proteins. Having the ability to conduct these experiments using one instrument simplifies the overall strategy and greatly reduces the time and effort required to collect and manage the digitized sample information. A more thorough discussion of the quantitative and qualitative results and their correlation to the different growth conditions is presented later.

Simultaneous Peptide Sequence Identification by LCMSE—

To demonstrate the simultaneous qualitative capabilities of the alternate scanning mode of data acquisition, a total of seven selected ion chromatograms (SICs) from the raw, continuum LCMSE data from a single analysis of the acetate condition are illustrated in Fig. 5. The top SIC is of the doubly charged precursor m/z 945.664 of the GYINSLGALTGGQALQQAK peptide (1890.0220 MH+) from ACEA obtained in the low energy channel (function 1) whose apex retention time is 57.14 min. The six remaining SICs correspond to fragment ions of the peptide from ACEA obtained in the elevated energy channel (function 2) during the LCMSE acquisition. Specifically these SICs correspond to the y4, y8, y9, y11, y12, and y13 fragments of the ACEA peptide. The chromatographic profiles of these fragment ions are illustrated in Fig. 5 and are all shown to apex (57.17 min) within one scan (0.03 min) of the originating precursor peptide. This demonstrates the basic premise of alternate scanning LCMSE: the chromatographic profiles of product (fragment) ions must exactly parallel the profile of the precursor with peak apices matching within one MS scan of the originating precursor. The Protein Expression software converts the continuum LCMSE data into an inventory of time-resolved mass measurements of the detected peptides (precursors) from the low energy channel aligned with their corresponding fragment ions in the elevated energy channel. The information provided in the list of peptides includes the deisotoped and charge state-reduced monoisotopic mass measurement, the corresponding deconvolved intensity measurement, the measured apex retention time, and the average charge state. The time-resolved mass measurement data obtained in the elevated energy channel associated with the m/z 1890.0220 precursor at ∼57.14 min can be seen in the lower panel of Fig. 5. 
This illustration shows how the LCMSE method enables one to simultaneously perform quantitative and qualitative characterization of detected peptides.
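The time alignment premise can be sketched as a simple apex comparison: a fragment is associated with a precursor only if its apex retention time lies within one scan of the precursor apex. The function name and the small floating point allowance are assumptions, and the last fragment in the example is an invented non-matching ion.

```python
def coeluting_fragments(precursor_apex, fragment_apexes, scan_time=0.03, eps=1e-9):
    """Return fragments whose apex lies within one scan of the precursor.

    fragment_apexes maps a fragment m/z to its apex retention time in
    minutes. eps absorbs floating point rounding at the boundary.
    """
    return {
        mz: rt
        for mz, rt in fragment_apexes.items()
        if abs(rt - precursor_apex) <= scan_time + eps
    }


# Fragment m/z values and apex times as reported for Fig. 5, plus one
# invented fragment (600.000) that elutes at a different time.
fragments = {
    1242.928: 57.17,
    1185.882: 57.17,
    1114.855: 57.17,
    900.662: 57.17,
    843.637: 57.17,
    474.354: 57.17,
    600.000: 57.40,
}
aligned = coeluting_fragments(57.14, fragments)
```

The six reported fragments pass the one-scan test, whereas the invented ion at 57.40 min does not and would not be assigned to this precursor.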

Fig. 5.

Peptide identification from an LCMSE analysis. A, an SIC of the doubly charged m/z 945.664 precursor peptide ion from the low energy channel (function 1) with an apex retention time of 57.14 min and six associated fragment ions (m/z: 1242.928, 1185.882, 1114.855, 900.662, 843.637, and 474.354) from the elevated energy channel (function 2) that all chromatographically apex at 57.17 min. The delta retention time lies within one scan, 0.03 min. B, the time-resolved fragment ions from the doubly charged m/z 945.664 precursor peptide (lock mass-corrected monoisotopic mass = 1890.0220) identified as the GYINSLGALTGGQALQQAK peptide from isocitrate lyase, ACEA.

The identification of peptides and proteins is carried out using a probabilistic peptide fragmentation model in which the framework of the model has been tuned using a range of well characterized samples. The fragmentation data are deisotoped and charge state-reduced using a maximum likelihood algorithm to provide lock mass-corrected, monoisotopic mass measurements for the subsequent database search (20). The calculation of likelihood is based on a probabilistic summation over all of the possible ways a given peptide could fragment and give rise to trial masses. Observed masses are compared with a database containing probabilistic information about peptide fragmentation patterns based on empirical observation. A Markov chain model has been used to describe a number of attributes that influence the probability of peptide identification. These parameters include the expected appearance of a series of b and y ions, an amino acid undergoing a specified neutral loss, and specific cleavages to occur on the C- or N-terminal side as well as the formation of specific immonium ions. A more thorough explanation of this algorithm has been described by Skilling et al. (20).
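The probabilistic scoring itself is beyond a short example, but the theoretical y ion masses that any such model compares against can be sketched directly. The snippet below uses standard monoisotopic residue masses (only the residues needed for the ACEA peptide are listed) and computes singly charged y ions; it is not the algorithm of Skilling et al. (20).

```python
# Standard monoisotopic residue masses (Da) for the residues in the peptide.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "T": 101.04768,
    "L": 113.08406, "I": 113.08406, "N": 114.04293, "Q": 128.05858,
    "K": 128.09496, "Y": 163.06333,
}
WATER = 18.01056   # H2O
PROTON = 1.00728   # mass of a proton


def y_ion_mz(sequence, n):
    """Singly charged y_n ion: the last n residues plus water plus a proton."""
    return sum(MONO[aa] for aa in sequence[-n:]) + WATER + PROTON


peptide = "GYINSLGALTGGQALQQAK"
y_series = {n: y_ion_mz(peptide, n) for n in (4, 8, 9, 11, 12, 13)}
```

For this peptide the sketch gives y4 near 474.27 and y8 near 843.47. The fragment values quoted for Fig. 5 are read from raw continuum data, so small offsets from these theoretical values are to be expected.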

Protein Profiling of E. coli among the Different Carbon Sources—

In previous work, we have shown how accurate mass and retention time measurements of peptides could be used to identify differentially abundant peptides belonging to a simple set of proteins spiked into a background of human serum (13). The intent of that study was demonstration of the ability to use accurate mass LCMS of intact peptides (precursor information) as a primary tool for quantitative and subsequent qualitative peptide/protein analysis. We now apply this methodology using both precursor and concurrent fragment ion information to monitor the specific metabolic differences between the three E. coli samples comprising our biological model.

In this study, we sought to use a model biological system to demonstrate the full qualitative and quantitative capabilities provided by the LCMSE methodology. Fig. 6A shows the abundance ratios for the characterized peptides to a set of eight proteins found among the three different growth conditions. The relative abundance of the identified peptides is seen to lie within a narrow quantitative range. These independent quantitative measurements of the multiple peptide identifications for a particular protein provide the ability to determine the relative abundance of the protein between two conditions. Using the average -fold change, standard deviation, and number of peptides to a particular protein found in two conditions, the average relative -fold change for a protein is displayed with the appropriate 95% confidence interval (CI) in Fig. 6B. The relative quantitation of a particular protein across multiple binary comparisons (growth condition profiles) can provide additional information regarding the participation of proteins in a specific metabolic process. When taken together, the correlations among the growth condition profiles can be used to group proteins in response to specific perturbations. For example, ACEA/ALDA and IDH/MDH show very similar expression profiles for each pair of growth conditions and to a lesser degree among the four. These four proteins are in fact all metabolically related because they are involved in either the citric acid cycle or the glyoxylate shunt. Also the growth condition profiles of ribosomal proteins RL1/RS1 are very similar and differ from the previous set of proteins. The growth condition profiles illustrated by RL1 and RS2 are also shared among the other ribosomal proteins identified in this study (Fig. 7). Fig. 7 summarizes the relative quantitation to a number of identified proteins in this study that are critical for protein translation, carbon utilization, and energy metabolism. 
A more detailed discussion of the quantitative results and their correlation to the understood biochemistry is described in the following sections. An expanded list of the proteins identified in this study is provided in the supplemental data (Supplemental Table 3).
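A protein-level estimate of this kind can be sketched from the mean, standard deviation, and number of the peptide ln(ratio) values. The text does not state which interval formula was used; the sketch below assumes a two-sided Student t interval on the mean, with a small lookup table of critical values, and the peptide ratios are invented.

```python
import math
import statistics

# Two-sided 95% Student t critical values for 1 to 10 degrees of freedom.
T95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
       6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def protein_ln_ratio(peptide_ln_ratios):
    """Mean ln(ratio) and 95% CI half-width from a protein's peptides.

    Returns (mean, None) when only one peptide is available. Beyond
    10 degrees of freedom the normal value 1.96 is used, which slightly
    understates the interval for moderate peptide counts.
    """
    n = len(peptide_ln_ratios)
    mean = statistics.fmean(peptide_ln_ratios)
    if n < 2:
        return mean, None
    sd = statistics.stdev(peptide_ln_ratios)
    t_crit = T95.get(n - 1, 1.96)
    return mean, t_crit * sd / math.sqrt(n)


# Invented ln(ratio) values for five peptides from one protein.
mean_lr, half_width = protein_ln_ratio([4.3, 4.5, 4.6, 4.4, 4.6])
single_mean, single_ci = protein_ln_ratio([1.2])
```

Because every peptide from a protein is an independent measurement of the same ratio, more identified peptides tighten the interval, which is one reason the method benefits from not enriching for specific peptides.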

Fig. 6.

Differential expression of peptides and proteins. A, the results from the clustered peptides illustrate that identified peptides corresponding to differentially expressed proteins have expression ratios within a narrow range. The relative expression of the protein can be determined from these multiple peptide ratios, providing a measure of confidence for each identified protein. B, the multiple peptide measurements to each protein provide a means to obtain a 95% confidence interval for the relative expression measurement for each binary comparison. The pattern obtained from the relative abundance of each protein from each condition provides a mechanism to group related proteins according to their response to the applied perturbation. Ribosomal proteins RL1 and RS1 show a pattern across the various conditions as do a majority of the other identified ribosomal proteins in this study. Other proteins such as ACEA, ALDA, and MDH share a similar pattern that can be explained by their role in carbon utilization. GALE has a unique pattern that can be attributed to its role in lactose catabolism. Ace, acetate; Lac, lactose; Glu, glucose.

Fig. 7.

Relative quantitation of proteins among the three growth conditions from unfractionated E. coli. Relative quantitation of proteins associated with translation (A), amino acid metabolism and stress (B), and carbon and energy metabolism (C) among the three growth conditions. The relative quantitation is based on the average -fold change found among the redundant, quantitative peptide measurements from each protein. A 95% confidence interval was determined for those proteins that contained more than one peptide identification. Those proteins that were unique to a particular condition were assigned a maximum or minimum value of 5.5 (unique to numerator condition) or −5.5 (unique to denominator condition) for each binary comparison. The three binary comparisons are color-coded as follows: black, lactose versus glucose; red, acetate versus glucose; and blue, acetate versus lactose.

Table III

Subset of quantified proteins involved in carbon source utilization

AG = ln (average intensity ratio) between acetate and glucose; AL = ln (average intensity ratio) between acetate and lactose; LG = ln (average intensity ratio) between lactose and glucose; 95CI, corresponding ± 95% confidence interval; Ace, acetate; Lac, lactose; Glc, glucose; —, not applicable or not determined.

Translation Machinery—

Approximately 8% of the total cellular volume of E. coli is occupied by ribosomal proteins, and these should be among the abundant proteins in unfractionated E. coli. Determination of the relative expression of these major housekeeping proteins was an important step in the validation of this method before characterizing other, perhaps less abundant proteins involved in carbon metabolism. A total of 49 of the 54 ribosomal proteins were identified in the three growth conditions, and their relative expression profiles (as ln(ratio)) are illustrated in Fig. 7A. The validated mass spectrum of a subset of ribosomal proteins is provided in the supplemental data (Supplemental Fig. 4). The average percent sequence coverage of the ribosomal proteins was ∼54% from either the glucose or lactose growth conditions. However, the average percent sequence coverage of the ribosomal proteins decreased to ∼32% during growth on acetate. The concomitant decrease of ribosomal proteins along with the slower growth rate has been demonstrated when E. coli is grown on acetate (23). The results illustrated in Fig. 7 concur with these observations.

The consistency of the expression profiles for the ribosomal proteins in the paired conditions is striking (Figs. 6 and 7). Growth on acetate results in consistent down-regulation of these proteins relative to growth on either glucose or lactose. These results are consistent with the work of Marr (23), who was able to correlate the specific growth rates of E. coli on various carbon sources with the absolute quantity of ribosomal proteins. The growth rate of E. coli in acetate decreases when compared with the growth rate on either glucose or lactose. The lower growth rate correlates to a lower rate of protein synthesis and results in a decrease in the level of ribosomal proteins. With a decrease in protein synthesis and ribosomal proteins, the demand for amino acid biosynthesis is also attenuated. Conversely the level of ribosomal proteins is not affected by substituting lactose for glucose. Because TUFA was used to normalize the data across all experiments, it does not show any change throughout the three different growth conditions. A number of other associated ribosomal proteins and protein translation chaperones were also identified, and their relative quantitation was determined. Among these proteins were EFG, DNAK, GROEL, GROES, CLPA, and CLPB. Although EFG was not affected, the other proteins were all up-regulated in both glucose and lactose. Growth on either glucose or lactose supports higher growth rates, and as a result there is a concomitant increase in protein production, providing the need for these chaperones to facilitate protein folding and post-translational modification.

Lactose Utilization—

As the PAGE data (Fig. 1) and the intensity ratio plots of matched peptides (Fig. 3, C and D) suggest, there are relatively few differentially expressed proteins between the glucose and lactose growth conditions. Those that are differentially expressed in the lactose condition (summarized in Table III) are significant to the metabolism of lactose. β-Galactosidase (LACZ), which is detected only in the lactose condition with 30% sequence coverage, catalyzes the hydrolysis of lactose to β-d-glucose and β-d-galactose. Aldolase-1-epimerase (GALM, 41% sequence coverage), which converts the β epimer of galactose to the α epimer, was up-regulated 2.7-fold (ln(Lactose/Acetate) = 0.99 ± 0.14, 95% CI) in the lactose relative to the acetate growth condition. Similarly galactokinase (GALK, 21% sequence coverage) was also identified and found to be up-regulated by 6.8-fold (ln(Lactose/Acetate) = 1.91 ± 0.35, 95% CI) in the lactose relative to the acetate growth condition. Another essential protein required for lactose/galactose utilization, GALT, was also found to be unique to the lactose growth condition (∼16% sequence coverage). It catalyzes the reaction of UDP-d-glucose and α-d-galactose-1-phosphate to α-d-glucose-1-phosphate and UDP-galactose. This reaction is coupled with another enzyme, GALE, that concomitantly converts α-d-galactose-1-phosphate and UTP to UDP-galactose and pyrophosphate. GALE was identified (∼33% sequence coverage) and found to be up-regulated by 7.4-fold (ln(Lactose/Acetate) = 1.99 ± 0.24, 95% CI) in the lactose relative to the acetate condition. These are among the characteristic proteins essential for lactose metabolism. A similar study by Vollmer et al. (24) identified a subset of lactose-specific proteins using a two-dimensional LC strategy by combining strong cation exchange and reverse phase chromatography. From the multidimensional analysis performed in their study, Vollmer et al. 
(24) identified LACZ and GALM with 15 and 3% total sequence coverage, respectively. Although they were able to identify a subset of lactose-metabolizing proteins, their results did not provide any information regarding the relative quantitation of the characterized proteins between the lactose and glucose growth conditions.

Acetate Utilization—

Comparison of the acetate condition to either the glucose or lactose conditions reveals more diversity than the comparison of glucose to lactose (Figs. 1A and 3, E–H). These results should not be surprising because growth on acetate, instead of glucose or lactose, requires the cell to redirect the carbon flux through different metabolic pathways to sustain growth. A major adaptation is the induction of enzymes to convert acetate, rather than pyruvate, to acetyl-CoA (Fig. 4). Two such pathways exist in E. coli. The first is an efficient pathway that directly converts acetate to acetyl-CoA in a single step using the acetyl-CoA synthetase (ACS). The other is a more circuitous route that converts acetate to acetyl-CoA in two steps. The first step, which is catalyzed by acetate kinase (ACKA), involves activation of acetate by phosphorylation to form acetylphosphate. The second step is catalyzed by phosphate acetyltransferase (PTA) to transfer CoA to the activated form of acetate and liberate inorganic phosphate. The relative quantitation results from this study indicate that both pathways are up-regulated, although the pathway involving ACS is elevated to a much greater extent (Table III). ACS was identified with 38% sequence coverage and found to be up-regulated by 16.6-fold (ln(Acetate/Glucose) = 2.81 ± 0.25, 95% CI) in the acetate growth condition. In addition, ACKA was identified with 25% sequence coverage and up-regulated by 1.9-fold (ln(Acetate/Glucose) = 0.63 ± 0.22, 95% CI), whereas PTA was identified with 31% sequence coverage and up-regulated by 1.6-fold (ln(Acetate/Glucose) = 0.47 ± 0.10, 95% CI) in the acetate growth condition. These results are consistent with Oh et al. (7), who reported that ACS was the main path for acetate uptake from microarray analysis of E. coli grown on acetate and glucose. They are also consistent with the results of Kakuda et al. 
(25), who showed that mutation of both ackA and pta inhibited cell growth in acetate, indicating that this pathway also delivers a significant amount of carbon flux into the cell. In the absence of glucose in the growth medium we also observed that a number of enzymes in the glycolysis pathway, including phosphoglucose isomerase (PGI), glyceraldehyde-3-phosphate dehydrogenase-A complex (GAPA), and enolase (ENO) are significantly down-regulated in the acetate sample. Another hallmark of growth on acetate is the induction of the glyoxylate shunt pathway. Specifically isocitrate lyase (ACEA) and malate synthase A (ACEB) redirect the carbon flux through the citric acid cycle to conserve the use of the acetyl-CoA for production of primary metabolites and energy management without loss of carbon as carbon dioxide. Both ACEA and ACEB were identified in the acetate condition (58 and 32% sequence coverage, respectively) and were found to be highly up-regulated, 88.2-fold (ln(Acetate/Glucose) = 4.48 ± 0.27, 95% CI) in the case of ACEA and 28.5-fold (ln(Acetate/Glucose) = 3.35 ± 0.41, 95% CI) in the case of ACEB.
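The fold changes quoted above follow from exponentiating the reported natural-log ratios. The short Python sketch below illustrates that conversion for the five proteins named in this section. It assumes that the value after "±" is a symmetric half-width of the 95% CI on the log scale, which the text does not state explicitly; the function name is hypothetical, and this is not the processing software used in the study.

```python
import math


def fold_change_with_ci(ln_ratio, half_width):
    """Back-transform a natural-log ratio and its CI half-width.

    Assumption: the interval is symmetric on the log scale, so the
    back-transformed interval is asymmetric around the fold change.
    """
    fold = math.exp(ln_ratio)
    lower = math.exp(ln_ratio - half_width)
    upper = math.exp(ln_ratio + half_width)
    return fold, lower, upper


# ln(Acetate/Glucose) values and half-widths as quoted in the text.
reported = {
    "ACS": (2.81, 0.25),
    "ACKA": (0.63, 0.22),
    "PTA": (0.47, 0.10),
    "ACEA": (4.48, 0.27),
    "ACEB": (3.35, 0.41),
}

for protein, (ln_ratio, half_width) in reported.items():
    fold, lower, upper = fold_change_with_ci(ln_ratio, half_width)
    print(f"{protein}: {fold:.1f}-fold (interval {lower:.1f} to {upper:.1f})")
```

Working on the log scale keeps up- and down-regulation symmetric around zero, which is presumably why the ratios are reported as ln values with the fold change given alongside.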

Glucose Utilization (Glycolysis and Tricarboxylic Acid Cycle)—

Several glycolysis proteins (PGI, GAPA, and ENO) were down-regulated in acetate relative to either glucose or lactose (Fig. 7 and Tables III and IV). Although the media conditions were not identical among the three studies, the directionality of the relative abundance data is consistent with the microarray data reported by Oh et al. (7) and the 2DGE data by Peng and Shimizu (18) and serves to help validate our methodology. Peng and Shimizu (18) point out that these same proteins are up-regulated in E. coli during growth on glucose when acetate is metabolized, and the flux in the gluconeogenic direction is smaller than the glycolytic flux. In addition, a series of tricarboxylic acid cycle proteins (GLTA, ACNB, ICDA, SUCA, SUCB, SUCC, SUCD, and MDH) were found to be up-regulated in acetate relative to either glucose or lactose. However, there are a number of differences among the three studies, most of which lie within the glycolytic pathway.

Table IV

Comparison with 2DGE and transcriptional microarrays

AG = ln (average intensity ratio) between acetate and glucose. Column A shows protein levels as determined by the label-free quantitation method described in this study. Column B shows transcript levels as reported by Oh et al. (7). Column C shows protein levels as determined from 2DGE reported by Peng and Shimizu (18). NR, not determined in the study.

Some of the exceptions in the microarray work of Oh et al. (7) can be explained by the common lack of correlation between mRNA levels and protein abundance that has been reported by others in many unrelated studies (8–10). Additionally there were a variety of differences among growth conditions in the three studies, ranging from the use of rich and minimal media to the amount of carbohydrate in the growth medium. One would expect some difference in glycolysis when comparing minimal to rich media because primary metabolites (amino acids and cofactors) are present in the rich medium; however, energy metabolism should show some similarity because there is a demand to support cell growth rate. This is evidenced by the greater degree of correlation found in the relative abundance among the proteins, and their corresponding transcripts, in the citric acid cycle and glyoxylate bypass reported from the three studies.

Another source of variation in the work of Peng and Shimizu (18) could be attributed to some of the common characteristics related to the 2DGE technique. The quantitative results obtained from 2DGE can be affected by proteins that exist in many spots on the gel or in instances where single spots on the gel contain many proteins. The relative quantity of isoforms of a given protein can change within a given biological sample; thus relative quantitation could be inaccurate if only one gel spot was used for the relative quantitation of proteins known to exist in different isoforms. If the reason for the multiple gel spots is a post-translational modification of a few residues within the larger protein, this would affect the quantitative result of only the modified peptide(s), leaving the remaining tryptic peptides to correlate well with the relative abundance of the full-length protein. If there is post-translational modification by proteolytic cleavage of the protein, then the quantitative results could lead to the identification of two sets of tryptic peptides exhibiting two different relative abundances provided that the inactive, truncated form of the protein was subsequently degraded. We have not explored examples of these specific scenarios, but because the data acquisition provides a digitized inventory of all the precursors and peptides, we intend to search for examples of these modifications in the data and perhaps design other experiments to explore this phenomenon as a topic for future work.
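The single-spot problem described above can be made concrete with a small numerical sketch. The Python snippet below uses hypothetical spot intensities (arbitrary units, invented purely for illustration and not taken from any of the three studies) for one protein that migrates as two gel spots. It compares the ln ratio computed from one spot with the ln ratio computed from the summed spots.

```python
import math


def ln_ratio(intensity_a, intensity_b):
    """Natural-log ratio of two intensities (condition A over condition B)."""
    return math.log(intensity_a / intensity_b)


# Hypothetical intensities for one protein resolved as two gel spots.
# The total amount is the same in both conditions; only the split
# between the unmodified and modified isoform differs.
condition_a = {"unmodified": 800.0, "modified": 200.0}
condition_b = {"unmodified": 400.0, "modified": 600.0}

# Quantifying from the unmodified spot alone suggests a 2-fold change.
single_spot = ln_ratio(condition_a["unmodified"], condition_b["unmodified"])

# Summing all spots shows no change in total protein abundance.
all_spots = ln_ratio(sum(condition_a.values()), sum(condition_b.values()))

print(f"single spot ln ratio: {single_spot:.2f}")
print(f"summed spots ln ratio: {all_spots:.2f}")
```

In this illustrative case the single-spot measurement reports an apparent change that reflects a shift between isoforms rather than a change in the total amount of the protein.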

Because both lactose and glucose conditions were analyzed and compared with acetate, they provide complementary data in the same experiment and serve as positive controls, which offers additional confidence in the results of our study. Although there are discrepancies associated with the relative quantitation of a number of glycolytic proteins among the different studies, they serve to define future experiments that will aim to address the causes of these anomalies.

Conclusions—

The method described in this study is a powerful tool to simultaneously gather qualitative and quantitative information for the characterization of components in a complex protein mixture. The data illustrate that the alternate scanning LCMS method (LCMSE) of the Protein Expression system is a label-free method that is ideally suited to these studies. The inherent redundancy of the tryptic peptides generated from the endogenous proteins is utilized both for protein identification and for subsequent relative quantitation. This strategy provides accurate mass measurements, typically below 5 ppm, of tryptic peptides and their corresponding fragment ions throughout the subsequent LCMS analysis. While determining the mass measurements of both tryptic peptides and fragments, the processing software simultaneously preserves the chromatographic integrity of the data to enhance its qualitative and quantitative capabilities. A more thorough discussion of the qualitative capabilities of this method will be described in future work.2 The ability to collect “all the ions all the time” provides structural information for every tryptic peptide that generates fragments above the minimum detection threshold of the mass spectrometer.
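For readers less familiar with the ppm convention, mass measurement error is the difference between the measured and theoretical m/z, divided by the theoretical m/z and scaled by 10^6. The Python sketch below shows that arithmetic with the 5 ppm figure from the text as a default tolerance. The function names and the m/z values are hypothetical and chosen only for illustration; this is not the algorithm used by the processing software.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass measurement error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tolerance_ppm=5.0):
    """True if the absolute error does not exceed the tolerance.

    The 5 ppm default mirrors the "typically below 5 ppm" figure quoted
    in the text; it is not a documented software setting.
    """
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tolerance_ppm


# Hypothetical example: a 0.003 m/z deviation at m/z 1000 is about 3 ppm.
example_error = ppm_error(1000.003, 1000.0)
print(f"{example_error:.2f} ppm, accepted: {within_tolerance(1000.003, 1000.0)}")
```

Because the error is relative, the same absolute deviation corresponds to a larger ppm value for lower-mass ions, which is why tolerances are usually stated in ppm rather than in absolute mass units.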

A major goal of this study was to demonstrate the utility of the Protein Expression system with a model biological system such as E. coli. The study, in fact, illustrated that the information obtained from the LCMSE data of the E. coli tryptic digests correlated well with the known biology of carbon metabolism. This work demonstrates that the Protein Expression system can rapidly determine proteome differences among varied biological conditions. These protein profiling experiments yield important information about the response of E. coli to environmental perturbations. Similar studies could later be generalized to investigate other biological systems. Such future studies can in turn lead to a more targeted strategy to combat and/or detect virulent microbes, help develop novel antibiotics, and identify important biomarkers for clinical discovery and diagnostics. In fact, this method has already been extended to other biological systems, such as mycobacteria (Mycobacterium bovis), to study proteomic profiles under different drug treatments in an effort to determine the mechanism of action of novel drugs (21). The simplicity of this label-free approach should encourage more experimentation and increase the efficiency of future biological research.

Acknowledgments

We are thankful for the valuable contributions of Timothy Riley throughout the development of this work. We also acknowledge Jeanne Li and others at Waters Corp. who provided insight throughout the editing of this manuscript. We also thank Blue Sky Biotech (Worcester, MA) for the contribution in the production and preparation of the E. coli protein extracts that were used for the purpose of this work.

Footnotes

  • Published, MCP Papers in Press, January 5, 2006, DOI 10.1074/mcp.M500321-MCP200

  • 1 The abbreviations used are: 2D, two-dimensional; MSE, elevated energy MS; PLGS, ProteinLynx Global Server; RSD, relative standard deviation; AMRT, accurate mass, retention time detection; PMF, peptide mass fingerprint; DGAL, D-galactose-binding periplasmic protein; MDH, malate dehydrogenase; SIC, selected ion chromatogram; CI, confidence interval; ACS, acetyl-CoA synthetase; PGI, phosphoglucose isomerase; 2DGE, 2D gel electrophoresis; BPI, base peak intensity.

  • 2 J. C. Silva, C. Dorschel, M. V. Gorenstein, G.-Z. Li, and S. J. Geromanos, manuscript in preparation.

  • * The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

  • S The on-line version of this article (available at http://www.mcponline.org) contains supplemental material.

    • Received September 28, 2005.
    • Revision received December 12, 2005.
  • © 2006 The American Society for Biochemistry and Molecular Biology

REFERENCES

  1. Blattner, F. R., Plunkett, G., Bloch, C. A., Perna, N. T., Burland, V., Tiley, M., Collado-Vides, J., Glasner, J. D., Rode, C. K., Mayhew, G. F., Gregor, J., Davis, N. W., Kirkpatrick, H. A., Goeden, M. A., Rose, D. J., Mau, B., and Shao, Y. (1997) The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1474
  2. Selinger, D. W., Cheung, K. J., Mei, R., Johansson, E. M., Richmond, C. S., Blattner, F. R., Lockhart, D. J., and Church, G. M. (2000) RNA expression analysis using a 30 base pair resolution Escherichia coli genome array. Nat. Biotechnol. 18, 1262–1268
  3. O’Farrell, P. H. (1975) High resolution two-dimensional electrophoresis of proteins. J. Biol. Chem. 250, 4007–4021
  4. Sundararaj, S., Guo, A., Habibi-Nazhad, B., Rouani, M., Stothard, P., Ellison, M., and Wishart, D. S. (2004) The CyberCell Database (CCDB): a comprehensive, self-updating, relational database to coordinate and facilitate in silico modeling of Escherichia coli. Nucleic Acids Res. 32, D293–D295
  5. Misra, R. V., Horler, R. S. P., Reindl, W., Goryanin, I. I., and Thomas, G. H. (2005) EchoBASE: an integrated post-genomic database for Escherichia coli. Nucleic Acids Res. 33, D329–D333
  6. Zimmer, D. P., Soupene, E., Lee, H. L., Wendisch, V. F., Khodursky, A. B., Peter, B. J., Bender, R. A., and Kustu, S. (2000) Nitrogen regulatory protein C-controlled genes of Escherichia coli: scavenging as a defense against nitrogen limitation. Proc. Natl. Acad. Sci. U. S. A. 97, 14674–17679
  7. Oh, M. K., Rohlin, L., Kao, K. C., and Liao, J. C. (2002) Global expression profiling of acetate-grown Escherichia coli. J. Biol. Chem. 277, 13175–13183
  8. Griffin, T. J., Gygi, S. P., Ideker, T., Rist, B., Eng, J., Hood, L., and Aebersold, R. (2002) Complementary profiling of gene expression at the transcriptome and proteome levels in Saccharomyces cerevisiae. Mol. Cell. Proteomics 1, 323–333
  9. Gygi, S. P., Rochon, Y., Franza, B. R., and Aebersold, R. (1999) Correlation between protein and mRNA abundance in yeast. Mol. Cell. Biol. 19, 1720–1730
  10. Anderson, L., and Seilhamer, J. (1997) Comparison of selected mRNA and protein abundances in human liver. Electrophoresis 18, 533–537
  11. Henzel, W. J., Watanabe, C., and Stults, J. T. (2003) Protein identification: the origins of peptide mass fingerprinting. J. Am. Soc. Mass Spectrom. 14, 931–942
  12. Bateman, R. H., and Hoyes, J. B. (January 16, 2002) Methods and apparatus for mass spectrometry. UK Patent 2,364,168A
  13. Silva, J. C., Denny, R., Dorschel, C. A., Gorenstein, M., Kass, I. J., Li, G.-Z., McKenna, T., Nold, M. J., Richardson, K., Young, P., and Geromanos, S. (2004) Quantitative proteomic analysis by accurate mass retention time pairs. Anal. Chem. 77, 2187–2200
  14. Purvine, S., Eppel, J. T., Yi, E. C., and Goodlett, D. R. (2003) Shotgun collision-induced dissociation of peptides using a time of flight mass analyzer. Proteomics 3, 847–850
  15. Kuhn, E., Wu, J., Karl, J., Liao, H., Zolg, W., and Guild, B. (2004) Quantification of C-reactive protein in the serum of patients with rheumatoid arthritis using multiple reaction monitoring mass spectrometry and 13C-labeled peptide standards. Proteomics 4, 1175–1186
  16. Ross, P. L., Huang, Y. N., Marchese, J. N., Williamson, B., Parker, K., Hattan, S., Khainovski, N., Pillai, S., Dey, S., Daniels, S., Purkayastha, S., Juhasz, P., Martin, S., Bartlet-Jones, M., He, F., Jacobson, A., and Pappin, D. J. (2004) Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 3, 1154–1169
  17. Gygi, S. P., Rist, B., Gerber, S. A., Turecek, F., Gelb, M. H., and Aebersold, R. (1999) Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 17, 994–999
  18. Peng, L., and Shimizu, K. (2003) Global metabolic regulation analysis for Escherichia coli K12 based on protein expression by 2-dimensional electrophoresis and enzyme activity. Appl. Microbiol. Biotechnol. 61, 163–178
  19. Yu, Y. Q., Gilar, M., Lee, P. J., Bouvier, E. S. P., and Gebler, J. C. (2003) Enzyme-friendly, mass spectrometry-compatible surfactant for in-solution enzymatic digestion of proteins. Anal. Chem. 75, 6023–6028
  20. Skilling, J., Denny, R., Richardson, K., Young, P., McKenna, T., Campuzano, I., and Ritchie, M. (2004) Probseq—a fragmentation model for interpretation of electrospray tandem mass spectrometry data. Comp. Funct. Genomics 5, 61–68
  21. Hughes, M. A., Silva, J. C., Geromanos, S. J., and Townsend, C. A. (2006) Quantitative proteomic analysis of drug-induced changes in Mycobacteria. J. Proteome. Res. 5, 54–63
  22. Conrads, T. P., Anderson, G. A., Veenstra, T. D., Pasa-Tolick, L., and Smith, R. D. (2000) Utility of accurate mass tags for proteome-wide protein identification. Anal. Chem. 72, 3349–3354
  23. Marr, A. G. (1991) Growth rate of Escherichia coli. Microbiol. Rev. 55, 316–333
  24. Vollmer, M., Nagele, E., and Horth, P. (2003) Differential proteome analysis: two-dimensional nano-LC/MS of E. coli proteome grown on different carbon sources. J. Biomol. Tech. 14, 128–135
  25. Kakuda, H., Hosono, K., Shiroishi, K., and Ichihara, S. (1994) Identification and characterization of the ackA (acetate kinase A)-pta (phosphotransacetylase) operon and complementation analysis of acetate utilization by an ackA-pta deletion mutant of Escherichia coli. J. Biochem. 116, 916–922

Simultaneous Qualitative and Quantitative Analysis of the Escherichia coli Proteome
Jeffrey C. Silva, Richard Denny, Craig Dorschel, Marc V. Gorenstein, Guo-Zhong Li, Keith Richardson, Daniel Wall, Scott J. Geromanos
Molecular & Cellular Proteomics April 1, 2006, First published on January 5, 2006, 5 (4) 589-607; DOI: 10.1074/mcp.M500321-MCP200
