From the Waters Corporation, Milford, Massachusetts 01757-3696 and ¶ Waters Corporation, Transistorstraat 18, 1322 CE Almere, The Netherlands
| ABSTRACT |
|---|
To date a majority of the quantitative proteomic analyses have been performed using stable isotope labeling strategies such as ICAT (3), iTRAQ™ (4), SILAC (stable isotope labeling by amino acids in cell culture) (5), and ¹⁸O labeling (6, 7). These methodologies require complex, time-consuming sample preparation and can be relatively expensive.
Recently there have been numerous reports applying label-free methods to monitor the relative abundance of proteins between different conditions (8–11). Relative quantification provides information regarding specific protein abundance changes between two conditions caused by an induced perturbation (environment-induced, drug-induced, or disease-induced). These studies require comparison of identical proteolytic peptides in each of the two experiments to accurately determine relative ratios of the particular protein(s) of interest. Relative abundance values for each peptide to a given protein can then be obtained to quantitatively characterize the differential expression of proteins between different sample states. Many of these methods are based on determining the ratios of the peak area of identical peptides between different conditions. One critical factor limiting the quantitative reproducibility of these methods is the ability to efficiently cluster the detected peptides. This in turn relies on the accuracy of the mass measurement and the chromatographic reproducibility. Although relative quantification monitors changes in protein abundance between two conditions, it does not determine the absolute quantity of these proteins.
The ability to determine the absolute concentration of a protein (or proteins) present within a complex protein mixture is valuable for the understanding of the underlying molecular biology guiding the response to an applied perturbation. Cellular responses are often controlled through direct and indirect interactions of proteins present in the cell. These coordinated interactions allow the cell to communicate a response across many cellular compartments. The cell can thereby execute an efficient and expeditious recruitment and production of critical proteins needed for adaptation. A method for determining the absolute quantity of proteins in a complex sample would enable determination of the stoichiometry of proteins within a sample and would facilitate understanding of the complicated biological network of cooperative protein interactions that guide cellular responses.
To date a technique capable of determining the absolute concentration of proteins in complex mixtures from a simple LC-MS analysis without using specific internal standards for each protein has not been described. Recently Ishihama et al. (12) reported an emPAI¹ value that the authors suggest is directly proportional to the protein content in a protein mixture. The authors reported a quantitative deviation of 63% from the actual abundance using the described emPAI method. The method describes the correlation between the number of observed peptides to a protein and its absolute amount. As the amount of protein increases, the number of observed peptides to the protein also increases. This method is useful within a narrow protein concentration range in which the number of observed peptides continues to increase linearly as a function of the amount of protein. However, once a higher protein concentration is reached and all the observable peptides have been identified, the relationship deteriorates to an asymptotic limit. Additionally this method relies on characterizing peptides using traditional data-directed MS/MS and may therefore be sensitive enough only to quantify the more abundant proteins present in a mixture.
A more traditional approach to determine the absolute concentration of a protein (or proteins) in a complex mixture involves the use of stable isotope-labeled peptides spiked into the mixture. This allows direct correlation between the stable isotope-labeled peptide and its naturally occurring analog (13). Kuhn et al. (13) have carried out this stable isotope dilution strategy by using synthetic 13C-labeled peptides and multiple reaction monitoring as an analytical method for prescreening candidate protein biomarkers in human serum prior to antibody and immunoassay development.
Typically absolute quantification of proteins requires the use of one or more external reference peptides to generate a calibration-response curve for specific polypeptides from that protein (i.e. synthetic tryptic polypeptide product). The absolute quantification of the given protein is determined from the observed signal response for the specific polypeptide in the sample relative to that generated in the calibration curve. If the absolute quantification of a number of different proteins is to be determined, separate calibration curves are necessary for each specific external reference peptide for each protein. Absolute quantification allows one not only to determine changes between two conditions but also to perform quantitative protein comparisons within the same sample.
Gerber et al. (14) describe a conventional technique for absolute quantification of proteins and their corresponding modified states in complex mixtures using a synthesized peptide as a reference standard. The reference peptide is chemically identical to one of the naturally occurring tryptic peptides of a given protein. The reference standard is introduced to a complex mixture. The mixture is analyzed using LC-MS to measure the corresponding signal intensity for the derivative peptide along with the endogenous peptide. This intensity signal response is compared with an intensity calibration curve created using the introduced synthetic molecule to determine the amount of the endogenous protein in the mixture. A disadvantage with using synthetic peptides is that extra steps are required to synthesize an authentic sample and to later "spike" the synthetic standard prior to being able to determine the absolute quantity of the protein itself. To perform the absolute quantification for a number of proteins within a mixture would require one to provide a synthetic standard for each protein of interest.
Another technique for absolute quantification of proteins uses radiolabeled amino acids such as [35S]methionine, whose specific activity is known (15, 16). In this type of experiment, an amino acid, such as [35S]methionine, is incorporated in the culture medium of the growing cell(s). As proteins are synthesized, [35S]methionine is incorporated into the cellular proteins. Based on the extent of incorporation of the radiolabel, the absolute amount of the peptide or protein can be determined. These types of experiments are costly, require good standard operating procedures and specialized quantitative techniques, and can also be deleterious to the subject under study. Consequently determining absolute quantification of proteins using radiolabel techniques is limited to expendable biological systems such as microbes, plants, and cell cultures.
In this work we describe a method that provides absolute quantification of proteins from LC-MS data of simple or complex mixtures of tryptic peptides without requiring the use of numerous external reference peptide(s) or the implementation of radiolabeling methods. The method describes how to obtain a single point calibration for the mass spectrometer that is applicable to the subsequent absolute quantification of all other characterized proteins within the complex mixture.
| EXPERIMENTAL PROCEDURES |
|---|
Preparation of Simple Protein Mixture in Human Serum
An additional protein digest stock solution of the six standard proteins was prepared in human serum (~1.5 µg/µl of total serum protein) containing 0.05% RapiGest such that each protein digest was at the following concentration: glycogen phosphorylase B from rabbit, 2.4 pmol/µl; hemoglobin from a cow, 4.0 pmol/µl; alcohol dehydrogenase from yeast, 4.0 pmol/µl; serum albumin from a cow, 5.0 pmol/µl; and enolase from yeast, 6.0 pmol/µl. Six additional samples were prepared from a dilution series of the following stock solution in human serum with 0.05% RapiGest: 2-, 5-, 10-, 20-, and 50-fold.
Preparation of Human Serum for Biological Replicates
Human serum was prepared from seven individuals using BD Biosciences Vacutainers™ with clot activator as suggested by the manufacturer. A total of 300–400 µg of total serum protein (5 µl) was digested according to the procedure outlined under "Protein Digest Preparation." After digesting the serum proteins, 1 pmol of purified tryptic enolase digest was added to each sample prior to LC-MS analysis.
Media and Escherichia coli Growth Conditions
Frozen E. coli (ATCC10798, K-12) cell stocks were streaked onto Luria-Bertani (LB) plates and grown at 37 °C. An individual colony was subsequently streaked onto a plate of M9 minimal medium supplemented with 0.5% sodium acetate and grown at 37 °C. Seed cultures were generated by transferring single colonies into flasks of M9 minimal medium supplemented with 0.5% sodium acetate. Seed culture flasks were shaken at 250 rpm at 37 °C until midlog phase (A600 = 0.9–1.1). The seed culture was diluted 1 ml:500 ml into separate flasks of M9 minimal medium supplemented with 0.5% glucose. Flasks were shaken at 250 rpm at 37 °C until midlog phase (A600 = 0.9–1.1). The E. coli cell cultures were harvested by centrifugation, and pellets were frozen at −80 °C. Frozen cells were suspended in 5 ml/1 g of biomass in lysis buffer (Dulbecco's phosphate-buffered saline ± 1:100 protease inhibitor mixture (Sigma catalog number 8340)) in a 50-ml Falcon tube. The cells were lysed by sonication in a Microson XL ultrasonic cell disrupter (Misonix, Inc.) at 4 °C. The cell debris was removed by centrifugation at 15,000 × g for 30 min at 4 °C, and the resulting soluble protein extracts were diluted to 5 mg/ml with lysis buffer, dispensed into 50-µl aliquots (250 µg), and stored at −80 °C for subsequent analysis. The E. coli protein extract was spiked with tryptic peptides from yeast enolase to a final concentration of 400 fmol/µl before storing at −80 °C.
Protein Digest Preparation
A 100-µl aliquot of the human serum samples was reduced in the presence of 10 mM dithiothreitol at 60 °C for 30 min. The protein was alkylated in the dark in the presence of 50 mM iodoacetamide at room temperature for 30 min. Proteolytic digestion was initiated by adding modified trypsin (Promega) at a ratio of 75:1 (total protein to trypsin, by weight), and the mixture was incubated overnight at 37 °C. Each digestion mixture was diluted to a final volume of 200 µl with 50 mM ammonium bicarbonate (pH 8.5) to reduce the concentration of RapiGest detergent to 0.025%. Each sample was analyzed in triplicate. The tryptic peptide solution was centrifuged at 13,000 rpm for 10 min, and the supernatant was transferred into an autosampler vial for peptide analysis via LC-MS. The LC-MSE analysis was performed using 5 µl of the final peptide mixture.
Approximately 250 µg (50 µl) of E. coli protein was suspended in a final volume of 100 µl containing 50 mM ammonium bicarbonate (pH 8.5) and 0.05% RapiGest. The protein mixture was reduced and alkylated as described above.
HPLC Configuration
Capillary LC (CapLC) of tryptic peptides was performed with a Waters CapLC system equipped with a Waters NanoEase™ Atlantis™ C18, 300-µm × 15-cm reverse phase column. The aqueous mobile phase (mobile phase A) contained 1% acetonitrile in water with 0.1% formic acid. The organic mobile phase (mobile phase B) contained 80% acetonitrile in water with 0.1% formic acid. Samples (5-µl injection) were loaded onto the column with 6% mobile phase B. Peptides were eluted from the column with a gradient of 6–40% mobile phase B over 100 min at 4.4 µl/min followed by a 10-min rinse of 99% mobile phase B. The column was immediately re-equilibrated at initial conditions (6% mobile phase B) for 20 min. The lock mass, [Glu1]fibrinopeptide at 100 fmol/µl, was delivered from the auxiliary pump of the CapLC system at 1 µl/min to the reference sprayer of the NanoLockSpray™ source. All samples were analyzed in triplicate.
Mass Spectrometer Configuration
Mass spectrometry analysis of tryptic peptides was performed using a Waters/Micromass Q-TOF Ultima API. For all measurements, the mass spectrometer was operated in V-mode with a typical resolving power of at least 10,000. All analyses were performed using positive mode ESI using a NanoLockSpray source. The lock mass channel was sampled every 30 s. The mass spectrometer was calibrated with a [Glu1]fibrinopeptide solution (100 fmol/µl) delivered through the reference sprayer of the NanoLockSpray source. Accurate mass LC-MS data were collected in an alternating, low energy (MS) and elevated energy (MSE) mode of acquisition. The spectral acquisition time in each mode was 1.8 s with a 0.2-s interscan delay. In low energy MS mode, data were collected at a constant collision energy of 10 eV. In elevated MSE mode, collision energy was ramped from 28 to 35 eV during each 1.8-s data collection cycle. One cycle of MS and MSE data was acquired every 4.0 s. The RF applied to the quadrupole mass analyzer was adjusted such that ions from m/z 300 to 2000 were efficiently transmitted, ensuring that any ions observed in the LC-MSE data below m/z 300 were known to arise from dissociations in the collision cell.
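The acquisition timing stated above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the instrument control logic: the function names are illustrative, and the 30-s chromatographic peak width in the usage note is a hypothetical value that does not come from this work.

```python
def cycle_time_s(scan_s: float = 1.8, interscan_delay_s: float = 0.2,
                 n_modes: int = 2) -> float:
    """Duration of one full acquisition cycle.

    Each mode (low energy MS, elevated energy MSE) takes one scan plus
    one interscan delay, using the values given in the text.
    """
    return n_modes * (scan_s + interscan_delay_s)


def points_across_peak(peak_width_s: float, cycle_s: float) -> int:
    """Number of complete cycles that fit within a chromatographic peak.

    peak_width_s is an assumed input; the text does not state peak widths.
    """
    return int(peak_width_s // cycle_s)


if __name__ == "__main__":
    cycle = cycle_time_s()
    print(f"cycle time: {cycle:.1f} s")
    # Hypothetical 30-s wide peak, for illustration only.
    print("low energy scans across a 30-s peak:", points_across_peak(30.0, cycle))
```

With the stated 1.8-s scans and 0.2-s delays, the two alternating modes give the 4.0-s cycle quoted in the text, so each precursor is sampled in low energy mode once per cycle.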
Data Processing and Protein Identification
The continuum LC-MSE data were processed and searched using ProteinLynx Global Server (PLGS) version 2.2. Protein identifications were obtained by searching either a human database to which data from the six standard proteins were appended or an E. coli database. The ion detection, clustering, and normalization were processed using PLGS as described earlier (8). Additional data analysis was performed with Spotfire Decision Site version 7.2 and Microsoft Excel.
| RESULTS AND DISCUSSION |
|---|
Analysis of Peptides from the Standard Six-protein Mixture
The serial dilutions of the six standard protein digests were analyzed using an alternate scanning mode of data acquisition (LC-MSE) that we described previously (8). The processing software is capable of generating properly integrated peptide signal intensity measurements (deisotoped and charge state-reduced) and accurately mass-measured peptide ion lists that are used for subsequent qualitative identifications and relative quantification across 3–4 orders of magnitude dynamic range in ion detection. Because this mode of data acquisition does not bias the LC-MS analysis by gas phase preselection of candidate peptide precursors, an inventory of all the peptide components (precursor and fragments) above the limit of detection of the instrument is produced. The time-resolved mass measurements provide the ability to properly align detected fragment ions with their respective precursor ions for accurate identification.
The protein sequence coverage obtained from the PLGS-processed LC-MSE data of the six protein mixtures is outlined in Table I. The total amount of standard protein loaded onto the 300-µm column ranged from 100 to 15,000 fmol (scaling the analysis to a 75-µm chromatography system would correspond to ~6–900 fmol of total protein for the analysis of the same sample after diluting 16-fold, 1/R²). These detection limits are within acceptable levels for existing technologies. A minimum limit of 100 fmol of a single protein was determined for the purpose of this analysis using a 300-µm chromatography system and a standard nanoelectrospray source at 5 µl/min. At 100 fmol of a single protein, the top three most intense tryptic peptides to most of the proteins could be identified with a high degree of confidence. The experiment was set up so that each protein spanned a 50-fold dynamic range with an overall dynamic range of ~2.2 orders of magnitude for the entire set of protein standards. The protein sequence coverage of each protein is shown to decrease in a concentration-dependent manner, as one would expect, ranging from 84 to 2% throughout the entire data set. Replicate analysis of each sample produced signal intensity measurements with a coefficient of variation (Cv) of less than 30% for each peptide and an average Cv of less than 15% for all characterized peptides. A summary of the replicate analysis of one of the protein standards, β hemoglobin, can be seen in Table II. The 14 characterized peptides to β hemoglobin provided ~91% protein sequence coverage. The replicate analyses typically produced mass measurements with a precision of less than 5 ppm (RSD). An average mass accuracy error of 4.7 ppm (root mean square) was achieved for all of the peptides to β hemoglobin. Similar levels of analytical reproducibility (mass, retention time, and signal response) were observed for the tryptic peptides from the other proteins.
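The 16-fold scaling between the 300-µm and 75-µm systems mentioned above follows from the ratio of the column cross-sectional areas. The sketch below illustrates that arithmetic and the coefficient of variation used for the replicate measurements. The function names are illustrative, and the assumption that the load scales exactly with the square of the internal diameter is a simplification.

```python
from statistics import mean, stdev


def scale_load_fmol(load_fmol: float, from_id_um: float = 300.0,
                    to_id_um: float = 75.0) -> float:
    """Scale an on-column load between columns of different internal diameter.

    Assumes the load scales with cross-sectional area, i.e. with the
    square of the diameter ratio: (300 / 75) ** 2 == 16.
    """
    dilution = (from_id_um / to_id_um) ** 2
    return load_fmol / dilution


def cv_percent(values: list[float]) -> float:
    """Coefficient of variation (sample standard deviation / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


if __name__ == "__main__":
    for load in (100.0, 15000.0):
        print(f"{load:.0f} fmol on 300 µm -> {scale_load_fmol(load):.2f} fmol on 75 µm")
    # Hypothetical triplicate intensities for one peptide (not from the paper).
    print(f"Cv = {cv_percent([9800.0, 10000.0, 10200.0]):.1f}%")
```

The stated loading range of 100 to 15,000 fmol maps to 6.25 and 937.5 fmol under this assumption.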
4.9%. Because the proteins spanned a wide range of molecular masses (14–97 kDa), the relationship appears to be independent of the protein molecular mass. Alcohol dehydrogenase was treated as the internal standard protein for this analysis. The universal signal response factor (counts/mol, SR/pmol) was determined from the three most intense tryptic peptides to alcohol dehydrogenase. The absolute quantity of the remaining proteins was calculated using the normalized signal response obtained from alcohol dehydrogenase. From these results we observed that the average MS signal response for the three most intense tryptic peptides is constant per unit quantity of protein. From these observations, we propose that the average signal response for the three most intense tryptic peptides can be used to estimate the absolute quantity of other well characterized proteins within the same mixture.
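The single point calibration described above can be sketched in a few lines. This is an illustration under stated assumptions, not the PLGS implementation: the function names are hypothetical, the intensities are invented for the example, and the input is assumed to be a list of deisotoped, charge state-reduced peptide intensities per protein.

```python
from statistics import mean


def top3_mean(peptide_intensities: list[float]) -> float:
    """Average signal response of the three most intense peptides of a protein."""
    if len(peptide_intensities) < 3:
        raise ValueError("at least three quantified peptides are required")
    return mean(sorted(peptide_intensities, reverse=True)[:3])


def response_factor(standard_intensities: list[float],
                    standard_amount_pmol: float) -> float:
    """Signal response factor (counts/pmol) from the internal standard protein."""
    return top3_mean(standard_intensities) / standard_amount_pmol


def absolute_amount_pmol(protein_intensities: list[float],
                         factor_counts_per_pmol: float) -> float:
    """Estimated on-column amount of a protein from its top three peptides."""
    return top3_mean(protein_intensities) / factor_counts_per_pmol


if __name__ == "__main__":
    # Hypothetical intensities (counts); the standard is loaded at a known 4.0 pmol.
    standard = [9000.0, 12000.0, 15000.0, 3000.0]
    factor = response_factor(standard, standard_amount_pmol=4.0)
    unknown = [20000.0, 18000.0, 16000.0, 500.0]
    print(f"response factor: {factor:.0f} counts/pmol")
    print(f"estimated amount: {absolute_amount_pmol(unknown, factor):.1f} pmol")
```

Only the three most intense peptides enter the average, so weakly ionizing peptides (such as the 3,000- and 500-count entries in the example) do not affect the estimate.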
Analysis of the Standard Six-protein Mixture in Human Serum
A second series of samples contained identical concentrations of the standard digest with each sample containing equivalent amounts of human serum (~8.75 µg of serum protein in a 5-µl injection). These complex protein mixtures were analyzed by LC-MSE and processed with PLGS as before to obtain corresponding protein identifications. The three most intense peptides to each of the six spiked proteins were identified, and the corresponding average signal responses were calculated. These results are outlined in Table IV and were found to be similar to the results obtained from the analysis of the simple protein mixtures. These results again showed that the relative ratio of the average signal response for the three most intense tryptic peptides was consistent with the relative ratio of the absolute quantity of the individual proteins within the complex protein mixture. Although there was approximately a 20% decrease in the resulting signal response factor (counts/pmol), the results are internally consistent within the same dataset. The variability (Cv) of the normalized signal response per picomole increased slightly from 4.9 to 8.4% when obtained from the more complex sample. The error associated with the absolute quantification of the six standard proteins increased slightly as well. These results show that the determination of the absolute concentration of the six proteins was not affected by the additional complexity of the digested serum protein sample matrix and that the quantification method is capable of determining the absolute concentration of all properly characterized proteins present in a complex sample.
71%) for LC-MS analysis. This value is similar to the mass fraction for these proteins as determined from the concentration values provided by Specialty Laboratories (~67%). The absolute concentrations were found to be close to the expected concentration values for a number of the characterized serum proteins with four proteins outside the typical range as indicated by Specialty Laboratories.³ This variability can be attributed to the lack of proper sampling statistics from a single sample of human serum.
The absolute quantification method outlined in this work provides a means to carry out a mass balance analysis as a useful accounting mechanism that can be applied to the inventory of peptides from any given LC-MSE analysis. The total amount of protein present in human serum is ~60–80 mg/ml. Using an estimated concentration of 70 mg/ml of total serum protein, a 5-µl sample of human serum would contain ~350 µg of total protein. According to the digestion protocol described under "Experimental Procedures," the 350 µg of total protein was digested with trypsin in a final volume of 200 µl to produce a digested protein solution containing ~1.75 µg/µl. A total of 5 µl of digest was loaded onto the chromatography column (~8.75 µg of total digested protein). The results from the absolute quantification account for ~10.0 µg of protein digest from the 11 identified proteins as indicated in Table V; this is in good agreement with the theoretical value.
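The mass balance above is a short chain of unit conversions, sketched here for clarity. The function and variable names are illustrative; the numerical inputs are the ones stated in the text, with 70 mg/ml being the estimate the text itself adopts.

```python
def mass_balance(serum_conc_mg_per_ml: float = 70.0,
                 serum_volume_ul: float = 5.0,
                 digest_volume_ul: float = 200.0,
                 injection_volume_ul: float = 5.0) -> tuple[float, float, float]:
    """Return (total protein in µg, digest concentration in µg/µl, on-column µg).

    Note that mg/ml is numerically equal to µg/µl, so no extra factor is needed.
    """
    total_protein_ug = serum_conc_mg_per_ml * serum_volume_ul
    digest_conc_ug_per_ul = total_protein_ug / digest_volume_ul
    on_column_ug = digest_conc_ug_per_ul * injection_volume_ul
    return total_protein_ug, digest_conc_ug_per_ul, on_column_ug


if __name__ == "__main__":
    total_ug, conc, expected_ug = mass_balance()
    recovered_ug = 10.0  # sum over the 11 identified proteins, as reported in the text
    print(f"total protein digested: {total_ug:.0f} µg")
    print(f"digest concentration:   {conc:.2f} µg/µl")
    print(f"expected on column:     {expected_ug:.2f} µg")
    print(f"recovered / expected:   {recovered_ug / expected_ug:.2f}")
```

Running the same calculation at the two ends of the quoted serum concentration range (60 and 80 mg/ml) brackets the expected on-column load at 7.5 and 10.0 µg.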
Absolute Quantification of E. coli Proteins
Having the ability to determine absolute quantification of a protein allows one to determine the stoichiometric relationship of proteins within the same sample. To explore this possibility a yeast enolase protein digest of known concentration was spiked into the whole cell lysate of E. coli. The sample was analyzed in triplicate using the LC-MS method described above. The average intensity value of the top three ionizing peptides to yeast enolase was used to convert the average intensity of the top three ionizing peptides for a number of well characterized E. coli proteins to the corresponding absolute quantity of protein loaded on column. The relative levels of the estimated absolute concentrations of a number of these proteins were found to be consistent with known quaternary structural information of these proteins. The results obtained from this quantitative assessment are outlined in Fig. 5. A number of identified ribosomal proteins were found to exist at the same relative abundance (1:1), consistent with the structure of the ribosomal complex (20). The stoichiometry of GroEL and GroES was also consistent with the known structure of the molecular chaperonin (2:1). GroEL exists as two stacked heptameric rings of 14 identical 57-kDa monomers to form a cylindrical structure. GroES exists as a single heptameric ring of seven identical 14-kDa monomers that reside at one end of the GroEL structure (21). Another example is illustrated by comparing the relative level of the estimated absolute quantity obtained for the α and β chains of succinyl-CoA synthetase (SucC and SucD). These proteins were identified, and the corresponding stoichiometry was also consistent with its known heterotetrameric A2B2 (1:1) structure (22). These examples provide additional validation to the method described in this study for the determination of the absolute concentration of proteins using the signal response of the highest ionizing tryptic peptide fragments of identified proteins.
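The stoichiometry check described above amounts to comparing a ratio of estimated molar amounts with the ratio of subunit copy numbers in the known complex. The sketch below illustrates this. The function names, the on-column amounts, and the 25% relative tolerance are all hypothetical choices made for the example, not values from this work.

```python
import math


def expected_ratio(copies_a: int, copies_b: int) -> float:
    """Molar ratio implied by the subunit copy numbers of a complex."""
    return copies_a / copies_b


def observed_ratio(amount_a_fmol: float, amount_b_fmol: float) -> float:
    """Molar ratio of two proteins from their estimated on-column amounts."""
    return amount_a_fmol / amount_b_fmol


def is_consistent(observed: float, expected: float, rel_tol: float = 0.25) -> bool:
    """True when the observed ratio lies within a relative tolerance of the expected one.

    The default tolerance is an assumption chosen for illustration.
    """
    return math.isclose(observed, expected, rel_tol=rel_tol)


if __name__ == "__main__":
    # GroEL: 14 monomers per complex; GroES: 7 monomers per complex (from the text).
    expected = expected_ratio(14, 7)
    # Hypothetical estimated amounts in fmol, not taken from the paper.
    observed = observed_ratio(420.0, 200.0)
    print(f"expected GroEL:GroES = {expected:.1f}, observed = {observed:.2f}")
    print("consistent with known structure:", is_consistent(observed, expected))
```

The same comparison applies to the 1:1 cases in the text (ribosomal proteins and the two chains of succinyl-CoA synthetase) by passing equal copy numbers.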
Conclusion
The label-free method described in this work is ideally suited for determining the absolute concentration of proteins present in both simple and complex mixtures. The described method takes full advantage of the recently introduced LC-MSE mode of data acquisition and its ability to comprehensively reduce tens of thousands of ion detections to a simple inventory list of peptide precursors along with their time-resolved fragment ions. The specificity afforded by the accurate mass measurements of both the precursors and associated fragment ions (typically less than 5 ppm) provides the ability to identify, with high confidence, a large number of proteins with high sequence coverage. The ability to collect the MS data across the entire chromatographic peak width for all peptides above the limit of detection of the instrument allows for accurate quantification of peptides/proteins from the deconvoluted signal intensities (deisotoped and charge state-reduced). These three attributes of LC-MSE data acquisition in association with the correlation between the average MS signal response of the three best ionizing peptides to a protein provide a means to determine the absolute concentration of any well characterized protein present in a sample. Future studies will be performed with known complexes to further validate and understand the guiding principles of this general methodology.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Published, MCP Papers in Press, October 11, 2005, DOI 10.1074/mcp.M500230-MCP200
1 The abbreviations used are: emPAI, exponentially modified protein abundance index; CapLC, capillary LC; PLGS, ProteinLynx Global Server; Cv, coefficient of variation; RSD, relative standard deviation; SR, signal response, peak intensity.
2 M. V. Gorenstein, J. C. Silva, G. F. Li, and S. J. Geromanos, manuscript in preparation.
3 Directory of Services and Use and Interpretation of Tests, Specialty Laboratories, Santa Monica, CA (www.specialtylabs.com/default.htm).
* The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
S The on-line version of this article (available at http://www.mcponline.org) contains supplemental material.
To whom correspondence should be addressed: Waters Corp., 34 Maple St., Milford, MA 01757-3696. Tel.: 508-482-3005; Fax: 508-482-2055; E-mail: jeff_silva{at}waters.com
| REFERENCES |
|---|