Molecular & Cellular Proteomics 6:755-766, 2007.
| ABSTRACT |
|---|
|
|
|---|
2D¹ gel-based separation methods combined with mass spectrometry have been the standard for the separation, identification, and quantification of proteins. To date, the method has the greatest potential to separate complex protein mixtures comprising up to thousands of components. It also has limitations with regard to the separation of certain protein classes and quantification in general. The quantitative limitations have been detailed elsewhere (8, 9); they primarily arise from ambiguity in the identification of multiple proteins present in a single spot, identification of proteins at both extremes of the pI range, small proteins, variants and modifications, in-gel degradation, and variation in extraction efficiency. As a complementary alternative, LC-MS-based relative quantification methods have emerged to identify and quantify peptides and proteins in mixtures of various complexities. The majority of these relative quantification techniques introduce stable isotopes into the samples; they include ICAT (10), isobaric tag for relative and absolute quantification (iTRAQ) (11), in vivo stable isotope labeling by amino acids in cell culture (SILAC) (12), and ¹⁸O labeling (13, 14). They typically require multiple sample preparation steps that could increase experimental variability and decrease accuracy. Recent articles have reviewed stable isotope labeling approaches and contrasted their advantages and limitations with quantitative differential in-gel electrophoresis methods (15–17).
More recently, label-free LC-MS quantification methods have been described to determine relative abundances of proteins between multiple conditions (18–24). These methods are typically based on determining peak area ratios of the same peptides between different conditions. The quantitative reproducibility of these methods depends upon the peptide cluster efficiency, which is determined by the mass measurement accuracy and precision and the chromatographic retention time reproducibility obtained during the experiment. A recent independent study from the Association of Biomolecular Resource Facilities evaluated quantitative proteomics approaches and concluded that label-free methods did at least as well as stable isotope labeling methods.² Moreover, Silva et al. (25) discovered that a label-free approach allows for the estimation of absolute protein concentrations, which were subsequently used for stoichiometry studies.
In this study, a gel-free and label-free LC-MS approach is presented to conduct qualitative and quantitative serum analysis. The Gaucher disease protein serum profile was examined as it is biochemically and quantitatively well defined. The identification and enzyme activity determination of a known Gaucher disease biomarker will be demonstrated and cross-validated with its biochemically known activity. Furthermore, clustering methods are described to evaluate the data quality of quantitative label-free LC-MS data sets. Clustering was also used for trend identification based on absolutely determined concentrations. Intensity profiling by K-means clustering of identified peptides was used to identify interrelating proteins, for example proteins that are components of the same biochemical pathway.
| EXPERIMENTAL PROCEDURES |
|---|
|
|
|---|
Sample Preparation/Protein Depletion
Serum samples from the control, the patient pretreatment, and the patient post-treatment were either digested as received or passed through a 10-cm x 4.6-mm multiaffinity removal system column (Agilent Technologies, Palo Alto, CA) to deplete the samples. Hence targeted high abundance proteins, including albumin, IgG, antitrypsin, IgA, transferrin, and haptoglobin, were removed.
10 µl of the undepleted serum samples was diluted with 50 mM ammonium bicarbonate (Sigma-Aldrich) prior to enzymatic digestion. A 20-µl aliquot of the serum samples was used for depletion with the multiaffinity removal system according to the manufacturer's protocol. The mobile phase buffers were provided with the system and used as received. Briefly, 20 µl of serum was diluted 5-fold with 80 µl of buffer A, and particulates were removed by centrifugation through a 0.22-µm spin filter (Millipore, Billerica, MA) at 13,000 rpm for 3 min. The proteins were separated with a step gradient; the first 10 min of the gradient were maintained at 100% mobile phase A at 0.5 ml/min followed by a step to 100% mobile phase B with a flow rate of 1.0 ml/min in 0.1 min where the composition was maintained for 7 min. Reconditioning of the column was conducted with mobile phase A buffer at 1.0 ml/min for 11 min. The depletion efficiency was estimated to be 50% based on the UV absorption peak area ratio of the break-through and bound fractions. The flow-through fractions were collected and buffer-exchanged with 50 mM ammonium bicarbonate, and the volume was reduced to 80 µl.
Protein Digestion Protocols
10 µl of undepleted serum was diluted with 65 µl of 50 mM ammonium bicarbonate solution and denatured in the presence of 10 µl of 1% RapiGest detergent solution (Waters Corp., Milford, MA) at 80 °C for 15 min (26). The serum samples were reduced in the presence of 5 µl of 100 mM dithiothreitol (Sigma-Aldrich) at 60 °C for 30 min. The proteins were alkylated in the dark in the presence of 5 µl of 200 mM iodoacetamide (Sigma-Aldrich) at ambient temperature for 30 min. Proteolytic digestion was initiated by adding 15 µl of 0.5 µg/µl sequencing grade, modified trypsin (Promega, Madison, WI) and incubated overnight at 37 °C. Breakdown of the acid-labile detergent was achieved in the presence of 4 µl of an aqueous 12 M HCl solution at 37 °C for 15 min. The tryptic peptide solutions were centrifuged at 13,000 rpm for 10 min, and the supernatant was collected. The enzymatic digestion and treatment of the depleted serum solutions were as described above with the exception that 20 µl of 0.5 µg/µl trypsin solution was added.
Prior to analyses, the tryptic peptide solutions were 10-fold diluted with an aqueous 0.1% formic acid (Sigma-Aldrich) solution. A protein digest internal standard was added (1:1 dilution with 100 fmol/µl enolase from Saccharomyces cerevisiae) to perform absolute quantification. The LC-MS analyses were performed using 2 µl of the final serum protein digest mixtures.
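The dilution and spiking steps just described can be checked with a short calculation. The sketch below is illustrative only and rests on an assumption: "1:1 dilution" is read as mixing equal volumes of diluted digest and enolase standard. The function names are hypothetical, and only the post-digestion steps are covered.

```python
def on_column_standard_fmol(stock_fmol_per_ul=100.0, mix_ratio=1.0, injection_ul=2.0):
    """Amount of internal standard injected (fmol).

    Assumption: one part digest is mixed with `mix_ratio` parts of standard,
    so mix_ratio=1.0 corresponds to equal volumes.
    """
    final_conc = stock_fmol_per_ul * mix_ratio / (1.0 + mix_ratio)
    return final_conc * injection_ul


def post_digestion_dilution(digest_fold=10.0, mix_ratio=1.0):
    """Overall dilution of the digest after the 10-fold step and the 1:1 spike."""
    return digest_fold * (1.0 + mix_ratio)
```

Under this reading, a 2-µl injection carries 100 fmol of enolase digest, and the serum digest is diluted 20-fold after digestion.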
Recombinant chitotriosidase (Genzyme, Cambridge, MA) was digested as described above with minor modification. 87 µl of a 50 mM ammonium bicarbonate solution was added to 5 µl of 1 mg/ml chitotriosidase stock solution. The recombinant chitotriosidase was reduced in the presence of 1 µl of 100 mM dithiothreitol at 60 °C for 30 min. Alkylation was conducted in the dark for 30 min by adding 2 µl of 100 mM iodoacetamide. Digestion was initiated by adding 5 µl of 0.5 µg/µl modified sequencing grade trypsin and incubated overnight at 37 °C.
LC-MS Configuration
Nanoscale LC separation of tryptic peptides was performed with a NanoAcquity system (Waters Corp., Milford, MA) equipped with a Symmetry C18 5 µm, 5-mm x 300-µm precolumn and an Atlantis C18 3 µm, 15-cm x 75-µm analytical reversed phase column (Waters Corp.). The samples were initially transferred with an aqueous 0.1% formic acid solution to the precolumn with a flow rate of 4 µl/min for 3 min. Mobile phase A was water with 0.1% formic acid, and mobile phase B was 0.1% formic acid in acetonitrile. The peptides were separated with a gradient of 3–40% mobile phase B over 90 min at a flow rate of 300 nl/min followed by a 10-min rinse with 90% of mobile phase B. The column was re-equilibrated at initial conditions for 20 min. The column temperature was maintained at 35 °C. The lock mass was delivered from the auxiliary pump of the NanoAcquity system with a constant flow rate of 200 nl/min at a concentration of 100 fmol of [Glu1]fibrinopeptide B/µl to the reference sprayer of the NanoLockSpray source of the mass spectrometer. All samples were analyzed in triplicate.
Analysis of tryptic peptides was performed using a Q-Tof Premier mass spectrometer (Waters Corp., Manchester, UK). For all measurements, the mass spectrometer was operated in the v-mode of analysis with a typical resolving power of at least 10,000 full-width half-maximum. All analyses were performed using positive nanoelectrospray ion mode. The time-of-flight analyzer of the mass spectrometer was externally calibrated with NaI from m/z 50 to 1990 with the data post acquisition lock mass corrected using the monoisotopic mass of the doubly charged precursor of [Glu1]fibrinopeptide B. The reference sprayer was sampled with a frequency of 30 s. Accurate mass LC-MS data were collected in an alternating low energy and elevated energy mode of acquisition (27, 28). The spectral acquisition time in each mode was 1.5 s with a 0.1-s interscan delay. In low energy MS mode, data were collected at a constant collision energy of 4 eV. In elevated energy MS mode, the collision energy was ramped from 15 to 40 eV during each 1.5-s data collection cycle with one complete cycle of low and elevated energy data acquired every 3.2 s. The radio frequency applied to the quadrupole mass analyzer was adjusted such that ions from m/z 300 to 2000 were efficiently transmitted, ensuring that any ions less than m/z 300 observed in the LC-MS data only arose from dissociations in the collision cell.
Data Processing and Protein Identification
Continuum LC-MS data were processed and searched using ProteinLynx GlobalServer version 2.2.5 (Waters Corp.). Protein identifications were obtained with the embedded ion accounting algorithm of the software and searching a human database to which data from S. cerevisiae enolase were appended. The ion detection, clustering, and normalization were performed using ProteinLynx GlobalServer. The principles of the applied data clustering and normalization have been explained in great detail in previous publications (18, 20). Intensity measurements are typically adjusted on those components, i.e. deisotoped and charge state-reduced accurate mass retention time pairs, that replicate throughout the complete experiment for analysis at the accurate mass/retention cluster level. Components are typically clustered together with a <10 ppm mass precision and a <0.25-min time tolerance. Alignment of elevated energy ions with low energy precursor peptide ions is conducted with an approximate precision of ±0.05 min. For analysis on the protein identification and quantification level the observed intensity measurements are normalized on the intensity measurement of the identified peptides of the digested internal standard.
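The clustering tolerances quoted above (<10 ppm in mass, <0.25 min in retention time) can be illustrated with a minimal sketch. This is not the ProteinLynx GlobalServer algorithm; it is a simple greedy grouping written under the assumption that each component is a (mass, retention time) pair, and all function names are hypothetical.

```python
def ppm_difference(mass_a, mass_b):
    """Relative mass difference in parts per million."""
    return abs(mass_a - mass_b) / mass_b * 1e6


def cluster_components(injections, ppm_tol=10.0, rt_tol_min=0.25):
    """Greedy grouping of (mass, rt) pairs across injections.

    Each component joins the first cluster whose seed (first member) lies
    within both tolerances; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of (injection_index, mass, rt)
    for idx, components in enumerate(injections):
        for mass, rt in components:
            for cluster in clusters:
                _, seed_mass, seed_rt = cluster[0]
                if (ppm_difference(mass, seed_mass) < ppm_tol
                        and abs(rt - seed_rt) < rt_tol_min):
                    cluster.append((idx, mass, rt))
                    break
            else:
                clusters.append([(idx, mass, rt)])
    return clusters


def replicating(clusters, n_injections):
    """Keep clusters that contain a component from every injection."""
    return [c for c in clusters if {i for i, _, _ in c} == set(range(n_injections))]
```

Only clusters returned by `replicating` would be candidates for the intensity adjustment described above, since that step relies on components that replicate throughout the experiment.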
The underlying principles of the ion accounting search algorithm have been recently described by Li et al.³ In brief, all fragment ions within a retention time window associated to […] of the chromatographic peak width of a precursor ion are time-aligned or assigned to the precursor. The resulting precursor-product ion list is then queried against a database utilizing an iterative three-step process whereby the culmination of each loop increases the selectivity and sensitivity of the next. In addition, the method utilizes limited database queries whereby each query accesses different sets and subsets of peptides from the proteins present in the database.
During the first step, the data are matched to only correctly cleaved proteolytic peptides whose precursor and product ion masses are within the specified tolerances, typically 10 ppm for precursor ions and 20 ppm for product ions. As a consequence of these database search tolerances, each submitted precursor provides multiple tentative peptide identifications. However, the overall strategy of the search algorithm requires that only one peptide identification is provided for each detected precursor. As a result, all other low ranking tentative peptide identifications to each securely identified precursor are not considered. In addition, the product ions used for the validation of each high ranking precursor are removed from the precursor-product list of other co-eluting precursors, thereby eliminating them from consideration when identifying coincidentally detected precursors. During the second step, precursor and product ions that have not yet been assigned are queried against a subset database of the identified proteins from the first step. This includes missed cleavages, in-source fragments, neutral losses, and variable modifications. During the last step, the remaining unidentified ions are considered against the complete database for additional protein identifications, including peptide mass fingerprint identifications.
The protein identifications were based on the detection of more than two fragment ions per peptide, more than two peptides measured per protein, and identification of the protein in at least two of three injections. The false positive rate of the ion accounting identification algorithm is typically 3–4% with a randomized database 5 times the size of the original utilized database. However, by using replication as a filter, the false positive rate is minimized because false positive identifications are random in nature and as such do not tend to replicate across injections. Additional data analysis was performed with DecisionSite (Spotfire, Somerville, MA), Excel (Microsoft Corp., Redmond, WA), and Simca-P+ (Umetrics, Umeå, Sweden).
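The acceptance criteria above lend themselves to a small filter. The following is a minimal sketch rather than the software's implementation; the input layout (one dictionary per injection, mapping protein to peptide to fragment ion count) and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def accepted_proteins(injections, min_fragments=3, min_peptides=3, min_injections=2):
    """Apply the replication filter to per-injection identifications.

    injections: list of dicts, protein -> {peptide: number_of_fragment_ions}.
    "More than two" fragments or peptides is encoded as a minimum of three.
    """
    hits = defaultdict(int)
    for identified in injections:
        for protein, peptides in identified.items():
            # Count only peptides supported by enough fragment ions
            supported = [p for p, n in peptides.items() if n >= min_fragments]
            if len(supported) >= min_peptides:
                hits[protein] += 1
    return sorted(p for p, n in hits.items() if n >= min_injections)
```

A protein that satisfies the peptide criteria in only one of three injections is rejected, which is how replication suppresses randomly occurring false positives.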
| RESULTS |
|---|
|
|
|---|
|
Relative Quantification
Prior to conducting quantitative comparisons between conditions, the observed intensity measurements were normalized on the intensity measurement of the internal standard peptides. In contrast to the normalization method mentioned above, this method utilizes the three most abundant peptides identified to a protein for normalization (25). In those instances where the protein identification was based on two peptides, normalization was conducted with the two best ionizing peptides. Details on protein identification are described under "Experimental Procedures."
The relative standard deviation on the summed intensity measurement of the three most abundant peptides identified to a protein for all identified proteins for the six investigated conditions was found to be equal to 13.6% (see Supplemental Table 1; statistical outliers not excluded), which agrees well with earlier reported values using label-free quantification techniques (18, 20, 24, 29, 30). The significance of regulation level was specified at 30%. Hence 1.3-fold (±0.30 natural log scale) was used as a threshold to identify significant up- or down-regulation, which is typically 2–3 times the estimated error on the intensity measurement. The provision for a precursor ion to be included for a qualitative measurement was identification based on the search criteria described under "Experimental Procedures." Hence an assured precursor intensity threshold, typically >250 counts per acquisition scan, had to be reached to generate fragment ions of sufficient intensity for identification. In total, 108 non-redundant proteins were identified in the complete sample set, of which 46 proteins were common to depleted and undepleted serum. 20 proteins were uniquely identified in the undepleted samples. A further 42 unique proteins were identified in the depleted serum samples.
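As an illustration of the regulation threshold, the sketch below classifies an intensity ratio on the natural log scale. It assumes the ±0.30 cut-off is applied symmetrically to ln(ratio), which corresponds to roughly 1.35-fold up or 0.74-fold down; the function name is hypothetical.

```python
import math


def classify_regulation(ratio, ln_threshold=0.30):
    """Label a condition/control intensity ratio as up, down, or unchanged."""
    ln_ratio = math.log(ratio)
    if ln_ratio > ln_threshold:
        return "up"
    if ln_ratio < -ln_threshold:
        return "down"
    return "unchanged"
```

With a relative standard deviation of 13.6%, a threshold of 0.30 is about 2.2 times the measurement error, consistent with the 2–3 times stated above.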
The relative ratios and variation were individually calculated for each protein from the absolute quantification results calculated within the undepleted and depleted data sets (see Supplemental Table 2, a and b). These were calculated using the normalized summed ion intensity as described above and expressed as relative values. Of the 66 proteins identified in the undepleted sera, 35 were found to be common across all conditions, control, pretreatment, and post-treatment. For the 88 proteins identified in the depleted sera the cross-section of the three conditions equaled 56 proteins. Both cross-sections were analyzed independently, and the relative summed intensity ratio of the pre- and post-treatment samples versus the control samples was expressed. The majority (50 of 56) of the commonly identified proteins in the depleted identification cross-section show a clear trend to normalize as a result of treatment (Fig. 2). A few proteins even show some overshoot. Similar results were obtained for the undepleted cross-section data set. In this instance, 29 of 35 proteins exhibited a similar trend upon treatment.
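The notion of a protein "normalizing" upon treatment can be made concrete with a small sketch. The definition used here is an assumption, not one stated in the text: a protein is taken to normalize when its post-treatment/control ratio lies closer to 1 (on the log scale) than its pretreatment/control ratio, and to overshoot when the two ratios fall on opposite sides of 1.

```python
import math


def treatment_trend(pre_ratio, post_ratio):
    """Classify the trend of a protein from its ratios versus control."""
    pre, post = math.log(pre_ratio), math.log(post_ratio)
    if pre * post < 0:
        # The ratio crossed the control level between pre- and post-treatment
        return "overshoot"
    if abs(post) < abs(pre):
        return "normalizing"
    return "not normalizing"
```

Counting the "normalizing" and "overshoot" labels over a cross-section would give a tally comparable in spirit to the 50 of 56 and 29 of 35 reported above, although the criteria actually used for those counts are not specified here.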
|
[…] 2.7 orders lower compared with the most abundant protein present in serum. This again required that the peptide had to be identified as described in the previous paragraphs; this is independently applicable to both depleted and undepleted serum. The estimated limit of detection at the peptide level, assuming that the precursor retention time and accurate mass are known, is 3.5 orders lower compared with the highest abundant identified peptides; this approaches the linear dynamic range of the analytical technique used in this work.
Chitotriosidase Enzyme Activity
Recently published quantification rules (25) were used to calculate the amount and enzyme activity of chitotriosidase, a known biomarker for symptomatic Gaucher disease patients. Monitoring the concentration of chitotriosidase and other regulated proteins during treatment by means of LC-MS could be a measure for treatment efficacy. The absolute quantification method relies on the fact that the average MS signal response for the three most intense tryptic peptides per mole of protein is a constant. Given a reference, a digest of enolase from S. cerevisiae in this case, this relationship is used to calculate an instrument response factor for each analysis.
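The response factor calculation described above can be sketched as follows. This is a simplified illustration of the stated principle (the average signal of the three most intense peptides per mole of protein is constant), not the published implementation; the function names and the handling of proteins with fewer than three peptides are assumptions.

```python
def top_n_mean(intensities, n=3):
    """Mean intensity of the n most intense peptides (fewer if fewer exist)."""
    top = sorted(intensities, reverse=True)[:n]
    return sum(top) / len(top)


def absolute_amount_fmol(protein_peptides, standard_peptides, standard_fmol):
    """Estimate the on-column amount of a protein from a spiked standard.

    The instrument response factor (counts per fmol) is derived from the
    internal standard and then applied to the protein of interest.
    """
    response_factor = top_n_mean(standard_peptides) / standard_fmol
    return top_n_mean(protein_peptides) / response_factor
```

For example, if the three most intense enolase peptides average 600 counts for 100 fmol on column, a protein whose top three peptides average 60 counts would be estimated at 10 fmol.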
With this method, the average concentration of the three injections of chitotriosidase in the pretreatment sample was equal to 1.59 ± 0.31 fmol/µl, which can be calculated back to an actual enzyme activity of 39,500 ± 7860 nmol/ml·h. The determined amino acid sequence coverage for chitotriosidase was 29.2%. The enzyme activity for chitotriosidase was also determined with 4-methylumbelliferyl β-D-N,N',N'''-triacetylchitotriose substrate assay (31) and found to be equal to 31,800 nmol/ml·h ± 5%. The chitotriosidase level measured with both methods is in the same order of magnitude and varies by only 20%. The advantage of the LC-MS approach is the ability to calculate absolute concentrations of multiple proteins simultaneously without the requirement for isotope-labeled internal standards. It was necessary to deplete the patient serum to address the serum sample dynamic range to identify and quantify chitotriosidase in the pretreatment sample. The applied methods described in this and the following sections, however, can be equally successfully applied to undepleted samples.
Three additional samples obtained from other type I Gaucher disease patients were analyzed to statistically validate the levels and determined concentration from the LC-MS data discussed in the previous paragraph. The results of these experiments are summarized in Table I and are in agreement with the above mentioned observations that chitotriosidase is significantly elevated in the serum of patients suffering from type I Gaucher disease.
|
The absolute protein concentration values allow for the generation of so-called condition-specific signatures based on label-free quantitative LC-MS data sets, which were recently introduced for a breast cancer study.⁴ Briefly, a reference protein that is present at a given level is identified within the condition. Alternatively, the exogenous protein spike can be used. Note that this does not necessarily have to be the same protein in every condition as the protein concentration signature will not be relative to another condition but simply relative to a constant amount. Hence it is not important that the protein identity is identical as long as the amount is. The absolute protein amounts for all other identified proteins in a condition are expressed versus the absolute protein amount of the reference (Fig. 3). A number of proteins undergoing significant change are color-annotated to illustrate the usefulness of the signature. For instance, apolipoprotein A-I and complement C3 are at a relatively high concentration in the control, at a relatively low concentration when treatment is started, and subsequently close to the control concentration level again in the post-treatment sample. By annotating selected proteins, disease types can be characterized in a global manner by looking at a specific panel of proteins within the plasma proteome as a whole. In this example, apolipoprotein A-I, apolipoprotein C-II, complement C3, and chitotriosidase are proteins that have been shown previously to be regulated in Gaucher patients (4, 6, 32). To date, signatures have been determined for the pre- and post-treatment patient samples and a single control. Midtreatment signatures are currently being considered as a treatment monitoring tool. A more extensive study with a larger patient group is currently ongoing to clinically validate the identified serum signatures.
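A condition-specific signature of this kind reduces to expressing every absolute amount relative to one reference amount. The sketch below shows that step only; the function name is hypothetical and any amounts passed to it are the caller's data, not values from this study.

```python
def concentration_signature(amounts, reference):
    """Express absolute protein amounts relative to a reference protein.

    amounts: dict mapping protein name to absolute amount (e.g. fmol on column).
    reference: key of the reference protein or exogenous spike in `amounts`.
    """
    ref_amount = amounts[reference]
    return {protein: amount / ref_amount for protein, amount in amounts.items()}
```

Because each condition is scaled to a constant amount rather than to another condition, signatures from different conditions can be compared even when the reference protein differs, as long as the reference amount is the same.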
|
|
[…] 50%. However, no generic depletion fractionation factor can be derived as this is dependent on the interaction of these proteins with either the affinity column or the targeted proteins. This is despite the fact that the applied depletion technique is reported to be reproducible and robust.⁵ Extreme instances were observed where >80% of non-targeted protein was removed, e.g. clusterin and complement C1r, or the protein concentration was not affected, e.g. α1B-glycoprotein, α1-microglycoprotein, and kininogen. These findings are intriguing but will vary depending on the applied depletion technology (affinity efficiency and kinetics) and protein-protein interactions and how well the latter can be minimized. PCA at the protein level using absolute concentrations also illustrates the effect of treatment as the cluster for the post-treatment sample migrates closer to the control, agreeing with the relative quantification experiment results (Fig. 2). Here too it can be observed that triplicate injections cluster together and that PCA is well suited for multivariate analysis of multicondition experiments.
A common method for the multivariate analysis of gene expression or 2D gel sample set visualization is hierarchical clustering. The determined absolute protein concentrations by LC-MS can also be applied to this type of analysis. The log2 values of the absolute quantities were used to cluster injections of the depleted samples. As illustrated by the condition dendrograms displayed on the top of the heat map visualization (Fig. 5), the triplicate injections for each condition are most similar to each other, with the pre- and post-treatment samples next closest. This is in agreement with the PCA results at the identification/protein concentration level (Fig. 4). Differences in expression and concentration can be easily distinguished in heat map visualizations. For example, in the selection (Fig. 5), three proteins are highly abundant in the pre- and post-treatment samples and not identified in the control, corresponding to the fibrinogen α, β, and γ chains.
|
|
[…] α, β, and γ chain described in the previous hierarchical clustering section. The second cluster (Fig. 6b) represents proteins that are up-regulated in patient pre- and particularly in post-treatment samples; these are the previously mentioned fibrinolysis proteins. The third cluster (Fig. 6c) represents apolipoprotein A-I, a protein down-regulated in pre-treatment and only marginally normalizing upon therapy. Apolipoprotein A-I is a component of the high density lipoprotein serum content. The high density lipoprotein in Gaucher patients is extremely low and known to be poorly normalized following enzyme replacement therapy (6). The results obtained with the presented label-free quantitative MS methods are, therefore, again consistent with biochemical findings. Further investigation of the cluster profiles is ongoing to identify potentially new markers of interest and relationships between the identified proteins.
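The K-means intensity profiling referred to above can be illustrated with a minimal, dependency-free sketch. It is not the commercial software used for the analysis; the deterministic initialization (first k profiles as starting centroids) and the assumption that each profile is a list of scaled intensities per condition are choices made for this example.

```python
def kmeans(profiles, k, iterations=20):
    """Plain K-means on intensity profiles (lists of equal length).

    Returns (labels, centroids). Initial centroids are the first k profiles,
    which keeps the result reproducible but depends on input order.
    """
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in profiles
        ]
        # Update step: recompute each centroid as the mean of its members
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Profiles ordered as (control, pretreatment, post-treatment) that fall into the same cluster share a regulation pattern, which is the basis for grouping proteins such as the three clusters discussed above.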
| DISCUSSION |
|---|
|
|
|---|
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Published, MCP Papers in Press, February 9, 2007, DOI 10.1074/mcp.M600303-MCP200
1 The abbreviations used are: 2D, two-dimensional; PCA, principal component analysis.
2 A. M. Falick, J. A. Kowalak, W. Lane, K. Lilley, B. Phinney, C. Turck, S. Weintraub, E. Witkowska, and N. Yates, The Proteomics Research Group 2006 Quantitative Proteomics Study, Association of Biomolecular Resource Facilities, unpublished data.
3 Li, G.-Z., Golick, D., Gorenstein, M. V., Silva, J. C., Vissers, J. P. C., and Geromanos, S. J. (2006) A novel ion accounting algorithm for protein database searches, Poster W079 presented at the Human Proteome Organisation (HUPO) 5th Annual World Congress, Long Beach, CA (October 28–November 1, 2006).
4 Vissers, J. P. C., Kipping, M., Reimer, T., Kasten, A., Koy, C., Langridge, J. I., and Glocker, M. O. (2006) Quantification of diagnostic protein signatures of polygenic diseases characterized by mass spectrometric proteome analysis: a study on mamma carcinoma, Poster 168 presented at the 2006 Meeting of the Association of Biomolecular Resource Facilities, Long Beach, CA (February 11–14, 2006).
5 Chakraborty, A. B., Berger, S. J., Dorschel, C., Geromanos, S. J., Li, G.-Z., and Gebler, J. C. (2006) Is subtractive affinity depletion of abundant serum proteins useful and reproducible?, Poster 547 presented at the 54th ASMS Conference on Mass Spectrometry, Seattle, WA (May 28–June 1, 2006).
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
S The on-line version of this article (available at http://www.mcponline.org) contains supplemental material.
To whom correspondence should be addressed: Waters Corp., Market Development Proteomics, Atlas Park, Simonsway, Manchester M22 5PP, UK. Tel.: 44-161-435-4100; Fax: 44-161-435-4444; E-mail: hans_vissers{at}waters.com
| REFERENCES |
|---|
|
|
|---|