Quantitative Proteomic Analysis Using Isobaric Protein Tags Enables Rapid Comparison of Changes in Transcript and Protein Levels in

Isobaric tags for relative and absolute quantitation, an approach to concurrent, relative quantification of proteins present in four cell preparations, have recently been described. To validate this approach using complex mammalian cell samples that show subtle differences in protein levels, a model stem cell-like cell line (FDCP-mix) in the presence or absence of the leukemogenic oncogene TEL/PDGFRβ has been studied. Cell lysates were proteolytically digested, and peptides within each sample were labeled with one of four isobaric, isotope-coded tags via their N-terminal and/or lysine side chains. The four labeled samples are mixed and peptides separated by two-dimensional liquid chromatography online to a mass spectrometer (LC-MS). Upon peptide fragmentation, each tag releases a distinct mass reporter ion; the ratio of the four reporters therefore gives relative abundances of the given peptide. Relative quantification of proteins is derived using summed data from a number of peptides. TEL/PDGFRβ leukemic oncogene-mediated changes in protein levels were compared with those seen in microarray analysis of control and transfected FDCP-mix cells. Changes at the protein level in most cases reflected those seen at the transcriptome level. Nonetheless, novel differences in protein expression were found that indicate potential mechanisms for effects of this oncogene.


Molecular & Cellular Proteomics 4:924-935, 2005.
The development of approaches for measurement of relative expression of proteins between two (or more) samples is an essential aspect of systems biology. A common technique for this type of proteomic study has been two-dimensional gel electrophoresis, where proteins are separated by isoelectric point and molecular weight, and spot patterns are compared by sophisticated computer algorithms. Proteins of interest are then identified by MS. However, gel-based approaches have drawbacks, including amount of material required, reproducibility, and limiting of sensitivity by protein loss. These are balanced against recent advances in protein staining that allow intragel comparison of protein quantity from two or three samples (1).
An alternative approach to proteomic analysis is LC-MS (2). This provides increased sensitivity compared with gel-based approaches and can catalogue the proteins present in a sample. However, relative quantitation of proteins using LC-MS is challenging. Quantification by analyzing two samples in parallel and comparing their mass spectrometric profiles is not feasible. Isotopic labeling of peptides, however, does allow two samples to be analyzed in a single experiment. Isotope-coded affinity tagging using ICAT reagent technology indicates peptide source, with peak height giving relative quantity (3). Comparing the ICAT reagent approach with two-dimensional gel electrophoresis, however, demonstrates that neither offers comprehensive coverage of a proteome (4). This is true in part because many proteins (and therefore peptides) do not contain cysteine, the amino acid used for covalent attachment of the isotopomer in ICAT reagents. Thus, much information can be discarded in the form of non-labeled peptides, whereas two-dimensional gel electrophoresis excludes many large, hydrophobic, and basic proteins.
Stable isotope labeling with amino acids in cell culture uses isotopes of essential amino acids (for example deuterated leucine) to label cells in culture (5). The samples are mixed, proteolytically digested, and run in LC-MS experiments. All leucine-containing peptides appear as "heavy" and "light" peaks, giving relative protein abundance. This elegant method can only be used on cultured cells; it is unsuitable for study of primary material. For ICAT reagent and stable isotope labeling with amino acids in cell culture technologies, the labeled peptides have different masses in an MS scan; this increases the complexity of the MS spectra and means that MS/MS is performed twice on the same peptide (once each for the heavy and light labeled versions), wasting analysis time.
Novel labeling reagents can overcome some of the limitations described above. Isobaric tags for relative and absolute quantitation (iTRAQ) use reagents that enable up to four samples to be analyzed within the same experiment. The labels consist of a protein-reactive group that labels all free amines (i.e. will label at the N terminus of all peptides and also the side chain of internal lysine residues), a balance group, and a reporter group (6). The labels are isobaric, with a different distribution of isotopes between the reporter and balance groups. Hence, each labeled peptide appears at the same mass in an MS scan, but upon fragmentation in the mass spectrometer, the label dissociates and releases the reporter group as a singly charged ion of mass 114.1, 115.1, 116.1, or 117.1, depending on the label. Relative peak area indicates the contribution of each sample to total peptide present, providing a measure of relative abundance. The balance group is also lost, and the remaining peptide fragments, which all have addition of the same mass (i.e. the protein-reactive group), provide data from which to infer the peptide sequence.
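The reporter-ion arithmetic described above can be illustrated with a minimal sketch. The peak areas below are hypothetical values chosen for illustration, and the function names are not taken from any vendor software; the sketch only assumes that a peak area is available for each of the four reporter ions.

```python
# Reporter ion m/z values for the four isobaric tags, as given in the text.
REPORTER_MZ = (114.1, 115.1, 116.1, 117.1)


def reporter_fractions(peak_areas):
    """Return each reporter's share of the summed reporter signal."""
    total = sum(peak_areas[mz] for mz in REPORTER_MZ)
    if total <= 0:
        raise ValueError("no reporter signal in this MS/MS spectrum")
    return {mz: peak_areas[mz] / total for mz in REPORTER_MZ}


def reporter_ratio(peak_areas, numerator, denominator):
    """Ratio of two reporter peak areas, e.g. 117.1 relative to 114.1."""
    return peak_areas[numerator] / peak_areas[denominator]


# Hypothetical peak areas for one labeled peptide (illustration only).
areas = {114.1: 200.0, 115.1: 100.0, 116.1: 200.0, 117.1: 300.0}
fractions = reporter_fractions(areas)
ratio_117_114 = reporter_ratio(areas, 117.1, 114.1)
```

Here each fraction is the contribution of one sample to the total peptide signal, and a pairwise ratio such as 117:114 compares two samples directly.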
The t(5;12) translocation found in chronic myelomonocytic leukemia results in the expression of the leukemogenic tyrosine kinase TEL/PDGFRβ and activation of the PDGFRβ tyrosine kinase domain (7). This stem cell disease has been modeled by expressing TEL/PDGFRβ in a multipotent hematopoietic stem cell line, FDCP-Mix. The effects of oncogenic expression can be subtle yet lead to profound changes in cellular development. Perhaps unlike signaling for proliferation or apoptotic suppression, the appropriate tools for immediate analysis of potential effectors of altered development are not freely available. Herein, we report the validation and use of iTRAQ reagents on the FDCP-Mix TEL/PDGFRβ system as a paradigm for rapid, systematic definition of oncogenic processes using proteomics and the value of iTRAQ in permitting direct comparison of transcriptome data.

EXPERIMENTAL PROCEDURES
Cell Line Preparation and Culture-FDCP-Mix cells were transduced with TEL/PDGFRβ using a murine stem cell retroviral vector as described previously (8). Cells were routinely cultured in Fischer's medium with 20% (v/v) horse serum supplemented with 10 ng/ml IL-3 (R&D Systems, Minneapolis, MN).
Conditions for Inducing Myeloid Differentiation-FDCP-Mix cells were induced to differentiate by two methods as described previously (9). Briefly, cells were cultured in Iscove's modified Dulbecco's medium supplemented with pre-selected fetal calf serum (20% (v/v)) and a combination of recombinant murine granulocyte macrophage-colony-stimulating factor (50 units/ml; Biogen IDEC, Zug, Switzerland), recombinant human macrophage-colony-stimulating factor (5 ng/ml; Amgen Biologicals, Thousand Oaks, CA), and recombinant murine IL-3 (0.1 ng/ml; Calbiochem, Nottingham, UK). Cells were prepared with a Cytospin centrifuge and stained with May-Grünwald-Giemsa, and differential morphology was scored for more than 100 cells per slide.
Western Blotting-Western blotting was carried out with standard protocols using a monoclonal antibody to the kinase domain of PDGFRβ or anti-phosphotyrosine antibodies (BD PharMingen, Oxford, UK). The phosphoprotein content (serine, threonine, and tyrosine) was measured by separation of total cell lysates on 10%T SDS-polyacrylamide gels and staining with Pro-Q Diamond stain (Molecular Probes, Leiden, The Netherlands) per the manufacturer's instructions.
iTRAQ Reagent Labeling-An overview of the workflow is shown in . The strong cation exchange step shown was employed to remove free iTRAQ reagent as well as to fractionate peptides for separate analyses by reversed-phase LC/MS/MS. PROQUANT is a program designed to integrate data from these isobaric tag experiments for relative protein quantification.
200 mM methylmethanethiosulfate in isopropanol and incubation at room temperature for a further 10 min. Protein was then digested by addition of 10 µl of trypsin at 0.5 µg/µl and incubated at 37°C overnight.
To label the peptides with iTRAQ reagent (Applied Biosystems, Warrington, UK), one unit of label (defined as the amount required to label 100 µg of protein) was thawed and reconstituted in 70 µl of ethanol, with vortexing for 1 min. The reagent solution was added to the digest and incubated at room temperature for 1 h. Labeling reactions were then pooled before analysis.
Peptide Fractionation and Mass Spectrometry-To remove excess, unbound iTRAQ reagent and to simplify the peptide mixture before reversed-phase LC-MS/MS, peptides were washed and fractionated off line using a strong cation exchange column (Applied Biosystems). In brief, the peptide mixture was diluted 10-fold in loading buffer (10 mM potassium phosphate in 25% (v/v) acetonitrile, pH 3.0), and the pH was checked to ensure it remained between 2.5 and 3.3. The sample mixture was slowly injected onto the strong cation exchange cartridge and was washed with a further 1 ml of loading buffer to remove salts, tris-(2-carboxyethyl)phosphine, and unincorporated iTRAQ reagent. Samples were eluted from the column using 500-µl volumes of elution buffer (10 mM potassium phosphate in 25% (v/v) acetonitrile) containing increasing concentrations of KCl. Salt concentrations used were 50, 100, 150, 200, 250, 300, 350, and 500 mM. Each salt fraction was then concentrated and dried in a SpeedVac (Thermo Electron, Waltham, MA).
Dried peptide fractions were resuspended in 250 µl of 2% (v/v) acetonitrile/0.1% (v/v) formic acid. For each analysis, 60 µl of the peptide sample was loaded onto a 15-cm reversed phase C18 column (75 µm i.d.) using an UltiMate pump (LC Packings, Amsterdam, The Netherlands) and separated over a 120-min solvent gradient from formic acid on-line to a QSTAR XL mass spectrometer (Applied Biosystems). Data were acquired using an independent data acquisition protocol in which, for each cycle, the two most abundant multiply charged peptides (2+ to 4+) above a 10 count threshold in the MS scan with m/z between 480 and 2000 were selected for MS/MS. Each peptide was selected twice and then dynamically excluded (±50 milli-mass units) for 40 s.
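The precursor selection rules stated above (two most abundant 2+ to 4+ ions above 10 counts, m/z 480 to 2000, each selected twice and then excluded within ±50 milli-mass units for 40 s) can be sketched as follows. This is an illustrative model only, not the instrument's acquisition software; the class and attribute names are hypothetical, and the handling of ties and of the repeat counter after exclusion expires is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float      # precursor m/z
    charge: int    # assigned charge state
    counts: float  # intensity in the MS survey scan


class PrecursorSelector:
    """Top-N precursor picker with repeat counting and dynamic exclusion."""

    def __init__(self, top_n=2, min_counts=10.0, mz_range=(480.0, 2000.0),
                 charges=(2, 3, 4), max_repeats=2, tol=0.050, exclusion_s=40.0):
        self.top_n = top_n
        self.min_counts = min_counts
        self.mz_range = mz_range
        self.charges = charges
        self.max_repeats = max_repeats
        self.tol = tol                  # +/- 50 milli-mass units
        self.exclusion_s = exclusion_s  # exclusion time in seconds
        # Each entry: [mz, times_selected, excluded_until_seconds]
        self._history = []

    def _lookup(self, mz):
        for entry in self._history:
            if abs(entry[0] - mz) <= self.tol:
                return entry
        return None

    def select(self, peaks, now):
        """Return the precursors chosen for MS/MS in the cycle at time `now`."""
        lo, hi = self.mz_range
        eligible = []
        for p in peaks:
            entry = self._lookup(p.mz)
            excluded = entry is not None and now < entry[2]
            if (p.charge in self.charges and p.counts > self.min_counts
                    and lo <= p.mz <= hi and not excluded):
                eligible.append(p)
        chosen = sorted(eligible, key=lambda p: p.counts, reverse=True)[:self.top_n]
        for p in chosen:
            entry = self._lookup(p.mz)
            if entry is None:
                entry = [p.mz, 0, 0.0]
                self._history.append(entry)
            entry[1] += 1
            if entry[1] >= self.max_repeats:
                # Selected twice: reset the counter and start the exclusion window.
                entry[1] = 0
                entry[2] = now + self.exclusion_s
        return chosen
```

Calling `select` once per acquisition cycle with the survey-scan peaks and the current time in seconds returns the precursors that would be fragmented in that cycle under these assumptions.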
Data Analysis-Data were searched against a mouse KBMS3.0 protein database from the Celera Discovery System (Applied Biosystems). The database allowed for iTRAQ reagent labels at N-terminal residues, internal K and Y residues, and the methylmethanethiosulfate-labeled cysteine as fixed modification, plus one missed cleavage. Search parameters within ProQUANT were set with an MS tolerance of 0.15 Da, an MS/MS tolerance of 0.1 Da, and a minimum confidence score of 20. ProQUANT pooled data from all LC-MS runs. Assessment of these parameters for peptide and protein identification is described in Supplemental Table I.
Transcriptome Analysis-RNA from FDCP-Mix and TEL/PDGFRβ FDCP-Mix samples was prepared in triplicate using TRIzol (Invitrogen) and then cleaned using the MinElute RNeasy Clean up kit (Qiagen, Valencia, CA) per the manufacturer's instructions. Transcriptome analysis was undertaken using murine MOE430A Affymetrix chips by the CR-UK Affymetrix microarray facility, Paterson Institute (Manchester, UK). Fold changes were calculated from the scaled data; where appropriate, t-test analysis was applied.

RESULTS
Expression of TEL/PDGFRβ in the Multipotent Hematopoietic Cell Line FDCP-Mix Inhibits Differentiation but Has No Effect on Growth Factor Dependence-FDCP-Mix cells were transduced to express TEL/PDGFRβ (Fig. 2a). Previous experience with Ba/F3 cells transfected with BCR/ABL and TEL/PDGFRβ has shown that TEL/PDGFRβ has comparatively small effects on tyrosine phosphorylation (data not shown). However, TEL/PDGFRβ has small but significant effects on protein tyrosine phosphorylation and also significantly affects total phosphoprotein level (Fig. 2, b and c).
Differentiation-blocked cell lines can become growth factor-independent when expressing TEL/PDGFRβ (10). This did not occur in TEL/PDGFRβ-FDCP-Mix cells. FDCP-Mix cells differentiate to form mature cells when cultured in the appropriate cytokines (11). Expression of TEL/PDGFRβ inhibited this development (Fig. 2d). Culture conditions that induce myeloid development to 100% postmitotic cells (macrophages, neutrophils) in control FDCP-Mix cells gave no such effect in TEL/PDGFRβ-transfected cells. After 20 days in culture, colony-forming clonogenic cells were still present that had primitive myeloid cell or blast cell morphology. This effect on differentiation led us to systematically analyze potential differences in transfected and non-transfected cells using transcriptomic and proteomic methods. To do this, the proteomic method required validation.
Analysis of a Standard Mix of Proteins Using iTRAQ Reagent Labels-A defined six-protein mix that had been enzymatically digested with trypsin was used to confirm the accuracy of ratiometric quantitation of the iTRAQ reagents. The tryptic digest was halved, and each half was labeled with either reagent 116 or 117. These differentially labeled digests were mixed at various ratios (1:1, 2:1, and 1:3) and analyzed by LC-MS/MS. A representative spectrum is shown in Fig. 3a. Relative quantitation of proteins by iTRAQ reagent technology was both accurate and reproducible for five proteins (Fig. 3b). The sixth protein in the standard mixture was not detected with sufficient peptides to allow quantitation. Overall 117:116 ratios of 0.9699, 0.5885, and 3.1748 were obtained. We confirmed that no peptides remained "unlabeled" by analysis of parent ion masses derived for the MS analysis and comparison with theoretical tryptic digests of the six proteins. All isobaric forms of the iTRAQ reagent tag labeled equally efficiently.
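As a worked check of the standard-mix result, the observed 117:116 ratios can be compared with the values expected from the mixing ratios. The sketch below assumes that the mixing ratios 1:1, 2:1, and 1:3 are stated as 116:117, so that the expected 117:116 ratios are 1.0, 0.5, and 3.0; that pairing is an interpretation of the text rather than something it states explicitly.

```python
# (mixing ratio label, expected 117:116 ratio, observed 117:116 ratio)
# Expected values assume the mixing ratios are quoted as 116:117.
mixes = [
    ("1:1", 1.0, 0.9699),
    ("2:1", 0.5, 0.5885),
    ("1:3", 3.0, 3.1748),
]


def percent_deviation(expected, observed):
    """Signed deviation of the observed ratio from the expected ratio, in percent."""
    return 100.0 * (observed - expected) / expected


deviations = {label: percent_deviation(exp, obs) for label, exp, obs in mixes}

for label, value in deviations.items():
    print(f"{label}: {value:+.1f}% from expected")
```

Under this assumption the observed ratios fall within roughly 3%, 18%, and 6% of the expected values, respectively.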
Identification of TEL/PDGFRβ-induced Alterations in the Proteome-The potential value of the system was further examined on lysates from mammalian FDCP-Mix cells and FDCP-Mix cells transduced with the TEL/PDGFRβ leukemic oncogene. In these experiments, microarray data were available for comparison from the FDCP-Mix cells described above. The experimental design is outlined in Fig. 1. All four labels were used, allowing the use of internal controls and replicates within the same LC-MS experiment for this paradigm study. This experimental design allowed the accuracy of quantitation to be verified by including a 1:1 ratio of control sample (114:116) and a 2:1 ratio of the TEL/PDGFRβ FDCP-Mix cell (117:115), as shown in Fig. 1. Furthermore, an internal replicate for the comparison of TEL/PDGFRβ-expressing and non-expressing cells is thus provided. The experiment was run in triplicate. Labeling efficiency with iTRAQ reagents was high (>99%), and both intra- and interexperiment quantitation was highly reproducible, with overall ratios of 1.064, 0.967, and 1.093 when samples were mixed 1:1 and 2.127, 2.128, and 2.217 when samples were mixed 2:1. It is noteworthy that ratios detected were normally distributed; therefore, statistical tests such as Student's t test can be applied. In addition, replication within the experiment achieved an acceptable standard, with an average 116:114 ratio of 1.072 32 ± 0.07 (S.D.; n = 3) (Fig. 4a), and the replicate FDCP-Mix versus FDCP-Mix-TEL/PDGFRβ analyses (117:114 versus 117:116) had a correlation coefficient of 0.93, confirming the reproducibility of the technique. Data for individual peptides and proteins were reproducible across replicate experiments (Fig. 4b).
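The reproducibility figures quoted above are summary statistics over replicate ratios. A minimal sketch of how such summaries can be computed is given below, using the overall 1:1 and 2:1 ratios reported in the text. The use of the sample standard deviation (n - 1 denominator) and of Pearson's coefficient is an assumption; the text does not state which estimators were used.

```python
import math
import statistics


def summarize(ratios):
    """Mean and sample standard deviation (n - 1) of replicate ratios."""
    return statistics.mean(ratios), statistics.stdev(ratios)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Overall ratios reported in the text for the three replicate experiments.
one_to_one = [1.064, 0.967, 1.093]
two_to_one = [2.127, 2.128, 2.217]

mean_1, sd_1 = summarize(one_to_one)
mean_2, sd_2 = summarize(two_to_one)
```

The `pearson` function could be applied to paired per-protein ratios (for example 117:114 against 117:116) to obtain a correlation coefficient of the kind reported; those per-protein values are not reproduced here.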
In a typical experiment, 1120 peptides from 347 proteins were found. Analysis of the ratios of identified proteins showed significant and reproducible differences between samples in a total of 13 proteins (Fig. 4c and Table I). It is noteworthy that such a plot shows an extremely tight clustering around the 1:1 ratio value, indicative of few changes and a robust, reproducible technique. Examples of the spectra produced by iTRAQ reagent-labeled peptides are shown in Fig. 5, a-c, where peptides showing no change between samples, down-regulation, or up-regulation by TEL/PDGFRβ expression are illustrated.
Comparison of iTRAQ™ Reagent Data to cDNA Microarray-Relatively few changes are seen in cDNA microarrays from TEL/PDGFRβ-transfected and control FDCP-Mix cells, reflecting the relatively small effect of this oncogene on protein tyrosine phosphorylation. This paucity of change in the transcriptome led us to confirm changes using proteomics. Analysis of microarray data revealed that, with a 1.5-fold change cut-off, a total of 105 transcripts show a significant increase with TEL/PDGFRβ expression, and 159 transcripts show a decrease (see Supplemental Table I).
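The 1.5-fold cut-off described above can be expressed as a simple classification of oncogene/control fold changes. The sketch below is illustrative: the signal values are hypothetical, the cut-off is treated as inclusive by assumption, and the significance test that accompanied the cut-off in the analysis is not reproduced.

```python
import statistics


def fold_change(control, treated):
    """Fold change (treated/control) from replicate scaled signal values."""
    return statistics.mean(treated) / statistics.mean(control)


def classify(fc, cutoff=1.5):
    """Label a fold change relative to a symmetric cut-off (inclusive by assumption)."""
    if fc >= cutoff:
        return "increased"
    if fc <= 1.0 / cutoff:
        return "decreased"
    return "unchanged"


# Hypothetical triplicate signals for one transcript (illustration only).
control_signal = [100.0, 110.0, 90.0]
oncogene_signal = [160.0, 170.0, 150.0]

fc = fold_change(control_signal, oncogene_signal)
label = classify(fc)
```

A transcript would be counted as changed only if it also passed the statistical test, which would need to be applied separately to the replicate values.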
The data derived permitted comparison of individual transcript/protein levels. All of the differences identified by iTRAQ reagents (Table I) (Table I). The cDNA array data from these 100 proteins con-

TABLE I
The relative quantification of protein levels and mRNA levels in FDCP-Mix cells and TEL/PDGFRβ-FDCP-Mix cells
For comparison, 100 proteins (randomly selected proteins plus those in which changes between the two cell lines were seen using the isobaric tag quantification approach) were selected. Random selection was made via search databases (generating the accession codes shown: emb, EMBL; gb, GenBank; pir, Protein Information Resource; rf, RefSeq (NCBI); spt, Swiss-Prot; trm, TrEMBL). The total peptides column refers to the number of peptides from that protein fragmented within three replicate experiments. The relative ratios for each protein were calculated as a weighted value combining averages from three replicate experiments. Ratios shown are: average control/control, an internal control, in which the same amount of the same lysate has been compared (ratio of 116:114, see Fig. 2b); average (1) oncogene/control, the ratio of protein in the TEL/PDGFRβ-transfected cells vs. control FDCP-Mix cells containing data from comparison of 117:114 iTRAQ label relative quantities; average (2) oncogene/control, containing data from comparison of 117:116 ratios. Also shown is the fold change in the mRNA level (-fold cDNA) for that transcript/protein (oncogene/control) according to microarray analysis. Sequences whose expression is significantly altered (p < 0.05) at the mRNA level are shown in italics.
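The legend states that each protein ratio is a weighted value combining averages from three replicate experiments, without giving the weighting scheme. One plausible reading, shown here purely as an assumption, weights each experiment's average ratio by the number of peptides that contributed to it. The input values are hypothetical.

```python
def weighted_ratio(experiments):
    """Combine per-experiment average ratios into one protein-level ratio.

    `experiments` is a list of (average_ratio, n_peptides) pairs. Weighting
    by peptide count is an assumed scheme, not one confirmed by the source.
    """
    total_peptides = sum(n for _, n in experiments)
    if total_peptides == 0:
        raise ValueError("no peptides quantified for this protein")
    return sum(ratio * n for ratio, n in experiments) / total_peptides


# Hypothetical protein quantified in three replicate experiments.
replicates = [(0.52, 4), (0.48, 6), (0.55, 2)]
combined = weighted_ratio(replicates)
```

Under this scheme an experiment in which more peptides were fragmented contributes proportionally more to the combined ratio.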

changes (subtle, albeit statistically significant) in the proteome but not in the transcriptome. These included a 60S ribosomal protein subunit, aldo-keto reductase family member C13, and eukaryotic translation initiation factor 5A. This initial scan of the value of the iTRAQ approach therefore reveals that it will offer advantages over transcriptome analysis. Furthermore, specific proteins identified using iTRAQ LC-MS/MS were not assayed within the microarray. These included Filamins A and B, and bifunctional purine biosynthesis protein PURH. This assessment was confirmed after searching using alternative protein names, gene names, and accession numbers. iTRAQ therefore offers objective, non-selective sample analysis.
In the five changes showing consistency at transcript and protein level (cathepsin G, mast cell protease 8, myeloperoxidase precursor, protein disulfide isomerase, and L-plastin), the -fold changes in protein were remarkably similar to the -fold changes detected in their mRNA levels. The iTRAQ reagent and cDNA microarray data sets also agree in that most of the changes identified are subtle (2-fold or less).

DISCUSSION
iTRAQ reagent technology is a newly developed method for relative quantification of proteins from up to four samples. It has immense potential to improve the sensitivity and quality of mass spectrometric analysis of the proteome. Validation of the approach is reported here using a defined protein mixture and cell lysates from a disease model. iTRAQ reagent labels all peptides at their N terminus, along with free amines in lysine side chains; hence all of the peptide population is labeled, allowing more peptides to be quantified from each protein and increasing the quality of the data obtained by this approach. We have demonstrated that it is possible, with both standard proteins and whole-cell lysates, to label four peptide mixtures to completion and, using LC-MS/MS, to identify the relative quantities of the peptide emanating from four samples. The labeling protocol for iTRAQ reagents is simple, with few steps, so there are few opportunities for sample loss or contamination. Another advantage of the iTRAQ reagent approach is that the peptides from all samples appear as one peak in MS, increasing the total ion current for that peptide, simplifying the spectra, and requiring only one MS/MS experiment per peptide.
Up to 347 proteins were identified in a typical first-pass analysis of a whole-cell lysate. By comparison, 400 µg of protein in a DIGE experiment yielded approximately 100 confirmed protein identities using the same mass spectrometer and LC parameters for peptide sequencing. However, the number of identified proteins in our experiments is lower than that reported for other LC-MS/MS approaches. Overcoming this relative lack of sensitivity is straightforward. Standard protein identification protocols are designed to select the precursor ion with the greatest number of counts in each MS scan for MS/MS. A deeper penetration into the proteome, gaining quantification on lower abundance proteins, is achievable by a combination of pre-enrichment strategies (for examples of organelles, see Dreger (12)) and/or the generation of "exclude lists," which instruct the mass spectrometer to ignore specific "high abundance" ions (defined by both mass and chromatographic retention time). Alternatively, the MS/MS experiment can be designed such that the ions selected for MS/MS are the lowest two (above a predetermined threshold) in the MS spectra, rather than the highest. Preliminary investigation of this "bottom-up" selection protocol on one of the FDCP-Mix samples increased the number of proteins identified by a further 50% (data not shown). Because only a fraction of the sample is used in a single LC-MS/MS run, using several thresholds/sample loads should maximize the number of peptide/protein identifications from each sample. Therefore, global proteomic analysis or more focused procedures can be achieved using iTRAQ reagents and multiple subcellular fractions, multidimensional chromatography, plus ion exclusion lists in iTRAQ reagent-based experiments.
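The two strategies discussed above, exclude lists and "bottom-up" selection of the lowest ions above a threshold, can be sketched as variations on the same selection step. The sketch is illustrative only; the function names, the m/z and retention time tolerances, and the example ions are hypothetical and are not taken from the instrument software.

```python
def on_exclude_list(mz, rt, exclude, mz_tol=0.05, rt_tol=60.0):
    """True if an ion matches an exclude-list entry by m/z and retention time.

    `exclude` holds (mz, retention_time_s) pairs; both tolerances are
    hypothetical values chosen for illustration.
    """
    return any(abs(mz - emz) <= mz_tol and abs(rt - ert) <= rt_tol
               for emz, ert in exclude)


def select_precursors(ions, rt, threshold=10.0, n=2,
                      lowest_first=False, exclude=()):
    """Pick up to `n` ions above `threshold` from one MS scan.

    `ions` is a list of (mz, counts) pairs. With lowest_first=True the least
    abundant eligible ions are chosen ("bottom-up" selection); otherwise the
    most abundant are chosen, as in a standard protocol.
    """
    candidates = [(mz, counts) for mz, counts in ions
                  if counts > threshold and not on_exclude_list(mz, rt, exclude)]
    candidates.sort(key=lambda ion: ion[1], reverse=not lowest_first)
    return candidates[:n]


# Hypothetical survey scan at a retention time of 1210 s.
scan = [(500.0, 900.0), (600.0, 40.0), (700.0, 15.0), (800.0, 5.0), (900.0, 300.0)]
standard = select_precursors(scan, rt=1210.0)
bottom_up = select_precursors(scan, rt=1210.0, lowest_first=True)
with_exclusion = select_precursors(scan, rt=1210.0, exclude=[(500.0, 1200.0)])
```

In this example the standard protocol picks the two most intense ions, the bottom-up variant picks the two weakest ions above threshold, and the exclude list redirects selection away from a known high-abundance ion.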
Several proteins are differentially expressed in the presence of TEL/PDGFRβ. Heterogeneous nuclear ribonucleoprotein D, shown to be increased by TEL/PDGFRβ expression, is an mRNA binding protein that has been implicated in tumorigenesis (13) and that is a target of the leukemogenic oncogene BCR/ABL (14). In our study TEL/PDGFRβ expression decreased Hsc70 levels, whereas previous studies have shown that increased expression of Hsc70 inhibits transformation (15). Myeloperoxidase protein, decreased by TEL/PDGFRβ, is expressed in early myeloid progenitors (16); therefore, its decreased levels in the differentiation-blocked TEL/PDGFRβ-expressing cells may provide a clue as to the mechanism for this block. Myeloperoxidase expression is regulated by transcription factors such as Pu.1 and the C/EBP family (17). These proteins regulate myelopoiesis, and loss of either Pu.1 or C/EBPα leads to compromised ability to produce mature cells (18,19). C/EBPα transcript levels are decreased 1.5-fold (p = 0.017) in TEL/PDGFRβ-transfected cells, although Pu.1 and other C/EBP family members are unchanged at the transcript level. Thus, the data we have derived allow further experiments on the mechanism of differentiation blockade in these cells. These can include pre-enrichment and selective searching for ions from C/EBPα in our iTRAQ reagent experiments to allow relative quantitation of transcription factor levels. In addition, cathepsin G, which is also reduced in expression, is a serine protease highly expressed in promyelocytes (20,21); it has a role in hematopoietic stem cell mobilization and differentiation and so may also play a role in the TEL/PDGFRβ-mediated differentiation block. Mutations in such proteases have been implicated in neutropenia (22,23). Perhaps even more relevant is that cathepsin L has been shown to locate to the nucleus and regulate transcription via a proteolytic mechanism (24). Cathepsin G may have a similar function.
Comparison of this data set with a cDNA microarray data set from the same cell line provides relatively high levels of agreement between transcripts and protein level. All five of the changes from the 100-transcript sample set were also detected by iTRAQ; of these, all showed a similar level of change. However, the iTRAQ reagent approach identified changes in proteins that are caused by post-transcriptional effects, in that no change is seen in the levels of mRNA. A comparison of this type was previously almost impossible, because stable isotope or gel-based approaches tend to focus on identifying proteins whose expression changes, rather than the relative abundances of all proteins in a sample.
In conclusion, we have shown that iTRAQ protein labeling reagents can be employed to successfully identify proteins in which expression is potentially modified. This has the advantage of using multiple samples in a single LC-MS/MS run. The iTRAQ reagent produced high quality, reproducible data regarding relative expression levels in up to four samples. Comparison of the iTRAQ reagent data with cDNA microarray data suggests a high degree of similarity; all changes in a subset of the cDNA microarray are replicated by iTRAQ reagent analysis. However, iTRAQ reagent experiments also defined several other changes not detected by the cDNA array. iTRAQ reagent technology has great value as a new method for relative quantification of proteins in enriched complexes, organelles, and whole cell lysates.

* This work was supported by the Leukemia Research Fund and Biotechnology and Biological Sciences Research Council UK. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The on-line version of this article (available at http://www.jbc.org) contains Supplemental Table I.
‡‡ To whom correspondence should be addressed. Tel.: 0161-306-4182; Fax: 0161-236-0409; E-mail: awhetton@picr.man.ac.uk.