To whom correspondence should be addressed: Proteomics and Functional Genomics Group, Faculty of Veterinary Science, University of Liverpool, Crown St., Liverpool L69 7ZJ, UK. Tel.: 44-151-794-4312; Fax: 44-151-794-4243;
* This work was supported by Biotechnology and Biological Sciences Research Council Grant BB/C007433/1 (to R. J. B. and S. J. G.) and by Genus plc. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Stable isotope-labeled proteotypic peptides are used as surrogate standards for absolute quantification of proteins in proteomics. However, a stable isotope-labeled peptide has to be synthesized, at relatively high cost, for each protein to be quantified. To multiplex protein quantification, we developed a method in which gene design de novo is used to create and express artificial proteins (QconCATs) comprising a concatenation of proteotypic peptides. This permits absolute quantification of multiple proteins in a single experiment. This complete study was constructed to define the nature, sources of error, and statistical behavior of a QconCAT analysis. The QconCAT protein was designed to contain one tryptic peptide from 20 proteins present in the soluble fraction of chicken skeletal muscle. Optimized DNA sequences encoding these peptides were concatenated and inserted into a vector for high level expression in Escherichia coli. The protein was expressed in a minimal medium containing amino acids selectively labeled with stable isotopes, creating an equimolar series of uniformly labeled proteotypic peptides. The labeled QconCAT protein, purified by affinity chromatography and quantified, was added to a homogenized muscle preparation in a known amount prior to proteolytic digestion with trypsin. As anticipated, the QconCAT was completely digested at a rate far higher than the analyte proteins, confirming the applicability of such artificial proteins for multiplexed quantification. The nature of the technical variance was assessed and compared with the biological variance in a complete study. Alternative ionization and mass spectrometric approaches were investigated, particularly LC-ESI-TOF MS and MALDI-TOF MS, for analysis of proteins and tryptic peptides. QconCATs offer a new and efficient approach to precise and simultaneous absolute quantification of multiple proteins, subproteomes, or even entire proteomes.
As the field of proteomics matures as a discipline, there is an increasing realization of the importance of absolute as well as relative quantification, and considerable effort is being directed toward experimental strategies to achieve this goal. Most commonly, relative protein quantification by mass spectrometry has been based on differential stable isotope labeling implemented by metabolic incorporation (
) avoids the use of stable isotopes but requires assumptions concerning mass spectrometric response factors. To achieve relative quantification of proteins without isotope labeling or chemical modification steps, quantitative comparisons have been made of equivalent sets of mass spectrometric data by considering peptide detectability in repetitively acquired spectra or by comparing integrated extracted ion chromatograms following liquid chromatography-mass spectrometry analyses (
In principle, any of the approaches adopted for relative quantification may also be used for absolute quantification if reference standards are available for all analytes in known amounts. When unknowns and reference standards are co-analyzed, such approaches exploit the well established precept in analytical chemistry of internal standardization in which a known amount of a stable isotope-labeled (or otherwise differentiated) standard is added to the analyte such that the response ratio between analyte and the heavier internal standard can then be used to quantify the unknown. However, for quantification of individual proteins in a proteomics study, the true internal standard would be the corresponding protein expressed in pure and stable isotope-labeled form and quantified. This would be challenging on many fronts, including the expression of a native protein in a heterologous system to effect labeling, purification of the protein, and subsequent mass spectrometric analysis of the complex isotopic profile of the analyte and standard protein. Rather than adopt a protein-based approach, absolute quantification using proteotypic peptides as surrogates for the protein of interest has emerged using stable isotope-labeled peptide internal standards as “signature” or “proteotypic” peptides that are chemically synthesized and incorporate stable isotopes (
). Each protein to be quantified requires at least one stable isotope-labeled peptide that must be independently synthesized at relatively high cost. Moreover each peptide must be separately purified and quantified (
). In brief, artificial genes are designed de novo to direct the synthesis of novel proteins that are assemblies of signature Qpeptides derived from a number of discrete proteins. Usually these Qpeptides are arginine or lysine terminated at the C terminus as they represent and will be internal standards for tryptic peptides derived from digestion of the analyte proteins. Appropriately flanked with added features including an initiator codon, a purification tag, and protective sacrificial regions, the gene is transformed into and expressed in a heterologous system, usually bacterial. The expression strain is grown in a chemically defined medium, uniformly isotopically labeled (for example, using 15NH4Cl as the sole nitrogen source) or containing specific stable isotope-labeled amino acids at a high isotope enrichment such that the artificial protein becomes fully labeled. The artificial protein (termed a “QconCAT” for “quantification concatamers”) is purified by virtue of the affinity tag and quantified using a suitable procedure (
). The QconCAT is mixed with a complex mixture of analyte proteins, and subsequent proteolysis releases both the stable isotope-labeled standard and the cognate peptide from the analyte. The known quantity of standard added can then be used for absolute quantification of the analyte. Because quantification of the QconCAT protein will define in absolute terms the quantity of each of the surrogate peptides, the QconCAT strategy provides an efficient means to multiplex absolute quantification. Tryptic peptides are typically 10–15 amino acids long; thus proteotypic Qpeptides from 50 proteins could be encoded in a protein comprising 500–750 amino acids. The Qpeptides are present, by design, in stoichiometrically known amounts (usually equimolar) so that each analyte peptide (and therefore protein) is simultaneously quantified.
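The arithmetic behind this strategy can be sketched in a few lines. The following is a minimal illustration, not the authors' software: it assumes complete digestion of both standard and analyte, one Qpeptide copy per QconCAT molecule, and one proteotypic peptide per analyte protein molecule. The function names and the example intensities are hypothetical.

```python
def absolute_amount(analyte_intensity: float,
                    standard_intensity: float,
                    standard_nmol: float) -> float:
    """Analyte amount (nmol) from the light:heavy peptide ion intensity ratio.

    Assumes the labeled standard and the unlabeled analyte peptide have
    identical response factors, as expected for isotopologues.
    """
    if standard_intensity <= 0:
        raise ValueError("standard intensity must be positive")
    return standard_nmol * analyte_intensity / standard_intensity


def concentration_nmol_per_g(analyte_nmol: float, tissue_g: float) -> float:
    """Express the amount relative to the mass of tissue it came from."""
    if tissue_g <= 0:
        raise ValueError("tissue mass must be positive")
    return analyte_nmol / tissue_g


def qconcat_length(n_proteins: int, residues_per_peptide: int) -> int:
    """Length of the concatenated Qpeptides, ignoring tags and flanking regions."""
    return n_proteins * residues_per_peptide


if __name__ == "__main__":
    # Hypothetical intensities: analyte peak 1.5x the standard peak,
    # with 0.015 nmol of QconCAT added.
    nmol = absolute_amount(1500.0, 1000.0, 0.015)
    print(f"analyte: {nmol:.4f} nmol")
    # 50 Qpeptides of 10-15 residues each give the 500-750 residue range.
    print(qconcat_length(50, 10), qconcat_length(50, 15))
```

Because every Qpeptide is released from the same QconCAT molecule, a single value of `standard_nmol` serves for all analytes in the experiment, which is what makes the approach multiplexed.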
Qpeptides are concatenated in the QconCAT protein out of their normal primary sequence context, and it is formally correct to point out that this different context could influence the quantification data (
). However, this can only occur if either the QconCAT or the analyte proteins are incompletely digested such that the yield of each peptide is incomplete. It is generally accepted that for most general proteases, such as trypsin, the KM (Michaelis constant) for proteins and peptides is relatively high, and proteolytic reactions operating at substrate concentrations below this value exhibit pseudo-first order kinetics (
). Thus, if the rate of digestion of either the QconCAT or analyte was so low that six or seven reaction half-times could not elapse during the proteolytic reaction, there might be discordance between the yield of the standard and analyte peptide. However, the main determinant of the rate of proteolysis of native proteins is higher order structure, not primary sequence context. Tightly folded proteins, particularly those with a high proportion of β sheet, are intrinsically resistant to proteolysis (
). There is no reason, a priori, to expect that QconCATs would adopt such tightly folded structures. Indeed their propensity to form insoluble inclusion bodies and their recovery by dissolution in strong chaotropes both militate against structural impediments to proteolysis. By contrast, unless care is taken in the prior denaturation of analyte proteins, their higher order structure would almost certainly influence proteolysis and could impact absolute quantification. We stress, however, that incomplete analyte digestion is as much an issue for quantification using synthetic peptides as for quantification using QconCATs. We address the issue of QconCAT and analyte proteolysis here and show that it is a factor that is readily controlled.
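The criterion of six or seven reaction half-times follows directly from pseudo-first order kinetics, under which the fraction of substrate remaining halves with each half-time. The sketch below simply evaluates that relationship; the function names are illustrative only.

```python
import math


def fraction_digested(n_half_times: float) -> float:
    """Fraction of substrate consumed after n half-times of a first-order reaction."""
    return 1.0 - 0.5 ** n_half_times


def half_times_needed(target_fraction: float) -> float:
    """Number of half-times required to reach a target fractional completion."""
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target fraction must lie in [0, 1)")
    return math.log(1.0 - target_fraction) / math.log(0.5)


if __name__ == "__main__":
    for n in (1, 3, 6, 7):
        print(f"{n} half-times: {100 * fraction_digested(n):.1f}% digested")
    print(f"99% completion needs {half_times_needed(0.99):.2f} half-times")
```

Six half-times correspond to about 98.4% completion and seven to about 99.2%, so a reaction that cannot reach this point within the digestion period leaves a measurable shortfall in peptide yield.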
Deployment of a QconCAT experiment has many aspects that must be optimized. We demonstrate the use of a QconCAT for absolute quantification of a group of proteins that demonstrate dramatic changes in expression during development of skeletal muscle in the chicken posthatching. We assessed the scope of the method and the magnitude and sources of variance that the method contains. We confirmed the value of guanidination (
) as a strategy to enhance peptide ion yields in MALDI-TOF MS and showed that effective quantification is attainable and equivalent in both MALDI-TOF and ESI-TOF analyses.
Materials and Reagents—
Trypsin (sequence grade) was obtained from Roche Diagnostics. All other chemicals and solvents (HPLC grade) were purchased from Sigma-Aldrich and VWR International Laboratory Supplies (Leicestershire, UK).
Proteomics Analysis of Chicken Skeletal Muscle Soluble Fraction—
Chickens (Institut de Sélection Animale (ISA) Brown layer and Ross 308 broiler) were grown to 30 days posthatch, and animals were culled at 1, 3, 5, 10, 20, and 30 days at which time pectoralis muscle was collected (the above procedures were performed at the Roslin Institute, Edinburgh, UK). To isolate the soluble fraction of chicken skeletal muscle, 100 mg of breast tissue was homogenized in 0.9 ml of 20 mM sodium phosphate buffer, pH 7.0, containing protease inhibitors (Complete protease inhibitors, Roche Applied Science). The homogenized sample was centrifuged at 15,000 × g for 45 min at 4 °C. The supernatant fraction, containing soluble protein, was then removed. This was repeated, homogenizing the insoluble fraction in the same volume of sodium phosphate, and the pooled supernatant fractions were used for all analyses. The total protein concentration of each preparation was measured using a Coomassie Plus Protein Assay (Pierce).
The abbreviations used are: 1D, one-dimensional; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; AK, adenylate kinase; CV, coefficient of variance.
For SDS-PAGE analysis, 10 μg of soluble protein samples (volume, 5–10 μl) from birds of different strains and ages were each mixed with an equal volume of reducing sample buffer (1 ml of 0.5 M Tris buffer, pH 6.8, 1 ml of glycerol, 0.02 g of SDS, 0.01 g of bromphenol blue, 0.154 g of DTT) and resolved by 12.5% (w/v) SDS-PAGE prior to staining with Coomassie Blue (Bio-Safe, Bio-Rad). Gels were destained with 10% (v/v) acetic acid, 10% (v/v) methanol.
) was expressed in Escherichia coli with a full complement of unlabeled amino acids or in the presence of [13C6]lysine (100 mg/liter) and [13C6]arginine (100 mg/liter) as the sole source of these two amino acids. Expression was induced with isopropyl β-D-thiogalactopyranoside, and the cells were harvested by centrifugation at 1400 × g at 4 °C for 15 min. Inclusion bodies containing QconCAT (as proven by digestion with trypsin and MALDI-TOF MS analysis; data not shown) were recovered by breaking cells using BugBuster Protein Extraction Reagent (Novagen, Nottingham, UK). Inclusion bodies were resuspended in 20 mM phosphate buffer, 6 M guanidinium chloride, 0.5 M NaCl, 20 mM imidazole, pH 7.4. From this solution, [13C6]lysine/arginine-labeled and unlabeled QconCAT proteins were purified separately by affinity chromatography using a nickel-based resin (HisTrap HP kit, Amersham Biosciences). Following sample loading, HisTrap columns were washed with 20 mM phosphate buffer, pH 7.4, prior to elution of the sample with the same buffer containing a higher concentration of imidazole (20 mM phosphate, 0.5 M NaCl, 500 mM imidazole, 6 M guanidinium chloride, pH 7.4) during which phase fractions (1 ml) were collected. The purified QconCAT was desalted by three rounds of dialysis against 100 volumes of 10 mM ammonium bicarbonate, pH 8.5, for 2 h using fresh buffer each time.
Proteomics Analysis of QconCAT for Quantification of Chicken Skeletal Muscle Proteins—
The QconCAT protein was diluted to 5 μM in 50 mM ammonium bicarbonate and digested with trypsin (20:1 substrate:protease) at 37 °C for 24 h after which the digest was incubated with additional trypsin (20:1 substrate:protease) to ensure complete digestion. Peptides were analyzed by MALDI-TOF MS (M@LDI, Waters, Manchester, UK). For this, 1 μl of digested material was mixed with an equal volume of α-cyanohydroxycinnamic acid in 50% (v/v) acetonitrile, 0.1% (v/v) trifluoroacetic acid. This was allowed to dry, and peptides were acquired over the mass range 900–3000 m/z. For each combined spectrum, 20–30 spectra were acquired (laser energy typically 30%) with 10 shots per spectrum and a laser firing rate of 5 Hz. Data were processed using MassLynx software to subtract background noise using polynomial order 10 with 40% of the data points below this polynomial and a tolerance of 0.01. Spectral data were also smoothed by performing two mean smooth operations with a window of three channels.
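The smoothing step can be pictured as a repeated moving average. The sketch below is a generic illustration of two passes of a three-channel mean filter and is not a description of the MassLynx implementation; in particular, the handling of the first and last channels (a truncated window) is an assumption made here for simplicity.

```python
from typing import List, Sequence


def mean_smooth(intensities: Sequence[float],
                window: int = 3,
                passes: int = 2) -> List[float]:
    """Apply a centered moving-average filter `passes` times.

    At the edges the window is truncated to the channels that exist,
    which is an assumption of this sketch.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number of channels")
    half = window // 2
    data = [float(v) for v in intensities]
    for _ in range(passes):
        smoothed = []
        for i in range(len(data)):
            segment = data[max(0, i - half):min(len(data), i + half + 1)]
            smoothed.append(sum(segment) / len(segment))
        data = smoothed
    return data


if __name__ == "__main__":
    # A single-channel spike is spread over its neighbors by each pass.
    print(mean_smooth([0, 0, 3, 0, 0], passes=1))
    print(mean_smooth([0, 0, 3, 0, 0], passes=2))
```

Each pass lowers and broadens sharp features, so the same processing must be applied to analyte and standard peaks if their intensity ratio is to remain comparable.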
Co-digestion of QconCAT and Chicken Skeletal Muscle Soluble Proteins for Quantification—
QconCAT protein was added in a 1:10 (QconCAT:chicken skeletal muscle protein) ratio to chicken skeletal muscle soluble fraction samples taken from both broiler and layer strains at six time points during growth. For each time point, four birds were analyzed. The mixture was diluted 10-fold with 50 mM ammonium bicarbonate, and 10% (v/v) acetonitrile was added prior to addition of trypsin (20:1 substrate:protease). The reaction mixture was incubated at 37 °C for 24 h, after which the digest was incubated with additional trypsin (20:1 substrate:protease) to ensure complete digestion. 1 μl was analyzed by MALDI-TOF MS.
Monitored Proteolysis of QconCAT and Analyte Proteins—
For QconCAT digestion, 150 μg of protein was digested with trypsin at a ratio of trypsin:protein of 1:20 and 1:100 and stopped at selected time points after addition of enzyme by removing 15 μl (containing 3 μg of protein) and adding to an equal volume of 10% (v/v) formic acid. For analyte protein digestion, 50 μg of protein was digested with trypsin at a ratio of trypsin:protein of 1:20 and stopped at 0 min, 30 min, and 24 h after addition of enzyme by removing 25 μl (containing 6 μg of protein) and adding an equal volume of 10% (v/v) formic acid. The fractions were subsequently stored at −20 °C until the end of the time course. For gel electrophoresis, fractions were dried down in a vacuum centrifuge and reconstituted in 10 μl of reducing sample buffer prior to separation by 12.5% (w/v) 1D SDS-PAGE at 200 V for 45 min. Analyte proteins were also digested in a solution containing 10% ACN (v/v) and with addition of enzyme following a 1-h incubation of the protein at 60 °C. To quantify proteolysis of analyte proteins, digestion of chicken skeletal muscle soluble proteins in solution with trypsin (as described above) was stopped at various time points during 24-h incubation at 37 °C by removing 20 μl (containing 5 μg of protein) and adding an equal volume of 10% (v/v) formic acid containing 0.5 μg of predigested QconCAT peptides. Each fraction was analyzed by MALDI-TOF MS. This experiment was repeated using protein denatured by incubation at 60 °C for 1 h prior to trypsin addition for comparison.
To enhance the signal intensity of lysine-terminated peptides in MALDI-TOF MS, lysine residues were converted to the more basic homoarginine by guanidination (20). This reaction was carried out by drying down the peptide mixture and reconstituting it in 10 μl of 7 M ammonia solution to which was added 5 μl of 0.5 M O-methylisourea (in double distilled H2O). This was mixed thoroughly and incubated overnight at room temperature prior to drying down and desalting using C18 ZipTips (Millipore, Watford, UK).
Peptide mixtures were analyzed by LC-ESI-Q-TOF MS using an EASY-nLC (Proxeon, Odense, Denmark) nanoflow system coupled to a Q-Tof micro (Waters). Nanoflow HPLC at 200 nl/min was used to resolve peptides (in 0.1% (v/v) formic acid) over a 50-min acetonitrile gradient (0–100%). Peptides were acquired over the mass range 400–2000 m/z with the capillary voltage set at 1900 V, collision energy set at 10 V, and sample cone set at 55 V for the entire 50-min gradient. The same reversed phase separation method was used to collect fractions (200 nl) directly onto a MALDI-TOF target for analysis by LC-MALDI-TOF MS.
Assessing Analytical and Biological Variance in Quantification—
Ten identical aliquots of a chicken skeletal muscle soluble protein preparation, to each of which was added a known amount of [13C6]arginine/[13C6]lysine-labeled QconCAT, were digested in solution with trypsin and analyzed to investigate analytical variance. This was compared with biological variance (four animals at each time point) achieved through quantification by MALDI-TOF MS (with and without guanidination), LC-ESI-Q-TOF MS, and LC-MALDI-TOF MS.
Comparison of QconCAT Method with Absolute Quantification Using a Stable Isotope-labeled Synthetic Peptide—
Quantification by the QconCAT method was also compared with that achieved using a stable isotope-labeled synthetic peptide to quantify a single analyte protein also represented in the QconCAT. The peptide of sequence LVSWYDNEFGYSNR and mass 1748.77 Da representing the abundant protein GAPDH was synthesized by Sigma-Genosys (Dorset, UK) and was labeled at the arginine residue with both 13C6 and 15N4 giving a 10-Da mass offset from the analyte peptide. For quantification, the synthetic peptide was added to broiler chicken skeletal muscle samples corresponding to six time points during growth with four replicate animals at each time point. Quantification data were obtained from analysis by MALDI-TOF MS using the relative intensities of the analyte and standard peaks as with QconCAT analysis.
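The mass offsets between analyte and standard peptides follow from the isotopic mass differences of the substituted atoms. The sketch below computes them from standard monoisotopic mass differences (13C minus 12C, 15N minus 14N); the function names are illustrative and the calculation assumes complete isotopic enrichment at every labeled position.

```python
# Monoisotopic mass differences in Da between heavy and light isotopes.
DELTA_13C = 1.0033548  # 13C - 12C
DELTA_15N = 0.9970349  # 15N - 14N


def label_mass_shift(n_13c: int, n_15n: int = 0) -> float:
    """Mass added to a peptide by replacing n_13c carbons and n_15n nitrogens."""
    return n_13c * DELTA_13C + n_15n * DELTA_15N


def mz_shift(n_13c: int, n_15n: int = 0, charge: int = 1) -> float:
    """Spacing on the m/z axis between light and heavy ions of a given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return label_mass_shift(n_13c, n_15n) / charge


if __name__ == "__main__":
    # [13C6]Arg or [13C6]Lys, as used to label the QconCAT.
    print(f"13C6: {label_mass_shift(6):.4f} Da")
    # [13C6,15N4]Arg, as in the synthetic GAPDH peptide.
    print(f"13C6 15N4: {label_mass_shift(6, 4):.4f} Da")
    # Doubly charged ions, as observed by electrospray, are spaced by half as much.
    print(f"13C6 at 2+: {mz_shift(6, charge=2):.4f} m/z")
```

With these offsets the QconCAT-derived peptide sits about 6 Da and the synthetic peptide about 10 Da above the analyte, so the two standards can be distinguished from each other as well as from the analyte in the same spectrum.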
Investigation of the Accuracy of Quantification Using QconCAT—
Purified adenylate kinase (AK; Sigma) was added to chicken skeletal muscle soluble fraction from a 30-day broiler. AK was added from 0 to 0.02 nmol resulting in a final protein concentration of 0–300 nmol/g, and the amount of AK in the tissue was quantified by adding 0.015 nmol of QconCAT prior to digestion with trypsin. Proteolysis was allowed to continue for 24 h after which peptides were analyzed by MALDI-TOF MS.
The QconCAT was designed to include surrogate peptides for 20 chicken skeletal muscle proteins. As chicken skeletal muscle matures posthatch, the protein distribution in the tissue changes dramatically from a large number of proteins that are expressed in similar amounts at hatch to a relatively few high abundance proteins after 30 days of growth (Fig. 1). From previous identification studies (
), the most abundant proteins present in the soluble fraction of chicken skeletal muscle at this stage are predominantly the glycolytic enzymes. Other proteins, notably actin, have disappeared from the soluble fraction of muscle by 10 days of growth, presumably reflecting repartitioning and assembly of the myofibrillar apparatus. Finally serum proteins are detectable in muscle preparations at hatch but rapidly disappear during development. We ascribe this change to the increased exclusion of interstitial fluid as the muscle develops (
). To measure the absolute concentrations of specific proteins at various time points, we selected a group of 20 to be quantified using a single QconCAT. For each of the proteins, we chose a proteotypic peptide that gave a strong signal in previous MALDI-TOF MS analyses of tryptic digests. The peptides were used to guide construction of the DNA sequence of the QconCAT, which was synthesized, inserted into a pET21a vector, and expressed in E. coli. Full details of the design and expression are given elsewhere (
For QconCAT expression, a typical bacterial culture of 200 ml was induced at an A600 of 0.6–0.8, which generated 5–10 mg of QconCAT after cell breakage, recovery of inclusion bodies, and affinity chromatography of guanidinium chloride-solubilized protein on 1-ml nickel-nitrilotriacetic acid columns. After induction, the QconCAT protein was visible as a major band in 1D SDS-PAGE of a broken cell preparation (results not shown). After purification, the protein was homogeneous on 1D SDS-PAGE and was used without further purification (results not shown).
QconCAT protein was added in a 1:10 (QconCAT:chicken skeletal muscle protein) ratio to chicken skeletal muscle soluble fraction samples taken from both broiler and layer strains at six time points during growth. For each time point, four birds were analyzed. This ratio was selected pragmatically based on the abundance of the major proteins in chicken skeletal muscle soluble fraction. The influence of dynamic range on absolute quantification of proteins in complex biological systems is discussed below. After co-digestion of chicken skeletal muscle soluble fraction and [13C6]arginine/lysine-labeled QconCAT, MALDI-TOF MS analysis of peptides produced highly complex mass spectra. However, 10 of 20 proteotypic peptides could be identified in the composite spectrum without further sample processing and were therefore used for quantification. For these 10 proteins, for example glyceraldehyde-3-phosphate dehydrogenase (Fig. 2), the change in protein expression can be measured during growth from 1 to 30 days posthatch by converting relative signal intensities of analyte and internal standard peptide ions into absolute quantities of analyte protein expressed as nmol/g of wet weight breast muscle tissue.
The QconCAT was completely digested within 2 min such that no intermediate fragments were visible on SDS-PAGE (Fig. 3). When the trypsin was reduced to much lower levels (100:1 substrate:protease) and the digestion reaction was sampled at very short time intervals, there was some evidence for the appearance of partially fragmented intermediates, although MALDI-TOF MS analysis of these bands, once digested with trypsin, demonstrated that each “band” comprised multiple species, consistent with simultaneous tryptic attack on all scissile bonds at very similar rates. MALDI-TOF MS of peptides confirmed rapid digestion with all peptides detected within the first 2 min of digestion (data not shown).
By contrast, if the protein preparation from skeletal muscle was subjected to trypsin digestion at a ratio of 20:1 substrate:protease, many proteins were digested slowly, and even after 24 h, undigested proteins were clearly visible including β-enolase, creatine kinase, and triose-phosphate isomerase. If a low concentration (10%, v/v) of acetonitrile was included in the digestion reaction, proteolysis was faster. If the protein mixture was denatured by heating to 60 °C for 1 h before digestion, the loss of higher order structure of the substrate proteins meant that the digestion reaction was essentially complete within 30 min.
To demonstrate the importance of complete proteolysis for accurate quantification, we conducted extended digestion reactions with chicken skeletal muscle proteins from 1- and 30-day skeletal muscle. As reported previously and quantified here, these two preparations are dramatically different in the protein expression profiles (Fig. 1), providing different environments for proteolysis. The protein preparations were digested without treatment or after denaturation at 60 °C for 1 h, and the appearance of the analyte peptide used for quantification was determined by the QconCAT methodology; we have previously shown (Fig. 3) that the QconCAT was efficiently and completely digested within 2 min. In all instances, the analyte proteins were digested between 1.3 (AK) and 86 (β-enolase) times faster after denaturation, and in some instances (for example, GAPDH from 1-day muscle) the rate of digestion was very similar (Fig. 4). This is consistent with a model for proteolysis of the native protein in which the initial proteolytic attack exerts a destabilizing effect on the remaining structure such that the rate of proteolysis is increased; the initial proteolysis is effectively rate-limiting. However, in the highly specialized 30-day muscle sample, there was virtually no digestion even after 6 h of proteolysis. Indeed for all proteins studied, the rate of proteolysis of native proteins was diminished in the 30-day muscle sample; we suggest that the acute specialization of this tissue, leading to a predominance of relatively few proteins, might introduce other factors that impede digestion, such as aggregation into supramolecular assemblies or partial inhibition of the trypsin. In all instances, extended digestion times (greater than 24 h) resulted in complete digestion and the same quantification value irrespective of the initial state of the analyte protein preparation.
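A comparison of digestion rates of this kind can be sketched by fitting a first-order rate constant to each time course. The example below is an illustration under stated assumptions, not the analysis used in this study: it assumes that peptide release follows first-order kinetics and that each time point has been expressed as a fraction of the final (complete digestion) value. The function names and the simulated data are hypothetical.

```python
import math
from typing import Sequence


def fit_rate_constant(times_h: Sequence[float],
                      fraction_released: Sequence[float]) -> float:
    """Least-squares slope through the origin of -ln(1 - f) against time.

    For first-order release, f(t) = 1 - exp(-k t), so -ln(1 - f) = k t.
    """
    num = 0.0
    den = 0.0
    for t, f in zip(times_h, fraction_released):
        if not 0.0 <= f < 1.0:
            raise ValueError("fractions must lie in [0, 1)")
        num += t * -math.log(1.0 - f)
        den += t * t
    if den == 0.0:
        raise ValueError("at least one non-zero time point is required")
    return num / den


def half_time(k: float) -> float:
    """Half-time of a first-order process with rate constant k."""
    return math.log(2.0) / k


if __name__ == "__main__":
    times = [0.5, 1.0, 2.0, 4.0]
    # Simulated, noise-free time courses for illustration only.
    native = [1.0 - math.exp(-0.05 * t) for t in times]
    denatured = [1.0 - math.exp(-1.5 * t) for t in times]
    k_native = fit_rate_constant(times, native)
    k_denatured = fit_rate_constant(times, denatured)
    print(f"rate enhancement on denaturation: {k_denatured / k_native:.1f}-fold")
    print(f"native half-time: {half_time(k_native):.1f} h")
```

The ratio of the two fitted constants gives the fold increase in digestion rate on denaturation, and the half-time indicates how long a digestion must run to reach six or seven half-times.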
Variation in ion signal response is inherent with MALDI-TOF MS analysis (
). In a complex MALDI-TOF mass spectrum, peptides that are abundant and have a high response factor dominate the spectrum. Theoretically proteolysis of a complex proteome (for illustration, 10,000 proteins) could generate 10⁵–10⁶ peptides (at ∼50 tryptic peptides per protein), the dynamic range of which will be such that only the most abundant peptides and those that ionize particularly well will be identified. To achieve increased signal intensity from lysine-terminated peptides, guanidination has been used to convert lysine into the more basic homoarginine by reaction with O-methylisourea (
). Guanidination of a tryptic digest was effective at increasing the signal intensity of lysine-terminated peptides in the QconCAT and the analyte sample to allow quantification of two more analyte proteins by MALDI-TOF MS. To improve resolution of peptides for quantification, samples were also analyzed by LC-ESI-Q-TOF MS (Fig. 5). The alternative ionization mode coupled with the benefit of separation of peptides by reversed phase chromatography allowed quantification of a further six proteins previously not identified by MALDI-TOF MS and confirmed quantification data for many of those that had previously been analyzed. Extracted ion chromatograms for unlabeled (analyte) and labeled (QconCAT) peptides were used to locate the ions, and the chromatographic boundaries of the coincident pair of peptides were used to delineate the combined mass spectra from which peptides were quantified by mass spectrometric intensities of the doubly charged ions; there was no evidence of more highly charged ions, for example [M + 3H]³⁺, corresponding to analyte-QconCAT pairs (Fig. 5). Quantification data for four proteins over 30 days of growth obtained by both MALDI-TOF MS and LC-ESI-Q-TOF MS showed excellent agreement such that the correlation coefficient was 0.977 (Fig. 6). All proteins that could be quantified by MALDI-TOF MS (with and without guanidination) and LC-ESI-Q-TOF MS were expressed as nmol/g of pectoralis muscle tissue. The data were obtained during growth from 1 to 30 days posthatch for four birds at each time point for chickens of the layer and broiler strains (Fig. 7). Some proteins demonstrated massive pool expansion, whereas others declined to a similar degree, covering a measurable dynamic range across all proteins of 10–550 nmol/g for a single protein (GAPDH) and as low as 2 ± 1 nmol/g (α-enolase; 1-day broiler). Thus, in a single experiment we were able to assess protein concentrations over a 300-fold range.
To assess variance due to the analytical procedure, four identical protein mixtures (100 μl of chicken skeletal muscle (2.6 μg/μl) with 9 μl of QconCAT (2.9 μg/μl)) were digested with trypsin, and the surrogate peptides were used to quantify proteins by MALDI-TOF MS. Quantification data were collected and used to assess analytical variance (Fig. 8a). The reproducibility of the method was high, and the variance was similar whether four or 10 replicates were used. In both instances, the analytical variance was significantly lower than that for quantification data measured for four different birds of each strain (Fig. 8b). For example, the analytical variance (CV of 6.0% for β-enolase, n = 4) compared favorably to biological variance (CV of 24.0% for β-enolase, n = 4). Increasing the number of analytical replicates to 10 had very little effect on analytical variance (CV of 6.0% for β-enolase, n = 10; data not shown).
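The coefficient of variance used for these comparisons is the standard deviation expressed as a percentage of the mean. The sketch below shows the calculation for one set of analytical replicates and one set of biological replicates; the values are invented for illustration and are not data from this study, and the use of the sample (n − 1) standard deviation is an assumption.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample standard deviation as a percentage of the mean."""
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * statistics.stdev(values) / mean


if __name__ == "__main__":
    # Hypothetical protein concentrations in nmol/g.
    analytical_replicates = [102.0, 97.0, 105.0, 99.0]   # same sample, four digests
    biological_replicates = [80.0, 125.0, 96.0, 140.0]   # four different birds
    print(f"analytical CV: {coefficient_of_variation(analytical_replicates):.1f}%")
    print(f"biological CV: {coefficient_of_variation(biological_replicates):.1f}%")
```

When the analytical CV is small relative to the biological CV, as reported here, additional technical replicates add little, and effort is better spent on more animals.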
For some aspects of quantitative proteomics, MALDI-TOF MS has advantages. Data can be accumulated for a variable number of laser shots, ensuring comparable signal intensities between replicates. Virtually all of the signal resides in the singly charged [M + H]+ ion, whereas with electrospray ionization, the signal can be distributed over a number of differently charged species. However, for complex analytical mixtures, the complexity of a MALDI-TOF mass spectrum, coupled with a noisy signal base line, can compromise quantification. One approach to simplification of a MALDI-TOF MS analysis relies on prior fractionation of the peptide mixture before deposition of successive fractions on the MALDI target (
). Chicken skeletal muscle with added QconCAT was digested and separated by reversed phase liquid chromatography, and fractions (200 nl) were collected onto a MALDI target at 1-min intervals for analysis by MALDI-TOF MS (Fig. 9). This provided an efficient detection system with peptides fixed in the solid phase for continued interrogation when acquiring data for quantification. LC-MALDI-TOF MS was used for analysis of a single chicken skeletal muscle sample to highlight the potential benefit of this method. This approach allowed quantification of the majority of proteins selected for incorporation into the QconCAT protein and consequently contributed additional information for quantification. Comparing quantification by LC-MALDI-TOF MS with both MALDI-TOF MS and LC-ESI-Q-TOF MS confirmed that all three methods of analysis give consistent and comparable quantification. This quantification can be subtle, for example in monitoring isoform changes from embryonic to adult myosin as well as a change in state from free, soluble protein to that assembled within the myofibrillar apparatus (actin). It is also possible to monitor expression of isoforms of the same enzyme for which Qpeptides differ only in a single amino acid (lactate dehydrogenases A and B).
Although there is nothing formally different between a chemically synthesized peptide and a peptide excised from a QconCAT by proteolysis, we compared the quantification of a single protein (GAPDH, which exhibits a dramatic change in abundance during posthatching development) using the QconCAT-derived peptide and the identical synthetic peptide. The correlation between data obtained using QconCAT and that obtained using the synthetic peptide was high (correlation coefficient, 0.998) (Fig. 10), and quantification data were consistent using either internal standard. A small consistent discrepancy (less than 10%) between the two methods could be attributable to the method of quantification used for the two standards. The discrepancy between the synthetic peptide and the QconCAT was reduced if we used the latter to quantify the former but was still present. We do not have an explanation for this residual discrepancy at present. We are confident, however, that the discrepancy is not attributable to incomplete digestion of the QconCAT (see Figs. 3 and 4). In the case of the QconCAT, we used a protein assay to determine the amount of protein as this was the same method used to quantify total protein in the analyte. For the synthetic peptide, the quantity supplied by the manufacturer is too small for independent quantification, and it was necessary to assume that the quantity in the vial was indeed that specified by the manufacturer. The difference between the two standards was minor compared with the biological variance within the system, would not contribute significant errors, and would be readily controlled by alternative QconCAT quantification strategies (see “Discussion”).
To assess the accuracy of a QconCAT experiment for quantification, we spiked a known amount of AK into chicken skeletal muscle soluble fraction from a 30-day broiler. The amount of AK added was converted into protein concentration as nmol/g tissue and compared with the total concentration of AK in the tissue (nmol/g) as quantified using QconCAT (Fig. 11). As expected, there was a strong correlation (R² = 0.9992) with a slope of 1, indicating the lack of any systematic quenching effects over an extended dynamic range. Quantification of selected muscle proteins by the QconCAT strategy was also compared with densitometric quantification from 1D SDS-PAGE; the correlation between these methods was poor (data not shown; R² = 0.67), although the stain intensity was strongly proportional to the amount of protein loaded on the gel (data not shown; R² = 0.995). The poor correlation is most probably due to the differing affinities of individual proteins for the stain.
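The arithmetic behind this spike-recovery test can be sketched as follows. This is a minimal illustration and not the authors' software: it assumes that the analyte amount is obtained from the ratio of unlabeled (analyte) to labeled (QconCAT) peptide signal multiplied by the known amount of standard added, and that measured amounts are regressed on added amounts by ordinary least squares. The function names and arguments are hypothetical.

```python
def analyte_nmol_per_g(light_area, heavy_area, standard_nmol, tissue_g):
    """Estimate analyte concentration (nmol/g tissue) by stable isotope dilution.

    light_area    -- peak area of the unlabeled (analyte) peptide
    heavy_area    -- peak area of the labeled (QconCAT-derived) peptide
    standard_nmol -- known amount of labeled standard added (nmol)
    tissue_g      -- mass of tissue represented in the sample (g)
    """
    if heavy_area <= 0 or tissue_g <= 0:
        raise ValueError("heavy_area and tissue_g must be positive")
    ratio = light_area / heavy_area
    return ratio * standard_nmol / tissue_g


def linear_fit(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, r_squared)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared
```

Under these assumptions, a slope close to 1 indicates that added protein is recovered proportionally, and the intercept would correspond to the endogenous protein already present in the tissue before spiking.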
QconCAT methodology has considerable potential to enhance the scope and scale of quantitative proteomics by multiplexing stable isotope dilution assays using proteotypic peptides as surrogates for the proteins of interest. The specific novelty of the QconCAT approach is derived from the efficient means of simultaneous production of multiple internal standards. Unlike chemical synthesis, biological synthesis de novo is not beset by “difficult” peptides (for example, those with runs of serine residues or with a large hydrophobic content) that can be problematic to synthesize chemically in high purity. Moreover, QconCAT proteins can be labeled using any metabolic precursor, from the remarkably inexpensive uniform 15N labeling using 15NH4Cl as the sole nitrogen source in the medium to specific labeling with [13C6]Lys/[13C6]Arg, which ensures that each tryptic proteotypic peptide has a constant mass offset of 6 Da. Incorporation of a second labeled amino acid that is variably represented in the QconCAT can also facilitate mass isolation of the standard. The imaginative application of metabolic labeling without the need for resynthesis of the QconCAT gene is an advantage of the approach that has yet to be fully exploited.
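The constant mass offset obtained with [13C6]Lys/[13C6]Arg labeling can be illustrated with a short calculation. This is a sketch under stated assumptions: each Lys or Arg residue in the peptide carries six 13C atoms, the 13C/12C mass difference is taken as 1.0033548 Da, and the observed m/z shift is the mass shift divided by the charge state. The function names are hypothetical, and the peptide sequences used in any call are placeholders rather than Qpeptides from this study.

```python
# Mass difference between 13C and 12C in Da (13.0033548 - 12.0000000).
C13_MINUS_C12 = 1.0033548
LABELED_CARBONS_PER_RESIDUE = 6  # [13C6]Lys and [13C6]Arg


def labeled_residue_count(peptide):
    """Count Lys (K) and Arg (R) residues in a one-letter peptide sequence."""
    return sum(peptide.upper().count(residue) for residue in "KR")


def heavy_mz(light_mz, peptide, charge):
    """Expected m/z of the labeled peptide given the m/z of the unlabeled form."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    mass_shift = (LABELED_CARBONS_PER_RESIDUE * C13_MINUS_C12
                  * labeled_residue_count(peptide))
    return light_mz + mass_shift / charge
```

A fully cleaved tryptic peptide has a single C-terminal Lys or Arg, so the offset is about 6 Da for singly charged ions (typical of MALDI) and about 3 m/z units for doubly charged ions (common in ESI). A peptide containing a missed cleavage would carry a correspondingly larger offset.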
The QconCAT approach is robust to the choice of mass spectrometric method. Each of the three methods used (MALDI-TOF MS, LC-ESI-Q-TOF MS, and LC-MALDI-TOF MS) allows quantification of individual proteins that are not detected using the alternative techniques. For example, adult myosin and pyruvate kinase have only been quantified using MALDI-TOF MS, myosin-binding protein C and phosphoglycerate kinase have only been quantified using LC-MALDI-TOF MS (data not shown), and lactate dehydrogenase A has only been quantified using LC-ESI-Q-TOF MS. The ability to detect and quantify each peptide incorporated in the original QconCAT protein is strongly dependent on the analytical context. Although the target ionization method can influence the choice of proteotypic peptides, the opportunity remains to switch to other separation and ionization methods to gain quantification data for large numbers of proteins.
Although a chemically synthesized peptide and a QconCAT peptide are formally equivalent at the analytical stage, we compared the two approaches. Interestingly, the two methods gave highly precise estimates of protein levels, but the measured values differed: the synthetic peptide gave a consistently lower estimate of protein amount than the QconCAT. Quantification of the QconCAT protein was achieved by colorimetric assay using the same method as used for the assessment of total protein concentration in the biological samples. Quantification of the synthetic peptide was based on amino acid analysis conducted by the supplier and was completed separately and prior to analysis with the mixture of analyte proteins. Indeed, the quantity of the synthetic peptide supplied (five vials of 1 nmol) was sufficiently low that independent quantification by the end user could be problematic. By contrast, we routinely prepare 5–10 mg (approximately 250 nmol) of the QconCAT used here. The errors introduced by the method of standard quantification are small and, relative to the biological changes we measure here, are not significant. However, future iterations of QconCAT proteins will incorporate a common peptide for internal standard quantification by a synthetic peptide that can be labeled or unlabeled, depending on the labeling status of the QconCAT protein. This peptide, chosen because it ionizes well under MALDI or ESI, could then be used to quantify each QconCAT, normalizing all QconCAT data to a common, absolute standard. This common peptide, which is chemically synthesized, would be required in large amounts, and as such, purification and quantification of this peptide could be conducted to a very high level of confidence. By creating such a “gold standard” for quantification, data from all laboratories using QconCATs (the same or different) could be compared directly.
The application to absolute quantification of multiple proteins within complex biological systems and the adaptability of the QconCAT to a variety of analytical systems are clear. For development of strategies for absolute quantification, the QconCAT method provides a reproducible and relatively simple system in which multiple proteins can be quantified using alternative methods of mass spectrometry with chromatographic separation and chemical derivatization. We have made approximate estimates of the costs involved: to quantify 50 proteins at one Qpeptide per protein, a QconCAT strategy costs about 15% as much as comparable synthetic peptides and would yield about 250 nmol of protein compared with 5 nmol of each synthetic peptide. The error in analytical replicates is small, but there can be no “holy grail” target performance for the analytical measurements. First, provided that analytical variance can be demonstrated to be substantially smaller than biological variance (as we have demonstrated here and as is indeed the normal expectation), it might be argued that there is a much reduced need to perform analytical replicates and that the effort should be directed toward acquisition of greater biological insight by adding new biological replicates.
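The comparison of analytical and biological variance can be sketched with a simple variance-components calculation. This is an illustrative approach rather than the statistical method used in this study: it assumes a balanced design in which each animal (biological replicate) is measured the same number of times (analytical replicates) and applies the standard one-way random-effects estimators. The function name and the input layout are hypothetical.

```python
from statistics import mean, variance


def variance_components(replicates_by_animal):
    """Estimate (analytical_variance, biological_variance) for a balanced design.

    replicates_by_animal -- list of lists; each inner list holds the analytical
                            replicate measurements for one animal.
    """
    sizes = {len(reps) for reps in replicates_by_animal}
    if len(replicates_by_animal) < 2 or len(sizes) != 1:
        raise ValueError("need at least two animals with equal replicate counts")
    k = sizes.pop()
    if k < 2:
        raise ValueError("need at least two analytical replicates per animal")

    # Within-animal (analytical) variance: average of the per-animal variances.
    analytical = mean([variance(reps) for reps in replicates_by_animal])

    # The variance of the animal means contains biological variance plus
    # analytical variance / k, so the analytical share is subtracted.
    animal_means = [mean(reps) for reps in replicates_by_animal]
    biological = max(0.0, variance(animal_means) - analytical / k)
    return analytical, biological
```

When the analytical component is a small fraction of the biological component, additional analytical replicates do little to narrow the uncertainty of a group mean, which is consistent with the argument above for directing effort toward further biological replicates.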
For identification proteomics, little regard is paid to the completeness of the digestion of the analyte peptide; the goal is to generate sufficient peptides that are readily ionized and/or fragmented for unambiguous identification. Indeed most search engines are tolerant of and include options to match one or more than one “missed cleavage.” However, when the goal shifts to the more demanding task of peptide-based quantification, it is essential that due cognizance is given to the proteolytic reactions that generate the peptides that are to be used for quantification irrespective of the method. The goal has to be complete digestion, and there are several approaches that can be taken to ensure that this has occurred. This should also be checked experimentally. It would also be feasible to embed two Qpeptides for each protein in a single QconCAT or even two different QconCATs to enhance confidence, but we do not subscribe to the view that this is necessary in many instances. Finally our experience with a large number of QconCATs (J. Rivers, D. M. Simpson, D. H. L. Robertson, S. J. Gaskell, and R. J. Beynon, unpublished observations) is that they are proteolyzed at rates that are far higher than analyte proteins. In all of these QconCAT constructs, we made no attempt to preserve the primary sequence context of the Qpeptides, and it is clear that this is not an important factor in QconCAT design; the selection of suitable proteotypic peptides in the design phase is much more critical. In this regard, the recent work by Aebersold and co-workers (