
Protein Identification by Mass Spectrometry

Issues to be Considered*
  • Michael A. Baldwin
    Correspondence
    To whom correspondence should be addressed: Mass Spectrometry Facility, Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94143-0446
  • Author Footnotes
    * This work was supported in part by NCRR RR01614. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
    1 The abbreviations used are: MS, mass spectrometry; MALDI, matrix-assisted laser desorption/ionization; ESI, electrospray ionization; TOF, time-of-flight; LC, liquid chromatography; MS/MS, tandem MS; QqTOF, quadrupole mass selector and quadrupole collision cell with orthogonal acceleration TOF; CID, collision-induced dissociation; HPLC, high-pressure LC; ICAT, isotope-coded affinity tag.
    2 S. A. Carr, personal communication.
      During the past two decades, mass spectrometry has become established as the primary method for protein identification from complex mixtures of biological origin. This is largely attributable to the fortunate coincidence of instrumental advances that allow routine analysis of minute amounts (typically femtomoles) of involatile, polar compounds such as peptides in complex mixtures, with the rapid growth in genomic databases that are amenable to searching with mass spectrometry (MS) data. As in many other developing fields in science, the creation of techniques and software tools and the initial generation and interpretation of data have been the domain of experts, people who are cognizant not only of the benefits of the methods but also of their actual and potential weaknesses. Now, as mass spectrometric techniques and proteomic tools become increasingly available and accessible, a much broader range of researchers is applying the same methodology, often with substantially less understanding of the major limitations that critically affect the reliability and significance of the results. Ideally, the MS community should establish criteria for mass spectrometric identification of proteins that should be employed by researchers. As this remains a rapidly developing field with many different experimental approaches and different ways of searching and interpreting the data, it is difficult to promulgate hard and fast rules. Nevertheless, Molecular & Cellular Proteomics is attempting to develop standards of acceptability for proteomics papers, based on emerging knowledge as well as on principles of biological MS established over the last 20 years by the MS community. Authors of proteomics papers employing MS must make themselves fully aware of the key issues that are driving development of these guidelines. Hence, the paper that follows attempts to highlight the strengths and weaknesses of the methods in current use. It is particularly important to realize that for any protein match returned from a database search, there is a non-zero probability that it will be wrong. Many times, the quality of the data is such that the probability of a false positive can be disregarded, but in some cases the identifications returned by the search engines are very likely incorrect. Therefore, it is unacceptable to simply list all the hits that come back from any search engine and then discuss their biological significance as though they were categorically correct.

      PEPTIDE ANALYSIS

      Almost without exception, protein identification is based on the analysis of peptides generated by proteolytic digestion. The most widely used enzyme is trypsin, which hydrolyzes the protein on the C-terminal side of lysine and arginine, unless the subsequent amino acid in the sequence is a proline. This is advantageous as every peptide other than the protein C terminus has at least two sites for efficient protonation, the N-terminal amino group and the C-terminal basic residue, so peptides are readily ionized and detected as positive ions. However, for a variety of reasons, it is normal for only a subset of the potential tryptic peptides from any protein to be detected, particularly when the peptides are ionized directly from unseparated mixtures in which there may be competition for the available protons. There are also experimental limitations for the detection of peptides that are either very small or very large, a factor that is not controllable with a single enzyme as this is dependent on the distribution of lysine and arginine residues within a protein. Protein sequence coverage can be improved by carrying out a parallel digestion with a second protease of different specificity such as chymotrypsin, then combining the two digests for a common analysis (S. A. Carr, personal communication).
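      The trypsin cleavage rule described above (cleavage C-terminal to lysine or arginine unless the next residue is proline) can be illustrated with a minimal in-silico digestion sketch. The function name and the `missed_cleavages` parameter are illustrative assumptions, not part of any search engine discussed here.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return theoretical tryptic peptides for a protein sequence.

    Rule used: cleave after K or R unless the following residue is P.
    `missed_cleavages` (an illustrative parameter) lets a peptide span
    up to that many uncleaved sites.
    """
    sequence = sequence.upper()
    if not sequence:
        return []
    # Index of the first residue after each cleavage site.
    sites = [
        i + 1
        for i, aa in enumerate(sequence[:-1])
        if aa in "KR" and sequence[i + 1] != "P"
    ]
    bounds = [0] + sites + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        last_end = min(start + 2 + missed_cleavages, len(bounds))
        for end in range(start + 1, last_end):
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


if __name__ == "__main__":
    # The R-P bond is not cleaved, so "RPGK" stays intact.
    print(tryptic_digest("AKRPGKLR"))
```

      In this toy sequence the arginine followed by proline is not treated as a cleavage site, which is why very short or very long peptides can result depending on where lysine and arginine fall.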
      In practice, whether a large or small fraction of the peptides generated from any protein is detected depends on many variables: the amount of that protein present in the original sample; the efficiency of any protein extraction, digestion, and peptide extraction; the presence of other proteins and other impurities; and the sensitivity and performance characteristics of the mass spectrometer and its mode of ionization, mass separation, and ion detection.
      Mass spectrometers employed in proteomic analysis use either matrix-assisted laser desorption/ionization (MALDI) or electrospray ionization (ESI) but they vary widely in their operation and performance characteristics. Early database searching was based on low-resolution linear MALDI-time-of-flight (MALDI-TOF) giving a mass accuracy of perhaps ±2 Da (
      • Henzel W.J.
      • Billeci T.M.
      • Stults J.T.
      • Wong S.C.
      • Grimley C.
      • Watanabe C.
      Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases..
      ). However, this is no longer acceptable because good mass accuracy increases the reliability of database searching by limiting the possible compositions of peptides for any given mass (
      • Clauser K.R.
      • Baker P.
      • Burlingame A.L.
      Role of accurate mass measurement (± 10 ppm) in protein identification strategies employing MS or MS/MS and database searching..
      ). A delayed extraction MALDI-TOF instrument with reflectron should give better than 50 ppm mass accuracy, and significantly better (∼10 ppm) may be achieved with careful internal calibration. ESI has been the standard ionization method for liquid chromatography (LC)-MS and LC-tandem MS (MS/MS), although separated fractions can be deposited onto a MALDI target for either on-line or off-line analysis (
      • Preisler J.
      • Hu P.
      • Rejtar T.
      • Karger B.L.
      Capillary electrophoresis-matrix-assisted laser desorption/ionization time-of-flight mass spectrometry using a vacuum deposition interface..
      • Preisler J.
      • Hu P.
      • Rejtar T.
      • Moskovets E.
      • Karger B.L.
      Capillary array electrophoresis-MALDI mass spectrometry using a vacuum deposition interface..
      • Krutchinsky A.N.
      • Kalkum M.
      • Chait B.T.
      Automatic identification of proteins with a MALDI-quadrupole ion trap mass spectrometer..
      ). ESI is usually employed for single and triple quadrupoles and quadrupole ion traps that typically give modest resolution. The combination of a quadrupole mass selector and quadrupole collision cell with orthogonal acceleration TOF (QqTOF) gives high resolution (∼10,000) and perhaps 5 ppm mass accuracy if well calibrated (
      • Dawson J.H.J.
      • Guilhaus M.
      Orthogonal acceleration time of flight mass spectrometer..
      • Dodonov A.F.
      • Chernushevich I.V.
      • Laiko V.V.
      Atmospheric pressure ionization time of flight mass spectrometer.
      • Verentchikov A.V.
      • Ens W.
      • Standing K.G.
      Reflecting time-of-flight mass spectrometer with an electrospray ion source and orthogonal extraction..
      • Morris H.R.
      • Paxton T.
      • Dell A.
      • Langhorne J.
      • Berg M.
      • Bordoli R.S.
      • Hoyes J.
      • Bateman R.H.
      High sensitivity collisionally-activated decomposition tandem mass spectrometry on a novel quadrupole/orthogonal-acceleration time-of-flight mass spectrometer..
      ). Fourier transform MS gives the ultimate performance with mass accuracy of perhaps 1 ppm, but such instruments are more expensive, more technically demanding, and until now their deployment has been mostly limited to large MS facilities.
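      To make the mass accuracy figures above concrete, the following sketch converts a parts-per-million (ppm) tolerance into an absolute mass window and computes the ppm error of a measurement. The function names are illustrative assumptions.

```python
def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def tolerance_window(mass, ppm):
    """Lower and upper mass limits for a given ppm tolerance."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


if __name__ == "__main__":
    # For a 1500 Da peptide, compare the tolerances quoted in the text.
    for ppm in (50, 10, 5, 1):
        low, high = tolerance_window(1500.0, ppm)
        print(f"{ppm:>3} ppm: {low:.4f} to {high:.4f} Da")
```

      For a peptide of 1500 Da, 50 ppm corresponds to ±0.075 Da and 10 ppm to ±0.015 Da, compared with the ±2 Da of early linear MALDI-TOF data.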

      PROTEIN IDENTIFICATION BY DATABASE SEARCHING

      There are at least three approaches to protein identification based on peptide analysis. The first developed is generally referred to as peptide mass mapping or mass fingerprinting (
      • Henzel W.J.
      • Billeci T.M.
      • Stults J.T.
      • Wong S.C.
      • Grimley C.
      • Watanabe C.
      Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases..
      ,
      • Wada Y.
      • Hayashi A.
      • Masanori F.
      • Katakuse I.
      • Ichihara T.
      • Nakabushi H.
      • Matsuo T.
      • Sakurai T.
      • Matsuda H.
      Characterization of a new fetal hemoglobin variant, Hb F Izumi A gamma 6Glu replaced by Gly, by molecular secondary ion mass spectrometry..
      • Morris H.R.
      • Panico M.
      • Taylor G.W.
      FAB-mapping of recombinant-DNA protein products..
      • James P.
      • Quadroni M.
      • Carafoli E.
      • Gonnet G.
      Protein identification by mass profile fingerprinting..
      • Mann M.
      • Hojrup P.
      • Roepstorff P.
      Use of mass spectrometric molecular weight information to identify proteins in sequence databases..
      • Pappin D.J.
      • Hojrup P.
      • Bleasby A.J.
      Rapid identification of proteins by peptide-mass fingerprinting..
      • Yates J.R.
      • Speicher S.
      • Griffin P.R.
      • Hunkapiller T.
      Peptide mass maps: A highly informative approach to protein identification..
      ). This relies upon a comparison of the experimentally determined MS peak mass values with the predicted molecular mass values of the peptides generated by a theoretical digestion of each protein in a database. Early observations with relatively small databases suggested as few as three or four peptide matches could be sufficient to identify the correct hit, even with relatively low-precision mass measurements from linear MALDI-TOF instruments (
      • Henzel W.J.
      • Billeci T.M.
      • Stults J.T.
      • Wong S.C.
      • Grimley C.
      • Watanabe C.
      Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases..
      ). But recently, genomic databases have grown rapidly, e.g. as of June 14, 2003, the NCBInr database contained 1,446,218 entries, which was 50% more than 1 year earlier. Consequently, the criteria for protein identification have become more stringent and more accurate mass measurement is essential. It is also necessary to match a larger number of peptides and to cover a larger percentage of the protein sequence. The second approach uses collision-induced dissociation (CID) spectra of individual peptides from MS/MS. Prior to the development of database searching, peptide MS/MS spectra from high performance multi-sector instruments with fast atom bombardment ionization were used for de novo peptide sequencing (
      • Biemann K.
      • Scoble H.A.
      Characterization by tandem mass spectrometry of structural modifications in proteins..
      ). In the mid 1990s, such MS/MS spectra, generated by MALDI or ESI, were matched against sequence tags predicted for all proteins in a database, i.e. (short) series of fragment ions that could be attributed to coherent sequences of amino acids corresponding to subsets of the predicted peptide (
      • Mann M.
      • Wilm M.
      Error-tolerant identification of peptides in sequence databases by peptide sequence tags..
      • Eng J.K.
      • McCormack A.L.
      • Yates J.R.
      An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database.
      • Wilm M.
      • Mann M.
      Analytical properties of the nanoelectrospray ion source..
      • Clauser K.R.
      • Baker P.R.
      • Burlingame A.L.
      Peptide fragment-ion tags from MALDI/PSD for error-tolerant searching of genomic databases.
      ), as implemented in programs such as Protein Prospector’s MS-Tag. In their latest form, such searches are based on comparisons between the experimentally observed fragment ions and all predicted fragments for all hypothetical peptides of the appropriate molecular mass, based on known fragmentation rules. Each peptide match can be linked to a protein match, and even a single peptide may identify a protein correctly. However, identical sequences may be duplicated in closely related proteins, so matching multiple peptide sequences provides greater statistical confidence. Clearly, the greater the number of peptides being matched to any one protein and the greater the sequence coverage, the greater the probability of a correct identification. Error-tolerant searching allows a peptide to be identified when it differs from a database peptide by perhaps a single amino acid, and techniques have been developed to identify remote sequence homologies, although these are time-consuming and computationally intensive (
      • Taylor J.A.
      • Johnson R.S.
      Sequence database searches via de novo peptide sequencing by tandem mass spectrometry..
      ,
      • Huang L.
      • Jacob R.J.
      • Pegg S.C.-H.
      • Baldwin M.A.
      • Wang C.C.
      • Burlingame A.L.
      • Babbitt P.C.
      Functional assignment of the 20S proteasome from T. brucei using mass spectrometry and new bioinformatics approaches..
      ). Finally, if no matches are found because the protein is not present in the database, de novo peptide sequencing continues to be a valuable method, based on known rules for peptide fragmentation (
      • Biemann K.
      Sequencing of peptides by tandem mass spectrometry and high-energy collision-induced dissociation..
      ). This is straightforward with good quality spectra and is valuable when a significant proportion of peptides diverge from those predicted, due to errors in databases, discrepancies between genomic sequence and processed or posttranslationally modified proteins, species differences, and nonspecific enzyme cleavages. An early example of the benefits of extensive sequencing was the determination of the primary structure of Gal β-1,3(4)GlcNAc α-2,3-sialyltransferase (
      • Wen D.X.
      • Livingston B.D.
      • Medzihradszky K.F.
      • Kelm S.
      • Burlingame A.L.
      • Paulson J.C.
      Primary structure of Gal beta 1, 3(4)GlcNAc alpha 2, 3-sialyltransferase determined by mass spectrometry sequence analysis and molecular cloning. Evidence for a protein motif in the sialyltransferase gene family..
      ).
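      The peptide mass fingerprinting comparison described in this section can be sketched as follows: theoretical peptide masses are computed for each database protein and compared with observed masses within a ppm tolerance. This is a deliberately simplified illustration that ranks proteins by a plain match count; it is not MOWSE or any published scoring scheme. It assumes the observed values are neutral monoisotopic masses, that the peptides are unmodified, and that the database is a hypothetical dictionary of pre-digested peptides.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide.upper()) + WATER


def matched_peptides(observed, peptides, ppm=50.0):
    """Theoretical peptides whose mass matches any observed mass."""
    matched = []
    for pep in peptides:
        mass = peptide_mass(pep)
        tol = mass * ppm / 1e6
        if any(abs(obs - mass) <= tol for obs in observed):
            matched.append(pep)
    return matched


def rank_proteins(observed, database, ppm=50.0):
    """Rank proteins by number of matched peptides (toy score only).

    `database` maps a protein name to its list of theoretical peptides.
    """
    scores = [
        (name, len(matched_peptides(observed, peps, ppm)))
        for name, peps in database.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

      A raw match count ignores database size, protein length, and the distribution of peptide masses, which is one reason the more stringent criteria discussed above are needed.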
      The use of MS/MS and CID is becoming the accepted standard for protein identification and is steadily replacing peptide mass fingerprinting, although the quality of tandem data varies considerably with instrument type. Sequence ion information from MALDI-TOF instruments using post-source decay (
      • Spengler B.
      • Kirsch D.
      • Kaufmann R.
      • Jaeger E.
      Peptide sequencing by matrix-assisted laser-desorption mass spectrometry..
      ) is painstaking to collect and the accuracy of mass assignments is relatively low; it is therefore little used now, particularly as MALDI has become available on higher performance QqTOF (
      • Shevchenko A.
      • Chernushevich I.
      • Ens W.
      • Standing K.G.
      • Thomson B.
      • Wilm M.
      • Mann M.
      Rapid ’de novo’ peptide sequencing by a combination of nanoelectrospray, isotopic labeling and a quadrupole/time-of-flight mass spectrometer..
      • Krutchinsky A.N.
      • Loboda A.V.
      • Spicer V.L.
      • Dworschak R.
      • Ens W.
      • Standing K.G.
      Orthogonal injection of matrix-assisted laser desorption/ionization ions into a time-of-flight spectrometer through a collisional damping interface..
      • Krutchinsky A.N.
      • Zhang W.
      • Chait B.T.
      Rapidly switchable matrix-assisted laser desorption/ionization and electrospray quadrupole-time-of-flight mass spectrometry for protein identification..
      • Shevchenko A.
      • Loboda A.
      • Shevchenko A.
      • Standing K.G.
      MALDI quadrupole time-of-flight mass spectrometry: a powerful tool for proteomic research..
      • Baldwin M.A.
      • Medzihradszky K.F.
      • Lock C.M.
      • Fisher B.
      • Settineri C.A.
      • Burlingame A.L.
      Matrix assisted laser desorption/ionization coupled with quadrupole/orthogonal acceleration time-of-flight mass spectrometry for protein discovery, identification and structural analysis..
      ) and TOF/TOF (
      • Vestal M.L.
      • Juhasz P.
      • Hines W.
      • Martin S.A.
      A new delayed extraction MALDI-TOF MS-MS for characterization of protein digests.
      • Medzihradszky K.F.
      • Campbell J.M.
      • Baldwin M.A.
      • Falick A.M.
      • Juhasz P.
      • Vestal M.L.
      • Burlingame A.L.
      The characteristics of peptide collision-induced dissociation using a high performance MALDI-TOF/TOF tandem mass spectrometer..
      • Suckau D.
      • Resemann A.
      • Schuerenberg M.
      • Hufnagel P.
      • Franzen J.
      • Holle A.
      A novel MALDI LIFT-TOF/TOF mass spectrometer for proteomics..
      ) instruments. Low-resolution three-dimensional ion traps have been very popular and are well-suited to high-throughput LC-MS/MS, but cannot distinguish different charge states when operating in full scan mode. This is a limitation as ions are separated not by mass but by mass/charge (m/z), and ESI typically gives multiply charged ions. Library searching based on such low-resolution ion trap data typically tests multiple possibilities based on the assumption that the precursor ions can have one, two, or three protons attached. For some search engines, all fragments are assumed to be singly charged, thereby increasing the probability that any peptide match will be incorrect. However, the popularity of ion traps has presented a challenge for software development and several search engines now assume that fragment ions from multiply charged precursors might also be multiply charged. Precursor ion charge states may also be deduced from an analysis of fragment ion spectra (
      • Colinge J.
      • Magnin J.
      • Dessigny T.
      • Giron M.
      • Masselot A.
      Improved peptide charge state assignment..
      ). With higher resolution instruments, charge states are readily determined unambiguously from the peak spacing, e.g. with QqTOF instruments, and Fourier transform MS has identified charge states for protonated molecular ions and fragment ions of proteins of up to 45 kDa (
      • Ge Y.
      • Lawhorn B.G.
      • El Naggar M.
      • Strauss E.
      • Park J.-H.
      • Begley T.P.
      • McLafferty F.W.
      Top down characterization of larger proteins (45 kDa) by electron capture dissociation mass spectrometry.
      ). It is also important to add that new ion trap designs, including linear traps, provide much higher resolution and, in combination with other mass analyzers, will form the basis of a new generation of powerful tandem instruments.
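      The determination of charge state from isotopic peak spacing mentioned above can be sketched as follows. Adjacent isotope peaks of an ion with charge z are separated by roughly 1.00335/z on the m/z scale (the 13C/12C mass difference divided by the charge). The function names are illustrative, and the sketch assumes a clean, resolved isotopic cluster of a protonated ion.

```python
C13_SPACING = 1.00335  # approximate 13C - 12C mass difference, Da
PROTON = 1.007276      # mass of a proton, Da


def charge_from_spacing(mz_peaks):
    """Estimate charge from consecutive isotope peaks of one cluster."""
    if len(mz_peaks) < 2:
        raise ValueError("need at least two isotope peaks")
    spacings = [b - a for a, b in zip(mz_peaks, mz_peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(C13_SPACING / mean_spacing)


def neutral_mass(mz, charge):
    """Neutral mass of an ion carrying `charge` protons."""
    return charge * (mz - PROTON)


if __name__ == "__main__":
    cluster = [500.25, 500.75, 501.25]   # peaks 0.5 apart
    z = charge_from_spacing(cluster)
    print(z, round(neutral_mass(cluster[0], z), 4))
```

      With low-resolution data the isotope peaks are not resolved, so this spacing cannot be measured and the charge has to be guessed or inferred by other means, as described above.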
      A largely uncontrolled variable factor in database searching is the choice of search engine, of which there are several in common use (
      • Handley J.
      Software for MS protein identification..
      ). A number of web sites offer free access to web-based database searching programs for peptide mass fingerprinting and the identification of sequence tags, all of which provide other tools as well, such as programs for calculating isotope patterns, predicting enzyme digestion patterns, and theoretical prediction of CID fragments. Some sites allow free downloading of the programs. An alternative to internet access is to purchase a license to have the programs resident in-house, and some mass spectrometer vendors supply software and software licenses with the purchase of an instrument. An enumeration of all the current search engines is a moving target, but well-known web-based examples include tools on the ExPASy proteomics server provided by the Swiss Institute of Bioinformatics (www.expasy.ch/tools), Mascot from Matrix Science (London, UK; www.matrixscience.com), and Protein Prospector provided by the University of California (San Francisco, CA; prospector.ucsf.edu). Systems supplied by instrument manufacturers include Sequest from Thermo-Finnigan (San Jose, CA) and Spectrum Mill from Agilent Technologies (Palo Alto, CA).

      DIGESTION OF ISOLATED PROTEINS VERSUS DIGESTION OF PROTEIN MIXTURES

      Protein identification by digestion and peptide mass fingerprinting is not effective for complex protein mixtures unless preceded by a separative step, most often a two-dimensional gel separation. Using large format gels, it is possible to separate and analyze hundreds or even thousands of protein spots from a single gel. This is manually repetitive and time-consuming but automated methods have been developed (
      • Binz P.-A.
      • Muller M.
      • Walther D.
      • Bienvenut W.V.
      • Gras R.
      • Hoogland C.
      • Bouchet G.
      • Gasteiger E.
      • Fabbretti R.
      • Gay S.
      • Palagi P.
      • Wilkins M.R.
      • Rouge V.
      • Tonella L.
      • Paesano S.
      • Rossellat G.
      • Karmime A.
      • Bairoch A.
      • Sanchez J.-C.
      • Appel R.D.
      • Hochstrasser D.F.
      A molecular scanner to automate proteomic research and to display proteome images..
      ), and several companies offer robots for spot identification, spot cutting, digestion, and spotting of the peptide mixtures onto MALDI plates. In practice, gel separation is equally applicable to MS/MS, but an alternative approach is to digest a complex mixture and then to separate the peptides chromatographically before introduction to the mass spectrometer. Such mixtures may contain thousands of proteins and multi-dimensional high-pressure LC (HPLC) may be necessary, e.g. an initial separation by strong cation exchange chromatography may be followed by reversed-phase chromatography (
      • Washburn M.P.
      • Wolters D.
      • Yates III, J.R.
      Large-scale analysis of the yeast proteome by multidimensional protein identification technology..
      ). Peptide mass fingerprinting is no longer possible with this approach as the association is lost between particular peptides and the proteins from which they were derived; CID and peptide sequencing are therefore essential. This LC-based approach is more easily automated than two-dimensional gel analysis, and in many laboratories it is consequently replacing the use of gels. It can also be combined with one-dimensional gel separation in which individual bands that may contain multiple proteins are cut out and digested, and the peptides are introduced to the mass spectrometer via an HPLC separation (
      • Huang L.
      • Baldwin M.A.
      • Maltby D.
      • Medzihradszky K.F.
      • Baker P.R.
      • Allen N.
      • Rexach M.
      • Edmondson R.
      • Campbell J.
      • Juhasz P.
      • Martin S.A.
      • Vestal M.L.
      • Burlingame A.L.
      The identification of protein-protein interactions of the nuclear pore complex of S. cerevisiae using high throughput MALDI-TOF/TOF tandem mass spectrometry..
      ).

      AUTOMATION OF DATA COLLECTION AND ANALYSIS

      In so-called data-dependent experiments, with HPLC coupled to a tandem mass spectrometer, repetitive recording of MS spectra is interleaved with the selection and analysis of peaks for CID analysis (
      • Tiller P.R.
      • Land A.P.
      • Jardine I.
      • Murphy D.M.
      • Sozio R.
      • Ayrton A.
      • Schaefer W.H.J.
      Application of liquid chromatography-mass spectrometry(n) analyses to the characterization of novel glyburide metabolites formed in vitro..
      ). The amount of data generated by such experiments running on a continuous basis is overwhelming for manual interpretation, so automation is becoming essential. Effective automated transfer of mass spectrometric data to the informatics programs is dependent on the reliable performance and accuracy of so-called “peak-picking” algorithms. Such programs are often proprietary to the instrument manufacturers, although some mathematical approaches have been described (
      • Breen E.J.
      • Hopwood F.G.
      • Williams K.L.
      • Wilkins M.R.
      Automatic Poisson peak harvesting for high throughput protein identification.
      ). Ideally, they convert the instrument-specific and technique-specific raw mass spectra into generic lists of monoisotopic mass and intensity, usually after some spectral processing to enhance the peak envelopes (ion current profiles) while reducing electronic and random noise. The m/z value for each peak is usually defined by its centroid (the center of mass of the peak), determined above a certain fraction of the peak height such as 50%, a variable that is selected to avoid the inclusion of noise at the baseline and to allow some failure to fully resolve adjacent peaks. MALDI-generated peptide ions are predominantly singly charged, but ESI-generated ions usually carry multiple charges, so the isotopic peaks of ESI-generated ions are separated by fractional m/z values. A challenge is to correctly identify the first peak in each isotopic cluster. For well-resolved, isolated peptide ions of molecular mass less than ∼1500 Da with good signal-to-noise ratios, this is relatively straightforward as the first peak in each cluster, i.e. the monoisotopic peak (
      • Carr S.A.
      • Burlingame A.L.
      • Baldwin M.A.
      The meaning and usage of the terms monoisotopic mass, average mass, mass resolution and mass accuracy for measurement of biomolecules.
      ), is the most abundant. Above 2000 Da this is no longer true, and the correct identification of the monoisotopic peak becomes progressively more difficult with increasing mass, particularly if it falls below a threshold selected to discriminate between actual peaks and background noise. This problem is compounded by the overlapping isotopic clusters likely to occur in the spectra of peptide mixtures.
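      The centroid definition given above (center of mass of the peak, taken above a chosen fraction of the peak height) can be sketched as an intensity-weighted mean. This is a minimal illustration for a single isolated peak profile; real peak-picking algorithms, which are often proprietary, also handle smoothing, baseline, and overlapping peaks. The function name and the `fraction` parameter are assumptions.

```python
def centroid(mz_values, intensities, fraction=0.5):
    """Intensity-weighted mean m/z of points above `fraction` of the apex.

    Assumes the two lists describe one isolated peak profile.
    """
    if not intensities or max(intensities) <= 0:
        raise ValueError("peak profile has no positive intensity")
    threshold = fraction * max(intensities)
    # Keep only points at or above the threshold to exclude baseline noise.
    kept = [(m, i) for m, i in zip(mz_values, intensities) if i >= threshold]
    total = sum(i for _, i in kept)
    return sum(m * i for m, i in kept) / total


if __name__ == "__main__":
    mz = [999.8, 999.9, 1000.0, 1000.1, 1000.2]
    signal = [10, 60, 100, 60, 10]
    print(round(centroid(mz, signal), 4))
```

      Raising `fraction` discards more of the peak flanks, which reduces the influence of baseline noise and of partially resolved neighbors at the cost of using fewer data points.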

      SETTING VARIABLE PARAMETERS AND RANKING HITS

      Once the experimental data have been recorded and smoothed, the baseline possibly subtracted, and the peaks extracted and possibly filtered according to criteria that exclude nonpeptide peaks, the data must be searched against an appropriate database, as selected by the researcher. All search engines allow for the setting of a number of variables such as protein molecular mass range, pI range, mass tolerance for peaks in a normal spectrum, mass tolerance for CID precursor ions, mass tolerance for fragment ions, number of charges, number of peptides required for a match, and possible modifications to certain residues such as alkylation of cysteine or oxidation of methionine. The program then returns a series of hits, ranked according to one of a number of possible criteria. Unfortunately, there are presently no universal standards for scoring the output from these programs, so deciding what is a significant match is not straightforward. The information provided by the search engines is variable, e.g. the scoring within Protein Prospector has been based on MOWSE (Molecular Weight Search), which allows for the nonuniform distribution of peptide sizes that results from proteolysis (
      • Pappin D.J.
      • Hojrup P.
      • Bleasby A.J.
      Rapid identification of proteins by peptide-mass fingerprinting..
      ). This is not particularly informative, but some researchers have specified a MOWSE score of at least 100 for a hit from peptide mass mapping to be significant. Note however that no MOWSE threshold can be defined above which a match is definitely correct and below which it probably is not. At the University of California, San Francisco we are developing a new scoring system for Protein Prospector based on the frequency of observing certain ion types in MS/MS spectra, backbone ions such as y and b scoring higher than side-chain ions such as d, v, and w. Other algorithms that employ probability-based scoring systems including Sequest (
      • Eng J.K.
      • McCormack A.L.
      • Yates J.R.
      An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database.
      ), Mascot (
      • Perkins D.N.
      • Pappin D.J.C.
      • Creasy D.M.
      • Cottrell J.S.
      Probability-based protein identification by searching sequence databases using mass spectrometry data..
      ), and Sonar (
      • Field H.I.
      • Fenyo D.
      • Beavis R.C.
      RADARS, a bioinformatics solution that automates proteome mass spectral analysis, optimises protein identification, and archives data in a relational database..
      ) are being developed to provide more reliable significance thresholds, but again putative matches close to the threshold may well be incorrect.
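      The backbone b and y ions referred to above can be predicted from a peptide sequence with simple arithmetic. The sketch below computes singly charged monoisotopic b and y ion m/z values for an unmodified peptide; it illustrates the fragment prediction step only and does not reproduce the scoring of any search engine named here. The function name is an assumption.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(peptide):
    """Singly charged b and y ion m/z values for an unmodified peptide."""
    residues = [MONO[aa] for aa in peptide.upper()]
    b_ions, y_ions = [], []
    for i in range(1, len(residues)):
        # b ion: first i residues plus a proton.
        b_ions.append(sum(residues[:i]) + PROTON)
        # y ion: last i residues plus water plus a proton.
        y_ions.append(sum(residues[-i:]) + WATER + PROTON)
    return {"b": b_ions, "y": y_ions}


if __name__ == "__main__":
    ions = fragment_ions("GAK")
    print([round(m, 4) for m in ions["b"]])
    print([round(m, 4) for m in ions["y"]])
```

      A search engine compares lists like these with the observed CID spectrum within the fragment mass tolerance; how the matches are weighted is where the scoring schemes differ.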
      A number of groups have addressed these problems by developing statistical tools that improve the scoring for existing search engines, particularly Sequest (
      • MacCoss M.J.
      • Wu C.C.
      • Yates III, J.R.
      Probability-based validation of protein identifications using a modified SEQUEST algorithm..
      • Keller A.
      • Nesvizhskii A.I.
      • Kolker E.
      • Aebersold R.
      Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search..
      • Anderson D.C.
      • Li W.
      • Payan D.G.
      • Noble W.S.
      A new algorithm for the evaluation of shotgun peptide sequencing in proteomics: Support vector machine classification of peptide MS/MS spectra and SEQUEST scores..
      ), while others have developed independent tools (
      • Hansen B.T.
      • Jones J.A.
      • Mason D.E.
      • Liebler D.C.
      SALSA: A pattern recognition algorithm to detect electrophile-adducted peptides by automated evaluation of CID spectra in LC-MS-MS analyses..
      • Colinge J.
      • Masselot A.
      • Giron M.
      • Dessigny T.
      • Magnin J.
      OLAV: towards high throughput tandem mass spectrometry data identification..
      • Havilio M.
      • Haddad Y.
      • Smilansky Z.
      Intensity-based statistical scorer for tandem mass spectrometry..
      • Nesvizhskii A.I.
      • Keller A.
      • Kolker E.
      • Aebersold R.
      A statistical model for identifying proteins by tandem mass spectrometry..
      ). Ideally, the probabilities assigned will be independent of the mass spectrometer used and the search engine used. Some approaches focus on improved peptide identification, and they generally employ a training set to derive a knowledge base of peptide fragmentation rules. An explicit objective of most of these methods is to make high-throughput experiments more reliable and to minimize the need for time-consuming, visual inspection of data and its subjective, manual interpretation. Such an approach should facilitate the analysis of entire datasets from experiments involving multiple LC-MS/MS runs with perhaps tens of thousands of MS/MS spectra. One method described uses an expectation maximization algorithm to assign a true probability value to each peptide identification. An analysis of the whole dataset identifies “sibling” peptides originating from a single protein. A higher number of sibling peptides observed for a given protein increases the probability of identifying that protein correctly. Thus based on statistical principles, protein identifications based on single peptides carry less weight than those based on multiple peptides (
      • Nesvizhskii A.I.
      • Keller A.
      • Kolker E.
      • Aebersold R.
      A statistical model for identifying proteins by tandem mass spectrometry..
      ). Consideration has also been given to the case in which a particular peptide sequence is part of more than one protein, as in many enzyme families.
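      The weight carried by sibling peptides can be illustrated with a minimal sketch. It assumes, purely for illustration, that each peptide assignment has a probability of being correct, that the assignments are independent, and that the protein is present if at least one assignment is correct. The function name and the example probabilities are hypothetical, and this simplification is not the full published model.

```python
from math import prod

def protein_probability(peptide_probs):
    """Probability that at least one peptide assignment is correct,
    assuming (as a simplification) that the assignments are independent."""
    if not peptide_probs:
        return 0.0
    for p in peptide_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # 1 minus the probability that every assignment is wrong
    return 1.0 - prod(1.0 - p for p in peptide_probs)

# Hypothetical peptide probabilities for two proteins
single_hit = protein_probability([0.7])
three_siblings = protein_probability([0.7, 0.6, 0.5])
```

      Under these assumptions the protein supported by three moderate peptide matches scores higher than the protein supported by a single match of the same quality, which is the behavior described above.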
      Experience shows that for any analysis of a digest from a single protein, there will be mass spectrometric peaks observed that are not matched by the search engine. Some of these may be identified as nonpeptide peaks by virtue of fractional mass values (mass defects) that fall outside an allowable window for the elemental compositions of normal peptide ions. Others may be peptides from the targeted protein but resulting from nonspecific cleavages or carrying posttranslational or chemical modifications. However, in very many cases, a sample cut from a single gel spot or a fraction corresponding to a single peak in an HPLC run will actually contain multiple proteins, and only a subset of peptides will match any given protein hit. If peptide mass fingerprinting is employed, the peptides that match to the top-scoring hit can be subtracted, then the amended peak list can be searched again for second and subsequent proteins. However, MS/MS is better able to differentiate between peptide and nonpeptide signals and can also identify modified peptides. Multiple proteins will be detected more successfully when MS measurements are coupled with HPLC separation, either on-line or off-line, in part because this allows a wider dynamic range and because it reduces suppression effects, i.e. the selective ionization of only a subset of the peptides present in a mixture.
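      The subtract-and-search-again strategy for peptide mass fingerprinting can be sketched as follows. This is a toy illustration only: the scoring is a simple count of matching peaks rather than the probabilistic scoring used by real search engines, and the function names, the tolerance, the minimum number of matches, and all masses are invented for the example.

```python
def match_peaks(peaks, theoretical, tol_ppm=50.0):
    """Return the observed peaks lying within tol_ppm of any theoretical mass."""
    return [
        m for m in peaks
        if any(abs(m - t) / t * 1e6 <= tol_ppm for t in theoretical)
    ]

def iterative_fingerprint(peaks, database, tol_ppm=50.0, min_matches=3):
    """Pick the top-scoring protein, subtract its peaks, then search again."""
    remaining = list(peaks)
    candidates = dict(database)
    hits = []
    while candidates:
        scored = {name: match_peaks(remaining, masses, tol_ppm)
                  for name, masses in candidates.items()}
        best = max(scored, key=lambda name: len(scored[name]))
        if len(scored[best]) < min_matches:
            break  # nothing left that is supported by enough peaks
        hits.append((best, scored[best]))
        # Amend the peak list by removing peaks explained by this hit
        remaining = [m for m in remaining if m not in scored[best]]
        del candidates[best]
    return hits, remaining

# Toy data: invented masses, not real peptides
toy_db = {
    "protein_A": [842.51, 1045.56, 1179.60, 1475.79, 2211.10],
    "protein_B": [927.49, 1296.68, 1570.68, 1638.86],
    "protein_C": [1001.50, 1300.70, 1800.90],
}
observed = [842.51, 1045.57, 1179.60, 1475.80, 927.49, 1296.68, 1570.68, 999.99]
hits, unmatched = iterative_fingerprint(observed, toy_db)
```

      Peaks that remain in `unmatched` after the last round correspond to the unassigned signals discussed above, which may be nonpeptide peaks, modified peptides, or products of nonspecific cleavage.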

      THE IMPACT OF EXPERIMENTAL DESIGN AND METHODOLOGY ON THE QUALITY OF THE RESULTS

      In considering the reliability of protein identifications by MS, a variety of sources of error must be considered. First, there is contamination by other proteins during the initial isolation and digestion of the sample, which is difficult to avoid for low-level samples, e.g. during gel electrophoresis, when cutting spots from the gel, from sample tubes, pipettes, and buffers, or from extraneous matter dropping into the sample (such as hair, skin, material from clothing, or laboratory dust). Then there is the need to extract the peptides efficiently and to prepare the sample for MS while minimizing the presence of salts or detergents that may adversely affect ionization. The analysis technique chosen should be appropriate for the sample, e.g. a highly complex mixture should not be analyzed without a separative step, either prior to analysis or on-line as in LC-MS/MS. If the preceding steps are optimized, the mass spectral method is appropriate, the instrument is well maintained, and suitable calibration spectra are run, the data obtained should be as good as can be.
      In submitting the data to a search engine, the researcher must first select a suitable database and must ensure that all the variables are set appropriately. If possible, and particularly if the user is not very familiar with the relevant search engine, multiple searches should be submitted for the same dataset using different parameter values, and ideally using different search engines. At the University of California, San Francisco, we frequently compare hits from both Mascot and Protein Prospector. Strong protein hits are usually returned by both search engines and may become stronger because of additional peptide identifications. Weak hits based on single weak peptides may also improve if the other search engine finds a different peptide, which can happen as we consider some different ion types with Protein Prospector, or even if both searches identify the same peptide. Conversely, we are less likely to believe weak hits that are found by only one search engine. For high-throughput analysis, it is not realistic to review every spectrum, but a selection of spectra should be scrutinized, especially any that yield particularly important or unexpected findings. In these, the signal-to-noise ratio should be examined, the presence of appropriate isotope peaks should be confirmed, and the isotope spacing should be checked to establish that charge states and mass assignments are correct. However, this will only be meaningful if the raw data can be examined, rather than smoothed, de-isotoped, or otherwise manipulated versions of the spectrum.
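      The isotope-spacing check can be made concrete with a short sketch. Adjacent peaks in an isotope cluster are separated by roughly the 13C/12C mass difference divided by the charge, so the spacing reveals the charge state and hence the neutral mass. The function names, the tolerance, and the example m/z values below are illustrative assumptions, not part of any particular instrument software.

```python
C13_C12_DIFF = 1.003355   # mass difference between 13C and 12C, in Da
PROTON_MASS = 1.007276    # mass of a proton, in Da

def charge_from_isotope_spacing(isotope_mz, max_charge=6, tol=0.01):
    """Infer the charge state from consecutive isotope peak m/z values.

    Returns None if fewer than two peaks are given or if the mean spacing
    does not agree with any charge from 1 to max_charge within tol (in m/z).
    """
    if len(isotope_mz) < 2:
        return None
    peaks = sorted(isotope_mz)
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    for z in range(1, max_charge + 1):
        if abs(mean_spacing - C13_C12_DIFF / z) <= tol:
            return z
    return None

def monoisotopic_neutral_mass(mono_mz, z):
    """Convert the m/z of a protonated monoisotopic ion to its neutral mass."""
    return z * (mono_mz - PROTON_MASS)

# Hypothetical isotope cluster read from a raw (not de-isotoped) spectrum
cluster = [785.842, 786.344, 786.845]
z = charge_from_isotope_spacing(cluster)
mass = monoisotopic_neutral_mass(cluster[0], z)
```

      A spacing of about 0.5 m/z units, as in this example, indicates a doubly charged ion. A check of this kind is only possible when the isotope peaks are still present in the data examined.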
      The information derived from search results will depend on the nature of the sample, the techniques used, and the particular search engine, but there are some general principles to consider. For peptide mass fingerprinting, the peptides identified should provide adequate sequence coverage. Thus the matching of four peptides representing 10% of the sequence does not constitute a reliable hit and should not be listed as a positive identification. In contrast, six peptides representing 20% of the sequence may be adequate for a tentative identification. At this level, there is no definitive rule, so users should err on the side of caution. Bearing in mind the ever-increasing size of databases and the risk of ambiguous or incorrect results, it should be considered essential to obtain corroborating data from at least one CID spectrum. For protein identifications based on CID data, it is important to appreciate that the first step requires correct peptide identifications. Only if this is achieved can protein identifications have any relevance. Ideally, an examination of a peptide match should reveal that the fragment peaks matched tend to be the stronger ones within the CID spectrum, preferably distributed throughout the m/z range rather than all coming from the low-mass ions. The spectrum and its fragmentation should also show some chemical logic, e.g. histidine represents another site for protonation and gives higher charge states in ESI, C-terminal arginine favors y rather than b ion formation, cleavage N-terminal to a proline is favored, etc. It must also be appreciated that the protein is a gene product that may represent one of a number of forms. If it is truncated, this will be revealed only if the appropriate terminus is identified in a peptide, and even then it is important to consider that this peptide could correspond to a nonspecific cleavage. Alternative splice variants will only be established unambiguously if an amino acid sequence corresponding to the actual splice point is observed in a peptide.
      No structural or sequence inferences can be based on the failure to observe a specific peptide; such a peptide may arise from proteolysis but simply not be detected, it may have an unexpected mass due to one or more modifications, or the residues may indeed be present in the sequence but the proteolysis may not proceed as anticipated. It is equally important to note that the absence of evidence for the presence of a particular protein does not definitively establish the absence of the protein. Even though a protein may be found in one sample but not another, it is not correct to describe the protein as being present in sample A but absent from sample B. This is something that simply cannot be proved by MS.
      Then there is the question of supporting information. Until recently, most mass spectrometric experiments were directed toward identifying small numbers of proteins. In such cases, it would be conventional to confirm any identification by obtaining an appropriate antibody and running a Western blot. Finally, the creation of a knockout mouse might be necessary to ascertain whether there was any biological relevance, which for the identification of neural cell adhesion molecule 1 as a binding partner of the prion protein proved not to be the case (
      • Schmitt-Ulms G.
      • Legname G.
      • Baldwin M.A.
      • Ball H.L.
      • Bradon N.
      • Bosque P.J.
      • Crossin K.L.
      • Edelman G.M.
      • DeArmond S.J.
      • Cohen F.E.
      • Prusiner S.B.
      Binding of neural cell adhesion molecules (N-CAMs) to the cellular prion protein..
      ). Such confirmations are impossible in high-throughput proteomic experiments; researchers therefore have a responsibility either to discard marginal results or to present the full details of the search output so that others can make their own evaluations.

      PROTEIN QUANTITATION

      Although many experiments in proteomics only require that proteins be identified, quantitation can also be important, particularly for differential analysis, such as in comparisons of samples from diseased and healthy states. Prior to the widespread adoption of MS, comparison of the intensity of staining of spots on two-dimensional gels by densitometry was a common approach to this problem. The use of gels continued with mass spectrometric analysis and was extensively automated by some workers (
      • Binz P.-A.
      • Muller M.
      • Walther D.
      • Bienvenut W.V.
      • Gras R.
      • Hoogland C.
      • Bouchet G.
      • Gasteiger E.
      • Fabbretti R.
      • Gay S.
      • Palagi P.
      • Wilkins M.R.
      • Rouge V.
      • Tonella L.
      • Paesano S.
      • Rossellat G.
      • Karmime A.
      • Bairoch A.
      • Sanchez J.-C.
      • Appel R.D.
      • Hochstrasser D.F.
      A molecular scanner to automate proteomic research and to display proteome images..
      ). Nevertheless, this approach to quantitation may suffer from limited dynamic range for staining, uncertainties in protein identifications, and the danger that the darkening of a spot could be due to the presence of a new protein rather than up-regulation of the protein believed to be responsible. Although it is now commonplace to use mass spectrometric analysis of spots that show differential staining, problems with the identification of low-level proteins in mixtures may persist. With differential gel electrophoresis (
      • Unlu M.
      • Morgan M.E.
      • Minden J.S.
      Difference gel electrophoresis: A single gel method for detecting changes in protein extracts..
      ), which is commercially available as the DIGE system (Amersham Biosciences, Piscataway, NJ), the use of spectrally resolvable fluorescent dyes allows up to three samples to be separated and quantitated on a single two-dimensional gel. This employs software that automatically locates and analyzes protein spots, assigning statistical confidence to observed differences.
      A different approach relies on the power of MS to separate and identify differentially isotopically labeled species. Some researchers have compared proteins from cells grown in different isotopic environments, e.g. incorporating 15N or 13C into one population of proteins, the peptides from which could be distinguished from analogs containing the lighter isotope 14N or 12C by their mass differences (
      • Oda Y.
      • Huang K.
      • Cross F.R.
      • Cowburn D.
      • Chait B.T.
      Accurate quantitation of protein expression and site-specific phosphorylation..
      ,
      • Ong S.E.
      • Blagoev B.
      • Kratchmarova I.
      • Kristensen D.B.
      • Steen H.
      • Pandey A.
      • Mann M.
      Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics..
      ). Another isotope used in the MS analysis of peptides for at least 20 years is 18O (
      • Desiderio D.M.
      • Kai M.
      Preparation of stable isotope-incorporated peptide internal standards for field desorption mass spectrometry quantification of peptides in biologic tissue..
      ). Digestion of a protein in 18O-enriched water results in the heavy isotope being incorporated at the C terminus of each new peptide formed, with the exception of the pre-existing protein C terminus. The use of a 1:1 mixture of heavy and light water results in mass spectral doublets for all CID fragments containing a peptide C terminus, e.g. y ions, allowing them to be distinguished from b ions and other N-terminal fragments. Comparative proteomics using digestion in 18O-water was given a new twist with a method described as “inverse labeling.” By carrying out two parallel but isotopically reversed experiments then subtracting one spectrum from the other, only peptides derived from proteins differing in abundance remained to be analyzed (
      • Wang Y.K.
      • Ma Z.
      • Quinn D.F.
      • Fu E.W.
      Inverse 18O labeling mass spectrometry for the rapid identification of marker/target proteins..
      ). A complicating factor is further exchange of isotopically labeled oxygen catalyzed by the protease that can lead to the presence of two 18O per carboxyl group, rather than one. However, this can be decoupled from proteolysis (
      • Yao X.
      • Afonso C.
      • Fenselau C.
      Dissection of proteolytic 18O labeling: endoprotease-catalyzed 16O-to-18O exchange of truncated peptide substrates..
      ) and is advantageous as it gives a mass difference of 4 Da rather than 2, thereby ensuring more effective separation of the natural isotopic clusters.
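      The 2 Da and 4 Da mass differences follow directly from the isotope masses, as the short sketch below shows. The monoisotopic masses are standard values; the function name and the restriction to one or two labels per C-terminal carboxyl group are assumptions made for the illustration.

```python
O16 = 15.994915   # monoisotopic mass of 16O, in Da
O18 = 17.999161   # monoisotopic mass of 18O, in Da

def o18_pair_spacing(n_labels, charge):
    """m/z spacing between the unlabeled and 18O-labeled forms of a peptide ion."""
    if n_labels not in (1, 2):
        raise ValueError("a C-terminal carboxyl group can carry one or two 18O atoms")
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return n_labels * (O18 - O16) / charge

one_label = o18_pair_spacing(1, 1)    # single incorporation, singly charged
two_labels = o18_pair_spacing(2, 1)   # double incorporation, singly charged
two_labels_2plus = o18_pair_spacing(2, 2)
```

      For a doubly charged ion the spacing is halved, so two incorporated 18O atoms give the same m/z separation as a single label on a singly charged ion.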
      Another method to introduce stable isotopes is to differentially label a specific amino acid such as cysteine with a chemical reagent containing light or heavy isotopes. Such reagents may incorporate an isotope-coded affinity tag (ICAT) to enable enrichment of the labeled peptides, as well as to permit measurement of relative abundance (
      • Gygi S.P.
      • Rist B.
      • Gerber S.A.
      • Turecek F.
      • Gelb M.H.
      • Aebersold R.
      Quantitative analysis of complex protein mixtures using isotope-coded affinity tags..
      ). The strength of the isotope labeling strategy is that it allows specific markers to be incorporated into two populations of proteins that can then be mixed together before any extraction, digestion, peptide separation, and analysis is undertaken. As long as peptides incorporating the different isotopes have virtually identical physical and chemical properties, the ratios of peak heights or areas from MS experiments will be proportional to the relative amounts of each protein. In practice some isotopes are better than others, e.g. differences in polarity between hydrogen and deuterium affect the retention times in HPLC whereas the incorporation of 12C/13C labels in peptide pairs has no measurable effect (
      • Hansen K.C.
      • Schmitt-Ulms G.
      • Chalkley R.J.
      • Hirsch J.
      • Baldwin M.A.
      • Burlingame A.L.
      Mass spectrometric analysis of protein mixtures at low levels using cleavable 13C-isotope coded affinity tag and multidimensional chromatography..
      ). Because there is a distribution of naturally occurring heavy isotopes in all peptides, it is necessary to add the heavy isotope in sufficient numbers to completely separate the isotopic patterns for the peptides derivatized with light and heavy reagents. The original ICAT reagent used eight deuterium atoms and a more recent variant uses nine 13C atoms. Special software for quantitation is becoming available from instrument manufacturers and on the websites that provide MS proteomics tools. In carefully conducted experiments, the ratios of intensities for peptide pairs can be measured to within about 10%, although weaker peptides or those affected by overlapping peaks may fall well outside this. ICAT methodology is applicable to high-throughput experiments and can be combined with statistical methods that optimize protein identification (
      • von Haller P.D.
      • Yi E.
      • Donohoe S.
      • Vaughn K.
      • Keller A.
      • Nesvizhskii A.I.
      • Eng J.
      • Li X.-J.
      • Goodlett D.R.
      • Aebersold R.
      • Watts J.D.
      The application of new software tools to quantitative protein profiling via isotope-coded affinity tag (ICAT) and tandem mass spectrometry: II. Evaluation of tandem mass spectrometry methodologies for large-scale protein analysis, and the application of statistical tools for data analysis and interpretation..
      ) as has been described for a dataset derived from lipid raft-associated proteins (
      • von Haller P.D.
      • Yi E.
      • Donohoe S.
      • Vaughn K.
      • Keller A.
      • Nesvizhskii A.I.
      • Eng J.
      • Li X.-J.
      • Goodlett D.R.
      • Aebersold R.
      • Watts J.D.
      The application of new software tools to quantitative protein profiling via isotope-coded affinity tag (ICAT) and tandem mass spectrometry: I. statistically annotated datasets for peptide sequences and proteins identified via the application of ICAT and tandem mass spectrometry to proteins copurifying with T cell lipid rafts..
      ).
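      Relative quantitation from isotope-labeled peptide pairs reduces to forming heavy-to-light intensity ratios and summarizing them for each protein. The sketch below is a minimal illustration: the function name, the use of the median as the summary statistic, and the intensity values are all assumptions rather than the behavior of any specific quantitation software.

```python
from statistics import median

def protein_ratios(peptide_pairs):
    """Summarize heavy/light intensity ratios per protein.

    peptide_pairs: iterable of (protein, light_intensity, heavy_intensity).
    Returns {protein: (median_ratio, n_pairs)}. Pairs with a non-positive
    intensity are skipped because no ratio can be formed from them.
    """
    by_protein = {}
    for protein, light, heavy in peptide_pairs:
        if light <= 0 or heavy <= 0:
            continue
        by_protein.setdefault(protein, []).append(heavy / light)
    return {p: (median(r), len(r)) for p, r in by_protein.items()}

# Invented peak intensities for light/heavy peptide pairs
pairs = [
    ("protein_X", 1000.0, 2000.0),
    ("protein_X", 500.0, 1100.0),
    ("protein_X", 800.0, 1520.0),
    ("protein_Y", 400.0, 400.0),
    ("protein_Y", 0.0, 350.0),   # light partner not detected, so skipped
]
summary = protein_ratios(pairs)
```

      Reporting the number of pairs alongside each ratio makes it clear when a protein ratio rests on a single peptide pair, which is more vulnerable to weak signals or overlapping peaks.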
      Note that the methods described above yield only “relative” quantities, whereas “absolute” protein quantitation requires a technique such as stable isotope dilution in which known amounts of isotopically labeled peptides are added to the digest (
      • Barr J.R.
      • Maggio V.L.
      • Patterson Jr., D.G.
      • Cooper G.R.
      • Henderson L.O.
      • Turner W.E.
      • Smith S.J.
      • Hannon W.H.
      • Needham L.L.
      • Sampson E.J.
      Isotope dilution-mass spectrometric quantification of specific proteins: model application with apolipoprotein A-I..
      ). Such peptides are designed to mimic the peptides formed by proteolysis of the proteins of interest and can carry chemical modifications to simulate modified proteins such as phosphoproteins (
      • Gerber S.A.
      • Rush J.
      • Stemman O.
      • Kirschner M.W.
      • Gygi S.P.
      Absolute quantitation of proteins and phosphoproteins from cell lysates by tandem mass spectrometry..
      ).
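      The arithmetic of stable isotope dilution is simple: the native peptide amount is the amount of labeled standard added, scaled by the ratio of the native to the labeled signal. The sketch below assumes that the labeled and native peptides give an equal response, that digestion is complete, and that one mole of peptide is released per mole of protein; the function name and the numbers are hypothetical.

```python
def absolute_amount(native_area, standard_area, standard_amount_fmol):
    """Estimate the native peptide amount (fmol) from a spiked labeled standard.

    Assumes equal response for the native and labeled peptides.
    """
    if standard_area <= 0:
        raise ValueError("the labeled standard must be detected")
    if native_area < 0 or standard_amount_fmol < 0:
        raise ValueError("areas and amounts cannot be negative")
    return standard_amount_fmol * native_area / standard_area

# Hypothetical peak areas with 50 fmol of labeled peptide added to the digest
native_fmol = absolute_amount(native_area=3.0e5,
                              standard_area=1.5e5,
                              standard_amount_fmol=50.0)
```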

      CHEMICAL AND POSTTRANSLATIONAL MODIFICATIONS

      MS is ideal for the study of covalent modifications as all such modifications involve a change in molecular mass, which is reflected in the mass of any peptide carrying the modified amino acid(s). Sometimes this is uniquely attributable to the chemical transition, such as -2 Da for disulfide bond formation or +42 Da for acetylation. Some changes are easily identified, such as +16 Da for the addition of oxygen, but it may be unclear whether this is the common formation of methionine sulfoxide or something more unusual. It is also uncertain whether this is a posttranslational modification or a chemical change occurring during sample processing. Other mass changes may be more ambiguous, e.g. +80 Da could represent phosphorylation or sulfation, although these may be distinguished by the detection of an anion at m/z 79 versus 80, respectively, and a very accurate mass measurement. Specialized MS/MS techniques have been developed for the identification of the precursor ions for peptides that carry specific modifications such as phosphorylation or glycosylation (
      • Huddleston M.J.
      • Annan R.S.
      • Bean M.F.
      • Carr S.A.
      Selective detection of phosphopeptides in complex mixtures by electrospray liquid chromatography/mass spectrometry..
      • Carr S.A.
      • Huddleston M.J.
      • Annan R.S.
      Selective detection and sequencing of phosphopeptides at the femtomole level by mass spectrometry..
      • Huddleston M.J.
      • Bean M.F.
      • Carr S.A.
      Collisional fragmentation of glycopeptides by electrospray ionization LC/MS and LC/MS/MS: Methods for selective detection of glycopeptides in protein digests..
      ). Some modifications may be quite heterogeneous, as with N-glycosylation, although enzymatic removal of the modified group can give a more defined change, e.g. peptide N-glycosidase F removes N-linked sugars and converts asparagine to aspartic acid (+1 Da). The major difficulties arise with the search for transient modifications of low stoichiometry, such as phosphorylation occurring in a regulatory role. Affinity methods such as immobilized metal ion chromatography have been employed to enrich phosphopeptides, but with mixed success (
      • Neville D.C.
      • Rozanas C.R.
      • Price E.M.
      • Gruis D.B.
      • Verkman A.S.
      • Townsend R.R.
      Evidence for phosphorylation of serine 753 in CFTR using a novel metal-ion affinity resin and matrix-assisted laser desorption mass spectrometry..
      ,
      • Posewitz M.C.
      • Tempst P.
      Immobilized gallium(III) affinity chromatography of phosphopeptides..
      ). The difficulties are exacerbated when the modification is labile, either during chemical separation or in the mass spectrometer. CID and MS/MS of the modified peptide should in principle reveal the site of modification, but a facile loss of phosphoric acid from the molecular ion may prevent the detection of any sequence-specific ions. In general, peptide mass mapping alone cannot be used for identification of posttranslational modifications. Based on MS data alone, a search can be made for phosphopeptide mass values that are 80 Da higher than calculated for the amino acid sequence (or 160 Da, etc.), but MS/MS to confirm the sequence is essential, even if the precise site of modification cannot be defined. Other more labile modifications requiring special techniques for detection are sulfation (
      • Wolfender J.-L.
      • Chu F.
      • Ball H.
      • Wolfender F.
      • Fainzilber M.
      • Baldwin M.A.
      • Burlingame A.L.
      Identification of tyrosine sulfation in Conus pennaceus conotoxins α-Pn1A and α-Pn1B: Further investigation of labile sulfo- and phosphopeptides by electrospray, matrix assisted laser desorption/ionization (MALDI) and atmospheric pressure MALDI mass spectrometry..
      ) and O-GlcNAc (
      • Wells L.
      • Hart G.W.
      O-GlcNAc turns twenty: Functional implications for post-translational modification of nuclear and cytosolic proteins with a sugar..
      ,
      • Chalkley R.J.
      • Burlingame A.L.
      Identification of novel sites of O-N-acetylglucosamine modification of serum response factor using quadrupole time-of-flight mass spectrometry..
      ).
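      The mass shifts discussed in this section can be looked up programmatically, which also shows why phosphorylation and sulfation require very accurate mass measurement to be told apart. The sketch below uses standard monoisotopic mass shifts; the table is deliberately short, and the function name, the tolerances, and the example masses are illustrative assumptions.

```python
# Monoisotopic mass shifts in Da for a few common modifications
MODIFICATIONS = {
    "disulfide bond formation": -2.015650,   # loss of 2 H
    "deamidation (Asn to Asp)": 0.984016,
    "oxidation": 15.994915,                  # addition of O
    "acetylation": 42.010565,                # addition of C2H2O
    "phosphorylation": 79.966331,            # addition of HPO3
    "sulfation": 79.956815,                  # addition of SO3
}

def candidate_modifications(observed_mass, unmodified_mass, tol_da=0.5):
    """List modifications whose mass shift matches the observed difference."""
    delta = observed_mass - unmodified_mass
    return sorted(name for name, shift in MODIFICATIONS.items()
                  if abs(delta - shift) <= tol_da)

# Hypothetical peptide observed about 80 Da above its calculated mass
unmodified = 1500.7000
observed = 1580.6663
low_accuracy = candidate_modifications(observed, unmodified, tol_da=0.5)
high_accuracy = candidate_modifications(observed, unmodified, tol_da=0.003)
```

      With a tolerance of 0.5 Da both phosphorylation and sulfation are returned, whereas a tolerance of a few millidaltons leaves only phosphorylation, because the two shifts differ by less than 0.01 Da. A mass match of this kind only suggests a modification; as noted above, MS/MS is still needed to confirm the sequence.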

      CONCLUSIONS

      It is clear that the field of proteomic analysis by MS continues to be in a very dynamic state, making it presently difficult to specify absolute standards for analytical protocols and subsequent interpretative judgments. Nevertheless, authors submitting MS-based manuscripts to Molecular & Cellular Proteomics must employ methodology that meets currently acceptable standards. Thus, protein identification based solely on peptide mass fingerprinting should no longer be acceptable and must be complemented by CID and MS/MS. Clearly, it is essential that researchers avoid the “black box” mentality and take responsibility for understanding the methods that they are using. Furthermore, it is strongly recommended that authors take the risk of presenting too much information in describing their experiments, their data, and its interpretation. It will then be incumbent on journal editors and reviewers to determine what fraction of this information is relevant and is required for publication. Authors should report the software they use, including the version, the database version, and the probability scores assigned to each protein identification. Molecular & Cellular Proteomics will also encourage the publication of reports that describe enhancements to experimental techniques or novel approaches to database searching and statistical analysis that improve the accuracy and reliability of protein analysis based on mass spectrometric measurements. Although the methods currently in use clearly can work very well, to date their application has been largely empirical. The editors of this journal now consider it essential that researchers in the field develop, adopt, and adhere to a unified, systematic, and rational approach to protein identification.

      Acknowledgments

      I am grateful for constructive input from Molecular & Cellular Proteomics editors Ralph Bradshaw, Al Burlingame, Ruedi Aebersold, and Steve Carr, and from Dennis Hochstrasser and his colleagues at GeneProt.

      REFERENCES

        • Henzel W.J.
        • Billeci T.M.
        • Stults J.T.
        • Wong S.C.
        • Grimley C.
        • Watanabe C.
        Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases..
        Proc. Natl. Acad. Sci. U. S. A. 1993; 90: 5011-5015
        • Clauser K.R.
        • Baker P.
        • Burlingame A.L.
        Role of accurate mass measurement (± 10 ppm) in protein identification strategies employing MS or MS/MS and database searching..
        Anal. Chem. 1999; 71: 2871-2882
        • Preisler J.
        • Hu P.
        • Rejtar T.
        • Karger B.L.
        Capillary electrophoresis-matrix-assisted laser desorption/ionization time-of-flight mass spectrometry using a vacuum deposition interface..
        Anal. Chem. 2000; 72: 4785-4795
        • Preisler J.
        • Hu P.
        • Rejtar T.
        • Moskovets E.
        • Karger B.L.
        Capillary array electrophoresis-MALDI mass spectrometry using a vacuum deposition interface..
        Anal. Chem. 2002; 74: 17-25
        • Krutchinsky A.N.
        • Kalkum M.
        • Chait B.T.
        Automatic identification of proteins with a MALDI-quadrupole ion trap mass spectrometer..
        Anal. Chem. 2001; 73: 5066-5077
        • Dawson J.H.J.
        • Guilhaus M.
        Orthogonal acceleration time of flight mass spectrometer..
        Rapid Commun. Mass Spectrom. 1989; 3: 155-159
        • Dodonov A.F.
        • Chernushevich I.V.
        • Laiko V.V.
        Atmospheric pressure ionization time of flight mass spectrometer.
        Proceedings of the 12th International Mass Spectrometry Conference. 1991: 153
        • Verentchikov A.V.
        • Ens W.
        • Standing K.G.
        Reflecting time-of-flight mass spectrometer with an electrospray ion source and orthogonal extraction..
        Anal. Chem. 1994; 66: 126-133
        • Morris H.R.
        • Paxton T.
        • Dell A.
        • Langhorne J.
        • Berg M.
        • Bordoli R.S.
        • Hoyes J.
        • Bateman R.H.
        High sensitivity collisionally-activated decomposition tandem mass spectrometry on a novel quadrupole/orthogonal-acceleration time-of-flight mass spectrometer..
        Rapid Commun. Mass Spectrom. 1996; 10: 889-896
        • Wada Y.
        • Hayashi A.
        • Masanori F.
        • Katakuse I.
        • Ichihara T.
        • Nakabushi H.
        • Matsuo T.
        • Sakurai T.
        • Matsuda H.
        Characterization of a new fetal hemoglobin variant, Hb F Izumi A gamma 6Glu replaced by Gly, by molecular secondary ion mass spectrometry..
        Biochim. Biophys. Acta. 1983; 749: 244-248
        • Morris H.R.
        • Panico M.
        • Taylor G.W.
        FAB-mapping of recombinant-DNA protein products..
        Biochem. Biophys. Res. Commun. 1983; 117: 299-305
        • James P.
        • Quadroni M.
        • Carafoli E.
        • Gonnet G.
        Protein identification by mass profile fingerprinting..
        Biochem. Biophys. Res. Commun. 1993; 195: 58-64
        • Mann M.
        • Hojrup P.
        • Roepstorff P.
        Use of mass spectrometric molecular weight information to identify proteins in sequence databases..
        Biol. Mass Spectrom. 1993; 22: 338-345
        • Pappin D.J.
        • Hojrup P.
        • Bleasby A.J.
        Rapid identification of proteins by peptide-mass fingerprinting..
        Curr. Biol. 1993; 3: 327-332
        • Yates J.R.
        • Speicher S.
        • Griffin P.R.
        • Hunkapiller T.
        Peptide mass maps: A highly informative approach to protein identification..
        Anal. Biochem. 1993; 214: 397-408
        • Biemann K.
        • Scoble H.A.
        Characterization by tandem mass spectrometry of structural modifications in proteins..
        Science. 1987; 237: 992-998
        • Mann M.
        • Wilm M.
        Error-tolerant identification of peptides in sequence databases by peptide sequence tags..
        Anal. Chem. 1994; 66: 4390-4399
        • Eng J.K.
        • McCormack A.L.
        • Yates J.R.
        An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database.
        J. Amer. Soc. Mass Spectrom. 1994; 5: 976-989
        • Wilm M.
        • Mann M.
        Analytical properties of the nanoelectrospray ion source..
        Anal. Chem. 1996; 68: 1-8
        • Clauser K.R.
        • Baker P.R.
        • Burlingame A.L.
        Peptide fragment-ion tags from MALDI/PSD for error-tolerant searching of genomic databases.
        Proc. 44th ASMS Conf. Mass Spectrom. Allied Topics. 1996: 365
        • Taylor J.A.
        • Johnson R.S.
        Sequence database searches via de novo peptide sequencing by tandem mass spectrometry..
        Rapid Commun. Mass Spectrom. 1997; 11: 1067-1075
        • Huang L.
        • Jacob R.J.
        • Pegg S.C.-H.
        • Baldwin M.A.
        • Wang C.C.
        • Burlingame A.L.
        • Babbitt P.C.
        Functional assignment of the 20S proteasome from T. brucei using mass spectrometry and new bioinformatics approaches..
        J. Biol. Chem. 2001; 276: 28327-28339
        • Biemann K.
        Sequencing of peptides by tandem mass spectrometry and high-energy collision-induced dissociation..
        Methods Enzymol. 1990; 193: 886-888
        • Wen D.X.
        • Livingston B.D.
        • Medzihradszky K.F.
        • Kelm S.
        • Burlingame A.L.
        • Paulson J.C.
        Primary structure of Gal beta 1, 3(4)GlcNAc alpha 2, 3-sialyltransferase determined by mass spectrometry sequence analysis and molecular cloning. Evidence for a protein motif in the sialyltransferase gene family..
        J. Biol. Chem. 1992; 267: 21011-21019
        • Spengler B.
        • Kirsch D.
        • Kaufmann R.
        • Jaeger E.
        Peptide sequencing by matrix-assisted laser-desorption mass spectrometry..
        Rapid Commun. Mass Spectrom. 1992; 6: 105-108
        • Shevchenko A.
        • Chernushevich I.
        • Ens W.
        • Standing K.G.
        • Thomson B.
        • Wilm M.
        • Mann M.
        Rapid ‘de novo’ peptide sequencing by a combination of nanoelectrospray, isotopic labeling and a quadrupole/time-of-flight mass spectrometer..
        Rapid Commun. Mass Spectrom. 1997; 11: 1015-1024
        • Krutchinsky A.N.
        • Loboda A.V.
        • Spicer V.L.
        • Dworschak R.
        • Ens W.
        • Standing K.G.
        Orthogonal injection of matrix-assisted laser desorption/ionization ions into a time-of-flight spectrometer through a collisional damping interface..
        Rapid Commun. Mass Spectrom. 1998; 12: 508-512
        • Krutchinsky A.N.
        • Zhang W.
        • Chait B.T.
        Rapidly switchable matrix-assisted laser desorption/ionization and electrospray quadrupole-time-of-flight mass spectrometry for protein identification..
        J. Amer. Soc. Mass Spectrom. 2000; 11: 493-504
        • Shevchenko A.
        • Loboda A.
        • Shevchenko A.
        • Standing K.G.
        MALDI quadrupole time-of-flight mass spectrometry: a powerful tool for proteomic research..
        Anal. Chem. 2000; 72: 2132-2141
        • Baldwin M.A.
        • Medzihradszky K.F.
        • Lock C.M.
        • Fisher B.
        • Settineri C.A.
        • Burlingame A.L.
        Matrix assisted laser desorption/ionization coupled with quadrupole/orthogonal acceleration time-of-flight mass spectrometry for protein discovery, identification and structural analysis..
        Anal. Chem. 2001; 73: 1707-1720
        • Vestal M.L.
        • Juhasz P.
        • Hines W.
        • Martin S.A.
        A new delayed extraction MALDI-TOF MS-MS for characterization of protein digests.
        in: Mass Spectrometry in Biology and Medicine. Humana Press, Totowa, NJ 2000: 1-16
        • Medzihradszky K.F.
        • Campbell J.M.
        • Baldwin M.A.
        • Falick A.M.
        • Juhasz P.
        • Vestal M.L.
        • Burlingame A.L.
        The characteristics of peptide collision-induced dissociation using a high performance MALDI-TOF/TOF tandem mass spectrometer.
        Anal. Chem. 2000; 72: 552-558
        • Suckau D.
        • Resemann A.
        • Schuerenberg M.
        • Hufnagel P.
        • Franzen J.
        • Holle A.
        A novel MALDI LIFT-TOF/TOF mass spectrometer for proteomics.
        Anal. Bioanal. Chem. 2003; 376: 952-965
        • Colinge J.
        • Magnin J.
        • Dessigny T.
        • Giron M.
        • Masselot A.
        Improved peptide charge state assignment.
        Proteomics. 2003; 3: 1434-1440
        • Ge Y.
        • Lawhorn B.G.
        • El Naggar M.
        • Strauss E.
        • Park J.-H.
        • Begley T.P.
        • McLafferty F.W.
        Top down characterization of larger proteins (45 kDa) by electron capture dissociation mass spectrometry.
        J. Amer. Chem. Soc. 2002; 124: 672-678
        • Handley J.
        Software for MS protein identification.
        Anal. Chem. 2002; 74: 159A-162A
        • Binz P.-A.
        • Muller M.
        • Walther D.
        • Bienvenut W.V.
        • Gras R.
        • Hoogland C.
        • Bouchet G.
        • Gasteiger E.
        • Fabbretti R.
        • Gay S.
        • Palagi P.
        • Wilkins M.R.
        • Rouge V.
        • Tonella L.
        • Paesano S.
        • Rossellat G.
        • Karmime A.
        • Bairoch A.
        • Sanchez J.-C.
        • Appel R.D.
        • Hochstrasser D.F.
        A molecular scanner to automate proteomic research and to display proteome images.
        Anal. Chem. 1999; 71: 4981-4988
        • Washburn M.P.
        • Wolters D.
        • Yates III, J.R.
        Large-scale analysis of the yeast proteome by multidimensional protein identification technology.
        Nat. Biotechnol. 2001; 19: 242-247
        • Huang L.
        • Baldwin M.A.
        • Maltby D.
        • Medzihradszky K.F.
        • Baker P.R.
        • Allen N.
        • Rexach M.
        • Edmondson R.
        • Campbell J.
        • Juhasz P.
        • Martin S.A.
        • Vestal M.L.
        • Burlingame A.L.
        The identification of protein-protein interactions of the nuclear pore complex of S. cerevisiae using high throughput MALDI-TOF/TOF tandem mass spectrometry.
        Mol. Cell. Proteomics. 2002; 1: 434-450
        • Tiller P.R.
        • Land A.P.
        • Jardine I.
        • Murphy D.M.
        • Sozio R.
        • Ayrton A.
        • Schaefer W.H.J.
        Application of liquid chromatography-mass spectrometry(n) analyses to the characterization of novel glyburide metabolites formed in vitro.
        J. Chromatogr. A. 1998; 794: 15
        • Breen E.J.
        • Hopwood F.G.
        • Williams K.L.
        • Wilkins M.R.
        Automatic Poisson peak harvesting for high throughput protein identification.
        Electrophoresis. 2000; 21: 2243-2251
        • Carr S.A.
        • Burlingame A.L.
        • Baldwin M.A.
        The meaning and usage of the terms monoisotopic mass, average mass, mass resolution and mass accuracy for measurement of biomolecules.
        in: Mass Spectrometry in Biology and Medicine. Humana, Totowa, NJ 2000: 553-561
        • Perkins D.N.
        • Pappin D.J.C.
        • Creasy D.M.
        • Cottrell J.S.
        Probability-based protein identification by searching sequence databases using mass spectrometry data.
        Electrophoresis. 1999; 20: 3551-3567
        • Field H.I.
        • Fenyo D.
        • Beavis R.C.
        RADARS, a bioinformatics solution that automates proteome mass spectral analysis, optimises protein identification, and archives data in a relational database.
        Proteomics. 2002; 2: 36-47
        • MacCoss M.J.
        • Wu C.C.
        • Yates III, J.R.
        Probability-based validation of protein identifications using a modified SEQUEST algorithm.
        Anal. Chem. 2002; 74: 5593-5599
        • Keller A.
        • Nesvizhskii A.I.
        • Kolker E.
        • Aebersold R.
        Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search.
        Anal. Chem. 2002; 74: 5383-5392
        • Anderson D.C.
        • Li W.
        • Payan D.G.
        • Noble W.S.
        A new algorithm for the evaluation of shotgun peptide sequencing in proteomics: Support vector machine classification of peptide MS/MS spectra and SEQUEST scores.
        J. Proteome Res. 2003; 2: 137-146
        • Hansen B.T.
        • Jones J.A.
        • Mason D.E.
        • Liebler D.C.
        SALSA: A pattern recognition algorithm to detect electrophile-adducted peptides by automated evaluation of CID spectra in LC-MS-MS analyses.
        Anal. Chem. 2001; 73: 1676-1683
        • Colinge J.
        • Masselot A.
        • Giron M.
        • Dessigny T.
        • Magnin J.
        OLAV: towards high throughput tandem mass spectrometry data identification.
        Proteomics. 2003; 3: 1454-1463
        • Havilio M.
        • Haddad Y.
        • Smilansky Z.
        Intensity-based statistical scorer for tandem mass spectrometry.
        Anal. Chem. 2003; 75: 435-444
        • Nesvizhskii A.I.
        • Keller A.
        • Kolker E.
        • Aebersold R.
        A statistical model for identifying proteins by tandem mass spectrometry.
        Anal. Chem. 2003; 75: 4646-4658
        • Schmitt-Ulms G.
        • Legname G.
        • Baldwin M.A.
        • Ball H.L.
        • Bradon N.
        • Bosque P.J.
        • Crossin K.L.
        • Edelman G.M.
        • DeArmond S.J.
        • Cohen F.E.
        • Prusiner S.B.
        Binding of neural cell adhesion molecules (N-CAMs) to the cellular prion protein.
        J. Mol. Biol. 2001; 314: 1209-1225
        • Unlu M.
        • Morgan M.E.
        • Minden J.S.
        Difference gel electrophoresis: A single gel method for detecting changes in protein extracts.
        Electrophoresis. 1997; 18: 2071-2077
        • Oda Y.
        • Huang K.
        • Cross F.R.
        • Cowburn D.
        • Chait B.T.
        Accurate quantitation of protein expression and site-specific phosphorylation.
        Proc. Natl. Acad. Sci. U. S. A. 1999; 96: 6591-6596
        • Ong S.E.
        • Blagoev B.
        • Kratchmarova I.
        • Kristensen D.B.
        • Steen H.
        • Pandey A.
        • Mann M.
        Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics.
        Mol. Cell. Proteomics. 2002; 1: 376-386
        • Desiderio D.M.
        • Kai M.
        Preparation of stable isotope-incorporated peptide internal standards for field desorption mass spectrometry quantification of peptides in biologic tissue.
        Biomed. Mass Spectrom. 1983; 10: 471-479
        • Wang Y.K.
        • Ma Z.
        • Quinn D.F.
        • Fu E.W.
        Inverse 18O labeling mass spectrometry for the rapid identification of marker/target proteins.
        Anal. Chem. 2001; 73: 3742-3750
        • Yao X.
        • Afonso C.
        • Fenselau C.
        Dissection of proteolytic 18O labeling: endoprotease-catalyzed 16O-to-18O exchange of truncated peptide substrates.
        J. Proteome Res. 2003; 2: 147-152
        • Gygi S.P.
        • Rist B.
        • Gerber S.A.
        • Turecek F.
        • Gelb M.H.
        • Aebersold R.
        Quantitative analysis of complex protein mixtures using isotope-coded affinity tags.
        Nat. Biotechnol. 1999; 17: 994-999
        • Hansen K.C.
        • Schmitt-Ulms G.
        • Chalkley R.J.
        • Hirsch J.
        • Baldwin M.A.
        • Burlingame A.L.
        Mass spectrometric analysis of protein mixtures at low levels using cleavable 13C-isotope coded affinity tag and multidimensional chromatography.
        Mol. Cell. Proteomics. 2003; 2: 299-314
        • von Haller P.D.
        • Yi E.
        • Donohoe S.
        • Vaughn K.
        • Keller A.
        • Nesvizhskii A.I.
        • Eng J.
        • Li X.-J.
        • Goodlett D.R.
        • Aebersold R.
        • Watts J.D.
        The application of new software tools to quantitative protein profiling via isotope-coded affinity tag (ICAT) and tandem mass spectrometry: II. Evaluation of tandem mass spectrometry methodologies for large-scale protein analysis, and the application of statistical tools for data analysis and interpretation.
        Mol. Cell. Proteomics. 2003; 2: 428-442
        • von Haller P.D.
        • Yi E.
        • Donohoe S.
        • Vaughn K.
        • Keller A.
        • Nesvizhskii A.I.
        • Eng J.
        • Li X.-J.
        • Goodlett D.R.
        • Aebersold R.
        • Watts J.D.
        The application of new software tools to quantitative protein profiling via isotope-coded affinity tag (ICAT) and tandem mass spectrometry: I. Statistically annotated datasets for peptide sequences and proteins identified via the application of ICAT and tandem mass spectrometry to proteins copurifying with T cell lipid rafts.
        Mol. Cell. Proteomics. 2003; 2: 426-427
        • Barr J.R.
        • Maggio V.L.
        • Patterson Jr., D.G.
        • Cooper G.R.
        • Henderson L.O.
        • Turner W.E.
        • Smith S.J.
        • Hannon W.H.
        • Needham L.L.
        • Sampson E.J.
        Isotope dilution-mass spectrometric quantification of specific proteins: model application with apolipoprotein A-I.
        Clin. Chem. 1996; 42: 1676-1682
        • Gerber S.A.
        • Rush J.
        • Stemman O.
        • Kirschner M.W.
        • Gygi S.P.
        Absolute quantitation of proteins and phosphoproteins from cell lysates by tandem mass spectrometry.
        Proc. Natl. Acad. Sci. U. S. A. 2003; 100: 6940-6945
        • Huddleston M.J.
        • Annan R.S.
        • Bean M.F.
        • Carr S.A.
        Selective detection of phosphopeptides in complex mixtures by electrospray liquid chromatography/mass spectrometry.
        J. Amer. Soc. Mass Spectrom. 1993; 4: 710-717
        • Carr S.A.
        • Huddleston M.J.
        • Annan R.S.
        Selective detection and sequencing of phosphopeptides at the femtomole level by mass spectrometry.
        Anal. Biochem. 1996; 239: 180-192
        • Huddleston M.J.
        • Bean M.F.
        • Carr S.A.
        Collisional fragmentation of glycopeptides by electrospray ionization LC/MS and LC/MS/MS: Methods for selective detection of glycopeptides in protein digests.
        Anal. Chem. 1993; 65: 877-884
        • Neville D.C.
        • Rozanas C.R.
        • Price E.M.
        • Gruis D.B.
        • Verkman A.S.
        • Townsend R.R.
        Evidence for phosphorylation of serine 753 in CFTR using a novel metal-ion affinity resin and matrix-assisted laser desorption mass spectrometry.
        Protein Sci. 1997; 6: 2436-2445
        • Posewitz M.C.
        • Tempst P.
        Immobilized gallium(III) affinity chromatography of phosphopeptides.
        Anal. Chem. 1999; 71: 2883-2892
        • Wolfender J.-L.
        • Chu F.
        • Ball H.
        • Wolfender F.
        • Fainzilber M.
        • Baldwin M.A.
        • Burlingame A.L.
        Identification of tyrosine sulfation in Conus pennaceus conotoxins α-Pn1A and α-Pn1B: Further investigation of labile sulfo- and phosphopeptides by electrospray, matrix assisted laser desorption/ionization (MALDI) and atmospheric pressure MALDI mass spectrometry.
        J. Mass Spectrom. 1999; 34: 447-454
        • Wells L.
        • Hart G.W.
        O-GlcNAc turns twenty: Functional implications for post-translational modification of nuclear and cytosolic proteins with a sugar.
        FEBS Lett. 2003; 546: 154-158
        • Chalkley R.J.
        • Burlingame A.L.
        Identification of novel sites of O-N-acetylglucosamine modification of serum response factor using quadrupole time-of-flight mass spectrometry.
        Mol. Cell. Proteomics. 2003; 2: 182-190