The Identification of Protein-Protein Interactions of the Nuclear Pore Complex of Saccharomyces cerevisiae Using High Throughput Matrix-assisted Laser Desorption Ionization Time-of-Flight Tandem Mass Spectrometry*

Mass spectrometry has become the technology of choice for detailed identification of proteins in complex mixtures. Although electrophoretic separation, proteolytic digestion, mass spectrometric analysis of unseparated digests, and database searching have become standard methods in widespread use, peptide sequence information obtained by collision-induced dissociation and tandem mass spectrometry is required to establish the most comprehensive and reliable results. Most tandem mass spectrometers in current use employ electrospray ionization. In this work a novel tandem mass spectrometer employing matrix-assisted laser desorption ionization-time-of-flight/time-of-flight operating at 200 Hz has been used to identify proteins interacting with known nucleoporins in the nuclear pore complex of Saccharomyces cerevisiae. Proteins interacting with recombinant proteins as bait were purified from yeast extracts and then separated by one-dimensional SDS-PAGE. Although peptide mass fingerprinting is sometimes sufficient to identify proteins, this study shows the importance of employing tandem mass spectrometry for identifying proteins in mixtures or as covalently modified forms. The rules for incorporating these features into MS-Tag are presented. In addition to providing an evaluation of the sensitivity and overall quality of collision-induced dissociation spectra obtained, standard conditions for ionization and fragmentation have been selected that would allow automatic data collection and analysis, without the need to adjust parameters in a sample-specific fashion. Other considerations essential for successful high throughput protein analysis are discussed.

Over the last five years, rapid advances in commercial instrumentation have brought the power of matrix-assisted laser desorption ionization (MALDI) mass spectrometry (MS) into widespread use for protein identification in biomedical research. This growth reflects the high sensitivity (low femtomole level) that may be achieved, together with the ability to measure mass with high accuracy (<10 ppm) (1,2). Hence, proteins separated by one-dimensional (1-D) and two-dimensional SDS-PAGE and visualized by silver and/or fluorescence stains may be identified readily based on proteolytic digest mass maps and the use of database interrogation strategies (2-4). These features have accelerated the pace of research focused on examining the composition of complex mixtures of proteins including cell lysates (5-8), as well as protein-protein interactions (9-14).
There are many cases when results derived solely from database mass matching of digests are inadequate, incomplete, and/or even unreliable. In these cases, it is essential to have the ability to rapidly sequence peptides present in these digests. Although instrumentation has been available to make such measurements using electrospray ionization for some time (15,16), instrumentation that could obtain peptide sequences with comparable sensitivities and spectral quality using MALDI has not. Hence, investigators have either had to switch ionization methods or have had to rely on the poor quality of spectra obtained from MALDI metastable fragmentation, so-called post-source decay spectra (17,18). Either choice has drawbacks in practice: for electrospray, the need to incorporate a clean-up step before a successful experiment can be carried out; for post-source decay, poor sensitivity, variable and unpredictable spectral quality, and long spectrum acquisition times.
Based on a decade of experience employing four-sector tandem mass spectrometers for peptide sequence determination with singly protonated pseudomolecular ions (19,20) and the advantage of the tolerance of MALDI to salts and buffers (21), the importance of gaining the ability to measure high quality sequence from MALDI was obvious. Proof of principle was established using the first orthogonal acceleration time-of-flight analyzer as MS-2 with a MALDI ion source on a double focusing magnetic sector instrument (MS-1) (22). Studies of peptides and modified peptides, as well as the co-factor of lysyl oxidase (23) and conotoxins (24), established the advantages of MALDI high energy CID spectra for reliable de novo peptide sequence and structure determination (23-26).
This work set the stage for the development of a less expensive MALDI tandem time-of-flight instrument with on-axis ion optics capable of producing four-sector-like quality CID spectra for protein characterization while simultaneously realizing an increase of several orders of magnitude in absolute sensitivity (27). Our laboratories have reported the performance of an engineering prototype elsewhere (28).
In previous studies of the proteomics of the nuclear pore complex, we have employed manual digestions using MALDI mass mapping and/or liquid chromatography electrospray ionization CID MS as necessary for the identification of 135 proteins interacting with eight FG nucleoporins (11,29). The present work describes the power and advantages of employing robotic sample processing, together with peptide sequence analysis using MALDI-TOF/TOF MS, to achieve very rapid analysis of the protein interactions detected using 22 different GST fusion proteins as "baits" in separate experiments.

EXPERIMENTAL PROCEDURES
Instrument Design-A preliminary research version of the TOF/TOF instrument based on the footprint of the standard Applied Biosystems Voyager-DE STR spectrometer was described previously (28). Subsequent improvements have increased the performance such that the resolving power in MS mode for peptide molecular ions in the typical mass range of 1 to 3 kDa is now 12,000-20,000 (full width at half-maximum), and fragment ions above m/z 200 display a mass resolution of 4,000 to 8,000. For the spectra presented here, precursor ion mass selection for MS/MS operation without loss of sensitivity was 1%; subsequent changes have improved this to ~0.5%. With internal calibration the mass measurement accuracy for fragment ions is routinely in the region of 50 mDa for all masses, and 10 mDa may be achieved using multipoint calibration. External calibration typically gives errors within 100 to 200 mDa. All spectra shown here were obtained with a neodymium yttrium aluminum garnet (YAG) laser operating at a wavelength of 354 nm and at a frequency of 200 Hz. The sensitivity for interpretable CID spectra of peptide standards is in the range of 1 to 10 fmol deposited on the target, although this is sufficient material to provide multiple spectra. The instrument is rapidly switchable between MS and MS/MS modes without recalibration (in <2 min), which is valuable for recording survey scans to identify potential precursor ions for subsequent recording of CID MS. A schematic diagram that highlights changes in the design of the instrument is shown in Fig. 1.
The first mass analyzer used for precursor ion selection is essentially as described before, being a dual stage Wiley-McLaren TOF mass spectrometer. Compared with the earlier version, higher resolution has been achieved by increasing the flight length from 26 to 40 cm. This TOF is combined with an improved timed ion selector consisting of two deflection gates, the opening and closing of which is under software control. Ion selection relies upon deflection off the ion optical axis of all ions other than those of the desired m/z value. The extent of this deflection is determined in part by the distance between the timed ion selector and the collision chamber, which in this case is 7 cm. The collision cell that was previously grounded is now electrically floating, allowing the accelerating voltage in TOF1 to be increased above its previous value of 3 kV without necessarily increasing the collision energy. The higher accelerating voltage improves ion transmission in TOF1. The collision energy is defined by the difference between the TOF1 accelerating potential and the potential of the collision cell, 8 and 7 kV, respectively, in these experiments. Thus the laboratory collision energy for all data presented here was 1 keV. After the ions pass through the timed ion selector, a new decelerating lens ensures that the reduction in ion energy upon entry to the collision cell is completed with minimal defocusing and loss of signal. Collision gas is bled into the collision cell through an adjustable leak valve, and the pressure is maintained via differential pumping through 1/8-inch apertures at either end of the cell. No direct measure of the pressure in the collision cell is provided, but the amount of gas added is inferred from the increase in pressure measured by the ionization gauge in the high vacuum region close to the cell. For standard CID operation, the indicated pressure increases from 2.0 × 10^-8 torr (no collision gas) to ~2.0 × 10^-6 torr.
The second section is another pulsed TOF mass analyzer, which provides for recording of high resolution spectra of the fragment ions. This spectrometer has a two-stage acceleration region that increases the ion energy to 15 keV via a rapid high voltage power switch. Only ions that are in the volume of the second source at the time the pulse is applied are accelerated toward the ion mirror. Software calculates the flight time of the precursor ions to the second source; fragment ions formed in the collision cell have the same velocities and thus the same flight times and are accelerated similarly toward the ion mirror. The guide wire that was employed in the earlier instrument has been replaced by a second einzel lens, giving improved resolution and transmission. A pulsed metastable suppressor has been added that can be used to reject all ions with velocity equal to that of the precursor ion, i.e. the precursor ion and any fragment ions formed by its decomposition after acceleration in the second source. New X-Y electrostatic deflectors enable the reflector that was offset previously by 0.5° to be aligned with the major axis of the TOF.
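The statement that fragment ions keep the velocity of their precursor can be illustrated numerically. The following is a minimal sketch, assuming singly charged ions and non-relativistic motion; the function names and the example masses are illustrative choices and are not part of the instrument software.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg

def velocity(mass_da, energy_ev):
    """Speed (m/s) of an ion of mass `mass_da` (Da) with kinetic energy `energy_ev` (eV)."""
    return math.sqrt(2.0 * energy_ev * E_CHARGE / (mass_da * DALTON))

def fragment_energy_ev(fragment_da, precursor_da, precursor_energy_ev):
    """Kinetic energy of a fragment that keeps the velocity of its precursor."""
    return precursor_energy_ev * fragment_da / precursor_da

# Illustrative precursor travelling through the collision cell at 1 keV (laboratory frame).
precursor = 1521.75
v = velocity(precursor, 1000.0)

# A light fragment carries only a mass-proportional share of the energy,
# yet moves at the same speed as the precursor.
e_frag = fragment_energy_ev(175.12, precursor, 1000.0)
print(f"precursor speed: {v:.0f} m/s; m/z 175 fragment energy: {e_frag:.0f} eV")
print("same speed:", math.isclose(velocity(175.12, e_frag), v))
```

Because the speed is unchanged by fragmentation, a single timing calculation for the precursor is enough to pulse the second source for all of its fragments, which is the behavior described above.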
Protein Purification, Separation, and Robotic In-gel Digestion-Nucleoporin-interacting proteins were purified by affinity-capture experiments as described previously (11). Twenty-two different recombinant GST-Nups and karyopherins were selected as bait. Briefly, affinity-capture experiments were performed in binding buffer (20 mM Hepes, pH 6.8, 150 mM KOAc, 2 mM Mg(OAc)2, 2 mM dithiothreitol, 0.1% Tween 20). For each experiment, glutathione-Sepharose beads (Amersham Biosciences) and an Escherichia coli extract containing the desired GST-Nup were incubated for 20 min at 4°C. Beads were collected by sedimentation at 2000 × g for 30 s, washed seven times by resuspension and sedimentation, and incubated with 1 ml of yeast extract (~10 mg). In some experiments, the yeast extracts were supplemented with 30 μg of HIS-Gsp1p-GTP (Q71L) 15 min prior to mixing with the immobilized GST-Nup. After 2 h at 4°C, beads were washed six times with binding buffer, and bound proteins were eluted with 100 μl of 1 M NaCl and concentrated by trichloroacetic acid/sodium deoxycholate precipitation. Proteins that resisted salt extraction were subsequently extracted from the beads with SDS. All proteins were resolved by SDS-PAGE (7.5% gels) and visualized with standard Coomassie Blue. A total of 288 protein bands were selected from 10 different 1-D gels. Approximately one quarter of each band was cut out as a section of gel 1.2-1.5 mm square. The gel pieces were loaded into three 96-well plates, mounted in a Genomic Solutions ProPrep digestion robot, reduced and alkylated, and digested with trypsin in 25 mM NH4HCO3 for 4 h. The resulting unseparated digests were desalted robotically using micro C18 Ziptips. The peptides were washed from the Ziptips with a 1.8-μl elution solution containing diluted α-cyano-4-hydroxycinnamic acid matrix (six dilutions of saturated α-cyano in 60% acetonitrile/0.2% trifluoroacetic acid). The solutions were loaded directly onto three 96-well Teflon-coated MALDI plates for mass spectrometric analysis.
Mass Spectrometric Analysis-In the first pass of each sample plate, every sample spot was analyzed by MALDI-TOF MS in automated mode using an Applied Biosystems Voyager-DE STR reflectron mass spectrometer, operating under software control (Proteomics System 1). Spectra were accumulated for each sample until a specified signal strength and signal/noise were attained, and all spectra were submitted automatically to ProteinProspector for batch MS-Fit database searching. The acquisition time was ~1 h per sample plate. Spectra were also inspected visually, and a number of peptides were chosen for CID analysis by tandem mass spectrometry using the MALDI-TOF/TOF. For all tandem mass spectra reported here the laser frequency was 200 Hz, the laboratory collision energy was 1 keV, and air was used as collision gas. The laser energy employed for effective CID operation was typically ~30% higher than that used for normal MALDI operation. Variable numbers of laser shots were used to obtain each spectrum, depending upon the strength of the primary ion beam, but acquiring the output from 1,000 shots required only 5 s, and as many as 10,000 shots could be accumulated within less than 1 min. Post-acquisition baseline correction and smoothing were carried out using software provided with the TOF/TOF instrument. Sequence tags were identified in CID spectra by visual inspection and selection of the major peaks for submission to MS-Tag. Some CID spectra were used to confirm assignments already made by peptide mass fingerprinting, and others were used to establish peptide identity for those not matched in the previous MS experiments. For those peptides that could not be assigned by MS-Tag, the CID spectra were used to carry out de novo peptide sequencing by manual interpretation. De novo assignments were important to monitor for nonspecific protein cleavage sites and the presence of chemical or post-translational modifications.
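The acquisition times quoted above follow directly from the 200-Hz laser repetition rate. A minimal sketch of that arithmetic, assuming no dead time between shots (the function name is illustrative):

```python
LASER_HZ = 200  # laser repetition rate reported for the TOF/TOF instrument

def acquisition_seconds(shots, rate_hz=LASER_HZ):
    """Time needed to fire `shots` laser pulses at `rate_hz`, ignoring any dead time."""
    return shots / rate_hz

for shots in (1_000, 10_000):
    print(f"{shots} shots -> {acquisition_seconds(shots):.0f} s")
```

Under this assumption 1,000 shots take 5 s and 10,000 shots take 50 s, consistent with the figures given in the text.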
Software to Identify Sequence Tags from MALDI-TOF/TOF Spectra-Programs in the ProteinProspector software package that predict peptide fragmentation, such as MS-Pattern, or identify proteins based on their fragmentation, such as MS-Tag, were modified to accommodate the side chain fragments d, v, and w and other spectral features from high energy CID. The changes were based on the following rules for high energy CID fragmentation, derived from a compilation of spectra from four-sector tandem mass spectrometry (19,20):
• No high energy fragments would be observed unless Arg, Lys, or His was present in the fragment, and that basic residue would not yield side chain fragments. For example, C-terminal Arg would not form v and w ions. However, it could form a d fragment if the peptide contained another basic amino acid. If two basic residues are present, both of them may produce side chain fragmentation.
• Ala, Gly, His, Phe, Pro, Trp, and Tyr residues do not form d ions.
• Ala, Gly, His, Phe, Trp, and Tyr do not form w ions.
• Thr and Ile may each yield two w and d fragments because of the double substitution at the β-carbon in these amino acids.
• Gly and Pro do not form v ions.
• For amino acids N-terminally adjacent to Pro residues, no c fragments will be formed.
• Fragment z, defined as "y - 16 Da," cannot be formed by Pro.
• If the fragment contains Arg, Asn, Gln, or Lys residues, a neutral loss of NH3 (-17 Da) should be considered.
• If the fragment contains Asp, Glu, Ser, or Thr, a neutral loss of H2O (-18 Da) should be considered.
• The eighth and ninth rules apply to the internal fragment ions, as well.
• The first members of the N-terminal ion series will not be identified, because b1 is not stable unless N-terminally modified, a1 is observed as the immonium ion, and c1 would be observed only rarely.
Typical MS-Tag results are presented and discussed below. The use of MS-Product is not illustrated here, but it should be noted that to generate a list of fragment ions including d, v, and w ions, the appropriate boxes must be checked on the ProteinProspector user interface.
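The residue-level rules for d, v, and w ions lend themselves to a small lookup. The following is a minimal sketch of how they might be encoded, not the ProteinProspector implementation; the function and constant names are hypothetical, and the sketch deliberately ignores the refinement that the charge-carrying basic residue itself yields no side chain fragments.

```python
# Residues that, per the rules above, do NOT give each side chain ion type.
NO_D = set("AGHFPWY")
NO_W = set("AGHFWY")
NO_V = set("GP")
# Thr and Ile are doubly substituted at the beta-carbon: two d and two w ions.
BETA_BRANCHED = set("TI")
# A fragment must contain at least one of these to show high energy fragments.
BASIC = set("RKH")

def side_chain_ion_counts(residue):
    """Number of d, v, and w ions a residue may contribute (0, 1, or 2)."""
    d = 0 if residue in NO_D else 1
    v = 0 if residue in NO_V else 1
    w = 0 if residue in NO_W else 1
    if residue in BETA_BRANCHED:
        d, w = d * 2, w * 2
    return {"d": d, "v": v, "w": w}

def has_basic_residue(fragment):
    """True if the fragment contains Arg, Lys, or His."""
    return any(res in BASIC for res in fragment)

for res in "ILGP":
    print(res, side_chain_ion_counts(res))
print("TITFHR basic:", has_basic_residue("TITFHR"))
```

The distinction between Ile (two w ions) and Leu (one w ion) in this table is what allows the isomeric residues to be told apart in the spectra discussed later.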

RESULTS AND DISCUSSION
To evaluate the feasibility of carrying out high throughput protein identification using the MALDI-TOF/TOF tandem mass spectrometer, we have taken as an example the analysis of proteins interacting with known nucleoporin components in the nuclear pore complex (10). Target proteins were purified from yeast extracts using recombinant proteins as bait and then separated by 1-D SDS-PAGE. The criterion for selection of particular bands for analysis was the appearance or absence of protein bands based on a comparison of control and normal yeast extracts. On this basis, 288 protein bands were excised, digested with trypsin, and desalted, and the resulting peptide mixtures were analyzed by MALDI-TOF and MALDI-TOF/TOF. The results from selected gel bands are presented and discussed below.
Fig. 2 (caption): The inset displays part of the 1-D gel picture where the gel slice at ~120 kDa was sampled. Three gel lanes represent the following: (-), control without yeast extract; S30, normal yeast extract; Ran, yeast extract spiked with Gsp1p-GTP (Q71L). Detailed experimental procedure was described elsewhere (11). T represents trypsin autolysis products, and the rest of the peaks labeled were matched to karyopherin β-4 subunit. The peaks with * were chosen for CID analysis.
A band at ~120 kDa was present with the normal yeast extract (middle lane) and the extract spiked with Ran-GTP protein (Gsp1p-GTP) (right lane), as described previously (11). This band, named B5, was cut out, digested, and analyzed by MALDI-TOF MS, giving the spectrum illustrated in Fig. 2. The peptides labeled "T" represent trypsin autolysis products, which were used as internal standards for calibration of the mass scale. This calibration provided mass measurement accuracy within 20 ppm for database searching. This single mass spectrum contained 29 other peptides, all of which were submitted to MS-Fit for database searching. The search results gave one significant protein hit based upon 19 peptides that matched yeast karyopherin β-4 subunit, as labeled in the figure.
To further establish the protein identity, MALDI-TOF/TOF CID spectra were recorded for those peptides labeled with asterisks. The results for two of these are illustrated in Fig. 3, showing the CID spectra of the tryptic peptides of MH+ 1521.75 and 1900.01, respectively. The low mass region of Fig. 3 (top), for m/z 1521.75, shows strong immonium ions such as m/z 86 (Ile/Leu) and 136 (Tyr) that are typical of these CID spectra. Most of the members of the y ion series, y1 to y7, y9, and y11, and a few members of the b ion series are present. In the CID spectrum of the larger peptide of m/z 1900.01 (Fig. 3, bottom), most of the sequence ions (b and y ion series) are evident. Thus both of these CID spectra gave definitive fragment ion information, allowing the MS-Tag search program to identify the peptide sequences as VLNEQVDESYGLR (MH+ = 1521.75) and VFDNFRPVVFGLFQSK (MH+ = 1900.01) from the yeast karyopherin β-4 subunit. The remaining CID spectra obtained also confirmed this assignment, consistent with the known function of this protein as a carrier involved in nucleocytoplasmic transport (11, 29-32).
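The precursor masses quoted for these two peptides can be checked from their sequences. The sketch below, which is an illustration rather than the MS-Tag code, sums standard monoisotopic residue masses and adds one water and one proton; the function name is an assumption made for this example.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O
PROTON = 1.007276   # H+

def mh_plus(sequence):
    """Monoisotopic mass of the singly protonated, unmodified peptide."""
    return sum(MONO[res] for res in sequence) + WATER + PROTON

for seq in ("VLNEQVDESYGLR", "VFDNFRPVVFGLFQSK"):
    print(seq, f"{mh_plus(seq):.2f}")
```

Both calculated values agree with the observed precursor ions at m/z 1521.75 and 1900.01 to within 0.01 Da.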
Protein Identification in a Complex Mixture by Sequence Tags-Unseparated protein digests may be very complex, particularly from 1-D electrophoresis. A potential problem with protein identification by peptide mass fingerprinting of such mixtures is that only a low fraction of the measured peptide masses (less than 20%) may match any given single protein. In such cases, no clear candidate protein(s) may be selected reliably. The MALDI-TOF MS spectrum of the digest of band B8, plate 2 is shown in Fig. 4 and provides an example illustrating this point. The mass values of 25 peptides observed in this spectrum were submitted to MS-Fit for searching against a yeast database. No proteins were found to be matched with more than four of the total number of these peptides, whereas there were several hundred hits matching to four. Consequently no definitive candidate could be selected based on the MS-Fit search results alone. To obtain sequence information, TOF/TOF CID analysis was performed on the more abundant components labeled in Fig. 4. The immonium ion region of the CID spectrum in Fig. 5A indicates clearly the presence of certain amino acid residues, Arg/Pro (70), Thr (74), Ile/Leu (86), His (110), and Phe (120), and the y1 ion of m/z 175 establishes the presence of an arginine residue at the C terminus. After carrying out a two-point internal calibration based on the y1 and MH+ ions, 14 of the fragment ions recorded were submitted to MS-Tag for database searching with a fragment ion mass measurement tolerance of 50 ppm. Every peak matched to a single protein, yeast GTP-binding nuclear protein GSP1, as shown in Fig. 5B, having an average mass error of 22 ppm. To illustrate the degree of coverage, the sequence ions (a, b, and y) and the immonium ions are labeled in Fig. 5A. The presence of the characteristic high energy CID fragment ion, wa5, at m/z 628.3 would establish the presence of Ile rather than Leu at position 2 in this peptide sequence, determined as TITFHR.
Some of the CID spectra obtained from band B8 provided sufficient fragment ions for de novo peptide sequencing. An example is given in Fig. 6, showing the TOF/TOF CID spectrum of the tryptic peptide of MH+ 1214.60. The immonium ion region indicates the presence of amino acid residues Ile/Leu (86), Gln/Lys (101), and Tyr (136). Sequence ion series, such as b2 to b8 and y1 to y9, are observed, allowing the sequence of this peptide to be determined as N(I/L)(Q/K)YYD(I/L)SAK, and internal ions and satellite ions were consistent with this. The presence of wa4 and wb4 indicates Ile at the 4th position from the C terminus, whereas the presence of wa9 indicates Leu at the 9th position from the C terminus. An MS-Tag search also identified the sequence as NLQYYDISAK matched to yeast GTP-binding protein GSP1. Furthermore, the TOF/TOF CID spectra of two further peptides of MH+ at 922.47 and 1784.92 also gave unambiguous matches to this same protein. Thus peptides giving four of the strongest signals from band B8 unambiguously identified yeast GTP-binding protein GSP1 by peptide sequencing.
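The immonium ion assignments used in these interpretations can be reproduced from residue masses. As a minimal sketch, assuming the usual immonium structure (residue mass minus CO plus a proton), and with names chosen only for this illustration:

```python
# Monoisotopic residue masses (Da) for the residues discussed in the text.
RESIDUE = {
    "P": 97.05276, "T": 101.04768, "I": 113.08406, "L": 113.08406,
    "Q": 128.05858, "K": 128.09496, "H": 137.05891, "F": 147.06841,
    "Y": 163.06333,
}
CO = 27.994915      # carbon monoxide
PROTON = 1.007276   # H+

def immonium(residue):
    """m/z of the immonium ion: residue mass - CO + H+."""
    return RESIDUE[residue] - CO + PROTON

for res in "PTILQKHFY":
    print(res, f"{immonium(res):.3f}")
```

Ile and Leu give the same immonium ion, as do (nominally) Gln and Lys, which is why the de novo sequence above initially carries the ambiguities (I/L) and (Q/K) until side chain fragments or a database match resolve them.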
Another example of the difficulty of protein identification in cases where a mixture of proteins is present concerns a single 40-kDa band (G11) purified from a normal yeast extract using Rna1 as bait. The MALDI-TOF MS spectrum of the band digest was recorded. Sixty-six peaks were observed. After internal calibration, they were submitted to MS-Fit. Yeast elongation factor 1-α was the only significant hit, with 20 peaks matching, as labeled with a in Fig. 7. MS-Fit searches on the remaining peaks gave a variety of hits, but none were very significant. Selected ions were chosen for CID analysis, some to confirm the identification made by MS-Fit and others to identify additional protein components. The CID spectra of two peptides of MH+ 3319.86 and 1326.75 (shown in Fig. 8, A and B, respectively) did indeed confirm the identification of yeast elongation factor 1-α. The peptides were determined by an MS-Tag search as TLLEAIDAEQPSRPTDKPLRLPLQDVYK and EHALLAFTLGVR. Many sequence ions and a few side chain fragment ions were present in these CID spectra, e.g. in Fig. 8B, the presence of wa4, wa8, and wa9 generated by high energy CID distinguished Leu from Ile at the 4th, 8th, and 9th positions from the C terminus. Discussion of this sample continues below in the context of unresolved precursor ions.

CID of Peptides with Unresolved Precursor Ions-In a complex digest it is likely that some peptides will be sufficiently close in mass values that their isotope clusters will overlap. Unlike four-sector tandem mass spectrometry, in which monoisotopic spectra could be obtained (20,33,34), the precursor mass window selected by the TOF/TOF is several mass units wide, and CID analysis may provide fragment ions originating from multiple precursors. Furthermore, the analysis is complicated as the overlapping peptides could originate from the same protein or from different proteins. Fig. 9 from band G11 represents an example of a CID spectrum of two ions differing in mass by only 1 Da, as shown in the inset. Although m/z 1355.70 was selected for CID analysis, a weaker peptide of MH+ 1354.68 also contributed to the fragment ion spectrum. The higher mass peptide (peptide 1) was matched to elongation factor 1-α, which had already been identified (see above), and its sequence was assigned as YQVTVIDAPGHR. Although the lower mass spectrum was weaker and gave relatively few peaks, the sequence GFGREEFSQVAK was matched to serine methylase (mitochondrial). The ions attributed to each peptide are shown in Fig. 9, where the ions labeled with * were matched to both sequences. Although two proteins, elongation factor 1-α and serine methylase (mitochondrial), had been identified by peptide mass mapping and peptide sequencing, many peptides in the digest of band G11 were still unassigned. Some proteins were not well separated in 1-D SDS-PAGE, so the presence of overlapping peptides was probed to assist in the assignment of the remaining peaks. Comparing the peptide masses for band G11 with those from adjacent bands revealed many coincident masses, indicating the presence of the same proteins.
CID analysis of selected ions in the bands running before and after band G11 helped to confirm the previous two assignments and to identify three additional proteins as Ran GTPase activating protein 1, serine methylase (cytosolic), and MAS5. Ultimately 51 of the 66 peptides from G11 were attributed to the five proteins as summarized in Table I, each protein contributing a unique set of peptides as shown in Fig. 7. During high throughput analysis, a comparison of data from adjacent bands should increase the accuracy of assignments and eliminate unnecessary repetitive data acquisition and processing.
Automated CID analysis of unresolved peptides is potentially difficult as mixed fragmentation information can complicate database searches. However, sequential searches by MS-Tag on the same dataset but with different precursor masses specified may allow protein identification. Fig. 10 illustrates the TOF/TOF CID spectrum of peptides with MH+ at 1962.01 and 1964.97 as shown in the inset. These two peptides separated by only 3 Da could not be resolved in MS1 of the MALDI-TOF/TOF. Consequently although several immonium ions were observed in the CID spectrum, they could not be easily assigned to either peptide. Eighteen fragment ions obtained in the spectrum were submitted for database searching with a mass tolerance of 20 ppm for the internally calibrated parent ion and 300 ppm for the externally calibrated fragment ions. The maximum allowable number of unmatched ions was set to 40%, i.e. seven ions. The maximum number of missed cleavage sites was set to 1, and cysteines were specified as alkylated by iodoacetamide. When m/z 1964.97 was used as the parent ion mass, eight hits were obtained from the SwissProt database. The four of these with the lowest number of unmatched ions (3 of 18) were all Elongation factor Tu from different species, which was the same match that had been obtained by MS-Fit. Most of the matched ions are sequence ions, i.e. b ions and y ions. The other four hits for different proteins with six or seven unmatched ions were almost certainly false positives as no other peptides could be found to match them. When 1962.01 was used as the parent mass for database searching, ten hits were returned. The top hit had six unmatched ions whereas the remainder all had seven. The three peaks previously unmatched by the first peptide were all matched only by two hits, also elongation factor Tu from two different species, suggesting the peptide sequence identity. In Fig. 10 all ions that were uniquely assigned to one or other peptide are marked with symbols (^ and *) whereas the ions without symbols were assigned to both sequences. In this example, the first sequence (MH+ 1964.97) was easier to determine as it gave the top hit with the least unmatched ions. The second sequence was harder to distinguish from false positives, but all ions being assigned to one or other peptide gave increased confidence, because all the ions should be interpretable. Furthermore, in this instance both sequences were derived from a common protein, and an MS-Fit search had previously given a clue to its identity.
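The bookkeeping behind such sequential searches can be sketched in a few lines. The following is an illustration only, not the MS-Tag algorithm: the function names are hypothetical and the peak lists are invented toy values chosen to show the logic of counting unmatched ions against two candidate peptides.

```python
import math

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def unmatched(observed_peaks, theoretical_ions, tol_ppm):
    """Observed peaks with no theoretical ion within `tol_ppm`."""
    return [p for p in observed_peaks
            if not any(abs(ppm_error(p, t)) <= tol_ppm for t in theoretical_ions)]

def max_unmatched(n_peaks, fraction=0.40):
    """Largest whole number of peaks allowed to remain unmatched."""
    return math.floor(n_peaks * fraction)

# Search setting from the text: 18 submitted ions, 40% may be unmatched.
print("allowed unmatched:", max_unmatched(18))

# Toy data (invented): a mixed spectrum and the predicted ions of two candidates.
observed = [100.0, 200.0, 300.0, 400.0]
ions_a = [100.0, 200.01]
ions_b = [200.0, 300.0]
left_by_a = unmatched(observed, ions_a, tol_ppm=300)
left_by_both = unmatched(left_by_a, ions_b, tol_ppm=300)
print("unmatched by A:", left_by_a)
print("unmatched by A and B:", left_by_both)
```

In the toy case, peaks left unexplained by the first candidate are largely accounted for by the second, which mirrors the argument that a mixed spectrum becomes convincing when nearly every ion is assigned to one peptide or the other.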
Identification of Modified Proteins by Peptide Sequencing-Protein identification by peptide mass mapping is fast and simple. However, the identification process can be complicated by the presence of protein mixtures (as discussed above), the presence of only a few peptides for database searching, the presence of chemical modifications, and nonspecific cleavages. As discussed above, four of 25 peaks from the band B8 tryptic digest (Fig. 4) were matched to yeast GSP1 protein by peptide sequencing with MS-Tag (Fig. 5). However, one abundant ion of MH+ 1816.88 did not match this protein, and its CID spectrum could not be matched by MS-Tag. Nevertheless, strong similarities between the CID spectra of MH+ 1784.92 (Fig. 11A, SNYNFEKPFLWLAR) and this peptide (Fig. 11B) were noted, suggesting the latter to be a modified peptide with a mass difference of +32. The b ions up to and including b5 were the same in both spectra. However, from y4 onwards, y ions and associated ions were shifted higher by 32 mass units, e.g. y4 (545) versus y4-NH3* (560), y7 (902) versus y7* (934), and y8 (1030) versus y8* (1062). Therefore the site of the modification was identified as the tryptophan at the 4th position from the C terminus. It is known that tryptophan can be oxidized through a Cu2+-catalyzed reaction to form N-formylkynurenine (35), with the addition of two oxygen atoms, increasing the mass by 32 Da. Therefore, the peptide sequence of MH+ 1816.88 was proposed as SNYNFEKPFLW(2O)LAR, a modified peptide from GTP-binding protein GSP1. Because this protein modification is not common, the peptide mass would not be predicted by database searching tools. Nevertheless, by implementing this modification in the MS-Tag program, the peptide sequence was identified readily; thus all the major peptides were matched to the same protein.
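The y ion shifts used to localize the oxidation can be reproduced with a short calculation. This is a minimal sketch under stated assumptions: standard monoisotopic residue masses, a modification expressed as a mass delta on a 1-based residue position, and a hypothetical helper name `y_ions`.

```python
# Standard monoisotopic residue masses (Da) for the residues in this peptide.
MONO = {
    "S": 87.03203, "N": 114.04293, "Y": 163.06333, "F": 147.06841,
    "E": 129.04259, "K": 128.09496, "P": 97.05276, "L": 113.08406,
    "W": 186.07931, "A": 71.03711, "R": 156.10111,
}
WATER, PROTON = 18.010565, 1.007276
AMMONIA = 17.026549          # NH3 neutral loss
TWO_OXYGENS = 2 * 15.994915  # Trp -> N-formylkynurenine

def y_ions(sequence, mods=None):
    """Map n -> m/z of y_n; `mods` maps 1-based positions to mass deltas."""
    mods = mods or {}
    ions, total = {}, WATER + PROTON
    for n, idx in enumerate(range(len(sequence) - 1, -1, -1), start=1):
        total += MONO[sequence[idx]] + mods.get(idx + 1, 0.0)
        ions[n] = total
    return ions

peptide = "SNYNFEKPFLWLAR"
plain = y_ions(peptide)
oxidized = y_ions(peptide, mods={peptide.index("W") + 1: TWO_OXYGENS})

for n in (3, 4, 7, 8):
    print(f"y{n}: {plain[n]:.2f} -> {oxidized[n]:.2f}")
print(f"y4 - NH3 (oxidized): {oxidized[4] - AMMONIA:.2f}")
```

With the delta placed on Trp, y1 to y3 are unchanged while y4 and all larger y ions move up by about 32 Da, matching the nominal values 545/560 (after ammonia loss), 902/934, and 1030/1062 quoted above.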
Modified peptides were also found in band F9. As seen in the gel illustrated in Fig. 12, several unique low molecular weight proteins labeled F9 to F12 were pulled down using Prp20 as bait, contrasting with the larger proteins pulled down by other baits. Prp20, also called RCC1 in vertebrates, is a guanine nucleotide exchange factor responsible for generating a Ran-GTP gradient across the nuclear envelope to facilitate karyopherin-mediated transport. From the MALDI-TOF MS spectrum of the unseparated tryptic digest from band F9 in Fig. 13, 24 peptide masses were submitted to MS-Fit for database searching. Once again, no more than four peptides matched to any one protein. CID analysis of peptides with MH+ at 788.47 and 1004.57 (Fig. 14) identified the major protein in this band as yeast histone H3, defining their sequences as KLPFQR and YKPGTVALR, respectively. Based on masses alone, two other peptides at 831.49 and 1335.70 were consistent with this match with mass errors less than 10 ppm, but several other abundant peaks did not match the same protein. The MALDI-TOF MS spectrum of band F9 showed the peak at m/z 1335.70 to be the first in a cluster of four peaks, each separated by 14 Da, i.e. 1335.70, 1349.71, 1363.73, and 1377.73 as illustrated in the expansion in Fig. 13. Although the first of these peptides was relatively weak and was not selected for CID, in histone H3 the mass corresponds to a normal tryptic peptide EIAQDFKTDLR. Histone H3 can be methylated or acetylated at lysine residues (36), and the presence of a single lysine in this sequence is consistent with the 14-Da differences being due to mono- or dimethylation, and to either trimethylation or acetylation. The y ion series in the TOF/TOF CID spectra of the modified peptides displayed in Fig. 15 confirm that the modification is indeed localized on the lysine.
The masses of the first four members of this series, y1 (175), y2 (288), y3 (403), and y4 (504), are the same in all three CID spectra, whereas from y5 up they show systematic differences of 14 or 28 Da. In Fig. 15A the y5 ion occurs with loss of ammonia (646 − 17 = 629), whereas the intact y5 ions in Fig. 15, B and C occur at 660 and 674, and the y6 ions at 793, 807, and 821 support the same conclusion. The b ions are less definitive, but b2 (243) and b3 (314) are the same in all spectra, whereas strong b10 ions at 1175, 1189, and 1203 show the expected mass shifts. Thus, the three peptides all correspond to lysine-modified forms of the peptide of MH+ 1335.70, arising from digestion of modified histone H3. The results suggest that the two peptides of MH+ 1349.7 and 1363.7 are the modified peptides with mono- and dimethylated lysine, respectively. Based on the peptide mass, two possible modifications, i.e. trimethylation and acetylation, could occur on the lysine residue in the peptide of MH+ 1377.73, because the two mass increments have the same nominal value, 42 Da. The peptide mass error is −6.9 ppm if the lysine is trimethylated, whereas it is 19.5 ppm if the lysine is acetylated. Because all the peptides matched to histone H3 gave mass errors less than 10 ppm, the modification is more likely to be trimethylation, but the possibility of acetylation could not be ruled out on this basis alone. However, the obvious difference between these two modifications is that trimethylated lysine carries a positive charge, whereas acetylated lysine is neutral. This chemical property would affect the fragmentation of the peptide dramatically in high energy CID.
Biemann (37) has shown that a peptide with a positive charge at the N terminus undergoes the most side chain fragmentation, giving mainly d, a, w, v, and y ions, whereas a peptide acetylated at the N terminus undergoes very little side chain fragmentation, giving mainly y and b ions. Therefore, if the peptide of MH+ 1377.73 is trimethylated, its fragmentation would differ from that of the other two modified peptides (mono- and dimethylated), because they do not carry such a charge. Conversely, if this peptide is acetylated, its fragmentation pattern would be similar to those of the other two modified peptides. Comparing the three CID spectra in Fig. 15, d ions starting at the 8th position from the N terminus, right after the modified lysine (at the 7th position), i.e. da8, db8, d9, and d10, were observed only in the CID spectrum of 1377.73 (Fig. 15C), whereas w ions up to the 5th position from the C terminus, right before the modified lysine (at the 6th position), i.e. wa3, wa4, wa5, and v4, were observed only in Fig. 15, A and B. Furthermore, two fragment ions with higher intensity than the molecular ion, i.e. 1221.9 and 1318.9, were attributed to b10 + H2O and MH+ − 42 − 17, and occurred only in Fig. 15C. Thus, these results suggest that the peptide of MH+ 1377.73 most likely contains a trimethylated lysine residue. Furthermore, accurate mass measurement was also performed using either a single point (y1 (175)) or two points (y1 and MH+) for internal calibration. The mass errors of fragment ions containing modified lysine were always less (~30 ppm) if lysine was trimethylated rather than acetylated. This further supports the proposal that the lysine in this peptide is trimethylated.
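The mass arithmetic behind this assignment can be sketched as follows. This is an illustration under stated assumptions rather than the calculation performed by ProteinProspector: standard monoisotopic masses are assumed, the observed values are the two-decimal figures quoted in the text, and the names `mh_plus`, `ppm_error`, and `methyl_ladder` are hypothetical. The sketch first expresses the four-peak cluster as multiples of one methylene unit and then compares the precursor mass error for the two candidate modifications of nominal mass 42.

```python
# Monoisotopic residue masses (Da) for the residues of EIAQDFKTDLR.
RESIDUE = {
    "E": 129.04259, "I": 113.08406, "A": 71.03711, "Q": 128.05858,
    "D": 115.02694, "F": 147.06841, "K": 128.09496, "T": 101.04768,
    "L": 113.08406, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728
CH2 = 14.01565  # mass added by one methylation


def mh_plus(sequence):
    """Theoretical monoisotopic MH+ of an unmodified peptide."""
    return sum(RESIDUE[r] for r in sequence) + WATER + PROTON


def ppm_error(observed, theoretical):
    return (observed - theoretical) / theoretical * 1e6


def methyl_ladder(peaks):
    """Express each peak as base + n * CH2; return (m/z, n, residual in Da)."""
    base = peaks[0]
    rows = []
    for mz in peaks:
        n = round((mz - base) / CH2)
        rows.append((mz, n, (mz - base) - n * CH2))
    return rows


BASE = mh_plus("EIAQDFKTDLR")
CLUSTER = [1335.70, 1349.71, 1363.73, 1377.73]  # values quoted in the text
ladder = methyl_ladder(CLUSTER)

# Two candidate increments of nominal mass 42 on the single lysine.
CANDIDATES = {
    "trimethyl": 3 * CH2,   # C3H6, 42.04695 Da
    "acetyl": 42.01057,     # C2H2O
}
OBSERVED = 1377.73
errors = {
    name: ppm_error(OBSERVED, BASE + delta) for name, delta in CANDIDATES.items()
}

for mz, n, residual in ladder:
    print(f"{mz:8.2f}  {n} x CH2  residual {residual:+.3f} Da")
for name, err in errors.items():
    print(f"{name:9s} {err:+6.1f} ppm")
```

With the rounded input of 1377.73 the sketch gives an error of a few ppm (negative) for trimethylation and roughly 20 ppm for acetylation, close to but not identical with the −6.9 and 19.5 ppm reported, which were presumably derived from the unrounded measurement. The two candidates differ by about 0.036 Da, so they can be separated only when the mass accuracy is well below that difference.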
It is believed that the formation of methylated lysine as catalyzed by histone methyltransferase in histone H3 at specific sites (36) may be associated with various biological processes ranging from transcriptional regulation to epigenetic silencing via heterochromatin assembly (38,39). Analysis of bands F10, F11, and F12 identified them as histone H2B, H2A, and H4, respectively. Recently it was discovered that PrP20 (RCC1) associates with chromatin assembly through direct binding to mononucleosomes and histone H2A and H2B (40). The binding of PrP20 (RCC1) to nucleosomes or histones stimulates the catalytic activity of PrP20, generating the Ran-GTP gradient across the nuclear pore complex, which is essential for nuclear envelope assembly, nuclear transport, and other nuclear events. Further experiments need to be carried out to elucidate the exact role of the histone family members and their modifications in the nucleocytoplasmic transport mechanism.
Non-tryptic Cleavages-The presence of peptides formed by nonspecific digestion, which does not obey the normal enzyme cleavage rules, complicates protein identification by database searching using either peptide mass fingerprinting or identification of sequence tags. Although band B7 on plate 2 was found by both peptide mapping and peptide sequencing to contain a mixture of Ran-specific GTPase-activating protein and glyceraldehyde-3-phosphate dehydrogenase 3, a peptide of MH+ 1596.72 did not match either of these proteins based on the MS-Fit search results. However, TOF/TOF CID analysis showed its sequence to be (D)PFITNDYAAYmFK, where m denotes methionine sulfoxide. This peptide is formed by a nonspecific cleavage of glyceraldehyde-3-phosphate dehydrogenase 3, the N terminus being generated by cleavage of an Asp-Pro bond as shown in Fig. 16. It is not clear whether this bond cleavage was caused by nonspecific enzymatic cleavage or by chemical hydrolysis, because the Asp-Pro bond is known to be labile in acid. To identify such a peptide using the sequence tag approach, it is necessary to relax the normal enzyme cleavage rules. In general this can only be done on a limited basis, as it greatly extends the search times. Note also that the presence of a methionine sulfoxide residue in this peptide is apparent from the loss of 64 Da.
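The effect of relaxing the cleavage rules can be illustrated with a small sketch. It is not the MS-Fit or MS-Tag algorithm. The protein string below is a hypothetical fragment: only the segment DPFITNDYAAYMFK is taken from the text, and the flanking residues are invented for illustration. The 20 ppm tolerance is likewise an assumption, chosen to accommodate the two-decimal mass quoted, and at most one oxidized methionine is considered per peptide.

```python
# Monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON, OXIDATION = 18.01056, 1.00728, 15.99491


def mh_plus(peptide, oxidized_met=0):
    return (sum(RESIDUE[r] for r in peptide) + WATER + PROTON
            + oxidized_met * OXIDATION)


def tryptic_peptides(protein):
    """Cleave C-terminal to K or R, except before P; no missed cleavages."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        at_end = i == len(protein) - 1
        if at_end or (residue in "KR" and protein[i + 1] != "P"):
            peptides.append(protein[start:i + 1])
            start = i + 1
    return peptides


def all_subpeptides(protein, min_len=5, max_len=25):
    """Every contiguous stretch: the fully relaxed (no-enzyme) search space."""
    return [protein[i:j]
            for i in range(len(protein))
            for j in range(i + min_len, min(i + max_len, len(protein)) + 1)]


def search(candidates, observed, tol_ppm=20.0):
    """Return (peptide, number of oxidized Met, ppm error) within tolerance."""
    hits = []
    for peptide in candidates:
        for n_ox in range(min(peptide.count("M"), 1) + 1):
            theoretical = mh_plus(peptide, n_ox)
            ppm = (observed - theoretical) / theoretical * 1e6
            if abs(ppm) <= tol_ppm:
                hits.append((peptide, n_ox, round(ppm, 1)))
    return hits


# Hypothetical fragment; flanking residues are NOT from the database entry.
PROTEIN = "AGRLNVEALNDPFITNDYAAYMFKYDSGR"
OBSERVED = 1596.72

tryptic = tryptic_peptides(PROTEIN)
relaxed = all_subpeptides(PROTEIN)

print(len(tryptic), "tryptic candidates:", search(tryptic, OBSERVED))
print(len(relaxed), "no-enzyme candidates:", search(relaxed, OBSERVED))
```

Even for this short fragment the no-enzyme candidate list is far longer than the tryptic one, which is consistent with the remark above that relaxed cleavage rules greatly extend search times. A semi-specific search, in which only one terminus must obey the enzyme rule, would be a smaller relaxation that still recovers this peptide.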

CONCLUSIONS
This proteomic study sought to identify protein-protein interactions based on proteins in yeast extracts that associate with GST-fusion protein baits. Because the yeast genome has been sequenced completely, all protein sequences should be present in the databases. It might therefore be anticipated that protein identification by peptide mass fingerprinting would have the maximum probability of success and that the requirement for peptide sequence data from CID experiments would be minimal. This was not the case in this study because of the complexity of the mixtures, the occurrence of nonspecific cleavages, and the presence of modified proteins. Therefore tandem mass spectrometry proved to be an essential tool to resolve ambiguities reliably and rapidly. Here the combination of robotic sample preparation and analysis using the MALDI-TOF/TOF MS demonstrated its suitability for high throughput proteomic analysis. In this study MALDI-TOF spectra were collected on a separate instrument, and the tandem mass spectrometry was carried out manually over a two-day period. Nevertheless, the study demonstrated the potential for realizing rapid MS and MS/MS using a single instrument giving comprehensive sequence information, high sensitivity, and high mass accuracy. Furthermore, the MALDI-TOF/TOF is now equipped with software for automated collection of both MS and MS/MS spectra (41,42). In the example described here, some 200 protein bands from the 288 selected samples were characterized by peptide mass mapping and peptide sequencing using the MALDI-TOF/TOF mass spectrometer. Many fractions contained overlapping proteins, and a total of 91 unique proteins were identified. Even though these proteins were all present in the databases, in many instances peptide sequencing was essential for unambiguous protein identification, especially for protein mixtures, proteins present only in low abundance, and modified proteins.
The high quality MS and MS/MS spectra obtained using the MALDI-TOF/TOF MS proved to be suitable not only for database searching but also for de novo peptide sequencing for unknown proteins and the investigation of chemical modifications. Furthermore, high collision energy fragments generated by TOF/TOF tandem mass spectrometry could be used for more detailed structure determination, such as distinguishing between Ile and Leu.
It is anticipated that future automated operation of this instrument system with a multiplate cassette will greatly increase the capacity to map and identify some thousands of protein digests per day. However, in addition to the tandem mass spectrometer, many other components are critical to the success of an integrated system for high throughput identification of proteins separated by gel electrophoresis. These include enhancements to the sample handling techniques of gel cutting, digestion, sample cleaning, and spotting to minimize sample loss and contamination, including improvements to the robotic system to handle the very limited amounts of sample available for many critical studies. Software must be evaluated and optimized for MALDI MS and MS/MS spectral acquisition with automatic mass calibration, isotope stripping, and peak picking over the whole mass range. The data must be submitted for automatic database searching using both peptide mapping and sequencing, and feedback control is needed to ensure the efficient use of the samples and the instrument to identify the maximum number of proteins in any sample with the minimum effort. Further enhancements to ProteinProspector are required to optimize automated database searching tools suitable for protein mixtures, peptide mixtures, protein modifications, and peptides with nonspecific cleavages. Finally, it is important to develop a sample and data management system for sample tracking, data archiving, processing, and creation of output reports. Only then will automated high throughput proteomic analysis become a realistic possibility. Achieving this high capacity will permit studies of the temporal modulation of protein expression and modification of protein expression in response to both endogenous cell cycle signals and exogenous events such as heat shock or exposure to drugs.

FIG. 14. MALDI TOF/TOF CID spectra of two tryptic peptides from band F9. A, MH+ = 788.5; B, MH+ = 1004.6. MS-Tag searches of these two peptides determined the protein identity as yeast histone H3.

* This work was supported in part by NCRR, National Institutes of Health Grant RR 01614 (to A. L. B.), NCRR, National Institutes of Health Shared Instrumentation Grant S10RR14606 (to A. L. B.), and NCI, National Institutes of Health Grant R33 CA86135 (to M. Pallavicini). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.