Combining Results from Lectin Affinity Chromatography and Glycocapture Approaches Substantially Improves the Coverage of the Glycoproteome

Identification of glycosylated proteins, especially those in the plasma membrane, has the potential of defining diagnostic biomarkers and therapeutic targets as well as increasing our understanding of changes occurring in the glycoproteome during normal differentiation and disease processes. Although many cellular proteins are glycosylated they are rarely identified by mass spectrometric analysis (e.g. shotgun proteomics) of total cell lysates. Therefore, methods that specifically target glycoproteins are necessary to facilitate their isolation from total cell lysates prior to their identification by mass spectrometry-based analysis. To enrich for plasma membrane glycoproteins the methods must selectively target characteristics associated with proteins within this compartment. We demonstrate that the application of two methods, one that uses periodate to label glycoproteins of intact cells and a hydrazide resin to capture the labeled glycoproteins and another that targets glycoproteins with sialic acid residues using lectin affinity chromatography, in conjunction with liquid chromatography-tandem mass spectrometry is effective for plasma membrane glycoprotein identification. We demonstrate that this combination of methods dramatically increases coverage of the plasma membrane proteome (more than one-half of the membrane glycoproteins were identified by the two methods uniquely) and also results in the identification of a large number of secreted glycoproteins. Our approach avoids the need for subcellular fractionation and utilizes a simple detergent lysis step that effectively solubilizes membrane glycoproteins. The plasma membrane localization of a subset of proteins identified was validated, and the dynamics of their expression in HeLa cells was evaluated during the cell cycle. 
Results obtained from the cell cycle studies demonstrate that plasma membrane protein expression can change up to 4-fold as cells transit the cell cycle and demonstrate the need to consider such changes when carrying out quantitative proteomics comparison of cell lines.

A major aim of proteomics is to identify proteins associated with subproteomes and determine how changes in these subproteomes affect cellular function. In addition, proteomics aims to identify biomarkers that can be used for early disease detection, evaluation of therapeutic efficacy, and the identification of cellular targets for therapy (3,14). Proteomics protocols that selectively enrich for glycoproteins, and particularly plasma membrane glycoproteins, are needed to achieve these basic and therapeutic objectives. In the current study, we used two strategies with the potential to target the N- and O-linked cellular glycoproteome, including those glycoproteins integral to the plasma membrane. One approach is based on the periodate oxidation/hydrazide method originally developed and more recently optimized by Aebersold and co-workers (15,16). This methodology has been used primarily for plasma glycoproteomics analysis and has not been applied to the specific analysis of the plasma membrane glycoproteome. We reasoned that periodate oxidation of intact cells followed by detergent solubilization would result in selective coupling of integral plasma membrane glycoproteins to the hydrazide-modified resin used in the Aebersold method. We show that periodate oxidation of cells prior to lysis does provide an effective method for plasma membrane protein enrichment. Furthermore, we comprehensively investigated the specificity of the periodate/hydrazide procedure by comparing the proteins from total cell lysates that bound to hydrazide resin with and without prior periodate oxidation of the cells. Our results provide the first comprehensive evaluation of the periodate/hydrazide protocol using cell lysates as opposed to plasma.
We compare the results obtained by the periodate/hydrazide protocol to those obtained by affinity chromatography using lectins that bind sialylated glycoproteins. Lectin chromatography has been used extensively in many glycobiological studies, but its application to proteomics, particularly for investigating the plasma membrane glycoproteome, is very limited. Because the plasma membrane is enriched in sialylated glycoproteins in higher eukaryotic cells, this approach should result in the isolation and identification of integral plasma membrane glycoproteins. Therefore, we used the sialic acid-binding lectins from Maackia amurensis (MAA and MHA) as an affinity approach for isolating sialylated glycoproteins (17)(18)(19). M. amurensis isolectins have been shown to bind sialylated structures found on both O- and N-linked glycans. Therefore, we anticipated that these lectins would provide significant coverage of the glycoproteome.
Because our periodate labeling approach should result in the modification of glycans on all of the cell surface glycoproteins, whereas the M. amurensis lectin column would bind only a subset of these glycoproteins, we anticipated that there would be significant overlap between the glycoproteins identified by the periodate/hydrazide protocol and those that bound to the M. amurensis lectin. In contrast to this expectation, we found that the approaches are complementary (more than one-half of the membrane glycoproteins were identified by the two methods uniquely) and thus together identify a larger number of membrane glycoproteins than either alone. Further we demonstrate that the non-ionic detergent octyl-β-D-1-thioglucopyranoside is effective in releasing a wide range of membrane proteins, including those with up to 14 transmembrane segments. Finally our analysis identified 65 new N-linked glycosylation sites.
Having identified a significant number of integral membrane glycoproteins using these protocols, we established that a subset of these proteins are expressed at the cell surface of HeLa cells and subsequently evaluated how their level of expression changed during the cell cycle. Our results demonstrate that each protein has a unique pattern of expression, but that three of the four evaluated have a relatively constant expression level throughout the cell cycle, whereas the level of expression of the fourth, CD98, varied more than 4-fold within a 4-h period of the cell cycle.

EXPERIMENTAL PROCEDURES
Cell Culture-The cervical cancer cell line (HeLa) was grown in Dulbecco's modified Eagle's medium supplemented with 5% fetal bovine serum (Hyclone) and 100 IU of penicillin-streptomycin. Cells were grown at 37°C with 5% CO2. For mass spectrometry analyses cells were grown until ~90% confluent. For cell cycle experiments HeLa cells (10^6 cells) were plated in 100-mm dishes and incubated in standard medium for 24 h. HeLa cells were arrested in S phase by using a double thymidine block as described previously (20). Briefly cells were blocked with 2 mM thymidine for 18 h, rinsed and released for 9 h, and then blocked with 2 mM thymidine for 17 h. Cells were released and allowed to progress through the cell cycle. Cell lysates were collected immediately after the second release (zero time) and every 2 h thereafter for 16 h.
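The synchronization and sampling schedule described above can be written down as a small sketch. The durations are those stated in the text; the names `Step`, `DOUBLE_THYMIDINE_BLOCK`, and `collection_times` are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    hours: float


# Durations as stated in the protocol text.
DOUBLE_THYMIDINE_BLOCK = [
    Step("first block, 2 mM thymidine", 18),
    Step("release", 9),
    Step("second block, 2 mM thymidine", 17),
]


def collection_times(interval_h: int = 2, last_h: int = 16) -> list:
    """Hours after the second release at which lysates are collected."""
    return list(range(0, last_h + 1, interval_h))


# Total time spent in the block/release/block sequence.
total_block_hours = sum(step.hours for step in DOUBLE_THYMIDINE_BLOCK)
```

With these values the synchronization takes 44 h, and sampling at zero time plus every 2 h for 16 h gives nine lysates per time course.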
Periodate Oxidation/Hydrazide Resin Coupling-Cells were washed once with 1× PBS and incubated with 10 mM NaIO4 at room temperature for 1 h. Excess NaIO4 was washed from the cells, and the cells were lysed. Lysis was carried out with a 1% solution of the non-ionic detergent octyl-β-D-1-thioglucopyranoside in buffer (20 mM Tris-HCl, 150 mM NaCl, 0.002% NaN3, pH 7.5, and 1% protease inhibitor mixture (Sigma)). This class of detergents has been shown to result in complete solubilization of whole membranes, including lipid rafts (21).
Cell solubilization was completed by passing the cell lysate through a 27.5-gauge needle (15 times), and the cell lysate was mixed with hydrazide resin (Affi-Gel Hz Hydrazide Gel, Bio-Rad) overnight at room temperature. Subsequently the bound sample was processed according to methods published previously (15,16) with slight modifications. The resin was washed with 8 M urea in 100 mM NH4HCO3, pH 8.3 (AmB). After three washes with AmB, immobilized proteins were reduced with 50 mM DTT in 50 mM AmB for 60 min at 37°C. DTT was removed using 50 mM AmB, and proteins were alkylated with 65 mM iodoacetamide in 50 mM AmB in the dark for 30 min at room temperature. Iodoacetamide was removed with a 50 mM AmB wash. Non-glycoproteins were removed from the resin by washing three times with 1.5 M NaCl and three times with 50 mM AmB. The immobilized proteins were digested with 40 ng/µl sequencing grade, modified porcine trypsin (Promega) in 50 mM AmB overnight in a 37°C water bath. Tryptic peptides were recovered, concentrated by solid phase extraction chromatography (Strata-X reversed phase, 30 mg/1 ml, Phenomenex), and analyzed by LC/MS. The hydrazide resin was washed three times with 1.5 M NaCl, 80% ACN, 100% MeOH, water, and 50 mM AmB before N-linked glycopeptides were released from the resin with the enzyme N-glycosidase F (glyco-N-glycanase, 200 milliunits, Prozyme). Four microliters of PNGase F in 500 µl of AmB (50 mM) was added to the resin and incubated overnight at 37°C, and the solution that contains N-linked peptides was recovered, concentrated by solid phase extraction, and analyzed by LC/MS.
M. amurensis Chromatography-A Poly Prep column (Bio-Rad) was packed with 2 ml of M. amurensis lectin (Sigma lot number 036K4075 was used to prepare all of the columns used for the studies reported) immobilized (5 mg/ml) on CNBr-activated Sepharose 6MB (GE Healthcare). According to the vendor this M. amurensis lectin preparation is a mixture of the isolectins MAA and MHA that have been characterized previously in terms of their carbohydrate binding specificity (17)(18)(19). MAA preferentially binds NeuAc-α2-3-linked N-glycans, whereas MHA binds NeuAc-α2-3-O-linked glycans. Ten milligrams of M. amurensis lectin was dissolved in 2 ml of coupling buffer (0.1 M sodium bicarbonate buffer, pH 8.3, 0.5 M NaCl) and mixed with 2 ml of the CNBr-activated Sepharose 6MB that had been treated with 1 mM HCl and washed. Based on protein (Bradford) analysis of the supernatant recovered after the coupling reaction, all of the lectin was bound to the resin yielding ~5 mg of lectin/ml of resin. The resin was packed by gravity flow using 7-10 ml of the lectin resin. The column was conditioned with a 10× volume of Tris column buffer (20 mM Tris-HCl, 500 mM NaCl, 1 mM MgCl2, 1 mM CaCl2, 0.02% NaN3, pH 7.5). The entire lectin affinity chromatography protocol was performed at 4°C. Cell lysate, prepared as above but without periodate oxidation, was passed over the column four times, and the column was washed with Tris column buffer containing 0.1% Tween 20 and then with Tris column buffer. The proteins were eluted with 20 mM ethylenediamine. For mass spectrometric analysis, the eluted fraction was filtered through a Microcon YM-30 centrifugal filter device, and the proteins were reduced (30 mM DTT), alkylated (55 mM iodoacetamide), denatured (6 M urea in 50 mM AmB), and treated with PNGase F and trypsin. The tryptic peptides were analyzed by LC/MS.
Preliminary results demonstrated that carrying out the lectin column chromatography at 4°C versus room temperature and with divalent cations (Ca2+ and Mg2+) present in the column loading buffer improved glycoprotein binding to the lectin column. In fact, few glycoproteins were detected in the lectin-bound fraction without the addition of the divalent cations. The number of bands observed on the M. amurensis lectin Western blots and band intensity were increased when the rate of sample loading was relatively slow (70 µl/min) and when the cell lysate was passed over the lectin column three times before column washing was initiated.
A variety of buffers were investigated for efficient elution of the proteins bound to the M. amurensis lectin beads. These included acidic solutions (e.g. 0.1% trifluoroacetic acid in 1× PBS and 0.1 M glycine, pH 3.41 in 1× PBS) and a basic solution (a 20 mM unbuffered ethylenediamine solution). In addition, buffers containing carbohydrates (e.g. sialyllactose) expected to compete with glycoproteins for lectin binding were tested. The elution efficiency of each buffer was analyzed on Western blots by comparing the number and intensity of protein bands in the eluted fraction with those retained on the lectin column. Using these criteria, an unbuffered ethylenediamine solution was found to produce the best elution efficiency. This was demonstrated by the presence of strongly stained bands in the eluted fraction lane and negligible staining in the bead lane (proteins retained on the lectin beads after treatment with elution buffer) on Western blots.
Adding the detergent wash step to the lectin column protocol increased the number of membrane proteins identified in the lectin-eluted fraction by more than 30%. In addition, the number of peptides detected for each membrane glycoprotein and the peptide ion score for each peptide detected were substantially higher than in the analysis of proteins eluted from the M. amurensis lectin column without detergent washes. Similarly the addition of 500 mM NaCl to the cell lysate substantially increased the number of membrane glycoproteins identified and reduced the number of non-glycoproteins identified. For example, when the lysate did not contain 500 mM NaCl, only four of the top 20 proteins detected (based on spectra counts) were glycoproteins. In contrast, when 500 mM NaCl was included in the lysis buffer, 12 of the top 20 proteins identified were glycoproteins.
Protein Identification by LC/ESI-MS/MS Analysis-The resulting tryptic and PNGase F-released peptides were analyzed by LC/ESI-MS/MS using an LTQ ion trap mass spectrometer (Thermo Finnigan, San Jose, CA). Nano-LC and MudPIT analyses were conducted using a dual pump Thermo Surveyor HPLC system. Peptide mixtures were chromatographically separated with seven solutions of NH4Cl (0, 20, 40, 60, 80, 200, 400 mM) using a Thermo BioBasic strong cation exchange column (320 µm × 100 mm) followed by in-line desalting and reversed phase (C18, 75 µm × 130 mm) nano-LC analysis. The mobile phases A and B (mobile phase A was 0.1% HCOOH in water, and mobile phase B was 0.1% HCOOH in acetonitrile) were used to create a three-step gradient of 5-30% B in the first 65 min followed by 30-80% B in the next 10 min and a hold at 80% B in the last 10 min. The LC/ESI-MS/MS data acquisition was set up to collect ion signals from the eluted peptides using an automatic, data-dependent scan procedure in which a cyclic series of three different scan modes (one full scan, four zoom scans, and four MS/MS scans for the top four abundant ions) were performed. The MS/MS analysis exclusion rule for the same precursor ion was set to a value of 2 during a 75-s period. The full-scan mass range was set from m/z 400 to 1800 and was divided into three segments, m/z 400-800, 700-1,100, and 1,100-1,500, for gas phase fractionation experiments. The resulting MS/MS spectra were searched against the Swiss-Prot human protein database using the Sequest, Mascot, and X!Tandem algorithms to identify sequences of peptides (22)(23)(24). Scaffold (Proteome Software) was used to merge and summarize resulting files from Sequest, Mascot, and X!Tandem.
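The reversed phase gradient described above can be expressed as a piecewise function of time. The following is a minimal sketch that assumes each ramp is linear between the stated end points; the names `percent_b` and `SCX_STEPS_MM` are hypothetical and are not part of any instrument software.

```python
# Salt steps (mM NH4Cl) used for the strong cation exchange fractionation.
SCX_STEPS_MM = [0, 20, 40, 60, 80, 200, 400]


def percent_b(t_min: float) -> float:
    """Percentage of mobile phase B at t_min minutes into the gradient.

    Assumes linear ramps: 5-30% B over 0-65 min, 30-80% B over 65-75 min,
    then a hold at 80% B until 85 min.
    """
    if not 0 <= t_min <= 85:
        raise ValueError("gradient is defined for 0-85 min")
    if t_min <= 65:
        return 5 + (30 - 5) * t_min / 65
    if t_min <= 75:
        return 30 + (80 - 30) * (t_min - 65) / 10
    return 80.0
```

Under these assumptions each reversed phase run lasts 85 min, so a seven-step MudPIT analysis corresponds to roughly 10 h of gradient time before re-equilibration is counted.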
Western and Dot Blot Analyses-Five or 10 µg of the HeLa cell lysate was loaded per lane on a 4-12% NuPAGE bis-Tris gel (Invitrogen) under reducing (30 mM β-mercaptoethanol) or non-reducing conditions. A voltage of 200 V was applied to the gel for 35-50 min. The proteins in the gel were transferred to a nitrocellulose membrane using the iBlot Dry Blotting System (Invitrogen). The membrane was blocked with 1% BSA in PBS, incubated in primary antibody solution (1:5,000 dilution), and incubated in anti-mouse IgG-horseradish peroxidase secondary antibody solution (1:10,000 dilution). The blot was then incubated in a 40:1 ratio of mixed Amersham Biosciences ECL Plus Western Blot Detection Reagents (GE Healthcare). Chemifluorescence detection of protein bands on the nitrocellulose membrane was imaged on a Storm 860 Gel and Blot Imaging System (GE Healthcare). For dot blot experiments, synchronous and asynchronous HeLa cell lysates were spotted on a nitrocellulose membrane using the Bio-Dot Microfiltration Apparatus (Bio-Rad). The membrane was blocked, incubated with antibodies (1:2,500), and detected as in the Western blot protocol. Spot intensities detected by chemifluorescence were analyzed using the array module of ImageQuant TL software (GE Healthcare).

RESULTS AND DISCUSSION
N- and O-linked glycoproteins are synthesized in the rough ER where some of the N-linked glycoproteins are resident. Other N- and O-linked glycoproteins are targeted to additional locations within the cell including the various compartments of the Golgi, the plasma membrane, endosomes, lysosomes, and secretory vesicles. Enzymes involved in the processing of N-linked glycans and synthesis of O-linked glycans are spatially distributed throughout the Golgi. The final processing steps occur in the TGN. Proteins that reach the TGN are often glycosylated with sialic acid residues. Thus, proteins that are found in the TGN or one of the subsequent destinations (endosome, lysosome, plasma membrane, or secreted) are often sialylated. Application of affinity chromatography with lectins that bind sialylated glycans should enrich for proteins that are found in the plasma membrane and other sites downstream of the TGN. Therefore, we used lectin chromatography to enrich for these glycoproteins (17)(18)(19). The proteins eluted from the M. amurensis column were treated with PNGase F to release N-linked glycans and convert the glycosylated Asn residues to Asp.
We also enriched for plasma membrane glycoproteins by treating intact cells with periodate under conditions that are known to oxidize vicinal hydroxyls of monosaccharides to aldehydes and following cell lysis coupled the glycoproteins carrying the oxidized glycans to hydrazide resin. Covalently linked proteins were digested into tryptic peptides for analysis, and N-linked glycopeptides were subsequently released by PNGase F cleavage of N-linked glycans with conversion of the glycosylated Asn residues to Asp. The peptides obtained from the M. amurensis column and hydrazide resin were identified by LC/MS. We anticipated that the two methods would produce significant overlap in protein identifications as well as some unique and thus complementary protein identifications. Our results show that the two methods produce substantially different identifications (more than one-half of the membrane glycoproteins were uniquely identified), demonstrating the usefulness of combining the results from these two approaches for identifying plasma membrane and other glycoproteins. A diagram of the work flow of our approach is shown in Fig. 1. We used the well characterized HeLa cell line (25) derived from human cervical cancer as a biological source because it was possible to verify many of the proteins identified by our approaches via information (e.g. antibody detection) available in the vast literature on HeLa cells.
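The comparison of the two enrichment methods amounts to set arithmetic on the protein identifications. The sketch below shows one way such an overlap could be summarized; the function name `compare_methods` and the single-letter identifiers are hypothetical placeholders and are not data from this study.

```python
def compare_methods(hydrazide: set, lectin: set) -> dict:
    """Summarize the overlap between two sets of protein identifiers."""
    shared = hydrazide & lectin
    combined = hydrazide | lectin
    unique = combined - shared  # identified by exactly one method
    return {
        "hydrazide_only": len(hydrazide - lectin),
        "lectin_only": len(lectin - hydrazide),
        "shared": len(shared),
        "combined": len(combined),
        "unique_fraction": len(unique) / len(combined) if combined else 0.0,
    }


# Hypothetical placeholder identifiers, not identifications from the study.
hydrazide_ids = {"A", "B", "C", "D"}
lectin_ids = {"C", "D", "E", "F", "G"}
summary = compare_methods(hydrazide_ids, lectin_ids)
```

In this toy example five of the seven combined identifications come from only one method, which is the kind of result described as complementary: the combined count equals the sum of the two methods minus the shared identifications.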
Tryptic peptides and N-linked tryptic glycopeptides treated with PNGase F were analyzed by nano-LC/ESI-MS/MS using an LTQ ion trap. Gas phase fractionation, using a narrow mass range for the precursor ion (see "Experimental Procedures"), was used to increase the number of protein identifications (26). To overcome the limited loading capacity of the nano-LC column, MudPIT analysis of the complex mixture of tryptic peptides was used (27). Seven sequential fractions were eluted from the strong cation exchange column, trapped, and desalted in line. The peptides in each fraction were separated by reversed phase chromatography using a nano-C18 column (75 µm × 130 mm).
To assure that the mass and the charge state assignments for the peptide ions were correct, a zoom (profile) scan was acquired prior to each MS/MS scan for each targeted precursor ion. This data acquisition arrangement allows manual investigation of the raw data, facilitating the elimination of false positive identifications of the N-linked peptides. The enzymatic deamidation of Asn to Asp by the action of PNGase F results in an increase of 1 Da in the glycopeptide mass, which is within the mass tolerance window (2 Da) used in the protein database search algorithms for our analyses. This led to the identification of glycopeptides in which the Asn was not converted to Asp as well as the corresponding glycopeptide in which the Asn was converted to Asp (generally the glycopeptide with the Asn converted to Asp had a higher score than the native glycopeptide). For some less abundant peptides, the MS data acquisition program calculated the peptide mass based on a more intense isotopic peak instead of the monoisotopic peak. Thus, false positive identifications had to be eliminated by manual inspection of every MS/MS spectrum assigned as an N-linked peptide by the searching algorithm.
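The ambiguity described above can be made concrete with a short calculation. This sketch uses standard monoisotopic residue masses and the 13C-12C spacing, which are textbook values rather than numbers taken from this study; the example precursor mass of 1500.0 Da and the helper name `within_tolerance` are hypothetical.

```python
ASN_RESIDUE = 114.04293  # monoisotopic residue mass of Asn, Da
ASP_RESIDUE = 115.02694  # monoisotopic residue mass of Asp, Da
C13_SPACING = 1.00335    # mass difference between 13C and 12C, Da

# Mass added when Asn is converted to Asp (about +0.984 Da).
DEAMIDATION_SHIFT = ASP_RESIDUE - ASN_RESIDUE


def within_tolerance(observed: float, theoretical: float, tol_da: float = 2.0) -> bool:
    """True if the observed mass matches the theoretical mass within tol_da."""
    return abs(observed - theoretical) <= tol_da


native_mass = 1500.0  # hypothetical native peptide mass, Da
deamidated_mass = native_mass + DEAMIDATION_SHIFT
native_first_isotope = native_mass + C13_SPACING

# Both species fall inside a 2-Da window around the native mass,
# and they differ from each other by only about 0.02 Da.
both_match_native = (
    within_tolerance(deamidated_mass, native_mass)
    and within_tolerance(native_first_isotope, native_mass)
)
separation = abs(native_first_isotope - deamidated_mass)
```

Because the deamidated peptide and the first isotopic peak of the native peptide are separated by only about 0.02 Da, precursor mass alone cannot distinguish them within such a window, which is consistent with the need for manual inspection of the MS/MS spectra.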
Application of the Periodate Oxidation/Hydrazide Method for the Enrichment of Cell Surface Glycoproteins-To isolate glycoproteins that are found on the extracellular face of cells, we incubated intact HeLa cells with sodium periodate to oxidize vicinal hydroxyl groups on the carbohydrates of glycoconjugates. Excess periodate was washed from the cells, and a detergent lysate of the cells was prepared as described under "Experimental Procedures." Cell surface glycoproteins were covalently linked to hydrazide resin, and the resin was washed as described by Zhou et al. (16). To identify covalently linked proteins, both tryptic peptides and PNGase F-released glycopeptides were analyzed by LC-MS/MS. Analyzing both tryptic peptides and N-linked glycopeptides resulted in a significant increase in the sequence coverage of the proteins, thus increasing the confidence of protein identification. The resulting MS/MS spectra were searched against the well annotated UniProt human protein database for protein identifications using three algorithms, Mascot, Sequest, and X!Tandem (22)(23)(24). Scaffold software was used to merge and summarize results from each algorithm. Protein identifications were based on a minimum of two peptide hits with a probability score of 95% or greater using the Peptide and Protein Prophet algorithms (28,29). Protein identifications for more abundant proteins were similar when protein identifications were made from MS/MS analyses using Mascot, Sequest, or X!Tandem. Less abundant proteins, identified on the basis of two or three unique peptides, were generally identified by one or more of the three (Sequest, Mascot, or X!Tandem) programs.
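The acceptance criterion stated above (at least two peptide hits at a probability of 95% or greater) can be sketched as a simple filter. This is an illustration only: the tuple layout, the function name `confident_proteins`, and the example hits are hypothetical, and the sketch assumes the 95% threshold is applied to each peptide; it does not reproduce the Scaffold, PeptideProphet, or ProteinProphet calculations.

```python
from collections import defaultdict


def confident_proteins(peptide_hits, min_peptides=2, min_probability=0.95):
    """Return proteins supported by enough distinct high-probability peptides.

    peptide_hits is an iterable of (protein, peptide, probability) tuples.
    """
    peptides_by_protein = defaultdict(set)
    for protein, peptide, probability in peptide_hits:
        if probability >= min_probability:
            peptides_by_protein[protein].add(peptide)
    return sorted(
        protein
        for protein, peptides in peptides_by_protein.items()
        if len(peptides) >= min_peptides
    )


# Hypothetical example hits, not data from the study.
hits = [
    ("protA", "PEPTIDEK", 0.99),
    ("protA", "SAMPLER", 0.96),
    ("protB", "PEPTIDER", 0.99),
    ("protB", "LOWSCOREK", 0.80),
    ("protC", "ONLYONEK", 0.97),
]
accepted = confident_proteins(hits)
```

Here only `protA` passes, because `protB` has one peptide below the threshold and `protC` is supported by a single peptide.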
Membrane Glycoproteins Identified by the Periodate Oxidation/Hydrazide Resin Method-A one-dimensional nano-LC/ESI-MS/MS analysis of a tryptic digest from 30 µg of HeLa cell protein lysate resulted in the identification of 185 proteins, including many plasma membrane proteins (data not shown). A replicate analysis produced an 84% overlap in protein identifications. Gas phase fractionation analysis of the same sample increased protein identifications by 15%. To obtain basic information (subcellular location, the number of transmembrane domains, and N-linked and O-linked sites) on the identified proteins, we implemented a script program (Proteome Solutions) that utilizes the UniProt primary accession number for each protein to extract and summarize the available information from the Web site of UniProtKB into an Excel spreadsheet. A total of 254 proteins were identified from the results obtained by carrying out two nano-LC runs and two gas phase fractionation analyses. Sixty percent of the identified proteins were found to be membrane proteins and/or glycoproteins.
Additional protein identifications were obtained by MudPIT analysis of a larger amount of the HeLa cell tryptic digest released from the hydrazide resin. A total of 589 proteins were identified using a combination of MudPIT, nano-LC, and gas phase fractionation analyses using about 300 µg of HeLa lysate protein (supplemental Tables 1, A and B, and 2). Fourteen of the 20 proteins (70%) identified with the largest number of spectra counts were glycoproteins (Table I) based on information in the UniProt database and results presented later (see Tables III, IV, and V and supplemental Table 4) in which N-linked peptides were identified for each. These glycoproteins had molecular masses of 29.8-274.3 kDa and an average sequence coverage of 56%. Among the glycoproteins that gave the largest number of spectra counts are the glycosylphosphatidylinositol (GPI)-anchored proteins placental alkaline phosphatase (PLAP-1) and intestinal alkaline phosphatase (IAP; an isoform of PLAP-1 with 86% sequence identity), the cell surface glycoproteins MUC18 (CD146), basigin (CD147), transferrin receptor (CD71), and integrin β1 (CD29), cathepsin D, and the cation-independent mannose 6-phosphate receptor. Some of these glycoproteins have been shown to be associated with tumorigenesis. For example, PLAP-1 has been observed frequently in gonadal cancers, and its level is elevated in the serum from ovarian cancer patients (30). An elevated level of CD147, a type I single pass membrane protein, was found to enhance the growth of carcinoma cells in vivo (31).
Among the 589 HeLa cell proteins identified, 191 are glycoproteins. Fig. 2 shows their subcellular locations based on UniProtKB designations. Most of these proteins are classified as either cell membrane (mostly plasma membrane including GPI-anchored), ER, lysosomal, or secreted proteins. Many of these proteins are listed as being found in various subcellular locations. For example, the cation-independent mannose 6-phosphate receptor has been detected in the ER, Golgi, plasma membrane, and lysosomal compartments and can be secreted (32)(33)(34). A total of 110 integral membrane glycoproteins were found among 191 glycoproteins identified, and most of these are single pass (types I, II, III, and IV) membrane proteins. However, 25 multipass membrane proteins were also identified, including the high affinity cationic amino acid transporter 1 with 14 transmembrane domains. Table II provides a summary of information on each type of integral membrane glycoprotein found. A detailed description of each (number of transmembrane domains, glycosylation, and function) is presented in supplemental Table 2. These results demonstrate that the periodate oxidation/hydrazide method effectively enriches for membrane glycoproteins.
Secreted and Lumenal Glycoproteins Identified by the Periodate Oxidation/Hydrazide Resin Method-More than 70 soluble glycoproteins were identified from the HeLa cell lysate prepared following periodate oxidation of intact cells (Fig. 2). Approximately one-third of the proteins identified are known to be secreted from cells. Another third of the proteins identified are soluble lysosomal proteins that can be secreted and then shuttled to lysosomes via the cation-independent mannose 6-phosphate receptor. Surprisingly approximately one-third of the proteins identified are listed as ER lumen proteins in UniProt, whereas no Golgi lumen proteins were found. Two proteins listed as being associated with cytoplasmic vesicles were also identified. This array of soluble proteins was found despite the fact that the HeLa cells were washed to remove growth medium and washed again after periodate treatment. However, some of these proteins (e.g. collagen and vitronectin) become components of the extracellular matrix once secreted and therefore are unlikely to be removed by the washing procedures. Others may have remained associated with the cells or extracellular matrix through protein-protein interactions.
Nonspecific Binding of Abundant Proteins to the Hydrazide Resin-In addition to the glycoproteins identified, a number of non-glycoproteins were identified, indicating that some nonspecific protein binding to the hydrazide resin occurs even though a high salt wash (1.5 M NaCl) of the resin was used. To evaluate the extent of nonspecific interaction between the hydrazide resin and proteins in the HeLa cell lysate, cell lysates prepared from HeLa cells that were not treated with periodate were mixed with the hydrazide resin. The resin was washed in the same manner as for the periodate-oxidized samples. Tryptic digests of nonspecifically bound proteins were collected. Three nano-LC runs of tryptic digests showed that a significant number of proteins (333) could be detected (supplemental Table 3, A and B). The most abundant, nonspecifically bound proteins were derived from the cell cytosol including elongation factor 2, tubulin β-2c, HSP90, β-actin, L-lactate dehydrogenase B chain, and elongation factor 1-α1. These proteins were previously reported to be the most abundant proteins detected from HeLa whole cell lysate (35). Only 17 of the proteins (5.1%) from the control experiment (non-periodate-treated cells) were glycoproteins, and these were among the most abundant proteins (CD71, CD98, CD146, IAP, etc.) identified in the tryptic digests of the cell surface periodate-oxidized sample. However, in the control lysate, these glycoproteins were found in far lower abundance than the cytosolic proteins based on their MS/MS spectra counts. Comparing the results from the control (17 glycoproteins) versus treated (191 glycoproteins) cell lysates demonstrates that the periodate oxidation/hydrazide method substantially increases the number of glycoproteins identified.

TABLE II Transmembrane glycoproteins detected in HeLa cells
The numbers listed in columns 2 and 3 are the total number of membrane glycoproteins identified by the periodate oxidation/hydrazide or M. amurensis chromatography methods. Some proteins were identified in samples from both methods (number in parentheses in column 4). Therefore, the total number of protein identifications (column 4) is less than the sum of the number of proteins identified by each method alone.

N-Linked Glycosylation Sites Identified in the Periodate-treated/Hydrazide-bound Glycoproteins-Following trypsin digestion, glycopeptides that were covalently linked to the hydrazide resin were released from the resin by PNGase F treatment. Because the PNGase F-released sample is far less complex than the fraction generated by trypsin, it was analyzed by nano-LC/MS in combination with gas phase fractionation only. Sites of N-linked glycosylation were identified by locating peptides with an N-linked glycosylation consensus sequence (NX(S/T) where X is not P) in which the Asn residue was converted to Asp. To assure the accuracy of peptide sequence assignment, the MS/MS spectrum for each identified glycopeptide was manually evaluated for (i) the expected mass deviation between the mass of the detected ion and the calculated mass of the deamidated peptide, (ii) the Mascot ion score, (iii) the Sequest Xcorr score, (iv) the X!Tandem -log(e) score, and (v) fragment ion matches. This extensive validation process eliminated several false positives. As a result, the average Mascot ion score of the N-linked glycopeptides in our list is more than 70, demonstrating the high quality of identification. Only three triply charged peptides were included in the list of glycopeptides; other peptides with a charge of three or higher were discarded because of the lack of fragments generated within the N-linked consensus sequence.
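The site assignment rule described above (an Asn to Asp conversion located within an NX(S/T) consensus sequence, where X is not P) can be sketched as follows. The native peptide sequences and residue numbering are those quoted in this report for CD71 and N-acetylglucosamine-6-sulfatase; the "observed" sequences with Asp substitutions are hypothetical illustrations, and `classify_deamidations` is a hypothetical helper. The check is limited to the peptide itself, so a sequon that extends past the C terminus of the peptide would not be recognized.

```python
import re

# Lookahead so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def classify_deamidations(native: str, observed: str, start: int = 1):
    """List (position, kind) for each Asn in `native` observed as Asp.

    kind is "sequon" when the Asn starts an NX(S/T) motif (X not P) in the
    native sequence, otherwise "non-sequon". `start` is the residue number
    of the first amino acid of the peptide within the protein.
    """
    if len(native) != len(observed):
        raise ValueError("sequences must have the same length")
    sequon_starts = {match.start() for match in SEQUON.finditer(native)}
    results = []
    for index, (expected, seen) in enumerate(zip(native, observed)):
        if expected == "N" and seen == "D":
            kind = "sequon" if index in sequon_starts else "non-sequon"
            results.append((start + index, kind))
    return results


# Native sequences from the text; observed sequences are hypothetical.
cd71 = classify_deamidations("QNNGAFNETLFR", "QNDGAFDETLFR", start=721)
sulfatase = classify_deamidations("YYNYTLSINGK", "YYDYTLSIDGK", start=181)
```

A conversion flagged as "non-sequon" (for example within an NG motif) is a candidate for nonenzymatic deamidation rather than evidence of N-glycosylation.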
In total, 80 N-linked peptides (76 unique N-linked sites) within 56 glycoproteins were identified (Table III and supplemental Table 4). Thirty-four of the 76 N-linked sites identified have not been confirmed previously (listed as potential in UniProtKB but not reported in one of the large glycopeptide studies (15, 36-38)). Table IV provides details on the newly identified N-linked sites and the identity of the parent glycoprotein. Among the N-linked glycopeptides, four were only identified when the data were analyzed using X!Tandem (see supplemental Table 4). This demonstrates the usefulness of carrying out data analysis with multiple algorithms (i.e. Mascot, Sequest, and X!Tandem). Among the 56 proteins identified from the N-linked glycopeptide analysis, 54 were also identified in the tryptic digests. Two proteins, receptor-type tyrosine-protein phosphatase γ and translocon-associated protein β subunit precursor, were uniquely identified via N-linked glycopeptides, each by a single N-linked glycopeptide. Fig. 3 shows an example of the MS/MS spectrum of an N-linked glycopeptide (AA 316-328) derived from the protein CD98. The fragments shown in Fig. 3 demonstrate that Asn-323 was converted to Asp, whereas Asn-327 was not. Thus, the mass difference of 115 Da between fragments y5 and y6 confirms that a conversion of Asn to Asp occurred at Asn-323 but not at Asn-327, showing that CD98 is N-glycosylated on Asn-323. It is possible that the conversion of Asn to Asp in the peptides identified occurred by nonenzymatic deamidation (39), but this seems unlikely because 40 of the N-linked sites identified in our analyses confirm previously identified sites. Furthermore a total of 26 of the peptides identified with an Asn to Asp substitution contained additional Asn residues (one or more within the same peptide) that were not converted to Asp.
Nevertheless we did observe nonenzymatic deamidation but only in two sequences, NGK and NGA, from the peptide (AA 181-191, YYNYTLSINGK) of N-acetylglucosamine-6-sulfatase and the peptide (AA 721-732, QNNGAFNETLFR) of CD71, respectively. These results are consistent with the observation reported by Robinson and Robinson (39) that Asn residues followed by a Gly residue undergo nonenzymatic deamidation more frequently than Asn followed by any other amino acid residue.
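The mass argument used for assigning Asn to Asp conversions can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of the analysis pipeline used in this study. The residue masses are standard monoisotopic values; the 0.3-Da tolerance and the function names are assumptions; and the sequon rule (Asn-X-Ser/Thr, with X not Pro) is the usual N-linked consensus sequence. Positions are 0-based indices within the peptide, not protein residue numbers.

```python
import re

# Standard monoisotopic residue masses (Da).
ASN_MASS = 114.04293
ASP_MASS = 115.02694

def classify_gap(delta_da, tol=0.3):
    """Name the residue (Asn or Asp) implied by the gap between adjacent y ions.

    tol is an assumed mass tolerance in Da; returns None if neither residue fits.
    """
    name, mass = min((("Asn", ASN_MASS), ("Asp", ASP_MASS)),
                     key=lambda r: abs(delta_da - r[1]))
    return name if abs(delta_da - mass) <= tol else None

def sequon_sites(peptide):
    """0-based indices of Asn residues inside an N-X-S/T motif (X not Pro)."""
    return [m.start() for m in re.finditer(r"N(?=[^P][ST])", peptide)]

def asn_gly_sites(peptide):
    """0-based indices of Asn residues followed by Gly (deamidation-prone)."""
    return [m.start() for m in re.finditer(r"N(?=G)", peptide)]

if __name__ == "__main__":
    # Deamidation (or PNGase F conversion) raises the residue mass by about 0.984 Da.
    print(round(ASP_MASS - ASN_MASS, 3))
    for pep in ("YYNYTLSINGK", "QNNGAFNETLFR"):
        print(pep, sequon_sites(pep), asn_gly_sites(pep))
```

In both peptides the Asn-Gly motif lies outside the consensus sequon, which is why an Asp at that position is better explained by nonenzymatic deamidation than by PNGase F release of a glycan.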
Lectin Affinity Chromatography Enrichment of Integral Plasma Membrane Glycoproteins-A resin with two sialic acid-binding isolectins from M. amurensis (MAA and MHA) covalently attached was used to isolate glycoproteins from HeLa cell lysates. The effect of various parameters on the binding of non-glycoproteins versus glycoproteins to the lectin affinity column was evaluated by Western blot and LC/ESI-MS/MS analyses to develop an optimized separation (see "Experimental Procedures"). We found that glycoprotein binding to the M. amurensis lectin column was optimal when the chromatography was carried out at 4°C with divalent cations (Ca2+ and Mg2+) present in the column loading buffer. Including a relatively high concentration of salt (0.5 M NaCl) in the loading buffer and a detergent (0.1% Tween 20) wash significantly

TABLE III Glycoproteins and glycopeptides detected in HeLa cell lysates using the hydrazide or M. amurensis chromatography methods
The numbers listed in columns 2 and 3 are the total number of glycoproteins or glycopeptides that contain N-linked sites identified by the periodate oxidation/hydrazide or M. amurensis chromatography methods. Some glycoproteins or glycopeptides were identified in samples from both methods (number in parentheses in column 4). Therefore, the total number of glycoprotein and glycopeptide identifications in column 4 is less than the sum of the number of proteins identified by each method alone.

Table 5, A and B). Many of the non-glycoproteins identified are those that were found to bind nonspecifically to the hydrazide resin and that were found to bind to the agarose resin (results not shown) used as the support for the M. amurensis column.
Based on the subcellular locations obtained from the UniProtKB database, the majority of the glycoproteins identified are listed as cell membrane, lysosomal, or secreted proteins as shown in Fig. 4. Approximately 57% of the proteins identified following M. amurensis chromatography were categorized as membrane glycoproteins. This was similar (57.1 versus 61.8%) to that obtained by the hydrazide method. In contrast, the M. amurensis chromatography method resulted in a slightly lower percentage of secreted and ER/rough ER glycoproteins than the hydrazide method. A significant number of lysosomal proteins (23 proteins) were also identified in the fraction eluted from the M. amurensis column. This should not be surprising because it has been shown previously that in addition to the phosphomannose-capped N-linked glycans these proteins carry sialylated glycans (40).
As found for the hydrazide method, the most abundant glycoproteins identified in the M. amurensis column-bound fraction based on the spectra counts were PLAP-1, IAP, CD147, CD146, CD98, CD71, and CD59. A total of 94 integral membrane glycoproteins were identified by the M. amurensis method as shown in Table II, including two proteins with up to 12 transmembrane domains. Among the integral membrane proteins, single pass membrane glycoproteins were the most abundant (81 proteins). Sixty-six integral membrane glycoproteins identified following enrichment of the HeLa cell lysate by M. amurensis chromatography were also found using the hydrazide method. However, a significant number of integral membrane proteins were uniquely identified by one of the methods (44 and 28 by the hydrazide and the M. amurensis chromatography methods, respectively).
(15, 36-38) Swiss-Prot accession numbers are given in parentheses after the protein names.

N-Linked Glycosylation Sites Identified in the M. amurensis Lectin-bound Glycoproteins-PNGase F treatment of the glycoproteins that bound to and were eluted from the M. amurensis column released glycans from N-linked sites and converted the Asn residues to Asp residues. Unlike the periodate/hydrazide method in which tryptic peptides are first released from the hydrazide resin followed by PNGase F release of the N-linked glycopeptides, the deglycosylated glycopeptides from the M. amurensis column-bound glycoproteins are mixed with other tryptic peptides in the sample. This could lead to a significant number of incorrectly identified (false positive) sites of N-glycosylation. To minimize false positive identifications of N-linked sites, a manual investigation of the zoom scan spectrum for each ion was done to confirm the mass assignment of each N-linked glycopeptide.
After evaluating the spectra of potential N-linked glycopeptides from the sample prepared by the M. amurensis chromatography method, 76 N-linked glycopeptides were found with 75 representing unique N-linked sites (Table III). Among the N-linked sites identified, 40 have not been reported previously. Table V provides details on the newly identified N-linked sites and the identity of the parent glycoprotein. Complete information on the sequences, ion scores, and detected masses of the glycopeptides identified is provided in supplemental Table 4. Overall the average number of N-linked sites per glycoprotein identified for proteins isolated by the M. amurensis chromatography method (75/45 = 1.67) is higher than that identified for glycoproteins isolated by the hydrazide method (76/56 = 1.36). This difference could be due to a higher efficiency of PNGase F-catalyzed N-linked glycan release from glycoproteins in solution (used in the M. amurensis chromatography method) versus glycopeptides released from the hydrazide resin.
In summary, a total of 240 glycoproteins were identified from HeLa cell lysates fractionated by the periodate/hydrazide and the M. amurensis chromatography methods. Although there is some overlap (101 proteins) among the glycoproteins identified by the hydrazide and the M. amurensis chromatography methods, a significant number of the 240 glycoproteins were uniquely identified by each of the methods. Thus, the information obtained by using the two methods is substantially greater compared with that obtained by either method independently. Similarly combining the results obtained by the two methods substantially increased the number of N-linked glycosylation sites; only 24 of 127 were detected by both methods. These results demonstrate that the hydrazide method and the M. amurensis chromatography method are complementary for glycoprotein and N-linked glycosylation site identification. Thus, a more comprehensive catalog of plasma membrane glycoproteins that may be potential biomarkers is achieved when these two methods are used together.
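The combined totals quoted in this summary follow from simple inclusion-exclusion. The Python sketch below reproduces them from counts reported in the text: 76 and 75 unique N-linked sites with 24 shared, and 94 integral membrane glycoproteins by M. amurensis chromatography with 66 shared and 44 unique to the hydrazide method. The function name is ours, and the hydrazide total of 110 integral membrane glycoproteins is derived here as 44 + 66 rather than quoted from a table.

```python
def combine(n_a, n_b, n_both):
    """Inclusion-exclusion for two identification lists A and B.

    n_a and n_b are the totals found by each method; n_both is the overlap.
    """
    if n_both > min(n_a, n_b):
        raise ValueError("overlap cannot exceed either total")
    return {
        "only_a": n_a - n_both,
        "only_b": n_b - n_both,
        "both": n_both,
        "union": n_a + n_b - n_both,
    }

# N-linked sites: hydrazide (A) versus M. amurensis chromatography (B).
sites = combine(76, 75, 24)

# Integral membrane glycoproteins; 110 = 44 unique + 66 shared (derived value).
membrane = combine(44 + 66, 94, 66)

if __name__ == "__main__":
    print(sites)
    print(membrane)
    print(f"sites seen by both methods: {sites['both'] / sites['union']:.1%}")
```

With these inputs the union of N-linked sites is 127, matching the total given above, and fewer than one-fifth of the sites are shared between the two methods.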
An indicator of the effectiveness of our strategies for isolating surface antigens is the number of cluster of differentiation (CD) antigens identified in our analysis. A total of 56 CD antigens were detected among the HeLa cell proteins, accounting for nearly 25% of the glycoproteins identified (supplemental Table 2). The effectiveness of surface membrane enrichment from our results is comparable with other reported studies (38, 41-45). We also detected other classes of surface membrane proteins including GPI-anchored proteins. Ten GPI-anchored proteins (supplemental Table 6A) were identified. Eight of the GPI-anchored proteins we identified have been reported to be present in HeLa cells by others (44,46), whereas CD230 and reticulon-4 receptor have not been reported previously in HeLa cells. Many of the CD antigens identified have important biological functions. For example, CD46, CD55, and CD59 are essential molecules for protecting cells from complement-mediated lysis (47). We also detected several CD antigens as well as other membrane glycoproteins that have been proposed to be involved in cancer metastasis (48), including cell adhesion molecules (integrins α, β chain family: CD49a, -b, -c, -e, -f, CD29, CD51, and CD104) and proteolytic enzymes (ADAMs: ADAM9, ADAM10 (CD156c), and ADAM17 (CD156b)).
A number of other cell surface molecules with critical biological functions that have been associated with development and cancer were also found in lower abundance (based on spectra counts and the number of peptides identified) in HeLa cells. These included proteins that form molecular complexes such as Nicastrin, a subunit of the γ-secretase complex, an endoprotease complex that catalyzes the intramembrane cleavage of integral membrane proteins such as Notch (neurogenic locus notch homolog protein) receptors. Notch2 was also identified among the HeLa cell glycoproteins and has been detected previously in HeLa cells by Western blotting (49). Interestingly Notch1 was not found in our analysis of HeLa cell lysates, and it has been shown to be absent in HeLa cells and cervical cancer tissue. Talora et al. (49) found that Notch1 gene expression is markedly reduced in cervical cancer and that it exerts a protective effect against human papillomavirus-driven transcription of viral genes. Talora et al. (49) proposed that down-modulation of Notch1 has an impor-  (50) reported that the leukocyte common antigen-related receptor tyrosine phosphatase, which regulates cell adhesion, is another substrate of the γ-secretase complex. This protein was also found among the glycoproteins identified in our study.
To further verify that the glycoproteins identified in our study are expressed by HeLa cells, we searched the Human Protein Atlas database that provides information on protein expression in cell lines and tumor tissues evaluated by immunohistochemistry (IHC) (51). Among the 240 glycoproteins identified in our study, 79 proteins have been evaluated by IHC in HeLa cells by the Human Protein Atlas project. Sixty-eight of these 79 glycoproteins (86%) were detected by IHC in the HeLa cells (supplemental Table 6B) used in the Human Protein Atlas project. Many of these proteins, such as CD29, CD71, CD147, endoplasmin, and proactivator polypeptide, are strongly expressed according to the IHC results and were also detected among the more abundant glycoproteins via our MS analyses. Even in four of the cases in which the Human Protein Atlas lists the IHC result as negative, between 4 and 13% of the HeLa cells examined were scored as positive for antibody binding.
The periodate/hydrazide method was used in a unique manner in our study by carrying out the oxidation with periodate on intact cells. Periodate treatment is expected to oxidize the vast majority of monosaccharides on the cell surface glycans. Thus, one would predict that the majority of cell surface glycoproteins as well as glycoproteins making up or associated with the extracellular matrix would be labeled. This is in fact what was observed.
The M. amurensis chromatography method was selected because the isolectins of M. amurensis are expected to bind a wide range of sialylated glycoproteins containing a terminal 2,3-linked sialic acid, both N-linked and O-linked. In addition, these lectins have been shown to bind additional carbohydrates including those containing 3-O-sulfates (52), 9-O-acetylated sialic acids, and under some circumstances unsubstituted Galβ1,4GlcNAc (53). Various forms of the M. amurensis lectins are commercially available, and their carbohydrate binding properties will likely vary depending on the lectin preparation. We used a preparation from Sigma for the results presented here but also obtained similar results with M. amurensis lectin from E-Y Laboratories.
Given the way in which the two methods were used, one would anticipate that most of the cell surface glycoproteins identified in the glycoprotein sample obtained by M. amurensis chromatography would have been observed among those identified by the periodate/hydrazide method. However, this was not the case. Rather a significant number of cell surface glycoproteins were uniquely identified in the sample from the M. amurensis chromatography. For example, 12 CD antigens were uniquely found among the glycoproteins identified in the M. amurensis sample. This suggests that other factors contribute to determining which glycoproteins are detected in each sample. It is interesting to note that most of the CD antigens uniquely identified in the M. amurensis sample were identified by only two or three peptides and had relatively low spectra counts. Thus, it is possible that they were part of the mixture of glycoproteins isolated by the periodate/hydrazide method, but the peptides derived from them were overshadowed by peptides from more abundant proteins. However, some of the plasma membrane proteins uniquely identified in the M. amurensis sample had higher spectra counts. For example, the HLA class I histocompatibility antigen Cw-12 α chain generated 50 spectra counts and was identified by eight unique peptides accounting for more than 35% sequence coverage. Obtaining more information on the types of carbohydrate structures expressed on this protein in HeLa cells might provide more insight into why this plasma membrane glycoprotein was not detected by the periodate/hydrazide method.
Glycoproteins Display Distinct Patterns of Expression during the Cell Cycle-Studies of gene expression during the cell cycle have been used to identify genes that have regulated expression at various points during the cell cycle (20,54). Few proteomics analyses (55) have focused on how protein expression changes during the cell cycle, and we are unaware of any proteomics analyses on how the level of cell surface proteins varies during the cell cycle.
To investigate the cell cycle-dependent expression of proteins in the plasma membrane, we carried out a series of experiments to select a set of proteins that are expressed at the cell surface of HeLa cells. To achieve our goal, we required a set of antibodies that would work in flow cytometry and Western and dot blots. Therefore, we selected a suite of antibodies to a set of glycoproteins identified in HeLa cells via the methods described in the preceding sections and evaluated them using flow cytometry and Western and dot blotting. Antibodies to CD9, CD29, CD55, CD59, CD71, CD98, CD146, CD147, and CD171 were used to evaluate whether these glycoproteins are expressed at the HeLa cell surface. These CD antigens were identified by both the periodate/hydrazide and the M. amurensis chromatography methods based on amino acid sequence coverage ranging from 27 to 59% and the number of the unique peptides identified ranging from five to 33 peptides (supplemental Fig. 1).
All antibodies gave positive results in flow cytometry assays of non-permeabilized HeLa cells, demonstrating that each glycoprotein is present at the cell surface of HeLa cells (results not shown). Each antibody except anti-CD9 and anti-CD98 produced a positive Western blot. Most of the antibodies gave positive results only under either reducing or non-reducing conditions, whereas anti-CD147 could be used under either condition (Fig. 5). The Western blot results demonstrated that most of the proteins were detected as a single band, whereas the Western blot results for CD147 and CD171 produced multiple bands. In the case of CD147, it has been observed previously that the protein appears as a series of bands resulting from differences in glycosylation (56). CD171 has been reported to produce bands in the 200-220-kDa molecular mass range depending on its level of glycosylation. Preliminary results (not shown) obtained from binding lysate to various lectin resins (e.g. concanavalin A and MAA) demonstrate that many of the CD147 and CD171 bands observed in Fig. 5 bind to multiple lectins. Further studies will be required to more fully characterize glycosylation differences among the CD147 isoforms. CD171 can be cleaved by proteases, including ADAM10 (CD156c), a metalloprotease observed among the glycoproteins identified by mass spectrometry, resulting in fragments of 180 and 40 kDa. We observed bands for this protein at ~200 and 40 kDa.
All of the antibodies except anti-CD9 were positive in dot blot assays. Dot blot assays were carried out with various amounts of lysates obtained from asynchronous cultures of HeLa cells to establish the linear response regions for each antibody. For example, the binding curves for antibodies to CD29, CD98, CD146, and CD147 using HeLa cell lysate and detected by ECL Plus chemifluorescence are shown in Fig. 6. Control blots without primary antibody did not produce a signal (blot not shown). Sigmoidal plots of signal intensity versus the logarithm of asynchronous HeLa lysate protein spotted were found to be linear in the range of 1.0-4.0 μg. Thus, the levels of CD29, CD98, CD146, and CD147 were examined at 2-h intervals during the cell cycle using HeLa cell lysate (two different amounts between 1 and 4 μg). Similar curves were obtained for the antibodies to CD29, CD146, and CD147.
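One way to judge whether a portion of a dot blot dilution series is linear on a semilogarithmic scale is to fit signal against the logarithm of the amount spotted and inspect the coefficient of determination. The Python sketch below illustrates this under stated assumptions: the function name is ours, the signal values are invented for demonstration and are not data from this study, and an ordinary least-squares fit is assumed rather than taken from the procedure actually used.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, intercept, r_squared

# Hypothetical 2-fold dilution points inside the range of interest (micrograms)
# and made-up signal intensities (arbitrary units).
amounts_ug = [1.0, 2.0, 4.0]
signals = [10.0, 20.0, 30.0]

log_amounts = [math.log10(a) for a in amounts_ug]
slope, intercept, r_squared = fit_line(log_amounts, signals)

if __name__ == "__main__":
    print(f"slope per log10 unit: {slope:.2f}, R^2: {r_squared:.4f}")
```

An R^2 close to 1 over a chosen window supports treating that window as the working range; points on the plateaus of the sigmoidal curve would lower it.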
Cell lysates were prepared from cells synchronized by a double thymidine block (see "Experimental Procedures"). Lysates were prepared from cells at zero time (at removal of the second thymidine block) and at 2-h intervals thereafter and probed with the four antibodies. HeLa cell synchronization was evaluated by analyzing the level of cdc2 expression, which peaked 8 h after release from the second thymidine block (results not shown) as expected (20). The amount of each of the CD antigens varied during the cell cycle, but each had a distinct pattern. For example, CD98 reached its lowest level at 8 h postrelease, whereas CD146 was near its highest level of expression at that time point. For three of the CD antigens (CD29, CD146, and CD147) expression levels remained relatively constant (~2-fold change) throughout the 16-h study period. In contrast, the expression level of CD98 changed about 4-fold during the study period, dropping to its lowest level 8 h after release from the second thymidine block and rising rapidly to its maximum level within the next 2 h (10 h after release). The cells are entering M phase as CD98 expression begins to decline and reaches a minimum and are entering G1 phase as the protein rebounds to maximum expression levels.
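The fold changes quoted above are ratios of the highest to the lowest relative intensity across the time course. The Python sketch below shows that calculation. The function names are ours, and the time series is invented to mimic the CD98 pattern described in the text (minimum at 8 h, maximum at 10 h, about a 4-fold range); it does not contain measured values from this study.

```python
def fold_change(levels):
    """Ratio of the maximum to the minimum level in a {time_h: level} series."""
    lowest = min(levels.values())
    highest = max(levels.values())
    if lowest <= 0:
        raise ValueError("levels must be positive to compute a fold change")
    return highest / lowest

def extremes(levels):
    """Return (time of minimum, time of maximum) for a {time_h: level} series."""
    t_min = min(levels, key=levels.get)
    t_max = max(levels, key=levels.get)
    return t_min, t_max

# Hypothetical normalized intensities at 2-h intervals after release (0-16 h).
cd98_like = {0: 0.80, 2: 0.75, 4: 0.60, 6: 0.40, 8: 0.25,
             10: 1.00, 12: 0.90, 14: 0.85, 16: 0.80}

if __name__ == "__main__":
    print(fold_change(cd98_like), extremes(cd98_like))
```

Applying the same function to each antigen's normalized series makes it straightforward to separate proteins that stay within about 2-fold from those, like CD98, that vary more widely.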

FIG. 6. Antibody binding curves to HeLa cell lysates and CD antigen expression in HeLa cells during various parts of the cell cycle (A-D).
Left panels, dot blot and corresponding signal intensity plots for the binding of CD antibodies to HeLa lysates (1.7-4,000 ng) prepared from cells grown asynchronously. The lysate was dotted (2-fold dilutions), probed for the corresponding CD antigens, and detected using ECL Plus chemifluorescence (corresponding dot blot shown above each graph). Right panels, dot blots of HeLa lysates prepared from synchronized cells collected every 2 h after release from a double thymidine block. The dot blots (not shown) were probed for CD antigens, and the bound antibody was detected using ECL chemifluorescence. Relative intensities were an average of three sets of normalized data using two amounts of lysate in duplicate. Maximum error in any measurement was 18.5%.

CD98 has been shown to be associated with cell proliferation and growth and is the heavy chain of several dimer complexes with transport activity (amino acids, polyamines, and choline) (57,58). CD98 has been shown to be associated with a wide variety of cell functions (e.g. cell fusion, cell survival/death, and cell adhesion) and associates with several proteins including integrins and CD147 (59-61). Therefore, changes in the level of CD98 may also influence the cell surface expression of other proteins during the cell cycle. Although we did not measure the cell surface expression pattern of CD98 and CD147, it is clear from our analyses that the total level of expression of these two proteins is not coordinated in HeLa cells during the cell cycle. Additional experiments that include simultaneous measurement of the total and cell surface expression are needed to more fully understand the relationship of these two proteins and others that associate with them.
Our results provide a small sampling of the expression profiles for proteins associated with the plasma membrane but demonstrate that there are differences in expression levels for these proteins during the cell cycle. Proteomics comparisons among various cell lines (e.g. normal versus cancer lines) have often reported modest quantitative differences (2-fold) that may be related not to differences between the cell types but rather to protein expression variations within each cell that occur during cell proliferation. More analyses of cell cycle-dependent protein expression are needed to establish better benchmarks for understanding the degree to which protein expression varies. These studies will provide basic information that is needed to allow for better interpretation of results from quantitative proteomics studies.

* This work was supported, in whole or in part, by National Institutes of Health Grant P20 MD000262. This work was also supported by National Science Foundation Grant CHEM-0619163. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.