Exploring the Sialiome Using Titanium Dioxide Chromatography and Mass Spectrometry

Strategies for biomarker discovery increasingly focus on biofluid protein and peptide expression patterns. Post-translational modifications contribute significantly to the pattern complexity and thereby increase the likelihood of obtaining specific biomarkers for diagnostics and disease monitoring. Glycosylation is a common post-translational modification that plays a role e.g. in cell adhesion and in cell-cell and receptor-ligand interactions. Abnormal protein glycosylation has important disease associations, and the glycoproteome is therefore a target for biomarker discovery. Here we present a simple and highly selective strategy for purification of sialic acid-containing glycopeptides (the sialiome) from complex peptide mixtures. The approach utilizes a high and selective affinity of sialic acids for titanium dioxide under specific buffer conditions. In combination with mass spectrometry we used this strategy to characterize the human plasma and saliva sialiomes where 192 and 97 glycosylation sites, respectively, were identified. Furthermore we illustrate the potential of this method in biomarker discovery.

Protein glycosylation is among the most common post-translational modifications known in nature. Glycosylation is difficult to analyze by biochemical methods due to chemically very similar monosaccharide building blocks and pronounced heterogeneity and microheterogeneity of the carbohydrate chains with respect to branching patterns and monosaccharide composition. Glycoproteins containing N- and O-linked glycans are in general either secreted in soluble form from cells or tissues into body fluids or localized to the cell surface. O-Linked glycans are also observed intracellularly where they appear to participate in regulation of signaling pathways (1)(2)(3).
The glycan moieties of extracellular glycoproteins may stabilize the conformation of proteins, confer proteolytic resistance, and influence protein turnover, receptor-ligand interactions, cell-cell signaling, and adhesion, but their functions are in many instances unknown (4). However, it is well known that glycosylation patterns in chronic disease can be highly aberrant as a consequence of changes in the expression or activity of glycosyltransferases or other factors affecting glycan biosynthesis (5)(6)(7).
Many extracellular glycoproteins contain sialic acid (SA) as the monosaccharide located on the non-reducing end of the glycans. It has been demonstrated previously that cancers and cancer staging may be associated with a significant overrepresentation of SA on the surface glycoproteins of cancer cells compared with normal cells (e.g. Refs. 8-18). It is also well known that the amount of free SA and lipid- and protein-bound SA is elevated in plasma from cancer patients compared with healthy individuals (9, 19-22). In addition, glycosylation microheterogeneity in the form of different branching patterns (where the number of sialic acid moieties reflects the glycan branching structure) is linked with acute phase condition and chronic disease (e.g. Refs. 23 and 24), possibly indicating that SA or SA-containing glycoproteins/peptides will be good biomarker candidates for cancer.
A generic method for the isolation of SA-containing glycopeptides from relatively complex mixtures is therefore of interest. A few methods for glycopeptide characterization have been published previously, including chemical derivatization using hydrazine chemistry (25,26), which captures both SA and neutral glycosylated peptides/proteins, and the use of glycan-binding proteins, lectins (27)(28)(29). There are examples of lectins that are specific for SA including Sambucus nigra agglutinin, Maackia amurensis leukoagglutinin I/II, and the Siglecs family, but most lectin chromatography suffers from rather broad specificities and co-purification of non-glycosylated proteins/peptides. Hydrophilic interaction chromatography (30-32) may be used to enrich for hydrophilic peptides, e.g. glycopeptides. However, presently there is no efficient and simple method for the selective purification of SA-containing glycopeptides from complex biofluids such as plasma or serum.
Here we report a simple, robust, and very selective method for the quantitative and qualitative assessment of SA-containing peptides from complex peptide mixtures. The method takes advantage of the extremely high affinity of titanium dioxide toward SA residues positioned in the non-reducing ends of glycans under specific buffer conditions. We characterize the method and show in model experiments with bovine fetuin that the method is specific and complete for SA-containing glycopeptides. In addition, we show for the first time a map of the sialiome, defined as all the SA-containing glycosylated proteins, from plasma and saliva. Finally, to illustrate the potential of the new approach in biomarker discovery, it is used to compare the plasma sialiome of a control individual with that of a patient with advanced bladder cancer.

Test Proteins
A peptide mixture originating from tryptic digestions of 10 standard proteins was generated. The 10 standard proteins were: serum albumin (bovine), β-lactoglobulin (bovine), carbonic anhydrase (bovine), α-casein (bovine), ribonuclease B (bovine pancreas), alcohol dehydrogenase (bakers' yeast), lysozyme (chicken), α-amylase (Bacillus species), and fetuin (fetal calf serum). All were from Sigma. Each protein was dissolved in 50 mM ammonium bicarbonate, pH 7.8, including 10 mM DTT and incubated at 56°C for 30 min. The proteins were subsequently alkylated using 40 mM iodoacetamide for 1 h at room temperature. The reaction was quenched by addition of 10 mM DTT, and the solution was digested using trypsin (2%, w/w) at 37°C for 12 h. The peptide mixture was generated by mixing the peptides originating from the proteins in equal amounts (0.5 pmol/µl) and stored at −20°C until further use.

Alkaline Phosphatase Treatment
To avoid co-purification of phosphorylated peptides, the tryptic peptides originating from fetuin and serum/plasma/saliva proteins were treated with 0.5 and 2 units of alkaline phosphatase, respectively, in 50 mM NH4HCO3, pH 7.8, at 37°C for 2 h.

Plasma
Plasma samples from control individuals and bladder cancer (stage IV) patients were depleted of the six highest abundance proteins, albumin, IgG, IgA, haptoglobin, transferrin, and α1-antitrypsin, using immobilized affinity-purified antibodies according to the manufacturer's protocol (Agilent). The depleted plasma was concentrated using Microcon 10-kDa spin columns to 50 µl. The concentrated proteins were diluted with 50 µl of 50 mM NH4HCO3, reduced and alkylated as described above, and digested overnight with 20 µg of trypsin/200 µg of protein at 37°C. Prior to purification of SA-containing glycopeptides with TiO2 the samples were treated with alkaline phosphatase as described above.
The serum from the bladder cancer patient was obtained in connection with ongoing microarray studies of gene expression in bladder cancer patients (T. Ørntoft). Patients were staged according to international criteria, and plasma was secured and frozen at −20°C until used.

Saliva
Saliva was collected more than 2 h after food intake in a test tube containing Complete™ protease inhibitors (Roche Diagnostics GmbH). The proteins were precipitated by adding 5 volumes of acetone followed by incubation for 1 h at −80°C. The precipitated proteins were redissolved in 8 M urea and diluted to 1 mM urea using 100 mM NH4HCO3 followed by reduction and alkylation as described above. The proteins were digested overnight with 20 µg of trypsin/200 µg of protein at 37°C. Prior to purification of SA-containing glycopeptides with TiO2 the sample was treated with alkaline phosphatase as described above.

18O Labeling of Tryptic Peptides
The tryptic peptides were labeled as described previously (33). The incorporation efficiency and the 1:1 ratio were tested using MALDI MS.

Purification of SA-containing Glycopeptides Using TiO2 Microcolumns
TiO2 microcolumns were packed in GELoader tips essentially as described previously (34) or in p10 micropipette tips (depending on the amount of material to be purified). Briefly, a small plug of C8 material was stamped out of a 3M Empore C8 extraction disk (e.g. using an HPLC syringe needle or similar device) and placed at the end of the tip. TiO2 beads were suspended in acetonitrile and packed on top of the C8 disk using a 1-ml disposable syringe. Peptide mixtures were diluted five times in loading buffer (1 M glycolic acid in 80% acetonitrile, 2-5% TFA) and loaded onto the TiO2 microcolumn (2% TFA was used for the fetuin sample, whereas 5% TFA was applied for the standard peptide mixture and plasma and saliva peptide mixtures). The column was washed with 10 µl of loading buffer followed by 40 µl of washing buffer (80% acetonitrile, 2% TFA). The SA-containing glycopeptides were eluted using 20-40 µl of ammonia water (10 µl of 25% ammonia solution in 490 µl of water), pH 11. A small aliquot of each of the eluates was acidified with 100% formic acid, purified using a Poros R3 reversed phase microcolumn, and analyzed by MALDI MS. The remaining eluate was lyophilized for N-glycosidase F digestion.
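The composition of the elution solution follows from a simple dilution. The sketch below is only an illustration: the helper name `diluted_percent` is hypothetical, and it assumes that volumes are additive and that the 25% figure is used as a nominal stock concentration.

```python
def diluted_percent(stock_percent: float, stock_volume_ul: float,
                    diluent_volume_ul: float) -> float:
    """Final concentration (%) after mixing a stock with a diluent.

    Assumes additive volumes (an approximation for real solutions).
    """
    total_volume_ul = stock_volume_ul + diluent_volume_ul
    return stock_percent * stock_volume_ul / total_volume_ul


# Elution solution from the text: 10 µl of 25% ammonia solution in 490 µl of water.
ammonia_percent = diluted_percent(25.0, 10.0, 490.0)
print(f"{ammonia_percent:.2f}% ammonia")  # prints "0.50% ammonia"
```

Under these assumptions the eluent is a 50-fold dilution of the stock, i.e. roughly 0.5% ammonia.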

Glycosidase Treatments
Neuraminidase. Dephosphorylated tryptic peptides originating from fetuin were lyophilized and redissolved in 50 mM NH4 acetate, pH 6. Neuraminidase (Roche Diagnostics GmbH) was added to the peptide mixture (0.005 unit), and the sample was incubated overnight at 37°C.

Reversed Phase Microcolumn Purification
Reversed phase microcolumn purification was performed as described previously (34).

MALDI MS
MALDI MS was performed using a Voyager STR (PerSeptive Biosystems, Framingham, MA) equipped with delayed extraction. Spectra were obtained in positive reflector or linear ion mode using an accelerating voltage of 20 kV. MALDI MS data analysis was performed using the MoverZ software (Genomic Solutions).

Nanoscale LC-MSMS
Nanoscale liquid chromatography-tandem mass spectrometry was performed using a Q-TOF Ultima mass spectrometer (Waters/Micromass UK Ltd., Manchester, UK) or an LTQ-FT instrument (hybrid two-dimensional linear quadrupole ion trap-FTICR mass spectrometer (Thermo Electron)). A nanoflow HPLC system (Ultimate, Switchos2, Famos, Dionex/LC Packings, Amsterdam, The Netherlands) was used for chromatographic separation of the peptide mixture prior to MS detection on the Ultima Q-TOF system, whereas an Agilent 1100 nanoflow LC system was used for peptide separation prior to detection on the LTQ-FT instrument. The peptides were concentrated and desalted on a 1.5-cm precolumn (75-µm inner diameter, 360-µm outer diameter, ReproSil-Pur C18 AQ 3 µm (Dr. Maisch)) and eluted at 200 nl/min by an increasing concentration of acetonitrile (1%/min gradient) onto an 8-cm analytical column (50-µm inner diameter, 360-µm outer diameter, ReproSil-Pur C18 AQ 3 µm). The mass spectrometers were operated in the data-dependent mode selecting three parent ions for collision-induced dissociation per full scan.
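The data-dependent selection of three parent ions per full scan can be pictured with the minimal sketch below. It assumes that selection is purely by intensity, which the text does not state; real instrument control software also applies charge-state filters, dynamic exclusion, and intensity thresholds. The function name and the peak list are hypothetical.

```python
from typing import List, Tuple


def select_precursors(full_scan: List[Tuple[float, float]],
                      top_n: int = 3) -> List[float]:
    """Return the m/z values of the top_n most intense peaks in a full scan.

    full_scan is a list of (m/z, intensity) pairs.
    """
    ranked = sorted(full_scan, key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _intensity in ranked[:top_n]]


# Hypothetical (m/z, intensity) pairs, for illustration only.
scan = [(707.3, 1200.0), (1173.0, 5400.0), (1425.3, 8100.0), (650.8, 300.0)]
print(select_precursors(scan))  # [1425.3, 1173.0, 707.3]
```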

Database Searching
Raw data files from the Q-TOF Ultima instrument were processed into pkl files using the ProteinLynx program. On each MSMS spectrum the background was subtracted (40%), and smoothing was performed (Savitzky-Golay; iteration, 2; window, 3 channels). In addition deisotoping was performed using the following parameters: minimum peak width, four channels; centroid top, 80%; TOF resolution, 10,000; NP multiplier (number of pushes correction factor), 0.7. Raw data from the LTQ-FT instrument were processed using the DTASuperCharge (version 1.07) program (SourceForge) and converted into Mascot generic format (mgf) files according to the protocol. Deisotoping was performed using the software default settings. Database searching was performed with the pkl and mgf files using an in-house Mascot database search program (version 2.1). The searched databases were National Center for Biotechnology Information non-redundant (NCBInr, March 26, 2007; 4,761,919 sequences; 1,643,098,755 residues; human) and International Protein Index (IPI_human_20060517; 64,191 sequences; 27,188,565 residues). The database searches were performed with carbamidomethyl (Cys) as a fixed modification and oxidation (Met) and deamidation (Asn) or deamidation in [18O]water as variable modifications. Enzyme specificity was set to semitrypsin. For the data obtained using the FT-MS instrument a mass accuracy of 10 ppm was used on the parent ion, and 0.6 Da was used on the fragment ions. For data obtained on the Q-TOF Ultima instrument a mass accuracy of 70 ppm was used on the parent ion, and 0.2 Da was used on the fragment ions.
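To make the parent ion tolerances concrete, the sketch below converts a ppm tolerance into an absolute window at a given m/z. The function names are hypothetical, and the m/z value of 1000 is an arbitrary illustration, not a value from this study.

```python
def ppm_to_da(mz: float, ppm: float) -> float:
    """Absolute tolerance (in m/z units) corresponding to a ppm tolerance."""
    return mz * ppm / 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float, ppm: float) -> bool:
    """True if the observed m/z lies within +/- ppm of the theoretical m/z."""
    return abs(observed_mz - theoretical_mz) <= ppm_to_da(theoretical_mz, ppm)


# At m/z 1000, 10 ppm (LTQ-FT) and 70 ppm (Q-TOF Ultima) correspond to
# windows of about 0.01 and 0.07, respectively.
for ppm in (10.0, 70.0):
    print(f"{ppm:.0f} ppm at m/z 1000 -> +/- {ppm_to_da(1000.0, ppm):.3f}")
```

A parent ion observed 0.05 off its theoretical m/z of 1000 would thus be accepted with the Q-TOF setting but rejected with the FT-MS setting.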
All identified glycosylated peptides were manually evaluated according to the following criteria: peptides had to include peptide sequence tags (4-5 amino acids) assigned by abundant Y- or B-ions from the higher mass area, and if prolines were present in the sequence an abundant fragment ion had to be assigned to the predominant fragmentation N-terminal to the proline residue. In addition, the charge states were evaluated according to the number of basic charges in the sequence.
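The sequence tag criterion was applied manually, but its logic can be sketched in code. The example below is a simplification under stated assumptions: it considers a single ion series, ignores ion abundance and mass range, and uses the fact that n + 1 consecutive fragment ions are needed to read an n-residue tag. The function names and the list of matched ions are hypothetical.

```python
from typing import Iterable


def longest_tag_length(matched_ion_numbers: Iterable[int]) -> int:
    """Longest sequence tag (in residues) readable from consecutive fragment ions.

    n + 1 consecutive ions (e.g. y5..y9) give n mass differences,
    i.e. an n-residue tag.
    """
    numbers = sorted(set(matched_ion_numbers))
    if not numbers:
        return 0
    best = run = 1
    for previous, current in zip(numbers, numbers[1:]):
        run = run + 1 if current == previous + 1 else 1
        best = max(best, run)
    return best - 1


def passes_tag_criterion(matched_ion_numbers: Iterable[int], min_tag: int = 4) -> bool:
    """True if a tag of at least min_tag residues can be read."""
    return longest_tag_length(matched_ion_numbers) >= min_tag


# Hypothetical matched y-ion numbers for one spectrum.
y_ions = [3, 5, 6, 7, 8, 9, 11]
print(longest_tag_length(y_ions))    # 4
print(passes_tag_criterion(y_ions))  # True
```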
Annotated spectra from each identified peptide have been included in the supplemental data. Annotated spectra were generated either in the MSQuant program (SourceForge) or from Mascot data including Y- and B-ion masses that have been assigned to the peptide.

RESULTS AND DISCUSSION
Principle of the Method. Previously we published a highly selective strategy for purification of phosphorylated peptides using TiO2 microcolumns (34). We found that peptide loading onto a TiO2 matrix under highly acidic conditions in a solution containing DHB or phthalic acid resulted in a selective purification of phosphorylated peptides with little co-purification of non-phosphorylated peptides. We proposed that the adsorption of phosphate anions to the surface of TiO2 is mediated by a bridging bidentate binding (Fig. 1A). This type of binding is much stronger than the chelating bidentate binding that DHB can make with TiO2 (Fig. 1B), which would mimic the binding that acidic amino acids would be able to make to TiO2.
The highly negatively charged SA, which contains both carboxylic acid and hydroxyl functionalities (Fig. 1C), presumably interacts with TiO2 molecules via a multipoint binding, similar to a multidentate binding (35) (Fig. 1D), in which the carboxylic acid and the hydroxyl groups contribute to the binding. Therefore, SA either free or present at the non-reducing end of glycans on glycopeptides should very efficiently be retained on TiO2 (or ZrO2) under conditions where non-modified peptides or neutral glycopeptides will not bind, similar to the interaction of phosphorylated peptides with TiO2 (or ZrO2). In combination with phosphatase treatment to remove the phosphorylated peptides that otherwise would saturate the TiO2 column, this should provide a purified fraction of sialylated peptides.
FIG. 1. Binding modes to titanium dioxide. The complex binding of phosphoric acid and 2,5-dihydroxybenzoic acid to TiO2 is illustrated in A and B, respectively. C, structure of the most common sialic acid, N-acetylneuraminic acid. D illustrates a multidentate binding that can take place between TiO2 and compounds that contain carboxylic acids and hydroxyl functionalities.
To examine this scheme, we tested the purification of SA-containing N-linked glycopeptides from bovine fetuin. Fetuin contains 359 amino acid residues and three N-linked and four O-linked glycans. All the N-linked sites contain SA. Here we only show results from the analysis of the N-linked sites, for which the most efficient deglycosylation enzymes are available. Fetuin is also phosphorylated at several serine residues, and these groups were removed prior to the TiO2 chromatography to ensure optimal binding of the SA-containing glycopeptides. A MALDI MS peptide mass map of the raw mixture of tryptic peptides from fetuin is shown in Fig. 2A. Several peptide signals can be detected including some in the higher mass range (>3000 Da), which are known to belong to SA-containing glycopeptides. An aliquot (5 pmol) of the dephosphorylated tryptic peptide mixture from fetuin was diluted in 5% TFA, 80% ACN and applied onto a TiO2 microcolumn. After washing and elution the purified peptides were analyzed by MALDI MS (Fig. 2B). Here an enrichment of SA-containing glycopeptides was observed, as shown by the signals in the higher mass range; however, a significant number of signals originating from nonspecific binding was also observed. An aliquot (10 pmol) of the dephosphorylated tryptic peptide mixture from fetuin was diluted in the loading buffer containing 1 M glycolic acid and applied onto the TiO2 microcolumn. After washing and elution, half of the eluted peptides were analyzed by MALDI MS in linear positive ion mode (Fig. 2C).
This peptide mass map is dominated by peptide signals in the high mass range (>3000 Da), indicating efficient purification of the glycosylated peptides. The remaining eluate was lyophilized and subsequently treated with N-glycosidase F for deglycosylation. The deglycosylated peptides were analyzed by MALDI MS in linear positive ion mode (Fig. 2D). In this peptide mass map only four signals were detected, corresponding to peptides expected to carry N-linked glycans (Table I). No non-glycosylated peptides were detected, indicating a highly efficient and selective purification of the glycopeptides. The selectivity can also be achieved using other acids with functionalities similar to glycolic acid (e.g. DHB, phthalic acid, gallic acid, etc.).
FIG. 2. Fetuin was treated with trypsin and subsequently with alkaline phosphatase to remove the phosphate groups. A, MALDI MS peptide mass map obtained from a small aliquot of a peptide mixture originating from tryptic digestion of 10 different proteins including fetuin. B, MALDI MS peptide mass map obtained from the eluate from a TiO2 purification in 5% TFA of SA-containing glycopeptides from 5 pmol of peptide mixture after reversed phase microcolumn desalting and concentration. C, MALDI MS peptide mass map obtained from half of the eluate from a TiO2 purification in 5% TFA, 1 M glycolic acid of SA-containing glycopeptides from 10 pmol of peptide mixture after reversed phase microcolumn desalting and concentration. D, MALDI MS peptide mass map obtained after N-glycosidase F treatment of the remaining SA-containing glycopeptides from C. Tryptic peptides from fetuin were treated with alkaline phosphatase, and the sample was subsequently split in two. Both aliquots were lyophilized, one aliquot was redigested with trypsin in 16O-buffer, and the other aliquot was redigested in 18O-buffer (95% pure). The 16O-labeled peptide solution was subsequently treated with neuraminidase. The two peptide solutions were mixed 1:1, and SA-containing glycopeptides were purified by TiO2 chromatography and subsequently deglycosylated by N-glycosidase F. The MALDI MS peptide mass map shows the resulting deglycosylated peptides (E), and the insets show the isotopic distribution of the four abundant signals. Asterisks illustrate peptides derived by alternative digestion (chymotrypsin) of the purified glycosylated peptides. The sequences of the deglycosylated peptides and their 16O:18O ratios are shown in Table I. The spectra shown in A, B, C, and D were all obtained in linear positive ion mode, whereas the spectrum in E was obtained in reflector positive mode.
Specificity toward Sialic Acid-containing Glycopeptides. To assess the specificity of TiO2 for SA-containing glycopeptides versus neutral glycopeptides under the specified loading conditions, dephosphorylated tryptic peptides from fetuin were labeled with 18O by using 18O-buffer during trypsinization. An aliquot of the normal 16O-containing peptides was treated with neuraminidase to cleave off SA and mixed with the 18O-labeled peptide mixture containing intact SA-carrying glycopeptides in a 1:1 ratio. This mixture was subjected to TiO2 chromatography. The eluted glycopeptides were deglycosylated using N-glycosidase F and subsequently analyzed by MALDI MS in reflector positive ion mode (Fig. 2E). Again signals from all four N-linked glycosylated peptides were detected (Table I). More than 90% of the signals detected in this MALDI mass spectrum originated from the 18O-labeled peptides (Table I). Because a 50:50% distribution of the heavy and light peptides would be expected with no SA selectivity of TiO2, this result indicates a highly selective purification of the SA-containing glycopeptides over neutral glycopeptides. The 16O-glycopeptides appearing in this experiment could be caused by co-purified neutral glycopeptides, incomplete labeling with 18O (the [18O]water used in this experiment was only 95% pure), or incomplete removal of the SA with neuraminidase. A combination of the latter two reasons is probably the most likely explanation as neuraminidase has been observed to give incomplete cleavage of the SA from glycoproteins/peptides (data not shown). Furthermore tryptic N-linked glycopeptides from RNase B, which only contain high mannose structures that do not carry sialic acid residues, did not bind under the specific loading conditions used in this study (data not shown). This indicates that TiO2 is specific for SA-containing compounds provided that suitable buffer conditions are applied.
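The selectivity argument rests on comparing the observed heavy (18O) fraction of each peptide pair with the 50% expected in the absence of SA selectivity. A minimal sketch of that comparison is given below; the function name and the intensity values are hypothetical and are not taken from Table I.

```python
def heavy_fraction(light_intensity: float, heavy_intensity: float) -> float:
    """Fraction of the summed signal carried by the heavy (18O-labeled) form."""
    total = light_intensity + heavy_intensity
    if total <= 0:
        raise ValueError("summed intensity must be positive")
    return heavy_intensity / total


# Hypothetical (light, heavy) peak intensities for deglycosylated peptides.
pairs = {
    "peptide_1": (80.0, 920.0),
    "peptide_2": (50.0, 950.0),
}

EXPECTED_WITHOUT_SELECTIVITY = 0.5  # 1:1 mix, no preference for SA

for name, (light, heavy) in pairs.items():
    fraction = heavy_fraction(light, heavy)
    print(f"{name}: heavy fraction {fraction:.2f} "
          f"(expected {EXPECTED_WITHOUT_SELECTIVITY:.2f} without selectivity)")
```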
Identification of SA-containing Glycopeptides from Proteins in Biofluids. After establishing the method using a single model glycoprotein, the method was tested on one of the most complex biofluids used in proteomics, i.e. plasma. Our ultimate aim is to completely characterize the sialiome in plasma, i.e. all the SA-containing glycoproteins in plasma.
Plasma was depleted for the six most abundant proteins. An aliquot (1%) of the dephosphorylated tryptic peptides originating from 60 µl of depleted plasma was analyzed by MALDI MS (Fig. 3A). The remaining peptides were submitted to TiO2 chromatography. The SA-containing glycopeptides were profiled by MALDI MS as shown (Fig. 3B) using 5% of the eluate from the TiO2 chromatography for the analysis. In the higher mass range (>3000 Da) broad signals are observed indicating the heterogeneous SA-containing glycopeptides, whereas only a very few signals are observed in the lower mass area mainly originating from phosphopeptides that have escaped the phosphatase treatment. The large amount of glycopeptides present together with the often pronounced microheterogeneity observed on each glycosylation site will result in a multitude of signals from SA-containing glycopeptides explaining the badly resolved material in the MALDI mass spectrum of Fig. 3B. After treating the remaining eluate from the TiO2 chromatography with N-glycosidase F, an aliquot (5%) was analyzed by MALDI MS (Fig. 3C). All the ion signals in the high mass area disappeared after this treatment, and new peaks were confined to the lower mass range (<4000 Da) indicating an efficient purification of SA-containing glycopeptides from depleted human plasma. The remaining deglycosylated peptide eluate was analyzed by LC-MSMS.
FIG. 3. Characterization of the sialiome from depleted human plasma. Human plasma was depleted for the six most abundant proteins (albumin, IgG, IgA, haptoglobin, transferrin, and α1-antitrypsin) using an antibody affinity depletion column. An aliquot (1%) of the dephosphorylated tryptic peptides originating from 60 µl of depleted plasma was analyzed by MALDI MS (A). B illustrates the MALDI MS peptide mass map of 5% of the eluate obtained from the TiO2 purification of the remaining peptide mixture. SA-containing glycopeptides are shown in the high mass region of the spectrum. The eluate from the TiO2 purification was treated with N-glycosidase F (PNGase F), and 5% was analyzed by MALDI MS (C). Signals in the low mass range of the spectrum are shown here. All spectra were obtained in linear positive ion mode.
Supplemental Table S1 shows a list of the SA-containing glycopeptides and their corresponding glycosylation sites identified using 60 µl of depleted plasma as starting material. The list includes only non-redundant peptides. If peptide sequences are shared by more than one protein entry only one of the protein entries is included in the list. A total of 192 SA-containing glycosylation sites in 100 proteins were identified in this way. Most of the proteins are secreted proteins. However, some are membrane proteins presumably shed from plasma membranes from cells or tissues. A total of 127 non-glycosylated peptides (that did not have an NX(S/T/C) consensus site) were identified due to co-purification with the SA-containing peptides, indicating a purification efficiency of 60%. Of the 192 glycosylation sites, 28 were listed in the Swiss-Prot and TrEMBL databases as potential glycosylation sites. These sites are now verified in human plasma. Fig. 4A shows a pie chart describing the location of the sialylated proteins identified in plasma according to the Swiss-Prot database.
As expected, the majority of the identified proteins are secreted proteins; however, a significant number of the proteins localize to membranes and could originate from tissue leakage. In a recent much more time-consuming and complex approach using immunoaffinity subtraction, hydrazide chemistry, strong cation exchange fractionation, reversed phase chromatography, and mass spectrometry a total of 639 glycosylation sites were identified using 16 times more starting material than used in this study (36). Hydrazine chemistry targets all glycosylated peptides, and we therefore expect more sites using this method than the present SA-specific method.
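The NX(S/T/C) consensus site used to separate glycosylated from co-purified non-glycosylated peptides can be checked programmatically. The sketch below is an illustration rather than the procedure used in this study: it scans the unmodified sequence, and it excludes proline at the X position, which is the usual convention but is an assumption here because the text only states NX(S/T/C). The function name is hypothetical.

```python
import re
from typing import List

# Lookahead so that overlapping sequons are all reported.
# Assumption: X may be any residue except proline.
SEQUON = re.compile(r"(?=N[^P][STC])")


def find_sequons(sequence: str) -> List[int]:
    """Return 1-based positions of Asn residues within an N-X-(S/T/C) motif."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


print(find_sequons("VPVPITNATLDR"))  # [7]
print(find_sequons("NPSK"))          # []
```

After N-glycosidase F treatment the glycosylated asparagine is detected as aspartic acid, so a check on search results would need to map the deamidated residue back onto the original sequence before applying the motif.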
Identification of SA-containing Glycopeptides in Proteins in Human Saliva. Human saliva is secreted from multiple salivary glands including parotid, submandibular, sublingual, and other minor glands lying beneath the oral mucosa. Saliva is a potentially rewarding source of disease biomarkers. Whereas many proteins have been identified in saliva, very little is known about the modification status of those proteins. Recently 128 N-glycosylated peptides were identified from 45 unique N-glycoproteins in saliva using hydrazine chemistry in combination with in-solution isoelectric focusing and mass spectrometry (37).
Here we collected saliva from a healthy individual directly into a protease inhibitor solution. The proteins were precipitated using ice-cold acetone, and the redissolved proteins were digested using trypsin and dephosphorylated using alkaline phosphatase. The SA-containing glycopeptides were isolated using the described protocol, and the deglycosylated peptides were identified by LC-MSMS. Supplemental Table S2 shows the identified SA-containing glycopeptides and their glycosylation sites. A total of 97 glycosylation sites were found in 45 proteins. The list includes only non-redundant peptides. If peptide sequences are shared by more than one protein entry only one of the protein entries is included in the list. In addition to the glycosylated peptides, we identified 17 non-glycosylated peptides that did not contain an N-linked glycosylation consensus sequence, indicating a purification efficiency of 85% for saliva. Of the 97 sites, 29 sites were new sites according to the Swiss-Prot and TrEMBL databases. In addition, a number of hypothetical proteins with no known function were found to be glycosylated. The pie chart showing the location distribution of the sialylated proteins identified in saliva is shown in Fig. 4B. As expected the majority of the proteins are also secreted; however, almost 20% originate from cellular surfaces, i.e. transmembrane proteins and glycosylphosphatidylinositol-anchored proteins.
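The reported purification efficiencies are consistent with a simple ratio of glycosylation sites to the sum of glycosylation sites and co-purified non-glycosylated peptides. The text does not spell out the formula, so the definition in the sketch below is an assumption that happens to reproduce the stated 60% (plasma) and 85% (saliva); the function name is hypothetical.

```python
def purification_efficiency(glycosylation_sites: int,
                            non_glycosylated_peptides: int) -> float:
    """Assumed definition: sites / (sites + co-purified non-glycosylated peptides)."""
    total = glycosylation_sites + non_glycosylated_peptides
    return glycosylation_sites / total


plasma = purification_efficiency(192, 127)  # counts reported for depleted plasma
saliva = purification_efficiency(97, 17)    # counts reported for saliva

print(f"plasma: {plasma:.0%}")  # plasma: 60%
print(f"saliva: {saliva:.0%}")  # saliva: 85%
```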
Application of SA-specific TiO2 Chromatography to the Discovery of SA-containing Glycopeptide Profiles in Plasma. The sialiome of plasma from a healthy individual was compared with plasma from a patient with advanced stage bladder cancer to validate the potential for applying this method in biomarker discovery. The two samples were immunoaffinity-depleted for the six most abundant plasma proteins and then reduced, S-alkylated, trypsinized, and dephosphorylated exactly as described above for normal plasma. The tryptic peptides were lyophilized and redigested using trypsin in normal 16O-buffer (control) and 18O-buffer (cancer) (Fig. 5A). Using this strategy most of the internal tryptic peptides originating from the cancer sample were labeled with two heavy oxygen atoms, resulting in a mass increase of 4 Da for those peptides. Inadequate labeling can be observed if an acidic amino acid is located close to the tryptic cleavage site. The control and cancer peptides were mixed in a 1:1 ratio, and the SA-containing glycopeptides were purified by TiO2 chromatography. Half of the eluate from the TiO2 purification was analyzed directly by LC-MSMS to obtain mass and fragmentation information on the intact glycosylated peptides. The LC-MSMS ion chromatogram is shown in Fig. 5B. Material eluting at two different times (marked 1 and 2 in the figure) was chosen to illustrate the complexity of the sample and to directly compare some SA-containing glycopeptides from the healthy sample and the cancer sample.
The mass spectra of the SA-containing glycopeptides eluting after 40 min (time point 1) and 60 min (time point 2) are shown in Fig. 5, C and E, respectively. Here all the mass-labeled peaks represent doubly charged species and illustrate a unique glycopeptide with various glycan structures attached at each time point resulting in a significant heterogeneity. An enlargement of the mass area around the signal at 1173 Da at time point 1 (Fig. 5D) shows a doubly charged peak with 2-Da spacing where the light peak originates from the healthy sample and the heavy peak originates from the cancer sample. The intensity of the two peaks is very similar, indicating that no quantitative change can be observed between the normal sample and the cancer sample for this SA-containing glycopeptide. In contrast, the SA-containing glycopeptide eluting at time point 2 was found to be more abundant in the glycopeptide fraction from cancer plasma as illustrated in Fig. 5F by the intensity of the triply charged heavy signal compared with the light fraction signal in the enlargement of the material at m/z 1424. The increase in the signal could be due to more sialylation of the peptides as well as to more of a given sialylated glycoprotein in the cancer sample.
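The spacing between the light and heavy members of an 16O/18O pair depends on the charge state: the 4-Da label difference is divided by the number of charges. The short sketch below illustrates this using the nominal 4-Da shift stated above; the function name is hypothetical.

```python
def mz_spacing(mass_shift_da: float, charge: int) -> float:
    """Spacing on the m/z axis between light and heavy forms of one peptide."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return mass_shift_da / charge


LABEL_SHIFT_DA = 4.0  # nominal shift for two 18O atoms, as stated in the text

print(mz_spacing(LABEL_SHIFT_DA, 2))            # 2.0 for a doubly charged pair
print(round(mz_spacing(LABEL_SHIFT_DA, 3), 2))  # 1.33 for a triply charged pair
```

This matches the 2-unit spacing described for the doubly charged pair in Fig. 5D and the roughly 1.3-unit offset between the light signal near m/z 1424 and the heavy signal at m/z 1425.26 for the triply charged pair.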
To identify the peptide sequence of the more abundant sialylated glycopeptide, the fragment ion spectrum of the triply charged glycopeptide signal at m/z 1425.26 (Fig. 5F, heavy labeled) was interrogated. The fragment ion spectrum is shown in Fig. 6A.
The fragment ions can be assigned to the loss of single monosaccharides from the glycopeptide, indicating that the charge is predominantly retained on the peptide (Y-type fragments (23)). Three oxonium ions are present that are diagnostic for SA-containing glycans (m/z 274.07, 292.09, and 657.23 (38)). The mass of the peptide can be deduced when taking into account the common core structure of N-linked glycans, i.e. (Man)3(GlcNAc)2 (illustrated in the figure as losses of 3 × 162 and 2 × 203 Da, respectively) (for further information see Ref. 39). The mass of the naked peptide was found to be m/z 1412.84.
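The arithmetic behind this deduction can be cross-checked with a short calculation. The sketch below is not the procedure used in the study, which reads the peptide mass from the Y-ion series; it simply subtracts the glycan from the intact glycopeptide. It assumes standard monoisotopic residue masses (not given in the text), takes the glycan composition from the Fig. 6 legend (six hexoses, five N-acetylhexosamines, and three NeuAc in total), and treats m/z 1425.26 as the monoisotopic peak. All function names are hypothetical.

```python
PROTON = 1.007276  # mass of a proton, Da

# Standard monoisotopic residue masses (Da); assumed, not taken from the text.
RESIDUE_MASS = {"Hex": 162.0528, "HexNAc": 203.0794, "NeuAc": 291.0954}


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass of an ion observed at m/z with the given positive charge."""
    return charge * (mz - PROTON)


def glycan_mass(composition: dict) -> float:
    """Summed residue mass of a glycan composition such as {"Hex": 3}."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


# Common N-glycan core, (Man)3(GlcNAc)2: nominally 3 x 162 + 2 x 203 = 892 Da.
core_mass = glycan_mass({"Hex": 3, "HexNAc": 2})

# Full composition from the Fig. 6 legend.
full_glycan = {"Hex": 6, "HexNAc": 5, "NeuAc": 3}

# Singly protonated peptide after removing the whole glycan.
peptide_mh = neutral_mass(1425.26, 3) - glycan_mass(full_glycan) + PROTON

print(f"core: {core_mass:.1f} Da")
print(f"peptide [M+H]+: {peptide_mh:.1f}")
```

Under these assumptions the calculation gives a singly protonated peptide mass of roughly 1412.8, within about 0.1 of the value deduced from the fragment ion spectrum.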
The sequence of this peptide was determined after deglycosylation of the remaining SA-containing glycopeptides from the TiO2 purification and subsequent LC-MSMS analysis of the deglycosylated peptides. The fragment ion spectrum of the peptide at m/z 1413.84 is shown in Fig. 6B (note that deglycosylation of N-glycosylation with N-glycosidase F results in conversion of asparagine to aspartic acid and a mass increase of 1 Da). From these data the sequence VPVPITN*ATLDR could be read where N* indicates the glycan attachment site. This sequence belongs to α1-acid glycoprotein 2, also known as orosomucoid 2 (Swiss-Prot entry Q5T538), which is a well known acute-phase reactant with increased synthesis in response to inflammation and tissue damage. As many cancers arise from sites of infection, chronic irritation, and inflammation (e.g. Ref. 40) an increase in acute-phase proteins may correlate with the tumor progression. Because α1-acid glycoprotein is one of the most prominent acute-phase reactants and is known to exhibit pronounced microheterogeneous glycan structures shifting in the normal and the acute-phase form of the protein, it is very likely that the findings are due to an increased concentration of the protein in this sample compared with a normal control. Also altered glycan antennary structures (leading to changes in SA content) associated with different microheterogeneity profiles might very well contribute to the results even though the largest effect would be due to entirely new sites being glycosylated in diseased orosomucoid. This, however, to our knowledge has not been reported.
Conclusion-We describe a highly selective method that uses TiO2 chromatography for purification and characterization of SA-containing glycopeptides. The outcome of the present study clearly illustrates the potential of using TiO2 chromatography in combination with sensitive and high resolution mass spectrometry for detection and quantification of SA-containing glycopeptides from dephosphorylated complex mixtures. We expect that a similar enrichment could be achieved using other metal oxide surfaces such as ZrO2.
The highly acidic buffer conditions in combination with DHB, phthalic acid, or glycolic acid were used to selectively and efficiently purify SA-containing glycopeptides from both simple and highly complex samples such as human plasma depleted of the six most abundant proteins. Only very little co-purification of non-modified peptides was detected from complex mixtures. We speculate that this selectivity is a result of the capability of SA to form a multipoint binding to the TiO2 surface, whereas acidic non-glycosylated peptides and neutral glycopeptides are eliminated by the competitive binding of the acid additive (DHB, phthalic acid, or glycolic acid) to the TiO2 surface. Other acids that resemble DHB, phthalic acid, or glycolic acid with respect to the functional groups could also be included in the loading procedure to exclude non-sialylated peptides. This approach should also be applicable to other SA-containing compounds, e.g. free glycans. Contamination from sulfated glycostructures cannot be ruled out using the current method, and more work is needed to examine the binding of such glycan structures to TiO2.

FIG. 6. Fragmentation of the abundant triply charged SA-containing glycopeptide from Fig. 5F and its associated deglycosylated peptide. The triply charged glycopeptide at m/z 1425.26 that was increased in the cancer sample was selected for collision-induced dissociation using data-dependent analysis. The resulting fragment ion spectrum is shown in A. The fragment ions detected can be assigned as B-type oxonium ions (labeled with the carbohydrate composition illustrating the diagnostic ions for sialylation) or Y-type fragments with charge retention on the peptide shown accordingly (41). From the common core structure (see inset) the peptide mass was found to be 1412.82 Da (see Ref. 39 for detailed analysis). As illustrated in the inset the following glycan composition could be assigned to the glycopeptide: (GlcNAc)2Man3Hex3(HexNAc)3(NeuAc)3 where Hex is hexose and HexNAc is N-acetylhexosamine. The fragment ion spectrum of the deglycosylated peptide with m/z 1413.8 is shown in B. From the Y-ion series the sequence VPVPITN*ATLDR could be read where N* indicates the glycan attachment site. This sequence belongs to α1-acid glycoprotein 2 (orosomucoid 2).
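The diagnostic oxonium ions referred to in the legend to Fig. 6 lend themselves to a simple screen of fragment ion spectra. The following is a minimal sketch rather than the data analysis used here: the reference values are the three m/z values reported in the text (274.07, 292.09, and 657.23), whereas the mass tolerance, the minimum number of matches, and the example peak lists are assumptions chosen for illustration.

```python
# Oxonium ion m/z values reported in the text as diagnostic for sialic acid.
SA_OXONIUM_MZ = (274.07, 292.09, 657.23)


def matched_oxonium_ions(peak_mzs, tolerance=0.05):
    """Return the diagnostic ions found in `peak_mzs` within `tolerance` (Da)."""
    matched = []
    for ref in SA_OXONIUM_MZ:
        if any(abs(mz - ref) <= tolerance for mz in peak_mzs):
            matched.append(ref)
    return matched


def looks_sialylated(peak_mzs, min_matches=2, tolerance=0.05):
    """Flag a spectrum that contains at least `min_matches` diagnostic ions."""
    return len(matched_oxonium_ions(peak_mzs, tolerance)) >= min_matches


# Hypothetical peak lists, used only to exercise the functions.
sialylated_spectrum = [204.09, 274.08, 292.10, 366.14, 657.24, 1412.84]
neutral_spectrum = [204.09, 366.14, 528.19]

print(looks_sialylated(sialylated_spectrum))
print(looks_sialylated(neutral_spectrum))
```

Requiring more than one diagnostic ion reduces the chance that a single coincidental fragment flags a spectrum. The appropriate tolerance depends on the mass accuracy of the instrument in the low m/z range.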
The method described here has been applied to characterization of the sialiome of depleted human plasma and saliva. In total, we have identified 192 and 97 SA-containing glycosylation sites in proteins from depleted human plasma and saliva, respectively. Experiments with differential display of SA-containing glycopeptides from depleted plasma samples of a healthy individual and a cancer patient have shown that some SA-containing glycoproteins are differentially expressed/sialylated in cancer compared with the healthy individual. Significant changes were observed in some of the acute-phase proteins, which are expected to be changed because the plasma originated from a patient with advanced bladder cancer.
Besides being simple, fast, and efficient in enrichment of SA-containing glycopeptides from complex biofluids, the presented method is extremely tolerant toward salts and other low molecular weight contaminants such as detergents. Combining the method with prepurification methods either on the protein or peptide level, such as isoelectric focusing of proteins or peptides, would most likely increase the number of identified SA-containing glycosylation sites. The presented results also clearly prove the concept of using this method for discovery of potential SA-containing disease biomarkers. In further work the choice of appropriate controls, large sample sets, and stringent validation will need to be applied for this aspect to be developed in a useful way, but the simplicity of the procedure ensures that methodological obstacles will not be prohibitive for such development. We therefore believe that the method could find wide application in glycoproteomics research and biomarker discovery.