Enrichment and Site Mapping of O-Linked N-Acetylglucosamine by a Combination of Chemical/Enzymatic Tagging, Photochemical Cleavage, and Electron Transfer Dissociation Mass Spectrometry

Numerous cellular processes are regulated by the reversible addition of either phosphate or O-linked β-N-acetylglucosamine (O-GlcNAc) to nuclear and cytoplasmic proteins. Although sensitive methods exist for the enrichment and identification of protein phosphorylation sites, those for the enrichment of O-GlcNAc-containing peptides are lacking. Reported here is a highly efficient methodology for the enrichment and characterization of O-GlcNAc sites from complex samples. In this method, O-GlcNAc-modified peptides are tagged with a novel biotinylation reagent, enriched by affinity chromatography, released from the solid support by photochemical cleavage, and analyzed by electron transfer dissociation (ETD) mass spectrometry. Using this strategy, eight O-GlcNAc sites were mapped from a tau-enriched sample from rat brain. Sites of GlcNAcylation were characterized on important neuronal proteins such as tau, synucleins, and methyl CpG-binding protein 2.

Numerous cytoplasmic and nuclear proteins are post-translationally modified with O-linked β-N-acetylglucosamine (O-GlcNAc). GlcNAcylation is involved in almost all aspects of cellular metabolism (1) and is highly dependent on the nutrient status of the cell (2). The O-GlcNAc modification rivals phosphorylation in both abundance and protein distribution. Recent studies indicate that signaling pathways can be regulated by the interplay of these two modifications at the same or proximal sites on numerous protein substrates (3).
Current understanding of the function of O-GlcNAcylation and its relationship to phosphorylation is severely hampered by the difficulties in detecting this labile monosaccharide modification. Problems associated with the identification of O-GlcNAc sites include the following. (a) O-GlcNAc is quickly removed by hydrolases during cell lysis. (b) Like phosphorylation, O-GlcNAc is usually present in less than stoichiometric amounts at given sites on protein substrates. (c) O-GlcNAc is readily lost as an oxonium ion during conventional peptide sequence analysis by collision-activated dissociation (CAD) (supplemental Fig. 1). (d) Modified and unmodified forms of the peptide often co-elute during reverse phase HPLC (supplemental Fig. 2a), and the preferential ionization of the unmodified peptide suppresses the signal observed for the corresponding O-GlcNAc-modified peptide (supplemental Fig. 2, b and c).
Several attempts have been made to enrich samples for O-GlcNAc-modified proteins and peptides. Immunoaffinity purification of O-GlcNAc-modified peptides with an antibody (CTD 110.6) has been largely unsuccessful because of low binding avidity (4). Long wheat germ agglutinin lectin columns (~39 ft) provide some enrichment but also bind strongly to complex glycans (5). A mutant galactosyltransferase (GalT1) has been used to label GlcNAcylated proteins with a ketone-containing galactose analog (6). Following proteolytic digestion, O-GlcNAc-modified peptides were biotinylated with hydrazine chemistry, isolated on a column packed with avidin beads, eluted with free biotin, and sequenced by ETD mass spectrometry. Failure to elute peptides with high efficiency from the avidin column and an inability to direct the fragmentation to the peptide backbone limit the usefulness of this approach. Reported here is an enrichment methodology that (a) is highly specific for O-GlcNAc-modified peptides, (b) provides for efficient release of the captured peptides from an affinity support, and (c) facilitates complete characterization of the released peptides by ETD mass spectrometry.

EXPERIMENTAL PROCEDURES
Reagents and Chemical Synthesis-All chemicals were purchased from Sigma-Aldrich unless otherwise noted. PC-PEG-biotin-alkyne reagent (Fig. 1a, Reagent 1) containing a photocleavable 1-(2-nitrophenyl)ethyl moiety was prepared by treating the corresponding N-hydroxysuccinimidyl ester (PC-PEG-biotin; synthesized according to Olejnik et al. (9) or obtained from Ambergen, Watertown, MA) with a 10-fold excess of propargylamine (Sigma) in dry methanol at room temperature for 4 h in the dark. The total reaction volume was 20 μl containing 5 mg of PC-PEG-biotin (6 μmol) and 4 μl of propargylamine (60 μmol). Product (PC-PEG-biotin-alkyne) was purified by thin layer chromatography (250-μm silica gel plate, Analtech, Newark, DE) with methanol/chloroform (1:9, v/v) as the mobile phase. The product was located on the thin layer plate by brief exposure (<1 s) to 254 nm UV light and extracted by scraping the thin layer zone into dry methanol. Silica gel was removed by centrifugation. The identity of the product was confirmed by mass spectrometry (supplemental Fig. 4). The yield was greater than 98%. The purified product was stored in methanol at −20°C until use.
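The reagent quantities above can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded amounts stated in the text (5 mg and 6 μmol of PC-PEG-biotin, 60 μmol of propargylamine, 20 μl total volume). The variable names are illustrative, and the molecular mass it reports is only what those rounded figures imply, not a measured value.

```python
# Rounded amounts as stated in the text.
pc_peg_biotin_mg = 5.0       # mass of PC-PEG-biotin NHS ester
pc_peg_biotin_umol = 6.0     # stated molar amount of the same material
propargylamine_umol = 60.0   # stated molar amount of propargylamine
reaction_volume_ul = 20.0    # total reaction volume

# Molar excess of amine over NHS ester (the text calls this "10-fold").
molar_excess = propargylamine_umol / pc_peg_biotin_umol

# Molecular mass implied by 5 mg == 6 umol (ug per umol equals g/mol).
implied_mw = pc_peg_biotin_mg * 1000.0 / pc_peg_biotin_umol

# Ester concentration in the reaction (umol/ul equals mol/l, so scale to mM).
ester_conc_mM = pc_peg_biotin_umol / reaction_volume_ul * 1000.0

print(f"molar excess: {molar_excess:.0f}-fold")
print(f"implied molecular mass: {implied_mw:.0f} g/mol")
print(f"ester concentration: {ester_conc_mM:.0f} mM")
```

The high reagent concentration is a consequence of the very small reaction volume; scaling the volume changes the concentration but not the molar excess.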
Enzymatic and Chemical Derivatization of O-GlcNAc-modified Peptides, Enrichment, and Photochemical Release of Tagged Peptides-Proteins were digested with trypsin (50:1 substrate/enzyme) in 100 mM ammonium bicarbonate, pH 8, overnight at 37°C. Trypsin was removed with a 10-kDa molecular mass cutoff membrane (Millipore, Billerica, MA). Solvent was removed under vacuum, and the residue was then resuspended in 120 μl of 10 mM HEPES, pH 7.9, containing 5 mM MnCl2, UDP-N-azidoacetylgalactosamine (UDP-GalNAz), and 10 units of GalT1. The reaction mixture was incubated overnight at 4°C, treated with 10 units of calf intestine phosphatase (New England Biolabs, Ipswich, MA), incubated for an additional 2 h at room temperature, and then passed through a C18 spin column (Nest Group, Southborough, MA). UDP-GalNAz and derivatized peptides were eluted in 0.1% TFA, 1% acetonitrile and 0.1% TFA, 80% acetonitrile, respectively. Solvent was removed under vacuum, and the sample was reconstituted in 20 μl of solution containing 0.05 μmol of PC-PEG-biotin-alkyne, 10 mM sodium ascorbate, 1 mM tris[(1-benzyl-1H-1,2,3-triazol-4-yl)methyl]amine (in 4:1 t-butanol:DMSO), and 2 mM CuSO4. The reaction mixture was incubated overnight at room temperature with gentle agitation. To remove excess PC-PEG-biotin-alkyne, the sample was diluted into strong cation exchange (SCX) loading buffer (5 mM KH2PO4, 25% acetonitrile, pH 3.0) and then passed through an SCX spin column (Nest Group). The column was washed with several column volumes of loading buffer, and the retained peptides were eluted with high salt buffer (5 mM KH2PO4, 400 mM KCl, 25% acetonitrile, pH 3.0). Eluant was adjusted to pH 7 with ammonium hydroxide and then allowed to interact with high capacity avidin beads (Pierce) for 2 h at room temperature. Avidin beads were washed 10 times with PBS solution and twice with 20% methanol in water and resuspended in 70% methanol in water. The suspension was transferred to a thin walled PCR tube and irradiated with 365 nm UV light (2 milliwatts/cm²) (Spectroline ENF-240C, Westbury, NY) for 25 min at a distance of 10 cm with rotation, and the supernatant was then dried under vacuum and stored at −20°C. The overall process is also illustrated in Fig. 1a.
Western Blotting with HRP-conjugated Avidin-α-Crystallin (Invitrogen) was subjected to chemoenzymatic tagging as described above but without SCX and avidin enrichment. The protein was resolved by SDS-PAGE and transferred to a nitrocellulose membrane. The membrane was blocked with 10% (w/w) bovine serum albumin in Tris-buffered saline (50 mM Tris·HCl, 150 mM NaCl, pH 7.4), incubated with HRP-streptavidin (1:20,000 dilution in TBS with 0.1% Tween 20; Pierce), and visualized by chemiluminescence.
Preparation of Tau-containing Protein Fractions from Rat Brain-Rat brain (3.4 g) was homogenized in 12.5 ml of ice-cold 1% perchloric acid (Polytron homogenizer, Glen Mills, Clifton, NJ). The resulting suspension was incubated on ice for 20 min and centrifuged at 20,000 × g for 20 min. The supernatant was passed through a 1-μm filter and concentrated with a 10-kDa molecular mass cutoff membrane with simultaneous buffer exchange to 10 mM HEPES, pH 7.5. Proteins (1.9 mg) were fractionated on a Superose 12 PC 3.2/30 gel filtration column (GE Healthcare) by using a buffer containing 20 mM HEPES and 50 mM NaCl, pH 7.5. Tau-containing fractions were combined. The total protein amount was estimated by UV absorbance at 280 nm. An aliquot of this material (about 1.3 μg) was used for O-GlcNAc site mapping.

LC-MS Analysis of Tagged and Enriched O-GlcNAc-modified Peptides-CAD spectra were recorded on an LTQ-Orbitrap XL (Thermo Fisher Scientific, Bremen, Germany) interfaced to an Eksigent nano-LC system (Dublin, CA). The HPLC gradient was 5-40% solvent B (A = 0.1% formic acid; B = 90% acetonitrile, 0.1% formic acid) in 40 min at a flow rate of 300 nl/min. CAD MS/MS spectra were recorded with the LTQ operated in the data-dependent mode on the five most intense ions observed in MS1 scans recorded with the Fourier transform analyzer set at a resolution of 60,000 at m/z 400. Parameters for acquiring MS/MS spectra in the ion trap were as follows: activation time, 30 ms; activation Q, 0.25; dynamic exclusion, enabled with a repeat count of 2; and exclusion duration, 60 s. For ETD analysis, an aliquot of sample reconstituted in 0.1% acetic acid was pressure-loaded onto a 360-μm-outer diameter × 75-μm-inner diameter microcapillary precolumn packed with C18 (5-20-μm diameter, 120 Å) and then washed with 0.1% acetic acid as described previously (7). The precolumn was connected to a 360-μm-outer diameter × 50-μm-inner diameter microcapillary analytical column packed with C18 (5-μm diameter, 120 Å) and equipped with an integrated electrospray emitter tip (7). Peptides were gradient-eluted into the mass spectrometer at a flow rate of 60 nl/min. ETD MS/MS spectra were acquired using the following parameters: reaction time, 100 ms; reagent AGC target, 4e5 ion counts; full AGC target, 2e4 ion counts; MSn AGC target, 2e4 ion counts; isolation window, 3 m/z; repeat count, 2; repeat duration, 20 s; and exclusion duration, 30 s.
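The data-dependent settings above (five most intense ions, repeat count of 2, 60-s exclusion) can be illustrated with a simplified selection loop. This is a sketch of the general idea only, not the instrument vendor's algorithm: the class and function names, the 1 m/z matching bin, and the example peak list are all assumptions, and the repeat-duration window is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class DynamicExclusion:
    """Simplified dynamic exclusion list keyed by binned precursor m/z."""
    repeat_count: int = 2       # from the text
    exclusion_s: float = 60.0   # from the text (CAD runs)
    bin_width: float = 1.0      # assumption: m/z bin used to match precursors
    counts: dict = field(default_factory=dict)
    excluded_until: dict = field(default_factory=dict)

    def _key(self, mz: float) -> int:
        return round(mz / self.bin_width)

    def is_excluded(self, mz: float, t: float) -> bool:
        return self.excluded_until.get(self._key(mz), float("-inf")) > t

    def record(self, mz: float, t: float) -> None:
        """Count one selection; exclude the bin once repeat_count is reached."""
        key = self._key(mz)
        self.counts[key] = self.counts.get(key, 0) + 1
        if self.counts[key] >= self.repeat_count:
            self.excluded_until[key] = t + self.exclusion_s
            self.counts[key] = 0


def select_precursors(peaks, t, exclusion, top_n=5):
    """Pick up to top_n most intense (m/z, intensity) peaks not currently excluded."""
    chosen = []
    for mz, _intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        if len(chosen) == top_n:
            break
        if not exclusion.is_excluded(mz, t):
            exclusion.record(mz, t)
            chosen.append(mz)
    return chosen


excl = DynamicExclusion()
survey = [(457.0, 1.0e6), (715.4, 5.0e5)]  # hypothetical MS1 peaks
picks = [select_precursors(survey, t, excl) for t in (0.0, 1.0, 2.0, 62.0)]
print(picks)
```

In this toy run each precursor is fragmented in the first two survey cycles, skipped while it sits on the exclusion list, and becomes eligible again once the 60-s window has elapsed.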
Analysis of Mass Spectrometry Data-Peak lists were generated from raw data files using Bioworks software (version 3.3.1 sp1). In-house developed software was used to remove charge reduction species. The Open Mass Spectrometry Search Algorithm (OMSSA) (version 2.1.1) was utilized to search c- and z-type fragment ions present in ETD MS/MS spectra against all rat and mouse proteins (193,424 entries) in the NCBI non-redundant database (release date, November 2008). Database searches were performed with the following fixed parameters: precursor mass tolerance of ±1.5 Da, product ion mass tolerance of ±0.35 Da, and three missed cleavages. In addition, searches were performed with the following variable modifications: methionine in the oxidized and non-oxidized forms and serine and threonine residues with and without the tagged O-GlcNAc group. All database assignments of O-GlcNAc-modified peptides were confirmed by manual interpretation of the corresponding ETD MS/MS spectra.

RESULTS AND DISCUSSION
Approach for Highly Specific Enrichment and Site Mapping of GlcNAcylation-For this enrichment approach (Fig. 1a), proteins were first proteolytically digested. Following this, GlcNAcylated sites were labeled with an azido sugar using the substrate UDP-GalNAz and the galactosyltransferase GalT1 (6). Because the UDP product formed in this reaction potently feedback inhibits GalT1, alkaline phosphatase was added to the reaction mixture to rapidly degrade UDP and to ensure complete derivatization (supplemental Fig. 3). Peptide:N-glycosidase F, which removes N-linked glycans, was also added to the reaction mixture to ensure that N-glycans with terminal GlcNAc residues were not labeled with GalNAz. Excess UDP-GalNAz was removed using a C18 spin column. GalNAz groups were subsequently biotinylated through a copper-catalyzed 1,3-dipolar cycloaddition reaction of the free azide on GalNAz to an alkyne containing a terminal biotin group and a photocleavable linker (8). After removing excess biotinylation reagent by SCX, enrichment of the tagged O-GlcNAc-modified peptides was performed by avidin affinity chromatography. A major issue with using avidin chromatography is the difficulty in elution of biotin-containing peptides because binding of biotin to avidin is essentially irreversible (Ka = 10¹⁵ M⁻¹). The elution efficiency is also low when using monomeric avidin. Therefore, harsh conditions are generally used for elution from avidin beads; however, this can easily damage O-GlcNAc. Reduction-cleavable biotin is not suitable for this application because the dipolar addition reaction requires strong reducing conditions. Acid-cleavable biotin typically requires treatment with 95% trifluoroacetic acid, which we also found was strong enough to partially hydrolyze O-GlcNAc.
Olejnik et al. (9) previously reported a photocleavable biotin reagent. Based on their findings, we synthesized a photocleavable biotin-alkyne reagent to tag GlcNAc-modified peptides (Fig. 1a, inset). This photocleavable biotin-linked alkyne contains a terminal biotin group, which facilitates enrichment of GlcNAc-modified peptides. In addition, after enrichment with avidin chromatography, the tagged GlcNAc-modified peptides are efficiently released upon brief exposure to UV light (365 nm) (supplemental Fig. 4), and the released peptides carry a basic aminomethyltriazolyl acetylgalactosamine group. This overall approach is illustrated in Fig. 1a. Shown in Fig. 1b is a CAD MS/MS spectrum recorded on [M + 3H]3+ ions (m/z 457) corresponding to the tagged, GlcNAcylated tryptic peptide having the sequence YSPTgSPSK (gS is O-GlcNAcylated Ser). Note that the dominant fragmentation pathway occurs at the glycosidic linkage to produce oxonium ions at m/z 300.1 and 503.2. These signature ions are diagnostic for tagged GlcNAc residues and can be used to detect GlcNAcylated peptides in complex mixtures. Fig. 1c displays the ETD MS/MS spectrum recorded on the same [M + 3H]3+ ions. Because the tag added to the O-GlcNAc residue contains a basic group, all GlcNAcylated tryptic peptides should exist in a charge state of 3+ or higher and thus fragment well along the peptide backbone under ETD conditions (10). Cleavage at the glycosidic linkage was not observed.
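The m/z values quoted above can be rationalized with a short mass calculation. The sketch below is illustrative only: the elemental compositions assigned to the tag components (GlcNAc residue C8H13NO5, GalNAz residue C8H12N4O5, and a C3H5N contribution from propargylamine that remains after photocleavage) are our reading of the chemistry and are not stated in the text, and all names are hypothetical. Residue masses are standard monoisotopic values.

```python
# Standard monoisotopic amino acid residue masses (Da).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

# Assumed tag components (monoisotopic, Da); not given in the text.
GLCNAC = 203.0794           # C8H13NO5, the O-GlcNAc residue itself
GALNAZ = 244.0808           # C8H12N4O5, added by GalT1
AMINOMETHYL_PART = 55.0422  # C3H5N, left on the triazole after photocleavage

TAG_ON_GLCNAC = GALNAZ + AMINOMETHYL_PART  # what the workflow adds to GlcNAc
TAGGED_GLCNAC = GLCNAC + TAG_ON_GLCNAC     # total mass added to Ser/Thr


def precursor_mz(sequence: str, n_tagged: int, charge: int) -> float:
    """m/z of a peptide carrying n_tagged tagged GlcNAc groups."""
    neutral = sum(RESIDUE[aa] for aa in sequence) + WATER + n_tagged * TAGGED_GLCNAC
    return (neutral + charge * PROTON) / charge


oxonium_small = TAG_ON_GLCNAC + PROTON  # tagged GalNAc oxonium ion
oxonium_large = TAGGED_GLCNAC + PROTON  # tagged GalNAc-GlcNAc oxonium ion
mz_3plus = precursor_mz("YSPTSPSK", n_tagged=1, charge=3)

print(f"oxonium ions: {oxonium_small:.2f}, {oxonium_large:.2f}")
print(f"[M + 3H]3+ of tagged YSPTSPSK: {mz_3plus:.2f}")
```

Under these assumptions the computed values fall close to the m/z 300.1, 503.2, and 457 quoted for Fig. 1b, which is a useful sanity check on the assumed tag composition.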
Detecting GlcNAcylation in α-Crystallin-To test our enrichment strategy, we analyzed a mixture containing tryptic peptides from α-crystallin (50 fmol), which has two known O-GlcNAc sites, and bovine serum albumin (2.5 pmol). GlcNAc peptides were tagged with GalNAz followed by photocleavable biotin, enriched, subsequently released from an avidin resin, and analyzed by CAD using an LTQ-Orbitrap mass spectrometer and by ETD using an LTQ XL mass spectrometer. Shown in Fig. 2a is the CAD spectrum recorded on [M + 3H]3+ ions at m/z 715 from the tagged carboxyl-terminal tryptic peptide (AIPVgSREEKPSSAPSS) of α-crystallin. Note that the signals labeled as Peak 1 and Peak 2 correspond to the diagnostic oxonium ions, which are discussed above. Shown in Fig. 2b is the ETD MS/MS spectrum recorded on [M + 4H]4+ ions for the same peptide. Predicted fragment ions of type c and z are shown above and below the inset in Fig. 2b. Those ions that were observed are underlined. Note that the mass differences between c5 and c4 and between z12 and z11 are both 589 Da rather than the 87 Da expected for an unmodified Ser residue. We concluded that the first Ser in the peptide carries the tagged O-GlcNAc moiety. Tagged EEKPAVgTAAPK (gT is O-GlcNAcylated Thr) from the β-chain of α-crystallin was also detected (data not shown). Unmodified BSA peptides were not detected in the analysis, indicating that the enrichment proceeded with high selectivity.

Fig. 2 legend (partial): Brackets enclose ions that correspond to charge-reduced species and fragments derived from them by loss of small, neutral molecules. Product ions that result from loss of an aminomethyltriazole radical are labeled with a circle (○). c, the tagging approach enables detection of O-GlcNAc-modified proteins by avidin-HRP blotting. This figure shows that tagged α-crystallin provides a strong signal after blotting with avidin-HRP. Upon UV illumination, no signal is observed, indicating photocleavage and loss of the biotin tag from α-crystallin.
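The site assignment described above, in which one step in the c-ion series is 589 Da instead of the 87 Da of an unmodified Ser, can be sketched in code. This is a minimal illustration, not the search software used in the study: the tag mass of 502.2 Da is simply the difference implied by the text (589 minus 87), the "observed" ions are generated theoretically rather than read from a spectrum, and the function names are hypothetical. It also ignores that some c ions (for example those N-terminal to Pro) are typically absent from real ETD spectra.

```python
# Standard monoisotopic amino acid residue masses (Da).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728
AMMONIA = 17.02655
TAG_DELTA = 502.2  # assumption: mass of the tagged GlcNAc implied by 589 - 87


def c_ions(sequence, tagged_pos):
    """Singly charged c-ion m/z values c1..c(n-1); tagged_pos is 1-based."""
    ions, total = [], 0.0
    for i, aa in enumerate(sequence[:-1], start=1):
        total += RESIDUE[aa] + (TAG_DELTA if i == tagged_pos else 0.0)
        ions.append(total + AMMONIA + PROTON)
    return ions


def localize(sequence, ions, tol=0.35):
    """Return 1-based positions whose c-ion step equals residue mass plus tag."""
    sites = []
    previous = AMMONIA + PROTON  # notional "c0" so the first step is residue 1
    for i, (aa, mz) in enumerate(zip(sequence, ions), start=1):
        step = mz - previous
        if abs(step - (RESIDUE[aa] + TAG_DELTA)) <= tol:
            sites.append(i)
        previous = mz
    return sites


peptide = "AIPVSREEKPSSAPSS"               # alpha-crystallin C-terminal peptide
observed = c_ions(peptide, tagged_pos=5)   # stand-in for measured c ions
step_c5_c4 = observed[4] - observed[3]     # c5 - c4
sites = localize(peptide, observed)

print(f"c5 - c4 = {step_c5_c4:.1f} Da; tagged position(s): {sites}")
```

The same comparison applied to the z-ion series, counted from the carboxyl terminus, gives an independent confirmation of the site, which is why the text cites both the c5/c4 and the z12/z11 pairs.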
The level of O-GlcNAc on α-crystallin is ~10% (11). Knowing this, we estimated that recovery of the two O-GlcNAc-modified peptides from α-crystallin was over 90%.
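The scale of this test can be made explicit with a small calculation. The sketch below uses the amounts quoted in the text (50 fmol of α-crystallin, 2.5 pmol of BSA, roughly 10% occupancy, at least 90% recovery). It assumes, for illustration only, that the 10% figure applies to each modified peptide individually; the variable names are hypothetical.

```python
crystallin_fmol = 50.0   # alpha-crystallin digest loaded (from the text)
bsa_fmol = 2500.0        # 2.5 pmol BSA digest (from the text)
occupancy = 0.10         # ~10% O-GlcNAc; assumed here to hold per site
recovery = 0.90          # lower bound on recovery reported in the text

# Amount of each O-GlcNAc-modified peptide present before enrichment.
modified_fmol = crystallin_fmol * occupancy

# Amount of each modified peptide expected after enrichment and release.
recovered_fmol = modified_fmol * recovery

# Molar excess of background protein over each modified peptide.
background_excess = bsa_fmol / modified_fmol

print(f"modified peptide present: {modified_fmol:.1f} fmol")
print(f"recovered (at 90%): {recovered_fmol:.1f} fmol")
print(f"BSA excess over modified peptide: {background_excess:.0f}-fold")
```

Under these assumptions the modified peptides are present at only a few femtomoles against a several-hundred-fold molar excess of unmodified background, consistent with the low femtomole sensitivity claimed for the method.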
Next, we asked whether this tagging strategy could be used to detect O-GlcNAc at the intact protein level without using mass spectrometry. As shown in Fig. 2c, tagged α-crystallin was readily detectable by avidin-HRP blotting. In contrast, the signal disappeared when the tagged protein was illuminated by UV light before the sample was resolved by SDS-PAGE. We concluded that tagged proteins and proteins bound nonspecifically to the avidin beads can be readily distinguished by blotting samples before and after the photochemical cleavage.
Detecting and Site Mapping GlcNAcylation in a Complex Biological Sample-The described protocol was used to analyze tau-containing protein fractions obtained from rat brain. These fractions were isolated by a combination of perchloric acid extraction and gel filtration chromatography. Eight O-GlcNAc sites on seven different proteins were identified (Table I and supplemental Fig. 5). Shown in Fig. 3a is the ETD mass spectrum recorded on [M + 3H]3+ ions from residues 709 to 717 of the tau microtubule-associated protein. Note that the mass differences between c3 and c2 and between z7 and z6 are both 589 Da rather than the 87 Da expected for an unmodified Ser residue. We concluded that Ser 711 (more commonly known as Ser 400 according to the most common splice variant) in the carboxyl-terminal region of tau is O-GlcNAc-modified. This same site can also be phosphorylated by glycogen synthase kinase-3β (12). These data support our hypothesis that one function of O-GlcNAc on tau is to prevent hyperphosphorylation in normal brain (13,14). Because we observed limited digestion of tau with trypsin, we anticipate that additional O-GlcNAc sites will be detected when multiple proteases are used to extend the sequence coverage.
From the same tau-containing protein fractions, O-GlcNAc sites on α-, β-, and γ-synucleins (Table I) were also characterized. α-, β-, and γ-synucleins are three small proteins that are expressed primarily in neural tissues (15). Aggregated α-synuclein is a component of filamentous inclusions associated with neurodegenerative conditions, such as Alzheimer and Parkinson diseases (16). Here we report that α-synuclein is O-GlcNAc-modified on Thr 72, a residue located within a 35-amino acid stretch that forms the protease-resistant core of α-synuclein fibrils (15). This region is prone to self-aggregation and is able to seed the formation of amyloid fibrils (17). It seems likely that GlcNAcylation on this peptide could prevent fibril formation. Interestingly, Shimura et al. (18) reported that the E3 ubiquitin protein ligase parkin binds and ubiquitinates glycosylated α-synuclein but not the non-glycosylated form in human brain, although the identity and site of the glycosylation were unknown at that time.
Two O-GlcNAc sites were also identified on methyl-CpG-binding protein 2 (MeCP2). One site was unambiguously identified at Thr 434 (Fig. 3b), whereas the second site was localized to either Thr 442 or Thr 443. This protein is also of interest because it binds to methylated DNA and mediates transcriptional repression through interactions with a histone deacetylase and corepressor SIN3A (19). Mutations in MeCP2 cause Rett syndrome, a developmental disorder characterized by mental retardation, motor dysfunction, and autistic behavior (20). Brain-specific phosphorylation of Ser 421 in MeCP2 by a Ca2+/calmodulin-dependent kinase II-dependent mechanism is thought to regulate a program of gene expression that mediates nervous system maturation (21). Whether or not GlcNAcylation on the nearby residues (Thr 434 and either Thr 442 or Thr 443) alters this process will require further investigation.
In summary, we describe here an efficient enrichment protocol that is highly specific for O-GlcNAc-modified proteins and peptides and that facilitates site mapping of O-GlcNAc-modified amino acids at the low femtomole level by ETD mass spectrometry. The procedure uses a novel photocleavable biotin tag that allows for efficient release of the enriched O-GlcNAc-modified peptides from the solid affinity support. In addition, the photocleavage reaction leaves a basic aminomethyltriazole tag at the site of the O-GlcNAc modification. As a result, all modified tryptic peptides exist in a charge state of 3+ or higher and thus fragment efficiently along the peptide backbone when subjected to ETD. For site-specific O-GlcNAc quantification, a heavy isotope-labeled photocleavable biotin alkyne is currently being synthesized.
Because the enrichment procedure is highly specific, the flow-through from the avidin chromatography can be further enriched for phosphopeptides or for peptides with other post-translational modifications. Note that in this case the phosphatase must be omitted from the labeling process. A flow chart diagramming the sequential enrichment of GlcNAcylated peptides and phosphorylated peptides is shown in supplemental Fig. 6. In a recent large scale study, we applied this serial enrichment protocol (leaving out the alkaline phosphatase) to investigate the interplay of phosphorylation and GlcNAcylation in the regulation of cytokinesis. By combining the above serial enrichment protocol with stable isotope labeling by amino acids in cell culture, we mapped, quantified, and compared relative site occupancy for over 120 specific O-GlcNAc-modified residues and over 350 phosphorylated residues by tandem mass spectrometry using only 15 μg of sample from a spindle/midbody preparation. This confirms the robustness of our protocol for analysis of complex mixtures. Because the sensitivity of the above approach is already on par with that used to study protein phosphorylation, we expect that the method will be a valuable tool for deciphering the "O-GlcNAcome." Together with other methods (6,22), we expect a significant increase in the number of known O-GlcNAc sites in the very near future, which will greatly facilitate investigation of the function of GlcNAcylation.