Analysis of Glycosylation Site Occupancy Reveals a Role for Ost3p and Ost6p in Site-specific N-Glycosylation Efficiency

Asparagine-linked glycosylation is the most common post-translational modification of proteins catalyzed in eukaryotes by the multiprotein complex oligosaccharyltransferase. Apart from the catalytic Stt3p, the roles of the subunits are ill defined. Here we describe functional investigations of the Ost3/6p components of the yeast enzyme. We developed novel analytical tools to quantify glycosylation site occupancy by enriching glycoproteins bound to the yeast polysaccharide cell wall, tagging glycosylated asparagines using endoglycosidase H glycan release, and detecting peptides and glycopeptides with LC-ESI-MS/MS. We found that the paralogues Ost3p and Ost6p were required for efficient glycosylation of distinct defined glycosylation sites. Our results describe a novel method for relative quantification of glycosylation occupancy in the genetically tractable yeast system and show that eukaryotic oligosaccharyltransferase isoforms have different activities toward protein substrates at the level of individual glycosylation sites.

N-Glycosylation is important in protein folding and in modulating protein interactions, stability, and activity (1-4). Oligosaccharyltransferase (OTase) transfers glycans from a lipid pyrophosphate donor to selected asparagines in N-glycosylation sequons (NX(S/T); X ≠ P) in polypeptides in the lumen of the endoplasmic reticulum (ER) (5) or in the periplasm of some bacteria (6). Most OTases require a highly defined lipid-linked oligosaccharide substrate (7) while accepting a large number of glycosylation sites located on many different proteins. Glycosylation is proposed to be coupled to translocation into the ER in eukaryotes and occurs before protein folding (8,9), whereas bacterial OTase can glycosylate acceptor sites located in flexible domains of folded proteins in vivo and in vitro (10). In higher eukaryotes, OTase is a multiprotein complex consisting of eight different subunits (Fig. 1), but in some protozoan, archaeal, and bacterial species N-glycosylation is catalyzed by a single protein OTase, homologous with Stt3p, the catalytic protein subunit of the eukaryotic OTase (5,11). The roles of the additional subunits in the eukaryotic OTase are largely unknown but may allow regulation of activity toward preferred glycan and protein substrates. Alternatively, as N-glycosylation requires flexible segments in the acceptor protein, the additional subunits may couple OTase to the translocation machinery or inhibit substrate protein folding to extend OTase substrate range or increase glycosylation efficiency.
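The sequon definition used here, NX(S/T) with X being any residue except proline, can be expressed as a short pattern search. The following is a minimal sketch; the function name and the toy sequence are hypothetical and are not taken from the proteins analyzed in this study.

```python
import re

# N-X-(S/T) with X != P. The lookahead keeps the match to the Asn itself,
# so overlapping sequons (e.g. "NNST") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-(S/T) sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    toy = "MNGTANPSNNST"  # hypothetical sequence for illustration only
    print(find_sequons(toy))
```

A sequon found this way is only a candidate site; whether it is occupied has to be measured, which is the purpose of the method described below.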
In yeast, the presence of either Ost3p (UniProtKB number P48439) or Ost6p (UniProtKB number Q03723) results in two alternative OTase complexes (12,13), which differ in their protein substrate-specific activities (12,14). Ost3p and Ost6p share the same predicted topology of four transmembrane helices with an N-terminal hydrophilic domain located in the lumen of the ER. This domain is predicted to have a thioredoxin-like fold (15).
As we suspected that Ost3/6p affect glycosylation of a defined but unknown subset of glycosylation sites, we developed an MS-based analytical method that could concurrently measure the glycan occupancy of many sites in different proteins. This method involved enrichment of glycoproteins covalently linked to the yeast polysaccharide cell wall, release of glycans from these glycoproteins with endoglycosidase H leaving previously glycosylated asparagine residues "tagged" with a single N-acetylglucosamine (GlcNAc), and LC-ESI-MS/MS detection of peptides and glycopeptides after protease digestion. This method builds on previous studies that used glycosidase digests to identify previously glycosylated asparagine residues (16,17). We used this method together with yeast genetics approaches to reveal that the Ost3/6p subunits of the yeast OTase influenced the glycosylation efficiency of specific sites. Our results show that in yeast the presence of multiple OTase isoforms increases the range of efficiently glycosylated protein substrates.

EXPERIMENTAL PROCEDURES
Chemicals were obtained from Sigma-Aldrich unless specified otherwise.
Cell Wall Protein Sample Preparation-Yeast cells were grown to mid-log phase in minimal medium at 23°C. Cells were harvested and lysed at 4°C using glass beads in 50 mM Tris-HCl, pH 7.5, 1× Complete protease inhibitor mixture (Roche Diagnostics), and 2 mM PMSF. Based on previously reported methods (19), covalently linked cell wall material was pelleted by centrifugation at 16,000 rcf for 1 min; washed three times with 50 mM Tris-HCl, pH 7.5; and resuspended in 50 mM Tris-HCl, pH 8, 2% SDS, 7 M urea, and 2 M thiourea. Cysteines were reduced/alkylated by addition of dithiothreitol to 10 mM and incubation with agitation at 30°C for 30 min followed by addition of iodoacetamide to 25 mM and additional incubation with agitation at 30°C for 1 h. Non-covalently linked proteins were then removed by washing the 16,000 rcf pellet five times with 50 mM Tris-HCl, pH 8, 2% SDS, 7 M urea, and 2 M thiourea followed by an additional five washes with 2% SDS. The pellet was resuspended in 2% SDS, 1× G5 buffer (New England Biolabs, Ipswich, MA), and 500 units of endoglycosidase H/100 µl (New England Biolabs) and incubated at 37°C with agitation for 16 h. Endoglycosidase H and SDS were removed by washing the 16,000 rcf pellet six times with 50 mM NH4HCO3; the pellet was resuspended in 50 mM NH4HCO3, and proteins were digested with either trypsin (4 µg/ml) or Asp-N (1 µg/ml) at 37°C with agitation for 16 h. Insoluble material was pelleted at 16,000 rcf for 1 min, and soluble peptides were desalted using C18 ZipTips (Millipore) and dried. Desalted peptides were resuspended in 2% acetonitrile and 0.2% formic acid and analyzed by LC-ESI-MS/MS.
Mass Spectrometry-Cell wall peptides and glycopeptides were analyzed by LC-ESI-MS/MS with an LTQ-FT-ICR-MS instrument (Thermo Scientific, Waltham, MA). Samples were injected into an Eksigent nano-HPLC system (Eksigent Technologies, Dublin, CA) with an autosampler and separated on a home-made reverse-phase column (75 µm × 80 mm) packed with C18 material (AQ, 3 µm, 200 Å; Bischoff GmbH, Leonberg, Germany). The column was equilibrated with solvent A (A: 3% acetonitrile and 0.2% formic acid in water). Peptides were eluted using the following gradient: 0-50 min, 0-60% B; 50-53 min, 60-97% B; 53-60 min, 97% B (B: 80% acetonitrile and 0.2% formic acid in water) at a flow rate of 0.2 µl/min. High accuracy mass spectra were acquired with LTQ-ICR-FT in the mass range of 300-2000 m/z and a target value of 5 × 10^5 ions. Up to four data-dependent MS/MS spectra were recorded in parallel at the ion trap of the most intense ions with charge state 2+ or higher using collision-induced dissociation. Target ions already selected for MS/MS were dynamically excluded for 60 s. General mass spectrometric conditions were as follows: normalized collision energy, 32% for MS/MS; ion selection threshold, 500 counts for MS/MS; activation, q = 0.25; and activation time of 30 ms for MS/MS acquisitions.
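For orientation, the elution gradient stated above can be read as a piecewise linear function of time. The sketch below interpolates the percentage of solvent B and, assuming ideal mixing of solvents A (3% acetonitrile) and B (80% acetonitrile), the resulting acetonitrile content; the function names are hypothetical and instrument dwell volume is ignored.

```python
# (time in min, % solvent B) breakpoints taken from the gradient described above.
GRADIENT = [(0.0, 0.0), (50.0, 60.0), (53.0, 97.0), (60.0, 97.0)]

ACN_IN_A = 3.0   # % acetonitrile in solvent A
ACN_IN_B = 80.0  # % acetonitrile in solvent B


def percent_b(t_min):
    """Linearly interpolate % B at time t_min (minutes) within the gradient."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-60 min gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_acetonitrile(t_min):
    """Acetonitrile content of the mobile phase, assuming ideal mixing of A and B."""
    fraction_b = percent_b(t_min) / 100.0
    return ACN_IN_A * (1.0 - fraction_b) + ACN_IN_B * fraction_b


if __name__ == "__main__":
    for t in (0, 25, 50, 53, 60):
        print(t, percent_b(t), round(percent_acetonitrile(t), 1))
```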
Data Analysis-Peak lists were extracted from raw data using Mascot Distiller (Version 2.1, Matrix Science, London, UK) with default parameters. Peptide identities based on MS/MS data were assigned using an in-house installation of Mascot (Version 2.1, Matrix Science) searching the Saccharomyces cerevisiae protein database (downloaded from European Molecular Biology Laboratory-European Bioinformatics Institute: fgcz_4921 4932_yeast_contaminants_20070811; 6068 sequences; 2,968,192 residues) with the following parameters: fixed modification of carbamidomethylated cysteines; variable modification of oxidized methionines, deamidated asparagines, and HexNAc-asparagines; no enzyme specified; 0.01-Da peptide tolerance; 0.6-Da fragment ion tolerance; and detected ion-specific charge state. Peptides with scores below the Mascot threshold for reliable identifications were excluded. Peptide abundances were determined manually from LC-ESI-MS data using Xcalibur (Version 2.0, Thermo Scientific) by summing the base peak chromatogram intensity for the entire isotopic distribution of each selected peptide ion over the elution peak. The glycosylation occupancy at a given sequon was determined by the abundance of the GlcNAc-modified peptide as a fraction of the sum of the GlcNAc-modified and unmodified versions of the same peptide. Each cell type was analyzed with five or more independent biological preparations, and each data point was determined as mean ± range. Data were compared using a two-sided Mann-Whitney test, a non-parametric statistical test appropriate to the data characteristics and low sample sizes. Two peptides detected contained two glycosylation sequons, both of which were occupied in wild type cells. In some cells these glycopeptides were also detected in a singly glycosylated form. However, as it was not possible to define the affected sequon, occupancy of these sites was combined in a single measurement (Asn233 and Asn237 of Crh2p as CRH2_233 and Asn268 and Asn280 of Ecm33p as ECM33_268).
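The occupancy calculation and the statistical comparison described above can be sketched as follows. This is an illustration only: the function names and the example values are hypothetical, and the two-sided Mann-Whitney p-value is computed here by exact enumeration of rank assignments, which is one valid way to run the test at these sample sizes but is not necessarily how the values in this study were obtained.

```python
from itertools import combinations


def occupancy(glcnac_intensity, unmodified_intensity):
    """Relative occupancy: GlcNAc-peptide signal over GlcNAc-peptide plus peptide signal."""
    total = glcnac_intensity + unmodified_intensity
    if total <= 0:
        raise ValueError("no signal for either peptide form")
    return glcnac_intensity / total


def mean_and_range(values):
    """Return (mean, minimum, maximum) over independent preparations."""
    return sum(values) / len(values), min(values), max(values)


def midranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_exact(a, b):
    """Two-sided exact p-value from the rank sum of group a over all regroupings."""
    ranks = midranks(list(a) + list(b))
    n_a, n = len(a), len(a) + len(b)
    expected = n_a * (n + 1) / 2
    observed = abs(sum(ranks[:n_a]) - expected)
    hits = total = 0
    for idx in combinations(range(n), n_a):
        total += 1
        if abs(sum(ranks[i] for i in idx) - expected) >= observed - 1e-9:
            hits += 1
    return hits / total


if __name__ == "__main__":
    # Hypothetical occupancies from five preparations of two strains.
    strain_a = [1.0, 1.0, 1.0, 1.0, 1.0]
    strain_b = [0.60, 0.70, 0.65, 0.72, 0.80]
    print(mean_and_range(strain_b))
    print(mann_whitney_exact(strain_a, strain_b))
```

With five preparations per strain there are 252 possible regroupings, so the smallest attainable two-sided p-value is 2/252, which is reached only when the two groups do not overlap.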
Protein Sequence Analysis-Secondary structural elements and surface exposure of detected glycosylation sequons were predicted with JPred (20). Multiple sequence alignment based on amino acid sequence was performed with MUSCLE (multiple sequence comparison by log-expectation) (21), and a phylogenetic tree was built from this alignment with SCI-PHY (subfamily classification in phylogenomics) (22).

RESULTS
Quantitative Glycomics Method Development-We investigated the roles of the thioredoxin-like components Ost3/6p in OTase function in vivo. As yeast OTase contains either Ost3p or Ost6p (12, 13), we generated strains where both the OST3 and OST6 loci were deleted (Δost3/Δost6). Expression of plasmid-encoded Ost3p or Ost6p then led to normal levels of uniform OTase (12). This allowed us to compare the phenotype of strains expressing either Ost3p- or Ost6p-containing OTase complexes. We then asked whether glycosylation of specific acceptor sites was affected by Ost3/6p function. To detect such alterations in substrate recognition, we developed a novel analytical method to provide site-specific relative quantification of glycosylation occupancy. We took advantage of yeast cell wall architecture and prepared glycoproteins covalently attached to the polysaccharide cell wall matrix via glycosylphosphatidylinositol anchor remnants or alkali-sensitive linkages (19). After endoglycosidase H digest leaving glycosylated asparagines with a single GlcNAc, protease digestion, and MS analysis, the N-glycan occupancy at many sites could be determined (Figs. 2 and 3).
Glycan release with endoglycosidase H provided a clear distinction between previously glycosylated and unglycosylated versions of the same peptide with a Δmass of 203.08 Da and a difference in retention time of 5-90 s with the GlcNAc-modified peptide eluting before the unmodified peptide (Fig. 2, a and b). This allowed unambiguous identification and relative quantification of GlcNAc-peptide/peptide pairs. MS/MS spectra of GlcNAc-modified peptides typically contained a minor fragment ion at an m/z of 204.1, corresponding to a singly charged GlcNAc, as well as a Δmass corresponding to loss of asparagine-GlcNAc in a b or y fragment ion series (Fig. 2c and supplemental Figs. 1-31). Additionally, deamidation of a non-glycosylated asparagine residue preceding a glycine residue was observed in four peptides containing glycosylation sequons CRH2_210, CWP1_45, ECM33_197, and GAS5_344 (Table I). NG sequences have been reported to be especially prone to deamidation during the course of standard proteomic sample preparation as the reduced steric hindrance of the small glycine residue favors the cyclic intermediate in the asparagine to aspartic acid conversion (23,24).
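The mass values quoted above follow from the elemental composition of a GlcNAc residue (C8H13NO5). As a minimal sketch, the monoisotopic residue mass, the expected m/z spacing between a GlcNAc-peptide and its unmodified partner at a given charge state, and the m/z of the singly protonated GlcNAc fragment can be computed as below. The atomic masses are standard monoisotopic values; the function names are hypothetical.

```python
# Standard monoisotopic atomic masses (Da) and the proton mass.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647

# A GlcNAc (HexNAc) residue attached to asparagine adds C8H13NO5.
HEXNAC_RESIDUE = {"C": 8, "H": 13, "N": 1, "O": 5}


def mono_mass(composition):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in composition.items())


def pair_mz_shift(charge):
    """Expected m/z difference between GlcNAc-peptide and peptide at one charge state."""
    return mono_mass(HEXNAC_RESIDUE) / charge


def glcnac_fragment_mz():
    """m/z of the singly protonated GlcNAc fragment ion seen in MS/MS."""
    return mono_mass(HEXNAC_RESIDUE) + PROTON


if __name__ == "__main__":
    print(round(mono_mass(HEXNAC_RESIDUE), 4))
    print(round(pair_mz_shift(2), 4))
    print(round(glcnac_fragment_mz(), 4))
```

In practice, a GlcNAc-peptide/peptide pair at charge 2+ is therefore expected to be separated by about 101.54 m/z units, with the modified form eluting earlier.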
Although no protease was specified in the search parameters, all identified glycopeptides corresponded to trypsin or Asp-N cleavage events with the exception of the glycopeptides containing the CRH2_28 and GAS5_24 glycosylation sites (Crh2p: Ala24-Glu35 and Gas5p: Ala20-Lys33, respectively) (Table I). These peptides are most likely the result of signal peptide cleavage as predicted between Ala24-Ala25 in Crh2p but between Ala19-Ala20 rather than at the predicted Ser22-Ser23 cleavage site in Gas5p (UniProt).
OTase Site-specific N-Glycosylation Activity-Analysis of cell wall glycoproteins in wild type yeast using our method (Fig. 3 and Tables I and II) robustly detected 26 NX(S/T) sequons, 85% (22 of 26) of which were occupied, with more NXT than NXS sequons glycosylated. Although all detected sequons were predicted to be surface-exposed, sequons in predicted secondary structural elements were less likely to be glycosylated (data not shown).
We next used our method to ask whether the two OTase isoforms defined by the presence of either Ost3p or Ost6p showed site-specific glycosylation activity. For this, we analyzed yeast cells with wild type levels of OTase but with exclusively Ost3p or Ost6p (Fig. 3 and Table II).

DISCUSSION
N-Glycosylation is a general co- or post-translational modification affecting many glycosylation sites on numerous proteins. However, only ~70% of NX(S/T) sequons in proteins translocated into the ER are actually glycosylated, and the fundamental processes governing site-specific control of N-glycosylation are poorly understood. The Ost3p and Ost6p subunits of yeast OTase are predicted to contain a thioredoxin-like fold, and the two isoforms of OTase containing either one or the other of these two proteins show different protein-specific glycosylation activities. We were unable to further dissect this activity with standard immunoblot-based methods as these can only measure the average glycosylation state of multiple sites in a single protein and only then when appropriate antisera are available. Such specific methods are inherently inadequate for detailed investigation of the general process of N-glycosylation. To increase the resolution of measurement of OTase function, we therefore developed an MS-based method to concurrently measure N-glycan occupancy at many glycosylation sites in different proteins. Efficient MS detection of numerous glycosylation sites required enrichment of glycoproteins. We chose an enrichment based on the natural covalent linkage of some proteins to the yeast polysaccharide cell wall matrix as this did not rely on N-glycans and therefore allowed detection of both glycosylated and non-glycosylated versions of the same sequon. Glycosylation occupancy measured by this method was relative as GlcNAc-asparagine modification undoubtedly affected MS ionization efficiency. In addition, only mature, successfully secreted proteins were analyzed. This biased the analysis against any sites that must be glycosylated for correct protein folding and ER export or sites where glycosylation would make correct folding impossible. Our analysis therefore allowed concurrent relative quantification of in vivo OTase activity on a wide variety of polypeptide substrates. Our method is a complementary analytical approach to the recently reported method allowing quantification of glycosylation occupancy at a lower number of predefined glycosylation sites in glycoproteins in clinical samples from patients with congenital disorders of glycosylation by multiple reaction monitoring LC-MS/MS (25).
Although designed for use in the yeast model system, it would be possible to modify our approach to allow analysis of the glycosylation occupancy of mammalian proteins. Nascent polypeptides still in the lumen of the ER could be released with endoglycosidase H before N-glycan modification in the Golgi rendered glycans endoglycosidase H-resistant. To allow more general analysis, peptide-N-glycosidase F could be used to release high mannose, complex, and hybrid glycans. However, upon peptide-N-glycosidase F glycan release, previously glycosylated asparagine residues are converted to aspartic acid with a resulting Δmass of only 1 Da between previously glycosylated and unglycosylated versions of the same peptide (potentially leading to confusion with the natural peptide isotopic distribution) and differences in retention times typically much less than peak width (data not shown). The benefits of both a large GlcNAc tag on glycosylated asparagines and general glycan release could be obtained by using a mixture of exoglycosidases and endoglycosidases D and H, an approach compatible with most N-glycans, as previously reported (16). It is also possible that O-GlcNAc modification of peptides could be confused with GlcNAc-asparagine tags in cases in which MS/MS is not of sufficient quality to determine the exact modified residue.
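The difficulty with the 1-Da shift left by peptide-N-glycosidase F can be made concrete with a short calculation. Conversion of asparagine to aspartic acid replaces NH2 with OH, a monoisotopic shift of about 0.984 Da, whereas adjacent peaks in a peptide isotope envelope are spaced by about 1.003 Da (the 13C-12C difference). The sketch below, with hypothetical function names and a simple m/Δm definition of resolving power, estimates how closely the two signals fall.

```python
# Standard monoisotopic atomic masses (Da).
O, N, H = 15.99491462, 14.00307401, 1.00782503
C13_MINUS_C12 = 1.0033548

# Asn (side chain CONH2) -> Asp (side chain COOH): gain O, lose N and H.
DEAMIDATION_SHIFT = O - N - H


def mz_gap(charge):
    """m/z gap between the Asp-peptide peak and the first 13C peak of the Asn-peptide."""
    return (C13_MINUS_C12 - DEAMIDATION_SHIFT) / charge


def resolving_power_needed(mz, charge):
    """Rough m/delta-m needed to separate the two peaks at a given m/z and charge."""
    return mz / mz_gap(charge)


if __name__ == "__main__":
    print(round(DEAMIDATION_SHIFT, 4))
    print(round(mz_gap(2), 5))
    print(round(resolving_power_needed(800.0, 2)))
```

By comparison, the 203.08-Da GlcNAc tag left by endoglycosidase H places the two forms of a peptide far outside each other's isotope envelopes.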
In wild type yeast, the glycosylation site usage pattern observed with our novel method was in general agreement with results compiled from database analysis (26,27). 85% of sites were completely occupied, whereas the remainder were not modified. NXT sequons tended to be glycosylated more often than NXS sequons, which is possibly a reflection of the general preference of OTase for NXT over NXS sequons reported in various studies. The observed tendency for unoccupied sequons to be located in regions of predicted secondary structure also suggests that local conformational factors are likely to be important in defining the glycosylation state of a given sequon.
To compare the protein substrate specificities of OTase isoforms containing either Ost3p or Ost6p, we used our analytical method to examine yeast cells with wild type levels of OTase but expressing only one of Ost3p or Ost6p. These analyses showed site-specific partial underglycosylation with different sites affected in an OTase isoform-dependent manner. The observed effect on glycosylation was not caused by a general reduction in OTase activity as different sites were underglycosylated in the absence of Ost3p or Ost6p. Δost3/Δost6 double mutant cells showed underglycosylation of sites requiring Ost3p and Ost6p as well as of additional sites, correlating with the lower level of OTase in these cells due to OTase complex instability (12). These results confirmed that Ost3p and Ost6p did impart protein substrate specificity to OTase activity at the level of individual glycosylation sites with 30% of sites requiring a specific OTase isoform for efficient glycosylation. Ost3p improved glycosylation at more sites than Ost6p, correlating with its roughly 4 times greater abundance in wild type yeast and the more severe phenotype in Δost3 mutant cells (28). It has been proposed that the presence of either Ost3p or Ost6p causes OTase to associate with different translocons (13). However, Ost3p and Ost6p had largely complementary functions as few glycosylation sites specifically required the presence of one or the other protein for efficient glycosylation (Fig. 3 and Table II). It is therefore unlikely that OTase substrate proteins are targeted to the required OTase. This suggested that in Ost3/6p-assisted glycosylation substrate proteins interact sequentially with different OTase complexes containing either Ost3p or Ost6p. These interactions may therefore not necessarily be intimately coupled with translocation but could also occur post-translocationally.
Efficient glycosylation of a third of N-glycosylation sites required the specific presence of either the Ost3p or Ost6p OTase isoform. Yeast Ost3/6p is therefore involved in controlling OTase activity toward a substantial proportion of protein substrates even though the Ost3/6p subunits are both the evolutionarily most recent additions to the OTase complex (5) and the physically most peripheral (29). Only the genomes of vertebrates and some fungi code for two different OST3/6 orthologues, and phylogenetic analysis indicates that gene duplication in these organisms occurred independently for these orthologues (Fig. 4). Thus, in some organisms an evolutionary advantage may originate from refined control of substrate-specific N-glycosylation. Conceivably fine tuned glycosylation is of special importance not only for fungi, which rely on glycosylated secretory enzymes for nutrition, but also for vertebrates where development depends on complex networks of interaction among secreted glycoproteins. Recently mutations in genes encoding the human homologues of Ost3/6p (termed implantation-associated protein and N33) have been associated with non-syndromic mental retardation, a novel form of congenital disorder of glycosylation (30,31). Together with our results, this suggests that refined activity of the human OTase is required for efficient glycosylation of one or more proteins needed for the development of normal cognitive functions.
Our novel method will be of utility for more detailed analyses of the protein substrate specificity of OTase in the genetically tractable yeast model system. Many previous studies have investigated the characteristics of sequons that contribute to efficient glycosylation using model glycoproteins in ex vivo or in vitro systems or bioinformatics comparisons of glycoproteins from many different systems (27, 32-36). The systemic analysis of glycosylation site occupancy we present here represents a novel tool for the phenotypic analysis of defined yeast mutant strains altered in N-glycosylation processes in the ER and hence with reduced N-glycan occupancy. Such analysis will provide a data set of natural glycosylation sites to formulate algorithms that predict the use of N-glycosylation sequons based on the primary amino acid sequence of putative glycoproteins.