C-Mannosylation and O-Fucosylation of Thrombospondin Type 1 Repeats*

The final chemical structure of a newly synthesized protein is often only attained after further covalent modification. Ideally, a comprehensive proteome analysis includes this aspect, a task that is complicated by our incomplete knowledge of the range of possible modifications and often by the lack of suitable analysis methods. Here we present two recently discovered, unusual forms of protein glycosylation, i.e. C-mannosylation and O-fucosylation. Their analysis by a combined mass spectrometric approach is illustrated with peptides from the thrombospondin type 1 repeats (TSRs) of the recombinant axonal guidance protein F-spondin. Nano-electrospray ionization tandem-mass spectrometry of isolated peptides showed that eight of ten Trp residues in the TSRs of F-spondin are C-mannosylated. O-Fucosylation sites were determined by a recently established nano-electrospray ionization quadrupole time-of-flight tandem-mass spectrometry approach. Four of five TSRs carry the disaccharide Hex-dHex-O-Ser/Thr in close proximity to the C-mannosylation sites. In analogy to thrombospondin-1, we assume this to be Glc-Fuc-O-Ser/Thr. Our current knowledge of these glycosylations will be discussed.

Ideally, the complete description of a proteome should include the definition of co- and post-translational modifications. The covalent attachment of carbohydrates is a widespread feature of secreted, cytoplasmic, and nuclear proteins (1). Although N-glycosidic attachment of GlcNAc to Asn residues and GalNAc O-linked to Ser/Thr have been known for a relatively long time (2,3), only more recently has the existence of a large variety of protein-carbohydrate linkages been revealed (1). Here we present two of these, including their analysis by a combined mass spectrometric approach.
The covalent attachment of an α-mannopyranosyl residue to the C-2 atom of the side chain of Trp was initially described in human RNase 2. The structure of this glycoconjugate (Fig. 1), which does not contain a typical glycosidic linkage but rather a C-C bond, is based on MS and NMR studies (4,5). The reaction is catalyzed by a microsomal transferase, which uses dolichyl-phosphate-mannose as the sugar donor and recognizes, in nearly all cases, the sequence Trp-X-X-Trp (6,7). The transferase activity can be detected in organisms ranging from Caenorhabditis elegans to man (8). Presently, a total of 49 C-mannosylated tryptophan residues have been identified, derived from 11 different proteins that perform a wide variety of functions (for a review see Ref. 9). Evidence for C-mannosylation and for its site of attachment can conveniently be obtained in CID MSMS experiments on isolated peptides. Here we show examples of this kind of analysis and summarize our current knowledge of this modification.
Fucose was long considered to be a terminal sugar of oligosaccharides. This changed, however, when fucose O-linked to Ser or Thr was discovered in the EGF-like modules from proteins that play an important role in fibrinolysis and coagulation (for an overview, see Refs. 10 and 11). There exist at least two pathways for the extension of O-linked fucose (12). The first one leads to the formation of the tetrasaccharide NeuAc-α2-6-Gal-β1,4-GlcNAc-β1,3-Fuc-α1-O-Ser/Thr, as found in factor IX and Notch (13)(14)(15). For Notch, it has been shown that receptor-ligand interactions are modulated by extension of the O-linked fucosyl residues (16,17). The second pathway yields the disaccharide Glc-β1,3-Fuc-α1-O-Ser/Thr, first found as the amino acid glycoside in human urine (18). Recently, we discovered that all three TSRs in TSP-1 carry this disaccharide in close proximity to C-mannosylation sites. Nano-ESI Q-TOF mass spectrometry has proven to be very useful in the determination of O-fucosylation sites (19,20). Here we use the same approach in the analysis of rF-spondin. This protein also contains TSRs and is secreted by the floor plate during development of the nervous system (21). It functions as a guidance protein for commissural neurons by attracting their axons to the midline during rostral growth.

EXPERIMENTAL PROCEDURES
Proteins-Recombinant F-spondin was expressed in COS-7 cells. Plasmid pSecF-spo-His, encoding residues 27-752 of rat F-spondin followed by a Myc tag and a His6 sequence, was a generous gift of Dr. A. Klar, Hebrew University-Hadassah Medical School, Jerusalem, Israel (21). Peptides for the analysis of the C-mannosylation and O-glycosylation sites were obtained from aminoethylated and carbamidomethylated rF-spondin, respectively. The former was cleaved with endoproteinase Lys-C and trypsin, whereas the latter was digested with trypsin. Digests were fractionated by C8 reversed phase LC-ESI MS using a Sciex API 300 triple quadrupole mass spectrometer, as described previously (22,23). Fractions were collected and analyzed using the same instrument. Peptides have been numbered according to their occurrence in mature F-spondin (21).
The analysis of C-mannosylated peptides by nano-ESI MSMS using a Sciex API 300 triple quadrupole mass spectrometer and Edman degradation has been described previously (23). Determination of O-fucosylation sites was performed by low energy CID tandem-MS in the nanospray mode, using a Q-TOF instrument (Micromass, Manchester, United Kingdom), equipped with a Z-spray atmospheric pressure ionization source operated at 30°C. The pressure of the collision gas (Ar) in the hexapole collision cell was 2.7 × 10⁻⁵ millibar, and the collision energy was typically 20 eV. Details of the method have been described (19).

RESULTS
C-Mannosylation-C-Mannosylation occurs only on secreted proteins, which, in addition to this modification, often also contain N- and/or O-linked oligosaccharides. This, and the fact that only a single mannosyl residue (mass, 162 Da) is added in the process of C-mannosylation, makes it difficult to detect (C2-Man-)Trp in the intact protein. Therefore, the isolation and analysis of relevant peptides is required. Toward that aim, a proteolytic digest of the protein is fractionated by reversed phase LC-MS. Subsequently, the MS data are extracted for the theoretical masses of the peptides containing the recognition sequence Trp-X-X-Trp, assuming the presence of one or more C-linked mannosyl residues. As an example, the detection of peptide KT81 (LVTEWGEWDD-C545), one of the C-mannosylated peptides from rF-spondin, is shown in Fig. 2A. The peptide, containing a single C-mannosylated Trp residue (theoretical average mass, 1396 Da), is identified eluting at 38.3 min. Strong indication for the presence of a (C2-Man-)Trp residue can already be obtained at this stage. Because of the stability of the C-C bond in aromatic C-glycosides, these compounds display a characteristic loss of 120 Da in low energy CID experiments (4,24,25). Probably as a result of in-source CID, this phenomenon can also be observed in LC-MS experiments, as illustrated for peptide KT81 in Fig. 2B. This behavior clearly distinguishes C-mannosylated peptides from ones that contain O-linked mannose, which show a loss of 162 Da under similar LC-MS conditions (Fig. 3). Information on the position of the modified Trp residue can be obtained from an MSMS experiment. The results (Fig. 4) confirm the identity of the peptide and provide strong evidence for C-hexosylation of Trp539. Because all hexoses add a mass of 162 Da to the Trp residue, MS does not reveal the identity of the sugar involved.
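The distinction between the two neutral losses can be sketched in a few lines of code. The following is only an illustration, not the procedure used in this work: it assumes nominal losses of 120 and 162 Da as stated above, assumes that the fragment ion keeps the charge of the precursor, and uses hypothetical m/z values, a hypothetical tolerance, and hypothetical function names.

```python
# Nominal neutral losses discussed in the text (Da).
LOSS_C_GLYCOSIDE = 120.0  # characteristic of aromatic C-glycosides, e.g. (C2-Man-)Trp
LOSS_O_HEXOSE = 162.0     # loss of an entire O-linked hexose


def observed_loss(precursor_mz: float, fragment_mz: float, charge: int) -> float:
    """Neutral loss in Da, assuming the fragment keeps the precursor charge."""
    return (precursor_mz - fragment_mz) * charge


def classify_loss(precursor_mz: float, fragment_mz: float, charge: int,
                  tol: float = 0.5) -> str:
    """Assign a neutral loss to a C-linked or an O-linked hexose.

    tol is a hypothetical matching tolerance in Da.
    """
    loss = observed_loss(precursor_mz, fragment_mz, charge)
    if abs(loss - LOSS_C_GLYCOSIDE) <= tol:
        return "C-linked hexose"
    if abs(loss - LOSS_O_HEXOSE) <= tol:
        return "O-linked hexose"
    return "unassigned"


if __name__ == "__main__":
    # Hypothetical doubly charged precursor at m/z 700.0.
    print(classify_loss(700.0, 640.0, 2))  # shift of 60 m/z units at z = 2
    print(classify_loss(700.0, 619.0, 2))  # shift of 81 m/z units at z = 2
```

For a doubly charged ion the 120-Da loss appears as a shift of 60 m/z units, which is why the charge state has to be taken into account before a loss is assigned.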
Initially, the proof for a C-α-mannosyl residue was obtained by NMR experiments on peptides from human RNase 2 (4), as well as on the intact protein (26). Furthermore, NMR analysis of peptides from interleukin 12β and complement factor C9 also yielded mannose as the sugar present (23,27). In addition, solid phase Edman degradation can be used for the analysis of smaller amounts of the peptides. Using an optimized elution protocol, phenylthiohydantoin-(C2-Man-)Trp elutes shortly after phenylthiohydantoin-Tyr (28). The elution position of phenylthiohydantoin-(Hex)Trp, with hexoses other than Man, remains to be established.
Following this approach, we have detected 48 (C2-Man-)Trp residues in 10 different proteins (Table I). In addition, it seems probable that a neuropeptide from Carausius morosus is also C-glycosylated (29).
O-Fucosylation-The recent discovery of the O-linked disaccharide Glc-Fuc-O-Ser/Thr in TSP-1 raises the question of whether the same modification also occurs in other proteins that contain TSRs. We examined recombinant rF-spondin and properdin and show results obtained with the former to illustrate the approach. Relevant peptides were produced by tryptic digestion of the carbamidomethylated protein. To simplify the analysis, a secondary digestion with an appropriate protease was performed when needed. Cleavage of peptide T61 with endoproteinase Asp-N yields peptide T61D (Asp-Asp-Cys-Ser-Ala-Thr548-Cys-Gly-Met-Gly-Met-Lys) with a monoisotopic mass of 1639.59 Da, which is 308 Da heavier than predicted from the cDNA sequence. Together with the partial loss of the substituent in low energy LC-MS experiments (data not shown), this strongly suggested the presence of O-linked sugars. Because nano-Q-TOF MSMS allows the direct determination of glycosylation sites in O-fucosylated glycopeptides (19), we analyzed all potential O-fucosylation sites in rF-spondin by this method. As an example, the results obtained with the doubly charged molecular ion of peptide T61D are shown in Fig. 5. The observation of B fragment ions (nomenclature of Domon and Costello (30)) of the Hex-dHex disaccharide (m/z = 309.13), the Hex (m/z = 163.07), and dHex (m/z = 147.09),4 as well as ions corresponding to their neutral loss of water, provides evidence for the presence of a disaccharide, rather than two monosaccharides. Furthermore, the consecutive loss of the Hex and dHex residues strongly indicates that the disaccharide is attached through the dHex. In addition to the deglycosylated fragment ions that confirmed the identity of the peptide, ions y7, y9, and b11 were observed with Hex-dHex and dHex still attached (Fig. 5). Despite the sometimes low abundance of these ions, their signals displayed good signal-to-noise ratios and high resolution (Fig. 5, inset).
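The mass arithmetic behind this assignment can be checked with a short calculation. The sketch below is an illustration under stated assumptions, not part of the published method: it uses standard monoisotopic residue masses, takes both Cys residues as carbamidomethylated (as in the digest described under "Experimental Procedures"), assumes unoxidized Met, and treats the sugar B ions as residue mass plus one proton. All variable and function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the residues in T61D.
MONO = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "T": 101.04768,
        "C": 103.00919, "D": 115.02694, "K": 128.09496, "M": 131.04049}
CARBAMIDOMETHYL = 57.02146  # added to each Cys (assumed fully alkylated)
WATER = 18.01056
PROTON = 1.00728
HEX = 162.05282    # hexose residue, e.g. Glc
DHEX = 146.05791   # deoxyhexose residue, e.g. Fuc


def peptide_mass(seq: str, cys_mod: float = CARBAMIDOMETHYL) -> float:
    """Monoisotopic mass of a linear peptide with modified Cys residues."""
    residues = sum(MONO[aa] + (cys_mod if aa == "C" else 0.0) for aa in seq)
    return residues + WATER


def mz(mass: float, z: int) -> float:
    """m/z of the [M + zH]z+ ion."""
    return (mass + z * PROTON) / z


T61D = "DDCSATCGMGMK"
bare = peptide_mass(T61D)    # mass predicted from the sequence alone
glyco = bare + HEX + DHEX    # with Hex-dHex attached (adds about 308.11 Da)

# Sugar B (oxonium) ions and the y1 ion of the C-terminal Lys.
oxonium = {"Hex": HEX + PROTON,
           "dHex": DHEX + PROTON,
           "Hex-dHex": HEX + DHEX + PROTON}
y1_lys = MONO["K"] + WATER + PROTON

if __name__ == "__main__":
    print(f"unglycosylated T61D: {bare:.2f} Da")
    print(f"T61D + Hex-dHex:     {glyco:.2f} Da, [M+2H]2+ at m/z {mz(glyco, 2):.2f}")
    for name, value in oxonium.items():
        print(f"{name} B ion: m/z {value:.2f}")
    print(f"y1 (Lys) minus dHex B ion: {y1_lys - oxonium['dHex']:.3f} m/z units")
```

Under these assumptions the calculated mass of the glycopeptide is about 1639.57 Da, within a few hundredths of a Da of the 1639.59 Da reported above. The calculation also shows that the dHex B ion and the y1 ion of a C-terminal Lys differ by less than 0.05 m/z units, so they can only be told apart at sufficient resolution.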
The results strongly indicated that Thr548 carries the Hex-dHex disaccharide. Following this approach, the same O-glycosylations were also found in TSR-1, -2, and -4. Table II summarizes all O-fucosylation sites that have been found in TSRs to date. It should be noted that whereas for TSP-1 and properdin the identities of the monosaccharides have been shown to be Glc and Fuc, this has not yet been determined for the peptides from rF-spondin.
4 The resolution was sufficient to distinguish the sugar oxonium ion from the y1 ion.

DISCUSSION
C-Mannosylation-The results presented here emphasize the abundance of C-mannosylation in TSRs. In fact, all 24 TSRs examined to date are modified in this way (Table I). However, other proteins, and possibly other protein families, also contain this glycosylation. For example, interleukin 12β is a member of the type 1 cytokine receptor superfamily and is C-mannosylated on Trp319 (27). Interestingly, nearly all members of this superfamily contain the so-called "WSXWS" box (31). The analysis of these proteins will be of interest, because mutations in this box in the erythropoietin receptor cause its transport to the cell surface to stop at the endoplasmic reticulum-Golgi interface (32).
The stoichiometry of C-mannosylation shown in Table I differs for different Trp residues. In 35 cases full modification was observed, whereas in the remainder it has been estimated to vary from <10 to 95% (see references in Table I for details). Estimating the degree of modification can be difficult. If the modified and unmodified peptides are obtained in pure form from the initial LC-MS experiment, their ratio can be obtained by integrating their peaks in the UV trace. If contamination with other peptides occurs, this procedure cannot be used. An approximate ratio may then be obtained from integration of the peaks in the total ion current trace. However, this will only give a rough estimate because of, for example, differences in ionization efficiency between modified and unmodified peptides (for a discussion see Ref. 23). Clearly, more work is needed on this aspect of the analysis of C-mannosylation and of covalent modifications in general.
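The estimate described above amounts to a ratio of integrated peak areas. The following minimal sketch makes the underlying assumption explicit; the function name, the `response_ratio` parameter, and the numerical values are hypothetical and are not taken from the cited work.

```python
def modification_fraction(area_modified: float, area_unmodified: float,
                          response_ratio: float = 1.0) -> float:
    """Fraction of peptide carrying the modification, from integrated peak areas.

    response_ratio is the (assumed) detector response of the modified peptide
    relative to the unmodified one. A value of 1.0 means equal response, which
    is reasonable for a UV trace of pure peaks but usually not for a total ion
    current trace, where ionization efficiencies may differ.
    """
    if area_modified < 0 or area_unmodified < 0 or response_ratio <= 0:
        raise ValueError("areas must be non-negative and response_ratio positive")
    corrected = area_modified / response_ratio
    total = corrected + area_unmodified
    if total == 0:
        raise ValueError("both peak areas are zero")
    return corrected / total


if __name__ == "__main__":
    # Hypothetical peak areas in arbitrary units.
    print(modification_fraction(95.0, 5.0))
    print(modification_fraction(50.0, 50.0, response_ratio=2.0))
```

The second call shows how strongly the result depends on the assumed response ratio: equal peak areas correspond to only one-third modification if the modified peptide is detected twice as efficiently.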
The results obtained with TSRs raise a question with respect to the specificity of C-mannosylation. Both in vivo studies with RNase 2 and in vitro studies using synthetic peptides have revealed that the recognition motif for the transferase is Trp-X-X-Trp, in which only the first Trp becomes modified (6,7,22,27). This specificity does not, however, hold for the TSRs. Of the 24 C-mannosylated TSRs, 22 contain the sequence Trp-X-X-Trp-X-X-X, without a Trp or other aromatic residue at position +3 relative to the second Trp. In 7 of these, only the first Trp is modified, in agreement with the RNase 2 results. However, in the other 15, both Trp residues are modified. Similarly, 2 TSRs contain a C-mannosylated sequence in which Tyr or Phe takes the place of the first Trp, i.e. (Tyr/Phe)-X-X-Trp (Table I). A possible explanation for these observations is that features outside of the motif determine whether one or both Trp residues become modified. Alternatively, more than one C-mannosyltransferase might exist. Cloning of the enzyme(s) catalyzing this reaction and the study of their enzymological properties should resolve this issue.
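Locating candidate sites for the Trp-X-X-Trp motif in a sequence is easily automated. The sketch below is an illustration only: it reports the first Trp of every Trp-X-X-Trp match, including overlapping ones, and says nothing about whether a site is actually modified. The function name and the numbering argument are hypothetical; the example sequence is peptide KT81 as given in the Results, whose last residue is Cys545.

```python
import re


def wxxw_sites(seq: str, first_residue_number: int = 1) -> list[int]:
    """Residue numbers of the first Trp in every Trp-X-X-Trp motif.

    A zero-width lookahead is used so that overlapping motifs
    (e.g. Trp-X-X-Trp-X-X-Trp) are all reported.
    """
    return [m.start() + first_residue_number
            for m in re.finditer(r"(?=W..W)", seq)]


if __name__ == "__main__":
    kt81 = "LVTEWGEWDDC"                 # C-terminal residue is Cys545
    start = 545 - (len(kt81) - 1)        # residue number of the first residue
    print(wxxw_sites(kt81, start))       # first Trp of the motif in KT81
    print(wxxw_sites("AWSPWSEWSA"))      # hypothetical overlapping motifs
```

For KT81 the single motif starts at residue 539, the Trp for which the MSMS data indicate C-hexosylation.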
Based on antibody inhibition studies, it has been proposed that the activity of F-spondin in axonal guidance is located in the TSRs (33). Whether the Trp-X-X-Trp motifs are involved is not known. By far the most studies on this motif have been performed with TSP-1. Synthetic peptides containing C-mannosylation motifs were shown to bind to and/or inhibit the interactions of TSP-1 with fibronectin, transforming growth factor-β, and heparin (34,35). It should be noted, however, that in all cases non-C-mannosylated peptides were used (for a review see Ref. 34). The two Trp residues were shown to be important for the interaction with heparin. Interestingly, the neurite-extending activity of F-spondin can be inhibited by this glycosaminoglycan.

FIG. 4. Low energy CID tandem-ESI MS of the C-mannosylated peptide KT61 from rF-spondin. The experiment was performed on an API 300 triple quadrupole mass spectrometer. The 120-Da loss, typical for aromatic C-glycosides, has been indicated with 120 and 60 for singly and doubly charged ions, respectively. Losses of H2O have been indicated with asterisks. W#, C-mannosylated tryptophan. Cysteine residues have been aminoethylated.
O-Fucosylation-The gas phase instability of O-linked fucose poses a challenging problem for the determination of the exact attachment site in peptides and proteins by MS methods. Because of the high sensitivity and excellent signal-to-noise ratio of Q-TOF instruments, it is often possible to successfully characterize this modification in peptides. The sugar-containing b- and y-ions needed for the identification are always detected at 10-100-fold lower levels than the corresponding unglycosylated ones. Using low picomolar amounts of peptide, low collision energy, and long acquisition times (see "Experimental Procedures"), glycosylated ions can often be identified unambiguously with nano-ESI Q-TOF MSMS.
Using this approach, we have identified three O-fucosylation sites in TSP-1, four in properdin, and four in rF-spondin (note that in this case the identity of the sugars remains to be determined). Whereas TSR2-4 in rF-spondin are fully modified at the sites indicated in Table II, TSR1 appears to be modified to less than 10%. In properdin, O-fucosylation was complete in TSR2-4, but it was only ~50% in TSR1.
Examination of the TSR superfamily reveals that 63 of 88 TSRs contain a Ser or Thr at the position corresponding to the O-fucosylation sites reported here (20). This suggests that the modification occurs in many more proteins. In all cases the O-fucosylation and C-mannosylation sites are in close proximity, yet it seems unlikely that one requires the other. Protein chemical studies on recombinant TSR3 from TSP-1 revealed the presence of (C2-Man-)Trp, Glc-Fuc-O-Thr, and Fuc-O-Thr, which occurred in all six possible combinations, strongly suggesting that the two glycosylations occur independently (20).
Three forms of O-fucosylation are presently known: attachment of (i) a single fucosyl residue to the insect proteinase inhibitor PMP-C and EGF modules (10,36), (ii) NeuAc-Gal-GlcNAc-Fuc- to EGF modules (15), and (iii) Glc-Fuc- to TSR modules (20) (this work). In addition to the functional studies on Notch mentioned in the Introduction, those on EGF-like domains in other proteins suggest that O-fucosylation has a function. Urokinase-type plasminogen activator is mitogenic for a number of cells in culture, and this activity depends on its EGF-like domain that carries a fucosyl residue on Thr18. The defucosylated domain binds as tightly to cells as the fucosylated one, but it is not able to produce the mitogenic signal (38-40). The effect of O-linked fucose on protein structure has been examined in PMP-C (41) and the EGF-like domain derived from blood coagulation factor VII (42). In both cases little difference in structure was found between the protein with and without fucose. However, the dynamic fluctuations of PMP-C were decreased, and its stability was increased upon O-fucosylation. The overall structures of PMP-C and the EGF-like module from factor VII are different. However, both are all-β proteins with a central β-sheet and three disulfide bridges. TSR modules are likely to have a β-sheet structure as well, based on their homology with the all-β proteins HB-GAM (heparin-binding growth-associated molecule) and midkine (43). If this proposition is correct, the O-fucosylation site in TSRs is located in a loop between the first two β-sheets.

Footnote b to Table I: Gäde et al. (29) hypothesize that the unknown hexose is attached to N1 of the indole.
The function of O-fucosylation in TSRs is not yet known. However, in a number of TSR proteins the sequence in which O-fucosylation takes place has been implicated in cell binding. For instance, the cell surface receptor CD36, which mediates the inhibition of neovascularization by TSP-1 in vivo, has been proposed to bind to the sequence CSVTCG (fucosylation site at the Thr residue; for a review see Ref. 34). It is important to note that unmodified synthetic peptides were used in these studies. The Golgi is the only secretory compartment containing a transporter for GDP-Fuc (44). It seems likely that O-fucosylation of TSR modules, like that of EGF modules (10), takes place there. Leukocyte adhesion deficiency II has recently been attributed to an aberrant form of the GDP-Fuc transporter, causing a general lack of fucosylation of glycoconjugates (45,46). The leukocyte-associated defects in this disease can be understood from the lack of α1,3-linked fucose in Lewis X antigen-based selectin ligands (47). However, the non-fucosylated glycoconjugates that cause the defects in mental development are unknown. It is, therefore, interesting to point out that the TSR proteins TSP-1, F-spondin, Unc-5, and the semaphorins 5A and B all play a role in neuronal development (34). Because O-fucosylation has now been found in the first two of these proteins, and several putative sites are also present in the others, analysis of their O-fucosylation status and functional properties in the context of the disease seems warranted.