To whom correspondence should be addressed: Biomolecular Frontiers Research Centre, Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW 2109, Australia. Tel.: +61 2 9850 7487; Fax: +61 2 9850 6192.
* M.T.-A. was supported by an Early Career Fellowship from the Cancer Institute, NSW, Australia and by a Macquarie University Research Development Grant (MQRDG). B.L.S. was supported by a National Health and Medical Research Council R.D. Wright Biomedical Career Development Fellowship Level 2 (APP1087975). N.P. acknowledges the funding of the Australian Research Council Centre of Excellence in Nanoscale Biophotonics (CE140100003).
1 The abbreviations used are: OST, oligosaccharyltransferase; DDA, data-dependent acquisition; DIA, data-independent acquisition; ECD, electron capture dissociation; ER, endoplasmic reticulum; ETD, electron transfer dissociation; EThcD, electron transfer and higher-energy collision dissociation; FDR, false discovery rate; Fuc, fucose; GalNAc, N-acetylgalactosamine; GlcNAc, N-acetylglucosamine; HCD, higher-energy collision dissociation; HILIC, hydrophilic interaction liquid chromatography; IR, insulin resistance; iTRAQ, isobaric tag for relative and absolute quantitation; Man, mannose; MRM, multiple reaction monitoring; NeuAc, N-acetylneuraminic acid; NeuGc, N-glycolylneuraminic acid; PGC, porous graphitized carbon; RP, reversed phase; SILAC, stable isotope labelling with amino acids in cell culture; SPE, solid phase extraction; XIC, extracted ion chromatogram.
The glycoproteome remains severely understudied because of significant analytical challenges associated with glycoproteomics, the system-wide analysis of intact glycopeptides. This review introduces important structural aspects of protein N-glycosylation and summarizes the latest technological developments and applications in LC-MS/MS-based qualitative and quantitative N-glycoproteomics. These maturing technologies provide unique structural insights into the N-glycoproteome and its synthesis and regulation by complementing existing methods in glycoscience. Modern glycoproteomics is now sufficiently mature to initiate efforts to capture the molecular complexity displayed by the N-glycoproteome, opening exciting opportunities to increase our understanding of the functional roles of protein N-glycosylation in human health and disease.
Protein glycosylation encompasses a broad class of post-translational modifications involving the covalent attachment of complex carbohydrates (glycans) to specific amino acid residues of polypeptide chains. The human biosynthetic machinery catalyzes diverse types of glycosylation, with the best studied being attachment of glycans to asparagine (N-glycosylation) and serine/threonine (O-glycosylation) residues (in: Varki A., Cummings R.D., Esko J.D., Freeze H.H., Stanley P., Bertozzi C.R., Hart G.W., Etzler M.E., Essentials of Glycobiology, 2nd Ed., Cold Spring Harbor (NY), 2009). As with all types of protein glycosylation, N-glycosylation is a template-less modification synthesized by a suite of glycosylation enzymes in the secretory pathway (Fig. 1A). Template-less synthesis means that glycosylation is determined by the physiological state of the glycosylation machinery and the nature of the proteins undergoing glycosylation. Jointly, these attributes determine the repertoire of glycans present on synthesized glycoproteins (glycoforms) and create the important features of protein site- and cell-specific glycosylation. Protein glycosylation is therefore a spatiotemporally dynamic modification that cells can utilize to respond to the constantly changing milieu.
Fig. 1. Overview of the biosynthesis and structural classes of mammalian protein N-glycosylation. A, Schematic summary of the biosynthetic machinery of N-glycoproteins. The enzymatic processing, which is initiated while the glycoproteins are still being translated, translocated, and folded, may terminate at any point in the enzymatic sequence depending partially on the Asn solvent accessibility of the maturely folded glycoprotein. This generates site-, cell-, and even subcellular-specific glycoform heterogeneity forming one of the functionally most important features of the glycoproteome, and also creates substantial analytical challenges. TGN: trans-Golgi network. B, Mammalian N-glycoproteins are typically divided into three main N-glycan classes: high mannose, hybrid, and complex type. Unusual paucimannosidic and chitobiose core type N-glycans arising from unconventional truncation pathways (dashed box) have been reported in specific cell types and physiological conditions.
N-linked glycans are typically present on asparagine residues in Asn-Xxx-Ser/Thr (Xxx ≠ Pro) consensus sequences (sequons) in humans. This preference is caused by specific recognition of the sequon by the peptide-binding site of an oligosaccharyltransferase (OST), the enzyme which catalyzes this reaction. However, it is now clear that mammalian cells can also glycosylate, albeit rarely, more relaxed sequons (e.g. Asn-Xxx-Cys) (“Glutamine-linked and non-consensus asparagine-linked oligosaccharides present in human recombinant antibodies define novel protein glycosylation motifs”). Low efficiency N-glycosylation in these noncanonical sequons is consistent with a role of the canonical sequon in recognition and high-affinity binding to OST to promote glycan transfer. Mammalian N-glycans share a common trimannosylchitobiose core composed of three mannose (Man) and two N-acetylglucosamine (GlcNAc) residues, extended with a variety of monosaccharides including Man, GlcNAc, galactose (Gal), fucose (Fuc), N-acetylgalactosamine (GalNAc) and sialic acids such as N-acetylneuraminic acid (NeuAc) and N-glycolylneuraminic acid (NeuGc). N-glycans can be further modified with noncarbohydrate moieties through phosphorylation, sulfation and acetylation. The conserved trimannosylchitobiose core of N-glycans is a remnant of the N-glycan precursor (Glc3Man9GlcNAc2) initially transferred to proteins in mammalian cells. This oligosaccharide structure is built stepwise on a dolichol pyrophosphate carrier embedded in the endoplasmic reticulum (ER) membrane, and is transferred en bloc to nascent polypeptides by OST. The terminal Glc and Man play critical roles in assisting glycoprotein folding and in ensuring glycoprotein quality control in the ER. After glycoproteins are correctly folded, the terminal Glc and Man are generally removed by α-glucosidases and α-mannosidases in the ER and cis-Golgi. N-glycans can then be extended by glycosyltransferases in the multiple Golgi compartments, potentially resulting in an extreme diversity of structures on mature glycoproteins in an organism-, cell-, or regulation-specific manner. The diverse mammalian N-glycans can be crudely classified into three conventional classes: high mannose, hybrid, and complex type (Fig. 1B). However, it is becoming clear that other unconventional N-glycan classes such as paucimannosidic and chitobiose core types decorate some mammalian glycoproteins (see below) (“Complementary LC-MS/MS-based N-glycan, N-glycopeptide, and intact N-glycoprotein profiling reveals unconventional Asn71-glycosylation of human neutrophil cathepsin G”). The structures, biosynthetic pathways, and associated disorders of N-glycosylation have been recently reviewed elsewhere and readers are encouraged to use these resources for a deeper introduction.
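The sequon rule described above lends itself to a simple sequence scan. The following is a minimal sketch, not a method from the literature reviewed here: the function name `find_sequons` and the toy sequence are hypothetical, and excluding Pro at the Xxx position of the relaxed Asn-Xxx-Cys sequon is an assumption carried over from the canonical rule. A sequon in the primary sequence indicates only a potential N-glycosylation site; whether the site is actually occupied has to be established experimentally.

```python
import re

# Canonical sequon from the text: Asn-Xxx-Ser/Thr with Xxx != Pro.
# The lookahead consumes only the Asn, so overlapping sequons
# (for example the two in "NNTT") are all reported.
CANONICAL = re.compile(r"N(?=[^P][ST])")

# Relaxed sequon mentioned in the text: Asn-Xxx-Cys.
# Assumption: Pro is excluded at Xxx, as for the canonical sequon.
RELAXED = re.compile(r"N(?=[^P]C)")


def find_sequons(sequence, include_relaxed=False):
    """Return (1-based Asn position, tripeptide, kind) tuples sorted by position."""
    seq = sequence.upper()
    patterns = [("canonical", CANONICAL)]
    if include_relaxed:
        patterns.append(("relaxed", RELAXED))
    hits = []
    for kind, pattern in patterns:
        for match in pattern.finditer(seq):
            start = match.start()
            hits.append((start + 1, seq[start:start + 3], kind))
    return sorted(hits)


# Hypothetical toy sequence, used only to exercise the function.
for position, tripeptide, kind in find_sequons("MNGTANPSLNCSNNTTANAC", include_relaxed=True):
    print(position, tripeptide, kind)
```

The lookahead form was chosen over a plain three-residue pattern so that adjacent asparagines, each with its own valid sequon, are not missed.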
A substantial proportion of mammalian genomes is dedicated to genes encoding proteins involved in glycosylation pathways, and these are highly conserved. Consistent with this, glycosylation is central to many biological processes. N-glycans are critical for enabling efficient glycoprotein folding and for maintaining the structural and functional integrity of glycoproteins. Finally, protein glycosylation is a strictly regulated modification process in healthy cells, and these biochemical processes are dysregulated in various pathologies including, but not restricted to, cancer. Disease-associated changes in protein glycosylation may arise from changes in glycoprotein abundance, glycosylation site occupancy (macro-heterogeneity or “glycosylation efficiency”), or glycan micro-heterogeneity at different sites on a glycoprotein (see details below). Changes in the glycoprotein micro-heterogeneity are dictated by the capacity of glycan-processing enzymes (glycosidases and glycosyltransferases) in the glycoprotein biosynthesis pathway, the nature of specific glycoprotein substrates, and other cellular factors.
This review summarizes the present analytical tools and technologies capable of performing large-scale (system-wide) analysis of protein N-glycosylation micro-heterogeneity and the unique structural insights that can be derived from such experiments by covering the very latest literature describing recent technological developments and applications in LC-MS/MS-based qualitative and quantitative N-glycoproteomics.
System-wide Structural Analysis of Protein N-glycosylation
Deciphering the glycosylation ‘code’ has been the ambition of generations of glycobiologists. The ability to accurately characterize the structure of glycoproteins is necessary if we are to succeed in our quest to unravel the diverse functions of glycans and develop the next generations of glycoprotein-based therapeutics. However, glycoproteins are challenging to characterize because of the multiple layers of structural diversity that form a spectrum of chemically similar glycoforms. The information needed to unambiguously characterize such heterogeneous glycoproteins is consequently much larger than for unmodified proteins or for proteins with structurally simple modifications such as phosphorylation or methylation. Even the most detailed modern glycoprotein characterization studies usually only capture part of the glycoprotein structure. Some glycoprotein structural features can be inferred or predicted from the biosynthetic constraints of the well-studied glycosylation machinery of mammalian cells. Nevertheless, it is important to stress that even incomplete structural information can often be very useful in deciphering the structure/function relationships of glycoproteins, and in identifying alterations in the biosynthetic glycosylation machinery.
Glycoproteomics is the site-specific analysis of the glycoproteome at the systems level, Fig. 2A. Glycoproteomics experimental workflows are typically initiated with protein extraction from biological samples, denaturation, and protease digestion, Fig. 2B. At this step, isotopic labels assisting in glycopeptide quantitation or enhancing their MS features (e.g. ionization and fragmentation) can be introduced. The resulting peptide mixtures are often extremely complex and glycopeptides are consequently typically enriched and/or prefractionated prior to detection, usually by LC-MS/MS. Glycoproteomics experiments are commonly based on the identification, and less frequently, also the quantitation of intact glycopeptides. Glycoproteomics yields system-wide information on the glycoprotein carriers, the glycan attachment sites, the occupancies of glycosylation at each site, and the structure and heterogeneity of the attached glycans. As showcased in this review by recent examples, glycoproteomics is a powerful technology to map disease-associated alterations in the glycoproteome, Fig. 3. Such glycosylation alterations may originate from multiple tissues, which may be differently regulated during pathogenesis. Typically, in glycoproteomics investigations, intact glycopeptides derived from the total complement of glycoproteins extracted from bodily fluids or complex tissues from healthy and diseased individuals (or other biological scenarios) are qualitatively and quantitatively compared. By also measuring any changes in glycoprotein abundance and site occupancy, the exact mechanism(s) contributing to the observed glycoproteome regulation can be interrogated (see example below).
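Because intact glycopeptides are detected by LC-MS/MS, candidate identifications are usually checked against the theoretical mass of a peptide plus its glycan composition. The following is a minimal sketch of that arithmetic, assuming standard monoisotopic monosaccharide residue masses; the function names and the peptide mass used in the example are hypothetical, and no particular search engine's scoring is implied.

```python
# Monoisotopic residue masses in Da (free monosaccharide minus one water).
RESIDUE_MASS = {
    "Hex": 162.05282,     # Man, Gal, Glc
    "HexNAc": 203.07937,  # GlcNAc, GalNAc
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
    "NeuGc": 307.09033,
}
WATER = 18.01056
PROTON = 1.007276


def glycan_residue_mass(composition):
    """Mass added to a peptide by a glycan of the given composition."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


def free_glycan_mass(composition):
    """Mass of the released (free, reducing-end) glycan."""
    return glycan_residue_mass(composition) + WATER


def glycopeptide_mz(peptide_mass, composition, charge):
    """Theoretical m/z of a protonated intact glycopeptide."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral = peptide_mass + glycan_residue_mass(composition)
    return (neutral + charge * PROTON) / charge


core = {"Hex": 3, "HexNAc": 2}        # trimannosylchitobiose core, Man3GlcNAc2
precursor = {"Hex": 12, "HexNAc": 2}  # Glc3Man9GlcNAc2
print(round(free_glycan_mass(core), 4))
print(round(free_glycan_mass(precursor), 4))
# 1000.0 Da is a placeholder peptide mass, not a real peptide.
print(round(glycopeptide_mz(1000.0, core, charge=2), 4))
```

Compositions are kept at the level of Hex and HexNAc because isomeric monosaccharides (for example Man and Gal) have identical masses and cannot be distinguished from the precursor mass alone.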
Fig. 2. A, Definitions and explanations of commonly used nomenclature in glycoproteomics. B, Generic workflow illustrating important components of a glycoproteomics experiment, which crudely can be divided into segments related to glycopeptide sample preparation (top box) and detection (bottom box). Typical examples of the individual components are provided. *Additional sample handling including glycoprotein derivatization and glycopeptide labeling for quantitation purposes may be introduced at this step.
Fig. 3. Three fundamental levels of molecular dysregulation of multiple tissues contributing to an altered secreted N-glycoproteome during disease. A, Hypothetical example illustrating three sources of dysregulation: 1) protein level (green), 2) site occupancy (blue), or 3) glycosylation micro-heterogeneity (red) from three separate tissues (Tissue A-C) contributing to a joint secreted N-glycoproteome (Protein A-C) in a body fluid derived from disease (right) and ‘normal’ healthy (left) condition. B, After proteolysis and enrichment, the altered abundance of the resulting glycopeptides can be detected using LC-MS/MS-based label-free quantitative glycoproteomics as shown by color-coded traces representing extracted ion chromatograms (XICs). However, establishing which of the three mechanisms causes the glycopeptide alterations for the detected glycoproteins may be challenging solely with glycopeptide analysis, especially in glycoproteomes arising from multiple tissues. Parallel quantitative proteomics and “deglycoproteomics” (detection of formerly occupied N-sites) of the same samples can assist in this task.
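The three levels of dysregulation can be separated arithmetically once parallel measurements are available. The sketch below rests on a simplifying assumption that is not stated in the source: that the abundance of a given glycopeptide is the product of protein abundance, site occupancy, and the fraction of the occupied site carrying that glycoform. Under that model, the fold change attributable to micro-heterogeneity is the glycopeptide fold change divided by the protein and occupancy fold changes. All function names and numbers are hypothetical.

```python
def relative_glycoform_abundances(xic_areas):
    """Normalize XIC areas of the glycoforms at one site so that they sum to 1."""
    total = sum(xic_areas.values())
    if total <= 0:
        raise ValueError("total XIC area must be positive")
    return {glycoform: area / total for glycoform, area in xic_areas.items()}


def micro_heterogeneity_fold_change(glycopeptide_fc, protein_fc, occupancy_fc):
    """Fold change in glycoform fraction under the multiplicative model.

    glycopeptide_fc: disease/control ratio from glycoproteomics
    protein_fc:      disease/control ratio from quantitative proteomics
    occupancy_fc:    disease/control ratio from deglycoproteomics
    """
    for value in (glycopeptide_fc, protein_fc, occupancy_fc):
        if value <= 0:
            raise ValueError("fold changes must be positive")
    return glycopeptide_fc / (protein_fc * occupancy_fc)


# Hypothetical numbers: a glycopeptide is 4-fold up, but the carrier protein
# is 2-fold up with unchanged occupancy, so only a 2-fold change remains
# attributable to glycan micro-heterogeneity.
print(micro_heterogeneity_fold_change(4.0, 2.0, 1.0))
print(relative_glycoform_abundances({"Man5": 30.0, "Man6": 10.0}))
```

Normalizing XIC areas across glycoforms of the same peptide also assumes comparable ionization efficiency between glycoforms, which may not hold, particularly for sialylated species.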
Many other analytical approaches can be used to characterize aspects of glycoprotein structural diversity. These include site-specific analysis of N-glycoproteins isolated to relative purity rather than in complex mixtures (“Complementary LC-MS/MS-based N-glycan, N-glycopeptide, and intact N-glycoprotein profiling reveals unconventional Asn71-glycosylation of human neutrophil cathepsin G”; “Purification and characterization of bioactive his6-tagged recombinant human tissue inhibitor of metalloproteinases-1 (TIMP-1) protein expressed at high yields in mammalian cells”; “Site-specific protein N- and O-glycosylation analysis by a C18-porous graphitized carbon-liquid chromatography-electrospray ionization mass spectrometry approach using pronase treated glycopeptides”; “A microarray-matrix-assisted laser desorption/ionization-mass spectrometry approach for site-specific protein N-glycosylation analysis, as demonstrated for human serum immunoglobulin M (IgM)”; “In-depth N-glycome profiling of paired colorectal cancer and non-tumorigenic tissues reveals cancer-, stage- and EGFR-specific protein N-glycosylation”), and identification and quantification of previously glycosylated sites on de-N-glycosylated proteins (“deglycoproteomics”) after removal of the entire glycan or with remnant N-glycan core remaining (“In-depth N-glycosylation reveals species-specific modifications and functions of the royal jelly protein from Western (Apis mellifera) and Eastern Honeybees (Apis cerana)”). Although these studies per se do not qualify under our definition of glycoproteomics (site-specific analysis of the glycoproteome at the intact glycopeptide level), they still provide useful information in conjunction with glycoproteomics for the glycobiologist, provided correct experimental design is applied. Thus, this review will highlight the very latest (∼2014–present) technological advances and applications in glycoproteomics, which have been instrumental for the recent performance improvements in detection limits, accuracy of glycopeptide identification and quantitation, and gains in glycoproteome coverage.
The modern discipline of glycoproteomics has deep roots in the protein and carbohydrate analytical chemistry pioneered in the late 1980s and early 1990s with the advent of biomolecular mass spectrometry. Impressive analytical strategies using relatively insensitive MS instrumentation were developed, e.g. the selective detection of glycopeptides in mixtures using deglycosylation-based mass shifts (“Collisional fragmentation of glycopeptides by electrospray ionization LC/MS and LC/MS/MS: methods for selective detection of glycopeptides in protein digests”; “Effect of the reducing-terminal substituents on the high energy collision-induced dissociation matrix-assisted laser desorption/ionization mass spectra of oligosaccharides”). These early studies remain a solid foundation on which many modern glycoanalytical strategies are conceived. It is also clear that glycoproteomics has more recently profited handsomely from technology developments arising from the larger and more mature discipline of proteomics including sample handling, LC-MS/MS acquisition strategies, and data handling and processing. In parallel, glycoproteomics has been a beneficiary of the continual performance enhancements of modern mass spectrometers including improved speed, sensitivity, resolution, and accuracy, most notably implemented on the latest Q-TOF (Sciex, Waters, Agilent, Bruker) and on multiple Orbitrap (Thermo) instrument platforms. Developments and applications of several key glycoproteomics-specific technologies have additionally been critical for the rapid maturation of glycoproteomics workflows over the past two years (Table I). Specifically, key advances have been made in the enrichment of intact glycopeptides from complex peptide mixtures, in LC-MS/MS-based detection of intact glycopeptides through optimized dissociation and acquisition styles of glycopeptides, and in data handling for more automated, yet still confident, glycopeptide identification and quantitation.
Table I. Latest developments and applications of the key components in N-glycoproteomics
(Table body not recoverable from this copy; the works cited in its cells, with duplicates removed, are listed below.)
Evaluation of a combined glycomics and glycoproteomics approach for studying the major glycoproteins present in biofluids: Application to cerebrospinal fluid.
Strategy integrating stepped fragmentation and glycan diagnostic ion-based spectrum refinement for the identification of core fucosylated glycoproteome using mass spectrometry.
Multilayer hydrophilic poly(phenol-formaldehyde resin)-coated magnetic graphene for boronic acid immobilization as a novel matrix for glycoproteome analysis.
Novel LC-MS(2) product dependent parallel data acquisition function and data analysis workflow for sequencing and identification of intact glycopeptides.
Increasing the productivity of glycopeptides analysis by using higher-energy collision dissociation-accurate mass-product-dependent electron transfer dissociation.
New glycoproteomics software, GlycoPep Evaluator, generates decoy glycopeptides de novo and enables accurate false discovery rate analysis for small data sets.
MassyTools: A high-throughput targeted data processing tool for relative quantitation and quality control developed for glycomic and glycoproteomic MALDI-MS.
Isotope-coded carbamidomethylation for quantification of N-glycoproteins with online microbore hollow fiber enzyme reactor-nanoflow liquid chromatography-tandem mass spectrometry.
* Hydrazide is usually used to capture N-glycopeptides with a subsequent peptide N-glycosidase F release of formerly N-glycosylated peptides for glycosylation site mapping and is, thus, not a generally used tool in glycoproteomics.
Enrichment and Prefractionation Strategies for Glycoproteomics
Because of the substoichiometry of glycopeptides in complex peptide mixtures arising from the extensive glycan micro- and macro-heterogeneity, and their inherently poor detectability, glycopeptide enrichment is a critical component of glycoproteomics experiments. Recent advances in glycopeptide enrichment have been fully reviewed elsewhere. Exciting initiatives in glycopeptide enrichment strategies include the optimized use of boronic acid as a reversible glycopeptide capture method on magnetic graphene (“Multilayer hydrophilic poly(phenol-formaldehyde resin)-coated magnetic graphene for boronic acid immobilization as a novel matrix for glycoproteome analysis”; “Identification of salivary N-glycoproteins and measurement of glycosylation site occupancy by boronate glycoprotein enrichment and liquid chromatography/electrospray ionization tandem mass spectrometry”) and applications (see Table I), but this approach is most commonly used in conjunction with peptide N-glycosidase F-catalyzed release and analysis of formerly N-glycosylated peptides and does not satisfy our definition of glycoproteomics. Other new enrichment methods of interest include the metabolic incorporation of N-azido sugars into N-glycopeptides to facilitate their specific enrichment and detection. [...] solid phase extraction (SPE) for efficient enrichment of glycopeptides and sialoglycopeptides, respectively, are also promising developments. However, common to most of these proof-of-principle methodology studies is the need for further validation to demonstrate their true potential in glycoproteomics. The frequently used zwitterionic-HILIC SPE-based methods for enrichment and analysis of intact N-glycopeptides (“A new strategy for identification of N-glycosylated proteins and unambiguous assignment of their glycosylation sites using HILIC enrichment and partial deglycosylation”) have been further tested. Usefully, it was found that N-glycopeptides are still efficiently retained when using higher concentrations (0.1%) of surfactants and detergents including SDS and Triton X-100 (“N-Glycosylation site analysis of proteins from Saccharomyces cerevisiae by using hydrophilic interaction liquid chromatography-based enrichment, parallel deglycosylation, and mass spectrometry”). As all N-glycopeptides harbor a minimum degree of localized hydrophilicity arising from a high density of polar hydroxyl groups, HILIC remains the most used and, in our opinion, the most efficient and least biased enrichment method facilitating large-scale analysis of intact and native (nonderivatized) glycopeptides in N-glycoproteomics.
Only a few developments in the off-line separation and fractionation of glycopeptides prior to LC-MS/MS detection have recently been published. Some glycoproteomics approaches even bypass this step because of the increased capacity of modern LC-MS/MS instrumentation to handle the extreme complexity of biologically relevant glycopeptide mixtures, and perhaps because the multiple LC fractions resulting from off-line separations dramatically increase the required LC-MS/MS instrument time, lower the overall sensitivity of the analysis, and complicate the downstream quantitative data analysis.