* This work was supported in part with federal funds from the National Institutes of Health (K25CA137222, R01CA107209, K07CA116296, R01DK081368) and a grant from Canary Foundation (TAB). RA acknowledges funding from the Swiss National Science Foundation (3100A0-107679). 1 The abbreviations used are: CID, collision-induced dissociation; Q, quadrupole.
Glycosylation is one of the most important and common forms of protein post-translational modification that is involved in many physiological functions and biological pathways. Altered glycosylation has been associated with a variety of diseases, including cancer, inflammatory and degenerative diseases. Glycoproteins are becoming important targets for the development of biomarkers for disease diagnosis, prognosis, and therapeutic response to drugs. The emerging technology of glycoproteomics, which focuses on glycoproteome analysis, is increasingly becoming an important tool for biomarker discovery. An in-depth, comprehensive identification of aberrant glycoproteins, and further, quantitative detection of specific glycosylation abnormalities in a complex environment require a concerted approach drawing from a variety of techniques. This report provides an overview of the recent advances in mass spectrometry based glycoproteomic methods and technology, in the context of biomarker discovery and clinical application.
With recent advances in proteomics, analytical and computational technologies, glycoproteomics—the global analysis of glycoproteins—is rapidly emerging as a subfield of proteomics with high biological and clinical relevance. Glycoproteomics integrates glycoprotein enrichment and proteomics technologies to support the systematic identification and quantification of glycoproteins in a complex sample. The recent development of these techniques has stimulated great interest in applying the technology in clinical translational studies, in particular, protein biomarker research.
While glycomics is the study of the glycome (the repertoire of glycans), glycoproteomics focuses on studying the profile of glycosylated proteins, i.e. the glycoproteome, in a biological system. Considerable work has been done to characterize the sequences and primary structure of the glycan moieties attached to proteins (
). In contrast, this review is focused on recent developments in glycoproteomic techniques and their unique application and technical challenge to biomarker discovery.
Glycoproteomics in Biomarker Discovery and Clinical Study
Most secretory and membrane-bound proteins produced by mammalian cells contain covalently linked glycans with diverse structures (
). The glycosylation form of a glycoprotein is highly specific at each glycosylation site and generally stable for a given cell type and physiological state. However, the glycosylation form of a protein can be altered significantly because of changes in cellular pathways and processes resulting from diseases, such as cancer, inflammation, and neurodegeneration. Such disease-associated alterations in glycoproteins can happen in one or both of two ways: 1) protein glycosylation sites are hypo-, hyper-, or newly glycosylated; and/or 2) the glycosylation form of the attached carbohydrate moiety is altered. In fact, altered glycosylation patterns have long been recognized as hallmarks of cancer progression, in which tumor-specific glycoproteins are actively involved in neoplastic progression and metastasis (
). Sensitive detection of such disease-associated glycosylation changes and abnormalities can provide a unique avenue to develop glycoprotein biomarkers for diagnosis and prognosis. In addition, intervention in the glycosylation and carbohydrate-dependent cellular pathways represents a potential new modality for cancer therapies (
Protein biomarker development is a complex and challenging task. The criteria and approach applied for developing each individual biomarker can vary, depending on the purpose of the biomarker and the performance requirement for its clinical application (
). In general, it has been suggested that the preclinical exploratory phase of protein biomarker development can be divided into four stages (
), namely initial discovery of differential proteins; testing and selection of qualified candidates; verification of a subset of candidates; and assay development and pre-clinical validation of potential biomarkers. Thanks to recent technological advances, mass spectrometry-based glycoproteomics is now playing a major role in the initial phase of discovering aberrant glycoproteins associated with a disease. Glycoprotein enrichment techniques, coupled with multidimensional chromatographic separation and high-resolution mass spectrometry, have greatly enhanced the analytical dynamic range and limit of detection for glycoprotein profiling in complex samples such as plasma, serum, other bodily fluids, or tissue. In addition, candidate-based quantitative glycoproteomics platforms have been introduced recently, allowing targeted detection of glycoprotein candidates in complex samples in a multiplexed fashion and providing a tool for glycoprotein biomarker verification that complements antibody-based approaches. It is clear that glycoproteomics is gaining momentum in biomarker research.
Glycoproteomics Approaches
Glycoproteomic analysis is complicated not only by the variety of carbohydrates, but also by the complex linkage of the glycan to the protein. Glycosylation can occur at several different amino acid residues in the protein sequence. The most common and widely studied forms are N-linked and O-linked glycosylation. O-linked glycans are linked to the hydroxyl group on serine or threonine residues. N-linked glycans are attached to the amide group of asparagine residues in a consensus Asn-X-Ser/Thr sequence (X can be any amino acid except proline) (
). Other known, but less well studied, forms of glycosylation include glycosylphosphatidylinositol anchors attached to the protein carboxyl terminus and C-glycosylation, which occurs on tryptophan residues (
). The discussion that follows focuses on glycoproteomic analysis of the most common N-linked and O-linked glycoproteins.
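The Asn-X-Ser/Thr consensus rule described above can be expressed as a short pattern search. The following is a minimal sketch, not a method from the cited literature: the function name `find_sequons` and the example sequence are hypothetical, and a match only marks a potential N-glycosylation site, since not every sequon is occupied in vivo.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping motifs (e.g. "NNST") each be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues inside Asn-X-Ser/Thr motifs."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Hypothetical sequence, for illustration only.
example = "MKNGSANPTLNNSTQ"
print(find_sequons(example))  # [3, 11, 12]
```

In this example the Asn at position 7 is skipped because it is followed by proline, which the consensus rule excludes.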
A comprehensive analysis of glycoproteins in a complex biological sample requires a concerted approach. Although the specific methods for sample preparation can be different for different types of samples (e.g. plasma, serum, tissue, and cell lysate), a glycoproteomics pipeline typically consists of glycoprotein or glycopeptide enrichment, multidimensional protein or peptide separation, tandem mass spectrometric analysis, and bioinformatic data interpretation. For glycoprotein-based enrichment methods, proteolytic digestion can be performed before or after glycan cleavage, depending on the specific workflow and enrichment methods used. For glycopeptide enrichment, proteolytic digestion is typically performed before the isolation step so that glycopeptides, instead of glycoproteins, can be captured. For quantitative glycoproteomics profiling, additional steps, such as differential stable isotope labeling of the sample and controls, are required. Fig. 1 illustrates the general strategy for an integrated glycoproteomics analysis.
Fig. 1. The strategies of mass spectrometry-based glycoproteomic analysis.
Glycoproteins or glycopeptides can be effectively enriched using a variety of techniques (see below). Following the enrichment step, the workflow then splits into two directions: glycan analysis and glycoprotein analysis. The strategies for glycan analysis have been discussed in several reviews and will not be covered in this report. For glycoprotein analysis, bottom-up workflows (“shotgun proteomics”—peptide based proteomics analysis) (
) are still most common, providing not only detailed information of a glycoprotein profile, but also the specific mapping of glycosylation sites. It is notable that the reliable analysis of mass spectrometric data in glycoproteomic studies largely relies on bioinformatic tools and glyco-related databases that are available. An increasing number of algorithms and databases for glycan analysis have been developed and well documented in several recent reviews (
). For glycoprotein and glycopeptide sequence analysis, a large number of well-characterized and annotated glycoproteins can be found in the UniProt Knowledgebase. In addition, many glycopeptide mass spectra are now available in the continually expanding PeptideAtlas library (
), which stores millions of high-resolution peptide fragment ion mass spectra acquired from a variety of biological and clinical samples for peptide and protein identification. Ultimately, all the data obtained from different aspects of the workflow need to be merged and interpreted in an integrated fashion so that the full extent of glycosylation changes associated with a particular biological state can be better revealed. To the best of our knowledge, a complete glycoform analysis has not yet been accomplished for any glycoprotein with multiple glycosylation sites in a specific cell type under any specific condition. Current technology can define the glycan complement and profile the glycoproteins, but cannot yet put the two together to define the molecular species present. To date, such integrated studies remain highly challenging, even with advanced tandem mass spectrometry technologies and growing bioinformatic resources (
Characterization of the glycoproteome in a complex biological sample such as plasma, serum, or tissue, is analytically challenging because of the enormous complexity of protein and glycan constituents and the vast dynamic range of protein concentration in the sample. The selective enrichment of the glycoproteome is one of the most efficient ways to simplify the enormous complexity of a biological sample to achieve an in-depth glycoprotein analysis. Two approaches for glycoprotein enrichment have been widely applied: lectin affinity based enrichment methods (
). Recent studies have demonstrated that the two methods are complementary and provide a very effective means for the enrichment of glycoproteins or glycopeptides from human plasma and other bodily fluids (
). There are a variety of lectin species that can selectively bind to different oligosaccharide epitopes. For instance, concanavalin A (ConA) binds to mannosyl and glucosyl residues of glycoproteins (
). Lectin affinity enrichment has been designed to enrich glycoproteins with specific glycan attachment from plasma, serum, tissue, and other biological samples through affinity chromatography and other methods. Multiple lectin species can also be combined to isolate multiple types of glycoproteins in complex biological samples (
). Several reports have demonstrated a multilectin column approach to achieve a global enrichment of glycoproteins with various glycan attachments from serum and plasma (
). A recent study has developed a “filter aided sample preparation (FASP)” based method, which allows highly efficient enrichment of glycopeptides using multi-lectins (
). To date, most of the work using lectin affinity for targeted glycoprotein enrichment has focused on N-glycosylation because the binding specificity of lectins for O-glycosylation is less satisfactory. To overcome this limitation, efforts have been made to isolate O-glycopeptides from human serum using serial lectin columns of concanavalin A and jacalin in tandem (
A hydrazide chemistry-based method has been applied to isolate glycoproteins and glycopeptides through the formation of covalent bonds between the glycans and the hydrazide groups (
). The carbohydrates on glycoproteins are first oxidized to form aldehyde groups, which subsequently react with hydrazide groups that are immobilized on a solid surface. The chemical reaction conjugates the glycoproteins to the solid phase by forming a covalent hydrazone bond. Although, conceptually, the majority of the glycoproteins in a biological sample can be captured using this method, further analysis of the captured glycoproteins is limited in practice by the methods available to cleave glycoproteins or glycopeptides from the solid phase. Because there is a lack of efficient enzymes or chemicals that can specifically deglycosylate and/or release O-linked glycoproteins or glycopeptides from the solid phase, most studies have applied this method solely to N-linked glycoprotein analysis. PNGase F is an enzyme that can specifically release N-glycosylated proteins or peptides (except those carrying α1→3 linked core fucose (
)) from their corresponding oligosaccharide groups. The hydrazide chemistry method is not only highly efficient in enriching N-linked glycoproteins or glycopeptides from a complex environment, but also allows great flexibility in its applications, such as capturing extracellular N-glycoproteins on live cells to monitor changes in their abundance caused by cell activation, differentiation, or other cellular activities (
). This method can be readily automated for analyzing a large quantity of samples.
Recent studies have compared glycoprotein isolation methods. One study assessed lectin-based protocols and hydrophilic interaction chromatography for their performance in enriching glycoproteins and glycopeptides from serum (
). Other studies compared lectin affinity and hydrazide chemistry methods for their efficiency in isolating glycoproteins and glycopeptides from a complex biological sample (
). The methods are complementary in enriching glycoproteins because they capture glycoproteins through different mechanisms. Applying both methods significantly improves the coverage of the glycoproteome, resulting in an increased number of glycoproteins identified. The lectin affinity method can be tailored, by using different lectins, to isolate glycoproteins with specific glycan structure(s), thus affording flexibility in glycoproteomic studies. The hydrazide chemistry method has been widely used for studying N-linked glycosylation. Hydrazide chemistry essentially reacts with all proteins that carry carbonyl groups, which may include glycoproteins with oxidized glycans (
). The high specificity of this method may mainly result from the specificity of PNGase F, the enzyme cleaving N-glycosidic bonds to release N-glycoproteins and peptides from the solid phase. This method affords high efficiency and specificity in enriching N-linked glycoproteins or glycopeptides from a complex sample, and can be easily incorporated into a proteomics workflow for integrated analysis. In addition to the lectin and hydrazide chemistry-based methods, it has been suggested that boronic acid-based solid phase extraction may also be useful for an overall glycoproteome enrichment (
Mass spectrometry, because of its high sensitivity and selectivity, has been one of the most versatile and powerful tools in glycoprotein analysis, to identify the glycoproteins, evaluate glycosylation sites, and elucidate the oligosaccharide structures (
) for glycoprotein characterization in a complex sample is still technically challenging with the current technology. The most versatile and widely used current glycoproteomics methods are based on characterizing glycopeptides generated by the digestion of glycoproteins, analyzing either deglycosylated glycopeptides or intact glycopeptides with glycan attachment, as illustrated in Fig. 1.
The direct analysis of intact glycopeptides with carbohydrate attachments is complicated by the mixed information obtained from the fragment ion spectra, which may include fragment ions from the peptide backbone, the carbohydrate group, and combinations of both. Although it is technically challenging to comprehensively analyze intact glycopeptides on a global scale for a complex biological sample, complementary information regarding peptide backbone and glycan structure can likely be obtained in a single measurement. Early work using collision-induced dissociation (CID) has identified a few key features that are characteristic of the fragmentation of glycopeptides, providing the basis for intact glycopeptide identification (
). The analysis of intact glycopeptides has been carried out using a variety of different instruments, including electrospray ionization (ESI)-based ion trap (IT) (
) mass spectrometers. In general, the CID generated MS/MS spectrum of a glycopeptide is dominated by B- and Y-type glycosidic cleavage ions (carbohydrate fragments) (
), and b- and y-type peptide fragments from the peptide backbone. However, the MS/MS fragmentation data obtained from different instruments can differ markedly in the structural information they provide on the glycan and the peptide backbone, depending on the experimental settings and instrumentation used for mass analysis, including ionization methods, collision techniques, and mass analyzers. Low-energy CID with electrospray ionization-based ion trap, Fourier transform-ion cyclotron resonance, and Q/TOF instruments predominantly cleaves glycosidic bonds. Increasing the collision energy on Fourier transform-ion cyclotron resonance and Q/TOF instruments results in more efficient generation of b- and y-ions from the peptide backbone. MALDI generates predominantly singly charged precursor ions, which are more stable and are usually fragmented at higher energies via CID or post-source decay (PSD), generating fragments from both the peptide backbone and the glycan (
). Although Q/TOF instruments have been widely used for intact glycopeptide characterization, one unique feature of the ion trap instrument is that it allows repeated ion isolation/CID fragmentation cycles, which can provide a wealth of complementary information to interpret the structure of a glycan moiety and peptide backbone (
). Recently, fragmentation techniques using different mechanisms from CID have been introduced and applied for glycopeptide analysis, including infrared multiphoton dissociation (IRMPD) (
). Infrared multiphoton dissociation and electron-capture dissociation are largely performed with Fourier transform-ion cyclotron resonance instruments. Complementary to CID fragmentation, electron-capture dissociation and electron-transfer dissociation tend to cleave the peptide backbone with no loss of the glycan moiety, providing specific information for localizing the glycosidic modification. More details regarding mass spectrometric analysis of intact glycopeptides can be found in recent reviews (
) glycopeptides, the interpretation of the fragment spectrum of an intact glycopeptide still requires intensive manual assignment and evaluation. A recent study has demonstrated the feasibility of developing an automated workflow for analyzing intact glycopeptides in mixtures (
). In general, however, a high throughput, large scale profiling of intact glycopeptides in a complex sample still remains a challenge with current technology.
The analysis of deglycosylated peptides requires the removal of glycan attachments from glycopeptides. Fortunately, for N-linked glycopeptides, the N-glycosidic bond can be specifically cleaved using the enzyme PNGase F, providing deglycosylated peptides, which can then be analyzed directly using shotgun proteomics. The PNGase F-catalyzed deglycosylation results in the conversion of asparagine to aspartic acid in the glycopeptide sequence, which introduces a mass difference of 0.9840 Da. Such distinct mass differences can be used to precisely map the N-linked glycosylation sites using high-resolution mass spectrometers. Stable isotope labeling introduced by enzymatic cleavage of glycans in H₂¹⁸O has also been used to enhance the precise identification of N-glycosylation sites (
). The removal of O-linked glycans is less straightforward; most assays rely on chemical deglycosylation methods, such as trifluoromethanesulfonic acid (
). The application of these methods suffers from a variety of limitations, such as low specificity for O-linked glycosylation, degradation of the peptide backbone, and modifications of the amino acid residues, all of which can complicate or compromise O-linked glycoproteomics analysis in a complex sample. Most of the large-scale glycoproteomics studies using the deglycosylation approach have focused on N-glycoproteins, which are prevalent in blood and a rich source for biomarker discovery. O-glycosylation lacks a common core, a consensus sequence, and a universal enzyme that can specifically remove the glycans from the peptide backbone, and is thus more challenging to analyze in large-scale profiling.
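The 0.9840 Da difference quoted above for PNGase F deglycosylation can be checked with simple arithmetic: converting the asparagine side-chain amide to a carboxyl group swaps an NH₂ for an OH, a net change of +O −N −H. The sketch below is illustrative only. The atomic masses are standard monoisotopic values rather than figures from this review, and the ¹⁸O case assumes that a single ¹⁸O atom from the labeled water ends up in the new carboxyl group.

```python
# Standard monoisotopic atomic masses in Da (not taken from the review).
H = 1.00782503
N = 14.00307401
O16 = 15.99491462
O18 = 17.99915961

def deamidation_shift(oxygen_mass: float = O16) -> float:
    """Mass change when a side-chain -NH2 is replaced by -OH (Asn -> Asp)."""
    return oxygen_mass - (N + H)

print(f"Asn -> Asp in normal water: {deamidation_shift():.4f} Da")
print(f"Asn -> Asp in 18O water:    {deamidation_shift(O18):.4f} Da")
```

Under these assumptions the first value reproduces the 0.9840 Da shift given in the text. The larger shift in ¹⁸O-labeled water is one reason the labeling helps distinguish enzymatically released sites.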
Following deglycosylation, the glycopeptides can be treated and analyzed as stripped peptides using a shotgun proteomics pipeline. MS/MS fragment spectra with b-ions and y-ions generated from CID are searched against protein databases using search algorithms, such as SEQUEST (
), to provide peptide and protein identifications with a known false discovery rate. The N-glycosylation sites can be precisely mapped using the consensus sequence of Asn-X-Ser/Thr, in which asparagine is converted to aspartic acid following enzymatic cleavage, introducing a mass difference of 0.9840 Da. A variety of mass spectrometers have been used to analyze glycoproteins, in particular N-linked glycoproteins, in complex biological and clinical samples using the deglycosylation approach. The instruments used in these studies include electrospray ionization-based ion trap (
). Recently, an attempt was made to apply ion mobility-mass spectrometry (IM-MS) to characterize deglycosylated glycopeptides and the corresponding carbohydrates simultaneously (
) in a single measurement. The approach of analyzing deglycosylated glycopeptides makes it possible to utilize available proteomics technology for large-scale glycoproteome profiling, especially N-linked glycoproteins, in a high-throughput fashion.
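The site-mapping logic described above, an Asn-to-Asp mass shift that falls within the Asn-X-Ser/Thr consensus sequence, can be sketched as a simple filter. This is an assumption-laden illustration rather than the behavior of SEQUEST or any other search engine: the function names, the 0.005 Da tolerance, and the example data are all hypothetical, and positions are taken to refer to the unmodified protein sequence.

```python
ASN_TO_ASP = 0.9840  # Da, the mass difference quoted in the text

def in_sequon(protein: str, pos: int) -> bool:
    """True if the 1-based position is an Asn in Asn-X-Ser/Thr (X is not Pro)."""
    i = pos - 1
    if i < 0 or i + 2 >= len(protein):
        return False
    return protein[i] == "N" and protein[i + 1] != "P" and protein[i + 2] in "ST"

def candidate_glycosites(protein, observed, tol=0.005):
    """Keep sites whose measured shift matches Asn->Asp and that lie in a sequon.

    observed: iterable of (1-based position, measured mass shift in Da).
    tol: hypothetical mass tolerance in Da; choose it to suit the instrument.
    """
    return [pos for pos, shift in observed
            if abs(shift - ASN_TO_ASP) <= tol and in_sequon(protein, pos)]

# Hypothetical protein sequence and observations, for illustration only.
protein = "MKNGSANPTLNNSTQ"
observed = [(3, 0.9841), (7, 0.9838), (12, 0.9500)]
print(candidate_glycosites(protein, observed))  # [3]
```

Position 7 is rejected because it is followed by proline, and position 12 because its measured shift falls outside the tolerance. Requiring both conditions is a common way to reduce false assignments from deamidation that is unrelated to glycosylation.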
Glycoproteomics Analysis in Blood and Other Bodily Fluids
An important target for blood-based diagnostic assays involves the detection and quantification of glycosylated proteins. Glycosylated proteins, especially N-linked glycoproteins, are ubiquitous among the proteins destined for extracellular environments (
), such as plasma or serum. A systematic and in-depth global profiling of the blood glycoproteome can provide fundamental knowledge for blood biomarker development, and is now possible with the development of glycoproteomics technologies. In the past few years, several large scale proteomics studies on profiling the glycoproteome of human plasma and serum have been reported (
), immunoaffinity subtraction and hydrazide chemistry were applied to enrich N-glycoproteins from human plasma. The captured plasma glycoproteins were subjected to two-dimensional liquid chromatography separation followed by tandem mass spectrometric analysis. A total of 2053 different N-glycopeptides were identified, covering 303 nonredundant glycoproteins, including many glycoproteins with low abundance in blood (
). In a different study, a hydrazide chemistry-based solid phase extraction method was applied to enhance the detection of tissue-derived proteins in human plasma (
Semiautomated high-sensitivity profiling of human blood serum glycoproteins through lectin preconcentration and multidimensional chromatography/tandem mass spectrometry.
). These studies provide detailed identification regarding the individual N-glycosylation sites using high-resolution mass spectrometry. The efforts made in global profiling of glycoproteins in plasma and serum have not only greatly enhanced our understanding of the blood glycoproteome, but also have facilitated the development of new technologies that can be used for glycoprotein biomarker discovery. A variety of experimental designs and strategies for blood glycoprotein profiling have been applied for clinical disease studies, including prostate cancer (
Identification of candidate biomarkers with cancer-specific glycosylation in the tissue and serum of endometrioid ovarian cancer patients by glycoproteomic analysis.
Comparative serum glycoproteomics using lectin selected sialic acid glycoproteins with mass spectrometric analysis: application to pancreatic cancer serum.
). Most of these studies focused on the early stages of glycoprotein biomarker discovery and many of them exploited multilectin affinity techniques to isolate glycoproteins from serum or plasma.
Glycoproteomics techniques have also been applied to study the glycoproteome of other bodily fluids. The complementary application of hydrazide chemistry-based solid phase extraction and lectin affinity methods has led to the identification of 216 glycoproteins in human cerebrospinal fluid (CSF), including many of low abundance (
). In our own study, 48 glycoproteins have so far been identified in pancreatic juice (unpublished data), adding complementary information to the pancreatic juice protein database (
Glycoproteomics Analysis of Tissue and Cell Lysates
Protein glycosylation has been increasingly recognized as one of the prominent alterations involved in tumorigenesis, inflammation, and other disease states. The study of glycoproteins in cells and tissues carries great promise for defining diagnostic biomarkers and therapeutic targets. The glycoproteomics studies in liver tissue (
) have provided a fundamental understanding of the liver glycoproteome and identified protein candidates that are associated with highly metastatic liver cancer cells. In one of the studies, hydrazide chemistry and multiple enzyme digestion provided a complementary identification of 939 N-glycosylation sites covering 523 nonredundant glycoproteins in human liver tissue (
Identification of candidate biomarkers with cancer-specific glycosylation in the tissue and serum of endometrioid ovarian cancer patients by glycoproteomic analysis.
). Glycoproteomics studies have also been carried out to study hepatocellular carcinoma. Concanavalin A immobilized on magnetic nanoparticles was used to selectively enrich N-glycoproteins in a hepatocellular carcinoma cell line, leading to the identification of 184 glycosylation sites corresponding to 101 glycoproteins (
Concanavalin A-immobilized magnetic nanoparticles for selective enrichment of glycoproteins and application to glycoproteomics in hepatocelluar carcinoma cell line.
). In a different study, complementary methods of hydrophilic affinity and hydrazide chemistry were applied to investigate the secreted glycoproteins from a hepatocellular carcinoma cell line, in which 300 different glycosylation sites within 194 glycoproteins were identified (
). While many of these studies focused on N-glycoproteins, mucin-type O-linked glycoproteins are the predominant forms of O-linked glycosylation and are difficult to analyze. A metabolic labeling method was developed to facilitate their identification in complex cell lysates using proteomic strategies (
Cell surface and membrane proteins are particularly appealing for biomarker discovery, and many of them are glycosylated proteins. Both hydrazide chemistry- and lectin affinity-based approaches have been applied to specifically study cell surface and membrane N-glycoproteins that are associated with diseases, including colon carcinoma (
). One study applied hydrazide chemistry to covalently label extracellular glycan moieties on live cells, providing highly specific and selective identification of cell surface N-glycoproteins (
). A complementary application of hydrazide chemistry and lectin affinity methods was demonstrated to profile cell membrane glycoproteins, significantly enhancing the glycoprotein identification (
One of the major goals of clinical proteomics is to effectively identify dysregulated proteins that are specifically associated with a biological state, such as a disease. In the past decade, different quantitative proteomics techniques have been introduced and applied to study a wide variety of disease settings. These techniques are based on different mechanisms to facilitate mass spectrometric-based quantitative analysis, including stable isotopic or isobaric labeling using chemical reactions (e.g. ICAT and iTRAQ) (
Identification and relative quantitation of protein mixtures by enzymatic digestion followed by capillary reversed-phase liquid chromatography-tandem mass spectrometry.
). Overviews and comparisons of these quantitative techniques can be found in several reports in the literature and are not discussed in this review. Most of these isotopic labeling techniques can be adapted for glycoproteomics analysis to quantitatively compare the glycoproteome of a diseased sample with that of a control, thus revealing the glycosylation occupancy of individual glycosylation sites that may be involved in a disease. In addition to the well-established labeling methods cited above, several more experimental labeling strategies have been described in the field of glycoproteomics. One study demonstrated the feasibility of using stable isotope-labeled succinic anhydride for quantitative analysis of glycoproteins isolated from serum via hydrazide chemistry (
). In a different report, heavy and light versions of N-acetoxy-succinimide, combined with lectin affinity selection, were used to quantitatively profile serum glycopeptides in canine lymphoma and transitional cell carcinoma (
). Stable isotope labeled 2-nitrobenzenesulfenyl was also used for chemical labeling in a quantitative glycoprotein profiling study on the sera from patients with lung adenocarcinoma (
Comparative profiling of serum glycoproteome by sequential purification of glycoproteins and 2-nitrobenzensulfenyl (NBS) stable isotope labeling: a new approach for the novel biomarker discovery for cancer.
O-linked beta-N-acetylglucosamine (O-GlcNAc): Extensive crosstalk with phosphorylation to regulate signaling and transcription in response to nutrients and stress.
). A quantitative study on O-GlcNAc glycosylation has been reported, in which a method termed quantitative isotopic and chemoenzymatic tagging (QUIC-Tag) was described using a biotin-avidin affinity strategy for O-GlcNAc glycopeptide enrichment and stable isotope-labeled formaldehyde for mass spectrometric quantification (
). Recently, the isobaric tag for relative and absolute quantitation (iTRAQ) technique, combined with different glycoprotein enrichment approaches, has been utilized in several quantitative glycoproteomics studies. In the study of hepatocellular carcinoma, N-linked glycoproteins were enriched from hepatocellular carcinoma patients and controls using multilectin column and then quantitatively compared using iTRAQ to reveal the differential proteins associated with hepatocellular carcinoma (
). In a different study, the approach of using narrow selectivity lectin affinity chromatography followed by iTRAQ labeling was demonstrated to selectively identify differential glycoproteins in plasma samples from breast cancer patients (
). Another study utilized hydrazide chemistry-based solid phase extraction and iTRAQ to investigate the tear fluid of patients with climatic droplet keratopathy in comparison with normal controls, identifying multiple N-glycosylation sites with differential occupancy associated with climatic droplet keratopathy (
In addition to using chemical reactions to incorporate a stable isotope tag for quantitative mass spectrometric analysis, 18O can be introduced into N-glycopeptides during enzymatic reactions, such as tryptic digestion (incorporation of two 18O atoms into the peptide carboxyl terminus) and PNGase F-mediated hydrolysis (incorporation of one 18O atom into the asparagine of N-glycosylation sites) (
). In a different approach, the SILAC technique allows incorporation of stable isotope-labeled amino acids into proteins during cell culturing process (
). A label-free approach has also been used for glycoproteomics profiling, including a method developed to profile intact glycopeptides in a complex sample (
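The mass shifts implied by the enzymatic 18O labeling described above can be worked out with a short Python sketch. The isotope masses are standard monoisotopic values rather than figures from this review, the function names are hypothetical, and complete incorporation of the label is assumed, which real digests do not always achieve.

```python
# Monoisotopic masses (Da) of the two oxygen isotopes (standard values).
MASS_O16 = 15.994915
MASS_O18 = 17.999160
O18_MINUS_O16 = MASS_O18 - MASS_O16

# Asn-to-Asp conversion at the glycosylation site in ordinary water (Da).
ASN_TO_ASP = 0.984016


def tryptic_18o_shift(n_oxygens=2):
    """Shift from exchanging carboxyl-terminal oxygens for 18O during digestion."""
    return n_oxygens * O18_MINUS_O16


def pngase_f_18o_shift():
    """Shift at one site when PNGase F hydrolysis is carried out in 18O water."""
    return ASN_TO_ASP + O18_MINUS_O16


if __name__ == "__main__":
    print(round(tryptic_18o_shift(), 4))
    print(round(pngase_f_18o_shift(), 4))
```

Under these assumptions the carboxyl-terminal label adds about 4.008 Da per peptide, and a site deglycosylated in 18O water gains about 2.988 Da instead of 0.984 Da, which helps separate enzymatically converted sites from spontaneous deamidation.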
Mass spectrometry based targeted proteomics has recently emerged as a multiplexed quantitative technique that affords highly specific and candidate-based detection of targeted peptides and proteins in a complex biological sample (
). The technique is based on the concept of stable isotope dilution utilizing stable isotope-labeled synthetic reference peptides, which precisely mimic their endogenous counterparts, to achieve targeted quantification (
). Such techniques can be applied to target specific glycoproteins or glycopeptides, to precisely quantify the status of candidate glycosylation sites and assess the glycosylation occupancy at the molecular level. However, it is technically impractical to use synthetic peptides as internal standards to precisely mimic a large number of natural glycopeptides with an intact glycan moiety, because of the structural complexity and variation of the sugar chain. To overcome these technical obstacles, an alternative approach was proposed for targeted analysis of N-glycosylation occupancy, in which stable isotope-labeled peptides were synthesized to mimic the deglycosylated form of candidate glycopeptides as internal references (
). It is known that the deglycosylation step using PNGase F results in a conversion of asparagine to aspartic acid in the peptide sequence, introducing a mass difference of 0.9840 Da. This phenomenon was utilized to design a synthetic peptide that mimics the endogenous N-linked glycopeptide in its deglycosylated form, with the exact amino acid sequence of its endogenous counterpart and with 13C and 15N labeling on one of its amino acids (
). Therefore, each matched pair of reference and endogenous candidate glycopeptides should share the same chromatographic and mass spectrometric characteristics, and can be distinguished only by the mass difference and isotopic pattern that result from isotopic labeling. This design conceptually ensures that the synthetic internal standard of a candidate glycopeptide will be detected simultaneously with its endogenous form under the same analytical conditions, thus minimizing the systematic variation and providing reliable quantification (
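The stable isotope dilution principle described above reduces to a simple ratio calculation, sketched here in Python. The function names, peak areas, and spiked amount are hypothetical, and the sketch assumes a linear, equal response for the labeled and unlabeled forms, which is the premise of the design rather than a measured property.

```python
def endogenous_amount(light_area, heavy_area, spiked_heavy_amount):
    """Estimate the endogenous peptide amount from the light/heavy signal ratio.

    light_area: peak area of the endogenous (unlabeled) deglycosylated peptide
    heavy_area: peak area of the isotope-labeled synthetic reference
    spiked_heavy_amount: known amount of reference added, in any unit
    """
    if heavy_area <= 0:
        raise ValueError("reference peptide signal must be positive")
    return light_area / heavy_area * spiked_heavy_amount


def heavy_precursor_mz(light_mz, label_shift_da, charge):
    """m/z of the labeled reference given the m/z of its unlabeled counterpart."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return light_mz + label_shift_da / charge


if __name__ == "__main__":
    # Hypothetical example: 50 fmol of reference spiked into the sample.
    print(endogenous_amount(2.0e5, 4.0e5, 50.0))
    # Hypothetical label adding 8.0142 Da to a doubly charged peptide.
    print(heavy_precursor_mz(500.0, 8.0142, 2))
```

Because both forms are measured in the same run, run-to-run variation in ionization or recovery affects numerator and denominator alike and largely cancels in the ratio.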
The targeted glycoproteomics technique was first demonstrated to analyze N-glycopeptides that were extracted from human serum using an integrated pipeline combining a hydrazide chemistry-based solid phase extraction method and a data-driven liquid chromatography MALDI TOF/TOF mass spectrometric analysis to quantify 21 N-glycopeptides in human serum (
). A similar mass spectrometric platform was then applied in a different study to assess a subset of glycoprotein biomarker candidates in the sera from prostate cancer patients (
). The targeted glycoproteomics analysis has also been demonstrated using a triple Q/linear ion trap instrument with the selected reaction monitoring (also referred to as multiple reaction monitoring) technique for highly sensitive targeted detection of N-glycoproteins in plasma (
). The technique was applied to detect tissue inhibitor of metalloproteinase 1 (TIMP1), an aberrant glycoprotein associated with colorectal cancer, in the sera of colorectal cancer patients (
) using a tandem enrichment strategy, combining lectin glycoprotein enrichment with the method of stable isotope standards and capture by antipeptide antibodies (SISCAPA), to enhance the detection of TIMP1 (
). These studies demonstrate an integrated pipeline for candidate-based glycoproteomics analysis with precise mapping of targeted N-linked motifs and absolute quantification of the glycoprotein targets in a complex biological sample. Such targeted glycoproteomics can reach a detection sensitivity at the nanogram per milliliter level for serum and plasma detection (