Molecular and Translational Medicine Program, Boston University, Boston, MA 02218; Department of Biochemistry, Boston University, Boston, MA 02218; Bioinformatics Program, Boston University, Boston, MA 02218
The most straightforward applications of proteomics database searching involve intracellular proteins. Although intracellular gene products number in the thousands, their well-defined post-translational modifications (PTMs) make database searching practical. By contrast, cell surface and extracellular matrisome proteins pass through the secretory pathway, where many become glycosylated, modulating their physicochemical properties and adhesive interactions and diversifying their functions. Although matrisome proteins number only a few hundred, their high degree of complex glycosylation multiplies the number of theoretical proteoforms by orders of magnitude. Given that the extracellular networks that mediate cell-cell and cell-pathogen interactions in physiology depend on glycosylation, it is important to characterize the proteomes, glycomes, and glycoproteomes of matrisome molecules that exist in a given biological context. In this review, we summarize proteomics approaches for characterizing matrisome molecules, with an emphasis on applications to brain diseases. We highlight methods that should greatly increase the availability of information on matrisome molecular structure associated with health and disease.
Evolution of intracellular biology focuses on the regulation of signaling events, transcription and translation, through phosphorylation and other post-translational modifications (PTMs) that influence allosteric enzyme regulation and signaling cascades through activation/deactivation of recognition domains, for example, SH2, SH3, bromo-, chromo- and tudor domains. Significantly, many of the PTMs that occur inside the cell produce well-defined molecular additions (phosphorylation, acetylation, acylation and methylation) that are compatible with established database searching workflows. Such PTMs evolved through the need for complex control of gene expression through regulated signaling networks. By contrast, complex glycosylation reflects the evolutionary response to pathogen pressure and the evolving need for multicellular complexity. Organisms need a multicellular organization with the ability to distinguish self from non-self, to orchestrate responses to infection, and to regulate tissue plasticity. Therefore, much of cellular biology responds to signals received from the extracellular environment. Complex glycosylation is heterogeneous as a rule at each protein site, multiplying the number of molecular forms and requiring specialized proteomics methods.
The mechanisms of tissue homeostasis and most diseases include interactions with the extracellular microenvironment. The matrisome constitutes the non-cellular components that control biochemical and biomechanical cues, growth factor and morphogen gradients, and physical scaffolds that define tissue phenotypes including morphogenesis, differentiation, and homeostasis. Although each tissue has a unique extracellular environment, the number of genes that encode matrisome proteins in the entire body is limited. Cell surface receptors regulate adhesion and cytoskeletal connections to the matrisome. The structure and organization of the matrisome require maintenance as it adapts to tissue growth needs. Matrisome proteins become glycosylated in the secretory pathway and may be proteolytically processed and cross-linked. The resulting physical and biochemical characteristics reflect the organized networks that depend on numerous molecular interactions that arise from PTMs.
To date, the ability to apply established proteomics methods that depend on database searching to such highly modified and heterogeneous proteins remains far from adequate. Although we can detect many matrisome proteins using proteomics, the low sequence coverage leaves many structural elements of functional interest unidentified. In addition, the heterogeneity of glycosylation at each of the many glycosylation sites of matrisome proteins results in astronomically large numbers of possible proteoforms, if taken as multiples of the variants at each site. Although the existence of such large numbers of proteoforms seems unlikely, the number of functional proteoforms that exist in a given biological context remains largely undefined.
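The multiplication of site-level variants described above can be illustrated with a minimal Python sketch. The function name and the example site counts are hypothetical and chosen only for illustration; they are not measurements for any real matrisome protein.

```python
from math import prod


def theoretical_proteoforms(variants_per_site):
    """Return the product of the number of variants at each glycosylation site.

    Each entry counts the distinct states at one site (for example, each
    observed glycan composition plus the unoccupied state). The product is an
    upper bound that assumes the sites vary independently.
    """
    return prod(variants_per_site)


# Hypothetical protein with 10 sites and 20 variants at each site.
example_sites = [20] * 10
print(theoretical_proteoforms(example_sites))
```

Because the sites are treated as independent, this is a combinatorial ceiling rather than an estimate of the proteoforms that actually exist in a tissue.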
In this review, we summarize methods for proteomics, glycomics and glycoproteomics of matrisome molecules. The goal is to characterize matrisome molecular structure in the greatest detail possible using wide-angle omics experiments. The high extent of matrisome protein glycosylation and other post-translational modifications requires special consideration of sample workup and proteomics database searching. We summarize matrisome physiology with emphasis on brain diseases, and then experimental approaches for matrisome workup and mass spectrometric analysis.
Extracellular Matrix Physiology and Pathophysiology
Dysregulation of the cellular microenvironment occurs in cancers. Known as the matrisome, networks of extracellular matrix and cell surface molecules control the availability of growth factors to cellular receptors and the mechanical-physical properties of the cell microenvironment. Currently, limited understanding of how matrisome glycosylation is regulated hinders efforts to define the roles of glycosylation-dependent matrisome networks in disease mechanisms, knowledge that is necessary for targeted intervention in many diseases.
The matrisome consists of glycoproteins, proteoglycans, collagens, and their interacting partners. Matrisome protein functions are elaborated by biosynthetic enzymes of the secretory pathway that generate mature molecules with spatially and temporally regulated glycosylation. Thus, glycoproteins have context-specific structures and biological functions that remain largely undefined because of the lack of effective methods for quantifying changes to site-specific protein glycosylation. This means that it is necessary to achieve complete matrisome protein coverage in order to determine the changes to these molecules that occur during disease mechanisms.
Progress in developing treatments that target interactions among cells and their extracellular microenvironments, including dysregulated cell growth, morphogenesis, and host-pathogen interactions, is limited by the ability to quantify the extent to which changes in matrisome networks resulting from altered glycoprotein glycosylation determine disease mechanisms. Many matrisome proteins contain lectin domains that recognize glycan epitopes. Thus, the glycosylation of matrisome molecules determines their binding interactions with other matrisome molecules and with soluble growth factors. The result is organized assemblies of matrisome molecules that compose the spatially and temporally regulated microenvironments through which cells receive signals in physiology and pathophysiology.
The core matrisome consists of 195 glycoproteins, 44 collagens and 35 proteoglycans. However, studies to date have not defined the glycosylation states of matrisome molecules necessary to define networks of interactions with lectin-containing binding partners. Many matrisome molecules have several points of glycosylation, each with microheterogeneity, each representing a functional epitope that needs to be defined in order to characterize biological functions. Interactions of cell surface and extracellular glycoproteins with lectins, including galectins, C-type lectins, and siglecs, drive the clustering of cell surface molecules into networks that define tissue microenvironments. These organized assemblies of extracellular molecules (the matrisome) adapt cellular microenvironments to phenotypic needs. Although glycosylated proteins represent potential therapeutic targets, their macro- and micro-heterogeneities pose a significant challenge to exploitation. Because their functions depend on glycosylation and other PTMs, it is necessary to produce detailed proteolytic maps of matrisome proteins and follow the changes that occur during aging and disease development.
The Dynamic Brain Matrisome
The brain extracellular space has been referred to as the final frontier in neuroscience. Matrisome function depends on networks of interactions among glycosylated proteins and glycan-binding lectins. As illustrated in Fig. 1 for the brain, networks of cell surface and extracellular glycoproteins and proteoglycans bind many families of growth factors and growth factor receptors. They modulate receptor tyrosine kinase signaling pathways at the heart of mechanisms including tissue stiffness and growth factor transport.
Fig. 1. Brain matrisome types include blood-brain barrier, interstitial matrix, and perineuronal nets. These structures are composed of matrisome molecules including hyaluronan, collagens, glycoproteins, and proteoglycans. Matrisome structure is spatially and temporally regulated, dynamic, and becomes altered during the pathogenesis of neuropsychiatric and neurodegenerative diseases.
As shown in Fig. 1, the matrisome in the central nervous system includes the interstitial matrix, basement membranes, and perineuronal nets (PNNs), the fine structures of which vary spatially and temporally. The matrisome provides the environment necessary for cell homeostasis, repair, regeneration, and neural plasticity in a brain region-specific manner.
The basement membranes that line cerebral blood vessels consist of collagen IV, laminins and the heparan sulfate proteoglycans (HSPGs) perlecan and agrin. In the brain, the neural interstitial matrix separates cells and consists of networks of chondroitin sulfate proteoglycans (CSPGs), tenascins, hyaluronan, and link proteins. PNNs consist of many of the same CSPGs, tenascins, hyaluronan, and link proteins, condensed into structures that surround some neuronal cell bodies and dendrites.
Dysregulation of matrisome molecular networks characterizes pathophysiologies of cancer. In the brain, region-specific regulation of matrisome molecule glycosylation controls the neuronal microenvironment and becomes dysregulated in neuropsychiatric diseases. For example, in neurodegeneration, proteoglycans bind to and play roles in the aggregation of proteins including Aβ, tau, prion protein, and α-synuclein. In Alzheimer's disease, proteoglycans participate in amyloid plaque formation, leading to disease pathology from altered proteolytic processing and accumulation of toxic aggregates.
Perineuronal nets are lattices of matrisome molecules that surround the cell body and dendrites of neurons. They are thought to serve as a reservoir for cations and provide the connectional architecture that controls synaptic plasticity. Deficits in PNN structure appear to contribute to dysfunction in cortical circuitry in schizophrenia. The number of PNNs in the visual cortex increases during postnatal development, paralleling the critical period for synaptic plasticity and playing an important role in critical period closure. Significantly, the capacity to repair spinal cord injury, which is lost in adults, can be restored by treating the injury site with chondroitinase enzymes. PNN structure differs spatially and temporally in the brain in association with injury, repair, development, aging, learning, memory, neuropsychiatric diseases, neurodegeneration, and in response to drug abuse.
Variation in Matrisome Molecular Structure Among Brain Regions, With Development, Aging, and Pathologies
Traditional antibody-based techniques including immunohistochemistry show spatial and temporal regulation of matrisome molecule expression in the brain. Although antibody binding indicates the levels of individual epitopes, the antibody specificity and underlying structure are assumed. Matrisome molecule glycosylation can also be stained using lectins, leaving the underlying matrisome site-specific glycosylation structure undefined.
Because antibodies bind to discrete structural epitopes on highly complex matrisome molecules, the changes in structure unrelated to such epitopes are not defined by antibody-based techniques. This is illustrated for the aggrecan proteoglycan in Fig. 2. Aggrecan contains three globular domains and an extended region modified with more than 100 CS chains. The C-terminal G3 domain has EGF-like repeats, a complement regulatory module, and a C-type lectin module, and interacts with tenascins. The globular domains are N-glycosylated, and the extended domains also carry keratan sulfate, mucin-type O-glycans, and O-mannose glycans. Although this provides a parts list for aggrecan, we have little information on how the variation of glycosylation modulates the functions of aggrecan in a weight-bearing tissue such as cartilage versus brain. Further, although it is clear that CS structure varies among brain regions, there is little information on the fine structures of the resulting matrisome molecular networks. The same reasoning applies to the other brain matrisome network glycoproteins, proteoglycans, and collagens.
Fig. 2. Model for the structure of aggrecan, showing glycosylation with CS, HS, N-glycans, mucin-type O-GalNAc glycans, and O-Man glycans. The G1 and G2 domains contain link modules with homology to HAPLN proteins. The G3 domain has two epidermal growth factor (EGF)-like repeats, a C-type lectin domain, and a complement regulatory protein domain (CRP).
It is clear, however, that matrisome glycosylation changes during development and with disease states. This can be seen from the alteration of CS sulfation during development in many tissues, including cartilage. Further, staining of brain tissue with Wisteria floribunda agglutinin (WFA), a lectin that binds GalNAc residues, has been used to identify brain region pathologies associated with schizophrenia. The extreme complexities of matrisome proteins concerning glycosylation, cross-linking and other PTMs drive the need for many more validated antibody reagents than are available to date. Such lectin and/or antibody staining studies do not define the glycosite changes that underlie dysregulated interactions with lectin-containing binding partners that give rise to pathologies.
Methods for enrichment of matrisome proteins for proteomics studies include tissue decellularization and extraction of matrisome components from tissue homogenates. Proteomics researchers often use decellularization to remove cellular components prior to solubilization of matrisome molecules. Alternatively, hydroxylamine cleavage at Asn-Gly sites has also been used to solubilize matrisome from insoluble pellets prior to tryptic digestion and proteomics.
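The Asn-Gly specificity of hydroxylamine cleavage lends itself to a simple in silico digestion. The following is a minimal sketch, assuming complete cleavage between every Asn (N) and Gly (G) pair; the function name and the test sequence are hypothetical and not taken from any cited study.

```python
def cleave_asn_gly(sequence):
    """Split a one-letter protein sequence between each Asn-Gly (N-G) pair.

    Assumes complete cleavage and ignores missed cleavages and side reactions.
    """
    fragments = []
    start = 0
    for i in range(len(sequence) - 1):
        if sequence[i] == "N" and sequence[i + 1] == "G":
            # Cut after the Asn so that Gly begins the next fragment.
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


# Hypothetical sequence with two Asn-Gly sites.
print(cleave_asn_gly("AANGKKNGPP"))
```

The resulting fragments could then be passed to a tryptic digestion step, mirroring the order of operations described above.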
Proteomics Data Acquisition Methods for the Analysis of Matrisome Proteins
Present discovery proteomics methods suffice to identify matrisome proteins based on the presence of minimally modified peptides using database searching. Such peptides have been used in targeted proteomics assays for quantification of matrisome molecules based on inferred core protein abundances. Data-independent analysis (DIA) has the advantage that all precursor ions are subjected to collisional dissociation. Using the sequential window acquisition of all theoretical fragment ion spectra (SWATH)-MS DIA method, fragment ion spectra for all precursors are acquired within the specified m/z range and retention time window. Interpretation of such datasets, in which tandem mass spectra show product ions from co-eluting peptides, requires the use of spectral libraries. Huang et al. developed a spectral library of 201 matrisome proteins and compared the performance of SWATH versus data-dependent acquisition (DDA) for analysis of unfractionated tissue extracts. They reported a 15–20% improvement in peptide reproducibility and a 54% increase in the number of matrisome proteins identified relative to DDA. Önnerfjord et al. used high pH reversed-phase fractionation of tryptic digests as a workflow for cartilage proteomics. With the additional fractionation step, they reported 653 proteins identified. They used DDA data to build spectral libraries for interpretation of a subset of identified proteins using DIA. They showed that DIA produced a more precise measurement of peptide abundances than DDA.
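The idea that DIA fragments every precursor within sequential m/z windows can be sketched as follows. This is a minimal illustration, not the acquisition scheme of any cited study: the 400–1200 m/z range, the 25 m/z window width, and both function names are assumptions chosen for the example.

```python
def swath_windows(mz_start, mz_end, width, overlap=0.0):
    """Return a list of (lower, upper) precursor isolation windows.

    Windows tile the range from mz_start to mz_end; `overlap` is the m/z
    overlap between neighboring windows and must be smaller than `width`.
    """
    if width <= 0 or not 0 <= overlap < width:
        raise ValueError("width must be positive and overlap must be in [0, width)")
    windows = []
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)
        windows.append((lower, upper))
        if upper >= mz_end:
            break
        lower = upper - overlap
    return windows


def windows_for_precursor(precursor_mz, windows):
    """Return the indices of all windows that would co-isolate the precursor."""
    return [i for i, (lo, hi) in enumerate(windows) if lo <= precursor_mz < hi]


# Illustrative (assumed) settings: 400-1200 m/z in 25 m/z steps.
windows = swath_windows(400.0, 1200.0, 25.0)
print(len(windows), windows_for_precursor(652.3, windows))
```

Every precursor in a window is fragmented together, which is why the resulting composite spectra are typically interpreted against spectral libraries.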
Naba et al. used a commercial Cytosol/Nucleus/Membrane/Cytoskeleton compartmental protein extraction kit to enrich intracellular and matrisome proteins in separate fractions from tissue.
Mayr et al. extracted matrisome proteins from vascular tissue using decellularization, solubilization using a three-stage extraction (salt, detergent, guanidine HCl) and MS-based quantification. In studies of human venous tissue, they reported the identification of ∼150 matrisome proteins. They identified a proteomics 4-biomarker signature for atherosclerotic plaques from a comparison of vascular matrisome in human carotid artery specimens. In this work, they reported 110 matrisome-associated proteins from guanidine HCl extraction and 87 from the salt fractions, with an overlap of 51. They also performed matrisome proteomics studies of restenosis and thrombosis following coronary stent implantation in pigs, for which they reported the identification of 151 matrisome proteins.
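The counts reported for the two extraction fractions can be combined by simple inclusion-exclusion. The sketch below assumes that the overlap of 51 refers to proteins found in both the guanidine HCl and salt fractions; the function name is hypothetical.

```python
def fraction_overlap_summary(n_first, n_second, n_shared):
    """Summarize two overlapping protein lists by inclusion-exclusion."""
    if n_shared > min(n_first, n_second):
        raise ValueError("shared count cannot exceed either list size")
    return {
        "first_only": n_first - n_shared,
        "second_only": n_second - n_shared,
        "shared": n_shared,
        "union": n_first + n_second - n_shared,
    }


# Counts from the text: 110 (guanidine HCl), 87 (salt), 51 in common.
summary = fraction_overlap_summary(110, 87, 51)
print(summary)
```

Under that assumption, the two fractions together would account for 146 distinct matrisome-associated proteins, 59 unique to guanidine HCl and 36 unique to the salt fractions.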
Berretta et al. demonstrated, using a combination of immunohistochemistry and proteomics, that matrisome molecule expression is brain region-dependent. For this work, fresh rat brains were dissected, and regions were snap frozen. Tryptic peptides were fractionated using ERLIC, and the resulting fractions were analyzed using LC-MS. The fold change abundances of a set of 17 matrisome molecules, including tenascins, hyalectans, link proteins, and others, were reported for a set of five rat brain regions.
MALDI imaging mass spectrometry (IMS) produces 2-dimensional maps of the distributions of ions desorbed from the surfaces of tissue slides. The advantage is that the maps can be produced at ∼25 μm or better resolution and show ion-specific spatial distribution patterns. The disadvantages are that, in the absence of a separation step, the dynamic range of protein/peptide detection is limited, and identification of observed proteins or peptides can be cumbersome. Drake et al. have demonstrated the use of MALDI imaging to visualize proteins and peptides from matrisome-rich tissues, including heart. They demonstrated the use of matrix metalloproteinase enzymes to localize collagen and elastin peptides on the surfaces of tumor and cardiac tissue slides.
Our method provides a readout of GAG quantities, domain structures, and non-reducing end structures using simple enzyme digestions with minimal need for workup. The final proteomics analysis of tryptic peptides identifies ∼1200 proteins from the 10 nL tissue volume, providing deeper coverage than can be obtained from an MS imaging approach.
This approach requires small tissue volumes and minimal sample workup, and it reduces the effort required per biospecimen for glycomics and proteomics studies. Fresh frozen slides are washed with a series of solvents, thereby denaturing tissue proteins. Formalin-fixed, paraffin-embedded tissue slides require dewaxing, re-hydration, and high pH antigen retrieval prior to enzymatic digestion. Although proteins are denatured in both cases, the observed glycomics and proteomics profiles reflect tissue processing biases that remain to be studied in detail. Nonetheless, the analysis of matrisome molecules from tissue slides offers an attractive alternative to extraction from wet tissue in terms of lower sample quantities and effort required. For example, in a study of aging rat brain from tissue slides with no enrichment, we observed 9–11% of total proteins of extracellular origin, corresponding to 15 matrisome molecules.
Application of a conventional discovery proteomics workflow with database searching identified matrisome molecules based on the presence of unmodified peptides. Although homogeneous PTMs including phosphorylation, acetylation, methylation, and ubiquitination are amenable to proteomics database searching, glycosylation is heterogeneous as a rule. This multiplies the number of PTM forms of a given matrisome molecule glycopeptide, thereby dividing the precursor ion signals and multiplying both the size of the proteomics search space and the difficulty of assigning the glycopeptide with confidence. As shown in Fig. 3, the presence of complex glycosylation alters the collisional dissociation pattern of peptides significantly. Glycopeptide collisional dissociation tandem mass spectra show low m/z oxonium signature ions that indicate the presence of glycosylation. The spectra also show peptide+saccharide ions, the abundances of which depend on the extent of vibrational excitation of the precursor ions. If relatively low collision energies are used, then product ions resulting from losses of saccharide units are abundant. At higher collision energies, ions corresponding to the peptide plus one to a few monosaccharide units are observed. Under such conditions, dissociation of the peptide backbone is often observed, albeit at relatively low abundances. Thus, the most confident collisional tandem mass spectra for glycopeptide precursor ions contain all three ion types, as shown, for example, for an aggrecan glycopeptide in Fig. 3.
Fig. 3. Higher energy collisional dissociation tandem mass spectrum of an aggrecan glycopeptide showing the presence of oxonium ions (green), peptide+saccharide ions (golden), and peptide backbone dissociation ions (red or blue).
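The use of low m/z oxonium ions as a signature of glycosylation can be sketched as a simple peak-matching step. This is an illustrative sketch only: the m/z values are commonly used singly charged oxonium ion masses, while the function name, the 20 ppm tolerance, and the 5% relative intensity threshold are assumptions rather than parameters from any cited workflow.

```python
# Commonly used singly charged oxonium ion m/z values.
OXONIUM_IONS = {
    "HexNAc fragment": 138.0550,
    "Hex": 163.0601,
    "HexNAc": 204.0867,
    "NeuAc-H2O": 274.0921,
    "NeuAc": 292.1027,
    "HexHexNAc": 366.1395,
}


def find_oxonium_ions(peaks, tol_ppm=20.0, min_rel_intensity=0.05):
    """Return oxonium ions matched in a peak list of (m/z, intensity) pairs.

    A peak matches when it lies within tol_ppm of a theoretical m/z and its
    intensity is at least min_rel_intensity of the base peak (assumed cutoffs).
    """
    if not peaks:
        return {}
    base_peak = max(intensity for _, intensity in peaks)
    found = {}
    for name, theoretical_mz in OXONIUM_IONS.items():
        tolerance = theoretical_mz * tol_ppm * 1e-6
        matches = [
            (mz, intensity)
            for mz, intensity in peaks
            if abs(mz - theoretical_mz) <= tolerance
            and intensity >= min_rel_intensity * base_peak
        ]
        if matches:
            # Keep the most intense matching peak for this ion.
            found[name] = max(matches, key=lambda peak: peak[1])
    return found


# Hypothetical spectrum: two oxonium ions plus unrelated peaks.
spectrum = [(138.2, 900.0), (204.0869, 1000.0), (366.1390, 500.0), (500.25, 2000.0)]
print(sorted(find_oxonium_ions(spectrum)))
```

A spectrum flagged in this way would still need peptide+saccharide and peptide backbone ions to support a confident glycopeptide assignment, as described above.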
Investigators have used WFA and concanavalin A (ConA) lectin enrichment of guanidine HCl extracts to enrich glycoproteins from human cardiac tissue, from which they reported identification of 65 glycosylation sites from 35 extracellular proteins. O-Mannosylated peptides have been enriched using ConA lectin chromatography from tissue extracts digested using trypsin and peptide-N-glycosidase F. Using this approach, a set of 16 O-mannosylated glycoproteins was identified, several belonging to the cadherin superfamily.
Proteoglycan Glycoproteomics
Enzymatic digestion of GAG chains leaves a glycopeptide with linker saccharide attached to the core protein. Such linker glycopeptides can be identified by the presence of a diagnostic oxonium ion for CS and HS proteoglycans.
We have observed glycopeptide abundances too low for confident identification when analyzing proteolytic digests from tissue slides. It therefore appears that enrichment steps will be necessary to allow glycoproteomics from tissue slides. Although such enrichment remains a challenge for small tissue volumes such as those obtained from tissue slides, it seems feasible for wet tissue extracts.
CONCLUSIONS
For researchers interested in profiling abundances of matrisome core proteins, the use of decellularization or enrichment methods combined with targeted MS or DIA MS seems appropriate. As in other areas of proteomics, the use of multidimensional separations increases the number of proteins identified at the expense of analysis time and cost. One of these separation dimensions can be designed to enrich glycopeptides, thus increasing the ability to detect matrisome determinants of molecular networks. Such enrichment steps are most readily applied to tissue extracts. The analysis of tissue slides has potential benefits in terms of throughput, cost, and applicability to pathological workflows. The tissue volume, however, is rather low, making use of enrichment steps challenging. On the other hand, tissue slides can be microdissected, increasing the ability to select cell populations of interest for subsequent proteomics. Robotic approaches for manipulation of microdissected tissue have been described. It may, therefore, be feasible to use glycopeptide enrichment in such robotic workflows to enable the application of glycoproteomics LC-MS methods to microdissected tissue. This will enable profiling of designated matrisome glycosites as a means for assessing changes to extracellular networks during disease mechanisms.
REFERENCES
Demetriou M., Nabi I.R., Dennis J.W. Galectins as adaptors: linking glycosylation and metabolism with extracellular cues. Trends in Glycosci. Glycotechnol. 2018; 30: SE167-SE177
A novel protein glycan-derived inflammation biomarker independently predicts cardiovascular disease and modifies the association of HDL subclasses with mortality.
Glycosylation profile of immunoglobulin G is cross-sectionally associated with cardiovascular disease risk score and subclinical atherosclerosis in two independent cohorts.
Generation and application of type-specific anti-heparan sulfate antibodies using phage display technology. Further evidence for heparan sulfate heterogeneity in the kidney.
Detection of age-related changes in the distributions of keratan sulfates and chondroitin sulfates in developing chick limbs: an immunocytochemical study.
Preserved proteins from extinct bison latifrons identified by tandem mass spectrometry; hydroxylysine glycosides are a common feature of ancient collagen.
Quantitative proteomics identify an association between extracellular matrix degradation and immunopathology of genotype VII Newcastle disease virus in the spleen in chickens.
Nematodes join the family of chondroitin sulfate-synthesizing organisms: Identification of an active chondroitin sulfotransferase in Caenorhabditis elegans.
Effects of restoring normoglycemia in type 1 diabetes on inflammatory profile and renal extracellular matrix structure after simultaneous pancreas and kidney transplantation.
Positive mode LC-MS/MS analysis of chondroitin sulfate modified glycopeptides derived from light and heavy chains of the human inter-alpha-trypsin inhibitor complex.