Proteomics Identification of Proteins in Human Cortex Using Multidimensional Separations and MALDI Tandem Mass Spectrometer*

It is essential to characterize the proteome of various regions of human brain because most, if not all, neurodegenerative diseases are region-specific. Here we report an in-depth proteomics identification of proteins extracted from the frontal cortex, a region playing a critical role in cognitive function. The integrated proteomics analytical flow consisted of biochemical fractionation, strong cation exchange chromatography, reverse phase liquid chromatography, and MALDI-TOF/TOF mass spectrometric analysis. In total, 812 proteins were confidently identified with two or more peptides. These proteins demonstrated diverse isoelectric points and molecular weights and are involved in several molecular functions, including protein binding, catalytic activity, transport, structure, and signal transduction. A number of proteins known to be associated with neurodegenerative diseases were also identified. Detailed characterization of these proteins will supply the necessary information to appropriately interpret proteins associated with aging and/or age-related neurodegenerative diseases. Finally 140 proteins found in the cortical proteome were present in the proteome of cerebrospinal fluid, providing tissue-specific candidates for biomarker discovery in body fluid.

Most neurodegenerative disorders are region-specific. For instance, in the most common geriatric dementia, Alzheimer disease (AD), neuronal loss is predominantly found in the cerebral cortex and hippocampus (1), whereas in the most common disabling movement disorder, Parkinson disease (PD), neurodegeneration largely centers on the brainstem at least during the early phase of the disease (2). Additionally it is expected that the profile of the proteins within brain regions is likely to change qualitatively and/or quantitatively during aging and/or in different disease states. Thus, identification of proteins unique to each brain region, those associated with neurodegenerative mechanisms in particular, could yield opportunities to overcome major obstacles in the development of new protective and restorative therapies for prominent neurodegenerative diseases. As an emerging technology, MS-based proteomics is increasingly used for this purpose (3-10).
Global scale proteomics analysis of human temporal lobe, obtained surgically from epileptic patients, has been reported recently. One study applied multidimensional fractionation and separation methods, including biochemical fractionations, one-dimensional electrophoresis, and reverse phase (RP) LC, and a linear ion trap mass spectrometer (LTQ) to profile proteins extracted from temporal lobe tissues (11). In total, 1533 proteins were identified in this study. Other parallel proteomics studies on human temporal lobe tissue include two-dimensional electrophoresis separation coupled with analysis by a MALDI-TOF/TOF mass spectrometer, which identified 375 proteins (12), and a shotgun proteomics approach, consisting of strong cation exchange chromatography (SCX) and RP LC for peptide separation, and a three-dimensional ion trap mass spectrometer (LCQ), which identified 209 proteins (13).
In this study, we investigated the proteome of human frontal cortex, a region involved in many neurodegenerative diseases with cognitive dysfunction. For instance, neurodegeneration with the presence of neuritic plaques and neurofibrillary tangles in the frontal cortex is one of the hallmarks of AD (1). Similarly neurodegeneration associated with development of Lewy bodies in the frontal cortex is one of the characteristics of PD with dementia or dementia with Lewy body disease (1,14). Also the frontal cortex is characteristically involved in a group of diseases called frontotemporal dementia, which is currently classified based on immunohistochemical staining (15).
The present study is a part of our ongoing effort to detect protein expression alterations in human frontal cortex as Lewy body deposition spreads from the brainstem to the limbic system and eventually to the isocortex with progression of PD (2,16,17). Although proteins signifying disease progression are being actively investigated, it is important to make proteome data about human cortex, which is largely unknown, available to the investigators in the field. Here we report a large scale proteomics study of the frontal cortex of normal human subjects using a platform consisting of biochemical fractionation, multidimensional LC separation, and MALDI-TOF/TOF mass spectrometric analysis.

EXPERIMENTAL PROCEDURES
Sample Preparation-The human cortex (middle frontal gyrus) samples were obtained following the protocols approved by the University of Washington, University of Michigan, and Emory University. The average age of the subjects was 67 years, and postmortem interval was less than 12 h. All normal subjects (three males and two females; average age, 65 years) were individuals who did not have a diagnosis of neurological disease, were not taking medications prescribed for neurologic disease, and whose neuropathologic examination revealed age-related changes only. All patients had been diagnosed during life with an extrapyramidal movement disorder or dementia, and final diagnoses were established by neuropathological examination according to established criteria (18,19). The data on patients are not reported in this study.
Cytosolic, mitochondrial, and nuclear fractions of human cortex were prepared. Frozen samples were rapidly thawed in ice-cold homogenization buffer (0.32 M sucrose, 20 mM HEPES, pH 7.5, 1 mM PMSF, phosphatase inhibitors (0.2 mM Na3VO3 and 1 mM NaF), and protease inhibitor mixture from Sigma) and then disrupted with a glass-Teflon homogenizer (Wheaton) by 10 gentle up and down strokes. The protein concentrations of the homogenates were determined using a Pierce BCA Protein Assay kit. The homogenate was then centrifuged at 100 × g for 30 s to remove large cell debris. The supernatant was collected and centrifuged for 10 min at 800 × g to separate the crude nuclear pellet and the cytosolic supernatant. The cytosolic supernatant was further centrifuged at 10,000 × g for 15 min to obtain the mitochondria-enriched pellet and the cytosol-enriched supernatant. The mitochondria-enriched pellet was resuspended in the sample buffer consisting of 6 M urea, 0.05% SDS, 5 mM EDTA, and 50 mM Tris-HCl (pH 8.5). The crude nuclear pellet from the second centrifugation was resuspended in a nuclear extraction buffer (20 mM HEPES, pH 7.9, 0.3 M KCl, 1.5 mM MgCl2, 20% glycerol, and 0.1% Triton X-100). After 30-40 min of gentle shaking, the suspension was centrifuged at 14,000 × g for 10 min. The resulting supernatant, representing the nuclei-enriched fraction, was subsequently desalted using a PD-10 column, dried in a centrifugal vacuum concentrator, and redissolved in the sample buffer (6 M urea, 0.05% SDS, 5 mM EDTA, and 50 mM Tris-HCl, pH 8.5).
In addition to conventional cellular fractionation, synaptosomes were prepared as described previously (20) with minor modifications. The rapidly thawed cortical tissue samples were dissected in ice-cold homogenization buffer (0.32 M sucrose, 5 mM HEPES, pH 7.4, 1 mM EDTA, 1 mM PMSF, and protease inhibitor mixture from Sigma) and disrupted with a glass-Teflon homogenizer (Wheaton) by eight gentle up and down strokes. The protein concentrations of the homogenates were determined using a Pierce BCA Protein Assay kit. The homogenate was centrifuged at 2000 × g for 2 min to remove nuclei and cell debris. The supernatant was collected and centrifuged for 10 min at 14,000 × g, and the pellet was resuspended in 3 ml of homogenization buffer and centrifuged at 14,000 × g for 10 min. The resulting pellet, representing a crude synaptosomal/mitochondrial fraction, was subsequently resuspended and layered over a 6-9-13% discontinuous Ficoll gradient that had been equilibrated at 4°C for at least 1 h. The gradient was centrifuged at 86,800 × g for 35 min on a Beckman Optima XL-100K ultracentrifuge using an SW 41 Ti swing bucket rotor. Synaptosomes were collected from the 6-9% and 9-13% gradient interfaces and washed with diluted (5-fold) homogenization buffer by centrifuging at 20,800 × g for 20 min. The synaptosomal pellet was then resuspended in homogenization buffer.
Multidimensional Separation-The tryptic digest of the proteins was fractionated into nine fractions using an SCX PolySulfoethyl A™ column (200 × 2.1-mm inner diameter, 5 μm, 300 Å; Poly LC, Columbia, MD). The peptides were separated by applying a stepped linear gradient from 0 to 100% buffer B (5 mM KH2PO4, 600 mM KCl, and 25% ACN, pH 3.0) versus buffer A (5 mM KH2PO4 and 25% ACN, pH 3.0) at a flow rate of 200 μl/min. Each SCX fraction was collected and further separated by an RP nanocapillary LC system (LC Packings/Dionex, Sunnyvale, CA) using a 15-cm × 100-μm inner diameter Magic C18 capillary column (3 μm, 100-Å packing; Michrome BioResources Inc., Auburn, CA) with a gradient running from 5 to 50% mobile phase B (80% ACN and 0.1% TFA in HPLC water) versus mobile phase A (2% ACN and 0.1% TFA in HPLC water) over 75 min at a flow rate of 0.4 μl/min. The eluted gradient was mixed with matrix solution (7 mg/ml recrystallized α-cyano-4-hydroxycinnamic acid in 60% ACN and 2.6% ammonium citrate) and spotted onto a stainless steel MALDI plate to form a predefined 24 × 24 array (576 spots) using a Probot™ system (LC Packings/Dionex).
MALDI Tandem Mass Spectrometric Analysis-The mass spectrometric analysis was carried out using a MALDI-TOF/TOF instrument (4700 Proteomics Analyzer, Applied Biosystems) in reflector positive ion mode. For MS analysis, a mass range of 800-4000 m/z was used with 1000 shots per spectrum. A maximum of 10 precursors per spot with a minimum signal/noise ratio of 50 were selected for data-dependent MS/MS analysis. A 1-kV collision energy was used for CID, and 1500 acquisitions were accumulated for each MS/MS spectrum. All analyses were performed using default calibration, and the mass accuracy was calibrated to within 50 ppm using calibration standards (Applied Biosystems) before each run. For the cytosolic, mitochondrial, and nuclear fractions, triplicates were analyzed. For the synaptosome fraction, duplicate samples were analyzed.
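The precursor selection rule described above (at most 10 precursors per spot, each with a signal/noise ratio of at least 50, within the 800-4000 m/z window) can be illustrated with a minimal Python sketch. The `Peak` type and the `select_precursors` function are hypothetical names introduced here, and ranking eligible peaks from strongest to weakest signal/noise is an assumption; the text does not state how the instrument software ordered candidates.

```python
from typing import List, NamedTuple, Tuple


class Peak(NamedTuple):
    """One MS peak on a MALDI spot (hypothetical container for illustration)."""
    mz: float
    signal_to_noise: float


def select_precursors(
    peaks: List[Peak],
    min_sn: float = 50.0,
    max_precursors: int = 10,
    mz_range: Tuple[float, float] = (800.0, 4000.0),
) -> List[Peak]:
    """Return up to max_precursors peaks that pass the S/N and m/z filters."""
    low, high = mz_range
    eligible = [
        p for p in peaks
        if p.signal_to_noise >= min_sn and low <= p.mz <= high
    ]
    # Assumption: strongest peaks are fragmented first.
    eligible.sort(key=lambda p: p.signal_to_noise, reverse=True)
    return eligible[:max_precursors]
```

With these defaults, a spot carrying more than 10 eligible peaks contributes only its 10 strongest, and peaks below the signal/noise threshold are never sent to MS/MS.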
Data Analysis-The MS/MS spectra were searched against the International Protein Index (IPI) human protein database (version 3.18) from the European Bioinformatics Institute using GPS software (version 3.6, Applied Biosystems) running Mascot search algorithm (version 2.0, Matrix Science, Boston, MA) for peptide and protein identification. A mass tolerance of 200 ppm and 0.25 Da was used for precursors and fragment ions, respectively. The data searches were performed with the following criteria: fixed modifications of iTRAQ (isobaric tags for relative and absolute quantification) (21) labeling on the N terminus and lysine and of methyl methanethiosulfonate on cysteine, differential modification of methionine oxidation, and one missed cleavage allowed (note that the quantitative data related to disease are not included in this study). In total, 142,146 MS/MS spectra were searched with GPS software using a 95% confidence interval threshold (p < 0.05), at which a minimum Mascot score of 31 was required for peptide identification. The decoy database search suggested a 0.4% false positive rate for protein identification using a 95% confidence interval. The gene ontology (GO) analysis was performed using GoMiner software (discover.nci.nih.gov/gominer) (22).
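To make the two numerical criteria above concrete, the following Python sketch shows how a 200 ppm precursor tolerance can be checked and how a false positive rate can be estimated from a decoy search. The function names are hypothetical, and the decoy formula (decoy hits divided by target hits) is one simple, commonly used estimate assumed here for illustration; the text does not specify how the 0.4% figure was calculated.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 200.0) -> bool:
    """True if the precursor mass error is inside the search tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def decoy_false_positive_rate(n_decoy_hits: int, n_target_hits: int) -> float:
    """Assumed estimate: decoy identifications / target identifications."""
    if n_target_hits <= 0:
        raise ValueError("n_target_hits must be positive")
    return n_decoy_hits / n_target_hits
```

For example, a precursor observed at m/z 1000.1 against a theoretical value of 1000.0 is about 100 ppm off and passes the 200 ppm filter, whereas one observed at 1000.3 (about 300 ppm) does not.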

RESULTS
Protein Identification-Four biochemical fractions, cytosolic, mitochondrial, nuclear, and synaptosomal, were extracted from the frontal cortical tissue obtained at autopsy. Fractionation of tissue was not designed to achieve pure subcellular organelles as this is not practically possible for frozen human brain. Rather this procedure was instituted for deeper probing of the human proteome. The tryptic digests of the proteins in each fraction were further separated with SCX and RP LC before analysis with MALDI-TOF/TOF. All together, 1783 nonredundant proteins were assigned with 95% confidence, including 812 proteins identified with two or more peptides. The proteins identified with two or more peptides in each fraction are categorized in Fig. 1 based on the number of peptides identified for protein assignment. The numbers of proteins assigned by single peptide identification were consistently similar in each experiment, averaging 49% of proteins identified in each fraction. The numbers of proteins identified with two or more peptides were 418, 346, 475, and 508 for the cytosolic, mitochondrial, nuclear, and synaptosomal fractions, respectively.
As proteins identified by a single peptide should be considered provisional, we only report the 812 non-redundant proteins identified with two or more peptides in cortical tissue (Supplemental Table 1). These 812 proteins demonstrated diverse characteristics, including varying pI and molecular weight. As compared with a recent proteomics study on human temporal cortex using SCX and an LC-ESI tandem mass spectrometer (13) in which the proteins identified were observably biased toward lower molecular weight (<30,000) proteins, the proteins identified in this study demonstrated a more evenly distributed range of molecular weight (Fig. 2A) with 48% of the proteins identified having a molecular weight >50,000. Furthermore in the previous study mentioned above, a large portion of proteins identified were situated in a pI range of 4-6. In contrast, 55% of the proteins identified in this study have a pI greater than 6, and a more diverse distribution of pI was observed (Fig. 2B).
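The reporting rule used here, in which a protein is retained only if it is supported by two or more peptides, can be sketched in Python as follows. The input format (pairs of protein identifier and peptide sequence) and the function names are assumptions made for illustration and do not reflect the actual GPS/Mascot export format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def peptides_per_protein(
    assignments: Iterable[Tuple[str, str]]
) -> Dict[str, Set[str]]:
    """Group distinct peptide sequences by the protein they were assigned to."""
    peptides: Dict[str, Set[str]] = defaultdict(set)
    for protein_id, peptide in assignments:
        peptides[protein_id].add(peptide)  # repeated spectra count once
    return peptides


def split_by_peptide_count(
    assignments: Iterable[Tuple[str, str]], min_peptides: int = 2
) -> Tuple[List[str], List[str]]:
    """Return (reported, provisional) protein identifiers, each sorted."""
    peptides = peptides_per_protein(assignments)
    reported = sorted(p for p, s in peptides.items() if len(s) >= min_peptides)
    provisional = sorted(p for p, s in peptides.items() if len(s) < min_peptides)
    return reported, provisional
```

Counting distinct sequences rather than spectra is a deliberate choice in this sketch: a protein matched many times by the same peptide would still be treated as a single-peptide, provisional identification.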
Gene Ontology-The proteins identified in this study were classified in accordance with their distribution in molecular functions and cellular processes based on GO analysis and annotations. The main functions of the proteins identified include protein-protein or protein-small molecule binding, catalytic activity, transport, cytoskeletal structure, and signal transduction (Fig. 3A). The majority of the proteins identified are involved in cellular processes, physiological processes, regulation of biological processes, and development (Fig. 3B).
Proteins Present in the CSF Proteome-Among the proteins identified in our human CSF studies (23-25) 1555 proteins were assigned with two or more peptides. When the proteins identified in the cortex were compared with these 1555 CSF proteins, 140 proteins were found in both the cortical and CSF proteomes (Fig. 4). Many of these proteins are known to be involved in a variety of central nervous system functions (Supplemental Table 2).

DISCUSSION
Here we report on the first large scale proteomics analysis of frontal cortex, an area critical to human cognitive and executive functions. Four biochemical fractions, i.e. cytosolic, mitochondrial, nuclear, and synaptosomal, were extracted from tissue samples and analyzed separately using "shotgun" proteomics (26). The tryptic digest of each sample was first separated by SCX and then RP LC followed by MALDI-TOF/TOF analysis. A data-dependent mass spectrometric strategy was used to increase the number of MS/MS analyses. More specifically, the sample was first analyzed using a global approach in which a threshold of signal/noise ratio was used as the criterion to select precursor ions for MS/MS analysis. In the second analysis, an exclusion list was generated based on the first analysis to exclude the precursors that had already been analyzed so that the analysis could target the precursor ions that were not selected for MS/MS in the first run. This combination of two-dimensional LC separation and data-dependent mass spectrometric analysis significantly enhanced the identification efficiency of low abundance peptides. All together, 1783 non-redundant proteins were assigned with 95% confidence, including 812 proteins identified with two or more peptides. In addition, the high mass accuracy of the TOF analyzer provided additional confidence in peptide identification. Among the peptides assigned for protein identification with 95% confidence, more than 50% of the peptides have a mass error less than 0.05 Da.
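The two-pass strategy described above, in which precursors fragmented in the first run are excluded from the second, can be sketched in Python as follows. This is a simplified illustration: the function names are hypothetical, the 50 ppm matching tolerance is an assumed value, and matching on m/z alone (ignoring spot position) is a simplification not stated in the text.

```python
from typing import Iterable, List


def build_exclusion_list(analyzed_mz: Iterable[float]) -> List[float]:
    """Precursor m/z values already fragmented in the first run."""
    return sorted(set(analyzed_mz))


def is_excluded(mz: float, exclusion_list: List[float],
                tol_ppm: float = 50.0) -> bool:
    """True if mz matches any excluded precursor within tol_ppm (assumed)."""
    return any(abs(mz - ex) / ex * 1e6 <= tol_ppm for ex in exclusion_list)


def second_pass_candidates(peak_mz: Iterable[float],
                           exclusion_list: List[float],
                           tol_ppm: float = 50.0) -> List[float]:
    """Peaks still eligible for MS/MS after applying the exclusion list."""
    return [mz for mz in peak_mz
            if not is_excluded(mz, exclusion_list, tol_ppm)]
```

Because the first pass favors peaks above the signal/noise threshold, the candidates remaining for the second pass are the ions that were not selected initially, which is how the strategy reaches lower abundance peptides.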
The proteins identified in this study demonstrated a wide spectrum of molecular weight and pI, including a significant portion of proteins with high molecular weight and pI. The observation suggested that the proteomics analytical flow used in this study, from sample preparation to mass spectrometric analysis, provided a reasonable and relatively nonbiased analysis of proteins with different physicochemical properties. In addition, a very similar percentage of single peptide-identified proteins, ranging from 44 to 52%, were observed while analyzing each biochemical fraction, reflecting the consistency and stability of the analytical flow for protein identification.
The GO functional analysis of proteins identified by two or more peptides revealed that many of the cortical proteins are involved in catalytic activities and protein binding. In addition, a significant number of proteins have transport, signal transduction, and cytoskeletal functions. The finding is strikingly similar to the results from a recent proteomics study on the proteins from human temporal lobe (11). Of the proteins identified with two or more peptides in the frontal cortex in this study, 463 (57%) were also reported in human temporal lobe (11-13). The difference can be attributed to several possibilities: 1) it may reflect a difference in the proteome of different brain regions, 2) neither analysis should be considered a complete characterization of the human proteome, 3) there are differences in the populations and ages of the subjects as well as platforms and databases used between studies, and 4) some of the proteins identified in the temporal lobe could be related to epilepsy.
By examining proteins identified in human CSF (23-25), from which only proteins identified with two or more peptides were extracted for comparison, we found that 140 cortical proteins were also present in CSF, accounting for 17% of the 812 cortical proteins. From the point of view of biomarker discovery, it is important to identify the presence of brain tissue proteins in CSF, a readily accessible resource for biomarker development. Our further analysis revealed that a large portion of these proteins are known to be involved in or associated with specific central nervous system functions. Some of these proteins are implicated in the mechanisms of neurodegenerative diseases. Selected examples of these proteins are discussed in more detail below.
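The comparison between the cortical and CSF protein lists amounts to a set intersection followed by a percentage. A minimal Python sketch is given below; the function name is hypothetical, and matching proteins by an identical identifier string is an assumption, since the text does not describe how entries were matched across studies.

```python
from typing import Iterable, Set, Tuple


def overlap_summary(cortex_ids: Iterable[str],
                    csf_ids: Iterable[str]) -> Tuple[Set[str], float]:
    """Return the shared proteins and their fraction of the cortical list."""
    cortex = set(cortex_ids)
    if not cortex:
        raise ValueError("cortex_ids must not be empty")
    shared = cortex & set(csf_ids)
    return shared, len(shared) / len(cortex)


# Arithmetic check against the counts reported in the text:
# 140 shared proteins out of 812 cortical proteins.
reported_percent = round(100 * 140 / 812)  # 17
```

The denominator is the cortical list rather than the CSF list, matching the way the 17% figure is expressed in the text.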
Neurofilament Triplet H Protein-Neurofilaments (NFs) are the major structural proteins of the neuronal cytoskeleton and are involved in cell architecture and differentiation as well as neurite determination and maintenance of fiber caliber. Neurofilaments are composed of three different subunits: NF-L, NF-M, and NF-H. Studies have shown that defects in NF-H might be a possible cause of amyotrophic lateral sclerosis, a degenerative disorder of motor neurons in the cortex, brainstem, and spinal cord characterized by abnormal accumulation of NFs in perikarya and proximal axons (27,28). Two peptides were identified for this protein: one covers coil 1B (peptide 205-213), and the other covers coil 2B (peptide 353-372).
Cystatin C Precursor-Cystatin C (CST) is an inhibitor of cysteine proteases with amyloidogenic properties. Structural studies have explored the tendency of this protein to dimerize and suggested a mechanism of aggregation in the cerebral arteries of elderly people with amyloid angiopathy (29). Defects in the CST3 gene are a possible cause of hereditary cerebral hemorrhage with amyloidosis, also known as cerebral amyloid angiopathy or cerebroarterial amyloidosis Icelandic type, which is characterized by a thickening of the cerebral arterial walls with deposition of CSTs with the characteristics of amyloid (30-32). Two peptides from the chain of cystatin C were identified.
Protein NDRG2-NDRG2 is a cytoplasmic protein involved in several cellular physiological processes and cell growth/differentiation and found in pathological brain lesions of AD (33). Isoforms 1 and 2 are present in brain neurons and up-regulated in AD (at the protein level). All three peptides identified are from the chain of protein NDRG2, and one of the peptides (peptide 274-288) is missing in isoform 5.
Apolipoprotein E Precursor-APOE is a secreted protein produced in many organs, including brain. The APOE*4 allele is genetically associated with the common late onset familial and sporadic forms of AD (34). The mechanism by which APOE*4 participates in AD pathogenesis is not known. However, APOE*4 gene dose is known to be a major risk factor for late onset AD (35). Six peptides were identified for this protein. Two peptides (peptides 114-121 and 210-224) originate from the repeat sections of the protein chain.
Cathepsin D Precursor-Cathepsin D is a ubiquitously expressed lysosomal acid protease possibly involved in AD-related diseases through cleavage of amyloid precursor protein into amyloidogenic components (36,37). Deficiency in cathepsin D resulting from mutations in the corresponding gene is the possible cause of neuronal ceroid lipofuscinosis (38). Among the peptides identified, one is from the light chain of the protein, and the rest of the eight peptides cover the regions of the heavy chain of the protein.
Proteolipid Protein 1-Defects in the PLP1 protein gene are a cause of Pelizaeus-Merzbacher disease (39). Pelizaeus-Merzbacher disease is an X-linked neurologic disorder of myelin metabolism and is characterized by early impairment of motor development (during the first 3 months of life) and later by the development of abnormal movements and progressive spastic paraplegia (39). Several regions of the protein were identified, including the extracellular domain (peptides 46-53 and 193-229) and the cytoplasmic domain (peptides 99-105, 106-122, and 128-137).
In summary, 1783 non-redundant proteins were identified in the human frontal cortex with 95% confidence, including 812 proteins identified with multiple peptides. An analytical flow consisting of two-dimensional LC separation and MALDI-TOF/TOF mass spectrometric analysis was applied for an in-depth proteomics identification in four biochemical fractions. The proteins identified by two or more peptides cover a wide range of molecular weight and pI. The main molecular functions of these proteins include protein binding, catalytic activity, transport, structure, and signal transduction. In addition, 140 cortical proteins were found in the CSF proteome, many of them related to disease mechanisms. Taken together, the proteins identified in the present study provide important fundamental information for the ultimate characterization of cortical function of human brain. Proteins that are functionally important and also present in CSF could be considered as potential biomarker candidates.

* This work was supported by National Institutes of Health Grants R01AG025327, R01ES012703, AG08671, and AG025688 (to several investigators in multiple medical centers) as well as by a Shaw Endowment (to J. Z.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.