A Proteomic Analysis of Human Hemodialysis Fluid

The vascular compartment is an easily accessible compartment that provides an opportunity to measure analytes for diagnostic, prognostic, or therapeutic indications. Both serum and plasma have been analyzed extensively by proteomic approaches in an effort to catalog all proteins and polypeptides. Limitations of such approaches in obtaining a comprehensive catalog of proteins include the fact that a handful of proteins constitute over 90% of plasma protein content and that the renal glomeruli filter out proteins and polypeptides that are smaller than 66 kDa from blood. We chose to study hemodialysis fluid because it contains a higher concentration of small proteins and polypeptides and is also simultaneously depleted of the most abundant proteins present in blood. Using gel electrophoresis in combination with LC-MS/MS, we identified 292 proteins of which greater than 70% had not been previously identified from serum or plasma. More than half of the proteins identified from the hemodialysis fluid were smaller than 40 kDa. We also found 50 N-terminally acetylated peptides that allowed us to unambiguously map the N termini of mature forms of the corresponding proteins. Several identified proteins, including cytokines, were only present as predicted transcripts in data bases and thus represent novel proteins. The proteins identified in this study could serve as biomarkers in serum using more sensitive methods such as ELISA-specific antibodies.


Molecular & Cellular Proteomics 4:637-650, 2005.
A comprehensive analysis of human serum and plasma has proven to be difficult, especially for low molecular weight and low abundance proteins, because of the wide range of concentrations with the 10 most abundant proteins constituting almost 90% of the serum proteome by mass (1). The dynamic range issue is exacerbated for low molecular weight species because the kidneys filter away molecules with molecular mass of less than 66 kDa (2). The abundant proteins such as albumin, immunoglobulins, and transferrin hamper identification of less abundant proteins. Although several methods including ultracentrifugation (3), immunodepletion, solvent extraction/precipitation (4), and size exclusion chromatography (5) have been tried for removal of abundant components, it is still difficult to completely eliminate the interference from residual amounts of these abundant proteins.
Hemodialysis fluid has previously been used as a source of polypeptides (6) due to its higher concentration of low molecular weight components, but no comprehensive list of constituents present in the hemofiltrate has yet been published. We chose to analyze hemodialysis fluid because it is greatly reduced in the protein content, from 70 g/liter in the plasma to ~70 mg/liter. This is because the filtration cutoff used during hemodialysis results in selective depletion of proteins greater than ~60 kDa. It has been shown that the concentration of albumin in the hemodialysis fluid is 5,000-fold lower compared with its normal concentration in serum (7).
Our strategy involved one-dimensional gel electrophoresis separation and in-gel digestion of proteins followed by LC-MS/MS for identifying proteins in the hemodialysis fluid. Using this approach, we identified 292 proteins of which 205 had never been previously reported in the serum or plasma. Western blot analysis of a subset of these proteins revealed that they were also present in normal serum, indicating that the sensitivity of detection might be the major reason why the majority of these proteins have never been identified previously in serum or plasma. We were also able to identify the N termini of a number of proteins based on peptide sequences that were acetylated at their N termini. A number of semitryptic peptides that were identified in this study were most likely derived from in vivo proteolysis. We were also able to identify a number of novel proteins including several cytokines in this analysis. The lack of a major overlap between the list of proteins identified in this study and previously reported proteins in serum or plasma reflects the difficulty of identifying these components using current proteomic methods. As demonstrated by our study, it is likely that more sensitive methods such as Western blotting or ELISA will be able to detect these proteins in serum or plasma.

Sample Collection and Preparation and Reagents-The hemodialysis fluid was obtained from a 68-year-old white male with acute renal failure following a coronary artery bypass surgery. The patient was not known to have any other diseases. Vascular access was obtained utilizing a dual lumen catheter in the femoral position. Continuous venovenous hemodialysis was performed with a Prisma® dialysis machine (Cobe Renal Intensive Care, Lakewood, CO). The dialysis membrane was an acrylonitrile and sodium methallyl sulfonate copolymer, AN69® (Cobe Renal Intensive Care). Sterile dialysate was prepared in 5.0-liter aliquots (140 meq/liter sodium, 111 meq/liter chloride, 3.5 meq/liter calcium, 3 meq/liter lactate, 1 meq/liter magnesium, 32 meq/liter bicarbonate, 2.0 meq/liter potassium), and the dialysis was carried out with a blood flow of 180 ml/min and a dialysate flow of 1.0 liter/hr. The spent dialysate was collected in 5.0-liter aliquots and analyzed after concentrating using a 3-kDa-cutoff filter (Centricon YM3, Millipore). The protein content in the hemodialysis fluid was measured using a modified Lowry protein assay kit (Bio-Rad). The concentrated hemodialysis fluid was run on precast NuPage 4-12% bis-tris and 10-20% Tricine minigels (Invitrogen). The gels were silver-stained (8), and between 20 and 25 visible protein bands were excised. Protein gel bands were reduced with dithiothreitol (Fluka, Buch, Switzerland) and alkylated with iodoacetamide (Sigma) before digestion with trypsin (Promega, Madison, WI) in 100 mM NH4HCO3 (Fluka). Solvents for liquid chromatography included heptafluorobutyric acid (Sigma), glacial acetic acid (Fisher Scientific), and HPLC-grade acetonitrile (J. T. Baker Inc.).
Trypsin Digestion and LC-MS/MS Analysis-The excised gel slices were digested with trypsin as follows. The gel bands were washed twice in 0.1 M NH4HCO3, washed twice in 50% acetonitrile, and subsequently cut into 2 × 2-mm pieces. The gel pieces were shrunk using 100% acetonitrile, and proteins were reduced by addition of 0.1 M dithiothreitol followed by an incubation step at 56°C for 45 min. The washing procedure described above was repeated, and proteins were alkylated by adding 55 mM iodoacetamide and incubating for 30 min at room temperature in the dark. After an additional wash and shrinkage, 10 ng/µl trypsin in 0.1 M NH4HCO3 sufficient to cover the gel pieces was added followed by an incubation on ice for 20 min. When the gel pieces were completely rehydrated, any excess trypsin solution was removed and replaced by 0.1 M NH4HCO3, and samples were incubated overnight at 37°C. The digestion was stopped by adding 10 µl of glacial acetic acid, and the supernatant containing the tryptic peptides was harvested. An extraction step was carried out to recover the peptides from the gel slices by adding 50% acetonitrile and incubating at room temperature for 30 min. The supernatant was harvested again and pooled. The pooled peptide extracts were dried down to ~10 µl and subjected to LC-MS/MS analysis as follows. Samples were injected onto a 5-cm C18 trap column (inner diameter, 75 µm) packed with YMC ODS-A 5-15-µm beads (Kanematsu USA Inc., New York, NY) using an autosampler (1100 microwell plate autosampler, Agilent Technologies, Palo Alto, CA). The peptides were eluted from the trap column onto an analytical 10-cm C18 column (inner diameter, 75 µm) packed with Vydac MS218 5-µm beads (Vydac, Columbia, MD) with a gradient increasing from 10% solvent B, 90% solvent A (solvent A: 0.4% acetic acid, 0.005% heptafluorobutyric acid; solvent B: 90% acetonitrile, 0.4% acetic acid, 0.005% heptafluorobutyric acid) to 45% solvent B, 55% solvent A in 30 min.
A flow of 4 µl/min during loading and 300 nl/min during elution was delivered by a nanoflow pump (Agilent Technologies 1100 nanopump). The LC setup was connected to either a quadrupole-time-of-flight mass spectrometer (QTOF API-US, Micromass, Manchester, UK) or an ion trap mass spectrometer (LC/MSD Trap XCT, Agilent Technologies) using nanoelectrospray sources from Proxeon (Odense, Denmark).
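The solvent composition at any point in the elution can be worked out from the gradient endpoints. The following is a minimal sketch, assuming the ramp from 10% to 45% solvent B is linear over the 30 min (the text states only the endpoints and the duration) and that solvent A contributes no acetonitrile; the function names are illustrative and not part of any instrument software.

```python
def percent_b(t_min, start_b=10.0, end_b=45.0, duration_min=30.0):
    """Percent solvent B at time t_min, assuming a linear ramp (assumption)."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time outside the gradient window")
    return start_b + (end_b - start_b) * t_min / duration_min


def percent_acetonitrile(t_min, acn_in_b=90.0):
    """Overall acetonitrile content; solvent B is 90% acetonitrile."""
    return percent_b(t_min) * acn_in_b / 100.0


for t in (0, 15, 30):
    print(t, percent_b(t), percent_acetonitrile(t))
```

Under these assumptions the mobile phase runs from 9% to 40.5% acetonitrile over the course of the elution.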
Western Blotting-For Western blotting experiments, serum from a healthy person was first depleted of albumin, IgG, IgA, transferrin, haptoglobin, and antitrypsin using an Agilent Technologies multiple affinity removal kit. 20 µl of serum were diluted to 100 µl in Buffer A (Buffer A and Buffer B as supplied in the multiple affinity removal kit by Agilent Technologies) prior to loading onto the column at 250 µl/min in Buffer A. The elution from the column was monitored at 280 nm, and the depleted serum was collected in the interval from 2 to 4 min. Cleaning of the column was carried out by eluting the bound proteins as follows. 10 min after the injection, Buffer A was exchanged with Buffer B, and the flow rate was simultaneously increased to 1,000 µl/min. Following this, Buffer B at 1,000 µl/min was continued for 18 min after which the conditions were returned to the initial loading conditions. After conditioning the column under loading conditions for 5 min, the system was ready for a new depletion process.
Depleted serum was resolved by SDS-PAGE, and the proteins were transferred onto a nitrocellulose membrane. The membrane was blocked at 4°C overnight with 5% BSA in phosphate-buffered saline containing 0.1% Tween 20. The membrane was incubated with the relevant antibodies for 2 h, washed, and incubated with horseradish peroxidase-conjugated secondary antibody for 1 h. The proteins were visualized using enhanced chemiluminescence detection according to the manufacturer's instructions (Amersham Biosciences). The sources of primary antibodies were as follows: Cathepsin D, connective tissue growth factor, Galectin 3, and Lipocalin 2 (R&D Systems, Inc. Minneapolis, MN); Nucleophosmin 1 (Zymed Laboratories Inc.); α-defensins 1-3 (BD Biosciences); Cathepsin H (Serotec, Oxford, UK); and Cofilin 2 (Upstate Biotechnology, Lake Placid, NY). Conjugated secondary anti-mouse and anti-rabbit were from Amersham Biosciences, and anti-goat antibody was from Santa Cruz Biotechnology (Santa Cruz, CA).
Data Base Search and Analysis-The mass spectrometry data files from individual LC-MS/MS experiments were merged and then searched against the RefSeq data base (human: 27,975 entries; July 28, 2004) using Mascot (Matrix Science Ltd., London, UK). The search parameters were as follows: mass accuracy of the monoisotopic precursor and peptide fragments was set to 1.5 and 0.5 Da, respectively, for the data acquired on the ion trap mass spectrometer and 0.2 and 0.2 Da, respectively, for the data acquired on the quadrupole-time-of-flight mass spectrometer. The following variable modifications were permitted: oxidation of methionine, histidine, and tryptophan residues; N-terminal acetylation of proteins; and cyclization of N-terminal glutamine. Two missed tryptic cleavages were allowed. Additional searches using data acquired on the quadrupole-time-of-flight mass spectrometer were performed with the following criteria: 1) semitryptic constraints, 2) N-acetylhexosamine modification of asparagine residues, and 3) hydroxylation of proline residues. For validation purposes, the retrieved peptide sequences were divided into three groups as follows: 1) peptides with a low score (quadrupole-time-of-flight data, <25; ion trap data, <30), 2) peptides with intermediate to high scores (quadrupole-time-of-flight data, >25; ion trap data, >30), and 3) proteins identified with only one intermediate or high scoring peptide. All of the low scoring peptides were discarded. All intermediate and high scoring peptides as well as single peptide hits used to identify proteins were manually validated if the following criteria were met: 1) several consecutive y-ions, allowing for the absence of y-ions after proline and glycine, 2) the existence of lower a- and b-ions, 3) no or few unassigned fragment ions, and 4) a charge state of the precursor ion and fragment ions that is in accordance with the basic amino acids in the assigned peptide sequence. Supplemental Fig. 6 provides the MS/MS spectra of all proteins that were identified on the basis of a single peptide.
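The score-based triage described above can be sketched in a few lines. This is an illustration only, not the authors' pipeline and not a Mascot API: the instrument keys, function names, and tuple layout are hypothetical, and a score exactly equal to the cutoff is assumed to be retained because the text leaves that boundary unspecified. The manual spectrum validation step is not modeled.

```python
# Score cutoffs quoted in the text, per instrument (keys are hypothetical labels).
SCORE_CUTOFF = {"qtof": 25, "ion_trap": 30}

# (precursor, fragment) mass tolerances in Da quoted in the text.
MASS_TOLERANCE_DA = {"qtof": (0.2, 0.2), "ion_trap": (1.5, 0.5)}


def precursor_within_tolerance(observed_mass, theoretical_mass, instrument):
    """True if the precursor mass error is inside the instrument's window."""
    precursor_tol, _fragment_tol = MASS_TOLERANCE_DA[instrument]
    return abs(observed_mass - theoretical_mass) <= precursor_tol


def triage(peptide_hits):
    """Discard low-scoring peptides and flag single-peptide proteins.

    peptide_hits: iterable of (protein, peptide, score, instrument) tuples.
    Returns {protein: {"peptides": [...], "single_hit": bool}}.
    """
    kept = {}
    for protein, peptide, score, instrument in peptide_hits:
        # Assumption: a score exactly at the cutoff is kept.
        if score < SCORE_CUTOFF[instrument]:
            continue  # low-scoring peptide: discarded
        kept.setdefault(protein, []).append(peptide)
    return {
        protein: {"peptides": peptides, "single_hit": len(peptides) == 1}
        for protein, peptides in kept.items()
    }
```

Proteins flagged with `single_hit` correspond to the third group in the text, whose spectra are the ones shown in Supplemental Fig. 6.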

RESULTS AND DISCUSSION
Our strategy to characterize the hemodialysis fluid proteome by mass spectrometry using a gel LC approach was as follows. The hemodialysis fluid was first desalted and concentrated using a 3-kDa-cutoff filter. For fractionating the proteins and polypeptides in the hemodialysis fluid, the desalted sample was resolved by SDS-PAGE and silver-stained, and the bands were excised, digested with trypsin, and analyzed by LC-MS/MS. Supplemental Table 1 lists all the proteins that were identified in this study along with the number of peptides that matched each protein, molecular weight, and assignments of the approved Human Genome Organization (HUGO) gene symbols, wherever available. Supplemental Table 1 provides a count of how many gene products were actually identified from this sample as the count on the basis of proteins can be misleading because of the presence of a large number of protein isoforms.

Hemodialysis Fluid Proteome Versus the Plasma Proteome
Of the proteins identified in this study, 85% were smaller than 60 kDa, and more than half were <30 kDa. This is expected because of the hemodialysis process in which blood is filtered through a membrane that has a molecular mass cutoff of ~60 kDa, which should only allow proteins less than 60 kDa into the hemodialysis fluid. Fig. 1 shows a histogram of the number of proteins identified plotted against the corresponding molecular masses. Most of the proteins are found between 15 and 30 kDa. Superimposed on the histogram is a similar plot from the combined studies by Anderson et al. (9) and Chan et al. (10). Those two studies resulted in more than 2,100 proteins of which ~50% were below 65 kDa, and only 16% were below 30 kDa. Another distinguishing feature of the hemodialysis fluid data set is the absence of the 40-80-kDa hump that is conspicuous in the serum/plasma proteome data set. A subset of the serum/plasma proteins were identified by Tirumalai et al. (3), who used serum that had been filtered through a 30-kDa filter prior to analysis. In terms of the number of proteins identified, this data set (340 proteins) is comparable to our hemodialysis data set (292 proteins), although significant differences are observed. Comparing the molecular weight distribution of the proteins identified in these two studies, it is clear that a considerably greater fraction of the proteins found in the hemodialysis fluid (56% of proteins with molecular mass <30 kDa) had a lower molecular weight than in the analysis by Tirumalai et al. (18% of proteins with molecular mass <30 kDa).
We compared the known serum/plasma proteins with those identified in this study to determine the degree of overlap based on the genes that encoded the expressed proteins and their isoforms. We first mapped all the protein entries to HUGO-approved gene symbols wherever possible (www.gene.ucl.ac.uk/nomenclature/), which allowed us to eliminate redundancy and facilitated this comparison. Although the serum/plasma catalog contained more than 7 times as many protein entries, it was remarkable that 205 of 292 proteins found in the hemodialysis fluid were not reported in the serum/plasma proteome. Supplemental Fig. 1 shows a distribution of proteins identified in this study according to their molecular function. Table I lists the proteins that were identified from hemodialysis fluid but not previously reported in serum/plasma. Not surprisingly, several of these proteins are of low molecular mass, validating our use of hemodialysis fluid to identify smaller proteins and polypeptides. Fig. 2 shows a Venn diagram indicating the relative numbers of overlapping and non-overlapping gene products.
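The gene-level comparison amounts to collapsing protein entries onto gene symbols and then taking set differences. The sketch below illustrates the idea under stated assumptions: the accession-to-symbol mapping is supplied as a plain dictionary, entries without an approved symbol are simply skipped, and the accessions and symbols in the example are placeholders rather than data from this study.

```python
def gene_level_overlap(study_entries, reference_entries, symbol_of):
    """Collapse protein entries to gene symbols, then compare the two sets.

    symbol_of maps a protein accession to its gene symbol; entries that are
    missing from the mapping are ignored (an assumption of this sketch).
    """
    study = {symbol_of[e] for e in study_entries if e in symbol_of}
    reference = {symbol_of[e] for e in reference_entries if e in symbol_of}
    return {
        "shared": study & reference,
        "study_only": study - reference,
        "reference_only": reference - study,
    }


# Placeholder data: two isoforms (acc_1, acc_2) map to the same gene.
symbol_of = {"acc_1": "GENE_A", "acc_2": "GENE_A", "acc_3": "GENE_B", "acc_4": "GENE_C"}
overlap = gene_level_overlap(["acc_1", "acc_2", "acc_3"], ["acc_2", "acc_4"], symbol_of)
print(sorted(overlap["study_only"]))
```

Mapping to gene symbols first means that isoforms of one gene are counted once. With the counts reported above, 205 of 292 corresponds to about 70% of the hemodialysis fluid proteins being absent from the serum/plasma catalog.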
It can be argued that this difference observed between serum and hemodialysis fluid is attributable to the underlying condition in the patient undergoing dialysis. In that case, the proteins specific to our list of hemodialysis fluid would not be observed in serum from normal individuals. It should be noted that the components that are normally filtered away by the kidneys are excreted in the urine. Therefore, we also compared our results with two proteomic analyses of the human urinary proteome (11,12) with a total of 358 proteins. Of the proteins specifically found in hemodialysis fluid, 29 of those have been identified from urine of normal individuals. This indicates that a significant fraction of the proteins are indeed found in normal serum/plasma but have not been identified earlier because of the technical limitations mentioned earlier.

Western Blot Analysis to Confirm Expression in Normal Serum
To test whether the expression of the proteins found in hemodialysis fluid was also detectable in normal serum, we obtained antibodies that work in Western blotting experiments against a subset of proteins. As shown in Fig. 3, we were able to observe the expression of Galectin 3, Cofilin 2, Cathepsin H, α-defensins 1 and 3, Cathepsin D, Nucleophosmin 1, connective tissue growth factor, and Lipocalin 2 in normal serum. Thus, although we cannot formally rule out the possibility that some of the proteins are found in the serum because of the underlying disease of the patient undergoing hemodialysis, we think that the majority of the proteins identified in this study are normal components of serum/plasma and have not been previously identified due to their lower concentrations in the blood because of clearance by the kidneys.

Secreted Proteins Identified from the Hemodialysis Fluid
We identified several proteins that were either known or predicted to be secreted proteins. Below we will discuss a subset of these proteins that are likely to be especially interesting from a biological perspective. Many of them have not been previously identified in serum/plasma. All accession numbers are from the RefSeq data base.
A Novel Predicted Osteoblast Protein (FAM3C; NP_055703)-This protein was initially identified from genomic data bases using structure-based methods in search of novel "four-helix bundle" cytokines. Four genes were identified in this family, although functional prediction has only been done for a related protein encoded by a related gene, FAM3B, which inhibits basal insulin secretion. This novel protein encoded by the FAM3C gene is expressed in all the tissues examined but was named as a predicted osteoblast protein as it was initially observed in osteoblasts. It contains an N-terminal signal peptide and no transmembrane domain, which suggests that it is a secreted protein (13).

Fatty Acid-binding Protein 3 (FABP3) (NP_004093)-This was originally identified by screening a human adult muscle λgt11 expression library with an antibody to muscle FABP (14). The same molecule was also identified as mammary-derived growth inhibitor based on its activity as a growth inhibitor in lactating bovine mammary gland. This protein plays a role in the intracellular transport of long-chain fatty acids and their acyl-CoA esters. FABP3 is a candidate tumor suppressor gene involved in breast cancer (15).
Antrum Mucosa Protein 18 kDa (NP_062563)-This molecule, also designated as CA11, was initially isolated using differential display in human gastric cancer tissue. The expression of CA11 was observed to be down-regulated in gastric cancer tissue as compared with the normal gastric mucosa. Northern blot and RACE indicated that it is predominantly expressed in stomach and at low levels in the uterus and placenta (16). It has been suggested that the loss of its expression in gastric tissues may play an important role in gastric carcinogenesis (17). This protein is localized to the secretory granules of mucosal epithelium lining the stomach lumen (18).
Resistin (NP_065148)-Resistin (resistance to insulin) belongs to a family of proteins that is involved in inflammatory processes and in regulating metabolism. Human resistin (also referred to as FIZZ3) was identified by searching sequence data bases with a related mouse protein called FIZZ1 (19). The expression of resistin is induced during adipogenesis, and it is normally secreted by adipocytes. Elevated resistin levels are seen in serum in both genetic and diet-induced obesity suggesting that it could potentially link obesity to diabetes (20).
Dermcidin (NP_444513)-Dermcidin is a novel human antimicrobial peptide secreted by the sweat glands. Screening of a subtracted cDNA library of primary melanoma and benign melanocytic nevus tissues with cDNA arrays led to isolation of dermcidin. It has been shown to possess antimicrobial activity against Escherichia coli, Enterococcus faecalis, Staphylococcus aureus, and Candida albicans (21). It may also play a role in tumorigenesis by enhancing cell growth and survival in a subset of breast carcinomas (22).
α-Defensin 1 (NP_004075)-Defensins are a family of microbicidal and cytotoxic polypeptides involved in host defense. α-Defensin 1 was identified by screening a cDNA library constructed from HL-60, a human promyelocytic leukemia cell line, with an oligonucleotide probe based on the C-terminal sequence of human neutrophil peptides (23). It was found in different tissues including bone marrow, blood, neutrophils, and plasma (24). α-Defensins have been shown to inhibit the replication of HIV-1 (25).
α-Defensin 3 (NP_005208)-Several proteins secreted by activated CD8+ T cells from long term non-progressors with HIV-1 were identified. One of them was α-defensin 3 encoded by the DEFA3 gene (25). α-Defensins 1, 2, and 3 collectively account for much of the anti-HIV-1 activity of CD8 antiviral factor that is not attributable to β-chemokines (24). It is known to be expressed in bone marrow, leukocytes, and neutrophils (24,26). All the family members are known to be present in plasma (27).
Thymosin β 10 (NP_066926)-Thymosin β 10 was isolated from a kidney cDNA library and is an actin-sequestering protein (28). Thymosin β 10 has been shown to be a putative progression marker for human cutaneous melanoma (28). It is likely that this protein is released into the plasma after lysis of cells.
Chromosome 19 Open Reading Frame 10 (NP_061980)-A novel secreted protein was identified in a murine system using an expression cloning strategy (29). We have identified the human ortholog of this murine secreted protein. The function of this protein is not known, although its sequence suggests that it is likely to be a cytokine.

Semitryptic Versus Full Tryptic Data Base Searching
All of the 292 proteins in this study were identified by searching data bases using tryptic constraints for peptides. However, one would also expect to observe fragments derived after proteolysis in vivo. Thus, we tested this by searching a subset of our data having the highest mass accuracy (data from quadrupole-time-of-flight mass spectrometer) with semitryptic constraints. Searching our data with semitryptic instead of fully tryptic constraints resulted in a higher total score for nearly all of the proteins identified. However, for most of the entries, the higher score was a result of the contribution of low scoring peptides that did not pass our validation criteria. Semitryptic high scoring peptides that passed our manual validation are listed in Table II. Although some of the peptides presented in Table II could result from in-source fragmentation of labile proline-containing peptides (30), we expect that the majority arise from proteolytic cleavage events that occurred in vivo. Fig. 4A shows an example of an MS/MS spectrum of a semitryptic peptide derived from Gelsolin.
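The distinction between fully tryptic and semitryptic peptides can be made concrete with a short sketch. It assumes the simplest cleavage rule (trypsin cuts after lysine or arginine, with no exception for a following proline), treats the protein termini as valid peptide ends, and considers only the first occurrence of the peptide in the protein. The protein sequence in the example is a made-up string, not a sequence from this study.

```python
def classify_peptide(protein, peptide):
    """Return 'tryptic', 'semitryptic', or 'nonspecific' for a peptide.

    A terminus is considered tryptic if it lies at a K/R cleavage site or at
    the corresponding end of the protein (simplified rule, see lead-in).
    """
    start = protein.find(peptide)
    if start < 0:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)
    n_term_ok = start == 0 or protein[start - 1] in "KR"
    c_term_ok = end == len(protein) or peptide[-1] in "KR"
    if n_term_ok and c_term_ok:
        return "tryptic"
    if n_term_ok or c_term_ok:
        return "semitryptic"
    return "nonspecific"


# Hypothetical protein sequence used only for illustration.
toy_protein = "MAGTKLSPEARVDNQWKTFYHR"
for pep in ("LSPEAR", "SPEAR", "VDNQ", "DNQW"):
    print(pep, classify_peptide(toy_protein, pep))
```

A semitryptic search accepts peptides in which only one end satisfies the enzyme rule, which is why it can pick up fragments whose other end was generated by a different protease in vivo, at the cost of a larger search space and more low scoring matches.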

Proteolytic Activity in Hemodialysis Fluid
Theoretically the hemodialysis fluid should not contain proteins greater in size than 60 kDa. Nevertheless it is possible for larger proteins to be detected in the hemodialysis fluid if they undergo proteolytic cleavage in vivo. Thus, we decided to examine the distribution of peptides in greater detail. Of the proteins greater than 90 kDa, more than half exhibited a grouping of several peptides in either the N or C terminus of the protein, suggesting the presence of a fragment or isoform. Smaller fragments of proteins can be the result of proteolytic activity in plasma. Fig. 5 shows five examples of such groupings within a larger protein. For most of the proteins shown in Fig. 5, all peptides are clustered in regions that constitute only 4-8% of the whole protein sequence. For the 260-kDa protein fibronectin 1 and the 189-kDa protein complement component 3, a clustering of peptides was found in the middle of the proteins. This observation could be explained if these proteins were cleaved twice resulting in the observed fragment. For plasminogen (93 kDa) and desmocollin 1 (100 kDa), a clustering of peptides was observed in the N-terminal part of the protein, whereas a C-terminal clustering was observed in the case of Perlecan (480 kDa). Furthermore all of the above mentioned proteins were identified from bands that were at least a third of their expected molecular masses. For instance, fibronectin 1 was identified from the 15-19-kDa gel band, and desmocollin 1 was identified from the 6-15-kDa gel band. It is also worth noting that although several proteolytic cleavage sites are located in the C-terminal region of plasminogen (31), our data suggest that a cleavage occurred in the N terminus of the protein to generate Angiostatin (32), an important player in the regulation of angiogenesis (33).
The probability of clustering of five peptides within a region that is 5% of the total protein length is 10^-6, strongly suggesting that we are observing true in vivo proteolytic events. Of the above described proteins, all were retrieved from protein bands migrating in the ~6-15-kDa region, except for complement component 3, which was retrieved from the ~60-80-kDa region. For the latter this observation is in support of a single cleavage rather than a multicleavage event and is in agreement with literature where C3 convertase has been reported to cleave complement component 3 at residue 748, generating C3a anaphylatoxin (34).
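The order of magnitude of such a clustering probability can be checked with a simple null model. The text does not state how the figure was derived, so the sketch below is an assumption: peptide positions are treated as independent and uniformly distributed along the protein, peptide length is ignored, and two readings are computed. The first is the chance that all peptides fall in one prespecified window; the second is the chance that they fall within any window of that width, which follows from the distribution of the range of n uniform points.

```python
def prob_all_in_fixed_window(n_peptides, window_fraction):
    """P(all n uniform positions fall inside one prespecified window)."""
    return window_fraction ** n_peptides


def prob_all_in_any_window(n_peptides, window_fraction):
    """P(the n uniform positions span no more than the window width).

    Uses the range distribution of n uniform points on [0, 1]:
    P(range <= w) = n * w**(n - 1) - (n - 1) * w**n.
    """
    n, w = n_peptides, window_fraction
    return n * w ** (n - 1) - (n - 1) * w ** n


fixed = prob_all_in_fixed_window(5, 0.05)
anywhere = prob_all_in_any_window(5, 0.05)
print(f"{fixed:.3e} {anywhere:.3e}")
```

For five peptides and a 5% window the two values are about 3 × 10^-7 and 3 × 10^-5, which bracket the figure quoted above. Either way, chance clustering of this kind is very unlikely under the uniform model.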

Post-translational Modifications
Many proteins and peptides found in plasma and hemodialysis fluid have undergone a number of changes since the formation of nascent polypeptide chains. These changes have to be taken into account when interpreting the mass spectrometry data partly to get a better description of the biological sample and partly to avoid false-positive identifications. Basically the modifications can be grouped into enzyme-catalyzed changes and spontaneously occurring modifications. A large number of peptides were oxidized at methionine and tryptophan residues or exhibited pyroglutamate formation by cyclization of an N-terminal glutamine residue. It is difficult to infer any biological significance from these spontaneously occurring modifications, and they will not be discussed in detail. We searched the data from the quadrupole-time-of-flight mass spectrometer to identify some of the commonly occurring post-translational modifications. Table III. We were able to find two different N termini for peptidylprolyl isomerase A; in one instance, the initiator methionine was acetylated, and in another, the initiator methionine was removed and the next amino acid, valine, was acetylated. Fig. 4B shows an MS/MS spectrum of an acetylated peptide derived from thymosin β 10 protein. Our data on N-terminal acetylation are in good agreement with the preference of methionine aminopeptidases that the ultimate amino acid is small or unmodified. Of the 43 acetylated proteins, one N-acetylated peptide was found in the middle of a predicted protein (XP_371848), suggesting that this could be a wrongly predicted protein. Indeed a Blast analysis of orthologous proteins confirms our new assignment of the translational initiation site (see Supplemental Fig. 2), which is in agreement with sequence conservation in five different species beginning at this methionine. Supplemental Fig. 3 shows the MS/MS spectra of all peptides containing an acetylated residue.

N-Glycosylation of Peptides
Three peptides were found to be covalently modified with an N-acetylhexosamine moiety, which corresponds to a mass increase of 203 Da (Table III and Supplemental Fig. 4). The CID-induced fragmentation spectrum of the triply charged ion with m/z 711.396 is shown in Fig. 4C. The N-acetylhexosamine gives rise to an intense oxonium ion at m/z 204, and further fragmentation of the oxonium ion gives rise to ions with m/z 186, 168, 138, and 126 (see Fig. 4C, inset). Interpretation of the mass spectrum revealed that the second asparagine in the peptide was covalently linked to an N-acetylhexosamine moiety. This modified asparagine occurs in the glycosylation consensus sequence NX(S/T) where X can be any amino acid. This is an unusual modification because the common N-glycan is cotranslationally transferred to the asparagine residue en bloc as the oligosaccharide GlcNAc2Man9Glc3 and seldom trimmed beyond the trimannosyl-chitobiose core. It is interesting that only three peptides are found with this modification and that they all come from a protein called prosaposin. Because N-glycosylation takes place only on asparagines in the glycosylation sequon NX(S/T), there will always be an adjacent serine or threonine. It is difficult to find peptides where O-GlcNAc is at least not a theoretically possible explanation (except in the rare instances where X is arginine or lysine, and trypsin cleavage is not hindered by the glycan moiety). However, there is a doubly charged ion in the spectrum that shows, even though it is small, that asparagine is the site of modification.
Prosaposin is a 524-amino acid glycoprotein that gives rise to four saposins that are predominantly localized in the late endosomal/lysosomal compartment. A large number of endo- and exoglycosidases are also present in the lysosome, and it is possible that the extensive trimming of the glycan down to a single N-acetylglucosamine takes place there. Whether this trimming of the glycan structure has any biological significance is not known, but it should be noted that two cases of metachromatic leukodystrophy have been reported where the mutations in the glycosylation sequon, either N215H (36) or T217I (37) corresponding to glycosylation sequon in the identified peptide TNSTFVQALVEHVKEECDR, led to a dysfunctional saposin B protein emphasizing the importance of glycosylation.
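Locating the NX(S/T) sequon in a peptide is a simple pattern search. The sketch below follows the definition given in the text, with X allowed to be any residue, and reports positions relative to the peptide rather than the full protein; the function name is illustrative.

```python
import re

# N-X-(S/T) with X any residue, as defined in the text. The lookahead makes
# the match zero-width so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=(N.[ST]))")


def find_sequons(sequence):
    """Return (1-based position of the asparagine, motif) for each sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence)]


print(find_sequons("TNSTFVQALVEHVKEECDR"))  # [(2, 'NST')]
```

In the prosaposin peptide the only sequon is NST, with the asparagine and the threonine two residues apart, which matches the spacing of the N215H and T217I mutations mentioned above.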

Proline Hydroxylation
A number of structural molecules in the extracellular matrix are known to undergo extensive post-translational modifications. Hydroxylation of prolines by prolyl hydroxylase is a common modification of collagen that confers structural stability to the collagen triple helix (38). We identified one proline hydroxylation site in fibrinogen, two sites in collagen α 2 type I, and six sites in collagen α 1 type I (see Fig. 4D for the MS/MS spectrum of a peptide containing two hydroxylated proline residues). Table IV lists all the hydroxylation sites identified in this study, and the MS/MS spectra are shown in Supplemental Fig. 5.
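Each of the modifications considered in this study corresponds to a fixed mass shift that the search engine adds to the unmodified peptide mass. As a rough check, the shifts can be computed from elemental compositions. The compositions and monoisotopic atomic masses below are standard values supplied for this sketch, not figures taken from the text, and the dictionary keys are descriptive labels only.

```python
# Monoisotopic atomic masses in Da (standard values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Net elemental change for each modification; negative counts are losses.
MODIFICATIONS = {
    "oxidation or hydroxylation": {"O": 1},
    "N-terminal acetylation": {"C": 2, "H": 2, "O": 1},
    "N-acetylhexosamine": {"C": 8, "H": 13, "N": 1, "O": 5},
    "pyroglutamate from N-terminal Gln": {"N": -1, "H": -3},
}


def delta_mass(composition):
    """Monoisotopic mass shift for a net elemental composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


for name, composition in MODIFICATIONS.items():
    print(f"{name}: {delta_mass(composition):+.4f} Da")
```

The N-acetylhexosamine shift rounds to the 203 Da quoted in the glycosylation section, and proline hydroxylation and methionine oxidation share the same shift of one oxygen atom, so they are distinguished by the residue carrying the modification rather than by mass.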

Conclusions
Using one-dimensional gel electrophoresis and LC-MS/MS, we have identified 292 proteins from hemodialysis fluid of which more than half were proteins smaller than 30 kDa. Analysis of the modified peptides led to identification of 43 N-terminally acetylated proteins and three proteins hydroxylated on prolines. We also found three peptides from prosaposin to be modified with a single HexNAc at two different glycosylation sequons. We were able to map the identified peptides onto larger proteins, which showed groupings of peptides within limited regions. A comparison of our results with previously published studies that examined serum and plasma proteomes showed that two-thirds of the proteins identified in this study had not been identified previously as components of serum or plasma. We feel that this is mainly due to two major contributing factors: the first is the greater dynamic range of protein concentrations in serum/plasma samples, and the second is enrichment of the lower molecular weight proteins in the hemodialysis fluid. The proteins identified in this study will allow further investigations into their detection in serum/plasma and possible use as biomarkers of disease states. This study presents the first comprehensive list of hemofiltrate proteins and in-depth analysis of the post-translational modifications. This proteomic survey is by no means exhaustive, and there are probably many proteins in the low molecular weight region we have not identified. This is not an uncommon phenomenon. In the Anderson et al. (9) list where literature and three proteomic studies have been compared, 196 of 1,275 proteins were reported in more than one study, and only 46 were reported in all four studies. It is currently not possible to extrapolate from the hemofiltrate to plasma or urine constituents, but it represents a little step toward mapping the enormous unknown of the human body fluid proteomes in health and disease.
Acknowledgments-We thank John Kloss for help with data base searching and Raghunath Reddy for creating helpful scripts. We thank Sun Microsystems for providing us a computer cluster under the Academic Equipment Grant Program.