A Serum Glycomics Approach to Breast Cancer Biomarkers

Because the glycosylation of proteins is known to change in tumor cells during the development of breast cancer, a glycomics approach is used here to find relevant biomarkers of breast cancer. These glycosylation changes are known to correlate with increasing tumor burden and poor prognosis. Current antibody-based immunochemical tests for biomarkers of ovarian cancer (CA125), breast cancer (CA27.29 or CA15-3), and pancreatic, gastric, and colonic carcinoma (CA19-9) target highly glycosylated mucin proteins. However, these tests lack the specificity and sensitivity for use in early detection. This glycomics approach to find glycan biomarkers of breast cancer involves chemically cleaving oligosaccharides (glycans) from glycosylated proteins that are shed or secreted by breast cancer tumor cell lines. The resulting free glycan species are analyzed by MALDI-FT-ICR MS. Further structural analysis of the glycans can be performed in FTMS through the use of tandem mass spectrometry with infrared multiphoton dissociation. Glycan profiles were generated for each cell line and compared. These methods were then used to analyze sera obtained from a mouse model of breast cancer and a small number of serum samples obtained from human patients diagnosed with breast cancer or patients with no known history of breast cancer. In addition to the glycosylation changes detected in mice as mouse mammary tumors developed, glycosylation profiles were found to be sufficiently different to distinguish patients with cancer from those without. Although the small number of patient samples analyzed so far is inadequate to make any legitimate claims at this time, these promising but very preliminary results suggest that glycan profiles may contain distinct glycan biomarkers that may correspond to glycan “signatures of cancer.”

deaths from the disease in 2005. Breast cancer detected in its early stages is treatable, but there are currently no United States Food and Drug Administration-approved serum tests for early detection of the disease. Because early symptoms of breast cancer are sometimes absent or not recognized, by the time the cancer is finally diagnosed it is often in an advanced stage of progression and untreatable. Therefore, a method that is capable of finding reliable biomarkers of breast cancer present in patient serum is needed.
Glycosylation of proteins is known to change in breast and other types of cancer (1)(2)(3). Alterations in glycosylation influence growth, differentiation, transformation, adhesion, metastasis, and immune surveillance of the tumor (4). In breast cancer, the presence of increasing concentrations of highly glycosylated proteins (mucins) and other changes in glycosylation correlate with increasing tumor burden and poor prognosis (3). O-Linked glycosylation of the mammary gland is also known to become altered during malignancy in large part due to the changes in mucin glycosylation (5).
Serum biomarkers that have been implicated in breast cancer are mucin 1 (MUC1), carcinoembryonic antigen, and mammaglobin (6,7). The only serum tests approved for use in breast cancer are immunoassay tests for MUC1 (CA27.29 or CA15-3) and carcinoembryonic antigen. Unfortunately these tests lack the sensitivity and specificity for use as screening tests for the early detection of breast cancer; they are not recommended by the Association of Clinical Oncologists (6) and are approved only for monitoring treatment of patients with breast cancer. MUC1 is known as the polymorphic epithelial mucin and is also the target of a test for pancreatic, hepatic, and colon cancers (CA19-9). The polymorphic nature or "heterogeneity" of polymorphic epithelial mucin is largely due to the high amounts of O-linked glycosylation of the tandem repeat elements present in the extracellular carboxyl end of the molecule (8). Using antibodies, the CA27.29 and CA19-9 tests detect different MUC1 antigenic epitopes that correspond to the different types of cancer. Also present on many of these proteins and detectable by antibodies are N-linked oligosaccharides (glycans); this is a different form of glycosylation that is also implicated in cancer (9).
As an alternative to current immunochemical or proteomics methods for finding biomarkers of breast cancer in patient serum, global profiling methods for glycans cleaved from their protein core are being developed (10). The resulting free glycan species can be directly analyzed by mass spectrometry thereby creating a profile of glycans, some of which are biomarkers for breast cancer. This approach seeks to directly identify glycan groups that are linked to any glycosylated protein secreted or shed from the tumor and/or circulating tumor cell without knowledge of the protein core to which it is bound. Because this approach does not focus on the analysis of proteins, the abundant serum proteins such as albumin and immunoglobulins will not interfere with the glycan analysis. Attention is directed solely toward identifying the aberrant cancer glycans associated with tumor cells using high resolution mass spectrometry.
The instrument used for this study was a FT-ICR mass spectrometer with a MALDI source. This instrument is well known for high mass accuracy (<5 ppm with external calibration) and high resolution (>100,000 full width at half-height). Glycans are readily identified solely based on their mass and confirmed by their fragmentation pattern. This research aims to apply state-of-the-art mass spectrometry to direct clinical diagnosis. Another advantage of this approach is that glycans can be directly examined in samples without purification of proteins. It is shown here that it is possible to identify glycan profiles in the conditioned media of breast cancer tumor cell lines, in sera of a mouse model of breast cancer, and in a small number of patient samples.
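The two instrument figures quoted above (mass accuracy in ppm and resolution at full width at half-height) follow from simple ratios. The sketch below is only an illustration of those definitions; the function names and the example m/z values are hypothetical and are not taken from the study.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of a measured m/z, in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     tol_ppm: float = 5.0) -> bool:
    """True if the measured m/z lies within tol_ppm of the theoretical value."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


def resolving_power(mz: float, fwhm: float) -> float:
    """Resolution defined as m/z divided by the peak width at half-height."""
    return mz / fwhm


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the arithmetic.
    print(ppm_error(1000.004, 1000.000))
    print(within_tolerance(1000.006, 1000.000))
    print(resolving_power(1000.0, 0.01))
```

At m/z 1000, a 5 ppm tolerance corresponds to a window of 0.005 mass units on either side of the theoretical value.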

EXPERIMENTAL PROCEDURES
Materials-Cell culture media were purchased from Invitrogen. Sodium borohydride and dihydroxybenzoic acid came from Sigma, and other reagents came from Fisher Scientific.
Growth of Breast Cancer Tumor Cell Lines-Breast cancer tumor cell lines were grown in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum. The MCF-10A cell line was grown in mammary epithelial growth medium (serum-free) obtained from Clonetics and supplemented with 100 ng/ml cholera toxin (Calbiochem). Conditioned media (CM) were harvested from each cell line, concentrated (Vivascience concentrator, 7000 or 10,000 molecular weight cutoff), dialyzed overnight at 4°C (Pierce Slide-A-Lyzer, 7000 or 10,000 molecular weight cutoff), frozen, and lyophilized.
Preparation of Mouse Serum Samples-The mice used for this study were inbred FVB strain mice. They were all transplanted at 5 weeks of age with tiny pieces of mammary tumors that originated from the polyoma middle T antigen (PyMT) mouse model of breast cancer (11). These mice were housed together in the Center for Lab Animal Science animal facility. They were all fed standard diets.
Blood samples were taken by postorbital eye bleeding after Met-1 cells had been transplanted into the mammary fat pads of syngeneic FVB/N mice (12). Over the course of these mice developing mammary tumors, additional blood samples were drawn. All blood samples were immediately processed into serum (Serum Separator Tubes, SST™ 365967, BD Biosciences) and frozen at −20°C. Each sample was thawed (20-30 µl of serum) and subjected directly to chemical cleavage (see below).
Preparation of Patient Serum Samples-Serum samples from individuals without a known history of cancer (n = 4) and pre-existing patient serum samples already tested for CA27.29 (n = 4) were acquired from the Specialty Chemistry unit, UC Davis Medical Center Clinical Laboratories. Serum samples were tested for CA27.29 using the chemiluminescence microparticle immunoassay (Bayer ADVIA Centaur) and then frozen at −20 or −70°C until use. Two hundred microliters of serum were used in the chemical cleavage reaction (see below). Table I shows the medical information obtained for the patients used in this study.
Chemical Cleavage of Oligosaccharides from Glycosylated Proteins-Approximately 2 mg of dialyzed and lyophilized cell line supernatant were weighed or a volume of 200 µl of human serum was transferred to a 15-ml conical polypropylene tube. Glycans were chemically cleaved by β-elimination using treatment with sodium borohydride in sodium hydroxide (10). Sodium borohydride under optimized conditions primarily releases O-linked glycans but may also release portions of N-linked glycans. Alkaline borohydride solution (200 µl, mixture of 1.0 M sodium borohydride and 0.1 M sodium hydroxide) was added, and the mixture was incubated at 42°C for 18 h in a water bath. After reaction, 1.0 M cold hydrochloric acid solution was slowly added to the samples in an ice bath to neutralize the excess sodium borohydride until hydrogen gas evolution ceased (pH = 3-5).
Oligosaccharide Purification by Solid Phase Extraction-Oligosaccharides released by reductive elimination were purified by solid phase extraction (SPE) using a graphitized carbon cartridge (GCC) (Alltech, Deerfield, IL). The cartridge was washed with nanopure water followed by 0.05% (v/v) TFA in 80% ACN in H2O (v/v) and once again with nanopure water. The solution of released oligosaccharides was loaded onto the cartridge. Subsequently the cartridge was washed with nanopure water at a flow rate of about 1 ml/min to remove salts. Glycans in the cell lines were eluted stepwise with 10% ACN in H2O (v/v) and 40% ACN in 0.05% TFA in H2O (v/v). Glycans in the serum samples were eluted stepwise with 10% ACN in H2O (v/v), 20% ACN in H2O (v/v), and 40% ACN in 0.05% TFA in H2O (v/v). Each fraction was collected and concentrated in vacuo prior to MS analysis. The serum samples from the mouse and humans were subjected to digestion using proteases to remove any residual peptides not removed by SPE.
Mass Spectrometric Analysis-Mass spectra were recorded on a FT-ICR mass spectrometer with an external source (HiResMALDI, IonSpec Corp., Irvine, CA) equipped with a 7.0-tesla magnet. The source of the HiResMALDI utilized a pulsed Nd:YAG (neodymium-doped yttrium aluminium garnet) laser (266 nm) for ionization. 2,5-Dihydroxybenzoic acid was used as a matrix (5 mg/100 µl in 50% ACN in H2O (v/v)). A saturated solution of NaCl in 50% ACN in H2O (v/v) was used as a cation dopant. The oligosaccharide solution (1 µl) was applied to the MALDI probe followed by matrix solution (1 µl). The sample was dried under a stream of air prior to mass spectrometric analysis.
Structural Determination Using Infrared Multiphoton Dissociation-Tandem mass spectrometry through infrared multiphoton dissociation (IRMPD) was used to determine the general structures of several oligosaccharides (13). This allowed for comprehensive fragmentation of specific ion species. The ion of interest was readily selected and isolated from the other ions in the ICR cell using an arbitrary waveform generator. A continuous wave Parallax CO2 laser (Waltham, MA) with 20-watt maximum power and 10.6-µm wavelength was installed at the rear of the magnet and provided the infrared photons for IRMPD. The laser beam diameter was 6 mm as specified by the manufacturer and was expanded to ~12 mm by means of a 2× beam expander (Synrad, Mukilteo, WA) to ensure complete irradiation of the ion cloud through the course of the experiment. The laser was aligned and directed to the center of the ICR cell through a BaF2 window (Bicron Corp., Newbury, OH). Photon irradiation time was optimized to produce the greatest number and abundance of fragment ions. The laser was operated at an output of ~13 watts.
Analysis of the Spectra-Mass spectra were analyzed without first correcting the relatively small differences of intensity due to varying sample loadings. The spectra were deconvolved using the ReSpect™ probabilistic data reconstruction method (Positive Probability Limited, Isleham, UK) to give a set of peak tables summarizing the features present. The peak tables were then combined using a purpose-written computer program that identifies features common to two or more of the tables, taking account of any small calibration differences between the spectra. Principal component analysis (PCA) of the resulting table was performed after centering.
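The purpose-written program for combining peak tables is not described in detail, so the following is only a minimal sketch of one way to identify features common to two or more tables. It assumes each peak table is a list of (m/z, intensity) pairs and that two peaks are the same feature when their masses agree within a ppm tolerance; the function name, the greedy clustering rule, and the 10 ppm default are all assumptions made for illustration.

```python
def combine_peak_tables(tables, tol_ppm=10.0, min_tables=2):
    """Merge peak tables into rows of (mean m/z, [intensity per table]).

    Assumed rule: peaks are pooled, sorted by m/z, and greedily grouped
    while each new peak lies within tol_ppm of the running group mean.
    Only groups seen in at least min_tables tables are kept.
    """
    pooled = sorted(
        (mz, intensity, t)
        for t, table in enumerate(tables)
        for mz, intensity in table
    )
    clusters = []
    for mz, intensity, t in pooled:
        if clusters:
            last = clusters[-1]
            ref = last["mz_sum"] / last["n"]
            if abs(mz - ref) / ref * 1e6 <= tol_ppm:
                last["mz_sum"] += mz
                last["n"] += 1
                last["by_table"][t] = last["by_table"].get(t, 0.0) + intensity
                continue
        clusters.append({"mz_sum": mz, "n": 1, "by_table": {t: intensity}})

    rows = []
    for c in clusters:
        if len(c["by_table"]) >= min_tables:
            intensities = [c["by_table"].get(t, 0.0) for t in range(len(tables))]
            rows.append((c["mz_sum"] / c["n"], intensities))
    return rows


if __name__ == "__main__":
    # Hypothetical peak tables; the masses are illustrative only.
    tables = [
        [(617.000, 10.0), (772.000, 5.0)],
        [(617.002, 8.0), (915.000, 3.0)],
        [(617.001, 9.0), (772.003, 4.0)],
    ]
    for mz, intensities in combine_peak_tables(tables):
        print(round(mz, 4), intensities)
```

The resulting rows form the feature-by-sample table on which a centered PCA could then be run. A feature missing from a table is recorded here as zero intensity, which is a simplification.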
Principal component regression (PCR) was performed to predict the cancer status of the patients using an indicator variable (putting 1 for a cancer patient and −1 for no cancer). As an attempt to test the validity of this prediction, the regression was repeated a number of times after inappropriately reassigning the indicator variable.
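As a rough illustration of the regression and reassignment check described above, the sketch below regresses a +1/−1 indicator on the leading principal component scores and repeats the fit for every alternative balanced labeling. It assumes NumPy is available; the function names, the use of the fitted-versus-observed correlation as the figure of merit, and the choice to enumerate all reassignments are assumptions, not the procedure actually used in the study.

```python
from itertools import combinations

import numpy as np


def _corr(a, b):
    """Pearson correlation that returns 0.0 when either vector is constant."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    return float(a @ b / denom) if denom > 0 else 0.0


def pcr_correlation(X, y, n_components):
    """Fit y on the first n_components PCA scores of centered X.

    Returns the correlation between fitted and observed indicator values.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)                       # centering before PCA
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    coef, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
    fitted = scores @ coef + y.mean()
    return _corr(fitted, y)


def reassignment_check(X, y, n_components):
    """Compare the true labeling with every other labeling of the same size.

    The mirror labeling (all signs flipped) is skipped because it gives
    the same correlation as the true one by construction.
    """
    y = np.asarray(y, dtype=float)
    n, n_pos = len(y), int((y > 0).sum())
    true_r = pcr_correlation(X, y, n_components)
    others = []
    for pos in combinations(range(n), n_pos):
        y_alt = -np.ones(n)
        y_alt[list(pos)] = 1.0
        if np.array_equal(y_alt, y) or np.array_equal(y_alt, -y):
            continue
        others.append(pcr_correlation(X, y_alt, n_components))
    return true_r, others
```

If the correct assignment gives a higher correlation than the reassigned ones, that is consistent with a real group difference. With only eight samples there are few distinct reassignments, so the check is weak evidence at best.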

RESULTS
Analysis of Breast Cancer Cell Line Supernatant Glycans-Glycans were harvested from cell line supernatants to determine whether sufficient amounts of material could be obtained for MS analysis. Glycans were profiled using methods developed previously (10, 13) on CM obtained from the growth of breast cancer tumor cell lines in culture. Breast cancer tumor cell lines (acquired from the ATCC) were analyzed in this manner, including the breast cancer tumor cell line BT 474, originally isolated from a solid invasive ductal carcinoma of the breast; MDA-MB-453, derived from a metastatic site (pericardial effusion) from a patient with metastatic carcinoma; MDA-MB-361, isolated from the brain metastasis of a breast cancer patient with adenocarcinoma; and MDA-MB-468, derived from the pleural effusion of a breast cancer patient with adenocarcinoma (14).
CM were obtained from each cell line, and glycans were chemically released by β-elimination using treatment with sodium borohydride in sodium hydroxide (see "Experimental Procedures" for details). The released oligosaccharides were then subjected to solid phase extraction with GCCs and eluted in three separated fractions by 10, 20, and 40% acetonitrile, which allowed the separation of neutral from anionic oligosaccharides and removal of most peptides. Each fraction was separately analyzed by MALDI-FT-ICR MS in the positive mode (15). Fig. 1 shows a representative spectrum of the GCC 10% (Fig. 1, top) and GCC 40% fractions (Fig. 1, bottom) for cell line MDA-MB-468. The GCC 10% fraction shows mainly matrix and peptides, whereas the GCC 40% fraction shows the presence of glycans (filled circles). These glycans have masses identical to those also found in the cell lines of ovarian cancer (16). Tentative confirmation of the peaks as glycans was obtained primarily by the spacing between the peaks, which corresponded precisely to differences of a hexose (Hex) or N-acetylhexosamine (HexNAc). The masses, however, do not correspond to oligosaccharide alditols as expected from glycans released by alkaline sodium borohydride. The masses also do not correspond to those of aldehydes. Instead they appear to be oligosaccharides with an unknown head group. We are further pursuing the identity of these oligosaccharides; however, they are identical to those observed in the ovarian cancer cell lines (17). Several of these species were further characterized using IRMPD. An example of a species with m/z 712 obtained from an ovarian cancer cell line, which was also detected in the breast cancer cell line MDA-MB-468 (Fig. 1, 40%), is shown in Fig. 2a. Based on this spectrum a proposed structure is shown in Fig. 2b.
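The peak-spacing argument used above can be illustrated with a short sketch that scans a peak list for pairs separated by the monoisotopic residue mass of a hexose (C6H10O5, 162.0528) or an N-acetylhexosamine (C8H13NO5, 203.0794). The function name, the 0.01 mass unit tolerance, and the decimal places in the example peak list are assumptions made for illustration; only the nominal masses echo values mentioned in the text.

```python
# Monoisotopic residue masses (monosaccharide minus water).
RESIDUES = {"Hex": 162.05282, "HexNAc": 203.07937}


def residue_spacings(mz_values, tol=0.01):
    """Return (low m/z, high m/z, residue name) for every pair of peaks
    whose mass difference matches a known residue mass within tol."""
    peaks = sorted(mz_values)
    hits = []
    for i, low in enumerate(peaks):
        for high in peaks[i + 1:]:
            delta = high - low
            for name, mass in RESIDUES.items():
                if abs(delta - mass) <= tol:
                    hits.append((low, high, name))
    return hits


if __name__ == "__main__":
    # Hypothetical accurate masses; the decimals are invented for the example.
    peaks = [712.240, 874.293, 915.320, 1077.372]
    for low, high, name in residue_spacings(peaks):
        print(f"{low:.3f} -> {high:.3f}: +{name}")
```

Matching spacings support a glycan assignment but do not establish it, which is why the text treats the confirmation as tentative and follows up with IRMPD.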
The unique glycan masses present in all cell lines, with the exception of the MCF-10A and BT 474, were a series of ~14 species that appear to be related. [...] possibility is that these ions are fragments resulting from the MALDI process, which is known to produce energetic ions and may cause metastable dissociation particularly in mass analyzers with long detection times (10,18). The structures when determined by IRMPD appear to be fragments of N-linked complex-type oligosaccharides with the trimannose core intact. The 60 mass units most likely correspond to fragmentation from a cross-ring cleavage from the MALDI laser ionization step (10,19,20) that results in an oligosaccharide group with a head group of CH2(OH)CH2(OH) (Fig. 2b). This type of cleavage could also be the result of a peeling reaction (10,21) during the alkaline release procedure. It therefore appears that the cleavage reaction is also producing N-linked oligosaccharides and that ionization by MALDI is producing cross-ring cleavage fragmentations. Cross-ring fragments can produce relevant structural information about these glycans, which might be specific for N-linked oligosaccharides produced from cancer cells.
The procedure used to enrich glycans is relatively fast and requires no HPLC separation. Release and processing of 12 samples typically requires 48 h with additional time required for mass spectral analysis. Presently only 1.0 µl of the GCC fraction, equivalent to 50 nl of serum, is needed to obtain an oligosaccharide profile. The glycans from four breast cancer cell lines and one mammary epithelial cell line were isolated and purified from the CM prior to analysis using the procedure outlined above. The differences in glycans between the cell lines might be characteristic and could be used to distinguish between cell lines and possibly different forms of breast cancer. We observed that the MCF-10A non-tumorigenic epithelial cell line did not show the glycans we detected in the tumor cell lines. The MCF-10A epithelial cell line was grown in defined media, whereas the other breast cancer cell lines were grown in media containing 10% fetal bovine serum (FBS), so the different media could be responsible for the lack of these glycans. However, we have analyzed FBS and FBS-containing media and did not detect the tumor-associated glycans (data not shown), so we know that the glycans in the tumor cell lines are not present in the FBS-containing media. Further demonstration that the MCF-10A cell line does not produce these glycans using the same growth conditions for all cell lines is in progress. Previous analysis of glycans obtained from glycosylated proteins present in the conditioned media of ovarian cancer tumor cell lines ES-2, SKOV3, OVCAR, and CaOV showed the presence of some of these same glycans (17). During these studies, the variability in glycan analysis by MALDI-FTICR MS was examined by analyzing triplicate samples of a 40% GCC fraction from ES-2 tumor cells, and these data are included in the Supplemental Fig. 2. The reproducibility was within 10%, and the ions appeared to have the same masses and relative intensities for each sample.
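The text does not state how the 10% reproducibility figure was calculated. One common choice, assumed here purely for illustration, is the percent coefficient of variation of each peak's intensity across the triplicate spectra. The function names and the example intensities are hypothetical.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100.0


def reproducible_peaks(replicate_intensities, max_cv=10.0):
    """Map each m/z to True if its replicate intensities vary by <= max_cv %."""
    return {
        mz: percent_cv(values) <= max_cv
        for mz, values in replicate_intensities.items()
    }


if __name__ == "__main__":
    # Hypothetical triplicate intensities keyed by nominal m/z.
    triplicates = {712: [100.0, 105.0, 95.0], 915: [100.0, 150.0, 50.0]}
    print(reproducible_peaks(triplicates))
```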
Analysis of Glycans in Mouse Serum-In addition to measuring glycans in the conditioned media of breast cancer tumor cells, we wanted to determine whether these glycomics methods could be used to measure early glycosylation changes in a mouse model of metastatic breast cancer. For this purpose, a transplantable mammary tumor mouse model of breast cancer that was originally derived from the PyMT mouse model of metastatic breast cancer was used (11). The PyMT mouse was originally created by inserting the oncogene PyMT under the regulation of the mouse mammary tumor virus into mice (11). These mice developed widespread transformation of the mammary epithelium and rapid production of multifocal mammary adenocarcinomas with high rates of metastasis to the lung. A transplantable mammary tumor mouse model was subsequently produced via primary culture of a mammary tumor from the PyMT mouse (Met-1 cells) that may be grown in standard cell culture or serially transplanted into mammary fat pads of FVB female mice (12). These mice develop mammary tumors, which metastasize to the lungs (22). To establish whether we can use our method to detect glycosylation changes in the serum of mice during formation of tumors, serum samples were obtained from four mice during the growth of their tumors, glycans were cleaved and subjected to solid phase extraction with GCC, and the fractions were analyzed by MALDI-FTICR MS (see "Experimental Procedures"). Spectra from three separate time points were obtained from each mouse during the development of its mammary tumors. An example of spectra from one mouse (mouse 1128) is shown in Fig. 4 with serum samples taken during weeks 0, 2, and 6. The spectra were normalized in terms of ion abundances to the most intense signal among the three spectra. A number of peaks, m/z ions 509, 548, and 617, were present in all three spectra from mouse 1128 and in almost every serum sample obtained from the four mice (Fig. 5).
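Normalizing a time series of spectra to the single most intense signal among them, as described above, keeps the relative abundances comparable across time points. A minimal sketch follows; it assumes each spectrum is a list of (m/z, intensity) pairs, and the function name, the percentage scale, and the example intensities are illustrative assumptions.

```python
def normalize_to_global_max(spectra):
    """Scale every intensity to a percentage of the largest intensity
    found in any of the supplied spectra."""
    global_max = max(
        intensity
        for spectrum in spectra.values()
        for _, intensity in spectrum
    )
    return {
        label: [(mz, 100.0 * intensity / global_max) for mz, intensity in spectrum]
        for label, spectrum in spectra.items()
    }


if __name__ == "__main__":
    # Hypothetical intensities for one mouse at three time points.
    spectra = {
        "week 0": [(509, 20.0), (548, 10.0)],
        "week 2": [(509, 25.0), (772, 5.0)],
        "week 6": [(509, 30.0), (772, 40.0)],
    }
    for label, spectrum in normalize_to_global_max(spectra).items():
        print(label, spectrum)
```

Normalizing each spectrum to its own base peak would instead hide a peak that grows over time whenever it becomes the base peak, which is presumably why a shared reference was used.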
A number of m/z ions corresponding to oligosaccharide (glycan) masses were absent in week 0 but appeared in week 2 and increased further in week 6. These corresponded to the peaks m/z 772, 915, 1078, and 1443. The ions at m/z 772 and 1078 were also observed to change in the other three mice, and those at m/z 916 and 1443 were observed to change in two of the four mice (Fig. 5). Other m/z peaks observed to change in two of the four mice at week 4 were 874 and 1138. Most of these m/z ions correspond to glycan m/z ions present in the conditioned media of the breast cancer cell lines and appear to correspond to the most abundant glycan species. Although similar m/z ions appear to be present in the mouse serum samples, it is still too early to reach any conclusions about these glycan changes we observed. More rigorous mouse studies are needed before we can determine whether the glycan changes we detected in mouse serum during the development of mammary tumors result from the tumors and are indicative of the presence of breast cancer.
Analysis of Glycans in Human Serum-A small set of human serum samples was analyzed using this method: four patient serum samples previously tested for CA27.29 and four serum samples from persons with no prior history of cancer. Limited medical information is provided for these samples (Table I). These samples were already tested for CA27.29, which is an immunoassay test sometimes used to monitor treatment of breast cancer patients, and the results from this test are included in Table I for comparison. The same method previously described to prepare the breast cancer tumor cell lines and mouse serum samples was used for the patient serum samples. Fig. 6 shows a comparison for all three fractions between a cancer patient with metastatic breast cancer and a patient with no cancer. The 10, 20, and 40% GCC fractions are all represented, and variation between the no-cancer and breast cancer patients can be seen. The glycan masses found in the breast cancer cell lines and the mouse models also appear to be present in the serum. In addition, several new series that are present in the 10% fraction are present in both cancer and no-cancer patients. These oligomers have abundant peaks at m/z 617, 815, 1013, etc. and are precisely 198.015 mass units apart. This mass corresponds to the residue mass of hexuronic acid where the acidic proton has been replaced by a sodium ion, a common occurrence when the MALDI sample is doped with sodium chloride. Hexuronic oligomers have not been reported; however, the monomer has been reported to increase in cancer patients and those with diabetes (23).
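The assignment of the 198.015 spacing can be checked from elemental composition. The sketch below assumes the repeating unit is a hexuronic acid residue (C6H8O6) in which one proton is exchanged for sodium (C6H7O6Na) and uses standard monoisotopic atomic masses; the helper name and dictionary layout are illustrative.

```python
# Monoisotopic atomic masses in unified atomic mass units.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "O": 15.99491462,
    "Na": 22.98976928,
}


def monoisotopic_mass(composition):
    """Sum of monoisotopic atomic masses for an {element: count} mapping."""
    return sum(MONOISOTOPIC[element] * count
               for element, count in composition.items())


# Hexuronic acid residue (monosaccharide minus water) and its sodium salt.
HEXA_RESIDUE = {"C": 6, "H": 8, "O": 6}
HEXA_SODIUM_RESIDUE = {"C": 6, "H": 7, "O": 6, "Na": 1}

if __name__ == "__main__":
    print(round(monoisotopic_mass(HEXA_RESIDUE), 4))
    print(round(monoisotopic_mass(HEXA_SODIUM_RESIDUE), 4))
```

Under these assumptions the sodiated residue comes to about 198.014, within roughly 0.001 mass units of the reported spacing, and the nominal series 617, 815, 1013 is consistent with successive additions of this unit.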
In the 20% GCC fraction, the cancer patient's spectrum has peaks that correspond to the masses observed in both the cancer cell line and the mouse with implanted cancer tissues at later times (m/z 712, 772, 915, 1077, etc.). Table II lists the masses that are found in serum that correspond to the cell lines. These masses appear as a series in the 20% GCC fraction of the cancer patient and also appeared in the no-cancer patient but in the later 40% GCC fraction. One explanation for this phenomenon is that molecules that produced these masses in the MALDI-FTICR MS are slightly different in the cancer patient and elute at a lower solvent concentration (20% ACN) than similar molecules from the no-cancer patient. This is why the glycan separation step (GCC SPE fractionation) may be crucial to our method. This separation step can help distinguish very small differences between molecules. It is important to note that each glycan mass can correspond to a number of isomers, sometimes over a dozen for a single mass, with each isomer having its own physical property. These physical differences potentially translate into different binding affinities and interactions with the graphite column. Our experiences with HPLC separation of glycan mixtures by GCC show us that isomers tend to have widely varying retention times. Structural homologs, such as those that differ by only a single residue but are identical structures otherwise, tend to be closer together in retention times, whereas stereoisomers even when only one linkage is different can have significantly different retention times. In fact, these small differences in glycan structure may actually have profound functional consequences in the tumor cell. Recently we reported that many of the same masses listed in Table II were markers for ovarian cancer in serum (17).
In that study, it was found that these masses were more prevalent in the ovarian cancer patients and generally absent in the no-cancer patients for all three GCC fractions examined (10, 20, and 40% ACN). Slight structural differences, possibly between isomers with different structural properties, cause these same masses to be detected in the 20% GCC fraction of the cancer patient samples and in the 40% GCC fraction of the no-cancer patient. Further experimentation indicated that these same masses could be detected in the no-cancer patient serum samples if the ionization conditions were varied, for example the laser power was increased (data not shown). The results presented in the current report were obtained at ionization conditions that produce these peaks (masses) in both cancer and no-cancer patient serum samples. However, these results are still consistent with the possibility that specific glycan isomers are markers for breast or ovarian cancer, but it is clear that more extensive structural and analytical experiments are necessary to resolve some of these issues. Also much larger sets of serum samples from breast and ovarian cancer patients will need to be analyzed before it can be determined whether this method is capable of distinguishing between these two types of cancer.
Statistical Analysis of Human Serum Glycans-For each of the extracted fractions, peak tables derived from the patient sample spectra were combined to give a table showing the intensity of common features in each of the spectra. PCA (24) of this table then conveniently summarized the most significant differences between the spectra. The greatest difference between cancer and no-cancer patients was found in the GCC 10% fraction. As shown in the PCA score plot (Fig. 7), principal component 3 appears to distinguish the four cancer patients (who all had positive scores) from the no-cancer patients (negative scores).
The further finding that the correct assignment of the patients' cancer status gave the best correlation coefficient in PCR analysis (see "Experimental Procedures") to predict the cancer status also suggests that the spectra of the cancer and no-cancer patients are distinguishable (Fig. 8). Fig. 8a shows the result of the PCR analysis if the samples are correctly assigned as cancer and no-cancer, whereas Fig. 8b shows the result if the assignments are incorrect. The features used for the statistical analysis are the peaks in the mass spectra, not just binned parts of m/z. The data were deisotoped to 12C masses after deconvolution, and the results were ions and not zero charge mass. Because in our sample set there are many more features in the spectra than samples in the study, there is a danger of overfitting the data. We use existing methods (e.g. cross-validation) to try to ensure that any predicted biomarkers are real. In addition, in the future, we hope to develop probabilistic classification methods that do not overfit the data. The principal criterion that we use to identify a feature common to two or more peak tables is that within the estimated mass errors the feature must have the same mass in each peak table. However, any small calibration differences between the spectra could produce systematic errors greater than our estimated mass errors, potentially resulting in some features being misidentified. Any calibration error in the peak tables thus needs to be corrected before the tables are combined. This is done by using the measured masses of the strongest signals common to all of the tables as internal standards. We find a linear correction for each table that removes any systematic deviation of its standard masses from the average of the standards in all the tables. This finding is highly tentative because of the small number of patients in the study.
However, if the results were confirmed in a larger study, finding discrimination in such an early principal component would suggest that the differences between cancer and no-cancer patients account for a substantial part of the variation in the spectra and can therefore be used to distinguish patients with breast cancer from those without.
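The calibration step described in this section, in which each peak table receives a linear correction that brings its internal standards onto the average of the standards across all tables, can be sketched as follows. The ordinary least-squares fit, the function names, and the example masses are assumptions; the actual correction procedure may differ in detail.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibration_corrections(standard_masses):
    """Fit one linear correction per table.

    standard_masses[t][k] is the measured mass of internal standard k
    in table t. The reference for each standard is its average over tables.
    """
    n_tables = len(standard_masses)
    reference = [sum(column) / n_tables for column in zip(*standard_masses)]
    return [fit_line(measured, reference) for measured in standard_masses]


def apply_correction(mz, correction):
    """Apply a (slope, intercept) correction to a measured m/z."""
    slope, intercept = correction
    return slope * mz + intercept


if __name__ == "__main__":
    # Hypothetical standards: the second table reads about 4 ppm high.
    standards = [
        [500.000, 1000.000, 1500.000],
        [500.002, 1000.004, 1500.006],
    ]
    for table, correction in zip(standards, calibration_corrections(standards)):
        print([round(apply_correction(m, correction), 4) for m in table])
```

Because the reference is the average of the measured standards rather than their theoretical masses, this removes relative calibration differences between tables but not an error shared by all of them.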

DISCUSSION
Our quest to find relevant biomarkers that correlate with the presence of breast cancer in humans involves the use of new methods that are capable of detecting glycosylation changes in breast cancer patients. Three separate approaches have been utilized to evaluate the use of this technology: 1) using breast cancer tumor cell lines to establish proof of concept, 2) using mouse models of breast cancer, and 3) testing serum samples from patients with metastatic breast cancer and comparing these with serum from patients with no cancer.
The first "proof of concept" was accomplished by analyzing CM obtained from breast cancer tumor cell lines grown in culture. Breast cancer cell lines MDA-MB-468 and MDA-MB-453 are from pleural or pericardial effusions, whereas MDA-MB-361 was established from a brain metastasis. The glycosylation profiles from these three cell lines were very similar, whereas the fourth cancer cell line (BT 474) had a different profile possibly due to the fact that it was isolated from a solid tumor of invasive ductal carcinoma of the breast. Although data for the MCF-10A epithelial non-tumorigenic cell line established from fibrocystic disease are shown, a direct comparison of its results with those from the breast cancer tumor cell lines could not be made because it was grown in defined media with no FBS, whereas the other cell lines were grown in complete media that contained 10% FBS. The MCF-10A cells could not be grown in 10% FBS, and the other cell lines could not be grown in defined media.
The use of MALDI in the analysis is advantageous for several reasons. MALDI is more tolerant of salts and less susceptible to suppression effects compared with ESI. However, MALDI does have characteristics that make it less than ideal. For example, variations between different positions in the sample spots are often observed. For this reason, a spot is sampled in several locations until the best signal to noise ratio is obtained. These spectra were used for the analysis. When measured in this manner, the variability between samples was negligible, and the overall features of the spectra made them nearly indistinguishable. In the same way, the SPE also yielded the same group of glycans in the same fractions when precautions were taken to keep the experimental conditions similar. This involved using the same flow rates, applying the same amount of material, passing the same volume of wash, etc. We have examined the reproducibility of this method thoroughly with other cell lines. The analysis of three separate cultures of the cancer cell line ES-2 (ovarian cancer), for example, shows highly reproducible MALDI spectra for each fraction (analysis of the 40% fraction is included in the supplemental material). The abundances of all the peaks in the MALDI were highly similar from one spectrum to the next with less than 10% variability. Similar levels of reproducibility were obtained in the analysis of prostate cancer cell lines LNCaP, PC3, and SAOS2 (data not shown) grown in triplicates and analyzed individually by nanospray FT-ICR MS.
The methods outlined in this work were used to examine whether glycosylation changes could be detected in serum from a mouse model of breast cancer as it developed mammary tumors. Although these data are too preliminary to draw conclusions about their relationship to mammary tumors, they do suggest that it may be possible to monitor changes in glycosylation of proteins present in mouse serum. Additionally, it remains unclear whether these glycan changes were due to the growth of mammary tumors or to the response to the presence of these tumors; however, they warrant further investigation.
The use of a mouse model of breast cancer, in which blood can be sampled during the progression of the disease, may help correlate glycan changes with the growth and potential metastasis of the tumors. The advantages to using such a mouse model of breast cancer to test this method are 1) mice are genetically identical; 2) sex, diet, and age can be controlled; and 3) large numbers of mice can be tested over the course of disease so reproducibility can be tested.
Still, one major disadvantage of mouse models of cancer is that they are not human and may not adequately represent human breast cancer. However, the mouse model may still be used to test the reproducibility, sensitivity, and reliability of this method to detect the presence of mammary tumors, ductal carcinoma in situ, or metastatic breast cancer.
The very small number of patient samples is inadequate to make any legitimate claims about the performance of this method for testing serum for the presence of breast cancer in individuals. Despite these limitations, this approach shows promise, and carefully designed prospective studies are needed before a reliable, accurate method for detection of breast cancer can be established. These samples were acquired with only very limited information about the patients. The breast cancer patient serum samples were pre-existing samples, already tested for CA27.29. The "no cancer" serum samples were also pre-existing samples obtained from the UC Davis Medical Center Clinical Laboratories. The finding was that these two groups of patients segregated to different quadrants after PCA.
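The PCA step can be illustrated with a minimal sketch. The text does not describe the input matrix, the preprocessing, or the software used, so everything below is an assumption: each row is a hypothetical glycan profile (relative abundances of three peaks), the values are invented, and the principal components are found by plain power iteration on the covariance matrix rather than by any particular statistics package.

```python
import math

def center(rows):
    """Subtract the column mean from every value."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of mean-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1)
             for j in range(d)] for i in range(d)]

def mat_vec(matrix, v):
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]

def top_component(matrix, iterations=200):
    """Largest eigenvalue and its unit eigenvector by power iteration."""
    v = [1.0 / (i + 1) for i in range(len(matrix))]  # arbitrary start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v

def pca_scores(rows, n_components=2):
    """Project mean-centered rows onto the leading principal components."""
    centered = center(rows)
    cov = covariance(centered)
    components = []
    for _ in range(n_components):
        eigenvalue, v = top_component(cov)
        components.append(v)
        # Deflate: remove the component just found before finding the next.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j]
                for j in range(len(v))] for i in range(len(v))]
    scores = [[sum(a * b for a, b in zip(row, comp)) for comp in components]
              for row in centered]
    return scores, components

# Hypothetical relative abundances of three glycan peaks per serum sample.
profiles = [
    [0.50, 0.30, 0.20], [0.52, 0.29, 0.19], [0.48, 0.31, 0.21],  # "cancer"
    [0.30, 0.30, 0.40], [0.32, 0.31, 0.37], [0.28, 0.29, 0.43],  # "control"
]
labels = ["cancer"] * 3 + ["control"] * 3

scores, components = pca_scores(profiles)
for label, (pc1, pc2) in zip(labels, scores):
    print(f"{label:8s} PC1={pc1:+.3f} PC2={pc2:+.3f}")
```

In this invented example the two groups fall on opposite sides of zero along the first component. The sign of a principal component is arbitrary, so only the separation between groups is meaningful, not which side each group lands on.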
The Use of Glycosylation Analysis to Find Biomarkers of Cancer in Serum-Glycosylation of proteins changes dramatically in cancer cells. Glycans become shorter and more negatively charged, and core structures change (1,3). Our methods are designed to target analysis of the shorter O-linked and more sialylated glycans. These types of glycans are not normally produced in healthy individuals. The challenges of finding relevant biomarkers of cancer in serum that can be used for early detection of disease are considerable. Because of the difficulties and issues that result from some existing proteomics methods used to find biomarkers of cancer, especially those that are highly dependent on data analysis and the use of a more "bioinformatic" data mining approach, it is necessary to first demonstrate that any method to find cancer biomarkers is reliable, reproducible, and accurate before the important clinical samples are analyzed. This is why our group chose to analyze tumor cell-conditioned media first before attempting to analyze patient serum samples.
We also reasoned that because it is already possible to measure the presence of the tumor biomarker CA125 in serum of women with ovarian cancer and MUC1 in women with breast cancer (these are both highly glycosylated mucins produced by tumor cells), with more sensitive methods to measure glycosylation, it may be possible to directly measure the glycosylation of these proteins and other proteins shed or secreted from these cells without the need to purify the proteins. Mucins come from epithelial cells and are not normally present in high quantities in the circulation. MUC1 and CA125 (MUC16) produced in normal women do not contain the aberrant glycosylation that breast and ovarian cancer tumor cells produce.
Glycosylation of proteins, especially mucins, accounts for their extremely large molecular weight. Some mucins have 50 tandem repeats, each of which might have five potential sites for O-glycosylation, so a single molecule could carry 250 oligosaccharide side chains. It is therefore entirely possible that a 1 nM concentration of this protein could have a 250 nM effective concentration of a specific oligosaccharide structure (3). Because some mucin tandem repeats may contain 5-100 potential glycosylation sites per repeat and mucin core proteins contain 5-500 repeats, a mucin protein can reach several million daltons in size, largely because of glycosylation. This amount of glycosylation might correspond to a potential stoichiometric amplification of greater than 7500-fold for the associated oligosaccharide side chains for a single protein molecule.
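The arithmetic behind this amplification argument can be written out as a short sketch. It assumes that every potential site is occupied and that all sites carry the same oligosaccharide structure, so the results are upper bounds rather than measured values; the function name is introduced here only for illustration.

```python
def glycan_chains_per_molecule(tandem_repeats, sites_per_repeat):
    """Oligosaccharide chains on one protein if every site is occupied."""
    return tandem_repeats * sites_per_repeat

# Worked example from the text: 50 repeats with 5 sites each.
chains = glycan_chains_per_molecule(50, 5)
protein_nM = 1.0
effective_glycan_nM = protein_nM * chains

# Bounds implied by the quoted ranges (5-500 repeats, 5-100 sites per repeat).
lowest = glycan_chains_per_molecule(5, 5)
highest = glycan_chains_per_molecule(500, 100)

print(chains, effective_glycan_nM, lowest, highest)
```

The 7500-fold figure quoted above lies between the lower bound of 25 and the upper bound of 50,000 chains per molecule that these ranges allow.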
Another issue with proteomics analysis of plasma and serum is the very large dynamic range (10 orders of magnitude) of protein concentrations, with the additional problem of protein degradation in these samples. Although glycosidases may exist in serum, an interesting finding is that the levels of some glycosidases are actually reduced, and many do not appear to change in cancer patients (25). We plan to conduct a more in-depth analysis of the degradation of glycans in serum during our continuing testing and validation phases of this method. Considerable effort is also being expended toward identification and structural analysis of the oligosaccharides.
Finally, detecting glycan biomarkers of cancer should be as feasible as finding relevant protein biomarkers of cancer. Glycosylation might be considered to be "fine tuning" of the protein and essential for its function. We feel our methods will complement proteomics methods and provide different and important information about the presence of cancer in a patient. An advantage of our method is our use of high resolution and sensitive FT instruments that make it possible to accurately detect very low (femtomole or lower) amounts of glycans in serum samples. With tandem MS using IRMPD it is possible to obtain additional structural information about the glycans. It is also possible that unique structures and compositions of cancer glycan biomarkers will be detected in the serum of cancer patients.
Conclusions-This proof-of-concept study illustrates several important points. 1) Glycans released from serum can be collected and profiled by mass spectrometry. 2) Changes in glycosylation can be observed in mouse serum during the progression of the disease state. 3) Glycans may be useful markers for the onset of breast cancer.
It is too early to tell whether glycans can be used as biomarkers for the early diagnosis of breast cancer. More information is needed regarding the nature of the glycans. Although the method used in this study typically releases O-linked oligosaccharides, there is always the possibility for the release of N-linked glycans. The composition can be determined by the mass and tandem MS experiments; however, there is too little material for all but the most rudimentary type of structural analysis. Nonetheless, we are convinced that we are observing glycans. Monitoring changes in glycosylation may therefore be a viable method for disease diagnosis. Glycomics may be useful in combination with other methods because it may prove to be complementary to other equally promising technologies (proteomics and microarray) to provide a better way of detecting breast cancer early and saving lives.