Metastasis-related Plasma Membrane Proteins of Human Breast Cancer Cells Identified by Comparative Quantitative Mass Spectrometry*

The spread of cancer cells from a primary tumor to form metastasis at distant sites is a complex multistep process. The cancer cell proteins and plasma membrane proteins in particular involved in this process are poorly defined, and a study of the very early events of the metastatic process using clinical samples or in vitro assays is not feasible. We have used a unique model system consisting of two isogenic human breast cancer cell lines that are equally tumorigenic in mice; but although one gives rise to metastasis, the other disseminates single cells that remain dormant at distant organs. Membrane purification and comparative quantitative LC-MS/MS proteomics identified 13 membrane proteins that were expressed at higher levels and three that were underexpressed in the metastatic compared with the non-metastatic cell line from a total of 1919 identified protein entries. Among the proteins were ecto-5′-nucleotidase (CD73), NDRG1, integrin β1, CD44, CD74, and major histocompatibility complex class II proteins. The altered expression levels of proteins identified by LC-MS/MS were validated using flow cytometry, Western blotting, and immunocyto- and immunohistochemistry. Analysis of clinical breast cancer biopsies demonstrated a significant correlation between high ecto-5′-nucleotidase and integrin β1 expression and poor outcome, measured as tumor spread or distant recurrence within a 10-year follow-up. Further the tissue analysis suggested that NDRG1, HLA-DRα, HLA-DRβ, and CD74 were associated with the ER−/PR− phenotype represented by the two cell lines. The study demonstrates a quantitative and comparative proteomics strategy to identify clinically relevant key molecules in the early events of metastasis, some of which may prove to be potential targets for cancer therapy.

Breast cancer is the most common malignant disease among women in Western countries, occurring in approximately one in 11 women (1). In this disease, malignant cells often disseminate to regional lymph nodes and establish distant metastases, preferentially in the bone, lung, and liver, resulting in poor outcome and high mortality (2,3).
Metastases are established through a complex set of events that is yet not fully understood but requires detachment of single cells from the primary tumor, penetration of the tissue matrix, and migration of these cells to distant locations where they induce angiogenesis and undergo expansive growth (4). Some disseminated cancer cells seem to be capable of maintaining dormancy in distant organs without establishing metastases but may suddenly become activated many years after resection of the primary tumor (5). The dormancy may be caused by environmental signals, either lack of those inducing differentiation or the presence of signals stimulating growth arrest. Cellular factors and changes in the microenvironment, such as inflammation or a change in hormonal status, might eventually induce proliferation, differentiation, and subsequent metastatic growth, whereas other disseminated cancer cells remain dormant for a lifetime (6).
Traditional models of metastasis suggest that a subpopulation of cells in the primary tumor acquire metastatic capacity late in tumorigenesis, but gene expression profiles and cellular studies have recently provided evidence for a possible alternative model that suggests the metastatic capacity is acquired early in tumorigenesis (7). Stem cell populations have been identified in a range of hematopoietic and solid tumors and might represent the cells of origin for these tumors but might also be responsible for metastasis (8). Although a preserved genetic signature between the primary tumor and the metastasis has been found, other studies provide evidence of a gradual acquisition of genomic changes because distant metastases may not uniformly share mutations and often differ extensively from the primary tumor, reflecting the extent of genetic instability of breast cancer (9,10). Only few studies provide proteomic characteristics of metastatic versus primary tumor of breast cancer because of the difficulties of obtaining high quality human tumor samples with full clinical histories and the absence of directly relevant in vitro assays (11,12).
The two isogenic cell lines M-4A4 and NM-2C5, which were derived from the MDA-MB-435 cell line and originated from a highly aggressive human invasive ductal carcinoma, provide an interesting model of the metastatic process (13). When inoculated into the mammary fat pad of nude mice, M-4A4 and NM-2C5 showed equal tumorigenicity; however, whereas M-4A4 established easily detectable metastases restricted to lymph nodes and lungs, NM-2C5 cells disseminated to distal organs but remained dormant and did not establish metastasis (14). There is an ongoing debate on whether the parent cell line MDA-MB-435 can be defined as a breast cancer cell line because it, along with breast- and epithelium-specific markers, also expresses melanoma-specific genes (15). However, MDA-MB-435 can be induced to express breast differentiation-specific proteins and secrete milk lipids as observed in other well established breast cancer cell lines and has therefore been considered as an excellent model of a highly malignant and dedifferentiated breast cancer (16). Regardless of this debate, our model system remains valuable in the context of cancer metastasis, but the results should, as always when using cell line models, be supported by studies of clinically relevant human tissue specimens.
M-4A4 and NM-2C5 have been extensively compared using gene expression analysis, which identified a panel of differentially expressed genes (13, 17-20). However, because the proteome is so much more complex than the genome, similar studies at the protein level with special focus on plasma membrane proteins may add valuable biological insight and identify cell surface molecules that might be targeted with drugs or antibodies to inhibit the metastatic process.
Comparative quantitative proteomics using stable isotope labeling with amino acids in cell culture (SILAC)¹ and LC-MS/MS allows a study of proteins with quantitatively different expression levels on metastasizing versus non-metastasizing cells. We used this technique to identify a panel of plasma membrane proteins showing altered expression in cells capable of forming metastasis. Validation studies at the protein and RNA expression level of the cell lines indicate that several of the identified proteins may be important for establishing metastasis in distant organs and thus have potential in target-specific therapy. Therefore, to further evaluate the clinical relevance of a selected number of the candidates identified by our analysis, their expression levels were evaluated in a panel of primary breast cancer biopsies and corresponding axillary lymph node metastasis from patients with known clinical outcomes. The results demonstrated the power of this systematic stepwise strategy for identifying targets of potential clinical value.

Purification of Plasma Membrane Proteins and Analysis by LC-MS/MS
SILAC, a quantitative proteomics strategy that metabolically labels the entire proteome allowing quantitative comparison of proteins/peptides by mass spectrometry analysis, was used on M-4A4 and NM-2C5 (21,22). The cell lines were propagated as described above but with dialyzed FCS (HyClone) and in custom-made DMEM lacking L-arginine, L-lysine, and L-glutamine (JRH Bioscience). M-4A4 was supplemented with glutamine and isotopic forms of "heavy" arginine and lysine (13C6), and NM-2C5 was supplemented with glutamine and "light" arginine and lysine (12C6), allowing complete incorporation in the cells. One-third of the amount of arginine present in normal DMEM was used to avoid arginine to proline conversion. The isotopic labeling did not affect cell growth rates, morphology, or biological activity. Isotope-labeled M-4A4 and unlabeled NM-2C5 cells were combined 1:1 followed by homogenization, fractionation by centrifugation, and Percoll-sucrose density separation of cell contents. The fractions enriched for cell surface proteins and with minimal mitochondrial contamination, as determined by the activities of γ-glutamyl transpeptidase and succinate dehydrogenase, were either pooled or processed individually for further analysis. The purified plasma membrane fractions were enzymatically digested with lysyl endoproteinase and trypsin and analyzed by LC-MS/MS (Q-TOF Micro tandem mass spectrometer, Waters/Micromass, Manchester, UK) using a 2-h stepped linear gradient. Peptides with incorporated heavy arginine and/or heavy lysine had a higher mass of 6 Da per amino acid relative to peptides with normal isotopic arginine and/or lysine. The intensity ratios of the heavy and "normal" peptide peaks, respectively, in a given mass spectrum indicated the relative abundance of proteins in M-4A4 and NM-2C5, respectively. With variations of one parameter at a time, the purification of plasma membrane proteins was repeated 16 times, and each sample was analyzed by LC-MS/MS.
The variations included single versus dual isotopic labeling, Percoll-sucrose density separation of crude membranes versus no density separation, and single versus repeated sample analysis to increase the number of proteins identified by LC-MS/MS. Dual isotopic labeling and repeated analysis of a sample with LC-MS/MS (three to four times) increased the percentage of proteins that could be quantified, whereas the use of Percoll-sucrose density separation resulted in enrichment of membrane proteins.

¹ The abbreviations used are: SILAC, stable isotope labeling with amino acids in cell culture; APC, antigen-presenting cells; ecto-5′-NT, ecto-5′-nucleotidase; ER, estrogen receptor; HRP, horseradish peroxidase; ICC, immunocytochemistry; IHC, immunohistochemistry; MIF, migration inhibitor factor; PR, progesterone receptor; QV, quantification value; WB, Western blotting; MHC, major histocompatibility complex; DMEM, Dulbecco's modified Eagle's medium; VEMS, Virtual Expert Mass Spectrometrist; IPI, International Protein Index; FACS, fluorescence-activated cell sorting; ICAM-1, intercellular adhesion molecule 1; NDRG1, N-myc downstream regulated gene 1; ICAM, intercellular adhesion molecule; APMAP, adipocyte plasma membrane associated protein.

Cell Surface Membrane Purification-The cells were lysed by incubation in a hypotonic buffer (10 mM Tris base, 1.5 mM MgCl2, 10 mM NaCl, pH 6.8) followed by sedimentation (300 × g for 5 min), resuspension in gradient buffer (0.25 M sucrose, 10 mM HEPES, 100 mM succinic acid, 1 mM EDTA, 2 mM CaCl2, 2 mM MgCl2, pH 7.4), and homogenization using a motor-driven Potter homogenizer (B. Braun Biotech). The homogenate was centrifuged at 1,000 × g for 10 min, and the supernatant, free of cell debris and nuclei, was further centrifuged at 100,000 × g for 30 min in an M150GX ultracentrifuge (Sorvall). The pellet, containing crude membranes, was resuspended in 2 ml of gradient buffer and manually homogenized and mixed with 1.9 ml of Percoll (Amersham Biosciences) containing 10% PBS and 0.19 ml of 2.5 M sucrose in an Easy-Seal™ tube (5 ml; Sorvall). Gradient buffer was used to fill up the excess volume of the tube before centrifugation at 120,000 × g for 15 min. The gradient was fractionated from the top into 10 fractions by displacing it from the bottom with 2 M sucrose. Percoll was removed by centrifugation at 900,000 × g for 15 min.
γ-Glutamyl Transpeptidase Activity-Aliquots of each fraction (50 µl) were transferred to a microtiter plate and mixed with 150 µl of substrate solution (1 mM L-γ-glutamyl-p-nitroanilide and 20 mM glycylglycine in 0.1 M Tris-HCl, pH 7.6), and the relative concentration of 4-nitroaniline was measured 10 times at 405 nm in 15 min at 37°C using a Victor3 Multilabel Plate Reader (PerkinElmer Life Sciences). The relative γ-glutamyl transpeptidase activity was calculated on a linear segment of the measurements.
Succinate Dehydrogenase Activity-Aliquots of each fraction (20 µl) were transferred to a microtiter plate and mixed with 80 µl of H2O and 100 µl of substrate solution (0.1 M phosphate buffer, 0.1 M sodium succinate, 0.05 M sucrose, pH 7.4) containing 2 mg/ml p-iodonitrotetrazolium (Sigma-Aldrich). The relative concentration of iodonitrotetrazolium formazan was measured 10 times at 490 nm over 5 min at 37°C (Victor3 Multilabel Plate Reader), and the relative succinate dehydrogenase activity was calculated on a linear segment of the measurements.
Protein Concentration Measurement-The protein concentration was determined by a colorimetric, detergent-compatible, Lowry-based assay (DC protein assay, Bio-Rad) in accordance with the manufacturer's protocol using a BSA preparation as standard (Pierce).
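Both enzyme assays above derive a relative activity from a linear segment of repeated absorbance readings. The text does not say how the slope was obtained, so the following is only a minimal sketch that assumes an ordinary least-squares fit; the function names and the `start`/`stop` indices that pick the linear segment are hypothetical.

```python
def linear_slope(times, readings):
    """Least-squares slope of absorbance readings versus time."""
    n = len(times)
    if n != len(readings) or n < 2:
        raise ValueError("need at least two paired measurements")
    mean_t = sum(times) / n
    mean_a = sum(readings) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times, readings))
    return sxy / sxx


def relative_activity(times, readings, start=0, stop=None):
    """Slope over a chosen segment (assumed linear) of the time course.

    start/stop are hypothetical slice indices selecting that segment.
    """
    return linear_slope(times[start:stop], readings[start:stop])
```

Comparing the slopes of different gradient fractions measured under identical conditions would then give the relative activity per fraction.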

Data Analysis, Quantification, and Database Searching
The raw LC-MS/MS data obtained were processed by the MassLynx 4.0 software (Waters, Milford, MA) and analyzed by the freeware VEMS V3.207 (24,25). The VEMS program ExRaw.exe was used to extract the retention times from the raw LC-MS/MS data files. Data from sequential LC-MS/MS analyses of the technical replicates were analyzed as one batch. The data were compared with all human proteins (n = 66,620) in the International Protein Index (IPI) version 3.23 (26).
Default VEMS settings were used for identification. Computational methods, scoring function, and statistical evaluation are described in detail in Matthiesen et al. (24). In brief, the cutoff score value for accepting individual MS/MS spectra was set to 10 to initially include as many protein identifications as possible. The MS/MS spectra of all single peptide-based protein identifications were manually inspected, and protein identifications based on poor MS/MS spectra (e.g. spectra with few matching b- and y-ions or assignment of both heavy and normal arginine and/or lysine in the same peptide) were excluded. The false discovery rate was calculated as the number of proteins identified using a reversed database (false positives) divided by the sum of proteins identified (true positives identified using the IPI and the false positives identified using IPI reversed) (27). The false discovery rate was estimated to be 3.9% using IPI version 3.23 reversed, made using the Decoy Database Builder software (27).
The mass accuracy of precursor and fragment ions was set to ±0.5 Da, and the mass accuracy of the peptides identified was generally below 50 ppm with an average of ~30 ppm. Additional lowering of the mass accuracy tolerance did not influence the false discovery rate significantly (data not shown). The enzyme specificity setting was cleavage C-terminal to arginine and lysine, and one missed cleavage was allowed. Method-specific settings were as follows: 1) cysteine modification: carbamidomethylation; and 2) variable modifications, whole database: Met oxidation, [13C6]Arg, and [13C6]Lys.
Default VEMS settings were used for quantification. The cutoff score of the peptides used was set to 20, mass accuracy was set to ±0.3 Da, and method-specific settings were "Labeled amino acids": [13C6]Arg and [13C6]Lys. For a peptide pair to be quantified, at least one peptide had to meet the criteria of the cutoff score, whereas the other peptide should have a count above 5 (or meet the criteria of the cutoff score). Data points with fewer counts than 5 were removed, whereas no outlier data points were removed.
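The false discovery rate definition given above reduces to a single ratio. The sketch below illustrates it; the function name is hypothetical and the counts in the usage note are invented for illustration only, since the underlying target and decoy hit counts are not reported in this section.

```python
def false_discovery_rate(decoy_hits, target_hits):
    """FDR as described in the text: decoy / (target + decoy).

    decoy_hits:  proteins identified in the reversed database (false positives)
    target_hits: proteins identified in the forward IPI database
    """
    total = decoy_hits + target_hits
    if total == 0:
        raise ValueError("no identifications")
    return decoy_hits / total
```

For example, with purely illustrative counts of 39 decoy hits and 961 forward hits, `false_discovery_rate(39, 961)` evaluates to 0.039, i.e. 3.9%.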
The VEMS peptide quantification values (QVs) were based on calculation of QV for each set of the first three peaks in the light and heavy isotopic clusters.

$$\mathrm{QV}_{\text{peptide}} = \frac{\mathrm{QV}_{\text{1st set of peaks}} + \mathrm{QV}_{\text{2nd set of peaks}} + \mathrm{QV}_{\text{3rd set of peaks}}}{3} \qquad \text{(Eq. 1)}$$

The maximum standard deviation of the peptide QV accepted for quantification was 5. The VEMS QV for any given set of peaks was calculated as the peak intensity (I) of the heavy isotope divided by the total peak intensity of the heavy and the light isotope from multiple scans (the number of scans depending on the peak intensity).

$$\mathrm{QV} = \frac{I_{\text{Heavy}}}{I_{\text{Heavy}} + I_{\text{Light}}} \qquad \text{(Eq. 2)}$$

This means that a protein expressed in equal amount by both cell lines has a QV of 0.5 or 50%. For each sample, the QVs of β-actin (UniProt accession number P60709) and Na+/K+-ATPase (UniProt accession number P05023) were examined, and the average QV was calculated. When average QV was more than ±6 from 50%, the quantifications were normalized to assure comparable results between experiments. A QV-dependent normalization function, which is deduced in supplemental Data 1, was calculated for each sample to be normalized. A protein expressed in only NM-2C5 has a theoretical QV of 0%, and a protein expressed in only M-4A4 has a theoretical QV of 100%. A threshold of 2-fold change in expression levels of proteins in M-4A4 relative to NM-2C5 was chosen to ensure the biological relevance of the markers identified corresponding to QV of 33 and 67%. For marker identification, cutoff QVs of 40 and 60% were used to allow for a general margin of error as the true signal is often underestimated due to signals from background "noise" or saturation of the detector.
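The relation between a QV and a fold change follows directly from Eq. 2: because QV = I_Heavy / (I_Heavy + I_Light), the heavy-to-light ratio is QV / (1 - QV). The sketch below works through Eq. 1, Eq. 2, and this conversion; the function names are hypothetical and this is not the VEMS implementation.

```python
def peak_set_qv(i_heavy, i_light):
    """Eq. 2: heavy intensity over total (heavy + light) intensity."""
    return i_heavy / (i_heavy + i_light)


def peptide_qv(peak_sets):
    """Eq. 1: mean QV over the first three (heavy, light) peak sets."""
    if len(peak_sets) != 3:
        raise ValueError("Eq. 1 averages exactly three sets of peaks")
    return sum(peak_set_qv(h, l) for h, l in peak_sets) / 3


def qv_to_fold(qv):
    """Heavy/light (M-4A4 / NM-2C5) ratio implied by a QV in [0, 1)."""
    return qv / (1.0 - qv)


def fold_to_qv(fold):
    """QV implied by a heavy/light fold change."""
    return fold / (1.0 + fold)
```

A 2-fold increase gives `fold_to_qv(2)` = 2/3 (about 67%) and a 2-fold decrease gives `fold_to_qv(0.5)` = 1/3 (about 33%), matching the thresholds quoted in the text. The relaxed cutoffs of 60% and 40% correspond to ratios of 1.5 and about 0.67.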
For a protein to be considered differentially expressed it should meet the following criteria. 1) The protein should be identified in two or more samples and be identified by two or more peptides in at least one of these.
2) The protein quantifications should be based on two or more different peptides, and at least one protein QV should be above 60% or below 40%. Furthermore, proteins that did not consistently exhibit differential expression despite fulfilling the above criteria were excluded.
The protein list generated by VEMS was transferred to ProteinCenter (Professional Edition, version 1.1.2-1.3.4, Proxeon, Odense, Denmark). To reduce redundancy the identified proteins were clustered into groups according to similarity levels 95, 80, and 65% (SL95, SL80, and SL65, respectively), but all proteins matching the data equally well are reported in supplemental Data 2 and 3. For marker identification, the proteins were sorted according to the number of samples they were identified in and according to the quantification.
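The selection criteria above can be read as a simple filter over per-sample records. The sketch below is one possible reading, not the authors' procedure. The record layout is hypothetical, the peptide requirement in criterion 2 is assumed to apply across samples, and the consistency rule (no QVs on both sides of the cutoffs) is an assumption standing in for a judgment the text does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical per-sample summary for one protein."""
    identified_peptides: frozenset
    quantified_peptides: frozenset
    qv_percent: Optional[float] = None  # protein QV in percent, if quantified


def is_differential(records, low=40.0, high=60.0):
    found = [r for r in records if r.identified_peptides]
    # Criterion 1: found in >= 2 samples, with >= 2 peptides in at least one.
    if len(found) < 2:
        return False
    if not any(len(r.identified_peptides) >= 2 for r in found):
        return False
    # Criterion 2: quantification rests on >= 2 different peptides
    # (assumed here to be counted across samples).
    quantified = set().union(*(r.quantified_peptides for r in found))
    if len(quantified) < 2:
        return False
    qvs = [r.qv_percent for r in found if r.qv_percent is not None]
    up = any(q > high for q in qvs)
    down = any(q < low for q in qvs)
    if not (up or down):
        return False
    # Assumed consistency rule: reject proteins with QVs on both sides.
    return not (up and down)
```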

Western Blotting
Based on protein concentration measurements, equal amounts of plasma membrane preparations generated from M-4A4 and NM-2C5 cells were resolved by 4-20% SDS-PAGE (Pierce) and electroblotted onto a PVDF membrane. The membranes were blocked in PBS, 0.1% Tween 20, 5% nonfat dry milk powder for 1 h at room temperature and incubated with primary antibody for 16 h at 4°C followed by washing and incubation with HRP-conjugated secondary antibody for 1 h at room temperature. All antibody incubations and washing steps were carried out in PBS, 0.1% Tween 20. The immunoreactive bands were visualized using an ECL Western blot kit (Amersham Biosciences). Anti-Na+/K+-ATPase was always included in separate lanes with M-4A4 and NM-2C5 membrane lysates to ensure equal loading of the two cell lines.

Flow Cytometry Analysis
Subconfluent cells were washed with PBS and harvested by ESGRO Complete Accutase (Millipore, Danvers, MA) and suspended in complete medium containing enzyme inhibitors (Complete protease inhibitor mixture tablets, Roche Diagnostics GmbH) to a final concentration of 10^6 cells/ml of medium. The cells were incubated with primary antibody for 60 min, washed twice with complete medium, and pelleted at 300 × g for 5 min at 4°C. Cell suspensions were then incubated with a FITC-labeled secondary antibody for 60 min, washed, resuspended in PBS, and analyzed on a FACScan (BD Biosciences). Buffer, medium, and cells were kept at 4°C until analyzed. As a negative control, cells were incubated without primary antibody. Because the autofluorescence of the cells without primary antibody or in the presence of an isotype control differed between the two cell lines, the results were interpreted using the ratio of geometric mean values with and without a primary antibody within each cell line, followed by comparison of these ratios between the cell lines.
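The ratio-of-ratios interpretation described above can be sketched as follows. The function names and all numeric values are hypothetical; they only illustrate how normalizing each cell line to its own no-primary-antibody control removes the difference in autofluorescence before the lines are compared.

```python
def signal_ratio(gmean_with_primary, gmean_without_primary):
    """Within-line ratio: stained geometric mean over its own control."""
    if gmean_without_primary <= 0:
        raise ValueError("control geometric mean must be positive")
    return gmean_with_primary / gmean_without_primary


def between_line_ratio(line_a, line_b):
    """Ratio of the within-line ratios of two cell lines.

    Each argument is a (with_primary, without_primary) tuple of
    geometric mean fluorescence values.
    """
    return signal_ratio(*line_a) / signal_ratio(*line_b)
```

With invented readings of (40.0, 4.0) for one line and (10.0, 5.0) for the other, the within-line ratios are 10 and 2, so `between_line_ratio` returns 5.0, even though the raw stained values differ only 4-fold.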

Quantitative Real Time PCR
Relative quantification of gene expression was performed in triplicate using SYBR Green PCR Master Mix (Applied Biosystems, Foster City, CA) in accordance with the recommendation of the supplier. The median relative expression levels were normalized using the reference genes TATA box-binding protein, β-actin, and Na+/K+-ATPase. The primers for specific amplification were all obtained from QuantiTect Primer Assay collection (Qiagen): TBP/TATA box-binding protein, QT00000721; ACTB/β-actin, QT00095431; ATP1A1/Na+-

Tissue Specimens
Selection of breast cancer biopsies was based on the medical history of the patient related to whether (n = 15) or not (n = 15) the patient experienced distant metastases within a 10-year follow-up period and on the subtype: estrogen receptor-positive (ER+), progesterone receptor-positive (PR+), and HER2/neu-negative (HER2−) (n = 10), ER+/PR+/HER2+ (n = 2), ER−/PR−/HER2− (n = 12), or HER2-overexpressing subtype ER−/PR−/HER2+ (n = 6). When axillary lymph node metastasis was present at primary surgery (n = 15), this tissue was included in the tissue array. ER, PR, and HER2 status were re-evaluated according to current technical standards. Normal breast tissues (from breast reductive surgery) were also included in the tissue array (n = 2). Selection and evaluation of cancer tissue were performed by a highly experienced pathologist.
Formalin-fixed paraffin-embedded primary breast tumor biopsies and corresponding axillary lymph node metastasis retrieved at the primary diagnostic surgery were obtained from the tissue bank of Department of Pathology, Odense University Hospital. A representative area of each tumor was selected from a hematoxylin- and eosin-stained tumor slide and punched out from the corresponding paraffin-embedded tumor tissue to form tissue arrays, each containing 15-20 tissue sections with a diameter of 0.4 mm. Based on standard histopathological criteria the immunohistochemical staining results were rated according to percentage of positive tumor cells (0-100%) and staining intensity from +1 to +3 with +3 being the most intense. When a heterogeneous staining pattern was present, the tumor was rated according to the most intense stained area when this was representative of at least 10% of the positive tumor cells.
All samples were coded, and patient confidentiality was maintained. The study was approved by the ethics committee of Funen and Vejle County.

Immunohistochemistry and Immunocytochemistry
Plasma/thrombin cell blocks were generated from the cell lines by adding 50 µl of plasma to 5 × 10^5 cells followed by 35 µl of bovine thrombin (Biofac A/S, Ejby, Denmark), which leads to the formation of a clot surrounding the cells. The plasma/thrombin cell clots and patient tissue biopsies were fixed in 4% neutral buffered formaldehyde, pH 7.4, for 24 h. Sections (4 µm) were cut from paraffin-embedded tissue blocks, mounted on ChemMate™ Capillary Gap Slides (Dako), dried at 60°C, deparaffinized, and hydrated. Prior to antigen retrieval, blocking of endogenous peroxidase was performed in 1.5% hydrogen peroxide in TBS buffer, pH 7.4, for 10 min. A panel of antigen retrieval protocols was initially evaluated, including heat-induced epitope retrieval by microwave boiling for 15 min in (i) T-EG buffer (10 mM Tris, 0.5 mM EGTA, pH 9.0), (ii) 10 mM citrate buffer, pH 6.0, or (iii) Dako Target Retrieval Solution (Dako). Slides were heated for 11 min at full power (900 watts) in the microwave oven and then for 15 min at 400 watts. After heating, slides remained in buffer for 15 min. Heat-induced epitope retrieval by microwave boiling in T-EG or Target Retrieval Solution for 15 min proved to be the optimal antigen retrieval methods for a panel of antibodies tested. Sections were subsequently incubated with antibody diluted in antibody diluent (S2022, Dako) for 1 h at room temperature, washed with TNT buffer (0.1 M Tris, 0.15 M NaCl, 0.05% Tween 20, pH 7.5), and immunostained using the EnVision+™ HRP detection system (K4001, Dako) or PowerVision+™ HRP detection system (ImmunoVision Technologies) on the TechMate™ 500 instrument (Dako). 3,3′-Diaminobenzidine was used as the substrate-chromogen system (K3468, Dako). Immunostaining was followed by nuclear counterstaining in Mayer's hematoxylin. Finally coverslips were mounted with AquaTex (Merck). For each experiment, samples with either an isotype-matched antibody or no primary antibody were included as controls.
Evaluation of the IHC staining was performed by a pathologist in a blinded setup. To identify proteins that were differentially expressed between any two groups of tissue specimens, a two-tailed Fisher's exact test was applied to identify p values smaller than 0.05. The significance level in the clinical study was calculated based on individual testing of each of the six marker antibodies against the panel of clinical samples. Due to the limited strength of the study (the low sample size in each group) significance was not maintained when using Bonferroni correction for multiple comparisons.
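As an illustration of the two-tailed Fisher's exact test, the sketch below applies it to the integrin β1 counts given in the Results (high expression in 14 of 21 primary tumors with disseminated disease versus 2 of 9 without). The software used for the original analysis is not stated; this version assumes the common convention of summing all tables that are no more probable than the observed one, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    lo, hi = max(0, c1 - r2), min(r1, c1)

    def weight(x):
        # Unnormalized hypergeometric probability of a table with cell a = x.
        return comb(r1, x) * comb(r2, c1 - x)

    w_obs = weight(a)
    # Integer weights make the "as or less probable" comparison exact.
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= w_obs)
    return extreme / comb(r1 + r2, c1)


# Rows: disseminated disease yes / no; columns: integrin beta1 high / not high.
p_value = fisher_exact_two_sided(14, 7, 2, 7)
```

Under these assumptions `p_value` falls just below 0.05, in line with the p < 0.05 reported for integrin β1. A Bonferroni-adjusted threshold for six markers (0.05 / 6) would not be met, which agrees with the statement above.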

RESULTS
Proteomics-To identify proteins, particularly plasma membrane proteins that may influence the ability of breast cancer cells to expand and establish metastasis at distant sites, the isogenic cell line pair M-4A4 and NM-2C5 was used as a model system. Comparison of protein expression levels in membrane-enriched preparations of the two cell lines using quantitative LC-MS/MS proteomics identified a total of 1919 protein entries. LC-MS/MS data were processed in VEMS. All data, including protein identifications, quantified peptides, sequence coverage, accuracy, precursor m/z observed, precursor charge observed, scores, and E-values, are supplied in supplemental Data 2 and 3. Assigned and manually verified spectra for all single peptide-based protein identifications are provided in supplemental Data 4. Refining the search to proteins identified by at least two peptides reduced the identified protein entries to 1145 of which 622 were membrane proteins. Thirteen proteins were found to be quantitatively expressed at higher amounts in the metastatic M-4A4 cells compared with the non-metastatic NM-2C5 cells, whereas three proteins were expressed in lower amounts in M-4A4 compared with NM-2C5 cells, and thus all of these proteins may be potential markers of cancer cells with the ability to establish metastasis (Table I).
Table I footnote: [...] (2), 2-5-fold change (>2), or more than 5-fold (>>2). All listed proteins were identified in at least two samples, each run twice or in triplicates.
[...] were expressed 2- to 5-fold higher or lower, respectively, in M-4A4 cells compared with NM-2C5. An arbitrary minimum threshold of a 2-fold change was used and deemed to be biologically relevant.
Biochemical Validation of Altered Protein Expression-To validate the quantitative proteomic differences observed by mass spectrometry, the expression levels of eight proteins were examined in the two cell lines using immunocytochemistry, flow cytometry, and/or Western blotting.
Immunocytochemical analysis of the NM-2C5 and M-4A4 cell lines as well as the "parental" cell line MDA-MB-435 was performed when commercial antibodies were available (Fig. 1). Interestingly NDRG1 showed a distinct translocation from the cytoplasm in NM-2C5 to the plasma membrane in M-4A4 cells (Fig. 1a). In MDA-MB-435 cells, NDRG1 was localized in the cytoplasm in some cells and the plasma membrane in others, corresponding to a mix of non-metastatic and metastatic cells. Analysis of integrin β1 showed more intense and distinct plasma membrane staining in M-4A4 compared with NM-2C5 (Fig. 1b). MDA-MB-435 showed an intermediate of the two expression patterns. Intense cell surface expression of the MHC class II proteins HLA-DRα and HLA-DRβ was observed in M-4A4 cells, whereas no staining of NM-2C5 cells was observed (Fig. 1, c and d). CD74 expression was also restricted to M-4A4 cells, and interestingly, in addition to the traditional cytoplasmic staining pattern of CD74/invariant chain, some staining of the cell surface, where CD74 acts as a migration inhibitor factor (MIF) receptor, was also observed (Fig. 1e) (29). MDA-MB-435 showed a staining pattern similar to that of M-4A4 for HLA-DRα, HLA-DRβ, and CD74 proteins.
The expression level of CD44 on M-4A4 compared with NM-2C5 was analyzed using a panel of anti-CD44 antibodies. A CD44v6 antibody, recognizing an epitope encoded by exon v6 on the variant portion of human CD44, exhibited cell surface staining of NM-2C5 and MDA-MB-435, whereas M-4A4 was not stained (Fig. 1f). In contrast, staining using an anti-CD44 antibody recognizing an epitope conserved on all CD44 isoforms showed equal cell surface staining of the three cell lines (data not shown). Further a CD44v5-specific antibody did not stain any of the three cell lines (data not shown).
Expression levels of the different proteins were next examined by flow cytometry (Fig. 2). HLA-DR␤ exhibited the most distinct difference in expression levels of all proteins examined by flow cytometry with a 65-fold increase in M-4A4 versus NM-2C5. HLA-DR␣ exhibited a 2-fold higher expression level in M-4A4 versus NM-2C5 by flow cytometry (data not shown), a somewhat lower expression difference than that observed by proteomics analysis (2-to 26-fold) and immunocytochemistry, suggesting that HLA-DR␣ and -DR␤ molecules were presented on the plasma membrane as separate proteins rather than as an active MHC class II complex. CD74 and ecto-5Ј-NT also showed twice the expression level on M-4A4 compared with NM-2C5, whereas less difference in expression levels was observed for integrin ␤1, NDRG1, and annexin A2 (data not shown). CD44 showed 2-fold higher expression on NM-2C5 versus M-4A4 in agreement with the proteomics study.
Semiquantitative measurement of the denatured proteins by Western blotting in reduced and non-reduced states was performed using Na ϩ /K ϩ -ATPase, a plasma membrane protein present in quantitatively equal amounts in both cell lines, for normalization (Fig. 3). HLA-DR␣ (35 kDa) and -DR␤ (29 kDa) were detected in M-4A4 and lymphocytes (positive control) but not in NM-2C5. A much more intense band for ecto-5Ј-NT (58 kDa) was observed with M-4A4 compared with NM-2C5 under non-reducing conditions. Protein bands corresponding to a dimer and a trimer of ecto-5Ј-NT were also detected. The band for integrin ␤1 (110 kDa) was also at least twice the intensity in M-4A4 compared with NM-2C5 confirming the proteomics data.
Using the anti-CD44 antibody, which recognized all CD44 isoforms, protein bands with molecular masses of 95 and 170 kDa were stained in both cell lines. Comparison of the staining intensity of the bands showed that the amount of the higher molecular mass isoform of CD44 was greater in NM-2C5 than in M-4A4. To further characterize the isoforms recognized in M-4A4 and NM-2C5, cell lysates were treated with the deglycosylation reagent peptide-N-glycosidase F. A shift from 95 to 65 kDa was observed for the lower CD44 band and from 170 to 130 kDa for the higher CD44 band upon deglycosylation. Under these deglycosylation conditions, the intensity of the 130-kDa band was significantly stronger in NM-2C5 than in M-4A4, whereas the intensity of the 65-kDa band was only slightly stronger in NM-2C5 than in M-4A4. Under reducing conditions, CD44, integrin ␤1, or ecto-5Ј-NT could not be detected in Western blotting of M-4A4 and NM-2C5 probably due to epitopes including disulfide bonds. Western blotting of NDRG1 and annexin A2 resulted in equally intense bands in M-4A4 and NM-2C5 (data not shown). The discrepancy between the proteomics and Western blot results for NDRG1 and annexin A2 was likely due to the semiquantitative nature of the latter method. Further analysis of the remaining candidate cell surface markers identified by proteomics was not possible because of the lack of suitable reagents for immunohisto-, immunocytochemistry, and/or flow cytometry analysis.
Evaluation of Expression at the Transcriptional Level-The observed altered protein expression levels corresponded with alterations in mRNA expression levels as measured by quantitative real time PCR, although the magnitude of changes in transcription did not exactly coincide (Fig. 4). The mRNA expression levels of HLA-DRα, -DRβ, and CD74 were approximately 100-fold higher in M-4A4 than in NM-2C5, whereas mRNA expression levels of ecto-5′-NT, NDRG1, integrin β1, integrin αV, sulfate transporter, and ICAM-1 were 5-fold higher or less in M-4A4 compared with NM-2C5. Finally, mRNA expression levels of annexin A2, integrin α6, ADP-ribosylation factor 4, and BSCv protein were less than 2-fold higher in M-4A4 compared with NM-2C5. The mRNA expression levels of CD98, CD44, and MUC18 were 2-fold lower or less in M-4A4 compared with NM-2C5.
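
This section does not state how the fold differences were derived from the real time PCR data. As an illustration only, and assuming the common 2^(-ΔΔCt) approach with a reference gene and 100% amplification efficiency (both assumptions), a relative expression value could be computed as sketched below. The function name and all Ct values are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target_a, ct_ref_a, ct_target_b, ct_ref_b):
    """Fold expression of a target gene in sample A relative to sample B.

    Assumes the 2^(-ddCt) method: each target Ct is first normalized to a
    reference gene in the same sample, then the two samples are compared.
    """
    delta_a = ct_target_a - ct_ref_a  # normalized Ct in sample A (e.g. M-4A4)
    delta_b = ct_target_b - ct_ref_b  # normalized Ct in sample B (e.g. NM-2C5)
    return 2.0 ** -(delta_a - delta_b)


# Hypothetical Ct values, chosen only to illustrate a roughly 100-fold difference.
fold = relative_expression(ct_target_a=22.0, ct_ref_a=18.0,
                           ct_target_b=29.0, ct_ref_b=18.0)
print(f"fold difference (A vs B): {fold:.0f}")
```

A difference of seven normalized cycles corresponds to a 2^7 fold difference, which is why a change of the order reported for HLA-DRα, -DRβ, and CD74 needs only a modest shift in Ct.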

Evaluation of Protein Expression in Clinical Cancer Biopsies-When grouping the patients based on whether or not the primary tumors had metastasized to axillary lymph nodes or distant organs, a significant correlation between outcome and integrin β1 (p < 0.05) or ecto-5′-NT (p < 0.05) expression was observed (Table II). Similarly, a significant correlation between outcome and ecto-5′-NT (p < 0.05) expression was observed when the patients were grouped based on whether or not they developed distant recurrence within a 10-year follow-up period (Table III).
IHC analysis revealed that myoepithelial, but not luminal, cells of normal resting breast epithelium showed high expression of integrin β1 (Fig. 5, g-i). High integrin β1 expression was observed in the malignant epithelial cells of biopsies of 67% (n = 14 of 21) of the primary tumors that had developed distant and/or lymph node metastases, whereas only 22% (n = 2 of 9; p < 0.05) of primary tumors with no disseminated disease exhibited integrin β1 expression. Two patients with integrin β1-negative primary tumors exhibited integrin β1 expression in the axillary lymph node metastasis, suggesting that an integrin β1-expressing subclone of the primary tumor was responsible for the axillary lymph node metastasis. Integrin β1 expression seemed to be independent of hormonal receptor and HER2 status (Table IV).
IHC analysis showed that normal breast epithelia expressed no or only low amounts of ecto-5′-NT (Fig. 5j). Intense ecto-5′-NT expression was observed in 67% (n = 14 of 21) of the primary tumors that developed distant and/or lymph node metastases but in only 22% (n = 2 of 9; p < 0.05) of primary tumors with no disseminated disease. Similarly, high ecto-5′-NT expression was observed in 67% (n = 10 of 15) of primary tumors from patients with relapse within a 10-year period but in only 20% (n = 3 of 15; p < 0.05) of those from patients without relapse. Further, two patients with ecto-5′-NT-negative primary tumors exhibited ecto-5′-NT expression in the axillary lymph node metastasis, suggesting that an ecto-5′-NT-expressing subclone of the primary tumor was responsible for the axillary lymph node metastasis. The ecto-5′-NT expression seemed to be independent of hormonal receptor and HER2 status (Table IV). Complete overlap between the tumors that expressed ecto-5′-NT and those that expressed integrin β1 was not observed, suggesting that they were independent, but potentially additive, factors for development of metastasis.
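
The statistical test behind the p values above is not named in this section. As a rough check, and assuming a two-sided Fisher's exact test (an assumption, chosen here because the groups are small), the counts quoted for ecto-5′-NT can be tested as sketched below. The function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(k):
        # Unnormalized hypergeometric weight of the table with k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum all tables that are no more probable than the observed one
    # (integer arithmetic, so ties are handled exactly).
    extreme = sum(weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed)
    return extreme / comb(n, col1)


# Counts from the text, given as (high expression, not high) for each group.
p_spread = fisher_exact_two_sided(14, 7, 2, 7)    # 14 of 21 vs 2 of 9
p_relapse = fisher_exact_two_sided(10, 5, 3, 12)  # 10 of 15 vs 3 of 15
print(f"spread: p = {p_spread:.3f}; relapse: p = {p_relapse:.3f}")
```

Under this assumed test, both comparisons fall below 0.05, in line with the significance reported above.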
In contrast to ecto-5′-NT and integrin β1, HLA-DRα, -DRβ, CD74, and NDRG1 expression did not seem to correlate with the presence of metastasis (axillary lymph node or distant) or recurrence but rather with the ER−/PR− phenotype of the M-4A4 and NM-2C5 cell lines (Tables II and III). IHC staining for NDRG1 (Fig. 5, d-f) showed low, predominantly cytoplasmic, expression in normal breast epithelial cells. The tumor cells of ER+/PR+ biopsies were mainly negative (33%; n = 4 of 12) or exhibited only weak staining similar to normal breast epithelial cells regardless of their 10-year metastatic status. In contrast, a significant correlation between NDRG1-positive tumor cells and ER−/PR− biopsies was found (83%; n = 15 of 18; p < 0.05), again regardless of recurrence status. In biopsies from the two patients with relapses of ER+/PR+/HER2+ tumors, NDRG1 was highly expressed in the tumor cells (Table IV). As observed by ICC staining of the cancer cell lines, the subcellular location of NDRG1 varied from tumor to tumor and within tumors: in some cases it was mostly cytoplasmic, whereas in others NDRG1 was predominantly present on the plasma membrane or in both locations. No clear correlation between subcellular expression patterns and hormonal or HER2 status or other clinical parameters could be made.
IHC analysis showed that HLA-DRα and -DRβ were expressed in normal resting breast epithelial cells (n = 2) and constitutively expressed by professional antigen-presenting cells (APC), such as dendritic cells, B cells, and macrophages, in cancer and normal tissue (Fig. 5, a-d). The malignant cells in primary breast cancer samples from tumors that had metastasized within a period of 10 years did not show clear staining differences from those of tumors that did not metastasize. However, significant correlations between cancer cell expression of the two proteins and the ER−/PR− phenotype of the biopsies were found regardless of their clinical outcome (p < 0.05). Positive staining for the two proteins was observed in cancer cells of 50% (n = 9 of 18) of ER−/PR− tumors, whereas only 8% (n = 1 of 12) of the ER+/PR+ tumors were stained. The expression of HER2 did not influence the frequency of HLA-DRα and HLA-DRβ expression (Table IV). Staining for CD74 generally followed that of HLA-DRα and HLA-DRβ and was mostly cytoplasmic (Fig. 6, a-c). Interestingly, in two cases of triple-negative biopsies from patients with node-positive disease and distant metastasis, CD74 was present in the membrane, whereas HLA-DRα and HLA-DRβ expression was absent (Fig. 6, d-f).

FIG. 5. Immunohistochemical analysis of staining patterns of anti-HLA-DRβ (a-c), -NDRG1 (d-f), -integrin β1 (g-i), and -ecto-5′-nucleotidase (j-l) antibodies in breast cancer and normal breast (a, d, g, and j) tissue biopsies. Hormonal and recurrence status within a 10-year period is as follows for breast cancer tissue: b, ER+/PR+/+relapse; c, ER−/PR−/−relapse; e, ER−/PR−/+relapse (cytoplasm staining); f, ER−/PR−/+relapse (membrane staining); h, ER−/PR−/+relapse; i, ER−/PR−/−relapse; k, ER−/PR−/+relapse; l, ER+/PR+/−relapse.
Analysis of tissues for CD44 identified expression on the surface of all normal breast epithelial cells. In contrast, only a few breast cancers expressed CD44 on the surface (n = 4), whereas the remainder were CD44-negative with the exception of subpopulations of CD44-positive cells that were clustered together. Although an overall lower expression of CD44 was observed in tumors compared with normal tissue, no correlation with relapse or breast cancer subtype could be identified (data not shown).

DISCUSSION
Our quantitative comparative proteomic study revealed 16 membrane proteins expressed at higher or lower levels (2-fold or more) in the metastatic compared with the non-metastatic cell line, and our analysis by other biochemical techniques such as flow cytometry, Western blotting, and ICC unequivocally confirmed these findings, validating the strategy used. These two cell lines are equally tumorigenic in nude mice, but only M-4A4 cells have the capability to establish metastasis, whereas NM-2C5 cells spread but do not expand at the distant site. Therefore, the findings were worthy of deeper analysis in human tumor samples, which led to the confirmation that integrin β1 and ecto-5′-NT expression correlate significantly with metastatic performance in human malignant disease.
Ecto-5′-NT, which is a glycosylphosphatidylinositol-anchored component of lipid rafts and an enzyme converting 5′-AMP to extracellular adenosine (30), was one of the proteins identified in our proteomics study to be expressed at a 2- to 5-fold higher level in M-4A4 than in NM-2C5. The role of ecto-5′-NT in cancer has been shown to relate to adenosine generation; adenosine, acting through specific receptors, is an important metabolite released by cancer cells that establishes physiological conditions conducive to tumor progression (31). The relationship of ecto-5′-NT expression, its activity, and adenosine generation to tumor cell biology has not been fully elucidated. Further, ecto-5′-NT has been identified as an adhesion molecule that binds to other cells and extracellular matrix proteins such as laminin and fibronectin (32). Our IHC analysis of clinical breast cancer specimens showed significant correlation between ecto-5′-NT expression and tumor spread and recurrence; e.g. high ecto-5′-NT expression was observed in 67% of the primary breast cancer biopsies from patients with recurrence but in only 20% of the biopsies from patients with no recurrence. Recently a study of clinical breast cancer specimens suggested an inverse correlation between ER expression and ecto-5′-NT expression (33); however, our study shows that ecto-5′-NT expression is independent of ER, PR, and HER2 status. Similar results were also reported from an earlier study (34), and careful analysis of the study by Spychala et al. (33) showed that the inverse correlation between ER expression and ecto-5′-NT expression was only found when an arbitrary cutoff at 20% ER+ cancer cells was set. We believe that the observation of ecto-5′-NT in ER− breast cancer cells relates to the aggressiveness of this breast cancer subtype resulting in earlier development of metastasis (33,35).
We propose that increased expression of ecto-5′-NT may facilitate establishment of micrometastases at distant sites either indirectly through adenosine promotion of tumor cell growth and invasiveness or directly by the action of ecto-5′-NT as an adhesion molecule. This theory is supported by studies showing that small interfering RNA against ecto-5′-NT effectively inhibits invasion and migration of the highly aggressive MDA-MB-231 cells and prevents adhesion to the extracellular matrix (36).
Integrins, ICAM-1 (CD54), CD44, and MUC18 (CD146) are all adhesion receptors that link the extracellular matrix to the cytoskeleton and play critical roles in the metastatic progression of breast cancer (37). Tumor cells might use integrins for migration and anchorage at sites of metastasis (38-40). In our proteomics study, we found 2- to 3-fold higher expression levels of integrins β1, α6, and αV and ICAM-1, whereas CD44 and MUC18 exhibited 2- to 3-fold lower expression in M-4A4 versus NM-2C5. Quantitative real time PCR confirmed this at the mRNA level. The higher level of integrin β1 and lower level of CD44 in M-4A4 versus NM-2C5 were also confirmed by Western blotting, and ICC showed a more distinct plasma membrane expression of integrin β1 in M-4A4 versus NM-2C5. Our IHC data showed that 67% of the node-positive patients and/or patients developing recurrent disease within a 10-year period had higher luminal integrin β1 expression compared with 22% in the group of metastasis-free patients, significantly correlating activation of integrins to tumor progression. ER, PR, and HER2 status had no significant influence on the expression of integrin β1. Our data are supported by other studies showing that integrin α6β1, a receptor for laminin, facilitates tumorigenesis and promotes tumor cell survival in distant organs (41). Targeted elimination of integrin α6β1 in a breast carcinoma cell line reduced the size of the primary tumor and the number of metastatic foci in the lungs of experimental mice (42). A high expression level of integrin α6 has also been correlated with reduced survival of women with breast cancer (43), whereas integrin αVβ3 seems to promote spontaneous metastasis of breast cancer to bone (44). It was previously suggested that a network of proteins, including HLA-DR molecules and integrin β1, plays a role in signal transduction, cell adhesion, and motility in melanoma and lymphoid cells (45). Analysis of cell membrane proteins in cell lines derived from metastasis and malignant melanoma has also demonstrated that integrin β1 and integrin αV are coexpressed with ecto-5′-NT (46); this is in agreement with our observation of a significant, but not complete, overlap between integrin β1 and ecto-5′-NT expression in the clinical breast cancer specimens.
CD44 proteins are a family of ubiquitously expressed transmembrane glycoproteins and a major receptor for hyaluronan. All isoforms of the protein are encoded by a single gene but are derived by alternative splicing (v1-v10) and variation in O- and N-glycosylation (47,48). CD44s, the smallest and most widely expressed isoform, is expressed in ductal epithelial cells and myoepithelial cells in all cases of normal and benign breast tissue; however, the expression of CD44s progressively decreases with increasing deviation from normal histology (49), which is consistent with our data. The peptides identified by our study were identical to protein sequences in both the N-terminal part (exon 2) and the C-terminal part of CD44, sequences present in all CD44 isoforms. However, our biochemical studies suggest that only some CD44 isoforms exhibit different expression levels in the two cell lines: ICC using an anti-CD44 antibody recognizing all CD44 isoforms showed no significant difference in staining patterns or intensity between the two cell lines, as also shown by Urquidi et al. (13). In contrast, an antibody against CD44v6 stained NM-2C5 but not M-4A4, whereas neither cell line was stained using an antibody against CD44v5. Western blotting of NM-2C5 and M-4A4 cells identified two CD44 protein bands at 95 and 170 kDa that were both glycosylated. The higher molecular mass form seemed to be more highly expressed in NM-2C5 than in M-4A4 cells. IHC staining of CD44 showed a strongly heterogeneous expression pattern (data not shown). This suggests that further studies might identify different isoforms that are associated with metastatic behavior but were not elucidated with the antibodies that we used.
We found a 2- to 3-fold higher expression level of NDRG1 in the plasma membrane fraction of M-4A4 than of NM-2C5. ICC confirmed this difference but revealed an additional difference in the subcellular localization of NDRG1 in the two cell lines: NDRG1 was predominantly cytoplasmic in NM-2C5, whereas the plasma membrane was predominantly stained in M-4A4. Due to the multiple localizations of NDRG1 and the absence of a transmembrane domain in the protein, we suggest that NDRG1 is translocated to the plasma membrane in M-4A4 in the process of metastasis, likely enabling the cells to adhere and proliferate in distant organs. However, this protein, also referred to as CAP43, DRG1, RTP, and RIT42, is expressed in normal tissue, and a strong association of NDRG1 with the plasma membrane in lactating breast epithelial cells has been observed by others, whereas in other cell types the NDRG1 staining was localized in the cytoplasm or nucleus (50). The function of this protein remains unclear, and some controversy regarding its role in cancer exists. Some studies suggest NDRG1 involvement in metastatic suppression, whereas others show higher expression of this protein, likely related to metastasis (51-55). The difference in subcellular localization of NDRG1 may be important for its function, and we therefore remain open to the concept that it plays some role in the development of metastases. Our IHC study on clinical samples showed that NDRG1 expression was significantly correlated with the ER−/PR− status of breast cancer cells because 83% of ER−/PR− biopsies independent of metastatic status showed increased NDRG1 expression compared with only 33% in ER+/PR+ biopsies. In another study, estradiol was shown to induce down-regulation of NDRG1 in ER+ breast cancer cells but not in ER− cells, indicating a hormonal influence on NDRG1 only when the hormone receptors are present, which also could explain some of the discrepancies among the NDRG1 studies (55). NDRG1 expression in ER−/PR−/HER2− but not ER+/PR+/HER2− breast cancer cells may correlate with the ability of the primary tumor cells to metastasize.
MHC class II HLA-DRαβ heterodimers are constitutively expressed by professional APC but, unlike MHC class I antigens, are not present on resting breast epithelial cells. However, MHC class II antigens have been detected in the normal lactating breast as well as in several other carcinomas (56-58). In professional APC, CD74/invariant chain is present in intracellular association with the immature MHC class II molecule blocking the peptide-binding groove; however, a pool of CD74 resides transiently on the cell surface in diverse cells (59). The exact function of cell surface CD74 has not been determined, but its recent identification as a MIF receptor opens a new and likely fruitful area of research. A study by Starlets et al. (29) suggests that CD74 acts as a survival receptor, whereas other studies have identified CD44 as a potential accessory protein required for MIF-CD74 signal transduction (60). MIF, a proinflammatory cytokine, is overexpressed in prostate cancer, but the mechanism by which MIF exerts effects on cancer cells remains undetermined (61). We found that the expression level of CD74 increased more than 100-fold, HLA-DRα increased up to 26-fold, and HLA-DRβ increased only 5-fold in M-4A4 compared with NM-2C5 in the proteomics study. In flow cytometry, a 65-fold difference in expression of HLA-DRβ was observed between the two cell lines, whereas the differences in HLA-DRα and CD74 expression were 2-fold, suggesting that the proteins are regulated and exist individually rather than in a complex. ICC analysis of HLA-DRα and -DRβ showed distinct cell surface staining of M-4A4 cells, whereas CD74 was present in both the cytoplasm and at the cell surface of M-4A4. In contrast, no staining for the three proteins was observed in NM-2C5. The differences in protein expression level between assays may relate to the fact that some intracellular protein in addition to membrane protein was measured by the proteomics analysis and to the quality of the antibodies used for flow cytometry and ICC. Our IHC data showed that the tumor cells of ER−/PR− breast cancer biopsies, regardless of metastatic status, expressed significantly higher levels of HLA-DRα and -DRβ compared with the ER+/PR+ biopsies. CD74 was present in tumor cells of 61% of ER−/PR− breast cancer biopsies but in only 33% of the ER+/PR+ biopsies. The CD74 staining of tumor cells was mainly cytoplasmic, but importantly, in a few cases CD74 membrane staining was observed. Interestingly, CD74 has been suggested as a promising target of cancer therapy in multiple myeloma and exhibits restricted expression in normal tissue (62).
This study indicates the feasibility of using mass spectrometry and other proteomics techniques to find new connections between the ability of cancer cells to establish metastasis in distant organs and altered expression levels of specific proteins. The study was designed to focus on plasma membrane proteins relative to the entire proteome, thereby increasing the likelihood of identifying proteins that could be targeted by antibodies and small molecule drugs and thus inhibit the metastatic process. Several of the plasma membrane proteins, which we identified and characterized, seem to be involved in the metastatic behavior of human breast cancer. This indicates that further investigations using this strategy hold promise for identifying novel mechanistically important targets for preventing metastasis and significantly advancing treatment of breast cancer. Because of novel technical improvements, future studies will also include identification and characterization of proteins with altered expression in other subcellular compartments, thereby bringing further insight to the biology of metastasis.