Molecular & Cellular Proteomics 4:523-533, 2005.
© 2005 by The American Society for Biochemistry and Molecular Biology, Inc.


From the GI Division/Department of Medicine, University of Washington, Seattle, WA 98195; Institute for Systems Biology, Seattle, WA 98103; and Institute for Molecular Systems Biology, ETH Zurich and Faculty of Nature, Science Federal Institute, University of Zurich, CH-8093 Zurich, Switzerland
| ABSTRACT |
|---|
Pancreatic ductal adenocarcinomas, which make up the majority of exocrine pancreatic tumors, are thought to develop in a multi-step process in which each step involves specific genetic mutations (3). In the past 10 years, significant progress has been made in identifying and characterizing cancer-related gene abnormalities; unfortunately, this progress has not yet translated into substantial clinical improvement in the diagnosis or treatment of the disease. Current clinical pancreatic tumor markers lack the sensitivity and specificity required to screen an asymptomatic population for early detection; an ideal biomarker would require a specificity of greater than 99% to avoid the consequences of a high rate of false-positive results. The only widely used marker for pancreatic cancer, CA19-9, is frequently elevated in pancreatic cancer but can also be expressed in other malignancies. Moreover, CA19-9 levels can be elevated in benign conditions such as acute and chronic pancreatitis, hepatitis, and biliary obstruction. The sensitivity and specificity of CA19-9 are 80-90%, limiting its value as a screening marker for the general population (4). Efforts to use various genetic mutations as surrogate markers of disease have also been unsuccessful because of insufficient sensitivity or specificity (5, 6). Therefore, there is an urgent need for new and better biomarkers for pancreatic cancer.
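To make the specificity requirement concrete, the following minimal Python sketch computes the positive predictive value of a screening test from its sensitivity, specificity, and the disease prevalence, using Bayes' rule. The prevalence used in the demonstration is a placeholder chosen purely for illustration and is not a figure from this review; the function name is likewise hypothetical.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Fraction of positive test results that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    assumed_prevalence = 1e-4  # illustrative assumption, not a value from the text
    for spec in (0.90, 0.99, 0.999):
        ppv = positive_predictive_value(0.90, spec, assumed_prevalence)
        print(f"specificity={spec:.3f}  PPV={ppv:.4f}")
```

When the disease is rare, false positives from the large healthy population dominate the positive results, which is why even small gains in specificity matter for screening.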
In the search for effective biomarkers, some fundamental and important aspects are worth careful consideration. First, what would be the ideal target for a biomarker? Theoretically, DNA, RNA, or protein can all be used as biomarkers. The choice of the type of target largely determines the technique applied and ultimately influences the outcome of biomarker development. Second, what biological specimens should be used in the search for the biomarkers? Several types of specimens are available for pancreatic cancer research, including pancreatic tissue, proximal body fluid such as pancreatic juice, and easily accessible body fluids such as serum or plasma. Third, should one use whole cancer tissue or isolated tumor cells for biomarker discovery? This is a concern common to all cancer researchers, but it is most manifest in pancreatic cancer where extensive fibroblastic stroma surrounds the cancer cells.
In recent years, the development of quantitative proteomics technologies has stimulated considerable interest in applying them clinically. The capability to identify sensitive and effective biomarker proteins is critical in the battle against cancer, because the ability to treat and cure cancer, particularly pancreatic cancer, depends directly on the ability to detect it at its earliest stage (7-9). Recent proteomics studies in pancreatic cancer have identified proteins differentially regulated in cancer samples and have led to the discovery of several candidate biomarkers (12) (Footnote 1). The use of proteomic profiling for pancreatic cancer biomarker discovery is still at an early stage; however, the efforts so far have been productive and the results are encouraging.
| STRATEGIES FOR PROTEIN BIOMARKER DISCOVERY IN PANCREATIC CANCER |
|---|
Because RNA levels do not necessarily correlate with protein levels (27, 28), a direct search for protein biomarkers may also be successful. Currently, protein-based biomarkers represent the majority of cancer biomarkers (6). The current approaches to searching for pancreatic cancer protein biomarkers using proteomic profiling are illustrated in Fig. 1. They include the investigation of differentially expressed proteins in pancreatic juice, pancreatic cancer tissue, and serum. Serum is an ideal diagnostic specimen in general, because it is easy and inexpensive to obtain. Pathological and cancerous changes in the body can be reflected as protein changes in the serum.
The most direct approach is to identify protein biomarkers in pancreatic cancer tissue. Biomarkers are presumably present at higher concentration in cancer tissue and therefore have a higher chance of being detected. However, tissue is not as clinically useful as blood or juice, because it is difficult to obtain tissue samples from the pancreas, which is in a remote location in the body. Nevertheless, candidate biomarkers identified through proteomic profiling of tissue can provide a strong basis for biomarker identification in serum for screening the general population, for which a more sensitive and specific assay will be required. While different profiling techniques are warranted for each of these approaches to achieve an optimal outcome, a complementary strategy using different approaches and sample sources will facilitate the discovery of candidate biomarkers for pancreatic cancer.
| CURRENT MS-BASED METHODS FOR PROTEOMIC PROFILING |
|---|
Recently developed MS-based quantitative proteomics methods use multi-dimensional chromatography for extensive separation of peptides and incorporate signature tags into proteins or peptides for quantitative analysis (32, 35, 43). In general, these quantitative approaches for protein profiling consist of the following major steps: 1) differentially labeling proteins/peptides from comparative samples with stable isotopes or chemical tags followed by proteolysis, 2) multi-dimensional chromatography separation, 3) ESI or MALDI MS/MS, and 4) computational analysis of the obtained spectra for the identification of peptide sequences and quantitative analysis. Relative quantification of each identified protein in the compared samples is accomplished by determining the abundance ratio from the signal intensities of the differentially labeled peptides with identical sequences. These newer approaches to quantitative proteomics significantly increase the analytical dynamic range and enhance the capability of detecting low-abundance proteins in a complex system. Quantitative proteomics has been applied to study cellular functions and pathways affected by perturbations and disease (44-49), to identify new components and changes in the composition of protein complexes and organelles (50-55), and has led to the detection of putative disease biomarkers (56).
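The relative quantification in step 4 can be sketched in a few lines of Python. This is a minimal illustration only: it assumes that signal intensities for each pair of differentially labeled peptides have already been extracted, and it combines peptide ratios into a protein ratio by a geometric mean, which is an assumed aggregation rule rather than one stated in the text. The function name and input layout are hypothetical.

```python
import math
from collections import defaultdict


def protein_abundance_ratios(peptide_pairs):
    """Compute per-protein abundance ratios (sample A / sample B).

    peptide_pairs: iterable of (protein_id, intensity_a, intensity_b), one
    entry per pair of differentially labeled peptides with identical sequence.
    Peptide ratios are combined by geometric mean (an assumption).
    """
    log_ratios = defaultdict(list)
    for protein_id, intensity_a, intensity_b in peptide_pairs:
        if intensity_a <= 0 or intensity_b <= 0:
            continue  # skip pairs in which one labeled form was not observed
        log_ratios[protein_id].append(math.log(intensity_a / intensity_b))
    return {
        protein_id: math.exp(sum(values) / len(values))
        for protein_id, values in log_ratios.items()
    }
```

Averaging in log space treats a 2-fold increase and a 2-fold decrease symmetrically, which a plain arithmetic mean of ratios would not.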
While they differ in sample preparation and other aspects, most MS-based quantitative proteomic methods share the use of stable isotope labeling or mass tagging to generate the mass signatures that identify the target sample and serve as the basis for accurate quantification. These methods include the use of chemical reactions to introduce an isotopic or chemical tag at specific functional groups on polypeptides (57-60), metabolic isotope labeling using heavy amino acids in cell culture (61, 62), and methods that introduce stable isotope tags via enzymatic reactions (63, 64). Among those approaches, the most commonly used and versatile method has been the ICAT technology (59, 65), in which the proteins in two samples (i.e. normal and disease) representing different proteomes are labeled separately using one of two chemically identical, but isotopically different, ICAT reagents. The labeled proteins from the different proteomes are then combined, digested, purified, separated by multi-dimensional chromatography, and analyzed by MS/MS. The incorporation of a biotin affinity tag into ICAT reagents enables the selective isolation and purification of cysteine-containing peptides, thus affording a substantial reduction in sample complexity. The presence of tagged cysteinyl residue(s) in the peptide adds a powerful additional constraint for database searching. Because over 90% of mammalian species contain proteins with cysteinyl residues, the ICAT strategy provides a widely applicable tool for comparative proteomics studies and significantly advances the technology for quantitative protein profiling, especially for human tissue samples that cannot be subjected to metabolic labeling.
Another approach to disease biomarker development is proteomic pattern analysis. This approach primarily uses the pattern of signals observed within a mass spectrum to identify peaks that differ in abundance between normal and disease samples and so distinguish the two groups (target and control). SELDI-TOF MS has been proposed in recent years for biomarker studies, in combination with an artificial intelligence algorithm, for the detection of discriminating signals (66). The technique uses MS to generate proteomic patterns in biological fluids, especially serum, and then applies pattern recognition algorithms to distinguish cancer patients from normal controls. In the first SELDI study, an artificial intelligence algorithm was used to identify a mass spectrometric pattern and to distinguish individuals with ovarian cancer from a second independent noncancer group with 95% specificity and 100% sensitivity (66). The study demonstrated that the technique was rapid, required minimal sample preparation, and was suitable for analyzing serum and other body fluids. Because the technique did not provide quantitative information or protein identification, which can at times be critical for biomarker identification, the approach of using patterns of mass peaks to diagnose disease without knowing the identities of the proteins has drawn some concern (67-69).
A comprehensive, quantitative proteomics analysis for biomarker discovery is challenging and involves several considerations. It must deal with an enormously complex background that may fluctuate with the population tested, it is limited by the sensitivity, dynamic range, and throughput of the available technology, and it is further limited by the specificity and availability of different sample types. In the study of pancreatic cancer, 2-DE-based and ICAT-based profiling technologies have so far been successfully applied to pancreatic tissue and body fluids such as pancreatic juice. These studies have led to the discovery of some valuable or promising biomarker candidates. SELDI-TOF MS has mainly been applied to the analysis of body fluids, such as serum and pancreatic juice, and has led to the discovery of candidate biomarkers (70). To handle complex biological and clinical samples effectively, integration of multiple platform technologies will be necessary. A complementary strategy involving multiple technologies and different sample types will be beneficial for the discovery of clinically useful pancreatic cancer biomarkers.
| PROTEIN PROFILING OF PANCREATIC TISSUE |
|---|
α-enolase, peroxiredoxin I, TM2, and S100A8 were specifically overexpressed in tumors compared with normal and pancreatitis tissues (13). In an effort to systematically study protein profiles in pancreatic cancer with the aim of identifying potential biomarkers, we used the ICAT technology to perform quantitative protein profiling of pancreatic cancer tissues and normal pancreas. Proteins extracted from pancreatic cancer tissue and matching normal pancreas were labeled with isotopically heavy and light ICAT reagents, respectively. The samples were then combined, digested with trypsin, purified, and analyzed by LC-ESI-MS/MS. The obtained CID spectra of peptides were searched against the National Cancer Institute human sequence database using SEQUEST for peptide/protein sequence identification and were statistically modeled for subsequent validation (38, 71, 72). Relative quantification was accomplished by comparing the signal intensities of peptides with identical sequence but different stable isotope signatures (73).
Using this ICAT strategy, 656 unique proteins were identified (protein probability scores ≥0.9) and quantified (Footnote 1). These proteins were classified into a variety of molecular function categories (Fig. 2). Close to one-third of the proteins had catalytic activity or were regulated enzymes, which is not surprising given the major digestive function of the pancreas. Other significant molecular function groups included binding proteins (19%), structural molecules (9%), transporter activity (5%), and obsolete molecular function (4%). The distribution of protein abundance ratios in pancreatic cancer for all 656 proteins was centered around a ratio of 0.8-1.0 (Fig. 3), with 77% of the protein ratios lying between 0.5 and 2.
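The 2-fold window used to summarize the ratio distribution can be expressed as a simple classification. The sketch below is a hedged illustration: the function names are hypothetical, the treatment of ratios falling exactly on the cutoff is an assumption, and the input is a plain mapping from protein name to cancer/normal abundance ratio.

```python
def classify_by_fold_change(ratios, fold=2.0):
    """Split {protein: cancer/normal ratio} into (up, down, unchanged) sets.

    A protein counts as changed when its ratio is >= fold or <= 1/fold;
    including the boundary values is an assumption made for this sketch.
    """
    up = {p for p, r in ratios.items() if r >= fold}
    down = {p for p, r in ratios.items() if r <= 1.0 / fold}
    unchanged = set(ratios) - up - down
    return up, down, unchanged


def fraction_unchanged(ratios, fold=2.0):
    """Fraction of proteins whose ratio lies strictly inside the fold window."""
    _, _, unchanged = classify_by_fold_change(ratios, fold)
    return len(unchanged) / len(ratios)
```

Applied to a full set of quantified proteins, `fraction_unchanged` gives the share of ratios between 0.5 and 2, the quantity reported as 77% above.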
There is currently a vast amount of information on differential gene expression in pancreatic cancer at the mRNA level, including RNA expression arrays, DNA microarrays, differential display experiments, and SAGE analysis of pancreatic cancer (14, 25, 74-83). When we compared our ICAT proteomic data with the published gene expression studies, over 30 proteins/genes had been detected as differentially regulated in pancreatic cancer by both the mRNA studies and the ICAT proteomic study. Significantly, 121 (80%) of the differentially regulated proteins identified by the ICAT strategy in our study have not been detected by previous mRNA studies. The significant improvement in methods to identify differentially regulated proteins in cancer samples provides new targets and a greater pool for the development of biomarkers for early diagnosis and therapy.
Twenty-three of the 40 differentially expressed proteins identified by 2-DE (13) were also identified in the ICAT analysis; however, only 13 of them showed differential expression of at least 2-fold in cancer samples in our study. Some proteins (or genes), for example, annexin A1, glyceraldehyde 3-phosphate dehydrogenase, cofilin, galectin-1, the S100A protein family, and 14-3-3 protein, have been repeatedly detected by various studies, including ours, to be differentially expressed in pancreatic cancer (10, 12, 15, 16, 18, 76, 82). Those proteins could be particularly important targets for future biomarker development.
| CANCER SAMPLE PREPARATION |
|---|
On the other hand, it may still be of considerable interest to analyze pure tumor cells, provided the above factors are considered. The use of enriched cancer cells may facilitate the discovery of very-low-abundance biomarkers. Several approaches have been applied to isolate pure cancer cells from pancreatic cancer tissue. Fixed tissue, as used in microdissection of paraffin blocks, might not be suitable for proteomics, because the proteins become cross-linked during fixation. Our strategy used CELLection Epithelial Enrichment (Dynal Biotech, Oslo, Norway) to isolate epithelial cells. CELLection Epithelial Enrichment uses uniform, mono-dispersed, super-paramagnetic polymer spheres (Dynabeads) coated with the monoclonal antibody BerEP4 (EpCAM Ab) against human epithelial cells. The beads bind to the target cells in a cellular slurry of whole pancreatic cancer. The bead:cell complexes are then separated from the sample with a magnetic particle concentrator. A DNA linker between the antibody and the bead surface provides a cleavable site for cell detachment. Using a similar approach, another study isolated pure epithelial cells from pancreatic tissue and demonstrated that they were suitable for protein and mRNA studies (87).
A second method for isolating cells uses primary pancreatic ductal culture. This methodology is particularly useful for samples that are small and not amenable to the heavy cell losses associated with making the pancreatic cell slurry used for Dynabead separation. In our experience, the primary cultures can be passaged up to five times before senescing. Rigorous analysis of the cell cultures should be undertaken to ensure that the cells are epithelial in origin: cells are examined by hematoxylin and eosin staining, electron microscopy, and organotypic culture with terminal differentiation, and are tested for epithelial markers. Short-term culturing, in effect, increases the amount of material available from small samples. However, even short-term culture over three passages may result in unavoidable changes in protein expression (unpublished data).
A third method for isolating pure cancer cells is laser capture microdissection (LCM). LCM has been used to microdissect epithelial cells from cancers, effectively providing enriched populations of target cells. In a study using LCM to enrich for both normal and malignant pancreatic ductal epithelial cells, investigators have managed to obtain enough material for 2-DE analysis; however, silver staining was required for detection and a number of technical challenges were encountered due to the limited number of cells captured from LCM (12). While it is effective in separating epithelial cells from other cell types, the LCM procedure requires a considerable commitment of time and is labor intensive.
Finally, an additional challenge in studying proteins from the pancreas is the abundance of proteases produced in this organ. To prevent protein degradation, protease inhibitors are essential during sample preparation.
| PROTEIN PROFILING OF PANCREATIC JUICE |
|---|
The SELDI-TOF approach has been applied to identify novel protein peaks in pancreatic juice that distinguish samples from cancer patients from those of normal controls (94). Using this methodology, Goggins and colleagues identified one peak present in the pancreatic juice of 10 of 15 patients with pancreatic cancer. This peak was further identified as hepatocarcinoma-intestine-pancreas/pancreatitis-associated protein (HIP/PAP) by a specialized protein chip and immunoassay. Because HIP/PAP is also present in pancreatitis and hepatoma cell lines, its limited sensitivity and specificity would preclude its use as a biomarker in the general population.
We have used the ICAT method to comprehensively study the proteome of pancreatic juice and quantitatively identify proteins that are differentially expressed in pancreatic cancer (Footnote 1). Three sets of comparison studies were performed: cancer to normal, pancreatitis to normal, and normal to normal. Because pancreatitis patients carry the greatest risk of false-positive biomarker findings, we included juice from pancreatitis patients in our assessment to exclude any nonspecific crossover proteins. In the cancer to normal comparison, we identified and quantified a total of 78 proteins in pancreatic juice, of which 30 displayed a greater than 2-fold expression difference in cancer. These included 24 proteins that were up-regulated and six that were down-regulated in the cancer sample. The pancreatitis to pooled normal comparison identified 73 proteins, of which 27 showed a greater than 2-fold expression difference in pancreatitis patients. Finally, the normal to normal comparison identified 61 proteins. Altogether, in the three studies we identified a total of 136 unique proteins from pancreatic juice with a false-positive rate of ≤0.9% (11), including the samples from normal, pancreatitis, and cancer. Of these 136 proteins, 20 were found in all three samples; 40 proteins identified in the pancreatitis samples were also identified in the cancer samples; and 34 proteins identified in the normal samples were also found in the cancer sample (Fig. 4). From the 30 proteins displaying a greater than 2-fold expression difference in cancer, we subtracted the proteins that also showed differential expression in pancreatitis and normal juice. Ultimately, we identified 21 proteins that were differentially expressed only in cancer juice.
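The subtraction step described above amounts to a set difference. The following Python sketch illustrates the idea with placeholder protein names; the names, the function names, and the example sets are all hypothetical and do not reproduce the actual protein lists from the study.

```python
def cancer_specific(changed_in_cancer, changed_in_pancreatitis, changed_in_normal):
    """Proteins differentially expressed in cancer juice but in neither control comparison."""
    return (set(changed_in_cancer)
            - set(changed_in_pancreatitis)
            - set(changed_in_normal))


def shared_by_all(*protein_sets):
    """Proteins identified in every one of the given samples."""
    return set.intersection(*(set(s) for s in protein_sets))


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    cancer = {"prot_A", "prot_B", "prot_C"}
    pancreatitis = {"prot_B", "prot_D"}
    normal = {"prot_C"}
    print(sorted(cancer_specific(cancer, pancreatitis, normal)))
```

Filtering against the pancreatitis comparison in particular removes candidates that reflect inflammation rather than malignancy.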
| PROTEIN PROFILING OF SERUM |
|---|
α2-macroglobulin, transferrin, and immunoglobulins, may represent as much as 80% of the total serum protein, while many other proteins are present in trace amounts. For example, important protein classes such as cytokines are present at the pg/ml level, while serum albumin is present at 35-55 mg/ml (95). The large quantity of these abundant proteins makes it difficult to identify low-abundance proteins in serum using traditional 2-DE. Multi-dimensional chromatography and sample prefractionation prior to LC-MS/MS have been used to reduce complexity and increase capacity for protein identification. However, the techniques remain problematic, mainly because of the great dynamic range in protein concentration. Recently, one study used ultra-high-efficiency strong cation exchange LC/reversed-phase LC/MS/MS separations to extend the dynamic range and coverage of human plasma proteins (96). The authors demonstrated that a protein identification dynamic range of >10^8 can be achieved using conventional ion trap MS/MS instrumentation. This approach resulted in the identification of >800 human plasma proteins from 365 µg of plasma without the need to deplete high-abundance serum albumin or immunoglobulins. Even with this extensive effort, only a fraction of the total serum proteins have been identified. Given the challenges of direct, higher-throughput proteomic analysis of serum, several alternative approaches have been developed to reduce the abundant proteins or the complexity of the sample. Different methods are now available to deplete albumin and other high-abundance proteins from serum before 2-DE analysis, allowing the visualization of low-abundance proteins (97-101). One study used Affi-Gel Blue to treat serum samples before 2-DE analysis, which resulted in the detection of 28 new spots and enhanced the staining intensity of other spots severalfold (97). While depletion of high-abundance proteins can increase the detection of low-abundance proteins, this approach is potentially problematic for quantitative measurements because of the variable and selective loss of other proteins along with the immunoglobulins and albumin. Careful experimental design and controls are needed to address these problems.
A third approach to reduce the complexity of serum proteins focuses on the in-depth analysis of sub-proteomes of rich biological context, thus minimizing the repeated analyses of abundantly expressed proteins. Recently, two new methods for the identification of N-linked glycopeptides in complex biological samples have been reported (102, 103). In general, these methods immobilize glycopeptides on a solid support, then the N-linked glycopeptides are released and collected for mass spectrometric analysis. By combining glycopeptide enrichment and stable isotope labeling, the method has been successfully applied to profiling glycoproteins in serum (103, 104).
Finally, the SELDI approach has been applied to profile serum in an attempt to identify protein peaks that can distinguish healthy from diseased samples. A recent study from the Goggins group (70) analyzed serum samples from patients with and without pancreatic cancer using SELDI and identified the two most discriminating protein peaks, which could differentiate patients with pancreatic cancer from healthy controls with a sensitivity of 78% and a specificity of 97%.
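For reference, sensitivity and specificity figures of this kind are derived from the counts of correctly and incorrectly classified samples. The short Python sketch below shows the calculation; the function name is hypothetical, and any counts passed to it in an example are made-up values, not the cohort sizes of the cited study.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = fraction of diseased samples classified as diseased
    specificity = fraction of healthy samples classified as healthy
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity
```

Both values depend on the composition of the study cohort, so figures obtained in a case-control setting do not translate directly to screening performance in the general population.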
| VALIDATION AND DEVELOPMENT OF BIOMARKER |
|---|
Immunohistochemical analysis using tissue arrays is a widely accepted validation method for candidate biomarkers identified through tissue proteomic profiling. Tissue array technology accelerates studies by correlating protein expression with clinicopathologic information in large numbers of samples (105). Up to several hundred tissue sections can be processed on one slide for subsequent use. In our study of quantitative protein profiling of pancreatic cancer tissue, overexpression of annexin A2 was detected in both cancer samples by ICAT analysis. In the immunohistochemical analysis of a pancreas tissue array, annexin A2 expression was negative or mild in all acinar cells, ductal cells, and islet cells of normal pancreas (12/12). However, 93% of the cancer samples (118/127) showed strong expression of annexin A2 in the cancer cells, while the majority of adjacent acinar cells, islet cells, and normal ductal cells from the same specimens showed mild or negative expression. Overexpression of annexin A2 has previously been observed in human pancreatic cancer and pancreatic cancer-derived cell lines by mRNA measurement and immunohistochemical analysis (106, 107). However, previous efforts were limited to fewer than 10 samples, and no follow-up study has been performed. Interestingly, overexpression of annexin II (gene) has not been detected by recent mRNA profiling studies. We used tissue array analysis of over 100 pancreatic cancer samples to validate the overexpression of annexin A2 discovered by proteomic profiling. Our data show overexpression in 93% of 127 pancreatic ductal adenocarcinoma samples, suggesting that annexin A2 is a good candidate for future biomarker development.
ELISA is a well-established method for detecting serum proteins and is useful for moderate numbers of serum samples. An ELISA requires optimization prior to use and requires two antibodies for each protein being assayed. The two antibodies must be directed toward separate epitopes of the protein. This, in turn, makes an ELISA more difficult to construct for biomarker quantification. Alternatively, proteins in body fluid can be measured with a protein array, an emerging methodology. These arrays are constructed using antibodies that specifically recognize and capture proteins of interest. Antibody capture arrays can quantify sub-picomolar amounts of protein within body fluids or tissues of interest and can assay multiple targets simultaneously. However, this method, again, relies heavily on the availability of suitable antibodies.
Recently, a high-throughput proteome-screening technology (108) has been developed that provides an alternative route to quantitative proteomics, specifically targeting proteins of biological significance, such as biomarkers and prognostic proteins, for identification and quantification. The highly selective platform consists of an LC/spotting system for peptide array preparation and a MALDI TOF/TOF tandem mass spectrometer, and it uses synthesized stable isotope-labeled peptides as representative signatures for specific protein identification and quantification. The approach minimizes the interference of unwanted background proteins in a complex sample, thereby significantly improving throughput and the confidence of protein identification. Because the method is MS-based and focuses directly on selected peptides/proteins for identification and absolute quantification, it is applicable to profiling multiple proteins and is well suited as a complementary tool for biomarker detection. This technology will be especially effective for detecting a panel of candidate biomarkers in large numbers of clinical samples.
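Absolute quantification with a synthesized stable isotope-labeled peptide generally rests on the isotope-dilution principle: a known amount of the labeled peptide is added to the sample, and the amount of the native peptide is inferred from the ratio of the two signals. The Python sketch below illustrates that principle only. It assumes the native and labeled peptides respond identically in the mass spectrometer, and it is not claimed to be the exact calculation used by the platform in (108); the function name and units are hypothetical.

```python
def absolute_amount(native_intensity: float,
                    labeled_intensity: float,
                    spiked_amount_fmol: float) -> float:
    """Estimate the amount of native peptide (fmol) by isotope dilution.

    Assumes equal instrument response for the native peptide and its
    stable isotope-labeled counterpart spiked in at a known amount.
    """
    if labeled_intensity <= 0:
        raise ValueError("labeled standard was not detected")
    return spiked_amount_fmol * native_intensity / labeled_intensity
```

For instance, a native signal three times as intense as that of a 50 fmol labeled standard corresponds to an estimated 150 fmol of the native peptide under these assumptions.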
The approaches for biomarker discovery in pancreatic cancer are summarized in Fig. 1. In the discovery phase, protein profiling can be used to analyze pancreatic tissue, pancreatic juice, and serum to identify candidate biomarkers. Candidate biomarkers can then be validated by immunohistochemistry in large cohorts of patient samples, such as tissue arrays, and subsequently developed into diagnostic biomarkers in serum or pancreatic juice using ELISA, protein arrays, and/or high-throughput proteome-screening technology.
| CONCLUSION |
|---|
| FOOTNOTES |
|---|
Published, MCP Papers in Press, January 31, 2005, DOI 10.1074/mcp.R500004-MCP200
1 R. Chen, E. C. Yi, S. Donohoe, S. Pan, D. A. Crispin, Z. Lane, D. R. Goodlett, M. P. Bronner, R. Aebersold, and T. A. Brentnall, manuscript in preparation.
2 The abbreviations used are: 2-DE, 2-dimensional electrophoresis; LCM, laser capture microdissection; HIP/PAP, hepatocarcinoma-intestine-pancreas/pancreatitis-associated protein.
* This work was funded in part by the Gene and Mary Ann Walters Pancreatic Cancer Foundation and with Federal funds from the National Heart, Lung and Blood Institute, National Institutes of Health, under contract no. NOI-HV-28179. R. C. has been supported by GI training grant (NIH 5 T32 DK07742 08).
¶ To whom correspondence should be addressed: Department of Medicine, Division of Gastroenterology, University of Washington, Seattle, WA 98195. Tel.: 206-543-3280; Fax: 206-685-9478; E-mail: teribr{at}u.washington.edu
| REFERENCES |
|---|
/NF-κB signal transduction pathway. Nat. Cell Biol. 6, 97-105