Survival Prediction for Pancreatic Cancer Patients Receiving Gemcitabine Treatment

Although gemcitabine monotherapy is the standard treatment for advanced pancreatic cancer, patient outcome varies significantly, and a considerable number do not benefit adequately. We therefore searched for new biomarkers predictive of overall patient survival. Using LC-MS, we compared the base-line plasma proteome between 29 representative patients with advanced pancreatic cancer who died within 100 days and 31 patients who survived for more than 400 days after receiving at least two cycles of the same gemcitabine monotherapy. Identified biomarker candidates were then challenged in a larger cohort of 304 patients treated with the same protocol using reverse-phase protein microarray. Among a total of 45,277 peptide peaks, we identified 637 peaks whose intensities differed significantly between the two groups (p < 0.001, Welch's t test). Two MS peaks with the highest statistical significance (p = 2.6 × 10−4 and p = 5.0 × 10−4) were revealed to be derived from α1-antitrypsin and α1-antichymotrypsin, respectively. The levels of α1-antitrypsin (p = 8.9 × 10−8) and α1-antichymotrypsin (p = 0.001) were significantly correlated with the overall survival of the 304 patients. We selected α1-antitrypsin (p = 0.0001), leukocyte count (p = 0.066), alkaline phosphatase (p = 8.3 × 10−8), and performance status (p = 0.003) using multivariate Cox regression analysis and constructed a scoring system (nomogram) that was able to identify a group of high risk patients having a short median survival time of 150 days (95% confidence interval, 123–187 days; p = 2.0 × 10−15, log rank test). The accuracy of this model for prognostication was internally validated and showed good calibration and discrimination with a bootstrap-corrected concordance index of 0.672. In conclusion, an increased level of α1-antitrypsin is a biomarker that predicts short overall survival of patients with advanced pancreatic cancer receiving gemcitabine monotherapy. 
Although an external validation study will be necessary, the current model may be useful for identifying patients unsuitable for the standardized therapy.

Invasive ductal adenocarcinoma of the pancreas is one of the most aggressive and lethal malignancies (1). It is the fifth leading cause of cancer-related death in Japan and the fourth in the United States, accounting for an estimated >23,000 deaths per year in Japan and >33,000 in the United States (2,3). Because the majority of patients have distant metastases even at their first presentation (4,5), the main therapeutic modality for pancreatic cancer is systemic chemotherapy, and gemcitabine monotherapy is the current standard (6). Gemcitabine treatment has significantly improved the median survival time of patients with advanced pancreatic cancer (7). However, the outcome of the treatment varies significantly among individuals, and a considerable portion of patients do not appear to benefit significantly from it. It therefore seems necessary to assess the efficacy and adverse effects of the drug before administration and tailor the treatment accordingly for each person.
We previously identified a predictive biomarker for hematologic toxicity, which is one of the most frequent and potentially life-threatening adverse effects associated with gemcitabine monotherapy (8). As a next step, we performed a large scale proteome analysis in this study to identify biomarkers predictive of patient survival after gemcitabine monotherapy. Several factors and their combinations have been reported to correlate significantly with outcome in patients with advanced pancreatic cancer receiving gemcitabine, such as performance status, metastases, serum albumin, alkaline phosphatase, and peripheral leukocyte count (9-11). However, the accuracy of survival prediction based on these conventional prognostic factors seems unsatisfactory (9).
In recent years, there has been considerable interest in applying advanced proteomics technologies to the discovery of predictive biomarkers (12,13). We and others have successfully applied MALDI MS-based protein profiling techniques for predicting the efficacy of chemoradiotherapy and molecular targeting therapy (14,15). Two-dimensional image converted analysis of liquid chromatography and mass spectrometry (2DICAL) is a new LC-MS-based proteomics platform that was developed in our laboratory (16). 2DICAL can quantify protein content accurately across a theoretically unlimited number of samples without isotope labeling and thus has considerable advantages over conventional LC-MS-based methods for clinical studies (17). The predictive biomarker protein for hematologic toxicity described above was identified using 2DICAL (8).
It has been generally accepted that tumor responses do not always correlate with the outcome of patients (10,18,19). The rates of complete and partial responses (Response Evaluation Criteria in Solid Tumors guideline) to gemcitabine monotherapy are limited to ~10% (20-22), and the majority of pancreatic cancers do not show significant tumor regression. Given that the ultimate goal of gemcitabine therapy for pancreatic cancer is to achieve prolonged survival, it would be desirable to stratify patients according to survival rather than tumor response (9). In the present study, using 2DICAL, we compared the base-line plasma proteome of two extreme populations of patients who had shown distinct clinical courses after identical gemcitabine treatment.
The abbreviations used are: 2DICAL, two-dimensional image converted analysis of liquid chromatography and mass spectrometry; AIC, Akaike's information criterion; CC, correlation coefficient; CI, confidence interval; CV, coefficient of variance; ECOG, Eastern Cooperative Oncology Group; NCC, National Cancer Center; ID, identification; FDR, false discovery rate.

EXPERIMENTAL PROCEDURES
Patients-Samples were collected from a total of 304 patients who had all been included in our previous study (8). All patients had metastatic (stage IVb; n = 285) or locally advanced (stage IVa; n = 19) (23) histologically or cytologically proven pancreatic ductal adenocarcinoma and had received at least two cycles of gemcitabine monotherapy (1,000 mg/m² intravenously over 30 min on days 1, 8, and 15 of a 28-day cycle). Two hundred eighty-one patients (92%) received gemcitabine as a first line therapy (supplemental Table S1).
Two hundred sixty-two patients (86%) were treated consecutively at the National Cancer Center (NCC) Hospital (Tokyo, Japan) between September 2002 and June 2007, and 42 patients (14%) were treated consecutively at the NCC Hospital East (Kashiwa, Japan) between September 2002 and July 2004. Survival times were determined as of May 2008. During this period, 248 patients (82%) died, and 56 patients (18%) were censored. Tumor response was evaluated after the first two cycles of gemcitabine using the Response Evaluation Criteria in Solid Tumors guideline.
Sample Preparation-Blood was collected before the first administration of gemcitabine. Plasma or serum was separated by centrifugation at 1,050 × g for 10 min at 4°C and frozen until analysis as reported previously (8,24). Macroscopically hemolyzed samples were excluded from the current analysis. Two hundred fifty-two plasma samples (83%) were collected from the NCC Hospital and Hospital East, and 52 serum samples (17%) were collected from the NCC Hospital. Written informed consent was obtained from all patients before blood sampling. The protocol of this retrospective study was reviewed and approved by the institutional ethics committee boards of the NCC (Tokyo, Japan) and the National Institute of Health Sciences (Tokyo, Japan).
LC-MS-Samples were blinded, randomized, and passed through an IgY-12 High Capacity Spin Column (Beckman Coulter, Fullerton, CA) in accordance with the manufacturer's instructions. The flow-through portion was digested with sequencing grade modified trypsin (Promega, Madison, WI) and analyzed in triplicate using a nanoflow high performance LC system (NanoFrontier nLC, Hitachi High Technologies, Tokyo, Japan) connected to an electrospray ionization quadrupole time-of-flight mass spectrometer (Q-Tof Ultima, Waters, Milford, MA). LC-MS run order was also randomized to eliminate any potential bias.
MS peaks were detected, normalized, and quantified using the in-house 2DICAL software package as described previously (16). A serial identification (ID) number was applied to each of the MS peaks detected (1 to 45,277). The stability of LC-MS was monitored by calculating the correlation coefficient (CC) and coefficient of variance (CV) of every triplicate measurement. The mean CC and CV ± S.D. for all 45,277 peaks observed in the 60 triplicate runs were as high as 0.970 ± 0.022 and as low as 0.056 ± 0.017, respectively.
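As an illustration of this kind of stability check, the following minimal Python sketch computes a per-peak CV and a run-to-run Pearson CC for triplicate measurements. It is not the 2DICAL implementation: the intensity values, the function names, and the choice to average the three pairwise run-to-run correlations are assumptions made only for this example.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(replicates):
    """Sample standard deviation divided by the mean for one peak's replicates."""
    return stdev(replicates) / mean(replicates)


def pearson_cc(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical intensities: one row per MS peak, one column per replicate run.
peaks = [
    [100.0, 104.0, 98.0],
    [250.0, 240.0, 262.0],
    [40.0, 43.0, 41.0],
    [900.0, 880.0, 915.0],
]

# CV is computed within each peak across its three replicates.
cvs = [coefficient_of_variation(p) for p in peaks]

# CC is computed between pairs of runs across all peaks.
runs = list(zip(*peaks))
ccs = [pearson_cc(runs[i], runs[j]) for i, j in [(0, 1), (0, 2), (1, 2)]]

mean_cv = mean(cvs)
mean_cc = mean(ccs)
print(f"mean CV = {mean_cv:.3f}, mean CC = {mean_cc:.4f}")
```

A high mean CC together with a low mean CV indicates that replicate injections of the same sample give closely matching peak intensities.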
Protein Identification by MS/MS-Peak lists were generated using the Mass Navigator software package (version 1.2) (Mitsui Knowledge Industry, Tokyo, Japan) and searched against the NCBInr database (downloaded on May 20, 2008) using the Mascot software package (version 2.2.1) (Matrix Science, London, UK). The search parameters used were as follows. A database of human proteins was selected. Trypsin was designated as the enzyme, and up to one missed cleavage was allowed. Mass tolerances for precursor and fragment ions were ±2.0 and ±0.8 Da, respectively. The score threshold was set to p < 0.05 based on the size of the database used in the search. If a peptide matched to multiple proteins, the protein name with the highest Mascot score was selected.
Western Blot Analysis-Primary antibodies used were rabbit polyclonal antibody against human α1-antitrypsin (Dako, Glostrup, Denmark), rabbit polyclonal antibody against human α1-antichymotrypsin (Dako), and mouse monoclonal antibody against human complement C3b-α (Progen, Heidelberg, Germany). Ten microliters of partitioned sample was separated by SDS-PAGE and electroblotted onto a polyvinylidene difluoride membrane. The membrane was then incubated with the primary antibody and subsequently with the relevant horseradish peroxidase-conjugated anti-rabbit or anti-mouse IgG as described previously (25,26). Blots were developed using an ECL detection system (GE Healthcare).
FIG. 1 (caption fragment). The 637 MS peaks whose mean intensity differed significantly between patients with short term and long term survival (p < 0.001, Welch's t test) are highlighted in red. B, two MS peaks with the smallest p values (upper, p = 2.57 × 10−4; bottom, p = 5.03 × 10−4) in representative patients with short term (left) and long term (right) survival. RT, retention time.
The stained slides were scanned on a microarray scanner (InnoScan® 700AL, Innopsys, Carbonne, France). Fluorescence intensity, calculated as the mean net value of quadruplicate samples, was measured using the Mapix® software package (Innopsys). All intensity values were log-transformed.
The reproducibility of the reverse-phase protein microarray assay was assessed by repeating the same experiment. A plasma sample was serially diluted within a range of 1,024-16,384-fold. Each diluted sample was spotted in quadruplicate onto glass slides and blotted with anti-α1-antitrypsin antibody. In a representative quality control experiment, the CC value was 0.977 between days, and the median CV was 0.026 among quadruplicate samples.
Statistical Analysis-Overall survival time was defined as the period from the date of starting gemcitabine monotherapy until the date of death from any cause or until the date of the last follow-up, at which point the data were censored. We used the Kaplan-Meier method to plot overall survival curves. Statistical significance of intergroup differences was assessed with Welch's t test, Wilcoxon test, χ² test, or log rank test as appropriate. Maximally selected statistics (27), based on the fit (log likelihood) of a univariate Cox model, were used to determine the level (optimal cutoff point) of each factor that best segregated patients in terms of survival.
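To make the definition of overall survival with censoring concrete, the following minimal Python sketch implements the Kaplan-Meier product-limit estimator and reads off a median survival time. The follow-up times, the event flags, and the function names are hypothetical and are not data or software from this study.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at each death time.

    times  : follow-up in days from the start of treatment
    events : 1 if death was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or lower, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical cohort of 10 patients; two are censored (event flag 0).
times = [60, 90, 150, 150, 210, 240, 300, 400, 420, 500]
events = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]

curve = kaplan_meier(times, events)
median = median_survival(curve)
print(median)  # 240
```

Censored patients contribute to the number at risk up to their last follow-up but never lower the survival estimate themselves, which is why the curve steps down only at observed deaths.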
Multivariate regression analysis was performed using ordinal Cox regression modeling. Factors included in the prediction model were selected with a forward stepwise selection procedure using Akaike's information criterion (AIC), and the result was confirmed using a backward stepwise procedure. The significance of differences between models with and without α1-antitrypsin was assessed with the likelihood ratio test. The survival prediction model was internally validated by measuring both discrimination and calibration (28). Discrimination was evaluated using the concordance index, which is similar in concept to the area under the receiver operating characteristic curve. Calibration was evaluated with a calibration curve whereby patients are categorized by predicted survival and then

RESULTS
The median survival estimate for the present study was 236 days (95% CI, 216-254 days), which is comparable to those of previous large scale studies (10,22). To identify a prognostic factor in patients with advanced pancreatic cancer, we compared the base-line plasma proteome between 29 patients showing short term survival (<100 days) and 31 patients showing long term survival (>400 days) using 2DICAL. There was no significant difference in age, sex, body surface area, prior therapy, clinical stage, or gemcitabine pharmacokinetics (24) (Table I) between the two groups, but the patients with short term survival had significantly poorer base-line conditions such as liver function and Eastern Cooperative Oncology Group (ECOG) performance status than those with long term survival (Table I).
Among a total of 45,277 independent MS peaks detected within the range 250-1,600 m/z and within the time range of 20-70 min, we found that the mean intensity of triplicates differed significantly for 637 peaks (p < 0.001, Welch's t test). Fig. 1A is a representative two-dimensional view of all the MS peaks displayed with m/z along the x axis and the retention time of LC along the y axis. The 637 MS peaks whose expression differed significantly between patients with short term and long term survival are highlighted in red.
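For readers unfamiliar with Welch's t test, the following minimal Python sketch computes the test statistic and the Welch-Satterthwaite degrees of freedom for one peak. The intensity values and the function name are hypothetical. The p value is omitted because it requires the t distribution, which is not in the Python standard library; a routine such as SciPy's `ttest_ind` with `equal_var=False` could supply it.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t test, the two group variances are not pooled,
    so the groups may have unequal variances and unequal sizes.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical log intensities of one MS peak (mean of triplicates per patient).
short_term = [5.1, 5.6, 4.9, 5.8, 5.3]
long_term = [4.2, 4.6, 4.0, 4.4, 4.5, 4.1]

t_stat, dof = welch_t(short_term, long_term)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```

Applying such a test independently to tens of thousands of peaks makes a multiple-testing correction necessary.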
MS peaks that were increased in patients with short term survival with the highest statistical significance (p = 2.57 × 10−4) (Fig. 1B) matched the amino acid sequences of the α1-antitrypsin (AAT) gene product (supplemental Fig. S1A). The MS peak with the second highest statistical significance (p = 5.03 × 10−4) was revealed to be derived from the α1-antichymotrypsin (AACT) gene product (supplemental Fig. S1B). We calculated the false discovery rate (FDR) (29) and confirmed the significance of these MS peaks (FDR = 0.0327 for α1-antitrypsin and FDR = 0.0428 for α1-antichymotrypsin). Fig. 2A shows the distribution of the two peaks (ID 1740 (at 508 m/z and 48.9 min; α1-antitrypsin) and ID 11165 (at 713 m/z and 41.5 min; α1-antichymotrypsin)) in patients with short term (red) and long term survival (blue). The differential expression and identification of α1-antitrypsin and α1-antichymotrypsin were confirmed by denaturing SDS-PAGE and immunoblotting (Fig. 2B).
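The text does not state which FDR estimator was used (only that it follows reference 29). As one common possibility, and purely as an assumption for illustration, the following Python sketch applies the Benjamini-Hochberg step-up adjustment to a small list of hypothetical p values. It is not expected to reproduce the FDR values reported above.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order.

    For the p value of rank k among m tests, the raw adjustment is p * m / k;
    a running minimum taken from the largest rank downward keeps the adjusted
    values monotone in the original p values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p values for eight peaks (not taken from this study).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
qvals = benjamini_hochberg(pvals)
print([round(q, 4) for q in qvals])
```

Because of the running minimum, a peak can receive an adjusted value well below p × m when many other peaks have only slightly larger p values.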
Correlation of α1-Antitrypsin and α1-Antichymotrypsin with Overall Survival-The relative levels of α1-antitrypsin and α1-antichymotrypsin in plasma or serum samples obtained from 304 patients with advanced pancreatic cancer prior to gemcitabine treatment (including 60 patients used in 2DICAL) were measured using reverse-phase protein microarrays (Fig. 3). Quadruplicate spots for representative patients with high and low levels of α1-antitrypsin are shown in Fig. 3. There were no differences between plasma (n = 252) and serum (n = 52) with regard to the levels of α1-antitrypsin and α1-antichymotrypsin. Although the levels of α1-antitrypsin and α1-antichymotrypsin were not mutually correlated (Pearson's r = 0.274), either level showed a significant correlation with overall survival (Table II). When the optimal cutoff value was determined by maximally selected statistics, the median survival time of patients with high levels of α1-antitrypsin (>2.09 arbitrary units) was significantly shorter than that of patients with low levels (≤2.09) (201 days (95% CI, 176-219 days) versus 327 days (95% CI, 271-439 days), log rank p = 2.26 × 10−9; Fig. 4A). Similarly, the median survival time was significantly shorter in patients with α1-antichymotrypsin levels of >4.41 (211 days (95% CI, 193-235 days)) than in those with levels of ≤4.41 (327 days (95% CI, 255-416 days)) (p = 2.02 × 10−4; Fig. 4B). Even when the 60 patients used for 2DICAL were excluded, the differences in survival separated by α1-antitrypsin and α1-antichymotrypsin levels were still significant (supplemental Fig. S2, A and B). However, the level of either α1-antitrypsin or α1-antichymotrypsin was not associated with tumor response (Spearman's ρ = 0.090 and ρ = 0.017, respectively). The increased level of α1-antitrypsin in 58 patients who subsequently developed progressive diseases was statistically significant (p = 0.020; supplemental Fig. S3) but quite modest, confirming that it is not a predictive biomarker of tumor response.
FIG. 3. Left, representative reverse-phase protein microarray slide stained with anti-α1-antitrypsin antibody. Right, samples were randomly assigned, and quadruplicate spots of representative patients with high and low levels of α1-antitrypsin were extracted.
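The survival comparisons between high and low marker groups rely on the log rank test. The following minimal Python sketch shows the two-group version of that test, using the chi-square approximation with 1 degree of freedom. The survival times, the group labels, and the function name are hypothetical and are unrelated to the cohort analyzed here.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log rank test; returns (chi-square statistic, p value).

    At each distinct death time, the observed deaths in group A are compared
    with the number expected if both groups shared the same hazard.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    death_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in death_times:
        n = sum(1 for u, _, _ in data if u >= t)                  # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)     # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)            # deaths, both groups
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Hypothetical survival times in days; event flag 0 marks a censored patient.
high_times, high_events = [50, 80, 110, 150, 200, 230], [1, 1, 1, 1, 1, 1]
low_times, low_events = [260, 300, 350, 420, 500, 600], [1, 1, 0, 1, 1, 0]

chi2, p = logrank_test(high_times, high_events, low_times, low_events)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

The statistic does not depend on which group is labeled A, and it is zero when the two groups have identical survival experience.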

Construction and Validation of Model Predicting Overall Survival Time-Univariate Cox regression analysis revealed that ECOG performance status and laboratory values including leukocyte count, albumin, aspartate aminotransferase, alanine aminotransferase, alkaline phosphatase, α1-antitrypsin, and α1-antichymotrypsin were correlated with overall survival of the 304 patients (p < 0.05; Table II). Because none of the parameters were able to predict survival outcome satisfactorily when used individually (data not shown), we attempted to construct a multivariate predictive model for estimation of overall survival. We searched for parameters using a forward stepwise selection procedure by AIC from all the clinical and laboratory data listed in Table II (available for all 304 cases) and found that a combination of α1-antitrypsin, alkaline phosphatase, leukocyte count, and ECOG performance status provided the lowest AIC value. We also searched for parameters using a backward elimination algorithm and found that this identified the same combination of factors as that selected by a forward stepwise procedure. The base-line α1-antitrypsin level was the second most significant contributor to the model (Table II). The prediction model using this combination of parameters was significantly compromised when the level of α1-antitrypsin was excluded (Δχ² = 14.12, df = 1, p = 0.0002, likelihood ratio test).
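The reported likelihood ratio result can be checked numerically: for 1 degree of freedom, the upper-tail probability of a chi-square statistic x equals erfc(√(x/2)). The following minimal Python sketch applies this to the reported Δχ² of 14.12. The two log-likelihood values are hypothetical placeholders chosen only so that twice their difference equals 14.12; they are not values from the fitted Cox models.

```python
import math


def likelihood_ratio_statistic(loglik_reduced, loglik_full):
    """Twice the gain in log likelihood when the extra covariate is added."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_upper_tail_df1(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical log likelihoods of the models without and with alpha1-antitrypsin.
loglik_without = -1180.00
loglik_with = -1172.94

delta_chi2 = likelihood_ratio_statistic(loglik_without, loglik_with)
p_value = chi2_upper_tail_df1(14.12)  # statistic as reported in the text
print(f"delta chi-square = {delta_chi2:.2f}, p = {p_value:.4f}")  # p = 0.0002
```

The computed tail probability is about 1.7 × 10−4, which rounds to the reported p = 0.0002.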
Based on the results of multivariate Cox regression analysis, we constructed a scoring system (nomogram) in which the values of the four parameters (α1-antitrypsin, alkaline phosphatase, leukocyte count, and ECOG performance status) were integrated into a single score (total point) to estimate the survival outcome (Fig. 5A). The accuracy of the nomogram for prognostication was internally validated. The bootstrap-corrected concordance index was 0.672, and the calibration curve demonstrated good agreement between the predicted and observed outcomes (Fig. 5B). It was possible to identify high risk patients by calculating the total points using the nomogram. The median survival time was 150 days (95% CI, 123-187 days) for patients with a total point score of >94 (n = 98) and 282 days (95% CI, 255-328 days) for patients with a score of ≤94 (n = 206), and the difference was significant (p = 2.00 × 10−15, log rank test; Fig. 5C). Even when the 60 patients used for 2DICAL analyses were excluded from the total points calculation, the difference was still significant (p = 5.23 × 10−10; supplemental Fig. S2C). The median survival time was 171 days (95% CI, 147-205 days) for patients with a score of >92 (n = 83) and 270 days (95% CI, 243-299 days) for patients with a score of ≤92 (n = 161). The cutoff value that optimally segregated patients into subgroups with a poor and good prognosis was determined by using the maximally selected statistics.

DISCUSSION
Currently, no diagnostic tool has been established for stratifying patients with advanced pancreatic cancer according to their likelihood of obtaining a survival benefit from gemcitabine treatment. Because some high risk patients may achieve prolonged survival through modification (or even withdrawal) of therapeutic protocols, a diagnostic method that can accurately identify such patients is necessary.
We first compared the plasma proteome of two groups of patients who showed distinct clinical courses after receiving the same gemcitabine protocol (Fig. 1) and found that individuals who showed poor clinical courses had shown high base-line levels of plasma α1-antitrypsin and α1-antichymotrypsin (Figs. 1B and 2A). α1-Antitrypsin is an abundant plasma protein that usually cannot be measured by MS. However, antibody-based protein depletion (30) allowed us to accentuate the differences in α1-antitrypsin levels.
The results obtained by 2DICAL were then validated in a 5-fold larger cohort using a different methodology: high density reverse-phase protein microarray (Figs. 3 and 4 and Table II). Reverse-phase protein microarray is an emerging proteomics technology capable of validating new biomarkers because of its overwhelmingly high throughput (31,32). Furthermore, reverse-phase protein microarrays require significantly smaller amounts of clinical samples for quantification than established clinical tests, such as ELISA. The prognostic significance of α1-antitrypsin was further supported by multivariate survival analysis with stepwise covariate selection. The level of α1-antitrypsin was selected as the second most significant factor following alkaline phosphatase (Table II), but α1-antichymotrypsin was not selected. To derive clinical applicability from the above findings, we constructed a model (nomogram) including α1-antitrypsin to estimate the survival period of pancreatic cancer patients (Fig. 5A), and its significance was internally validated (Fig. 5B). One previous study has demonstrated a correlation between an increased serum level of α1-antitrypsin and short survival in patients treated surgically for pancreatic cancer (33). Although the number of cases examined was small (n = 44), the results support our present findings. α1-Antitrypsin and α1-antichymotrypsin are members of the serine protease inhibitor (serpin) superfamily that plays key roles in the regulation of inflammatory cascades (34,35). α1-Antitrypsin and α1-antichymotrypsin interact mainly with neutrophil elastase and neutrophil cathepsin G, respectively, and inhibit their protease activities (36). A protease-to-protease inhibitor imbalance in patients with genetic α1-antitrypsin deficiency is reported to confer a higher risk of chronic pancreatitis (37).
However, the serum level of α1-antitrypsin in patients with pancreatic cancer varied significantly from case to case, and its clinical significance has remained unclear. We showed that increased concentrations of α1-antitrypsin and α1-antichymotrypsin in plasma/serum correlated with poor survival, suggesting that patients with poor outcomes have lower base-line protease activities than those with favorable outcomes. How such a protease imbalance affects the progression of pancreatic cancer awaits clarification in future studies.
In conclusion, we identified a prognostic biomarker potentially useful for selecting high risk patients with advanced pancreatic cancer who are unlikely to gain adequate survival benefit from the standard treatment. This may be of great clinical importance, especially when an alternative therapeutic option becomes available for patients with advanced pancreatic cancer in the future. However, the level of α1-antitrypsin was not significantly correlated with the efficacy of gemcitabine, indicating that it may reflect the natural course of pancreatic cancer irrespective of treatment.
Therefore, an independent prospective validation study will be necessary to confirm the generality of the present findings. The absolute concentration of α1-antitrypsin can be measured by nephelometry, but this measurement requires a larger sample volume than reverse-phase microarrays and for this reason could not be performed in this study. While bearing all these limitations in mind, the present findings may not only help to stratify patients with pancreatic cancer but also provide novel insights into the molecular mechanisms behind the malignant progression of this neoplasm, possibly leading to the development of novel therapeutic strategies.