Molecular & Cellular Proteomics 4:773-784, 2005.
| ABSTRACT |
|---|
The routine application of antibody microarrays to biological and marker-based research requires establishing optimized experimental and analysis methods. Experimental optimization can help to improve the accuracy and reproducibility of measurements, but the analysis methods must be properly developed and applied to ensure the proper interpretation of the data. A common data processing procedure applied to microarray data is normalization, which adjusts the data from each microarray to account for possible systematic experimental variation in factors such as sample labeling efficiency, scanner readout efficiency, and microarray quality (13, 14). Several normalization procedures have been developed for DNA microarrays. A method developed early on for DNA microarrays is global normalization, which normalizes each array by the median or mean of the intensity log ratios on the array. Other normalization methods purport to correct for systematic errors that may affect arrays non-globally when not all of the spots on an array have the same bias. Intensity-dependent normalization adjusts two-color microarray ratios to account for intensity-based bias in the ratios (14, 15) either linearly or non-linearly. Print tip normalization has been used to account for bias caused by variation associated with differences in the tips used for printing (14). Scale normalization makes the assumption that the spread of the distribution of log ratios should be the same for all print tip groups (14). Statistical regression models of microarray data also have been developed for normalization (16).
The two-color comparative fluorescence detection method that we and others have used for antibody microarray experiments is similar to the two-color labeling strategies used for cDNA microarray experiments, so the normalization methods for cDNA arrays may be useful for two-color antibody microarray experiments. However, the differences in antibody microarray experiments, such as a smaller and more selected set of targets and a different labeling method, may mean that the optimal normalization methods differ. A systematic, detailed comparison and evaluation of the various normalization options for antibody microarrays have not yet been performed. Given the importance of this procedure for subsequent data analysis and interpretation, an in-depth analysis of normalization methods for antibody microarray data is necessary before performing large scale biomarker studies. Therefore, we conducted studies to evaluate various normalization methods for antibody microarray data. Three replicate sets of antibody microarray measurements from serum samples of patients with pancreatic cancer and of control subjects were acquired, and we evaluated seven different normalization methods. The methods represented a variety of major classes of normalization types. Modifications of these types exist, but by evaluating representative methods from a range of classes we could broadly survey the effects of normalization on the data. Newly developed methods with computations requiring special software were not tested.
Each normalization method makes use of assumptions of how "correct" data should behave, and the comparison and evaluation of normalization methods must be independent of those assumptions. Previous comparisons of DNA normalization methods have used the criteria of reproducibility between replicate data sets (13, 17, 18), the linearity of signals from spiked-in standards (17), and the levels of biases in simulated data (13). In this study, we examined several different parameters to get a broad picture of the effects of normalization. The criteria for evaluating and comparing the methods were reproducibility between replicate data sets, accuracy in comparison with known values, and the integrity of overall trends in the data sets. In addition, using optimally normalized data, we investigated the potential benefit of using combinations of measurements for the classification of the samples.
| MATERIALS AND METHODS |
|---|
Antibodies, ELISA, and Protein Concentration Measurements
Antibodies were purchased from various sources (see Supplemental Table I for the complete list of antibodies and sources). Antibodies that were supplied in ascites fluid or antisera were purified using protein A beads (Affi-gel Protein A MAPS kit, Bio-Rad) according to the manufacturer's protocol. The antibodies were prepared at concentrations of 100–1000 µg/ml in 1x phosphate-buffered saline. Two antibodies targeting HGF and one antibody targeting MUC-1 were kindly contributed to the project (Drs. Brian Cao and Ilan Tsarfaty, respectively). ELISAs were performed using commercially available kits from Bethyl Corporation (Montgomery, TX) for the detection of hemoglobin, IgM, IgG, IgA, transferrin, and albumin, and from Cedarlane Laboratories (Hornby, ON, Canada) for the detection of von Willebrand factor. Total serum protein concentrations were measured in duplicate using the BCA assay kit (Pierce).
Fabrication of Antibody Microarrays
Antibodies were deposited eight times each onto slides coated with a polyacrylamide hydrogel (HydroGel, PerkinElmer Life Sciences) using a high-throughput, custom built contact arrayer. Before printing, the hydrogel-coated slides were hydrated for 10 min each in three changes of purified water, dried by centrifugation, and incubated at 40 °C for 20 min. Each printed microarray was circumscribed using a hydrophobic marker. The slides were incubated overnight at room temperature in a humidified chamber to induce binding of the antibodies to the hydrogel matrix. They were washed for 30 s, 3 min, and 30 min in 1x phosphate-buffered saline/0.5% Tween 20, blocked for 1 h at room temperature in 1% BSA/phosphate-buffered saline/0.5% Tween 20, and washed briefly two times in phosphate-buffered saline/0.5% Tween 20 before use.
Sample Labeling
An aliquot from each of the 43 serum samples was labeled with N-hydroxysuccinimide-Cy3 (Amersham Biosciences), and another aliquot was labeled with N-hydroxysuccinimide-Cy5 (Amersham Biosciences). Each serum aliquot was diluted 1:20 into a 200 mM carbonate buffer at pH 8.3, and a twentieth volume of 6.7 mM N-hydroxysuccinimide-Cy3 or -Cy5 in Me2SO was added. This labeling mix gave approximately a 5–10-fold molar excess of dye relative to the serum proteins (assuming an average serum protein molecular mass of 70 kDa). The concentrations, time, and pH of the labeling reaction were designed to label each serum protein thoroughly but not to completion in case overlabeling of certain proteins might interfere with antibody-antigen interactions. The carbonate buffer contained 1.5 µg/ml BSA labeled with 2,4-dinitrophenol (DNP),1 as a normalization spike-in. After the reactions proceeded for 2 h on ice, a twentieth volume of 1 M Tris-HCl, pH 8.0, was added to each tube to quench the reactions, and the solutions were allowed to sit for another 20 min. The unreacted dye was removed by passing each solution through a size exclusion chromatography spin column (Bio-Spin P6, Bio-Rad) under centrifugation at 1000 × g for 2 min. The Cy5-labeled samples were pooled, and equal amounts of the pool were transferred to each of the Cy3-labeled samples. Each dye-labeled protein solution was supplemented with nonfat milk to a final concentration of 3%, Tween 20 to a final concentration of 0.1%, and 1x phosphate-buffered saline to yield a final serum dilution of 1:100.
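As a rough check on the stated dye excess, the arithmetic of the labeling mix can be sketched as follows. The total serum protein concentration passed in (for example 70 g/L) is an assumed typical value and not a figure reported here, the 1:20 dilution is read as a 20-fold dilution, and the function name is hypothetical.

```python
def dye_molar_excess(serum_protein_g_per_l,
                     dye_stock_molar=6.7e-3,           # 6.7 mM NHS-Cy3 or -Cy5 stock
                     protein_mass_g_per_mol=70_000.0,  # 70 kDa average, as assumed in the text
                     serum_dilution=20.0,              # 1:20 read as a 20-fold dilution (assumption)
                     dye_volume_fraction=1.0 / 20.0):  # "a twentieth volume" of dye stock
    """Moles of dye per mole of serum protein in the labeling mix."""
    # Both amounts are expressed per litre of diluted serum, so the volume cancels.
    protein_mol = serum_protein_g_per_l / serum_dilution / protein_mass_g_per_mol
    dye_mol = dye_stock_molar * dye_volume_fraction
    return dye_mol / protein_mol


if __name__ == "__main__":
    for grams_per_litre in (50.0, 70.0, 90.0):  # assumed range of total serum protein
        print(grams_per_litre, round(dye_molar_excess(grams_per_litre), 1))
```

Under these assumptions, total protein between about 50 and 90 g/L gives an excess between roughly 9.4 and 5.2, which is consistent with the approximate range quoted above.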
Processing of Antibody Microarrays
100 µl of each labeled serum sample mix was incubated on a microarray with gentle rocking at room temperature for 2 h. After incubation, the slides were rinsed briefly in 1x phosphate-buffered saline with 0.1% Tween 20 to remove the unbound sample and then subsequently washed three times for 10 min each in 1x phosphate-buffered saline with 0.1% Tween 20. The slides were spun dry before scanning for fluorescence at 543 and 633 nm using a microarray scanner (ScanArray Express HT, PerkinElmer Life Sciences).
Data Analysis
The software program GenePix Pro 5.0 (Axon Instruments, Foster City, CA) was used to quantify the image data. An intensity threshold for each antibody spot was calculated by the formula $3 \times B \times CV_b$, where $B$ is the median local background of each spot, and $CV_b$ is the average coefficient of variation (S.D. divided by the average) of all the local backgrounds on the array. Spots that either did not surpass the intensity threshold in both color channels, had a regression coefficient (calculated between the pixels of the two color channels) of less than 0.3, or had more than 50% of the pixels saturated in either color channel were excluded from analysis. The ratio of background-subtracted, median sample-specific fluorescence to background-subtracted, median reference-specific fluorescence was calculated, and the ratios from replicate antibody measurements within the same array were averaged using the geometric mean.
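The spot-filtering rules and the geometric-mean averaging can be sketched as below. This is a minimal illustration, not the GenePix workflow itself: the function and argument names are hypothetical, and the sketch assumes that a spot is kept only when both channels exceed the threshold and that the saturated fraction passed in is the larger of the two channels.

```python
import math


def passes_filters(sample_signal, reference_signal, local_background, cv_b,
                   pixel_regression, saturated_fraction):
    """Return True if a spot survives the three exclusion rules."""
    threshold = 3 * local_background * cv_b  # 3 x B x CVb
    return (sample_signal > threshold        # assumed: both channels must pass
            and reference_signal > threshold
            and pixel_regression >= 0.3      # excluded when below 0.3
            and saturated_fraction <= 0.5)   # excluded when above 50%


def geometric_mean(ratios):
    """Geometric mean of replicate spot ratios (all ratios must be positive)."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


if __name__ == "__main__":
    print(passes_filters(500.0, 400.0, 100.0, 0.2, 0.8, 0.0))
    print(geometric_mean([1.0, 4.0]))
```

The geometric mean is the natural average for ratios because it is equivalent to averaging the log ratios, so a twofold increase and a twofold decrease cancel.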
Hierarchical clustering and visualization were performed using the programs Cluster and Treeview (see rana.lbl.gov). Ratios were log transformed (base 2) and median centered by genes. Antibodies that did not have measurements in at least 80% of the samples were removed from the clusters.
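The pre-clustering transformation can be sketched as follows. This is not the Cluster program's code; it is a small stand-in under the assumptions that missing measurements are represented as `None` and that each row holds one antibody's ratios across samples. The function name and data layout are hypothetical.

```python
import math
from statistics import median


def log2_median_center(ratios_by_antibody, min_fraction=0.8):
    """Log2-transform each antibody's ratios and subtract that antibody's median.

    Antibodies measured in fewer than `min_fraction` of the samples are dropped.
    """
    centered = {}
    for antibody, ratios in ratios_by_antibody.items():
        present = [r for r in ratios if r is not None]
        if len(present) < min_fraction * len(ratios):
            continue  # too many missing measurements for this antibody
        row_median = median(math.log2(r) for r in present)
        centered[antibody] = [
            None if r is None else math.log2(r) - row_median for r in ratios
        ]
    return centered


if __name__ == "__main__":
    data = {"anti-IgG": [1.0, 2.0, 4.0], "anti-X": [1.0, None, None]}
    print(log2_median_center(data))
```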
Normalization Methods
Multiple normalization methods were applied to the microarray data. The details of each are given below.
DNP
The averaged ratios were multiplied by a normalization factor $N$ for each array that was calculated by $N = 1/R_{\mathrm{DNP}}$, where $R_{\mathrm{DNP}}$ is the average ratio of the replicate anti-DNP antibody spots on the array.
IgM-ELISA
The averaged ratios were multiplied by a normalization factor $N$ for each array that was calculated by $N = (S_{\mathrm{IgM}}/\mu_{\mathrm{IgM}})/R_{\mathrm{IgM}}$, where $R_{\mathrm{IgM}}$ is the average ratio of the replicate anti-IgM antibody spots on the array, $S_{\mathrm{IgM}}$ is the ELISA-measured IgM concentration of the serum sample on that array, and $\mu_{\mathrm{IgM}}$ is the mean ELISA-measured IgM concentration of all of the samples.
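The per-array factor for this method can be sketched as below, assuming one anti-IgM ratio and one ELISA value per array, supplied in the same order. The function name is hypothetical.

```python
from statistics import mean


def igm_elisa_factors(igm_ratios, igm_elisa):
    """Return N = (S_IgM / mu_IgM) / R_IgM for each array."""
    mu = mean(igm_elisa)  # mean ELISA-measured IgM over all samples
    return [(s / mu) / r for r, s in zip(igm_ratios, igm_elisa)]


if __name__ == "__main__":
    ratios = [2.0, 0.5]   # anti-IgM microarray ratios, one per array
    elisa = [3.0, 1.0]    # ELISA IgM concentrations, same order
    factors = igm_elisa_factors(ratios, elisa)
    # After scaling, each anti-IgM ratio equals the sample's relative ELISA value.
    print([r * n for r, n in zip(ratios, factors)])
```

Multiplying every ratio on an array by its factor forces the anti-IgM ratio to equal that sample's ELISA value relative to the cohort mean, so real differences in IgM between samples are preserved rather than flattened.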
IgM Set to 1
The averaged ratios were multiplied by a normalization factor $N$ for each array that was calculated by $N = 1/R_{\mathrm{IgM}}$, where $R_{\mathrm{IgM}}$ is the average ratio of the replicate anti-IgM antibody spots on the array.
Mean Centering
The averaged ratios were multiplied by a normalization factor $N$ for each array that was calculated by $N = 1/\mu$, where $\mu$ is the mean ratio of all of the antibody spots on the array.
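A minimal sketch of mean centering for a single array follows. The function name is hypothetical, and the arithmetic mean of the linear ratios is used, as the description above states.

```python
from statistics import mean


def mean_center(ratios):
    """Scale one array's ratios by N = 1 / mean so that their mean becomes 1."""
    factor = 1.0 / mean(ratios)
    return [r * factor for r in ratios]


if __name__ == "__main__":
    print(mean_center([1.0, 2.0, 3.0]))
```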
Loess
For each array, the log-transformed ratios of the antibody measurements were plotted with respect to the average intensities of the spots (averaged over both the 543 and 633 nm channels), and a regression line was fit as implemented by the marray package for the R environment (19). The ratios of the individual spots were adjusted so that the regression line centered around zero (14).
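The idea of intensity-dependent correction can be illustrated with a small pure-Python local regression. This is not the marray implementation used in the study; it is a simplified sketch (local linear fit, tricube weights, no robustness iterations), and the function names and the `span` default are assumptions chosen for illustration.

```python
import math


def loess_fit(x, y, span=0.7):
    """Fitted value at each x from a tricube-weighted local linear regression."""
    n = len(x)
    k = max(2, min(n, math.ceil(span * n)))  # number of neighbours in each window
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12            # bandwidth: distance to k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def loess_normalize(log_ratios, intensities, span=0.7):
    """Subtract the fitted intensity trend so the trend is centered on zero."""
    trend = loess_fit(intensities, log_ratios, span)
    return [m - t for m, t in zip(log_ratios, trend)]


if __name__ == "__main__":
    a = [float(i) for i in range(10)]      # average spot intensities
    m = [0.2 * ai - 1.0 for ai in a]       # log ratios with a purely intensity-driven bias
    print([round(v, 6) for v in loess_normalize(m, a)])
```

With only a few dozen spots per array, each local window contains very few points, so the fitted trend can follow noise as easily as real bias.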
Loess/IgM-ELISA
The averaged ratios were first processed using the Loess method described above. The resulting array values were normalized by the IgM-ELISA method as described above.
| RESULTS |
|---|
α1-antitrypsin and anti-vascular endothelial growth factor, are more scattered, showing lower reproducibility or less distinction from the other profiles.
Evaluation of Normalization Methods
We evaluated the effects of seven different normalization procedures on measurement reproducibility, measurement accuracy, and trends in the data sets. "DNP" normalizes each array by setting values from a spiked-in standard (DNP-labeled BSA in this case) to a fixed value; "IgM-ELISA" normalizes each array by setting an internal standard (IgM in this case) to the standard's known values (from ELISA); "IgM set to 1" normalizes each array by setting an internal standard (IgM in this case) to a fixed value; "Mean centering" sets the mean of the ratios in each array to a fixed value; "Loess" uses intensity-based correction to account for biases in the data that may arise from non-linearity in the ratios at certain intensities (14); and "Loess/IgM-ELISA" uses intensity-based correction followed by normalization to the known values of an internal standard (using the IgM-ELISA method in this case). Each of the methods except for the Loess methods corrects for factors that affect the arrays globally, such as labeling or scanner effects. Print tip-based methods were not tested because each antibody was printed by all the tips, and the replicate spots were averaged.
Properly normalized data should reduce variability caused by systematic noise between experiments. The effect of normalization on the reproducibility between replicate data sets was evaluated by examining both the coefficients of variation (CV) and the correlations between the replicate experiments. The CV of each antibody (S.D. divided by average) between the triplicate measurements from each serum sample was calculated for each normalization method. The average CVs for each antibody were compared between the non-normalized data and each set of normalized data using a two-tailed, paired t test (Table I). The average CVs ranged from 0.16 (mean centered) to 0.22 (Loess/IgM-ELISA). Normalization by mean centering was the only method that had a significantly lower (p < 0.05) average CV in comparison with the non-normalized data (Table I, column 3). Normalization by Loess/IgM-ELISA and DNP resulted in an average CV that was significantly higher than non-normalized. We also counted the number of individual antibodies that had significantly higher or lower CVs in the normalized data compared with the non-normalized data. A two-tailed, paired t test was used to compare the CVs between the non-normalized data and each set of normalized data for each antibody. Normalizing by mean centering or by IgM set to 1 produced an abundance of antibodies with a lower CV than in the non-normalized data (Table I, column 4). IgM-ELISA and Loess normalizing had similar numbers of antibodies with higher and lower CVs relative to the non-normalized data. Normalizing by DNP or by Loess/IgM-ELISA yielded a high number of antibodies with a higher CV and a low number with a lower CV than the non-normalized data.
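The CV comparison can be sketched as follows. The sketch assumes the sample standard deviation (n - 1 denominator), which the text does not specify, and computes only the paired t statistic; a p value would additionally require the t distribution with n - 1 degrees of freedom (for example via `scipy.stats.ttest_rel`, if SciPy is available). Function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def coefficient_of_variation(values):
    """S.D. divided by the average of replicate measurements."""
    return stdev(values) / mean(values)  # sample S.D. assumed


def paired_t_statistic(cvs_a, cvs_b):
    """Paired t statistic for two matched lists of CVs (e.g. raw vs normalized)."""
    diffs = [a - b for a, b in zip(cvs_a, cvs_b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


if __name__ == "__main__":
    triplicate = [1.0, 2.0, 3.0]  # one antibody, one serum sample, three replicate sets
    print(coefficient_of_variation(triplicate))
    print(paired_t_statistic([2.0, 4.0, 7.0], [1.0, 2.0, 4.0]))
```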
A complementary approach for evaluating reproducibility is to calculate a correlation between duplicate experiment sets. Pearson correlations were calculated between the replicate sets of 43 arrays after normalization by each method for each antibody. The correlations from each of the normalized data sets were compared with the correlations from the non-normalized data (Table II). In the comparison of experiment set 1 with set 2, normalization by DNP, IgM-ELISA, and IgM set to 1 produced many antibodies (19, 14, and 18, respectively) that had higher inter-set correlations and few antibodies (1, 3, and 3, respectively) that had lower correlations than the non-normalized data. Many of the antibodies (11, 7, and 11, respectively) had correlation coefficients that increased by 0.1 or more over the non-normalized data. In contrast, normalization by Loess or Loess/IgM-ELISA resulted in only a few antibodies with higher correlations and many with lower correlations than the non-normalized data. Normalization by mean centering did not seem to significantly increase the antibody correlations in comparison with the non-normalized correlations. The pairwise comparisons between all three of the experiment sets (1 versus 2, 1 versus 3, 2 versus 3) produced similar results. Taking the two analyses together, the reproducibility of the replicate data seems slightly improved compared with the non-normalized data after normalization by mean centering, IgM set to 1, or IgM-ELISA.
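The correlation analysis can be sketched as below: a Pearson coefficient per antibody between two replicate sets, followed by a count of antibodies whose coefficient rose by at least 0.1 after normalization. The function names and dictionary layout are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of measurements."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def count_improved(raw_r, normalized_r, min_gain=0.1):
    """Number of antibodies whose inter-set correlation rose by at least min_gain."""
    return sum(1 for ab in raw_r if normalized_r[ab] - raw_r[ab] >= min_gain)


if __name__ == "__main__":
    set1 = [1.0, 2.0, 3.0]   # one antibody across samples, replicate set 1
    set2 = [2.0, 4.0, 6.0]   # same antibody, replicate set 2
    print(pearson(set1, set2))
    print(count_improved({"a": 0.5, "b": 0.8}, {"a": 0.7, "b": 0.85}))
```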
α1-antichymotrypsin (e.g. samples 86, 91, and 82). The above patterns are somewhat altered in the mean centered data because the groups of patients that are high in most proteins or low in most proteins are not seen as they are in the other clusters.
α1-antitrypsin, and anti-C-reactive protein, showed higher binding in the pancreatic cancer samples, and the rest showed lower binding in the pancreatic cancer samples; anti-albumin, anti-transferrin, and anti-complement C3 showed the greatest significance. Fig. 4 shows that all of the distributions overlapped significantly between the cancer and control groups.
Real boosting was used to classify each of the cancer and control samples as belonging to one of the two classes. The antibodies used in the multiparametric classification and the performance of the classifier are presented in Table IV. The errors, sensitivities, and specificities (columns 2–4) were averaged over the 10 cross-validation iterations. Each row gives the cumulative result after using the antibodies at and above that row. When using all seven antibody measurements (one antibody was used twice), the cumulative results are 0.00 error, 1.00 sensitivity, and 1.00 specificity, or perfect classification, which was not possible using any single antibody.
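The three performance figures reported for the classifier can be computed from predicted and true labels as sketched below. This covers only the scoring of one set of predictions, not the boosting procedure or the cross-validation loop; the function name and label strings are hypothetical.

```python
def classification_metrics(true_labels, predicted_labels, positive="cancer"):
    """Return (error, sensitivity, specificity) for a two-class prediction."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        else:
            if guess == positive:
                fp += 1
            else:
                tn += 1
    error = (fp + fn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # fraction of cancer samples called cancer
    specificity = tn / (tn + fp)  # fraction of control samples called control
    return error, sensitivity, specificity


if __name__ == "__main__":
    truth = ["cancer", "cancer", "control", "control"]
    guess = ["cancer", "control", "control", "control"]
    print(classification_metrics(truth, guess))
```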
| DISCUSSION |
|---|
Some of the normalization methods performed well in one category but not very well in another, showing the value of using multiple criteria for the evaluation. For example, normalizing by IgM set to 1 yielded more reproducible data than the non-normalized data. However, the accuracy of the measurements was greatly compromised as determined by comparison with the ELISA values. Because IgM concentrations vary from sample to sample, setting this value to a constant is not valid and reduces the accuracy of the measurements. It will probably be impossible to find a "housekeeping" protein in the serum that could be used as a constant reference because it seems that all serum proteins are subject to significant change between individuals as supported by our current analyses. Normalizing by IgM set to 1 also greatly altered the trends in the data sets in comparison with the other normalization methods (Supplemental Fig. 1).
Other methods performed well in some categories but not in others. Loess normalization produced reasonably accurate data, but the reproducibility was lower than the non-normalized data as assessed by the correlation analysis. Applying IgM-ELISA normalization to the Loess-adjusted data did not improve reproducibility. The list of proteins that discriminated cancer from healthy and the general structure of the cluster (Supplemental Fig. 2) also was altered after Loess normalization. The Loess method was developed for DNA microarray data and relies on having a large number of data points to produce an accurate picture of intensity-based biases in the ratios. With fewer data points, as with our data, such adjustments may be erroneous and may actually add noise to the data. Other normalization methods that use regression calculations of trends in data therefore also may not perform well on smaller, more selected arrays. Likewise, scaling methods, which adjust the variances (the spread) in groups of ratios, also might not perform well on these arrays because the variances in small numbers of proteins could legitimately change.
Both DNP and IgM-ELISA normalization had good accuracy, and neither altered the trends in the data sets relative to the non-normalized data, but the reproducibility after IgM-ELISA normalization was slightly higher. Normalization by mean centering also performed well in reproducibility and accuracy, although the normalization seemed to alter trends in the data more than normalization by DNP or IgM-ELISA. Normalization by mean centering is accurate if the average concentration of the measured proteins is constant between samples. Because the average concentration may change, especially if measuring a small number of proteins, normalization by mean centering may occasionally produce results that inaccurately reflect the trends in the data sets.
Therefore, taking all of the information together, the IgM-ELISA normalization method, of the methods evaluated, seems to have performed the best. Normalizing by the known values of an internal standard such as IgM is attractive because these values are inherent to the sample. A spiked-in standard like DNP-labeled BSA is not inherent to the sample, so the standard would not correct for sources of bias that occurred before the standard was introduced. The accuracy of a known standard is independent of the size or selection of the rest of the array, and it makes no assumptions about the behaviors of particular housekeeping proteins. Drawbacks of normalization by an internal standard are that highly accurate ELISA values for that protein must be obtained for every sample and that one relies on the quality of the microarray measurements for that protein.
Further improvements in the normalization method are still possible and necessary for antibody microarrays. The high reproducibility of the mean centered data showed that normalizing by many proteins may be valuable. We are currently investigating variants on the mean centering method. In addition, different spike-in proteins, such as plant or peanut proteins that have no homology to human proteins, may perform better than DNP-labeled BSA. A panel of three or four highly specific spike-in proteins may produce less variable results than the use of a single protein. Other antibody or protein array techniques may have other optimal normalization methods; the methods presented here provide a strategy for determining which is optimal. In addition, other sample types, such as those from tissue or cell culture sources, may behave differently than serum, and the normalization would need to be independently optimized.
In addition to being useful for evaluating normalization methods, these data served the additional purpose of exploring the value of combined measurements for sample classification. Multiple markers may be grouped together to improve diagnostic performance if the markers contribute complementary, non-overlapping discriminatory information. The improvement of the sample classification when using the multiparametric method, compared with the use of single proteins, showed the potential value of antibody microarray data for more accurate diagnostics. This particular classifier is not likely to be specific for pancreatic cancer because most of the proteins used in the classifier had similar distributions between the cancer and other gastrointestinal disease samples. The development of a specific classifier for pancreatic cancer will require measurements from additional proteins that are more specifically associated with pancreatic cancer. A more sensitive detection method, such as the two-color rolling circle amplification method demonstrated previously (22), would allow the measurement of lower abundance proteins that may contribute to a specific signature for pancreatic cancer. Studies using that approach are ongoing.
No firm conclusions on the nature of specific serum protein alterations in pancreatic cancer can be made from these data because of the small sample size and potential bias between the case and control samples, but the observed differences between the cancer and control samples were consistent with the high levels of inflammation usually associated with pancreatic cancer. The higher levels of C-reactive protein and von Willebrand factor in the disease samples probably reflect a positive acute phase response (25, 26), and a reduction in the levels of albumin and transferrin as observed here is also commonly observed in an acute phase response (27). Decreased levels of serum IgG and IgM have been observed in cancer (28, 29), and higher α1-antitrypsin has also been associated with pancreatic cancer (30).
In summary, this work established reliable methods for normalizing antibody microarray data and established objective criteria for assessing normalization methods. Furthermore, we showed that many different proteins in serum samples can be reliably measured using antibody microarrays and that this capability is useful for multiparametric sample classification. These developments lay the foundation for larger-scale studies that could lead to improved diagnostics for pancreatic cancer and other cancers.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Published, MCP Papers in Press, March 25, 2005, DOI 10.1074/mcp.M400180-MCP200
1 The abbreviations used are: DNP, 2,4-dinitrophenol; CV, coefficient of variation.
* This research was funded in part by the Early Detection Research Network of the National Cancer Institute and the Michigan Proteome Consortium of the Michigan Life Sciences Corridor, and by the Van Andel Research Institute. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
S The on-line version of this article (available at http://www.jbc.org) contains Supplemental Tables I and II and Supplemental Figures 1 and 2.
Both authors contributed equally to this work.
¶ Current address: Seton Hall University, Dept. of Biology, South Orange, NJ 07079.
¶¶ To whom correspondence should be addressed: 333 Bostwick, NE, Grand Rapids, MI 49503. Tel.: 616-234-5268; Fax: 616-234-5269; E-mail: brian.haab{at}vai.org
| REFERENCES |
|---|