Originally published In Press as doi:10.1074/mcp.M800462-MCP200 on July 14, 2009.
Molecular & Cellular Proteomics 8:2227-2242, 2009.
© 2009 by The American Society for Biochemistry and Molecular Biology, Inc.
Research
Normalization and Statistical Analysis of Quantitative Proteomics Data Generated by Metabolic Labeling
Lily Ting, Mark J. Cowley, Seah Lay Hoon, Michael Guilhaus, Mark J. Raftery, and Ricardo Cavicchioli¶¶
From the School of Biotechnology and Biomolecular Sciences and the Bioanalytical Mass Spectrometry Facility, The University of New South Wales, Sydney, New South Wales 2052, Australia
Comparative proteomics is a powerful analytical method for learning about the responses of biological systems to changes in growth parameters. To make confident inferences about biological responses, proteomics approaches must incorporate appropriate statistical measures of quantitative data. In the present work, we applied microarray-based normalization and statistical analysis (significance testing) methods to analyze quantitative proteomics data generated from the metabolic labeling of a marine bacterium (Sphingopyxis alaskensis). Quantitative data were generated for 1,172 proteins, representing 1,736 high confidence protein identifications (54% genome coverage). To test approaches for normalization, cells were grown at a single temperature, metabolically labeled with 14N or 15N, and combined in different ratios to give an artificially skewed data set. Inspection of ratio versus average (MA) plots showed that a fixed value median normalization was most suitable for the data. To determine an appropriate statistical method for assessing differential abundance, a fold change approach, Student's t test, unmoderated t test, and empirical Bayes moderated t test were applied to proteomics data from cells grown at two temperatures. Inverse metabolic labeling was used with multiple technical and biological replicates, and proteomics was performed on cells that were combined based on equal optical density of cultures (providing skewed data) or on cell extracts that were combined to give equal amounts of protein (no skew). To account for arbitrarily complex experiment-specific parameters, a linear modeling approach was used to analyze the data using the limma package in R/Bioconductor. A high quality list of statistically significant differentially abundant proteins was obtained by using lowess normalization (after inspection of MA plots) and applying the empirical Bayes moderated t test. The approach also effectively controlled for the number of false discoveries and corrected for the multiple testing problem using the Storey-Tibshirani false discovery rate (Storey, J. D., and Tibshirani, R. (2003) Statistical significance for genomewide studies. Proc. Natl. Acad. Sci. U.S.A. 100, 9440–9445). The approach we have developed is generally applicable to quantitative proteomics analyses of diverse biological systems.
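As an illustration of the fixed value median normalization mentioned in the abstract, the following is a minimal Python sketch, not the authors' R/limma workflow. It computes MA values (M as the log2 ratio of the 15N to 14N intensities, A as the mean log2 intensity) and then subtracts the median M from every protein so that the bulk of log ratios centers on zero. The function names `ma_values` and `median_normalize` and the intensity values are hypothetical, chosen to mimic a sample mixed at a 1:2 ratio.

```python
import math
from statistics import median


def ma_values(light, heavy):
    """Return (M, A): M = log2(heavy / light), A = mean of the log2 intensities."""
    m = [math.log2(h / l) for l, h in zip(light, heavy)]
    a = [0.5 * (math.log2(h) + math.log2(l)) for l, h in zip(light, heavy)]
    return m, a


def median_normalize(m):
    """Subtract one fixed value (the median log ratio) from every log ratio."""
    shift = median(m)
    return [x - shift for x in m]


# Hypothetical intensities (assumption, not data from the paper):
# 14N ("light") and 15N ("heavy") cells mixed 1:2, so most proteins
# show M = 1 before normalization; the last protein is truly more abundant.
light = [100.0, 200.0, 400.0, 800.0, 50.0]
heavy = [200.0, 400.0, 800.0, 1600.0, 400.0]

m, a = ma_values(light, heavy)
normalized = median_normalize(m)
print(normalized)
```

In this toy case the mixing skew is removed from the four unchanged proteins, while the fifth protein keeps a positive log ratio. An intensity-dependent method such as lowess would instead fit the correction as a function of A rather than applying a single shift.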
¶¶ To whom correspondence should be addressed. Tel.: 61-2-9385-3516; Fax: 61-2-9385-2742; E-mail: r.cavicchioli@unsw.edu.au.
