Originally published In Press as doi:10.1074/mcp.M700240-MCP200 on November 19, 2007.
Molecular & Cellular Proteomics 7:631-644, 2008.
© 2008 by The American Society for Biochemistry and Molecular Biology, Inc.
Special Issue: 8th International Symposium On Mass Spectrometry In The Life Sciences
Statistical Similarities between Transcriptomics and Quantitative Shotgun Proteomics Data
Norman Pavelka, Marjorie L. Fournier, Selene K. Swanson, Mattia Pelizzola¶, Paola Ricciardi-Castagnoli||, Laurence Florens, and Michael P. Washburn**
From the Stowers Institute for Medical Research, Kansas City, Missouri 64110, Department of Biotechnology and Bioscience, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milano, Italy, and || Singapore Immunology Network, 8A Biomedical Grove, Immunos Building, Singapore 138648, Singapore
If the large collection of microarray-specific statistical tools were applicable to the analysis of quantitative shotgun proteomics datasets, it would substantially advance proteomics research. Here we analyze two large multidimensional protein identification technology datasets, one containing eight replicates of the soluble fraction of a yeast whole-cell lysate and one containing nine replicates of a human immunoprecipitate, to test whether normalized spectral abundance factor (NSAF) values share substantially similar statistical properties with transcript abundance values from Affymetrix GeneChip data. First we show that these two types of numeric values have similar dynamic ranges and distribution properties. Next we show that the standard deviation (S.D.) of a protein's NSAF values depends on the average NSAF value of the protein itself, following a power law. This relationship can be modeled by a power law global error model (PLGEM), initially developed to describe the variance-versus-mean dependence that exists in GeneChip data. PLGEM parameters obtained from NSAF datasets proved to be surprisingly similar to the typical parameters observed in GeneChip datasets. The most important common feature identified by this approach was that, although in absolute terms the S.D. of replicated abundance values increases with increasing average abundance, the coefficient of variation, a relative measure of variability, becomes progressively smaller under the same conditions. We next show that PLGEM parameters were reasonably stable as the number of replicates decreased. Finally we illustrate one possible application of PLGEM, the identification of differentially abundant proteins, in which it might outperform standard statistical tests. In summary, we believe that this body of work lays the foundation for the application of microarray-specific tools in the analysis of NSAF datasets.
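The power law relationship described in the abstract can be illustrated with a minimal sketch. It assumes the relationship is written as ln(S.D.) = k · ln(mean) + c, and it estimates k and c by ordinary least squares over all proteins, which is a simplification and not necessarily the fitting procedure used by the authors. The data are synthetic, and every function name and parameter value below (`fit_power_law`, `predicted_cv`, `simulate_replicates`, `summarize`, slope 0.8, intercept -3.0) is hypothetical and chosen only for illustration. A fitted slope below 1 is what makes the S.D. grow with abundance while the coefficient of variation shrinks, because CV = S.D./mean = exp(c) · mean^(k - 1).

```python
import math
import random
import statistics


def fit_power_law(means, sds):
    """Ordinary least-squares fit of ln(sd) = slope * ln(mean) + intercept."""
    # Logarithms need strictly positive inputs, so drop anything else.
    pairs = [(math.log(m), math.log(s))
             for m, s in zip(means, sds) if m > 0 and s > 0]
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predicted_cv(mean, slope, intercept):
    """CV implied by the fit: sd / mean = exp(intercept) * mean ** (slope - 1)."""
    return math.exp(intercept) * mean ** (slope - 1.0)


def simulate_replicates(n_proteins=500, n_replicates=8, slope=0.8,
                        intercept=-3.0, seed=0):
    """Synthetic abundance table (NOT real NSAF data) obeying a power law."""
    rng = random.Random(seed)
    table = []
    for _ in range(n_proteins):
        # Assumed abundance range spanning three orders of magnitude.
        true_mean = 10 ** rng.uniform(-5, -2)
        true_sd = math.exp(intercept) * true_mean ** slope
        table.append([rng.gauss(true_mean, true_sd)
                      for _ in range(n_replicates)])
    return table


def summarize(table):
    """Per-protein mean and sample standard deviation across replicates."""
    means = [statistics.mean(row) for row in table]
    sds = [statistics.stdev(row) for row in table]
    return means, sds


if __name__ == "__main__":
    means, sds = summarize(simulate_replicates())
    k, c = fit_power_law(means, sds)
    print(f"slope={k:.3f} intercept={c:.3f}")
    for m in (1e-5, 1e-3):
        print(f"mean={m:g} predicted CV={predicted_cv(m, k, c):.3f}")
```

With real data, the rows of the table would be the replicated NSAF values of each protein rather than simulated draws. The intercept estimated from sample standard deviations of a few replicates is biased slightly downward on the log scale, so the sketch is more trustworthy for the slope than for the intercept.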
** To whom correspondence should be addressed: Stowers Inst. for Medical Research, 1000 E. 50th St., Kansas City, MO 64110. Tel.: 816-926-4457; Fax: 816-926-4694; E-mail: mpw{at}stowers-institute.org
