
Molecular & Cellular Proteomics 2:425, 2003.
© 2003 by The American Society for Biochemistry and Molecular Biology, Inc.


EDITORIAL

Toward Deciphering the Knowledge Encrypted in Large Datasets

Alma L. Burlingame

In the discussions that took place during formulation of the charter of this journal, it was clear that the research community had embarked on a challenging quest to identify and describe the totality of cellular protein expression in unprecedented detail. In addition, exploring global quantitative changes in levels of expression or posttranslational status would require the development of new methodologies, but such a focus would provide even deeper insight into the factors that affect physiology, phenotype, and normal or aberrant function. It was presumed that particular aspects of such knowledge would be revealed quickly, driven by powerful mass spectrometric technologies that exploit the growing encyclopedia of genomic information.

Many recent forays are guided by strategies based on so-called "discovery science," in which data are gathered by advanced technologies that possess unprecedented power to scrutinize complex systems in precise molecular detail. The use of such revolutionary tools is producing global measurements of message levels and ensembles of protein translation products. Thus, very large amounts of rather precise new information on gene expression and its modulation are accumulating in a growing number of laboratories. The challenge now is to decipher the multi-functional, network, and even phenotypic contexts embedded in these data.

A sobering characteristic of large-scale measurements of gene expression is the present inability of any single laboratory to fully comprehend information gathered on such a comprehensive scale. Hence, much of the information content implicit in such large datasets remains cryptic and must await future mining endeavors to "reveal" its full biological meaning. One strategy that might accelerate the process of understanding would be to stimulate the participation of previously unidentified interested parties. This could be accomplished by making the datasets themselves accessible, together with the results deduced by the group that carried out the original experiments. In so doing, one should bear in mind that such datasets will rapidly become "outdated" through the ongoing expansion and improvement of sequence databases and of the analytical tools available for their interpretation. Thus, to fully realize the potential of publicly sharing a growing base of large-scale datasets, two conditions must be met: the datasets must be amenable to inquiry with newer analytical tools as they become available, and some form of statistical, probability-based, or otherwise objective and scientifically validated determination of "confidence" levels must be applied to the results. The latter is vital for drawing meaningful biological information from the data. Perhaps more importantly, it will also allow comparison with related datasets produced by other research groups as they become available, which will facilitate deriving yet more biological information not apparent from the individual datasets. Thus, by providing access to such suitably annotated data, the discovery of additional meaning and insight not recognized in the original scientific work may proceed.

It is our intention to articulate the nature of the problems associated with the quality of such datasets and the suitability of current tools to deal with them; to facilitate the further mining of such datasets; and to foster critical discussion of these topics and issues within the community. Molecular & Cellular Proteomics is committed to providing: 1) community access to current research examples of these complex research problems, 2) a forum for articulation and adjudication of technical and scientific issues surrounding large-scale biological research, and 3) a vehicle for formulation of standards through community participation, critique, and consensus. To this end, the publication in this issue of one such dataset, along with the original raw data used to generate it (1, 2), represents a first step toward realizing our intention.

REFERENCES

  1. von Haller, P. D., Yi, E., Donohoe, S., Vaughn, K., Keller, A., Nesvizhskii, A. I., Eng, J., Li, X., Goodlett, D. R., Aebersold, R., and Watts, J. D. (2003) The application of new software tools to quantitative protein profiling via ICAT and tandem mass spectrometry: I. Statistically annotated datasets for peptide sequences and proteins identified via the application of ICAT and tandem mass spectrometry to proteins co-purifying with T cell lipid rafts. Mol. Cell. Proteomics 2, 426–427

  2. von Haller, P. D., Yi, E., Donohoe, S., Vaughn, K., Keller, A., Nesvizhskii, A. I., Eng, J., Li, X., Goodlett, D. R., Aebersold, R., and Watts, J. D. (2003) The application of new software tools to quantitative protein profiling via ICAT and tandem mass spectrometry: II. Evaluation of tandem mass spectrometry methodologies for large-scale protein analysis, and the application of statistical tools for data analysis and interpretation. Mol. Cell. Proteomics 2, 428–442





