Molecular & Cellular Proteomics 1:983-995, 2002.
© 2002 by The American Society for Biochemistry and Molecular Biology, Inc.


From the Center for Structural Biology, Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky 40536-0298
Protein simple sequences are a subclass of low complexity regions of sequence that are highly enriched in one or a few residue types. Such sequences are common in transcription regulatory proteins, in structural proteins, in proteins involved in nucleic acid interactions, and in mediating protein-protein interactions. Simple sequences of 10 or more residues, containing ≥50% of a single residue type, are surveyed in this work. Both eukaryote and prokaryote proteomes are investigated with emphasis on the eukaryotes. Very large numbers of such sequences are found in all organisms surveyed. It is found that eukaryotes possess far more simple sequences per protein than do the prokaryotes. Prokaryotes display a linear relationship between number of proteins containing simple sequences and proteome size, whereas it is not clear that such a relationship holds for eukaryotes. Strikingly, it is found that each eukaryote possesses its own unique distribution of simple sequences. Within those distributions it is found that simple sequences enriched in certain residue types are clearly favored, whereas others are just as clearly discriminated against. The preferences observed are not correlated with residue occurrence. An analysis of classes of proteins of known function suggests that simple sequence occurrence and distribution may be related to protein function. Based upon this analysis, the large number of simple sequences found above that which would be expected from a simple statistical model, plus the known functional importance of numerous such sequences, it is postulated that eukaryotes have evolved to not only tolerate large numbers of simple sequences but also to require them.
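The selection criterion stated in the abstract (10 or more residues enriched in a single residue type) can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not describe. It assumes the threshold means "at least 50%", uses a fixed 10-residue sliding window, and merges overlapping qualifying windows that share the same dominant residue. The function name `find_simple_regions` and its parameters are hypothetical.

```python
from collections import Counter


def find_simple_regions(seq, window=10, threshold=0.5):
    """Return (start, end, residue) spans covered by qualifying windows.

    Assumption: a window qualifies when its most common residue makes up
    at least `threshold` of the window. Spans are half-open [start, end).
    """
    regions = []
    for i in range(len(seq) - window + 1):
        # On a tie, Counter.most_common returns the residue seen first.
        residue, count = Counter(seq[i:i + window]).most_common(1)[0]
        if count / window < threshold:
            continue
        start, end = i, i + window
        last = regions[-1] if regions else None
        if last and last[2] == residue and start <= last[1]:
            # Extend the previous span for the same dominant residue.
            regions[-1] = (last[0], end, residue)
        else:
            regions.append((start, end, residue))
    return regions


if __name__ == "__main__":
    # Hypothetical sequence with a 12-residue glutamine run.
    example = "MKT" + "Q" * 12 + "LVRAGSPDEH"
    print(find_simple_regions(example))  # [(0, 20, 'Q')]
```

Because each span is the union of qualifying windows, it includes flanking residues on either side of the enriched run; a stricter survey would need an explicit rule for trimming region boundaries.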
To whom correspondence should be addressed: Center for Structural Biology, Dept. of Molecular and Cellular Biochemistry, University of Kentucky, 800 Rose St., Lexington, KY 40536-0298. Tel.: 859-323-6037; Fax: 859-323-1037; Email: tpcrea0{at}uky.edu