Originally published In Press as doi:10.1074/mcp.M200032-MCP200 on December 9, 2002.
Molecular & Cellular Proteomics 1:983-995, 2002.
© 2002 by The American Society for Biochemistry and Molecular Biology, Inc.
Research
Abundance and Distributions of Eukaryote Protein Simple Sequences*
Kim Lan Sim and Trevor P. Creamer
From the Center for Structural Biology, Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky 40536-0298
Protein simple sequences are a subclass of low complexity regions of sequence that are highly enriched in one or a few residue types. Such sequences are common in transcription regulatory proteins, in structural proteins, in proteins involved in nucleic acid interactions, and in mediating protein-protein interactions. This work surveys simple sequences of 10 or more residues that contain at least 50% of a single residue type. Both eukaryote and prokaryote proteomes are investigated, with emphasis on the eukaryotes. Very large numbers of such sequences are found in all organisms surveyed. Eukaryotes possess far more simple sequences per protein than do prokaryotes. Prokaryotes display a linear relationship between the number of proteins containing simple sequences and proteome size, whereas it is not clear that such a relationship holds for eukaryotes. Strikingly, each eukaryote possesses its own unique distribution of simple sequences. Within those distributions, simple sequences enriched in certain residue types are clearly favored, whereas others are just as clearly discriminated against. The preferences observed are not correlated with residue occurrence. An analysis of classes of proteins of known function suggests that simple sequence occurrence and distribution may be related to protein function. Based upon this analysis, the large number of simple sequences found beyond what would be expected from a simple statistical model, and the known functional importance of numerous such sequences, it is postulated that eukaryotes have evolved not only to tolerate large numbers of simple sequences but also to require them.
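The selection criterion in the abstract (10 or more residues, half of them a single residue type) can be illustrated with a minimal sliding-window sketch. This is not the authors' survey procedure, which the abstract does not describe. The function name, the fixed 10-residue window, the treatment of the 50% figure as a minimum fraction, and the merging of overlapping hits are all assumptions made for illustration.

```python
from collections import Counter


def simple_sequence_regions(seq, window=10, min_fraction=0.5):
    """Return (start, end, residue) regions enriched in one residue type.

    Hypothetical helper, not taken from the paper. A window is a hit when
    its most common residue makes up at least `min_fraction` of it.
    Overlapping or adjacent hits for the same residue are merged.
    Coordinates are 0-based, end-exclusive.
    """
    hits = []
    for start in range(len(seq) - window + 1):
        # On a tie, Counter reports the residue seen first in the window.
        residue, count = Counter(seq[start:start + window]).most_common(1)[0]
        if count / window >= min_fraction:
            hits.append((start, start + window, residue))

    regions = []
    for start, end, residue in hits:
        last = regions[-1] if regions else None
        if last and last[2] == residue and start <= last[1]:
            # Extend the previous region instead of starting a new one.
            regions[-1] = (last[0], end, residue)
        else:
            regions.append((start, end, residue))
    return regions


if __name__ == "__main__":
    # Toy sequence: a poly-glutamine run flanked by mixed residues.
    example = "ACDEFGHIKL" + "Q" * 10 + "MNPRSTVWYA"
    for start, end, residue in simple_sequence_regions(example):
        print(residue, start, end, example[start:end])
```

Because every qualifying window is merged, a reported region can extend into flanking residues on either side of the enriched core. A real survey would need an explicit rule for trimming or extending region boundaries.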
To whom correspondence should be addressed: Center for Structural Biology, Dept. of Molecular and Cellular Biochemistry, University of Kentucky, 800 Rose St., Lexington, KY 40536-0298. Tel.: 859-323-6037; Fax: 859-323-1037; Email: tpcrea0{at}uky.edu
