Submitted on January 22, 2003
Revised on April 5, 2003
Accepted on April 7, 2003
Genome-wide analyses of carboxyl terminal sequences
Jean-Ju L. Chung, Hongmei Yang, and Min Li
Neuroscience, Johns Hopkins University, Baltimore, MD 21205
Corresponding Author: minli{at}jhmi.edu
Sequence motifs at the C-termini of linear polypeptides are uniquely positioned to serve as recognition signatures for a variety of cellular and biochemical processes. At the proteome level, it is unknown whether particular carboxyl terminal sequences are conserved and, if so, which ones; such conservation may be directly related to specific biological functions shared among certain groups of proteins. To investigate this question, we analyzed the terminal sequences of reported yeast open reading frames (ORFs), which presumably constitute the entire predicted proteome of Saccharomyces cerevisiae. The results show that both known and novel terminal sequences are conserved, at a frequency similar to that of functionally important, experimentally confirmed signals such as the HDEL sequence, which mediates ER retention and/or retrieval. These findings support the notion that there may be additional carboxyl terminal signals, and the conserved motifs may be tested experimentally for currently unknown biological functions. Similar analyses applied to the limited proteome databases of other organisms gave overall consistent findings. Therefore, indexing a proteome according to its C-terminal sequences may provide a means for the functional classification and determination of proteins.
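As a rough illustration of the kind of tally such an analysis implies, the Python sketch below counts how often each C-terminal sequence of a fixed length occurs across a set of protein sequences. The function name, the four-residue window, and the toy sequences are assumptions made for illustration only; they are not the authors' procedure or data.

```python
from collections import Counter


def c_terminal_counts(proteins, k=4):
    """Count occurrences of each k-residue C-terminal sequence.

    `proteins` maps an ORF identifier to its amino acid sequence.
    The window length k=4 is an assumed value, chosen here because
    HDEL is a four-residue signal.
    """
    counts = Counter()
    for seq in proteins.values():
        # Drop a trailing stop marker ("*") if the source file includes one.
        seq = seq.rstrip("*").upper()
        # Skip sequences shorter than the window.
        if len(seq) >= k:
            counts[seq[-k:]] += 1
    return counts


# Hypothetical toy sequences, not real yeast ORFs.
toy_proteome = {
    "orf_a": "MKTAYIAKQRHDEL",
    "orf_b": "MSSLLPAGHDEL",
    "orf_c": "MAVKKLSKL",
    "orf_d": "MTE",  # too short for k=4, ignored
}

counts = c_terminal_counts(toy_proteome, k=4)
for motif, n in counts.most_common():
    print(motif, n)
```

On a real proteome, the resulting counts could be ranked and compared against the frequency of a confirmed signal such as HDEL, which is the comparison the abstract describes.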