Molecular & Cellular Proteomics 3:704-714, 2004.
| ABSTRACT |
|---|
"αßßßα" structure (ßB), the Src-type SH2 domain contains an extra ß-strand (ßE or ßE-ßF motif). Alternatively, the linker domain-conjugated SH2 domain in STAT contains the αB′ motif. Combining BLAST data from ßB core motif sequences with predicted secondary structural alignment, we have screened for SH2 domains in various eukaryotic model systems including Arabidopsis, Dictyostelium, and Saccharomyces. Two novel genes carrying the linker-SH2 domain of STAT were discovered and subsequently cloned from Arabidopsis. These genes, designated as STAT-type linker-SH2 domain factors (STATL), are found in a wide array of vascular and nonvascular plants, suggesting that the linker-SH2 domain evolved prior to the divergence of plants and animals. Using this approach, we expanded the number of putative SH2 domain-bearing genes in Dictyostelium and comparatively studied the secondary structural profiles of both typical and atypical SH2 domains. Our results indicate that the linker-SH2 domain of the transcription factor STAT is one of the most ancient and fully developed functional domains, serving as a template for the continuing evolution of the SH2 domain essential for phosphotyrosine signal transduction.
100-aa-long motif that recognizes and interacts with phosphotyrosine-containing motifs on the same or different protein molecules during signal transduction in animal cells. About 200 SH2 domain-containing genes have been identified in human cells, suggesting that this domain is one of the most rapidly expanded protein modules (1). In animal cells, SH2 domains are predominantly present in signaling molecules, i.e. signaling-related enzymes (including protein tyrosine kinases, protein tyrosine phosphatases, inositol phosphatase, and phospholipase) and signaling adapters. However, the SH2 domain has also been found in transcription factor STAT family members (2, 3). In a signaling molecule with catalytic activity, the SH2 domain is often conjugated immediately upstream with another functional motif such as the SH3 domain, whereas in STAT the linker domain is immediately upstream of the SH2 domain. Recently, two STAT proteins have been discovered in Dictyostelium, a facultative slime mold capable of both growing as a single cell and differentiating into multicellular structures (4, 5). More recently, SHK, the SH2 domain-bearing protein kinase, has been identified in the same species (6). In both cases, the SH2 domains are linker domain-conjugated. Because a typical SH2 domain has not been found in the single-cell eukaryote yeast or in microorganisms, SH2 domain formation and its phospho-signaling were proposed to be coincident with animal evolution, perhaps critical during the transition from single-cell to multicellular animals, or metazoans (7, 8).
The characteristic structure of the SH2 domain is three ß-strands flanked by two α-helices (αßßßα). The first ß-strand (ßB), conserved in its sequence GXF/YBBR (9), is the core motif critical for binding phosphotyrosine (pTyr) (10). This sequence is required for the normal function of the SH2 domain and conveniently serves as the fingerprint structure in SH2 domain recognition. While ßB core motif and motif-like sequences exist widely in different genes, some perfect ßB sequences are not necessarily indicative of the SH2 domain (11). In these cases, secondary structural alignment clarifies the confusion caused by sequence alignment alone. Additionally, secondary structural analysis can provide reliable structural evidence for SH2 domains with ambiguous ßB and ßB-flanking sequences. A careful analysis of the amino acid sequence, motif orientation, and secondary structural features indicates that STAT SH2 domains differ from those involved in signal transduction. In a STAT protein, the SH2 domain is the immediate extension of five continuous α-helices representing the linker domain. Moreover, all STAT-like SH2 domains carry an αB′ motif between the ßD and αB sequences. We combined ßB core motif sequence BLAST with secondary structural screening to identify SH2 domains in genome databases of various eukaryotic model systems. We identified and analyzed a novel gene family, which carries the linker-SH2 domain of STAT, from the genome database of Arabidopsis as well as other plants. Two of these linker-SH2 domain-carrying genes were cloned from the cDNA library of Arabidopsis and sequenced. Using secondary structural alignment, we comprehensively analyzed the typical and atypical SH2 domains found in plants, yeast, and Dictyostelium. According to our secondary structural analysis, the linker-SH2 domain of SHK in Dictyostelium, representing the modern SH2 domain in signal transduction, may share a common ancestor with, or even have evolved directly from, the linker-SH2 domain of STAT. The discovery of the linker-SH2 domain of STAT in plants supports the notion that this domain had developed prior to the divergence of the plant and animal kingdoms.
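The ßB core-motif screen described above can be illustrated with a simple pattern search. This is only a minimal sketch, not the BLAST-based procedure used in this study: it reads GXF/YBBR as Gly, any residue, Phe or Tyr, two residues of class B, then Arg. The residue class assigned to B (hydrophobic residues), the function name, and the toy sequences are assumptions made for illustration.

```python
import re

# Assumption: "B" in GXF/YBBR denotes a hydrophobic residue. The exact residue
# class is not defined in the text, so it is exposed as a parameter.
HYDROPHOBIC = "AVLIMFWC"

def find_bb_core_motifs(sequence, b_class=HYDROPHOBIC):
    """Return (start, match) pairs for G-X-[FY]-B-B-R hits (0-based starts).

    A lookahead is used so that overlapping candidates are not skipped.
    """
    pattern = re.compile(r"(?=(G.[FY][%s]{2}R))" % re.escape(b_class))
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

# Toy sequences (hypothetical, for illustration only).
proteins = {
    "toy_src_like": "MAEEWYFGKITRRESERLLGTFLVRESETTKGAYCLSV",
    "toy_no_motif": "MSTNPKPQRKTKRNTNRRPQDVKFPGG",
}
hits = {name: find_bb_core_motifs(seq) for name, seq in proteins.items()}
```

Because a perfect ßB sequence is not necessarily indicative of an SH2 domain, hits from a search like this would still need to pass the secondary structural check.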
| MATERIALS AND METHODS |
|---|
3dpssm/), which predicts α-helices and ß-strands. When the overall α-helix/ß-strand arrangement, rather than the individual amino acid residues around the motif sequences, was evaluated, all predicted α-helices and ß-sheets were correct compared with the structural information obtained from crystallographic analysis.
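Evaluating the overall α-helix/ß-strand arrangement rather than individual residues can be sketched as follows. The example collapses a per-residue prediction string into an ordered list of elements and loosely checks for a helix, three or more strands, then a helix. The H/E/C encoding, the minimum run length, and the function names are assumptions for illustration and do not reflect the output format of any particular prediction server.

```python
import re
from itertools import groupby

def collapse_elements(ss, min_len=3):
    """Collapse a per-residue string (H=helix, E=strand, C=coil) into element order.

    Returns a string of 'a' (helix) and 'b' (strand). Runs shorter than
    min_len are ignored; min_len is an illustrative threshold.
    """
    elements = []
    for state, run in groupby(ss.upper()):
        length = len(list(run))
        if state in ("H", "E") and length >= min_len:
            elements.append("a" if state == "H" else "b")
    return "".join(elements)

def has_abbba_core(ss, min_len=3):
    """True if the element order contains helix, 3 to 5 strands, helix.

    Allowing up to two extra strands leaves room for a ßE or ßE-ßF motif
    before the closing helix.
    """
    return re.search(r"ab{3,5}a", collapse_elements(ss, min_len)) is not None

# Hypothetical prediction string: helix, three strands, helix.
example = "CC" + "H" * 8 + "CC" + "E" * 5 + "CCC" + "E" * 5 + "CC" + "E" * 4 + "CCC" + "H" * 9 + "C"
print(collapse_elements(example), has_abbba_core(example))
```

Working on the collapsed element order makes the comparison insensitive to the exact residue boundaries of each helix or strand, which is the level at which the predictions were evaluated here.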
Cloning and Northern blot of at-STATLa and at-STATLb
Both at-statla and at-statlb genes were cloned from the Arabidopsis cDNA library. Primers from known flanking regions were used to clone the full-length sequences of at-statla and at-statlb. Various primer combinations were designed so that overlapping PCR products covered the full-length sequences of both genes. PCR products were subsequently sequenced, and the gene sequences of both at-statla and at-statlb have been deposited in GenBank. Total RNAs prepared from different parts of Arabidopsis were used for Northern blot analysis.
Western Blotting and Immuno-precipitation
Arabidopsis thaliana, ecotype Columbia, was grown on Murashige and Skoog agar medium at 22 °C under constant light for 3 weeks. Sodium orthovanadate (100 µM) and hydrogen peroxide (1 mM) were added for the indicated times. Whole Arabidopsis plants were homogenized in ice-cold extraction buffer (30 mM Tris-HCl, pH 8.5, 150 mM NaCl, 1 mM EDTA, 20% glycerol, 1 mM dithiothreitol, and proteinase inhibitors). Cell debris was separated from soluble material by centrifugation at 18,000 × g for 5 min. Full-length GST-at-STATLa and GST-at-STATLb-SH2 domain (596–692) were constructed, expressed, and purified as described previously (12). Purified glutathione S-transferase (GST) recombinant proteins were incubated with the above-prepared extracts. Extensively washed GST protein precipitates were then subjected to standard Western blotting procedures using horseradish peroxidase-conjugated anti-pTyr (pY20) and enhanced chemiluminescence (Amersham Biosciences, Piscataway, NJ) for detection.
| RESULTS AND DISCUSSION |
|---|
"αßßßα" structure. The SH2 domains of signaling factors including enzymes and adapters exclusively contain the predicted ßE or ßE-ßF motif, consistent with crystallographic findings (Fig. 1A) (13–16). The small ß-strand, ßE or ßE-ßF motif, has proven to be critical for protein/ligand recognition in pTyr signaling factors (10). However, the ßF fragment is not always detectable or predictable, presumably due to the instability of this sequence in ß-strand formation (15, 17). According to the distance between the ßC and ßD motifs, the SH2 domain-carrying enzymes can be further divided into long ßC-ßD loop and short ßC-ßD loop groups (Fig. 1A). Src family members all belong to the long ßC-ßD group. In most of the adapters, the SH2 domains carry a putative short ßC-ßD loop. For transcription factor STAT, the length of the predicted ßC-αB sequence varies among family members and is overall shorter than that of signaling factors (Fig. 1A). But the most striking difference between Src-type and STAT-type SH2 domains is that STAT or STAT-like SH2 domains do not contain the ßE or ßE-ßF motif. Instead, they all contain the αB′ or a nonsplit αB motif (Fig. 1A), which is considered the critical region for STAT dimerization (16, 18). All these secondary structural features of STAT-type SH2 domains obtained from secondary structural prediction are consistent with the findings of crystallographic studies (16, 18). In Drosophila STAT, the SH2 domain carries the putative αB′ motif despite lacking the ßD motif (Fig. 1A). Two putative STAT-like sequences (ce-STATa and ce-STATb) were previously found by homologous alignment in Caenorhabditis elegans genome data (19). Our secondary structural analysis predicts that these SH2 domains also contain αB′ motifs (Fig. 1A). Therefore, the level of detail obtained here can easily distinguish subtle differences between the SH2 domains of STAT and those of signaling factors.
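The distinction drawn in this paragraph can be written as a small decision rule. The sketch below is an illustration only: it classifies an ordered list of predicted elements by whether a ßE strand or an αB′ helix is present. The ASCII labels and the function name are assumptions, and the rule deliberately does not require ßD, since Drosophila STAT is described as lacking it.

```python
def classify_sh2(elements):
    """Classify an ordered list of predicted element labels.

    Labels follow an ASCII convention assumed for this sketch:
    "aA"/"aB" for the α-helices, "bB".."bF" for the ß-strands,
    "aB'" for the extra helix, and "aB'aB" for a nonsplit helix.
    """
    labels = set(elements)
    if "bE" in labels:  # ßE or ßE-ßF motif present
        return "Src-type"
    if "aB'" in labels or "aB'aB" in labels:  # αB′ or nonsplit helix present
        return "STAT-type"
    return "unclassified"

# Hypothetical element lists for illustration.
print(classify_sh2(["aA", "bB", "bC", "bD", "bE", "bF", "aB"]))  # Src-type
print(classify_sh2(["aA", "bB", "bC", "bD", "aB'", "aB"]))       # STAT-type
print(classify_sh2(["aA", "bB", "bC", "aB'", "aB"]))             # STAT-type (no ßD)
```

A domain that carries neither element is returned as "unclassified" rather than forced into one of the two groups.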
"αßßßα" topology obtained from secondary structural prediction (Fig. 1B) agreed with the conclusion drawn in a previous crystallography study (20). However, unlike Src-type or STAT-type SH2 domains, the CBL SH2 domain lacks the small ßE or αB′ motif. The sequence immediately upstream of the kinase-like domain in JAK has long been suspected to be an SH2-like domain (21). Two-dimensional alignment clearly reveals that this sequence, though ambiguous, represents a typical Src-type SH2 domain (Fig. 1C). Among all the JAK members analyzed, Drosophila JAK contains the longest loop between the ßB and ßC motifs. Similar results were obtained using other prediction programs such as PHD (maple.bioc.columbia.edu/predictprotein) and JPRED (www.compbio.dundee.ac.uk) (data not shown). Thus, secondary structural prediction is particularly suited for, and reliable in, the detection of such fine structural differences between protein motifs with divergent sequences.
STATLs Are Novel Genes Carrying the STAT-like Linker-SH2 Domain in Plants
To locate putative SH2 domains in plants, we used Pat-match to screen the genomes of various eukaryotes for the ßB core motif (see "Materials and Methods"). In the Arabidopsis Information Resource (TAIR; www.arabidopsis.org), we found 604 ßB sequences in 583 putative genes, of which secondary structural analysis confirmed two putative genes containing the typical "αßßßα" structure of an SH2 domain [AC007651 (protein locus: AAD50031); AC007260 (protein locus: AAD30582)]. Subsequent cDNA cloning and DNA sequencing analysis indicate that these two genes are closely related to each other (65% identical at the amino acid level) (Fig. 2A). In both genes, the C termini are longer than, and divergent from, those predicted from the published genomic sequences. The SH2 domains reside in the C-terminal regions, and the predicted secondary structure matches that found in STAT (Fig. 2B). Moreover, a sequence of 90 amino acids immediately N-terminal of the SH2 domain is also well conserved and resembles STAT's linker domain (Fig. 2B) (16, 18, 22). We therefore named the two novel genes STAT-type linker-SH2 domain factor a and b (STATLa and STATLb). The linker-SH2 domain is well conserved in putative STATL genes identified in both monocot and dicot plants including soybean, sorghum, potato, tomato, medicago, wheat, and rice (Fig. 2C). Surprisingly, STATL sequences were also retrieved from lower plants such as the moss Physcomitrella patens (pp-STATL; NCBI translated BLAST searches) and the green alga Chlamydomonas (cr-STATL; Chlamydomonas Resource Center, www.biology.duke.edu/chlamy_genome/crc.html) (Fig. 2C). The ubiquitous presence of STATL in plants suggests that the SH2 domain plays a fundamental role in plants and animals and rejects the possibility that this domain originated in plants through some accidental means (i.e. reverse horizontal gene transfer) or coevolved with tyrosine kinases or the SH3 domain (23).
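An identity figure such as the 65% reported for the two at-STATL proteins is computed over an alignment. The sketch below shows one common convention, counting identical residues over the columns where neither sequence has a gap. The choice of denominator, the function name, and the short example strings are assumptions for illustration; the convention behind the reported value is not stated in the text.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Hypothetical aligned fragments (illustration only): 6 of 8 columns identical.
print(percent_identity("GTFLVRES", "GTFLLRFS"))  # 75.0
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values are only comparable when the same denominator is used.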
αA motif in STATL is also a lysine rather than an arginine, which coordinates with a phosphate group in metazoan STAT (16).
The secondary structure of STAT's linker-SH2 domain (the five α-helices and the "αßßßα" sandwich) (16, 18) is well conserved in at-STATL proteins according to our prediction (Fig. 2B). When aligned with STAT, sequence gaps in at-STATLa and at-STATLb occur at neutral positions and do not interrupt the arrangement of α-helices and ß-strands. For instance, a 9-aa stretch (ENMAGKGFS), absent in the linker domain of at-STATLa or at-STATLb, forms an out loop between α9 and α10 in the linker domain, which does not interrupt the helicity between α9 and α10 (Fig. 2B) (16). For all STATL sequences obtained from different plants, the SH2 domains do not carry the ßE motif. Instead, they exclusively carry the αB′ motif or a nonsplit αB′αB (Fig. 2C), which matches the secondary structural characteristics of the STAT SH2 domain. The lack of a ßD motif coupled with the presence of a large nonsplit αB in the SH2 domain of cr-STATL in Chlamydomonas may indicate a premature form of the SH2 domain that arose during its development. The region upstream of the linker-SH2 domain (Ser139-Pro232 in both at-STATLa and at-STATLb) is predicted to form a continuous ß-sheet (Fig. 2D), which has some similarity to the DNA-binding domain of dd-STATa or dd-STATc but not that of human STAT (16, 18, 24). Therefore, the strong similarity between STAT and STATL within their linker-SH2 domains at both the amino acid sequence and secondary structure levels strongly indicates that these two domains share a common ancestor that evolved prior to the divergence of plants and animals.
Messenger RNAs of both genes were detected ubiquitously in different parts of Arabidopsis (Fig. 2E). Plants are not known to contain JAK-like nonreceptor tyrosine kinases (25). However, receptor protein kinases, serine/threonine plus tyrosine dual-function protein kinases, as well as protein phosphatases exist in plants (13, 26, 27). In Fig. 2F, tyrosine-phosphorylated proteins ranging in size from 60 to 120 kDa were detected in Arabidopsis treated with vanadate, a naturally occurring transition metal that can function as a nonspecific protein phosphatase inhibitor and trigger protein tyrosine phosphorylation in cells (28). Purified GST-STATLa-full length and GST-STATLb-SH2 domain proteins but not the GST control (Fig. 2G, right panel) were able to pull down the 120-kDa tyrosine phosphorylated protein in vanadate-treated samples but not in the samples without vanadate treatment (Fig. 2G, left panel). Therefore, the SH2 domain of STATL proteins might be involved in pTyr-dependent protein-protein interaction in plants.
SHK's Linker-SH2 Domain Is Homologous to That of STAT in Dictyostelium
How does the STAT-type linker-SH2 domain relate phylogenetically to the tyrosine signaling- or Src-type SH2 domain, which quickly expanded in number in animal cells? To answer this question, we studied the database of Dictyostelium, a slime mold considered more closely related to fungi and animals than to plants (29). SH2-bearing genes cloned from Dictyostelium include two transcription factors (i.e. STATa and STATc) and one signaling factor (i.e. SHK1) (4–6). From the Dictyostelium discoideum Genome Project (www.sanger.ac.uk/Projects/D_discoideum), two additional putative STAT sequences, designated as dd-STATb and dd-STATd, and four additional putative SHK genes, designated as dd-SHK2, dd-SHK3, dd-SHK4, and dd-SHK5, were identified using the ßB core motif sequence as well as the whole SH2 domain for BLAST searches. Like SHK1, the protein kinases of these novel putative SHK members are most closely related to the protein kinases found in plants (6). However, these same kinases in plants are not conjugated to any SH2 or SH2-like sequences. Using the kinase domain (C-region) of the putative SHK2 for a BLAST search, a large number of homologous expressed sequence tags (ESTs) were identified in the genomes of both Arabidopsis and Dictyostelium, but not in the databases of other eukaryotes (not shown). This suggests a close evolutionary relationship between plants and Dictyostelium.
Primary and predicted secondary structure alignment indicates that the SHK SH2 domains carry some features of the STAT SH2 domains in Dictyostelium. Phenylalanine (F) has been found to follow the ßB motif in the SH2 domains of SHK1, SHK2, and SHK3 (Fig. 3A). However, secondary structural modifications that contrast with STAT have been noted in SHK's SH2 domain. In the region between the predicted ßD and αB motifs, the αB′ motif has been replaced by the ßF or ßE-ßF motif. The introduction of ßE in SHK5 lengthened the distance between the αA and αB motifs (Fig. 3A). This seems to reflect the trend of SH2 maturation because the distance between the αA and αB motifs has become even longer in the Src SH2 domain in metazoans (Fig. 1A). In this region, both sequence similarities and gaps were observed, suggesting that the accumulation of favorable mutations, insertions, and deletions might all contribute to the α-helix/ß-sheet switch (Fig. 3A). Thus, this αB′/ßE-ßF-containing region in the SH2 domain serves as an evolutionarily active region (EAR) within an otherwise conserved domain essential for its function. When STATc's linker domain was used for a BLAST search, the sequence between the protein kinase domain and the SH2 domain (the linker) of SHK was recovered, suggesting a close relationship among these molecules within this region. SHK's linker domain is predicted to contain an α-helix repeat composed of α7 to α10 motifs, which is indeed homologous to that of STAT (Fig. 3B). The three C-terminal α-helices (α9, α10, and α11) formed a large α-helix immediately upstream of the SH2 domain. While the linker domains of most SHK members are relatively similar in size, SHK5's linker domain is apparently much longer due to homopolymerism, a characteristic of many genes found in Dictyostelium (4, 7). Compared with STAT or STATL, the predicted helical character of the linker domain is degenerating in SHK (Fig. 3B), perhaps due to a functional regression of this domain in tyrosine signaling.
α-helix or α/ß-hybridized motif in the region of the ßE motif (Fig. 3C), suggesting that this region is a putative EAR. Although a weak sequence homology exists between the STAT linker domain and the SH3 domain (2), the presence of well-developed SH3 domains as well as the fully developed linker-SH2 domain in independent genes in plants indicates that the SH3 domain is unlikely to have evolved from the linker domain. While SH3 domain-carrying genes exist in plants and quickly multiplied in number in lower eukaryotes such as Dictyostelium and yeast, SH3-SH2 domain conjugation has not been discovered in these organisms.
SPT6 Gene Carries a Putative Immature Linker-SH2-like Domain
We next studied the yeast genome, in which no typical SH2 domains were identified (30). Using the same approach, we identified 89 ßB sequences from 86 putative genes in the Saccharomyces genome database (genome-www.stanford.edu/Saccharomyces). The suppressor of Ty 6 gene (SPT6) attracted our attention after secondary structural analysis of all these sequences. SPT6 was previously reported in yeast and animal cells and is involved in transcriptional initiation and DNA/RNA binding (31, 32). Using the yeast SPT6 protein sequence for a BLAST search, putative SPT6 genes were identified in plants, Dictyostelium, and other eukaryotes (NCBI BLAST searches), suggesting that SPT6 is also an ancient gene that existed prior to the divergence of plants and animals. The conserved third residue should be either Phe or Tyr in the standard ßB sequence GXF/YBBR (Fig. 4); however, sc-SPT6 of yeast is the only one to follow this rule (Fig. 4). Nevertheless, the putative αßßßα-like structure, albeit less typical, is well maintained in all SPT6 proteins according to our secondary structural prediction, supporting the suspicion of a degenerate SH2 domain in SPT6 genes (33, 34). The αA, ßB, and ßC motifs that compose the evolutionarily inactive region in the SH2 domains are conserved in SPT6 proteins regardless of origin. The putative αB′αB helical structure in SPT6 does not split as it does in STAT's SH2 domain. In the suspected EARs of most SPT6 proteins, a short ß-strand and an α-helix are predicted, and the ßD motif was not fully extended according to the prediction analysis. Such a poorly developed putative ßD motif may not efficiently form an anti-parallel ß-sheet with ßB and ßC, and may eventually hamper its function in protein-protein interactions. The large Lys/Glu-rich helical structure in the linker domain of SPT6 resembles the ancient STATL-like α9-α10-α11 continuous α-helices that became three discrete helices in the linker domain of human STAT or were discontinued in SHK (Figs. 2A and 3B). Moreover, in the region upstream of the linker domain, SPT6 contains a putative 6-ß-strand repeat resembling the DNA-binding domain in both STAT and at-STATL (not shown). Interestingly, our secondary structure prediction analysis indicates that SPT6 and cr-STATL of the unicellular plant Chlamydomonas share some critical features at the secondary structural level. Both are predicted to bear a large nonsplit αB′αB motif and a poorly developed ßD motif (Figs. 2B and 4). Therefore, it is possible that SPT6 and STAT are evolutionarily related.
" rather than "αßßßα" topology is present in all those SH2-like sequences. Therefore, a perfect ßB sequence does not necessarily reveal an SH2 domain structure. In contrast, as long as the "αßßßα" secondary structure is maintained, the amino acid sequence can be very divergent. The balance between evolution and conservation in SH2 domain development reflects evolution at the primary structural level but conservation at the secondary structural level. The discovery of STAT linker-SH2 domain-bearing genes in plants underscores the proposal that SH2 domain development was an early step in the evolution of multicellularity. The SH2 domain most likely formed in a common eukaryote ancestor prior to the divergence of any of the major eukaryote taxa. Hence, the linker-SH2 domain of the transcription factor STAT was placed on the center stage of transcriptional activation prior to the development of the SH2 domain into pTyr signal transduction (36, 37).
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Published, MCP Papers in Press, April 7, 2004, DOI 10.1074/mcp.M300131-MCP200
1 The abbreviations used are: SH2, Src homologous 2; SH3, Src homologous 3; SHK, SH2 domain-bearing protein kinase; SPT6, suppressor of Ty 6; STAT, signal transducer and activator of transcription; STATL, STAT-like; pTyr, phosphotyrosine; GST, glutathione S-transferase; EAR, evolutionarily active region.
* This work was supported in part by National Institutes of Health Grant RO1 CA82549 (to E. Y. C.) and by Grant RR-15578 from the Center of Biomedical Research Excellence (COBRE) Program of the National Center for Research Resources to Brown University. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Current address: Department of Gynecology, Yale University School of Medicine, New Haven, CT 06510.
¶ Current address: Department of Plant Biology, Cornell University, Ithaca, NY 14850.
¶¶ To whom correspondence should be addressed: Departments of Surgery Science and Pathology & Laboratory Medicine, Brown University School of Medicine, Providence, RI 02903. Tel.: 401-444-0172; Fax: 401-444-3278; E-mail: y_eugene_chin{at}brown.edu
| REFERENCES |
|---|
-induced cytoplasmic protein tyrosine kinase(s). Cell 70, 323–335
receptor 1-TRADD signaling complex to inhibit NF-κB activation. Mol. Cell Biol. 20, 4505–4512
interferon-driven transcription. Mol. Cell Biol. 19, 5106–5112
-mediated transcription in Saccharomyces cerevisiae. Mol. Cell Biol. 7, 679–686