From the Department of Pharmaceutical Chemistry, University of Kansas, Lawrence, Kansas 66047, Bioinformatics Core Facility, University of Kansas, Lawrence, Kansas 66045, ¶ Stowers Institute for Medical Research, Kansas City, Missouri 64110, and || Max Planck Institute for Informatics, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany
| ABSTRACT |
|---|
The potentially wide ranging functions that polyanionic molecules could perform, their nonspecific versus specific binding to cellular proteins, and the highly crowded environment of cells suggest a number of fundamental questions. It seems quite possible that the crowded polyanionic environment of cells serves as some kind of functional surface/network where many (but by no means all) cellular events such as protein folding, protein-protein interactions, regulation, metabolism, and trafficking occur. Thus, we ask here whether it is plausible that cellular polyanionic surfaces might provide a matrix to direct the organization and function of a network of PABPs. We hypothesize that the cell can be viewed as a dense, functional network of polyanionic surfaces that facilitate, direct, and/or regulate certain cellular events through a gradient of relatively nonspecific interactions with PABPs. To investigate this hypothesis, we chose five model cellular polyanions to probe protein arrays containing 4,087 unique yeast proteins (23) and subsequently analyzed their potential interactions and functional significance.
| EXPERIMENTAL PROCEDURES |
|---|
Biotinylation of Polyanions
Actin and Tubulin
Actin was dialyzed against 10 mM PBS overnight to avoid a cross-reaction between the Tris salt contained in its lyophilized form and the biotinylation reagent. Tubulin was dialyzed against two exchanges of 10 mM PBS for 3 h. Actin and tubulin (~2.0 mg/ml) were biotinylated according to the manufacturer's instructions using a biotin-XX-sulfosuccinimidyl ester (5 nmol/µl) included in the Invitrogen protein array kit. Excess biotin was removed using a gel filtration resin provided in the kit. The biotinylation efficiency was assessed by performing Tris-glycine SDS-PAGE and a Western blot of the biotinylated polyanions and the provided reference proteins according to standard protocols. Detection and visualization of samples were performed using a streptavidin-alkaline phosphatase conjugate and a chemiluminescent substrate, respectively.
Heparin and HS
Compounds were dissolved in 0.1 M MES buffer to a final concentration of ~4.0 mg/ml. The biotinylation reaction was conducted by addition of 25 µl of biotin hydrazide in dry DMSO and 12.5 µl of freshly prepared EDC with final concentrations of 50 and 5.0 mM, respectively, to the MES/heparin or MES/HS solution. Solutions were stirred overnight at room temperature. The carbodiimide chemistry coupled biotin-LC-hydrazide to heparin and HS through uronic acids (carboxylate groups) (24). Excess biotin was removed using either dialysis cassettes (molecular weight cutoff 3,500) or polyacrylamide desalting columns (Pierce). Biotinylation efficiency was assessed by performing a dot blot on nitrocellulose membranes (Schleicher & Schuell). Approximately 2 µl of each biotinylated sample was placed on the membrane, and the blot was air-dried. Biotinylated tubulin or BSA (both at a molar ratio of 9:1) and non-biotinylated polysaccharide or buffer were used as positive and negative controls, respectively. This procedure was repeated three times followed by blocking and washing of the membrane according to the Western blot protocols. Detection and visualization of samples were performed as above.
Calf Thymus DNA
A final concentration of 200 µM EZ-link psoralen-PEO3-biotin in doubly deionized H2O was added to a 1 mg/ml solution of DNA in TE buffer (10 mM Tris, 1 mM EDTA, pH 7.4) after which the DNA was boiled and immediately cooled in a dry ice/ethanol bath. The reaction centrifuge tube was placed on ice and irradiated with a long wavelength UV source (Mineralight lamp, UVP Inc., San Gabriel, CA) for 40 min. The biotinylated solutions of DNA were stored at −20 °C. Detection and visualization of biotinylated DNA were performed through a dot blot assay as described above.
Quantification of Polyanions
Actin and Tubulin
Final concentrations were determined by UV spectroscopy at 290 nm for actin and 280 nm for tubulin using extinction coefficients of 2.66 × 10⁴ (25) and 1.15 × 10⁵ M⁻¹ cm⁻¹,² respectively, with a cell pathlength of 1.0 cm.
Heparin and HS
Polysaccharide concentrations were determined using the metachromatic dye azure A (26). A stock solution of azure A was prepared in 0.2% (v/v) aqueous formic acid, pH 3.5, at a final concentration of 5.04 × 10⁻⁵ M. One milliliter of the dye stock solution was titrated with 1–10 µl of 1.0 mg/ml polysaccharide solution prepared in 10 mM PBS and vortexed (27). The absorbance of each sample was measured at 620 nm where the extent of disappearance of azure A was followed as a function of polysaccharide concentration. In each case, the non-biotinylated heparin and HS were used as standards for the biotinylated samples. UV absorbance confirmed that azure A was bound to both biotinylated and non-biotinylated forms of heparin and HS. These experiments were conducted after observing that biotinylated HS failed to bind to another metachromatic dye, 1,9-dimethylmethylene blue. This is in contrast to the non-biotinylated form, which shows binding to this dye.
Calf Thymus DNA
DNA was quantified using an optical density of 1.0 for a 50 µg/ml solution at 260 nm.
Protein Array Experiments
The yeast protein microarrays contain 4,087 purified proteins from Saccharomyces cerevisiae ORFs that are expressed as N-terminal GST-His6 fusion proteins, purified, and spotted in duplicate on a nitrocellulose-coated glass slide (23). These arrays were probed with the biotinylated polyanions in the current study. Due to the high background associated with biotinylated heparin and HS with this type of coating, modified surface chemistry arrays from Invitrogen were utilized. According to the manufacturer's protocol, the arrays were probed with 120 µl of a 25–50 µg/ml concentration of each biotinylated polyanion and exposed to streptavidin-Alexa Fluor® 647 followed by subsequent washings, drying, and scanning at 635 nm to acquire the fluorescence image. Protein arrays were scanned using a GenePix 4000B microarray scanner (Molecular Devices Co., Sunnyvale, CA) at 635 nm with 10-µm pixel size. The results were analyzed using GenePix® Pro 5.0 software. Each image was analyzed using the appropriate grid available from Invitrogen. To investigate the nature of any polyanion-protein interactions, each array, already probed with the biotinylated probes, was incubated with three different concentrations of NaCl solution (0.05, 0.15, and 0.75 M) in the probing buffer for 30 min, and the number of positive protein signals was reanalyzed after each experiment using the appropriate grid. For all scanned images at 0–0.75 M NaCl, the program "Prospector" (Invitrogen) was utilized to identify significant polyanion-protein interactions. Interactions were considered statistically significant when the mean signal was greater than the median signal of all protein spots on the array plus one standard deviation (+1S.D.). This +1S.D. threshold corresponds to a Z-score greater than 1. At this threshold, interacting proteins with low, medium, and high confidence levels are included in the dataset. More stringent thresholds of +2S.D. and +3S.D. were used to increase the confidence level of the interactions. We decided to include the greatest number of proteins involved in any type of binding to polyanions (specific or nonspecific) and, therefore, chose to study PABPs at the level of +1S.D. in most cases unless indicated otherwise. All results and data analyses were obtained from protein array experiments that included 0.15 M NaCl (corresponding to physiological ionic strength) unless indicated otherwise.
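The median-plus-S.D. criterion described above can be sketched in a few lines of Python. This is only an illustration of the thresholding rule, not the Prospector implementation: the function name `significant_spots`, the use of the sample standard deviation, and the toy signal values are all assumptions introduced here.

```python
import statistics

def significant_spots(signals, k=1.0):
    """Return (hits, cutoff) for spots whose signal exceeds median + k * S.D.

    `signals` maps a spot/ORF name to its mean fluorescence signal.
    Assumption: the S.D. is the sample standard deviation over all spots.
    """
    values = list(signals.values())
    cutoff = statistics.median(values) + k * statistics.stdev(values)
    hits = sorted(name for name, s in signals.items() if s > cutoff)
    return hits, cutoff

# Hypothetical signals for five spots (not data from the study).
toy_signals = {"A": 100, "B": 110, "C": 90, "D": 105, "E": 900}
hits_1sd, cutoff_1sd = significant_spots(toy_signals, k=1.0)  # +1S.D.
hits_3sd, cutoff_3sd = significant_spots(toy_signals, k=3.0)  # +3S.D.
```

Raising `k` from 1 to 3 mirrors the move from the +1S.D. to the +3S.D. level: the cutoff rises and fewer spots are retained.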
Competition Experiments with Dextran Sulfate (DS)
An experiment with a high concentration of DS (6 mM) was conducted that differed from the NaCl experiments in two ways. 1) Each array was exposed to DS for 24 h in contrast to the 30-min exposure time for the NaCl experiments, and 2) each experiment was conducted at room temperature as opposed to 4 °C for the NaCl experiments. Following the incubation of protein arrays with DS, they were washed, dried, and scanned, and the resulting images were analyzed using GenePix 5.0.
Searching for Positively Charged Amino Acids in the Sequence of PABPs Using MATLAB
An in-house MATLAB program (version 7.0.4; The MathWorks, Inc., Natick, MA) was used to search for the presence of the basic amino acids Lys, Arg, and His in PABPs (test set) and to calculate their amount (percentage) in each protein. Additionally, this program was used to analyze the percentage of basic amino acids in the population of all yeast proteins that were spotted on the array (population set). The translated protein sequence for each ORF was obtained from the S. cerevisiae Genome Database (SGD) batch download tool (db.yeastgenome.org/cgi-bin/batchDownload). The results from the test and population sets were compared utilizing a one-sample t test. This test calculates a two-tailed p value between the means of one particular basic amino acid percentage for proteins that interacted with each of the five different probes used and the same amino acids in the population set (QuickCalcs, GraphPad Software, Inc.). Furthermore, in an effort to determine whether the enrichment of positive residues was more pronounced locally, we used a Perl script to calculate the number of positive patches in PABPs and other yeast proteins. A positive patch is defined as a sequence run containing three or more positive residues in any five contiguous residues.
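The two sequence measures described in this section can be illustrated with a short Python sketch. It is not the in-house MATLAB or Perl code. The function names are hypothetical, and the way overlapping windows are merged into a single patch is an assumption, because the counting rule is not spelled out in the text.

```python
BASIC = set("KRH")  # Lys, Arg, His

def percent_basic(seq):
    """Percentage of Lys, Arg, and His in a protein sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return {aa: 100.0 * seq.count(aa) / len(seq) for aa in "KRH"}

def count_positive_patches(seq, window=5, min_basic=3):
    """Count positive patches in a sequence.

    A window of `window` contiguous residues qualifies when it holds at
    least `min_basic` basic residues. Assumption: consecutive qualifying
    windows are merged and counted as one patch.
    """
    seq = seq.upper()
    patches = 0
    in_patch = False
    for start in range(max(len(seq) - window + 1, 0)):
        n_basic = sum(res in BASIC for res in seq[start:start + window])
        if n_basic >= min_basic:
            if not in_patch:
                patches += 1
            in_patch = True
        else:
            in_patch = False
    return patches

# Hypothetical sequences for illustration only.
composition = percent_basic("KKRH" + "A" * 6)
n_patches = count_positive_patches("AAKKRAAAAAAAAHKHAA")
```

Counting every qualifying window separately would be an equally defensible reading of the definition; only the merged variant is shown here.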
pI Calculations
The code for the program for theoretical calculations of pI values for yeast PABPs was obtained from the code author.³ This program was used to generate and compare the theoretical pI values for the test set as well as the yeast protein population on the array. The results of the test and population sets were compared utilizing a one-sample t test.
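For readers unfamiliar with theoretical pI calculations, the general approach can be sketched as follows. This is not the program used in the study, whose algorithm and parameters are not described here. The pKa values below are one illustrative set chosen for this sketch, and the function names are hypothetical. The pI is found by bisection as the pH at which the Henderson-Hasselbalch net charge crosses zero.

```python
# Illustrative pKa values (assumptions, not taken from the study).
POS_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEG_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6

def net_charge(seq, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # free N terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # free C terminus
    for aa in seq.upper():
        if aa in POS_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POS_PKA[aa]))
        elif aa in NEG_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEG_PKA[aa] - ph))
    return charge

def theoretical_pi(seq, tol=1e-4):
    """pH at which the net charge is zero, located by bisection on [0, 14]."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid   # still positively charged, so the pI lies higher
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical peptides: a basic one and an acidic one.
pi_basic = theoretical_pi("KKKKRR")
pi_acidic = theoretical_pi("DDEE")
```

Because the net charge decreases monotonically with pH, bisection is sufficient. Different pKa tables shift the absolute pI values somewhat, so comparisons between a test set and a population set should use the same table throughout.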
Searching for a Sequence Signature in PABPs Using MEME and InterPro
Sequence Signature Analysis
A sequence signature is defined as a highly conserved region, a sequence pattern that is found repeatedly in a group of related sequences. By this definition, a sequence signature could be a protein family, functional domain, functional site, or any conserved region of unknown function. Thus, the actual physical manifestation of a signature can vary greatly in size (28). In this study, sequence signatures were derived from MEME motifs. Signature analysis was performed for the sequences of the yeast PABPs binding to one or more polyanion probes using MEME (version 3.0.10) (29). MEME implements an unsupervised learning algorithm for discovering gapless signatures in a group of related protein sequences. In this study, we chose the maximum number of signatures to be 50 and minimum and maximum width of the signatures to range from 5 to 300 amino acids, respectively. The E-value cutoff was set to 0.1 to report only statistically significant results. In addition, the type of distribution was set to zero or one occurrence per sequence, and the minimum number of sites was set to three. Therefore, only signatures that were shared by three or more PABPs were discovered by the MEME search.
Sequence Signature Characterization
The consensus sequence of each signature model was searched for in all InterPro member databases using an on-line version of InterProScan (www.ebi.ac.uk/InterProScan/). InterPro is an integrated collection of the most commonly used databases of protein families, domains, and functional sites (30). The program InterProScan allows a user to search for sequence signatures in any number of these databases simultaneously.
Additionally in an effort to determine the distribution of known polyanion binding domains (protein domains that bind actin, ATP, DNA, fibronectin, and RNA) in PABPs, we used domain information for S. cerevisiae in the Integr8 web portal (www.ebi.ac.uk/integr8/FtpSearch.do?orgProteomeId=40). A list of yeast domains that bind polyanions such as ATP (GO: 0005524), actin (GO: 0003779), DNA (GO: 0003677), and RNA (GO: 0003723) was obtained from the InterPro database (www.ebi.ac.uk/interpro/) (fibronectin did not result in any hits). The rationale for this investigation was to relate any identified sequence signature in PABPs to known domain data.
Investigation of the Three-dimensional Structures of PABPs Using Macroscopic Electrostatic Models and Protein Continuum Electrostatics
Our list of 529 unique PABPs was compared against a list of 5,140 yeast proteins in www.expasy.org/cgi-bin/lists?yeast.txt. We utilized protein continuum electrostatics (PCE), developed by Miteva et al. (31), which is a web tool based on macroscopic electrostatics with atomic details to calculate the electrostatic properties of PABPs. Submitting a protein crystal structure to the PCE server enables calculation of surface potentials and generation of electrostatic energies via finite difference solutions to the Poisson-Boltzmann equation (31). The crystal structures of the matched proteins were submitted to the PCE server (bioserv.rpbs.jussieu.fr/PCE) with the default values of 4.0 and 80.0 for protein internal and solvent dielectric constant, respectively. A value of 0.15 was used for the ionic strength, and a range of −3.0 (red) to +3.0 kcal/mol (blue) was chosen for the electrostatic potential distribution surfaces of the matched proteins. Prior to the submission of structures to the PCE server, the crystal structures were examined to ensure that they contained only the most native-like conformation with the longest amino acid chains and no ligands and/or mutations. Additionally, structures with the highest available resolution and the best R-factors were preferred when selecting among the solved PABP structures. A set of random non-PABPs was also selected and subjected to the same treatment to compare their electrostatic properties with those of PABPs.
Relationship between Phosphorylation and PABPs
The ORF list of kinase substrates in yeast was obtained from the supplemental information published by Ptacek et al. (32) and was compared with the lists of PABPs as well as the proteins that did not interact with our probes (non-PABPs) using MATLAB (version 7.0.4). Statistical analysis of the results was conducted using Fisher's exact test.⁴
PABPs and Intrinsic Protein Disorder
A sequence analysis of PABPs for the extent of their disorder was conducted utilizing the pipeline interface of DisEMBL (33). Based on artificial neural networks with training on three different datasets, this program predicts disorder based on various definitions. In this study, MATLAB was used to calculate the percentage of "hot loops" for PABPs (for three statistical levels), hubs (we have defined hubs as proteins with more than 15 interacting partners), and non-PABPs from DisEMBL output files, respectively. An unpaired, two-sample, unequal variance t test was used to investigate the statistical difference between the means of percentages of hot loops for PABPs and non-PABPs.
Essential Genes, Protein Homologues, and PABPs
A database was created to include the names of all essential genes in yeast (34). These data have demonstrated that ~18% of the genes in yeast are essential. We created a query between the ORFs of essential genes and PABPs to investigate how many of the PABPs were expressed by essential genes. Statistical analysis of the results was conducted using Fisher's exact test.⁴
For a protein homologue study, BLAST 2.2.13 was downloaded from the National Center for Biotechnology Information (NCBI) server and installed on a PC workstation. All-against-all BLAST searches were conducted on the yeast proteome to determine the number of homologues of yeast PABPs versus the whole yeast proteome. The default BLOSUM62 scoring matrix was used in the BLAST searches, and the E-value cutoff was set to 0.01 to ensure only statistically significant homologues were identified.
Visualization of Potential PABP Networks
Cytoscape (version 2.2), an open-source program, provided a useful tool with which to visualize PABP interaction networks (35). In an attempt to build a cellular network between our PABPs and other proteins from different databases, we used Saccharomyces high quality data (YeastHighQuality.sif) as the sample data in Cytoscape, which is based on the known interactions in the Biomolecular Interaction Network Database (36). This dataset contained 3,025 proteins (nodes) and 6,886 protein-protein interactions (edges). We also generated another database from the published interactions of von Mering et al. (37) and considered only the high and medium confidence protein-protein interactions (2,617 proteins and 11,855 interactions). We specified two aims in this section: 1) to observe the extent of interaction between PABPs serving as hubs and other yeast proteins and 2) to investigate the presence of any PABP network(s) that formed a coherent, functionally rich network (see "Searching for Enriched Functional Proteins in PABPs Utilizing Biological Network Gene Ontology tool (BiNGO)").
To address the first aim, a query was created to obtain the number of interacting partners for each PABP ORF in the von Mering et al. (37) dataset. Because there is no formal definition of a hub, we defined high and low connection hubs as proteins with more than 40 and 15 interacting partners, respectively, in this portion of our study.
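The hub classification can be sketched as a degree count over an interaction list. The code below is an illustration rather than the database query actually used. The function names and the toy edge list are hypothetical, and treating the two hub classes as disjoint (low: more than 15 but at most 40 partners) is an assumption made here for clarity.

```python
from collections import defaultdict

def degree_table(edges):
    """Map each protein to its number of distinct interaction partners."""
    partners = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            partners[a].add(b)
            partners[b].add(a)
    return {protein: len(found) for protein, found in partners.items()}

def classify_hubs(edges, pabps, high=40, low=15):
    """Split PABPs into high (> high partners) and low (> low partners) hubs."""
    degrees = degree_table(edges)
    high_hubs = sorted(p for p in pabps if degrees.get(p, 0) > high)
    low_hubs = sorted(p for p in pabps if low < degrees.get(p, 0) <= high)
    return {"high": high_hubs, "low": low_hubs}

# Hypothetical network: HUB has 41 partners, MID has 16, LOW has 1.
toy_edges = ([("HUB", f"P{i}") for i in range(41)]
             + [("MID", f"Q{i}") for i in range(16)]
             + [("LOW", "P0")])
hubs = classify_hubs(toy_edges, pabps={"HUB", "MID", "LOW"})
```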
Searching for Enriched Functional Proteins in PABPs Utilizing Biological Network Gene Ontology Tool (BiNGO)
BiNGO, an implemented plug-in for Cytoscape, was used to determine the statistical overrepresentation of gene ontology (GO) categories in the yeast PABP network (38). This tool permits mapping of the statistically predominant functional categories of genes or proteins based on the GO hierarchy (38). BiNGO uses a hypergeometric test (as opposed to a binomial one) in which sampling occurs without replacement to search for the statistically predominant functional categories in terms of p values and GO diagrams. Such diagrams contain color-coded nodes of different sizes based on p values that create an easy-to-follow visualization. This program was used to identify a significant functional category for the yeast PABPs as well as their first neighbors. A similar sized set of random proteins was also generated and tested using BiNGO to ensure the significance of the results generated from the experimental set.
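The hypergeometric test mentioned above can be written out directly. The sketch below is not BiNGO's code. The function name and argument names are hypothetical, and no multiple-testing correction is included. It computes the probability of seeing at least `k` annotated proteins when `n` proteins are drawn without replacement from a reference set of `N` proteins, `K` of which carry the GO annotation.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """Upper-tail hypergeometric p value P(X >= k).

    N: proteins in the reference set
    K: reference proteins annotated with the GO category
    n: proteins in the test set (for example, a PABP cluster)
    k: test-set proteins annotated with the GO category
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Toy numbers: all 5 drawn proteins are annotated, 5 of 10 in the reference.
p_toy = hypergeom_enrichment_p(N=10, K=5, n=5, k=5)
```

Sampling without replacement is what distinguishes this test from the binomial alternative: each protein drawn into the test set changes the composition of what remains in the reference set.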
Investigating PABP Networks
The yeast interactome was compiled by combining the yeast protein-protein interaction networks from the databases BioGrid (www.thebiogrid.org), Comprehensive Yeast Genome Database (mips.gsf.de/proj/yeast/CYGD/interaction), and IntAct (www.ebi.ac.uk/intact/index.jsp). Only the experimentally detected binary physical interactions were selected from each database. Regarding a specific topological property of subnetworks formed by PABP datasets, we investigated the number of protein-protein interactions involving at least one PABP (39).
We constructed 50,000 random subnetworks by arbitrarily picking the same number of proteins as in the respective PABP dataset and including their interaction partners. We used the resulting histogram of the number of interactions contained in each random subnetwork to estimate a p value quantifying the significance of the subnetwork for the respective PABP dataset. In addition, a 95% binomial confidence interval for the p value was calculated. To give a concrete example of our evaluation procedure, there were 108 PABPs in the dataset detected at the statistical level of +3S.D., and 103 of them were present in the compiled yeast interactome. Thus, we generated 50,000 different random subnetworks of 103 proteins and their direct interaction partners. We then counted the protein-protein interactions in each subnetwork to derive a histogram.
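The resampling procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study. The function names are hypothetical, the p value is taken as one-sided (the fraction of random subnetworks with at least as many interactions as observed), and the Wilson score interval is used as one possible 95% binomial confidence interval because the specific interval method is not named in the text.

```python
import math
import random

def count_interactions(edges, proteins):
    """Number of interactions involving at least one protein of the set."""
    chosen = set(proteins)
    return sum(1 for a, b in edges if a in chosen or b in chosen)

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (95% for z = 1.96)."""
    phat = successes / trials
    denom = 1.0 + z * z / trials
    centre = (phat + z * z / (2 * trials)) / denom
    half = z * math.sqrt(phat * (1 - phat) / trials
                         + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

def subnetwork_p_value(edges, pabps, n_random=50000, seed=0):
    """Empirical p value for the interaction count of a PABP subnetwork."""
    rng = random.Random(seed)
    nodes = sorted({p for edge in edges for p in edge})
    node_set = set(nodes)
    present = [p for p in pabps if p in node_set]  # PABPs in the interactome
    observed = count_interactions(edges, present)
    hits = sum(
        count_interactions(edges, rng.sample(nodes, len(present))) >= observed
        for _ in range(n_random)
    )
    return observed, hits / n_random, wilson_interval(hits, n_random)

# Hypothetical star network: protein "H" interacts with 20 partners.
toy_edges = [("H", f"L{i}") for i in range(20)]
observed, p_emp, (ci_low, ci_high) = subnetwork_p_value(
    toy_edges, ["H", "NOT_IN_NETWORK"], n_random=2000)
```

In the worked example from the text, `pabps` would hold the 108 proteins detected at +3S.D., of which 103 are present in the interactome, and `n_random` would be 50,000.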
| RESULTS AND DISCUSSION |
|---|
630 nm (SF 2). Detailed results of quantification of heparin and HS are reported in the supplemental material.
Protein Array Experiments
To identify PABPs in yeast, arrays containing 4,087 yeast proteins were used. Examples of images created from different protein arrays are shown in the supplemental material (SF 3). The 16 yeast proteins that interacted with all five polyanion probes at the physiological NaCl concentration of 0.15 M and +1S.D. are shown in Table I (more information is in Supplemental Table 1 (ST 1)). Table II includes the total number of PABPs that were bound to polyanions at the statistical level of +1S.D. (see "Protein Array Experiments" under "Experimental Procedures"). Using this approach, we identified 135, 60, 264, 348, and 86 yeast proteins interacting with actin, tubulin, heparin, HS, and DNA, respectively. Furthermore, significant overlaps were observed in which some PABPs interacted with more than one polyanion. For example, 216, 88, 46, and 16 proteins interacted with at least two, at least three, at least four, and all five probes, respectively. A comprehensive list of the standard ORFs of proteins that interacted with the individual polyanion probes is available in the supplemental tables (ST 2).
50% decrease in the number of yeast proteins that were bound to actin, tubulin, heparin, HS, and DNA (Fig. 1). It was unexpected, however, that the number of DNA-binding proteins increased after incubation with DS (Fig. 1A). This apparent contradiction is probably due to the large standard error of this particular experiment because in the subsequent panels (B and C) this result was not observed. Although the nature of polyanion-protein binding is generally considered to be primarily electrostatic (40, 41), these experiments support the idea that polyanion-protein interactions also involve other non-coulombic types of intermolecular interactions (e.g. hydrophobic interactions and hydrogen bonding) (40, 41).
Searching for Positively Charged Amino Acids in the Sequence of PABPs
It is already well established that proteins that interact with polyanions contain localized regions with high positive charge density (12, 40–42). To confirm that PABPs are enriched for positively charged amino acids relative to other proteins, the percentages of Lys, Arg, and His for the test and population sets were calculated. The results are shown in Table IV. These values were generated by calculating the percentage of the basic amino acids for each ORF of all yeast proteins that interacted with polyanions (PABPs) and the population of proteins present on the arrays, respectively, and subsequently averaging these values over the entire range of test and total proteins. The results from a one-sample t test demonstrated a statistically higher percentage of Lys, a marginal increase for His, and a lower amount of Arg in the PABPs (Table IV). Overall there is an 11.9% increase of positively charged residues compared with the general population set (when the same comparison was repeated with 100 similarly sized random sets, the average difference was 0.034%). The enrichment of positive residues seems to be more pronounced locally (i.e. within small sequence runs) because there is an ~51% increase in the number of positive patches in PABPs (Table IV). These positively charged regions are probably on the surfaces of proteins because it is well established that charged residues are surface-localized (see below). Our data in Table IV are consistent with the presence of localized positive patches on PABPs. It is quite likely that other factors such as the specific three-dimensional geometry of a positively charged site strongly influence the charge-charge interactions between PABPs and polyanions. For example, it has been argued that Arg is more effective in binding to polyanions than Lys (40) due to an optimal geometry. In our analysis, however, PABPs generally contained more Lys and His than Arg.
Searching for a Sequence Signature in PABPs Using MEME and InterPro
Because PABPs were enriched in basic amino acids, a sequence signature study was performed to determine whether they shared any common motifs. A MEME analysis (29) found 23 signatures within the population of PABPs that matched the existing entries in the InterPro database. Most of these are domains present in RNA/DNA-binding proteins, kinases, and chaperone domains. These categories of proteins were also observed in Cytoscape and BiNGO analysis of the PABP clusters along with other functional categories (see below). Although the existence of possible sequence signatures for heparin-binding proteins has been hypothesized previously (43), MEME did not identify any common unique signatures for the more general class of polyanion-binding proteins. Cardin and Weintraub (43) proposed sequences of XBBBXXBX and XBBXBX (with B and X as basic and hydropathic residues, respectively) as consensus sequences from studying 12 known heparin binding regions in four proteins. Hileman et al. (40) also suggested a TXXBXXTBXXXTBB motif (where T defines a turn and B and X have the previous definitions) using x-ray and NMR structural data for heparin-binding growth factors.
As stated above, although MEME did not identify any novel sequence signatures, it identified 23 signatures that have been associated with binding sites for polyanions such as ATP, DNA, and RNA. Table V reports the fraction of PABPs that share known polyanion binding domains. It can be seen that 3.5, 1.7, and 8.5% of PABPs share a common DNA, RNA, or ATP binding domain, respectively. It was expected that we might also detect a common actin binding domain. We suspect that the lack of such a finding is due to the fact that the presence of known domains is often a function of particular polyanionic sequences as well as the experimental conditions used in those investigations. In the current study, we focused on the presence of more nonspecific interactions, which may explain some of the discrepancy. The fact that MEME analysis found no novel polyanion binding sequence signatures reinforces the idea of nonspecific interactions between positive patches on PABP surfaces and polyanions rather than highly stereospecific binding sites. Another example of such nonspecific interactions was evident when we compared the list of our DNA-binding proteins to that generated by Hall et al. (44). We found that most of the DNA-binding proteins that were identified by Hall et al. (44) bound to at least one of our other probes, a clear indication of the nonspecific nature of such interactions.
It was found that of the 3,536 unique yeast proteins that did not interact with polyanions, 1,134 were kinase substrates (32%). In contrast, of 529 proteins that did interact with polyanionic probes, 311 were kinase substrates (59%). This demonstrates a statistically significant (Fisher's exact test, p < 0.0001) difference between the number of kinase substrates that possess polyanion binding ability and those that do not, suggesting that PABPs are stronger candidates for phosphorylation than non-PABPs. Because protein phosphorylation is estimated to affect 30% of the proteome as a major regulatory mechanism (32), this finding may be of functional significance. This also suggests the possibility that the regulation of PABPs by phosphorylation may involve their nonspecific interaction with polyanions. For example, phosphorylation, by decreasing the positive charge in a specific region of a protein, could decrease the interactions of the protein with polyanions. An example of such an event is seen in DNA non-homologous end joining. In this case, the phosphorylation of linker histones (a PABP) by DNA-dependent protein kinase in the vicinity of a DNA break reduces the histone affinity for DNA ends and facilitates DNA ligation to perform end joining (46).
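The comparison above corresponds to a 2 × 2 contingency table (311 of 529 PABPs versus 1,134 of 3,536 non-PABPs are kinase substrates). The following Python sketch shows one way to compute a two-sided Fisher's exact test for such a table using only the standard library. It is an illustration, not the software used in the study. The function names are hypothetical, and the two-sided rule (summing all tables no more probable than the observed one) is one common convention assumed here.

```python
from math import exp, lgamma

def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_p(x):
        # Hypergeometric probability of x in the top-left cell.
        return (_log_comb(row1, x) + _log_comb(row2, col1 - x)
                - _log_comb(total, col1))

    observed = log_p(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    p = sum(exp(log_p(x)) for x in range(x_min, x_max + 1)
            if log_p(x) <= observed + 1e-9)  # small tolerance for ties
    return min(1.0, p)

# Counts taken from the text: kinase substrates versus non-substrates.
pabp_substrates, pabp_total = 311, 529
non_pabp_substrates, non_pabp_total = 1134, 3536
p_kinase = fisher_exact_two_sided(
    pabp_substrates, pabp_total - pabp_substrates,
    non_pabp_substrates, non_pabp_total - non_pabp_substrates)
```

Working in log space with `lgamma` avoids handling very large binomial coefficients directly, which matters for tables built from thousands of proteins.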
PABPs and Intrinsic Protein Disorder
The view that proteins require a well defined conformation to function has increasingly been challenged (47, 48). It is now often argued that many proteins contain a disordered region that permits increased structural flexibility and, therefore, the option of favorable interactions with a variety of ligands or protein partners (48). Such proteins have also been proposed to be associated with "hub" activity in which they possess a central interaction role in protein-protein networks and presumably require structural flexibility for interaction with multiple partners (49). Interestingly the disordered regions of proteins appear to also be subject to increased phosphorylation (50). Therefore, we hypothesized in this work that polyanions may serve as potential interaction partners for "disordered" PABPs. We focused on the hot loop type of disorder (defined as coils/loops with a high degree of mobility as determined from crystallographic B-factors) because this identifier has demonstrated a more accurate performance evaluation (33). We also conducted a similar analysis for PABPs that interacted with more than 15 partners (hubs) to investigate whether hub proteins were disordered in nature (49) because it is known that a node with a large number of connections should serve as a hub in a protein network (51). As mentioned previously, there is no formal definition of how many protein neighbors are required for a protein to be qualified as a hub. Thus, PABPs with more than 15 interaction partners were arbitrarily chosen.
The calculations for hot loops found that 519 of 529 PABPs displayed an average value of 32% disorderedness (at +1S.D. level). A value of 29% was found for non-PABPs. Although the difference does not appear large, it was significant with a p value of 6 × 10⁻⁵. These values at +2S.D. and +3S.D. levels were also very significant (p < 0.001). This percentage also was close to statistical significance in the case of PABPs with 15 or more partners (p = 0.067).
Considering the increase in the disordered nature of PABPs, we suggest that it is possible that polyanions may serve as interaction partners for these proteins to structure their more flexible, disordered regions in vivo. In addition, this suggests the phrase "natively disordered" may be inappropriate at least in vivo. Given the ubiquity of intracellular polyanions, it may well be that proteins that are PABPs fold immediately upon synthesis due to the local presence of polyanions (such as ribosomes) (52). This is consistent with recent NMR studies that have shown little evidence for unfolded proteins in cells (53). Interestingly, it has also been observed that the negatively charged cavity environment of GroEL is essential for rapid folding of some proteins (54).
Essential Genes, Protein Homologues, and PABPs
We also investigated whether PABPs were expressed by essential genes in yeast cells because some proteins expressed by these genes serve as hubs (51). Additionally, we investigated the presence of protein homologues for PABPs because these proteins often possess functional importance. Such functional features are involved in the robustness of the system for tolerating errors (55). The result of a query between the yeast ORFs of essential gene products and PABPs demonstrated that 26 PABPs of a total 135 proteins that interact with actin were classified as essential. In the case of tubulin, heparin, HS, and DNA, 11, 48, 78, and 16 PABPs of 60, 264, 348, and 86 total PABPs are classified as essential, respectively. Although Fisher's exact test found no enrichment in the number of essential genes in PABPs (p > 0.05), the two-tailed p value (p = 0.0426) for protein homologues demonstrated that the number of homologues of PABPs is greater than the number of homologues found in the general population of yeast proteins. The average number of homologues for the population set was 4.36, whereas this number was 4.31, 5.53, 5.61, 5.35, and 6.79 for PABPs that bind actin, tubulin, heparin, HS, and DNA, respectively. This indicates that in the case of DNA-binding proteins, on average, there are almost seven proteins that have significant sequence similarity with each DNA-binding protein. Because the sequence determines the function of proteins, it may be that such homologues perform a similar function in yeast cells.
Visualization of PABP Networks
One of the more potentially intriguing aspects of this study involves a description of the yeast proteome in terms of protein-polyanion interactions. We thus hypothesized that a network of PABPs may be present as a direct consequence of PABP-polyanion interactions. To this end, we were able to find 260 of 529 unique PABPs in the YeastHighQuality.sif network file of Cytoscape. These 260 PABPs interacted with each other through 82 protein-protein interactions. A map of these selected PABPs displayed eight distinct clusters with 83 nodes and 82 edges (not shown). The remaining 177 PABPs did not interact with one another. We also identified 266 nodes with 168 interactions for PABPs in the dataset generated by von Mering et al. (37). It appears that this dataset represents a more complete protein-protein interaction map because we detected only two major clusters consisting of 90 proteins with a more complete 145 interactions identified (compared with 82 edges, Fig. 3). The remaining 176 proteins did not participate in the clusters and manifested only 23 interactions. The lack of even more interactions between these proteins probably reflects the incompleteness of the interaction databases that are currently available. As previously discussed (see "Experimental Procedures"), we had two aims in analyzing PABP networks. The first aim was the detection of hub PABPs and their extent of interaction with other yeast proteins. For this purpose, we identified PABPs that appear to act as hubs and investigated the number of interacting partners as well as their function. The second aim was to investigate whether polyanionic surfaces induce PABPs to form functional clusters in the form of networks. Thus, we used BiNGO analysis to investigate the functionality of PABPs (see below).
To take this analysis one step further, we used Cytoscape to identify the first neighbors of the 11 hub PABPs. This included some PABPs and also many non-PABPs. Using the YeastHighQuality.sif and von Mering et al. (37) datasets, 124 proteins with 811 interactions and 287 proteins with 4,204 interactions were identified, respectively. This high degree of interaction between PABPs and the yeast proteome demonstrates the extent to which PABPs may be involved in a yeast cell. The BiNGO analysis of many of these proteins from both datasets showed that they not only perform the expected molecular functions such as RNA binding but also possess helicase activity and are structural constituents of ribosomes (see below). They are involved in biological processes as diverse as general metabolism, rRNA metabolism, and ribosome biogenesis/assembly. They appear to reside at several different cellular locations such as membrane-bound and unbound organelles, the nucleus, ribosomal subunits, and various protein complexes. These results are consistent with the results of the MEME analysis. Similar analyses can be found for hub PABPs with more than 15 interaction partners in the supplemental material (SF 4B.a and 4B.b). These exhibit additional functional roles such as transferase and polymerase activity.
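The first-neighbor expansion described above can be sketched in Python as follows. This is an illustration only and does not use Cytoscape's own API; it assumes the interaction data are available as a list of protein pairs and that the resulting subnetwork keeps every interaction whose two endpoints are both a hub or a first neighbor. The protein names in the toy example are hypothetical.

```python
def first_neighbor_subnetwork(edges, hubs):
    """Return (nodes, edges) for the hubs plus their first neighbors.

    edges: iterable of (protein_a, protein_b) pairs, treated as undirected.
    hubs:  iterable of hub protein identifiers.
    """
    hub_set = set(hubs)
    nodes = set(hub_set)
    for u, v in edges:
        # Any protein directly linked to a hub is a first neighbor.
        if u in hub_set:
            nodes.add(v)
        if v in hub_set:
            nodes.add(u)
    # Keep each undirected interaction once, ignoring self-interactions.
    sub_edges = {frozenset((u, v)) for u, v in edges
                 if u in nodes and v in nodes and u != v}
    return nodes, sub_edges


# Toy example with hypothetical identifiers.
toy_edges = [("H", "A"), ("H", "B"), ("A", "B"), ("B", "C"), ("C", "D")]
toy_nodes, toy_sub_edges = first_neighbor_subnetwork(toy_edges, ["H"])
```

In the toy network, `C` and `D` are excluded because neither is directly linked to the hub `H`.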
Fig. 3 shows a PABP network (266 proteins with 168 interactions) generated from the von Mering et al. (37) dataset using Cytoscape. This consists of two distinct clusters. Each contains a specific number of nodes and edges. For example, cluster 1, which was the larger of the two networks, contains 73 proteins with 107 interactions. Both clusters were analyzed using BiNGO (see below) to search for any functional overrepresentation present in either one.
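Clusters of the kind shown in Fig. 3 correspond to connected components of the interaction network restricted to PABPs. The sketch below shows one way to recover them in Python with a breadth-first search. It is not the Cytoscape procedure used in the study, and the input format (a list of protein pairs) and the toy identifiers are assumptions.

```python
from collections import defaultdict, deque


def induced_clusters(edges, proteins):
    """Split `proteins` into interacting clusters and non-interacting singletons.

    Only interactions with both endpoints in `proteins` are considered.
    Returns (clusters sorted by decreasing size, set of singletons).
    """
    keep = set(proteins)
    adjacency = defaultdict(set)
    for u, v in edges:
        if u in keep and v in keep and u != v:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects one connected component.
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(component)

    clusters.sort(key=len, reverse=True)
    return clusters, keep - seen


# Toy example with hypothetical identifiers; "X" is not in the protein set.
toy_interactions = [("A", "B"), ("B", "C"), ("D", "E"), ("A", "X")]
toy_clusters, toy_singletons = induced_clusters(
    toy_interactions, {"A", "B", "C", "D", "E", "F"})
```

Counting the nodes and edges of each returned component gives per-cluster figures analogous to those quoted for cluster 1.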
Searching for Enriched Functional PABPs Using BiNGO
Overrepresentation of the GO categories of molecular function, biological processes, and cellular compartments of cluster 1 in yeast PABP networks is shown in Fig. 4, A, B, and C. This diagram demonstrates the statistical significance of any overrepresentation through a color scale ranging from yellow (p value = 0.01) to dark orange (p value = 10^-7). Because the color saturates at dark orange for p values that are more than 5 orders of magnitude smaller than the chosen significance level (0.01), the most intensely colored nodes that are farthest down the hierarchy should be considered the more significant in terms of molecular function, biological process, or the cellular location of the subject proteins (38).
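To make the overrepresentation score and its color coding concrete, the following Python sketch computes an upper-tail hypergeometric p value for one GO term and maps it onto a saturating scale. Two assumptions are made that the text does not confirm: that a hypergeometric test is the statistic applied, and that the color varies linearly with log10(p) between the significance level (0.01) and the saturation point 5 orders of magnitude below it. No multiple-testing correction is included.

```python
from math import comb, log10


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n proteins are drawn from N, of which K carry the GO term."""
    top = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, top + 1))
    return favorable / comb(N, n)


def color_intensity(p, alpha=0.01, saturate_orders=5):
    """Map a p value to [0, 1]: 0 at p = alpha, 1 at alpha * 10**-saturate_orders.

    Returns None for non-significant p values (node left uncolored).
    ASSUMPTION: log-linear interpolation between the two ends of the scale.
    """
    if p > alpha:
        return None
    if p <= 0:
        return 1.0
    return min(1.0, (log10(alpha) - log10(p)) / saturate_orders)
```

For example, `color_intensity(hypergeom_upper_tail(k, n, K, N))` would give the shade for a cluster of `n` proteins in which `k` carry a term that annotates `K` of the `N` proteins in the reference set.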
Investigating PABP Networks
To investigate whether PABPs form a specific interaction network, we compiled a yeast interactome consisting of 36,135 physical interactions between 5,574 proteins. The rationale behind this study was that if polyanions and PABPs have an important functional role in many cellular processes, the PABPs should be involved in significantly more protein-protein interactions than a random set of proteins. We found that the number of interactions involving PABPs was significantly higher than in random subnetworks as judged by the estimated p values (average of 0.04, see ST 8). The resulting histograms from counting the protein-protein interactions in each subnetwork are shown in SF 6. Interestingly, the p values for the PABP dataset at the +3 S.D. statistical level are not always higher than the ones for the corresponding +1 S.D. and/or 2 S.D. datasets (see SF 6).
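The comparison against random subnetworks can be illustrated with a simple resampling scheme. The Python sketch below is not the authors' procedure, whose details are not given in this section. It assumes that an interaction is counted when both partners belong to the protein set, that random sets are drawn uniformly without replacement from the whole interactome, and that the p value is the fraction of random sets with at least as many interactions as the observed set.

```python
import random


def count_interactions(edges, proteins):
    """Number of interactions with both partners inside `proteins`."""
    members = set(proteins)
    return sum(1 for u, v in edges if u in members and v in members)


def empirical_p_value(edges, all_proteins, test_set, n_random=1000, seed=0):
    """Estimate how often a random protein set of equal size matches or
    exceeds the interaction count of `test_set`."""
    rng = random.Random(seed)
    pool = sorted(all_proteins)
    observed = count_interactions(edges, test_set)
    size = len(set(test_set))
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(pool, size)
        if count_interactions(edges, sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_random + 1)


# Toy interactome with hypothetical identifiers: five proteins that all
# interact with one another, embedded among fifteen unconnected proteins.
toy_proteins = [f"P{i}" for i in range(20)]
dense_set = toy_proteins[:5]
toy_network = [(a, b) for i, a in enumerate(dense_set) for b in dense_set[i + 1:]]
p_dense = empirical_p_value(toy_network, toy_proteins, dense_set, n_random=200)
p_sparse = empirical_p_value(toy_network, toy_proteins, toy_proteins[10:15], n_random=200)
```

A degree-preserving randomization would be a stricter null model than the uniform sampling assumed here.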
Summary
This work suggests the existence of a network of polyanions and/or PABPs that may play a role in the structure and function of yeast cells. Thus, polyanionic surfaces could potentially align/regulate PABPs for interaction with other proteins through electrostatic and non-coulombic molecular interactions. This may be thought of as a "beads on string" type of organization, a structure previously observed for protein-polyanion complexes (56). This might also be thought of in terms of the following admittedly crude analogy. If a cell is considered a city, then polyanions behave like roads, buildings, elevators, and other routes of transportation and sites of habitation, whereas the PABPs correspond to the individuals who interact and perform the various activities of the city. Although simplistic, this analogy captures the possible role of the proposed polyanion network.
From the analyses performed here as well as observations by others (57, 58), it appears that there is a limited stringency in the criteria for polyanion-protein interactions. A minimal requirement of basic amino acids that form localized patches of positive charge on the surface of proteins with no necessity for unique binding motifs appears to be sufficient for such interactions with little discrimination toward the source of negative charge. Several previous studies have been conducted to better define protein and glycosaminoglycan interactions based on molecular modeling and amino acid sequences (40, 43). Most of the latter studies were conducted with either a limited number of proteins or small synthetic peptides. As we gain further knowledge of the structural basis of polyanion-protein interactions, a lack of specificity for this type of binding (despite its relatively high affinity) is becoming increasingly apparent. The question we must address here is whether such nonspecific interactions play a functional role within cells. This idea is somewhat in contrast to the usual assumption that interactions between macromolecules are directed only by stereoscopically precise binding.
Chu et al. (22) have defined "unique" and "common" binding sites for fibroblast growth factor-2 (FGF-2) and heparin-binding epidermal growth factor-like growth factor (HB-EGF) through competition studies. Their analysis found that the numbers of common binding sites for FGF-2 and HB-EGF were 13- and 30-fold greater than the unique ones, respectively. Although, as suggested by Chu et al. (22), it is plausible that the common sites facilitate the access of HS to the unique sites, we propose that such common sites may also represent a functional surface for nonspecific binding to polyanionic surfaces in a cell that can regulate and/or stabilize FGF-2 or HB-EGF (and similarly other PABPs) and/or align them for further interactions with other proteins. Such less specific sites on PABPs should be more available for interaction with polyanions from simply a statistical point of view.
We showed that PABPs that are functionally important (through gene duplication and BiNGO analysis) exhibit high affinity toward polyanions not because of novel sequence signatures but rather, at least partially, through nonspecific electrostatic interactions. We also demonstrated a higher probability of regulation of PABPs by phosphorylation, suggesting further functional modulation by nonspecific interactions between PABPs and polyanions. These findings as well as those of Chu et al. (22) that PABPs have common sites for nonspecific interactions suggest that such interactions are of functional importance in a cell. A similar view has been proposed recently in which electrolyte pathways and pools "wire" the cytoplasm and organize the crowded environment of the cell (59). In this work, however, we suggest a different "wiring" system in which the cellular polyanions provide a matrix within which some of the functional activity of the cell occurs. The possibility of the simultaneous presence of multiple systems of such organization seems quite plausible.
Although it is apparent that protein arrays are useful tools for high throughput proteomics studies, several limitations of this approach should be noted. Binding to a static or "just-in-time-assembly" complex of proteins (60) and transient protein-protein interactions cannot be easily detected by this stationary approach. Thus, this approach would be expected to significantly underestimate the extent of any interactions probed. Protein arrays, however, are perhaps the most effective tool for rapid proteomics studies where post-translational modifications of proteins are present.
In summary, we identified 529 PABPs in the yeast proteome. It was confirmed that charge-charge interactions as well as other types of non-covalent binding are responsible for these polyanion-protein interactions. Highly localized positively charged patches of basic amino acids result, on average, in a slightly basic character for the 529 yeast PABPs. No unique sequence signature, however, was identified within and among PABPs, but some confirmation of previously known binding motifs was produced. The most interesting outcomes of this study were the observations that (a) many PABPs were subjected to phosphorylation, implying that regulation of such proteins through their charge-charge interactions with polyanions may be possible; (b) a high content of disorder was present in many PABPs that may be a target for stabilization and structure induction (e.g. folding) by polyanions; (c) there was an extensive interaction between PABPs and other yeast proteins in the form of a network; and (d) there is a potential role of some PABPs as protein hubs. The functional, biological, and locational overrepresentation of PABP networks in yeast, which is clearly not the result of chance, suggests a degree of critical interaction between polyanions and proteins of widespread functional significance.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Published, MCP Papers in Press, September 18, 2006, DOI 10.1074/mcp.M600240-MCP200
1 The abbreviations used are: PABP, polyanion-binding protein; HS, heparan sulfate; DS, dextran sulfate; SF, Supplemental Fig(s).; ST, Supplemental Table(s); SGD, S. cerevisiae Genome Database; MEME, Multiple Em for Motif Elicitation; GO, gene ontology; PCE, protein continuum electrostatics; BiNGO, Biological Network Gene Ontology tool; HB-EGF, heparin-binding epidermal growth factor-like growth factor; FGF, fibroblast growth factor; PDB, Protein Data Bank; PEO, polyethylene oxide; EDC, 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide biotin-LC-hydrazide.
2 T. Mitchison, personal website, protocols.
3 C. Putnam, personal communication.
4 Ø. Langsrud, personal website (Fisher's exact test).
5 N. Salamat-Miller, J. Fang, C. W. Seidel, Y. Assenov, M. Albrecht, and C. R. Middaugh, unpublished results.
* This work was supported in part by Kansas IDeA Network of Biomedical Research Excellence Bioinformatics Core, National Institutes of Health Grant P20 RR016475. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
S The on-line version of this article (available at http://www.mcponline.org) contains supplemental material.
** To whom correspondence should be addressed: Dept. of Pharmaceutical Chemistry, University of Kansas, 2030 Becker Dr., Lawrence, KS 66047. Tel.: 875-864-5813; Fax: 875-864-5814; E-mail: middaugh{at}ku.edu
| REFERENCES |
|---|
|
|
|---|