Novel Proteomics Strategy Brings Insight into the Prevalence of SUMO-2 Target Sites

Small ubiquitin-like modifier (SUMO) is covalently conjugated to its target proteins, thereby altering their activity. The mammalian SUMO protein family includes four members (SUMO-1–4), of which SUMO-2 and SUMO-3 are conjugated in a stress-inducible manner. The vast majority of known SUMO substrates are recognized by the single SUMO E2-conjugating enzyme Ubc9 binding to a consensus tetrapeptide (ΨKXE where Ψ stands for a large hydrophobic amino acid) or to extended motifs that contain phosphorylated or negatively charged amino acids, called PDSM (phosphorylation-dependent sumoylation motif) and NDSM (negatively charged amino acid-dependent sumoylation motif), respectively. We identified 382 SUMO-2 targets using a novel method based on SUMO protease treatment that improves separation of SUMO substrates on SDS-PAGE before LC-ESI-MS/MS. We also implemented a software tool, SUMOFI (SUMO motif finder), to facilitate identification of motifs for SUMO substrates from a user-provided set of proteins and to classify the substrates according to the type of SUMO-targeting consensus site. Surprisingly, more than half of the substrates lacked any known consensus site, suggesting that numerous SUMO substrates are recognized by a yet unknown consensus site-independent mechanism. Gene ontology analysis revealed that substrates in distinct functional categories display strikingly different prevalences of NDSM sites. Given that different types of motifs are bound by Ubc9 using alternative mechanisms, our data suggest that the preference of SUMO-2 targeting mechanism depends on the biological function of the substrate.

ids downstream of the consensus tetrapeptide are essential for SUMO conjugation and bind to a positively charged patch on Ubc9 (15). Another extended consensus motif, the phosphorylation-dependent sumoylation motif (PDSM) consists of the ΨKXE tetrapeptide followed by a proline-directed phosphorylation site and cannot be sumoylated unless phosphorylated (16). The PDSM and NDSM may represent variations on the same theme; the phosphorylated serine can be substituted for by acidic residues (14). Notably sumoylation has also been reported to occur on non-consensus lysines indicating that the consensus tetrapeptide is not an absolute requirement for sumoylation (17,18). However, the proteins where SUMO is conjugated on non-consensus sites are not readily found because the presence of either the consensus tetrapeptide or an extended motif is often used as a prediction tool for SUMO substrate identification.
To date, a few proteomics studies of SUMO-2 or SUMO-3 substrates have been reported. Vertegaal et al. (19) used a nuclear body enrichment procedure followed by His purification, in-solution digestion, separation of peptides by HPLC, and MS/MS. Rosas-Acosta et al. (20) used a combination of proteasome inhibition and heat shock treatment followed by tandem affinity purification, enterokinase elution, in-solution digestion, HPLC, and MALDI MS/MS to identify SUMO-1 and SUMO-3 targets. Two studies based on stable isotope labeling by amino acids in cell culture (SILAC) have been published: Vertegaal et al. (7) generated His-tagged cell lines and nuclear lysates followed by SDS-PAGE and LC-MS/MS, whereas Schimmel et al. (21) used HeLa cells overexpressing His-SUMO-2, proteasome inhibition, whole cell lysates, and in-solution digestion followed by LC-MS/MS.
To improve the sensitivity of SUMO substrate identification, we developed a novel strategy for SUMO-2 substrate purification and identification. The novelty lies within the desumoylation of substrates prior to SDS-PAGE using a recombinant SUMO protease, which considerably reduces the complexity of the SDS-PAGE-separated sample and increases the number of identified substrates. Bioinformatics analysis of the 382 identified SUMO-2 substrates revealed that the SUMO consensus site is significantly enriched within the SUMO-2 proteome. However, 52% of the identified proteins lacked any known consensus sequence, suggesting that SUMO conjugation to non-consensus sites is more common than previously anticipated. In addition, many substrates containing NDSM, but not PDSM, were present in our data set, indicating that these extended consensus motifs are differentially targeted by stress-induced SUMO-2 conjugation. To further characterize biological functions of the identified proteins, we performed Gene Ontology (GO) (22) term enrichment analysis. Interestingly, GO categories of DNA- and nucleotide-related functions were selectively enriched in NDSM-containing SUMO-2 substrates. Given that different types of consensus sites are bound by Ubc9 using alternative mechanisms (13,14,16), our results reveal an unexpected connection between the SUMO-2 targeting mechanism and the biological function of the substrate.
Cell Culture and Transfection-Human K562 erythroleukemia cells were maintained and transfected as described earlier (24). To generate a stable cell line overexpressing His-hemagglutinin (HA)-tagged SUMO-2 (K562 HA-SUMO-2), K562 cells were cotransfected with 30 μg of pSG5-His-HA-SUMO-2 and 3 μg of empty pcDNA3.1 by electroporation (975 microfarads, 220 V) using a Bio-Rad Gene Pulser electroporator. Transfected cells were allowed to recover for 2 days, and neomycin-resistant cells were selected in medium containing G418 (500 μg/ml; Invitrogen) for 2 weeks. Drug-resistant cells were diluted and selected for single cell clones. The K562 HA-SUMO-2 cells were routinely maintained in medium containing G418 (500 μg/ml). Human HeLa cervical cancer cells were grown in Dulbecco's modified Eagle's medium supplemented with 10% fetal calf serum and antibiotics (penicillin and streptomycin) in a humidified 5% CO2 atmosphere at 37°C. HeLa cells were transfected by electroporation as described earlier (24).
Heat Shock Treatments-Cells were heat-shocked for the indicated times in water baths so that the temperature of the medium was 43°C. HeLa cells were heat-shocked in plastic bags sealed with Heat Sealer (Wallac). K562 cells were heat-shocked in cell culture bottles (Falcon).
Western Blotting-For Western blotting, cell pellets were first suspended in 1 volume of PBS. The suspension of cells was lysed by boiling in 2 volumes of Laemmli sample buffer. The lysates were separated by 8% SDS-PAGE, transferred to a nitrocellulose membrane, and blotted with α-HA antibody (Covance).
Immunoprecipitation-For large scale immunoprecipitation, cells were suspended in PBS and lysed by boiling in 1 volume of 1% SDS in PBS. Lysates were mixed with 10 volumes of 1% Triton X-100 in PBS, sonicated, and cleared by centrifugation at 18,000 rpm for 10 min. Cleared lysates were mixed with 9 volumes of 1% BSA and 1% Triton X-100 in PBS. N-Ethylmaleimide was added to a final concentration of 20 mM. Lysates were precleared with IgG-Sepharose (Amersham Biosciences) and incubated with α-HA-agarose (Sigma) for 2 h at room temperature. Beads were washed intensively, suspended in high salt protease buffer (Invitrogen), and resuspended in low salt protease buffer (Invitrogen) before treating with recombinant Ulp-1 SUMO protease (Invitrogen) at 37°C for 1.5 h and boiling in Laemmli sample buffer. A similar protocol was used for small scale immunoprecipitations but without SUMO protease treatment unless otherwise indicated. α-Myc antibodies (Sigma) were used together with protein G beads (GE Healthcare) to immunoprecipitate Myc-tagged poly(ADP-ribose) polymerase 1 (PARP-1). Immunoprecipitated proteins were separated by SDS-PAGE and either silver-stained or transferred to a nitrocellulose membrane and blotted with α-topoisomerase I (Santa Cruz Biotechnology), α-DDX21 (25), α-PARP-1 (Sigma), α-Myc (Sigma), or α-SUMO-2/3 (Zymed Laboratories Inc.) antibodies.
Analysis of SUMO Protease Activity-Lysates from heat-shocked K562 HA-SUMO-2 cells were prepared and immunoprecipitated with α-HA-agarose (Sigma) as described above. Equal amounts of immunoprecipitate-containing beads were treated with or without recombinant Ulp-1 SUMO protease (Invitrogen) in low salt SUMO protease buffer (Invitrogen) at 37°C for 1.5 h. The beads were repeatedly washed with PBS and boiled in Laemmli sample buffer. Non-deconjugated SUMO substrates were analyzed by Western blotting with α-HA antibodies (Covance).
Nano-LC/ESI-MS/MS Analysis-Immunoprecipitated and desumoylated samples from 1 × 10^8 K562 HA-SUMO-2 or parental K562 cells were separated by SDS-PAGE, and the gel was silver-stained. The lane from heat-treated K562 HA-SUMO-2 cells was cut in 1-mm slices, and each slice was washed twice with 0.2 M NH4HCO3, 10% ACN and dehydrated with ACN. Proteins were reduced and alkylated by treating them with 20 mM DTT and 55 mM iodoacetamide, respectively. After washing with 50 mM NH4HCO3 and ACN, proteins were digested in gel with trypsin (Promega, Madison, WI) and incubated overnight at 37°C. Tryptic peptides were extracted from the gel pieces with 50% ACN, 5% HCOOH. The peptide extracts were vacuum centrifuged to dryness and stored at −20°C until analyzed by mass spectrometry.
Dried peptides were dissolved in 10 μl of 0.1% formic acid (FA) and analyzed by automated nanoscale capillary LC-MS/MS using an Ultimate capillary LC system, Famos autosampler, and Switchos II column switching unit (LC Packings). The LC system was coupled to a quadrupole TOF mass spectrometer (Q-Star Pulsar, Applied Biosystems/MDS Sciex, Toronto, Canada). Samples were desalted and preconcentrated on line with a 0.3 × 5-mm PepMap C18 precolumn (LC Packings). Peptide separation was achieved by using a 75-μm × 150-mm PepMap C18 (3 μm, 100 Å) analytical column (LC Packings) and a two-stage gradient. First, mobile phase B concentration was raised from 5 to 20% in 5 min and then from 20 to 60% in 20 min using a flow rate of 200 nl/min. Mobile phase A was a mixture of 0.1% FA and 5% ACN. Mobile phase B consisted of 0.1% FA and 95% ACN. The mass spectrometer was programmed to accumulate signal for 1 s for the TOF-MS scan after which the two most intense peaks were selected for 3-s product ion scans. Analyst QS (Applied Biosystems) was used for instrument control.
The data from different gel slices were combined and analyzed using Analyst QS 1.1 (Applied Biosystems) and MASCOT in-house server and Daemon (2.2.0; Matrix Science). Raw data files from the Q-Star Pulsar were converted to peak lists for MASCOT searches using the following data import filter options in Daemon: (a) default precursor charge states 2+ and 3+; (b) MS-MS averaging: reject spectra if less than five peaks or precursor <10 or precursor >100,000; precursor mass tolerance for grouping, 0.2; maximum number of cycles between groups, 10; and minimum number of cycles per group, 1; (c) information-dependent acquisition survey scan centroid parameters: redetermine precursor charge; percent height, 50%; and merge distance, 0.01 amu; and (d) MS-MS data centroid/threshold: remove peaks with intensity less than 0.01% of highest; centroid all MS/MS data; percent height, 50%; and merge distance, 0.1 amu.
The data were searched against the Swiss-Prot database release 54.0, which has 276,256 sequences in total and 16,891 sequences after taxonomy filter. The following search criteria were used: taxonomy, Homo sapiens; mass error tolerance for parent ion and fragment ions, 0.3 and 0.3 Da, respectively; fixed modification, cysteine carbamidomethylation; variable modification, methionine oxidation; enzyme, trypsin; and number of missed cleavages, one. The same settings were used when the MASCOT 2.2.0 built-in feature was used to search the data set against the decoy database (26). This search gave a false discovery rate of 2.6%. In the MASCOT peptide summary report the settings were as follows: significance threshold, p < 0.05; require bold red; and expect cutoff, 0.05. All the proteins that were identified with only one peptide were manually removed from the data set. Keratins were removed as common contaminants.
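The post-search filtering described above can be illustrated with a minimal Python sketch. It assumes that the decoy-based false discovery rate is estimated as the number of decoy matches divided by the number of target matches above the significance threshold; the `ProteinHit` record, the function names, and the example accessions are hypothetical and are not part of MASCOT or of the original analysis.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    """Hypothetical record for one protein reported by the search engine."""
    accession: str
    description: str
    n_peptides: int


def decoy_fdr(decoy_matches: int, target_matches: int) -> float:
    """Estimate FDR as decoy matches / target matches (assumed definition)."""
    if target_matches <= 0:
        raise ValueError("target_matches must be positive")
    return decoy_matches / target_matches


def filter_hits(hits, min_peptides=2, contaminant_keyword="keratin"):
    """Keep hits with at least `min_peptides` peptides that are not keratins."""
    return [
        h for h in hits
        if h.n_peptides >= min_peptides
        and contaminant_keyword not in h.description.lower()
    ]


# Illustrative, made-up input (not data from the study).
example_hits = [
    ProteinHit("EXAMPLE_1", "DNA topoisomerase 1", 5),
    ProteinHit("EXAMPLE_2", "Keratin, type II cytoskeletal 1", 7),
    ProteinHit("EXAMPLE_3", "Hypothetical single-peptide protein", 1),
]
kept = filter_hits(example_hits)
```

With these made-up inputs only the first record survives: the keratin is dropped as a contaminant and the single-peptide protein fails the two-peptide requirement.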
GO Enrichment Analysis-GO is a widely used controlled vocabulary to describe gene and gene product attributes that comprises over 26,000 terms in three major categories: biological process, molecular function, and cellular component (22,27). Identification of statistically enriched GO terms for a given group of SUMO substrates was done by comparing GO term frequencies in the substrate group against the background. GO annotations for each protein were fetched from Ensembl through the BioMart system (28). GO terms are structured as a directed acyclic graph so that the nodes deeper in the tree are more specific than the nodes close to the root node. This hierarchy was taken into account when counting the number of annotated proteins as follows. A protein was associated with a certain GO term if it was annotated either with a term itself or a child of the term. Fisher's exact test was performed to derive a p value for each GO term in a given group of substrates (e.g. NDSM) using the other group (e.g. nonconsensus) or the complete proteome as background. Fisher's exact test uses a 2 × 2 contingency table to evaluate the significance of the association between two variables (29). To correct for multiple hypothesis comparisons we used Holm's method (30) to adjust p values when using the complete proteome as the background distribution. To show only the most relevant GO terms, parent terms having a child term with a lower p value were excluded from the final result list.
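The statistical part of this procedure can be sketched in a few lines of Python. The sketch assumes a one-sided (enrichment) form of Fisher's exact test computed from the hypergeometric distribution; whether the original analysis used a one- or two-sided test is not stated, and the function names and argument layout are illustrative only.

```python
from math import comb


def fisher_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher exact p value, P(X >= k).

    k: substrates in the group annotated with the GO term
    n: substrates in the group
    K: proteins in the background annotated with the GO term
    N: proteins in the background (the group is counted within N)
    """
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible tables contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def holm_adjust(pvalues):
    """Holm step-down adjustment; returns adjusted p values in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted
```

For example, `holm_adjust([0.01, 0.04, 0.03])` multiplies the smallest p value by 3 and the next by 2, and then carries the larger adjusted value forward, so the last two terms share the same adjusted p value.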

RESULTS
A Novel Proteomics Strategy to Identify SUMO-2 Substrates-We designed a proteomics-based approach for identification of SUMO-2 targets (Fig. 1A). Our purification strategy was based on denaturing cell lysis and immunoaffinity purification followed by SUMO-2 deconjugation using a SUMO protease prior to separation by SDS-PAGE and mass spectrometric identification. Sumoylation was induced by heat shock to maximize the number of identified SUMO-2 substrates (9). Sufficient quantities of cell lysate for substrate identification were produced by human K562 erythroleukemia cells stably expressing His-HA-tagged SUMO-2 (K562 HA-SUMO-2). The ectopically expressed SUMO-2 behaved similarly to the endogenous SUMO-2/3 proteins because it was largely unconjugated in untreated cells and efficiently conjugated to substrates upon heat shock (Fig. 1B). Denaturing cell lysis is an important step for preserving the SUMO-2 conjugates because it results in rapid inactivation of endogenous SUMO proteases. Lysates need to be renatured before immunoprecipitation of SUMO-2 substrates.
The novelty of our method lies within the in vitro SUMO deconjugation prior to SDS-PAGE, which allows concentration of substrates containing several sumoylation sites or SUMO chains to single bands on the SDS-PAGE gel (Fig. 1C). The SUMO deconjugation step reduced complexity of the sample and increased the number of identified proteins (data not shown). We chose a gel-based procedure because it allowed us to efficiently separate large quantities of purified proteins, which could subsequently be easily visualized and compared with mock and control samples. For the desumoylation reaction, we used recombinant yeast Ulp-1, which can efficiently deconjugate SUMO-2 from HA beads as demonstrated by disappearance of the high molecular weight SUMO-2 conjugates (Fig. 1D).
Identification of 382 SUMO-2 Substrates-Immunoprecipitated proteins from the K562 HA-SUMO-2 cells and parental K562 cells were prepared for mass spectrometric analysis. After desumoylation, the proteins were separated by SDS-PAGE and visualized by silver staining. The heat-treated K562 HA-SUMO-2 sample showed strong staining throughout the molecular weight range. In contrast, samples from parental K562 cells and untreated K562 HA-SUMO-2 cells showed, except for the strong IgG band, only a weak staining indicative of specific immunoprecipitation (Fig. 2A). The heat-shocked K562 HA-SUMO-2 lane was cut in slices that were treated with trypsin. After digestion tryptic peptides were extracted from the gel pieces and analyzed by LC-MS/MS. Data collected from individual gel bands were merged and searched against a database using stringent criteria; i.e. at least two identified peptides with a significance threshold of p < 0.05 were required for a positive protein hit. We used a decoy database method that gave a false discovery rate of 2.6%. The mass spectrometric analysis of the heat-treated K562 HA-SUMO-2 lane led to the identification of 382 SUMO-2 substrates (supplemental Table 1). Based on a literature survey in PubMed, 115 of the identified substrates were novel as they were not previously reported to be conjugated by any of the four mammalian SUMO paralogues (7, 19-21, 31-45). We also analyzed several gel slices from heat-shocked parental K562 and untreated K562 HA-SUMO-2 cells. These samples contained no hits exceeding our threshold, likely reflecting the low stoichiometry of SUMO-2 conjugation in unstressed cells (Fig. 1B).
The purification was validated by Western blotting of the immunoprecipitated and desumoylated samples. DNA topoisomerase I, nucleolar RNA helicase 2 (DDX21), and PARP-1 were identified as positive hits by mass spectrometry, and they were readily detected in the heat-shocked K562 HA-SUMO-2 sample (Fig. 2B). In contrast, the control K562 HA-SUMO-2 showed only a weak signal in long exposures, and no signal could be detected in the parental K562 cells. Next we analyzed the sumoylation of PARP-1. HeLa cells were transfected with Myc-PARP-1 and SUMO-2 and exposed to heat shock. PARP-1 was immunoprecipitated using antibodies against the Myc epitope. The immunoprecipitates were blotted against SUMO-2/3 and the Myc tag to detect PARP-1. A ladder of sumoylated forms after SDS-PAGE indicated that PARP-1 is multisumoylated in a heat shock-inducible manner (Fig. 2C).
Differential Prevalence of NDSM and PDSM Sites in Substrates-As sumoylation is often coupled to transcriptional repression (1), we were surprised not to find significant enrichment of transcriptionally related GO categories (Fig. 3A and Table I). Instead strongly enriched GO categories belonging to the molecular function ontology included "RNA processing" (8.2-fold) and "RNA binding" (7.3-fold). The most frequent GO term "heterogeneous nuclear ribonucleoprotein complex" belonged to the cellular component ontology and was 64 times more abundant in our data set than in the genome (supplemental Table 2).
To evaluate the use of various consensus sites in SUMO-2 modification, we classified the substrates according to the type of consensus sites present. To achieve this we developed novel software called SUMOFI, which searches for motifs that are used for targeting of SUMO substrates from a selected set of proteins using ScanProsite (46). SUMOFI defines Ψ as Met, Leu, Val, Ile, or Phe and NDSM according to Yang et al. (15). With the help of SUMOFI we divided our substrates into the following groups. Classical refers to SUMO-2 substrates containing the consensus tetrapeptide alone, NDSM or PDSM refers to SUMO-2 substrates containing NDSM or PDSM, non-NDSM refers to SUMO-2 substrates lacking NDSM, and non-consensus refers to SUMO-2 substrates lacking the consensus tetrapeptide. Despite the significant enrichment of classical and NDSM, a majority of SUMO-2 substrates (52%) were non-consensus (Fig. 3B and Table II). To evaluate whether this was due to a heat shock-specific effect, we analyzed the proportion of non-consensus from a recent SUMO-2 proteomics study where no heat shock was used to induce SUMO-2 conjugation (21). The results were strikingly similar, i.e. 51% non-consensus, indicating that a large proportion of the SUMO-2 proteome is indeed modified on sites lacking any known SUMO consensus sequence. The proportion of NDSM and PDSM was remarkably different as NDSM were significantly enriched compared with the genome, but only one PDSM was included in our data set (Fig. 3B and Table II). This implies that, unlike PDSM sites, NDSM sites specifically are targeted by heat shock-induced SUMO-2 modification, although Ubc9 likely recognizes both extended motifs in a similar manner (15).
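The grouping logic can be illustrated with a short Python sketch based on regular expressions rather than ScanProsite; it is not the SUMOFI implementation. Ψ is taken as Met, Leu, Val, Ile, or Phe as stated above. Three points are assumptions made only for illustration: the PDSM is written as ΨKXEXXSP, the NDSM is approximated as a consensus tetrapeptide with at least two acidic residues among the 10 residues downstream, and a protein carrying several motif types is assigned by the precedence PDSM, NDSM, classical.

```python
import re

PSI = "[MLVIF]"  # large hydrophobic residue, as defined in the text

# Lookahead so that overlapping tetrapeptides are all visited.
CORE_LOOKAHEAD = re.compile(f"(?=({PSI}K.E))")
# Assumed PDSM pattern: tetrapeptide, two residues, then Ser-Pro.
PDSM_PATTERN = re.compile(f"{PSI}K.E..SP")


def has_core(seq: str) -> bool:
    """True if the sequence contains at least one PsiKXE tetrapeptide."""
    return CORE_LOOKAHEAD.search(seq) is not None


def has_pdsm(seq: str) -> bool:
    """True if the sequence matches the assumed PDSM pattern."""
    return PDSM_PATTERN.search(seq) is not None


def has_ndsm(seq: str, window: int = 10, min_acidic: int = 2) -> bool:
    """True if any tetrapeptide is followed by enough Asp/Glu residues.

    The window size and threshold are illustrative assumptions.
    """
    for match in CORE_LOOKAHEAD.finditer(seq):
        start = match.start() + 4
        downstream = seq[start:start + window]
        if downstream.count("D") + downstream.count("E") >= min_acidic:
            return True
    return False


def classify(seq: str) -> str:
    """Assign one group label using the assumed precedence."""
    if not has_core(seq):
        return "non-consensus"
    if has_pdsm(seq):
        return "PDSM"
    if has_ndsm(seq):
        return "NDSM"
    return "classical"
```

Applied to short made-up peptides, `classify("AALKAEAAAAAAAAAAAA")` falls into the classical group and `classify("AAAKAAAA")` into the non-consensus group; non-NDSM, as used in the text, would correspond to every label other than NDSM.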
The Occurrence of NDSM Sites Depends on the Biological Function of the Substrate-To understand how the different motifs are utilized in SUMO-2 modification, we performed GO classification for substrate groups according to the type of consensus sites. Certain GO terms, but not all SUMO-2-enriched GO terms, showed prominent differences between NDSM and non-NDSM (Fig. 3C and Table III). GO categories associated with DNA- and nucleotide-related functions such as "DNA metabolic process" and "helicase activity" were more abundant in NDSM than non-NDSM. The same GO terms were enriched in NDSM also when compared with classical. Importantly when examining all NDSM site-containing proteins in the genome there is no significant enrichment of either DNA metabolic process or helicase activity (15). In contrast, some SUMO-2-enriched terms, such as RNA binding and RNA processing, were equally abundant in NDSM, classical, and non-NDSM (Fig. 3C and Table III).
FIG. 2. Identification of heat shock-inducible SUMO-2 substrates. A, SUMO-2 substrates were immunoprecipitated from untreated and heat-shocked K562 HA-SUMO-2 and heat-shocked parental K562 cells. The immunoprecipitated proteins were desumoylated on beads with recombinant Ulp-1, separated by SDS-PAGE, silver-stained, and identified by LC-MS/MS. B, an aliquot from the immunoprecipitated and desumoylated samples was analyzed by Western blotting with antibodies for topoisomerase I (TopoI), DDX21, and PARP-1. C, PARP-1 undergoes SUMO-2 modification in a heat shock-inducible manner. HeLa cells transfected with Myc-tagged PARP-1 and SUMO-2 were exposed to heat shock at 43°C for 2 h. Cells were lysed under denaturing conditions, renatured, sonicated, and centrifuged. α-Myc immunoprecipitates (IP) were separated by SDS-PAGE and blotted against SUMO-2/3.

DISCUSSION
We present a novel strategy for SUMO-2 substrate purification and identification.
In technical terms the innovation lies within the desumoylation of substrates prior to SDS-PAGE using a recombinant SUMO protease, which considerably reduces the complexity of the SDS-PAGE-separated sample and increases the number of identified substrates. This approach of protease treatments should be widely applicable for example when identifying substrates of ubiquitin and other ubiquitin-like modifiers, which are conjugated as chains of variable length. There are several commercially available recombinant enzymes with deubiquitinating activity, such as His6-USP7CD (Boston Biochemicals), that could be used to facilitate identification of ubiquitin substrates. Ubc9 uses alternative mechanisms to bind dissimilar SUMO motifs (13-16). Different SUMO motifs are thus targeted by alternative mechanisms. To understand how these different motifs are utilized in SUMO-2 modification, we made GO classifications of substrate groups according to the type of consensus sites present. Similar analyses can now be easily performed by the SUMO investigators using our SUMOFI Web server.
FIG. 3. A, heat shock-induced SUMO-2 substrates were compared with genome background for identification of positively and negatively enriched GO terms belonging to the molecular function ontology. B, many SUMO-2 substrates lack the consensus tetrapeptide. Prevalences of classical, NDSM, PDSM, and non-consensus were compared between our list of heat shock-inducible SUMO-2 substrates and genome. C, several GO categories are selectively associated with NDSM. Proportions of GO categories associated with genome, non-NDSM, classical, and NDSM were analyzed. Classical, SUMO-2 substrates containing the consensus tetrapeptide alone; NDSM, SUMO-2 substrates containing NDSM; PDSM, SUMO-2 substrates containing PDSM; non-consensus, SUMO-2 substrates lacking consensus tetrapeptide; non-NDSM, SUMO-2 substrates lacking NDSM.
The striking finding that GO categories associated with DNA- and nucleotide-related functions were enriched in NDSM suggests that the preference of SUMO-2 targeting mechanism depends on the biological function of the substrate. Our data cannot be explained by a generally higher abundance of NDSM sites in proteins belonging to these categories (15). For example, the GO category helicase activity is strongly enriched in the stress-induced SUMO-2 pool of NDSM-containing proteins but not in all NDSM-containing proteins. Because NDSM sites are also abundant in proteins belonging to other functional groups (15), it is possible that these groups are preferentially modified in other biological contexts. Therefore, additional specificity determinants within or adjacent to NDSM may allow specific context-dependent modification of a subset of NDSM substrates. It is also possible that NDSM has some novel functions, which are mirrored in our data. For example, sumoylated NDSM could recruit a specific set of effector proteins or proteins influencing kinetics of SUMO modification, such as desumoylating enzymes.
To date, SUMO substrate recognition has largely relied on the assumption that most sumoylation is initiated by Ubc9 binding to the SUMO consensus tetrapeptide ΨKXE (47). Surprisingly, 52% of the SUMO-2 substrates identified in this study lacked any known SUMO consensus site. Nonconsensus substrates are excluded by the sequence-based substrate prediction strategies currently available, and our data suggest that proteins containing the consensus tetrapeptide are overrepresented in the literature. We propose that SUMO substrate recognition mechanisms are more versatile than earlier anticipated and that numerous SUMO substrates are targeted by a yet unknown mechanism independent of the consensus tetrapeptide. To find such a mechanism we looked for enriched short peptide motifs with lysine residues from our non-consensus substrates but were not able to find such motifs. It is possible that new sumoylation motifs are difficult to find because of complex site specificity involving more determinants than just a simple linear peptide motif. For example, the ubiquitin-conjugating enzyme E2-25K is sumoylated on a non-consensus lysine within an α helix, but if the secondary structure is disrupted, sumoylation is directed to a neighboring consensus site (18). Moreover the specificity determinants could function as modules that are localized apart from the actual sumoylation site. SUMO-interacting motifs binding non-covalently to activated SUMO-Ubc9 complexes might provide such a module (48). Another possibility is that non-consensus sumoylation utilizes a principle similar to that known for ubiquitination where the E3 ligase binds to a specific sequence element, providing specificity (49). The target lysine, in turn, has low selectivity, and conjugation may occur on a variety of lysines or even on lysines introduced to non-native positions (50).
Novel mass spectrometric techniques allowing unbiased identification of sumoylation sites are clearly needed to fully understand the mechanisms underlying non-consensus sumoylation. Technical advances might also reveal novel types of SUMO consensus sequences as in the field of phosphoproteomics where phosphopeptide enrichment techniques have increased the number of identified phosphorylation sites exponentially, allowing identification of a multitude of novel phosphorylation motifs (51).
Our Web server SUMOFI is now freely available at http://csbi.ltdk.helsinki.fi/sumofi/. It can be used to look for SUMO-targeting motifs from a user-defined list of proteins or from genomes of common model organisms.