Topographic Studies of the GroEL-GroES Chaperonin Complex by Chemical Cross-linking Using Diformyl Ethynylbenzene

Many essential cellular processes depend upon the self-assembly of stable multiprotein entities. The architectures of the vast majority of these protein machines remain unknown because these structures are difficult to obtain by biophysical techniques alone. However, recent progress in defining the architecture of protein complexes has resulted from integrating information from all available biochemical and biophysical sources to generate computational models. Chemical cross-linking is a technique that holds exceptional promise toward achieving this goal by providing distance constraints that reflect the topography of protein complexes. Combined with the available structural data, these constraints can yield three-dimensional models of higher order molecular machines. However, thus far the utility of cross-linking has been thwarted by insufficient yields of cross-linked products and tandem mass spectrometry methods that are unable to unambiguously establish the identity of the covalently labeled peptides and their sites of modification. We report the cross-linking of amino moieties by 1,3-diformyl-5-ethynylbenzene (DEB) with analysis by high resolution electron transfer dissociation. This new reagent coupled with this new energy deposition technique addresses these obstacles by generating cross-linked peptides containing two additional sites of protonation relative to conventional cross-linking reagents. In addition to excellent coverage of sequence ions by electron transfer dissociation, DEB cross-linking produces gas-phase precursor ions in the 4+, 5+, or 6+ charge states that are readily segregated from unmodified and dead-end modified peptides using charge-dependent precursor selection of only quadruply and higher charge state ions. Furthermore, electron transfer induces dissociation of the DEB-peptide bonds to yield diagnostic ion signals that reveal the “molecular ions” of the unmodified peptides. 
We demonstrate the power of this strategy by cross-linking analysis of the 21-protein, ADP-bound GroEL-GroES chaperonin complex. Twenty-five unique sites of cross-linking were determined.

A wide range of cellular processes are mediated by stable protein complexes that range in subunit size from a few proteins, e.g. the signal recognition particle, to over a hundred, e.g. the spliceosome. Indeed, protein interactions underlie an enormous scope of physiological and pathophysiological processes, encompassing everything from cell cycle regulation to initiation of apoptosis, angiogenesis, and aberrant interactions in cancer.
Presently, the subunit compositions, dynamics, and topographies as well as the overall architectures of most multimeric complexes remain unknown. With a handful of exceptions, most notably the crystal structures of the large and small ribosomal subunits, which were early synchrotron successes determined some 10 years ago (1-3), large molecular machines have proven recalcitrant to high resolution structural analysis. Conversely, a large number of individual cytosolic proteins have been studied at atomic resolution as have a few membrane proteins. However, despite this level of detail for individual proteins, our structural knowledge of most complexes is inadequate, and new approaches aimed toward unraveling these structures are necessary.
One approach that has proven successful is the generation of computational models that integrate biophysical and biochemical information. For instance, structural data from cryoelectron microscopy of intact complexes and crystallography of individual subunits have been combined with proteomics-based experiments that provide composition and neighbor relationships to model a variety of protein complexes (4).
A promising strategy to further refine the modeling process involves chemical cross-linking in conjunction with modern tandem mass spectrometry. Analysis of covalently cross-linked proteins provides information on subunit interfaces and generates distance constraints that reflect the topography of the complex (5-7). For example, we have previously used such an approach to generate a structure of the bacterial signal recognition particle in complex with its receptor (8).
Using cross-link constraints, the modeled structure was in agreement with the x-ray structure deduced later and additionally revealed information on the location of the M-domain that failed to diffract. Taking advantage of recent advances in high resolution mass spectrometry, chemical cross-linking of the 15-protein RNA polymerase II-transcription factor IIF complex has been carried out using the reagent BS3 (Footnote 1). The spatial proximities obtained revealed the locations of yeast transcription factor IIF on the model of the polymerase II surface (9). However, chemical cross-linking is still not viewed as a robust technique for generating structural information because of the difficulties involved in achieving the desired results. These include the low yield of cross-linked products and the complexity of the digested reaction mixture. Moreover, the lack of widespread adoption of cross-linking analysis by the protein biology community rests on the over-reliance of research groups on commercially available chemical reagents that were not designed with ease of analysis by mass spectrometry in mind.
Any reagent designed to solve this challenge must take advantage of the power and sensitivity of modern tandem mass spectrometry to provide the sequence and sites of modification for the peptides that constitute the cross-link. The most basic requirement is to experimentally optimize the formation of sequence ion series that will define both peptides and their sites of attachment in a cross-linked species. Both electron transfer dissociation (ETD) (10) and electron capture dissociation (11) energy deposition processes would appear to have a major advantage over collision-induced dissociation (CID) for the analysis of large, highly charged species such as cross-linked peptides. These processes produce extensive, evenly distributed dissociation along peptide backbones and are less influenced by the sequence of the analyte peptide than CID.
To fully exploit this advantage, it is desirable to generate precursor ions with maximal charge state as ETD is known to function optimally with low m/z precursors (12). Furthermore, highly charged analytes can be specifically analyzed in the presence of complex mixtures on the basis of charge state-dependent precursor ion selection in the mass spectrometer (9, 13). From this perspective, commercial cross-linking reagents, which react by acylation of lysine residues, are suboptimal as they remove two potential sites of protonation from any cross-linked species formed.
A number of strategies use labeling with stable isotopes to facilitate identification of cross-linked peptides within complex reaction mixtures on the basis of the unique isotopic signatures bestowed (14-21). However, precursor ion selection based on an isotope pattern is difficult to implement, whereas selection based on charge state is an option routinely available on all tandem mass spectrometers. Hence, isotopic labeling schemes only serve as an aid to correct identification rather than a means to increase the yield of cross-linked peptides that may be selected for MS/MS. Other strategies incorporate a chemical tag into the cross-linking reagent that allows enrichment of labeled peptides (19, 22-25).
However, of the chemical cross-linking reagents reported in recent years, nearly all rely on activated esters, such as N-hydroxysuccinimide esters, to acylate the amino moieties of proteins. In addition to decreasing the number of basic sites on the cross-linked peptides, our laboratory has noted that peptides cross-linked in this manner often provide poor sequence information in both CID and ETD MS/MS. Often the product ion spectra favor fragment ions from only one of the two peptides, and sequence ions that contain the site of cross-linking are less frequently observed than unmodified fragments (Footnote 2). This difficulty has prompted the development of gas-phase cleavable cross-linking reagents (26-32). These reagents dissociate in the mass spectrometer to unmask the now linear component peptides but generally require an additional stage of activation to obtain sequence ions.
Here we describe cross-linking by a novel chemical reagent, 1,3-diformyl-5-ethynylbenzene (DEB), that forms Schiff bases between lysyl ε-amino functions at protein-protein interfaces. These are readily reduced by cyanoborohydride to join proteins through secondary amino linkages. Thus, DEB inserts a rigid backbone spacer between lysyl ε-amino functions while preserving them as sites of protonation that, as noted above, are advantageous for ETD sequence determination. We show the utility of this strategy in studies of the 21-subunit GroEL-GroES chaperonin complex. This complex facilitates protein folding by sequestering polypeptide chains in a compartment formed by a nucleotide-bound, seven-membered homooligomeric GroEL ring and a seven-membered homooligomeric GroES ring. An additional GroEL ring, the trans ring, sits adjacent to and facing the cis (GroES-bound) ring but bears a collapsed conformation, which is unable to bind GroES until ATP hydrolysis and the completion of the catalytic cycle (33-35).

EXPERIMENTAL PROCEDURES
Materials-GroEL and GroES, prepared from overexpression in Escherichia coli, were a generous gift from the Frydman laboratory at Stanford University. Protein concentrations were determined by Bradford assay. Sequencing grade methylated trypsin was ordered from Promega. Synthetic reagents and other chemicals were purchased from Sigma-Aldrich at the highest grade available except for sodium cyanoborohydride (95%), which was ordered from Acros Organics, and formic acid (98%), which was from Fluka. HPLC grade solvents were purchased from Fisher Scientific.
Synthesis of DEB-Generally the procedure of Bhagwat et al. (36) was followed for the preparation of 1,3-dihydroxymethyl-5-ethynylbenzene from diethyl-5-hydroxyisophthalate with the exception that the trimethylsilyl (TMS) group was not removed. Oxidation to the dialdehyde was effected by the Dess-Martin procedure. Briefly, 0.7 mmol of 1,3-dihydroxymethyl-5-trimethylsilyl-ethynylbenzene was stirred with 1.8 mmol of Dess-Martin periodinane in CH2Cl2 at 0°C. After 1 h, the ice bath was removed, and the mixture was stirred for an additional 2 h. The mixture was washed with aqueous sodium thiosulfate and then with aqueous sodium bicarbonate and dried, and the residue was purified by flash chromatography on silica gel using a solvent system consisting of 15% ethyl acetate, 84.9% hexane, 0.1% formic acid to yield the TMS-protected diformyl compound. The TMS group was then removed by dissolving the 1,3-diformyl-5-trimethylsilylethynylbenzene in 5:1 MeOH:THF and adding a 3-fold molar excess of 2 N K2CO3 solution dropwise. Saturated ammonium chloride solution was added, and the product was extracted into ethyl acetate, which was dried and removed under vacuum. The crude product was purified by flash chromatography on silica gel using a mobile phase of 17% ethyl acetate, 82.9% hexane, 0.1% formic acid. The final product was characterized by 1 …

Cross-linking reactions also contained 1 mM adenosine diphosphate (ADP) (from 10× stock in buffer A) and 2.5 mM DEB (from 40× stock in DMSO). Reactions were equilibrated to 37°C, and reduction of Schiff base adducts (cross-linking) was initiated by the addition of NaCNBH3 to 20 mM final concentration (from 50× stock in 0.01 N NaOH). Reactions were incubated at 37°C for 1 h and terminated by acetone precipitation.

Footnote 1. The abbreviations used are: BS3, bis(sulfosuccinimidyl) suberate; DEB, 1,3-diformyl-5-ethynylbenzene; ETD, electron transfer dissociation.
Footnote 2. M. Trnka, unpublished results.
200 µl of ice-cold acetone was added, and the samples were placed on dry ice for 2 h. The protein was recovered by centrifugation for 20 min at 15,000 × g and then washed twice with ice-cold acetone. Residual acetone was removed on a SpeedVac.
The acetone-precipitated pellets were then dissolved in Laemmli sample buffer, separated by SDS-PAGE, and visualized with Coomassie Blue staining. The region of the gel corresponding to molecular mass greater than 54 kDa was divided into four gel bands, which were reduced with dithiothreitol, alkylated with iodoacetamide, and digested with trypsin overnight at 37°C. Peptides were extracted from the gel slices using two aliquots of 5% formic acid, 50% acetonitrile solution with 30 min of vortexing and sonication. Extracts were evaporated to dryness and desalted on C18 OMIX solid phase extraction tips (Varian). Approximately 0.5% of each sample was injected for LC-MS/MS analysis.
The DEB molecule was designed with an aryl alkyne moiety to allow click chemistry-based enrichment. However, for simple systems, enrichment of cross-linked peptides by SDS-PAGE isolation of high molecular weight protein species followed by in-gel tryptic digestion and charge state-dependent selection of ETD precursor ions was found to be an excellent method of isolating cross-linked species. We will present the results from the bioconjugate enrichments and depletions of cross-linked reaction mixtures elsewhere (Footnote 3).

Mass Spectrometry-Tryptic digests of cross-linked proteins were separated by reverse phase chromatography using a Waters Nanoacquity ultraperformance LC system equipped with a 100-µm-inner diameter × 100-mm column packed with 1.7-µm-diameter, 300-Å pore size C18 particles (Waters). Peptide mixtures were separated using 90-min gradients of 2-30% solvent B (solvent A, 0.1% formic acid in H2O; solvent B, 0.1% formic acid in acetonitrile) at a flow rate of 400 nl/min. Eluting peptides were analyzed on an ESI LTQ-Orbitrap XL with an ETD module installed (Thermo Scientific). Precursor spectra were measured in the Orbitrap analyzer by averaging two microscans of 100-ms maximal duration with an automatic gain control setting of 2 × 10^6. Components observed in quadruply charged or higher charge states were selected for ETD analysis. High resolution ETD product ion spectra were measured in the Orbitrap by averaging two microscans of 500-ms maximal duration with an automatic gain control setting of 2 × 10^5. The minimal signal required for product ion selection was 10,000. ETD activation times were varied between 50 and 200 ms, and supplemental activation was turned on. The precursor isolation window was set to 5 m/z units. In some experiments, ETD or CID of the same precursors was simultaneously measured in the linear ion trap.
Data Analysis-Separate ETD and CID peak lists were generated using the in-house package PAVA (version, July 28, 2009) (37) such that optimal interrogation of ETD-specific fragmentation and sequence correlation was possible (38). Putative cross-linked peptides were identified by searching for arbitrary mass modifications greater than 400 Da on Lys residues or at protein N termini using Protein Prospector version 5.6. These searches were performed with a parent mass tolerance of 100 ppm and a fragment tolerance of 20 ppm. Carbamidomethylcysteine was searched as a constant modification. Variable modifications included methionine oxidation, loss of initiator methionine, protein N-terminal acetylation, and peptide N-terminal glutamine cyclization to pyroglutamate. Additionally, type 0 ("dead-end") DEB modifications (C10H6O1) and type 0 DEB modifications in which the free aldehyde is reduced to an alcohol (C10H8O1) were searched as variable modifications on lysine residues and protein N termini. We use the nomenclature type 0, 1, and 2 to refer to dead-end, intrapeptide, and interpeptide modifications, respectively (39). Mass modifications of any integer value between 400 and 5000 Da on lysine residues or protein N termini were searched as variable modifications. These searches look for mass modifications over a range of integers plus a mass defect based on the averagine mass (40). Thus, the elemental formula of the hypothetical species in the mass modification searches is unknown, necessitating a precursor mass tolerance larger than the instrumental limitation. No more than two variable modifications were allowed on any given peptide. Up to three missed tryptic cleavages were allowed, and variable modifications that modified lysine residues were not counted as one of these three missed cleavages.
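As an illustration of the elemental compositions quoted above, the following minimal Python sketch converts them into monoisotopic mass shifts. The function and constant names are ours and purely illustrative; the isotope masses are standard values, and nothing here reproduces the Protein Prospector implementation.

```python
# Monoisotopic masses (Da) of the most abundant isotopes; standard values.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

def formula_mass(composition):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in composition.items())

# Compositions taken from the text.
DEAD_END_ALDEHYDE = formula_mass({"C": 10, "H": 6, "O": 1})  # type 0, free aldehyde (C10H6O1)
DEAD_END_ALCOHOL = formula_mass({"C": 10, "H": 8, "O": 1})   # type 0, aldehyde reduced (C10H8O1)
BRIDGE = formula_mass({"C": 10, "H": 6})                     # type 2 interpeptide bridge (C10H6)

for name, mass in [("dead-end aldehyde", DEAD_END_ALDEHYDE),
                   ("dead-end alcohol", DEAD_END_ALCOHOL),
                   ("bridge", BRIDGE)]:
    print(f"{name}: {mass:.4f} Da")
```

These shifts are what a variable-modification search would add to a lysine residue or protein N terminus for each modification type.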
Mass modification searches were performed against a restricted database consisting of only GroEL and GroES. An earlier search of the data against the Swiss-Prot (version, December 15, 2009) database (513,877 entries) had established these as the only major components in the sample (the other results were contaminating cytosolic E. coli proteins identified by four or fewer peptides). This search used the same parameters as above except that the precursor mass tolerance was set to 20 ppm, and mass modifications were turned off.
As a default setting, Batch-Tag in Protein Prospector only considers the 20 most intense peaks in each half of the mass range of any given MS/MS spectrum to search a total of 40 peaks. However, putative cross-linked spectra, e.g. those that were identified as bearing an arbitrary modification greater than 400 Da, were re-searched using the 100 most intense product ion signals in the peak list. The top 100 scoring peptides in this search were examined for complementarity. That is, this list was examined for pairs of peptides whose mass values combined with the mass of the DEB cross-linker bridge (C10H6) equal the mass of the selected precursor M + H (with a tolerance of 15 ppm). This search was performed using a developmental version of Protein Prospector, version 5.3.1xl. The developmental cross-linking version of Protein Prospector assigns a peptide score to each individual peptide hit as well as an overall score to the cross-linked peptide. The complementarity search features will be added to the public version of Protein Prospector (http://prospector.ucsf.edu) with the next major release.

Footnote 3. M. J. Trnka and A. L. Burlingame, manuscript in preparation.
All of the hits from this complementarity search were then validated by manual annotation of the product ion spectra. Furthermore, the charge state and monoisotopic mass determination of the precursors were validated manually.

RESULTS
Complementary Mass Modification Searching-As described above, the ability of Protein Prospector to search for arbitrarily sized mass modifications on any lysine residue (40) was used to generate a list of putative cross-linked peptide species. From 8263 product ion spectra, 843 were found to match a tryptic peptide bearing a mass modification greater than 400 Da on an internal lysine or at the protein N terminus. A novel search strategy was then used to identify bona fide cross-linked peptides. This process involved re-searching a putatively cross-linked spectrum against all possible tryptic peptides that could be formed from the GroES-GroEL proteins, again allowing for arbitrary value mass modifications between 400 and 5000 Da, and examining the list of the top 100 resulting hits for complementarity. As noted above, complementarity is defined as a peptide, P, whose mass modification value matches the mass of a second peptide, Q, plus the mass of the DEB bridge (C10H6) and vice versa. Although the mass modification search for the individual peptide requires sacrificing accurate mass on account of not knowing the elemental formula of the modification (typically, we use 100-ppm MS tolerance for Orbitrap data), the second search, which looks for complementarity, reintroduces a mass tolerance that matches the limits of the instrumentation (15 ppm for Orbitrap data).
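The complementarity criterion can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the Protein Prospector code: the function names are ours, the candidate list is a toy example, and the mass assigned to the second peptide is hypothetical (derived here by subtracting the bridge mass from the 2524.332-Da modification reported on EKLQER). The precursor value (m/z 666.161, 5+) is the one shown in Fig. 2.

```python
from itertools import combinations_with_replacement

PROTON = 1.007276         # Da
BRIDGE_C10H6 = 126.04695  # Da, monoisotopic mass of the DEB bridge

def neutral_mass(mz, charge):
    """Neutral monoisotopic mass of a multiply protonated precursor."""
    return (mz - PROTON) * charge

def complementary_pairs(candidates, precursor_mass, tol_ppm=15.0):
    """Return (P, Q, ppm error) for every pair whose masses plus the bridge
    match the precursor mass within tol_ppm.

    candidates: list of (label, neutral monoisotopic mass in Da).
    Pairs of a peptide with itself are allowed because the complex is
    homooligomeric.
    """
    hits = []
    for (p, m_p), (q, m_q) in combinations_with_replacement(candidates, 2):
        total = m_p + m_q + BRIDGE_C10H6
        err_ppm = (total - precursor_mass) / precursor_mass * 1e6
        if abs(err_ppm) <= tol_ppm:
            hits.append((p, q, round(err_ppm, 2)))
    return hits

precursor = neutral_mass(666.161, 5)
candidates = [
    ("EKLQER", 801.43447),              # computed from the sequence
    ("P_HYPOTHETICAL", 2398.2850),      # hypothetical: 2524.332 - bridge
    ("DECOY_HYPOTHETICAL", 2399.3000),  # hypothetical non-matching peptide
]
hits = complementary_pairs(candidates, precursor)
print(hits)
```

In this toy list only the EKLQER and P_HYPOTHETICAL pair falls within the 15-ppm window; the decoy is rejected because the combined mass is off by roughly 1 Da.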
High Resolution ETD Analysis of Cross-linked Peptides-Based on the complementarity criterion, of the 843 putative spectra of cross-linked peptides, it was possible to assign 388 (46%) bona fide cross-linked peptides. However, this list was highly redundant as the same precursor was generally selected for dissociation multiple times and because the same cross-linked residues were observed in slightly different analogs with respect to methionine oxidations or missed cleavages. Finally, the number of cross-linked peptides was reduced to 25 non-redundant lysine-lysine or lysine-protein N-terminal cross-links as shown in Table I. Table I reports the individual peptide sequences, P and Q, with the arbitrary mass modifications matched by Protein Prospector along with the overall XL score for the combination of peptides and the fragmentation percentage observed for each match. Fragmentation percentage is defined as the number of sequence-unique c- and z•-ions that were observed for a peptide sequence divided by the number of possible unique fragment ions (12). Most of the reported cross-linked peptides in this study produced excellent ETD sequence fragment ion coverage of both halves of the cross-linked peptides (see Figs. 1 and 2 and supplemental figures). We observed 18 of 25 spectra that were matched by over 80% of the possible sequence ions, whereas 23 of 25 were matched by over 50%.
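A minimal sketch of the fragmentation percentage defined above is given below. It assumes, for simplicity, that every one of the n - 1 backbone positions of an n-residue peptide can yield one c-ion and one z•-ion; the exact counting rules of Ref. 12 (for example, the treatment of proline) may differ. The function name and the example ion list are hypothetical.

```python
def fragmentation_percentage(sequence, observed_ions):
    """Percentage of possible unique c- and z-type ions that were observed.

    observed_ions: iterable of (ion_type, index) tuples, e.g. ("c", 3).
    Duplicates and ions outside the possible set are ignored.
    Assumption: 2 * (n - 1) possible ions for a peptide of n residues.
    """
    n = len(sequence)
    possible = {(ion_type, i) for ion_type in ("c", "z") for i in range(1, n)}
    matched = possible & set(observed_ions)
    return 100.0 * len(matched) / len(possible)

# Hypothetical observation list for the six-residue peptide EKLQER:
# ("z", 1) is listed twice and ("z", 9) cannot exist, so 3 of 10 ions count.
observed = [("c", 1), ("c", 2), ("z", 1), ("z", 1), ("z", 9)]
print(fragmentation_percentage("EKLQER", observed))
```

Using a set intersection ensures that an ion observed in several charge states is counted only once, in keeping with the "sequence-unique" wording of the definition.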
As would be expected from peptide fragmentation induced by electron transfer, all of the fragment ions observed result from dissociation of a single bond. Thus, c- and z•-ions that contained the modified lysine residues were observed with the expected mass shift intact. That is, these c- and z•-ions from peptide P were shifted such that they accounted for the intact mass of peptide Q plus the cross-linked bridge (C10H6). Often, these fragment ions were observed as doubly or triply charged signals, which were readily distinguished and assigned unambiguously as the ETD fragments were measured at high mass resolution and high mass measurement accuracy in the Orbitrap analyzer (see Figs. 1 and 2). In contrast, it should be noted that the same product ion spectra measured in the linear ion trap lacked sufficient mass resolution to precisely determine the charge state of multiply charged fragment ion signals.
In addition to the generally high coverage of sequence ions, another considerable advantage of DEB cross-links is that under ETD cross-linked peptides dissociate at the carbon-nitrogen bond between the benzylic position of DEB and the reductively alkylated amine. This process produces fragment ions that correspond to each of the peptides that make up the cross-link as if they were unmodified (P + H and Q + H) as well as the same peptides bearing a tag from the DEB molecule (P + H + XL and Q + H + XL). Fig. 3 illustrates these diagnostic fragment ions and suggests a likely mechanism of their formation. This fragmentation pathway was found to be present among all lysine-lysine-cross-linked spectra, and these dissociations were found to be extremely useful in confirming the identity of putative cross-links. However, this dissociation pathway was generally less prevalent from modifications located at the N terminus of the protein. The presence of these diagnostic fragments was particularly helpful in assignments in which one of the peptides was short and identified by only a handful of c- and z•-ions as shown in Fig. 2. Despite the low score of this spectrum matched to the peptide EK(2524.332)LQER, the identity of this species is confirmed by the Q + H and Q + H + XL ion signals at 802.440 and 929.491 m/z, respectively.
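The Q + H value quoted for EKLQER can be checked from the sequence alone. The sketch below uses standard monoisotopic residue masses; the function names are ours. It also checks that the 2524.332-Da modification reported on EKLQER is consistent with the precursor of Fig. 2 (m/z 666.161, 5+). The Q + H + XL signal is not computed because the elemental composition of the retained DEB tag is not stated in the text.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565
PROTON = 1.007276

def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER

def singly_protonated(sequence):
    """m/z of the singly protonated peptide, i.e. the 'Q + H' diagnostic ion."""
    return peptide_mass(sequence) + PROTON

q_plus_h = singly_protonated("EKLQER")
print(f"Q + H for EKLQER: {q_plus_h:.3f}")  # text reports 802.440

# Consistency check: peptide + reported modification vs. precursor mass.
precursor_neutral = (666.161 - PROTON) * 5
modified_q = peptide_mass("EKLQER") + 2524.332
ppm = (modified_q - precursor_neutral) / precursor_neutral * 1e6
print(f"precursor agreement: {ppm:.2f} ppm")
```

Both checks agree with the reported values to within a few millidaltons, well inside the 15-ppm tolerance used for the complementarity search.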
In contrast, P + H and Q + H + XL ions were never observed in CID product ion spectra. Although CID in the linear ion trap did generate b- and y-ion fragment series, these series were less extensive than the corresponding c- and z•-ions from ETD and were often dominated by a few intense fragment ion signals. Because charge reduction takes place during ETD but not CID, b- and y-ions had higher charge states than corresponding c- and z•-ions. Because DEB cross-linking generates highly charged precursor ions, a sizeable fraction of the b- and y-ions were triply charged or higher. Most search algorithms do not consider fragment ions to be greater than doubly charged. Thus, to analyze DEB-cross-linked samples by CID, product ion spectra should be measured at high resolution and searched with an algorithm that considers fragment ions in higher charge states.
GroEL-GroES Cross-linking-Because of the homooligomeric nature of the GroEL complex, a lysine-lysine-cross-linked peptide pair identified by mass spectrometry can logically originate from at least 14 different topologically distinct subunit pairings. Each of the 25 identified cross-linked peptide pairs was examined with reference to all possible pairings on the crystal structure of the GroEL-GroES-ADP complex (Protein Data Bank code 1PF9 (33)). Distances were measured between ε-amines as well as β-carbons of the cross-linked lysine pairs. The rigid DEB molecule spans an interamine distance of 7.3 Å (Fig. 3). However, to account for rotation of the lysine side chains relative to their location in the crystal structure, the interlysine β-carbon distances were also measured. Because the distance of the lysine side chain from β-carbon to ε-amine is 5 Å, a distance constraint of 17.3 Å (7.3 Å plus 5 Å for each of the two side chains) can be placed on this measurement.
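The arithmetic behind the distance constraint can be written out as a short Python sketch. The 7.3-Å DEB span and 5-Å lysine side chain length are taken from the text; the function names are ours, and the coordinate helper assumes that β-carbon coordinates (in Å) have already been extracted from the structure file by some other means.

```python
import math

DEB_SPAN = 7.3        # Å, inter-amine distance spanned by DEB (from the text)
LYS_SIDE_CHAIN = 5.0  # Å, lysine β-carbon to ε-amine distance (from the text)

def max_distance(n_lysines=2):
    """Upper bound on the measured distance.

    n_lysines=2: β-carbon to β-carbon for a lysine-lysine link (17.3 Å).
    n_lysines=1: α-amine to β-carbon for an N-terminal link (12.3 Å).
    """
    return DEB_SPAN + n_lysines * LYS_SIDE_CHAIN

def cb_distance(coord_a, coord_b):
    """Euclidean distance between two (x, y, z) coordinates in Å."""
    return math.dist(coord_a, coord_b)

def fits_constraint(distance, n_lysines=2):
    """True if a measured distance is compatible with a DEB cross-link."""
    return distance <= max_distance(n_lysines)

print(max_distance(2), max_distance(1))
print(fits_constraint(13.6))  # Cβ-Cβ distance of a pairing that fits
print(fits_constraint(33.3))  # Cβ-Cβ distance of a pairing that does not
```

Treating the constraint as a simple upper bound is deliberately permissive: it allows each lysine side chain to point fully toward its partner, which is why pairings that pass are also inspected visually for steric feasibility.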
Of the 25 cross-links, 21 were between two lysine residues, whereas the other four were between the α-amino group of GroES and a lysine. Of the 21 lysine-lysine cross-links obtained from these experiments, 20 fit within the 17.3-Å constraint for at least one possible subunit pairing of matched residues. The twenty-first, Lys 13 -Lys 20 on GroES, exceeds this constraint by 3.0 Å. Table II reports interlysine β-carbon cross-linked distances that are less than 22 Å from both the capped cis and lidless trans conformations of GroEL and GroES (Fig. 4). At most, one configuration in the cis and one pairing in the trans conformation were consistent with the 17.3-Å distance constraint. In a number of cases, the cross-link could arise exclusively from the cis or the trans conformation. Each of the links detected in this study was visually inspected for feasibility to ensure that there were no obvious clashes with the known atomic resolution structure of the protein (see Fig. 5).
For cross-links between the α-amine of the N terminus and a lysine side chain, the α-amine-β-carbon distance is expected to fit a constraint of 12.3 Å. However, none of the four cross-links to the GroES N-terminal methionine fit this constraint despite the high confidence in these spectral assignments. Instead the α-amine-β-carbon distances of these linkages spanned 15.0-21.5 Å. As already noted, the one lysine-lysine linkage that exceeded the 17.3-Å constraint was also near the N terminus of GroES (Lys 13 -Lys 20 ). Hence, our results provide evidence that the N-terminal domain of GroES is somewhat flexible in solution and thus able to form the cross-linked peptides observed.

DISCUSSION

In this work, we report the development of a powerful electron transfer-based mass spectrometric strategy for the elucidation of the precise topography of large protein complexes using a newly designed bifunctional chemical cross-linking reagent, DEB. We show that this reagent samples the presence of free α- and ε-amino functions under physiological conditions within a proximity of ~7 Å by formation of Schiff bases. Subsequent reduction forms secondary amine linkages that are stable during proteolytic liberation of the cross-linked peptides formed from the protein complex. In addition, this reduction introduces two additional sites of protonation that are advantageous for both (a) increasing the overall charge density of the species during electrospray ionization and (b) favoring the formation of peptide sequence ion series under electron transfer energy deposition conditions. Hence, these spectra contain the sequences and sites of attachment of both peptides participating in the cross-link as well as signals that represent molecular ion mass values corresponding to the individual molecular weights of both peptides.
As will be discussed below, these increased charge states are of further analytical advantage in that they facilitate gas-phase "isolation" of cross-linked peptides in complicated reaction mixtures by enabling charge-dependent selection of quadruply charged and higher precursor ion signals during ETD analysis.
The E. coli GroEL-GroES complex was chosen as a multiprotein complex on which to evaluate the effectiveness of our cross-linking methodology. The homooligomeric GroEL complex forms a structure consisting of two stacked heptameric rings. ATP binds to a nucleotide binding site at the equatorial domain of GroEL that catalyzes a conformational extension and counterclockwise twist of the GroEL apical domain. This facilitates binding of the GroES lid, another heptameric homooligomeric ring. The GroES-bound GroEL cis complex encompasses a hydrophobic cavity that sequesters substrate proteins, allowing them to fold properly. Physiologically, hydrolysis of the γ-phosphate of ATP at the GroEL nucleotide binding site drives the catalytic cycle, which consists of coordinated cycling of the cis and trans GroEL rings between extended GroES-bound and collapsed conformers. ADP binding, on the other hand, creates a stable, asymmetric complex in which the cis GroEL ring is extended and capped by the GroES ring, whereas the trans ring is collapsed and open (see Fig. 4) (33-35, 41).

FIG. 2. High resolution ETD product ion spectrum of m/z 666.161 (5+) corresponding to intersubunit DEB cross-link between Lys 327 and Lys 364 of GroEL. The Q + H and Q + H + XL signals confirm the identity of the peptide EKLQER. The asterisk (*) denotes contaminating signals that are within the precursor selection window. The dagger symbol marks c+1 and z+1 ions (c† and z†). High resolution product ion spectra enable confirmation of fragment ion charge state.

Fig. 4 shows the x-ray structure of the GroEL-GroES-ADP complex with residues established as part of a cross-linked peptide pair highlighted in red. All of the intersubunit cross-links were determined to come from adjacent subunits in one of the rings. This result is consistent with the fact that the vast majority of the contacting protein-protein surface area is present in these regions and that there are few lysine residues in the inter-ring-contacting regions. This is also consistent with results reported from an earlier cross-linking study of GroEL-GroES that found that inter-ring cross-links were formed at a much lower rate (42).
The homooligomeric nature of the GroEL-GroES complex has complicated our analysis somewhat because the cross-linked species identified by mass spectrometry can originate from a number of different conformational pairings. However, all the cross-links we observed fit the distance imposed by the DEB geometry to at most one pair of residues in the cis and one pair in the trans complex, and many fit exclusively to only one conformation. As a measure of reliability, we determined the lysine-lysine β-carbon distances between every possible pairing of lysine residues on the complex. For each lysine pair, it was determined whether at least one possible subunit pairing matched the distance constraint of 17.3 Å. Of 1130 possible pairings between all possible lysine side chains, 155 fit the distance constraint for at least one possible subunit configuration. Thus, the likelihood of a single cross-link matching randomly is 13.7% (155/1130). Therefore, it is highly unlikely that 20 of 21 of our reported lysine-lysine cross-links meet this requirement by chance alone.
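To put a rough number on "highly unlikely," one can treat each of the 21 lysine-lysine cross-links as an independent trial with a 155/1130 chance of fitting the constraint and compute a binomial tail probability. The independence assumption is ours and is not part of the original analysis, so the sketch below should be read as an order-of-magnitude estimate only.

```python
from math import comb

def binomial_tail(k, n, p):
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

p_single = 155 / 1130                       # chance that one random pair fits
p_chance = binomial_tail(20, 21, p_single)  # at least 20 of 21 fit by chance

print(f"single-link probability: {p_single:.3f}")
print(f"P(>= 20 of 21 by chance): {p_chance:.2e}")
```

Under these assumptions the tail probability comes out on the order of 10^-16, which supports the qualitative conclusion in the text.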
As mentioned, the four links to the N terminus of GroES as well as the Lys 13 -Lys 20 GroES cross-link somewhat exceed the expected geometric constraints. Amino acid residues 13-32 are known to constitute the GroES "mobile loop," which sits at the contact region between GroEL and GroES. Thus, the cross-linking results are consistent with the findings of NMR spectroscopy and crystallography with respect to the conformational flexibility of this region (41, 43, 44). Several of the cross-links identified could only have originated from sampling exclusively the capped cis GroEL conformation or the lidless trans conformation. For instance, the cross-link between Lys 277 and Lys 390 of GroEL fits well between Lys 277 on subunit H and Lys 390 on subunit N of the trans complex with a Cβ-Cβ distance of 13.6 Å as illustrated in Fig. 5. However, the best fit on the cis complex gives a Cβ-Cβ distance between Lys 277 (chain A) and Lys 390 (chain A) of 33.3 Å. Furthermore, the intrasubunit links on both the trans and cis conformations are sterically poor fits, which further validates Lys 277 (H)-Lys 390 (N) or an equivalent configuration on the trans ring as the cross-linked subunit match.

… initiates dissociation of this bond to generate diagnostic P + H and Q + H + XL fragment ions, which match the molecular ion of the individual peptides and were observed in all cross-linked spectra.

Cross-linking sites and measured distances of GroEL-GroES cross-links

Cross-linking by DEB
Several research groups have introduced cross-linking reagents that insert a low energy gas-phase, infrared multiphoton dissociation, or UV light-cleavable bond (26-29, 45). This simplifies cross-linking analysis by producing fragment ions that provide the molecular ion of the individual peptides. However, these strategies tend not to simultaneously produce sequence ions that would identify the peptide. This leaves the identity of the peptides to be inferred by precursor mass alone (46,47) or requires non-standard instrumentation that permits multiple stages of dissociation (29). With electron transfer dissociation, DEB cross-linking not only produces sequence ions efficiently but also releases the individual molecular ions at equivalent intensity. These species are invaluable in confirming the correct peptide assignment.
Discrimination of cross-linked species by charge state-dependent precursor selection is now an established technique (9,13). Unmodified and type 0 modified peptides are typically doubly or triply charged, whereas peptides cross-linked by acylating agents (typically N-hydroxysuccinimide esters) are mostly triply or quadruply charged. The charge state shift is due to the additional sites of protonation introduced by the α-amine and C-terminal tryptic residue of the extra peptide. Cross-linking with DEB introduces a further two sites of protonation and results in cross-linked peptides that have on average five charges. We found no cross-linked peptides in our initial experiments that were less than quadruply charged and thus implemented this as a requirement for precursor selection. In comparison, cross-linking with BS3 produces a significant percentage of triply charged cross-linked peptides (9) (see the supplemental chart). Hence, charge state discrimination is more effective with DEB as it is possible to reject a much greater portion of the highly abundant unmodified and dead-end modified species. We expect the efficacy of strong cation exchange to isolate cross-linked species to also be enhanced.
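The charge-based selection rule described above amounts to a simple filter on precursor charge state. The sketch below illustrates that logic only; it is not the instrument method used in this work, the `Precursor` type and `select_precursors` function are hypothetical names, and the m/z values are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    mz: float    # precursor m/z (placeholder values below)
    charge: int  # assigned charge state


def select_precursors(precursors, min_charge=4):
    """Keep only precursors that are quadruply charged or higher."""
    return [p for p in precursors if p.charge >= min_charge]


# Hypothetical survey scan: 2+ and 3+ ions stand in for unmodified and
# dead-end modified peptides, 4+ to 6+ ions for DEB cross-linked peptides.
survey = [
    Precursor(512.27, 2),
    Precursor(688.35, 3),
    Precursor(731.64, 4),
    Precursor(645.12, 5),
    Precursor(598.90, 6),
]

selected = select_precursors(survey)
for p in selected:
    print(p)
```

With a threshold of 4+, the doubly and triply charged ions are rejected before fragmentation, so instrument time is spent only on candidates in the charge range where DEB cross-linked peptides were observed.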
The relatively small size and rigid structure of the DEB molecule lead to increased resolution in the structural inferences made possible via this cross-linking analysis. A recent in silico analysis of 54 protein complexes with solved crystal structures demonstrated that, for the purpose of macromolecular structure modeling based on cross-link-derived distance constraints, the quality of the model depends on the number of cross-links as well as the maximal distance defined by those cross-links. It is desirable to have large numbers of shorter distance restraints (7). However, these factors are in conflict with each other as there will be fewer possible cross-links derived from shorter length cross-linkers.
DEB spans a lysine ε-amino to lysine ε-amino function distance of 7.3 Å. Measured from the β-carbons of the modified lysine residues to account for flexibility in the side chain orientations, the distance is 17.3 Å. Other recent studies use the popular, commercially available cross-linkers disuccinimidyl suberate and BS3, which produce identical cross-linked bridges with inter-ε-amine distances of 11.4 Å (9, 13). Therefore, DEB-derived distance constraints provide greater structural resolution for modeling purposes than disuccinimidyl suberate- or BS3-derived constraints. Our study found 20 lysine-lysine-cross-linked peptides within the strict restraints of the DEB geometry with one additional cross-link that is likely correct if we allow 3 Å for loop flexibility. Additionally, four cross-links were identified between lysine residues on the GroES mobile loop and the α-amine of GroES.

Amino acid residues that constitute acceptable cross-linking configurations are highlighted in red. Orange amino acid residues illustrate alternate, intrasubunit cross-linking configurations that were ruled out because they violated distance constraints and steric requirements.
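The 17.3 Å β-carbon constraint, with an optional allowance for loop flexibility, can be expressed as a simple distance test. The following is a minimal sketch under stated assumptions: the function name `fits_deb_constraint` is hypothetical, and the coordinates are placeholders laid out along one axis so that their separations reproduce distances quoted in the text. They are not real atomic coordinates from the GroEL-GroES structures.

```python
from math import dist

# Maximum lysine Cβ-Cβ distance spanned by DEB, in Å (value from the text).
DEB_CB_CB_MAX = 17.3


def fits_deb_constraint(cb_a, cb_b, tolerance=0.0):
    """Return True if two Cβ positions lie within the DEB span plus tolerance."""
    return dist(cb_a, cb_b) <= DEB_CB_CB_MAX + tolerance


# Placeholder coordinates (Å); only the separations are meaningful.
origin = (0.0, 0.0, 0.0)
trans_partner = (13.6, 0.0, 0.0)  # separation reported for the trans pairing
cis_partner = (33.3, 0.0, 0.0)    # separation reported for the best cis pairing
loop_partner = (19.0, 0.0, 0.0)   # hypothetical pair just outside the strict limit

print(fits_deb_constraint(origin, trans_partner))                # within limit
print(fits_deb_constraint(origin, cis_partner))                  # too far
print(fits_deb_constraint(origin, loop_partner))                 # strict limit
print(fits_deb_constraint(origin, loop_partner, tolerance=3.0))  # with 3 Å allowance
```

In a real analysis the coordinates would be read from the deposited structures, and the test would be repeated over every candidate subunit pairing for each lysine pair.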

CONCLUSIONS
More effective cross-linking methods, coupled with optimized electron transfer or electron capture ion optical systems, will undoubtedly play a critical role in providing distance constraints that, together with cryoelectron microscopy of a protein complex and x-ray structures of individual subunits, will help solve the structure of macromolecular complexes. Many protein complexes are recalcitrant to crystallography, whereas cryoelectron microscopy often does not provide sufficient resolution (~4 Å) to determine the orientation and arrangement of the individual subunits. Furthermore, there is growing interest in mapping protein interaction networks as it is now understood that almost all physiological phenomena involve protein-protein interactions (48-51). Understanding not only the identity of interacting proteins but also the structural domains involved in their interaction is required to understand how proteins cooperate to carry out cellular processes and will aid in designing new therapeutics, both antibody- and small molecule-based, directed at this vast and emerging class of therapeutic targets (52).
By producing cross-linked peptide pairs that are bound through positively charged secondary amines, DEB in conjunction with high resolution ETD mass spectrometry will accelerate cross-link analysis of protein machines by producing robust fragmentation, diagnostic fragments, and increased precursor charge states that benefit charge-based selection schemes. Thus, easily obtained, high quality structural constraints will provide the missing link in our ability to connect individual proteins with their functional partners through exploitation of the tools of computational biology.