Biochemical Characterization of Protein Complexes from the Helicobacter pylori Protein Interaction Map

We have investigated a large set of interactions from the Helicobacter pylori protein interaction map previously identified by high-throughput yeast two-hybrid (htY2H)-based methods. This study had two aims: i) to validate htY2H as a source of protein-protein interaction complexes for high-throughput biochemical and structural studies of the H. pylori interactome; and ii) to biochemically validate interactions shown by htY2H to involve components of the H. pylori type IV secretion systems. Thus, 17 interactions involving 31 proteins and protein fragments were studied, and a general strategy was designed to produce protein-interacting partners for biochemical and structural characterization. We show that htY2H is a valid source of protein-protein complexes for high-throughput proteome-scale characterization of the H. pylori interactome, because 76% of the interactions tested were confirmed biochemically. Of the interactions involving type IV secretion proteins, three could be confirmed. One interaction is between two components of the type IV secretion apparatus, ComB10 and ComB4, which are VirB10 and VirB4 homologs, respectively. Another interaction is between a type IV component (HP0525, a VirB11 homolog) and a non-type IV secretion protein (HP1451), indicating that proteins other than the core VirB (1-11)-VirD4 proteins may play a role in type IV secretion. Finally, a third interaction was biochemically confirmed between CagA, a virulence factor secreted by the type IV secretion system encoded by the Cag pathogenicity island, and a non-type IV secretion protein, HP0496.

Proteomics aims at studying proteins on the scale of an entire pathway or a whole cell. Proteomic analyses encompass the characterization of protein expression profiles, the identification of post-translational modifications, and the detection of protein-protein interactions (for review, see Ref. 1). The study of protein-protein interactions has benefited from the development of large-scale high-throughput (ht) methods (2) based on in vitro and in vivo systems, such as yeast two-hybrid systems (Y2H) (3)(4)(5)(6), protein chips (7), or systematic analysis of protein complexes by tandem affinity purification and mass spectrometry (8,9). Information about interactions, molecular complexes, and pathways can now be classified and examined via databases on the World Wide Web (10). Interpretation of these databases provides a framework on which a variety of target discovery strategies can be implemented (11).
One fundamental caveat of ht detection of protein-protein interacting partners is that both Y2H and tandem affinity purification tagging/MS methods generate large numbers of false-positives or -negatives. Thus, methods need to be developed to assess the biochemical and/or biological significance of the interactions (12,13). Depending on the methodological approach, some false-negatives might arise because of incorrect folding or inadequate subcellular localization of the proteins under investigation (14). This is noteworthy, as some genome-wide Y2H projects missed most (as much as 90%) of already characterized interactions (2). False-positives can be generated because searching for many potential interactions increases the probability of selecting interacting polypeptides of no biological relevance (2). As a result, htY2H does not guarantee that the inferred interactions are of physiological relevance, and users of interaction network databases must further evaluate each interaction of interest. One means of at least partially alleviating this problem is to increase the sampling of the genome by using overlapping DNA fragments as preys. Such a strategy adds redundancy to the Y2H data, increasing data accuracy and significantly decreasing false-positives and -negatives. It has the additional advantage of defining more finely the regions of the proteins involved in protein-protein interaction (selected interacting domains or SIDs) (3,4,11).
The first protein interaction map reported for a human pathogen was that of Helicobacter pylori (15). H. pylori infection is probably the most common chronic bacterial infection in humans, present in almost half of the population (16). H. pylori is a Gram-negative, spiral-shaped bacterium that colonizes the gastric mucosa of primates. It is known to be responsible for several major diseases such as type B gastritis and peptic ulcer disease (17) and is a risk factor for gastric cancer (18,19) and mucosa-associated lymphoid tissue lymphoma (20). H. pylori pathogenicity is at least in part associated with the secretion of the cytotoxin CagA, a protein that is thought to interfere with normal signaling once injected into gastric or duodenal epithelial cells. CagA secretion requires the assembly of a type IV secretion system encoded by genes clustered in the Cag pathogenicity island (21).
Typically, a type IV secretion system is composed of at least 12 proteins termed VirB1-11 and VirD4. The core of the secretion machinery is formed by the VirB8, 9, and 10 proteins (22). Two proteins are extracellular and associated with or are part of a pilus system: VirB2 is the pilus subunit itself, while VirB5 is a minor component of the pilus (23,24). Finally, one ATPase and two ATP-binding proteins appear to power the entire machinery: VirB11, VirB4, and VirD4, respectively (25)(26)(27)(28). In H. pylori, homologs for most of these proteins are encoded by the Cag pathogenicity island and are termed HPXXXX (where XXXX denotes the ORF number as defined in Ref. 29). With the exception of proteins possessing significant homology to members of other type IV secretion systems, little is known about the molecular assembly process and the role of each of the Cag proteins in its assembly. Note that the Cag type IV secretion system is not the only type IV secretion system encoded by H. pylori. Two other gene clusters encoding such systems have been recently characterized: the ComB system, which serves as a DNA uptake machinery, and the Tfs3 system, which serves as conjugation system (30,31).
The protein interaction map obtained for H. pylori displays a set of 1,524 interactions between 285 baits and an initial prey library of 2 × 10^6 independent fragments (15). Surprisingly, no interactions between Cag pathogenicity island-encoded proteins were detected, while several were found to connect them to members of other cell pathways or proteins with unknown functions. Examples are the interactions between HP0525, a VirB11 component of the type IV secretion system, and HP1451; between HP0527, a VirB10 homolog, and HP0149; and between CagA (HP0547) and HP0496. The functions of HP1451, HP0149, and HP0496 are unknown, and these proteins are not known components of any type IV secretion system in H. pylori. Such a result suggests that the assembly and function of the Cag-encoded type IV system in H. pylori might depend on previously unidentified proteins. Interactions between proteins of the ComB system were also discovered, notably one that had not been described previously between ComB10 (or HP0042), a VirB10 homolog, and ComB4 (HP0017), a VirB4 homolog (30). None of these interactions were confirmed by any other means. Given the abysmal record of htY2H methods in some of the published studies, the interactions recorded by Rain et al. (15) can only be trusted if other means of investigation are deployed to confirm them.
In this report, we describe a protein complex study involving 31 full-length proteins or SIDs identified in H. pylori by htY2H to form interacting binary complexes. One goal of this research was to assess whether the htY2H-derived protein-protein interaction map of H. pylori is reliable enough to make the initiation of a proteome-wide biochemical and structural investigation of the interactome of this important human pathogen worthwhile. A second goal was to characterize biochemically complex formation involving type IV secretion proteins. Here we show that the H. pylori protein-protein interaction map is a valid starting point for the biochemical and structural characterization of protein-protein complexes. We also characterize for the first time interactions between HP0525 and HP1451, between HP0547 (CagA) and HP0496, and between ComB10 and ComB4.

Target Selection and Amplification
A panel of 31 targets, i.e. 19 full-length proteins and 12 SIDs, was selected within the protein interaction map (Table I). Note that one protein (HP0522) is involved in interactions with two different proteins (HP1414 and HP0819) and that one interaction (between HP1231 and HP1247) was studied twice using either the full-length HP1231 or its corresponding SID. Also, one protein, HP1338, is self-interacting. Thus, a total of 31 targets engaged in 17 interactions were studied. These targets belong to different functional categories (FC) as defined in the PyloriGene database (genolist.pasteur.fr/pyloriGene/genome.cgi), including FC2 (purines, pyrimidines, nucleosides, and nucleotides), FC7 (transport and binding proteins), FC8 (DNA metabolism), FC9 (transcription), FC11 (protein fate), FC14 (cellular processes), and FC17 (hypothetical, conserved with no known function). Five of the targets under investigation are known to be involved directly in type IV secretion system biology (HP0525, HP0547, HP0527, HP0017, HP0042). The size of the targeted proteins ranged from 3 to 69 kDa, and five were predicted to have at least one transmembrane domain using the HMMTOP server (32). Proteins are named according to the genome annotation (29). The predicted biological score (PBS), defined as described in the PyloriGene database (see above), is reported in Table I for each putative interaction (15). In brief, the PBS reflects the reliability of the interaction and is defined solely on the basis of the htY2H results.
Unless otherwise stated, the SIDs were obtained by PCR amplification using specific primers that inserted the restriction sites BamHI (5′ end) and EcoRI (3′ end). For the SIDs of HP0547 and HP1451, the restriction sites BamHI-PstI and EcoRI-HindIII were used, respectively. Most coding sequences for the full-length proteins were obtained directly by digesting the Y2H bait clones using the BamHI and PstI restriction enzymes (15). Two full-length proteins, HP0522 and HP0862, were obtained by PCR amplification of genomic DNA using primers that added the restriction sites BamHI and EcoRI. The digested DNA fragments were visualized on an agarose gel, and correctly sized bands were excised and purified with a gel extraction kit (Concert Life Technologies, Inc., Grand Island, NY).

TABLE I. Protein targets. The protein targets selected for this study are named according to the ORFs described by Tomb et al. (29). Interacting partners are presented on the same row (• X -) where X (A to C) is the original PBS for the interaction (see definition of PBS in text). Full-length proteins and SIDs are indicated in bold and plain script, respectively. The location of the SIDs in the sequence is indicated in parentheses. The lengths (in amino acids), predicted molecular mass in kDa (M

Destination Vectors and Expression Clones
The 31 DNA fragments were each cloned in pPROEXHT (-b or -c) and pMAL (-BE or -G) vectors (see definition below), generating for each a His6- and a maltose-binding protein (MBP)-tagged version of the protein, respectively. The His6 and MBP tags add 6 and 44 kDa to the molecular mass of the proteins, respectively.
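When reading the SDS-PAGE results, it helps to know where each fusion is expected to migrate. The following minimal sketch simply adds the tag masses quoted above to a target mass; the function and dictionary names are illustrative and are not part of the original protocol.

```python
# Tag masses in kDa, taken from the text above (His6: 6 kDa, MBP: 44 kDa).
# Names used here are hypothetical helpers for illustration only.
TAG_MASS_KDA = {"His6": 6.0, "MBP": 44.0}


def fusion_mass_kda(protein_kda: float, tag: str) -> float:
    """Return the expected mass (kDa) of a target fused to the given tag."""
    return protein_kda + TAG_MASS_KDA[tag]


if __name__ == "__main__":
    # The targets in this study range from 3 to 69 kDa.
    for tag in TAG_MASS_KDA:
        low = fusion_mass_kda(3.0, tag)
        high = fusion_mass_kda(69.0, tag)
        print(f"{tag} fusions: {low:.0f} to {high:.0f} kDa")
```

Because the two tags differ by 38 kDa, the MBP and His6 versions of the same target are expected to run as clearly separated bands on a 12% gel.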
To generate MBP fusions, two vectors were designed. The pMAL-c2x vector (New England Biolabs, Beverly, MA), which expresses MBP as an N-terminal fusion domain, was modified by insertion of a PreScission protease cleavage site between the XmnI and EcoRI sites (gift of Joe St. Gemes, Washington University, St Louis, MO). The EcoRI and BamHI sites were then inverted to create pMAL-BE. From pMAL-BE, one G was added in front of the BamHI site to create pMAL-G. All SID sequences were cloned between the appropriate restriction sites of pMAL-BE, while all full-length sequences were cloned in pMAL-G. To generate His6 fusions, pPROEXHTb (for the SIDs) and pPROEXHTc (for full-length proteins) were used. One exception to this is HP1451 (a SID), which was cloned in pPROEXHTa. All expression vectors were finally transformed into BL21(DE3)pLysS competent cells (Invitrogen, San Diego, CA).
For co-expression studies, HP1451, HP0017, and HP0496 were subcloned into the pAlter-Ex2 vector (Promega, Madison, WI) by using NcoI-PstI restriction sites. This vector contains a tetracycline-resistance gene as well as a different origin of replication (p15a), which makes it suitable for co-expressing proteins with pPROEXHT-cloned partners.

Protein Expression
Freshly transformed recombinants were inoculated into 5 ml of Luria-Bertani broth containing 100 µg/ml ampicillin and 40 µg/ml chloramphenicol, and grown overnight at 37°C. This preculture was then used to inoculate 500 ml of fresh Luria-Bertani broth containing the same antibiotics. The cells were grown at 37°C to an OD600 of 0.6 and induced with 1 mM isopropyl β-D-thiogalactoside for 3 h. Solubility of each protein or protein fragment was assessed as follows. The cells were harvested, resuspended in a lysis buffer (20 mM Tris-HCl, pH 7.6, 150 mM NaCl, 0.2% (v/v) Triton X-100, and 5% glycerol), and lysed by sonication, and cell debris was pelleted by centrifugation. The supernatant was then analyzed by SDS-PAGE for the presence of the induced band, visually after Coomassie blue staining for MBP fusions (all MBP fusions were found to be strongly induced and soluble, see "Results") and using an anti-His6 antibody (see below) for His6-tagged fusions. Note that, in our hands, mixing cells of putative interacting partners prior to sonication did not result in detectable changes in the solubility of individual proteins, nor did overnight induction at 20°C.

Binary Complex Formation
Several protocols were then devised to form the complexes, depending on the solubility of the His6-tagged interacting partners (all MBP fusions were found to be soluble).
Complex Formation Between Two Soluble Proteins or Protein Fragments-Cells corresponding to two soluble interacting partners were mixed together and harvested. The pellet was resuspended in 40 ml of lysis buffer (20 mM Tris-HCl, pH 7.6, 150 mM NaCl, 0.2% (v/v) Triton X-100, and 5% glycerol) and frozen (−80°C). A protease inhibitor mixture (1 mM PMSF, 1% aprotinin, and 1 mM β-mercaptoethanol) was added to the thawed cells prior to sonication. The cells were then disrupted by sonication, and cell debris was separated by centrifugation at 45,000 × g for 20 min. The resulting supernatant was loaded onto an amylose column (fast flow amylose resin, New England Biolabs, 10-ml column volume) equilibrated with buffer A (20 mM Tris-HCl, pH 7.6, 150 mM NaCl). MBP fusion proteins were eluted by applying a short linear gradient (3 column volumes) of buffer B (buffer A containing 10 mM maltose). The MBP fusion proteins usually eluted at around 10-15% of buffer B. The presence of the His6-tagged partner was then detected using SDS-PAGE followed by either Coomassie blue staining or Western blotting with an anti-His6 antibody (see below).
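The elution position quoted above can be converted into a maltose concentration. The sketch below assumes an ideal linear mix of buffer A (no maltose) and buffer B (10 mM maltose); the function name is illustrative only.

```python
def maltose_mM(percent_b: float, maltose_in_b_mM: float = 10.0) -> float:
    """Maltose concentration (mM) at a given percentage of buffer B.

    Assumes ideal linear mixing of buffer A (0 mM maltose) with buffer B.
    """
    if not 0.0 <= percent_b <= 100.0:
        raise ValueError("percent_b must lie between 0 and 100")
    return maltose_in_b_mM * percent_b / 100.0


if __name__ == "__main__":
    # Elution window reported in the text: about 10-15% of buffer B.
    for pct in (10.0, 15.0):
        print(f"{pct:.0f}% B corresponds to {maltose_mM(pct):.1f} mM maltose")
```

Under this assumption, the MBP fusions elute at roughly 1.0 to 1.5 mM maltose.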
Binary Complex Formation When One of the Proteins Is Denatured and Refolded in the Absence of Its Putative Binding Partner-When His6-tagged proteins were found to be insoluble (i.e. in inclusion bodies after sonication and centrifugation), the pellet containing the inclusion bodies was washed with 10 ml of buffer R (8 M urea, 20 mM Tris, pH 7.6, 150 mM NaCl), and then centrifuged. The resulting pellet was then dissolved in 10 ml of buffer R for 3 h at 4°C under constant stirring. The cell debris was discarded by centrifugation (40,000 × g, 15 min). The resulting supernatant was then diluted three times in buffer U (3 M guanidine, 20 mM Tris, pH 7.6, 150 mM NaCl). The resulting solution was then gently added (0.5 ml/min) to a solution of 300 ml of buffer F (0.5 M L-arginine, 20 mM Tris, pH 7.6, 150 mM NaCl) under stirring at 4°C. The solution was then centrifuged (40,000 × g, 20 min) to remove any precipitated protein. The solution was finally loaded onto a TALON resin column (10-ml volume, Clontech, Palo Alto, CA) equilibrated with 150 mM NaCl and 20 mM Tris, pH 8. The protein was eluted using a short linear gradient (3 column volumes) of the same buffer containing 1 M imidazole. To assess complex formation, the eluted protein was added to the supernatant resulting from sonication and centrifugation of cells having expressed its binding partner as an MBP fusion. Complex formation was then monitored by elution from an amylose column.
Binary Complex Formation When One of the Proteins Is Denatured and Refolded in the Presence of Its Binding Partner-The His6-tagged insoluble proteins were solubilized using buffer R as previously described. The unfolded protein was then loaded onto a TALON column equilibrated with 20 mM Tris-HCl, pH 7.6, 200 mM NaCl, and 8 M urea. The protein was eluted with a short linear gradient (3 column volumes) of 1 M imidazole in the same buffer. The fractions were pooled (6-8 ml total) and added to the supernatant containing the soluble MBP fusion partner (final volume of 30 ml). The solution was dialysed overnight against 1 liter of buffer F. The solution was filtered, and complex formation was monitored by elution from the amylose column (as described previously).
Co-expression of Binary Complexes-The same procedure described in the section "Protein Expression" was used for co-expression experiments except that tetracycline was added to the media at a concentration of 12 µg/ml. Complex formation was monitored by elution from an amylose column (as described previously).

SDS-PAGE and Western Blot
SDS-PAGE experiments were carried out in 12% polyacrylamide gels using the Mini Protean III system following the manufacturer's protocol. Two gels were made per complex: one was stained with Simple Blue (Invitrogen) and the other was used for Western blot. Separated proteins were blotted onto PVDF membranes using liquid transfer for 1 h at 100 V (Mini Protean III system; Bio-Rad, Hercules, CA). Membranes containing the transferred proteins were initially incubated in the blocking solution (7.4% PBS, 5% non-fat dried milk). The His6 tag fusion protein was detected by incubating the membrane with the primary monoclonal antibody "anti-polyhistidine" from Sigma (St. Louis, MO; 1:3,000 in 7.4% PBS, 1% non-fat dried milk) for 14-16 h at 4°C. The membrane was then incubated with the secondary anti-mouse immunoglobulin G goat antibody coupled to alkaline phosphatase from Sigma (1:40,000) for 2 h at room temperature. The signal was visualized by using NBT-BCIP tablets (Sigma) as a substrate for alkaline phosphatase.

Cloning, Expression, and Solubility of the Proteins
Protein affinity purification tags can profoundly influence the stability, solubility, and yields of proteins expressed in bacteria, as well as the putative interactions we intended to study. Therefore, all proteins were cloned into both expression vectors (His6 and MBP tags), leading to a set of two expression clones for each protein. After sonication, the cell debris was pelleted, and when the protein was absent from the supernatant, it was considered insoluble. We found that 11 of 31 targets expressed with a His6 tag were soluble, 17 were insoluble, two did not express, and one was toxic (Table II).
Thirty of 31 targets expressed with an MBP tag were soluble. HP0149 (fused either to His6 or MBP) was toxic; therefore, the interaction that HP0149 is engaged in could not be studied (Table II). These results are in agreement with several studies describing MBP as an efficient tag to enhance solubility of fusion proteins (33)(34)(35). Expression and solubility results of the His6-tagged proteins are overall comparable to those reported in other expression studies (35).

Complex Purification by Single-step Affinity Chromatography
We then investigated which protein partners were able to interact. The cells expressing interacting partners were mixed, frozen, and sonicated. The soluble fraction of the lysate was loaded onto an amylose column to retain the MBP fusion protein. Co-elution of the His6-tagged binding partner was monitored by SDS-PAGE followed by Western blot using anti-His6 antibody. For each interaction, formation of both the MBP-X/His6-Y and MBP-Y/His6-X putative binary complexes was studied. Analysis of the interactions by SDS-PAGE coupled to Western blotting identified four categories: i) strongly detected, ii) poorly detected, iii) not detected because both His-tagged proteins (His6-X and His6-Y) were insoluble, and iv) not detected despite at least one His-tagged protein being soluble (Table III and Fig. 1).
Four interactions belonged to the first group, where four complexes could be purified (HP0650/HP1245, HP1230/HP1529, HP1338/HP1338, and HP0862/HP1474; Table III and Fig. 1A). Interestingly, only one of these interactions (HP0650/HP1245) could be detected in both the MBP-X/His6-Y and MBP-Y/His6-X combinations, where X and Y indicate the two binding partners. Therefore, the availability of both combinations is particularly important to ensure that the presence of the tag fusion domain does not prevent the interaction and that at least one binding partner is soluble as a His6 tag fusion. The complexes belonging to this group could be easily purified in milligram amounts. In the case of HP1338 (Fig. 1E), multimeric forms of the protein were detected before and after purification on a nickel-nitrilotriacetic acid column, demonstrating that HP1338 was self-interacting.
Three interactions were poorly detected by both SDS-PAGE and Western blot: HP0525/HP1451, HP0496/HP0547, and HP0042/HP0017 (as illustrated in Fig. 1B; see also Table III). One interaction, between HP0149 and HP0527, could not be studied as HP0149 was toxic (Table II). The remaining nine interactions were not detected by visual inspection of Coomassie-stained gels or by Western blotting, suggesting either that the amount of binary complex obtained was not detectable or that the interactions were not occurring under these conditions (Fig. 1, C and D). For seven of these nine undetected interactions, we hypothesized that the interactions were not occurring because in both combinations the His6-tagged proteins were insoluble, and mixing cell lysates was not sufficient to drive solubilization of these proteins. However, for two complexes, HP0608/HP0175 and HP1293/HP1198, all proteins were produced and retrieved in the soluble fraction, and yet complex formation was not observed. Thus, either these two interactions are htY2H false-positives or the methods used to detect interactions are not adequate. Notably, in the case of HP1293(RpoA)/HP1198(RpoB), the interaction is known to occur (36). It is possible that the solution conditions used here did not allow the formation of the HP1293/HP1198 complex. Another possible explanation is that the N-terminal tags used in this study prevented interaction. The fact that only one of the 17 interactions occurred in the two MBP-X/His6-Y combinations (see above) indicates that tags may sometimes interfere.

TABLE II. MBP- and His6-tagged versions of each protein and protein fragment were assessed for expression and solubility. Color-coding for the cells is as follows: gray for soluble, black for insoluble, and white when no expression was observed. T indicates toxic proteins. Interacting partners are presented on the same row. SIDs are indicated in normal text, whereas full-length proteins are indicated in bold.

Co-expression of Type IV Interactions
All three weak interactions, which we observed between HP0525 and HP1451, HP0042 and HP0017, and HP0547 and HP0496, involve type IV secretion proteins. As type IV secretion proteins are of special interest to us, we further investigated these three interactions by co-expressing the two proteins in the same cells. Untagged HP0525, HP0017, and HP0496 were subcloned into the pAlter-Ex2 vector and co-expressed together with, respectively, His6-HP1451, His6-HP0042, and His6-HP0547.
A stable HP0525/HP1451 complex could be readily purified in a three-step purification protocol consisting of affinity chromatography (TALON), tag removal (tobacco etch virus protease followed by TALON column), and gel filtration (Fig. 2). It thus appears that co-expression in this case results in a tighter complex than the one formed by simply mixing cells expressing the two proteins separately. The HP0525/HP1451 complex yielded crystals diffracting to 3.5 Å (details to be published elsewhere). The same procedure was then applied to His6-HP0042/HP0017 and His6-HP0547/HP0496, but with negative results. Co-expressing His6-HP0042/HP0017 resulted in both proteins being insoluble when expressed at 37 or 22°C. This is intriguing as His6-HP0042 expressed alone is soluble. In the case of His6-HP0547/HP0496, the expression of HP0496 was undetectable, although a higher yield of soluble His6-HP0547 was obtained. Thus, co-expression is not necessarily the best method to form complexes.

TABLE III. For each protein interaction, complex formation was assessed using three different methods in consecutive order. First (indicated as "soluble/soluble"), interaction was tested directly by mixing cell lysates. Both the MBP-Y/His6-X and the MBP-X/His6-Y combinations were used. HPXXXX in parentheses indicates the protein that was used as a His6-tagged protein to yield complexes. When no interaction was detected using this first method and when the His6-tagged binding partner was insoluble (Table II), it was hypothesized that lack of interaction was due to the insolubility of the His6-tagged binding partner. Thus, two methods were developed whereby the His6-tagged binding partner was either urea denatured and rapidly refolded on its own and the interaction assessed (indicated as "Refolded alone") or urea denatured and refolded in the presence of the MBP version of the binding partner (indicated as "Refolded with ligand"). (−), (+), and (+++) indicate interactions not detected, detected weakly, or detected strongly, respectively. Cells in gray indicate interactions that did not need to be evaluated by the method under consideration because they had already been assessed by a previous one. For the two methods involving refolding, the His6 tag fusion proteins that were used for complex formation are indicated in parentheses. The PBS of each interaction is indicated.

FIG. 1. Detection and purification of protein complexes using affinity chromatography. From A to E, SDS-PAGE (left panel) and corresponding Western blot (right panel) analyses of samples. In the right panel, His6 tag fusion proteins were detected with monoclonal anti-poly-histidine antibody (Sigma) as primary antibody and alkaline phosphatase-conjugated anti-mouse IgG from goat (Sigma) as secondary antibody, followed by visualization with nitroblue tetrazolium (NBT)-5-bromo-4-chloro-3-indolyl phosphate (BCIP) reaction. A, Example of a strong interaction where both interacting partners are soluble. The example is that of MBP-HP0650 (lanes 1 and 2, whole-cell extract before and after induction, respectively) interacting with His6-HP1245 (lanes 3 and 4, whole-cell extract before and after induction, respectively). After purification using amylose affinity chromatography, the resulting fraction contains MBP-HP0650 and also a clearly visible His6-HP1245 band in lane 5, left panel. The detection of His6-HP1245 is confirmed by Western blot analysis (lane 5, right panel). B, Example of a weak interaction where both interacting partners are soluble. The example is that of His6-HP0547 (lanes 1 and 2, whole-cell extracts before and after induction, respectively) interacting with MBP-HP0496 (lanes 3 and 4, whole-cell extract before and after induction, respectively). Both proteins are observed in the combined supernatant after sonication (lane 5). After purification using amylose affinity chromatography, the interaction is poorly detected after Coomassie blue staining (lane 6, left panel) but can be detected by Western blot analysis (lane 6, right panel). C, Example of an interaction that is not detected because one of the binding partners is not soluble. The example is that of His6-HP1231 (lanes 1 and 2, whole-cell extract before and after induction, respectively) interacting with MBP-HP1247 (lanes 3 and 4, whole-cell extract before and after induction, respectively). Expression profiles are good for both constructs, but His6-HP1231 is hardly detected in the soluble fraction of the lysate (lane 5, left and right panels), and therefore the interaction is not detected (lane 6, left and right panels). D, Example of an interaction that is not detected despite both binding partners being soluble. The example is that of the putative His6-HP0175/MBP-HP0608 interaction.

Protein Unfolding and Refolding of the His6-tagged Proteins to Detect the Interactions
It was hypothesized that, in the seven cases where no interaction was observed, this was due to the His6-tagged binding partner being insoluble. To further investigate this hypothesis, two unfolding/refolding protocols were used to generate soluble forms of the His6-tagged proteins. The first uses a rapid refolding of the individual protein in an "arginine" buffer (500 mM) after urea treatment. It was found that only one protein, His6-HP1032, responded to this treatment and could be obtained in a soluble form amenable to complex formation. After the refolding step, His6-HP1032 was efficiently purified by His tag affinity chromatography in nondenaturing conditions. The purified protein was mixed with the supernatant obtained after sonication of the MBP-HP1122-expressing cells. MBP tag affinity chromatography resulted in the purification of the MBP-HP1122/His6-HP1032 complex as detected by SDS-PAGE and confirmed by Western blotting (Fig. 3). This finding is in agreement with previous data showing that HP1032 encodes the sigma factor and that HP1122 is the corresponding anti-sigma factor (37).
However, six of the seven interactions could not be characterized by this method, either because the proteins precipitated after the refolding step or because the refolded proteins could not be purified by His tag affinity chromatography (data not shown). These interactions were investigated by refolding the urea-denatured His6-tagged protein in the presence of its putative MBP fusion binding partner (see "Experimental Procedures"). His6-HP0537 and MBP-HP0261 illustrate a successful purification of the complex (Fig. 4). After purification on a TALON column, the purified unfolded His6-HP0537 was mixed with the supernatant of the MBP-HP0261-containing cells and dialysed overnight against the refolding buffer. After MBP affinity chromatography, His6-HP0537 co-eluted with the MBP-tagged HP0261, showing that a strong interaction occurs between these two binding partners (Fig. 4). This method was successful with five of the six interactions tested (Table III), showing that the interactions that were not detected with previous methods (soluble-soluble or refolding method 1) did indeed occur.
This illustrates the importance of folding proteins in the presence of their interacting partners. It is likely that the binding partner of the protein helps in the folding process. As underlined in Ref. 38, there has recently been an increased number of studies reporting that protein domains that are wholly or partly unstructured in solution can become structured upon binding to their target. It is also possible that the MBP part of the MBP-tagged protein plays a role in solubilizing its His6-tagged binding partner. The chaperone-like activity of MBP has been suggested to occur only when fused to a protein (33). However, MBP may also be active in trans.
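The counts reported in this section can be tallied to check the overall confirmation rate. The sketch below is a reconstruction from the numbers given in the text; the category labels are ours, and the "not confirmed" count of three is inferred (two soluble pairs that did not form a complex plus one pair that failed both refolding protocols).

```python
# Outcome counts reconstructed from the Results text; labels are illustrative.
outcomes = {
    "strong, mixed lysates": 4,
    "weak, mixed lysates": 3,
    "refolded alone": 1,
    "refolded with partner": 5,
    "not confirmed": 3,          # inferred: 2 soluble pairs + 1 refolding failure
    "not testable (toxic)": 1,   # HP0149/HP0527
}

CONFIRMED = (
    "strong, mixed lysates",
    "weak, mixed lysates",
    "refolded alone",
    "refolded with partner",
)

total = sum(outcomes.values())
confirmed = sum(outcomes[key] for key in CONFIRMED)
rate_percent = 100.0 * confirmed / total

if __name__ == "__main__":
    print(f"{confirmed} of {total} interactions confirmed ({rate_percent:.1f}%)")
```

With these counts, 13 of 17 interactions are confirmed, which corresponds to the 76% figure when the toxic pair is kept in the denominator.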

DISCUSSION
In this study, we first describe a method to evaluate biochemically the H. pylori protein-protein interaction map generated by htY2H. We also show for the first time that type IV secretion proteins can interact with non-type IV secretion proteins.
A panel of proteins belonging to several functional categories and putative subcellular localizations was selected. We developed a strategy that proved successful in confirming 76% of the interactions previously identified by htY2H (15). This strategy is based on i) seeking maximum solubility by tagging at least one of the binding partners with MBP, ii) using a double tag system that allows rapid purification of the binary complexes, and iii) adopting an iterative approach that assesses complex formation using protocols of increasing complexity. The latter point appeared to be particularly important, as a number of protein complexes were shown to form only after denaturation of the His 6 -tagged protein followed by its refolding in the presence of its binding partner.
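The iterative approach in point iii can be read as a cascade in which each binding pair is tried with the simplest protocol first and moves to the next only on failure. The following is a minimal sketch of that logic, not the authors' procedure: the protocol names, the `Protocol` class, and the `first_successful_protocol` helper are hypothetical, and the only outcomes encoded are the two stated in the text (HP1032/HP1122 after refolding alone, HP0537/HP0261 after refolding with the partner).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Protocol:
    """One complex-formation protocol (hypothetical representation)."""
    name: str
    # Returns True when the complex co-purifies under this protocol.
    forms_complex: Callable[[str, str], bool]


def first_successful_protocol(
    his_partner: str, mbp_partner: str, protocols: Sequence[Protocol]
) -> Optional[str]:
    """Try protocols in order of increasing complexity; stop at the first success."""
    for protocol in protocols:
        if protocol.forms_complex(his_partner, mbp_partner):
            return protocol.name
    return None  # no protocol yielded a purifiable complex


# Outcomes taken from the text for two pairs; pairs not listed are treated
# as unconfirmed, which is an assumption made for illustration only.
OBSERVED = {
    ("HP1032", "HP1122"): "refold_alone",
    ("HP0537", "HP0261"): "refold_with_partner",
}


def make_protocol(name: str) -> Protocol:
    # A lookup stands in for the bench experiment in this sketch.
    return Protocol(name, lambda his, mbp: OBSERVED.get((his, mbp)) == name)


# Ordered from simplest to most involved (names are illustrative).
CASCADE = [
    make_protocol("soluble_mixing"),
    make_protocol("refold_alone"),
    make_protocol("refold_with_partner"),
]

if __name__ == "__main__":
    for pair in OBSERVED:
        print(pair, first_successful_protocol(*pair, CASCADE))
```

Ordering the protocols from least to most involved means the denaturation and co-refolding step is applied only to pairs that fail the simpler protocols.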
Expression of each binding partner in separate cells makes it possible to study the proteins of interest separately. This could be useful in more detailed biochemical studies aimed at determining binding constants or thermodynamic parameters for the interaction under investigation, or at establishing biochemical screens of compound libraries to discover new lead compounds able to disrupt complex formation. We successfully determined the binding parameters of only one interaction using isothermal titration calorimetry, that between HP1032 and HP1122. All other interactions studied by isothermal titration calorimetry, where both binding partners were soluble either directly or after refolding, produced an uninterpretable or undetectable heat signal, or severe baseline drift (results not shown).
The present study suggests an original alternative or complementary approach to structural genomics efforts. One of the reasons structural genomics efforts have demonstrated very little productivity (only a few new folds have been discovered since the inception and funding of the Protein Structure Initiative) and have had little scientific impact is that they have focused their considerable means and efforts on single protein structures. This has led to the successful handling of a relatively small number of soluble proteins (the so-called "low-hanging fruits"). Protein complexes are much more informative and, as has been shown repeatedly, complex formation often leads to higher solubility and stability (39). Thus, by forming complexes, one gains access to proteins that would otherwise be insoluble. The only major hurdle to the "structural complexomics" approach that we advocate here has been to obtain a reliable source of high-throughput complex discovery and characterization. Here we show that htY2H is such a source.
Recent studies have highlighted that htY2H may lead to serious misinterpretation (2). False-positives can be separated into two groups, namely methodological and biochemical. The first group includes apparent htY2H-characterized interactions that are not based on the assembly of the two-hybrid proteins, and the second encompasses interactions that are not physiologically relevant (2). Although our study does not provide information about the latter category, it shows that a large majority of the interactions from the H. pylori protein interaction map are genuine and therefore are not methodological false-positives.
We characterized here two interactions that have not been studied before: i) between HP0525, a VirB11 homolog, and HP1451, a non-type IV secretion protein, and ii) between HP0547 (CagA) and HP0496, also a non-type IV secretion protein. HP1451 SID (residues 92-264) shows homology 2 to the SpoIIIJ-associated protein (JAG; COG1847) that contains a domain known to bind single-stranded nucleic acids. Although the exact function of this protein is unknown, it is likely involved in regulation of the SpoIIIJ protein, which plays a crucial role during sporulation (40,41). This is therefore a possible second homologue of the sporulation pathway associated with the type IV secretion system, as VirD4/TrwB shows homology to SpoIIIE, the ATPase and DNA motor of the sporulation machinery (42,43).

FIG. 4. ... (lanes 1 and 2) and MBP-HP0261 (lanes 3 and 4). Overexpressed protein bands are observed after isopropyl β-D-thiogalactoside (IPTG) induction of the cells (lanes 2 and 4) compared with noninduced cells (lanes 1 and 3) for His 6 -HP0537 and MBP-HP0261, respectively, and are in agreement with their respective predicted molecular masses of 44 and 59 kDa. After denaturation in urea (8 M), His 6 -HP0537 was purified under denaturing conditions by affinity chromatography using a TALON column (lane 5). The purified His 6 -HP0537 was added to the soluble fraction of the cells expressing MBP-HP0261 (lane 6) and refolded overnight by dialysis against a renaturation solution. The soluble fraction after dialysis contained both MBP-HP0261 and His 6 -HP0537 (lane 7), and the complex formed was purified by amylose affinity chromatography (lane 8). Right panel, immunodetection after Western blotting of the same samples using anti-His 6 antibodies, confirming that HP0537 interacts with HP0261.

HP0547 (CagA) SID (392-733) was shown to interact with HP0496 full-length protein.
HP0496 shows homology (30% identity) to the thioreductase cluster of genes (COG0824), especially to the YbgC protein family from Escherichia coli, the cytoplasmic protein of the TOL-PAL system. This system is important for cell envelope integrity and may function in the transport of material through the periplasm (44). The biological significance of these new interactions is at the moment unclear, but their discovery paves the way for new avenues of research in type IV secretion.