Molecular & Cellular Proteomics

Research

Proteome Data Improves Protein Function Prediction in the Interactome of Helicobacter pylori

Stefan Wuchty, Stefan A. Müller, J. Harry Caufield, Roman Häuser, Patrick Aloy, Stefan Kalkhof and Peter Uetz
Molecular & Cellular Proteomics May 1, 2018, First published on February 1, 2018, 17 (5) 961-973; https://doi.org/10.1074/mcp.RA117.000474
Stefan Wuchty
From the ‡Dept. of Computer Science, §Center for Computational Science, ¶Dept. of Biology, ‖Sylvester Comprehensive Cancer Center, Univ. of Miami, Miami, FL 33156;
Stefan A. Müller
**German Center for Neurodegenerative Diseases (DZNE), 81377 Munich, Germany;
J. Harry Caufield
‡‡Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA 23284;
Roman Häuser
§§German Cancer Research Center, 69120 Heidelberg, Germany;
Patrick Aloy
¶¶Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona) and the Barcelona Institute of Science and Technology. Barcelona, Catalonia, Spain; ‖‖Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain;
Stefan Kalkhof
aDepartment of Molecular Systems Biology, UFZ, Helmholtz-Centre for Environmental Research Leipzig, 04318 Leipzig, Germany; bInstitute of Bioanalysis, University of Applied Sciences and Arts of Coburg, Friedrich-Streib-Str. 2, 96450 Coburg, Germany; cFraunhofer Institute for Cell Therapy and Immunology, Department of Therapy Validation, 04103 Leipzig, Germany
Peter Uetz
‡‡Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA 23284;

Abstract

Helicobacter pylori is a common pathogen that is estimated to infect half of the human population, causing several diseases such as duodenal ulcer. Despite being one of the first pathogens to be sequenced, its proteome remains poorly characterized, as about one-third of its proteins have no functional annotation. Here, we integrate and analyze known protein interactions with proteomic and genomic data from different sources. We find that proteins with similar abundances tend to interact. This observation is accompanied by a tendency of interactions to occur between proteins of similar function, although some functional classes show marked cross-talk with others. Protein function prediction with protein interactions is significantly improved when interactions from other bacteria are included in our network, allowing us to obtain putative functions of more than 300 poorly or previously uncharacterized proteins. Proteins that are critical for the topological controllability of the underlying network are significantly enriched with genes that are up-regulated in the spiral compared with the coccoid form of H. pylori. Determining their evolutionary conservation, we present evidence that 80 protein complexes are identical in composition with their counterparts in Escherichia coli, while 85 are partially conserved and 120 complexes are completely absent. Furthermore, we determine network clusters that coincide with related functions, gene essentiality, genetic context, cellular localization, and gene expression in different cellular states.

Helicobacter pylori (H. pylori) is a pervasive pathogen that is uniquely adapted to life in the acidic environment of the human stomach and associated with gastric inflammation and duodenal ulcer (1, 2). Persisting in such an environment by tightly associating with epithelial cells, H. pylori affects an estimated half of the human population. As a consequence, H. pylori is notorious for causing low-level inflammation and duodenal ulcer as well as stomach carcinoma and MALT (mucosa-associated lymphoid tissue) lymphoma (1–3), and it accounts for 700,000 deaths annually worldwide (4).

The genome of H. pylori reference strain 26695 was completely sequenced in 1997 (5) and encodes ∼1,587 proteins with about 34% remaining uncharacterized (6). Given its impact on world health, a concerted effort is required to understand this significant number of proteins and their role in infection and disease.

Interactions between proteins are needed for almost all biological processes, and mapping them helps to understand pathways and to link poorly characterized or uncharacterized proteins to known ones. Only a few comprehensive bacterial interactome studies have been published to date, such as those of Escherichia coli (7), Campylobacter jejuni (8) and Mycobacterium tuberculosis (9). In particular, protein interactions of H. pylori were among the first to be determined in bacteria (10), an interactome that has been recently expanded (11), capturing roughly 70% of the proteome. While such interactomes have been detected using yeast two-hybrid methods, a few studies also identified bacterial protein complexes (12–15).

Several studies have attempted to characterize the proteome of H. pylori. Bumann et al. (16) found more than 1,800 protein spots on 2-dimensional gels, of which 200 were identified. Similarly, Jungblut et al. (17) found up to 1,800 protein spots on 2-dimensional gels, of which 152 were identified, including 27 proteins that corresponded to hitherto hypothetical proteins (17). Govorun et al. (18) analyzed the proteomes of four H. pylori clinical isolates and identified 126 proteins. More recently, Jungblut et al. (19) used intensive prefractionation to identify a total of 567 proteins (36.6% of the proteome). Recently, we have identified 1,190 and 1,143 proteins by 2D-LC-MS and GeLC-MS, respectively (20, 21), representing roughly 72% of the H. pylori proteome.

As proteomes and interactomes have been determined independently, their relationship remains unclear. Here, we integrate proteomic quantitative measurements in a network of roughly 3,000 protein–protein interactions (10, 11). Our analyses of diverse datasets allow us to explore the role of abundance in both the proteome and interactome as well as the structure and functionality of networked patterns. Investigating the proteomes of spiral and coccoid forms of H. pylori, we find that proteins that are critical for the control of the underlying interactome are significantly enriched with genes that are differentially expressed in the spiral form. Such observations potentially point to single proteins that play a role in the adaptability of the pathogen to different physiological conditions. Furthermore, we predict the function of more than 300 previously poorly annotated genes as well as protein complexes and functional network clusters in H. pylori. As a consequence, our integration and analysis of various large-scale datasets provide new insights into the proteome, interactome, and physiology to significantly improve our knowledge of this important pathogen.

EXPERIMENTAL PROCEDURES

Essential Genes

We collected essentiality data from several comprehensive genetic studies in H. pylori (22, 23). Furthermore, we added genes that were essential for H. pylori colonization (24, 25).

Relative and Absolute Protein Quantification

Data on relative changes in protein abundance between coccoid and spiral cells were extracted from our previous study (21). Briefly, four biological replicates of coccoid and spiral H. pylori cells were measured by LC-MS using a stable isotope labeling in cell culture based approach, capturing 1,143 proteins.

As for proteomic abundances, we collected data from a previous study as well (20). Nine replicates were measured by LC-MS/MS analysis without sample fractionation, capturing the abundance values of 1,190 proteins in H. pylori.

Protein–protein Interactions in H. pylori

We collected a total of 3,002 protein interactions from two high-throughput studies. In particular, we considered a set called PIM1 from (10) and PIM2 from (11) that both were determined by yeast two-hybrid approaches. We also identified 1,466 interactions that were classified as core data (i.e. high confidence) as they represent the overlap of PIM1 and PIM2 (10, 11).

Clustering Analysis

We used the Markov Clustering (26) algorithm (MCL) to identify clusters of interactions in the combined core H. pylori network. Applying different combinations of parameters, we automatically assessed whether each cluster was significantly enriched with proteins that share an annotation. In particular, we utilized functional annotations from the Comprehensive Microbial Resource (27), Gene Ontology (GO) (28) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (29). Furthermore, we utilized gene essentiality from (22–25). For microarray analyses, we utilized 16 sets of gene expression analyses of H. pylori (27) from the Comprehensive Microbial Resource and considered three cases: genes up-regulated (+), genes down-regulated (−), and genes differently regulated (+ or −). Each experiment is identified by a number, a title, and the author (Table S4). For GO term enrichment analysis, we used the TopGo python library (30). For other annotations, we used Fisher's exact test and considered clusters if they enriched genes with p < 0.05.
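A minimal sketch of the per-cluster enrichment test, assuming a one-sided (over-representation) Fisher's exact test computed from the hypergeometric distribution. The sidedness, the function name `fisher_enrichment_p`, and its argument names are illustrative assumptions and are not specified in the text.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p value for over-representation.

    k: proteins in the cluster that carry the annotation
    n: cluster size
    K: proteins in the whole network that carry the annotation
    N: proteins in the whole network
    Returns P(X >= k) when n proteins are drawn without replacement.
    """
    upper = min(n, K)
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1))
    return tail / comb(N, n)

# Hypothetical cluster: 6 of 8 members are essential, 100 of 759 overall.
p = fisher_enrichment_p(6, 8, 100, 759)
print(f"p = {p:.2e}", "enriched" if p < 0.05 else "not enriched")
```

A cluster would then be retained for an annotation if the returned value falls below the 0.05 cutoff named above.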

Functional Classes of Proteins

H. pylori proteins were grouped according to broad functional classes that were defined by clusters of orthologous groups (COGs) (31, 32) since COGs provide a consistent classification of bacterial genes based on orthologous groups.

Enrichment Analysis

Binning proteins with a certain characteristic d (e.g. with a given number of interactions), we calculated the fraction of proteins that had a feature i in each group d, $f_i(d)$. As a null model, we randomly sampled protein sets with feature i of the same size 10,000 times and calculated the corresponding random fraction, $f_{i,r}(d)$. The enrichment/depletion of proteins with feature i in a group d is then defined as $E_i(d) = \log_2\left[f_i(d)/f_{i,r}(d)\right]$.
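The sampling procedure can be sketched as follows. This is an illustration under stated assumptions: the enrichment is taken as the base-2 logarithm of the observed fraction over the mean random fraction, and the function name `enrichment` and its arguments are hypothetical.

```python
import math
import random

def enrichment(group, with_feature, universe, n_random=10_000, seed=0):
    """log2 of observed over mean random fraction of feature carriers in `group`.

    group: proteins in bin d
    with_feature: proteins that carry feature i
    universe: all proteins from which random feature sets are drawn
    """
    rng = random.Random(seed)
    group, with_feature = set(group), set(with_feature)
    pool = sorted(universe)
    observed = len(group & with_feature) / len(group)
    random_total = 0.0
    for _ in range(n_random):
        # Random protein set of the same size as the real feature set.
        sample = set(rng.sample(pool, len(with_feature)))
        random_total += len(group & sample) / len(group)
    random_fraction = random_total / n_random
    if observed == 0:
        return float("-inf")  # complete depletion
    return math.log2(observed / random_fraction)
```

Positive values indicate enrichment of the feature in the bin and negative values indicate depletion.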

Interactions Between Functional Classes

Proteins of H. pylori were grouped according to their protein abundance. Focusing on a set of protein interactions, we counted the occurrence of different abundance group combinations (33). For each combination of abundance groups i, j, we determined its probability $p_o(i,j) = n_{ij}/N$, where $n_{ij}$ is the number of interactions between groups i and j and N is the total number of interactions between the underlying abundance groups. As a null model, we determined an expected probability of interactions between classes i, j: $p_e(i,j)$ = [expression not recovered]. Specifically, $v_i$ is the number of viable proteins in group i (i.e. proteins of group i that are involved in at least one interaction in the underlying set), and $J_{i,j}$ is the number of genes that are involved in both groups. Combining these probabilities, we determined a log-odds ratio: $r = \ln\left[\frac{p_o(i,j)\,(1 - p_e(i,j))}{p_e(i,j)\,(1 - p_o(i,j))}\right]$. For large samples, we estimated the variance of the odds distribution $\sigma^2 = n_{ij}^{-1} + (N - n_{ij})^{-1} + a^{-1} + (b - a)^{-1}$, where $a = v_i v_j -$ [term not recovered] and $b$ = [expression not recovered]. In particular, we calculated a p value for the significance of a link between two groups by a Z-test, where $z = r/\sigma$, and considered each link with p < 0.05 (33).
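The Z-test on the log-odds ratio can be sketched as follows, taking the four counts as inputs. The sketch assumes that the log-odds ratio compares the observed odds n_ij/(N − n_ij) with the expected odds a/(b − a), which is the form implied by the four-term variance, and that the p value is two-sided; the counts a and b must be supplied by the caller, and the function name `log_odds_z` is hypothetical.

```python
import math

def log_odds_z(n_ij, N, a, b):
    """Log-odds ratio, Z-score, and two-sided normal p value for one group pair.

    n_ij: observed interactions between groups i and j
    N:    all interactions between the abundance groups
    a, b: expected-side counts that enter the variance (caller-supplied)
    """
    r = math.log((n_ij / (N - n_ij)) / (a / (b - a)))
    variance = 1 / n_ij + 1 / (N - n_ij) + 1 / a + 1 / (b - a)
    z = r / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal
    return r, z, p

# Hypothetical counts, for illustration only.
r, z, p = log_odds_z(n_ij=50, N=100, a=100, b=1000)
print(f"r = {r:.2f}, z = {z:.2f}, p = {p:.1e}")
```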

Bacterial Meta-protein Interaction Data

We used 2,231 binary interactions between E. coli proteins that we have previously determined through yeast two-hybrid screens (7). As for other yeast two-hybrid screen sets, we utilized 12,012 interactions in Campylobacter jejuni (8), 3,121 interactions in Mesorhizobium loti (34), 3,236 interactions in Synechocystis sp. PCC6803 (35), 2,519 interactions in Streptococcus pneumoniae (36), 3,684 interactions in Treponema pallidum (33), 783 interactions in Bacillus subtilis (37), and 8,042 interactions in M. tuberculosis (9).

Utilizing all-versus-all BLASTP searches with the InParanoid script (38) in protein sets of two species, sequence pairs with mutually best scores were selected as central orthologous pairs. Proteins of both species that showed such an elevated degree of homology were clustered around these central pairs, forming orthologous groups. The quality of the clustering was further assessed by a standard bootstrap procedure. We only considered central orthologous sequence pairs with a confidence level of 100% as real orthologous relationships. Protein sequence information of bacterial organisms was retrieved from UniProt (39).
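The selection of mutually best-scoring pairs can be illustrated with a short sketch. It covers only this seed step and does not reproduce InParanoid's clustering or bootstrap confidence; the function name `mutual_best_hits` and the score dictionaries are hypothetical stand-ins for parsed BLASTP output.

```python
def mutual_best_hits(scores_ab, scores_ba):
    """Return pairs (a, b) that are each other's best-scoring hit.

    scores_ab[(a, b)]: score of query a (species A) against subject b (species B)
    scores_ba[(b, a)]: score of query b (species B) against subject a (species A)
    """
    def best(scores):
        top = {}
        for (query, subject), score in scores.items():
            if query not in top or score > top[query][1]:
                top[query] = (subject, score)
        return {query: subject for query, (subject, _) in top.items()}

    best_ab, best_ba = best(scores_ab), best(scores_ba)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}

# Toy scores, for illustration only.
ab = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b2"): 80, ("a3", "b1"): 150}
ba = {("b1", "a1"): 190, ("b2", "a1"): 60, ("b2", "a2"): 75}
print(sorted(mutual_best_hits(ab, ba)))
```

In the toy data, a3 hits b1 best, but b1 prefers a1, so a3 does not form a central pair.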

Functional Prediction of Unknown Proteins in H. pylori

We modeled the prediction of a functional class σ of a protein i as a Potts model (40). In particular, we considered functional annotation of proteins in H. pylori using COG classes as of the EggNOG database (24). All proteins without a functional annotation as well as proteins that were either classified as unknown or had a general function (such as membrane protein or ABC transporter) were randomly assigned a function out of the remaining 23 classes. We then minimized the global function $E = -\sum_{i,j} J_{ij}\,\delta(\sigma_i,\sigma_j) - \sum_i h_i(\sigma_i)$, where $J_{ij}$ is the adjacency matrix of the interaction network that accounts for unclassified proteins: $J_{ij} = 1$ if unclassified proteins i and j interact and $J_{ij} = 0$ otherwise. $\delta(\sigma_i,\sigma_j)$ is the discrete δ function, where δ = 1 if unclassified proteins i and j have the same function (i.e. $\sigma_i = \sigma_j$) and δ = 0 otherwise. As a consequence, the first term rewards interactions between unclassified proteins that are predicted to have the same function. The second term rewards support for the function assigned to an otherwise unclassified protein i: $h_i(\sigma_i)$ is the number of classified proteins that interact with unclassified protein i and carry the same function $\sigma_i$ that was assigned to protein i. To minimize E, we applied a simulated annealing approach that features an effective temperature T. After initially assigning random functions to all unclassified proteins, we randomly selected a protein, changed its function to a different class, and determined the energy of the new configuration. If the difference of energies ΔE ≤ 0, the new configuration was accepted. If ΔE > 0, the new configuration was accepted with probability $p = e^{-\Delta E/T}$. To obtain stabilized functional configurations, we repeated such a Monte Carlo step 10,000 times (40). Subsequently, we increased the inverse of T by 0.01 in each step and repeated such Monte Carlo steps. Since minimum energy solutions are not unique, we repeated such runs of simulated annealing 100 times and considered the fraction of times an unclassified protein i was observed in a certain functional state σ as an estimate of the probability that protein i belongs to class σ.
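A compact sketch of one annealing run is given below. It is an illustration under stated assumptions rather than the implementation used in the study: each interacting pair of unclassified proteins is counted once in the first energy term, the number of temperature levels (`n_betas`) and the default number of steps per level are arbitrary choices, and all function and parameter names are hypothetical.

```python
import math
import random

def anneal_functions(edges, known, unknown, classes,
                     n_betas=300, steps_per_beta=None, d_beta=0.01, seed=0):
    """Assign one class to each protein in `unknown` by simulated annealing.

    edges: iterable of (protein, protein) interactions
    known: dict protein -> class for classified proteins
    unknown: list of unclassified proteins
    classes: list of at least two class labels
    """
    rng = random.Random(seed)
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    state = {p: rng.choice(classes) for p in unknown}  # random initial functions
    steps = steps_per_beta or len(unknown)  # the text uses 10,000 steps

    def support(p, c):
        # Interaction partners of p that carry class c, whether classified
        # (h term) or currently assigned (J term, each pair counted once).
        return sum(1 for q in neighbours.get(p, ())
                   if known.get(q, state.get(q)) == c)

    beta = 0.0  # inverse temperature 1/T
    for _ in range(n_betas):
        beta += d_beta
        for _ in range(steps):
            p = rng.choice(unknown)
            new = rng.choice([c for c in classes if c != state[p]])
            delta_e = support(p, state[p]) - support(p, new)
            # Metropolis rule: always accept downhill moves, otherwise
            # accept with probability exp(-delta_e / T).
            if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                state[p] = new
    return state
```

Repeating the run with different seeds and counting how often each protein ends in each class yields the class probabilities described above.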

Heterogeneity of Functional Prediction

The Simpson s-index considers the fractions with which a given protein was assigned to a functional class. In particular, we calculated its heterogeneity of functional fractions as a Simpson diversity (41) index defined as $s = \sum_{i=1}^{N} p_i^2$, where $p_i$ is the fraction with which a given protein was assigned to functional class i and N is the number of functional classes. Such a measure tends to 1 if one function dominates the distribution of fractions and to its minimum, 1/N, if the fractions are spread evenly over all classes.
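As a small worked example, the index can be computed from a protein's vector of class fractions. The function name `simpson_index` is hypothetical, and the normalization step is an added convenience so that raw counts can be passed as well.

```python
def simpson_index(fractions):
    """Sum of squared class fractions; inputs are normalized to sum to 1."""
    total = sum(fractions)
    return sum((f / total) ** 2 for f in fractions)

print(simpson_index([1, 0, 0]))              # one class dominates
print(simpson_index([0.25, 0.25, 0.25, 0.25]))  # evenly spread over 4 classes
```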

Three-dimensional Modeling of Protein Structures

To model the structures of proteins (Fig. 6), we used Protein Data Bank (PDB) (42) structures 1A50 (TrpAB), 1PII (TrpCF), 1KGZ (TrpD), 1I1Q (TrpFE), 2EEY (MoaC), 2FUW (Mog), 3RPF (MoaDE), and 2BZ0 (RibA). We created images with PyMOL v1.5.0.1.

Determination of Critical, Intermediate, and Redundant Proteins

We defined a set S ⊆ V of nodes in a network G = (V, E) as a minimum dominating set (MDSet) if every node v ∈ V is either an element of S or adjacent to an element of S, and no smaller set has this property. In a binary integer linear programming problem (ILP), we assigned a binary variable $x_v = 1$ when a protein v ∈ V that participates in interactions E in a protein interaction network G is an element of the MDSet, and $x_v = 0$ otherwise. The smallest set of MDSet nodes is obtained by $\min \sum_{v \in V} x_v$, subject to the constraint $x_v + \sum_{w \in \Gamma(v)} x_w \geq 1$ for every v ∈ V, where Γ(v) is the set of interaction partners of protein v. We denote the size of this MDSet by N. However, many optimal solutions exist that provide MDSets of the same size. Such characteristics suggest the existence of subsets of nodes that always (critical nodes), never (redundant nodes), or sporadically (intermediate nodes) appear in MDSets. To find such subsets, we first determine whether a node v ∈ MDSet appears in the MDSet of every solution. For each v ∈ MDSet, we create an ILP as before and assume that $x_v = 0$ (i.e. not participating in the MDSet). After solving the ILP, we determine the size $N_v$ of the corresponding MDSet that we obtained with $x_v = 0$. If $N_v > N$, v is a critical node, and intermediate otherwise. For all nodes that did not participate in the original MDSet, v ∉ MDSet, we need to check if they always appear outside MDSets. For v ∉ MDSet, we create an ILP as before and assume that $x_v = 1$ (i.e. participating in the MDSet). After solving the ILP, we determine the size $N_v$ of the corresponding MDSet that we obtained with $x_v = 1$. If $N_v > N$, v is a redundant node, and intermediate otherwise (43, 44). To solve these ILP problems, we utilized a branch-and-bound algorithm (45) as implemented by the lpSolve library.
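The three node classes can be illustrated on toy networks with an exhaustive search. Note that this sketch swaps the ILP and branch-and-bound solver for brute-force enumeration of all minimum dominating sets, which yields the same classification but is feasible only for very small graphs; the function name `classify_nodes` is hypothetical.

```python
from itertools import combinations

def classify_nodes(adj):
    """Classify nodes of a small, nonempty undirected graph.

    adj: dict node -> set of neighbouring nodes (every node must be a key)
    Returns (size of a minimum dominating set, dict node -> role).
    """
    nodes = sorted(adj)

    def dominates(subset):
        covered = set(subset)
        for v in subset:
            covered |= adj[v]
        return len(covered) == len(nodes)

    # Smallest size at which at least one dominating set exists.
    for size in range(1, len(nodes) + 1):
        optimal = [set(c) for c in combinations(nodes, size) if dominates(c)]
        if optimal:
            break

    roles = {}
    for v in nodes:
        hits = sum(v in s for s in optimal)
        if hits == len(optimal):
            roles[v] = "critical"      # in every minimum dominating set
        elif hits == 0:
            roles[v] = "redundant"     # in none of them
        else:
            roles[v] = "intermediate"  # in some but not all
    return size, roles

star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
print(classify_nodes(star))
```

In the star network the hub is the only minimum dominating set, so it is critical and all leaves are redundant.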

Betweenness Centrality

As a global measure of its centrality, we calculated a node's betweenness, indicating a node's appearance in shortest paths through the whole network. In particular, we defined betweenness centrality $c_B$ of a node v as $c_B(v) = \sum_{s \neq v \neq t} \sigma_{st}(v)/\sigma_{st}$, where $\sigma_{st}$ was the number of shortest paths between proteins s and t, while $\sigma_{st}(v)$ was the number of those shortest paths running through v.
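Betweenness in an unweighted network can be computed with Brandes' accumulation scheme, sketched below. The use of this particular algorithm and the final division by two (so that each unordered pair s, t is counted once) are assumptions, and the function name `betweenness` is hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Betweenness centrality for an unweighted, undirected graph.

    adj: dict node -> set of neighbouring nodes (every node must be a key)
    """
    cb = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies in order of decreasing distance.
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                cb[w] += delta[w]
    # Each unordered pair was visited from both ends.
    return {v: c / 2 for v, c in cb.items()}

chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(betweenness(chain))
```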

RESULTS

Proteome Versus Interactome

We combine interaction datasets that have been determined by yeast two-hybrid approaches and obtain an interactome of H. pylori that connects 1,060 proteins (∼70% of the proteome) through roughly 3,000 interactions (10, 11). Furthermore, a “core” interactome, capturing high-confidence interactions, connects 759 proteins (49% of the proteome) through 1,466 interactions. In Fig. 1A, we label each protein with its functional class, essentiality, and abundance in this high-quality core network of protein interactions. To estimate absolute quantities, we utilize data from our previous study (20), where we measured the abundance of proteins in H. pylori without sample fractionation with an LC-MS approach. Only accounting for proteins with at least three unique peptides, we obtain abundance values of 1,130 proteins, of which 831 are interacting proteins in the combined interaction network of H. pylori. In Fig. 1B, we determine the functions of the most abundant proteins by binning proteins according to their abundance in three groups. Utilizing functional annotations from the EggNOG database (46), we find that the most abundant proteins are involved in metabolite biosynthesis, transport and catabolism, protein turnover, translation and energy production. Fig. 2A indicates that essential proteins are significantly overrepresented among highly abundant proteins (Student's t test, $p < 10^{-2}$). To account for interactions between proteins, we determine the proteins' propensity to interact with proteins of certain abundance levels. In particular, we calculate each protein's abundance-specific Z-score and group proteins in bins of certain Z-scores. In Fig. 2B, we determine the enrichment of interactions between proteins that appear in a given Z-score bin, utilizing the combined network of all protein interactions. Generally, we observe that proteins predominantly interact with proteins of similar abundance. Still, our results further indicate that proteins in low-abundance bins appear to interact with proteins in intermediate-abundance bins.

Fig. 1.

The interactome and proteome of H. pylori. (A) We map all high-quality interactions between proteins in the core protein interaction network and their corresponding abundances in H. pylori. Furthermore, we label all proteins with their protein functions and essentiality. (B) We bin protein abundances in three groups (high: top 20%, low: bottom 20%, intermediate: remainder) and determine the enrichments of functions in each bin. We find that highly abundant proteins are preferentially enriched with metabolic functions.

Fig. 2.

Essentiality and abundances. (A) Grouping proteins into bins of abundances, we observe that essential proteins are more abundant than their nonessential counterparts ($p < 10^{-2}$, Student's t test). (B) Using the combined interaction network in H. pylori, proteins appear to predominantly interact with proteins of similar abundance. In turn, proteins in low-abundance bins tend to interact with proteins in intermediate-abundance bins.

The Functional Cross-talk of the Interactome

Resembling other interactome hairballs (Fig. 1A), the H. pylori interactome does not show clear functional clustering. However, the interaction data are well supported by interactions between proteins of the same functional group (Fig. 3). This observation can be used to validate the reliability (or at least plausibility) of an interaction dataset since random interactions would provide no significant enrichment. Interestingly, we additionally observe some unexpected functional cross-talk. For example, ribosomal proteins and proteins involved in translation interact with proteins involved in motility more often than expected by chance (groups I and K in Fig. 3). Similarly, motility proteins also interact with proteins involved in amino acid metabolism.

Fig. 3.

Functional crosstalk in the interactome of H. pylori. Determining the prevalence of interactions between functional groups, we observe that the majority of interactions appear between proteins of the same functional class.

Predicting Functions of Proteins Using a Bacterial Meta-interactome

To investigate the functional predictive power of our initial network of experimentally determined interactions in H. pylori, we randomly pick 80% of all functionally annotated proteins 1,000 times to predict the functions of the remaining 20% in each random run. Using a stochastic model (40), we represent every protein by a profile that reflects the probability of having a certain function. Applying different probability thresholds for the presence of a functional annotation, we determine receiver operating characteristic curves and consider the corresponding area under the curve as a measure of prediction quality (47) (Fig. 4A). To increase the predictive power of the underlying protein interaction network, we augment our network in H. pylori with protein interactions from other bacteria (36, 48). Specifically, we consider interactions that have at least one interacting protein with a functionally annotated ortholog in H. pylori, while its interacting counterpart is at least functionally annotated in the corresponding organism. Focusing on the same previously sampled sets of proteins, we predict the functions of the corresponding 20% by utilizing the augmented network. Notably, we observe a significant shift toward increased values of the area under the receiver operating characteristic curve ($p < 10^{-50}$, Student's t test), suggesting that the augmentation of the original network with interactions from other bacteria significantly improves the quality of functional predictions (Fig. 4A). Since each protein is represented by a profile of function-specific probabilities, we calculate the Simpson s-index (41) as a measure of heterogeneity of predicted functions. Such a measure tends to be 1 if one function dominates the distribution of fractions (i.e. has a high probability). In turn, the s-index approaches its minimum (1/N for N functional classes, i.e. close to 0) if probabilities are equally distributed. Since our sampling approach randomly picks a subset of proteins and predicts functions based on the remaining proteins in both the original interaction network of H. pylori and the augmented network, we can directly compare the impact of the augmented network on the homogeneity of functional prediction. In Fig. 4B, we calculate the mean s-indices of each protein, suggesting that functional predictions of the majority of proteins benefit from the addition of the bacterial meta-interactome. Based on our observation that interactions from other bacteria considerably improve our ability to predict functions, we apply our approach to the functional prediction of 337 poorly characterized or previously unknown H. pylori proteins. While we determine the probability that a given protein has a particular function, we assess the significance of our predictions by randomly sampling known functions 100 times. Applying a Z-test, we determine a corrected p value for each score (49) that we consider significant if FDR < 0.05. The heatmap in Fig. 4C shows the range of functions predicted for these proteins, including a sizeable fraction predicted to be involved in transcriptional and translational activities. In Supplemental Table S1, we present the functional profiles of all proteins in the order in which they appear in Fig. 4C.
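The area under the receiver operating characteristic curve for one held-out sample can be sketched as follows. The sketch uses the rank-based identity (the probability that a randomly chosen true annotation scores above a randomly chosen false one, with ties counted as one half) rather than explicit thresholding; the function name `roc_auc` and the toy inputs are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from predicted probabilities and 0/1 labels."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    # Count positive-negative pairs that are ranked correctly; ties count 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Toy example: predicted probability of a function vs. true annotation.
print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

Computing this value for each of the 1,000 random splits, once with the original and once with the augmented network, gives the two distributions that are compared above.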

Fig. 4.

Functional prediction of unknown proteins in H. pylori using a bacterial meta-interactome. (A) To assess the quality of our classification procedure, we randomly sample 20% of all functionally annotated proteins in H. pylori and utilize the remainder to predict their functions. To measure prediction quality, we calculate the area under the receiver operating characteristic curve, suggesting that the addition of the bacterial meta-interactome allows for better functional prediction ($p < 10^{-50}$, Student's t test). (B) We consider all randomized samples and calculate the mean s-indices of each gene of unknown function (circles) in both the original network of H. pylori and the augmented network. In the scatter plot, the homogeneity of the functional prediction of the majority of genes (78.6%) benefits from including the bacterial meta-interactome. (C) Combining the network of protein interactions of H. pylori and the bacterial meta-interactome, we predict the functions of 337 proteins with unknown or poorly characterized functions (FDR < 0.05).

Control of the H. pylori Protein Interaction Network

Considering the network of protein–protein interactions in H. pylori, we aim to identify proteins that are important for the topological controllability of the underlying network (43, 44, 50). In particular, networks are dominated by minimum dominating sets (MDSets) that can be determined by an ILP. Such a method allows us to find the smallest set of nodes where each non-MDSet node is adjacent to a node in the MDSet. However, many different configurations of MDSets exist that have the same size. This implies sets of nodes that always, partially, or never participate in MDSets. Therefore, we define proteins as critical if they participate in the MDSet of every configuration (Fig. 5A). Furthermore, we consider nodes redundant if they never appear in MDSets, while intermediate nodes occur in some MDSets but not in others. Applying an algorithm that allows us to determine such sets of nodes (43, 44), we observe that critical nodes account for less than 10% and intermediate nodes for less than 30% of all proteins (Fig. 5B). The mean degree of critical proteins far exceeds the corresponding values of intermediate and redundant proteins, which are close to the mean degree of all proteins in the underlying interaction network (Fig. 5B).

Fig. 5.

Controlling the H. pylori protein interaction network. (A) In a toy network we illustrate the concept of critical, intermediate, and redundant nodes. (B) In the table, we present statistics of the protein interaction network of H. pylori and of its corresponding critical, intermediate, and redundant proteins. Notably, critical proteins are highly connected, while degrees of intermediate and redundant nodes revolve around the mean degree of all proteins (dashed line). (C) We define the top 20% of proteins with the highest node betweenness as a set of bottleneck proteins. Randomly sampling sets of critical, intermediate, and redundant proteins 10,000 times, we find that critical nodes are strongly enriched with bottlenecks. While intermediate nodes are moderately enriched, we also find a significant depletion of redundant nodes in the underlying set of bottleneck proteins. (D) In the Volcano plot of the fold change of proteins that compares their abundance levels in the coccoid and spiral form, we label all proteins with their critical, intermediate, and redundant role in the underlying network of protein interactions of H. pylori. We define proteins with a fold change of >0.5 or <−0.5 (p < 0.05) as regulated proteins (shaded areas), suggesting that critical, regulated proteins are predominantly more abundant in the spiral form. (E) As a corollary, we randomly sample sets of regulated proteins 10,000 times. We observe that critical proteins are significantly enriched with regulated genes. (F) We determine the enrichment of functions in the set of critical proteins by randomly sampling their functions. We observe that critical proteins predominantly appear in transcriptional and posttranslational modification functions.

As for other topological characteristics, we calculate the betweenness centrality of all nodes in the underlying network. Defining the top 20% of proteins with the highest betweenness centrality as a set of bottleneck nodes, we calculate the enrichment of such proteins in the sets of critical, intermediate, and redundant proteins. Given all proteins in the underlying interaction network, we randomly shuffle their labels, generating nonoverlapping random sets of critical, intermediate, and redundant proteins. We observe that critical proteins are strongly enriched with bottlenecks (p < 10⁻⁴). Intermediate proteins are enriched with bottleneck nodes as well, albeit not significantly, while redundant proteins are hardly ever bottlenecks (p < 10⁻⁴, Fig. 5C).
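
The label-shuffling test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name enrichment_by_shuffling, the add-one empirical p value, and the toy data are all hypothetical, and the set of bottleneck proteins is taken as given rather than computed from betweenness centrality.

```python
import random


def enrichment_by_shuffling(labels, bottlenecks, role, n_samples=10_000, seed=0):
    """Compare the observed overlap of one role with bottlenecks to shuffled labels.

    labels maps each protein to 'critical', 'intermediate', or 'redundant';
    bottlenecks is the set of proteins with the highest betweenness.
    Assumption: empirical p values use the add-one correction.
    """
    rng = random.Random(seed)
    proteins = list(labels)
    role_labels = [labels[p] for p in proteins]

    def overlap(assigned):
        # Number of proteins that carry the role and are also bottlenecks.
        return sum(
            1 for p, r in zip(proteins, assigned) if r == role and p in bottlenecks
        )

    observed = overlap(role_labels)
    shuffled = role_labels[:]
    at_least = at_most = total = 0
    for _ in range(n_samples):
        rng.shuffle(shuffled)  # keeps the size of every role set fixed
        count = overlap(shuffled)
        total += count
        at_least += count >= observed
        at_most += count <= observed

    return {
        "observed": observed,
        "expected": total / n_samples,
        "p_enriched": (at_least + 1) / (n_samples + 1),
        "p_depleted": (at_most + 1) / (n_samples + 1),
    }


# Hypothetical toy data: 100 proteins, the first 20 are bottlenecks.
proteins = [f"p{i}" for i in range(100)]
labels = {
    p: "critical" if i < 10 else "intermediate" if i < 40 else "redundant"
    for i, p in enumerate(proteins)
}
bottlenecks = set(proteins[:20])
for role in ("critical", "intermediate", "redundant"):
    print(role, enrichment_by_shuffling(labels, bottlenecks, role))
```

Shuffling the labels rather than drawing independent random sets keeps the three groups nonoverlapping and of fixed size, which matches the description in the text.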

Additionally, we compare protein levels in the spiral and coccoid cells of H. pylori based on previously published proteomic data (21) to link the generated interaction network with cell physiology. These cellular forms were analyzed by LC-MS, allowing the comparison of relative changes between the two states with high accuracy. Determining the fold change between spiral and coccoid expression levels and the corresponding p value with a Student's t test, we generate a Volcano plot in which we label each protein as critical, intermediate, or redundant (Fig. 5D). Qualitatively, we observe that critical proteins seem to have a higher abundance in the spiral form of H. pylori. As a corollary, we consider a set of regulated genes defined as proteins with a fold change of >0.5 or <−0.5 and p < 0.05 (21). Randomly sampling sets of regulated genes 10,000 times (Fig. 5E), we observe that critical proteins are significantly enriched with regulated genes (p < 0.05), while redundant proteins are depleted (p < 0.05). Assuming that critical proteins play a role in the transition between the spiral and coccoid forms, we perform an analysis of their functions. Fig. 5F indicates that functions of critical proteins mostly revolve around transcriptional and posttranslational modification functions. In Supplemental Table S2, we annotate each protein with its role, fold change comparing coccoid to spiral form, and functional annotation.
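
The threshold rule for regulated proteins can be written down in a few lines. This is a minimal sketch that follows the cutoffs given in the Fig. 5 legend (fold change >0.5 or <−0.5, p < 0.05); the function name regulation_status and the example values are hypothetical, and the sketch assumes a signed fold change whose direction (which form is the numerator) is left to the caller.

```python
def regulation_status(fold_change, p_value, fc_cutoff=0.5, alpha=0.05):
    """Return 'up', 'down', or 'unregulated' for one protein.

    Assumptions: fold_change is a signed value, and the p value comes from a
    test computed elsewhere (the text uses a Student's t test).
    """
    if p_value >= alpha:
        return "unregulated"
    if fold_change > fc_cutoff:
        return "up"
    if fold_change < -fc_cutoff:
        return "down"
    return "unregulated"


# Hypothetical (fold change, p value) pairs for illustration only.
examples = {
    "protein_a": (0.8, 0.01),
    "protein_b": (-0.8, 0.01),
    "protein_c": (0.8, 0.20),
    "protein_d": (0.3, 0.001),
}
for name, (fc, p) in examples.items():
    print(name, regulation_status(fc, p))
```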

Genomic Organization and the Interactome

Bacterial genomes are typically organized through functional gene clusters such as operons that encode functional units such as protein complexes. Interdependence of genomic loci and protein interaction maps has been demonstrated for the T. pallidum interactome (33) as well as for phage interactomes (51, 52). To reveal such genomic links, we map protein interactions onto operons in the H. pylori genome (53) (Fig. 6). For instance, the well-characterized urease gene cluster (operons 4721–4722) reveals interactions among the enzyme's core components (UreA-UreB), between the urease accessory factors UreE-UreG and UreF-UreH, and finally between UreA and UreH (see Fig. 6 in (11)). Another example is observed in operon 4987, a gene cluster that encodes enzymes involved in tryptophan biosynthesis. A comparison with experimental protein structures from orthologs shows that our protein-protein interaction (PPI) studies (11) capture most interactions that are important to assemble the enzyme complexes (Fig. 6, right panel): TrpA-TrpB and TrpE-TrpG are organized as heterotetramers with two subunits of each protein. TrpD is a homodimer, and TrpCF is a single protein that functions as a monomer. Likewise, operon 4888 encodes enzymes involved in molybdopterin biosynthesis, and the protein interaction map accurately reflects the organization of the enzyme complex.

Fig. 6.

Examples of PPI enrichments in and between genomic operons. Interacting proteins are symbolized by connected gene symbols (heteromers) and colored gene symbols (homomers). In the right panel, protein structures of Trp and Moa orthologs highlight detected (solid arrows) and undetected (dashed arrows) intermolecular interactions that are known from the enzymes′ tertiary structures.

An example of enriched crosstalk between different gene clusters is found between two operons (4960 and 5013), encoding ribosomal proteins or products that are related to protein translation and tRNA modification, respectively. The hypothetical protein HP1414, the first gene in the miaA operon, binds the ribosomal proteins L19 and S16 as well as the hypothetical protein HP1150 (which belongs to COG1837, a family of putative RNA-binding proteins). In fact, HP1414 is the H. pylori homologue of the ribosomal silencing factor RsfS (=RsfA) that we previously showed to bind to ribosomal protein L14, preventing association of the small and large ribosomal subunits (54). L19 is located in the direct neighborhood of L14 in the ribosome, forming bridges (B8 and B6) to the small ribosomal subunit (55), potentially representing a novel or additional hotspot for RsfS action. Both operons are functionally associated since both encode products that are involved in protein translation.

One more example of interconnected operons is found between operons 4815 and 5035, which encode several uncharacterized proteins (HP0469-HP0465) and flagellar rod proteins (FliE, FlgC, FlgB), respectively, suggesting that the 4815 operon may be involved in motility. Involvement of HP0466 in flagellar biosynthesis has already been suggested by others based on its interaction with FlgB and homology comparisons of the operon member HP0465 with motility accessory factors of C. jejuni (56). Moreover, transposon insertion into the HP0466 locus causes a colonization defect (24), whereas no functional information is available for HP0468.
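
Mapping interactions onto operons, as done in this section, amounts to counting interactions within and between transcription units. The following is a minimal sketch under stated assumptions: the function name operon_links, the gene names, and the operon labels are hypothetical placeholders and do not reproduce the H. pylori operon map of (53).

```python
from collections import Counter


def operon_links(interactions, gene_to_operon):
    """Count interactions per (operon, operon) pair.

    interactions is an iterable of (gene_a, gene_b) pairs; genes without an
    operon assignment are skipped. A pair with two equal operon labels is an
    intra-operon interaction, otherwise it links two operons.
    """
    links = Counter()
    for gene_a, gene_b in interactions:
        if gene_a not in gene_to_operon or gene_b not in gene_to_operon:
            continue
        pair = tuple(sorted((gene_to_operon[gene_a], gene_to_operon[gene_b])))
        links[pair] += 1
    return links


# Hypothetical toy data: two operons and four detected interactions.
gene_to_operon = {"g1": "op_A", "g2": "op_A", "g3": "op_B"}
interactions = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g3", "g9")]

links = operon_links(interactions, gene_to_operon)
intra = {pair: n for pair, n in links.items() if pair[0] == pair[1]}
inter = {pair: n for pair, n in links.items() if pair[0] != pair[1]}
print("intra-operon:", intra)
print("inter-operon:", inter)
```

Counts obtained this way could then be compared against shuffled operon assignments to judge enrichment; that step is not shown here.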

Protein Complexes in H. pylori

Protein functions are often mediated by protein complexes that are defined as stable assemblies of multiple proteins. Since complexes have not been studied systematically in H. pylori, we utilize extensive experimental data on protein complexes from E. coli (57) and M. pneumoniae (14) to predict homologous complexes in H. pylori. Utilizing orthologous protein information from the COG database (31, 32), we find that H. pylori shares 786 orthologous proteins with E. coli but only 260 with M. pneumoniae (Fig. 7A). As an example of different levels of complex conservation, we observe that the murein tripeptide transporter, a hetero-pentamer, is well conserved in H. pylori, while only three out of five subunits of the periplasmic nitrate reductase are present. In contrast to E. coli, the cascade complex is completely missing from H. pylori (Fig. 7B). In Fig. 7C, we count the number of complexes with different degrees of conservation in H. pylori using reference sets of E. coli complexes from the EcoCyc database (58) and the dataset of (57). Furthermore, we use protein complex information from M. pneumoniae (14). The degree of conservation (Fig. 7A) prompts us to focus on E. coli, indicating that E. coli may be a good model for some processes in epsilon-proteobacteria but not for others. Using a reference set of 285 well-studied E. coli complexes from EcoCyc (58), we predict 80 H. pylori complexes to be identical (in composition) to their counterparts in E. coli. Another 85 complexes are partially conserved, while 120 are completely absent. All predicted complexes in H. pylori are available in Supplemental Table S3.
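
The three conservation classes used above (identical, partially conserved, absent) can be expressed as a simple rule over ortholog presence. This is a minimal sketch, not the authors' pipeline: the function names, the ortholog identifiers, and the subunit count of the cascade complex are hypothetical, and orthology is reduced to membership in a set.

```python
from collections import Counter


def conservation_status(subunits, target_orthologs):
    """Classify one reference complex by how many subunits have an ortholog.

    subunits lists the ortholog groups of the reference complex;
    target_orthologs is the set of ortholog groups found in the target genome.
    """
    if not subunits:
        raise ValueError("a complex needs at least one subunit")
    present = sum(1 for s in subunits if s in target_orthologs)
    if present == len(subunits):
        return "identical"
    if present > 0:
        return "partial"
    return "absent"


def summarize(complexes, target_orthologs):
    """Count complexes per conservation class."""
    return Counter(
        conservation_status(subunits, target_orthologs)
        for subunits in complexes.values()
    )


# Hypothetical ortholog groups; only the 5/5, 3/5, and 0/n pattern follows the text.
target = {"og1", "og2", "og3", "og4", "og5", "og6", "og7", "og8"}
complexes = {
    "murein tripeptide transporter": ["og1", "og2", "og3", "og4", "og5"],
    "periplasmic nitrate reductase": ["og6", "og7", "og8", "og90", "og91"],
    "cascade complex": ["og92", "og93", "og94", "og95", "og96"],
}
counts = summarize(complexes, target)
print(counts)
```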

Fig. 7.

Predicted protein complexes in H. pylori. (A) Proteomes of E. coli, M. pneumoniae, and H. pylori overlap substantially using orthologous proteins. Proteins not belonging to COGs were excluded. (B) We show selected protein complexes indicating different degrees of complex conservation. Dashed circles indicate proteins in E. coli complexes that are absent in H. pylori. Stoichiometry of protein complexes is indicated if they diverge from one subunit. (C) We count the number of complexes with different degrees of conservation in H. pylori.

Functional Integration of Gene Expression

To identify functionally relevant network clusters, we systematically analyze the H. pylori high-quality core protein interaction network for subnetworks that overrepresent certain functional groups, using functional terms from the Comprehensive Microbial Resource (27), GO (28), KEGG (29), gene essentiality, genetic context, cellular localization, and gene expression data. Some of these clusters are illustrated in Fig. 8, while detailed results can be found in Supplemental Table S4. For instance, cluster 16 consists of nine proteins that are highly interconnected by interactions. Involved genes are co-expressed under different conditions: six genes of the cluster are up-regulated when the growth conditions are shifted to low pH values, while three members are up-regulated under limited iron accessibility in the stationary phase. Finally, three proteins have an increased expression level when H. pylori is grown in contact with liver cells versus medium alone. While cluster members belong to very different pathways (e.g. Cag17 and Cag20 belong to the type IV secretion system, IspA is a geranyltranstransferase, CeuE is a periplasmic iron-binding protein, and Ggt is a gamma-glutamyltranspeptidase), they are connected by interactions and gene expression. While such discordant expression patterns are found in other clusters as well (e.g. clusters 17, 26, and 36), our results suggest the presence of conditions under which these genes are co-expressed, allowing proteins to interact. Clusters 26 and 36 are enriched for proteins related to chemotaxis and motility. Notably, our screens detect all three CheV paralogs in the H. pylori genome (HP0019, HP0393, and HP0616) to bind to the hemolysin secretion protein precursor HylB (cluster 36). Moreover, we find that HP0019 and HP0393, but not HP0616, interact with the methyl-accepting chemotaxis transducer (TlpC).

Cluster 6 shows that this combination strategy unearths additional interesting aspects that cannot be detected when one parameter is analyzed in isolation. While seven members that belong to three different operons are connected, GO assignments of HP0164 (signal-transducing protein, histidine kinase) and OmpR (response regulator HP0166) suggest an involvement of the cluster members in two-component signaling.

Fig. 8.

Selected interaction clusters with various enriched functional terms, genomic context, co-expression, co-localization, and phenotypes. We depict interaction clusters that we derive from the combined H. pylori core protein interaction map with significant enrichments (p < 0.05). All results can be found in Supplemental Table S5. Node labels represent protein names (if available) or the locus number of the corresponding protein. Co-expressed genes based on operons are given in the legends by the transcription unit number as used in (53) and gene expression data (Supplemental Table S5). For each cluster, we provide a separate legend highlighting the enriched properties. In particular, clusters 16 and 17 are enriched for differentially co-expressed proteins when growth is shifted to low pH conditions. Clusters 13 and 65 are enriched for essential genes, while cluster 23 is enriched with ribosomal proteins. Clusters 26 and 36 include flagellar/chemotaxis proteins, and cluster 6 is enriched for intra- and interconnected operons.

DISCUSSION

Given that H. pylori is a major human pathogen causing millions of ulcers and other health problems each year, surprisingly little is known about its molecular biology. To fill this gap, we investigate its proteome and interactome in a more systematic way.

Investigating the abundance of proteins in H. pylori, we find that highly abundant proteins revolve around translational, posttranslational modification, protein turnover, metabolite biosynthesis, transport, catabolism, and energy production functions. Furthermore, we find that abundant proteins are typically encoded by essential genes.

Combining protein interactions with protein abundances, we observe that proteins of similar abundance preferably interact with each other. While we find some interactions between proteins of different abundances, our results clearly confirm assumptions that interacting pairs of proteins are usually present in roughly stoichiometric ratios.

Notably, H. pylori still encodes a large number (∼500) of uncharacterized proteins. Among proteins of known function, we find that interactions usually connect proteins of similar activity. Based on such characteristics, we utilize a bacterial meta-interactome of closely related bacteria to predict the functions of unknown proteins. In particular, we account for interactions that are conserved in other closely related bacteria. Such an augmentation of our initial network of protein interactions allows us to increase the accuracy of our classification method significantly and to predict the function of more than 300 proteins with previously poorly annotated or unknown function. Resembling the spectrum of functions of abundant proteins, the predicted functions of most of these proteins revolve around translation and posttranslational modification.

Utilizing our network of protein interactions in H. pylori, we determine sets of proteins that topologically control the underlying network. In particular, we find sets of critical, intermediate, and redundant proteins that always, sometimes, or never appear in different control configurations of the underlying network. Each control configuration features a minimum dominating set (MDSet) such that every node is either an element of the MDSet or adjacent to a protein of the MDSet. Notably, critical proteins appear to be enriched with regulated genes that are significantly more abundant in the spiral form of H. pylori. The spiral form of H. pylori is mostly dividing, while the coccoid is a nonculturable but viable form. The observation that genes that are overexpressed in the spiral compared with the coccoid form are enriched with critical proteins suggests that the underlying network topology plays a role in the switch between the two bacterial forms. Notably, critical proteins are also enriched with proteins of high betweenness, representing topologically central proteins with a propensity to connect different, disparate parts of the network. Therefore, we surmise that critical proteins may assume the role of levers that allow the bacteria to activate certain functions to change between forms as well as integrate different parts of the network to carry out the transformation from coccoid to spiral form. As a corollary, we hypothesize that such proteins carry functions that contribute to the spiral form. Indeed, we find that critical proteins are mostly enriched with transcriptional and posttranslational modification functions.

As for protein complexes, we integrate protein complex information of E. coli and M. pneumoniae and infer potential complexes in H. pylori by determining evolutionarily conserved complex components. As expected, we find a higher rate of conserved complexes when we consider E. coli protein complexes. Such a result may be rooted in the fact that E. coli has almost six times as many proteins as M. pneumoniae. Furthermore, its protein complexes are better investigated than their counterparts in Mycoplasma, suggesting that E. coli is a better model for protein complexes, as H. pylori shares significantly more orthologs with E. coli than with M. pneumoniae. Moreover, we integrate the interactome with functional and expression profiles of genes in H. pylori, allowing us to find significant protein clusters. Our analysis reveals an abundance of different network clusters that combine certain functions and reflect the operon placement of cluster members as well. Such an observation suggests that expression, function, and operon regulation are driving forces of the observed network clusters.

National Institutes of Health (https://dx.doi.org/10.13039/100000002): NIH R01GM109895

Footnotes

  • Author contributions: S.W., S.M., H.C., R.H., P.A., S.K., and P.U. performed the research; S.W. analyzed the data; S.W. and P.U. wrote the paper; and S.K. and P.U. designed research.

  • * This work was supported by National Institutes of Health grant NIH R01GM109895.

  • This article contains supplemental material.

  • 1 The abbreviations used are:

    H. pylori, Helicobacter pylori
    LC-MS/MS, liquid chromatography-tandem mass spectrometry
    PPI, protein-protein interaction
    MDSet, minimum dominating set
    PIM, protein interaction map
    GO, Gene Ontology
    KEGG, Kyoto Encyclopedia of Genes and Genomes
    COG, Clusters of Orthologous Genes
    MCL, Markov clustering
    PDB, Protein Data Bank
    MALT, mucosa-associated lymphoid tissue.

  • Received November 21, 2017.
  • Revision received January 25, 2018.
  • © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.

REFERENCES

  1. 1.↵
    1. Warren, J. R., and
    2. Marshall, B.
    (1983) Unidentified curved bacilli on gastric epithelium in active chronic gastritis. Lancet 1, 1273–1275
    OpenUrlCrossRefPubMed
  2. 2.↵
    1. Marshall, B. J., and
    2. Warren, J. R.
    (1984) Unidentified curved bacilli in the stomach of patients with gastritis and peptic ulceration. Lancet 1, 1311–1315
    OpenUrlCrossRefPubMed
  3. 3.↵
    1. Kusters, J. G.,
    2. van Vliet, A. H., and
    3. Kuipers, E. J.
    (2006) Pathogenesis of Helicobacter pylori infection. Clin. Microbiol. Rev. 19, 449–490
    OpenUrlAbstract/FREE Full Text
  4. 4.↵
    1. Bauer, B., and
    2. Meyer, T. F.
    (2011) The Human gastric pathogen Helicobacter pylori and its association with gastric cancer and ulcer disease. Ulcers 2011, 340157
    OpenUrlCrossRef
  5. 5.↵
    1. Tomb, J. F.,
    2. White, O.,
    3. Kerlavage, A. R.,
    4. Clayton, R. A.,
    5. Sutton, G. G.,
    6. Fleischmann, R. D.,
    7. Ketchum, K. A.,
    8. Klenk, H. P.,
    9. Gill, S.,
    10. Dougherty, B. A.,
    11. Nelson, K.,
    12. Quackenbush, J.,
    13. Zhou, L.,
    14. Kirkness, E. F.,
    15. Peterson, S.,
    16. Loftus, B.,
    17. Richardson, D.,
    18. Dodson, R.,
    19. Khalak, H. G.,
    20. Glodek, A.,
    21. McKenney, K.,
    22. Fitzegerald, L. M.,
    23. Lee, N.,
    24. Adams, M. D.,
    25. Hickey, E. K.,
    26. Berg, D. E.,
    27. Gocayne, J. D.,
    28. Utterback, T. R.,
    29. Peterson, J. D.,
    30. Kelley, J. M.,
    31. Cotton, M. D.,
    32. Weidman, J. M.,
    33. Fujii, C.,
    34. Bowman, C.,
    35. Watthey, L.,
    36. Wallin, E.,
    37. Hayes, W. S.,
    38. Borodovsky, M.,
    39. Karp, P. D.,
    40. Smith, H. O.,
    41. Fraser, C. M., and
    42. Venter, J. C.
    (1997) The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388, 539–547
    OpenUrlCrossRefPubMed
  6. 6.↵
    The UniProt, C. (2017) UniProt: The universal protein knowledgebase. Nucleic Acids Res. 45, D158–D169
    OpenUrlCrossRefPubMed
  7. 7.↵
    1. Rajagopala, S. V.,
    2. Sikorski, P.,
    3. Kumar, A.,
    4. Mosca, R.,
    5. Vlasblom, J.,
    6. Arnold, R.,
    7. Franca-Koh, J.,
    8. Pakala, S. B.,
    9. Phanse, S.,
    10. Ceol, A.,
    11. Häuser, R.,
    12. Siszler, G.,
    13. Wuchty, S.,
    14. Emili, A.,
    15. Babu, M.,
    16. Aloy, P.,
    17. Pieper, R., and
    18. Uetz, P.
    (2014) The binary protein–protein interaction landscape of Escherichia coli. Nat. Biotechnol. 32, 285–290
    OpenUrlCrossRefPubMed
  8. 8.↵
    1. Parrish, J. R.,
    2. Yu, J.,
    3. Liu, G.,
    4. Hines, J. A.,
    5. Chan, J. E.,
    6. Mangiola, B. A.,
    7. Zhang, H.,
    8. Pacifico, S.,
    9. Fotouhi, F.,
    10. DiRita, V. J.,
    11. Ideker, T.,
    12. Andrews, P., and
    13. Finley, R. L., Jr.
    (2007) A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biol. 8, R130
    OpenUrlCrossRefPubMed
  9. 9.↵
    1. Wang, Y.,
    2. Cui, T.,
    3. Zhang, C.,
    4. Yang, M.,
    5. Huang, Y.,
    6. Li, W.,
    7. Zhang, L.,
    8. Gao, C.,
    9. He, Y.,
    10. Li, Y.,
    11. Huang, F.,
    12. Zeng, J.,
    13. Huang, C.,
    14. Yang, Q.,
    15. Tian, Y.,
    16. Zhao, C.,
    17. Chen, H.,
    18. Zhang, H., and
    19. He, Z. G.
    (2010) Global protein–protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. J. Proteome Res. 9, 6665–6677
    OpenUrlCrossRefPubMed
  10. 10.↵
    1. Rain, J. C.,
    2. Selig, L.,
    3. De Reuse, H.,
    4. Battaglia, V.,
    5. Reverdy, C.,
    6. Simon, S.,
    7. Lenzen, G.,
    8. Petel, F.,
    9. Wojcik, J.,
    10. Schächter, V.,
    11. Chemama, Y.,
    12. Labigne, A., and
    13. Legrain, P.
    (2001) The protein–protein interaction map of Helicobacter pylori. Nature 409, 211–215
    OpenUrlCrossRefPubMed
  11. 11.↵
    1. Häuser, R.,
    2. Ceol, A.,
    3. Rajagopala, S. V.,
    4. Mosca, R.,
    5. Siszler, G.,
    6. Wermke, N.,
    7. Sikorski, P.,
    8. Schwarz, F.,
    9. Schick, M.,
    10. Wuchty, S.,
    11. Aloy, P., and
    12. Uetz, P.
    (2014) A second-generation protein–protein interaction network of Helicobacter pylori. Mol. Cell. Proteomics 13, 1318–1329
    OpenUrlAbstract/FREE Full Text
  12. 12.↵
    1. Butland, G.,
    2. Peregrín-Alvarez, J. M.,
    3. Li, J.,
    4. Yang, W.,
    5. Yang, X.,
    6. Canadien, V.,
    7. Starostine, A.,
    8. Richards, D.,
    9. Beattie, B.,
    10. Krogan, N.,
    11. Davey, M.,
    12. Parkinson, J.,
    13. Greenblatt, J., and
    14. Emili, A.
    (2005) Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature 433, 531–537
    OpenUrlCrossRefPubMed
  13. 13.↵
    1. Arifuzzaman, M.,
    2. Maeda, M.,
    3. Itoh, A.,
    4. Nishikata, K.,
    5. Takita, C.,
    6. Saito, R.,
    7. Ara, T.,
    8. Nakahigashi, K.,
    9. Huang, H. C.,
    10. Hirai, A.,
    11. Tsuzuki, K.,
    12. Nakamura, S.,
    13. Altaf-Ul-Amin, M.,
    14. Oshima, T.,
    15. Baba, T.,
    16. Yamamoto, N.,
    17. Kawamura, T.,
    18. Ioka-Nakamichi, T.,
    19. Kitagawa, M.,
    20. Tomita, M.,
    21. Kanaya, S.,
    22. Wada, C., and
    23. Mori, H.
    (2006) Large-scale identification of protein–protein interaction of Escherichia coli K-12. Genome Res. 16, 686–691
    OpenUrlAbstract/FREE Full Text
  14. 14.↵
    1. Kühner, S.,
    2. van Noort, V.,
    3. Betts, M. J.,
    4. Leo-Macias, A.,
    5. Batisse, C.,
    6. Rode, M.,
    7. Yamada, T.,
    8. Maier, T.,
    9. Bader, S.,
    10. Beltran-Alvarez, P.,
    11. Castaño-Diez, D.,
    12. Chen, W. H.,
    13. Devos, D.,
    14. Güell, M.,
    15. Norambuena, T.,
    16. Racke, I.,
    17. Rybin, V.,
    18. Schmidt, A.,
    19. Yus, E.,
    20. Aebersold, R.,
    21. Herrmann, R.,
    22. Bottcher, B.,
    23. Frangakis, A. S.,
    24. Russell, R. B.,
    25. Serrano, L.,
    26. Bork, P., and
    27. Gavin, A. C.
    (2009) Proteome organization in a genome-reduced bacterium. Science 326, 1235–1240
    OpenUrlAbstract/FREE Full Text
  15. 15.↵
    1. Hu, P.,
    2. Janga, S. C.,
    3. Babu, M.,
    4. Diaz-Mejía, J. J.,
    5. Butland, G.,
    6. Yang, W.,
    7. Pogoutse, O.,
    8. Guo, X.,
    9. Phanse, S.,
    10. Wong, P.,
    11. Chandran, S.,
    12. Christopoulos, C.,
    13. Nazarians-Armavil, A.,
    14. Nasseri, N. K.,
    15. Musso, G.,
    16. Ali, M.,
    17. Nazemof, N.,
    18. Eroukova, V.,
    19. Golshani, A.,
    20. Paccanaro, A.,
    21. Greenblatt, J. F.,
    22. Moreno-Hagelsieb, G., and
    23. Emili, A.
    (2009) Global functional atlas of Escherichia coli encompassing previously uncharacterized proteins. PLoS Biol. 7, e96
    OpenUrlCrossRefPubMed
  16. 16.↵
    1. Bumann, D.,
    2. Meyer, T. F., and
    3. Jungblut, P. R.
    (2001) Proteome analysis of the common human pathogen Helicobacter pylori. Proteomics 1, 473–479
    OpenUrlCrossRefPubMed
  17. 17.↵
    1. Jungblut, P. R.,
    2. Bumann, D.,
    3. Haas, G.,
    4. Zimny-Arndt, U.,
    5. Holland, P.,
    6. Lamer, S.,
    7. Siejak, F.,
    8. Aebischer, A., and
    9. Meyer, T. F.
    (2000) Comparative proteome analysis of Helicobacter pylori. Mol. Microbiol. 36, 710–725
    OpenUrlCrossRefPubMed
  18. 18.↵
    1. Govorun, V. M.,
    2. Moshkovskii, S. A.,
    3. Tikhonova, O. V.,
    4. Goufman, E. I.,
    5. Serebryakova, M. V.,
    6. Momynaliev, K. T.,
    7. Lokhov, P. G.,
    8. Khryapova, E. V.,
    9. Kudryavtseva, L. V.,
    10. Smirnova, O. V.,
    11. Toropyguine, I. Y.,
    12. Maksimov, B. I., and
    13. Archakov, A. I.
    (2003) Comparative analysis of proteome maps of Helicobacter pylori clinical isolates. Biochemistry 68, 42–49
    OpenUrlCrossRefPubMed
  19. 19.↵
    1. Jungblut, P. R.,
    2. Schiele, F.,
    3. Zimny-Arndt, U.,
    4. Ackermann, R.,
    5. Schmid, M.,
    6. Lange, S.,
    7. Stein, R., and
    8. Pleissner, K. P.
    (2010) Helicobacter pylori proteomics by 2-DE/MS, 1-DE-LC/MS and functional data mining. Proteomics 10, 182–193
    OpenUrlCrossRefPubMed
  20. 20.↵
    1. Müller, S. A.,
    2. Findeiβ, S.,
    3. Pernitzsch, S. R.,
    4. Wissenbach, D. K.,
    5. Stadler, P. F.,
    6. Hofacker, I. L.,
    7. von Bergen, M., and
    8. Kalkhof, S.
    (2013) Identification of new protein coding sequences and signal peptidase cleavage sites of Helicobacter pylori strain 26695 by proteogenomics. J. Proteomics 86, 27–42
    OpenUrlCrossRefPubMed
  21. 21.↵
    1. Müller, S. A.,
    2. Pernitzsch, S. R.,
    3. Haange, S. B.,
    4. Uetz, P.,
    5. von Bergen, M.,
    6. Sharma, C. M., and
    7. Kalkhof, S.
    (2015) Stable isotope labeling by amino acids in cell culture based proteomics reveals differences in protein abundances between spiral and coccoid forms of the gastric pathogen Helicobacter pylori. J. Proteomics 126, 34–45
    OpenUrlCrossRefPubMed
  22. 22.↵
    1. Salama, N. R.,
    2. Shepherd, B., and
    3. Falkow, S.
    (2004) Global transposon mutagenesis and essential gene analysis of Helicobacter pylori. J. Bacteriol. 186, 7926–7935
    OpenUrlAbstract/FREE Full Text
  23. 23.↵
    1. Chalker, A. F.,
    2. Minehart, H. W.,
    3. Hughes, N. J.,
    4. Koretke, K. K.,
    5. Lonetto, M. A.,
    6. Brinkman, K. K.,
    7. Warren, P. V.,
    8. Lupas, A.,
    9. Stanhope, M. J.,
    10. Brown, J. R., and
    11. Hoffman, P. S.
    (2001) Systematic identification of selective essential genes in Helicobacter pylori by genome prioritization and allelic replacement mutagenesis. J. Bacteriol. 183, 1259–1268
    OpenUrlAbstract/FREE Full Text
  24. 24.↵
    1. Baldwin, D. N.,
    2. Shepherd, B.,
    3. Kraemer, P.,
    4. Hall, M. K.,
    5. Sycuro, L. K.,
    6. Pinto-Santini, D. M., and
    7. Salama, N. R.
    (2007) Identification of Helicobacter pylori genes that contribute to stomach colonization. Infect. Immun. 75, 1005–1016
    OpenUrlAbstract/FREE Full Text
  25. 25.↵
    1. Kavermann, H.,
    2. Burns, B. P.,
    3. Angermuller, K.,
    4. Odenbreit, S.,
    5. Fischer, W.,
    6. Melchers, K., and
    7. Haas, R.
    (2003) Identification and characterization of Helicobacter pylori genes essential for gastric colonization. J. Exper. Med. 197, 813–822
    OpenUrlAbstract/FREE Full Text
  26. 26.↵
    1. Enright, A. J.,
    2. Van Dongen, S., and
    3. Ouzounis, C. A.
    (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30, 1575–1584
    OpenUrlCrossRefPubMed
  27. 27.↵
    1. Peterson, J. D.,
    2. Umayam, L. A.,
    3. Dickinson, T.,
    4. Hickey, E. K., and
    5. White, O.
    (2001) The Comprehensive Microbial Resource. Nucleic Acids Res. 29, 123–125
    OpenUrlCrossRefPubMed
  28. 28.↵
    1. Ashburner, M.,
    2. Ball, C. A.,
    3. Blake, J. A.,
    4. Botstein, D.,
    5. Butler, H.,
    6. Cherry, J. M.,
    7. Davis, A. P.,
    8. Dolinski, K.,
    9. Dwight, S. S.,
    10. Eppig, J. T.,
    11. Harris, M. A.,
    12. Hill, D. P.,
    13. Issel-Tarver, L.,
    14. Kasarskis, A.,
    15. Lewis, S.,
    16. Matese, J. C.,
    17. Richardson, J. E.,
    18. Ringwald, M.,
    19. Rubin, G. M., and
    20. Sherlock, G.
    (2000) Gene ontology: Tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29
    OpenUrlCrossRefPubMed
  29. 29.↵
    1. Kanehisa, M., and
    2. Goto, S.
    (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27–30
    OpenUrlCrossRefPubMed
  30. 30.↵
    1. Alexa, A.,
    2. Rahnenführer, J., and
    3. Lengauer, T.
    (2006) Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics 22, 1600–1607
    OpenUrlCrossRefPubMed
  31. 31.↵
    1. Tatusov, R. L.,
    2. Fedorova, N. D.,
    3. Jackson, J. D.,
    4. Jacobs, A. R.,
    5. Kiryutin, B.,
    6. Koonin, E. V.,
    7. Krylov, D. M.,
    8. Mazumder, R.,
    9. Mekhedov, S. L.,
    10. Nikolskaya, A. N.,
    11. Rao, B. S.,
    12. Smirnov, S.,
    13. Sverdlov, A. V.,
    14. Vasudevan, S.,
    15. Wolf, Y. I.,
    16. Yin, J. J., and
    17. Natale, D. A.
    (2003) The COG database: An updated version includes eukaryotes. BMC Bioinformatics 4, 41
    OpenUrlCrossRefPubMed
  32. 32.↵
    1. Franceschini, A.,
    2. Szklarczyk, D.,
    3. Frankild, S.,
    4. Kuhn, M.,
    5. Simonovic, M.,
    6. Roth, A.,
    7. Lin, J.,
    8. Minguez, P.,
    9. Bork, P.,
    10. von Mering, C., and
    11. Jensen, L. J.
    (2013) STRING v9.1: Protein–protein interaction networks, with increased coverage and integration. Nucleic Acids Res. 41, D808–D815
    OpenUrlCrossRefPubMed
  33. 33.↵
    1. Titz, B.,
    2. Rajagopala, S. V.,
    3. Goll, J.,
    4. Häuser, R.,
    5. McKevitt, M. T.,
    6. Palzkill, T., and
    7. Uetz, P.
    (2008) The binary protein interactome of Treponema pallidum—The syphilis spirochete. PloS One 3, e2292
    OpenUrlCrossRefPubMed
  34. 34.↵
    1. Shimoda, Y.,
    2. Shinpo, S.,
    3. Kohara, M.,
    4. Nakamura, Y.,
    5. Tabata, S., and
    6. Sato, S.
    (2008) A large scale analysis of protein–protein interactions in the nitrogen-fixing bacterium Mesorhizobium loti. DNA Res. 15, 13–23
    OpenUrlCrossRefPubMed
  35. 35.↵
    1. Sato, S.,
    2. Shimoda, Y.,
    3. Muraki, A.,
    4. Kohara, M.,
    5. Nakamura, Y., and
    6. Tabata, S.
    (2007) A large-scale protein–protein interaction analysis in Synechocystis sp. PCC6803. DNA Res. 14, 207–216
    OpenUrlCrossRefPubMed
  36. 36.↵
    1. Wuchty, S.,
    2. Rajagopala, S. V.,
    3. Blazie, S. M.,
    4. Parrish, J. R.,
    5. Khuri, S.,
    6. Finley, R. L., Jr., and
    7. Uetz, P.
    (2017) The protein interactome of Streptococcus pneumoniae and bacterial meta-interactomes improve function predictions. mSystems 2, e00019–e00017
    OpenUrl
  37. 37.↵
    1. Marchadier, E.,
    2. Carballido-López, R.,
    3. Brinster, S.,
    4. Fabret, C.,
    5. Mervelet, P.,
    6. Bessieres, P.,
    7. Noirot-Gros, M. F.,
    8. Fromion, V., and
    9. Noirot, P.
    (2011) An expanded protein–protein interaction network in Bacillus subtilis reveals a group of hubs: Exploration by an integrative approach. Proteomics 11, 2981–2991
    OpenUrlCrossRefPubMed
  38. 38.↵
    1. Remm, M.,
    2. Storm, C. E., and
    3. Sonnhammer, E. L.
    (2001) Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J. Mol. Biol. 314, 1041–1052
    OpenUrlCrossRefPubMed
  39. 39.↵
    UniProt. (2015) UniProt: A hub for protein information. Nucleic Acids Res. 43, D204–D212
    OpenUrlCrossRefPubMed
  40. 40.↵
    1. Vazquez, A.,
    2. Flammini, A.,
    3. Maritan, A., and
    4. Vespignani, A.
    (2003) Global protein function prediction from protein–protein interaction networks. Nat. Biotechnol. 21, 697–700
  41.
    1. Simpson, E. H.
    (1949) Measurement of diversity. Nature 163, 688
  42.
    1. Rose, P. W.,
    2. Prlić, A.,
    3. Altunkaya, A.,
    4. Bi, C.,
    5. Bradley, A. R.,
    6. Christie, C. H.,
    7. Costanzo, L. D.,
    8. Duarte, J. M.,
    9. Dutta, S.,
    10. Feng, Z.,
    11. Green, R. K.,
    12. Goodsell, D. S.,
    13. Hudson, B.,
    14. Kalro, T.,
    15. Lowe, R.,
    16. Peisach, E.,
    17. Randle, C.,
    18. Rose, A. S.,
    19. Shao, C.,
    20. Tao, Y. P.,
    21. Valasatava, Y.,
    22. Voigt, M.,
    23. Westbrook, J. D.,
    24. Woo, J.,
    25. Yang, H.,
    26. Young, J. Y.,
    27. Zardecki, C.,
    28. Berman, H. M., and
    29. Burley, S. K.
    (2017) The RCSB protein data bank: Integrative view of protein, gene and 3D structural information. Nucleic Acids Res. 45, D271–D281
  43.
    1. Ishitsuka, M.,
    2. Akutsu, T., and
    3. Nacher, J. C.
    (2016) Critical controllability in proteome-wide protein interaction network integrating transcriptome. Sci. Rep. 6, 23541
  44.
    1. Nacher, J. C., and
    2. Akutsu, T.
    (2014) Analysis of critical and redundant nodes in controlling directed and undirected complex networks using dominating sets. J. Compl. Networks 2, 394–412
  45.
    1. Land, A. H., and
    2. Doig, A. G.
    (1960) An automatic method of solving discrete programming problems. Econometrica 28, 497–520
  46.
    1. Huerta-Cepas, J.,
    2. Szklarczyk, D.,
    3. Forslund, K.,
    4. Cook, H.,
    5. Heller, D.,
    6. Walter, M. C.,
    7. Rattei, T.,
    8. Mende, D. R.,
    9. Sunagawa, S.,
    10. Kuhn, M.,
    11. Jensen, L. J.,
    12. von Mering, C., and
    13. Bork, P.
    (2016) EggNOG 4.5: A hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res. 44, D286–D293
  47.
    1. Fawcett, T.
    (2006) An introduction to ROC analysis. Pattern Recogn. Lett. 27, 861–874
  48.
    1. Caufield, J. H.,
    2. Wimble, C.,
    3. Shary, S.,
    4. Wuchty, S., and
    5. Uetz, P.
    (2017) Bacterial protein meta-interactomes predict cross-species interactions and protein function. BMC Bioinformatics 18, 171
  49.
    1. Benjamini, Y., and
    2. Hochberg, Y.
    (1995) Controlling the false discovery rate—A practical and powerful approach to multiple testing. J. Roy Stat. Soc. B Met. 57, 289–300
  50.
    1. Wuchty, S.
    (2014) Controllability in protein interaction networks. Proc. Natl. Acad. Sci. U.S.A. 111, 7156–7160
  51.
    1. Sabri, M.,
    2. Häuser, R.,
    3. Ouellette, M.,
    4. Liu, J.,
    5. Dehbi, M.,
    6. Moeck, G.,
    7. García, E.,
    8. Titz, B.,
    9. Uetz, P., and
    10. Moineau, S.
    (2011) Genome annotation and intraviral interactome for the Streptococcus pneumoniae virulent phage Dp-1. J. Bacteriol. 193, 551–562
  52.
    1. Häuser, R.,
    2. Sabri, M.,
    3. Moineau, S., and
    4. Uetz, P.
    (2011) The proteome and interactome of Streptococcus pneumoniae phage Cp-1. J. Bacteriol. 193, 3135–3138
  53.
    1. Sharma, C. M.,
    2. Hoffmann, S.,
    3. Darfeuille, F.,
    4. Reignier, J.,
    5. Findeiss, S.,
    6. Sittka, A.,
    7. Chabas, S.,
    8. Reiche, K.,
    9. Hackermüller, J.,
    10. Reinhardt, R.,
    11. Stadler, P. F., and
    12. Vogel, J.
    (2010) The primary transcriptome of the major human pathogen Helicobacter pylori. Nature 464, 250–255
  54.
    1. Häuser, R.,
    2. Pech, M.,
    3. Kijek, J.,
    4. Yamamoto, H.,
    5. Titz, B.,
    6. Naeve, F.,
    7. Tovchigrechko, A.,
    8. Yamamoto, K.,
    9. Szaflarski, W.,
    10. Takeuchi, N.,
    11. Stellberger, T.,
    12. Diefenbacher, M. E.,
    13. Nierhaus, K. H., and
    14. Uetz, P.
    (2012) RsfA (YbeB) proteins are conserved ribosomal silencing factors. PLoS Genet. 8, e1002815
  55.
    1. Gao, H.,
    2. Sengupta, J.,
    3. Valle, M.,
    4. Korostelev, A.,
    5. Eswar, N.,
    6. Stagg, S. M.,
    7. Van Roey, P.,
    8. Agrawal, R. K.,
    9. Harvey, S. C.,
    10. Sali, A.,
    11. Chapman, M. S., and
    12. Frank, J.
    (2003) Study of the structural dynamics of the E. coli 70S ribosome using real-space refinement. Cell 113, 789–801
  56.
    1. Karlyshev, A. V.,
    2. Linton, D.,
    3. Gregson, N. A., and
    4. Wren, B. W.
    (2002) A novel paralogous gene family involved in phase-variable flagella-mediated motility in Campylobacter jejuni. Microbiology 148, 473–480
  57.
    1. Caufield, J. H.,
    2. Abreu, M.,
    3. Wimble, C., and
    4. Uetz, P.
    (2015) Protein complexes in bacteria. PLoS Comput. Biol. 11, e1004107
  58.
    1. Keseler, I. M.,
    2. Mackie, A.,
    3. Santos-Zavaleta, A.,
    4. Billington, R.,
    5. Bonavides-Martinez, C.,
    6. Caspi, R.,
    7. Fulcher, C.,
    8. Gama-Castro, S.,
    9. Kothari, A.,
    10. Krummenacker, M.,
    11. Latendresse, M.,
    12. Muñiz-Rascado, L.,
    13. Ong, Q.,
    14. Paley, S.,
    15. Peralta-Gil, M.,
    16. Subhraveti, P.,
    17. Velázquez-Ramírez, D. A.,
    18. Weaver, D.,
    19. Collado-Vides, J.,
    20. Paulsen, I., and
    21. Karp, P. D.
    (2017) The EcoCyc database: Reflecting new knowledge about Escherichia coli K-12. Nucleic Acids Res. 45, D543–D550
Proteome Data Improves Protein Function Prediction in the Interactome of Helicobacter pylori
Stefan Wuchty, Stefan A. Müller, J. Harry Caufield, Roman Häuser, Patrick Aloy, Stefan Kalkhof, Peter Uetz
Molecular & Cellular Proteomics May 1, 2018, First published on February 1, 2018, 17 (5) 961-973; DOI: 10.1074/mcp.RA117.000474

© 2019 American Society for Biochemistry and Molecular Biology

MCP Print ISSN 1535-9476 Online ISSN 1535-9484
