Proteome Profiling Outperforms Transcriptome Profiling for Coexpression Based Gene Function Prediction*

Open Access. Published: November 11, 2016. DOI: https://doi.org/10.1074/mcp.M116.060301

      Abstract

      Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies.
      Cellular functions require coordinated expression of genes involved in the same biological pathways or protein complexes. High-throughput mRNA profiling has been the dominant approach to studying gene expression and its relationship to cellular functions. Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products (
      • Quackenbush J.
      Genomics. Microarrays–guilt by association.
      ), and this “guilt-by-association” (GBA)
      The abbreviations used are: GBA, guilt-by-association; TCGA, The Cancer Genome Atlas; CPTAC, the Clinical Proteomic Tumor Analysis Consortium; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; iTRAQ, isobaric peptide labeling approach; LR, likelihood ratio; ARACNE, Algorithm for the Reconstruction of Accurate Cellular Networks.
      heuristic is the basis for analyzing mRNA profiling data using gene clustering (
      • Eisen M.B.
      • Spellman P.T.
      • Brown P.O.
      • Botstein D.
      Cluster analysis and display of genome-wide expression patterns.
      ), coexpression network analysis (
      • Butte A.J.
      • Tamayo P.
      • Slonim D.
      • Golub T.R.
      • Kohane I.S.
      Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks.
      ,
      • Voineagu I.
      • Wang X.
      • Johnston P.
      • Lowe J.K.
      • Tian Y.
      • Horvath S.
      • Mill J.
      • Cantor R.M.
      • Blencowe B.J.
      • Geschwind D.H.
      Transcriptomic analysis of autistic brain reveals convergent molecular pathology.
      ,
      • Margolin A.A.
      • Wang K.
      • Lim W.K.
      • Kustagi M.
      • Nemenman I.
      • Califano A.
      Reverse engineering cellular networks.
      ), and pathway and gene set enrichment analysis (
      • Subramanian A.
      • Tamayo P.
      • Mootha V.K.
      • Mukherjee S.
      • Ebert B.L.
      • Gillette M.A.
      • Paulovich A.
      • Pomeroy S.L.
      • Golub T.R.
      • Lander E.S.
      • Mesirov J.P.
      Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.
      ,
      • Zhang B.
      • Kirov S.
      • Snoddy J.
      WebGestalt: an integrated system for exploring gene sets in various biological contexts.
      ,
      • Wang J.
      • Duncan D.
      • Shi Z.
      • Zhang B.
      WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): update 2013.
      ). However, genes with similar mRNA expression profiles are not necessarily functionally coupled due to reasons such as transcriptional leakage and nonspecific occurrence of cis-regulatory elements in the genome (
      • Rodriguez-Trelles F.
      • Tarrio R.
      • Ayala F.J.
      Is ectopic expression caused by deregulatory mutations or due to gene-regulation leaks with evolutionary potential?.
      ,
      • Stuart J.M.
      • Segal E.
      • Koller D.
      • Kim S.K.
      A gene-coexpression network for global discovery of conserved genetic modules.
      ,
      • Yanai I.
      • Korbel J.O.
      • Boue S.
      • McWeeney S.K.
      • Bork P.
      • Lercher M.J.
      Similar gene expression profiles do not imply similar tissue functions.
      ). Distinguishing accidental transcriptional covariation from covariation that is functionally important is a well-known challenge, and strategies such as meta-analysis (
      • Lee H.K.
      • Hsu A.K.
      • Sajdak J.
      • Qin J.
      • Pavlidis P.
      Coexpression analysis of human genes across many microarray data sets.
      ) and evolutionary constraint (
      • Stuart J.M.
      • Segal E.
      • Koller D.
      • Kim S.K.
      A gene-coexpression network for global discovery of conserved genetic modules.
      ,
      • Ramani A.K.
      • Li Z.
      • Hart G.T.
      • Carlson M.W.
      • Boutz D.R.
      • Marcotte E.M.
      A map of human protein interactions derived from co-expression of human mRNAs and their orthologs.
      ) have been developed to address this challenge.
      Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level, and the concordance between mRNA and protein profiling data has been extensively studied during the past decade (
      • Liu Y.
      • Beyer A.
      • Aebersold R.
      On the Dependency of Cellular Protein Levels on mRNA Abundance.
      ,
      • Vogel C.
      • Marcotte E.M.
      Insights into the regulation of protein abundance from proteomic and transcriptomic analyses.
      ). Although a few publications suggest that gene expression is mostly controlled at the mRNA level (
      • Jovanovic M.
      • Rooney M.S.
      • Mertins P.
      • Przybylski D.
      • Chevrier N.
      • Satija R.
      • Rodriguez E.H.
      • Fields A.P.
      • Schwartz S.
      • Raychowdhury R.
      • Mumbach M.R.
      • Eisenhaure T.
      • Rabani M.
      • Gennert D.
      • Lu D.
      • Delorey T.
      • Weissman J.S.
      • Carr S.A.
      • Hacohen N.
      • Regev A.
      Immunogenetics. Dynamic profiling of the protein life cycle in response to pathogens.
      ,
      • Li J.J.
      • Bickel P.J.
      • Biggin M.D.
      System wide analyses have underestimated protein abundances and the importance of transcription in mammals.
      ,
      • Li J.J.
      • Biggin M.D.
      Gene expression. Statistics requantitates the central dogma.
      ), many studies have reported a considerable discrepancy between mRNA and protein profiles in human and other model organisms (
      • Vogel C.
      • Marcotte E.M.
      Insights into the regulation of protein abundance from proteomic and transcriptomic analyses.
      ,
      • Foss E.J.
      • Radulovic D.
      • Shaffer S.A.
      • Goodlett D.R.
      • Kruglyak L.
      • Bedalov A.
      Genetic variation shapes protein networks mainly through non-transcriptional mechanisms.
      ,
      • Ghazalpour A.
      • Bennett B.
      • Petyuk V.A.
      • Orozco L.
      • Hagopian R.
      • Mungrue I.N.
      • Farber C.R.
      • Sinsheimer J.
      • Kang H.M.
      • Furlotte N.
      • Park C.C.
      • Wen P.Z.
      • Brewer H.
      • Weitz K.
      • Camp 2nd, D.G.
      • Pan C.
      • Yordanova R.
      • Neuhaus I.
      • Tilford C.
      • Siemers N.
      • Gargalovic P.
      • Eskin E.
      • Kirchgessner T.
      • Smith D.J.
      • Smith R.D.
      • Lusis A.J.
      Comparative analysis of proteome and transcriptome variation in mouse.
      ,
      • Schwanhausser B.
      • Busse D.
      • Li N.
      • Dittmar G.
      • Schuchhardt J.
      • Wolf J.
      • Chen W.
      • Selbach M.
      Global quantification of mammalian gene expression control.
      ,
      • Zhang B.
      • Wang J.
      • Wang X.
      • Zhu J.
      • Liu Q.
      • Shi Z.
      • Chambers M.C.
      • Zimmerman L.J.
      • Shaddox K.F.
      • Kim S.
      • Davies S.R.
      • Wang S.
      • Wang P.
      • Kinsinger C.R.
      • Rivers R.C.
      • Rodriguez H.
      • Townsend R.R.
      • Ellis M.J.
      • Carr S.A.
      • Tabb D.L.
      • Coffey R.J.
      • Slebos R.J.
      • Liebler D.C.
      • NCI, CPTAC
      Proteogenomic characterization of human colon and rectal cancer.
      ). It is not completely clear how much of the reported mRNA-protein discrepancy is due to technological issues and how much is due to underlying biology. Importantly, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction is largely unknown.
      The deep proteome profiling data sets recently generated by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) on the breast (
      • Mertins P.
      • Mani D.R.
      • Ruggles K.V.
      • Gillette M.A.
      • Clauser K.R.
      • Wang P.
      • Wang X.
      • Qiao J.W.
      • Cao S.
      • Petralia F.
      • Kawaler E.
      • Mundt F.
      • Krug K.
      • Tu Z.
      • Lei J.T.
      • Gatza M.L.
      • Wilkerson M.
      • Perou C.M.
      • Yellapantula V.
      • Huang K.L.
      • Lin C.
      • McLellan M.D.
      • Yan P.
      • Davies S.R.
      • Townsend R.R.
      • Skates S.J.
      • Wang J.
      • Zhang B.
      • Kinsinger C.R.
      • Mesri M.
      • Rodriguez H.
      • Ding L.
      • Paulovich A.G.
      • Fenyo D.
      • Ellis M.J.
      • Carr S.A.
      • NCI, CPTAC
      Proteogenomics connects somatic mutations to signalling in breast cancer.
      ), colorectal (
      • Zhang B.
      • Wang J.
      • Wang X.
      • Zhu J.
      • Liu Q.
      • Shi Z.
      • Chambers M.C.
      • Zimmerman L.J.
      • Shaddox K.F.
      • Kim S.
      • Davies S.R.
      • Wang S.
      • Wang P.
      • Kinsinger C.R.
      • Rivers R.C.
      • Rodriguez H.
      • Townsend R.R.
      • Ellis M.J.
      • Carr S.A.
      • Tabb D.L.
      • Coffey R.J.
      • Slebos R.J.
      • Liebler D.C.
      • NCI, CPTAC
      Proteogenomic characterization of human colon and rectal cancer.
      ), and ovarian (
      • Zhang H.
      • Liu T.
      • Zhang Z.
      • Payne S.H.
      • Zhang B.
      • McDermott J.E.
      • Zhou J.Y.
      • Petyuk V.A.
      • Chen L.
      • Ray D.
      • Sun S.
      • Yang F.
      • Chen L.
      • Wang J.
      • Shah P.
      • Cha S.W.
      • Aiyetan P.
      • Woo S.
      • Tian Y.
      • Gritsenko M.A.
      • Clauss T.R.
      • Choi C.
      • Monroe M.E.
      • Thomas S.
      • Nie S.
      • Wu C.
      • Moore R.J.
      • Yu K.H.
      • Tabb D.L.
      • Fenyo D.
      • Bafna V.
      • Wang Y.
      • Rodriguez H.
      • Boja E.S.
      • Hiltke T.
      • Rivers R.C.
      • Sokoll L.
      • Zhu H.
      • Shih Ie M.
      • Cope L.
      • Pandey A.
      • Zhang B.
      • Snyder M.P.
      • Levine D.A.
      • Smith R.D.
      • Chan D.W.
      • Rodland K.D.
      • CPTAC Investigators
      Integrated proteogenomic characterization of human high-grade serous ovarian cancer.
      ) tumors that had been transcriptomically profiled by The Cancer Genome Atlas (TCGA) (
      • Cancer Genome Atlas Network
      Comprehensive molecular characterization of human colon and rectal cancer.
      ,
      • Cancer Genome Atlas, Network
      Comprehensive molecular portraits of human breast tumours.
      ,
      • Cancer Genome Atlas Research Network
      Integrated genomic analyses of ovarian carcinoma.
      ) provided a new opportunity to address this question. We constructed gene coexpression networks based on mRNA and protein profiling data sets, respectively, for each of the three cancer types. Comprehensive comparisons between the mRNA and protein coexpression networks constructed for the same cancer type allowed us to systematically investigate the relative utility of mRNA and protein profiling data in predicting gene cofunctionality.
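      The network comparison described above can be illustrated with a minimal sketch. It assumes that a coexpression edge is drawn whenever the Pearson correlation between two genes across tumors reaches a fixed cutoff; the correlation measure, the cutoff of 0.8, the function names, and the toy profiles are all illustrative assumptions and are not taken from this section.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return cov / sqrt(vx * vy)

def coexpression_edges(profiles, threshold=0.8):
    """Gene pairs whose correlation meets the (assumed) cutoff."""
    edges = set()
    for g1, g2 in combinations(sorted(profiles), 2):
        if pearson(profiles[g1], profiles[g2]) >= threshold:
            edges.add((g1, g2))
    return edges

def edge_overlap(edges_a, edges_b):
    """Jaccard index of two edge sets (0.0 when both are empty)."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 0.0

# Hypothetical profiles: 4 genes measured in the same 5 tumors.
mrna = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],  # tracks A at the mRNA level
    "C": [1, 1, 2, 3, 3],   # tracks A at the mRNA level
    "D": [3, 1, 4, 1, 5],
}
protein = {
    "A": [1, 2, 3, 4, 5],
    "B": [1, 2, 3, 4, 6],   # still tracks A at the protein level
    "C": [2, 1, 3, 1, 2],   # no longer tracks A
    "D": [3, 1, 2, 1, 3],
}

mrna_edges = coexpression_edges(mrna)
protein_edges = coexpression_edges(protein)
print(sorted(mrna_edges), sorted(protein_edges))
print(edge_overlap(mrna_edges, protein_edges))
```

      Restricting both networks to genes measured on both platforms, as in the toy dictionaries here, keeps the edge sets directly comparable.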

      DISCUSSION

      With matched mRNA and protein profiling data from three cancer types, we have performed the first systematic study to investigate the relative utility of mRNA and protein profiling data in predicting gene cofunctionality. Although many studies have reported only a moderate correlation between mRNA and protein profiles (
      • Vogel C.
      • Marcotte E.M.
      Insights into the regulation of protein abundance from proteomic and transcriptomic analyses.
      ,
      • Foss E.J.
      • Radulovic D.
      • Shaffer S.A.
      • Goodlett D.R.
      • Kruglyak L.
      • Bedalov A.
      Genetic variation shapes protein networks mainly through non-transcriptional mechanisms.
      ,
      • Ghazalpour A.
      • Bennett B.
      • Petyuk V.A.
      • Orozco L.
      • Hagopian R.
      • Mungrue I.N.
      • Farber C.R.
      • Sinsheimer J.
      • Kang H.M.
      • Furlotte N.
      • Park C.C.
      • Wen P.Z.
      • Brewer H.
      • Weitz K.
      • Camp 2nd, D.G.
      • Pan C.
      • Yordanova R.
      • Neuhaus I.
      • Tilford C.
      • Siemers N.
      • Gargalovic P.
      • Eskin E.
      • Kirchgessner T.
      • Smith D.J.
      • Smith R.D.
      • Lusis A.J.
      Comparative analysis of proteome and transcriptome variation in mouse.
      ,
      • Schwanhausser B.
      • Busse D.
      • Li N.
      • Dittmar G.
      • Schuchhardt J.
      • Wolf J.
      • Chen W.
      • Selbach M.
      Global quantification of mammalian gene expression control.
      ,
      • Zhang B.
      • Wang J.
      • Wang X.
      • Zhu J.
      • Liu Q.
      • Shi Z.
      • Chambers M.C.
      • Zimmerman L.J.
      • Shaddox K.F.
      • Kim S.
      • Davies S.R.
      • Wang S.
      • Wang P.
      • Kinsinger C.R.
      • Rivers R.C.
      • Rodriguez H.
      • Townsend R.R.
      • Ellis M.J.
      • Carr S.A.
      • Tabb D.L.
      • Coffey R.J.
      • Slebos R.J.
      • Liebler D.C.
      • NCI, CPTAC
      Proteogenomic characterization of human colon and rectal cancer.
      ), whether protein profiling data better reflects cellular functions has remained unanswered, because the reported mRNA-protein discrepancy may have both biological and technical explanations (
      • Jovanovic M.
      • Rooney M.S.
      • Mertins P.
      • Przybylski D.
      • Chevrier N.
      • Satija R.
      • Rodriguez E.H.
      • Fields A.P.
      • Schwartz S.
      • Raychowdhury R.
      • Mumbach M.R.
      • Eisenhaure T.
      • Rabani M.
      • Gennert D.
      • Lu D.
      • Delorey T.
      • Weissman J.S.
      • Carr S.A.
      • Hacohen N.
      • Regev A.
      Immunogenetics. Dynamic profiling of the protein life cycle in response to pathogens.
      ,
      • Li J.J.
      • Bickel P.J.
      • Biggin M.D.
      System wide analyses have underestimated protein abundances and the importance of transcription in mammals.
      ,
      • Schwanhausser B.
      • Busse D.
      • Li N.
      • Dittmar G.
      • Schuchhardt J.
      • Wolf J.
      • Chen W.
      • Selbach M.
      Global quantification of mammalian gene expression control.
      ). Our study provided quantitative evidence to demonstrate that protein profiling data is more closely aligned with function than mRNA profiling data. Proteomic data strengthened the link between gene expression and function for the vast majority of biological processes and pathways. We identified a subset of biological processes and pathways for which protein measurements would be most critical. We also developed Gene2Net, which will allow biologists to generate hypotheses on new gene-function relationships based on the protein coexpression networks.
      Although mRNA profiling has been the dominant approach to studying gene expression and its relationship to cellular functions, it has been suggested that genes with similar mRNA expression profiles are not necessarily functionally coupled (
      • Yanai I.
      • Korbel J.O.
      • Boue S.
      • McWeeney S.K.
      • Bork P.
      • Lercher M.J.
      Similar gene expression profiles do not imply similar tissue functions.
      ). Our results showed that chromosomal colocalization plays a significant role in determining mRNA coexpression. Somatic copy number alteration may be an important driver of this phenomenon (
      • Zhang B.
      • Wang J.
      • Wang X.
      • Zhu J.
      • Liu Q.
      • Shi Z.
      • Chambers M.C.
      • Zimmerman L.J.
      • Shaddox K.F.
      • Kim S.
      • Davies S.R.
      • Wang S.
      • Wang P.
      • Kinsinger C.R.
      • Rivers R.C.
      • Rodriguez H.
      • Townsend R.R.
      • Ellis M.J.
      • Carr S.A.
      • Tabb D.L.
      • Coffey R.J.
      • Slebos R.J.
      • Liebler D.C.
      • NCI, CPTAC
      Proteogenomic characterization of human colon and rectal cancer.
      ,
      • Mertins P.
      • Mani D.R.
      • Ruggles K.V.
      • Gillette M.A.
      • Clauser K.R.
      • Wang P.
      • Wang X.
      • Qiao J.W.
      • Cao S.
      • Petralia F.
      • Kawaler E.
      • Mundt F.
      • Krug K.
      • Tu Z.
      • Lei J.T.
      • Gatza M.L.
      • Wilkerson M.
      • Perou C.M.
      • Yellapantula V.
      • Huang K.L.
      • Lin C.
      • McLellan M.D.
      • Yan P.
      • Davies S.R.
      • Townsend R.R.
      • Skates S.J.
      • Wang J.
      • Zhang B.
      • Kinsinger C.R.
      • Mesri M.
      • Rodriguez H.
      • Ding L.
      • Paulovich A.G.
      • Fenyo D.
      • Ellis M.J.
      • Carr S.A.
      • NCI, CPTAC
      Proteogenomics connects somatic mutations to signalling in breast cancer.
      ,
      • Zhang H.
      • Liu T.
      • Zhang Z.
      • Payne S.H.
      • Zhang B.
      • McDermott J.E.
      • Zhou J.Y.
      • Petyuk V.A.
      • Chen L.
      • Ray D.
      • Sun S.
      • Yang F.
      • Chen L.
      • Wang J.
      • Shah P.
      • Cha S.W.
      • Aiyetan P.
      • Woo S.
      • Tian Y.
      • Gritsenko M.A.
      • Clauss T.R.
      • Choi C.
      • Monroe M.E.
      • Thomas S.
      • Nie S.
      • Wu C.
      • Moore R.J.
      • Yu K.H.
      • Tabb D.L.
      • Fenyo D.
      • Bafna V.
      • Wang Y.
      • Rodriguez H.
      • Boja E.S.
      • Hiltke T.
      • Rivers R.C.
      • Sokoll L.
      • Zhu H.
      • Shih Ie M.
      • Cope L.
      • Pandey A.
      • Zhang B.
      • Snyder M.P.
      • Levine D.A.
      • Smith R.D.
      • Chan D.W.
      • Rodland K.D.
      • CPTAC Investigators
      Integrated proteogenomic characterization of human high-grade serous ovarian cancer.
      ). In addition, genomic colocalization-driven coexpression has been previously reported in Caenorhabditis elegans (
      • Roy P.J.
      • Stuart J.M.
      • Lund J.
      • Kim S.K.
      Chromosomal clustering of muscle-expressed genes in Caenorhabditis elegans.
      ) and Saccharomyces cerevisiae (
      • Cohen B.A.
      • Mitra R.D.
      • Hughes J.D.
      • Church G.M.
      A computational analysis of whole-genome expression data reveals chromosomal domains of gene expression.
      ). Thus, this observation may also be explained by other mechanisms such as colocalization of coexpressed genes in regions of active chromatin or enhancers shared by neighboring genes on chromosomes. The impact of genomic colocalization on gene coexpression is significantly lower at the protein level than at the mRNA level (supplemental Fig. S7–S9). Although mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes, protein coexpression was driven primarily by functional similarity between coexpressed genes. Importantly, functionally coherent mRNA modules are preferentially preserved in protein networks (Fig. 3B), suggesting a role of protein level regulation in coordinating gene functions.
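      One simple way to look at the colocalization effect discussed above is to ask what fraction of coexpression edges connect genes on the same chromosome. The sketch below is only an illustration under stated assumptions: the function name, the gene labels, and the chromosome assignments are hypothetical, and a real analysis would also compare the fraction against a random expectation.

```python
def same_chromosome_fraction(edges, chromosome_of):
    """Fraction of edges whose two genes map to the same chromosome.

    Edges with an unmapped gene are ignored; returns 0.0 if none remain.
    """
    usable = [(a, b) for a, b in edges if a in chromosome_of and b in chromosome_of]
    if not usable:
        return 0.0
    same = sum(chromosome_of[a] == chromosome_of[b] for a, b in usable)
    return same / len(usable)

# Hypothetical gene-to-chromosome map and edge lists.
chromosome_of = {"G1": "chr1", "G2": "chr1", "G3": "chr2", "G4": "chr3", "G5": "chr3"}
mrna_net = [("G1", "G2"), ("G1", "G3"), ("G2", "G3"), ("G4", "G5")]
protein_net = [("G1", "G3"), ("G2", "G3"), ("G4", "G5"), ("G1", "G4")]

mrna_fraction = same_chromosome_fraction(mrna_net, chromosome_of)
protein_fraction = same_chromosome_fraction(protein_net, chromosome_of)
print(mrna_fraction, protein_fraction)
```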
      Among the three cancer types, proteomic data provided the largest added value for ovarian cancer (Fig. 1B, Fig. 3). We note that mRNA profiling data for ovarian cancer were generated by microarray, whereas mRNA data for breast and colorectal cancers were generated by RNA-Seq. The observed differences may be partially attributable to the different platforms. However, this also may reflect the unique biology of ovarian cancer. Prevalent copy number alterations in ovarian cancer (
      • Cancer Genome Atlas Research Network
      Integrated genomic analyses of ovarian carcinoma.
      ) may create widespread gene expression alterations at the transcriptome level, thereby requiring more extensive post-transcriptional regulation to buffer against non-functional alterations (
      • Yanai I.
      • Korbel J.O.
      • Boue S.
      • McWeeney S.K.
      • Bork P.
      • Lercher M.J.
      Similar gene expression profiles do not imply similar tissue functions.
      ,
      • Battle A.
      • Khan Z.
      • Wang S.H.
      • Mitrano A.
      • Ford M.J.
      • Pritchard J.K.
      • Gilad Y.
      Genomic variation. Impact of regulatory variation from RNA to protein.
      ). Indeed, chromosomal colocalization had a much stronger impact on mRNA coexpression in ovarian cancer compared with the other two cancer types, and such impact was reduced, but still visible at the protein level in ovarian cancer (supplemental Fig. S9).
      Although current proteomic platforms can identify more than ten thousand proteins, the number of quantifiable proteins remains much smaller than the number of genes that can be quantified by mRNA profiling. In this study, the numbers of quantifiable proteins in the breast, colorectal, and ovarian data sets were 6281, 3899, and 3327, respectively, whereas the numbers of quantified genes in the corresponding mRNA profiling data sets were 20501, 20501, and 17814. Our study was limited to genes with both mRNA and protein abundance measurements, but we believe our conclusion is not biased, because the same trend was observed as the number of studied genes increased from 2988 in ovarian cancer to 3764 in colorectal cancer and 5988 in breast cancer. Moreover, the robustness of our conclusion was also confirmed by down-sampling experiments using the breast cancer data sets (see supplemental Text S2 and supplemental Fig. S10).
      The network topology may affect the priority scores of genes in the network. Zhang et al. (
      • Zhang B.
      • Shi Z.
      • Duncan D.T.
      • Prodduturi N.
      • Marnett L.J.
      • Liebler D.C.
      Relating protein adduction to gene expression changes: a systems approach.
      ) tried to remove this effect by assessing the statistical significance of the scores. To evaluate whether considering network topology could improve network-based function prediction, we combined two statistical metrics, localP and edgeP, with the rank ratio metric for assessing the significance of the priority scores (see supplemental Text S2). On average, considering network topology increased the AUROCs by less than 2% (supplemental Fig. S11). Furthermore, results based on all three types of AUROCs consistently suggest that protein networks significantly outperformed mRNA networks in gene function prediction.
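      To make the AUROC-based evaluation concrete, the following sketch scores genes with a plain neighbor-voting form of guilt-by-association and computes the AUROC of those scores. This is a deliberately simplified stand-in: it does not implement the rank ratio, localP, or edgeP metrics mentioned above, and the function names, gene labels, and edge lists are hypothetical.

```python
def neighbor_vote_scores(edges, annotated):
    """Score each gene by the fraction of its neighbors carrying the annotation.

    A gene's own label is never used for its own score, so the scores act as
    leave-one-out predictions. Genes without edges receive no score.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {g: len(nb & annotated) / len(nb) for g, nb in neighbors.items()}

def auroc(scores, positives):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for g, s in scores.items() if g in positives]
    neg = [s for g, s in scores.items() if g not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical annotation: G1-G3 belong to the same biological process.
annotated = {"G1", "G2", "G3"}

# Toy protein network: edges stay within functional groups.
protein_net = [("G1", "G2"), ("G1", "G3"), ("G2", "G3"),
               ("G4", "G5"), ("G5", "G6"), ("G4", "G6")]
# Toy mRNA network: some edges cross functional groups.
mrna_net = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G1", "G5"),
            ("G4", "G5"), ("G5", "G6"), ("G1", "G6")]

protein_auroc = auroc(neighbor_vote_scores(protein_net, annotated), annotated)
mrna_auroc = auroc(neighbor_vote_scores(mrna_net, annotated), annotated)
print(protein_auroc, mrna_auroc)
```

      In this toy case the cross-group edges dilute the neighbor votes in the mRNA network, which lowers its AUROC relative to the protein network.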
      In conclusion, our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. The GBA strategies developed in transcriptomic studies would be more effective when applied to proteomic data. Gene function and disease studies would benefit immensely from broad adoption of global proteome profiling technologies.

      Acknowledgments

      We thank Marko Jovanovic, Nikolai Slavov, and Li Ding for critical reading of the manuscript and helpful suggestions. Transcriptomics data for this study were generated by The Cancer Genome Atlas pilot project established by the NCI and the National Human Genome Research Institute. Information about TCGA and the investigators and institutions who constitute the TCGA research network can be found at http://cancergenome.nih.gov/. The study was conducted in part using the resources of the Advanced Computing Center for Research and Education at Vanderbilt University, Nashville, TN.

      REFERENCES

        • Quackenbush J.
        Genomics. Microarrays–guilt by association.
        Science. 2003; 302: 240-241
        • Eisen M.B.
        • Spellman P.T.
        • Brown P.O.
        • Botstein D.
        Cluster analysis and display of genome-wide expression patterns.
        Proc. Natl. Acad. Sci. U.S.A. 1998; 95: 14863-14868
        • Butte A.J.
        • Tamayo P.
        • Slonim D.
        • Golub T.R.
        • Kohane I.S.
        Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks.
        Proc. Natl. Acad. Sci. U.S.A. 2000; 97: 12182-12186
        • Voineagu I.
        • Wang X.
        • Johnston P.
        • Lowe J.K.
        • Tian Y.
        • Horvath S.
        • Mill J.
        • Cantor R.M.
        • Blencowe B.J.
        • Geschwind D.H.
        Transcriptomic analysis of autistic brain reveals convergent molecular pathology.
        Nature. 2011; 474: 380-384
        • Margolin A.A.
        • Wang K.
        • Lim W.K.
        • Kustagi M.
        • Nemenman I.
        • Califano A.
        Reverse engineering cellular networks.
        Nat. Protoc. 2006; 1: 662-671
        • Subramanian A.
        • Tamayo P.
        • Mootha V.K.
        • Mukherjee S.
        • Ebert B.L.
        • Gillette M.A.
        • Paulovich A.
        • Pomeroy S.L.
        • Golub T.R.
        • Lander E.S.
        • Mesirov J.P.
        Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.
        Proc. Natl. Acad. Sci. U.S.A. 2005; 102: 15545-15550
        • Zhang B.
        • Kirov S.
        • Snoddy J.
        WebGestalt: an integrated system for exploring gene sets in various biological contexts.
        Nucleic Acids Res. 2005; 33: W741-W748
        • Wang J.
        • Duncan D.
        • Shi Z.
        • Zhang B.
        WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): update 2013.
        Nucleic Acids Res. 2013; 41: W77-W83
        • Rodriguez-Trelles F.
        • Tarrio R.
        • Ayala F.J.
        Is ectopic expression caused by deregulatory mutations or due to gene-regulation leaks with evolutionary potential?.
        Bioessays. 2005; 27: 592-601
        • Stuart J.M.
        • Segal E.
        • Koller D.
        • Kim S.K.
        A gene-coexpression network for global discovery of conserved genetic modules.
        Science. 2003; 302: 249-255
        • Yanai I.
        • Korbel J.O.
        • Boue S.
        • McWeeney S.K.
        • Bork P.
        • Lercher M.J.
        Similar gene expression profiles do not imply similar tissue functions.
        Trends Genet. 2006; 22: 132-138
        • Lee H.K.
        • Hsu A.K.
        • Sajdak J.
        • Qin J.
        • Pavlidis P.
        Coexpression analysis of human genes across many microarray data sets.
        Genome Res. 2004; 14: 1085-1094
        • Ramani A.K.
        • Li Z.
        • Hart G.T.
        • Carlson M.W.
        • Boutz D.R.
        • Marcotte E.M.
        A map of human protein interactions derived from co-expression of human mRNAs and their orthologs.
        Mol. Syst. Biol. 2008; 4: 180
        • Liu Y.
        • Beyer A.
        • Aebersold R.
        On the Dependency of Cellular Protein Levels on mRNA Abundance.
        Cell. 2016; 165: 535-550
        • Vogel C.
        • Marcotte E.M.
        Insights into the regulation of protein abundance from proteomic and transcriptomic analyses.
        Nat. Rev. Genet. 2012; 13: 227-232
        • Jovanovic M.
        • Rooney M.S.
        • Mertins P.
        • Przybylski D.
        • Chevrier N.
        • Satija R.
        • Rodriguez E.H.
        • Fields A.P.
        • Schwartz S.
        • Raychowdhury R.
        • Mumbach M.R.
        • Eisenhaure T.
        • Rabani M.
        • Gennert D.
        • Lu D.
        • Delorey T.
        • Weissman J.S.
        • Carr S.A.
        • Hacohen N.
        • Regev A.
        Immunogenetics. Dynamic profiling of the protein life cycle in response to pathogens.
        Science. 2015; 347: 1259038
        • Li J.J.
        • Bickel P.J.
        • Biggin M.D.
        System wide analyses have underestimated protein abundances and the importance of transcription in mammals.
        PeerJ. 2014; 2: e270
        • Li J.J.
        • Biggin M.D.
        Gene expression. Statistics requantitates the central dogma.
        Science. 2015; 347: 1066-1067
        • Foss E.J.
        • Radulovic D.
        • Shaffer S.A.
        • Goodlett D.R.
        • Kruglyak L.
        • Bedalov A.
        Genetic variation shapes protein networks mainly through non-transcriptional mechanisms.
        PLoS Biol. 2011; 9: e1001144
        • Ghazalpour A.
        • Bennett B.
        • Petyuk V.A.
        • Orozco L.
        • Hagopian R.
        • Mungrue I.N.
        • Farber C.R.
        • Sinsheimer J.
        • Kang H.M.
        • Furlotte N.
        • Park C.C.
        • Wen P.Z.
        • Brewer H.
        • Weitz K.
        • Camp 2nd, D.G.
        • Pan C.
        • Yordanova R.
        • Neuhaus I.
        • Tilford C.
        • Siemers N.
        • Gargalovic P.
        • Eskin E.
        • Kirchgessner T.
        • Smith D.J.
        • Smith R.D.
        • Lusis A.J.
        Comparative analysis of proteome and transcriptome variation in mouse.
        PLoS Genetics. 2011; 7: e1001393
        • Schwanhausser B.
        • Busse D.
        • Li N.
        • Dittmar G.
        • Schuchhardt J.
        • Wolf J.
        • Chen W.
        • Selbach M.
        Global quantification of mammalian gene expression control.
        Nature. 2011; 473: 337-342
        • Zhang B.
        • Wang J.
        • Wang X.
        • Zhu J.
        • Liu Q.
        • Shi Z.
        • Chambers M.C.
        • Zimmerman L.J.
        • Shaddox K.F.
        • Kim S.
        • Davies S.R.
        • Wang S.
        • Wang P.
        • Kinsinger C.R.
        • Rivers R.C.
        • Rodriguez H.
        • Townsend R.R.
        • Ellis M.J.
        • Carr S.A.
        • Tabb D.L.
        • Coffey R.J.
        • Slebos R.J.
        • Liebler D.C.
        • NCI, CPTAC
        Proteogenomic characterization of human colon and rectal cancer.
        Nature. 2014; 513: 382-387
        • Mertins P.
        • Mani D.R.
        • Ruggles K.V.
        • Gillette M.A.
        • Clauser K.R.
        • Wang P.
        • Wang X.
        • Qiao J.W.
        • Cao S.
        • Petralia F.
        • Kawaler E.
        • Mundt F.
        • Krug K.
        • Tu Z.
        • Lei J.T.
        • Gatza M.L.
        • Wilkerson M.
        • Perou C.M.
        • Yellapantula V.
        • Huang K.L.
        • Lin C.
        • McLellan M.D.
        • Yan P.
        • Davies S.R.
        • Townsend R.R.
        • Skates S.J.
        • Wang J.
        • Zhang B.
        • Kinsinger C.R.
        • Mesri M.
        • Rodriguez H.
        • Ding L.
        • Paulovich A.G.
        • Fenyo D.
        • Ellis M.J.
        • Carr S.A.
        • NCI, CPTAC
        Proteogenomics connects somatic mutations to signalling in breast cancer.
        Nature. 2016; 534: 55-62
        • Zhang H.
        • Liu T.
        • Zhang Z.
        • Payne S.H.
        • Zhang B.
        • McDermott J.E.
        • Zhou J.Y.
        • Petyuk V.A.
        • Chen L.
        • Ray D.
        • Sun S.
        • Yang F.
        • Chen L.
        • Wang J.
        • Shah P.
        • Cha S.W.
        • Aiyetan P.
        • Woo S.
        • Tian Y.
        • Gritsenko M.A.
        • Clauss T.R.
        • Choi C.
        • Monroe M.E.
        • Thomas S.
        • Nie S.
        • Wu C.
        • Moore R.J.
        • Yu K.H.
        • Tabb D.L.
        • Fenyo D.
        • Bafna V.
        • Wang Y.
        • Rodriguez H.
        • Boja E.S.
        • Hiltke T.
        • Rivers R.C.
        • Sokoll L.
        • Zhu H.
        • Shih Ie M.
        • Cope L.
        • Pandey A.
        • Zhang B.
        • Snyder M.P.
        • Levine D.A.
        • Smith R.D.
        • Chan D.W.
        • Rodland K.D.
        • CPTAC Investigators
        Integrated proteogenomic characterization of human high-grade serous ovarian cancer.
        Cell. 2016; 166: 755-765
        • Cancer Genome Atlas Network
        Comprehensive molecular characterization of human colon and rectal cancer.
        Nature. 2012; 487: 330-337
        • Cancer Genome Atlas Network
        Comprehensive molecular portraits of human breast tumours.
        Nature. 2012; 490: 61-70
        • Cancer Genome Atlas Research Network
        Integrated genomic analyses of ovarian carcinoma.
        Nature. 2011; 474: 609-615
        • Li B.
        • Dewey C.N.
        RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome.
        BMC Bioinformatics. 2011; 12: 323
        • Smyth G.K.
        • Speed T.
        Normalization of cDNA microarray data.
        Methods. 2003; 31: 265-273
        • Rhee S.Y.
        • Wood V.
        • Dolinski K.
        • Draghici S.
        Use and misuse of the gene ontology annotations.
        Nat. Rev. Genet. 2008; 9: 509-515
        • Resnik P.
        Semantic similarity in a taxonomy: an information-based measure and its application to problems of ambiguity in natural language.
        J. Artif. Intell. Res. 1999; 11: 130
        • Marino-Ramirez L.
        • Bodenreider O.
        • Kantz N.
        • Jordan I.K.
        Co-evolutionary rates of functionally related yeast genes.
        Evol. Bioinform. Online. 2006; 2: 271-276
        • Ruan J.
        • Dean A.K.
        • Zhang W.
        A general co-expression network-based approach to gene expression analysis: comparison and applications.
        BMC Syst. Biol. 2010; 4: 8
        • Tornow S.
        • Mewes H.W.
        Functional modules by relating protein interaction networks and gene expression.
        Nucleic Acids Res. 2003; 31: 6283-6289
        • Margolin A.A.
        • Nemenman I.
        • Basso K.
        • Wiggins C.
        • Stolovitzky G.
        • Dalla Favera R.
        • Califano A.
        ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context.
        BMC Bioinformatics. 2006; 7: S7
        • Shi Z.
        • Derow C.K.
        • Zhang B.
        Co-expression module analysis reveals biological processes, genomic gain, and regulatory mechanisms associated with breast cancer progression.
        BMC Syst. Biol. 2010; 4: 74
        • Tripathi M.K.
        • Deane N.G.
        • Zhu J.
        • An H.
        • Mima S.
        • Wang X.
        • Padmanabhan S.
        • Shi Z.
        • Prodduturi N.
        • Ciombor K.K.
        • Chen X.
        • Washington M.K.
        • Zhang B.
        • Beauchamp R.D.
        Nuclear factor of activated T-cell activity is associated with metastatic capacity in colon cancer.
        Cancer Res. 2014; 74: 6947-6957
        • Turinsky A.L.
        • Razick S.
        • Turner B.
        • Donaldson I.M.
        • Wodak S.J.
        Interaction databases on the same page.
        Nat. Biotechnol. 2011; 29: 391-393
        • Dice L.R.
        Measures of the amount of ecologic association between species.
        Ecology. 1945; 26: 297-302
        • Shi Z.
        • Wang J.
        • Zhang B.
        NetGestalt: integrating multidimensional omics data over biological networks.
        Nat. Methods. 2013; 10: 597-598
        • Fury W.
        • Batliwalla F.
        • Gregersen P.K.
        • Li W.
        Overlapping probabilities of top ranking gene lists, hypergeometric distribution, and stringency of gene selection criterion.
        Conf. Proc. IEEE Eng. Med. Biol. Soc. 2006; 1: 5531-5534
        • Benjamini Y.
        • Hochberg Y.
        Controlling the false discovery rate: a practical and powerful approach to multiple testing.
        J. R. Stat. Soc. Series B. 1995; 57: 289-300
        • Kohler S.
        • Bauer S.
        • Horn D.
        • Robinson P.N.
        Walking the interactome for prioritization of candidate disease genes.
        Am. J. Hum. Genet. 2008; 82: 949-958
        • Zhang B.
        • Shi Z.
        • Duncan D.T.
        • Prodduturi N.
        • Marnett L.J.
        • Liebler D.C.
        Relating protein adduction to gene expression changes: a systems approach.
        Mol. Biosyst. 2011; 7: 2118-2127
        • Barabasi A.L.
        • Oltvai Z.N.
        Network biology: understanding the cell's functional organization.
        Nat. Rev. Genet. 2004; 5: 101-113
        • Hartwell L.H.
        • Hopfield J.J.
        • Leibler S.
        • Murray A.W.
        From molecular to modular cell biology.
        Nature. 1999; 402: C47-C52
        • Rubio-Perez C.
        • Tamborero D.
        • Schroeder M.P.
        • Antolin A.A.
        • Deu-Pons J.
        • Perez-Llamas C.
        • Mestres J.
        • Gonzalez-Perez A.
        • Lopez-Bigas N.
        In silico prescription of anticancer drugs to cohorts of 28 tumor types reveals targeting opportunities.
        Cancer Cell. 2015; 27: 382-396
        • Kourtidis A.
        • Jain R.
        • Carkner R.D.
        • Eifert C.
        • Brosnan M.J.
        • Conklin D.S.
        An RNA interference screen identifies metabolic regulators NR1D1 and PBP as novel survival factors for breast cancer cells with the ERBB2 signature.
        Cancer Res. 2010; 70: 1783-1792
        • Pio R.
        • Corrales L.
        • Lambris J.D.
        The role of complement in tumor growth.
        Adv. Exp. Med. Biol. 2014; 772: 229-262
        • Barthel D.
        • Schindler S.
        • Zipfel P.F.
        Plasminogen is a complement inhibitor.
        J. Biol. Chem. 2012; 287: 18831-18842
        • Tsai J.H.
        • Yang J.
        Epithelial-mesenchymal plasticity in carcinoma metastasis.
        Genes Dev. 2013; 27: 2192-2206
        • Daemen A.
        • Peterson D.
        • Sahu N.
        • McCord R.
        • Du X.
        • Liu B.
        • Kowanetz K.
        • Hong R.
        • Moffat J.
        • Gao M.
        • Boudreau A.
        • Mroue R.
        • Corson L.
        • O'Brien T.
        • Qing J.
        • Sampath D.
        • Merchant M.
        • Yauch R.
        • Manning G.
        • Settleman J.
        • Hatzivassiliou G.
        • Evangelista M.
        Metabolite profiling stratifies pancreatic ductal adenocarcinomas into subtypes with distinct sensitivities to metabolic inhibitors.
        Proc. Natl. Acad. Sci. U.S.A. 2015; 112: E4410-E4417
        • Roy P.J.
        • Stuart J.M.
        • Lund J.
        • Kim S.K.
        Chromosomal clustering of muscle-expressed genes in Caenorhabditis elegans.
        Nature. 2002; 418: 975-979
        • Cohen B.A.
        • Mitra R.D.
        • Hughes J.D.
        • Church G.M.
        A computational analysis of whole-genome expression data reveals chromosomal domains of gene expression.
        Nat. Genet. 2000; 26: 183-186
        • Battle A.
        • Khan Z.
        • Wang S.H.
        • Mitrano A.
        • Ford M.J.
        • Pritchard J.K.
        • Gilad Y.
        Genomic variation. Impact of regulatory variation from RNA to protein.
        Science. 2015; 347: 664-667