
Reproducibility and Transparency by Design*

  • Vladislav A. Petyuk
    Affiliations
    Pacific Northwest National Laboratory, Richland, WA
  • Laurent Gatto
    Affiliations
    de Duve Institute, Université Catholique de Louvain, Brussels, Belgium
  • Samuel H. Payne
    Correspondence
    To whom correspondence should be addressed.
    Affiliations
    Brigham Young University, Provo, UT
  • Author Footnotes
    * V.A.P. was supported by NIH/NINDS U18NS082140. L.G. was supported by BBSRC Strategic Longer and Larger grant (Award BB/L002817/1). S.H.P. was supported by NIH/NCI U24 CA210972 and the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research, Early Career Research Program. Battelle operates PNNL for the DOE under contract DE-AC05-76RLO01830. The authors declare no competing financial interest.
Open Access. Published: July 04, 2019. DOI: https://doi.org/10.1074/mcp.IP119.001567

      Graphical Abstract

      To truly achieve reproducible research, having reproducible analytics must be a principal research goal. Biological discovery is not the only deliverable; reproducibility is an essential part of our research.
      Public trust in scientific research depends on the clarity of published conclusions and on the perceived transparency of the methods. Although irreproducibility is not exclusive to biology, strong public interest in environmental and biomedical discoveries seems to have focused the spotlight here following a number of high-profile studies that failed to be reproduced (Reaves et al., 2012; Petricoin et al., 2002; Kern, 2012; Diamandis, 2010; Li et al., 2011; Asara et al., 2007). In this report, we focus specifically on the linked issues of reproducibility and transparency in the integration and analysis of multi-omics data. Unlike data generation, where biological variability is expected, computational analyses should be completely and exactly reproducible. Unfortunately, the documentation of data processing, analysis, and statistical algorithms in publications is usually not detailed enough to allow this. This lack of detail is especially problematic for multi-omics characterizations, where complex statistical integration is essential to merging disparate data types (e.g. clinical, proteomic, and genomic data).
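
      As a small illustration of what "completely and exactly reproducible" can mean in practice, the sketch below fixes the random seed, fingerprints the input data, and records the software version alongside the result. It is a minimal example under stated assumptions, not a method taken from this article: the toy input, the function names (run_analysis, provenance_record), and the seed value are all hypothetical.

```python
import hashlib
import json
import platform
import random


def sha256_of(data: bytes) -> str:
    """Fingerprint the exact bytes that went into the analysis."""
    return hashlib.sha256(data).hexdigest()


def run_analysis(values, seed=2019, n_boot=200):
    """Toy analysis (hypothetical): mean plus a bootstrap median of the mean.

    A local, seeded generator is used so that no global random state
    can change the outcome between runs.
    """
    rng = random.Random(seed)
    n = len(values)
    boot_means = []
    for _ in range(n_boot):
        sample = [values[rng.randrange(n)] for _ in range(n)]
        boot_means.append(sum(sample) / n)
    boot_means.sort()
    return {"mean": sum(values) / n, "boot_median": boot_means[n_boot // 2]}


def provenance_record(raw: bytes, result: dict, seed: int) -> dict:
    """Bundle the result with what is needed to regenerate it."""
    return {
        "input_sha256": sha256_of(raw),
        "seed": seed,
        "python": platform.python_version(),
        "result": result,
    }


# Illustrative input only; a real study would read the deposited data file.
raw = b"1.2,3.4,2.2,5.1,4.0\n"
values = [float(x) for x in raw.decode().strip().split(",")]

first = run_analysis(values)
second = run_analysis(values)  # same input and seed, so the same output
record = provenance_record(raw, first, seed=2019)
print(json.dumps(record, indent=2, sort_keys=True))
```

      Rerunning the script on the same bytes with the same seed should give an identical result, and a changed input is detectable from the recorded hash. A record of this kind is only one ingredient; it does not replace a full description of the processing steps.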

      REFERENCES

        • Reaves M.L.
        • Sinha S.
        • Rabinowitz J.D.
        • Kruglyak L.
        • Redfield R.J.
        Absence of detectable arsenate in DNA from arsenate-grown GFAJ-1 cells.
        Science. 2012; 337: 470-473
        • Petricoin E.F.
        • Ardekani A.M.
        • Hitt B.A.
        • Levine P.J.
        • Fusaro V.A.
        • Steinberg S.M.
        • Mills G.B.
        • Simone C.
        • Fishman D.A.
        • Kohn E.C.
        • Liotta L.A.
        Use of proteomic patterns in serum to identify ovarian cancer.
        Lancet. 2002; 359: 572-577
        • Kern S.E.
        Why your new cancer biomarker may never work: Recurrent patterns and remarkable diversity in biomarker failures.
        Cancer Res. 2012; 72: 6097-6101
        • Diamandis E.P.
        Cancer biomarkers: Can we turn recent failures into success?
        J. Natl. Cancer Inst. 2010; 102: 1462-1467
        • Li M.
        • Wang I.X.
        • Li Y.
        • Bruzel A.
        • Richards A.L.
        • Toung J.M.
        • Cheung V.G.
        Widespread RNA and DNA sequence differences in the human transcriptome.
        Science. 2011; 333: 53-58
        • Asara J.M.
        • Schweitzer M.H.
        • Freimark L.M.
        • Phillips M.
        • Cantley L.C.
        Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry.
        Science. 2007; 316: 280-285
        • Kinsinger C.R.
        • Apffel J.
        • Baker M.
        • Bian X.
        • Borchers C.H.
        • Bradshaw R.
        • Brusniak M.Y.
        • Chan D.W.
        • Deutsch E.W.
        • Domon B.
        • Gorman J.
        • Grimm R.
        • Hancock W.
        • Hermjakob H.
        • Horn D.
        • Hunter C.
        • Kolar P.
        • Kraus H.J.
        • Langen H.
        • Linding R.
        • Moritz R.L.
        • Omenn G.S.
        • Orlando R.
        • Pandey A.
        • Ping P.
        • Rahbar A.
        • Rivers R.
        • Seymour S.L.
        • Simpson R.J.
        • Slotta D.
        • Smith R.D.
        • Stein S.E.
        • Tabb D.L.
        • Tagle D.
        • Yates 3rd, J.R.
        • Rodriguez H.
        Recommendations for mass spectrometry data quality metrics for open access data (corollary to the Amsterdam Principles).
        Mol. Cell. Proteomics. 2011; 10 (O111.015446)
        • Chalkley R.J.
        • MacCoss M.J.
        • Jaffe J.D.
        • Röst H.L.
        Initial Guidelines for Manuscripts employing data-independent acquisition mass spectrometry for proteomic analysis.
        Mol. Cell. Proteomics. 2019; 18: 1-2
        • Abbatiello S.
        • Ackermann B.L.
        • Borchers C.
        • Bradshaw R.A.
        • Carr S.A.
        • Chalkley R.
        • Choi M.
        • Deutsch E.
        • Domon B.
        • Hoofnagle A.N.
        • Keshishian H.
        • Kuhn E.
        • Liebler D.C.
        • MacCoss M.
        • MacLean B.
        • Mani D.R.
        • Neubert H.
        • Smith D.
        • Vitek O.
        • Zimmerman L.
        New guidelines for publication of manuscripts describing development and application of targeted mass spectrometry measurements of peptides and proteins.
        Mol. Cell. Proteomics. 2017; 16: 327-328
        • Grossman R.L.
        • Heath A.P.
        • Ferretti V.
        • Varmus H.E.
        • Lowy D.R.
        • Kibbe W.A.
        • Staudt L.M.
        Toward a shared vision for cancer genomic data.
        N. Engl. J. Med. 2016; 375: 1109-1112
        • Deutsch E.W.
        • Csordas A.
        • Sun Z.
        • Jarnuczak A.
        • Perez-Riverol Y.
        • Ternent T.
        • Campbell D.S.
        • Bernal-Llinares M.
        • Okuda S.
        • Kawano S.
        • Moritz R.L.
        • Carver J.J.
        • Wang M.
        • Ishihama Y.
        • Bandeira N.
        • Hermjakob H.
        • Vizcaíno J.A.
        The ProteomeXchange consortium in 2017: Supporting the cultural change in proteomics public data deposition.
        Nucleic Acids Res. 2017; 45: D1100-D1106
        • Wang M.
        • Carver J.J.
        • Phelan V.V.
        • Sanchez L.M.
        • Garg N.
        • Peng Y.
        • Nguyen D.D.
        • Watrous J.
        • Kapono C.A.
        • Luzzatto-Knaan T.
        • Porto C.
        • Bouslimani A.
        • Melnik A.V.
        • Meehan M.J.
        • Liu W.T.
        • Crusemann M.
        • Boudreau P.D.
        • Esquenazi E.
        • Sandoval-Calderon M.
        • Kersten R.D.
        • Pace L.A.
        • Quinn R.A.
        • Duncan K.R.
        • Hsu C.C.
        • Floros D.J.
        • Gavilan R.G.
        • Kleigrewe K.
        • Northen T.
        • Dutton R.J.
        • Parrot D.
        • Carlson E.E.
        • Aigle B.
        • Michelsen C.F.
        • Jelsbak L.
        • Sohlenkamp C.
        • Pevzner P.
        • Edlund A.
        • McLean J.
        • Piel J.
        • Murphy B.T.
        • Gerwick L.
        • Liaw C.C.
        • Yang Y.L.
        • Humpf H.U.
        • Maansson M.
        • Keyzers R.A.
        • Sims A.C.
        • Johnson A.R.
        • Sidebottom A.M.
        • Sedio B.E.
        • Klitgaard A.
        • Larson C.B.
        • Torres-Mendoza D.
        • Gonzalez D.J.
        • Silva D.B.
        • Marques L.M.
        • Demarque D.P.
        • Pociute E.
        • O'Neill E.C.
        • Briand E.
        • Helfrich E.J.N.
        • Granatosky E.A.
        • Glukhov E.
        • Ryffel F.
        • Houson H.
        • Mohimani H.
        • Kharbush J.J.
        • Zeng Y.
        • Vorholt J.A.
        • Kurita K.L.
        • Charusanti P.
        • McPhail K.L.
        • Nielsen K.F.
        • Vuong L.
        • Elfeki M.
        • Traxler M.F.
        • Engene N.
        • Koyama N.
        • Vining O.B.
        • Baric R.
        • Silva R.R.
        • Mascuch S.J.
        • Tomasi S.
        • Jenkins S.
        • Macherla V.
        • Hoffman T.
        • Agarwal V.
        • Williams P.G.
        • Dai J.
        • Neupane R.
        • Gurr J.
        • Rodriguez A.M.C.
        • Lamsa A.
        • Zhang C.
        • Dorrestein K.
        • Duggan B.M.
        • Almaliti J.
        • Allard P.M.
        • Phapale P.
        • Nothias L.F.
        • Alexandrov T.
        • Litaudon M.
        • Wolfender J.L.
        • Kyle J.E.
        • Metz T.O.
        • Peryea T.
        • Nguyen D.T.
        • VanLeer D.
        • Shinn P.
        • Jadhav A.
        • Muller R.
        • Waters K.M.
        • Shi W.
        • Liu X.
        • Zhang L.
        • Knight R.
        • Jensen P.R.
        • Palsson B.O.
        • Pogliano K.
        • Linington R.G.
        • Gutierrez M.
        • Lopes N.P.
        • Gerwick W.H.
        • Moore B.S.
        • Dorrestein P.C.
        • Bandeira N.
        Sharing and community curation of mass spectrometry data with global natural products social molecular networking.
        Nat. Biotechnol. 2016; 34: 828-837
        • da Veiga Leprevost F.
        • Gruning B.A.
        • Alves Aflitos S.
        • Röst H.L.
        • Uszkoreit J.
        • Barsnes H.
        • Vaudel M.
        • Moreno P.
        • Gatto L.
        • Weber J.
        • Bai M.
        • Jimenez R.C.
        • Sachsenberg T.
        • Pfeuffer J.
        • Vera Alvarez R.
        • Griss J.
        • Nesvizhskii A.I.
        • Perez-Riverol Y.
        BioContainers: An open-source and community-driven framework for software standardization.
        Bioinformatics. 2017; 33: 2580-2582
        • Lamport L.
        LaTeX: A Document Preparation System.
        Addison-Wesley, 1994
        • Baumer B.
        • Cetinkaya-Rundel M.
        • Bray A.
        • Loi L.
        • Horton N.J.
        R Markdown: Integrating a reproducible analysis tool into introductory statistics.
        arXiv preprint. 2014; arXiv:1402.1894
        • Huber W.
        • Carey V.J.
        • Gentleman R.
        • Anders S.
        • Carlson M.
        • Carvalho B.S.
        • Bravo H.C.
        • Davis S.
        • Gatto L.
        • Girke T.
        • Gottardo R.
        • Hahne F.
        • Hansen K.D.
        • Irizarry R.A.
        • Lawrence M.
        • Love M.I.
        • MacDonald J.
        • Obenchain V.
        • Oleś A.K.
        • Pagès H.
        • Reyes A.
        • Shannon P.
        • Smyth G.K.
        • Tenenbaum D.
        • Waldron L.
        • Morgan M.
        Orchestrating high-throughput genomic analysis with Bioconductor.
        Nat. Methods. 2015; 12: 115-121
        • Perez-Riverol Y.
        • Gatto L.
        • Wang R.
        • Sachsenberg T.
        • Uszkoreit J.
        • da Veiga Leprevost F.
        • Fufezan C.
        • Ternent T.
        • Eglen S.J.
        • Katz D.S.
        • Pollard T.J.
        • Konovalov A.
        • Flight R.M.
        • Blin K.
        • Vizcaíno J.A.
        Ten simple rules for taking advantage of Git and GitHub.
        PLoS Comput. Biol. 2016; 12: e1004947
        • Ohnishi Y.
        • Huber W.
        • Tsumura A.
        • Kang M.
        • Xenopoulos P.
        • Kurimoto K.
        • Oleś A.K.
        • Arauzo-Bravo M.J.
        • Saitou M.
        • Hadjantonakis A.K.
        • Hiiragi T.
        Cell-to-cell expression variability followed by signal reinforcement progressively segregates early mouse lineages.
        Nat. Cell Biol. 2014; 16: 27-37
        • Laufer C.
        • Fischer B.
        • Billmann M.
        • Huber W.
        • Boutros M.
        Mapping genetic interactions in human cancer cells with RNAi and multiparametric phenotyping.
        Nat. Methods. 2013; 10: 427-431
        • Eglen S.J.
        • Weeks M.
        • Jessop M.
        • Simonotto J.
        • Jackson T.
        • Sernagor E.
        A data repository and analysis framework for spontaneous neural activity recordings in developing retina.
        GigaScience. 2014; 3: 3
        • Lee J.Y.
        • Fujimoto G.M.
        • Wilson R.
        • Wiley H.S.
        • Payne S.H.
        Blazing Signature Filter: A library for fast pairwise similarity comparisons.
        BMC Bioinform. 2018; 19: 221
        • Markowetz F.
        Five selfish reasons to work reproducibly.
        Genome Biol. 2015; 16: 274
        • McKiernan E.C.
        • Bourne P.E.
        • Brown C.T.
        • Buck S.
        • Kenall A.
        • Lin J.
        • McDougall D.
        • Nosek B.A.
        • Ram K.
        • Soderberg C.K.
        • Spies J.R.
        • Thaney K.
        • Updegrove A.
        • Woo K.H.
        • Yarkoni T.
        How open science helps researchers succeed.
        eLife. 2016; 5: e16800