Research Article Collection: Immunopeptidomics|Articles in Press, 100515

Benchmarking bioinformatics pipelines in data-independent acquisition mass spectrometry for immunopeptidomics

  • Mohammad Shahbazy
    Affiliations
    Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia

    Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
  • Sri H. Ramarathinam
    Affiliations
    Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia

    Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
  • Patricia T. Illing
    Affiliations
    Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia

    Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
  • Emma C. Jappe
    Affiliations
    Evaxion Biotech, Bredgade 34E, DK-1260 Copenhagen, Denmark
  • Pouya Faridi
    Correspondence
    Corresponding authors. P.F.
    Affiliations
    Department of Medicine, School of Clinical Sciences, Monash University, Clayton, VIC 3800, Australia
  • Nathan P. Croft
    Correspondence
    Corresponding authors. N.P.C.
    Affiliations
    Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia

    Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
  • Anthony W. Purcell
    Correspondence
    Corresponding authors. A.W.P.
    Affiliations
    Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia

    Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
Open Access | Published: February 14, 2023 | DOI: https://doi.org/10.1016/j.mcpro.2023.100515

      Highlights

      • Four spectral library-based data-independent acquisition (DIA) pipelines were benchmarked for accurate identification and quantification of peptide antigens in complex immunopeptidomic datasets.
      • PEAKS and DIA-NN provided higher immunopeptidome coverage and reproducibility between replicates.
      • Skyline and Spectronaut achieved lower false-positive identifications.
      • All tools showed high correlation in the quantification of HLA-bound peptides.
      • A consensus approach provides the highest confidence in peptide identification.

      ABSTRACT

      Immunopeptidomes are the peptide repertoires bound by the molecules encoded by the major histocompatibility complex (MHC) (human leukocyte antigen (HLA) in humans). These HLA-peptide complexes are presented on the cell surface for immune T-cell recognition. Immunopeptidomics denotes the utilization of tandem mass spectrometry (MS/MS) to identify and quantify peptides bound to HLA molecules. Data-independent acquisition (DIA) has emerged as a powerful strategy for quantitative proteomics and deep proteome-wide identification; however, DIA application to immunopeptidomics analyses has so far seen limited use. Further, of the many DIA data processing tools currently available, there is no consensus in the immunopeptidomics community on the most appropriate pipeline(s) for in-depth and accurate HLA peptide identification. Herein, we benchmarked four commonly used spectral library-based DIA pipelines developed for proteomics applications (Skyline, Spectronaut, DIA-NN, and PEAKS) for their ability to perform immunopeptidome quantification. We validated and assessed the capability of each tool to identify and quantify HLA-bound peptides. Generally, DIA-NN and PEAKS provided higher immunopeptidome coverage with more reproducible results. Skyline and Spectronaut conferred more accurate peptide identification with lower experimental false-positive rates. All tools demonstrated reasonable correlations in quantifying precursors of HLA-bound peptides. Our benchmarking study suggests a combined strategy of applying at least two complementary DIA software tools to achieve the greatest degree of confidence and in-depth coverage of immunopeptidome data.

      Graphical abstract

      Keywords

      Abbreviations:

      DDA (Data-dependent acquisition), DIA (Data-independent acquisition), HLA (Human leukocyte antigen), FDR (False discovery rate), LFQ (Label-free quantification), LC-MS/MS (Liquid chromatography coupled with tandem mass spectrometry), MS (Mass spectrometry), MHC (Major histocompatibility complex), pHLA (HLA-bound peptide complex), PSM (Peptide-spectrum match), SWATH-MS (Sequential windowed acquisition of all theoretical fragment ion mass spectra)

      Introduction

      The human major histocompatibility complex (MHC) is a highly polymorphic region of chromosome 6 and includes loci that encode a family of glycoproteins termed human leukocyte antigens (HLAs). These HLA molecules play a primary role in presenting peptide antigens, derived from intra- and extracellular proteins, at the cell surface for immunosurveillance by T cells (

      Dudek, N. L., and Purcell, A. W. (2016) Repertoire of Nonclassical MHC I (HLA-E, HLA-F, HLA-G, and Orthologues). In: Ratcliffe, M. J. H., ed. Encyclopedia of Immunobiology, pp. 215-219, Academic Press, Oxford

      ,

      Purcell, A. W., and Dudek, N. L. (2016) Repertoire of Classical MHC Class I and Class II Molecules. In: Ratcliffe, M. J. H., ed. Encyclopedia of Immunobiology, pp. 200-208, Academic Press, Oxford

      ,
      • Neefjes J.
      • Jongsma M.L.M.
      • Paul P.
      • Bakke O.
      Towards a systems understanding of MHC class I and MHC class II antigen presentation.
      ,
      • Rock K.L.
      • Reits E.
      • Neefjes J.
      Present yourself! By MHC class I and MHC class II molecules.
      ). The presentation of these complexes of peptide–HLA (pHLA) allows the adaptive immune system to scrutinize protein expression within the cell and recognize foreign pathogenic agents associated with infected cells as well as aberrant antigen expression associated with cancerous cells (
      • Neefjes J.
      • Jongsma M.L.M.
      • Paul P.
      • Bakke O.
      Towards a systems understanding of MHC class I and MHC class II antigen presentation.
      ,
      • Rock K.L.
      • Reits E.
      • Neefjes J.
      Present yourself! By MHC class I and MHC class II molecules.
      ,
      • Guermonprez P.
      • Valladeau J.
      • Zitvogel L.
      • Théry C.
      • Amigorena S.
      Antigen Presentation and T Cell Stimulation by Dendritic Cells.
      ,
      • Blum J.S.
      • Wearsch P.A.
      • Cresswell P.
      Pathways of Antigen Processing.
      ). HLA class I (HLA-I) molecules generally bind and present short peptides (usually 8 to 12 amino acids in length) derived from the degradation of intracellular proteins by the proteasome. In contrast, HLA class II (HLA-II) molecules present longer peptides (13 to 25 amino acids in length) derived from the degradation of extracellular proteins in the endo-lysosomal compartments of the cell (
      • Neefjes J.
      • Jongsma M.L.M.
      • Paul P.
      • Bakke O.
      Towards a systems understanding of MHC class I and MHC class II antigen presentation.
      ,
      • Rock K.L.
      • Reits E.
      • Neefjes J.
      Present yourself! By MHC class I and MHC class II molecules.
      ). The immunopeptidome is the term used to describe the peptide ligands bound to HLA molecules on the surface of cells (

      Purcell, A. W., and Dudek, N. L. (2016) Repertoire of Classical MHC Class I and Class II Molecules. In: Ratcliffe, M. J. H., ed. Encyclopedia of Immunobiology, pp. 200-208, Academic Press, Oxford

      ,
      • Neefjes J.
      • Jongsma M.L.M.
      • Paul P.
      • Bakke O.
      Towards a systems understanding of MHC class I and MHC class II antigen presentation.
      ,
      • Rock K.L.
      • Reits E.
      • Neefjes J.
      Present yourself! By MHC class I and MHC class II molecules.
      ,
      • Caron E.
      • Kowalewski D.J.
      • Chiek Koh C.
      • Sturm T.
      • Schuster H.
      • Aebersold R.
      Analysis of Major Histocompatibility Complex (MHC) Immunopeptidomes Using Mass Spectrometry.
      ,

      Dudek, N. L., Croft, N. P., Schittenhelm, R. B., Ramarathinam, S. H., and Purcell, A. W. (2016) A Systems Approach to Understand Antigen Presentation and the Immune Response. In: Reinders, J., ed. Proteomics in Systems Biology: Methods and Protocols, pp. 189-209, Springer New York, New York, NY

      ), and immunopeptidomics refers to the use of mass spectrometry (MS) to identify and quantify immunopeptides (
      • Caron E.
      • Kowalewski D.J.
      • Chiek Koh C.
      • Sturm T.
      • Schuster H.
      • Aebersold R.
      Analysis of Major Histocompatibility Complex (MHC) Immunopeptidomes Using Mass Spectrometry.
      ,
      • Faridi P.
      • Purcell A.W.
      • Croft N.P.
      Immunopeptidomics We Need a Sniper Instead of a Shotgun.
      ,
      • Freudenmann L.K.
      • Marcu A.
      • Stevanović S.
      Mapping the tumour human leukocyte antigen (HLA) ligandome by mass spectrometry.
      ,
      • Ritz D.
      • Kinzi J.
      • Neri D.
      • Fugmann T.
      Data-Independent Acquisition of HLA Class I Peptidomes on the Q Exactive Mass Spectrometer Platform.
      ). In-depth knowledge of immunopeptidomes can assist in developing novel immunotherapies and vaccines (

      Dudek, N. L., Croft, N. P., Schittenhelm, R. B., Ramarathinam, S. H., and Purcell, A. W. (2016) A Systems Approach to Understand Antigen Presentation and the Immune Response. In: Reinders, J., ed. Proteomics in Systems Biology: Methods and Protocols, pp. 189-209, Springer New York, New York, NY

      ,
      • Schumacher F.-R.
      • Delamarre L.
      • Jhunjhunwala S.
      • Modrusan Z.
      • Phung Q.T.
      • Elias J.E.
      • Lill J.R.
      Building proteomic tool boxes to monitor MHC class I and class II peptides.
      ,
      • Purcell A.W.
      • Ramarathinam S.H.
      • Ternette N.
      Mass spectrometry–based identification of MHC-bound peptides for immunopeptidomics.
      ,
      • Croft N.P.
      Peptide Presentation to T Cells: Solving the Immunogenic Puzzle.
      ).
      Modern MS-based immunopeptidomics approaches can yield thousands of HLA-bound peptide identifications per sample (
      • Caron E.
      • Kowalewski D.J.
      • Chiek Koh C.
      • Sturm T.
      • Schuster H.
      • Aebersold R.
      Analysis of Major Histocompatibility Complex (MHC) Immunopeptidomes Using Mass Spectrometry.
      ,

      Dudek, N. L., Croft, N. P., Schittenhelm, R. B., Ramarathinam, S. H., and Purcell, A. W. (2016) A Systems Approach to Understand Antigen Presentation and the Immune Response. In: Reinders, J., ed. Proteomics in Systems Biology: Methods and Protocols, pp. 189-209, Springer New York, New York, NY

      ,
      • Purcell A.W.
      • Ramarathinam S.H.
      • Ternette N.
      Mass spectrometry–based identification of MHC-bound peptides for immunopeptidomics.
      ,
      • Bassani-Sternberg M.
      • Bräunlein E.
      • Klar R.
      • Engleitner T.
      • Sinitcyn P.
      • Audehm S.
      • Straub M.
      • Weber J.
      • Slotta-Huspenina J.
      • Specht K.
      • Martignoni M.E.
      • Werner A.
      • Hein R.
      • H. Busch D.
      • Peschel C.
      • Rad R.
      • Cox J.
      • Mann M.
      • Krackhardt A.M.
      Direct identification of clinically relevant neoepitopes presented on native human melanoma tissue by mass spectrometry.
      ,
      • Hunt D.F.
      • Henderson R.A.
      • Shabanowitz J.
      • Sakaguchi K.
      • Michel H.
      • Sevilir N.
      • Cox A.L.
      • Appella E.
      • Engelhard V.H.
      Characterization of peptides bound to the class I MHC molecule HLA-A2.1 by mass spectrometry.
      ). A hallmark of these studies is the isolation of pHLA complexes following mild lysis of cells by immunoprecipitation with immobilized pHLA-specific antibodies. This immunoprecipitation step is followed by peptide elution, fractionation, and sequencing through liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) (
      • Purcell A.W.
      • Ramarathinam S.H.
      • Ternette N.
      Mass spectrometry–based identification of MHC-bound peptides for immunopeptidomics.
      ).
      Data-independent acquisition (DIA) (
      • Gillet L.C.
      • Navarro P.
      • Tate S.
      • Röst H.
      • Selevsek N.
      • Reiter L.
      • Bonner R.
      • Aebersold R.
      Targeted Data Extraction of the MS/MS Spectra Generated by Data-independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis.
      ) is an MS/MS sequential data acquisition strategy providing deep proteome-wide profiling (
      • Gillet L.C.
      • Navarro P.
      • Tate S.
      • Röst H.
      • Selevsek N.
      • Reiter L.
      • Bonner R.
      • Aebersold R.
      Targeted Data Extraction of the MS/MS Spectra Generated by Data-independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis.
      ,
      • Doerr A.
      DIA mass spectrometry.
      ,
      • Navarro P.
      • Kuharev J.
      • Gillet L.C.
      • Bernhardt O.M.
      • MacLean B.
      • Röst H.L.
      • Tate S.A.
      • Tsou C.-C.
      • Reiter L.
      • Distler U.
      • Rosenberger G.
      • Perez-Riverol Y.
      • Nesvizhskii A.I.
      • Aebersold R.
      • Tenzer S.
      A multicenter study benchmarks software tools for label-free proteome quantification.
      ,
      • Venable J.D.
      • Dong M.-Q.
      • Wohlschlegel J.
      • Dillin A.
      • Yates J.R.
      Automated approach for quantitative analysis of complex peptide mixtures from tandem mass spectra.
      ) that was first applied to bottom-up proteomics (
      • Venable J.D.
      • Dong M.-Q.
      • Wohlschlegel J.
      • Dillin A.
      • Yates J.R.
      Automated approach for quantitative analysis of complex peptide mixtures from tandem mass spectra.
      ,
      • Aebersold R.
      • Mann M.
      Mass spectrometry-based proteomics.
      ,
      • Aebersold R.
      • Mann M.
      Mass-spectrometric exploration of proteome structure and function.
      ). DIA avoids the data completeness issues associated with precursor sampling in LC-MS/MS by fragmenting, in parallel, all precursor ions within a predefined mass-to-charge (m/z) isolation window. Indeed, DIA produces MS2 spectra by systematic fragmentation, identifying precursors independently of their compositional characteristics (
      • Doerr A.
      DIA mass spectrometry.
      ) rather than by fragmenting only the most abundant precursors, as is typically done in data-dependent acquisition (DDA). Because DDA is subject to stochastic sampling and duty cycle restrictions, DIA more thoroughly samples regions of the m/z scan space that may be missed in a DDA run (
      • Gillet L.C.
      • Navarro P.
      • Tate S.
      • Röst H.
      • Selevsek N.
      • Reiter L.
      • Bonner R.
      • Aebersold R.
      Targeted Data Extraction of the MS/MS Spectra Generated by Data-independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis.
      ,
      • Ludwig C.
      • Gillet L.
      • Rosenberger G.
      • Amon S.
      • Collins B.C.
      • Aebersold R.
      Data-independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial.
      ).
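      The contrast between the two acquisition strategies can be illustrated with a minimal sketch. It is not taken from any of the benchmarked tools: the precursor values, the window width, and the function names (`dda_top_n`, `dia_windows`, `dia_assign`) are all hypothetical and chosen only for illustration. DDA is modeled as selecting the N most intense precursors, while DIA is modeled as fragmenting every precursor that falls inside a set of contiguous isolation windows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    mz: float          # mass-to-charge ratio
    intensity: float   # arbitrary units


def dda_top_n(precursors, n):
    """DDA-style selection: keep only the n most intense precursors."""
    return sorted(precursors, key=lambda p: p.intensity, reverse=True)[:n]


def dia_windows(mz_start, mz_end, width):
    """Build contiguous fixed-width isolation windows covering [mz_start, mz_end)."""
    windows = []
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)
        windows.append((lower, upper))
        lower = upper
    return windows


def dia_assign(precursors, windows):
    """DIA-style sampling: every precursor inside a window is co-fragmented."""
    return {w: [p for p in precursors if w[0] <= p.mz < w[1]] for w in windows}


# Hypothetical precursors: two abundant, two low-abundance.
precursors = [
    Precursor(410.2, 9.0e6),
    Precursor(415.7, 2.0e4),
    Precursor(512.3, 5.0e6),
    Precursor(640.9, 1.0e4),
]

selected_dda = dda_top_n(precursors, n=2)        # low-abundance ions are skipped
windows = dia_windows(400.0, 700.0, 100.0)       # assumed 100 m/z wide windows
fragmented_dia = dia_assign(precursors, windows)
covered = sum(len(v) for v in fragmented_dia.values())
```

      In this toy example the DDA selection keeps only the two abundant precursors, whereas all four fall inside a DIA window. The first window also contains two co-isolated precursors, which is the origin of the mixed spectra that library-based tools must deconvolve.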
      DIA is increasingly being utilized for proteomics applications using customized or pan spectral libraries to deconvolve the data and generate peptide identifications. The application of DIA to immunopeptidomics has received considerably less attention (
      • Ritz D.
      • Kinzi J.
      • Neri D.
      • Fugmann T.
      Data-Independent Acquisition of HLA Class I Peptidomes on the Q Exactive Mass Spectrometer Platform.
      ,
      • Caron E.
      • Espona L.
      • Kowalewski D.J.
      • Schuster H.
      • Ternette N.
      • Alpízar A.
      • Schittenhelm R.B.
      • Ramarathinam S.H.
      • Lindestam Arlehamn C.S.
      • Chiek Koh C.
      • Gillet L.C.
      • Rabsteyn A.
      • Navarro P.
      • Kim S.
      • Lam H.
      • Sturm T.
      • Marcilla M.
      • Sette A.
      • Campbell D.S.
      • Deutsch E.W.
      • Moritz R.L.
      • Purcell A.W.
      • Rammensee H.-G.
      • Stevanovic S.
      • Aebersold R.
      An open-source computational and data resource to analyze digital maps of immunopeptidomes.
      ,
      • Schittenhelm R.B.
      • Sivaneswaran S.
      • Lim Kam Sian T.C.C.
      • Croft N.P.
      • Purcell A.W.
      Human Leukocyte Antigen (HLA) B27 Allotype-Specific Binding and Candidate Arthritogenic Peptides Revealed through Heuristic Clustering of Data-independent Acquisition Mass Spectrometry (DIA-MS) Data.
      ,
      • Shan P.
      • Tran H.
      Integrating Database Search and De Novo Sequencing for Immunopeptidomics with DIA Approach.
      ,
      • Pak H.
      • Michaux J.
      • Huber F.
      • Chong C.
      • Stevenson B.J.
      • Müller M.
      • Coukos G.
      • Bassani-Sternberg M.
      Sensitive Immunopeptidomics by Leveraging Available Large-Scale Multi-HLA Spectral Libraries, Data-Independent Acquisition, and MS/MS Prediction.
      ,

      Kovalchik, K., Hamelin, D., and Caron, E. (2022) Generation of HLA Allele-Specific Spectral Libraries to Identify and Quantify Immunopeptidomes by SWATH/DIA-MS. In: Corrales, F. J., Paradela, A., and Marcilla, M., eds. Clinical Proteomics: Methods and Protocols, pp. 137-147, Springer US, New York, NY

      ) due to the difficulties associated with generating HLA allele-specific spectral libraries and inherent challenges associated with the nature of the peptide ligands being analyzed in such studies (
      • Tsiatsiani L.
      • Heck A.J.R.
      Proteomics beyond trypsin.
      ,
      • Tiwary S.
      • Levy R.
      • Gutenbrunner P.
      • Salinas Soto F.
      • Palaniappan K.K.
      • Deming L.
      • Berndl M.
      • Brant A.
      • Cimermancic P.
      • Cox J.
      High-quality MS/MS spectrum prediction for data-dependent and data-independent acquisition data analysis.
      ,
      • Schuster H.
      • Shao W.
      • Weiss T.
      • Pedrioli P.G.A.
      • Roth P.
      • Weller M.
      • Campbell D.S.
      • Deutsch E.W.
      • Moritz R.L.
      • Planz O.
      • Rammensee H.-G.
      • Aebersold R.
      • Caron E.
      A tissue-based draft map of the murine MHC class I immunopeptidome.
      ). A further consideration for DIA in this context is that wide isolation windows produce a greater proportion of mixed spectra, which are especially challenging to resolve when data must be searched with “no enzyme” digestion specificity and a correspondingly larger database search space.
      DIA software tools are categorized functionally as library-based (targeted proteome analysis) and library-free (untargeted proteome analysis) approaches (
      • Faridi P.
      • Purcell A.W.
      • Croft N.P.
      Immunopeptidomics We Need a Sniper Instead of a Shotgun.
      ,
      • Ludwig C.
      • Gillet L.
      • Rosenberger G.
      • Amon S.
      • Collins B.C.
      • Aebersold R.
      Data-independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial.
      ,
      • Caron E.
      • Espona L.
      • Kowalewski D.J.
      • Schuster H.
      • Ternette N.
      • Alpízar A.
      • Schittenhelm R.B.
      • Ramarathinam S.H.
      • Lindestam Arlehamn C.S.
      • Chiek Koh C.
      • Gillet L.C.
      • Rabsteyn A.
      • Navarro P.
      • Kim S.
      • Lam H.
      • Sturm T.
      • Marcilla M.
      • Sette A.
      • Campbell D.S.
      • Deutsch E.W.
      • Moritz R.L.
      • Purcell A.W.
      • Rammensee H.-G.
      • Stevanovic S.
      • Aebersold R.
      An open-source computational and data resource to analyze digital maps of immunopeptidomes.
      ,
      • Tiwary S.
      • Levy R.
      • Gutenbrunner P.
      • Salinas Soto F.
      • Palaniappan K.K.
      • Deming L.
      • Berndl M.
      • Brant A.
      • Cimermancic P.
      • Cox J.
      High-quality MS/MS spectrum prediction for data-dependent and data-independent acquisition data analysis.
      ). The library-based (“peptide-centric”) method queries the acquired MS spectra for each library peptide, matching the observed signals against the fragment ion patterns recorded in a previously acquired spectral library. This approach is supported by a target-decoy search strategy to distinguish true from false matches against the spectral library (
      • Schuster H.
      • Shao W.
      • Weiss T.
      • Pedrioli P.G.A.
      • Roth P.
      • Weller M.
      • Campbell D.S.
      • Deutsch E.W.
      • Moritz R.L.
      • Planz O.
      • Rammensee H.-G.
      • Aebersold R.
      • Caron E.
      A tissue-based draft map of the murine MHC class I immunopeptidome.
      ). Although several DIA data processing tools are currently available, the immunopeptidomics community has no consensus on the most suitable pipeline(s) for in-depth and accurate identification of HLA-bound peptides. Herein, we performed a benchmarking study on four of the most widely used library-based DIA data processing pipelines, namely Skyline (
      • MacLean B.
      • Tomazela D.M.
      • Shulman N.
      • Chambers M.
      • Finney G.L.
      • Frewen B.
      • Kern R.
      • Tabb D.L.
      • Liebler D.C.
      • MacCoss M.J.
      Skyline: an open source document editor for creating and analyzing targeted proteomics experiments.
      ,
      • Pino L.K.
      • Searle B.C.
      • Bollinger J.G.
      • Nunn B.
      • MacLean B.
      • MacCoss M.J.
      The Skyline ecosystem: Informatics for quantitative mass spectrometry proteomics.
      ), Spectronaut (
      • Bruderer R.
      • Bernhardt O.M.
      • Gandhi T.
      • Miladinović S.M.
      • Cheng L.-Y.
      • Messner S.
      • Ehrenberger T.
      • Zanotelli V.
      • Butscheid Y.
      • Escher C.
      • Vitek O.
      • Rinner O.
      • Reiter L.
      Extending the Limits of Quantitative Proteome Profiling with Data-Independent Acquisition and Application to Acetaminophen-Treated Three-Dimensional Liver Microtissues.
      ), DIA-NN (
      • Demichev V.
      • Messner C.B.
      • Vernardis S.I.
      • Lilley K.S.
      • Ralser M.
      DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput.
      ), and PEAKS (
      • Tran N.H.
      • Qiao R.
      • Xin L.
      • Chen X.
      • Liu C.
      • Zhang X.
      • Shan B.
      • Ghodsi A.
      • Li M.
      Deep learning enables de novo peptide sequencing from data-independent-acquisition mass spectrometry.
      ).

      Experimental Procedures

      Experimental Design and Statistical Rationale

      We implemented standard immunoprecipitation (IP) protocols to prepare biological replicates from lysed cells to acquire DDA and DIA data (
      • Purcell A.W.
      • Ramarathinam S.H.
      • Ternette N.
      Mass spectrometry–based identification of MHC-bound peptides for immunopeptidomics.
      ,
      • Pandey K.
      • Ramarathinam S.H.
      • Purcell A.W.
      Isolation of HLA Bound Peptides by Immunoaffinity Capture and Identification by Mass Spectrometry.
      ). First, we used a large-scale IP protocol to acquire DDA data and generate spectral libraries of HLA-bound peptides for each HLA-I allotype (HLA-A*02:01 and -B*57:01). DIA datasets were acquired on HLA-bound peptides isolated and purified from smaller samples using an optimized small-scale IP and peptide elution protocol. The resulting DIA datasets were used to benchmark the four software tools across different metrics. The DIA datasets contain HLA-I peptides purified and isolated from (1) C1R-B*57:01 (three biological replicates) and (2) C1R-A*02:01 cell lines (four biological replicates). (3) The third DIA dataset was acquired from a titration experiment with samples containing different concentrations of the pHLA-I isolated from C1R-B*57:01 cells. Several approaches were used to analyze the performance of each pipeline, including Venn diagrams, UpSet plots, and sequence motif analysis, alongside standard statistical tests. We used multiple pairwise comparisons derived from two-way ANOVA, together with standard t-tests, to compare the software tools in terms of immunopeptidome coverage and empirical false-positive rates in identifying accurate HLA binders.
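      The overlap analyses mentioned above reduce to set operations on the peptide identifications reported by each tool. The following is a minimal sketch under stated assumptions, not the analysis code used in this study: the peptide identifiers are placeholders, and the helper names (`pairwise_overlap`, `upset_counts`, `consensus`) are hypothetical. It counts pairwise overlaps (as in a Venn diagram), counts peptides per exact combination of tools (the bars of an UpSet plot), and extracts a consensus set reported by at least two tools.

```python
from collections import Counter
from itertools import combinations


def pairwise_overlap(ids_by_tool):
    """Number of shared peptides for every pair of tools."""
    tools = sorted(ids_by_tool)
    return {(a, b): len(ids_by_tool[a] & ids_by_tool[b])
            for a, b in combinations(tools, 2)}


def upset_counts(ids_by_tool):
    """Count peptides per exact combination of tools that reported them."""
    membership = {}
    for tool, ids in ids_by_tool.items():
        for peptide in ids:
            membership.setdefault(peptide, set()).add(tool)
    return Counter(frozenset(tools) for tools in membership.values())


def consensus(ids_by_tool, min_tools=2):
    """Peptides reported by at least min_tools tools."""
    counts = Counter(p for ids in ids_by_tool.values() for p in ids)
    return {p for p, n in counts.items() if n >= min_tools}


# Placeholder identifiers standing in for peptide sequences (hypothetical data).
ids_by_tool = {
    "Skyline": {"P1", "P2"},
    "Spectronaut": {"P1", "P2", "P3"},
    "DIA-NN": {"P1", "P3", "P4"},
    "PEAKS": {"P1", "P4", "P5"},
}

overlaps = pairwise_overlap(ids_by_tool)
combos = upset_counts(ids_by_tool)
shared_by_two = consensus(ids_by_tool, min_tools=2)
```

      Here `shared_by_two` excludes the single peptide reported by one tool only, which mirrors the idea of requiring agreement between at least two complementary pipelines.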

      Cell culture

      B-lymphoblastoid C1R cells express very low levels of endogenous HLA-A and HLA-B and, once transfected with an HLA allele of interest, serve as well-established mono-allelic cell lines (
      • Jappe E.C.
      • Garde C.
      • Ramarathinam S.H.
      • Passantino E.
      • Illing P.T.
      • Mifsud N.A.
      • Trolle T.
      • Kringelum J.V.
      • Croft N.P.
      • Purcell A.W.
      Thermostability profiling of MHC-bound peptides: a new dimension in immunopeptidomics and aid for immunotherapy design.
      ). The C1R cells were transfected with either HLA-B*57:01 or HLA-A*02:01 and maintained in RF-10 medium, based on RPMI-1640 medium (Gibco™, Thermo Fisher Scientific, Waltham, MA). This medium was supplemented with 10% fetal calf serum (FCS; Sigma-Aldrich), 50 μM β-mercaptoethanol (Sigma-Aldrich), 1% (v/v) mM non-essential amino acids (Gibco™), 5 mM HEPES (Sigma-Aldrich, St Louis, MO), 2 mM L-glutamine (MP Biomedicals), and 50 μg mL⁻¹ Penicillin-Streptomycin (Pen-Strep, to prevent bacterial contamination of the cell culture). Hygromycin (Invitrogen; 0.3 mg mL⁻¹) was added to support stable expression of the transfected HLA-I allotypes. The expression of the HLA-I allotypes was verified by flow cytometry. In addition, for C1R-B*57:01 cultures only, 0.5 mg mL⁻¹ Geneticin™ (G418 Sulfate; Gibco™) was added to select for and maintain HLA-B*57:01 expression. The cells were expanded in flasks and roller bottles incubated at 37°C in a 5% CO₂ atmosphere. The cells were counted, harvested by centrifugation (3724 × g at 4°C for 15 min), washed with phosphate-buffered saline (PBS), pelleted, snap-frozen in liquid nitrogen, and then stored at -80°C.

      Large-scale isolation and elution of HLA peptides

      We utilized a well-established IP protocol to isolate and purify peptides bound to either HLA-B*57:01 or HLA-A*02:01 expressed individually on the surface of the C1R transfectants (
      • Purcell A.W.
      • Ramarathinam S.H.
      • Ternette N.
      Mass spectrometry–based identification of MHC-bound peptides for immunopeptidomics.
      ). Briefly, frozen pellets of >5 × 10⁸ cells were first pulverized by liquid nitrogen-cooled cryomilling (Retsch® MM 400 mixer mill) and then resuspended in lysis buffer (50 mM Tris pH 8 (Sigma-Aldrich), 150 mM NaCl (Merck-Millipore, Darmstadt, Germany), 0.5% IGEPAL 630 (Sigma-Aldrich), and one tablet of cOmplete™ Protease Inhibitor Cocktail (Roche Applied Science®)) and incubated on a roller for 45-60 min at 4°C. The lysate was pre-cleared by centrifugation at 2,000 × g and 4°C for 10 min, followed by ultracentrifugation of the supernatant at 100,000 × g and 4°C for 45 min. HLA-I-bound peptides were then purified from the lysates by immunoaffinity chromatography using antibodies (anti-HLA-A2 antibody BB7.2 [ATCC HB-82] or anti-pan-HLA class I monoclonal antibody W6/32 [ATCC HB-95]) that had been cross-linked to protein A sepharose (antibody/protein A; 10 mg mL⁻¹) as described (
      • Purcell A.W.
      • Ramarathinam S.H.
      • Ternette N.
      Mass spectrometry–based identification of MHC-bound peptides for immunopeptidomics.
      ).
      The pHLA complexes captured on the immunoaffinity column were eluted and dissociated in 10% acetic acid. The resultant eluate (containing peptides, HLA class I heavy chains, and β2-microglobulin (β2m)) was fractionated using a C18 reversed-phase (RP) end-capped high-performance liquid chromatography (HPLC) column (4.6 mm internal diameter × 100 mm long; Chromolith® SpeedROD; Merck-Millipore) on either an Ettan™ or an ÄKTAmicro™ HPLC System (GE Healthcare, UK), controlled by UNICORN™ ver. 5.11 software. The mobile phases were buffer A (0.1% trifluoroacetic acid (TFA) [Thermo Fisher Scientific, San Jose, CA]) and buffer B (80% acetonitrile (ACN) [Thermo Fisher Scientific, Waltham, MA] and 0.1% TFA). The collected fractions were combined into nine peptide pools and concentrated by vacuum centrifugation (CentriVap Benchtop Vacuum Concentrator [LABCONCO®]). The eluted peptides were resuspended in 15 μL of 0.1% (v/v) formic acid (FA) in water (Optima™ LC/MS Grade). We spiked 200 fmol of iRT peptides into each sample as an internal retention time (RT) standard for RT normalization (
      • Escher C.
      • Reiter L.
      • MacLean B.
      • Ossola R.
      • Herzog F.
      • Chilton J.
      • MacCoss M.J.
      • Rinner O.
      Using iRT, a normalized retention time for more targeted measurement of peptides.
      ). The samples were stored at -80°C until LC-MS/MS analysis.
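      RT normalization with spiked-in standards is commonly done by fitting a linear map from observed retention times to the reference iRT scale. The sketch below illustrates that idea with an ordinary least-squares fit. It is an assumption about the general approach, not the procedure implemented by any specific tool, and the retention times and reference values are invented for illustration (they are not the values of a commercial iRT kit).

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("observed retention times must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def to_irt(observed_rt, slope, intercept):
    """Convert an observed retention time (min) to the normalized iRT scale."""
    return slope * observed_rt + intercept


# Hypothetical standards: observed RT in minutes and their reference iRT values.
observed_rt_min = [10.0, 20.0, 30.0, 40.0]
reference_irt = [0.0, 25.0, 50.0, 75.0]

slope, intercept = fit_line(observed_rt_min, reference_irt)
peptide_irt = to_irt(24.0, slope, intercept)  # a peptide eluting at 24 min
```

      Once fitted per run, the same slope and intercept can be applied to every identified peptide, so that retention times from different runs share one scale.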

      LC-MS/MS analysis to identify HLA-bound peptides for spectral library generation

      C1R-B*57:01 DDA data – An Orbitrap Fusion™ Tribrid™ Mass Spectrometer (Thermo Scientific™) coupled to an UltiMate™ 3000 RSLCnano UHPLC system (Thermo Scientific™) was used to acquire MS/MS spectra. The UHPLC system was equipped with a Thermo Scientific™ Dionex™ UltiMate™ 3000 RS Autosampler. 6 μL of each sample fraction was loaded, at a flow rate of 15 μL min⁻¹, onto an Acclaim™ PepMap™ RSLC C18 analytical column (75 μm internal diameter × 50 cm length, nanoViper, particle size 2 μm, pore size 100 Å [Thermo Scientific™, Waltham, MA]) through a Thermo Scientific™ Acclaim™ PepMap™ 100 C18 LC Nano-Trap column (100 μm internal diameter × 200 mm length, particle size 5 μm, pore size 100 Å) with a flow rate of 250 nL min⁻¹. The mass spectrometer was operated in DDA mode with higher-energy collisional dissociation (HCD) fragmentation. The remaining parameters were set as follows: precursor charge states of 2+ to 6+, full-scan MS1 range of 375–1800 m/z at 120,000 resolution with an automatic gain control (AGC) target of 2 × 10⁵ ions, dynamic exclusion of 15 s, 20 precursors selected for MS/MS per cycle, an isolation width of 1.0 Da, C-trap loaded with a target of 200,000 ions, Orbitrap fragment analysis resolution of 30,000, HCD normalized collision energy of 32, and accumulation times of 200 ms and 120 ms for MS1 and MS2, respectively.

      DDA database search to generate spectral library

      MS/MS spectra from DDA data matched with HLA-bound peptide sequences (peptide-spectrum matches (PSMs)) were used to generate spectral libraries. This approach provides a sizeable peptide repertoire containing pHLA targets for spectral library-based DIA data searches. PEAKS® Studio Xpro ver 10.6 software (Bioinformatics Solutions Inc, Waterloo, ON, Canada) was used to process and search the LC−MS/MS DDA data against the human proteome database (UniProtKB/SwissProt v.26072021 UP000005640; 20,375 entries) with a contaminant database of iRT peptide sequences. The search parameters were set as follows: instrument Orbitrap (Orbi-Orbi), HCD fragmentation, no enzyme (unspecific digestion), and precursor and fragment mass error tolerances of 10 ppm and 0.02 Da, respectively. Variable post-translational modifications (PTMs) were set to oxidation (M [+15.99]) and deamidation (NQ [+0.98]), with a maximum of three PTMs per peptide. The false discovery rate (FDR) was set to 1% using a target-decoy algorithm to identify peptides confidently. The decoy sequences (the same size as the protein database) were generated using the PEAKS decoy-fusion method (
      • Tran N.H.
      • Qiao R.
      • Xin L.
      • Chen X.
      • Liu C.
      • Zhang X.
      • Shan B.
      • Ghodsi A.
      • Li M.
      Deep learning enables de novo peptide sequencing from data-independent-acquisition mass spectrometry.
      ).

      Generation of a spectral library containing HLA-A*02:01 and HLA-B*57:01 bound peptides

      We generated a hybrid spectral library by searching the combined DDA MS data of C1R-B*57:01 and C1R-A*02:01 immunopeptidomes using PEAKS Xpro software. First, we used this hybrid spectral library to search the biological replicates of C1R-B*57:01 and C1R-A*02:01 DIA data. Afterward, we purged the results of the known C1R background, mainly containing HLA-C*04:01 binders (i.e., those peptides identified from untransfected C1R cells), using the experimental peptide data from C1R parental cells and the joint sequence IDs between two individual library assays (C1R-B*57:01 and C1R-A*02:01) (
      • Pavlos R.
      • McKinnon E.J.
      • Ostrov D.A.
      • Peters B.
      • Buus S.
      • Koelle D.
      • Chopra A.
      • Schutte R.
      • Rive C.
      • Redwood A.
      • Restrepo S.
      • Bracey A.
      • Kaever T.
      • Myers P.
      • Speers E.
      • Malaker S.A.
      • Shabanowitz J.
      • Jing Y.
      • Gaudieri S.
      • Hunt D.F.
      • Carrington M.
      • Haas D.W.
      • Mallal S.
      • Phillips E.J.
      Shared peptide binding of HLA Class I and II alleles associate with cutaneous nevirapine hypersensitivity and identify novel risk alleles.
      ,
      • Gfeller D.
      • Bassani-Sternberg M.
      Predicting Antigen Presentation—What Could We Learn From a Million Peptides?.
      ,
      • Nielsen M.
      • Lund O.
      • Buus S.
      • Lundegaard C.
      MHC class II epitope predictive algorithms.
      ). The refinement of the joint IDs was implemented because the common peptides were considered the background of HLA-C*04:01-bound peptides. This filtered hybrid spectral library contained 30,839 unique peptides (51.3% HLA-B*57:01 binders and 48.7% HLA-A*02:01 binders).
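      The background refinement described above can be illustrated with plain set operations. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name, the peptide labels, and the choice to treat both the parental C1R peptides and the joint IDs of the two libraries as background are illustrative.

```python
# Minimal sketch (assumption): remove background peptides from two
# allotype-specific libraries before combining them into a hybrid library.
# Peptide labels are placeholders, not real sequences from the study.

def filter_hybrid_library(lib_b57, lib_a02, parental_background):
    """Return (B*57:01-specific, A*02:01-specific) peptide sets."""
    joint_ids = lib_b57 & lib_a02                  # peptides seen in both libraries
    background = joint_ids | parental_background   # both treated as C1R background
    return lib_b57 - background, lib_a02 - background

lib_b57 = {"PEP_B1", "PEP_B2", "PEP_SHARED"}
lib_a02 = {"PEP_A1", "PEP_A2", "PEP_SHARED", "PEP_PARENTAL"}
parental = {"PEP_PARENTAL"}

b57_specific, a02_specific = filter_hybrid_library(lib_b57, lib_a02, parental)
total = len(b57_specific) + len(a02_specific)
print(f"B*57:01 share: {100 * len(b57_specific) / total:.1f}%")
print(f"A*02:01 share: {100 * len(a02_specific) / total:.1f}%")
```

      With real peptide lists, the two printed shares would correspond to the allotype proportions reported for the filtered hybrid library.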

      Small-scale immunoprecipitation for pHLA quantification and DIA algorithm benchmarking

      A small-scale immunoprecipitation protocol (
      • Pandey K.
      • Ramarathinam S.H.
      • Purcell A.W.
      Isolation of HLA Bound Peptides by Immunoaffinity Capture and Identification by Mass Spectrometry.
      ), modified from the formerly developed protocol (
      • Purcell A.W.
      • Ramarathinam S.H.
      • Ternette N.
      Mass spectrometry–based identification of MHC-bound peptides for immunopeptidomics.
      ), was utilized to purify and isolate HLA-bound peptides. The pellets of C1R-B*57:01 and C1R-A*02:01 cells were lysed and resuspended in the lysis buffer. The resultant lysates were incubated on a roller at 4 °C for 45 min and cleared by centrifugation at 3700 × g for 10 min at 4 °C. Cleared lysates were split into aliquots of 5×10⁷ cell equivalents and incubated with 250 μg of W6/32 antibody bound to protein A agarose per replicate. The resultant samples were incubated overnight at 4 °C for immunoprecipitation (n = 3 and n = 4 for C1R-B*57:01 and C1R-A*02:01 cells, respectively; where n denotes the number of biological replicates). Then, immunoprecipitated samples were centrifuged and filtered through MobiSpin columns (with 10 μm pore size filters [MoBiTec GmbH, Germany]), followed by washing with PBS. Afterward, 400 μL of 10% acetic acid was used to elute bound pHLA complexes. Eluates comprising class-I heavy chains, β2m, and W6/32 antibody were filtered using pre-washed 5 kDa centrifugal filter units (Ultrafree®-MC-PLHCC, Merck Millipore, Germany) assisted by centrifugation for 60 min at 16,000 × g to collect filtered samples in new Eppendorf tubes. The filters were washed with an additional 250 μL of 10% acetic acid to collect residual peptides. Eluted peptides were desalted using C18-Omix 100 μL pipette tips (Agilent©, OMIX A57003100) and eluted in a buffer of 30% ACN and 0.1% FA in water (LC-MS grade). Cleaned-up samples were concentrated by vacuum centrifugation. 200 fmol of the standard iRT peptides were spiked into the samples for retention time prediction and peak normalization for LC-MS/MS analysis. Subsequently, HLA-bound peptides were reconstituted in a 0.1% v/v FA buffer in water, sonicated for 10 min, centrifuged at 21,000 × g for 10 min, and stored at -80 °C before DIA-MS analysis.

      Titration of HLA-B*57:01 bound peptidome within a HeLa tryptic digest

      C1R-B*57:01 cells were lysed and aliquoted as triplicates of 1×10⁸ cell equivalents. The lysates were immunoprecipitated according to the small-scale protocol described above. The isolated HLA peptides in three replicates were aliquoted to generate proportional dilution factors (DFs) of 0 (no HLA peptides), 0.2, 0.4, 0.6, 0.8, and 1.0 (maximum) in six 1.5 mL Eppendorf tubes, each topped up to 21 μL with a buffer of 0.1% FA in LC-MS grade water. A constant amount of HeLa Protein digest (i.e., 32.4 ng) was spiked into all tubes as a fixed background of tryptic peptides. The immunopeptidomes were diluted so that samples contained cell equivalents of 0 (DF0), 1.08×10⁷ (DF0.2), 2.16×10⁷ (DF0.4), 3.24×10⁷ (DF0.6), 4.32×10⁷ (DF0.8), or 5.4×10⁷ cells (DF1.0). These six samples (with six DFs) were injected three times as technical replicates to acquire 18 DIA-MS data files.
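      The titration scheme can be checked with a few lines of arithmetic. The sketch below assumes that cell equivalents scale linearly with the dilution factor from the DF1.0 value stated in the text; the variable and function names are illustrative.

```python
# Minimal sketch: cell equivalents per dilution factor (DF), assuming linear
# scaling from the DF1.0 sample (5.4e7 cell equivalents, as stated in the text).
MAX_CELL_EQUIVALENTS = 5.4e7
DILUTION_FACTORS = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
TECHNICAL_REPLICATES = 3

def cell_equivalents(df, maximum=MAX_CELL_EQUIVALENTS):
    """Cell equivalents loaded for a given dilution factor."""
    return df * maximum

titration = {df: cell_equivalents(df) for df in DILUTION_FACTORS}
n_files = len(DILUTION_FACTORS) * TECHNICAL_REPLICATES  # one DIA file per injection

for df, cells in titration.items():
    print(f"DF{df:.1f}: {cells:.2e} cell equivalents")
print(f"DIA-MS files: {n_files}")
```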

      DIA-MS acquisition

      An Orbitrap Fusion™ Tribrid™ Mass Spectrometer (Thermo Scientific™) coupled to an UltiMate™ 3000 RSLCnano UHPLC system (Thermo Scientific™) was operated for LC-MS/MS DIA analysis of sample biological replicates prepared by small-scale immunoprecipitation as described above. A 6 μL injection was loaded onto the trap column and then eluted through the C18 column. DIA data acquisition parameters were as follows: MS1 resolution 120,000, 250 ms MS1 scans, MS1 scans across 375-1,575 m/z, 25 DIA scans with fixed precursor isolation windows of 24 Da (m/z) ranging from 375.4260 to 975.6879 m/z, MS/MS resolving power of 17,500, and 100 ms scans to acquire MS2 spectra.

      Library search-based DIA data analysis

      We employed four “peptide-centric” DIA software tools (i.e., Skyline ver. 21.1 (
      • MacLean B.
      • Tomazela D.M.
      • Shulman N.
      • Chambers M.
      • Finney G.L.
      • Frewen B.
      • Kern R.
      • Tabb D.L.
      • Liebler D.C.
      • MacCoss M.J.
      Skyline: an open source document editor for creating and analyzing targeted proteomics experiments.
      ,
      • Pino L.K.
      • Searle B.C.
      • Bollinger J.G.
      • Nunn B.
      • MacLean B.
      • MacCoss M.J.
      The Skyline ecosystem: Informatics for quantitative mass spectrometry proteomics.
      ), Spectronaut ver. 16 (
      • Bruderer R.
      • Bernhardt O.M.
      • Gandhi T.
      • Miladinović S.M.
      • Cheng L.-Y.
      • Messner S.
      • Ehrenberger T.
      • Zanotelli V.
      • Butscheid Y.
      • Escher C.
      • Vitek O.
      • Rinner O.
      • Reiter L.
      Extending the Limits of Quantitative Proteome Profiling with Data-Independent Acquisition and Application to Acetaminophen-Treated Three-Dimensional Liver Microtissues.
      ), DIA-NN ver. 1.8.0 (
      • Demichev V.
      • Messner C.B.
      • Vernardis S.I.
      • Lilley K.S.
      • Ralser M.
      DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput.
      ), and PEAKS Xpro ver. 10.6 (
      • Tran N.H.
      • Qiao R.
      • Xin L.
      • Chen X.
      • Liu C.
      • Zhang X.
      • Shan B.
      • Ghodsi A.
      • Li M.
      Deep learning enables de novo peptide sequencing from data-independent-acquisition mass spectrometry.
      )) to analyze DIA datasets at 1% FDR (at peptide level) using an extensive DDA spectral library previously generated from both cell lines and HeLa protein digest. The settings applied for each tool (e.g., mass and retention time tolerances) are detailed in the Supplemental Data.

      Statistics

      We utilized two-way ANOVA and Tukey’s multiple pairwise comparisons to assess the differences between immunopeptidome coverages provided by each tool (Fig. 1B/C) and to compare the number of HLA peptides identified by different pipelines for C1R-B*57:01 and C1R-A*02:01 DIA datasets against the hybrid spectral library assay containing HLA-B*57:01 and HLA-A*02:01 peptide binders (Fig. 4B/C). Moreover, the differences between the experimental FDR% errors for both DIA datasets corresponding to false-positive rates were assessed by two-way ANOVA and Tukey’s multiple pairwise comparisons (Fig. 4D/E). A p-value of ≤ 0.05 was considered statistically significant.

      Software tools for peptide sequence and statistical data analysis

      All statistical analyses were executed by GraphPad Prism v. 9.0.0 and MATLAB programming code routines (The MathWorks Inc., Natick, MA) v. R2021a. NetMHCpan (v. 4.1) (
      • Reynisson B.
      • Alvarez B.
      • Paul S.
      • Peters B.
      • Nielsen M.
      NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data.
      ) was used to determine allelic specificity according to peptide binding rank. The peptides were also segregated based on sequence features using GibbsCluster (v. 2.0) (
      • Andreatta M.
      • Alvarez B.
      • Nielsen M.
      GibbsCluster: unsupervised clustering and alignment of peptide sequences.
      ). Seq2Logo (
      • Thomsen M.C.F.
      • Nielsen M.
      Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion.
      ) was used to produce sequence motif analysis of the identified HLA-bound peptides. We used BioVenn (
      • Hulsen T.
      • de Vlieg J.
      • Alkema W.
      BioVenn – a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams.
      ) and InteractiVenn (
      • Heberle H.
      • Meirelles G.V.
      • da Silva F.R.
      • Telles G.P.
      • Minghim R.
      InteractiVenn: a web-based tool for the analysis of sets through Venn diagrams.
      ) to generate Venn overlap graphs. The UpSetR Shiny App was utilized to draw upset plots (
      • Lex A.
      • Gehlenborg N.
      • Strobelt H.
      • Vuillemot R.
      • Pfister H.
      UpSet: Visualization of Intersecting Sets.
      ). BioRender (BioRender.com) and Microsoft PowerPoint were used to design and create schematic figures and experimental workflows.

      Results

      We compared four commonly used spectral library-based DIA software tools for their ability to interrogate immunopeptidomics data using two exemplar mono-allelic datasets containing peptides isolated from HLA-A*02:01 and HLA-B*57:01 expressing C1R cells (Figure 1A). We utilized these mono-allelic datasets as they each provide a distinct diversity of peptide sequences, which aids in the DIA data processing for benchmarking tool performance compared to the greater complexity that is inherent to multi-allelic datasets. We implemented iRT peptides as the retention time standards for calibration by all DIA tools as a key parameter in spectral library-based DIA analysis. DIA-MS data were searched against a relevant spectral library containing highly curated HLA-A*02:01 and -B*57:01 peptides using Skyline, Spectronaut, DIA-NN, and PEAKS XPro.
      Figure 1. Overview of benchmarking study workflow for spectral library-based DIA-MS immunopeptidomics analysis. (A) HLA-bound peptides were eluted and immunoprecipitated from cells and subjected to DIA-MS, with the resulting data analyzed by Skyline, Spectronaut, DIA-NN, and PEAKS. Benchmark analyses were performed to assess these tools using statistical and validation procedures. The number of HLA-bound peptides identified by DIA pipelines: (B) C1R-B*57:01 data with three biological replicates and (C) C1R-A*02:01 data with four biological replicates. (D)-(E) The Upset diagrams show the overlap of the immunopeptidome coverage achieved by each DIA software tool; (D) C1R-B*57:01 data, and (E) C1R-A*02:01 data. (F) The percentage of peptides (qualified precursors with high fragmentation score of ≥0.5) identified by the tools in C1R-B*57:01 and C1R-A*02:01 DIA datasets. (G) The PSM scores (-logP) of the identified peptides by the tools derived from individual spectral libraries for both datasets. ns (p-value > 0.05); * (p-value ≤ 0.05); ** (p-value ≤ 0.01); *** (p-value ≤ 0.001); **** (p-value ≤ 0.0001). The error bars show mean ± CI95% for three (or four reps for C1R-A*02:01 data) biological replicates.

      Generation and verification of HLA-specific spectral libraries

      We generated large DDA datasets from C1R-B*57:01 and C1R-A*02:01 cell lines, from which allotype-specific spectral libraries were exported, encompassing 18,285 and 19,235 HLA peptides, respectively (Fig. S1(a)-(b) - Supplemental Data). The allelic specificity of the spectral library was verified by analyzing the sequence of HLA peptides using HLA binding prediction (NetMHCpan 4.1) and sequence clustering (GibbsCluster) to generate HLA-allotype-specific consensus binding motifs for the HLA-B*57:01 and HLA-A*02:01 peptide ligands. These motifs matched those reported in the literature and NetMHC Motif Viewer for these two HLA variants (Fig. S1(b) - Supplemental Data). The use of BB7.2 antibody in immunoprecipitation was another specificity filtration to isolate only HLA-A*02:01 ligandomes. Specifically, the HLA-B*57:01 binders show a predominance of S/T/A at position 2 (P2) and W/F/Y at the C-terminus. Conversely, the HLA-A*02:01 peptides contained a prevalence of L/M/I/V at P2 and V/L/I/A at the C-terminus. A subset of common peptides was identified as either HLA-C*04:01 or HLA-B*35:03 binders, as these allotypes are expressed at relatively low levels by the C1R parental cell line.

      Assessment of immunopeptidome coverage by four DIA tools

      We examined immunopeptidome coverage achieved by the four DIA data processing tools. Although all comparable, Spectronaut and DIA-NN demonstrated slightly higher identifications (3515 ± 439 and 3421 ± 368 HLA-bound peptides, respectively) of HLA-B*57:01 ligands in the C1R-B*57:01 dataset (n=3) compared to PEAKS and Skyline (3167 ± 294 and 3061 ± 172 HLA-bound peptides, respectively). Figure 1B summarises the number of HLA-bound peptides identified by each software for the C1R-B*57:01 data replicates, and detailed replicate-level information is available in Table S1 (Supplemental Data). A slightly different trend was observed from four replicates of DIA-MS data for peptides isolated from C1R-A*02:01, with PEAKS (2521 ± 250) and DIA-NN (2456 ± 236) identifying the highest number of peptides followed by Skyline (2204 ± 271) and Spectronaut (2055 ± 354) (Figure 1C; Table S2 (Supplemental Data)). Thus, PEAKS and DIA-NN led marginally compared to the other pipelines in the number of identified HLA peptides in both datasets based on the pairwise comparisons derived from two-way ANOVA analysis (Table S3 - Supplemental Data).
      All HLA-bound peptides identified in the biological replicates were merged into a single list to allow the comparison of identification yield. Figure 1D shows the overlap in peptide identifications across all software tools for the C1R-B*57:01 data. Herein, Skyline, Spectronaut, DIA-NN, and PEAKS exclusively identified 1038, 439, 285, and 310 HLA-I peptides, respectively (unique IDs based on sequence). A similar trend was observed in HLA-A*02:01-bound peptides (Figure 1E), where Skyline, Spectronaut, DIA-NN, and PEAKS exclusively identified 673, 109, 275, and 864 peptides, respectively. Of note, only 33.37% (2117 IDs) for B*57:01 and 34.39% (1908 IDs) for A*02:01 of the total peptides identified across all replicates were identified by all four tools. These results are also provided as Venn diagrams with more details in Figure S2 (Supplemental Data).
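      The exclusive and shared identification counts reported above follow from simple set algebra on the merged peptide lists. The sketch below shows one way to compute them; the function name and the placeholder peptide labels are assumptions for illustration and do not reproduce the study data.

```python
# Minimal sketch: peptides shared by all tools and peptides exclusive to each
# tool, computed from per-tool sets of sequence IDs (placeholder labels).

def overlap_summary(ids_by_tool):
    """Return (union, shared_by_all, exclusive_per_tool)."""
    union = set().union(*ids_by_tool.values())
    shared = set.intersection(*ids_by_tool.values())
    exclusive = {}
    for tool, ids in ids_by_tool.items():
        others = set().union(*(v for k, v in ids_by_tool.items() if k != tool))
        exclusive[tool] = ids - others
    return union, shared, exclusive

ids_by_tool = {
    "Skyline":     {"P1", "P2", "P3", "P7"},
    "Spectronaut": {"P1", "P2", "P4"},
    "DIA-NN":      {"P1", "P2", "P3", "P5"},
    "PEAKS":       {"P1", "P2", "P6"},
}

union, shared, exclusive = overlap_summary(ids_by_tool)
shared_pct = 100 * len(shared) / len(union)
print(f"Identified by all tools: {len(shared)} of {len(union)} ({shared_pct:.2f}%)")
for tool, ids in exclusive.items():
    print(f"{tool}: {len(ids)} exclusive IDs")
```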
      To independently assess the library-matching quality of identified HLA-bound peptides from DIA data by each tool, we calculated the spectral reliability of precursors through fragmentation scores for each matched precursor in the corresponding spectral libraries. If all possible fragmentation ions were detected, we assigned a value of 1 to that peptide identification. Afterward, we only considered the DIA identification results with precursor fragmentation scores of ≥0.5 (calculated from the spectral library). Figure S1C (Supplemental Data) shows the fragmentation score distribution calculated for each precursor. We found a higher percentage of peptides identified by Spectronaut (88.41 ± 0.65 %) among qualified precursors (with a fragmentation score of ≥ 0.5) in the C1R-B*57:01 DIA dataset. Moreover, >80% of peptides profiled by Skyline (85.19 ± 0.76 %), DIA-NN (84.65 ± 0.63 %), and PEAKS (82.74 ± 0.88 %) passed this examination. In the C1R-A*02:01 data, a higher proportion of peptides sequenced by DIA-NN (92.61 ± 0.65 %) and Spectronaut (92.39 ± 0.83 %) were matched with qualified spectra compared to PEAKS (89.31 ± 0.45 %) and Skyline (87.91 ± 0.44 %) (Figure 1F). Figure S3 (Supplemental Data) shows Venn diagram overlaps for sequence IDs from all biological replicates of both C1R-B*57:01 and C1R-A*02:01 datasets. Generally, the overlaps in both datasets demonstrate that DIA-NN and PEAKS perform comparably in the number of identified HLA-bound peptides. We compared PSM scores (-logP) of the identified peptides by the tools derived from individual spectral libraries for both datasets. The PSM scores reveal peptide-spectrum match quality used to validate the identified peptides and cut off the results by the target-decoy algorithm at FDR of interest. Figure 1G shows that Skyline could identify the peptides with different PSM scores compared to other tools.
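      The fragmentation score is described only qualitatively (a value of 1 when all possible fragment ions are detected). One plausible reading, sketched below, is the fraction of theoretical fragment ions present in the library spectrum. The restriction to singly charged b and y ions, the function name, and the example peptide are assumptions for illustration rather than the authors' stated definition.

```python
# Minimal sketch (assumption): fragmentation score = fraction of theoretical
# singly charged b/y ions that are present in the library spectrum.

def fragmentation_score(peptide, library_fragments):
    """Score in [0, 1]; 1.0 means every expected b/y ion was detected."""
    n = len(peptide)
    expected = {f"b{i}" for i in range(1, n)} | {f"y{i}" for i in range(1, n)}
    if not expected:
        return 0.0
    return len(expected & set(library_fragments)) / len(expected)

QUALIFY_THRESHOLD = 0.5  # cut-off used in the text

peptide = "AAAAAAAAW"                                 # hypothetical 9-mer
detected = {f"y{i}" for i in range(1, 9)}             # full y-series, no b ions
score = fragmentation_score(peptide, detected)
qualified = score >= QUALIFY_THRESHOLD
print(f"score = {score:.2f}, qualified = {qualified}")
```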

      Evaluation of the reliability of identified HLA peptides using precursor characteristics

      The library matching in DIA significantly depends on the ion matching with library spectra and retention time. Thus, we next examined retention time and m/z values (as the main characteristics of precursor ions) to check whether the same precursors were assigned identical sequences by each tool. For this aim, we considered only peptides commonly identified by all tools derived from the C1R-B*57:01 DIA data. We tested the correlations between the precursor RT and m/z values through pairwise comparisons (Figure 2A-B). The Pearson correlation coefficients (PCCs) demonstrate the correlation between each pair of tools (r ≥ 0.99), suggesting that all pipelines could isolate and assign identical sequences to commonly identified precursors.
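      The pairwise comparison above rests on the Pearson correlation coefficient between the values that two tools report for the same precursors. A minimal sketch follows; the retention times are invented for illustration and are not taken from the study.

```python
# Minimal sketch: Pearson correlation between retention times (min) reported
# by two tools for the same precursors. Values are illustrative only.
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

rt_tool_a = [12.1, 25.4, 38.0, 51.7]
rt_tool_b = [12.3, 25.2, 38.4, 51.5]
r = pearson(rt_tool_a, rt_tool_b)
print(f"r = {r:.4f}")
```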
      Figure 2. Pearson correlation plots between (A) RT and (B) m/z values for commonly identified precursors by different software tools. The diagonal bar plots show RT and m/z values distribution per tool.
      As expected, the peptide sequence motifs, peptide length distribution, precursor m/z distribution, and precursor charge distributions were consistent with the HLA-B*57:01 ligands when identified from merged replicates of the C1R-B*57:01 data (Supplemental Data – Figures S4 and S5). To determine whether the software tools assign different peptide sequences to the same precursors, we analyzed the peptides exclusively identified by each tool from the C1R-B*57:01 DIA data to assess their corresponding RT and m/z values as described for the commonly identified species. Only 4.39% of precursors were assigned to two or more unique peptide sequences using the different software algorithms (Figure S6 – Supplemental Data).

      Assessment of reproducibility in identifying immunopeptidomes across biological replicates

      Next, we analyzed the reproducibility of immunopeptidome identification across the DIA replicates. We acquired DDA data from the same samples (three biological replicates of HLA peptides purified from C1R-B*57:01 cells) to compare the reproducibility achieved from DDA and DIA (Figure 3A). Figure 3B shows a reproducibility (peptide identification overlap) of 47.90% for the DDA data analyzed by database search using PEAKS. As anticipated, reproducibility was higher with DIA, regardless of the software tool (Figure 3C-F). DIA-NN provided marginally the highest overlap (66.72%), followed by PEAKS (64.44%), Spectronaut (63.80%), and then Skyline (62.03%), demonstrating comparable reproducibility for all tools.
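      The text reports reproducibility as a percentage overlap across replicates without giving the formula. The sketch below assumes it is the share of peptides found in every replicate relative to all peptides found in any replicate; this definition, the function name, and the placeholder labels are assumptions.

```python
# Minimal sketch (assumption): reproducibility = peptides present in all
# replicates as a percentage of peptides present in at least one replicate.

def replicate_overlap_percent(replicates):
    """replicates: list of sets of peptide IDs, one set per replicate."""
    in_all = set.intersection(*replicates)
    in_any = set().union(*replicates)
    if not in_any:
        return 0.0
    return 100.0 * len(in_all) / len(in_any)

rep1 = {"P1", "P2", "P3", "P4"}
rep2 = {"P1", "P2", "P3", "P5"}
rep3 = {"P1", "P2", "P4", "P5"}
overlap = replicate_overlap_percent([rep1, rep2, rep3])
print(f"overlap = {overlap:.2f}%")
```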
      Figure 3. Examination of robustness in HLA-bound peptide identification across three biological replicates of C1R-B*57:01 data. (A) A brief workflow on MS data acquisition of the same sample replicates with DDA and DIA. (B) DDA data were analyzed by PEAKS, and then DIA data were analyzed by (C) Skyline, (D) Spectronaut, (E) DIA-NN, and (F) PEAKS.

      Evaluation of false positive and false negative discovery rates using a hybrid spectral library

      We assessed experimental false-positive (FP) rates and false-discovery rates (FDR) using a hybrid spectral library consisting of HLA-B*57:01 and HLA-A*02:01 peptides to serve as targets or decoys for each respective dataset (Figure 4A). This strategy enables a straightforward way to calculate external FP and FDR. In the C1R-B*57:01 dataset, PEAKS identified 2586 ± 327 peptides, while 1895 ± 301, 2195 ± 286, and 2287 ± 252 peptides were identified by Skyline, Spectronaut, and DIA-NN, respectively. In the C1R-A*02:01 dataset, PEAKS identified 2473 ± 227 peptides compared to 1374 ± 198, 1857 ± 260, and 1942 ± 269 detected by Skyline, Spectronaut, and DIA-NN, respectively (Table S4 – Supplemental Data). Subsequently, for the C1R-B*57:01 dataset, we counted HLA-B*57:01 peptides as true positives (TP) and HLA-A*02:01 binders as false positives (FP) to calculate an FDR (and the reverse for the C1R-A*02:01 dataset). We calculated the external FDR values from the TP and FP counts across the biological replicates for each pipeline (Figures 4B-E). FDR values were determined for Skyline (1.25 ± 0.23%), DIA-NN (1.45 ± 0.15%), PEAKS (1.75 ± 0.15%), and Spectronaut (2.03 ± 0.03%) for the C1R-B*57:01 dataset. For the C1R-A*02:01 data, the calculated FDR was somewhat higher, with values for Spectronaut (2.69 ± 0.57%), DIA-NN (3.60 ± 0.29%), PEAKS (4.25 ± 0.89%), and Skyline (4.44 ± 0.95%) (Table S5 – Supplemental Data).
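      The external FDR estimate can be sketched as follows. The formula FDR = FP / (TP + FP) is the conventional definition and is assumed here, since the text does not spell it out; the function name and the placeholder peptide labels are illustrative, and the two allotype-specific libraries are assumed to be disjoint after filtering.

```python
# Minimal sketch (assumption): external FDR% = FP / (TP + FP) * 100, where
# hits from the matching allotype library are TPs and hits from the other
# allotype library are FPs. Labels are placeholders, not study data.

def external_fdr_percent(identified, target_library, decoy_library):
    true_pos = len(identified & target_library)
    false_pos = len(identified & decoy_library)
    if true_pos + false_pos == 0:
        return 0.0
    return 100.0 * false_pos / (true_pos + false_pos)

b57_library = {f"B57_{i}" for i in range(200)}   # targets for C1R-B*57:01 data
a02_library = {f"A02_{i}" for i in range(200)}   # act as decoys for that data

identified = {f"B57_{i}" for i in range(98)} | {"A02_0", "A02_1"}
fdr = external_fdr_percent(identified, b57_library, a02_library)
print(f"external FDR = {fdr:.2f}%")
```

      Swapping the two library arguments gives the corresponding estimate for the C1R-A*02:01 data.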
      Figure 4. Empirical assessments of the accuracy in immunopeptide identification using a hybrid spectral library. (A) A workflow describes generating a hybrid spectral library and its use to estimate experimental FP rates in immunopeptide identification. (B)-(C) The number of HLA peptides identified by different pipelines over (B) three replicates of C1R-B*57:01 DIA data; and (C) four replicates of C1R-A*02:01 DIA data against the hybrid spectral library containing HLA-B*57:01 and HLA-A*02:01 peptide binders. We used corresponding individual spectral libraries as the references to recognize peptides bound to HLA-B*57:01 and HLA-A*02:01 as true- and false-positive rates. (D)-(E) The experimental FDR% errors were calculated corresponding with false-positive rates for; (D) C1R-B*57:01 DIA dataset; and (E) C1R-A*02:01 DIA dataset supported by the statistical two-way ANOVA and multiple pairwise comparisons. ns (p-value > 0.05); * (p-value ≤ 0.05); ** (p-value ≤ 0.01); *** (p-value ≤ 0.001); **** (p-value ≤ 0.0001). The error bar graphs show mean ± CI95% for three (or four reps for C1R-A*02:01 data) biological replicates.

      Examination of sensitivity, specificity, and correlations in quantifying HLA-bound peptides through titration data

      To further examine the sensitivity and specificity of peptide identification and quantitation, we generated a DIA dataset whereby HLA-B*57:01 ligands were titrated by spiking into a consistent quantity of HeLa Protein tryptic digest as a complex background. Figure 5A summarizes the experimental design. We analyzed the DIA data by searching against the C1R-B*57:01 spectral library. Figure 5B shows the number of HLA-bound peptides identified across the titration data points, demonstrating the anticipated increase in peptide IDs with the increasing amounts of spiked immunopeptides. Since there were no immunopeptides in the first data point (i.e., DF of 0), these samples were not expected to have any identified HLA-B*57:01 peptides. DIA-NN did not erroneously detect HLA-B*57:01 peptides in the HeLa digest alone (DF0) sample, whereas the other tools detected false positives at 1-5% of total identifications.
      Figure 5. Assessment of sensitivity, specificity, and quantitation of HLA-bound peptides. (A) A brief experimental design scheme to acquire DIA titration data with a fixed background of HeLa Protein Digest and varied quantities of spiked immunopeptides isolated and purified from C1R-B*57:01 cells. (B) The number of HLA peptides identified across the titration data points and their technical replicates when searching against the C1R-B*57:01 spectral library. (C) Upset diagrams show overlap between all merged HLA-B*57:01 peptides identified over the technical replicates of the DF0.2 and DF1.0 samples. (D) A stacked bar plot reveals the HLA-bound peptides proportionally, identified exclusively and commonly by the pipelines, across different dilutions. (E) The violin plots show the quantity of precursors for the joint set of peptides to compare the trend of the normalized peak area from DF0.2 to DF1.0 as calculated by the different tools. (F) The linear regression plots of the normalized peak area for the same set of the quantified precursors in DF0.2 show correlation and PCC values for each pair of tools.
      Figure 5B provides the identification-based sensitivity as determined by the curve slopes of precursor identification in the titration data points. DIA-NN (2956.0 ± 161.0), Spectronaut (2870.0 ± 143.0), and PEAKS (2405.0 ± 155.7) provided higher identification sensitivity than Skyline (1761.0 ± 153.8), with higher regression slopes (Supplemental Data – Figure S7A). Figure 5C shows an Upset diagram of the overlap between all merged peptides over three technical replicates of the DF0.2 and DF1.0 samples. At the DF0.2 titration level, PEAKS exclusively identified 558 HLA-I peptides, followed by Skyline (432 IDs), DIA-NN (271 IDs), and Spectronaut (138 IDs). At the DF1.0 titration level, Skyline exclusively identified 723 (12.53%) HLA-I peptides compared to DIA-NN (589 IDs – 10.21%), PEAKS (520 IDs – 9.01%), and Spectronaut (339 IDs – 5.88%). Across titration levels, the two tools with the most exclusively identified HLA-bound peptides were always drawn from Skyline, PEAKS, and DIA-NN.
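      The identification-based sensitivity is expressed as the slope of peptide identifications against dilution factor. A minimal sketch of an ordinary least-squares fit is shown below; the choice of ordinary least squares and the identification counts are assumptions for illustration, not values or methods confirmed by the study.

```python
# Minimal sketch: ordinary least-squares slope of peptide IDs versus dilution
# factor. The ID counts are invented to keep the arithmetic easy to follow.

def linear_fit(x, y):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

dilution_factors = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
peptide_ids = [0, 500, 1000, 1500, 2000, 2500]    # hypothetical counts

slope, intercept = linear_fit(dilution_factors, peptide_ids)
print(f"slope = {slope:.1f} IDs per unit DF, intercept = {intercept:.1f}")
```

      A steeper slope means more additional identifications per unit of spiked immunopeptidome, which is how the tools are ranked for sensitivity in the text.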
      A stacked bar plot shows, proportionally, the HLA-B*57:01 peptides identified exclusively by each tool (highlighting identification specificity) and the immunopeptides commonly sequenced by all tools across different data points (Figure 5D). As expected, with increasing immunopeptidome input through the titration levels, the tools agreed on a larger percentage of commonly identified peptides (31.3% at DF0.2 to 38.9% at DF1.0).
      To assess the quantification ability of the software tools, we analyzed relative precursor quantity by normalized peak area to evaluate intensities for 323 HLA-B*57:01 peptides identified and quantified commonly in the titration data (DF0.2 to DF1.0) by all pipelines (Figure 5E). As expected, all the tools showed similar increasing normalized peak areas across the dilution series. Figure 5F shows a high level of pairwise correlations (PCC > 0.8) between the normalized peak area (precursor at MS1 level normalized to the corresponding DF1.0 quantities) for Skyline, Spectronaut, DIA-NN, and PEAKS, at DF0.2 - the lowest concentration of the spiked immunopeptides.
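      The normalization described above, in which each precursor's peak area is expressed relative to its DF1.0 quantity, can be sketched as follows. The function name and the peak areas are hypothetical.

```python
# Minimal sketch: normalize one precursor's MS1 peak areas to its DF1.0 value.
# Peak areas are invented for illustration.

def normalize_to_reference(areas_by_df, reference_df=1.0):
    """Divide every peak area by the area measured at the reference DF."""
    reference = areas_by_df[reference_df]
    if reference <= 0:
        raise ValueError("reference peak area must be positive")
    return {df: area / reference for df, area in areas_by_df.items()}

precursor_areas = {0.2: 2.0e5, 0.4: 4.1e5, 0.6: 5.9e5, 0.8: 8.2e5, 1.0: 1.0e6}
normalized = normalize_to_reference(precursor_areas)
for df, value in sorted(normalized.items()):
    print(f"DF{df:.1f}: {value:.2f}")
```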
      We examined the linearity of peptide identification across the titration data points. Skyline, Spectronaut, DIA-NN, and PEAKS achieved high Pearson correlation coefficients of 0.944, 0.981, 0.977, and 0.968, respectively (Figure S7A – Supplemental Data). Furthermore, the DIA data were searched against the HeLa Protein Digest spectral library to identify background tryptic peptides (Table S6 – Supplemental Data). We expected consistency in peptide IDs due to the fixed amount of spiked HeLa Protein digest in every sample. Therefore, we calculated the CV values for each technical triplicate with each tool to examine robustness. DIA-NN identified the highest number of tryptic peptides over all data points and did so more consistently (lower CV%) than the other tools (Figure S7B – Supplemental Data).
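      The coefficient of variation used for the robustness check can be computed per technical triplicate as shown below. The use of the sample standard deviation and the example counts are assumptions for illustration.

```python
# Minimal sketch: CV% of tryptic peptide IDs across a technical triplicate,
# using the sample standard deviation. Counts are invented for illustration.
import statistics

def cv_percent(values):
    """Coefficient of variation in percent (sample SD / mean * 100)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * statistics.stdev(values) / mean

triplicate_ids = [980, 1000, 1020]   # hypothetical IDs from three injections
cv = cv_percent(triplicate_ids)
print(f"CV = {cv:.2f}%")
```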

      Discussion

      The present study demonstrates that when an extensive spectral library is already available, DIA can outperform DDA-MS analysis with, on average, a ∼37% increase in immunopeptidome coverage (DIA; 3167 ± 294 vs. DDA; 2307 ± 357) and 16% greater reproducibility. When comparing four commonly used DIA-analytical workflows, Spectronaut, DIA-NN, and PEAKS provided slightly better immunopeptidome coverage. All tools showed comparable identification reproducibility over the biological replicates. Furthermore, strong correlations between precursor RT for the commonly identified HLA peptides were observed, demonstrating that a consensus approach provides the highest confidence for immunopeptide identification. After an in-depth analysis of precursors (DIA spectra) identified from C1R-B*57:01 data, only 4.39% of precursors were assigned to two or more different peptide IDs using different algorithms. Skyline and DIA-NN provided lower external FDRs during external validation in the C1R-B*57:01 DIA data analyses against the hybrid spectral library containing HLA-B*57:01 and HLA-A*02:01 binders. In the C1R-A*02:01 data analysis, Spectronaut and DIA-NN showed lower empirical FDR errors. All software tools revealed excellent linearity over the titration data points with varying concentrations of immunopeptidomes. PEAKS and DIA-NN also demonstrated higher sensitivity for peptide identification during the titration experiments, suggesting they could better discriminate peptides with lower signal intensities than the other tools. We compiled several critical analytical criteria to compare the software tools (Table 1). Overall, the current data suggest a strategy of applying at least two complementary DIA software tools to achieve confident and in-depth coverage of immunopeptidomes, better DIA-MS data visualization, and reliable peptide quantification. A combination of DIA-NN and PEAKS identified 74.9% and 85.1% of all peptides in C1R-B*57:01 and C1R-A*02:01 data, respectively. In addition, DIA-NN and Skyline could identify 85.9% and 81.1% of all peptides from C1R-B*57:01 and C1R-A*02:01 data, respectively, with Skyline additionally offering excellent data visualization.
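      The headline comparison between DIA and DDA can be reproduced from values given in the Results. The sketch below assumes that the 16% reproducibility gain refers to the difference between the PEAKS DIA overlap (64.44%) and the PEAKS DDA overlap (47.90%); the text does not state which DIA value was used, so that pairing is an inference.

```python
# Minimal sketch: arithmetic behind the DIA-versus-DDA comparison.
# Assumption: the reproducibility gain compares PEAKS DIA with PEAKS DDA.
dia_ids, dda_ids = 3167, 2307                     # mean peptide IDs from the text
coverage_gain_pct = 100 * (dia_ids - dda_ids) / dda_ids

dia_overlap_pct, dda_overlap_pct = 64.44, 47.90   # replicate overlaps from the text
reproducibility_gain_points = dia_overlap_pct - dda_overlap_pct

print(f"coverage gain: {coverage_gain_pct:.1f}%")
print(f"reproducibility gain: {reproducibility_gain_points:.2f} percentage points")
```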
      Table 1. A summary of the performance of the compared DIA analysis tools against key analytical criteria
      Analytical criteria | Skyline ver. 21.1 | Spectronaut ver. 16 | DIA-NN ver. 1.8 | PEAKS ver. Xpro (v. 10.6)
      Immunopeptidome coverage | ******************* (combined string of the four per-tool ratings)
      Coverage specificity (exclusively identified peptides per tool) | 12.14-16.33% | 1.95-6.90% | 4.49-4.96% | 4.89-15.58%
      Identification-based sensitivity with increasing pHLA concentration | 1761.0 ± 153.8 | 2870.0 ± 143.0 | 2956.0 ± 161.0 | 2405.0 ± 155.7
      Identification reproducibility across replicates | 62.03% | 63.80% | 66.72% | 64.44%
      Actual FDR (%) calculated from search against hybrid spectral library | 1.25-4.44% | 2.03-2.69% | 1.45-3.60% | 1.75-4.25%
      Speed per triplicate | 15-20 mins | 17 mins | 5 mins | 12 mins
      Data visualization | *************** (combined string of the four per-tool ratings)
      Graphical user interface | Yes | Yes | Yes | Yes
      Ease to configure and use | ***************** (combined string of the four per-tool ratings)
      Availability | Free (open access) | License required | Free (open access) | License required
      Rating scale: *weak; **basic; ***standard; ****strong; *****very strong.
      Identification-based sensitivity: the results are shown as mean ± CI95% for three technical replicates.
      Speed per triplicate: for the data analyzed in this report.
      Data visualization: the visualization was assessed for the ease of carrying out overall run-to-run comparisons as well as specific immunopeptidomics needs at the peptide level, namely precursor and product ion spectral and chromatographic inspection.
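One way to read the "actual FDR" row of Table 1: when data from a single-allele cell line are searched against a hybrid library, identifications that map only to the other allele's library entries can be counted as false hits. The sketch below works under that assumption; the exact calculation used in the study may differ, the peptide sets are placeholders, and `empirical_fdr_pct` is a hypothetical helper.

```python
def empirical_fdr_pct(identified, target_library, entrapment_library):
    """Percentage of identifications that hit only the entrapment library.

    Peptides present in both libraries are treated as plausible true hits,
    which is an assumption of this sketch.
    """
    if not identified:
        return 0.0
    entrapment_only = entrapment_library - target_library
    false_hits = identified & entrapment_only
    return 100.0 * len(false_hits) / len(identified)


# Placeholder sequences (illustrative only, not data from this study).
b57_library = {"PEPA", "PEPB", "PEPC", "PEPD"}
a02_library = {"PEPD", "PEPX", "PEPY"}
identified_in_b57_run = {"PEPA", "PEPB", "PEPC", "PEPD", "PEPX"}

fdr = empirical_fdr_pct(identified_in_b57_run, b57_library, a02_library)
print(f"{fdr:.1f}%")
```

Because the estimate depends on how shared binders are handled, the treatment of peptides present in both libraries should be stated explicitly whenever such a figure is reported.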
      In this study we concentrated on library-based DIA analyses, but several library-free modalities are now available (Tiwary et al., 2019; Cox, 2022; Wilhelm et al., 2021) and could be useful for future immunopeptidomics studies where either no existing DDA library is available or the sample is too limiting. For such library-free approaches, extensive deep learning training on non-tryptic peptides is required for accurate predictive models of HLA peptide fragmentation (Cox, 2022; Wilhelm et al., 2021; Li et al., 2020). Notably, Prosit is a useful tool for generating in silico spectral libraries if the MS2 spectra for HLA peptides are unavailable (Gessulat et al., 2019). However, a library-free search of any decent-sized FASTA database will result in an extremely large search space, given the lack of enzyme specificity in HLA peptidome data processing (Cox, 2022; Searle et al., 2020). A further reason for restricting the current analysis to library-based searching is that library-free DIA searches are mostly used with very narrow isolation windows in DIA proteomics, with the possibility of several injections of sample to enhance coverage. The limited starting material in immunopeptidomics compared with proteomics makes this a less feasible approach, but one that nevertheless warrants its own future investigation. Likewise, collision energy (CE) affects fragment ion intensities (Neta et al., 2009) and hence the spectral libraries generated. Library matching for DIA data processing can therefore be affected by the CE used to create the library spectra. Ideally, a spectral library for DIA analysis is generated in-house, preferably on the same instrument, which can improve the quality of results. Otherwise, the library should pass acquisition setting tests to determine whether it can be validated (Kovalchik et al., 2022; Gessulat et al., 2019).
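The growth in search space caused by the absence of enzyme specificity can be made concrete by counting candidate peptides from a single sequence. This is a minimal sketch under stated assumptions: the 8-12 residue length range, the simple trypsin rule (cleave after K or R unless followed by P, no missed cleavages), the example sequence, and both function names are illustrative choices, not parameters taken from this study.

```python
import re


def unspecific_peptide_count(protein, min_len=8, max_len=12):
    """Count candidate peptides when every subsequence window is allowed.

    Counts windows, not unique sequences.
    """
    n = len(protein)
    return sum(max(n - k + 1, 0) for k in range(min_len, max_len + 1))


def tryptic_peptide_count(protein, min_len=8, max_len=12):
    """Count fully tryptic peptides (no missed cleavages) in the length range.

    Cleavage is assumed after K or R unless the next residue is P.
    """
    fragments = re.split(r"(?<=[KR])(?!P)", protein)
    return sum(1 for f in fragments if min_len <= len(f) <= max_len)


# Made-up sequence, for illustration only.
protein = "MSTAVLENPGLGRKLSDYGVQLTEAHKQFLSNPTRAAWELVKGSDIYPQHEMTRLLNVAGK"
print(unspecific_peptide_count(protein), tryptic_peptide_count(protein))
```

Even for one short sequence the unspecific count is far larger than the tryptic count, and the gap widens across a full proteome FASTA, which is why library-free searches of HLA peptidome data are computationally and statistically demanding.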
      Overall, this study highlights the potential of DIA-MS for immunopeptidomics studies, with gains in immunopeptidome coverage compared to DDA-MS approaches when a comprehensive spectral library is available. Combining search tools yields the largest immunopeptidomes and the greatest coverage, whilst a consensus approach provides the greatest accuracy and confidence in the data set. Given that immunopeptidomics is often the first step in peptide antigen discovery, the reduced accuracy of the combined approach may be acceptable, as subsequent MS-level and functional validation steps remain necessary to identify clinically actionable peptide T cell epitopes.

      Data availability

      All LC-MS/MS immunopeptidomics data, PEAKS® Studio Xpro DDA database search, and DIA identification search results have been uploaded to the ProteomeXchange Consortium via the PRIDE partner repository (Deutsch et al., 2020; Perez-Riverol et al., 2022). For DDA data, the accession codes are PXD017824 (published data) and PXD034429 (C1R-A*02:01 and C1R-B*57:01 DDA LC-MS/MS, accessed via username “[email protected]” and password “bDTLiFey”). C1R-A*02:01 (n=4) and C1R-B*57:01 (n=3) DIA LC-MS/MS data alongside the C1R-B*57:01 DIA titration data (18 runs) were deposited under the dataset identifier PXD034539 (accessed via username “[email protected]” and password “h7CKkSc2”). All search and results summaries were deposited with the corresponding datasets.

      Supplemental data

      This article contains supplemental data.

      Conflict of interest

      AWP is a scientific advisor for Bioinformatics Solutions Inc (the provider of PEAKS software).
      There are no other conflicts of interest declared by the authors.

      Acknowledgments

      The authors acknowledge the provision of instrumentation, training, and technical support by the Monash Biomedical Proteomics Facility. Computational resources were supported by the [email protected]/Monash Node of the NeCTAR Research Cloud, an initiative of the Australian Government’s Super Science Scheme and the Education Investment Fund. We appreciate the technical help and lab resources provided by Rochelle Ayala.

      References

      1. Dudek, N. L., and Purcell, A. W. (2016) Repertoire of Nonclassical MHC I (HLA-E, HLA-F, HLA-G, and Orthologues). In: Ratcliffe, M. J. H., ed. Encyclopedia of Immunobiology, pp. 215-219, Academic Press, Oxford

      2. Purcell, A. W., and Dudek, N. L. (2016) Repertoire of Classical MHC Class I and Class II Molecules. In: Ratcliffe, M. J. H., ed. Encyclopedia of Immunobiology, pp. 200-208, Academic Press, Oxford

        • Neefjes J.
        • Jongsma M.L.M.
        • Paul P.
        • Bakke O.
        Towards a systems understanding of MHC class I and MHC class II antigen presentation.
        Nature Reviews Immunology. 2011; 11: 823
        • Rock K.L.
        • Reits E.
        • Neefjes J.
        Present yourself! By MHC class I and MHC class II molecules.
        Trends in immunology. 2016; 37: 724-737
        • Guermonprez P.
        • Valladeau J.
        • Zitvogel L.
        • Théry C.
        • Amigorena S.
        Antigen Presentation and T Cell Stimulation by Dendritic Cells.
        Annual Review of Immunology. 2002; 20: 621-667
        • Blum J.S.
        • Wearsch P.A.
        • Cresswell P.
        Pathways of Antigen Processing.
        Annual Review of Immunology. 2013; 31: 443-473
        • Caron E.
        • Kowalewski D.J.
        • Chiek Koh C.
        • Sturm T.
        • Schuster H.
        • Aebersold R.
        Analysis of Major Histocompatibility Complex (MHC) Immunopeptidomes Using Mass Spectrometry.
        Molecular & Cellular Proteomics. 2015; 14: 3105
      3. Dudek, N. L., Croft, N. P., Schittenhelm, R. B., Ramarathinam, S. H., and Purcell, A. W. (2016) A Systems Approach to Understand Antigen Presentation and the Immune Response. In: Reinders, J., ed. Proteomics in Systems Biology: Methods and Protocols, pp. 189-209, Springer New York, New York, NY

        • Faridi P.
        • Purcell A.W.
        • Croft N.P.
        In Immunopeptidomics We Need a Sniper Instead of a Shotgun.
        PROTEOMICS. 2018; 18: 1700464
        • Freudenmann L.K.
        • Marcu A.
        • Stevanović S.
        Mapping the tumour human leukocyte antigen (HLA) ligandome by mass spectrometry.
        Immunology. 2018; 154: 331-345
        • Ritz D.
        • Kinzi J.
        • Neri D.
        • Fugmann T.
        Data-Independent Acquisition of HLA Class I Peptidomes on the Q Exactive Mass Spectrometer Platform.
        PROTEOMICS. 2017; 17: 1700177
        • Schumacher F.-R.
        • Delamarre L.
        • Jhunjhunwala S.
        • Modrusan Z.
        • Phung Q.T.
        • Elias J.E.
        • Lill J.R.
        Building proteomic tool boxes to monitor MHC class I and class II peptides.
        PROTEOMICS. 2017; 17: 1600061
        • Purcell A.W.
        • Ramarathinam S.H.
        • Ternette N.
        Mass spectrometry–based identification of MHC-bound peptides for immunopeptidomics.
        Nature Protocols. 2019; 14: 1687-1707
        • Croft N.P.
        Peptide Presentation to T Cells: Solving the Immunogenic Puzzle.
        BioEssays. 2020; 42: 1900200
        • Bassani-Sternberg M.
        • Bräunlein E.
        • Klar R.
        • Engleitner T.
        • Sinitcyn P.
        • Audehm S.
        • Straub M.
        • Weber J.
        • Slotta-Huspenina J.
        • Specht K.
        • Martignoni M.E.
        • Werner A.
        • Hein R.
        • H. Busch D.
        • Peschel C.
        • Rad R.
        • Cox J.
        • Mann M.
        • Krackhardt A.M.
        Direct identification of clinically relevant neoepitopes presented on native human melanoma tissue by mass spectrometry.
        Nature Communications. 2016; 7: 13404
        • Hunt D.F.
        • Henderson R.A.
        • Shabanowitz J.
        • Sakaguchi K.
        • Michel H.
        • Sevilir N.
        • Cox A.L.
        • Appella E.
        • Engelhard V.H.
        Characterization of peptides bound to the class I MHC molecule HLA-A2.1 by mass spectrometry.
        Science. 1992; 255: 1261
        • Gillet L.C.
        • Navarro P.
        • Tate S.
        • Röst H.
        • Selevsek N.
        • Reiter L.
        • Bonner R.
        • Aebersold R.
        Targeted Data Extraction of the MS/MS Spectra Generated by Data-independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis.
        Molecular & Cellular Proteomics. 2012; 11 (O111.016717)
        • Doerr A.
        DIA mass spectrometry.
        Nature Methods. 2015; 12: 35
        • Navarro P.
        • Kuharev J.
        • Gillet L.C.
        • Bernhardt O.M.
        • MacLean B.
        • Röst H.L.
        • Tate S.A.
        • Tsou C.-C.
        • Reiter L.
        • Distler U.
        • Rosenberger G.
        • Perez-Riverol Y.
        • Nesvizhskii A.I.
        • Aebersold R.
        • Tenzer S.
        A multicenter study benchmarks software tools for label-free proteome quantification.
        Nature Biotechnology. 2016; 34: 1130-1136
        • Venable J.D.
        • Dong M.-Q.
        • Wohlschlegel J.
        • Dillin A.
        • Yates J.R.
        Automated approach for quantitative analysis of complex peptide mixtures from tandem mass spectra.
        Nature Methods. 2004; 1: 39-45
        • Aebersold R.
        • Mann M.
        Mass spectrometry-based proteomics.
        Nature. 2003; 422: 198-207
        • Aebersold R.
        • Mann M.
        Mass-spectrometric exploration of proteome structure and function.
        Nature. 2016; 537: 347
        • Ludwig C.
        • Gillet L.
        • Rosenberger G.
        • Amon S.
        • Collins B.C.
        • Aebersold R.
        Data-independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial.
        Molecular Systems Biology. 2018; 14: e8126
        • Caron E.
        • Espona L.
        • Kowalewski D.J.
        • Schuster H.
        • Ternette N.
        • Alpízar A.
        • Schittenhelm R.B.
        • Ramarathinam S.H.
        • Lindestam Arlehamn C.S.
        • Chiek Koh C.
        • Gillet L.C.
        • Rabsteyn A.
        • Navarro P.
        • Kim S.
        • Lam H.
        • Sturm T.
        • Marcilla M.
        • Sette A.
        • Campbell D.S.
        • Deutsch E.W.
        • Moritz R.L.
        • Purcell A.W.
        • Rammensee H.-G.
        • Stevanovic S.
        • Aebersold R.
        An open-source computational and data resource to analyze digital maps of immunopeptidomes.
        eLife. 2015; 4: e07661
        • Schittenhelm R.B.
        • Sivaneswaran S.
        • Lim Kam Sian T.C.C.
        • Croft N.P.
        • Purcell A.W.
        Human Leukocyte Antigen (HLA) B27 Allotype-Specific Binding and Candidate Arthritogenic Peptides Revealed through Heuristic Clustering of Data-independent Acquisition Mass Spectrometry (DIA-MS) Data.
        Molecular & Cellular Proteomics. 2016; 15: 1867
        • Shan P.
        • Tran H.
        Integrating Database Search and De Novo Sequencing for Immunopeptidomics with DIA Approach.
        J Biomol Tech. 2019; 30 (S23): S23
        • Pak H.
        • Michaux J.
        • Huber F.
        • Chong C.
        • Stevenson B.J.
        • Müller M.
        • Coukos G.
        • Bassani-Sternberg M.
        Sensitive Immunopeptidomics by Leveraging Available Large-Scale Multi-HLA Spectral Libraries, Data-Independent Acquisition, and MS/MS Prediction.
        Molecular & Cellular Proteomics. 2021; 20
      4. Kovalchik, K., Hamelin, D., and Caron, E. (2022) Generation of HLA Allele-Specific Spectral Libraries to Identify and Quantify Immunopeptidomes by SWATH/DIA-MS. In: Corrales, F. J., Paradela, A., and Marcilla, M., eds. Clinical Proteomics: Methods and Protocols, pp. 137-147, Springer US, New York, NY

        • Tsiatsiani L.
        • Heck A.J.R.
        Proteomics beyond trypsin.
        The FEBS Journal. 2015; 282: 2612-2626
        • Tiwary S.
        • Levy R.
        • Gutenbrunner P.
        • Salinas Soto F.
        • Palaniappan K.K.
        • Deming L.
        • Berndl M.
        • Brant A.
        • Cimermancic P.
        • Cox J.
        High-quality MS/MS spectrum prediction for data-dependent and data-independent acquisition data analysis.
        Nature Methods. 2019; 16: 519-525
        • Schuster H.
        • Shao W.
        • Weiss T.
        • Pedrioli P.G.A.
        • Roth P.
        • Weller M.
        • Campbell D.S.
        • Deutsch E.W.
        • Moritz R.L.
        • Planz O.
        • Rammensee H.-G.
        • Aebersold R.
        • Caron E.
        A tissue-based draft map of the murine MHC class I immunopeptidome.
        Scientific Data. 2018; 5: 180157
        • MacLean B.
        • Tomazela D.M.
        • Shulman N.
        • Chambers M.
        • Finney G.L.
        • Frewen B.
        • Kern R.
        • Tabb D.L.
        • Liebler D.C.
        • MacCoss M.J.
        Skyline: an open source document editor for creating and analyzing targeted proteomics experiments.
        Bioinformatics. 2010; 26: 966-968
        • Pino L.K.
        • Searle B.C.
        • Bollinger J.G.
        • Nunn B.
        • MacLean B.
        • MacCoss M.J.
        The Skyline ecosystem: Informatics for quantitative mass spectrometry proteomics.
        Mass Spectrometry Reviews. 2020; 39: 229-244
        • Bruderer R.
        • Bernhardt O.M.
        • Gandhi T.
        • Miladinović S.M.
        • Cheng L.-Y.
        • Messner S.
        • Ehrenberger T.
        • Zanotelli V.
        • Butscheid Y.
        • Escher C.
        • Vitek O.
        • Rinner O.
        • Reiter L.
        Extending the Limits of Quantitative Proteome Profiling with Data-Independent Acquisition and Application to Acetaminophen-Treated Three-Dimensional Liver Microtissues.
        Molecular & Cellular Proteomics. 2015; 14 (1400)
        • Demichev V.
        • Messner C.B.
        • Vernardis S.I.
        • Lilley K.S.
        • Ralser M.
        DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput.
        Nature Methods. 2020; 17: 41-44
        • Tran N.H.
        • Qiao R.
        • Xin L.
        • Chen X.
        • Liu C.
        • Zhang X.
        • Shan B.
        • Ghodsi A.
        • Li M.
        Deep learning enables de novo peptide sequencing from data-independent-acquisition mass spectrometry.
        Nature Methods. 2019; 16: 63-66
        • Pandey K.
        • Ramarathinam S.H.
        • Purcell A.W.
        Isolation of HLA Bound Peptides by Immunoaffinity Capture and Identification by Mass Spectrometry.
        Current Protocols. 2021; 1: e92
        • Jappe E.C.
        • Garde C.
        • Ramarathinam S.H.
        • Passantino E.
        • Illing P.T.
        • Mifsud N.A.
        • Trolle T.
        • Kringelum J.V.
        • Croft N.P.
        • Purcell A.W.
        Thermostability profiling of MHC-bound peptides: a new dimension in immunopeptidomics and aid for immunotherapy design.
        Nature Communications. 2020; 11: 6305
        • Escher C.
        • Reiter L.
        • MacLean B.
        • Ossola R.
        • Herzog F.
        • Chilton J.
        • MacCoss M.J.
        • Rinner O.
        Using iRT, a normalized retention time for more targeted measurement of peptides.
        PROTEOMICS. 2012; 12: 1111-1121
        • Pavlos R.
        • McKinnon E.J.
        • Ostrov D.A.
        • Peters B.
        • Buus S.
        • Koelle D.
        • Chopra A.
        • Schutte R.
        • Rive C.
        • Redwood A.
        • Restrepo S.
        • Bracey A.
        • Kaever T.
        • Myers P.
        • Speers E.
        • Malaker S.A.
        • Shabanowitz J.
        • Jing Y.
        • Gaudieri S.
        • Hunt D.F.
        • Carrington M.
        • Haas D.W.
        • Mallal S.
        • Phillips E.J.
        Shared peptide binding of HLA Class I and II alleles associate with cutaneous nevirapine hypersensitivity and identify novel risk alleles.
        Scientific Reports. 2017; 7: 8653
        • Gfeller D.
        • Bassani-Sternberg M.
        Predicting Antigen Presentation—What Could We Learn From a Million Peptides?.
        Frontiers in Immunology. 2018; 9
        • Nielsen M.
        • Lund O.
        • Buus S.
        • Lundegaard C.
        MHC class II epitope predictive algorithms.
        Immunology. 2010; 130: 319-328
        • Reynisson B.
        • Alvarez B.
        • Paul S.
        • Peters B.
        • Nielsen M.
        NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data.
        Nucleic Acids Research. 2020; 48: W449-W454
        • Andreatta M.
        • Alvarez B.
        • Nielsen M.
        GibbsCluster: unsupervised clustering and alignment of peptide sequences.
        Nucleic Acids Research. 2017; 45: W458-W463
        • Thomsen M.C.F.
        • Nielsen M.
        Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion.
        Nucleic Acids Research. 2012; 40: W281-W287
        • Hulsen T.
        • de Vlieg J.
        • Alkema W.
        BioVenn – a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams.
        BMC Genomics. 2008; 9: 488
        • Heberle H.
        • Meirelles G.V.
        • da Silva F.R.
        • Telles G.P.
        • Minghim R.
        InteractiVenn: a web-based tool for the analysis of sets through Venn diagrams.
        BMC Bioinformatics. 2015; 16: 169
        • Lex A.
        • Gehlenborg N.
        • Strobelt H.
        • Vuillemot R.
        • Pfister H.
        UpSet: Visualization of Intersecting Sets.
        IEEE transactions on visualization and computer graphics. 2014; 20: 1983-1992
        • Cox J.
        Prediction of peptide mass spectral libraries with machine learning.
        Nature Biotechnology. 2022;
        • Wilhelm M.
        • Zolg D.P.
        • Graber M.
        • Gessulat S.
        • Schmidt T.
        • Schnatbaum K.
        • Schwencke-Westphal C.
        • Seifert P.
        • de Andrade Krätzig N.
        • Zerweck J.
        • Knaute T.
        • Bräunlein E.
        • Samaras P.
        • Lautenbacher L.
        • Klaeger S.
        • Wenschuh H.
        • Rad R.
        • Delanghe B.
        • Huhmer A.
        • Carr S.A.
        • Clauser K.R.
        • Krackhardt A.M.
        • Reimer U.
        • Kuster B.
        Deep learning boosts sensitivity of mass spectrometry-based immunopeptidomics.
        Nature Communications. 2021; 12: 3346
        • Li K.
        • Jain A.
        • Malovannaya A.
        • Wen B.
        • Zhang B.
        DeepRescore: Leveraging Deep Learning to Improve Peptide Identification in Immunopeptidomics.
        PROTEOMICS. 2020; 20: 1900334
        • Gessulat S.
        • Schmidt T.
        • Zolg D.P.
        • Samaras P.
        • Schnatbaum K.
        • Zerweck J.
        • Knaute T.
        • Rechenberger J.
        • Delanghe B.
        • Huhmer A.
        • Reimer U.
        • Ehrlich H.-C.
        • Aiche S.
        • Kuster B.
        • Wilhelm M.
        Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning.
        Nature Methods. 2019; 16: 509-518
        • Searle B.C.
        • Swearingen K.E.
        • Barnes C.A.
        • Schmidt T.
        • Gessulat S.
        • Küster B.
        • Wilhelm M.
        Generating high quality libraries for DIA MS with empirically corrected peptide predictions.
        Nature Communications. 2020; 11: 1548
        • Neta P.
        • Simon-Manso Y.
        • Yang X.
        • Stein S.E.
        Collisional Energy Dependence of Peptide Ion Fragmentation.
        Journal of the American Society for Mass Spectrometry. 2009; 20: 469-476
        • Deutsch E.W.
        • Bandeira N.
        • Sharma V.
        • Perez-Riverol Y.
        • Carver J.J.
        • Kundu D.J.
        • García-Seisdedos D.
        • Jarnuczak A.F.
        • Hewapathirana S.
        • Pullman B.S.
        • Wertz J.
        • Sun Z.
        • Kawano S.
        • Okuda S.
        • Watanabe Y.
        • Hermjakob H.
        • MacLean B.
        • MacCoss M.J.
        • Zhu Y.
        • Ishihama Y.
        • Vizcaíno J.A.
        The ProteomeXchange consortium in 2020: enabling ‘big data’ approaches in proteomics.
        Nucleic Acids Research. 2020; 48: D1145-D1152
        • Perez-Riverol Y.
        • Bai J.
        • Bandla C.
        • García-Seisdedos D.
        • Hewapathirana S.
        • Kamatchinathan S.
        • Kundu Deepti J.
        • Prakash A.
        • Frericks-Zipper A.
        • Eisenacher M.
        • Walzer M.
        • Wang S.
        • Brazma A.
        • Vizcaíno Juan A.
        The PRIDE database resources in 2022: a hub for mass spectrometry-based proteomics evidences.
        Nucleic Acids Research. 2022; 50: D543-D552