Open Modification Searching of SARS-CoV-2–Human Protein Interaction Data Reveals Novel Viral Modification Sites

The outbreak of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of the coronavirus 2019 disease, has led to an ongoing global pandemic since 2019. Mass spectrometry can be used to understand the molecular mechanisms of viral infection by SARS-CoV-2, for example, by determining virus–host protein–protein interactions through which SARS-CoV-2 hijacks its human hosts during infection, and to study the role of post-translational modifications. We have reanalyzed public affinity purification–mass spectrometry data using open modification searching to investigate the presence of post-translational modifications in the context of the SARS-CoV-2 virus–host protein–protein interaction network. Based on an over twofold increase in identified spectra, our detected protein interactions show a high overlap with independent mass spectrometry-based SARS-CoV-2 studies and virus–host interactions for alternative viruses, as well as previously unknown protein interactions. In addition, we identified several novel modification sites on SARS-CoV-2 proteins that we investigated in relation to their interactions with host proteins. A detailed analysis of relevant modifications, including phosphorylation, ubiquitination, and S-nitrosylation, provides important hypotheses about the functional role of these modifications during viral infection by SARS-CoV-2.

The outbreak of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of the coronavirus 2019 disease, has led to an ongoing global pandemic since 2019. Mass spectrometry can be used to understand the molecular mechanisms of viral infection by SARS-CoV-2, for example, by determining virus-host protein-protein interactions through which SARS-CoV-2 hijacks its human hosts during infection, and to study the role of post-translational modifications. We have reanalyzed public affinity purification-mass spectrometry data using open modification searching to investigate the presence of post-translational modifications in the context of the SARS-CoV-2 virus-host protein-protein interaction network. Based on an over twofold increase in identified spectra, our detected protein interactions show a high overlap with independent mass spectrometry-based SARS-CoV-2 studies and virus-host interactions for alternative viruses, as well as previously unknown protein interactions. In addition, we identified several novel modification sites on SARS-CoV-2 proteins that we investigated in relation to their interactions with host proteins. A detailed analysis of relevant modifications, including phosphorylation, ubiquitination, and S-nitrosylation, provides important hypotheses about the functional role of these modifications during viral infection by SARS-CoV-2.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly transmissible and pathogenic coronavirus that causes coronavirus disease 2019 . Since late 2019, it has caused an ongoing global pandemic and socioeconomic crisis, with currently over 610 million people infected and over 6.5 million deaths globally (World Health Organization, September 2022) (1). Despite intensive research by the scientific community, many important questions regarding its molecular mechanisms remain unanswered.
Viruses are opportunistic intracellular pathogens that depend on their interactions with host proteins to ensure their survival and propagation. The study of protein-protein interactions (PPIs) between SARS-CoV-2 and human proteins is important to understand the mechanisms of viral infection by SARS-CoV-2 (2) and develop therapeutic treatments (3). Proteins and their function can be altered by post-translational modifications (PTMs) to increase the functional diversity of the limited number of proteins that are encoded. Interestingly, PTMs have been characterized on several coronavirus proteins, in spite of the fact that coronaviruses lack enzymes capable of introducing such modifications (4). Elucidating the roles of PTMs in a mechanistic context is important to understand viral infection, as they are crucial for viral protein function and promote viral replication, assembly, and release. For example, phosphorylation of the SARS-CoV-2 nucleocapsid protein allows its shuttling between cellular compartments (5). In addition, insights in viral and host PTM dynamics offer a potential avenue toward the development of antiviral therapies. Removing PTMs that play a role in the enzymatic activity of viral proteins can aid the host in overcoming viral infection. Alternatively, viral proteins can be modified leading to their inactivation and/or proteasomal degradation, for example, by attaching ubiquitin (6).
Affinity purification-mass spectrometry (AP-MS)-based proteomics was used to compile the first SARS-CoV-2-human PPI map, revealing interactions with proteins involved in major cellular processes, including DNA replication, RNA processing, and vesicle trafficking to give insights into SARS-CoV-2 infection (2). In addition, the interaction map revealed several human proteins that are targeted by existing drugs approved by the US Food and Drug Administration, which may be potential targets for drug repurposing.
The majority of previously conducted studies use standard sequence database searching to process AP-MS data (2,(7)(8)(9)(10). Most search engines require the user to explicitly specify potential protein modifications to be considered during searching. Unfortunately, taking into account multiple modifications simultaneously during spectrum identification is problematic. First, it leads to excessive search times because of the exponential increase in the number of candidate peptides that have to be considered. Second, it produces more random high-scoring matches because of the increase in candidates, leading to fewer identifications at a given false discovery rate (FDR) (11). Therefore, a standard analysis typically only considers a handful of variable modifications. In the original analysis of the SARS-CoV-2 AP-MS data, only two modifications (N-terminal acetylation and methionine oxidation) were considered, leaving a substantial part of the data unexplored.
We have recently performed a computational reanalysis of the SARS-CoV-2 AP-MS data by Gordon et al. (2) to uncover additional virus-host interactions not reported in the original study, which highlighted further opportunities for drug repurposing (12). However, PTMs were still not considered. Therefore, in this study, we used an increasingly popular approach to overcome these limitations called "open modification searching" (OMS) to gain new insights into virus-host interactions. This innovative approach allows a modified spectrum to match against its unmodified variant by using a very wide precursor mass window. It is thus able to identify peptides carrying any type of modification. Algorithmic advances of the last few years now allow for the fast and accurate use of OMS, enabling an unbiased detection of modifications at an unparalleled scale (13)(14)(15). The OMS solution ANN-SoLo (13,14) was used to reprocess the SARS-CoV-2 AP-MS data by Gordon et al. (2). The reanalysis resulted in a more than twofold increase in identified spectra compared with the originally reported results, allowing more accurate PPI filtering. In addition, several modified viral peptides were identified. Phosphorylation, ubiquitination, and S-nitrosylation were selected to be investigated in more detail, in view of their putative importance during a viral infection. For each of these PTMs, we detected novel PTM sites on SARS-CoV-2 proteins, revealing potential functional insights.

Experimental Design and Statistical Rationale
A previously generated AP-MS dataset to study the SARS-CoV-2 virus-host interactome (2) was retrieved from the PRIDE repository (PXD018117) (16). For the full experimental details, see the original study by Gordon et al. (2). In brief, AP was performed using 27 SARS-CoV-2 proteins that were individually tagged and expressed in triplicate (biological replicates) in HEK-293T cells. Bead-bound proteins were denatured, reduced, carbamidomethylated, and enzymatically digested using trypsin, and each sample was injected via an Easy-nLC 1200 (Thermo Fisher Scientific) into a Q-Exactive Plus mass spectrometer (Thermo Fisher Scientific). The SARS-CoV-2 proteins that were included are all mature nonstructural proteins (Nsps), except for Nsp3 and Nsp16; a mutated version of Nsp5 to disable its proteolytic activity (Nsp5_C145A); and all predicted SARS-CoV-2 Orfs, including the spike (S), membrane (M), nucleocapsid (N), and envelope (E) protein. Spectrum identifications were filtered at 1% FDR, and PPIs were filtered using a SAINTexpress Bayesian false discovery rate (BFDR) ≤ 0.05, an average spectral count ≥ 2, and a MiST score ≥ 0.7 (see further).

Spectrum Identification Using Open Modification Searching
First, the downloaded raw files were converted to MGF files using ThermoRawFileParser (version 1.2.3) (17). Next, OMS was performed using the ANN-SoLo spectral library search engine (version 0.2.4) (13,14). A combined human-SARS-CoV-2 spectral library was used for searching. The MassIVE-KB library (version June 15, 2018) was used as human spectral library. This is a comprehensive human higher energy collisional dissociation spectral library containing 2,154,269 unique precursors corresponding to 1,114,503 unique peptides, derived from publicly available MS data in the MassIVE repository (18). SARS-CoV-2 spectra were simulated by generating all possible tryptic peptide sequences from the SARS-CoV-2 protein sequences downloaded from UniProt (version March 05, 2020) using Pyteomics (version 4.3.2) (19) and predicting the corresponding spectra using Prosit (version prosi-t_intensity_2020_hcd; collision energy 33 as determined by Prosit collision energy calibration) (20). A simulated spectral library for the green fluorescent protein was generated in a similar fashion. A final spectral library was compiled by merging all spectra using SpectraST (version 5.0) (21) and adding decoy spectra in a 1:1 ratio using the shuffle-and-reposition method (22). ANN-SoLo was configured to use a 20 ppm precursor mass tolerance during the first step of its cascade search and a 500 Da precursor mass tolerance during its open search. Other search settings were to filter peaks below 101 m/z, above 1500 m/ z, and in a 0.5 m/z window around the precursor mass; a 0.02 m/z fragment mass tolerance; and a bin size of 0.05 m/z. The remaining settings were kept at their default values. Peptide-spectrum matches (PSMs) were filtered at 1% FDR using ANN-SoLo's built-in subgroup FDR procedure (supplemental Table S1).
In addition, OMS was performed using MSFragger (version 3.5) (15) and FragPipe (version 18.0) against a concatenated FASTA file containing human protein sequences (UniProt reviewed sequences downloaded on February 28, 2020) (23), the SARS-CoV-2 protein sequences (version March 05, 2020), and the green fluorescent protein sequence. An equal number of decoy protein sequences was generated using FragPipe. The MSFragger search settings included a precursor mass tolerance between −150 Da and 500 Da, a fragment mass tolerance of 0.02 Da, and trypsin cleavage with up to two missed cleavages. Cysteine carbamidomethylation was used as a fixed modification, and oxidation of methionine and N-terminal acetylation were used as variable modifications. Other search settings were kept at their default values. PSMs were processed using PeptideProphet (version 4.4.0) (24) with the FragPipe default settings for open searches and filtered at 1% FDR.
To validate the filtered PPIs, a list of SARS-CoV-2-human interactions reported in seven alternative MS-based SARS-CoV-2 virus-host interactome studies (7-10, 28-30) was obtained from the BioGRID repository (31). In addition, PPIs from a previous reanalysis of the original AP-MS study were included (12). Human interaction partners were queried in the VirHostNet database (version 3.0) (32) to investigate whether these proteins are also targeted by other viruses.

Gene Ontology Enrichment Analysis
Gene Ontology (GO) enrichment analysis was performed for the human proteins that interact with each viral protein, using the enrichGO function of the clusterProfiler package (version 4.0.5) in R. Significant GO terms corresponding to the biological process category (1% Benjamini-Hochberg FDR) were extracted and further refined to select nonredundant terms using the rrvgo package (version 1.4.4) with default parameters (https://ssayols.github.io/rrvgo).

Investigation of PTMs
The precursor mass differences observed in the ANN-SoLo results were referenced against the Unimod database (33) to determine the modifications that were present. PSMs that included specific PTMs (phosphorylation, ubiquitination, and S-nitrosylation) were manually investigated in more detail to disambiguate between alternative PTM assignments with near-identical mass and determine the modification site by visual inspection using the spectrum_utils Python package (version 0.3.3) (34).

Increased Spectrum Identification Rate Using Open Modification Searching Boosts PPI Confidence
Open modification searching using ANN-SoLo succeeded in identifying 830,743 of 2,503,010 total MS/MS spectra at 1% FDR (33% identification rate). This represents a 214% increase in identified spectra compared with the originally reported results (2) obtained by standard searching using MaxQuant (44) (Fig. 1A). Notably, 402,586 PSMs correspond to modified peptides with nonzero precursor mass differences from open modification searching ( Fig. 1C and supplemental Table S10). Besides the increase in identified spectra, an important advantage of open modification searching is its ability to identify any type of PTM in an unbiased fashion, without the need to explicitly specify a limited number of variable modifications. This makes it possible to explore the general presence of PTMs in the context of the SARS-CoV-2 virus-host interactome. Frequently observed PTMs include modifications that were likely artificially introduced during sample processing (45), such as oxidation, dioxidation, and acetylation. Such ubiquitous modifications are typically included as variable modifications during standard searching. The OMS results also include unique observations of biologically relevant modifications at lower abundances, such as phosphorylation, ubiquitination, and S-nitrosylation.
Similar beneficial results can also be achieved with alternative OMS tools, such as MSFragger (15). Performing an open search using MSFragger instead of ANN-SoLo succeeded in identifying 574,394 of 2,503,010 total MS/MS spectra at 1% FDR (23% identification rate). This represents a 148% increase in identified spectra compared with the originally reported results (2) (supplemental Fig. S1A). Furthermore, there is a strong correspondence in identification results between ANN-SoLo and MSFragger (supplemental Fig. S1, B and C), which indicates the robustness of open modification searching, irrespective of the search engine that is employed. For simplicity of the downstream analyses and because ANN-SoLo was more sensitive than MSFragger, the ANN-SoLo results were used to investigate protein interactions during viral infection by SARS-CoV-2.
PPI filtering is crucial to separate true interactors from nonspecific binders and contaminants. A combination of SAINTexpress and MiST filtering of the open modification searching results produced 375 high-confidence PPIs (supplemental Table S4). These results contain 164 PPIs that overlap with the previously reported results (2) and 211 novel PPIs (supplemental Fig. S2). Notably, a previous reanalysis of these AP-MS data using alternative bioinformatics tools reported a similar overlap with the original PPI results (12). The difference in detected PPIs is partly because of the difference in spectrum identification and PPI filtering strategies. In the original analysis, a two-step filtering strategy was used. In the first step, the PPIs were filtered by a SAINTexpress BFDR ≤ 0.05, an average spectral count ≥ 2, and a MiST score ≥ 0.7. All proteins that fulfilled the first filtering step were searched in the CORUM (46) database of known protein complexes, and information was extracted about the stable protein complexes that they participated in. In the second step, all the proteins that formed known complexes with interactors identified in the first step were subjected to a filtering step with a lower stringency (MiST score ≥ 0.6) (2). In contrast, in the reanalysis, only a single filtering step was performed using a SAINTexpress BFDR ≤ 0.05, an average spectral count ≥ 2, and a MiST score ≥ 0.7.
When compared with the unfiltered PPI data of the original analysis (2), 199 of the 211 novel PPIs were previously detected as well but failed the original PPI filtering thresholds (supplemental Fig. S3). Most discarded PPIs (~88%) did not pass the minimum threshold for the MiST score, which is a linear combination of the prey abundance, the prey reproducibility across repeated runs, and the specificity of the prey relative to other baits (47). About 28% of the discarded PPIs did not pass the SAINTexpress BFDR filter, which is based on the abundance of the preys and control proteins (26). The difference in filtered PPIs in the reanalysis is driven by the increased number of PSMs per identified protein from the open modification results, which resulted in a more reliable identification of both background proteins and potential interaction partners. This influences the PPI filtering and results in a more robust identification of true interactors.
The PPIs were validated against seven alternative MS-based SARS-CoV-2 virus-host interactome studies (7-10, 28-30) and the results from a previous reanalysis (12). The largest overlap was found with the previous reanalysis and the original analysis, as these studies make use of the same experimental data and only differ in the computational tools that were employed ( Fig. 2A). Although most independent studies contributed unique PPIs, there is a strong correspondence in the superset of detected PPIs across all independent studies, with 317 of the 375 PPIs that were detected in this study previously reported in one or more of the other studies ( Figs. 2A and S2). In addition, investigating the overlap with the VirHostNet database (32) shows that 328 of the human interaction partners are also targeted by other viruses (Fig. 2B). These results are significantly enriched in the Vir-HostNet database (Fisher's exact test, p ≪ 0.001), which is expected as different viruses have similar strategies to hijack the host, for example, through functional conservation or viral recombination (32).
GO enrichment analysis based on the human interacting proteins for each SARS-CoV-2 protein indicates how viral infection might hijack major cellular processes, including metabolic processes involving noncoding RNA (ncRNA; Nsp8) and glycoproteins (Orf8), and RNA export from the nucleus (Orf6) (Fig. 3). Both microRNA and long ncRNA are ncRNA that can regulate gene expression and exhibit different expression profiles in COVID-19 patients compared with healthy people (48,49). In addition, they have recently been found to act as viral modulators, regulating viral infection and host defense (50,51). Glycoprotein metabolic processes are expected to be relevant to Orf8, as induction of proinflammatory cytokine production by secreted Orf8 is glycosylation dependent (52).  Table S10. PPI, protein-protein interaction; PSM, peptide-spectrum match.
RNA export from the nucleus is also relevant, as SARS-CoV-2-infected cells have an increased level of nuclear mRNA accumulation. Confirming our detected interactions between Orf6 and the mRNA export factors Rae1 and Nup98, a recent study found that Orf6 uses this process to trap host mRNA (53). In addition to these major cellular processes, other biological processes relevant during viral infection were found, including viral transcription and viral gene expression (supplemental Fig. S4).

Interaction Between Human Kinases And Phosphorylated Viral Proteins
Phosphorylation is a common PTM that impacts many basic cellular processes (54). It usually results in a functional change of the target protein, interfering with its enzymatic activity, cellular location, and/or association with other proteins. Several studies have indicated that also the function of viral proteins can be affected by their phosphorylation status. For example, association of the SARS-CoV-2 nucleocapsid protein with the 14-3-3 host proteins depends on its phosphorylation status (55), and phosphorylation of the SARS-CoV-1 nucleocapsid protein has been proven to be important for the regulation of the viral life cycle (56).
Twenty phosphorylated viral peptides were identified using open modification searching, and phosphosites were localized to specific residues using visual inspection where possible (supplemental Table S5 and supplemental Figs. S5-S24). Notably, despite the fact that no phosphopeptide enrichment protocol was used during the original AP-MS study, we identified several phosphorylation sites that match results from specialized phosphoproteomics studies performed on SARS-CoV-2 infected cells (35)(36)(37) (Fig. 4A).
Phosphorylation status is determined by the interplay between kinases and phosphatases. Kinases catalyze the attachment of phosphate groups to target proteins, and phosphatases remove phosphate groups from target proteins (57). SARS-CoV-2 does not encode any kinases or phosphatases and thus relies on host proteins for its phosphorylation status. Several kinases and phosphatases have been identified as potential drug targets, and a few kinase inhibitors are currently being used to treat COVID-19. For example, baricitinib is recommended by the World Health Organization for patients with severe or critical COVID-19. It is a Janus kinase inhibitor that suppresses overstimulation of the immune system by preventing phosphorylation of key proteins involved in the signal transduction that leads to immune activation and inflammation (58), and it is able to prevent SARS-CoV-2 from entering the cell by inhibiting clathrinmediated endocytosis (59).
We detected multiple interactions between SARS-CoV-2 proteins and host kinases and phosphatases (supplemental Tables S11 and S12), including receptor interaction protein kinase 1 (RIPK1; Nsp12) and Tank-binding kinase 1 (TBK1; Nsp13), both of which have previously been linked to SARS-CoV-2. Active phosphorylated RIPK1 was found in epithelial cell samples from COVID-19 patients (60), and a recent study discovered that Nsp12 promotes activation of RIPK1, which in turn enhances viral replication by stimulating the expression of viral receptors, such as angiotensin-converting enzyme 2 and epidermal growth factor receptor (61). Furthermore, Nsp13 is found to limit the activation of TBK1 by directly binding to TBK1 (62). In addition, we uncovered previously unknown interactions with host kinases, such as the interaction between Nsp12 and TLK1 (tousled-like kinase 1; Fig. 4B). This interaction is particularly interesting as kinase-specific phosphorylation site prediction revealed that an observed phosphorylation site on Nsp12 (T21) is most likely caused by TLK1.

Ubiquitination of SARS-CoV-2 Proteins
Ubiquitination is a common PTM that can affect the localization, stability, and function of proteins (63). It regulates a variety of cellular processes, including protein degradation, protein trafficking, transcription, cell-cycle control, and cell signaling (64). Depending on the cellular context, ubiquitin attachment may either promote or inhibit the viral life cycle. Viruses have developed means to exploit protein ubiquitination by enhancing or inhibiting ubiquitination of specific substrates depending on their needs. They can redirect protein degradation toward proteins with antiviral activity (65) or use ubiquitination to regulate viral proteins. For example, transcriptional function of the HIV type-1 Tat protein is increased by the addition of a single ubiquitin molecule (66). More recently, it has been discovered that the inhibition of interferon α signaling by SARS-CoV-2 Orf7a depends on the ubiquitination of K119 (67).
Ten ubiquitinated viral peptides were identified, and, where possible, ubiquitination sites were localized to specific residues using visual inspection (supplemental Table S7 and supplemental Figs. S25-S34). Notably, all ubiquitination sites that could be confidently localized have been previously reported in independent studies (7, 42) (Fig. 5A) or could be predicted. Interestingly, although ubiquitination of K16 on Orf9c has not been reported in SARS-CoV-2 yet, this site has been observed to be ubiquitinated in SARS-CoV-1 (7).
The final step of ubiquitination is performed by a ubiquitin (E3) ligase. This enzyme is particularly important as it determines to which substrate protein the ubiquitin is attached. A ubiquitin can be removed by deubiquitinating enzymes (DUBs). Both host E3 ligases and DUBs were found to interact with SARS-CoV-2 proteins (supplemental Tables S13 and S14), suggesting that SARS-CoV-2 might hijack the host ubiquitination system. For example, Nsp13 was found to interact with the DUB USP13 (Fig. 5B). According to a recent study, Nsp13 likely hijacks USP13 to prevent itself from degradation. Both knockdown of USP13 and its inhibition through spautin-1 resulted in decreased levels of Nsp13, suggesting that USP13 deubiquitinates and consequently stabilizes Nsp13 (68).

S-nitrosylation of SARS-CoV-2 Proteins
A promising compound currently undergoing clinical trials for COVID-19 is nitric oxide (NO) (69). Besides its role as an important vasodilator, to prevent blood clot formation, it functions as a vital immune mediator, exerting broadspectrum antiviral effects (70). NO potentially prevents infection by SARS-CoV-2, as it was suggested for SARS-CoV (71). The surprisingly low prevalence of smokers among hospitalized COVID-19 patients (72) could be attributed to the intermittent burst of high NO concentration in cigarette smoke (73). Also nitrate-rich nutrition, exercise, FIG. 4. Phosphorylation of SARS-CoV-2 proteins. A, UpSet plot showing the overlap in phosphorylation sites between independent phosphoproteomic studies and the sites found in the current AP-MS reanalysis. B, The PPI network of Nsp12 with highlighted kinases. Although RIPK1 and BCKDK were found in the original analysis, to our knowledge, TLK1 has not been identified as an interaction partner of Nsp12 yet. After kinase-specific phosphorylation site prediction, we found that one of the observed phosphorylation sites (T21) is predicted to be phosphorylated by TLK1. AP-MS, affinity purification-mass spectrometry; BCKDK, branched chain ketoacid dehydrogenase kinase; Nsp12, nonstructural protein 12; PPI, protein-protein interaction; RIPK1, receptor interaction protein kinase 1; TLK1, tousled-like kinase 1.

A B
FIG. 5. Ubiquitination of SARS-CoV-2 proteins. A, UpSet plot showing the overlap in ubiquitination sites between independent ubiquitination studies and the sites found in the current AP-MS reanalysis. The two novel ubiquitination sites were verified through in silico prediction. K16 on Orf9c has not been reported for SARS-CoV-2 yet but was previously observed in SARS-CoV-1. B, The PPI network of Nsp13 with the deubiquitinating enzyme USP13 highlighted. AP-MS, affinity purification-mass spectrometry; Nsp13, nonstructural protein 13; PPI, proteinprotein interaction; SARS-CoV, severe acute respiratory syndrome coronavirus.

Open Searching of SARS-CoV-2-Human Protein Interaction Data
Mol Cell Proteomics (2022) 21 (12) 100425 7 and breathing through your nose is hypothesized to prevent a SARS-CoV-2 infection, as they all increase the NO concentration (71,74).
The general antiviral mechanism appears to be the NOmediated S-nitrosylation of viral and host proteins (75). S-nitrosylation is the reversible covalent attachment of NO to the thiol side chain of cysteine. It is one of the most important and universal PTMs, and it can act as a global regulator of protein function akin to phosphorylation and ubiquitination (76). Interestingly, reactive cysteine residues are present in many viral and host proteins, representing possible targets for NO (75). Results of an in vitro study suggest that S-nitrosylation of the SARS-CoV-2 3CL protease (Nsp5) can directly inhibit its protease activity and reduce viral replication (77).
To the best of our knowledge, there are currently no reported S-nitrosylation sites on SARS-CoV-2 proteins. Because cysteine residues are typically considered with a fixed carbamidomethylation modification (+57.021464 Da) introduced during sample processing by alkylation with iodoacetamide (45), Prosit (20)-which was used to simulate SARS-CoV-2 spectra-always considers cysteine residues to be carbamidomethylated. However, the presence of a prior modification can block reduction and alkylation of cysteine residues (78). Therefore, rather than identifying S-nitrosylation directly based on its mass difference of +28.990164 Da, this modification is represented by a mass difference of −28.031300 Da (28.990164 Da − 57.021464) in the open modification searching results. Because the corresponding S-nitrosylation mass difference is identical to the mass difference observed from the valine to alanine and methionine to cysteine amino acid substitutions, careful visual inspection to localize the modification to specific amino acid residues was performed to confirm three S-nitrosylation sites (supplemental Table S8 and supplemental Figs. S35-S52).
Interestingly, one of these S-nitrosylation sites (C45) is located on the main protease Nsp5, specifically, near the catalytic site. Nsp5 is one of two cysteine proteases necessary for viral replication and assembly, and its protease activity is suggested to be directly inhibited by S-nitrosylation (77). A recent study found that C45 is a hyper-reactive cysteine with a higher nucleophilicity than C145-the catalytic cysteine of Nsp5-and identified C45 as an attractive binding site for the development of a covalent inhibitor (79). DISCUSSION Mass spectrometry research can help to provide insights into the etiology of SARS-CoV-2 infection and identify potential therapeutic targets by investigating the roles of viral and host proteins during infection, their protein interactions, and PTMs (12). In addition, it can be used to develop diagnostic strategies, such as through efforts of the CoV-MS consortium (80)-a partnership by multiple academic and industrial groups to increase applicability, accessibility, sensitivity, and robustness of an MS-based diagnostic test that detects proteolytically digested SARS-CoV-2 proteins.
Here, we have used open modification searching to reprocess SARS-CoV-2-human AP-MS data to investigate PTMs in the context of PPIs between SARS-CoV-2 and its human host. Studying these two aspects in tandem is especially relevant to understand the mechanisms of viral infection, because although PTMs are essential for viral replication, coronaviruses lack the enzymes to introduce these themselves. Whereas most SARS-CoV-2 studies so far have only considered a limited number of modifications commonly introduced during sample processing, our open modification searching strategy enabled the unbiased investigation of any PTM. This increased the number of identified spectra by more than twofold, corresponding to newly identified modified peptides, and enabled us to put these PTMs in the context of the virushost PPI network.
We discovered several combinations of relevant protein interactions and PTMs that hint at functional roles during viral infection. Specifically, we investigated phosphorylation, ubiquitination, and S-nitrosylation in more detail. These are reversible PTMs that are essential for a variety of cellular processes and could be of importance during viral infection. We found interactions between phosphorylated SARS-CoV-2 proteins and host kinases, between ubiquitinated SARS-CoV-2 proteins and host E3 ligases, and novel S-nitrosylation sites on SARS-CoV-2 proteins. Notably, even though no specialized modification enrichment was performed, we were able to confidently detect and localize multiple PTM sites that have been previously reported in independent enrichment studies, as well as novel PTM sites on SARS-CoV-2 proteins.
An important consideration in this study is the overlap in PPI results with the original study (2), which used standard searching instead of open modification searching. Although the vast majority of original spectrum identifications (79%) could be replicated using open modification searching, the overlap in PPIs was smaller (49%). Notably, another recent reanalysis of these data using alternative bioinformatics software tools showed a similarly limited overlap with the original PPI results (12). Besides using stringent FDR control and other PPI filtering settings during all data processing steps, we validated the detected protein interactions through comparison with alternative MS-based SARS-CoV-2 studies and general virus-host interactions in the VirHostNet database. This showed a highly significant overlap between our PPI results and these independent sources, reinforcing the validity of the newly detected interactions.
The partial mismatch in PPI results can be largely explained by more complete identification of both background proteins and potential interaction partners using OMS. As such, the current study highlights the dependence of PPI filtering on the preceding protein identification results. First, open modification searching can be used to obtain high-quality identification data from which comprehensive PPI results can be obtained.
Second, PPI filtering algorithms have to be able to robustly deal with uncertainty and missingness in the identification results when determining true protein interactions. Because OMS increased both the robustness of the PPI filtering results and enabled us to put PTM information in context of the protein interaction network, we suggest that this strategy is ideally suited for the analysis of MS-based PPI data.
In Supplemental data -This article contains supplemental data.