Functional Insights Into Protein Acetylation in the Hyperthermophilic Archaeon Sulfolobus islandicus*

About 26% of the total proteins were acetylated at lysine residues and 44% of the identified proteins were acetylated at the N terminus in the hyperthermophilic archaeon Sulfolobus islandicus. A Pat homolog preferentially acetylated a group of acyl-CoA synthetases among the acetylated proteins, whereas an Ard1 homolog exhibited broad substrate specificity. An S. islandicus mutant strain lacking the Pat homolog showed no significant growth defects, whereas a strain lacking the Ard1 homolog grew more slowly than the parent strain.

Highlights
Protein acetylation at Lys residues and the N terminus occurs widely in Sulfolobus.
SisPat preferentially acetylates a group of acyl-CoA synthetases.
SisArd1 acetylates the majority of the acetylated N termini identified in the cell.
SisArd1, but not SisPat, is required for the optimal growth of the organism.

Proteins undergo acetylation at the Nε-amino group of lysine residues and the Nα-amino group of the N terminus in Archaea as in Bacteria and Eukarya. However, the extent, pattern and roles of the modifications in Archaea remain poorly understood. Here we report the proteomic analyses of a wild-type Sulfolobus islandicus strain and its mutant derivative strains lacking either a homolog of the protein acetyltransferase Pat (ΔSisPat) or a homolog of the Nt-acetyltransferase Ard1 (ΔSisArd1). A total of 1708 Nε-acetylated lysine residues in 684 proteins (26% of the total proteins), and 158 Nt-acetylated proteins (44% of the identified proteins) were found in S. islandicus. ΔSisArd1 grew more slowly than the parental strain, whereas ΔSisPat showed no significant growth defects. Only 24 out of the 1503 quantifiable Nε-acetylated lysine residues were differentially acetylated, and all but one of the 24 residues were less acetylated by >1.3 fold in ΔSisPat than in the parental strain, indicating the narrow substrate specificity of the enzyme.
Six acyl-CoA synthetases were the preferred substrates of SisPat in vivo, suggesting that Nε-acetylation by the acetyltransferase is involved in maintaining metabolic balance in the cell. Acetylation of acyl-CoA synthetases by SisPat occurred at a sequence motif conserved among all three domains of life. On the other hand, 92% of the acetylated N termini identified were acetylated by SisArd1 in the cell. The enzyme exhibited broad substrate specificity and could modify nearly all types of the target N termini of human NatA-NatF. The deletion of the SisArd1 gene altered the cellular levels of 18% of the quantifiable proteins (1518) by >1.5 fold. Consistent with the growth phenotype of ΔSisArd1, the cellular levels of proteins involved in cell division and cell cycle control, DNA replication, and purine synthesis were significantly lower in the mutant than in the parental strain.

Protein acetylation is one of the major post-translational modifications (PTMs), in which the acetyl group from acetyl coenzyme A (Ac-CoA) is transferred to the α-amino group at the N terminus (Nα-acetylation or Nt-acetylation) or the ε-amino moiety of lysyl side chains (Nε-acetylation or lysine acetylation) (1). Nt-acetylation and enzymatic lysine acetylation are catalyzed by N-terminal acetyltransferase (NAT) and lysine (Lys [K]) acetyltransferase (KAT), respectively (1,2). In addition, global lysine acetylation in bacteria (e.g. Escherichia coli and Bacillus subtilis) and eukaryotic mitochondria also occurs extensively via a nonenzymatic mechanism (3). Although Nt-acetylation is irreversible, lysine acetylation is reversible through the action of deacetylases. Hence, the latter is also referred to as reversible lysine acetylation (RLA) (1,4).
Since its initial discovery in eukaryotic histones in 1963 (5), lysine acetylation has been found in a number of eukaryotic and bacterial proteins (4). Proteomic studies of lysine acetylation demonstrate that protein lysine acetylation occurs as widely in bacteria (e.g. E. coli, Salmonella enterica, B. subtilis, Rhodopseudomonas palustris, etc.) (6-11) as in eukaryotes (e.g. human, Drosophila, yeast, etc.) (12-14). RLA is known to affect a wide range of cellular processes, including chromatin maintenance (15), regulation of gene expression (16), metabolism (17-20), and cell structure (21). S. enterica acetyl-CoA synthetase (ACS) was the first enzyme shown to be regulated by RLA (17). This enzyme and other acyl-CoA synthetases, which form the AMP-forming acyl-CoA synthetase family, activate organic acids by generating corresponding acyl-CoA thioesters through a two-step process involving the formation of an enzyme-bound acyl-AMP intermediate (22). Acyl-CoA synthetases play a critical role in the anabolic and catabolic metabolism of fatty acids (17,18,23). Acetyl-CoA synthetase is also found to affect histone acetylation and hence gene expression by changing pools of the acetyl group donor acetyl-CoA (24,25). Acetylation of a conserved lysine residue in acetyl-CoA synthetase by the protein lysine acetyltransferase Pat in S. enterica abolished the activity of the acetyl-CoA synthetase by preventing the formation of the acyl-AMP intermediate (17). This regulatory mechanism appears to be evolutionarily conserved not only in bacteria (e.g. S. enterica, B. subtilis, R. palustris) (26) but also in eukaryotes (e.g. mouse and human) (27,28).
Nt-acetylation is a co- or post-translational modification frequently found on proteins in eukaryotes (e.g. ~60% of the proteins in yeast and ~80% of the proteins in humans) (4,29). The modification depends primarily on the nature of the first two amino acid residues at the N terminus. Eukaryotes encode multiple N-terminal acetyltransferases. For example, six human NATs have been identified and designated NatA-NatF (2,30). Each of these NATs contains a specific catalytic subunit (i.e. Naa10, Naa20, Naa30, Naa40, Naa50 and Naa60). Some of the NATs (i.e. NatA, NatB, NatC and NatE) are known to possess one or two auxiliary subunits (30). These various NATs differ in substrate specificity (2). Nt-acetylation modulates protein interactions, targeting, folding and degradation (2,31). Mutations in NATs were found to result in slow growth, abnormal cell cycle control, defects in chromosomal segregation or cell division, etc., in eukaryotes (2,31,32). Recently, extensive Nt-acetylation was also shown to occur in Bacteria (e.g. ~18% of the proteins in Pseudomonas aeruginosa) (33,34). E. coli RimL, RimJ and RimI are the well-known examples of bacterial NATs, and they target ribosomal proteins L7, S5 and S18, respectively (35). However, the role of these bacterial NATs is unclear because mutants defective in the three proteins exhibited no detectable phenotypic distinctions from the wild-type strain (36-38).
As compared with that in Eukarya and Bacteria, protein acetylation in Archaea is less well understood. A few proteomic studies have shown that proteins are widely acetylated both at the N terminus and at internal lysine residues in archaea as in eukaryotes (39-43). Much of the current knowledge about the mechanistic and functional aspects of archaeal protein acetylation is derived from studies on Sso10b, a member of the Sac10b family from Sulfolobus solfataricus. The Sac10b family includes small and sequence-nonspecific nucleic acid binding proteins highly conserved in Archaea (44-47). It was reported that Sso10b was acetylated at the ε-amino group of K16, a moderately conserved residue in the Sac10b family, and the α-amino group of the N-terminal residue (46). Acetylation of K16 was reported to reduce the affinity of Sso10b for DNA and de-repress transcription in vitro (46). Further, in vitro assays showed that the acetylation and deacetylation of K16 were catalyzed by an archaeal Pat homolog, the sole lysine acetyltransferase that has been biochemically characterized in Archaea, and the lysine deacetylase Sir2, respectively (46,48). Therefore, it is suggested that K16 in Sso10b is an RLA site, and acetylation and deacetylation of this site represent the only RLA-mediated regulatory mechanism so far identified in Archaea (1,4,49). However, these observations do not agree with the finding that the binding affinity of the Sulfolobus Pat homolog for Sso10b was low as compared with that of the eukaryotic counterpart for the cognate substrate proteins (50). Moreover, it was recently reported that Sis10b, a member of the Sac10b family from Sulfolobus islandicus, as well as Sso10b were not acetylated but methylated at K16 in vivo (51).
On the other hand, an S. solfataricus Ard1 homolog, which shares 37% sequence similarity with human Ard1/Naa10, the catalytic subunit of NatA (52), was found to catalyze Nt-acetylation of the N-terminal Ser residue of Sso10b in vitro. The Sulfolobus enzyme, the only NAT enzyme that has been identified in Archaea, showed a relaxed target sequence requirement, as compared with eukaryotic NatA because it was able to modify Sso10b variants with A, G, T, V, ML and ME at the N terminus, albeit with different efficiencies, in vitro (52). Structural analysis revealed that the active site of S. solfataricus Ard1 was a hybrid of the active sites of eukaryotic NatA and NatE (53). Therefore, it was proposed that the archaeal Ard1 represents an ancient and simple form of the eukaryotic NAT machinery (53). Very little is known about the functions of the Nt-acetylation of proteins in Archaea. Nt-acetylation was reported to reduce the cellular levels of 20S proteasome α1 protein in Haloferax volcanii cells, and of Sis10b and Cren7 in S. islandicus cells (51,54). Therefore, the bona fide substrates and the roles of archaeal Pat and Ard1 homologs remain to be elucidated.
In the present study, we performed comparative proteomic analyses on the extents and spectra of Nε- and Nt-acetylation on proteins from wild-type S. islandicus and its derivative mutant strains defective in either Pat or Ard1. We showed that proteins were extensively acetylated at the N terminus (~44% of the identified proteins) and internal lysine residues (~26% of the theoretical total of the proteins encoded by the genome) in S. islandicus. Neither SisPat nor SisArd1 was essential for the survival of S. islandicus, but the lack of the latter had a far greater impact on the growth of the organism than the lack of the former. SisPat acetylated only a small subset of proteins, with a group of acyl-CoA synthetases as preferred substrates, in vivo and was likely involved in metabolic regulation, whereas SisArd1 showed broad substrate specificity, being able to modify nearly all types of the target N termini for human NatA-NatF, and probably played a role in a range of cellular processes including cell cycle control, DNA replication and CRISPR-based immunity.

EXPERIMENTAL PROCEDURES
Growth of Organisms-All Sulfolobus strains were grown in Zillig's, SCVy or TYS medium (55) at 75°C with shaking at 150 rpm. S. islandicus E233S (ΔpyrEF, ΔlacS), a generous gift from Professor Qunxin She at Copenhagen University, was cultured in Zillig's medium supplemented with uracil (20 μg/ml). SCVy medium was used in transformation experiments. Solid plates for Sulfolobus strains were prepared by adding 0.8% (w/v) gelrite (Carl Roth, Karlsruhe, Germany) to the liquid medium.
Experimental Design and Statistical Rationale-Both the ΔSisPat mutant and S. islandicus E233S were grown to the exponential phase (OD600 ~0.6) and harvested. Each strain was prepared in three biological replicates, and each biological replicate was analyzed in three technical replicates for iTRAQ-based quantitative proteomic analysis. Each strain was prepared in two biological replicates for the TMT-based quantitative analysis of Nε-acetylation. Similarly, both the ΔSisArd1 mutant and E233S were grown to the exponential phase (OD600 ~0.6) and harvested. Each strain was prepared in two biological replicates for TMT-based quantitative proteomic analysis and for the qualitative analysis of N-terminal peptides by the terminal amine isotopic labeling of substrates (TAILS) technique. The data from the replicates were combined and analyzed with MaxQuant (v.1.5.2.8) or Scaffold Q+ (v. 4.5.2). For TMT-based quantitative lysine acetylome analysis, the acetylated peptides were detected in two replicates, and differentially regulated peptides were identified when the ratios of acetylated peptides between ΔSisPat and the parent from both replicates were at least 1.3 or 2. For TMT-/iTRAQ-based quantitative proteomic analyses, the unpaired Student's t test or the Mann-Whitney test was applied with the criterion of significance set at p value < 0.05.
Construction of a Pat Deletion Mutant (ΔSisPat) and a Complementary Strain-A genome-editing plasmid for the deletion of SisPat (SiRe2454) was constructed by cloning a spacer derived from the target site in the sispat gene and donor DNA sequences flanking the sispat gene into pSe-Rp, a Sulfolobus CRISPR-cloning vector (56). The target site started with a protospacer adjacent motif (CCN) positioned 171 bp downstream of the start codon of the sispat gene, and the immediately adjacent 40-nt sequence was used as the spacer (supplemental Table S1). The spacer fragment was inserted into pSe-Rp at the BspMI site, forming pAC-sispat. The sequences upstream (L-arm) and downstream (R-arm) of sispat were amplified by PCR from the genomic DNA of S. islandicus REY15A (supplemental Table S1). The L-arm and R-arm fragments were double digested with SalI/MluI and MluI/NotI, respectively, and inserted into plasmid pAC-sispat at the SalI and NotI sites, respectively, giving rise to the plasmid pGE-sispat. Plasmid pGE-sispat was introduced into S. islandicus E233S by electroporation (57). Transformed cells grown on SCVy plates were screened by PCR amplification using the Flanking and Internal primers (supplemental Table S1). The resulting PCR products were analyzed by agarose gel electrophoresis and by DNA sequencing. Colony-picking was repeated to obtain a pure mutant strain. Plasmids were cured from the deletion mutant by pyrEF counter selection with uracil and 5-FOA, yielding ΔSisPat.
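The spacer selection rule described above (a CCN motif followed immediately by a 40-nt spacer) can be illustrated with a short Python sketch. The function name `find_spacers` and its parameters are hypothetical, and the sketch assumes that the motif lies 5' of the spacer on the strand supplied and that only this strand is scanned; it is not the procedure used to design the actual construct.

```python
import re

def find_spacers(seq, pam_regex="CC[ACGT]", spacer_len=40):
    """Return (pam_start, spacer) for each PAM followed by a full-length spacer.

    pam_start is the 0-based index of the first PAM base in `seq`.
    Assumes a 3-nt PAM located immediately 5' of the spacer.
    """
    seq = seq.upper()
    hits = []
    # A lookahead is used so that overlapping PAMs (e.g. in "CCCC") are all found.
    for match in re.finditer(rf"(?=({pam_regex}))", seq):
        pam_start = match.start()
        spacer_start = pam_start + 3
        spacer = seq[spacer_start:spacer_start + spacer_len]
        # Skip PAMs too close to the 3' end to yield a complete spacer.
        if len(spacer) == spacer_len:
            hits.append((pam_start, spacer))
    return hits
```

Candidates returned this way would still need to be checked for position within the gene and for uniqueness in the genome before use.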
A strain that complemented the deletion of the genomic copy of the sispat gene was constructed as described previously (51). In brief, the sispat gene was obtained by PCR from the S. islandicus DNA (supplemental Table S1). The PCR product was cleaved with NdeI/SalI and inserted into plasmid pSeSD (58), producing the SisPat overexpression plasmid pSeSD-SisPat. After propagation in E. coli DH5α, pSeSD-SisPat was transformed into ΔSisPat.
Recombinant Protein Preparation-Genes encoding SisPat (SiRe2454) and a selected group of acetylated proteins, including SiRe0305, SiRe0317, SiRe2355, SiRe0033, SiRe0711, SiRe2388 and SiRe1630, were amplified from the genomic DNA of S. islandicus REY15A using primer pairs listed in supplemental Table S1. All PCR products were cloned, using the ClonExpress Ultra One Step Cloning Kit (Vazyme, Nanjing, China), into an expression vector according to the manufacturer's instructions. The fragments of the SisPat and SiRe0305 genes were cloned into expression plasmid pET30a, those of the SiRe0317 and SiRe2355 genes were cloned into expression vector Champion™ pET SUMO (Invitrogen, Waltham, MA), and those of the SiRe0033, SiRe0711, SiRe2388 and SiRe1630 genes were cloned into expression vector pET28a. Recombinant proteins SiRe0317 and SiRe2355 were fused to an N-terminal His6-SUMO tag, and the rest of the proteins were fused to a His6 tag at either the N terminus or the C terminus. Site-directed mutagenesis of SiRe0305, SiRe0317 and SiRe2355 was performed using the Mut Express II Fast Mutagenesis Kit V2 (Vazyme).
The resultant expression vectors were transformed into E. coli Rosetta (DE3). The overproducers were grown to an OD600 of ~0.6 at 37°C in LB medium containing 50 mg/ml kanamycin and 34 mg/ml chloramphenicol. Overproduction of the proteins, except for SiRe2355, was induced with the addition of 0.8 mM isopropyl-1-thio-β-D-galactopyranoside (IPTG) and subsequent incubation at 37°C for 4 h. Overproduction of SiRe2355 was induced with 0.1 mM IPTG, and the further incubation was at 12°C for 16 h. Following induction, cells were harvested by centrifugation, resuspended in buffer A [20 mM Tris-HCl, pH 8.8, 1 mM DTT, 0.1 mM EDTA, 500 mM NaCl, 10% (v/v) glycerol] and sonicated on ice. After centrifugation, the supernatants from the SiRe0305, SiRe0317 and SiRe2355 overproducers were heat treated at 75°C for 20 min, and those from the SiRe0033, SiRe2388, SiRe0711 and SiRe1630 overproducers were heat treated at 65°C for 20 min. Samples were clarified by centrifugation at 30,000 × g for 30 min at 4°C. Each protein sample was loaded onto a 1 ml HisTrap HP column in an ÄKTA FPLC Purifier system (GE Healthcare, Pittsburgh, PA), and the column was washed with buffer B [20 mM Tris-HCl, pH 8.8, 1 mM DTT, 0.1 mM EDTA, 500 mM NaCl, 500 mM imidazole and 10% (v/v) glycerol]. Peak fractions containing the target proteins were pooled and dialyzed against 10 mM Tris-HCl, pH 8.0, and 150 mM NaCl. Protein concentrations were determined by the Lowry method with bovine serum albumin (BSA) as the standard (59).
Protein Acetyltransferase Assays-SisPat activity was assayed as described previously (48). Briefly, a standard reaction (25 μl) contained 10 mM Tris-HCl, pH 8.0, 150 mM NaCl, 10 μM protein substrate, 10 μM SisPat and 3 μl [acetyl-3H] acetyl-CoA (2.35 Ci/mmol; Perkin-Elmer, Waltham, MA). Reactions were incubated at 65°C for 3 h and quenched by boiling for 5 min in SDS-PAGE loading buffer. The samples were subjected to electrophoresis in an SDS-PAGE gel. The gel was immersed for 1 h in EN3HANCE (Perkin-Elmer), washed for 25 min in deionized water, and soaked for 10 min in 10% acetic acid and 2% glycerol. The gel was then exposed to X-ray film.
Preparation of Samples for the Quantitative Determination of the Lysine Acetylome-TMT-based quantitative acetylomic analysis was performed as described previously with modifications (60). ΔSisPat and E233S were grown in Zillig's medium containing uracil (20 μg/ml) at 75°C with shaking at 150 rpm. Cells were harvested at an OD600 of ~0.6, resuspended in lysis buffer [8 M urea, 1% protease inhibitor mixture (Roche, Basel, Switzerland), 3 μM trichostatin A, 50 mM nicotinamide and 2 mM EDTA], and disrupted on ice by sonication. After centrifugation at 20,000 × g for 30 min at 4°C, proteins in the supernatant were precipitated with ice-cold 20% TCA for 2 h at -20°C. Following centrifugation at 15,000 × g for 10 min at 4°C, the pellet was washed three times with cold acetone. The proteins were dissolved in 8 M urea and 50 mM NH4HCO3, and the protein concentration was determined by using the BCA kit (Beyotime, Shanghai, China).
The protein solution was reduced with 5 mM DTT for 30 min at 56°C, and subsequently alkylated with 11 mM iodoacetamide (IAA) for 45 min at room temperature in the dark. After adding 50 mM NH4HCO3 to lower the urea concentration to < 2 M, trypsin was added at a trypsin-to-protein mass ratio of 1:50. Following overnight digestion, trypsin was added again at a trypsin-to-protein mass ratio of 1:100 for a second digestion of 4 h.
A sample of the resulting peptides (from 4 mg of proteins) was desalted using the Strata X C18 SPE column (Phenomenex, Torrance, CA), vacuum-dried and reconstituted in 0.5 M triethylammonium bicarbonate (TEAB). The peptides were labeled with the tandem mass tag (TMT) (Thermo Fisher, Waltham, MA) according to the manufacturer's protocol. Two replicates of the ΔSisPat sample were labeled with 126-tag and 130-tag, respectively, and two replicates of the E233S sample were labeled with 127-tag and 131-tag, respectively. All peptides were mixed, desalted and vacuum dried. To enrich lysine-acetylated peptides, the peptides were resuspended in NETN buffer (100 mM NaCl, 1 mM EDTA, 50 mM Tris-HCl, pH 8.0, and 0.5% Nonidet P-40), and incubated with anti-acetyllysine agarose beads (PTM Biolabs, Hangzhou, China) at 4°C overnight with gentle shaking. The beads were washed four times with 1 ml of NETN buffer and twice with deionized H2O. The bound peptides were eluted from the beads with 1% trifluoroacetic acid (TFA). The eluted fractions were combined, vacuum-dried, and subjected to LC-MS/MS analysis.

Preparation of Samples for the Determination of the N-terminal Acetylome by Using the Terminal Amine Isotopic Labeling of Substrates (TAILS) Technique-The TAILS analysis of proteins from ΔSisArd1 and E233S was carried out as described (61). ΔSisArd1 and E233S were grown in Zillig's medium containing uracil (20 μg/ml) at 75°C with shaking at 150 rpm and harvested at an OD600 of ~0.6. Proteins from the two strains were reduced and alkylated as described above for the preparation of the protein samples for the quantitative determination of the lysine acetylome. A sample (4 mg) of the proteins was precipitated by the addition of eight sample volumes of cold acetone and subsequent incubation for 2 h at -20°C. After centrifugation at 12,000 × g for 10 min at 4°C, the precipitate was dissolved in 6 M GuHCl, 20 mM HEPES-KOH, pH 8.0, 40 mM deuterated formaldehyde (12CD2O) and 20 mM NaBH3CN and incubated at 37°C overnight. The reaction was quenched by the addition of 1 M NH4HCO3 to 100 mM. After 4 h at 37°C, proteins were precipitated by the addition of eight sample volumes of cold acetone. After 2 h at -20°C, the sample was centrifuged, and the precipitate was dissolved in 8 M urea and 50 mM NH4HCO3. The proteins were digested with trypsin as described above. The sample was fractionated into 60 fractions by high-pH reverse-phase HPLC on a C18 column (Betasil C18, 5 μm particles, 10 mm ID × 250 mm, Thermo Fisher). These fractions were combined into 6 pools and dried in a SpeedVac (Thermo Fisher). The peptides were dissolved in PBS buffer (0.1 M Na2HPO4, 0.15 mM NaCl, pH 7.2), and incubated with pre-washed NHS-activated agarose beads (Lot number 26196, Thermo) at 4°C overnight with gentle shaking. Unbound peptides were collected after removing the beads by centrifugation at 12,000 × g for 10 min at 4°C. The peptides were vacuum-dried and subjected to LC-MS/MS.
TMT- and iTRAQ-based Quantitative Proteomics-Quantitative proteomic analyses of ΔSisArd1 and E233S by using a TMT-based approach and of ΔSisPat and E233S by using an iTRAQ-based approach were performed as described (62,63). Briefly, proteins from the three strains were prepared, reduced, alkylated and digested as described above for the preparation of the protein samples for the quantitative determination of the lysine acetylome. The resulting peptides (from 200 μg of total proteins for each sample) were labeled with the TMT or iTRAQ reagents by following the manufacturers' instructions. For the TMT-based quantitative proteomic analysis of ΔSisArd1 and E233S, two duplicate ΔSisArd1 samples were labeled with TMT reagent 129-tag and 130-tag, respectively, and two duplicate E233S samples were labeled with 127-tag and 128-tag, respectively. For the iTRAQ-based quantitative proteomic analysis of ΔSisPat and E233S, three triplicate ΔSisPat samples were labeled with iTRAQ reagent 113-tag, 115-tag, and 117-tag, respectively, and three triplicate E233S samples were labeled with 114-tag, 116-tag, and 118-tag, respectively. The labeled samples were mixed at equal amounts and fractionated by high-pH reverse-phase HPLC on a C18 column (Zorbax 300 Extend-C18, 5 μm particles, 4.6 mm ID × 250 mm, Agilent, Santa Clara, CA). The fractions were combined into pools, dried in a SpeedVac and analyzed by LC-MS/MS.
LC-MS/MS Data Acquisition-LC-MS/MS analysis was performed on an EASY-nLC 1000 UPLC system (Thermo Fisher) coupled online to an Orbitrap Fusion Tribrid mass spectrometer (Thermo Fisher). Peptides were desalted online by an in-house packed trap column (C18, 5 μm particles, 100 μm ID, 3 mm length, Dr. Maisch GmbH, Ammerbuch-Entringen, Germany). The trapped peptides were loaded onto an in-house packed reversed-phase C18 column (3 μm particles, 75 μm ID, 150 mm length, Dr. Maisch GmbH) and eluted at a flow rate of 300 nl/min with a gradient from solvent A (0.1% formic acid in water) to solvent B (0.1% formic acid in acetonitrile): 4% to 22% B over 65 min, 22% to 35% B over 15 min, 35% to 80% B over 5 min and 80% B over 5 min. The eluted peptides were subjected to analysis on an Orbitrap Fusion Tribrid mass spectrometer with an MS survey scan (m/z range 350-1800, 70,000 resolution, 3 × [...]
Data Analysis-The RAW mass spectrometry files were processed using MaxQuant (v.1.5.2.8) with an integrated Andromeda search engine. Tandem mass spectra were searched against the Uniprot S. islandicus REY15A database (2018/12/26, taxonomy ID: 930945, 2,631 sequences) concatenated with a reverse decoy database. The mass tolerance for precursor ions was set at 20 ppm in First search and 5 ppm in Main search, and the mass tolerance for fragment ions was set at 0.02 Da. Trypsin/P was specified as the cleavage enzyme, allowing up to 4 missed cleavages. The minimum peptide length was set at 7. Interference filters were employed by setting PIF at >0.75 to minimize co-isolation interference.
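To make the precursor tolerances above concrete, the following Python sketch converts a tolerance in ppm into an absolute m/z window. The function name `ppm_window` is hypothetical, and the sketch shows only the arithmetic of the unit conversion, not how the search engine applies the tolerance internally.

```python
def ppm_window(mz, ppm):
    """Return the (low, high) m/z bounds for a symmetric tolerance given in ppm."""
    # 1 ppm corresponds to one millionth of the measured m/z value.
    delta = mz * ppm / 1e6
    return (mz - delta, mz + delta)
```

For a precursor at m/z 1000, a 20 ppm tolerance corresponds to ±0.02 and a 5 ppm tolerance to ±0.005, so the window width scales with the m/z value, unlike the fixed 0.02 Da fragment tolerance.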
For RAW mass spectrometry files for the TMT-based quantitative acetylomic analysis, carbamidomethyl (C) and TMT-6plex (N terminus, K) were specified as fixed modifications, whereas oxidation (M) and acetylation (K) were specified as variable modifications. False discovery rate (FDR) was adjusted to <1% and the minimum score for modified peptides was set at >40. Localization probability was set at >0.75. Quantitative information for each peptide or lysine site was calculated according to TMT ratios. A differentially acetylated lysine peptide was identified using a 1.3- or 2-fold cutoff.
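The fold-change criterion for differential acetylation can be sketched in Python as follows. The function name `is_differential` is hypothetical. The sketch assumes that each ratio is mutant over parent, that every replicate must pass the cutoff, and that the cutoff is applied symmetrically (ratio >= cutoff or ratio <= 1/cutoff); the symmetric treatment is an assumption, as the text states only the magnitude of the cutoff.

```python
def is_differential(ratios, cutoff=1.3):
    """Return True if all replicate ratios change by at least `cutoff` in one direction.

    ratios: mutant/parent TMT ratios for one acetylated site, one value per replicate.
    """
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty sequence of positive numbers")
    # More acetylated in the mutant in every replicate.
    increased = all(r >= cutoff for r in ratios)
    # Less acetylated in the mutant in every replicate.
    decreased = all(r <= 1.0 / cutoff for r in ratios)
    return increased or decreased
```

Requiring agreement in direction across replicates means that a site with ratios of 1.4 and 0.5 would not be called, even though both values individually exceed a 1.3-fold change.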
For RAW mass spectrometry files for the TAILS-based N-terminal acetylomic analysis, carbamidomethyl (C) was set as a fixed modification, whereas acetylation (N terminus), tetradeutero-dimethyl (C2H2D4, 32.0564 Da) (K, N terminus) and oxidation (M) were set as variable modifications. FDR was adjusted to <1% and the minimum score for modified peptides was set at >40. Representative MS2 spectra of modified peptides (i.e. ones with the highest MaxQuant score for a given modified peptide) were manually inspected. MS2 spectra, in which fewer than four sequential matched fragment ions were present or most of the matched ions were at the level of background noise, were deleted. The extent of Nt-acetylation for each protein was calculated as the ratio of the intensity of Nt-acetylated peptides to that of the identified N-terminal peptides from that protein.
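The extent of Nt-acetylation defined above is a simple intensity ratio, which the following Python sketch illustrates. The function name `nt_acetylation_extent` and the input layout (a list of intensity and acetylation-flag pairs per protein) are hypothetical, and the sketch assumes that the denominator includes both acetylated and non-acetylated N-terminal peptides.

```python
def nt_acetylation_extent(peptides):
    """Return the fraction of N-terminal peptide intensity that is Nt-acetylated.

    peptides: list of (intensity, is_acetylated) tuples for one protein.
    Returns None when no N-terminal peptide intensity is available.
    """
    total = sum(intensity for intensity, _ in peptides)
    if total <= 0:
        return None
    acetylated = sum(intensity for intensity, is_ac in peptides if is_ac)
    return acetylated / total
```

A protein whose acetylated N-terminal peptide has three times the intensity of its unmodified counterpart would, under these assumptions, be assigned an extent of 0.75.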
For RAW mass spectrometry files for the TMT-based quantitative proteomic analyses, carbamidomethyl (C) and TMT-6plex (N terminus, K) were specified as fixed modifications, whereas oxidation (M) was specified as a variable modification. At least one unique peptide was required for the identification and quantification of a protein. A maximum FDR of 1% was employed for the identification of a protein. A differentially regulated protein was identified using a 1.5-fold cutoff and p value <0.05.
For RAW mass spectrometry files for the iTRAQ-based quantitative proteomic analyses, Mascot (v. 2.5.1) and Scaffold Q+ (v. 4.5.2) with the integrated Andromeda search engine were used. The parameters were set as those for the TMT-based quantitative proteomic analyses except that Methylthio (C) and iTRAQ-8plex (N terminus, K) were set as fixed modifications.

S. islandicus Defective in SisPat is Viable and Similar to the Parental Strain in Growth-SisPat is the only protein lysine acetyltransferase from Archaea that has been shown to be capable of protein acetylation in vitro (48). Sequence alignment reveals that SisPat shares significant similarity (up to ~27%) at the amino acid sequence level to the GNAT (GCN5-related N-acetyltransferase) domain of known protein lysine acetyltransferases from Bacteria (e.g. S. enterica Pat, R. palustris Pat, R. palustris KatA, E. coli Pka, M. tuberculosis PatA, and M. smegmatis PatA) (supplemental Fig. S1) (1). It was previously reported that Sso10b, a member of the Sac10b protein family, was acetylated in vitro at K16 by SsoPat, a SisPat homolog from S. solfataricus (48). In fact, Sso10b was the only proposed substrate for Sulfolobus Pat for a long time. In a recent study, however, K16 from neither Sso10b nor Sis10b was found to be acetylated in vivo (51). To determine the bona fide substrates of SisPat, we constructed a Pat deletion mutant in S. islandicus (ΔSisPat) using a CRISPR-based gene deletion approach (56). The successful deletion of the SisPat gene in the mutant strain, as verified by PCR and immunoblotting (supplemental Fig. S2), indicates that SisPat was dispensable for the growth of the cell.
To determine if the lack of SisPat would affect the growth of S. islandicus, we grew ΔSisPat, the parental strain S. islandicus E233S and a complementary strain, which was constructed by introducing a plasmid overproducing SisPat (pSeSD-SisPat) into ΔSisPat, in either the nutrient-poor SCVy medium or the nutrient-rich TYS medium at 75°C (55). All three strains grew more slowly and reached lower maximum cell densities in SCVy medium than in TYS medium (Fig. 1A and 1B). The growth rates of ΔSisPat and the parental strain were similar in both media. These results indicate that the absence of SisPat did not significantly affect the growth of S. islandicus under the growth conditions used in the present study.
We also conducted iTRAQ-based quantitative proteomic analysis of ΔSisPat and the parental strain. A total of 1586 proteins, accounting for 60% of the proteins encoded by the genome (2631), were identified (supplemental Table S2). Among 1543 quantifiable proteins, only four proteins were differentially regulated by >1.5 fold in ΔSisPat, in agreement with the growth phenotype of the mutant (Fig. 1C).
Sites of Lysine Nε-acetylation Are Widespread in S. islandicus-To investigate the effect of the deletion of SisPat on the pattern and level of global lysine Nε-acetylation, exponentially grown cells of the parental strain and ΔSisPat were harvested and lysed. Total cellular proteins were digested with trypsin. Acetyllysine-containing peptides were enriched by using the anti-acetyllysine antibody beads and subjected to analysis by LC-MS/MS. A total of 1,708 nonredundant lysine Nε-acetylation sites from 684 proteins, accounting for 26% of the theoretical total of the proteins encoded by the genome (2631), were identified (supplemental Table S3). Interestingly, a total of 93 propionylated peptides were found in 81 proteins (supplemental Table S4), providing the first evidence that protein lysine propionylation also occurs in Archaea, in addition to Eukarya and Bacteria (64,65). The acetylated proteins contained between 1 and 22 acetylated lysine residues (supplemental Fig. S3). Approximately 96% of the acetylated proteins were acetylated at no more than five lysine sites, and acetylated proteins with a single site of acetylation were the most abundant among the acetylated proteins (43%). To determine the sequence preference of lysine acetylation, we examined the sequences upstream and downstream of each acetylated residue. Eleven conserved amino acid sequence motifs spanning from position -10 to +10 with respect to an acetylated lysine were identified from 1174 peptides (supplemental Fig. S4A). Among these motifs, K, R, N, and S were most frequently found next to the acetylated lysine residue, with frequencies of about 17, 13, 12, and 10%, respectively, of all identified acetylated peptides (supplemental Fig. S4B). However, the protein secondary structure affiliation of acetylated lysine residues was like that of unacetylated lysine residues (supplemental Fig. S5A).
Moreover, the probability of acetylation of a lysine residue was not significantly affected by its surface accessibility because 39% of the acetylated lysine sites, as compared with 41% of the unacetylated lysine residues, were surface exposed (supplemental Fig. S5B). These observations presumably suggest that protein acetylation occurred cotranslationally.
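The sequence windows spanning positions -10 to +10 around each acetylated lysine, on which the motif analysis above is based, can be extracted as in the following Python sketch. The function name `lysine_window`, the 1-based site numbering and the "-" padding character for windows that run past a protein terminus are all assumptions made for illustration; the motif-finding step itself is not shown.

```python
def lysine_window(protein_seq, site, flank=10, pad="-"):
    """Return the residues from -flank to +flank around an acetylated lysine.

    site: 1-based position of the lysine in protein_seq.
    Windows that extend past either terminus are padded with `pad`.
    """
    idx = site - 1
    if not (0 <= idx < len(protein_seq)) or protein_seq[idx] != "K":
        raise ValueError("site must point to a lysine (K) within the sequence")
    left = protein_seq[max(0, idx - flank):idx]
    right = protein_seq[idx + 1:idx + 1 + flank]
    # Pad so that the lysine always sits at the center of the window.
    return left.rjust(flank, pad) + "K" + right.ljust(flank, pad)
```

With the default flank of 10, every window is 21 residues long with the modified lysine at the center, which keeps positions comparable across sites when residue frequencies are tallied.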
An arCOG category analysis suggested that the N-acetylated proteins were significantly enriched in metabolism (supplemental Fig. S6). In addition, the modified proteins were enriched in ribosomal structure and biogenesis (arCOG J), transcription (arCOG K), replication, recombination and repair (arCOG L) and cell wall/membrane/envelope biogenesis (arCOG M). In Bacteria, most of the enzymes in the tricarboxylic acid (TCA) cycle have been shown to be acetylated at conserved lysine residues involved in substrate or cofactor binding (66). We found that isocitrate dehydrogenase (SiRe2695) and succinate-CoA ligase (SiRe0277) were acetylated at their respective critical lysine residues (K227 and K38) in S. islandicus, raising the possibility that the functional role of lysine acetylation in the TCA cycle is conserved between Bacteria and Archaea.
SisPat Specifically Acetylates a Very Small Subset of Proteins In Vivo-Of the 1708 lysine acetylation sites, 1503 sites from 644 proteins were quantifiable. Strikingly, among these acetylated sites, only 24 sites from 23 proteins were differentially modified by >1.3 fold (supplemental Table S3). All but one of these sites (i.e. a lysine residue in a YHS domain protein) were modified to lower extents in the mutant than in the parental strain (supplemental Table S3). Among the 24 sites, ten from ten proteins exhibited a >2 fold difference in acetylation between the two strains, and they were all less acetylated in ΔSisPat than in the parent (Table I, supplemental Table S3). Quantitative proteomics based on iTRAQ labeling revealed that each of these differentially acetylated proteins was present at similar levels in the mutant and parental strains, excluding the possibility that the observed differential acetylation of these proteins was due to different intracellular concentrations in the two strains (supplemental Table S2). Moreover, although several lysine acetylation sites were identified in these proteins (e.g. SiRe2451, SiRe0686, SiRe2327, SiRe0317, SiRe1580, and SiRe0435), only a single site in each protein was differentially acetylated (Table I), further indicating that these proteins were differentially acetylated in the two strains and that the differential acetylation occurred at specific lysine residues.
Table I footnotes: a, Lysine residues of differential acetylation in the parental and mutant strains are indicated by a star (*). b, Locations of lysine residues which were acetylated to similar extents in both ΔSisPat and the parental strain.
At the top of the list of the differentially acetylated proteins is a group of six acyl-CoA synthetases of presumably different substrate specificity (4- to 14-fold difference). Acyl-CoA synthetases are known to function in both anabolic and catabolic pathways by converting fatty acids to acyl-CoA derivatives (17). As shown by sequence alignment, a conserved motif (PX4GK) at the site of lysine acetylation exists in the six acyl-CoA synthetases (Fig. 2A and 2B). This motif resembles those found in bacterial acyl-CoA synthetases acetylated by Pat (Fig. 2C). Acetylation of agmatinase (SiRe2355, K158) was also significantly reduced (~7 fold) in the mutant strain. Agmatinase is one of the vital enzymes in the biosynthesis of polyamines (e.g. putrescine, spermidine, and spermine) from arginine (67-69). Polyamines, which exist in abundance in Sulfolobus species, are known to serve a role in the thermal protection of nucleic acids and to facilitate protein synthesis (70). Other enzymes that were significantly less acetylated in ΔSisPat than in the parental strain included ornithine cyclodeaminase (SiRe1622, K158, ~3 fold), citryl-CoA lyase (SiRe0435, K59, ~2 fold) and nitrilase (SiRe1580, K187, ~2 fold). Ornithine cyclodeaminase catalyzes the conversion of L-ornithine to L-proline, participating in arginine and proline biosynthesis (71). Citryl-CoA lyase cleaves citryl-CoA into acetyl-CoA and oxaloacetate, functioning in the regulation of the cellular concentration of acetyl-CoA (72,73). Nitrilase converts nitriles into the corresponding carboxylic acid and ammonia, serving roles in nutrient assimilation and detoxification of nitriles (74). The sole protein that was acetylated to a slightly higher level in ΔSisPat than in the parental strain contains the YHS domain (SiRe2555), which is found in copper-transporting ATPases, phenol hydroxylases and some membrane proteins. It is worth noting that the conserved motif shared by the six acyl-CoA synthetases is not found at the site of acetylation in the other differentially acetylated proteins.
As revealed by our sequence search, S. islandicus REY15A encodes a total of 15 putative acyl-CoA synthetases. Nine of them were not identified in our quantitative acetylomic analysis (Fig. 3B). All of the 15 enzymes except SiRe1630 contain the conserved motif PX4GK for N-acetylation by SisPat. To test whether SisPat could acetylate these acyl-CoA synthetases, especially the nine enzymes unidentified in the acetylomic analysis, we overproduced and purified recombinant SiRe0305, SiRe0317, SiRe2355, SiRe0033, SiRe0711, SiRe2338, and SiRe1630, and performed an in vitro protein acetylation assay on them. As shown in Fig. 3A, a protein was significantly acetylated in the assay only when it contained the conserved acetylation site: the two acyl-CoA synthetases SiRe0317 and SiRe0305, both of which were preferred substrates for SisPat in vivo, were not detectably acetylated by the acetyltransferase when a point mutation was introduced at the conserved site of acetylation (i.e. K544A in SiRe0317 and K534A in SiRe0305).
In a separate experiment, MS analysis showed that recombinant SiRe0305 and SiRe0317 incubated with SisPat in the in vitro acetylation assay were acetylated exclusively at the conserved lysine site (supplemental Fig. S7). Therefore, the conserved lysine residue in acyl-CoA synthetases was the specific site of acetylation by SisPat. As expected, all of the tested putative acyl-CoA synthetases except SiRe1630 were acetylated by SisPat (Fig. 3C). In comparison, agmatinase was not as good a substrate as the acyl-CoA synthetases because it was less efficiently acetylated than the latter in the assay. Taken together, our data indicate that SisPat specifically acetylates a small subset of proteins, including acyl-CoA synthetases in particular, in S. islandicus cells.
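The conserved PX4GK motif (a proline, any four residues, a glycine, then the acetylated lysine) lends itself to a simple pattern search. The following is a minimal sketch, not the sequence search used in the study; the function name and the toy sequences in the usage note are hypothetical.

```python
import re

# PX4GK: P, four arbitrary residues, G, then the lysine that is acetylated.
# A lookahead is used so that overlapping occurrences are all reported.
PX4GK = re.compile(r"(?=P[A-Z]{4}GK)")


def px4gk_lysine_positions(sequence: str) -> list[int]:
    """Return the 1-based positions of the lysine closing each PX4GK match."""
    # m.start() is the 0-based index of the proline; the lysine sits six
    # residues downstream, so its 1-based position is m.start() + 7.
    return [m.start() + 7 for m in PX4GK.finditer(sequence)]
```

On a made-up sequence such as `"AAPLMVAGKTT"` the function reports position 9. Replacing that lysine with alanine (`"AAPLMVAGATT"`), by analogy with the K544A and K534A mutants, yields an empty list. A motif match only flags a candidate site; whether it is acetylated by SisPat still has to be tested experimentally.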
Ard1 Is Required for the Optimal Growth of S. islandicus-Sulfolobus proteins were also likely acetylated at the N terminus by the N-terminal acetyltransferase Ard1 (52). Recently, we successfully constructed an Ard1 deletion mutant in S. islandicus (ΔSisArd1) by using a homologous recombination-based strategy (75), suggesting that the protein was not essential for the survival of the cell (51). We noticed that, during the mutant construction, counter-selection for the deletion of the pyrEF marker in the presence of 5-FOA and uracil resulted in the formation of two types of colonies of drastically different sizes on the plate (Fig. 4A). As revealed by PCR, the larger colonies contained the sisard1 gene as well as a spontaneous mutation in pyrEF, whereas the smaller ones lacked the sisard1 gene (data not shown). It appears that ΔSisArd1 was not able to grow as well as the parental strain. To further determine the effect of the deletion of sisard1 on the growth of the cell, we grew ΔSisArd1, the parental strain and a complementary strain, which was prepared by introducing a plasmid encoding SisArd1 into ΔSisArd1 (51), in Zillig's liquid medium. ΔSisArd1 grew significantly more slowly, but reached a slightly higher maximum cell density, than the parental strain (Fig. 4B). On the other hand, the complementary strain grew more rapidly and attained a higher maximum cell density than the parent, presumably because the cellular content of SisArd1 in the complementary strain was about 10-fold higher than that in the parental strain (51). Therefore, we conclude that SisArd1 is required for the optimal growth of S. islandicus.
FIG. 2. Motif analysis of lysine acetylation sites differentially acetylated in the parental strain and ΔSisPat. A, Ten sites that were most drastically less acetylated in ΔSisPat than in the parental strain. Lysine residues of acetylation and their flanking sequences are aligned. Six acyl-CoA synthetases are framed by a red square. The acetylated lysine residues are marked by an arrow. B, A MEME motif generated from the peptides shown in Fig. 2A. The height of each letter corresponds to the frequency of the amino acid residue that the letter represents at that position. The lysine residue of acetylation is indicated by an arrow. C, The conserved motif of acyl-CoA synthetases acetylated by Pat from R. palustris. The lysine residue of acetylation is indicated by an arrow.
Deletion of the SisArd1 Gene Affects the Cellular Levels of a Number of Proteins-To gain mechanistic clues into the growth phenotype of ΔSisArd1, we then conducted a TMT-based quantitative proteomic analysis on both the mutant and parental strains. A total of 1618 proteins, accounting for ~61% of the proteins encoded by the genome (2631), were confidently identified (supplemental Table S5). Among them, 1518 proteins were quantifiable. When a cutoff was set at a fold change of ≥1.50, 130 and 143 proteins were found to be up- and downregulated, respectively, accounting in combination for ~10% of the total proteins (Fig. 5, supplemental Table S5).
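The fold-change cutoff described above can be sketched as a simple classification of mutant/parent abundance ratios. This is an illustrative sketch only: the function name and protein identifiers are hypothetical, and treating "downregulated" as a ratio of at most 1/cutoff is an assumption, since the text states only the cutoff value.

```python
def classify_by_fold_change(ratios: dict[str, float], cutoff: float = 1.5):
    """Split proteins into up-, down- and unchanged groups.

    `ratios` maps a protein ID to its mutant/parent abundance ratio.
    Assumption: the cutoff is applied symmetrically, so a protein counts as
    downregulated when its ratio is <= 1 / cutoff.
    """
    up, down, unchanged = [], [], []
    for protein, ratio in ratios.items():
        if ratio <= 0:
            raise ValueError(f"ratio for {protein} must be positive")
        if ratio >= cutoff:
            up.append(protein)
        elif ratio <= 1 / cutoff:
            down.append(protein)
        else:
            unchanged.append(protein)
    return up, down, unchanged


# Share of the 2631 genome-encoded proteins that changed, from the counts
# reported in the text (130 up, 143 down).
changed_share_percent = 100 * (130 + 143) / 2631
```

With the reported counts, `changed_share_percent` rounds to 10.4, in line with the ~10% stated in the text.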
As suggested by an arCOG analysis (76,77), the differentially regulated proteins were distributed in nearly all arCOG categories (supplemental Fig. S8, supplemental Table S5). Intriguingly, five downregulated proteins, including the cell division proteins CdvB (SiRe1174), CdvC (SiRe1175), CdvB1 (SiRe1550) and two chromosome partitioning-like proteins (SiRe0265 and SiRe0132), are involved in cell cycle control (Table II). CdvB, CdvC, and CdvB1 are components of the cell division machinery CdvABC, which forms a composite band contracting concomitantly with septum formation in Sulfolobus (78-80). Notably, it was reported that a reduction in the level of CdvB1 caused growth retardation in S. islandicus (81), and the deletion of the CdvB gene was lethal in S. acidocaldarius (82). Therefore, it is tempting to suggest that downregulation of the Cdv proteins contributed to the slow growth of ΔSisArd1. Among the three replication initiator proteins (i.e. Orc1-1, Orc1-3 and Whip), only Orc1-3 (SiRe0002) was identified in our analysis. The level of Orc1-3 was downregulated by ~7 fold in ΔSisArd1. The replicative DNA polymerase B1 (SiRe1451) was also downregulated. Downregulated proteins also included the DNA repair proteins NurA (SiRe0061), Mre11 (SiRe0063) and HerA (SiRe0064), which are encoded by a single operon, and the nucleotide excision repair proteins Xpb1 (SiRe1129) and Xpf (SiRe1280). Purine nucleotide synthesis was probably reduced because five proteins (i.e. SiRe1376, SiRe1377, SiRe1378, SiRe1379, and SiRe1381) in a single operon were all downregulated in the mutant. On the other hand, all the CRISPR-associated proteins identified in the proteome were upregulated (Table II, supplemental Table S5). These represent most of the proteins in the three CRISPR systems (i.e. CRISPR IA, IIIB Cmr-α and IIIB Cmr-β) of this organism (83).
CRISPR IIIB Cmr-β proteins SiRe0598, SiRe0600, SiRe0601, and SiRe0603, and CRISPR IIIB Cmr-α proteins SiRe0890, SiRe0891, SiRe0892, SiRe0893, SiRe0894, and SiRe0895 were upregulated by 1.5 to 2 fold, whereas CRISPR IA proteins SiRe0765, SiRe0766, SiRe0767, SiRe0768, and SiRe0771 were slightly upregulated (<1.5 fold) (supplemental Table S5). The consequence of the upregulation of the CRISPR systems remains to be elucidated, but it may be related to the nonimmunity functions (e.g. regulation of gene expression, genome remodeling, DNA repair and cell dormancy) of these systems (84). Most of the proteins involved in carbohydrate transport and metabolism (34 out of 47 proteins, arCOG G) and energy production and conversion (eight out of nine proteins, arCOG R) were also upregulated (supplemental Fig. S8), in agreement with the observation that the mutant was able to grow to a higher cell density than the parental strain. It is worth noting that eight transcriptional regulators were differentially regulated, of which seven were downregulated and one, a PadR family protein, was upregulated. It is possible that differential regulation of some of the proteins resulted from the change in the cellular levels of these transcriptional regulators.
About Half of the Identified Proteins Are Acetylated at the N Terminus in S. islandicus-To investigate the effect of SisArd1 on protein Nt-acetylation in vivo, we determined the Nt-acetylated proteins from both ΔSisArd1 and the parental strain by using the TAILS technique (61). Each strain was grown in replicate to the exponential phase, and the proteins were isolated. Primary free amines were tetradeutero-dimethylated. Following trypsin digestion, the N-terminal peptides were enriched and subjected to LC-MS/MS. We were able to obtain 359 unique N-terminal peptides from the parental sample (supplemental Table S6). About 49% (177/359) of the identified proteins lacked the initiator methionine residue (iMet). The proportion of the proteins undergoing N-terminal methionine cleavage in S. islandicus appears to be comparable to those reported previously for other organisms (55 to 70%) (33, 39-41, 85). As observed in bacteria, eukaryotes and haloarchaea, initiator methionine cleavage occurred only in proteins containing an amino acid residue with a small side chain (i.e. G, A, V, S, T) at the second position in S. islandicus (Fig. 6A) (39,85). In proteins whose second residue had a large side chain (i.e. E, Q, D, N, I, L, Y, M, K, F), iMet was generally retained (Fig. 6A). About 21% of the identified N-terminal peptides were fully acetylated, 23% were partially acetylated, and the remaining 56% were unacetylated at the N terminus (Fig. 6B, supplemental Table S6). As shown in Fig. 6B, all MM-starting proteins were Nt-acetylated. Nearly all (71/74) of the proteins containing Ser at the second position were Nt-acetylated, accounting for 45% of the Nt-acetylome in S. islandicus. Nt-acetylation also frequently occurred on proteins with an A, G, ME or MQ at the N terminus. Proteins with a T, MI, MY or ML at the N terminus were occasionally Nt-acetylated. Nt-acetylation was hardly detected on proteins with other N termini.
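The relationship between the second residue and iMet removal reported above can be written as a small rule-of-thumb check. This is a sketch under stated assumptions: the function names and example sequences are hypothetical, and because the text says cleavage occurred only with small second residues (not that it always occurs), the result indicates whether cleavage is possible rather than certain.

```python
# Second-position residues with small side chains, as listed in the text.
SMALL_SECOND_RESIDUES = frozenset("GAVST")


def imet_cleavage_possible(protein_sequence: str) -> bool:
    """Return True if the second residue permits initiator Met removal."""
    if len(protein_sequence) < 2 or protein_sequence[0] != "M":
        raise ValueError("expected a sequence of at least two residues starting with M")
    return protein_sequence[1] in SMALL_SECOND_RESIDUES


def n_terminus_label(protein_sequence: str) -> str:
    """Label the N terminus in the style used in the text.

    Assumes iMet is removed whenever removal is possible, giving labels
    such as 'S' (iMet removed) or 'ME' (iMet retained).
    """
    second = protein_sequence[1]
    return second if imet_cleavage_possible(protein_sequence) else "M" + second
```

For example, a protein beginning with `MS...` is labelled `S`, whereas proteins beginning with `ME...` or `MK...` are labelled `ME` and `MK`. These labels correspond to the N-terminal classes used when reporting Nt-acetylation frequencies.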
SisArd1 Is a Major Nt-acetyltransferase With Broad Substrate Specificity in S. islandicus-As revealed by comparing the Nt-acetylation data for ΔSisArd1 with those for the parental strain, the number of Nt-acetylated proteins in the mutant strain was substantially lower than that in the parental strain. Only 11% (36/339) of the identified proteins were Nt-acetylated in ΔSisArd1, whereas 44% (158/359) of the proteins were Nt-acetylated in the parental strain (Fig. 6B, 6C and supplemental Table S6). We subsequently compared the proteins identified in both the parental and the mutant samples (supplemental Table S7). A total of 286 proteins were identified in both samples, and 129 of them were Nt-acetylated in the parental strain. Of these Nt-acetylated proteins, 119 were unacetylated (96) or significantly less frequently acetylated (23) at the N terminus in the mutant strain. Therefore, SisArd1 was responsible for the modification of the vast majority (92%) of the acetylated N termini identified in S. islandicus (supplemental Table S7). The 119 target proteins of SisArd1 contained 16 unique N termini, which fall into two groups based on the presence of iMet. One group included S, A, G, T and, infrequently, V and E, and the other contained MD, ME, MN, MQ, MI, MF, ML, MY, MA and MM at the N terminus (Fig. 6D and supplemental Table S7). The spectrum of N termini that SisArd1 acetylated covers nearly all specific target sites for each of the six human N-terminal acetyltransferases (NatA to NatF), supporting the notion that archaeal Ard1 is an evolutionary precursor of extant eukaryotic NATs. It is worth noting that there was significant residual protein acetylation at the N terminus (i.e. 36 Nt-acetylated proteins) in ΔSisArd1, strongly indicating the presence of additional N-terminal acetyltransferases in S. islandicus (supplemental Table S7).
Ten of the identified proteins, including three MM-starting proteins and one MK-starting protein, were Nt-acetylated to the same extent in ΔSisArd1 as in the parental strain. Most of the acetylated N termini in the mutant strain possessed the target N-terminal amino acid residues for SisArd1, suggesting that there exist unknown N-terminal acetyltransferase(s) with substrate specificity overlapping with that of SisArd1. Our observation is consistent with the identification of several putative acetyltransferases in the organism by sequence searches.
FIG. 6. N-terminal peptides identified in the S. islandicus parental strain and ΔSisArd1. A, Distribution of the second amino acid residues in the identified N-terminal peptides in the parental strain. Peptides with iMet (blue), without iMet (red) and with both types of the N terminus (green) are shown. B, Frequencies of Nt-acetylation on various N termini in the parental strain. C, Frequencies of Nt-acetylation on various N termini in ΔSisArd1. Nt-acetylated peptides (red), partially Nt-acetylated peptides (yellow) and Nt-unacetylated peptides (blue) are indicated. An N terminus for which ≥3 proteins were identified in the parental or mutant strain is shown. D, Proteins differentially Nt-acetylated in ΔSisArd1 and the parental strain. Peptides Nt-acetylated in the parental strain but unacetylated or less acetylated in ΔSisArd1 are shown in green and cyan, respectively. All termini identified in this study are listed in supplemental Tables S6 and S7.
DISCUSSION
In the present study, we have determined the extent and pattern of protein acetylation in S. islandicus and provided the first report on the substrate specificity and potential roles of the two highly conserved archaeal protein acetyltransferases SisPat and SisArd1 in vivo. S. islandicus proteins were acetylated at numerous internal lysine residues and at the N terminus. Proteins containing N-acetylated lysine residues accounted for as many as ~26% of the total proteins in the organism. However, it is worth noting that these residues were probably acetylated at a very low level because only ~0.15% of the peptides identified were acetylated in the quantitative proteomes of S. islandicus (data not shown). Interestingly, SisPat was responsible for the acetylation of only a very small subset (24 out of 1503 quantifiable lysine sites, as judged by using a cutoff of >1.3 fold difference in modification between the parental strain and ΔSisPat) of the acetylated proteins. The vast majority of acetyllysines presumably resulted from the action of other protein acetyltransferases or nonenzymatic processes. Sequence analysis suggests that S. islandicus encodes six putative acetyltransferases in addition to SisPat and SisArd1 (data not shown). The possibility exists that some of these enzymes are involved in the acetylation of the acetylated proteins. Equally or more likely, many of the identified sites of acetylation were acetylated nonenzymatically. It was reported recently that acetyl phosphate (acP), a high-energy intermediate of the phosphotransacetylase/acetate kinase (Pta/AckA) pathway, directly donated its acetyl group to the ε-amino group of a deprotonated lysine in bacteria (10,86). However, because of the absence of the Pta/AckA pathway, nonenzymatic protein acetylation via the acP reaction may not occur in Sulfolobus. Therefore, it is more likely that acetyl-CoA, another highly reactive reaction intermediate shown to be potentially capable of mediating nonenzymatic protein lysine acetylation (87-91), plays a role in nonenzymatic acetylation of the proteins in S. islandicus.
It was believed for a long time that Sulfolobus Pat acetylated Sso10b at K16, as implicated by the reports that the lysine residue existed in an acetylated form in vivo and the protein was acetylated by Sulfolobus Pat in vitro (46,48). However, both Sso10b and Sis10b were found to be methylated, and not acetylated, at K16 in the cell (51). On the other hand, K64 and K68 of Sis10b were occasionally acetylated in vivo. In agreement with the previous finding, only K64 and K68, but not K16, were acetylated in the present lysine acetylomic analysis, and acetylation of the two residues was independent of the presence of SisPat. Apparently, K64 and K68 were not acetylated by SisPat, and Sis10b was not a substrate of the acetyltransferase in vivo. Notable among the bona fide substrates of SisPat was a group of acyl-CoA synthetases containing a conserved site of acetylation by SisPat. S. islandicus encodes a total of 15 putative acyl-CoA synthetases with different substrate specificity. All but one carry the conserved site of acetylation. Six of the enzymes were identified as among the most preferred substrates of SisPat in the acetylome, and the remaining enzymes, except for the one lacking the conserved acetylation site, were also likely substrates of SisPat, as demonstrated by the in vitro acetylation assays. Therefore, our data provide conclusive experimental evidence that the regulatory mechanism involving the acetylation of acyl-CoA synthetases is conserved among all three domains of life.
Acyl-CoA synthetases serve a central role in controlling CoA homeostasis and thus maintaining metabolic balance in the cell by activating acetate and other weak organic acids to acetyl-CoA and other corresponding CoA thioesters, which participate in diverse anabolic and catabolic processes (17,18,23). In enteric bacteria, such as E. coli and S. enterica, acetate is activated into acetyl-CoA via either the low-affinity Pta/AckA pathway or the high-affinity ACS pathway in response to variation in the level of acetate in the habitats (18). However, Sulfolobus species encode neither Pta nor AckA, presumably activating acetate only through the ACS pathway.
A global analysis of protein lysine acetylation has recently been reported in Haloferax mediterranei, a halophilic archaeon (43). As observed in S. islandicus, many proteins (i.e. 17% of the total proteins) are potentially acetylated at internal lysine residues in H. mediterranei. Metabolic proteins represent the largest functional group of acetylated proteins, accounting for 47 and 41% of the lysine acetylomes of S. islandicus and H. mediterranei, respectively. In addition, ribosomal proteins and aminoacyl-tRNA synthetases were frequently modified in the two species. Analysis of the sequence context at the site of acetylation reveals that positively charged amino acid residues (i.e. Lys and Arg) are enriched at the +1 and +2 positions, in relation to the residue of acetylation, in the acetylomes of the two species. In comparison, Tyr is significantly enriched at the +1 position whereas Lys or Arg is always located downstream of the +2 position in the Mycobacterium, Drosophila and human acetylomes (12,13,92,93). H. mediterranei encodes twelve acyl-CoA synthetases. Our analysis of the H. mediterranei acetylome has identified acetylated peptides from six of them. Notably, the lysine residue of the conserved motif PX4GK in these proteins, which was identical to that in S. islandicus, was acetylated. Sequence searches show that H. mediterranei encodes eleven GCN5 family acetyltransferases, and two of them (i.e. HFX_1840 and HFX_1886) share significant sequence homology with SisPat (27% for both). Therefore, we suggest that a SisPat homolog catalyzes the acetylation of acyl-CoA synthetases in H. mediterranei. We infer further that the function of SisPat homologs is conserved among archaea because acyl-CoA synthetases with the conserved lysine residue of acetylation are widespread in nearly all archaeal phyla.
Nt-acetylation was once considered a rare event in Bacteria and Archaea (4). Nonetheless, an increasing number of bacteria and archaea have recently been found to undergo Nt-acetylation (33, 34, 39-41). In archaea, the extent of Nt-acetylation varies among species, e.g. 14% to 19% in Halobacterium salinarum and Natronomonas pharaonis (39), 29% in H. volcanii (41), and rare in methanogens (94,95). In a recent proteomic study, 127 proteins were found to be acetylated at the N terminus in S. islandicus (42). In this study, we identified the N-terminal peptides from 359 unique proteins in S. islandicus by using the terminal amine isotopic labeling of substrates (TAILS) technique. Notably, a significant fraction of the proteins had undergone N-terminal maturation with the initiator methionine removed (~49% of the identified proteins) and the N terminus acetylated (~44% of the identified proteins). Although the fraction of Nt-acetylated proteins in S. islandicus is substantially higher than those in other archaeal species that have been studied, it is lower than those in human (~80%) and yeast (~60%) (29).
A comparison of Nt-acetylated and Nt-unacetylated proteins from ΔSisArd1 and the parental strain reveals that SisArd1 is the major Nt-acetyltransferase in S. islandicus because it was responsible for the Nt-acetylation of the majority (92%) of the proteins identified and the pattern of N termini acetylated by the enzyme resembles that observed in the Nt-acetylome of the organism. Interestingly, SisArd1 could Nt-acetylate 16 different N termini, with the exception of MK-starting termini, spanning the target N termini of all human NATs (NatA to NatF) in vivo. Therefore, at least one N-terminal acetyltransferase remains to be identified, which should be able to catalyze efficiently the Nt-acetylation of the MM- and MK-starting termini. iMet-cleaved proteins with N-terminal S, A, G or T, the targets of human NatA, and iMet-retaining proteins with N-terminal ME or MQ, the targets of human NatB, were preferentially Nt-acetylated by SisArd1. However, the pattern of Nt-acetylation frequencies for various N termini in S. islandicus differed from that in human. Over 95% of MN, MD, ME, or MQ, and >54% of ML, MI, MF, MY or MK, the target termini of NatB and NatC/E/F, respectively, were Nt-acetylated in human (2), but these termini were rarely acetylated in S. islandicus. This difference may contribute to the lower overall N-terminal acetylation in S. islandicus (~44%) than in human (~80%) (29). SisArd1 is most closely related to the catalytic subunit Ard1/Naa10 of NatA at the amino acid sequence level. However, structural studies reveal that the active site of Ard1 from S. solfataricus represents a hybrid of the active sites of NatA and NatE (53). Therefore, the ability of SisArd1 to modify both iMet-lacking and iMet-retaining N termini results from the use by the enzyme of a hybridized strategy, which permits the enzyme to facilitate the acetylation of distinct substrates through different catalytic mechanisms (53).
The availability of the mutant strains ΔSisPat and ΔSisArd1 permits a look into the functions of SisPat and SisArd1, the only two biochemically characterized protein acetyltransferases in Archaea. Neither enzyme was essential for the growth of the organism. The lack of SisPat did not hinder the growth of the mutant strain, whereas SisArd1 was required for the optimal growth of the organism. The slower growth of ΔSisArd1 than that of the parental strain is consistent with the reduced expression of genes encoding proteins functioning in cell division, cell cycle control, purine synthesis and DNA replication in the mutant strain, as revealed by quantitative proteome analysis. Nt-acetylation also affected a number of other cellular processes, such as DNA repair and CRISPR-related activities. It has been reported that Nt-acetylation is involved in the control of the cellular level of proteins such as the 20S proteasome α1 protein from H. volcanii and small nucleic acid-binding proteins from Sulfolobus because the lack of Nt-acetylation resulted in an increase in the intracellular concentration of these proteins (51,54). We show in this study that 273 proteins were differentially regulated (130 upregulated and 143 downregulated) by >1.5 fold in ΔSisArd1, as compared with those in the parental strain. Only 20 of these proteins, including twelve upregulated and eight downregulated ones, were Nt-acetylated in the parental strain but not or less Nt-acetylated in ΔSisArd1. In addition, 18 quantifiable proteins were Nt-acetylated only in the parental strain and were not differentially regulated by >1.5 fold in ΔSisArd1. It appears that, although Nt-acetylation affected the cellular levels of a significant fraction (~10%) of proteins in S. islandicus, the Ac/N-end rule is not generally applicable, as expected from the fact that fewer than ten proteins have so far been shown to obey the Ac/N-end rule (2,31).
Of possible relevance to the control of the cellular level of the proteins is the finding that eight transcription factors were differentially regulated in ΔSisArd1. Although none of these transcription factors were identified in our Nt-acetylomic analysis, the possibility exists that the synthesis of these factors was affected directly or indirectly by Nt-acetylation. Changes in the expression of the transcription factors might be responsible for the changes in the cellular level of the proteins in ΔSisArd1. A better understanding of the role of Nt-acetylation in the control of cellular protein concentrations awaits further investigation.