‡State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, P. R. China§College of Life Science, University of Chinese Academy of Sciences, Beijing, P. R. China
* This work was supported by National Natural Science Foundation of China grants 31730001, 31470175, and 31130003 to L. H. [S] This article contains supplemental Figures and Tables. ** These authors contributed equally to this work.
Proteins undergo acetylation at the Nε-amino group of lysine residues and the Nα-amino group of the N terminus in Archaea as in Bacteria and Eukarya. However, the extent, pattern and roles of the modifications in Archaea remain poorly understood. Here we report the proteomic analyses of a wild-type Sulfolobus islandicus strain and its mutant derivative strains lacking either a homolog of the protein acetyltransferase Pat (ΔSisPat) or a homolog of the Nt-acetyltransferase Ard1 (ΔSisArd1). A total of 1708 Nε-acetylated lysine residues in 684 proteins (26% of the total proteins), and 158 Nt-acetylated proteins (44% of the identified proteins) were found in S. islandicus. ΔSisArd1 grew more slowly than the parental strain, whereas ΔSisPat showed no significant growth defects. Only 24 out of the 1503 quantifiable Nε-acetylated lysine residues were differentially acetylated, and all but one of the 24 residues were less acetylated by >1.3 fold in ΔSisPat than in the parental strain, indicating the narrow substrate specificity of the enzyme. Six acyl-CoA synthetases were the preferred substrates of SisPat in vivo, suggesting that Nε-acetylation by the acetyltransferase is involved in maintaining metabolic balance in the cell. Acetylation of acyl-CoA synthetases by SisPat occurred at a sequence motif conserved among all three domains of life. On the other hand, 92% of the acetylated N termini identified were acetylated by SisArd1 in the cell. The enzyme exhibited broad substrate specificity and could modify nearly all types of the target N termini of human NatA-NatF. The deletion of the SisArd1 gene altered the cellular levels of 18% of the quantifiable proteins (1518) by >1.5 fold. Consistent with the growth phenotype of ΔSisArd1, the cellular levels of proteins involved in cell division and cell cycle control, DNA replication, and purine synthesis were significantly lower in the mutant than in the parental strain.
, in which the acetyl group from acetyl coenzyme A (Ac-CoA) is transferred to the N terminus on the α-amino group (Nα-acetylation or Nt-acetylation) or the ε-amino moiety of lysyl side chains (Nε-acetylation or lysine acetylation) (
). Nt-acetylation and enzymatic lysine acetylation are catalyzed by N-terminal acetyltransferases (NATs) and lysine (Lys [K]) acetyltransferases (KATs), respectively (
). In addition, global lysine acetylation in bacteria (e.g. Escherichia coli and Bacillus subtilis) and eukaryotic mitochondria also occurs extensively via a nonenzymatic mechanism (
). Although Nt-acetylation is irreversible, lysine acetylation is reversible through the action of deacetylases. Hence, the latter is also referred to as reversible lysine acetylation (RLA) (
). Proteomic studies of lysine acetylation demonstrate that protein lysine acetylation occurs as widely in bacteria (e.g. E. coli, Salmonella enterica, B. subtilis, Rhodopseudomonas palustris, etc.) (
). This enzyme and other acyl-CoA synthetases, which form the AMP-forming acyl-CoA synthetase family, activate organic acids by generating corresponding acyl-CoA thioesters through a two-step process involving the formation of an enzyme-bound acyl-AMP intermediate (
The acetyl-CoA synthetase gene ACS2 of the yeast Saccharomyces cerevisiae is coregulated with structural genes of fatty acid biosynthesis by the transcriptional activators Ino2p and Ino4p.
). Acetyl-CoA synthetase is also found to affect histone acetylation and hence gene expression by changing pools of the acetyl group donor acetyl-CoA (
). Acetylation of a conserved lysine residue in acetyl-CoA synthetase by the protein lysine acetyltransferase Pat in S. enterica abolished the activity of the acetyl-CoA synthetase by preventing the formation of the acyl-AMP intermediate (
Reversible N epsilon-lysine acetylation regulates the activity of acyl-CoA synthetases involved in anaerobic benzoate catabolism in Rhodopseudomonas palustris.
Nt-acetylation is a co- or post-translational modification frequently found on proteins in eukaryotes (e.g. ∼60% of the proteins in yeast and ∼80% of the proteins in humans) (
). The modification depends primarily on the nature of the first two amino acid residues at the N terminus. Eukaryotes encode multiple N-terminal acetyltransferases. For example, six human NATs have been identified and designated NatA-NatF (
). Each of these NATs contains a specific catalytic subunit (i.e. Naa10, Naa20, Naa30, Naa40, Naa50 and Naa60). Some of the NATs (i.e. NatA, NatB, NatC and NatE) are known to possess one or two auxiliary subunits (
). Mutations in NATs were found to result in slow growth, abnormal cell cycle control, defects in chromosomal segregation or cell division, etc., in eukaryotes (
). However, the role of these bacterial NATs is unclear because mutants defective in the three proteins exhibited no detectable phenotypic distinctions from the wild-type strain (
As compared with that in Eukarya and Bacteria, protein acetylation in Archaea is less well understood. A few proteomic studies have shown that proteins are widely acetylated both at the N terminus and at internal lysine residues in archaea, as in eukaryotes (
). Much of the current knowledge about the mechanistic and functional aspects of archaeal protein acetylation is derived from studies on Sso10b, a member of the Sac10b family from Sulfolobus solfataricus. The Sac10b family includes small and sequence-nonspecific nucleic acid binding proteins highly conserved in Archaea (
). It was reported that Sso10b was acetylated at the ε-amino group of K16, a moderately conserved residue in the Sac10b family, and the α-amino group of the N-terminal residue (
). Further, in vitro assays showed that the acetylation and deacetylation of K16 were catalyzed by an archaeal Pat homolog, the sole lysine acetyltransferase that has been biochemically characterized in Archaea, and the lysine deacetylase Sir2, respectively (
). Therefore, it is suggested that K16 in Sso10b is an RLA site, and acetylation and deacetylation of this site represents the only RLA-mediated regulatory mechanism so far identified in Archaea (
). However, these observations do not agree with the finding that the binding affinity of the Sulfolobus Pat homolog for Sso10b was low as compared with that of the eukaryotic counterpart for the cognate substrate proteins (
). Moreover, it was recently reported that Sis10b, a member of the Sac10b family from Sulfolobus islandicus, and Sso10b were not acetylated but methylated at K16 in vivo (
Insights into the post-translational modifications of archaeal Sis10b (Alba): lysine-16 is methylated, not acetylated, and this does not regulate transcription or growth.
), was found to catalyze Nt-acetylation of the N-terminal Ser residue of Sso10b in vitro. The Sulfolobus enzyme, the only NAT enzyme that has been identified in Archaea, showed a relaxed target sequence requirement, as compared with eukaryotic NatA because it was able to modify Sso10b variants with A, G, T, V, ML and ME at the N terminus, albeit with different efficiencies, in vitro (
). Very little is known about the functions of the Nt-acetylation of proteins in Archaea. Nt-acetylation was reported to reduce the cellular levels of 20S proteasome α1 protein in Haloferax volcanii cells, and of Sis10b and Cren7 in S. islandicus cells (
Insights into the post-translational modifications of archaeal Sis10b (Alba): lysine-16 is methylated, not acetylated, and this does not regulate transcription or growth.
The N-terminal penultimate residue of 20S proteasome alpha1 influences its N(alpha) acetylation and protein levels as well as growth rate and stress responses of Haloferax volcanii.
). Therefore, the bona fide substrates and the roles of archaeal Pat and Ard1 homologs remain to be elucidated.
In the present study, we performed comparative proteomic analyses on the extents and spectra of Nε- and Nt-acetylation on proteins from wild-type S. islandicus and its derivative mutant strains defective in either Pat or Ard1. We showed that proteins were extensively acetylated at the N terminus (∼44% of the identified proteins) and internal lysine residues (∼26% of the theoretical total of the proteins encoded by the genome) in S. islandicus. Neither SisPat nor SisArd1 was essential for the survival of S. islandicus, but the lack of the latter had a far greater impact on the growth of the organism than the lack of the former. SisPat acetylated only a small subset of proteins, with a group of acyl-CoA synthetases as preferred substrates, in vivo and was likely involved in metabolic regulation, whereas SisArd1 showed broad substrate specificity, being able to modify nearly all types of the target N termini for human NatA-NatF, and probably played a role in a range of cellular processes including cell cycle control, DNA replication and CRISPR-based immunity.
EXPERIMENTAL PROCEDURES
Growth of Organisms
All Sulfolobus strains were grown in Zillig's, SCVy or TYS medium (
) at 75 °C with shaking at 150 rpm. S. islandicus E233S (ΔpyrEF, ΔlacS), a generous gift from Professor Qunxin She at Copenhagen University, was cultured in Zillig's medium supplemented with uracil (20 μg/ml). SCVy medium was used in transformation experiments. Solid plates for Sulfolobus strains were prepared by adding 0.8% (w/v) gelrite (Carl Roth, Karlsruhe, Germany) to the liquid medium.
Experimental Design and Statistical Rationale
Both the ΔSisPat mutant and S. islandicus E233S were grown to the exponential phase (OD600 ∼0.6) and harvested. Each strain was prepared in three biological replicates, and each biological replicate was analyzed in three technical replicates for iTRAQ-based quantitative proteomic analysis. Each strain was prepared in two biological replicates for the TMT-based quantitative analysis of Nε-acetylation. Similarly, both the ΔSisArd1 mutant and E233S were grown to the exponential phase (OD600 ∼0.6) and harvested. Each strain was prepared in two biological replicates for TMT-based quantitative proteomic analysis and for the qualitative analysis of N-terminal peptides by terminal amine isotopic labeling of substrates (TAILS) technique. The data from the replicates were combined and analyzed with MaxQuant (v.1.5.2.8) or Scaffold Q+ (v. 4.5.2). For TMT-based quantitative lysine acetylome analysis, the acetylated peptides were detected in two replicates, and differentially regulated peptides were identified when the ratios of acetylated peptides between ΔSisPat and the parent from both replicates were at least 1.3 or 2. For TMT-/iTRAQ-based quantitative proteomic analyses, the unpaired Student's t test or the Mann-Whitney test was applied with the criterion of significance set at p value < 0.05.
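The replicate-consistency rule for the acetylome (a site counts as differentially acetylated only when the ΔSisPat/parent ratio passes the fold cutoff in both replicates) can be sketched as follows. The function name and the symmetric treatment of increases and decreases are illustrative assumptions and are not taken from the original analysis pipeline.

```python
def classify_site(ratios, cutoff=1.3):
    """Classify one acetylation site from per-replicate mutant/parent ratios.

    Returns "down" or "up" only if every replicate passes the fold cutoff
    in the same direction; otherwise returns None.
    Assumption: a cutoff of c means ratio >= c (up) or ratio <= 1/c (down).
    """
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty list of positive numbers")
    if all(r <= 1.0 / cutoff for r in ratios):
        return "down"
    if all(r >= cutoff for r in ratios):
        return "up"
    return None


# Hypothetical ratios for two replicates of one site
print(classify_site([0.50, 0.60]))            # passes the 1.3-fold cutoff
print(classify_site([0.45, 0.60], cutoff=2))  # fails the 2-fold cutoff in one replicate
```

Requiring agreement across replicates is a conservative choice: a single replicate with a large ratio is not enough to call a site.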
Construction of a Pat Deletion Mutant (ΔSisPat) and a Complementary Strain
A genome-editing plasmid for the deletion of SisPat (SiRe2454) was constructed by cloning a spacer derived from the target site in the sispat gene and donor DNA sequences flanking the sispat gene into pSe-Rp, a Sulfolobus CRISPR-cloning vector (
). The target site started with a protospacer adjacent motif (CCN) positioned 171 bp downstream of the start codon of the sispat gene, and the immediately adjacent 40-nt sequence was used as the spacer (supplemental Table S1). The spacer fragment was inserted into pSe-Rp at the BspMI site, forming pAC-sispat. The sequences upstream (L-arm) and downstream (R-arm) of sispat were amplified by PCR from the genomic DNA of S. islandicus REY15A (supplemental Table S1). The L-arm and R-arm fragments were double digested with SalI/MluI and MluI/NotI, respectively, and inserted into plasmid pAC-sispat at the SalI and NotI sites, respectively, giving rise to the plasmid pGE-sispat. Plasmid pGE-sispat was introduced into S. islandicus E233S by electroporation (
). Transformed cells grown on SCVy plates were screened by PCR amplification using the Flanking and Internal primers (supplemental Table S1). The resulting PCR products were analyzed by agarose gel electrophoresis and by DNA sequencing. Colony-picking was repeated to obtain a pure mutant strain. Plasmids were cured from the deletion mutant by pyrEF counter selection with uracil and 5-FOA, yielding ΔSisPat.
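The spacer selection described above (a CCN protospacer adjacent motif followed immediately by a 40-nt sequence) can be illustrated with a short search routine. This is a minimal sketch: the function name is hypothetical, only the strand supplied is scanned, and no further design criteria (such as off-target checks) are applied.

```python
import re


def find_protospacers(seq, spacer_len=40):
    """Return (pam_start, pam, spacer) for every 5'-CCN-3' PAM that is
    immediately followed by spacer_len unambiguous nucleotides.

    pam_start is the 0-based index of the first C of the PAM.
    A lookahead is used so that overlapping candidates are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(rf"(?=(CC[ACGT])([ACGT]{{{spacer_len}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence (not from the sispat gene): one PAM followed by 40 G residues
toy = "AT" + "CCA" + "G" * 40 + "TT"
for start, pam, spacer in find_protospacers(toy):
    print(start, pam, spacer)
```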
A strain that complemented the deletion of the genomic copy of the sispat gene was constructed as described previously (
Insights into the post-translational modifications of archaeal Sis10b (Alba): lysine-16 is methylated, not acetylated, and this does not regulate transcription or growth.
). In brief, the sispat gene was obtained by PCR from the S. islandicus DNA (supplemental Table S1). The PCR product was cleaved with NdeI/SalI and inserted into plasmid pSeSD (
), producing the SisPat overexpression plasmid pSeSD-SisPat. After propagation in E. coli DH5α, pSeSD-SisPat was transformed into ΔSisPat.
Recombinant Protein Preparation
Genes encoding SisPat (SiRe2454) and a selected group of acetylated proteins, including SiRe0305, SiRe0317, SiRe2355, SiRe0033, SiRe0711, SiRe2388 and SiRe1630, were amplified from the genomic DNA of S. islandicus REY15A using primer pairs listed in supplemental Table S1. All PCR products were cloned, using the ClonExpress Ultra One Step Cloning Kit (Vazyme, Nanjing, China), into an expression vector according to the manufacturer's instructions. The fragments of the SisPat and SiRe0305 genes were cloned into expression plasmid pET30a, those of the SiRe0317 and SiRe2355 genes were cloned into expression vector Champion™ pET SUMO (Invitrogen, Waltham, MA), and those of the SiRe0033, SiRe0711, SiRe2388 and SiRe1630 genes were cloned into expression vector pET28a. Recombinant proteins SiRe0317 and SiRe2355 were fused to an N-terminal His6-SUMO tag, and the rest of the proteins were fused to a His6 tag at either the N terminus or the C terminus. Site-directed mutagenesis of SiRe0305, SiRe0317 and SiRe2355 was performed using the Mut Express II Fast Mutagenesis Kit V2 (Vazyme).
The resultant expression vectors were transformed into E. coli Rosetta (DE3). The overproducers were grown to an OD600 of ∼0.6 at 37 °C in LB medium containing 50 mg/ml kanamycin and 34 mg/ml chloramphenicol. Overproduction of the proteins, except for SiRe2355, was induced with the addition of 0.8 mM isopropyl-1-thio-β-D-galactopyranoside (IPTG) and subsequent incubation at 37 °C for 4 h. Overproduction of SiRe2355 was induced with 0.1 mM IPTG, and the further incubation was at 12 °C for 16 h. Following induction, cells were harvested by centrifugation, resuspended in buffer A [20 mM Tris-HCl, pH 8.8, 1 mM DTT, 0.1 mM EDTA, 500 mM NaCl, 10% (v/v) glycerol] and sonicated on ice. After centrifugation, the supernatants from the SiRe0305, SiRe0317 and SiRe2355 overproducers were heat treated at 75 °C for 20 min, and those from the SiRe0033, SiRe2388, SiRe0711 and SiRe1630 overproducers were heat treated at 65 °C for 20 min. Samples were clarified by centrifugation at 30,000 × g for 30 min at 4 °C. Each protein sample was loaded onto a 1 ml HisTrap HP column in an ÄKTA FPLC Purifier system (GE Healthcare, Pittsburgh, PA), and the column was washed with buffer B [20 mM Tris-HCl, pH 8.8, 1 mM DTT, 0.1 mM EDTA, 500 mM NaCl, 500 mM imidazole and 10% (v/v) glycerol]. Peak fractions containing the target proteins were pooled and dialyzed against 10 mM Tris-HCl, pH 8.0, and 150 mM NaCl. Protein concentrations were determined by the Lowry method with bovine serum albumin (BSA) as the standard (
). Briefly, a standard reaction (25 μl) contained 10 mM Tris-HCl, pH 8.0, 150 mM NaCl, 10 μM protein substrate, 10 μM SisPat and 3 μl [acetyl-3H] acetyl-CoA (2.35 Ci/mmol; Perkin-Elmer, Waltham, MA). Reactions were incubated at 65 °C for 3 h and quenched by boiling for 5 min in SDS-PAGE loading buffer. The samples were subjected to electrophoresis in an SDS-PAGE gel. The gel was immersed for 1 h in EN3HANCE (Perkin-Elmer), washed for 25 min in deionized water, and soaked for 10 min in 10% acetic acid and 2% glycerol. The gel was then exposed to X-ray film.
Preparation of Samples for the Quantitative Determination of the Lysine Acetylome
TMT-based quantitative acetylomic analysis was performed as described previously with modifications (
). ΔSisPat and E233S were grown in Zillig's medium containing uracil (20 μg/ml) at 75 °C with shaking at 150 rpm. Cells were harvested at an OD600 of ∼0.6, resuspended in lysis buffer [8 M urea, 1% protease inhibitor mixture (Roche, Basel, Switzerland), 3 μM trichostatin A, 50 mM nicotinamide and 2 mM EDTA], and disrupted on ice by sonication. After centrifugation at 20,000 × g for 30 min at 4 °C, proteins in the supernatant were precipitated with ice-cold 20% TCA for 2 h at −20 °C. Following centrifugation at 15,000 × g for 10 min at 4 °C, the pellet was washed three times with cold acetone. The proteins were dissolved in 8 M urea and 50 mM NH4HCO3, and the protein concentration was determined by using the BCA kit (Beyotime, Shanghai, China).
The protein solution was reduced with 5 mM DTT for 30 min at 56 °C, and subsequently alkylated with 11 mM iodoacetamide (IAA) for 45 min at room temperature in the dark. After adding 50 mM NH4HCO3 to lower the urea concentration to <2 M, trypsin was added at a trypsin-to-protein mass ratio of 1:50. Following overnight digestion, additional trypsin was added at a trypsin-to-protein mass ratio of 1:100, and the second digestion proceeded for 4 h.
A sample of the resulting peptides (from 4 mg of proteins) was desalted using the Strata X C18 SPE column (Phenomenex, Torrance, CA), vacuum-dried and reconstituted in 0.5 M triethylammonium bicarbonate (TEAB). The peptides were labeled with the tandem mass tag (TMT) (Thermo Fisher, Waltham, MA) according to the manufacturer's protocol. Two replicates of the ΔSisPat sample were labeled with 126-tag and 130-tag, respectively, and two replicates of the E233S sample were labeled with 127-tag and 131-tag, respectively. All peptides were mixed, desalted and vacuum dried. To enrich lysine-acetylated peptides, the peptides were resuspended in NETN buffer (100 mM NaCl, 1 mM EDTA, 50 mM Tris-HCl, pH 8.0, and 0.5% Nonidet P-40), and incubated with anti-acetyllysine agarose beads (PTM Biolabs, Hangzhou, China) at 4 °C overnight with gentle shaking. The beads were washed four times with 1 ml of NETN buffer and twice with deionized H2O. The bound peptides were eluted from the beads with 1% trifluoroacetic acid (TFA). The eluted fractions were combined, vacuum-dried, and subjected to LC-MS/MS analysis.
Preparation of Samples for the Determination of the N-terminal Acetylome by Using the Terminal Amine Isotopic Labeling of Substrates (TAILS) Technique
The TAILS analysis of proteins from ΔSisArd1 and E233S was carried out as described (
). ΔSisArd1 and E233S were grown in Zillig's medium containing uracil (20 μg/ml) at 75 °C with shaking at 150 rpm and harvested at an OD600 of ∼0.6. Proteins from the two strains were reduced and alkylated as described above for the preparation of the protein samples for the quantitative determination of the lysine acetylome. A sample (4 mg) of the proteins was precipitated by the addition of eight sample volumes of cold acetone and subsequent incubation for 2 h at −20 °C. After centrifugation at 12,000 × g for 10 min at 4 °C, the precipitate was dissolved in 6 M GuHCl, 20 mM HEPES-KOH, pH 8.0, 40 mM deuterated formaldehyde (12CD2O) and 20 mM NaBH3CN and incubated at 37 °C overnight. The reaction was quenched by the addition of 1 M NH4HCO3 to 100 mM. After 4 h at 37 °C, proteins were precipitated by the addition of eight sample volumes of cold acetone. After 2 h at −20 °C, the sample was centrifuged, and the precipitate was dissolved in 8 M urea and 50 mM NH4HCO3. The proteins were digested with trypsin as described above. The sample was fractionated into 60 fractions by high-pH reverse-phase HPLC on a C18 column (Betasil C18, 5 μm particles, 10 mm ID × 250 mm, Thermo Fisher). These fractions were combined into 6 pools and dried in a SpeedVac (Thermo Fisher). The peptides were dissolved in PBS buffer (0.1 M Na2HPO4, 0.15 mM NaCl, pH 7.2), and incubated with pre-washed NHS-activated agarose beads (Lot number 26196, Thermo) at 4 °C overnight with gentle shaking. Unbound peptides were collected after removing the beads by centrifugation at 12,000 × g for 10 min at 4 °C. The peptides were vacuum-dried and subjected to LC-MS/MS.
TMT- and iTRAQ-based Quantitative Proteomics
Quantitative proteomic analyses of ΔSisArd1 and E233S by using a TMT-based approach and of ΔSisPat and E233S by using an iTRAQ-based approach were performed as described (
). Briefly, proteins from the three strains were prepared, reduced, alkylated and digested as described above for the preparation of the protein samples for the quantitative determination of the lysine acetylome. The resulting peptides (from 200 μg of total proteins for each sample) were labeled with the TMT or iTRAQ reagents by following the manufacturers' instructions. For the TMT-based quantitative proteomic analysis of ΔSisArd1 and E233S, two duplicate ΔSisArd1 samples were labeled with TMT reagent 129-tag and 130-tag, respectively, and two duplicate E233S samples were labeled with 127-tag and 128-tag, respectively. For the iTRAQ-based quantitative proteomic analysis of ΔSisPat and E233S, three triplicate ΔSisPat samples were labeled with iTRAQ reagent 113-tag, 115-tag, and 117-tag, respectively, and three triplicate E233S samples were labeled with 114-tag, 116-tag, and 118-tag, respectively. The labeled samples were mixed in equal amounts and fractionated by high-pH reverse-phase HPLC on a C18 column (Zorbax 300 Extend-C18, 5 μm particles, 4.6 mm ID × 250 mm, Agilent, Santa Clara, CA). The fractions were combined into pools, dried in a SpeedVac and analyzed by LC-MS/MS.
LC-MS/MS Data Acquisition
LC-MS/MS analysis was performed on an EASY-nLC 1000 UPLC system (Thermo Fisher) coupled online to an Orbitrap Fusion Tribrid mass spectrometer (Thermo Fisher). Peptides were desalted online by an in-house packed trap column (C18, 5 μm particles, 100 μm ID, 3 mm length, Dr. Maisch GmbH, Ammerbuch-Entringen, Germany). The trapped peptides were loaded onto an in-house packed reversed-phase C18 column (3 μm particles, 75 μm ID, 150 mm length, Dr. Maisch GmbH) and eluted at a flow rate of 300 nl/min with a gradient from solvent A (0.1% formic acid in water) to solvent B (0.1% formic acid in acetonitrile): 4% to 22% B over 65 min, 22% to 35% B over 15 min, 35% to 80% B over 5 min and 80% B over 5 min. The eluted peptides were subjected to analysis on an Orbitrap Fusion Tribrid mass spectrometer with an MS survey scan (m/z range 350–1800, 70,000 resolution, 3 × 10^6 AGC target, and 50 ms maximal ion time) and an MS/MS scan (m/z range 100–2000, 17,500 resolution, high-energy collision dissociation, 2 m/z isolation window, 30 normalized collision energy, 5 × 10^4 AGC, 200 ms maximal ion time, 30 s dynamic exclusion, and top number 20), respectively.
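The elution program can be written as a piecewise-linear function of time, which makes the total run length (90 min) and the solvent composition at any time point explicit. The sketch below assumes that the four segments run back to back starting at 0 min and that each ramp is linear; the names GRADIENT and percent_b are illustrative.

```python
# (time in min, % solvent B) breakpoints taken from the gradient description
GRADIENT = [(0, 4), (65, 22), (80, 35), (85, 80), (90, 80)]


def percent_b(t):
    """Return % solvent B at time t (min) by linear interpolation."""
    if t < GRADIENT[0][0] or t > GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


for t in (0, 32.5, 72.5, 90):
    print(t, percent_b(t))
```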
Data Analysis
The RAW mass spectrometry files were processed using MaxQuant (v.1.5.2.8) with an integrated Andromeda search engine. Tandem mass spectra were searched against the Uniprot S. islandicus REY15A database (2018/12/26, taxonomy ID: 930945, 2,631 sequences) concatenated with a reverse decoy database. The mass tolerance for precursor ions was set at 20 ppm in First search and 5 ppm in Main search, and the mass tolerance for fragment ions was set at 0.02 Da. Trypsin/P was specified as the cleavage enzyme, allowing up to 4 missed cleavages. The minimum peptide length was set at 7. Interference filters were employed by setting PIF at >0.75 to minimize co-isolation interference.
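To make the digestion settings concrete, the following sketch enumerates the peptides that a Trypsin/P rule (cleavage C-terminal to K or R, including before P) would generate with a limit on missed cleavages and a minimum peptide length. It is a simplified illustration, not the search engine's own implementation; for example, it ignores N-terminal methionine removal, and the function name is hypothetical.

```python
def digest_trypsin_p(protein, max_missed=4, min_len=7):
    """Return the set of peptides from an in silico Trypsin/P digest."""
    # Split after every K or R (Trypsin/P also cleaves before proline)
    pieces, start = [], 0
    for i, aa in enumerate(protein):
        if aa in "KR":
            pieces.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        pieces.append(protein[start:])

    # Join up to max_missed + 1 consecutive pieces, then filter by length
    peptides = set()
    for i in range(len(pieces)):
        last = min(i + max_missed, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if len(peptide) >= min_len:
                peptides.add(peptide)
    return peptides


# Toy sequence, not a real S. islandicus protein
print(sorted(digest_trypsin_p("MAAAAAKGGGGGGRPLLLLLLK", max_missed=1)))
```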
For RAW mass spectrometry files for the TMT-based quantitative acetylomic analysis, carbamidomethyl (C) and TMT-6plex (N terminus, K) were specified as fixed modifications, whereas oxidation (M) and acetylation (K) were specified as variable modifications. False discovery rate (FDR) was adjusted to <1% and the minimum score for modified peptides was set at >40. Localization probability was set at >0.75. Quantitative information for each peptide or lysine site was calculated according to TMT ratios. A differentially acetylated lysine peptide was identified using a 1.3- or 2-fold cutoff.
For RAW mass spectrometry files for the TAILS-based N-terminal acetylomic analysis, Carbamidomethyl (C) was set as a fixed modification, whereas acetylation (N terminus), tetradeutero-dimethyl (C2H2D4, 32.0564 Da) (K, N terminus) and oxidation (M) were set as variable modifications. FDR was adjusted to <1% and the minimum score for modified peptides was set at >40. Representative MS2 spectra of modified peptides (i.e. ones with the highest MaxQuant score for a given modified peptide) were manually inspected. MS2 spectra, in which fewer than four sequential matched fragment ions were present or most of the matched ions were at the level of background noise, were deleted. The extent of Nt-acetylation for each protein was calculated as the ratio of the intensity of Nt-acetylated peptides to that of the identified N-terminal peptides from that protein.
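The extent of Nt-acetylation defined above is a simple intensity ratio. The sketch below assumes that the denominator is the summed intensity of all identified N-terminal peptides of a protein, that is, the acetylated forms plus the other (for example, dimethyl-labeled) N-terminal forms; the function name and the example intensities are hypothetical.

```python
def nt_acetylation_extent(acetylated_intensities, other_nterm_intensities):
    """Fraction of N-terminal peptide intensity that is Nt-acetylated.

    Both arguments are lists of peptide intensities for one protein.
    """
    acetylated = sum(acetylated_intensities)
    total = acetylated + sum(other_nterm_intensities)
    if total <= 0:
        raise ValueError("no N-terminal peptide intensity for this protein")
    return acetylated / total


# Hypothetical intensities for one protein
print(nt_acetylation_extent([3.0e6, 1.0e6], [1.0e6]))
```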
For RAW mass spectrometry files for the TMT-based quantitative proteomic analyses, carbamidomethyl (C) and TMT-6plex (N terminus, K) were specified as fixed modifications, whereas oxidation (M) was specified as a variable modification. At least one unique peptide was required for the identification and quantification of a protein. A maximum FDR of 1% was employed for the identification of a protein. A differentially regulated protein was identified using a 1.5-fold cutoff and p value <0.05.
For RAW mass spectrometry files for the iTRAQ-based quantitative proteomic analyses, Mascot (v. 2.5.1) and Scaffold Q+ (v. 4.5.2) with the integrated Andromeda search engine were used. The parameters were set as for the TMT-based quantitative proteomic analyses except that Methylthio (C) and iTRAQ-8plex (N terminus, K) were set as fixed modifications.
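The two-part criterion for a differentially regulated protein (fold change beyond 1.5 and p value below 0.05) can be expressed as a small classifier. In this sketch the p value is taken as an input from whichever test was applied, increases and decreases are treated symmetrically, and the function name is illustrative.

```python
def classify_protein(ratio, p_value, fold_cutoff=1.5, alpha=0.05):
    """Label a protein "up", "down" or "unchanged" from its mutant/parent
    abundance ratio and the p value of the statistical test."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if p_value >= alpha:
        return "unchanged"
    if ratio > fold_cutoff:
        return "up"
    if ratio < 1.0 / fold_cutoff:
        return "down"
    return "unchanged"


# Hypothetical values
print(classify_protein(2.0, 0.01))
print(classify_protein(1.2, 0.001))
```

Both conditions must hold at once, so a large but statistically unsupported change, or a significant but small change, is reported as unchanged.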
RESULTS
S. islandicus Defective in SisPat is Viable and Similar to the Parental Strain in Growth
SisPat is the only protein lysine acetyltransferase from Archaea that has been shown to be capable of protein acetylation in vitro (
). Sequence alignment reveals that SisPat shares significant similarity (up to ∼27%) at the amino acid sequence level to the GNAT (GCN5-related N-acetyltransferase) domain of known protein lysine acetyltransferases from Bacteria (e.g. S. enterica Pat, R. palustris Pat, R. palustris KatA, E. coli Pka, M. tuberculosis PatA, and M. smegmatis PatA) (supplemental Fig. S1) (
). It was previously reported that Sso10b, a member of the Sac10b protein family, was acetylated in vitro at K16 by SsoPat, a SisPat homolog from S. solfataricus (
). In fact, Sso10b was the only proposed substrate for Sulfolobus Pat for a long time. In a recent study, however, K16 from neither Sso10b nor Sis10b was found to be acetylated in vivo (
Insights into the post-translational modifications of archaeal Sis10b (Alba): lysine-16 is methylated, not acetylated, and this does not regulate transcription or growth.
). To determine the bona fide substrates of SisPat, we constructed a Pat deletion mutant in S. islandicus (ΔSisPat) using a CRISPR-based gene deletion approach (
). The successful deletion of the SisPat gene in the mutant strain, as verified by PCR and immunoblotting (supplemental Fig. S2), indicates that SisPat was dispensable for the growth of the cell.
To determine if the lack of SisPat would affect the growth of S. islandicus, we grew ΔSisPat, the parental strain S. islandicus E233S and a complementary strain, which was constructed by introducing a plasmid overproducing SisPat (pSeSD-SisPat) into ΔSisPat, in either the nutrient-poor SCVy medium or the nutrient-rich TYS medium at 75 °C (
). All three strains grew more slowly and reached lower maximum cell densities in SCVy medium than in TYS medium (Fig. 1A and 1B). The growth rates of ΔSisPat and the parental strain were similar in both media. These results indicate that the absence of SisPat did not significantly affect the growth of S. islandicus under the growth conditions used in the present study.
Fig. 1. Effect of the deletion of the sispat gene on the growth and the cellular protein levels of the cell. ΔSisPat, the parental strain and a complementary strain were grown in TYS medium (A) or SCVy medium (B) at 75 °C with shaking at 150 rpm. The OD600 values of the cultures were measured. Each value is the average of three independent measurements. C, A volcano plot showing proteins quantified in the proteomic analysis of ΔSisPat and the parental strain. Differentially regulated proteins were identified with cutoffs set at p value <0.05 and fold-change ratio >1.5 fold. Significantly down- and upregulated proteins are shown in green and red, respectively.
We also conducted iTRAQ-based quantitative proteomic analysis of ΔSisPat and the parental strain. A total of 1586 proteins, accounting for 60% of the proteins encoded by the genome (2631), were identified (supplemental Table S2). Among 1543 quantifiable proteins, only four proteins were differentially regulated by >1.5 fold in ΔSisPat, in agreement with the growth phenotype of the mutant (Fig. 1C).
Sites of Lysine Nε-acetylation Are Widespread in S. islandicus
To investigate the effect of the deletion of SisPat on the pattern and level of global lysine Nε-acetylation, the exponentially-grown cells of the parental strain and ΔSisPat were harvested and lysed. Total cellular proteins were digested with trypsin. Acetyllysine-containing peptides were enriched by using the anti-acetyllysine antibody beads and subjected to analysis by LC-MS/MS. A total of 1,708 nonredundant lysine Nε-acetylation sites from 684 proteins, accounting for 26% of the theoretical total of the proteins encoded by the genome (2631), were identified (supplemental Table S3). Interestingly, a total of 93 propionylated peptides were found in 81 proteins (supplemental Table S4), providing the first evidence that protein lysine propionylation also occurs in Archaea, in addition to Eukarya and Bacteria (
). The acetylated proteins contained between 1 and 22 acetylated lysine residues (supplemental Fig. S3). Approximately 96% of the acetylated proteins were acetylated at no more than five lysine sites, and acetylated proteins with a single site of acetylation are the most abundant among the acetylated proteins (43%). To determine the sequence preference of lysine acetylation, we examined the sequences upstream and downstream of each acetylated residue. Eleven conserved amino acid sequence motifs spanning from position −10 to +10 with respect to an acetylated lysine were identified from 1174 peptides (supplemental Fig. S4A). Among these motifs, K, R, N, and S were most frequently found next to the acetylated lysine residue, with frequencies of about 17, 13, 12, and 10%, respectively, of all identified acetylated peptides (supplemental Fig. S4B). However, the protein secondary structure affiliation of acetylated lysine residues was like that of unacetylated lysine residues (supplemental Fig. S5A). Moreover, the probability of acetylation of a lysine residue was not significantly affected by its surface accessibility because 39% of the acetylated lysine sites, as compared with 41% of the unacetylated lysine residues, were surface exposed (supplemental Fig. S5B). These observations presumably suggest that protein acetylation occurred cotranslationally.
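The motif analysis starts from fixed-width sequence windows centered on each acetylated lysine (positions −10 to +10). A minimal sketch of that windowing step, together with a tally of the residues at a chosen offset from the modified lysine, is shown below. The function names, the padding character for windows that run past a protein terminus, and the toy sequence are all assumptions for illustration; the motif-discovery step itself is not reproduced.

```python
from collections import Counter


def site_window(protein, pos, flank=10, pad="_"):
    """Return the window of 2*flank+1 residues centered on the lysine at
    1-based position pos, padded where it extends past either terminus."""
    if protein[pos - 1] != "K":
        raise ValueError("position does not point at a lysine")
    left = protein[max(0, pos - 1 - flank):pos - 1]
    right = protein[pos:pos + flank]
    return left.rjust(flank, pad) + "K" + right.ljust(flank, pad)


def residues_at_offset(windows, offset, flank=10, pad="_"):
    """Count residues found at a given offset (e.g. -1 or +1) from the site."""
    counts = Counter(w[flank + offset] for w in windows)
    counts.pop(pad, None)  # ignore padding positions
    return counts


# Toy protein with an acetylated lysine at position 3
window = site_window("MSKAAR", 3)
print(window)
print(residues_at_offset([window], +1))
```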
An arCOG category analysis suggested that the Nε-acetylated proteins were significantly enriched in metabolism (supplemental Fig. S6). Besides, the modified proteins were also enriched in ribosomal structure and biogenesis (arCOG J), transcription (arCOG K), replication, recombination and repair (arCOG L) and cell wall/membrane/envelope biogenesis (arCOG M). It was shown that most of the enzymes in the tricarboxylic acid (TCA) cycle were acetylated at conserved lysine residues involved in substrate or cofactor binding in Bacteria (
). We found that isocitrate dehydrogenase (SiRe2695) and succinate-CoA ligase (SiRe0277) were acetylated at their respective critical lysine residues (K227 and K38) in S. islandicus, raising the possibility that the functional role of lysine acetylation in the TCA cycle is conserved between Bacteria and Archaea.
SisPat Specifically Acetylates a Very Small Subset of Proteins In Vivo
Of the 1708 lysine acetylation sites, 1503 sites from 644 proteins were quantifiable. Strikingly, among these acetylated sites, only 24 sites from 23 proteins were differentially modified by >1.3 fold (supplemental Table S3). All but one of these sites (i.e. a lysine residue in a YHS domain protein) were modified to lower extents in the mutant than in the parental strain (supplemental Table S3). Among the 24 sites, ten from ten proteins exhibited a >2 fold difference in acetylation between the two strains, and they were all less acetylated in ΔSisPat than in the parent (Table I, supplemental Table S3). Quantitative proteomics based on iTRAQ labeling revealed that each of these differentially acetylated proteins was present at similar levels in the mutant and parental strains, excluding the possibility that the observed differential acetylation of these proteins resulted from their different intracellular concentrations in the two strains (supplemental Table S2). Moreover, whereas several lysine acetylation sites were identified in these proteins (e.g. SiRe2451, SiRe0686, SiRe2327, SiRe0317, SiRe1580, and SiRe0435), only a single site in each protein was differentially acetylated (Table I), further indicating that these proteins were differentially acetylated in the two strains and the differential acetylation occurred at specific lysine residues.
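The fold-change screen described above can be illustrated with a minimal sketch. It assumes that each site carries a single mutant-to-parent acetylation ratio and that the >1.3 fold cutoff is applied symmetrically (ratio above 1.3 or below 1/1.3); the text does not specify how the ratio was computed, and the protein names and values below are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Site:
    protein: str      # hypothetical protein label
    position: int     # lysine position in the protein
    ratio: float      # acetylation in mutant / acetylation in parent (made-up values)


def classify(site, cutoff=1.3):
    """Label a site as 'lower', 'higher' or 'unchanged' in the mutant."""
    if site.ratio < 1.0 / cutoff:
        return "lower"
    if site.ratio > cutoff:
        return "higher"
    return "unchanged"


# Toy data, not taken from supplemental Table S3.
sites = [
    Site("protA", 10, 0.10),
    Site("protB", 55, 0.70),
    Site("protC", 7, 1.05),
    Site("protD", 120, 1.40),
]

calls = {f"{s.protein}:K{s.position}": classify(s) for s in sites}
```

Raising `cutoff` to 2.0 in this sketch would correspond to the stricter >2 fold criterion used for Table I.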
Table I Proteins containing a lysine acetylation site that was less acetylated by >2 fold in ΔSisPat than in the parental strain
At the top of the list of differentially acetylated proteins is a group of six acyl-CoA synthetases of presumably different substrate specificity, which differed in acetylation by 4 ∼ 14 fold between the two strains. Acyl-CoA synthetases are known to function in both anabolic and catabolic pathways by converting fatty acids to acyl-CoA derivatives (
). As shown by sequence alignment, a conserved motif (PX4GK) at the site of lysine acetylation exists in the six acyl-CoA synthetases (Fig. 2A and 2B). This motif resembles those found in bacterial acyl-CoA synthetases acetylated by Pat (Fig. 2C). Acetylation of agmatinase (SiRe2355, K158) was also significantly reduced (∼7 fold) in the mutant strain. Agmatinase is one of the vital enzymes in the biosynthesis of polyamines (e.g. putrescine, spermidine, and spermine) from arginine (
). Polyamines, which exist in abundance in Sulfolobus species, are known to serve a role in the thermal protection of nucleic acids and to facilitate protein synthesis (
). Other enzymes that were significantly less acetylated in ΔSisPat than in the parental strain included ornithine cyclodeaminase (SiRe1622, K158, ∼3 fold), citryl-CoA lyase (SiRe0435, K59, ∼2 fold) and nitrilase (SiRe1580, K187, ∼2 fold). Ornithine cyclodeaminase catalyzes hydrolysis of l-ornithine to l-proline, participating in arginine and proline biosynthesis (
). Nitrilase converts nitriles into the corresponding carboxylic acid and ammonia, serving roles in nutrient assimilation and detoxification of nitriles (
). The sole protein that was acetylated to a slightly higher level in ΔSisPat than in the parental strain contains the YHS-domain (SiRe2555), which is found in copper transporting ATPases, phenol hydroxylases and some membrane proteins. It is worth noting that the conserved motif shared by the six acyl-CoA synthetases is not found at the site of acetylation in the other differentially acetylated proteins.
Fig. 2 Motif analysis of lysine acetylation sites differentially acetylated in the parental strain and ΔSisPat. A, Ten sites that were most drastically less acetylated in ΔSisPat than in the parental strain. Lysine residues of acetylation and their flanking sequences are aligned. Six acyl-CoA synthetases are framed by a red square. The acetylated lysine residues were marked by an arrow. B, A MEME motif generated from the peptides shown in Fig. 2A. The height of each letter corresponds to the frequency of the amino acid residue that the letter represents at that position. The lysine residue of acetylation is indicated by an arrow. C, The conserved motif of acyl-CoA synthetases acetylated by Pat from R. palustris. The lysine residue of acetylation is indicated by an arrow.
As revealed by our sequence search, S. islandicus REY15A encodes a total of 15 putative acyl-CoA synthetases. Nine of them were not identified in our quantitative acetylomic analysis (Fig. 3B). All the 15 enzymes except for SiRe1630 contain the conserved motif PX4GK for Nε-acetylation by SisPat. To test if SisPat could acetylate these acyl-CoA synthetases, especially the nine enzymes unidentified in the acetylomic analysis, we overproduced and purified recombinant SiRe0305, SiRe0317, SiRe2355, SiRe0033, SiRe0711, SiRe2338, and SiRe1630, and performed an in vitro protein acetylation assay on them. As shown in Fig. 3A, a protein was significantly acetylated in the assay only when it contained the conserved acetylation site because the two acyl-CoA synthetases SiRe0317 and SiRe0305, both of which were the preferred substrates for SisPat in vivo, were not detectably acetylated by the acetyltransferase when a point mutation was introduced into the proteins at the conserved site of acetylation (i.e. K544A in SiRe0317 and K534A in SiRe0305). In a separate experiment, recombinant SiRe0305 and SiRe0317 incubated with SisPat in the in vitro acetylation assay were shown by MS to be acetylated exclusively at the conserved lysine site (supplemental Fig. S7). Therefore, the conserved lysine residue in acyl-CoA synthetases was the specific site of acetylation by SisPat. As expected, all of the five putative acyl-CoA synthetases except for SiRe1630 were acetylated by SisPat (Fig. 3C). In comparison, agmatinase was not as good a substrate as the acyl-CoA synthetases because it was less efficiently acetylated than the latter in the assay. Taken together, our data indicate that SisPat specifically acetylates a small subset of proteins, including acyl-CoA synthetases in particular, in S. islandicus cells.
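A sequence search for the PX4GK motif can be sketched as a simple pattern scan. The example below is only an illustration of the idea, assuming one-letter protein sequences in upper case; the function name and the toy sequence are hypothetical and are not the search procedure or the sequences used in this study.

```python
import re

# P, any four residues, then G and K. The lookahead allows overlapping
# matches to be reported as well.
MOTIF = re.compile(r"(?=(P[A-Z]{4}GK))")


def find_motif_lysines(seq):
    """Return the 1-based position of the lysine that ends each PX4GK match."""
    # The motif is 7 residues long, so the lysine sits 6 residues after the
    # 0-based match start, i.e. at 1-based position start + 7.
    return [m.start() + 7 for m in MOTIF.finditer(seq)]


# Invented toy sequence with two motif occurrences (PAVLRGK and PKLWAGK).
toy = "MSTPAVLRGKTEEPKLWAGKQ"
lysines = find_motif_lysines(toy)
```

A protein lacking the motif, as described for SiRe1630, would return an empty list in this sketch.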
Fig. 3 Acetylation of putative acyl-CoA synthetases from S. islandicus by SisPat in vitro. A, Site-specific acetylation of acyl-CoA synthetases (SiRe0305 and SiRe0317) and agmatinase (SiRe2355) detected in the acetylome by SisPat in vitro. The lysine residue in each of the three proteins, which was differentially acetylated in the parental strain and ΔSisPat, was mutated into an alanine residue (i.e. SiRe0305K534A, SiRe0317K544A and SiRe2355K158A). Genes encoding the wild-type and mutant proteins were overexpressed in E. coli. Recombinant wild-type and mutant proteins were incubated with [3H]acetyl-CoA in the presence or absence of SisPat. B, Sequence alignment of nine acyl-CoA synthetases which were not identified in the acetylomic analysis. Lysine residues of acetylation and their flanking sequences are aligned. The conserved lysine residue is marked by an arrow. C, Acetylation of acyl-CoA synthetases undetected in the acetylome by SisPat in vitro. Recombinant acyl-CoA synthetases SiRe0033, SiRe0711, SiRe2338, SiRe1630 and SiRe0305 were incubated with SisPat in the standard in vitro protein acetylation assay mixture. The samples were subjected to SDS-PAGE. The gel was immersed in EN3HANCE, and processed for exposure to X-ray film.
Insights into the post-translational modifications of archaeal Sis10b (Alba): lysine-16 is methylated, not acetylated, and this does not regulate transcription or growth.
). We noticed that, during the mutant construction, counter-selection for the deletion of the pyrEF marker in the presence of 5-FOA and uracil resulted in the formation of two types of colonies of drastically different sizes on the plate (Fig. 4A). As revealed by PCR, the larger colonies contained the sisard1 gene as well as a spontaneous mutation in pyrEF, whereas the smaller ones lacked the sisard1 gene (data not shown). It appears that ΔSisArd1 was not able to grow as well as the parental strain. To further determine the effect of the deletion of sisard1 on the growth of the cell, we grew ΔSisArd1, the parental strain and a complementary strain, which was prepared by introducing a plasmid encoding SisArd1 into ΔSisArd1 (
Insights into the post-translational modifications of archaeal Sis10b (Alba): lysine-16 is methylated, not acetylated, and this does not regulate transcription or growth.
), in Zillig's liquid medium. ΔSisArd1 grew significantly more slowly, but reached a slightly higher maximum cell density, than the parental strain (Fig. 4B). On the other hand, the complementary strain grew more rapidly and attained a higher maximum cell density than the parent, presumably because the cellular content of SisArd1 in the complementary strain was about 10-fold higher than that in the parental strain (
Insights into the post-translational modifications of archaeal Sis10b (Alba): lysine-16 is methylated, not acetylated, and this does not regulate transcription or growth.
). Therefore, we conclude that SisArd1 is required for the optimal growth of S. islandicus.
Fig. 4 Effect of the deletion and overexpression of the sisard1 gene on the growth of the cell. A, Growth of ΔSisArd1 and the parental strain on a solid plate. The cells were grown on an SCVy medium plate, and the colonies formed were photographed under an anatomic microscope. B, Growth curves of ΔSisArd1, the parental strain and two complementary strains. All strains were grown at 75 °C in Zillig's medium. The OD600 values of the cultures were measured. All numbers are an average of three independent measurements. Com 1, complementary strain 1 [ΔSisArd1/pSeSD-SisArd1(araS promoter)], Com 2, complementary strain 2 [ΔSisArd1/pSeSD-SisArd1(native promoter)].
Deletion of the SisArd1 Gene Affects the Cellular Levels of a Number of Proteins
To gain mechanistic clues into the growth phenotype of ΔSisArd1, we then conducted a TMT-based quantitative proteomic analysis on both the mutant and parental strains. A total of 1618 proteins, accounting for ∼61% of the proteins encoded by the genome (2631), were confidently identified (supplemental Table S5). Among them, 1518 proteins were quantifiable. When the cutoff was set at a fold change of ≥1.50, 130 and 143 proteins were found to be up- and downregulated, respectively, accounting in combination for ∼10% of the total proteins (Fig. 5, supplemental Table S5).
Fig. 5 A volcano plot showing proteins quantified in the proteomic analysis of ΔSisArd1 and the parental strain. Differentially regulated proteins were identified with cutoffs set at p value < 0.05 and fold-change ratio >1.5 fold. Significantly down- and upregulated proteins are shown in green and red, respectively.
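The two cutoffs used for the volcano plot can be expressed as a short sketch. It assumes that the fold change is the mutant-to-parent abundance ratio and that the 1.5 fold cutoff is applied symmetrically on the log2 scale; neither the orientation of the ratio nor the statistical test is specified here, so the function is illustrative only. The last lines reproduce the proportions quoted for the 273 differentially regulated proteins.

```python
import math


def regulation(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """Classify a protein as 'up', 'down' or 'not significant'.

    fold_change is assumed to be mutant / parent abundance.
    """
    if p_value >= p_cutoff:
        return "not significant"
    log2fc = math.log2(fold_change)
    if log2fc > math.log2(fc_cutoff):
        return "up"
    if log2fc < -math.log2(fc_cutoff):
        return "down"
    return "not significant"


# Counts reported in the text.
up, down, quantifiable, encoded = 130, 143, 1518, 2631
changed = up + down
share_of_quantifiable = round(100 * changed / quantifiable)  # percent of quantifiable proteins
share_of_encoded = round(100 * changed / encoded)            # percent of genome-encoded proteins
```

The two percentages show why the same 273 proteins correspond to about 18% of the quantifiable proteins but about 10% of all proteins encoded by the genome.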
), the differentially regulated proteins were distributed in nearly all arCOGs categories (supplemental Fig. S8, supplemental Table S5). Intriguingly, five downregulated proteins, including cell division protein CdvB (SiRe1174), CdvC (SiRe1175), CdvB1 (SiRe1550) and two chromosome partitioning-like proteins (SiRe0265 and SiRe0132), are involved in cell cycle control (Table II). CdvB, CdvC, and CdvB1 comprise the cell division machinery CdvABC, which forms a composite band contracting concomitantly with the septum formation in Sulfolobus (
). Therefore, it is tempting to suggest that downregulation of the Cdv proteins contributed to the slow growth of ΔSisArd1. Among the three replication initiator proteins (i.e. Orc1–1, Orc1–3 and Whip), only Orc1–3 (SiRe0002) was identified in our analysis. The level of Orc1–3 was downregulated by ∼7 fold in ΔSisArd1. The replicative DNA polymerase B1 (SiRe1451) was also downregulated. Downregulated proteins also included DNA repair proteins NurA (SiRe0061), Mre11 (SiRe0063) and HerA (SiRe0064), which are encoded by a single operon, and nucleotide excision repair proteins Xpb1 (SiRe1129) and Xpf (SiRe1280). Purine nucleotide synthesis was probably reduced because five proteins (i.e. SiRe1376, SiRe1377, SiRe1378, SiRe1379, and SiRe1381) in a single operon were all downregulated in the mutant. On the other hand, all the CRISPR-associated proteins identified in the proteome were upregulated (Table II, supplemental Table S5). These represent most of the proteins in the three CRISPR systems (i.e. CRISPR IA, IIIB Cmr-α and IIIB Cmr-β) of this organism (
). CRISPR IIIB Cmr-β proteins SiRe0598, SiRe0600, SiRe0601, and SiRe0603, and CRISPR IIIB Cmr-α proteins SiRe0890, SiRe0891, SiRe0892, SiRe0893, SiRe0894, and SiRe0895 were upregulated by 1.5 ∼ 2 fold, whereas CRISPR IA proteins SiRe0765, SiRe0766, SiRe0767, SiRe0768, and SiRe0771 were slightly upregulated (<1.5 fold) (supplemental Table S5). The consequence of the upregulation of the CRISPR systems remains to be elucidated, but it may be related to the nonimmunity functions (e.g. regulation of gene expression, genome remodeling, DNA repair and cell dormancy) of these systems (
). Most of the proteins involved in carbohydrate transport and metabolism (34 out of 47 proteins, arCOG G) and energy production and conversion (eight out of nine proteins, arCOG R) were also upregulated (supplemental Fig. S8), in agreement with the observation that the mutant was able to grow to higher cell density than the parental strain. It is worth noting that eight transcriptional regulators were differentially regulated, of which seven were downregulated and the PadR family protein was upregulated. It is possible that differential regulation of some of the proteins resulted from the change in the cellular levels of these transcriptional regulators.
Table II Selected genes differentially expressed in the parental strain and ΔSisArd1
a Genes which encode putative functions in cell cycle control, nucleotide metabolism, replication, translation, repair and defense and were differentially expressed by >1.5 fold in the parental strain and ΔSisArd1 are included in the table.
b D, cell cycle control, cell division, chromosome partitioning; F, nucleotide transport and metabolism; L, replication, recombination and repair; K, transcription; V, defense mechanisms.
About Half of the Identified Proteins Are Acetylated at the N-terminus in S. islandicus
To investigate the effect of SisArd1 on protein Nt-acetylation in vivo, we determined the Nt-acetylated proteins from both ΔSisArd1 and the parent strain by using the TAILS technique (
). Each strain was grown in replicate to the exponential phase, and the proteins were isolated. Primary free amines were tetradeutero-dimethylated. Following trypsin digestion, the N-terminal peptides were enriched and subjected to LC-MS/MS. We were able to obtain 359 unique N-terminal peptides from the parental sample (supplemental Table S6). About 49% (177/359) of the identified proteins lacked the initiator methionine residue (iMet). The proportion of the proteins undergoing N-terminal methionine cleavage in S. islandicus appears to be comparable to those reported previously for other organisms (55 ∼ 70%) (
). As observed in bacteria, eukaryotes and haloarchaea, initiator methionine cleavage occurred only in proteins containing an amino acid residue with a small side chain (i.e. G, A, V, S, T) at the second position in S. islandicus (Fig. 6A) (
). In proteins whose second residue had a large side chain (i.e. E, Q, D, N, I, L, Y, M, K, F), iMet was generally retained (Fig. 6A). About 21% of the identified N-terminal peptides were fully acetylated, 23% were partially acetylated, and the remaining 56% were unacetylated at the N terminus (Fig. 6B, supplemental Table S6). As shown in Fig. 6B, all MM-starting proteins were Nt-acetylated. Nearly all (71/74) of the proteins containing Ser at the second position were Nt-acetylated, accounting for 45% of the Nt-acetylome in S. islandicus. Nt-acetylation also frequently occurred on proteins with an A, G, ME or MQ at the N terminus. Proteins with a T, MI, MY or ML at the N terminus were occasionally Nt-acetylated. Nt-acetylation was hardly detected on proteins with other N termini.
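The side-chain rule for initiator methionine removal can be written as a small sketch. Note the simplifying assumption: the function treats a small second residue (G, A, V, S, T) as sufficient for cleavage, whereas the data only show that cleavage was restricted to such proteins, and some N termini were found in both forms (Fig. 6A). The function name and example sequences are hypothetical.

```python
# Residues with a small side chain at the second position, as listed in the text.
SMALL = set("GAVST")


def predicted_n_terminus(seq):
    """Return the sequence after applying the simplified iMet cleavage rule."""
    if len(seq) >= 2 and seq[0] == "M" and seq[1] in SMALL:
        return seq[1:]  # iMet predicted to be removed
    return seq          # iMet predicted to be retained


# Invented example sequences.
cleaved = predicted_n_terminus("MSKLV")
retained = predicted_n_terminus("MEKLV")
```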
Fig. 6 N-terminal peptides identified in the S. islandicus parental strain and ΔSisArd1. A, Distribution of the second amino acid residues in the identified N-terminal peptides in the parental strain. Peptides with iMet (blue), without iMet (red) and with both types of the N terminus (green) are shown. B, Frequencies of Nt-acetylation on various N termini in the parental strain. C, Frequencies of Nt-acetylation on various N termini in ΔSisArd1. Nt-acetylated peptides (red), partially Nt-acetylated peptides (yellow) and Nt-unacetylated peptides (blue) are indicated. An N terminus, for which ≥ 3 proteins were identified in the parental or mutant strain, is shown. D, Proteins differentially Nt-acetylated in ΔSisArd1 and the parental strain. Peptides Nt-acetylated in the parental strain but unacetylated or less acetylated in the ΔSisArd1 are shown in green and cyan, respectively. All termini identified in this study are listed in supplemental Tables S6 and S7.
SisArd1 Is a Major Nt-acetyltransferase With Broad Substrate Specificity in S. islandicus
As revealed by comparing the Nt-acetylation data for ΔSisArd1 with those for the parental strain, the number of Nt-acetylated proteins in the mutant strain was substantially lower than that in the parental strain. Only 11% (36/339) of the identified proteins were Nt-acetylated in ΔSisArd1, whereas 44% (158/359) of the proteins were Nt-acetylated in the parental strain (Fig. 6B, 6C and supplemental Table S6). We subsequently compared the proteins identified in both the parental and the mutant samples (supplemental Table S7). A total of 286 proteins were identified in both samples, and 129 of them were Nt-acetylated in the parental strain. Of these Nt-acetylated proteins, 119 were unacetylated (96) or significantly less frequently acetylated (23) at the N terminus in the mutant strain. Therefore, SisArd1 was responsible for the modification of the vast majority (92%) of the acetylated N termini identified in S. islandicus (supplemental Table S7). The 119 target proteins of SisArd1 contained 16 unique N termini, which fall into two groups based on the presence of iMet. One group included S, A, G, T and, infrequently, V and E, and the other contained MD, ME, MN, MQ, MI, MF, ML, MY, MA and MM at the N terminus (Fig. 6D and supplemental Table S7). The spectrum of N termini that SisArd1 acetylated covers nearly all specific target sites for each of the six human N-terminal acetyltransferases (NatA ∼ NatF), supporting the notion that archaeal Ard1 is an evolutionary precursor of extant eukaryotic NATs. It is worth noting that there was significant residual protein acetylation at the N terminus (i.e. 36 Nt-acetylated proteins) in ΔSisArd1, strongly indicating the presence of additional N-terminal acetyltransferases in S. islandicus (supplemental Table S7). Ten of the identified proteins, including three MM-starting proteins and one MK-starting protein, were found to be Nt-acetylated to a similar extent in ΔSisArd1 and the parental strain. Most of the acetylated N termini in the mutant strain possessed the target N-terminal amino acid residues for SisArd1, suggesting that there exist unknown N-terminal acetyltransferase(s) with substrate specificity overlapping with that of SisArd1. Our observation is consistent with the identification of several putative acetyltransferases in the organism by sequence searches.
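The proportions quoted in this section follow directly from the reported counts, as the following arithmetic check illustrates. Only numbers stated in the text are used; the variable names are ours.

```python
def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


parental_acetylated = pct(158, 359)   # Nt-acetylated / identified, parental strain
mutant_acetylated = pct(36, 339)      # Nt-acetylated / identified, mutant strain
sisard1_dependent = pct(119, 129)     # SisArd1-dependent / Nt-acetylated among shared proteins

# Of the 119 SisArd1-dependent proteins, 96 were unacetylated in the mutant,
# so the remainder were the less frequently acetylated ones.
less_acetylated = 119 - 96

# Shared Nt-acetylated proteins whose acetylation did not depend on SisArd1.
unaffected = 129 - 119
```

The remainder of ten proteins agrees with the ten proteins reported as similarly Nt-acetylated in both strains.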
DISCUSSION
In the present study, we have determined the extent and pattern of protein acetylation in S. islandicus and provided the first report on the substrate specificity and potential roles of the two highly conserved archaeal protein acetyltransferases SisPat and SisArd1 in vivo. S. islandicus proteins were acetylated at numerous internal lysine residues and at the N terminus. Proteins containing Nε-acetylated lysine residues accounted for as many as ∼26% of the total proteins in the organism. However, it is worth noting that these residues were probably acetylated at a very low level because only ∼0.15% of the peptides identified were acetylated in the quantitative proteomes of S. islandicus (data not shown). Interestingly, SisPat was responsible for the acetylation of only a very small subset (24 out of 1503 quantifiable lysine sites, as judged by using a cutoff of >1.3 fold difference in modification between the parental strain and ΔSisPat) of the acetylated proteins. The vast majority of acetyllysines presumably resulted from the action of other protein acetyltransferases or nonenzymatic processes. Sequence analysis suggests that S. islandicus encodes six putative acetyltransferases in addition to SisPat and SisArd1 (data not shown). The possibility exists that some of these enzymes are involved in the acetylation of the acetylated proteins. Equally or more likely, many of the identified sites of acetylation were acetylated nonenzymatically. It was reported recently that acetyl phosphate (acP), a high-energy intermediate of the phosphotransacetylase/acetate kinase (Pta/AckA) pathway, directly donated its acetyl group to the ε-amino group of a deprotonated lysine in bacteria (
). However, because of the absence of the Pta/AckA pathway, nonenzymatic protein acetylation via the acP reaction may not occur in Sulfolobus. Therefore, it is more likely that acetyl-CoA, another highly reactive reaction intermediate shown to be potentially capable of mediating nonenzymatic protein lysine acetylation (
), plays a role in nonenzymatic acetylation of the proteins in S. islandicus.
It was believed for a long time that Sulfolobus Pat acetylated Sso10b at K16, as implicated by the reports that the lysine residue existed in an acetylated form in vivo and the protein was acetylated by Sulfolobus Pat in vitro (
Insights into the post-translational modifications of archaeal Sis10b (Alba): lysine-16 is methylated, not acetylated, and this does not regulate transcription or growth.
). On the other hand, K64 and K68 of Sis10b were occasionally acetylated in vivo. In agreement with the previous finding, only K64 and K68, but not K16, were acetylated, and, further, acetylation of the two residues was independent of the presence of SisPat in the present lysine acetylomic analysis. Apparently, K64 and K68 were not acetylated by SisPat, and Sis10b was not the substrate of the acetyltransferase in vivo. Notable among the bona fide substrates of SisPat were a group of acyl-CoA synthetases containing a conserved site of acetylation by SisPat. S. islandicus encodes a total of 15 putative acyl-CoA synthetases with different substrate specificity. All but one carry the conserved site of acetylation. Six of the enzymes were identified as among the most preferred substrates of SisPat in the acetylome, and the remaining enzymes except for the one lacking the conserved acetylation site were also likely the substrates of SisPat as demonstrated by the in vitro acetylation assays. Therefore, our data provide conclusive experimental evidence that the regulatory mechanism involving the acetylation of acyl-CoA synthetases is conserved among all three domains of life.
Acyl-CoA synthetases serve a central role in controlling CoA homeostasis and thus maintaining metabolic balance in the cell by activating acetate and other weak organic acids to acetyl-CoA and other corresponding CoA thioesters, which participate in diverse anabolic and catabolic processes (
The acetyl-CoA synthetase gene ACS2 of the yeast Saccharomyces cerevisiae is coregulated with structural genes of fatty acid biosynthesis by the transcriptional activators Ino2p and Ino4p.
). In enteric bacteria, such as E. coli and S. enterica, acetate is activated into acetyl-CoA via either the low-affinity Pta/AckA pathway or the high-affinity ACS pathway in response to variation in the level of acetate in the habitats (
). As observed in S. islandicus, many proteins (i.e. 17% of the total proteins) are potentially acetylated at the internal lysine residues in H. mediterranei. Metabolic proteins represent the largest functional group of proteins acetylated, accounting for 47 and 41% in the lysine acetylomes of S. islandicus and H. mediterranei, respectively. In addition, ribosomal proteins and aminoacyl-tRNA synthetases were frequently modified in the two species. Analysis of the sequence context at the site of acetylation reveals that positively charged amino acid residues (i.e. Lys and Arg) are enriched at the +1 and +2 positions, in relation to the residue of acetylation, in the acetylomes of the two species. In comparison, Tyr is significantly enriched at the +1 position whereas Lys or Arg is always located downstream of the +2 position in the mycobacterium, drosophila and human acetylomes (
). H. mediterranei encodes twelve acyl-CoA synthetases. Our analysis of the H. mediterranei acetylome has identified acetylated peptides from six of them. Notably, the lysine residue of the conserved motif PX4GK in these proteins, which was identical to that in S. islandicus, was acetylated. Sequence searches show that H. mediterranei encodes eleven GCN5 family acetyltransferases, and two of them (i.e. HFX_1840 and HFX_1886) share significant sequence homology with SisPat (27% for both). Therefore, we suggest that a SisPat homolog catalyzes the acetylation of acyl-CoA synthetases in H. mediterranei. We infer further that the function of SisPat homologs is conserved among archaea because acyl-CoA synthetases with the conserved lysine residue of acetylation are widespread in nearly all archaeal phyla.
Nt-acetylation was once considered a rare event in Bacteria and Archaea (
). In this study, we identified the N-terminal peptides from 359 unique proteins in S. islandicus by using a terminal amine isotopic labeling of substrates (TAILS) technique. Notably, a significant fraction of the proteins had undergone N-terminal maturation with the initiator methionine removed (∼49% of the identified proteins) and the N terminus acetylated (∼44% of the identified proteins). Although the fraction of Nt-acetylated proteins in S. islandicus is substantially higher than those in other archaeal species that have been studied, it is lower than those in human (∼80%) and yeast (∼60%) (
A comparison of Nt-acetylated and Nt-unacetylated proteins from ΔSisArd1 and the parental strain reveals that SisArd1 is the major Nt-acetyltransferase in S. islandicus because it was responsible for the Nt-acetylation of the majority (92%) of the proteins identified and the pattern of N termini acetylated by the enzyme resembles that observed in the Nt-acetylome of the organism. Interestingly, SisArd1 could Nt-acetylate 16 different N termini, with the exception of MK-starting termini, spanning the target N termini of all human Nats (NatA ∼ NatF) in vivo. Therefore, at least one N-terminal acetyltransferase remains to be identified, which should be able to catalyze efficiently Nt-acetylation of the MM- and MK-starting termini. iMet-cleaved proteins with N-terminal S, A, G or T, the targets of human NatA, and iMet-retaining proteins with N-terminal ME or MQ, the targets of human NatB, were preferentially Nt-acetylated by SisArd1. However, the pattern of Nt-acetylation frequencies for various N termini in S. islandicus differed from that in human. Over 95% of MN, MD, ME, or MQ, and >54% of ML, MI, MF, MY or MK, the target termini of NatB and NatC/E/F, respectively, were Nt-acetylated in human (
). But these termini were rarely acetylated in S. islandicus. This difference may contribute to the lower overall N-terminal acetylation in S. islandicus (∼44%) than in human (∼80%) (
). SisArd1 is most closely related to the catalytic subunit Ard1/Naa10 of NatA at the amino acid sequence level. However, structural studies reveal that the active site of Ard1 from S. solfataricus represents a hybrid of the active sites of NatA and NatE (
). Therefore, the ability of SisArd1 to modify both iMet-lacking and iMet-retaining N termini results from the use by the enzyme of a hybridized strategy, which permits the enzyme to facilitate the acetylation of distinct substrates through different catalytic mechanisms (
The availability of the mutant strains ΔSisPat and ΔSisArd1 permits a look into the functions of SisPat and SisArd1, the only two biochemically characterized protein acetyltransferases in Archaea. Neither enzyme was essential for the growth of the organism. The lack of SisPat did not hinder the growth of the mutant strain, whereas SisArd1 was required for the optimal growth of the organism. The slower growth of ΔSisArd1 relative to the parental strain is consistent with the reduced expression of genes encoding proteins functioning in cell division, cell cycle control, purine synthesis and DNA replication in the mutant strain, as revealed by quantitative proteome analysis. Nt-acetylation also affected a number of other cellular processes, such as DNA repair and CRISPR-related activities. It has been reported that Nt-acetylation is involved in the control of the cellular level of proteins such as 20S proteasome α1 protein from H. volcanii and small nucleic acid-binding proteins from Sulfolobus because the lack of Nt-acetylation resulted in an increase in the intracellular concentration of these proteins (
Insights into the post-translational modifications of archaeal Sis10b (Alba): lysine-16 is methylated, not acetylated, and this does not regulate transcription or growth.
The N-terminal penultimate residue of 20S proteasome alpha1 influences its N(alpha) acetylation and protein levels as well as growth rate and stress responses of Haloferax volcanii.
). We show in this study that 273 proteins were differentially regulated (130 upregulated and 143 downregulated) by >1.5 fold in ΔSisArd1, as compared with those in the parental strain. Only 20 of these proteins, including twelve upregulated and eight downregulated ones, were Nt-acetylated in the parental strain but not or less Nt-acetylated in ΔSisArd1. In addition, 18 quantifiable proteins were Nt-acetylated only in the parental strain and were not differentially regulated by >1.5 fold in ΔSisArd1. It appears that, although Nt-acetylation affected the cellular levels of a significant fraction (∼10%) of proteins in S. islandicus, the Ac/N-end rule is not generally applicable, as expected from the fact that fewer than ten proteins have so far been shown to obey the Ac/N-end rule (
). Of possible relevance to the control of cellular level of the proteins is the finding that eight transcriptional factors were differentially regulated in ΔSisArd1. Although none of these transcriptional factors were identified in our Nt-acetylomic analysis, the possibility exists that the synthesis of these factors was affected directly or indirectly by Nt-acetylation. Changes in gene expression of the transcriptional factors might be responsible for the changes in cellular level of the proteins in ΔSisArd1. Obviously, a better understanding of the role of Nt-acetylation in the control of cellular protein concentrations awaits further investigation.
DATA AVAILABILITY
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository with the dataset identifier PXD012246.
Acknowledgments
We thank Professor Haiteng Deng for valuable suggestions and Dr. Yuanming Luo for technical assistance.
The acetyl-CoA synthetase gene ACS2 of the yeast Saccharomyces cerevisiae is coregulated with structural genes of fatty acid biosynthesis by the transcriptional activators Ino2p and Ino4p.
Reversible N epsilon-lysine acetylation regulates the activity of acyl-CoA synthetases involved in anaerobic benzoate catabolism in Rhodopseudomonas palustris.
Insights into the post-translational modifications of archaeal Sis10b (Alba): lysine-16 is methylated, not acetylated, and this does not regulate transcription or growth.
The N-terminal penultimate residue of 20S proteasome alpha1 influences its N(alpha) acetylation and protein levels as well as growth rate and stress responses of Haloferax volcanii.
Author contributions: J.C. and L.H. designed research; J.C. and T.W. performed research; J.C., Q.W., X.Z., and L.H. analyzed data; J.C. and L.H. wrote the paper.