This work was supported by a grant from the National Institutes of Health (NIH) to J.D.J. at the Broad Institute of MIT & Harvard (U54-HG008097). This article contains supplemental material. The authors do not declare any conflict of interest.
The N-terminal regions (tails) of histone proteins are dynamic elements that protrude from the nucleosome and are involved in many aspects of chromatin organization. Their epigenetic role is well-established, and post-translational modifications present on these regions contribute to transcriptional regulation. Considering their biological significance, relatively few structural details have been established for histone tails, mainly because of their inherently disordered nature. Although hydrogen/deuterium exchange mass spectrometry (HX-MS) is well-suited for the analysis of dynamic structures, it has seldom been employed in this context, presumably because of the poor N-terminal coverage provided by pepsin. Inspired by histone-clipping events, we profiled the activity of cathepsin-L under HX-MS quench conditions and characterized its specificity employing the four core histones (H2A, H2B, H3 and H4). Cathepsin-L demonstrated cleavage patterns that were substrate- and pH-dependent. Cathepsin-L generated overlapping N-terminal peptides about 20 amino acids long for H2A, H3, and H4, proving its suitability for the analysis of histone tail dynamics. We developed a comprehensive HX-MS method in combination with pepsin and obtained full sequence coverage for all histones. We employed our method to analyze histones H3 and H4. We observe rapid deuterium exchange of the N-terminal tails and cooperative unfolding (EX1 kinetics) in the histone-fold domains of histone monomers in solution. Overall, this novel strategy opens new avenues for investigating the dynamic properties of histones that are not apparent from the crystal structures, providing insights into the structural basis of the histone code.
Histones are highly conserved proteins that are integral components of the nucleosome, the repeating unit of chromatin. In the “canonical nucleosome structure,” 147 DNA base pairs wrap around a histone octamer that consists of a core H3-H4 tetramer flanked by two H2A-H2B heterodimers (
). Interactions within the octamer core and with the surrounding DNA promote a compact and rigid nucleosome structure bordered by flexible N- and C-terminal tails. An extensive array of covalent post-translational modifications (PTMs)
) complemented by highly basic, dynamic, flanking termini that extend outside the nucleosome boundaries. Given their disordered nature, parts of the histone tails are absent in high-resolution X-ray nucleosome structures reported to date (
) was employed and for most histones provided poor coverage of the tails. Neprosin was introduced recently as an alternative enzyme for histone studies, producing N-terminal tail peptides 1–38 for H3 and 1–32 for H4 under acidic conditions (
) have yet to be tested on this class of proteins; their reported cleavage preferences, however, do not seem to render them appropriate for generating peptides of measurable size (7–25 amino acids) given the sequence of the tails. Recently, an alternative approach employing electron transfer dissociation (ETD) in combination with HX-MS top-down and middle-down approaches for this class of proteins was demonstrated (
). However, despite the high-resolution histone maps obtained, the widespread implementation of ETD in HX-MS studies lags behind typical bottom-up approaches, as the analysis, although seemingly straightforward, presents technical challenges (
), and to enhance coverage of histone tails in bottom-up HX-MS studies. Cathepsin-L is one of the 11 cysteine cathepsin proteases that share a common catalytic mechanism and strong sequence similarity with the papain super-family of proteases. Cathepsins are synthesized as propeptides and are processed into the mature form either autocatalytically or by other proteases (
). Taken together, these data suggest that cathepsin-L may be applicable to the generation of histone N-terminal peptides of appropriate size for HX-MS studies. Inspired by these reports, we investigated the activity of cathepsin-L under HX-MS quench conditions for all core histones. We show enhanced N-terminal coverage for most of the proteins and develop a method to investigate in-solution dynamics of monomeric H3 and H4 by HX-MS.
Human recombinant histones (H2A3; Q7L7L0 (M2502S), H2B2E; Q16778 (M2505S), H3.1; P68431 (M2503S) and H4; P62805 (M2504S)) were purchased from New England BioLabs Inc (Ipswich, MA). Mononucleosomes, recombinant human (16–0009) containing H2A1B (P04908), H2B1K (O60814), H3.1 (P68431), and H4 (P62805) were purchased from EpiCypher (Durham, NC). Human recombinant cathepsin-L was purchased either from PromoKine (purity >90%; PK-RP577–1135-5, PromoCell GmbH) or from BioVision (purity >90%; 1135–5, Milpitas, CA). Dulbecco's phosphate buffered saline (PBS; D8537), cathepsin-L inhibitor (RKLLW-NH2, ≥97%; SCP0110), deuterium oxide (99.9 atom % D; 151882) and guanidine hydrochloride (≥98%; 50950) were purchased from Sigma-Aldrich (Saint Louis, MO). Tris(2-carboxyethyl)-phosphine hydrochloride (TCEP·HCl; 20491) and immobilized pepsin on cross-linked agarose beads (6%; 20343) were purchased from Thermo Scientific (Rockford, IL). Acetonitrile (ACN, 99.9%; A998) and formic acid (FA, >99.5% purity; A117) were from Fisher (Fair Lawn, NJ) and water from J.T.Baker (JT4218–3; Center Valley, PA). Leucine enkephalin (MS Leucine Enkephalin Kit; 700002456) used for lock mass in the MS instrument was from Waters (Milford, MA). SDS-PAGE gels (4–15%, Mini-PROTEAN TGX Precast Protein Gels) were from Bio-Rad (4561086; Hercules, CA) and staining reagent (EZBlue™ Coomassie Brilliant Blue G-250) was from Sigma-Aldrich (G1041).
In Vitro Enzyme Activity Analysis
Individual histones (1 μl of 1 μg/μl, PBS pH 7.5) were diluted with PBS (9 μl, pH ranging from 7.5 to 2.5, TCEP 5 mm), and cathepsin-L (1 μl of 0.5 μg/μl, 25 mm sodium acetate, pH 5.5, TCEP 5 mm) was immediately added to the samples. The enzymatic digestions were allowed to proceed for 2.5 min either on ice (H3) or at room temperature (H4, H2A3 and H2B.1) and were quenched with SCP0110 (1 μl of 240 μm in PBS, pH 2.5). Samples were subsequently subjected to electrophoresis under denaturing conditions and were stained with EZBlue™ following the standard protocol.
Intact Protein Analysis and Deconvolution
Intact histones (2 μl, 50 ng/μl in 5% FA) were flow injected and analyzed on a Q Exactive Plus Orbitrap MS (Thermo Fisher Scientific, Bremen, Germany) using a standard m/z range of 400–2000. MS parameters were: spray voltage 2 kV, source temperature 250 °C, AGC target 3e6, maximum injection time (IT) 200 ms and resolution (at m/z 200) 140,000. Data were acquired in profile mode and ESI mass spectra were deconvoluted using Intact software (Protein Metrics Inc., San Carlos CA).
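As a rough illustration of what deconvolution of an ESI charge-state envelope involves, the sketch below infers the charge of two adjacent charge-state peaks and converts an m/z value to a neutral mass. It is not the algorithm used by the Intact software, which is not described here; the function names and the example mass in the usage note are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da

def charge_from_adjacent_peaks(mz_high, mz_low):
    """Infer the charge z of the peak at mz_high.

    Assumes mz_low is the same species carrying one extra proton (z + 1).
    From (mz_high - p) = M/z and (mz_low - p) = M/(z + 1) it follows that
    z = (mz_low - p) / (mz_high - mz_low).
    """
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))

def neutral_mass(mz, z):
    """Neutral (uncharged) mass of an [M + zH]z+ ion observed at mz."""
    return z * (mz - PROTON_MASS)
```

For a hypothetical 11236.0 Da protein, the 10+ and 11+ ions fall inside the m/z 400–2000 acquisition window, and either peak returns the same neutral mass once its charge is assigned.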
In-solution Digestions and Analysis Using a Q Exactive Plus MS
Individual histones (1.5 μg, PBS pH 7.5) were diluted with PBS (pH 2.5) and cathepsin-L (0.15 μg, 25 mm sodium acetate, pH 2.5) was added to the samples. For cathepsin-L activity experiments, digestions were performed in the presence of PBS containing TCEP, guanidine HCl, or both (at pH 2.5) at various concentrations (Results and Discussion). The enzymatic digestions were allowed to proceed for 10 min at either 37 °C or 0 °C and were quenched by boiling samples at 95 °C for 5 min. Samples were desalted on Empore C18 silica bead Stagetips (
). Elution mixtures were lyophilized and reconstituted in 5% formic acid before nLC-MS/MS analysis. Peptides were injected using a Proxeon Easy nLC 1000 and analyzed on a Q Exactive Plus Orbitrap MS (Thermo Fisher Scientific). Briefly, 2 μl of samples (1 μg) were loaded onto a microcapillary column (360 μm OD × 75 μm ID) with an integrated electrospray emitter tip (10 μm, New Objective) packed with 20 cm of ReproSil-Pur C18-AQ 1.9 μm beads (Dr. Maisch, GmbH). The column was heated to 50 °C using a column heater (Phoenix S&T). Peptides were separated using a flow rate of 200 nl/min and the following gradient: 3% B to 40% B in 30 min, to 90% B in 4 min, keep at 90% B for 6 min and return to initial conditions in 1 min. Samples were analyzed in data-dependent analysis (DDA) mode using a Top-12 method. Ion source parameters were: spray voltage 2 kV, source temperature 250 °C. Full MS scans were acquired in the m/z range 300–2000, with an AGC target 3e6, maximum IT 20 ms and resolution (at m/z 200) 70,000. MS/MS parameters were as follows: AGC target 1e5, maximum IT 50 ms, loop count 10, isolation window 1.6 m/z, isolation offset 0.3 m/z, NCE 27, resolution (at m/z 200) 17,500 and fixed first mass 100 m/z; unassigned and singly charged ions were excluded from MS/MS. Data were acquired in centroid mode in both MS and MS/MS scans. Peptides were identified using SpectrumMill Proteomics Workbench (prerelease version B.06.01.202, Agilent Technologies). A nonspecific enzyme search was performed against a fasta file containing sequences of human histones, pepsin, cathepsin and contaminant proteins identified upon tryptic digestion of the samples (52 entries). Peptide and fragment tolerances were at ±20 ppm, minimum matched peak intensity 40% and peptide false discovery rates were calculated to be less than 1% using the target-decoy approach (
). No fixed or variable modifications were included in the search. Spectra with a score higher than 4 were accepted.
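The ±20 ppm tolerance quoted above can be made concrete with a short sketch. This only illustrates how a ppm mass error is computed and compared with a tolerance; it is not the SpectrumMill matching code, and the function names are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def within_tolerance(observed_mz, theoretical_mz, tol_ppm=20.0):
    """True if the observed m/z lies within +/- tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm
```

At m/z 1000, a ±20 ppm window corresponds to ±0.02 m/z, so an observed value of 1000.015 would be accepted and 1000.025 rejected.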
HX Sample Preparation and Analysis Using a QToF-MS
For hydrogen exchange experiments, individual histones (3 μl, 0.4 μg/μl in PBS; 10 mm Na2HPO4, 1.8 mm KH2PO4, 2.7 mm KCl and 147 mm NaCl, pH 7.35) were equilibrated on ice and were mixed with ice-chilled deuterated PBS (47 μl, final D2O content during reaction 94% v/v), prepared by two cycles of lyophilization and reconstitution in D2O. Samples were quenched at 10, 100, 1000, and 10,000 s with TCEP (5 μl of 55 mm containing 13.8% FA) to a final pH 2.4 and were snap-frozen in liquid nitrogen. Non-deuterated samples (0 s) were prepared in PBS and full deuteration (FD) controls were prepared by incubating proteins as described above for 10 min at 95 °C. Samples were prepared and analyzed in duplicate. Prior to LC-MS analysis, samples were thawed using a mini centrifuge and were mixed with ice-chilled cathepsin-L (5 μl, 100 ng/μl in 5 mm TCEP, pH 2.5). Samples were incubated for 5 min at 3 °C using an Eppendorf™ Thermomixer™ R (Thermo Scientific) and were immediately injected into a nanoAcquity UPLC with HX technology (Waters). Samples were digested online for 1 min at 300 μl/min using the Enzymate™ BEH Pepsin Column (130 Å, 2.1 × 30 mm, 5 μm, 186007233; Waters) at 0 °C and were subsequently loaded on an Acquity UPLC® BEH C18 VanGuard pre-column (130 Å, 1.7 μm, 2.1 × 5 mm, 186003975; Waters) using FA (0.23% (v/v)). Peptides were separated on an Acquity UPLC® BEH C18 analytical column (130 Å, 1.7 μm, 1 × 100 mm, 186002346; Waters) at 40 μl/min using solvents A (0.23% v/v FA) and B (0.23% v/v FA in ACN). The following gradient was applied: 3% B to 10% B in 0.5 min, to 40% B in 7 min, to 60% B in 1.2 min, to 97% B in 0.3 min; kept at 97% B for 0.6 min and returned to initial conditions in 0.4 min. Samples were analyzed on a Synapt XEVO G2-XS QToF MS (MassLynx™ 4.1, SCN 916; Waters).
Spectra were collected in Resolution mode (m/z 350–1500) using the following parameters: capillary voltage 3 kV, cone voltage 40 V, ion source block temperature 80 °C, cone gas flow 100 L/h, desolvation gas flow 800 L/h at 150 °C, and nebulizer gas flow 6 bar. Step Wave values were: DC Offset, 15V; Wave Height 1, 5 V; Wave Height 2, 30 V; Wave Velocity 1, 150 m/s and Wave Velocity 2, 150 m/s. Leucine enkephalin (2 ng/μl in 50% [v/v] ACN and 0.23% [v/v] FA) was infused for lock mass spray at 4 μl/min; three spectra of 0.5 s were acquired every 20 s. For peptide identification, individual histones (1 μg each) were run in data-dependent acquisition (DDA) mode upon digestion with cathepsin-L, pepsin or a serial combination of both (Results & Discussion). MS survey scan time was 0.1 s and the 5 most intense multiply-charged ions with >10000 intensity/s were chosen for MS/MS. Fragmentation scans were acquired for m/z 100–1600 with a total ion current target of 15000 intensity/s for up to 0.8 s. A ramped collision energy profile of 18–40 eV was applied and fragmented precursor ions were excluded from further MS/MS for 8 s. The lock mass spray and UPLC parameters were as described above. Peptides were identified using PLGS (Waters). For all histones, a nonspecific enzyme search was performed with parameters as described above for SpectrumMill Proteomics Workbench identification. Spectra were recalibrated using Leucine-Enkephalin (556.2771 Da) for lock mass.
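The precursor selection rules described above (the five most intense multiply charged ions above 10000 intensity/s, with fragmented precursors excluded for 8 s) can be sketched as follows. This is only an illustration of the selection logic, not the instrument control software; the `Precursor` class, the `select_precursors` function, and the 0.05 m/z matching window used for exclusion are assumptions introduced here.

```python
from dataclasses import dataclass

@dataclass
class Precursor:
    mz: float
    charge: int
    intensity: float  # intensity per second in the survey scan

def select_precursors(survey, excluded, now, top_n=5,
                      min_intensity=10000.0, exclusion_s=8.0, mz_tol=0.05):
    """Pick up to top_n precursors for MS/MS from one survey scan.

    excluded maps a previously fragmented m/z to the time (s) it was selected;
    it is updated in place. mz_tol is a hypothetical matching window.
    """
    def is_excluded(p):
        return any(abs(p.mz - mz) <= mz_tol and now - t < exclusion_s
                   for mz, t in excluded.items())

    candidates = [p for p in survey
                  if p.charge >= 2              # multiply charged ions only
                  and p.intensity > min_intensity
                  and not is_excluded(p)]
    candidates.sort(key=lambda p: p.intensity, reverse=True)
    chosen = candidates[:top_n]
    for p in chosen:
        excluded[p.mz] = now                    # start the exclusion timer
    return chosen
```

Calling the function on successive survey scans with the same `excluded` dictionary lets lower-abundance ions be sampled once the most intense precursors have been fragmented.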
Analysis of Deuterium Uptake
Peptide lists were subsequently loaded into DynamX 3.0 (Waters) and deuterium (D) uptake curves of individual peptides were calculated based on the assigned centroid m/z value of the isotopic cluster envelopes. d-uptake values were normalized using experimental values obtained from the analysis of FD controls divided by the deuterium fraction (0.94) in the solvent during the exchange and the theoretical maximum number of exchangeable amides (
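The centroid-based uptake calculation and its normalization against the FD controls can be sketched as below. This is one plausible reading of the description above rather than the DynamX implementation: the function names are hypothetical, and the count of exchangeable amides assumes backbone amides only, excluding the N-terminal residue and prolines.

```python
def centroid_mz(peaks):
    """Intensity-weighted mean m/z of an isotopic envelope.

    peaks is a list of (mz, intensity) pairs.
    """
    total = sum(intensity for _, intensity in peaks)
    return sum(mz * intensity for mz, intensity in peaks) / total

def deuterium_uptake(centroid_t, centroid_0, charge):
    """Mass increase (Da) of a peptide relative to its non-deuterated control."""
    return (centroid_t - centroid_0) * charge

def max_exchangeable_amides(sequence):
    """Assumed maximum: one backbone amide per residue except the
    N-terminal residue and any proline."""
    return len(sequence) - 1 - sequence[1:].count("P")

def deuterium_recovery(uptake_fd, sequence, d_fraction=0.94):
    """Fraction of the theoretical maximum recovered in the FD control,
    given the deuterium fraction of the labeling solvent."""
    return uptake_fd / (d_fraction * max_exchangeable_amides(sequence))

def fractional_uptake(uptake_t, uptake_fd):
    """Uptake at a time point expressed relative to the FD control."""
    return uptake_t / uptake_fd
```

For a hypothetical peptide `AKTPQLGR` (6 exchangeable amides under the assumption above), an FD uptake of 4.7 Da in 94% D2O would correspond to a recovery of about 83%.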
)), we digested the four core histones individually over a pH gradient ranging from basic to acidic to test the enzyme's activity (Fig. 1A). Cathepsin-L demonstrated cleavage patterns that were substrate- and pH-dependent and, interestingly, exhibited optimum activity in the lower acidic range (∼3.5–2.5) for all substrates. For H3.1, H2A, and H2B, several sub-bands were detected across the entire pH range tested, although activity against H3.1 was reduced at pH 2.5. Digestions at higher pH values yielded fewer bands, indicating lower enzyme activity, in accordance with similar in vitro experiments of cathepsin-L (73.3% sequence similarity) for H3.1 (
). In contrast, cathepsin-L cleaved H4 exclusively at pH values <5.5 with highest intensity fragments detected at pH 2.5. These results confirm previous reports showing that cathepsin-L has maximal activity at acidic pH and that cathepsin-L affects substrate kinetics in a pH-dependent manner (
We next employed nanoLC-MS/MS to identify peptides generated by cathepsin-L for the four histones (supplemental Fig. S1) and investigated cleavage preferences. Prior to digestion, intact histones were analyzed individually by ESI-MS, confirming that the major protein species present had complete and unmodified sequences (∼11–15 kDa, supplemental Fig. S2). Histones were digested in triplicate in a slightly reducing environment that enhances the enzyme's activity (Fig. 1B). The enzyme's reproducibility was assessed by the number of common peptides identified within individual histones across replicates. Accordingly, 57% of peptides were identified in all three replicates for H2A, 53% for H2B, 55% for H3.1 and 65% for H4 (supplemental Table S1). Of note, another ∼20% of peptides were identified in two replicates for individual histones. The moderate reproducibility observed is attributed to the notable secondary exopeptidase activity detected (supplemental Fig. S3A), concordant with cysteine cathepsins that are known to contain critical structural elements for both endopeptidase and exopeptidase activities (
). The median pairwise Pearson's correlation of peptide intensities detected in at least two replicates was 0.93 (supplemental Fig. S3B). The “primary” cleavage sites (derived from peptides accounting for ∼80% of total ion current) are depicted in supplemental Fig. S3C. For H2A3, cleavage at G44 is in line with cathepsin-L in vitro proteolysis experiments of H2A2A (98% homologous to H2A3) at pH 6 reported previously (
). For H4, proteolysis using cathepsin-L has not yet been reported in the literature. Our studies indicate that R23 is the most abundant among “primary” cleavage sites, with a few more detected across the protein's length.
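The replicate-overlap percentages used above to assess reproducibility can be computed along the following lines. This is a minimal sketch that takes the union of peptides identified in any replicate as the denominator, which is an assumption; the function name and peptide identifiers are hypothetical.

```python
from collections import Counter

def replicate_overlap(replicates):
    """Fraction of peptides found in exactly n replicates, for n = 1..N.

    replicates is a list of collections of peptide identifiers, one per
    replicate. The denominator is the number of distinct peptides seen
    in at least one replicate.
    """
    seen_in = Counter()
    for replicate in replicates:
        seen_in.update(set(replicate))      # count each peptide once per replicate
    total = len(seen_in)
    tally = Counter(seen_in.values())       # how many peptides were seen n times
    return {n: tally[n] / total for n in range(1, len(replicates) + 1)}
```

With three replicates, the value returned for `3` corresponds to the share of peptides identified in all replicates, and the value for `2` to the share identified in two.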
We then scanned for optimal concentrations of reducing agent (TCEP) and denaturant (guanidine hydrochloride), typically added to the samples during the quenching step (0 °C, pH 2.5) and prior to proteolysis to improve digestion efficiency (
). Our results indicate that a slightly reducing environment (5 mm) and no denaturant result in optimal Cathepsin-L activity (Fig. 1B–1C). This is consistent with maintenance of a reduced active site Cys while leaving structural disulfides intact.
Next, to determine the amino acid cleavage preferences of cathepsin-L at pH 2.5, we combined all histone peptides identified above and discarded redundant cleavage sites (194 peptides remaining). We aligned peptides (P5 to P4′ (
). To account for the high fraction of positively charged amino acids, as a reference set, we employed the four histones and cleaved consecutive amino acids across their entire sequences to generate 9 amino acid peptide sequences (455 peptides). Comparison of the frequency percentage of an amino acid at locations P5 to P4′ between the experimental and reference set revealed a preference for valine, leucine and isoleucine in P2 (positive values in Fig. 2), in agreement with previous studies on cathepsin-L cleavage of hemoglobin (
). We compared peptide maps generated using cathepsin-L proteolysis, online pepsin digestion and in-tandem cathepsin-L/online pepsin digestion under HX-MS conditions (cathepsin-L:histone ratio 1:2, 3 °C) (supplemental Table S2). For H3, pepsin produced long peptides (∼50 amino acids), whereas cathepsin-L generated similar size peptides but also additional overlapping ones (1–21 and 1–22), improving resolution in this region (Fig. 3, Table I, supplemental Table S2). Peptide 1–21 is the shortest generated H3 peptide (at pH 2.5) reported in the literature to date (neprosin produced peptide 1–32 at similar pH (
)) and contains five amino acids that can be post-translationally modified (K4, K9, S10, K14 and K18), emphasizing the challenging nature of the H3 tail. For H4, cathepsin-L generated peptides ∼23 amino acids long and several complementary ones starting at D24, corroborating R23 as a “primary” cleavage site. Pepsin cleaved H4 at D24 generating peptides 1–24 and 25–49, partially demonstrated in a previous HX study of the H3-H4 dimer (
). Combined use of both cathepsin-L and pepsin almost doubled the number of H4 peptides and overlapping peptides and was therefore preferred despite a slight increase observed in the average length that was the result of the larger peptides generated by pepsin. For H2A each protease generated several peptides with few of them being common among cathepsin-L and pepsin; five overlapping peptides with overhanging C-terminal residues were detected up to residue G28, providing the highest resolution among histones. Finally, pepsin proved better for H2B giving a shorter peptide (
), whereas cathepsin-L produced peptides that were all >37 amino acids in length. The use of both proteases in this case however resulted in shorter average peptide lengths and greater sequence overlap. Given the highly basic content of the N termini, detected peptides were highly charged and therefore excellent candidates for HX studies employing ETD for single-residue resolution (
). In contrast to the tails, both proteases generated many overlapping peptides for the remaining protein sequences (beyond amino acid 50), emphasizing the contrast in the amino acid composition between the tails and the core histone-fold regions. Using cathepsin-L, we were able to obtain full sequence coverage for H2A and H4, and ∼85% coverage for H2B and H3. Overall, the use of cathepsin-L/online pepsin digestion in-tandem proved more efficient compared with each protease alone, evident by the high number of peptides identified and improved resolution. Of note, the average peptide length was not impacted using single or dual protease digestion and was in the range of 18–25 amino acids. We further tested the developed digestion scheme in intact mononucleosomes containing the histone octamer (H2A1B, H2B1K, H3 and H4). Overlapping N-terminal peptides for H2A (
) were detected and full coverage was obtained for all histones (with the exception of H3, 93%), indicating that our method may be deployed successfully in a mononucleosomal context (supplemental Table S2).
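Sequence coverage and average peptide length, the two metrics used above to compare digestion schemes, can be derived from a list of peptide start and end positions. The sketch below is illustrative only; the function names and the example peptide map are hypothetical.

```python
def sequence_coverage(peptides, protein_length):
    """Fraction of residues covered by at least one peptide.

    peptides is a list of (start, end) positions, 1-indexed and inclusive.
    """
    covered = set()
    for start, end in peptides:
        covered.update(range(start, end + 1))
    return len(covered) / protein_length

def average_peptide_length(peptides):
    """Mean number of residues per identified peptide."""
    return sum(end - start + 1 for start, end in peptides) / len(peptides)
```

For a hypothetical 60-residue protein with peptides (1, 21), (1, 22) and (24, 50), residues 1–22 and 24–50 are covered, giving 49 of 60 residues and an average peptide length of about 23 amino acids. Overlapping peptides such as 1–21 and 1–22 do not increase coverage, but they do improve the resolution of the exchange data.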
Table I. Summary of peptide identification results under HX-MS conditions for histones following digestion with cathepsin-L, online pepsin, and a combination of both
During DNA replication, newly synthesized histones are recognized by chaperones that coordinate their nuclear import and deposit them onto DNA to form nucleosomes. Most histone chaperones have a preference for binding and importing histone heterodimers (
), disregarding monomeric states. Recent evidence, however, has shown that H3 and H4 are predominantly monomeric in the cytosol and can be rapidly imported into the nucleus bound tightly to importin-β proteins (
). Obtaining structural information at the monomer level, therefore, is crucial for delineating structural elements governing histone-chaperone interactions. We employed our method using dual cathepsin-L/pepsin digestion to probe conformational dynamics of the H3 and H4 monomers. The monomeric states of the proteins were confirmed using size-exclusion chromatography (supplemental Fig. S4). Proteins were incubated in D2O for different periods of time (10 s, 100 s, 1000 s and 10000 s) and exchange was performed on ice to slow HX rates (
). On quenching, proteins were digested for a total of 5 min using cathepsin-L followed by online pepsin digestion and analyzed by LC-MS (Fig. 4A). The d-uptake of individual profiles was subsequently calculated; for peptides with bimodal isotope distributions, the fraction of each population was modeled further (Fig. 4B). In total, 36 peptides were generated for H3 (87% coverage) and 56 for H4 (100% coverage) monomers and a subset of those was analyzed for their d-content (23 for H3 and 30 for H4), depicted in the form of a heat map for individual time points (Fig. 5A and 5B) and onto the histone structures (Fig. 5C and 5D). d-recoveries were 83% for H3 and 80% for H4 (average values, calculated as described in (
Both monomers demonstrated highly flexible N termini, as evidenced by extensive deuteration at the earliest time point. For H3, all amide protons exchanged with deuterons (peptides 1–21 and 1–22), indicating that this region is unstructured. To our knowledge, this is the first time that the dynamics of the H3 N terminus have been captured at this peptide resolution by bottom-up approaches. In previous HX-MS studies of H3 and variants, poor coverage of the N terminus was obtained using pepsin (