Analyses of Histone Proteoforms Using Front-end Electron Transfer Dissociation-enabled Orbitrap Instruments

Histones represent a class of proteins ideally suited to analyses by top-down mass spectrometry due to their relatively small size, the high electron transfer dissociation-compatible charge states they exhibit, and the potential to gain valuable information concerning combinatorial post-translational modifications and variants. We recently described new methods in mass spectrometry for the acquisition of high-quality MS/MS spectra of intact proteins (Anderson, L. C., English, A. M., Wang, W., Bai, D. L., Shabanowitz, J., and Hunt, D. F. (2015) Int. J. Mass Spectrom. 377, 617–624). Here, we report an extension of these techniques. Sequential ion/ion reactions carried out in a modified Orbitrap Velos Pro/Elite™ capable of multiple fragment ion fills of the C-trap, in combination with data-dependent and targeted HPLC-MS experiments, were used to obtain high resolution MS/MS spectra of histones from butyrate-treated HeLa cells. These spectra were used to identify several unique intact histone proteoforms with up to 81% sequence coverage. We also demonstrate that parallel ion parking during ion/ion proton transfer reactions can be used to separate species of overlapping m/z that are not separated chromatographically, revealing previously indiscernible signals. Finally, we characterized several truncated forms of H2A and H2B found within the histone fractions analyzed, achieving up to 93% sequence coverage by electron transfer dissociation MS/MS. Results of follow-up in vitro experiments suggest that some of the truncated histone H2A proteoforms we observed can be generated by cathepsin L, an enzyme known to also catalyze clipping of histone H3.

Chromatin is the structural framework that packages DNA into chromosomes within the nucleus of a cell (2). Histones comprise the principal protein component of chromatin and are involved in the regulation of gene expression (3,4). This epigenetic regulation is achieved through complex patterns of post-translational modifications (PTMs), the incorporation of histone variants, and controlled histone proteolysis (5-10). Comprehensive characterization of histones by mass spectrometry (MS) has proven technically difficult for a number of reasons. Traditional methods (bottom-up MS) of sequence determination and PTM site localization are not practical. Histone N-terminal regions are rich in lysine and arginine residues, and thus proteolysis using trypsin generates peptides that are too small or that are poorly retained on reverse-phase HPLC C18 resins for subsequent MS detection (11). With the advent of electron transfer dissociation (ETD) and more efficient electron capture dissociation fragmentation methods, which are better suited for larger, more highly charged peptides (12,13), several studies utilizing other endoproteases to generate longer peptides have emerged (14-16). Although these methodologies do well to preserve the combinatorial PTM profiles of histone tails, in some cases it is still impossible to identify the proteoforms from which these peptides originate. This is why analyzing histones intact, as they exist in the cells from which they are derived, is the best method for identifying unique histone proteoforms.
The results of several recent studies involving top-down analyses of histones highlight the complexity of the histone proteome as well as important biological implications that are not easily captured unless the proteoforms are analyzed intact (17-20). Although these results are very promising, this approach is not routinely utilized because of the many challenges associated with it. In a 2014 Proteomics article authored by members of the Consortium for Top-Down Proteomics, Dang et al. concluded that analyses of intact proteins could be greatly improved by the development of higher resolution isolation capabilities and by the ability to acquire rich, sequence-informative MS/MS spectra on a time scale more compatible with chromatography (18).
We have previously described modifications to a Thermo Fisher Scientific Orbitrap Velos Pro™ mass spectrometer that make it possible to perform multiple fills of the C-trap with product ions from sequential ion/ion reactions of intact proteins prior to high resolution mass analysis (1,21). The result is a dramatic enhancement of the observed fragment ion current and significant improvement of signal-to-noise ratios (SNRs) in MS/MS spectra. This eliminates the need for the acquisition of several time-consuming transients and allows for the acquisition of high quality spectra on a time scale compatible with chromatography. The front-end ETD (FETD) source can also be used to ionize ion/ion proton transfer (IIPT) reagents. IIPT reactions are used to disperse fragment ions over the available m/z range in a controlled manner and to simplify ETD fragment ion spectra (1,22,23). Additionally, we have enabled parallel ion parking, first reported by McLuckey and co-workers (24,25), which involves harmonic excitation of selected ions within the ion trap to reduce their reactivity in gas-phase ion/ion reactions. This allows us to perform IIPT without reacting the precursor outside of the targeted product m/z range. Here, we demonstrate the power of these new techniques through their coupling with on-line nano-LC and application to the analysis of intact histones derived from butyrate-treated HeLa cells.

EXPERIMENTAL PROCEDURES
Unless otherwise stated, all MS analyses were carried out on an in-house modified (21) hybrid Orbitrap Velos mass spectrometer (Thermo Fisher Scientific, San Jose, CA). During the period in which the experiments described below were performed, the instrument was upgraded to include a high-field Orbitrap mass analyzer (26). Experiments involving recombinant H4 and parallel ion parking were performed following the upgrade and so make use of enhanced FT (27). Otherwise, spectra were acquired in magnitude mode. All spectra were acquired in reduced profile mode, using instrument default noise reduction.
Direct Infusion of Recombinant H4-Recombinant H4 was purchased from New England Biolabs (Ipswich, MA), desalted using an Amicon® Ultra-0.5 3K centrifugal filter (Millipore, Billerica, MA), and reconstituted in a solution containing 40% acetonitrile and 60% of 0.1% acetic acid in water (all percentages expressed as v/v). The sample was directly infused at a concentration of 5 pmol/μl and ionized using microelectrospray ionization as described previously (28) into the front end of an in-house modified ETD/IIPT-enabled Orbitrap Elite™. MS/MS spectra were acquired using the following instrument parameters: resolution (r) = 60,000 at 400 m/z; 190-2000 m/z scan range; 1E5 ion counts precursor automatic gain control (AGC) target; 3E5 reagent AGC target; 1-20 fragment ion fills of the C-trap; 1 scan/spectrum; 2 m/z precursor isolation window centered at 708.4 m/z ([M + 16H]16+ charge state of H4); 6-8-ms ETD; 0-20-ms IIPT.
Histone Extraction from HeLa S3 Cells-Cultured HeLa S3 cells were treated with 10 mM sodium butyrate overnight and harvested, and the nuclei were isolated as described previously (29). The nuclear pellet was resuspended in 0.4 M NaCl containing 1 mM dithiothreitol (DTT), 0.3 mM 4-(2-aminoethyl) benzenesulfonyl fluoride hydrochloride, 10 mM sodium butyrate, and 5 μM Halt protease inhibitor mixture at a ratio of 10:1 (v/v), salt buffer/pellet. This was incubated with constant rotation for 2 h at 4°C and then spun at 3000 × g for 5 min. The supernatant was discarded, and the pellet was resuspended in a 2.5 M NaCl solution containing DTT, 4-(2-aminoethyl) benzenesulfonyl fluoride hydrochloride, sodium butyrate, and protease inhibitors as above in a ratio of 5:1 (v/v), salt buffer/pellet. The histones were then acid-extracted and TCA-precipitated as described previously (29).
Off-line Histone Fractionation-Histones purified from HeLa S3 cells (~100 μg) were dissolved in water and chromatographically separated on a Vydac C18 column (4.6-mm inner diameter, 250-mm length, 5-μm particle size) at a flow rate of 0.8 ml/min using a gradient of 30-60% solvent B in 100 min, 60-100% B in 20 min, and 100-30% B in 10 min (solvent A = 5% (v/v) acetonitrile, 0.2% (v/v) trifluoroacetic acid in HPLC-grade water; solvent B = 95% (v/v) acetonitrile, 0.19% (v/v) trifluoroacetic acid in HPLC-grade water). Fractions were collected using an automatic sample collector in 1-min intervals between 15 and 80 min from the start of the gradient. The UV detector was set to monitor absorbance at 214 nm, and fractions containing histones H1, H2A, H2B, and H4 were identified and appropriately combined based on their known chromatographic retention times (29). These were transferred to microcentrifuge tubes, taken to dryness using a SpeedVac concentrator, and stored at −35°C.
On-line RP-HPLC for Mass Spectrometry-Dried samples were reconstituted in water at concentrations of 5-10 pmol/μl. Aliquots of 1-2.5 pmol of protein along with 100 fmol of peptide standards were pressure-loaded onto a 360-μm outer diameter × 75-μm inner diameter fused-silica micro-capillary analytical column packed with 11 cm of POROSHELL 300SB-C18 resin (Agilent Technologies, Palo Alto, CA) (30) and equipped with a laser-pulled electrospray emitter tip (31). The column was washed with ~20 column volumes of 0.3% (v/v) formic acid in water. Samples were eluted at 60 nl/min using a gradient of 0-33% B in 10 min, 33-48% B in 60 min, 48-100% B in 10 min, and 100-0% B in 10 min. The elution gradient utilized solvent A, 0.3% formic acid in water, and solvent B, 0.3% formic acid, 72% acetonitrile, 18% isopropyl alcohol, and 9.7% water (all v/v). Following reverse-phase HPLC separation, samples were directly ionized using microelectrospray ionization as described (28). For some samples, initial screens revealed that better chromatography might be achieved by making simple changes to the elution gradient. These changes included doubling the 0-33% B portion of the gradient to 20 min for samples containing truncated histones and altering the gradient used for H1 to 0-20% B in 10 min, 20-40% B in 60 min, and then 40-100% B in 10 min. All other aspects of the RP-HPLC separation and ionization for mass spectrometry remained the same across all analyses.
On-line MS Analyses-Samples were ionized by microelectrospray into the front end of an in-house modified ETD/IIPT-enabled Orbitrap Velos Pro™. Except for the occasional targeted analyses, the instrument was operated in data-dependent mode in which the MS1 scan was taken from m/z 300-2000 in the Orbitrap mass analyzer (r = 60,000 at m/z 400) followed by a second MS1 scan acquired in the linear ion trap over a limited m/z range, upon which data-dependent selection was based. In this way, reselection of the same species in a different charge state was reduced. A single ETD/IIPT MS/MS scan was also taken in the Orbitrap. MS/MS scan parameters were set as follows: r = 30k, 60k, or 100k at m/z 400; 200-2000 m/z scan range; 1E5 ion counts AGC target; 3E5 ion counts reagent AGC target; 5-30 fragment ion fills of the C-trap; 1-3 scans; 2-3 m/z precursor isolation window; 4-25-ms ETD (fluoranthene reagent); and 20-40-ms IIPT (sulfur hexafluoride reagent). The instrument was operated using charge state exclusion (+1, +2, and +3) and dynamic exclusion with a repeat count of 1-2, repeat duration of 60 s, and an exclusion duration of 90 s. Analytes of different sizes and charge densities require different ETD/IIPT reaction times, C-trap fill counts, and resolving power; these were varied depending on the fraction being analyzed. For example, when analyzing H1, a 21.9-kDa protein, shorter ETD and IIPT reaction times, more multiple fills, and higher resolution settings were used than those used for the analysis of H4, an 11.4-kDa protein. Duty cycle times also varied accordingly, but in general they were within 6-15 s/cycle. Some fractions contained larger proteins that required MS/MS scan ranges over 2000 m/z for sequence determination and PTM site mapping. However, the instrument control software did not allow the acquisition of spectra with scan ranges exceeding 2000 m/z for ETD/IIPT MS/MS scan types in data-dependent mode.
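As an illustration of the standard on-line elution program described above, the following minimal sketch returns the solvent B percentage at a given time. It assumes each gradient segment is linear and starts at 0% B at time zero; the function and variable names are hypothetical and are not part of any instrument software.

```python
from bisect import bisect_right

# Breakpoints (time in min, % solvent B) for the standard on-line gradient:
# 0-33% B in 10 min, 33-48% B in 60 min, 48-100% B in 10 min, 100-0% B in 10 min.
# Linear segments are an assumption made for this sketch.
STANDARD_GRADIENT = [(0.0, 0.0), (10.0, 33.0), (70.0, 48.0), (80.0, 100.0), (90.0, 0.0)]


def percent_b(t_min, gradient=STANDARD_GRADIENT):
    """Linearly interpolate the % solvent B at time t_min (minutes)."""
    times = [t for t, _ in gradient]
    if t_min <= times[0]:
        return gradient[0][1]
    if t_min >= times[-1]:
        return gradient[-1][1]
    i = bisect_right(times, t_min)  # index of the first breakpoint after t_min
    (t0, b0), (t1, b1) = gradient[i - 1], gradient[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (5, 40, 75):
        print(f"{t:>3} min: {percent_b(t):.1f}% B")
```

The modified gradients (for truncated histones or H1) could be expressed by passing a different breakpoint list as the `gradient` argument.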
This prompted us to perform "on-the-fly" targeted analyses based upon full MS data acquired in the initial sample screenings. Instrument method parameters were varied at our discretion based on our evaluation of the MS/MS spectra generated as data were acquired.
High resolution full MS and MS/MS data were manually inspected using Qual Browser software (Thermo Fisher Scientific). Interpretation of all ETD/IIPT MS/MS data was performed manually on the unprocessed raw spectra. Theoretical fragment ion masses were calculated using an in-house-developed fragment mass calculator. Theoretical isotopic distributions were modeled using Isotope Pattern Calculator software (Pacific Northwest National Laboratories, omics.pnl.gov/). Percent sequence coverage was calculated by dividing the number of observed N-Cα bond cleavages by the total number of predicted N-Cα bond cleavages (note that cleavage of the N-Cα bond that is N-terminal to proline does not produce an observable fragment). Masses for all intact proteoforms identified agreed to within 5 ppm and were confirmed by MS/MS. Relative abundance information for each truncated histone species was determined by taking the area under the curve of the extracted ion chromatogram for the most abundant isotope for every charge state present utilizing a 10-ppm mass window (32).
Parallel Ion Parking during IIPT-For parallel ion parking experiments, the MS1 scan was taken from m/z 300 to 2000 in the high-field Orbitrap mass analyzer (1E6 ion counts AGC target, r = 120,000 at 400 m/z). A 40 m/z window centered at m/z 810 (790-830 m/z) was selected and subjected to 140 ms of IIPT. The instrument control software was modified to apply a custom RF waveform to the x-rods of the linear ion trap during IIPT. The waveform included frequencies mapping to m/z 2300-3900, corresponding to the desired final charge state of the precursor. For the sample analyzed (fraction H2B), the [M + 5H]5+ charge state was the desired product (~m/z 2770). The amplitude of the waveform was adjusted such that maximum activation of the target product ions was achieved without dissociation or ejection. Product ions were analyzed in the Orbitrap (1E5 ion counts AGC target, r = 120,000 at 400 m/z) after five fills of the C-trap.
In Vitro H2A Proteolysis and nano-HPLC-MS/MS Analysis-The in vitro proteolysis of recombinant histone H2A type 2 (Uniprot ID, Q6FI13) was carried out in 50 mM MES, pH 6.0, and 5 mM DTT. Recombinant cathepsin L (R&D Systems 1515-CY-010) was used in a 5000:1 (histone/enzyme) ratio during a 1-h time course reaction. 20 μg of the sample reaction was quenched with 10% acetic acid after 15, 30, and 60 min. Samples were desalted using in-house packed Stage tips as described previously (29). After Stage tip desalting, samples were resuspended in 0.1% formic acid prior to nano-LC-MS/MS analysis. 1 μg of sample was loaded onto a 75-μm inner diameter × 17-cm Reprosil-Pur C18-AQ (3 μm; Dr. Maisch GmbH, Germany) column using an EASY-nLC nano-HPLC (Thermo Fisher Scientific). The HPLC gradient was 0-100% solvent B (A = 0.1% formic acid; B = 95% acetonitrile, 0.1% formic acid) over 60 min at a flow rate of 250 nl/min. nano-LC was coupled to an Orbitrap Fusion mass spectrometer (Thermo Fisher Scientific). MS1 spectra (m/z 300-1200) were acquired in the Orbitrap with a resolution of 120,000 (at 200 m/z) and an AGC target of 5E5. Data-dependent selection was performed using TopSpeed with a cycle time of 3 seconds. MS/MS for charge states +2 to +5 was performed using higher-energy collisional dissociation (HCD), with a normalized collision energy of 27%, a maximum injection time of 150 ms, and an AGC target of 5E4. Higher charge states were fragmented using ETD, using a reaction time of 20 ms, a maximum injection time of 400 ms, 3 scans, and an AGC target of 5E5. Both types of MS/MS fragmentation were acquired in the Orbitrap with a resolution of 30,000 (at 200 m/z). Data were analyzed using Proteome Discoverer (version 1.4, Thermo Scientific, Bremen, Germany) with Mascot (Matrix Science, London, UK) as the database search engine (database: human histones, Uniprot, March, 2014, 58 entries).
Search parameters were as follows: HCD spectra were searched using a precursor ion mass tolerance of 10 ppm and a fragment ion mass tolerance of 0.05 Da; ETD spectra were deconvoluted using Xtract (Thermo Fisher Scientific) and searched using a precursor ion mass tolerance of 10 ppm and a fragment ion mass tolerance of 0.02 Da. Unspecific cleavage (no enzyme) and no modifications were specified. Only identifications with a false discovery rate of <1% (calculated using fixed value PSM validator) were accepted. Chromatograms of the relevant peptides were manually extracted using Xcalibur (Thermo Fisher Scientific) after deconvolution for peptide quantification.
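The sequence coverage calculation described above can be sketched as follows. This is one reading of the definition, assuming that sites N-terminal to proline are excluded from the predicted total; the function name, the site-numbering convention, and the toy sequence are hypothetical and do not represent the in-house calculator.

```python
def sequence_coverage(sequence, observed_sites):
    """Percent sequence coverage from observed backbone cleavage sites.

    Site i (1-based) is the backbone position between residues i and i+1.
    Sites immediately N-terminal to proline are not counted as predicted
    cleavages (assumption based on the note in the text).
    """
    n = len(sequence)
    # sequence[i] (0-based) is residue i+1, i.e. the residue C-terminal to site i.
    predicted = {i for i in range(1, n) if sequence[i] != "P"}
    if not predicted:
        raise ValueError("sequence has no predicted cleavage sites")
    observed = set(observed_sites) & predicted
    return 100.0 * len(observed) / len(predicted)


if __name__ == "__main__":
    # Toy 5-residue sequence: sites 1, 2, and 4 are predicted; site 3 precedes Pro.
    print(f"{sequence_coverage('ACDPK', {1, 4}):.1f}% coverage")
```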
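To illustrate how charge reduction moves a precursor into the parking window, the following minimal sketch computes the m/z of successively charge-reduced products and lists the charge states that fall within m/z 2300-3900. It assumes that each proton transfer step removes exactly one proton and uses a hypothetical +17 precursor at the center of the isolation window (m/z 810); the function names are illustrative only.

```python
PROTON = 1.007276  # mass of a proton in Da


def neutral_mass(mz, z):
    """Neutral mass of an [M + zH]z+ ion observed at the given m/z."""
    return z * (mz - PROTON)


def mz_at_charge(mass, z):
    """m/z of the [M + zH]z+ ion for a given neutral mass."""
    return mass / z + PROTON


def parked_charge_states(mz, z, low=2300.0, high=3900.0):
    """Charge states below z whose m/z falls inside the parking window."""
    mass = neutral_mass(mz, z)
    return [q for q in range(z - 1, 0, -1) if low <= mz_at_charge(mass, q) <= high]


if __name__ == "__main__":
    mass = neutral_mass(810.0, 17)
    for q in parked_charge_states(810.0, 17):
        print(f"+{q}: m/z {mz_at_charge(mass, q):.2f}")
```

Under these assumptions, the +5 and +4 products of this example precursor fall inside the waveform range, whereas the +6 product does not.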

Multiple Fills and IIPT-A Thermo Fisher Scientific Orbitrap Velos Pro™ was modified to introduce reagent anions from the front by adding a glow discharge ionization source between the stacked ring ion guide and multipole 00, as described previously (21). The discharge is used to ionize ETD and IIPT reagents (fluoranthene and sulfur hexafluoride, respectively) for ion/ion reactions. By placing the reagent ion source in the front of the instrument, the C-trap is made available to store fragment ions from multiple iterations of ion/ion reactions. Once sufficient amounts of fragment ions are collected, they are sent to the Orbitrap for high resolution mass analysis. Fig. 1, A and B, shows 8-ms ETD/20-ms IIPT MS/MS spectra of intact recombinant H4 using 1 and 20 fragment ion fills of the C-trap, respectively. SNRs for fragments shared between the two spectra are indicated in parentheses. By increasing the number of fragment ion volumes analyzed from 1 to 20, total fragment ion current is increased by an order of magnitude. This is also demonstrated in supplemental Fig. 1, which shows a trace of total ion current as the number of multiple fills is varied. The spectrum shown in Fig. 1C is an average of 20 spectra like that shown in Fig. 1A and
The use and benefits of IIPT to charge reduce ETD-produced fragment ions have been described previously (1,23). Briefly, following fragmentation by ETD, fragment ions are allowed to react with a proton transfer reagent. These reagents act as bases and abstract protons from highly charged fragments. The amount of charge reduction is tuned by varying the amount of time the reaction is allowed to take place. By extending IIPT reaction time such that fragment ions are significantly charge-reduced, the fragment ion signals are increasingly distributed over the full scan range (example shown in supplemental Fig. 3).
Characterization of Intact Histones Derived from Butyrate-treated HeLa Cells-HeLa cells were treated with sodium butyrate prior to histone isolation and separation. Butyrate treatment of cells results in histone hyperacetylation; butanoate acts as an inhibitor of histone deacetylase activity (33). The histones were acid-extracted from the nuclei and purified by off-line reverse-phase HPLC fractionation. The histones were identified based on their well known chromatographic retention times (34). Intact protein was directly loaded onto a reverse-phase C18 Poroshell column and gradient-eluted into the mass spectrometer. Fig. 2 summarizes the data that can be obtained from a single LC-MS experiment. Fig. 2A shows the base-peak chromatogram of the H2A2 fraction revealing three distinct regions of intact protein elution. Peaks 1 and 2 correspond to H4 in various modified forms containing oxidized and unoxidized methionines, respectively, whereas peak 3 corresponds to various proteoforms of intact H2A. Displayed in Fig. 2, B and C, are the averaged high resolution ESI spectra recorded on peaks 2 and 3 in the base peak chromatogram, respectively. Signals in the observed charge state distribution carry 7-18 and 8-23 positive charges for H4 and H2A, respectively. An ETD/IIPT MS/MS spectrum recorded on an [M + 19H]19+ ion of an intact form of H2A is shown in Fig. 2D. This spectrum was obtained using a 4-ms ETD and a 25-ms IIPT and is an average of six spectra, each taking 11.7 s to acquire utilizing 3 scans and 15 multiple fills with r = 30,000 at 400 m/z. Note that equivalent spectra taken as the sum of 45 scans (transients) of single fragment ion fills would take ~19.8 s/scan. Residues identified by ions of types c and z observed in Fig. 2D are indicated in Fig. 2E (hand-annotated version shown in supplemental Fig. 4). These define 80% sequence coverage and identify the intact H2A as H2A.1 (Q93077).
Depicted in supplemental Fig. 5 is a full MS spectrum indicating signals for intact H2A proteoforms that were identified and confirmed by ETD/IIPT MS/MS. In addition to H2A.1, also detected were H2A.1-C and H2A.1-H (UniParc Q93077 and Q96KK5, respectively). H2A.1-C differs from H2A.1 by the substitution of Thr-16 with a serine residue and by substitution of Lys-99 with an arginine residue. These substitutions give rise to a δ mass of +13.9904 Da from H2A.1, which is very close to the δ mass that would be exhibited in the event of a monomethylation (+14.0157 Da). The differences between the calculated and observed masses for intact H2A.1-C and the monomethylated form of H2A.1 are 3.5 and 5.3 ppm, respectively, making H2A.1-C the better fit by mass accuracy. However, based on this information alone, neither could be conclusively excluded. The MS/MS spectra contained c ions that exhibited a δ mass of −14 Da following Thr-16 and z ions that exhibited a mass shift of +28 Da following Lys-99. The +28-Da mass shift is also indicative of a dimethylated lysine, but this possibility was ruled out based on the mass accuracy of the fragment ions observed (z31 3•+, 13C m/z 1101.9823: 0.2 ppm for Lys → Arg, 7.8 ppm for Lys → dimethylated Lys). Thus, we were able to conclude that this is H2A.1-C. This demonstrates the importance of obtaining high quality MS/MS spectra for the confirmation of proteoform identity. H2A.1-H differs from H2A.1 by the deletion of the last two C-terminal residues, emphasizing the importance of analyzing the intact forms of these proteins, as digestions would produce identical peptides with the exception of those containing the C termini. This difference was easily discerned upon examination of the z fragment ion series in the MS/MS spectra. H2A.1-C and H2A.1-H are present at levels roughly 20-25 and 15-20%, respectively, relative to H2A.1.
Eluting later in the gradient and also confirmed by MS/MS are the respective mono-Lys-acetylated forms of these proteins. The acetylation is localized to Lys-5 in all H2A types.
Fig. 3 displays a spectrum recorded on [M + 16H]16+ (m/z 718.0) from intact H4 using a 4-ms ETD reaction time and 40-ms IIPT. This is an average of eight spectra, each taking ~8.75 s to acquire utilizing 3 scans at r = 30,000 (at 400 m/z) with 10 multiple fills of the C-trap per scan (~13.9-s acquisition for single fill/transient equivalent spectrum). Fig. 3B offers an expanded view of the 1250-1500 m/z region of the spectrum. With the exception of the extreme C terminus, which is devoid of charge, a near contiguous series of both c′ and z•′ ions can be observed for ~60 residues from either end of the protein, yielding 81% sequence coverage and site localization of five acetylations as well as Arg-20 dimethylation (fully annotated spectrum shown in supplemental Fig. 6).
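The mass arithmetic behind the H2A.1-C assignment can be checked with a short calculation. The sketch below is illustrative only: it uses standard monoisotopic element masses, expresses each substitution or modification as an elemental composition change, and converts the Lys → Arg versus dimethylation difference to ppm at the fragment m/z and charge quoted in the text. The function and constant names are hypothetical.

```python
# Monoisotopic masses (Da) of the elements involved.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition changes (residue formulas: Thr C4H7NO2, Ser C3H5NO2,
# Lys C6H12N2O, Arg C6H12N4O).
THR_TO_SER = {"C": -1, "H": -2}  # loss of CH2
LYS_TO_ARG = {"N": 2}            # gain of N2
METHYL = {"C": 1, "H": 2}        # monomethylation, +CH2
DIMETHYL = {"C": 2, "H": 4}      # dimethylation, +C2H4


def delta_mass(change):
    """Monoisotopic mass shift (Da) for an elemental composition change."""
    return sum(MONO[element] * count for element, count in change.items())


def ppm_at_mz(delta_da, mz, z):
    """Mass difference expressed in ppm for an ion observed at m/z with charge z."""
    return 1e6 * (delta_da / z) / mz


if __name__ == "__main__":
    variant = delta_mass(THR_TO_SER) + delta_mass(LYS_TO_ARG)
    print(f"T16S + K99R: {variant:+.4f} Da")
    print(f"Monomethylation: {delta_mass(METHYL):+.4f} Da")
    gap = delta_mass(DIMETHYL) - delta_mass(LYS_TO_ARG)
    print(f"Dimethyl vs Lys->Arg at m/z 1101.9823 (3+): {ppm_at_mz(gap, 1101.9823, 3):.1f} ppm")
```

The roughly 7.6 ppm separation computed for the 3+ fragment is consistent with the 0.2 versus 7.8 ppm errors reported above for the two candidate assignments.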
Compared with other histones, H1 presents a more difficult situation for on-line LC-MS analysis of the intact form. The most common human isoforms of H1 range from 194 to 226 amino acids in length and, due to their larger size, require higher resolution, shorter gas-phase ion/ion reaction times, more multiple fills of the C-trap, and an extended mass range. Initial sample runs revealed the presence of at least two forms of H1, H1.4 and H1.2. MS/MS spectra acquired in data-dependent mode using 2-ms ETD/25-ms IIPT are shown in supplemental Fig. 7; sequence coverage was 72% for H1.4 (P10412) and 67% for H1.2 (P16403). Additional targeted analyses elucidated several additional unique H1 proteoforms. One of these was a form of H1.2 containing a previously uncharacterized single amino acid substitution, A141T. The spectrum and coverage are shown in Fig. 4 (hand-annotated version shown in supplemental Fig. 8).
[Figure: sequence coverage map of Nα-acetylated histone H4, with residues identified by c-ions and z•-ions marked.]
Parallel Ion Parking during IIPT-Histone fraction H2B presented the most difficult challenge due to the number of unique proteoforms it contained. Fig. 5, A and B, displays the averaged (five spectra, r = 120,000 at 400 m/z) high resolution ESI spectra recorded on intact H2B proteoforms eluting over ~45 s. Isolation of the region containing the [M + 17H]17+ (Fig. 5C, m/z 790-830) reveals several distinct isotopic clusters, each spaced less than 1 m/z apart. Using a 2 m/z isolation window, all MS/MS spectra acquired on precursors possessing ETD-compatible charge states contained fragment ions unique to at least two H2B proteoforms. Upon close inspection of Fig. 5C, two +8 precursors can be seen between m/z 825-830 (labeled D and E, also see supplemental Fig. 9, D and E). This same 40 m/z window was isolated and subjected to 140-ms IIPT. During the IIPT reaction, a waveform, including frequencies mapping to m/z 2300-3900, was applied. As the precursor ions are charge-reduced, their m/z values increase, and their frequencies of oscillation come into resonance with the applied waveform. This occurs when an ion's m/z reaches the 2300-3900 m/z range. At this point, these ions are activated, increasing their velocity relative to the reagent ions and slowing the rate of further proton transfer. Fig. 5D displays the product ion spectrum following 140-ms IIPT/parking. The vast majority of the ion current lies within the 2730-2800 m/z range where the "parked" intact H2B [M + 5H]5+ ions lie. An enlarged inset of this region is shown in supplemental Fig. 10. Signals corresponding to precursor ions that were not fully reacted to the +5 charge state or that were not fully parked and underwent further proton transfer to the +4 are indicated. Close inspection of the spectrum shown in Fig. 5D reveals several signals of lesser charged species (labeled A-E). Their observed charge state and m/z as well as their calculated m/z and charge state prior to IIPT are given in Table I.
Comparisons of the observed masses to calculated theoretical values suggest that these signals are derived from truncated forms of H2B that co-eluted with intact H2B during the time these spectra were acquired (identity and mass accuracy of the most abundant isotopologue in each envelope are shown in Table I). Peptides containing residues 66-125, 67-125, and 68-125 derived from two isoforms of H2B match within 5 ppm mass error. Expected signals for these species A-C are not observed in Fig. 5C (also see supplemental Fig. 9), presumably due to the density of unique species within the window. Note that the assigned identities for these truncated [...] H2Bs observed in this fraction. These truncated H2Bs could be derived from several other proteoforms of intact H2B due to their sequence similarity.

TABLE I Truncated H2B proteoforms observed in IIPT/parallel ion parking MS/MS spectrum
Observed m/z and theoretical precursor m/z values are shown. The masses reported are those of the most abundant isotopologue in the charge state envelope. UniProt sequences from isoforms of H2B with residues matching those contained in the truncated H2B are given. Note these are not confirmed by MS/MS, and identification is based on accurate mass and MS/MS spectra of other truncated H2Bs seen in LC-MS analyses performed without parallel ion parking. Observed mass accuracies are reported in ppm and were calculated from the most abundant isotopologue.

Characterization of Truncated H2B and H2A Proteoforms Derived from Butyrate-treated HeLa Cells-Eluting after the intact H2B proteoforms were several fairly abundant forms of truncated H2B. The base peak chromatogram and the sequence of the most abundant intact H2B (H2B-1C, P62807) observed in the fraction are shown in Fig. 6. The solid black lines in Fig. 6 indicate "primary" cleavage sites mapped by MS. Less abundant cleavage sites in Fig. 6 are marked with gray lines. This is not a comprehensive account of all the cleavages that are observed, but rather only those that are most abundant. Complementary N-terminal peptides, 1-25 and 1-51, are present and elute near the front of the gradient. They are less abundant and are more difficult to characterize because their signals are split among multiple acetylated forms (1-3 lysine acetylations, confirmed by MS/MS). The acetylations are shared among multiple sites, including Lys-11, Lys-12, Lys-15, Lys-16, and Lys-20, which are all sites that were confirmed by MS/MS in the intact form and that have previously been identified by MS in an earlier study (35). Truncations derived from two additional H2B isoforms (possibly H2B.2-E, Q16778, and H2B.1-K, O60814) were also present.
Note that the sequences of the intact forms found in the sample differ by only one residue (V39I for H2B.2-E or S124A for H2B.1-K), and so distinction between truncated peptides can only be made if both unique residues are contained within the peptide. If only one of these residues is present, then only one of the two additional isoforms can be excluded. There is also no way to distinguish the modified (acetylated) form of the intact protein from which the C-terminal peptides are derived. Up to 93% sequence coverage is obtained by ETD/IIPT MS/MS of the three most abundant C-terminal containing peptides (90%, 26-125; 93%, 52-125; and 84%, 50-125). The spectrum and sequence coverage of the 26-125 peptide are shown in supplemental Fig. 11.
Additionally, dozens of truncated H2As were observed in the H2A1 fraction. These ranged in size from 10 to 75 residues in length, and each accounts for less than 1% of the total ion current relative to intact H2A from the same fraction. The base peak chromatogram is shown in Fig. 7, A-C, and indicates where truncated species elute relative to intact H2A. Fig. 7D shows the sequence of the most abundant intact H2A observed in the fraction (H2A2A, Q6FI13). Again, the solid black lines in Fig. 7 indicate "primary" cleavage sites, and the less abundant cleavage sites are indicated with gray lines. This is not a comprehensive account of all the truncated histone forms observed, only the most abundant. Supplemental Table 1 lists these truncated proteoforms along with their respective abundances. The most abundant of these was the N terminus-containing proteoform cleaved C-terminal to Gly-44. Intact forms of H2A identified in the fraction were H2A2A and H2A2C (Q16777), which differ by their last 5-6 C-terminal residues. Because of this, there is no way to determine whether the N terminus-containing truncations were derived from H2A2A or H2A2C. MS/MS spectra of truncated species are shown in supplemental Fig. 12.
Identification of Histone H2A Clipping Enzyme by in Vitro proteolytic Assay-Given the large variety of truncated histone proteoforms identified, we further investigated whether cathepsin L, a protease responsible for histone H3 and H3.3 clipping (9,37,38), is capable of cleaving histone H2A. To do so, we performed an in vitro assay incubating recombinant histone H2A with cathepsin L and characterized the cleavage specificity using nano-LC coupled to an Orbitrap Fusion with FETD. Supplemental Fig. 13 and supplemental Table 2 show that the signal of intact histone H2A decreased by ϳ3 orders of magnitude after 1 h of incubation, while the clipped forms increased in abundance over time. Specifically, we determined the clipped proteoform Gly-44 (cleaved after glycine 44) as one of the most abundant cleaved proteoforms and confirmed its sequence by ETD MS/MS (supplemental Fig.  13C). We also observed several other C-terminal peptides (supplemental Table 2). Taken together, our data suggest that cathepsin L might have a larger variety of targets than currently described. DISCUSSION In the commercial configuration of the Orbitrap Velos Pro/ Elite TM , the reagent ion source is located at the rear of the instrument, requiring that reagent anions traverse the C-trap to reach the ion trap, where ion/ion reactions are performed. By putting the reagent ion source in the front of the instrument, the C-trap is made available for the storage of fragments from multiple iterations of ion/ion reactions prior to transient acquisition in the Orbitrap. In this way, we are able to dramatically increase fragment ion current. The multiple fill strategy for improving SNRs has a number of advantages over transient summing or spectral averaging. Chief among these is that the time that it takes to acquire spectra of multiple C-trap fills is less than the time it takes to acquire transients from the same number of single C-trap fills to be averaged or summed. 
Specifically, if a transient takes 380 ms to acquire, an additional fragment ion fill can be accomplished in less than half that time, i.e. ~130 ms for overhead plus time for ETD/IIPT activation. Additionally, SNRs improve faster using multiple fills compared with summing transients or averaging spectra, and so fewer transients are needed to produce a useful spectrum for analysis. This enables the acquisition of high quality spectra on a time scale more compatible with chromatography.
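The timing argument above can be made concrete with a short calculation. The following is a minimal sketch, not a description of the instrument control software: it takes the 380 ms transient and ~130 ms per-fill overhead from the text, assumes a hypothetical 40 ms ETD/IIPT activation time, and assumes that fills and transients run strictly one after another. The SNR scaling (linear in the number of fills, square root in the number of summed transients) is a common idealization for image-current detection and is likewise an assumption, not a measured result.

```python
import math

TRANSIENT_MS = 380.0      # transient length quoted in the text
FILL_OVERHEAD_MS = 130.0  # per-fill overhead quoted in the text
ACTIVATION_MS = 40.0      # hypothetical ETD/IIPT activation time (assumption)


def multiple_fill_time_ms(n_fills: int) -> float:
    """Time for n_fills ion/ion reaction cycles stored in the C-trap,
    followed by a single transient (strictly serial timing assumed)."""
    return n_fills * (FILL_OVERHEAD_MS + ACTIVATION_MS) + TRANSIENT_MS


def summed_transient_time_ms(n_scans: int) -> float:
    """Time for n_scans single-fill scans, each acquiring its own transient."""
    return n_scans * (FILL_OVERHEAD_MS + ACTIVATION_MS + TRANSIENT_MS)


def idealized_snr_gain(n: int, multiple_fills: bool) -> float:
    """Assumed SNR gain relative to one single-fill scan:
    linear in n for multiple fills, sqrt(n) for summed transients."""
    return float(n) if multiple_fills else math.sqrt(n)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(
            f"n={n}: multiple fills {multiple_fill_time_ms(n):.0f} ms "
            f"(gain x{idealized_snr_gain(n, True):.1f}), "
            f"summed transients {summed_transient_time_ms(n):.0f} ms "
            f"(gain x{idealized_snr_gain(n, False):.1f})"
        )
```

Under these assumptions, four fills followed by one transient take 1060 ms, whereas four single-fill scans take 2200 ms. The gap widens with the number of fills because the transient is acquired only once.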
The power of top-down mass spectrometry lies in its ability to provide an unambiguous view of protein expression dynamics by distinguishing closely related proteoforms from one another while also observing any combinations of PTMs. Bottom-up MS strategies are not capable of this, which is why there has been a significant push in the field of proteomics to develop instruments and methodologies that enable high throughput, top-down mass spectrometry of intact proteins. Here, we demonstrated the power of FETD-enabled sequential ion/ion reactions along with multiple fragment ion fills through their coupling with on-line HPLC and application to the analysis of intact histones. Data-dependent and targeted LC-MS experiments were employed to record high resolution MS/MS spectra of histones from butyrate-treated HeLa cells. Using these spectra, several unique intact histone proteoforms were identified with up to 81% sequence coverage. Examples shown here contain multiple modifications, including single amino acid substitutions, methylations, and acetylations. Additionally, many truncated forms of histones H2A and H2B were characterized with up to 93% sequence coverage.
Histone proteolysis has been implicated in several important regulatory processes, yet it remains poorly understood. Only eight histone-specific proteases are known, and their biological significance has not been extensively defined (39). This area of study is clearly poised for significant expansion, as evidenced by the identification of two H2A-specific proteases and several novel histone clipping events in just the past few years (39-42). Discerning whether any histone-cleaving event is the result of continuous histone turnover or sample degradation versus an alternative biological process with epigenetic implications is challenging (36). Our in vitro data suggest that cathepsin L might be a potential candidate to explain some of the truncated histone H2A proteoforms we observed in the butyrate-treated HeLa cell histone samples. We found that cleavage C-terminal to Gly-44 produced the most abundant truncated H2A proteoforms observed in both the in vitro experiments and those derived in vivo from HeLa cells. Differences between the in vitro and the HeLa cell data in the abundance of other truncated proteoforms might be explained by improper folding of the recombinant histone or by the absence of a proper nucleosome structure in the in vitro experiments. In the future, we hope that the tools we describe here will aid in deciphering whether or not these truncated histone proteoforms are biologically regulated and contribute to epigenetic mechanisms.
That we are able to observe these low level truncated histone proteoforms in simple top-down LC-MS experiments further demonstrates the utility of our methodologies, as it is very likely that bottom-up- or middle-down-based proteomics approaches would have missed cleavage events like those we describe in our analyses. This could happen for a number of reasons. First, given the low levels at which these truncations were observed (less than 1% relative to the intact histone for any single truncated form), along with the added complexity that is generated when samples are digested for MS proteomic studies, peptides of this type may not have been detected or may have simply gone unnoticed by investigators. Second, several of the peptides derived from originally truncated species might have been too small to be retained by the HPLC column. Third, quantitative analysis of the relative abundance of unique truncated forms would be difficult to perform, as evidence for some unique truncated species could be lost or could be below the limit of detection. Finally, correlating endogenous histone proteolysis with any specific histone variant or PTM state would not be possible unless those PTMs or sequence variations and the cleavage sites were contained within the same peptide produced by the digestion step of sample preparation. Only by determining the primary structures of these species, as they existed in the cell, can one draw these potentially important biological parallels.
Furthermore, the use of parallel ion parking during IIPT is demonstrated in an analysis of H2B proteoforms. Chromatographic separation is made difficult by the fact that several H2B proteoforms differ from one another by just a single amino acid residue. At high ETD-compatible charge states, multiple proteoforms are present within a relatively small m/z range. Upon data-dependent selection, more than one proteoform is often isolated and fragmented, resulting in mixed MS/MS spectra that make identification difficult. Additionally, as demonstrated here, overlapping signals make it difficult to tease out the number of unique species present. By using IIPT to charge-reduce the precursors contained within a wide isolation window and parking the products in a desired m/z range, we are able to separate species that, prior to IIPT, possessed roughly the same m/z but a significantly different mass. This reveals signals that can be attributed to less abundant H2B proteoforms that were not observed in either the MS1 spectrum or the isolation-only spectrum due to their proximity to more abundant signals derived from other species. This is the first time ion parking has been demonstrated on a commercially available linear ion trap using ion cloud densities that lend themselves to high resolution MS/MS analyses. We believe that parallel ion parking will become a powerful tool in identifying proteins in a mixture when used in conjunction with MS/MS fragmentation methods and multiple fill technology. Most exciting is the potential for this technology to allow differentiation and identification of closely related proteoforms that are difficult to separate chromatographically, like H2B. As demonstrated in supplemental Fig. 10, following IIPT/parking, we are able to spread apart isotopic clusters such that their centers are ~3 m/z apart (compared with <1 m/z prior to IIPT) without losing signal due to further proton transfer.
This is a stepping stone to enabling higher resolution isolation on the MS3 level. However, doing so will require further instrument modification to increase the precursor ion capacity of the linear ion trap to be able to observe fragments in MS3 spectra following IIPT.
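The effect of charge reduction on the m/z spacing of closely related proteoforms can be illustrated with a short calculation. This is a sketch under stated assumptions: the two neutral masses (13,800 and 13,814 Da) and the charge states (18+ before IIPT, 5+ after parking) are hypothetical values chosen only for illustration and are not taken from the data. It also treats both species as carrying the same charge, whereas species in the experiments may differ in charge as well as mass.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of a protonated species with the given neutral mass and charge."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def cluster_spacing(mass_a: float, mass_b: float, charge: int) -> float:
    """m/z distance between two species that carry the same charge."""
    return abs(mz(mass_a, charge) - mz(mass_b, charge))


if __name__ == "__main__":
    # Hypothetical proteoform masses differing by 14 Da (assumption).
    mass_a, mass_b = 13800.0, 13814.0
    # Hypothetical charge states before and after charge reduction (assumption).
    for charge in (18, 5):
        spacing = cluster_spacing(mass_a, mass_b, charge)
        print(f"z={charge}: spacing = {spacing:.2f} m/z")
```

Because the spacing between two species of equal charge is simply their mass difference divided by the charge, a 14 Da difference corresponds to about 0.78 m/z at 18+ but 2.8 m/z at 5+. This is the same order of change as the <1 m/z to ~3 m/z spreading described above, although the actual masses and charge states involved are not specified here.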