A Method for Automatically Interpreting Mass Spectra of 18O-Labeled Isotopic Clusters*S

16O/18O labeling is one differential proteomics technology among many that promises diagnostic and prognostic biomarkers of disease. Although the incorporation of 18O in the C-terminal carboxyl group during endoproteinase digestion in the presence of H218O makes the process of labeling facile, the ease and effectiveness of label incorporation have in some regards been outweighed by the difficulties in interpreting the resulting spectra. Complex isotope patterns result from the composition of unlabeled (18O0), singly labeled (18O1), and doubly labeled species (18O2) as well as contributions from the naturally occurring isotopes (e.g.13C and 15N). Moreover because labeling is enzymatic, the number of 18O atoms incorporated can vary from peptide to peptide. Finally it is difficult to distinguish highly up-regulated from highly down-regulated or C-terminal peptides. We have developed an algorithm entitled regression analysis applied to mass spectrometry (RAAMS) that automatically, rapidly, and confidently interprets spectra of 18O-labeled peptides without requiring chemical composition information derived from product ion spectra. The algorithm is able to measure the effective 18O incorporation rate due to variable enzyme substrate specificity of the pseudosubstrate during the isotope exchange reaction and corrects for the 18O0 abundance that remains in the labeled sample when using a two-step digestion/labeling procedure. We have also incorporated a method for distinguishing pure 18O0 from pure 18O2 peptides utilizing impure H218O. The algorithm operates on centroided peak lists and is therefore very fast: nine chromatograms of, on average, 1,168 spectra and containing, on average, 6,761 isotopic clusters were interpreted in, on average, 45 s per chromatogram. RAAMS is fast enough (average, 38 ms/spectrum) to allow the possibility of performing information-dependent MS/MS on a chromatographic time scale on species exceeding predetermined ratio thresholds. We describe in detail the operation of the algorithm and demonstrate its use on datasets with known and unknown ratios.

Differential proteomics technologies promise diagnostic and prognostic biomarkers to reduce the burden of disease (1)(2)(3)(4). 16 O/ 18 O stable isotope labeling is one tool among many with which to quantify the relative expression differences of proteins between two biological samples (5). 16 O/ 18 O labeling, in contrast to other popular differential labeling methodologies such as CAT, isobaric tag for relative and absolute quantitation (iTRAQ TM ) 1 , and stable isotope labeling of amino acids in cell culture (SILAC), (a) does not select for peptides containing certain amino acids (a limitation of ICAT), (b) does not require fragmentation spectra (tandem MS) to record an abundance ratio (a limitation of iTRAQ), and (c) is applicable to biological specimens such as serum, plasma, and cerebrospinal fluid (a limitation of SILAC) (6 -8). In a typical 18 O experiment, two samples (or two pools of samples) of a biological matrix such as serum are collected with one typically representing a normal or control population and the other representing a disease state, although other arrangements are possible. One sample is digested with an endoproteinase, typically trypsin (9), in the presence of H 2 16 O, whereas the other sample is digested with the same enzyme in H 2 18 O. The trypsin-catalyzed exchange of up to two 16 O atoms for two 18 O atoms at the C-terminal carboxyl group of the peptide provides a mass shift of 2 ( 18 O 1 ) or 4 ( 18 O 2 ) daltons. The two digests are then mixed and subjected to mass spectrometry where the isotopic clusters for each peptide appear as a set of peaks corresponding to the unlabeled ( 18 O 0 ) and labeled ( 18 O 1 and 18 O 2 ) species. Measuring the heights of these peaks gives information about the relative abundances of the various peptides in the two samples.
Interpreting the resulting spectra, however, is not completely straightforward. Contributions from the artificially enriched 18 O isotopes and naturally occurring isotopes (particularly 13 C and 34 S) combine to form complex, overlapping isotopic distributions especially for high molecular weight peptides where the monoisotopic peak is no longer the most abundant isotope. Schnolzer et al. (10) demonstrated that once the initial cleavage and exchange has occurred the proteolytic peptide forms a pseudosubstrate for the enzyme, causing the incorporation of a second 18 O atom. However, different peptide pseudosubstrates have different reaction rates, K m , with trypsin, which causes peptide-to-peptide variability in the incorporation of 18 O. The reaction rate is determined by such factors as peptide length (10) and sequence composition (11).
Recent digestion protocols (11,12) introduce a further complication because two digestion steps are utilized: a first digestion using solution-based trypsin in H 2 16 O and a second digestion using immobilized trypsin in H 2 18 O. For peptides with high incorporation rates, the two-step procedure is effectively the same as the single step digestion; however, with low incorporation rate peptides, the possibility arises that a peptide molecule cleaved in H 2 16 O may never reassociate with trypsin in H 2 18 O. Bantscheff et al. (13) observed that, when using the two-step digestion procedure, it is important to account for the peptide molecules with no 18 O atoms at their C termini that remain in the labeled sample. In addition to these complications, other factors that can affect the labeling process include back exchange and pH (14).
Several different methods are described in the literature for calculating 16 O/ 18 O ratios while accounting for variable incorporation. Yao et al. (5) used a ratio combining the experimental abundances of the 18 O 0 , 18 O 1 , and 18 O 2 peaks along with the theoretical isotopic abundances of those peaks; chemical composition information derived from known peptide sequence or from product ion spectra was used to obtain the theoretical isotopic abundances. Johnson and Muddiman (15) removed the requirement for sequence identification/product ion spectra by utilizing the average amino acid averagine (16) to calculate approximate chemical compositions, by modeling the contributions of 13 C and 34 S with a power function, and by including this in the ratio calculation. Both of these calculations assume a single step digestion and do not account for 18 O 0 abundance in the labeled sample.
At least two software implementations exist that perform ratio calculations; both have limitations. Halligan et al. (17) automated the process of combining the peptide sequence information from Sequest (18) with either the Yao et al. (5) or Johnson and Muddiman (15) ratio calculations. They required zoom scans, utilized the protein identification data to determine the monoisotopic mass, and assumed that this mass appeared experimentally. Qian et al. (19) utilized THRASH (20) to detect peptide pairs and then applied the ratio equations of Johnson and Muddiman (15). THRASH, however, uses a single theoretical isotopic distribution and must separately detect the 18 O 0 and 18 O 2 species; therefore it will have difficulty with more massive peptides or those with incomplete incorporation.
Our current work describes a completely automated method for locating and interpreting 18 O-labeled isotopic clusters in parent ion chromatograms without requiring product ion or selected ion monitoring/zoom spectra. Central to our algorithm is the use of linear regression to simultaneously fit all peaks in the isotope cluster rather than just the peaks representing the 18 O 0 , 18 O 1 , and 18 O 2 species. We use the residuals from the regression to compute a fitting score between the theoretical and experimental isotopic distributions that can be used to rank potential candidate biomarkers. Mirgorodskaya et al. (21) also used linear regression with 18 O-labeled internal standards but measured the relative abundances of the 18 O 1 and 18 O 2 species for a given peptide in a separate experiment instead of allowing these abundances to vary in the regression.
We also describe a method of distinguishing highly upregulated peptides from highly down-regulated or C-terminal peptides. When using highly enriched H 2 18 O, peptides that have fully incorporated two 18 O atoms and that appear only in the labeled sample appear identical to peptides that have incorporated no 18 O atoms either because they were C-terminal (and thus not a substrate for trypsin) or because they were significantly down-regulated. By lowering the enrichment level of the H 2 18 O to, for example, 90%, a unique peak "signature" is generated that the algorithm can recognize to distinguish these two cases. Furthermore our algorithm accounts for the residual 18 O 0 abundance that remains in the labeled sample from peptides with low incorporation rates; this is particularly relevant when utilizing the two-step digestion procedure.
In a manner similar to THRASH, the algorithm performs automated "reduction" of spectra. No assumption is made regarding monoisotopic mass, which is determined using an alignment procedure similar to that described by Senko et al. (16) and utilized by THRASH. However, THRASH operates directly on the raw spectral data; this allows it to detect species at very low signal to noise ratios but also makes it very slow. Instead we have focused on centroided peak data supplied by the vendor software from parent ion mass spectra for reasons of speed and because such data were readily available. The algorithm is indeed quite fast, interpreting a chromatogram of ϳ1,300 spectra in less than 1 min; THRASH might take several hours to process such data. The algorithm was developed using data from FT-ICR instrumentation but should be generally applicable to any instrument with sufficient resolving power.
We will describe in detail the operation of our algorithm, entitled regression analysis applied to mass spectrometry (RAAMS), and evaluate its performance using datasets with both known and unknown ratios.

MATERIALS AND METHODS
Datasets-Three datasets were used to demonstrate the effectiveness of the algorithm: two with known 18 O/ 16 O ratios and one with unknown ratios; these datasets are summarized in Fig. 1. The first two datasets consisted of purified proteins at known 18 16 O ratios. Note that the two pure proteins were present at approximately equal amounts in both the labeled and unlabeled samples; only the ratios of labeled to unlabeled digests of these proteins were varied. It is important to understand that these known ratios were generated for verification purposes only; a typical 18 O experiment would consist of a single mixed sample with a different (and unknown) 18 O/ 16 O ratio for each protein. These two datasets are available upon request (see below) in mzXML format.
To verify that the algorithm was able to detect differences in complex samples a third dataset was prepared. This set consisted of two pools of human plasma each composed of five disease or five control patients. The disease pool was trypsin-digested in 90% H 2 18 O, whereas the control pool was digested in H 2 16 O. These pools were then combined, and strong cation exchange (SCX) chromatography was performed, collecting 40 1-min fractions. We examined by nano-LC-FT-ICR that single fraction with the highest UV absorbance response because it was more likely to provide a complex sample.
Sample Preparation-Human albumin, human apotransferrin, DTT, and iodoacetamide were purchased from Sigma. Sequencing grade trypsin for solution-based trypsin digestion was purchased from Promega (Madison, WI). Trypsin immobilized on agarose beads was purchased from Pierce. Disposable size exclusion columns (Micro Bio-Spin, P6 in Tris) were purchased from Bio-Rad. 18 O-Enriched water (99 atom % enrichment) was purchased from Isotec (Miamisburg, OH). Solvents for LC-MS were purchased from Burdick and Jackson. Formic acid was from Fluka.
Two sample tubes, each containing 400 g each of albumin and transferrin, were initially reduced with DTT and alkylated with iodoacetamide. Samples were digested with solution-based trypsin in H 2 16 O, 10% acetonitrile overnight at 37°C. After this initial digestion, a second digestion using immobilized trypsin was used to exchange respective isotope labels into each sample. A 45-l volume of immobilized trypsin beads, previously washed with 50 mM Tris, was added to each sample, and samples were concentrated to dryness on a Savant vacuum centrifuge. 50 l of either H 2 16 O or the 18 O-enriched water were added to their respective samples and were again concentrated to dryness by vacuum centrifuge. For the isotope exchange digestions, 100 l of either H 2 16 O, 99% 18 O-enriched water, or 90% 18 O-enriched water were added to samples followed by 20 l of acetonitrile that had been dried by storing it over a bed of sodium sulfate. 90% 18 O-enriched water was prepared by adding a 10% volume of HPLC grade H 2 16 O to 99% 18 O-enriched water. Samples were then vortexed for 7 h at room temperature on a Genie Vortex 2. After the exchange digestion, 2 l of neat formic acid and 2 l of 5% heptafluorobutyric acid were added to each sample, and samples were vortexed briefly and centrifuged for 5 min at 6,000 ϫ g. 95 l of supernatant from each sample were transferred to new microcentri-fuge tubes for use in preparing prescribed ratios between the two samples. Two steps of 10-fold dilution (50 l diluted to 500 l) were performed for each albumin/transferrin sample using HPLC grade water containing 2% acetonitrile, 0.1% heptafluorobutyric acid, and 0.0005% Zwittergent 3-16. 18  Plasma samples were first depleted of the six most abundant proteins using an Agilent multiaffinity column. The non-retained fraction from each sample was assayed for total protein content using a Bradford assay and was denatured with urea, reduced, alkylated, digested, and labeled using a protocol from Prolytica TM (Stratagene, La Jolla, CA). The labeled digests were then mixed based upon their total protein content, desalted on a reversed phase cartridge, and fractionated on a 4.6-mm-inner diameter PolySULFOETHYL aspartamide SCX column using a potassium phosphate/potassium chloride buffer system at pH 3.0 with 20% acetonitrile. SCX aliquots were vacuum concentrated to dryness and reconstituted in water containing 2% acetonitrile, 0.1% heptafluorobutyric acid, and 0.0005% Zwittergent 3-16.
FT-ICR-MS data and data-dependent linear ion trap MS/MS data were collected in parallel on an LTQ-FT mass spectrometer (Thermo Electron Corp., Bremen, Germany). FT-ICR-MS survey data were Three datasets were prepared: the first two datasets consisted of purified proteins (human transferrin and serum albumin) mixed at known 18 O/ 16 O ratios. The third dataset consisted of human plasma from disease and control pools. collected in full profile mode from m/z 350 to 1,900 at 100,000 resolving power (full width, half-maximum at m/z 400) using two microscans and a target population of 1 ϫ 10 6 charges for automatic gain control (AGC) in the ICR cell. The linear ion trap performed MS/MS experiments on the three most abundant ions from the FT-ICR survey scans using 2 microscans, a precursor isolation width of 2 m/z units, 35% normalized collision energy, and 30 ms activation time. An automatic gain control target value of 1 ϫ 10 4 was used for MS/MS experiments in the LTQ mass analyzer.
Software Implementation-Software was implemented in Cϩϩ on Linux and MacOS X. Performance measurements were made on a computer with a 3-GHz Xeon processor running Linux 2.6. Software and datasets are available as open source upon request.

COMPUTATIONAL DATA INTERPRETATION
The algorithm takes as input the centroided peaks from a single spectrum as provided by the instrument vendor software and produces as output a list of isotope clusters detected therein along with their monoisotopic, neutral masses, their centroid retention times, and the ion abundances of both the unlabeled ( 16 O, sample A) and labeled ( 18 O, sample B) species. These two abundances, denoted A and B , 2 respectively, can be used to find differentially expressed peptides in either an absolute ( A Ϫ B ) or relative ( B / A ) scale. The estimated variance-covariance matrix of is also returned, allowing confidence intervals to be formed.
The algorithm operates in three steps. First, a list of all potential isotope clusters is generated based on peak spacing consistent with any allowed charge state. Some isotope peaks will belong to multiple potential isotope clusters. Second, the ion abundances of each of the potential clusters are fit to a theoretical isotopic model using linear regression, giving a best fit to each in terms of the relative amounts of 18 O 0 , 18 O 1 , and 18 O 2 for that peptide. Finally the algorithm decides between potential isotope clusters that share peaks based on their residuals from the regression fits. Each step is described in further detail below and summarized in Table I.
Step One: Examine Peak Spacing-In the first step of the algorithm, potential isotope clusters are found with spacing approximately equal to 1/z. To accomplish this, the data are divided into disjoint groups of peaks where a "group" is defined by a leading and trailing peak-free region of at least 3 m/z units. A distance matrix is computed between all pairs of peaks in this group. Charge states up to 6 are considered; elements in the distance matrix with values of 1.00235 Ϯ error, 1.00235/2 Ϯ error, . . . , 1.00235/6 Ϯ error are of interest. The average charge state spacing of 1.00235 is from Horn et al. (20). Error in charge state spacing is allowed to be as high as 40 ppm to account for partially or unresolved peaks. For each of the possible charge states from highest to lowest, the algorithm identifies adjacent pairs of peaks that have the requisite spacing. For each such pair, in order of abundance, the cluster is extended by adding those peaks that best fit the spacing of the original pair. The algorithm stops walking outward when one of two conditions is met: (a) the next peak with appropriate spacing is greater than 3 m/z units away or (b) the average error in peak spacing for the whole run exceeds 40 ppm. In this way, a potential isotope cluster is found that has m/z spacings of approximately 1/z for a given charge state z. At least three peaks must be found for a potential isotope cluster to be passed to later stages of the algorithm. Isotope clusters resulting from low abundance species with mass less than ϳ500 Da may legitimately have only two peaks (the A ( 18 O 0 ) and A ϩ 1 ( 13 C 1 18 O 1 ) peaks with negligible 13 C contribution) above the limit of detection, but these are very difficult to distinguish from noise spikes.
Three special cases converge to make it difficult to determine, based only on peak spacing, the correct charge state for a given run of peaks. (a) Noise peaks can interdigitate between real peaks. (b) Isotope clusters can legitimately overlap and share one or more peaks. (c) Low abundance peaks can fall below the signal to noise threshold of the peak detection algorithm. Examples of these are shown in Supplemental Fig. S1. To address the issue of noise peaks, legitimate peak sharing, and missing peaks, the algorithm attempts to fit all charge states that are consistent with the peak spacing data. The algorithm also tolerates up to two missing peaks internal to a candidate isotope cluster. In each of the three cases, the result is one or more isotope clusters with correct charge state assignments and one or more in- 2 The variables used are: A , predicted amount of material in sample A ( 16 O, unlabeled sample); B , predicted amount of material in sample B ( 18 O, labeled sample); X, matrix of theoretical isotopic abundances, see Fig. 2; y, column vector of experimentally observed peak heights, see Fig. 2; ŷ , column vector of predicted peak heights; ␤, column vector of parameter estimates with ␤ 1 , ␤ 2 , and ␤ 3 corresponding to the amounts of 18   Step 1: Find sets of peaks that could be potential isotope clusters based on their spacing. Find all charge states (1 Յ z Յ 6) consistent with isotope spacing of 1/z and having at least 3 peaks.
Step 2: For each potential isotope cluster: a) Form a matrix X with theoretical isotopic abundances, generated using averagine (16) and Mercury (22), for incorporation of 0, 1, or 2 18 O atoms. b) Form vectors y with experimental peak abundances, each y vector assuming a different peak is the monoisotopic mass. c) Solve matrix equation using non-negative least squares. d) Decide between possible y vectors (monoisotopic masses) using residuals. e) Compute and correct for incorporation rate using 18 O 1 / 18 O 2 abundances. Step 3: Decide between possible charge states by computing a charge assignment score that combines goodness of fit for all potential isotope clusters that share common peaks.
correct assignments that share peaks with the correct assignments. We resolve between these correct and incorrect isotope clusters in step three of the algorithm (described below).
Step Two: Examine Peak Abundances-In step two of the algorithm, the candidate isotope clusters from step one are evaluated based on their abundances. The purpose of step two is to determine the contributions of the three label states (exchange of zero, one, or two oxygen atoms at the C terminus of the peptide) to the experimentally observed abundances of the isotope cluster by least squares fitting of a linear combination of expected isotopic abundances for each of these states. For each candidate isotope cluster, its charge state (determined in step one) and the mass of its first peak are used to compute an approximate neutral mass, and then an approximate chemical composition is determined using the average amino acid averagine, which has the chemical formula C 4.9384 H 7.7583 N 1.3577 O 1.4773 S 0.0417 (16). Optionally the algorithm can adjust up or down the number of sulfurs predicted by averagine and perform the steps below for these additional expected chemical compositions.
As shown in Fig. 2a, for each chemical composition, three theoretical isotopic distributions are generated using Mercury (22) that correspond to the three label states. The first theoretical isotopic distribution is that which would result if all the molecules of peptide had not exchanged either of their Cterminal oxygen atoms (the " 18 O 0 label state") and is shown as black bars in Fig. 2a 2 18 O. The three theoretical isotopic distributions are arranged in the columns of a matrix, X, as shown in Fig. 2b (note that in this figure, the matrices are shown transposed). The X matrix is scaled so that each column sums to 1. A column of ones is also added to represent the intercept. For a further discussion of the regression model, see Eckel-Passow et al. (23).
The experimental peak abundances are arranged in a column vector, y, also shown in Fig. 2b. Because it is not known beforehand which experimental peak is the monoisotopic mass (all 12 C, all 14 N, all 16 O, etc.), a number of different y vectors, corresponding to different alignments of theoretical isotopes to experimental peaks, are attempted. Indeed the monoisotopic mass may not even be detected experimentally as in the case of species highly up-regulated in the labeled sample. Fig. 2b shows only the y vector of the correct align-FIG. 2. Theoretical isotopic distributions and resulting matrices. a, for each potential isotope cluster, the algorithm generates three theoretical isotopic distributions corresponding to the complete exchange of zero (black bars), one (light gray bars), or two (dark gray bars) of the C-terminal oxygen atoms with atoms drawn from the 90% H 2 18 O. b, these three distributions are arranged in the columns of a matrix X, shown here transposed. A column of ones is added for the intercept. Four rows of zeros are added before the monoisotopic mass and after the relative isotopic abundances are all less than 1% to prevent problematic alignments at the tails of the distribution. The experimental peak heights are arranged in a vector y, also shown transposed. c, an example isotope cluster with an expected 18 O/ 16 O ratio of 5. The algorithm utilizes the peak data (indicated by gray bars), not the spectral data (black line). The predicted isotope abundances (ŷ ) are indicated by circles. ment with the monoisotopic mass marked A; different alignments can be imagined by shifting this vector to the left or right. Where an experimental peak is not detected for a given theoretical isotope, 0 is inserted for the corresponding element of y.
The goal then is to determine the linear combination of the columns of X that most closely resembles all the experimental peaks. This is done using non-negative least squares regression (24,25), which defines the best fit as that with the lowest sum-of-squares error. The result of this regression is a column vector ␤ with elements representing the amount of experimental abundance that can be accounted for by exchange of zero (␤ 1 ), one (␤ 2 ), or two (␤ 3 ) of the C-terminal oxygen atoms or by an abundance offset (␤ 0 ). The abundance parameters ␤ 1-3 are constrained to be non-negative because they represent the amount of abundance in each of the three label states. By using a least squares fit, the abundances of all peaks in the isotope cluster are used in determining the abundances from sample A and sample B ( A and B ; described below) instead of just the abundances of A, A ϩ 2, and A ϩ 4 peaks as in previous work (5,15). Mirgorodskaya et al. (21) also used regression analysis for quantification with 18 Olabeled internal standards but measured the relative abundances of the 18 O 1 and 18 O 2 species in a separate experiment rather than allowing them to vary in the regression.
The regression is performed for each possible alignment of theoretical distribution to experimental peaks. The alignment with the lowest residual sum-of-squares error in the fit is chosen as correct; this is similar in concept to the alignment procedure described by Senko et al. (16). In choosing the correct alignment, it is important to account for the fact that the isotopic distribution of complete incorporation of two 18 O atoms closely resembles that of the natural abundance distribution and that the two distributions are thus indistinguishable. This means that very low residuals can be obtained by a partial match to only the rightmost peaks of a cluster (incorrectly assuming, for instance, that the experimental 18 O 2 is the theoretical 16 O 2 ) because this fit will have fewer peaks and therefore less error. To prevent this, four zeros are added to each end of the X matrix when determining the correct alignment; these zeros (shown in gray in Fig.  2b) introduce high residuals for those problematic alignments at the tails of the distribution that would otherwise effectively ignore a large amount of experimental abundance. However, they also bias the parameter estimates. Therefore, once the correct alignment is found, a version of X lacking these extra "isotopes" is fit to the y vector for the correct alignment, and the resulting ␤ is used for all further calculations.
The result of this final non-negative least squares fitting is an estimate of the vector ␤ where elements 1-3 represent the abundances contributed by the exchange of zero, one, or two of the C-terminal oxygen atoms, respectively, and element 0 represents an intercept term. These values take into account and correct for the presence of both the naturally occurring isotopes ( 13 C, 34 S, etc.) and also for the e.g. 90% impurity of the H 2 18 O. However, to convert these values into the original sample abundances A (unlabeled; 16 O) and B (labeled; 18 O), we must take into account the fact that trypsin has different affinities for different peptide pseudosubstrates. Two events must occur for a peptide molecule to incorporate an 18 O atom. 1) That peptide must react (exchange) with trypsin, and 2) an 18 O atom must be chosen from the 90% H 2 18 O. The probability of a given peptide molecule having at least one interaction with trypsin is equal to (1 Ϫ e ϪKt ) where K is a reaction constant that varies from peptide to peptide and t is reaction time; we denote this probability p c . If a given peptide molecule reacts at least once with trypsin, the probability of obtaining an 18  Conversely the probability of not incorporating an 18 O atom is (1 Ϫ s c ). The probability s c can be thought of as an incorporation rate or "effective purity." There are four sets of molecules we must take into consideration: 1) molecules from sample A, which never contain 18 O, 2) molecules with no exchange events in sample B, 3) molecules where exactly one of the C-terminal oxygen atoms has had at least one exchange event in sample B, and 4) molecules where both C-terminal oxygen atoms have had at least one exchange in sample B. In the single step digestion procedure, the probability of a molecule having no exchange events in sample B is zero. This is because, in the single step procedure, peptide cleavage and labeling both occur simultaneously in H 2 18 O. However, in the two-step digestion, there is the possibility that a peptide may be cleaved in H 2 16 O but then never (re)react with trypsin in H 2 18 O. In this procedure, the observed abundance in the A peak will be a combination of abundances from unlabeled molecules in both sample A and sample B. Thus, ␤ 1 is equal to the abundance of molecules from sample A plus those molecules from sample B where neither of the C-terminal oxygen atoms has exchanged. ␤ 2 is equal to the abundance of molecules from sample B where only one of the C-terminal oxygen atoms has exchanged. The probability of this occurring is equal to the probability of having one 18 O atom exchange and the other not exchange; there are two ways this could be done, and thus the final probability is 2p c (1 Ϫ p c ). ␤ 3 is equal to the abundance of molecules from sample B where both C-terminal oxygen atoms have exchanged as follows.
Solving sequentially for p c , B , and A yields the following.
Thus, by assuming that the rate of incorporation of an 18 O atom at the C terminus of a peptide is not influenced by the presence there of another 18 O atom, we can use a function of the ratio of one to two incorporations of 18 O (visible experimentally as the ratio ␤ 2 /␤ 3 ) to estimate the incorporation rate s c . Then the amounts of material in control ( A ) and disease ( B ) samples can be corrected for the incorporation rate using our estimate of s c . When the three parameters ␤ 1-3 are nonzero, we calculate the ratio of material in sample B (labeled) to that in sample A (unlabeled) as B / A , making use of the correction for s c as shown above. This requires that we detect at least the 18 O 0 , 18 O 1 , and 18 O 2 peaks to obtain estimates for ␤ 1 , ␤ 2 , and ␤ 3 and that these estimates are not constrained to zero by the non-negative regression. In cases where at least one of the parameter estimates is constrained to zero, we directly use the ratio (␤ 2 ϩ ␤ 3 )/␤ 1 . In both cases we constrain to a minimum or maximum ratio to avoid dividing by zero. We could also use s c to filter out those peptides that, due to a low incorporation rate, do not accurately report the true ratio of amounts. Fig. 2c demonstrates that the regression also provides predicted isotopic abundances, denoted ŷ and shown as circles.
To evaluate the goodness of fit between the experimental and predicted isotopic abundances, we compute a fitting "score" as in Equation 8, where the sum of the squared residuals from the fit is divided by the sum of experimental peak abundances. The denominator serves to normalize the error from the regression for comparison across isotope clusters with varying abundances and numbers of peaks. Lower scores are better in that clusters that more closely agree with their predicted abundances will have lower scores. Optimal scoring is an area of active investigation for our group.
Step Three: Decide between Charge Assignments-The third and final step of the algorithm further examines the candidates from step one that share common peaks. Recall that missing peaks, noise peaks, and overlapping isotope clusters introduce complications into the charge assignment process. Additional fits are attempted to decide which of these assignments are correct, and these additional fits result in isotope clusters that share peaks. Furthermore this peak sharing can be either legitimate (when two real isotope clus-ters overlap) or artifactual as in the case of noise peaks or missing peaks (see examples in Supplemental Fig. S1). A strict rule might be to never allow a peak to participate in more than one isotope cluster, but this would incorrectly eliminate some clusters that legitimately shared peaks.
Two competing factors must be accounted for in deciding between isotope clusters that share peaks. Isotope clusters with better (lower) fitting scores should be preferred over those with worse fitting scores. However, fitting score is influenced by the number of peaks in the isotope cluster: clusters with fewer peaks will have less error. This means that, for some isotope clusters, an incorrect assignment with lower charge state (and therefore fewer peaks) will have a better fitting score than a correct assignment with higher charge state. Therefore, we must also evaluate the amount of abundance that the isotope cluster accounts for out of the surrounding peaks. Obviously isotope clusters that use more of the surrounding peaks (i.e. a higher charge state) should be preferred over those that use fewer of them (i.e. a lower charge state).
We combine these two competing factors by computing a charge assignment score for each isotope cluster in the transitive closure of isotope clusters that share peaks (if an isotope cluster A shares peaks with B and B shares peaks with C, then ABC constitutes the transitive closure of isotope clusters that share peaks). We compute a charge assignment score for each isotope cluster in the transitive closure as the squared error from the fitting for this isotope cluster plus the squared abundances for all the other peaks in the transitive closure that were not used by that cluster (denoted yЈ) divided by the sum of abundances of all peaks in the transitive closure as follows.
Charge assignment score ϭ ͑ y Ϫ ŷ ͒ 2 ϩ ͑ yЈ͒ 2 y ϩ yЈ (Eq. 9) We then rank each isotope cluster by charge assignment score from lowest to highest and examine each in turn in this order. Isotope clusters that share peaks and have charge states that are even multiples of each other are not allowed to co-exist. The algorithm decides among clusters that violate this sharing rule by picking the one with the best (i.e. lowest) charge assignment score. In this way, legitimate peak sharing is allowed (Supplemental Fig. S1b has overlapping clusters with z ϭ 2 and z ϭ 3, which are allowed), but the incorrect, duplicate clusters introduced by noise or missing peaks are removed.
Evaluation of Alignment Correctness-A key factor influencing the correctness of the ratios produced by the algorithm is whether it determines the correct monoisotopic mass, that is to say whether it determines the correct alignment between the theoretical and experimental isotopic distributions. When the algorithm incorrectly assigns the monoisotopic mass, it generally assigns either the 18 (A ϩ 2 or A ϩ 4) isotope as the monoisotopic mass, introducing a shift in mass of 2 or 4 Da. Although there are indeed some species that differ by exactly a small, integer number of daltons, the most likely explanation for such a mass error is that the algorithm has incorrectly assigned the monoisotopic mass. Because peptides with masses less than ϳ4,000 daltons have readily detectable monoisotopic masses, determining the correct alignment is trivial using the chromatogram from the sample digested in H 2 16 O. We used these two pieces of knowledge (Ϯ4-Da shift and correct monoisotopic mass from 16 O sample) to verify the algorithm when fitting variable ratios.
We ran the algorithm, using a single theoretical isotopic distribution corresponding to the natural isotopic abundances for the chemical composition predicted by averagine, against the 16 O chromatogram and then combined the appearance of species at multiple charge states, in multiple spectra chromatographically, and across each of the samples by using mass and retention time thresholds of 20 ppm and 120 s, respectively, as described previously for our label-free approach (26). (We also used thresholds in retention time to remove artifact species, such as polymer introduced by the chromatography, as described by Mason et al. (27).) For each species we determined a consensus monoisotopic mass across all isotopic clusters for that species as their abundance weighted average of monoisotopic mass.
Next we applied the algorithm to all nine chromatograms each of the 90 and 99% enrichment datasets but allowed the algorithm to fit unknown 18 O/ 16 O ratios. We then used the same grouping procedure on this "unknown" data with one exception: we allowed two isotope clusters A and B to be grouped across multiple spectra and multiple charge states if mass(A) ϭ mass(B) ϩ n ϫ 1.00235 Da Ϯ 20 ppm where n varies from 0 to 4. We then rated the correctness of the algorithm in determining the monoisotopic mass for each cluster in the unknown data as whether its monoisotopic mass is the same (within 20 ppm) as the consensus monoisotopic mass from the 16  Manual inspection of 50 isotope clusters with a range of abundances and with scores between 0.1 and 0.105 revealed that 42 of 50 were correctly assigned. However, at scores between 0.2 and 0.205, only 5 of 50 were correct. Based on this we have used a threshold value of 0.1, shown by a solid vertical line. Note that at very high score (ϳ0.5) more abundant peaks are again observed; manual inspection revealed that many of these occur when noise peaks are separated (by random chance) from a highly abundant peak by exactly 1/z. This results in very high residuals for these incorrect fits.
One factor in the ability of the algorithm to determine the correct ratio is whether it can determine which of the peaks of an isotope cluster is the monoisotopic mass, particularly for peptides highly up-regulated in the labeled sample; put another way, it is important for the algorithm to determine the correct "alignment" between the theoretical and experimental isotopic distributions. The isotope clusters in Fig. 3a are colored green if the alignment was determined to be correct using the procedure described above and red if it was incorrect; isotope clusters whose correct alignment could not be determined because their mass and retention time could not be matched to the 16 O sample are colored black. Fig. 3b shows a histogram, for each of the nine expected ratios, of the number of isotope clusters (horizontal axis) that have a given ratio (vertical axis, which has the same scale as in a). Only those clusters with scores less than the threshold of 0.1 are shown. Stacked bars are colored red, green, or black to indicate alignment correctness (note: truncated bars are one-third the correct length). In general, the experimental ratios correspond well with the expected ones out to expected ratios of about 20 and 0.05; the distributions do broaden, however, at the more extreme ratios. Some bias toward 16 O is observed in the ratios; reverse labeling experiments and/or normalization may be useful in correcting this. Fig. 4 shows FT-ICR mass spectra of a single tryptic peptide of BSA, having been digested in (a) normal H 2 16 O, (b) H 2 18 O reported by the vendor as Ͼ99% pure, and (c) 90% pure H 2 18 O. In Fig. 4b the majority of the peptide molecules have incorporated two 18 O atoms, producing the expected 4-Da mass shift. As can be seen from comparing Fig. 4, a and b, the isotopic abundance distributions of 16 O-and that of 18 O-labeled peptides have a very similar shape (imagine shifting the isotopic distribution in Fig. 4a 4 daltons higher). Thus, when a peptide exists in high abundance in one sample but is at a level below the detection limit in the other, it is not possible to tell from which sample it originated. This determination depends in large part on the alignment step in the algorithm that attempts to determine which peak is the monoisotopic peak; if the wrong peak is decided upon, the algorithm will compute incorrect ratios. Moreover the C-terminal peptide of each protein in the sample will not incorporate 18 O label because it is (typically) not a substrate for trypsin. A spectrum such as in Fig. 4, a or b, could then be the result of a peptide that is highly up-regulated, is highly downregulated, or exists only in one of the samples; it could also result from a C-terminal peptide.
Comparison of Fig. 4, a and c, shows that this ambiguity can be partially resolved by using H 2 18 O of less than 100% purity, resulting in a distinct pattern of peaks. Fig. 4c shows the same peptide digested in 90% H 2 18 O. In this example, ϳ 1 ⁄3 of the peptide molecules have incorporated only a single 18 O atom, and a small fraction, ϳ4%, have not incorporated any. Thus, utilization of 90% H 2 18 O incorporates a signature to discriminate, for example, a C-terminal peptide that is unlabeled from a highly up-regulated, labeled ( 18 O 2 ) peptide. Fig. 5 shows the advantage of using impure H 2 18 O in distinguishing extreme ratios by depicting histograms of the number of correct versus incorrect alignments for each of the nine expected 18 O/ 16 O ratios in both 99 and 90% H 2 18 O. Across a range of abundance bins (reported as the base 10 logarithm of the abundance of the most abundant peak in each isotope cluster), the number of isotope clusters where the alignment was determined to be correct is shown in green, whereas the number with incorrect alignments are shown in red. The number of isotope clusters where the correct alignment could not be determined (see above for procedure) is shown in black. Only isotope clusters with scores less than the threshold of 0.1 are included. In Fig. 5a the digest was performed in 99% pure H 2 18 O. At lower ratios (more 16 O), the algorithm has difficulty determining the correct alignment, as shown by the large number of isotope clusters with incorrect alignments, because the expected distributions are indistinguishable to the algorithm. However, when the purity of the H 2 18 O is dropped to 90%, as shown in Fig. 5b, the algorithm does better as indicated by the lower number of incorrect alignments at low 18 O/ 16 O ratios. This is because the additional impurity causes a distinct pattern of peaks in the extreme 18 O case where most of the abundance in the combined sample comes from the labeled sample (where the 18 O/ 16 O ratio is high). The algorithm uses this fact to distinguish the two extremes of labeling during the alignment proc-  16 O at the bottom). Stacked bars are colored by the number of isotope clusters with correct (green), incorrect (red), and unknown (black) alignment between theoretical and experimental isotopic distributions. b, as in a but with 90% H 2 18 O. By decreasing the purity of the water the performance of the algorithm in finding the correct alignment is improved at low 18 O/ 16 O ratios. Note that due to differences in chromatography, the total number of clusters is not comparable between a and b. Inf, infinity. a and b at all ratios. Ratios closer to 1 are easier to detect at lower abundance because they are more likely to have all three of the A, A ϩ 2, and A ϩ 4 peaks. Work to determine the optimum level of impurity is ongoing. Note that the total number of isotope clusters is not equal between Fig. 5, a and b, due to differences in chromatography.
The variable incorporation rate of 18 O into individual peptides is another factor that can influence the correctness of the ratios computed by the algorithm. Different peptides can have different incorporation rates because trypsin has different affinities for them. A peptide with a low incorporation rate will show an artificially low 18 O/ 16 O ratio because the labeled sample will contain an artificially high amount of 18 (11,28,29). Increasing the concentration of trypsin in the labeling reaction will also increase the reaction rate. One commercially available kit (12) (Prolytica, Stratagene) achieves this by utilizing two digestion steps: a longer, high volume digestion in H 2 16 O with solution-based trypsin followed by a second, shorter digestion in H 2 18 O with immobilized trypsin. The use of immobilized trypsin allows for higher trypsin concentrations without increasing the number of autolytic fragments resulting from simply increasing the trypsin concentration. However, in the single step digestion procedure, if a peptide is seen, it must have had at least one opportunity for exchange; this is not true in the two-step digestion procedure. The algorithm attempts to correct for low incorporation rates by computing an effective purity, denoted s, using a function of the 18  This approach is demonstrated in Fig. 6, which shows a histogram of the distribution of the rate of incorporation of 18 O, indicated by the effective purity s c , across isotope clusters found in the nine chromatograms with known ratios. Higher s c values indicate more efficient labeling, which would suggest higher affinity of trypsin for a given peptide. Three representative peptides are shown in a, b, and c in both the 16 O sample (top) used to determine the correct monoisotopic mass and the all 18 O sample (bottom), which should show complete incorporation; the predicted abundance distribution (ŷ ) is indicated by circles. Moving from right to left, Fig. 6c shows a peptide with a high s c value, indicating substantial incorporation of 18 O. Fig. 6b shows a peptide at intermediate s c where less than half the molecules have incorporated one 18 O atom and almost none have incorporated two. Finally in Fig. 6a a peptide with a low s c value is shown, indicating poor incorporation; indeed very few molecules have incorporated one 18 O atom, and essentially none have incorporated two 18 O atoms. Peptides with such a low incorporation rate are problematic because it is difficult to correctly estimate the incorporation rate when so little 18 O 2 is seen. In general, these peptides are simply not amenable to quantification. Table II gives, for each of the nine chromatograms at different expected ratios, information about the number of clusters found and the number of species (peptides) identified by grouping those clusters using tolerances in mass and retention time. Only those isotope clusters with scores less than 0.2 are considered. Over 500 species were found even in a digest containing only two proteins; we believe that large FIG. 6. Histogram of the number of isotope clusters with a given incorporation rate, indicated by the effective purity (s c ). Data from all nine chromatograms of purified proteins digested in H 2 16 O and 90% H 2 18 O and mixed in varying ratios are shown. Higher incorporation rates indicate more of the peptide molecules have had both C-terminal 16  amounts of chymotryptic cleavage, nonspecific cleavage, post-translational modification, and salt adduction account for this high number of species (data not shown). We report the percent of total ion current (TIC) that could be determined to belong to peaks involved in isotope clusters detected by the algorithm both before and after grouping. On average, 61% of the ion current could be accounted for by the algorithm. The percentage after grouping reflects the fact that we require at least three isotope clusters to find a species; on average about 53% of TIC could be accounted for by the algorithm after grouping. Finally the algorithm took on average 45 s to process an 80-min chromatogram (of about 1,150 spectra) or about 26 spectra/s.
To give some sense of how the algorithm performs on more complex mixtures, Fig. 7 shows peptides found in a single SCX fraction of pooled human plasma that had been depleted of the six most abundant proteins. Isotope clusters detected by the algorithm that represented occurrences of peptides at multiple charge states and in multiple spectra chromatographically were combined using tolerances of 20 ppm and 120 s as described by Mason et al. (27). Fig. 7a plots the 18 O/ 16 O ratio of the most abundant ion for each peptide versus the fitting score of that ion. Lower scores indicate better fit, and larger ratios indicate species up-regulated in the disease sample. An example of a highly up-regulated peptide with an 18 O/ 16 O ratio of ϳ50 is shown in Fig. 7b where circles indicate the predicted isotopic abundances (ŷ ). Fig. 7c shows a slightly down-regulated peptide ( 18 O/ 16 O Ϸ 0.4) with a very good fit (score, 0.007). Fig. 7d shows a peptide with a neutral mass of 4,224 Da; the algorithm calculates a 18 O/ 16 O ratio of 2.8. This isotopic distribution would be challenging to interpret manually or with previous ratio calculations. It should be noted that, as the mass of peptides increase, their isotopic distributions become dominated by 13 C isotopes, and they broaden considerably. Above ϳ2,500 Da, the naturally occurring 13 C makes it more difficult to detect the presence of the 18 O introduced by the labeling.
Parenthetically we note that averagine yields only an approximate chemical composition for a given neutral mass; the predicted chemical composition can be wrong, particularly in the number of sulfurs. We examined the effect of allowing the algorithm to vary the number of sulfurs up or down from that predicted by averagine and determined that the effect of this, although improving scores for some peptides, was minimal (data not shown). DISCUSSION We borrow many concepts from the THRASH algorithm (20), which we have used previously as part of a label-free biomarker discovery strategy (15). The chief distinction be- FIG. 7. Ratios observed in a complex sample. a, scatter plot of peptides detected in a single SCX fraction from two pools of depleted human plasma, one trypsin-digested in H 2 16 O and the other in 90% H 2 18 O (dataset 3, Fig. 1). Isotope clusters from the algorithm were grouped into species using mass (20 ppm) and retention time tolerances (120 s) as described in the text; for each peptide, the fitting score (x axis) and 18 Ϸ 50; b) and a slightly down-regulated peptide (c) both have extremely good fitting scores. d, a peptide with neutral mass 4,233.9 would be difficult to correctly interpret either manually or using previous ratio metrics. As the mass of the peptide increases, it becomes increasingly difficult for the algorithm to both find the correct alignment and accurately calculate the ratio. tween this work and THRASH is that this work is amenable to 18 O-labeled data where complex overlapping isotopic distributions preclude the single theoretical isotopic distribution used by THRASH. However there are other, possibly more subtle, differences we wish to point out.
Our algorithm makes use of centroided peak lists. This is in sharp contrast to THRASH, which does not perform a priori peak detection and instead directly examines the full-profile spectral data. Whereas we only attempt alignments that correspond to the apexes of peaks (typically 10 -12 alignments per cluster), THRASH, in contrast, tries a series of several thousand alignments in very small m/z increments assuming each of the several thousand samples in the spectral region under examination may contain the monoisotopic mass. This makes THRASH much more robust in its ability to detect low abundance clusters with peaks below the signal to noise threshold of a peak picking algorithm but also makes it orders of magnitude slower. Also although the present work uses linear regression to find the best fit in a least squares manner in both m/z and abundance, THRASH iterates an overall abundance parameter to scale the theoretical isotopic distribution being aligned.
Our algorithm attempts all allowed charge states that are consistent with the peak data; THRASH, in contrast, uses a Fourier-Patterson charge state determination algorithm (30) to determine the prevailing charge state in a spectral region. THRASH only falls back to fitting all allowed charge states if the fit resulting from the charge state determined by the Fourier-Patterson is poor. Our method relies on the fact that performing the regression is very fast and that far fewer possible alignments are attempted. THRASH also depends on a subtraction step in which the predicted isotopic abundances are subtracted from the original spectrum. Experience has shown this subtraction to be problematic in that it often leaves residuals behind that may have spectral characteristics, resulting in the detection of artifactual isotope clusters with acceptable fitting scores. We avoid the subtraction by simply fitting the two clusters separately.

CONCLUSIONS
In conclusion, we have described and demonstrated the effectiveness of an algorithm that automatically interprets spectra of 18 O-labeled isotope clusters. It utilizes linear regression to model the complex, overlapping isotopic distributions that result in these spectra. The algorithm allows for correction of peptide to peptide variability of incorporation rate, particularly when using a common two-step digestion/ labeling procedure. It also provides for the use of impure H 2 18 O to distinguish strongly up-regulated peptides from strongly down-regulated and C-terminal peptides. The algorithm operates on centroided peak data and does not require fragmentation spectra; it is fast, interpreting an entire 1,500spectrum chromatogram in less than a minute. This provides for the possibility that the algorithm could be used on line during acquisition to direct the instrument to perform MS/MS on peptides with interesting ratios. This could help ameliorate the fact that in a typical MS/MS experiment one must extensively separate complex samples or risk missing low abundance peptides due to the limited capacity of the instrument to perform fragmentation.
The algorithm was applied here to FT-ICR data but should be generally applicable to any instrument with sufficient resolving power. Also although it was designed for 18 O data, it handles unlabeled isotopic distributions as a special case and could be used to interpret label-free experiments as well. It could also be easily adapted for other labeling moieties. In the future we hope to improve our scoring scheme, to incorporate knowledge of the expected distribution of the incorporation rate (s c ), and to incorporate protein identification data when available.