Molecular & Cellular Proteomics 5:423-432, 2006.
© 2006 by The American Society for Biochemistry and Molecular Biology, Inc.
From the
Department of Computer Science, University of Washington, Seattle, Washington 98195 ** Institute for Systems Biology, Seattle, Washington 98103 ¶ Fred Hutchinson Cancer Research Center, Seattle, Washington 98109 
Institute for Molecular Systems Biology, Eidgenössische Technische Hochschule and Faculty of Natural Sciences, University of Zurich, Zurich, Switzerland 
Department of Molecular Biology and Biochemistry, Wesleyan University, Middletown, Connecticut 06459
Systems Biology Group, Institut Pasteur, 25-28 Rue du Dr. Roux, 75015 Paris, France || Cedars-Sinai Medical Center, Los Angeles, California 90048
| ABSTRACT |
|---|
Although mass spectrometers are exquisitely sophisticated instruments, there is evidence that they suffer from problems of sensitivity, reproducibility, and undersampling. The number of human genes, for example, is estimated to be well above 20,000, but protein products for only a fraction of those have been detected so far (6). The main reason is the limited capability to detect very low abundance analytes; this low sensitivity creates problems for fingerprinting-based approaches. Furthermore, the sample throughput of tandem MS-based methods is another limitation of this technology. Peptides are typically selected for CID based on the intensity of the MS signal they generate. Because only some of the larger peaks are typically subjected to CID, this undersampling reduces the ability to profile a sample completely and confidently. The problem is compounded by the low sensitivity of the search algorithms used to sequence MS/MS spectra. Both problems are approached (with limited success) by repeating experiments so that the net throughput is better than that of the individual experiments. This, in turn, exposes the problem of reproducibility: the intensity of peptide peaks varies across experiments, and thus widely varying subsets of peptides are identified and quantified in each. All these issues create a fundamental problem for large scale MS-based proteomic studies, particularly those that are comparative in nature (for example, identifying biomarkers in human serum).
In the last few years, some computational approaches have been suggested to tackle these issues, for example accurate mass tags (7) and PepMiner (8). These approaches are based on building MS fingerprints of the peptide signals. Commonly, if multiple experiments on related samples are to be compared, they are first interpreted individually, resulting in lists of identified peptides or proteins, one per experiment, that are then compared. The success of these approaches is limited by the low throughput of the tandem MS experiments and the semistochastic subset of proteins identified in each experiment. Recently several research groups (9–11) have built tools to computationally identify and list the features (putative peptides) in a single MS experiment. These lists are then compared across multiple experiments by matching the m/z value and the retention time of the individual features. Peptide elution is not a well understood process, and a lot of unexplained behavior is observed even in the elution profiles of the sequenced peptides, making these feature lists ambiguous (as is evident from the different feature lists generated by different tools on the same sample). Also, the retention time for a peptide is notoriously hard to reproduce (12), a notion that was confirmed by the experimental data we examined here. Reasons for this include variations in the speed with which the peptides travel through the LC column, variations in the time elapsing between successive fractions, and inconsistencies in the solvent gradient generation. Thus computational alignment of the LC column across multiple experiments is a hard problem, but the above methods do not give evidence for the correctness of the alignments that they construct. Furthermore, the above methods make additional assumptions, e.g. that retention time can be interpolated linearly between blocks in different experiments (10, 13), that features can be reliably recognized from single experiments (10), that the total ion count profile suffices for the characterization of the alignment in time (9, 14), and that retention time varies by about 5 min or less between experiments (11).
In this study, we suggest a novel approach that differs significantly from all of the above. First, we create a signal map that maps the signals from any given peptide in one experiment to the signals acquired from the same peptide in the other experiment. Once constructed, signal maps can be exploited for a variety of purposes, for example, for detecting features common across large numbers of experiments. The intuition behind this approach is that feature extraction on the basis of multiple experiments will be more sensitive and specific than first identifying features and then comparing them across multiple experiments. Our approach first constructs a signal map on the raw MS data and performs all other data processing (e.g. feature detection) later. Similar ideas are successfully used in comparative genomics where genome maps are first created using whole-genome alignments and then used for better understanding of the content and function of various genomic regions (15).
Table I presents an overview of the formal notions we used. Let M and N be two MS runs where M1, M2, ... and N1, N2, ... are the corresponding mass spectra. Our technical approach (described in detail under "Materials and Methods") constructs the signal map by constructing an alignment that consists of a set of pairs (i,j) such that the occurrence of shared peaks between the pairs Mi and Nj is globally maximized. It is a stronger version of dynamic time warping, a highly successful approach in natural speech processing and other fields (16). To account for varying column speeds, we allow for multiple mass spectra in one run to correspond to a single mass spectrum in the other run. Our approach assumes that elution order (the relative order of peptides) is conserved. This assumption is likely to hold because peptides are separated by the same physical property in both runs. All methods described earlier (10, 11, 13, 14) make this assumption without presenting direct experimental evidence for it. In this study, other than describing the usefulness of signal maps, we also present (a) a method that does not make any additional assumptions beyond elution order conservation, (b) direct experimental evidence that this assumption holds, and (c) direct experimental evidence for the correctness of the alignments resulting from our method even on low-to-moderate accuracy instruments.
| MATERIALS AND METHODS |
|---|
Score Function
Here we consider two runs R = (M1, M2, ... ) and S = (N1, N2, ... ). Generally speaking, our aim was to construct an alignment that places similar spectra Mi and Nj close to each other. Our procedure for finding such an alignment is based on a pairwise score function. Note that a spectrum is a list of peaks, so the spectrum Mi can be written as {p1, p2, ... } where mz(pi) and intensity(pi) are the m/z ratio and intensity value for the peak pi, respectively (see Table I).
This score function rewards corresponding peaks within a window of $2\epsilon$, where $\epsilon$ is the accuracy of the mass spectrometer used. Our initial measure of agreement between the two spectra Mi = {p1, p2, ...} and Nj = {q1, q2, ...} is

$$M_i \times N_j = \sum_{(p,q)} \mathrm{intensity}(p) \times \mathrm{intensity}(q),$$

where the sum runs over all peak pairs (p,q), with p in Mi and q in Nj, such that $|mz(p) - mz(q)| \le 2\epsilon$. Normalizing this expression to make it robust against global linear fluctuations in peak intensity, we arrive at the following preliminary score function for two mass spectra Mi and Nj:

$$\frac{M_i \times N_j}{\sqrt{(M_i \times M_i)(N_j \times N_j)}},$$

which is the same measure as the one used by Stein and Scott (17).
However, we found that the above score function does not distinguish "close" from "distant" spectra very well. To address this, we computed an additional term E[Mi x Nj], which denotes the expected value of Mi x Nj under the random placement of all peaks within the given mass-to-charge range.
$$s(i,j) = \frac{M_i \times N_j - E[M_i \times N_j]}{\sqrt{(M_i \times M_i)(N_j \times N_j)}}$$
If we consider each spectrum as being generated from two components, a noise distribution and a signal distribution, E[Mi x Nj] approximates s(i,j) for Mi and Nj generated only from the noise distribution. Thus subtracting this term makes only the signal component contribute to the score. The expected value of s(i,j) is thus zero for completely uncorrelated mass spectra Mi and Nj. Fortunately E[Mi x Nj] is straightforward to compute. Note that the probability that two peaks p and q with randomized intensity values are within $2\epsilon$ of each other (with $\epsilon$ the accuracy of the mass spectrometer) is constant. Therefore, a constant c proportional to $\epsilon$ exists such that

$$E[M_i \times N_j] = c \cdot \Big(\sum_{p} \mathrm{intensity}(p)\Big) \Big(\sum_{q} \mathrm{intensity}(q)\Big).$$

Thus, after precomputing $\sum_{p} \mathrm{intensity}(p)$ for Mi and $\sum_{q} \mathrm{intensity}(q)$ for Nj, E[Mi x Nj] can be computed for each given pair (Mi,Nj) with only three multiplications. To further reduce the effect of unrelated spectra, any score below 0.2 was reduced to zero.
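The score function described in this section can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a spectrum is a list of (m/z, intensity) tuples, that the final score is the normalized peak product with the expected noise term subtracted in the numerator, and that the constant c is supplied by the caller. The names `windowed_dot`, `score`, `eps`, `c`, and `cutoff` are hypothetical.

```python
import math

def windowed_dot(spec_a, spec_b, eps):
    """Sum intensity(p) * intensity(q) over peak pairs with |mz(p) - mz(q)| <= 2 * eps."""
    total = 0.0
    for mz_p, int_p in spec_a:
        for mz_q, int_q in spec_b:
            if abs(mz_p - mz_q) <= 2 * eps:
                total += int_p * int_q
    return total

def score(spec_a, spec_b, eps, c, cutoff=0.2):
    """Noise-corrected, normalized similarity of two spectra (assumed form)."""
    denom = math.sqrt(windowed_dot(spec_a, spec_a, eps) * windowed_dot(spec_b, spec_b, eps))
    if denom == 0.0:
        return 0.0  # at least one spectrum has no peaks or no intensity
    # Expected product under random peak placement: c * (sum of intensities) * (sum of intensities).
    expected = c * sum(i for _, i in spec_a) * sum(i for _, i in spec_b)
    s = (windowed_dot(spec_a, spec_b, eps) - expected) / denom
    # Scores below the cutoff are reduced to zero, as described in the text.
    return s if s >= cutoff else 0.0
```

The quadratic loop over peak pairs keeps the sketch short; with peaks sorted by m/z, a two-pointer sweep would compute the same sum faster.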
Alignment Algorithm
Based on the above score function s(i,j), we aimed to relate peaks of a run R = (M1, M2, ... ) with peaks of another run S = (N1, N2, ... ) through a signal map f by choosing the alignment $\mathcal{A}$ such that similar spectra appear close to each other in the sequence of pairs (i,j) in $\mathcal{A}$. In empirical tests, we determined that it is beneficial to introduce some robustness against bad spectra by scoring not only pairs of spectra immediately adjacent in $\mathcal{A}$ but also close ones as follows.
For each spectrum Mi, we also scored its similarity with those two spectra $N_{j-1}$ and $N_{j-2}$ that appear immediately before Nj in $\mathcal{A}$ and those two spectra $N_{j+1}$ and $N_{j+2}$ that appear immediately after Nj in $\mathcal{A}$ (omitting the border cases). Nj can also be scored against Mi in a similar way. Adding all these similarity scores gives us a much stronger estimate for the local similarity between Mi and Nj. A best alignment according to this score function can be computed using a straightforward modification of the Needleman-Wunsch global alignment algorithm (18) as described below.
Let sc(k,l) be the score of the best alignment of runs (M1, M2, ... Mk) and (N1, N2, ... Nl). Then using dynamic programming we can compute sc(i,j) by the following recursion.
![]() |
Formulating the problem in the above manner, the dynamic programming computes the score sc(k,l) of the global alignment with the maximum spectra similarity, giving us a global measure of similarity between the two runs. As in sequence alignment, an optimal alignment can be extracted by backtracking the optimal values in the above recursion.
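As an illustration of the dynamic programming and backtracking steps, the following Python sketch aligns two runs from a precomputed matrix of spectrum similarities. The recursion used by the authors is not reproduced here. The sketch assumes the standard dynamic time warping form sc(i,j) = s(i,j) + max(sc(i-1,j-1), sc(i-1,j), sc(i,j-1)), which allows several spectra of one run to map to a single spectrum of the other, and it omits the additional scoring of neighboring spectra. The function name `align` and the matrix layout are hypothetical.

```python
def align(scores):
    """Return (best total score, alignment path) for a similarity matrix.

    scores[i][j] is the similarity of spectrum i of run R and spectrum j of run S.
    The path is a list of index pairs (i, j) from (0, 0) to the last pair.
    """
    n, m = len(scores), len(scores[0])
    sc = [[0.0] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                best, prev = 0.0, None
            else:
                candidates = []
                if i > 0 and j > 0:
                    candidates.append((sc[i - 1][j - 1], (i - 1, j - 1)))
                if i > 0:
                    candidates.append((sc[i - 1][j], (i - 1, j)))
                if j > 0:
                    candidates.append((sc[i][j - 1], (i, j - 1)))
                # Ties are broken in favor of the diagonal step (listed first).
                best, prev = max(candidates, key=lambda t: t[0])
            sc[i][j] = best + scores[i][j]
            back[i][j] = prev
    # Backtrack from the last cell to recover one optimal alignment.
    path = []
    cell = (n - 1, m - 1)
    while cell is not None:
        path.append(cell)
        cell = back[cell[0]][cell[1]]
    path.reverse()
    return sc[n - 1][m - 1], path
```

For example, `align([[1, 1, 0], [0, 0, 1]])` pairs the first spectrum of R with the first two spectra of S, which is how a difference in column speed would appear in the alignment.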
We validated the above approach by examining a few pairwise alignments. The data set we used represents five duplicate two-dimensional LC-MS/MS measurements of tryptic digests of a whole-cell lysate of the yeast Saccharomyces cerevisiae. The five samples were collected from cells synchronized in the G1 phase of the cell cycle and at four time points following release (30, 60, 90, and 120 min). The protein sample was digested into peptides using the enzyme trypsin and then separated by charge into 35 fractions using SCX chromatography. For each time point, each fraction was split in half, resulting in a "Series A" and a "Series B," which together have a total of 68 fractions (at time point 0, data for two Series B fractions were not acquired). Each sample, in turn, was separated according to hydrophobicity by RPLC and then analyzed by ESI-MS (ThermoFinnigan LCQ Deca XP mass spectrometer). Complete details about this data set are presented in Flory et al. (Footnote 1). We use the notation T0SCX23A to refer to the RPLC experiment done on Series A of SCX fraction 23 of the sample acquired at time point 0 min. Fig. 1 shows three alignments for this purpose. The alignment of run T0SCX23A with itself is a near perfect diagonal alignment, as one would expect. The other two plots show the alignment of run T0SCX23A with runs T0SCX23B (repeat experiment) and T0SCX24A (the strong cation exchange (SCX; Footnote 2) fraction following T0SCX23A). Neither of these alignments is a perfect diagonal, but the middle region of the run, where (from our more detailed examination) almost all of the signals lie, is quite close to diagonal (similar evidence in Figs. 5 and 8). As expected, the technical replicate experiment T0SCX23B had an alignment closer to diagonal than the adjacent SCX fraction T0SCX24A, which could be considered an approximate "biological replicate."
It is important to note that the slopes of these alignments, which are similar to one another yet distinct and constant over a large interval, are in no way favored by the algorithm itself. This observation suggests that the alignment reveals a true difference in the relative speed with which the peptides elute in these runs. This phenomenon is known among mass spectrometrists as a difference in flow rates of the two reverse phase (RP) LC columns, and flow rate is one parameter that is known to be hard to keep constant between runs. This empirical observation confirms the ability of our optimal alignment to reconstruct the correspondence between two runs of mass spectra.
Analysis of Similarities and Analysis of Differences
Both these analyses required us to analyze multiple (more than two) runs together. Based on our method for constructing pairwise signal maps by alignment, we explored two strategies.
Global Alignment Using a Set of Globally Best Pairwise Alignments
The pairwise alignment procedure developed above allowed us to map the signals between any pair of 68 runs (of time point 0) in our input data (yeast experiment) and thus match signals between successive runs. To extend the pairwise map to a global map, we needed to thread together more than two runs. Although this might be done in the linear order in which the runs were acquired, we tried the following approach to guard against potentially bad runs. We decided to maximize the quality of the set of the pairwise alignments that we can choose to thread together to form a global alignment.
Given n = 68 runs, we would always need n − 1 = 67 pairwise alignments to connect all runs into a single spanning graph. For this, we decided to compute a minimum spanning tree (19) of the complete graph whose edges are all pairwise alignments. Fig. 2 shows the minimum spanning tree.
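The selection of n − 1 pairwise alignments can be sketched with Kruskal's algorithm. The text does not state how edge weights were derived from the alignments, so the sketch assumes that each pair of runs carries a cost in which lower values mean a better alignment (for example, one minus the average spectra similarity). The function name `minimum_spanning_tree` and the `costs` dictionary are hypothetical.

```python
def minimum_spanning_tree(n_runs, costs):
    """Kruskal's algorithm over runs 0..n_runs-1.

    costs maps a pair of run indices (a, b) to an alignment cost
    (ASSUMPTION: lower cost means a better pairwise alignment).
    Returns the list of chosen pairs, n_runs - 1 of them if the graph is connected.
    """
    parent = list(range(n_runs))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for (a, b), _cost in sorted(costs.items(), key=lambda kv: kv[1]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two components, so keep it
            parent[root_a] = root_b
            tree.append((a, b))
    return tree
```

Threading runs along the resulting tree uses only the best available pairwise alignments, so a single bad run ends up as a leaf instead of breaking a chain of consecutive runs.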
Progressive Multiple Alignment
The other approach we proposed is based on a progressive alignment strategy. For this, we computed all pairwise alignments. Then the strongest pair (having the highest alignment score) was merged to form a consensus run (using the analysis of similarities (AnSi) approach described below). This procedure was then repeated to identify the next pair of runs to merge. The process was repeated until we were left with a single consensus run.
AnSi
To analyze the similarities between two runs, we merged the two into a single run with the idea that the similarities would be strengthened. The merge operation follows the signal map between the two runs. For each pair (Mi,Nj) of MS spectra in the alignment $\mathcal{A}$, we generated a new spectrum by overlaying the two spectra. Peaks close in m/z value were merged into one by adding the intensity and averaging the m/z value (weighted by their intensity). In this way we could create a run of merged spectra, which could then be treated as a consensus run.
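The overlay step for one aligned pair of spectra can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: spectra are lists of (m/z, intensity) tuples, "close in m/z" is modeled by a caller-supplied tolerance `tol`, and peaks are merged greedily in order of increasing m/z. The function name `merge_spectra` is hypothetical.

```python
def merge_spectra(spec_a, spec_b, tol):
    """Overlay two spectra and merge peaks whose m/z values differ by at most tol.

    Merged peaks get the summed intensity and the intensity-weighted mean m/z.
    """
    merged = []
    for mz, inten in sorted(spec_a + spec_b):
        if merged and mz - merged[-1][0] <= tol:
            last_mz, last_int = merged[-1]
            total = last_int + inten
            if total > 0:
                new_mz = (last_mz * last_int + mz * inten) / total
            else:
                new_mz = last_mz  # both peaks have zero intensity
            merged[-1] = (new_mz, total)
        else:
            merged.append((mz, inten))
    return merged
```

Applying this to every pair (Mi,Nj) of the alignment yields the run of merged spectra that the text treats as a consensus run.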
Analysis of Differences (AnDi)
To identify differences between two runs, the merging happens in a slightly different way. Instead of merging the two spectra (Mi,Nj) together, for each peak of Mi we searched for the highest corresponding peak in a small window (±10 spectra) around Nj. The window allows for small errors in the alignment and for an elution profile that is not well understood. This largest peak was then used to normalize the corresponding peak in Mi. In this way we obtained a difference spectrum for Mi. The run of such difference spectra, one for each of M1, M2, ..., gives the consensus difference run.
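A rough Python sketch of the AnDi step for one aligned pair is given below. The text does not specify how the largest matching peak "normalizes" the peak in Mi, so the sketch assumes, purely for illustration, that the matched intensity is subtracted and the result clipped at zero. The m/z tolerance `mz_tol` and the function name `difference_spectrum` are likewise hypothetical.

```python
def difference_spectrum(spec_m, run_n, j, mz_tol, window=10):
    """Difference spectrum of spec_m against spectra j-window..j+window of run_n.

    spec_m is a list of (m/z, intensity) tuples; run_n is a list of such spectra.
    """
    lo = max(0, j - window)
    hi = min(len(run_n), j + window + 1)
    diff = []
    for mz_p, int_p in spec_m:
        # Highest corresponding peak within the window of spectra around N_j.
        best = 0.0
        for spectrum in run_n[lo:hi]:
            for mz_q, int_q in spectrum:
                if abs(mz_p - mz_q) <= mz_tol:
                    best = max(best, int_q)
        # ASSUMPTION: "normalize" is modeled as subtraction clipped at zero.
        diff.append((mz_p, max(0.0, int_p - best)))
    return diff
```

Peaks of Mi with no counterpart in the window keep their full intensity, so signals present in only one of the two runs stand out in the resulting run.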
Feature Recognition Method
To detect features in real and virtual runs, we used the following simple feature recognition algorithm. We called a peak a feature if the following three conditions are satisfied.
Peptide ID Transfer
Two CID spectra were compared using the following three factors, and if they were significantly similar, the identification was transferred from one to the other.
If the two CID spectra showed high similarity on the above three factors, the identification was transferred from the source to the target. To do an interexperiment peptide transfer, all CID spectra in one experiment were compared with the ones from the second experiment using the above procedure.
In addition to the above criteria, we could also require proximity in the signal map, i.e. the precursor peaks p and q in the two runs (which correspond to the same peptide) are such that f(p) and q are close to each other in spectra indices. As described earlier, we first computed all pairwise alignments. Then, using the signal map consisting of all strong pairwise alignments (alignments with average spectra similarity greater than 0.2), we could decide whether the two CID spectra are close. Because capturing peptides for CID happens on an irregular basis, we used a relaxed threshold (±100 spectra) to decide whether the two CID spectra refer to the same peptide.
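The proximity criterion can be sketched as a simple check against an alignment. The sketch assumes that the signal map is available as a list of index pairs (i, j) and interprets f(p) as the set of spectra in the second run that are aligned to the spectrum containing p. The function name `close_in_signal_map` and its parameters are hypothetical.

```python
def close_in_signal_map(alignment, k, l, max_gap=100):
    """Return True if spectrum k of the first run maps to within max_gap spectra of l.

    alignment is a list of (i, j) index pairs; k and l are the indices of the MS
    spectra from which the two CID precursors were selected.
    """
    return any(i == k and abs(j - l) <= max_gap for i, j in alignment)
```

With the relaxed default of 100 spectra, two CID spectra whose precursors were sampled at somewhat different points of the same elution profile still count as close.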
| RESULTS |
|---|
To evaluate how well score functions work in this context, we considered pairs of successive SCX fractions of time point 0 from the yeast cell cycle data. These can be expected to have some overlap in peptide content, and high scores should thus be generated for those spectrum pairs (Mi,Nj) with similar i and j. When scores s(i,j) are represented in a rectangular array with coordinates (i,j), high scores are thus expected around a path from low (i,j) pairs to high (i,j) pairs (in the case of a perfectly linear relationship between elution times, around a diagonal).
Fig. 3 represents such arrays for the runs T0SCX23A and T0SCX24A on the left-hand side for the Stein and Scott (17) score function that performed best in a previous comparison of score functions and on the right-hand side for our score function. In both cases, high scores did occur around portions of the diagonal, but the Stein and Scott (17) score function yielded many more high scores for two more classes of unrelated spectra for which scores should be low: (a) those that are off-diagonal and not generated from similar spectra and (b) those pairs generated from the beginning and end of the two experiments. Upon closer inspection, we established that the beginning and the end of both runs contain long subseries of "empty" spectra that contain no significant signal. Similar results were obtained when other pairs of successive SCX fractions were analyzed.
To evaluate this aspect, we compared two RPLC runs used before: T0SCX23A and T0SCX24A, representing successive SCX fractions of the yeast whole-cell lysate experiment, which can thus be expected to have some overlap in terms of their peptide content. Fig. 5 represents, for all spectra pairs (Mi,Nj) where Mi is a spectrum from T0SCX23A and Nj is a spectrum from T0SCX24A, the score of the best alignment containing (Mi,Nj), with bright red corresponding to the highest score. The bright red color around the main diagonal in the highlighted area represents the score of an optimal alignment, and the sharp change of colors around it indicates its robustness.
Note that, in the areas in which one of Mi or Nj contains no significant signal (here marked "ND"), an optimal alignment is not well defined, as one would expect. The local "warps" are a well known phenomenon that can be exposed very clearly using this methodology, which makes signal maps a useful tool in the refinement and validation of separation technology.
Optimal Signal Maps Are Correct
In the yeast data, after a mass spectrum was acquired, up to four precursor masses were selected from that spectrum for CID. The selection was based on raw intensity levels, i.e. the higher peaks were preferentially selected for MS/MS. Thus, an MS spectrum was followed by zero to four MS/MS spectra. We used these alternately acquired CID spectra to evaluate the quality of the signal map.
As described earlier, experiments were performed at five time points in the yeast cell cycle with a significant degree of overlap expected between successive time points. For each time point a large number of CID spectra were conclusively assigned to a peptide ("identified") using Sequest (20) and PeptideProphet (21). As there is no deterministic experimental control for the elution phase in which any given peptide is sampled by CID, the correct signal map usually does not contain the corresponding MS spectra (k,l) but a pair (i,j) such that k is close to i and l is close to j.
We used the identified CID spectra in successive time points to assess the correctness of our signal map. Generally we found the successive time points to contain 10,000–20,000 identified CID spectra from shared peptides. The first step in our evaluation was to remove all the peptide identification information from time point 0. Then we transferred the peptide identifications from time point 30 to time point 0 using two algorithms. As described earlier in "Materials and Methods," the first approach is based entirely on spectra similarity, and the other approach adds constraints from the alignment (the two MS spectra from which the precursor masses were selected for CID are close according to the signal map). These transferred identifications can be compared with the original identifications of the various CID spectra of time point 0, thus computing the numbers of correct and incorrect transfers. Fig. 6 plots these numbers as we vary the threshold for the peptide transfer algorithm.
Also, using a very stringent threshold for the peptide transfer algorithm, we were able to transfer identifications to more than a thousand spectra of time point 0 that were not identified by Sequest. Thus this process of transferring identifications increases throughput and also yields a data set that can be used to understand the shortcomings of current approaches to identifying peptides from mass spectra using database searches.
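The bookkeeping behind this evaluation, counting correct and incorrect transfers as the threshold varies, can be sketched as follows. The data layout is an assumption made for illustration: each transfer is a (spectrum id, transferred peptide, similarity) tuple, and the withheld original identifications are a dictionary from spectrum id to peptide. The function name `count_transfers` is hypothetical, and the peptide strings in the test are toy values.

```python
def count_transfers(transfers, truth, thresholds):
    """Count correct and incorrect ID transfers for each similarity threshold.

    transfers: list of (spectrum_id, peptide, similarity) tuples.
    truth: dict mapping spectrum_id to the originally identified peptide.
    Returns a list of (threshold, n_correct, n_incorrect) tuples.
    """
    rows = []
    for threshold in thresholds:
        correct = incorrect = 0
        for spectrum_id, peptide, similarity in transfers:
            if similarity < threshold:
                continue  # transfer rejected at this threshold
            if spectrum_id not in truth:
                continue  # no original identification to compare against
            if truth[spectrum_id] == peptide:
                correct += 1
            else:
                incorrect += 1
        rows.append((threshold, correct, incorrect))
    return rows
```

Transfers to spectra without an original identification are skipped in the counts; these are the spectra for which a transfer adds an identification that the database search did not provide.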
Signal Maps Can Identify Biomarkers
Signal maps can be used in various ways, but in this study we focused on one of their applications, biomarker discovery. Good biomarkers are signals whose presence (or absence) reliably indicates a biological state such as an early form of a health condition, two different growth conditions in cell cultures, etc. that would otherwise be hard to detect (22). Typically two collections of samples are compared in an attempt to identify the differences. The first collection of case or disease samples is from subjects who have a disease condition. The second collection of control samples is from subjects who do not have the condition. Biomarker discovery is the process of identifying signals that distinguish two such collections.
We applied signal maps to a synthetic biomarker discovery scenario in which the composition of all samples was known a priori. Our aim was to create a data set on which we would be able to analyze the performance of an experimental-computational approach to identify biomarker signals. Two protein samples were prepared: a "control" sample consisting of a four-protein digest and a "disease" sample in which, in addition to the four digested proteins, β-lactoglobulin was spiked in as a simulated "biomarker." Six LC-MS experiments on the same four-protein control sample and seven experiments on the same five-protein disease sample formed the basis of our analysis. Despite having only four or five proteins, these mixtures are surprisingly complex because the peptides exhibit multiple charge and isotopic states, the proteins are not 100% pure, the tryptic digestion is imperfect (leading to missed cleavages and miscleavages), and the gradient is short. All these issues arise in any proteomic setup, increasing the complexity of the sample manyfold. This was observed for this mixture too: thousands of peptide-like features were observed in the experiments (without deisotoping), whereas the theoretical tryptic digestion yields only a hundred or so peptides.
The diagram in Fig. 7 summarizes our two-stage biomarker discovery strategy on the basis of signal maps. In the first stage, we created signal maps between the six runs (repeats) on the four-protein control sample, allowing us to analyze signals that occur consistently across these runs and summarize them in a new virtual control run we called C (AnSi; described earlier in "Materials and Methods"). This was done using pairwise alignments between the various repeat pairs. One such alignment is shown in Fig. 8. Comparing it with Fig. 5, we can see that the repeats were more reproducible in the four-protein mixture than in the yeast experiment.
In the second stage, we constructed a signal map between D and C that allowed us to integrate these two data sets into a virtual difference run (using AnDi; described earlier in "Materials and Methods"). Briefly, AnDi aims to identify signals present in the virtual five-protein run D but not in the virtual four-protein run C. For convenience we call the resulting virtual run D − C. Before computing D − C, however, we performed a first validation of this approach by computing C − D, a virtual run that should, under ideal experimental conditions and perfect signal maps, contain no signal. Fig. 9, panel 3, shows that indeed very few signals are present in C − D. In fact, there are hardly any signals present if we ignore the start and the end of the run C − D. This result confirms our assumption about peptide elution order: if the order in which any pair (p,q) of peptides elute had changed between the runs C and D, our alignment would have failed to capture that, and thus most likely signals from p or q (or both) would be present in C − D.
Fig. 9, panel 4, shows D − C, which contains many more signals, as expected under the correct signal maps. Colors in the above figures were used to highlight those signals that an ad hoc feature detection method identified as features (described earlier in "Materials and Methods"). Green was further used to highlight those signals that corresponded to a list of expected signals from β-lactoglobulin according to the theoretical tryptic digestion performed. The many features colored in green provide evidence that signal maps and feature recognition work.
In an attempt to settle this issue, we performed an LC-MS/MS experiment with a sample that contained only a tryptic digest of β-lactoglobulin, the fifth protein, shown in Fig. 9, panel 5. On visual inspection, the figure displays a significant overlap with D − C, suggesting that the unexplained signals in D − C indeed arise from non-tryptic cleavages of the fifth protein or from other signals arising only from the fifth protein. Conversely, this means that AnSi-AnDi analysis on the basis of signal maps indeed reflects consistent differences between sets of experiments. Follow-up analysis revealed 14 non-tryptic peptides in the MS/MS experiment on β-lactoglobulin, many of which overlap with the red signals shown in Fig. 9, panel 4. To quantify the overlap between panels 4 and 5 of Fig. 9, we computed the virtual run ((D − C) − (β-lactoglobulin)) using the AnDi approach. The total ion current in this run was 15% of the total ion current in (D − C). This shows that panels 4 and 5 of Fig. 9 overlap significantly in terms of peptide content.
| DISCUSSION |
|---|
Other potential uses of signal maps include the following.
All computational analyses were performed on Linux personal computers. Computing the optimal signal map between two runs (1.5-gigabyte mzXML (23) files each) takes around 10–20 min on a normal workstation. We are currently working to develop a better understanding of elution curves in order to develop statistical approaches for the feature detection method that follows the AnDi analysis.
Although we have presented evidence that alignment can lead to the cited benefits on low-to-moderate resolution mass spectrometers, other platforms may require different parameter settings. We believe that, despite these foreseeable adaptations, we have shown that the approach is quite powerful. In particular, most of the data we used here were acquired on instruments with low-to-moderate resolution and accuracy, such as LCQ, Q-TOF, and LTQ. It will be very interesting to run our tools on data sets acquired on high resolution and more accurate instruments, such as FTICR.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Published, MCP Papers in Press, November 3, 2005, DOI 10.1074/mcp.M500133-MCP200
1 M. Flory, H. Lee, R. Bonneau, P. Mallick, K. Serikawa, D. R. Morris, and R. Aebersold, submitted for publication.
2 The abbreviations used are: SCX, strong cation exchange; RP, reverse phase; ID, identification; AnSi, analysis of similarities; AnDi, analysis of differences.
* This work was supported in part by federal funds from the NHLBI, National Institutes of Health under Contract Number N01-HV-28179.
¶¶ To whom correspondence should be addressed. Tel.: 33-1-4568-8620; Fax: 33-1-4061-3704; E-mail: benno{at}pasteur.fr
| REFERENCES |
|---|