Addressing accuracy and precision issues in iTRAQ quantitation

iTRAQ (isobaric tags for relative or absolute quantitation) is a mass spectrometry technology that allows quantitative comparison of protein abundance by measuring peak intensities of reporter ions released from iTRAQ-tagged peptides by fragmentation during MS/MS. However, current data analysis techniques for iTRAQ struggle to report reliable relative protein abundance estimates and suffer from problems of precision and accuracy. The precision of the data is affected by variance heterogeneity: low signal data have higher relative variability, yet low abundance peptides dominate data sets. Accuracy is compromised as ratios are compressed toward 1, leading to underestimation of the ratio. This study investigated both issues and proposed a methodology that combines the peptide measurements to give a robust protein estimate even when the data for the protein are sparse or at low intensity. Our data indicated that ratio compression arises from contamination during precursor ion selection, which occurs at a consistent proportion within an experiment and thus results in a linear relationship between expected and observed ratios. We proposed that a correction factor can be calculated from spiked proteins at known ratios. Then we demonstrated that variance heterogeneity is present in iTRAQ data sets irrespective of the analytical packages, LC-MS/MS instrumentation, and iTRAQ labeling kit (4-plex or 8-plex) used. We proposed using an additive-multiplicative error model for peak intensities in MS/MS quantitation and demonstrated that a variance-stabilizing normalization is able to address the error structure and stabilize the variance across the entire intensity range. The resulting uniform variance structure simplifies the downstream analysis.
Heterogeneity of variance consistent with an additive-multiplicative model has been reported in other MS-based quantitation including fields outside of proteomics; consequently the variance-stabilizing normalization methodology has the potential to increase the capabilities of MS in quantitation across diverse areas of biology and chemistry.


Introduction
Different techniques are being used and developed in the field of proteomics to allow quantitative comparison of samples between one state and another. These can be divided into gel [1][2][3][4] or mass spectrometry based [5][6][7][8] techniques. Comparative studies have found that each technique has strengths and weaknesses and plays a complementary role in proteomics [9,10].
There is significant interest in stable isotope labelling strategies for proteins or peptides, because every measurement can employ an internal reference, allowing relative quantitation and thereby significantly increasing the sensitivity with which changes in abundance are detected. Isobaric labelling techniques such as tandem mass tags [11,12] or isobaric tags for relative or absolute quantitation (iTRAQ) [13,14] allow multiplexing of six and eight separately labelled samples, respectively, within one experiment. In contrast to most other quantitative [...] generalised logarithm transformation, a transformation that addresses heterogeneity of variance by approximately stabilizing the variance of the transformed signal across its whole dynamic range [34]. Huber et al. provided an open source software package, VSN, which determines the data-dependent transformation parameters. Here we report that the application of this transformation is beneficial for the analysis of iTRAQ data. We investigate the error structure of iTRAQ quantitation data, using different peak identification and quantitation packages, LC-MS/MS data collection systems, and both the 4-plex and 8-plex iTRAQ systems. The usefulness of the VSN transformation to address heterogeneity of variance is demonstrated. Furthermore, we consider the correlations between multiple peptide-level readings for the same protein and propose a method to summarise them into a protein abundance estimate. We consider same-same comparisons to assess the magnitude of experimental variability, and then use a set of complex biological samples whose biology has been well characterised to assess the power of the method to detect true differential abundance. We assess the accuracy of the system with a four-protein mix at known ratios spanning a fold-change range of 1 to 4. From this we propose a methodology to address iTRAQ's accuracy issues. Table 1 summarises the datasets used in this analysis.
Detailed experimental procedures are available in the supplementary information. To evaluate experimental variability in the iTRAQ system we prepared same-same datasets, for which an aliquot of the same sample was labelled with each of the available isobaric tags and then combined prior to peptide separation and quantitation. Same-same datasets were collected for different sample types, quantitation systems, MS/MS systems and for both the 4- and 8-plex labelling systems. To investigate iTRAQ accuracy, an experiment was prepared with a background of proteins at unchanging levels but with the addition of four spiked proteins at known ratios (Table 2). Two of the proteins were present at a ratio of 1:1 to allow data normalisation to adjust for tag differences. To examine the approach on a complex biological system with biological differences, iTRAQ data were collected from yeast grown under various nutritional limiting conditions.
Table 1
[...] Figure 1). At the protein level, this leads to some proteins having only one reading whilst others have hundreds. The majority of these peptide readings are low volume, and hence to maximize the sensitivity of the study it is desirable to keep these peptides for the data analysis (Supplementary Figure 2). The volume distribution arises because sampling of peptides is not random, but rather occurs as a result of a data-dependent selection process in the MS for the high intensity peaks, beyond any exclusion list or dynamic exclusion process applied. This limits iTRAQ to relative level comparison only (i.e. comparing ratios).

Fragmentation behaviour
by guest on May 6, 2020
To assess biases and variability in fragmentation, we examined the ratio between reporter ion maximum intensity and the 145 Da peak maximum intensity in the phosphorylase B dataset, whose high sampling depth allowed analysis at the peptide level. The 145 Da peak arises from incomplete fragmentation and is composed of the balance group attached to the 114-117 Da reporter group. The mass of 145 Da is common to all four of the 4-plex tags. We considered the top 31 sampled peptides (which comprise 50% of the dataset). The data were filtered by removing peptide readings if they contained missing reporter ion values, or if two or more of the reporter ion peak maximum intensities were below 15 counts. First, we found fragmentation efficiency to be peptide dependent; this is shown by the different ratios for different peptides between the reporter ion and the 145 Da peak intensities (Supplementary Data). Second, fragmentation efficiency was consistent across tags within an experiment run (data not shown); this is unsurprising, since with the iTRAQ system, fragmentation of the four reporter ions occurs simultaneously. These results provide further support for preferring relative level comparisons over raw measurements.
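The filtering rule described above can be illustrated with a minimal sketch. It assumes that a missing reporter ion value is represented as `None`; the function name `keep_reading` and the example intensities are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence


def keep_reading(
    reporters: Sequence[Optional[float]],
    min_counts: float = 15.0,
    max_low: int = 1,
) -> bool:
    """Return True if a peptide reading passes the filter.

    A reading is discarded if any reporter ion value is missing, or if
    two or more reporter ion intensities fall below `min_counts`
    (i.e. more than `max_low` low channels).
    """
    if any(value is None for value in reporters):
        return False
    n_low = sum(1 for value in reporters if value < min_counts)
    return n_low <= max_low


# Hypothetical 4-plex readings (reporter ions 114-117).
readings = [
    [120.0, 98.0, 14.0, 110.0],   # one low channel: kept
    [120.0, 9.0, 14.0, 110.0],    # two low channels: discarded
    [120.0, None, 50.0, 110.0],   # missing value: discarded
]
filtered = [r for r in readings if keep_reading(r)]
```

For a 4-plex experiment this is equivalent to requiring at least three reporter ions at or above the threshold.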

Heterogeneity of variance: the variance-mean dependence
The previous sections used the standard i-Tracker filtering method. To understand the variance behaviour fully, all quantified peptides were included in the analyses from this point on. Ratio-intensity plots (RI plots) were used to assess the distribution of ratios as a function of average signal strength (Figure 1). While panels A and B show that the centre of the distribution of log-ratios has no significant intensity-dependent systematic bias, in agreement with the findings of Hu et al. [16], the width of the distribution is significantly larger at low intensities than at high intensities. This heterogeneity of variance has previously been seen in iTRAQ data collected with a 4700 Proteomics Analyzer (Applied Biosystems, USA) and analysed with GPS explorer (Applied Biosystems, USA) [16], and independently in data collected with a QSTAR and analysed with ProQuant v1.1 [21].
Suggested location for figure 1
The logarithm transformation has previously been suggested for iTRAQ data with the objective of addressing the heterogeneity of variance [19,21]. However, Figure 1A and B show that the logarithm transformation does not sufficiently stabilize the variance. To further investigate the error structure, the relationship between the mean and the variance on the log scale for the four tags for each peptide reading after normalisation was assessed for each dataset; a representative case is shown in Figure 2. The plot is consistent with the additive-multiplicative (two-component) error model: a multiplicative component, with a leading exponent of 1 on the log-log plot, dominates at high intensities. At low intensities, the variance tends to a constant, signal-independent value due to an additive component. [...] Figure 3). Furthermore, if we define the usual log-ratio between two peak intensities $I_1$ and $I_2$ as
$$q = \log\frac{I_1}{I_2},$$
then a generalised log-ratio can be defined as
$$h = \log\frac{I_1 + \sqrt{I_1^2 + c^2}}{I_2 + \sqrt{I_2^2 + c^2}}.$$
Here, $c$ is a data-dependent constant; more specifically, it depends on the mean and standard deviation of the additive error component. For values of $I_1$ and $I_2$ both much larger than $c$, the generalised log-ratio simplifies to the usual log-ratio. $h$ is compressed towards 0 compared to $q$, i.e., its absolute value $|h|$ is always smaller than $|q|$. The size of this shrinkage depends on the size of $I_1$ and $I_2$, becoming more pronounced as $I_1$ or $I_2$ get smaller. [...] the assumption that the majority (50% or more) of intensity values are truly not changing in expression.
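As an illustration of the shrinkage behaviour described above, the following sketch compares the ordinary log-ratio with a generalised log-ratio. It assumes the arsinh-based form of the generalised logarithm, log((I1 + sqrt(I1^2 + c^2)) / (I2 + sqrt(I2^2 + c^2))), which equals arsinh(I1/c) - arsinh(I2/c). The function names and the value of c are hypothetical; the VSN package estimates its transformation parameters from the data, which this sketch does not attempt.

```python
import math


def log_ratio(i1: float, i2: float) -> float:
    """Ordinary (natural) log-ratio of two peak intensities."""
    return math.log(i1 / i2)


def glog_ratio(i1: float, i2: float, c: float) -> float:
    """Generalised log-ratio, assuming the arsinh-based form.

    asinh(x / c) = log((x + sqrt(x**2 + c**2)) / c), so the difference of
    two asinh terms gives the ratio form quoted in the lead-in.
    """
    return math.asinh(i1 / c) - math.asinh(i2 / c)


c = 50.0  # hypothetical constant, for illustration only

# Same two-fold ratio at high and at low intensity.
high = (log_ratio(20000.0, 10000.0), glog_ratio(20000.0, 10000.0, c))
low = (log_ratio(40.0, 20.0), glog_ratio(40.0, 20.0, c))
print("high intensity: log-ratio %.4f, glog-ratio %.4f" % high)
print("low intensity:  log-ratio %.4f, glog-ratio %.4f" % low)
```

At intensities far above c the two quantities nearly coincide, while at intensities comparable to c the generalised log-ratio is pulled towards 0.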
The result of the generalised log (variance stabilisation normalisation, VSN) transformation on the Erwinia same-same dataset is shown in Figure 1C and D: the variation of the generalised log-ratio is independent of the signal strength. The coefficient of variation (CV) is frequently used as a measure of variability. Figure 3 shows, with the blue dots, the heterogeneity of the CV as a function of average signal seen with log2 transformed data. In the proteomics community, techniques are frequently compared via single CV summary values [21]. However, as Figure 3 shows, due to the heterogeneity and intensity-dependence of the CV, summaries such as "median CV" are generally too simplistic, and may be misleading. Figure 3 also contrasts this behavior with that of the data transformed by the VSN transformation: their CV is approximately constant, and it coincides with the CV of the logarithm transformed data at high intensity levels. For medium or low intensities, the CV of the VSN transformed data is reduced compared to the logarithm transformed data.
In the Erwinia dataset of Figure 3, the high intensity convergence is not reached, which suggests that the MS data were not collected over the full dynamic range. However, it was seen with the phosphorylase B dataset (Supplementary Figure 4). The phosphorylase B sample is a simpler sample with individual peptides at much higher signal strengths. The CV following VSN transformation was smaller with the phosphorylase B dataset compared to the Erwinia datasets, but was similar between the Erwinia experiments. This is unsurprising as the Erwinia datasets had an additional separation stage (SCX) and were derived from a more complex sample.

From peptides to protein: A complex structure
While the measurements are made at the peptide level, interest often lies at the protein level, and a method is needed to summarize the peptide-level readings into a single, robust relative abundance estimate for each protein. A variety of approaches have been suggested for this task, which differ in how they address the different potential biases and the potentially different amount of confidence (precision) in each peptide-level reading. Here, we first discuss these issues, then present our approach.

Fraction effect
We define a "fraction effect" within a peptide as a significant dependence between the measured ratio and the fraction in which the reading was taken. The top ten sampled peptides from Erwinia datasets B and C were examined for a fraction effect by grouping the VSN transformed data by fraction and using a one-way ANOVA to assess for a significant difference in the mean between groups. Only fractions which were sampled more than three times were included in the analysis. 45% of the peptides had a statistically significant difference between fraction groups, and the percentage of the variance explained by the fraction effect varied between 37.0 and 86.5% (average 57.1%). From this analysis we conclude that, for these peptides, the fraction effect was significant. With the phosphorylase B dataset, no statistically significant differences were seen between repeat injections, indicating that the fraction effect arose not from the repeat injections but rather from the separation (SCX) stage. Note that a fraction effect was also seen with log transformed or raw data when a Kruskal-Wallis test was used (data not shown). These results indicate that the error within a fraction group for a peptide is smaller than the error between fraction groups, and that the additional variance arises not from the repeat injections but from the repeated SCX separation.
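The grouping and variance decomposition described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it assumes that "percentage of the variance explained" refers to the between-group sum of squares divided by the total sum of squares, and the function name, fraction labels, and values are hypothetical. The p-value is not computed here; a library routine such as `scipy.stats.f_oneway` could supply it.

```python
from statistics import fmean


def one_way_anova(groups, min_readings=4):
    """Return (F statistic, fraction of variance explained).

    `groups` is a list of lists of transformed readings, one list per
    fraction. Groups with fewer than `min_readings` values are dropped,
    mirroring the "sampled more than three times" rule in the text.
    """
    kept = [g for g in groups if len(g) >= min_readings]
    values = [v for g in kept for v in g]
    grand_mean = fmean(values)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in kept)
    ss_within = sum((v - fmean(g)) ** 2 for g in kept for v in g)
    df_between = len(kept) - 1
    df_within = len(values) - len(kept)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    explained = ss_between / (ss_between + ss_within)
    return f_stat, explained


# Hypothetical readings for one peptide, grouped by SCX fraction.
by_fraction = {
    "fraction_07": [1.0, 1.1, 0.9, 1.0],
    "fraction_08": [2.0, 2.1, 1.9, 2.0],
    "fraction_09": [5.0, 5.2],  # too few readings, excluded
}
f_stat, explained = one_way_anova(list(by_fraction.values()))
```

The same function applies to the peptide effect if the readings for a protein are grouped by peptide rather than by fraction.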

Peptide effect
We define a "peptide effect" within a protein as a significant dependency between the measured ratio and the precursor ion (i.e. peptide). Due to the fraction effect, insufficient numbers of readings were obtained per peptide to consider a peptide effect for the Erwinia datasets. The phosphorylase B datasets, however, were designed to obtain multiple readings for each peptide, and an ANOVA was used to test for a significant difference in the mean ratio between peptides. Only peptides which were sampled more than three times were included in the analysis. The percentage of the variance explained by the peptide effect varied between 13.1 and 78.5% (average 54.0%) depending on the tag combination examined. This peptide effect was observed for both MS instruments and for all software packages analysed. These results indicate that the measurement error within a peptide group for a protein is smaller than the error between peptide groups.

Intensity effect
As described in the previous section, the variance, and hence the readings' confidence intervals, are different in different parts of the dynamic range. It is uncommon to have a large number of replicate readings for each peptide, hence estimating that variance directly is impractical. We propose applying the VSN variance stabilising transformation, which puts the data on a scale on which intensity effects on the variance are removed, but are traded for an intensity-dependent conservative bias, that is, shrinkage towards ratios of one when the intensities are small.

Data distribution
To investigate the data distribution and ensure the appropriate application of statistical tools, we plotted frequency histograms and normal quantile-quantile (Q-Q) plots for the readings of the top ten sampled peptides from the phosphorylase B dataset after VSN normalisation (Figure 4). The data distributions were localised and uni-modal, resembling a combination of a normal distribution with outliers. This was found for data obtained with or without the i-Tracker standard low volume filter (discard if fewer than 3 of the reporter ions are above a threshold of 15 counts) [39].
Suggested location for figure 4

Estimating the protein ratio
First, we compute a robust central tendency measure for each protein, such as the trimmed average of the VSN-transformed peak intensities of all the peptides belonging to the protein.
Differences between these quantities for different conditions then measure the differential abundance of the protein between the conditions. In doing so, we ignore the fraction and peptide effects described in Sections 3.3.1.1-2, and accept the conservative variance-bias trade-off of the generalised log-ratio described in Section 3.3.1.3. While it is conceivable that a mathematical model could be constructed that explicitly models and adjusts for these effects, such an approach would likely be complicated by unbalanced data structure (Section 3.1.1), often with few readings at each level, and by fragility to outliers and model misspecification. Here, we argue that while ignoring these effects might potentially incur suboptimal estimates, the disadvantage is by far offset in practice, at least with data from current experiments, by the simplicity and robustness of the above approach. Figure 5 shows the CV, at the protein level, of protein abundance estimates, where peptide data were combined with a 20% trimmed mean. The CV was calculated at the protein level using the ratio obtained from the six different possible tag combinations. For comparison, the CVs of protein abundance estimates are also shown when the ordinary logarithm transformation was used instead of VSN's generalised logarithm. With VSN, the CV showed no signal strength dependence and was generally lower than with the logarithm.
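The protein-level summary described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the implementation used in the study: the 20% trim is assumed to be removed from each end of the sorted readings (the convention of R's `mean(x, trim = 0.2)`), and the function names and input values are hypothetical.

```python
from math import floor
from statistics import fmean


def trimmed_mean(values, trim=0.2):
    """Mean after dropping a fraction `trim` of readings from each end.

    With fewer than 1 / trim readings nothing is dropped, so the result
    is the ordinary mean.
    """
    ordered = sorted(values)
    k = floor(len(ordered) * trim)
    return fmean(ordered[k:len(ordered) - k])


def protein_difference(condition_a, condition_b, trim=0.2):
    """Difference of trimmed means of transformed peptide readings.

    The inputs are assumed to be variance-stabilised (glog-scale)
    intensities for all peptides assigned to one protein.
    """
    return trimmed_mean(condition_a, trim) - trimmed_mean(condition_b, trim)


# Hypothetical transformed readings for one protein in two tags.
tag_114 = [8.1, 8.3, 8.2, 8.4, 12.0]   # contains one outlier
tag_116 = [7.1, 7.3, 7.2, 7.4, 7.2]
estimate = protein_difference(tag_114, tag_116)
```

Because nothing is trimmed when a protein has fewer than five readings, the estimate falls back to a standard mean for sparsely sampled proteins.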
Suggested location for figure 5

Selecting a significance threshold
In the simplest situation, iTRAQ is used in a pair-wise comparison [10,18,21,40,41]. A protein is deemed to be differentially abundant if measured ratios exceed a certain threshold. The threshold is chosen such that it encompasses the majority of technical variation in a same-same comparison. This analysis approach assumes that the samples being compared are representative of the population and takes no account of biological variation. The thresholds that encompass 90 and 95% of the experimental variation were found to be reproducible across different tag combinations (Figure 6, Table 3). For the Erwinia datasets, a ±1.1 fold-change threshold encompassed 95% of the experimental variation after a trimmed mean estimation of protein ratio using VSN transformed data was employed. Thus, the experimental variation is so low that proteins with low changes in expression will be detectable in a pair-wise comparison, although the researcher will need to assess whether such a change is biologically significant.
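One way to derive such a threshold from a same-same comparison is sketched below. It is a minimal illustration that assumes the protein-level ratios are held on a log2 scale and that the threshold is the smallest magnitude containing the requested share of readings; the function name and data are hypothetical.

```python
import math


def coverage_threshold(same_same_log_ratios, coverage=0.95):
    """Smallest |log-ratio| that contains `coverage` of the readings."""
    magnitudes = sorted(abs(r) for r in same_same_log_ratios)
    # round() guards against floating-point noise in coverage * n.
    n_inside = math.ceil(round(coverage * len(magnitudes), 9))
    return magnitudes[n_inside - 1]


# Hypothetical same-same protein log2 ratios.
log2_ratios = [i / 100 for i in range(-10, 11) if i != 0]
t95 = coverage_threshold(log2_ratios, 0.95)
t90 = coverage_threshold(log2_ratios, 0.90)
fold_change_95 = 2 ** t95  # back-transform, assuming a log2 scale
```

A protein would then be flagged in a pair-wise comparison when the magnitude of its log-ratio exceeds the chosen threshold.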
Suggested location for figure 6 and table 3

Validation: application to real data
Both a log transformation with ratiometric normalisation and the VSN transformation were applied to data from a biological study comparing yeast grown under various nutritional limiting conditions [42]. The variability of the data from yeast samples was found to be higher than in the Erwinia study (Table 4). For both the VSN and log transformed data, when biological differences were present, they were reflected by the protein ratios (Figure 7).

Suggested location for figure 7
The VSN transformed data identified considerably more proteins as having significant change in expression compared to the log transformed data (Table 5). The greater sensitivity with the VSN method arose from the reduced variability of the peptide readings used to estimate the protein ratio (Supplementary Figure 9), and was also reflected by the lower sensitivity threshold with VSN (Table 4).

Accuracy
The above analysis has focused on the precision of the iTRAQ technology. Concerns have been raised that iTRAQ might have problems with accuracy, by systematically underestimating ratios [23][24][25][26][27]. To assess this question, we prepared a sample with proteins at known ratios. Our findings confirmed that there is systematic ratio under-estimation in iTRAQ quantitation.
However, we observed a linear relationship between the observed and the expected ratio at the protein level over the four-fold range examined (Figure 8). Consistent with under-estimation, the gradient of the linear relationship was less than one and the under-estimation became more obvious for larger ratio changes. This effect was seen both in data collected with a Q-STAR and in data collected with a QTof Premier, suggesting that the effect is ubiquitous and not dependent on the MS technology used. It has been suggested that this under-estimation arises from contaminating peptides with similar m/z ratios during ion selection prior to collision induced dissociation [23,27]. A quantitative model reveals that when the relative amount of contamination is the same within an experiment, a linear relationship between observed and true ratios is expected, and would be independent of signal strength. We observe this in our data: RI plots at the peptide and protein level show no systematic deviations in the ratio observed with signal strength (data not shown). If the relative amount of contamination increases, the under-estimation becomes more pronounced. In fact, this was seen when the isolation width was increased in a study of iTRAQ labelled BSA digest [23]. To investigate the effect of contamination within the ion-selection process, the selection window settings used with the QTof Premier were changed as described in the Methods section. No statistically significant difference was seen between the three settings (data not shown). In agreement with this, the quantitative model predicted that even a two-fold increase in the contamination (10 to 20%) would have only a minor impact on the linear relationship seen (Supplementary Figure 10).
We conclude that we were not able to achieve such strong changes in contamination levels using the ranges of ion selection parameters we employed, suggesting that factors other than the ion selection window give rise to this effect.
The under-estimation could arise from the MS, the protein, the sample complexity, or a mixture of all three. Our results, whilst limited to two proteins changing in ratio, indicate that the under-estimation is independent of the protein. The peptides for a protein were found to be scattered randomly around the estimated ratio, suggesting that a peptide-specific component is not significant in the degree of under-estimation. No difference was seen between the ratios when three different amounts of sample were injected, which suggests that peptide ion abundance is not a crucial component of the degree of under-estimation. While it is conceivable that larger changes in sample complexity might trigger differences, in the system used in this study the sample complexity was reasonably high in all cases, with minimal prefractionation of the peptides by a single short chromatography run prior to MS analysis. Further studies to pinpoint the true source of under-estimation are beyond the remit of this work. Kuzyk et al. (2009) reported that an intensity dependent bias was seen at high ratio changes (≥5:1) with a Q-STAR and was possible with an LC-MALDI TOF/TOF at 10:1 ratio [26]. This bias led to greater under-estimation. For the QTof Premier known ratio mix data, no significant intensity dependent bias was seen (Supplementary Figure 11); however, an intensity dependent deviation in the ratio reported was observed in the Q-STAR data, including peptides at a ratio of 1:1 (Supplementary Figure 12). The bias was not seen in the 1:1 ratio with the Erwinia sample, which has a more typical sample complexity and dynamic range (section 3.1.3). This issue with the Q-STAR needs further study that is beyond the scope of this manuscript, but highlights a need to be cautious with high signal intensity data, which arise when relatively large amounts of a few proteins are labelled using standard protocols.

Discussion
Both accuracy and precision of measurements in quantitative analyses rely on reproducible and exact values being returned from the experiment. The iTRAQ ratio data exhibit heterogeneity of variance, where the variance is higher for low intensity signals. This is a significant problem, as low signals dominate the datasets, and in a typical iTRAQ experiment many proteins have only a few peptide readings. Furthermore, the commonly used requirement of a minimum of two peptides for confident identification of a protein results in the desire to keep as many readings as possible in an analysis. Consequently, methods which discard peptide readings below a threshold significantly limit the depth of proteins sampled in a study. Other methods, such as weighted mean or weighted regression, also aim to address the issue of heterogeneity of variance; however, these methods do not work well for proteins with few peptide readings, which dominate iTRAQ studies.
A two-component error model consisting of an additive and a multiplicative component is proposed to account for the variance structure. The presence of both components was verified with both the 4-plex and 8-plex iTRAQ tag systems, independent of the analytical software and LC-MS/MS instrumentation used. The additive-multiplicative error model suggests that an appropriate data transformation, the so-called generalised logarithm (glog) transformation, will be useful, as it stabilises the variance across the entire intensity range. After such a transformation, the decoupling of the variance from the signal significantly simplifies the downstream analysis, as each peptide reading for a protein can be treated equally. Furthermore, it allows low-intensity readings to be used rather than discarded. In data from a biological system, low intensity readings may be among the most interesting ones, when a peptide is seen at low abundance in some of the biological samples and higher in others.
The price that we pay for using variance stabilisation is that ratios of small peak areas are compressed towards 1 (or, glog-ratios towards 0). This is a conservative effect and is called the "variance-bias" trade-off, where a (hopefully large) improvement in precision is traded for the (hopefully small) cost of a bias. For the datasets of interest, we feel that this trade-off is justified, giving the benefit of being able to include all peptides and having robust estimates for all proteins even if few peptides are present.
The additive-multiplicative error structure has also been reported with quantitation by other MS based methodologies, and the additive component may arise from the integration of count based signal inherent with the majority of MS instrumentation [29,31,32] and/or the presence of a small basal unspecific background signal. As a consequence, heterogeneity of variance is, to varying degree, likely to be an inherent feature of all peptide quantitation methodologies, and estimation that uses the glog transformation may play a useful role for these techniques.
[...] worth considering the distribution of readings for a protein at a peptide level. This has been incorporated into a freely available visualisation package for the R environment that compares expression changes for the peptides from the same protein [13]. In our same-same data, all differences arose purely from technical effects. Some sub-structure was identified in peptide readings where readings from a specific peptide or fraction clustered. Ideally a hierarchical process which takes a central tendency measure at each level would be used to estimate the overall protein ratio. In our opinion there are too few readings in a typical study at each level for this approach to be robust, as outliers would be too influential on the result. We therefore propose the use of a trimmed mean as a robust measure of central tendency for the VSN-transformed peptide readings for a protein, as these readings were found to be uni-modal in distribution with some outliers. In the case of proteins with only a few peptides a standard mean would be calculated, as there is no alternative in this situation. This can be combined with visual inspection to assess whether the assignment of peptides to a parent protein is appropriate.
The simplest iTRAQ experiment is a pair-wise comparison between sample types looking for changes above a threshold determined from experimental variation assessed by looking at a same-same comparison. For both raw and log transformed data, the threshold is difficult to determine as it should have an intensity dependent element. This is complicated even more by the fact that, at the protein level, the estimated protein ratios are obtained from peptides at various intensities; the majority of current methodologies fail to consider this problem. With the VSN transformation this intensity dependence is removed, and 90% and 95% thresholds were found to lead to reproducible results across tag combinations. The thresholds varied with sample type but were low and indicated the sensitivity of the technology to expression changes. In practice, of course, the experimenter will use larger thresholds that also take into account biological variation. The thresholds reported here are not intended as a universal benchmark, and the reality is that for each new system (be it MS, chromatography or sample) a new same-same sample study should be run. If the compared samples are such that most protein abundances are the same across samples, then the distribution of observed glog-ratios can also be used to set the significance threshold. A threshold methodology was applied to a biological study and the iTRAQ findings were in keeping with those published for this system.
Compared to the previous iTRAQ data processing methodologies, we showed that the VSN processed data are more precise and sensitive to detecting changes. The advantages of the VSN methodology will be greatest in situations where hypothesis tests are used to detect changes in expression. Such tests are particularly useful in studies which include biological replicates to ensure the differences highlighted arise from a treatment difference rather than from a sampling effect. Underlying the more powerful hypothesis tests are assumptions such as normality and homogeneity of variance, which tend to be more appropriate with the VSN transformed data.
The study on a sample with known ratios in two independent MS systems confirmed that the iTRAQ technology does have an accuracy problem: ratios tend to be under-estimated. The experiments here, spanning a one- to four-fold ratio, suggest that this effect is independent of signal strength and leads to a linear relationship between the observed and the expected ratio which goes through the origin. Data modelling supports the suggestion of Bantscheff et al. [23] that this under-estimation arises from contamination in the precursor ion selection process and indicates that a linear relationship would be obtained when the proportion of contamination is consistent within an experiment. With this linear relationship, a single correction factor to adjust for this under-estimation can be calculated from readings of known proteins which span a range of expected ratios. We therefore recommend that, for a typical sample, an experiment similar to that described here is carried out and the gradient estimated from the linear relationship is employed as a correction factor for that system. Alternatively, if sample complexity is thought to influence this relationship, we envisage that a kit could be developed consisting of a mixture of proteins at known ratios which are added to samples prior to iTRAQ labelling, and which would allow the calculation of the correction factor.
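The correction factor described above can be illustrated with a least-squares fit constrained to pass through the origin. This is a minimal sketch under stated assumptions: ratios are assumed to be expressed on a log2 scale, so that a 1:1 ratio maps to the origin, and the spiked values, observed values, and function names are hypothetical rather than taken from the study.

```python
def slope_through_origin(expected, observed):
    """Least-squares gradient of observed vs expected, with no intercept."""
    numerator = sum(x * y for x, y in zip(expected, observed))
    denominator = sum(x * x for x in expected)
    return numerator / denominator


def correct_log_ratio(observed_log_ratio, gradient):
    """Rescale an observed log-ratio by the fitted gradient."""
    return observed_log_ratio / gradient


# Hypothetical spiked proteins: two at 1:1, one at 2:1, one at 4:1 (log2 scale).
expected_log2 = [0.0, 0.0, 1.0, 2.0]
observed_log2 = [0.0, 0.0, 0.8, 1.6]   # compressed towards 0

gradient = slope_through_origin(expected_log2, observed_log2)
corrected = correct_log_ratio(1.2, gradient)
```

A gradient below one reflects ratio compression, and dividing by it stretches observed log-ratios back towards their expected values. Whether one gradient holds across samples of differing complexity would need to be checked for each system.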
To support further development in data analysis, raw data for an example same-same study (ErwiniaC), the yeast study, and the spiked study are downloadable from the PRIDE database [45], and the PRIDE identification numbers are provided in supplementary information. Excel spreadsheets containing both the raw and the normalized quantitation data are also available in supplementary information.
In summary, this manuscript proposes methodologies to address the precision and accuracy limitations of iTRAQ. The accuracy issue, arising from contamination during precursor ion selection specific to MS/MS quantitation, can be addressed by calculation of a correction factor from spiked samples, whilst the precision issue can be addressed by the VSN transformation.
This then allows a robust estimation of the ratio at the protein level as all peptides have near equivalent precision. Together these methodologies will allow iTRAQ to provide robust quantitative data even when a protein is quantified from only 2 peptides. The potential application of the VSN method in MS studies is not restricted to iTRAQ quantitation or even to proteomics, as many MS based applications have reported precision problems related to heterogeneity of variance.

[...] dataset has been removed for visual clarity as it had a high CV of 1.5. This analysis was completed on the Erwinia dataset B where no intensity-based data filtering had been performed beyond the peptides being unique for a protein and the peptide being confident in its assignment to a protein.

Tables
Study Type   Sample Type (dataset name)   iTRAQ system   LC-MS/MS