## Abstract

The need for assay characterization is ubiquitous in quantitative mass spectrometry-based proteomics. Among many assay characteristics, the limit of blank (LOB) and limit of detection (LOD) are two particularly useful figures of merit. LOB and LOD are determined by repeatedly quantifying the observed intensities of peptides in samples with known peptide concentrations and deriving an intensity *versus* concentration response curve. Most commonly, a weighted linear or logistic curve is fit to the intensity-concentration response, and LOB and LOD are estimated from the fit. Here we argue that these methods inaccurately characterize assays where observed intensities level off at low concentrations, which is a common situation in multiplexed systems. This manuscript illustrates the deficiencies of these methods, and proposes an alternative approach based on nonlinear regression that overcomes these inaccuracies. We evaluated the performance of the proposed method using computer simulations and using eleven experimental data sets acquired in Data-Independent Acquisition (DIA), Parallel Reaction Monitoring (PRM), and Selected Reaction Monitoring (SRM) mode. When the intensity levels off at low concentrations, the nonlinear model changes the estimates of LOB/LOD upwards, in some data sets by 20–40%. In absence of a low concentration intensity leveling off, the estimates of LOB/LOD obtained with nonlinear statistical modeling were identical to those of weighted linear regression. We implemented the nonlinear regression approach in the open-source R-based software MSstats, and advocate its general use for characterization of mass spectrometry-based assays.

Assay characterization is important in quantitative bottom-up mass spectrometry-based proteomics. By relating the observed intensities to the concentration of the protein or peptide, it enables quantification of peptides or proteins in clinical research and diagnostics (1, 2), or in systems biology research (3). An important aspect of assay characterization is the estimation of its limit of blank (LOB)^{1}, limit of detection (LOD) and lower limit of quantification (LLOQ), which quantify the sensitivity and the quantitative reproducibility of the assays. The estimation of these figures of merit in mass spectrometry has been necessary, *e.g.* for drug of abuse testing (4) or for clinical diagnostics, *e.g.* with PSA (5). They are particularly important when the detection limits are near the clinical decision limit, as in the case of troponin (6). They are also important for evaluating various mass spectrometric workflows (7). Much technological research aims at estimating and minimizing these values for various experiment types. As of early 2017, over 700 publications indexed on PubMed reported LOB and LOD for mass-spectrometry based experiments. However, there is currently no commonly accepted experimental design or statistical guidelines for deriving these figures of merit, in multiplexed mass spectrometric investigations.

Assay characterization involves spiking peptides or proteins in known concentrations in a simple or complex background or titrating a peptide or protein mixture of known composition. After digestion, separation and spectral acquisition, the observed peak intensities of a peptide are summarized, and related to the true concentrations using statistical models. The statistical models used for this task is the focus of this manuscript.

Most statistical models rely explicitly or implicitly on the *canonical calibration curve* (8), which contains three distinct regimes (Fig. 1*A*). The noise regime, *i.e.* the concentrations where peptide intensities are indistinguishable from the chemical and electronic noise. These intensities are referred to as *background noise*. The linear regime, *i.e.* the concentrations where the canonical calibration curve is an increasing straight line (referred to as *response curve*). The slope of the line is the efficiency of the assay. Finally, the saturation regime, *i.e.* the concentrations where an increase in concentration does not translate into an increase in the intensity of the analyte. The three regimes are defined by the parameters Change and Change_{Sat} in Fig. 1*A*. Because most experiments do not include concentrations in the saturation regime, this manuscript focuses on the noise and on the linear regime. However, the approach proposed in the manuscript can be easily extended to the full canonical calibration curve.

The concepts of LOB and LOD of a peptide are best defined with respect to the canonical calibration curve, as illustrated in Fig. 1*B* (9). If the peptide intensities fall in the linear regime, the LOB is defined as the concentration at which the upper bound of the prediction interval of the background noise intersects the mean intensity of samples with the peptide. The LOD is defined as the concentration at which the upper bound of the prediction interval of the background noise intersects the lower bound of the prediction interval of samples with the peptide. The prediction interval, *i.e.* the confidence interval of a new intensity (as opposed to the confidence interval of the mean response curve) must be used to accurately reflect the performance of a future assay. The lower limit of quantification or quantitation (LLOQ) is loosely defined as the lowest concentration at which the measurements are acceptable for the intended use (10). Because multiple definitions of LLOQ exist (11), this manuscript only presents results for the LOB and LOD. However, the discussion can readily be extended to LLOQ.

When mass spectrometric experiments characterize a limited number of peptides, the concentrations are fine-tuned, so that the intensities fall well inside the linear regime, and the canonical calibration curve is observed. Following the canonical calibration curve, statistical analyses fit a linear regression that relates the concentrations and the observed intensities on the original (*i.e.* not log-transformed) scale. Many methods for fitting linear regression and estimating the LOB, LOD have been proposed (12, 13, 14, 15), compared (9, 16, 17, 18), and implemented in open-source software such as Quasar (19) and Skyline (20). Most methods acknowledge that the variance of the observed intensities increases with the strength of the intensity, and specify regression with unequal variance, or use a robust estimation procedure.

Increasingly however, mass spectrometric assay characterization is multiplexed, and simultaneously quantifies hundreds or even thousands of peptides using, *e.g.* Selected Reaction Monitoring (SRM), or more recently Parallel Reaction Monitoring (PRM) or Data-Independent Acquisition (DIA). In these experiments, it is often impossible to fine-tune the concentrations of each peptide, and some may fall outside of the linear regime. When the number of concentrations or replicates is relatively small, the intensities from the lower portion of the linear regime are like the intensities of the background noise. This creates uncertainty in the shape of the response, and the intensities of the peptide flatten out and produce the *apparent calibration curve*. The apparent calibration curve is illustrated with the dashed line in Fig. 1*A* and using an experimental data set in Fig. 2. Statistical analysis of multiplexed experiments must appropriately deal with these situations, and a linear model is not appropriate.

To account for the nonlinearity of the apparent calibration curve, alternative approaches fitting a logistic response curve (21, 22, 23) have been proposed. However, as illustrated in Fig. 2, they do not fully capture the intensities at low concentrations. Moreover, their estimation fails for peptides that are in linear regime, and for which the intensities do not contain both change points. As a result, the general strategy for the statistical analysis of the entire multiplexed assay characterization experiment is often unclear.

This manuscript proposes to remedy these challenges in highly multiplexed assay characterization experiments. It generalizes the approaches based on linear and logistic curve fits, and enables a more general but intuitive assay calibration. When the concentrations fall entirely in the *linear portion of the dynamic range* (*i.e.* in range of concentrations where the canonical and the apparent calibration curves agree), the output of the proposed approach is identical to that of weighted linear regression. When the concentrations deviate from this range and the observed intensities level off, the proposed approach appropriately represents the canonical calibration curve, while accounting for the uncertainty in the estimation. We evaluated the proposed approach using computer simulations and compared the resulting figures of merit on 11 experimental data sets acquired in SRM, PRM and DIA mode. The evaluations demonstrated that the proposed approach results in a more accurate, and often more conservative, estimation of the figures of merit.

## EXPERIMENTAL PROCEDURES

We evaluated the performance of the proposed approach using eleven experimental data sets, acquired in Selected Reaction Monitoring (SRM), Parallel Reaction Monitoring (PRM), or Data- Independent Acquisition (DIA), as well as using computer simulation. The data sets are summarized in Table I. All the experimental data sets were processed with Skyline (20).

##### Assay Development and Evaluation Experiments

Experiments SRM1-SRM8 consisted of heavy isotope-labeled peptides spiked into a complex background in known and varying concentrations, and systematically quantified together with the endogenous peptides. Experiment SRM1 was a multi-laboratory study of SRM performance for human proteins (CPTAC study 7) (24). Targeted SRM experiments SRM2-SRM7 were all the assays available on the CPTAC assay portal (25) at the time of writing. The DIA experiment is described in detail in Supplementary Section 1.2.

In each data set, areas under the chromatographic curves corresponding to fragments or transitions of a peptide in a run were summed and subjected to natural logarithm transformation. In the case of the DIA data set, only the most intense fragment ions are considered, as described in supplementary Section 1.2. The summed and log-transformed areas were then normalized across runs, by first equalizing the medians of these values among all the endogenous peptides, and then applying the same shifts to the values of all the heavy peptides. Finally, the normalized values were converted back to the original scale. In the following, we refer to these summed and normalized peak areas as the *intensity* of the peptide in the run.

##### Assay Used in Large Clinical Studies

Experiment CL (2) was a mass spectrometric assay of vitamin D binding globulin (VDBG) in human serum. A human serum sample containing VDBG was diluted into a background of chicken serum lacking VDBG. Heavy-labeled standard peptides were spiked in at a fixed concentration in each run. The assay quantified two peptides of the protein, VLEPTLK and ELPEHTVK. The peak area responses and concentration of these two peptides was calculated using a five-point calibration curve. The concentration values of the peptides were averaged to determine the concentration of VDBG. The intensities were normalized to the heavy- labeled standard according to the same procedure as above.

##### Simulated Data Sets

In the experimental data sets, the true LOB/LOD are unknown, and therefore we can only evaluate the impact of different statistical approaches on their estimation. In contrast, simulated data sets can help us evaluate the accuracy of the results. We simulated two data sets, SIM1 and SIM2 with known figures of merit, as described in supplementary Section 1.4.1.

##### Background

To place the proposed approach in the context of existing work, in this section we describe the definitions, notation, and current approaches for assay characterization based on linear (9) and logistic (21) curve fit with unequal variance. We assume that the characterization is performed at the peptide level (*i.e.* after summarizing all the transitions or fragments of the peptide) and describe the model for a single representative peptide.

##### Input Data and Notation

In mass spectrometry-based assay characterization experiments, each peptide is spiked or titrated in *L* samples in known and distinct concentrations *C*_{1} < …< *C*_{L}. The experiments also typically include blank samples used to quantify the background noise. We denote the blank samples by the true concentration *C*_{0} = 0. The *L* + 1 samples are analyzed with a mass spectrometer in technical replicates (*i.e.* runs). The spectral peaks corresponding to the peptide in a run are detected and quantified (*e.g.* by integrating the area under their chromatographic curve, or by any other method of choice) using a data processing tool such as Skyline (20). The peak intensities are then summarized (*e.g.* with the sum of the chromatographic peak areas of all the peptide transitions or fragments) and normalized, to produce a single *intensity* value per replicate run.

The output of data processing and summarization is the input to peptide assay characterization. It consists of (*C _{i}*,

*Y*) pairs of true peptide

_{ij}*concentrations C*,

_{i}*i*= 0, …,

*L*and their summary

*intensities Y*in the replicate run

_{ij}*j*,

*j*= 1, …,

*N*on the original (

_{i}*i.e.*not logarithm transformed) scale. The total number of intensities for a peptide is

*N*= Σ

_{i=0}

^{L}

*N*.

_{i}##### Existing Approaches for Multiplexed Assays

##### Linear Curve Fit

When the concentrations are in the linear portion of the dynamic range, many assay characterizations rely on a linear curve fit on the original (*i.e.* not log-transformed) scale. Most models explicitly or implicitly specify a two-part statistical model, where one part is for the blank samples, and the other for the samples containing the peptide. For the blank samples, the approaches assume that the background noise randomly fluctuates around a constant. For the samples with the peptide, the approaches assume a linear response curve, that relates the observed intensities to the spiked or titrated concentrations. The fluctuations around this expected relationship are assumed to be Normally distributed, with variance depending on the concentration. In the notation of this manuscript, a typical model is written as:
(1)

The existing regression-based approaches differ in their assumptions regarding the variance. Some specify a distinct variance parameter σ_{i}^{2} for each concentration, as in Equation (1) (9). A common alternative (12, 26) assumes that the variance is proportional to *C*_{i}^{2}, i.e.
(2)

The first term of the sum models the variance of the chemical and electronic noise at low concentrations, whereas the second models its increase with concentration. The *C*_{i}^{2} scaling implies the constant signal to noise ratio in the linear regime. Because the pattern in Equation (2) was not observed for the data sets in this manuscript, the results below follow the approach in Equation (1) and in (9).

##### Logistic Curve Fit

Approaches based on logistic curve fit are similar in spirit to the approaches based on a linear curve fit above. They model the mean observed intensities μ_{i} in Equation (1) with a logistic curve. In our notation, a five-parameter logistic curve (27, 28) is expressed as:
(3)

It contains five parameters (29): *a* is the intensity value at zero concentration and in the noise regime, *b* is the intensity value of the saturation regime, and *c* is the midrange concentration between *a* and *b*. Coefficients *d* and *g* control the extent to which the response curve moves from the noise regime to the saturation regime. A four-parameter logistic curve sets *g* = 1 in Equation (3) and restricts the curve to be symmetric around the midpoint . In the following, we will compare the proposed approach to the approach based on the five-parameter logistic curve, as it is most widely used and often generates better quality fit.

##### Figures of Merit

The *limit of blank* (*LOB*) of a peptide is defined as the highest concentration, at which the *expected* mass spectrometric intensity in presence of the peptide is indistinguishable from the *possible* back- ground noise (10, 12). The possible background noise is defined as the 100 × (1 − α)% one sided *prediction interval* of the intensity from the blank sample, which we denote as Noise_{α}. Therefore, in the notation of Equation (1), the LOB is the concentration *C* at which Noise_{α} intersects the response curve Intercept + Slope × *C*. This definition formalizes the fact that at the LOB, the probability of false positive peptide quantification is α, and the probability of false negative peptide quantification is 0.5. The LOB is illustrated with a dot in Fig. 1*B*.

The *limit of detection* (*LOD*) of a peptide is defined as the highest concentration, at which the possible mass spectrometric intensity in presence of the peptide is indistinguishable from the possible background noise (10, 12). The possible mass spectrometric intensity in presence of the peptide is defined as the lower bound of the 100 × (1 − β)% one-sided prediction interval at concentration *C _{i}*. At the LOD, the lower bound intersects the possible background noise Noise

_{α}. This definition formalizes the fact that intensities from any true concentration above the LOD must not be mistaken for background noise. At the LOD, the probability of false positive peptide quantification is α. The probability of false negative peptide quantification is β. The LOD is illustrated with a triangle in Fig. 1

*B*.

The definitions are easily extended to a logistic curve. In this case, the LOB uses the logistic response curve fit, and the LOD uses the lower bound of the prediction interval of the logistic response curve. In all the cases, the parameters α and β are set in advance. Smaller values of α and β lead to wider prediction intervals (30), and to more conservative figures of merit. In the following we systematically set α = β = 0.1 as is commonly done (10).

An alternative definition of the LOD (14) (31) is the concentration *C* at which the response curve has intensity Noise_{α} + where *z*_{1−β} is the (1 − β) × 100 quantile of the standard Normal distribution, σ_{1} is the standard deviation of the sample with the peptide at the lowest concentration. Because this definition does not have a straightforward expression of prediction intervals, we do not use it in what follows.

##### Parameter Estimation and Prediction Intervals

The figures of merit rely on unknown quantities that must be estimated from the data. For the blank samples, an estimate of the expected background noise μ_{0} is obtained as the average intensity Y̅_{0} = . The upper bound of the (1 − α) × 100 prediction interval of the background noise is determined as (32):
(4) where *t*_{1−α,N0−1} is the (1 − α) × 100% percentile of the Student's t distribution with *N*_{0} − 1 degrees of freedom.

For the samples with the peptide, parameters Slope and Intercept in Equation (1) are estimated via weighted least squares (30), as follows. First, the variance parameters σ_{i}^{2} are estimated as the sample variance at each spiked or titrated concentration (9). To increase the resolution beyond the spiked/titrated concentrations, the concentrations are discretized to a finer grid *C*_{1} < *C _{i′}* <

*C*

_{L+L′}. This finer grid contains the original concentrations, augmented with

*L*′ = 100 concentrations equally distributed in log space between the minimum and the maximum of the spiked/titrated concentration values.

The gridded variance parameters σ_{i′}^{2} are estimated by linear interpolation between estimates from the closest observed concentrations. Next, the concentration-specific weights are defined as *w _{i′}* = . Finally, the parameters in Equation (1) are estimated by plugging in the weights. The lower bound of the one-sided (1 − β) × 100% prediction interval is obtained with the closed-form expression (13, 9):
(5) where
Here

*t*

_{1−β,}

_{N}_{−2}is the (1 − β) percentile of the Student's t distribution with

*N*

_{0}− 2 degrees of freedom.

Unfortunately, no closed-form analogy to Equation (5) exists for the logistic curve fit. Therefore, the lower bound of the prediction interval is obtained via bootstrap (33). The bootstrap is an iterative procedure with a large number *B* of iterations, say *B* = 2000. At each iteration, the pairs (*C _{i}**,

*Y**) are obtained by sampling with replacement from the observed pairs

_{ij}*C*). The estimated

_{i}, Y_{ij}*â**,

*b̂**,

*ĉ**,

*d̂** and μ̂

** in Equation (3) are obtained via weighted least squares, while plugging in the full-sample weights*

_{i′}*w*= .

_{i}With the logistic curve fit, the lower bound of the one-sided (1 − β) × 100% prediction interval for a concentration *C _{i′}* is defined as the β × 100% quantile of the mixture of

*B*Normal distributions 𝒩(μ̂

**, σ̂*

_{i′}_{i′}

^{2}) with equal weights. The quantile can be approximated by simulation (34), or evaluated analytically on a grid of augmented concentrations above. Here we chose the grid-based evaluation as it is more computationally efficient, as detailed in Algorithm 1 in supplementary Section 2.1.

Finally, the estimate of LOB is the concentration *C _{i′}* at which the estimated upper bound of the background noise intersects the fitted curve μ̂

*(obtained with either logistic or logistic curve fit). The estimate of LOD is the concentration*

_{i′}*C*at which the estimated upper bound of the background noise intersects the estimated lower bound of the prediction interval , as shown in Fig. 1

_{i′}*B*.

##### Proposed Statistical Approach

##### Modeling the Canonical Calibration Curve

The proposed approach assumes that the systematic relationship between the true concentrations and the observed intensities follows the canonical calibration curve in the noise and linear regimes, as shown in Fig. 1*A*. In other words, it generalizes the linear response μ* _{i}* in Equation (1) by introducing the parameter Change ≥ 0. The intensities from all the concentrations below Change (including the blank samples) are viewed as the background noise, and the intensities of peptides with concentrations above Change fluctuate around the linear response curve. This bilinear model has a “hockey stick” appearance, and is formally written as:
(6)

The parameter Change defines the functional form of the response curve. When Change ≤ *C*_{1}, all the spiked or titrated concentrations of a peptide are within the linear regime, and Equation (6) is simplified to the linear model in Equation (1). When Change > *C*_{1} some of the concentrations are below the linear regime, the observed intensities from these concentrations are indistinguishable from the background noise, and the three-parameter “hockey stick” pattern is appropriate for the peptide. Therefore, the model can accurately represent all the peptides in a multiplexed experiment.

The model in Equation (6) is easily extended to situations where the spiked or titrated concentrations span noise, linear and saturation regimes in Fig. 1*A*. Such extension is detailed in supplementary Section 2.2 and results are shown in supplemental Fig. S4.

##### Parameter Estimation and Model Selection

The parameters μ_{0}, Slope and Change in Equation (6) are unknown and must be estimated from the data. The model is first fit to all the intensities of the peptide using the same weighted least squares estimation procedure as in Parameter Estimation and Prediction Intervals. When the linear model is appropriate, the nonlinear expression in Equation (6) is too complex (*i.e.* is over-parametrized), and results in a substantial uncertainty in the estimate of Change. We quantify this uncertainty with the (1 − γ) × 100% confidence interval. We set γ = 0.2 in what follows.

The confidence interval can be derived approximately using large-sample approximation (35), or by a bootstrap procedure. Because the number of replicate runs in assay characterization experiments is relatively small, here we use bootstrap as described in supplementary Section 2.1. If the confidence interval is located above *C*1, we conclude that the change point is present. Otherwise, we conclude that the observed intensities are consistent with the linear response, and that the simpler linear model in Equation (1) is appropriate. Finally, the selected (linear or “hockey stick”) model is refit to the data using weighted least squares.

##### Prediction Intervals and Figures of Merit

The prediction intervals from this model must reflect the uncertainty in (1) selecting the appropriate model, (2) estimating the unknown parameters of the selected response curve, and (3) random fluctuations of the individual intensities around the response. The combined effect of these three sources cannot be expressed analytically. However, the intervals can be obtained using the bootstrap procedure, like that in Parameter Estimation and Prediction Intervals, but with modifications.

At each bootstrap iteration, the pairs (*C _{i}**,

*Y**) are obtained by sampling with replacement from the observed pairs

_{ij}*C*). Parameters in Equation (6) are estimated via weighted least squares, while plugging in the full-sample weights

_{i}, Y_{ij}*w*= . To reflect the uncertainty because of model selection, the model selection is performed as described in Parameter Estimation and Model Selection, and either a linear or a “hockey stick” model is fit to the resampled data set.

_{i}Fig. 3 shows that the bootstrap iterations can vary substantially in both the selected model and its parameter estimates. Therefore, the response curve is obtained by averaging the estimated response curves across the bootstrap iterations. As can be seen from the figure, the resulting response curve may be highly nonlinear, and may lack the characteristic “hockey stick” pattern. This nonlinear fit reflects the uncertainty in the selection and in the estimation of the parameters of the canonical calibration curve.

Finally, the bounds of the prediction interval are obtained as the quantiles of the mixture of the predictive distributions across the iteration, where the numeric values for each concentration are evaluated on a grid as in Section 3.2.4. The figures of merit are then obtained as in Section 3.2.4. The estimate of LOB is the concentration *C _{i′}* at which the estimated upper bound of the background noise intersects the fitted curve μ̂

*. The estimate of LOD is the concentration*

_{i′}*C*at which the estimated upper bound of the background noise (defined in Equation (4)) intersects the estimated lower bound of the prediction interval.

_{i′}##### Implementation

The proposed approach is implemented as part of the free and open-source R-based package MSstats v3.6 and subsequent versions (http://msstats.org/assay-characterization) (36). The pseudocode for the algorithm is available in supplementary Section 2.1. The computational cost of the estimation procedure is limited, with the most expensive step being the calculation of bootstrap samples. For instance, Fig. 5*B* was obtained on a standard work station with 3.5 GHz CPU in about 1 min. The largest data set in this manuscript, SRM2, was analyzed in ∼6 h. The algorithm can be easily parallelized, and the runtime can be shortened by deploying it on a multi-core computer or on a cluster. The runtime can also be shortened by replacing the bootstrap-based confidence interval of the parameter Change in Equation (6) with an asymptotic confidence interval, but at the expense of an accuracy loss.

##### Evaluation

##### The Proposed Approach Accurately Fit the Data

As illustrated in Fig. 2, both linear and logistic curve fits can fail to simultaneously capture the flattening of peptide intensities at low concentrations, and the linearity of the response curve. In contrast, the proposed approach relies on the canonical calibration curve, and on bootstrap-based estimation, to capture both aspects. Fig. 3 shows the proposed curve fit for the same peptide as in Fig. 2. The curvature of the fit explicitly reflects the uncertainty in the concentration Change, at which the noise regime transitions into the linear response.

Additional examples of a better fit and better interpretation of the proposed approach are shown in Fig. 4 for the SRM2 experiment, and in Fig. 5 for the DIA experiment. As can be seen, the proposed approach led to higher figures of merit as compared with linear or logistic curve fit. This reduced optimism in the performance of the assays may have important implications for their downstream use.

The proposed approach did not always alter the figures of merit. Supplemental Fig. S1 illustrates the fact that, when the concentrations fall entirely within the linear portion of the dynamic range, the proposed approach was identical to that of weighted linear regression. The figure also indicates that the proposed approach was relatively robust to deviations from the linear regime at larger concentrations. The larger concentrations were associated with higher variance and were therefore given a smaller weight during the estimation procedure. Supplemental Fig. S2 presents a case where the proposed approach lowered the estimated figures of merit as compared with an existing approach.

##### The Proposed Approach Improved the Accuracy of LOB/LOD Estimates

As we have seen in Fig. 4 and in Fig. 5, the differences in the model fit between the existing and the proposed approaches impacted the estimates of the figures of merit. Because the true figures of merit in these experiments are unknown, we evaluated the accuracy of the estimates using the simulated data set SIM1.

Fig. 6*A* illustrates the accuracy of fit in one simulation instance. It shows that deviations from linear patterns increased the intercept and decreased the slope of the linear fit, as compared with the exact response curve. Although the logistic fit captured the nonlinearity of the intensities at low concentrations, it deviated from the linear pattern in the linear regime. These situations led to smaller (*i.e.* less accurate, too optimistic) figures of merit.

The accuracy of the proposed approach over all the simulation instances is reported in Table II and illustrated in Fig. 6*B*. Both linear and logistic curves dramatically underestimated the true figures of merit. On average, for the simulated data set, the linear curve underestimated the LOB by 63% and the LOD by 25%. The 5-parameter logistic curve fit underestimated the LOB by 53% and the LOD by 6%. In contrast, the proposed approach was slightly conservative, on average it overestimated the LOB by 1% and the LOD by 4%. However, it substantially reduced both the magnitude and the variability of the estimation error. This indicates that the estimates by the proposed approach for the experimental data sets are likely closer to the true values than those by linear and logistic fits.

Supplemental Fig. S3 shows that when the number of grid points used to discretize the spiked or titrated concentrations was sufficiently large, it had a relatively small impact on the accuracy of the results. The insufficient number of discretized concentrations can be diagnosed using plots such as supplemental Fig. S3*A*, which display overly jagged curves. As shown in supplemental Figs. 3*B* and S3*C*, this issue can be corrected by using a finer grid.

Finally, supplemental Fig. S4 uses simulated data sets SIM2 and SIM3 to illustrate that the proposed approach accurately estimates the figures of merit for assays with various combinations of noise, linear and saturation regimes.

##### The Proposed Approach Produced More Conservative Figures of Merit in the Assay Development and Evaluation Experiments

Fig. 7 compares the figures of merit obtained with linear, logistic and proposed approaches for all the assay development and evaluation experiments. Because the true values for these data sets are unknown, the figure plots ratios of the estimates by the logistic and the proposed approaches to the estimates of the linear fit.

As compared with the linear fit, the proposed approach substantially reduced the optimism regarding the figures of merit for all the data sets. As compared with the logistic curve fit, it reduced the optimism for many data sets. The extent of the changes depended on the number of peptides for which intensities deviate from the linear pattern, and on the type of these deviations. A larger increase in the LOB and LOD was associated with a larger proportion of peptides with nonlinear patterns. For example, in the PRM data set 84% of peptides had nonlinear patterns (*i.e.* more than 50% of the bootstrap iterations selected the “hockey stick” model with the proposed approach). The proposed approach increased the LOB and LOD by on average 40% as compared with the linear fit. In contrast, data set SRM1 had only 14% of peptides with nonlinear patterns, and the proposed approach did not produce substantial changes in the figures of merit.

At the same time, the proposed approach did not always produce more conservative figures of merit. supplemental Fig. S2 illustrates one such a case for a peptide from data set SRM7. The proposed approach produced smaller (*i.e.* more optimistic) figures of merit in this case. This behavior may be associated with intensities that deviate substantially from the canonical calibration curve, *e.g.* in presence of interferences or other data quality issues. In the assay development and evaluation experiments, such situation was observed on average in 10% of the peptides over all the data sets.

Supplemental Fig. S5 illustrates these patterns for the DIA experiment. It shows that the nonlinearity of a response curve of a peptide is strongly associated with the difference between the proposed and linear model-based estimates of LOB and LOD.

##### The Proposed Approach Produced Conservative Figures of Merit in the Clinical Research Assay

We applied the proposed approach to the clinical research assay experiment. This data set is representative of clinical research assays in that it is not multiplexed, contains many different concentrations, and many replicates as compared with the assay development and evaluation experiments described previously in this manuscript.

The linear fit for this data set is illustrated in Fig. 8*A*. The linear curve appears to fit well overall. However, a zoom into the lowest concentrations shows that it failed to accurately represent this region, which is of our main interest. The logistic curve fit and the fit by the proposed approach are illustrated in Fig. 8*B*. Because of the lack of intensities in the saturation range, the logistic curve fit was unstable. It failed to adequately capture the leveling off at low concentrations and resulted in an overly wide prediction interval. In contrast, the proposed approach accurately represented the data, facilitated the interpretation in the context of the canonical calibration curve, and provided reasonable estimates of the figures of merit.

Other quantities, in addition to LOB and LOD, are often of interest in clinical research assays. These include lower limit of quantification (LLOQ). In this manuscript, we used a definition of LLOQ, which is based on a linear curve fit, and is limited to the concentrations in the experiment. The definition and the details of the calculation for this data set are described in supplementary Section 3.2. The procedure resulted in LLOQ = 1 μg/ml. Although this quantity is not necessarily related to the LOD, we note that it was below the value obtained by the proposed approach (LOD = 1.03) and may therefore be potentially be too optimistic.

## DISCUSSION

We introduced a novel statistical approach, which is specifically designed for characterization of multiplexed mass spectrometry-based assays. It expands upon the linear curve fit between the observed peptide intensities and known concentrations by directly incorporating the concept of the canonical calibration curve.

When peptide concentrations fall entirely within the linear portion of the dynamic range, the fit by the proposed approach is identical to that of weighted linear regression. However, when peptide concentrations are small, and their intensities flatten out, the proposed approach results in a more accurate and a more interpretable fit. This is particularly important in multiplexed experiments, where it is impossible to ensure that the concentrations of all the assayed peptides are in the linear portion of the dynamic range.

The proposed approach improves the accuracy of the figures of merit as compared with the linear and to the logistic curve fits. When peptide concentrations are below the linear portion of the dynamic range, the proposed approach typically results in more conservative figures of merit. This may have important implications regarding the usability of the assay. At the same time, the concentrations above the linear portion of the dynamic range do not have a strong impact on the quality of the fit. This is because intensities from high concentrations have higher variance and have a smaller weight during the estimation. If needed, extensions of the proposed approach to tri-linear fits can be used to capture concentrations exceeding the linear regime.

The proposed approach is compatible with any strategy for quantifying a peptide in a run. In this manuscript, we chose to sum the intensities of all the transitions or fragments of a peptide that are reported by data processing, because this approach is simple, frequently used, and does not require additional software or manual intervention. However, other strategies are also acceptable. For example, quantification can be performed using top transitions or fragments. The transitions or fragments can be manually or automatically curated to remove interferences. If a curation procedure is used, it must be incorporated into every subsequent use of the assay to avoid overly optimistic characterization of its performance.

The proposed approach can be used for absolute quantification. A byproduct of the estimation procedure is a confidence interval for the parameter Change, which determines the lower bound of the linear portion of the dynamic range. The fit obtained by the proposed approach for concentrations above the confidence interval, or a separate linear fit limited to the concentrations above the confidence interval, can be used to relate the observed intensities of the peptide to the true concentrations.

Because the proposed approach discretizes the true concentrations, its resolution is not limited to the concentrations used in the experimental design. Therefore, it can help compare the performance of alternative assays or workflows for a same peptide. Moreover, discrepancies between the figures of merit by the proposed approach and by the linear fit, in cases where the proposed approach produces lower estimated values, can be used to automatically screen the assays for interferences, or for other data quality issues, which can be subsequently manually inspected.

Overall, we believe that the proposed approach offers multiple benefits, and advocate its general use for characterization of multiplexed mass spectrometry-based assays.

## DATA AVAILABILITY

The raw files for dataset DIA can be downloaded from Chorus (project 1379) at the following url: https://chorusproject.org/pages/index.html. The Skyline file of experiment CL can be downloaded from the following Panorama repository at https://panoramaweb.org/labkey/vUSnYr.url All the other datasets examined in this article are publicly available on the CPTAC assay portal: https://assays.cancer.gov.

## Acknowledgments

We thank Ching-Yun Chang for valuable discussions and feedback and Meena Choi for help with the implementation.

## Footnotes

Author contributions: C.G. and O.V. designed research; C.G. performed research; C.G., J.E., and L.P. contributed new reagents/analytic tools; C.G., J.E., S.E.A., C.M.H., and L.P. analyzed data; C.G. and O.V. wrote the paper; J.E., S.E.A., and C.M.H. conducted experiments; M.J.M. and A.N.H. supervised experiments.

↵* The work was supported in part by the US National Science Foundation CAREER award DBI-1054826 to O.V.

↵

^{}This article contains supplemental material.↵

^{1}The abbreviations used are:- LOB
- limit of blank
- DIA
- data-independent acquisition
- LOD
- limit of detection
- LLOQ
- lower limit of quantitation/quantification
- PRM
- parallel reaction monitoring
- SRM
- selected reaction monitoring.

- Received September 13, 2017.
- Revision received January 16, 2018.

- © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.