Submitted on September 5, 2002
Revised on November 6, 2002
Accepted on November 6, 2002
A new method to estimate ligand-receptor energetics
Joel R. Bock and David A. Gough
Bioengineering, University of California, San Diego, La Jolla, CA 92093-0412
Corresponding Author: dgough{at}bioeng.ucsd.edu
In the discovery of new drugs, lead identification and optimization have assumed critical importance, given the number of drug targets generated from genetics, genomics and proteomics technologies. High-throughput experimental screening assays have recently been complemented by virtual screening approaches that identify and filter potential ligands, given characteristics of a target receptor structure of interest. Virtual screening requires a reliable procedure for automatically ranking structurally distinct ligands in compound library databases. Computing a rank score requires accurate prediction of the binding affinities between these ligands and the target. Many current scoring strategies require information about the three-dimensional structure of the target. In this research, a new method to estimate the free binding energy between a ligand and receptor is proposed. We extend a previously reported idea [4, 5, 6] in which biomolecules are represented by simple descriptors and used as input examples to train a support vector machine (SVM) [29], and the trained system is then applied to previously unseen pairs to estimate their propensity for interaction. Here, we seek to learn the function that maps features of a receptor-ligand pair onto their equilibrium free binding energy. These features do not include any direct information about the three-dimensional structures of ligand or target. In cross-validation experiments, it is demonstrated that objective measurements of prediction error rate and rank-ordering statistics are competitive with several other investigations, most of which depend on three-dimensional structural data. The size of the sample (n=2,671) suggests that this approach is robust and may have widespread applicability beyond restricted families of receptor types.
It is concluded that newly sequenced proteins, or those for which three-dimensional crystal structures are not easily obtained, can be rapidly analyzed for their binding potential against a library of ligands using this methodology.
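The two kinds of quantity named above, an equilibrium free binding energy and the error and rank-ordering statistics used to judge predictions of it, can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the text: the free energy is taken from a dissociation constant through the standard relation ΔG = RT ln(Kd), with Kd in mol/L and a 1 M standard state; root-mean-square error stands in for the prediction error rate; and Spearman's rank correlation stands in for the rank-ordering statistic. All function names are hypothetical, and this is not the authors' implementation.

```python
import math

# Gas constant in kcal/(mol*K); the choice of units is an assumption of this sketch.
R_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard free energy of binding (kcal/mol) from a dissociation constant.

    Uses dG = R * T * ln(Kd), so tighter binders (smaller Kd) give more
    negative values.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted energies."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def average_ranks(values):
    """1-based ranks of the values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))
```

In a virtual screening setting, `spearman` applied to measured and predicted energies indicates how well a scoring function preserves the ordering of ligands, which matters more for ranking a compound library than the absolute error reported by `rmse`.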