Abstract
Virtual screening by molecular docking has become established as a method for drug lead discovery and optimization. All docking algorithms make use of a scoring function in combination with a method of search. Two theoretical aspects of scoring function performance dominate operational performance. The first is the degree to which a scoring function has a global extremum within the ligand pose landscape at the proper location. The second is the degree to which the magnitude of the function at the extremum is accurate. Presuming adequate search strategies, a scoring function's location performance will dominate behavior with respect to docking accuracy: the degree to which a predicted pose of a ligand matches experimental observation. A scoring function's magnitude performance will dominate behavior with respect to screening utility: enrichment of true ligands over non-ligands. Magnitude estimation also controls pure scoring accuracy: the degree to which bona fide ligands of a particular protein may be correctly ranked. Approaches to the development of scoring functions have varied widely, with a number of functions yielding similarly high levels of performance relating to the location issue. However, even among functions performing equally well on location, widely varying performance is observed on the question of magnitude. In many cases, performance is good enough to yield high enrichments of true ligands versus non-ligands in screening across a wide variety of protein types. Generally, performance is not good enough to correctly rank true ligands relative to one another. Strategies for improvement are discussed.
Keywords: Docking, scoring, free energy, PMF, machine learning, Surflex, Glide, GOLD