Abstract
Molecular docking is often reasonably successful at reproducing the X-ray pose of a ligand in the binding site of a protein, but scoring functions typically fail to rank ligands correctly according to their binding affinity. Using a set of challenging target enzymes, we show that support vector machines (SVMs), trained on the individual energy terms retrieved from docking-based virtual screening (VS) experiments, can improve the discrimination between active and decoy compounds. Actives and decoys were obtained from the Directory of Useful Decoys (DUD) and docked into the target binding sites with AutoDock Vina. The energy terms of Vina's scoring function were used to train classification models with SVM-light. The results show that although Vina offers acceptable pose prediction accuracy for most targets, its scoring function performs poorly compared with our SVM classification models. The superior overall VS performance of the trained classification models confirms the potential of machine learning methods to overcome the limitations of scoring functions in capturing the non-additive relationship between the individual energy terms involved in ligand binding. Altogether, the results illustrate the potential of SVM-based protocols for enabling efficient, fast, and economical virtual high-throughput screening campaigns with freely available docking software.
Keywords: AutoDock Vina, docking, machine learning, virtual screening, scoring function, SVM.
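As an illustration of the data preparation step implied by the protocol summarized above, the following minimal sketch converts per-ligand energy terms into training examples in the SVM-light input format (`<target> <index>:<value> ...`, with 1-based feature indices in increasing order). The term names, the helper `to_svmlight_line`, the ligand identifiers, and all numerical values are hypothetical and chosen only for illustration; they are not taken from this study, and extracting the terms from Vina output is assumed to have been done elsewhere.

```python
# Assumed feature order; the names are illustrative labels for Vina-style
# energy terms, not the exact feature set used in the study.
TERM_NAMES = ["gauss1", "gauss2", "repulsion", "hydrophobic", "hydrogen"]


def to_svmlight_line(label, terms, comment=""):
    """Format one docked ligand as an SVM-light training example.

    label   : +1 for an active compound, -1 for a decoy.
    terms   : dict mapping each name in TERM_NAMES to a float.
    comment : optional identifier appended after '#'.
    """
    if label not in (1, -1):
        raise ValueError("label must be +1 (active) or -1 (decoy)")
    # SVM-light expects 1-based feature indices in increasing order.
    features = " ".join(
        f"{i}:{terms[name]:.5f}" for i, name in enumerate(TERM_NAMES, start=1)
    )
    line = f"{label:+d} {features}"
    if comment:
        line += f" # {comment}"
    return line


# Made-up values for one active and one decoy (illustration only).
ligands = [
    (1, {"gauss1": 62.1, "gauss2": 1105.4, "repulsion": 1.3,
         "hydrophobic": 28.7, "hydrogen": 2.9}, "active_001"),
    (-1, {"gauss1": 48.5, "gauss2": 890.2, "repulsion": 2.1,
          "hydrophobic": 15.0, "hydrogen": 0.4}, "decoy_001"),
]

training_lines = [to_svmlight_line(lbl, t, name) for lbl, t, name in ligands]
```

Writing `training_lines` to a file, one example per line, would give an input that SVM-light's `svm_learn` program could in principle consume; the choice of kernel and other training options is left open here, since the abstract does not specify them.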
Graphical Abstract