Abstract
The pharmacophore concept is central to the rational drug design and discovery process. Traditionally, a pharmacophore is defined as a specific three-dimensional (3D) arrangement of chemical functional groups found in active molecules that is characteristic of a certain pharmacological class of compounds. Herein, by analogy with 3D pharmacophores, a more general concept of the descriptor pharmacophore is introduced. Descriptor pharmacophores are defined by means of variable selection QSAR as the subset of molecular descriptors that affords the most statistically significant structure-activity correlation. Two variable selection QSAR methods developed in this laboratory are discussed: Genetic Algorithms - Partial Least Squares (GA-PLS) and K-Nearest Neighbors (KNN). Both methods employ multiple topological descriptors of chemical structures, such as molecular connectivity indices or atom pairs (AP), and stochastic optimization algorithms to achieve a robust QSAR model, which is characterized by the highest value of cross-validated R² (q²). By default, the descriptor pharmacophore represents an invariant selection of descriptor types; however, descriptor values are generally different for different molecules. We demonstrate that chemical similarity searches using descriptor pharmacophores, as opposed to using all descriptors, afford more efficient mining of chemical databases or virtual libraries to discover compounds with a desired biological activity.
Keywords: Descriptor Pharmacophores, K-Nearest Neighbors, QSAR model, Quantitative Structure-Activity, toxicity analysis, chemometric methods, LUMO energies, molecular connectivity indices, KNN-QSAR model, Molconn-X program
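
To make the role of q² concrete, the following minimal Python sketch scores one candidate descriptor subset by the leave-one-out cross-validated R² of a k-nearest-neighbor model. It is an illustration under stated assumptions, not the implementation described in this work: the function name knn_loo_q2, the Euclidean distance, the unweighted neighbor average, and the toy data are all hypothetical, and the stochastic search that proposes subsets is omitted.

```python
import math

def knn_loo_q2(X, y, subset, k=1):
    """Leave-one-out q2 of a kNN model that sees only the descriptor
    columns listed in `subset` (hypothetical helper, for illustration)."""
    n = len(y)
    preds = []
    for i in range(n):
        # Distance from compound i to every other compound, measured
        # only in the selected descriptor columns (assumed Euclidean).
        dists = []
        for j in range(n):
            if j == i:
                continue
            d = math.sqrt(sum((X[i][c] - X[j][c]) ** 2 for c in subset))
            dists.append((d, j))
        dists.sort()  # ties are broken by the lower compound index
        nearest = [j for _, j in dists[:k]]
        # Predicted activity: unweighted mean over the k nearest neighbors.
        preds.append(sum(y[j] for j in nearest) / k)
    y_mean = sum(y) / n
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - press / ss_tot

# Toy data (invented): column 0 tracks activity, column 1 is noise.
X = [[0, 5], [1, 0], [2, 4], [3, 1], [4, 5], [5, 0]]
y = [0, 1, 2, 3, 4, 5]

q2_selected = knn_loo_q2(X, y, subset=[0])     # candidate descriptor subset
q2_all = knn_loo_q2(X, y, subset=[0, 1])       # all descriptors
print(q2_selected, q2_all)
```

On this toy set the subset containing only the informative column gives a higher q² than the full descriptor set, which is the criterion a variable selection procedure would use to prefer it.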