Abstract
Acetylcholinesterase has long been considered a target for Alzheimer's disease therapy. In this work, several classification models were built to distinguish acetylcholinesterase inhibitors (AChEIs) from decoys. Each molecule was initially represented by 211 ADRIANA.Code descriptors and 334 MOE descriptors. Correlation analysis, the F-score, and attribute selection methods in Weka were each used to find the best reduced set of descriptors. Models were built using a Support Vector Machine (SVM) and evaluated by 5-fold, 10-fold, and leave-one-out cross-validation. The best model gave a Matthews Correlation Coefficient (MCC) of 0.99 and a prediction accuracy (Q) of 99.66% on the test set. It also performed well on an external test set of 86 compounds (Q=96.51%, MCC=0.93). The descriptors selected by our models suggest that H-bond and hydrophobic interactions are important for distinguishing AChEIs from decoys.
Keywords: Acetylcholinesterase inhibitor (AChEI), correlation analysis, F-score, support vector machine (SVM), Weka, Alzheimer's disease, H-bond, hydrophobicity, drug discovery process
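
For readers unfamiliar with the two performance measures quoted in the abstract, the following is a minimal sketch of how prediction accuracy (Q) and the Matthews Correlation Coefficient (MCC) can be computed from a binary confusion matrix, using the standard textbook definitions. The function names and the example counts are assumptions made for illustration only. The source does not report the confusion matrix of the external test set, so the counts below are hypothetical and merely show that 83 correct predictions out of 86 correspond to Q=96.51%.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Prediction accuracy Q: fraction of correctly classified compounds."""
    total = tp + tn + fp + fn
    return (tp + tn) / total


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal sum is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts for an 86-compound set (NOT taken from the source):
    # 40 inhibitors and 43 decoys classified correctly, 3 errors in total.
    tp, tn, fp, fn = 40, 43, 1, 2
    print(f"Q   = {accuracy(tp, tn, fp, fn) * 100:.2f}%")
    print(f"MCC = {mcc(tp, tn, fp, fn):.2f}")
```

Unlike Q, the MCC takes all four cells of the confusion matrix into account, so it remains informative when the numbers of inhibitors and decoys are unbalanced.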