Abstract
Two computational models were built with a support vector machine (SVM) to predict whether a compound is an active or a weakly active strand transfer (ST) inhibitor, based on a dataset of 1257 ST inhibitors of HIV-1 integrase. The model built with MACCS fingerprints gave a prediction accuracy of 91.82% and a Matthews correlation coefficient (MCC) of 0.73 on the test set, and the model built with 40 MOE descriptors gave a prediction accuracy of 93.64% and an MCC of 0.79 on the test set. Molecular properties such as electrostatic properties, van der Waals surface area, hydrogen bond properties and the number of fluorine atoms are important factors influencing the interactions between the inhibitor and the integrase. Scaffolds such as β-diketo acids and their derivatives, naphthyridine carboxamides or their isosteres, and pyrimidinones may play a crucial role in the activity of HIV-1 integrase inhibitors.
Keywords: Classification model, HIV-1 integrase ST inhibitors (HIV INSTI), Kohonen’s self-organizing map (SOM), MACCS fingerprints, MOE descriptors, support vector machine (SVM).
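For readers unfamiliar with the two reported metrics, the following minimal sketch shows how prediction accuracy and the MCC can be computed from binary class labels (1 = active, 0 = weakly active). The function names and the toy labels are illustrative assumptions and are not taken from the dataset or code of this study.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of compounds assigned to the correct class."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 by convention if the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical toy labels, chosen only to illustrate the arithmetic.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, so it remains informative when the active and weakly active classes are of unequal size.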