Abstract
Background: Quantitative Structure-Activity Relationship (QSAR) methods based on machine learning play a vital role in predicting biological effects.
Objective: Considering the characteristics of the binding interface between ligands and the inhibitory neurotransmitter Gamma-Aminobutyric Acid A (GABAA) receptor, we built a QSAR model for ligands that bind to the human GABAA receptor.
Methods: Using feature selection based on Mean Decrease Impurity, we selected 53 of 1,286 docked ligand molecular descriptors. Three QSAR models were built using a gradient boosting regression tree (GBRT) algorithm, each based on a different combination of docked ligand molecular descriptors and ligand-receptor interaction characteristics.
Results: The features of the optimal QSAR model include both docked ligand molecular descriptors and ligand-receptor interaction characteristics. For the optimal QSAR model, the leave-one-out cross-validation coefficient (Q2 LOO) is 0.8974, the Coefficient of Determination (R2) for the testing set is 0.9261, and the Mean Square Error (MSE) is 0.1862. We also used this model to predict the pIC50 of two new ligands; the differences between the predicted and experimental pIC50 values are -0.02 and 0.03, respectively.
Conclusion: We found that BELm2, BELe2, MATS1m, X5v, Mor08v, and Mor29m are crucial features that help build a more accurate QSAR model.
Keywords: QSAR, GABAA, GBRT, mean decrease impurity, random forests, ligand-receptor interaction characteristics, pIC50.
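The workflow summarized in Methods and Results can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, assumes the Mean Decrease Impurity importances come from a random forest (as the keywords suggest), and runs on synthetic data in place of the real descriptors. The sample size, descriptor count, `n_keep`, and all hyperparameters are hypothetical. Q2 LOO is computed with one common definition, 1 - PRESS / SS, where SS is taken about the training-set mean.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict, train_test_split

# Synthetic stand-in data (assumption): 80 "ligands" x 30 "descriptors",
# with y playing the role of pIC50.
X, y = make_regression(n_samples=80, n_features=30, n_informative=5,
                       noise=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Mean Decrease Impurity: impurity-based importances from a random forest.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
n_keep = 10  # hypothetical; the abstract reports keeping 53 of 1,286
selected = np.argsort(forest.feature_importances_)[::-1][:n_keep]

# Gradient boosting regression tree (GBRT) model on the selected features.
gbrt = GradientBoostingRegressor(n_estimators=100, random_state=0)

# Leave-one-out predictions on the training set, then Q2 = 1 - PRESS / SS.
loo_pred = cross_val_predict(gbrt, X_train[:, selected], y_train,
                             cv=LeaveOneOut())
press = np.sum((y_train - loo_pred) ** 2)
ss_total = np.sum((y_train - y_train.mean()) ** 2)
q2_loo = 1.0 - press / ss_total

# Fit on the full training set and evaluate on the held-out testing set.
gbrt.fit(X_train[:, selected], y_train)
test_pred = gbrt.predict(X_test[:, selected])
r2_test = r2_score(y_test, test_pred)
mse_test = mean_squared_error(y_test, test_pred)

print(f"Q2 LOO = {q2_loo:.4f}, R2 (test) = {r2_test:.4f}, MSE = {mse_test:.4f}")
```

In this sketch the features are selected once, before the leave-one-out loop, so the resulting Q2 LOO can be slightly optimistic; repeating the selection inside each fold would be a stricter variant.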
Graphical Abstract