Abstract
Inhibition of non-structural protein 5B (NS5B) represents an attractive strategy for the treatment of hepatitis C virus (HCV) infection. In this study, machine learning classifiers, including artificial neural network (ANN), support vector machine (SVM), random forest (RF) and decision tree (DT), were used to classify 970 compounds based on their physicochemical properties, including quantum chemical descriptors, constitutional descriptors, functional groups and molecular properties. All classifiers showed good predictive performance, with accuracies ranging from 82.47% to 89.61% on the external validation set. SVM was the best-performing classifier, with the highest accuracy of 89.61%. The analyses were performed on data sets stratified by structural scaffold (nucleoside and non-nucleoside) and bioactivity (active and inactive). In addition, a molecular fragment analysis was performed to investigate the molecular substructures associated with biological activity. Common substructures and potential functional groups governing the activities of active and inactive inhibitors were identified to support the rational design and high-throughput screening of potential HCV NS5B inhibitors.
Keywords: Descriptors, HCV NS5B polymerase inhibitors, Hepatitis C virus, Machine learning method, Molecular fragment analysis.