Abstract
Information about interactions between enzymes and small molecules is important for understanding various metabolic bioprocesses. In this article, we apply a majority voting system that combines several classifiers, including AdaBoost, Bagging and KNN, to predict the interactions between enzymes and small molecules in metabolic pathways. The advantage of this strategy rests on the principle that a predictor based on majority voting usually provides more reliable results than any single classifier. The prediction accuracies obtained on a training dataset and an independent testing dataset were 82.8% and 84.8%, respectively. The prediction accuracy for the networking couples in the independent testing dataset was 75.5%, about 4% higher than that reported in a previous study [1]. The web server for the prediction method presented in this paper is available at http://chemdata.shu.edu.cn/small-enz.
Keywords: Enzyme, small molecule, majority voting, interaction, metabolic pathways, Bagging, KNN, metabolism, glycolysis, oxidative phosphorylation, gluconeogenesis, K-nearest neighbor algorithm, Matthews correlation coefficient, pseudo amino acid composition, amino acid, A-B-K voting system, jackknife test, benchmark dataset, SVM algorithm, 10-fold cross-validation
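
The majority voting principle described in the abstract can be illustrated with a minimal sketch. It assumes three binary classifiers (standing in for AdaBoost, Bagging and KNN) that each emit a label of 1 (interaction) or 0 (no interaction) per enzyme/small-molecule pair. The function names and the example predictions are hypothetical and do not come from the paper; the actual voting scheme used by the authors may differ in detail.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(votes: Sequence[int]) -> int:
    """Return the label chosen by most classifiers for one sample."""
    if not votes:
        raise ValueError("at least one vote is required")
    # With an odd number of binary voters a tie cannot occur.
    label, _count = Counter(votes).most_common(1)[0]
    return label


def vote_all(per_classifier: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier prediction lists into one voted label per sample."""
    # zip(*...) regroups the predictions so each tuple holds one sample's votes.
    return [majority_vote(sample_votes) for sample_votes in zip(*per_classifier)]


# Hypothetical outputs of three classifiers for four enzyme/small-molecule pairs
# (illustrative values only, not data from the study).
adaboost_pred = [1, 0, 1, 0]
bagging_pred = [1, 1, 0, 0]
knn_pred = [0, 1, 1, 0]

voted = vote_all([adaboost_pred, bagging_pred, knn_pred])
print(voted)
```

In this toy example each of the first three pairs is labeled as interacting by two of the three classifiers, so the voted label is 1 even though no single classifier predicts all three; this is the sense in which voting can be more robust than any individual predictor.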