Abstract
Feature selection has become increasingly important for quantitative structure-activity relationship (QSAR) studies. In the present article, we evaluate three state-of-the-art feature selection algorithms, namely mutual information (MI), genetic algorithm (GA), and support vector machine regression (SVR)-based recursive feature elimination (SVR-RFE), for reducing the high-dimensional feature space in QSAR regression. We used SVR to evaluate the performance of these feature selection algorithms. In addition, we present a simple but very efficient iterative strategy for optimizing the parameters of the SVR-RFE algorithm. All three algorithms can effectively reduce the number of features and often achieve improved performance.
Keywords: Feature selection, quantitative structure-activity relationship, regression, support vector machine
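
As a rough illustration of the simplest of the three approaches named above, the sketch below ranks features by their estimated mutual information with the regression target. It is not the authors' implementation: the equal-width binning, the default of four bins, and the function names (`equal_width_bins`, `mutual_information`, `rank_features_by_mi`) are all assumptions made for this example, and the estimator is the plain plug-in estimate on discretized values.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins=4):
    """Map continuous values to integer bin indices (assumed discretization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; put everything in one bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp so that the maximum value falls into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x_bins, y_bins):
    """Plug-in estimate of I(X; Y), in nats, from two discrete sequences."""
    n = len(x_bins)
    joint = Counter(zip(x_bins, y_bins))
    px = Counter(x_bins)
    py = Counter(y_bins)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p(a, b) / (p(a) * p(b)) written in terms of raw counts.
        mi += p_ab * math.log(p_ab * n * n / (px[a] * py[b]))
    return mi


def rank_features_by_mi(columns, target, n_bins=4):
    """Return (order, scores): feature indices sorted by decreasing MI."""
    y_bins = equal_width_bins(target, n_bins)
    scores = [
        mutual_information(equal_width_bins(col, n_bins), y_bins)
        for col in columns
    ]
    order = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return order, scores
```

Here `columns` is assumed to be a list of descriptor columns (one list of floats per feature) and `target` the list of measured activities. Keeping the top-ranked indices from `order` gives a filter-style reduction of the feature space; unlike GA or recursive feature elimination, this ranking scores each feature independently and does not account for redundancy between features.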