Abstract
Aims and Objective: Redundant information in microarray gene expression data makes cancer classification difficult. Hence, it is important to find appropriate ways to select informative genes for better identification of cancer. This study presents a hybrid feature selection method, mRMR-ICA, which combines minimum redundancy maximum relevance (mRMR) with the imperialist competition algorithm (ICA) for cancer classification.
Materials and Methods: The proposed algorithm, mRMR-ICA, uses mRMR as a preprocessing step to remove redundant genes, which yields a smaller dataset on which ICA performs feature selection. A support vector machine (SVM) is used to evaluate the classification accuracy of the selected genes. The fitness function combines classification accuracy and the number of selected genes.
Results: Ten benchmark microarray gene expression datasets are used to test the performance of mRMR-ICA. Compared with the original ICA and other evolutionary algorithms, mRMR-ICA achieves better cancer classification accuracy and selects fewer informative genes.
Conclusion: The comparison results demonstrate that mRMR-ICA effectively removes redundant genes, so that the algorithm selects fewer informative genes while achieving better classification results. It can also shorten computation time and improve efficiency.
Keywords: Imperialist competition algorithm, minimum redundancy maximum relevance, support vector machine, microarray gene expression data, feature selection, cancer classification.
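
Illustrative sketch: The abstract states only that the fitness function includes classification accuracy and the number of selected genes; it does not give the formula. The Python sketch below shows one common way such a two-term fitness could be written, as a weighted sum. The function name `fitness`, the `weight` parameter and its default of 0.9, and the weighted-sum form itself are assumptions made for illustration, not the authors' definition. The accuracy is taken as an input, on the assumption that it would come from evaluating an SVM on the selected genes.

```python
from typing import Sequence


def fitness(accuracy: float, mask: Sequence[bool], weight: float = 0.9) -> float:
    """Score a candidate gene subset; higher is better.

    accuracy: classification accuracy in [0, 1], assumed to be computed
              elsewhere (e.g. by an SVM on the selected genes).
    mask:     one flag per candidate gene, True if the gene is selected.
    weight:   hypothetical trade-off between accuracy and subset size.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if not mask:
        raise ValueError("mask must contain at least one gene")

    n_selected = sum(bool(flag) for flag in mask)
    if n_selected == 0:
        # An empty subset cannot be classified, so give it the worst score.
        return 0.0

    # Fraction of candidate genes that were discarded (larger is better).
    reduction = 1.0 - n_selected / len(mask)
    return weight * accuracy + (1.0 - weight) * reduction
```

Under this assumed form, two subsets with equal accuracy are ranked by size, so the search is pushed toward smaller gene subsets only when accuracy does not suffer much.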