Abstract
Background: The classification of microarray gene expression data is an important research topic in bioinformatics.
Objective: Basic classification methods often perform unsatisfactorily on such data, and research on ensemble classifiers has shown that combining classifiers is an effective way to increase classification accuracy.
Method: In this paper, we propose a new diversity-based classification method that combines a clustering-based feature selection method with the ensemble classifier D3C to improve classification accuracy. D3C is a novel ensemble method that uses ensemble pruning based on k-means clustering and on dynamic selection and circulating combination, with the aim of obtaining diversity among classifiers.
Results & Conclusion: We apply the proposed method to seven gene data sets. Compared with prior research, the experimental results show that our method outperforms other ensemble classifiers in accuracy for gene classification.
Keywords: Gene expression data, feature selection, selective ensemble learning, clustering, diversity.
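
The idea of pruning an ensemble by clustering, mentioned in the Method paragraph, can be illustrated with a minimal sketch. This is not the D3C algorithm itself. It assumes that each base classifier is represented by its vector of predicted labels on a validation set, that classifiers are grouped by a plain k-means over those vectors, and that the most accurate classifier of each cluster is kept. The function names `kmeans_labels` and `prune_ensemble` are hypothetical, as is the farthest-point initialisation used to make the sketch deterministic.

```python
from typing import List, Sequence


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points: List[Sequence[float]], k: int, n_iter: int = 20) -> List[int]:
    """Plain k-means; returns one cluster index per point.

    Assumption: deterministic farthest-point initialisation, starting
    from the first point, instead of random seeding.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(_sq_dist(p, c) for c in centroids))
        centroids.append(list(far))

    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: _sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def prune_ensemble(
    predictions: List[Sequence[int]], y_true: Sequence[int], k: int
) -> List[int]:
    """Return indices of the classifiers kept after clustering-based pruning.

    predictions[i] holds the labels predicted by classifier i on the
    validation set. Classifiers with similar prediction vectors fall in
    the same cluster, and only the most accurate one per cluster is kept,
    so the retained classifiers tend to disagree with each other.
    """
    accuracy = [
        sum(p == t for p, t in zip(pred, y_true)) / len(y_true)
        for pred in predictions
    ]
    labels = kmeans_labels(predictions, k)
    kept = []
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            kept.append(max(members, key=lambda i: accuracy[i]))
    return sorted(kept)
```

Clustering on prediction vectors is only one way to measure similarity between classifiers; the sketch uses it because it needs nothing beyond the validation predictions.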