Abstract
Introduction: One of the most challenging tasks in computer-aided diagnosis (CAD) is mapping radiological features to image descriptors. If the chosen descriptors cannot discriminate one disorder from another, the CAD system may offer little benefit.
Methods: A better approach is to extract all features thought to be significant for discrimination and then apply a feature selection algorithm that retains only those that actually are, thereby avoiding the problems of high-dimensional data. In this paper, a distance-based genetic algorithm (DGA) is developed for feature selection. Within each class, DGA uses fifty percent of the training data as the training set and the remaining fifty percent as the validation set. It then applies a genetic algorithm (GA) to minimize an objective function defined as the sum of the squared deviations of each sample in the training set of each class from each sample in the validation set of the corresponding class. The proposed algorithm has been tested in a CAD system.
Results: The performance of the CAD system that uses the proposed DGA for feature selection was compared with that of a system that uses features selected by differential evolution and a statistical repair mechanism (DEFS) and that of a system that does not use feature selection.
Conclusion: The system that uses DGA achieves an accuracy of 88.16%, against 83.47% for the system that uses the DEFS algorithm and 86.46% for the system that does not use feature selection.
Keywords: Computer-aided diagnosis, feature selection, genetic algorithm, k-nearest neighbor classifier, lung disorder, objective function.
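As an illustration of the objective function described in the Methods, the following is a minimal Python sketch under one possible reading of the abstract. The function name `dga_objective`, the binary feature-mask encoding, the dictionary layout of the per-class data, and the toy values are all assumptions made for this example and are not taken from the authors' implementation. The genetic search itself is omitted; only the quantity it would minimize is shown.

```python
def dga_objective(mask, train_by_class, valid_by_class):
    """Sum of squared deviations between every (training, validation) pair
    of samples in the same class, restricted to the selected features.

    mask            -- sequence of 0/1 flags, one per feature (assumed encoding)
    train_by_class  -- dict: class label -> list of training feature vectors
    valid_by_class  -- dict: class label -> list of validation feature vectors
    """
    selected = [j for j, keep in enumerate(mask) if keep]
    total = 0.0
    for label, train_rows in train_by_class.items():
        for x in train_rows:
            for y in valid_by_class[label]:
                total += sum((x[j] - y[j]) ** 2 for j in selected)
    return total


# Hypothetical toy data: two classes, two features per sample.
train = {"A": [[0.0, 5.0], [1.0, 9.0]], "B": [[10.0, 2.0]]}
valid = {"A": [[0.0, 1.0]], "B": [[11.0, 7.0]]}

score_f0 = dga_objective([1, 0], train, valid)  # keep only feature 0
score_f1 = dga_objective([0, 1], train, valid)  # keep only feature 1
print(score_f0, score_f1)  # 2.0 105.0
```

In this toy case feature 0 varies little within each class, so the mask that keeps it scores lower and would be preferred by a minimizer. Under this reading an empty mask trivially scores zero, so a practical search would presumably need some constraint, such as a fixed number of selected features; the abstract does not say how this is handled.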
Graphical Abstract