Abstract
Background: Most pattern recognition and data mining problems, such as classification, clustering and retrieval, rely on a distance metric or similarity measure. The metric used in such problems should reflect the relationships among the data effectively. Different features contribute differently to the discrimination of different classes of images. This is especially true in medical image classification, where it is difficult to relate radiological signs to image features and hence to determine the relative importance of the different image features.
Methods: In this paper, we propose novel metrics that apply a different weight to the deviation of each feature. The weights are used together with the Euclidean distance and the city block distance to derive the weighted Euclidean distance (WED) and the weighted city block distance (WCBD), respectively.
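The two weighted metrics can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the forms below are an assumption: the weight multiplies the squared deviation in WED and the absolute deviation in WCBD. The function names and sample vectors are hypothetical.

```python
import numpy as np

def weighted_euclidean(x, y, w):
    """Assumed WED form: sqrt(sum_i w_i * (x_i - y_i)^2)."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    return float(np.sqrt(np.sum(w * (x - y) ** 2)))

def weighted_city_block(x, y, w):
    """Assumed WCBD form: sum_i w_i * |x_i - y_i|."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    return float(np.sum(w * np.abs(x - y)))

# Illustrative feature vectors and weights (not taken from the paper).
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 3.0]
w = [1.0, 0.25, 2.0]

wed = weighted_euclidean(x, y, w)    # sqrt(1*1 + 0.25*4 + 2*0)
wcbd = weighted_city_block(x, y, w)  # 1*1 + 0.25*2 + 2*0
```

With all weights set to 1, both functions reduce to the ordinary Euclidean and city block distances, which is why the unweighted metrics serve as a natural baseline.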
Result: The weights are obtained using a genetic algorithm that searches for the optimal features for classification. The weights represent the relative significance of the features in the classification of diseases. The proposed adaptive weighted deviation based metrics (AWDMs) have been tested by using them in a computer aided diagnosis (CAD) system for the diagnosis of lung disorders.
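A weight search of this kind might look roughly like the sketch below. It is not the authors' procedure: the abstract does not specify the classifier, fitness function or genetic operators, so this example assumes a leave-one-out 1-nearest-neighbour accuracy as fitness, tournament selection, uniform crossover, Gaussian mutation and elitism. All names, parameter values and the synthetic data are hypothetical.

```python
import numpy as np

def loo_1nn_accuracy(X, y, w):
    """Leave-one-out 1-NN accuracy under an assumed weighted Euclidean distance."""
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(w * diff ** 2, axis=2)  # squared distances suffice for ranking
    np.fill_diagonal(d2, np.inf)        # a sample may not match itself
    return float(np.mean(y[np.argmin(d2, axis=1)] == y))

def evolve_weights(X, y, pop_size=20, generations=30, mut_sigma=0.1, seed=0):
    """Toy genetic algorithm over weight vectors in [0, 1]^n_features."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    pop = rng.random((pop_size, n_feat))
    pop[0] = 1.0  # include the unweighted metric as one candidate
    for _ in range(generations):
        fit = np.array([loo_1nn_accuracy(X, y, w) for w in pop])
        elite = pop[np.argmax(fit)].copy()
        # Binary tournament selection.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(fit[a] >= fit[b], a, b)]
        # Uniform crossover with randomly permuted mates.
        mates = parents[rng.permutation(pop_size)]
        mask = rng.random((pop_size, n_feat)) < 0.5
        children = np.where(mask, parents, mates)
        # Gaussian mutation, clipped back into [0, 1].
        children = np.clip(children + rng.normal(0.0, mut_sigma, children.shape), 0.0, 1.0)
        children[0] = elite  # elitism: the best individual always survives
        pop = children
    fit = np.array([loo_1nn_accuracy(X, y, w) for w in pop])
    best = int(np.argmax(fit))
    return pop[best], float(fit[best])

# Synthetic two-class data: feature 0 is informative, feature 1 is noise.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 30)
X = np.column_stack([y + rng.normal(0.0, 0.3, 60), rng.normal(0.0, 3.0, 60)])

baseline_acc = loo_1nn_accuracy(X, y, np.ones(2))  # unweighted Euclidean
best_w, best_acc = evolve_weights(X, y)
```

Because the unit-weight vector is seeded into the initial population and the best individual is carried over each generation, the returned accuracy on the training data cannot fall below the unweighted baseline. A real evaluation would still need held-out data or cross-validation to check that the learned weights generalise.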
Discussion: The AWDMs proposed in this work can be tuned to the dataset used for training. The accuracy of the CAD system increased from 80.73% to 84.31% (a gain of 3.58 percentage points) when WED was used in place of the Euclidean distance, and from 81.43% to 81.98% (a gain of 0.55 percentage points) when WCBD was used in place of the city block distance.
Conclusion: These results indicate that the use of AWDMs improves classification performance over the corresponding unweighted metrics.
Keywords: Distance metric, genetic algorithm, weighted distance measure, classification, computer aided diagnosis, cross-validation.