Abstract
Background: A typical content-based image retrieval system treats the query image and the images in the dataset as collections of low-level features and retrieves a ranked list of images based on the similarity between the features of the query image and those of the images in the dataset. However, the top-ranked images in the retrieved list, although highly similar to the query image at the feature level, may differ from it in terms of the user's semantic interpretation; this discrepancy is known as the semantic gap. To reduce the semantic gap, this paper investigates how natural scene retrieval can be performed using the bag of visual words (BOW) model and the distribution of local semantic concepts.
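To make the BOW retrieval pipeline described above concrete, the following minimal sketch (illustrative only, not the paper's implementation) clusters precomputed local descriptors into a visual vocabulary, quantizes each image into a BOW histogram, and ranks database images by histogram intersection. The random descriptors, vocabulary size, and similarity measure are all assumptions for illustration.

```python
# Illustrative BOW retrieval sketch; assumes local descriptors (e.g., SIFT)
# have already been extracted. All data here are synthetic placeholders.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Placeholder descriptors: 5 "images", each with 200 local 128-D descriptors.
images = [rng.normal(size=(200, 128)) for _ in range(5)]

# 1. Build the visual vocabulary by clustering all descriptors.
vocab_size = 50
kmeans = KMeans(n_clusters=vocab_size, n_init=10, random_state=0)
kmeans.fit(np.vstack(images))

def bow_histogram(descriptors):
    """Quantize descriptors to visual words; return an L1-normalized histogram."""
    words = kmeans.predict(descriptors)
    hist = np.bincount(words, minlength=vocab_size).astype(float)
    return hist / hist.sum()

# 2. Represent every database image and the query as BOW histograms.
db_hists = np.array([bow_histogram(d) for d in images])
query_hist = bow_histogram(images[0])

# 3. Rank database images by histogram intersection similarity (higher = closer).
similarities = np.minimum(db_hists, query_hist).sum(axis=1)
print("Ranked image indices:", np.argsort(-similarities))
```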
Methods: We study the effectiveness of different approaches for representing the semantic information depicted in natural scene images for image retrieval.
Results: The semantic representation of natural scene images has been implemented using both annotated and unannotated images. Firstly, employing the concept occurrence vector (COV) to summarize the local semantic concepts depicted in an image yielded encouraging retrieval results. The COV constructed from the labels of image regions represented by the BOW model outperformed baseline methods, such as the color histogram, and was comparable with the COV benchmark. Secondly, the retrieval performance of different configurations of the BOW model was studied and evaluated experimentally on three natural scene datasets.
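As a rough illustration of the COV representation, the sketch below counts local semantic concept labels over an image's regions and normalizes the counts into a vector that can be compared across images. The concept list and region labels are hypothetical examples chosen for natural scenes, not the paper's actual vocabulary or data.

```python
# Illustrative concept occurrence vector (COV) sketch; assumes each image is
# divided into regions labeled with local semantic concepts. The concept list
# below is an assumed example for natural scenes.
import numpy as np

CONCEPTS = ["sky", "water", "grass", "foliage", "rocks", "sand", "flowers"]

def concept_occurrence_vector(region_labels):
    """Count how often each concept occurs among the image's region labels,
    then normalize so the vector sums to one."""
    cov = np.array([region_labels.count(c) for c in CONCEPTS], dtype=float)
    return cov / cov.sum()

# Hypothetical region labels for a query image and two database images.
query = ["sky"] * 4 + ["water"] * 3 + ["rocks"] * 3
coast = ["sky"] * 5 + ["water"] * 4 + ["sand"]
forest = ["foliage"] * 7 + ["grass"] * 2 + ["sky"]

q = concept_occurrence_vector(query)
for name, labels in [("coast", coast), ("forest", forest)]:
    # Euclidean distance between COVs; smaller means semantically closer.
    print(name, np.linalg.norm(q - concept_occurrence_vector(labels)))
```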
Conclusion: The experimental results obtained on the different image datasets show that the COV approaches achieved better retrieval accuracy than the BOW-based approaches and the baseline methods.
Keywords: Image retrieval, natural scenes, bag of visual words, visual vocabulary, low-level features, local semantic concepts.