Chapter - Bag of Visual Words Model - A Mathematical Approach

Abstract

Extracting information from images is now valuable for many new applications. Although several simple methods exist for this task, feasibility and accuracy remain critical. One of the simplest and most significant steps is feature extraction, and many scientific approaches build on the extracted features to reach better conclusions. Mathematical procedures, like scientific methods, play an important role in image analysis. The Bag of Visual Words (BoVW) model [1, 2, 3] is one such procedure, and it helps determine how similar the images in a group are. In the BoVW model, each image is characterised by a set of visual words, which are then aggregated into one histogram per image [4]. The difference between two histograms indicates how similar the corresponding images are. A reweighting scheme known as Term Frequency-Inverse Document Frequency (TF-IDF) [5] refines this procedure. The overall weighting [6] for all words in each histogram is calculated before reweighting. Following the traditional approach, the images are transformed into a matrix called the cost matrix, which is constructed using two mathematical measures: Euclidean distance and cosine distance. The main purpose of computing these distances is to detect similarity between the histograms. The histograms are then normalized, both distances are calculated, and a visual representation is generated. The two measures are compared to see which is more appropriate for checking resemblance. The strategy identified as the better solution based on the findings can aid fraud detection in digital signatures, image processing, and image classification.
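The pipeline summarised above (per-image histograms, TF-IDF reweighting, normalization, and the two cost matrices) can be sketched as follows. This is a minimal illustration, not the chapter's implementation: the function names, the toy counts, and the specific TF-IDF variant (term frequency as a within-image proportion, inverse document frequency as the log of the number of images over the number of images containing the word) are assumptions chosen for the example.

```python
import numpy as np

def tfidf_reweight(hist):
    """Reweight raw visual-word counts, shape (n_images, n_words), with TF-IDF.

    Assumed variant: tf = count / total words in the image,
    idf = log(n_images / number of images containing the word).
    """
    hist = np.asarray(hist, dtype=float)
    n_images = hist.shape[0]
    totals = hist.sum(axis=1, keepdims=True)
    tf = hist / np.maximum(totals, 1e-12)      # guard against empty images
    df = np.count_nonzero(hist > 0, axis=0)    # images containing each word
    idf = np.log(n_images / np.maximum(df, 1))
    return tf * idf

def l2_normalize(h):
    """Scale every histogram (row) to unit Euclidean length."""
    norms = np.linalg.norm(h, axis=1, keepdims=True)
    return h / np.maximum(norms, 1e-12)

def euclidean_cost(h):
    """Pairwise Euclidean distances between rows, shape (n_images, n_images)."""
    diff = h[:, None, :] - h[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def cosine_cost(h):
    """Pairwise cosine distances (1 - cosine similarity) between rows."""
    hn = l2_normalize(h)
    return 1.0 - hn @ hn.T

# Hypothetical toy data: 3 images, 4 visual words.
# Images 0 and 1 have identical counts; image 2 uses mostly different words.
counts = np.array([[3, 0, 1, 0],
                   [3, 0, 1, 0],
                   [0, 2, 1, 4]])

weighted = l2_normalize(tfidf_reweight(counts))
cost_euclidean = euclidean_cost(weighted)
cost_cosine = cosine_cost(weighted)
```

In this toy case the word that occurs in every image receives zero weight, so the two identical images stay at distance zero while the third shares no weighted words with them. On unit-length histograms the two measures are monotonically related (squared Euclidean distance equals twice the cosine distance), so any difference in how they rank image pairs comes from unnormalized histograms.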

Cite as