Open Access
Dec 2014

Abstract

In the Bag of Words image representation model, visual words are generated by unsupervised clustering, which discards the spatial relations between words and leads to shortcomings such as limited semantic description and weak discrimination. To address this problem, we propose replacing visual words with visual phrases. Visual phrases, built according to the spatial relations between words, are semantically distinguishable and can improve the accuracy of the Bag of Words model. Because traditional classification methods based on the Bag of Words model are sensitive to the background, occlusion, and scale variation of an image, we further propose a multiple visual words learning method for image classification that combines the concept of visual phrases with Multiple Instance Learning. The final classification model is able to capture the spatial features of image classes. Experiments on two standard image data sets, Caltech 101 and Scene 15, show that the algorithm performs satisfactorily.
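
To make the idea of a visual phrase concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a phrase is an unordered pair of visual words whose keypoints lie within a fixed pixel radius of each other; the abstract does not specify the actual construction rule. The function names `assign_words` and `phrase_histogram`, the `radius` parameter, and the toy data are all hypothetical, and the Multiple Instance Learning stage is not shown.

```python
from collections import Counter
from itertools import combinations

import numpy as np


def assign_words(descriptors, codebook):
    """Map each local descriptor to the index of its nearest codebook centre."""
    # Pairwise squared Euclidean distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def phrase_histogram(words, positions, radius):
    """Count unordered word pairs whose keypoints lie within `radius` pixels.

    Assumption: spatial proximity stands in for the paper's spatial relation.
    """
    counts = Counter()
    for i, j in combinations(range(len(words)), 2):
        if np.linalg.norm(positions[i] - positions[j]) <= radius:
            pair = tuple(sorted((int(words[i]), int(words[j]))))
            counts[pair] += 1
    return counts


# Toy data (hypothetical): four keypoints, 2-D descriptors, a two-word codebook.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 1.0], [0.0, 0.2], [1.0, 0.8]])
positions = np.array([[0.0, 0.0], [3.0, 4.0], [50.0, 50.0], [52.0, 50.0]])

words = assign_words(descriptors, codebook)
phrases = phrase_histogram(words, positions, radius=6.0)
```

A plain Bag of Words histogram of this toy image would only record two occurrences of each word. The phrase histogram additionally records that word 0 and word 1 occur next to each other twice, which is the kind of spatial information the abstract says is lost by clustering alone.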

Language: English
Page range: 1470 - 1492
Submitted on: May 17, 2014
Accepted on: Oct 12, 2014
Published on: Dec 1, 2014
Published by: Professor Subhas Chandra Mukhopadhyay
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2014 Tao Wang, Wenqing Chen, Bailing Wang, published by Professor Subhas Chandra Mukhopadhyay
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.