Using Text and Visual Cues for Fine-Grained Classification
Open Access
Feb 2021

Abstract

Text is one of humanity's most important inventions and has played a key role in human life since the dark ages. Text in an image is closely related to the scene or product it depicts and is widely used in vision-based applications. In this paper we address the problem of visual understanding with text. The main focus is on combining textual cues and visual cues in a deep neural network. First, the text in the image is recognized and classified. We then combine the attended word embeddings with the visual feature vector, and the combined representation is optimized by a CNN for fine-grained image classification. We carried out experiments on a soft drink dataset from Pakistan. The results show that the system achieves significant performance, which can potentially benefit real-world applications such as product search.
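
The fusion step described above can be illustrated with a minimal sketch. The abstract does not specify the attention form or the feature dimensions, so everything here is an assumption made for illustration: the bilinear scoring matrix `W_att`, the use of the visual feature as the attention query, the toy dimensions, and the function name `fuse_text_and_visual` are all hypothetical and not taken from the paper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()

def fuse_text_and_visual(word_embeddings, visual_feature, W_att):
    """Attention-pool word embeddings and concatenate with the visual feature.

    word_embeddings: (n_words, d_text) embeddings of the recognized words
    visual_feature:  (d_vis,) image feature vector, e.g. from a CNN backbone
    W_att:           (d_text, d_vis) bilinear scoring matrix (assumed form)
    """
    # One relevance score per recognized word, conditioned on the image.
    scores = word_embeddings @ W_att @ visual_feature
    weights = softmax(scores)
    # Weighted sum of word embeddings: the "attended" text representation.
    attended_text = weights @ word_embeddings
    # Joint representation that a classifier head would consume.
    fused = np.concatenate([visual_feature, attended_text])
    return fused, weights

# Toy inputs: 3 recognized words, 4-dim embeddings, 6-dim visual feature.
rng = np.random.default_rng(0)
word_embeddings = rng.normal(size=(3, 4))
visual_feature = rng.normal(size=6)
W_att = rng.normal(size=(4, 6))

fused, weights = fuse_text_and_visual(word_embeddings, visual_feature, W_att)
print(fused.shape)
```

In a trained system `W_att` and the classifier on top of `fused` would be learned jointly; here they are random values that only demonstrate the shapes and data flow.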

Language: English
Page range: 42 - 49
Published on: Feb 22, 2021
Published by: Xi’an Technological University
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2021 Zaryab Shaker, Xiao Feng, Muhammad Adeel Ahmed Tahir, published by Xi’an Technological University
This work is licensed under the Creative Commons Attribution 4.0 License.