A Comprehensive Video Dataset for Multi-Modal Recognition Systems

Anand Handa; Rashi Agarwal; Narendra Kohli

doi:10.5334/dsj-2019-055

Data Science Journal

Volume 18 (2019): Issue 1

By: Anand Handa, Rashi Agarwal and Narendra Kohli

Open Access

Nov 2019

Abstract

This paper presents a comprehensive, highly defined and fully labelled video dataset of 67 different subjects. In every video the subject recites the same text, the digits from 1 to 20, and all videos are recorded using the same experimental setup. The dataset can serve as a resource for researchers and analysts training deep neural networks to build efficient and accurate recognition models in computer vision and related domains, such as face recognition, expression recognition, speech recognition and text recognition. We also train face recognition and speech recognition models on our dataset and compare the results with those obtained on publicly available datasets to show its effectiveness. In our experiments, the models achieve higher accuracy on our dataset than on the other datasets on which they are tested.

DOI: https://doi.org/10.5334/dsj-2019-055 | Journal eISSN: 1683-1470

Language: English

Page range: 55 - 55

Submitted on: Nov 20, 2018

Accepted on: Oct 21, 2019

Published on: Nov 8, 2019

Published by: Ubiquity Press

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: Machine learning, Deep learning, video datasets, Convolutional Neural Network

© 2019 Anand Handa, Rashi Agarwal, Narendra Kohli, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.
