Bridging the gap between AI and human emotion: a multimodal recognition system

Ganta Neeraja; Jakkula Sai Surya Teja; M. Ravi Kumar; J. Lakshmi Prasanna; Parvez M. Muzammil; Chella Santhosh

doi:10.2478/candc-2024-0024

References

Alnuaim, A. A., Zakariah, M., Alhadlaq, A., Shashidhar, C., Hatamleh, W. A., Tarazi, H., Shukla, P. K. and Ratna, R. (2022) Human-Computer Interaction with Detection of Speaker Emotions Using Convolution Neural Networks. Computational Intelligence and Neuroscience, 2022 (1): 7463091.
Baevski, A., Zhou, Y., Mohamed, A. and Auli, M. (2020) wav2vec 2.0: A framework for self-supervised learning of speech representations. Advances in Neural Information Processing Systems, 33, 12449-12460.
Chandolikar, N., Joshi, C., Roy, P., Gawas, A. and Vishwakarma, M. (2022, March) Voice recognition: A comprehensive survey. In: 2022 International Mobile and Embedded Technology Conference (MECON). IEEE, 45–51.
Chowdary, M. K., Nguyen, T.N. and Hemanth, D. J. (2023) Deep learning-based facial emotion recognition for human–computer interaction applications. Neural Computing and Applications, 35 (32): 23311–28.
Ekman, P. and Friesen, W. V. (1971) Constants across cultures in the face and emotion. Journal of Personality and Social Psychology, 17 (2), 124.
Eyben, F., Scherer, K. R., Schuller, B. W., Sundberg, J., André, E., Busso, C., Devillers, L. Y., Epps, J., Laukka, P., Narayanan, S. S. and Truong, K. P. (2015) The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing. IEEE Transactions on Affective Computing, 7(2), 190–202.
Goodfellow, I. J., Erhan, D., Carrier, P. L., Courville, A., Mirza, M., Hamner, B., Cukierski, W., Tang, Y., Thaler, D., Lee, D. H. and Zhou, Y. (2013) Challenges in representation learning: A report on three machine learning contests. In: Neural Information Processing: 20th International Conference, ICONIP 2013, Daegu, Korea, November 3-7, 2013. Proceedings, Part III 20. Springer Berlin Heidelberg, 117–124.
Kanna, R. K., Surendhar, P. A., Rubi, J., Jyothi, G., Ambikapathy, A. and Vasuki, R. (2022) Human Computer Interface Application for Emotion Detection Using Facial Recognition. In: 2022 IEEE International Conference on Current Development in Engineering and Technology (CCET). IEEE, 1–7.
Khare, S. K., Blanes-Vidal, V., Nadimi, E. S. and Acharya, U. R. (2024) Emotion recognition and artificial intelligence: A systematic review (2014–2023) and research recommendations. Information Fusion, 102, 102019.
Khattak, A., Asghar, M. Z., Ali, M. and Batool, U. (2022) An efficient deep learning technique for facial emotion recognition. Multimedia Tools and Applications, 81 (2), 1649–1683.
Kumar, H. and Martin, A. (2023) Artificial Emotional Intelligence: Conventional and deep learning approach. Expert Systems with Applications, 212, 118651.
Lim, Y., Ng, K. W., Naveen, P. and Haw, S. C. (2022) Emotion recognition by facial expression and voice: review and analysis. Journal of Informatics and Web Engineering, 1(2), 45–54.
Majumder, N., Poria, S., Hazarika, D., Mihalcea, R., Gelbukh, A. and Cambria, E. (2019) DialogueRNN: An attentive RNN for emotion detection in conversations. In: Proceedings of the AAAI Conference on Artificial Intelligence, 33 (01). AAAI, 6818–6825.
Mannar Mannan, J., Srinivasan, L., Maithili, K. and Ramya, C. (2023) Human emotion recognize using convolutional neural network (CNN) and Mel frequency cepstral coefficient (MFCC). Seybold Report Journal, 18 (4): 49–61.
Mansouri, A., Affendey, L. S. and Mamat, A. (2008) Named entity recognition approaches. International Journal of Computer Science and Network Security, 8(2), 339–344.
Poria, S., Cambria, E., Bajpai, R. and Hussain, A. (2017) A review of affective computing: From unimodal analysis to multimodal fusion. Information Fusion, 37, 98–125.
Raji, I. D. and Buolamwini, J. (2019) Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products. In: Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society. ACM, 429–435.
Samek, W., Montavon, G., Vedaldi, A., Hansen, L. K. and Müller, K. R., eds. (2019) Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. LNCS 11700 Springer Nature.
Sarvakar, K., Senkamalavalli, R., Raghavendra, S., Kumar, J. S., Manjunath, R. and Jaiswal, S. (2023) Facial emotion recognition using convolutional neural networks. Materials Today: Proceedings, 80, 3560–3564.
Scherer, K. R. (2003) Vocal communication of emotion: A review of research paradigms. Speech Communication, 40 (1-2), 227–256.
Shahzad, H. M., Bhatti, S. M., Jaffar, A., Akram, S., Alhajlah, M. and Mahmood, A. (2023) Hybrid facial emotion recognition using CNN-based features. Applied Sciences, 13 (9), 5572.
Trinh, V. L., Dao, Th. L. T., Le Xuan, T. and Castelli, E. (2022) Emotional speech recognition using deep neural networks. Sensors, 22 (4), 1414.
Venkatesan, R., Shirly, S., Selvarathi, M. and Jebaseeli, T. J. (2023) Human Emotion Detection Using DeepFace and Artificial Intelligence. Engineering Proceedings, 59(1), 37.
Zafar, A., Aamir, M., Mohd Nawi, N., Arshad, A., Riaz, S., Alruban, A., Almotairi, S. and Dutta, A. K. (2022) A comparison of pooling methods for convolutional neural networks. Applied Sciences, 12 (17), 8643.
Zeng, Z., Pantic, M., Roisman, G. I. and Huang, T. S. (2007) A survey of affect recognition methods: audio, visual and spontaneous expressions. In: Proceedings of the 9th International Conference on Multimodal Interfaces. ACM, 126–133.
