Bridging the gap between AI and human emotion: a multimodal recognition system

Open Access | Aug 2025

References

  1. Alnuaim, A. A., Zakariah, M., Alhadlaq, A., Shashidhar, C., Hatamleh, W. A., Tarazi, H., Shukla, P. K. and Ratna, R. (2022) Human-Computer Interaction with Detection of Speaker Emotions Using Convolution Neural Networks. Computational Intelligence and Neuroscience, 2022 (1): 7463091.
  2. Baevski, A., Zhou, Y., Mohamed, A. and Auli, M. (2020) wav2vec 2.0: A framework for self-supervised learning of speech representations. Advances in Neural Information Processing Systems, 33, 12449–12460.
  3. Chandolikar, N., Joshi, C., Roy, P., Gawas, A. and Vishwakarma, M. (2022, March) Voice recognition: A comprehensive survey. In: 2022 International Mobile and Embedded Technology Conference (MECON). IEEE, 45–51.
  4. Chowdary, M. K., Nguyen, T.N. and Hemanth, D. J. (2023) Deep learning-based facial emotion recognition for human–computer interaction applications. Neural Computing and Applications, 35 (32): 23311–28.
  5. Ekman, P. and Friesen, W. V. (1971) Constants across cultures in the face and emotion. Journal of Personality and Social Psychology, 17 (2), 124.
  6. Eyben, F., Scherer, K. R., Schuller, B. W., Sundberg, J., André, E., Busso, C., Devillers, L. Y., Epps, J., Laukka, P., Narayanan, S. S. and Truong, K. P. (2015) The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing. IEEE Transactions on Affective Computing, 7(2), 190–202.
  7. Goodfellow, I. J., Erhan, D., Carrier, P. L., Courville, A., Mirza, M., Hamner, B., Cukierski, W., Tang, Y., Thaler, D., Lee, D. H. and Zhou, Y. (2013) Challenges in representation learning: A report on three machine learning contests. In: Neural Information Processing: 20th International Conference, ICONIP 2013, Daegu, Korea, November 3–7, 2013. Proceedings, Part III 20. Springer Berlin Heidelberg, 117–124.
  8. Kanna, R. K., Surendhar, P. A., Rubi, J., Jyothi, G., Ambikapathy, A. and Vasuki, R. (2022) Human Computer Interface Application for Emotion Detection Using Facial Recognition. In: 2022 IEEE International Conference on Current Development in Engineering and Technology (CCET). IEEE, 1–7.
  9. Khare, S. K., Blanes-Vidal, V., Nadimi, E. S. and Acharya, U. R. (2024) Emotion recognition and artificial intelligence: A systematic review (2014–2023) and research recommendations. Information Fusion, 102, 102019.
  10. Khattak, A., Asghar, M. Z., Ali, M. and Batool, U. (2022) An efficient deep learning technique for facial emotion recognition. Multimedia Tools and Applications, January, 81 (2): 1649–1683.
  11. Kumar, H. and Martin, A. (2023) Artificial Emotional Intelligence: Conventional and deep learning approach. Expert Systems with Applications, February, 1, 212: 118651.
  12. Lim, Y., Ng, K. W., Naveen, P. and Haw, S. C. (2022) Emotion recognition by facial expression and voice: review and analysis. Journal of Informatics and Web Engineering, 1(2), 45–54.
  13. Majumder, N., Poria, S., Hazarika, D., Mihalcea, R., Gelbukh, A. and Cambria, E. (2019) DialogueRNN: An attentive RNN for emotion detection in conversations. In: Proceedings of the AAAI Conference on Artificial Intelligence, 33, 01. AAAI, 6818–6825.
  14. Mannar Mannan, J., Srinivasan, L., Maithili, K. and Ramya, C. (2023) Human emotion recognize using convolutional neural network (CNN) and Mel frequency cepstral coefficient (MFCC). Seybold Report Journal, 18 (4): 49–61.
  15. Mansouri, A., Affendey, L. S. and Mamat, A. (2008) Named entity recognition approaches. International Journal of Computer Science and Network Security, 8(2), 339–344.
  16. Poria, S., Cambria, E., Bajpai, R. and Hussain, A. (2017) A review of affective computing: From unimodal analysis to multimodal fusion. Information Fusion, 37, 98–125.
  17. Raji, I. D. and Buolamwini, J. (2019) Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products. In: Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society. ACM, 429–435.
  18. Samek, W., Montavon, G., Vedaldi, A., Hansen, L. K. and Müller, K. R., eds. (2019) Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. LNCS 11700 Springer Nature.
  19. Sarvakar, K., Senkamalavalli, R., Raghavendra, S., Kumar, J. S., Manjunath, R. and Jaiswal, S. (2023) Facial emotion recognition using convolutional neural networks. Materials Today: Proceedings, January, 1, 80: 3560–3564.
  20. Scherer, K. R. (2003) Vocal communication of emotion: A review of research paradigms. Speech Communication, 40 (1–2), 227–256.
  21. Shahzad, H. M., Bhatti, S. M., Jaffar, A., Akram, S., Alhajlah, M. and Mahmood, A. (2023) Hybrid facial emotion recognition using CNN-based features. Applied Sciences, 13 (9), 5572.
  22. Trinh, V. L., Dao, Th. L. T., Le Xuan, T. and Castelli, E. (2022) Emotional speech recognition using deep neural networks. Sensors, February, 12, 22 (4), 1414.
  23. Venkatesan, R., Shirly, S., Selvarathi, M. and Jebaseeli, T. J. (2023) Human Emotion Detection Using DeepFace and Artificial Intelligence. Engineering Proceedings, 59(1), 37.
  24. Zafar, A., Aamir, M., Mohd Nawi, N., Arshad, A., Riaz, S., Alruban, A., Almotairi, S. and Dutta, A. K. (2022) A comparison of pooling methods for convolutional neural networks. Applied Sciences, 12 (17), 8643.
  25. Zeng, Z., Pantic, M., Roisman, G. I. and Huang, T. S. (2007) A survey of affect recognition methods: audio, visual and spontaneous expressions. In: Proceedings of the 9th International Conference on Multimodal Interfaces. ACM, 126–133.
DOI: https://doi.org/10.2478/candc-2024-0024 | Journal eISSN: 2720-4278 | Journal ISSN: 0324-8569
Language: English
Page range: 619–638
Submitted on: Sep 1, 2024 | Accepted on: Mar 1, 2025 | Published on: Aug 26, 2025
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2025 Ganta Neeraja, Jakkula Sai Surya Teja, M. Ravi Kumar, J. Lakshmi Prasanna, Parvez M. Muzammil, Chella Santhosh, published by Systems Research Institute Polish Academy of Sciences
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.