
Nuanced Music Emotion Recognition via a Semi‑Supervised Multi‑Relational Graph Neural Network

Open Access | Jun 2025

References

  1. Akiki, C., and Burghardt, M. (2021). MuSe: The musical sentiment dataset. Journal of Open Humanities Data, 7, 10.
  2. Aljanaki, A., Wiering, F., and Veltkamp, R. C. (2014, October 27–31). Computational modeling of induced emotion using GEMS. In Proceedings of the 15th International Society for Music Information Retrieval Conference, ISMIR 2014, Taipei, Taiwan, (pp. 373–378).
  3. Aljanaki, A., Yang, Y.‑H., and Soleymani, M. (2017). Developing a benchmark for emotional analysis of music. PLoS ONE, 12(3), e0173392. 10.1371/journal.pone.0173392
  4. Alonso‑Jiménez, P., Serra, X., and Bogdanov, D. (2023, November 5–9). Efficient supervised training of audio transformers for music representation learning. In Proceedings of the 24th International Society for Music Information Retrieval Conference, ISMIR 2023, Milan, Italy, (pp. 824–831).
  5. Arazo, E., Ortego, D., Albert, P., O’Connor, N. E., and McGuinness, K. (2020). Pseudo‑labeling and confirmation bias in deep semi‑supervised learning. In 2020 International Joint Conference on Neural Networks (IJCNN), (pp. 1–8). IEEE.
  6. Bhatti, A. M., Majid, M., Anwar, S. M., and Khan, B. (2016). Human emotion recognition and analysis in response to audio music using brain signals. Computers in Human Behavior, 65, 267–275. 10.1016/j.chb.2016.08.029
  7. Bogdanov, D., Lizarraga‑Seijas, X., Alonso‑Jiménez, P., and Serra, X. (2022). MusAV: A dataset of relative arousal‑valence annotations for validation of audio models. In International Society for Music Information Retrieval Conference (ISMIR 2022), Bengaluru, India.
  8. Bogdanov, D., Won, M., Tovstogan, P., Porter, A., and Serra, X. (2019). The MTG‑Jamendo dataset for automatic music tagging. In Machine Learning for Music Discovery Workshop, International Conference on Machine Learning (ICML 2019). Long Beach, CA, United States.
  9. Castellon, R., Donahue, C., and Liang, P. (2021, November 7–12). Codified audio language modeling learns useful representations for music information retrieval. In Proceedings of the 22nd International Society for Music Information Retrieval Conference, ISMIR 2021, Online, (pp. 88–96).
  10. Chełkowska‑Zacharewicz, M., and Janowski, M. (2021). Polish adaptation of the Geneva emotional music scale: Factor structure and reliability. Psychology of Music, 49(5), 1117–1131. 10.1177/0305735620934624
  11. Chen, T., and Wong, R. C. (2020, August 23–27). Handling information loss of graph neural networks for session‑based recommendation. In KDD ’20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, (pp. 1172–1180). ACM.
  12. Choi, J., Song, J.‑H., and Kim, Y. (2018). An analysis of music lyrics by measuring the distance of emotion and sentiment. In 2018 19th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD) (pp. 176–181). IEEE.
  13. da Silva, A. C. M., Silva, D. F., and Marcacini, R. M. (2022, December 4–8). Heterogeneous graph neural network for music emotion recognition. In Proceedings of the 23rd International Society for Music Information Retrieval Conference, ISMIR 2022, Bengaluru, India, (pp. 667–674).
  14. Dhariwal, P., Jun, H., Payne, C., Kim, J. W., Radford, A., and Sutskever, I. (2020). Jukebox: A generative model for music. CoRR, abs/2005.00341.
  15. Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., and Dahl, G. E. (2017). Neural message passing for quantum chemistry. In International Conference on Machine Learning (pp. 1263–1272). PMLR.
  16. Gómez‑Cañón, J. S., Cano, E., Eerola, T., Herrera, P., Hu, X., Yang, Y.‑H., and Gómez, E. (2021). Music emotion recognition: Toward new, robust standards in personalized and context‑sensitive applications. IEEE Signal Processing Magazine, 38(6), 106–114. 10.1109/MSP.2021.3106232
  17. Grover, A., and Leskovec, J. (2016). node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2016 (pp. 855–864). ACM.
  18. Hamilton, W. L., Ying, Z., and Leskovec, J. (2017, December 4–9). Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, (pp. 1024–1034).
  19. Hassani, K., and Ahmadi, A. H. K. (2020). Contrastive multi‑view representation learning on graphs. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020 (Vol. 119, pp. 4116–4126). PMLR.
  20. Horner, A., Hu, D. H., Wu, B., Yang, Q., and Zhong, E. (2013, May 2–4). SMART: Semi‑supervised music emotion recognition with social tagging. In Proceedings of the 13th SIAM International Conference on Data Mining, Austin, Texas, USA, (pp. 279–287). SIAM.
  21. Hu, X., and Downie, J. S. (2010, August 9–13). When lyrics outperform audio for music mood classification: A feature analysis. In Proceedings of the 11th International Society for Music Information Retrieval Conference, ISMIR 2010, Utrecht, Netherlands, (pp. 619–624). International Society for Music Information Retrieval.
  22. Hu, X., Downie, J. S., and Ehmann, A. F. (2009, October 26–30). Lyric text mining in music mood classification. In Proceedings of the 10th International Society for Music Information Retrieval Conference, ISMIR 2009, Kobe, Japan, (pp. 411–416). International Society for Music Information Retrieval.
  23. Jacobsen, P.‑O., Strauss, H., Vigl, J., Zangerle, E., and Zentner, M. (2024). Assessing aesthetic music‑evoked emotions in a minute or less: A comparison of the GEMS‑45 and the GEMS‑9. Musicae Scientiae. Advance online publication. 10.1177/10298649241256252
  24. Jia, Z., Lin, Y., Wang, J., Feng, Z., Xie, X., and Chen, C. (2021). HetEmotionNet: Two‑stream heterogeneous graph recurrent neural network for multi‑modal emotion recognition. In Proceedings of the 29th ACM International Conference on Multimedia (pp. 1047–1056).
  25. Kim, Y. E., Schmidt, E. M., Migneco, R., Morton, B. G., Richardson, P., Scott, J., Speck, J. A., and Turnbull, D. (2010). Music emotion recognition: A state of the art review. In Proc. ISMIR (Vol. 86, pp. 937–952).
  26. Kingma, D. P., and Ba, J. L. (2015, May 7–9). Adam: A method for stochastic optimization. In Y. Bengio and Y. LeCun (Eds.), 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, Conference Track Proceedings.
  27. Kipf, T. N., and Welling, M. (2017, April 24–26). Semi‑supervised classification with graph convolutional networks. In 5th International Conference on Learning Representations, ICLR 2017, Conference Track Proceedings. OpenReview.net.
  28. Laurier, C., Sordo, M., Serra, J., and Herrera, P. (2009, October 26–30). Music mood representations from social tags. In Proceedings of the 10th International Society for Music Information Retrieval Conference, ISMIR 2009, (pp. 381–386). Kobe International Conference Center, Kobe, Japan. International Society for Music Information Retrieval.
  29. Lee, J., Oh, Y., In, Y., Lee, N., Hyun, D., and Park, C. (2022, July 11–15). GraFN: Semi‑supervised node classification on graph with few labels via non‑parametric distribution assignment. In SIGIR ’22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, (pp. 2243–2248). ACM.
  30. Mignot, R., and Peeters, G. (2019). An analysis of the effect of data augmentation methods: Experiments for a musical genre classification task. Transactions of the International Society for Music Information Retrieval, 2(1), 97–110.
  31. Moscati, M., Parada‑Cabaleiro, E., Deldjoo, Y., Zangerle, E., and Schedl, M. (2022, October 17–21). Music4All‑onion ‑ A large‑scale multi‑faceted content‑centric music recommendation dataset. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA, (pp. 4339–4343). ACM.
  32. Moscati, M., Parada‑Cabaleiro, E., Deldjoo, Y., Zangerle, E., and Schedl, M. (2025). Music4All‑onion. Zenodo.
  33. Moscati, M., Strauß, H., Jacobsen, P., Peintner, A., Zangerle, E., Zentner, M., and Schedl, M. (2024, July 1–4). Emotion‑based music recommendation from quality annotations and large‑scale user‑generated tags. In Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization, UMAP 2024, Cagliari, Italy, (pp. 159–164). ACM.
  34. Panda, R., Malheiro, R., and Paiva, R. P. (2018). Novel audio features for music emotion recognition. IEEE Transactions on Affective Computing, 11(4), 614–626.
  35. Panda, R., Malheiro, R., and Paiva, R. P. (2020). Audio features for music emotion recognition: A survey. IEEE Transactions on Affective Computing, 14(1), 68–88.
  36. Perozzi, B., Al‑Rfou, R., and Skiena, S. (2014). DeepWalk: Online learning of social representations. In The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14 (pp. 701–710). ACM.
  37. Pons, J., and Serra, X. (2019). Musicnn: Pre‑trained convolutional neural networks for music audio tagging. CoRR, abs/1909.06654. 10.48550/arXiv.1909.06654
  38. Rajan, R., Antony, J., Joseph, R. A., and Thomas, J. M. (2021). Audio‑mood classification using acoustic‑textual feature fusion. In 2021 Fourth International Conference on Microelectronics, Signals and Systems (ICMSS) (pp. 1–6). IEEE.
  39. Santana, I. A. P., Pinhelli, F., Donini, J., Catharin, L., Mangolin, R. B., Feltrim, V. D., and Domingues, M. A. (2020). Music4All: A new music database and its applications. In 2020 International Conference on Systems, Signals and Image Processing (IWSSIP) (pp. 399–404). IEEE.
  40. Schlichtkrull, M., Kipf, T. N., Bloem, P., van den Berg, R., Titov, I., and Welling, M. (2018, June 3–7). Modeling relational data with graph convolutional networks. In The Semantic Web: 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, Proceedings, (pp. 593–607). Springer.
  41. Strauss, H., Vigl, J., Jacobsen, P.‑O., Bayer, M., Talamini, F., Vigl, W., Zangerle, E., and Zentner, M. (2024). The emotion‑to‑music mapping atlas (EMMA): A systematically organized online database of emotionally evocative music excerpts. Behavior Research Methods, 1–18. 10.3758/s13428-023-02178-2
  42. Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., and Mei, Q. (2015). LINE: Large‑scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web, WWW 2015 (pp. 1067–1077). ACM.
  43. Thakoor, S., Tallec, C., Azar, M. G., Munos, R., Veličković, P., and Valko, M. (2021). Bootstrapped representation learning on graphs. In ICLR 2021 Workshop on Geometrical and Topological Representation Learning.
  44. Tovstogan, P., Bogdanov, D., and Porter, A. (2021, December). MediaEval 2021: Emotion and theme recognition in music using Jamendo. In Working Notes Proceedings of the MediaEval 2021 Workshop, Online. CEUR Workshop Proceedings, Vol. 3181, (pp. 13–15). CEUR-WS.org
  45. Trost, W., Ethofer, T., Zentner, M., and Vuilleumier, P. (2012). Mapping aesthetic musical emotions in the brain. Cerebral Cortex, 22(12), 2769–2783. 10.1093/cercor/bhr353
  46. Vashishth, S., Sanyal, S., Nitin, V., and Talukdar, P. P. (2020, April 26–30). Composition‑based multi‑relational graph convolutional networks. In 8th International Conference on Learning Representations, ICLR 2020. Addis Ababa, Ethiopia, OpenReview.net.
  47. Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., and Bengio, Y. (2018). Graph attention networks. In 6th International Conference on Learning Representations, ICLR 2018, Conference Track Proceedings. OpenReview.net.
  48. Veličković, P., Fedus, W., Hamilton, W. L., Liò, P., Bengio, Y., and Hjelm, R. D. (2019). Deep graph infomax. In 7th International Conference on Learning Representations, ICLR 2019. OpenReview.net.
  49. Xue, H., Xue, L., and Su, F. (2015). Multimodal music mood classification by fusion of audio and lyrics. In MultiMedia Modeling: 21st International Conference, MMM 2015, Proceedings, Part II, (pp. 26–37). Springer.
  50. Yang, J. (2021). A novel music emotion recognition model using neural network technology. Frontiers in Psychology, 12, 760060. 10.3389/fpsyg.2021.760060
  51. Yang, Y.‑H., and Chen, H. H. (2011). Music emotion recognition. CRC Press.
  52. Yang, Y.‑H., Lin, Y.‑C., Su, Y.‑F., and Chen, H. H. (2008). A regression approach to music emotion recognition. IEEE Transactions on Audio, Speech, and Language Processing, 16(2), 448–457. 10.1109/TASL.2007.911513
  53. Zad, S., Heidari, M., James Jr., H., and Uzuner, O. (2021). Emotion detection of textual data: An interdisciplinary survey. In 2021 IEEE World AI IoT Congress (AIIoT) (pp. 0255–0261). IEEE.
  54. Zentner, M., Grandjean, D., and Scherer, K. R. (2008). Emotions evoked by the sound of music: Characterization, classification, and measurement. Emotion, 8(4), 494–521. 10.1037/1528-3542.8.4.494
  55. Zhang, K., Zhang, H., Li, S., Yang, C., and Sun, L. (2018). The PMEmo dataset for music emotion recognition. In Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, (pp. 135–142).
  56. Zhang, L., Yang, X., Zhang, Y., and Luo, J. (2023, November 5–9). Dual attention‑based multi‑scale feature fusion approach for dynamic music emotion recognition. In Proceedings of the 24th International Society for Music Information Retrieval Conference, ISMIR 2023, Milan, Italy, (pp. 207–214).
  57. Zhou, Z., and Li, M. (2005, July 30–August 5). Semi‑supervised regression with co‑training. In IJCAI‑05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, (pp. 908–916). Professional Book Center.
  58. Zhu, X., and Ghahramani, Z. (2002). Learning from labeled and unlabeled data with label propagation. Technical report, Carnegie Mellon University.
DOI: https://doi.org/10.5334/tismir.235 | Journal eISSN: 2514-3298
Language: English
Submitted on: Oct 29, 2024
Accepted on: Apr 30, 2025
Published on: Jun 11, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Andreas Peintner, Marta Moscati, Yu Kinoshita, Richard Vogl, Peter Knees, Markus Schedl, Hannah Strauss, Marcel Zentner, Eva Zangerle, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.