A Bimodal Deep Model to Capture Emotions from Music Tracks
Open Access | Mar 2025

References

  1. L. Smietanka and T. Maka, “Interpreting convolutional layers in DNN model based on time–frequency representation of emotional speech,” Journal of Artificial Intelligence and Soft Computing Research, vol. 14, no. 1, pp. 5–23, Jan. 2024, doi: 10.2478/jaiscr-2024-0001.
  2. S. Sheykhivand, Z. Mousavi, T. Y. Rezaii, and A. Farzamnia, “Recognizing Emotions Evoked by Music Using CNN-LSTM Networks on EEG Signals,” IEEE Access, vol. 8, pp. 139332-139345, 2020, doi: 10.1109/ACCESS.2020.3011882.
  3. Y. Takahashi, T. Hochin, and H. Nomiya, “Relationship between Mental States with Strong Emotion Aroused by Music Pieces and Their Feature Values,” in Proc. 2014 IIAI 3rd International Conference on Advanced Applied Informatics, 2014, pp. 718-725, doi: 10.1109/IIAIAAI.2014.147.
  4. P. A. Wood and S. K. Semwal, “On exploring the connection between music classification and evoking emotion,” in Proc. 2015 International Conference on Collaboration Technologies and Systems (CTS), 2015, pp. 474-476, doi: 10.1109/CTS.2015.7210471.
  5. M. Agapaki, E. A. Pinkerton, and E. Papatzikis, “Music and neuroscience research for mental health, cognition, and development: Ways forward,” Frontiers in Psychology, vol. 13, 2022, doi: 10.3389/fpsyg.2022.976883.
  6. Y. Song, S. Dixon, M. Pearce, and A. Halpern, “Perceived and Induced Emotion Responses to Popular Music: Categorical and Dimensional Models,” Music Perception: An Interdisciplinary Journal, vol. 33, pp. 472-492, Apr. 2016, doi: 10.1525/mp.2016.33.4.472.
  7. Y. Yuan, “Emotion of Music: Extraction and Composing,” Journal of Education, Humanities and Social Sciences, vol. 13, pp. 422-428, May 2023, doi: 10.54097/ehss.v13i.8207.
  8. S. A. Sujeesha, J. B. Mala, and R. Rajeev, “Automatic music mood classification using multi-modal attention framework,” Engineering Applications of Artificial Intelligence, vol. 128, p. 107355, 2024, doi: 10.1016/j.engappai.2023.107355.
  9. M. Schedl, P. Knees, B. McFee, D. Bogdanov, and M. Kaminskas, “Music recommender systems,” in Recommender Systems Handbook, Springer, 2015, pp. 453-492.
  10. MorphCast Technology. Available: https://www.morphcast.com. Accessed: November 2024.
  11. S. Zhao, G. Jia, J. Yang, G. Ding, and K. Keutzer, “Emotion Recognition From Multiple Modalities: Fundamentals and methodologies,” IEEE Signal Processing Magazine, vol. 38, no. 6, pp. 59-73, Nov. 2021, doi: 10.1109/msp.2021.3106895.
  12. T. Li, “Music emotion recognition using deep convolutional neural networks,” Journal of Computational Methods in Science and Engineering, vol. 24, no. 4-5, pp. 3063-3078, 2024, doi: 10.3233/JCM-247551.
  13. P. L. Louro, H. Redinho, R. Malheiro, R. P. Paiva, and R. Panda, “A comparison study of deep learning methodologies for music emotion recognition,” Sensors, vol. 24, no. 7, p. 2201, 2024, doi: 10.3390/s24072201.
  14. M. Blaszke, G. Korvel, and B. Kostek, “Exploring neural networks for musical instrument identification in polyphonic audio,” IEEE Intelligent Systems, pp. 1-11, 2024, doi: 10.1109/mis.2024.3392586.
  15. M. Barata and P. Coelho, “Music Streaming Services: Understanding the drivers of customer purchase and intention to recommend,” Heliyon, vol. 7, p. e07783, Aug. 2021, doi: 10.1016/j.heliyon.2021.e07783.
  16. J. Webster, “The promise of personalization: Exploring how music streaming platforms are shaping the performance of class identities and distinction,” New Media & Society, Jul. 2021, doi: 10.1177/14614448211027863.
  17. E. Schmidt, D. Turnbull, and Y. Kim, “Feature selection for content-based, time-varying musical emotion regression,” in Proc. ACM SIGMM International Conference on Multimedia Information Retrieval, Mar. 2010, pp. 267-274, doi: 10.1145/1743384.1743431.
  18. Y.-H. Yang, Y.-C. Lin, H.-T. Cheng, I.-B. Liao, Y.-C. Ho, and H. H. Chen, “Toward Multimodal Music Emotion Classification,” in Advances in Multimedia Information Processing - PCM 2008, 2008, pp. 70-79.
  19. T. Ciborowski, S. Reginis, D. Weber, A. Kurowski, and B. Kostek, “Classifying Emotions in Film Music—A Deep Learning Approach,” Electronics, vol. 10, no. 23, p. 2955, Nov. 2021, doi: 10.3390/electronics10232955.
  20. X. Han, F. Chen, and J. Ban, “Music Emotion Recognition Based on a Neural Network with an Inception-GRU Residual Structure,” Electronics, vol. 12, no. 4, p. 978, Feb. 2023, doi: 10.3390/electronics12040978.
  21. Y. J. Liao, W. C. Wang, S.-J. Ruan, Y. H. Lee, and S. C. Chen, “A Music Playback Algorithm Based on Residual-Inception Blocks for Music Emotion Classification and Physiological Information,” Sensors, vol. 22, no. 3, p. 777, Jan. 2022, doi: 10.3390/s22030777.
  22. R. Sarkar, S. Choudhury, S. Dutta, A. Roy, and S. K. Saha, “Recognition of emotion in music based on deep convolutional neural network,” Multimedia Tools and Applications, vol. 79, pp. 765-783, 2019, [Online]. Available: https://api.semanticscholar.org/CorpusID:254866914.
  23. S. Giammusso, M. Guerriero, P. Lisena, E. Palumbo, and R. Troncy, “Predicting the emotion of playlists using track lyrics,” in International Society for Music Information Retrieval (ISMIR), Late-Breaking Session, 2017.
  24. Y. Agrawal, R. Shanker, and V. Alluri, “Transformer-based approach towards music emotion recognition from lyrics,” in Advances in Information Retrieval, ECIR 2021, Lecture Notes in Computer Science, vol. 12657, Springer, 2021, doi: 10.1007/978-3-030-72240-1_12.
  25. D. Han, Y. Kong, J. Han, and G. Wang, “A survey of music emotion recognition,” Frontiers of Computer Science, vol. 16, Dec. 2022, doi: 10.1007/s11704-021-0569-4.
  26. T. Baltrušaitis, C. Ahuja, and L. -P. Morency, “Multimodal Machine Learning: A Survey and Taxonomy,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 2, pp. 423-443, 1 Feb. 2019, doi: 10.1109/TPAMI.2018.2798607.
  27. R. Delbouys, R. Hennequin, F. Piccoli, J. Royo-Letelier, and M. Moussallam, “Music Mood Detection Based On Audio And Lyrics With Deep Neural Net,” in Proc. ISMIR 2018, doi: 10.48550/arXiv.1809.07276.
  28. I. A. P. Santana et al., “Music4all: A new music database and its applications,” in Proc. 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), 2020, pp. 399-404, doi: 10.1109/IWSSIP48289.2020.9145170.
  29. E. Çano and M. Morisio, “Moodylyrics: A sentiment annotated lyrics dataset,” in Proc. 2017 International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence (ISMSI), 2017, pp. 118-124, doi: 10.1145/3059336.3059340.
  30. E. Çano and M. Morisio, “Music mood dataset creation based on Last.fm tags,” in Proc. 2017 International Conference on Artificial Intelligence and Applications, Vienna, Austria, 2017, pp. 15-26, doi: 10.5121/csit.2017.70603.
  31. R. E. Thayer, The Biopsychology of Mood and Arousal. Oxford University Press, 1989.
  32. J. Russell, “A Circumplex Model of Affect,” Journal of Personality and Social Psychology, vol. 39, pp. 1161-1178, Dec. 1980, doi: 10.1037/h0077714.
  33. Social music service - Last.fm. Available: https://www.last.fm/. Accessed: November 2024.
  34. Genius - Song Lyrics & Knowledge. Available: https://genius.com/. Accessed: November 2024.
  35. YouTube. Available: https://www.youtube.com. Accessed: November 2024.
  36. M. Sakowicz and J. Tobolewski, “Development and study of an algorithm for the automatic labeling of musical pieces in the context of emotion evoked,” M.Sc. thesis, Gdansk University of Technology and Universitat Politècnica de Catalunya (co-supervised by B. Kostek and J. Turmo), 2023.
  37. Genius and Spotify partnering. Available: https://genius.com/a/genius-and-spotify-together. Accessed: November 2024.
  38. Pafy library. Available: https://pypi.org/project/pafy/. Accessed: November 2024.
  39. Moviepy library. Available: https://pypi.org/project/moviepy/. Accessed: November 2024.
  40. M. Honnibal and I. Montani, “spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing,” 2017. Available: https://github.com/explosion/spaCy. Accessed: November 2024.
  41. P. N. Johnson-Laird and K. Oatley, “Emotions, Simulation, and Abstract Art,” Art & Perception, vol. 9, no. 3, pp. 260-292, 2021, doi: 10.1163/22134913-bja10029.
  42. P. N. Johnson-Laird and K. Oatley, “How poetry evokes emotions,” Acta Psychologica, vol. 224, p. 103506, 2022, doi: 10.1016/j.actpsy.2022.103506.
  43. J. Pennington, R. Socher, and C. Manning, “GloVe: Global Vectors for Word Representation,” in Proc. 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Oct. 2014, pp. 1532-1543, doi: 10.3115/v1/D14-1162.
  44. SpaCy - pre-trained pipeline for English. Available: https://spacy.io/models/en#en_core_web_lg. Accessed: November 2024.
  45. S. Loria, “Textblob Documentation,” Release 0.15, vol. 2, 2018. Available: https://textblob.readthedocs.io/en/dev/. Accessed: November 2024.
  46. F. Pedregosa et al., “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, vol. 12, no. 85, pp. 2825-2830, 2011. Available: http://jmlr.org/papers/v12/pedregosa11a.html. Accessed: November 2024.
  47. “Paradise City” by Guns N’ Roses. Available: https://genius.com/Guns-n-roses-paradise-city-lyrics.
  48. FastText - text classification tutorial. Available: https://fasttext.cc/docs/en/supervised-tutorial.html. Accessed: November 2024.
  49. T. Wolf et al., “Transformers: State-of-the-Art Natural Language Processing,” in Proc. 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2020, pp. 38-45, doi: 10.18653/v1/2020.emnlp-demos.6.
  50. XLNet (base-sized model). Available: https://huggingface.co/xlnet-base-cased. Accessed: November 2024.
  51. Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. R. Salakhutdinov, and Q. V. Le, “XLNet: Generalized autoregressive pretraining for language understanding,” in Advances in Neural Information Processing Systems, vol. 32, 2019, doi: 10.48550/arXiv.1906.08237.
  52. C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the Inception Architecture for Computer Vision,” in Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2016, pp. 2818-2826, doi: 10.1109/CVPR.2016.308.
  53. K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 770-778, doi: 10.1109/CVPR.2016.90.
  54. K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proc. 3rd International Conference on Learning Representations (ICLR 2015), 2015, pp. 1-14, doi: 10.48550/arXiv.1409.1556.
  55. Librosa library. Available: https://librosa.org/. Accessed: November 2024.
  56. F. Chollet et al., “Keras,” 2015. Available: https://github.com/fchollet/keras. Accessed: November 2024.
  57. TensorFlow library. Available: https://www.tensorflow.org/. Accessed: November 2024.
  58. S. C. Huang, A. Pareek, S. Seyyedi, I. Banerjee, and M. Lungren, “Fusion of medical imaging and electronic health records using deep learning: a systematic review and implementation guidelines,” npj Digital Medicine, vol. 3, 12, 2020, doi: 10.1038/s41746-020-00341-z.
  59. A. Paszke et al., “PyTorch: An Imperative Style, High-Performance Deep Learning Library,” in Advances in Neural Information Processing Systems, vol. 32, Curran Associates, Inc., 2019, pp. 8024-8035.
  60. Combining two deep learning models. Available: https://control.com/technical-articles/combining-two-deep-learning-models/. Accessed: November 2024.
  61. Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, A. Hanjalic, and N. Oliver, “TFMAP: Optimizing MAP for top-n context-aware recommendation,” in Proc. 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, Portland, OR, USA, Aug. 2012, pp. 155-164, doi: 10.1145/2348283.2348308.
  62. K. Pyrovolakis, P. K. Tzouveli, and G. Stamou, “Multi-Modal Song Mood Detection with Deep Learning,” Sensors, vol. 22, 2022, doi: 10.3390/s22031065.
  63. E. N. Shaday, V. J. L. Engel, and H. Heryanto, “Application of the Bidirectional Long Short-Term Memory Method with Comparison of Word2Vec, GloVe, and FastText for Emotion Classification in Song Lyrics,” Procedia Computer Science, vol. 245, pp. 137-146, 2024, doi: 10.1016/j.procs.2024.10.237.
Language: English
Page range: 215-235
Submitted on: Nov 28, 2024
Accepted on: Feb 21, 2025
Published on: Mar 18, 2025
Published by: SAN University
In partnership with: Paradigm Publishing Services
Publication frequency: 4 times per year

© 2025 Jan Tobolewski, Michał Sakowicz, Jordi Turmo, Bożena Kostek, published by SAN University
This work is licensed under the Creative Commons Attribution 4.0 License.