Multi-Objective Investigation of Six Feature Source Types for Multi-Modal Music Classification

By: Igor Vatolkin and Cory McKay
Open Access | Jan 2022

Abstract

Every type of musical data (audio, symbolic, lyrics, etc.) has its limitations, and cannot always capture all relevant properties of a particular musical category. In contrast to more typical MIR setups where supervised classification models are trained on only one or two types of data, we propose a more diversified approach to music classification and analysis based on six modalities: audio signals, semantic tags inferred from the audio, symbolic MIDI representations, album cover images, playlist co-occurrences, and lyric texts. Some of the descriptors we extract from these data are low-level, while others encapsulate interpretable semantic knowledge that describes melodic, rhythmic, instrumental, and other properties of music. With the intent of measuring the individual impact of different feature groups on different categories, we propose two evaluation criteria based on “non-dominated hypervolumes”: multi-group feature “importance” and “redundancy”. Both of these are calculated after the application of a multi-objective feature selection strategy using evolutionary algorithms, with a novel approach to optimizing trade-offs between both “pure” and “mixed” feature subsets. These techniques permit an exploration of how different modalities and feature types contribute to class discrimination. We use genre classification as a sample research domain to which these techniques can be applied, and present exploratory experiments on two disjoint datasets of different sizes, involving three genre ontologies of varied class similarity. Our results highlight the potential of combining features extracted from different modalities, and can provide insight on the relative significance of different modalities and features in different contexts.
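To make the notion of a "non-dominated hypervolume" more concrete, the following is a minimal sketch for the two-objective case. It is not the authors' implementation: the choice of objectives (for example, classification error and proportion of selected features, both minimized), the reference point, and the function names `non_dominated` and `hypervolume_2d` are illustrative assumptions. The paper's own definitions of group "importance" and "redundancy" are not reproduced here.

```python
from typing import Iterable, List, Tuple

# A solution is a pair of objective values; both are assumed to be minimized.
Point = Tuple[float, float]


def non_dominated(points: Iterable[Point]) -> List[Point]:
    """Return the Pareto front, sorted by ascending first objective."""
    front: List[Point] = []
    best_second = float("inf")
    # After sorting, a point is non-dominated exactly when it strictly
    # improves the best second objective seen so far.
    for p in sorted(set(points)):
        if p[1] < best_second:
            front.append(p)
            best_second = p[1]
    return front


def hypervolume_2d(points: Iterable[Point], reference: Point) -> float:
    """Area dominated by the front and bounded by the reference point."""
    # Points that do not strictly beat the reference point contribute nothing.
    front = [
        p for p in non_dominated(points)
        if p[0] < reference[0] and p[1] < reference[1]
    ]
    area = 0.0
    prev_second = reference[1]
    # Sum horizontal slabs: each front point adds a rectangle between its
    # own second objective and that of the previous front point.
    for first, second in front:
        area += (reference[0] - first) * (prev_second - second)
        prev_second = second
    return area


if __name__ == "__main__":
    # Hypothetical (error, feature proportion) pairs for illustration only.
    solutions = [(0.1, 0.8), (0.3, 0.4), (0.6, 0.2), (0.5, 0.5)]
    print(non_dominated(solutions))
    print(hypervolume_2d(solutions, reference=(1.0, 1.0)))
```

A larger hypervolume means the front lies closer to the ideal point, so hypervolumes obtained from different feature subsets can be compared against the same reference point.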

DOI: https://doi.org/10.5334/tismir.67 | Journal eISSN: 2514-3298
Language: English
Submitted on: Jun 17, 2020
Accepted on: Dec 2, 2021
Published on: Jan 24, 2022
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2022 Igor Vatolkin, Cory McKay, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.