Abstract
Motif discovery in polyphonic symbolic music data is an important yet challenging task in music processing. In this paper, we propose a novel motif-discovery method that combines traditional rule-based repeated pattern discovery algorithms with a machine learning–based model for motif note identification, i.e., deciding whether or not each note belongs to a motif. More specifically, the motif note identification model extracts motif notes, and only these notes are passed to the subsequent repeated pattern discovery step. Removing non-motif notes reduces the unwanted outputs of repeated pattern discovery and thereby improves performance. Even with a limited amount of training data, motif note identification can be implemented by fine-tuning a pre-trained model for symbolic music using pseudo-labels. The results demonstrate the feasibility of applying data-driven methods to assist the motif-discovery task, specifically on the occurrence and three-layer metrics, when labeled training data for motifs and repeated patterns are scarce.
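The two-stage idea described above (filter motif notes, then run repeated pattern discovery) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: notes are simplified to (onset, pitch) pairs, the scores are hand-set stand-ins for the output of a fine-tuned identification model, the 0.5 threshold is arbitrary, and the pattern search is a simple SIA-style grouping by translation vector, which is not necessarily the algorithm used in this work.

```python
from collections import defaultdict

# Simplified note representation (an assumption): (onset, MIDI pitch).
Note = tuple[int, int]


def filter_motif_notes(notes, scores, threshold=0.5):
    """Keep notes whose (hypothetical) motif score reaches the threshold."""
    return [note for note, score in zip(notes, scores) if score >= threshold]


def translatable_patterns(notes, min_size=3):
    """SIA-style grouping: map each translation vector to the notes it maps
    onto another note in the set; keep groups with at least min_size notes."""
    points = sorted(set(notes))
    by_vector = defaultdict(list)
    for i, p in enumerate(points):
        for q in points[i + 1:]:
            by_vector[(q[0] - p[0], q[1] - p[1])].append(p)
    return {v: pts for v, pts in by_vector.items() if len(pts) >= min_size}


# Toy polyphonic excerpt: a three-note motif stated at onsets 0 and 4,
# over a repeating bass note that is not part of the motif.
motif = [(0, 60), (1, 62), (2, 64), (4, 60), (5, 62), (6, 64)]
bass = [(0, 36), (2, 36), (4, 36), (6, 36)]
notes = motif + bass

# Hand-set stand-in scores; a real system would take these from the model.
scores = [0.9] * len(motif) + [0.1] * len(bass)

unfiltered_patterns = translatable_patterns(notes)
filtered_patterns = translatable_patterns(filter_motif_notes(notes, scores))
```

In this toy example the unfiltered pattern for the translation vector (4, 0) mixes bass notes with the motif, whereas after filtering the same vector yields only the three motif notes, which is the kind of unwanted output the filtering step is meant to suppress.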
