Differentiable Short-Term Models for Efficient Online Learning and Prediction in Monophonic Music

Open Access | Nov 2022

Figures & Tables

Figure 1

First 8 bars of tune “250 to Vigo (sessiontune9)” from “The Session” dataset. The figure shows a key that is similar to the query; both are followed by the same pitch, E. PPM would fail to match the key to the query, since the two do not match exactly at any order n.
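
The exact-match limitation described in this caption can be illustrated with a minimal sketch. It assumes that a match "at order n" means the last n symbols of two contexts agree exactly, which is how PPM-style context matching is usually described. The function name `longest_matching_order` and the toy pitch sequences are hypothetical and are not taken from the figure.

```python
def longest_matching_order(key, query):
    """Return the largest n such that the last n symbols of `key` and
    `query` are identical (0 if even the most recent symbols differ)."""
    n = 0
    # Walk both contexts backwards from the most recent symbol.
    for a, b in zip(reversed(key), reversed(query)):
        if a != b:
            break
        n += 1
    return n


# Hypothetical contexts: similar overall, but the most recent pitch differs.
key = ["D", "F#", "A", "F#"]
query = ["D", "F#", "A", "G"]
order = longest_matching_order(key, query)
```

Under this assumption, two contexts that agree in most positions still yield order 0 when their final symbols differ, so an exact-match model has no shared context from which to predict the common continuation.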

Table 1

Negative log-likelihood (NLL) and precision of the DCSTM and the CCSTM (our models) and of the baselines, grouped into short-term (STM) and long-term (LTM) models. The length of the respective temporal context is denoted by n; * denotes that the maximal context is used. EVT means that performance was measured only at time steps where the pitch changes.

TYPE  NAME             N    NLL    PRECISION
STM   CCSTM-512        512  0.574  0.848
      CCSTM-32         32   0.733  0.783
      DCSTM-512        512  0.792  0.781
      MC-3             3    1.922  0.606
      PPM              *    1.387  0.798
      Repetition       1    2.724  0.606
LTM   WaveNet-512      512  0.502  0.849
      Transformer-512  512  0.370  0.887
      Transformer-32   32   0.852  0.718
EVT   CCSTM-512        512  1.237  0.682
      IDyOM            *    1.870  0.426

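
As an illustration of the two metrics reported in Table 1, the following sketch computes a mean NLL and a precision score from per-step predictive distributions, optionally restricted to time steps where the pitch changes (the EVT setting). It rests on assumptions that this page does not confirm: NLL is taken as the mean negative natural-log probability of the true pitch, precision as the fraction of steps where the most probable pitch equals the true pitch, and the first time step is counted as a pitch change. The names `nll_and_precision` and `pitch_change_mask` are hypothetical.

```python
import math


def nll_and_precision(probs, targets, mask=None):
    """probs: one probability distribution (list of floats) per time step.
    targets: true class index per time step.
    mask: optional booleans selecting the time steps to evaluate."""
    if mask is None:
        mask = [True] * len(targets)
    total_nll, correct, count = 0.0, 0, 0
    for p, t, keep in zip(probs, targets, mask):
        if not keep:
            continue
        # Assumes p[t] > 0; a zero probability would make the NLL infinite.
        total_nll += -math.log(p[t])
        predicted = max(range(len(p)), key=p.__getitem__)  # argmax
        correct += int(predicted == t)
        count += 1
    if count == 0:
        raise ValueError("no time steps selected")
    return total_nll / count, correct / count


def pitch_change_mask(targets):
    """True where the pitch differs from the previous time step.
    Counting the first step as a change is an assumption."""
    return [True] + [a != b for a, b in zip(targets, targets[1:])]
```

Passing `pitch_change_mask(targets)` as `mask` evaluates only the steps at which the pitch changes, where simply repeating the previous pitch is never correct.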
Figure 2

Negative log-likelihood (NLL) and precision as functions of the time step in intra-opus prediction, averaged over the pieces of the test set.

Figure 3

Prediction of DSTMs on A Scone For Breakfast (sessiontune157) from the test dataset. Green indicates the actual pitch, and red indicates a prediction error.

Figure 4

Confusion matrices for DSTMs.

Figure 5

Aggregate saliency maps for DSTMs. Pixel intensities indicate how important variables are for prediction on average.

Figure 6

Similarity of codes grouped by their pitch value.

Figure 7

Precision computed for each pitch in the test set and the histogram of pitches in the training data set.

Figure 8

Precision computed for each time signature in the test set and the histogram of time signatures in the training data set.

DOI: https://doi.org/10.5334/tismir.123 | Journal eISSN: 2514-3298
Language: English
Submitted on: Nov 5, 2021
Accepted on: Sep 12, 2022
Published on: Nov 29, 2022
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2022 Mathias Rose Bjare, Stefan Lattner, Gerhard Widmer, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.