Figures & Tables

tismir-7-1-157-g1.png
Figure 1

Graphical overview of the methods we contribute. ScoreAug (Section 5.1) in the top row, unsupervised domain adaptation (Section 5.2) in the center and snapshot-ensemble-based confidence ratings (Section 5.3) at the bottom.

Table 1

The AP at 0.5 overlap for our baseline model and two state-of-the-art models (DWD, Faster R-CNN (Tuggener et al., 2021)) on DeepScoresV2.

DeepScoresV2 dataset
Model | AP (overlap = 0.50)
Baseline model | 89.3%
DWD | 50.3%
Faster R-CNN | 79.9%
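
The overlap threshold in the AP column can be illustrated with a minimal sketch. It assumes that "overlap" means intersection over union (IoU) between a predicted and a ground-truth box, and it simplifies to axis-aligned boxes given as (x1, y1, x2, y2). The function names and the box format are illustrative assumptions, not the evaluation code behind the table.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2). Assumed box format."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Width and height are clamped to zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a hit if its IoU with the ground truth reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

Under this reading, a lower threshold such as 0.25 accepts looser localisations than 0.50, so AP values measured at different thresholds are not directly comparable.
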
tismir-7-1-157-g2.png
Figure 2

Example snippets from two RealScores pages with ground truth annotations overlaid.

tismir-7-1-157-g3.png
Figure 3

Example blank pages.

Table 2

Probabilities of the ScoreAug augmentations, which can be applied to the blanks, the synthetic scores, or both at the same time. Note that P_aug (the "No Additional Augmentations" row) controls whether any further augmentations are applied after the salt and pepper noise, so that the model is not fed ScoreAugmented samples exclusively. Our final model uses P_snp = 0%, P_aug = 30%, P_blur = 10%.

Columns: Blanks | Scores
Salt and Pepper Noise: P_snp
No Additional Augmentations: P_aug
Horizontal Flip: 50%
Vertical Flip: 50%
Crop and Resize: 20%
Randomise Brightness: 50%
Higher Contrast: 20%
Small Angle Rotation: 60% | 60%
Additional Brightness: 40%
Gaussian Blur: P_blur
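
As a rough illustration of how the probabilities in Table 2 could drive sampling, the sketch below draws a list of augmentation names for one training sample. It rests on several assumptions that the table does not confirm: each augmentation is drawn independently, the "no additional augmentations" draw acts as an early exit after the salt and pepper noise, and the distinction between blanks and scores is ignored. The names `sample_augmentations` and `FIXED_PROBS` are hypothetical.

```python
import random

# Fixed per-augmentation probabilities copied from Table 2.
# Whether each applies to blanks or scores is not modelled here (assumption).
FIXED_PROBS = {
    "horizontal_flip": 0.5,
    "vertical_flip": 0.5,
    "crop_and_resize": 0.2,
    "randomise_brightness": 0.5,
    "higher_contrast": 0.2,
    "small_angle_rotation": 0.6,
    "additional_brightness": 0.4,
}


def sample_augmentations(rng, p_snp=0.0, p_aug=0.3, p_blur=0.1):
    """Return the names of the augmentations drawn for one sample.

    Defaults follow the final-model values quoted in the Table 2 caption.
    """
    chosen = []
    if rng.random() < p_snp:
        chosen.append("salt_and_pepper_noise")
    # With probability p_aug, stop here: no additional augmentations.
    if rng.random() < p_aug:
        return chosen
    # Otherwise draw every remaining augmentation independently (assumption).
    for name, prob in FIXED_PROBS.items():
        if rng.random() < prob:
            chosen.append(name)
    if rng.random() < p_blur:
        chosen.append("gaussian_blur")
    return chosen


picked = sample_augmentations(random.Random(0))
```

Passing an explicit `random.Random` instance keeps the draws reproducible, which helps when comparing runs with different probability settings.
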
tismir-7-1-157-g4.png
Figure 4

ScoreAug examples (top right, bottom row) derived from the same synthetic sample (top left).

tismir-7-1-157-g5.png
Figure 5

Overview of our UDA system, with data, gradient, and label flow of step (I) shown in orange, of step (II) in green and of step (III) in blue.

Table 3

The AP for the baseline model and models with ScoreAug and Finalise data augmentation on the DeepScoresV2 and the RealScores datasets.

DeepScoresV2 dataset
Model | AP (overlap = 0.25)
Baseline | 87.6%
ScoreAug | 86.0%
ScoreAug + Finalise | 83.3%
RealScores dataset
Model | AP (overlap = 0.25)
Baseline | 36.0%
ScoreAug | 56.5%
ScoreAug + Finalise | 73.7%
Table 4

The AP for the baseline model and a model with UDA on the DeepScoresV2 and RealScores datasets.

DeepScoresV2 dataset
Model | AP (overlap = 0.25)
Baseline | 87.6%
UDA | 72.4%
RealScores dataset
Model | AP (overlap = 0.25)
Baseline | 36.0%
UDA | 48.9%
Table 5

The AP for a model without ensembling and for ensemble models with different cosine annealing cycle lengths on the DeepScoresV2 and RealScores datasets.

DeepScoresV2 dataset
Model | AP (overlap = 0.25)
ScoreAug | 82.1%
ScoreAug ensemble (10 cycles) | 85.6%
ScoreAug ensemble (20 cycles) | 87.3%
ScoreAug ensemble (30 cycles) | 83.4%
RealScores dataset
Model | AP (overlap = 0.25)
ScoreAug | 37.9%
ScoreAug ensemble (10 cycles) | 44.6%
ScoreAug ensemble (20 cycles) | 46.7%
ScoreAug ensemble (30 cycles) | 47.0%
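
The cosine annealing cycle length varied in Table 5 can be made concrete with a minimal sketch of a cyclic cosine learning-rate schedule, as commonly used for snapshot ensembles: the rate decays from a maximum to a minimum within each cycle, restarts, and a model snapshot is saved at the end of every cycle. The learning-rate bounds, the unit of a "step" (for example an epoch), and the function names are assumptions for illustration, not values from the paper.

```python
import math


def cosine_annealing_lr(step, cycle_length, lr_max=0.1, lr_min=0.0):
    """Learning rate at `step` for a cosine schedule that restarts every `cycle_length` steps."""
    position = step % cycle_length  # position inside the current cycle
    cosine = math.cos(math.pi * position / cycle_length)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


def snapshot_steps(total_steps, cycle_length):
    """Steps after which a snapshot would be saved: the last step of each full cycle."""
    return [s for s in range(total_steps) if (s + 1) % cycle_length == 0]


# Example: 60 steps with a cycle length of 20 yield three snapshots.
schedule = [cosine_annealing_lr(s, 20) for s in range(60)]
snapshots = snapshot_steps(60, 20)
```

In this sketch a shorter cycle length yields more snapshots for the same training budget, while each snapshot has had fewer steps to converge before the restart.
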
tismir-7-1-157-g6.png
Figure 6

Four cropped visualisation samples of predictions made by an ensemble. The colour of the bounding box indicates the model’s confidence (green means high confidence, and red means low confidence). For symbols with a confidence score below 30%, we plot the assigned label and the confidence score in addition to the coloured bounding box.
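
The colour coding described in this caption could be implemented along the following lines. This is a sketch under stated assumptions: the caption only says that green means high and red means low confidence, so the linear interpolation between the two is a guess, and the function names and the label string in the example are hypothetical.

```python
def confidence_colour(confidence):
    """Map a confidence in [0, 1] to an (R, G, B) tuple: red at 0, green at 1.

    Linear red-to-green interpolation is an assumption; the caption does not specify it.
    """
    c = min(max(confidence, 0.0), 1.0)  # clamp out-of-range scores
    return (round(255 * (1.0 - c)), round(255 * c), 0)


def annotation_text(label, confidence, threshold=0.30):
    """Return the text to draw next to a box, or None when only the box is drawn.

    Labels and scores are shown only for detections below the 30% threshold.
    """
    if confidence < threshold:
        return f"{label} ({confidence:.0%})"
    return None


# Example: a low-confidence detection gets a mostly red box and a text annotation.
colour = confidence_colour(0.25)
text = annotation_text("noteheadBlack", 0.25)
```

Restricting the text to low-confidence symbols keeps dense score pages readable while still flagging the detections that most need review.
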

Table 6

The AP for the ensemble trained with a cosine annealing cycle length of 20. The model is trained once with ScoreAug only and once with ScoreAug in combination with 50 subsequent Finalise cycles.

DeepScoresV2 dataset
Ensemble (cycle length = 20) | AP (overlap = 0.25)
ScoreAug | 87.3%
ScoreAug & Finalise | 81.5%
RealScores dataset
Ensemble (cycle length = 20) | AP (overlap = 0.25)
ScoreAug | 46.7%
ScoreAug & Finalise | 63.6%
DOI: https://doi.org/10.5334/tismir.157 | Journal eISSN: 2514-3298
Language: English
Submitted on: Dec 6, 2022
Accepted on: Jul 31, 2023
Published on: Jan 11, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Lukas Tuggener, Raphael Emberger, Adhiraj Ghosh, Pascal Sager, Yvan Putra Satyawan, Javier Montoya, Simon Goldschagg, Florian Seibold, Urs Gut, Philipp Ackermann, Jürgen Schmidhuber, Thilo Stadelmann, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.