Open Broadcast Media Audio from TV: A Dataset of TV Broadcast Audio with Relative Music Loudness Annotations

Open Access
Aug 2019

Figures & Tables

Table 1

Comparison between publicly available datasets.

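| Name/Author | Mixed music | Classes per instance | Loudness | # instances | Duration (h) |
|---|---|---|---|---|---|
| Scheirer | Yes*, annotated | Single-class | No | 245 | 1 |
| Seyerlehner | Yes, not annotated | Multi-class | No | 13 | 9 |
| GTZAN | No | Single-class | No | 128 | 1.1 |
| MUSAN | No | Single-class | No | 2016 | 108.9 |
| OpenBMAT | Yes, annotated | Multi-class | Yes | 1647 | 27.4 |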

* Only with speech.

Figure 1

Distribution of audio files by program type and country. The program types are: children (C), documentary (D), entertainment (E), music (M), news (N), series & films (S&F), sports (S) and talk (T).

Figure 2

Screenshot of BAT, the annotation tool used for the annotation of OpenBMAT.

Figure 3

(Left) MD mapping: the mapping used to compute agreement for the music detection task. (Right) RMLE mapping: the mapping that also retains information about the relative loudness of music.

Table 2

Percentages of full, partial and pair-wise (PW) agreement (Agr) for the whole dataset. These values have been computed for the complete taxonomy and both mappings.

| Agreement level | No mapping Agr (%) | MD mapping Agr (%) | RMLE mapping Agr (%) |
|---|---|---|---|
| %FA | 68.18 | 94.78 | 89.1 |
| %PA | 96.75 | 100 | 99.79 |
| %PW (annotators 1 & 2) | 77.46 | 96.22 | 91.7 |
| %PW (annotators 2 & 3) | 76.97 | 96.78 | 92.78 |
| %PW (annotators 1 & 3) | 78.66 | 96.55 | 93.52 |

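
The agreement levels reported in Table 2 can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the three annotations have been sampled on a common time grid, so that each position holds one label per annotator, and that partial agreement means at least two of the three annotators agree. The function name and the label strings are hypothetical.

```python
from itertools import combinations


def agreement_percentages(annotations):
    """Return full, partial and pair-wise agreement as percentages.

    `annotations` is a list of three equal-length label sequences,
    one per annotator (assumed layout, not taken from the dataset).
    """
    n = len(annotations[0])
    if n == 0 or any(len(a) != n for a in annotations):
        raise ValueError("annotations must be non-empty and equally long")

    frames = list(zip(*annotations))
    # Full agreement: every annotator chose the same label.
    full = sum(len(set(frame)) == 1 for frame in frames)
    # Partial agreement: at least two annotators chose the same label.
    partial = sum(len(set(frame)) < len(frame) for frame in frames)

    result = {"FA": 100 * full / n, "PA": 100 * partial / n}
    # Pair-wise agreement for each pair of annotators (1-based names).
    for i, j in combinations(range(len(annotations)), 2):
        agree = sum(a == b for a, b in zip(annotations[i], annotations[j]))
        result[f"PW({i + 1},{j + 1})"] = 100 * agree / n
    return result


# Toy labels, invented for illustration only.
toy = [
    ["fg", "bg", "no", "no"],
    ["fg", "bg", "bg", "fg"],
    ["fg", "no", "no", "bg"],
]
print(agreement_percentages(toy))
```

Under this definition, three annotators choosing between only two classes always produce at least one agreeing pair, which is consistent with the 100% partial agreement reported for the MD mapping.
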
Figure 4

Percentage of the content of OpenBMAT by class and agreement level.

Figure 5

Cumulative percentage of audio files above a given %FA_af value (the full agreement computed per audio file), using the RMLE mapping.

Figure 6

(Rows) Class annotated by two annotators. (Columns) Class annotated by the third annotator. (Values) Percentage of the content with full or partial agreement for each class, broken down by the classification of the third annotator.

Table 3

Columns 2 to 4: percentage of all the audio annotated by each annotator as each of the classes of the RMLE mapping. Columns 5 and 6: percentage of all the audio annotated by each annotator as Music or No Music (isolated) or as any of the other 4 classes (mixed).

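| Annotator | Fg. Music (%) | Bg. Music (%) | No Music (%) | Isolated (%) | Mixed (%) |
|---|---|---|---|---|---|
| Annotator 1 | 16.6 | 34.45 | 48.94 | 60.09 | 39.91 |
| Annotator 2 | 12.7 | 37.28 | 50.02 | 57.84 | 42.16 |
| Annotator 3 | 15 | 34.66 | 50.34 | 59.28 | 40.72 |
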
Table 4

Performance of MMG on the OpenBMAT dataset using the MD and RMLE mappings. We report overall accuracy (Acc), and Precision (P) and Recall (R) for each mapped class. In this table, Music stands both for Music, in the case of MD mapping, and Foreground Music, in the case of RMLE mapping.

| Mapping | Acc. | Music P | Music R | Bg. Music P | Bg. Music R | No Music P | No Music R |
|---|---|---|---|---|---|---|---|
| MD | 88.95 | 91.99 | 85.45 | n/a | n/a | 86.29 | 92.48 |
| RMLE | 82.71 | 77.64 | 69.96 | 78.51 | 76.09 | 86.8 | 91.33 |

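
The metrics in Table 4 (overall accuracy, and precision and recall per mapped class) can be sketched as follows. This is an illustrative computation, not the evaluation code used for MMG. It assumes that the reference annotation and the prediction are label sequences on the same time grid. The function name and the label strings are hypothetical.

```python
def evaluate(reference, predicted):
    """Return accuracy and per-class precision/recall as percentages."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("sequences must be non-empty and equally long")

    pairs = list(zip(reference, predicted))
    correct = sum(r == p for r, p in pairs)
    report = {"accuracy": 100 * correct / len(pairs)}

    for cls in sorted(set(reference) | set(predicted)):
        true_pos = sum(r == cls and p == cls for r, p in pairs)
        n_predicted = sum(p == cls for p in predicted)
        n_actual = sum(r == cls for r in reference)
        report[cls] = {
            # Precision: share of positions predicted as cls that are cls.
            "precision": 100 * true_pos / n_predicted if n_predicted else None,
            # Recall: share of positions annotated as cls that were found.
            "recall": 100 * true_pos / n_actual if n_actual else None,
        }
    return report


# Toy sequences, invented for illustration only.
reference = ["fg", "fg", "bg", "no", "no", "no"]
predicted = ["fg", "bg", "bg", "no", "no", "fg"]
print(evaluate(reference, predicted))
```

A class that is never predicted, or never present in the reference, gets `None` for the undefined metric instead of a division by zero.
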
Figure 7

Audio file distribution by full agreement using the RMLE mapping and the accuracy achieved by MMG when evaluated against the annotations of one of the annotators.

DOI: https://doi.org/10.5334/tismir.29 | Journal eISSN: 2514-3298
Language: English
Submitted on: Jan 15, 2019
Accepted on: Jun 24, 2019
Published on: Aug 12, 2019
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2019 Blai Meléndez-Catalán, Emilio Molina, Emilia Gómez, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.