
Figure 1
The BM book and various derived data modalities from the MTD. (a) Original book. (b) Clean sheet music engraving of a musical theme. (c) Piano roll representation. (d) Alignment data. (e) Waveform of audio snippet.

Figure 2
Various themes from the MTD. (a) Beethoven: Symphony No. 5 in C minor, Op. 67, first movement (piano transcription), first theme. (b) Beethoven: Piano Sonata No. 2 in A major, Op. 2, No. 2, first movement, second theme. (c) Schubert: Piano Sonata in B♭ major, D 960, first movement, second theme. (d) Debussy: Reflets dans l’eau (Images, Book 1, L 110, No. 1), two themes.

Figure 3
Various bar graphs for metadata of the themes. Numbers of themes on horizontal axes (logarithmic) are shown per (a) composer (non-bold numbers after the slash indicate total number of themes in BM book), (b) theme instrumentation, (c) work instrumentation, and (d) ensemble type.

Table 2
Overview of all metadata contained in the MTD.
| Field | Description |
|---|---|
| MTDID | Identifier, used in the MTD |
| BMID | Identifier, from original BM book |
| EDMID | Identifier, used in the EDM |
| ComposerID | Identifier, based on composer’s name |
| WorkID | Identifier, usually based on catalog number |
| PerformanceID | Identifier, based on main performer of recording |
| CollectionID | Identifier, based on album collection |
| LabelID | Identifier, based on recording label |
| WCMID | Internal ID for audio recording |
| MusicBrainzID | MusicBrainz release ID for album collection |
| ComposerBirth | Composer’s year of birth |
| ComposerDeath | Composer’s year of death |
| WorkTitle | Sub-title, nickname, or non-numeric title for musical work |
| ThemeLabelBM | Label for theme, from original BM book |
| ThemeInstruments | Instrument(s) playing the theme |
| WorkInstruments | Instrument(s) of the musical work |
| Ensemble | Ensemble type |
| Polyphony | Indication of musical texture |
| NameCD | CD name in album collection |
| NameTrack | Track name in the CD of the album collection |
| StartTime | Start time of theme occurrence in audio recording |
| EndTime | End time of theme occurrence in audio recording |
| MidiTransposition | Pitch transposition difference between recording and symbolic encoding |
| Comment | Textual comment |
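
As a minimal sketch of how the fields in Table 2 might be consumed, the following Python snippet parses one metadata record from JSON and derives the snippet duration from StartTime and EndTime. The record values, the flat JSON layout, and the assumption that both times are given in seconds are illustrative and not taken from the MTD itself; the helper name `theme_duration` is hypothetical.

```python
import json

# Hypothetical record using a subset of the Table 2 field names.
# All values are made up for illustration; times are ASSUMED to be seconds.
EXAMPLE_JSON = """
{
  "MTDID": "0000",
  "ComposerID": "Example",
  "StartTime": 12.5,
  "EndTime": 20.0,
  "MidiTransposition": 0
}
"""


def theme_duration(record: dict) -> float:
    """Return EndTime - StartTime (assumed seconds) for one metadata record."""
    start = float(record["StartTime"])
    end = float(record["EndTime"])
    if end < start:
        raise ValueError("EndTime precedes StartTime")
    return end - start


record = json.loads(EXAMPLE_JSON)
duration = theme_duration(record)
print(f"Theme {record['MTDID']} lasts {duration:.1f} s")
```

If the actual metadata files store times in another unit or format, only the conversion inside `theme_duration` would need to change.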

Figure 4
Dataset overview. (a) Main statistics of dataset. Average information (Ø) given as mean ± standard deviation. (b) Histogram of theme durations in quarter notes and (c) in seconds.

Figure 5
Time signatures in the MTD.

Table 3
Overview of the MTD directory structure.
| Directory | Description | Format |
|---|---|---|
| data_EDM-orig_CSV | Original EDM files | CSV |
| data_EDM-orig_IMG | Original EDM files | |
| data_EDM-orig_MID | Original EDM files | MIDI |
| data_EDM-corr_CSV | Corrected EDM files | CSV |
| data_EDM-corr_IMG | Corrected EDM files | |
| data_EDM-corr_MID | Corrected EDM files | MIDI |
| data_EDM-alig_CSV | Aligned EDM files | CSV |
| data_EDM-alig_MID | Aligned EDM files | MIDI |
| data_SCORE_CSV | SCORE files | CSV |
| data_SCORE_IMG | SCORE files | |
| data_SCORE_MID | SCORE files | MIDI |
| data_SCORE_SIB | SCORE files | Sibelius |
| data_SCORE_XML | SCORE files | MusicXML |
| data_ALIGNMENT | Alignment data | CSV |
| data_AUDIO | Audio snippets | WAV |
| data_META | Metadata | JSON |
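
The directory layout in Table 3 can be checked programmatically after downloading the dataset. The sketch below is one possible way to do this, not part of the MTD tooling: the function name `missing_directories` and the idea of a single local root folder are assumptions, and formats that the table does not state are left as `None`.

```python
from pathlib import Path

# Directory names and formats as listed in Table 3.
# Formats not given in the table are left as None rather than guessed.
MTD_DIRECTORIES = {
    "data_EDM-orig_CSV": "CSV",
    "data_EDM-orig_IMG": None,
    "data_EDM-orig_MID": "MIDI",
    "data_EDM-corr_CSV": "CSV",
    "data_EDM-corr_IMG": None,
    "data_EDM-corr_MID": "MIDI",
    "data_EDM-alig_CSV": "CSV",
    "data_EDM-alig_MID": "MIDI",
    "data_SCORE_CSV": "CSV",
    "data_SCORE_IMG": None,
    "data_SCORE_MID": "MIDI",
    "data_SCORE_SIB": "Sibelius",
    "data_SCORE_XML": "MusicXML",
    "data_ALIGNMENT": "CSV",
    "data_AUDIO": "WAV",
    "data_META": "JSON",
}


def missing_directories(root) -> list:
    """Return the sorted Table 3 directory names that are absent under root."""
    root = Path(root)
    return sorted(
        name for name in MTD_DIRECTORIES if not (root / name).is_dir()
    )


if __name__ == "__main__":
    # "." is a placeholder; point this at the local copy of the dataset.
    for name in missing_directories("."):
        print(f"missing: {name}")
```

An empty result indicates that all sixteen directories from Table 3 are present under the given root.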

Figure 6
Screenshots of our web-based interfaces. (a) Overview table of web page. (b) Subpage for the theme with MTD ID 1066. (c) Jupyter notebook. (d) Alignment tool.
