
Figure 1
Extended representation of an illustration provided in Müller (2021, p. 30) showing different original sources (circles) and media (boxes).
Table 1
Proposed conceptual framework of 12 themes of music data, broken down into three phases—phase −1, ‘before’ the music (leading to); phase 0, the ‘actual’ music (itself and around it); and phase +1, ‘after’ the music (uses of and responses to)—and five focus areas. Example sources and media/representations are provided for each.
| Theme | Source examples | Media/representation examples |
|---|---|---|
| Phase −1: ‘Before’ the music | Focus area: ‘Leading to’ the music | |
| Context (§4.1) | Personal biography | Symbolic (text as linked open data) |
| | Socio‑cultural history, network | Symbolic (text as linked open data) |
| | Economic: e.g., commission | Symbolic (text as linked open data) |
| | Technical: e.g., textbooks | Symbolic (text) |
| Preparation (§4.2) | Composition sketches | Image, symbolic (text as version control) |
| | Performer practice | Symbolic (text), video, other signal (motion capture) |
| Phase 0: The ‘actual’ music | Focus area: The music ‘itself’ | |
| Composition (§4.3) | Edition | Image |
| | Score | Symbolic (score) |
| | Live coding | Symbolic (text as code) |
| Performance (§4.4) | Live performance | Audio, image, video, other signal (thermal, seismic time series) |
| | Studio recording | Audio, image, video |
| | Instruments | Symbolic (text as taxonomy) |
| | Playing action: explicit | Symbolic (text) |
| | Playing action: observed | Symbolic (text), video, other signal (motion capture) |
| Focus area: ‘Around’ the music | | |
| Associated media (§4.5) | Album artwork | Image |
| | Lyrics | Symbolic (text) |
| | Promotional photo shoot | Image |
| | ‘Music video’ | Video |
| Phase +1: ‘After’ the music | Focus area: ‘Uses’ of music | |
| Other media (§4.6) | In video games | Symbolic (text as mapping of music cues to game event triggers) |
| | In film, television, advertisements | Video |
| Meta‑composition (§4.7) | In album | Symbolic (text as ordered list) |
| | In playlist | Symbolic (text as ordered list) |
| | In recommender sequence | Symbolic (text as code) |
| | In setlist | Symbolic (text as ordered list) |
| | As sample | Symbolic (text as list of time events) |
| Popularity (§4.8) | Charts | Other signal (time series of rank, dates, sales) |
| | Streams, likes, shares, skips | Other signal |
| Culture/occasion (§4.9) | Specific: e.g., a coronation | Symbolic (text, also as date or GPS location) |
| | Generic: e.g., weddings | Symbolic (text) |
| Focus area: ‘Responses’ to music | | |
| Physiology (§4.10) | Heart rate | Other signal (time series) |
| | Skin conductivity | Other signal (time series) |
| | Brain signals | Other signal (EEG) |
| | Seismic | Other signal (time series) |
| Analysis (§4.11) | Analysis | Symbolic (text as fixed syntax or free prose) |
| | Genre labels | Symbolic (text based on taxonomy) |
| | Journalistic writing | Symbolic (text) |
| | Fan/open writing | Symbolic (text) |
| Legal (§4.12) | Court precedents | Symbolic (text) |
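
As an illustration only, the framework in Table 1 could be encoded as a small lookup structure, for instance to tag dataset components by theme. The sketch below is one possible encoding and not part of the proposed framework itself: the `Theme` type, the `THEMES` list, the short focus-area strings, and the `themes_in_phase` helper are all hypothetical names introduced here, while the theme names, section numbers, phases, and focus areas are taken from the table.

```python
from typing import NamedTuple


class Theme(NamedTuple):
    """One row group of Table 1 (hypothetical encoding)."""
    name: str
    section: str     # section number as cited in Table 1
    phase: int       # -1 = 'before', 0 = 'actual', +1 = 'after' the music
    focus_area: str  # shortened focus-area label (assumed spelling)


# The 12 themes, in table order.
THEMES = [
    Theme("Context", "4.1", -1, "leading to"),
    Theme("Preparation", "4.2", -1, "leading to"),
    Theme("Composition", "4.3", 0, "itself"),
    Theme("Performance", "4.4", 0, "itself"),
    Theme("Associated media", "4.5", 0, "around"),
    Theme("Other media", "4.6", 1, "uses"),
    Theme("Meta-composition", "4.7", 1, "uses"),
    Theme("Popularity", "4.8", 1, "uses"),
    Theme("Culture/occasion", "4.9", 1, "uses"),
    Theme("Physiology", "4.10", 1, "responses"),
    Theme("Analysis", "4.11", 1, "responses"),
    Theme("Legal", "4.12", 1, "responses"),
]


def themes_in_phase(phase: int) -> list[str]:
    """Return the theme names belonging to a given phase, in table order."""
    return [t.name for t in THEMES if t.phase == phase]


if __name__ == "__main__":
    for phase in (-1, 0, 1):
        print(phase, themes_in_phase(phase))
```

A flat list of records keeps the phase and focus area attached to each theme, so grouping by either dimension is a simple filter rather than a nested lookup.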

Figure 2
Excerpt of a timeline analysis (TiLiA) consisting of two hierarchy timelines (‘video cuts’ and ‘Hauptstimme + form’), a beat timeline (‘Beats’), and multiple PDF timelines.

Figure 3
Overview of the 17 proposed criteria, organised into four broad categories, for the design and evaluation of multimodal music datasets.
