
Figure 1
An illustration of the components of the Raveform dataset. Each DJ mix comes with a track list, estimated beats for both the mix and its tracks, and beat-level alignments between the mix and each track. Mix points are extracted from these alignments. A subset of tracks is manually annotated with structure: beats, downbeats, and functional segment boundaries with labels.
Table 1
Summary of the dataset components.
| File | Content |
|---|---|
| mixes.jsonl | Metadata for DJ mixes, including track lists. |
| tracks.jsonl | Metadata for individual tracks that appear in the mixes. |
| alignments/*.jsonl | Mix‑to‑track alignments for each mix, including estimated mix points. |
| beats/mixes/*.json, beats/tracks/*.json | Beats estimated for DJ mixes and tracks. |
| structures/beats/*.csv, structures/segments.json | Human‑annotated structure labels for a subset of tracks. |
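The layout in Table 1 can be navigated with a few lines of Python. The following is a minimal sketch, not the dataset's official loader: the helper names `read_jsonl` and `dataset_paths` are hypothetical, the nesting of the subdirectories is inferred from the table, and no assumption is made about the fields inside each record.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a .jsonl file."""
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)


def dataset_paths(root):
    """Map component names to paths under a local copy of the dataset.

    The directory nesting is an assumption read off Table 1.
    """
    root = Path(root)
    return {
        "mixes": root / "mixes.jsonl",
        "tracks": root / "tracks.jsonl",
        "alignments": root / "alignments",            # *.jsonl, one per mix
        "mix_beats": root / "beats" / "mixes",        # *.json
        "track_beats": root / "beats" / "tracks",     # *.json
        "structure_beats": root / "structures" / "beats",  # *.csv
        "segments": root / "structures" / "segments.json",
    }
```

With a local copy in a directory such as `raveform/` (a placeholder path), `list(read_jsonl(dataset_paths("raveform")["mixes"]))` would return the mix metadata as a list of dictionaries.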

Figure 2
A representative structural pattern commonly observed in the proposed dataset, with darker regions indicating higher energy.
Table 2
A summary of the vocabulary for segment labels.
| Name | Characteristic Summary |
|---|---|
| Intro | Appears at the beginning. Primarily consists of percussive sounds, designed to facilitate seamless mixing for DJs. |
| Buildup | Typically precedes a breakdown or a drop. Gradually increases energy by introducing musical elements one by one. |
| Breakdown | Appears before a drop. Characterized by a sudden drop in energy to heighten contrast with the upcoming drop. |
| Drop | The main section of the track, conveying its core musical idea. It is the most energetic and danceable part, usually featuring all instruments together. |
| Cooldown | Follows a drop and precedes a breakdown or outro. Gradually reduces energy, functioning as the opposite of a buildup. |
| Bridge | Often occurs between two breakdowns before the final drop. Offers contrast and builds anticipation for the final drop. Rare in this dataset. |
| Outro | Occurs at the end of the track. Functions as the opposite of an intro, yet similarly features percussive sounds to aid in mixing. |
| Ambient‑intro | An intro section without percussive elements. Beats are nearly unrecognizable; the section consists mainly of melodic, harmonic, or ambient textures. |
| Ambient‑outro | An outro section without percussive elements. Sonically similar to an ambient‑intro. |
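For downstream code, the vocabulary in Table 2 can be held as a small set of constants. This is a sketch only: the lowercase ASCII spellings and the helper names are assumptions made for illustration, and the label strings actually stored in the annotation files may differ.

```python
# The nine segment labels from Table 2 (spellings are assumed, not confirmed).
SEGMENT_LABELS = (
    "intro", "buildup", "breakdown", "drop", "cooldown",
    "bridge", "outro", "ambient-intro", "ambient-outro",
)

# Sections described as lacking percussive elements.
AMBIENT_LABELS = frozenset({"ambient-intro", "ambient-outro"})

# Sections described as percussive and intended to aid DJ mixing.
MIXABLE_LABELS = frozenset({"intro", "outro"})


def normalize_label(name):
    """Lowercase a label, map non-breaking hyphens to '-', and validate it."""
    key = name.strip().lower().replace("\u2011", "-")
    if key not in SEGMENT_LABELS:
        raise ValueError(f"unknown segment label: {name!r}")
    return key


def is_ambient(name):
    """Return True for the two ambient (non-percussive) labels."""
    return normalize_label(name) in AMBIENT_LABELS
```

Validating labels up front makes a mismatch between the assumed spellings and the real files fail loudly instead of silently dropping segments.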
Table 3
Summary statistics of the Raveform dataset.
| Statistic | Value |
|---|---|
| Number of mixes | 4,902 |
| Number of unique tracks | 56,873 |
| Number of played tracks | 73,505 |
| Number of tracks with structural annotations | 1,423 |
| Total length of mixes (hours) | 6,522 |
| Number of available transitions | 53,780 |
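A few per-mix averages follow directly from the totals in Table 3. The short sketch below derives them; the variable names are illustrative, and the ratios are simple divisions of the reported counts rather than figures stated by the authors.

```python
# Totals copied from Table 3.
n_mixes = 4_902
n_unique_tracks = 56_873
n_played_tracks = 73_505
n_annotated_tracks = 1_423
total_mix_hours = 6_522
n_transitions = 53_780

# Derived averages (plain ratios of the totals above).
avg_mix_minutes = total_mix_hours / n_mixes * 60        # about 80 minutes per mix
avg_played_per_mix = n_played_tracks / n_mixes          # about 15 played tracks per mix
avg_transitions_per_mix = n_transitions / n_mixes       # about 11 transitions per mix
annotated_fraction = n_annotated_tracks / n_unique_tracks  # about 2.5% of unique tracks

print(f"{avg_mix_minutes:.1f} min/mix, "
      f"{avg_played_per_mix:.1f} played tracks/mix, "
      f"{avg_transitions_per_mix:.1f} transitions/mix, "
      f"{annotated_fraction:.1%} of unique tracks annotated")
```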

Figure 3
The genre distribution of the DJ mixes.

Figure 4
The genre distribution of the structure‑annotated tracks.

Figure 5
The tempo distribution of the structure‑annotated tracks in beats per minute.

Figure 6
The distribution of the fraction of identified tracks per DJ mix.

Figure 7
The distribution of the fraction of correctly aligned beats.

Figure 8
The distribution of the number of segments in the structure‑annotated tracks.

Figure 9
The number of segments for each segment label.

Figure 10
The distribution of segment length in seconds (top) and measures (bottom).

Figure 11
The common structure of EDM tracks.

Figure 12
Average loudness (dB) values across frequency bands for each segment label.
Table 4
Cross‑dataset evaluation results for metrical and functional structure analysis, tested on the Harmonix Set and Raveform. Boldface indicates the best performance for each metric on each test set. * denotes scores achieved with label mappings (see Section 6.1.3 for details).
| Training Set (Genre) | Model | Beat F1 | Beat CMLt | Beat AMLt | Downbeat F1 | Downbeat CMLt | Downbeat AMLt | Segment HR.5F | Label PWF | Label Sf |
|---|---|---|---|---|---|---|---|---|---|---|
| Tested on Harmonix set (Popular Music) | ||||||||||
| Harmonix (Pop) | All‑In‑One | 0.958 | 0.913 | 0.964 | 0.915 | 0.873 | 0.932 | 0.660 | 0.738 | 0.769 |
| Raveform (EDM) | All‑In‑One | 0.810 | 0.622 | 0.856 | 0.727 | 0.591 | 0.792 | 0.509 | 0.533* | 0.543* |
| Both (Pop+EDM) | All‑In‑One | 0.953 | 0.895 | 0.963 | 0.921 | 0.876 | 0.939 | 0.659 | 0.720 | 0.751 |
| 10 Datasets (Various) | Madmom | 0.941 | 0.859 | 0.955 | 0.805 | 0.756 | 0.882 | – | – | – |
| Tested on Raveform (EDM) | ||||||||||
| Raveform (EDM) | All‑In‑One | 0.991 | 0.985 | 0.991 | 0.965 | 0.964 | 0.971 | 0.835 | 0.847 | 0.890 |
| Harmonix (Pop) | All‑In‑One | 0.930 | 0.890 | 0.918 | 0.753 | 0.746 | 0.812 | 0.635 | 0.542* | 0.643* |
| Both (Pop+EDM) | All‑In‑One | 0.990 | 0.985 | 0.989 | 0.967 | 0.965 | 0.971 | 0.835 | 0.842 | 0.890 |
| 10 Datasets (Various) | Madmom | 0.947 | 0.930 | 0.938 | 0.669 | 0.678 | 0.792 | – | – | – |
