Table 1
The first ten open corpora available through the Dezrann platform. Except for SUPRA, all corpora have scores, and most of them have some annotations and synchronized audio. For some corpora, the platform includes only part of the original corpus. In particular, the ‘annotations’ column displays the number of analytical labels visible through Dezrann, usually labels on scores describing patterns, harmony, and/or structure. Datasets with audio components often include additional annotations, such as note‑wise onsets (here incorporated into the synchronization) or even frame‑wise annotations. Synchronizations marked ⋆ are new and were made through the platform. All other data were published elsewhere and adapted here for the platform. Some corpora have pieces with multiple audio recordings (‡). The https://doc.dezrann.net/status web page shows the live status of each corpus as well as links to download data. Further documentation to rebuild the corpora is available from https://doc.dezrann.net/rebuild.
| Corpus, references, licenses | Score annotations | Synchronized audio |
|---|---|---|
| Bach Fugues | ||
| 24 fugues, 1722 ODbL (annotations), CC‑BY‑3.0 (audio), YT (video) (Giraud et al., 2015) | 450 labels fugue structure, cadences, pedals | Two‡ complete audio/video recordings⋆ K. Ishizaka (Open Well‑Temp. Clavier) The Netherlands Bach Soc. (All of Bach) |
| Mozart Piano Sonatas | ||
| 54 movements from 18 sonatas, 1774–1789 CC‑BY‑NC‑SA‑4.0 and ODbL (scores, annotations), YT (audio) (Couturier et al., 2022; Hentschel et al., 2021) | 17800+ labels keys, harmony, cadences, textures | Complete audio recordings⋆ K. Würtz (2006) |
| Mozart String Quartets | ||
| 72 movements in 23 quartets, 1770–1790 ? (scores), ODbL (annotations), CC‑BY‑NC‑ND‑3.0 (audio) (Allegraud et al., 2019) | 2200+ labels sonata form structure, keys, cadences | Recordings⋆ for 8 mvts from 4 quartets Borromeo String Quartet (2009) |
| First Movements of Classical Symphonies | ||
| 24 first movements, 1779–1824 Various licences (scores), ODbL (annotations), Public Domain (audio) (Le et al., 2022) | 6000+ labels sonata form structure, texture analysis | Recordings⋆ for 6 mvts (Haydn) The Royal Phil. Orchestra (1960) |
| 19th-Century Lieder, Female Composers | ||
| 170 lieder (OpenScore Lieder), 1780–1920 CC0‑1.0 (scores, annotations), YT (audio) (Gotham and Jonas, 2021; Gotham et al., 2023b) | 4900+ labels tonality, harmony, phrases on 53 pieces | Recordings⋆ for 25 lieder (Fanny Mendelssohn) L. Kolb, A. Shrut (1992) |
| Schubert Winterreise | ||
| 24 lieder, 1827–1828 CC‑BY‑3.0 (scores, annotations), PDM‑1.0, CC‑BY‑NC‑ND‑3.0 (audio) (Weiß et al., 2021) | 2400+ labels structure, keys, harmony | Two‡ complete recordings G. Hüsch, H.‑U. Müller (1933) R. Scarlata, J. Denk (2006) |
| SUPRA Piano Roll | ||
| 456 piano rolls, 1905–1928 CC‑BY‑NC‑SA‑4.0 (Shi et al., 2019) | – | Rendered expressive audio |
| Slovenian Folk Song Ballads | ||
| 404 transcriptions of ballads, 1819–1995 CC‑BY‑NC‑SA‑4.0 (Borsan et al., 2025) | 2000+ labels contour, structure, harmony | 23 historic recordings⋆ |
| Weimar Jazz Database | ||
| 333 transcriptions of jazz solos, 1925–2009 ODbL (scores, annotations), YT (audio) (Pfleiderer et al., 2017) | 1200+ labels form, chords, phrases, midlevel units | 228 historic recordings (1925–2009) |
| Traditional Georgian Sacred Music | ||
| 101 3‑voice songs with transcriptions, 1966 CC‑BY‑NC‑SA‑4.0 (Rosenzweig et al., 2020) | – | 101 historic recordings A. Erkomaishvili (1966) 4 files‡ per song, mix and sep. voices |

Figure 1
Overview of synchronization types within the Dezrann platform: (a) synchronization between units of musical time (quarter notes and measures) (Gotham et al., 2023a), (b) image‑to‑musical‑time alignment, (c) audio‑to‑musical‑time alignment, and (d) image‑to‑audio‑time alignment. Plain arrows indicate straightforward synchronizations that need only two reference points (e.g., an audio file with a constant tempo) or that are handled automatically by the rendering software. Dashed arrows represent synchronizations that may be derived from explicit data, inferred via Optical Music Recognition (OMR) or score/audio alignment, or manually adjusted using the synchronization editor.
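
As an illustration of audio‑to‑musical‑time alignment, the sketch below maps a musical time to an audio time by piecewise‑linear interpolation between synchronization points. It is a minimal sketch under stated assumptions, not the platform's implementation: the function name `musical_to_audio`, the `(musical_time, audio_time)` pair layout, and all numeric values are hypothetical.

```python
from bisect import bisect_right

def musical_to_audio(onset, sync_points):
    """Map a musical time (quarter notes) to an audio time (seconds).

    sync_points is a hypothetical layout: (musical_time, audio_time) pairs,
    strictly increasing in musical_time, with at least two entries.
    """
    if len(sync_points) < 2:
        raise ValueError("need at least two synchronization points")
    musical = [m for m, _ in sync_points]
    # Find the segment containing `onset`; clamp the index so that values
    # outside the covered range are extrapolated from the first/last segment.
    i = bisect_right(musical, onset) - 1
    i = max(0, min(i, len(sync_points) - 2))
    (m0, a0), (m1, a1) = sync_points[i], sync_points[i + 1]
    return a0 + (onset - m0) * (a1 - a0) / (m1 - m0)

# Constant tempo: two reference points suffice (invented values).
constant = [(0.0, 0.0), (64.0, 32.0)]
# Variable tempo: one extra point models a slowdown in the last 8 quarter notes.
refined = [(0.0, 0.0), (56.0, 28.0), (64.0, 36.0)]

print(musical_to_audio(32.0, constant))
print(musical_to_audio(60.0, refined))
```

With only two points the mapping is a single straight line; each additional point adds a segment with its own local tempo.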

Figure 2
Fugue in E♭ major BWV 852 by J.‑S. Bach, https://www.dezrann.net/~/bach-fugues/bwv852. The score is synchronized with two performances by Pieter‑Jan Belder (left, harpsichord, Netherlands Bach Society) and Kimiko Ishizaka (right, piano, https://welltemperedclavier.org) as well as annotation labels on fugue analysis (Giraud et al., 2015). As both recordings are synchronized to musical time, playback can be switched from one performance to the other.
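
Switching between performances can be pictured as two successive mappings: from the current position in the first recording to musical time, then from musical time to the second recording. The following is a hedged sketch of that idea only; the helper names `interpolate` and `switch_position` and all synchronization values are hypothetical and do not come from the platform or from the actual recordings.

```python
def interpolate(x, xs, ys):
    """Piecewise-linear interpolation of x over increasing knots xs -> ys."""
    if len(xs) < 2 or len(xs) != len(ys):
        raise ValueError("need two or more matching knots")
    for i in range(len(xs) - 1):
        # Use the first segment whose right end covers x, or the last segment.
        if x <= xs[i + 1] or i == len(xs) - 2:
            return ys[i] + (x - xs[i]) * (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])

def switch_position(t_from, sync_from, sync_to):
    """Return the time in the target recording matching t_from in the source.

    Each sync list holds (musical_time, audio_time) pairs, increasing in both.
    """
    m_from, a_from = zip(*sync_from)
    m_to, a_to = zip(*sync_to)
    musical = interpolate(t_from, a_from, m_from)  # audio -> musical time
    return interpolate(musical, m_to, a_to)        # musical time -> other audio

# Invented synchronization data for two recordings of the same piece.
harpsichord = [(0.0, 0.0), (100.0, 50.0)]
piano = [(0.0, 2.0), (100.0, 42.0)]

print(switch_position(25.0, harpsichord, piano))
```

Because both recordings share musical time as a common reference, no direct audio‑to‑audio alignment is needed in this sketch.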

Figure 3
Four musical pieces analyzed with different focuses and ontology conventions: (a) Symphony No. 101 ‘The Clock’ by J. Haydn, annotated with texture labels (Le et al., 2022); (b) ‘Gondellied’ by Fanny Mendelssohn, featuring harmonic analysis (Gotham and Jonas, 2021); (c) ‘Die Wetterfahne’ by Franz Schubert, analyzed harmonically (Weiß et al., 2021); and (d) ‘I Fall in Love Too Easily’, a transcription of Chet Baker’s trumpet solo from the Weimar Jazz Database (Pfleiderer et al., 2017).

Figure 4
Some quality criteria applied to corpora (quality:corpus, top block) and individual pieces (bottom blocks). The complete list of quality criteria is available at https://doc.dezrann.net/quality. Additionally, the corpora status page, at https://doc.dezrann.net/status, provides a summary of quality values for each corpus.

Figure 5
(Top) Extract of the corpora list on https://dezrann.net/corpora, where each corpus is presented with a motto text and a link to a showcase piece. (Bottom) Detailed view of the ‘19th‑Century Lieder, Female Composers’ corpus on https://dezrann.net/explore/openscore-lieder, displaying the corpus description (localized here in Croatian) and a list of included pieces.

Figure 6
Dein ist mein Herz (Op. 7) by Fanny Mendelssohn, from the OpenScore Lieder corpus (Gotham and Jonas, 2021), shown in the synchronization editing process. Color coding is used to distinguish corresponding synchronization points. Regular points, such as those on the strong beats of each measure, can be added using ‘tap’ mode. Here, the synchronization is further refined with extra points, for example, to account for a slowdown before the cadence.

Figure 7
Editing and filtering capabilities. (Top) Slovenian folk song ‘Lansko leto sem se vženiv’ (‘Last Year I Got Married’) with analytical annotations from Borsan et al. (2023). Each label can be selected and modified. (Bottom) First movement of Mozart’s Piano Sonata K279.1, displaying annotations from two different sources (Couturier et al., 2022; Hentschel et al., 2021), which can be filtered by annotation type.

Figure 8
Editing a harmonic analysis at the end of Träumerei (Kinderszenen, Op. 15) by R. Schumann, based on a 1905 piano roll from the SUPRA dataset, featuring a recording by Alfred Grünfeld (Shi et al., 2019).
