KoVox Dataset—A Relational Database of Korean Classical Vocal Performance Ephemera

By: Minji Kim and Eunsoo Lee
Open Access | Apr 2026

Abstract

The KoVox Dataset contains structured data on promotional materials for 1,319 Korean classical vocal performances listed on the KOPIS platform between 2016 and 2025. These digital performance ephemera capture artistic intent, program structure, and performer participation, yet they are often not machine-readable because they are distributed as images. To transform these materials into structured data, we applied a hybrid OCR workflow combining Apple Live Text with ChatGPT-assisted extraction, followed by entity disambiguation using MusicBrainz identifiers. The resulting text was organized into a five-table relational database: performance, work, person, program, and participation. Archived on Zenodo as CSV files together with an SQLite database and SQL schema, KoVox functions as a living, extensible archive that supports comparative and longitudinal studies of South Korea’s evolving vocal music performance culture.
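
As a rough illustration of how a five-table layout of this kind can be queried, the following sketch builds a small in-memory SQLite database. Only the five table names come from the abstract; every column name, the `build_demo_db` and `works_per_year` helpers, and the placeholder rows are assumptions made for illustration and do not reflect the published KoVox schema or its contents.

```python
import sqlite3

# Hypothetical schema: the table names follow the abstract, but all
# column names are illustrative assumptions, not the KoVox schema.
SCHEMA = """
CREATE TABLE performance (
    performance_id INTEGER PRIMARY KEY,
    title TEXT,
    year INTEGER
);
CREATE TABLE work (
    work_id INTEGER PRIMARY KEY,
    title TEXT
);
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name TEXT,
    mbid TEXT  -- assumed column for a MusicBrainz identifier
);
CREATE TABLE program (
    performance_id INTEGER REFERENCES performance(performance_id),
    work_id INTEGER REFERENCES work(work_id),
    sequence_no INTEGER
);
CREATE TABLE participation (
    performance_id INTEGER REFERENCES performance(performance_id),
    person_id INTEGER REFERENCES person(person_id),
    role TEXT
);
"""


def build_demo_db():
    """Create an in-memory database filled with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Placeholder rows only; none of these values come from the dataset.
    conn.execute("INSERT INTO performance VALUES (1, 'Demo recital', 2020)")
    conn.executemany(
        "INSERT INTO work VALUES (?, ?)",
        [(1, "Demo song A"), (2, "Demo song B")],
    )
    conn.execute("INSERT INTO person VALUES (1, 'Demo singer', NULL)")
    conn.executemany(
        "INSERT INTO program VALUES (?, ?, ?)",
        [(1, 1, 1), (1, 2, 2)],
    )
    conn.execute("INSERT INTO participation VALUES (1, 1, 'soprano')")
    conn.commit()
    return conn


def works_per_year(conn):
    """Count programmed works per performance year (a longitudinal view)."""
    query = """
        SELECT p.year, COUNT(*)
        FROM performance AS p
        JOIN program AS g ON g.performance_id = p.performance_id
        GROUP BY p.year
        ORDER BY p.year
    """
    return conn.execute(query).fetchall()
```

In this sketch, `program` and `participation` act as link tables between performances and works or people, which is one common way to model many-to-many relationships; the actual column layout should be taken from the SQL schema archived with the dataset.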

DOI: https://doi.org/10.5334/johd.417 | Journal eISSN: 2059-481X
Language: English
Submitted on: Nov 27, 2025
Accepted on: Mar 11, 2026
Published on: Apr 22, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Minji Kim, Eunsoo Lee, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.