Making Chant Computing Easy: CantusCorpus v1.0 and the PyCantus Library

Open Access | Apr 2026

Abstract

Digital Gregorian chant scholarship has, for decades, enjoyed the privilege of a large digital resource cataloguing chant sources: the Cantus ecosystem, with nearly 900,000 chants catalogued across more than 2,000 sources. The Cantus Database data model and the Cantus ID mechanism have been adopted by 18 more chant databases, jointly accessible through the Cantus Index interface. However, these data have only been available piecemeal via the individual online user interfaces, so computational methods have so far had only limited opportunity to process these immense resources. To overcome this hurdle, we compiled CantusCorpus v1.0, a dataset that combines everything that was available across the Cantus Index–centred network of databases as of mid-2025, and we have also provided the code for updating the dataset as the databases grow. We then created the lightweight PyCantus library for working with these data. PyCantus decouples the data model from the Cantus codebase and thus allows integration of further chant data sources, which we illustrate by harmonising pilot data from the Corpus Monodicum project. Computational chant research is attractive, and CantusCorpus v1.0 and PyCantus are infrastructures that should make work in this field more transparent, replicable and accessible to digital humanities practitioners beyond chant scholars themselves.

DOI: https://doi.org/10.5334/tismir.321 | Journal eISSN: 2514-3298
Language: English
Page range: 164 - 178
Submitted on: Jul 1, 2025
Accepted on: Jan 31, 2026
Published on: Apr 29, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Anna Dvořáková, Tim Eipert, Debra Lacoste, Jan Hajič jr, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.