Making Chant Computing Easy: CantusCorpus v1.0 and the PyCantus Library

Anna Dvořáková; Tim Eipert; Debra Lacoste; Jan Hajič jr

doi:10.5334/tismir.321

Transactions of the International Society for Music Information Retrieval

Volume 9 (2026): Issue 1

Open Access

Apr 2026

Abstract

Digital Gregorian chant scholarship has, for decades, enjoyed the privilege of a large digital resource cataloguing chant sources: the Cantus ecosystem, with nearly 900,000 chants catalogued across more than 2,000 sources. The Cantus Database data model and the Cantus ID mechanism have been adopted by 18 more chant databases, jointly accessible through the Cantus Index interface. However, these data have only been available piecemeal via the individual online user interfaces, so computational methods have so far had only limited opportunity to process these immense resources. To overcome this hurdle, we compiled CantusCorpus v1.0, a dataset that combines everything that was available across the Cantus Index–centred network of databases as of mid-2025, and we have also provided the code for updating the dataset as the databases grow. We then created the lightweight PyCantus library for working with these data. PyCantus decouples the data model from the Cantus codebase and thus allows integration of further chant data sources, which we illustrate by harmonising pilot data from the Corpus Monodicum project. Computational chant research is attractive, and CantusCorpus v1.0 and PyCantus are infrastructures that should make work in this field more transparent, replicable and accessible to digital humanities practitioners beyond chant scholars themselves.


DOI: https://doi.org/10.5334/tismir.321 | Journal eISSN: 2514-3298

Language: English

Page range: 164 - 178

Submitted on: Jul 1, 2025

Accepted on: Jan 31, 2026

Published on: Apr 29, 2026

Published by: Ubiquity Press

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: Gregorian chant, computational musicology, digital musicology, dataset harmonisation

© 2026 Anna Dvořáková, Tim Eipert, Debra Lacoste, Jan Hajič jr, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.
