Wikidata and LiLa for Latin: Enabling Interoperability and Access to Inflected Forms and Corpus Attestations

Open Access | Dec 2025

Abstract

This paper presents an approach to integrating Latin inflected forms and corpus attestations within a Linked Open Data (LOD) framework, enhancing interoperability between Wikidata and the LiLa knowledge base. Building on the PrinParLat lexicon of Latin verb principal parts, we generate the complete set of inflected forms for over 8,000 verbs, encoded as RDF in a dedicated Wikibase instance. These forms are linked to the Index Thomisticus Treebank (ITTB), whose morphologically annotated tokens are related to corresponding forms based on segmental identity, lemma alignment, and mapped morphological features. Our generation and linking process achieves over 95% coverage of ITTB verbal tokens, demonstrating the robustness of our pipeline even for Medieval Latin data. By aligning Paralex, Wikidata, and LiLa ontologies, we ensure semantic interoperability and facilitate future integration into Wikidata. Beyond Latin, this workflow provides a reproducible model for linking inflectional paradigms and corpus attestations in other languages.
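The three linking criteria named in the abstract (segmental identity, lemma alignment, and mapped morphological features) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Form` and `Token` record types, their field names, the feature labels, and the sample data are all hypothetical, and the sketch assumes that corpus and lexicon features have already been mapped to a shared tagset.

```python
from dataclasses import dataclass

# Hypothetical record types; field names are illustrative, not the paper's schema.
@dataclass(frozen=True)
class Form:
    form_id: str
    surface: str
    lemma: str
    features: frozenset  # assumed to be already mapped to a shared tagset

@dataclass(frozen=True)
class Token:
    surface: str
    lemma: str
    features: frozenset  # assumed to be already mapped to a shared tagset

def link_token(token, forms):
    """Return the ids of generated forms that match the token on all three criteria."""
    return [
        f.form_id
        for f in forms
        if f.surface == token.surface      # segmental identity
        and f.lemma == token.lemma         # lemma alignment
        and f.features == token.features   # mapped morphological features
    ]

def coverage(tokens, forms):
    """Share of tokens linked to at least one generated form."""
    if not tokens:
        return 0.0
    linked = sum(1 for t in tokens if link_token(t, forms))
    return linked / len(tokens)

# Invented sample data for illustration only.
forms = [
    Form("f1", "amat", "amo", frozenset({"ind", "pres", "act", "3", "sg"})),
    Form("f2", "amant", "amo", frozenset({"ind", "pres", "act", "3", "pl"})),
]
tokens = [
    Token("amat", "amo", frozenset({"ind", "pres", "act", "3", "sg"})),
    Token("amabat", "amo", frozenset({"ind", "impf", "act", "3", "sg"})),
]
```

In this toy data the first token links to form `f1`, while the second has no generated counterpart and so counts against coverage. A real pipeline would presumably also need to handle spelling variation and ambiguous matches, which this sketch ignores.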

DOI: https://doi.org/10.5334/johd.464 | Journal eISSN: 2059-481X
Language: English
Submitted on: Nov 9, 2025
Accepted on: Dec 8, 2025
Published on: Dec 29, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 David Lindemann, Matteo Pellegrini, Francesco Mambrini, Marco Passarotti, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.