Extending CLDF — Towards a Type System for Cross-Linguistic Data

Open Access | Apr 2026

Abstract

We argue that, to maximize the reusability of cross-linguistic data, it is useful to think about such data in terms of a type system. Type systems are enforceable rules guiding the interpretation of data in computer programs. Thus, they link data values to the valid operations that can be performed on them. The reusability of research data is determined largely by the availability of suitable analysis methods. A clear idea of cross-linguistic data types will enable the development of analysis methods as well as a mechanism to match valid data with appropriate operations. The Cross-Linguistic Data Formats (CLDF) initiative provides a toolkit to model such cross-linguistic data types, and in recent years a paradigm (and an associated process) has emerged for adding new types to CLDF through stepwise conventionalization. Additionally, data types provide a useful selection criterion to group datasets for unified curation. Thus, a type system for cross-linguistic data will provide actionable metadata to guide data curation and inform data reuse.

DOI: https://doi.org/10.5334/johd.517 | Journal eISSN: 2059-481X
Language: English
Page range: 62 - 62
Submitted on: Jan 30, 2026
Accepted on: Mar 26, 2026
Published on: Apr 29, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Robert Forkel, Johann-Mattis List, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.