
The Reusability of Resources in Language-Specific Contexts: The SADiLaR Repository as a Case Study
Abstract
The reuse of existing language resources is increasingly positioned as a foundational principle of responsible, sustainable, and FAIR-aligned research in linguistics and digital humanities. In practice, however, the reuse of language resources remains difficult to trace and has historically been poorly supported by standardised citation practices. This position paper presents a meta-analysis of reuse in general and argues that current approaches to assessing the reuse of language resources underestimate how much reuse actually occurs.
Focusing on the South African Centre for Digital Language Resources (SADiLaR) repository as a case in point, the paper contends that repository infrastructure alone, including persistent identifiers, rich metadata, and long-term preservation, is insufficient to ensure visible and measurable reuse. Researchers may be unaware of the existence of relevant resources, cite secondary publications instead of datasets, or reuse data in ways that leave no explicit trace in scholarly outputs. As a result, language resources appear underutilised, despite evidence of their implicit integration into research workflows.
To substantiate this position, the authors report on a manual, exploratory effort to identify instances where SADiLaR-hosted resources are referenced in academic works. The labour-intensive and error-prone nature of this process underscores the need for systematic pathways to track resource reuse.
The paper argues for a shift in how language resource reuse is conceptualised, supported, and evaluated. It calls for the active adoption of dataset citation norms and greater awareness among researchers of existing resources. In doing so, the paper positions language repositories not merely as storage infrastructures, but as active participants in shaping sustainable and transparent research ecosystems.
© 2026 Benito Trollip, Michelle White, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.