From Fragmented Data to Linked History: Developing the FAIR Epigraphic Vocabularies

Open Access
|Dec 2025

(1) Context and motivation

The proliferation of digital epigraphic projects over the past two decades has made hundreds of thousands of ancient inscriptions accessible online, alongside rich metadata about their physical carriers and provenance. However, several competing conceptualisations of epigraphic terminology have emerged across projects, making it challenging for researchers to query across databases or aggregate results from multiple sources (Bodel et al., 2024; Heřmánková et al., 2021; Tupman, 2021). The FAIR Epigraphy project,1 an international collaboration of domain experts in epigraphy and digital humanities, addresses this challenge by developing a comprehensive framework for standardised epigraphic data representation using CIDOC CRM.2 Bidirectional integration with Wikidata extends this framework’s reach beyond the project’s primary focus on Greek and Latin inscriptions, enabling connections with epigraphic materials across multiple writing systems and cultural contexts.3 Wikidata integration also enhances discoverability: researchers and cultural heritage projects can access FAIR Epigraphic vocabularies through Wikidata’s infrastructure, whilst epigraphic databases can leverage Wikidata’s multilingual labels and cross-references to other knowledge bases.

(1.1) The Fragmentation Challenge in Digital Epigraphy

Digital epigraphy has matured considerably since the introduction in the early 2000s of EpiDoc, a TEI XML customisation tailored to recording inscriptions (Elliott et al., 2017–2025). Major projects such as the Epigraphic Database Heidelberg4 and the Epigraphic Database Rome5 have made significant portions of the epigraphic record computationally accessible. However, as documented in our 2022 scoping survey (Heřmánková et al., 2022), most projects employ project-specific data models, terminologies, and classification systems, often without providing machine-readable metadata, formal definitions, or alignment with existing standards. This heterogeneity creates substantial obstacles for researchers seeking to aggregate data sources or conduct large-scale quantitative analyses (Heřmánková et al., 2021; Laurence & Trifilò, 2023; Mullen & Willi, 2024). Different scholarly traditions have developed distinct approaches to categorising inscription types, material supports, and writing techniques. Without shared vocabularies and conceptual frameworks, even basic comparative questions require manual reconciliation of divergent classification schemes. This labour-intensive process was first undertaken by the EAGLE Europeana Project, an international consortium that ran between 2013 and 2016, to enable cross-project integration of epigraphic data (Orlandi et al., 2014).

(1.2) The FAIR Imperative and Community Requirements

The FAIR principles (Wilkinson et al., 2016) have become increasingly central to research data management across humanities disciplines. Our 2022 community survey revealed that the majority of surveyed projects in epigraphy were familiar with FAIR principles, but their compliance was varied and depended heavily on the resources available (Heřmánková et al., 2022). Only a minority had implemented concrete measures for interoperability, such as providing data in RDF format or using standardised vocabularies. This gap between awareness and implementation resonates with the explicit, yet still rare, discussion of FAIR principles implementation challenges within the Text Encoding Community and, more specifically, in epigraphy (Creamer et al., 2021). The survey respondents identified controlled vocabularies as one of the most critical infrastructure gaps, emphasising that vocabularies must be developed collaboratively rather than imposed top-down. Alongside identifying this shortcoming, there was a strong desire to avoid limiting discussion about standard vocabularies to a small, exclusive group. Rather, respondents emphasised treating vocabulary development as a central undertaking affecting all projects and individual researchers, one that actively encourages community involvement throughout the process.

The adoption of Linked Open Data (LOD) approaches offers a pathway to FAIR compliance whilst preserving project autonomy. The EAGLE Europeana Project pioneered this approach (Liuzzo et al., 2013; Liuzzo, 2015, 2019), developing controlled vocabularies between 2013 and 2016 that established foundational terminology for inscription types, materials, object types, execution technique, dating criteria, decoration and state of preservation.6 Following their partial integration into Wikidata in 2015,7 these vocabularies provided crucial groundwork for interoperability in the field. After several years of use and implementation by major digital databases, active users identified the need to consolidate, extend, and align the EAGLE vocabularies with contemporary semantic web standards to meet current FAIR requirements and support bidirectional data exchange (Prag et al., 2022). The FAIR Epigraphy project emerged from this recognised need to update and expand the existing infrastructure.

(1.3) Strategic Alignment

The FAIR Epigraphic vocabularies framework,8 aligned with existing digital resources, positions digital epigraphy within the broader landscape of cultural heritage linked data. Alignment with the Getty Art & Architecture Thesaurus (AAT)9 ensures compatibility with museum and archaeological data systems, whilst the CIDOC CRM-based ontology provides a robust conceptual framework for modelling complex relationships between inscriptions, objects, persons, places, and events.10 Additionally, the FAIR Epigraphic vocabularies maintain explicit alignment with the EAGLE vocabularies, which are partially present in Wikidata,11 and with other Wikidata items; for alignment examples, see Section 3.4.2. This multi-layered approach reflects the reality that inscriptions exist at the intersection of multiple domains – textual, material, historical, and spatial – and acknowledges the conceptual differences between existing knowledge-recording infrastructures.

(1.3.1) EAGLE in Wikidata

Property P1900 (Eagle ID) functions as an identifier linking Wikidata items to epigraphic concepts within the EAGLE vocabulary system. As of October 2025, Wikidata contains only a subset of the EAGLE vocabularies: P1900 comprises 698 items and seven qualifiers, representing 18.6% of the total 3,747 EAGLE vocabulary entries. Most entries were uploaded in 2015 by Pietro Liuzzo of the EAGLE project and have been gradually expanded by Wikidata contributors, but coverage remains far from complete. As an early-stage integration effort, this work naturally exhibits some structural challenges: entries occasionally overlap or duplicate concepts, and hierarchical relationships and definitions remain underdeveloped (see Sections 3.2 and 3.4.2 for examples). These characteristics reflect the exploratory nature of this preliminary implementation, which nonetheless provides a valuable foundation for future enhancement of the EAGLE vocabulary and its incorporation within Wikidata.

(2) Dataset description

The FAIR Epigraphic vocabularies are published as openly accessible linked open data resources, designed for computational analysis and semantic web integration. Building upon the EAGLE vocabularies, they address structural issues in existing resources by eliminating duplicate entries, establishing clear hierarchical relationships, and providing comprehensive definitions and usage examples for each concept. Rather than attempting to revise EAGLE vocabularies from 2015 retroactively, a thorough community consultation in 2020–2021 established consensus for developing enhanced vocabularies with stable URIs that maintain backward compatibility. This approach ensures that existing projects linking to EAGLE vocabularies can benefit from improvements whilst preserving their current integrations, and enables future projects to adopt more robust semantic foundations from the outset.

Repository location

The vocabularies are available at https://ontology.inscriptiones.org/ with stable URIs for each vocabulary term. The website is hosted via GitHub at https://github.com/FAIR-epigraphy/ontology and maintained by the Centre for the Study of Ancient Documents (CSAD), University of Oxford. Individual sets of vocabularies are assigned their own DOIs for citation and persistent identification.

Repository name

Zenodo, FAIR Epigraphy community.12

Object name

FAIR Epigraphic vocabularies, comprising individual controlled vocabularies for epigraphic data representation, e.g., type of inscription https://doi.org/10.5281/zenodo.17454135 (Heřmánková et al., 2025), epigraphic bilingualism https://doi.org/10.5281/zenodo.17569510 (Mullen, 2025).

Format names and versions

RDF/Turtle (.ttl) files conforming to W3C standards.

Creation dates

Development began in 2022, first published in 2024, with ongoing expansion throughout 2025.

Dataset creators

Individual vocabularies list their specific creators and contributors in their respective VoID (Vocabulary of Interlinked Datasets) metadata files, along with their ORCID identifiers. Petra Heřmánková is responsible for the development of vocabularies, Jonathan Prag for the development of the ontology and Imran Asif for the technical development.

Language

English, with multilingual labels planned for future releases (fr, de, es, la). URI patterns and technical documentation in English.

License

Individual vocabularies within the framework are released under different open licenses depending on the originating project and authors, with details specified in the VoID metadata. Preference is given to the CC BY-SA 4.0 license.

Publication date

Initial vocabularies published 2024-04-01, with continuous updates.

Scope and coverage

The vocabularies currently cover inscription types (building on the EAGLE typology) and classifications for bilingualism in epigraphy (see Section 3.3). Planned expansion includes writing techniques (December 2025), dating criteria (December 2025), object types (January 2026) and materials (February 2026). Each vocabulary term includes a stable URI, a definition, examples from major epigraphic databases, and mappings to EAGLE, EpiVoc (Brunet et al., 2022), Wikidata, and Getty AAT where applicable.

(3) Method

This Section describes the methodological approach employed to develop the FAIR Epigraphic vocabularies, from community requirements gathering through technical implementation and integration with external resources.

(3.1) Community Requirements and Collaborative Development

Development commenced with a comprehensive scoping survey conducted between February and April 2022, gathering responses from digital epigraphic projects across Europe and North America (Heřmánková et al., 2022). The survey identified controlled vocabularies as the most critical infrastructure gap, with projects emphasising that vocabularies must emerge from collaborative, bottom-up processes reflecting diverse scholarly traditions. Through community consultation between 2020 and 2021, the Epigraphy.info Vocabularies Working Group13 had already determined that developing enhanced vocabularies with improved structure and backward compatibility would serve both existing and future projects better than attempting to revise the completed EAGLE vocabularies. The FAIR Epigraphy project implemented this community feedback by consolidating existing resources into an enhanced framework that addresses interoperability challenges within the discipline and enables integration with the broader semantic web ecosystem.

(3.2) Vocabulary Consolidation and Extension

The EAGLE vocabularies,14 developed between 2013 and 2016 (Liuzzo et al., 2013; Liuzzo, 2015), provided crucial groundwork but required substantial consolidation and extension. The original EAGLE vocabularies suffered from structural limitations: a flat hierarchy that obscured conceptual relationships, numerous duplicate entries, and conceptual conflation with other vocabulary categories, e.g., graffiti or instrumentum domesticum as a type of inscription,15 instead of technique or type of inscribed object. Moreover, the lack of detailed definitions elucidating their scope created ambiguity for data creators (e.g., what constitutes a ‘proskynema’ inscription).16 Where applicable, we systematically reviewed each EAGLE vocabulary term, identifying and resolving duplicates whilst establishing hierarchical relationships using SKOS (Simple Knowledge Organisation System) properties, including broader, narrower, and related terms, to express hierarchical and associative connections. Terms were verified against usage in major epigraphic databases, reflecting the status in 2025, and where projects employed divergent terminology for equivalent concepts, we conducted alignment exercises to establish mappings based on close reading and evaluation of inscriptions and consultations with resource owners and maintainers. We expanded coverage to accommodate functional classifications underrepresented in the original scheme, particularly for multilingual contexts. Each term includes formal definitions in English, scope notes clarifying boundaries with related concepts, and additional verified usage examples drawn from published corpora.

(3.3) Vocabulary Content and Coverage

The Type of Inscription vocabulary (Heřmánková et al., 2025) categorises inscriptions according to their primary societal function and content, encompassing 122 vocabulary terms organised under 15 top-level categories, ranging from official legal documents and religious dedications to commercial records and personal commemorations. The vocabulary prioritises functional categories over formal characteristics, enabling analysis of epigraphic practice across contexts and periods. Building upon EAGLE Project foundations (Liuzzo et al., 2013), it consolidates the original classification through systematic review of duplicates and establishment of hierarchical relationships, as well as comprehensive scholarship-based definitions.

The Bilingualism in epigraphy vocabulary (Mullen, 2025) classifies inscriptions indicating concurrent use of two or more languages. The 33 vocabulary items, structured into language-focused and script-focused subcategories, address a critical gap: whilst digital projects increasingly recognise bilingual features, systematic tagging using an established schema has been lacking. The vocabulary was developed by Mullen, building upon Adams (2003) and engaging with modern sociolinguistic scholarship; its terms include bibliography and usage examples linked to the LatinNow webGIS database.17

Planned vocabulary releases throughout 2025 and early 2026 address further core epigraphic categories: Execution Technique (methods of physical text production), Inscribed Object (physical support typologies), and Dating Criteria (evidence types for chronological establishment).18 These vocabularies build upon the foundational EAGLE categories, applying the same consolidation methodology of removing duplicates, establishing hierarchical relationships, providing comprehensive definitions with usage examples, and maintaining SKOS-based alignments.

(3.4) Multi-Platform Integration Strategy

Linking several knowledge bases positions epigraphic data within multiple overlapping semantic networks. The FAIR Epigraphic vocabularies maintain explicit alignments with the EAGLE vocabularies, their Wikidata references, and Wikidata items not originated by EAGLE. This alignment using SKOS mapping properties (exactMatch, closeMatch, broadMatch, narrowMatch, relatedMatch) allows for the expression of precise semantic relationships, whilst providing enhanced structure and extended coverage beyond the original EAGLE terms. Alignment with reference thesauri, such as Getty AAT, or EpiVoc (Brunet et al., 2022), ensures compatibility with museum and archaeological data systems, particularly for materials, object types, and techniques. The result of predominantly manual evaluation of the concepts, these alignments use SKOS mapping properties to document semantic relationships between FAIR and related terms in other knowledge bases. This multi-vocabulary alignment strategy maximises discoverability and enables cross-domain queries bridging epigraphy, archaeology, art history, and museum collections, as well as improving transparency within the disciplines.
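As a minimal sketch of what such SKOS mapping statements look like in serialised form, the following Python snippet emits N-Triples lines (which are also valid Turtle) for one concept. The concept URI under `example.org` is hypothetical, and the two mappings shown are purely illustrative rather than the project's actual alignments; only the Wikidata Q-numbers and the five SKOS mapping property names are taken from this article.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
MAPPING_PROPERTIES = (
    "exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch",
)


def mapping_triples(concept_uri, mappings):
    """Return N-Triples lines linking one concept to external targets.

    mappings is an iterable of (skos_property, target_uri) pairs.
    """
    lines = []
    for prop, target in mappings:
        if prop not in MAPPING_PROPERTIES:
            raise ValueError(f"not a SKOS mapping property: {prop}")
        lines.append(f"<{concept_uri}> <{SKOS}{prop}> <{target}> .")
    return lines


# Hypothetical concept URI; the choice of relation per target is invented.
concept = "https://example.org/vocab/funerary-inscription"
triples = mapping_triples(concept, [
    ("closeMatch", "http://www.wikidata.org/entity/Q121614747"),
    ("relatedMatch", "http://www.wikidata.org/entity/Q1772"),
])
print("\n".join(triples))
```

Restricting the function to the five mapping properties mirrors the design choice described above: the strength of each alignment is stated explicitly instead of being collapsed into a single generic link.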

(3.4.1) Wikidata as a Central Knowledge Graph

A central component of our integration strategy is Wikidata, a free, collaborative, and multilingual knowledge graph that functions as a central repository for structured data (Vrandečić & Krötzsch, 2014). Its structure is fundamentally entity-centric, meaning every concept, person, place, or object is represented as a unique ‘item’ (identified by a ‘Q-number,’ such as Q121614747 for ‘funerary inscription’). These items are then linked by ‘properties’ (identified by ‘P-numbers’) to create simple, factual statements (Ripoll et al., 2025). This structure makes Wikidata a powerful tool for authority control and for linking together different datasets (Vrandečić & Krötzsch, 2014). The planned bidirectional workflow links the FAIR Epigraphic vocabularies’ URIs to matching Wikidata items via a new dedicated property (P) for authority control; see Section 3.4.2 for the exact steps. This connection helps position our domain-specific terms within Wikidata’s vast, cross-disciplinary network, making them easier to find and use. That said, as we explore further in Section 4, Wikidata’s entity-centric approach poses real challenges when we try to align it with our own event-based ontology using the same SKOS-based alignment.19

(3.4.2) Bidirectional integration: ‘Type of inscription’ vocabulary

Bidirectional integration with Wikidata is currently in development. This bidirectional model will enable epigraphic projects to query FAIR Epigraphic vocabularies for standardised terms whilst accessing broader contextual information from Wikidata, or from external sources linked in Wikidata, for example, retrieving biographical data about honorands or geographical information about findspots. As a first step, we established inverse links in the FAIR RDF files pointing to corresponding Wikidata Q-numbers.

In order to illustrate the workflow and current state of development, we briefly review the coverage in Wikidata across the FAIR and EAGLE vocabularies, with specific reference to the ‘type of inscription’ vocabulary, one subset of the FAIR Epigraphic vocabularies. The vocabulary contains 334 items in EAGLE, of which 82 are present in Wikidata (24.5% coverage). The FAIR ‘type of inscription’ vocabulary contains 122 items; the lower number of items compared to EAGLE is a result of deduplication and reconceptualisation of some of the terms. Out of the 82 EAGLE items existing in Wikidata, we identified 40 as having an equivalent in the FAIR vocabulary. We linked them in the FAIR RDF file, using the SKOS properties exactMatch (0 terms), closeMatch (33), relatedMatch (29), broadMatch (1), and narrowMatch (29). Preference was given to wider mapping (close, instead of exactMatch), due to the insufficient definitional clarity in some of the existing EAGLE and Wikidata items. However, the brevity of textual definitions seems to be a common problem in many sources, plausibly due to their general scope.

For example, the type of inscription ‘Defixio’, a curse (tablet), has the following definition in EAGLE: “Also listed by Anne Glock from volumes II/7 (Corduba); VI 8, 2 and 3 (Rome); XIII (Gaul and Germania); XVII/4,1 (Milestones) in CIL.”20 In Wikidata, the relatedMatch ‘curse tablet’21 has a definition “type of votive tablet”, which is technically not incorrect, but it neglects the magic aspect of the text. The FAIR Epigraphic vocabularies provide the following definition “Curse tablets or binding spells, aimed at causing harm to others or seeking revenge. These thin lead tablets, often pierced with a nail and usually rolled up, contain curses of people or animals, written in cursive script and labelled with magical signs. They were found on riverbanks, in wells and especially in graves. The anonymous texts are not addressed to the public, but to chthonic deities such as Hecate or the dii inferi. Despite their formulaic wording, these tablets offer scope for personalisation [Schmidt 2004, 48–50].”22 The need to develop a detailed description resulted from comprehensive community and literature scoping, which revealed no agreed-upon definitions or a clearly defined source of truth, despite the discipline’s centuries-long existence. Additionally, in this particular instance, we provided 15 examples of curse tablets from major online databases, including texts both in Greek and Latin, so users can make their own informed decision (see Figure 1 for the full record).

Figure 1

FAIR Epigraphic vocabularies entry for the ‘Defixio’, curse (tablet) type of inscription, showing SKOS-mapped aligned resources, 15 examples, bibliographic references, and translations with common variations of the term in multiple languages (English, German, Latin, Italian).

Linking to Wikidata records beyond EAGLE, we have identified that 104 FAIR items have at least one equivalent, however distant, extending the coverage of FAIR in Wikidata to 85%: SKOS exactMatch (0 items), closeMatch (49), relatedMatch (64), broadMatch (3), narrowMatch (13). Eighteen FAIR ‘type of inscription’ items do not currently exist in Wikidata and will need to be created. These mostly relate to highly domain-specific categories, such as healing texts,23 mortgage stones,24 or proskynema25 inscriptions. After additional mapping, we were unable to associate six out of 122 items with any existing vocabulary outside of Wikidata, including Getty AAT or EAGLE. In those cases, we have provided multiple examples from epigraphic databases and bibliographic citations, so that users can make informed decisions themselves, e.g., the case of boundary disputes.26
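The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts reported above; the observation that the per-property counts add up to more than 104 is our reading of the numbers and would be consistent with single items carrying more than one mapping, though the source does not state this explicitly.

```python
def coverage_percent(mapped, total):
    """Share of vocabulary items with a counterpart, as a whole percentage."""
    return round(100 * mapped / total)


fair_items = 122          # FAIR 'type of inscription' items
items_with_match = 104    # items with at least one Wikidata equivalent

# Mapping counts per SKOS property, as reported in the paragraph above.
mapping_counts = {
    "exactMatch": 0,
    "closeMatch": 49,
    "relatedMatch": 64,
    "broadMatch": 3,
    "narrowMatch": 13,
}

pct = coverage_percent(items_with_match, fair_items)
to_create = fair_items - items_with_match
total_links = sum(mapping_counts.values())

print(pct, to_create, total_links)
```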

Once the active development of new FAIR vocabularies is completed in early 2026, we plan to implement the second step of the bidirectional linking process in three stages:

  1. Propose a new authority control property (P) called FAIR Epigraphic vocabularies ID;

  2. Link existing Wikidata records with FAIR Epigraphic vocabularies URIs, using the SKOS generic mapping relation property (P4390) with the following values: exact match (Q39893449), close match (Q39893184), broad match (Q39894595), narrow match (Q39893967), and related match (Q39894604);

  3. Create new items (Q) for the FAIR Epigraphic vocabularies’ terms not already represented in Wikidata.
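The three stages above can be sketched as plain data, independent of any particular upload tool. In the snippet below, the Q-numbers for the SKOS relations and the qualifier property P4390 come from the list above; `P_FAIR_ID` is a placeholder because the authority control property has not yet been proposed, the identifier value is invented, and attaching P4390 as a qualifier to the identifier statement is an assumption about how stages 1 and 2 would combine.

```python
MAPPING_RELATION_PROPERTY = "P4390"

# Wikidata items for the SKOS mapping relations, as listed in stage 2.
SKOS_RELATION_ITEMS = {
    "exactMatch": "Q39893449",
    "closeMatch": "Q39893184",
    "broadMatch": "Q39894595",
    "narrowMatch": "Q39893967",
    "relatedMatch": "Q39894604",
}

# Placeholder only: the property does not exist yet (stage 1).
FAIR_ID_PROPERTY = "P_FAIR_ID"


def planned_statement(wikidata_item, fair_id, skos_relation):
    """Describe one identifier statement with its mapping-relation qualifier."""
    return {
        "item": wikidata_item,
        "property": FAIR_ID_PROPERTY,
        "value": fair_id,
        "qualifiers": {
            MAPPING_RELATION_PROPERTY: SKOS_RELATION_ITEMS[skos_relation],
        },
    }


# Q1772 ('epitaph') is mentioned in this article; the FAIR identifier and
# the choice of relatedMatch are invented for illustration.
statement = planned_statement("Q1772", "hypothetical-term-id", "relatedMatch")
print(statement)
```

Keeping the plan as tool-neutral records means the same list could later be reviewed by domain experts before being converted into whatever batch-editing format is chosen.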

(3.5) Technical Infrastructure and Quality Control

The vocabularies are published as RDF/Turtle files conforming to W3C standards, with each term assigned a stable, dereferenceable URI following linked data best practices. The technical infrastructure comprises a GitHub repository for version control and collaborative development, a web interface providing human-readable documentation,27 and VoID (Vocabulary of Interlinked Datasets) metadata documenting provenance, licensing, and creator information. Individual vocabularies receive their own DOIs to enable precise citation via Zenodo. Quality control encompasses both technical validation (automated checks for RDF syntax correctness, URI stability, and conformance to SKOS and CIDOC CRM specifications) and scholarly review by the domain experts (Epigraphy.info Vocabularies Working Group, FAIR Epigraphy partners). We maintain a transparent issue-tracking system via GitHub, enabling community members to report errors or propose modifications, with changes undergoing review by vocabulary creators before incorporation.
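To illustrate the kind of automated check mentioned here, the following sketch validates a single vocabulary term held as a Python dictionary. It is not the project's validation pipeline: the required field names, the `https` rule, and the sample term are all assumptions made for the example.

```python
import re
from urllib.parse import urlparse

# Assumed minimum content for a term; field names are hypothetical.
REQUIRED_FIELDS = ("uri", "prefLabel", "definition")


def check_term(term):
    """Return a list of problems found in one term (empty list means it passes)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(term.get(field, "")).strip():
            problems.append(f"missing {field}")
    uri = str(term.get("uri", ""))
    if uri:
        parsed = urlparse(uri)
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append("URI is not an absolute https URI")
        if re.search(r"\s", uri):
            problems.append("URI contains whitespace")
    return problems


# Invented sample terms for demonstration.
good = {
    "uri": "https://example.org/vocab/defixio",
    "prefLabel": "Defixio",
    "definition": "Curse tablets or binding spells.",
}
bad = {"uri": "example.org/vocab/defixio", "prefLabel": "Defixio"}

print(check_term(good))
print(check_term(bad))
```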

The practical use of FAIR Epigraphic vocabularies will be discussed in detail in a separate paper. Because the EpiDoc guidelines28 provide only suggestions, practice differs widely, as our scoping confirms. For example, the EpiDoc guidelines suggest the structure shown in Figure 2 for recording the inscription type.29

Figure 2

Structure suggested by the EpiDoc guidelines to record controlled vocabularies (i.e., type of inscription).

Not all projects employ controlled vocabularies; where they do, implementation varies considerably. The ideal scenario in Figure 3 uses the suggested EpiDoc structure and maintains both the data structure and terminology of the original resource (I.Sicily in this case) and alignment with the controlled vocabulary of their choice (FAIR Epigraphic vocabularies in this case):

Figure 3

An ideal implementation of controlled vocabularies in EpiDoc.

We strongly advocate that users provide links to their chosen controlled vocabulary, either directly in EpiDoc files or as attribute metadata in their databases. The chosen vocabulary represents the users’ own conceptual understanding and need not align with the FAIR Epigraphic vocabularies; what matters is that the link is stated, to ensure transparency and facilitate future research.
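A short sketch shows how such links can be read back out of an encoded file. The TEI fragment embedded below is an assumption made for illustration and does not reproduce Figures 2 or 3: the placement of `keywords` and `term` inside `textClass`, the `scheme` and `ref` attribute values, and the `example.org` URIs are all hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment: the project's own label is kept as element text,
# while @ref points to a term in the controlled vocabulary of its choice.
SAMPLE = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <profileDesc>
      <textClass>
        <keywords scheme="https://example.org/vocab/type-of-inscription">
          <term ref="https://example.org/vocab/type-of-inscription/funerary">funerary</term>
        </keywords>
      </textClass>
    </profileDesc>
  </teiHeader>
</TEI>
""".strip()


def inscription_types(xml_text):
    """Collect label, vocabulary link, and scheme for every term element."""
    root = ET.fromstring(xml_text)
    found = []
    for keywords in root.iter(f"{{{TEI_NS}}}keywords"):
        scheme = keywords.get("scheme")
        for term in keywords.iter(f"{{{TEI_NS}}}term"):
            found.append({
                "label": (term.text or "").strip(),
                "ref": term.get("ref"),
                "scheme": scheme,
            })
    return found


types = inscription_types(SAMPLE)
print(types)
```

A term without a `ref` attribute would come back with `ref` set to `None`, which makes records lacking an explicit vocabulary link easy to spot.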

(4) Results and discussion

The vocabularies demonstrate robust technical implementation and interoperability with existing linked data infrastructures. Each term receives a stable URI enabling persistent citation and computational integration, with comprehensive metadata including English definitions, examples, scope notes clarifying usage boundaries, and bidirectional alignments to Wikidata. The implementation employs CIDOC CRM as its foundational framework, providing hierarchically organised categories with formal SKOS relationships expressing semantic connections between terms. SPARQL queries successfully retrieve vocabulary terms and traverse alignments to Wikidata entities, enabling researchers to combine local epigraphic classifications with external knowledge graphs. Getty AAT alignments enable cross-domain queries linking epigraphic data with museum collections and archaeological datasets.
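As an illustration of the kind of query meant here, the sketch below assembles a SPARQL query string that lists concepts together with their Wikidata alignments for a chosen SKOS mapping property. It only builds the query text and does not contact any endpoint; that labels are stored as `skos:prefLabel` and that Wikidata targets use the `http://www.wikidata.org/entity/` prefix are assumptions about the published data.

```python
MAPPING_PROPERTIES = {
    "exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch",
}


def alignment_query(match_property="closeMatch",
                    target_prefix="http://www.wikidata.org/entity/"):
    """Build a SPARQL query listing concepts and their aligned targets."""
    if match_property not in MAPPING_PROPERTIES:
        raise ValueError(f"not a SKOS mapping property: {match_property}")
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept ?label ?target WHERE {\n"
        "  ?concept skos:prefLabel ?label ;\n"
        f"           skos:{match_property} ?target .\n"
        f'  FILTER(STRSTARTS(STR(?target), "{target_prefix}"))\n'
        "}\n"
        "ORDER BY ?label"
    )


query = alignment_query("relatedMatch")
print(query)
```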

However, implementation revealed substantial challenges, particularly regarding integration with Wikidata. The planned bidirectional integration proved complex due to fundamental structural asymmetries between the FAIR ontology’s event-based CIDOC CRM framework and Wikidata’s entity-centric model. As explained in Section 3.4.1, Wikidata is excellent at representing and linking clear, individual entities, such as an inscription, a person, or a findspot. Our framework, on the other hand, is designed to focus on the events and relationships that tie those entities together. For instance, we do not just document that an inscription exists; we model the entire writing event, i.e., who carved it, who commissioned it, who it was for, when and where it happened, and so on (Bodard et al., 2021) and the entire history of subsequent study and evaluation.

Wikidata does not currently offer the tools to fully capture these kinds of rich, layered relationships. It is not just a technical limitation; it is a difference in how the two systems understand and structure knowledge. As a result, syncing data between the two is not simply a matter of writing code. It is a process of conceptual translation and figuring out how to map the complexities of the two models. For this reason, the bidirectional mapping has to be done manually. That is not a failure of technology; it is an essential act of scholarly interpretation. But it does come at a cost. The process is time-consuming, and because both our FAIR Epigraphic vocabularies and Wikidata are always changing, there is a real risk of things falling out of sync over time. Maintaining alignment requires ongoing attention and effort.
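Although the mapping decisions themselves remain manual, detecting that two sides have drifted apart can be assisted by a simple comparison of alignment snapshots. The sketch below is one possible approach under stated assumptions: each snapshot is a dictionary from a concept identifier to a set of (relation, target) pairs, and the identifiers and mappings shown are invented for the example.

```python
def alignment_drift(previous, current):
    """Compare two snapshots of {concept: {(relation, target), ...}}.

    Returns (added, removed), each keyed by concept and containing only
    the mappings that differ between the snapshots.
    """
    added, removed = {}, {}
    for concept in previous.keys() | current.keys():
        before = previous.get(concept, set())
        after = current.get(concept, set())
        if after - before:
            added[concept] = after - before
        if before - after:
            removed[concept] = before - after
    return added, removed


# Invented snapshots: one mapping is re-graded, one concept is newly aligned.
snapshot_old = {
    "funerary-inscription": {("relatedMatch", "Q1772")},
}
snapshot_new = {
    "funerary-inscription": {("closeMatch", "Q1772")},
    "defixio": {("relatedMatch", "Q_PLACEHOLDER")},
}

added, removed = alignment_drift(snapshot_old, snapshot_new)
print(added)
print(removed)
```

Each reported difference would still need review by a domain specialist; the comparison only narrows down where attention is required.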

When manually implementing the bidirectional mapping, we encountered three main problems. First, the EAGLE records in Wikidata lack connections to other existing Wikidata items, which either creates duplicate concepts within the Wikidata knowledge base or leaves a concept linkage missing. For example, the record ‘funerary inscription’ (Q121614747) created by EAGLE lists only EAGLE Project as its external resource. The record ‘epitaph’ (Q1772) is more widely used and linked to at least 40 other external resources, but not to EAGLE. Second, whilst the EAGLE and FAIR vocabularies were created by domain specialists who receive attribution for their work, other Wikidata entries may have been created by specialists, enthusiasts, or contributors without detailed background knowledge, leading to inconsistent quality and reliability across the dataset. The varying depth of detail makes it difficult to determine whether linked terms constitute close or related matches, even for domain specialists. Therefore, when linking concepts across data sources, we preferred relatedMatch as a precautionary approach when in doubt, whereas others might have chosen closeMatch. Third, implementing the SKOS relational mapping of concepts in Wikidata is necessary yet non-trivial, and as it is not a widely adopted practice, the documentation is sparse. We have identified the necessary steps and describe them in Section 3.4.2.

The bottom-up approach of building vocabularies proved both essential and challenging. In the discussions between 2020 and 2022, the Epigraphy.info Vocabularies Working Group identified disagreements regarding term granularity, particularly for inscription types and between projects focused on specific time periods, geographical regions, or epigraphic traditions. We resolved this through a relatively generic design of the vocabulary based on domain expertise, detailed textual descriptions, multiple examples across the field, as well as iterative user testing and continuous refinement based on feedback. Moreover, the hierarchical design of FAIR Epigraphic vocabularies enables classification at appropriate levels of granularity, allowing multiple tags from the same vocabulary per record. To enhance the usefulness of the FAIR Epigraphic vocabularies, we consulted with numerous projects, including projects using the Wikidata schema, such as IDEA (Thornton et al., 2024b) or Greek Metrical Inscriptions (Ortimini, 2025). Limitations of the FAIR Epigraphic vocabularies remain in vocabulary coverage and its primary focus on Latin and Greek epigraphic cultures, restricting current utility for comprehensive epigraphic data modelling.

The Wikidata sources for epigraphy, however, are represented by more than just EAGLE. The recent efforts of WikiProjects IDEA (Thornton et al., 2024a, 2024b) and Epigraphy30 testify to the need to improve the foundations made in 2015 by EAGLE and coordinate the proliferation of Wikidata resources related to epigraphy, as well as their relevance to non-Wikidata-based conceptual models, like the one of FAIR Epigraphy. The latest projects and activities at academic conferences suggest a dynamically expanding field (Azzolini, 2025; Ortimini, 2025),31 with potential to benefit all participants, regardless of choice of underlying conceptual model. For a comprehensive list of resources, see the WikiProject Epigraphy.32

(5) Implications/Applications

The FAIR Epigraphic Ontology & Vocabularies establish foundational infrastructure for interoperable epigraphic data, with implications extending beyond technical standardisation to reshape collaborative research practices. By providing shared semantic frameworks grounded in community consensus, the vocabularies enable genuinely comparative analysis across previously fragmented datasets. The FAIR Epigraphy Browser33 demonstrates this potential as a proof of concept for linking distinct datasets via RDF. Researchers can now browse or formulate SPARQL queries spanning multiple epigraphic databases, retrieving all honorific inscriptions on marble, for instance, without first having to manually reconcile divergent classification schemes. This transforms the range of feasible research questions, particularly for computational approaches requiring large, structured datasets.

Despite ongoing development, the FAIR Epigraphic vocabularies are already discussed in digital epigraphy and cultural heritage studies (Tamrazyan & Hovhannisyan, 2025), listed amongst resources and tools for digital epigraphy at the Wikibase instance Greek Metrical Inscriptions34 (Ortimini, 2025), and referenced within specialised thesauri, e.g., Thesaurus poésie épigraphique.35 Individual vocabularies have been adopted by partner epigraphic projects such as I.Sicily (Prag, 2017–2025) and LatinNow (Mullen et al., 2025; Mullen & Willi, 2024). Several major epigraphic projects, such as Roman Inscriptions of Britain (RIB)36 and Carmina Latina Epigraphica Online (CLEO),37 plan to implement the vocabularies in future.

The vocabularies’ integration with Wikidata positions epigraphic data within broader knowledge graphs encompassing biographical, geographical, and historical information. Once properly implemented, an inscription mentioning a magistrate (Q126733465)38 links not only to controlled vocabularies defining the term ‘magistrate’ (Q4594605) but also to Wikidata entities documenting that individual’s career, family connections, and historical context. A practical example is the linking of emperors in Wikidata with the Inscriptions of Roman Tripolitania dataset, achieved through the project’s authority lists (Roueche et al., 2022). This facilitates research bridging epigraphy with prosopography, historical geography, and social network analysis, enabling scholars to trace how individuals, families, and institutions employed inscriptions within broader communicative strategies.
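The term-level side of this linking can be sketched as a pair of lookup tables. The example assumes that each mapping is recorded as a vocabulary term, a SKOS mapping property, and a Wikidata identifier; the term URI under `example.org` and the choice of `skos:exactMatch` are illustrative assumptions, while Q4594605 is the identifier cited above for ‘magistrate’.

```python
# Wikidata's canonical entity URI prefix.
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"

# (vocabulary term URI, SKOS mapping property, Wikidata QID).
# The term URI and the mapping property are hypothetical placeholders.
MAPPINGS = [
    ("https://example.org/vocab/role/magistrate", "skos:exactMatch", "Q4594605"),
]


def build_indexes(mappings):
    """Build lookups in both directions so either side can find the other."""
    to_wikidata = {}
    from_wikidata = {}
    for term, relation, qid in mappings:
        entity = WIKIDATA_ENTITY + qid
        to_wikidata[term] = (relation, entity)
        from_wikidata[entity] = (relation, term)
    return to_wikidata, from_wikidata


to_wd, from_wd = build_indexes(MAPPINGS)
print(to_wd["https://example.org/vocab/role/magistrate"])
print(from_wd["http://www.wikidata.org/entity/Q4594605"])
```

Keeping the mapping property alongside each link matters in practice: a close match and an exact match license different inferences, so a consumer of the data can decide how strictly to treat the equivalence.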

For digital humanities infrastructure, the project demonstrates practical approaches to developing domain-specific ontologies within established frameworks. The methodology (community requirements gathering, consolidation of existing vocabularies, and alignment with external authorities) provides a replicable model for other subdisciplines confronting similar standardisation challenges. The experience highlights both the potential and limitations of integrating specialised scholarly vocabularies with general-purpose platforms like Wikidata, offering lessons for future linked open data initiatives. The distributed nature of linked open data presents both opportunities and challenges: whilst improvements to one dataset can theoretically propagate across interconnected resources, ensuring that changes cascade appropriately across multiple platforms requires careful coordination. Without robust version control and clear communication channels, updates risk creating temporary inconsistencies that may confuse users encountering different versions of aligned vocabularies across different platforms.

The adoption of linked open data principles offers substantial benefits to the epigraphic research community that justify the initial learning curve and implementation effort. By establishing machine-readable, interoperable vocabularies, researchers gain the ability to aggregate data across projects, discover previously hidden patterns in large corpora, and connect epigraphic evidence with broader historical and archaeological contexts. The investment in LOD infrastructure pays dividends through enhanced discoverability, reduced duplication of effort in data entry and classification, and the creation of sustainable digital resources that outlast individual projects. Training researchers in linked data principles through concrete epigraphic examples, using the vocabularies’ transparent structure and comprehensive documentation, enables the community to understand how controlled vocabularies facilitate data integration whilst grasping the intellectual decisions underlying classification schemes. This supports critical data literacy development alongside traditional philological skills, equipping scholars to participate effectively in increasingly collaborative, digitally mediated research environments.

Ongoing development priorities through 2025 focus on expanding vocabulary coverage whilst strengthening technical infrastructure. Planned additions include comprehensive inscribed object vocabularies, execution technique classifications, materials, and dating criteria categories. The project’s open, collaborative governance model via GitHub39 ensures continued responsiveness to evolving scholarly needs through transparent issue-tracking and version control. Although the FAIR Epigraphy project concludes in February 2026, the infrastructure will continue to be maintained and developed through ongoing support from CSAD in Oxford and the efforts of the Epigraphy.info community, acknowledging that effective scholarly infrastructure emerges from sustained dialogue between diverse research communities rather than top-down standardisation.

Data Accessibility Statement

Individual datasets are accessible under CC licenses at https://ontology.inscriptiones.org/ and in the FAIR Epigraphy Zenodo Community https://zenodo.org/communities/fairepigraphy/.

GenAI declaration

GenAI tools, such as Grammarly, were used for proofreading and copyediting of the manuscript.

Notes

[1] https://inscriptiones.org/ The FAIR Epigraphy project (2022–2026), co-directed by Marietta Horster (JGU/BBAW, Germany) and Jonathan Prag (Oxford, UK), provides access to FAIR and linked open data for epigraphy by creating digital tools and resources.

[2] The FAIR Epigraphic vocabularies are structured within a CIDOC CRM-based ontological framework that builds upon CRMtex 1.0 (Murano & Felicetti, 2020) and EpOnt (Bodard et al., 2021). Our implementation extends CIDOC CRM through CRMtex 2.0 (Murano et al., 2023), incorporating elements from CRMsci (Doerr et al., 2023) and LRMoo (Aalberg et al., 2024) to model the conceptual complexity of inscriptions as both linguistic texts and physical objects with rich study and publication histories. Initial user testing and engagement took place in early 2025, and the full ontology will be published in early 2026 via the FAIR Epigraphy website https://inscriptiones.org/ and Zenodo community https://zenodo.org/communities/fairepigraphy/.

[3] Bidirectional integration establishes explicit links between FAIR Epigraphic vocabularies and Wikidata, enabling computationally executable data exchange. The concrete implementation, which uses SKOS (Simple Knowledge Organisation System) properties to maintain conceptual complexity and nuance, is described in Section 3.4.2.

[9] https://www.getty.edu/research/tools/vocabularies/aat/ The Art & Architecture Thesaurus (AAT) is a structured vocabulary for describing and indexing the visual arts and architecture.

[10] For details about the ontology, see footnote 2.

[16] EAGLE provides no definition: https://www.eagle-network.eu/voc/typeins/lod/423 vs FAIR Epigraphic Vocabulary: https://ontology.inscriptiones.org/type_of_inscription/Proskynema “Inscriptions associated with acts of reverence, adoration, or worship, such as bowing or kneeling before a deity or sacred object.”

[18] The FAIR Epigraphy platform (https://ontology.inscriptiones.org/) is open to hosting additional structured vocabularies related to digital epigraphy. To express interest, please get in touch with the authors.

[19] Although mapping between CIDOC and Wikidata models is not always straightforward, several recent projects have demonstrated that transforming Wikidata data into CIDOC CRM structure is eminently feasible (Koch et al., 2024; Untner, 2025). Mapping is required only for integration across both models, not for independent data extraction.

Acknowledgements

We gratefully acknowledge the foundational work of the EAGLE Europeana Project in establishing controlled vocabularies for digital epigraphy. We are particularly grateful to Pietro Liuzzo, Silvia Evangelisti, and Alex Mullen for their ongoing support and sharing of resources. We extend our thanks to the FAIR Epigraphy partners and Epigraphy.info community, especially the members of the Vocabularies Working Group, for their invaluable contributions to the development of community-wide standards and their ongoing engagement in advancing FAIR principles in digital epigraphy. We thank the anonymous reviewers whose detailed comments substantially improved the manuscript.

Competing Interests

The authors have no competing interests to declare.

Author Contributions

Petra Heřmánková: Conceptualisation, Data curation, Methodology, Project administration, Resources, Validation, Writing – original draft, Writing – review & editing

Imran Asif: Data curation, Methodology, Software, Visualisation, Writing – original draft, Writing – review & editing

Marietta Horster: Conceptualisation, Funding acquisition, Methodology, Supervision, Writing – original draft, Writing – review & editing

Jonathan R. W. Prag: Conceptualisation, Funding acquisition, Methodology, Supervision, Writing – original draft, Writing – review & editing

DOI: https://doi.org/10.5334/johd.428 | Journal eISSN: 2059-481X
Language: English
Submitted on: Oct 27, 2025
Accepted on: Nov 25, 2025
Published on: Dec 17, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Petra Heřmánková, Imran Asif, Marietta Horster, Jonathan R. W. Prag, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.