A Multi-Dimensional Evaluation Framework for Assessing LLM Performance in TEI Encoding

Figures & Tables

Task Taxonomy for TEI Encoding.

DIMENSION	TASK CATEGORY	ENCODING TASKS
0	Format Conversion	Transforming plain text into valid XML
1	Source Preservation	Preserving evidence of the source’s textual characteristics
2	Schema Application	Selecting and applying appropriate TEI elements and attributes according to TEI P5 Guidelines and project-specific constraints
3	Structural Markup	Constructing document scaffolding: segmenting texts into structural units (e.g., <div>, <opener>, <closer>, paragraph boundaries), and ensuring correct hierarchy and ordering
4	Semantic Markup	Annotating meaning-bearing spans and editorial phenomena, such as named entities, temporal expressions, and discourse markers
5	Contextual Enrichment	Linking entities to authority records, resolving references, and applying normalisations
6	Metadata Management	Extracting and normalising descriptive or administrative metadata from sources, and enriching records with external information
7	Collection Management	Maintaining consistent encoding depth and conventions across documents, monitoring quality drift, and checking interoperability standards
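
To make the lower dimensions of the taxonomy concrete, the following is a minimal sketch of how automated checks for Dimensions 0, 3, and 4 might look. It is an illustration only, not the framework's evaluation procedure. The sample letter, the function names, and the rule that `<opener>` must precede `<closer>` are all assumptions introduced here. The Dimension 0 check tests XML well-formedness only; validation against a TEI schema would need a separate validator.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample: a short letter encoded with a few TEI elements.
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <text>
    <body>
      <div type="letter">
        <opener><dateline><date when="1850-03-02">2 March 1850</date></dateline></opener>
        <p>I met <persName>Anna</persName> in <placeName>Vienna</placeName>.</p>
        <closer><signed>J.</signed></closer>
      </div>
    </body>
  </text>
</TEI>"""


def is_well_formed(xml_text: str) -> bool:
    """Dimension 0 (partial): does the output parse as XML at all?"""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def local_name(element: ET.Element) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return element.tag.rsplit("}", 1)[-1]


def opener_precedes_closer(xml_text: str) -> bool:
    """Dimension 3 (simplified, assumed rule): within each <div> that has
    both an <opener> and a <closer>, the first <opener> comes before the
    first <closer>."""
    root = ET.fromstring(xml_text)
    for div in root.iter(f"{{{TEI_NS}}}div"):
        children = [local_name(child) for child in div]
        if "opener" in children and "closer" in children:
            if children.index("opener") > children.index("closer"):
                return False
    return True


def count_semantic_spans(xml_text, names=("persName", "placeName", "date")):
    """Dimension 4 (partial): count a few meaning-bearing elements."""
    root = ET.fromstring(xml_text)
    counts = {name: 0 for name in names}
    for element in root.iter():
        name = local_name(element)
        if name in counts:
            counts[name] += 1
    return counts


if __name__ == "__main__":
    print(is_well_formed(SAMPLE))
    print(opener_precedes_closer(SAMPLE))
    print(count_semantic_spans(SAMPLE))
```

Checks of this kind cover only what can be verified mechanically. The higher dimensions (contextual enrichment, metadata management, collection management) depend on external authority data and on comparison across documents, which this sketch does not attempt.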