CuneiML: A Cuneiform Dataset for Machine Learning

Open Access | Dec 2023

Figures & Tables

Figure 1

An overview of CuneiML, illustrated with an example tablet (ID 453248) and its multi-modal data: (1) Metadata: time period, provenience, genre, and measurements. (2) High-resolution 2D photograph of the six faces. (3) Lineart drawn by paleographers. (4) Latin transliteration, downloaded directly from CDLI. (5) Cuneiform Unicode transcription, automatically converted from the Latin transliteration. (6) Major face cutouts, automatically extracted from the 2D photograph.
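The six modalities listed in the caption can be pictured as fields of a single per-tablet record. The following is a minimal sketch only: the class name `TabletRecord`, every field name, and the choice to store images as file paths are assumptions made for illustration and do not describe the dataset's actual schema or file layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TabletRecord:
    """Hypothetical container for one tablet (illustrative field names)."""

    tablet_id: int
    # (1) Metadata
    period: Optional[str] = None
    provenience: Optional[str] = None
    genre: Optional[str] = None
    measurement: Optional[str] = None
    # (2) High-resolution 2D photograph of the six faces (assumed: a file path)
    photo_path: Optional[str] = None
    # (3) Lineart drawn by paleographers (assumed: a file path)
    lineart_path: Optional[str] = None
    # (4) Latin transliteration
    transliteration: Optional[str] = None
    # (5) Cuneiform Unicode transcription
    unicode_text: Optional[str] = None
    # (6) Major face cutout (assumed: a file path)
    face_cutout_path: Optional[str] = None

    def available_modalities(self) -> List[int]:
        """Return the caption numbers (1-6) of the modalities present."""
        present = []
        metadata = (self.period, self.provenience, self.genre, self.measurement)
        if any(value is not None for value in metadata):
            present.append(1)
        numbered = (
            (2, self.photo_path),
            (3, self.lineart_path),
            (4, self.transliteration),
            (5, self.unicode_text),
            (6, self.face_cutout_path),
        )
        for number, value in numbered:
            if value is not None:
                present.append(number)
        return present


# Placeholder values only; this is not real data from any tablet.
rec = TabletRecord(tablet_id=0, genre="placeholder genre",
                   transliteration="placeholder text")
print(rec.available_modalities())  # [1, 4]
```

Making every modality optional reflects the assumption that not every tablet carries all six kinds of data.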

Figure 2

Number of tablets by metadata attribute: time period, genre, and provenience.

Figure 3

A random sample of 20 major face cutouts.

Table 1

Task summary with possible input and output pairs. The numbers refer to the data modalities: (1) Metadata: time period, provenience, genre, and measurements. (2) High-resolution 2D photograph of the six faces. (3) Lineart from paleographers. (4) Latin transliteration. (5) Cuneiform Unicode transcription. (6) Major face cutouts.

| Task name | Input | Output |
|---|---|---|
| Language modeling | (4) (5) | (4) (5) |
| Transliteration | (5) | (4) |
| Lineart generation | (2) (6) | (3) |
| Attribute prediction | (2) (3) (4) (5) (6) | (1) |
| Sign identification | (2) (3) (6) | (5) |

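The input and output pairs in Table 1 can be written down as a small lookup structure, which makes it easy to check which tasks a given subset of the data supports. This is a sketch under stated assumptions: the names `MODALITIES`, `TASKS`, and `tasks_for` are hypothetical, and the feasibility rule (at least one listed input and at least one listed output must be available) is one reading of "possible input and output pairs", not a rule stated by the authors.

```python
# Modality numbers as used in the Table 1 caption.
MODALITIES = {
    1: "metadata",
    2: "photograph",
    3: "lineart",
    4: "Latin transliteration",
    5: "cuneiform Unicode",
    6: "major face cutout",
}

# Task name -> (possible inputs, possible outputs), copied from Table 1.
TASKS = {
    "Language modeling": ((4, 5), (4, 5)),
    "Transliteration": ((5,), (4,)),
    "Lineart generation": ((2, 6), (3,)),
    "Attribute prediction": ((2, 3, 4, 5, 6), (1,)),
    "Sign identification": ((2, 3, 6), (5,)),
}


def tasks_for(available):
    """List tasks that have at least one input and one output available.

    `available` is a set of modality numbers. The any-input/any-output
    rule is an assumption made for this sketch.
    """
    feasible = []
    for name, (inputs, outputs) in TASKS.items():
        has_input = any(m in available for m in inputs)
        has_output = any(m in available for m in outputs)
        if has_input and has_output:
            feasible.append(name)
    return feasible


# A tablet with only a photograph and metadata:
print(tasks_for({1, 2}))  # ['Attribute prediction']
# A tablet with only the two text modalities:
print(tasks_for({4, 5}))  # ['Language modeling', 'Transliteration']
```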
Table 2

Summary of test accuracy (%) for attribute prediction using different input features.

| Attribute | Image | Unicode | Trans. | # of classes |
|---|---|---|---|---|
| Time period | 97.66 | 90.50 | 87.17 | 14 |
| Provenience | 85.72 | 61.71 | 68.60 | 25 |
| Genre | 89.00 | 81.50 | 86.21 | 12 |

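To put the figures in Table 2 in context, the short sketch below picks the best-scoring feature for each attribute and compares it with a uniform-chance baseline of 100 / (number of classes). The uniform baseline is an assumption added here for illustration: it ignores class imbalance, so a majority-class baseline on the real data would likely be higher. The names `RESULTS` and `summarize` are hypothetical, and "Trans." is read as the Latin transliteration.

```python
# Test accuracies (%) and class counts copied from Table 2.
RESULTS = {
    "Time period": {"image": 97.66, "unicode": 90.50,
                    "transliteration": 87.17, "classes": 14},
    "Provenience": {"image": 85.72, "unicode": 61.71,
                    "transliteration": 68.60, "classes": 25},
    "Genre": {"image": 89.00, "unicode": 81.50,
              "transliteration": 86.21, "classes": 12},
}


def summarize(results):
    """Return (attribute, best feature, best accuracy, uniform chance %)."""
    rows = []
    for attribute, row in results.items():
        scores = {k: v for k, v in row.items() if k != "classes"}
        best = max(scores, key=scores.get)
        # Uniform-chance accuracy: an assumption that ignores class imbalance.
        chance = round(100.0 / row["classes"], 2)
        rows.append((attribute, best, scores[best], chance))
    return rows


for attribute, best, accuracy, chance in summarize(RESULTS):
    print(f"{attribute}: best={best} ({accuracy:.2f}%), "
          f"uniform chance={chance:.2f}%")
```

On these numbers the image feature scores highest for all three attributes, and the Unicode and transliteration features swap rank between time period and the other two attributes.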
DOI: https://doi.org/10.5334/johd.151 | Journal eISSN: 2059-481X
Language: English
Submitted on: Sep 2, 2023
Accepted on: Oct 17, 2023
Published on: Dec 6, 2023
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2023 Danlu Chen, Aditi Agarwal, Taylor Berg-Kirkpatrick, Jacobo Myerston, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.