A Large-Scale Dataset of Annotated Cuneiform Sign Images for Digital Palaeography

Or Lewenstein; Daniel López; Cyrill Dankwardt; Mays Fadhil Alrawi; Louisa Grill; Brian Mak; Albert Setälä; Fiammetta Gori; Aino Hätinen; Felix Rauchhaus; Zsombor Földi; Enrique Jiménez

doi:10.5334/johd.503

Volume 12 (2026): Issue 1

Open Access

Abstract

This paper presents a large-scale dataset of 158,946 annotated cuneiform sign crops extracted from 9,276 clay tablets and other objects spanning over three millennia of Mesopotamian history (ca. 2800 BCE–75 CE). The dataset was created through manual annotation on the Electronic Babylonian Library (eBL) platform and semi-automated extraction methods, combining high-resolution photographs from major collections with detailed metadata including sign names, transliteration values, and historical periods. The data is stored in JPEG format for images and JSON for metadata, and is publicly accessible under a CC BY-NC 4.0 license. This dataset enables digital palaeographic analysis for dating undated tablets, supports machine learning applications in optical character recognition, and facilitates computational studies of scribal practices and regional variation in cuneiform writing.

DOI: https://doi.org/10.5334/johd.503 | Journal eISSN: 2059-481X

Language: English

Submitted on: Dec 19, 2025

Accepted on: Feb 20, 2026

Published on: Mar 25, 2026

Published by: Ubiquity Press

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: cuneiform, digital palaeography, optical character recognition, Assyriology, machine learning, ancient writing systems

© 2026 Or Lewenstein, Daniel López, Cyrill Dankwardt, Mays Fadhil Alrawi, Louisa Grill, Brian Mak, Albert Setälä, Fiammetta Gori, Aino Hätinen, Felix Rauchhaus, Zsombor Földi, Enrique Jiménez, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.
