A Full Morphosyntactic Annotation of the State Archives of Assyria Letter Corpus

By: Matthew Ong  
Open Access | Apr 2024

Abstract

The dataset consists of a full morphosyntactic annotation of the normalized letter corpus of the State Archives of Assyria online (SAAo), together with metadata on sender, recipient, estimated date of composition, script, and dialect of Akkadian (where determinable). The corpus comprises ten of the twenty-one current volumes of SAAo and contains approximately 2600 letters from the royal archives of the late Neo-Assyrian kings. Each letter is annotated for part of speech, lemma, morphological decomposition, and syntactic dependencies of all relevant tokens in the text. The annotations were produced with the help of a spaCy language model and then checked and completed by hand. They are available both as a set of CoNLL-U files (one per text) and as linked open data in a single TTL file. The associated metadata is available as a CSV file. Because the letters share a format, topics of concern, and historical period, the corpus forms a natural object of study from both linguistic and social-historical perspectives. It is hoped that this data will be of use to researchers doing linguistic and sociolinguistic corpus research on these texts.
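As a rough illustration of how the per-text CoNLL-U (CONLLU) files might be consumed, the sketch below parses the standard ten-column CoNLL-U layout using only the Python standard library. The sample sentence, its tags, and its feature values are invented placeholders for illustration and are not taken from the dataset; the names `parse_conllu`, `parse_feats`, and `SAMPLE` are likewise hypothetical, and the dataset's actual tag and feature inventory may differ.

```python
from collections import Counter

# The ten tab-separated columns defined by the CoNLL-U format.
CONLLU_FIELDS = (
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
)

# Invented placeholder sentence (NOT from the corpus), used only to exercise the parser.
SAMPLE = (
    "# text = šarru illik\n"
    "1\tšarru\tšarru\tNOUN\t_\tCase=Nom|Number=Sing\t2\tnsubj\t_\t_\n"
    "2\tillik\talāku\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)


def parse_feats(feats):
    """Turn 'A=x|B=y' into {'A': 'x', 'B': 'y'}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def parse_conllu(text):
    """Parse CoNLL-U text into a list of sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Sentence-level comment or metadata line; skipped in this sketch.
            continue
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        token = dict(zip(CONLLU_FIELDS, cols))
        token["feats"] = parse_feats(token["feats"])
        current.append(token)
    if current:
        # Handle input that does not end with a blank line.
        sentences.append(current)
    return sentences


sentences = parse_conllu(SAMPLE)
# Count part-of-speech tags across all parsed tokens.
pos_counts = Counter(tok["upos"] for sent in sentences for tok in sent)
```

To apply this to a downloaded file, one would read it with `open(path, encoding="utf-8").read()` and pass the text to `parse_conllu`; the `head` and `deprel` fields of each token then give the dependency arcs for that letter.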

DOI: https://doi.org/10.5334/johd.202 | Journal eISSN: 2059-481X
Language: English
Submitted on: Feb 10, 2024
Accepted on: Mar 18, 2024
Published on: Apr 12, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Matthew Ong, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.