Text Recognition Model for Yiddish in Vaybertaytsh Typeface, Based on Community Regulations

By: Ronny Reshef and Mirjam Gutschow
Open Access | May 2024

Abstract

We present a public PyLaia text recognition model, accompanied by a baseline model for the layout of community regulations in Yiddish and by a dataset of Yiddish texts printed in Vaybertaytsh typeface. The model was built using legal documents, namely regulations written by the Ashkenazi Jewish community in Amsterdam during the 18th century. The necessity of such a model for Vaybertaytsh typeface stems from the substantial differences between it and other Yiddish or Hebrew typefaces: existing text recognition models for Yiddish are dedicated to handwritten texts or to substantially different typefaces. The paper gives a short description of the dataset, its unique characteristics, and how it can be used further. The process of training the text recognition model is explained, and the challenges encountered are specified, along with strategies for coping with them. The model is publicly accessible via Transkribus, and the complete dataset used to train the model is available via Figshare. The models and dataset offer valuable contributions to the digital humanities, specifically for research on linguistics, Jewish history, and related fields.

DOI: https://doi.org/10.5334/johd.194 | Journal eISSN: 2059-481X
Language: English
Submitted on: Dec 23, 2023
Accepted on: Apr 8, 2024
Published on: May 6, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Ronny Reshef, Mirjam Gutschow, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.