Multilingual Workflows in Bullinger Digital: Data Curation for Latin and Early New High German

Open Access | Jan 2024

Abstract

This paper presents how we enhanced the accessibility and utility of historical linguistic data in the project Bullinger Digital. The project involved the transformation of 3,100 letters, primarily available as scanned PDFs, into a dynamic, fully digital format. The expanded digital collection now comprises 12,000 letters: 3,100 edited, 5,400 transcribed, and 3,500 represented through detailed metadata and results from handwritten text recognition. Central to our discussion is the workflow developed for this multilingual corpus. It includes strategies for text normalisation, machine translation, and handwritten text recognition, with a particular focus on the challenges of code-switching within historical documents. The resulting digital platform features an advanced search system that lets users filter by correspondent name, time period, language, and location. It also supports fuzzy and exact search, and searches can be restricted to specific text parts, such as summaries or footnotes. Beyond detailing the technical process, this paper underscores the project’s contribution to historical research and digital humanities. While the Bullinger Digital platform serves as a model for similar projects, the corpus behind it demonstrates the potential for data reuse in historical linguistics. The project exemplifies how digital humanities methodologies can revitalise historical text collections, giving researchers access to historical data and ways to interact with it. This paper aims to give readers a comprehensive understanding of our project’s scope and its broader implications for digital humanities, highlighting the potential of such digital endeavours for historical linguistic research.

DOI: https://doi.org/10.5334/johd.174 | Journal eISSN: 2059-481X
Language: English
Submitted on: Oct 16, 2023
Accepted on: Dec 20, 2023
Published on: Jan 24, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Phillip Benjamin Ströbel, Lukas Fischer, Raphael Müller, Patricia Scheurer, Bernard Schroffenegger, Benjamin Suter, Martin Volk, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.