
Multilingual Workflows in Bullinger Digital: Data Curation for Latin and Early New High German

Open Access | Jan 2024

Figures & Tables

Figure 1

Overview of the Bullinger Digital project workflow, illustrating its comprehensive approach to digitising and processing historical correspondence.

Figure 2

A snapshot from the teoirgsed website. The left side displays the scanned printed edition, and the right side shows the OCR output. Note the omission of special characters like ů in schůlmeister (EN: principal) (see red ellipses) in the OCRed text.

Figure 3

Visualisation of code-switching on the Bullinger Digital website, illustrated with a letter from Johannes Fabricius Schmid to Heinrich Bullinger (29 November 1563) (Bullinger Digital, 2023). Users can choose to highlight languages in different colours.

Figure 4

An example sentence from the Bullinger letters in its original Early New High German and normalised forms.

Figure 5

An example of a search for a) ENHG wein (EN wine). The search shows results for the words as entered. However, since a normalised text has been indexed, letters containing the words b) win or c) wyn (both valid variants of wein) are also returned.
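The behaviour described in this caption can be sketched as a small inverted index built over normalised tokens. This is a minimal illustration only: the `NORMALISED` lookup table, the function names, and the toy letter texts are assumptions made for the example, and they do not describe how Bullinger Digital actually normalises or indexes its letters.

```python
from collections import defaultdict

# Hypothetical variant-to-normalised lookup; the project's real
# normalisation method is not described in this caption.
NORMALISED = {"wein": "wein", "win": "wein", "wyn": "wein"}


def normalise(token):
    """Lower-case a token and map known spelling variants to one form."""
    token = token.lower()
    return NORMALISED.get(token, token)


def build_index(letters):
    """Map each normalised token to the set of letter ids containing it."""
    index = defaultdict(set)
    for letter_id, text in letters.items():
        for token in text.split():
            index[normalise(token.strip(".,;:!?"))].add(letter_id)
    return index


def search(index, query):
    """Normalise the query the same way as the indexed text, then look it up."""
    return sorted(index.get(normalise(query), set()))


# Invented toy texts, not quotations from the corpus.
letters = {
    "a": "ein fass wein",
    "b": "guten win",
    "c": "den wyn trinken",
    "d": "brot und wasser",
}
index = build_index(letters)
print(search(index, "wein"))
```

Because the query and the indexed text pass through the same `normalise` step, a search for any one variant reaches letters that contain the others.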

Table 1

Excerpt comparing the human and the automatic summary of a letter from Heinrich Bullinger to Oswald Myconius, 25 December 1546 (see https://www.bullinger-digital.ch/letter/12771; Bullinger Digital, 2023).

| Letter in ENH German and Latin | Human summary (German) from the Bullinger edition | GPT-4 summary (English) |
| --- | --- | --- |
| Früntlicher, fürgeliebter herr und brůder, ich wünschen üch, üwer lieben hußfrowen und dem gantzen hußgesind von gott durch unsern herren ein gůt glückhafft jar. Des bedoerffend wir ouch wol, dann die sachen stand gantz spitzig und gfarlich. Gott der allmaechtig woelle sich unser erbarmen. | [1] Gute Wünsche zum neuen Jahr, auch an Myconius’ Gattin [Margret] und an die Hausangehörigen. Im Hinblick auf die gegenwärtigen Gefahren kann man Glückwünsche brauchen! | [1] Greeting and New Year Wishes: Bullinger extends friendly greetings and wishes for a fortunate new year to Myconius, his wife, and household, noting the particularly perilous and critical state of current affairs. He calls for God’s mercy in these challenging times. |
| Ulm wil nitt vil gůts rhůmbs haben der dappfferkeit. Man mumlet, sy vertruegend sich gern mitt dem Sanherib; quod tibi dictum volo. Noerlingen, Giengen, Boppffingen, Dinckelspühel, Rotenburg an der Tuber, Hall in Schwaben, alles und noch mee vomm keyser yngenommen! Ich bin gantz angsthafft umb den herrlichen gelerten Brentium. | [2] Die Stadt Ulm wird keine Lorbeeren für Tapferkeit ernten, denn angeblich hat sie sich recht leicht mit dem „Sanherib“ [Kaiser Karl V.] […]öhnt! Dies als vertrauliche Mitteilung. Nördlingen, Giengen, Bopfingen, Dinkelsbühl, Rothenburg ob der Tauber und Schwäbisch Hall befinden sich in der Hand des Kaisers. Bullinger macht sich große Sorgen um den gelehrten [Johannes] Brenz. | [2] Military and Political Situation: He comments on the lack of bravery in Ulm and rumors of their alliance with Sanherib (possibly a metaphor for aligning with a powerful adversary). He reports the capture of several towns (Nördlingen, Giengen, Bopfingen, Dinkelsbühl, Rothenburg, and Hall) in Swabia by the Emperor. Bullinger expresses concern for the well-being of a distinguished scholar, Brentius, |
| Nostri officii fuerit in tantis calamitatibus inhortari populum ad preces assiduas et ad poenitentiam. Certo, certo inveniet et nos malum istud! | [4] Im gegenwärtigen Unglück muss das Volk zum Gebet und zur Buße angehalten werden, denn bestimmt wird nun auch die Eidgenossenschaft bedroht! | [4] Religious Duty and Moral Exhortation: In light of these calamities, Bullinger feels it’s their duty to urge the people to constant prayer and repentance, with a certain belief that these troubles will also befall them. |
Figure 6

TrOCR model outputs demonstrating multilingual capabilities with an example of a language switch from Latin to Early New High German.

Figure 7

Comparative analysis of different HTR models, showcasing the efficacy of our multilingual approach in the Bullinger corpus.

DOI: https://doi.org/10.5334/johd.174 | Journal eISSN: 2059-481X
Language: English
Submitted on: Oct 16, 2023
Accepted on: Dec 20, 2023
Published on: Jan 24, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Phillip Benjamin Ströbel, Lukas Fischer, Raphael Müller, Patricia Scheurer, Bernard Schroffenegger, Benjamin Suter, Martin Volk, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.