
Figure 1
Example of an incident (FelixArchief, 731#1604, p. 211).
Table 1
Transcription of Figure 1 and translation to English (FelixArchief, 731#1604, p. 211).
| 1 Mei 1885 Mr Sergoynne à verbaliser D | Om 9 1/2 uren ‘s avonds heb ik gezien dat de meid uit het huis n° 57 der Leopoldst matten uitklopte voor hare woning tegen den muer van het gasthuis toen ik er naartoe ging liep zy binnen en is niet meer buiten gekomen (zij veroorzaakte eenen over vloed van stof) Broes, Dymphna, 16 jaar, geb. te Minderhout, meid Wonende Leopold str. 57. Comme cette fille est ci jeune et qu’elle savait pas mieux nous n’y avons pas donné suite [onleesbare handtekening] |
| 1 May 1885 Mr Sergoynne to draw up an official report D | At half past nine in the evening I saw that the maid from the house N° 57 in the Leopold street was beating mats in front of her home against the wall of the hospital; when I went over she ran inside and did not come out again (she caused an abundance of dust). Broes, Dymphna, 16 years, born in Minderhout, maid residing at Leopoldstr 57. Because this girl is so young and did not know any better, we did not follow up on this. [illegible signature] |

Figure 2
Visualization of the training process.

Figure 3
An example of a page with annotated text regions (green) and baselines (pink) (FelixArchief, MA#17612, p. 85).
Table 2
Train-validation splits for each dataset.
| DATASET | SUBSET | PAGES | REGIONS | LINES | LINE LENGTH* | CHARACTERS | VOCABULARY** |
|---|---|---|---|---|---|---|---|
| VOC | train | 4261 | 6079 | 132611 | 36.37 (±17.84) | 4823321 | 120 |
| VOC | valid | 474 | 655 | 15154 | 35.79 (±17.57) | 542321 | 101 |
| Notarial | train | 1453 | 3624 | 92003 | 35.03 (±20.03) | 3222690 | 107 |
| Notarial | valid | 162 | 377 | 9554 | 35.78 (±18.67) | 341841 | 96 |
| Antw-expert | train | 243 | 1828 | 10766 | 30.63 (±17.10) | 329806 | 89 |
| Antw-expert | valid | 28 | 209 | 1260 | 29.59 (±16.77) | 37288 | 83 |
| Antw-students | train | 3099 | 27496 | 145387 | 29.00 (±17.76) | 4216129 | 118 |
| Antw-students | valid | 345 | 3089 | 16196 | 28.92 (±17.85) | 468359 | 106 |
| Antw-test | test | 101 | 715 | 4628 | 30.18 (±17.09) | 139658 | 90 |
[i] * Line length expressed in characters (including whitespace). Mean and standard deviation are reported.
** Number of unique characters in the character vocabulary.
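The per-subset statistics reported in Table 2 can be derived from a list of line transcriptions. The sketch below assumes the definitions given in the footnotes (line length in characters including whitespace; vocabulary as the number of unique characters); `dataset_stats` and the sample lines are illustrative, not the authors' code or data.

```python
from statistics import mean, stdev

def dataset_stats(lines: list[str]) -> dict:
    """Per-subset statistics under the assumed Table 2 definitions:
    line length in characters (incl. whitespace), total character count,
    and vocabulary = number of unique characters across all lines."""
    lengths = [len(line) for line in lines]
    return {
        "lines": len(lines),
        "line_length_mean": mean(lengths),
        "line_length_std": stdev(lengths) if len(lengths) > 1 else 0.0,
        "characters": sum(lengths),
        "vocabulary": len(set("".join(lines))),
    }

# Two illustrative transcription lines (from Table 1), not a real subset.
sample = [
    "1 Mei 1885 Mr Sergoynne",
    "Om 9 1/2 uren 's avonds",
]
print(dataset_stats(sample))
```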
Table 3
HTR training results (CAR).
| MODEL | VALIDATION* | ANTW-TEST | ANTW-TEST (RELAXED)** |
|---|---|---|---|
| Manu (base model) | NA | 70.51% | 72.24% |
| Manu: VOC | 94.36% | 76.57% | 78.95% |
| Manu: VOC → Notarial | 95.68% | 83.04% | 85.22% |
| Manu: VOC → Notarial → Antw-expert | 91.47% | 90.01% | 91.57% |
| Manu: VOC → Notarial → Antw-expert + Antw-students | 89.97% | 92.58% | 93.97% |
| Antw-expert + Antw-students (scratch) | 89.05% | 91.54% | 92.88% |
| Manu: VOC + Notarial + Antw-expert + Antw-students (super model) | 92.78% | 92.31% | 93.69% |
[i] *Note that the validation scores cannot be compared directly across rows, since each model is evaluated on a different validation set; they are nevertheless useful as an indication of in-domain model fit.
**Whitespace, punctuation and capitalization are ignored when computing the score.
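The CAR scores in Table 3 can be sketched as follows. This assumes the common formulation CAR = 1 − edit distance / reference length; the exact implementation used for the results above is not reproduced here, and the `car`, `relaxed`, and example strings are illustrative. The relaxed variant drops whitespace and punctuation and ignores capitalization, per the footnote.

```python
import string

def levenshtein(a: str, b: str) -> int:
    """Standard dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def car(reference: str, hypothesis: str) -> float:
    """Character accuracy rate: 1 - edit distance / reference length."""
    return 1.0 - levenshtein(reference, hypothesis) / len(reference)

def relaxed(text: str) -> str:
    """Drop whitespace and punctuation, ignore capitalization."""
    return "".join(ch for ch in text
                   if not ch.isspace() and ch not in string.punctuation).lower()

# Illustrative reference/hypothesis pair, not taken from the evaluation data.
ref = "Broes, Dymphna, 16 jaar"
hyp = "Broes Dijmphna, 16 jaar"
print(f"CAR: {car(ref, hyp):.2%}")
print(f"relaxed CAR: {car(relaxed(ref), relaxed(hyp)):.2%}")
```

The relaxed score is typically higher because spacing, casing, and punctuation errors, which are common in HTR output, no longer count against the model.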
