
The Digital Scholarly Edition of Angelo Poliziano’s Marginal Notes to the Epistle of Sappho to Phaon. Encoding of Multiple Edition Levels and Implementation of Wikidata Items and CTS URNs

Open Access | March 2026

(1) Overview

Repository location

https://doi.org/10.5281/zenodo.17315766

https://github.com/AndreaLaVeglia/Postille_Poliziano_Sappho_EVT

The dataset presented here will be discussed more extensively in a forthcoming paper in the second issue of the journal TeCuM – Testi e Culture del Medioevo (https://serena.sharepress.it/TECUM/).

Context

This dataset is related to the digital scholarly edition (DSE) of Poliziano’s annotations on the Epistle of Sappho that I developed for my Master of Arts thesis at the University of Naples “Federico II”. It includes the encoding of the text, the configuration of the software viewer, and the XSLT stylesheets used.

Poliziano’s marginal notes on the Epistle of Sappho can be read in the Bodleian incunabulum Auct. P 2.2 (ff. 238v–241r), a copy of the 1477 Parma edition of Ovid’s Opera Omnia.1 Among all the annotations in the incunabulum, the marginal notes to the Epistle of Sappho are the most conspicuous (see Brenkman, 1722); they were first studied by M. Kubo (1985), who made a draft transcription of the text that cannot be considered a proper edition.

These annotations represent an intermediate stage of work between the study of primary sources and the academic lesson (enarratio) on this epistle delivered by Poliziano in Florence in 1481.2 The comparison between the marginal notes and the enarratio shows that the latter represents a more advanced stage of research, since the marginal notes “contained little or no material substance that were not to be later utilized for the Enarratio” (Kubo, 1985, p. 35).3

For my dissertation I completed, revised, and corrected Kubo’s transcription and then prepared a digital encoding from which both a printed and a digital edition can be exported.

The DSE was created using the second version of Edition Visualization Technology (EVT 2), an open-source viewer for texts encoded according to the Text Encoding Initiative (TEI) guidelines, which publishes the text along with related annotations and metadata (Rosselli Del Turco et al., 2014, 2019).4

In my edition, three view modes of EVT 2 are available: the “image-text” (mode-imgTxt, see Figure 1), the “text-text” (mode-txtTxt, see Figure 2) and the “source-text” (mode-srcTxt, see Figure 3) (Rosselli Del Turco et al., 2014, p. 11, 2019, p. 13, 2020). In the first mode you can select one of three parallel edition levels (facsimile, diplomatic-interpretative, and critical) and see the text–image alignment; in the second mode you can compare different transcriptions; and in the third mode you can explore the sources of quotations in the text.

Figure 1

The image-text view mode and the selector of transcription level.

Figure 2

The text-text view mode.

Figure 3

The source-text view mode.

In all view modes the named entities are highlighted and wrapped in an interactive button; clicking it opens a pop-up card displaying the Uniform Resource Locator (URL) of the corresponding Wikidata item and a list of the entity’s occurrences throughout the edition (Figure 4).

Figure 4

The Wikidata implementation for the named entity Strabo.

(2) Method

The digital space has enabled a multidimensional representation of texts, overcoming the limitations of the printed page, which is by its nature confined to a bidimensional space (see Reggiani, 2025). In this case study, I faced the challenge of editing an apparatus of marks in a book (for the definition of the genre see Stoddard, 1985; Petrella, 2022), namely all the annotations made in a book by its reader: due to its complexity and unsystematic character, an edition of such a text is only possible if it is conceived on multiple layers of representation.

In this DSE the following information is encoded in compliance with TEI guidelines:

  • semantic annotation of layout with attributes of <zone> in <facsimile>;

  • codicological information with <msDesc>;

  • categorization of each unit of text with <taxonomy> and attributes of <div> (taking inspiration from the model of Siciliano & Del Grosso, 2022);

  • double level of transcription using the <choice> element;

  • palaeographical analysis with <add> and <unclear>;

  • identification of sources with <quote> connected to a <cit> in <back>, with encoding of reference to online corpora via Uniform Resource Names (URNs) of Canonical Text Services (CTS);5

  • annotation of named entities by using <persName> and <placeName> nested within a <ref> element which refers to the Wikidata URL through the @target attribute;6

  • identification of the signs used by Poliziano to relate each annotation to the main text of the incunabulum, by using <metamark> with its @target attribute and a nested <g> element referring to a series of <glyph> elements declared in <charDecl>.
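The combination of these patterns can be illustrated by a simplified, hypothetical fragment. All xml:id values, the Wikidata number, the CTS URN and the sample readings below are placeholders for illustration, not taken from the actual encoding:

```xml
<!-- Hypothetical sketch of the encoding patterns listed above;
     identifiers, URN and readings are placeholders. -->
<div type="gloss" xml:id="gloss-01">
  <!-- sign relating the gloss to its anchor in the main text -->
  <metamark function="reference" target="#anchor-01">
    <!-- refers to a <glyph xml:id="sign-01"/> declared in <charDecl> -->
    <g ref="#sign-01"/>
  </metamark>
  <p>
    <!-- double transcription level: diplomatic vs. interpretative -->
    <choice><orig>qd</orig><reg>quod</reg></choice>
    <!-- named entity nested in a <ref> pointing to its Wikidata item -->
    <ref target="https://www.wikidata.org/wiki/Q00000"><persName>Strabo</persName></ref>
    <!-- quotation linked to the source apparatus in <back> -->
    <quote corresp="#cit-01">…</quote>
  </p>
</div>
<!-- in <back>: the corresponding citation carrying a CTS URN -->
<cit xml:id="cit-01">
  <ref target="urn:cts:greekLit:tlg0099.tlg001:1.2.3">Strabo, Geographica 1.2.3</ref>
</cit>
```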

Steps

The work began with a sequence of close-up photographs taken by Prof. Giovan Battista D’Alessio at the Bodleian Library, which I processed in Metashape to produce high-resolution pictures of full pages.

Afterward, I used Transkribus (© ReadCoop)7 for manual transcription and semantic annotation of layout and text. In addition to native tags, I created custom tags according to TEI guidelines (Figure 5).

Figure 5

Pipeline of the project.

Regarding the named entities, I took advantage of the Transkribus “Wikidata ID” tool, which allows connecting each name with its related Wikidata item and storing its identifier as an attribute. Specifically, in the web app, after a tag (e.g. ‘person’ or ‘place’) has been chosen, clicking the Wikidata ID button makes the system automatically suggest an entity linked to the selected term. The user can accept the suggested item or use the search field to look up the correct Wikidata entry.8 Since the version released in May 2025, it has been possible to tag all occurrences of the same term in a Transkribus document automatically, including the Wikidata ID attribute.9

After completing the transcription and annotation, I addressed the issues related to data export. While Transkribus enables the export of transcriptions and tags in TEI XML format, the resulting files do not always conform to what EVT 2 expects, and customized tags are not converted into canonical TEI elements. An additional encoding regularization step was therefore necessary. To carry out this transformation, I developed a set of XSLT stylesheets, with assistance from ChatGPT, as documented in the GitHub repository.

The first XSLT (Pipeline-XSLT/01-regularize_TEI.xsl) regularizes the <facsimile> (according to EVT 2 expectations), <choice>, <sic> and <del> elements, and creates both the source apparatus and the index of named entities. The transformation of each element is described in the comments of the stylesheet.
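As a hypothetical illustration of this kind of regularization (the actual templates are documented in the repository), a stylesheet in this spirit would combine an identity transform with templates that rewrite non-canonical export tags; the element name personTK and its @wikidata attribute below are invented placeholders:

```xml
<!-- Sketch of a regularization pass; "personTK" is a made-up
     placeholder, not a real Transkribus export tag. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.tei-c.org/ns/1.0">

  <!-- identity template: copy everything unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>

  <!-- rewrite the hypothetical export tag into canonical TEI -->
  <xsl:template match="tei:personTK">
    <ref target="{@wikidata}">
      <persName><xsl:apply-templates/></persName>
    </ref>
  </xsl:template>
</xsl:stylesheet>
```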

Subsequently, the division of the whole text into glosses was managed by a second XSLT (Pipeline-XSLT/02-create_div.xsl), and I manually assigned a semantic ID to each gloss. Then, with a third XSLT (Pipeline-XSLT/03-order_div.xsl), all <div> elements were arranged in ascending order.
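A reordering step of this kind can be sketched as follows, under the assumption that the glosses carry @xml:id values that sort correctly as strings (the fragment presupposes a stylesheet where the TEI namespace is bound to the tei prefix and set as the default output namespace):

```xml
<!-- Sketch of reordering <div> elements in the spirit of
     03-order_div.xsl; assumes string-sortable @xml:id values. -->
<xsl:template match="tei:body">
  <body>
    <xsl:for-each select="tei:div">
      <xsl:sort select="@xml:id" data-type="text" order="ascending"/>
      <xsl:copy-of select="."/>
    </xsl:for-each>
  </body>
</xsl:template>
```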

At the conclusion of this XSLT sequence, a line of validation code was added to ensure compliance with the tei_all schema. This was followed by the manual categorization of each text division and the description of the metatextual signs. Finally, all codicological information was recorded in the <msDesc> element, along with additional metadata in the <fileDesc>.
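Such a validation line is typically an xml-model processing instruction at the top of the TEI file; a common form associating the document with the tei_all RELAX NG schema looks like this (the exact schema location used in the project may differ):

```xml
<?xml-model href="http://www.tei-c.org/release/xml/tei/current/xml/tei/custom/schema/relaxng/tei_all.rng"
            type="application/xml"
            schematypens="http://relaxng.org/ns/structure/1.0"?>
```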

At the end of this regularization of the TEI XML encoding, I configured the visualization in EVT 2 by editing the settings in the config folder of the software (DigitalEditionEVT/config). Subsequently, the print edition was produced by exporting a LaTeX document from the XML encoding (Pipeline-XSLT/06-LaTeX_export.xsl), which was then used to generate a Portable Document Format (PDF) file (Figure 5).

(3) Dataset Description

Repository name

GitHub (https://github.com/AndreaLaVeglia/Postille_Poliziano_Sappho_EVT); Zenodo (https://doi.org/10.5281/zenodo.17315766).

Object name

The Digital Scholarly Edition of Marginal Notes of Poliziano to the Epistle of Sappho to Phaon.

Format names and versions

XML, XSL

Creation dates

2025-02-15 to 2026-02-13

Dataset creators

Andrea La Veglia (University of Salerno): creation and curation of dataset; Giovan Battista D’Alessio (University of Rome “La Sapienza”): image acquisition and supervision of transcriptions and philological work; Gennaro Ferrante (University of Naples “Federico II”): supervision of workflow and technical work; Roberto Rosselli Del Turco (University of Turin): creator and developer of software “EVT 2”; Luigi Bambaci (University of Bologna): assistance in developing of XSLT transformation from XML TEI to LaTeX; Angelo Mario Del Grosso (CNR-ILC): supervision and validation of XML TEI encoding.

Language

English, Italian, Latin, Ancient Greek.

License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

Publication date

2025-10-10.

The GitHub repository is linked to Zenodo, in line with best practices for publishing digital editions, as in the case of the Perseus Project (Cerrato, Babeu, et al., 2025; Cerrato, Clérice, et al., 2025) and of the Bellini Digital Correspondence (Del Grosso & Spampinato, 2023).

When a named entity lacked a corresponding Wikidata item, the project included the creation of a new one, as in the case of the work De Elocutione by Demetrius the Rhetorician (https://www.wikidata.org/wiki/Q131412205).

(4) Reuse Potential

The first potential reuse of the dataset concerns the pipeline. The sequence of XSLT stylesheets can be applied to any TEI XML file exported from a Transkribus document, provided it uses the same customized tags adopted in this project. In this way Transkribus can serve as an effective environment for producing digital scholarly editions. Moreover, as shown above, once the digital edition is encoded, generating a print edition directly from the encoding is also feasible.

In addition, the encoding introduces innovative applications of the TEI guidelines for representing the relationships between main texts and paratexts (or specifically postillae) and proposes a model for encoding other similar paratexts rich in information distributed across multiple layers of analysis.

Finally, the inclusion of references to Wikidata items and CTS URNs contributes to making the edition both open and interoperable, enabling meaningful dialogue and data exchange with other projects.

Notes

[1] The incunabulum has been studied by Owen (1889, pp. XII–XVI), Cotton (1937, p. 397), Maïer (1965, pp. 350–351), Daneloni (2013, p. 311) and Villani (2018, pp. 1037–1042).

[2] Text published by E. Lazzeri (1971). With regard to the professorship of Poliziano at the University of Florence see: Branca (1983, pp. 73–90), Cesarini Martinelli (1996), Mandosio (2008), Orvieto (2009, pp. 324–326).

[3] About this point, see also Cesarini Martinelli (1978, p. 143).

[4] Official website of the EVT project: http://evt.labcd.unipi.it/.

[5] Canonical Text Services (CTS) is a protocol for creating machine-readable citations by means of Uniform Resource Names (URNs). The CTS protocol is based on the CITE (“Collections, Indices, Texts, and Extensions”) Architecture, a framework for identifying citations that considers the text as an “Ordered Hierarchy of Citable Objects” (OHCO) (with regard to URNs: Moats, 1997; with regard to OHCO: Smith & Weaver, 2009; with regard to the CITE architecture and CTS URNs: Blackwell & Smith, 2019; Tiepmar & Heyer, 2019; Berti, 2021, pp. 105–114, 2023, pp. 318–319).
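The general anatomy of a CTS URN can be shown with an illustrative example (the identifiers below are used only to demonstrate the structure, not to cite a specific edition in this project):

```
urn:cts:greekLit:tlg0099.tlg001:1.2.3
│   │   │        │       │      └─ passage reference (e.g. book.chapter.section)
│   │   │        │       └─ work identifier
│   │   │        └─ textgroup (author) identifier
│   │   └─ CTS namespace
│   └─ CTS protocol marker
└─ URN scheme
```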

[6] See the TEI P5 guidelines for implementation of Wikidata references in <listPerson> and <listPlace> at https://www.tei-c.org/Vault/P5/4.9.0/doc/tei-p5-doc/en/html/examples-listPerson.html and https://www.tei-c.org/Vault/P5/4.9.0/doc/tei-p5-doc/en/html/examples-listPlace.html.

[9] See the change log (viewable only by registered users) at https://app.transkribus.org/changelog, release of 7 May 2025.

Acknowledgements

The data paper presents the results of my Master of Arts dissertation presented at the University of Naples “Federico II” in February 2025, under the supervision of Prof. Giovan Battista D’Alessio and Prof. Gennaro Ferrante whom I thank for their guidance. My research benefited from the activities organised by the FeDHLab, Digital Humanities laboratory at the Department of Humanities of the University of Naples.

I am very grateful to all the experts who provided invaluable support for the development of this work and of this paper: Roberto Rosselli del Turco, Angelo Mario Del Grosso, Camillo Carlo Pellizzari di San Girolamo, Luigi Bambaci, Nicola Reggiani, Thibault Clérice, Sara Mansutti, Monica Berti, Michele Giovanni Silani, Riccardo Montalto, Paolo Monella and Aanandavardhan Nyaupane.

I thank Andrea Farina, the editorial team of the JHOD and the anonymous reviewers for their suggestions and assistance in the publishing process.

Competing Interests

The author was supported by the Transkribus Scholarship program for one year and is currently regional coordinator for Wikimedia Italia.

Author Contributions

Andrea La Veglia: Conceptualization, Data curation, Investigation, Methodology.

DOI: https://doi.org/10.5334/johd.493 | Journal eISSN: 2059-481X
Language: English
Submitted on: Dec 2, 2025 | Accepted on: Jan 29, 2026 | Published on: Mar 13, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Andrea La Veglia, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.