(1) Overview
Repository location – https://doi.org/10.5281/zenodo.13212804
Context and Motivation
The Resource Description Framework (RDF)1 provides a standardized way to publish data on the Web as statements, known as RDF triples. In the field of Digital Humanities, RDF technology is used to create digital editions because it offers a robust framework for organizing, annotating, and disseminating data present in vast collections of historical manuscripts. Examples include Bernoulli-Euler-Digital,2 LetterSampo,3 Anton Webern Gesamtausgabe,4 and Tolstoy semanticized (Bonch-Osmolovskya et al., 2019). RDF technology implements interconnected data within and across open repositories based on the “follow your nose” principle (Dodds and Davis, 2012). That is, users and web applications can access underlying RDF resources and ontology constructs (classes and predicates) by dereferencing their IRIs.
Even though RDF technology is well suited for representing and storing inherently connected data, standard RDF is not an optimal choice for representing meta-level information that requires statements about statements. This makes it challenging to create and query digital editions of metadata-oriented documents, such as travel journals, because most of the information in such documents is accompanied by metadata describing it. Consider, for example, the statement “person A was at location B” on 15 June 2024. Here, the date is meta-information about the main statement. Creating statements about statements using standard RDF is cumbersome. The very first RDF 1.0 specification uses a mechanism called reification to support statements about statements. Reification, however, introduces processing overhead because of the additional statements needed to identify the reference triple, and the result is verbose in both RDF and SPARQL (Kasenchak et al., 2021). RDF-star and SPARQL-star overcome this deficit by extending the RDF standard and reducing query time. RDF-star allows a triple to represent metadata about another triple by directly using that other triple as its subject or object (Hartig, 2017). Scholars are currently using RDF-star for meta-level data modeling; for example, Ruppert et al. (2023) suggest a workflow for this purpose.
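The difference in verbosity between reification and RDF-star can be made concrete with a small sketch. The following Python snippet is only an illustration under stated assumptions: triples are modeled as plain tuples rather than with an RDF library, and every `ex:` name as well as the blank-node label `_:stmt1` is a hypothetical placeholder, not part of JourneyStar.

```python
# Illustrative sketch: triples are plain tuples; all "ex:" names are hypothetical.
CORE = ("ex:personA", "ex:wasAt", "ex:locationB")
DATE = "2024-06-15"


def reify(triple, stmt_id, annotations):
    """Standard RDF reification: four triples identify the statement,
    then one more triple is needed per piece of metadata."""
    s, p, o = triple
    identifying = [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", s),
        (stmt_id, "rdf:predicate", p),
        (stmt_id, "rdf:object", o),
    ]
    return identifying + [(stmt_id, pred, val) for pred, val in annotations]


def annotate_star(triple, annotations):
    """RDF-star style: the quoted triple itself is the subject of each annotation."""
    return [(triple, pred, val) for pred, val in annotations]


reified = reify(CORE, "_:stmt1", [("ex:onDate", DATE)])
starred = annotate_star(CORE, [("ex:onDate", DATE)])
print(len(reified), len(starred))  # 5 1
```

In this toy model, a single date annotation costs five statements with reification and one with an embedded triple; each further annotation adds one statement in both cases, so the fixed cost of four identifying triples per annotated statement is what accumulates across a large graph.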
Using RDF-star, we can easily attach metadata to the edges of the knowledge graph and use SPARQL-star to query and analyze the graph concerning the metadata information stored for triples. RDF-star technology significantly reduces the complexity of the data models and the resulting knowledge graph. It allows data modelers to reduce the complexity of the primary data by shifting additional information to the meta-level, eliminating the need to define project-specific RDF classes to represent metadata information and create thousands of blank nodes to represent metadata via reification. RDF-star technology also facilitates adding provenance and certainty information to the statements, essential for faithful representation and citability (see the section titled Provenance-Aware Knowledge Representation through RDF-star).
In this paper, we present our approach to defining a research-based RDF-star ontology, JourneyStar,5 that is openly accessible on GitHub6 and developed to represent travel data (historical or modern) with all metadata information. In 2023, we used this ontology to construct the knowledge graph of Reisbüchlein and created the first RDF-star-based digital edition, available on the Bernoulli-Euler Digital platform, with tools to analyze and visualize the graph data.7 This digital edition is region-based, multilayered, and interactive, with the facsimiles, texts, and metadata all presented on the web application in a highly functional and interconnected way. The description of the digital edition is out of this paper’s scope; we will focus on the ontology and meta-level data modeling.
Ontology Description
Representing travel data, modern or historical, as a knowledge graph based on Linked Open Data (LOD) principles requires a generic and comprehensive ontology containing definitions of RDF classes and predicates describing travel data: journeys, sub-journeys, excursions, activities, accommodations, means of transportation, etc. The existing Trip8 ontology contains the definitions of some of these necessary constructs. Yet, it is based on the standard RDF and does not have the required structures for representing the metadata, such as provenance information and spatiotemporal data. Hence, we have aimed to devise a versatile RDF-star-based ontology for travel data capable of thoroughly and effectively encapsulating the wealth of information in travel accounts (Alassi and Rosenthaler, 2022). This ontology is developed through extensive research in travel journals, particularly Jacob I Bernoulli’s travel diary, Reisbüchlein.
Jacob I Bernoulli (1654–1705) traveled in the years 1676 to 1683 across Europe in pursuit of knowledge; during his journeys, he not only engaged in typical activities such as sightseeing, meeting friends, and excursions but also worked as a private teacher and gave lectures. He meticulously noted details about his itineraries, accommodations, means of transportation, food consumption, and corresponding costs in this journal. To develop the JourneyStar ontology, we closely studied the content of the Reisbüchlein to identify the constructs necessary to fully represent the travel data, staying as close as possible to the content of this travel diary without losing generality (Ammann et al., 2023).
Most information in documents such as travel journals is accompanied by metadata. For example, in his travel diary, Bernoulli states that he left Basel on August 20, 1676, a date in the Julian calendar. As shown in Table 1, a standard RDF resource can represent the journey. Basel, as the origin of the journey, can be represented with an RDF triple. This core statement then has the departure date as its meta-level information. The core statement is defined as an edge of the graph; the departure date describes this edge and is therefore attached to it, as shown in Figure 1. Since RDF-star makes it possible to embed triples directly within other triples, we can represent the core statement with its metadata as an annotated triple in which the core triple takes the subject position. Annotated triples can in turn have further metadata statements describing them. In this example, the calendar type is a statement about the statement representing the departure date (metadata of metadata). Thus, we can have multiple levels of embedded triples (see Figure 1). Table 1 contains the graph data in Turtle serialization.

Table 1
Examples of representing metadata using RDF-star.

Figure 1
Journey from Basel to Geneva.
The JourneyStar ontology contains multiple constructs (classes and predicates) to describe the journeys with all associated metadata information.
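The layering described above (core statement, departure date, calendar type) can be sketched as nested tuples in Python. This is a minimal illustration, not the serialization of Table 1: the identifiers `ex:journey1`, `ex:hasOrigin`, `ex:departureDate`, and `ex:calendar` are hypothetical placeholders chosen for the example.

```python
# Hypothetical identifiers; the nesting mirrors "metadata of metadata".
core = ("ex:journey1", "ex:hasOrigin", "ex:Basel")
dated = (core, "ex:departureDate", "1676-08-20")   # statement about the core triple
with_calendar = (dated, "ex:calendar", "Julian")   # statement about the dated triple


def nesting_depth(term):
    """Return 1 for a plain triple and add 1 for every level of embedded triple."""
    if isinstance(term, tuple):
        return 1 + max(nesting_depth(part) for part in term)
    return 0


def core_statement(triple):
    """Follow the subject position down to the innermost (core) triple."""
    while isinstance(triple[0], tuple):
        triple = triple[0]
    return triple


print(nesting_depth(with_calendar))          # 3
print(core_statement(with_calendar) == core)  # True
```

Because each annotation keeps the annotated triple in its subject position, the core statement can always be recovered by unwrapping subjects, without any auxiliary node.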
(2) Method: Definition of Ontology Constructs
The JourneyStar ontology, with prefix js, is based on the Trip9 ontology and other existing ontologies, such as schema,10 foaf,11 dbo,12 and beol,13 and reuses their definitions by declaring subclasses and sub-properties. These ontologies were chosen for their widespread use in scholarly digital editions of historical texts. The js:Journey class defines a journey, including means of transportation, participants, stays, transits, dates, and activities. A journey begins with the departure date from the origin and concludes on the date of departure from the destination, marking the end of the stay at the destination. The date values can be in xsd:date or xsd:dateTime format and must be accompanied by the date’s calendar for accurate representation on the web application and to facilitate the comparison of dates in various calendars. Figure 1 shows the RDF-star representation of Jacob I Bernoulli’s journey from Basel to Geneva.
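To see why the calendar must accompany each date, consider comparing a Julian date with a Gregorian one. The sketch below converts a Julian calendar date to its Gregorian equivalent through the Julian Day Number, using the standard integer formula. The function name `julian_to_gregorian` is an assumption made for this example and is not part of JourneyStar or of the web application.

```python
from datetime import date


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian calendar date to a proleptic Gregorian date."""
    # Julian Day Number of a Julian calendar date (standard integer formula).
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    # Python ordinals count days from 0001-01-01 (Gregorian), which is JDN 1721426.
    return date.fromordinal(jdn - 1721425)


# Bernoulli's departure from Basel, recorded in the Julian calendar.
departure = julian_to_gregorian(1676, 8, 20)
print(departure.isoformat())  # 1676-08-30
```

In the seventeenth century the two calendars differ by ten days, so a date stored without its calendar could be misplaced by that amount when compared with dates from regions that had already adopted the Gregorian calendar.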
In this ontology, we have defined the class js:Excursion to represent round trips of short duration without an overnight stay. The class js:Journey represents trips that last more than a day. A journey might include several stops; if a stop includes an overnight stay, it is a sub-journey or stage (represented by the predicate js:hasStage); otherwise, it is a transit (represented by the predicate js:transitThrough). A sub-journey is of type js:Journey; thus, all properties of this class can be used to describe it. The description of a main journey and its sub-journeys contains information about the stay at the destination, which can be represented using the js:Stay class with properties describing the time frame of the stay and the accommodation. The class js:Accomodation contains sub-classes representing different accommodation types, such as js:GuestHouse. The ontology includes the js:Activity class to describe various activities undertaken during a journey; it has subclasses such as js:Sightseeing and js:Dining to represent different activity types. Figure 2 shows the first stage of Jacob Bernoulli’s journey en route to Geneva. This stage was a journey from Basel to Liestal with a transit through Nieder Schöntal; it included a one-night stay at a guest house and a dinner with Johann Ulrich Frey on 20.08.1676, which cost 2 CHF.

Figure 2
Sub-journey from Basel to Liestal en route to Geneva.
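The rule distinguishing stages from transits can be sketched as a small Python function. The `Stop` data class and the function name `predicate_for` are hypothetical helpers introduced only for this illustration; the two predicates and the place names come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Stop:
    """A stop along a journey (hypothetical helper type for this sketch)."""
    place: str
    overnight: bool


def predicate_for(stop: Stop) -> str:
    """A stop with an overnight stay is a stage; otherwise it is a transit."""
    return "js:hasStage" if stop.overnight else "js:transitThrough"


stops = [
    Stop("Nieder Schöntal", overnight=False),  # passed through
    Stop("Liestal", overnight=True),           # one-night stay at a guest house
]
for stop in stops:
    print(stop.place, predicate_for(stop))
```

In the ontology itself, a stage is a full js:Journey resource with its own stay and activities, whereas the sketch only shows which predicate links a stop to the main journey.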
The JourneyStar ontology follows event-based data modeling, an approach to designing and structuring data systems that focuses on capturing and modeling events within a domain. Like other event ontologies, JourneyStar provides a shared and formal specification of what happens in the real world (i.e., an event description typically includes spatiotemporal data and the participants in the event). In travel data, events can be activities, such as journeys, stays, dining, and sightseeing, or occurrences, such as encounters and observed natural phenomena. An event with spatiotemporal data and participants can be represented through the base class js:Event and the properties js:hasDate, js:hasLocation, and js:hasParticipant. Thus, the classes js:Excursion, js:Journey, js:Activity, and js:Occurrence (used for describing occurrences) are all subclasses of js:Event. In this way, we can overcome the common shortcomings of event-based ontologies in bridging the gap between spatiotemporal extents and participants to describe a specific domain event (Ashour, 2023, 212).
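The effect of deriving all event-like classes from js:Event can be sketched with a simple subclass table in Python. The class and property names are those mentioned in the text; the dictionary representation and the helper functions `ancestors` and `applicable_properties` are assumptions made for this illustration and do not reproduce the ontology file.

```python
# Subclass relations named in the text, as a child -> parent table.
SUBCLASS_OF = {
    "js:Excursion": "js:Event",
    "js:Journey": "js:Event",
    "js:Activity": "js:Event",
    "js:Occurrence": "js:Event",
    "js:Sightseeing": "js:Activity",
    "js:Dining": "js:Activity",
}
EVENT_PROPERTIES = ("js:hasDate", "js:hasLocation", "js:hasParticipant")


def ancestors(cls):
    """Return the chain of superclasses of a class, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def applicable_properties(cls):
    """Event properties apply to js:Event and to everything derived from it."""
    if cls == "js:Event" or "js:Event" in ancestors(cls):
        return EVENT_PROPERTIES
    return ()


print(ancestors("js:Dining"))              # ['js:Activity', 'js:Event']
print(applicable_properties("js:Dining"))
```

A dinner is therefore described with the same date, location, and participant properties as a journey, which is what keeps the spatiotemporal extent and the participants attached to one event description.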
Persons and locations can be added to the knowledge graph through the IRIs of their existing representations in LOD repositories. As shown in Figures 1 and 2 above, we used the existing representations in repositories such as DBpedia, Wikidata, or Bernoulli-Euler Online (BEOL)14 whenever possible. New resources can be created for persons and locations with no existing records through js:Person (a subclass of schema:Person) and js:Location (a subclass of dbo:settlement), respectively.15 To prevent duplicates and connect the resources to external data records, GND numbers16 can be used for persons and GeoNames IDs17 for locations, whenever applicable.
Furthermore, to add the currencies of costs to our knowledge graph, the resources of the currency repository can be used.18 Historical travel records might mention defunct currencies, for example, the Reichtaler commonly mentioned by Bernoulli. Since these old currencies have no representation in the mentioned currency database, they can be represented as hierarchical RDF structures with all their sub-units through the js:Currency class. The spellings of the old currencies found in various historical documents can be standardized according to the guidelines of the Money Museum.19 Given that the diverse usage of currency names can be valuable for historical research, alternative currency names found in historical texts can be associated with the relevant resources through the js:alternativeName predicate. The js:hasSubunit predicate describes the hierarchical relation of currencies. Through the js:value property, the value of a subunit can be added to the triple that represents this hierarchical relation; for example, 1 Batzen equals 4 Kreuzer, as shown in Figure 3. With this information in the graph, costs can be calculated in different currencies through SPARQL-star.

Figure 3
Currency Representation.
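The arithmetic enabled by annotating subunit relations with their values can be sketched in Python. The table below mirrors a js:hasSubunit relation annotated with js:value; only the Batzen to Kreuzer ratio comes from the text, while `HypotheticalUnit` and its value of 10 are invented solely to show a two-level hierarchy. The function names are likewise assumptions for this example.

```python
from fractions import Fraction

# (unit, subunit) -> number of subunits in one unit.
SUBUNITS = {
    ("Batzen", "Kreuzer"): 4,            # ratio given in the text
    ("HypotheticalUnit", "Batzen"): 10,  # made-up value, for illustration only
}


def factor(unit, target):
    """Number of `target` units in one `unit`, following subunit links
    downward; None if `target` is not below `unit` in the hierarchy."""
    if unit == target:
        return Fraction(1)
    for (parent, child), value in SUBUNITS.items():
        if parent == unit:
            below = factor(child, target)
            if below is not None:
                return value * below
    return None


def convert(amount, unit, target):
    """Convert an amount between two units of the same hierarchy."""
    down = factor(unit, target)
    if down is not None:
        return amount * down
    up = factor(target, unit)
    if up is not None:
        return amount / up
    raise ValueError(f"no subunit path between {unit} and {target}")


print(convert(3, "Batzen", "Kreuzer"))  # 12
print(convert(6, "Kreuzer", "Batzen"))  # 3/2
```

Exact fractions are used instead of floating-point numbers so that sums of historical costs remain exact when converted between units.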
Provenance-Aware Knowledge Representation through RDF-star
Standard RDF inherently lacks a mechanism for attaching provenance data to triples, which is crucial for making automatically generated and/or processed data authoritative (Sikos and Philp, 2020, 293). RDF-star enables the addition of provenance information to the knowledge graph to increase the credibility and citability of information. For scholarly research on graph data, providing information about the source document from which a statement is retrieved is particularly important. This way, the user studying the graph data can access the source document to examine the content closely. The JourneyStar ontology offers the js:mentionedIn property for this purpose, which accepts a resource IRI as its object value (see Figure 4).

Figure 4
Example of provenance data with RDF-star.
Furthermore, adding metadata about the information retrieval mode would be possible. It is crucial to include metadata indicating if a fact in the graph is retrieved through automatic natural language processing-based algorithms or if the statement results from a particular person’s research. For example, on page 13 of Reisbüchlein, it is mentioned that Jacob Bernoulli traveled with “H. Frey” from Basel to Liestal. Based on his research, editor A is 85% certain that the referenced “Mr. Frey” is Johann Ulrich Frey. The provenance and certainty information can be added to the graph using RDF-star through predicates js:accordingTo and js:certaintyPercentage, as shown in Figure 4 and Table 2 in Turtle serialization.

Table 2
Example of the Provenance and Certainty Representation with RDF-star.
Users will then be aware of the information’s provenance, and the editor’s role in acquiring and presenting the information will also be highlighted. Any ambiguity regarding the statement is made explicit through the predicate js:certaintyPercentage, which can also be used to present the information accordingly on the web application, for example with a different color or a pop-up note. Users can also query the graph for statements with certainty higher than 95%.
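The certainty filter mentioned above can be sketched in Python over tuple-based annotations. The three predicates are those named in the text; the `ex:` resource names are hypothetical, and attaching all three annotations directly to the core triple is an assumption of this sketch that may differ from the nesting used in Table 2.

```python
# Core statement: Johann Ulrich Frey took part in the first stage (hypothetical IRIs).
core = ("ex:stage1", "js:hasParticipant", "ex:JohannUlrichFrey")

annotations = [
    (core, "js:mentionedIn", "ex:Reisbuechlein_page13"),
    (core, "js:accordingTo", "ex:editorA"),
    (core, "js:certaintyPercentage", 85),
]


def statements_above(annotated, threshold):
    """Return the statements whose recorded certainty exceeds the threshold."""
    return [
        subject
        for subject, predicate, value in annotated
        if predicate == "js:certaintyPercentage" and value > threshold
    ]


print(statements_above(annotations, 95))  # []
print(statements_above(annotations, 80) == [core])  # True
```

With a threshold of 95, the identification of “H. Frey” is excluded, which is the behavior a cautious reader of the edition would expect from an 85% attribution.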
(3) Dataset Description
Repository name
JourneyStar
Object name
journeyStar.ttl and journeyStar_shacl.ttl.
Format names and versions
Turtle (.ttl)
Creation dates
2023-02-01
Dataset creators
Nora Olivia Ammann, initial ontology design; Ann Karimi Kern, SHACL shapes definition and documentation; Dr. Sepideh Alassi, supervision and generic ontology design. All mentioned contributors are affiliated with the University of Basel, Switzerland.
Language
English
License
CC BY
Publication date
Initial release on GitHub 2023-02-01, generic version release on GitHub 2024-01-23, publication on Zenodo 2024-08-04.
(4) Reuse Potential and Results
The RDF-star graph representing travel data can be stored in triplestores that support RDF-star technology. For example, Apache Jena Fuseki20 and GraphDB21 both have query endpoints supporting SPARQL-star, through which the graph data can be accessed and fully queried. We have defined SHACL22 shapes to validate the consistency of the graph data with the JourneyStar ontology before storing the graph; these shapes are openly accessible together with the ontology. Figure 5 shows an example of a SPARQL-star query to answer the question: “On his way to Geneva, when did Bernoulli arrive at Liestal, and how long did he stay there?” Figure 6 shows the query results. The intuitive hierarchy of the RDF-star model is reflected in the SPARQL-star query. By binding annotated triples to variables, we can express complex query criteria concisely.

Figure 5
SPARQL-star query for arrival date and duration of the stay at Liestal.

Figure 6
Results of the query in Figure 5.
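The idea of matching an embedded triple inside a query pattern can be sketched in Python. This is not the SPARQL-star query of Figure 5 and does not reproduce the results of Figure 6: the `ex:` predicates are hypothetical, and the two dates are illustrative values inferred from the one-night stay described earlier. Calendar handling is ignored here.

```python
from datetime import date

ANY = None  # wildcard, playing the role of a SPARQL variable

# Hypothetical core statement and illustrative annotations.
stay = ("ex:stage1", "ex:hasDestination", "ex:Liestal")
graph = [
    stay,
    (stay, "ex:arrivalDate", date(1676, 8, 20)),
    (stay, "ex:departureDate", date(1676, 8, 21)),
]


def match(pattern, triple):
    """True if every non-wildcard position of the pattern equals the value
    in the triple; a nested pattern is matched recursively."""
    for want, have in zip(pattern, triple):
        if want is ANY:
            continue
        if isinstance(want, tuple):
            if not (isinstance(have, tuple) and match(want, have)):
                return False
        elif want != have:
            return False
    return True


def select(pattern):
    """Return all triples in the graph that match the pattern."""
    return [triple for triple in graph if match(pattern, triple)]


liestal = (ANY, ANY, "ex:Liestal")  # embedded-triple pattern
arrival = select((liestal, "ex:arrivalDate", ANY))[0][2]
departure = select((liestal, "ex:departureDate", ANY))[0][2]
print(arrival, (departure - arrival).days)
```

The embedded pattern `liestal` is written once and reused in both lookups, which is the tuple-level analogue of binding an annotated triple to a variable in SPARQL-star.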
Figure 7 shows another SPARQL-star query to answer the question: “With whom did Bernoulli dine in Liestal, and what is the source and certainty of this information?”; see the corresponding response in Figure 8.

Figure 7
SPARQL-star query for the dining participant with provenance data.

Figure 8
Results of the query in Figure 7.
If we represented the same information in standard RDF through reification and tried to answer these questions with standard SPARQL, we would face processing overhead from the additional statements needed to identify the reference triple, and the SPARQL queries would be considerably more verbose. In comparison, SPARQL-star queries, despite the nested levels of RDF-star triples, have lower query times and are easier to compose. Kasenchak et al. (2021) provide an in-depth discussion of the performance gains achieved by applying RDF-star and SPARQL-star technologies.
(5) Conclusion
In conclusion, the implementation of RDF-star technology within the field of Digital Humanities, as exemplified by the JourneyStar ontology, marks a significant advancement in the representation and analysis of complex, metadata-rich travel data. While powerful in many respects, the traditional RDF framework encounters limitations when handling meta-level information. This challenge is particularly acute in metadata-oriented data, such as travel accounts. RDF-star effectively addresses these limitations by allowing for direct annotation of triples, thereby simplifying data models and enhancing data storage and querying efficiency. The JourneyStar ontology showcases the practical application of RDF-star in creating fully machine-readable open databases of historical and modern travel data. Integrating metadata seamlessly reduces the need for project-specific RDF classes and the cumbersome use of blank nodes for reification. This streamlines the creation of digital editions and enhances their usability and accuracy, particularly regarding provenance and data citation. The JourneyStar ontology is an open-source repository available on GitHub where users can actively contribute to its development by adding new classes and properties, enhancing documentation, and more through pull requests. Additionally, projects can leverage the existing constructs of the ontology and adapt them to their specific needs by creating subclasses and sub-properties tailored to their use cases.
Our work on the Reisbüchlein knowledge graph, now available on the Bernoulli-Euler Digital platform, illustrates the tangible benefits of RDF-star in Digital Humanities projects. This digital edition demonstrates how JourneyStar ontology can facilitate the development of rich, interconnected data graphs in a functional and accessible web application. In summary, RDF-star represents a significant leap forward in data modeling, offering a robust and efficient solution for managing metadata. As open humanities data repositories continue to evolve, technologies like RDF-star will play a crucial role in enabling more sophisticated and accurate representations of complex datasets, ultimately enriching our understanding and preservation of humanities data.
Notes
[1] https://www.w3.org/TR/rdf11-concepts/ (last accessed: 26 July 2024).
[2] Alassi, S., Rosenthaler, L. 2024. https://bernoulli-euler.dhlab.unibas.ch/ (last accessed: 26 July 2024).
[3] Hyvönen, E. et al. 2023. https://seco.cs.aalto.fi/projects/rrl/ (last accessed: 26 July 2024).
[4] https://www.dasch.swiss/project/anton-webern-gesamtausgabe (last accessed: 26 July 2024).
[5] http://journey-star.dhlab.unibas.ch (last accessed: 26 July 2024).
[6] https://github.com/dhlab-basel/JourneyStar (last accessed: 26 July 2024).
[7] https://bernoulli-euler.dhlab.unibas.ch/biography/Jacob%20I%20Bernoulli (last accessed: 26 July 2024).
[8] https://enterpriseintegrationlab.github.io/icity/Trip/doc/index-en.html (last accessed: 26 July 2024).
[10] https://schema.org/ (last accessed: 26 July 2024).
[11] http://xmlns.com/foaf/spec/ (last accessed: 26 July 2024).
[12] https://dbpedia.org/ontology/ (last accessed: 26 July 2024).
[13] https://app.dasch.swiss/project/yTerZGyxjZVqFMNNKXCDPF/data-models (last accessed: 26 July 2024).
[15] https://app.dasch.swiss/project/yTerZGyxjZVqFMNNKXCDPF/data-models (last accessed: 26 July 2024).
[16] https://www.dnb.de/EN/Professionell/Standardisierung/GND/gnd_node.html (last accessed: 26 July 2024).
[17] Identifier of a location according to GeoNames https://www.geonames.org/ (last accessed: 26 July 2024).
[18] https://spec.edmcouncil.org/fibo/ontology/FND/Accounting/ISO4217-CurrencyCodes/ (last accessed: 26 July 2024).
[19] Money Museum. Währungen Des Mittelalters. https://www.moneymuseum.com/pdf/gestern/04_mittelalter/19%20Waehrungen%20des%20Mittelalters.pdf (last accessed: 26 July 2024).
[20] https://jena.apache.org/documentation/fuseki2/ (last accessed: 26 July 2024).
[21] https://graphdb.ontotext.com/ (last accessed: 26 July 2024).
[22] https://www.w3.org/TR/shacl/ (last accessed: 26 July 2024).
Acknowledgements
We highly appreciate the contributions of our research assistant, Ann Karimi Kern, and volunteer researcher, Dr. Soledad Castaño Santos, in developing the ontology.
Funding Information
U.410.0003: Forschungsfonds (Excellent Junior Researcher), University of Basel.
Competing Interests
The authors have no competing interests to declare.
Author contributions
Dr. Sepideh Alassi: Conceptualization, Formal Analysis, Investigation, Methodology, Funding Acquisition, Project Administration, Supervision, Writing – original draft.
Nora Olivia Ammann: Data Curation, Formal Analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – review & Editing.
