
A Lightweight File System Based Approach to Getting Data Ready for Data Management Solutions

Open Access | Apr 2025

Figures & Tables

dsj-24-1853-g1.png
Figure 1

The same example set of metadata, stored as key-value pairs, presented in three data exchange formats: a) YAML, b) JSON, and c) XML. The comparison illustrates the syntactic complexity of each format and highlights the human readability of YAML.
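The comparison in Figure 1 can be approximated with a short Python sketch that serializes one set of key-value pairs into the three formats. The metadata keys and values are hypothetical and are not taken from the figure. The `to_yaml` helper is a deliberately simplified emitter for nested dictionaries of plain strings, not a general YAML writer, and `to_xml` replaces spaces in keys with underscores because XML element names cannot contain spaces.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical metadata; keys and values are illustrative only.
metadata = {
    "user": "Jane Doe",
    "instrument": {"name": "potentiostat", "scan rate": "50 mV/s"},
}


def to_yaml(data, indent=0):
    """Emit a nested dict of plain strings as block-style YAML lines (simplified)."""
    lines = []
    pad = "  " * indent
    for key, value in data.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


def to_xml(data, tag="metadata"):
    """Convert a nested dict into an XML element tree."""
    element = ET.Element(tag)
    for key, value in data.items():
        child_tag = key.replace(" ", "_")  # XML names cannot contain spaces
        if isinstance(value, dict):
            element.append(to_xml(value, child_tag))
        else:
            ET.SubElement(element, child_tag).text = str(value)
    return element


yaml_text = "\n".join(to_yaml(metadata))
json_text = json.dumps(metadata, indent=2)
xml_text = ET.tostring(to_xml(metadata), encoding="unicode")

for label, text in (("YAML", yaml_text), ("JSON", json_text), ("XML", xml_text)):
    print(f"--- {label} ---\n{text}\n")
```

Printing the three strings side by side shows the point of the figure: the YAML form needs neither quotes and braces nor opening and closing tags.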

dsj-24-1853-g2.png
Figure 2

a) Snapshot of the graphical user interface of the autotag-metadata program (Hermann and Engstfeld, 2024), which monitors b) the file system for file creation events and tags newly created files with metadata. The program can be coupled with c) external editors, such as VSCodium, which provide syntax highlighting and can be used to validate the metadata against a schema.
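The monitor-and-tag workflow in the caption can be sketched with the Python standard library. This is not the autotag-metadata implementation: it detects new files by comparing directory snapshots rather than by subscribing to file system events, and the function names, the `.meta.yaml` sidecar suffix, and the template text are all assumptions made for illustration.

```python
import os
import tempfile


def snapshot(directory):
    """Return the names of all regular files currently in the directory."""
    return {entry.name for entry in os.scandir(directory) if entry.is_file()}


def tag_new_files(directory, known, template, suffix=".meta.yaml"):
    """Write a sidecar metadata file next to every file not seen before.

    Returns the updated set of known files and the list of newly tagged files.
    """
    current = snapshot(directory)
    created = sorted(
        name for name in current - known if not name.endswith(suffix)
    )
    for name in created:
        sidecar = os.path.join(directory, name + suffix)
        if not os.path.exists(sidecar):  # never overwrite existing metadata
            with open(sidecar, "w", encoding="utf-8") as handle:
                handle.write(template)
    # Take a fresh snapshot so the sidecars just written count as known.
    return snapshot(directory), created


# Demonstration in a temporary directory (hypothetical file name and template).
with tempfile.TemporaryDirectory() as watched:
    known = snapshot(watched)
    open(os.path.join(watched, "measurement.csv"), "w").close()
    known, created = tag_new_files(watched, known, "user: Jane Doe\n")
    files_after = sorted(known)

print(created)
print(files_after)
```

A long-running monitor would call `tag_new_files` repeatedly, for example in a loop with a short sleep, or replace the polling with operating-system file notifications.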

dsj-24-1853-g3.png
Figure 3

Example content of files for a) time series data stored as a CSV, recorded by a user with a specific instrument, and b) metadata stored as YAML alongside the CSV, for example using autotag-metadata (Hermann and Engstfeld, 2024) (see text for details). The YAML metadata file describes the structure of the CSV, here under the key figure description.fields, which lists the column names with their units and a description of the measured values. The metadata can also contain further information on the CSV, such as the user who recorded the data, the research question motivating the measurement, the instrument used, or the values set at the instrument.
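How such field descriptions can be used alongside the CSV is sketched below. The column names, units, and values are hypothetical, and the metadata is written as the Python dictionary that parsing the YAML file would yield, so that the example needs only the standard library. The key path `figure description` → `fields` follows the caption; the `name`, `unit`, and `description` keys inside each field are assumptions.

```python
import csv
import io

# Hypothetical CSV content; a real file would be opened from disk instead.
csv_text = "t,E,j\n0.0,0.10,1.5\n0.1,0.11,1.7\n"

# Dictionary as it might result from parsing the YAML metadata file.
metadata = {
    "user": "Jane Doe",
    "figure description": {
        "fields": [
            {"name": "t", "unit": "s", "description": "time"},
            {"name": "E", "unit": "V", "description": "potential"},
            {"name": "j", "unit": "A/m2", "description": "current density"},
        ]
    },
}


def check_columns(csv_text, metadata):
    """Verify that the CSV header matches the described fields; return units."""
    header = next(csv.reader(io.StringIO(csv_text)))
    fields = metadata["figure description"]["fields"]
    described = [field["name"] for field in fields]
    if header != described:
        raise ValueError(f"CSV columns {header} do not match metadata {described}")
    return {field["name"]: field["unit"] for field in fields}


units = check_columns(csv_text, metadata)
labels = [f"{name} [{unit}]" for name, unit in units.items()]
print(labels)
```

Checking the header against the metadata catches a renamed or missing column early, before the file reaches a data management solution.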

dsj-24-1853-g4.png
Figure 4

Snapshots of an electrochemical database displayed on the echemdb website (the echemdb community developers), generated from frictionless Data Packages, which in turn were inferred from literature data using svgdigitizer (Engstfeld et al., 2023). a) Overview page showing a list of entries, each with a thumbnail of the data and the descriptors from the Data Package metadata that are most relevant to the community. b) Page of an individual entry, reached by clicking on the thumbnail in a), which provides detailed information on the entry, summarizes its metadata, and includes an interactive plot. Both pages include links to the original publication.
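A minimal sketch of a Data Package descriptor for one CSV resource is given below, built as a plain dictionary and serialized with the standard library instead of the frictionless tooling. The `resources`, `name`, `path`, `format`, and `schema.fields` properties follow the Data Package specification. The `unit` entry on each field and the `metadata` entry on the resource are custom properties assumed here for illustration, and the file name, resource name, and values are hypothetical; the layout actually used by echemdb may differ.

```python
import json


def build_data_package(name, csv_path, fields, metadata):
    """Assemble a descriptor with a single tabular resource."""
    return {
        "resources": [
            {
                "name": name,  # lowercase, no spaces, per the specification
                "path": csv_path,
                "format": "csv",
                "schema": {"fields": fields},
                "metadata": metadata,  # custom property (assumption)
            }
        ]
    }


descriptor = build_data_package(
    name="example_entry",
    csv_path="example_entry.csv",
    fields=[
        {"name": "t", "type": "number", "unit": "s"},  # "unit" is custom
        {"name": "E", "type": "number", "unit": "V"},
    ],
    metadata={"user": "Jane Doe"},
)

descriptor_json = json.dumps(descriptor, indent=2)
print(descriptor_json)
```

Written to a `datapackage.json` file next to the CSV, a descriptor of this shape keeps the data, its column schema, and its descriptive metadata together in one place.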

Language: English
Submitted on: Nov 7, 2024
Accepted on: Mar 11, 2025
Published on: Apr 21, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Albert K. Engstfeld, Johannes M. Hermann, Nicolas G. Hörmann, Julian Rüth, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.