Abstract
This paper presents a systematic workflow for integrating historical humanities datasets with Wikidata, using Berlin’s Cold War-era public transport network (1946–1989) as a case study. Through a three-phase methodology (data documentation with reconciliation, computational domain investigation, and modelling-aware upload planning), we identify fundamental challenges in representing temporal complexity within collaborative knowledge bases. Analysis of the collected transport-station records reveals significant gaps in Wikidata’s coverage of non-rail transport modalities, a systematic absence of temporal qualifiers on time-varying properties, and incomplete provenance documentation. These findings extend beyond transport history and illuminate broader tensions between the requirements of historical research and Wikidata’s present-focused design. We propose reusable strategies for representing discontinuous operation periods, geographic uncertainty, and snapshot sources while maintaining compatibility with Wikidata’s community standards and linked open data principles. The workflow demonstrates how digital humanities projects can become methodologically informed contributors to Wikidata, using systematic investigation to address the platform’s representational limitations for historical phenomena before any upload is executed.
