From Experiments to Epistemic Practice: The RISE Humanities Data Benchmark

Open Access | Mar 2026

References

  1. Abdurahman, S., Salkhordeh Ziabari, A., Moore, A. K., Bartels, D. M., & Dehghani, M. (2025). A primer for evaluating large language models in social science research. Advances in Methods and Practices in Psychological Science, 8(2). 10.1177/25152459251325174
  2. Bamman, D., Chang, K. K., Lucy, L., & Zhou, N. (2024, October). On classification with large language models in cultural analytics. arXiv.
  3. Barker, M., Chue Hong, N. P., Katz, D. S., Lamprecht, A.-L., Martinez-Ortiz, C., Psomopoulos, F., … Honeyman, T. (2022, October). Introducing the FAIR Principles for research software. Scientific Data, 9(1), 622. 10.1038/s41597-022-01710-x
  4. Dobson, J. (2020, June). Interpretable Outputs: Criteria for Machine Learning in the Humanities. Digital Humanities Quarterly, 15(2).
  5. Hamilton, S., Wilkens, M., & Piper, A. (2025, October). NarraBench: A Comprehensive Framework for Narrative Benchmarking. arXiv. 10.48550/arXiv.2510.09869
  6. Hauser, J., Kondor, D., Reddish, J., Benam, M., Cioni, E., Villa, F., … del Rio-Chanona, R. M. (2024, November). Large language models’ expert-level global history knowledge benchmark (hiST-LLM). In The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track. Retrieved from https://openreview.net/forum?id=xlKeMuyoZ5#discussion
  7. Hindermann, M. (2024, August). FAIR use of GPT-generated data in SSH research: A practical guide.
  8. Hindermann, M., Marti, S., Alberto, A., Burkhardt, S., Decker, E., Frick, P., … Spadini, E. (2026, January). RISE-UNIBAS/humanities_data_benchmark. Zenodo. 10.5281/zenodo.18293269
  9. Hindermann, M., Marti, S., Kasper, L., & Bosse, A. (2026). The RISE Humanities Data Benchmark: A framework for evaluating large language models for humanities tasks. Journal of Open Humanities Data, 12(1), 24. 10.5334/johd.481
  10. Kang, Z., Gong, J., Yan, J., Xia, W., Wang, Y., Wang, Z., … Li, X. (2025, June). HSSBench: Benchmarking humanities and social sciences ability for multimodal large language models. arXiv. 10.48550/arXiv.2506.03922
  11. Karjus, A. (2025, February). Machine-assisted quantitizing designs: Augmenting humanities and social sciences with artificial intelligence. Humanities and Social Sciences Communications, 12(1), 277. 10.1057/s41599-025-04503-w
  12. Khadangi, A., Sartipi, A., Tchappi, I., & Fridgen, G. (2025, February). CognArtive: Large language models for automating art analysis and decoding aesthetic elements. arXiv. 10.48550/arXiv.2502.04353
  13. Marti, S. (2024, April). NDR Core. Zenodo. 10.5281/zenodo.10969133
  14. Marti, S. (2025, October). generic-llm-api-client: A unified, provider-agnostic Python client for multiple LLM APIs. Retrieved 2025-11-14, from https://github.com/RISE-UNIBAS/generic_llm_api_client
  15. Simons, A., Zichert, M., & Wüthrich, A. (2025, June). Large language models for history, philosophy, and sociology of science: Interpretive uses, methodological challenges, and critical perspectives. arXiv. 10.48550/arXiv.2506.12242
  16. Sokol, A., Daly, E., Hind, M., Piorkowski, D., Zhang, X., Moniz, N., & Chawla, N. (2024). BenchmarkCards: Standardized documentation for large language model benchmarks. arXiv. 10.48550/arXiv.2410.12974
  17. Spinaci, G., Klic, L., & Colavizza, G. (2025, September). Benchmarking vision–language and multimodal large language models in zero-shot and few-shot scenarios: A study on Christian iconography. arXiv. 10.63744/oxWtm5MhhwBH
  18. Treloar, A., Groenewegen, D., & Harboe-Ree, C. (2007, September). The Data Curation Continuum: Managing Data Objects in Institutional Repositories. D-Lib Magazine, 13(9/10). 10.1045/september2007-treloar
  19. Ziems, C., Held, W., Shaikh, O., Chen, J., Zhang, Z., & Yang, D. (2024, March). Can large language models transform computational social science? Computational Linguistics, 50(1), 237–291. 10.1162/coli_a_00502
DOI: https://doi.org/10.5334/johd.470 | Journal eISSN: 2059-481X
Language: English
Submitted on: Nov 15, 2025 | Accepted on: Jan 21, 2026 | Published on: Mar 2, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Maximilian Hindermann, Lea Katharina Kasper, Sorin Marti, Arno Bosse, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.