References
- Abdin, M. I., Jacobs, S. A., Awan, A. A., Aneja, J., Awadallah, A., Awadalla, H., …, & Zhou, X. (2024). Phi-3 technical report: A highly capable language model locally on your phone. 10.48550/arxiv.2404.14219
- Berger, J. (1972). Ways of seeing. Penguin Books.
- Clark, K. (1956). The nude: A study in ideal form. Pantheon Books. 10.1515/9780691252896
- DeepSeek-AI, Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., …, & Bi, X. (2025). DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning. 10.48550/arxiv.2501.12948
- Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., Letman, A., …, & Stone, K. (2024). The Llama 3 herd of models. 10.48550/arxiv.2407.21783
- Dunn, A., Dagdelen, J., Walker, N., Lee, S., Rosen, A. S., Ceder, G., …, & Jain, A. (2022). Structured information extraction from complex scientific text with fine-tuned large language models. 10.48550/arxiv.2212.05238
- Impett, L., & Offert, F. (2022). There is a digital art history. Visual Resources, 38(2), 186–209. 10.1080/01973762.2024.2362466
- Kamath, A., Ferret, J., Pathak, S., Vieillard, N., Merhej, R., Perrin, S., …, & Hussenot, L. (2025). Gemma 3 technical report. 10.48550/arxiv.2503.19786
- Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174. 10.2307/2529310
- Mazzanti, P., Ferracani, A., Bertini, M., & Principi, F. (2025). Reshaping museum experiences with AI: The ReInHerit toolkit. Heritage, 8(7), 277. 10.3390/heritage8070277
- Offert, F., & Bell, P. (2023). imgs.ai: A deep visual search engine for digital art history. In A. Baillot, T. Tasovac, W. Scholger, & G. Vogeler (Eds.), International conference of the alliance of digital humanities organizations, DH2022. 10.5281/zenodo.8107778
- Pasquinelli, M., & Joler, V. (2021). The Nooscope manifested: AI as instrument of knowledge extractivism. AI & Society, 36(4), 1263–1280. 10.1007/S00146-020-01097-6
- Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., …, & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. In M. Meila & T. Zhang (Eds.), Proceedings of the 38th international conference on machine learning, ICML 2021 (Vol. 139, pp. 8748–8763). PMLR. http://proceedings.mlr.press/v139/radford21a.html
- Schneider, S., Springstein, M., Rahnama, J., Kohle, H., Ewerth, R., & Hüllermeier, E. (2022). iART: Eine Suchmaschine zur Unterstützung von bildorientierten Forschungsprozessen. In M. Geierhos (Ed.), 8. Tagung des Verbands Digital Humanities im deutschsprachigen Raum, DHd 2022 (pp. 142–147). 10.5281/zenodo.6328175
- Schwemmer, C., Knight, C., Bello-Pardo, E. D., Oklobdzija, S., Schoonvelde, M., & Lockhart, J. W. (2020). Diagnosing gender bias in image recognition systems. Socius, 6. 10.1177/2378023120967171
- Springstein, M., Schneider, S., Rahnama, J., Hüllermeier, E., Kohle, H., & Ewerth, R. (2021). iART: A search engine for art-historical images to support research in the humanities. In H. T. Shen et al. (Eds.), MM ’21: ACM multimedia conference (pp. 2801–2803). ACM. 10.1145/3474085.3478564
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M., Lacroix, T., …, & Lample, G. (2023). LLaMA: Open and efficient foundation language models. 10.48550/arxiv.2302.13971
- Tschannen, M., Gritsenko, A. A., Wang, X., Naeem, M. F., Alabdulmohsin, I., Parthasarathy, N., …, & Zhai, X. (2025). SigLIP 2: Multilingual vision-language encoders with improved semantic understanding, localization, and dense features. 10.48550/arxiv.2502.14786
- van de Waal, H. (1973–1985). Iconclass: An iconographic classification system. Completed and edited by L. D. Couprie with R. H. Fuchs. Amsterdam: North-Holland Publishing Company.
- van Straten, R. (1994). Iconography, indexing, Iconclass: A handbook. Foleor.
- Wang, Y., & Luo, Z. (2023). Enhance multi-domain sentiment analysis of review texts through prompting strategies. 10.48550/arxiv.2309.02045
- Warburg, A. (1907). Arbeitende Bauern auf burgundischen Teppichen. Zeitschrift für bildende Kunst, XVIII, 41–47.
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., …, & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, & A. Oh (Eds.), Advances in neural information processing systems 35: Annual conference on neural information processing systems 2022, NeurIPS 2022. http://papers.nips.cc/paper_files/paper/2022/hash/9d5609613524ecf4f15af0f7b31abca4-Abstract-Conference.html
- Yang, A., Li, A., Yang, B., Zhang, B., Hui, B., Zheng, B., …, & Qiu, Z. (2025). Qwen3 technical report. 10.48550/arxiv.2505.09388
- Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R., & Cao, Y. (2023). ReAct: Synergizing reasoning and acting in language models. In The eleventh international conference on learning representations, ICLR 2023. https://openreview.net/forum?id=WE_vluYUL-X
- Yee, K., Swearingen, K., Li, K., & Hearst, M. A. (2003). Faceted metadata for image search and browsing. In G. Cockton & P. Korhonen (Eds.), Proceedings of the 2003 conference on human factors in computing systems, CHI 2003 (pp. 401–408). ACM. 10.1145/642611.642681
