iWEEMS: Interactive Word Embeddings for Early Modern Science

Vojtěch Kaše; Jana Švadlenková; Jan Tvrz; Georgiana Hedesan; Petr Pavlas

doi:10.5334/johd.379

References

Akopyan, O., Barton, W., Baumgartner, F., Berrens, D., Kirchler, U., Korenjak, M., Luggin, J., Tautschnig, I., & Zathammer, S. (2023). Noscemus Wiki [Dataset]. Zenodo. 10.5281/ZENODO.7855322
Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics, 5, 135–146. 10.1162/tacl_a_00051
Burns, P. J. (2023). LatinCy: Synthetic Trained Pipelines for Latin NLP (Version 1). arXiv. 10.48550/ARXIV.2305.04365
Denooz, J. (2004). Opera Latina: Une base de données sur internet. Euphrosyne, 32, 79–88. 10.1484/J.EUPHR.5.125535
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 10.48550/ARXIV.1810.04805
Ehrmanntraut, A., Hagen, T., Konle, L., & Jannidis, F. (2021). Type- and Token-based Word Embeddings in the Digital Humanities. CHR 2021: Computational Humanities Research 2021, 2989, 23.
Harris, Z. S. (1954). Distributional Structure. Word, 10(2–3), 146–162. 10.1080/00437956.1954.11659520
Hedesan, G., Huber, A., Kodetová, J., Kříž, O., Kubíčková, J., Kaše, V., & Pavlas, P. (2025). EMLAP (Version v0.4) [Dataset]. Zenodo. 10.5281/ZENODO.14765294
Lenci, A. (2018). Distributional Models of Word Meaning. Annual Review of Linguistics, 4, 151–171. 10.1146/annurev-linguistics-030514-125254
Lenci, A., Sahlgren, M., Jeuniaux, P., Cuba Gyllensten, A., & Miliani, M. (2022). A comparative evaluation and analysis of three generations of Distributional Semantic Models. Language Resources and Evaluation, 56(4), 1269–1313. 10.1007/s10579-021-09575-z
Longree, D., Fantoli, M., & LASLA (ULiège). (2023). LASLAfiles_Latin_DATformat [Dataset]. ULiège Open Data Repository. 10.58119/ULG/27VZID
McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction (Version 3). arXiv. 10.48550/ARXIV.1802.03426
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 26 (NIPS 2013) (Vol. 26, pp. 3111–3119). New York City: Curran Associates, Inc. 10.48550/arXiv.1310.4546
Montani, I., Honnibal, M., Van Landeghem, S., Boyd, A., Peters, H., McCann, P. O., Geovedi, J., O’Regan, J., Samsonov, M., Orosz, G., De Kok, D., Blättermann, M., Altinok, D., Kristiansen, S. L., Madeesh Kannan, Mitsch, R., Bournhonesque, R., Edward, … Tamura, Y. (2023). spaCy: Industrial-strength Natural Language Processing in Python (Version v3.5.1) [Computer software]. Zenodo. 10.5281/zenodo.10009823
Passarotti, M. (2010). Leaving behind the less-resourced status. The case of Latin through the experience of the Index Thomisticus Treebank. 7th SaLTMiL Workshop on Creation and Use of Basic Lexical Resources for Less-Resourced Languages, LREC 2010, Valletta, Malta, 23 May 2010, Workshop Programme, 27.
Passarotti, M. (2015). What you can do with linguistically annotated data. From the Index Thomisticus to the Index Thomisticus Treebank. In P. Roszak & J. Vijgen (Eds.), Reading Sacred Scripture with Thomas Aquinas. Hermeneutical Tools. Theological Questions and New Perspectives (pp. 3–44). Turnhout: Brepols. 10.1484/M.TEMA-EB.4.000129
Pražák, O., Přibáň, P., Taylor, S., & Sido, J. (2020). UWB at SemEval-2020 Task 1: Lexical Semantic Change Detection. In A. Herbelot, X. Zhu, A. Palmer, N. Schneider, J. May, & E. Shutova (Eds.), Proceedings of the Fourteenth Workshop on Semantic Evaluation (pp. 246–254). International Committee for Computational Linguistics. 10.18653/v1/2020.semeval-1.30
Řehůřek, R., & Sojka, P. (2010). Software Framework for Topic Modelling with Large Corpora. Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, 45–50. http://is.muni.cz/publication/884893/en (last accessed 10 October 2025).
Sahlgren, M. (2008). The distributional hypothesis. Rivista Di Linguistica, 20(1), 33–53.
Schlechtweg, D., McGillivray, B., Hengchen, S., Dubossarsky, H., & Tahmasebi, N. (2020). SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection. In A. Herbelot, X. Zhu, A. Palmer, N. Schneider, J. May, & E. Shutova (Eds.), Proceedings of the Fourteenth Workshop on Semantic Evaluation (pp. 1–23). International Committee for Computational Linguistics. 10.18653/v1/2020.semeval-1.1
Sprugnoli, R., Moretti, G., & Passarotti, M. (2020). Building and Comparing Lemma Embeddings for Latin. Classical Latin versus Thomas Aquinas. Italian Journal of Computational Linguistics, 6(1). 10.4000/ijcol.624
Sprugnoli, R., Passarotti, M., & Moretti, G. (2019). Vir is to Moderatus as Mulier is to Intemperans—Lemma Embeddings for Latin. Proceedings of the Sixth Italian Conference on Computational Linguistics Bari, Italy, 13–15 November 2019. 10.5281/zenodo.3565572
van der Maaten, L., & Hinton, G. (2008). Visualizing Data using t-SNE. Journal of Machine Learning Research, 9(86), 2579–2605.
Venna, J., & Kaski, S. (2001). Neighborhood Preservation in Nonlinear Projection Methods: An Experimental Study. In G. Dorffner, H. Bischof, & K. Hornik (Eds.), Artificial Neural Networks — ICANN 2001. Lecture Notes in Computer Science, vol. 2130 (pp. 485–491). Berlin, Heidelberg: Springer. 10.1007/3-540-44668-0_68
Zathammer, S. (2025). Noscemus Digital Sourcebook [Dataset]. Zenodo. 10.5281/ZENODO.15040256
