Application of NLP Technologies to Low-Resource Croatian Dialects
By: Maja Polanec and Marina Bagić Babac
References
- Agić, Ž., & Ljubešić, N. (2015, September). Universal dependencies for Croatian (that work for Serbian, too). In Proceedings of the 5th Workshop on Balto-Slavic Natural Language Processing (pp. 1–8).
- Ali, A., Dehak, N., Cardinal, P., Khurana, S., Yella, S. H., Glass, J., Bell, P., & Renals, S. (2016). Automatic dialect detection in Arabic broadcast speech. In Proceedings of INTERSPEECH 2016 (pp. 2934–2938). San Francisco, CA, USA.
- Alshutayri, A., & Atwell, E. (2017). Exploring Twitter as a source of an Arabic dialect corpus. International Journal of Computational Linguistics (IJCL), 8(2).
- Bagić Babac, M. (2023). Emotion analysis of user reactions to online news. Information Discovery and Delivery, 51(2), 179–193. https://doi.org/10.1108/IDD-04-2022-0027
- Borotić, G., Granoša, L., Kovačević, J., & Bagić Babac, M. (2023). Effective spam detection with machine learning. Croatian Regional Development Journal, 3(2), 43–64. https://doi.org/10.2478/crdj-2023-0007
- Celinić, A. (2020). Kajkavian. Hrvatski dijalektološki zbornik, 24, 1–37.
- Farkaš, D., & Filko, M. (2022). Obilježavanje koordinacije u ovisnosnim bankama stabala. Jezikoslovlje, 23(2), 193–214.
- Joshi, A., Dabre, R., Kanojia, D., Li, Z., Zhan, H., Haffari, G., & Dippold, D. (2024). Natural language processing for dialects of a language: A survey. arXiv preprint arXiv:2401.05632. https://arxiv.org/abs/2401.05632
- Jørgensen, A., Hovy, D., & Søgaard, A. (2016). Learning a POS tagger for AAVE-like language. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 1115–1120). San Diego, CA, USA.
- Qi, P., Dozat, T., Zhang, Y., & Manning, C. D. (2019). Universal dependency parsing from scratch. arXiv preprint arXiv:1901.10457. https://arxiv.org/abs/1901.10457
- Scherrer, Y. (2014, August). Unsupervised adaptation of supervised part-of-speech taggers for closely related languages. In Proceedings of the First Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects (pp. 30–38).
- Scherrer, Y., & Rabus, A. (2017, April). Multi-source morphosyntactic tagging for spoken Rusyn. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial) (pp. 84–92).
- Scherrer, Y., & Rabus, A. (2019). Neural morphosyntactic tagging for Rusyn. Natural Language Engineering, 25(5), 633–650. https://doi.org/10.1017/S1351324919000202
- Scherrer, Y., Samardžić, T., & Glaser, E. (2019). Digitising Swiss German – How to process and study a polycentric spoken language. Language Resources and Evaluation, 53(4), 735–769. https://doi.org/10.1007/s10579-019-09459-5
- Šandor, D., & Bagić Babac, M. (2024). Sarcasm detection in online comments using machine learning. Information Discovery and Delivery, 52(2), 213–226. https://doi.org/10.1108/IDD-01-2023-0002
- Tadić, M. (2007). Building the Croatian dependency treebank: The initial stages. Suvremena lingvistika, 33(63), 85–92.
- Vania, C., Kementchedjhieva, Y., Søgaard, A., & Lopez, A. (2019). A systematic comparison of methods for low-resource dependency parsing on genuinely low-resource languages. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 1105–1116).
- Zampieri, M., Malmasi, S., Ljubešić, N., Nakov, P., Ali, A., Tiedemann, J., & Aepli, N. (2017, April). Findings of the VarDial evaluation campaign 2017. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial) (pp. 1–15).
DOI: https://doi.org/10.2478/crdj-2025-0008 | Journal eISSN: 2718-4978
Language: English
Page range: 13–23
Submitted on: Jun 14, 2024
Accepted on: Jan 4, 2025
Published on: Apr 26, 2026
Published by: Međimurje University of Applied Sciences in Čakovec
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year
© 2026 Maja Polanec, Marina Bagić Babac, published by Međimurje University of Applied Sciences in Čakovec
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.