Evaluation of Quality of Slovak Language Use in LLMs

By: Marek Dobeš  
Open Access | Feb 2025

DOI: https://doi.org/10.2478/aei-2025-0004 | Journal eISSN: 1338-3957 | Journal ISSN: 1335-8243
Language: English
Page range: 28 - 33
Submitted on: Sep 10, 2024 | Accepted on: Nov 13, 2024 | Published on: Feb 24, 2025
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2025 Marek Dobeš, published by Technical University of Košice
This work is licensed under the Creative Commons Attribution 4.0 License.