The Effect of (Historical) Language Variation on the East Slavic Lects Lemmatisers Performance

Open Access | Dec 2023

References

  1. Anastasyev, D. (2020). Exploring pretrained models for joint morphosyntactic parsing of Russian. In Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue”, 19, pages 1–12, Moscow, Russia.
  2. Ankhimiuk, U. V. (2000). Soligalicheskije akty iz “Arkhiva Volynskikh”. In A. V. Antonov (ed.): Russian Diplomatary. Moscow: Archeographical center, pages 25–42.
  3. Berdičevskis, A., Eckhoff, H., and Gavrilova, T. (2016). The beginning of a beautiful friendship: rule-based and statistical analysis of Middle Russian. In Komp’yuternaya lingvistika i intellektual’nye tekhnologii. Trudy mezhdunarodnoj konferencii «Dialog», pages 99–111, Moscow, Russia. RSSU.
  4. Bergmanis, T., and Goldwater, S. (2018). Context sensitive neural lemmatization with Lematus. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1391–1400, New Orleans, Louisiana. Association for Computational Linguistics.
  5. Cherepnin, L. V. (1961). Akty feodal’nogo zemlievladenija i khozyajstwa XIV – XVI vekov (in 3 volumes). Moscow: USSR Academy of Sciences.
  6. Cho, K., Merriënboer van, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014). Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1724–1734, Doha, Qatar. Association for Computational Linguistics.
  7. Damerau, F. J. (1964). A technique for computer detection and correction of spelling errors. Communications of the ACM, 7(3), pages 171–176.
  8. Fernández, L. G. (2020). A contribution to Old English lexicography. NOWELE / North-Western European Language Evolution, 73(2), pages 236–251.
  9. Graaf de, E., Stopponi, S., Bos, J. K., Peels-Matthey, S., and Nissim, M. (2022). AGILe: The first lemmatizer for Ancient Greek inscriptions. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 5334–5344, Marseille, France. European Language Resources Association.
  10. Jaro, M. A. (1989). Advances in record linkage methodology as applied to the 1985 census of Tampa Florida. Journal of the American Statistical Association, 84, pages 414–420.
  11. Kanerva, J., Ginter, F., and Salakoski, T. (2021). Universal lemmatizer: A sequence-to-sequence model for lemmatizing universal dependencies treebanks. Natural Language Engineering, 27(5), pages 545–574.
  12. Kruchkova, O., and Goldin, V. (2011). Corpus of Russian dialect speech: concept and parameters of evaluation. In Computational Linguistics and Intellectual Technologies. Proceedings of International Conference “Dialog–2011”, pages 359–367, Moscow, Russia.
  13. Kruchkova, O., and Goldin, V. (2015). The parameters of text processing for the Russian dialect corpus. In Proceedings of the international conference “Corpus linguistics — 2015”, pages 307–314, Saint Petersburg, Russia.
  14. Kuzmina, O. V., and Filippova, I. S. (2012). Arkhiv stol’nika Andreja Il’jicha Besobrasowa, vol. II. Moscow: Russian History Institute of Russian Academy of Sciences, 877 p.
  15. Levenshtein, V. I. (1966). Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10(8), pages 707–710.
  16. Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., Stoyanov, V., and Zettlemoyer, L. (2020). BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7871–7880, Online. Association for Computational Linguistics.
  17. Likhachov, D. S. (1954). Puteshestvija russkich poslov XVI – XVII vv. Statejnyje spiski. Moscow, Leningrad: USSR Academy of Sciences, 490 p.
  18. Lyashevskaya, O., Afanasev, I., Rebrikov, S., Shishkina, Y., Suleymanova, E., Trofimov, I., and Vlasova, N. (2023). Disambiguation in context in the Russian National Corpus: 20 years later. In Proceedings of International Conference “Dialogue 2023”, pages 1–12, Online.
  19. Lyashevskaya, O., and Afanasev, I. (in print). String similarity measures for evaluating the lemmatisation in Old Church Slavonic. In Proceedings of International Conference on Historical Lexicography and Lexicology. La Rioja, Spain. Universidad de La Rioja.
  20. Lyashevskaya, O., and Penkova, Y. (2021). Revised entries in the multi-volume edition and TEI encoding: a case of the historical dictionary of Russian. In Proceedings of XIX EURALEX Congress: Lexicography for Inclusion, vol. II, pages 655–662. Komotini, Greece. Democritus University of Thrace.
  21. Milintsevich, K., and Sirts, K. (2021). Enhancing sequence-to-sequence neural lemmatization with external resources. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 3112–3122, Online. Association for Computational Linguistics.
  22. Novokhatko, O. V. (2012). Arkhiv stol’nika Andreja Il’jicha Besobrasowa, vol. I. Moscow: Russian History Institute of Russian Academy of Sciences, 903 p.
  23. Omelianchuk, K., Atrasevych, V., Chernodub, A., and Skurzhanskyi, O. (2020). GECToR – Grammatical Error Correction: Tag, Not Rewrite. In Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 163–170, Seattle, WA, USA. Online. Association for Computational Linguistics.
  24. Omelianchuk, K., Raheja, V., and Skurzhanskyi, O. (2021). Text Simplification by Tagging. In Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications, pages 11–25, Online. Association for Computational Linguistics.
  25. Pedrazzini, N., and Eckhoff, H. M. (2021). OldSlavNet: A scalable Early Slavic dependency parser trained on modern language data. Software Impacts, 8, pages 1–4.
  26. Rozysknyje dela o Fedore Shaklovitom i jego soobshchnikakh, in 4 volumes (1893). Saint Petersburg: Arkheological Commission.
  27. Russian History Library, volume II (1875). Saint Petersburg: Arkheological Commission, 351 p.
  28. Scherrer, Y. (2021). Adaptation of morphosyntactic taggers: Cross-lectal and multilectal approaches. In M. Zampieri – P. Nakov (eds.): Similar languages, varieties, and dialects: A computational perspective. Studies in Natural Language Processing, Cambridge University Press, pages 138–166.
  29. Shavrina, T., and Shapovalova, O. (2017). To the methodology of corpus construction for machine learning: «Taiga» syntax tree corpus and parser. In Proceedings of the International Conference “CORPORA 2017”, Saint Petersburg, Russia.
  30. Shishkina, Y., and Lyashevskaya, O. (2021). Sculpting enhanced dependencies for Belarusian. In Revised Selected Papers of Analysis of Images, Social Networks and Texts: 10th International Conference (AIST 2021), pages 137–147. Tbilisi, Georgia.
  31. Straka, M., and Straková, J. (2017). Tokenizing, POS tagging, lemmatizing and parsing UD 2.0 with UDPipe. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 88–99, Vancouver, Canada. Association for Computational Linguistics.
  32. Sutskever, I., Vinyals, O., and Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Z. Ghahramani – M. Welling – C. Cortes – N. Lawrence – K. Q. Weinberger (eds.): Advances in neural information processing systems, vol. 27. Proceedings of NIPS 2014, pages 3104–3112, Montreal, Canada. Curran Associates.
  33. Winkler, W. E. (1990). String Comparator Metrics and Enhanced Decision Rules in the Fellegi-Sunter Model of Record Linkage. In Proceedings of the Section on Survey Research Methods, pages 354–359, Alexandria, VA. American Statistical Association.
  34. Zaharova, K. F., and Orlova, V. G. (2004). Dialektnoe chlenenie russkogo yazyka. Moscow: URSS, 176 p.
DOI: https://doi.org/10.2478/jazcas-2023-0040 | Journal eISSN: 1338-4287 | Journal ISSN: 0021-5597
Language: English
Page range: 225–233
Published on: Dec 25, 2023
Published by: Slovak Academy of Sciences, Ľudovít Štúr Institute of Linguistics
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2023 Ilia Afanasev, Olga Lyashevskaya, Stefan Rebrikov, Yana Shishkina, Igor Trofimov, Natalia Vlasova, published by Slovak Academy of Sciences, Ľudovít Štúr Institute of Linguistics
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.