Slovak Question Answering Dataset Based on the Machine Translation of the SQuAD v2.0

Open Access | Dec 2023

References

  1. Abadani, N., Mozafari, J., Fatemi, A., Nematbakhsh, M., and Kazemi, A. (2021). ParSQuAD: Persian question answering dataset based on machine translation of SQuAD 2.0. International Journal of Web Research, 4(1), pages 34–46.
  2. Bojanowski, P., Grave, E., Joulin, A., and Mikolov, T. (2017). Enriching word vectors with subword information. Trans. of the ACL, Vol. 5, Cambridge, MA, pages 135–146. Accessible at: https://aclanthology.org/Q17-1010.pdf.
  3. Carrino, C. P., Costa-Jussa, M. R., and Fonollosa, J. A. R. (2020). Automatic Spanish translation of the SQuAD dataset for multilingual question answering. In Proc. of LREC, Marseille, France, pages 5515–5523. Accessible at: https://arxiv.org/abs/1912.05200.
  4. Cattan, O., Servan, C., and Rosset, S. (2021). On the usability of transformers-based models for a French question-answering task. In Proc. of RANLP, Varna, Bulgaria, pages 244–255. Accessible at: https://hal.archives-ouvertes.fr/hal-03336060/.
  5. Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzmán, F., Grave, E., Ott, M., Zettlemoyer, L., and Stoyanov, V. (2020). Unsupervised cross-lingual representation learning at scale. In Proc. of ACL, Online, pages 8440–8451. Accessible at: https://aclanthology.org/2020.acl-main.747.pdf.
  6. Croce, D., Zelenanska, A., and Basili, R. (2018). Neural learning for question answering in Italian. In C. Ghidini, B. Magnini, A. Passerini, and P. Traverso (eds): Advances in Artificial Intelligence, LNAI vol. 11298, Springer, Cham, pages 389–402. Accessible at: https://doi.org/10.1007/978-3-030-03840-3_29.
  7. Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proc. of NAACL, Minneapolis, Minnesota, pages 4171–4186. Accessible at: https://aclanthology.org/N19-1423/.
  8. Gemirter, C. B., and Goularas, D. (2021). A Turkish question answering system based on deep learning neural networks. Journal of Intelligent Systems: Theory and Applications, 4(2), pages 65–75. Accessible at: https://dergipark.org.tr/tr/download/article-file/1361881.
  9. Gupta, D., Ekbal, A., and Bhattacharyya, P. (2019). A deep neural network framework for English Hindi question answering. ACM TALLIP, 19(2), Article No. 25, pages 1–22.
  10. Hládek, D., Staš, J., Juhár, J., and Koctúr, T. (2023). Slovak dataset for multilingual question answering. IEEE Access, Vol. 11, pages 32869–32881. Accessible at: https://ieeexplore.ieee.org/document/10082887.
  11. Honnibal, M., Montani, I., Van Landeghem, S., and Boyd, A. (2020). spaCy: Industrial-strength natural language processing in Python. Accessible at: https://doi.org/10.5281/zenodo.1212303.
  12. Junczys-Dowmunt, M., Grundkiewicz, R., Dwojak, T., Hoang, H., Heafield, K., Neckermann, T., Seide, F., Germann, U., Aji, A. F., Bogoychev, N., Martins, A. F. T., and Birch, A. (2018). Marian: Fast neural machine translation in C++. In Proc. of ACL, Melbourne, Australia, pages 116–121. Accessible at: https://aclanthology.org/P18-4020.pdf.
  13. Lee, K., Yoon, K., Park, S., and Hwang, S. W. (2018). Semi-supervised training data generation for multilingual question answering. In Proc. of LREC, Miyazaki, Japan, pages 2758–2762. Accessible at: https://aclanthology.org/L18-1437.
  14. Macková, K., and Straka, M. (2020). Reading comprehension in Czech via machine translation and cross-lingual transfer. In Proc. of TSD, Brno, Czech Republic, pages 171–179. Accessible at: https://arxiv.org/abs/2007.01667.
  15. Mayeesha, T. T., Sarwar, A. Md., and Rahman, R. M. (2021). Deep learning based question answering in Bengali. Journal of Information and Telecommunication, 5(2), pages 145–178. Accessible at: https://doi.org/10.1080/24751839.2020.1833136.
  16. Mozannar, H., El Hajal, K., Maamary, E., and Hajj, H. M. (2019). Neural Arabic question answering. In Proc. of WANLP, Florence, Italy, pages 108–118. Accessible at: https://arxiv.org/abs/1906.05394v1.
  17. Pikuliak, M., Grivalský, Š., Konôpka, M., Blšták, M., Tamajka, M., Bachratý, V., Šimko, M., Balážik, P., Trnka, M., and Uhlárik, F. (2022). SlovakBERT: Slovak masked language model. In Proc. of EMNLP, Abu Dhabi, United Arab Emirates, pages 7156–7168. Accessible at: https://aclanthology.org/2022.findings-emnlp.530.pdf.
  18. Rajpurkar, P., Zhang, J., Lopyrev, K., and Liang, P. (2016). SQuAD: 100,000+ questions for machine comprehension of text. In Proc. of EMNLP, Austin, Texas, pages 2383–2392. Accessible at: https://aclanthology.org/D16-1264.pdf.
  19. Rajpurkar, P., Jia, R., and Liang, P. (2018). Know what you don’t know: Unanswerable questions for SQuAD. In Proc. of ACL, Melbourne, Australia, pages 784–789. Accessible at: https://aclanthology.org/P18-2124.pdf.
  20. Tiedemann, J., and Thottingal, S. (2020). OPUS-MT – Building open translation services for the World. In Proc. of EAMT, Lisboa, Portugal, pages 479–480. Accessible at: https://aclanthology.org/2020.eamt-1.61.pdf.
  21. Tiutiunnyk, S., and Dyomkin, V. (2019). Context-based question-answering system for the Ukrainian language. In Proc. of MS-AMLV, Lviv, Ukraine, pages 81–88. Accessible at: https://ceur-ws.org/Vol-2566/MS-AMLV-2019-paper17-p081.pdf.
DOI: https://doi.org/10.2478/jazcas-2023-0054 | Journal eISSN: 1338-4287 | Journal ISSN: 0021-5597
Language: English
Page range: 381–390
Published on: Dec 25, 2023
Published by: Slovak Academy of Sciences, Mathematical Institute
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2023 Ján Staš, Daniel Hládek, Tomáš Koctúr, published by Slovak Academy of Sciences, Mathematical Institute
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.