Domain Sensitivity in Arabic Morphological Analysis: A Multi-Corpus Evaluation of Farasa, CAMeL, and ALP Across Modern, Classical Religious, and Classical Jurisprudential Domains

Behrouz Minaei-Bidgoli; Huda AlShuhayeb; Sayyed-Ali Hossayni

doi:10.5334/johd.418

References

Abdelali, A., Darwish, K., Durrani, N., & Mubarak, H. (2016). Farasa: A fast and furious segmenter for Arabic. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) (pp. 11–16). doi:10.18653/v1/N16-3003
Alharbi, R., & Lee, M. (2022). A cross-domain evaluation of Arabic NLP tools. Journal of King Saud University – Computer and Information Sciences. (In press).
Aljumaily, A. (2022). Evaluation of Classical Arabic NLP tools. Journal of Arabic Linguistics.
Blitzer, J., McDonald, R., & Pereira, F. (2006). Domain adaptation with structural correspondence learning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 120–128). Association for Computational Linguistics. doi:10.3115/1610075.1610094
Darwish, K. (2014). Arabic NLP for social media. In Proceedings of the ACL Workshop on Social NLP (pp. 1–6). Association for Computational Linguistics.
Daumé, H. III. (2007). Frustratingly easy domain adaptation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL) (pp. 256–263). Association for Computational Linguistics.
Dukes, K., & Habash, N. (2010a). Morphological annotation of Quranic Arabic. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010) (pp. 2530–2536). Retrieved from http://www.lrec-conf.org/proceedings/lrec2010/pdf/276_Paper.pdf
Dukes, K., & Habash, N. (2010b). Morphological annotation of Quranic Arabic. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC) (pp. 2530–2536).
Freihat, A. A., Bella, G., Mubarak, H., & Giunchiglia, F. (2018). A single-model approach for Arabic segmentation, POS tagging, and named entity recognition. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC) (pp. 1756–1763). doi:10.1109/ICNLSP.2018.8374393
Gururangan, S., Marasović, A., Swayamdipta, S., Lo, K., Beltagy, I., Downey, D., & Smith, N. A. (2020). Don’t stop pretraining: Adapt language models to domains and tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL) (pp. 8342–8360). Association for Computational Linguistics. doi:10.18653/v1/2020.acl-main.740
Maamouri, M., Bies, A., Buckwalter, T., & Mekki, W. (2004). Penn Arabic Treebank: Part 1 v 2.0 (Tech. Rep. No. LDC2004T11). Philadelphia: Linguistic Data Consortium. Retrieved 2025-01-03, from https://catalog.ldc.upenn.edu/LDC2004T11
Maamouri, M., Graff, D., Bouziri, B., Krouna, S., Kulick, S., & Buckwalter, T. (2010). Standard Arabic Morphological Analyzer (SAMA) version 3.1 (Tech. Rep.). doi:10.35111/wgjk-zy44 (LDC Catalog No. LDC2010L01)
Namly, D., Tajmout, R., Bouzoubaa, K., & Abouenour, L. (2016). Nafis: A gold standard corpus for Arabic stemmers evaluation. (Dataset).
Obeid, O., Zalmout, N., Khalifa, S., Taji, D., Obeid, M., Alhafni, B., … Habash, N. (2020). CAMeL Tools: An open source Python toolkit for Arabic natural language processing. In Proceedings of the 12th Language Resources and Evaluation Conference (LREC) (pp. 7022–7032).
Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1345–1359. doi:10.1109/TKDE.2009.191
Ramponi, A., & Plank, B. (2020). Neural unsupervised domain adaptation in NLP: A survey. Computational Linguistics, 46(2), 1–42. doi:10.48550/arXiv.2006.00632
Zaidan, O., & Callison-Burch, C. (2012). Arabic dialect identification. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL) (pp. 49–54). Association for Computational Linguistics.
