Integrating Handcrafted Features with Machine Learning for Hate Speech Detection in Albanian Social Media

Endrit Fetahi; Mentor Hamiti; Arsim Susuri; Xhemal Zenuni; Jaumin Ajdari

doi:10.2478/seeur-2024-0025

SEEU Review

Volume 19 (2024): Issue 2 (December 2024)

Open Access

DOI: https://doi.org/10.2478/seeur-2024-0025 | Journal eISSN: 1857-8462

Language: English

Page range: 80 - 92

Published on: Dec 24, 2024

Published by: South East European University

In partnership with: Paradigm Publishing Services

Publication frequency: 2 issues per year

Keywords: Hate speech detection, Machine learning, handcrafted features, Albanian, social media

Related subjects: General interest

© 2024 Endrit Fetahi, Mentor Hamiti, Arsim Susuri, Xhemal Zenuni, Jaumin Ajdari, published by South East European University
This work is licensed under the Creative Commons Attribution 4.0 License.
