
Mapping Multiclass-Targeted Hate Speech in Online Discourse: An Open Dataset
By: Sanaa Kaddoura and Sumaia Al-Kohlani
References
- Alkomah, F., & Ma, X. (2022). A Literature Review of Textual Hate Speech Detection Methods and Datasets. Information, 13(6), 273. 10.3390/info13060273
- Bajt, V. (2025). The Sociology of Hate Speech. Annales, Series Historia et Sociologia, 35(4), 397–410. 10.19233/ASHS.2025.26
- Basile, V., Bosco, C., Fersini, E., Nozza, D., Patti, V., Rangel Pardo, F. M., Rosso, P., & Sanguinetti, M. (2019). Semeval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter. In Proceedings of the 13th international workshop on semantic evaluation (pp. 54–63). 10.18653/v1/S19-2007
- Bäumler, J., Blöcher, L., Frey, L. J., Chen, X., Bayer, M., & Reuter, C. (2025). A Survey of Machine Learning Models and Datasets for the Multi-label Classification of Textual Hate Speech in English. arXiv preprint arXiv:2504.08609.
- Davidson, T., Warmsley, D., Macy, M., & Weber, I. (2017, May). Automated Hate Speech Detection and the Problem of Offensive Language. In Proceedings of the international AAAI conference on web and social media (Vol. 11, No. 1, pp. 512–515). 10.1609/icwsm.v11i1.14955
- Kaddoura, S., & Nassar, R. (2025). Language model-based approach for multiclass cyberbullying detection. In M. Barhamgi, H. Wang, & X. Wang (Eds.), Web Information Systems Engineering – WISE 2024. WISE 2024. Lecture Notes in Computer Science (Vol 1543, pp. 78–89). Springer. 10.1007/978-981-96-0567-5_7
- Krippendorff, K. (2022). The Reliability of Generating Data. Chapman and Hall/CRC. 10.1201/9781003112020
- Landis, J. R., & Koch, G. G. (1977). The Measurement of Observer Agreement for Categorical Data. Biometrics, 33(1), 159–174. 10.2307/2529310
- Lantz, B., & Faulkner, L. (2025). Female Hate Crime Offenders: The Theoretical and Policy Implications of an Under-Researched Phenomenon. In Hate Crime Perpetrators: New Perspectives from Theory, Research and Practice (Vol. 1, pp. 189–209). Springer Nature Switzerland. 10.1007/978-3-031-92666-2_9
- Lee, J., Lim, T., Lee, H., Jo, B., Kim, Y., Yoon, H., & Han, S. C. (2022, October). K-MHaS: A Multi-label Hate Speech Detection Dataset in Korean Online News Comment. In Proceedings of the 29th International Conference on Computational Linguistics (pp. 3530–3538). International Committee on Computational Linguistics.
- Madriaza, P., Hassan, G., Brouillette-Alarie, S., Mounchingam, A. N., Durocher-Corfa, L., Borokhovski, E., Pickup, D., & Paillé, S. (2025). Exposure to hate in online and traditional media: A systematic review and meta-analysis of the impact of this exposure on individuals and communities. Campbell Systematic Reviews, 21(1), e70018. 10.1002/cl2.70018
- Mathew, B., Saha, P., Yimam, S. M., Biemann, C., Goyal, P., & Mukherjee, A. (2021, May). HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection. In Proceedings of the AAAI conference on artificial intelligence (Vol. 35, No. 17, pp. 14867–14875). 10.1609/aaai.v35i17.17745
- Mody, D., Huang, Y., & De Oliveira, T. E. A. (2023). A curated dataset for hate speech detection on social media text. Data in Brief, 46, 108832. 10.1016/j.dib.2022.108832
- Mollas, I., Chrysopoulou, Z., Karlos, S., & Tsoumakas, G. (2022). ETHOS: a multi-label hate speech detection dataset. Complex & Intelligent Systems, 8, 4663–4678. 10.1007/s40747-021-00608-2
- Mubeen, M., Muskan, A., Akram, A., Rashid, J., Alshalali, T. A. N., & Sarwar, N. (2025). Cyberbullying-Related Automated Hate Speech Detection on Social Media Platforms Using Stack Ensemble Classification Method. International Journal of Computational Intelligence Systems, 18, 174. 10.1007/s44196-025-00919-z
- Papcunová, J., Martončik, M., Fedáková, D., Kentoš, M., Bozogáňová, M., Srba, I., et al. (2023). Hate speech operationalization: a preliminary examination of hate speech indicators and their structure. Complex & Intelligent Systems, 9, 2827–2842. 10.1007/s40747-021-00561-0
- Scheffler, T., Solopova, V., & Popa-Wyatt, M. (2021). The Telegram Chronicles of Online Harm. Journal of Open Humanities Data, 7. 10.5334/johd.31
- Walsh, S., & Greaney, P. (2025). Multiclass hate speech detection with an aggregated dataset. Natural Language Processing, 31(6), 1350–1366. 10.1017/nlp.2024.62
- Warner, W., & Hirschberg, J. (2012, June). Detecting Hate Speech on the World Wide Web. In S. O. Sood, M. Nagarajan, & M. Gamon (Eds.), Proceedings of the Second Workshop on Language in Social Media (pp. 19–26). Association for Computational Linguistics.
- Waseem, Z., & Hovy, D. (2016, June). Hateful symbols or hateful people? predictive features for hate speech detection on twitter. In J. Andreas, E. Choi, & A. Lazaridou (Eds.), Proceedings of the NAACL student research workshop (pp. 88–93). Association for Computational Linguistics. 10.18653/v1/N16-2013
- Yu, Z., Sen, I., Assenmacher, D., Samory, M., Fröhling, L., Dahn, C., Nozza, D., & Wagner, C. (2025). The Unseen Targets of Hate: A Systematic Review of Hateful Communication Datasets. Social Science Computer Review, 43(5), 1114–1144. 10.1177/08944393241258771
DOI: https://doi.org/10.5334/johd.521 | Journal eISSN: 2059-481X
Language: English
Submitted on: Feb 11, 2026
Accepted on: Mar 30, 2026
Published on: Apr 27, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year
© 2026 Sanaa Kaddoura, Sumaia Al-Kohlani, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.