Mapping Multiclass-Targeted Hate Speech in Online Discourse: An Open Dataset

Open Access | Apr 2026

Abstract

Online social networks have become central spaces for public discourse, where hostile and discriminatory language toward social groups can cause psychological and social consequences for marginalized communities. Although multiple public hate speech datasets are available, many rely on binary categorization practices that obscure linguistic, cultural, and contextual variation across targeted groups. As a result, minority and less visible forms of hate speech remain insufficiently documented and analyzed. This discussion paper examines methodological limitations in existing hate speech annotation schemes and presents a re-annotation framework applied to the HatEval2019 dataset. The proposed framework introduces target-specific multiclass labels that distinguish subcategories of gender-based, racial, ethnic, religious, and xenophobic hate speech, enabling more fine-grained analysis of online discourse. The annotation process involved multiple independent annotators, systematic reliability assessment, and iterative guideline refinement. The resulting dataset comprises 5,455 annotated texts that differentiate between targeted hate speech, direct insults, and specific target subcategories. Detailed annotation guidelines and documentation are provided, and the dataset is openly available in tabular format. This paper documents interpretive decisions, ethical considerations, and data practices, enabling reuse of the dataset across digital humanities, discourse analysis, media studies, and social justice research. The dataset allows researchers to examine how hate speech, identity, and power relations are constructed in online communication and contributes to more transparent and responsible humanities data infrastructures.

DOI: https://doi.org/10.5334/johd.521 | Journal eISSN: 2059-481X
Language: English
Submitted on: Feb 11, 2026
Accepted on: Mar 30, 2026
Published on: Apr 27, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Sanaa Kaddoura, Sumaia Al-Kohlani, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.