Table 1
Comparison of the proposed dataset with existing English-language hate speech datasets, by objective and classification schema.
| DATASET | OBJECTIVE | CLASSIFICATION SCHEMA | MAIN CATEGORY CLASSIFICATION | SUBCATEGORY CLASSIFICATION |
|---|---|---|---|---|
| Mody et al. (2023) | Hate speech detection | 2-class (hate, non-hate) | No | No |
| Davidson et al. (2017) | Detect hate speech from offensive and normal language | 3-class (hate, offensive, normal) | No | No |
| Mollas et al. (2022) | Multi-label hate speech detection | 8-class (violence, directed vs general, gender, race, national origin, disability, religion, sexual orientation) | Yes | No |
| Waseem and Hovy (2016) | Identify racism and sexism on Twitter | 3-class (racism, sexism, neither) | Yes | No |
| Mathew et al. (2021) | Explainable hate speech classification with target identification | 3-class (hate, offensive, normal) | No | No |
| Walsh and Greaney (2025) | Categorize hate speech across multiple groups | 5-class (ethnicity, gender, sexuality, religion, non-hate) | Yes | No |
| Proposed dataset | Hate speech categorization | 14-class | Yes | Yes |
Table 2
Label distribution in the dataset.
| MAIN CATEGORY | SUBCATEGORY | COUNT |
|---|---|---|
| Gender-Based Hate Speech | Misogyny | 2019 |
| Gender-Based Hate Speech | Misandry | 29 |
| Immigration and Xenophobic Hate Speech | Anti-Immigrant Hate Speech | 1882 |
| Immigration and Xenophobic Hate Speech | Anti-Refugee Hate Speech | 379 |
| Immigration and Xenophobic Hate Speech | Xenophobia | 51 |
| Religious Hate Speech | Islamophobia | 167 |
| Religious Hate Speech | Anti-Christian Hate Speech | 3 |
| Racial and Ethnic Hate Speech | Anti-Black Hate Speech | 105 |
| Racial and Ethnic Hate Speech | Anti-Hispanic Hate Speech | 16 |
| Racial and Ethnic Hate Speech | Anti-Semitic Hate Speech | 4 |
| Racial and Ethnic Hate Speech | Anti-Asian Hate Speech | 2 |
| Profanity and General Abuse | – | 674 |
| Threats and Violence | – | 106 |
| Hate Speech toward Countries | – | 18 |

Figure 1
Data curation and re-annotation workflow.
Table 3
Masked lexical indicators used for the dataset selection process.
| WORDS | HATE SPEECH CONTEXT TYPE |
|---|---|
| F*** (profanity), sh*** (profanity), d*** (insult), j*** (insult), b*** (insult), a*** (insult), idiot, moron, stupid, trash, garbage, scum, vermin, animal, savage | Profanities and general insults |
| B*** (gender-slur), h*** (gender-slur), s*** (gender-slur), w*** (gender-slur), p*** (gender-slur), gold-digger, feminazi | Gender-based slurs and sexist language |
| N*** (racial slur), n*** (racial slur), k*** (racial slur), Pakistani, black, white, Arab, Mexican, Asian, African, monkey | Race-related or ethnicity-targeted words |
| Muslim, jew, Islamic, Christian, hindu, jihad, jihadist, infidel, terrorist | Religion-related expressions |
| Immigrant, refugee, migrant, alien, asylum seeker, #sendthemback, invader, invasion, illegals, #buildthatwall, foreigner | Anti-immigrant and anti-refugee language |
| Chinese, China, Russian, Russia, middle eastern, Australia, American | Country or nationality targeting terms |
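
A lexical selection step of the kind Table 3 implies can be sketched as a case-insensitive keyword match. The snippet below is a minimal illustration, not the authors' pipeline: the `INDICATORS` subset, the function names, and the whole-term matching rule are assumptions, and only unmasked terms from Table 3 are used.

```python
import re

# Hypothetical subset of the unmasked indicators in Table 3, keyed by context type.
INDICATORS = {
    "Anti-immigrant and anti-refugee language": [
        "immigrant", "refugee", "asylum seeker", "#buildthatwall",
    ],
    "Religion-related expressions": ["muslim", "jihad", "infidel"],
}


def compile_patterns(indicators):
    """Build one case-insensitive pattern per context type."""
    patterns = {}
    for context, terms in indicators.items():
        alternation = "|".join(re.escape(term) for term in terms)
        # Lookarounds instead of \b so that hashtags (which start with a
        # non-word character) are still matched as whole terms.
        patterns[context] = re.compile(
            rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE
        )
    return patterns


PATTERNS = compile_patterns(INDICATORS)


def matched_contexts(text):
    """Return the context types whose indicators occur in the text."""
    return [context for context, pattern in PATTERNS.items() if pattern.search(text)]
```

Because this sketch matches whole terms only, "alienate" would not trigger an "alien" indicator, but plural forms such as "immigrants" are also missed unless they are listed explicitly or stemming is added.
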
Table 4
Examples of misclassified hate speech instances removed from the dataset.
| REMOVED HATE SPEECH SAMPLES |
|---|
| Well that’s just great! @user @user |
| @user Simply put |
| @user @user @user @user @user @user @user @user @user @user @user @user @user @user @user @user @user @user @user |
Table 5
Examples from the dataset.
| TEXT | LABEL |
|---|---|
| No matter what color a girl is she still a [slur: derogatory term for women] | Gender-Based Hate Speech (Misogyny) |
| @user @user thiers none because not all men rape | Gender-Based Hate Speech (Misandry) |
| Illegal Alien Hit Man, Dragged Him For Blocks In Deadly DUI Crash #BuildTheWall #NoAmnesty #NoDACA #DeportThemAll #KeepAmericaSafe | Immigration and Xenophobic Hate Speech (Anti-Immigrant Hate Speech) |
| @user @user real talk do you have eyes or were they gouged out by a [hate slur: derogatory anti-refugee term]? | Immigration and Xenophobic Hate Speech (Anti-Refugee Hate Speech) |
| to hell with the #Tories. #SendThemHome | Immigration and Xenophobic Hate Speech (Xenophobia) |
| Victory for Trump ! But only 6 islamic countries now are stopped for immigration into USA ! There are 34 islamic countries more. To beware the USA from terrorists these also has to get banned. | Religious Hate Speech (Islamophobia) |
| God is dead, Jesus is a [profanity] and the Virgin Mary was a [slur: derogatory term for women]. | Religious Hate Speech (Anti-Christian Hate Speech) |
| #sendthemback if your children want a peacefull life, without [sexual violence reference] on every corner.#forza. #Italia ##NoMoreAfricans !! | Racial and Ethnic Hate Speech (Anti-Black Hate Speech) |
| Mexico’s not sending their best. They’re dumping their killers aka garbage on us. #StopTheInvasion #DeportThemAll #NoAmnesty #BuildTheWall | Racial and Ethnic Hate Speech (Anti-Hispanic Hate Speech) |
| Let’s me honest, Jews offer things and the white women are just [slur: sexualized derogatory term]. Muslims [sexual violence reference] rape and beat white women. | Racial and Ethnic Hate Speech (Anti-Semitic Hate Speech) |
| @user <– another son of a dirty [profanity] Korean [ethnic slur] [slur] | Racial and Ethnic Hate Speech (Anti-Asian Hate Speech) |
| Your [slur: derogatory insult] ass disgusts me so much | Profanity and General Abuse |
| @user @user lol. you selfish terrorist [profanity] should be boiled alive. Maybe lynched by neonazis. Would serve you right. | Threats and Violence |
| @user Bloody Germany who needs Germany we dont want their Visa plans were are fed up of being over ran by migrants no Uk Jobs threaten | Hate Speech toward Countries |
Table 6
Lexical statistics per class.
| LABEL | TOTAL WORDS | MAXIMUM TEXT LENGTH | TOTAL UNIQUE WORDS |
|---|---|---|---|
| Gender-Based Hate Speech (Misogyny) | 37,184 | 57 | 5,964 |
| Gender-Based Hate Speech (Misandry) | 594 | 93 | 309 |
| Immigration and Xenophobic Hate Speech (Anti-Immigrant Hate Speech) | 57,357 | 88 | 8,905 |
| Immigration and Xenophobic Hate Speech (Anti-Refugee Hate Speech) | 10,362 | 59 | 2,906 |
| Immigration and Xenophobic Hate Speech (Xenophobia) | 1,350 | 54 | 634 |
| Religious Hate Speech (Islamophobia) | 4,898 | 69 | 1,678 |
| Religious Hate Speech (Anti-Christian Hate Speech) | 78 | 38 | 64 |
| Racial and Ethnic Hate Speech (Anti-Black Hate Speech) | 2,677 | 54 | 1,050 |
| Racial and Ethnic Hate Speech (Anti-Hispanic Hate Speech) | 441 | 50 | 260 |
| Racial and Ethnic Hate Speech (Anti-Semitic Hate Speech) | 88 | 34 | 73 |
| Racial and Ethnic Hate Speech (Anti-Asian Hate Speech) | 42 | 32 | 40 |
| Profanity and General Abuse | 12,767 | 70 | 3,181 |
| Threats and Violence | 2,419 | 64 | 995 |
| Hate Speech toward Countries | 664 | 62 | 396 |
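
Per-class statistics like those in Table 6 could be computed along the following lines. This is a sketch under stated assumptions: whitespace tokenization, lowercasing before counting unique words, and text length measured in tokens are choices made here for illustration, and the function name `lexical_stats` is hypothetical. The source does not state how the reported figures were derived.

```python
from collections import defaultdict


def lexical_stats(samples):
    """Compute per-label word statistics from (text, label) pairs.

    Assumes whitespace tokenization and case-insensitive vocabulary.
    """
    total_words = defaultdict(int)
    max_length = defaultdict(int)
    vocabulary = defaultdict(set)

    for text, label in samples:
        tokens = text.lower().split()
        total_words[label] += len(tokens)
        max_length[label] = max(max_length[label], len(tokens))
        vocabulary[label].update(tokens)

    return {
        label: {
            "total_words": total_words[label],
            "max_text_length": max_length[label],
            "unique_words": len(vocabulary[label]),
        }
        for label in total_words
    }
```

A different tokenizer (for example, one that splits hashtags or strips punctuation) would change all three columns, so counts from this sketch should not be expected to reproduce Table 6 exactly.
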
