Advancements in Offensive Language Detection: A Comprehensive Review and Experimental Analysis

Open Access | Feb 2025

Abstract

The proliferation of offensive language in digital communication has become a significant challenge in the internet era, creating an urgent need for advanced Natural Language Processing (NLP) techniques to identify and mitigate it. This study presents a thorough review of the shifting landscape of offensive language identification from 2020 through 2023, with a particular focus on NLP techniques, machine learning, deep learning, and transformer models. We examine the datasets used in prior research, particularly those for Dravidian languages such as Tamil and Malayalam. The preprocessing techniques surveyed range from data cleansing to feature representation and word embedding methods, including TF-IDF and Word2Vec, which are used to train and optimize models. We review past work to compare standard supervised learning models such as Support Vector Machine and Naive Bayes with emerging transformer models such as BERT, with the aim of identifying the approach that most improves a model's accuracy and effectiveness.
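As a rough illustration of the classical pipeline the abstract refers to, the sketch below computes TF-IDF weights and trains a multinomial Naive Bayes classifier using only the Python standard library. It is not the authors' implementation: the four training sentences, the label names, and the helper names (`tokenize`, `tfidf`, `NaiveBayes`) are invented for this example, the TF-IDF formula is the plain tf × log(N/df) variant (libraries often use smoothed versions), and the classifier works on raw word counts with add-one smoothing rather than on the TF-IDF weights.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Whitespace tokenization on lowercased text (an assumption; real
    # pipelines for Tamil or Malayalam need language-aware preprocessing).
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {t: (c / len(toks)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(doc):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data, invented for this sketch only.
train_docs = [
    "you are a stupid idiot",
    "what a lovely day",
    "shut up you idiot",
    "thank you for the lovely gift",
]
train_labels = ["offensive", "not_offensive", "offensive", "not_offensive"]

weights = tfidf(train_docs)
model = NaiveBayes().fit(train_docs, train_labels)
```

In this toy corpus, "stupid" occurs in one document while "you" occurs in three, so TF-IDF gives "stupid" the larger weight in the first sentence. Calling `model.predict("you stupid idiot")` returns `"offensive"`, and `model.predict("lovely gift")` returns `"not_offensive"`.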

DOI: https://doi.org/10.2478/ias-2024-0012 | Journal eISSN: 1554-1029 | Journal ISSN: 1554-1010
Language: English
Page range: 162 - 179
Published on: Feb 20, 2025
Published by: Cerebration Science Publishing Co., Limited
In partnership with: Paradigm Publishing Services
Publication frequency: 6 issues per year

© 2025 C. Nalini, R. Shanthakumari, Y. Agashia Maria, T. Janarthanan, M. Manibharathi, published by Cerebration Science Publishing Co., Limited
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License.