Outlier Ensemble Based on Isolation Forest: The CBOEA Approach

Open Access | Mar 2025

Abstract

Outliers are instances that deviate from the norm. In certain fields, their detection is crucial, since they often indicate interesting events such as system faults or deliberate human actions. Anomaly detection is an essential data mining task employed in many real-life applications. The continuous development of anomaly detection algorithms is primarily motivated by the explosive growth in both the size and the number of attributes of data sets. Such growth requires algorithms that can handle large data sets effectively and efficiently. Isolation Forest (IF) was introduced with that idea in mind. IF uses an isolation mechanism to detect outliers without relying on any distance or density measures. This approach handles large data sets well, thanks to its low time complexity. However, IF struggles to detect local outliers. In this work, a new algorithm called Cluster-Based Outlier Ensemble Approach (CBOEA) is proposed. This approach combines the outputs of IF and Local Outlier Factor (LOF) through a clustering algorithm called OPTICS to identify the clustering structure. This clustering technique compensates for the weaknesses of IF while maintaining its strengths. The proposed algorithm is then compared to LOF and IF using two evaluation metrics. Its performance on benchmark data sets shows that the proposed method is competitive with its components.
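
The isolation mechanism mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation and does not reproduce CBOEA; it only shows how random axis-parallel splits isolate an outlier in fewer steps than an inlier, with no distance or density computation. The function names (`c_factor`, `build_tree`, `path_length`, `anomaly_scores`), the parameter values, and the toy data are all assumptions made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over
    n points; used to normalise depths (harmonic number approximated)."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(sample, rng, depth, max_depth):
    """Recursively split `sample` on a random attribute at a random value."""
    if len(sample) <= 1 or depth >= max_depth:
        return {"size": len(sample)}
    dim = rng.randrange(len(sample[0]))
    lo = min(row[dim] for row in sample)
    hi = max(row[dim] for row in sample)
    if lo == hi:
        # Simplification for this sketch: stop when the chosen attribute is constant.
        return {"size": len(sample)}
    split = rng.uniform(lo, hi)
    left = [row for row in sample if row[dim] < split]
    right = [row for row in sample if row[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, rng, depth + 1, max_depth),
        "right": build_tree(right, rng, depth + 1, max_depth),
    }


def path_length(point, node, depth=0):
    """Number of splits needed to reach the leaf containing `point`,
    plus an adjustment for points left unseparated in that leaf."""
    if "split" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)


def anomaly_scores(data, n_trees=100, sample_size=256, seed=0):
    """Score in (0, 1); values closer to 1 mean the point is isolated quickly."""
    if len(data) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), rng, 0, max_depth)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    scores = []
    for point in data:
        mean_depth = sum(path_length(point, tree) for tree in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


# Toy data (hypothetical): one Gaussian cluster plus a single distant point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(100)]
data.append((10.0, 10.0))

scores = anomaly_scores(data, n_trees=200, sample_size=256, seed=0)
top = max(range(len(data)), key=lambda i: scores[i])
print("highest-scoring point:", data[top])
```

In this toy setting the distant point is a global outlier, which is the case isolation handles well. A point that is unusual only relative to its own dense neighbourhood would not necessarily be isolated early, which is consistent with the abstract's remark that IF struggles with local outliers.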

DOI: https://doi.org/10.2478/fcds-2025-0002 | Journal eISSN: 2300-3405 | Journal ISSN: 0867-6356
Language: English
Page range: 27 - 55
Submitted on: Feb 18, 2023 | Accepted on: Sep 27, 2024 | Published on: Mar 8, 2025
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2025 Ali Chaabouni, Mohamed Ayman Boujelben, published by Poznan University of Technology
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.