Have a personal or library account? Click to login
Using Information on Class Interrelations to Improve Classification of Multiclass Imbalanced Data: A New Resampling Algorithm Cover

Using Information on Class Interrelations to Improve Classification of Multiclass Imbalanced Data: A New Resampling Algorithm

Open Access
|Dec 2019

Abstract

The relations between multiple imbalanced classes can be handled with a specialized approach which evaluates types of examples’ difficulty based on an analysis of the class distribution in the examples’ neighborhood, additionally exploiting information about the similarity of neighboring classes. In this paper, we demonstrate that such an approach can be implemented as a data preprocessing technique and that it can improve the performance of various classifiers on multiclass imbalanced datasets. It has led us to the introduction of a new resampling algorithm, called Similarity Oversampling and Undersampling Preprocessing (SOUP), which resamples examples according to their difficulty. Its experimental evaluation on real and artificial datasets has shown that it is competitive with the most popular decomposition ensembles and better than specialized preprocessing techniques for multi-imbalanced problems.

DOI: https://doi.org/10.2478/amcs-2019-0057 | Journal eISSN: 2083-8492 | Journal ISSN: 1641-876X
Language: English
Page range: 769 - 781
Submitted on: Dec 14, 2018
Accepted on: Jul 12, 2019
Published on: Dec 31, 2019
Published by: University of Zielona Góra
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2019 Małgorzata Janicka, Mateusz Lango, Jerzy Stefanowski, published by University of Zielona Góra
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.