The Influence of Unbalanced Economic Data on Feature Selection and Quality of Classifiers

Kubus, Mariusz

Folia Oeconomica Stetinensia

Volume 20 (2020): Issue 1 (June 2020)

Open Access

Aug 2020

References

Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32. DOI: 10.1023/A:1010933404324
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P. (2002). SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, 16, 321–357. DOI: 10.1613/jair.953
Chawla, N.V., Japkowicz, N., Kołcz, A. (2004). Special issue on learning from imbalanced data sets. ACM Sigkdd Explorations Newsletter, 6 (1), 1–6. DOI: 10.1145/1007730.1007733
Chen, C., Liaw, A., Breiman, L. (2004). Using random forest to learn imbalanced data. University of California, Berkeley, 110, 1–12.
Dua, D., Graff, C. (2019). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science. Retrieved from: http://archive.ics.uci.edu/ml (17.06.2019).
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27, 861–874. DOI: 10.1016/j.patrec.2005.10.010
Fayyad, U., Irani, K. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (pp. 1022–1027).
Galar, M., Fernandez, A., Barrenechea, E., Bustince, H., Herrera, F. (2011). A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 42 (4), 463–484. DOI: 10.1109/TSMCC.2011.2161285
Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L. (2006). Feature Extraction: Foundations and Applications. New York: Springer. DOI: 10.1007/978-3-540-35488-8
Guyon, I., Weston, J., Barnhill, S., Vapnik, V. (2002). Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning, 46, 389–422. DOI: 10.1023/A:1012487302797
Haixiang, G., Yijing, L., Shang, J., Mingyun, G., Yuanyue, H., Bing, G. (2017). Learning from class-imbalanced data: Review of methods and applications. Expert Systems with Applications, 73, 220–239. DOI: 10.1016/j.eswa.2016.12.035
Japkowicz, N., Shah, M. (2011). Evaluating learning algorithms: a classification perspective. Cambridge University Press. DOI: 10.1017/CBO9780511921803
King, G., Zeng, L. (2001). Logistic regression in rare events data. Political Analysis, 9, 137–163. DOI: 10.1093/oxfordjournals.pan.a004868
Kononenko, I. (1994). Estimating attributes: Analysis and extensions of RELIEF. Proceedings of European Conference on Machine Learning (pp. 171–182). DOI: 10.1007/3-540-57868-4_57
Kubus, M. (2015). Rekurencyjna eliminacja cech w metodach dyskryminacji. Prace Naukowe Uniwersytetu Ekonomicznego we Wrocławiu, 384. Taksonomia, 24, 154–162. DOI: 10.15611/pn.2015.384.16
Kubus, M. (2016). Lokalna ocena mocy dyskryminacyjnej zmiennych. Prace Naukowe Uniwersytetu Ekonomicznego we Wrocławiu, 427. Taksonomia, 27, 143–152. DOI: 10.15611/pn.2016.427.15
Longadge, R., Dongre, S.S., Malik, L. (2013). Class Imbalance Problem in Data Mining: Review. International Journal of Computer Science and Network, 2 (1), 83–87.
Menardi, G., Torelli, N. (2014). Training and assessing classification rules with imbalanced data. Data Mining and Knowledge Discovery, 28, 92–122. DOI: 10.1007/s10618-012-0295-5
Pociecha, J., Pawełek, B., Baryła, M., Augustyn, S. (2014). Statystyczne metody prognozowania bankructwa w zmieniającej się koniunkturze gospodarczej. Kraków: Fundacja Uniwersytetu Ekonomicznego w Krakowie.
Tomek, I. (1976). Two modifications of CNN. IEEE Trans. Systems, Man and Cybernetics, 6, 769–772. DOI: 10.1109/TSMC.1976.4309452
Tsamardinos, I., Aliferis, C.F. (2003). Towards principled feature selection: relevancy, filters and wrappers. In Proceedings of the Workshop on Artificial Intelligence and Statistics.
Weiss, G. (2004). Mining with rarity: A unifying framework. SIGKDD Explorations, 6 (1), 7–19. DOI: 10.1145/1007730.1007734
Yu, L., Liu, H. (2004). Efficient feature selection via analysis of relevance and redundancy. Journal of Machine Learning Research, 5, 1205–1224.
Zou, H., Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B, 67 (2), 301–320. DOI: 10.1111/j.1467-9868.2005.00503.x