Comparison of Methods for Handling Imbalanced Data in Customer Churn Prediction with Feature Selection Using SHAP and mRMR Frameworks

Open Access | Sep 2025

References

  1. Y, N. N., T. V. Ly, D. V. T. Son. Churn Prediction in Telecommunication Industry Using Kernel Support Vector Machines. – PLOS ONE, Vol. 17, 2022, No 5, e0267935.
  2. Burez, J., D. Van den Poel. Handling Class Imbalance in Customer Churn Prediction. – Expert Systems with Applications, Vol. 36, 2009, No 3, Part 1, pp. 4626-4636.
  3. Zhu, B., B. Baesens, S. K. L. M. van den Broucke. An Empirical Comparison of Techniques for the Class Imbalance Problem in Churn Prediction. – Information Sciences, Vol. 408, 2017, pp. 84-99.
  4. Ahmad, A. K., A. Jafar, K. Aljoumaa. Customer Churn Prediction in Telecom Using Machine Learning in Big Data Platform. – Journal of Big Data, Vol. 6, 2019, No 1, 28.
  5. Bhuse, P., A. Gandhi, P. Meswani, R. Muni, N. Katre. Machine Learning Based Telecom-Customer Churn Prediction. – In: Proc. of 3rd International Conference on Intelligent Sustainable Systems (ICISS’20), 3-5 December 2020.
  6. Jain, H., A. Khunteta, S. Srivastava. Churn Prediction in Telecommunication Using Logistic Regression and Logit Boost. – Procedia Computer Science, Vol. 167, 2020, pp. 101-112.
  7. Lalwani, P., M. K. Mishra, J. S. Chadha, P. Sethi. Customer Churn Prediction System: A Machine Learning Approach. – Computing, Vol. 104, 2022, No 2, pp. 271-294.
  8. Pustokhina, I. V., D. A. Pustokhin, P. T. Nguyen, M. Elhoseny, K. Shankar. Multi-Objective Rain Optimization Algorithm with WELM Model for Customer Churn Prediction in Telecommunication Sector. – Complex & Intelligent Systems, Vol. 9, 2023, No 4, pp. 3473-3485.
  9. Sudharsan, R., E. Ganesh. A Swish RNN Based Customer Churn Prediction for the Telecom Industry with a Novel Feature Selection Strategy. – Connection Science, Vol. 34, 2022, No 1, pp. 1855-1876.
  10. Long, H. V., L. H. Son, M. Khari, K. Arora, S. Chopra, R. Kumar et al. A New Approach for Construction of Geodemographic Segmentation Model and Prediction Analysis. – Computational Intelligence and Neuroscience, Vol. 2019, 2019, No 1, 9252837.
  11. Rahman, M., V. Kumar. Machine Learning Based Customer Churn Prediction in Banking. – In: Proc. of 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA’20), 5-7 November 2020.
  12. De Lima Lemos, R. A., T. C. Silva, B. M. Tabak. Propension to Customer Churn in a Financial Institution: A Machine Learning Approach. – Neural Computing and Applications, Vol. 34, 2022, No 14, pp. 11751-11768.
  13. Peng, K., Y. Peng, W. Li. Research on Customer Churn Prediction and Model Interpretability Analysis. – PLOS ONE, Vol. 18, 2023, No 12, e0289724.
  14. Xiahou, X., Y. Harada. B2C E-Commerce Customer Churn Prediction Based on K-Means and SVM. – Journal of Theoretical and Applied Electronic Commerce Research, Vol. 17, 2022, No 2, pp. 458-475.
  15. AL-Najjar, D., N. Al-Rousan, H. AL-Najjar. Machine Learning to Develop Credit Card Customer Churn Prediction. – Journal of Theoretical and Applied Electronic Commerce Research, Vol. 17, 2022, No 4, pp. 1529-1542.
  16. Amin, A., S. Anwar, A. Adnan, M. Nawaz, N. Howard, J. Qadir et al. Comparing Oversampling Techniques to Handle the Class Imbalance Problem: A Customer Churn Prediction Case Study. – IEEE Access, Vol. 4, 2016, pp. 7940-7957.
  17. Bharathi, S. V., D. Pramod, R. Raman. An Ensemble Model for Predicting Retail Banking Churn in the Youth Segment of Customers. – Data, Vol. 7, 2022, No 5, 61.
  18. Brito, J. B. G., G. B. Bucco, R. Heldt, J. L. Becker, C. S. Silveira, F. B. Luce et al. A Framework to Improve Churn Prediction Performance in Retail Banking. – Financial Innovation, Vol. 10, 2024, No 1, 17.
  19. Xiahou, X., Y. Harada. B2C E-Commerce Customer Churn Prediction Based on K-Means and SVM. – Journal of Theoretical and Applied Electronic Commerce Research, Vol. 17, 2022, No 2, pp. 458-475.
  20. Xu, T., Y. Ma, K. Kim. Telecom Churn Prediction System Based on Ensemble Learning Using Feature Grouping. – Applied Sciences, Vol. 11, 2021, No 11, 4742.
  21. Asif, D., M. S. Arif, A. Mukheimer. A Data-Driven Approach with Explainable Artificial Intelligence for Customer Churn Prediction in the Telecommunications Industry. – Results in Engineering, Vol. 26, 2025, 104629.
  22. Zhou, Y., W. Chen, X. Sun, D. Yang. Early Warning of Telecom Enterprise Customer Churn Based on Ensemble Learning. – PLOS ONE, Vol. 18, 2023, No 10, e0292466.
  23. Ngo, V.-B., V.-H. Vu. Multi-Level Machine Learning Model to Improve the Effectiveness of Predicting Customers Churn Banks. – Cybernetics and Information Technologies, Vol. 24, 2024, No 3, pp. 3-20.
  24. Lundberg, S. M., S.-I. Lee. A Unified Approach to Interpreting Model Predictions. – Advances in Neural Information Processing Systems, Vol. 30, 2017.
  25. Peng, H., F. Long, C. Ding. Feature Selection Based on Mutual Information Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy. – IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, 2005, No 8, pp. 1226-1238.
  26. Chawla, N. V., K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer. SMOTE: Synthetic Minority Over-Sampling Technique. – Journal of Artificial Intelligence Research, Vol. 16, 2002, pp. 321-357.
  27. He, H., Y. Bai, E. A. Garcia, S. Li. ADASYN: Adaptive Synthetic Sampling Approach for Imbalanced Learning. – In: Proc. of IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), 1-8 June 2008.
  28. Han, H., W.-Y. Wang, B.-H. Mao. Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning. – In: Proc. of International Conference on Intelligent Computing, Springer, 2005.
  29. Batista, G. E., R. C. Prati, M. C. Monard. A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data. – ACM SIGKDD Explorations Newsletter, Vol. 6, 2004, No 1, pp. 20-29.
  30. Mani, I., I. Zhang. kNN Approach to Unbalanced Data Distributions: A Case Study Involving Information Extraction. – In: Proc. of Workshop on Learning from Imbalanced Datasets, 2003, ICML United States.
  31. Tomek, I. Two Modifications of CNN. – IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-6, 1976, No 11, pp. 769-772.
  32. Wilson, D. L. Asymptotic Properties of Nearest Neighbor Rules Using Edited Data. – IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-2, 1972, No 3, pp. 408-421.
  33. Fernández, A., S. García, M. Galar, R. C. Prati, B. Krawczyk, F. Herrera. Cost-Sensitive Learning. – In: A. Fernández, S. García, M. Galar, R. C. Prati, B. Krawczyk, F. Herrera, Eds. Learning from Imbalanced Data Sets. Cham, Springer International Publishing, 2018, pp. 63-78.
  34. AL-Najjar, D., N. Al-Rousan, H. AL-Najjar. Machine Learning to Develop Credit Card Customer Churn Prediction. – Journal of Theoretical and Applied Electronic Commerce Research, Vol. 17, 2022, No 4, pp. 1529-1542.
  35. Wu, Z., L. Jing, B. Wu, L. Jin. A PCA-AdaBoost Model for E-Commerce Customer Churn Prediction. – Annals of Operations Research, 2022.
  36. John Lu, Z. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Oxford University Press, 2010.
  37. Breiman, L. Bagging Predictors. – Machine Learning, Vol. 24, 1996, pp. 123-140.
DOI: https://doi.org/10.2478/cait-2025-0023 | Journal eISSN: 1314-4081 | Journal ISSN: 1311-9702
Language: English
Page range: 68 - 87
Published on: Sep 25, 2025
Published by: Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2025 Luong Thanh Tam, Luong Gia Vi, Nguyen Manh Tuan, published by Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.