
Default Prediction in the Finance Industry Based on Ensemble Learning: Combining Machine Learning and Deep Learning

Open Access | Jun 2025

References

  1. Abdelmoula, A. K. (2015). Bank credit risk analysis with k-nearest-neighbor classifier: Case of Tunisian banks. Accounting and Management Information Systems, 14(1), 79-106.
  2. Abhiram, P., Artham, N., Reddy, N., & Kumari, K. V. (2023). Predicting the borrower’s genuineness in loan repayment through big data analytics. 2023 OITS International Conference on Information Technology (OCIT) (pp. 767-774). Piscataway, NJ, USA: IEEE. https://doi.org/10.1109/OCIT59427.2023.10431007
  3. Acito, F. (2023). k nearest neighbors. In F. Acito (Ed.), Predictive analytics with KNIME: Analytics for citizen data scientists (pp. 209-227). Cham, Switzerland: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-45630-5_10
  4. Adedapo, K. D. (2007). Analysis of default risk of agricultural loan by some selected commercial banks in Osogbo, Osun State, Nigeria. International Journal of Applied Agriculture and Apiculture Research, 4(1&2), 24-29.
  5. Alaloul, W. S., & Qureshi, A. H. (2020). Data processing using artificial neural networks. In D. Harkut (Ed.), Dynamic data assimilation: Beating the uncertainties (pp. 81–107). IntechOpen. https://doi.org/10.5772/intechopen.91935
  6. Ali, A., Hamraz, M., Gul, N., Khan, D. M., Aldahmani, S., & Khan, Z. (2023). A k nearest neighbour ensemble via extended neighbourhood rule and feature subsets. Pattern Recognition, 142(1), 109641. https://doi.org/10.1016/j.patcog.2023.109641
  7. Basha, S. A., Elgammal, M. M., & Abuzayed, B. M. (2021). Online peer-to-peer lending: A review of the literature. Electronic Commerce Research and Applications, 48, 101069. https://doi.org/10.1016/j.elerap.2021.101069
  8. Brownlee, J. (2016). XGBoost with Python: Gradient boosted trees with XGBoost and scikit-learn. S.l.: Machine Learning Mastery. https://machinelearningmastery.com/xgboost-with-python/
  9. Chen, D., Ye, J., & Ye, W. (2023). Interpretable selective learning in credit risk. Research in International Business and Finance, 65(C), 101940. https://doi.org/10.1016/j.ribaf.2023.101940
  10. Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785-794). New York: Association for Computing Machinery. https://doi.org/10.1145/2939672.2939785
  11. Chen, Y. R., Leu, J. S., Huang, S. A., Wang, J. T., & Takada, J. I. (2021). Predicting default risk on peer-to-peer lending imbalanced datasets. IEEE Access, 9, 73103-73109. https://doi.org/10.1109/ACCESS.2021.3079701
  12. Chi Tin. (2023, 07 26). Ministry of Finance Makes a Breakthrough in Administrative Reform and Digital Transformation. Retrieved from Ministry of Finance of Vietnam: https://mof.gov.vn/webcenter/portal/ttncdtbh/pages_r/l/chi-tiettin?dDocName=MOFUCM278175
  13. Vietnam Governance. (2010, 6 16). Law No. 47/2010/QH12 by the National Assembly: LAW ON CREDIT INSTITUTIONS. Retrieved from Government Document System of Vietnam: https://vanban.chinhphu.vn/default.aspx?pageid=27160&docid=96074
  14. Dhruv, C., Paul, D., Kumar, M. H., & Reddy, M. S. (2023). Framework for bank loan repayment prediction and income prediction. 2023 Third International Conference on Secure Cyber Computing and Communication (ICSCCC) (pp. 833-840). Piscataway, NJ, USA: IEEE. https://doi.org/10.1109/ICSCCC58608.2023.10176363
  15. Dong, X., Yu, Z., Cao, W., Shi, Y., & Ma, Q. (2020). A survey on ensemble learning. Frontiers of Computer Science, 14, 241-258. https://link.springer.com/article/10.1007/s11704-019-8208-z
  16. Emmanuel, I., Sun, Y., & Wang, Z. (2024). A machine learning-based credit risk prediction engine system using a stacked classifier and a filter-based feature selection method. Journal of Big Data, 11(1), 23. https://doi.org/10.1186/s40537-024-00882-0
  17. Fan, S. (2023). Design and implementation of a personal loan default prediction platform based on LightGBM model. 2023 IEEE 3rd International Conference on Power, Electronics and Computer Applications (ICPECA) (pp. 1232-1236). Piscataway, NJ, USA: IEEE. https://doi.org/10.1109/ICPECA56706.2023.10076254
  18. Fang, J., & Ji, Z. (2024). Application of machine learning in loan default prediction. Mathematical Modeling and Algorithm Application, 2(2), 33-35. https://doi.org/10.54097/75k4fe13
  19. Fauzi, M. A., & Yuniarti, A. (2018). Ensemble method for indonesian twitter hate speech detection. Indonesian Journal of Electrical Engineering and Computer Science, 11(1), 294-299. http://doi.org/10.11591/ijeecs.v11.i1.pp294-299
  20. George, N. (2021, 2 1). All Lending Club loan data. Retrieved from Kaggle: https://www.kaggle.com/datasets/wordsforthewise/lending-club/data
  21. Gupta, A., Pant, V., Kumar, S., & Bansal, P. K. (2020). Bank Loan Prediction System using Machine Learning. 2020 9th International Conference System Modeling and Advancement in Research Trends (SMART) (pp. 423-426). Piscataway, NJ, USA: IEEE. https://doi.org/10.1109/SMART50582.2020.9336801
  22. Hakkal, S., & Lahcen, A. A. (2024). XGBoost to enhance learner performance prediction. Computers and Education: Artificial Intelligence, 7, 100254. https://doi.org/10.1016/j.caeai.2024.100254
  23. Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735
  24. Jayaram, E. S., Balachandar, G., & Kumar, K. (2024). Machine learning-based loan default prediction: Models, insights, and performance evaluation in peer-to-peer lending platforms. Educational Administration: Theory and Practice, 30(5), 12975-12989. http://dx.doi.org/10.53555/kuey.v30i5.5637
  25. Jin, Y., & Zhu, Y. (2015). A data-driven approach to predict default risk of loan for online peer-to-peer (P2P) lending. 2015 Fifth International Conference on Communication Systems and Network Technologies (pp. 609-613). Piscataway, NJ, USA: IEEE. https://doi.org/10.1109/CSNT.2015.25
  26. Kabari, L. G., & Onwuka, U. C. (2019). Comparison of bagging and voting ensemble machine learning algorithm as a classifier. International Journals of Advanced Research in Computer Science and Software Engineering, 9(3), 19-23.
  27. Kalule, R., Abderrahmane, H. A., Alameri, W., & Sassi, M. (2023). Stacked ensemble machine learning for porosity and absolute permeability prediction of carbonate rock plugs. Scientific Reports, 13(1), 9855. https://doi.org/10.1038/s41598-023-36096-2
  28. Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., . . . Liu, T. (2017, 12). LightGBM: A Highly Efficient Gradient Boosting Decision Tree. Retrieved from Microsoft Research: https://www.microsoft.com/en-us/research/publication/lightgbm-a-highly-efficient-gradient-boosting-decision-tree/
  29. Kim, H., Cho, H., & Ryu, D. (2020). Corporate Default Predictions Using Machine Learning: Literature Review. Sustainability, 12(16), 6325. https://doi.org/10.3390/su12166325
  30. Koç, U., & Sevgili, T. (2020). Consumer loans’ first payment default detection: A predictive model. Turkish Journal of Electrical Engineering and Computer Sciences, 28(1), 167-181. https://doi.org/10.3906/elk-1809-190
  31. Kumari, S., Kumar, D., & Mittal, M. (2021). An ensemble approach for classification and prediction of diabetes mellitus using soft voting classifier. International Journal of Cognitive Computing in Engineering, 2, 40-46. https://doi.org/10.1016/j.ijcce.2021.01.001
  32. Li, F., Zhang, L., Chen, B., Gao, D., Cheng, Y., Zhang, X., . . . Huang, Z. (2020). An optimal stacking ensemble for remaining useful life estimation of systems under multi-operating conditions. IEEE Access, 8, 31854-31868. https://doi.org/10.1109/ACCESS.2020.2973500
  33. Li, S., Ma, K., Niu, X., Wang, Y., Ji, K., Yu, Z., & Chen, Z. (2019). Stacking-based ensemble learning on low dimensional features for fake news detection. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2730-2735. https://doi.org/10.1109/HPCC/SmartCity/DSS.2019.00383
  34. Machado, M. R., Karray, S., & De Sousa, I. T. (2019). LightGBM: An effective decision tree gradient boosting method to predict customer loyalty in the finance industry. 2019 14th International Conference on Computer Science & Education (ICCSE) (pp. 1111-1116). Piscataway, NJ, USA: IEEE. https://doi.org/10.1109/ICCSE.2019.8845529
  35. Pandey, D., & Pandey, B. K. (2022). An efficient deep neural network with adaptive galactic swarm optimization for complex image text extraction. In V. Yadav, A. K. Dubey, H. P. Singh, G. Dubey, & E. Suryani (Eds.), Process mining techniques for pattern recognition (pp. 121-137). Boca Raton, FL: CRC Press. https://doi.org/10.1201/9781003169550-10
  36. Qi, X. (2023). Factors influence loan default–A credit risk analysis. International Conference on Economic Management and Green Development (pp. 849-862). Singapore: Springer Nature Singapore. https://doi.org/10.1007/978-981-97-0523-8_79
  37. Rincy, T. N., & Gupta, R. (2020). Ensemble learning techniques and its efficiency in machine learning: A survey. 2020 2nd International Conference on Data, Engineering and Applications (IDEA) (pp. 1-6). Piscataway, NJ, USA: IEEE. https://doi.org/10.1109/IDEA49133.2020.9170675
  38. Sain, K., & Kumar, P. C. (2022). An Overview of Artificial Neural Networks. In K. Sain, & P. C. Kumar, Meta-Attributes and Artificial Networking: A New Tool for Seismic Interpretation (pp. 73-93). Hoboken, New Jersey: John Wiley & Sons. https://doi.org/10.1002/9781119481874
  39. Satpute, S., Jayabalan, M., Kolivand, H., Assi, J., Aldhaibani, O. A., Liatsis, P., & Mahyoub, M. (2022). Loan default forecasting using StackNet. The International Conference on Data Science and Emerging Technologies (pp. 434-447). Singapore: Springer Nature Singapore. https://doi.org/10.1007/978-981-99-0741-0_31
  40. Schonlau, M. (2023). Logistic regression. In M. Schonlau, Applied statistical learning: With case studies in Stata (pp. 49-71). Cham, Switzerland: Springer International Publishing. https://doi.org/10.1007/978-3-031-33390-3_4
  41. Sokolova, M., Japkowicz, N., & Szpakowicz, S. (2006). Beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation. Australasian joint conference on artificial intelligence (pp. 1015-1021). Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/11941439_114
  42. Thorat, M., Pandit, S., & Balote, S. (2022). Artificial neural network: A brief study. Asian Journal for Convergence in Technology (AJCT), 8(3), 12-16. https://doi.org/10.33130/AJCT.2022v08i03.003
  43. Uddin, N., Ahamed, M. K., Uddin, M. A., Islam, M. M., Talukder, M. A., & Aryal, S. (2023). An ensemble machine learning based bank loan approval predictions system with a smart application. International Journal of Cognitive Computing in Engineering, 4(6), 327-339. https://doi.org/10.1016/j.ijcce.2023.09.001
  44. Uddin, S., Haque, I., Lu, H., Moni, M. A., & Gide, E. (2022). Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Scientific Reports, 12(1), 6256. https://doi.org/10.1038/s41598-022-10358-x
  45. Wang, C., Han, D., Liu, Q., & Luo, S. (2018). A deep learning approach for credit scoring of peer-to-peer lending using attention mechanism LSTM. IEEE Access, 7, 2161-2168. https://doi.org/10.1109/ACCESS.2018.2887138
  46. Wang, W., Zuo, X., & Han, D. (2024). Predict credit risk with XGBoost. Applied and Computational Engineering, 74(1), 164-177. https://doi.org/10.54254/2755-2721/74/20240462
  47. Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241-259. https://doi.org/10.1016/S0893-6080(05)80023-1
  48. Xia, Y., He, L., Li, Y., Liu, N., & Ding, Y. (2020). Predicting loan default in peer-to-peer lending using narrative data. Journal of Forecasting, 39(2), 260-280. https://doi.org/10.1002/for.2625
  49. Yadav, D., Sahoo, L., Mandal, S. K., Ravivarman, G., Vijayaraghavan, P., & Prasad, B. (2023). Using long short-term memory units for time series forecasting. 2023 2nd International Conference on Futuristic Technologies (INCOFT) (pp. 1-6). Piscataway, NJ, USA: IEEE. https://doi.org/10.1109/INCOFT60753.2023.10425756
  50. Zhou, Y. (2023). Loan default prediction based on machine learning methods. Proceedings of the 3rd International Conference on Big Data Economy and Information Management (BDEIM 2022). Zhengzhou, China: EAI. http://doi.org/10.4108/eai.2-12-2022.2328740
DOI: https://doi.org/10.2478/bsrj-2025-0010 | Journal eISSN: 1847-9375 | Journal ISSN: 1847-8344
Language: English
Page range: 198 - 218
Submitted on: May 5, 2024 | Accepted on: Oct 11, 2024 | Published on: Jun 20, 2025
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2025 Le Hoanh-Su, Le Quang Chan Phong, Truong Cong Vinh, Ho Mai Minh Nhat, Jong-Hwa Lee, published by IRENET - Society for Advancing Innovation and Research in Economy
This work is licensed under the Creative Commons Attribution 4.0 License.