Comparing Different Oversampling Methods in Predicting Multi-Class Educational Datasets Using Machine Learning Techniques

Open Access | Nov 2023

References

  1. Kustitskaya, T. A., A. A. Kytmanov, M. V. Noskov. Early Student-at-Risk Detection by Current Learning Performance and Learning Behavior Indicators. – Cybernetics and Information Technologies, Vol. 22, 2022, No 1, pp. 117-133. https://doi.org/10.2478/cait-2022-0008.
  2. Atahua, A. S., J. V. Guerrero, L. Andrade-Arenas, C. M. Huerta. Data Mining: Application of Digital Marketing in Education. – Advances in Mobile Learning Educational Research, Vol. 3, 2023, pp. 621-629.
  3. Abouzinadah, E., O. Rabie, A. Bessadok. Exploring Students Digital Activities and Performances through Their Activities Logged in Learning Management System Using Educational Data Mining Approach. – Interactive Technology and Smart Education, Vol. 20, 2023, pp. 58-72.
  4. Asif, R., N. G. Haider, K. Mahboob. Quality Enhancement at Higher Education Institutions by Early Identifying Students at Risk Using Data Mining. – Mehran University Research Journal of Engineering and Technology, Vol. 42, 2023, pp. 120-136.
  5. SouzaNeto, P. A., I. Silva, L. A. Guedes, T. M. Barros. Predictive Models for Imbalanced Data: A School Dropout Perspective. – Education Sciences, Vol. 9, 2019.
  6. Düştegör, D., E. Alyahyan. Predicting Academic Success in Higher Education: Literature Review and Best Practices. – International Journal of Educational Technology in Higher Education, Vol. 17, 2020, pp. 1-21.
  7. Lin, W. C., Y. H. Hu, G. T. Yao, C. F. Tsai. Under-Sampling Class Imbalanced Datasets by Combining Clustering Analysis and Instance Selection. – Information Sciences, Vol. 477, 2019, pp. 47-54.
  8. Kalegele, K., D. Machuve, N. Mduma. A Survey of Machine Learning Approaches and Techniques for Student Dropout Prediction. – Data Science Journal, Vol. 18, 2019, pp. 1-10.
  9. Hammoud, S., F. Kamalov, A. Gonsalves, F. Thabtah. Data Imbalance in Classification: Experimental Evaluation. – Information Sciences, Vol. 513, 2020, pp. 429-441.
  10. Rawashdeh, J., M. Abdullah, R. Mohammed. Machine Learning with Oversampling and Under-Sampling Techniques: Overview Study and Experimental Results. – In: Proc. of 11th International Conference on Information and Communication Systems (ICICS’20), 2020, pp. 243-248.
  11. Chawla, N. V., K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer. SMOTE: Synthetic Minority Over-Sampling Technique. – Journal of Artificial Intelligence Research, Vol. 16, 2002, pp. 321-357.
  12. He, H., Y. Bai, E. A. Garcia, S. Li. ADASYN: Adaptive Synthetic Sampling Approach for Imbalanced Learning. – In: Proc. of IEEE International Joint Conference on Neural Networks, 2008, pp. 1322-1328.
  13. Wang, W. Y., B. H. Mao, H. Han. Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning. – In: Proc. of International Conference on Advances in Intelligent Computing: Intelligent Computing, 2005, pp. 878-887.
  14. DeLaCalleja, J., O. Fuentes. A Distance-Based Over-Sampling Method for Learning from Imbalanced Data Sets. – In: Proc. of 20th International Florida Artificial Intelligence, 2007, pp. 634-635.
  15. Douzas, G., F. Bacao, F. Last. Improving Imbalanced Learning through a Heuristic Oversampling Method Based on k-Means and SMOTE. – Information Sciences, 2018, pp. 1-20.
  16. Zhang, Y. Q., N. V. Chawla, S. Krasser, Y. Tang. SVMs Modeling for Highly Imbalanced Classification. – IEEE Transactions on Systems, Vol. 39, 2008, pp. 281-288.
  17. Maciejewski, T., J. Stefanowski. Local Neighbourhood Extension of SMOTE for Mining Imbalanced Data. – In: Proc. of IEEE Symposium on Computational Intelligence and Data Mining, 2011, pp. 104-111.
  18. Barua, S., M. M. Islam, X. Yao, K. Murase. MWMOTE – Majority Weighted Minority Oversampling Technique for Imbalanced Data Set Learning. – IEEE Transactions on Knowledge and Data Engineering, Vol. 26, 2014, pp. 405-425.
  19. Bunkhumpornpat, C., K. Sinapiromsaran, C. Lursinsap. Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling Technique for Handling the Class Imbalanced Problem. – In: Proc. of 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, 2009, pp. 475-482.
  20. Prati, R. C., M. C. Monard, G. E. Batista. A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data. – ACM, Vol. 6, 2004, pp. 20-29.
  21. Tahir, M., K. Jawad, M. A. Shah. Students’ Academic Performance and Engagement Prediction in a Virtual Learning Environment Using Random Forest with Data Balancing. – Sustainability, Vol. 14, 2022.
  22. Prasetyo, W. A., A. R. Taufani, U. Pujianto. Students Academic Performance Prediction with k-Nearest Neighbor and C4.5 on Smote-Balanced Data. – In: Proc. of 3rd International Seminar on Research of Information Technology and Intelligent Systems (ISRITI’20), 2020, pp. 348-353.
  23. Kissoum, Y., A. Mouhssen, M. A. Karek, S. Mazouzi, M. L. Boughouas. Towards a Big Educational Data Analytics. – In: Proc. of International Conference on Advanced Aspects of Software Engineering (ICAASE’22), 2022, pp. 1-6.
  24. Shaiba, H., M. Bezbradica, S. Almutairi. Predicting Students’ Academic Performance and Main Behavioral Features Using Data Mining Techniques. – In: Proc. of 1st International Conference on Computing, in Advances in Data Science, Cyber Security and IT Applications, 2019, pp. 245-259.
  25. Ajoodha, R., K. Padayachee, E. Buraimoh. Importance of Data Resampling and Dimensionality Reduction in Predicting Students’ Success. – In: Proc. of International Conference on Electrical, Communication, and Computer Engineering (ICECCE’21), 2021, pp. 1-6.
  26. Ullah, Z., B. Fakieh, F. Kateb, F. Saleem. Intelligent Decision Support System for Predicting Student’s e-Learning Performance Using Ensemble Machine Learning. – Mathematics, Vol. 9, 2022.
  27. Ullah, Z., B. Fakieh, F. Kateb, F. Saleem. Comparing Different Resampling Methods in Predicting Students’ Performance Using Machine Learning Techniques. – IEEE Access, Vol. 8, 2020, pp. 67899-67911.
  28. Arham, T., Y. Niaz, A. Amin. Systematic Approach for Re-Sampling and Prediction of Low Sample Educational Datasets. – International Journal of Computing and Digital System, 2021.
  29. Rahman, T., I. Khan, I. Ullah, A. UrRehman, M. Baz, H. Hamam, O. Cheikhrouhou, B. K. Yousafzai, S. A. Khan. Student-Performulator: Student Academic Performance Using Hybrid Deep Neural Network. – Sustainability, Vol. 13, 2021.
  30. Lin, J., J. Yu. Data Mining Technology in the Analysis of College Students’ Psychological Problems. – Computer Science and Information Systems, Vol. 12, 2022, pp. 1583-1596.
  31. Lahoud, C., H. E. Khoury, P. Champin, C. Obeid. Novel Hybrid Recommender System Approach for Student Academic Advising Named Cohrs, Supported by Case-Based Reasoning and Ontology. – Computer Science and Information Systems, Vol. 19, 2022, pp. 979-1005.
  32. Sun, C., Z. Wu, J. Yang, J. Wang, T. Tao. Deep Neural Network-Based Prediction and Early Warning of Student Grades and Recommendations for Similar Learning Approaches. – Computer Science and Information Systems, Vol. 12, 2022.
  33. Hamtini, T., I. Aljarah, E. A. Amrieh. Preprocessing and Analyzing Educational Data Set Using x-Api for Improving Student’s Performance. – In: Proc. of Applied Electrical Engineering and Computing Technologies (AEECT’15), 2015, pp. 1-5.
DOI: https://doi.org/10.2478/cait-2023-0044 | Journal eISSN: 1314-4081 | Journal ISSN: 1311-9702
Language: English
Page range: 199 - 212
Submitted on: Oct 11, 2023
Accepted on: Nov 17, 2023
Published on: Nov 30, 2023
Published by: Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2023 Muhammad Arham Tariq, Allah Bux Sargano, Muhammad Aksam Iftikhar, Zulfiqar Habib, published by Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.