Statistical, machine learning, and deep learning models for COVID-19 forecasting in Kenya

Open Access | Sep 2025

Abstract

This study aims to improve coronavirus disease 2019 (COVID-19) forecasting in Kenya by comparing the predictive performance of statistical, machine learning, and deep learning (DL) models for total cases, critical cases, severe cases, and total deaths, using data from April 2020 to August 2021. Six models were evaluated: autoregressive integrated moving average (ARIMA), support vector regression (SVR), random forest (RF), recurrent neural network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU). Each was assessed on an 80–20 train-test split using root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and $R^{2}$. The Diebold-Mariano (DM) test was used to assess whether differences in forecast errors between models were statistically significant. RF was the top performer, consistently achieving the lowest errors and the highest $R^{2}$ across all datasets, which indicates superior accuracy in capturing nonlinear epidemic patterns. GRU outperformed the other DL models, while ARIMA showed the weakest performance. The DM test confirmed significant differences in forecasting errors, with RF generally outperforming the other models.
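To make the evaluation protocol concrete, the following is a minimal Python sketch of a chronological 80–20 split, the four error metrics, and a simplified Diebold-Mariano statistic. It is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the DM statistic assumes one-step-ahead forecasts with squared-error loss (so only the lag-0 variance of the loss differential is used), and the abstract does not say which loss function or forecast horizon the study applied.

```python
import math


def chronological_split(series, train_fraction=0.8):
    """Split a time series in order: the first part trains, the rest tests."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (undefined if any actual is 0)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def dm_statistic(actual, pred_a, pred_b):
    """Simplified Diebold-Mariano statistic (assumed: horizon 1, squared-error loss).

    A negative value means model A has the lower average loss. Under the null
    of equal accuracy the statistic is compared with a standard normal.
    """
    # Loss differential between the two models at each test point.
    d = [(a - pa) ** 2 - (a - pb) ** 2 for a, pa, pb in zip(actual, pred_a, pred_b)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((x - d_bar) ** 2 for x in d) / n  # lag-0 autocovariance
    return d_bar / math.sqrt(var_d / n)


if __name__ == "__main__":
    # Made-up numbers for illustration only; these are not data from the study.
    series = [100, 120, 150, 170, 200, 240, 260, 300, 330, 380]
    train, test = chronological_split(series)
    forecast_a = [325, 385]  # hypothetical forecasts from one model
    forecast_b = [310, 350]  # hypothetical forecasts from another model
    print("RMSE A:", rmse(test, forecast_a), "RMSE B:", rmse(test, forecast_b))
    print("DM statistic:", dm_statistic(test, forecast_a, forecast_b))
```

The split keeps the observations in time order rather than shuffling them, so the test set is always the most recent 20% of the series. For multi-step forecasts, the variance in the DM statistic would also need autocovariance terms at higher lags, which this sketch omits.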

Language: English
Submitted on: Apr 18, 2025
Accepted on: Jul 23, 2025
Published on: Sep 15, 2025
Published by: Sciendo
In partnership with: Paradigm Publishing Services

© 2025 Joyce Kiarie, Samuel Musili Mwalili, Rachel Mbogo, John Kamwele Mutinda, Amos Kipkorir Langat, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 License.