Statistical, machine learning, and deep learning models for COVID-19 forecasting in Kenya

Kiarie, Joyce; Mwalili, Samuel Musili; Mbogo, Rachel; Mutinda, John Kamwele; Langat, Amos Kipkorir

doi:10.1515/cmb-2025-0026

Figures & Tables

Architecture of the LSTM network. Source: From ref. [39].

Architecture of the GRU. Source: From ref. [39].

Time series plots of total cases, severe cases, critical cases, and total deaths. Source: Created by the authors.

ACF and PACF for total cases, severe cases, critical cases, and total deaths. Source: Created by the authors.

Prediction vs actual values for total cases, critical cases, severe cases, and total deaths datasets. Source: Created by the authors.

Bar plot of performance metrics (RMSE, MAE, MAPE, and R²) for six models (ARIMA, SVR, RNN, LSTM, GRU, and RF) across four COVID-19 datasets: total cases, critical cases, severe cases, and total deaths. Source: Created by the authors.

Scatterplot of observed vs predicted values for the total cases dataset. The RF model demonstrates a strong linear relationship, indicating accurate predictions. Source: Created by the authors.

Scatterplot of observed vs predicted values for the critical cases dataset. The RF model demonstrates a strong linear relationship, indicating accurate predictions. Source: Created by the authors.

Scatterplot of observed vs predicted values for the severe cases dataset. The RF model demonstrates a strong linear relationship, indicating accurate predictions. Source: Created by the authors.

Scatterplot of observed vs predicted values for the total deaths dataset. The RF model demonstrates a strong linear relationship, indicating accurate predictions. Source: Created by the authors.

Boxplot of forecasting error distributions for six models – ARIMA, SVR, RNN, LSTM, GRU, and RF – across four COVID-19 datasets. Each model’s errors are grouped by dataset, with boxes showing the interquartile range, median, and outliers of the forecasting errors. The RF model consistently exhibits the lowest median error, minimal variance, and fewer outliers across all datasets, highlighting its superior predictive accuracy and stability. Source: Created by the authors.

Heatmap of model performance metrics (RMSE, MAE, MAPE) across total cases, critical cases, severe cases, and total deaths datasets. RF improves on all other models, with the largest improvement relative to ARIMA. Source: Created by the authors.

Performance metrics (RMSE, MAE, MAPE, and R²) for six models – ARIMA, SVR, RNN, LSTM, GRU, and RF – across four datasets: total cases, critical cases, severe cases, and total deaths

Dataset	Model	RMSE	MAE	MAPE (%)	R²
Total cases	ARIMA	13870.6770	10916.6357	82.0056	−9.2272
	SVR	5589.6811	3874.5758	26.4893	−0.6609
	RNN	365.3872	289.7108	2.3229	0.9929
	LSTM	338.4482	281.7235	2.4740	0.9939
	GRU	215.3505	177.0998	1.5449	0.9975
	RF	93.4117	35.9370	0.2668	0.9995
Critical cases	ARIMA	2055.3009	1599.7675	70.9180	−6.7976
	SVR	467.2761	216.0696	6.8270	0.5970
	RNN	62.5152	54.0835	2.8693	0.9928
	LSTM	185.1695	144.1515	6.6211	0.9367
	GRU	42.5913	35.1251	1.8999	0.9967
	RF	17.5342	7.3318	0.3330	0.9994
Severe cases	ARIMA	6593.2169	5188.9595	81.8556	−9.1904
	SVR	2223.5193	1341.9956	17.5188	−0.1590
	RNN	140.1730	110.0014	2.3796	0.9954
	LSTM	281.8612	221.9638	3.9355	0.9814
	GRU	140.4860	113.2804	1.9507	0.9954
	RF	44.4818	17.1128	0.2668	0.9995
Total deaths	ARIMA	822.0758	639.8712	70.9140	−6.7967
	SVR	89.1646	34.3085	2.6381	0.9083
	RNN	23.3445	17.9225	2.0940	0.9937
	LSTM	40.4583	31.6118	4.0341	0.9811
	GRU	21.3337	18.6039	2.5657	0.9947
	RF	7.0137	2.9327	0.3330	0.9994
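The four metrics in the table above can be computed as in the following minimal sketch. It assumes the standard textbook definitions (this section does not reproduce the authors' exact formulas), and the function name `forecast_metrics` and the toy numbers are illustrative only. It also shows why R² can be negative, as it is for ARIMA in the table: a forecast that fits worse than the mean of the observed values has a residual sum of squares larger than the total sum of squares.

```python
import math


def forecast_metrics(actual, predicted):
    """Return RMSE, MAE, MAPE (%) and R^2 for two equal-length sequences.

    Assumes no actual value is zero (MAPE is undefined otherwise).
    """
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # R^2 = 1 - SS_res / SS_tot; negative when the forecast is worse than
    # simply predicting the mean of the observed series.
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Toy data (not from the paper): one close forecast, one badly biased one.
actual = [100.0, 110.0, 120.0, 130.0]
good = forecast_metrics(actual, [101.0, 109.0, 121.0, 129.0])
bad = forecast_metrics(actual, [60.0, 60.0, 60.0, 60.0])

print(good)
print(bad)  # the R^2 entry is negative for the biased forecast
```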

Parameter settings for RNN, LSTM, and GRU models

Parameter	RNN	LSTM	GRU
Number of layers	3	3	3
Activation	ReLU	ReLU	ReLU
Loss function	MSE	MSE	MSE
Optimizer	Adam	Adam	Adam
Learning rate	0.001	0.001	0.001
Dropout rate	0.2	0.2	0.2
Epochs	100	100	100
Batch size	16	16	16
Units per layer	100, 50, 25	100, 50, 25	100, 50, 25
Early stopping	Yes (monitor = val_loss, patience = 10)	Yes (monitor = val_loss, patience = 10)	Yes (monitor = val_loss, patience = 10)
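The early stopping rule in the last row interacts with the 100-epoch budget: training ends as soon as the validation loss has failed to improve for 10 consecutive epochs. The sketch below illustrates that rule in plain Python. The dictionary restates the table, while the function name `early_stopping_epoch` and the loss values are hypothetical and are not taken from the authors' code.

```python
# Settings shared by the RNN, LSTM, and GRU models, restated from the table.
SHARED_SETTINGS = {
    "layers": 3,
    "units_per_layer": (100, 50, 25),
    "activation": "relu",
    "loss": "mse",
    "optimizer": "adam",
    "learning_rate": 0.001,
    "dropout_rate": 0.2,
    "max_epochs": 100,
    "batch_size": 16,
    "early_stopping_patience": 10,
}


def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training stops.

    Training stops once the monitored validation loss has not improved on
    its best value for `patience` consecutive epochs. If that never
    happens, all epochs are run.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)


# Hypothetical validation losses: five improving epochs, then a plateau.
val_losses = [1.0, 0.8, 0.6, 0.5, 0.4] + [0.45] * 95
stop_epoch = early_stopping_epoch(
    val_losses, SHARED_SETTINGS["early_stopping_patience"]
)
print(stop_epoch)  # 15: the best epoch (5) plus 10 epochs of patience
```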

ADF test results for stationarity

Variable	ADF statistic	p-value
Total cases	−1.6906	0.4360
Severe cases	−1.6906	0.4360
Critical cases	−1.6968	0.4327
Total deaths	−1.6968	0.4328
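With p-values near 0.43, none of the four series rejects the unit-root null at conventional significance levels, so all are treated as non-stationary. The sketch below shows the idea behind the test statistic using a simplified Dickey-Fuller regression with a constant and no lagged differences. This is an assumption made for illustration: the lag order and deterministic terms used in the paper are not stated in this section, and the function name `dickey_fuller_t` and the simulated series are hypothetical. For this constant-only case the large-sample 5% critical value is approximately −2.86, so a statistic around −1.69 falls well short of rejection.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic of gamma in  dy_t = alpha + gamma * y_{t-1} + e_t.

    Simplified Dickey-Fuller regression (no lagged differences), fitted by
    ordinary least squares. More negative values are stronger evidence
    against a unit root.
    """
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(lagged)

    mean_x = sum(lagged) / n
    mean_dy = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_dy) for x, d in zip(lagged, diffs))

    gamma = sxy / sxx
    alpha = mean_dy - gamma * mean_x
    ssr = sum((d - alpha - gamma * x) ** 2 for x, d in zip(lagged, diffs))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / std_err


rng = random.Random(0)

# Simulated stationary AR(1) series: y_t = 0.5 * y_{t-1} + noise.
stationary = [0.0]
for _ in range(499):
    stationary.append(0.5 * stationary[-1] + rng.gauss(0.0, 1.0))

# Simulated random walk (unit root): y_t = y_{t-1} + noise.
random_walk = [0.0]
for _ in range(499):
    random_walk.append(random_walk[-1] + rng.gauss(0.0, 1.0))

t_stationary = dickey_fuller_t(stationary)
t_random_walk = dickey_fuller_t(random_walk)
print(t_stationary, t_random_walk)
```

The stationary series yields a statistic far below −2.86, while the random walk typically does not. The statistic must be compared against Dickey-Fuller critical values rather than the usual t-distribution.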

DM test statistics comparing RF to benchmark models across datasets and loss functions

Dataset	Benchmark model	MSE	MAE	MAPE
Total cases	ARIMA	−8.3813***	−12.6322***	−14.5876***
	SVR	−5.9865***	−9.5179***	−14.0600***
	RNN	−7.4782***	−11.2998***	−13.6478***
	LSTM	−7.7133***	−13.0574***	−14.5008***
	GRU	−4.9375***	−9.6530***	−12.0547***
Critical cases	ARIMA	−8.1962***	−12.2711***	−16.9087***
	SVR	−4.1383***	−5.0926***	−5.3284***
	RNN	−8.9064***	−15.7796***	−17.3167***
	LSTM	−7.8068***	−12.0917***	−17.4460***
	GRU	−6.9497***	−10.9590***	−11.1527***
Severe cases	ARIMA	−8.3806***	−12.6316***	−14.5646***
	SVR	−5.2350***	−7.4753***	−9.4834***
	RNN	−5.2963***	−9.1092***	−9.4592***
	LSTM	−7.0562***	−11.3440***	−13.4241***
	GRU	−7.3211***	−11.4403***	−13.3823***
Total deaths	ARIMA	−8.1962***	−12.2710***	−13.4789***
	SVR	−3.2430***	−3.9697***	−4.0081***
	RNN	−7.1227***	−9.9834***	−11.4793***
	LSTM	−6.4443***	−10.9583***	−14.1558***
	GRU	−8.8013***	−13.7539***	−13.8088***
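A Diebold-Mariano (DM) statistic of this kind can be sketched as follows. The sketch assumes one-step-ahead forecasts and omits the autocovariance correction used for longer horizons. It also assumes the loss differential is defined as loss(RF) minus loss(benchmark), which is consistent with the negative signs in the table but is not stated in this section. The function names and the error values are hypothetical.

```python
import math


def dm_statistic(errors_a, errors_b, loss=lambda e: e * e):
    """Diebold-Mariano statistic for two forecast error series.

    d_t = loss(errors_a[t]) - loss(errors_b[t]); the statistic is the mean
    of d_t divided by its estimated standard error. Negative values mean
    model A has the lower average loss. One-step-ahead simplification:
    no autocovariance terms are included in the variance estimate.
    """
    d = [loss(a) - loss(b) for a, b in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    return mean_d / math.sqrt(var_d / n)


def two_sided_p_value(statistic):
    """Two-sided p-value under the asymptotic standard normal distribution."""
    return math.erfc(abs(statistic) / math.sqrt(2.0))


# Hypothetical forecast errors: model A is consistently closer than model B.
errors_a = [0.1, -0.2, 0.1, 0.0, -0.1, 0.2]
errors_b = [1.0, -1.5, 2.0, -0.5, 1.2, -1.8]

dm_mse = dm_statistic(errors_a, errors_b)            # squared-error loss
dm_mae = dm_statistic(errors_a, errors_b, loss=abs)  # absolute-error loss
print(dm_mse, two_sided_p_value(dm_mse))
print(dm_mae, two_sided_p_value(dm_mae))
```

Swapping the two error series flips the sign of the statistic, so the sign convention has to be fixed before reading a table of DM values.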

Summary statistics

	Count	Mean	Std Dev	Min	25%	50%	75%	Max	Kurtosis	Skewness
Total cases	499	8841.58	5911.12	146.05	3737.57	8218.49	12979.51	21393.05	−0.8752	0.4086
Severe cases	499	4210.27	2814.82	69.55	1779.80	3913.57	6180.72	10187.17	−0.8752	0.4086
Critical cases	499	1472.06	983.35	25.06	616.25	1359.01	2158.80	3570.59	−0.8753	0.4116
Total deaths	499	588.83	393.34	10.03	246.50	543.60	863.52	1428.24	−0.8753	0.4116
