
Nvidia's Stock Returns Prediction Using Machine Learning Techniques for Time Series Forecasting Problem

Open Access | Jan 2021

Figures & Tables

Fig. 1

Algorithm of model building. Source: Author.

Fig. 2

Nvidia stock returns on the in-sample set. Source: Authors' calculations.

Fig. 3

Nvidia stock returns on the out-of-sample set. Source: Authors' calculations.

Fig. 4

Results of the Ljung-Box test for the in-sample and out-of-sample sets. Notes: The figure presents results of the Ljung-Box test of the white-noise hypothesis for Nvidia's stock returns on the in-sample and out-of-sample sets. Source: Authors' calculations.
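
For readers who want to reproduce a test of this kind, a minimal sketch using statsmodels follows; the placeholder series and the lag range are assumptions, not the paper's own code.

```python
# Minimal sketch of a Ljung-Box white-noise test; "returns" is a placeholder
# for Nvidia's stock returns, and the maximum lag of 20 is an assumption.
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

returns = np.random.default_rng(0).normal(size=1000)  # placeholder series

lb = acorr_ljungbox(returns, lags=20, return_df=True)
print(lb[["lb_stat", "lb_pvalue"]])  # large p-values fail to reject white noise
```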

Fig. 5

ACF and PACF for the in-sample and out-of-sample sets. Notes: The figure presents autocorrelation and partial autocorrelation plots of Nvidia's stock returns on the in-sample and out-of-sample sets. Source: Authors' calculations.
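
A corresponding plotting sketch, again with a placeholder series; the lag count is an assumption.

```python
# Sketch of ACF/PACF plots as in Fig. 5; the series and lag count are placeholders.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

returns = np.random.default_rng(0).normal(size=1000)  # placeholder series

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(returns, lags=40, ax=ax1)   # autocorrelation function
plot_pacf(returns, lags=40, ax=ax2)  # partial autocorrelation function
plt.tight_layout()
plt.show()
```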

Fig. 6

Performance on the test set of the best model in the research, SVR (based on stationary variables). Source: Authors' calculations.

Results of the Augmented Dickey-Fuller test for the in-sample and out-of-sample sets

| Test statistic (in-sample) | p-value (in-sample) | Test statistic (out-of-sample) | p-value (out-of-sample) |
|---|---|---|---|
| −11.01 | <0.0001 | −11.09 | <0.0001 |
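
As an illustration of how such a test is run (not the paper's code; the input series is a placeholder), a sketch with statsmodels:

```python
# Sketch of an Augmented Dickey-Fuller test; the input series is a placeholder.
import numpy as np
from statsmodels.tsa.stattools import adfuller

returns = np.random.default_rng(0).normal(size=1000)  # placeholder series

stat, pvalue, *_ = adfuller(returns)  # H0: the series contains a unit root
print(f"test statistic = {stat:.2f}, p-value = {pvalue:.4f}")
# Values like those in the table (statistic near -11, p < 0.0001) reject H0,
# i.e. the returns are stationary on both sets.
```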

Results of singular models on the validation and test sets (based on stationary variables)

| Model (number of attributes) | Set | Hyperparameters | RMSE | MAE | MedAE |
|---|---|---|---|---|---|
| SVR (20) | Validation | C = 0.005206; epsilon = 0.087308 | 0.026924 | 0.019478 | 0.014985 |
| SVR (20) | Test | C = 0.005206; epsilon = 0.087308 | 0.036014 | 0.024916 | 0.016682 |
| KNN (20) | Validation | Minkowski power = 2; k = 7; weight function = uniform | 0.026328 | 0.020331 | 0.016199 |
| KNN (20) | Test | Minkowski power = 2; k = 7; weight function = uniform | 0.039305 | 0.025935 | 0.017202 |
| XGBoost (27) | Validation | max depth = 7; subsample = 0.760762; colsample by tree = 0.199892; lambda = 0.345263; gamma = 0.000233; learning rate = 0.2 | 0.027622 | 0.020678 | 0.016553 |
| XGBoost (27) | Test | max depth = 7; subsample = 0.760762; colsample by tree = 0.199892; lambda = 0.345263; gamma = 0.000233; learning rate = 0.2 | 0.038848 | 0.027218 | 0.019782 |
| LGBM (43) | Validation | number of leaves = 58; min data in leaf = 21; ETA = 0.067318; max drop = 52; L1 regularization = 0.059938; L2 regularization = 0.050305 | 0.025905 | 0.018803 | 0.014339 |
| LGBM (43) | Test | number of leaves = 58; min data in leaf = 21; ETA = 0.067318; max drop = 52; L1 regularization = 0.059938; L2 regularization = 0.050305 | 0.038870 | 0.026283 | 0.016467 |
| LSTM (20) | Validation | H1 | 0.026565 | 0.019741 | 0.014537 |
| LSTM (20) | Test | H1 | 0.036705 | 0.024918 | 0.016772 |
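
For orientation, a sketch of how the table's hyperparameters map onto common library APIs. Everything not listed in the table (remaining defaults, data handling) is an assumption, and treating LGBM's "max drop" as a DART-boosting parameter is an inference from the parameter name.

```python
# Sketch mapping the reported hyperparameters onto scikit-learn, XGBoost and
# LightGBM APIs; unlisted settings are left at library defaults (an assumption).
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor

svr = SVR(C=0.005206, epsilon=0.087308)
knn = KNeighborsRegressor(n_neighbors=7, p=2, weights="uniform")
xgb = XGBRegressor(max_depth=7, subsample=0.760762, colsample_bytree=0.199892,
                   reg_lambda=0.345263, gamma=0.000233, learning_rate=0.2)
# "Max drop" is a DART-specific LightGBM parameter, so DART boosting is assumed:
lgbm = LGBMRegressor(boosting_type="dart", num_leaves=58, min_child_samples=21,
                     learning_rate=0.067318, max_drop=52,
                     reg_alpha=0.059938, reg_lambda=0.050305)
```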

Performance of ensemble models on test set (based on all models)

| Number of models | Models (weight) | RMSE | MAE | MedAE |
|---|---|---|---|---|
| 2 | S+NS SVR (0.504057), S+NS LightGBM (0.495943) | 0.038730 | 0.025882 | 0.017714 |
| 3 | S+NS SVR (0.337508), S+NS LightGBM (0.332075), S LightGBM (0.330417) | 0.038593 | 0.025959 | 0.017321 |
| 4 | S+NS SVR (0.255711), S+NS LightGBM (0.251594), S LightGBM (0.250338), S KNN (0.242357) | 0.038436 | 0.025665 | 0.017152 |
| 5 | S+NS SVR (0.206542), S+NS LightGBM (0.203217), S LightGBM (0.202202), S KNN (0.195756), S LSTM (0.192283) | 0.037734 | 0.025267 | 0.016599 |
| 6 | S+NS SVR (0.173976), S+NS LightGBM (0.171176), S LightGBM (0.170321), S KNN (0.164891), S LSTM (0.161965), S SVR (0.157671) | 0.036810 | 0.024751 | 0.017430 |
| 7 | S+NS SVR (0.150427), S+NS LightGBM (0.148006), S LightGBM (0.147266), S KNN (0.142572), S LSTM (0.140042), S SVR (0.136329), S+NS KNN (0.135359) | 0.036871 | 0.024897 | 0.016953 |
| 8 | S+NS SVR (0.133177), S+NS LightGBM (0.131034), S LightGBM (0.130379), S KNN (0.126223), S LSTM (0.123983), S SVR (0.120696), S+NS KNN (0.119837), S XGBoost (0.114672) | 0.036746 | 0.024681 | 0.016470 |
| 9 | S+NS SVR (0.119825), S+NS LightGBM (0.117896), S LightGBM (0.117307), S KNN (0.113568), S LSTM (0.111553), S SVR (0.108595), S+NS KNN (0.107822), S XGBoost (0.103175), S+NS XGBoost (0.100259) | 0.036898 | 0.024757 | 0.016450 |
| 10 | S+NS SVR (0.109125), S+NS LightGBM (0.107368), S LightGBM (0.106832), S KNN (0.103426), S LSTM (0.101591), S SVR (0.098897), S+NS KNN (0.098193), S XGBoost (0.093961), S+NS XGBoost (0.091305), S+NS LSTM (0.089302) | 0.036899 | 0.024915 | 0.016466 |
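
Here S denotes a model built on stationary variables and S+NS one built on stationary and non-stationary variables (cf. the table titles below). The weights in each row sum to one; how they were derived is not restated in this table, though the two-model row is numerically consistent with weights proportional to each model's inverse squared validation RMSE (an inference, not a stated result). A minimal sketch of applying such fixed weights:

```python
# Minimal weighted-average ensemble sketch; the prediction vectors are
# placeholders, only the weights come from the first row of the table.
import numpy as np

def ensemble_predict(predictions, weights):
    """Weighted average of per-model prediction vectors (weights re-normalized)."""
    total = sum(weights.values())
    return sum(weights[name] / total * pred for name, pred in predictions.items())

preds = {
    "S+NS SVR": np.array([0.010, -0.020]),       # placeholder predictions
    "S+NS LightGBM": np.array([0.012, -0.018]),  # placeholder predictions
}
w = {"S+NS SVR": 0.504057, "S+NS LightGBM": 0.495943}
print(ensemble_predict(preds, w))
```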

Performance of best models on test set (ensemble and primary models)

| Metric | Best stationary ensemble model | Best stationary + non-stationary ensemble model | Best ensemble model based on all models | Best stationary model (SVR) | Best stationary + non-stationary model (LGBM) | Naive model |
|---|---|---|---|---|---|---|
| RMSE | 0.036106 | 0.038053 | 0.036746 | 0.036014 | 0.037284 | 0.050244 |
| MAE | 0.024094 | 0.026068 | 0.024681 | 0.024916 | 0.026295 | 0.034908 |
| MedAE | 0.015307 | 0.016645 | 0.016470 | 0.016682 | 0.017959 | 0.022378 |

Performance of ensemble models on test set (models based on stationary and non-stationary variables)

| Number of models | Models (weight) | RMSE | MAE | MedAE |
|---|---|---|---|---|
| 2 | SVR (0.504057), LightGBM (0.495943) | 0.038730 | 0.025882 | 0.017714 |
| 3 | SVR (0.346773), LightGBM (0.341191), KNN (0.312036) | 0.038314 | 0.026003 | 0.016682 |
| 4 | SVR (0.268786), LightGBM (0.264459), KNN (0.241861), XGBoost (0.224895) | 0.038301 | 0.025793 | 0.016876 |
| 5 | SVR (0.220323), LightGBM (0.216777), KNN (0.198253), XGBoost (0.184346), LSTM (0.180301) | 0.038053 | 0.026068 | 0.016645 |

Performance of models on test set (based on stationary and non-stationary variables)

| Metric | SVR | KNN | XGBoost | LSTM | LGBM | Best ensemble model | Naive model |
|---|---|---|---|---|---|---|---|
| RMSE | 0.041904 | 0.039313 | 0.040685 | 0.039593 | 0.037284 | 0.038053 | 0.050244 |
| MAE | 0.025875 | 0.026863 | 0.026906 | 0.028891 | 0.026295 | 0.026068 | 0.034908 |
| MedAE | 0.017279 | 0.018946 | 0.016939 | 0.020576 | 0.017959 | 0.016645 | 0.022378 |

Performance of ensemble models on test set (models based on stationary variables)

| Number of models | Models (weight) | RMSE | MAE | MedAE |
|---|---|---|---|---|
| 2 | LightGBM (0.508099), KNN (0.491901) | 0.038571 | 0.025784 | 0.017147 |
| 3 | LightGBM (0.342575), KNN (0.331655), LSTM (0.325770) | 0.037403 | 0.025111 | 0.015704 |
| 4 | LightGBM (0.260092), KNN (0.251801), LSTM (0.247333), SVR (0.240775) | 0.036181 | 0.024366 | 0.015990 |
| 5 | LightGBM (0.211671), KNN (0.204923), LSTM (0.201287), SVR (0.195950), XGBoost (0.186170) | 0.036106 | 0.024094 | 0.015307 |

Results of singular models on the validation and test sets (based on stationary and non-stationary variables)

| Model (number of attributes) | Set | Hyperparameters | RMSE | MAE | MedAE |
|---|---|---|---|---|---|
| SVR (27) | Validation | C = 0.005317; epsilon = 0.092179 | 0.025632 | 0.019126 | 0.015488 |
| SVR (27) | Test | C = 0.005317; epsilon = 0.092179 | 0.041904 | 0.025875 | 0.017279 |
| KNN (40) | Validation | Minkowski power = 1; k = 6; weight function = uniform | 0.027021 | 0.020110 | 0.013813 |
| KNN (40) | Test | Minkowski power = 1; k = 6; weight function = uniform | 0.039313 | 0.026863 | 0.018946 |
| XGBoost (74) | Validation | max depth = 3; subsample = 0.840403; colsample by tree = 0.605006; lambda = 4.461698; gamma = 0.000808; learning rate = 0.105 | 0.028021 | 0.021604 | 0.020396 |
| XGBoost (74) | Test | max depth = 3; subsample = 0.840403; colsample by tree = 0.605006; lambda = 4.461698; gamma = 0.000808; learning rate = 0.105 | 0.040685 | 0.026906 | 0.016939 |
| LGBM (80) | Validation | number of leaves = 32; min data in leaf = 38; ETA = 0.099519; max drop = 51; L1 regularization = 0.060221; L2 regularization = 0.050423 | 0.025840 | 0.019361 | 0.014083 |
| LGBM (80) | Test | number of leaves = 32; min data in leaf = 38; ETA = 0.099519; max drop = 51; L1 regularization = 0.060221; L2 regularization = 0.050423 | 0.037284 | 0.026295 | 0.017959 |
| LSTM (20) | Validation | H2 | 0.028334 | 0.021702 | 0.018201 |
| LSTM (20) | Test | H2 | 0.039593 | 0.028891 | 0.020576 |

Hyperparameter tuning algorithm

1. For each pair of sets (Xi, Yi) ∈ S = {(train, validation1), (train ∪ validation1, validation2), (train ∪ validation1 ∪ validation2, validation3)}, the following operations are performed:
a. the largest feasible group of hyperparameter candidates is selected, following best practice from the literature;
b. a one-step-ahead prediction is made, with Xi as the training set and Yi as the test set, and the model with the lowest RMSE is chosen; its hyperparameters are denoted Hi.
As a result, the set {H1, H2, H3} is obtained.
2. Each Hi is then used for three predictions, one on each pair from S. This yields three RMSE values, whose average is denoted Ai. As a result, the set {A1, A2, A3} is obtained.
3. Hj is chosen such that Aj = min{A1, A2, A3}. This is the best set of hyperparameters, which is expected to ensure a stable fit in future forecasts. A code sketch of this procedure follows the list.
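
A minimal sketch of the procedure, assuming scikit-learn-style models and a hypothetical candidate grid; only the split logic and the RMSE-based selection come from the description above.

```python
# Sketch of the walk-forward tuning above. `make_model` and `grid` are
# hypothetical stand-ins; `pairs` holds the three (X_train, y_train, X_val,
# y_val) splits corresponding to the set S.
from itertools import product
import numpy as np
from sklearn.metrics import mean_squared_error

def one_step_rmse(make_model, params, X_train, y_train, X_val, y_val):
    """Fit on the train part, predict the validation part, return RMSE."""
    model = make_model(**params).fit(X_train, y_train)
    return float(np.sqrt(mean_squared_error(y_val, model.predict(X_val))))

def tune(make_model, grid, pairs):
    candidates = [dict(zip(grid, vals)) for vals in product(*grid.values())]
    # Step 1: per split, keep the candidate H_i with the lowest RMSE.
    best_per_pair = [
        min(candidates, key=lambda p: one_step_rmse(make_model, p, *pair))
        for pair in pairs
    ]
    # Steps 2-3: score each H_i on all three splits, average the RMSEs,
    # and return the H_j with the smallest average A_j.
    return min(
        best_per_pair,
        key=lambda p: np.mean([one_step_rmse(make_model, p, *pair) for pair in pairs]),
    )
```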

Performance of models on test set (based on stationary variables)

| Metric | SVR | KNN | XGBoost | LSTM | LGBM | Best ensemble model | Naive model |
|---|---|---|---|---|---|---|---|
| RMSE | 0.036014 | 0.039305 | 0.038848 | 0.036705 | 0.038870 | 0.036106 | 0.050244 |
| MAE | 0.024916 | 0.025935 | 0.027218 | 0.024918 | 0.026283 | 0.024094 | 0.034908 |
| MedAE | 0.016682 | 0.017202 | 0.019780 | 0.016772 | 0.016467 | 0.015307 | 0.022378 |
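
The three metrics used throughout the tables have standard definitions; a short sketch with placeholder data, not the paper's own code:

```python
# Standard definitions of the three reported error metrics; data is a placeholder.
import numpy as np
from sklearn.metrics import (mean_squared_error, mean_absolute_error,
                             median_absolute_error)

y_true = np.array([0.010, -0.020, 0.005])  # placeholder actual returns
y_pred = np.array([0.012, -0.015, 0.001])  # placeholder predicted returns

print("RMSE :", np.sqrt(mean_squared_error(y_true, y_pred)))
print("MAE  :", mean_absolute_error(y_true, y_pred))
print("MedAE:", median_absolute_error(y_true, y_pred))
```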
DOI: https://doi.org/10.2478/ceej-2021-0004 | Journal eISSN: 2543-6821 | Journal ISSN: 2544-9001
Language: English
Page range: 44 - 62
Published on: Jan 29, 2021
Published by: Faculty of Economic Sciences, University of Warsaw
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2021 Marcin Chlebus, Michał Dyczko, Michał Woźniak, published by Faculty of Economic Sciences, University of Warsaw
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.