
Figures & Tables

Figure 1:

Supervised learning algorithms used for big data analysis in selected articles.

Figure 2:

The results of forecasting the output of the 73 mm casing, tubing, and coupling using the SARIMAX model.

Figure 3:

Results of forecasting the output of the casing coupling using the SARIMAX model.

Analysis of the mathematical and statistical forecasting methods

Name of the method | Brief description of the method | Advantages of the method | Disadvantages of the method
Exponential smoothing [7]
Description: A time-series forecasting method that weights past observations with exponentially decaying weights.
Advantages:
  • Easy to implement
  • Takes recent observations into account
Disadvantages:
  • Sensitive to outliers/anomalies
  • Does not account for trends

Linear regression [8]
Description: A method that fits a linear relationship between independent and dependent variables.
Advantages:
  • Easy to interpret
  • Effective for linear dependencies
Disadvantages:
  • Suitable only for linear dependencies
  • Sensitive to outliers/anomalies

ARIMA [9]
Description: A method for modeling time series through autoregression, differencing, and moving-average components; seasonality is handled by its seasonal extension, SARIMA.
Advantages:
  • Captures the complex structure of time series
  • Adapts to different types of data
Disadvantages:
  • Requires defining model parameters
  • Difficult to interpret

Forecasting based on ML [10]
Description: The use of machine learning (ML) algorithms to forecast from historical data and external factors.
Advantages:
  • Captures complex nonlinear dependencies
  • Handles many input features
Disadvantages:
  • Requires a large amount of data for training
  • Requires substantial computing resources

Holt-Winters method [11]
Description: A method that extends exponential smoothing to account for seasonality and trend.
Advantages:
  • Takes trends and seasonality into account
  • Suitable for data with explicit cyclic behavior
Disadvantages:
  • Requires parameter tuning
  • Depends strongly on initial conditions

Prediction by the k-nearest neighbors method [12]
Description: A method based on the assumption that objects with similar attributes have similar values of the target variable.
Advantages:
  • Easy to implement
  • Does not require assumptions about the data structure
Disadvantages:
  • Sensitive to outliers
  • Requires choosing the parameter k

Principal component method [13]
Description: A method that reduces the dimensionality of data by projecting it onto a subspace of maximum variance.
Advantages:
  • Effective for a large number of features
  • Reduces the effect of multicollinearity
Disadvantages:
  • Components may lose interpretability
  • Does not capture nonlinear dependencies between variables

Facebook Prophet [8]
Description: A method developed by Facebook to forecast time series based on seasonality, holidays, and trends.
Advantages:
  • Easy to use
  • Takes seasonality and holidays into account
Disadvantages:
  • Does not always perform well on short time series
  • Does not take into account external factors

Neural network method [14]
Description: A method that uses artificial neural networks (ANNs) to make predictions by learning from historical data.
Advantages:
  • Captures complex nonlinear dependencies
  • Works with different types of data
Disadvantages:
  • Requires a large amount of data for training
  • Difficult to configure and interpret

Random forest method [15]
Description: A method based on constructing an ensemble of decision trees and averaging their predictions.
Advantages:
  • Resistant to overfitting
  • Works with a large number of features
Disadvantages:
  • Prone to overfitting with suboptimal parameter settings
  • Takes time to train

Time-series method SARIMA [16]

Hybrid models [17]
Description: Methods that combine several different forecasting methods to improve forecast accuracy.
Advantages:
  • Work with a variety of data characteristics
  • Improve forecast accuracy
Disadvantages:
  • Require additional configuration
  • Difficult to implement

Gaussian processes [18]
Description: Methods that model random processes, including time series, using Gaussian distributions.
Advantages:
  • Take into account uncertainty in forecasts
  • Model nonlinear dependencies
Disadvantages:
  • Require substantial computing resources to evaluate
  • Difficult to interpret

Bayesian methods [19]
Description: Methods based on Bayesian statistics for modeling and forecasting.
Advantages:
  • Take into account uncertainty in forecasts
  • Allow forecasts to be updated as new information arrives
Disadvantages:
  • Require the specification of prior distributions
  • Involve complex calculations

Gradient boosting [20]
Description: A method that builds an ensemble of weak models, with each subsequent model correcting the errors of the previous one.
Advantages:
  • High prediction accuracy
  • Resistant to overfitting
Disadvantages:
  • Demanding on computing resources
  • Difficult to tune parameters

LSTM [21]
Description: A method that uses recurrent neural networks (RNNs) with long short-term memory (LSTM) cells to analyze sequential data.
Advantages:
  • Captures long-term dependencies
  • Effective when working with sequential data
Disadvantages:
  • Requires a large amount of data for training
  • Requires substantial computing resources

Method of graphical models [22]
Description: A method that models dependencies between variables as a graph, where nodes represent variables and edges represent dependencies.
Advantages:
  • Takes into account the structure of dependencies between variables
  • Works with different types of data
Disadvantages:
  • Requires specification of the graph structure
  • Difficult to interpret

Quantile regression [23]
Description: A method that estimates not only the average value of the target variable but also its quantiles.
Advantages:
  • Allows us to estimate prediction intervals for forecasts
  • Takes into account different levels of uncertainty
Disadvantages:
  • Requires more data to estimate quantiles accurately
  • Highly sensitive to outliers/anomalies

Method of extreme cases [24]
Description: A method based on the analysis of extreme data values to predict rare events or extreme conditions.
Advantages:
  • Effective in predicting rare events
  • Used for risk assessment
Disadvantages:
  • Requires a large amount of data on extreme values
  • Difficult to interpret

Time-series decomposition method [25]
Description: A method that divides a time series into components (trend, seasonality, and residuals) and then predicts each component separately.
Advantages:
  • Takes into account various characteristics of time series
  • Effective in predicting nonstationary series
Disadvantages:
  • Requires setting the parameters of the decomposition method
  • Results can be difficult to analyze

Graph neural networks [26]
Description: A method that combines graph models and neural networks for data structure analysis and forecasting.
Advantages:
  • Captures complex dependencies between variables
  • Works with graph data
Disadvantages:
  • Requires a large amount of data for training
  • Difficult to configure

Temporal neural autoencoder [27]
Description: A method that uses neural autoencoders to learn the internal structure of time series and then predict them.
Advantages:
  • Captures complex dependencies in the data
  • Works with different types of time series
Disadvantages:
  • Requires substantial computing resources
  • Requires a large amount of data for training
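
To make the first entry of the table concrete, the following is a minimal sketch of simple exponential smoothing in plain Python. It is an illustration only, not the authors' implementation: the function names, the choice to initialise with the first observation, and the sample series are all assumptions introduced here. It shows both the exponentially decaying weighting of past observations and the disadvantage listed in the table, namely that the basic method lags behind a trend.

```python
def simple_exponential_smoothing(series, alpha):
    """Return smoothed levels s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    Older observations receive geometrically decaying weights:
    x_{t-k} contributes with weight alpha * (1 - alpha) ** k.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        raise ValueError("series must not be empty")
    # Assumption: the level is initialised with the first observation.
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def forecast_next(series, alpha):
    """One-step-ahead forecast: the last smoothed level."""
    return simple_exponential_smoothing(series, alpha)[-1]


# Hypothetical data (not from the article): a series rising by 2 per step.
trend = [10, 12, 14, 16, 18, 20]
prediction = forecast_next(trend, alpha=0.5)
print(prediction)  # prints 18.0625
```

If the series kept rising by 2 per step, the next value would be 22, so the forecast of 18.0625 falls short. This is the trend limitation that the Holt-Winters method in the table is designed to address by adding trend and seasonal components.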

Language: English
Submitted on: Jun 10, 2024
Published on: Mar 27, 2025
Published by: Professor Subhas Chandra Mukhopadhyay
In partnership with: Paradigm Publishing Services
Publication frequency: 1 time per year

© 2025 Ali Sajae Mannaa, Tatiana A Makarenya, Alexey I Kalinichenko, Svetlana V Petrenko, published by Professor Subhas Chandra Mukhopadhyay
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.