
Comparative analysis of types of neural networks for solving problems of modeling socioeconomic systems (forecasting of production using neural networks, for example, on an LSTM-type network)

Open Access | Mar 2026


I.
Introduction

In our country, there is a system for tracking indicators of the implementation of the concept of sustainable development, based on the assessment of indicators of the 17 Sustainable Development Goals. A system of statistical indicators exists for measuring these indicators, as evidenced by the official statistical information on the website of the Federal State Statistics Service. However, there are no official statistics on the quantity of products produced broken down by product type, since the current accounting system is organized by type of economic activity. Under the conditions of the imposed economic sanctions, the economic development of the country requires a clear accounting system for the types of products produced. Such a system could be used not only to make managerial decisions on what products of the military-industrial complex to produce and in what quantities but also to plan the need for certain types of products and resources for the restoration of the new territories of the country. Unfortunately, no such statistical accounting system exists. Consider the GRP dynamics presented in Figure 1 [10], which illustrates the growth of gross regional product in all federal districts for the period from 2014 to 2020; identifying the detailed factors of GRP growth, other than types of economic activity, from these data is not possible.

Figure 1:

GRP dynamics by federal districts from 2014 to 2020.

For example, the statistical collections on scientific and technical progress in the USSR show accounting by branches of the national economy broken down by product type. Such detailed accounting of production would allow the use of scientific methods and information resources, including neural networks, for forecasting and planning socioeconomic development. To develop proposals on the use of neural networks for drawing up an industrial production plan, we analyze the most well-known types of neural networks. A neural network can solve the problem of modeling socioeconomic systems. By a system S we mean a set of elements that interact with each other and possess integrity and unity [1]. The structure of a system is the set of its elements and the connections between them [2].

Based on the above definitions, we will assume that the industrial complex of the region (federal district) is a socioeconomic system that is a subsystem of the industrial complex of the country and of the economy of the country as a whole. Therefore, in order to forecast the development of the socioeconomic system, it is necessary to have information about the elements of the system, in our case, about the number of industrial enterprises, the products produced, the resources used, the technologies used, and the consumers of products. The existing statistical accounting system contains information on GRP, valuation by type of economic activity, and the number of enterprises by type of economic activity. Thus, it is not possible to determine the elements and functions of the socioeconomic system from official statistical data. This hinders the use of modern information resources for forecasting and modeling industrial development and for achieving target quantitative values of a number of indicators, i.e., the indicators of regional development programs.

a.
Analysis of the results of previous work

The development of the field of science related to artificial intelligence was preceded by research on the nature of thinking, the basic principles of which were formulated by Aristotle. However, significant advances occurred in the 20th century as a result of research in neuroscience, which established how neurons work and how they are interconnected. On this basis, scientists were able to create mathematical models for building artificial neural networks.

In the 1940s and 1960s, there was a definite breakthrough in neural network research. Researchers developed the first learning algorithms and proposed models simulating simple functions of neurons [7, 18]. However, these simple networks were not capable of solving complex problems.

The next goal was the construction of powerful multilayer networks and their training. The results were trainable neural network models (supervised or unsupervised) [5, 8, 9, 13].

The work of Raul Rojas, published in 1996 [19], can be considered fundamental research in the field of neural networks: he presented theoretical laws and models and combined them into a general theory of artificial neural networks. Using biology as an example, the author showed how the properties of models change with the introduction of more general computational elements and network topologies.

Today, there are a sufficient number of neural network models that have their own characteristics. The fundamental foundations of neural network architecture, learning methods for multilayer, radial, and recurrent networks, as well as deep learning (DL) methods used in industry are presented in previous studies [11, 17, 22, 25].

Research on the mathematics of neural networks in conjunction with the Python programming language is also important. The studies of the researchers [20, 23, 24] explain the extensive concepts of creating their own neural networks capable of learning and creating solutions, including those for modeling situations.

The field of research pertaining to demand forecasting and water management encompasses a diverse array of models and methodologies aimed at enhancing the precision and efficacy of these processes. In the field of retail demand forecasting, the incorporation of macroeconomic variables, such as the consumer price index and the unemployment rate, has been demonstrated to enhance the precision of DL long short-term memory (LSTM) models. This is evidenced by a reduction in forecast errors and the identification of the significance of these variables in demand prediction [32]. The deployment of hybrid models, such as convolutional neural networks (CNNs) coupled with LSTM units, for climate forecasting in China has demonstrated high accuracy in predicting extreme climate events. This makes these models a promising avenue for the prevention of natural disasters [27]. In the petroleum industry, effective produced water (PW) management necessitates the implementation of hybrid treatment technologies, as single-component methods are inadequate for addressing the presence of multiple pollutants. The integration of membrane filtration and thermal treatment represents a promising approach to address the fouling problem [1]. Research has demonstrated the value of adaptive models that incorporate time series and macroeconomic data to enhance the precision of demand forecasting in the retail and oil industries [8]. In the manufacturing sector, the application of multivariate models, such as ARIMAX and machine learning, has demonstrated that the consideration of leading indicators can facilitate more accurate prediction of component demand throughout the product life cycle. These approaches have been demonstrated to yield superior results compared to univariate models [18].
Concurrently, the advancement of hybrid models, such as NN-ARMAX, enables the incorporation of external macroeconomic variables and the enhancement of inventory management practices within the manufacturing sector [26]. With respect to component demand management, the integration of machine learning with conventional methodologies has also been shown to offer advantages, particularly when utilizing data from disparate life-cycle phases [28]. In the agricultural industry, the implementation of DL methodologies for yield prediction necessitates meticulous data preprocessing and cross-validation. Simpler models, such as those based on decision trees, often demonstrate comparable performance to DL models, underscoring the significance of selecting appropriate variables [16]. In the oil industry, the CNN-GRU model has demonstrated superior accuracy for forecasting oil production in comparison to conventional machine learning techniques. This suggests potential for optimizing field development [9]. The integration of hybrid models is also pertinent to water resources management. Membrane filtration, thermal treatment, and biological methods can be employed in conjunction to treat wastewater generated by the oil industry [10]. These technologies necessitate further investigation to enhance the sustainable utilization of water resources, particularly in contexts of water scarcity [24]. The use of advanced neural network models and the application of machine learning to the management techniques of weakly structured and/or unstructured complex systems will ensure the management of such systems in real time.

II.
Research Methods and Materials

Currently, there are many types of neural networks, but they can all be classified as single-layer or multilayer, and as feedforward or recurrent.

Let us consider the types of neural networks according to the direction of information flow between neurons.

Radial basis function networks (RBFNs) are feedforward networks trained using a supervised learning algorithm.

The role of hidden neurons in a radial basis neural network is played by radial basis functions, a special class of functions whose value decreases monotonically as the distance of the input vector X from the center C increases [31]. This is a feedforward network.
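The monotone decrease described above can be illustrated with a Gaussian radial basis function, a commonly used choice (the width parameter and function name below are illustrative assumptions, not taken from the article):

```python
import numpy as np

def gaussian_rbf(x, center, width=1.0):
    """Gaussian radial basis function: its value depends only on the
    distance between the input vector x and the center C, and decreases
    monotonically as that distance grows."""
    distance = np.linalg.norm(np.asarray(x) - np.asarray(center))
    return np.exp(-(distance ** 2) / (2 * width ** 2))

# The closer the input is to the center, the larger the activation.
near = gaussian_rbf([1.0, 1.0], center=[1.0, 1.0])  # distance 0 -> maximum value 1.0
far = gaussian_rbf([3.0, 3.0], center=[1.0, 1.0])   # larger distance -> smaller value
```

A hidden layer of such functions with different centers, followed by a linear output layer, is what makes an RBFN a feedforward network.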

Figure 1:

RBFNs. RBFNs, radial basis function networks.

Figure 2:

CNNs. CNNs, convolutional neural networks.

Figure 3:

LSTM networks. LSTM, long short-term memory.

Figure 4:

RNNs. RNNs, recurrent neural networks.

Figure 5:

GANs. GANs, generative adversarial networks.

Figure 6:

MLPs. MLPs, multilayer perceptrons.

Figure 7:

SOMs. SOMs, self-organizing maps.

Figure 8:

Deep belief networks (DBNs). RBM, restricted Boltzmann machine.

CNNs are used for pattern recognition and image classification.

LSTM networks are used to generate text.

Recurrent neural networks (RNNs) are used to process sequences of data.

Generative adversarial networks (GANs) are built from a combination of two neural networks: a generator that produces patterns and a discriminator that tries to distinguish real patterns from generated ones.

Multilayer perceptrons (MLPs) are the simplest fully connected feedforward networks. They can be used to build regression models and to approximate time series.

Self-organizing maps (SOMs) can be used to assign input data to a particular cluster.

Deep belief networks can probabilistically reconstruct their inputs and can be trained for classification.

Restricted Boltzmann machines (RBMs) model the probability distribution of input data samples. This type of network solves problems of dimensionality reduction, classification, and topic modeling.

Autoencoders learn to reproduce their input data at the output, which is used for dimensionality reduction and noise smoothing.

Figure 9:

RBMs. RBMs, restricted Boltzmann machines.

Figure 10:

Autoencoders.

Currently, natural language processing (NLP) is actively developing; it extracts contextual information from unstructured data, including news feeds, analyst calls, and other online content. This information is used as an indicator to improve forecasting accuracy, including for the stock market, where support vector machine (SVM) models have shown the best results, with accuracy ranging from 72% to 99%. These models performed exceptionally well in predicting the S&P Index [10]. There is no information on the use of this approach in forecasting production output.

A wide range of applications of AI capabilities in the financial sector, robotic systems, and the labor market was discussed by Ivanovsky [25].

There are examples of the use of neural networks in marketing. In Russia, a similar neural network is used by Yandex Zen. In the five years since its launch in 2015, the service increased its traffic to 50 million people, according to statistics [29].

The authors of the following study wanted to find out whether weather data could improve the accuracy of product sales forecasts and to create a corresponding forecasting model for clothing sales. The developed model used the basic attributes of clothing data, historical sales data, and weather data. The resulting prediction was based on random forest, XGB, and GBDT combined with a stacking strategy. The study also found that the summation strategy model outperformed the voting strategy model, with an average reduction of 49.28% in mean square error (MSE). Apparel managers can use this model to forecast sales when creating sales plans based on weather information [5].

The authors of the study [15] considered the use of tree-based ensemble forecasting, in particular, using additional tree regressors (ETRs) and LSTM networks. Using 6 years of historical demand data from a retail company, the dataset includes daily demand metrics for more than 330 products, with a total of 5.2 million records. Additionally, external variables such as meteorological and COVID-19-related data are integrated into the analysis. The resulting evaluation, covering three categories of perishable products, shows that the ETR model outperforms LSTM in terms of MAPE, MAE, RMSE, and R2. This difference in rates is especially pronounced for fresh meat products and is insignificant for fruit products. These ETR results were evaluated along with three other tree ensemble methods, namely, XGBoost, random forest regression (RFR), and gradient boosting regression (GBR). The comparable performance of these four tree ensemble methods reinforces their comparative analysis with LSTM-based DL models [15].

The authors of the study [4] presented the first experience of applying DL methods in the context of the industrial supply chain of a Moroccan company that specialized in the production of electrical products. The LSTM-GRU model, which combines two DL models, LSTM and GRU, was determined to be more powerful than the LSTM and GRU models. This is the most accurate because it can better track changes in demand [4].

III.
Research Results and Discussion

Despite the large number of types of neural networks, the forecasting and modeling problems identified in the presented analysis can theoretically be solved by three types of neural networks: recurrent networks, radial basis function networks, and MLPs.

When choosing a model to predict the volume of output for several years in advance, it was decided to choose an RNN of the LSTM type. This choice is based on several key factors.

First, RNNs, and especially LSTMs, are well suited to working with sequential data such as time series. They are able to account for dependencies between data at different time steps, making them an ideal choice for time-series forecasting of output.

Second, LSTM has the unique ability to preserve information about long-term dependencies in data. This allows the model to take into account important trends and cyclical patterns that may affect future output.

It should also be noted that the choice of LSTM was made based on the results of preliminary research and experiments with various machine learning models. The LSTM showed good performance and learning ability on the provided data, confirming its suitability for this forecasting task.

Thus, the LSTM RNN was selected as a model to forecast output 4 years in advance based on its ability to deal with sequential data and account for long-term dependencies in the data.
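The article does not spell out how a one-step model is extended to a multi-year horizon. One common approach is recursive forecasting, sketched below; the function names, window length, and the toy stand-in for the trained model are illustrative assumptions, not the authors' implementation:

```python
def recursive_forecast(model_step, history, horizon, window=12):
    """Iteratively forecast `horizon` future points with a one-step model:
    each new prediction is appended to the input window and fed back in,
    so multi-year forecasts are built from repeated one-step predictions."""
    window_vals = list(history[-window:])
    preds = []
    for _ in range(horizon):
        nxt = model_step(window_vals)   # one-step-ahead prediction
        preds.append(nxt)
        window_vals = window_vals[1:] + [nxt]  # slide the window forward
    return preds

# Toy one-step "model" (the mean of the window) in place of a trained LSTM.
preds = recursive_forecast(lambda w: sum(w) / len(w), list(range(24)), horizon=4)
```

With a trained Keras model, `model_step` would wrap `model.predict` on the current window reshaped to `(1, window, 1)`.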

To develop and configure the forecasting model, the Python programming language was used in conjunction with the TensorFlow library, which provides ample opportunities for creating and training neural networks.

The model development process began with defining its architecture. For time series such as product output data, RNNs, especially LSTMs, are ideal due to their ability to account for long-term dependencies in the data. Here is a piece of code illustrating the creation of an LSTM architecture using TensorFlow:

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(units=64, return_sequences=True,
                         input_shape=(X_train.shape[1], X_train.shape[2])),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.LSTM(units=64, return_sequences=True),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.LSTM(units=64),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(units=1)
])

model.compile(optimizer='adam', loss='mean_squared_error')

This is a sequential model with three LSTM layers of 64 neurons each; the first two return full sequences so that the next LSTM layer receives a sequence as input. Dropout layers follow each LSTM layer to reduce the risk of overfitting. Finally, a fully connected layer produces the final prediction.

After creating the model architecture, it must be configured: this includes choosing the loss function, the optimizer, and other training parameters. In this case, the Adam optimizer and the MSE loss function were chosen because they are well suited to regression problems such as product quantity forecasting.

To train the model, data on the number of products “Casing coupling” and “Tubing coupling 73 mm” were used for the period from the beginning of 2018 to the end of 2022. Before training, the data were preprocessed and prepared for input into the model. The preprocessing stage included scaling the data to normalize the values, dividing it into training and test sets, and converting the time series into fixed-length sequences for feeding into the LSTM network input.
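The preprocessing steps described above (scaling, windowing into fixed-length sequences, and a chronological train/test split) can be sketched roughly as follows; the window length, variable names, and split ratio are illustrative assumptions, not the authors' exact code:

```python
import numpy as np

def make_sequences(series, window=12):
    """Scale a 1-D time series to [0, 1] and slice it into fixed-length
    input sequences of shape (samples, window, 1) with next-step targets,
    the input format expected by an LSTM layer."""
    series = np.asarray(series, dtype=float)
    lo, hi = series.min(), series.max()
    scaled = (series - lo) / (hi - lo)      # min-max normalization
    X, y = [], []
    for i in range(len(scaled) - window):
        X.append(scaled[i:i + window])      # past `window` observations
        y.append(scaled[i + window])        # value to predict next
    return np.array(X).reshape(-1, window, 1), np.array(y)

# Chronological (unshuffled) split preserves the temporal order of the data.
X, y = make_sequences(np.arange(60), window=12)
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
```

Predictions made on the scaled data would then be mapped back to physical units by inverting the min-max transform.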

The model was trained over several epochs using the training dataset. During the training process, the model gradually adjusted its parameters to minimize the prediction error. To control overfitting and ensure the generalization ability of the model, regularization and early stopping mechanisms were used.
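The early stopping mechanism mentioned above can be illustrated with a simplified sketch of its logic; in a Keras workflow one would instead pass `tf.keras.callbacks.EarlyStopping` to `model.fit`, and the patience value here is an arbitrary assumption:

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the epoch index at which training would stop: when the
    validation loss has not improved for `patience` consecutive epochs.
    A simplified sketch of the mechanism, not the Keras callback itself."""
    best, best_epoch = float('inf'), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch   # new best validation loss
        elif epoch - best_epoch >= patience:
            return epoch                     # stop: no improvement for `patience` epochs
    return len(val_losses) - 1               # trained to the last epoch

# Validation loss improves for three epochs, then stalls: training stops early.
stop = early_stopping_epoch([0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.64])
```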

After completing the training of the model, validation was carried out on a test dataset to evaluate its performance on new data that were not used in the training process. This made it possible to evaluate the accuracy and reliability of the forecasts made by the model and to verify its suitability for use in forecasting the volume of output for the following years.

To assess the accuracy of the model, cross-validation was carried out on historical data. To do this, the data were divided into several time intervals; the model was trained on part of the data and then tested on the remaining data. This process was repeated several times to obtain a more robust estimate of the model's performance. The results of forecasting the output of the product "Casing pipe coupling" using the LSTM RNN are presented in Figure 11, and those for the product "Couplings for tubing pipes 73 mm" in Figure 12.

Figure 11:

Casing coupling product output forecast.

Figure 12:

Production forecast for tubing coupling 73 mm.
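The time-interval cross-validation described above is an expanding-window (rolling-origin) scheme; it can be sketched as follows, where the fold count and test-interval length are illustrative assumptions:

```python
def rolling_origin_splits(n_samples, n_folds=3, test_size=10):
    """Build (train_indices, test_indices) pairs for time-series
    cross-validation: each fold trains on an expanding prefix of the
    history and tests on the interval that immediately follows it,
    so test data always lie in the "future" of the training data."""
    splits = []
    for k in range(1, n_folds + 1):
        train_end = n_samples - (n_folds - k + 1) * test_size
        splits.append((list(range(train_end)),
                       list(range(train_end, train_end + test_size))))
    return splits

splits = rolling_origin_splits(60, n_folds=3, test_size=10)
# Fold 1 trains on samples 0..29 and tests on 30..39; fold 2 trains
# on 0..39 and tests on 40..49; fold 3 trains on 0..49 and tests on 50..59.
```

scikit-learn's `TimeSeriesSplit` implements the same idea for production use.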

After completing the cross-validation, the following results were obtained.

The mean absolute error (MAE) of the forecast for 2023 was 93 units of production for "Casing pipe coupling" and 86 units for "Couplings for tubing pipes 73 mm." This means that the model is, on average, off by 93 units when predicting the production volume of "Casing pipe coupling" and by 86 units for "Couplings for tubing pipes 73 mm."
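For reference, MAE is the average absolute deviation between actual and predicted values, expressed in the same units as the data (here, units of production). The numbers in the sketch below are illustrative only, not the article's data:

```python
import numpy as np

def mean_absolute_error(actual, predicted):
    """MAE: average of |actual - predicted|, in the units of the data."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

# Illustrative production volumes: errors of 50, 50, and 80 units.
mae = mean_absolute_error([1200, 1350, 1100], [1150, 1400, 1180])
```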

Based on these results, we can conclude that the model has acceptable accuracy in predicting the volume of output for the following years. However, it is worth noting that forecast accuracy can be improved by using a wider dataset.

Since the decision logic learned inside neural networks is largely opaque to the user, humans retain the main role in the process of making management decisions.

In the book by Larichev [26], a person is considered as the one who chooses the best solution. In the absence of accurate quantitative data on socioeconomic systems, including regional industrial systems, and information on products, uncertainty increases when making management decisions. Thus, only a specialist in the field of decision-making can make a qualified management decision. Therefore, a specialist who solves problems of forecasting product output must have skills and knowledge of system analysis, economic processes, mathematical modeling, and the use of neural networks. This is a complex multidisciplinary task of training qualified personnel, which must be solved now in order to keep up with the current state of development of science and technology.

IV.
Conclusion

The conducted research aimed at forecasting production using LSTM-type RNNs yielded encouraging results demonstrating high accuracy of modeling in a dynamically changing environment. LSTM models demonstrated the ability to take into account long-term dependencies in data, which is critical for forecasting production volumes, especially in manufacturing processes subject to seasonal and cyclical changes.

Using a dataset containing a time series of production indicators, it was found that the LSTM-based approach can significantly reduce the average forecast error compared to traditional methods such as moving averages and linear regression. Different options for network architecture and training parameters were successfully tested, which made it possible to optimize the model and achieve improved forecast accuracy.

The sensitivity analysis of the LSTM model to various parameters such as the number of layers, hidden state dimension, and learning rate was conducted to further improve the research results. It turned out that an increase in the number of hidden layers allows the model to take into account complex dependencies in the data. At the same time, it is important to maintain a balance to avoid overfitting, which can negatively affect the quality of forecasts.

The comparative analysis also showed that the use of additional input variables, such as macroeconomic indicators and demand data, can significantly improve the accuracy of forecasts. Integrating these data into the LSTM model provided a more complete picture of the factors affecting production.

An important aspect of LSTM application in industry is the need for interpretation of the results. Developing visualization tools that will allow analysts and production managers to understand which parameters have the greatest impact on forecasts will be a key step in the use of these technologies in practice.

Thus, the results of our study highlight the high efficiency of LSTM in forecasting production indicators and open up new horizons for optimizing production processes, which can significantly improve the competitiveness of enterprises.

The findings of the study confirm the potential value of LSTM application in the field of production process management, offering enterprises a tool for improving planning and reducing risks associated with demand uncertainty.

The utilization of RNNs, specifically the LSTM variant, demonstrates significant promise in the realm of forecasting and modeling across various complex and dynamic domains. The research analysis presented confirms the advantages of LSTM networks in effectively managing sequential dependencies, capturing long-term temporal patterns, and addressing the vanishing gradient problem commonly associated with traditional RNNs. This capability is crucial for tasks that require understanding patterns over extended sequences, making LSTMs particularly suitable for time-series forecasting, speech recognition, and other applications involving temporal or sequential data.

The performance analysis shows that LSTM models consistently outperform conventional statistical methods and basic machine learning algorithms, especially in cases where data exhibit nonlinearity and intricate dependencies. The model’s ability to dynamically update its state based on new input data allows it to adaptively learn and refine predictions over time, improving the accuracy and reliability of the forecasts.

Additionally, the research highlights the scalability of LSTM architectures, allowing for adaptation to multidimensional data and integration with other neural network layers (such as convolutional layers) to form hybrid models that can further enhance predictive power. This adaptability is critical for developing robust solutions in fields like financial market forecasting, environmental and climatic modeling, energy demand prediction, and even healthcare diagnostics.

Future work should aim at refining LSTM architectures through hyperparameter optimization, investigating advanced variations such as bidirectional LSTM (BiLSTM) and attention mechanisms to capture bidirectional dependencies and enhance interpretability. Furthermore, incorporating domain-specific knowledge and integrating LSTM models with external data sources could increase the predictive accuracy and contextual relevance of the results. In summary, the evidence supports the efficacy of LSTM networks as powerful tools for tackling complex forecasting and modeling challenges, offering a valuable foundation for further research and practical applications.

Language: English
Submitted on: Apr 4, 2024
Published on: Mar 15, 2026
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Tatiana A. Makarenya, Ali Sajae Mannaa, Alexey I. Kalinichenko, Svetlana V. Petrenko, published by Macquarie University, Australia
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.