The profitability of bank operations is a widely studied topic among academics and market practitioners, including investors. First, the profitability of the banking sector of the host country is a determinant of foreign direct investment (FDI) in the financial industry [Focarelli and Pozzolo, 2000]. Second, profitability affects the dynamics of regulatory capital accumulation, which drives the pace of lending [Bustamante et al., 2019]. Finally, knowledge of the determinants of the banking sector’s profitability makes it possible to answer whether, in what area, and to what extent changes in the macroeconomic or regulatory environment alter bank performance [Rahman et al., 2015]. In the last 15 years, the world has experienced the subprime crisis, the sovereign debt crisis, the migration crisis, and the COVID-19 pandemic. This study primarily focuses on the profitability of banking sectors in the European Union (EU) countries over the period covering most of the above events, that is, 2007–2021. The choice of research period is therefore driven by the intention to test whether regulatory action in response to the subprime and pandemic crises has altered the importance of individual determinants of profitability of European commercial banks. Previous studies [e.g., Bongini et al., 2019; Bonaccorsi di Patti and Palazzo, 2020] have focused on the post-subprime crisis period. This study extends the period of analysis to include the pandemic period. The choice of the EU stems from the intention to conduct the study in a single regulatory and supervisory environment.
We narrow our study to the identification of profitability determinants, deliberately separating this sphere from performance measures, including those that consider the market value of the bank (e.g., Tobin’s Q). Like Yüksel et al. [2018], we have chosen return on equity (ROE) as a measure of profitability because it lacks the disadvantage that characterizes the commonly used formula for the return on assets (ROA), that is, net income/total assets. In the ROA equation, the denominator (total assets) is financed by the contributions of both debt and equity investors, whereas net income reflects only the return to equity investors. Adjusting the ROA calculation so that the numerator includes the income of both creditors and owners would require information on interest expenses and tax rates by year, and these are not present in the databases that we have used. The omission of the net interest margin (NIM), on the other hand, is due to the lack of data for a broad sample of banks and the relatively small number of studies that verify this measure of profitability.
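The mismatch between the numerator and the denominator of ROA can be illustrated with a small numerical sketch. All figures below are hypothetical, and the adjusted ROA formula (net income plus after-tax interest expense over total assets) is one common textbook variant assumed here for illustration only; it is not taken from the databases used in this study.

```python
def roe(net_income, equity):
    """Return on equity in percent: return to owners over owners' funds."""
    return 100.0 * net_income / equity


def roa(net_income, total_assets):
    """Conventional return on assets in percent (net income / total assets)."""
    return 100.0 * net_income / total_assets


def adjusted_roa(net_income, interest_expense, tax_rate, total_assets):
    """Assumed ROA variant: the numerator also includes the after-tax
    return to creditors, so it matches the all-investor denominator."""
    return 100.0 * (net_income + interest_expense * (1.0 - tax_rate)) / total_assets


# Hypothetical banking-sector figures (same currency units throughout).
net_income = 10.0
equity = 100.0
total_assets = 1000.0
interest_expense = 20.0
tax_rate = 0.25

print("ROE:", roe(net_income, equity))
print("ROA:", roa(net_income, total_assets))
print("Adjusted ROA:", adjusted_roa(net_income, interest_expense, tax_rate, total_assets))
```

The adjusted variant needs two inputs (interest expense and tax rate) that the conventional ROA and ROE do not, which is the data requirement referred to above.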
In this study’s framework, EU countries are combined into four groups according to their characteristics. According to the criterion of similarity of trajectories of change for variables, each variable is assigned to one of three groups (macroeconomic, banking, and governance). The random forest model is used within each country group, with ROE as the dependent variable. Using the machine learning (ML) approach allows for high accuracy of ROE assessment. Identification of potential ROE determinants, however, is not straightforward due to the lack of direct interpretability of the results of ML models. The SHapley Additive exPlanations (SHAP) technique for examining interpretability, proposed by Lundberg and Lee [2017], is used to overcome these explainability problems.
The novelty of this study consists in concentrating on the entire banking sectors of all EU countries rather than individual banks and on the period between the subprime and pandemic crises. This is also the first study to highlight the importance of globalization, economic freedom, and the risk of banking sector insolvency as variables determining banking sector profitability.
The rest of the article is organized as follows. In Section 2, we review relevant literature and develop hypotheses. Section 3 describes the methodology and data. Section 4 reports our empirical results, Section 5 discusses them, and Section 6 covers conclusions, limitations, and directions for further studies.
Bonaccorsi di Patti and Palazzo [2020] suggest that GDP growth is one of the main factors supporting bank profitability. By contrast, Mondol and Wadud [2022] report a negative impact of GDP growth on banks’ ROA. Kok et al. [2015] add that the development of economic activity increases the demand for banking products and thus contributes to banks’ net interest income and fee income. We have not found any studies in the literature on the direct impact of public debt on bank profitability. However, since an increase in public debt weakens economic growth [Heimberger, 2021], we expect that the growth of public debt indirectly contributes to the deterioration of bank profitability.
Sekwati and Dagume [2023] state that inflation and unemployment have a negative impact on GDP growth. Since GDP growth is positively correlated with bank profitability, a negative impact of inflation on bank results can be expected. Inflation reduces the part of income that can be used for debt service, while capital and interest payments increase. This thesis is confirmed by Pradhan [2016]. On the other hand, higher inflation is accompanied by higher interest rates, and these, due to the possibility of leveraging the result in products such as cash loans and increasing the differential between the reference rate and the actual cost of funding, contribute to improving the bank’s net result [Caliskan and Kirer-Silva-Lecuna, 2020; Saif-Alyousfi, 2020]. Since employment has a positive impact on bank profitability [Genay and Podjasek, 2014] and labor cost moderation helps employment creation [Pierluigi and Roma, 2008], a negative influence of labor costs on bank performance can be expected.
Brahmaiah and Ranajee [2018] find that deposits/GDP positively affect ROA and ROE of banks. This suggests that the broader category, that is, savings, will also be positively correlated with the profitability of the banking sector.
Based on the inconclusive results of the studies cited above, the following hypothesis is formulated:
H1: The GDP growth, inflation, public debt, savings, and labor costs influence European banking sectors’ profitability.
Teixeira et al. [2019] highlight that higher levels of the institutional environment or stricter banking regulation typically reduce banks’ profitability. However, they note that this effect weakens considerably during crises. It implies that banking sectors subject to more rigorous regulation and monitoring by supervisory institutions are less exposed to the adverse effects of emergencies. Asteriou et al. [2021] find that corruption and transparency have a negative effect on bank profitability while economic freedom and regulation positively impact the performance of banks. Nasreen et al. [2020] state that financial globalization hinders the process of financial sector development, as expressed by the dimensions of the financial market. Athari [2021] pays attention to the global economic policy uncertainty as a factor determining profitability of banks. Based on the findings mentioned above, the following hypothesis is tested:
H2: Banking sectors’ profitability in the EU depends on the business and regulatory environment as well as openness of the economy.
Mondol and Wadud [2022] find that the level of profitability is determined mainly by the bank’s credit risk management quality. This is also confirmed by the study of Elkedag et al. [2020], according to whom the most important factor (besides real GDP growth) is the level of non-performing loans (NPL). Menicucci and Paolucci [2016] argue that higher loan loss provisions result in lower profitability levels. Bongini et al. [2019] conclude that the main reason for the dynamic changes in the profitability of European banks is the credit policy pursued. Therefore, the main factor influencing banks’ profitability during crises is the application of conservative lending policies and the effective recovery of NPL. Karkowska and Niedziółka [2019] also demonstrate that the profitability of European banks is positively affected by their credit policies. Hasanov et al. [2018] present a positive impact of bank liquidity risk on its profitability. In contrast to the results of the studies mentioned above, Yuan et al. [2022] and Mondol and Wadud [2022] indicate a negative relation between liquidity and profitability. The cost efficiency of banks and the level of operating costs are frequently cited as determinants of the profitability of banking sectors. For example, Mehzabin et al. [2023] suggest that the impact of operational efficiency on profitability is positive. Also, Neves et al. [2020] show that cost management is one of the main drivers of bank performance. Khan [2022] finds a significant relationship between operating efficiency and ROE. The positive impact of operating efficiency on ROA and ROE is confirmed by Mondol and Wadud [2022].
As the quality of risk management and cost efficiency in individual banks translates into indicators describing the risk of the banking sectors formed by these banks, it can be expected that:
H3: The banking sector’s risk affects its profitability.
H4: The ROE of European banking sectors is impacted by their cost-efficiency ratios.
The positive impact of bank capital on its profitability is confirmed inter alia by Menicucci and Paolucci [2016] and Saif-Alyousfi [2020]. Regarding capital adequacy, Khan [2022] proves its negative impact on bank ROA, whereas Mondol and Wadud [2022] come to the opposite conclusion. Rossi et al. [2018] attempt to identify the determinants of bank profitability in nine European countries to see evident changes in the context of the onset of the global crisis. In their view, “after a period of ‘irrational exuberance’ in which credit growth and high leverage are seen as proper and fast ways to boost profitability, a sound financial structure and a wiser and objective credit portfolio management have become the main drivers to ensure higher returns.” Having in mind the aforementioned results, we expect that:
H5: The capitalization of the banking sector is an essential determinant of the return on capital in this industry.
Yuanita [2019] and Horobet et al. [2021] argue that a higher concentration ratio measured by the Herfindahl–Hirschman Index (HHI) and illustrating banking sector structure is associated with lower profitability, while results achieved by Silalahi et al. [2015] suggest the opposite direction of the relation. Also, Keil [2017] points to the negative impact of the resurgence of market concentration on the sector’s profitability. Having these ambiguous results in mind, we presume that:
H6: The concentration of the banking sector determines its profitability.
With fresh memories of the subprime crisis, the sovereign debt crisis, the COVID-19 crisis and the current economic slowdown, energy market shock, rapidly rising inflation, and tense geopolitical situation, it is also worth citing the results of studies dedicated to the impact of these extreme events on banking sector performance. Examining the resilience of Polish banks to the pandemic crisis, Karkowska et al. [2023] reveal that the largest banks are the most resistant ones to the consequences of the pandemic. In turn, Bernardelli et al. [2021], analyzing the behavior of banks during the COVID-19 crisis, prove that large retail banks have been less affected than medium-sized ones with relatively meaningful corporate portfolios.
Furthermore, given the heterogeneity of the sets of explanatory variables of relevance to the profitability of the different groups of banks (distinguished based on geography, business model, financial condition, and others) as well as different structures of EU banking sectors, we expect that:
H7: Different sets of explanatory variables influence the profitability of clusters of European banking sectors.
The study is based on variables from 2007 to 2021 for all EU-27 countries, which are assigned to three categories (Table 1).
Table 1. Taxonomy of variables
| Symbol | Description | Source |
|---|---|---|
| Bank-specific variables | ||
| Assets | Total assets of the banking sector | ECB |
| Dyn_Assets | Asset dynamics in the banking sector | ECB |
| Liquid | Liquid assets to deposits and short-term funding (%) | World Bank |
| CIR | Bank cost-to-income ratio (%) | World Bank |
| Bank_Conc | Assets of the three largest commercial banks as a share of total commercial banking assets | World Bank |
| Foreign | Share (total assets) of foreign credit institutions | ECB |
| NPL | Bank NPL to gross loans (%) | World Bank |
| Z score | Bank Z-score | World Bank |
| Macroeconomic variables | ||
| GDP_Growth | GDP growth (annual %) | World Bank |
| Gov_Debt | Government debt (consolidated) (as % of GDP) | ECB, Eurostat |
| PPI | Producer Prices (2010 = 100) | Eurostat |
| Labor | Labor cost for LCI (compensation of employees plus taxes minus subsidies) in euro | Eurostat |
| Savings | Gross savings (% of GDP) | ECB, Eurostat |
| Business environment variables | ||
| GLOB | KOF Globalization Index | KOF Swiss Economic Institute |
| Freedom | An index that measures economic freedom based on trade freedom, business freedom, investment freedom, and property rights | Heritage Foundation |
Source: Own elaboration.
NPL, non-performing loans; PPI, Producer Price Index.
The sovereign ratings as of the end of 2021 granted by S&P, Moody’s and Fitch Ratings are considered and then normalized on a scale of 0–21 points. In the next step, the averages of the ratings mentioned above are determined and assigned to four ranges (<14; <14.00–16.49>; <16.50–18>; and <18.01–21>). Based on this approach, 27 EU countries are divided into four groups:
Group 1: Bulgaria, Croatia, Cyprus, Greece, Hungary, Italy, Portugal, and Romania.
Group 2: Latvia, Lithuania, Malta, Poland, Slovakia, Slovenia, and Spain.
Group 3: Czech Republic, Estonia, and Ireland.
Group 4: Austria, Belgium, Denmark, Finland, France, Germany, Luxembourg, Netherlands, and Sweden.
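The banding step described above can be sketched as follows. The conversion of letter grades to the 0–21 scale is not reproduced because it is not specified here, so the function takes already-normalized scores; the function names and the example scores are hypothetical.

```python
def average_rating(sp, moodys, fitch):
    """Average of three agency ratings already normalized to 0-21 points."""
    return (sp + moodys + fitch) / 3.0


def assign_group(avg_rating):
    """Map an average rating to one of the four bands used for grouping:
    below 14, 14.00-16.49, 16.50-18, and 18.01-21."""
    if not 0.0 <= avg_rating <= 21.0:
        raise ValueError("average rating must lie on the 0-21 scale")
    if avg_rating < 14.0:
        return 1
    if avg_rating < 16.5:
        return 2
    if avg_rating <= 18.0:
        return 3
    return 4


# Hypothetical normalized scores for four illustrative countries.
examples = {
    "Country A": (13, 13, 14),
    "Country B": (16, 16, 17),
    "Country C": (18, 18, 18),
    "Country D": (20, 21, 21),
}
for name, scores in examples.items():
    avg = average_rating(*scores)
    print(name, round(avg, 2), "-> Group", assign_group(avg))
```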
The above division is based on the criterion of assigning a given banking sector to a cluster characterized by a rating that falls within a certain range with the same spread. The exception is the first band containing the lowest ratings. The ROE is chosen to approximate the profitability rates of the banking sectors (Figure 1).

Figure 1. Distribution of ROE values for individual countries in 2007–2021. Source: own calculations. ROE, return on equity.
The crises that occurred in the years 2008–2013 affected the banking systems of individual countries to varying degrees. Hence, country-year observations in which the absolute value of ROE exceeds 25 are treated as outliers and omitted. Detailed descriptive statistics for ROE are provided in Table 2.
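A minimal sketch of the outlier rule and of the descriptive statistics reported per country (mean, SD, range) is given below. The ROE series is hypothetical, and the use of the sample standard deviation is an assumption, since the convention is not stated.

```python
from statistics import mean, stdev


def drop_outliers(roe_by_year, threshold=25.0):
    """Keep only country-year observations with |ROE| not exceeding the threshold."""
    return {year: value for year, value in roe_by_year.items()
            if abs(value) <= threshold}


def describe(values):
    """Mean, sample standard deviation (assumed convention), and range."""
    values = list(values)
    return {
        "mean": mean(values),
        "sd": stdev(values),
        "range": max(values) - min(values),
    }


# Hypothetical ROE series (in %) for one country.
roe_series = {2008: -30.0, 2009: 5.0, 2010: 25.0, 2011: 26.0, 2012: 8.5}

kept = drop_outliers(roe_series)
print("Years kept:", sorted(kept))
print("Statistics:", describe(kept.values()))
```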
Table 2. ROE descriptive statistics per country
| Group 1 | | | | Group 2 | | | |
|---|---|---|---|---|---|---|---|
| Country | Mean | SD | Range | Country | Mean | SD | Range |
| Bulgaria | 9.72 | 6.09 | 21.70 | Latvia | 2.78 | 16.59 | 70.96 |
| Croatia | 6.36 | 3.93 | 16.06 | Lithuania | 6.67 | 17.52 | 76.91 |
| Cyprus | 0.89 | 17.12 | 75.72 | Malta | 7.89 | 5.45 | 23.66 |
| Greece | −11.06 | 29.19 | 124.16 | Poland | 9.35 | 4.52 | 19.37 |
| Hungary | 8.37 | 8.39 | 27.32 | Slovakia | 8.11 | 4.39 | 18.78 |
| Italy | 0.69 | 7.15 | 24.61 | Slovenia | −4.62 | 32.98 | 131.88 |
| Portugal | −2.43 | 10.18 | 37.25 | Spain | 4.33 | 6.53 | 26.53 |
| Romania | 8.31 | 9.09 | 32.28 | | | | |
| Group 3 | | | | Group 4 | | | |
| Country | Mean | SD | Range | Country | Mean | SD | Range |
| Czech Republic | 14.39 | 4.13 | 16.43 | Austria | 5.89 | 4.66 | 18.36 |
| Estonia | 11.49 | 12.68 | 53.39 | Belgium | 3.63 | 14.90 | 58.66 |
| Ireland | −9.68 | 34.18 | 126.80 | Denmark | 5.32 | 4.92 | 18.61 |
| | | | | Finland | 8.29 | 1.70 | 6.05 |
| | | | | France | 5.69 | 5.57 | 22.99 |
| | | | | Germany | 0.33 | 7.34 | 30.19 |
| | | | | Luxembourg | 6.81 | 3.99 | 18.60 |
| | | | | Netherlands | 5.22 | 12.41 | 59.49 |
| | | | | Sweden | 13.30 | 3.70 | 15.96 |
Source: own calculations.
ROE, return on equity.
Machine learning (ML) methods are often characterized by greater accuracy than classic econometric approaches regarding prediction or fitting quality [Pérez-Pons et al., 2021]. These models are often described as “black boxes” because it is difficult to interpret their results and to assess how strongly individual factors contribute to them. A frequently used technique for examining interpretability is SHAP, which was proposed by Lundberg and Lee [2017]. Shapley values determine the importance of a given variable by comparing the model’s predictions with and without that variable, considering any possible ordering of the remaining variables. It is a valuable substitute for the interpretability of econometric models, providing insights into the determinants of the phenomena under consideration. As regards the advantages of this methodology, the prediction is fairly distributed among the feature values. Instead of comparing a prediction to the average prediction of the entire dataset, it can be compared with a subset or even with a single data point. The disadvantages of this approach are that SHAP requires a lot of computing time and can produce unintuitive feature attributions [Molnar, 2023].
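The idea of averaging a variable’s marginal contribution over all orderings can be sketched with an exact brute-force computation for a toy model. This is only an illustration of the Shapley principle, not the SHAP implementation used in the study: the model, the observation, and the baseline point are hypothetical, and a variable that is “absent” is represented here by its baseline value, which is an assumption of this sketch.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of one observation x relative to a baseline point.

    For every ordering of the features, each feature is switched from its
    baseline value to its observed value in turn, and the resulting change
    in the prediction is credited to that feature. The credits are then
    averaged over all orderings.
    """
    n = len(x)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for j in order:
            current[j] = x[j]
            value = predict(current)
            contributions[j] += value - previous
            previous = value
    return [c / len(orderings) for c in contributions]


def toy_model(v):
    """Hypothetical model with one additive term and one interaction term."""
    return 2.0 * v[0] + v[1] * v[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)  # [2.0, 3.0, 3.0]
# The attributions add up to the difference between the two predictions.
print(sum(phi), toy_model(x) - toy_model(baseline))
```

The interaction term is split equally between the two variables involved, which is the sense in which the prediction is fairly distributed. The number of orderings grows factorially with the number of variables, which explains the computing-time drawback noted above.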
Modeling using the set of all variables is performed independently for each group of countries. The modeling consists of two stages. The first stage is the selection and application of an ML model. The second stage is to assess the degree of influence of explanatory variables on the explained variable ROE using the SHAP value. Among many available ML methods, the random forest model is chosen. The random forest algorithm has been applied in many areas to make better predictions, estimations, and decisions. A version of the random forest algorithm used in modern applications was developed by Breiman [2001]. By integrating the combination of the bagging concept [Breiman, 1994] and the random selection of features introduced by Ho [1995], constructing the set of decision trees with controlled variance is possible. This research uses a fast implementation of random forests from the Ranger package written in R language (R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form). This implementation is particularly suited for high-dimensional data.
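The two ingredients named above, bagging and random selection of features, can be sketched in a few lines. The example below is a deliberately simplified stand-in written with the Python standard library: it is not the ranger implementation, each tree is reduced to a single split (a stump), and the data are hypothetical. Because each tree has only one split, drawing the candidate features once per tree is equivalent to drawing them per split.

```python
import random


def fit_stump(X, y, features):
    """Best single-split regression stump using only the candidate features."""
    best = None
    for j in features:
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [t for row, t in zip(X, y) if row[j] <= threshold]
            right = [t for row, t in zip(X, y) if row[j] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((t - left_mean) ** 2 for t in left)
                   + sum((t - right_mean) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    if best is None:
        # Candidate features are constant in this sample: predict the mean.
        overall = sum(y) / len(y)
        return (features[0], float("inf"), overall, overall)
    return best[1:]


def predict_stump(stump, row):
    j, threshold, left_mean, right_mean = stump
    return left_mean if row[j] <= threshold else right_mean


def fit_forest(X, y, n_trees=50, n_candidate_features=1, seed=0):
    """Bagged stumps, each fitted on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]              # bagging
        features = rng.sample(range(p), n_candidate_features)   # random features
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)


# Hypothetical data: the target depends on feature 0; feature 1 is noise.
X = [[float(i), float((2 * i) % 5)] for i in range(20)]
y = [2.0 * i for i in range(20)]
forest = fit_forest(X, y)
print(predict_forest(forest, [1.0, 0.0]), predict_forest(forest, [18.0, 0.0]))
```

Averaging many trees fitted on different bootstrap samples is what keeps the variance of the ensemble under control, while the random feature subsets reduce the correlation between the trees.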
The analysis of the results of the random forest models is performed on several levels. First, the quality of the models, measured by the root-mean-square error (RMSE), is compared for data in the considered groups of countries from all available years, and the results are given in Table 3. RMSEs are in the range of 1.7–2.9, which, relative to the range of ROE (from 6 to 132, see Table 2), seems to be a good approximation. Fitting accuracy and the associated errors vary across countries. Descriptive statistics of the model errors per country are presented in Table 4. For most countries, the absolute value of the average error is <1. Only Latvia and Ireland are above that threshold. In the case of Ireland, the main reason for worse model accuracy is the lower number of observations due to missing data compared with other countries. In the case of Latvia, the slightly larger average error is associated with a greater variance of the errors (see the SD column in Table 4), which is also observed for Slovenia, Ireland, and Romania. The range of the discrepancy between the random forest model estimates and the actual ROE values also exceeds 10 for Cyprus, Estonia, Hungary, Lithuania, and Portugal. A visible relationship exists between the range of ROE values and the range of estimation errors for individual countries (see the Range columns in Tables 2 and 4). Spearman’s correlation coefficient for these values is 0.76 (p-value = 4.332e−06).
Table 3. Values of RMSE and R2 for random forest models per group of countries
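A rank correlation of this kind can be computed as the Pearson correlation of the ranks. The sketch below uses only the Group 1 rows of the two Range columns as input, so its result is an illustration for a subset and is not expected to reproduce the coefficient reported for all countries; the helper names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


# Range of ROE and range of model errors for the Group 1 countries
# (Bulgaria, Croatia, Cyprus, Greece, Hungary, Italy, Portugal, Romania).
roe_range = [21.70, 16.06, 75.72, 124.16, 27.32, 24.61, 37.25, 32.28]
error_range = [6.22, 3.15, 10.98, 7.19, 10.90, 7.14, 10.35, 11.10]
print(round(spearman(roe_range, error_range), 2))
```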
| Measure | Group 1 | Group 2 | Group 3 | Group 4 |
|---|---|---|---|---|
| RMSE | 2.33 | 2.59 | 2.90 | 1.70 |
| R2 [%] | 57.28 | 37.50 | 33.51 | 52.44 |
Source: own calculations.
RMSE, root-mean-square error.
Table 4. Random forest model errors per group of countries
| Group 1 | | | | Group 2 | | | |
|---|---|---|---|---|---|---|---|
| Country | Mean | SD | Range | Country | Mean | SD | Range |
| Bulgaria | 0.19 | 1.50 | 6.22 | Latvia | −1.02 | 3.81 | 16.74 |
| Croatia | −0.02 | 0.79 | 3.15 | Lithuania | 0.57 | 2.85 | 13.22 |
| Cyprus | −0.19 | 2.90 | 10.98 | Malta | −0.12 | 1.37 | 5.75 |
| Greece | 0.36 | 2.16 | 7.19 | Poland | 0.29 | 1.25 | 5.23 |
| Hungary | 0.42 | 2.96 | 10.90 | Slovakia | 0.19 | 1.97 | 7.41 |
| Italy | 0.16 | 2.15 | 7.14 | Slovenia | 0.27 | 3.79 | 15.02 |
| Portugal | −0.76 | 2.79 | 10.35 | Spain | −0.01 | 2.38 | 9.20 |
| Romania | 0.23 | 2.93 | 11.10 | | | | |
| Group 3 | | | | Group 4 | | | |
| Country | Mean | SD | Range | Country | Mean | SD | Range |
| Czech Republic | 0.37 | 1.03 | 3.98 | Austria | −0.03 | 1.98 | 7.95 |
| Estonia | −0.65 | 3.47 | 13.44 | Belgium | −0.55 | 2.31 | 9.54 |
| Ireland | −1.81 | 4.19 | 11.73 | Denmark | −0.55 | 1.41 | 5.38 |
| | | | | Finland | 0.03 | 0.65 | 2.15 |
| | | | | France | −0.03 | 1.94 | 8.23 |
| | | | | Germany | −0.54 | 2.44 | 9.20 |
| | | | | Luxembourg | −0.12 | 0.92 | 3.38 |
| | | | | Netherlands | 0.27 | 1.13 | 4.06 |
| | | | | Sweden | 0.63 | 1.64 | 6.84 |
Source: own calculations.
Additionally, Table 3 shows a summary of R2, also known as explained variance or coefficient of determination. It is computed on out-of-bag data and indicates what percentage of the variance in the ROE values the random forest regression model explains. For the first and the fourth groups, R2 exceeds 50%. The other two groups are worse, but still, the model explains more than one-third of the variance in the ROE values.
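Both quality measures can be written down directly. The sketch below assumes that the predicted values passed in are out-of-bag predictions and uses the textbook definition of R2 (one minus the ratio of residual to total sum of squares); the variance convention of a particular software package may differ slightly. The numbers are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Share of the variance of the actual values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical ROE values and (assumed out-of-bag) model predictions.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 8.0]
print("RMSE:", round(rmse(actual, predicted), 4))
print("R2 [%]:", round(100.0 * r_squared(actual, predicted), 2))
```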
The second part of the analysis concerns determining the strength of the influence of the considered factors on the explained variable. The SHAP approach is used, in which an importance value is assigned to each feature. It reflects how much the model’s prediction changes when the variable is included, compared with when it is left out. The values alone do not carry information, but in the context of other values, information can be obtained about the relative strength of the influence of individual variables on the ROE variable. The higher the value, the more significant the impact. The bar plots in Figure 2 show SHAP feature importance, calculated as the average absolute SHAP value per feature for each group of countries independently. The 10 features with the greatest importance are plotted.
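The aggregation behind such bar plots is simple to state in code. The sketch below assumes that a matrix of SHAP values (one row per observation, one column per feature) is already available; the numbers are hypothetical and the feature names are borrowed from Table 1 for readability only.

```python
def shap_importance(shap_rows, feature_names, top=10):
    """Mean absolute SHAP value per feature, sorted from most to least important."""
    n = len(shap_rows)
    scores = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Hypothetical SHAP values for two observations and three features.
feature_names = ["CIR", "Z score", "NPL"]
shap_rows = [
    [1.0, -0.5, 0.1],
    [-3.0, 0.5, 0.1],
]
for name, score in shap_importance(shap_rows, feature_names):
    print(name, score)
```

Taking absolute values before averaging prevents positive and negative contributions from cancelling out, so the measure captures the strength of the influence but not its direction.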

Figure 2. SHAP feature importance per group of countries.
Source: own calculations. NPL, non-performing loans; PPI, Producer Price Index; SHAP, SHapley Additive exPlanations.
As a robustness check, the method of division into groups has been changed by limiting it to the S&P rating at the end of 2021. Because the composition of Groups 1 and 4 is unchanged, the results for these groups have not changed. Moving Slovenia from Group 2 to Group 3 resulted in a slight improvement in the RMSE in Group 2 (mean RMSE of 2.07), at the cost of worsening the mean RMSE in Group 3 to 2.99. The increase in the number of observations in Group 3 also resulted in a marked increase in R2 from 33.51 to as much as 49.85. At the same time, the decrease in the number of observations in Group 2 (which still includes many countries) only slightly reduces the R2 coefficient to 37.26. The stability of the results relative to the original division into groups supports the approach used.
Econometric models have the advantage of interpretability but, at the same time, generally have weaker predictive ability and much more restrictive requirements for meeting applicability assumptions. ML methods supplemented with the SHAP method are an excellent alternative [Pérez-Pons et al., 2021; Molnar, 2023]. The combination of these two methods in the analysis of financial data has so far been rare in scientific research. Meanwhile, the results of the empirical analysis described in this article support the effectiveness of this approach. The application of the SHAP framework for interpreting the output of the random forest model made it possible to isolate macroeconomic and banking variables, where, most likely, there is a causal relationship linking these variables to the ROE of the banking sectors in the analyzed period of 2007–2021. Thus, the results show the mechanism of action of the studied variables with an additional indication of the strength of the relationship between the analyzed variables and ROE.
The order of variables regarding the strength of influence affecting the ROE value differs depending on the group of countries, which allows confirming H7. The Government debt/GDP and Liquidity variables lead in Group 1, EU-27 in Group 2, and the Cost/Income (C/I) in Groups 3–4. The C/I and Z-Score variables appear in the top five in each group, with three groups having the most significant impact on the value of the explained variable. Leaders in Groups 1, 3, and 4 stand out significantly in the strength of their influence. Group 2 has no such drastic difference between the first and subsequent variables.
Demonstrating the significant impact of cost efficiency on the banking sector’s profitability allows H4 to be confirmed. Given banks’ relatively stable capital base (the denominator of ROE), the influence of operating costs relative to income on the banking sector’s profitability is quite apparent. Therefore, the gradual reduction of the C/I ratio is becoming an integral part of the banks’ strategies [Broby, 2021]. The importance of the C/I ratio is also emphasized by Mehzabin et al. [2023], Khan [2022], and Neves et al. [2020], among others.
The importance of cost efficiency is particularly pronounced in the third and fourth groups of countries (economies with relatively high sovereign ratings), which can be linked to the limited possibilities of realizing additional income in an environment of high interest rates following inflation. This follows from the fact that the position of the Producer Price Index (PPI) is significantly higher for groups of countries with weaker ratings.
In turn, demonstrating the significance of the Z-Score, a measure of the banking sector’s default risk, allows H3 to be confirmed. A similar relationship is shown by Mushafiq et al. [2022]. A relatively higher cost of funding accompanies higher default risk, which ultimately determines the pricing of financial products and services in a given market. This relationship can also be explained from the perspective of the banking system’s abundance of capital. The lower its value, the greater the default risk and, simultaneously, ceteris paribus, the higher the level of ROE in light of the Z-Score concept. The relevance of the Z-score in the context of the crucial role of own funds in its calculation allows positive verification of H5.
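The link between capital, default risk, and ROE can be illustrated numerically. The sketch assumes the commonly used definition of the bank Z-score, (ROA + equity/assets) / SD(ROA), as applied in the World Bank’s Global Financial Development Database; all figures are hypothetical.

```python
def z_score(roa_pct, equity_to_assets_pct, sd_roa_pct):
    """Bank Z-score under the assumed definition; a higher value means lower default risk."""
    return (roa_pct + equity_to_assets_pct) / sd_roa_pct


def roe_pct(net_income, equity):
    """Return on equity in percent."""
    return 100.0 * net_income / equity


# Two hypothetical sectors with identical assets, income, and ROA volatility,
# differing only in the size of their capital base.
total_assets = 1000.0
net_income = 10.0
sd_roa = 0.5  # standard deviation of ROA, in percentage points

for equity in (100.0, 50.0):
    roa = 100.0 * net_income / total_assets
    capital_ratio = 100.0 * equity / total_assets
    print(
        "equity:", equity,
        "ROE:", roe_pct(net_income, equity),
        "Z-score:", z_score(roa, capital_ratio, sd_roa),
    )
```

With everything else held constant, halving the capital base doubles ROE while lowering the Z-score, which is the ceteris paribus relationship described above.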
Our results also indicate the relatively high importance of the share of foreign ownership in banking sector assets as a determinant of profitability. This variable is among the five most significant in three groups of countries, and the finding of this relationship is in line with the results obtained by Othmani [2022], Gupta et al. [2020], and Rahman et al. [2020], among others. The effect of sovereign credit rating on foreign ownership demonstrated by Korzeb et al. [2023], in turn, confirms the validity of combining macroeconomic and bank-specific variables as determinants of bank profitability. This factor is important primarily in the groups of countries with relatively low ratings, where the share of foreign ownership in banking sector assets is higher than it is in the countries that make up the fourth group. In the fourth group, foreign ownership is so low that it is difficult to indicate its impact on the banking sector’s profitability.
The concentration of the banking sector as a variable with a significant impact on its profitability is included in the list of the five most important factors in two clusters of countries, which allows partial positive verification of H6. Studies by Yuanita [2019], Horobet et al. [2021], and Silalahi et al. [2015] support our results. Higher concentration is accompanied by less competition, and the oligopolistic structure of the banking sector translates into relatively higher profitability for large banks [Vera-Gilces et al., 2020].
Liquidity opposes profitability and is particularly important in relatively low-rated countries, at greater risk of speculative capital outflows. This low sovereign rating also determines the ability and cost of funding for banks in these countries. Our results, confirming H3, are in line with the findings of Yuan et al. [2022], Mondol and Wadud [2022], Caliskan and Kirer-Silva-Lecuna [2020], and Hasanov et al. [2018].
Among the macroeconomic variables, GDP growth and PPI significantly affect the profitability of banking sectors, which allows for positive verification of H1. Additionally, our study reveals the importance of such macroeconomic variables as government debt/GDP and labor costs. The government debt/GDP is the dominant determinant in the first group of countries, that is, economies with the lowest average sovereign ratings. They are characterized by relatively high public sector debt, and some of them have experienced problems with financial sector stability (e.g., Cyprus, Greece, Hungary), which has been sustained by state support or bail-outs from foreign investors.
The significance of the impact of GDP growth on bank profitability is demonstrated by Mondol and Wadud [2022], Saif-Alyousfi [2020], and Bonaccorsi di Patti and Palazzo [2020]. Labor costs and savings also have a relatively significant impact on the profitability of the banking sectors, but this applies only to the third group of countries. It is formed by countries with a relatively high share of foreign capital in the ownership structures of banks, so economic freedom can be considered an incentive for foreign investors, who can realize higher income in such a regulatory and tax environment than in their home countries.
Inflation as a factor affecting the profitability of the banking sector is, in turn, pointed out by Saif-Alyousfi [2020] and Caliskan and Kirer-Silva-Lecuna [2020]. Higher interest rates accompany higher inflation. Leveraging the result in products such as cash loans and increasing the differential between the reference rate and the actual cost of funding contribute to improving the bank’s net result.
It is to be expected that the legal and regulatory environment, the quality and enforceability of a country’s existing laws, political stability, and the extent of economic freedom will have a positive impact on profitability. Our study confirms the impact of economic freedom on banking sector profitability, which allows H2 to be positively verified. However, this issue requires in-depth research. Therefore, the conducted analyses will provide a basis for further detailed research focusing on: (i) building a model describing the determinants of the profitability of the banking sectors of the EU-27 countries, taking into account the strength, shape, and direction of this impact in individual subperiods and groups of banking sectors, (ii) explaining the reasons for the emergence of outliers and diagnosing the idiosyncratic determinants of such phenomena, and (iii) identifying the reasons for the differentiation and belonging of individual countries to the formed clusters using additional factors besides macroeconomic and banking factors, such as ESG and cultural factors.
Since the performance of the banking sector is vital to investors (it affects enterprise value), the determinants of its volatility should also be of interest to them. Supervisory institutions should also be among the potential recipients of the results of this study since profitability affects capital adequacy and liquidity ratios. The profitability of a banking sector is also essential from the point of view of fiscal and economic policy since banking sector profits impact future lending. The banking sector’s profitability also determines the possibility of using the banking sector to mitigate the adverse social effects of recession or inflation.
This article establishes a list of potential determinants of profitability of banking sectors. Based on the random forest method and the SHAP values, the strength of influence of each factor within the particular group of countries is established. To our knowledge, this is one of the first studies that focuses on entire banking sectors rather than individual banks and covers all EU countries and the period between the subprime and pandemic crises, and the first study to emphasize the importance of globalization, economic freedom, and the default risk of the banking sector.
This study and its results entail certain limitations. The first is the high level of data aggregation (analysis of entire banking sectors), which may distort the picture because the results of the industry may be dominated either by the financials of a single or a few large commercial banks or by the significant underperformance or overperformance of one of the banks, which does not necessarily belong to the largest institutions in the sector. The second limitation is that sectoral results have not been cleaned of the effects of exceptional or one-off events (e.g., the option crisis in Poland and the conversion of FX mortgages in Hungary). Additionally, only European banking systems are analyzed. These sectors, given certain criteria, are similar to each other, which may narrow the spectrum of candidates for profitability determinants and cause the exclusion of variables like cultural factors or the Economic Policy Uncertainty Index (EPUE).
As regards the future directions of research, it would be interesting to compare the results of the study that uses ML and SHAP methodology with other (traditional) approaches like linear regression.