Early Warning System for Debt Group Migration: The Case of One Commercial Bank in Vietnam

Open Access
|Sep 2024

Figures & Tables

Figure 1.

The lifecycle of a customer’s credit relationship with a bank. (Source: Author’s own research)

Figure 2.

A decision tree classifier (Source: Author’s illustration)

Figure 3.

A random forest classifier (Source: Author’s illustration)

Figure 4.

A random forest classifier (Source: Author’s own research)

Figure 5.

ROC and AUROC (Source: Author’s illustration)

Figure 6.

Logistic regression results (Source: Author’s own calculation)

Figure 7.

Support vector machine results (Source: Author’s own calculation)

Figure 8.

Decision tree results. (Source: Author’s own calculation)

Figure 9.

Decision tree results (Source: Author’s own calculation)

Figure 10.

MCC versus F-Recall and F-Precision (Source: Author’s own calculation)

The results of the evaluation criteria for the B Score model (Source: Author’s own calculation)

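| Dataset | Criteria (%) | LR (tuned by MCC) | SVM (tuned by MCC) | DT (tuned by MCC) | RF (tuned by MCC) | LR (tuned by F-Recall) | SVM (tuned by F-Recall) | DT (tuned by F-Recall) | RF (tuned by F-Recall) |
|---|---|---|---|---|---|---|---|---|---|
| Train | Accuracy | 64.62 | 83.22 | 96.29 | 99.06 | 64.62 | 47.05 | 50.31 | 96.11 |
| Train | Recall | 65.19 | 86.98 | 98.20 | 99.97 | 65.19 | 91.53 | 90.05 | 99.10 |
| Train | Precision | 47.59 | 69.86 | 91.29 | 97.27 | 47.59 | 37.73 | 39.19 | 90.15 |
| Train | F1 | 55.02 | 77.48 | 94.62 | 98.60 | 55.02 | 53.44 | 54.61 | 94.41 |
| Train | F-Recall | 60.70 | 82.91 | 96.74 | 99.42 | 60.70 | 71.22 | 71.49 | 97.17 |
| Train | F-Precision | 50.31 | 72.72 | 92.59 | 97.80 | 50.31 | 42.76 | 44.18 | 91.81 |
| Train | MCC | 27.92 | 65.34 | 91.94 | 97.92 | 27.92 | 19.60 | 22.82 | 91.68 |
| Validation | Accuracy | 64.66 | 70.58 | 74.09 | 81.84 | 64.66 | 46.62 | 50.41 | 79.85 |
| Validation | Recall | 64.98 | 61.58 | 66.95 | 67.45 | 64.98 | 91.48 | 87.83 | 70.78 |
| Validation | Precision | 47.62 | 55.08 | 59.79 | 75.26 | 47.62 | 37.52 | 39.02 | 69.20 |
| Validation | F1 | 54.96 | 58.15 | 63.17 | 71.14 | 54.96 | 53.22 | 54.04 | 69.98 |
| Validation | F-Recall | 60.56 | 60.16 | 65.39 | 68.88 | 60.56 | 71.04 | 70.26 | 70.46 |
| Validation | F-Precision | 50.31 | 56.27 | 61.10 | 73.56 | 50.31 | 42.54 | 43.90 | 69.51 |
| Validation | MCC | 27.89 | 35.71 | 43.45 | 58.13 | 27.89 | 18.94 | 21.29 | 54.83 |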
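
The F1, F-Recall and F-Precision rows can be reproduced from the Precision and Recall rows. The sketch below assumes that F-Recall is the F-beta score with beta = 2 and F-Precision is the F-beta score with beta = 0.5. This page does not state those definitions; they are inferred because they match the tabulated values. The function name `f_beta` is illustrative.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """F-beta score from precision and recall given on the same scale.

    beta > 1 weights recall more heavily; beta < 1 weights precision more.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Train-set LR values (in %) taken from the table above.
precision, recall = 47.59, 65.19

f1 = f_beta(precision, recall, 1.0)           # table reports 55.02
f_recall = f_beta(precision, recall, 2.0)     # assumed beta = 2; table reports 60.70
f_precision = f_beta(precision, recall, 0.5)  # assumed beta = 0.5; table reports 50.31

print(f"F1={f1:.2f}  F-Recall={f_recall:.2f}  F-Precision={f_precision:.2f}")
```

Under these assumed beta values the three results agree with the LR train column to two decimal places.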

Matrix for different types of alerts (Source: Reihart et al., 2010)

| - | The event occurred | The event did not occur |
|---|---|---|
| There is a warning signal | A | B |
| There is no warning signal | C | D |

Summary of parameters of models (Source: Author’s compilation)

| Model | Parameter | Description |
|---|---|---|
| LG | None | The baseline model is a linear regression model combined with the sigmoid (logit) activation function, so no tuning is required |
| SVM | Kernel function | The activation function used to transform data into a different feature space for linear separation; options include Linear, Polynomial, Sigmoid, and RBF |
| | C | The coefficient for balancing the weight between distance and noise |
| | d | The degree parameter when using the Polynomial kernel, which takes a natural number value |
| | γ | The gamma parameter for Polynomial, Sigmoid, and RBF kernels, which takes a non-negative value |
| | r | The intercept for the Polynomial and Sigmoid kernels |
| DT | Depth | It is necessary to limit the depth of the DT to avoid overfitting and reduce computational cost |
| | Number of leaf nodes | It is necessary to limit the number of leaf nodes of the DT to avoid overfitting and reduce computational cost |
| RF | Depth | It is necessary to limit the depth of each DT to avoid overfitting and reduce computational cost |
| | Number of leaf nodes | It is necessary to limit the number of leaf nodes of each DT to avoid overfitting and reduce computational cost |
| | Number of DTs | The number of DTs in Random Forest Classifier (RF) needs to be considered for computational cost when the number is too high |

Confusion matrix (Source: Author’s illustration)

| Target variable | Predicted: 1 | Predicted: 0 | Total |
|---|---|---|---|
| Actual: 1 | TP: True positives | FN: False negatives | P |
| Actual: 0 | FP: False positives | TN: True negatives | N |
| Total | $\hat{P}$ | $\hat{N}$ | P + N |
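
As an illustration of how the evaluation criteria follow from this matrix, the sketch below computes accuracy, recall, precision and MCC from the four cells, using the standard textbook definitions. The function name `confusion_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard metrics from confusion-matrix counts (returned as fractions)."""
    p, n = tp + fn, fp + tn          # actual positives / negatives (row totals)
    p_hat, n_hat = tp + fp, fn + tn  # predicted positives / negatives (column totals)

    accuracy = (tp + tn) / (p + n)
    recall = tp / p if p else 0.0
    precision = tp / p_hat if p_hat else 0.0

    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt(p * n * p_hat * n_hat)
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "mcc": mcc}


# Hypothetical counts, for illustration only.
metrics = confusion_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

MCC uses all four cells, so it stays informative when the two classes are imbalanced, whereas accuracy can look high simply because one class dominates.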

Early warning system deployment for B Score customers (Source: Author’s own calculation)

| Criteria (%) | Tuned by MCC: RF (best) | Tuned by F-Recall: SVM (best) | Tuned by F-Recall: RF (second best) |
|---|---|---|---|
| Accuracy | 81.84 → 78.67 | 46.62 → 36.86 | 79.85 → 76.56 |
| Recall | 67.45 → 29.01 | 91.48 → 91.53 | 70.78 → 38.14 |
| Precision | 75.26 → 42.84 | 37.52 → 22.44 | 69.20 → 39.37 |
| F-Recall | 68.88 → 31.01 | 71.04 → 56.65 | 70.46 → 38.38 |

The results of the evaluation criteria for the C Score model (Source: Author’s own calculation)

| Customer | Selection | Model | Parameters |
|---|---|---|---|
| B Score | Best | RF tuned by MCC | n_estimators = 100; max_depth = 20; max_leaf_nodes = None |
| | | SVM tuned by F-Recall | kernel = 'sigmoid'; C = 0.1; gamma = 0.01 |
| | Second best | RF tuned by F-Recall | n_estimators = 100; max_depth = 16; max_leaf_nodes = None |
| C Score | Best | SVM tuned by MCC | kernel = 'poly'; degree = 4; C = 0.01; gamma = 0.1 |
| | | SVM tuned by F-Precision | kernel = 'poly'; degree = 4; C = 0.1; gamma = 0.01 |

Model tuning parameters in Scikit-learn (Source: Author’s own research)

| Model | Parameter | Parameters in Scikit-learn | Range of values for tuning |
|---|---|---|---|
| LG | None | None | None |
| SVM | Kernel function | kernel: accepts a value from 'linear', 'poly', 'rbf', 'sigmoid'. The default value is 'rbf' | 'linear', 'poly', 'rbf', 'sigmoid' |
| | C | C: data type is float; the default value is 1 | 0.01, 0.1, 1, 10 |
| | d | degree: data type is integer; the default value is 3 | 2, 3, 4, 5 (applicable only when kernel is set to 'poly') |
| | γ | gamma: accepts a value from 'scale', 'auto'. The default value is 'scale'. It can also be specified as a non-negative float | 0.01, 0.1, 1, 10 (not applicable when kernel is set to 'linear') |
| DT | Depth | max_depth: data type is integer or None. The default value is None, which means nodes are expanded until all leaves are pure or contain fewer than min_samples_split samples | The range from 2 to 21 (with a step size of 2) and None |
| | Number of leaf nodes | max_leaf_nodes: data type is integer or None. The default value is None, which means an unlimited number of leaf nodes will be developed, regardless of max_depth | The range from 2 to 21 (with a step size of 2) and None |
| RF | Depth | Similar to DT | Similar to DT |
| | Number of leaf nodes | Similar to DT | Similar to DT |
| | Number of DTs | n_estimators: data type is integer. The default value is 100 | 10, 50, 100 |
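
To give a sense of the size of this search space, the sketch below writes the tuning ranges as plain Python dictionaries and counts the candidate settings per model. It assumes that "2 to 21 with a step size of 2" means the values 2, 4, ..., 20. The list-of-dicts layout follows the format accepted by the `param_grid` argument of scikit-learn's `GridSearchCV`, but this page does not say which search routine the authors used. All variable names and the `expand` helper are illustrative.

```python
from itertools import product

C_VALUES = [0.01, 0.1, 1, 10]
GAMMA_VALUES = [0.01, 0.1, 1, 10]
DEGREE_VALUES = [2, 3, 4, 5]
# Assumption: "2 to 21, step 2" is read as 2, 4, ..., 20, plus None (no limit).
TREE_LIMITS = list(range(2, 21, 2)) + [None]

# One sub-grid per kernel family, so that degree is tuned only for 'poly'
# and gamma is skipped for 'linear', as the table specifies.
svm_grid = [
    {"kernel": ["linear"], "C": C_VALUES},
    {"kernel": ["poly"], "C": C_VALUES,
     "degree": DEGREE_VALUES, "gamma": GAMMA_VALUES},
    {"kernel": ["rbf", "sigmoid"], "C": C_VALUES, "gamma": GAMMA_VALUES},
]
dt_grid = [{"max_depth": TREE_LIMITS, "max_leaf_nodes": TREE_LIMITS}]
rf_grid = [{"n_estimators": [10, 50, 100],
            "max_depth": TREE_LIMITS, "max_leaf_nodes": TREE_LIMITS}]


def expand(grid):
    """Enumerate every parameter combination in a list-of-dicts grid."""
    combos = []
    for sub_grid in grid:
        keys = list(sub_grid)
        for values in product(*(sub_grid[k] for k in keys)):
            combos.append(dict(zip(keys, values)))
    return combos


for name, grid in [("SVM", svm_grid), ("DT", dt_grid), ("RF", rf_grid)]:
    print(name, len(expand(grid)), "candidate settings")
```

Under these assumptions the grids contain 100 SVM, 121 DT and 363 RF candidates, and each one would need to be fitted once per validation split.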

Early warning system deployment for C Score customers (Source: Author’s own calculation)

| Criteria (%) | Tuned by MCC: SVM (best) | Tuned by F-Precision: SVM (best) |
|---|---|---|
| Accuracy | 70.78 → 61.94 | 71.60 → 65.54 |
| Recall | 53.57 → 54.48 | 45.24 → 50.00 |
| Precision (*) | 58.44 → 40.33 | 62.30 → 43.79 |
| F-Precision (*) | 57.40 → 42.54 | 57.93 → 44.91 |
DOI: https://doi.org/10.2478/fman-2024-0012 | Journal eISSN: 2300-5661 | Journal ISSN: 2080-7279
Language: English
Page range: 195 - 216
Published on: Sep 10, 2024
Published by: Warsaw University of Technology
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Quoc Hung Nguyen, Hoang Viet Trinh, Truong Viet Phuong, Truong Thi Minh Ly, published by Warsaw University of Technology
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.