Abstract
The aim of this study is to develop machine-learning-based models for predicting staff churn and staff lifetime. The models are developed under three approaches: without feature selection, with the Minimum Redundancy Maximum Relevance (mRMR) feature selection algorithm, and with Principal Component Analysis (PCA), applied here as the feature reduction step. Two different datasets are used. Staff churn is predicted on both datasets with Logistic Regression (LR), Categorical Boosting (CatBoost), and Extreme Learning Machine (ELM). Staff lifetime is predicted with Support Vector Machine (SVM), Light Gradient Boosting Machine (LightGBM), and K-Nearest Neighbors (KNN). The classification models are evaluated with Accuracy and F-Score, and the regression models with Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). The results show that neither mRMR nor PCA has a significant effect on the performance of the models.
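For reference, the four evaluation measures named above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, the F-Score is assumed to be the binary F1 score with churn encoded as 1, and MAPE is assumed to be reported in percent with nonzero true values.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 score (assumption: F-Score means F1, positive class = churn)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # Return 0.0 when there are no positive labels and no positive predictions.
    return 2 * tp / denom if denom else 0.0


def mae(y_true, y_pred):
    """Mean Absolute Error, in the units of the target (e.g. months of tenure)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent (assumes no true value is zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data, invented purely for illustration.
churn_true = [1, 0, 1, 1]
churn_pred = [1, 0, 0, 1]
lifetime_true = [10.0, 20.0, 40.0]
lifetime_pred = [12.0, 18.0, 40.0]

print(accuracy(churn_true, churn_pred))
print(f1_score(churn_true, churn_pred))
print(mae(lifetime_true, lifetime_pred))
print(mape(lifetime_true, lifetime_pred))
```

MAE is expressed in the units of the target, whereas MAPE is scale-free, which makes it easier to compare results across datasets.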