Abstract
This paper investigates the main factors affecting housing cost stress among households in Europe by applying machine learning methods to a set of structural, demographic, market and policy indicators. The research design combines variable selection using the VSURF algorithm with Random Forest modelling, followed by interpretation of the results using SHAP analysis, including the identification of non-linear relationships and interactions between the variables. The data are drawn from publicly available databases: Eurostat, Housing Europe, Numbeo, Horwath HTL, and AllTheRooms. The analysis covers all 27 EU member states. It identifies threshold values for individual factors at which their effect on the household housing cost burden becomes disruptive, and it quantifies the interactions among these variables. The findings underscore the need for a comprehensive, integrated housing policy. The contribution to knowledge lies in the methodological integration of machine learning with a public policy framework, which allows more precise identification of the determinants of housing unaffordability and better targeting of policy interventions.