Abstract
This study introduces a novel machine learning framework to accurately predict the discharge coefficient (Cd) of elliptical side orifices (ESOs). The framework was built on a cleaned experimental dataset of 575 entries, obtained by removing outliers with the Interquartile Range (IQR) method. Five key dimensionless input variables were used to predict Cd: relative crest height (W/B), relative orifice width (a/B), relative orifice height (b/B), relative upstream height (y1/B), and upstream Froude number (F1). Four Bayesian-optimized base models, namely Extreme Gradient Boosting (BO-XGB), LightGBM (BO-LGB), CatBoost (BO-CGB), and Histogram-based Gradient Boosting (BO-HGB), were integrated within a stacked ensemble architecture. A meta-learner based on Multiple Linear Regression (MLR) linearly combined their predictions to form the final Stacked Model (SM-MLR). Among the base models, BO-CGB achieved the best validation performance, with R² = 0.8884, RMSE = 0.0100, and MARE = 0.0155. The final SM-MLR model outperformed all base learners and prior models, reaching R² = 0.920, RMSE = 0.0086, and MARE = 0.0122. Model interpretation using Shapley Additive Explanations (SHAP) and Partial Dependence Plots (PDPs) revealed that a/B and b/B were the most influential inputs. PDP analysis highlighted a consistently positive influence of a/B and a nonlinear but stabilizing trend for b/B. In contrast, W/B exhibited a strong negative linear effect on Cd, while y1/B and F1 showed more complex, nonlinear behaviors. These nonlinear, geometry-dependent relationships indicate that the hydraulic behavior of ESOs is not adequately captured by classical side-orifice theory. Accordingly, this study provides a comprehensive ML-based framework tailored to this geometry, and the analysis offers new theoretical insight into how ESO geometric ratios govern lateral outflow mechanics, addressing a key gap in the hydraulic modeling of non-rectangular side orifices.
To support practical application, a user-friendly graphical user interface (GUI) was developed, enabling engineers to estimate Cd in real time from the five input parameters. Overall, the proposed stacked ensemble approach advances both the theoretical understanding and the predictive accuracy of ESO discharge behavior, offering a robust and practical tool for modern hydraulic design.
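
The outlier screening and the three reported error measures can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common Tukey convention of fences at 1.5 times the IQR (the abstract does not state the multiplier), assumes MARE denotes the mean absolute relative error, and uses hypothetical function names and made-up Cd values purely for illustration.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences; k=1.5 is an assumed convention."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_filter(rows, column, k=1.5):
    """Keep only the rows whose `column` value lies inside the IQR fences."""
    lower, upper = iqr_bounds([row[column] for row in rows], k)
    return [row for row in rows if lower <= row[column] <= upper]


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mare(observed, predicted):
    """Mean absolute relative error (assumed meaning of MARE)."""
    relative = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return sum(relative) / len(relative)


if __name__ == "__main__":
    # Made-up Cd values, not taken from the study's dataset.
    rows = [{"Cd": v} for v in (0.60, 0.61, 0.62, 0.63, 0.64, 0.95)]
    kept = iqr_filter(rows, "Cd")
    print(len(rows), "rows before screening,", len(kept), "after")

    observed = [0.60, 0.62, 0.64]
    predicted = [0.61, 0.62, 0.63]
    print("R2 =", r2(observed, predicted))
    print("RMSE =", rmse(observed, predicted))
    print("MARE =", mare(observed, predicted))
```

In a full workflow the same screening would presumably be applied to each variable before the data are split, and the metrics would be computed on the validation predictions of every base model and of the stacked model.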