An Empirical Study of Automated Machine Learning Python Libraries Using Source Code Analysis
Abstract
The growth of Automated Machine Learning (AutoML) has expanded access to machine learning workflows by automating tasks and lowering the technical barrier to entry. However, the reliability and maintainability of AutoML libraries depend on the quality of their underlying source code. This study presents a novel, systematic analysis of 16 Python AutoML libraries utilising SonarQube – an industry-standard source code analysis (SCA) platform – and Python analysis tools: Bandit, Coverage.py, Prospector, Pylint, Radon, and Ruff. The AutoML libraries are evaluated using software quality metrics, which collectively reflect overall code complexity, maintainability, security, and adherence to Python coding standards.
Strong agreement was observed between SonarQube-based rankings and rankings derived from Python-based tools. Based on median SCA rankings, the libraries were ordered (highest to lowest estimated code quality) as follows: Hyperopt-sklearn, AutoKeras, GAMA, MLBox, FEDOT, TPOT, MLJAR, LightAutoML, Auto-sklearn, PyCaret, FLAML, Auto-PyTorch, Ludwig, EvalML, AutoTS, and AutoGluon.
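As an illustration of how an ordering by median rank can be derived, the following minimal sketch aggregates per-metric ranks into a single ordering. It is not the study's actual pipeline: the function name `order_by_median_rank`, the placeholder library names, and the rank values are all hypothetical, and ties are broken alphabetically as an arbitrary assumption.

```python
from statistics import median


def order_by_median_rank(ranks_by_library):
    """Return library names sorted by median rank (lower rank = better quality).

    ranks_by_library maps a library name to the list of ranks it received,
    one rank per metric or tool. Ties in the median are broken by name,
    which is an arbitrary choice made for this sketch.
    """
    medians = {name: median(ranks) for name, ranks in ranks_by_library.items()}
    return sorted(medians, key=lambda name: (medians[name], name))


# Hypothetical ranks for three placeholder libraries across three metrics.
# These values are invented for illustration and are not results from the study.
example_ranks = {
    "lib_a": [1, 2, 1],
    "lib_b": [3, 3, 2],
    "lib_c": [2, 1, 3],
}

print(order_by_median_rank(example_ranks))
```

The median is used here rather than the mean so that a single outlying rank on one metric does not dominate a library's overall position.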
An additional exploratory Spearman rank correlation analysis examined the relationship between SCA metrics and forecasting performance measures from a prior electricity price prediction benchmark (n = 7). Several SCA metrics exhibit strong monotonic relationships with forecasting error measures, e.g., SonarQube Violations and Code Smells correlate positively with mean absolute error (ρ = 0.86), while Class Cyclomatic Complexity (ρ = −0.89) and Duplicated Files (ρ = −0.86) correlate negatively with library execution time. Due to the limited sample size, these findings are descriptive and non-parametric. The results suggest that code quality scores may relate to lower-bound predictive performance and computational efficiency, warranting further validation.
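For readers unfamiliar with the statistic, Spearman's ρ can be sketched as the Pearson correlation of the rank-transformed values, with tied values sharing their average rank. The code below is a self-contained illustration under that standard definition, not the authors' analysis script. The helper names and the seven paired values are hypothetical and do not reproduce the study's data.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sd_x * sd_y)


# Hypothetical paired observations for n = 7 libraries (invented values):
# a code-quality metric count and a forecasting error measure.
metric_counts = [120, 340, 95, 410, 230, 180, 300]
error_values = [0.31, 0.45, 0.40, 0.52, 0.36, 0.33, 0.47]

print(round(spearman_rho(metric_counts, error_values), 2))
```

Because ρ depends only on ranks, it captures monotonic rather than strictly linear association. With only seven paired observations, a few rank swaps can shift the coefficient noticeably.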
© 2026 Christian O’Leary, Conor Lynch, Farshad Ghassemi Toosi, published by Riga Technical University
This work is licensed under the Creative Commons Attribution 4.0 License.