An Empirical Study of Automated Machine Learning Python Libraries Using Source Code Analysis
Language: English
Page range: 95 - 115
Submitted on: Feb 19, 2026
Accepted on: May 14, 2026
Published on: Jun 2, 2026
Published by: Riga Technical University
In partnership with: Paradigm Publishing Services
Publication frequency: Volume open
© 2026 Christian O’Leary, Conor Lynch, Farshad Ghassemi Toosi, published by Riga Technical University
This work is licensed under the Creative Commons Attribution 4.0 License.