FAIRness Along the Machine Learning Lifecycle Using Dataverse in Combination with MLflow

Open Access | Dec 2024

Abstract

Typical Machine Learning (ML) approaches are iterative and exploratory: not only the code but also the ML models are continuously refined and adapted to optimize results and performance on new data. This poses novel challenges for keeping the trained model Findable, Accessible, Interoperable and Reusable (FAIR), especially when the entire machine learning lifecycle is automated under the concept of Machine Learning Operations (MLOps). This article introduces a comprehensive integration of a data repository (based on the Dataverse software) and an ML platform (based on the MLflow framework) that enables seamless sharing and publishing of data, experiments and models while ensuring FAIRness. The presented solution is evaluated in an ML use case scenario covering model training, hyperparameter optimization, and model sharing via the data platform.

Language: English
Submitted on: Mar 24, 2024
Accepted on: Nov 16, 2024
Published on: Dec 6, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Lincoln Sherpa, Valentin Khaydarov, Ralph Müller-Pfefferkorn, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.