ProFed: A Benchmark for Proximity-Based Non-IID Federated Learning

Davide Domini; Christian Otte Ingemann; Gianluca Aguzzi; Lukas Esterle; Mirko Viroli

doi:10.5334/jors.624

Abstract

Federated Learning (FL) has emerged as a key paradigm in machine learning, but its performance often deteriorates when client data are not independent and identically distributed (non-IID). Such heterogeneity frequently reflects geographic factors, for example regional linguistic variations or localized traffic patterns, so that data are IID within a region but non-IID across regions. However, existing FL algorithms are typically evaluated by randomly splitting non-IID data across devices, disregarding their spatial distribution.

To address this gap, we introduce ProFed, a benchmark that simulates data splits with varying degrees of skewness across different regions. We incorporate several skewness methods from the literature and apply them to well-known datasets, including MNIST, FashionMNIST, Extended MNIST, CIFAR-10, CIFAR-100, and UTKFace. Our goal is to provide researchers with a standardized framework to evaluate FL algorithms more effectively and consistently against established baselines.
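
The kind of split described above can be illustrated with a minimal sketch. It is not the benchmark's actual API: the function names, parameters, and the choice of Dirichlet-based label skew (one skewness method commonly used in the FL literature) are assumptions made for illustration only. Samples of each class are distributed across regions according to a Dirichlet draw, and each region's pool is then shuffled and dealt evenly to its devices, so that data are approximately IID within a region and non-IID across regions.

```python
import random
from collections import defaultdict


def dirichlet(alpha, k, rng):
    """Sample a symmetric Dirichlet(alpha) vector of length k
    by normalizing independent Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def partition_by_region(labels, num_regions, devices_per_region, alpha, seed=0):
    """Hypothetical helper (not part of any library).

    Returns a list of regions; each region is a list of devices;
    each device is a list of sample indices into `labels`.
    Smaller `alpha` gives stronger label skew across regions.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    # Across regions: split each class according to a Dirichlet draw.
    region_pool = [[] for _ in range(num_regions)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        props = dirichlet(alpha, num_regions, rng)
        start, cum = 0, 0.0
        for r in range(num_regions):
            cum += props[r]
            # The last region takes whatever remains, so no sample is lost.
            end = len(idxs) if r == num_regions - 1 else round(cum * len(idxs))
            region_pool[r].extend(idxs[start:end])
            start = end

    # Within a region: shuffle and deal indices round-robin to devices,
    # so devices of the same region share the region's label distribution.
    regions = []
    for pool in region_pool:
        rng.shuffle(pool)
        regions.append([pool[d::devices_per_region]
                        for d in range(devices_per_region)])
    return regions


if __name__ == "__main__":
    # Toy labels standing in for a 10-class dataset such as MNIST.
    toy_labels = [i % 10 for i in range(1000)]
    split = partition_by_region(toy_labels, num_regions=3,
                                devices_per_region=4, alpha=0.5, seed=1)
    for r, devices in enumerate(split):
        print(f"region {r}: device sizes {[len(d) for d in devices]}")
```

In this sketch, `alpha` controls the degree of skewness: large values make the regions look alike, while small values concentrate each class in a few regions. Very small values may need a guard against a zero normalizer in `dirichlet`.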

