El-Niss, Ayoub ORCID: https://orcid.org/0009-0009-3776-3734, Alzu’Bi, Ahmad ORCID: https://orcid.org/0000-0001-5466-0379, Abuarqoub, Abdelrahman ORCID: https://orcid.org/0000-0001-6576-8932, Hammoudeh, Mohammad ORCID: https://orcid.org/0000-0003-1058-0996 and Muthanna, Ammar ORCID: https://orcid.org/0000-0003-0213-8145 (2024) SimProx: A Similarity-Based Aggregation in Federated Learning With Client Weight Optimization. IEEE Open Journal of the Communications Society, 5. pp. 7806-7817. ISSN 2644-125X
Published Version
Available under License Creative Commons Attribution.
Abstract
Federated Learning (FL) enables decentralized training of machine learning models across multiple clients, preserving data privacy by aggregating locally trained models without sharing raw data. Traditional aggregation methods, such as Federated Averaging (FedAvg), often assume uniform client contributions, leading to suboptimal global models in heterogeneous data environments. This article introduces SimProx, a novel FL approach for aggregation that addresses heterogeneity in data through three key improvements. First, SimProx employs a composite similarity-based weighting mechanism, integrating cosine and Gaussian similarity measures to dynamically optimize client contributions. Then, it incorporates a proximal term in the client weighting scheme, using gradient norms to prioritize updates closer to the global optimum, thereby enhancing model convergence and robustness. Finally, a dynamic parameter learning technique is introduced, which adapts the balance between similarity measures based on data heterogeneity, refining the aggregation process. Extensive experiments on standard benchmarking datasets and real-world multimodal data demonstrate that SimProx significantly outperforms traditional methods like FedAvg in terms of accuracy. SimProx offers a scalable and effective solution for decentralized deep learning in diverse and heterogeneous environments.
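The aggregation idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the published SimProx formulation: the function names, the rescaling of cosine similarity to [0, 1], the Gaussian bandwidth `sigma`, the fixed mixing parameter `alpha` (the paper learns this balance dynamically), and the `1 / (1 + mu * grad_norm)` form of the proximal down-weighting.

```python
import numpy as np

def cosine_similarity(u, v, eps=1e-12):
    # Cosine of the angle between two flattened parameter vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))

def gaussian_similarity(u, v, sigma=1.0):
    # Gaussian (RBF) similarity based on squared Euclidean distance.
    return float(np.exp(-np.sum((u - v) ** 2) / (2.0 * sigma ** 2)))

def aggregate_by_similarity(global_w, client_ws, grad_norms,
                            alpha=0.5, mu=0.1, sigma=1.0):
    """Hypothetical similarity-weighted aggregation (illustrative only).

    alpha balances cosine vs. Gaussian similarity; it is fixed here,
    whereas the abstract describes learning this balance dynamically.
    mu scales an assumed gradient-norm penalty standing in for the
    proximal term.
    """
    scores = []
    for w, g in zip(client_ws, grad_norms):
        cos = (1.0 + cosine_similarity(w, global_w)) / 2.0  # rescale to [0, 1]
        gau = gaussian_similarity(w, global_w, sigma)
        composite = alpha * cos + (1.0 - alpha) * gau
        # Assumed proximal-style factor: larger gradient norm, smaller weight.
        scores.append(composite / (1.0 + mu * g))
    scores = np.asarray(scores, dtype=float)
    total = scores.sum()
    if total <= 0.0:
        # Degenerate case: fall back to uniform (FedAvg-like) weights.
        weights = np.full(len(scores), 1.0 / len(scores))
    else:
        weights = scores / total
    new_global = np.sum(weights[:, None] * np.stack(client_ws), axis=0)
    return new_global, weights
```

In this sketch, a client whose parameters align with the current global model and whose gradient norm is small receives a larger share of the aggregate, while uniform FedAvg-style weighting is recovered only when all clients score equally.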