In the era of big data and artificial intelligence, distributed machine learning has emerged as a promising way to address privacy and security concerns while fostering collaboration among multiple parties. However, as data grow in volume, velocity, veracity, and variety, ensuring effective data management and responsible data sharing in these systems remains a challenge. In this paper, we explore potential solutions and propose a system architecture that incorporates the FAIR data principles (Findable, Accessible, Interoperable, and Reusable) to promote effective and secure collaboration in federated learning. We propose a minimum set of metadata schemes tailored to distributed machine learning, together with a decentralized authentication and authorization mechanism based on self-sovereign identity and a policy-based access control architecture. To demonstrate the effectiveness of the proposed system, we conduct a FAIRness assessment and evaluate model performance in a federated learning use case. Our work contributes to the development of an efficient, secure, and collaborative data ecosystem, fostering innovation in artificial intelligence and machine learning.
Citation:
Y. Mou et al., “Towards FAIR Data in Distributed Machine Learning Systems,” IEEE Global Communications Conference 2023, Dec. 2023, doi: 10.1109/globecom54140.2023.10437414.
More Information:
Open source: https://doi.org/10.1109/GLOBECOM54140.2023.10437414