Federated Learning With Differential Privacy: Algorithms and Performance Analysis

Bibliographic Details
Published in:	IEEE transactions on information forensics and security Vol. 15; pp. 3454 - 3469
Main Authors:	Wei, Kang; Li, Jun; Ding, Ming; Ma, Chuan; Yang, Howard H.; Farokhi, Farhad; Jin, Shi; Quek, Tony Q. S.; Vincent Poor, H.
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 2020; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Agglomeration; Algorithms; Analytical models; Artificial neural networks; client selection; Clients; Computer simulation; Convergence; convergence performance; differential privacy; Distributed databases; Federated learning; information leakage; Levels; Machine learning; Mathematical models; Parameters; Privacy; Scheduling; Servers; Tradeoffs; Training
ISSN:	1556-6013, 1556-6021

Description
Summary:	Federated learning (FL), as a type of distributed machine learning, is capable of significantly preserving clients' private data from being exposed to adversaries. Nevertheless, private information can still be divulged by analyzing uploaded parameters from clients, e.g., weights trained in deep neural networks. In this paper, to effectively prevent information leakage, we propose a novel framework based on the concept of differential privacy (DP), in which artificial noise is added to parameters at the clients' side before aggregating, namely, noising before model aggregation FL (NbAFL). First, we prove that the NbAFL can satisfy DP under distinct protection levels by properly adapting different variances of artificial noise. Then we develop a theoretical convergence bound on the loss function of the trained FL model in the NbAFL. Specifically, the theoretical bound reveals the following three key properties: 1) there is a tradeoff between convergence performance and privacy protection levels, i.e., better convergence performance leads to a lower protection level; 2) given a fixed privacy protection level, increasing the number $N$ of overall clients participating in FL can improve the convergence performance; and 3) there is an optimal number of aggregation times (communication rounds) in terms of convergence performance for a given protection level. Furthermore, we propose a $K$-client random scheduling strategy, where $K$ ($1 \leq K < N$) clients are randomly selected from the $N$ overall clients to participate in each aggregation. We also develop a corresponding convergence bound for the loss function in this case, and the $K$-client random scheduling strategy retains the above three properties. Moreover, we find that there is an optimal $K$ that achieves the best convergence performance at a fixed privacy level. Evaluations demonstrate that our theoretical results are consistent with simulations, thereby facilitating the design of various privacy-preserving FL algorithms with different tradeoff requirements on convergence performance and privacy levels.
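
The summary describes two mechanisms: each client adds noise to its parameters before uploading, and the server aggregates uploads from K randomly selected clients out of N. The following is a minimal sketch of one such round, not the authors' algorithm. The summary specifies neither the noise distribution nor how its variance follows from the protection level, so the Gaussian noise, the norm clipping step, the plain averaging, and every name and parameter here (clip_by_norm, noisy_upload, aggregate_round, clip, sigma) are assumptions made for illustration.

```python
import math
import random


def clip_by_norm(weights, clip):
    """Scale the vector so its L2 norm is at most `clip` (assumed step)."""
    norm = math.sqrt(sum(x * x for x in weights))
    scale = min(1.0, clip / norm) if norm > 0 else 1.0
    return [x * scale for x in weights]


def noisy_upload(weights, clip, sigma, rng):
    """Client side: clip, then add zero-mean Gaussian noise before upload.

    `sigma` is a hypothetical free parameter; the summary does not give
    the rule that ties the noise variance to the protection level.
    """
    return [x + rng.gauss(0.0, sigma) for x in clip_by_norm(weights, clip)]


def aggregate_round(client_weights, k, clip, sigma, rng):
    """Server side: average noisy uploads from k randomly chosen clients."""
    chosen = rng.sample(range(len(client_weights)), k)
    uploads = [noisy_upload(client_weights[i], clip, sigma, rng) for i in chosen]
    dim = len(uploads[0])
    averaged = [sum(u[j] for u in uploads) / k for j in range(dim)]
    return averaged, chosen


# Usage: N = 10 clients with 4 parameters each, K = 3 scheduled per round.
rng = random.Random(0)
clients = [[rng.uniform(-2.0, 2.0) for _ in range(4)] for _ in range(10)]
global_weights, chosen = aggregate_round(clients, k=3, clip=1.0, sigma=0.05, rng=rng)
```

In this sketch a larger sigma hides each upload more strongly but perturbs the average more, which is the kind of tension between privacy level and convergence performance that the summary reports.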
DOI:	10.1109/TIFS.2020.2988575