key: cord-0188760-y55gt8lm authors: Zhou, Yao; Wu, Jun; Wang, Haixun; He, Jingrui title: Adversarial Robustness through Bias Variance Decomposition: A New Perspective for Federated Learning date: 2020-09-18 journal: nan DOI: nan sha: 28a367f2780b8abfd30bbe94000cb626bdfbbc89 doc_id: 188760 cord_uid: y55gt8lm Federated learning learns a neural network model by aggregating the knowledge from a group of distributed clients under the privacy-preserving constraint. In this work, we show that this paradigm might inherit the adversarial vulnerability of the centralized neural network, i.e., it has deteriorated performance on adversarial examples when the model is deployed. This is even more alarming when federated learning paradigm is designed to approximate the updating behavior of a centralized neural network. To solve this problem, we propose an adversarially robust federated learning framework, named Fed_BVA, with improved server and client update mechanisms. This is motivated by our observation that the generalization error in federated learning can be naturally decomposed into the bias and variance triggered by multiple clients' predictions. Thus, we propose to generate the adversarial examples via maximizing the bias and variance during server update, and learn the adversarially robust model updates with those examples during client update. As a result, an adversarially robust neural network can be aggregated from these improved local clients' model updates. The experiments are conducted on multiple benchmark data sets using several prevalent neural network models, and the empirical results show that our framework is robust against white-box and black-box adversarial corruptions under both IID and non-IID settings. The explosive amount of decentralized user data collected from the ever-growing usage of smart devices, e.g., smartphones, wearable devices, home sensors, etc., has led to a surge of interest in the field of decentralized learning. To protect the privacy-sensitive data of the clients, federated learning [25, 47] has been proposed. It only allows a group of clients to train local models using their own data on each individual device, and then collectively merges the model updates on a central server using secure aggregation [1] . Due to its high privacy-preserving property, federated learning has attracted much attention in recent years along with the prevalence of efficient light-weight deep models [15] and low-cost network communications [21, 44] . * Equal contribution Most existing federated learning models focus on improving the strategies of local model training (e.g., local SGD [38] ) and global aggregation (e.g., FedAvg [25] ). The intuition behind is to approximate the updating behavior of the centralized model trained on all clients' data. However, little effort has been devoted to improving the adversarial robustness of federated learning paradigm. This becomes even more alarming when centralized machine learning models have been shown to be vulnerable to adversarial attacks when those models are deployed in the testing phase [12, 27] . Our work bridges this gap by investigating the generalization error incurred in federated learning's aggregation process from the perspective of bias-variance decomposition [8, 41] . Specifically, we show that this generalization error on the central server can be decomposed as the combination of bias (triggered by the main prediction of these clients) and variance (triggered by the variations among clients' predictions). 
This motivates us to propose a novel adversarially robust federated learning framework Fed_BVA. The key idea is to perform the local robust training on clients by supplying them with bias-variance perturbed examples generated from a small public training set on the central server. It has the following advantages: (i) it encourages the clients to consistently produce the optimal prediction for perturbed examples, thereby leading to a better generalization performance; (ii) local adversarial training on perturbed examples allows to learn a robust local model, and thus a robust global model could be aggregated from clients. The experiments are conducted on neural networks with cross-entropy loss, however, other loss functions are also applicable as long as their gradients w.r.t. bias and variance are tractable to be computed. The most similar works to ours are robust federated learning models [7, 10, 28, 32, 33] . However, our work is fundamentally different from them. To be more specific, those works proposed to improve the robustness of federated learning against Byzantine failures induced by corrupted client's updates during model training, by performing client-level robust training or designing server-level aggregation variants with hyper-parameter tuning. In contrast, we focus on the adversarial robustness of federated learning against adversarial examples, when the model is deployed in the testing phase. The problem studied in this paper assumes that the learning process is clean (no malicious behaviors or Byzantine faults are observed on clients in training). Note that our Fed_BVA is flexible to incorporate with Byzantine-robust aggregation variants when this assumption is violated, but it is beyond the scope of this paper. Compared with previous work, our major contributions include: • We provide the exact solution of bias-variance analysis w.r.t. the generalization error for neural networks in the federated learning setting. As a comparison, performing the adversarial training on conventional federated learning methods can only focus on the bias of the central model but ignore the variance. • We demonstrate that the conventional federated learning framework is vulnerable to strong attacks with increasing communication rounds even if the adversarial training using the locally generated adversarial examples is performed on each client. • Without violating the clients' privacy, we show that providing a tiny amount of bias-variance perturbed data from the central server to the clients through asymmetrical communication could dramatically improve the robustness of the training model under various adversarial settings. The rest of this paper is organized as follows. The related work is reviewed in Section 2. In Section 3, the preliminary is provided. Then, in Section 4, a generic framework for adversarially robust federated learning is proposed, followed by the instantiated algorithm Fed_BVA with bias-variance attacks in Section 5. The experiments are presented in Section 6. We conclude this paper in Section 7. In this section, we introduce the related work on federated learning and bias-variance decomposition. Federated learning [21, 25] aggregates the knowledge from a large number of local devices (e.g., smartphone [13] ) with limited storage and computational resources. This aggregated knowledge can thus be leveraged to train the modern machine learning models (e.g., deep neural networks) [5, 16, 17, 19, 42] . 
However, we observe that those federated learning algorithms over neural networks are approximating the update dynamics of centralized neural networks, thus resulting in inheriting the adversarial vulnerability of centralized neural networks [12, 39] . More specifically, the deployed neural network models in federated learning would not be able to correctly classify the adversarial examples generated by the existing evasion attacks (e.g., FGSM [12, 45] , PGD [24] ). There are two lines of related works for improving the adversarial robustness of federated learning. One is to develop robust model aggregation schemes [7, 10, 23, 28, 32, 33] against Byzantine failures induced by corrupted client's updates during model training. Those works focus on performing client-level robust training or designing server-level aggregation variants with hyper-parameter tuning. The other one is to improve the model robustness against the adversarial attacks after model deployment [29] . This idea is fundamentally different from the Byzantine-robust approaches in the following. (1) Different from Byzantine-robust model training, it is assumed that no malicious behaviors or Byzantine faults are observed on clients in training. (2) It improves the model robustness by refining the decision boundary around the adversarial examples, whereas Byzantine-robust models focus on mitigating the impact of potential clients' faults. It is studied [34, 36, 50] that one naïve idea is to leverage the adversarial training techniques [24, 43, 45] for clients' local training. However, it might suffer from (1) expensive computation for local clients associated with limited storage and computational resources; and (2) increased distribution shift among local clients, as the generated local adversarial examples might be biased towards different directions. Therefore, in this paper, we study the adversarial robustness of federated learning from the perspective of bias-variance analysis. Different from previous works, the bias-variance analysis on local clients allows capturing the global model dynamics for improving the adversarial robustness of the aggregated global model in the federated learning setting. Bias-variance decomposition [11, 20] was originally introduced to analyze the generalization error of a learning algorithm. Then, a generalized bias-variance decomposition [8, 41] was studied in the classification setting which enabled flexible loss functions (e.g., squared loss, zero-one loss). Specifically, it is shown that under mild conditions, the generalization error of a learning algorithm can be decomposed into bias, variance and irreducible noise. In this case, bias measures how the trained models over different training data sets deviate from the optimal model, and variance measures the differences of those trained models. Moreover, it is previously observed [11] that the bias decreases and the variance increases monotonically with respect to the model complexity. It thus suggests that a better trade-off of bias and variance could lead to improving the generalization performance of a learning algorithm. Nevertheless, in recent years, it is empirically shown [31, 37] that increasing the model complexity of deep neural networks tends to produce better generalization performance, which is in contradiction to previous bias-variance analysis. Following this idea, bias-variance trade-off was experimentally revisited on modern neural network models [2, 3, 9, 30, 48] . 
It is found that for deep neural networks, the variance is more likely to increase first and then decrease with respect to the model complexity. We would like to point out that, compared to standard supervised learning, federated learning can be better characterized by the bias-variance decomposition. That is because the trained models on local clients can be naturally applied to define the bias and variance in the generalization error of a learning algorithm. To the best of our knowledge, this is the first work studying the federated learning problem from the perspective of bias-variance analysis. In this section, we formally present the problem definition and the bias-variance trade-off in the classification setting. In federated learning [25, 47], there is a central server and $K$ different clients, each with access to a private training set $D_k = \{(x_i, y_i)\}_{i=1}^{n_k}$. Here, $x_i$, $y_i$, and $n_k$ are the features, label, and number of training examples in the $k$-th client, where $k = 1, \cdots, K$. Each data set $D_k$ is exclusively owned by client $k$ and will not be shared with the central server or other clients. The goal of standard federated learning is to learn the prediction function by aggregating the knowledge of user data from a set of local clients without leaking the user privacy. A typical framework [25, 47] of federated learning can be summarized as follows: (1) Client Update: Each client updates its local parameters by minimizing the empirical loss over its own training set; (2) Forward Communication: Each client uploads its parameter update to the central server; (3) Server Update: The server synchronously aggregates the received parameters; (4) Backward Communication: The global parameters are sent back to the clients. It can be seen that the neural network in federated learning actually approximates the updating behavior of a centralized model. For example, the update behavior of FedAvg [25] is closely equivalent to stochastic gradient descent (SGD) on the centralized data when each client only takes one step of gradient descent. That explains why it might potentially inherit the adversarial vulnerability of centralized neural network models [12, 39]. Therefore, in this paper, we study the adversarial robustness of neural networks in the federated learning setting. We introduce a small public training set $D_s = \{(x_i, y_i)\}_{i=1}^{n_s}$ with $n_s$ training examples within the central server that is shared with the clients, where $n_s \ll \sum_{k=1}^{K} n_k$. Note that this will not break the privacy constraints, as the local data in every client is not shared. For example, during the COVID-19 pandemic, hospitals (local devices) would like to consider a few publicly accessible template data with typical symptoms for model training of the diagnostic system. Notice that federated semi-supervised learning [18, 49] has studied a similar problem setting where some labeled examples are available only at the server. Instead, in our paper, we propose to improve the adversarial robustness of the federated learning system over local devices by taking those publicly accessible data $D_s$ into consideration. Formally, we define adversarially robust federated learning as follows. Input: $K$ local clients with private training sets $D_1, \cdots, D_K$, and a central server with a small public training set $D_s$. Output: A trained model on the central server that is robust against adversarial perturbations. We would like to point out that our problem definition has the following properties.
(1) Asymmetrical communication: Asymmetrical communication between each client and the central server is allowed: the server provides both the global model parameters and a limited amount of shared data to the clients, while each client only uploads its local model parameters back to the server. This implies that compared to standard federated learning, the communication cost might increase due to those shared data, but it enables the improvement of adversarial robustness in federated learning (see Appendix A.4.1 for more empirical analysis). (2) Data distribution: All training examples on the clients and the server are assumed to follow the same data distribution. However, the experiments show that our proposed algorithm also achieves outstanding performance under the non-IID setting (see Subsection 6.2), which could be commonly seen among personalized clients in real scenarios. (3) Shared learning algorithm: All the clients are assumed to use the identical model $f(\cdot)$, including architectures as well as hyper-parameters (e.g., learning rate, local epochs, local batch size, etc.). In this paper, we investigate adversarially robust federated learning by studying the generalization error incurred in its aggregation process. We discover that the key to analyzing this error lies in the bias-variance trade-off. Following [8], we define the optimal prediction, main prediction, as well as the bias, variance, and noise for any real-valued loss function $L(\cdot, \cdot)$ as follows.

Definition 3.2. (Optimal Prediction and Main Prediction [8]) Given a loss function $L(\cdot, \cdot)$ and a learning algorithm $f(\cdot)$, the optimal prediction $y_*$ and the main prediction $y_m$ for input $x$ are defined as:
$$y_*(x) = \arg\min_{y'} \mathbb{E}_t \left[ L(y', t) \right], \qquad y_m(x) = \arg\min_{y'} \mathbb{E}_D \left[ L(y', f_D(x)) \right]$$
where the label $t$ and the data set $D$ are viewed as random variables denoting the class label and the training set, and $f_D$ denotes the model trained on $D$.

Definition 3.3. (Bias, Variance and Noise) Given a loss function $L(\cdot, \cdot)$ and a learning algorithm $f(\cdot)$, the bias, variance and noise can be defined as follows:
$$B(x) = L(y_m, y_*), \qquad V(x) = \mathbb{E}_D \left[ L(f_D(x), y_m) \right], \qquad N(x) = \mathbb{E}_t \left[ L(y_*, t) \right]$$
Furthermore, there exist $c_1, c_2 \in \mathbb{R}$ such that the expected prediction loss $\mathbb{E}_{D, t}[L(f_D(x), t)]$ for an example $x$ can be decomposed into bias, variance and noise as follows:
$$\mathbb{E}_{D, t}\left[ L(f_D(x), t) \right] = B(x) + c_2 V(x) + c_1 N(x) \qquad (1)$$

In short, bias is the loss incurred by the main prediction w.r.t. the optimal prediction, and variance is the average loss incurred by all individual predictions w.r.t. the main prediction. Noise is conventionally assumed to be irreducible and independent of $f(\cdot)$. Our definitions of optimal prediction, main prediction, bias, variance and noise slightly differ from previous ones [8, 41]. For example, the conventional optimal prediction was defined as $y_*(x) = \arg\min_{y'} \mathbb{E}_t[L(t, y')]$, and it is equivalent to our definition when the loss function is symmetric over its arguments, i.e., $L(y_1, y_2) = L(y_2, y_1)$. In this section, we present the adversarially robust federated learning framework. It follows the same paradigm as traditional federated learning (see Subsection 3.1) but with substantial modifications on the server update and client update as follows. The server has the following two crucial components. The first one is the model aggregation, which synchronously compresses and aggregates the received local model parameters. The other component is designed to produce adversarially perturbed examples, which are induced by a poisoning attack algorithm for the usage of robust adversarial training. 4.1.1 Model Aggregation. The overall goal of federated learning is to learn a prediction function by using knowledge from all clients such that it could generalize well over the test data set.
When the central server receives the parameter updates from the local clients, it aggregates the locally updated parameters to obtain a shared global model. It is notable that most existing federated learning approaches [25, 26, 42] focus on developing advanced model aggregation schemes. One of the popular aggregation methods is FedAvg [25], which averages the element-wise parameters of the local models, i.e.,
$$\theta = \sum_{k=1}^{K} \frac{n_k}{n} \theta_k$$
where $\theta_k$ is the model parameters of the $k$-th client and $n = \sum_{k=1}^{K} n_k$. In this paper, we focus on improving the adversarial robustness of federated learning by encouraging the local model parameters to be updated with adversarial examples (see the next subsection). Then the adversarial robustness of the local models can be naturally propagated into the server's global model after model aggregation. Therefore, our framework is flexible to be incorporated with existing model aggregation methods, e.g., FedAvg [25], FedMA [42], AFL [26], etc. 4.1.2 Adversarial Examples. It is shown [12, 45] that the adversarial robustness of deep neural networks can be improved by updating the model parameters over adversarial examples. That is because the generalization error of neural networks on adversarial examples can be minimized during model training. However, a few issues arise if we apply adversarial training to federated learning directly. Specifically, one intuitive solution for generating adversarial examples is to separately maximize the generalization error in each local client (see Subsection 4.3 for more discussion). Its drawbacks are twofold: First, it will significantly increase the computational burden and memory usage on the local clients. Second, the locally generated adversarial examples make the augmented data distribution of each local client much more biased [46], which challenges the standard server-level aggregation mechanisms [26]. Instead, we would like to study the adversarial robustness of federated learning from the perspective of bias-variance decomposition. The following theorem shows that in the classification setting, the generalization error of a learning algorithm can be decomposed into bias, variance, and irreducible noise. Note that this decomposition holds for any real-valued loss function in the binary setting [8], with a bias & variance trade-off coefficient that has a closed-form expression.

Theorem 4.1. In the binary case, the decomposition in Eq. (1) is valid for any real-valued loss function that satisfies $L(y, y) = 0$ for all $y$ and $L(y_1, y_2) \neq 0$ otherwise.

The proof of Theorem 4.1 is similar to that of [8]; we omit it here for space. Note that the noise is irreducible, and we usually drop this term in real applications [8, 11, 41, 48]. Commonly used loss functions for this decomposition are the squared loss, zero-one loss, and cross-entropy loss with one-hot labels. For the multi-class setting, a closed-form solution of the coefficient for the cross-entropy loss can be derived as well; its behavior is summarized in the following remark. Remark. Intuitively, when $y_m = y_*$, no bias is induced and the generalization error mainly comes from variance; when bias exists ($y_m \neq y_*$), a negative variance coefficient will help to reduce the error when the prediction is identical to the main prediction or the optimal prediction; otherwise, the variance coefficient is set to zero. In this paper, neural networks that use the cross-entropy loss and the mean squared loss with softmax prediction outputs are studied. Thus, we inherit their definition of bias & variance directly, but treat the trade-off coefficient $c_2$ as a hyper-parameter to tune, because no closed-form expression of $c_2$ is available in a general multi-class learning scenario.
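To make the decomposition above concrete, the sketch below illustrates how the bias and variance of Eq. (1) might be estimated empirically from a group of client models under the CE loss (the noise term is dropped, as discussed above). It assumes, for illustration only, that the main prediction is taken as the average of the clients' softmax outputs, the bias as the loss of the main prediction w.r.t. the true label, and the variance as the average loss of the individual client predictions w.r.t. the main prediction; these are natural instantiations of Definition 3.3 rather than the paper's exact estimators, and the helper name estimate_bias_variance is ours.

```python
import torch
import torch.nn.functional as F

def estimate_bias_variance(client_models, x, y, eps=1e-12):
    """Rough empirical bias/variance estimate from K client models (CE loss).

    Illustrative assumptions (not the paper's exact formulas):
      - main prediction = average of the clients' softmax outputs
      - bias            = CE loss of the main prediction w.r.t. the true label
      - variance        = average CE loss of each client's prediction w.r.t.
                          the main prediction
    x: (B, ...) input batch, y: (B,) integer class labels.
    """
    probs = torch.stack([F.softmax(m(x), dim=-1) for m in client_models])  # (K, B, C)
    y_main = probs.mean(dim=0)                                             # (B, C) main prediction

    # Bias: loss incurred by the main prediction w.r.t. the (noise-free) label.
    bias = F.nll_loss(torch.log(y_main + eps), y, reduction="mean")

    # Variance: average loss of individual client predictions w.r.t. the main prediction.
    variance = -(y_main.unsqueeze(0) * torch.log(probs + eps)).sum(dim=-1).mean()

    return bias, variance
    # The server-side attack can then maximize bias + lambda * variance w.r.t. x.
```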
Following [8, 41], we assume a noise-free scenario where the class label $y$ is a deterministic function of $x$ (i.e., if $x$ is sampled repeatedly, the same value of its class label will be observed). The bias and variance can be empirically estimated over the clients' models. This motivates us to generate the adversarial examples by attacking the bias and variance induced by the clients' models as:
$$\max_{\hat{x} \in \Omega(x)} \; B(\hat{x}; \theta_1, \cdots, \theta_K) + \lambda V(\hat{x}; \theta_1, \cdots, \theta_K) \qquad (3)$$
where $B(\hat{x}; \theta_1, \cdots, \theta_K)$ and $V(\hat{x}; \theta_1, \cdots, \theta_K)$ could be empirically estimated from a finite number of clients' parameters $\theta_1, \cdots, \theta_K$ trained on the local training sets $\{D_1, \cdots, D_K\}$. Here $\lambda$ is a hyper-parameter to measure the trade-off of bias and variance, and $\Omega(x)$ is the perturbation constraint. $\hat{x}$ is the perturbed example w.r.t. the clean example $x$ associated with class label $y$. Specifically, the perturbation constraint $\hat{x} \in \Omega(x)$ forces the adversarial example $\hat{x}$ to be visually indistinguishable from $x$. In our paper, we use $\ell_\infty$-bounded adversaries [12], i.e., $\Omega(x) := \{\hat{x} \mid \|\hat{x} - x\|_\infty \leq \epsilon\}$ for a perturbation magnitude $\epsilon$. Note that $D_s$ (on the server) is the candidate subset of all available training examples that would lead to their perturbed counterparts. This is a more feasible setting compared to generating adversarial examples on the clients' devices, because the server usually has much more powerful computational capacity in real scenarios, which allows the usage of flexible poisoning attack algorithms. In this case, both the poisoned examples and the server model parameters would be sent back to each client (Backward Communication), while only the clients' local parameters would be uploaded to the server (Forward Communication), i.e., the asymmetrical communication. The robust training of one client's prediction model (i.e., the $k$-th client with parameters $\theta_k$) can be formulated as the following minimization problem:
$$\min_{\theta_k} \; \sum_{(x, y) \in D_k} L\left( f(x; \theta_k), y \right) + \sum_{(\hat{x}, y) \in \hat{D}_s} L\left( f(\hat{x}; \theta_k), y \right) \qquad (4)$$
where $\hat{x} \in \Omega(x)$ is the perturbed example that is asymmetrically transmitted from the server. Intuitively, the bias measures the systematic loss of a learning algorithm, and the variance measures the prediction consistency of the learner over different training sets. Therefore, our robust federated learning framework has the following advantages: (i) it encourages the clients to consistently produce the optimal prediction for perturbed examples, thereby leading to better generalization performance; (ii) local adversarial training on perturbed examples allows each client to learn a robust local model, and thus a robust global model could be aggregated from the clients. Theoretically, we could have another alternative robust federated learning strategy where the perturbed training examples of each client are generated on the local devices from $D_k$ instead of being transmitted from the server. It is similar to [24, 40], where the adversarial counterparts of clean examples are iteratively synthesized and the model parameters are updated over the perturbed training examples. Thus, each local robust model is trained individually. Nevertheless, poisoning attacks on the device will largely increase the computational cost and memory usage. Meanwhile, this strategy only considers the client-specific loss and is still vulnerable against adversarial examples with increasing communication rounds. Both phenomena are observed in our experiments (see Fig. 2(b) and Fig. 2(c)). In this section, we present the instantiated algorithm Fed_BVA with bias-variance attacks. We first consider the maximization problem in Eq. (3) using bias-variance based adversarial attacks. It aims to find the adversarial example $\hat{x}$ (from the original example $x$) that would produce large bias and variance values w.r.t. the clients' local models.
To this end, we propose the following two gradient-based algorithms to generate adversarial examples. Bias-variance based Fast Gradient Sign Method (BV-FGSM): Following FGSM [12, 45], it linearizes the maximization problem in Eq. (3) with a one-step attack as follows:
$$\hat{x}_{\text{BV-FGSM}} = x + \epsilon \cdot \text{sign}\left( \nabla_x \left( B(x; \theta_1, \cdots, \theta_K) + \lambda V(x; \theta_1, \cdots, \theta_K) \right) \right) \qquad (5)$$
where $\text{sign}(\cdot)$ denotes the sign function. Bias-variance based Projected Gradient Descent (BV-PGD): PGD [24] can be considered as a multi-step variant of FGSM and might generate more powerful adversarial examples. This motivates us to derive a BV-based PGD attack:
$$\hat{x}^{t+1} = \text{Proj}_{\Omega(x)}\left( \hat{x}^{t} + \alpha \cdot \text{sign}\left( \nabla_{\hat{x}^{t}} \left( B(\hat{x}^{t}; \theta_1, \cdots, \theta_K) + \lambda V(\hat{x}^{t}; \theta_1, \cdots, \theta_K) \right) \right) \right)$$
where $\hat{x}^{t}$ is the adversarial example at the $t$-th step with the initialization $\hat{x}^{0} = x$, $\alpha$ is the step size, and $\text{Proj}_{\Omega(x)}(\cdot)$ projects each step onto $\Omega(x)$. Remark. The proposed framework could be naturally generalized to any gradient-based adversarial attack algorithm where the gradients of the bias $B(\cdot)$ and variance $V(\cdot)$ w.r.t. $x$ are tractable when estimated from finite training sets. Compared with the existing attack methods [4, 12], the loss function our adversary aims to optimize is a linear combination of bias and variance, whereas existing work mainly focused on attacking the overall classification error involving the bias only. The following theorem (Theorem 5.1) states that the bias $B(\cdot)$ and variance $V(\cdot)$, as well as their gradients over the input, could be estimated using the clients' models; under the CE loss, the resulting expressions involve the entropy of the main prediction and the number of classes. In addition, we also consider the case where $L(\cdot, \cdot)$ is the MSE loss function. Its main prediction for an input example $(x, y)$ has a closed-form expression which is exactly the same as that of the CE loss, while its empirical bias and unbiased variance can only be estimated with their own formulas, together with their gradients over the input. It can be seen that the empirical estimate of $\nabla_x B(x)$ under the MSE loss has much higher computational complexity than its CE counterpart, because it involves the gradient calculation of the prediction vector $f_D(x; \theta)$ over the input tensor $x$. Besides, it is easy to show that the empirical estimate of $\nabla_x V(x)$ under the MSE loss is also computationally expensive. A comparison between using the CE and MSE losses is presented in Subsection 6.3.1. We present a novel robust federated learning algorithm with our proposed bias-variance attacks, named Fed_BVA. Following the framework defined in Eq. (3) and Eq. (4), the key components of our algorithm are (1) bias-variance attacks for generating adversarial examples on the server, and (2) adversarial training using the poisoned server examples together with the clean local examples on each client. Therefore, we optimize these two objectives by producing the adversarial examples $\hat{D}_s$ and updating the local model parameters iteratively. The proposed algorithm is summarized in Alg. 1. Given the server's $D_s$ and the clients' training data $\{D_k\}_{k=1}^{K}$ as input, the output is a robust global model on the server. In this case, the clean server data $D_s$ will be shared with all the clients. First, it initializes the server's model parameters and the perturbed data $\hat{D}_s$, and then assigns them to the randomly selected clients (Steps 4-5). Next, each client optimizes its own local model (Steps 6-15) with the received global parameters as well as its own clean data $D_k$, and uploads the updated parameters as well as the gradients of the local model on each shared server example back to the server.
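As a companion to the description above, the following sketch shows how the BV-FGSM perturbation of Eq. (5), the local robust update of Eq. (4), and a FedAvg-style aggregation might be wired together, reusing the estimate_bias_variance helper sketched earlier. It is a minimal illustration under our own assumptions (PyTorch, a single shared batch of server examples, illustrative function names, and the paper's default $\epsilon = 0.3$, $\lambda = 0.01$), not a faithful transcription of Alg. 1; in particular, Alg. 1 lets the server assemble the bias and variance gradients from the clients' uploads, whereas this sketch evaluates the client models in one place for brevity.

```python
import copy
import torch
import torch.nn.functional as F

def bv_fgsm(client_models, x_s, y_s, eps=0.3, lam=0.01):
    """One-step bias-variance attack on the shared server examples (cf. Eq. (5))."""
    # estimate_bias_variance(...) is the helper from the earlier sketch.
    x_adv = x_s.clone().detach().requires_grad_(True)
    bias, variance = estimate_bias_variance(client_models, x_adv, y_s)
    (bias + lam * variance).backward()
    # x_hat = x + eps * sign(grad_x(B + lam * V)), clipped to a valid pixel range.
    x_adv = (x_s + eps * x_adv.grad.sign()).clamp(0.0, 1.0)
    return x_adv.detach()

def client_update(global_model, local_loader, x_adv, y_s, epochs=5, lr=0.01):
    """Local robust training on clean client data plus perturbed server data (cf. Eq. (4))."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for x, y in local_loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y_s)
            loss.backward()
            opt.step()
    return model.state_dict()

def fedavg(state_dicts, weights):
    """Weighted element-wise average of client parameters (FedAvg-style aggregation)."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = sum(w * sd[key] for w, sd in zip(weights, state_dicts)) / sum(weights)
    return avg
```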
At last, the server generates the perturbed data $\hat{D}_s$ (Steps 16-19) using the proposed bias-variance attacks with aggregations (model parameter average, bias gradient average, and variance gradient average) in a similar manner as FedAvg [25]. These aggregations can be privacy-secured if additive homomorphic encryption [1] is applied. In this section, we present the experimental results on evaluating the adversarial robustness of our proposed algorithm. Code is available at an anonymous link. We evaluate our proposed algorithm on four data sets: MNIST, Fashion-MNIST, CIFAR-10 and CIFAR-100. Following [25], we consider two methods to partition the data over the clients: IID and non-IID. For the IID setting, the data is shuffled and uniformly partitioned across the clients. For the non-IID setting, the data is divided into $2K$ shards based on sorted labels, and each client is then assigned 2 shards. Thereby, each client will have data with at most two classes. (1) Centralized: training with one centralized model, which is identical to the federated learning case that only has one client ($K = 1$) with full participation fraction. (2) FedAvg: the classical federated averaging model [25]. (3) FedAvg_AT: the simplified version of our proposed method where the local clients perform adversarial training with the asymmetrically transmitted perturbed data generated on top of FedAvg's aggregation. (4)-(6) Fed_Bias, Fed_Variance, Fed_BVA: our proposed methods where the asymmetrically transmitted perturbed data is generated using the gradients of the bias-only attack, variance-only attack, and bias-variance attack, respectively. (7) EAT: ensemble adversarial training [40, 50], where each client performs local adversarial training, and their model updates are aggregated on the server using FedAvg. (8) EAT+Fed_BVA: a combination of baselines (6) and (7). Note that baselines (7) and (8) have high computational requirements on the client devices, and are usually not preferred in real scenarios. For fair comparisons, all baselines are modified to the asymmetrical communication setting (FedAvg and EAT receive the clean $D_s$), and all their initializations are set to be the same. For the Fed_BVA framework, we use a 4-layer CNN model for MNIST and Fashion-MNIST, and the VGG9 architecture for CIFAR-10 and CIFAR-100. The training is performed using the SGD optimizer with a fixed learning rate of 0.01 and a momentum of 0.9. The trade-off coefficient between bias and variance is set to $\lambda = 0.01$ for all experiments. All hyper-parameters of federated learning are presented in Table 9 in the appendix. We empirically demonstrate that these hyper-parameter settings are preferable in terms of both training accuracy and robustness (see the details of Fig. 3-Fig. 5 in Appendix A.4). To demonstrate the robustness of our Fed_BVA framework, we evaluate the deployed server model on the test set against the adversarial attacks FGSM [12] and PGD [24] with 10 and 20 steps (i.e., PGD-10, PGD-20). Following [40, 43], the maximum perturbations allowed are $\epsilon = 0.3$ on MNIST and Fashion-MNIST, and $\epsilon = 16/255$ on CIFAR-10 and CIFAR-100 for both the threat and defense models. To analyze the properties of our proposed Fed_BVA framework, we present two visualization plots on MNIST using a trained CNN model, where the bias and variance are both calculated on the training examples. In Fig. 1(a), we visualize the gradients extracted by the adversarial attacks from bias, variance, and bias-variance.
Notice that the gradients of bias and variance are similar but with subtle differences in local pixel areas. However, according to Theorem 5.1, the gradient calculations of these two are quite different: bias requires the target label as input, but variance only needs the model output and the main prediction.

[Table fragment: Method / Clean, FGSM, PGD-20 accuracy under IID and non-IID; row Centralized: 0.991±0.000, 0.689±0.000, 0.182±0.000, n/a, n/a, n/a. Table 3: Accuracy of CIFAR-10 and CIFAR-100 under white-box attacks.]

From another perspective, we also investigate the bias-variance magnitude relationship with varying model complexity. As shown in Fig. 1(b), with increasing model complexity (more convolutional filters in the CNN), both bias and variance decrease. This result is different from the double-descent curve or bell-shaped variance curve claimed in [3, 48]. The reasons are twofold: First, their bias-variance definitions are from the MSE regression decomposition perspective, whereas our decomposition utilizes the concept of the main prediction, and the generalization error is decomposed from the classification perspective; Second, their implementations only evaluate the bias and variance using training batches on one central model, and are thus different from the definition, which requires the variance to be estimated from multiple sub-models (in our scenario, the client models). The convergence plot of all baselines is presented in Fig. 2(a). We observe that FedAvg has the best convergence, and all robust training variants have a slightly higher loss upon convergence. This matches the observations in [24], which states that training performance may be sacrificed in order to provide robustness for small-capacity networks. For the model performance shown in Fig. 2(b), we observe that the aggregation of federated learning is vulnerable to adversarial examples with increasing communication rounds. We also observe that when the communication rounds reach 40, Fed_BVA starts to outperform EAT, while the latter is even more resource-demanding than Fed_BVA (shown in Fig. 2(c), where the pie plot size represents the running time). Overall, bias-variance based adversarial training via asymmetric communication is both effective and efficient for robust federated learning. For the comprehensive experiments in Table 1 and Table 2, it is easy to verify that our proposed model outperforms all other baselines regardless of the source of the perturbed examples (i.e., locally generated as in EAT+Fed_BVA or asymmetrically transmitted from the server as in Fed_BVA). In this case, BV-FGSM (see Eq. (5)) is used to generate the adversarial examples $\hat{D}_s$ during model training for robust federated learning. Compared to the standard robust federated learning baseline FedAvg_AT, the performance of Fed_BVA against adversarial attacks still increases by 4%-13% and 2%-9% under the IID and non-IID settings respectively, although Fed_BVA is theoretically suited to the case where the clients have IID samples. In Table 3, we observe a similar trend where Fed_BVA outperforms FedAvg_AT on CIFAR-10 and CIFAR-100 (with 0.2%-10% increases) when defending against different types of adversarial examples. Compared to the strong local adversarial training baseline EAT, we also observe a maximum 13% accuracy increase when applying its bias-variance oriented variant EAT+Fed_BVA. Overall, the takeaway is that without local adversarial training, using a bias-variance based robust learning framework could outperform other baselines for defending against FGSM and PGD attacks.
When local adversarial training is allowed (e.g., the client device has powerful computational ability), using bias-variance robust learning with local adversarial training will mostly have the best robustness. 6.3.1 MSE loss vs. CE loss. Both the cross-entropy (CE) and mean squared error (MSE) loss functions could be used for training a neural network model. In our paper, the loss function of the neural network determines the derivation of the bias and variance terms used for producing the adversarial examples. We experimentally compare the CE and MSE loss functions in our framework. Table 4 reports the adversarial robustness of our federated framework w.r.t. the FGSM attack ($\epsilon = 0.3$) on MNIST under the IID setting. It is observed that (1) our framework with MSE has a significantly larger running time; (2) the robustness of our framework with MSE becomes slightly weaker, which might be induced by the weakness of MSE in training neural networks under the classification setting. 6.3.2 BV-FGSM vs. BV-PGD. Table 5 provides our results w.r.t. FGSM and PGD attacks ($\epsilon = 0.3$) on MNIST under the IID and non-IID settings. Compared to FedAvg, our framework Fed_BVA with either BV-FGSM or BV-PGD could largely improve the model robustness against adversarial noise. Furthermore, BV-PGD could potentially improve white-box robustness against multi-step attacks, but it is often more computationally demanding. As a comparison, BV-FGSM is more robust against single-step attacks. 6.3.3 Alternative Training Strategies for Fed_BVA. In our Fed_BVA framework, we propose to maximize the overall generalization error induced by the bias and variance from different clients for adversarial example generation. Under this setting, the generated adversarial examples on the server are shared with all the clients for local adversarial training. In particular, we also found that when using the CE loss, the estimated gradients of both bias and variance can be considered as the average of the clients' local gradients over the input (see Subsection 5.1). This motivates us to consider several alternative training strategies by generating client-specific adversarial examples on the server. To be more specific, we have the following three training strategies: • S1: We generate the adversarial examples $\hat{D}_s$ to maximize the bias and variance from all clients' predictions. In this case, the generated adversarial examples on the server will be shared with the local clients. This is the strategy used in our Fed_BVA algorithm. It guarantees the minimization of the generalization error from the perspective of bias-variance decomposition, thus leading to an adversarially robust federated learning model. • S2: We generate client-specific adversarial examples on the server, where the perturbation sent to each client is computed using that client's own gradient direction. • S3: It is a special case of S2 with $\lambda = 0$. In this case, every client will only use its own model parameters to generate the client-specific adversarial examples on the server. This strategy is actually a simple extension of adversarial training (with centralized data) in the federated learning setting. We conduct an ablation study to compare the different training strategies in our Fed_BVA framework. In this case, we use 10 clients with a participation fraction of 1 and 5 local epochs. Other hyper-parameters and the model architecture setting are the same as in our previous experiments. Table 6 provides the adversarial robustness of our Fed_BVA framework on MNIST under both IID and non-IID settings. It is observed that Fed_BVA with S1 has the best robustness in most cases compared to the other heuristic training strategies S2 and S3.
This indicates that bias and variance provide a better direction to generate the adversarial examples for robust training. In contrast, generating the adversarial examples with individual directions in S2 for each client might be suboptimal. Using black-box attacks, we test the transferability of a single-step attack (i.e., FGSM [12]) and a multi-step attack (i.e., PGD [22]) on various federated learning baseline models. For the CIFAR-10 data set, we pretrained ResNet18 [14], VGG11 [37], Xception [6], and MobileNetV2 [35] as the source threat models for generating the single-step and multi-step adversarial examples. The black-box transfer attack results are shown in Table 7 (see more results in the appendix). We observe that without any adversarial training, FedAvg will suffer a maximum of 28% accuracy loss. For comparison, the robust federated learning models with globally transmitted perturbed samples (i.e., FedAvg_AT, Fed_Bias, Fed_Variance, Fed_BVA) have increased robustness, with a maximum 23% accuracy drop for the best baseline. For the computation-demanding local robust training methods (i.e., EAT and EAT+Fed_BVA), the maximum accuracy drop is only 3%. Thus, it is straightforward to see that CIFAR-10 is more vulnerable to multi-step black-box adversarial attacks, but the local adversarial training methods could improve its robustness. In this paper, we proposed a novel robust federated learning framework, in which the loss incurred during the server's aggregation is dissected into a bias part and a variance part. Our approach improves the model robustness through adversarial training by supplying a few bias-variance perturbed samples to the clients via asymmetrical communication. Extensive experiments have been conducted where we evaluated its performance from various aspects on several benchmark data sets. In the appendix, we provide the notation summary, the reproducibility settings, and additional experimental results. Table 8 summarizes the main notation, including the expected bias, variance and noise, the perturbation constraint $\Omega(x)$, the perturbation magnitude $\epsilon$, the adversarial counterpart $\hat{x}$ of $x$ (i.e., $\hat{x} \in \Omega(x)$), and the entropy $H(\cdot)$. For the CIFAR-10 and CIFAR-100 data sets (we use the 20-class version of CIFAR-100 with coarse labels), we deployed 20 clients (each with 5% of the overall data), since the data examples have more variations in terms of their categories and cluttered backgrounds. Among these clients, 20% would be selected for model updating (with 5 local epochs, since we utilized deeper models with longer training time per epoch). Similarly, we only transmitted 30 or 60 examples from the server to the clients for CIFAR-10 or CIFAR-100. Due to the complexity of CIFAR's data distributions, we enforce $D_s$ to be spread out over all categories (i.e., both CIFAR-10 and CIFAR-100 with coarse labels will have 3 shared examples among these clients per category). An alternative method is to select $D_s$ randomly from the data sets; in this scenario, we only observe a slight performance drop in all settings, and the model behavior remains unchanged. For CIFAR-10, we choose the size of $D_s$ to be only 0.05% of the data set's size. Using black-box attacks, we test the transferability of a single-step attack (i.e., FGSM [12]) and a multi-step attack (i.e., PGD [22]) on various federated learning baseline models. For MNIST and Fashion-MNIST, the architectures of the threat models are shown in Table 10. For MNIST under the IID|non-IID settings (see Table 11 and Table 12), we observe a maximum 27%|28% accuracy drop on FedAvg under various black-box attacks.
As a comparison, the robust federated learning models with globally transmitted perturbed samples (i.e., FedAvg_AT, Fed_Bias, Fed_Variance, Fed_BVA) have increased robustness, with a maximum 12%|14% accuracy drop for the best baselines. For the more computation-demanding local robust training methods (i.e., EAT and EAT+Fed_BVA), the maximum accuracy drop is only 6%|11%, respectively. For the Fashion-MNIST data set under the IID|non-IID settings (see Table 13 and Table 14), similar trends are observed. We see that without any adversarial training, FedAvg will suffer a maximum of 74%|64% accuracy loss. For comparison, the robust federated learning models with globally transmitted perturbed samples (i.e., FedAvg_AT, Fed_Bias, Fed_Variance, Fed_BVA) have increased robustness, with a maximum 41%|17% accuracy drop for the best baselines. For the computation-demanding local robust training methods (i.e., EAT and EAT+Fed_BVA), the maximum accuracy drop is only 27%|14%, respectively. On both the MNIST and Fashion-MNIST data sets, we observe that the adversarial examples generated using single-step attacks (i.e., FGSM) have higher transferability than those generated using multi-step attacks (i.e., PGD). That is, the accuracy after the PGD-20 black-box attack is higher than the accuracy after the FGSM black-box attack. This phenomenon is observed on all results of Table 11-Table 14, and it matches the conclusion of [40]. In this part, we perform a parameter analysis regarding the key hyper-parameters. In Fig. 3, we see that FedAvg (i.e., $n_s = 0$) has the best clean accuracy, as we expected. For Fed_BVA with varying sizes of the asymmetrically transmitted perturbed samples (i.e., $n_s$ = 8, 16, 32, 64), its clean performance drops slightly with increasing $n_s$ (an average drop of 0.05% per plot). As a comparison, the robustness on the test set increases dramatically with increasing $n_s$ (the improvement ranges from 18% to 22% under the FGSM attack and from 15% to 60% under the PGD-20 attack). However, choosing a large $n_s$ would yield high model robustness but also suffer from a high communication cost. In our experiments, we choose $n_s = 64$ for MNIST as the ideal trade-off point. We also care about the choice of the momentum in the SGD optimizer. As shown in Fig. 4(a), the accuracy of clean training monotonically increases when the momentum is varied from 0.1 to 0.9. Interestingly, we also observe from Fig. 4(b) and Fig. 4(c) that the federated learning model is less vulnerable when the momentum is large, no matter whether the adversarial attack on the test set is FGSM or PGD-20. This suggests choosing the momentum as 0.9 throughout all the experiments. A.4.3 Local epochs. Another important factor of federated learning is the number of local epochs. In Fig. 5, we see that more local epochs in each client lead to a more accurate aggregated server model in prediction. Similarly, its robustness against both FGSM and PGD-20 attacks on the test set also has the best performance when the number of local epochs on the device is large. Hence, in our experiments, if the on-device computational cost is not very high, we choose 50 local epochs; otherwise (e.g., large data example size, deep models with many layers), we reduce it to a smaller number accordingly.
References
[1] A survey on homomorphic encryption schemes: Theory and implementation.
[2] Understanding Double Descent Requires A Fine-Grained Bias-Variance Decomposition.
[3] Reconciling modern machine-learning practice and the classical bias-variance trade-off.
[4] Towards evaluating the robustness of neural networks.
[5] Cronus: Robust and Heterogeneous Collaborative Learning with Black-Box Knowledge Transfer.
[6] Xception: Deep Learning with Depthwise Separable Convolutions.
[7] Byzantine-resilient high-dimensional SGD with local iterations on heterogeneous data.
[8] A unified bias-variance decomposition and its applications.
[9] Double trouble in double descent: Bias and variance(s) in the lazy regime.
[10] Local model poisoning attacks to Byzantine-robust federated learning.
[11] Neural networks and the bias/variance dilemma.
[12] Explaining and harnessing adversarial examples.
[13] Federated learning for mobile keyboard prediction.
[14] Deep Residual Learning for Image Recognition.
[15] MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications.
[16] FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Analysis.
[17] Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation under Non-IID Private Data.
[18] Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning.
[19] Scaffold: Stochastic controlled averaging for federated learning (with Sebastian Stich and Ananda Theertha Suresh, 2020).
[20] Bias plus variance decomposition for zero-one loss functions.
[21] Federated learning: Strategies for improving communication efficiency.
[22] Adversarial machine learning at scale.
[23] Ditto: Fair and robust federated learning through personalization.
[24] Towards deep learning models resistant to adversarial attacks.
[25] Communication-Efficient Learning of Deep Networks from Decentralized Data.
[26] Agnostic Federated Learning.
[27] DeepFool: A simple and accurate method to fool deep neural networks.
[28] Robust Federated Learning Through Representation Matching and Adaptive Hyper-parameters.
[29] A survey on security and privacy of federated learning.
[30] A modern take on the bias-variance tradeoff in neural networks.
[31] The role of over-parametrization in generalization of neural networks.
[32] Robust Aggregation for Federated Learning.
[33] Towards Realistic Byzantine-Robust Federated Learning.
[34] Robust Federated Learning: The Case of Affine Distribution Shifts.
[35] MobileNetV2: Inverted Residuals and Linear Bottlenecks.
[36] Adversarial training in communication constrained federated learning.
[37] Very Deep Convolutional Networks for Large-Scale Image Recognition.
[38] Local SGD Converges Fast and Communicates Little.
[39] Intriguing properties of neural networks.
[40] Ensemble adversarial training: Attacks and defenses.
[41] Bias-variance analysis of support vector machines for the development of SVM-based ensemble methods.
[42] Federated Learning with Matched Averaging.
[43] On the Convergence and Robustness of Adversarial Training.
[44] TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning.
[45] Fast is better than free: Revisiting adversarial training.
[46] Intriguing Properties of Adversarial Training at Scale.
[47] Federated machine learning: Concept and applications.
[48] Rethinking bias-variance trade-off for generalization of neural networks.
[49] Improving Semi-supervised Federated Learning by Reducing the Gradient Diversity of Models.
[50] FAT: Federated adversarial training (with Mathieu Sinn and Beat Buesser, 2020).