title: Case-Aware Adversarial Training
authors: Fan, Mingyuan; Liu, Yang; Guo, Wenzhong; Liu, Ximeng; Li, Jianhua
date: 2022-04-20

The neural network (NN) has become one of the most popular models in various signal processing applications. However, NNs are extremely vulnerable to adversarial examples (AEs). Adversarial training (AT) is believed to be the most effective defense against AEs, but its intensive computation limits its use in most applications. In this paper, to resolve the problem, we design a generic and efficient AT improvement scheme, namely case-aware adversarial training (CAT). The intuition stems from the fact that a small fraction of informative samples contributes most of the model performance. Hence, if only the most informative AEs are used in AT, the computation cost of AT can be lowered significantly while the defense effect is maintained. To achieve this, CAT makes two breakthroughs. First, a method to estimate the information degree of adversarial examples is proposed for AE filtering. Second, to further enrich the information that the NN can obtain from AEs, CAT adopts a sampling strategy based on weight estimation and class-level balancing to increase the diversity of AT at each iteration. Extensive experiments show that CAT is up to 3x faster than vanilla AT while achieving a competitive defense effect.

Over the past decade, the neural network (NN) has become the most outstanding approach in a wide range of signal processing tasks due to its remarkable data processing ability. However, despite the promising performance, NNs are extremely vulnerable to attacks with adversarial examples (AEs) [1, 2, 3, 4, 5]. By adding human-imperceptible, carefully crafted tiny noises, AEs can trick the victim model into behaving as desired by the attacker [6, 7, 8, 9]. Due to the great security threat caused by AE attacks, effective AE defense methods are increasingly needed in applications. (This work is supported by the National Natural Science Foundation of China, No. 62072109 and No. U1804263.) Currently, three kinds of methods are mainly explored for AE defense: AE detection [10], data preprocessing [11], and adversarial training (AT) [12]. Among them, AE detection aims to reveal malicious inputs mixed in with normal ones and avert the attack before it happens. Data preprocessing achieves AE defense by transforming malicious inputs into normal inputs. However, both of these two kinds of defenses have been proved breakable [13]. AT therefore remains the only approach believed to intrinsically raise the robustness of NNs against AEs [14]. AT achieves AE defense in a straightforward manner: the model trainer actively adds AEs into the training set so that the model learns how to classify them correctly. However, due to the extra cost of AE generation and training on the generated AEs, the cost of AT can be dozens of times that of normal training. Considering the long time needed to train an NN, such as training a VGG-16 network on ImageNet, the applicable range of the naive AT method [12] is very limited. In this paper, we propose a novel method to accelerate the AT process, called case-aware adversarial training (CAT).
The design of CAT is first motivated by the finding from active learning [15, 16] that the decision boundaries of an NN model can be strongly affected by a small number of information-rich training examples. Thus, instead of using AEs transformed from all training samples, a more efficient way to perform AT is to actively select some informative AEs and only use them in the AT process. However, straightforward as the idea is, it cannot be applied effortlessly in practice because the information gain of an AE cannot be derived directly from the original example. Otherwise, to select informative AEs, we would have to craft an AE for every sample, which is still computation-intensive. To overcome this drawback, CAT makes two breakthroughs. First, we observe that the AEs of a given example over successive iterations exhibit strong similarity. This suggests that previously crafted AEs can be used to approximately estimate the importance of the example, without crafting its AEs from scratch, so that the cost of AT can be greatly reduced. Second, even if each sampled example contains rich information, the sampled examples may be quite similar to each other, decreasing the overall effective information contained in the mini-batch. To alleviate this issue, CAT leverages two measures for increasing the diversity of the examples in the mini-batch, i.e., weighted sampling without replacement and class-level balancing. Moreover, we highlight that CAT is orthogonal to other AT techniques and can be easily integrated into other AT frameworks.

Our contributions can be summarized as follows:
• We propose a novel scheme to accelerate AT, which can be used against AE attacks. To this end, we design a weighted sampling strategy that involves an information gain estimation method to evaluate the importance of an AE to adversarial training.
• We discover that during AT, the AEs of a sample behave similarly across iterations. Based on this discovery, we can improve the weighted sampling process by deriving the gain of each AE from its previous counterparts.
• We conduct extensive experiments to examine the effectiveness of the proposed scheme, and the results show that CAT is up to 3x faster than the conventional adversarial training scheme with competitive performance.

This paper leverages adversarial training to achieve AE defense. Formally, given the training set D = {(x_1, y_1), ..., (x_n, y_n)} and an NN F_θ, the objective of traditional adversarial training can be formulated as follows:

min_θ (1/n) Σ_{i=1}^{n} max_{||δ_i||_∞ ≤ ε} L(F_θ(x_i + δ_i), y_i),   (1)

where ||·||_∞ denotes the ℓ_∞-norm of the inputs and L(·, ·) indicates the loss function (e.g., cross-entropy loss). The noises δ_i used to synthesize AEs are generated by Eq. 2:

δ_i^{t+1} = Proj_ε( δ_i^{t} + η · sign(∇_{δ_i^{t}} L(F_θ(x_i + δ_i^{t}), y_i)) ),  t = 0, ..., m−1,   (2)

where Proj_ε(·) projects the inputs into ε-balls [17], η is the step size, and m is the number of iterations required to generate the noise δ_i. From Eq. 1, it can be observed that adversarial training enhances the robustness of F_θ against AEs by making it learn to classify these AEs correctly in the training stage. However, since past adversarial training schemes [12, 18] require all training samples to generate AEs, their cost can be tens of times that of normal training, which is intolerable for many applications. To break this bottleneck, we propose the CAT scheme. Compared with traditional schemes [12], the basic improvement of CAT (Algorithm 1) is a novel sampling strategy that selects informative AEs to maximize the gains of the model at each iteration.
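To make the baseline concrete before detailing the sampling strategy, the sketch below shows a minimal PyTorch-style implementation of the vanilla PGD-based AT loop described by Eq. 1 and Eq. 2. It is an illustration under our own assumptions: the function names (pgd_perturb, vanilla_at_step) and the hyperparameter values (eps, step_size, m) are hypothetical, not settings reported in the paper.

```python
# Minimal sketch of vanilla PGD adversarial training (Eq. 1 and Eq. 2).
# Hyperparameter values are illustrative only.
import torch
import torch.nn.functional as F


def pgd_perturb(model, x, y, eps=8 / 255, step_size=2 / 255, m=10):
    """Inner maximization of Eq. 1: m projected gradient steps inside the L_inf eps-ball (Eq. 2)."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(m):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        with torch.no_grad():
            delta += step_size * grad.sign()               # gradient-sign ascent step
            delta.clamp_(-eps, eps)                        # Proj_eps: stay inside the eps-ball
            delta.copy_(torch.clamp(x + delta, 0, 1) - x)  # keep x + delta a valid image
    return (x + delta).detach()


def vanilla_at_step(model, optimizer, x, y):
    """Outer minimization of Eq. 1 on one mini-batch: train the model on the crafted AEs."""
    x_adv = pgd_perturb(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Vanilla AT applies this step to AEs crafted from every sample in every mini-batch; CAT keeps the same inner loop but changes which samples enter it.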
Specifically, the sampling strategy is mainly composed of two parts: 1) identifying the samples with the highest information gains; 2) maximizing the diversity of the samples at each training iteration.

Algorithm 1 CAT
Input: F: the neural network; N: the number of iterations; D: the training dataset.
Output: F_θ: the adversarially trained neural network.
1: Initialize the weight of each sample in D.
2: for i = 1 to N do
3:   Sample a batch of data X, Y from D based on the information gains computed by Eq. 3.
4:   Craft adversarial examples X_adv for X.
5:   Update the parameters θ_i of F with X_adv, Y.
6:   Update the weights of X with X_adv based on Eq. 4.
7: end for

Measuring the information gain. Instead of using the AEs crafted from every x_i ∈ D as defined in Eq. 1, CAT only uses the AEs with high information gains during the training process to save computation cost. Here, the information gain w_i of an AE x_i + δ_i is measured according to the fact that the higher the uncertainty of a sample is for a model, the more the model can gain from the sample (see uncertainty sampling [19]). Based on this idea, w_i is computed as follows:

w_i = max_{1 ≤ k ≤ K, k ≠ y_i} log(F_θ(x_i + δ_i)[k]) − log(F_θ(x_i + δ_i)[y_i]),   (3)

where K is the number of classes, log(F_θ(·)[k]) denotes the log-likelihood of the k-th class, and y_i is the ground-truth label. Intuitively, Eq. 3 describes the uncertainty of x_i + δ_i by computing its maximum likelihood distance. A higher w_i indicates lower confidence of F_θ in correctly classifying x_i + δ_i (i.e., a higher probability of classifying the AE into another class). However, the direct application of Eq. 3 would require the defender to prepare an AE for each x_i ∈ D from the start in order to obtain w_i, which is still costly. CAT avoids this problem by exploiting the property that the model outputs are similar for AEs crafted from the same sample at adjacent iterations. As illustrated in Fig. 1, the AEs crafted in previous iterations are fairly comparable (in terms of attack success rate and model prediction) to the AEs crafted in the current iteration. This suggests that the importance of an example in the current iteration can be approximately evaluated from its previous adversarial versions, without crafting a new adversarial example for it. Therefore, the sampling weight of the i-th training example in the t-th iteration can be reformulated as follows:

w_i^t = α · w_i^{t−1} + (1 − α) · w_i,   (4)

where w_i^t represents the weighted accumulation of the threat intensity of previous AEs of the i-th training example, w_i is the Eq. 3 score of the AE crafted for the example in the current iteration, and α is a hyperparameter (w_i^t = w_i^{t−1} if an example is not selected in the t-th iteration). The higher α is, the more attention the model pays to previous versions of AEs.

Practical Tricks. It is well known that NNs are usually trained with mini-batch stochastic gradient descent. However, after applying the weighted sampling strategy mentioned above, a mini-batch of data can contain repeated samples (or samples of the same class), which lowers the learning efficiency of AT according to active learning theory [19]. To address this problem, we improve the sampling strategy used in CAT from two aspects. First, CAT samples without replacement to avoid repeated samples. Second, we introduce a class-level balancing strategy into CAT to prevent the samples used at each iteration from concentrating in a single class. In more detail, the class-level balancing strategy evenly allocates the number of samples for each class during the sampling process, so every mini-batch is ensured to contain samples of each class.
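As a concrete illustration of Algorithm 1's sampling machinery, the sketch below implements one possible reading of Eq. 3 (a log-likelihood margin score), Eq. 4 (an exponential-moving-average weight update), and the two practical tricks (class-level balancing plus weighted sampling without replacement). All function names and the value of alpha are our own illustrative assumptions, not the authors' reference implementation.

```python
# Illustrative sketch of CAT's sampling components (Algorithm 1, Eq. 3 and Eq. 4).
import torch
import torch.nn.functional as F


def information_gain(model, x_adv, y):
    """One reading of Eq. 3: margin between the best wrong-class log-likelihood and the
    true-class log-likelihood; larger values mean more uncertain, more informative AEs."""
    log_p = F.log_softmax(model(x_adv), dim=1)              # log F_theta(x + delta)[k]
    true_class = log_p.gather(1, y.unsqueeze(1)).squeeze(1)
    wrong_best = log_p.scatter(1, y.unsqueeze(1), float("-inf")).max(dim=1).values
    return (wrong_best - true_class).detach()


def update_weights(weights, idx, gains, alpha=0.9):
    """One reading of Eq. 4: exponential moving average over the AEs of the same sample.
    Samples not selected in this iteration keep their previous weight unchanged."""
    weights[idx] = alpha * weights[idx] + (1 - alpha) * gains
    return weights


def sample_balanced_batch(weights, labels, num_classes, batch_size):
    """The two practical tricks: class-level balancing and weighted sampling without replacement."""
    per_class = batch_size // num_classes
    picked = []
    for c in range(num_classes):
        idx_c = (labels == c).nonzero(as_tuple=True)[0]
        probs = weights[idx_c].clamp(min=1e-8)              # guard against zero/negative weights
        picked.append(idx_c[torch.multinomial(probs, per_class, replacement=False)])
    return torch.cat(picked)
```

A CAT iteration would then call sample_balanced_batch to pick the mini-batch (step 3 of Algorithm 1), craft AEs only for that batch (step 4), update the model on them (step 5), and finally call information_gain and update_weights to refresh the weights of the selected samples (step 6).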
We extensively examine the effectiveness of CAT compared with vanilla AT [12] on two widely used benchmark datasets, MNIST and CIFAR-10. We adopt ResNet18 as the base model throughout the experiments, and for fair comparison we set the same training parameters for CAT and AT. To evaluate the performance of CAT, we adopt natural accuracy (accuracy on natural samples), robust accuracy (accuracy on AEs), and the number of iterations for convergence as metrics. The mini-batch size is set to 128 by default. Following [20], the AEs are all crafted with PGD using 20 iterations and 10 restarts.

Overall results. First, we evaluate the overall performance of our method on the three main metrics, as reported in Tables 1 and 2. For MNIST and CIFAR-10, the accuracy and robust accuracy are evaluated at 1-iteration and 10-iteration intervals, respectively. The baseline points for MNIST use a larger span because it is an easy-to-learn dataset. The speed ratio denotes the ratio between the numbers of iterations required for the model to converge with our method and with AT. From the results, CAT achieves a faster convergence speed in all settings, and the speedups are especially significant on the more sophisticated dataset CIFAR-10. For instance, CAT requires only 250 iterations to reach 60% robust accuracy, whereas AT requires more than three times as many iterations to converge under the same condition. Likewise, in terms of accuracy, the iterations required to reach the different baseline points are considerably reduced (about 2∼5x faster than AT).

A closer look at CAT. According to our design, a crucial factor that affects the performance of CAT is the number of AEs sampled at each iteration. Here, we examine the performance of CAT over various sampling numbers on MNIST and CIFAR-10, as shown in Figures 2 and 3 (MNIST for 100 iterations and CIFAR-10 for 3000 iterations). In these experiments, all hyperparameters follow the aforementioned settings. First, we observe that to achieve identical accuracy or robust accuracy (i.e., the same defense effect), CAT always requires considerably fewer iterations than AT to converge. The improvement mainly benefits from the sampling strategy used by CAT, which selectively filters informative samples to train the model instead of treating each sample equally. This characteristic also means that the speed-up effect of CAT is more striking on more complicated learning tasks; thus, as shown in Figures 2 and 3, the convergence speedup on CIFAR-10 is larger than on MNIST. Second, it can be observed that there is a trade-off between the convergence speed and the sampling number for CAT. Specifically, the convergence speed rises with an increasing sampling number when the sampling number is below a certain threshold, while further increasing the sampling number instead weakens the efficiency of CAT. For example, on CIFAR-10, the convergence speed rises with the sampling number until its peak at around 256, and then gradually decreases as the sampling number increases further. The situation is understandable. On the one hand, if the sampling number is small, only a few samples are involved in each iteration, i.e., only the weights of a few samples can accurately reflect the actual gains brought to the model.
In other words, the weights based on adversarial examples crafted in previous iterations can no longer accurately (or even approximately) measure the constructive information contained in the AEs, because the model is updated too frequently relative to the weight refreshes. On the other hand, as the sampling number increases substantially, the performance gap between weighted sampling and random sampling gradually vanishes. Loosely speaking, the highly informative AEs can already be sufficiently sampled with the current sampling number; if the sampling number is increased further, the additional AEs can only provide a very limited contribution to the model.

In this paper, inspired by the philosophy of active learning, we presented a novel method, case-aware adversarial training (CAT), to improve the efficiency of adversarial training. The core of the method is to relieve the inherent major flaw (high computation cost) of adversarial training by selecting samples with rich information at each iteration. During this process, the weights of adversarial examples cannot be accurately derived from the original examples. To overcome the problem, we proposed a likelihood-based method to measure the information gain of AEs. Moreover, we introduced two practical tricks to improve the diversity of each adversarial training mini-batch. Extensive experiments were conducted to validate the effectiveness of CAT, and the results showed that CAT can significantly accelerate the existing adversarial training method. Finally, we highlight that CAT is a generic training scheme and can be effortlessly combined with other adversarial training methods.

References:
[1] PID-based approach to adversarial attacks.
[2] Simple and efficient hard label black-box adversarial attacks in low query budget regimes.
[3] SurFree: a fast surrogate-free black-box attack.
[4] Composite adversarial attacks.
[5] Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks.
[6] Adversarial examples-security threats to COVID-19 deep learning systems in medical IoT devices.
[7] Adversarial examples-security threats to COVID-19 deep learning systems in medical IoT devices.
[8] Adversarial examples on object recognition: a comprehensive survey.
[9] Fooling automated surveillance cameras: adversarial patches to attack person detection.
[10] Detecting adversarial attacks via subset scanning of autoencoder activations and reconstruction error.
[11] Defense-GAN: protecting classifiers against adversarial attacks using generative models.
[12] Towards deep learning models resistant to adversarial attacks.
[13] Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples.
[14] Bag of tricks for adversarial training.
[15] Bayesian active learning for classification and preference learning.
[16] Deep Bayesian active learning with image data.
[17] Towards deep learning models resistant to adversarial attacks.
[18] Improving adversarial robustness via channel-wise activation suppressing.
[19] Active learning literature survey.
[20] Adversarial training for free!