Frequency domain regularization for iterative adversarial attacks

Authors:

Highlights:

Abstract

Adversarial examples have attracted increasing attention with the prosperity of convolutional neural networks. The transferability of adversarial examples is an important property that makes black-box attacks possible in real-world applications. On the other hand, many adversarial defense methods have been proposed to improve robustness, raising the demand for more transferable adversarial examples. Inspired by the regularization terms imposed on network parameters during training, we treat adversarial attacks as a training process over the inputs and propose a regularization constraint on the inputs that prevents adversarial examples from overfitting the white-box networks and enhances transferability. Specifically, we find a universal attribute: the outputs of convolutional neural networks are consistent with the low-frequency components of their inputs. Based on this, we construct a frequency-domain regularization on the inputs for iterative attacks. In this way, our method is compatible with existing iterative attack methods and can learn more transferable adversarial examples. Extensive experiments on ImageNet validate the superiority of our method; compared with several attacks, we achieve average attack-success-rate improvements of 8.0% on normally trained models and 11.5% on defense methods.
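To make the idea concrete, below is a minimal NumPy sketch of one plausible instantiation of a frequency-domain regularizer inside an iterative (PGD-style) attack step. It penalizes the high-frequency energy of the perturbation so the adversarial example stays consistent with the low-frequency band that CNN outputs are said to depend on. The function names, the circular low-pass mask, and the weight `lam` are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import numpy as np

def reg_and_grad(delta, radius=8):
    """High-frequency energy penalty on a perturbation delta (H, W) and its
    analytic gradient (via Parseval's theorem). Frequencies farther than
    `radius` from the spectrum center are penalized; the radius is an
    illustrative assumption."""
    H, W = delta.shape
    F = np.fft.fftshift(np.fft.fft2(delta))          # centered 2-D spectrum
    cy, cx = H // 2, W // 2
    yy, xx = np.ogrid[:H, :W]
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 > radius ** 2   # high-freq band
    loss = np.sum(np.abs(F[mask]) ** 2) / (H * W)
    # d/d(delta) of (1/HW) * sum_k M_k |F_k|^2  =  2 * Re(IFFT(M * F))
    grad = 2.0 * np.real(np.fft.ifft2(np.fft.ifftshift(mask * F)))
    return loss, grad

def pgd_step(delta, cls_grad, lam=1e-3, alpha=2/255, eps=8/255, radius=8):
    """One iterative-attack update: ascend the classification loss (its
    gradient `cls_grad` is assumed given) while descending the high-frequency
    penalty, then project back into the L-inf ball of radius eps."""
    _, reg_grad = reg_and_grad(delta, radius)
    total = cls_grad - lam * reg_grad
    return np.clip(delta + alpha * np.sign(total), -eps, eps)
```

Because the regularizer's gradient is available in closed form, the extra cost per iteration is two FFTs; any existing iterative attack can add the `- lam * reg_grad` term to its update without other changes.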

Keywords: Adversarial examples, Transfer-based attack, Black-box attack, Frequency-domain characteristics

Article history: Received 27 January 2021, Revised 19 June 2022, Accepted 25 September 2022, Available online 5 October 2022, Version of Record 8 October 2022.

DOI: https://doi.org/10.1016/j.patcog.2022.109075