A stealthy and robust backdoor attack via frequency domain transform


Detailed Bibliography
Published in: World Wide Web (Bussum), Volume 26, Issue 5, pp. 2767-2783
Main Authors: Hou, Ruitao; Huang, Teng; Yan, Hongyang; Ke, Lishan; Tang, Weixuan
Format: Journal Article
Language: English
Publication Details: New York: Springer US, 01.09.2023
Springer Nature B.V.
ISSN: 1386-145X, 1573-1413
Description
Summary: Deep learning models are vulnerable to backdoor attacks, in which an adversary injects a hidden backdoor into a model so that the victim model performs well on clean data but outputs predefined wrong results on data containing specific triggers (e.g., a pattern or a specific accessory). While existing attack methods are effective, they are commonly neither stealthy nor robust: the backdoor triggers are unnatural and easily detected, and they can hardly withstand data augmentation operations. To address these issues, this paper explores new attack methods that significantly improve the stealthiness and robustness of backdoor attacks. Specifically, inspired by digital watermarking techniques, we propose two backdoor trigger injection algorithms based on the discrete Fourier transform (DFT) and the discrete cosine transform (DCT). These algorithms inject the trigger in the frequency domain instead of the spatial domain, ensuring stealthiness. Moreover, they divide the original data into multiple blocks and inject the trigger into each block, improving robustness. We experimentally evaluated the proposed methods on the GTSRB and CIFAR10 datasets, and the results demonstrate that our methods remarkably improve the stealthiness and robustness of backdoor attacks without compromising effectiveness. For example, on GTSRB, compared with BadNets and Blend, our methods generate more natural-looking poisoned data and improve robustness by at least 80.99%, 68.09%, 25.49%, and 63.31% under random horizontal flip, random vertical flip, random cropping (padding=2), and random cropping (padding=4), respectively.
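The abstract describes the core mechanism: triggers embedded in frequency-domain (DFT/DCT) coefficients of multiple image blocks rather than in pixel space. The paper's exact coefficient positions and embedding strengths are not given in this record, so the following is only a minimal Python sketch of one plausible DCT-based variant; the function name inject_dct_trigger, the block size, the mid-frequency slot, and the strength value are all illustrative assumptions, not the authors' algorithm.

import numpy as np
from scipy.fft import dctn, idctn

def inject_dct_trigger(image, trigger, block_size=8, strength=0.05):
    # Hypothetical sketch: embed `trigger` into the mid-frequency DCT
    # coefficients of every block_size x block_size block of a grayscale
    # image. Repeating the injection in every block is what the abstract
    # credits for robustness to flipping and cropping.
    poisoned = image.astype(np.float64)
    th, tw = trigger.shape
    h, w = image.shape
    for y in range(0, h - block_size + 1, block_size):
        for x in range(0, w - block_size + 1, block_size):
            block = poisoned[y:y + block_size, x:x + block_size]
            coeffs = dctn(block, norm="ortho")
            # Perturb mid-frequency coefficients (skipping the DC term):
            # small changes there are nearly invisible in the spatial
            # domain, which is the claimed source of stealthiness.
            coeffs[1:1 + th, 1:1 + tw] += strength * trigger
            poisoned[y:y + block_size, x:x + block_size] = idctn(coeffs, norm="ortho")
    return np.clip(poisoned, 0, 255).astype(np.uint8)

# Illustrative usage on a random 32x32 image (CIFAR10-sized, one channel).
rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
trigger = rng.standard_normal((3, 3)) * 50.0
poisoned = inject_dct_trigger(clean, trigger)

A DFT-based variant would presumably perturb magnitude coefficients in conjugate-symmetric pairs so that the inverse transform stays real-valued; the record does not specify which coefficients either algorithm actually modifies.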
DOI: 10.1007/s11280-023-01153-3