Depth-2 neural networks under a data-poisoning attack

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 532, pp. 56-66
Main Authors: Karmakar, Sayar; Mukherjee, Anirbit; Papamarkou, Theodore
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.05.2023
ISSN: 0925-2312, 1872-8286
Description
Summary: In this work, we study the possibility of defending against data-poisoning attacks while training a shallow neural network in a regression setup. We focus on supervised learning with realizable labels for a class of depth-2, finite-width neural networks, which includes single-filter convolutional networks. In this class of networks, we attempt to learn the true network weights that generate the labels while a malicious oracle applies stochastic, bounded, additive adversarial distortions to the true labels during training. For the gradient-free stochastic algorithm that we construct, we prove worst-case near-optimal trade-offs among the magnitude of the adversarial attack, the weight approximation accuracy, and the confidence achieved by the proposed algorithm. As our algorithm uses mini-batching, we analyze how the mini-batch size affects convergence. We also show how to scale the outer-layer weights, depending on the probability of attack, to counter data-poisoning attacks on the true labels. Lastly, we present experimental evidence that our algorithm outperforms stochastic gradient descent under different input data distributions, including instances of heavy-tailed distributions.
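
To make the setup concrete, below is a minimal sketch of the training regime the abstract describes: a depth-2 network with realizable labels, a malicious oracle applying stochastic, bounded, additive distortions to those labels, and a gradient-free, Tron-style mini-batch update as an illustrative stand-in for the authors' algorithm. All names, constants, and the exact update rule here are assumptions for illustration, not taken from the paper.

```python
# Sketch only: a depth-2 net of ReLU units sharing one weight vector (a
# single-filter-convolution-like class), trained against a label-poisoning
# oracle with a gradient-free update. Hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d, k = 10, 5                                          # input dim, width
A = [rng.standard_normal((d, d)) for _ in range(k)]   # fixed inner maps
w_star = rng.standard_normal(d)                       # true generating weights

def net(w, x):
    # Depth-2 net: average of ReLU units sharing the weight vector w.
    return np.mean([max(0.0, w @ (Ai @ x)) for Ai in A])

def poisoned_label(x, beta=0.5, attack_prob=0.3):
    # Malicious oracle: with probability attack_prob, add a stochastic,
    # bounded (|distortion| <= beta), additive distortion to the true label.
    y = net(w_star, x)
    if rng.random() < attack_prob:
        y += rng.uniform(-beta, beta)
    return y

def train(steps=2000, batch=32, eta=0.05):
    w = np.zeros(d)
    for _ in range(steps):
        X = rng.standard_normal((batch, d))
        # Gradient-free, Tron-style step: correlate label residuals with a
        # fixed linear function of the inputs instead of the loss gradient.
        resid = np.array([poisoned_label(x) - net(w, x) for x in X])
        w += eta * (resid[:, None] * X).mean(axis=0)
    return w

w_hat = train()
print("distance to true weights:", np.linalg.norm(w_hat - w_star))
```

The batch size in this sketch plays the role of the mini-batch parameter whose effect on convergence the paper analyzes; the bound beta and attack_prob correspond to the attack magnitude and probability of attack mentioned in the abstract.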
DOI: 10.1016/j.neucom.2023.02.034