Deep Learning Poison Data Attack Detection

Title: Deep Learning Poison Data Attack Detection
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Chacon, H., Silva, S., Rad, P.
Conference Name: 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)
Date Published: Nov. 2019
ISBN Number: 978-1-7281-3798-8
Keywords: adversarial information, AI Poisoning, attacking training data, Bayesian statistic, CNN model, computer network security, Deep Learning, deep learning poison data attack detection, deep neural networks, Entropy, Human Behavior, learning (artificial intelligence), Maximum Entropy method, maximum entropy principle, MNIST data, model definitions, network attack, neural nets, poisoned training data, poisonous data, pre-trained model parameters, pubcrawl, resilience, Resiliency, Scalability, system-critical applications, testing data, training phase, transfer learning, Variational inference, variational inference approach

Deep neural networks are widely used across many domains. Techniques such as transfer learning allow a network pre-trained on one task to be retrained for a new one, often with far less data. Users typically have access to the pre-trained model parameters, the model definition, and testing data, but have limited or no access to the training data. This is risky for system-critical applications, where adversarial information can be maliciously injected during the training phase to attack the system. Determining whether, and to what degree, a model has been attacked is challenging. In this paper, we present evidence that adversarially attacking the training data expands the boundary of the model parameters, using a CNN model trained on the MNIST data set as a test case. This expansion is due to new characteristics that the poisonous data adds to the training data. Approaching the problem from the feature space learned by the network yields a relation between those features and the parameter values the model can take during training. We propose an algorithm that determines whether a given network was attacked during training by comparing the boundaries of the parameter distributions in the intermediate layers of the model, estimated using the Maximum Entropy Principle and a variational inference approach.
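The abstract's core idea can be illustrated with a minimal sketch: compare the "boundary" of a layer's parameter distribution in a suspect model against a trusted reference, flagging layers whose spread has expanded. This is not the authors' algorithm; it assumes a maximum-entropy Gaussian fit (whose entropy depends only on the variance) instead of the paper's variational inference, and the function names, the 3-sigma boundary, and the expansion threshold `tol` are illustrative choices, not from the paper.

```python
import numpy as np

def layer_entropy(weights):
    # Differential entropy of the max-entropy (Gaussian) fit to the weights;
    # for a Gaussian this depends only on the variance.
    var = np.var(weights)
    return 0.5 * np.log(2.0 * np.pi * np.e * var)

def boundary(weights, k=3.0):
    # Illustrative "parameter boundary": mean +/- k standard deviations.
    mu, sigma = np.mean(weights), np.std(weights)
    return mu - k * sigma, mu + k * sigma

def flag_poisoned_layers(clean_layers, suspect_layers, tol=1.2):
    # Flag a layer when the suspect model's boundary is wider than the
    # clean reference's by more than the factor `tol` (assumed threshold).
    flags = []
    for w_clean, w_suspect in zip(clean_layers, suspect_layers):
        lo_c, hi_c = boundary(w_clean)
        lo_s, hi_s = boundary(w_suspect)
        flags.append((hi_s - lo_s) > tol * (hi_c - lo_c))
    return flags

# Synthetic stand-ins for per-layer weight vectors: the suspect model's
# middle layer has an inflated variance, mimicking the boundary expansion
# the paper attributes to poisoned training data.
rng = np.random.default_rng(0)
clean = [rng.normal(0.0, 0.1, 1000) for _ in range(3)]
suspect = [rng.normal(0.0, 0.1, 1000),
           rng.normal(0.0, 0.25, 1000),
           rng.normal(0.0, 0.1, 1000)]
print(flag_poisoned_layers(clean, suspect))  # -> [False, True, False]
```

A wider boundary also shows up directly as higher Gaussian entropy via `layer_entropy`, which is the link the abstract draws between the Maximum Entropy Principle and detecting parameter-space expansion.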

Citation Key: chacon_deep_2019