Attribution Based Approach for Adversarial Example Generation

Title: Attribution Based Approach for Adversarial Example Generation
Publication Type: Conference Paper
Year of Publication: 2021
Authors: Wu, Xiaohe; Calderon, Juan; Obeng, Morrison
Conference Name: SoutheastCon 2021
Keywords: attribution, Classification algorithms, composability, deep architecture, gradient methods, Human Behavior, Iterative algorithms, Metrics, Neural networks, Perturbation methods, pubcrawl, Systematics
Abstract: Neural networks with deep architectures have been used to construct state-of-the-art classifiers that can match human-level accuracy in areas such as image classification. However, many of these classifiers can be fooled by examples slightly modified from their original forms. In this work, we propose a novel approach for generating adversarial examples that uses only the attribution information of the features and perturbs only features that are highly influential to the output of the classifier. We call this approach Attribution Based Adversarial Generation (ABAG). To demonstrate the effectiveness of this approach, three somewhat arbitrary algorithms are proposed and examined. In the first algorithm, all non-zero attributions are utilized and the associated features perturbed; in the second algorithm, only the top-n most positive and top-n most negative attributions are used and the corresponding features perturbed; and in the third algorithm, the level of perturbation is increased iteratively until an adversarial example is discovered. All three algorithms are implemented, and experiments are performed on the well-known MNIST dataset. Experimental results show that adversarial examples can be generated very efficiently, and thus prove the validity and efficacy of ABAG, that is, of utilizing attributions for the generation of adversarial examples. Furthermore, as shown by examples, ABAG can be adapted to provide a systematic search approach that generates adversarial examples by perturbing a minimal number of features.
Citation Key: wu_attribution_2021
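
The three algorithms named in the abstract can be illustrated with a small sketch. The abstract does not specify the attribution method, the classifier, or the perturbation rule, so everything below is an assumption made for illustration: a toy binary linear classifier, gradient-times-input attributions, features in [0, 1] (as with MNIST pixel intensities), and a fixed-size step against the sign of each attribution. All function names are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only. Assumptions (not from the paper): a toy binary
# linear classifier, gradient-times-input attributions, features in [0, 1].

def score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    return 1 if score(w, b, x) > 0 else 0

def attributions(w, b, x):
    # Contribution of each feature toward the currently predicted class.
    sign = 1.0 if predict(w, b, x) == 1 else -1.0
    return [sign * wi * xi for wi, xi in zip(w, x)]

def perturb(x, attr, eps, selected):
    # Lower features that support the prediction, raise those that oppose it,
    # and clip the result to the assumed valid range [0, 1].
    out = list(x)
    for i in selected:
        if attr[i] > 0:
            out[i] = max(0.0, x[i] - eps)
        elif attr[i] < 0:
            out[i] = min(1.0, x[i] + eps)
    return out

def all_nonzero(attr):
    # First variant: use every feature with a non-zero attribution.
    return [i for i, a in enumerate(attr) if a != 0]

def top_n(attr, n):
    # Second variant: the n most positive and n most negative attributions.
    pos = sorted((i for i, a in enumerate(attr) if a > 0), key=lambda i: -attr[i])[:n]
    neg = sorted((i for i, a in enumerate(attr) if a < 0), key=lambda i: attr[i])[:n]
    return pos + neg

def iterative_attack(w, b, x, select, step=0.05, num_steps=20):
    # Third variant: raise the perturbation level until the prediction flips.
    original = predict(w, b, x)
    attr = attributions(w, b, x)
    chosen = select(attr)
    for k in range(1, num_steps + 1):
        eps = k * step
        candidate = perturb(x, attr, eps, chosen)
        if predict(w, b, candidate) != original:
            return candidate, eps
    return None, None  # no adversarial example found within the budget

# Hypothetical toy model and input.
w, b = [2.0, -1.0, 0.5, 0.0], -0.5
x = [0.8, 0.2, 0.4, 0.0]
adv_all, eps_all = iterative_attack(w, b, x, all_nonzero)
adv_top, eps_top = iterative_attack(w, b, x, lambda a: top_n(a, 1))
```

In this toy setting the top-n variant touches fewer features but may need a larger perturbation level than the all-non-zero variant, which reflects the trade-off between the number of features perturbed and the size of each perturbation. For a deep network, the attribution step would be replaced by whichever attribution method is in use.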