Biblio

Filters: Keyword is adversarial examples
2021-04-27
Marchisio, A., Nanfa, G., Khalid, F., Hanif, M. A., Martina, M., Shafique, M..  2020.  Is Spiking Secure? A Comparative Study on the Security Vulnerabilities of Spiking and Deep Neural Networks. 2020 International Joint Conference on Neural Networks (IJCNN). :1–8.
Spiking Neural Networks (SNNs) claim to present many advantages in terms of biological plausibility and energy efficiency compared to standard Deep Neural Networks (DNNs). Recent works have shown that DNNs are vulnerable to adversarial attacks, i.e., small perturbations added to the input data can lead to targeted or random misclassifications. In this paper, we aim at investigating the key research question: "Are SNNs secure?" Towards this, we perform a comparative study of the security vulnerabilities in SNNs and DNNs w.r.t. the adversarial noise. Afterwards, we propose a novel black-box attack methodology, i.e., without the knowledge of the internal structure of the SNN, which employs a greedy heuristic to automatically generate imperceptible and robust adversarial examples (i.e., attack images) for the given SNN. We perform an in-depth evaluation for a Spiking Deep Belief Network (SDBN) and a DNN having the same number of layers and neurons (to obtain a fair comparison), in order to study the efficiency of our methodology and to understand the differences between SNNs and DNNs w.r.t. the adversarial examples. Our work opens new avenues of research towards the robustness of the SNNs, considering their similarities to the human brain's functionality.
2021-03-29
Peng, Y., Fu, G., Luo, Y., Hu, J., Li, B., Yan, Q..  2020.  Detecting Adversarial Examples for Network Intrusion Detection System with GAN. 2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS). :6–10.
As networks grow in scale, attacks against them emerge one after another, and security problems become increasingly prominent. Network intrusion detection systems (NIDS) are a widely used and effective security measure at present. In addition, with the development of machine learning technology, various intelligent intrusion detection algorithms have begun to appear. By flexibly combining these intelligent methods with intrusion detection technology, the comprehensive performance of intrusion detection can be improved, but the vulnerability of machine learning models in adversarial environments cannot be ignored. In this paper, we study the defense problem of network intrusion detection systems against adversarial samples. More specifically, we design a defense algorithm for NIDS against adversarial samples by using a bidirectional generative adversarial network. The generator learns the data distribution of normal samples during training, which is an implicit model reflecting the normal data distribution. After training, the adversarial sample detection module calculates the reconstruction error and the discriminator matching error of each sample. Then, the adversarial samples are removed, which improves the robustness and accuracy of NIDS in the adversarial environment.
2021-03-09
Rahmati, A., Moosavi-Dezfooli, S.-M., Frossard, P., Dai, H..  2020.  GeoDA: A Geometric Framework for Black-Box Adversarial Attacks. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :8443–8452.
Adversarial examples are known as carefully perturbed images fooling image classifiers. We propose a geometric framework to generate adversarial examples in one of the most challenging black-box settings where the adversary can only generate a small number of queries, each of them returning the top-1 label of the classifier. Our framework is based on the observation that the decision boundary of deep networks usually has a small mean curvature in the vicinity of data samples. We propose an effective iterative algorithm to generate query-efficient black-box perturbations with small p norms which is confirmed via experimental evaluations on state-of-the-art natural image classifiers. Moreover, for p=2, we theoretically show that our algorithm actually converges to the minimal perturbation when the curvature of the decision boundary is bounded. We also obtain the optimal distribution of the queries over the iterations of the algorithm. Finally, experimental results confirm that our principled black-box attack algorithm performs better than state-of-the-art algorithms as it generates smaller perturbations with a reduced number of queries.
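The abstract above relies on the decision boundary being nearly flat close to a data sample, so that its normal direction can be estimated from top-1 labels alone. A minimal sketch of that idea follows, assuming a toy linear classifier as the black box; the names `top1_label` and `estimate_normal`, the noise scale `sigma`, and the query budget are all hypothetical and are not taken from the paper.

```python
import math
import random


def top1_label(w, b, x):
    """Hypothetical black-box oracle: returns only the top-1 label (0 or 1)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def estimate_normal(query, x_boundary, n_queries=2000, sigma=0.01, seed=0):
    """Estimate the unit normal of the decision boundary at x_boundary.

    Each query probes a small Gaussian step away from the boundary point and
    records on which side of the boundary the probe lands.
    """
    rng = random.Random(seed)
    dim = len(x_boundary)
    acc = [0.0] * dim
    for _ in range(n_queries):
        eta = [rng.gauss(0.0, sigma) for _ in range(dim)]
        probe = [xb + e for xb, e in zip(x_boundary, eta)]
        # +1 if the probe lands on the label-1 side, -1 otherwise.
        side = 1.0 if query(probe) == 1 else -1.0
        for i in range(dim):
            acc[i] += side * eta[i]
    norm = math.sqrt(sum(a * a for a in acc))
    return [a / norm for a in acc]


# Toy linear model; the attacker never reads w or b directly.
w, b = [2.0, -1.0, 0.5, 0.0, 1.5], 0.0
x_b = [0.0] * 5  # a point on the boundary, since w.x + b = 0 here
n_hat = estimate_normal(lambda x: top1_label(w, b, x), x_b)

# Cosine similarity between the estimate and the true normal direction.
w_norm = math.sqrt(sum(wi * wi for wi in w))
cosine = sum(n * wi for n, wi in zip(n_hat, w)) / w_norm
```

For a flat boundary the sign-weighted average of the probe directions points along the true normal, so `cosine` approaches 1 as the number of queries grows.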
2020-12-07
Yang, Z..  2019.  Fidelity: Towards Measuring the Trustworthiness of Neural Network Classification. 2019 IEEE Conference on Dependable and Secure Computing (DSC). :1–8.
With the increasing performance of neural networks on many security-critical tasks, the security concerns of machine learning have become increasingly prominent. Recent studies have shown that neural networks are vulnerable to adversarial examples: carefully crafted inputs with negligible perturbations on legitimate samples could mislead a neural network to produce adversary-selected outputs while humans can still correctly classify them. Therefore, we need an additional measurement on the trustworthiness of the results of a machine learning model, especially in adversarial settings. In this paper, we analyse the root cause of adversarial examples, and propose a new property, namely fidelity, of machine learning models to describe the gap between what a model learns and the ground truth learned by humans. One of its benefits is detecting adversarial attacks. We formally define fidelity, and propose a novel approach to quantify it. We evaluate the quantification of fidelity in adversarial settings on two neural networks. The study shows that involving the fidelity enables a neural network system to detect adversarial examples with true positive rate 97.7%, and false positive rate 1.67% on a studied neural network.
2020-09-21
Chow, Ka-Ho, Wei, Wenqi, Wu, Yanzhao, Liu, Ling.  2019.  Denoising and Verification Cross-Layer Ensemble Against Black-box Adversarial Attacks. 2019 IEEE International Conference on Big Data (Big Data). :1282–1291.
Deep neural networks (DNNs) have demonstrated impressive performance on many challenging machine learning tasks. However, DNNs are vulnerable to adversarial inputs generated by adding maliciously crafted perturbations to the benign inputs. As a growing number of attacks have been reported to generate adversarial inputs of varying sophistication, the defense-attack arms race has been accelerated. In this paper, we present MODEF, a cross-layer model diversity ensemble framework. MODEF intelligently combines unsupervised model denoising ensemble with supervised model verification ensemble by quantifying model diversity, aiming to boost the robustness of the target model against adversarial examples. Evaluated using eleven representative attacks on popular benchmark datasets, we show that MODEF achieves remarkable defense success rates, compared with existing defense methods, and provides a superior capability of repairing adversarial inputs and making correct predictions with high accuracy in the presence of black-box attacks.
2020-09-11
Azakami, Tomoka, Shibata, Chihiro, Uda, Ryuya, Kinoshita, Toshiyuki.  2019.  Creation of Adversarial Examples with Keeping High Visual Performance. 2019 IEEE 2nd International Conference on Information and Computer Technologies (ICICT). :52–56.
The accuracy of image classification by convolutional neural networks now exceeds human ability and contributes to various fields. However, improvements in image recognition technology deal a serious blow to image-based security systems such as CAPTCHA. In particular, character-string CAPTCHAs already add distortion and noise so that computers cannot read them, which lowers human readability. Adversarial examples are a technique for producing an image that intentionally causes a machine learning image classifier to misclassify it. The key feature of this technique is that humans comparing the original image with the adversarial example cannot see the difference in appearance. However, adversarial examples created with conventional FGSM cannot completely fool strongly nonlinear networks such as CNNs. Osadchy et al. studied applying adversarial examples to CAPTCHA and attempted to make a CNN misclassify them, but could not make the CNN misclassify character images. In this research, we propose a method that applies FGSM to character-string CAPTCHAs and makes a CNN misclassify them.
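FGSM (the fast gradient sign method) perturbs each input coordinate by a fixed step in the direction that increases the classifier's loss. The sketch below illustrates the update on a toy logistic-regression model rather than a CNN or a CAPTCHA image; the weights, the input, and the step size `eps` are hypothetical values chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability that x belongs to class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(g):
    return (g > 0) - (g < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: x_adv = clip(x + eps * sign(dLoss/dx), 0, 1).

    For logistic regression with cross-entropy loss, dLoss/dx = (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [min(1.0, max(0.0, xi + eps * sign(g))) for xi, g in zip(x, grad)]


# Hypothetical model and input with features scaled to [0, 1].
w, b = [4.0, -3.0, 2.0], -1.0
x, y = [0.6, 0.2, 0.5], 1

x_adv = fgsm(w, b, x, y, eps=0.25)
p_clean = predict(w, b, x)     # above 0.5, so classified as class 1
p_adv = predict(w, b, x_adv)   # below 0.5, so the label flips
```

Every coordinate moves by at most `eps`, which is how the method bounds the visible change to the input.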
Shekhar, Heemany, Moh, Melody, Moh, Teng-Sheng.  2019.  Exploring Adversaries to Defend Audio CAPTCHA. 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA). :1155–1161.
CAPTCHA is a web-based authentication method used by websites to distinguish between humans (valid users) and bots (attackers). Audio captcha is an accessible captcha meant for visually disabled users, such as color-blind, blind, or near-sighted users. Firstly, this paper analyzes how secure current audio captchas are from attacks using machine learning (ML) and deep learning (DL) models. Each audio captcha is made up of five, seven or ten random digits [0-9] spoken one after the other, along with varying background noise throughout the length of the audio. If the ML or DL model is able to correctly identify all spoken digits in the correct order of occurrence in a single audio captcha, we consider that captcha to be broken and the attack to be successful. Throughout the paper, accuracy refers to the attack model's success at breaking audio captchas. The higher the attack accuracy, the more insecure the audio captchas are. In our baseline experiments, we found that attack models could break audio captchas that had no background noise or medium background noise, with any number of spoken digits, with nearly 99% to 100% accuracy, whereas audio captchas with high background noise were relatively more secure, with an attack accuracy of 85%. Secondly, we propose that the concepts of adversarial example algorithms can be used to create a new kind of audio captcha that is more resilient to attacks. We found that even after retraining the models on the new adversarial audio data, the attack accuracy remained as low as 25% to 36%. Lastly, we explore the benefits of creating adversarial audio captchas through different algorithms such as the Basic Iterative Method (BIM) and DeepFool. We found that as long as the attacker has less than a 45% sample of each kind of adversarial audio dataset, the defense will be successful at preventing attacks.
2020-09-04
Song, Chengru, Xu, Changqiao, Yang, Shujie, Zhou, Zan, Gong, Changhui.  2019.  A Black-Box Approach to Generate Adversarial Examples Against Deep Neural Networks for High Dimensional Input. 2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC). :473–479.
Generating adversarial samples is gathering much attention as an intuitive approach to evaluate the robustness of learning models. Extensive recent works have demonstrated that numerous advanced image classifiers are defenseless against adversarial perturbations in the white-box setting. However, the white-box setting assumes attackers have prior knowledge of model parameters, which are generally inaccessible in real-world cases. In this paper, we concentrate on the hard-label black-box setting, where attackers can only pose queries to probe the model parameters responsible for classifying different images. Therefore, the issue is converted into minimizing a non-continuous function. A black-box approach is proposed to address both massive queries and the non-continuous step function problem by applying a combination of a linear fine-grained search, Fibonacci search, and a zeroth-order optimization algorithm. However, the input dimension of an image is so high that the estimation of the gradient is noisy. Hence, we adopt a zeroth-order optimization method in high dimensions. The approach converts the calculation of the gradient into a linear regression model and extracts the dimensions that are more significant. Experimental results illustrate that our approach can reduce the number of queries and effectively accelerate convergence of the optimization method.
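Zeroth-order optimization, as mentioned in the abstract above, minimizes a function using only its output values. The sketch below uses plain coordinate-wise finite differences, a simpler technique swapped in for illustration; it is not the paper's regression-based estimator, and the objective, step size `h`, and learning rate are hypothetical.

```python
def zo_gradient(f, x, h=1e-4):
    """Symmetric finite-difference gradient estimate using 2 * len(x) queries to f."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_plus[i] += h
        x_minus = list(x)
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad


def zo_descent(f, x0, lr=0.1, steps=100):
    """Gradient descent in which every gradient comes from queries to f only."""
    x = list(x0)
    for _ in range(steps):
        g = zo_gradient(f, x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


def objective(x):
    """Hypothetical black-box objective; the optimizer sees only its outputs."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


g0 = zo_gradient(objective, [0.0, 0.0])   # close to the true gradient [-2, 4]
x_min = zo_descent(objective, [0.0, 0.0])  # approaches the minimizer [1, -2]
```

The query cost grows linearly with the input dimension, which is why high-dimensional inputs such as images motivate the dimension-selection step the abstract describes.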
Zhao, Pu, Liu, Sijia, Chen, Pin-Yu, Hoang, Nghia, Xu, Kaidi, Kailkhura, Bhavya, Lin, Xue.  2019.  On the Design of Black-Box Adversarial Examples by Leveraging Gradient-Free Optimization and Operator Splitting Method. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). :121–130.
Robust machine learning is currently one of the most prominent topics which could potentially help shaping a future of advanced AI platforms that not only perform well in average cases but also in worst cases or adverse situations. Despite the long-term vision, however, existing studies on black-box adversarial attacks are still restricted to very specific settings of threat models (e.g., single distortion metric and restrictive assumption on target model's feedback to queries) and/or suffer from prohibitively high query complexity. To push for further advances in this field, we introduce a general framework based on an operator splitting method, the alternating direction method of multipliers (ADMM) to devise efficient, robust black-box attacks that work with various distortion metrics and feedback settings without incurring high query complexity. Due to the black-box nature of the threat model, the proposed ADMM solution framework is integrated with zeroth-order (ZO) optimization and Bayesian optimization (BO), and thus is applicable to the gradient-free regime. This results in two new black-box adversarial attack generation methods, ZO-ADMM and BO-ADMM. Our empirical evaluations on image classification datasets show that our proposed approaches have much lower function query complexities compared to state-of-the-art attack methods, but achieve very competitive attack success rates.
2020-08-03
Juuti, Mika, Szyller, Sebastian, Marchal, Samuel, Asokan, N..  2019.  PRADA: Protecting Against DNN Model Stealing Attacks. 2019 IEEE European Symposium on Security and Privacy (EuroS&P). :512–527.
Machine learning (ML) applications are increasingly prevalent. Protecting the confidentiality of ML models becomes paramount for two reasons: (a) a model can be a business advantage to its owner, and (b) an adversary may use a stolen model to find transferable adversarial examples that can evade classification by the original model. Access to the model can be restricted to be only via well-defined prediction APIs. Nevertheless, prediction APIs still provide enough information to allow an adversary to mount model extraction attacks by sending repeated queries via the prediction API. In this paper, we describe new model extraction attacks using novel approaches for generating synthetic queries, and optimizing training hyperparameters. Our attacks outperform state-of-the-art model extraction in terms of transferability of both targeted and non-targeted adversarial examples (up to +29-44 percentage points, pp), and prediction accuracy (up to +46 pp) on two datasets. We provide take-aways on how to perform effective model extraction attacks. We then propose PRADA, the first step towards generic and effective detection of DNN model extraction attacks. It analyzes the distribution of consecutive API queries and raises an alarm when this distribution deviates from benign behavior. We show that PRADA can detect all prior model extraction attacks with no false positives.
2020-07-20
Pengcheng, Li, Yi, Jinfeng, Zhang, Lijun.  2018.  Query-Efficient Black-Box Attack by Active Learning. 2018 IEEE International Conference on Data Mining (ICDM). :1200–1205.
Deep neural networks (DNNs), a popular class of machine learning models, have been found to be vulnerable to adversarial attacks. Such an attack constructs adversarial examples by adding small perturbations to the raw input; the examples appear unmodified to human eyes but are misclassified by a well-trained classifier. In this paper, we focus on the black-box attack setting, where attackers have almost no access to the underlying models. To conduct a black-box attack, a popular approach is to train a substitute model based on the information queried from the target DNN. The substitute model can then be attacked using existing white-box attack approaches, and the generated adversarial examples are used to attack the target DNN. Despite its encouraging results, this approach suffers from poor query efficiency, i.e., attackers usually need to issue a huge number of queries to collect enough information for training an accurate substitute model. To this end, we first utilize state-of-the-art white-box attack methods to generate samples for querying, and then introduce an active learning strategy to significantly reduce the number of queries needed. In addition, we propose a diversity criterion to avoid sampling bias. Our extensive experimental results on MNIST and CIFAR-10 show that the proposed method can reduce the number of queries by more than 90% while preserving attack success rates and obtaining an accurate substitute model that is more than 85% similar to the target oracle.
2020-06-19
Wang, Si, Liu, Wenye, Chang, Chip-Hong.  2019.  Detecting Adversarial Examples for Deep Neural Networks via Layer Directed Discriminative Noise Injection. 2019 Asian Hardware Oriented Security and Trust Symposium (AsianHOST). :1–6.

Deep learning is a popular and powerful machine learning solution for computer vision tasks. The most criticized vulnerability of deep learning is its poor tolerance of adversarial images obtained by deliberately adding imperceptibly small perturbations to clean inputs. Such negatives can delude a classifier into making wrong decisions. Previous defensive techniques mostly focused on refining the models or on input transformation. They are either implemented only with small datasets or shown to have limited success. Furthermore, they are rarely scrutinized from the hardware perspective, even though Artificial Intelligence (AI) on a chip is a roadmap for embedded intelligence everywhere. In this paper we propose a new discriminative noise injection strategy to adaptively select a few dominant layers and progressively discriminate adversarial from benign inputs. This is made possible by evaluating the differences in label change rate between adversarial and natural images when different amounts of noise are injected into the weights of individual layers in the model. The approach is evaluated on the ImageNet dataset with 8-bit truncated models for state-of-the-art DNN architectures. The results show a high detection rate of up to 88.00% with a false positive rate of only approximately 5% for MobileNet. Both detection rate and false positive rate are improved well beyond existing advanced defenses against the most practical noninvasive universal perturbation attack on deep-learning-based AI chips.
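The label change rate used in the abstract above can be read as the fraction of noise-perturbed copies of a model that disagree with the clean model on a given input. The sketch below measures it for a toy two-weight linear classifier with Gaussian weight noise. It assumes, for illustration only, that the suspicious input sits close to the decision boundary; the paper's layer selection and 8-bit truncated models are not modeled, and all names and values are hypothetical.

```python
import random


def label(w, x):
    """Label assigned by a toy linear classifier with weights w."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0


def label_change_rate(w, x, sigma, trials=500, seed=0):
    """Fraction of noisy-weight copies whose label for x differs from the clean label."""
    rng = random.Random(seed)
    clean = label(w, x)
    flips = 0
    for _ in range(trials):
        noisy_w = [wi + rng.gauss(0.0, sigma) for wi in w]
        if label(noisy_w, x) != clean:
            flips += 1
    return flips / trials


w = [1.0, -1.0]
far_from_boundary = [3.0, -3.0]  # clean score 6.0, robust to weight noise
near_boundary = [1.0, 0.99]      # clean score 0.01, fragile under weight noise

rate_far = label_change_rate(w, far_from_boundary, sigma=0.1)
rate_near = label_change_rate(w, near_boundary, sigma=0.1)

# Hypothetical detection rule: flag inputs whose label changes too often.
threshold = 0.2
flagged = [rate > threshold for rate in (rate_far, rate_near)]
```

A detector built this way needs only forward passes through noisy copies of the model, with no retraining and no change to the input.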

2020-04-20
Lecuyer, Mathias, Atlidakis, Vaggelis, Geambasu, Roxana, Hsu, Daniel, Jana, Suman.  2019.  Certified Robustness to Adversarial Examples with Differential Privacy. 2019 IEEE Symposium on Security and Privacy (SP). :656–672.
Adversarial examples that fool machine learning models, particularly deep neural networks, have been a topic of intense research interest, with attacks and defenses being developed in a tight back-and-forth. Most past defenses are best effort and have been shown to be vulnerable to sophisticated attacks. Recently a set of certified defenses have been introduced, which provide guarantees of robustness to norm-bounded attacks. However these defenses either do not scale to large datasets or are limited in the types of models they can support. This paper presents the first certified defense that both scales to large networks and datasets (such as Google's Inception network for ImageNet) and applies broadly to arbitrary model types. Our defense, called PixelDP, is based on a novel connection between robustness against adversarial examples and differential privacy, a cryptographically-inspired privacy formalism, that provides a rigorous, generic, and flexible foundation for defense.
2020-02-18
Huang, Yonghong, Verma, Utkarsh, Fralick, Celeste, Infantec-Lopez, Gabriel, Kumar, Brajesh, Woodward, Carl.  2019.  Malware Evasion Attack and Defense. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :34–38.

Machine learning (ML) classifiers are vulnerable to adversarial examples. An adversarial example is an input sample which is slightly modified to induce misclassification in an ML classifier. In this work, we investigate white-box and grey-box evasion attacks to an ML-based malware detector and conduct performance evaluations in a real-world setting. We compare the defense approaches in mitigating the attacks. We propose a framework for deploying grey-box and black-box attacks to malware detection systems.

Han, Chihye, Yoon, Wonjun, Kwon, Gihyun, Kim, Daeshik, Nam, Seungkyu.  2019.  Representation of White- and Black-Box Adversarial Examples in Deep Neural Networks and Humans: A Functional Magnetic Resonance Imaging Study. 2019 International Joint Conference on Neural Networks (IJCNN). :1–8.

The recent success of brain-inspired deep neural networks (DNNs) in solving complex, high-level visual tasks has led to rising expectations for their potential to match the human visual system. However, DNNs exhibit idiosyncrasies that suggest their visual representation and processing might be substantially different from human vision. One limitation of DNNs is that they are vulnerable to adversarial examples, input images on which subtle, carefully designed noises are added to fool a machine classifier. The robustness of the human visual system against adversarial examples is potentially of great importance as it could uncover a key mechanistic feature that machine vision is yet to incorporate. In this study, we compare the visual representations of white- and black-box adversarial examples in DNNs and humans by leveraging functional magnetic resonance imaging (fMRI). We find a small but significant difference in representation patterns for different (i.e. white- versus black-box) types of adversarial examples for both humans and DNNs. However, human performance on categorical judgment is not degraded by noise regardless of the type unlike DNN. These results suggest that adversarial examples may be differentially represented in the human visual system, but unable to affect the perceptual experience.

2019-06-10
Kargaard, J., Drange, T., Kor, A., Twafik, H., Butterfield, E..  2018.  Defending IT Systems against Intelligent Malware. 2018 IEEE 9th International Conference on Dependable Systems, Services and Technologies (DESSERT). :411–417.

The increasing number of malware variants seen in the wild is causing problems for antivirus software vendors, who are unable to keep up by creating signatures for each. The methods used to develop a signature, static and dynamic analysis, have various limitations. Machine learning has been used by antivirus vendors to detect malware based on the information gathered from the analysis process. However, adversarial examples can cause machine learning algorithms to misclassify new data. In this paper we describe a method for malware analysis by converting malware binaries to images and then preparing those images for training within a Generative Adversarial Network. These unsupervised deep neural networks are not susceptible to adversarial examples. The conversion to images from malware binaries should be faster than using dynamic analysis and it would still be possible to link malware families together. Using the Generative Adversarial Network, malware detection could be much more effective and reliable.

2019-02-14
Zhu, Yimin, Woo, Simon S..  2018.  Adversarial Product Review Generation with Word Replacements. Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. :2324–2326.

Machine learning algorithms including Deep Neural Networks (DNNs) have shown great success in many different areas. However, they are frequently susceptible to adversarial examples, which are maliciously crafted inputs to fool machine learning classifiers. On the other hand, humans cannot distinguish between non-adversarial and adversarial inputs. In this work, we focus on creating adversarial examples to change the polarity of positive and negative reviews with the Amazon product review dataset. We introduce a simple heuristic algorithm to construct adversarial product reviews by replacing words with semantically and syntactically similar synonyms. We evaluate our approach against the state-of-the-art CNN-BLSTM classifier. Our preliminary results show the performance drop of the classifier against the adversarial examples. We also present a defense mechanism using adversarial training.

2019-02-08
Zhang, Yiwei, Zhang, Weiming, Chen, Kejiang, Liu, Jiayang, Liu, Yujia, Yu, Nenghai.  2018.  Adversarial Examples Against Deep Neural Network Based Steganalysis. Proceedings of the 6th ACM Workshop on Information Hiding and Multimedia Security. :67–72.

Deep neural network based steganalysis has developed rapidly in recent years, which poses a challenge to the security of steganography. However, there is no steganography method that can effectively resist the neural networks for steganalysis at present. In this paper, we propose a new strategy that constructs enhanced covers against neural networks with the technique of adversarial examples. The enhanced covers and their corresponding stegos are most likely to be judged as covers by the networks. Besides, we use both deep neural network based steganalysis and high-dimensional feature classifiers to evaluate the performance of steganography and propose a new comprehensive security criterion. We also make a tradeoff between the two analysis systems and improve the comprehensive security. The effectiveness of the proposed scheme is verified with the evidence obtained from the experiments on the BOSSbase using the steganography algorithm of WOW and popular steganalyzers with rich models and three state-of-the-art neural networks.

2019-01-21
Kos, J., Fischer, I., Song, D..  2018.  Adversarial Examples for Generative Models. 2018 IEEE Security and Privacy Workshops (SPW). :36–42.

We explore methods of producing adversarial examples on deep generative models such as the variational autoencoder (VAE) and the VAE-GAN. Deep learning architectures are known to be vulnerable to adversarial examples, but previous work has focused on the application of adversarial examples to classification tasks. Deep generative models have recently become popular due to their ability to model input data distributions and generate realistic examples from those distributions. We present three classes of attacks on the VAE and VAE-GAN architectures and demonstrate them against networks trained on MNIST, SVHN and CelebA. Our first attack leverages classification-based adversaries by attaching a classifier to the trained encoder of the target generative model, which can then be used to indirectly manipulate the latent representation. Our second attack directly uses the VAE loss function to generate a target reconstruction image from the adversarial example. Our third attack moves beyond relying on classification or the standard loss for the gradient and directly optimizes against differences in source and target latent representations. We also motivate why an attacker might be interested in deploying such techniques against a target generative network.

Warzyński, A., Kołaczek, G..  2018.  Intrusion detection systems vulnerability on adversarial examples. 2018 Innovations in Intelligent Systems and Applications (INISTA). :1–4.

Intrusion detection systems define an important and dynamic research area for cybersecurity. The role of an intrusion detection system within a security architecture is to improve the security level by identifying all malicious and suspicious events that can be observed in a computer or network system. One of the more specific research areas related to intrusion detection is anomaly detection. Anomaly-based intrusion detection in networks refers to the problem of finding untypical events in the observed network traffic that do not conform to the expected normal patterns. It is assumed that everything that is untypical or anomalous could be dangerous and related to some security event. To detect anomalies, many security systems implement classification or clustering algorithms. However, recent research has shown that machine learning models might misclassify adversarial events, e.g. observations created by applying intentionally non-random perturbations to the dataset. Such a weakness could increase the false negative rate, which implies undetected attacks. This can lead to one of the most dangerous vulnerabilities of intrusion detection systems. The goal of the research performed was to verify the ability of anomaly detection systems to resist this type of attack. This paper presents the preliminary results of tests conducted to investigate the existence of an attack vector that can use adversarial examples to conceal a real attack from being detected by intrusion detection systems.

2019-01-16
Kreuk, F., Adi, Y., Cisse, M., Keshet, J..  2018.  Fooling End-To-End Speaker Verification With Adversarial Examples. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :1962–1966.
Automatic speaker verification systems are increasingly used as the primary means to authenticate customers. Recently, it has been proposed to train speaker verification systems using end-to-end deep neural models. In this paper, we show that such systems are vulnerable to adversarial example attacks. Adversarial examples are generated by adding a peculiar noise to original speaker examples, in such a way that they are almost indistinguishable from the originals to a human listener. Yet the generated waveforms, which sound like speaker A, can be used to fool such a system by claiming that the waveforms were uttered by speaker B. We present white-box attacks on a deep end-to-end network that was trained on either YOHO or NTIMIT. We also present two black-box attacks. In the first one, we generate adversarial examples with a system trained on NTIMIT and perform the attack on a system trained on YOHO. In the second one, we generate the adversarial examples with a system trained using Mel-spectrum features and perform the attack on a system trained using MFCCs. Our results show that one can significantly decrease the accuracy of a target system even when the adversarial examples are generated with a different system, potentially using different features.
Carlini, N., Wagner, D..  2018.  Audio Adversarial Examples: Targeted Attacks on Speech-to-Text. 2018 IEEE Security and Privacy Workshops (SPW). :1–7.
We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla's implementation DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain to study adversarial examples.
Bai, X., Niu, W., Liu, J., Gao, X., Xiang, Y., Liu, J..  2018.  Adversarial Examples Construction Towards White-Box Q Table Variation in DQN Pathfinding Training. 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC). :781–787.

As a new research hotspot in the field of artificial intelligence, deep reinforcement learning (DRL) has achieved some success in various fields such as robot control, computer vision, and natural language processing. At the same time, the possibility of its applications being attacked, and whether it has strong resistance to attack, have also become hot topics in recent years. Therefore, we select the representative Deep Q Network (DQN) algorithm in deep reinforcement learning, use the robotic automatic pathfinding application as a countermeasure application scenario for the first time, and attack the DQN algorithm by exploiting its vulnerability to adversarial samples. In this paper, we first use DQN to find the optimal path, and analyze the rules of DQN pathfinding. Then, we propose a method that can effectively find vulnerable points towards White-Box Q table variation in DQN pathfinding training. Finally, we build a simulation environment as a basic experimental platform to test our method. Through multiple experiments, we successfully find the adversarial examples, and the experimental results show that the supervised method we propose is effective.

2018-12-10
Kwon, Hyun, Yoon, Hyunsoo, Choi, Daeseon.  2018.  POSTER: Zero-Day Evasion Attack Analysis on Race Between Attack and Defense. Proceedings of the 2018 on Asia Conference on Computer and Communications Security. :805–807.

Deep neural networks (DNNs) exhibit excellent performance in machine learning tasks such as image recognition, pattern recognition, speech recognition, and intrusion detection. However, the usage of adversarial examples, which are intentionally corrupted by noise, can lead to misclassification. As adversarial examples are serious threats to DNNs, both adversarial attacks and methods of defending against adversarial examples have been continuously studied. Zero-day adversarial examples are created with new test data and are unknown to the classifier; hence, they represent a more significant threat to DNNs. To the best of our knowledge, there are no analytical studies in the literature of zero-day adversarial examples with a focus on attack and defense methods through experiments using several scenarios. Therefore, in this study, zero-day adversarial examples are practically analyzed with an emphasis on attack and defense methods through experiments using various scenarios composed of a fixed target model and an adaptive target model. The Carlini method was used for a state-of-the-art attack, while an adversarial training method was used as a typical defense method. We used the MNIST dataset and analyzed success rates of zero-day adversarial examples, average distortions, and recognition of original samples through several scenarios of fixed and adaptive target models. Experimental results demonstrate that changing the parameters of the target model in real time leads to resistance to adversarial examples in both the fixed and adaptive target models.

2018-09-12
Jang, Uyeong, Wu, Xi, Jha, Somesh.  2017.  Objective Metrics and Gradient Descent Algorithms for Adversarial Examples in Machine Learning. Proceedings of the 33rd Annual Computer Security Applications Conference. :262–277.
Fueled by massive amounts of data, models produced by machine-learning (ML) algorithms are being used in diverse domains where security is a concern, such as automotive systems, finance, health-care, computer vision, speech recognition, natural-language processing, and malware detection. Of particular concern is the use of ML in cyberphysical systems, such as driver-less cars and aviation, where the presence of an adversary can cause serious consequences. In this paper we focus on attacks caused by adversarial samples, which are inputs crafted by adding small, often imperceptible, perturbations to force an ML model to misclassify. We present a simple gradient-descent-based algorithm for finding adversarial samples, which performs well in comparison to existing algorithms. The second issue that this paper tackles is that of metrics. We present a novel metric based on a few computer-vision algorithms for measuring the quality of adversarial samples.
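The general idea of finding an adversarial sample by gradient descent can be sketched as follows. This is a minimal illustration only, not the algorithm from the paper above: the logistic-regression stand-in model, the helper names `predict` and `adversarial_sample`, the step size, and the example weights are all assumptions chosen for this sketch. It repeatedly moves the input against the gradient of the model's score for the original label and stops as soon as the predicted label flips, which keeps the perturbation small.

```python
import math


def predict(w, b, x):
    """Return P(class = 1) for a logistic-regression model (hypothetical stand-in classifier)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def adversarial_sample(w, b, x, step=0.05, max_iters=1000):
    """Perturb x by gradient descent until the predicted label changes.

    The quantity being decreased is the logit of the originally predicted
    class. For a linear model its gradient with respect to x is simply
    +w (original class 1) or -w (original class 0).
    """
    original_is_one = predict(w, b, x) >= 0.5
    direction = 1.0 if original_is_one else -1.0
    adv = list(x)
    for _ in range(max_iters):
        if (predict(w, b, adv) >= 0.5) != original_is_one:
            break  # label flipped: stop early to keep the perturbation small
        adv = [ai - step * direction * wi for ai, wi in zip(adv, w)]
    return adv


# Illustrative values only (assumed, not taken from any dataset).
w, b = [2.0, -1.0], 0.5
x = [1.0, 0.5]
adv = adversarial_sample(w, b, x)
distance = math.sqrt(sum((a - xi) ** 2 for a, xi in zip(adv, x)))
print("original P(1):", predict(w, b, x))
print("adversarial P(1):", predict(w, b, adv))
print("L2 perturbation:", distance)
```

For a deep network the gradient would come from backpropagation rather than a closed form, and the perturbation size would typically be constrained or penalized explicitly. The L2 distance printed here is only a placeholder for a quality measure.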