Biblio

Found 552 results

Filters: Keyword is learning (artificial intelligence)
Akbay, Abdullah Basar, Wang, Weina, Zhang, Junshan.  2019.  Data Collection from Privacy-Aware Users in the Presence of Social Learning. 2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton). :679–686.
We study a model where a data collector obtains data from users through a payment mechanism to learn the underlying state from the elicited data. The private signal of each user represents her individual knowledge about the state. Through social interactions, each user can also learn noisy versions of her friends' signals, which are called group signals. Based on both her private signal and group signals, each user makes strategic decisions to report a privacy-preserved version of her data to the data collector. We develop a Bayesian game theoretic framework to study the impact of social learning on users' data reporting strategies and devise the payment mechanism for the data collector accordingly. Our findings reveal that the Bayesian-Nash equilibrium can be in the form of either a symmetric randomized response (SR) strategy or an informative non-disclosive (ND) strategy. A generalized majority voting rule is applied by each user to her noisy group signals to determine which strategy to follow. When a user plays the ND strategy, she reports privacy-preserving data based entirely on her group signals, independent of her private signal, which indicates that her privacy cost is zero. Both the data collector and the users can benefit from social learning, which drives down the privacy costs and helps to improve the state estimation at a given payment budget. We derive bounds on the minimum total payment required to achieve a given level of state estimation accuracy.
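As a concrete illustration of the symmetric randomized response (SR) strategy named above, here is a minimal sketch of textbook randomized response; the truth probability is an illustrative assumption, not a value derived from the paper's payment mechanism.

```python
import random

def randomized_response(bit: int, p_truth: float) -> int:
    """Report the true bit with probability p_truth, otherwise flip it."""
    return bit if random.random() < p_truth else 1 - bit

# A user holding private signal 1 reports privacy-preserved versions;
# lower p_truth means stronger privacy and noisier reports.
reports = [randomized_response(1, p_truth=0.8) for _ in range(10)]
print(reports)
```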
Zhang, Xuejun, Chen, Qian, Peng, Xiaohui, Jiang, Xinlong.  2019.  Differential Privacy-Based Indoor Localization Privacy Protection in Edge Computing. 2019 IEEE SmartWorld, Ubiquitous Intelligence Computing, Advanced Trusted Computing, Scalable Computing Communications, Cloud Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI). :491–496.
With the popularity of smart devices and the widespread use of Wi-Fi-based indoor localization, edge computing is becoming the mainstream paradigm for processing massive sensing data to provide indoor localization service. However, the data used to train the localization model unintentionally contain sensitive information about users and devices, and releasing them without any protection may cause serious privacy leakage. To solve this issue, we propose a lightweight differential privacy-preserving mechanism for the edge computing environment. We extend ε-differential privacy theory to a mature machine learning localization technology to achieve privacy protection while training the localization model. Experimental results on multiple real-world datasets show that, compared with the original localization technology without privacy preservation, our proposed scheme can achieve high indoor localization accuracy while providing a differential privacy guarantee. By regulating the value of ε, the data quality loss of our method can be kept within 8.9%, and the time consumption is almost negligible. Therefore, our scheme can be efficiently applied in edge networks and provides some guidance on indoor localization privacy protection in edge computing.
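For readers unfamiliar with ε-differential privacy, the sketch below shows the standard Laplace mechanism applied to hypothetical Wi-Fi RSSI fingerprints before model training; the sensitivity and ε values are assumptions for illustration, not the paper's scheme.

```python
import numpy as np

def laplace_mechanism(values, sensitivity, epsilon, rng=None):
    """Standard epsilon-DP Laplace mechanism: noise scale = sensitivity / epsilon."""
    rng = rng or np.random.default_rng()
    return values + rng.laplace(0.0, sensitivity / epsilon, size=np.shape(values))

# Hypothetical RSSI fingerprints (dBm) perturbed before training a localizer;
# smaller epsilon gives stronger privacy but larger data quality loss.
fingerprints = np.array([[-45.0, -67.0, -80.0],
                         [-50.0, -62.0, -78.0]])
private_fp = laplace_mechanism(fingerprints, sensitivity=1.0, epsilon=0.5)
print(private_fp)
```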
Chow, Ka-Ho, Wei, Wenqi, Wu, Yanzhao, Liu, Ling.  2019.  Denoising and Verification Cross-Layer Ensemble Against Black-box Adversarial Attacks. 2019 IEEE International Conference on Big Data (Big Data). :1282–1291.
Deep neural networks (DNNs) have demonstrated impressive performance on many challenging machine learning tasks. However, DNNs are vulnerable to adversarial inputs generated by adding maliciously crafted perturbations to benign inputs. As a growing number of attacks have been reported to generate adversarial inputs of varying sophistication, the defense-attack arms race has accelerated. In this paper, we present MODEF, a cross-layer model diversity ensemble framework. MODEF intelligently combines an unsupervised model denoising ensemble with a supervised model verification ensemble by quantifying model diversity, aiming to boost the robustness of the target model against adversarial examples. Evaluated using eleven representative attacks on popular benchmark datasets, we show that MODEF achieves remarkable defense success rates compared with existing defense methods, and provides a superior capability of repairing adversarial inputs and making correct predictions with high accuracy in the presence of black-box attacks.
Zolanvari, Maede, Teixeira, Marcio A., Gupta, Lav, Khan, Khaled M., Jain, Raj.  2019.  Machine Learning-Based Network Vulnerability Analysis of Industrial Internet of Things. IEEE Internet of Things Journal. 6:6822–6834.
It is critical to secure Industrial Internet of Things (IIoT) devices because of the potentially devastating consequences of an attack. Machine learning (ML) and big data analytics are two powerful tools for analyzing and securing Internet of Things (IoT) technology. By extension, these techniques can help improve the security of IIoT systems as well. In this paper, we first present common IIoT protocols and their associated vulnerabilities. Then, we run a cyber-vulnerability assessment and discuss the use of ML in countering these vulnerabilities. Following that, a literature review of the available intrusion detection solutions using ML models is presented. Finally, we discuss our case study, which includes details of a real-world testbed that we have built to conduct cyber-attacks and to design an intrusion detection system (IDS). We deploy backdoor, command injection, and Structured Query Language (SQL) injection attacks against the system and demonstrate how an ML-based anomaly detection system can perform well in detecting these attacks. We evaluate the performance through representative metrics to give a fair view of the effectiveness of the methods.
Ling, Mee Hong, Yau, Kok-Lim Alvin.  2019.  Can Reinforcement Learning Address Security Issues? An Investigation into a Clustering Scheme in Distributed Cognitive Radio Networks. 2019 International Conference on Information Networking (ICOIN). :296–300.
This paper investigates the effectiveness of a reinforcement learning (RL) model in clustering as an approach to achieving higher network scalability in distributed cognitive radio networks. Specifically, it analyzes the effects of the RL parameters, namely the learning rate and discount factor, in a volatile environment that consists of member nodes (or secondary users, SUs) that launch attacks with various probabilities. The clusterhead, which resides in an operating region (environment) characterized by the probability of attacks, counters the malicious SUs by leveraging an RL model. Simulation results show that in a volatile operating environment, the RL model with learning rate α = 1 provides the highest network scalability when the probability of attack ranges between 0.3 and 0.7, while the discount factor γ does not play a significant role in learning in an operating environment that is volatile due to attacks.
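The learning rate and discount factor the abstract analyzes appear in the standard tabular Q-learning update; the sketch below shows that generic update, not the paper's clustering model, with illustrative states and actions.

```python
# Minimal tabular Q-learning update. States and actions (0/1) are illustrative.
def q_update(q, s, a, r, s_next, alpha=1.0, gamma=0.5):
    # alpha = 1 makes the agent rely entirely on the newest observation,
    # the setting the abstract reports as best under volatile attacks.
    best_next = max(q[(s_next, b)] for b in (0, 1))
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
q_update(q, s=0, a=1, r=1.0, s_next=1)
print(q)
```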
Yuan, Yaofeng, When, JieChang.  2019.  Adaptively Weighted Channel Feature Network of Mixed Convolution Kernel. 2019 15th International Conference on Computational Intelligence and Security (CIS). :87–91.
In deep learning, different network models can be designed to address different tasks (classification, detection, segmentation), but traditional deep learning networks simply increase the depth and breadth of the network, which leads to higher model complexity. We propose the Adaptively Weighted Channel Feature Network of Mixed Convolution Kernels (SKENet). SKENet extracts features with convolution kernels of different sizes, mixes those features elementwise, and finally applies a sigmoid operator to the channel features to obtain adaptive weightings. We ran a simple classification test on the CIFAR10 and CIFAR100 datasets; the results show that SKENet can achieve a better result in a shorter time. We then ran an object detection experiment on the VOC dataset; the experimental results show that SKENet is far ahead of SKNet [20] in terms of speed and accuracy.
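A rough PyTorch sketch of the idea the abstract describes (features from different kernel sizes mixed elementwise, then gated by a sigmoid over channel features) is shown below; the layer sizes and structure are assumptions for illustration, not the published SKENet architecture.

```python
import torch
import torch.nn as nn

class MixedKernelChannelWeight(nn.Module):
    """Mix features from 3x3 and 5x5 kernels elementwise, then reweight
    channels with a sigmoid gate computed from pooled channel features."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv5 = nn.Conv2d(channels, channels, 5, padding=2)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(channels, channels)

    def forward(self, x):
        mixed = self.conv3(x) + self.conv5(x)               # elementwise mix
        w = torch.sigmoid(self.fc(self.pool(mixed).flatten(1)))
        return mixed * w[:, :, None, None]                  # adaptive channel weights

y = MixedKernelChannelWeight(16)(torch.randn(1, 16, 32, 32))
print(y.shape)  # torch.Size([1, 16, 32, 32])
```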
Du, Jia, Wang, Zhe, Yang, Junqiang, Song, Xiaofeng.  2019.  Research on Cognitive Linkage of Network Security Equipment. 2019 International Conference on Robots Intelligent System (ICRIS). :296–298.
To solve the problems of weak linkage ability and insufficiently intelligent strategy allocation in existing network security devices, a new method of cognitive linkage of network security equipment, modeled on the human brain, is proposed. Firstly, the basic connotation and cognitive cycle of cognitive linkage are expounded. Secondly, the main functions of cognitive linkage are clarified. Finally, the cognitive linkage system model is constructed, and the information process flow of cognitive linkage is described. Cognitive linkage provides a new way to effectively enhance the overall protection capability of network security equipment.
Ortiz Garcés, Ivan, Cazares, Maria Fernada, Andrade, Roberto Omar.  2019.  Detection of Phishing Attacks with Machine Learning Techniques in Cognitive Security Architecture. 2019 International Conference on Computational Science and Computational Intelligence (CSCI). :366–370.
The number of phishing attacks has increased in Latin America, exceeding the operational capacity of cybersecurity analysts. Cognitive security applications propose the use of big data, machine learning, and data analytics to improve response times in attack detection. This paper presents an investigation into the analysis of anomalous behavior related to phishing web attacks and how machine learning techniques can be an option to face the problem. The analysis is made using contaminated data sets and Python tools for developing machine learning models that detect phishing attacks through the analysis of URLs, determining whether they are good or bad URLs based on specific URL characteristics, with the goal of providing real-time information for taking proactive decisions that minimize the impact of an attack.
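As a sketch of the URL-characteristic approach the abstract mentions, the toy feature extractor below computes lexical features commonly used in phishing-URL classifiers; the feature set is an assumption for illustration, not the paper's.

```python
import re

def url_features(url: str) -> dict:
    """Toy lexical features for a phishing-URL classifier (illustrative set)."""
    return {
        "length": len(url),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "has_at": "@" in url,
        "has_ip": bool(re.search(r"\d{1,3}(\.\d{1,3}){3}", url)),
        "uses_https": url.startswith("https://"),
    }

# A suspicious-looking example URL (hypothetical).
print(url_features("http://192.168.0.1/paypal-login@verify.example.com"))
```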
Wang, Lizhi, Xiong, Zhiwei, Huang, Hua, Shi, Guangming, Wu, Feng, Zeng, Wenjun.  2019.  High-Speed Hyperspectral Video Acquisition By Combining Nyquist and Compressive Sampling. IEEE Transactions on Pattern Analysis and Machine Intelligence. 41:857–870.
We propose a novel hybrid imaging system to acquire 4D high-speed hyperspectral (HSHS) videos with high spatial and spectral resolution. The proposed system consists of two branches: one branch performs Nyquist sampling in the temporal dimension while integrating the whole spectrum, resulting in a high-frame-rate panchromatic video; the other branch performs compressive sampling in the spectral dimension with longer exposures, resulting in a low-frame-rate hyperspectral video. Owing to the high light throughput and complementary sampling, these two branches jointly provide reliable measurements for recovering the underlying HSHS video. Moreover, the panchromatic video can be used to learn an over-complete 3D dictionary to represent each band-wise video sparsely, thanks to the inherent structural similarity in the spectral dimension. Based on the joint measurements and the self-adaptive dictionary, we further propose a simultaneous spectral sparse (3S) model to reinforce the structural similarity across different bands and develop an efficient computational reconstruction algorithm to recover the HSHS video. Both simulation and hardware experiments validate the effectiveness of the proposed approach. To the best of our knowledge, this is the first time that hyperspectral videos can be acquired at a frame rate up to 100fps with commodity optical elements and under ordinary indoor illumination.
Garip, Mevlut Turker, Lin, Jonathan, Reiher, Peter, Gerla, Mario.  2019.  SHIELDNET: An Adaptive Detection Mechanism against Vehicular Botnets in VANETs. 2019 IEEE Vehicular Networking Conference (VNC). :1–7.
Vehicular ad hoc networks (VANETs) are designed to provide traffic safety by enabling vehicles to broadcast information (such as speed, location, and heading) through inter-vehicular communications to proactively avoid collisions. However, the attacks targeting these networks might overshadow their advantages if not protected against. One powerful threat against VANETs is vehicular botnets. In our earlier work, we demonstrated several vehicular botnet attacks that can have damaging impacts on the security and privacy of VANETs. In this paper, we present SHIELDNET, the first detection mechanism against vehicular botnets. Similar to the detection approaches against Internet botnets, we target the vehicular botnet communication and use several machine learning techniques to identify vehicular bots. We show via simulation that SHIELDNET can identify 77 percent of the vehicular bots. We propose several improvements to the VANET standards and show that their existing vulnerabilities make an effective defense against vehicular botnets infeasible.
Azakami, Tomoka, Shibata, Chihiro, Uda, Ryuya, Kinoshita, Toshiyuki.  2019.  Creation of Adversarial Examples with Keeping High Visual Performance. 2019 IEEE 2nd International Conference on Information and Computer Technologies (ICICT). :52–56.
The accuracy of image classification by convolutional neural networks now exceeds the ability of human beings and contributes to various fields. However, the improvement in image recognition technology deals a great blow to image-based security systems such as CAPTCHA. In particular, since character-string CAPTCHAs already add distortion and noise so that computers cannot read them, lowered human readability becomes a problem. Adversarial examples are a technique for producing images that intentionally cause a machine learning image classifier to be wrong. The best feature of this technique is that when human beings compare the original image with the adversarial example, they cannot perceive any difference in appearance. However, adversarial examples created with the conventional FGSM cannot reliably cause misclassification in strongly nonlinear networks such as CNNs. Osadchy et al. researched applying adversarial examples to CAPTCHA and attempted to make a CNN misclassify them; however, they could not make the CNN misclassify character images. In this research, we propose a method that applies FGSM to character-string CAPTCHAs and causes a CNN to misclassify them.
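For reference, the FGSM the abstract builds on is a one-step gradient-sign perturbation; a minimal PyTorch sketch follows, with a toy linear model standing in for the CAPTCHA classifier (the model and ε are illustrative assumptions).

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.05):
    """Fast Gradient Sign Method: one step along the sign of the input gradient."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

# Toy stand-in classifier over flattened 8x8 images (illustrative only).
model = torch.nn.Linear(64, 10)
x, y = torch.rand(1, 64), torch.tensor([3])
x_adv = fgsm(model, x, y)
print((x_adv - x).abs().max())  # perturbation bounded by eps
```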
Shekhar, Heemany, Moh, Melody, Moh, Teng-Sheng.  2019.  Exploring Adversaries to Defend Audio CAPTCHA. 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA). :1155–1161.
CAPTCHA is a web-based authentication method used by websites to distinguish between humans (valid users) and bots (attackers). Audio CAPTCHA is an accessible CAPTCHA meant for visually impaired users, such as color-blind, blind, or near-sighted users. Firstly, this paper analyzes how secure current audio CAPTCHAs are against attacks using machine learning (ML) and deep learning (DL) models. Each audio CAPTCHA is made up of five, seven, or ten random digits [0-9] spoken one after the other, with varying background noise throughout the length of the audio. If the ML or DL model is able to correctly identify all spoken digits in the correct order of occurrence in a single audio CAPTCHA, we consider that CAPTCHA broken and the attack successful. Throughout the paper, accuracy refers to the attack model's success at breaking audio CAPTCHAs; the higher the attack accuracy, the less secure the audio CAPTCHAs are. In our baseline experiments, we found that attack models could break audio CAPTCHAs that had no background noise or medium background noise, with any number of spoken digits, with nearly 99% to 100% accuracy. Audio CAPTCHAs with high background noise, by contrast, were relatively more secure, with an attack accuracy of 85%. Secondly, we propose that the concepts of adversarial example algorithms can be used to create a new kind of audio CAPTCHA that is more resilient to attacks. We found that even after retraining the models on the new adversarial audio data, the attack accuracy remained as low as 25% to 36%. Lastly, we explore the benefits of creating adversarial audio CAPTCHAs through different algorithms such as the Basic Iterative Method (BIM) and DeepFool. We found that as long as the attacker has less than 45% of the samples from each kind of adversarial audio dataset, the defense will be successful at preventing attacks.
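The Basic Iterative Method (BIM) mentioned above is repeated small FGSM steps projected back into an ε-ball; a minimal PyTorch sketch under that standard definition follows (the step sizes and loss are illustrative, and a real audio attack would operate on waveform or spectrogram inputs).

```python
import torch
import torch.nn.functional as F

def bim(model, x, y, eps=0.03, alpha=0.005, steps=10):
    """Basic Iterative Method: iterated FGSM steps, clipped to an eps-ball."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        F.cross_entropy(model(x_adv), y).backward()
        x_adv = x_adv + alpha * x_adv.grad.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1).detach()
    return x_adv

# Toy stand-in classifier over flattened inputs (illustrative only).
model = torch.nn.Linear(64, 10)
x, y = torch.rand(1, 64), torch.tensor([7])
print((bim(model, x, y) - x).abs().max())  # stays within eps
```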
Chen, Yu-Cheng, Gieseking, Tim, Campbell, Dustin, Mooney, Vincent, Grijalva, Santiago.  2019.  A Hybrid Attack Model for Cyber-Physical Security Assessment in Electricity Grid. 2019 IEEE Texas Power and Energy Conference (TPEC). :1–6.
A detailed model of an attack on the power grid involves both a preparation stage and an execution stage. This paper introduces a novel Hybrid Attack Model (HAM) that combines the Probabilistic Learning Attacker, Dynamic Defender (PLADD) model and a Markov Chain model to simulate the planning and execution stages of a bad data injection attack on the power grid. We discuss the advantages and limitations of the prior models and of our proposed Hybrid Attack Model, and show that HAM is more effective than the individual PLADD and Markov Chain models.
Elkanishy, Abdelrahman, Badawy, Abdel-Hameed A., Furth, Paul M., Boucheron, Laura E., Michael, Christopher P..  2019.  Machine Learning Bluetooth Profile Operation Verification via Monitoring the Transmission Pattern. 2019 53rd Asilomar Conference on Signals, Systems, and Computers. :2144–2148.
Manufacturers often buy and/or license communication ICs from third-party suppliers. These communication ICs are then integrated into a complex computational system, resulting in a wide range of potential hardware-software security issues. This work proposes a compact supervisory circuit to classify the Bluetooth profile operation of a Bluetooth System-on-Chip (SoC) at low frequencies by monitoring the radio frequency (RF) output power of the Bluetooth SoC. The idea is to inexpensively manufacture an RF envelope detector to monitor the RF output power, along with a profile classification algorithm, on a custom low-frequency integrated circuit in a low-cost legacy technology. When the supervisory circuit observes unexpected behavior, it can shut off power to the Bluetooth SoC. In this preliminary work, we prototype the supervisory circuit using off-the-shelf components to collect a data set sufficient to train 11 different machine learning models. We extract smart descriptive time-domain features from the envelope of the RF output signal. Then, we train the machine learning models to classify three different Bluetooth operation profiles: sensor, hands-free, and headset. Our results demonstrate 100% classification accuracy with low computational complexity.
Khan, Aasher, Rehman, Suriya, Khan, Muhammad U.S, Ali, Mazhar.  2019.  Synonym-based Attack to Confuse Machine Learning Classifiers Using Black-box Setting. 2019 4th International Conference on Emerging Trends in Engineering, Sciences and Technology (ICEEST). :1–7.
Twitter, being the most popular content sharing platform, is giving rise to automated accounts called “bots”. A majority of the users on Twitter are bots. Various machine learning (ML) algorithms are designed to detect bots, but ML-based models remain subject to vulnerability constraints of their own. This paper exploits vulnerabilities of machine learning (ML) algorithms through a black-box attack: an adversarial text sequence causes deep learning (DL) classifiers for bot detection to misclassify. The literature shows that ML models are vulnerable to attacks. The aim of this paper is to compromise the accuracy of ML-based bot detection algorithms by replacing original words in tweets with their synonyms. Our results show a 7.2% decrease in accuracy for bot tweets, causing bot tweets to be classified as legitimate tweets.
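A minimal sketch of the synonym-substitution idea follows; the synonym table is a hypothetical stand-in (a real attack would draw candidates from WordNet or word embeddings and pick the replacements that most affect the classifier).

```python
# Hypothetical synonym table; illustrative only.
SYNONYMS = {"great": "excellent", "buy": "purchase", "free": "complimentary"}

def synonym_perturb(tweet: str) -> str:
    """Replace words with synonyms to shift a text classifier's features
    while keeping the tweet readable for humans."""
    return " ".join(SYNONYMS.get(w.lower(), w) for w in tweet.split())

print(synonym_perturb("Buy now and get a free great offer"))
```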
Song, Chengru, Xu, Changqiao, Yang, Shujie, Zhou, Zan, Gong, Changhui.  2019.  A Black-Box Approach to Generate Adversarial Examples Against Deep Neural Networks for High Dimensional Input. 2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC). :473–479.
Generating adversarial samples is gathering much attention as an intuitive approach to evaluating the robustness of learning models. Extensive recent work has demonstrated that numerous advanced image classifiers are defenseless against adversarial perturbations in the white-box setting. However, the white-box setting assumes attackers have prior knowledge of model parameters, which are generally inaccessible in real-world cases. In this paper, we concentrate on the hard-label black-box setting, where attackers can only pose queries to the model and observe its classification decisions on different images. The issue is therefore converted into minimizing a non-continuous function. A black-box approach is proposed to address both the massive number of queries and the non-continuous step-function problem by applying a combination of a linear fine-grained search, Fibonacci search, and a zeroth-order optimization algorithm. Because the input dimension of an image is so high, the gradient estimate is noisy; hence, we adopt a zeroth-order optimization method suited to high dimensions. The approach converts the gradient calculation into a linear regression model and extracts the more significant dimensions. Experimental results illustrate that our approach can reduce the number of queries and effectively accelerate convergence of the optimization method.
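As background for the gradient-estimation step, the sketch below shows a generic random-direction zeroth-order gradient estimator built purely from function queries; it illustrates the idea, not the paper's regression-based high-dimensional variant.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, n_samples=50, rng=None):
    """Estimate the gradient of a black-box scalar function f at x using
    random-direction finite differences (one query per direction plus one at x)."""
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(x)
    fx = f(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.size)
        grad += (f(x + mu * u) - fx) / mu * u
    return grad / n_samples

f = lambda v: float(np.sum(v ** 2))      # toy objective; true gradient is 2x
print(zo_gradient(f, np.ones(5)))        # noisy estimate of [2, 2, 2, 2, 2]
```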
Zhao, Pu, Liu, Sijia, Chen, Pin-Yu, Hoang, Nghia, Xu, Kaidi, Kailkhura, Bhavya, Lin, Xue.  2019.  On the Design of Black-Box Adversarial Examples by Leveraging Gradient-Free Optimization and Operator Splitting Method. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). :121–130.
Robust machine learning is currently one of the most prominent topics, with the potential to help shape a future of advanced AI platforms that perform well not only in average cases but also in worst cases or adverse situations. Despite this long-term vision, however, existing studies on black-box adversarial attacks are still restricted to very specific threat-model settings (e.g., a single distortion metric and restrictive assumptions on the target model's feedback to queries) and/or suffer from prohibitively high query complexity. To push for further advances in this field, we introduce a general framework based on an operator splitting method, the alternating direction method of multipliers (ADMM), to devise efficient, robust black-box attacks that work with various distortion metrics and feedback settings without incurring high query complexity. Due to the black-box nature of the threat model, the proposed ADMM solution framework is integrated with zeroth-order (ZO) optimization and Bayesian optimization (BO), and is thus applicable to the gradient-free regime. This results in two new black-box adversarial attack generation methods, ZO-ADMM and BO-ADMM. Our empirical evaluations on image classification datasets show that our proposed approaches have much lower function query complexities than state-of-the-art attack methods, yet achieve very competitive attack success rates.
Tsingenopoulos, Ilias, Preuveneers, Davy, Joosen, Wouter.  2019.  AutoAttacker: A reinforcement learning approach for black-box adversarial attacks. 2019 IEEE European Symposium on Security and Privacy Workshops (EuroS PW). :229–237.
Recent research has shown that machine learning models are susceptible to adversarial examples, allowing attackers to trick a machine learning model into making a mistake and producing an incorrect output. Adversarial examples are commonly constructed or discovered using gradient-based methods that require white-box access to the model. In most real-world AI system deployments, having complete access to the machine learning model is an unrealistic threat model. However, it is possible for an attacker to construct adversarial examples even in the black-box case, where we assume solely a query capability to the model, with a variety of approaches, each with its advantages and shortcomings. We introduce AutoAttacker, a novel reinforcement learning framework where agents learn how to operate around the black-box model by querying it, to effectively extract the underlying decision behaviour and to undermine it successfully. AutoAttacker is a first-of-its-kind framework that uses reinforcement learning and assumes nothing about the differentiability or structure of the underlying function, and is thus robust to common defenses like gradient obfuscation or adversarial training. Finally, without differentiable output, as in binary classification, most methods cease to operate and require either an approximation of the gradient or another approach altogether. Our approach, however, maintains the capability to function when the output descriptiveness diminishes.
Usama, Muhammad, Qayyum, Adnan, Qadir, Junaid, Al-Fuqaha, Ala.  2019.  Black-box Adversarial Machine Learning Attack on Network Traffic Classification. 2019 15th International Wireless Communications Mobile Computing Conference (IWCMC). :84–89.
Deep machine learning techniques have shown promising results in network traffic classification; however, the robustness of these techniques under adversarial threats is still in question. Deep machine learning models have been found vulnerable to small, carefully crafted adversarial perturbations, raising a major question about the performance of deep machine learning techniques. In this paper, we propose a black-box adversarial attack on network traffic classification. The proposed attack successfully evades deep machine learning-based classifiers, which highlights the potential security threat of using deep machine learning techniques to realize autonomous networks.
Jing, Huiyun, Meng, Chengrui, He, Xin, Wei, Wei.  2019.  Black Box Explanation Guided Decision-Based Adversarial Attacks. 2019 IEEE 5th International Conference on Computer and Communications (ICCC). :1592–1596.
Adversarial attacks have become a hot research field in artificial intelligence security. Decision-based black-box adversarial attacks are much more appropriate in real-world scenarios, where only the final decisions of the targeted deep neural networks are accessible. However, since there is no available guidance for searching for the imperceptible adversarial perturbation, the boundary attack, one of the best-performing decision-based black-box attacks, carries out a computationally expensive search. To improve attack efficiency, we propose a novel black-box-explanation-guided decision-based black-box adversarial attack. Firstly, the problem of decision-based adversarial attacks is modeled as a derivative-free, constrained optimization problem. To solve this optimization problem, a black-box-explanation-guided constrained random search method is proposed to find the imperceptible adversarial example more quickly. The insights into the targeted deep neural networks uncovered by the black-box explanation are fully used to accelerate the computationally expensive random search. Experimental results demonstrate that our proposed attack improves attack efficiency by 64% compared with the boundary attack.
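As a point of comparison, one step of a plain (unguided) constrained random search for a hard-label attack can look like the sketch below; the abstract's contribution is to guide such proposals with black-box explanations, which this illustrative snippet does not include.

```python
import numpy as np

def random_search_step(classify, x_adv, x_orig, sigma=0.1, rng=None):
    """Propose a random perturbation; accept it only if the hard-label decision
    stays different from the original and the example moves closer to x_orig."""
    rng = rng or np.random.default_rng()
    proposal = x_adv + sigma * rng.standard_normal(x_adv.shape)
    closer = np.linalg.norm(proposal - x_orig) < np.linalg.norm(x_adv - x_orig)
    if closer and classify(proposal) != classify(x_orig):
        return proposal
    return x_adv

classify = lambda v: int(v.sum() > 0)    # stand-in hard-label model
x_orig, x_adv = np.ones(4), -np.ones(4)
print(random_search_step(classify, x_adv, x_orig))
```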
Liang, Jiaqi, Li, Linjing, Chen, Weiyun, Zeng, Daniel.  2019.  Targeted Addresses Identification for Bitcoin with Network Representation Learning. 2019 IEEE International Conference on Intelligence and Security Informatics (ISI). :158–160.
The anonymity and decentralization of Bitcoin make it widely accepted in illegal transactions, such as money laundering, drug and weapon trafficking, and gambling, to name a few, which has already caused significant security risks around the world. The obvious de-anonymization approach of matching transaction addresses to users is not possible in practice due to the limited annotated data available. In this paper, we divide addresses into four types (exchange, gambling, service, and general) and propose targeted address identification algorithms with high fault tolerance that may be employed in a wide range of applications. We use network representation learning to extract features and train imbalanced multi-classifiers. Experimental results validate the effectiveness of the proposed method.
Laguduva, Vishalini, Islam, Sheikh Ariful, Aakur, Sathyanarayanan, Katkoori, Srinivas, Karam, Robert.  2019.  Machine Learning Based IoT Edge Node Security Attack and Countermeasures. 2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI). :670–675.
Advances in technology have enabled tremendous progress in the development of a highly connected ecosystem of ubiquitous computing devices collectively called the Internet of Things (IoT). Ensuring the security of IoT devices is a high priority due to the sensitive nature of the collected data. Physically Unclonable Functions (PUFs) have emerged as a critical hardware primitive for ensuring the security of IoT nodes. Malicious modeling of PUF architectures has proven to be difficult due to their inherently stochastic nature. Extant approaches to malicious PUF modeling assume that a priori knowledge of, and physical access to, the PUF architecture are available for a malicious attack on the IoT node. However, many IoT networks make the underlying assumption that the PUF architecture is sufficiently tamper-proof, both physically and mathematically. In this work, we show that knowledge of the underlying PUF structure is not necessary to clone a PUF. We present a novel non-invasive, architecture-independent machine learning attack on strong PUF designs, with a cloning accuracy of 93.5% and improvements of up to 48.31% over an alternative two-stage brute force attack model. We also propose a machine learning-based countermeasure, a discriminator, which can distinguish cloned PUF devices from authentic PUFs with an average accuracy of 96.01%. The proposed discriminator can be used to rapidly authenticate millions of IoT nodes remotely from the cloud server.
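The core modeling idea (learning a PUF's challenge-response behavior from observed pairs) can be sketched on synthetic data as below; the linear response model and logistic regression are illustrative stand-ins, not the paper's architecture-independent attack.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, k = 5000, 64
challenges = rng.integers(0, 2, size=(n, k)).astype(float)
secret = rng.standard_normal(k)                     # stand-in for device delays
responses = (challenges @ secret > 0).astype(int)   # synthetic 1-bit responses

# "Clone" the PUF by fitting a model to observed challenge-response pairs.
clone = LogisticRegression(max_iter=1000).fit(challenges[:4000], responses[:4000])
print("clone accuracy:", clone.score(challenges[4000:], responses[4000:]))
```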
Sutton, Sara, Bond, Benjamin, Tahiri, Sementa, Rrushi, Julian.  2019.  Countering Malware Via Decoy Processes with Improved Resource Utilization Consistency. 2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). :110–119.
The concept of a decoy process is a new development of defensive deception beyond traditional honeypots. Decoy processes can be exceptionally effective in detecting malware, directly upon contact or by redirecting malware to decoy I/O. A key requirement is that they resemble their real counterparts very closely to withstand adversarial probes by threat actors. To be usable, decoy processes need to consume only a small fraction of the resources consumed by their real counterparts. Our contribution in this paper is twofold. We attack the resource utilization consistency of decoy processes provided by a neural network with a heatmap training mechanism, which we find to be insufficiently trained. We then devise machine learning over control flow graphs that improves the heatmap training mechanism. A neural network retrained by our work shows higher accuracy and defeats our attacks without a significant increase in its own resource utilization.
Hasanin, Tawfiq, Khoshgoftaar, Taghi M., Leevy, Joffrey L..  2019.  A Comparison of Performance Metrics with Severely Imbalanced Network Security Big Data. 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI). :83–88.
Severe class imbalance between the majority and minority classes in large datasets can prejudice Machine Learning classifiers toward the majority class. Our work uniquely consolidates two case studies, each utilizing three learners implemented within an Apache Spark framework, six sampling methods, and five sampling distribution ratios to analyze the effect of severe class imbalance on big data analytics. We use three performance metrics to evaluate this study: Area Under the Receiver Operating Characteristic Curve, Area Under the Precision-Recall Curve, and Geometric Mean. In the first case study, models were trained on one dataset (POST) and tested on another (SlowlorisBig). In the second case study, the training and testing dataset roles were switched. Our comparison of performance metrics shows that Area Under the Precision-Recall Curve and Geometric Mean are sensitive to changes in the sampling distribution ratio, whereas Area Under the Receiver Operating Characteristic Curve is relatively unaffected. In addition, we demonstrate that when comparing sampling methods, borderline-SMOTE2 outperforms the other methods in the first case study, and Random Undersampling is the top performer in the second case study.
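The three performance metrics the study relies on can be computed as sketched below (using scikit-learn's average precision as the usual estimate of the Area Under the Precision-Recall Curve); the tiny imbalanced sample is fabricated for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score, confusion_matrix

y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])        # imbalanced toy labels
y_score = np.array([.1, .2, .15, .3, .05, .4, .2, .35, .7, .45])
y_pred = (y_score >= 0.5).astype(int)

auc_roc = roc_auc_score(y_true, y_score)
auc_pr = average_precision_score(y_true, y_score)          # AUPRC estimate
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
g_mean = np.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))      # sqrt(TPR * TNR)
print(auc_roc, auc_pr, g_mean)
```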
Yau, Yiu Chung, Khethavath, Praveen, Figueroa, Jose A..  2019.  Secure Pattern-Based Data Sensitivity Framework for Big Data in Healthcare. 2019 IEEE International Conference on Big Data, Cloud Computing, Data Science Engineering (BCD). :65–70.
With the exponential growth in the usage of electronic medical records (EMR), the amount of data generated by the healthcare industry has also increased exponentially. These large amounts of data, known as “Big Data,” are mostly unstructured, and special big data analytics methods are required to process the information and retrieve meaningful information. As patient information in hospitals and other healthcare facilities becomes increasingly electronic, Big Data technologies are needed now more than ever to manage and understand this data. In addition, this information tends to be quite sensitive and needs a highly secure environment. However, current security algorithms are hard to implement because they would take a huge amount of time and resources, and security protocols in Big Data are not adequate for protecting sensitive healthcare information. As a result, healthcare data is both heterogeneous and insecure. As a solution, we propose the Secure Pattern-Based Data Sensitivity Framework (PBDSF), which uses machine learning mechanisms to identify the common set of attributes of patient data, data frequency, and various patterns of codes used to identify specific conditions, in order to secure sensitive information. The framework uses Hadoop and is built on the Hadoop Distributed File System (HDFS) as the basis for clusters of machines that process Big Data and perform tasks such as identifying sensitive information in huge amounts of data and encrypting the data identified as sensitive.