Machine Learning 2015

Machine Learning


Machine learning offers potential efficiencies and is an important tool in data mining. However, the “learned” or derived data must maintain integrity. Machine learning can also be used to identify threats and attacks. Research in this field relates to resilient architectures, composability, and privacy. Works cited here appeared in 2015.

Gaikwad, D.P.; Thool, R.C., “Intrusion Detection System Using Bagging Ensemble Method of Machine Learning,” in Computing Communication Control and Automation (ICCUBEA), 2015 International Conference on, vol., no., pp. 291–295, 26–27 Feb. 2015. doi:10.1109/ICCUBEA.2015.61
Abstract: Intrusion detection systems are widely used to protect information systems and reduce damage to them. They protect virtual and physical computer networks against threats and vulnerabilities. Presently, machine learning techniques are widely used to implement effective intrusion detection systems. Neural networks, statistical models, rule learning, and ensemble methods are some of the machine learning methods used for intrusion detection. Among them, ensemble methods are known for good performance in the learning process. Investigating an appropriate ensemble method is essential for building an effective intrusion detection system. In this paper, a novel intrusion detection technique based on an ensemble method of machine learning is proposed. The Bagging ensemble method with REPTree as the base classifier is used to implement the intrusion detection system. Relevant features from the NSL_KDD dataset are selected to improve the classification accuracy and reduce the false positive rate. The performance of the proposed ensemble method is evaluated in terms of classification accuracy, model building time, and false positives. The experimental results show that the Bagging ensemble with the REPTree base classifier exhibits the highest classification accuracy. One advantage of the Bagging method is that it takes less time to build the model. The proposed ensemble method provides competitively low false positives compared with other machine learning techniques.
Keywords: data analysis; learning (artificial intelligence); neural nets; security of data; statistical analysis; trees (mathematics); NSL-KDD dataset; REPTree; classification accuracy; intrusion detection system; machine learning techniques; neural network; physical computer networks; statistical models; using bagging ensemble method; virtual computer networks; Accuracy; Bagging; Classification algorithms; Feature extraction; Hidden Markov models; Intrusion detection; Training; Bagging; Ensemble; False positives; Machine learning; REPTree; intrusion detection (ID#: 15-6556)
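The bagging idea named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-feature decision stumps in place of REPTree and a small hypothetical toy dataset in place of NSL-KDD, and every function name here is made up for illustration.

```python
import random
from collections import Counter

def train_stump(samples):
    """Fit a one-feature threshold classifier by exhaustive search.

    Stand-in for the REPTree base learner (an assumption of this sketch).
    """
    best = None
    n_features = len(samples[0][0])
    for f in range(n_features):
        for x, _ in samples:
            t = x[f]
            for polarity in (0, 1):
                # Predict `polarity` when the feature exceeds t, else the other class.
                errors = sum(
                    1 for xi, yi in samples
                    if (polarity if xi[f] > t else 1 - polarity) != yi
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, polarity)
    _, f, t, polarity = best
    return lambda x: polarity if x[f] > t else 1 - polarity

def bagging_fit(samples, n_models=15, seed=0):
    """Train each base learner on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_stump(bootstrap))
    return models

def bagging_predict(models, x):
    """Combine the base learners by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (features, label), with 1 = attack and 0 = normal.
DATA = [((0.1, 5.0), 0), ((0.2, 4.0), 0), ((0.3, 6.0), 0), ((0.4, 5.5), 0),
        ((0.9, 5.0), 1), ((0.8, 4.5), 1), ((0.7, 6.0), 1), ((0.95, 5.2), 1)]

models = bagging_fit(DATA)
print(bagging_predict(models, (0.05, 5.0)), bagging_predict(models, (0.99, 5.0)))
```

Because each base learner sees a different resample, the vote tends to be more stable than any single learner; the base learners are also independent of one another, so they could be trained in parallel.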


Sapegin, A.; Gawron, M.; Jaeger, D.; Feng Cheng; Meinel, C., “High-Speed Security Analytics Powered by In-Memory Machine Learning Engine,” in Parallel and Distributed Computing (ISPDC), 2015 14th International Symposium on, vol., no., pp. 74–81, June 29 2015–July 2 2015. doi:10.1109/ISPDC.2015.16
Abstract: Modern Security Information and Event Management systems should be capable of storing and processing large numbers of events or log messages in different formats and from different sources. This requirement often prevents such systems from using computationally heavy algorithms for security analysis. To deal with this issue, we built our system based on an in-memory database with an integrated machine learning library, namely SAP HANA. Three approaches, i.e., (1) deep normalisation of log messages, (2) storing data in the main memory, and (3) running data analysis directly in the database, allow us to increase processing speed to the point that machine learning analysis of security events becomes possible nearly in real time. To prove our concepts, we measured the processing speed of the developed system on data generated using an Active Directory testbed and showed the efficiency of our approach for high-speed analysis of security events.
Keywords: data analysis; learning (artificial intelligence); security of data; SAP HANA; active directory; computational-heavy algorithms; data analysis; deep log message normalisation; high-speed security analytics; high-speed security event analysis; in-memory database; in-memory machine learning engine; integrated machine learning library; machine learning analysis; security information and event management systems; Algorithm design and analysis; Computers; Databases; Libraries; Machine learning algorithms; Prediction algorithms; Security; in-memory; intrusion detection; machine learning; security (ID#: 15-6557)


Mehta, V.; Bahadur, P.; Kapoor, M.; Singh, P.; Rajpoot, S., “Threat Prediction Using Honeypot and Machine Learning,” in Futuristic Trends on Computational Analysis and Knowledge Management (ABLAZE), 2015 International Conference on, vol., no., pp. 278–282, 25–27 Feb. 2015. doi:10.1109/ABLAZE.2015.7155011
Abstract: Data is an abstraction which encapsulates information. In today’s era, businesses are data driven: data gives insight for predicting the future of the business, but it also helps in monitoring the present health of the business and in looking back to answer important questions such as what exactly went wrong in the past. In this paper we look into the architecture of frameworks that can predict threats using a honeypot as the source of data and various machine learning algorithms to make precise predictions, using OSSEC as the Host Intrusion Detection System (HIDS), SNORT as the Network Intrusion Detection System (NIDS), and Honeyd, an open source honeypot.
Keywords: business data processing; computer network security; data encapsulation; learning (artificial intelligence); public domain software; HIDS; Honeyd; NIDS; OSSEC; SNORT; business prediction; data source; frameworks architecture; host intrusion detection system; information encapsulation; machine learning; network intrusion detection system; open source honeypot; threat prediction; Computer hacking; Conferences; IP networks; Intrusion detection; Market research; Operating systems; Ports (Computers); High Interaction Honeypots (HIH); Host Intrusion Detection System (HIDS); Low Interaction Honeypots (LIH); Network Intrusion Detection System (NIDS) (ID#: 15-6558)


Tagluk, M.E.; Mamis, M.S.; Arkan, M.; Ertugrul, O.F., “Detecting Fault Type and Fault Location in Power Transmission Lines by Extreme Learning Machines,” in Signal Processing and Communications Applications Conference (SIU), 2015 23th, vol., no., pp. 1090–1093, 16–19 May 2015. doi:10.1109/SIU.2015.7130024
Abstract: The importance of supplying high-quality, uninterrupted electricity is increasing day by day. Therefore, detecting faults, fault types, and fault locations is a major issue in power transmission systems in order to maintain power delivery system security. In previous studies, we observed that faults can be easily detected by an extreme learning machine (ELM); the aim of this study is to determine the applicability of ELM to fault type, zone, and location detection. Eight different feature sets were extracted from fault data produced by ATP, and these features were assessed by 15 different classifiers and 5 different regression methods. The results showed that ELM can be employed successfully for detecting fault types and locations.
Keywords: fault location; learning (artificial intelligence); power engineering computing; power transmission faults; regression analysis; ELM; extreme learning machines; fault type detection; power transmission lines; regression method; Artificial neural networks; Fault location; Feature extraction; Niobium; Optical wavelength conversion; Power transmission lines; Support vector machines; Extreme Learning Machine; Fault Location; Fault Type; Power Transmission Lines (ID#: 15-6559)


Dan Jiang; Omote, K., “An Approach to Detect Remote Access Trojan in the Early Stage of Communication,” in Advanced Information Networking and Applications (AINA), 2015 IEEE 29th International Conference on, vol., no., pp. 706–713, 24–27 March 2015. doi:10.1109/AINA.2015.257
Abstract: As data leakage accidents occur every year, the security of confidential information is becoming increasingly important. Remote Access Trojans (RAT), a kind of spyware, are used to invade the PC of a victim through targeted attacks. After the intrusion, the attacker can monitor and control the victim’s PC remotely, to wait for an opportunity to steal the confidential information. Since it is hard to prevent the intrusion of RATs completely, preventing confidential information from being leaked back to the attacker is the main issue. Various existing approaches use different network behaviors of RATs to construct detection systems. Unfortunately, two challenges remain: one is to detect RAT sessions as early as possible, the other is to maintain high accuracy in detecting RAT sessions while there exist normal applications whose traffic behaves similarly to RATs. In this paper, we propose a novel approach to detect RAT sessions in the early stage of communication. To differentiate network behaviors between normal applications and RATs, we extract features from the traffic of a short period of time at the beginning. Afterward, we use machine learning techniques to train the detection model, then evaluate it by K-Fold cross-validation. The results show that our approach is able to detect RAT sessions with high accuracy. In particular, our approach achieves over 96% accuracy together with an FNR of 10% using the Random Forest algorithm, which means that our approach is valid for detecting RAT sessions in the early stage of communication.
Keywords: invasive software; learning (artificial intelligence); K-fold cross-validation; RAT sessions; confidential information; data leakage accidents; feature extraction; intrusion; machine learning; network behaviors; random forest algorithm; remote access trojan detection; spyware; Accuracy; Feature extraction; Machine learning algorithms; Rats; Support vector machines; Training; Trojan horses; Remote Access Trojan detection; network behavior; targeted attack (ID#: 15-6560)
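The evaluation terms used in the abstract above (K-fold cross-validation, accuracy, and false negative rate) can be sketched as follows. This is an illustrative outline only: the fold-splitting scheme, the function names, and the example labels are assumptions, not the authors' data or code.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

def detection_rates(y_true, y_pred):
    """Return (accuracy, FNR, FPR), where label 1 = RAT session and 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # missed RAT sessions
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # normal sessions flagged as RAT
    return accuracy, fnr, fpr

# Hypothetical labels and predictions for ten sessions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

accuracy, fnr, fpr = detection_rates(y_true, y_pred)
folds = list(k_fold_indices(len(y_true), 3))
print(accuracy, fnr, fpr)
```

Reporting FNR next to accuracy matters because the two can diverge: when normal sessions dominate the data, a detector can score high accuracy while still missing a noticeable share of RAT sessions.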


Kawaguchi, N.; Omote, K., “Malware Function Classification Using APIs in Initial Behavior,” in Information Security (AsiaJCIS), 2015 10th Asia Joint Conference on, vol., no., pp.138–144, 24–26 May 2015. doi:10.1109/AsiaJCIS.2015.15
Abstract: Malware proliferation has become a serious threat to the Internet in recent years. Most current malware are subspecies of existing malware that have been automatically generated by illegal tools. To analyze malware efficiently, estimating their functions in advance is effective for deciding which samples to analyze first. However, estimating malware functions has been difficult due to the increasing sophistication of malware. Although various approaches for malware detection and classification have been considered, the classification accuracy is still low. In this paper, we propose a new classification method which estimates malware’s functions from APIs observed by dynamic analysis on a host. We examine whether the proposed method can correctly classify unknown malware based on function by machine learning. The results show that our new method can classify each malware’s function with an average accuracy of 83.4%.
Keywords: Internet; invasive software; learning (artificial intelligence); pattern classification; API; Internet; dynamic analysis; efficient malware analysis; illegal tools; initial behavior; machine learning; malware detection; malware function classification; malware proliferation; Accuracy; Data mining; Feature extraction; Machine learning algorithms; Malware; Software; Support vector machines; malware classification (ID#: 15-6561)


Beiye Liu; Chunpeng Wu; Hai Li; Yiran Chen; Qing Wu; Barnell, M.; Qinru Qiu, “Cloning Your Mind: Security Challenges in Cognitive System Designs and Their Solutions,” in Design Automation Conference (DAC), 2015 52nd ACM/EDAC/IEEE, vol., no., pp. 1–5, 8–12 June 2015. doi:10.1145/2744769.2747915
Abstract: With the booming of big-data applications, cognitive information processing systems that leverage advanced data processing technologies, e.g., machine learning and data mining, are widely used in many industry fields. Although these technologies demonstrate great processing capability and accuracy in the relevant applications, several security and safety challenges are also emerging against these learning based technologies. In this paper, we will first introduce several security concerns in cognitive system designs. Some real examples are then used to demonstrate how the attackers can potentially access the confidential user data, replicate a sensitive data processing model without being granted the access to the details of the model, and obtain some key features of the training data by using the services publically accessible to a normal user. Based on the analysis of these security challenges, we also discuss several possible solutions that can protect the information privacy and security of cognitive systems during different stages of the usage.
Keywords: Big Data; cognition; security of data; Big-Data application; cognitive information processing systems; cognitive system design; data mining; data security; machine learning; sensitive data processing model; Data models; Neural networks; Predictive models; Security; Training; Training data; Cognitive Systems; Machine Learning; Security (ID#: 15-6562)


Chih-Hung Hsieh; Yu-Siang Shen; Chao-Wen Li; Jain-Shing Wu, “iF2: An Interpretable Fuzzy Rule Filter for Web Log Post-Compromised Malicious Activity Monitoring,” in Information Security (AsiaJCIS), 2015 10th Asia Joint Conference on, vol., no., pp.130–137, 24–26 May 2015. doi:10.1109/AsiaJCIS.2015.19
Abstract: To alleviate the load of tracking web log files by human effort, machine learning methods are now commonly used to analyze log data and to identify patterns of malicious activity. Traditional kernel based techniques, like the neural network and the support vector machine (SVM), typically can deliver higher prediction accuracy. However, the user of kernel based techniques normally cannot get an overall picture of the distribution of the data set. On the other hand, logic based techniques, such as the decision tree and the rule-based algorithm, have the advantage of presenting a good summary of the distinctive characteristics of different classes of data, so they are more suitable for generating interpretable feedback for domain experts. In this study, a real web-access log dataset from a certain organization was collected. An efficient interpretable fuzzy rule filter (iF2) was proposed to analyze the data and to distinguish suspicious internet addresses from normal ones. The historical information of each internet address recorded in the web log file is summarized as multiple statistics. The design process of iF2 is modeled as a parameter optimization problem which simultaneously considers 1) maximizing prediction accuracy, 2) minimizing the number of rules used, and 3) minimizing the number of selected statistics. Experimental results show that the fuzzy rule filter constructed with the proposed approach is capable of delivering superior prediction accuracy in comparison with the conventional logic based classifiers and the expectation maximization based kernel algorithm. Although it cannot match the prediction accuracy delivered by the SVM, when facing real web log files where the ratio of positive to negative cases is extremely unbalanced, the proposed iF2, with its optimization flexibility, achieves a better recall rate and has the major advantage of providing the user with an overall picture of the underlying distributions.
Keywords: Internet; data mining; fuzzy set theory; learning (artificial intelligence); neural nets; pattern classification; statistical analysis; support vector machines; Internet address; SVM; Web log file tracking; Web log post-compromised malicious activity monitoring; Web-access log dataset; decision tree; expectation maximization based kernel algorithm; fuzzy rule filter; iF2; interpretable fuzzy rule filter; kernel based techniques; log data analysis; logic based classifiers; logic based techniques; machine learning methods; malicious activities; neural network; parameter optimization problem; recall rate; rule-based algorithm; support vector machine; Accuracy; Internet; Kernel; Monitoring; Optimization; Prediction algorithms; Support vector machines; Fuzzy Rule Based Filter; Machine Learning; Parameter Optimization; Pattern Recognition; Post-Compromised Threat Identification; Web Log Analysis (ID#: 15-6563)
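The abstract above describes summarizing each internet address's web log history as multiple statistics before any rule is applied. A minimal sketch of that preprocessing step follows; the record layout and the three statistics chosen here (request count, error ratio, distinct paths) are assumptions for illustration, since the abstract does not list the statistics actually used.

```python
from collections import defaultdict

def summarize_by_address(records):
    """Aggregate (ip, path, status) log records into per-address statistics."""
    grouped = defaultdict(list)
    for ip, path, status in records:
        grouped[ip].append((path, status))

    summary = {}
    for ip, hits in grouped.items():
        n_requests = len(hits)
        n_errors = sum(1 for _, status in hits if status >= 400)
        summary[ip] = {
            "requests": n_requests,
            "error_ratio": n_errors / n_requests,
            "distinct_paths": len({path for path, _ in hits}),
        }
    return summary

# Hypothetical log records using documentation-only IP addresses.
RECORDS = [
    ("192.0.2.10", "/index.html", 200),
    ("192.0.2.10", "/about.html", 200),
    ("198.51.100.7", "/admin", 403),
    ("198.51.100.7", "/admin", 404),
    ("198.51.100.7", "/login", 200),
    ("198.51.100.7", "/wp-admin", 404),
]

stats = summarize_by_address(RECORDS)
print(stats["198.51.100.7"])
```

A rule-based filter would then operate on these per-address summaries rather than on raw log lines, which keeps each rule readable for a domain expert.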


Zheng Dong; Kapadia, A.; Blythe, J.; Camp, L.J., “Beyond the Lock Icon: Real-Time Detection of Phishing Websites Using Public Key Certificates,” in Electronic Crime Research (eCrime), 2015 APWG Symposium on, vol., no., pp.1–12, 26–29 May 2015. doi:10.1109/ECRIME.2015.7120795
Abstract: We propose a machine-learning approach to detect phishing websites using features from their X.509 public key certificates. We show that its efficacy extends beyond HTTPS-enabled sites. Our solution enables immediate local identification of phishing sites. As such, this serves as an important complement to the existing server-based anti-phishing mechanisms which predominately use blacklists. Blacklisting suffers from several inherent drawbacks in terms of correctness, timeliness, and completeness. Due to the potentially significant lag prior to site blacklisting, there is a window of opportunity for attackers. Other local client-side phishing detection approaches also exist, but primarily rely on page content or URLs, which are arguably easier to manipulate by attackers. We illustrate that our certificate-based approach greatly increases the difficulty of masquerading undetected for phishers, with single millisecond delays for users. We further show that this approach works not only against HTTPS-enabled phishing attacks, but also detects HTTP phishing attacks with port 443 enabled.
Keywords: Web sites; computer crime; learning (artificial intelligence); public key cryptography; HTTPS-enabled phishing attack; Web site phishing detection; machine-learning approach from; public key certificate; server-based antiphishing mechanism; site blacklisting; Browsers; Electronic mail; Feature extraction; Public key; Servers; Uniform resource locators; certificates; machine learning; security (ID#: 15-6564)


Egemen, E.; Inal, E.; Levi, A., “Mobile Malware Classification Based on Permission Data,” in Signal Processing and Communications Applications Conference (SIU), 2015 23th, vol., no., pp.1529–1532, 16–19 May 2015. doi:10.1109/SIU.2015.7130137
Abstract: The prevalence of mobile devices in today’s world has caused the security of these devices to be questioned more frequently than ever. Android, as one of the most widely used mobile operating systems, is the most likely target for malware delivered through third party applications. In this work, a method has been devised to detect malware that targets the Android platform by using classification based machine learning. In this study, we use the permissions of applications as the features. After the training and test steps on a dataset consisting of 5271 malware and 5097 goodware samples, we conclude that Random Forest classification results in 98% performance on the classification of applications. This work emphasizes how much mobile malware classification results can be improved by a system using only the permissions data.
Keywords: Android (operating system); invasive software; learning (artificial intelligence); mobile computing; pattern classification; Android; classification based machine learning; device security; malware detection; mobile devices; mobile malware classification; mobile operating systems; permission data; random forest classification; third party applications; Androids; Google; Humanoid robots; Malware; Mobile communication; Support vector machines; android; classification; machine learning; malware; mobile; permissions (ID#: 15-6565)
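Using application permissions as features, as the abstract above describes, usually means encoding each application as a binary vector over a permission vocabulary. The sketch below shows one such encoding under stated assumptions: the helper names and the three sample applications are hypothetical, and the paper's actual feature pipeline is not specified in the abstract.

```python
def build_vocabulary(apps):
    """Collect every permission seen across the apps, in a stable sorted order."""
    return sorted({perm for perms in apps for perm in perms})

def to_feature_vector(perms, vocabulary):
    """Encode one app as 0/1 flags, one per permission in the vocabulary."""
    requested = set(perms)
    return [1 if perm in requested else 0 for perm in vocabulary]

# Hypothetical permission lists for three applications.
APPS = [
    ["android.permission.INTERNET"],
    ["android.permission.INTERNET", "android.permission.READ_CONTACTS"],
    ["android.permission.SEND_SMS", "android.permission.INTERNET"],
]

vocabulary = build_vocabulary(APPS)
vectors = [to_feature_vector(perms, vocabulary) for perms in APPS]
print(vocabulary)
print(vectors)
```

Vectors of this form, paired with malware or goodware labels, are the kind of input a classifier such as Random Forest can be trained on.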


Patrascu, Alecsandru; Patriciu, Victor-Valeriu, “Cyber Protection of Critical Infrastructures Using Supervised Learning,” in Control Systems and Computer Science (CSCS), 2015 20th International Conference on, vol., no., pp. 461–468, 27–29 May 2015. doi:10.1109/CSCS.2015.34
Abstract: Interconnected computing units are used more and more in our daily lives, from transportation systems to gas and electricity distribution, together with tens or hundreds of systems and sensors, called critical infrastructures. In this context, cyber protection is vital because they represent one of the most important parts of a country’s economy, thus making them very attractive to cyber criminals or malware attacks. Even though the detection technologies for new threats have improved over time, modern malware still manages to pass even the most secure and well organized computer networks, firewalls, and intrusion detection equipment, making all systems vulnerable. This is the main reason that automatic learning is used more often than other detection algorithms, as it can learn from existing attacks and prevent newer ones. In this paper we discuss the issues threatening critical infrastructure systems and propose a framework based on machine learning algorithms and game theory decision models that can be used to protect such systems. We present the results obtained after implementing it using three distinct classifiers: k nearest neighbors, decision trees, and support vector machines.
Keywords: Biological system modeling; Game theory; Security; Sensors; Support vector machines; Testing; Training; critical infrastructure protection; cybersecurity framework; game theory decision engine; machine learning (ID#: 15-6566)


Adachi, T.; Omote, K., “An Approach to Predict Drive-by-Download Attacks by Vulnerability Evaluation and Opcode,” in Information Security (AsiaJCIS), 2015 10th Asia Joint Conference on, vol., no., pp. 145–151, 24–26 May 2015. doi:10.1109/AsiaJCIS.2015.17
Abstract: Drive-by-download attacks exploit vulnerabilities in Web browsers, and users who access compromised Web sites download malware without noticing. A number of detection approaches and tools against such attacks have been proposed so far. In particular, it is becoming easy to specify the vulnerabilities that attacks exploit, because researchers analyze the trends of various attacks well. Unfortunately, in previous schemes, vulnerability information has not been used in the detection/prediction approaches for drive-by-download attacks. In this paper, we propose an approach to predicting “malware downloading” during drive-by-download attacks (approach-I), which uses vulnerability information. Our experimental results show that approach-I achieves a prediction rate (accuracy) of 92%, an FNR of 15%, and an FPR of 1.0% using Naive Bayes. Furthermore, we propose an enhanced approach (approach-II) which embeds Opcode analysis (dynamic analysis) into approach-I (static approach). We implement approach-I and approach-II, and compare the three approaches (approach-I, approach-II, and the Opcode approach) using the same datasets in our experiment. As a result, approach-II has a prediction rate of 92% and improves the FNR to 11% using Random Forest, compared with approach-I.
Keywords: Web sites; invasive software; learning (artificial intelligence); system monitoring; FNR; FPR; Opcode analysis; Web browsers; Web sites; attack vulnerabilities; drive-by-download attack prediction; dynamic analysis; malware downloading; naive Bayes; prediction rate; random forest; static approach; vulnerability evaluation; vulnerability information; Browsers; Feature extraction; Machine learning algorithms; Malware; Predictive models; Probability; Web pages; Drive-by-Download Attacks; Malware; Supervised Machine Learning (ID#: 15-6567)


Gilmore, R.; Hanley, N.; O’Neill, M., “Neural Network Based Attack on a Masked Implementation of AES,” in Hardware Oriented Security and Trust (HOST), 2015 IEEE International Symposium on, vol., no., pp. 106–111, 5–7 May 2015. doi:10.1109/HST.2015.7140247
Abstract: Masked implementations of cryptographic algorithms are often used in commercial embedded cryptographic devices to increase their resistance to side channel attacks. In this work we show how neural networks can be used to both identify the mask value, and to subsequently identify the secret key value with a single attack trace with high probability. We propose the use of a pre-processing step using principal component analysis (PCA) to significantly increase the success of the attack. We have developed a classifier that can correctly identify the mask for each trace, hence removing the security provided by that mask and reducing the attack to being equivalent to an attack against an unprotected implementation. The attack is performed on the freely available differential power analysis (DPA) contest data set to allow our work to be easily reproducible. We show that neural networks allow for a robust and efficient classification in the context of side-channel attacks.
Keywords: cryptography; neural nets; pattern classification; principal component analysis; AES; Advanced Encryption Standard; DPA; PCA; cryptographic algorithms; differential power analysis contest data set; embedded cryptographic devices; machine learning; mask value identification; masked implementation; neural network based attack; principal component analysis; secret key value identification; side channel attacks; Artificial neural networks; Cryptography; Error analysis; Hardware; Power demand; Principal component analysis; Training; AES; SCA; masking; neural network (ID#: 15-6568)


Kavitha, P.; Mukesh, R., “To Detect Malicious Nodes in the Mobile Ad-hoc Networks Using Soft Computing Technique,” in Electronics and Communication Systems (ICECS), 2015 2nd International Conference on, vol., no., pp.1564–1573, 26–27 Feb. 2015. doi:10.1109/ECS.2015.7124851
Abstract: A Mobile Ad-hoc Network (MANET) is a continuously self-configuring, infrastructure-less network of mobile devices in which each device is wireless, moves freely, and acts as a router that forwards traffic unrelated to its own use. Every device must continuously maintain the information required to route traffic, and this is the main challenge in building a MANET. Such networks may operate by themselves or be linked to the larger Internet, and may have one or multiple different transceivers between nodes, resulting in a highly dynamic and autonomous topology. The first focus is on MANET attacks, followed by detection of malicious nodes in a MANET via a Polynomial-Reduction Algorithm. Although researchers have assessed many algorithms for the detection and rectification of malicious nodes in MANETs, the problem still persists. Due to the unprecedented growth in technology, unidentified vulnerabilities are also increasing. Therefore, it is crucial to come up with new ideas to protect the MANET. In this paper we use the NS2 simulator to implement malicious nodes in a MANET.
Keywords: Internet; learning (artificial intelligence); mobile ad hoc networks; polynomials; telecommunication network routing; telecommunication network topology; telecommunication security; telecommunication traffic; uncertainty handling; Internet; MANET attacks; NS2 simulator; autonomous topology; dynamic topology; infrastructure-less network; machine learning algorithm; malicious node detection; malicious node rectification; mobile devices; polynomial-reduction algorithm; self-configuring network; soft computing technique; traffic routing; transceivers; Mobile ad hoc networks; Mobile communication; Routing; Routing protocols; Security; Machine Learning Algorithm; Mobile Ad-hoc Networks; Polynomial-Reduction Algorithm (ID#: 15-6569)


Neelam, Sahil; Sood, Sandeep; Mehmi, Sandeep; Dogra, Shikha, “Artificial Intelligence for Designing User Profiling System for Cloud Computing Security: Experiment,” in Computer Engineering and Applications (ICACEA), 2015 International Conference on Advances in, vol., no., pp. 51–58, 19–20 March 2015. doi:10.1109/ICACEA.2015.7164645
Abstract: In Cloud Computing security, the existing mechanisms: Anti-virus programs, Authentications, Firewalls are not able to withstand the dynamic nature of threats. So, User Profiling System, which registers user’s activities to analyze user’s behavior, augments the security system to work in proactive and reactive manner and provides an enhanced security. This paper focuses on designing a User Profiling System for Cloud environment using Artificial Intelligence techniques and studies behavior (of User Profiling System) and proposes a new hybrid approach, which will deliver a comprehensive User Profiling System for Cloud Computing security.
Keywords: artificial intelligence; authorisation; cloud computing; firewalls; antivirus programs; artificial intelligence techniques; authentications; cloud computing security; cloud environment; firewalls; proactive manner; reactive manner; user activities; user behavior; user profiling system; Artificial intelligence; Cloud computing; Computational modeling; Fuzzy logic; Fuzzy systems; Genetic algorithms; Security; Artificial Intelligence; Artificial Neural Networks; Cloud Computing; Datacenters; Expert Systems; Genetics; Machine Learning; Multi-tenancy; Networking Systems; Pay-as-you-go Model (ID#: 15-6570)


Enache, Adriana-Cristina; Sgarciu, Valentin; Petrescu-Nita, Alina, “Intelligent Feature Selection Method Rooted in Binary Bat Algorithm for Intrusion Detection,” in Applied Computational Intelligence and Informatics (SACI), 2015 IEEE 10th Jubilee International Symposium on, vol., no., pp. 517–521, 21–23 May 2015. doi:10.1109/SACI.2015.7208259
Abstract: The multitude of hardware and software applications generates a lot of data and burdens security solutions that must acquire information from all these heterogeneous systems. Adding the current dynamic and complex cyber threats to this context makes it clear that new security solutions are needed. In this paper we propose a wrapper feature selection approach that combines two machine learning algorithms with an improved version of the Binary Bat Algorithm. Tests on the NSL-KDD dataset empirically show that our proposed method can reduce the number of features by almost 60% and obtains good results in terms of attack detection rate and false alarm rate, even for unknown attacks.
Keywords: Feature extraction; Intrusion detection; Machine learning algorithms; Niobium; Silicon; Support vector machines; Training; Feature selection; Naïve Bayes and BBA; SVM (ID#: 15-6571)


Aggarwal, P.; Sharma, S.K., “An Empirical Comparison of Classifiers to Analyze Intrusion Detection,” in Advanced Computing & Communication Technologies (ACCT), 2015 Fifth International Conference on, vol., no., pp. 446–450, 21–22 Feb. 2015. doi:10.1109/ACCT.2015.59
Abstract: The massive data exchange on the web has deeply increased the risk of malicious activities thereby propelling the research in the area of Intrusion Detection System (IDS). This paper aims to first select ten classification algorithms based on their efficiency in terms of speed, capability to handle large dataset and dependency on parameter tuning and then simulates the ten selected existing classifiers on a data mining tool Weka for KDD’99 dataset. The simulation results are evaluated and benchmarked based on the generic evaluation metrics for IDS like F-score and accuracy.
Keywords: Internet; data mining; electronic data interchange; pattern classification; security of data; F-score; IDS; Web; Weka; classification algorithms; data classifiers; data mining tool; generic evaluation metrics; intrusion detection system; malicious activities; massive data exchange; parameter tuning; Accuracy; Classification algorithms; Intrusion detection; Machine learning algorithms; Mathematical model; Measurement; Vegetation; Classification algorithm; Intrusion detection system; NSL-KDD (ID#: 15-6572)


Mokhov, S.A.; Paquet, J.; Debbabi, M., “MARFCAT: Fast Code Analysis for Defects and Vulnerabilities,” in Software Analytics (SWAN), 2015 IEEE 1st International Workshop on, vol., no., pp. 35–38, 2–2 March 2015. doi:10.1109/SWAN.2015.7070488
Abstract: We present a fast machine-learning approach to static code analysis and fingerprinting for weaknesses related to security, software engineering, and others using the open-source MARF framework and its MARFCAT application. We used the NIST’s SATE IV static analysis tool exposition workshop’s data sets that included popular open-source projects and large synthetic sets as test cases. To aid detection of weak or vulnerable code, including source or binary on different platforms, the machine learning approach proved to be fast and accurate for such tasks where other tools are either much slower or have much smaller recall of known vulnerabilities. We use signal processing techniques in our approach to accomplish the classification tasks. MARFCAT’s design is independent of the language being analyzed, source code, bytecode, or binary.
Keywords: learning (artificial intelligence); pattern classification; program diagnostics; signal processing; MARF-based Code Analysis Tool; MARFCAT; NIST; SATE IV; defects; fingerprinting; machine-learning; open-source MARF framework; open-source projects; signal processing techniques; static analysis tool exposition workshop data sets; static code analysis; vulnerabilities; Algorithm design and analysis; Feature extraction; Indexes; Java; Testing; Wavelet transforms (ID#: 15-6573)


Appelt, D.; Nguyen, C.D.; Briand, L., “Behind an Application Firewall, Are We Safe from SQL Injection Attacks?,” in Software Testing, Verification and Validation (ICST), 2015 IEEE 8th International Conference on, vol., no., pp. 1–10, 13–17 April 2015. doi:10.1109/ICST.2015.7102581
Abstract: Web application firewalls are an indispensable layer to protect online systems from attacks. However, the fast pace at which new kinds of attacks appear and their sophistication require that firewalls be updated and tested regularly as otherwise they will be circumvented. In this paper, we focus our research on web application firewalls and SQL injection attacks. We present a machine learning-based testing approach to detect holes in firewalls that let SQL injection attacks bypass. At the beginning, the approach can automatically generate diverse attack payloads, which can be seeded into inputs of web-based applications, and then submit them to a system that is protected by a firewall. Incrementally learning from the tests that are blocked or passed by the firewall, our approach can then select tests that exhibit characteristics associated with bypassing the firewall and mutate them to efficiently generate new bypassing attacks. In the race against cyber attacks, time is vital. Being able to learn and anticipate more attacks that can circumvent a firewall in a timely manner is very important in order to quickly fix or fine-tune the firewall. We developed a tool that implements the approach and evaluated it on ModSecurity, a widely used application firewall. The results we obtained suggest a good performance and efficiency in detecting holes in the firewall that could let SQLi attacks go undetected.
Keywords: Internet; SQL; firewalls; learning (artificial intelligence); ModSecurity; SQL injection attacks; SQLi attacks; Web application firewalls; bypassing attacks; cyber attacks; machine learning-based testing approach; online system protection; Databases; Grammar; Radio access networks; Security; Servers; Syntactics; Testing (ID#: 15-6574)


Stampar, M.; Fertalj, K., “Artificial Intelligence in Network Intrusion Detection,” in Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2015 38th International Convention on, vol., no., pp.1318–1323, 25–29 May 2015. doi:10.1109/MIPRO.2015.7160479
Abstract: In the past, detection of network attacks was done almost solely by human operators. They anticipated network anomalies in front of consoles, where, based on their expert knowledge, they applied the necessary security measures. With the exponential growth of network bandwidth, this task slowly demanded substantial improvements in both speed and accuracy. One proposed way to achieve this is the use of artificial intelligence (AI), a progressive and promising branch of computer science, and particularly one of its sub-fields, machine learning (ML), where the main idea is learning from data. In this paper, the authors try to give a general overview of AI algorithms, with the main focus on their use for network intrusion detection.
Keywords: computer network security; learning (artificial intelligence); AI algorithm; ML; artificial intelligence; expert knowledge; machine learning; network attacks detection; network bandwidth; network intrusion detection; Artificial intelligence; Artificial neural networks; Classification algorithms; Intrusion detection; Market research; Niobium; Support vector machines (ID#: 15-6575)


Tao Ding; AlEroud, A.; Karabatis, G., “Multi-Granular Aggregation of Network Flows for Security Analysis,” in Intelligence and Security Informatics (ISI), 2015 IEEE International Conference on, vol., no., pp.173–175, 27–29 May 2015. doi:10.1109/ISI.2015.7165965
Abstract: Investigating network flows is an approach to detecting attacks by identifying known patterns. Flow statistics are used to discover anomalies by aggregating network traces and then using machine-learning classifiers to discover suspicious activities. However, the efficiency and effectiveness of the flow classification models depend on the granularity of aggregation. This paper describes a novel approach that aggregates packets into network flows and correlates them with security events generated by payload-based IDSs for detection of cyber-attacks.
Keywords: computer network security; learning (artificial intelligence); pattern classification; statistical analysis; cyber-attack; machine-learning classifier; multigranular aggregation; network flow statistics; payload-based IDS; security analysis; security event; Correlation; Grippers; Hidden Markov models; IP networks; Intrusion detection; Predictive models; Flow aggregation; Intrusion Detection; NetFlow; traffic classification (ID#: 15-6576)
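To make the notion of aggregation granularity in the abstract above concrete, here is a minimal Python sketch. It is not the authors' method: the Packet fields, the aggregate function, and the two granularity settings (time-window length, and 5-tuple versus host-pair keys) are assumptions chosen for illustration. It covers only the aggregation step, not the correlation with IDS events.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; field names are illustrative."""
    ts: float      # capture time in seconds
    src: str       # source address
    dst: str       # destination address
    sport: int     # source port
    dport: int     # destination port
    proto: str     # transport protocol
    size: int      # packet size in bytes


def aggregate(packets, window_s, by_host_pair=False):
    """Group packets into flow records at a chosen granularity.

    window_s sets the time granularity; by_host_pair drops ports and
    protocol from the key, giving a coarser flow definition.
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        bucket = int(p.ts // window_s)
        if by_host_pair:
            key = (bucket, p.src, p.dst)
        else:
            key = (bucket, p.src, p.dst, p.sport, p.dport, p.proto)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p.size
    return dict(flows)


# Hypothetical trace, for illustration only.
trace = [
    Packet(0.5, "a", "b", 1000, 80, "tcp", 100),
    Packet(1.5, "a", "b", 1000, 80, "tcp", 200),
    Packet(1.7, "a", "b", 1001, 80, "tcp", 50),
]
fine = aggregate(trace, window_s=1.0)
coarse = aggregate(trace, window_s=2.0, by_host_pair=True)
```

The same three packets yield three fine-grained flow records but a single coarse one, so the features a downstream classifier sees change with the chosen granularity.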


Becker, G.T.; Wild, A.; Guneysu, T., “Security Analysis of Index-Based Syndrome Coding for PUF-Based Key Generation,” in Hardware Oriented Security and Trust (HOST), 2015 IEEE International Symposium on, vol., no., pp. 20–25, 5–7 May 2015. doi:10.1109/HST.2015.7140230
Abstract: Physical Unclonable Functions (PUFs) as secure providers for cryptographic keys have gained significant research interest in recent years. Since plain PUF responses are typically unreliable, error-correcting mechanisms are employed to transform a fuzzy PUF response into a deterministic cryptographic key. In this context, Index-Based Syndrome Coding (IBS) has been reported as being provably secure in case of identical and independently distributed PUF responses and is therefore an interesting option to implement a highly secure key provider. In this paper we analyze the security of IBS in combination with a k-sum PUF as proposed at CHES 2011. Since for a k-sum PUF the assumption of identical and independently distributed responses does not hold, the notion of leaked bits was introduced at CHES 2011 to capture the security of such constructions. Based on a refined analysis using Hamming distance characterization and machine learning techniques, we show that the entropy of the key obtained is significantly lower than expected. More precisely, our findings show that even the construction from CHES with the highest security claims only achieves a bit entropy rate of 0.39.
Keywords: cryptography; fuzzy set theory; learning (artificial intelligence); CHES 2011; IBS; PUF-based key generation; cryptographic keys; deterministic cryptographic key; error-correcting mechanisms; fuzzy PUF response; hamming distance characterization; index-based syndrome coding; k-sum PUF; machine learning techniques; physical unclonable functions; Cost function; Decoding; Encoding; Entropy; Hamming distance; Measurement; Security; Error-Correction; Fuzzy Extractor; Index-Based Syndrome Coding; Physical Unclonable Functions; k-sum PUF (ID#: 15-6577)


Merat, S.; Almuhtadi, W., “Artificial Intelligence Application for Improving Cyber-Security Acquirement,” in Electrical and Computer Engineering (CCECE), 2015 IEEE 28th Canadian Conference on, vol., no., pp.1445–1450, 3–6 May 2015. doi:10.1109/CCECE.2015.7129493
Abstract: The main focus of this paper is the improvement of machine learning where a number of different types of computer processes can be mapped in a multitasking environment. A software mapping and modelling paradigm named SHOWAN is developed to learn and characterize the cyber awareness behaviour of a computer process against multiple concurrent threads. The examined process started to outperform, and tended to manage numerous tasks poorly, but it gradually learned to acquire and control tasks, in the context of anomaly detection. Finally, SHOWAN plots the abnormal activities of a manually projected task and compares them with the loading trends of other tasks within the group.
Keywords: learning (artificial intelligence); security of data; SHOWAN; anomaly detection; artificial intelligence application; computer process; concurrent threads; cyber awareness behaviour; cyber-security acquirement; machine learning; modelling paradigm; multitasking environment; software mapping; Artificial intelligence; Indexes; Instruction sets; Message systems; Routing; Security; Cyber Multitasking Performance; Cyber-Attack; Cyber-Security; Intrinsically locked; Non-maskable task; Normative Model; Queuing Management; Task Prioritization; synchronized thread (ID#: 15-6578)


Articles listed on these pages have been found on publicly available internet pages and are cited with links to those pages. Some of the information included herein has been reprinted with permission from the authors or data repositories. Direct any requests via Email to for removal of the links or modifications to specific citations. Please include the ID# of the specific citation in your correspondence.