Biblio

Found 275 results

Filters: Keyword is learning (artificial intelligence)
2019-12-09
Khokhlov, Igor, Jain, Chinmay, Miller-Jacobson, Ben, Heyman, Andrew, Reznik, Leonid, Jacques, Robert St..  2018.  MeetCI: A Computational Intelligence Software Design Automation Framework. 2018 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). :1-8.

Computational Intelligence (CI) algorithms/techniques are packaged in a variety of disparate frameworks/applications that all vary with respect to specific supported functionality and implementation decisions that drastically change performance. Developers looking to employ different CI techniques are faced with a series of trade-offs in selecting the appropriate library/framework. These include resource consumption, features, portability, interface complexity, ease of parallelization, etc. Considerations such as language compatibility and familiarity with a particular library make the choice of libraries even more difficult. The paper introduces MeetCI, an open source software framework for computational intelligence software design automation that facilitates the application design decisions and their software implementation process. MeetCI abstracts away specific framework details of CI techniques designed within a variety of libraries. This allows CI users to benefit from a variety of current frameworks without investigating the nuances of each library/framework. Using an XML file, developed in accordance with the specifications, the user can design a CI application generically, and utilize various CI software without having to redesign their entire technology stack. Switching between libraries in MeetCI is trivial and accessing the right library to satisfy a user's goals can be done easily and effectively. The paper discusses the framework's use in design of various applications. The design process is illustrated with four different examples from expert systems and machine learning domains, including the development of an expert system for security evaluation, two classification problems and a prediction problem with recurrent neural networks.

Li, Wenjuan, Cao, Jian, Hu, Keyong, Xu, Jie, Buyya, Rajkumar.  2019.  A Trust-Based Agent Learning Model for Service Composition in Mobile Cloud Computing Environments. IEEE Access. 7:34207–34226.
Mobile cloud computing is characterized by resource constraints, openness, and uncertainty, which lead to high uncertainty in its quality of service (QoS) provision and to serious security risks. Therefore, when faced with complex service requirements, an efficient and reliable service composition approach is extremely important. In addition, preference learning is also a key factor in improving user experience. To address these issues, this paper introduces a three-layered trust-enabled service composition model for mobile cloud computing systems. Based on the fuzzy comprehensive evaluation method, we design a novel and integrated trust management model. Service brokers are equipped with a learning module enabling them to better analyze customers' service preferences, especially in cases when the details of a service request are not totally disclosed. Because traditional methods cannot totally reflect the autonomous collaboration between the mobile cloud entities, a prototype system based on the multi-agent platform JADE is implemented to evaluate the efficiency of the proposed strategies. The experimental results show that our approach improves the transaction success rate and user satisfaction.
2019-12-05
Yu, Yiding, Wang, Taotao, Liew, Soung Chang.  2018.  Deep-Reinforcement Learning Multiple Access for Heterogeneous Wireless Networks. 2018 IEEE International Conference on Communications (ICC). :1-7.

This paper investigates the use of deep reinforcement learning (DRL) in the design of a "universal" MAC protocol referred to as Deep-reinforcement Learning Multiple Access (DLMA). The design framework is partially inspired by the vision of DARPA SC2, a 3-year competition whereby competitors are to come up with a clean-slate design that "best share spectrum with any network(s), in any environment, without prior knowledge, leveraging on machine-learning technique". While the scope of DARPA SC2 is broad and involves the redesign of PHY, MAC, and Network layers, this paper's focus is narrower and only involves the MAC design. In particular, we consider the problem of sharing time slots among multiple time-slotted networks that adopt different MAC protocols. One of the MAC protocols is DLMA. The other two are TDMA and ALOHA. The DRL agents of DLMA do not know that the other two MAC protocols are TDMA and ALOHA. Yet, by a series of observations of the environment, its own actions, and the rewards - in accordance with the DRL algorithmic framework - a DRL agent can learn the optimal MAC strategy for harmonious co-existence with TDMA and ALOHA nodes. In particular, the use of neural networks in DRL (as opposed to traditional reinforcement learning) allows for fast convergence to optimal solutions and robustness against perturbation in hyperparameter settings, two essential properties for practical deployment of DLMA in real wireless networks.

2019-12-02
Elfar, Mahmoud, Zhu, Haibei, Cummings, M. L., Pajic, Miroslav.  2019.  Security-Aware Synthesis of Human-UAV Protocols. 2019 International Conference on Robotics and Automation (ICRA). :8011–8017.
In this work, we synthesize collaboration protocols for human-unmanned aerial vehicle (H-UAV) command and control systems, where the human operator aids in securing the UAV by intermittently performing geolocation tasks to confirm its reported location. We first present a stochastic game-based model for the system that accounts for both the operator and an adversary capable of launching stealthy false-data injection attacks, causing the UAV to deviate from its path. We also describe a synthesis challenge due to the UAV's hidden-information constraint. Next, we perform human experiments using a developed RESCHU-SA testbed to recognize the geolocation strategies that operators adopt. Furthermore, we deploy machine learning techniques on the collected experimental data to predict the correctness of a geolocation task at a given location based on its geographical features. By representing the model as a delayed-action game and formalizing the system objectives, we utilize off-the-shelf model checkers to synthesize protocols for the human-UAV coalition that satisfy these objectives. Finally, we demonstrate the usefulness of the H-UAV protocol synthesis through a case study where the protocols are experimentally analyzed and further evaluated by human operators.
2019-11-25
Zuin, Gianlucca, Chaimowicz, Luiz, Veloso, Adriano.  2018.  Learning Transferable Features For Open-Domain Question Answering. 2018 International Joint Conference on Neural Networks (IJCNN). :1–8.

Corpora used to learn open-domain Question-Answering (QA) models are typically collected from a wide variety of topics or domains. Since QA requires understanding natural language, open-domain QA models generally need very large training corpora. A simple way to alleviate data demand is to restrict the domain covered by the QA model, leading thus to domain-specific QA models. While learning improved QA models for a specific domain is still challenging due to the lack of sufficient training data in the topic of interest, additional training data can be obtained from related topic domains. Thus, instead of learning a single open-domain QA model, we investigate domain adaptation approaches in order to create multiple improved domain-specific QA models. We demonstrate that this can be achieved by stratifying the source dataset, without the need of searching for complementary data unlike many other domain adaptation approaches. We propose a deep architecture that jointly exploits convolutional and recurrent networks for learning domain-specific features while transferring domain-shared features. That is, we use transferable features to enable model adaptation from multiple source domains. We consider different transference approaches designed to learn span-level and sentence-level QA models. We found that domain-adaptation greatly improves sentence-level QA performance, and span-level QA benefits from sentence information. Finally, we also show that a simple clustering algorithm may be employed when the topic domains are unknown and the resulting loss in accuracy is negligible.

2019-11-12
Ferenc, Rudolf, Hegedűs, Péter, Gyimesi, Péter, Antal, Gábor, Bán, Dénes, Gyimóthy, Tibor.  2019.  Challenging Machine Learning Algorithms in Predicting Vulnerable JavaScript Functions. 2019 IEEE/ACM 7th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE). :8-14.

The rapid rise of cyber-crime activities and the growing number of devices threatened by them place software security issues in the spotlight. As around 90% of all attacks exploit known types of security issues, finding vulnerable components and applying existing mitigation techniques is a viable practical approach for fighting against cyber-crime. In this paper, we investigate how the state-of-the-art machine learning techniques, including a popular deep learning algorithm, perform in predicting functions with possible security vulnerabilities in JavaScript programs. We applied 8 machine learning algorithms to build prediction models using a new dataset constructed for this research from the vulnerability information in public databases of the Node Security Project and the Snyk platform, and code fixing patches from GitHub. We used static source code metrics as predictors and an extensive grid-search algorithm to find the best performing models. We also examined the effect of various re-sampling strategies to handle the imbalanced nature of the dataset. The best performing algorithm was KNN, which created a model for the prediction of vulnerable functions with an F-measure of 0.76 (0.91 precision and 0.66 recall). Moreover, deep learning, tree and forest based classifiers, and SVM were competitive with F-measures over 0.70. Although the F-measures did not vary significantly with the re-sampling strategies, the distribution of precision and recall did change. No re-sampling seemed to produce models preferring high precision, while re-sampling strategies balanced the IR measures.
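
The F-measure quoted above is the harmonic mean of precision and recall. As a minimal sketch (the function name `f_measure` and the optional `beta` weight are illustrative assumptions, not taken from the paper), the reported precision and recall can be combined as follows:

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract for the KNN model.
knn_f1 = f_measure(0.91, 0.66)
print(f"{knn_f1:.4f}")
```

With the rounded inputs 0.91 and 0.66 this evaluates to roughly 0.765, which agrees with the reported 0.76 only up to rounding of the published precision and recall.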

2019-11-11
Kunihiro, Noboru, Lu, Wen-jie, Nishide, Takashi, Sakuma, Jun.  2018.  Outsourced Private Function Evaluation with Privacy Policy Enforcement. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :412–423.
We propose a novel framework for outsourced private function evaluation with privacy policy enforcement (OPFE-PPE). Suppose an evaluator evaluates a function with private data contributed by a data contributor, and a client obtains the result of the evaluation. OPFE-PPE enables a data contributor to enforce two different kinds of privacy policies to the process of function evaluation: evaluator policy and client policy. An evaluator policy restricts entities that can conduct function evaluation with the data. A client policy restricts entities that can obtain the result of function evaluation. We demonstrate our construction with three applications: personalized medication, genetic epidemiology, and prediction by machine learning. Experimental results show that the overhead caused by enforcing the two privacy policies is less than 10% compared to function evaluation by homomorphic encryption without any privacy policy enforcement.
2019-11-04
Li, Teng, Ma, Jianfeng, Pei, Qingqi, Shen, Yulong, Sun, Cong.  2018.  Anomalies Detection of Routers Based on Multiple Information Learning. 2018 International Conference on Networking and Network Applications (NaNA). :206-211.

Routers are important devices in the networks that carry the burden of transmitting information among the communication devices on the Internet. If a malicious adversary wants to intercept the information or paralyze the network, it can directly attack the routers and thereby achieve its malicious goals. Thus, protecting router security is of great importance. However, router systems are notoriously difficult to understand or diagnose because of their inaccessibility and heterogeneity. The common way of gaining access to the router system and detecting anomalous behaviors is to inspect the router syslogs or monitor the packets of information flowing to the routers. These approaches diagnose the routers from only one aspect and do not consider them from multiple views. In this paper, we propose an approach to detect the anomalies and faults of the routers with multiple information learning. We try to use the routers' information not from the developer's view but from the user's view, which does not need any expert knowledge. First, we do the offline learning to transform the benign or corrupted user actions into the syslogs. Then, we try to decide whether the input routers' conditions are poor or not with clustering. During the detection phase, we use the distance between the event and the cluster to decide whether it is an anomalous event, and we can provide the corresponding solutions. We have applied our approach in a university network which contains Cisco, Huawei and Dlink routers for three months. We aligned our experiment with former work as a baseline for comparison. Our approach achieves 89.6% accuracy in detecting the attacks, which is 5.1% higher than the former work. The results show that our approach runs with limited time and memory usage and has a high detection rate and a low false-positive rate.
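
The distance-to-cluster test described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the two-dimensional feature vectors, the threshold value, and the helper names `centroid`, `nearest_centroid_distance`, and `is_anomaly` are hypothetical, and the paper's actual features and clustering procedure are not given in the abstract.

```python
import math


def centroid(points):
    """Mean point of a non-empty list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def nearest_centroid_distance(event, centroids):
    """Euclidean distance from an event to the closest cluster centroid."""
    return min(math.dist(event, c) for c in centroids)


def is_anomaly(event, centroids, threshold):
    """Flag an event that lies farther than `threshold` from every cluster."""
    return nearest_centroid_distance(event, centroids) > threshold


# Hypothetical clusters learned offline from known router behaviour.
cluster_a = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1]]
cluster_b = [[5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
centroids = [centroid(cluster_a), centroid(cluster_b)]

print(is_anomaly([1.1, 1.0], centroids, threshold=1.0))
print(is_anomaly([3.0, 9.0], centroids, threshold=1.0))
```

In this toy setting the first event sits inside a known cluster and the second is far from both, so only the second is flagged. How the threshold is chosen is left open here.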

Ramachandran, Raji, Nidhin, R, Shogil, P P.  2018.  Anomaly Detection in Role Administered Relational Databases — A Novel Method. 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI). :1017–1021.
Considerable effort has lately been devoted to developing Database Management Systems (DBMS) that ensure high assurance and high security. Common database security measures such as access control, validation, and encryption technologies are not sufficient to secure the data from all threats. By using an anomaly detection system, we are able to enhance the security of the database management system. We assume that the database access control is role based. In this paper, a mechanism is proposed for finding anomalies in a database by using a machine learning technique such as classification. The importance of providing an anomaly detection technique to a Role-Based Access Control database is that it helps protect against insider attacks. The experimental results show that the system is able to detect intrusions effectively, with high accuracy and a high F1-score.
2019-10-28
Ocaña, Kary, Galheigo, Marcelo, Osthoff, Carla, Gadelha, Luiz, Gomes, Antônio Tadeu A., De Oliveira, Daniel, Porto, Fabio, Vasconcelos, Ana Tereza.  2019.  Towards a Science Gateway for Bioinformatics: Experiences in the Brazilian System of High Performance Computing. 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID). :638–647.
Science gateways bring out the possibility of reproducible science as they are integrated into reusable techniques, data and workflow management systems, security mechanisms, and high performance computing (HPC). We introduce BioinfoPortal, a science gateway that integrates a suite of different bioinformatics applications using HPC and data management resources provided by the Brazilian National HPC System (SINAPAD). BioinfoPortal follows the Software as a Service (SaaS) model and the web server is freely available for academic use. The goal of this paper is to describe the science gateway and its usage, addressing challenges of designing a multiuser computational platform for parallel/distributed executions of large-scale bioinformatics applications using the Brazilian HPC resources. We also present a study of performance and scalability of some bioinformatics applications executed in the HPC environments and perform machine learning analyses for predicting features for the HPC allocation/usage that could better perform the bioinformatics applications via BioinfoPortal.
2019-10-15
Pan, Y., He, F., Yu, H..  2018.  An Adaptive Method to Learn Directive Trust Strength for Trust-Aware Recommender Systems. 2018 IEEE 22nd International Conference on Computer Supported Cooperative Work in Design (CSCWD). :10–16.

Trust relationships have shown great potential to improve recommendation quality, especially for cold-start and sparse users. Since each user trusts their friends to different degrees, a number of works have been proposed that take trust strength into account for recommender systems. However, these methods ignore the information of trust directions between users. In this paper, we propose a novel method to adaptively learn directive trust strength to improve trust-aware recommender systems. Advancing previous works, we propose to establish the direction of trust strength by modeling the implicit relationships between users with roles of trusters and trustees. Specifically, with the new directed trust strength, how to compute the directive trust strength becomes a new challenge. Therefore, we present a novel method to adaptively learn directive trust strengths in a unified framework by constraining the trust strength to the range [0, 1] through a mapping function. Our experiments on the Epinions and Ciao datasets demonstrate that the proposed algorithm outperforms several state-of-the-art algorithms on both MAE and RMSE metrics.

Coleman, M. S., Doody, D. P., Shields, M. A..  2018.  Machine Learning for Real-Time Data-Driven Security Practices. 2018 29th Irish Signals and Systems Conference (ISSC). :1–6.

The risk of cyber-attacks exploiting vulnerable organisations has increased significantly over the past several years. These attacks may combine to exploit a vulnerability breach within a system's protection strategy, which has the potential for loss, damage or destruction of assets. Consequently, every vulnerability has an accompanying risk, which is defined as the "intersection of assets, threats, and vulnerabilities" [1]. This research project aims to experimentally compare the similarity-based ranking of cyber security information utilising a recommendation environment. The Memory-Based Collaborative Filtering technique was employed, specifically the User-Based and Item-Based approaches. These systems utilised information from the National Vulnerability Database, specifically for the identification and similarity-based ranking of cyber-security vulnerability information, relating to hardware and software applications. Experiments were performed using the Item-Based technique, to identify the optimum system parameters, evaluated through the AUC evaluation metric. Once identified, the Item-Based technique was compared with the User-Based technique which utilised the parameters identified from the previous experiments. During these experiments, the Pearson's Correlation Coefficient and the Cosine similarity measure were used. From these experiments, it was identified that the Item-Based technique employing the Cosine similarity measure achieved an AUC of 0.80225.
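
The two similarity measures named above, and an item-based similarity ranking built on them, might look like the following minimal sketch. The item names, the toy rating vectors, and the function `rank_similar_items` are hypothetical illustrations; the abstract does not describe the study's data layout or parameters.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson_correlation(u, v):
    """Pearson's r, computed as the cosine similarity of mean-centred vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_similarity([a - mean_u for a in u], [b - mean_v for b in v])


def rank_similar_items(item_vectors, target, similarity=cosine_similarity):
    """Item-based ranking: order every other item by similarity to `target`."""
    scores = [
        (name, similarity(item_vectors[target], vec))
        for name, vec in item_vectors.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical item columns: one score per user for each vulnerability record.
item_vectors = {
    "vuln_a": [5, 3, 0, 1],
    "vuln_b": [4, 3, 0, 1],
    "vuln_c": [0, 1, 5, 4],
}
ranked = rank_similar_items(item_vectors, "vuln_a")
print(ranked)
```

A user-based variant would apply the same similarity functions to rows (users) instead of columns (items).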

Panagiotakis, C., Papadakis, H., Fragopoulou, P..  2018.  Detection of Hurriedly Created Abnormal Profiles in Recommender Systems. 2018 International Conference on Intelligent Systems (IS). :499–506.

Recommender systems try to predict the preferences of users for specific items. These systems suffer from profile injection attacks, where the attackers have some prior knowledge of the system ratings and their goal is to promote or demote a particular item by introducing abnormal (anomalous) ratings. The detection of both cases is a challenging problem. In this paper, we propose a framework to spot anomalous rating profiles (outliers), where the outliers hurriedly create a profile that injects into the system either random ratings or specific ratings, without any prior knowledge of the existing ratings. The proposed detection method is based on the unpredictable behavior of the outliers in a validation set, on the user-item rating matrix and on the similarity between users. The proposed system is totally unsupervised, and in the last step it uses the k-means clustering method to automatically spot the spurious profiles. For the cases where labeled sample data is available, a random forest classifier is trained to show how supervised methods outperform unsupervised ones. Experimental results on the MovieLens 100k and the MovieLens 1M datasets demonstrate the high performance of the proposed schemata.

2019-10-07
Agrawal, R., Stokes, J. W., Selvaraj, K., Marinescu, M..  2019.  Attention in Recurrent Neural Networks for Ransomware Detection. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :3222–3226.

Ransomware, as a specialized form of malicious software, has recently emerged as a major threat in computer security. With an ability to lock out user access to their content, recent ransomware attacks have caused severe impact at an individual and organizational level. While research in malware detection can be adapted directly for ransomware, specific structural properties of ransomware can further improve the quality of detection. In this paper, we adapt the deep learning methods used in malware detection for detecting ransomware from emulation sequences. We present specialized recurrent neural networks for capturing local event patterns in ransomware sequences using the concept of attention mechanisms. We demonstrate the performance of enhanced LSTM models on a sequence dataset derived by the emulation of ransomware executables targeting the Windows environment.

2019-10-02
Hussein, A., Salman, O., Chehab, A., Elhajj, I., Kayssi, A..  2019.  Machine Learning for Network Resiliency and Consistency. 2019 Sixth International Conference on Software Defined Systems (SDS). :146–153.
Being able to describe a specific network as consistent is a large step towards resiliency. Next to the importance of security lies the necessity of consistency verification. Attackers are currently focusing on targeting small and crucial goals such as network configurations or flow tables. These types of attacks would defy the whole purpose of a security system when built on top of an inconsistent network. Advances in Artificial Intelligence (AI) are playing a key role in ensuring a fast response to the large number of evolving threats. Software Defined Networking (SDN), being centralized by design, offers a global overview of the network. Robustness and adaptability are part of a package offered by programmable networking, which drove us to consider the integration between both AI and SDN. The general goal of our series is to achieve an Artificial Intelligence Resiliency System (ARS). The aim of this paper is to propose a new AI-based consistency verification system, which will be part of ARS in our future work. The comparison of different deep learning architectures shows that Convolutional Neural Networks (CNN) give the best results with an accuracy of 99.39% on our dataset and 96% on our consistency test scenario.
2019-09-05
Panfili, M., Giuseppi, A., Fiaschetti, A., Al-Jibreen, H. B., Pietrabissa, A., Priscoli, F. Delli.  2018.  A Game-Theoretical Approach to Cyber-Security of Critical Infrastructures Based on Multi-Agent Reinforcement Learning. 2018 26th Mediterranean Conference on Control and Automation (MED). :460-465.

This paper presents a control strategy for Cyber-Physical System defense developed in the framework of the European Project ATENA, which concerns Critical Infrastructure (CI) protection. The aim of the controller is to find the optimal security configuration, in terms of countermeasures to implement, in order to address the system vulnerabilities. The attack/defense problem is modeled as a multi-agent general sum game, where the aim of the defender is to prevent the most damage possible by finding an optimal trade-off between prevention actions and their costs. The problem is solved utilizing Reinforcement Learning and simulation results provide a proof of the proposed concept, showing how the defender of the protected CI is able to minimize the damage caused by his/her opponents by finding the Nash equilibrium of the game in the zero-sum variant, and, in a more general scenario, by driving the attacker into a position where the damage she/he can cause to the infrastructure is lower than the cost she/he has to sustain to enforce her/his attack strategy.

Sun, Y., Zhang, L., Zhao, C..  2018.  A Study of Network Covert Channel Detection Based on Deep Learning. 2018 2nd IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC). :637-641.
Information security has become a growing concern. Computer covert channel which is regarded as an important area of information security research gets more attention. In order to detect these covert channels, a variety of detection algorithms are proposed in the course of the research. The algorithms of machine learning type show better results in these detection algorithms. However, the common machine learning algorithms have many problems in the testing process and have great limitations. Based on the deep learning algorithm, this paper proposes a new idea of network covert channel detection and forms a new detection model. On the one hand, this algorithmic model can detect more complex covert channels and, on the other hand, greatly improve the accuracy of detection due to the use of a new deep learning model. By optimizing this test model, we can get better results on the evaluation index.
Elsadig, M. A., Fadlalla, Y. A..  2018.  Packet Length Covert Channel: A Detection Scheme. 2018 1st International Conference on Computer Applications Information Security (ICCAIS). :1-7.
A covert channel is a communication channel that is exploited for illegal flow of information in a way that violates system security policies. It is a dangerous, invisible, undetectable, and developed security attack. Recently, the packet length covert channel has motivated many researchers as it is one of the most undetectable network covert channels. A packet length covert channel generates covert traffic that is very similar to normal traffic, which complicates the detection of this type of covert channel. This motivates us to introduce a machine learning based detection scheme. Recently, machine learning has proved its capability in many different fields, especially in the security field, as it usually yields reliable and realistic results. Based on our developed content and frequency-based features, the developed detection scheme has been fully trained and tested. Our detection scheme has gained an excellent degree of detection accuracy, which reaches 98% (zero false negative rate and 0.02 false positive rate).
2019-09-04
Vanjari, M. S. P., Balsaraf, M. K. P..  2018.  Efficient Exploration of Algorithm in Scholarly Big Data Document. 2018 International Conference on Information, Communication, Engineering and Technology (ICICET). :1–5.
Algorithms are developed, analyzed, and applied throughout the computer field and are used for developing new applications. They are used for finding solutions to problems under different conditions: problems are transformed into algorithmic ones to which standard algorithms are applied. The number of scholarly digital documents is increasing day by day. AlgorithmSeer is a search engine for algorithms whose main aim is to provide a large algorithm database. It automatically discovers and extracts algorithms from this large collection of documents, enabling algorithm indexing, searching, discovery, and analysis. A novel set of scalable techniques used by AlgorithmSeer to identify and extract algorithm representations from a large collection of scholarly documents is proposed. Along with this, users with different levels of knowledge can access the platform and highlight particularly important and relevant portions of the textual content. In support of lectures and self-learning, the highlighted documents can be shared with others. However, learners at different levels cannot use the highlighted parts of the text at the same level of understanding. We address the problem of predicting new highlights for partially highlighted documents.
2019-08-12
Eetha, S., Agrawal, S., Neelam, S..  2018.  Zynq FPGA Based System Design for Video Surveillance with Sobel Edge Detection. 2018 IEEE International Symposium on Smart Electronic Systems (iSES) (Formerly iNiS). :76–79.

Advancements in the semiconductor domain have made it possible to realize numerous applications in video surveillance using computer vision and deep learning. Video surveillance in industrial automation, security, ADAS, live traffic analysis, etc. improves efficiency through image understanding. Image understanding requires input data with high precision, which depends on image resolution and the location of the camera. The data of interest can be a thermal image or a live feed coming from various sensors. Composite (CVBS) is a popular video interface capable of streaming up to HD (1920x1080) quality. Unlike high-speed serial interfaces like HDMI/MIPI CSI, the analog composite video interface is a single-wire standard supporting longer distances. Image understanding requires edge detection and classification for further processing. The Sobel filter is one of the most used edge detection filters and can be embedded into a live stream. This paper proposes a Zynq FPGA based system design for video surveillance with Sobel edge detection, where the input composite video is decoded (analog CVBS input to YCbCr digital output), processed in HW, and streamed to an HDMI display while simultaneously being stored in SD memory for later processing. The HW design is scalable for resolutions from VGA to Full HD at 60fps and 4K at 24fps. The system is built on the Xilinx ZC702 platform and TVP5146 to showcase the functional path.
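
For readers unfamiliar with the Sobel operator, a software sketch of the filter follows. It is a plain Python illustration on a grayscale image stored as a list of rows, not the paper's FPGA design; the function name `sobel_magnitude`, the |Gx| + |Gy| magnitude approximation, and the zeroed border pixels are assumptions made for brevity.

```python
# Standard 3x3 Sobel kernels for horizontal (GX) and vertical (GY) gradients.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return |Gx| + |Gy| per pixel; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * pixel
                    gy += GY[dy][dx] * pixel
            out[y][x] = abs(gx) + abs(gy)
    return out


# A 4x4 test frame with a vertical edge between columns 1 and 2.
edge_frame = [[0, 0, 255, 255] for _ in range(4)]
flat_frame = [[128] * 4 for _ in range(4)]

edges = sobel_magnitude(edge_frame)
flat = sobel_magnitude(flat_frame)
print(edges[1][1], flat[1][1])  # 1020 0
```

The sum of absolute values avoids a square root, which is a common simplification when the filter is moved into hardware; whether the paper uses it is not stated in the abstract.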

Liu, Y., Yang, Y., Shi, A., Jigang, P., Haowei, L..  2019.  Intelligent monitoring of indoor surveillance video based on deep learning. 2019 21st International Conference on Advanced Communication Technology (ICACT). :648–653.

With the rapid development of information technology, video surveillance systems have become a key part of the security and protection system of modern cities. Especially in prisons, surveillance cameras can be found almost everywhere. However, with the continuous expansion of the surveillance network, surveillance cameras not only bring convenience, but also produce a massive amount of monitoring data, which poses huge challenges to storage, analytics and retrieval. The smart monitoring system equipped with intelligent video analytics technology can monitor as well as pre-alarm abnormal events or behaviours, which is a hot research direction in the field of surveillance. This paper combines deep learning methods, using the state-of-the-art framework for instance segmentation, called Mask R-CNN, to train the fine-tuning network on our datasets, which can efficiently detect objects in a video image while simultaneously generating a high-quality segmentation mask for each instance. The experiments show that our network is simple to train and easy to generalize to other datasets, and the mask average precision is nearly up to 98.5% on our own datasets.

2019-08-05
Kaiafas, G., Varisteas, G., Lagraa, S., State, R., Nguyen, C. D., Ries, T., Ourdane, M..  2018.  Detecting Malicious Authentication Events Trustfully. NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium. :1-6.

Anomaly detection on security logs is receiving more and more attention. Authentication events are an important component of security logs, and being able to produce trustful and accurate predictions minimizes the effort of cyber-experts to stop false attacks. Observed events are classified into Normal, for legitimate user behavior, and Malicious, for malevolent actions. These classes are consistently and excessively imbalanced, which makes the classification problem harder; in the commonly used Los Alamos dataset, the malicious class comprises only 0.00033% of the total. This work proposes a novel method to extract advanced composite features, and a supervised learning technique for classifying authentication logs trustfully; the models are Random Forest, LogitBoost, Logistic Regression, and ultimately Majority Voting, which leverages the predictions of the previous models and gives the final prediction for each authentication event. We measure the performance of our experiments using the False Negative Rate and False Positive Rate. Overall, we achieve a False Negative Rate of 0 (i.e. no attack was missed) and, on average, a False Positive Rate of 0.0019.
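
The majority-voting step and the two reported metrics can be illustrated with a small sketch. The data below is invented toy data, the three prediction lists merely stand in for the outputs of the three base models, and the function names `majority_vote` and `fnr_fpr` are hypothetical; none of this reproduces the paper's feature extraction or trained models.

```python
from typing import List, Sequence, Tuple


def majority_vote(model_predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model binary predictions (1 = Malicious) by strict majority."""
    n_models = len(model_predictions)
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*model_predictions)]


def fnr_fpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (False Negative Rate, False Positive Rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # missed attacks / all attacks
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # false alarms / all normal events
    return fnr, fpr


# Toy, heavily imbalanced ground truth: one malicious event out of eight.
y_true = [0, 0, 0, 0, 0, 0, 0, 1]
# Hypothetical outputs of three base models (placeholders for RF, LogitBoost, LR).
model_a = [0, 0, 1, 0, 0, 0, 0, 1]
model_b = [0, 0, 0, 0, 0, 1, 0, 1]
model_c = [0, 0, 1, 0, 0, 0, 0, 0]

combined = majority_vote([model_a, model_b, model_c])
fnr, fpr = fnr_fpr(y_true, combined)
```

In this toy case the vote keeps the one malicious event (FNR of 0) while removing the false alarm raised by only one model, which is the kind of trade-off the two metrics are meant to expose.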

2019-07-01
Clemente, C. J., Jaafar, F., Malik, Y..  2018.  Is Predicting Software Security Bugs Using Deep Learning Better Than the Traditional Machine Learning Algorithms? 2018 IEEE International Conference on Software Quality, Reliability and Security (QRS). :95–102.

Software insecurity is being identified as one of the leading causes of security breaches. In this paper, we revisited one of the strategies for addressing software insecurity, which is the use of software quality metrics. We utilized a multilayer deep feedforward network to examine whether there is a combination of metrics that can predict the appearance of security-related bugs. We also applied traditional machine learning algorithms such as decision tree, random forest, naïve Bayes, and support vector machines and compared the results with those of the Deep Learning technique. The results demonstrated that it was possible to develop an effective predictive model to forecast software insecurity based on the software metrics and using Deep Learning. All the models generated have shown an accuracy of more than sixty percent, with Deep Learning leading the list. This finding proved that Deep Learning methods and a combination of software metrics can be tapped to create a better forecasting model, thereby aiding software developers in predicting security bugs.

Perez, R. Lopez, Adamsky, F., Soua, R., Engel, T..  2018.  Machine Learning for Reliable Network Attack Detection in SCADA Systems. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :633–638.

Critical Infrastructures (CIs) use Supervisory Control And Data Acquisition (SCADA) systems for remote control and monitoring. Sophisticated security measures are needed to address malicious intrusions, which are steadily increasing in number and variety due to the massive spread of connectivity and standardisation of open SCADA protocols. Traditional Intrusion Detection Systems (IDSs) cannot detect attacks that are not already present in their databases. Therefore, in this paper, we assess Machine Learning (ML) for intrusion detection in SCADA systems using a real data set collected from a gas pipeline system and provided by the Mississippi State University (MSU). The contribution of this paper is two-fold: 1) the evaluation of four techniques for missing data estimation and two techniques for data normalization; 2) the assessment of the performance of Support Vector Machine (SVM) and Random Forest (RF) in terms of accuracy, precision, recall and F1 score for intrusion detection. Two cases are differentiated: binary and categorical classifications. Our experiments reveal that RF detects intrusions effectively, with an F1 score of > 99%.
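
For reference, the four evaluation measures named in this abstract can be computed from binary predictions as in the sketch below. The labels are invented toy values, not the MSU gas pipeline data, and the function name `binary_metrics` is hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 score for binary labels (1 = intrusion)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels: 3 true positives, 1 false negative, 1 false positive, 3 true negatives.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

For the categorical (multi-class) case the same counts would be taken per class and then averaged; the abstract does not say which averaging scheme the authors used.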

Amjad, N., Afzal, H., Amjad, M. F., Khan, F. A..  2018.  A Multi-Classifier Framework for Open Source Malware Forensics. 2018 IEEE 27th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE). :106-111.

Traditional anti-virus technologies have failed to keep pace with the proliferation of malware due to the slow process of updating their signatures and heuristics. Similarly, limited time and resources make it impractical to perform manual analysis on each malware sample. There is a need to learn from this vast quantity of data, containing cyber attack patterns, in an automated manner to proactively adapt to ever-evolving threats. Machine learning offers unique advantages in learning from past cyber attacks to handle future cyber threats. The purpose of this research is to propose a framework for multi-classification of malware into well-known categories by applying different machine learning models over a corpus of malware analysis reports. These reports are generated through an open source malware sandbox in an automated manner. We applied extensive pre-modeling techniques for data cleaning, feature exploration and feature engineering to prepare training and test datasets. The best possible hyper-parameters are selected to build the machine learning models. These prepared datasets are then used to train the machine learning classifiers and to compare their prediction accuracy. Finally, these results are validated through a comprehensive 10-fold cross-validation methodology. The best results are achieved through the Gaussian Naive Bayes classifier, with random accuracy of 96% and 10-fold cross-validation accuracy of 91.2%. The said framework can be deployed in an operational environment to learn from malware attacks and proactively adapt matching countermeasures.
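
The 10-fold cross-validation procedure referred to above can be sketched as follows. This is a generic illustration under stated assumptions: the folds are contiguous and unshuffled, the helper names `k_fold_indices`, `cross_val_accuracy` and `majority_class` are hypothetical, and the placeholder model simply predicts the most frequent training label rather than being the Gaussian Naive Bayes classifier from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Fold = Tuple[List[int], List[int]]


def k_fold_indices(n_samples: int, k: int = 10) -> List[Fold]:
    """Split range(n_samples) into k contiguous (train, test) index folds."""
    # Distribute any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds, start = [], 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


def cross_val_accuracy(X: Sequence, y: Sequence[int],
                       fit_predict: Callable, k: int = 10) -> float:
    """Mean test accuracy over k folds.

    `fit_predict(X_train, y_train, X_test)` must return predicted labels.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        preds = fit_predict([X[i] for i in train_idx],
                            [y[i] for i in train_idx],
                            [X[i] for i in test_idx])
        correct = sum(1 for p, i in zip(preds, test_idx) if p == y[i])
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


def majority_class(X_train, y_train, X_test):
    """Placeholder model: always predict the most frequent training label."""
    most_common = max(set(y_train), key=y_train.count)
    return [most_common] * len(X_test)


# Toy dataset: 20 samples, 14 of class 0 followed by 6 of class 1.
X = [[float(i)] for i in range(20)]
y = [0] * 14 + [1] * 6
mean_acc = cross_val_accuracy(X, y, majority_class, k=10)
```

Because the folds here are unshuffled, the folds containing only class 1 score zero; in practice the data would normally be shuffled or stratified before splitting, which this sketch omits for brevity.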