Visible to the public Biblio

Found 744 results

Filters: Keyword is machine learning  [Clear All Filters]
Ramapatruni, S., Narayanan, S. N., Mittal, S., Joshi, A., Joshi, K..  2019.  Anomaly Detection Models for Smart Home Security. 2019 IEEE 5th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :19–24.
Recent years have seen significant growth in the adoption of smart homes devices. These devices provide convenience, security, and energy efficiency to users. For example, smart security cameras can detect unauthorized movements, and smoke sensors can detect potential fire accidents. However, many recent examples have shown that they open up a new cyber threat surface. There have been several recent examples of smart devices being hacked for privacy violations and also misused so as to perform DDoS attacks. In this paper, we explore the application of big data and machine learning to identify anomalous activities that can occur in a smart home environment. A Hidden Markov Model (HMM) is trained on network level sensor data, created from a test bed with multiple sensors and smart devices. The generated HMM model is shown to achieve an accuracy of 97% in identifying potential anomalies that indicate attacks. We present our approach to build this model and compare with other techniques available in the literature.
Awaysheh, F., Cabaleiro, J. C., Pena, T. F., Alazab, M..  2019.  Big Data Security Frameworks Meet the Intelligent Transportation Systems Trust Challenges. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :807–813.
Many technological cases exploiting data science have been realized in recent years; machine learning, Internet of Things, and stream data processing are examples of this trend. Other advanced applications have focused on capturing the value from streaming data of different objects of transport and traffic management in an Intelligent Transportation System (ITS). In this context, security control and trust level play a decisive role in the sustainable adoption of this trend. However, conceptual work integrating the security approaches of different disciplines into one coherent reference architecture is limited. The contribution of this paper is a reference architecture for ITS security (called SITS). In addition, a classification of Big Data technologies, products, and services to address the ITS trust challenges is presented. We also proposed a novel multi-tier ITS security framework for validating the usability of SITS with business intelligence development in the enterprise domain.
Benzekri, A., Laborde, R., Oglaza, A., Rammal, D., Barrere, F..  2019.  Dynamic security management driven by situations: An exploratory analysis of logs for the identification of security situations. 2019 3rd Cyber Security in Networking Conference (CSNet). :66—72.
Situation awareness consists of "the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future". Being aware of the security situation is then mandatory to launch proper security reactions in response to cybersecurity attacks. Security Incident and Event Management solutions are deployed within Security Operation Centers. Some vendors propose machine learning based approaches to detect intrusions by analysing networks behaviours. But cyberattacks like Wannacry and NotPetya, which shut down hundreds of thousands of computers, demonstrated that networks monitoring and surveillance solutions remain insufficient. Detecting these complex attacks (a.k.a. Advanced Persistent Threats) requires security administrators to retain a large number of logs just in case problems are detected and involve the investigation of past security events. This approach generates massive data that have to be analysed at the right time in order to detect any accidental or caused incident. In the same time, security administrators are not yet seasoned to such a task and lack the desired skills in data science. As a consequence, a large amount of data is available and still remains unexplored which leaves number of indicators of compromise under the radar. Building on the concept of situation awareness, we developed a situation-driven framework, called dynSMAUG, for dynamic security management. This approach simplifies the security management of dynamic systems and allows the specification of security policies at a high-level of abstraction (close to security requirements). This invited paper aims at exposing real security situations elicitation, coming from networks security experts, and showing the results of exploratory analysis techniques using complex event processing techniques to identify and extract security situations from a large volume of logs. The results contributed to the extension of the dynSMAUG solution.
Efstathopoulos, G., Grammatikis, P. R., Sarigiannidis, P., Argyriou, V., Sarigiannidis, A., Stamatakis, K., Angelopoulos, M. K., Athanasopoulos, S. K..  2019.  Operational Data Based Intrusion Detection System for Smart Grid. 2019 IEEE 24th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD). :1—6.

With the rapid progression of Information and Communication Technology (ICT) and especially of Internet of Things (IoT), the conventional electrical grid is transformed into a new intelligent paradigm, known as Smart Grid (SG). SG provides significant benefits both for utility companies and energy consumers such as the two-way communication (both electricity and information), distributed generation, remote monitoring, self-healing and pervasive control. However, at the same time, this dependence introduces new security challenges, since SG inherits the vulnerabilities of multiple heterogeneous, co-existing legacy and smart technologies, such as IoT and Industrial Control Systems (ICS). An effective countermeasure against the various cyberthreats in SG is the Intrusion Detection System (IDS), informing the operator timely about the possible cyberattacks and anomalies. In this paper, we provide an anomaly-based IDS especially designed for SG utilising operational data from a real power plant. In particular, many machine learning and deep learning models were deployed, introducing novel parameters and feature representations in a comparative study. The evaluation analysis demonstrated the efficacy of the proposed IDS and the improvement due to the suggested complex data representation.

Prasad, G., Huo, Y., Lampe, L., Leung, V. C. M..  2019.  Machine Learning Based Physical-Layer Intrusion Detection and Location for the Smart Grid. 2019 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). :1—6.
Security and privacy of smart grid communication data is crucial given the nature of the continuous bidirectional information exchange between the consumer and the utilities. Data security has conventionally been ensured using cryptographic techniques implemented at the upper layers of the network stack. However, it has been shown that security can be further enhanced using physical layer (PHY) methods. To aid and/or complement such PHY and upper layer techniques, in this paper, we propose a PHY design that can detect and locate not only an active intruder but also a passive eavesdropper in the network. Our method can either be used as a stand-alone solution or together with existing techniques to achieve improved smart grid data security. Our machine learning based solution intelligently and automatically detects and locates a possible intruder in the network by reusing power line transmission modems installed in the grid for communication purposes. Simulation results show that our cost-efficient design provides near ideal intruder detection rates and also estimates its location with a high degree of accuracy.
Roy, D. D., Shin, D..  2019.  Network Intrusion Detection in Smart Grids for Imbalanced Attack Types Using Machine Learning Models. 2019 International Conference on Information and Communication Technology Convergence (ICTC). :576—581.
Smart grid has evolved as the next generation power grid paradigm which enables the transfer of real time information between the utility company and the consumer via smart meter and advanced metering infrastructure (AMI). These information facilitate many services for both, such as automatic meter reading, demand side management, and time-of-use (TOU) pricing. However, there have been growing security and privacy concerns over smart grid systems, which are built with both smart and legacy information and operational technologies. Intrusion detection is a critical security service for smart grid systems, alerting the system operator for the presence of ongoing attacks. Hence, there has been lots of research conducted on intrusion detection in the past, especially anomaly-based intrusion detection. Problems emerge when common approaches of pattern recognition are used for imbalanced data which represent much more data instances belonging to normal behaviors than to attack ones, and these approaches cause low detection rates for minority classes. In this paper, we study various machine learning models to overcome this drawback by using CIC-IDS2018 dataset [1].
Goyal, Y., Sharma, A..  2019.  A Semantic Machine Learning Approach for Cyber Security Monitoring. 2019 3rd International Conference on Computing Methodologies and Communication (ICCMC). :439—442.
Security refers to precautions designed to shield the availability and integrity of information exchanged among the digital global community. Information safety measure typically protects the virtual facts from unauthorized sources to get a right of entry to, disclosure, manipulation, alteration or destruction on both hardware and software technologies. According to an evaluation through experts operating in the place of information safety, some of the new cyber-attacks are keep on emerging in all the business processes. As a stop result of the analyses done, it's been determined that although the level of risk is not excessive in maximum of the attacks, it's far a severe risk for important data and the severity of those attacks is prolonged. Prior safety structures has been established to monitor various cyber-threats, predominantly using a gadget processed data or alerts for showing each deterministic and stochastic styles. The principal finding for deterministic patterns in cyber- attacks is that they're neither unbiased nor random over the years. Consequently, the quantity of assaults in the past helps to monitor the range of destiny attacks. The deterministic styles can often be leveraged to generate moderately correct monitoring.
Poltronieri, F., Sadler, L., Benincasa, G., Gregory, T., Harrell, J. M., Metu, S., Moulton, C..  2018.  Enabling Efficient and Interoperable Control of IoBT Devices in a Multi-Force Environment. MILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM). :757—762.

Efficient application of Internet of Battlefield Things (IoBT) technology on the battlefield calls for innovative solutions to control and manage the deluge of heterogeneous IoBT devices. This paper presents an innovative paradigm to address heterogeneity in controlling IoBT and IoT devices, enabling multi-force cooperation in challenging battlefield scenarios.

Agadakos, I., Ciocarlie, G. F., Copos, B., George, J., Leslie, N., Michaelis, J..  2019.  Security for Resilient IoBT Systems: Emerging Research Directions. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). :1—6.

Continued advances in IoT technology have prompted new investigation into its usage for military operations, both to augment and complement existing military sensing assets and support next-generation artificial intelligence and machine learning systems. Under the emerging Internet of Battlefield Things (IoBT) paradigm, a multitude of operational conditions (e.g., diverse asset ownership, degraded networking infrastructure, adversary activities) necessitate the development of novel security techniques, centered on establishment of trust for individual assets and supporting resilience of broader systems. To advance current IoBT efforts, a set of research directions are proposed that aim to fundamentally address the issues of trust and trustworthiness in contested battlefield environments, building on prior research in the cybersecurity domain. These research directions focus on two themes: (1) Supporting trust assessment for known/unknown IoT assets; (2) Ensuring continued trust of known IoBT assets and systems.

Agadakos, I., Ciocarlie, G. F., Copos, B., Emmi, M., George, J., Leslie, N., Michaelis, J..  2019.  Application of Trust Assessment Techniques to IoBT Systems. MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM). :833—840.

Continued advances in IoT technology have prompted new investigation into its usage for military operations, both to augment and complement existing military sensing assets and support next-generation artificial intelligence and machine learning systems. Under the emerging Internet of Battlefield Things (IoBT) paradigm, current operational conditions necessitate the development of novel security techniques, centered on establishment of trust for individual assets and supporting resilience of broader systems. To advance current IoBT efforts, a collection of prior-developed cybersecurity techniques is reviewed for applicability to conditions presented by IoBT operational environments (e.g., diverse asset ownership, degraded networking infrastructure, adversary activities) through use of supporting case study examples. The research techniques covered focus on two themes: (1) Supporting trust assessment for known/unknown IoT assets; (2) ensuring continued trust of known IoT assets and IoBT systems.

Su, H., Halak, B., Zwolinski, M..  2019.  Two-Stage Architectures for Resilient Lightweight PUFs. 2019 IEEE 4th International Verification and Security Workshop (IVSW). :19–24.
The following topics are dealt with: Internet of Things; invasive software; security of data; program testing; reverse engineering; product codes; binary codes; decoding; maximum likelihood decoding; field programmable gate arrays.
Wheelus, C., Bou-Harb, E., Zhu, X..  2018.  Tackling Class Imbalance in Cyber Security Datasets. 2018 IEEE International Conference on Information Reuse and Integration (IRI). :229–232.
It is clear that cyber-attacks are a danger that must be addressed with great resolve, as they threaten the information infrastructure upon which we all depend. Many studies have been published expressing varying levels of success with machine learning approaches to combating cyber-attacks, but many modern studies still focus on training and evaluating with very outdated datasets containing old attacks that are no longer a threat, and also lack data on new attacks. Recent datasets like UNSW-NB15 and SANTA have been produced to address this problem. Even so, these modern datasets suffer from class imbalance, which reduces the efficacy of predictive models trained using these datasets. Herein we evaluate several pre-processing methods for addressing the class imbalance problem; using several of the most popular machine learning algorithms and a variant of UNSW-NB15 based upon the attributes from the SANTA dataset.
Apruzzese, G., Colajanni, M., Ferretti, L., Marchetti, M..  2019.  Addressing Adversarial Attacks Against Security Systems Based on Machine Learning. 2019 11th International Conference on Cyber Conflict (CyCon). 900:1—18.

Machine-learning solutions are successfully adopted in multiple contexts but the application of these techniques to the cyber security domain is complex and still immature. Among the many open issues that affect security systems based on machine learning, we concentrate on adversarial attacks that aim to affect the detection and prediction capabilities of machine-learning models. We consider realistic types of poisoning and evasion attacks targeting security solutions devoted to malware, spam and network intrusion detection. We explore the possible damages that an attacker can cause to a cyber detector and present some existing and original defensive techniques in the context of intrusion detection systems. This paper contains several performance evaluations that are based on extensive experiments using large traffic datasets. The results highlight that modern adversarial attacks are highly effective against machine-learning classifiers for cyber detection, and that existing solutions require improvements in several directions. The paper paves the way for more robust machine-learning-based techniques that can be integrated into cyber security platforms.

Khalid, F., Hanif, M. A., Rehman, S., Ahmed, R., Shafique, M..  2019.  TrISec: Training Data-Unaware Imperceptible Security Attacks on Deep Neural Networks. 2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS). :188—193.

Most of the data manipulation attacks on deep neural networks (DNNs) during the training stage introduce a perceptible noise that can be catered by preprocessing during inference, or can be identified during the validation phase. There-fore, data poisoning attacks during inference (e.g., adversarial attacks) are becoming more popular. However, many of them do not consider the imperceptibility factor in their optimization algorithms, and can be detected by correlation and structural similarity analysis, or noticeable (e.g., by humans) in multi-level security system. Moreover, majority of the inference attack rely on some knowledge about the training dataset. In this paper, we propose a novel methodology which automatically generates imperceptible attack images by using the back-propagation algorithm on pre-trained DNNs, without requiring any information about the training dataset (i.e., completely training data-unaware). We present a case study on traffic sign detection using the VGGNet trained on the German Traffic Sign Recognition Benchmarks dataset in an autonomous driving use case. Our results demonstrate that the generated attack images successfully perform misclassification while remaining imperceptible in both “subjective” and “objective” quality tests.

Jin, Y., Tomoishi, M., Matsuura, S..  2019.  A Detection Method Against DNS Cache Poisoning Attacks Using Machine Learning Techniques: Work in Progress. 2019 IEEE 18th International Symposium on Network Computing and Applications (NCA). :1—3.

DNS based domain name resolution has been known as one of the most fundamental Internet services. In the meanwhile, DNS cache poisoning attacks also have become a critical threat in the cyber world. In addition to Kaminsky attacks, the falsified data from the compromised authoritative DNS servers also have become the threats nowadays. Several solutions have been proposed in order to prevent DNS cache poisoning attacks in the literature for the former case such as DNSSEC (DNS Security Extensions), however no effective solutions have been proposed for the later case. Moreover, due to the performance issue and significant workload increase on DNS cache servers, DNSSEC has not been deployed widely yet. In this work, we propose an advanced detection method against DNS cache poisoning attacks using machine learning techniques. In the proposed method, in addition to the basic 5-tuple information of a DNS packet, we intend to add a lot of special features extracted based on the standard DNS protocols as well as the heuristic aspects such as “time related features”, “GeoIP related features” and “trigger of cached DNS data”, etc., in order to identify the DNS response packets used for cache poisoning attacks especially those from compromised authoritative DNS servers. In this paper, as a work in progress, we describe the basic idea and concept of our proposed method as well as the intended network topology of the experimental environment while the prototype implementation, training data preparation and model creation as well as the evaluations will belong to the future work.

Chong, T., Anu, V., Sultana, K. Z..  2019.  Using Software Metrics for Predicting Vulnerable Code-Components: A Study on Java and Python Open Source Projects. 2019 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC). :98–103.

Software vulnerabilities often remain hidden until an attacker exploits the weak/insecure code. Therefore, testing the software from a vulnerability discovery perspective becomes challenging for developers if they do not inspect their code thoroughly (which is time-consuming). We propose that vulnerability prediction using certain software metrics can support the testing process by identifying vulnerable code-components (e.g., functions, classes, etc.). Once a code-component is predicted as vulnerable, the developers can focus their testing efforts on it, thereby avoiding the time/effort required for testing the entire application. The current paper presents a study that compares how software metrics perform as vulnerability predictors for software projects developed in two different languages (Java vs Python). The goal of this research is to analyze the vulnerability prediction performance of software metrics for different programming languages. We designed and conducted experiments on security vulnerabilities reported for three Java projects (Apache Tomcat 6, Tomcat 7, Apache CXF) and two Python projects (Django and Keystone). In this paper, we focus on a specific type of code component: Functions. We apply Machine Learning models for predicting vulnerable functions. Overall results show that software metrics-based vulnerability prediction is more useful for Java projects than Python projects (i.e., software metrics when used as features were able to predict Java vulnerable functions with a higher recall and precision compared to Python vulnerable functions prediction).

Zhong, J., Yang, C..  2019.  A Compositionality Assembled Model for Learning and Recognizing Emotion from Bodily Expression. 2019 IEEE 4th International Conference on Advanced Robotics and Mechatronics (ICARM). :821–826.
When we are express our internal status, such as emotions, the human body expression we use follows the compositionality principle. It is a theory in linguistic which proposes that the single components of the bodily presentation as well as the rules used to combine them are the major parts to finish this process. In this paper, such principle is applied to the process of expressing and recognizing emotional states through body expression, in which certain key features can be learned to represent certain primitives of the internal emotional state in the form of basic variables. This is done by a hierarchical recurrent neural learning framework (RNN) because of its nonlinear dynamic bifurcation, so that variables can be learned to represent different hierarchies. In addition, we applied some adaptive learning techniques in machine learning for the requirement of real-time emotion recognition, in which a stable representation can be maintained compared to previous work. The model is examined by comparing the PB values between the training and recognition phases. This hierarchical model shows the rationality of the compositionality hypothesis by the RNN learning and explains how key features can be used and combined in bodily expression to show the emotional state.
Zhao, Xinghan, Gao, Xiangfei.  2018.  An AI Software Test Method Based on Scene Deductive Approach. 2018 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C). :14—20.
Artificial intelligence (AI) software has high algorithm complexity, and the scale and dimension of the input and output parameters are high, and the test oracle isn't explicit. These features make a lot of difficulties for the design of test cases. This paper proposes an AI software testing method based on scene deductive approach. It models the input, output parameters and the environment, uses the random algorithm to generate the inputs of the test cases, then use the algorithm of deductive approach to make the software testing automatically, and use the test assertions to verify the results of the test. After description of the theory, this paper uses intelligent tracking car as an example to illustrate the application of this method and the problems needing attention. In the end, the paper describes the shortcoming of this method and the future research directions.
Basu, Kanad, Elnaggar, Rana, Chakrabarty, Krishnendu, Karri, Ramesh.  2019.  PREEMPT: PReempting Malware by Examining Embedded Processor Traces. 2019 56th ACM/IEEE Design Automation Conference (DAC). :1—6.

Anti-virus software (AVS) tools are used to detect Malware in a system. However, software-based AVS are vulnerable to attacks. A malicious entity can exploit these vulnerabilities to subvert the AVS. Recently, hardware components such as Hardware Performance Counters (HPC) have been used for Malware detection. In this paper, we propose PREEMPT, a zero overhead, high-accuracy and low-latency technique to detect Malware by re-purposing the embedded trace buffer (ETB), a debug hardware component available in most modern processors. The ETB is used for post-silicon validation and debug and allows us to control and monitor the internal activities of a chip, beyond what is provided by the Input/Output pins. PREEMPT combines these hardware-level observations with machine learning-based classifiers to preempt Malware before it can cause damage. There are many benefits of re-using the ETB for Malware detection. It is difficult to hack into hardware compared to software, and hence, PREEMPT is more robust against attacks than AVS. PREEMPT does not incur performance penalties. Finally, PREEMPT has a high True Positive value of 94% and maintains a low False Positive value of 2%.

Zhang, Jiliang, Qu, Gang.  2020.  Physical Unclonable Function-Based Key Sharing via Machine Learning for IoT Security. IEEE Transactions on Industrial Electronics. 67:7025—7033.

In many industry Internet of Things applications, resources like CPU, memory, and battery power are limited and cannot afford the classic cryptographic security solutions. Silicon physical unclonable function (PUF) is a lightweight security primitive that exploits manufacturing variations during the chip fabrication process for key generation and/or device authentication. However, traditional weak PUFs such as ring oscillator (RO) PUF generate chip-unique key for each device, which restricts their application in security protocols where the same key is required to be shared in resource-constrained devices. In this article, in order to address this issue, we propose a PUF-based key sharing method for the first time. The basic idea is to implement one-to-one input-output mapping with lookup table (LUT)-based interstage crossing structures in each level of inverters of RO PUF. Individual customization on configuration bits of interstage crossing structure and different RO selections with challenges bring high flexibility. Therefore, with the flexible configuration of interstage crossing structures and challenges, crossover RO PUF can generate the same shared key for resource-constrained devices, which enables a new application for lightweight key sharing protocols.

Vi, Bao Ngoc, Noi Nguyen, Huu, Nguyen, Ngoc Tran, Truong Tran, Cao.  2019.  Adversarial Examples Against Image-based Malware Classification Systems. 2019 11th International Conference on Knowledge and Systems Engineering (KSE). :1—5.

Malicious software, known as malware, has become urgently serious threat for computer security, so automatic mal-ware classification techniques have received increasing attention. In recent years, deep learning (DL) techniques for computer vision have been successfully applied for malware classification by visualizing malware files and then using DL to classify visualized images. Although DL-based classification systems have been proven to be much more accurate than conventional ones, these systems have been shown to be vulnerable to adversarial attacks. However, there has been little research to consider the danger of adversarial attacks to visualized image-based malware classification systems. This paper proposes an adversarial attack method based on the gradient to attack image-based malware classification systems by introducing perturbations on resource section of PE files. The experimental results on the Malimg dataset show that by a small interference, the proposed method can achieve success attack rate when challenging convolutional neural network malware classifiers.

Wei, Qu, Xiao, Shi, Dongbao, Li.  2019.  Malware Classification System Based on Machine Learning. 2019 Chinese Control And Decision Conference (CCDC). :647—652.

The main challenge for malware researchers is the large amount of data and files that need to be evaluated for potential threats. Researchers analyze a large number of new malware daily and classify them in order to extract common features. Therefore, a system that can ensure and improve the efficiency and accuracy of the classification is of great significance for the study of malware characteristics. A high-performance, high-efficiency automatic classification system based on multi-feature selection fusion of machine learning is proposed in this paper. Its performance and efficiency, according to our experiments, have been greatly improved compared to single-featured systems.

Mahajan, Ginika, Saini, Bhavna, Anand, Shivam.  2019.  Malware Classification Using Machine Learning Algorithms and Tools. 2019 Second International Conference on Advanced Computational and Communication Paradigms (ICACCP). :1—8.

Malware classification is the process of categorizing the families of malware on the basis of their signatures. This work focuses on classifying the emerging malwares on the basis of comparable features of similar malwares. This paper proposes a novel framework that categorizes malware samples into their families and can identify new malware samples for analysis. For this six diverse classification techniques of machine learning are used. To get more comparative and thus accurate classification results, analysis is done using two different tools, named as Knime and Orange. The work proposed can help in identifying and thus cleaning new malwares and classifying malware into their families. The correctness of family classification of malwares is investigated in terms of confusion matrix, accuracy and Cohen's Kappa. After evaluation it is analyzed that Random Forest gives the highest accuracy.

Tran, Trung Kien, Sato, Hiroshi, Kubo, Masao.  2019.  Image-Based Unknown Malware Classification with Few-Shot Learning Models. 2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW). :401—407.

Knowing malware types in every malware attacks is very helpful to the administrators to have proper defense policies for their system. It must be a massive benefit for the organization as well as the social if the automatic protection systems could themselves detect, classify an existence of new malware types in the whole network system with a few malware samples. This feature helps to prevent the spreading of malware as soon as any damage is caused to the networks. An approach introduced in this paper takes advantage of One-shot/few-shot learning algorithms in solving the malware classification problems by using some well-known models such as Matching Networks, Prototypical Networks. To demonstrate an efficiency of the approach, we run the experiments on the two malware datasets (namely, MalImg and Microsoft Malware Classification Challenge), and both experiments all give us very high accuracies. We confirm that if applying models correctly from the machine learning area could bring excellent performance compared to the other traditional methods, open a new area of malware research.

Sethi, Kamalakanta, Kumar, Rahul, Sethi, Lingaraj, Bera, Padmalochan, Patra, Prashanta Kumar.  2019.  A Novel Machine Learning Based Malware Detection and Classification Framework. 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security). :1–4.
As time progresses, new and complex malware types are being generated which causes a serious threat to computer systems. Due to this drastic increase in the number of malware samples, the signature-based malware detection techniques cannot provide accurate results. Different studies have demonstrated the proficiency of machine learning for the detection and classification of malware files. Further, the accuracy of these machine learning models can be improved by using feature selection algorithms to select the most essential features and reducing the size of the dataset which leads to lesser computations. In this paper, we have developed a machine learning based malware analysis framework for efficient and accurate malware detection and classification. We used Cuckoo sandbox for dynamic analysis which executes malware in an isolated environment and generates an analysis report based on the system activities during execution. Further, we propose a feature extraction and selection module which extracts features from the report and selects the most important features for ensuring high accuracy at minimum computation cost. Then, we employ different machine learning algorithms for accurate detection and fine-grained classification. Experimental results show that we got high detection and classification accuracy in comparison to the state-of-the-art approaches.