Visible to the public Biblio

Filters: Keyword is malware detection  [Clear All Filters]
2021-09-21
Choudhary, Sunita, Sharma, Anand.  2020.  Malware Detection Amp; Classification Using Machine Learning. 2020 International Conference on Emerging Trends in Communication, Control and Computing (ICONC3). :1–4.
With fast turn of events and development of the web, malware is one of major digital dangers nowadays. Henceforth, malware detection is an important factor in the security of computer systems. Nowadays, attackers generally design polymeric malware [1], it is usually a type of malware [2] that continuously changes its recognizable feature to fool detection techniques that uses typical signature based methods [3]. That is why the need for Machine Learning based detection arises. In this work, we are going to obtain behavioral-pattern that may be achieved through static or dynamic analysis, afterward we can apply dissimilar ML techniques to identify whether it's malware or not. Behavioral based Detection methods [4] will be discussed to take advantage from ML algorithms so as to frame social-based malware recognition and classification model.
Li, Mingxuan, Lv, Shichao, Shi, Zhiqiang.  2020.  Malware Detection for Industrial Internet Based on GAN. 2020 IEEE International Conference on Information Technology,Big Data and Artificial Intelligence (ICIBA). 1:475–481.
This thesis focuses on the detection of malware in industrial Internet. The basic flow of the detection of malware contains feature extraction and sample identification. API graph can effectively represent the behavior information of malware. However, due to the high algorithm complexity of solving the problem of subgraph isomorphism, the efficiency of analysis based on graph structure feature is low. Due to the different scales of API graph of different malicious codes, the API graph needs to be normalized. Considering the difficulties of sample collection and manual marking, it is necessary to expand the number of malware samples in industrial Internet. This paper proposes a method that combines PageRank with TF-IDF to process the API graph. Besides, this paper proposes a method to construct the adversarial samples of malwares based on GAN.
Chai, Yuhan, Qiu, Jing, Su, Shen, Zhu, Chunsheng, Yin, Lihua, Tian, Zhihong.  2020.  LGMal: A Joint Framework Based on Local and Global Features for Malware Detection. 2020 International Wireless Communications and Mobile Computing (IWCMC). :463–468.
With the gradual advancement of smart city construction, various information systems have been widely used in smart cities. In order to obtain huge economic benefits, criminals frequently invade the information system, which leads to the increase of malware. Malware attacks not only seriously infringe on the legitimate rights and interests of users, but also cause huge economic losses. Signature-based malware detection algorithms can only detect known malware, and are susceptible to evasion techniques such as binary obfuscation. Behavior-based malware detection methods can solve this problem well. Although there are some malware behavior analysis works, they may ignore semantic information in the malware API call sequence. In this paper, we design a joint framework based on local and global features for malware detection to solve the problem of network security of smart cities, called LGMal, which combines the stacked convolutional neural network and graph convolutional networks. Specially, the stacked convolutional neural network is used to learn API call sequence information to capture local semantic features and the graph convolutional networks is used to learn API call semantic graph structure information to capture global semantic features. Experiments on Alibaba Cloud Security Malware Detection datasets show that the joint framework gets better results. The experimental results show that the precision is 87.76%, the recall is 88.08%, and the F1-measure is 87.79%. We hope this paper can provide a useful way for malware detection and protect the network security of smart city.
Jin, Xiang, Xing, Xiaofei, Elahi, Haroon, Wang, Guojun, Jiang, Hai.  2020.  A Malware Detection Approach Using Malware Images and Autoencoders. 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS). :1–6.
Most machine learning-based malware detection systems use various supervised learning methods to classify different instances of software as benign or malicious. This approach provides no information regarding the behavioral characteristics of malware. It also requires a large amount of training data and is prone to labeling difficulties and can reduce accuracy due to redundant training data. Therefore, we propose a malware detection method based on deep learning, which uses malware images and a set of autoencoders to detect malware. The method is to design an autoencoder to learn the functional characteristics of malware, and then to observe the reconstruction error of autoencoder to realize the classification and detection of malware and benign software. The proposed approach achieves 93% accuracy and comparatively better F1-score values while detecting malware and needs little training data when compared with traditional malware detection systems.
Patil, Rajvardhan, Deng, Wei.  2020.  Malware Analysis using Machine Learning and Deep Learning techniques. 2020 SoutheastCon. 2:1–7.
In this era, where the volume and diversity of malware is rising exponentially, new techniques need to be employed for faster and accurate identification of the malwares. Manual heuristic inspection of malware analysis are neither effective in detecting new malware, nor efficient as they fail to keep up with the high spreading rate of malware. Machine learning approaches have therefore gained momentum. They have been used to automate static and dynamic analysis investigation where malware having similar behavior are clustered together, and based on the proximity unknown malwares get classified to their respective families. Although many such research efforts have been conducted where data-mining and machine-learning techniques have been applied, in this paper we show how the accuracy can further be improved using deep learning networks. As deep learning offers superior classification by constructing neural networks with a higher number of potentially diverse layers it leads to improvement in automatic detection and classification of the malware variants.In this research, we present a framework which extracts various feature-sets such as system calls, operational codes, sections, and byte codes from the malware files. In the experimental and result section, we compare the accuracy obtained from each of these features and demonstrate that feature vector for system calls yields the highest accuracy. The paper concludes by showing how deep learning approach performs better than the traditional shallow machine learning approaches.
Walker, Aaron, Sengupta, Shamik.  2020.  Malware Family Fingerprinting Through Behavioral Analysis. 2020 IEEE International Conference on Intelligence and Security Informatics (ISI). :1–5.
Signature-based malware detection is not always effective at detecting polymorphic variants of known malware. Malware signatures are devised to counter known threats, which also limits efficacy against new forms of malware. However, existing signatures do present the ability to classify malware based upon known malicious behavior which occurs on a victim computer. In this paper we present a method of classifying malware by family type through behavioral analysis, where the frequency of system function calls is used to fingerprint the actions of specific malware families. This in turn allows us to demonstrate a machine learning classifier which is capable of distinguishing malware by family affiliation with high accuracy.
Sathya, K, Premalatha, J, Suwathika, S.  2020.  Reinforcing Cyber World Security with Deep Learning Approaches. 2020 International Conference on Communication and Signal Processing (ICCSP). :0766–0769.
In the past decade, the Machine Learning (ML) and Deep learning (DL) has produced much research interest in the society and attracted them. Now-a-days, the Internet and social life make a lead in most of their life but it has serious social threats. It is a challenging thing to protect the sensitive information, data network and the computers which are in unauthorized cyber-attacks. For protecting the data's we need the cyber security. For these problems, the recent technologies of Deep learning and Machine Learning are integrated with the cyber-attacks to provide the solution for the problems. This paper gives a synopsis of utilizing deep learning to enhance the security of cyber world and various challenges in integrating deep learning into cyber security are analyzed.
2021-08-31
Rouka, Elpida, Birkinshaw, Celyn, Vassilakis, Vassilios G..  2020.  SDN-based Malware Detection and Mitigation: The Case of ExPetr Ransomware. 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT). :150–155.
This paper investigates the use of Software-Defined Networking (SDN) in the detection and mitigation of malware threat, focusing on the example of ExPetr ransomware. Extensive static and dynamic analysis of ExPetr is performed in a purpose-built SDN testbed. The results acquired from this analysis are then used to design and implement an SDN-based solution to detect the malware and prevent it from spreading to other machines inside a local network. Our solution consists of three security mechanisms that have been implemented as components/modules of the Python-based POX controller. These mechanisms include: port blocking, SMB payload inspection, and HTTP payload inspection. When malicious activity is detected, the controller communicates with the SDN switches via the OpenFlow protocol and installs appropriate entries in their flow tables. In particular, the controller blocks machines which are considered infected, by monitoring and reacting in real time to the network traffic they produce. Our experimental results demonstrate that the proposed designs are effective against self-propagating malware in local networks. The implemented system can respond to malicious activities quickly and in real time. Furthermore, by tuning certain thresholds of the detection mechanisms it is possible to trade-off the detection time with the false positive rate.
2021-05-05
Kumar, Rahul, Sethi, Kamalakanta, Prajapati, Nishant, Rout, Rashmi Ranjan, Bera, Padmalochan.  2020.  Machine Learning based Malware Detection in Cloud Environment using Clustering Approach. 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT). :1—7.

Enforcing security and resilience in a cloud platform is an essential but challenging problem due to the presence of a large number of heterogeneous applications running on shared resources. A security analysis system that can detect threats or malware must exist inside the cloud infrastructure. Much research has been done on machine learning-driven malware analysis, but it is limited in computational complexity and detection accuracy. To overcome these drawbacks, we proposed a new malware detection system based on the concept of clustering and trend micro locality sensitive hashing (TLSH). We used Cuckoo sandbox, which provides dynamic analysis reports of files by executing them in an isolated environment. We used a novel feature extraction algorithm to extract essential features from the malware reports obtained from the Cuckoo sandbox. Further, the most important features are selected using principal component analysis (PCA), random forest, and Chi-square feature selection methods. Subsequently, the experimental results are obtained for clustering and non-clustering approaches on three classifiers, including Decision Tree, Random Forest, and Logistic Regression. The model performance shows better classification accuracy and false positive rate (FPR) as compared to the state-of-the-art works and non-clustering approach at significantly lesser computation cost.

2021-04-08
Westland, T., Niu, N., Jha, R., Kapp, D., Kebede, T..  2020.  Relating the Empirical Foundations of Attack Generation and Vulnerability Discovery. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :37–44.
Automatically generating exploits for attacks receives much attention in security testing and auditing. However, little is known about the continuous effect of automatic attack generation and detection. In this paper, we develop an analytic model to understand the cost-benefit tradeoffs in light of the process of vulnerability discovery. We develop a three-phased model, suggesting that the cumulative malware detection has a productive period before the rate of gain flattens. As the detection mechanisms co-evolve, the gain will likely increase. We evaluate our analytic model by using an anti-virus tool to detect the thousands of Trojans automatically created. The anti-virus scanning results over five months show the validity of the model and point out future research directions.
2021-03-04
Matin, I. Muhamad Malik, Rahardjo, B..  2020.  A Framework for Collecting and Analysis PE Malware Using Modern Honey Network (MHN). 2020 8th International Conference on Cyber and IT Service Management (CITSM). :1—5.

Nowadays, Windows is an operating system that is very popular among people, especially users who have limited knowledge of computers. But unconsciously, the security threat to the windows operating system is very high. Security threats can be in the form of illegal exploitation of the system. The most common attack is using malware. To determine the characteristics of malware using dynamic analysis techniques and static analysis is very dependent on the availability of malware samples. Honeypot is the most effective malware collection technique. But honeypot cannot determine the type of file format contained in malware. File format information is needed for the purpose of handling malware analysis that is focused on windows-based malware. For this reason, we propose a framework that can collect malware information as well as identify malware PE file type formats. In this study, we collected malware samples using a modern honey network. Next, we performed a feature extraction to determine the PE file format. Then, we classify types of malware using VirusTotal scanning. As the results of this study, we managed to get 1.222 malware samples. Out of 1.222 malware samples, we successfully extracted 945 PE malware. This study can help researchers in other research fields, such as machine learning and deep learning, for malware detection.

2021-02-10
Tanana, D., Tanana, G..  2020.  Advanced Behavior-Based Technique for Cryptojacking Malware Detection. 2020 14th International Conference on Signal Processing and Communication Systems (ICSPCS). :1—4.
With rising value and popularity of cryptocurrencies, they inevitably attract cybercriminals seeking illicit profits within blockchain ecosystem. Two of the most popular methods are ransomware and cryptojacking. Ransomware, being the first and more obvious threat has been extensively studied in the past. Unlike that, scientists have often neglected cryptojacking, because it’s less obvious and less harmful than ransomware. In this paper, we’d like to propose enhanced detection program to combat cryptojacking, additionally briefly touching history of cryptojacking, also known as malicious mining and reviewing most notable previous attempts to detect and combat cryptojacking. The review would include out previous work on malicious mining detection and our current detection program is based on its previous iteration, which mostly used CPU usage heuristics to detect cryptojacking. However, we will include additional metrics for malicious mining detection, such as network usage and calls to cryptographic libraries, which result in a 93% detection rate against the selected number of cryptojacking samples, compared to 81% rate achieved in previous work. Finally, we’ll discuss generalization of proposed detection technique to include GPU cryptojackers.
2021-02-08
Zhang, J..  2020.  DeepMal: A CNN-LSTM Model for Malware Detection Based on Dynamic Semantic Behaviours. 2020 International Conference on Computer Information and Big Data Applications (CIBDA). :313–316.
Malware refers to any software accessing or being installed in a system without the authorisation of administrators. Various malware has been widely used for cyber-criminals to accomplish their evil intentions and goals. To combat the increasing amount and reduce the threat of malicious programs, a novel deep learning framework, which uses NLP techniques for reference, combines CNN and LSTM neurones to capture the locally spatial correlations and learn from sequential longterm dependency is proposed. Hence, high-level abstractions and representations are automatically extracted for the malware classification task. The classification accuracy improves from 0.81 (best one by Random Forest) to approximately 1.0.
2021-01-22
Mani, G., Pasumarti, V., Bhargava, B., Vora, F. T., MacDonald, J., King, J., Kobes, J..  2020.  DeCrypto Pro: Deep Learning Based Cryptomining Malware Detection Using Performance Counters. 2020 IEEE International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS). :109—118.
Autonomy in cybersystems depends on their ability to be self-aware by understanding the intent of services and applications that are running on those systems. In case of mission-critical cybersystems that are deployed in dynamic and unpredictable environments, the newly integrated unknown applications or services can either be benign and essential for the mission or they can be cyberattacks. In some cases, these cyberattacks are evasive Advanced Persistent Threats (APTs) where the attackers remain undetected for reconnaissance in order to ascertain system features for an attack e.g. Trojan Laziok. In other cases, the attackers can use the system only for computing e.g. cryptomining malware. APTs such as cryptomining malware neither disrupt normal system functionalities nor trigger any warning signs because they simply perform bitwise and cryptographic operations as any other benign compression or encoding application. Thus, it is difficult for defense mechanisms such as antivirus applications to detect these attacks. In this paper, we propose an Operating Context profiling system based on deep neural networks-Long Short-Term Memory (LSTM) networks-using Windows Performance Counters data for detecting these evasive cryptomining applications. In addition, we propose Deep Cryptomining Profiler (DeCrypto Pro), a detection system with a novel model selection framework containing a utility function that can select a classification model for behavior profiling from both the light-weight machine learning models (Random Forest and k-Nearest Neighbors) and a deep learning model (LSTM), depending on available computing resources. Given data from performance counters, we show that individual models perform with high accuracy and can be trained with limited training data. We also show that the DeCrypto Profiler framework reduces the use of computational resources and accurately detects cryptomining applications by selecting an appropriate model, given the constraints such as data sample size and system configuration.
2020-12-14
Dong, D., Ye, Z., Su, J., Xie, S., Cao, Y., Kochan, R..  2020.  A Malware Detection Method Based on Improved Fireworks Algorithm and Support Vector Machine. 2020 IEEE 15th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET). :846–851.
The increasing of malwares has presented a serious threat to the security of computer systems in recent years. Traditional signature-based anti-virus systems are not able to detect metamorphic and previously unseen malwares and it inspires people to use machine learning methods such as Naive Bayes and Decision Tree to identity malicious executables. Among these methods, detecting malwares by using Support Vector Machine (SVM) is one of the most effective approaches. However, the parameters of SVM have serious impacts on its classification performance. In order to find the optimal parameter combination and avoid the problem of falling into local optimal solution, many methods based on evolutionary algorithms are proposed, including Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Differential Evolution (DE) and others. But these algorithms still face the problem of being trapped into local solution spaces in different degree. In this paper, an improved fireworks algorithm is presented and applied to search parameters of SVM: penalty factor c and kernel function parameter g. To research the performance of the proposed algorithm, numeric experiments are made and compared with some typical algorithms, the experimental results demonstrate it outperforms other algorithms.
2020-12-11
Phu, T. N., Hoang, L., Toan, N. N., Tho, N. Dai, Binh, N. N..  2019.  C500-CFG: A Novel Algorithm to Extract Control Flow-based Features for IoT Malware Detection. 2019 19th International Symposium on Communications and Information Technologies (ISCIT). :568—573.

{Static characteristic extraction method Control flow-based features proposed by Ding has the ability to detect malicious code with higher accuracy than traditional Text-based methods. However, this method resolved NP-hard problem in a graph, therefore it is not feasible with the large-size and high-complexity programs. So, we propose the C500-CFG algorithm in Control flow-based features based on the idea of dynamic programming, solving Ding's NP-hard problem in O(N2) time complexity, where N is the number of basic blocks in decom-piled executable codes. Our algorithm is more efficient and more outstanding in detecting malware than Ding's algorithm: fast processing time, allowing processing large files, using less memory and extracting more feature information. Applying our algorithms with IoT data sets gives outstanding results on 2 measures: Accuracy = 99.34%

Payne, J., Kundu, A..  2019.  Towards Deep Federated Defenses Against Malware in Cloud Ecosystems. 2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). :92—100.

In cloud computing environments with many virtual machines, containers, and other systems, an epidemic of malware can be crippling and highly threatening to business processes. In this vision paper, we introduce a hierarchical approach to performing malware detection and analysis using several recent advances in machine learning on graphs, hypergraphs, and natural language. We analyze individual systems and their logs, inspecting and understanding their behavior with attentional sequence models. Given a feature representation of each system's logs using this procedure, we construct an attributed network of the cloud with systems and other components as vertices and propose an analysis of malware with inductive graph and hypergraph learning models. With this foundation, we consider the multicloud case, in which multiple clouds with differing privacy requirements cooperate against the spread of malware, proposing the use of federated learning to perform inference and training while preserving privacy. Finally, we discuss several open problems that remain in defending cloud computing environments against malware related to designing robust ecosystems, identifying cloud-specific optimization problems for response strategy, action spaces for malware containment and eradication, and developing priors and transfer learning tasks for machine learning models in this area.

Abusnaina, A., Khormali, A., Alasmary, H., Park, J., Anwar, A., Mohaisen, A..  2019.  Adversarial Learning Attacks on Graph-based IoT Malware Detection Systems. 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). :1296—1305.

IoT malware detection using control flow graph (CFG)-based features and deep learning networks are widely explored. The main goal of this study is to investigate the robustness of such models against adversarial learning. We designed two approaches to craft adversarial IoT software: off-the-shelf methods and Graph Embedding and Augmentation (GEA) method. In the off-the-shelf adversarial learning attack methods, we examine eight different adversarial learning methods to force the model to misclassification. The GEA approach aims to preserve the functionality and practicality of the generated adversarial sample through a careful embedding of a benign sample to a malicious one. Intensive experiments are conducted to evaluate the performance of the proposed method, showing that off-the-shelf adversarial attack methods are able to achieve a misclassification rate of 100%. In addition, we observed that the GEA approach is able to misclassify all IoT malware samples as benign. The findings of this work highlight the essential need for more robust detection tools against adversarial learning, including features that are not easy to manipulate, unlike CFG-based features. The implications of the study are quite broad, since the approach challenged in this work is widely used for other applications using graphs.

2020-11-30
Stokes, J. W., Agrawal, R., McDonald, G., Hausknecht, M..  2019.  ScriptNet: Neural Static Analysis for Malicious JavaScript Detection. MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM). :1–8.
Malicious scripts are an important computer infection threat vector for computer users. For internet-scale processing, static analysis offers substantial computing efficiencies. We propose the ScriptNet system for neural malicious JavaScript detection which is based on static analysis. We also propose a novel deep learning model, Pre-Informant Learning (PIL), which processes Javascript files as byte sequences. Lower layers capture the sequential nature of these byte sequences while higher layers classify the resulting embedding as malicious or benign. Unlike previously proposed solutions, our model variants are trained in an end-to-end fashion allowing discriminative training even for the sequential processing layers. Evaluating this model on a large corpus of 212,408 JavaScript files indicates that the best performing PIL model offers a 98.10% true positive rate (TPR) for the first 60K byte subsequences and 81.66% for the full-length files, at a false positive rate (FPR) of 0.50%. Both models significantly outperform several baseline models. The best performing PIL model can successfully detect 92.02% of unknown malware samples in a hindsight experiment where the true labels of the malicious JavaScript files were not known when the model was trained.
2020-10-30
Basu, Kanad, Elnaggar, Rana, Chakrabarty, Krishnendu, Karri, Ramesh.  2019.  PREEMPT: PReempting Malware by Examining Embedded Processor Traces. 2019 56th ACM/IEEE Design Automation Conference (DAC). :1—6.

Anti-virus software (AVS) tools are used to detect Malware in a system. However, software-based AVS are vulnerable to attacks. A malicious entity can exploit these vulnerabilities to subvert the AVS. Recently, hardware components such as Hardware Performance Counters (HPC) have been used for Malware detection. In this paper, we propose PREEMPT, a zero overhead, high-accuracy and low-latency technique to detect Malware by re-purposing the embedded trace buffer (ETB), a debug hardware component available in most modern processors. The ETB is used for post-silicon validation and debug and allows us to control and monitor the internal activities of a chip, beyond what is provided by the Input/Output pins. PREEMPT combines these hardware-level observations with machine learning-based classifiers to preempt Malware before it can cause damage. There are many benefits of re-using the ETB for Malware detection. It is difficult to hack into hardware compared to software, and hence, PREEMPT is more robust against attacks than AVS. PREEMPT does not incur performance penalties. Finally, PREEMPT has a high True Positive value of 94% and maintains a low False Positive value of 2%.

2020-10-29
Xylogiannopoulos, Konstantinos F., Karampelas, Panagiotis, Alhajj, Reda.  2019.  Text Mining for Malware Classification Using Multivariate All Repeated Patterns Detection. 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM). :887—894.

Mobile phones have become nowadays a commodity to the majority of people. Using them, people are able to access the world of Internet and connect with their friends, their colleagues at work or even unknown people with common interests. This proliferation of the mobile devices has also been seen as an opportunity for the cyber criminals to deceive smartphone users and steel their money directly or indirectly, respectively, by accessing their bank accounts through the smartphones or by blackmailing them or selling their private data such as photos, credit card data, etc. to third parties. This is usually achieved by installing malware to smartphones masking their malevolent payload as a legitimate application and advertise it to the users with the hope that mobile users will install it in their devices. Thus, any existing application can easily be modified by integrating a malware and then presented it as a legitimate one. In response to this, scientists have proposed a number of malware detection and classification methods using a variety of techniques. Even though, several of them achieve relatively high precision in malware classification, there is still space for improvement. In this paper, we propose a text mining all repeated pattern detection method which uses the decompiled files of an application in order to classify a suspicious application into one of the known malware families. Based on the experimental results using a real malware dataset, the methodology tries to correctly classify (without any misclassification) all randomly selected malware applications of 3 categories with 3 different families each.

Roseline, S. Abijah, Sasisri, A. D., Geetha, S., Balasubramanian, C..  2019.  Towards Efficient Malware Detection and Classification using Multilayered Random Forest Ensemble Technique. 2019 International Carnahan Conference on Security Technology (ICCST). :1—6.

The exponential growth rate of malware causes significant security concern in this digital era to computer users, private and government organizations. Traditional malware detection methods employ static and dynamic analysis, which are ineffective in identifying unknown malware. Malware authors develop new malware by using polymorphic and evasion techniques on existing malware and escape detection. Newly arriving malware are variants of existing malware and their patterns can be analyzed using the vision-based method. Malware patterns are visualized as images and their features are characterized. The alternative generation of class vectors and feature vectors using ensemble forests in multiple sequential layers is performed for classifying malware. This paper proposes a hybrid stacked multilayered ensembling approach which is robust and efficient than deep learning models. The proposed model outperforms the machine learning and deep learning models with an accuracy of 98.91%. The proposed system works well for small-scale and large-scale data since its adaptive nature of setting parameters (number of sequential levels) automatically. It is computationally efficient in terms of resources and time. The method uses very fewer hyper-parameters compared to deep neural networks.

Priyamvada Davuluru, Venkata Salini, Narayanan Narayanan, Barath, Balster, Eric J..  2019.  Convolutional Neural Networks as Classification Tools and Feature Extractors for Distinguishing Malware Programs. 2019 IEEE National Aerospace and Electronics Conference (NAECON). :273—278.

Classifying malware programs is a research area attracting great interest for Anti-Malware industry. In this research, we propose a system that visualizes malware programs as images and distinguishes those using Convolutional Neural Networks (CNNs). We study the performance of several well-established CNN based algorithms such as AlexNet, ResNet and VGG16 using transfer learning approaches. We also propose a computationally efficient CNN-based architecture for classification of malware programs. In addition, we study the performance of these CNNs as feature extractors by using Support Vector Machine (SVM) and K-nearest Neighbors (kNN) for classification purposes. We also propose fusion methods to boost the performance further. We make use of the publicly available database provided by Microsoft Malware Classification Challenge (BIG 2015) for this study. Our overall performance is 99.4% for a set of 2174 test samples comprising 9 different classes thereby setting a new benchmark.

2020-10-26
Li, Huhua, Zhan, Dongyang, Liu, Tianrui, Ye, Lin.  2019.  Using Deep-Learning-Based Memory Analysis for Malware Detection in Cloud. 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems Workshops (MASSW). :1–6.
Malware is one of the biggest threats in cloud computing. Malware running inside virtual machines or containers could steal critical information or continue to attack other cloud nodes. To detect malware in cloud, especially zero-day malware, signature-and machine-learning-based approaches are proposed to analyze the execution binary. However, malicious binary files may not permanently be stored in the file system of virtual machine or container, periodically scanner may not find the target files. Dynamic analysis approach usually introduce run-time overhead to virtual machines, which is not widely used in cloud. To solve these problems, we propose a memory analysis approach to detect malware, employing the deep learning technology. The system analyzes the memory image periodically during malware execution, which will not introduce run-time overhead. We first extract the memory snapshot from running virtual machines or containers. Then, the snapshot is converted to a grayscale image. Finally, we employ CNN to detect malware. In the learning phase, malicious and benign software are trained. In the testing phase, we test our system with real-world malwares.
Sethi, Kamalakanta, Kumar, Rahul, Sethi, Lingaraj, Bera, Padmalochan, Patra, Prashanta Kumar.  2019.  A Novel Machine Learning Based Malware Detection and Classification Framework. 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security). :1–4.
As time progresses, new and complex malware types are being generated which causes a serious threat to computer systems. Due to this drastic increase in the number of malware samples, the signature-based malware detection techniques cannot provide accurate results. Different studies have demonstrated the proficiency of machine learning for the detection and classification of malware files. Further, the accuracy of these machine learning models can be improved by using feature selection algorithms to select the most essential features and reducing the size of the dataset which leads to lesser computations. In this paper, we have developed a machine learning based malware analysis framework for efficient and accurate malware detection and classification. We used Cuckoo sandbox for dynamic analysis which executes malware in an isolated environment and generates an analysis report based on the system activities during execution. Further, we propose a feature extraction and selection module which extracts features from the report and selects the most important features for ensuring high accuracy at minimum computation cost. Then, we employ different machine learning algorithms for accurate detection and fine-grained classification. Experimental results show that we got high detection and classification accuracy in comparison to the state-of-the-art approaches.