Visible to the public Biblio

Filters: Keyword is malware analysis  [Clear All Filters]
2019-10-14
Angelini, M., Blasilli, G., Borrello, P., Coppa, E., D’Elia, D. C., Ferracci, S., Lenti, S., Santucci, G..  2018.  ROPMate: Visually Assisting the Creation of ROP-based Exploits. 2018 IEEE Symposium on Visualization for Cyber Security (VizSec). :1–8.

Exploits based on ROP (Return-Oriented Programming) are increasingly present in advanced attack scenarios. Testing systems for ROP-based attacks can be valuable for improving the security and reliability of software. In this paper, we propose ROPMATE, the first Visual Analytics system specifically designed to assist human red team ROP exploit builders. In contrast, previous ROP tools typically require users to inspect a puzzle of hundreds or thousands of lines of textual information, making it a daunting task. ROPMATE presents builders with a clear interface of well-defined and semantically meaningful gadgets, i.e., fragments of code already present in the binary application that can be chained to form fully-functional exploits. The system supports incrementally building exploits by suggesting gadget candidates filtered according to constraints on preserved registers and accessed memory. Several visual aids are offered to identify suitable gadgets and assemble them into semantically correct chains. We report on a preliminary user study that shows how ROPMATE can assist users in building ROP chains.

2019-08-05
Tofighi-Shirazi, Ramtine, Christofi, Maria, Elbaz-Vincent, Philippe, Le, Thanh-ha.  2018.  DoSE: Deobfuscation Based on Semantic Equivalence. Proceedings of the 8th Software Security, Protection, and Reverse Engineering Workshop. :1:1-1:12.

Software deobfuscation is a key challenge in malware analysis to understand the internal logic of the code and establish adequate countermeasures. In order to defeat recent obfuscation techniques, state-of-the-art generic deobfuscation methodologies are based on dynamic symbolic execution (DSE). However, DSE suffers from limitations such as code coverage and scalability. In the race to counter and remove the most advanced obfuscation techniques, there is a need to reduce the amount of code to cover. To that extend, we propose a novel deobfuscation approach based on semantic equivalence, called DoSE. With DoSE, we aim to improve and complement DSE-based deobfuscation techniques by statically eliminating obfuscation transformations (built on code-reuse). This improves the code coverage. Our method's novelty comes from the transposition of existing binary diffing techniques, namely semantic equivalence checking, to the purpose of the deobfuscation of untreated techniques, such as two-way opaque constructs, that we encounter in surreptitious software. In order to challenge DoSE, we used both known malwares such as Cryptowall, WannaCry, Flame and BitCoinMiner and obfuscated code samples. Our experimental results show that DoSE is an efficient strategy of detecting obfuscation transformations based on code-reuse with low rates of false positive and/or false negative results in practice, and up to 63% of code reduction on certain types of malwares.

2019-06-24
Stokes, J. W., Wang, D., Marinescu, M., Marino, M., Bussone, B..  2018.  Attack and Defense of Dynamic Analysis-Based, Adversarial Neural Malware Detection Models. MILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM). :1–8.

Recently researchers have proposed using deep learning-based systems for malware detection. Unfortunately, all deep learning classification systems are vulnerable to adversarial learning-based attacks, or adversarial attacks, where miscreants can avoid detection by the classification algorithm with very few perturbations of the input data. Previous work has studied adversarial attacks against static analysis-based malware classifiers which only classify the content of the unknown file without execution. However, since the majority of malware is either packed or encrypted, malware classification based on static analysis often fails to detect these types of files. To overcome this limitation, anti-malware companies typically perform dynamic analysis by emulating each file in the anti-malware engine or performing in-depth scanning in a virtual machine. These strategies allow the analysis of the malware after unpacking or decryption. In this work, we study different strategies of crafting adversarial samples for dynamic analysis. These strategies operate on sparse, binary inputs in contrast to continuous inputs such as pixels in images. We then study the effects of two, previously proposed defensive mechanisms against crafted adversarial samples including the distillation and ensemble defenses. We also propose and evaluate the weight decay defense. Experiments show that with these three defenses, the number of successfully crafted adversarial samples is reduced compared to an unprotected baseline system. In particular, the ensemble defense is the most resilient to adversarial attacks. Importantly, none of the defenses significantly reduce the classification accuracy for detecting malware. Finally, we show that while adding additional hidden layers to neural models does not significantly improve the malware classification accuracy, it does significantly increase the classifier's robustness to adversarial attacks.

Lai, Chia-Min, Lu, Chia-Yu, Lee, Hahn-Ming.  2018.  Implementation of Adversarial Scenario to Malware Analytic. Proceedings of the 2Nd International Conference on Machine Learning and Soft Computing. :127–132.

As the worldwide internet has non-stop developments, it comes with enormous amount automatically generated malware. Those malware had become huge threaten to computer users. A comprehensive malware family classifier can help security researchers to quickly identify characteristics of malware which help malware analysts to investigate in more efficient way. However, despite the assistance of the artificial intelligent (AI) classifiers, it has been shown that the AI-based classifiers are vulnerable to so-called adversarial attacks. In this paper, we demonstrate how the adversarial settings can be applied to the classifier of malware families classification. Our experimental results achieved high successful rate through the adversarial attack. We also find the important features which are ignored by malware analysts but useful in the future analysis.

Copty, Fady, Danos, Matan, Edelstein, Orit, Eisner, Cindy, Murik, Dov, Zeltser, Benjamin.  2018.  Accurate Malware Detection by Extreme Abstraction. Proceedings of the 34th Annual Computer Security Applications Conference. :101–111.

Modern malware applies a rich arsenal of evasion techniques to render dynamic analysis ineffective. In turn, dynamic analysis tools take great pains to hide themselves from malware; typically this entails trying to be as faithful as possible to the behavior of a real run. We present a novel approach to malware analysis that turns this idea on its head, using an extreme abstraction of the operating system that intentionally strays from real behavior. The key insight is that the presence of malicious behavior is sufficient evidence of malicious intent, even if the path taken is not one that could occur during a real run of the sample. By exploring multiple paths in a system that only approximates the behavior of a real system, we can discover behavior that would often be hard to elicit otherwise. We aggregate features from multiple paths and use a funnel-like configuration of machine learning classifiers to achieve high accuracy without incurring too much of a performance penalty. We describe our system, TAMALES (The Abstract Malware Analysis LEarning System), in detail and present machine learning results using a 330K sample set showing an FPR (False Positive Rate) of 0.10% with a TPR (True Positive Rate) of 99.11%, demonstrating that extreme abstraction can be extraordinarily effective in providing data that allows a classifier to accurately detect malware.

Ijaz, M., Durad, M. H., Ismail, M..  2019.  Static and Dynamic Malware Analysis Using Machine Learning. 2019 16th International Bhurban Conference on Applied Sciences and Technology (IBCAST). :687–691.

Malware detection is an indispensable factor in security of internet oriented machines. The combinations of different features are used for dynamic malware analysis. The different combinations are generated from APIs, Summary Information, DLLs and Registry Keys Changed. Cuckoo sandbox is used for dynamic malware analysis, which is customizable, and provide good accuracy. More than 2300 features are extracted from dynamic analysis of malware and 92 features are extracted statically from binary malware using PEFILE. Static features are extracted from 39000 malicious binaries and 10000 benign files. Dynamically 800 benign files and 2200 malware files are analyzed in Cuckoo Sandbox and 2300 features are extracted. The accuracy of dynamic malware analysis is 94.64% while static analysis accuracy is 99.36%. The dynamic malware analysis is not effective due to tricky and intelligent behaviours of malwares. The dynamic analysis has some limitations due to controlled network behavior and it cannot be analyzed completely due to limited access of network.

Wright, D., Stroschein, J..  2018.  A Malware Analysis and Artifact Capture Tool. 2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech). :328–333.

Malware authors attempt to obfuscate and hide their code in its static and dynamic states. This paper provides a novel approach to aid analysis by intercepting and capturing malware artifacts and providing dynamic control of process flow. Capturing malware artifacts allows an analyst to more quickly and comprehensively understand malware behavior and obfuscation techniques and doing so interactively allows multiple code paths to be explored. The faster that malware can be analyzed the quicker the systems and data compromised by it can be determined and its infection stopped. This research proposes an instantiation of an interactive malware analysis and artifact capture tool.

Naeem, H., Guo, B., Naeem, M. R..  2018.  A light-weight malware static visual analysis for IoT infrastructure. 2018 International Conference on Artificial Intelligence and Big Data (ICAIBD). :240–244.

Recently a huge trend on the internet of things (IoT) and an exponential increase in automated tools are helping malware producers to target IoT devices. The traditional security solutions against malware are infeasible due to low computing power for large-scale data in IoT environment. The number of malware and their variants are increasing due to continuous malware attacks. Consequently, the performance improvement in malware analysis is critical requirement to stop rapid expansion of malicious attacks in IoT environment. To solve this problem, the paper proposed a novel framework for classifying malware in IoT environment. To achieve flne-grained malware classification in suggested framework, the malware image classification system (MICS) is designed for representing malware image globally and locally. MICS first converts the suspicious program into the gray-scale image and then captures hybrid local and global malware features to perform malware family classification. Preliminary experimental outcomes of MICS are quite promising with 97.4% classification accuracy on 9342 windows suspicious programs of 25 families. The experimental results indicate that proposed framework is quite capable to process large-scale IoT malware.

Qbeitah, M. A., Aldwairi, M..  2018.  Dynamic malware analysis of phishing emails. 2018 9th International Conference on Information and Communication Systems (ICICS). :18–24.

Malicious software or malware is one of the most significant dangers facing the Internet today. In the fight against malware, users depend on anti-malware and anti-virus products to proactively detect threats before damage is done. Those products rely on static signatures obtained through malware analysis. Unfortunately, malware authors are always one step ahead in avoiding detection. This research deals with dynamic malware analysis, which emphasizes on: how the malware will behave after execution, what changes to the operating system, registry and network communication take place. Dynamic analysis opens up the doors for automatic generation of anomaly and active signatures based on the new malware's behavior. The research includes a design of honeypot to capture new malware and a complete dynamic analysis laboratory setting. We propose a standard analysis methodology by preparing the analysis tools, then running the malicious samples in a controlled environment to investigate their behavior. We analyze 173 recent Phishing emails and 45 SPIM messages in search for potentially new malwares, we present two malware samples and their comprehensive dynamic analysis.

Sethi, Kamalakanta, Chaudhary, Shankar Kumar, Tripathy, Bata Krishan, Bera, Padmalochan.  2018.  A Novel Malware Analysis Framework for Malware Detection and Classification Using Machine Learning Approach. Proceedings of the 19th International Conference on Distributed Computing and Networking. :49:1–49:4.

Nowadays, the digitization of the world is under a serious threat due to the emergence of various new and complex malware every day. Due to this, the traditional signature-based methods for detection of malware effectively become an obsolete method. The efficiency of the machine learning techniques in context to the detection of malwares has been proved by state-of-the-art research works. In this paper, we have proposed a framework to detect and classify different files (e.g., exe, pdf, php, etc.) as benign and malicious using two level classifier namely, Macro (for detection of malware) and Micro (for classification of malware files as a Trojan, Spyware, Ad-ware, etc.). Our solution uses Cuckoo Sandbox for generating static and dynamic analysis report by executing the sample files in the virtual environment. In addition, a novel feature extraction module has been developed which functions based on static, behavioral and network analysis using the reports generated by the Cuckoo Sandbox. Weka Framework is used to develop machine learning models by using training datasets. The experimental results using the proposed framework shows high detection rate and high classification rate using different machine learning algorithms

Viglianisi, Gabriele, Carminati, Michele, Polino, Mario, Continella, Andrea, Zanero, Stefano.  2018.  SysTaint: Assisting Reversing of Malicious Network Communications. Proceedings of the 8th Software Security, Protection, and Reverse Engineering Workshop. :4:1–4:12.

The ever-increasing number of malware samples demands for automated tools that aid the analysts in the reverse engineering of complex malicious binaries. Frequently, malware communicates over an encrypted channel with external network resources under the control of malicious actors, such as Command and Control servers that control the botnet of infected machines. Hence, a key aspect in malware analysis is uncovering and understanding the semantics of network communications. In this paper we present SysTaint, a semi-automated tool that runs malware samples in a controlled environment and analyzes their execution to support the analyst in identifying the functions involved in the communication and the exchanged data. Our evaluation on four banking Trojan samples from different families shows that SysTaint is able to handle and inspect encrypted network communications, obtaining useful information on the data being sent and received, on how each sample processes this data, and on the inner portions of code that deal with the data processing.

Yakura, Hiromu, Shinozaki, Shinnosuke, Nishimura, Reon, Oyama, Yoshihiro, Sakuma, Jun.  2018.  Malware Analysis of Imaged Binary Samples by Convolutional Neural Network with Attention Mechanism. Proceedings of the Eighth ACM Conference on Data and Application Security and Privacy. :127–134.
This paper presents a proposal of a method to extract important byte sequences in malware samples to reduce the workload of human analysts who investigate the functionalities of the samples. This method, by applying convolutional neural network (CNN) with a technique called attention mechanism to an image converted from binary data, enables calculation of an "attention map," which shows regions having higher importance for classification in the image. This distinction of regions enables extraction of characteristic byte sequences peculiar to the malware family from the binary data and can provide useful information for the human analysts without a priori knowledge. Furthermore, the proposed method calculates the attention map for all binary data including the data section. Thus, it can process packed malware that might contain obfuscated code in the data section. Results of our evaluation experiment using malware datasets show that the proposed method provides higher classification accuracy than conventional methods. Furthermore, analysis of malware samples based on the calculated attention maps confirmed that the extracted sequences provide useful information for manual analysis, even when samples are packed.
2019-06-10
Kargaard, J., Drange, T., Kor, A., Twafik, H., Butterfield, E..  2018.  Defending IT Systems against Intelligent Malware. 2018 IEEE 9th International Conference on Dependable Systems, Services and Technologies (DESSERT). :411-417.

The increasing amount of malware variants seen in the wild is causing problems for Antivirus Software vendors, unable to keep up by creating signatures for each. The methods used to develop a signature, static and dynamic analysis, have various limitations. Machine learning has been used by Antivirus vendors to detect malware based on the information gathered from the analysis process. However, adversarial examples can cause machine learning algorithms to miss-classify new data. In this paper we describe a method for malware analysis by converting malware binaries to images and then preparing those images for training within a Generative Adversarial Network. These unsupervised deep neural networks are not susceptible to adversarial examples. The conversion to images from malware binaries should be faster than using dynamic analysis and it would still be possible to link malware families together. Using the Generative Adversarial Network, malware detection could be much more effective and reliable.

Debatty, T., Mees, W., Gilon, T..  2018.  Graph-Based APT Detection. 2018 International Conference on Military Communications and Information Systems (ICMCIS). :1-8.

In this paper we propose a new algorithm to detect Advanced Persistent Threats (APT's) that relies on a graph model of HTTP traffic. We also implement a complete detection system with a web interface that allows to interactively analyze the data. We perform a complete parameter study and experimental evaluation using data collected on a real network. The results show that the performance of our system is comparable to currently available antiviruses, although antiviruses use signatures to detect known malwares while our algorithm solely uses behavior analysis to detect new undocumented attacks.

Xue, S., Zhang, L., Li, A., Li, X., Ruan, C., Huang, W..  2018.  AppDNA: App Behavior Profiling via Graph-Based Deep Learning. IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. :1475-1483.

Better understanding of mobile applications' behaviors would lead to better malware detection/classification and better app recommendation for users. In this work, we design a framework AppDNA to automatically generate a compact representation for each app to comprehensively profile its behaviors. The behavior difference between two apps can be measured by the distance between their representations. As a result, the versatile representation can be generated once for each app, and then be used for a wide variety of objectives, including malware detection, app categorizing, plagiarism detection, etc. Based on a systematic and deep understanding of an app's behavior, we propose to perform a function-call-graph-based app profiling. We carefully design a graph-encoding method to convert a typically extremely large call-graph to a 64-dimension fix-size vector to achieve robust app profiling. Our extensive evaluations based on 86,332 benign and malicious apps demonstrate that our system performs app profiling (thus malware detection, classification, and app recommendation) to a high accuracy with extremely low computation cost: it classifies 4024 (benign/malware) apps using around 5.06 second with accuracy about 93.07%; it classifies 570 malware's family (total 21 families) using around 0.83 second with accuracy 82.3%; it classifies 9,730 apps' functionality with accuracy 33.3% for a total of 7 categories and accuracy of 88.1 % for 2 categories.

Jiang, H., Turki, T., Wang, J. T. L..  2018.  DLGraph: Malware Detection Using Deep Learning and Graph Embedding. 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). :1029-1033.

In this paper we present a new approach, named DLGraph, for malware detection using deep learning and graph embedding. DLGraph employs two stacked denoising autoencoders (SDAs) for representation learning, taking into consideration computer programs' function-call graphs and Windows application programming interface (API) calls. Given a program, we first use a graph embedding technique that maps the program's function-call graph to a vector in a low-dimensional feature space. One SDA in our deep learning model is used to learn a latent representation of the embedded vector of the function-call graph. The other SDA in our model is used to learn a latent representation of the given program's Windows API calls. The two learned latent representations are then merged to form a combined feature vector. Finally, we use softmax regression to classify the combined feature vector for predicting whether the given program is malware or not. Experimental results based on different datasets demonstrate the effectiveness of the proposed approach and its superiority over a related method.

Jain, D., Khemani, S., Prasad, G..  2018.  Identification of Distributed Malware. 2018 IEEE 3rd International Conference on Communication and Information Systems (ICCIS). :242-246.

Smartphones have evolved over the years from simple devices to communicate with each other to fully functional portable computers although with comparatively less computational power but inholding multiple applications within. With the smartphone revolution, the value of personal data has increased. As technological complexities increase, so do the vulnerabilities in the system. Smartphones are the latest target for attacks. Android being an open source platform and also the most widely used smartphone OS draws the attention of many malware writers to exploit the vulnerabilities of it. Attackers try to take advantage of these vulnerabilities and fool the user and misuse their data. Malwares have come a long way from simple worms to sophisticated DDOS using Botnets, the latest trends in computer malware tend to go in the distributed direction, to evade the multiple anti-virus apps developed to counter generic viruses and Trojans. However, the recent trend in android system is to have a combination of applications which acts as malware. The applications are benign individually but when grouped, these may result into a malicious activity. This paper proposes a new category of distributed malware in android system, how it can be used to evade the current security, and how it can be detected with the help of graph matching algorithm.

Luo, Chen, Chen, Zhengzhang, Tang, Lu-An, Shrivastava, Anshumali, Li, Zhichun, Chen, Haifeng, Ye, Jieping.  2018.  TINET: Learning Invariant Networks via Knowledge Transfer. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. :1890-1899.

The latent behavior of an information system that can exhibit extreme events, such as system faults or cyber-attacks, is complex. Recently, the invariant network has shown to be a powerful way of characterizing complex system behaviors. Structures and evolutions of the invariance network, in particular, the vanishing correlations, can shed light on identifying causal anomalies and performing system diagnosis. However, due to the dynamic and complex nature of real-world information systems, learning a reliable invariant network in a new environment often requires continuous collecting and analyzing the system surveillance data for several weeks or even months. Although the invariant networks learned from old environments have some common entities and entity relationships, these networks cannot be directly borrowed for the new environment due to the domain variety problem. To avoid the prohibitive time and resource consuming network building process, we propose TINET, a knowledge transfer based model for accelerating invariant network construction. In particular, we first propose an entity estimation model to estimate the probability of each source domain entity that can be included in the final invariant network of the target domain. Then, we propose a dependency construction model for constructing the unbiased dependency relationships by solving a two-constraint optimization problem. Extensive experiments on both synthetic and real-world datasets demonstrate the effectiveness and efficiency of TINET. We also apply TINET to a real enterprise security system for intrusion detection. TINET achieves superior detection performance at least 20 days lead-lag time in advance with more than 75% accuracy.

Cao, Cheng, Chen, Zhengzhang, Caverlee, James, Tang, Lu-An, Luo, Chen, Li, Zhichun.  2018.  Behavior-Based Community Detection: Application to Host Assessment In Enterprise Information Networks. Proceedings of the 27th ACM International Conference on Information and Knowledge Management. :1977-1985.

Community detection in complex networks is a fundamental problem that attracts much attention across various disciplines. Previous studies have been mostly focusing on external connections between nodes (i.e., topology structure) in the network whereas largely ignoring internal intricacies (i.e., local behavior) of each node. A pair of nodes without any interaction can still share similar internal behaviors. For example, in an enterprise information network, compromised computers controlled by the same intruder often demonstrate similar abnormal behaviors even if they do not connect with each other. In this paper, we study the problem of community detection in enterprise information networks, where large-scale internal events and external events coexist on each host. The discovered host communities, capturing behavioral affinity, can benefit many comparative analysis tasks such as host anomaly assessment. In particular, we propose a novel community detection framework to identify behavior-based host communities in enterprise information networks, purely based on large-scale heterogeneous event data. We continue proposing an efficient method for assessing host's anomaly level by leveraging the detected host communities. Experimental results on enterprise networks demonstrate the effectiveness of our model.

Su, Fang-Hsiang, Bell, Jonathan, Kaiser, Gail, Ray, Baishakhi.  2018.  Obfuscation Resilient Search Through Executable Classification. Proceedings of the 2Nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages. :20-30.

Android applications are usually obfuscated before release, making it difficult to analyze them for malware presence or intellectual property violations. Obfuscators might hide the true intent of code by renaming variables and/or modifying program structures. It is challenging to search for executables relevant to an obfuscated application for developers to analyze efficiently. Prior approaches toward obfuscation resilient search have relied on certain structural parts of apps remaining as landmarks, un-touched by obfuscation. For instance, some prior approaches have assumed that the structural relationships between identifiers are not broken by obfuscators; others have assumed that control flow graphs maintain their structures. Both approaches can be easily defeated by a motivated obfuscator. We present a new approach, MACNETO, to search for programs relevant to obfuscated executables leveraging deep learning and principal components on instructions. MACNETO makes few assumptions about the kinds of modifications that an obfuscator might perform. We show that it has high search precision for executables obfuscated by a state-of-the-art obfuscator that changes control flow. Further, we also demonstrate the potential of MACNETO to help developers understand executables, where MACNETO infers keywords (which are from relevant un-obfuscated programs) for obfuscated executables.

Jo, Saehan, Yoo, Jaemin, Kang, U.  2018.  Fast and Scalable Distributed Loopy Belief Propagation on Real-World Graphs. Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. :297-305.

Given graphs with millions or billions of vertices and edges, how can we efficiently make inferences based on partial knowledge? Loopy Belief Propagation(LBP) is a graph inference algorithm widely used in various applications including social network analysis, malware detection, recommendation, and image restoration. The algorithm calculates approximate marginal probabilities of vertices in a graph within a linear running time proportional to the number of edges. However, when it comes to real-world graphs with millions or billions of vertices and edges, this cost overwhelms the computing power of a single machine. Moreover, this kind of large-scale graphs does not fit into the memory of a single machine. Although several distributed LBP methods have been proposed, previous works do not consider the properties of real-world graphs, especially the effect of power-law degree distribution on LBP. Therefore, our work focuses on developing a fast and scalable LBP for such large real-world graphs on distributed environment. In this paper, we propose DLBP, a Distributed Loopy Belief Propagation algorithm which efficiently computes LBP in a distributed manner across multiple machines. By setting the correct convergence criterion and carefully scheduling the computations, DLBP provides up to 10.7x speed up compared to standard distributed LBP. We show that DLBP demonstrates near-linear scalability with respect to the number of machines as well as the number of edges.

Mpanti, Anna, Nikolopoulos, Stavros D., Polenakis, Iosif.  2018.  A Graph-Based Model for Malicious Software Detection Exploiting Domination Relations Between System-Call Groups. Proceedings of the 19th International Conference on Computer Systems and Technologies. :20-26.

In this paper, we propose a graph-based algorithmic technique for malware detection, utilizing the System-call Dependency Graphs (ScDG) obtained through taint analysis traces. We leverage the grouping of system-calls into system-call groups with respect to their functionality to merge disjoint vertices of ScDG graphs, transforming them to Group Relation Graphs (GrG); note that, the GrG graphs represent malware's behavior being hence more resilient to probable mutations of its structure. More precisely, we extend the use of GrG graphs by mapping their vertices on the plane utilizing the degrees and the vertex-weights of a specific underlying graph of the GrG graph as to compute domination relations. Furthermore, we investigate how the activity of each system-call group could be utilized in order to distinguish graph-representations of malware and benign software. The domination relations among the vertices of GrG graphs result to a new graph representation that we call Coverage Graph of the GrG graph. Finally, we evaluate the potentials of our detection model using graph similarity between Coverage Graphs of known malicious and benign software samples of various types.

Karbab, ElMouatez Billah, Debbabi, Mourad.  2018.  ToGather: Automatic Investigation of Android Malware Cyber-Infrastructures. Proceedings of the 13th International Conference on Availability, Reliability and Security. :20:1-20:10.

The popularity of Android, not only in handsets but also in IoT devices, makes it a very attractive target for malware threats, which are actually expanding at a significant rate. The state-of-the-art in malware mitigation solutions mainly focuses on the detection of malicious Android apps using dynamic and static analysis features to segregate malicious apps from benign ones. Nevertheless, there is a small coverage for the Internet/network dimension of Android malicious apps. In this paper, we present ToGather, an automatic investigation framework that takes Android malware samples as input and produces insights about the underlying malicious cyber infrastructures. ToGather leverages state-of-the-art graph theory techniques to generate actionable, relevant and granular intelligence to mitigate the threat effects induced by the malicious Internet activity of Android malware apps. We evaluate ToGather on a large dataset of real malware samples from various Android families, and the obtained results are both interesting and promising.

2019-03-04
Schwartz, Edward J., Cohen, Cory F., Duggan, Michael, Gennari, Jeffrey, Havrilla, Jeffrey S., Hines, Charles.  2018.  Using Logic Programming to Recover C++ Classes and Methods from Compiled Executables. Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. :426–441.
High-level C++ source code abstractions such as classes and methods greatly assist human analysts and automated algorithms alike when analyzing C++ programs. Unfortunately, these abstractions are lost when compiling C++ source code, which impedes the understanding of C++ executables. In this paper, we propose a system, OOAnalyzer, that uses an innovative new design to statically recover detailed C++ abstractions from executables in a scalable manner. OOAnalyzer's design is motivated by the observation that many human analysts reason about C++ programs by recognizing simple patterns in binary code and then combining these findings using logical inference, domain knowledge, and intuition. We codify this approach by combining a lightweight symbolic analysis with a flexible Prolog-based reasoning system. Unlike most existing work, OOAnalyzer is able to recover both polymorphic and non-polymorphic C++ classes. We show in our evaluation that OOAnalyzer assigns over 78% of methods to the correct class on our test corpus, which includes both malware and real-world software such as Firefox and MySQL. These recovered abstractions can help analysts understand the behavior of C++ malware and cleanware, and can also improve the precision of program analyses on C++ executables.
2018-06-20
Lee, Y., Choi, S. S., Choi, J., Song, J..  2017.  A Lightweight Malware Classification Method Based on Detection Results of Anti-Virus Software. 2017 12th Asia Joint Conference on Information Security (AsiaJCIS). :5–9.

With the development of cyber threats on the Internet, the number of malware, especially unknown malware, is also dramatically increasing. Since all of malware cannot be analyzed by analysts, it is very important to find out new malware that should be analyzed by them. In order to cope with this issue, the existing approaches focused on malware classification using static or dynamic analysis results of malware. However, the static and the dynamic analyses themselves are also too costly and not easy to build the isolated, secure and Internet-like analysis environments such as sandbox. In this paper, we propose a lightweight malware classification method based on detection results of anti-virus software. Since the proposed method can reduce the volume of malware that should be analyzed by analysts, it can be used as a preprocess for in-depth analysis of malware. The experimental showed that the proposed method succeeded in classification of 1,000 malware samples into 187 unique groups. This means that 81% of the original malware samples do not need to analyze by analysts.