Biblio

Found 218 results

Filters: Keyword is Malware
Huang, Yonghong, Verma, Utkarsh, Fralick, Celeste, Infantec-Lopez, Gabriel, Kumar, Brajesh, Woodward, Carl.  2019.  Malware Evasion Attack and Defense. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :34–38.
Machine learning (ML) classifiers are vulnerable to adversarial examples. An adversarial example is an input sample which is slightly modified to induce misclassification in an ML classifier. In this work, we investigate white-box and grey-box evasion attacks to an ML-based malware detector and conduct performance evaluations in a real-world setting. We compare the defense approaches in mitigating the attacks. We propose a framework for deploying grey-box and black-box attacks to malware detection systems.
Biswal, Satya Ranjan, Swain, Santosh Kumar.  2019.  Model for Study of Malware Propagation Dynamics in Wireless Sensor Network. 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI). :647–653.
Wireless sensor networks (WSNs) face critical security challenges from malware (worms, viruses, malicious code, etc.) attacks. When a single node is compromised by malware, the malware starts to spread through the entire sensor network via neighboring sensor nodes. To understand the dynamics of malware propagation in WSNs, we propose a Susceptible-Exposed-Infectious-Recovered-Dead (SEIRD) model. The model draws on concepts from epidemiology. It focuses on early detection of malicious signals in the network and the corresponding application of a security mechanism for their removal. Early detection helps control the spread of malware and reduces the battery consumption of sensor nodes. In this paper, we study the dynamics of malware propagation and the stability of the system. In epidemiology, the basic reproduction number is a crucial parameter used to determine the status of malware in the system. An expression for the basic reproduction number is obtained. We analyze the propagation dynamics and compare them with a previous model. The proposed model provides an improved security mechanism in comparison with the previous one. Extensive simulation results confirm the analytical investigation and the accuracy of the proposed model.
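As a rough illustration of the kind of compartment model this abstract describes, the following is a minimal sketch of a generic textbook SEIRD system integrated with forward Euler. The abstract does not give the paper's equations, so the rate names (beta, sigma, gamma, mu), the function names, and the closed-form reproduction number below are assumptions for this simplified model only, not the authors' formulation.

```python
def simulate_seird(s, e, i, r, d, beta, sigma, gamma, mu, dt=0.01, steps=10000):
    """Forward-Euler integration of a generic SEIRD model.

    All compartments are fractions of the node population.
    Hypothetical rates: beta (infection), sigma (exposed -> infectious),
    gamma (recovery), mu (node death, e.g. battery exhaustion).
    """
    for _ in range(steps):
        new_exposed = beta * s * i * dt     # susceptible nodes reached by infectious ones
        new_infectious = sigma * e * dt     # exposed nodes that start spreading
        new_recovered = gamma * i * dt      # infectious nodes that are cleaned
        new_dead = mu * i * dt              # infectious nodes that fail permanently
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered - new_dead
        r += new_recovered
        d += new_dead
    return s, e, i, r, d


def basic_reproduction_number(beta, gamma, mu):
    """R0 for this simplified model only: infection rate times mean infectious period."""
    return beta / (gamma + mu)
```

In this simplified model, malware dies out when the reproduction number is below 1 and spreads when it is above 1; the paper's own expression may include additional terms.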
Skopik, Florian, Filip, Stefan.  2019.  Design principles for national cyber security sensor networks: Lessons learned from small-scale demonstrators. 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security). :1–8.
The timely exchange of information on new threats and vulnerabilities has become a cornerstone of effective cyber defence in recent years. Especially national authorities increasingly assume their role as information brokers through national cyber security centres and distribute warnings on new attack vectors and vital recommendations on how to mitigate them. Although many of these initiatives are effective to some degree, they also suffer from severe limitations. Many steps in the exchange process require extensive human involvement to manually review, vet, enrich, analyse and distribute security information. Some countries have therefore started to adopt distributed cyber security sensor networks to enable the automatic collection, analysis and preparation of security data and thus effectively overcome limiting scalability factors. The basic idea of IoC-centric cyber security sensor networks is that the national authorities distribute Indicators of Compromise (IoCs) to organizations and receive sightings in return. This effectively helps them to estimate the spreading of malware, anticipate further trends of spreading and derive vital findings for decision makers. While this application case seems quite simple, there are some tough questions to be answered in advance, which steer the further design decisions: How much can the monitored organization be trusted to be a partner in the search for malware? How much control of the scanning process should be delegated to the organization? What is the right level of search depth? How to deal with confidential indicators? What can be derived from encrypted traffic? How are new indicators distributed, prioritized, and scan targets selected in a scalable manner? What is a good strategy to re-schedule scans to derive meaningful data on trends, such as rate of spreading? 
This paper suggests a blueprint for a sensor network and raises related questions, outlines design principles, and discusses lessons learned from small-scale pilots.
Ding, Steven H. H., Fung, Benjamin C. M., Charland, Philippe.  2019.  Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization. 2019 IEEE Symposium on Security and Privacy (SP). :472–489.

Reverse engineering is a manually intensive but necessary technique for understanding the inner workings of new malware, finding vulnerabilities in existing systems, and detecting patent infringements in released software. An assembly clone search engine facilitates the work of reverse engineers by identifying those duplicated or known parts. However, it is challenging to design a robust clone search engine, since there exist various compiler optimization options and code obfuscation techniques that make logically similar assembly functions appear to be very different. A practical clone search engine relies on a robust vector representation of assembly code. However, the existing clone search approaches, which rely on a manual feature engineering process to form a feature vector for an assembly function, fail to consider the relationships between features and identify those unique patterns that can statistically distinguish assembly functions. To address this problem, we propose to jointly learn the lexical semantic relationships and the vector representation of assembly functions based on assembly code. We have developed an assembly code representation learning model, Asm2Vec. It only needs assembly code as input and does not require any prior knowledge such as the correct mapping between assembly functions. It can find and incorporate rich semantic relationships among tokens appearing in assembly code. We conduct extensive experiments and benchmark the learning model with state-of-the-art static and dynamic clone search approaches. We show that the learned representation is more robust and significantly outperforms existing methods against changes introduced by obfuscation and optimizations.

KADOGUCHI, Masashi, HAYASHI, Shota, HASHIMOTO, Masaki, OTSUKA, Akira.  2019.  Exploring the Dark Web for Cyber Threat Intelligence Using Machine Leaning. 2019 IEEE International Conference on Intelligence and Security Informatics (ISI). :200–202.

In recent years, cyber attack techniques are increasingly sophisticated, and blocking the attack is more and more difficult, even if a kind of counter measure or another is taken. In order for a successful handling of this situation, it is crucial to have a prediction of cyber attacks, appropriate precautions, and effective utilization of cyber intelligence that enables these actions. Malicious hackers share various kinds of information through particular communities such as the dark web, indicating that a great deal of intelligence exists in cyberspace. This paper focuses on forums on the dark web and proposes an approach to extract forums which include important information or intelligence from huge amounts of forums and identify traits of each forum using methodologies such as machine learning, natural language processing and so on. This approach will allow us to grasp the emerging threats in cyberspace and take appropriate measures against malicious activities.

Hou, Ye, Such, Jose, Rashid, Awais.  2019.  Understanding Security Requirements for Industrial Control System Supply Chains. 2019 IEEE/ACM 5th International Workshop on Software Engineering for Smart Cyber-Physical Systems (SEsCPS). :50–53.

We address the need for security requirements to take into account risks arising from complex supply chains underpinning cyber-physical infrastructures such as industrial control systems (ICS). We present SEISMiC (SEcurity Industrial control SysteM supply Chains), a framework that takes into account the whole spectrum of security risks - from technical aspects through to human and organizational issues - across an ICS supply chain. We demonstrate the effectiveness of SEISMiC through a supply chain risk assessment of Natanz, Iran's nuclear facility that was the subject of the Stuxnet attack.

He, Zecheng, Raghavan, Aswin, Hu, Guangyuan, Chai, Sek, Lee, Ruby.  2019.  Power-Grid Controller Anomaly Detection with Enhanced Temporal Deep Learning. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :160–167.
Controllers of security-critical cyber-physical systems, like the power grid, are a very important class of computer systems. Attacks against the control code of a power-grid system, especially zero-day attacks, can be catastrophic. Earlier detection of the anomalies can prevent further damage. However, detecting zero-day attacks is extremely challenging because they have no known code and have unknown behavior. Furthermore, if data collected from the controller is transferred to a server through networks for analysis and detection of anomalous behavior, this creates a very large attack surface and also delays detection. In order to address this problem, we propose Reconstruction Error Distribution (RED) of Hardware Performance Counters (HPCs), and a data-driven defense system based on it. Specifically, we first train a temporal deep learning model, using only normal HPC readings from legitimate processes that run daily in these power-grid systems, to model the normal behavior of the power-grid controller. Then, we run this model using real-time data from commonly available HPCs. We use the proposed RED to enhance the temporal deep learning detection of anomalous behavior, by estimating distribution deviations from the normal behavior with an effective statistical test. Experimental results on a real power-grid controller show that we can detect anomalous behavior with high accuracy (>99.9%), nearly zero false positives and short (<360 ms) latency.
Xiao, Kaiming, Zhu, Cheng, Xie, Junjie, Zhou, Yun, Zhu, Xianqiang, Zhang, Weiming.  2018.  Dynamic Defense Strategy against Stealth Malware Propagation in Cyber-Physical Systems. IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. :1790–1798.
Stealth malware, a representative tool of advanced persistent threat (APT) attacks, in particular poses an increased threat to cyber-physical systems (CPS). Due to the use of stealthy and evasive techniques (e.g., zero-day exploits, obfuscation techniques), stealth malwares usually render conventional heavyweight countermeasures (e.g., exploits patching, specialized anti-malware program) inapplicable. Light-weight countermeasures (e.g., containment techniques), on the other hand, can help retard the spread of stealth malwares, but the ensuing side effects might violate the primary safety requirement of CPS. Hence, defenders need to find a balance between the gain and loss of deploying light-weight countermeasures. To address this challenge, we model the persistent anti-malware process as a shortest-path tree interdiction (SPTI) Stackelberg game, and safety requirements of CPS are introduced as constraints in the defender's decision model. Specifically, we first propose a static game (SSPTI), and then extend it to a multi-stage dynamic game (DSPTI) to meet the need of real-time decision making. Both games are modelled as bi-level integer programs, and proved to be NP-hard. We then develop a Benders decomposition algorithm to achieve the Stackelberg Equilibrium of SSPTI. Finally, we design a model predictive control strategy to solve DSPTI approximately by sequentially solving an approximation of SSPTI. The extensive simulation results demonstrate that the proposed dynamic defense strategy can achieve a balance between fail-secure ability and fail-safe ability while retarding the stealth malware propagation in CPS.
Musca, Constantin, Mirica, Emma, Deaconescu, Razvan.  2013.  Detecting and Analyzing Zero-Day Attacks Using Honeypots. 2013 19th International Conference on Control Systems and Computer Science. :543–548.
Computer networks are overwhelmed by self propagating malware (worms, viruses, trojans). Although the number of security vulnerabilities grows every day, not the same thing can be said about the number of defense methods. But the most delicate problem in the information security domain remains detecting unknown attacks known as zero-day attacks. This paper presents methods for isolating the malicious traffic by using a honeypot system and analyzing it in order to automatically generate attack signatures for the Snort intrusion detection/prevention system. The honeypot is deployed as a virtual machine and its job is to log as much information as it can about the attacks. Then, using a protected machine, the logs are collected remotely, through a safe connection, for analysis. The challenge is to mitigate the risk we are exposed to and at the same time search for unknown attacks.
Hagan, Matthew, Kang, BooJoong, McLaughlin, Kieran, Sezer, Sakir.  2018.  Peer Based Tracking Using Multi-Tuple Indexing for Network Traffic Analysis and Malware Detection. 2018 16th Annual Conference on Privacy, Security and Trust (PST). :1–5.

Traditional firewalls, Intrusion Detection Systems (IDS) and network analytics tools extensively use the 'flow' connection concept, consisting of five 'tuples' of source and destination IP, ports and protocol type, for classification and management of network activities. By analysing flows, information can be obtained from TCP/IP fields and packet content to give an understanding of what is being transferred within a single connection. As networks have evolved to incorporate more connections and greater bandwidth, particularly from "always on" IoT devices and video and data streaming, so too have malicious network threats, whose communication methods have increased in sophistication. As a result, the concept of the 5 tuple flow in isolation is unable to detect such threats and malicious behaviours. This is due to factors such as the length of time and data required to understand the network traffic behaviour, which cannot be accomplished by observing a single connection. To alleviate this issue, this paper proposes the use of additional, two tuple and single tuple flow types to associate multiple 5 tuple communications, with generated metadata used to profile individual connection behaviour. This proposed approach enables advanced linking of different connections and behaviours, developing a clearer picture as to what network activities have been taking place over a prolonged period of time. To demonstrate the capability of this approach, an expert system rule set has been developed to detect the presence of a multi-peered ZeuS botnet, which communicates by making multiple connections with multiple hosts, thus undetectable to standard IDS systems observing 5 tuple flow types in isolation. Finally, as the solution is rule based, this implementation operates in realtime and does not require post-processing and analytics of other research solutions. 
This paper aims to demonstrate possible applications for next generation firewalls and methods to acquire additional information from network traffic.
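The idea of associating several 5 tuple flows through coarser keys can be sketched as follows. This is only an illustration of the general indexing idea, assuming a two tuple key of (source IP, destination IP) and a single tuple key of source IP; the names Flow, index_flows and distinct_peers are hypothetical, and the abstract does not specify the paper's actual data structures or rule set.

```python
from collections import defaultdict, namedtuple

# A conventional 5 tuple flow record (hypothetical field names).
Flow = namedtuple("Flow", "src_ip src_port dst_ip dst_port proto")


def index_flows(flows):
    """Group 5 tuple flows under a two tuple key and a single tuple key."""
    by_pair = defaultdict(list)   # (src_ip, dst_ip) -> all flows between the two hosts
    by_host = defaultdict(list)   # src_ip -> all flows started by that host
    for f in flows:
        by_pair[(f.src_ip, f.dst_ip)].append(f)
        by_host[f.src_ip].append(f)
    return by_pair, by_host


def distinct_peers(by_host, host):
    """Destinations contacted by one host across all of its flows."""
    return {f.dst_ip for f in by_host.get(host, [])}
```

A rule that looks at distinct_peers for one host can see many-peer behaviour that no single 5 tuple flow reveals, which is the property the abstract relies on for multi-peered botnet detection.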

Marín, Gonzalo, Casas, Pedro, Capdehourat, Germán.  2019.  Deep in the Dark - Deep Learning-Based Malware Traffic Detection Without Expert Knowledge. 2019 IEEE Security and Privacy Workshops (SPW). :36–42.

With the ever-growing occurrence of networking attacks, robust network security systems are essential to prevent and mitigate their harmful effects. In recent years, machine learning-based systems have gained popularity for network security applications, usually considering the application of shallow models, where a set of expert handcrafted features is needed to pre-process the data before training. The main problem with this approach is that handcrafted features can fail to perform well given different kinds of scenarios and problems. Deep Learning models can solve this kind of issue using their ability to learn feature representations from raw or basic, non-processed input data. In this paper we explore the power of deep learning models on the specific problem of detection and classification of malware network traffic, using different representations for the input data. As a major advantage as compared to the state of the art, we consider raw measurements coming directly from the stream of monitored bytes as the input to the proposed models, and evaluate different raw-traffic feature representations, including packet and flow-level ones. Our results suggest that deep learning models can better capture the underlying statistics of malicious traffic as compared to classical, shallow-like models, even while operating in the dark, i.e., without any sort of expert handcrafted inputs.

Zubarev, Dmytro, Skarga-Bandurova, Inna.  2019.  Cross-Site Scripting for Graphic Data: Vulnerabilities and Prevention. 2019 10th International Conference on Dependable Systems, Services and Technologies (DESSERT). :154–160.

In this paper, we present an overview of the problems associated with cross-site scripting (XSS) in the graphical content of web applications. A brief analysis of vulnerabilities in graphical files and the factors that make SVG images vulnerable to XSS attacks are discussed. XML treatment methods are presented and tested in practice. As a result, a set of rules for protecting the graphic content of websites and preventing XSS vulnerabilities is proposed.

Bukhari, Syed Nisar, Ahmad Dar, Muneer, Iqbal, Ummer.  2018.  Reducing attack surface corresponding to Type 1 cross-site scripting attacks using secure development life cycle practices. 2018 Fourth International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB). :1–4.

As the number of web users has increased exponentially, so has the number of attacks that seek to exploit the web for malicious purposes. One commonly exploited vulnerability is cross-site scripting (XSS). XSS refers to a client-side code injection attack in which a malicious user injects malicious scripts (also commonly referred to as a malicious payload) into a legitimate web site or web-based application. XSS is among the most rampant web application vulnerabilities and occurs when a web-based application uses unvalidated or unencoded user input within the output it generates. In such instances, the victim is unaware that their data is being transferred from a website that they trust to a different site controlled by the malicious user. In this paper we focus on type 1, or "non-persistent", cross-site scripting. With non-persistent cross-site scripting, malicious code or script is embedded in a Web request, and then partially or entirely echoed (or "reflected") by the Web server without encoding or validation in the Web response. The malicious code or script is then executed in the client's Web browser, which could lead to several negative outcomes, such as the theft of session data and access to sensitive data within cookies. For this type of cross-site scripting to succeed, a malicious user must coerce a user into clicking a link that triggers the non-persistent cross-site scripting attack. This is usually done through an email that encourages the user to click on a provided malicious link, or to visit a web site that is fraught with malicious links. This paper discusses how the attack surface related to type 1, or "non-persistent", cross-site scripting attacks can be reduced using secure development life cycle practices and techniques.
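The reflection step described in this abstract, and the output encoding that defeats it, can be sketched with Python's standard html.escape. This is a minimal illustration, not the paper's method: render_search_page is a hypothetical handler, and a real application would normally rely on a templating engine with context-aware auto-escaping.

```python
import html


def render_search_page(query: str) -> str:
    """Echo a request parameter back into HTML after encoding it.

    Hypothetical handler: 'query' stands for untrusted input taken
    from the Web request and reflected in the Web response.
    """
    # html.escape encodes &, <, > and (with quote=True) both quote characters,
    # so a reflected payload is displayed as text instead of being executed.
    safe_query = html.escape(query, quote=True)
    return f"<p>Results for: {safe_query}</p>"
```

Encoding at the point of output is one of the practices commonly associated with secure development life cycles; input validation and a content security policy are complementary measures.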

Werner, Gordon, Okutan, Ahmet, Yang, Shanchieh, McConky, Katie.  2018.  Forecasting Cyberattacks as Time Series with Different Aggregation Granularity. 2018 IEEE International Symposium on Technologies for Homeland Security (HST). :1–7.

Cyber defense can no longer be limited to intrusion detection methods. These systems require malicious activity to enter an internal network before an attack can be detected. Having advanced, predictive knowledge of future attacks allow a potential victim to heighten security and possibly prevent any malicious traffic from breaching the network. This paper investigates the use of Auto-Regressive Integrated Moving Average (ARIMA) models and Bayesian Networks (BN) to predict future cyber attack occurrences and intensities against two target entities. In addition to incident count forecasting, categorical and binary occurrence metrics are proposed to better represent volume forecasts to a victim. Different measurement periods are used in time series construction to better model the temporal patterns unique to each attack type and target configuration, seeing over 86% improvement over baseline forecasts. Using ground truth aggregated over different measurement periods as signals, a BN is trained and tested for each attack type and the obtained results provided further evidence to support the findings from ARIMA. This work highlights the complexity of cyber attack occurrences; each subset has unique characteristics and is influenced by a number of potential external factors.

Rong, Z., Xie, P., Wang, J., Xu, S., Wang, Y..  2018.  Clean the Scratch Registers: A Way to Mitigate Return-Oriented Programming Attacks. 2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP). :1–8.

With the implementation of W ⊕ X security model on computer system, Return-Oriented Programming(ROP) has become the primary exploitation technique for adversaries. Although many solutions that defend against ROP exploits have been proposed, they still suffer from various shortcomings. In this paper, we propose a new way to mitigate ROP attacks that are based on return instructions. We clean the scratch registers which are also the parameter registers based on the features of ROP malicious code and calling convention. A prototype is implemented on x64-based Linux platform based on Pin. Preliminary experimental results show that our method can efficiently mitigate conventional ROP attacks.

Cusack, Greg, Michel, Oliver, Keller, Eric.  2018.  Machine Learning-Based Detection of Ransomware Using SDN. Proceedings of the 2018 ACM International Workshop on Security in Software Defined Networks & Network Function Virtualization. :1–6.
The growth of malware poses a major threat to internet users, governments, and businesses around the world. One of the major types of malware, ransomware, encrypts a user's sensitive information and only returns the original files to the user after a ransom is paid. As malware developers shift the delivery of their product from HTTP to HTTPS to protect themselves from payload inspection, we can no longer rely on deep packet inspection to extract features for malware identification. Toward this goal, we propose a solution leveraging a recent trend in networking hardware, that is programmable forwarding engines (PFEs). PFEs allow collection of per-packet, network monitoring data at high rates. We use this data to monitor the network traffic between an infected computer and the command and control (C&C) server. We extract high-level flow features from this traffic and use this data for ransomware classification. We write a stream processor and use a random forest binary classifier to utilize these rich flow records in fingerprinting malicious network activity without the requirement of deep packet inspection. Our classification model achieves a detection rate in excess of 0.86, while maintaining a false negative rate under 0.11. Our results suggest that a flow-based fingerprinting method is feasible and accurate enough to catch ransomware before encryption.
Genç, Ziya Alper, Lenzini, Gabriele, Ryan, Peter Y.A..  2018.  Security Analysis of Key Acquiring Strategies Used by Cryptographic Ransomware. Proceedings of the Central European Cybersecurity Conference 2018. :7:1–7:6.
To achieve its goals, ransomware needs to employ strong encryption, which in turn requires access to high-grade encryption keys. Over the evolution of ransomware, various techniques have been observed to accomplish the latter. Understanding the advantages and disadvantages of each method is essential to develop robust defense strategies. In this paper we explain the techniques used by ransomware to derive encryption keys and analyze the security of each approach. We argue that recovery of data might be possible if the ransomware cannot access high entropy randomness sources. As an evidence to support our theoretical results, we provide a decryptor program for a previously undefeated ransomware.
Kara, I., Aydos, M..  2018.  Static and Dynamic Analysis of Third Generation Cerber Ransomware. 2018 International Congress on Big Data, Deep Learning and Fighting Cyber Terrorism (IBIGDELFT). :12–17.

Cyber criminals have been extensively using malicious ransomware for years. Ransomware is a subset of malware in which the data on a victim's computer is locked, typically by encryption, and payment is demanded before the ransomed data is decrypted and access returned to the victim. The motives for such attacks are not limited to economic scams. Illegal attacks on official databases may also target people with political or social power. Although billions of dollars have been spent on preventing or at least reducing the tremendous losses, these malicious ransomware attacks have been expanding and growing. Therefore, it is critical to perform technical analysis of such malicious code and, if possible, determine the source of such attacks. It might be almost impossible to recover the affected files due to the strong encryption imposed on them; however, determining the source of ransomware attacks has become significantly important for criminal justice. Unfortunately, there are only a few technical analyses of real-life attacks in the literature. In this work, a real-life ransomware attack on an official institute is investigated and fully analyzed. The analysis has been performed using both static and dynamic methods. The results show that the source of the ransomware attack is traceable from the server's whois information.

Kesidis, G., Shan, Y., Fleck, D., Stavrou, A., Konstantopoulos, T..  2018.  An adversarial coupon-collector model of asynchronous moving-target defense against botnet reconnaissance*. 2018 13th International Conference on Malicious and Unwanted Software (MALWARE). :61–67.
We consider a moving-target defense of a proxied multiserver tenant of the cloud where the proxies dynamically change to defeat reconnaissance activity by a botnet planning a DDoS attack targeting the tenant. Unlike the system of [4] where all proxies change simultaneously at a fixed rate, we consider a more “responsive” system where the proxies may change more rapidly and selectively based on the current session request intensity, which is expected to be abnormally large during active reconnaissance. In this paper, we study a tractable “adversarial” coupon-collector model wherein proxies change after a random period of time from the latest request, i.e., asynchronously. In addition to determining the stationary mean number of proxies discovered by the attacker, we study the age of a proxy (coupon type) when it has been identified (requested) by the botnet. This gives us the rate at which proxies change (cost to the defender) when the nominal client request load is relatively negligible.
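For orientation, the classical (non-adversarial) coupon-collector result gives a baseline for how many probes an attacker needs when proxies never change. The sketch below computes that baseline only; it assumes each probe lands on one of n static proxies uniformly at random, which is a simplification and not the asynchronous, adversarial model studied in the paper. The function name is hypothetical.

```python
from fractions import Fraction


def expected_probes_to_find_all(n_proxies: int) -> Fraction:
    """Expected number of uniform random probes needed to see all n static proxies.

    Classical coupon-collector result: n * (1 + 1/2 + ... + 1/n).
    """
    if n_proxies < 1:
        raise ValueError("n_proxies must be at least 1")
    harmonic = sum(Fraction(1, k) for k in range(1, n_proxies + 1))
    return n_proxies * harmonic
```

When proxies change over time, as in a moving-target defense, previously discovered proxies expire, so the attacker's knowledge no longer grows monotonically as this static baseline assumes.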
Akhtar, T., Gupta, B. B., Yamaguchi, S..  2018.  Malware propagation effects on SCADA system and smart power grid. 2018 IEEE International Conference on Consumer Electronics (ICCE). :1–6.

Critical infrastructures have suffered from different kinds of cyber attacks over the years. Many of these attacks are performed using malware that exploits the vulnerabilities of these resources. The smart power grid is one of the major victims of these attacks, and its SCADA systems are frequently targeted. In this paper we describe our proposed framework for analyzing a smart power grid while its SCADA system is under attack by malware. Malware propagation and its effects on the SCADA system are the focal point of our analysis. The OMNeT++ simulator and OpenDSS are used for developing and analyzing the simulated smart power grid environment.

Amjad, N., Afzal, H., Amjad, M. F., Khan, F. A..  2018.  A Multi-Classifier Framework for Open Source Malware Forensics. 2018 IEEE 27th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE). :106–111.

Traditional anti-virus technologies have failed to keep pace with proliferation of malware due to slow process of their signatures and heuristics updates. Similarly, there are limitations of time and resources in order to perform manual analysis on each malware. There is a need to learn from this vast quantity of data, containing cyber attack pattern, in an automated manner to proactively adapt to ever-evolving threats. Machine learning offers unique advantages to learn from past cyber attacks to handle future cyber threats. The purpose of this research is to propose a framework for multi-classification of malware into well-known categories by applying different machine learning models over corpus of malware analysis reports. These reports are generated through an open source malware sandbox in an automated manner. We applied extensive pre-modeling techniques for data cleaning, features exploration and features engineering to prepare training and test datasets. Best possible hyper-parameters are selected to build machine learning models. These prepared datasets are then used to train the machine learning classifiers and to compare their prediction accuracy. Finally, these results are validated through a comprehensive 10-fold cross-validation methodology. The best results are achieved through Gaussian Naive Bayes classifier with random accuracy of 96% and 10-Fold Cross Validation accuracy of 91.2%. The said framework can be deployed in an operational environment to learn from malware attacks for proactively adapting matching counter measures.
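The 10-fold cross-validation mentioned in this abstract partitions the samples into ten folds and uses each fold once as the test set. The following is a minimal sketch of that index split in plain Python; the function name is hypothetical, samples are assumed to be shuffled beforehand, and the abstract does not say which tooling the authors used.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Folds are contiguous; the first (n_samples % k) folds get one extra sample.
    """
    fold_sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        # Everything outside the current fold is used for training.
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size
```

The cross-validated accuracy is then the mean of the per-fold accuracies, which is usually a more conservative estimate than accuracy on a single split.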

Stokes, J. W., Wang, D., Marinescu, M., Marino, M., Bussone, B..  2018.  Attack and Defense of Dynamic Analysis-Based, Adversarial Neural Malware Detection Models. MILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM). :1–8.

Recently researchers have proposed using deep learning-based systems for malware detection. Unfortunately, all deep learning classification systems are vulnerable to adversarial learning-based attacks, or adversarial attacks, where miscreants can avoid detection by the classification algorithm with very few perturbations of the input data. Previous work has studied adversarial attacks against static analysis-based malware classifiers which only classify the content of the unknown file without execution. However, since the majority of malware is either packed or encrypted, malware classification based on static analysis often fails to detect these types of files. To overcome this limitation, anti-malware companies typically perform dynamic analysis by emulating each file in the anti-malware engine or performing in-depth scanning in a virtual machine. These strategies allow the analysis of the malware after unpacking or decryption. In this work, we study different strategies of crafting adversarial samples for dynamic analysis. These strategies operate on sparse, binary inputs in contrast to continuous inputs such as pixels in images. We then study the effects of two, previously proposed defensive mechanisms against crafted adversarial samples including the distillation and ensemble defenses. We also propose and evaluate the weight decay defense. Experiments show that with these three defenses, the number of successfully crafted adversarial samples is reduced compared to an unprotected baseline system. In particular, the ensemble defense is the most resilient to adversarial attacks. Importantly, none of the defenses significantly reduce the classification accuracy for detecting malware. Finally, we show that while adding additional hidden layers to neural models does not significantly improve the malware classification accuracy, it does significantly increase the classifier's robustness to adversarial attacks.

Copty, Fady, Danos, Matan, Edelstein, Orit, Eisner, Cindy, Murik, Dov, Zeltser, Benjamin.  2018.  Accurate Malware Detection by Extreme Abstraction. Proceedings of the 34th Annual Computer Security Applications Conference. :101–111.

Modern malware applies a rich arsenal of evasion techniques to render dynamic analysis ineffective. In turn, dynamic analysis tools take great pains to hide themselves from malware; typically this entails trying to be as faithful as possible to the behavior of a real run. We present a novel approach to malware analysis that turns this idea on its head, using an extreme abstraction of the operating system that intentionally strays from real behavior. The key insight is that the presence of malicious behavior is sufficient evidence of malicious intent, even if the path taken is not one that could occur during a real run of the sample. By exploring multiple paths in a system that only approximates the behavior of a real system, we can discover behavior that would often be hard to elicit otherwise. We aggregate features from multiple paths and use a funnel-like configuration of machine learning classifiers to achieve high accuracy without incurring too much of a performance penalty. We describe our system, TAMALES (The Abstract Malware Analysis LEarning System), in detail and present machine learning results using a 330K sample set showing an FPR (False Positive Rate) of 0.10% with a TPR (True Positive Rate) of 99.11%, demonstrating that extreme abstraction can be extraordinarily effective in providing data that allows a classifier to accurately detect malware.
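
The abstract does not spell out the funnel-like configuration, so the following is only one plausible reading, sketched under assumptions: classifiers are ordered cheapest first, each stage decides only when it is confident, and uncertain samples fall through to the next stage. The names `funnel_classify` and `rates`, the thresholds, and the toy scores are hypothetical. The second helper shows how TPR and FPR figures such as those quoted above are computed from labels and predictions.

```python
def funnel_classify(sample, stages, low=0.1, high=0.9):
    """Run a sample through classifiers ordered cheapest first.

    Each stage returns a malware probability. A stage decides only when it
    is confident; otherwise the sample falls through to the next stage.
    Returns (is_malware, index_of_deciding_stage).
    """
    score = 0.5
    for idx, stage in enumerate(stages):
        score = stage(sample)
        if score <= low:
            return False, idx
        if score >= high:
            return True, idx
    # No stage was confident: fall back to the last stage's score.
    return score >= 0.5, len(stages) - 1

def rates(labels, predictions):
    """Return (TPR, FPR); assumes both classes occur in labels."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives

# Hypothetical stages: each just reads a precomputed score from the sample.
stages = [lambda s: s["cheap"], lambda s: s["costly"]]
samples = [
    {"cheap": 0.95, "costly": 0.99},  # malware, decided by the cheap stage
    {"cheap": 0.05, "costly": 0.02},  # benign, decided by the cheap stage
    {"cheap": 0.50, "costly": 0.92},  # malware, needs the costly stage
    {"cheap": 0.60, "costly": 0.30},  # benign, no stage confident
]
labels = [True, False, True, False]

results = [funnel_classify(s, stages) for s in samples]
predictions = [label for label, _ in results]
tpr, fpr = rates(labels, predictions)
print(results, tpr, fpr)
```

The design intent of such a cascade is that most samples exit at the cheap stage, so the expensive classifier runs only on the hard cases.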

Ijaz, M., Durad, M. H., Ismail, M..  2019.  Static and Dynamic Malware Analysis Using Machine Learning. 2019 16th International Bhurban Conference on Applied Sciences and Technology (IBCAST). :687–691.

Malware detection is an indispensable part of securing internet-connected machines. Combinations of different features are used for dynamic malware analysis; these combinations are generated from APIs, Summary Information, DLLs, and Registry Keys Changed. Cuckoo Sandbox, which is customizable and provides good accuracy, is used for dynamic malware analysis. More than 2300 features are extracted from dynamic analysis of malware, and 92 features are extracted statically from binary malware using PEFILE. Static features are extracted from 39000 malicious binaries and 10000 benign files. For dynamic analysis, 800 benign files and 2200 malware files are analyzed in Cuckoo Sandbox, and 2300 features are extracted. The accuracy of dynamic malware analysis is 94.64%, while static analysis accuracy is 99.36%. Dynamic malware analysis is less effective because of the evasive and intelligent behavior of malware. Dynamic analysis also has limitations due to controlled network behavior: samples cannot be analyzed completely because of limited network access.
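
As a rough illustration of how combinations of the four feature groups might be enumerated and turned into binary feature vectors, consider the sketch below. The report dictionary is a simplified, hypothetical stand-in and does not follow the actual Cuckoo report schema; the group keys, `group_combinations`, `build_vector`, and the example values are likewise assumptions made for this example.

```python
from itertools import combinations

# Hypothetical keys standing in for the four feature groups named above.
FEATURE_GROUPS = ("apis", "summary", "dlls", "registry_keys_changed")

def group_combinations(groups=FEATURE_GROUPS):
    """Return every non-empty combination of feature groups."""
    return [
        combo
        for size in range(1, len(groups) + 1)
        for combo in combinations(groups, size)
    ]

def build_vector(report, combo, vocabulary):
    """Build a binary vector over vocabulary, a list of (group, item) pairs.

    An entry is 1 when its group is selected in combo and the item
    appears in that group of the report.
    """
    present = {(group, item) for group in combo for item in report.get(group, [])}
    return [1 if entry in present else 0 for entry in vocabulary]

# Simplified, made-up report for one analyzed sample.
report = {
    "apis": ["CreateFileW", "WriteProcessMemory"],
    "dlls": ["kernel32.dll"],
    "registry_keys_changed": ["HKLM\\Software\\Example"],
}
vocabulary = [
    ("apis", "CreateFileW"),
    ("apis", "VirtualAlloc"),
    ("dlls", "kernel32.dll"),
    ("registry_keys_changed", "HKLM\\Software\\Example"),
]

combos = group_combinations()
vector = build_vector(report, ("apis", "dlls"), vocabulary)
print(len(combos), vector)
```

Each combination yields a different feature matrix, so a classifier can be trained and scored per combination to see which groups contribute most.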

Wright, D., Stroschein, J..  2018.  A Malware Analysis and Artifact Capture Tool. 2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech). :328–333.

Malware authors attempt to obfuscate and hide their code in its static and dynamic states. This paper provides a novel approach to aid analysis by intercepting and capturing malware artifacts and providing dynamic control of process flow. Capturing malware artifacts allows an analyst to understand malware behavior and obfuscation techniques more quickly and comprehensively, and doing so interactively allows multiple code paths to be explored. The faster malware can be analyzed, the sooner the systems and data compromised by it can be identified and its infection stopped. This research proposes an instantiation of an interactive malware analysis and artifact capture tool.