Visible to the public Biblio

Filters: Keyword is static analysis  [Clear All Filters]
2020-02-10
Izurieta, Clemente, Prouty, Mary.  2019.  Leveraging SecDevOps to Tackle the Technical Debt Associated with Cybersecurity Attack Tactics. 2019 IEEE/ACM International Conference on Technical Debt (TechDebt). :33–37.
Context: Managing technical debt (TD) associated with external cybersecurity attacks on an organization can significantly improve decisions made when prioritizing which security weaknesses require attention. Whilst source code vulnerabilities can be found using static analysis techniques, malicious external attacks expose the vulnerabilities of a system at runtime and can sometimes remain hidden for long periods of time. By mapping malicious attack tactics to the consequences of weaknesses (i.e. exploitable source code vulnerabilities) we can begin to understand and prioritize the refactoring of the source code vulnerabilities that cause the greatest amount of technical debt on a system. Goal: To establish an approach that maps common external attack tactics to system weaknesses. The consequences of a weakness associated with a specific attack technique can then be used to determine the technical debt principal of said violation; which can be measured in terms of loss of business rather than source code maintenance. Method: We present a position study that uses Jaccard similarity scoring to examine how 11 malicious attack tactics can relate to Common Weakness Enumerations (CWEs). Results: We conduct a study to simulate attacks, and generate dependency graphs between external attacks and the technical consequences associated with CWEs. Conclusion: The mapping of cyber security attacks to weaknesses allows operational staff (SecDevOps) to focus on deploying appropriate countermeasures and allows developers to focus on refactoring the vulnerabilities with the greatest potential for technical debt.
Visalli, Nicholas, Deng, Lin, Al-Suwaida, Amro, Brown, Zachary, Joshi, Manish, Wei, Bingyang.  2019.  Towards Automated Security Vulnerability and Software Defect Localization. 2019 IEEE 17th International Conference on Software Engineering Research, Management and Applications (SERA). :90–93.

Security vulnerabilities and software defects are prevalent in software systems, threatening every aspect of cyberspace. The complexity of modern software makes it hard to secure systems. Security vulnerabilities and software defects become a major target of cyberattacks which can lead to significant consequences. Manual identification of vulnerabilities and defects in software systems is very time-consuming and tedious. Many tools have been designed to help analyze software systems and to discover vulnerabilities and defects. However, these tools tend to miss various types of bugs. The bugs that are not caught by these tools usually include vulnerabilities and defects that are too complicated to find or do not fall inside of an existing rule-set for identification. It was hypothesized that these undiscovered vulnerabilities and defects do not occur randomly, rather, they share certain common characteristics. A methodology was proposed to detect the probability of a bug existing in a code structure. We used a comprehensive experimental evaluation to assess the methodology and report our findings.

Talukder, Md Arabin Islam, Shahriar, Hossain, Qian, Kai, Rahman, Mohammad, Ahamed, Sheikh, Wu, Fan, Agu, Emmanuel.  2019.  DroidPatrol: A Static Analysis Plugin For Secure Mobile Software Development. 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC). 1:565–569.

While the number of mobile applications are rapidly growing, these applications are often coming with numerous security flaws due to the lack of appropriate coding practices. Security issues must be addressed earlier in the development lifecycle rather than fixing them after the attacks because the damage might already be extensive. Early elimination of possible security vulnerabilities will help us increase the security of our software and mitigate or reduce the potential damages through data losses or service disruptions caused by malicious attacks. However, many software developers lack necessary security knowledge and skills required at the development stage, and Secure Mobile Software Development (SMSD) is not yet well represented in academia and industry. In this paper, we present a static analysis-based security analysis approach through design and implementation of a plugin for Android Development Studio, namely DroidPatrol. The proposed plugins can support developers by providing list of potential vulnerabilities early.

Rahman, Md Rayhanur, Rahman, Akond, Williams, Laurie.  2019.  Share, But Be Aware: Security Smells in Python Gists. 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME). :536–540.

Github Gist is a service provided by Github which is used by developers to share code snippets. While sharing, developers may inadvertently introduce security smells in code snippets as well, such as hard-coded passwords. Security smells are recurrent coding patterns that are indicative of security weaknesses, which could potentially lead to security breaches. The goal of this paper is to help software practitioners avoid insecure coding practices through an empirical study of security smells in publicly-available GitHub Gists. Through static analysis, we found 13 types of security smells with 4,403 occurrences in 5,822 publicly-available Python Gists. 1,817 of those Gists, which is around 31%, have at least one security smell including 689 instances of hard-coded secrets. We also found no significance relation between the presence of these security smells and the reputation of the Gist author. Based on our findings, we advocate for increased awareness and rigorous code review efforts related to software security for Github Gists so that propagation of insecure coding practices are mitigated.

Rahman, Akond, Parnin, Chris, Williams, Laurie.  2019.  The Seven Sins: Security Smells in Infrastructure as Code Scripts. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). :164–175.

Practitioners use infrastructure as code (IaC) scripts to provision servers and development environments. While developing IaC scripts, practitioners may inadvertently introduce security smells. Security smells are recurring coding patterns that are indicative of security weakness and can potentially lead to security breaches. The goal of this paper is to help practitioners avoid insecure coding practices while developing infrastructure as code (IaC) scripts through an empirical study of security smells in IaC scripts. We apply qualitative analysis on 1,726 IaC scripts to identify seven security smells. Next, we implement and validate a static analysis tool called Security Linter for Infrastructure as Code scripts (SLIC) to identify the occurrence of each smell in 15,232 IaC scripts collected from 293 open source repositories. We identify 21,201 occurrences of security smells that include 1,326 occurrences of hard-coded passwords. We submitted bug reports for 1,000 randomly-selected security smell occurrences. We obtain 212 responses to these bug reports, of which 148 occurrences were accepted by the development teams to be fixed. We observe security smells can have a long lifetime, e.g., a hard-coded secret can persist for as long as 98 months, with a median lifetime of 20 months.

Pfeffer, Tobias, Göthel, Thomas, Glesner, Sabine.  2019.  Automatic Analysis of Critical Sections for Efficient Secure Multi-Execution. 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS). :318–325.

Enforcement of hypersafety security policies such as noninterference can be achieved through Secure Multi-Execution (SME). While this is typically very resource-intensive, more efficient solutions such as Demand-Driven Secure Multi-Execution (DDSME) exist. Here, the resource requirements are reduced by restricting multi-execution enforcement to critical sections in the code. However, the current solution requires manual binary analysis. In this paper, we propose a fully automatic critical section analysis. Our analysis extracts a context-sensitive boundary of all nodes that handle information from the reachability relation implied by the control-flow graph. We also provide evaluation results, demonstrating the correctness and acceleration of DDSME with our analysis.

Marin, M\u ad\u alina Angelica, Carabas, Costin, Deaconescu, R\u azvan, T\u apus, Nicolae.  2019.  Proactive Secure Coding for iOS Applications. 2019 18th RoEduNet Conference: Networking in Education and Research (RoEduNet). :1–5.

In this paper we propose a solution to support iOS developers in creating better applications, to use static analysis to investigate source code and detect secure coding issues while simultaneously pointing out good practices and/or secure APIs they should use.

Cheng, Xiao, Wang, Haoyu, Hua, Jiayi, Zhang, Miao, Xu, Guoai, Yi, Li, Sui, Yulei.  2019.  Static Detection of Control-Flow-Related Vulnerabilities Using Graph Embedding. 2019 24th International Conference on Engineering of Complex Computer Systems (ICECCS). :41–50.

Static vulnerability detection has shown its effectiveness in detecting well-defined low-level memory errors. However, high-level control-flow related (CFR) vulnerabilities, such as insufficient control flow management (CWE-691), business logic errors (CWE-840), and program behavioral problems (CWE-438), which are often caused by a wide variety of bad programming practices, posing a great challenge for existing general static analysis solutions. This paper presents a new deep-learning-based graph embedding approach to accurate detection of CFR vulnerabilities. Our approach makes a new attempt by applying a recent graph convolutional network to embed code fragments in a compact and low-dimensional representation that preserves high-level control-flow information of a vulnerable program. We have conducted our experiments using 8,368 real-world vulnerable programs by comparing our approach with several traditional static vulnerability detectors and state-of-the-art machine-learning-based approaches. The experimental results show the effectiveness of our approach in terms of both accuracy and recall. Our research has shed light on the promising direction of combining program analysis with deep learning techniques to address the general static analysis challenges.

Ashfaq, Qirat, Khan, Rimsha, Farooq, Sehrish.  2019.  A Comparative Analysis of Static Code Analysis Tools That Check Java Code Adherence to Java Coding Standards. 2019 2nd International Conference on Communication, Computing and Digital Systems (C-CODE). :98–103.

Java programming language is considered highly important due to its extensive use in the development of web, desktop as well as handheld devices applications. Implementing Java Coding standards on Java code has great importance as it creates consistency and as a result better development and maintenance. Finding bugs and standard's violations is important at an early stage of software development than at a later stage when the change becomes impossible or too expensive. In the paper, some tools and research work done on Coding Standard Analyzers is reviewed. These tools are categorized based on the type of rules they cheeked, namely: style, concurrency, exceptions, and quality, security, dependency and general methods of static code analysis. Finally, list of Java Coding Standards Enforcing Tools are analyzed against certain predefined parameters that are limited by the scope of research paper under study. This review will provide the basis for selecting a static code analysis tool that enforce International Java Coding Standards such as the Rule of Ten and the JPL Coding Standards. Such tools have great importance especially in the development of mission/safety critical system. This work can be very useful for developers in selecting a good tool for Java code analysis, according to their requirements.

2020-01-27
Altamimi, Abdulaziz, Clarke, Nathan, Furnell, Steven, Li, Fudong.  2019.  Multi-Platform Authorship Verification. Proceedings of the Third Central European Cybersecurity Conference. :1–7.
At the present time, there has been a rapid increase in the variety and popularity of messaging systems such as social network messaging, text messages, email and Twitter, with users frequently exchanging messages across various platforms. Unfortunately, in amongst the legitimate messages, there is a host of illegitimate and inappropriate content - with cyber stalking, trolling and computerassisted crime all taking place. Therefore, there is a need to identify individuals using messaging systems. Stylometry is the study of linguistic features in a text which consists of verifying an author based on his writing style that consists of checking whether a target text was written or not by a specific individual author. Whilst much research has taken place within authorship verification, studies have focused upon singular platforms, often had limited datasets and restricted methodologies that have meant it is difficult to appreciate the real-world value of the approach. This paper seeks to overcome these limitations through providing an analysis of authorship verification across four common messaging systems. This approach enables a direct comparison of recognition performance and provides a basis for analyzing the feature vectors across platforms to better understand what aspects each capitalize upon in order to achieve good classification. The experiments also include an investigation into the feature vector creation, utilizing population and user-based techniques to compare and contrast performance. The experiment involved 50 participants across four common platforms with a total 13,617; 106,359; 4,539; and 6,540 samples for Twitter, SMS, Facebook, and Email achieving an Equal Error Rate (EER) of 20.16%, 7.97%, 25% and 13.11% respectively.
Li, Zhangtan, Cheng, Liang, Zhang, Yang.  2019.  Tracking Sensitive Information and Operations in Integrated Clinical Environment. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :192–199.
Integrated Clinical Environment (ICE) is a standardized framework for achieving device interoperability in medical cyber-physical systems. The ICE utilizes high-level supervisory apps and a low-level communication middleware to coordinate medical devices. The need to design complex ICE systems that are both safe and effective has presented numerous challenges, including interoperability, context-aware intelligence, security and privacy. In this paper, we present a data flow analysis framework for the ICE systems. The framework performs the combination of static and dynamic analysis for the sensitive data and operations in the ICE systems. Our experiments demonstrate that the data flow analysis framework can record how the medical devices transmit sensitive data and perform misuse detection by tracing the runtime context of the sensitive operations.
2020-01-20
Zhu, Lipeng, Fu, Xiaotong, Yao, Yao, Zhang, Yuqing, Wang, He.  2019.  FIoT: Detecting the Memory Corruption in Lightweight IoT Device Firmware. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :248–255.
The IoT industry has developed rapidly in recent years, which has attracted the attention of security researchers. However, the researchers are hampered by the wide variety of IoT device operating systems and their hardware architectures. Especially for the lightweight IoT devices, many manufacturers do not provide the device firmware images, embedded firmware source code or even the develop documents. As a result, it hinders traditional static analysis and dynamic analysis techniques. In this paper, we propose a novel dynamic analysis framework, called FIoT, which aims at finding memory corruption vulnerabilities in lightweight IoT device firmware images. The key idea is dynamically run the binary code snippets through symbolic execution with carrying out a fuzzing test. Specifically, we generate code snippets through traversing the control-flow graph (CFG) in a backward manner. We improved the CFG recovery approach and backward slice approach for better performance. To reduce the influence of the binary firmware, FIoT leverages loading address determination analysis and library function identification approach. We have implemented a prototype of FIoT and conducted experiments. Our results show that FIoT can complete the Fuzzing test within 40 seconds in average. Considering 170 seconds for static analysis, FIoT can load and analyze a lightweight IoT firmware within 210 seconds in total. Furthermore, we illustrate the effectiveness of FIoT by applying it over 115 firmware images from 17 manufacturers. We have found 35 images exist memory corruptions, which are all zero-day vulnerabilities.
2020-01-02
Jung, Byungho, Kim, Taeguen, Im, Eul Gyu.  2018.  Malware Classification Using Byte Sequence Information. Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems. :143–148.
The number of new malware and new malware variants have been increasing continuously. Security experts analyze malware to capture the malicious properties of malware and to generate signatures or detection rules, but the analysis overheads keep increasing with the increasing number of malware. To analyze a large amount of malware, various kinds of automatic analysis methods are in need. Recently, deep learning techniques such as convolutional neural network (CNN) and recurrent neural network (RNN) have been applied for malware classifications. The features used in the previous approches are mostly based on API (Application Programming Interface) information, and the API invocation information can be obtained through dynamic analysis. However, the invocation information may not reflect malicious behaviors of malware because malware developers use various analysis avoidance techniques. Therefore, deep learning-based malware analysis using other features still need to be developed to improve malware analysis performance. In this paper, we propose a malware classification method using the deep learning algorithm based on byte information. Our proposed method uses images generated from malware byte information that can reflect malware behavioral context, and the convolutional neural network-based sentence analysis is used to process the generated images. We performed several experiments to show the effecitveness of our proposed method, and the experimental results show that our method showed higher accuracy than the naive CNN model, and the detection accuracy was about 99%.
2019-12-30
Dai, Ting, He, Jingzhu, Gu, Xiaohui, Lu, Shan, Wang, Peipei.  2018.  DScope: Detecting Real-World Data Corruption Hang Bugs in Cloud Server Systems. Proceedings of the ACM Symposium on Cloud Computing. :313-325.

Cloud server systems such as Hadoop and Cassandra have enabled many real-world data-intensive applications running inside computing clouds. However, those systems present many data-corruption and performance problems which are notoriously difficult to debug due to the lack of diagnosis information. In this paper, we present DScope, a tool that statically detects data-corruption related software hang bugs in cloud server systems. DScope statically analyzes I/O operations and loops in a software package, and identifies loops whose exit conditions can be affected by I/O operations through returned data, returned error code, or I/O exception handling. After identifying those loops which are prone to hang problems under data corruption, DScope conducts loop bound and loop stride analysis to prune out false positives. We have implemented DScope and evaluated it using 9 common cloud server systems. Our results show that DScope can detect 42 real software hang bugs including 29 newly discovered software hang bugs. In contrast, existing bug detection tools miss detecting most of those bugs.

2019-12-16
Marashdih, Abdalla Wasef, Zaaba, Zarul Fitri, Suwais, Khaled.  2018.  Cross Site Scripting: Investigations in PHP Web Application. 2018 International Conference on Promising Electronic Technologies (ICPET). :25–30.

Web applications are now considered one of the common platforms to represent data and conducting service releases throughout the World Wide Web. A number of the most commonly utilised frameworks for web applications are written in PHP. They became main targets because a vast number of servers are running these applications throughout the world. This increase in web application utilisation has made it more attractive to both users and hackers. According to the latest web security reports and research, cross site scripting (XSS) is the most popular vulnerability in PHP web application. XSS is considered an injection type of attack, which results in the theft of sensitive data, cookies, and sessions. Several tools and approaches have focused on detecting this kind of vulnerability in PHP source code. However, it is still a current problem in PHP web applications. This paper describes the popularity of PHP technology among other technologies, and highlight the approaches used to detect the most common vulnerabilities on PHP web applications, which is XSS. In addition, the discussion and the conclusion with future direction of research within this domain are highlighted.

2019-11-19
Ying, Huan, Zhang, Yanmiao, Han, Lifang, Cheng, Yushi, Li, Jiyuan, Ji, Xiaoyu, Xu, Wenyuan.  2019.  Detecting Buffer-Overflow Vulnerabilities in Smart Grid Devices via Automatic Static Analysis. 2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC). :813-817.

As a modern power transmission network, smart grid connects plenty of terminal devices. However, along with the growth of devices are the security threats. Different from the previous separated environment, an adversary nowadays can destroy the power system by attacking these devices. Therefore, it's critical to ensure the security and safety of terminal devices. To achieve this goal, detecting the pre-existing vulnerabilities of the device program and enhance the terminal security, are of great importance and necessity. In this paper, we propose a novel approach that detects existing buffer-overflow vulnerabilities of terminal devices via automatic static analysis (ASA). We utilize the static analysis to extract the device program information and build corresponding program models. By further matching the generated program model with pre-defined vulnerability patterns, we achieve vulnerability detection and error reporting. The evaluation results demonstrate that our method can effectively detect buffer-overflow vulnerabilities of smart terminals with a high accuracy and a low false positive rate.

2019-09-26
Elliott, A. S., Ruef, A., Hicks, M., Tarditi, D..  2018.  Checked C: Making C Safe by Extension. 2018 IEEE Cybersecurity Development (SecDev). :53-60.

This paper presents Checked C, an extension to C designed to support spatial safety, implemented in Clang and LLVM. Checked C's design is distinguished by its focus on backward-compatibility, incremental conversion, developer control, and enabling highly performant code. Like past approaches to a safer C, Checked C employs a form of checked pointer whose accesses can be statically or dynamically verified. Performance evaluation on a set of standard benchmark programs shows overheads to be relatively low. More interestingly, Checked C introduces the notions of a checked region and bounds-safe interfaces.

Khatchadourian, R., Tang, Y., Bagherzadeh, M., Ahmed, S..  2019.  Safe Automated Refactoring for Intelligent Parallelization of Java 8 Streams. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). :619-630.

Streaming APIs are becoming more pervasive in mainstream Object-Oriented programming languages. For example, the Stream API introduced in Java 8 allows for functional-like, MapReduce-style operations in processing both finite and infinite data structures. However, using this API efficiently involves subtle considerations like determining when it is best for stream operations to run in parallel, when running operations in parallel can be less efficient, and when it is safe to run in parallel due to possible lambda expression side-effects. In this paper, we present an automated refactoring approach that assists developers in writing efficient stream code in a semantics-preserving fashion. The approach, based on a novel data ordering and typestate analysis, consists of preconditions for automatically determining when it is safe and possibly advantageous to convert sequential streams to parallel and unorder or de-parallelize already parallel streams. The approach was implemented as a plug-in to the Eclipse IDE, uses the WALA and SAFE analysis frameworks, and was evaluated on 11 Java projects consisting of ?642K lines of code. We found that 57 of 157 candidate streams (36.31%) were refactorable, and an average speedup of 3.49 on performance tests was observed. The results indicate that the approach is useful in optimizing stream code to their full potential.

2019-08-26
Izurieta, C., Kimball, K., Rice, D., Valentien, T..  2018.  A Position Study to Investigate Technical Debt Associated with Security Weaknesses. 2018 IEEE/ACM International Conference on Technical Debt (TechDebt). :138–142.
Context: Managing technical debt (TD) associated with potential security breaches found during design can lead to catching vulnerabilities (i.e., exploitable weaknesses) earlier in the software lifecycle; thus, anticipating TD principal and interest that can have decidedly negative impacts on businesses. Goal: To establish an approach to help assess TD associated with security weaknesses by leveraging the Common Weakness Enumeration (CWE) and its scoring mechanism, the Common Weakness Scoring System (CWSS). Method: We present a position study with a five-step approach employing the Quamoco quality model to operationalize the scoring of architectural CWEs. Results: We use static analysis to detect design level CWEs, calculate their CWSS scores, and provide a relative ranking of weaknesses that help practitioners identify the highest risks in an organization with a potential to impact TD. Conclusion: CWSS is a community agreed upon method that should be leveraged to help inform the ranking of security related TD items.
2019-06-24
Ijaz, M., Durad, M. H., Ismail, M..  2019.  Static and Dynamic Malware Analysis Using Machine Learning. 2019 16th International Bhurban Conference on Applied Sciences and Technology (IBCAST). :687–691.

Malware detection is an indispensable factor in security of internet oriented machines. The combinations of different features are used for dynamic malware analysis. The different combinations are generated from APIs, Summary Information, DLLs and Registry Keys Changed. Cuckoo sandbox is used for dynamic malware analysis, which is customizable, and provide good accuracy. More than 2300 features are extracted from dynamic analysis of malware and 92 features are extracted statically from binary malware using PEFILE. Static features are extracted from 39000 malicious binaries and 10000 benign files. Dynamically 800 benign files and 2200 malware files are analyzed in Cuckoo Sandbox and 2300 features are extracted. The accuracy of dynamic malware analysis is 94.64% while static analysis accuracy is 99.36%. The dynamic malware analysis is not effective due to tricky and intelligent behaviours of malwares. The dynamic analysis has some limitations due to controlled network behavior and it cannot be analyzed completely due to limited access of network.

Wright, D., Stroschein, J..  2018.  A Malware Analysis and Artifact Capture Tool. 2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech). :328–333.

Malware authors attempt to obfuscate and hide their code in its static and dynamic states. This paper provides a novel approach to aid analysis by intercepting and capturing malware artifacts and providing dynamic control of process flow. Capturing malware artifacts allows an analyst to more quickly and comprehensively understand malware behavior and obfuscation techniques and doing so interactively allows multiple code paths to be explored. The faster that malware can be analyzed the quicker the systems and data compromised by it can be determined and its infection stopped. This research proposes an instantiation of an interactive malware analysis and artifact capture tool.

2019-06-10
Tran, T. K., Sato, H., Kubo, M..  2018.  One-Shot Learning Approach for Unknown Malware Classification. 2018 5th Asian Conference on Defense Technology (ACDT). :8-13.

Early detection of new kinds of malware always plays an important role in defending the network systems. Especially, if intelligent protection systems could themselves detect an existence of new malware types in their system, even with a very small number of malware samples, it must be a huge benefit for the organization as well as the social since it help preventing the spreading of that kind of malware. To deal with learning from few samples, term ``one-shot learning'' or ``fewshot learning'' was introduced, and mostly used in computer vision to recognize images, handwriting, etc. An approach introduced in this paper takes advantage of One-shot learning algorithms in solving the malware classification problem by using Memory Augmented Neural Network in combination with malware's API calls sequence, which is a very valuable source of information for identifying malware behavior. In addition, it also use some advantages of the development in Natural Language Processing field such as word2vec, etc. to convert those API sequences to numeric vectors before feeding to the one-shot learning network. The results confirm very good accuracies compared to the other traditional methods.

Kargaard, J., Drange, T., Kor, A., Twafik, H., Butterfield, E..  2018.  Defending IT Systems against Intelligent Malware. 2018 IEEE 9th International Conference on Dependable Systems, Services and Technologies (DESSERT). :411-417.

The increasing amount of malware variants seen in the wild is causing problems for Antivirus Software vendors, unable to keep up by creating signatures for each. The methods used to develop a signature, static and dynamic analysis, have various limitations. Machine learning has been used by Antivirus vendors to detect malware based on the information gathered from the analysis process. However, adversarial examples can cause machine learning algorithms to miss-classify new data. In this paper we describe a method for malware analysis by converting malware binaries to images and then preparing those images for training within a Generative Adversarial Network. These unsupervised deep neural networks are not susceptible to adversarial examples. The conversion to images from malware binaries should be faster than using dynamic analysis and it would still be possible to link malware families together. Using the Generative Adversarial Network, malware detection could be much more effective and reliable.

Kim, H. M., Song, H. M., Seo, J. W., Kim, H. K..  2018.  Andro-Simnet: Android Malware Family Classification Using Social Network Analysis. 2018 16th Annual Conference on Privacy, Security and Trust (PST). :1-8.

While the rapid adaptation of mobile devices changes our daily life more conveniently, the threat derived from malware is also increased. There are lots of research to detect malware to protect mobile devices, but most of them adopt only signature-based malware detection method that can be easily bypassed by polymorphic and metamorphic malware. To detect malware and its variants, it is essential to adopt behavior-based detection for efficient malware classification. This paper presents a system that classifies malware by using common behavioral characteristics along with malware families. We measure the similarity between malware families with carefully chosen features commonly appeared in the same family. With the proposed similarity measure, we can classify malware by malware's attack behavior pattern and tactical characteristics. Also, we apply community detection algorithm to increase the modularity within each malware family network aggregation. To maintain high classification accuracy, we propose a process to derive the optimal weights of the selected features in the proposed similarity measure. During this process, we find out which features are significant for representing the similarity between malware samples. Finally, we provide an intuitive graph visualization of malware samples which is helpful to understand the distribution and likeness of the malware networks. In the experiment, the proposed system achieved 97% accuracy for malware classification and 95% accuracy for prediction by K-fold cross-validation using the real malware dataset.

Jiang, H., Turki, T., Wang, J. T. L..  2018.  DLGraph: Malware Detection Using Deep Learning and Graph Embedding. 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). :1029-1033.

In this paper we present a new approach, named DLGraph, for malware detection using deep learning and graph embedding. DLGraph employs two stacked denoising autoencoders (SDAs) for representation learning, taking into consideration computer programs' function-call graphs and Windows application programming interface (API) calls. Given a program, we first use a graph embedding technique that maps the program's function-call graph to a vector in a low-dimensional feature space. One SDA in our deep learning model is used to learn a latent representation of the embedded vector of the function-call graph. The other SDA in our model is used to learn a latent representation of the given program's Windows API calls. The two learned latent representations are then merged to form a combined feature vector. Finally, we use softmax regression to classify the combined feature vector for predicting whether the given program is malware or not. Experimental results based on different datasets demonstrate the effectiveness of the proposed approach and its superiority over a related method.