Visible to the public Biblio

Found 127 results

Filters: Keyword is static analysis  [Clear All Filters]
2021-10-12
Sultana, Kazi Zakia, Codabux, Zadia, Williams, Byron.  2020.  Examining the Relationship of Code and Architectural Smells with Software Vulnerabilities. 2020 27th Asia-Pacific Software Engineering Conference (APSEC). :31–40.
Context: Security is vital to software developed for commercial or personal use. Although more organizations are realizing the importance of applying secure coding practices, in many of them, security concerns are not known or addressed until a security failure occurs. The root cause of security failures is vulnerable code. While metrics have been used to predict software vulnerabilities, we explore the relationship between code and architectural smells with security weaknesses. As smells are surface indicators of a deeper problem in software, determining the relationship between smells and software vulnerabilities can play a significant role in vulnerability prediction models. Objective: This study explores the relationship between smells and software vulnerabilities to identify the smells. Method: We extracted the class, method, file, and package level smells for three systems: Apache Tomcat, Apache CXF, and Android. We then compared their occurrences in the vulnerable classes which were reported to contain vulnerable code and in the neutral classes (non-vulnerable classes where no vulnerability had yet been reported). Results: We found that a vulnerable class is more likely to have certain smells compared to a non-vulnerable class. God Class, Complex Class, Large Class, Data Class, Feature Envy, Brain Class have a statistically significant relationship with software vulnerabilities. We found no significant relationship between architectural smells and software vulnerabilities. Conclusion: We can conclude that for all the systems examined, there is a statistically significant correlation between software vulnerabilities and some smells.
2021-10-04
Lu, Shuaibing, Kuang, Xiaohui, Nie, Yuanping, Lin, Zhechao.  2020.  A Hybrid Interface Recovery Method for Android Kernels Fuzzing. 2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS). :335–346.
Android kernel fuzzing is a research area of interest specifically for detecting kernel vulnerabilities which may allow attackers to obtain the root privilege. The number of Android mobile phones is increasing rapidly with the explosive growth of Android kernel drivers. Interface aware fuzzing is an effective technique to test the security of kernel driver. Existing researches rely on static analysis with kernel source code. However, in fact, there exist millions of Android mobile phones without public accessible source code. In this paper, we propose a hybrid interface recovery method for fuzzing kernels which can recover kernel driver interface no matter the source code is available or not. In white box condition, we employ a dynamic interface recover method that can automatically and completely identify the interface knowledge. In black box condition, we use reverse engineering to extract the key interface information and use similarity computation to infer argument types. We evaluate our hybrid algorithm on on 12 Android smartphones from 9 vendors. Empirical experimental results show that our method can effectively recover interface argument lists and find Android kernel bugs. In total, 31 vulnerabilities are reported in white and black box conditions. The vulnerabilities were responsibly disclosed to affected vendors and 9 of the reported vulnerabilities have been already assigned CVEs.
2021-09-21
Lee, Yen-Ting, Ban, Tao, Wan, Tzu-Ling, Cheng, Shin-Ming, Isawa, Ryoichi, Takahashi, Takeshi, Inoue, Daisuke.  2020.  Cross Platform IoT-Malware Family Classification Based on Printable Strings. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :775–784.
In this era of rapid network development, Internet of Things (IoT) security considerations receive a lot of attention from both the research and commercial sectors. With limited computation resource, unfriendly interface, and poor software implementation, legacy IoT devices are vulnerable to many infamous mal ware attacks. Moreover, the heterogeneity of IoT platforms and the diversity of IoT malware make the detection and classification of IoT malware even more challenging. In this paper, we propose to use printable strings as an easy-to-get but effective cross-platform feature to identify IoT malware on different IoT platforms. The discriminating capability of these strings are verified using a set of machine learning algorithms on malware family classification across different platforms. The proposed scheme shows a 99% accuracy on a large scale IoT malware dataset consisted of 120K executable fils in executable and linkable format when the training and test are done on the same platform. Meanwhile, it also achieves a 96% accuracy when training is carried out on a few popular IoT platforms but test is done on different platforms. Efficient malware prevention and mitigation solutions can be enabled based on the proposed method to prevent and mitigate IoT malware damages across different platforms.
Sartoli, Sara, Wei, Yong, Hampton, Shane.  2020.  Malware Classification Using Recurrence Plots and Deep Neural Network. 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA). :901–906.
In this paper, we introduce a method for visualizing and classifying malware binaries. A malware binary consists of a series of data points of compiled machine codes that represent programming components. The occurrence and recurrence behavior of these components is determined by the common tasks malware samples in a particular family carry out. Thus, we view a malware binary as a series of emissions generated by an underlying stochastic process and use recurrence plots to transform malware binaries into two-dimensional texture images. We observe that recurrence plot-based malware images have significant visual similarities within the same family and are different from samples in other families. We apply deep CNN classifiers to classify malware samples. The proposed approach does not require creating malware signature or manual feature engineering. Our preliminary experimental results show that the proposed malware representation leads to a higher and more stable accuracy in comparison to directly transforming malware binaries to gray-scale images.
Brezinski, Kenneth, Ferens, Ken.  2020.  Complexity-Based Convolutional Neural Network for Malware Classification. 2020 International Conference on Computational Science and Computational Intelligence (CSCI). :1–9.
Malware classification remains at the forefront of ongoing research as the prevalence of metamorphic malware introduces new challenges to anti-virus vendors and firms alike. One approach to malware classification is Static Analysis - a form of analysis which does not require malware to be executed before classification can be performed. For this reason, a lightweight classifier based on the features of a malware binary is preferred, with relatively low computational overhead. In this work a modified convolutional neural network (CNN) architecture was deployed which integrated a complexity-based evaluation based on box-counting. This was implemented by setting up max-pooling layers in parallel, and then extracting the fractal dimension using a polyscalar relationship based on the resolution of the measurement scale and the number of elements of a malware image covered in the measurement under consideration. To test the robustness and efficacy of our approach we trained and tested on over 9300 malware binaries from 25 unique malware families. This work was compared to other award-winning image recognition models, and results showed categorical accuracy in excess of 96.54%.
Petrenko, Sergei A., Petrenko, Alexey S., Makoveichuk, Krystina A., Olifirov, Alexander V..  2020.  "Digital Bombs" Neutralization Method. 2020 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). :446–451.
The article discusses new models and methods for timely identification and blocking of malicious code of critically important information infrastructure based on static and dynamic analysis of executable program codes. A two-stage method for detecting malicious code in the executable program codes (the so-called "digital bombs") is described. The first step of the method is to build the initial program model in the form of a control graph, the construction is carried out at the stage of static analysis of the program. The article discusses the purpose, features and construction criteria of an ordered control graph. The second step of the method is to embed control points in the program's executable code for organizing control of the possible behavior of the program using a specially designed recognition automaton - an automaton of dynamic control. Structural criteria for the completeness of the functional control of the subprogram are given. The practical implementation of the proposed models and methods was completed and presented in a special instrumental complex IRIDA.
Chai, Yuhan, Qiu, Jing, Su, Shen, Zhu, Chunsheng, Yin, Lihua, Tian, Zhihong.  2020.  LGMal: A Joint Framework Based on Local and Global Features for Malware Detection. 2020 International Wireless Communications and Mobile Computing (IWCMC). :463–468.
With the gradual advancement of smart city construction, various information systems have been widely used in smart cities. In order to obtain huge economic benefits, criminals frequently invade the information system, which leads to the increase of malware. Malware attacks not only seriously infringe on the legitimate rights and interests of users, but also cause huge economic losses. Signature-based malware detection algorithms can only detect known malware, and are susceptible to evasion techniques such as binary obfuscation. Behavior-based malware detection methods can solve this problem well. Although there are some malware behavior analysis works, they may ignore semantic information in the malware API call sequence. In this paper, we design a joint framework based on local and global features for malware detection to solve the problem of network security of smart cities, called LGMal, which combines the stacked convolutional neural network and graph convolutional networks. Specially, the stacked convolutional neural network is used to learn API call sequence information to capture local semantic features and the graph convolutional networks is used to learn API call semantic graph structure information to capture global semantic features. Experiments on Alibaba Cloud Security Malware Detection datasets show that the joint framework gets better results. The experimental results show that the precision is 87.76%, the recall is 88.08%, and the F1-measure is 87.79%. We hope this paper can provide a useful way for malware detection and protect the network security of smart city.
Ramadhan, Beno, Purwanto, Yudha, Ruriawan, Muhammad Faris.  2020.  Forensic Malware Identification Using Naive Bayes Method. 2020 International Conference on Information Technology Systems and Innovation (ICITSI). :1–7.
Malware is a kind of software that, if installed on a malware victim's device, might carry malicious actions. The malicious actions might be data theft, system failure, or denial of service. Malware analysis is a process to identify whether a piece of software is a malware or not. However, with the advancement of malware technologies, there are several evasion techniques that could be implemented by malware developers to prevent analysis, such as polymorphic and oligomorphic. Therefore, this research proposes an automatic malware detection system. In the system, the malware characteristics data were obtained through both static and dynamic analysis processes. Data from the analysis process were classified using Naive Bayes algorithm to identify whether the software is a malware or not. The process of identifying malware and benign files using the Naive Bayes machine learning method has an accuracy value of 93 percent for the detection process using static characteristics and 85 percent for detection through dynamic characteristics.
2021-08-31
Fadolalkarim, Daren, Bertino, Elisa, Sallam, Asmaa.  2020.  An Anomaly Detection System for the Protection of Relational Database Systems against Data Leakage by Application Programs. 2020 IEEE 36th International Conference on Data Engineering (ICDE). :265—276.
Application programs are a possible source of attacks to databases as attackers might exploit vulnerabilities in a privileged database application. They can perform code injection or code-reuse attack in order to steal sensitive data. However, as such attacks very often result in changes in the program's behavior, program monitoring techniques represent an effective defense to detect on-going attacks. One such technique is monitoring the library/system calls that the application program issues while running. In this paper, we propose AD-PROM, an Anomaly Detection system that aims at protecting relational database systems against malicious/compromised applications PROgraMs aiming at stealing data. AD-PROM tracks calls executed by application programs on data extracted from a database. The system operates in two phases. The first phase statically and dynamically analyzes the behavior of the application in order to build profiles representing the application's normal behavior. AD-PROM analyzes the control and data flow of the application program (i.e., static analysis), and builds a hidden Markov model trained by the program traces (i.e., dynamic analysis). During the second phase, the program execution is monitored in order to detect anomalies that may represent data leakage attempts. We have implemented AD-PROM and carried experimental activities to assess its performance. The results showed that our system is highly accurate in detecting changes in the application programs' behaviors and has very low false positive rates.
2021-07-27
Wang, X., Shen, Q., Luo, W., Wu, P..  2020.  RSDS: Getting System Call Whitelist for Container Through Dynamic and Static Analysis. 2020 IEEE 13th International Conference on Cloud Computing (CLOUD). :600—608.
Container technology has been used for running multiple isolated operating system distros on a host or deploying large scale microservice-based applications. In most cases, containers share the same kernel with the host and other containers on the same host, and the application in the container can make system calls of the host kernel like a normal process on the host. Seccomp is a security mechanism for the Linux kernel, through which we can prohibit certain system calls from being executed by the program. Docker began to support the seccomp mechanism from version 1.10 and disables around 44 system calls out of 300+ by default. However, for a particular container, there are still many system calls that are unnecessary for running it allowed to be executed, and the abuse of system calls by a compromised container can trigger the security vulnerabilities of a host kernel. Unfortunately, Docker does not provide a way to get the necessary system calls for a particular container. In this paper, we propose RSDS, a method combining dynamic analysis and static analysis to get the necessary system calls for a particular container. Our experiments show that our solution can reduce system calls by 69.27%-85.89% compared to the default configuration on an x86-64 PC with Ubuntu 16.04 host OS and does not affect the functionalities of these containers.
2021-06-24
Saletta, Martina, Ferretti, Claudio.  2020.  A Neural Embedding for Source Code: Security Analysis and CWE Lists. 2020 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech). :523—530.
In this paper, we design a technique for mapping the source code into a vector space and we show its application in the recognition of security weaknesses. By applying ideas commonly used in Natural Language Processing, we train a model for producing an embedding of programs starting from their Abstract Syntax Trees. We then show how such embedding is able to infer clusters roughly separating different classes of software weaknesses. Even if the training of the embedding is unsupervised and made on a generic Java dataset, we show that the model can be used for supervised learning of specific classes of vulnerabilities, helping to capture some features distinguishing them in code. Finally, we discuss how our model performs over the different types of vulnerabilities categorized by the CWE initiative.
2021-05-18
Tai, Zeming, Washizaki, Hironori, Fukazawa, Yoshiaki, Fujimatsu, Yurie, Kanai, Jun.  2020.  Binary Similarity Analysis for Vulnerability Detection. 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC). :1121–1122.
Binary similarity has been widely used in function recognition and vulnerability detection. How to define a proper similarity is the key element in implementing a fast detection method. We proposed a scalable method to detect binary vulnerabilities based on similarity. Procedures lifted from binaries are divided into several comparable strands by data dependency, and those strands are transformed into a normalized form by our tool named VulneraBin, so that similarity can be determined between two procedures through a hash value comparison. The low computational complexity allows semantically equivalent code to be identified in binaries compiled from million lines of source code in a fast and accurate way.
2021-05-05
Herrera, Adrian.  2020.  Optimizing Away JavaScript Obfuscation. 2020 IEEE 20th International Working Conference on Source Code Analysis and Manipulation (SCAM). :215—220.

JavaScript is a popular attack vector for releasing malicious payloads on unsuspecting Internet users. Authors of this malicious JavaScript often employ numerous obfuscation techniques in order to prevent the automatic detection by antivirus and hinder manual analysis by professional malware analysts. Consequently, this paper presents SAFE-DEOBS, a JavaScript deobfuscation tool that we have built. The aim of SAFE-DEOBS is to automatically deobfuscate JavaScript malware such that an analyst can more rapidly determine the malicious script's intent. This is achieved through a number of static analyses, inspired by techniques from compiler theory. We demonstrate the utility of SAFE-DEOBS through a case study on real-world JavaScript malware, and show that it is a useful addition to a malware analyst's toolset.

2021-05-03
Xu, Shenglin, Xie, Peidai, Wang, Yongjun.  2020.  AT-ROP: Using static analysis and binary patch technology to defend against ROP attacks based on return instruction. 2020 International Symposium on Theoretical Aspects of Software Engineering (TASE). :209–216.
Return-Oriented Programming (ROP) is one of the most common techniques to exploit software vulnerabilities. Although many solutions to defend against ROP attacks have been proposed, they still have various drawbacks, such as requiring additional information (source code, debug symbols, etc.), increasing program running cost, and causing program instability. In this paper, we propose a method: using static analysis and binary patch technology to defend against ROP attacks based on return instruction. According to this method, we implemented the AT- ROP tool in a Linux 64-bit system environment. Compared to existing tools, it clears the parameter registers when the function returns. As a result, it makes the binary to defend against ROP attacks based on return instruction without having to obtain the source code of the binary. We use the binary challenges in the CTF competition and the binary programs commonly used in the Linux environment to experiment. It turns out that AT-ROP can make the binary program have the ability to defend against ROP attacks based on return instruction with a small increase in the size of the binary program and without affecting its normal execution.
2021-04-08
Colbaugh, R., Glass, K., Bauer, T..  2013.  Dynamic information-theoretic measures for security informatics. 2013 IEEE International Conference on Intelligence and Security Informatics. :45–49.
Many important security informatics problems require consideration of dynamical phenomena for their solution; examples include predicting the behavior of individuals in social networks and distinguishing malicious and innocent computer network activities based on activity traces. While information theory offers powerful tools for analyzing dynamical processes, to date the application of information-theoretic methods in security domains has focused on static analyses (e.g., cryptography, natural language processing). This paper leverages information-theoretic concepts and measures to quantify the similarity of pairs of stochastic dynamical systems, and shows that this capability can be used to solve important problems which arise in security applications. We begin by presenting a concise review of the information theory required for our development, and then address two challenging tasks: 1.) characterizing the way influence propagates through social networks, and 2.) distinguishing malware from legitimate software based on the instruction sequences of the disassembled programs. In each application, case studies involving real-world datasets demonstrate that the proposed techniques outperform standard methods.
2021-03-15
Staicu, C.-A., Torp, M. T., Schäfer, M., Møller, A., Pradel, M..  2020.  Extracting Taint Specifications for JavaScript Libraries. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :198—209.

Modern JavaScript applications extensively depend on third-party libraries. Especially for the Node.js platform, vulnerabilities can have severe consequences to the security of applications, resulting in, e.g., cross-site scripting and command injection attacks. Existing static analysis tools that have been developed to automatically detect such issues are either too coarse-grained, looking only at package dependency structure while ignoring dataflow, or rely on manually written taint specifications for the most popular libraries to ensure analysis scalability. In this work, we propose a technique for automatically extracting taint specifications for JavaScript libraries, based on a dynamic analysis that leverages the existing test suites of the libraries and their available clients in the npm repository. Due to the dynamic nature of JavaScript, mapping observations from dynamic analysis to taint specifications that fit into a static analysis is non-trivial. Our main insight is that this challenge can be addressed by a combination of an access path mechanism that identifies entry and exit points, and the use of membranes around the libraries of interest. We show that our approach is effective at inferring useful taint specifications at scale. Our prototype tool automatically extracts 146 additional taint sinks and 7 840 propagation summaries spanning 1 393 npm modules. By integrating the extracted specifications into a commercial, state-of-the-art static analysis, 136 new alerts are produced, many of which correspond to likely security vulnerabilities. Moreover, many important specifications that were originally manually written are among the ones that our tool can now extract automatically.

Danilova, A., Naiakshina, A., Smith, M..  2020.  One Size Does Not Fit All: A Grounded Theory and Online Survey Study of Developer Preferences for Security Warning Types. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :136–148.
A wide range of tools exist to assist developers in creating secure software. Many of these tools, such as static analysis engines or security checkers included in compilers, use warnings to communicate security issues to developers. The effectiveness of these tools relies on developers heeding these warnings, and there are many ways in which these warnings could be displayed. Johnson et al. [46] conducted qualitative research and found that warning presentation and integration are main issues. We built on Johnson et al.'s work and examined what developers want from security warnings, including what form they should take and how they should integrate into their workflow and work context. To this end, we conducted a Grounded Theory study with 14 professional software developers and 12 computer science students as well as a focus group with 7 academic researchers to gather qualitative insights. To back up the theory developed from the qualitative research, we ran a quantitative survey with 50 professional software developers. Our results show that there is significant heterogeneity amongst developers and that no one warning type is preferred over all others. The context in which the warnings are shown is also highly relevant, indicating that it is likely to be beneficial if IDEs and other development tools become more flexible in their warning interactions with developers. Based on our findings, we provide concrete recommendations for both future research as well as how IDEs and other security tools can improve their interaction with developers.
Hwang, S., Ryu, S..  2020.  Gap between Theory and Practice: An Empirical Study of Security Patches in Solidity. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :542–553.
Ethereum, one of the most popular blockchain platforms, provides financial transactions like payments and auctions through smart contracts. Due to the immense interest in smart contracts in academia, the research community of smart contract security has made a significant improvement recently. Researchers have reported various security vulnerabilities in smart contracts, and developed static analysis tools and verification frameworks to detect them. However, it is unclear whether such great efforts from academia has indeed enhanced the security of smart contracts in reality. To understand the security level of smart contracts in the wild, we empirically studied 55,046 real-world Ethereum smart contracts written in Solidity, the most popular programming language used by Ethereum smart contract developers. We first examined how many well-known vulnerabilities the Solidity compiler has patched, and how frequently the Solidity team publishes compiler releases. Unfortunately, we observed that many known vulnerabilities are not yet patched, and some patches are not even sufficient to avoid their target vulnerabilities. Subsequently, we investigated whether smart contract developers use the most recent compiler with vulnerabilities patched. We reported that developers of more than 98% of real-world Solidity contracts still use older compilers without vulnerability patches, and more than 25% of the contracts are potentially vulnerable due to the missing security patches. To understand actual impacts of the missing patches, we manually investigated potentially vulnerable contracts that are detected by our static analyzer and identified common mistakes by Solidity developers, which may cause serious security issues such as financial loss. We detected hundreds of vulnerable contracts and about one fourth of the vulnerable contracts are used by thousands of people. We recommend the Solidity team to make patches that resolve known vulnerabilities correctly, and developers to use the latest Solidity compiler to avoid missing security patches.
2021-03-09
Akram, B., Ogi, D..  2020.  The Making of Indicator of Compromise using Malware Reverse Engineering Techniques. 2020 International Conference on ICT for Smart Society (ICISS). CFP2013V-ART:1—6.

Malware threats often go undetected immediately, because attackers can camouflage well within the system. The users realize this after the devices stop working and cause harm for them. One way to deceive malicious content detection, malware authors use packers. Malware analysis is an activity to gain knowledge about malware. Reverse engineering is a technique used to identify and deal with new viruses or to understand malware behavior. Therefore, this technique can be the right choice for conducting malware analysis, especially for malware with packers. The results of the analysis are used as a source for making creating indicator of compromise in the YARA rule format. YARA rule is used as a component for detecting malware using the indicators obtained in the analysis process.

2021-03-04
Matin, I. Muhamad Malik, Rahardjo, B..  2020.  A Framework for Collecting and Analysis PE Malware Using Modern Honey Network (MHN). 2020 8th International Conference on Cyber and IT Service Management (CITSM). :1—5.

Nowadays, Windows is an operating system that is very popular among people, especially users who have limited knowledge of computers. But unconsciously, the security threat to the windows operating system is very high. Security threats can be in the form of illegal exploitation of the system. The most common attack is using malware. To determine the characteristics of malware using dynamic analysis techniques and static analysis is very dependent on the availability of malware samples. Honeypot is the most effective malware collection technique. But honeypot cannot determine the type of file format contained in malware. File format information is needed for the purpose of handling malware analysis that is focused on windows-based malware. For this reason, we propose a framework that can collect malware information as well as identify malware PE file type formats. In this study, we collected malware samples using a modern honey network. Next, we performed a feature extraction to determine the PE file format. Then, we classify types of malware using VirusTotal scanning. As the results of this study, we managed to get 1.222 malware samples. Out of 1.222 malware samples, we successfully extracted 945 PE malware. This study can help researchers in other research fields, such as machine learning and deep learning, for malware detection.

2020-12-17
Sun, P., Garcia, L., Salles-Loustau, G., Zonouz, S..  2020.  Hybrid Firmware Analysis for Known Mobile and IoT Security Vulnerabilities. 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). :373—384.

Mobile and IoT operating systems–and their ensuing software updates–are usually distributed as binary files. Given that these binary files are commonly closed source, users or businesses who want to assess the security of the software need to rely on reverse engineering. Further, verifying the correct application of the latest software patches in a given binary is an open problem. The regular application of software patches is a central pillar for improving mobile and IoT device security. This requires developers, integrators, and vendors to propagate patches to all affected devices in a timely and coordinated fashion. In practice, vendors follow different and sometimes improper security update agendas for both mobile and IoT products. Moreover, previous studies revealed the existence of a hidden patch gap: several vendors falsely reported that they patched vulnerabilities. Therefore, techniques to verify whether vulnerabilities have been patched or not in a given binary are essential. Deep learning approaches have shown to be promising for static binary analyses with respect to inferring binary similarity as well as vulnerability detection. However, these approaches fail to capture the dynamic behavior of these systems, and, as a result, they may inundate the analysis with false positives when performing vulnerability discovery in the wild. In particular, they cannot capture the fine-grained characteristics necessary to distinguish whether a vulnerability has been patched or not. In this paper, we present PATCHECKO, a vulnerability and patch presence detection framework for executable binaries. PATCHECKO relies on a hybrid, cross-platform binary code similarity analysis that combines deep learning-based static binary analysis with dynamic binary analysis. PATCHECKO does not require access to the source code of the target binary nor that of vulnerable functions. We evaluate PATCHECKO on the most recent Google Pixel 2 smartphone and the Android Things IoT firmware images, within which 25 known CVE vulnerabilities have been previously reported and patched. Our deep learning model shows a vulnerability detection accuracy of over 93%. We further prune the candidates found by the deep learning stage–which includes false positives–via dynamic binary analysis. Consequently, PATCHECKO successfully identifies the correct matches among the candidate functions in the top 3 ranked outcomes 100% of the time. Furthermore, PATCHECKO's differential engine distinguishes between functions that are still vulnerable and those that are patched with an accuracy of 96%.

2020-12-11
Ge, X., Pan, Y., Fan, Y., Fang, C..  2019.  AMDroid: Android Malware Detection Using Function Call Graphs. 2019 IEEE 19th International Conference on Software Quality, Reliability and Security Companion (QRS-C). :71—77.

With the rapid development of the mobile Internet, Android has been the most popular mobile operating system. Due to the open nature of Android, c countless malicious applications are hidden in a large number of benign applications, which pose great threats to users. Most previous malware detection approaches mainly rely on features such as permissions, API calls, and opcode sequences. However, these approaches fail to capture structural semantics of applications. In this paper, we propose AMDroid that leverages function call graphs (FCGs) representing the behaviors of applications and applies graph kernels to automatically learn the structural semantics of applications from FCGs. We evaluate AMDroid on the Genome Project, and the experimental results show that AMDroid is effective to detect Android malware with 97.49% detection accuracy.

Abusnaina, A., Khormali, A., Alasmary, H., Park, J., Anwar, A., Mohaisen, A..  2019.  Adversarial Learning Attacks on Graph-based IoT Malware Detection Systems. 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). :1296—1305.

IoT malware detection using control flow graph (CFG)-based features and deep learning networks are widely explored. The main goal of this study is to investigate the robustness of such models against adversarial learning. We designed two approaches to craft adversarial IoT software: off-the-shelf methods and Graph Embedding and Augmentation (GEA) method. In the off-the-shelf adversarial learning attack methods, we examine eight different adversarial learning methods to force the model to misclassification. The GEA approach aims to preserve the functionality and practicality of the generated adversarial sample through a careful embedding of a benign sample to a malicious one. Intensive experiments are conducted to evaluate the performance of the proposed method, showing that off-the-shelf adversarial attack methods are able to achieve a misclassification rate of 100%. In addition, we observed that the GEA approach is able to misclassify all IoT malware samples as benign. The findings of this work highlight the essential need for more robust detection tools against adversarial learning, including features that are not easy to manipulate, unlike CFG-based features. The implications of the study are quite broad, since the approach challenged in this work is widely used for other applications using graphs.

Wu, Y., Li, X., Zou, D., Yang, W., Zhang, X., Jin, H..  2019.  MalScan: Fast Market-Wide Mobile Malware Scanning by Social-Network Centrality Analysis. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :139—150.

Malware scanning of an app market is expected to be scalable and effective. However, existing approaches use either syntax-based features which can be evaded by transformation attacks or semantic-based features which are usually extracted by performing expensive program analysis. Therefor, in this paper, we propose a lightweight graph-based approach to perform Android malware detection. Instead of traditional heavyweight static analysis, we treat function call graphs of apps as social networks and perform social-network-based centrality analysis to represent the semantic features of the graphs. Our key insight is that centrality provides a succinct and fault-tolerant representation of graph semantics, especially for graphs with certain amount of inaccurate information (e.g., inaccurate call graphs). We implement a prototype system, MalScan, and evaluate it on datasets of 15,285 benign samples and 15,430 malicious samples. Experimental results show that MalScan is capable of detecting Android malware with up to 98% accuracy under one second which is more than 100 times faster than two state-of-the-art approaches, namely MaMaDroid and Drebin. We also demonstrate the feasibility of MalScan on market-wide malware scanning by performing a statistical study on over 3 million apps. Finally, in a corpus of dataset collected from Google-Play app market, MalScan is able to identify 18 zero-day malware including malware samples that can evade detection of existing tools.

2020-12-02
Gliksberg, J., Capra, A., Louvet, A., García, P. J., Sohier, D..  2019.  High-Quality Fault-Resiliency in Fat-Tree Networks (Extended Abstract). 2019 IEEE Symposium on High-Performance Interconnects (HOTI). :9—12.
Coupling regular topologies with optimized routing algorithms is key in pushing the performance of interconnection networks of HPC systems. In this paper we present Dmodc, a fast deterministic routing algorithm for Parallel Generalized Fat-Trees (PGFTs) which minimizes congestion risk even under massive topology degradation caused by equipment failure. It applies a modulo-based computation of forwarding tables among switches closer to the destination, using only knowledge of subtrees for pre-modulo division. Dmodc allows complete re-routing of topologies with tens of thousands of nodes in less than a second, which greatly helps centralized fabric management react to faults with high-quality routing tables and no impact to running applications in current and future very large-scale HPC clusters. We compare Dmodc against routing algorithms available in the InfiniBand control software (OpenSM) first for routing execution time to show feasibility at scale, and then for congestion risk under degradation to demonstrate robustness. The latter comparison is done using static analysis of routing tables under random permutation (RP), shift permutation (SP) and all-to-all (A2A) traffic patterns. Results for Dmodc show A2A and RP congestion risks similar under heavy degradation as the most stable algorithms compared, and near-optimal SP congestion risk up to 1% of random degradation.