Visible to the public Biblio

Filters: Keyword is program analysis  [Clear All Filters]
Singleton, Larry, Zhao, Rui, Siy, Harvey, Song, Myoungkyu.  2021.  FireBugs: Finding and Repairing Cryptography API Misuses in Mobile Applications. 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC). :1194–1201.
In this paper, we present FireBugs for Finding and Repairing Bugs based on security patterns. For the common misuse patterns of cryptography APIs (crypto APIs), we encode common cryptography rules into the pattern representations for bug detection and program repair regarding cryptography rule violations. In the evaluation, we conducted a case study to assess the bug detection capability by applying FireBugs to datasets mined from both open source and commercial projects. Also, we conducted a user study with professional software engineers at Mutual of Omaha Insurance Company to estimate the program repair capability. This evaluation showed that FireBugs can help professional engineers develop various cryptographic requirements in a resilient application.
Fu, Y., Tong, S., Guo, X., Cheng, L., Zhang, Y., Feng, D..  2020.  Improving the Effectiveness of Grey-box Fuzzing By Extracting Program Information. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :434–441.
Fuzzing has been widely adopted as an effective techniques to detect vulnerabilities in softwares. However, existing fuzzers suffer from the problems of generating excessive test inputs that either cannot pass input validation or are ineffective in exploring unvisited regions in the program under test (PUT). To tackle these problems, we propose a greybox fuzzer called MuFuzzer based on AFL, which incorporates two heuristics that optimize seed selection and automatically extract input formatting information from the PUT to increase the chance of generating valid test inputs, respectively. In particular, the first heuristic collects the branch coverage and execution information during a fuzz session, and utilizes such information to guide fuzzing tools in selecting seeds that are fast to execute, small in size, and more importantly, more likely to explore new behaviors of the PUT for subsequent fuzzing activities. The second heuristic automatically identifies string comparison operations that the PUT uses for input validation, and establishes a dictionary with string constants from these operations to help fuzzers generate test inputs that have higher chances to pass input validation. We have evaluated the performance of MuFuzzer, in terms of code coverage and bug detection, using a set of realistic programs and the LAVA-M test bench. Experiment results demonstrate that MuFuzzer is able to achieve higher code coverage and better or comparative bug detection performance than state-of-the-art fuzzers.
Wu, Y., Li, X., Zou, D., Yang, W., Zhang, X., Jin, H..  2019.  MalScan: Fast Market-Wide Mobile Malware Scanning by Social-Network Centrality Analysis. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :139—150.

Malware scanning of an app market is expected to be scalable and effective. However, existing approaches use either syntax-based features which can be evaded by transformation attacks or semantic-based features which are usually extracted by performing expensive program analysis. Therefor, in this paper, we propose a lightweight graph-based approach to perform Android malware detection. Instead of traditional heavyweight static analysis, we treat function call graphs of apps as social networks and perform social-network-based centrality analysis to represent the semantic features of the graphs. Our key insight is that centrality provides a succinct and fault-tolerant representation of graph semantics, especially for graphs with certain amount of inaccurate information (e.g., inaccurate call graphs). We implement a prototype system, MalScan, and evaluate it on datasets of 15,285 benign samples and 15,430 malicious samples. Experimental results show that MalScan is capable of detecting Android malware with up to 98% accuracy under one second which is more than 100 times faster than two state-of-the-art approaches, namely MaMaDroid and Drebin. We also demonstrate the feasibility of MalScan on market-wide malware scanning by performing a statistical study on over 3 million apps. Finally, in a corpus of dataset collected from Google-Play app market, MalScan is able to identify 18 zero-day malware including malware samples that can evade detection of existing tools.

Mohammadi, Mahmoud, Chu, Bill, Richter Lipford, Heather.  2019.  Automated Repair of Cross-Site Scripting Vulnerabilities through Unit Testing. 2019 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW). :370–377.
Many web applications are vulnerable to Cross Site Scripting (XSS) attacks enabling attackers to steal sensitive information and commit frauds. Much research in this area have focused on detecting vulnerable web pages using static and dynamic program analysis. The best practice to prevent XSS vulnerabilities is to encode untrusted dynamic content. However, a common programming error is the use of a wrong type of encoder to sanitize untrusted data, leaving the application vulnerable. We propose a new approach that can automatically fix this common type of XSS vulnerability in many situations. This approach is integrated into the software maintenance life cycle through unit testing. Vulnerable codes are refactored to reflect the suggested encoder and then verified using an attack evaluating mechanism to find a proper repair. Evaluation of this approach has been conducted on an open source medical record application with over 200 web pages written in JSP.
Piskachev, Goran, Nguyen Quang Do, Lisa, Johnson, Oshando, Bodden, Eric.  2019.  SWAN\_ASSIST: Semi-Automated Detection of Code-Specific, Security-Relevant Methods. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :1094–1097.
To detect specific types of bugs and vulnerabilities, static analysis tools must be correctly configured with security-relevant methods (SRM), e.g., sources, sinks, sanitizers and authentication methods-usually a very labour-intensive and error-prone process. This work presents the semi-automated tool SWAN\_ASSIST, which aids the configuration with an IntelliJ plugin based on active machine learning. It integrates our novel automated machine-learning approach SWAN, which identifies and classifies Java SRM. SWAN\_ASSIST further integrates user feedback through iterative learning. SWAN\_ASSIST aids developers by asking them to classify at each point in time exactly those methods whose classification best impact the classification result. Our experiments show that SWAN\_ASSIST classifies SRM with a high precision, and requires a relatively low effort from the user. A video demo of SWAN\_ASSIST can be found at The source code is available at
Gopinath, Divya, S. Pasareanu, Corina, Wang, Kaiyuan, Zhang, Mengshi, Khurshid, Sarfraz.  2019.  Symbolic Execution for Attribution and Attack Synthesis in Neural Networks. 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion). :282—283.

This paper introduces DeepCheck, a new approach for validating Deep Neural Networks (DNNs) based on core ideas from program analysis, specifically from symbolic execution. DeepCheck implements techniques for lightweight symbolic analysis of DNNs and applies them in the context of image classification to address two challenging problems: 1) identification of important pixels (for attribution and adversarial generation); and 2) creation of adversarial attacks. Experimental results using the MNIST data-set show that DeepCheck's lightweight symbolic analysis provides a valuable tool for DNN validation.

Cheng, Xiao, Wang, Haoyu, Hua, Jiayi, Zhang, Miao, Xu, Guoai, Yi, Li, Sui, Yulei.  2019.  Static Detection of Control-Flow-Related Vulnerabilities Using Graph Embedding. 2019 24th International Conference on Engineering of Complex Computer Systems (ICECCS). :41–50.

Static vulnerability detection has shown its effectiveness in detecting well-defined low-level memory errors. However, high-level control-flow related (CFR) vulnerabilities, such as insufficient control flow management (CWE-691), business logic errors (CWE-840), and program behavioral problems (CWE-438), which are often caused by a wide variety of bad programming practices, posing a great challenge for existing general static analysis solutions. This paper presents a new deep-learning-based graph embedding approach to accurate detection of CFR vulnerabilities. Our approach makes a new attempt by applying a recent graph convolutional network to embed code fragments in a compact and low-dimensional representation that preserves high-level control-flow information of a vulnerable program. We have conducted our experiments using 8,368 real-world vulnerable programs by comparing our approach with several traditional static vulnerability detectors and state-of-the-art machine-learning-based approaches. The experimental results show the effectiveness of our approach in terms of both accuracy and recall. Our research has shed light on the promising direction of combining program analysis with deep learning techniques to address the general static analysis challenges.

Hejderup, J., Deursen, A. v, Gousios, G..  2018.  Software Ecosystem Call Graph for Dependency Management. 2018 IEEE/ACM 40th International Conference on Software Engineering: New Ideas and Emerging Technologies Results (ICSE-NIER). :101–104.
A popular form of software reuse is the use of open source software libraries hosted on centralized code repositories, such as Maven or npm. Developers only need to declare dependencies to external libraries, and automated tools make them available to the workspace of the project. Recent incidents, such as the Equifax data breach and the leftpad package removal, demonstrate the difficulty in assessing the severity, impact and spread of bugs in dependency networks. While dependency checkers are being adapted as a counter measure, they only provide indicative information. To remedy this situation, we propose a fine-grained dependency network that goes beyond packages and into call graphs. The result is a versioned ecosystem-level call graph. In this paper, we outline the process to construct the proposed graph and present a preliminary evaluation of a security issue from a core package to an affected client application.
Nguyen Quang Do, Lisa, Bodden, Eric.  2018.  Gamifying Static Analysis. Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. :714-718.

In the past decades, static code analysis has become a prevalent means to detect bugs and security vulnerabilities in software systems. As software becomes more complex, analysis tools also report lists of increasingly complex warnings that developers need to address on a daily basis. The novel insight we present in this work is that static analysis tools and video games both require users to take on repetitive and challenging tasks. Importantly, though, while good video games manage to keep players engaged, static analysis tools are notorious for their lacking user experience, which prevents developers from using them to their full potential, frequently resulting in dissatisfaction and even tool abandonment. We show parallels between gaming and using static analysis tools, and advocate that the user-experience issues of analysis tools can be addressed by looking at the analysis tooling system as a whole, and by integrating gaming elements that keep users engaged, such as providing immediate and clear feedback, collaborative problem solving, or motivators such as points and badges.

Peng, H., Shoshitaishvili, Y., Payer, M..  2018.  T-Fuzz: Fuzzing by Program Transformation. 2018 IEEE Symposium on Security and Privacy (SP). :697-710.

Fuzzing is a simple yet effective approach to discover software bugs utilizing randomly generated inputs. However, it is limited by coverage and cannot find bugs hidden in deep execution paths of the program because the randomly generated inputs fail complex sanity checks, e.g., checks on magic values, checksums, or hashes. To improve coverage, existing approaches rely on imprecise heuristics or complex input mutation techniques (e.g., symbolic execution or taint analysis) to bypass sanity checks. Our novel method tackles coverage from a different angle: by removing sanity checks in the target program. T-Fuzz leverages a coverage-guided fuzzer to generate inputs. Whenever the fuzzer can no longer trigger new code paths, a light-weight, dynamic tracing based technique detects the input checks that the fuzzer-generated inputs fail. These checks are then removed from the target program. Fuzzing then continues on the transformed program, allowing the code protected by the removed checks to be triggered and potential bugs discovered. Fuzzing transformed programs to find bugs poses two challenges: (1) removal of checks leads to over-approximation and false positives, and (2) even for true bugs, the crashing input on the transformed program may not trigger the bug in the original program. As an auxiliary post-processing step, T-Fuzz leverages a symbolic execution-based approach to filter out false positives and reproduce true bugs in the original program. By transforming the program as well as mutating the input, T-Fuzz covers more code and finds more true bugs than any existing technique. We have evaluated T-Fuzz on the DARPA Cyber Grand Challenge dataset, LAVA-M dataset and 4 real-world programs (pngfix, tiffinfo, magick and pdftohtml). For the CGC dataset, T-Fuzz finds bugs in 166 binaries, Driller in 121, and AFL in 105. In addition, found 3 new bugs in previously-fuzzed programs and libraries.

Yan, Hua, Sui, Yulei, Chen, Shiping, Xue, Jingling.  2018.  Spatio-temporal Context Reduction: A Pointer-analysis-based Static Approach for Detecting Use-after-free Vulnerabilities. Proceedings of the 40th International Conference on Software Engineering. :327–337.

Zero-day Use-After-Free (UAF) vulnerabilities are increasingly popular and highly dangerous, but few mitigations exist. We introduce a new pointer-analysis-based static analysis, CRed, for finding UAF bugs in multi-MLOC C source code efficiently and effectively. CRed achieves this by making three advances: (i) a spatio-temporal context reduction technique for scaling down soundly and precisely the exponential number of contexts that would otherwise be considered at a pair of free and use sites, (ii) a multi-stage analysis for filtering out false alarms efficiently, and (iii) a path-sensitive demand-driven approach for finding the points-to information required. We have implemented CRed in LLVM-3.8.0 and compared it with four different state-of-the-art static tools: CBMC (model checking), Clang (abstract interpretation), Coccinelle (pattern matching), and Supa (pointer analysis) using all the C test cases in Juliet Test Suite (JTS) and 10 open-source C applications. For the ground-truth validated with JTS, CRed detects all the 138 known UAF bugs as CBMC and Supa do while Clang and Coccinelle miss some bugs, with no false alarms from any tool. For practicality validated with the 10 applications (totaling 3+ MLOC), CRed reports 132 warnings including 85 bugs in 7.6 hours while the existing tools are either unscalable by terminating within 3 days only for one application (CBMC) or impractical by finding virtually no bugs (Clang and Coccinelle) or issuing an excessive number of false alarms (Supa).

Jinan, S., Kefeng, P., Xuefeng, C., Junfu, Z..  2017.  Security Patterns from Intelligent Data: A Map of Software Vulnerability Analysis. 2017 ieee 3rd international conference on big data security on cloud (bigdatasecurity), ieee international conference on high performance and smart computing (hpsc), and ieee international conference on intelligent data and security (ids). :18–25.

A significant milestone is reached when the field of software vulnerability research matures to a point warranting related security patterns represented by intelligent data. A substantial research material of empirical findings, distinctive taxonomy, theoretical models, and a set of novel or adapted detection methods justify a unifying research map. The growth interest in software vulnerability is evident from a large number of works done during the last several decades. This article briefly reviews research works in vulnerability enumeration, taxonomy, models and detection methods from the perspective of intelligent data processing and analysis. This article also draws the map which associated with specific characteristics and challenges of vulnerability research, such as vulnerability patterns representation and problem-solving strategies.

Zhang, Xin, Si, Xujie, Naik, Mayur.  2017.  Combining the Logical and the Probabilistic in Program Analysis. Proceedings of the 1st ACM SIGPLAN International Workshop on Machine Learning and Programming Languages. :27–34.

Conventional program analyses have made great strides by leveraging logical reasoning. However, they cannot handle uncertain knowledge, and they lack the ability to learn and adapt. This in turn hinders the accuracy, scalability, and usability of program analysis tools in practice. We seek to address these limitations by proposing a methodology and framework for incorporating probabilistic reasoning directly into existing program analyses that are based on logical reasoning. We demonstrate that the combined approach can benefit a number of important applications of program analysis and thereby facilitate more widespread adoption of this technology.

Scully, Ziv, Chlipala, Adam.  2017.  A Program Optimization for Automatic Database Result Caching. Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages. :271–284.

Most popular Web applications rely on persistent databases based on languages like SQL for declarative specification of data models and the operations that read and modify them. As applications scale up in user base, they often face challenges responding quickly enough to the high volume of requests. A common aid is caching of database results in the application's memory space, taking advantage of program-specific knowledge of which caching schemes are sound and useful, embodied in handwritten modifications that make the program less maintainable. These modifications also require nontrivial reasoning about the read-write dependencies across operations. In this paper, we present a compiler optimization that automatically adds sound SQL caching to Web applications coded in the Ur/Web domain-specific functional language, with no modifications required to source code. We use a custom cache implementation that supports concurrent operations without compromising the transactional semantics of the database abstraction. Through experiments with microbenchmarks and production Ur/Web applications, we show that our optimization in many cases enables an easy doubling or more of an application's throughput, requiring nothing more than passing an extra command-line flag to the compiler.

Mohammadi, M., Chu, B., Lipford, H. R..  2017.  Detecting Cross-Site Scripting Vulnerabilities through Automated Unit Testing. 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS). :364–373.

The best practice to prevent Cross Site Scripting (XSS) attacks is to apply encoders to sanitize untrusted data. To balance security and functionality, encoders should be applied to match the web page context, such as HTML body, JavaScript, and style sheets. A common programming error is the use of a wrong encoder to sanitize untrusted data, leaving the application vulnerable. We present a security unit testing approach to detect XSS vulnerabilities caused by improper encoding of untrusted data. Unit tests for the XSS vulnerability are automatically constructed out of each web page and then evaluated by a unit test execution framework. A grammar-based attack generator is used to automatically generate test inputs. We evaluate our approach on a large open source medical records application, demonstrating that we can detect many 0-day XSS vulnerabilities with very low false positives, and that the grammar-based attack generator has better test coverage than industry best practices.

Mohammadi, M., Chu, B., Lipford, H. R., Murphy-Hill, E..  2016.  Automatic Web Security Unit Testing: XSS Vulnerability Detection. 2016 IEEE/ACM 11th International Workshop in Automation of Software Test (AST). :78–84.

Integrating security testing into the workflow of software developers not only can save resources for separate security testing but also reduce the cost of fixing security vulnerabilities by detecting them early in the development cycle. We present an automatic testing approach to detect a common type of Cross Site Scripting (XSS) vulnerability caused by improper encoding of untrusted data. We automatically extract encoding functions used in a web application to sanitize untrusted inputs and then evaluate their effectiveness by automatically generating XSS attack strings. Our evaluations show that this technique can detect 0-day XSS vulnerabilities that cannot be found by static analysis tools. We will also show that our approach can efficiently cover a common type of XSS vulnerability. This approach can be generalized to test for input validation against other types injections such as command line injection.

Koskinen, Eric, Yang, Junfeng.  2016.  Reducing Crash Recoverability to Reachability. Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. :97–108.

Software applications run on a variety of platforms (filesystems, virtual slices, mobile hardware, etc.) that do not provide 100% uptime. As such, these applications may crash at any unfortunate moment losing volatile data and, when re-launched, they must be able to correctly recover from potentially inconsistent states left on persistent storage. From a verification perspective, crash recovery bugs can be particularly frustrating because, even when it has been formally proved for a program that it satisfies a property, the proof is foiled by these external events that crash and restart the program. In this paper we first provide a hierarchical formal model of what it means for a program to be crash recoverable. Our model captures the recoverability of many real world programs, including those in our evaluation which use sophisticated recovery algorithms such as shadow paging and write-ahead logging. Next, we introduce a novel technique capable of automatically proving that a program correctly recovers from a crash via a reduction to reachability. Our technique takes an input control-flow automaton and transforms it into an encoding that blends the capture of snapshots of pre-crash states into a symbolic search for a proof that recovery terminates and every recovered execution simulates some crash-free execution. Our encoding is designed to enable one to apply existing abstraction techniques in order to do the work that is necessary to prove recoverability. We have implemented our technique in a tool called Eleven82, capable of analyzing C programs to detect recoverability bugs or prove their absence. We have applied our tool to benchmark examples drawn from industrial file systems and databases, including GDBM, LevelDB, LMDB, PostgreSQL, SQLite, VMware and ZooKeeper. Within minutes, our tool is able to discover bugs or prove that these fragments are crash recoverable.

Jilcott, S..  2015.  Securing the supply chain for commodity IT devices by automated scenario generation. 2015 IEEE International Symposium on Technologies for Homeland Security (HST). :1–6.

Almost all commodity IT devices include firmware and software components from non-US suppliers, potentially introducing grave vulnerabilities to homeland security by enabling cyber-attacks via flaws injected into these devices through the supply chain. However, determining that a given device is free of any and all implementation flaws is computationally infeasible in the general case; hence a critical part of any vetting process is prioritizing what kinds of flaws are likely to enable potential adversary goals. We present Theseus, a four-year research project sponsored by the DARPA VET program. Theseus will provide technology to automatically map and explore the firmware/software (FW/SW) architecture of a commodity IT device and then generate attack scenarios for the device. From these device attack scenarios, Theseus then creates a prioritized checklist of FW/SW components to check for potential vulnerabilities. Theseus combines static program analysis, attack graph generation algorithms, and a Boolean satisfiability solver to automate the checklist generation workflow. We describe how Theseus exploits analogies between the commodity IT device problem and attack graph generation for networks. We also present a novel approach called Component Interaction Mapping to recover a formal model of a device's FW/SW architecture from which attack scenarios can be generated.

SHAR, L., Briand, L., Tan, H..  2014.  Web Application Vulnerability Prediction using Hybrid Program Analysis and Machine Learning. Dependable and Secure Computing, IEEE Transactions on. PP:1-1.

Due to limited time and resources, web software engineers need support in identifying vulnerable code. A practical approach to predicting vulnerable code would enable them to prioritize security auditing efforts. In this paper, we propose using a set of hybrid (static+dynamic) code attributes that characterize input validation and input sanitization code patterns and are expected to be significant indicators of web application vulnerabilities. Because static and dynamic program analyses complement each other, both techniques are used to extract the proposed attributes in an accurate and scalable way. Current vulnerability prediction techniques rely on the availability of data labeled with vulnerability information for training. For many real world applications, past vulnerability data is often not available or at least not complete. Hence, to address both situations where labeled past data is fully available or not, we apply both supervised and semi-supervised learning when building vulnerability predictors based on hybrid code attributes. Given that semi-supervised learning is entirely unexplored in this domain, we describe how to use this learning scheme effectively for vulnerability prediction. We performed empirical case studies on seven open source projects where we built and evaluated supervised and semi-supervised models. When cross validated with fully available labeled data, the supervised models achieve an average of 77 percent recall and 5 percent probability of false alarm for predicting SQL injection, cross site scripting, remote code execution and file inclusion vulnerabilities. With a low amount of labeled data, when compared to the supervised model, the semi-supervised model showed an average improvement of 24 percent higher recall and 3 percent lower probability of false alarm, thus suggesting semi-supervised learning may be a preferable solution for many real world applications where vulnerability data is missing.

Sen, S., Guha, S., Datta, A., Rajamani, S.K., Tsai, J., Wing, J.M..  2014.  Bootstrapping Privacy Compliance in Big Data Systems. Security and Privacy (SP), 2014 IEEE Symposium on. :327-342.

With the rapid increase in cloud services collecting and using user data to offer personalized experiences, ensuring that these services comply with their privacy policies has become a business imperative for building user trust. However, most compliance efforts in industry today rely on manual review processes and audits designed to safeguard user data, and therefore are resource intensive and lack coverage. In this paper, we present our experience building and operating a system to automate privacy policy compliance checking in Bing. Central to the design of the system are (a) Legal ease-a language that allows specification of privacy policies that impose restrictions on how user data is handled, and (b) Grok-a data inventory for Map-Reduce-like big data systems that tracks how user data flows among programs. Grok maps code-level schema elements to data types in Legal ease, in essence, annotating existing programs with information flow types with minimal human input. Compliance checking is thus reduced to information flow analysis of Big Data systems. The system, bootstrapped by a small team, checks compliance daily of millions of lines of ever-changing source code written by several thousand developers.