Secure Native Binary Executions--2019 Q3

PI(s): Prasad Kulkarni

Scalability and Composability, Security Metrics


Our project goal is to develop a high-performance framework for client-side security assessment and enforcement for binary software. During the last quarter we worked on the following tasks:

1. Framework to update any binary with additional security checks:

We completed our initial implementation of the SoftBound technique using static analysis information gathered from Ghidra and dynamic instrumentation using Pin. As stated in our last report (2019 Q2), the SoftBound technique can provide complete spatial memory safety for code written in unsafe languages, like C and C++. The original implementation of SoftBound in the authors' 2009 paper relied on high-level program information available in the source code, with the additional checks generated and inserted by the compiler. Our implementation instead gathers the static information through Ghidra and inserts the checks at run-time using Pin, and does not require access to the source code. While our first implementation retains program symbol information in the binary executable (the -g flag in GCC) to provide best-case static analysis data, we will remove this limitation in later iterations of this work.
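The core SoftBound idea can be illustrated with a small conceptual sketch (the addresses and metadata representation below are invented for illustration; our actual checks are inserted by a Pin tool operating on machine code): every pointer is paired with the base and bound of its referent, the pair propagates unchanged through pointer arithmetic, and a spatial check runs before each memory access.

```python
# Conceptual sketch of SoftBound-style spatial memory safety
# (illustrative only, not the project's actual Pin tool).

class BoundedPtr:
    """A pointer value paired with the base/bound of its referent."""
    def __init__(self, addr, base, bound):
        self.addr, self.base, self.bound = addr, base, bound

    def offset(self, n):
        # Pointer arithmetic propagates the metadata unchanged.
        return BoundedPtr(self.addr + n, self.base, self.bound)

    def check(self, size):
        # Spatial check inserted before an access of `size` bytes.
        return self.base <= self.addr and self.addr + size <= self.bound

# An 8-byte buffer at a hypothetical address 0x1000.
buf = BoundedPtr(0x1000, 0x1000, 0x1008)
in_bounds = buf.check(8)            # whole buffer: allowed
overflow = buf.offset(4).check(8)   # runs past the bound: rejected
```

The essential property is that the metadata travels with derived pointers, so an out-of-bounds access through `buf.offset(4)` is still caught against the original allocation's bound.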

We are studying the data from our implementation to understand the following issues:

(A) How accurate and complete is the static analysis conducted by state-of-the-art reverse engineering tools, like Ghidra, with and without program symbol information? We have generated our own test cases (over 60 small programs), and are using programs from the SARD-81 and SARD-89 C/C++ suites of the NIST Software Assurance Reference Dataset for this analysis. SARD-81 has 5 cases with buffer overflows. The SARD-89 suite has 291 test sets, each combining cases with and without buffer overflows (1164 cases in total). These test cases manifest common memory vulnerabilities, including CWE-119 (improper restriction of operations within the bounds of a memory buffer) and CWE-121 (stack-based buffer overflow). For each test case, we use Ghidra to reverse engineer the binary executable and generate the information that Pin needs to implement SoftBound at run-time. While this setup works as intended for a majority of our test cases, it fails to correctly detect the buffer overflow vulnerability in several cases. We are currently analyzing this data to (a) determine whether the issues are fundamental or due to Ghidra not employing the best available algorithms, (b) compare the analysis conducted by Ghidra to the best algorithms proposed in the research literature, and (c) compare the analysis performed by Ghidra with that of other reverse engineering frameworks, like Angr, BAP, and Binary Ninja.
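The hand-off between the static and dynamic stages can be sketched as follows. The record format here is hypothetical, invented only to illustrate the kind of per-buffer information the Ghidra side must export and the run-time side must consume (our actual interchange format may differ):

```python
# Hypothetical sketch of the static-to-dynamic hand-off: the Ghidra
# side emits one record per stack buffer (function, frame offset,
# size); the instrumentation side parses them before running the
# program. The CSV format is invented for illustration.
import csv
import io

GHIDRA_EXPORT = """\
function,frame_offset,size
main,-0x18,16
parse_input,-0x48,64
"""

def load_buffer_records(text):
    """Parse exported buffer records into (function, offset, size)."""
    rows = csv.DictReader(io.StringIO(text))
    return [(r["function"], int(r["frame_offset"], 16), int(r["size"]))
            for r in rows]

records = load_buffer_records(GHIDRA_EXPORT)
```

When Ghidra's analysis misses a buffer or reports a wrong size, the corresponding record is absent or incorrect, and the run-time checks derived from it silently fail to flag the overflow, which is the failure mode we are diagnosing.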

(B) What is the run-time overhead of our approach? Most of the heavy lifting for our technique occurs before program execution. For small programs, like those in the SARD test suites, the primary overhead is due to Pin startup. We are setting up the SPEC CPU 2017 benchmarks to study the overhead on standard programs.

In the future, we will continue to study and understand the cases where Ghidra is unable to adequately reverse engineer the necessary program information, develop techniques to resolve these issues so that our defense works in all cases, and reduce the run-time overhead of binary-level client-side defense techniques.

2. Framework to determine the security level of a given binary executable:

For this thrust we are exploring a fundamental question: how can we assess whether a given binary program is susceptible to certain classes of attacks? We are exploring mechanisms based on (a) signature/code-pattern detection in the static binary code, and (b) orchestrating attacks at execution time (using the Pin instrumentation engine) that would be caught by a defense, if one were implemented.

We selected the "stack canary" defense (as generated by the GCC -fstack-protector flag) as a case study. In the previous quarter, we implemented a signature-based method that checks the static binary code for code patterns indicating that this defense is present. We are still in the process of building an automated strategy to orchestrate a buffer overflow attack on any binary that would trigger the stack protection check, if present in the binary; triggering of the defense confirms its presence. We found that the Pin tool does not provide a flexible way to update program data (register and memory values) and instructions at run-time. Consequently, we switched to the DynamoRIO dynamic binary translator, and are having better success in developing our set of techniques to simulate attacks by changing program values at run-time (as could be done in a real attack), and checking whether there is code in the binary to detect such attacks.
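As an illustration of the signature side, the sketch below scans disassembly text for the two hallmarks of the GCC stack canary on x86-64 Linux: the prologue load of the canary from %fs:0x28 and the call to __stack_chk_fail on mismatch. This is a simplified stand-in for the signature-based method described above, which operates on the binary itself rather than on disassembly text:

```python
# Simplified signature check for the GCC stack canary on x86-64 Linux
# (illustrative stand-in; scans disassembly text, not the binary).

# Hallmarks of -fstack-protector in GCC-compiled x86-64 Linux code:
# the canary is read from %fs:0x28 in the prologue, and a failed
# epilogue comparison calls __stack_chk_fail.
CANARY_SIGNATURES = ("%fs:0x28", "__stack_chk_fail")

def has_stack_canary(disasm_text):
    """Report whether both canary code patterns appear."""
    return all(sig in disasm_text for sig in CANARY_SIGNATURES)

PROTECTED = """\
mov    %fs:0x28,%rax
mov    %rax,-0x8(%rbp)
...
callq  __stack_chk_fail
"""

UNPROTECTED = "push %rbp\nmov %rsp,%rbp\nret\n"
```

The dynamic strategy complements this: instead of trusting static patterns (which an unusual compiler or obfuscator could change), it corrupts the saved canary slot at run-time and observes whether the program aborts, confirming the defense behaviorally.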

3. Framework to reduce the run-time overhead of our technique:

This project has a subcontract with the University of Tennessee, which is now active. Dr. Michael Jantz is the PI at UT. Dr. Jantz and his team have been developing a custom binary instrumentation tool to reduce the execution time overhead of analyzing and performing security checks in programs where source code is not available. For a given executable file, the tool accepts a static set of binary instructions to instrument. At program startup, but prior to executing the initial program routine (i.e., main), the tool inserts the instrumentation at each selected instruction. In contrast to standard binary instrumentation tools, such as the Intel Pin framework, this tool inserts instrumentation directly into the original copy of the program text in memory, and does not create an instrumented version of the code in a separate code cache. As a result, execution time costs associated with dynamic binary instrumentation are substantially reduced.
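The contrast with code-cache-based instrumentation can be modeled abstractly as follows (a toy model with invented names, in which "instructions" are just strings; the point is only that patches land at selected sites in the original program text at startup, rather than in a separately rewritten copy):

```python
# Toy model of static-target, in-place instrumentation (names and the
# string "instructions" are invented for illustration). The tool takes
# a fixed set of instruction indices to instrument and patches the
# original code sequence once, before the main routine would run; no
# separate instrumented copy (code cache) is created.

def instrument_in_place(code, targets, hook):
    """Return the program text with `hook` patched in ahead of each
    instruction whose index is in `targets`."""
    return [f"{hook}; {ins}" if i in targets else ins
            for i, ins in enumerate(code)]

# Instrument the memory instructions (indices 1 and 2) of a tiny program.
program = ["mov", "load", "store", "ret"]
patched = instrument_in_place(program, targets={1, 2}, hook="check")
```

Because the set of instrumentation sites is fixed before execution, the patching cost is paid once at startup, avoiding the per-execution translation and code-cache management overheads of JIT-style instrumenters.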

Dr. Jantz and his team have developed a prototype version of this tool and have used it to instrument the entire set of memory instructions in a small set of custom programs. They are currently validating its correctness and performance on a larger set of standard benchmark programs (SPEC CPU 2017). When the implementation is complete and has been validated, they will adapt it for use with the dynamic security instrumentation that has been developed at KU, and evaluate the performance of this approach compared to instrumentation with the standard Pin framework.