Visible to the public Biblio

Filters: Keyword is taint analysis  [Clear All Filters]
Staicu, C.-A., Torp, M. T., Schäfer, M., Møller, A., Pradel, M..  2020.  Extracting Taint Specifications for JavaScript Libraries. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :198—209.

Modern JavaScript applications extensively depend on third-party libraries. Especially for the Node.js platform, vulnerabilities can have severe consequences to the security of applications, resulting in, e.g., cross-site scripting and command injection attacks. Existing static analysis tools that have been developed to automatically detect such issues are either too coarse-grained, looking only at package dependency structure while ignoring dataflow, or rely on manually written taint specifications for the most popular libraries to ensure analysis scalability. In this work, we propose a technique for automatically extracting taint specifications for JavaScript libraries, based on a dynamic analysis that leverages the existing test suites of the libraries and their available clients in the npm repository. Due to the dynamic nature of JavaScript, mapping observations from dynamic analysis to taint specifications that fit into a static analysis is non-trivial. Our main insight is that this challenge can be addressed by a combination of an access path mechanism that identifies entry and exit points, and the use of membranes around the libraries of interest. We show that our approach is effective at inferring useful taint specifications at scale. Our prototype tool automatically extracts 146 additional taint sinks and 7 840 propagation summaries spanning 1 393 npm modules. By integrating the extracted specifications into a commercial, state-of-the-art static analysis, 136 new alerts are produced, many of which correspond to likely security vulnerabilities. Moreover, many important specifications that were originally manually written are among the ones that our tool can now extract automatically.

Yulianto, Arief Dwi, Sukarno, Parman, Warrdana, Aulia Arif, Makky, Muhammad Al.  2019.  Mitigation of Cryptojacking Attacks Using Taint Analysis. 2019 4th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE). :234—238.

Cryptojacking (also called malicious cryptocurrency mining or cryptomining) is a new threat model using CPU resources covertly “mining” a cryptocurrency in the browser. The impact is a surge in CPU Usage and slows the system performance. In this research, in-browsercryptojacking mitigation has been built as an extension in Google Chrome using Taint analysis method. The method used in this research is attack modeling with abuse case using the Man-In-The-Middle (MITM) attack as a testing for mitigation. The proposed model is designed so that users will be notified if a cryptojacking attack occurs. Hence, the user is able to check the script characteristics that run on the website background. The results of this research show that the taint analysis is a promising method to mitigate cryptojacking attacks. From 100 random sample websites, the taint analysis method can detect 19 websites that are infcted by cryptojacking.

Ashouri, Mohammadreza.  2019.  Detecting Input Sanitization Errors in Scala. 2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW). :313—319.

Scala programming language combines object-oriented and functional programming in one concise, high-level language, and the language supports static types that help to avoid bugs in complex programs. This paper proposes a dynamic taint analyzer called ScalaTaint for Scala applications. The analyzer traces the propagation of malicious inputs from untrusted sources to sensitive sink methods in programs that can be exploited by adversaries. In this work, we evaluated the accuracy of ScalaTaint with a security benchmark suite including 7 projects in Scala. As a result, our analyzer could report 49 vulnerabilities within 753,372 lines of code. Moreover, the result of our performance measurement on ScalaBench shows 67% runtime overhead that demonstrates the usefulness and efficiently of our technique in comparison with similar tools.

Guan, Le, Cao, Chen, Zhu, Sencun, Lin, Jingqiang, Liu, Peng, Xia, Yubin, Luo, Bo.  2019.  Protecting mobile devices from physical memory attacks with targeted encryption. Proceedings of the 12th Conference on Security and Privacy in Wireless and Mobile Networks. :34–44.
Sensitive data in a process could be scattered over the memory of a computer system for a prolonged period of time. Unfortunately, DRAM chips were proven insecure in previous studies. The problem becomes worse in the mobile environment, in which users' smartphones are easily lost or stolen. The powered-on phones may contain sensitive data in the vulnerable DRAM chips. In this paper, we propose MemVault, a mechanism to protect sensitive data in Android devices against physical memory attacks. MemVault keeps track of the propagation of well-marked sensitive data sources, and selectively encrypts tainted sensitive memory contents in the DRAM chip. When a tainted object is accessed, MemVault redirects the access to the internal RAM (iRAM), where the cipher-text object is decrypted transparently. iRAM is a system-on-chip (SoC) component which is by nature immune to physical memory exploits. We have implemented a MemVault prototype system, and have evaluated it with extensive experiments. Our results validate that MemVault effectively eliminates the occurrences of clear-text sensitive objects in DRAM chips, and imposes acceptable overheads.
Cao, Mengchen, Hou, Xiantong, Wang, Tao, Qu, Hunter, Zhou, Yajin, Bai, Xiaolong, Wang, Fuwei.  2019.  Different is Good: Detecting the Use of Uninitialized Variables through Differential Replay. Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. :1883–1897.
The use of uninitialized variables is a common issue. It could cause kernel information leak, which defeats the widely deployed security defense, i.e., kernel address space layout randomization (KASLR). Though a recent system called Bochspwn Reloaded reported multiple memory leaks in Windows kernels, how to effectively detect this issue is still largely behind. In this paper, we propose a new technique, i.e., differential replay, that could effectively detect the use of uninitialized variables. Specifically, it records and replays a program's execution in multiple instances. One instance is with the vanilla memory, the other one changes (or poisons) values of variables allocated from the stack and the heap. Then it compares program states to find references to uninitialized variables. The idea is that if a variable is properly initialized, it will overwrite the poisoned value and program states in two running instances should be the same. After detecting the differences, our system leverages the symbolic taint analysis to further identify the location where the variable was allocated. This helps us to identify the root cause and facilitate the development of real exploits. We have implemented a prototype called TimePlayer. After applying it to both Windows 7 and Windows 10 kernels (x86/x64), it successfully identified 34 new issues and another 85 ones that had been patched (some of them were publicly unknown.) Among 34 new issues, 17 of them have been confirmed as zero-day vulnerabilities by Microsoft.
Gao, Jianbo, Liu, Han, Liu, Chao, Li, Qingshan, Guan, Zhi, Chen, Zhong.  2019.  EasyFlow: keep ethereum away from overflow. Proceedings of the 41st International Conference on Software Engineering: Companion Proceedings. :23–26.
While Ethereum smart contracts enabled a wide range of blockchain applications, they are extremely vulnerable to different forms of security attacks. Due to the fact that transactions to smart contracts commonly involve cryptocurrency transfer, any successful attacks can lead to money loss or even financial disorder. In this paper, we focus on the overflow attacks in Ethereum, mainly because they widely rooted in many smart contracts and comparatively easy to exploit. We have developed EasyFlow, an overflow detector at Ethereum Virtual Machine level. The key insight behind EasyFlow is a taint analysis based tracking technique to analyze the propagation of involved taints. Specifically, EasyFlow can not only divide smart contracts into safe contracts, manifested overflows, well-protected overflows and potential overflows, but also automatically generate transactions to trigger potential overflows. In our preliminary evaluation, EasyFlow managed to find potentially vulnerable Ethereum contracts with little runtime overhead. A demo video of EasyFlow is at
Kreindl, Jacob, Bonetta, Daniele, Mössenböck, Hanspeter.  2019.  Towards efficient, multi-language dynamic taint analysis. Proceedings of the 16th ACM SIGPLAN International Conference on Managed Programming Languages and Runtimes. :85–94.
Dynamic taint analysis is a program analysis technique in which data is marked and its propagation is tracked while the program is executing. It is applied to solve problems in many fields, especially in software security. Current taint analysis platforms are limited to a single programming language, and therefore cannot support programs which, as is common today, are implemented in multiple programming languages. Current implementations of dynamic taint analysis also incur a significant performance overhead. In this paper we address both these limitations (1) by presenting our vision of a multi-language dynamic taint analysis platform, which is built around a language-agnostic core framework that is extended by language-specific front-ends and (2) by discussing the use of speculative optimization and dynamic compilation to reduce the execution overhead of dynamic taint analysis applications. An implementation of such a platform would enable dynamic taint analyses that can target multiple languages in one analysis implementation and can track tainted data across language boundaries. We describe this approach in the context of the GraalVM runtime and its included JIT compiler, Graal, which allows us to target both dynamic and static languages.
Li, Zhangtan, Cheng, Liang, Zhang, Yang.  2019.  Tracking Sensitive Information and Operations in Integrated Clinical Environment. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :192–199.
Integrated Clinical Environment (ICE) is a standardized framework for achieving device interoperability in medical cyber-physical systems. The ICE utilizes high-level supervisory apps and a low-level communication middleware to coordinate medical devices. The need to design complex ICE systems that are both safe and effective has presented numerous challenges, including interoperability, context-aware intelligence, security and privacy. In this paper, we present a data flow analysis framework for the ICE systems. The framework performs the combination of static and dynamic analysis for the sensitive data and operations in the ICE systems. Our experiments demonstrate that the data flow analysis framework can record how the medical devices transmit sensitive data and perform misuse detection by tracing the runtime context of the sensitive operations.
He, Dongjie, Li, Haofeng, Wang, Lei, Meng, Haining, Zheng, Hengjie, Liu, Jie, Hu, Shuangwei, Li, Lian, Xue, Jingling.  2019.  Performance-Boosting Sparsification of the IFDS Algorithm with Applications to Taint Analysis. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :267–279.
The IFDS algorithm can be compute-and memoryintensive for some large programs, often running for a long time (more than expected) or terminating prematurely after some time and/or memory budgets have been exhausted. In the latter case, the corresponding IFDS data-flow analyses may suffer from false negatives and/or false positives. To improve this, we introduce a sparse alternative to the traditional IFDS algorithm. Instead of propagating the data-flow facts across all the program points along the program’s (interprocedural) control flow graph, we propagate every data-flow fact directly to its next possible use points along its own sparse control flow graph constructed on the fly, thus reducing significantly both the time and memory requirements incurred by the traditional IFDS algorithm. In our evaluation, we compare FLOWDROID, a taint analysis performed by using the traditional IFDS algorithm, with our sparse incarnation, SPARSEDROID, on a set of 40 Android apps selected. For the time budget (5 hours) and memory budget (220GB) allocated per app, SPARSEDROID can run every app to completion but FLOWDROID terminates prematurely for 9 apps, resulting in an average speedup of 22.0x. This implies that when used as a market-level vetting tool, SPARSEDROID can finish analyzing these 40 apps in 2.13 hours (by issuing 228 leak warnings) while FLOWDROID manages to analyze only 30 apps in the same time period (by issuing only 147 leak warnings).
Inayoshi, Hiroki, Kakei, Shohei, Takimoto, Eiji, Mouri, Koichi, Saito, Shoichi.  2019.  Prevention of Data Leakage due to Implicit Information Flows in Android Applications. 2019 14th Asia Joint Conference on Information Security (AsiaJCIS). :103–110.
Dynamic Taint Analysis (DTA) technique has been developed for analysis and understanding behavior of Android applications and privacy policy enforcement. Meanwhile, implicit information flows (IIFs) are major concern of security researchers because IIFs can evade DTA technique easily and give attackers an advantage over the researchers. Some researchers suggested approaches to the issue and developed analysis systems supporting privacy policy enforcement against IIF-accompanied attacks; however, there is still no effective technique of comprehensive analysis and privacy policy enforcement against IIF-accompanied attacks. In this paper, we propose an IIF detection technique to enforce privacy policy against IIF-accompanied attacks in Android applications. We developed a new analysis tool, called Smalien, that can discover data leakage caused by IIF-contained information flows as well as explicit information flows. We demonstrated practicability of Smalien by applying it to 16 IIF tricks from ScrubDroid and two IIF tricks from DroidBench. Smalien enforced privacy policy successfully against all the tricks except one trick because the trick loads code dynamically from a remote server at runtime, and Smalien cannot analyze any code outside of a target application. The results show that our approach can be a solution to the current attacker-superior situation.
Syed, Shafaque Fatma, Ahmed, Aamir, D'mello, Gavin, Ansari, Zeeshan.  2019.  Removal of Web Application Vulnerabilities using Taint Analyzer and Code Corrector. 2019 International Conference on Nascent Technologies in Engineering (ICNTE). :1–7.
Security has been a challenging aspect recently in the field of Web Development. A failure to obtain security in web applications may lead to complete destruction of the web application or may cause some loss to the user or the owner. To tackle this, a huge research on how to secure a web app has been going on for quite some time, yet to achieve security in today's modern era is a very difficult and no less than a challenge for web applications. All these things lead only to a vulnerable/faulty source code, formulated in coding such as PHP. Static Source Code analysis (SCSA) tools tend to give a solution to detect vulnerabilities, but they tend to detect vulnerabilities which actually are false positives, which leads to excess code reexamination. The proposed system will tackle the current situation of SCSA. This will be achieved by two additional modules to SCSA i.e. Taint analysis with False Positive Predictor which will detect and segregate the true vulnerable code from false positives respectively. The proposed system will be used by the Web Application programmers during testing of web application.
Luo, Linghui, Bodden, Eric, Späth, Johannes.  2019.  A Qualitative Analysis of Android Taint-Analysis Results. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :102–114.
In the past, researchers have developed a number of popular taint-analysis approaches, particularly in the context of Android applications. Numerous studies have shown that automated code analyses are adopted by developers only if they yield a good "signal to noise ratio", i.e., high precision. Many previous studies have reported analysis precision quantitatively, but this gives little insight into what can and should be done to increase precision further. To guide future research on increasing precision, we present a comprehensive study that evaluates static Android taint-analysis results on a qualitative level. To unravel the exact nature of taint flows, we have designed COVA, an analysis tool to compute partial path constraints that inform about the circumstances under which taint flows may actually occur in practice. We have conducted a qualitative study on the taint flows reported by FlowDroid in 1,022 real-world Android applications. Our results reveal several key findings: Many taint flows occur only under specific conditions, e.g., environment settings, user interaction, I/O. Taint analyses should consider the application context to discern such situations. COVA shows that few taint flows are guarded by multiple different kinds of conditions simultaneously, so tools that seek to confirm true positives dynamically can concentrate on one kind at a time, e.g., only simulating user interactions. Lastly, many false positives arise due to a too liberal source/sink configuration. Taint analyses must be more carefully configured, and their configuration could benefit from better tool assistance.
Schmeidl, Florian, Nazzal, Bara, Alalfi, Manar H..  2019.  Security Analysis for SmartThings IoT Applications. 2019 IEEE/ACM 6th International Conference on Mobile Software Engineering and Systems (MOBILESoft). :25–29.
This paper presents a fully automated static analysis approach and a tool, Taint-Things, for the identification of tainted flows in SmartThings IoT apps. Taint-Things accurately identified all tainted flows reported by one of the state-of the-art tools with at least 4 times improved performance. In addition, our approach reports potential vulnerable tainted flow in a form of a concise security slice, which could provide security auditors with an effective and precise tool to pinpoint security issues in SmartThings apps under test.
Li, Y., Liu, X., Tian, H., Luo, C..  2018.  Research of Industrial Control System Device Firmware Vulnerability Mining Technology Based on Taint Analysis. 2018 IEEE 9th International Conference on Software Engineering and Service Science (ICSESS). :607-610.

Aiming at the problem that there is little research on firmware vulnerability mining and the traditional method of vulnerability mining based on fuzzing test is inefficient, this paper proposed a new method of mining vulnerabilities in industrial control system firmware. Based on taint analysis technology, this method can construct test cases specifically for the variables that may trigger vulnerabilities, thus reducing the number of invalid test cases and improving the test efficiency. Experiment result shows that this method can reduce about 23 % of test cases and can effectively improve test efficiency.

Chen, Quan, Kapravelos, Alexandros.  2018.  Mystique: Uncovering Information Leakage from Browser Extensions. Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. :1687–1700.
Browser extensions are small JavaScript, CSS and HTML programs that run inside the browser with special privileges. These programs, often written by third parties, operate on the pages that the browser is visiting, giving the user a programmatic way to configure the browser. The privacy implications that arise by allowing privileged third-party code to execute inside the users' browser are not well understood. In this paper, we develop a taint analysis framework for browser extensions and use it to perform a large scale study of extensions in regard to their privacy practices. We first present a hybrid approach to traditional taint analysis: by leveraging the fact that extension source code is available to the runtime JavaScript engine, we implement as well as enhance traditional taint analysis using information gathered from static data flow and control-flow analysis of the JavaScript source code. Based on this, we further modify the Chromium browser to support taint tracking for extensions. We analyzed 178,893 extensions crawled from the Chrome Web Store between September 2016 and March 2018, as well as a separate set of all available extensions (2,790 in total) for the Opera browser at the time of analysis. From these, our analysis flagged 3,868 (2.13%) extensions as potentially leaking privacy-sensitive information. The top 10 most popular Chrome extensions that we confirmed to be leaking privacy-sensitive information have more than 60 million users combined. We ran the analysis on a local Kubernetes cluster and were able to finish within a month, demonstrating the feasibility of our approach for large-scale analysis of browser extensions. At the same time, our results emphasize the threat browser extensions pose to user privacy, and the need for countermeasures to safeguard against misbehaving extensions that abuse their privileges.
Schuette, J., Brost, G. S..  2018.  LUCON: Data Flow Control for Message-Based IoT Systems. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :289-299.

Today's emerging Industrial Internet of Things (IIoT) scenarios are characterized by the exchange of data between services across enterprises. Traditional access and usage control mechanisms are only able to determine if data may be used by a subject, but lack an understanding of how it may be used. The ability to control the way how data is processed is however crucial for enterprises to guarantee (and provide evidence of) compliant processing of critical data, as well as for users who need to control if their private data may be analyzed or linked with additional information - a major concern in IoT applications processing personal information. In this paper, we introduce LUCON, a data-centric security policy framework for distributed systems that considers data flows by controlling how messages may be routed across services and how they are combined and processed. LUCON policies prevent information leaks, bind data usage to obligations, and enforce data flows across services. Policy enforcement is based on a dynamic taint analysis at runtime and an upfront static verification of message routes against policies. We discuss the semantics of these two complementing enforcement models and illustrate how LUCON policies are compiled from a simple policy language into a first-order logic representation. We demonstrate the practical application of LUCON in a real-world IoT middleware and discuss its integration into Apache Camel. Finally, we evaluate the runtime impact of LUCON and discuss performance and scalability aspects.

Kelkar, S., Kraus, T., Morgan, D., Zhang, J., Dai, R..  2018.  Analyzing HTTP-Based Information Exfiltration of Malicious Android Applications. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :1642-1645.

Exfiltrating sensitive information from smartphones has become one of the most significant security threats. We have built a system to identify HTTP-based information exfiltration of malicious Android applications. In this paper, we discuss the method to track the propagation of sensitive information in Android applications using static taint analysis. We have studied the leaked information, destinations to which information is exfiltrated, and their correlations with types of sensitive information. The analysis results based on 578 malicious Android applications have revealed that a significant portion of these applications are interested in identity-related sensitive information. The vast majority of malicious applications leak multiple types of sensitive information. We have also identified servers associated with three country codes including CN, US, and SG are most active in collecting sensitive information. The analysis results have also demonstrated that a wide range of non-default ports are used by suspicious URLs.

Peng, H., Shoshitaishvili, Y., Payer, M..  2018.  T-Fuzz: Fuzzing by Program Transformation. 2018 IEEE Symposium on Security and Privacy (SP). :697-710.

Fuzzing is a simple yet effective approach to discover software bugs utilizing randomly generated inputs. However, it is limited by coverage and cannot find bugs hidden in deep execution paths of the program because the randomly generated inputs fail complex sanity checks, e.g., checks on magic values, checksums, or hashes. To improve coverage, existing approaches rely on imprecise heuristics or complex input mutation techniques (e.g., symbolic execution or taint analysis) to bypass sanity checks. Our novel method tackles coverage from a different angle: by removing sanity checks in the target program. T-Fuzz leverages a coverage-guided fuzzer to generate inputs. Whenever the fuzzer can no longer trigger new code paths, a light-weight, dynamic tracing based technique detects the input checks that the fuzzer-generated inputs fail. These checks are then removed from the target program. Fuzzing then continues on the transformed program, allowing the code protected by the removed checks to be triggered and potential bugs discovered. Fuzzing transformed programs to find bugs poses two challenges: (1) removal of checks leads to over-approximation and false positives, and (2) even for true bugs, the crashing input on the transformed program may not trigger the bug in the original program. As an auxiliary post-processing step, T-Fuzz leverages a symbolic execution-based approach to filter out false positives and reproduce true bugs in the original program. By transforming the program as well as mutating the input, T-Fuzz covers more code and finds more true bugs than any existing technique. We have evaluated T-Fuzz on the DARPA Cyber Grand Challenge dataset, LAVA-M dataset and 4 real-world programs (pngfix, tiffinfo, magick and pdftohtml). For the CGC dataset, T-Fuzz finds bugs in 166 binaries, Driller in 121, and AFL in 105. In addition, found 3 new bugs in previously-fuzzed programs and libraries.

Facon, A., Guilley, S., Lec'Hvien, M., Schaub, A., Souissi, Y..  2018.  Detecting Cache-Timing Vulnerabilities in Post-Quantum Cryptography Algorithms. 2018 IEEE 3rd International Verification and Security Workshop (IVSW). :7-12.

When implemented on real systems, cryptographic algorithms are vulnerable to attacks observing their execution behavior, such as cache-timing attacks. Designing protected implementations must be done with knowledge and validation tools as early as possible in the development cycle. In this article we propose a methodology to assess the robustness of the candidates for the NIST post-quantum standardization project to cache-timing attacks. To this end we have developed a dedicated vulnerability research tool. It performs a static analysis with tainting propagation of sensitive variables across the source code and detects leakage patterns. We use it to assess the security of the NIST post-quantum cryptography project submissions. Our results show that more than 80% of the analyzed implementations have at least one potential flaw, and three submissions total more than 1000 reported flaws each. Finally, this comprehensive study of the competitors security allows us to identify the most frequent weaknesses amongst candidates and how they might be fixed.

Xu, Z., Shi, C., Cheng, C. C., Gong, N. Z., Guan, Y..  2018.  A Dynamic Taint Analysis Tool for Android App Forensics. 2018 IEEE Security and Privacy Workshops (SPW). :160-169.

The plethora of mobile apps introduce critical challenges to digital forensics practitioners, due to the diversity and the large number (millions) of mobile apps available to download from Google play, Apple store, as well as hundreds of other online app stores. Law enforcement investigators often find themselves in a situation that on the seized mobile phone devices, there are many popular and less-popular apps with interface of different languages and functionalities. Investigators would not be able to have sufficient expert-knowledge about every single app, sometimes nor even a very basic understanding about what possible evidentiary data could be discoverable from these mobile devices being investigated. Existing literature in digital forensic field showed that most such investigations still rely on the investigator's manual analysis using mobile forensic toolkits like Cellebrite and Encase. The problem with such manual approaches is that there is no guarantee on the completeness of such evidence discovery. Our goal is to develop an automated mobile app analysis tool to analyze an app and discover what types of and where forensic evidentiary data that app generate and store locally on the mobile device or remotely on external 3rd-party server(s). With the app analysis tool, we will build a database of mobile apps, and for each app, we will create a list of app-generated evidence in terms of data types, locations (and/or sequence of locations) and data format/syntax. The outcome from this research will help digital forensic practitioners to reduce the complexity of their case investigations and provide a better completeness guarantee of evidence discovery, thereby deliver timely and more complete investigative results, and eventually reduce backlogs at crime labs. In this paper, we will present the main technical approaches for us to implement a dynamic Taint analysis tool for Android apps forensics. With the tool, we have analyzed 2,100 real-world Android apps. For each app, our tool produces the list of evidentiary data (e.g., GPS locations, device ID, contacts, browsing history, and some user inputs) that the app could have collected and stored on the devices' local storage in the forms of file or SQLite database. We have evaluated our tool using both benchmark apps and real-world apps. Our results demonstrated that the initial success of our tool in accurately discovering the evidentiary data.

Jenkins, J., Cai, H..  2018.  Leveraging Historical Versions of Android Apps for Efficient and Precise Taint Analysis. 2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR). :265-269.

Today, computing on various Android devices is pervasive. However, growing security vulnerabilities and attacks in the Android ecosystem constitute various threats through user apps. Taint analysis is a common technique for defending against these threats, yet it suffers from challenges in attaining practical simultaneous scalability and effectiveness. This paper presents a novel approach to fast and precise taint checking, called incremental taint analysis, by exploiting the evolving nature of Android apps. The analysis narrows down the search space of taint checking from an entire app, as conventionally addressed, to the parts of the program that are different from its previous versions. This technique improves the overall efficiency of checking multiple versions of the app as it evolves. We have implemented the techniques as a tool prototype, EVOTAINT, and evaluated our analysis by applying it to real-world evolving Android apps. Our preliminary results show that the incremental approach largely reduced the cost of taint analysis, by 78.6% on average, yet without sacrificing the analysis effectiveness, relative to a representative precise taint analysis as the baseline.

Raghothaman, Mukund, Kulkarni, Sulekha, Heo, Kihong, Naik, Mayur.  2018.  User-Guided Program Reasoning Using Bayesian Inference. Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation. :722-735.

Program analyses necessarily make approximations that often lead them to report true alarms interspersed with many false alarms. We propose a new approach to leverage user feedback to guide program analyses towards true alarms and away from false alarms. Our approach associates each alarm with a confidence value by performing Bayesian inference on a probabilistic model derived from the analysis rules. In each iteration, the user inspects the alarm with the highest confidence and labels its ground truth, and the approach recomputes the confidences of the remaining alarms given this feedback. It thereby maximizes the return on the effort by the user in inspecting each alarm. We have implemented our approach in a tool named Bingo for program analyses expressed in Datalog. Experiments with real users and two sophisticated analyses–-a static datarace analysis for Java programs and a static taint analysis for Android apps–-show significant improvements on a range of metrics, including false alarm rates and number of bugs found.

Jain, Vivek, Rawat, Sanjay, Giuffrida, Cristiano, Bos, Herbert.  2018.  TIFF: Using Input Type Inference To Improve Fuzzing. Proceedings of the 34th Annual Computer Security Applications Conference. :505-517.

Developers commonly use fuzzing techniques to hunt down all manner of memory corruption vulnerabilities during the testing phase. Irrespective of the fuzzer, input mutation plays a central role in providing adequate code coverage, as well as in triggering bugs. However, each class of memory corruption bugs requires a different trigger condition. While the goal of a fuzzer is to find bugs, most existing fuzzers merely approximate this goal by targeting their mutation strategies toward maximizing code coverage. In this work, we present a new mutation strategy that maximizes the likelihood of triggering memory-corruption bugs by generating fewer, but better inputs. In particular, our strategy achieves bug-directed mutation by inferring the type of the input bytes. To do so, it tags each offset of the input with a basic type (e.g., 32-bit integer, string, array etc.), while deriving mutation rules for specific classes of bugs. We infer types by means of in-memory data-structure identification and dynamic taint analysis, and implement our novel mutation strategy in a fully functional fuzzer which we call TIFF (Type Inference-based Fuzzing Framework). Our evaluation on real-world applications shows that type-based fuzzing triggers bugs much earlier than existing solutions, while maintaining high code coverage. For example, on several real-world applications and libraries (e.g., poppler, mpg123 etc.), we find real bugs (with known CVEs) in almost half of the time and upto an order of magnitude fewer inputs than state-of-the-art fuzzers.

Torres, Christof Ferreira, Schütte, Julian, State, Radu.  2018.  Osiris: Hunting for Integer Bugs in Ethereum Smart Contracts. Proceedings of the 34th Annual Computer Security Applications Conference. :664-676.

The capability of executing so-called smart contracts in a decentralised manner is one of the compelling features of modern blockchains. Smart contracts are fully fledged programs which cannot be changed once deployed to the blockchain. They typically implement the business logic of distributed apps and carry billions of dollars worth of coins. In that respect, it is imperative that smart contracts are correct and have no vulnerabilities or bugs. However, research has identified different classes of vulnerabilities in smart contracts, some of which led to prominent multi-million dollar fraud cases. In this paper we focus on vulnerabilities related to integer bugs, a class of bugs that is particularly difficult to avoid due to some characteristics of the Ethereum Virtual Machine and the Solidity programming language. In this paper we introduce Osiris – a framework that combines symbolic execution and taint analysis, in order to accurately find integer bugs in Ethereum smart contracts. Osiris detects a greater range of bugs than existing tools, while providing a better specificity of its detection. We have evaluated its performance on a large experimental dataset containing more than 1.2 million smart contracts. We found that 42,108 contracts contain integer bugs. Besides being able to identify several vulnerabilities that have been reported in the past few months, we were also able to identify a yet unknown critical vulnerability in a couple of smart contracts that are currently deployed on the Ethereum blockchain.

Wang, Yan, Zhang, Chao, Xiang, Xiaobo, Zhao, Zixuan, Li, Wenjie, Gong, Xiaorui, Liu, Bingchang, Chen, Kaixiang, Zou, Wei.  2018.  Revery: From Proof-of-Concept to Exploitable. Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. :1914-1927.

Automatic exploit generation is an open challenge. Existing solutions usually explore in depth the crashing paths, i.e., paths taken by proof-of-concept (POC) inputs triggering vulnerabilities, and generate exploits when exploitable states are found along the paths. However, exploitable states do not always exist in crashing paths. Moreover, existing solutions heavily rely on symbolic execution and are not scalable in path exploration and exploit generation. In addition, few solutions could exploit heap-based vulnerabilities. In this paper, we propose a new solution revery to search for exploitable states in paths diverging from crashing paths, and generate control-flow hijacking exploits for heap-based vulnerabilities. It adopts three novel techniques:(1) a digraph to characterize a vulnerability's memory layout and its contributor instructions;(2) a fuzz solution to explore diverging paths, which have similar memory layouts as the crashing paths, in order to search more exploitable states and generate corresponding diverging inputs;(3) a stitch solution to stitch crashing paths and diverging paths together, and synthesize EXP inputs able to trigger both vulnerabilities and exploitable states. We have developed a prototype of revery based on the binary analysis engine angr, and evaluated it on a set of 19 real world CTF (capture the flag) challenges. Experiment results showed that it could generate exploits for 9 (47%) of them, and generate EXP inputs able to trigger exploitable states for another 5 (26%) of them.