Visible to the public Biblio

Filters: Keyword is taint analysis  [Clear All Filters]
2022-07-28
[Anonymous].  2021.  An Automated Pipeline for Privacy Leak Analysis of Android Applications. 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). :1048—1050.
We propose an automated pipeline for analyzing privacy leaks in Android applications. By using a combination of dynamic and static analysis, we validate the results from each other to improve accuracy. Compare to the state-of-the-art approaches, we not only capture the network traffic for analysis, but also look into the data flows inside the application. We particularly focus on the privacy leakage caused by third-party services and high-risk permissions. The proposed automated approach will combine taint analysis, permission analysis, network traffic analysis, and dynamic function tracing during run-time to identify private information leaks. We further implement an automatic validation and complementation process to reduce false positives. A small-scale experiment has been conducted on 30 Android applications and a large-scale experiment on more than 10,000 Android applications is in progress.
2022-05-19
Kwon, Seongkyeong, Woo, Seunghoon, Seong, Gangmo, Lee, Heejo.  2021.  OCTOPOCS: Automatic Verification of Propagated Vulnerable Code Using Reformed Proofs of Concept. 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). :174–185.
Addressing vulnerability propagation has become a major issue in software ecosystems. Existing approaches hold the promise of detecting widespread vulnerabilities but cannot be applied to verify effectively whether propagated vulnerable code still poses threats. We present OCTOPOCS, which uses a reformed Proof-of-Concept (PoC), to verify whether a vulnerability is propagated. Using context-aware taint analysis, OCTOPOCS extracts crash primitives (the parts used in the shared code area between the original vulnerable software and propagated software) from the original PoC. OCTOPOCS then utilizes directed symbolic execution to generate guiding inputs that direct the execution of the propagated software from the entry point to the shared code area. Thereafter, OCTOPOCS creates a new PoC by combining crash primitives and guiding inputs. It finally verifies the propagated vulnerability using the created PoC. We evaluated OCTOPOCS with 15 real-world C and C++ vulnerable software pairs, with results showing that OCTOPOCS successfully verified 14 propagated vulnerabilities.
Li, Haofeng, Meng, Haining, Zheng, Hengjie, Cao, Liqing, Lu, Jie, Li, Lian, Gao, Lin.  2021.  Scaling Up the IFDS Algorithm with Efficient Disk-Assisted Computing. 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). :236–247.
The IFDS algorithm can be memory-intensive, requiring a memory budget of more than 100 GB of RAM for some applications. The large memory requirements significantly restrict the deployment of IFDS-based tools in practise. To improve this, we propose a disk-assisted solution that drastically reduces the memory requirements of traditional IFDS solvers. Our solution saves memory by 1) recomputing instead of memorizing intermediate analysis data, and 2) swapping in-memory data to disk when memory usages reach a threshold. We implement sophisticated scheduling schemes to swap data between memory and disks efficiently. We have developed a new taint analysis tool, DiskDroid, based on our disk-assisted IFDS solver. Compared to FlowDroid, a state-of-the-art IFDS-based taint analysis tool, for a set of 19 apps which take from 10 to 128 GB of RAM by FlowDroid, DiskDroid can analyze them with less than 10GB of RAM at a slight performance improvement of 8.6%. In addition, for 21 apps requiring more than 128GB of RAM by FlowDroid, DiskDroid can analyze each app in 3 hours, under the same memory budget of 10GB. This makes the tool deployable to normal desktop environments. We make the tool publicly available at https://github.com/HaofLi/DiskDroid.
Hung, Yu-Hsin, Jheng, Bing-Jhong, Li, Hong-Wei, Lai, Wen-Yang, Mallissery, Sanoop, Wu, Yu-Sung.  2021.  Mixed-mode Information Flow Tracking with Compile-time Taint Semantics Extraction and Offline Replay. 2021 IEEE Conference on Dependable and Secure Computing (DSC). :1–8.
Static information flow analysis (IFA) and dynamic information flow tracking (DIFT) have been widely employed in offline security analysis of computer programs. As security attacks become more sophisticated, there is a rising need for IFA and DIFT in production environment. However, existing systems usually deal with IFA and DIFT separately, and most DIFT systems incur significant performance overhead. We propose MIT to facilitate IFA and DIFT in online production environment. MIT offers mixed-mode information flow tracking at byte-granularity and incurs moderate runtime performance overhead. The core techniques consist of the extraction of taint semantics intermediate representation (TSIR) at compile-time and the decoupled execution of TSIR for information flow analysis. We conducted an extensive performance overhead evaluation on MIT to confirm its applicability in production environment. We also outline potential applications of MIT, including the implementation of data provenance checking and information flow based anomaly detection in real-world applications.
Shimchik, N. V., Ignatyev, V. N., Belevantsev, A. A..  2021.  Improving Accuracy and Completeness of Source Code Static Taint Analysis. 2021 Ivannikov Ispras Open Conference (ISPRAS). :61–68.
Static analysis is a general name for various methods of program examination without actually executing it. In particular, it is widely used to discover errors and vulnerabilities in software. Taint analysis usually denotes the process of checking the flow of user-provided data in the program in order to find potential vulnerabilities. It can be performed either statically or dynamically. In the paper we evaluate several improvements for the static taint analyzer Irbis [1], which is based on a special case of interprocedural graph reachability problem - the so-called IFDS problem, originally proposed by Reps et al. [2]. The analyzer is currently being developed at the Ivannikov Institute for System Programming of the Russian Academy of Sciences (ISP RAS). The evaluation is based on several real projects with known vulnerabilities and a subset of the Juliet Test Suite for C/C++ [3]. The chosen subset consists of more than 5 thousand tests for 11 different CWEs.
Chen, Xiarun, Li, Qien, Yang, Zhou, Liu, Yongzhi, Shi, Shaosen, Xie, Chenglin, Wen, Weiping.  2021.  VulChecker: Achieving More Effective Taint Analysis by Identifying Sanitizers Automatically. 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :774–782.
The automatic detection of vulnerabilities in Web applications using taint analysis is a hot topic. However, existing taint analysis methods for sanitizers identification are too simple to find available taint transmission chains effectively. These methods generally use pre-constructed dictionaries or simple keywords to identify, which usually suffer from large false positives and false negatives. No doubt, it will have a greater impact on the final result of the taint analysis. To solve that, we summarise and classify the commonly used sanitizers in Web applications and propose an identification method based on semantic analysis. Our method can accurately and completely identify the sanitizers in the target Web applications through static analysis. Specifically, we analyse the natural semantics and program semantics of existing sanitizers, use semantic analysis to find more in Web applications. Besides, we implemented the method prototype in PHP and achieved a vulnerability detection tool called VulChecker. Then, we experimented with some popular open-source CMS frameworks. The results show that Vulchecker can accurately identify more sanitizers. In terms of vulnerability detection, VulChecker also has a lower false positive rate and a higher detection rate than existing methods. Finally, we used VulChecker to analyse the latest PHP applications. We identified several new suspicious taint data propagation chains. Before the paper was completed, we have identified four unreported vulnerabilities. In general, these results show that our approach is highly effective in improving vulnerability detection based on taint analysis.
Ji, Songyan, Dong, Jian, Qiu, Junfu, Gu, Bowen, Wang, Ye, Wang, Tongqi.  2021.  Increasing Fuzz Testing Coverage for Smart Contracts with Dynamic Taint Analysis. 2021 IEEE 21st International Conference on Software Quality, Reliability and Security (QRS). :243–247.
Nowadays, smart contracts manage more and more digital assets and have become an attractive target for adversaries. To prevent smart contracts from malicious attacks, a thorough test is indispensable and must be finished before deployment because smart contracts cannot be modified after being deployed. Fuzzing is an important testing approach, but most existing smart contract fuzzers can hardly solve the constraints which involve deeply nested conditional statements, resulting in low coverage. To address this problem, we propose Targy, an efficient targeted mutation strategy based on dynamic taint analysis. We obtain the taint flow by dynamic taint propagation, and generate a more accurate mutation strategy for the input parameters of functions to simultaneously satisfy all conditional statements. We implemented Targy on sFuzz with 3.6 thousand smart contracts running on Ethereum. The numbers of covered branches and detected vulnerabilities increase by 6% and 7% respectively, and the average time required for covering a branch is reduced by 11 %.
Fursova, Natalia, Dovgalyuk, Pavel, Vasiliev, Ivan, Klimushenkova, Maria, Egorov, Danila.  2021.  Detecting Attack Surface With Full-System Taint Analysis. 2021 IEEE 21st International Conference on Software Quality, Reliability and Security Companion (QRS-C). :1161–1162.
Attack surface detection for the complex software is needed to find targets for the fuzzing, because testing the whole system with many inputs is not realistic. Researchers that previously applied taint analysis for dealing with different security tasks in the virtual machines did not examined how to apply it for attack surface detection. I.e., getting the program modules and functions, that may be affected by input data. We propose using taint tracking within a virtual machine and virtual machine introspection to create a new approach that can detect the internal module interfaces that can be fuzz tested to assure that software is safe or find the vulnerabilities.
Kong, Xiangdong, Tang, Yong, Wang, Pengfei, Wei, Shuning, Yue, Tai.  2021.  HashMTI: Scalable Mutation-based Taint Inference with Hash Records. 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). :84–95.
Mutation-based taint inference (MTI) is a novel technique for taint analysis. Compared with traditional techniques that track propagations of taint tags, MTI infers a variable is tainted if its values change due to input mutations, which is lightweight and conceptually sound. However, there are 3 challenges to its efficiency and scalability: (1) it cannot efficiently record variable values to monitor their changes; (2) it consumes a large amount of memory monitoring variable values, especially on complex programs; and (3) its excessive memory overhead leads to a low hit ratio of CPU cache, which slows down the speed of taint inference. This paper presents an efficient and scalable solution named HashMTI. We first explain the above challenges based on 4 observations. Motivated by these challenges, we propose a hash record scheme to efficiently monitor changes in variable values and significantly reduce the memory overhead. The scheme is based on our specially selected and optimized hash functions that possess 3 crucial properties. Moreover, we propose the DoubleMutation strategy, which applies additional mutations to mitigate the limitation of the hash record and detect more taint information. We implemented a prototype of HashMTI and evaluated it on 18 real-world programs and 4 LAVA-M programs. Compared with the baseline OrigMTI, HashMTI significantly reduces the overhead while having similar accuracy. It achieves a speedup of 2.5X to 23.5X and consumes little memory which is on average 70.4 times less than that of OrigMTI.
Zhang, Xueling, Wang, Xiaoyin, Slavin, Rocky, Niu, Jianwei.  2021.  ConDySTA: Context-Aware Dynamic Supplement to Static Taint Analysis. 2021 IEEE Symposium on Security and Privacy (SP). :796–812.
Static taint analyses are widely-applied techniques to detect taint flows in software systems. Although they are theoretically conservative and de-signed to detect all possible taint flows, static taint analyses almost always exhibit false negatives due to a variety of implementation limitations. Dynamic programming language features, inaccessible code, and the usage of multiple programming languages in a software project are some of the major causes. To alleviate this problem, we developed a novel approach, DySTA, which uses dynamic taint analysis results as additional sources for static taint analysis. However, naïvely adding sources causes static analysis to lose context sensitivity and thus produce false positives. Thus, we developed a hybrid context matching algorithm and corresponding tool, ConDySTA, to preserve context sensitivity in DySTA. We applied REPRODROID [1], a comprehensive benchmarking framework for Android analysis tools, to evaluate ConDySTA. The results show that across 28 apps (1) ConDySTA was able to detect 12 out of 28 taint flows which were not detected by any of the six state-of-the-art static taint analyses considered in ReproDroid, and (2) ConDySTA reported no false positives, whereas nine were reported by DySTA alone. We further applied ConDySTA and FlowDroid to 100 top Android apps from Google Play, and ConDySTA was able to detect 39 additional taint flows (besides 281 taint flows found by FlowDroid) while preserving the context sensitivity of FlowDroid.
Piskachev, Goran, Krishnamurthy, Ranjith, Bodden, Eric.  2021.  SecuCheck: Engineering configurable taint analysis for software developers. 2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM). :24–29.
Due to its ability to detect many frequently occurring security vulnerabilities, taint analysis is one of the core static analyses used by many static application security testing (SAST) tools. Previous studies have identified issues that software developers face with SAST tools. This paper reports on our experience in building a configurable taint analysis tool, named SecuCheck, that runs in multiple integrated development environments. SecuCheck is built on top of multiple existing components and comes with a Java-internal domain-specific language fluentTQL for specifying taint-flows, designed for software developers. We evaluate the applicability of SecuCheck in detecting eleven taint-style vulnerabilities in microbench programs and three real-world Java applications with known vulnerabilities. Empirically, we identify factors that impact the runtime of SecuCheck.
2021-11-29
Hough, Katherine, Welearegai, Gebrehiwet, Hammer, Christian, Bell, Jonathan.  2020.  Revealing Injection Vulnerabilities by Leveraging Existing Tests. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :284–296.
Code injection attacks, like the one used in the high-profile 2017 Equifax breach, have become increasingly common, now ranking \#1 on OWASP's list of critical web application vulnerabilities. Static analyses for detecting these vulnerabilities can overwhelm developers with false positive reports. Meanwhile, most dynamic analyses rely on detecting vulnerabilities as they occur in the field, which can introduce a high performance overhead in production code. This paper describes a new approach for detecting injection vulnerabilities in applications by harnessing the combined power of human developers' test suites and automated dynamic analysis. Our new approach, Rivulet, monitors the execution of developer-written functional tests in order to detect information flows that may be vulnerable to attack. Then, Rivulet uses a white-box test generation technique to repurpose those functional tests to check if any vulnerable flow could be exploited. When applied to the version of Apache Struts exploited in the 2017 Equifax attack, Rivulet quickly identifies the vulnerability, leveraging only the tests that existed in Struts at that time. We compared Rivulet to the state-of-the-art static vulnerability detector Julia on benchmarks, finding that Rivulet outperformed Julia in both false positives and false negatives. We also used Rivulet to detect new vulnerabilities.
Naeem, Hajra, Alalfi, Manar H..  2020.  Identifying Vulnerable IoT Applications Using Deep Learning. 2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER). :582–586.
This paper presents an approach for the identification of vulnerable IoT applications using deep learning algorithms. The approach focuses on a category of vulnerabilities that leads to sensitive information leakage which can be identified using taint flow analysis. First, we analyze the source code of IoT apps in order to recover tokens along their frequencies and tainted flows. Second, we develop, Token2Vec, which transforms the source code tokens into vectors. We have also developed Flow2Vec, which transforms the identified tainted flows into vectors. Third, we use the recovered vectors to train a deep learning algorithm to build a model for the identification of tainted apps. We have evaluated the approach on two datasets and the experiments show that the proposed approach of combining tainted flows features with the base benchmark that uses token frequencies only, has improved the accuracy of the prediction models from 77.78% to 92.59% for Corpus1 and 61.11% to 87.03% for Corpus2.
Sapountzis, Nikolaos, Sun, Ruimin, Wei, Xuetao, Jin, Yier, Crandall, Jedidiah, Oliveira, Daniela.  2020.  MITOS: Optimal Decisioning for the Indirect Flow Propagation Dilemma in Dynamic Information Flow Tracking Systems. 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS). :1090–1100.
Dynamic Information Flow Tracking (DIFT), also called Dynamic Taint Analysis (DTA), is a technique for tracking the information as it flows through a program's execution. Specifically, some inputs or data get tainted and then these taint marks (tags) propagate usually at the instruction-level. While DIFT has been a fundamental concept in computer and network security for the past decade, it still faces open challenges that impede its widespread application in practice; one of them being the indirect flow propagation dilemma: should the tags involved in an indirect flow, e.g., in a control or address dependency, be propagated? Propagating all these tags, as is done for direct flows, leads to overtainting (all taintable objects become tainted), while not propagating them leads to undertainting (information flow becomes incomplete). In this paper, we analytically model that decisioning problem for indirect flows, by considering various tradeoffs including undertainting versus overtainting, importance of heterogeneous code semantics and context. Towards tackling this problem, we design MITOS, a distributed-optimization algorithm, that: decides about the propagation of indirect flows by properly weighting all these tradeoffs, is of low-complexity, is scalable, is able to flexibly adapt to different application scenarios and security needs of large distributed systems. Additionally, MITOS is applicable to most DIFT systems that consider an arbitrary number of tag types, and introduces the key properties of fairness and tag-balancing to the DIFT field. To demonstrate MITOS's applicability in practice, we implement and evaluate MITOS on top of an open-source DIFT, and we shed light on the open problem. We also perform a case-study scenario with a real in-memory only attack and show that MITOS improves simultaneously (i) system's spatiotemporal overhead (up to 40%), and (ii) system's fingerprint on suspected bytes (up to 167%) compared to traditional DIFT, even though these metrics usually conflict.
Hermerschmidt, Lars, Straub, Andreas, Piskachev, Goran.  2020.  Language-Agnostic Injection Detection. 2020 IEEE Security and Privacy Workshops (SPW). :268–275.
Formal languages are ubiquitous wherever software systems need to exchange or store data. Unparsing into and parsing from such languages is an error-prone process that has spawned an entire class of security vulnerabilities. There has been ample research into finding vulnerabilities on the parser side, but outside of language specific approaches, few techniques targeting unparser vulnerabilities exist. This work presents a language-agnostic approach for spotting injection vulnerabilities in unparsers. It achieves this by mining unparse trees using dynamic taint analysis to extract language keywords, which are leveraged for guided fuzzing. Vulnerabilities can thus be found without requiring prior knowledge about the formal language, and in fact, the approach is even applicable where no specification thereof exists at all. This empowers security researchers and developers alike to gain deeper understanding of unparser implementations through examination of the unparse trees generated by the approach, as well as enabling them to find new vulnerabilities in poorly-understood software. This work presents a language-agnostic approach for spotting injection vulnerabilities in unparsers. It achieves this by mining unparse trees using dynamic taint analysis to extract language keywords, which are leveraged for guided fuzzing. Vulnerabilities can thus be found without requiring prior knowledge about the formal language, and in fact, the approach is even applicable where no specification thereof exists at all. This empowers security researchers and developers alike to gain deeper understanding of unparser implementations through examination of the unparse trees generated by the approach, as well as enabling them to find new vulnerabilities in poorly-understood software.
Lyons, D., Zahra, S..  2020.  Using Taint Analysis and Reinforcement Learning (TARL) to Repair Autonomous Robot Software. 2020 IEEE Security and Privacy Workshops (SPW). :181–184.
It is important to be able to establish formal performance bounds for autonomous systems. However, formal verification techniques require a model of the environment in which the system operates; a challenge for autonomous systems, especially those expected to operate over longer timescales. This paper describes work in progress to automate the monitor and repair of ROS-based autonomous robot software written for an apriori partially known and possibly incorrect environment model. A taint analysis method is used to automatically extract the dataflow sequence from input topic to publish topic, and instrument that code. A unique reinforcement learning approximation of MDP utility is calculated, an empirical and non-invasive characterization of the inherent objectives of the software designers. By comparing design (a-priori) utility with deploy (deployed system) utility, we show, using a small but real ROS example, that it's possible to monitor a performance criterion and relate violations of the criterion to parts of the software. The software is then patched using automated software repair techniques and evaluated against the original off-line utility.
Andarzian, Seyed Behnam, Ladani, Behrouz Tork.  2020.  Compositional Taint Analysis of Native Codes for Security Vetting of Android Applications. 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE). :567–572.
Security vetting of Android applications is one of the crucial aspects of the Android ecosystem. Regarding the state of the art tools for this goal, most of them doesn't consider analyzing native codes and only analyze the Java code. However, Android concedes its developers to implement a part or all of their applications using C or C++ code. Thus, applying conservative manners for analyzing Android applications while ignoring native codes would lead to less precision in results. Few works have tried to analyze Android native codes, but only JN-SAF has applied taint analysis using static techniques such as symbolic execution. However, symbolic execution has some problems when is used in large programs. One of these problems is the exponential growth of program paths that would raise the path explosion issue. In this work, we have tried to alleviate this issue by introducing our new tool named CTAN. CTAN applies new symbolic execution methods to angr in a particular way that it can make JN-SAF more efficient and faster. We have introduced compositional taint analysis in CTAN by combining satisfiability modulo theories with symbolic execution. Our experiments show that CTAN is 26 percent faster than its previous work JN-SAF and it also leads to more precision by detecting more data-leakage in large Android native codes.
Fu, Xiaoqin, Cai, Haipeng.  2020.  Scaling Application-Level Dynamic Taint Analysis to Enterprise-Scale Distributed Systems. 2020 IEEE/ACM 42nd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion). :270–271.
With the increasing deployment of enterprise-scale distributed systems, effective and practical defenses for such systems against various security vulnerabilities such as sensitive data leaks are urgently needed. However, most existing solutions are limited to centralized programs. For real-world distributed systems which are of large scales, current solutions commonly face one or more of scalability, applicability, and portability challenges. To overcome these challenges, we develop a novel dynamic taint analysis for enterprise-scale distributed systems. To achieve scalability, we use a multi-phase analysis strategy to reduce the overall cost. We infer implicit dependencies via partial-ordering method events in distributed programs to address the applicability challenge. To achieve greater portability, the analysis is designed to work at an application level without customizing platforms. Empirical results have shown promising scalability and capabilities of our approach.
She, Dongdong, Chen, Yizheng, Shah, Abhishek, Ray, Baishakhi, Jana, Suman.  2020.  Neutaint: Efficient Dynamic Taint Analysis with Neural Networks. 2020 IEEE Symposium on Security and Privacy (SP). :1527–1543.
Dynamic taint analysis (DTA) is widely used by various applications to track information flow during runtime execution. Existing DTA techniques use rule-based taint-propagation, which is neither accurate (i.e., high false positive rate) nor efficient (i.e., large runtime overhead). It is hard to specify taint rules for each operation while covering all corner cases correctly. Moreover, the overtaint and undertaint errors can accumulate during the propagation of taint information across multiple operations. Finally, rule-based propagation requires each operation to be inspected before applying the appropriate rules resulting in prohibitive performance overhead on large real-world applications.In this work, we propose Neutaint, a novel end-to-end approach to track information flow using neural program embeddings. The neural program embeddings model the target's programs computations taking place between taint sources and sinks, which automatically learns the information flow by observing a diverse set of execution traces. To perform lightweight and precise information flow analysis, we utilize saliency maps to reason about most influential sources for different sinks. Neutaint constructs two saliency maps, a popular machine learning approach to influence analysis, to summarize both coarse-grained and fine-grained information flow in the neural program embeddings.We compare Neutaint with 3 state-of-the-art dynamic taint analysis tools. The evaluation results show that Neutaint can achieve 68% accuracy, on average, which is 10% improvement while reducing 40× runtime overhead over the second-best taint tool Libdft on 6 real world programs. Neutaint also achieves 61% more edge coverage when used for taint-guided fuzzing indicating the effectiveness of the identified influential bytes. We also evaluate Neutaint's ability to detect real world software attacks. The results show that Neutaint can successfully detect different types of vulnerabilities including buffer/heap/integer overflows, division by zero, etc. Lastly, Neutaint can detect 98.7% of total flows, the highest among all taint analysis tools.
2021-03-15
Staicu, C.-A., Torp, M. T., Schäfer, M., Møller, A., Pradel, M..  2020.  Extracting Taint Specifications for JavaScript Libraries. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :198—209.

Modern JavaScript applications extensively depend on third-party libraries. Especially for the Node.js platform, vulnerabilities can have severe consequences to the security of applications, resulting in, e.g., cross-site scripting and command injection attacks. Existing static analysis tools that have been developed to automatically detect such issues are either too coarse-grained, looking only at package dependency structure while ignoring dataflow, or rely on manually written taint specifications for the most popular libraries to ensure analysis scalability. In this work, we propose a technique for automatically extracting taint specifications for JavaScript libraries, based on a dynamic analysis that leverages the existing test suites of the libraries and their available clients in the npm repository. Due to the dynamic nature of JavaScript, mapping observations from dynamic analysis to taint specifications that fit into a static analysis is non-trivial. Our main insight is that this challenge can be addressed by a combination of an access path mechanism that identifies entry and exit points, and the use of membranes around the libraries of interest. We show that our approach is effective at inferring useful taint specifications at scale. Our prototype tool automatically extracts 146 additional taint sinks and 7 840 propagation summaries spanning 1 393 npm modules. By integrating the extracted specifications into a commercial, state-of-the-art static analysis, 136 new alerts are produced, many of which correspond to likely security vulnerabilities. Moreover, many important specifications that were originally manually written are among the ones that our tool can now extract automatically.

2021-01-11
Lobo-Vesga, E., Russo, A., Gaboardi, M..  2020.  A Programming Framework for Differential Privacy with Accuracy Concentration Bounds. 2020 IEEE Symposium on Security and Privacy (SP). :411–428.
Differential privacy offers a formal framework for reasoning about privacy and accuracy of computations on private data. It also offers a rich set of building blocks for constructing private data analyses. When carefully calibrated, these analyses simultaneously guarantee the privacy of the individuals contributing their data, and the accuracy of the data analyses results, inferring useful properties about the population. The compositional nature of differential privacy has motivated the design and implementation of several programming languages aimed at helping a data analyst in programming differentially private analyses. However, most of the programming languages for differential privacy proposed so far provide support for reasoning about privacy but not for reasoning about the accuracy of data analyses. To overcome this limitation, in this work we present DPella, a programming framework providing data analysts with support for reasoning about privacy, accuracy and their trade-offs. The distinguishing feature of DPella is a novel component which statically tracks the accuracy of different data analyses. In order to make tighter accuracy estimations, this component leverages taint analysis for automatically inferring statistical independence of the different noise quantities added for guaranteeing privacy. We evaluate our approach by implementing several classical queries from the literature and showing how data analysts can figure out the best manner to calibrate privacy to meet the accuracy requirements.
2020-07-10
Yulianto, Arief Dwi, Sukarno, Parman, Warrdana, Aulia Arif, Makky, Muhammad Al.  2019.  Mitigation of Cryptojacking Attacks Using Taint Analysis. 2019 4th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE). :234—238.

Cryptojacking (also called malicious cryptocurrency mining or cryptomining) is a new threat model using CPU resources covertly “mining” a cryptocurrency in the browser. The impact is a surge in CPU Usage and slows the system performance. In this research, in-browsercryptojacking mitigation has been built as an extension in Google Chrome using Taint analysis method. The method used in this research is attack modeling with abuse case using the Man-In-The-Middle (MITM) attack as a testing for mitigation. The proposed model is designed so that users will be notified if a cryptojacking attack occurs. Hence, the user is able to check the script characteristics that run on the website background. The results of this research show that the taint analysis is a promising method to mitigate cryptojacking attacks. From 100 random sample websites, the taint analysis method can detect 19 websites that are infcted by cryptojacking.

2020-07-09
Ashouri, Mohammadreza.  2019.  Detecting Input Sanitization Errors in Scala. 2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW). :313—319.

Scala programming language combines object-oriented and functional programming in one concise, high-level language, and the language supports static types that help to avoid bugs in complex programs. This paper proposes a dynamic taint analyzer called ScalaTaint for Scala applications. The analyzer traces the propagation of malicious inputs from untrusted sources to sensitive sink methods in programs that can be exploited by adversaries. In this work, we evaluated the accuracy of ScalaTaint with a security benchmark suite including 7 projects in Scala. As a result, our analyzer could report 49 vulnerabilities within 753,372 lines of code. Moreover, the result of our performance measurement on ScalaBench shows 67% runtime overhead that demonstrates the usefulness and efficiently of our technique in comparison with similar tools.

2020-01-27
Guan, Le, Cao, Chen, Zhu, Sencun, Lin, Jingqiang, Liu, Peng, Xia, Yubin, Luo, Bo.  2019.  Protecting mobile devices from physical memory attacks with targeted encryption. Proceedings of the 12th Conference on Security and Privacy in Wireless and Mobile Networks. :34–44.
Sensitive data in a process could be scattered over the memory of a computer system for a prolonged period of time. Unfortunately, DRAM chips were proven insecure in previous studies. The problem becomes worse in the mobile environment, in which users' smartphones are easily lost or stolen. The powered-on phones may contain sensitive data in the vulnerable DRAM chips. In this paper, we propose MemVault, a mechanism to protect sensitive data in Android devices against physical memory attacks. MemVault keeps track of the propagation of well-marked sensitive data sources, and selectively encrypts tainted sensitive memory contents in the DRAM chip. When a tainted object is accessed, MemVault redirects the access to the internal RAM (iRAM), where the cipher-text object is decrypted transparently. iRAM is a system-on-chip (SoC) component which is by nature immune to physical memory exploits. We have implemented a MemVault prototype system, and have evaluated it with extensive experiments. Our results validate that MemVault effectively eliminates the occurrences of clear-text sensitive objects in DRAM chips, and imposes acceptable overheads.
Cao, Mengchen, Hou, Xiantong, Wang, Tao, Qu, Hunter, Zhou, Yajin, Bai, Xiaolong, Wang, Fuwei.  2019.  Different is Good: Detecting the Use of Uninitialized Variables through Differential Replay. Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. :1883–1897.
The use of uninitialized variables is a common issue. It could cause kernel information leak, which defeats the widely deployed security defense, i.e., kernel address space layout randomization (KASLR). Though a recent system called Bochspwn Reloaded reported multiple memory leaks in Windows kernels, how to effectively detect this issue is still largely behind. In this paper, we propose a new technique, i.e., differential replay, that could effectively detect the use of uninitialized variables. Specifically, it records and replays a program's execution in multiple instances. One instance is with the vanilla memory, the other one changes (or poisons) values of variables allocated from the stack and the heap. Then it compares program states to find references to uninitialized variables. The idea is that if a variable is properly initialized, it will overwrite the poisoned value and program states in two running instances should be the same. After detecting the differences, our system leverages the symbolic taint analysis to further identify the location where the variable was allocated. This helps us to identify the root cause and facilitate the development of real exploits. We have implemented a prototype called TimePlayer. After applying it to both Windows 7 and Windows 10 kernels (x86/x64), it successfully identified 34 new issues and another 85 ones that had been patched (some of them were publicly unknown.) Among 34 new issues, 17 of them have been confirmed as zero-day vulnerabilities by Microsoft.