Biblio
In parallel with the meteoric rise of mobile software, we are witnessing an alarming escalation in the number and sophistication of the security threats targeted at mobile platforms, particularly Android, as the dominant platform. While existing research has made significant progress towards detection and mitigation of Android security, gaps and challenges remain. This paper contributes a comprehensive taxonomy to classify and characterize the state-of-the-art research in this area. We have carefully followed the systematic literature review process, and analyzed the results of more than 300 research papers, resulting in the most comprehensive and elaborate investigation of the literature in this area of research. The systematic analysis of the research literature has revealed patterns, trends, and gaps in the existing literature, and underlined key challenges and opportunities that will shape the focus of future research efforts.
In parallel with the meteoric rise of mobile software, we are witnessing an alarming escalation in the number and sophistication of the security threats targeted at mobile platforms, particularly Android, as the dominant platform. While existing research has made significant progress towards detection and mitigation of Android security, gaps and challenges remain. This paper contributes a comprehensive taxonomy to classify and characterize the state-of-the-art research in this area. We have carefully followed the systematic literature review process, and analyzed the results of more than 300 research papers, resulting in the most comprehensive and elaborate investigation of the literature in this area of research. The systematic analysis of the research literature has revealed patterns, trends, and gaps in
It is more expensive and time consuming to build modern software without extensive supply chains. Supply chains decrease these development risks, but typically at the cost of increased security risk. In particular, it is often difficult to understand or verify what a software component delivered by a third party does or could do. Such a component could contain unwanted behaviors, vulnerabilities, or malicious code, many of which become incorporated in applications utilizing the component. Sandboxes provide relief by encapsulating a component and imposing a security policy on it. This limits the operations the component can perform without as much need to trust or verify the component. Instead, a component user must trust or verify the relatively simple sandbox. Given this appealing prospect, researchers have spent the last few decades developing new sandboxing techniques and sandboxes. However, while sandboxes have been adopted in practice, they are not as pervasive as they could be. Why are sandboxes not achieving ubiquity at the same rate as extensive supply chains? This thesis advances our understanding of and overcomes some barriers to sandbox adoption. We systematically analyze ten years (2004 – 2014) of sandboxing research from top-tier security and systems conferences. We uncover two barriers: (1) sandboxes are often validated using relatively subjective techniques and (2) usability for sandbox deployers is often ignored by the studied community. We then focus on the Java sandbox to empirically study its use within the open source community. We find features in the sandbox that benign applications do not use, which have promoted a thriving exploit landscape. We develop run time monitors for the Java Virtual Machine (JVM) to turn off these features, stopping all known sandbox escaping JVM exploits without breaking benign applications. Furthermore, we find that the sandbox contains a high degree of complexity benign applications need that hampers sandbox use. When studying the sandbox’s use, we did not find a single application that successfully deployed the sandbox for security purposes, which motivated us to overcome benignly-used complexity via tooling. We develop and evaluate a series of tools to automate the most complex tasks, which currently require error-prone manual effort. Our tools help users derive, express, and refine a security policy and impose it on targeted Java application JARs and classes. This tooling is evaluated through case studies with industrial collaborators where we sandbox components that were previously difficult to sandbox securely. Finally, we observe that design and implementation complexity causes sandbox developers to accidentally create vulnerable sandboxes. Thus, we develop and evaluate a sandboxing technique that leverages existing cloud computing environments to execute untrusted computations. Malicious outcomes produced by the computations are contained by ephemeral virtual machines. We describe a field trial using this technique with Adobe Reader and compare the new sandbox to existing sandboxes using a qualitative framework we developed.
Foundational models of object-oriented constructs typically model objects as records with a structural type. However, many object-oriented languages are class-based; statically-typed formal models of these languages tend to sacrifice the foundational nature of the record-based models, and in addition cannot express dynamic class loading or creation. In this paper, we explore how to model statically-typed object-oriented languages that support dynamic class creation using foundational constructs of type theory. We start with an extensible tag construct motivated by type theory, and adapt it to support static reasoning about class hierarchy and the tags supported by each object. The result is a model that better explains the relationship between object-oriented and functional programming paradigms, suggests a useful enhancement to functional programming languages, and paves the way for more expressive statically typed object-oriented languages. In that vein, we describe the design and implementation of the Wyvern language, which leverages our theory.
Ambiguity arises in requirements when astatement is unintentionally or otherwise incomplete, missing information, or when a word or phrase has morethan one possible meaning. For web-based and mobileinformation systems, ambiguity, and vagueness inparticular, undermines the ability of organizations to aligntheir privacy policies with their data practices, which canconfuse or mislead users thus leading to an increase inprivacy risk. In this paper, we introduce a theory ofvagueness for privacy policy statements based on ataxonomy of vague terms derived from an empiricalcontent analysis of 15 privacy policies. The taxonomy wasevaluated in a paired comparison experiment and resultswere analyzed using the Bradley-Terry model to yield arank order of vague terms in both isolation andcomposition. The theory predicts how vague modifiers toinformation actions and information types can becomposed to increase or decrease overall vagueness. Wefurther provide empirical evidence based on factorialvignette surveys to show how increases in vagueness willdecrease users' acceptance of privacy risk and thusdecrease users' willingness to share personal information.
The Alloy specification language, and the corresponding Alloy Analyzer, have received much attention in the last two decades with applications in many areas of software engineering. Increasingly, formal analyses enabled by Alloy are desired for use in an on-line mode, where the specifications are automatically kept in sync with the running, possibly changing, software system. However, given Alloy Analyzer’s reliance on computationally expensive SAT solvers, an important challenge is the time it takes for such analyses to execute at runtime. The fact that in an on-line mode, the analyses are often repeated on slightly revised versions of a given specification, presents us with an opportunity to tackle this challenge. We present Titanium, an extension of Alloy for formal analysis of evolving specifications. By leveraging the results from previous analyses, Titanium narrows the state space of the revised specification, thereby greatly reducing the required computational effort. We describe the semantic basis of Titanium in terms of models specified in relational logic. We show how the approach can be realized atop an existing relational logic model finder. Our experimental results show Titanium achieves a significant speed-up over Alloy Analyzer when applied to the analysis of evolving specifications.
The notion of commitment is widely studied as a high-level abstraction for modeling multiagent interaction. An important challenge is supporting flexible decentralized enactments of commitment specifications. In this paper, we combine recent advances on specifying commitments and information protocols. Specifically, we contribute Tosca, a technique for automatically synthesizing information protocols from commitment specifications. Our main result is that the synthesized protocols support commitment alignment, which is the idea that agents must make compatible inferences about their commitments despite decentralization.
Mobile applications frequently access sensitive personal information to meet user or business requirements. Because such information is sensitive in general, regulators increasingly require mobile-app developers to publish privacy policies that describe what information is collected. Furthermore, regulators have fined companies when these policies are inconsistent with the actual data practices of mobile apps. To help mobile-app developers check their privacy policies against their apps' code for consistency, we propose a semi-automated framework that consists of a policy terminology-API method map that links policy phrases to API methods that produce sensitive information, and information flow analysis to detect misalignments. We present an implementation of our framework based on a privacy-policy-phrase ontology and a collection of mappings from API methods to policy phrases. Our empirical evaluation on 477 top Android apps discovered 341 potential privacy policy violations.
Sociotechnical systems (STSs), where users interact with software components, support automated logging, i.e., what a user has performed in the system. However, most systems do not implement automated processes for inspecting the logs when a misuse happens. Deciding what needs to be logged is crucial as excessive amounts of logs might be overwhelming for human analysts to inspect. The goal of this research is to aid software practitioners to implement automated forensic logging by providing a systematic method of using attackers' malicious intentions to decide what needs to be logged. We propose Lokma: a normative framework to construct logging rules for forensic knowledge. We describe the general forensic process of Lokma, and discuss related directions.
Programming language definitions assign formal meaning to complete programs. Programmers, however, spend a substantial amount of time interacting with incomplete programs -- programs with holes, type inconsistencies and binding inconsistencies -- using tools like program editors and live programming environments (which interleave editing and evaluation). Semanticists have done comparatively little to formally characterize (1) the static and dynamic semantics of incomplete programs; (2) the actions available to programmers as they edit and inspect incomplete programs; and (3) the behavior of editor services that suggest likely edit actions to the programmer based on semantic information extracted from the incomplete program being edited, and from programs that the system has encountered in the past. As such, each tool designer has largely been left to develop their own ad hoc heuristics.
This paper serves as a vision statement for a research program that seeks to develop these "missing" semantic foundations. Our hope is that these contributions, which will take the form of a series of simple formal calculi equipped with a tractable metatheory, will guide the design of a variety of current and future interactive programming tools, much as various lambda calculi have guided modern language designs. Our own research will apply these principles in the design of Hazel, an experimental live lab notebook programming environment designed for data science tasks. We plan to co-design the Hazel language with the editor so that we can explore concepts such as edit-time semantic conflict resolution mechanisms and mechanisms that allow library providers to install library-specific editor services.
Despite the abundance of information security guidelines, system developers have difficulties implementing technical solutions that are reasonably secure. Security patterns are one possible solution to help developers reuse security knowledge. The challenge is that it takes experts to develop security patterns. To address this challenge, we need a framework to identify and assess patterns and pattern application practices that are accessible to non-experts. In this paper, we narrowly define what we mean by patterns by focusing on requirements patterns and the considerations that may inform how we identify and validate patterns for knowledge reuse. We motivate this discussion using examples from the requirements pattern literature and theory in cognitive psychology.
Despite the abundance of information security guidelines, system developers have difficulties implementing technical solutions that are reasonably secure. Security patterns are one possible solution to help developers reuse security knowledge. The challenge is that it takes experts to develop security patterns. To address this challenge, we need a framework to identify and assess patterns and pattern application practices that are accessible to non-experts. In this paper, we narrowly define what we mean by patterns by focusing on requirements patterns and the considerations that may inform how we identify and validate patterns for knowledge reuse. We motivate this discussion using examples from the requirements pattern literature and theory in cognitive psychology.
Intrusion Detection Systems (IDSs) are crucial security mechanisms widely deployed for critical network protection. However, conventional IDSs become incompetent due to the rapid growth in network size and the sophistication of large scale attacks. To mitigate this problem, Collaborative IDSs (CIDSs) have been proposed in literature. In CIDSs, a number of IDSs exchange their intrusion alerts and other relevant data so as to achieve better intrusion detection performance. Nevertheless, the required information exchange may result in privacy leakage, especially when these IDSs belong to different self-interested organizations. In order to obtain a quantitative understanding of the fundamental tradeoff between the intrusion detection accuracy and the organizations' privacy, a repeated two-layer single-leader multi-follower game is proposed in this work. Based on our game-theoretic analysis, we are able to derive the expected behaviors of both the attacker and the IDSs and obtain the utility-privacy tradeoff curve. In addition, the existence of Nash equilibrium (NE) is proved and an asynchronous dynamic update algorithm is proposed to compute the optimal collaboration strategies of IDSs. Finally, simulation results are shown to validate the analysis.
Domain-specific languages improve ease-of-use, expressiveness and
verifiability, but defining and using different DSLs within a single
application remains difficult. We introduce an approach for embedded
DSLs where 1) whitespace delimits DSL-governed blocks, and
2) the parsing and type checking phases occur in tandem so that
the expected type of the block determines which domain-specific
parser governs that block. We argue that this approach occupies
a sweet spot, providing high expressiveness and ease-of-use while
maintaining safe composability. We introduce the design, provide
examples and describe an ongoing implementation of this strategy
in the Wyvern programming language. We also discuss how a more
conventional keyword-directed strategy for parsing of DSLs can
arise as a special case of this type-directed strategy.
Domain-specific languages improve ease-of-use, expressiveness and verifiability, but defining and using different DSLs within a single application remains difficult. We introduce an approach for embedded DSLs where 1) whitespace delimits DSL-governed blocks, and 2) the parsing and type checking phases occur in tandem so that the expected type of the block determines which domain-specific parser governs that block. We argue that this approach occupies a sweet spot, providing high expressiveness and ease-of-use while maintaining safe composability. We introduce the design, provide examples and describe an ongoing implementation of this strategy in the Wyvern programming language. We also discuss how a more conventional keyword-directed strategy for parsing of DSLs can arise as a special case of this type-directed strategy.