Visible to the public Biblio

Filters: First Letter Of Last Name is M  [Clear All Filters]
A B C D E F G H I J K L [M] N O P Q R S T U V W X Y Z   [Show ALL]
Munindar P. Singh.  2022.  Consent as a Foundation for Responsible Autonomy. Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI). 36
This paper focuses on a dynamic aspect of responsible autonomy, namely, to make intelligent agents be responsible at run time. That is, it considers settings where decision making by agents impinges upon the outcomes perceived by other agents. For an agent to act responsibly, it must accommodate the desires and other attitudes of its users and, through other agents, of their users. The contribution of this paper is twofold. First, it provides a conceptual analysis of consent, its benefits and misuses, and how understanding consent can help achieve responsible autonomy. Second, it outlines challenges for AI (in particular, for agents and multiagent systems) that merit investigation to form as a basis for modeling consent in multiagent systems and applying consent to achieve responsible autonomy.
Blue Sky Track
Munindar P. Singh, Amit K. Chopra.  2017.  The Internet of Things and Multiagent Systems: Decentralized Intelligence in Distributed Computing. Proceedings of the 37th IEEE International Conference on Distributed Computing Systems (ICDCS). :1738–1747.

Traditionally, distributed computing concentrates on computation understood at the level of information exchange and sets aside human and organizational concerns as largely to be handled in an ad hoc manner.  Increasingly, however, distributed applications involve multiple loci of autonomy.  Research in multiagent systems (MAS) addresses autonomy by drawing on concepts and techniques from artificial intelligence.  However, MAS research generally lacks an adequate understanding of modern distributed computing.

In this Blue Sky paper, we envision decentralized multiagent systems as a way to place decentralized intelligence in distributed computing, specifically, by supporting computation at the level of social meanings.  We motivate our proposals for research in the context of the Internet of Things (IoT), which has become a major thrust in distributed computing.  From the IoT's representative applications, we abstract out the major challenges of relevance to decentralized intelligence.  These include the heterogeneity of IoT components; asynchronous and delay-tolerant communication and decoupled enactment; and multiple stakeholders with subtle requirements for governance, incorporating resource usage, cooperation, and privacy.  The IoT yields high-impact problems that require solutions that go beyond traditional ways of thinking.

We conclude with highlights of some possible research directions in decentralized MAS, including programming models; interaction-oriented software engineering; and what we term enlightened governance.

Blue Sky Thinking Track

Morgan Evans, Jaspreet Bhatia, Sudarshan Wadkar, Travis Breaux.  2017.  An Evaluation of Constituency-based Hyponymy Extraction from Privacy Policies . 25th IEEE International Requirements Engineering Conference.

Requirements analysts can model regulated data practices to identify and reason about risks of noncompliance. If terminology is inconsistent or ambiguous, however, these models and their conclusions will be unreliable. To study this problem, we investigated an approach to automatically construct an information type ontology by identifying information type hyponymy in privacy policies using Tregex patterns. Tregex is a utility to match regular expressions against constituency parse trees, which are hierarchical expressions of natural language clauses, including noun and verb phrases. We discovered the Tregex patterns by applying content analysis to 30 privacy policies from six domains (shopping, telecommunication, social networks, employment, health, and news.) From this dataset, three semantic and four lexical categories of hyponymy emerged based on category completeness and wordorder. Among these, we identified and empirically evaluated 72 Tregex patterns to automate the extraction of hyponyms from privacy policies. The patterns match information type hyponyms with an average precision of 0.72 and recall of 0.74. 

Momin Malik, Jurgen Pfeffer, Gabriel Ferreira, Christian Kästner.  2016.  Visualizing the variational callgraph of the Linux Kernel: An approach for reasoning about dependencies. HotSos '16 Proceedings of the Symposium and Bootcamp on the Science of Security.

Software developers use #ifdef statements to support code configurability, allowing software product diversification. But because functions can be in many executions paths that depend on complex combinations of configuration options, the introduction of an #ifdef for a given purpose (such as adding a new feature to a program) can enable unintended function calls, which can be a source of vulnerabilities. Part of the difficulty lies in maintaining mental models of all dependencies. We propose analytic visualizations of thevariational callgraph to capture dependencies across configurations and create visualizations to demonstrate how it would help developers visually reason through the implications of diversification, for example through visually doing change impact analysis.

Mohammed Alsaleh, Ehab Al-Shaer.  2016.  Towards Automated Verification of Active Cyber Defense Strategies on Software Defined Networks. ACM CCS SafeConfig '16 Proceedings of the 2016 ACM Workshop on Automated Decision Making for Active Cyber Defense:.
Mitra Bokaei Hosseini, Sudarshan Wadkar, Travis Breaux, Jianwei Niu.  2016.  Lexical Similarity of Information Type Hypernyms, Meronyms and Synonyms in Privacy Policies. Association for the Advancement of Artificial Intelligence.

Privacy policies are used to communicate company data practices to consumers and must be accurate and comprehensive. Each policy author is free to use their own nomenclature when describing data practices, which leads to different ways in which similar information types are described across policies. A formal ontology can help policy authors, users and regulators consistently check how data practice descriptions relate to other interpretations of information types. In this paper, we describe an empirical method for manually constructing an information type ontology from privacy policies. The method consists of seven heuristics that explain how to infer hypernym, meronym and synonym relationships from information type phrases, which we discovered using grounded analysis of five privacy policies. The method was evaluated on 50 mobile privacy policies which produced an ontology consisting of 355 unique information type names. Based on the manual results, we describe an automated technique consisting of 14 reusable semantic rules to extract hypernymy, meronymy, and synonymy relations from information type phrases. The technique was evaluated on the manually constructed ontology to yield .95 precision and .51 recall.

Michael Maass, Adam Sales, Benjamin Chung, Joshua Sunshine.  2016.  A systematic analysis of the science of sandboxing. PeerJ Computer Science. 2

Sandboxes are increasingly important building materials for secure software systems. In recognition of their potential to improve the security posture of many systems at various points in the development lifecycle, researchers have spent the last several decades developing, improving, and evaluating sandboxing techniques. What has been done in this space? Where are the barriers to advancement? What are the gaps in these efforts? We systematically analyze a decade of sandbox research from five top-tier security and systems conferences using qualitative content analysis, statistical clustering, and graph-based metrics to answer these questions and more. We find that the term “sandbox” currently has no widely accepted or acceptable definition. We use our broad scope to propose the first concise and comprehensive definition for “sandbox” that consistently encompasses research sandboxes. We learn that the sandboxing landscape covers a range of deployment options and policy enforcement techniques collectively capable of defending diverse sets of components while mitigating a wide range of vulnerabilities. Researchers consistently make security, performance, and applicability claims about their sandboxes and tend to narrowly define the claims to ensure they can be evaluated. Those claims are validated using multi-faceted strategies spanning proof, analytical analysis, benchmark suites, case studies, and argumentation. However, we find two cases for improvement: (1) the arguments researchers present are often ad hoc and (2) sandbox usability is mostly uncharted territory. We propose ways to structure arguments to ensure they fully support their corresponding claims and suggest lightweight means of evaluating sandbox usability.

Michael Maass.  2016.  A Theory and Tools for Applying Sandboxes Effectively. Institute for Software Research, School of Computer Science. PhD Philosphy:166.

It is more expensive and time consuming to build modern software without extensive supply chains. Supply chains decrease these development risks, but typically at the cost of increased security risk. In particular, it is often difficult to understand or verify what a software component delivered by a third party does or could do. Such a component could contain unwanted behaviors, vulnerabilities, or malicious code, many of which become incorporated in applications utilizing the component. Sandboxes provide relief by encapsulating a component and imposing a security policy on it. This limits the operations the component can perform without as much need to trust or verify the component. Instead, a component user must trust or verify the relatively simple sandbox. Given this appealing prospect, researchers have spent the last few decades developing new sandboxing techniques and sandboxes. However, while sandboxes have been adopted in practice, they are not as pervasive as they could be. Why are sandboxes not achieving ubiquity at the same rate as extensive supply chains? This thesis advances our understanding of and overcomes some barriers to sandbox adoption. We systematically analyze ten years (2004 – 2014) of sandboxing research from top-tier security and systems conferences. We uncover two barriers: (1) sandboxes are often validated using relatively subjective techniques and (2) usability for sandbox deployers is often ignored by the studied community. We then focus on the Java sandbox to empirically study its use within the open source community. We find features in the sandbox that benign applications do not use, which have promoted a thriving exploit landscape. We develop run time monitors for the Java Virtual Machine (JVM) to turn off these features, stopping all known sandbox escaping JVM exploits without breaking benign applications. Furthermore, we find that the sandbox contains a high degree of complexity benign applications need that hampers sandbox use. When studying the sandbox’s use, we did not find a single application that successfully deployed the sandbox for security purposes, which motivated us to overcome benignly-used complexity via tooling. We develop and evaluate a series of tools to automate the most complex tasks, which currently require error-prone manual effort. Our tools help users derive, express, and refine a security policy and impose it on targeted Java application JARs and classes. This tooling is evaluated through case studies with industrial collaborators where we sandbox components that were previously difficult to sandbox securely. Finally, we observe that design and implementation complexity causes sandbox developers to accidentally create vulnerable sandboxes. Thus, we develop and evaluate a sandboxing technique that leverages existing cloud computing environments to execute untrusted computations. Malicious outcomes produced by the computations are contained by ephemeral virtual machines. We describe a field trial using this technique with Adobe Reader and compare the new sandbox to existing sandboxes using a qualitative framework we developed.

Michael Maass, William Scherlis, Jonathan Aldrich.  2014.  In-Nimbo Sandboxing. HotSoS '14 Proceedings of the 2014 Symposium and Bootcamp on the Science of Security.

Sandboxes impose a security policy, isolating applications and their components from the rest of a system. While many sandboxing techniques exist, state of the art sandboxes generally perform their functions within the system that is being defended. As a result, when the sandbox fails or is bypassed, the security of the surrounding system can no longer be assured. We experiment with the idea of in-nimbo sandboxing, encapsulating untrusted computations away from the system we are trying to protect. The idea is to delegate computations that may be vulnerable or malicious to virtual machine instances in a cloud computing environment.

This may not reduce the possibility of an in-situ sandbox compromise, but it could significantly reduce the consequences should that possibility be realized. To achieve this advantage, there are additional requirements, including: (1) A regulated channel between the local and cloud environments that supports interaction with the encapsulated application, (2) Performance design that acceptably minimizes latencies in excess of the in-situ baseline.

To test the feasibility of the idea, we built an in-nimbo sandbox for Adobe Reader, an application that historically has been subject to significant attacks. We undertook a prototype deployment with PDF users in a large aerospace firm. In addition to thwarting several examples of existing PDF-based malware, we found that the added increment of latency, perhaps surprisingly, does not overly impair the user experience with respect to performance or usability.

Michael Lanham, Geoffrey Morgan, Kathleen Carley.  2014.  Social Network Modeling and Agent‐Based Simulation in Support of Crisis De‐escalation. IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS. 44

Decision makers need capabilities to quickly model and effectively assess consequences of actions and reactions in crisis de-escalation environments. The creation and what-if exercising of such models has traditionally had onerous resource requirements. This research demonstrates fast and viable ways to build such models in operational environments. Through social network extraction from texts, network analytics to identify key actors, and then simulation to assess alternative interventions, advisors can support practicing and execution of crisis de-escalation activities. We describe how we used this approach as part of a scenario-driven modeling effort. We demonstrate the strength of moving from data to models and the advantages of data-driven simulation, which allow for iterative refinement. We conclude with a discussion of the limitations of this approach and anticipated future work.

Michael Coblenz, Joshua Sunshine, Jonathan Aldrich, Brad Myers, Sam Weber, Forrest Shull.  2016.  Exploring Language Support for Immutability. ICSE '16 Proceedings of the 38th International Conference on Software Engineering.

Programming languages can restrict state change by preventing it entirely (immutability) or by restricting which clients may modify state (read-only restrictions). The benefits of immutability and read-only restrictions in software structures have been long-argued by practicing software engineers, researchers, and programming language designers. However, there are many proposals for language mechanisms for restricting state change, with a remarkable diversity of techniques and goals, and there is little empirical data regarding what practicing software engineers want in their tools and what would benefit them. We systematized the large collection of techniques used by programming languages to help programmers prevent undesired changes in state. We interviewed expert software engineers to discover their expectations and requirements, and found that important requirements, such as expressing immutability constraints, were not reflected in features available in the languages participants used. The interview results informed our design of a new language extension for specifying immutability in Java. Through an iterative, participatory design process, we created a tool that reflects requirements from both our interviews and the research literature.

Michael Coblenz, Jonathan Aldrich, Brad Myers, Joshua Sunshine.  2014.  Considering Productivity Effects of Explicit Type Declarations. PLATEAU '14 Proceedings of the 5th Workshop on Evaluation and Usability of Programming Languages and Tools.

Static types may be used both by the language implementation and directly by the user as documentation. Though much existing work focuses primarily on the implications of static types on the semantics of programs, relatively little work considers the impact on usability that static types provide. Though the omission of static type information may decrease program length and thereby improve readability, it may also decrease readability because users must then frequently derive type information manually while reading programs. As type inference becomes more popular in languages that are in widespread use, it is important to consider whether the adoption of type inference may impact productivity of developers.

Michael Coblenz, Robert Seacord, Brad Myers, Joshua Sunshine, Jonathan Aldrich.  2015.  A Course-Based Usability Analysis of Cilk Plus and OpenMP. IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) .

Cilk Plus and OpenMP are parallel language ex-tensions for the C and C++ programming languages. The CPLEX Study Group of the ISO/IEC C Standards Committee is developing a proposal for a parallel programming extension to C that combines ideas from Cilk Plus and OpenMP. We conducted a preliminary comparison of Cilk Plus and OpenMP in a master's level course on security to evaluate the design tradeoffs in the usability and security of these two approaches. The eventual goal is to inform decision making within the committee. We found several usability problems worthy of further investigation based on student performance, including declaring and using reductions, multi-line compiler directives, and the understandability of task assignment to threads.

Michael Coblenz, Whitney Nelson, Jonathan Aldrich, Brad Myers, Joshua Sunshine.  2017.  Glacier: Transitive Class Immutability for Java. 39th International Conference on Software Engineering.

Though immutability has been long-proposed as a way to prevent bugs in software, little is known about how to make immutability support in programming languages effective for software engineers. We designed a new formalism that extends Java to support transitive class immutability, the form of immutability for which there is the strongest empirical support, and implemented that formalism in a tool called Glacier. We applied Glacier successfully to two real-world systems. We also compared Glacier to Java’s final in a user study of twenty participants. We found that even after being given instructions on how to express immutability with final, participants who used final were unable to express immutability correctly, whereas almost all participants who used Glacier succeeded. We also asked participants to make specific changes to immutable classes and found that participants who used final all incorrectly mutated immutable state, whereas almost all of the participants who used Glacier succeeded. Glacier represents a promising approach to enforcing immutability in Java and provides a model for enforcement in other languages.

Mezzour, Ghita, Carley, L. Richard, Carley, Kathleen.  2014.  Longitudinal Analysis of a Large Corpus of Cyber Threat Descriptions. Journal of Computer Virology and Hacking Techniques.

Online cyber threat descriptions are rich, but little research has attempted to systematically analyze these descriptions. In this paper, we process and analyze two of Symantec’s online threat description corpora. The Anti-Virus (AV) corpus contains descriptions of more than 12,400 threats detected by Symantec’s AV, and the Intrusion Prevention System (IPS) corpus contains descriptions of more than 2,700 attacks detected by Symantec’s IPS. In our analysis, we quantify the over time evolution of threat severity and type in the corpora. We also assess the amount of time Symantec takes to release signatures for newly discovered threats. Our analysis indicates that a very small minority of threats in the AV corpus are high-severity, whereas the majority of attacks in the IPS corpus are high-severity. Moreover, we find that the prevalence of different threat types such as worms and viruses in the corpora varies considerably over time. Finally, we find that Symantec prioritizes releasing signatures for fast propagating threats.

Meng Meng, Jens Meinicke, Chu-Pan Wong, Eric Walkingshaw, Christian Kästner.  2017.  A Choice of Variational Stacks: Exploring Variational Data Structures. 11th International Workshop on Variability Modelling of Software-intensive Systems (VAMOS).

Many applications require not only representing variability in software and data, but also computing with it. To do so efficiently requires variational data structures that make the variability explicit in the underlying data and the operations used to manipulate it. Variational data structures have been developed ad hoc for many applications, but there is little general understanding of how to design them or what tradeoffs exist among them. In this paper, we strive for a more systematic exploration and analysis of a variational data structure. We want to know how different design decisions affect the performance and scalability of a variational data structure, and what properties of the underlying data and operation sequences need to be considered. Specifically, we study several alternative designs of a variational stack, a data structure that supports efficiently representing and computing with multiple variants of a plain stack, and that is a common building block in many algorithms. The different variational stacks are presented as a small product line organized by three design decisions. We analyze how these design decisions affect the performance of a variational stack with different usage profiles. Finally, we evaluate how these design decisions affect the performance of the variational stack in a real-world scenario: in the interpreter VarexJ when executing real software containing variability. 

Mehdi Mashayekhi, Hongying Du, George F. List, Munindar P. Singh.  2016.  Silk: A Simulation Study of Regulating Open Normative Multiagent Systems. Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI). :1–7.

In a multiagent system, a (social) norm describes what the agents may expect from each other.  Norms promote autonomy (an agent need not comply with a norm) and heterogeneity (a norm describes interactions at a high level independent of implementation details). Researchers have studied norm emergence through social learning where the agents interact repeatedly in a graph structure.

In contrast, we consider norm emergence in an open system, where membership can change, and where no predetermined graph structure exists.  We propose Silk, a mechanism wherein a generator monitors interactions among member agents and recommends norms to help resolve conflicts.  Each member decides on whether to accept or reject a recommended norm.  Upon exiting the system, a member passes its experience along to incoming members of the same type.  Thus, members develop norms in a hybrid manner to resolve conflicts.

We evaluate Silk via simulation in the traffic domain.  Our results show that social norms promoting conflict resolution emerge in both moderate and selfish societies via our hybrid mechanism.

Marwan Abi-Antoun, Sumukhi Chandrashekar, Radu Vanciu, Andrew Giang.  2014.  Are Object Graphs Extracted Using Abstract Interpretation Significantly Different from the Code? Extended Version SCAM '14 Proceedings of the 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation.

To evolve object-oriented code, one must understand both the code structure in terms of classes, and the runtime structure in terms of abstractions of objects that are being created and relations between those objects. To help with this understanding, static program analysis can extract heap abstractions such as object graphs. But the extracted graphs can become too large if they do not sufficiently abstract objects, or too imprecise if they abstract objects excessively to the point of being similar to a class diagram that shows one box for a class to represent all the instances of that class. One previously proposed solution uses both annotations and abstract interpretation to extract a global, hierarchical, abstract object graph that conveys both abstraction and design intent, but can still be related to the code structure. In this paper, we define metrics that relate nodes and edges in the object graph to elements in the code structure to measure how they differ, and if the differences are indicative of language or design features such as encapsulation, polymorphism and inheritance. We compute the metrics across eight systems totaling over 100 KLOC, and show a statistically significant difference between the code and the object graph. In several cases, the magnitude of this difference is large.

Marwan Abi-Antoun, Sumukhi Chandrashekar, Radu Vanciu, Andrew Giang.  2014.  Are Object Graphs Extracted Using Abstract Interpretation Significantly Different from the Code? SCAM '14 Proceedings of the 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation.

To evolve object-oriented code, one must understand both the code structure in terms of classes, and the runtime structure in terms of abstractions of objects that are being created and relations between those objects. To help with this understanding, static program analysis can extract heap abstractions such as object graphs. But the extracted graphs can become too large if they do not sufficiently abstract objects, or too imprecise if they abstract objects excessively to the point of being similar to a class diagram, where one box for a class represents all the instances of that class. One previously proposed solution uses both annotations and abstract interpretation to extract a global, hierarchical, abstract object graph that conveys both abstraction and design intent, but can still be related to the code structure. In this paper, we define metrics that relate nodes and edges in the object graph to elements in the code structure, to measure how they differ, and if the differences are indicative of language or design features such as encapsulation, polymorphism and inheritance. We compute the metrics across eight systems totaling over 100 KLOC, and show a statistically significant difference between the code and the object graph. In several cases, the magnitude of this difference is large.

Marwan Abi-Antoun, Yibin Wang, Ebrahim Khalaj, Andrew Giang, Vaclav Rajlich.  2015.  Impact Analysis based on a Global Hierarchical Object Graph. 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER).

During impact analysis on object-oriented code, statically extracting dependencies is often complicated by subclassing, programming to interfaces, aliasing, and collections, among others. When a tool recommends a large number of types or does not rank its recommendations, it may lead developers to explore more irrelevant code. We propose to mine and rank dependencies based on a global, hierarchical points-to graph that is extracted using abstract interpretation. A previous whole-program static analysis interprets a program enriched with annotations that express hierarchy, and over-approximates all the objects that may be created at runtime and how they may communicate. In this paper, an analysis mines the hierarchy and the edges in the graph to extract and rank dependencies such as the most important classes related to a class, or the most important classes behind an interface. An evaluation using two case studies on two systems totaling 10,000 lines of code and five completed code modification tasks shows that following dependencies based on abstract interpretation achieves higher effectiveness compared to following dependencies extracted from the abstract syntax tree. As a result, developers explore less irrelevant code.

Marwan Abi-Antoun, Ebrahim Khalaj, Radu Vanciu, Ahmad Moghimi.  2016.  Abstract Runtime Structure Reasoning about Security. HotSos '16 Proceedings of the Symposium and Bootcamp on the Science of Security.

We propose an interactive approach where analysts reason about the security of a system using an abstraction of its runtime structure, as opposed to looking at the code. They interactively refine a hierarchical object graph, set security properties on abstract objects or edges, query the graph, and investigate the results by studying highlighted objects or edges or tracing to the code. Behind the scenes, an inference analysis and an extraction analysis maintain the soundness of the graph with respect to the code.

Maria Riaz, Laurie Williams.  2012.  Security Requirements Patterns: Understanding the Science Behind the Art of Pattern Writing. 2012 Second IEEE International Workshop on Requirements Patterns (RePa).

Security requirements engineering ideally combines expertise in software security with proficiency in requirements engineering to provide a foundation for developing secure systems. However, security requirements are often inadequately understood and improperly specified, often due to lack of security expertise and a lack of emphasis on security during early stages of system development. Software systems often have common and recurrent security requirements in addition to system-specific security needs. Security requirements patterns can provide a means of capturing common security requirements while documenting the context in which a requirement manifests itself and the tradeoffs involved. The objective of this paper is to aid in understanding of the process for pattern development and provide considerations for writing effective security requirements patterns. We analyzed existing literature on software patterns, problem solving and cognition to outline the process for developing software patterns. We also reviewed strategies for specifying reusable security requirements and security requirements patterns. Our proposed considerations can aid pattern writers in capturing necessary contextual information when documenting security requirements patterns to facilitate application and integration of security requirements.

Maria Riaz, Travis Breaux, Laurie Williams.  2015.  How have we evaluated software pattern application? A systematic mapping study of research design practices Journal Information and Software Technology . 65(C):14-38.

ContextSoftware patterns encapsulate expert knowledge for constructing successful solutions to recurring problems. Although a large collection of software patterns is available in literature, empirical evidence on how well various patterns help in problem solving is limited and inconclusive. The context of these empirical findings is also not well understood, limiting applicability and generalizability of the findings. ObjectiveTo characterize the research design of empirical studies exploring software pattern application involving human participants. MethodWe conducted a systematic mapping study to identify and analyze 30 primary empirical studies on software pattern application, including 24 original studies and 6 replications. We characterize the research design in terms of the questions researchers have explored and the context of empirical research efforts. We also classify the studies in terms of measures used for evaluation, and threats to validity considered during study design and execution. ResultsUse of software patterns in maintenance is the most commonly investigated theme, explored in 16 studies. Object-oriented design patterns are evaluated in 14 studies while 4 studies evaluate architectural patterns. We identified 10 different constructs with 31 associated measures used to evaluate software patterns. Measures for 'efficiency' and 'usability' are commonly used to evaluate the problem solving process. While measures for 'completeness', 'correctness' and 'quality' are commonly used to evaluate the final artifact. Overall, 'time to complete a task' is the most frequently used measure, employed in 15 studies to measure 'efficiency'. For qualitative measures, studies do not report approaches for minimizing biases 27% of the time. Nine studies do not discuss any threats to validity. ConclusionSubtle differences in study design and execution can limit comparison of findings. Establishing baselines for participants' experience level, providing appropriate training, standardizing problem sets, and employing commonly used measures to evaluate performance can support replication and comparison of results across studies.

Mahmoud Hammad, Hamid Bagheri, Sam Malek.  2017.  DELDroid: Determination and Enforcement of Least-Privilege Architecture in Android. 2017 IEEE International Conference on Software Architecture.

Modern mobile platforms rely on a permission model to guard the system's resources and apps. In Android, since the permissions are granted at the granularity of apps, and all components belonging to an app inherit those permissions, an app's components are typically over-privileged, i.e., components are granted more privileges than they need to complete their tasks. Systematic violation of least-privilege principle in Android has shown to be the root cause of many security vulnerabilities. To mitigate this issue, we have developed DELDROID, an automated system for determination of least privilege architecture in Android and its enforcement at runtime. A key contribution of our approach is the ability to limit the privileges granted to apps without the need to modify them. DELDROID utilizes static program analysis techniques to extract the exact privileges each component needs for providing its functionality. A Multiple-Domain Matrix representation of the system's architecture is then used to automatically analyze the security posture of the system and derive its least-privilege architecture. Our experiments on hundreds of real world apps corroborate DELDROID's ability in effectively establishing the least-privilege architecture and its benefits in alleviating the security threats.

Mahmood, Riyadh, Mirzaei, Nariman, Malek, Sam.  2014.  EvoDroid: Segmented Evolutionary Testing of Android Apps. FSE 2014 Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering.

Proliferation of Android devices and apps has created a demand for applicable automated software testing techniques. Prior research has primarily focused on either unit or GUI testing of Android apps, but not their end-to-end system testing in a systematic manner. We present EvoDroid, an evolutionary approach for system testing of Android apps. EvoDroid overcomes a key shortcoming of using evolutionary techniques for system testing, i.e., the inability to pass on genetic makeup of good individuals in the search. To that end, EvoDroid combines two novel techniques: (1) an Android-specific program analysis technique that identifies the segments of the code amenable to be searched independently, and (2) an evolutionary algorithm that given information of such segments performs a step-wise search for test cases reaching deep into the code. Our experiments have corroborated EvoDroid’s ability to achieve significantly higher code coverage than existing Android testing tools.