Biblio

Filters: First Letter Of Last Name is W
A
Subramani, Shweta, Vouk, Mladen A., Williams, Laurie.  2014.  An Analysis of Fedora Security Profile. HotSoS 2014 Symposium and Bootcamp on the Science of Security (HotSoS). :169-71.

In our previous work we showed that for Fedora, under normal operational conditions, security problem discovery appears to be a random process. While in the case of Fedora, and a number of other open source products, classical reliability models can be adapted to estimate the number of residual security problems under “normal” operational usage (not attacks), the predictive ability of the model is lower for security faults due to the rarity of security events and because there appears to be no real security reliability growth. The ratio of security to non-security faults is an indicator that the process needs improving, but it also may be leveraged to assess the vulnerability profile of a release and possibly guide testing of its next version. We manually analyzed randomly sampled problems for four different versions of Fedora and classified them into security vulnerability categories. We also analyzed the distribution of these problems over the software’s lifespan and we found that they exhibit a symmetry which can perhaps be used in estimating the number of residual security problems in the software. Based on our work, we believe that an approach to vulnerability elimination based on a combination of “classical” and some non-operational “bounded” high-assurance testing along the lines discussed in may yield good vulnerability elimination results, as well as a way of estimating the vulnerability level of a release. Classical SRE methods, metrics and models can be used to track both non-security and security problem detection under a normal operational profile. We can then model the reliability growth, if any, and estimate the number of residual faults by estimating the lower and upper bounds on the total number of faults of a certain type. In production, there may be a simpler alternative: just count the vulnerabilities and project over the next period, assuming a constant vulnerability discovery rate.
In the testing phase, to accelerate the process, one might leverage collected vulnerability information to generate non-operational test cases aimed at vulnerability categories. The observed distributions of security problems reported under normal “operational” usage appear to support the above approach – i.e., what is learned, say, in the first x weeks can then be leveraged in selecting test cases in the next stage. Similarly, what is learned about a product Y weeks after its release may be very indicative of its vulnerability profile for the rest of its life, given the assumption of a constant vulnerability discovery rate.
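
The constant-rate projection mentioned in this abstract can be illustrated with a small sketch. This is not code from the paper; the function name, parameters, and sample numbers are hypothetical, and the only assumption carried over from the abstract is that the vulnerability discovery rate stays constant.

```python
def project_vulnerabilities(observed_count: int,
                            observed_weeks: float,
                            horizon_weeks: float) -> float:
    """Project how many new vulnerability reports to expect over a future
    period, assuming the discovery rate observed so far stays constant.

    All names here are illustrative; they do not come from the paper.
    """
    if observed_weeks <= 0:
        raise ValueError("observed_weeks must be positive")
    if observed_count < 0 or horizon_weeks < 0:
        raise ValueError("counts and horizons must be non-negative")
    # Constant rate: reports per week seen during the observation window.
    rate_per_week = observed_count / observed_weeks
    # Linear extrapolation over the next period.
    return rate_per_week * horizon_weeks


if __name__ == "__main__":
    # Made-up numbers: 12 reports in the first 8 weeks, projected 4 weeks ahead.
    print(project_vulnerabilities(12, 8, 4))
```

A projection like this is only as good as the constant-rate assumption; it says nothing about reliability growth or about which vulnerability categories the new reports will fall into.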

Tingting Yu, Witawas Srisa-an, Gregg Rothermel.  2017.  An automated framework to support testing for process-level race conditions. Software: Testing, Verification, and Reliability.

Race conditions are difficult to detect because they usually occur only under specific execution interleavings. Numerous program analysis and testing techniques have been proposed to detect race conditions between threads on single applications. However, most of these techniques neglect races that occur at the process level due to complex system event interactions. This article presents a framework, SIMEXPLORER, that allows engineers to effectively test for process-level race conditions. SIMEXPLORER first uses dynamic analysis techniques to observe system execution, identify program locations of interest, and report faults related to oracles. Next, it uses virtualization to achieve the fine-grained controllability needed to exercise event interleavings that are likely to expose races. We evaluated the effectiveness of SIMEXPLORER on 24 real-world applications containing both known and unknown process-level race conditions. Our results show that SIMEXPLORER is effective at detecting these race conditions, while incurring an overhead that is acceptable given its effectiveness improvements.

C
Burcham, Morgan, Al-Zyoud, Mahran, Carver, Jeffrey C., Alsaleh, Mohammed, Du, Hongying, Gilani, Fida, Jiang, Jun, Rahman, Akond, Kafalı, Özgür, Al-Shaer, Ehab et al.  2017.  Characterizing Scientific Reporting in Security Literature: An Analysis of ACM CCS and IEEE S&P Papers. Proceedings of the Hot Topics in Science of Security: Symposium and Bootcamp. :13–23.

Scientific advancement is fueled by solid fundamental research, followed by replication, meta-analysis, and theory building. To support such advancement, researchers and government agencies have been working towards a "science of security". As in other sciences, security science requires high-quality fundamental research addressing important problems and reporting approaches that capture the information necessary for replication, meta-analysis, and theory building. The goal of this paper is to aid security researchers in establishing a baseline of the state of scientific reporting in security through an analysis of indicators of scientific research as reported in top security conferences, specifically the 2015 ACM CCS and 2016 IEEE S&P proceedings. To conduct this analysis, we employed a series of rubrics to analyze the completeness of information reported in papers relative to the type of evaluation used (e.g. empirical study, proof, discussion). Our findings indicated some important information is often missing from papers, including explicit documentation of research objectives and the threats to validity. Our findings show a relatively small number of replications reported in the literature. We hope that this initial analysis will serve as a baseline against which we can measure the advancement of the science of security.

Rivers, Anthony T., Vouk, Mladen A., Williams, Laurie.  2014.  On Coverage-Based Attack Profiles. Eighth International Conference on Software Security and Reliability (SERE). :5-6.

Automated cyber attacks tend to be schedule and resource limited. The primary progress metric is often “coverage” of pre-determined “known” vulnerabilities that may not have been patched, along with possible zero-day exploits (if such exist). We present and discuss a hypergeometric process model that describes such attack patterns. We used web request signatures from the logs of a production web server to assess the applicability of the model.
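
As a rough illustration of what a hypergeometric view of coverage-limited attacks could look like, the sketch below computes the standard hypergeometric probability of hitting unpatched vulnerabilities. It is a textbook formula, not the model from the paper; the interpretation of the parameters (a pool of known vulnerabilities, a subset still unpatched on the target, and a limited number of distinct probes) is an assumption made for this example.

```python
from math import comb


def hypergeom_pmf(k: int, population: int, successes: int, draws: int) -> float:
    """Probability of exactly k hits when `draws` distinct items are sampled
    without replacement from `population` items, `successes` of which count
    as hits (standard hypergeometric distribution)."""
    if not (0 <= successes <= population and 0 <= draws <= population):
        raise ValueError("need 0 <= successes, draws <= population")
    # Outside the feasible range the probability is zero.
    lowest = max(0, draws - (population - successes))
    highest = min(draws, successes)
    if k < lowest or k > highest:
        return 0.0
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def prob_at_least_one_hit(population: int, successes: int, draws: int) -> float:
    """Chance that a resource-limited attacker finds at least one unpatched
    vulnerability (assumed reading of the parameters, see lead-in)."""
    return 1.0 - hypergeom_pmf(0, population, successes, draws)


if __name__ == "__main__":
    # Hypothetical scenario: 10 known vulnerabilities, 3 unpatched, 2 probes.
    print(hypergeom_pmf(1, 10, 3, 2))
    print(prob_at_least_one_hit(10, 3, 2))
```

Sampling without replacement is what distinguishes this from a binomial model: each probe that misses reduces the pool of untried vulnerabilities, so the chance of a hit rises as coverage grows.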

D
Riaz, Maria, Breaux, Travis, Williams, Laurie, Niu, Jianwei.  2012.  On the Design of Empirical Studies to Evaluate Software Patterns: A Survey.

Software patterns are created with the goal of capturing expert knowledge so it can be efficiently and effectively shared with the software development community. However, patterns in practice may or may not achieve these goals. Empirical studies of the use of software patterns can help in providing deeper insight into whether these goals have been met. The objective of this paper is to aid researchers in designing empirical studies of software patterns by summarizing the study designs of software patterns available in the literature. The important components of these study designs include the evaluation criteria and how the patterns are presented to study participants. We select and analyze 19 distinct empirical studies and identify 17 independent variables in three different categories (participant demographics; pattern presentation; problem presentation). We also extract 10 evaluation criteria with 23 associated observable measures. Additionally, by synthesizing the reported observations, we identify challenges faced during study execution. Provision of multiple domain-specific examples of pattern application and tool support to assist in pattern selection are helpful for the study participants in understanding and completing the study task. Capturing data regarding the cognitive processes of participants can provide insights into the findings of the study.

Aiping Xiong, R. W. Proctor, Weining Yang, Ninghui Li.  2017.  Is Domain Highlighting Actually Helpful in Identifying Phishing Webpages? Human Factors: The Journal of the Human Factors and Ergonomics Society.

Objective: To evaluate the effectiveness of domain highlighting in helping users identify whether webpages are legitimate or spurious.

Background: As a component of the URL, a domain name can be overlooked. Consequently, browsers highlight the domain name to help users identify which website they are visiting. Nevertheless, few studies have assessed the effectiveness of domain highlighting, and the only formal study confounded highlighting with instructions to look at the address bar. 

Method: We conducted two phishing detection experiments. Experiment 1 was run online: Participants judged the legitimacy of webpages in two phases. In phase one, participants were to judge the legitimacy based on any information on the webpage, whereas in phase two they were to focus on the address bar. Whether the domain was highlighted was also varied. Experiment 2 was conducted similarly but with participants in a laboratory setting, which allowed tracking of fixations.

Results: Participants differentiated the legitimate and fraudulent webpages better than chance. There was some benefit of attending to the address bar, but domain highlighting did not provide effective protection against phishing attacks. Analysis of eye-gaze fixation measures was in agreement with the task performance, but heat-map results revealed that participants’ visual attention was attracted by the highlighted domains.

Conclusion: Failure to detect many fraudulent webpages even when the domain was highlighted implies that users lacked knowledge of webpage security cues or how to use those cues.

Zhiqiang Li, Lichao Sun, Qiben Yan, Witawas Srisa-an, Zhenxiang Chen.  2016.  DroidClassifier: Efficient Adaptive Mining of Application-Layer Header for Classifying Android Malware. 12th EAI International Conference on Security and Privacy in Communication Networks.

A recent report has shown that there are more than 5,000 malicious applications created for Android devices each day. This creates a need for researchers to develop effective and efficient malware classification and detection approaches. To address this need, we introduce DroidClassifier: a systematic framework for classifying network traffic generated by mobile malware. Our approach utilizes network traffic analysis to construct multiple models in an automated fashion using a supervised method over a set of labeled malware network traffic (the training dataset). Each model is built by extracting common identifiers from multiple HTTP header fields. Adaptive thresholds are designed to capture the disparate characteristics of different malware families. Clustering is then used to improve the classification efficiency. Finally, we aggregate the multiple models to construct a holistic model to conduct cluster-level malware classification. We then perform a comprehensive evaluation of DroidClassifier by using 706 malware samples as the training set and 657 malware samples and 5,215 benign apps as the testing set. Collectively, these malicious and benign apps generate 17,949 network flows. The results show that DroidClassifier successfully identifies over 90% of different families of malware with more than 90% accuracy with accessible computational cost. Thus, DroidClassifier can facilitate network management in a large network, and enable unobtrusive detection of mobile malware. By focusing on analyzing network behaviors, we expect DroidClassifier to work with reasonable accuracy for other mobile platforms such as iOS and Windows Mobile as well.

E
Yutaka Tsutano, Shakthi Bachala, Witawas Srisa-an, Gregg Rothermel, Jackson Dinh.  2017.  An Efficient, Robust, and Scalable Approach for Analyzing Interacting Android Apps. 39th International Conference on Software Engineering.

When multiple apps on an Android platform interact, faults and security vulnerabilities can occur. Software engineers need to be able to analyze interacting apps to detect such problems. Current approaches for performing such analyses, however, do not scale to the numbers of apps that may need to be considered, and thus, are impractical for application to real-world scenarios. In this paper, we introduce JITANA, a program analysis framework designed to analyze multiple Android apps simultaneously. By using a classloader-based approach instead of a compiler-based approach such as SOOT, JITANA is able to simultaneously analyze large numbers of interacting apps, perform on-demand analysis of large libraries, and effectively analyze dynamically generated code. Empirical studies of JITANA show that it is substantially more efficient than a state-of-the-art approach, and that it can effectively and efficiently analyze complex apps including Facebook, Pokémon Go, and Pandora that the state-of-the-art approach cannot handle.

Junjie Qian, Hong Jiang, Witawas Srisa-an, Sharad Seth.  2017.  Energy-efficient I/O Thread Schedulers for NVMe SSDs on NUMA. CCGrid '17 Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.

Non-volatile memory express (NVMe) based SSDs and the NUMA platform are widely adopted in servers to achieve faster storage speed and more powerful processing capability. As of now, very little research has been conducted to investigate the performance and energy efficiency of the state-of-the-art NUMA architecture integrated with NVMe SSDs, an emerging technology used to host parallel I/O threads. As this technology continues to be widely developed and adopted, we need to understand the runtime behaviors of such systems in order to design software runtime systems that deliver optimal performance while consuming only the necessary amount of energy. This paper characterizes the runtime behaviors of a Linux-based NUMA system employing multiple NVMe SSDs. Our comprehensive performance and energy-efficiency study using massive numbers of parallel I/O threads shows that the penalty due to CPU contention is much smaller than that due to remote access of NVMe SSDs. Based on this insight, we develop a dynamic “lesser evil” algorithm called ESN, to minimize the impact of these two types of penalties. ESN is an energy-efficient profiling-based I/O thread scheduler for managing I/O threads accessing NVMe SSDs on NUMA systems. Our empirical evaluation shows that ESN can achieve optimal I/O throughput and latency while consuming up to 50% less energy and using fewer CPUs.

Waqar Ahmad, Joshua Sunshine, Christian Kästner, Adam Wynne.  2015.  Enforcing Fine-Grained Security and Privacy Policies in an Ecosystem within an Ecosystem. Systems, Programming, Languages and Applications: Software for Humanity (SPLASH).

Smart home automation and IoT promise to bring many advantages but they also expose their users to certain security and privacy vulnerabilities. For example, leaking the information about the absence of a person from home or the medicine somebody is taking may have serious security and privacy consequences for home users and potential legal implications for providers of home automation and IoT platforms. We envision that a new ecosystem within an existing smartphone ecosystem will be a suitable platform for distribution of apps for smart home and IoT devices. Android is increasingly becoming a popular platform for smart home and IoT devices and applications. Built-in security mechanisms in ecosystems such as Android have limitations that can be exploited by malicious apps to leak users' sensitive data to unintended recipients. For instance, Android enforces that an app requires the Internet permission in order to access a web server but it does not control which servers the app talks to or what data it shares with other apps. Therefore, sub-ecosystems that enforce additional fine-grained custom policies on top of existing policies of the smartphone ecosystems are necessary for smart home or IoT platforms. To this end, we have built a tool that enforces additional policies on inter-app interactions and permissions of Android apps. We have done preliminary testing of our tool on three proprietary apps developed by a future provider of a home automation platform. Our initial evaluation demonstrates that it is possible to develop mechanisms that allow definition and enforcement of custom security policies appropriate for ecosystems such as smart home automation and IoT.

Junjie Qian, Witawas Srisa-an, Hong Jiang, Sharad Seth, Du Li, Pan Yi.  2016.  Exploiting FIFO Scheduler to Improve Parallel Garbage Collection Performance. VEE '16 12th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments.

Recent studies have found that parallel garbage collection performs worse with more CPUs and more collector threads. As part of this work, we further investigate this phenomenon and find that poor scalability is worst in highly scalable Java applications. Our investigation to find the causes clearly reveals that efficient multi-threading in an application can prolong the average object lifespan, which results in less effective garbage collection. We also find that prolonging lifespan is the direct result of Linux's Completely Fair Scheduler due to its round-robin-like behavior that can increase the heap contention between the application threads. Instead, if we use pseudo first-in-first-out to schedule application threads in large multicore systems, the garbage collection scalability is significantly improved while the time spent in garbage collection is reduced by as much as 21%. The average execution time of the 24 Java applications used in our study is also reduced by 11%. Based on this observation, we propose two approaches to optimally select scheduling policies based on application scalability profile. Our first approach uses the profile information from one execution to tune the subsequent executions. Our second approach dynamically collects profile information and performs policy selection during execution.

F
Hibshi, Hanan, Breaux, Travis, Riaz, Maria, Williams, Laurie.  2014.  A Framework to Measure Experts’ Decision Making in Security Requirements Analysis. IEEE 1st International Workshop on Evolving Security and Privacy Requirements Engineering.

Research shows that commonly accepted security requirements are not generally applied in practice. Instead of relying on requirements checklists, security experts rely on their expertise and background knowledge to identify security vulnerabilities. To understand the gap between available checklists and practice, we conducted a series of interviews to encode the decision-making process of security experts and novices during security requirements analysis. Participants were asked to analyze two types of artifacts, source code and network diagrams, for vulnerabilities and to apply a requirements checklist to mitigate some of those vulnerabilities. We framed our study using Situation Awareness, a cognitive theory from psychology, to elicit responses that we later analyzed using coding theory and grounded analysis. We report our preliminary results of analyzing two interviews that reveal possible decision-making patterns that could characterize how analysts perceive, comprehend and project future threats, which leads them to decide upon requirements and their specifications, in addition to how experts use assumptions to overcome ambiguity in specifications. Our goal is to build a model that researchers can use to evaluate their security requirements methods against how experts transition through different situation awareness levels in their decision-making process.

G
Michael Coblenz, Whitney Nelson, Jonathan Aldrich, Brad Myers, Joshua Sunshine.  2017.  Glacier: Transitive Class Immutability for Java. 39th International Conference on Software Engineering.

Though immutability has been long-proposed as a way to prevent bugs in software, little is known about how to make immutability support in programming languages effective for software engineers. We designed a new formalism that extends Java to support transitive class immutability, the form of immutability for which there is the strongest empirical support, and implemented that formalism in a tool called Glacier. We applied Glacier successfully to two real-world systems. We also compared Glacier to Java’s final in a user study of twenty participants. We found that even after being given instructions on how to express immutability with final, participants who used final were unable to express immutability correctly, whereas almost all participants who used Glacier succeeded. We also asked participants to make specific changes to immutable classes and found that participants who used final all incorrectly mutated immutable state, whereas almost all of the participants who used Glacier succeeded. Glacier represents a promising approach to enforcing immutability in Java and provides a model for enforcement in other languages.

I
Michael Maass, William Scherlis, Jonathan Aldrich.  2014.  In-Nimbo Sandboxing. HotSoS '14 Proceedings of the 2014 Symposium and Bootcamp on the Science of Security.

Sandboxes impose a security policy, isolating applications and their components from the rest of a system. While many sandboxing techniques exist, state of the art sandboxes generally perform their functions within the system that is being defended. As a result, when the sandbox fails or is bypassed, the security of the surrounding system can no longer be assured. We experiment with the idea of in-nimbo sandboxing, encapsulating untrusted computations away from the system we are trying to protect. The idea is to delegate computations that may be vulnerable or malicious to virtual machine instances in a cloud computing environment.

This may not reduce the possibility of an in-situ sandbox compromise, but it could significantly reduce the consequences should that possibility be realized. To achieve this advantage, there are additional requirements, including: (1) A regulated channel between the local and cloud environments that supports interaction with the encapsulated application, (2) Performance design that acceptably minimizes latencies in excess of the in-situ baseline.

To test the feasibility of the idea, we built an in-nimbo sandbox for Adobe Reader, an application that historically has been subject to significant attacks. We undertook a prototype deployment with PDF users in a large aerospace firm. In addition to thwarting several examples of existing PDF-based malware, we found that the added increment of latency, perhaps surprisingly, does not overly impair the user experience with respect to performance or usability.

Waqar Ahmad, Christian Kästner, Joshua Sunshine, Jonathan Aldrich.  2016.  Inter-app Communication in Android: Developer Challenges. 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories. :177-188.

The Android platform is designed to support mutually untrusted third-party apps, which run as isolated processes but may interact via platform-controlled mechanisms, called Intents. Interactions among third-party apps are intended and can contribute to a rich user experience, for example, the ability to share pictures from one app with another. The Android platform presents an interesting point in a design space of module systems that is biased toward isolation, extensibility, and untrusted contributions. The Intent mechanism essentially provides message channels among modules, in which the set of message types is extensible. However, the module system has design limitations including the lack of consistent mechanisms to document message types, very limited checking that a message conforms to its specifications, the inability to explicitly declare dependencies on other modules, and the lack of checks for backward compatibility as message types evolve over time. In order to understand the degree to which these design limitations result in real issues, we studied a broad corpus of apps and cross-validated our results against app documentation and Android support forums. Our findings suggest that design limitations do indeed cause development problems. Based on our results, we outline further research questions and propose possible mitigation strategies.

M
West, Andrew, Aviv, Adam.  2014.  Measuring Privacy Disclosures in URL Query Strings. IEEE Internet Computing. 18(6).
N
Subramani, Shweta, Vouk, Mladen A., Williams, Laurie.  2013.  Non-Operational Testing of Software for Security Issues. ISSRE 2013. :21-22.

We have been studying extension of the classical Software Reliability Engineering (SRE) methodology into the security space. We combine “classical” reliability modeling, when applied to reported vulnerabilities found under “normal” operational profile conditions, with safety oriented fault management processes. We illustrate with open source Fedora software.

Our initial results appear to indicate that generation of a repeatable automated test-strategy that would explicitly cover the “top 25” security problems may help considerably – eliminating perhaps as much as 50% of the field observable problems. However, genuine aleatoric and more process oriented incomplete analysis and design flaws remain. While we have made some progress in identifying focus areas, a number of questions remain, and we continue working on them.

P
Adwait Nadkarni, Benjamin Andow, William Enck, Somesh Jha.  2016.  Practical DIFC Enforcement on Android. USENIX Security Symposium.

Smartphone users often use private and enterprise data with untrusted third party applications. The fundamental lack of secrecy guarantees in smartphone OSes, such as Android, exposes this data to the risk of unauthorized exfiltration. A natural solution is the integration of secrecy guarantees into the OS. In this paper, we describe the challenges for decentralized information flow control (DIFC) enforcement on Android. We propose context-sensitive DIFC enforcement via lazy polyinstantiation and practical and secure network export through domain declassification. Our DIFC system, Weir, is backwards compatible by design, and incurs less than 4 ms overhead for component startup. With Weir, we demonstrate practical and secure DIFC enforcement on Android.

Rahman, Akond, Pradhan, Priysha, Partho, Asif, Williams, Laurie.  2017.  Predicting Android Application Security and Privacy Risk with Static Code Metrics. Proceedings of the 4th International Conference on Mobile Software Engineering and Systems. :149–153.

Android applications pose security and privacy risks for end-users. These risks are often quantified by performing dynamic analysis and permission analysis of the Android applications after release. Prediction of security and privacy risks associated with Android applications at early stages of application development, e.g. when the developer(s) are writing the code of the application, might help Android application developers in releasing applications to end-users that have less security and privacy risk. The goal of this paper is to aid Android application developers in assessing the security and privacy risk associated with Android applications by using static code metrics as predictors. In our paper, we consider the security and privacy risk of an Android application as how susceptible the application is to leaking private information of end-users and to releasing vulnerabilities. We investigate how effectively static code metrics that are extracted from the source code of Android applications can be used to predict security and privacy risk of Android applications. We collected 21 static code metrics of 1,407 Android applications, and used the collected static code metrics to predict security and privacy risk of the applications. As the oracle of security and privacy risk, we used Androrisk, a tool that quantifies the amount of security and privacy risk of an Android application using analysis of Android permissions and dynamic analysis. To accomplish our goal, we used statistical learners such as radial-based support vector machine (r-SVM). For r-SVM, we observe a precision of 0.83. Findings from our paper suggest that with proper selection of static code metrics, r-SVM can be used effectively to predict security and privacy risk of Android applications.

West, Andrew G, Aviv, Adam J.  2014.  On the Privacy Concerns of URL Query Strings. W2SP'14: Proceedings of the 8th Workshop on Web 2.0 Security and Privacy.
R
Supat Rattanasuksun, Tingting Yu, Witawas Srisa-an, Gregg Rothermel.  2016.  RRF: A Race Reproduction Framework for Use in Debugging Process-Level Races. 27th International Symposium on Software Reliability Engineering (ISSRE).

Process-level races are endemic in modern systems. These races are difficult to debug because they are sensitive to execution events such as interrupts and scheduling. Unless a process interleaving that can result in the race can be found, it cannot be reproduced and cannot be corrected. In practice, however, the number of interleavings that can occur among processes is large, and the patterns of interleavings can be complex. Thus, approaches for reproducing process-level races to date are often ineffective. In this paper, we present RRF, a race reproduction framework that can help software engineers reproduce reported process-level races, enabling them to potentially debug these races. RRF performs a hybrid analysis by leveraging existing static program analysis tools, dynamic kernel event reporting tools, and yield points to provide the observability and controllability needed to reproduce races. We conducted an empirical study to evaluate RRF; our results show that RRF can be effective for reproducing races.

S
Lichao Sun, Zhiqiang Li, Qiben Yan, Witawas Srisa-an, Yu Pan.  2016.  SigPID: Significant Permission Identification for Android Malware Detection. 11th International Conference on Malicious and Unwanted Software (MALCON 2016).

A recent report indicates that a newly developed malicious app for Android is introduced every 11 seconds. To combat this alarming rate of malware creation, we need a scalable malware detection approach that is effective and efficient. In this paper, we introduce SIGPID, a malware detection system based on permission analysis to cope with the rapid increase in the number of Android malware. Instead of analyzing all 135 Android permissions, our approach applies 3-level pruning by mining the permission data to identify only significant permissions that can be effective in distinguishing benign and malicious apps. SIGPID then utilizes classification algorithms to classify different families of malware and benign apps. Our evaluation finds that only 22 out of 135 permissions are significant. We then compare the performance of our approach, using only 22 permissions, against a baseline approach that analyzes all permissions. The results indicate that when Support Vector Machine (SVM) is used as the classifier, we can achieve over 90% of precision, recall, accuracy, and F-measure, which are about the same as those produced by the baseline approach while incurring the analysis times that are 4 to 32 times smaller than those of using all 135 permissions. When we compare the detection effectiveness of SIGPID to those of other approaches, SIGPID can detect 93.62% of malware in the data set, and 91.4% unknown malware.

Tingting Yu, Witawas Srisa-an, Gregg Rothermel.  2014.  SimRT: An Automated Framework to Support Regression Testing for Data Races. ICSE 2014 Proceedings of the 36th International Conference on Software Engineering.

Concurrent programs are prone to various classes of difficult-to-detect faults, of which data races are particularly prevalent. Prior work has attempted to increase the cost-effectiveness of approaches for testing for data races by employing race detection techniques, but to date, no work has considered cost-effective approaches for re-testing for races as programs evolve. In this paper we present SimRT, an automated regression testing framework for use in detecting races introduced by code modifications. SimRT employs a regression test selection technique, focused on sets of program elements related to race detection, to reduce the number of test cases that must be run on a changed program to detect races that occur due to code modifications, and it employs a test case prioritization technique to improve the rate at which such races are detected. Our empirical study of SimRT reveals that it is more efficient and effective for revealing races than other approaches, and that its constituent test selection and prioritization components each contribute to its performance.

Junjie Qian, Witawas Srisa-an, Du Li, Hong Jiang, Sharad Seth, Yaodong Yang.  2015.  SmartStealing: Analysis and Optimization of Work Stealing in Parallel Garbage Collection for Java VM. Principles and Practice of Programming in Java (PPPJ).

Parallel garbage collection has been used to speed up the collection process on multicore architectures. Similar to other parallel techniques, balancing the workload among threads is critical to ensuring good overall collection performance. To this end, work stealing is employed by the current state-of-the-art Java Virtual Machine, OpenJDK, to keep GC threads from idling during a collection process. However, we found that the current algorithm is not efficient. Its usage can often cause GC performance to be worse than when work stealing is not used. In this paper, we identify three factors that affect work stealing efficiency: determining tasks that can benefit from stealing, frequency with which to attempt stealing, and performance impacts of failed stealing attempts. Based on this analysis, we propose SmartStealing, a new algorithm that can automatically decide whether to attempt stealing at a particular point during execution. If stealing is attempted, it can efficiently identify a task to steal from. We then compare the collection performances when (i) the default work stealing algorithm is used, (ii) work stealing is not used at all, and (iii) the SmartStealing approach is used. Without modifying the remaining garbage collection system, the evaluation result shows that SmartStealing can reduce the parallel GC execution time for 19 of the 21 benchmarks. The average reduction is 50.4% and the highest reduction is 78.7%. We also investigate the performances of SmartStealing on NUMA and UMA architectures.