Biblio

Filters: Keyword is program debugging  [Clear All Filters]
2021-03-01
Golagha, M., Pretschner, A., Briand, L. C..  2020.  Can We Predict the Quality of Spectrum-based Fault Localization? 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST). :4–15.
Fault localization and repair are time-consuming and tedious. There is a significant and growing need for automated techniques to support such tasks. Despite significant progress in this area, existing fault localization techniques are not widely applied in practice yet and their effectiveness varies greatly from case to case. Existing work suggests new algorithms and ideas as well as adjustments to the test suites to improve the effectiveness of automated fault localization. However, important questions remain open: Why is the effectiveness of these techniques so unpredictable? What are the factors that influence the effectiveness of fault localization? Can we accurately predict fault localization effectiveness? In this paper, we try to answer these questions by collecting 70 static, dynamic, test suite, and fault-related metrics that we hypothesize are related to effectiveness. Our analysis shows that a combination of only a few static, dynamic, and test metrics enables the construction of a prediction model with excellent discrimination power between levels of effectiveness (eight metrics yielding an AUC of .86; fifteen metrics yielding an AUC of.88). The model hence yields a practically useful confidence factor that can be used to assess the potential effectiveness of fault localization. Given that the metrics are the most influential metrics explaining the effectiveness of fault localization, they can also be used as a guide for corrective actions on code and test suites leading to more effective fault localization.
2020-03-09
Li, Chi, Zhou, Min, Gu, Zuxing, Gu, Ming, Zhang, Hongyu.  2019.  Ares: Inferring Error Specifications through Static Analysis. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :1174–1177.
Misuse of APIs happens frequently due to misunderstanding of API semantics and lack of documentation. An important category of API-related defects is the error handling defects, which may result in security and reliability flaws. These defects can be detected with the help of static program analysis, provided that error specifications are known. The error specification of an API function indicates how the function can fail. Writing error specifications manually is time-consuming and tedious. Therefore, automatic inferring the error specification from API usage code is preferred. In this paper, we present Ares, a tool for automatic inferring error specifications for C code through static analysis. We employ multiple heuristics to identify error handling blocks and infer error specifications by analyzing the corresponding condition logic. Ares is evaluated on 19 real world projects, and the results reveal that Ares outperforms the state-of-the-art tool APEx by 37% in precision. Ares can also identify more error specifications than APEx. Moreover, the specifications inferred from Ares help find dozens of API-related bugs in well-known projects such as OpenSSL, among them 10 bugs are confirmed by developers. Video: https://youtu.be/nf1QnFAmu8Q. Repository: https://github.com/lc3412/Ares.
2020-08-14
Gu, Zuxing, Wu, Jiecheng, Liu, Jiaxiang, Zhou, Min, Gu, Ming.  2019.  An Empirical Study on API-Misuse Bugs in Open-Source C Programs. 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC). 1:11—20.
Today, large and complex software is developed with integrated components using application programming interfaces (APIs). Correct usage of APIs in practice presents a challenge due to implicit constraints, such as call conditions or call orders. API misuse, i.e., violation of these constraints, is a well-known source of bugs, some of which can cause serious security vulnerabilities. Although researchers have developed many API-misuse detectors over the last two decades, recent studies show that API misuses are still prevalent. In this paper, we provide a comprehensive empirical study on API-misuse bugs in open-source C programs. To understand the nature of API misuses in practice, we analyze 830 API-misuse bugs from six popular programs across different domains. For all the studied bugs, we summarize their root causes, fix patterns and usage statistics. Furthermore, to understand the capabilities and limitations of state-of-the-art static analysis detectors for API-misuse detection, we develop APIMU4C, a dataset of API-misuse bugs in C code based on our empirical study results, and evaluate three widely-used detectors on it qualitatively and quantitatively. We share all the findings and present possible directions towards more powerful API-misuse detectors.
2020-10-26
Leach, Kevin, Dougherty, Ryan, Spensky, Chad, Forrest, Stephanie, Weimer, Westley.  2019.  Evolutionary Computation for Improving Malware Analysis. 2019 IEEE/ACM International Workshop on Genetic Improvement (GI). :18–19.
Research in genetic improvement (GI) conventionally focuses on the improvement of software, including the automated repair of bugs and vulnerabilities as well as the refinement of software to increase performance. Eliminating or reducing vulnerabilities using GI has improved the security of benign software, but the growing volume and complexity of malicious software necessitates better analysis techniques that may benefit from a GI-based approach. Rather than focus on the use of GI to improve individual software artifacts, we believe GI can be applied to the tools used to analyze malicious code for its behavior. First, malware analysis is critical to understanding the damage caused by an attacker, which GI-based bug repair does not currently address. Second, modern malware samples leverage complex vectors for infection that cannot currently be addressed by GI. In this paper, we discuss an application of genetic improvement to the realm of automated malware analysis through the use of variable-strength covering arrays.
2020-08-14
Zolfaghari, Majid, Salimi, Solmaz, Kharrazi, Mehdi.  2019.  Inferring API Correct Usage Rules: A Tree-based Approach. 2019 16th International ISC (Iranian Society of Cryptology) Conference on Information Security and Cryptology (ISCISC). :78—84.
The lack of knowledge about API correct usage rules is one of the main reasons that APIs are employed incorrectly by programmers, which in some cases lead to serious security vulnerabilities. However, finding a correct usage rule for an API is a time-consuming and error-prone task, particularly in the absence of an API documentation. Existing approaches to extract correct usage rules are mostly based on majority API usages, assuming the correct usage is prevalent. Although statistically extracting API correct usage rules achieves reasonable accuracy, it cannot work correctly in the absence of a fair amount of sample usages. We propose inferring API correct usage rules independent of the number of sample usages by leveraging an API tree structure. In an API tree, each node is an API, and each node's children are APIs called by the parent API. Starting from lower-level APIs, it is possible to infer the correct usage rules for them by utilizing the available correct usage rules of their children. We developed a tool based on our idea for inferring API correct usages rules hierarchically, and have applied it to the source code of Linux kernel v4.3 drivers and found 24 previously reported bugs.
2020-07-20
Fowler, Daniel S., Bryans, Jeremy, Cheah, Madeline, Wooderson, Paul, Shaikh, Siraj A..  2019.  A Method for Constructing Automotive Cybersecurity Tests, a CAN Fuzz Testing Example. 2019 IEEE 19th International Conference on Software Quality, Reliability and Security Companion (QRS-C). :1–8.
There is a need for new tools and techniques to aid automotive engineers performing cybersecurity testing on connected car systems. This is in order to support the principle of secure-by-design. Our research has produced a method to construct useful automotive security tooling and tests. It has been used to implement Controller Area Network (CAN) fuzz testing (a dynamic security test) via a prototype CAN fuzzer. The black-box fuzz testing of a laboratory vehicle's display ECU demonstrates the value of a fuzzer in the automotive field, revealing bugs in the ECU software, and weaknesses in the vehicle's systems design.
2020-09-28
Piskachev, Goran, Nguyen Quang Do, Lisa, Johnson, Oshando, Bodden, Eric.  2019.  SWAN\_ASSIST: Semi-Automated Detection of Code-Specific, Security-Relevant Methods. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :1094–1097.
To detect specific types of bugs and vulnerabilities, static analysis tools must be correctly configured with security-relevant methods (SRM), e.g., sources, sinks, sanitizers and authentication methods-usually a very labour-intensive and error-prone process. This work presents the semi-automated tool SWAN\_ASSIST, which aids the configuration with an IntelliJ plugin based on active machine learning. It integrates our novel automated machine-learning approach SWAN, which identifies and classifies Java SRM. SWAN\_ASSIST further integrates user feedback through iterative learning. SWAN\_ASSIST aids developers by asking them to classify at each point in time exactly those methods whose classification best impact the classification result. Our experiments show that SWAN\_ASSIST classifies SRM with a high precision, and requires a relatively low effort from the user. A video demo of SWAN\_ASSIST can be found at https://youtu.be/fSyD3V6EQOY. The source code is available at https://github.com/secure-software-engineering/swan.
2020-04-20
Huang, Zhen, Lie, David, Tan, Gang, Jaeger, Trent.  2019.  Using Safety Properties to Generate Vulnerability Patches. 2019 IEEE Symposium on Security and Privacy (SP). :539–554.
Security vulnerabilities are among the most critical software defects in existence. When identified, programmers aim to produce patches that prevent the vulnerability as quickly as possible, motivating the need for automatic program repair (APR) methods to generate patches automatically. Unfortunately, most current APR methods fall short because they approximate the properties necessary to prevent the vulnerability using examples. Approximations result in patches that either do not fix the vulnerability comprehensively, or may even introduce new bugs. Instead, we propose property-based APR, which uses human-specified, program-independent and vulnerability-specific safety properties to derive source code patches for security vulnerabilities. Unlike properties that are approximated by observing the execution of test cases, such safety properties are precise and complete. The primary challenge lies in mapping such safety properties into source code patches that can be instantiated into an existing program. To address these challenges, we propose Senx, which, given a set of safety properties and a single input that triggers the vulnerability, detects the safety property violated by the vulnerability input and generates a corresponding patch that enforces the safety property and thus, removes the vulnerability. Senx solves several challenges with property-based APR: it identifies the program expressions and variables that must be evaluated to check safety properties and identifies the program scopes where they can be evaluated, it generates new code to selectively compute the values it needs if calling existing program code would cause unwanted side effects, and it uses a novel access range analysis technique to avoid placing patches inside loops where it could incur performance overhead. Our evaluation shows that the patches generated by Senx successfully fix 32 of 42 real-world vulnerabilities from 11 applications including various tools or libraries for manipulating graphics/media files, a programming language interpreter, a relational database engine, a collection of programming tools for creating and managing binary programs, and a collection of basic file, shell, and text manipulation tools.
2020-03-09
Chhillar, Dheeraj, Sharma, Kalpana.  2019.  ACT Testbot and 4S Quality Metrics in XAAS Framework. 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon). :503–509.

The purpose of this paper is to analyze all Cloud based Service Models, Continuous Integration, Deployment and Delivery process and propose an Automated Continuous Testing and testing as a service based TestBot and metrics dashboard which will be integrated with all existing automation, bug logging, build management, configuration and test management tools. Recently cloud is being used by organizations to save time, money and efforts required to setup and maintain infrastructure and platform. Continuous Integration and Delivery is in practice nowadays within Agile methodology to give capability of multiple software releases on daily basis and ensuring all the development, test and Production environments could be synched up quickly. In such an agile environment there is need to ramp up testing tools and processes so that overall regression testing including functional, performance and security testing could be done along with build deployments at real time. To support this phenomenon, we researched on Continuous Testing and worked with industry professionals who are involved in architecting, developing and testing the software products. A lot of research has been done towards automating software testing so that testing of software product could be done quickly and overall testing process could be optimized. As part of this paper we have proposed ACT TestBot tool, metrics dashboard and coined 4S quality metrics term to quantify quality of the software product. ACT testbot and metrics dashboard will be integrated with Continuous Integration tools, Bug reporting tools, test management tools and Data Analytics tools to trigger automation scripts, continuously analyze application logs, open defects automatically and generate metrics reports. Defect pattern report will be created to support root cause analysis and to take preventive action.

2020-08-14
Singleton, Larry, Zhao, Rui, Song, Myoungkyu, Siy, Harvey.  2019.  FireBugs: Finding and Repairing Bugs with Security Patterns. 2019 IEEE/ACM 6th International Conference on Mobile Software Engineering and Systems (MOBILESoft). :30—34.

Security is often a critical problem in software systems. The consequences of the failure lead to substantial economic loss or extensive environmental damage. Developing secure software is challenging, and retrofitting existing systems to introduce security is even harder. In this paper, we propose an automated approach for Finding and Repairing Bugs based on security patterns (FireBugs), to repair defects causing security vulnerabilities. To locate and fix security bugs, we apply security patterns that are reusable solutions comprising large amounts of software design experience in many different situations. In the evaluation, we investigated 2,800 Android app repositories to apply our approach to 200 subject projects that use javax.crypto APIs. The vision of our automated approach is to reduce software maintenance burdens where the number of outstanding software defects exceeds available resources. Our ultimate vision is to design more security patterns that have a positive impact on software quality by disseminating correlated sets of best security design practices and knowledge.

2019-11-12
Vizarreta, Petra, Sakic, Ermin, Kellerer, Wolfgang, Machuca, Carmen Mas.  2019.  Mining Software Repositories for Predictive Modelling of Defects in SDN Controller. 2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM). :80-88.

In Software Defined Networking (SDN) control plane of forwarding devices is concentrated in the SDN controller, which assumes the role of a network operating system. Big share of today's commercial SDN controllers are based on OpenDaylight, an open source SDN controller platform, whose bug repository is publicly available. In this article we provide a first insight into 8k+ bugs reported in the period over five years between March 2013 and September 2018. We first present the functional components in OpenDaylight architecture, localize the most vulnerable modules and measure their contribution to the total bug content. We provide high fidelity models that can accurately reproduce the stochastic behaviour of bug manifestation and bug removal rates, and discuss how these can be used to optimize the planning of the test effort, and to improve the software release management. Finally, we study the correlation between the code internals, derived from the Git version control system, and software defect metrics, derived from Jira issue tracker. To the best of our knowledge, this is the first study to provide a comprehensive analysis of bug characteristics in a production grade SDN controller.

2020-08-14
Gu, Zuxing, Zhou, Min, Wu, Jiecheng, Jiang, Yu, Liu, Jiaxiang, Gu, Ming.  2019.  IMSpec: An Extensible Approach to Exploring the Incorrect Usage of APIs. 2019 International Symposium on Theoretical Aspects of Software Engineering (TASE). :216—223.
Application Programming Interfaces (APIs) usually have usage constraints, such as call conditions or call orders. Incorrect usage of these constraints, called API misuse, will result in system crashes, bugs, and even security problems. It is crucial to detect such misuses early in the development process. Though many approaches have been proposed over the last years, recent studies show that API misuses are still prevalent, especially the ones specific to individual projects. In this paper, we strive to improve current API-misuse detection capability for large-scale C programs. First, We propose IMSpec, a lightweight domain-specific language enabling developers to specify API usage constraints in three different aspects (i.e., parameter validation, error handling, and causal calling), which are the majority of API-misuse bugs. Then, we have tailored a constraint guided static analysis engine to automatically parse IMSpec rules and detect API-misuse bugs with rich semantics. We evaluate our approach on widely used benchmarks and real-world projects. The results show that our easily extensible approach performs better than state-of-the-art tools. We also discover 19 previously unknown bugs in real-world open-source projects, all of which have been confirmed by the corresponding developers.
2020-03-23
Pewny, Jannik, Koppe, Philipp, Holz, Thorsten.  2019.  STEROIDS for DOPed Applications: A Compiler for Automated Data-Oriented Programming. 2019 IEEE European Symposium on Security and Privacy (EuroS P). :111–126.
The wide-spread adoption of system defenses such as the randomization of code, stack, and heap raises the bar for code-reuse attacks. Thus, attackers utilize a scripting engine in target programs like a web browser to prepare the code-reuse chain, e.g., relocate gadget addresses or perform a just-in-time gadget search. However, many types of programs do not provide such an execution context that an attacker can use. Recent advances in data-oriented programming (DOP) explored an orthogonal way to abuse memory corruption vulnerabilities and demonstrated that an attacker can achieve Turing-complete computations without modifying code pointers in applications. As of now, constructing DOP exploits requires a lot of manual work-for every combination of application and payload anew. In this paper, we present novel techniques to automate the process of generating DOP exploits. We implemented a compiler called STEROIDS that leverages these techniques and compiles our high-level language SLANG into low-level DOP data structures driving malicious computations at run time. This enables an attacker to specify her intent in an application-and vulnerability-independent manner to maximize reusability. We demonstrate the effectiveness of our techniques and prototype implementation by specifying four programs of varying complexity in SLANG that calculate the Levenshtein distance, traverse a pointer chain to steal a private key, relocate a ROP chain, and perform a JIT-ROP attack. STEROIDS compiles each of those programs to low-level DOP data structures targeted at five different applications including GStreamer, Wireshark and ProFTPd, which have vastly different vulnerabilities and DOP instances. Ultimately, this shows that our compiler is versatile, can be used for both 32-bit and 64-bit applications, works across bug classes, and enables highly expressive attacks without conventional code-injection or code-reuse techniques in applications lacking a scripting engine.
2020-10-30
Basu, Kanad, Elnaggar, Rana, Chakrabarty, Krishnendu, Karri, Ramesh.  2019.  PREEMPT: PReempting Malware by Examining Embedded Processor Traces. 2019 56th ACM/IEEE Design Automation Conference (DAC). :1—6.

Anti-virus software (AVS) tools are used to detect Malware in a system. However, software-based AVS are vulnerable to attacks. A malicious entity can exploit these vulnerabilities to subvert the AVS. Recently, hardware components such as Hardware Performance Counters (HPC) have been used for Malware detection. In this paper, we propose PREEMPT, a zero overhead, high-accuracy and low-latency technique to detect Malware by re-purposing the embedded trace buffer (ETB), a debug hardware component available in most modern processors. The ETB is used for post-silicon validation and debug and allows us to control and monitor the internal activities of a chip, beyond what is provided by the Input/Output pins. PREEMPT combines these hardware-level observations with machine learning-based classifiers to preempt Malware before it can cause damage. There are many benefits of re-using the ETB for Malware detection. It is difficult to hack into hardware compared to software, and hence, PREEMPT is more robust against attacks than AVS. PREEMPT does not incur performance penalties. Finally, PREEMPT has a high True Positive value of 94% and maintains a low False Positive value of 2%.

2020-03-27
Huang, Shiyou, Guo, Jianmei, Li, Sanhong, Li, Xiang, Qi, Yumin, Chow, Kingsum, Huang, Jeff.  2019.  SafeCheck: Safety Enhancement of Java Unsafe API. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). :889–899.

Java is a safe programming language by providing bytecode verification and enforcing memory protection. For instance, programmers cannot directly access the memory but have to use object references. Yet, the Java runtime provides an Unsafe API as a backdoor for the developers to access the low- level system code. Whereas the Unsafe API is designed to be used by the Java core library, a growing community of third-party libraries use it to achieve high performance. The Unsafe API is powerful, but dangerous, which leads to data corruption, resource leaks and difficult-to-diagnose JVM crash if used improperly. In this work, we study the Unsafe crash patterns and propose a memory checker to enforce memory safety, thus avoiding the JVM crash caused by the misuse of the Unsafe API at the bytecode level. We evaluate our technique on real crash cases from the openJDK bug system and real-world applications from AJDK. Our tool reduces the efforts from several days to a few minutes for the developers to diagnose the Unsafe related crashes. We also evaluate the runtime overhead of our tool on projects using intensive Unsafe operations, and the result shows that our tool causes a negligible perturbation to the execution of the applications.

2020-02-10
Rahman, Akond, Parnin, Chris, Williams, Laurie.  2019.  The Seven Sins: Security Smells in Infrastructure as Code Scripts. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). :164–175.

Practitioners use infrastructure as code (IaC) scripts to provision servers and development environments. While developing IaC scripts, practitioners may inadvertently introduce security smells. Security smells are recurring coding patterns that are indicative of security weakness and can potentially lead to security breaches. The goal of this paper is to help practitioners avoid insecure coding practices while developing infrastructure as code (IaC) scripts through an empirical study of security smells in IaC scripts. We apply qualitative analysis on 1,726 IaC scripts to identify seven security smells. Next, we implement and validate a static analysis tool called Security Linter for Infrastructure as Code scripts (SLIC) to identify the occurrence of each smell in 15,232 IaC scripts collected from 293 open source repositories. We identify 21,201 occurrences of security smells that include 1,326 occurrences of hard-coded passwords. We submitted bug reports for 1,000 randomly-selected security smell occurrences. We obtain 212 responses to these bug reports, of which 148 occurrences were accepted by the development teams to be fixed. We observe security smells can have a long lifetime, e.g., a hard-coded secret can persist for as long as 98 months, with a median lifetime of 20 months.

2020-03-27
Coblenz, Michael, Sunshine, Joshua, Aldrich, Jonathan, Myers, Brad A..  2019.  Smarter Smart Contract Development Tools. 2019 IEEE/ACM 2nd International Workshop on Emerging Trends in Software Engineering for Blockchain (WETSEB). :48–51.

Much recent work focuses on finding bugs and security vulnerabilities in smart contracts written in existing languages. Although this approach may be helpful, it does not address flaws in the underlying programming language, which can facilitate writing buggy code in the first place. We advocate a re-thinking of the blockchain software engineering tool set, starting with the programming language in which smart contracts are written. In this paper, we propose and justify requirements for a new generation of blockchain software development tools. New tools should (1) consider users' needs as a primary concern; (2) seek to facilitate safe development by detecting relevant classes of serious bugs at compile time; (3) as much as possible, be blockchain-agnostic, given the wide variety of different blockchain platforms available, and leverage the properties that are common among blockchain environments to improve safety and developer effectiveness.

2020-06-01
Ye, Yu, Guo, Jun, Xu, Xunjian, Li, Qinpu, Liu, Hong, Di, Yuelun.  2019.  High-risk Problem of Penetration Testing of Power Grid Rainstorm Disaster Artificial Intelligence Prediction System and Its Countermeasures. 2019 IEEE 3rd Conference on Energy Internet and Energy System Integration (EI2). :2675–2680.
System penetration testing is an important measure of discovering information system security issues. This paper summarizes and analyzes the high-risk problems found in the penetration testing of the artificial storm prediction system for power grid storm disasters from four aspects: application security, middleware security, host security and network security. In particular, in order to overcome the blindness of PGRDAIPS current SQL injection penetration test, this paper proposes a SQL blind bug based on improved second-order fragmentation reorganization. By modeling the SQL injection attack behavior and comparing the SQL injection vulnerability test in PGRDAIPS, this method can effectively reduce the blindness of SQL injection penetration test and improve its accuracy. With the prevalence of ubiquitous power internet of things, the electric power information system security defense work has to be taken seriously. This paper can not only guide the design, development and maintenance of disaster prediction information systems, but also provide security for the Energy Internet disaster safety and power meteorological service technology support.
2020-10-06
Zaman, Tarannum Shaila, Han, Xue, Yu, Tingting.  2019.  SCMiner: Localizing System-Level Concurrency Faults from Large System Call Traces. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :515—526.

Localizing concurrency faults that occur in production is hard because, (1) detailed field data, such as user input, file content and interleaving schedule, may not be available to developers to reproduce the failure; (2) it is often impractical to assume the availability of multiple failing executions to localize the faults using existing techniques; (3) it is challenging to search for buggy locations in an application given limited runtime data; and, (4) concurrency failures at the system level often involve multiple processes or event handlers (e.g., software signals), which can not be handled by existing tools for diagnosing intra-process(thread-level) failures. To address these problems, we present SCMiner, a practical online bug diagnosis tool to help developers understand how a system-level concurrency fault happens based on the logs collected by the default system audit tools. SCMiner achieves online bug diagnosis to obviate the need for offline bug reproduction. SCMiner does not require code instrumentation on the production system or rely on the assumption of the availability of multiple failing executions. Specifically, after the system call traces are collected, SCMiner uses data mining and statistical anomaly detection techniques to identify the failure-inducing system call sequences. It then maps each abnormal sequence to specific application functions. We have conducted an empirical study on 19 real-world benchmarks. The results show that SCMiner is both effective and efficient at localizing system-level concurrency faults.

2019-01-16
Gao, J., Lanchantin, J., Soffa, M. L., Qi, Y..  2018.  Black-Box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers. 2018 IEEE Security and Privacy Workshops (SPW). :50–56.

Although various techniques have been proposed to generate adversarial samples for white-box attacks on text, little attention has been paid to a black-box attack, which is a more realistic scenario. In this paper, we present a novel algorithm, DeepWordBug, to effectively generate small text perturbations in a black-box setting that forces a deep-learning classifier to misclassify a text input. We develop novel scoring strategies to find the most important words to modify such that the deep classifier makes a wrong prediction. Simple character-level transformations are applied to the highest-ranked words in order to minimize the edit distance of the perturbation. We evaluated DeepWordBug on two real-world text datasets: Enron spam emails and IMDB movie reviews. Our experimental results indicate that DeepWordBug can reduce the classification accuracy from 99% to 40% on Enron and from 87% to 26% on IMDB. Our results strongly demonstrate that the generated adversarial sequences from a deep-learning model can similarly evade other deep models.

2019-09-26
Elliott, A. S., Ruef, A., Hicks, M., Tarditi, D..  2018.  Checked C: Making C Safe by Extension. 2018 IEEE Cybersecurity Development (SecDev). :53-60.

This paper presents Checked C, an extension to C designed to support spatial safety, implemented in Clang and LLVM. Checked C's design is distinguished by its focus on backward-compatibility, incremental conversion, developer control, and enabling highly performant code. Like past approaches to a safer C, Checked C employs a form of checked pointer whose accesses can be statically or dynamically verified. Performance evaluation on a set of standard benchmark programs shows overheads to be relatively low. More interestingly, Checked C introduces the notions of a checked region and bounds-safe interfaces.

2019-08-26
Gries, S., Hesenius, M., Gruhn, V..  2018.  Embedding Non-Compliant Nodes into the Information Flow Monitor by Dependency Modeling. 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS). :1541-1542.

Observing semantic dependencies in large and heterogeneous networks is a critical task, since it is quite difficult to find the actual source of a malfunction in the case of an error. Dependencies might exist between many network nodes and among multiple hops in paths. If those dependency structures are unknown, debugging errors gets quite difficult. Since CPS and other large networks change at runtime and consists of custom software and hardware, as well as components off-the-shelf, it is necessary to be able to not only include own components in approaches to detect dependencies between nodes. In this paper we present an extension to the Information Flow Monitor approach. Our goal is that this approach should be able to handle unalterable blackbox nodes. This is quite challenging, since the IFM originally requires each network node to be compliant with the IFM protocol.

2019-11-12
Zhang, Xian, Ben, Kerong, Zeng, Jie.  2018.  Cross-Entropy: A New Metric for Software Defect Prediction. 2018 IEEE International Conference on Software Quality, Reliability and Security (QRS). :111-122.

Defect prediction is an active topic in software quality assurance, which can help developers find potential bugs and make better use of resources. To improve prediction performance, this paper introduces cross-entropy, one common measure for natural language, as a new code metric into defect prediction tasks and proposes a framework called DefectLearner for this process. We first build a recurrent neural network language model to learn regularities in source code from software repository. Based on the trained model, the cross-entropy of each component can be calculated. To evaluate the discrimination for defect-proneness, cross-entropy is compared with 20 widely used metrics on 12 open-source projects. The experimental results show that cross-entropy metric is more discriminative than 50% of the traditional metrics. Besides, we combine cross-entropy with traditional metric suites together for accurate defect prediction. With cross-entropy added, the performance of prediction models is improved by an average of 2.8% in F1-score.

2019-07-01
Clemente, C. J., Jaafar, F., Malik, Y..  2018.  Is Predicting Software Security Bugs Using Deep Learning Better Than the Traditional Machine Learning Algorithms? 2018 IEEE International Conference on Software Quality, Reliability and Security (QRS). :95–102.

Software insecurity is being identified as one of the leading causes of security breaches. In this paper, we revisited one of the strategies in solving software insecurity, which is the use of software quality metrics. We utilized a multilayer deep feedforward network in examining whether there is a combination of metrics that can predict the appearance of security-related bugs. We also applied the traditional machine learning algorithms such as decision tree, random forest, naïve bayes, and support vector machines and compared the results with that of the Deep Learning technique. The results have successfully demonstrated that it was possible to develop an effective predictive model to forecast software insecurity based on the software metrics and using Deep Learning. All the models generated have shown an accuracy of more than sixty percent with Deep Learning leading the list. This finding proved that utilizing Deep Learning methods and a combination of software metrics can be tapped to create a better forecasting model thereby aiding software developers in predicting security bugs.

2019-12-17
Zhao, Shixiong, Gu, Rui, Qiu, Haoran, Li, Tsz On, Wang, Yuexuan, Cui, Heming, Yang, Junfeng.  2018.  OWL: Understanding and Detecting Concurrency Attacks. 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). :219-230.
Just like bugs in single-threaded programs can lead to vulnerabilities, bugs in multithreaded programs can also lead to concurrency attacks. We studied 31 real-world concurrency attacks, including privilege escalations, hijacking code executions, and bypassing security checks. We found that compared to concurrency bugs' traditional consequences (e.g., program crashes), concurrency attacks' consequences are often implicit, extremely hard to be observed and diagnosed by program developers. Moreover, in addition to bug-inducing inputs, extra subtle inputs are often needed to trigger the attacks. These subtle features make existing tools ineffective to detect concurrency attacks. To tackle this problem, we present OWL, the first practical tool that models general concurrency attacks' implicit consequences and automatically detects them. We implemented OWL in Linux and successfully detected five new concurrency attacks, including three confirmed and fixed by developers, and two exploited from previously known and well-studied concurrency bugs. OWL has also detected seven known concurrency attacks. Our evaluation shows that OWL eliminates 94.1% of the reports generated by existing concurrency bug detectors as false positive, greatly reducing developers' efforts on diagnosis. All OWL source code, concurrency attack exploit scripts, and results are available on github.com/hku-systems/owl.