Malware Analysis, 2014 (ACM)

Part 2


The ACM published nearly 500 articles about malware analysis in 2014, making the topic one of the most studied. The bibliographical citations presented here, broken into several parts, should be of interest to the Science of Security community.

Ting-Fang Yen, Victor Heorhiadi, Alina Oprea, Michael K. Reiter, Ari Juels;  An Epidemiological Study of Malware Encounters in a Large Enterprise; CCS '14 Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, November 2014, Pages 1117-1130. Doi: 10.1145/2660267.2660330  Abstract: We present an epidemiological study of malware encounters in a large, multi-national enterprise. Our data sets allow us to observe or infer not only malware presence on enterprise computers, but also malware entry points, network locations of the computers (i.e., inside the enterprise network or outside) when the malware were encountered, and for some web-based malware encounters, web activities that gave rise to them. By coupling this data with demographic information for each host's primary user, such as his or her job title and level in the management hierarchy, we are able to paint a reasonably comprehensive picture of malware encounters for this enterprise. We use this analysis to build a logistic regression model for inferring the risk of hosts encountering malware; those ranked highly by our model have a >3x higher rate of encountering malware than the base rate. We also discuss where our study confirms or refutes other studies and guidance that our results suggest.
Keywords: enterprise security, logistic regression, malware encounters, measurement (ID#: 15-4686)
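
The risk-ranking idea in the abstract above can be illustrated with a minimal sketch: a logistic regression fit by plain gradient descent, then used to rank hosts by predicted probability of a malware encounter. The two features, the toy data, and the function names are hypothetical and are not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            p = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
            err = p - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w

def risk(w, x):
    """Predicted probability that a host with features x encounters malware."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))

# Hypothetical features per host:
# [fraction of time spent off the enterprise network, scaled count of risky site visits]
hosts = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
encountered = [1, 1, 1, 0, 0, 0]  # 1 = host encountered malware (toy labels)

weights = fit_logistic(hosts, encountered)
# Rank host indices from highest to lowest predicted risk.
ranked = sorted(range(len(hosts)), key=lambda i: risk(weights, hosts[i]), reverse=True)
```

Hosts at the top of `ranked` are the ones such a model would flag for closer attention; a real study would validate the ranking against held-out encounter data.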


Patrick Cousot, Radhia Cousot; Abstract Interpretation: Past, Present and Future; CSL-LICS '14 Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), July 2014, Article No. 2. Doi: 10.1145/2603088.2603165  Abstract:  Abstract Interpretation is a theory of abstraction and constructive approximation of the mathematical structures used in the formal description of complex or infinite systems and the inference or verification of their combinatorial or undecidable properties. Developed in the late seventies, it has been since then used, implicitly or explicitly, to many aspects of computer science (such as static analysis and verification, contract inference, type inference, termination inference, model-checking, abstraction/refinement, program transformation (including watermarking, obfuscation, etc), combination of decision procedures, security, malware detection, database queries, etc) and more recently, to system biology and SAT/SMT solvers. Production-quality verification tools based on abstract interpretation are available and used in the advanced software, hardware, transportation, communication, and medical industries.  The talk will consist in an introduction to the basic notions of abstract interpretation and the induced methodology for the systematic development of sound abstract interpretation-based tools. Examples of abstractions will be provided, from semantics to typing, grammars to safety, reachability to potential/definite termination, numerical to protein-protein abstractions, as well as applications (including those in industrial use) to software, hardware and system biology.  This paper is a general discussion of abstract interpretation, with selected publications, which unfortunately are far from exhaustive both in the considered themes and the corresponding references.
Keywords: abstract interpretation, proof, semantics, static analysis, verification (ID#: 15-4687)


Christopher Neasbitt, Roberto Perdisci, Kang Li, Terry Nelms; ClickMiner: Towards Forensic Reconstruction of User-Browser Interactions from Network Traces; CCS '14 Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, November 2014, Pages 1244-1255. Doi: 10.1145/2660267.2660268  Abstract: Recent advances in network traffic capturing techniques have made it feasible to record full traffic traces, often for extended periods of time. Among the applications enabled by full traffic captures, being able to automatically reconstruct user-browser interactions from archived web traffic traces would be helpful in a number of scenarios, such as aiding the forensic analysis of network security incidents. Unfortunately, the modern web is becoming increasingly complex, serving highly dynamic pages that make heavy use of scripting languages, a variety of browser plugins, and asynchronous content requests. Consequently, the semantic gap between user-browser interactions and the network traces has grown significantly, making it challenging to analyze the web traffic produced by even a single user.  In this paper, we propose ClickMiner, a novel system that aims to automatically reconstruct user-browser interactions from network traces. Through a user study involving 21 participants, we collected real user browsing traces to evaluate our approach. We show that, on average, ClickMiner can correctly reconstruct between 82% and 90% of user-browser interactions with false positives between 0.74% and 1.16%, and that it outperforms reconstruction algorithms based solely on referrer-based approaches. We also present a number of case studies that aim to demonstrate how ClickMiner can aid the forensic analysis of malware downloads triggered by social engineering attacks.
Keywords: forensics, network traffic replay (ID#: 15-4688)


Abdullah J. Alzahrani, Ali A. Ghorbani; SMS Mobile Botnet Detection Using a Multi-Agent System: Research in Progress;  ACySE '14 Proceedings of the 1st International Workshop on Agents and CyberSecurity, May 2014, Article No. 2. Doi: 10.1145/2602945.2602950  Abstract: With the enormous growth of Android mobile devices and the huge increase in the number of published applications (apps), Short Message Service (SMS) is becoming an important issue. SMS can be abused by attackers when they send SMS spam, transfer all command and control (C&C) instructions, launch denial-of-service (DoS) attacks to send premium-rate SMS messages without user permission, and propagate malware via URLs sent within SMS messages. Thus, SMS has to be reliable as well as secure. In this paper, we propose a SMS botnet detection framework that uses multi-agent technology based on observations of SMS and Android smartphone features. This system detects SMS botnets and identifies ways to block the attacks in order to prevent damage caused by these attacks. An adaptive hybrid model of SMS botnet detectors is being developed by using a combination of signature-based and anomaly-based methods. The model is designed to recognize malicious SMS messages by applying behavioural analysis to find the correlation between suspicious SMS messages and reported profiling. Behaviour profiles of Android smartphones are being created to carry out robust and efficient anomaly detection. A multi-agent system technology was selected to perform light-weight detection without exhausting smartphone resources such as battery and memory.
Keywords: SMS, botnet detection, multi-agent system, smartphone (ID#: 15-4689)


Zhenlong Yuan, Yongqiang Lu, Zhaoguo Wang, Yibo Xue;  Droid-Sec: Deep Learning in Android Malware Detection; SIGCOMM '14 Proceedings of the 2014 ACM Conference on SIGCOMM, August 2014, Pages 371-372. Doi: 10.1145/2619239.2631434  Abstract: As smartphones and mobile devices are rapidly becoming indispensable for many network users, mobile malware has become a serious threat in the network security and privacy. Especially on the popular Android platform, many malicious apps are hiding in a large number of normal apps, which makes the malware detection more challenging. In this paper, we propose a ML-based method that utilizes more than 200 features extracted from both static analysis and dynamic analysis of Android app for malware detection. The comparison of modeling results demonstrates that the deep learning technique is especially suitable for Android malware detection and can achieve a high level of 96% accuracy with real-world Android application sets.
Keywords: android malware, deep learning, detection (ID#: 15-4690)


Yuru Shao, Xiapu Luo, Chenxiong Qian, Pengfei Zhu, Lei Zhang; Towards a Scalable Resource-Driven Approach for Detecting Repackaged Android Applications;  ACSAC '14 Proceedings of the 30th Annual Computer Security Applications Conference, December 2014, Pages 56-65. Doi: 10.1145/2664243.2664275  Abstract: Repackaged Android applications (or simply apps) are one of the major sources of mobile malware and also an important cause of severe revenue loss to app developers. Although a number of solutions have been proposed to detect repackaged apps, the majority of them heavily rely on code analysis, thus suffering from two limitations: (1) poor scalability due to the billion opcode problem; (2) unreliability to code obfuscation/app hardening techniques. In this paper, we explore an alternative approach that exploits core resources, which have close relationships with codes, to detect repackaged apps. More precisely, we define new features for characterizing apps, investigate two kinds of algorithms for searching similar apps, and propose a two-stage methodology to speed up the detection. We realize our approach in a system named ResDroid and conduct large scale evaluation on it. The results show that ResDroid can identify repackaged apps efficiently and effectively even if they are protected by obfuscation or hardening systems.
Keywords:  (not provided) (ID#: 15-4691)


Justin Hummel, Andrew McDonald, Vatsal Shah, Riju Singh, Bradford D. Boyle, Tingshan Huang, Nagarajan Kandasamy, Harish Sethu, Steven Weber;  A Modular Multi-Location Anonymized Traffic Monitoring Tool for a Wifi Network ; CODASPY '14 Proceedings of the 4th ACM Conference on Data and Application Security and Privacy, March 2014, Pages 135-138. Doi: 10.1145/2557547.2557580  Abstract: Network traffic anomaly detection is now considered a surer approach to early detection of malware than signature-based approaches and is best accomplished with traffic data collected from multiple locations. Existing open-source tools are primarily signature-based, or do not facilitate integration of traffic data from multiple locations for real-time analysis, or are insufficiently modular for incorporation of newly proposed approaches to anomaly detection. In this paper, we describe DataMap, a new modular open-source tool for the collection and real-time analysis of sampled, anonymized, and filtered traffic data from multiple WiFi locations in a network and an example of its use in anomaly detection.
Keywords: open source tool, real time analysis, traffic anomaly detection (ID#: 15-4692)


Battista Biggio; On Learning and Recognition of Secure Patterns;  AISec '14 Proceedings of the 2014 Workshop on Artificial Intelligence and Security Workshop, November 2014, Pages 1-2. Doi: 10.1145/2666652.2666653 Abstract: Learning and recognition of secure patterns is a well-known problem in nature. Mimicry and camouflage are widely-spread techniques in the arms race between predators and preys. All of the information acquired by our senses is therefore not necessarily secure or reliable. In machine learning and pattern recognition systems, we have started investigating these issues only recently, with the goal of learning to discriminate between secure and hostile patterns. This phenomenon has been especially observed in the context of adversarial settings like biometric recognition, malware detection and spam filtering, in which data can be adversely manipulated by humans to undermine the outcomes of an automatic analysis. As current pattern recognition methods are not natively designed to deal with the intrinsic, adversarial nature of these problems, they exhibit specific vulnerabilities that an adversary may exploit either to mislead learning or to avoid detection. Identifying these vulnerabilities and analyzing the impact of the corresponding attacks on pattern classifiers is one of the main open issues in the novel research field of adversarial machine learning.  In the first part of this talk, I introduce a general framework that encompasses and unifies previous work in the field, allowing one to systematically evaluate classifier security against different, potential attacks. As an example of application of this framework, in the second part of the talk, I discuss evasion attacks, where malicious samples are manipulated at test time to avoid detection. I then show how carefully-designed poisoning attacks can mislead learning of support vector machines by manipulating a small fraction of their training data, and how to poison adaptive biometric verification systems to compromise the biometric templates (face images) of the enrolled clients. Finally, I briefly discuss our ongoing work on attacks against clustering algorithms, and sketch some possible future research directions.
Keywords: adversarial machine learning, evasion attacks, poisoning attacks, secure pattern recognition (ID#: 15-4693)


Alexander Long, Joshua Saxe, Robert Gove; Detecting Malware Samples with Similar Image Sets;  VizSec '14 Proceedings of the Eleventh Workshop on Visualization for Cyber Security, November 2014, Pages 88-95. Doi: 10.1145/2671491.2671500 Abstract: This paper proposes a method for identifying and visualizing similarity relationships between malware samples based on their embedded graphical assets (such as desktop icons and button skins). We argue that analyzing such relationships has practical merit for a number of reasons. For example, we find that malware desktop icons are often used to trick users into running malware programs, so identifying groups of related malware samples based on these visual features can highlight themes in the social engineering tactics of today's malware authors. Also, when malware samples share rare images, these image sharing relationships may indicate that the samples were generated or deployed by the same adversaries.  To explore and evaluate this malware comparison method, the paper makes two contributions. First, we provide a scalable and intuitive method for computing similarity measurements between malware based on the visual similarity of their sets of images. Second, we give a visualization method that combines a force-directed graph layout with a set visualization technique so as to highlight visual similarity relationships in malware corpora. We evaluate the accuracy of our image set similarity comparison method against a hand curated malware relationship ground truth dataset, finding that our method performs well. We also evaluate our overall concept through a small qualitative study we conducted with three cyber security researchers. Feedback from the researchers confirmed our use cases and suggests that computer network defenders are interested in this capability.
Keywords: human computer interaction, malware, security, visualization (ID#: 15-4694)
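
A minimal sketch of comparing malware samples by their image sets, assuming each embedded image is reduced to an exact-match SHA-256 fingerprint and sets are compared with Jaccard similarity. This is a simplification: the paper measures visual similarity, which exact hashing does not capture, and the sample names and image bytes below are made up.

```python
import hashlib
from itertools import combinations

def fingerprint(image_bytes):
    """Exact-match fingerprint of one embedded image (stand-in for a visual hash)."""
    return hashlib.sha256(image_bytes).hexdigest()

def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical samples, each described by the set of images it embeds.
samples = {
    "sample_a": {fingerprint(b"icon-pdf"), fingerprint(b"button-ok")},
    "sample_b": {fingerprint(b"icon-pdf"), fingerprint(b"button-cancel")},
    "sample_c": {fingerprint(b"icon-folder")},
}

# Pairwise similarity scores; these could serve as edge weights in a graph layout.
edges = {
    (x, y): jaccard(samples[x], samples[y])
    for x, y in combinations(sorted(samples), 2)
}
```

Pairs with a nonzero score share at least one image; a shared rare image is the kind of link the abstract suggests may point to common authorship.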


Luke Deshotels, Vivek Notani, Arun Lakhotia; DroidLegacy: Automated Familial Classification of Android Malware;  PPREW'14 Proceedings of ACM SIGPLAN on Program Protection and Reverse Engineering Workshop 2014, January 2014, Article No. 3. Doi: 10.1145/2556464.2556467  Abstract: We present an automated method for extracting familial signatures for Android malware, i.e., signatures that identify malware produced by piggybacking potentially different benign applications with the same (or similar) malicious code. The APK classes that constitute malware code in a repackaged application are separated from the benign code and the Android API calls used by the malicious modules are extracted to create a signature. A piggybacked malicious app can be detected by first decomposing it into loosely coupled modules and then matching the Android API calls called by each of the modules against the signatures of the known malware families. Since the signatures are based on Android API calls, they are related to the core malware behavior, and thus are more resilient to obfuscations.  In triage, AV companies need to automatically classify large number of samples so as to optimize assignment of human analysts. They need a system that gives low false negatives even if it is at the cost of higher false positives. Keeping this goal in mind, we fine tuned our system and used standard 10 fold cross validation over a dataset of 1,052 malicious APKs and 48 benign APKs to verify our algorithm. Results show that we have 94% accuracy, 97% precision, and 93% recall when separating benign from malware. We successfully classified our entire malware dataset into 11 families with 98% accuracy, 87% precision, and 94% recall.
Keywords: Android malware, class dependence graphs, familial classification, malware detection, module generation, piggybacked malware, signature generation, static analysis (ID#: 15-4695)
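
The matching step described above can be sketched roughly as follows, assuming each module and each family signature is simply a set of API call names and a module matches when it contains most of a signature. The family names, the threshold, and the containment score are illustrative assumptions, not the DroidLegacy implementation.

```python
def containment(signature, module_calls):
    """Fraction of the signature's API calls that appear in the module."""
    return len(signature & module_calls) / len(signature)

def classify(app_modules, signatures, threshold=0.8):
    """Return the best-matching family over all modules, or None if nothing matches."""
    best_family, best_score = None, 0.0
    for module_calls in app_modules:
        for family, signature in signatures.items():
            score = containment(signature, module_calls)
            if score >= threshold and score > best_score:
                best_family, best_score = family, score
    return best_family

# Hypothetical family signatures built from API calls of known malicious modules.
signatures = {
    "family_x": {"SmsManager.sendTextMessage", "TelephonyManager.getDeviceId",
                 "URL.openConnection"},
    "family_y": {"Runtime.exec", "DexClassLoader.loadClass"},
}

# A piggybacked app decomposed into loosely coupled modules (benign host + payload).
piggybacked_app = [
    {"Activity.setContentView", "Canvas.drawBitmap"},
    {"SmsManager.sendTextMessage", "TelephonyManager.getDeviceId",
     "URL.openConnection", "Log.d"},
]
benign_app = [{"Activity.setContentView", "Canvas.drawBitmap", "Log.d"}]

piggybacked_label = classify(piggybacked_app, signatures)
benign_label = classify(benign_app, signatures)
```

Lowering `threshold` trades more false positives for fewer false negatives, which is the direction the abstract says triage systems prefer.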


Ting Wang, Shicong Meng, Wei Gao, Xin Hu; Rebuilding the Tower of Babel: Towards Cross-System Malware Information Sharing; CIKM '14 Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, November 2014, Pages 1239-1248. Doi: 10.1145/2661829.2662086  Abstract: Anti-virus systems developed by different vendors often demonstrate strong discrepancies in how they name malware, which significantly hinders malware information sharing. While existing work has proposed a plethora of malware naming standards, most anti-virus vendors were reluctant to change their own naming conventions. In this paper we explore a new, more pragmatic alternative. We propose to exploit the correlation between malware naming of different anti-virus systems to create their consensus classification, through which these systems can share malware information without modifying their naming conventions. Specifically we present Latin, a novel classification integration framework leveraging the correspondence between participating anti-virus systems as reflected in heterogeneous information sources at instance-instance, instance-name, and name-name levels. We provide results from extensive experimental studies using real malware datasets and concrete use cases to verify the efficacy of Latin in supporting cross-system malware information sharing.
Keywords: classification integration, consensus learning, malware naming (ID#: 15-4696)


Andrew Henderson, Aravind Prakash, Lok Kwong Yan, Xunchao Hu, Xujiewen Wang, Rundong Zhou, Heng Yin; Make It Work, Make It Right, Make It Fast: Building a Platform-Neutral Whole-System Dynamic Binary Analysis Platform; ISSTA 2014 Proceedings of the 2014 International Symposium on Software Testing and Analysis, July 2014, Pages 248-258. Doi: 10.1145/2610384.2610407  Abstract: Dynamic binary analysis is a prevalent and indispensable technique in program analysis. While several dynamic binary analysis tools and frameworks have been proposed, all suffer from one or more of: prohibitive performance degradation, semantic gap between the analysis code and the program being analyzed, architecture/OS specificity, being user-mode only, lacking APIs, etc. We present DECAF, a virtual machine based, multi-target, whole-system dynamic binary analysis framework built on top of QEMU. DECAF provides Just-In-Time Virtual Machine Introspection combined with a novel TCG instruction-level tainting at bit granularity, backed by a plugin based, simple-to-use event driven programming interface. DECAF exercises fine control over the TCG instructions to accomplish on-the-fly optimizations. We present 3 platform-neutral plugins - Instruction Tracer, Keylogger Detector, and API Tracer, to demonstrate the ease of use and effectiveness of DECAF in writing cross-platform and system-wide analysis tools. Implementation of DECAF consists of 9550 lines of C++ code and 10270 lines of C code and we evaluate DECAF using CPU2006 SPEC benchmarks and show average overhead of 605% for system wide tainting and 12% for VMI.
Keywords: Dynamic binary analysis, dynamic taint analysis, virtual machine introspection (ID#: 15-4697)
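
To illustrate what tainting at bit granularity means, here is a small sketch of commonly used bitwise propagation rules for AND and OR on 8-bit values. The function names and the 8-bit width are assumptions for illustration; the sketch does not reproduce DECAF's TCG-level rules or API.

```python
MASK = 0xFF  # assumed 8-bit operands for this illustration

def taint_and(v1, t1, v2, t2):
    """Taint mask of (v1 & v2), given values v1, v2 and their taint masks t1, t2.

    A result bit is tainted if both input bits are tainted, or if one input bit
    is tainted and the other input bit is 1 (so the tainted bit decides the result).
    """
    return ((t1 & t2) | (t1 & v2) | (t2 & v1)) & MASK

def taint_or(v1, t1, v2, t2):
    """Taint mask of (v1 | v2).

    A result bit is tainted if both input bits are tainted, or if one input bit
    is tainted and the other input bit is 0.
    """
    return ((t1 & t2) | (t1 & ~v2) | (t2 & ~v1)) & MASK

# A fully tainted byte combined with an untainted constant 0x0F.
and_taint = taint_and(0xAB, 0xFF, 0x0F, 0x00)
or_taint = taint_or(0xAB, 0xFF, 0x0F, 0x00)
```

With an untainted mask of 0x0F, only the low four bits of the AND result and only the high four bits of the OR result can still depend on the tainted input, which is what byte-level tainting would overstate.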


Battista Biggio, Konrad Rieck, Davide Ariu, Christian Wressnegger, Igino Corona, Giorgio Giacinto, Fabio Roli; Poisoning Behavioral Malware Clustering;  AISec '14 Proceedings of the 2014 Workshop on Artificial Intelligence and Security Workshop, November 2014, Pages 27-36. Doi: 10.1145/2666652.2666666 Abstract: Clustering algorithms have become a popular tool in computer security to analyze the behavior of malware variants, identify novel malware families, and generate signatures for antivirus systems. However, the suitability of clustering algorithms for security-sensitive settings has been recently questioned by showing that they can be significantly compromised if an attacker can exercise some control over the input data. In this paper, we revisit this problem by focusing on behavioral malware clustering approaches, and investigate whether and to what extent an attacker may be able to subvert these approaches through a careful injection of samples with poisoning behavior. To this end, we present a case study on Malheur, an open-source tool for behavioral malware clustering. Our experiments not only demonstrate that this tool is vulnerable to poisoning attacks, but also that it can be significantly compromised even if the attacker can only inject a very small percentage of attacks into the input data. As a remedy, we discuss possible countermeasures and highlight the need for more secure clustering algorithms.
Keywords: adversarial machine learning, clustering, computer security, malware detection, security evaluation, unsupervised learning (ID#: 15-4698)


Igino Corona, Davide Maiorca, Davide Ariu, Giorgio Giacinto; Lux0R: Detection of Malicious PDF-embedded JavaScript code through Discriminant Analysis of API References;  AISec '14 Proceedings of the 2014 Workshop on Artificial Intelligence and Security Workshop, November 2014, Pages 47-57. Doi: 10.1145/2666652.2666657  Abstract: JavaScript is a dynamic programming language adopted in a variety of applications, including web pages, PDF Readers, widget engines, network platforms, office suites. Given its widespread presence throughout different software platforms, JavaScript is a primary tool for the development of novel -rapidly evolving- malicious exploits. If the classical signature- and heuristic-based detection approaches are clearly inadequate to cope with this kind of threat, machine learning solutions proposed so far suffer from high false-alarm rates or require special instrumentation that make them not suitable for protecting end-user systems. In this paper we present Lux0R "Lux 0n discriminant References", a novel, lightweight approach to the detection of malicious JavaScript code. Our method is based on the characterization of JavaScript code through its API references, i.e., functions, constants, objects, methods, keywords as well as attributes natively recognized by a JavaScript Application Programming Interface (API). We exploit machine learning techniques to select a subset of API references that characterize malicious code, and then use them to detect JavaScript malware. The selection algorithm has been thought to be "secure by design" against evasion by mimicry attacks. In this investigation, we focus on a relevant application domain, i.e., the detection of malicious JavaScript code within PDF documents. We show that our technique is able to achieve excellent malware detection accuracy, even on samples exploiting never-before-seen vulnerabilities, i.e., for which there are no examples in training data. Finally, we experimentally assess the robustness of Lux0R against mimicry attacks based on feature addition.
Keywords: adversarial machine learning, javascript code, malware detection, mimicry attacks, pdf documents (ID#: 15-4699)


Markus Kammerstetter, Christian Platzer, Wolfgang Kastner; Prospect: Peripheral Proxying Supported Embedded Code Testing; ASIA CCS '14 Proceedings of the 9th ACM Symposium On Information, Computer And Communications Security, June 2014, Pages 329-340. Doi: 10.1145/2590296.2590301  Abstract: Embedded systems are an integral part of almost every electronic product today. From consumer electronics to industrial components in SCADA systems, their possible fields of application are manifold. While especially in industrial and critical infrastructures the security requirements are high, recent publications have shown that embedded systems do not cope well with this demand. One of the reasons is that embedded systems are being less scrutinized as embedded security analysis is considered to be more time consuming and challenging in comparison to PC systems. One of the key challenges on proprietary, resource constrained embedded devices is dynamic code analysis. The devices typically do not have the capabilities for a full-scale dynamic security evaluation. Likewise, the analyst cannot execute the software implementation inside a virtual machine due to the missing peripheral hardware that is required by the software to run. In this paper, we present PROSPECT, a system that can overcome these shortcomings and enables dynamic code analysis of embedded binary code inside arbitrary analysis environments. By transparently forwarding peripheral hardware accesses from the original host system into a virtual machine, PROSPECT allows security analysts to run the embedded software implementation without the need to know which and how embedded peripheral hardware components are accessed. We evaluated PROSPECT with respect to the performance impact and conducted a case study by doing a full-scale security audit of a widely used commercial fire alarm system in the building automation domain. Our results show that PROSPECT is both practical and usable for real-world application.
Keywords: device tunneling, dynamic analysis, embedded system, fuzz testing, security (ID#: 15-4700)


Jyun-Yu Jiang, Chun-Liang Li, Chun-Pai Yang, Chung-Tsai Su; POSTER: Scanning-free Personalized Malware Warning System by Learning Implicit Feedback from Detection Logs; CCS '14 Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, November 2014, Pages 1436-1438. Doi: 10.1145/2660267.2662359  Abstract: Nowadays, World Wide Web connects people to each other in many ways ubiquitously. Followed along with the convenience and usability, millions of malware infect various devices of numerous users through the web every day. In contrast, traditional anti-malware systems detect such malware by scanning file systems and provide secure environments for users. However, some malware might not be detected by traditional scanning-based detection systems due to hackers' obfuscation techniques. Also, scanning-based approaches cannot caution users for uninfected malware with high risks. In this paper, we aim to build a personalized malware warning system. Different from traditional scanning-based approaches, we focus on discovering the potential malware which has not been detected for each user. If users and the system know the potentially infected malware in advance, they can be alert against the corresponding risks. We propose a novel approach to learn the implicit feedback from detection logs and give a personalized risk ranking of malware for each user. Finally, the experiments on real-world detection datasets demonstrate the proposed algorithm outperforms traditional popularity-based algorithms.
Keywords: computer security, malware detection, malware warning system, personalized collaborative filtering (ID#: 15-4701)


Olatunji Ruwase, Michael A. Kozuch, Phillip B. Gibbons, Todd C. Mowry;  Guardrail: A High Fidelity Approach to Protecting Hardware Devices from Buggy Drivers; ASPLOS '14 Proceedings of the 19th International Conference On Architectural Support For Programming Languages And Operating Systems, February 2014, Pages 655-670. Doi: 10.1145/2654822.2541970  Abstract: Device drivers are an Achilles' heel of modern commodity operating systems, accounting for far too many system failures. Previous work on driver reliability has focused on protecting the kernel from unsafe driver side-effects by interposing an invariant-checking layer at the driver interface, but otherwise treating the driver as a black box. In this paper, we propose and evaluate Guardrail, which is a more powerful framework for run-time driver analysis that performs decoupled instruction-grain dynamic correctness checking on arbitrary kernel-mode drivers as they execute, thereby enabling the system to detect and mitigate more challenging correctness bugs (e.g., data races, uninitialized memory accesses) that cannot be detected by today's fault isolation techniques. Our evaluation of Guardrail shows that it can find serious data races, memory faults, and DMA faults in native Linux drivers that required fixes, including previously unknown bugs. Also, with hardware logging support, Guardrail can be used for online protection of persistent device state from driver bugs with at most 10% overhead on the end-to-end performance of most standard I/O workloads.
Keywords: device drivers, dynamic analysis (ID#: 15-4702)


Mordechai Guri, Gabi Kedma, Buky Carmeli, Yuval Elovici; Limiting Access to Unintentionally Leaked Sensitive Documents Using Malware Signatures;  SACMAT '14 Proceedings of the 19th ACM Symposium On Access Control Models And Technologies, June 2014, Pages 129-140. Doi: 10.1145/2613087.2613103 Abstract: Organizations are repeatedly embarrassed when their sensitive digital documents go public or fall into the hands of adversaries, often as a result of unintentional or inadvertent leakage. Such leakage has been traditionally handled either by preventive means, which are evidently not hermetic, or by punitive measures taken after the main damage has already been done. Yet, the challenge of preventing a leaked file from spreading further among computers and over the Internet is not resolved by existing approaches. This paper presents a novel method, which aims at reducing and limiting the potential damage of a leakage that has already occurred. The main idea is to tag sensitive documents within the organization's boundaries by attaching a benign detectable malware signature (DMS). While the DMS is masked inside the organization, if a tagged document is somehow leaked out of the organization's boundaries, common security services such as Anti-Virus (AV) programs, firewalls or email gateways will detect the file as a real threat and will consequently delete or quarantine it, preventing it from spreading further. This paper discusses various aspects of the DMS, such as signature type and attachment techniques, along with proper design considerations and implementation issues. The proposed method was implemented and successfully tested on various file types including documents, spreadsheets, presentations, images, executable binaries and textual source code. The evaluation results have demonstrated its effectiveness in limiting the spread of leaked documents.
Keywords: anti-virus program, data leakage, detectable malware signature, sensitive document (ID#: 15-4703)


Paul Pearce, Vacha Dave, Chris Grier, Kirill Levchenko, Saikat Guha, Damon McCoy, Vern Paxson, Stefan Savage, Geoffrey M. Voelker; Characterizing Large-Scale Click Fraud in ZeroAccess;  CCS '14 Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, November 2014, Pages 141-152. Doi: 10.1145/2660267.2660369  Abstract: Click fraud is a scam that hits a criminal sweet spot by both tapping into the vast wealth of online advertising and exploiting that ecosystem's complex structure to obfuscate the flow of money to its perpetrators. In this work, we illuminate the intricate nature of this activity through the lens of ZeroAccess--one of the largest click fraud botnets in operation. Using a broad range of data sources, including peer-to-peer measurements, command-and-control telemetry, and contemporaneous click data from one of the top ad networks, we construct a view into the scale and complexity of modern click fraud operations. By leveraging the dynamics associated with Microsoft's attempted takedown of ZeroAccess in December 2013, we employ this coordinated view to identify "ad units" whose traffic (and hence revenue) primarily derived from ZeroAccess. While it proves highly challenging to extrapolate from our direct observations to a truly global view, by anchoring our analysis in the data for these ad units we estimate that the botnet's fraudulent activities plausibly induced advertising losses on the order of $100,000 per day.
Keywords: click fraud, cybercrime, malware, measurement, ZeroAccess (ID#: 15-4704)


Byeongho Kang, Eul Gyu Im; Analysis of Binary Code Topology for Dynamic Analysis; SAC '14 Proceedings of the 29th Annual ACM Symposium on Applied Computing, March 2014, Pages 1731-1732. Doi: 10.1145/2554850.2559912  Abstract: A better understanding of binary code topology is essential in making execution path exploration method. The execution path exploration is closely related to code coverage in binary code dynamic analysis. Since the number of execution paths in a program is astronomically high, the efficient exploration strategy is needed. In this paper, we analyze binary code topology in a viewpoint of basic blocks. We find that the incoming edges show unbalanced distribution which follows power law, instead of balanced distribution. This unbalanced distribution of incoming edges can help understanding binary code topology, and we propose our study for deciding efficient execution path exploration strategy of dynamic binary code analysis.
Keywords: basic block topology, binary analysis, dynamic analysis (ID#: 15-4705)


Josh Marston, Komminist Weldemariam, Mohammad Zulkernine; On Evaluating and Securing Firefox for Android Browser Extensions; MOBILESoft 2014 Proceedings of the 1st International Conference on Mobile Software Engineering and Systems, June 2014, Pages 27-36. Doi: 10.1145/2593902.2593909  Abstract: Unsafely or maliciously coded extensions allow an attacker to run their own code in the victim's browser with elevated privileges. This gives the attacker a large amount of control over not only the browser but the underlying machine as well. The topic of securing desktop browsers from such threats has been well studied but mitigating the same danger on mobile devices has seen little attention. Similarly, mobile device use continues to grow world-wide at a rapid pace along with their capability and ability to perform sensitive actions. In an effort to mitigate the risks inherent with these actions, this paper details the dangers of JavaScript injection on the mobile browser. We further present a defense technique that was developed by extending from the desktop environment to work in the mobile space. Our prototype implementation is a combination of extensions on the Firefox for Android and a slightly modified browser of Firefox for Android. When the user attempts to install a new extension or update an existing one, the modified browser is called a priori. The overall extension logic, code transformation, and static analyzer components were implemented in JavaScript and SQLLite database. Our preliminary evaluation shows that our prototype implementation can effectively prevent real-world attacks against extensions on Firefox for Android without affecting users' browsing experience.
Keywords: Browser Extensions, Firefox for Android, Information Flow, JavaScript, Mobile Security, Static Analysis (ID#: 15-4706)


Zhengyang Qu, Vaibhav Rastogi, Xinyi Zhang, Yan Chen, Tiantian Zhu, Zhong Chen; AutoCog: Measuring the Description-to-permission Fidelity in Android Applications; CCS '14 Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, November 2014, Pages 1354-1365. Doi: 10.1145/2660267.2660287  Abstract: The booming popularity of smartphones is partly a result of application markets where users can easily download wide range of third-party applications. However, due to the open nature of markets, especially on Android, there have been several privacy and security concerns with these applications. On Google Play, as with most other markets, users have direct access to natural-language descriptions of those applications, which give an intuitive idea of the functionality including the security-related information of those applications. Google Play also provides the permissions requested by applications to access security and privacy-sensitive APIs on the devices. Users may use such a list to evaluate the risks of using these applications. To best assist the end users, the descriptions should reflect the need for permissions, which we term description-to-permission fidelity. In this paper, we present a system AutoCog to automatically assess description-to-permission fidelity of applications. AutoCog employs state-of-the-art techniques in natural language processing and our own learning-based algorithm to relate description with permissions. In our evaluation, AutoCog outperforms other related work on both performance of detection and ability of generalization over various permissions by a large extent. On an evaluation of eleven permissions, we achieve an average precision of 92.6% and an average recall of 92.0%. Our large-scale measurements over 45,811 applications demonstrate the severity of the problem of low description-to-permission fidelity. AutoCog helps bridge the long-lasting usability gap between security techniques and average users.
Keywords: android, google play, machine learning, mobile, natural language processing, permissions (ID#: 15-4707)


Chuangang Ren, Kai Chen, Peng Liu; Droidmarking: Resilient Software Watermarking for Impeding Android Application Repackaging; ASE '14 Proceedings of the 29th ACM/IEEE International Conference On Automated Software Engineering, September 2014, Pages 635-646. Doi: 10.1145/2642937.2642977  Abstract: Software plagiarism in Android markets (app repackaging) is raising serious concerns about the health of the Android ecosystem. Existing app repackaging detection techniques fall short in detection efficiency and in resilience to circumventing attacks; this allows repackaged apps to be widely propagated and causes extensive damages before being detected. To overcome these difficulties and instantly thwart app repackaging threats, we devise a new dynamic software watermarking technique - Droidmarking - for Android apps that combines the efforts of all stakeholders and achieves the following three goals: (1) copyright ownership assertion for developers, (2) real-time app repackaging detection on user devices, and (3) resilience to evading attacks. Distinct from existing watermarking techniques, the watermarks in Droidmarking are non-stealthy, which means that watermark locations are not intentionally concealed, yet still are impervious to evading attacks. This property effectively enables normal users to recover and verify watermark copyright information without requiring a confidential watermark recognizer. Droidmarking is based on a primitive called self-decrypting code (SDC). Our evaluations show that Droidmarking is a feasible and robust technique to effectively impede app repackaging with relatively small performance overhead.
Keywords: android, app repackaging, software watermarking (ID#: 15-4708)


Articles listed on these pages have been found on publicly available internet pages and are cited with links to those pages. Some of the information included herein has been reprinted with permission from the authors or data repositories. Direct any requests for removal of the links or modifications to specific citations via email. Please include the ID# of the specific citation in your correspondence.