Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph

Title: Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Alabdulmohsin, Ibrahim; Han, YuFei; Shen, Yun; Zhang, XiangLiang
Conference Name: Proceedings of the 25th ACM International Conference on Information and Knowledge Management
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-4073-1
Keywords: Algorithm, Bayesian inference, composability, data mining, download activity graph, edge detection, graph theory, label propagation, malware analysis, malware detection, malware mitigation, Metrics, pubcrawl, Resiliency, Scalability, security, semi-supervised learning

Malware detection has been widely studied by analysing either file-dropping relationships or the characteristics of the file distribution network. This paper is the first to study a global heterogeneous malware delivery graph that fuses file-dropping relationships with the topology of the file distribution network. This integration makes it possible to structure the end-to-end distribution relationship, but it also produces large heterogeneous graphs to analyse. In our study, an average daily graph has more than 4 million edges and 2.7 million nodes of different types, such as IPs, URLs, and files. We propose a novel Bayesian label propagation model that unifies this multi-source information, including content-agnostic features of the different node types and topological information of the heterogeneous network. Our approach needs neither to examine source code nor to inspect the dynamic behaviour of a binary. Instead, it estimates the maliciousness of a given file through a semi-supervised label propagation procedure whose time complexity is linear in the number of nodes and edges. An evaluation on 567 million real-world download events shows that the proposed approach detects malware efficiently and with high accuracy.
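
To make the idea of semi-supervised label propagation over a mixed graph of IPs, URLs, and files concrete, the following is a minimal sketch in Python. It is not the Bayesian model proposed in the paper, whose details the abstract does not give. It is a generic neighbour-averaging scheme, and the function name `propagate_labels`, the parameters `prior`, `alpha`, and `iterations`, and the node identifiers are all assumptions made for illustration. Each iteration visits every node once and every edge twice, so the cost per iteration is linear in the number of nodes and edges.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, prior=0.5, alpha=0.8, iterations=20):
    """Generic iterative label propagation (NOT the paper's Bayesian model).

    edges: iterable of (u, v) pairs between node ids such as "file:a.exe".
    seeds: dict mapping node id -> known score (1.0 malicious, 0.0 benign).
    Returns a dict mapping every connected node to a score in [0, 1].
    """
    # Build an undirected adjacency list; node type is only encoded in the id.
    neighbours = defaultdict(list)
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    # Unlabelled nodes start at the neutral prior; seeds keep their label.
    scores = {node: seeds.get(node, prior) for node in neighbours}

    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbours.items():
            if node in seeds:
                updated[node] = seeds[node]  # labelled nodes stay clamped
                continue
            mean = sum(scores[n] for n in adjacent) / len(adjacent)
            # Blend the neutral prior with the average neighbour score.
            updated[node] = (1 - alpha) * prior + alpha * mean
        scores = updated
    return scores


# Hypothetical toy graph: two delivery chains, one file labelled in each.
edges = [
    ("ip:203.0.113.7", "url:bad.example/a"),
    ("url:bad.example/a", "file:dropper.exe"),
    ("file:dropper.exe", "file:payload.dll"),
    ("ip:198.51.100.2", "url:good.example/b"),
    ("url:good.example/b", "file:installer.msi"),
]
seeds = {"file:dropper.exe": 1.0, "file:installer.msi": 0.0}

scores = propagate_labels(edges, seeds)
for node in sorted(scores):
    print(f"{node}: {scores[node]:.3f}")
```

In this toy graph, the unlabelled file dropped by the known-malicious dropper and the URL and IP that served it end up with scores above the neutral prior, while the nodes on the benign chain end up below it. A real system would also use per-type node features, which this sketch omits.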

Citation Key: alabdulmohsin_content-agnostic_2016