AppDNA: App Behavior Profiling via Graph-Based Deep Learning

Title: AppDNA: App Behavior Profiling via Graph-Based Deep Learning
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Xue, S., Zhang, L., Li, A., Li, X., Ruan, C., Huang, W.
Conference Name: IEEE INFOCOM 2018 - IEEE Conference on Computer Communications
Date Published: April 2018
ISBN Number: 978-1-5386-4128-6
Keywords: app behavior profiling, app recommendation, AppDNA, benign apps, encoding, feature extraction, function-call-graph-based app profiling, graph theory, graph-based deep learning, graph-encoding method, Human Behavior, invasive software, learning (artificial intelligence), machine learning, malicious apps, Malware, malware analysis, malware classification, malware detection, Metrics, mobile applications behaviors, mobile computing, Neural networks, pattern classification, Plagiarism, privacy, pubcrawl, Resiliency, Task Analysis

A better understanding of mobile applications' behaviors would lead to better malware detection and classification and to better app recommendation for users. In this work, we design a framework, AppDNA, that automatically generates a compact representation of each app to comprehensively profile its behaviors. The behavioral difference between two apps can be measured by the distance between their representations. As a result, the versatile representation can be generated once for each app and then used for a wide variety of objectives, including malware detection, app categorization, and plagiarism detection. Based on a systematic and deep understanding of an app's behavior, we propose function-call-graph-based app profiling. We carefully design a graph-encoding method that converts a typically extremely large call graph into a 64-dimensional fixed-size vector to achieve robust app profiling. Our extensive evaluations on 86,332 benign and malicious apps demonstrate that our system performs app profiling (and thus malware detection, classification, and app recommendation) with high accuracy at extremely low computation cost: it classifies 4,024 apps as benign or malware in about 5.06 seconds with an accuracy of about 93.07%; it classifies 570 malware samples into 21 families in about 0.83 seconds with an accuracy of 82.3%; and it classifies the functionality of 9,730 apps with an accuracy of 33.3% across 7 categories and 88.1% across 2 categories.
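
The idea of comparing apps by the distance between their fixed-size representations can be illustrated with a minimal sketch. It does not reproduce the paper's graph encoder: the vectors below are hypothetical toy values standing in for real 64-dimensional app representations, the Euclidean metric is an assumption (the abstract does not name the distance used), and the function names `distance` and `nearest_label` are illustrative only.

```python
import math

DIM = 64  # representation size stated in the abstract


def distance(a, b):
    """Euclidean distance between two equal-length vectors.

    The choice of metric is an assumption for illustration.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_label(query, labeled):
    """Return the label of the stored vector closest to `query`.

    `labeled` is a list of (vector, label) pairs.
    """
    return min(labeled, key=lambda item: distance(query, item[0]))[1]


# Hypothetical toy vectors, not real app representations.
benign = [0.0] * DIM
malware = [1.0] * DIM
query = [0.9] * DIM
labeled = [(benign, "benign"), (malware, "malware")]

print(distance(benign, malware))
print(nearest_label(query, labeled))
```

Because each representation is computed once per app, a lookup like this only compares short vectors and never revisits the original call graphs, which is consistent with the low computation cost the abstract reports.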

Citation Key: xue_appdna:_2018