Visible to the public BCD: Decomposing Binary Code Into Components Using Graph-Based Clustering

TitleBCD: Decomposing Binary Code Into Components Using Graph-Based Clustering
Publication TypeConference Paper
Year of Publication2018
AuthorsKarande, Vishal, Chandra, Swarup, Lin, Zhiqiang, Caballero, Juan, Khan, Latifur, Hamlen, Kevin
Conference NameProceedings of the 2018 on Asia Conference on Computer and Communications Security
ISBN Number978-1-4503-5576-6
Keywordsbinary code decomposition, components, compositionality, Cyber physical system, decomposition, graph-based clustering, Predictive Metrics, pubcrawl, Resiliency, Scientific Computing Security

Complex software is built by composing components implementing largely independent blocks of functionality. However, once the sources are compiled into an executable, that modularity is lost. This is unfortunate for code recipients, for whom knowing the components has many potential benefits, such as improved program understanding for reverse-engineering, identifying shared code across different programs, binary code reuse, and authorship attribution. A novel approach for decomposing such source-free program executables into components is here proposed. Given an executable, the approach first statically builds a decomposition graph, where nodes are functions and edges capture three types of relationships: code locality, data references, and function calls. It then applies a graph-theoretic approach to partition the functions into disjoint components. A prototype implementation, BCD, demonstrates the approach's efficacy: Evaluation of BCD with 25 C++ binary programs to recover the methods belonging to each class achieves high precision and recall scores for these tested programs.

Citation Keykarande_bcd:_2018