[Research Paper] Combining Obfuscation and Optimizations in the Real World

Title: [Research Paper] Combining Obfuscation and Optimizations in the Real World
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Guelton, Serge; Guinet, Adrien; Brunet, Pierrick; Martinez, Juan Manuel; Dagnat, Fabien; Szlifierski, Nicolas
Conference Name: 2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)
Date Published: September
Keywords: Code Obfuscation, code obfuscator, code tangling techniques, composability, cryptography, Heuristic algorithms, industrial-strength obfuscator, Job shop scheduling, Kernel, Metrics, multicriteria optimization problem, obfuscation, optimisation, Optimization, pass combinations, pass ordering, pass scheduling problem, program compilers, pubcrawl, Resiliency, reverse engineer, reverse engineering, rewriting systems, scheduling, scheduling code transformations, sequential pass management techniques, tool-specific countermeasures, Tools, virtualization, white box cryptography, white-box encryption calls
Abstract: Code obfuscation is the de facto standard to protect intellectual property when delivering code in an unmanaged environment. It relies on additive layers of code tangling techniques, white-box encryption calls and platform-specific or tool-specific countermeasures to make it harder for a reverse engineer to access critical pieces of data or to understand core algorithms. The literature provides plenty of different obfuscation techniques that can be used at compile time to transform data or control flow in order to provide some kind of protection against different reverse engineering scenarios. Scheduling code transformations to optimize a given metric is known as the pass scheduling problem, a problem known to be NP-hard, but solved in practice using hard-coded sequences that are generally satisfactory. Adding code obfuscation to the problem introduces two new dimensions. First, as a code obfuscator needs to find a balance between obfuscation and performance, pass scheduling becomes a multi-criteria optimization problem. Second, obfuscation passes transform their inputs in unconventional ways, which means some pass combinations may not be desirable or even valid. This paper highlights several issues met when blindly chaining different kinds of obfuscation and optimization passes, emphasizing the need for a formal model to combine them. It proposes a non-intrusive formalism to leverage sequential pass management techniques. The model is validated on real-world scenarios gathered during the development of an industrial-strength obfuscator on top of the LLVM compiler infrastructure.
Citation Key: guelton_research_2018