Remotely Observing Reverse Engineers to Evaluate Software Protection
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
Software often contains proprietary information, such as algorithms, intellectual property, and encryption keys, which malicious actors seek to access through reverse engineering. To preserve the confidentiality and integrity of these assets, programmers can apply protections to their software. Code obfuscation, in particular, aims to counter reverse engineers by making asset extraction and program tampering much more difficult. Despite decades of research into how best to generate and analyze code obfuscation and reverse engineering methods, prior efforts to model the hardness of obfuscation schemes and the efficacy of reverse engineering have failed to yield robust results. This, in turn, makes code obfuscation an unpredictable protection. The work here furthers the analysis of real-world obfuscation resilience by examining reverse engineers as they overcome obfuscation in solving synthetic challenges. The general process involves (1) generating reverse engineering challenges, (2) giving those challenges to reverse engineers to solve under remote supervision, (3) collecting fine-grained traces of the reverse engineering tasks performed, and (4) analyzing the resulting traces to build higher-level models of reverse engineer behavior. The success of this process hinges on the validity of the challenges, the ability to attract reverse engineer subjects, the robustness of the system in gathering and analyzing generated data, and the algorithms used to infer high-level attack operations from low-level trace data. Concretely, this dissertation documents the development, deployment, refinement, and ultimately the results of using the Catalyst Data Collection System (Catalyst) to collect trace data from reverse engineers in capture-the-flag competitions, in particular the Grand Reverse Engineering Challenge (GrandRe).
Specifically, it presents (1) a methodology and system to generate basic models of human behavior remotely and asynchronously with no supervision, (2) the application of this methodology and system to reverse engineering obfuscated code, and (3) the results of that application. Alongside this, I release the reverse engineering data sets and Catalyst software for further research.
Type
text
Electronic Dissertation
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Computer Science