Automatic Deobfuscation and Reverse Engineering of Obfuscated Code
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
Automatic malware analysis is an essential part of today's computer security practices. Nearly one million malware samples were delivered to analysts every day in 2014 alone, and the number of samples submitted for analysis increases almost exponentially each year. Given the size of the threat we face today and the amount of malicious code emerging every day, the ability to automatically analyze unknown and unwanted software is more critical than ever. Malware writers, for their part, adapt their malicious code to new security measures to keep it from being exposed and detected. This is usually achieved by employing obfuscation techniques that complicate reverse engineering and analysis of the code by adding large amounts of unnecessary and irrelevant computation. Most malicious samples found in the wild are obfuscated and equipped with complicated anti-analysis defenses intended to hide the malicious intent of the malware by defeating the analysis and/or increasing the analysis time. Deobfuscation (reversing the obfuscation) requires automatic techniques that extract the original logic embedded in the obfuscated code for further analysis. Presumably, the deobfuscated code is easier to analyze and requires less analysis time than the obfuscated code. Previous approaches target specific types of obfuscation and make strong assumptions about the underlying protection scheme, leaving opportunities for adversaries to attack. This work addresses this limitation by proposing new program analysis techniques that are effective against code obfuscation while remaining generic, minimizing the assumptions made about the underlying code. We find that standard program analysis techniques, including well-known data and control flow analyses and/or symbolic execution, suffer from imprecision due to the obfuscation, and we show how to mitigate this loss of precision.
Using more precise program analysis techniques, we propose a deobfuscation technique that successfully reverses complex obfuscation techniques such as virtualization-obfuscation and/or Return-Oriented Programming (ROP).
Type
text
Electronic Dissertation
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Computer Science