Thoughts on binary obfuscation
The purpose of binary obfuscation is to prevent an adversary from reconstructing the high-level logic of your program (see the Wikipedia article on reverse engineering).
I'm not an expert in this area. I'm writing this entry so I can organize my thoughts and refer to it later.
The most popular disassembler is IDA Pro, and the most popular decompiler is the Hex-Rays Decompiler.
I'll assume the adversary is using a combination of static analysis and debugging with IDA Pro to extract a high-level description of your program.
Basics
The details change over time, but the basics remain the same:
Make the adversary's life as difficult as possible by using as many tricks as you can.
The tricks change over time.
Require a debugger
Compressing or encrypting your executable will force the adversary into using a debugger, and will foil simple tools like strings. See the Wikipedia page on executable compression for a list of "packers".
Complicate your control flow
- Insert junk code and conditional jumps
puts("hello\n");
→
if (true()) goto l1; asm(".byte 42"); l1: puts("hello\n");
Many disassemblers will choke on this.
It's better to use something that's not obviously junk:
puts("hello\n");
→
if (true()) goto l1; puts("world\n"); l1: puts("hello\n");
- Turn conditional jumps into indirect jumps
if (true()) goto l1; puts("world\n"); l1: puts("hello\n");
→
void *ptr; if (true()) ptr = &&l1; else ptr = &&l0; goto *ptr; l0: puts("world\n"); l1: puts("hello\n");
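The snippet above relies on the labels-as-values extension (`&&label` and `goto *ptr`) found in GCC and Clang; it is not standard C. Here is a minimal compilable sketch of the same idea, assuming one of those compilers. The function name `pick` and its `cond` parameter are made up for illustration; `cond` stands in for `true()`, and the function returns the string instead of printing it so the result is easy to check.

```c
/* Returns the string that the taken branch would print.
 * Relies on the GNU C labels-as-values extension (GCC, Clang). */
const char *pick(int cond)
{
    /* A table of label addresses replaces the conditional jump. */
    void *targets[2] = { &&l0, &&l1 };
    void *ptr = targets[cond != 0];

    goto *ptr;              /* indirect jump through a pointer */
l0:
    return "world\n";
l1:
    return "hello\n";
}
```

Calling `pick(1)` takes the `l1` branch and `pick(0)` takes `l0`. In a real program the pointer would be computed somewhere less obvious than one line above the jump.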
- Complicate your conditionals
void *ptr; if (true()) ptr = &&l1; else ptr = &&l0; goto *ptr; l0: puts("world\n"); l1: puts("hello\n");
→
void *ptr; int c; c = (true() || false()) && (getenv("HOSTNAME") != NULL || getenv("FROB") != NULL); if (c) ptr = &&l1; else ptr = &&l0; goto *ptr; l0: puts("world\n"); l1: puts("hello\n");
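This combines very well with control flow modifications. A conditional whose outcome you know in advance, but which is hard to determine by static analysis, is called an opaque predicate.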
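As a small illustration of an opaque predicate, here is a sketch built on the fact that the product of two consecutive integers is always even. The names `opaque_true` and `greeting` are hypothetical, and the caller is assumed to pass in something only known at run time, such as the result of `getenv("HOSTNAME")`.

```c
#include <stdlib.h>

/* Always returns 1: x * (x + 1) is even for every x.
 * Unsigned arithmetic wraps modulo a power of two, which preserves
 * the low bit, so the argument still holds on overflow. */
int opaque_true(unsigned x)
{
    return (x * (x + 1u)) % 2u == 0u;
}

/* env_value may be NULL; its first byte seeds the predicate. */
const char *greeting(const char *env_value)
{
    unsigned x = env_value ? (unsigned)(unsigned char)env_value[0] : 7u;

    if (opaque_true(x))
        return "hello\n";
    return "world\n";       /* never reached, but looks reachable */
}
```

This particular predicate is simple enough that an optimizing compiler or a decent analysis tool may see through it. It only shows the shape of the trick; a predicate used in practice would need to be much harder to reason about.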
- Multiple return points
void *ptr; if (true()) ptr = &&l1; else ptr = &&l0; goto *ptr; l0: puts("world\n"); l1: puts("hello\n");
→
void sub(void *l0, void *l1) { if (true()) goto *l1; else goto *l0; } sub(&&l0, &&l1); l0: puts("world\n"); l1: puts("hello\n");
The above is meant to be pseudo code, not valid C code. Furthermore, you shouldn't pass the possible return points to sub() as arguments.
- Inline functions
Let's assume that puts(x) is defined as:
int puts(const char *str) { FILE *s = stdout; if (s->capacity > __STDOUT_CAPACITY) __fflush(s); __fappend(s, str); }
By inlining, the code
if (true()) goto l1; puts("world\n"); l1: puts("hello\n");
would turn into:
if (true()) goto l1; FILE *s1 = stdout; if (s1->capacity > __STDOUT_CAPACITY) __fflush(s1); __fappend(s1, "world\n"); l1: FILE *s2 = stdout; if (s2->capacity > __STDOUT_CAPACITY) __fflush(s2); __fappend(s2, "hello\n");
Complicate your data flow
- Obfuscate identifiers and strings
puts("hello\n");
→
char buffer[7]; decrypt(buffer, 7, "uryyb\n"); puts(buffer);
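The `decrypt` call above is left undefined. The ciphertext "uryyb" is "hello" under ROT13, so here is a sketch that assumes ROT13 and an ASCII-compatible character set. The parameter names are my own. Note that the destination needs room for the terminating NUL, which means seven bytes for "hello\n".

```c
#include <stddef.h>

/* Writes the ROT13 decoding of src into dst and NUL-terminates it.
 * n is the size of dst in bytes, including room for the terminator.
 * Assumes the letters a-z and A-Z are contiguous, as in ASCII. */
void decrypt(char *dst, size_t n, const char *src)
{
    size_t i;

    if (n == 0)
        return;
    for (i = 0; i + 1 < n && src[i] != '\0'; i++) {
        char c = src[i];
        if (c >= 'a' && c <= 'z')
            c = (char)('a' + (c - 'a' + 13) % 26);
        else if (c >= 'A' && c <= 'Z')
            c = (char)('A' + (c - 'A' + 13) % 26);
        dst[i] = c;         /* non-letters are copied unchanged */
    }
    dst[i] = '\0';
}
```

ROT13 is only enough to hide the string from tools like strings. Anyone stepping through the code in a debugger will see the plaintext as soon as the buffer is filled.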
- Add unnecessary loads and stores
This works best when combined with opaque predicates (complex conditional statements).
char buffer[7]; decrypt(buffer, 7, "uryyb\n"); puts(buffer);
→
char buffer[7]; decrypt(buffer, 7, "0ryyb\n"); global[0] = buffer[3]; global[1] = 'h'; if (true()) buffer[0] = global[1]; else buffer[0] = global[0]; puts(buffer);
- Reuse global locations
I'm out of steam. I will talk about this and other ways of obfuscating data flow another day.
Other tricks
- Use an interpreter
I'll go back to the simpler code:
void *ptr; if (true()) ptr = &&l1; else ptr = &&l0; goto *ptr; l0: puts("world\n"); l1: puts("hello\n");
→
long do(int code, long arg) { static long reg; static long tmp; switch(code) { case 0: return reg; case 1: tmp = reg; reg = arg; return tmp; case 2: puts((void*)arg); return 0; case 3: puts((void*)arg); return 0; default: return 0; } }
void *ptr; do(1, true()); if (do(0,42)) ptr = &&l1; else ptr = &&l0; goto *ptr; l0: do(2,(long)"world\n"); l1: do(3,(long)"hello\n");
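To make the interpreter idea a bit more concrete, here is a sketch of a tiny accumulator machine. The opcode names, the `struct insn` layout, and the function name `vm_run` are all invented for this example (`do` is a reserved word in C, so a compilable version needs a different name anyway). The point is that the program's logic moves out of native code and into a data table that a disassembler does not understand.

```c
#include <stddef.h>

enum { OP_LOAD, OP_ADD, OP_JZ, OP_HALT };

struct insn {
    int  op;    /* one of the OP_ constants above */
    long arg;   /* immediate value, or jump target for OP_JZ */
};

/* Runs prog until OP_HALT and returns the accumulator.
 * Returns -1 on an unknown opcode. */
long vm_run(const struct insn *prog)
{
    long acc = 0;
    size_t pc = 0;

    for (;;) {
        struct insn in = prog[pc++];
        switch (in.op) {
        case OP_LOAD: acc = in.arg;                       break;
        case OP_ADD:  acc += in.arg;                      break;
        case OP_JZ:   if (acc == 0) pc = (size_t)in.arg;  break;
        case OP_HALT: return acc;
        default:      return -1;
        }
    }
}
```

A program such as `{OP_LOAD, 1}, {OP_JZ, 4}, {OP_LOAD, 42}, {OP_HALT, 0}, {OP_LOAD, 7}, {OP_HALT, 0}` encodes an if/else: it returns 42 as written, and 7 if the first load is changed to 0. A real obfuscator would also scramble the opcode numbering and encrypt the table.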
- Generate and execute code on the fly
You can use something like GNU lightning to generate and execute code at run time. This would be an alternative to using the interpreter.
- Check whether a debugger is running, or if the program was modified
There are many tricks, and most of them are platform-specific. One (mostly) portable trick is to checksum your code.
int checksum = crc32((void*)proc_begin, proc_end - proc_begin); if (checksum != 0xfafef0) abort();
These checks work best when combined with some of the other techniques:
int checksum = crc32((void*)proc_begin, proc_end - proc_begin);
void *ptr;
do((checksum >> 15) & 0x1, true()); // isolate bit 15 of checksum
if (do(0,42)) ptr = &&l1; else ptr = &&l0;
goto *ptr;
l0: do(2,(long)"world\n");
l1: do(3,(long)"hello\n");
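The `crc32` call in the checksum snippets is not defined in this post. Below is a sketch of a plain bitwise CRC-32 (the reflected 0xEDB88320 polynomial used by zlib and PNG) over an arbitrary byte range. The names `crc32_bytes` and `region_intact` are my own. How you obtain the start and length of your code region, and how you patch the expected value into the binary after linking, is platform-specific and not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320. */
uint32_t crc32_bytes(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++) {
            /* XOR in the polynomial only when the low bit is set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Returns 1 if the region still hashes to the expected value. */
int region_intact(const void *begin, size_t len, uint32_t expected)
{
    return crc32_bytes(begin, len) == expected;
}
```

The standard check value for this CRC is 0xCBF43926 for the nine ASCII bytes "123456789", which is a handy sanity test. A table-driven version would be faster, but a slow, odd-looking loop is arguably less recognizable.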
- Silently corrupt state
Never error out when a debugger is detected, or code jumps to a location that shouldn't be reached. Silently corrupt some global state and continue on.
if (true()) global[0] = 1; else global[0] = 0;
if (checksum_valid()) global[1] = 1; else global[1] = 0;
// way later, in another part of the program...
int r1 = 42 / global[0];
// ...
int r2 = 42 * global[1];
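Here is a compilable sketch of the same idea, using the multiplication variant so that a failed check produces wrong results rather than a crash. The names `integrity`, `note_check` and `scale` are hypothetical; in a real program the state word would be hidden among other globals and consumed far away from the check.

```c
#include <stdint.h>

/* 1 while every check has passed so far, 0 afterwards. */
static uint32_t integrity = 1u;

/* Records a check result without branching to any error path. */
void note_check(int ok)
{
    /* Multiplying by 0 or 1: once a check fails, the value stays 0. */
    integrity *= (uint32_t)(ok != 0);
}

/* Some unrelated computation that is only correct while integrity is 1. */
uint32_t scale(uint32_t value)
{
    return value * integrity;
}
```

After `note_check(0)`, every later call to `scale` quietly returns 0, and passing checks afterwards does not repair the state. The adversary sees a program that misbehaves, with no obvious link back to the check that tripped.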