Friday, July 1, 2011

Thoughts on binary obfuscation

The purpose of binary obfuscation is to prevent an adversary from reconstructing the high-level logic of your program (see the Wikipedia article on reverse engineering).

I'm not an expert in this area. I'm writing this entry so I can organize my thoughts and refer to it later.

The most popular disassembler is IDA Pro, and the most popular decompiler is the Hex-Rays Decompiler.

I'll assume the adversary is using a combination of static analysis and debugging with IDA Pro to extract a high-level description of your program.

Basics

The details and the individual tricks change over time, but the basic idea remains the same:

Make the adversary's life as difficult as possible by using as many tricks as you can.

Require a debugger

Compressing or encrypting your executable will force the adversary into using a debugger, and will foil simple tools like strings. See the Wikipedia page on executable compression for a list of "packers".
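To get a feel for why this foils strings, here is a toy sketch. It is not a real packer: real ones compress or encrypt whole sections and prepend an unpacking stub. The function names and the XOR key are my own inventions for illustration. GNU strings reports runs of at least 4 printable characters by default, so the sketch measures the longest printable run before and after a trivial XOR encoding.

```c
#include <stddef.h>

/* Length of the longest run of printable ASCII bytes (0x20..0x7e),
 * roughly what a strings-like tool looks for. */
size_t longest_printable_run(const unsigned char *buf, size_t len) {
    size_t best = 0, run = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= 0x20 && buf[i] <= 0x7e) {
            if (++run > best) best = run;
        } else {
            run = 0;
        }
    }
    return best;
}

/* Toy "packer" (hypothetical): XOR every byte with a fixed key.
 * Applying it a second time with the same key restores the input. */
void xor_bytes(unsigned char *buf, size_t len, unsigned char key) {
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key;
    }
}
```

With a key that has the high bit set (0xAA, say), every printable ASCII byte maps to a value of 0x80 or above, so the encoded data contains no printable runs at all. The adversary has to let the program decode itself, which usually means running it under a debugger.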

Complicate your control flow

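  1. Insert junk code and conditional jumps
    puts("hello\n");

    turns into (here and below, true() stands for any function that always returns nonzero):

    if (true()) goto l1;
    asm(".byte 42");
    l1:
    puts("hello\n");

    Many disassemblers will choke on this, since they may try to decode the junk byte as the start of an instruction.

    It's better to use something that's not obviously junk:

    if (true()) goto l1;
        puts("world\n");
    l1: puts("hello\n");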
    
  2. Turn conditional jumps into indirect jumps
    if (true()) goto l1;
        puts("world\n");
    l1: puts("hello\n");

    turns into (using GCC's labels-as-values extension):

    void *ptr;
    if (true()) ptr = &&l1; else ptr = &&l0;
    goto *ptr;
    l0: puts("world\n");
    l1: puts("hello\n");
    
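  3. Complicate your conditionals
    void *ptr;
    if (true()) ptr = &&l1; else ptr = &&l0;
    goto *ptr;
    l0: puts("world\n");
    l1: puts("hello\n");

    turns into:

    void *ptr; int c;
    c = (true() || false()) && (getenv("HOSTNAME") != NULL || getenv("FROB") != NULL);
    if (c) ptr = &&l1; else ptr = &&l0;
    goto *ptr;
    l0: puts("world\n");
    l1: puts("hello\n");

    This combines well with the control flow modifications above. A condition whose outcome you know in advance, but which is hard for an analyzer to deduce, is called an opaque predicate. Note that this particular one only holds if HOSTNAME (or FROB) is set in the target environment.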

  4. Multiple return points
    void *ptr;
    if (true()) ptr = &&l1; else ptr = &&l0;
    goto *ptr;
    l0: puts("world\n");
    l1: puts("hello\n");

    turns into:

    void sub(void *l0, void *l1) {
      if (true()) goto *l1; else goto *l0;
    }
    sub(&&l0, &&l1);
    l0: puts("world\n");
    l1: puts("hello\n");

    The above is meant to be pseudocode, not valid C: a computed goto cannot jump into a different function. Furthermore, you shouldn't pass the possible return locations to sub() as arguments.

  5. Inline functions

    Let's assume that puts(x) is defined as:

    int puts(const char *str) {
      FILE *s = stdout;
      if (s->capacity > __STDOUT_CAPACITY) __fflush(s);
      __fappend(s, str);
      return 0;
    }
    

    By inlining, the code

    if (true()) goto l1;
        puts("world\n");
    l1: puts("hello\n");
    

    would turn into:

    if (true()) goto l1;
    FILE *s1 = stdout;
    if (s1->capacity > __STDOUT_CAPACITY) __fflush(s1);
    __fappend(s1, "world\n");
    l1: ;
    FILE *s2 = stdout;
    if (s2->capacity > __STDOUT_CAPACITY) __fflush(s2);
    __fappend(s2, "hello\n");
    
You can apply the transformations listed above repeatedly, and in any combination.
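As a rough sketch of how items 2 and 3 combine, here is a small example that compiles with GCC or Clang (it relies on the labels-as-values extension used above, so it is not standard C). The names always_true and pick_message are mine, and the predicate is deliberately simple. An optimizing compiler may well fold it away, so treat this as an illustration of the shape of the trick rather than something to ship.

```c
/* Opaque predicate: x * (x + 1) is a product of two consecutive integers,
 * so it is always even. Unsigned arithmetic wraps modulo a power of two,
 * which preserves parity, so overflow does not change the answer. */
static int always_true(unsigned x) {
    return (x * (x + 1u)) % 2u == 0u;
}

/* Returns "hello" for every input. The "world" branch is dead code that
 * an analyst first has to prove unreachable. */
const char *pick_message(unsigned x) {
    /* Jump table indexed by the predicate (GNU C labels-as-values). */
    void *targets[2] = { &&l0, &&l1 };

    goto *targets[always_true(x)];
l0:
    return "world";   /* never reached */
l1:
    return "hello";
}
```

Feeding the predicate a value that is only known at run time (an argument, an environment lookup, a counter) makes it harder for static analysis to discard the dead branch.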

Complicate your data flow

  1. Obfuscate identifiers and strings
    puts("hello\n");

    turns into:

    char buffer[7]; // 6 characters plus the terminating NUL
    decrypt(buffer, 6, "uryyb\n");
    buffer[6] = '\0';
    puts(buffer);
    
  2. Add unnecessary loads and stores

    This works best when combined with opaque predicates (complex conditional statements).

    char buffer[7];
    decrypt(buffer, 6, "uryyb\n");
    buffer[6] = '\0';
    puts(buffer);

    turns into:

    char buffer[7];
    decrypt(buffer, 6, "0ryyb\n");
    buffer[6] = '\0';
    global[0] = buffer[3];
    global[1] = 'h';
    if (true()) buffer[0] = global[1]; else buffer[0] = global[0];
    puts(buffer);
    
  3. Reuse global locations

    I'm out of steam. I will talk about this and other ways of obfuscating data flow another day.
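For item 1 above, the post leaves decrypt() undefined. The string "uryyb" happens to be ROT13 of "hello", so here is a minimal sketch of a ROT13 decoder that could play that role. The name rot13_decode and its exact signature are my assumptions, not something from a library, and ROT13 is of course trivially reversible; a real program would want something less recognizable.

```c
#include <stddef.h>

/* Hypothetical stand-in for decrypt(): ROT13-decodes src into dst.
 * Copies at most dst_size - 1 characters and always NUL-terminates
 * (unless dst_size is 0). Assumes an ASCII-compatible character set. */
void rot13_decode(char *dst, size_t dst_size, const char *src) {
    size_t i = 0;

    if (dst_size == 0) {
        return;
    }
    for (; src[i] != '\0' && i + 1 < dst_size; i++) {
        char c = src[i];
        if (c >= 'a' && c <= 'z') {
            c = (char)('a' + (c - 'a' + 13) % 26);
        } else if (c >= 'A' && c <= 'Z') {
            c = (char)('A' + (c - 'A' + 13) % 26);
        }
        dst[i] = c;   /* non-letters such as '\n' pass through unchanged */
    }
    dst[i] = '\0';
}
```

Called as rot13_decode(buffer, sizeof buffer, "uryyb\n") with a 7-byte buffer, this leaves "hello\n" in the buffer, and the plaintext never appears in the binary's data section.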

Other tricks

  1. Use an interpreter

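    I'll revert to simpler code. As before, this is pseudocode: do is a reserved word in C, so a real implementation needs a different function name.

    void *ptr;
    if (true()) ptr = &&l1; else ptr = &&l0;
    goto *ptr;
    l0: puts("world\n");
    l1: puts("hello\n");

    turns into:

    long do(int code, long arg) {
     static long reg;
     static long tmp;
     switch(code) {
      case 0: return reg;                           // load the register (arg is ignored)
      case 1: tmp = reg; reg = arg; return tmp;     // store arg, return the old value
      case 2: puts((const char*)arg); return 0;     // print a string
      case 3: puts((const char*)arg); return 0;     // print a string (redundant opcode)
      default: return 0;
     }
    }

    void *ptr;
    do(1, true());
    if (do(0,42)) ptr = &&l1; else ptr = &&l0;
    goto *ptr;
    l0: do(2,(long)"world\n");
    l1: do(3,(long)"hello\n");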
    
  2. Generate and execute code on the fly

    You can use something like GNU lightning to generate and execute code at run time, as an alternative to using an interpreter.

  3. Check whether a debugger is running, or if the program was modified

    There are many tricks, and most of them are platform-specific. One (mostly) portable trick is to checksum your code.

    int checksum = crc32((void*)proc_begin, proc_end - proc_begin);
    if (checksum != 0xfafef0) abort();
    

    These checks work best when combined with some of the other techniques:

    int checksum = crc32((void*)proc_begin, proc_end - proc_begin);
    void *ptr;
    // bit 15 of the checksum selects the opcode: it is 1 (store) for the expected value 0xfafef0
    do((checksum >> 15) & 0x1, true());
    if (do(0,42)) ptr = &&l1; else ptr = &&l0;
    goto *ptr;
    l0: do(2,(long)"world\n");
    l1: do(3,(long)"hello\n");
    
  4. Silently corrupt state

    Never error out when a debugger is detected or when execution reaches a location that shouldn't be reached. Silently corrupt some global state and continue on.

    if (true()) global[0] = 1; else global[0] = 0;
    if (checksum_valid()) global[1] = 1; else global[1] = 0;
    // way later, in another part of the program...
    int r1 = 42 / global[0]; // misbehaves (division by zero) if the first check failed
    // ...
    int r2 = 42 * global[1]; // yields 0 instead of 42 if the checksum was wrong
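The crc32() call in item 3 is left undefined above. Here is a minimal sketch of what it could look like, using the common bitwise CRC-32 (reflected polynomial 0xEDB88320). The names crc32_bytes and region_matches are my own, and how you obtain the start and end address of your code (proc_begin and proc_end above) is platform-specific and not shown. The expected value typically has to be patched into the binary after the build, since the code bytes are not known until the compiler is done.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 over an arbitrary byte range. Slow but table-free. */
uint32_t crc32_bytes(const void *data, size_t len) {
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return ~crc;
}

/* Returns nonzero if the bytes in [begin, end) still hash to expected. */
int region_matches(const void *begin, const void *end, uint32_t expected) {
    const unsigned char *b = begin;
    const unsigned char *e = end;
    return crc32_bytes(b, (size_t)(e - b)) == expected;
}
```

A software breakpoint usually works by patching a byte in the code (0xCC on x86), so a check like this notices both static modifications and many debugger breakpoints. As item 4 suggests, it is better to fold the result into later computations than to branch on it directly.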
    
