Ben Agre - Adding Another Level of Hell to Reverse Engineering


  1. 1. Adding Another Level of Hell to Reverse Engineering OR Static Binary Obfuscation using Opaque Predicates and Semi-Junk Code<br />Ben Agre (@sboxkid)<br />MIT<br />Raytheon SI<br />
  2. 2. Who am I<br />Ben Agre<br />Reverse Engineer<br />Worked random places<br />Currently work for Raytheon SI<br />Done Random things<br />Kind of an asshole<br />Currently a student at MIT<br />
  3. 3. Obligatory term slide<br />SDLC<br />Sandbox<br />APT<br />Cyber Pompeii<br />Cyber Eyjafjallajökull (Credit to Jon Oberheide)<br />Stuxnet<br />
  4. 4. Overview<br />Introduction to X86<br />Overview of current packers<br />Overview of current ways to beat packers<br />Why this is different/why I’m an asshole<br />
  5. 5. Assumptions<br />We assume that it is 32 bit x86 assembly<br />This can be extended and would work better with 64 bits, but was originally written for 32<br />All items are assumed to be cdecl calling convention<br />I don’t like my friends, that’s why I built this tool<br />
  6. 6. X86 Assembly<br />I apologize to those of you who know assembly: this is going to be review at best, and boring to tears at worst<br />x86 instructions are variable-length and not aligned, hence the order in which bytes appear matters<br />The smallest instruction is one byte and the largest is 15; anything longer raises a #GP (general protection) exception<br />
  7. 7. Eflags<br />Eflags is essentially the status register<br />It contains 32 bits, some of which are the flags used for conditional jumps<br />Important flags<br />ZF = Zero Flag<br />SF = Sign Flag<br />OF = Overflow Flag<br />CF = Carry Flag<br />
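To make the four flags concrete, here is a minimal Python sketch of how a 32-bit ADD would set them. The function name `add32_flags` and the dict layout are made up for illustration; this is not part of the tool.

```python
MASK32 = 0xFFFFFFFF


def add32_flags(a, b):
    """Return (result, flags) for a 32-bit ADD; flags holds ZF, SF, CF, OF as 0 or 1."""
    a &= MASK32
    b &= MASK32
    full = a + b                # unbounded sum, used to detect the carry out of bit 31
    result = full & MASK32      # what actually lands in the 32-bit register

    def sign(v):
        return (v >> 31) & 1    # top bit is the sign in two's complement

    flags = {
        "ZF": int(result == 0),
        "SF": sign(result),
        "CF": int(full > MASK32),
        # Signed overflow: both inputs share a sign and the result does not.
        "OF": int(sign(a) == sign(b) and sign(result) != sign(a)),
    }
    return result, flags
```

For instance, `add32_flags(0xFFFFFFFF, 1)` wraps to zero, so ZF and CF are set while OF stays clear (as signed values this is just -1 + 1).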
  8. 8. Basics<br />Mov r1,r2/imm1<br />Move register r2 or immediate imm1 into r1<br />Add/Sub r1,r2<br />Applies the operation to r1 and r2, and stores the result in r1<br />Modifies eflags appropriately<br />Xor r1,r2<br />eXclusive OR r1 and r2, and store the result in r1<br />Modifies eflags appropriately<br />Jmp<br />Jump to a chunk of code<br />
  9. 9. More Commands<br />imul, idiv<br />Signed multiply and divide<br />Affect edx:eax, and change the appropriate flags<br />Call addr<br />Call a function<br />
  10. 10. Conditional Jumps<br />JS<br />JE<br />JG<br />JLE<br />JZ<br />Jump if zero flag is set<br />JNZ<br />Jump if zero flag is not set<br />These all jump on the state of eflags<br />
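The jumps listed on this slide can be written down as predicates over the flags. This is a small Python sketch; the table name `JUMP_CONDITIONS` and the helper `jump_taken` are hypothetical, and flags are assumed to arrive as a dict of 0/1 values.

```python
# Each conditional jump is just a boolean test on eflags.
JUMP_CONDITIONS = {
    "JZ":  lambda f: f["ZF"] == 1,
    "JE":  lambda f: f["ZF"] == 1,                       # same encoding as JZ
    "JNZ": lambda f: f["ZF"] == 0,
    "JS":  lambda f: f["SF"] == 1,
    "JG":  lambda f: f["ZF"] == 0 and f["SF"] == f["OF"],
    "JLE": lambda f: f["ZF"] == 1 or f["SF"] != f["OF"],
}


def jump_taken(mnemonic, flags):
    """Return True if the given conditional jump would be taken for these flags."""
    return JUMP_CONDITIONS[mnemonic.upper()](flags)
```

Note that JG and JLE are exact complements, as are JZ and JNZ. That pairing is what the opaque-predicate tricks later in the deck lean on.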
  11. 11. Now that we’re out of Narnia, let’s shake it up<br />Packers were originally trying to make executables smaller<br />They are now used to be an ass to reverse engineers<br />People have their favorites<br />
  12. 12. General Packer Magic<br />Mangle the IAT<br />Make it so on each outside function call it’s hard to figure out where things are going<br />Do some operation to all data<br />Uncompress it<br />Usually add some anti debugging magic<br />Armadillo parent child debugging<br />Themida, anything it can think of<br />
  13. 13. Current direction<br />Currently there is a large push towards making virtual machines<br />This approach leads closer to generic defeats: one learns the VM’s language and deals with it<br />Tracing is a pain<br />
  14. 14. ASProtect<br />Some opaque predicates<br />Creates stack madness<br />Virtualizes many things<br />
  15. 15. Themida<br />“state of the art”<br />Uses highly virtualized systems<br />Locks the binary in every way it can<br />CISC architecture<br />Hates VMs<br />
  16. 16. Both have been kicked badly<br />Themida has had its full VM reversed by a pair of Chinese hackers<br />Apparently a modified CISC architecture, or RISC for older versions<br />Softworm did amazing things in this respect<br />ASProtect<br />A thousand tutorials on how to beat it<br />These systems create a high initial bar to entry but not continued protection<br />
  17. 17. Destroying Them<br />There is currently a pair of IDA modules for Themida decompiling being sold on the black market<br />This shows how broken this model can be at times<br />Packing, for all intents and purposes, is deterministic<br />Not IND-CCA secure<br />
  18. 18. Terms<br />This seems random but is important<br />Functionally isomorphic<br />Two functions that do the same thing but look different<br />State isomorphic<br />Two states that do the same thing, but look different<br />Opaque Predicate<br />A question which you know the answer to before you ask it<br />If a term doesn’t make sense ask<br />
  19. 19. Let’s create a way that is different<br />Instead of virtualizing the entire system, let’s stick with x86<br />Instead of making one high bar of entry, let’s play against the tools<br />We can actually modify these binaries to the point where they won’t look the same<br />Example<br />
  20. 20. Previous work<br />Kenshoto<br />MathIsHard<br />Binary is public, packer is not<br />Does more function rearranging than function obfuscation<br />Some packers employ basic junk code, but it’s always actual junk<br />We use semi-junk<br />
  21. 21. What this is<br />It’s a packer which is state aware and uses that to its advantage<br />It adds little pieces of assembly to be executed<br />Also adds items from /dev/urandom in order to mess up instruction alignment<br />Non-Deterministic<br />Always executes no matter how things change on the OS<br />
  22. 22. Why you care<br />Since it’s a bit different than the normal way<br />Instead of creating a high startup cost we create a continued use cost<br />It’s still straight x86 assembly no matter what<br />It uses the junk, so it’s hard to tell real code from fake code<br />
  23. 23. Mode of operation<br />I take some function or group of functions from a fully compiled binary; let’s call the function A<br />I take A and I reassemble it into A’<br />A’ is functionally isomorphic to A<br />However, A’ can look nothing like A<br />Opaque predicates are added, as well as the random bytes<br />Original function is nopped out<br />Functions become longer and have to get rewritten to the end of the program<br />Call indirection added<br />
  24. 24. Objectives<br />Create a nondeterministic obfuscator<br />Make IDA DIAF<br />Make a semi-extensible intermediate representation of the assembly<br />Make my friends hate me<br />???<br />Profit on the tears of my friends?<br />
  25. 25. Why This is different<br />Randomization<br />In cryptography, to make it harder for an adversary, you randomize your plaintext, making it plaintext aware<br />What this means<br />I can pass in a binary twice and get two completely different results<br />
  26. 26. Design Decisions<br />There are two separate ways we analyze the program<br />Previous state engine<br />Analyze the program, look for opaque predicates<br />xor eax,eax is awesome for this<br />Created state engine<br />AKA Dynamic state engine<br />Can modify elements, and will use them until they change<br />
  27. 27. Call indirection<br />So in our dynamic engine at times we have to fix things up<br />We also may not want to actually place function addresses for calls<br />IDA uses these to recursively find functions<br />
  28. 28. What is a call<br />Call 0xdeadb33f<br />Push eip<br />Jmp 0xdeadb33f<br />What could a call be<br />Push eip<br />Push 0xdeadb33f<br />retn<br />
  29. 29. Now how do we rewrite this with stubs<br />F(retnOffset, callAddress)<br />Switch(retnOffset)<br />Case x:<br />Ret = retnOffset[x]<br />Push ret<br />Push callAddress<br />return<br />Each stub is essentially a mini function with a switch table<br />We pregenerate a lookup table (retnOffset)<br />Based on the value, push the parent return address<br />Then push the address of the function to call<br />Return<br />This calls callAddress and will then return to the parent function, bypassing the stub on return<br />
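A rough Python model of the stub described on this slide, with the stack as a list. `make_stub`, `site_index` and the addresses are hypothetical names and values for illustration; the real stubs are x86 and their exact layout is not shown in the slides.

```python
def make_stub(return_table):
    """Build a stub around a pregenerated table of parent return addresses."""
    def stub(stack, site_index, call_address):
        stack.append(return_table[site_index])  # push the parent's return address
        stack.append(call_address)              # push the function we really want
        return stack.pop()                      # 'ret' pops that target into eip
    return stub


# Example: two call sites share one stub (addresses are invented).
stub = make_stub({0: 0x401005, 1: 0x401020})
stack = []
eip = stub(stack, 1, 0xDEADB33F)
```

After the stub runs, `eip` is the callee and the top of the stack is the parent's return address, so the callee's own `ret` goes straight back to the parent without passing through the stub again.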
  30. 30. Other debated way to do this<br />Short call that pushes eip<br />Push function to go to<br />Retn<br />Issue with this is that call is easy to find<br />
  31. 31. A third way<br />Push value to jmp to, either offset or address<br />Do essentially xchg [esp+4],[esp]<br />Retn<br />Else do something like<br />Pop eax<br />Jmp eax<br />
  32. 32. Finding opaque predicates<br />Some actions have definitive outcomes before they are ever used<br />Xor r1,r1<br />Sub r2,r2<br />These will always set eflags in one specific way, or throw an exception<br />
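As a sketch of what "finding" these looks like, the Python below recognizes the two self-clearing forms from this slide in a line of assembly text and reports the flag values that are known afterwards. The regex and the function name `known_flags_after` are assumptions for illustration; the actual tool works on vtrace's view of the binary, not on text.

```python
import re

# Matches "xor reg, reg" or "sub reg, reg" with plain register operands.
_SELF_CLEAR = re.compile(r"^\s*(xor|sub)\s+(\w+)\s*,\s*(\w+)\s*$", re.IGNORECASE)


def known_flags_after(instruction):
    """Return the flags fixed by `xor r,r` / `sub r,r`, or None if nothing is known."""
    m = _SELF_CLEAR.match(instruction)
    if m and m.group(2).lower() == m.group(3).lower():
        # The result is always zero: ZF set, sign/carry/overflow clear.
        return {"ZF": 1, "SF": 0, "CF": 0, "OF": 0}
    return None
```

Once ZF is known to be 1, any following JZ is always taken and any JNZ never is, which is exactly the kind of question you know the answer to before you ask it.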
  33. 33. However these are not the only predicates<br />JZ <br />If the jump is taken we know that the zero flag is set<br />Else it’s not<br />Hence we can reason below it<br />Add a JNZ, and then throw in some junk<br />We know that the jump will be taken, a valid code path followed and our junk will still mess up IDA<br />
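Here is a hedged Python sketch of the "add a JNZ, then throw in some junk" step for the fall-through side of a JZ, where ZF is known to be clear. It emits a short JNZ (opcode 0x75 with an 8-bit displacement) that hops over a run of random bytes. The function name `jnz_over_junk` and the `rng` parameter are made up; `os.urandom` stands in for reading /dev/urandom.

```python
import os

JNZ_SHORT = 0x75  # opcode for JNZ/JNE with an 8-bit relative displacement


def jnz_over_junk(junk_len, rng=os.urandom):
    """Emit `jnz +junk_len` followed by junk_len random bytes that never execute."""
    if not 1 <= junk_len <= 127:
        raise ValueError("a rel8 forward jump can skip 1..127 bytes")
    # The displacement is relative to the end of the 2-byte jump,
    # so it equals the number of junk bytes to skip.
    return bytes([JNZ_SHORT, junk_len]) + rng(junk_len)
```

Execution always takes the jump, but a linear-sweep or recursive disassembler that considers the fall-through path will try to decode the random bytes as instructions and lose alignment.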
  34. 34. Still too easy<br />JZ then JNZ is fairly easy to spot<br />Well, we could add some do-nothing instructions if we wanted<br />If we know that after the item is used, there is nothing pertaining to EAX until a mov eax, [edx], we can throw in some instructions<br />Add eax,ecx<br />Xor eax, eax<br />These do not change the flow of the program, yet still make RE harder<br />Creates an isomorphic state<br />
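The "nothing pertaining to EAX until it is overwritten" condition is a liveness check. A minimal Python sketch, assuming straight-line code and that each instruction has already been summarized as a pair of (registers read, registers written); the name `is_dead` and that representation are hypothetical.

```python
def is_dead(reg, following):
    """Return True if `reg` is overwritten before it is read in the code that follows.

    `following` is a list of (reads, writes) sets, one per instruction, in order.
    """
    for reads, writes in following:
        if reg in reads:
            return False        # someone still needs the current value
        if reg in writes:
            return True         # value is clobbered first: safe to scribble on
    return False                # never overwritten: conservatively treat as live


# Example: "mov ebx, ecx" then "mov eax, [edx]"
following = [({"ecx"}, {"ebx"}), ({"edx"}, {"eax"})]
```

With this example, `is_dead("eax", following)` holds, so junk such as `add eax,ecx` can be dropped in before it. The same test would need to cover eflags too, since those junk instructions also change the flags.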
  35. 35. Adding little stubs<br />So now that we have some instructions we can throw, we can actually make little sub funcs essentially<br />We do some calculation with eax, push it onto the stack and since we controlled the last few things we did, undo it<br />
  36. 36. Looks kinda like<br />JNZ (Program logic)<br />Inc eax (makes eax not zero, compare and jump left out due to space constraints)<br />Add eax,edx (edx can be whatever, we don’t care)<br />Push eax<br />Mov eax,[esp+88]<br />JNZ our code<br />After JNZ, random bytes<br />Pop eax<br />Their code<br />Before any item using eax, overwrites eax<br />
  37. 37. Well, so we’re still pretty easy<br />Let’s bend the program to our will<br />Dynamic state isomorphisms<br />Calling conventions are awesome<br />CDECL means that the program makes some assumptions on function calls<br />EBX stays static<br />However, on call, there are no assumptions about eax, ecx, edx. This means we can mess with these before and after the program executes, except eax after<br />
  38. 38. Now we’re getting somewhere<br />We can change items before and after the code executes.<br />We can also do things like change items in the middle of execution<br />So if we do some operation where we know how it will modify eflags, and the result is changed a bit later without being used<br />Xor eax,eax<br />We can add a jump that goes where we want, and just add junk afterwards<br />
  39. 39. Now why is this Semi-Junk<br />Since we can fix items up inside of these random little stubs<br />If we fix things up inside of these little stubs, then when people look for completely dead code to remove, it won’t be flagged<br />It also means that during execution a trace will get a lot of chaff from our items.<br />Hard to distinguish between our code and program code<br />
  40. 40. We’re not deterministic<br />There are a lot of things that make this nondeterministic<br />Our semi-junk can take one of many separate indeterminate forms<br />Our prologue junk can be as long as we want and can redo or undo anything in a short or long version<br />
  41. 41. Hence<br />Things look different every time we do packing<br />This means that each time a person wants to fix it up, they need to redo the entire process by hand<br />If we rearrange functions and then reapply the packer, then the RE has to do it all again from scratch<br />
  42. 42. Other Features Not Discussed<br />Max length of basic blocks<br />No more than, let’s say, 5 lines can appear together; this is just a parameter<br />Tunable parameters for semi-junk code<br />Hence one can have the preambles be short or long<br />Also can tell it to prefer registers<br />
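One plausible reading of the max-length parameter is sketched below in Python: original instructions are cut into runs of at most `max_run` lines, with the cut points chosen at random, so that semi-junk can be placed between the runs. This is an assumption about how such a parameter could behave, not the tool's actual logic; `split_runs` and its arguments are hypothetical.

```python
import random


def split_runs(instructions, max_run, rng=None):
    """Split a list of instructions into consecutive runs of 1..max_run items."""
    rng = rng or random.Random()
    runs, i = [], 0
    while i < len(instructions):
        n = rng.randint(1, max_run)        # random run length, capped by the parameter
        runs.append(instructions[i:i + n])
        i += n
    return runs
```

Because the run lengths are drawn fresh on each call, packing the same function twice gives different split points, which fits the nondeterminism goal.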
  43. 43. Future Work<br />Add other architectures<br />Move from nasm to my own assembler<br />Yet to be built<br />Maybe add some anti debugging foo just for lulz<br />
  44. 44. Added bonus<br />FLIRT<br />FLIRT is based on signatures of functions<br />Heavily relies on prologues, hence if we randomize the prologues FLIRT no longer picks up the signatures<br />Makes static binaries so much worse than they already are<br />
  45. 45. Field tests<br />Two groups: 2 highly skilled, 1 skilled, 1 novice in each group<br />One group got the program before packing<br />One got the program after packing<br />Calculated sum of a Fibonacci sequence with memory, using two arrays; nontrivial but not the hardest<br />Also had some other random functions to mess with them<br />Dropped privileges, changed prologues, some other red herrings<br />
  46. 46. Results<br />Without packing<br />Around half an hour<br />With<br />Around 9<br />Novice gave up<br />
  47. 47. Tool Design<br />This tool is based on vtrace<br />Thank you kenshoto<br />Uses nasm for assembling the instructions required<br />Functions are rewritten at the end of the program, will add pages if necessary<br />
  48. 48. Tool Release<br />This tool will most likely be released in the next month after finals<br />I added a feature three weeks ago and it borked so many things<br />Based on vtrace, so one must download it separately<br />I’ll probably tweet it or something<br />
  49. 49. Thanks<br />For helping me design and build<br />Thing1<br />Design<br />d4s, Visi, Psifertex, Metr0, Nitrik<br />For just being epic<br />Draugr<br />Raid<br />Gynophage<br />Bliss<br />Hates Irony<br />Kenshoto<br />Profs Zeldovich and Rivest<br />Both of whose classes were awesome<br />The busticati, forever busticating<br />The NY Crew, who are too many to name<br />And all not enumerated herein<br />
  50. 50. Release Addendum<br />Will probably be released after my finals, so around May 28th<br />I will most likely announce via twitter, @sboxkid<br />Email me at bagre@mit.edu if you want to know anything else.<br />
