REVERSE
ENGINEERING
An Introduction to Cracking Windows Applications




                                                   Saswat Padhi
                                                   3rd Year UG, CSE

                                                   Hostel 8, IIT Bombay
Presentation for STAB Annual Seminar Weekday   2




                      DISCLAIMER
• This presentation is intended for educational purposes only.
• Reverse engineering copyrighted material without authorization
 can be illegal and unethical, and I do not encourage it in any way.

• Playing with malware is not a good idea unless you take
  proper precautionary measures. Always work within a
  sandboxed environment, for example a virtual machine.
• Malware analysis is “advanced” stuff. And when you choose to
  mess with a nasty app, it is expected that you know what you
  are doing. Only you are responsible for what happens to your
  system.




What is this heavy term?
• Reverse engineering simply aims at “understanding” a system
 through analysis of its structure, function and operation.

• We try to go backwards in the development cycle: starting from
 an implementation, we work back to the analysis stage to
 reach a high-level understanding.

• With that abstract understanding, we try to modify parts of the
 existing implementation, or implement it on our own!

[Image: Complete disassembly of a Pentax K1000 camera.
 Image borrowed from bitrebel.com.]




And why should someone learn this?
• Malware analysis: Analyzing malware to build anti-malware
• Bug fixing: Fixing bugs in legacy software
• Personalization/Customization
• Academic/Learning purposes
• Removing access restrictions
• Removal of copy protection
• Compatibility
• Just for fun!
• Convey the message “Go Open Source”! :-P




Wait … Isn’t this “illegal”!?
• Public release of information obtained through reversing a
 proprietary application or sharing the application after
 modifying it to remove/reduce security is illegal.


• The DMCA allows reverse engineering of applications to achieve
  interoperability.
• Analysis of malware is of course legal!
• “Clean room” design is perfectly legal.
  • A team of examiners writes a specification for the target software.
  • Several reviews ensure exclusion of any copyrighted material.
  • A separate team of developers re-implements the software.




A bit of history…
• AMD reverse engineered Intel’s early processors (and
 outperformed them!).
• ReactOS, an open source clone of Windows, is still under active
 development. It’s not based on *nix systems at all.
• Phoenix developed its BIOS by reverse engineering IBM’s
 BIOS with a clean room approach. The Phoenix BIOS helped give
 birth to IBM-compatible PCs.
• Wine reverse engineers Windows to support the Windows API.
• OpenOffice.org reverse engineers Microsoft Office to
 support its proprietary file formats.
• Samba enables file sharing between Windows and non-Windows
 systems. To do so, its developers had to reverse engineer
 Windows file sharing.




What we would be discussing today…
Reverse engineering is a whole field in itself. We will only
touch upon a few fundamental points that should give you a
basic feel for it (and hopefully motivate you to dig further
into it)…

• PE identification: packers, identifying the source language
• Decompilers, disassemblers and debuggers
• Introduction to OllyDbg’s features
• Where to start? Where to focus?
• Patching with OllyDbg
• Serial fishing and KeyGen-ing




PEiD
• Identifying a PE (Portable Executable) file is an important first
 step. It involves identifying the compiler, cryptor, packer, etc.
 that produced it.
• This gives you a starting point to look for a solution.
  • Unpacking needed?
  • De-compilation possible?
  • Crypto libraries used?

• PEiD is one of the most widely used PE identification tools (470+
 signatures, and it can be extended with external signatures).
• If PEiD says “Nothing Found”, the application might be using a
 custom packer.
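
Before any signature matching can happen, a tool has to confirm that a file is a PE at all. The sketch below shows only that first check, on an in-memory byte buffer: a DOS header starting with "MZ", whose e_lfanew field at offset 0x3C points to a "PE\0\0" signature. It is not how PEiD itself is implemented, and the names looks_like_pe and read_u32_le are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian 32-bit value from `buf` at offset `off`.
// The caller must make sure that off + 4 <= buf.size().
static std::uint32_t read_u32_le(const std::vector<std::uint8_t>& buf, std::size_t off) {
    assert(off + 4 <= buf.size());
    return static_cast<std::uint32_t>(buf[off]) |
           (static_cast<std::uint32_t>(buf[off + 1]) << 8) |
           (static_cast<std::uint32_t>(buf[off + 2]) << 16) |
           (static_cast<std::uint32_t>(buf[off + 3]) << 24);
}

// Returns true if `image` starts with a DOS "MZ" header whose e_lfanew field
// (at offset 0x3C) points to a "PE\0\0" signature inside the buffer.
bool looks_like_pe(const std::vector<std::uint8_t>& image) {
    if (image.size() < 0x40) return false;                  // too small for a DOS header
    if (image[0] != 'M' || image[1] != 'Z') return false;   // DOS magic
    const std::uint32_t pe_off = read_u32_le(image, 0x3C);  // e_lfanew
    if (pe_off > image.size() || image.size() - pe_off < 4) return false;
    return image[pe_off] == 'P' && image[pe_off + 1] == 'E' &&
           image[pe_off + 2] == 0 && image[pe_off + 3] == 0;
}
```

Identifying the compiler or packer would then compare bytes around the entry point against a signature database, which this sketch does not attempt.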

 [Images: PEiD (PEiD), PEiD (OllyDbg), PEiD (KeyGenMe_#6)]




Packers ?!
• Packers compress a compiled program, much as an archiver
 compresses data.
• They compress the code section, the data section, or both, and
 can additionally encrypt them.
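[Diagram: a packer turns a compiled executable (CODE, DATA) into a
 packed executable (CODE, DATA, STUB); an unpacker reverses this.
 The compiled executable is labelled with the OEP (Original Entry
 Point), the packed executable with its Entry Point.]
• Unpacking: the extractor stub decompresses and/or decrypts the
 data and code, and either writes them to a separate file or does
 it in memory.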

• Several packers exist. Most of them are selective about the
 type of PE they support.
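
To make the pack/unpack round trip concrete, here is a toy sketch. It assumes a deliberately simple scheme that no real packer uses: run-length encoding for the "compression" and a single-byte XOR for the "encryption". The names toy_pack and toy_unpack are hypothetical; toy_unpack plays the role of the extractor stub working in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// "Pack": run-length encode the section, then XOR every output byte with `key`.
// Format before the XOR: repeated (run length 1..255, byte value) pairs.
Bytes toy_pack(const Bytes& section, std::uint8_t key) {
    Bytes packed;
    for (std::size_t i = 0; i < section.size();) {
        std::size_t run = 1;
        while (i + run < section.size() && run < 255 && section[i + run] == section[i]) ++run;
        packed.push_back(static_cast<std::uint8_t>(run ^ key));
        packed.push_back(static_cast<std::uint8_t>(section[i] ^ key));
        i += run;
    }
    return packed;
}

// "Stub": undo the XOR, then expand each (run length, value) pair in memory.
Bytes toy_unpack(const Bytes& packed, std::uint8_t key) {
    assert(packed.size() % 2 == 0);  // packed data always consists of whole pairs
    Bytes section;
    for (std::size_t i = 0; i + 1 < packed.size(); i += 2) {
        const std::uint8_t run = static_cast<std::uint8_t>(packed[i] ^ key);
        const std::uint8_t value = static_cast<std::uint8_t>(packed[i + 1] ^ key);
        section.insert(section.end(), static_cast<std::size_t>(run), value);
    }
    return section;
}
```

In a real packed executable the stub is machine code placed at the new entry point; once it has restored the sections, it jumps to the OEP.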




Packers ?!
• Some of the most popular (and thus most wanted targets for
 unpackers) are ASPack, Enigma, Themida, MPRESS, UPX etc.
• Most packed PEs are manually customized to avoid usage of
 direct unpackers.
• Customized executables might not even be recognized by PE
 identifiers! So we might not have any idea of the “behavior” of
 the packer.
• Without direct unpackers, we have to manually step through
 the executable and find the original entry point. (When we
 step through code in a debugger, the application is actually
 executing, so take precautions when trying this with malware.)




Why should you know the compiler?
• Java and Python compile sources to byte code, which runs on a
 virtual machine, instead of machine code.
• The same is true of managed code on .NET: C#, VB and all other
 .NET languages compile to MSIL (Microsoft Intermediate
 Language).
• The meta-data present with the intermediate byte code is
 enough to make satisfactory “de-compilation” possible.
• For example, Java byte code contains names of classes and all
 members; types and modifiers of fields; signatures of methods.
 Even names of local variables and line numbers can be saved
 optionally, by setting specific attributes.




Why should you know the compiler?
• Some of the things de-compilers need to be “smart” about are:
   • Recovering structured control flow (loops and conditionals) from the
     gotos in byte code.
   • Inferring types for locals, especially with generics, which Java
     resolves (and erases) at compile time.


• This is very different from the compilation with debug info (-g)
 that can be used with C/C++ compilers.
  • Debug info is just a mapping from binary to source file.
  • If the sources are missing, debugger would not “magically” show them!


• Machine code is so much “reduced” that it’s almost impossible
 to “grow” it back to a high level source code.
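
The control-flow point can be illustrated in plain C++ rather than real byte code. The first function below is written the way compiled code is shaped, with only labels and jumps; the second is the structured loop a decompiler would try to recover from it. Both function names are invented for this sketch.

```cpp
#include <cassert>

// Shape a decompiler starts from: only labels, a conditional jump and a jump back.
int sum_to_n_gotos(int n) {
    int sum = 0;
    int i = 1;
loop_head:
    if (i > n) goto loop_exit;  // conditional jump out of the loop
    sum += i;
    ++i;
    goto loop_head;             // unconditional jump back to the test
loop_exit:
    return sum;
}

// Shape a decompiler tries to recover: the same logic as a structured loop.
int sum_to_n_structured(int n) {
    int sum = 0;
    for (int i = 1; i <= n; ++i) sum += i;
    return sum;
}

// Sanity check that both shapes compute the same value for small inputs.
void check_equivalence() {
    for (int n = 0; n <= 20; ++n) assert(sum_to_n_gotos(n) == sum_to_n_structured(n));
}
```

Recognizing that a backward jump plus a forward conditional jump form a loop is easy here; nested loops, breaks and exception handlers make the pattern matching much harder.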




Some Available Decompilers…

• .NET MSIL decompiler: Reflector (freeware up to 6.x)

 [Images: KeyGenMe #6, Reflector(KeyGenMe #6), UnpackMe, Reflector(UnpackMe)]

• Java decompiler: JD (freeware and decent; there are a dozen others!)

 [Images: SK_CrackMe, JD(SK_CrackMe)]




What if you can’t decompile?
• Well… the fun begins!

• Disassemblers and debuggers translate binary machine
 code into assembly instructions. Debuggers offer additional
 features such as setting breakpoints and stepping through the
 code.
• OllyDbg is one of the best light-weight debuggers available. IDA
 Pro is a heavy-weight, paid disassembler and debugger (but it’s
 worth it!).
• OllyDbg supports external plugins, which makes it an even
 more powerful tool.
• IDA Pro can recognize “known” library functions by their
 signatures, which can greatly simplify the assembly listing.
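
As a rough picture of what "machine code to assembly" means, the sketch below decodes a handful of x86 encodings: NOP (90), RET (C3), TEST AL,AL (84 C0) and the short jumps JZ (74), JNZ (75) and JMP (EB). Everything else is printed as a raw byte. It is a toy under those assumptions, nowhere near a real disassembler, and decode_one and disassemble are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode one instruction at `pos`; returns its text and advances `pos` past it.
// Only a few one- and two-byte x86 encodings are recognized.
std::string decode_one(const std::vector<std::uint8_t>& code, std::size_t& pos) {
    assert(pos < code.size());
    const std::uint8_t op = code[pos];
    const bool has_next = pos + 1 < code.size();

    if (op == 0x90) { pos += 1; return "NOP"; }
    if (op == 0xC3) { pos += 1; return "RET"; }
    if (op == 0x84 && has_next && code[pos + 1] == 0xC0) { pos += 2; return "TEST AL,AL"; }
    if ((op == 0x74 || op == 0x75 || op == 0xEB) && has_next) {
        // Short jumps carry a signed 8-bit displacement relative to the next instruction.
        const int rel = static_cast<std::int8_t>(code[pos + 1]);
        const char* name = op == 0x74 ? "JZ" : op == 0x75 ? "JNZ" : "JMP";
        pos += 2;
        return std::string(name) + " SHORT " + (rel >= 0 ? "+" : "") + std::to_string(rel);
    }
    pos += 1;
    return "DB " + std::to_string(op);  // unknown byte, shown as raw data (decimal)
}

// Decode the whole buffer into a listing, one entry per instruction.
std::vector<std::string> disassemble(const std::vector<std::uint8_t>& code) {
    std::vector<std::string> listing;
    for (std::size_t pos = 0; pos < code.size();) listing.push_back(decode_one(code, pos));
    return listing;
}
```

A real disassembler also has to handle prefixes, ModRM/SIB bytes, and the problem of telling code from data, which is where tools like OllyDbg and IDA Pro earn their keep.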




More about OllyDbg…
• Its emphasis is on code analysis.

• Olly contains descriptions of about 2200 standard C/C++ library
 and Win32 API functions. It also knows about 7800 symbolic
 constants (grouped into 490 types).
• Olly can detect nested loops, switches and cascaded ifs. It can
 also predict register usage.
• You can add a comment to any line of the disassembly.

• You can assemble your own instructions on the spot and re-build
 the program. (Great for patching!)
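
What such a patch amounts to at the byte level can be sketched as follows. The function searches a code buffer for TEST AL,AL followed by a short JZ (bytes 84 C0 74 xx) and overwrites the two-byte jump with two NOPs, so execution always falls through. Whether falling through or always jumping is the right fix depends on how the target lays out its branches, and a blind byte search can also match data, so treat this as an illustration only. The name nop_out_jz_after_test is made up here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Finds the first "TEST AL,AL ; JZ SHORT rel8" sequence (84 C0 74 xx) and
// overwrites the two-byte JZ with two NOPs (90 90).
// Returns the offset of the patched jump, or -1 if the pattern was not found.
std::ptrdiff_t nop_out_jz_after_test(std::vector<std::uint8_t>& code) {
    for (std::size_t i = 0; i + 3 < code.size(); ++i) {
        if (code[i] == 0x84 && code[i + 1] == 0xC0 && code[i + 2] == 0x74) {
            code[i + 2] = 0x90;  // was the JZ opcode
            code[i + 3] = 0x90;  // was the jump displacement
            return static_cast<std::ptrdiff_t>(i + 2);
        }
    }
    return -1;
}
```

In OllyDbg the same change is made interactively by assembling NOP (or JMP) over the selected instruction and saving the modified executable.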

 [Images: KeyGenMe #2, OllyDbg(KeyGenMe #2)]




Finding the needle(s) in a haystack…
• Patience and lateral thinking!
• Examine “All referenced strings”
  • Look at the code around the “good boy” (success) or “bad boy” (failure) message
• Examine “All inter-modular calls”
  • Get used to the Windows API (Olly does provide context-sensitive help)
  • I/O calls are a usual starting point for examination
    • msvcrt.printf
    • msvcrt.scanf
    • USER32.GetMessageX (X is A or W, the ANSI or Unicode variant)
    • USER32.GetDlgItemTextX

• Use breakpoints at suspicious DLL calls
• Use memory breakpoints on the addresses storing the entered key
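
The idea behind a referenced-strings view can be approximated without a debugger: scan the raw bytes of the executable for runs of printable ASCII. The sketch below does just that; it is not how OllyDbg builds its list (Olly follows references from the code), and extract_strings with its min_len parameter is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Collect every run of at least `min_len` printable ASCII bytes (0x20..0x7E).
std::vector<std::string> extract_strings(const std::vector<std::uint8_t>& image,
                                         std::size_t min_len = 4) {
    assert(min_len > 0);
    std::vector<std::string> found;
    std::string current;
    for (std::uint8_t byte : image) {
        if (byte >= 0x20 && byte <= 0x7E) {
            current.push_back(static_cast<char>(byte));
        } else {
            if (current.size() >= min_len) found.push_back(current);
            current.clear();
        }
    }
    if (current.size() >= min_len) found.push_back(current);  // run ending at end of image
    return found;
}
```

Prompts, success and failure messages, and hard-coded keys all tend to show up in such a listing, which is why string references make good starting points.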




A (very) simple example
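#include <iostream>
#include <string>
using namespace std;

int main() {
    string username, password, therealpass;

    cout << "USERNAME : "; cin >> username;
    cout << "PASSWORD : "; cin >> password;

    therealpass = "TOP_SECRET";  // hard-coded, so it ends up in the binary as plain text

    if (password == therealpass) cout << "You deserve an award!\n";
    else                         cout << "Y U No Give up?\n";
}

[Files: Case_1, Olly(Case_1)]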



• Suspicious entities in the dump (good starting points):
   • both the final messages,
   • therealpass “TOP_SECRET”.
• The hex dump alone hints at a “possible” password, and
  trying it out already gives the good boy message!




Another example (Patching)
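#include <iostream>
#include <string>
using namespace std;

int main() {
    string username, password, therealpass;

    // Keep asking until the username is 10 to 99 characters long.
    while (username.length() > 99 || username.length() < 10) {
        cout << "USERNAME : "; cin >> username;
    }
    cin >> password;

    // Key: <last 4 chars>-<length digits, units first>-<first 4 chars>
    therealpass = username.substr(username.length() - 4) + "-";
    therealpass += char('0' + username.length() % 10);
    therealpass += char('0' + username.length() / 10);
    therealpass += "-" + username.substr(0, 4);

    if (password == therealpass) cout << "You deserve an award!\n";
    else                         cout << "Y U No Give up?\n";
}

[Files: Case_2, Olly(Case_2), Case_2_PATCHED]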

• Patch the JZ after TEST AL,AL to bypass the check and always output the good boy.
• When trying to analyze what is done with the password (KeyGen-ing), more
  interesting questions come up: why breakpoints (BPs) at strcmp are useless, and
  why some passwords hit no BP at memcmp while others do…
• This not only helps you make a KeyGen, but also helps you discover low-level GCC
  implementation details!
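
Once the key derivation is understood, a KeyGen is just that derivation turned into a function of the username. The sketch below mirrors the Case_2 listing above: last four characters, the two digits of the length with the units digit first, then the first four characters. In a real exercise this logic would have to be recovered from the disassembly; make_key is a name chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Rebuilds the key that the Case_2 check computes for `username`.
// Precondition (enforced by the program's input loop): 10 <= length <= 99.
std::string make_key(const std::string& username) {
    const std::size_t len = username.length();
    assert(len >= 10 && len <= 99);
    std::string key = username.substr(len - 4) + "-";  // last four characters
    key += static_cast<char>('0' + len % 10);          // units digit of the length
    key += static_cast<char>('0' + len / 10);          // tens digit of the length
    key += "-" + username.substr(0, 4);                // first four characters
    return key;
}
```

For the 10-character username reverseeng this yields eeng-01-reve, which is the password the unpatched Case_2 program would accept for that user.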




Some awesome links…
• AMD - Rise & Fall: http://www.techspot.com/article/599-amd-rise-and-fall/
• ..cantor.dust..: http://www.toolswatch.org/2012/08/blackhat-arsenal-2012-releases-cantor-dust-next-generation-of-visualization-tools/
• ReactOS project: http://www.reactos.org

• OllyDbg: http://ollydbg.de




                    http://crackmes.de




Phew…




         Thank you!
        Any questions?
