2. Reverse-engineering: Using GDB on Linux
At Holberton School, we have had a couple rounds of a ‘#forfun’ project called
crackme. For these projects, we are given an executable that accepts a password.
Our assignment is to crack the program through reverse engineering. For round
one we were given four Linux tools to use, and we had to demonstrate how to find
the answer with each tool. It quickly became apparent that comparing passwords with a
standard library string function is a bad idea, as is hardcoding passwords into the
executable in plain text. Another round demonstrated that the ltrace tool could reveal not
only the password from string comparisons, but also the hashing method (MD5 in that
case) needed to recover the password.
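To illustrate why a plain-text password inside an executable is easy to find, here is a minimal Python sketch that scans a blob of bytes for runs of printable characters, similar in spirit to what the strings utility does. The byte blob and the password in it are made up for the example and are not taken from any crackme file.

```python
import re

def printable_runs(blob, min_len=4):
    """Return runs of printable ASCII at least min_len long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, blob)]

# Hypothetical bytes standing in for a compiled executable (assumption, not crackme data).
fake_binary = (
    b"\x7fELF\x02\x01\x01\x00"      # header-like bytes, too short a run to report
    + b"\x00" * 8
    + b"hunter2secret\x00"          # a password hardcoded in plain text
    + b"\x55\x48\x89\xe5"           # a few machine-code bytes
    + b"Usage: %s password\x00"     # an ordinary message string
)

print(printable_runs(fake_binary))  # ['hunter2secret', 'Usage: %s password']
```

Anything that survives compilation as readable text shows up in a scan like this, which is why the later rounds stopped storing the password that way.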
3. Reverse-engineering: Using GDB on Linux
This week we were given another crack at hacking. I went to my go-to tool for
reverse-engineering, the GNU Project Debugger (aka GDB), to find the password. If
you would like to take a shot at cracking the executable, you can find it at
Holberton School’s GitHub. The file relevant to this post is crackme3.
4. Program Checks
Before I dig too deep into the exec file, I check what information I can get from it.
First, I do a test run of the file to see what error information is provided.
$ ./crackme3
Usage: ./crackme3 password
For this executable the password is expected to be provided on the command line.
5. Program Checks
The next check I run is ltrace just to see if the password will appear. In addition, it
can provide some other useful information about how the program works.
$ ltrace ./crackme3 password
__libc_start_main(0x40068c, 2, 0x7ffcdd754bd8, 0x400710 <unfinished ...>
strlen("password") = 8
puts("ko"ko
) = 3
+++ exited (status 1) +++
The output shows that the program checks the length of the string, but nothing
indicates whether that check is what rejects the password. Time to break down the program.
6. The GNU Project Debugger
GDB is a tool developed for Linux systems with the goal of helping developers
identify sources of bugs in their programs. In their own words, from the gnu.org
website:
GDB, the GNU Project debugger, allows you to see what is going on 'inside'
another program while it executes—or what another program was doing at the
moment it crashed.
7. The GNU Project Debugger
When reverse engineering a program, the tool is used to review the compiled
Assembly code in either the AT&T or Intel syntax to see step by step what is
happening. Breakpoints are added to stop the program midstream and review the data
in registers and memory to identify how it is being manipulated. I will cover these
steps in more detail below.
8. The Anatomy of Assembly
To get started, I entered the command to launch the crackme3 file with GDB
followed by the disass command and the function name. The output is a list of
Assembly instructions that direct each action of the executable.
$ gdb ./crackme3
(gdb) disass main
10. The Anatomy of Assembly
In the previous slide, the AT&T and Intel syntaxes are displayed side-by-side.
However, the output will actually display only one of the two. I prefer to use the
AT&T format because the flow makes more sense to me. The first column provides
the address of the instruction. The next column is the instruction itself followed by
the data source and the destination. Jumps and function calls are followed by the
jump location or function name. Intel syntax reverses the data
source and destination in its display. There are additional differences in the
command names and data syntaxes, but this is common when comparing scripts of
two different languages that perform the same function. If I were writing Assembly,
my syntax preference might be different and would be based on more than just
flow of information.
11. The Logic Flow
Every program depends heavily on logic flow. Depending on the compiler and options
selected when compiled, the flow of the Assembly code could be straightforward or
very complex. Some options intentionally obfuscate the flow to disrupt attempts to
reverse engineer the executable. Below is the output of the disass main command
in AT&T syntax.
13. The Logic Flow
The portions of the output not highlighted are jumps and cleanup steps
before exit. There are four types of jumps in the output of main and
check_password: je, jmp, jne, and jbe. The jmp instruction performs the described
jump regardless of condition. The other three are conditional jumps. The first two, je
and jne, are straightforward. They mean jump if equal and jump if not equal. The
last, jbe, is used here in a loop and means jump if below or equal, the unsigned
counterpart of jump if less than or equal.
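The jump conditions can be modeled in a few lines of Python. This is only a sketch of how the zero flag (ZF) and carry flag (CF) decide each jump after a comparison of two 32-bit values; the function names are made up for the example, and the operand order is simplified to "left compared against right".

```python
MASK32 = 0xFFFFFFFF

def flags_after_cmp(left, right):
    """Model ZF and CF after computing left - right on 32-bit values."""
    left &= MASK32
    right &= MASK32
    zero_flag = left == right
    carry_flag = left < right  # an unsigned subtraction that needs a borrow
    return zero_flag, carry_flag

def jump_taken(mnemonic, left, right):
    """Return True if the named jump would be taken after the comparison."""
    zero_flag, carry_flag = flags_after_cmp(left, right)
    conditions = {
        "jmp": True,                      # unconditional
        "je": zero_flag,                  # jump if equal
        "jne": not zero_flag,             # jump if not equal
        "jbe": carry_flag or zero_flag,   # jump if below or equal (unsigned)
    }
    return conditions[mnemonic]

print(jump_taken("je", 0x41, 0x41))        # True
print(jump_taken("jne", 0x74, 0x41))       # True
print(jump_taken("jbe", 3, 4))             # True
print(jump_taken("jbe", 0xFFFFFFFF, 1))    # False
```

The last line shows why the unsigned reading matters: as an unsigned value 0xFFFFFFFF is far above 1, so jbe is not taken, even though the same bits read as a signed number would be -1.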
14. The Heart of the Question
Ultimately we are looking for the password. Based on the disassembly of main,
the program relies primarily on the check_password function to determine
whether to exit or provide access. To analyze the process happening in that
function, I entered disass check_password.
16. The Heart of the Question
The first thing I confirm is that the length of the password entered is important. The
program looks for a password that is four characters long. The instruction at
0x400632 actually shows the password in integer form, but I did not recognize it
immediately. That value is stored in memory four bytes before the memory address
stored in the RBP register. I use x/xw $rbp - 0x4 to print the value. The ‘x’ stands
for hexadecimal, the easiest format to see how the password is stored, and the ‘w’
reads a four-byte word. From the disassembly, a comparison of two registers, rax and
rdx, occurs at 0x40066a. This is where the next step of my investigation leads.
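In GDB's x command, the letters after the slash can carry a display format (x for hexadecimal) and a unit size (b for byte, h for two-byte halfword, w for four-byte word). The Python sketch below shows how the same four bytes look under each size. It assumes the bytes at that stack location match the constant the post later reports seeing in RCX (0x4434241); that link is an assumption for illustration.

```python
import struct

# Assumption: these four bytes are what sits at $rbp - 0x4, in little-endian order.
memory = bytes([0x41, 0x42, 0x43, 0x04])

as_bytes = [hex(b) for b in memory]                      # comparable to x/4xb
as_halfword = hex(struct.unpack("<H", memory[:2])[0])    # comparable to x/xh
as_word = hex(struct.unpack("<I", memory)[0])            # comparable to x/xw

print(as_bytes)      # ['0x41', '0x42', '0x43', '0x4']
print(as_halfword)   # 0x4241
print(as_word)       # 0x4434241
```

Reading only a halfword would show half of the stored value, so the word size is the one that displays the whole four-character password at once.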
17. Registered and Certified
(gdb) b *0x40066a
(gdb) run test
Starting program: /home/vagrant/reverse_engineering/crackme3/crackme3 test
Breakpoint 1, 0x000000000040066a in check_password ()
(gdb) info registers
18. Registered and Certified
I set a breakpoint to analyze the data in process. Breakpoints do exactly what their
name says: they interrupt the process at the given instruction address. Once the
breakpoint is set, I start the executable with the command run test. The value
‘test’ is the four-character password I used to get past the length test and into the
password comparison. Once the breakpoint is triggered, I enter info registers to
view the data in the registers at the point the program was interrupted.
19. Registered and Certified
Register Data On Each Loop
1: RDX = 0x41 = A
2: RDX = 0x42 = B
3: RDX = 0x43 = C
4: RDX = 0x04 = ^D (EOT, end of transmission)
Slide legend: blue marks user input, green marks the stored password.
20. Registered and Certified
From the register information, I find two integers are stored: 0x74 and 0x41. The
ASCII value of the letter ‘t’ is 0x74. The printable letter for 0x41 is ‘A’. I also noticed
that RCX holds an integer value of 0x4434241. Read one byte at a time starting from the
low end, which is how a little-endian machine stores it, it is 41, 42, 43 and 04.
Converted to character values it is A, B, C and ^D, the EOT (end of transmission) control character.
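The decoding step can be reproduced in Python. The sketch below splits the RCX value into little-endian bytes, prints each as a character, and then mimics the behavior described in this post: a length check followed by a byte-by-byte comparison. The check_password function here is my own reconstruction from those observations, not a decompilation of the real function.

```python
RCX_VALUE = 0x4434241  # integer reported in RCX at the breakpoint

def little_endian_bytes(value, width=4):
    """Split an integer into bytes, lowest byte first."""
    return value.to_bytes(width, "little")

def describe(byte):
    """Show printable bytes as characters and control bytes in caret notation."""
    if byte < 0x20:
        return "^" + chr(byte + 0x40)
    if byte <= 0x7E:
        return chr(byte)
    return hex(byte)

def check_password(candidate, stored=RCX_VALUE):
    """Hypothetical reconstruction: length check, then compare byte by byte."""
    secret = little_endian_bytes(stored)
    if len(candidate) != len(secret):
        return False
    return all(given == expected for given, expected in zip(candidate, secret))

secret = little_endian_bytes(RCX_VALUE)
print([hex(b) for b in secret])        # ['0x41', '0x42', '0x43', '0x4']
print([describe(b) for b in secret])   # ['A', 'B', 'C', '^D']
print(check_password(b"test"))         # False
print(check_password(secret))          # True
```

With the input test, the first pair compared is 0x74 against 0x41, which matches the two values seen in the registers.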
21. Registered and Certified
Inputting the password is tricky. Bash and the terminal treat Ctrl + D as an end-of-input
signal, so the character isn’t passed to the executable. In fact, it is commonly used to
exit interactive programs. I tried to store it in a file, but emacs treats ^D (Ctrl + D) as
an editing command rather than inserting the character. My workaround is
to use an online ASCII to text converter and paste into a file through Atom. It adds a
new line character which I remove with emacs.
22. Registered and Certified
To get the password past Bash and into the executable, I use the command line
below. Bash reads the file contents through command substitution and passes them to
the executable as an argument, so the ^D byte is never typed at the terminal and is not
interpreted as end of file or end of transmission.
$ ./crackme3 $(< 0-password)
Congratulations!
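As an alternative to pasting from an online converter, the password file can be written byte for byte with a short script. The Python sketch below writes the four bytes with no trailing newline and reads them back to confirm. The run_crackme helper is a hypothetical convenience that passes the same bytes directly as an argument; it is defined but not called here, since it assumes the crackme3 binary is in the current directory.

```python
import os
import subprocess
import tempfile

PASSWORD = bytes([0x41, 0x42, 0x43, 0x04])  # "ABC" followed by the EOT control byte

def write_password_file(path):
    """Write the raw password bytes with no trailing newline."""
    with open(path, "wb") as handle:
        handle.write(PASSWORD)

def run_crackme(binary="./crackme3"):
    """Hypothetical helper: hand the password to the program as an argument."""
    return subprocess.run([os.fsencode(binary), PASSWORD], capture_output=True)

# Write to a temporary directory so the example does not touch existing files.
workdir = tempfile.mkdtemp()
password_path = os.path.join(workdir, "0-password")
write_password_file(password_path)

with open(password_path, "rb") as handle:
    written = handle.read()

print(written)        # b'ABC\x04'
print(len(written))   # 4
```

Because the bytes go straight from the script into the file or the argument list, nothing along the way has a chance to interpret ^D as a keystroke.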
23. Conclusion
GDB is a powerful tool that is useful for much more than I have covered in this post.
Take the time to read the documentation from GNU to learn more. I am confident
there are many other tools that can be used as well. Share your go-to tool for
reverse-engineering or debugging in the comments below.
24. About Rick Harris
Student at Holberton School, a project-based,
peer-learning school developing full-stack
software engineers in San Francisco.
Involved in the IT industry for 6+ years most
recently as a part of the Office of Information
Technologies at the University of Notre Dame.
Keep in Touch
Twitter: @rickharris_dev
LinkedIn: rickharrisdev