Lab 7 - Debugger
Accessing lldb
The automatic process to load lldb seems to be broken. In the meantime, when you SSH to the portal, you should start by issuing the following command to enable lldb:
    module load gcc/14.2.0 clang/18.1.8
In this lab, you’ll be practicing using the debugger to continue to trace x86-64 assembly. This time, however, you’ll be tracing code that you did not write!
Reviewing the Debugger
Take a moment before you begin to review how to use the debugger from the Using the Debugger reading. You may also find the example walkthrough reading helpful.
Enigma Box
Dr. E. Nigma has created a new line of fiendish programming puzzle boxes: the enigma boxes! Each enigma box is a binary program that contains a sequence of locks that must be solved. The key to each lock is encoded in how the program is written, so you’ll need to trace the execution of her enigma box carefully! The box expects you to type a particular string, the key, to solve and open the lock. However, once you open one lock, you find yet another inner box with yet another lock!
But be warned! If you type the correct string, the lock is opened and you proceed on to the next lock. If you type the wrong string, the lock emits a shrieking, shrill buzz! Well, it doesn’t actually make a sound. It does, however, print a message announcing the buzz. It then tells us that the buzz happened and terminates.
Work together
In lab, we strongly encourage you to work with one another (just for lock 1). Reading binary is much more fun and effective with someone else to talk to.
You should not work together on lock 2 and beyond; that is HOMEWORK 6.
Grading
You’ll use the same enigma box for this lab and for the following homework.
For lab, you need to either (a) have a TA record that you were part of a team that solved the first lock or (b) solve the first lock of your enigma box.
For the homework, you’ll need to solve the additional locks on your own.
Each time your enigma box buzzes with a failed attempt to unlock, it notifies Dr. E. Nigma’s enigma box server, which she has graciously provided to us this semester. If we’re notified of 20 buzzes we’ll start removing points each time your box buzzes. So run your box carefully with the debugger!
How to proceed
- Run our script to get a unique enigma box in your home directory: getabox
- cd box# (where # is your enigma box number).
- Read the README
- You are welcome to look at box.c. It isn’t very interesting, though. Or is it?
- Use the debugger to understand what the box and each of the locks are doing
- Only run the enigma box ./box once you are confident you can solve a puzzle (or at least avoid a buzzer and its notification)
- Once you solve a lock, visit the scoreboard to verify that we saw your success.
Hints
If you run your enigma box with a command line argument, for example, ./box lock-solutions.txt, then it will read the input lines from lock-solutions.txt until it reaches EOF (end of file), and then switch over to reading what you type at the command line (stdin). This will keep you from having to re-type solutions.
Because you want to avoid buzzes on incorrect lock solutions, you’ll want to set a breakpoint before you run the program so that you can stop the program before it gets to the function that does the buzzing.
You might find it useful to run objdump --syms box to get a list of all symbols in the enigma box file, including all function names, as a starting point on where you want your breakpoint.
The best way to solve the locks is to use your favorite debugger to step through the disassembled binary. Almost no students succeed without using a debugger like lldb or gdb. We recommend using lldb. On the department machines, if you did not run the script from Lab 1, you can enable lldb by running module load gcc/14.2.0 clang/18.1.8.
To avoid accidentally submitting an incorrect solution to a lock on your box, you will need to learn how to single-step through the assembly code and how to set breakpoints. You will also need to learn how to inspect both the registers and the memory states.
It may be helpful to use various utilities for examining the enigma box program outside a debugger, as described in “examining the executable” below.
Enigma Box Usage
- The enigma box ignores blank input lines.
- If you run your enigma box with a command line argument, for example, ./box lock-solutions.txt, then it will read the input lines from lock-solutions.txt until it reaches EOF (end of file), and then switch over to stdin. This will keep you from having to re-type solutions.
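To make the input rules above concrete, here is a minimal C sketch of a reader that behaves the way this section describes: blank lines are skipped, and when the solutions file runs out the reader falls back to stdin. The names read_key_line and is_blank are made up for illustration; this is an assumption about how such a reader could be written, not the code in your box.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the line holds only whitespace. */
static int is_blank(const char *line) {
    return line[strspn(line, " \t\r\n")] == '\0';
}

/*
 * Hypothetical sketch of the documented behavior: fetch the next non-blank
 * line from *in. When *in reaches EOF and is not already stdin, close it
 * and continue reading from stdin. Returns buf, or NULL once stdin is
 * exhausted too.
 */
char *read_key_line(FILE **in, char *buf, size_t size) {
    for (;;) {
        if (fgets(buf, (int)size, *in) == NULL) {
            if (*in == stdin)
                return NULL;          /* nothing left to read anywhere */
            fclose(*in);              /* solutions file is used up */
            *in = stdin;              /* switch over to stdin */
            continue;
        }
        if (is_blank(buf))
            continue;                 /* blank input lines are ignored */
        buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline */
        return buf;
    }
}
```

A caller would open lock-solutions.txt with fopen, pass the address of that FILE pointer, and call read_key_line once per lock. Lines in the file are consumed first, and anything after that comes from the keyboard.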
Examining the Executable
- objdump -t will print out the enigma box’s symbol table. The symbol table includes the names of all functions and global variables in the box, the names of all the functions the box calls, and their addresses. You may learn something by looking at the function names!
- objdump -d will disassemble all of the code in the enigma box. You can also just look at individual functions. Reading the assembler code can tell you how the box and locks work.
  If you prefer to get Intel syntax disassembly from objdump, you can use objdump -M intel -d.
- strings is a utility which will display the printable strings in your enigma box.
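Why do these tools reveal anything? A toy C file shows the idea. Every name and string below is invented for illustration and is not taken from a real enigma box. In an unstripped binary, the names of non-static functions and global variables typically appear in the symbol table, and string literals are typically stored as plain bytes that strings can print.

```c
#include <string.h>

/* Hypothetical string literal: stored as readable bytes in the binary,
 * so a tool like strings would typically list it. */
static const char *const TOY_KEY = "open sesame";

/* Hypothetical global variable: its name shows up in the symbol table. */
int toy_buzz_count = 0;

/* Hypothetical non-static functions: their names show up in the symbol
 * table too, which makes them easy breakpoint targets. */
void toy_buzz(void) {
    toy_buzz_count++;
}

int toy_lock(const char *input) {
    if (strcmp(input, TOY_KEY) != 0) {
        toy_buzz();        /* wrong key: the buzzing function runs */
        return 0;
    }
    return 1;              /* right key: lock opens */
}
```

If you compiled this sketch and ran objdump -t on it, you would expect to see toy_buzz, toy_lock, and toy_buzz_count listed. Your real box is fiendish enough that its keys will not all be sitting around as obvious literals, but the function names are still a useful map.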
Using LLDB
- Run the enigma box program box from a debugger like lldb instead of running it directly. The debugger will allow you to stop the box before it buzzes due to a bad lock solution.
  For example, running
      lldb box
      (lldb) b methodName
      (lldb) run
      (lldb) kill
  will start lldb, set a breakpoint at methodName, and run the code. The code will halt before it runs methodName; calling kill will stop the enigma box and exit the current debugging session without methodName running.
- Walk through code using one of
  - nexti goes one assembly instruction at a time, skipping over function calls
  - stepi goes one assembly instruction at a time, entering function calls
  For example:
      lldb box
      (lldb) b lineNumberForLock1Call
      (lldb) run
  input test passphrase here
      (lldb) register read
      (lldb) frame variable
  Generally some parameters are local variables, some are stored in registers, and others are on the stack; if none are on the stack, frame variable prints nothing. Strings are stored as pointers, so you’ll need to “examine” what they point to. Try looking at several as if they are strings:
      (lldb) x/s anAddressDisplayedByRegisterReadOrFrameVariable
  You can also look at the assembly directly
      (lldb) disas
  And walk through it instruction by instruction
      (lldb) nexti
  Keep using nexti until you see a call to the strings_not_equal method (a suspicious name that might be checking your passphrase)
      (lldb) register read
      (lldb) frame variable
  Which one holds your passphrase? Try “examining” that and others…
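To see why the registers are worth examining right before a call, here is a hedged C sketch of what a function with a name like strings_not_equal might do. The name strings_not_equal_sketch and the body are assumptions for illustration; the function in your box may be written differently. What is certain is the System V x86-64 calling convention: the first pointer argument arrives in %rdi, the second in %rsi, and an int result comes back in %eax.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a function named like strings_not_equal.
 * With the System V x86-64 calling convention, `a` is passed in %rdi,
 * `b` is passed in %rsi, and the return value is placed in %eax.
 */
int strings_not_equal_sketch(const char *a, const char *b) {
    size_t i = 0;

    /* Advance while both strings agree and `a` has not ended. */
    while (a[i] != '\0' && a[i] == b[i])
        i++;

    /* 0 when the strings are identical, 1 when they differ. */
    return a[i] != b[i];
}
```

If the real function takes its arguments the same way, then stopping on the call instruction and running x/s $rdi and x/s $rsi would show the two strings being compared.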
- Some useful lldb commands:
  (lldb) frame variable
      prints out the name and value of local variables in scope at your current place in the code, if any.
  (lldb) register read
      prints the values of all registers except floating-point and vector registers
  (lldb) x/20bx 0x...
      examine the values of the 20 bytes of memory stored at the specified memory address (0x…). Displays them as hexadecimal bytes.
  (lldb) x/20bd 0x...
      examine the values of the 20 bytes of memory stored at the specified memory address (0x…). Displays them as decimal bytes.
  (lldb) x/gx 0x...
      examine the value of the 8-byte integer stored at the specified memory address.
  (lldb) x/s 0x...
      examines the value stored at the specified memory address. Displays the value as a string.
  (lldb) x/s $someRegister
      examines the value at the address held in register someRegister. Displays the value as a string (assuming the register contains a pointer).
      Note it is x/s $rdi in lldb, not %rdi like it would be in assembly.
  (lldb) print expr
      evaluates and prints the value of the given expression
  (lldb) call (void) puts (0x...)
      calls the built-in output method puts with the given char * (as a memory address). See man puts for more.
  (lldb) disas methodName
      get the machine instruction translation of the method methodName.
  (lldb) disas
      get the machine instruction translation of the currently executing method.
  (lldb) x/6i 0x...
      try to disassemble 6 instructions in memory starting at the memory address 0x…
  (lldb) b *0x...
      set a breakpoint at the specified memory address (0x…).
  (lldb) b function_name
      set a breakpoint at the beginning of the specified function.
  (lldb) nexti
      step forward by one instruction, skipping any called function.
  (lldb) stepi
      step forward by one instruction, entering any called function.
  (lldb) kill
      terminate the program immediately
  (lldb) help
      brings up lldb’s built-in help menu
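The x/…bx, x/gx, and x/s forms all look at the same bytes; they differ only in how those bytes are interpreted. The C sketch below mimics the three views on a small buffer. The function names format_bytes_hex and read_u64 are made up for illustration, the output format only roughly resembles what lldb prints, and the 8-byte result assumes a little-endian machine such as x86-64.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Roughly the x/<n>bx view: one hexadecimal value per byte, in address
 * order. Writes a space-separated list into out. */
void format_bytes_hex(const unsigned char *p, size_t n,
                      char *out, size_t out_size) {
    size_t used = 0;

    if (out_size > 0)
        out[0] = '\0';
    /* Each entry needs at most 5 characters plus the terminating NUL. */
    for (size_t i = 0; i < n && used + 6 <= out_size; i++) {
        const char *sep = (i == 0) ? "" : " ";
        used += (size_t)snprintf(out + used, out_size - used,
                                 "%s0x%02x", sep, (unsigned)p[i]);
    }
}

/* Roughly the x/gx view: the same memory read as one 8-byte integer.
 * memcpy sidesteps alignment and aliasing problems. */
uint64_t read_u64(const unsigned char *p) {
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

For a buffer holding the bytes 'k', 'e', 'y', '!' followed by four zero bytes, the byte view lists 0x6b first, while the 8-byte view on a little-endian machine puts that same byte last (0x2179656b). Viewed as a string, which is the x/s view, the buffer reads key!. Keeping this in mind helps when a value you expect seems to appear "backwards" in memory.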
On interpreting the disassembly
- Reviewing the x86-64 calling convention may be helpful.
- The C standard library function sscanf is called __isoc99_sscanf in the executable. Try man sscanf for more information about this library function.
- %fs:0x1234 refers to a value in a “thread-local storage” region at offset 0x1234. The enigma box only has one thread (using multiple threads would allow the box to do multiple things at once, but solving two locks simultaneously seems a little too fiendish for Dr. Nigma), so this is effectively a region for extra global variables. In the enigma box, this appears mostly to implement stack canaries, a security feature designed to cause out-of-bounds accesses to arrays on the stack to more consistently trigger a crash.
- Pay attention to the names of functions being called.
- Disassembling a standard library function instead of reading the documentation for the function is probably a waste of time.
- Some of the things later locks in your enigma box might be using include:
  - calls to scanf (which is a formatted read; try man scanf or Wikipedia for more)
  - linked data structure traversal
  - recursion
  - string literals
  - switch statements
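As a small illustration of two items from the list above, here is a hedged C sketch that combines a formatted read with a switch statement. The function toy_lock_check and its magic numbers are invented for this example and have nothing to do with the locks in your box.

```c
#include <stdio.h>

/*
 * Hypothetical lock check: expects two integers in the input line and
 * uses the first one to choose which value the second must match.
 */
int toy_lock_check(const char *input) {
    int selector, value;

    /* sscanf returns the number of conversions that succeeded, so 2
     * means both %d fields were filled in. */
    if (sscanf(input, "%d %d", &selector, &value) != 2)
        return 0;

    switch (selector) {
    case 0:  return value == 7;
    case 1:  return value == 42;
    case 2:  return value == -3;
    default: return 0;       /* any other selector fails */
    }
}
```

In disassembly, code like this would show the format string being passed as the second argument (in %rsi) to __isoc99_sscanf, followed by a comparison of the return value in %eax. Examining that format string with x/s tells you what kind of input is expected. A switch may be compiled into a chain of compares or into an indirect jump through a table of addresses, depending on the compiler and the cases.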