Saki's Romhacking Shenanigans Mk.II

Saki's From Zero to Hero Romhacking Guide Part 2: Debugging & Reversing   

Debuggers

Whether your emulator or other software ships with debugging features by default, or a third-party tool takes care of it, debuggers all work in a pretty similar fashion: you run a program, either instruction by instruction or "normally"/in real time, and you get to see how said program's memory, registers, and PC (Program Counter) evolve. It's called debugging because stepping through a program one instruction at a time, watching how its memory and registers change, and catching exceptions along with when/where/why/how they happen, is how developers usually find and fix bugs.

Saki's Usual Debuggers

There are many debugging tools out there, but here is a small breakdown of the ones I use, and what for:

All of those tools share common concepts and terms, which I will introduce below. Note: this terminology assumes you have either read the previous blog post or know how a program is organized and what an instruction is.

Step & Run

When debugging a program, you start from one particular instruction. Usually, that's the entry point of your program, so your main function. Unlike normal program execution, the debugger does not automatically execute all of the instructions in the main function one after another. Instead, you do so manually by stepping from the current instruction to the next. The program's registers and memory are updated (and usually highlighted) after every step.

Here, we use x32dbg. The images show the state of an x86 program's registers before and after stepping, with every updated register being highlighted in red.

There are three types of steps. When you Step Into an instruction, you execute the current instruction and stop at the next one, exactly as the program normally would. If that instruction is a call or a jump, you follow it, so you end up inside the called function. When you Step Over an instruction, the call still happens, but the debugger runs the whole called function in one go and only stops again at the instruction right after the call. When you Step Out of a function, you run all of the remaining instructions of the function you're in until you leave it (mainly useful when we're inside of a sub-function and want to return to the main one).
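To make the difference between the three steps a bit more concrete, here is a minimal sketch of a debugger for a toy machine. Everything in it (the `ToyDebugger` class, the `PUSH`/`ADD`/`CALL`/`RET`/`HALT` instruction set, the sample program) is made up for illustration, it is not how x32dbg or any real debugger is implemented.

```python
# Toy program: each instruction is (opcode, operand). Hypothetical instruction set.
PROGRAM = [
    ("PUSH", 2),     # 0  main starts here
    ("CALL", 4),     # 1  call the sub-function at address 4
    ("PUSH", 10),    # 2
    ("HALT", None),  # 3
    ("PUSH", 3),     # 4  sub-function starts here
    ("ADD", None),   # 5
    ("RET", None),   # 6
]


class ToyDebugger:
    def __init__(self, program):
        self.program = program
        self.pc = 0          # program counter
        self.stack = []      # data stack
        self.calls = []      # return addresses of the functions we are inside of
        self.halted = False

    def step_into(self):
        """Execute exactly one instruction, following calls."""
        op, arg = self.program[self.pc]
        self.pc += 1
        if op == "PUSH":
            self.stack.append(arg)
        elif op == "ADD":
            b, a = self.stack.pop(), self.stack.pop()
            self.stack.append(a + b)
        elif op == "CALL":
            self.calls.append(self.pc)   # remember where to come back to
            self.pc = arg
        elif op == "RET":
            self.pc = self.calls.pop()
        elif op == "HALT":
            self.halted = True
            self.pc -= 1                 # stay on the HALT instruction

    def step_over(self):
        """Execute one instruction; if it is a call, run the whole callee."""
        depth = len(self.calls)
        self.step_into()
        while len(self.calls) > depth and not self.halted:
            self.step_into()

    def step_out(self):
        """Run until the current function has returned to its caller."""
        depth = len(self.calls)
        while depth > 0 and len(self.calls) >= depth and not self.halted:
            self.step_into()


dbg = ToyDebugger(PROGRAM)
dbg.step_into()   # executes PUSH 2, now sitting on the CALL
dbg.step_over()   # runs the whole sub-function, stops right after the CALL
print(dbg.pc, dbg.stack)
```

Had we used `step_into()` on the `CALL` instead, we would have landed on address 4, inside the sub-function, and a `step_out()` from there would have brought us back to address 2.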

Another useful command to go through a program's instructions is the Run command. It executes every instruction we encounter until the program ends... or until we reach a breakpoint. But what is a breakpoint?

Breakpoints

A breakpoint is a flag set on a specific instruction which tells the debugger "you will stop running if you reach this instruction". It's useful when we want to inspect how the program behaves at a specific point in the code, or when there was a crash and we want to understand where and how it happened.
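As a rough sketch of the idea, here is what "run until the program ends or a breakpoint is hit" can look like on a toy program. The `run` function and the one-instruction "add a value to a register" format are assumptions of mine, only there to keep the example tiny.

```python
def run(program, registers, pc, breakpoints):
    """Execute instructions from pc until the program ends or a breakpoint is hit.

    Returns the address where execution stopped.
    """
    first = True
    while pc < len(program):
        # Stop *before* executing a flagged instruction. The check is skipped
        # for the very first instruction so we can resume from a breakpoint.
        if pc in breakpoints and not first:
            return pc
        first = False
        reg, amount = program[pc]      # toy instruction: registers[reg] += amount
        registers[reg] += amount
        pc += 1
    return pc


program = [("a", 1), ("a", 2), ("b", 5), ("a", 4)]
registers = {"a": 0, "b": 0}

stopped_at = run(program, registers, 0, breakpoints={2})
print(stopped_at, registers)   # stopped on instruction 2, which has not run yet

stopped_at = run(program, registers, stopped_at, breakpoints={2})
print(stopped_at, registers)   # ran to the end of the program
```

The important detail is that the debugger stops before the flagged instruction executes, which is what lets you inspect the state the instruction is about to work with.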

There are three types of breakpoints. The Execution Breakpoint stops the debugger once the Program Counter is equal to a specific address. The Write Breakpoint does so when the program writes to a certain address, and the Read Breakpoint does it when the program reads from an address. Some debuggers let you add other conditions on top of the initial read/write, such as "on change" or "if greater than". For example, if you're working on a PS1 game and you know that the game stores its lives at address 0x8010345A, you can set a write breakpoint there and find out which exact function(s) update the value.

Give the PCSX-Redux devs a raise, seriously. (Here I've set a read breakpoint on the address 0x8004f9a4: the emulator will stop if any instruction reads the data at that address in RAM, but only under the condition "read value > 5".)
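Read and write breakpoints with conditions can be sketched as a memory wrapper that checks every access. This is only a toy model under my own assumptions: the `WatchedMemory` class is hypothetical, the PC values are invented, and a real emulator like PCSX-Redux would pause execution rather than just log the hit.

```python
class WatchedMemory:
    def __init__(self):
        self.cells = {}
        self.read_watch = {}    # address -> condition on the value being read
        self.write_watch = {}   # address -> condition on the value being written
        self.hits = []          # (kind, address, value, pc) for each triggered breakpoint

    def read(self, address, pc):
        value = self.cells.get(address, 0)
        condition = self.read_watch.get(address)
        if condition is not None and condition(value):
            self.hits.append(("read", address, value, pc))
        return value

    def write(self, address, value, pc):
        condition = self.write_watch.get(address)
        if condition is not None and condition(value):
            self.hits.append(("write", address, value, pc))
        self.cells[address] = value


LIVES = 0x8010345A     # the lives counter from the example above
OTHER = 0x8004F9A4     # the address from the screenshot caption

mem = WatchedMemory()
mem.write_watch[LIVES] = lambda value: True       # plain write breakpoint
mem.read_watch[OTHER] = lambda value: value > 5   # read breakpoint, "read value > 5"

mem.write(LIVES, 3, pc=0x80012340)   # made-up PC: triggers the write breakpoint
mem.write(OTHER, 4, pc=0x80012400)   # not watched for writes
mem.read(OTHER, pc=0x80012404)       # value is 4, condition fails, no hit
mem.write(OTHER, 9, pc=0x80012408)
mem.read(OTHER, pc=0x8001240C)       # value is 9, condition holds, hit

for kind, address, value, pc in mem.hits:
    print(f"{kind} breakpoint at {address:#010x}, value {value}, from PC {pc:#010x}")
```

Recording the PC with each hit is the whole point: it tells you which instruction touched the value, and from there which function to go look at.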

Believe it or not, with this little bit of knowledge you've already seen all there is to know about debuggers. How useful they are to you depends on the way you use them.
Note: in very rare cases, or when you're the one doing the compilation, a program can embed Debug Symbols in its binary data. This way, whenever a debugger or a disassembler is run, we are able to see the names of the functions and variables as they were in the original source code, rather than generic names. With GCC for example, you can add the -g flag when compiling a project so that the debug symbols are embedded into the executable.

The first screenshot shows GDB's output without debug symbols for a basic "Hello World!" program: it doesn't know what line we're at nor what the variable storing the message was originally called, unlike the second screenshot where the executable contains that information.
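If you're wondering what the symbols actually change, here is a very simplified sketch: a table mapping addresses to names, which the tool consults before printing an address. The `describe` function, the addresses, and the function names are all hypothetical, and real debug information carries much more (sizes, line numbers, types).

```python
import bisect


def describe(address, symbols=None):
    """Render an address the way a debugger might, with or without symbols."""
    if not symbols:
        return f"{address:#x}"              # no symbols: all we have is a number
    starts = sorted(symbols)
    i = bisect.bisect_right(starts, address) - 1
    if i < 0:
        return f"{address:#x}"              # address is before every known symbol
    base = starts[i]
    offset = address - base
    name = symbols[base]
    return name if offset == 0 else f"{name}+{offset:#x}"


# Hypothetical symbol table: start address of each function -> original name.
symbols = {0x401126: "main", 0x401150: "print_message"}

print(describe(0x40112A))            # 0x40112a
print(describe(0x40112A, symbols))   # main+0x4
print(describe(0x401150, symbols))   # print_message
```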

Disassemblers & Reverse Engineering

Disassemblers

If you don't know what a disassembler is, I highly recommend you read my previous post's introduction, but basically it's a program that tries to reconstruct a binary executable's assembly. It often goes a step further in the process and also attempts to decompile it (reconstructing some human-readable code in a target language). Unlike debuggers, disassemblers don't run the executable but analyse it statically instead, following the flow of assembly instructions to guess where functions and variables are located. A well-known example is objdump on Linux systems.
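Here is a minimal sketch of that "follow the flow to find functions" idea, on a made-up bytecode where every instruction is two bytes (opcode, operand). The encoding, the `MNEMONICS` table, and the `disassemble` function are assumptions for illustration only, real instruction sets and real disassemblers are far messier.

```python
# Hypothetical 2-byte encoding: (opcode, operand).
MNEMONICS = {0x01: "PUSH", 0x02: "ADD", 0x03: "CALL", 0x04: "RET", 0x05: "HALT"}


def disassemble(blob, entry=0):
    """Statically decode blob, following CALL targets to discover functions.

    Nothing is executed. Returns {function_start: [(address, text), ...]}.
    """
    functions = {}
    pending = [entry]                  # addresses we believe start a function
    while pending:
        start = pending.pop()
        if start in functions:
            continue
        listing = []
        addr = start
        while addr + 1 < len(blob):
            opcode, operand = blob[addr], blob[addr + 1]
            name = MNEMONICS.get(opcode, "???")
            if name in ("PUSH", "CALL"):
                text = f"{name} {operand:#04x}"
            else:
                text = name
            listing.append((addr, text))
            if name == "CALL":
                pending.append(operand)    # a call target is probably a function
            if name in ("RET", "HALT", "???"):
                break                      # end of this function, or we got lost
            addr += 2
        functions[start] = listing
    return functions


blob = bytes([
    0x01, 0x02,   # 0x00 PUSH 2
    0x03, 0x08,   # 0x02 CALL 0x08
    0x01, 0x0A,   # 0x04 PUSH 10
    0x05, 0x00,   # 0x06 HALT
    0x01, 0x03,   # 0x08 PUSH 3
    0x02, 0x00,   # 0x0a ADD
    0x04, 0x00,   # 0x0c RET
])

for start, listing in sorted(disassemble(blob).items()):
    print(f"function at {start:#04x}:")
    for addr, text in listing:
        print(f"  {addr:#04x}  {text}")
```

The second function is only found because the first one calls it. Code that is never referenced by anything the disassembler has decoded can stay invisible, which is one reason static analysis alone sometimes isn't enough.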

Running objdump on the executable gives us the disassembly of the main function. Note: by default it will disassemble all of the libraries embedded into the executable, but passing the option "--disassemble=main" fixes that.

As it turns out, there aren't many tools which offer disassembly/decompilation features without also allowing the user to debug the binary they analyze. When you reverse engineer a program, i.e. when you want to understand how it works without having the actual source code, you don't really want to have to choose between statically analyzing disassembled code and hammering your way through with a debugger. Both are valid ways of working out what an executable does, so why hold yourself back? As such, whether you use Ghidra, IDA Pro, Binary Ninja, or more language-specific tools like dnSpy, they all offer debugging and disassembly/decompilation as features at the same time.

Reverse Engineering

Here is what I generally do when I try to understand a program's logic with the tools I'm given:

Conclusion

There isn't much more to say about the general use of debuggers and disassemblers, it's honestly the kind of knowledge that mainly comes with practice. Do not be afraid of opening up one of your favorite (or not) pieces of software in a debugger and watching step by step what exactly goes on under the hood. Be even less afraid to decompile an executable with Ghidra and lose yourself in a senseless flow of functions and branches. Break stuff, fix stuff, patch stuff, and you will get somewhere eventually. My next blog post will be a demonstration of the process through the making of Magical Date: Doki Doki Kokuhaku Daisakusen's English Patch. Thank you for reading and until next time.
