Overview
In the previous lesson, we briefly discussed how, when debugging a game, you will primarily be dealing with assembly instructions. There are a large number of assembly instructions; however, for our purposes, we only need to understand a handful. This lesson introduces and explains some of the common assembly instructions you will run into when debugging a game.
Data Management
As discussed previously, data is commonly moved from memory into a register. Operations are then performed on this data before it is moved back into memory. This movement is commonly done using the mov instruction, which comes in multiple forms. In each form, the destination is written first and the source second. For example, the following instructions move the value in ecx into eax and move the value 1 into eax, respectively:
mov eax, ecx
mov eax, 1
In addition, the mov instruction can be used to move data from memory into a register. To do this, the memory address is written in square brackets, usually prefixed by a size specifier and segment such as dword ptr ds. The brackets tell the mov instruction to use the data stored at that memory address rather than the address itself, and dword indicates that the data is 4 bytes wide. The example below moves the value stored at 0x12345678 into eax:
mov eax, dword ptr ds:[0x12345678]
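The three forms of mov above can be pictured with a small Python model. This is only a sketch for building intuition: the registers dictionary, the memory dictionary, and the values stored in them are assumptions made up for illustration, not something a game or debugger exposes.

```python
# Toy model: registers are a dict, memory maps addresses to 32-bit values.
registers = {"eax": 0, "ecx": 7}          # hypothetical starting values
memory = {0x12345678: 3}                  # hypothetical contents of the address

# mov eax, ecx -> copy the value in ecx into eax
registers["eax"] = registers["ecx"]
after_register_mov = registers["eax"]

# mov eax, 1 -> place the constant 1 in eax
registers["eax"] = 1
after_constant_mov = registers["eax"]

# mov eax, dword ptr ds:[0x12345678] -> read the value stored at that address
registers["eax"] = memory[0x12345678]
after_memory_mov = registers["eax"]
```

In every case the source is left untouched; mov copies data rather than moving it.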
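When dealing with classes, games will often want to load an address instead of a value. A game will often store the address of an object in a register and then load an address based on it into another register so that it can retrieve certain variables inside the class. To do this, games will commonly use the lea (load effective address) instruction, which calculates the address inside the brackets without reading the memory at that address. The example below copies the address held in ecx into eax: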
lea eax, dword ptr ds:[ecx]
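The difference between lea and mov on the same bracketed operand can be sketched in Python. The memory layout, the object address 0x1000, and the field offset of 4 are all hypothetical values chosen for illustration; the offset form corresponds to an operand such as [ecx+4].

```python
# Hypothetical object at address 0x1000 with two 4-byte fields.
memory = {0x1000: 100, 0x1004: 25}
ecx = 0x1000  # ecx holds the address of the object

def mov_load(address):
    """Model of mov eax, dword ptr ds:[address]: reads the stored value."""
    return memory[address]

def lea_load(base, offset=0):
    """Model of lea eax, [base+offset]: computes the address, reads nothing."""
    return (base + offset) & 0xFFFFFFFF

address_of_object = lea_load(ecx)      # lea eax, dword ptr ds:[ecx]
address_of_field = lea_load(ecx, 4)    # address of the second field
value_of_field = mov_load(address_of_field)
```

Here lea produces 0x1000 and 0x1004, which are addresses, while mov on the second address produces the value 25 stored there.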
Changing Data
Once data is loaded into a register, games will use several instructions to modify the data. The inc and dec instructions can be used to increase or decrease the value in the register by one:
inc eax
dec eax
Similarly, the add and sub instructions can be used to add and subtract from a value in a register:
add eax, 2
add eax, ebx
sub eax, 4
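One detail worth keeping in mind is that eax only holds 32 bits, so these instructions wrap around instead of growing without limit. The sketch below models that in Python; the helper names add32 and sub32 and the starting value of 5 are assumptions for illustration only.

```python
MASK32 = 0xFFFFFFFF  # eax holds 32 bits, so results wrap around at 2**32

def add32(value, amount):
    """Model of add: value + amount, truncated to 32 bits."""
    return (value + amount) & MASK32

def sub32(value, amount):
    """Model of sub: value - amount, truncated to 32 bits."""
    return (value - amount) & MASK32

eax = 5                # hypothetical starting value
eax = add32(eax, 1)    # inc eax
eax = sub32(eax, 1)    # dec eax
eax = add32(eax, 2)    # add eax, 2
eax = sub32(eax, 4)    # sub eax, 4
wrapped = sub32(0, 1)  # decreasing 0 wraps around to 0xFFFFFFFF
```

This wraparound is why a value such as player lives can appear as a very large number in a debugger after dropping below zero.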
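Other operations, such as multiplying and dividing, are also supported through the mul and div instructions. Unlike add and sub, these instructions take a single operand, which must be a register or a memory location rather than a constant. The mul instruction multiplies the value stored in eax by the operand and stores the result across edx and eax, with the upper 32 bits in edx and the lower 32 bits in eax:
mul ebx
The div instruction works in reverse. It divides the 64-bit value held across edx and eax by the operand, storing the quotient in eax and the remainder in edx:
div ebx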
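The way mul and div use eax and edx together can be sketched in Python. On x86, mul and div take a register or memory operand rather than a constant, and the 64-bit value is split across edx (upper half) and eax (lower half). The function names mul32 and div32 are hypothetical helpers, and this is a simplified model of the unsigned forms only.

```python
MASK32 = 0xFFFFFFFF

def mul32(eax, operand):
    """Model of mul operand: returns (eax, edx) holding the 64-bit product."""
    product = eax * operand
    return product & MASK32, (product >> 32) & MASK32

def div32(edx, eax, operand):
    """Model of div operand: returns (eax, edx) as (quotient, remainder)."""
    if operand == 0:
        raise ZeroDivisionError("the CPU raises a divide error here")
    dividend = (edx << 32) | eax
    quotient, remainder = divmod(dividend, operand)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in eax")
    return quotient, remainder

small_product = mul32(6, 4)           # fits in eax, so edx is 0
large_product = mul32(0x80000000, 4)  # overflows into edx
division = div32(0, 10, 4)            # 10 / 4 -> quotient 2, remainder 2
```

A practical consequence is that edx changes whenever mul or div runs, even though edx never appears in the instruction.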
Often games will need to manipulate the individual bits of a value stored in a register. To do this, they will use several bitwise instructions. These include the shift left and shift right operations (shl and shr), and the and, or, and xor operations:
shl eax, 2
shr eax, 2
and eax, eax
or eax, ebx
xor eax, eax
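The effect of each bitwise instruction is easier to see on concrete bit patterns. The Python sketch below uses made-up values for eax and ebx and hypothetical helpers shl32 and shr32 that keep results within 32 bits; shift counts are assumed to be between 0 and 31.

```python
MASK32 = 0xFFFFFFFF

def shl32(value, count):
    """Model of shl: shift left, bits pushed past bit 31 are lost."""
    return (value << count) & MASK32

def shr32(value, count):
    """Model of shr: shift right, filling with zeros from the left."""
    return (value & MASK32) >> count

eax, ebx = 0b1100, 0b1010        # hypothetical values: 12 and 10

shifted_left = shl32(eax, 2)     # shl eax, 2 -> multiplies by 4
shifted_right = shr32(eax, 2)    # shr eax, 2 -> divides by 4
anded = eax & ebx                # and: bits set in both values
ored = eax | ebx                 # or: bits set in either value
zeroed = eax ^ eax               # xor eax, eax -> always 0
```

The last line shows why xor eax, eax appears so often in games: it is a common way to set a register to 0.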
Flow Control
Games are made up of hundreds of different functions. To navigate to these functions from the main loop, games will use a variety of flow control instructions. For example, the jump instruction (jmp) jumps to another section of code to start executing:
jmp 0x12345678
Games will also use the call instruction to execute a function:
call 0x12345678
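The differences between these instructions will be discussed more in future lessons. For now, the main difference is that a jmp instruction permanently changes where a game is executing, whereas a call instruction changes it temporarily. A call saves the address of the instruction that follows it, so that execution can return there once the function finishes.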
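To make the difference between jmp and call concrete, here is a minimal Python sketch of a made-up instruction pointer walking through a tiny program. The addresses, the one-slot-per-instruction layout, and the nop and hlt placeholders are all simplifying assumptions; real instructions vary in length.

```python
def run(program, start):
    """Walk a toy program and return the addresses executed, in order."""
    stack = []      # holds return addresses saved by call
    visited = []
    eip = start
    while True:
        op, target = program[eip]
        visited.append(eip)
        if op == "jmp":
            eip = target             # nothing is saved, so there is no way back
        elif op == "call":
            stack.append(eip + 1)    # remember the instruction after the call
            eip = target
        elif op == "ret":
            eip = stack.pop()        # resume right after the call
        elif op == "nop":
            eip += 1
        else:                        # "hlt" stops the toy program
            return visited

program = {
    0: ("call", 10),
    1: ("jmp", 20),
    2: ("nop", None),    # never reached: the jmp does not come back
    10: ("nop", None),
    11: ("ret", None),
    20: ("hlt", None),
}
trace = run(program, 0)
```

The trace visits 0, 10, 11, 1, and 20. Execution returns to address 1 after the call, but never returns to address 2 after the jmp.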
Games will often need to compare the value of a register to another value to determine if an action needs to take place. For example, if a player reaches 0 lives, the game will want to present a “Game Over” screen to the player. To do this, games will use the compare (cmp) and test instructions:
cmp eax, 2
test eax, eax
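The test instruction performs a bitwise and on its two operands and only keeps the resulting flags. Testing a register against itself, such as in the example above, therefore has the effect of comparing the register to the value 0, since the result is 0 only when the register itself is 0.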
The cmp and test instructions set several special flags on the CPU. These flags can then be used in combination with conditional jump instructions to navigate to sections of code. For example, the jump if zero (jz) instruction will jump to a section of code if the zero flag is set, which happens when the compared values equal each other. The example below will jump to 0x12345678 if ebx and eax equal each other:
cmp eax, ebx
jz 0x12345678
The most common conditional jumps are jump if zero (jz), jump if not zero (jnz), jump if equal (je), and jump if not equal (jne). The names jz and je refer to the same instruction, as do jnz and jne, so a debugger may display either name for the same code.
The Stack
In addition to registers, programs can store information in a place known as the stack. Information is stored (or pushed) on the stack through the use of the push instruction. Information is retrieved (or popped) from the stack through the pop instruction, which always returns the most recently pushed value. The example below pushes the value of 5 on the stack and then pops that value into the register eax:
push 5
pop eax
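Under the hood, the stack is ordinary memory tracked by the esp register, and on 32-bit x86 it grows toward lower addresses. The Python class below is a rough sketch of that behavior; the class name and the starting esp of 0x1000 are assumptions for illustration.

```python
class Stack:
    """Toy model of the 32-bit stack: esp points at the most recent value."""

    def __init__(self, esp=0x1000):   # hypothetical starting address
        self.esp = esp
        self.memory = {}

    def push(self, value):
        self.esp -= 4                 # make room: the stack grows downward
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory[self.esp]
        self.esp += 4                 # release the slot
        return value

stack = Stack()
stack.push(5)                 # push 5
esp_after_push = stack.esp
eax = stack.pop()             # pop eax
esp_after_pop = stack.esp
```

After the push, esp has moved down by 4 bytes; after the pop, eax holds 5 and esp is back where it started.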
Functions often take several inputs or parameters. To achieve this in assembly, games will often push several values on the stack before a call, like so:
push 5
push eax
call 0x12345678
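The sketch below shows how the called function might find those values. It assumes a calling convention such as cdecl, where parameters are pushed in reverse order so that the last value pushed becomes the first parameter; other conventions exist, and the text above does not say which one a given game uses. The function name and the placeholder return address are hypothetical.

```python
def call_with_stack_arguments(eax):
    """Model of: push 5 / push eax / call. Returns what the callee sees."""
    stack = []                       # the end of the list is the top of the stack
    stack.append(5)                  # push 5
    stack.append(eax)                # push eax
    stack.append("return address")   # call pushes where to resume afterward

    # Inside the function, before its prologue runs:
    first_parameter = stack[-2]      # at [esp+4], just below the return address
    second_parameter = stack[-3]     # at [esp+8]
    return first_parameter, second_parameter

parameters = call_with_stack_arguments(9)   # hypothetical value of eax
```

Under this assumption, the assembly above corresponds to a source-level call with eax as the first parameter and 5 as the second.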
Most functions will start and end with the following instructions:
push ebp
mov ebp, esp
sub esp, ...
...
leave
ret
The first three instructions are known as the function prologue and set up the stack frame for the function. Each function will have its own stack frame. The last two instructions, known as the function epilogue, restore the previous function’s stack frame and return to the function that called the current function.
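The effect of these instructions on esp and ebp can be traced step by step. The Python sketch below does so with made-up starting values and an assumed 8 bytes of local variables, since the amount subtracted from esp differs per function. The leave instruction is modeled as mov esp, ebp followed by pop ebp; the final ret is not modeled.

```python
def simulate_frame(esp, ebp, local_bytes):
    """Trace esp and ebp through the prologue and the leave instruction."""
    memory = {}

    esp -= 4                  # push ebp: save the caller's frame pointer
    memory[esp] = ebp
    ebp = esp                 # mov ebp, esp: start this function's frame
    esp -= local_bytes        # sub esp, ...: reserve room for local variables
    inside_function = (esp, ebp)

    esp = ebp                 # leave, part 1: mov esp, ebp
    ebp = memory[esp]         # leave, part 2: pop ebp
    esp += 4
    after_leave = (esp, ebp)

    return inside_function, after_leave

inside_function, after_leave = simulate_frame(0x1000, 0x2000, 8)
```

After leave, both registers hold exactly the values they had before the prologue, which is what lets the calling function carry on as if nothing had changed.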
An Example
The example below contains several instructions combined together. In this example, 0x12345678 represents the memory address holding the current player’s number of lives and 0x45923821 is a function that prints text to the screen:
mov eax, dword ptr ds:[0x12345678]
sub eax, 1
mov dword ptr ds:[0x12345678], eax
test eax, eax
jz 0x22222222
jmp 0x22222200
0x22222222: mov ecx, ebx
push 5
push ecx
call 0x45923821
Looking through the instructions, the game first loads the current player’s number of lives into eax. It then decreases this value by 1 and moves the result back into the memory address holding the number of lives. Next, it checks whether the player has 0 lives. If not, it jumps back to some code above this section. If so, it jumps to a section of code below that calls a function with two parameters pushed onto the stack. From these instructions, we can guess that this example is responsible for checking if the player has lost all their lives and needs to be presented with a game over screen.
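One way to check a reading like this is to rewrite the instructions as higher-level code. The Python function below is a guess at the logic, not the game's actual source: the function name, the returned strings, and the memory dictionary are assumptions, and the call to the text-printing function is reduced to a return value.

```python
LIVES_ADDRESS = 0x12345678

def lose_life(memory):
    """High-level reading of the assembly example above."""
    eax = memory[LIVES_ADDRESS]       # mov eax, dword ptr ds:[0x12345678]
    eax = (eax - 1) & 0xFFFFFFFF      # sub eax, 1
    memory[LIVES_ADDRESS] = eax       # mov dword ptr ds:[0x12345678], eax
    if eax == 0:                      # test eax, eax / jz 0x22222222
        return "game over"            # push 5 / push ecx / call 0x45923821
    return "continue"                 # jmp 0x22222200

memory = {LIVES_ADDRESS: 2}           # hypothetical: the player has 2 lives
first_result = lose_life(memory)
second_result = lose_life(memory)
```

With two lives, the first pass continues the game and the second pass reaches the game over branch, which matches the behavior we guessed from the assembly.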