So, I couldn’t think of anything cool and advanced (that isn’t covered by my NDA) to talk about on such short notice, so I figured I’d start with something easy. My apologies to all the veterans on the list, since it’s basic stuff you already know.
We all know that when it comes to programming, there aren’t many language types as fun and exciting as assembly. Unfortunately, in this crazy world of power lunches and tight deadlines, we don’t really get as many chances to write in assembly as we’d like to. However, being able to read and understand that alien language in your debugger’s disassembly tab is something that every programmer needs to be able to do. It’s essential for debugging crashes that occur in release builds only, diagnosing optimization-related compiler bugs, and better understanding what the compiler is thinking so you can make more informed optimization decisions.
Since all non-handheld gaming platforms are based around PowerPC, I’ll be focusing on that. Maybe I’ll update this to include ARM/NEON or MIPS someday. VFPU would be awesome but I’m not sure if that’s supposed to be secret (can anyone verify?)
Basic Calling Convention
The first thing you can do is familiarize yourself with the PowerPC calling convention and ABI. If you know the calling convention and some basic instructions, you can extract almost any information you need.
Let’s start with something simple that comes up often. You step into a function and want to see the arguments that were passed in. Mousing over the variable either gives you something like 0xFFFFFFFF or no value at all. What can you do? Well, let’s see what the generated code looks like for an update function:
void Doooooosh( Bag* bag, DooooooshLevel level)
bl __savegprlr + 0034h (82eea8c4h)
There are a few things you need to know about the PowerPC calling convention.
1) for non-member functions or static member functions, small non-float arguments ( int, bool, pointers, etc… ) are passed in r3 through r10
2) for C++ member functions, r3 is always the this pointer, and function arguments are passed in r4 through r10
3) more often than not, float arguments are passed in the floating point registers ( fr1, for example )
So, knowing this is a C style standalone function, all you have to do is set a breakpoint early in the function and look at r3 and r4. Later on you’ll see why it has to be early. To get the real values, all we have to do is open up a watch and cast each register to its expected type:
( Bag * )r3
( DooooooshLevel )r4
When working with a C++ member function we’d use r4 and r5 instead, and we could also get the this pointer using:
( SomeClassName * )r3
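If it helps to see the register rules written down as code, here is a rough sketch. IntArgRegister is a made-up helper (not part of any SDK or debugger), and it assumes every argument is a small non-float value, so it ignores floats and anything too big for a register.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: name of the general purpose register that carries the
// Nth small non-float argument (0-based), following the rules listed above.
// Returns an empty string when the argument would land past r10.
std::string IntArgRegister(int argIndex, bool isMemberFunction)
{
    assert(argIndex >= 0);
    // r3 holds `this` for member functions, so explicit arguments start at r4.
    const int firstReg = isMemberFunction ? 4 : 3;
    const int reg = firstReg + argIndex;
    if (reg > 10)
        return std::string(); // does not fit in r3..r10
    return "r" + std::to_string(reg);
}
```

So IntArgRegister( 1, false ) gives "r4" for level in Doooooosh( ), while the same argument position in a member function would be "r5".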
As a side note, if you’re wondering how the proper values end up in the right registers to make a function call, it’s set up like this:
Doooooosh( bag, level );
bl Doooooosh (8293d0e8h)
mr is the “move register” instruction. In the above example, mr r4, r30 will take the contents of r30 and copy it to r4. We must assume that r31 and r30 contain the bag pointer and level respectively. Since all C function calls expect their arguments to be in r3 and up, we have to copy all our arguments to those registers. bl stands for “branch and link”: it saves the return address in the link register and jumps to the target, which is how functions normally get called.
Now there is a catch. Remember when I said we have to look at these registers early in the function? At the very beginning of Doooooosh( ) we can assume that the bag pointer and level will be in r3 and r4 respectively. That’s just how function arguments are passed in. But what if Doooooosh( ) calls another function? Won’t that called function also need its argument in r3? The point is that just because your function arguments are originally in r3 and r4 doesn’t mean you can expect them to stay there for long. Taking a look back at the original example, you’ll see two mr instructions near the top of the function that copy r3 into r31 and r4 into r30.
Basically, this is the code saying “I understand that r3 is probably going to get overwritten very soon so I’m going to back up its value in r31”. Any time after these two register moves are executed, we can now get the function arguments ( more safely ) like this:
( Bag * )r31
( DooooooshLevel )r30
Remember, on the PS3 and Xenon, r3 through r10 are considered volatile and r14 through r31 are considered general use non-volatile. Non-volatile means that if you stick a value in r30 and then make a function call, when that function call returns r30 will be just as you left it. That is why at the beginning of Doooooosh( ) we save all the argument registers ( r3 and r4 ) into safe non-volatile registers ( r31 and r30 ).
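To make the volatile vs. non-volatile idea concrete, here is a toy model in C++. Everything in it is an assumption for illustration: Gprs is just an array standing in for r0 to r31, and CalleeClobbersVolatiles is a made-up worst-case callee that trashes every volatile register it is allowed to.

```cpp
#include <array>
#include <cstdint>

using Gprs = std::array<std::uint32_t, 32>; // toy model of r0..r31

// Hypothetical stand-in for any function reached through `bl`: it is free
// to trash the volatile registers r3..r10 and must leave r14..r31 intact.
void CalleeClobbersVolatiles(Gprs& r)
{
    for (int i = 3; i <= 10; ++i)
        r[i] = 0xDEADBEEFu;
}

// Models the start of Doooooosh( ): back up the arguments, then call out.
void DoooooshModel(Gprs& r)
{
    r[31] = r[3];               // mr r31, r3  ( bag )
    r[30] = r[4];               // mr r30, r4  ( level )
    CalleeClobbersVolatiles(r); // bl to some other function
}
```

After DoooooshModel( ) runs, r[3] and r[4] hold garbage but r[31] and r[30] still hold the original arguments, which is exactly why the watch expressions switch to r31 and r30 once you are past the prologue.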
Some More Debugging Tips
Don’t be afraid to go back in the call stack if the info you need can’t be found by the above method. For example, I wanted to examine a string that only existed in a function further up the call stack. The solution was to go up one call in the call stack and look for the bl function call. A few lines above that, we were copying the function argument from r30 to r4 ( like we always do for function arguments ). I moused over r30, cast it to a char *, and that gave me the string. Remember that this usually only works for non-volatile registers r14 to r31. This is because those registers are “spilled”, or copied, into the stack frame, and Visual Studio and SN debugger are usually able to look in the stack frame to retrieve the saved register values.
Getting local variables stored in registers can be a little tricky. While I don’t think there is any one way that works 100% of the time, there are a couple of tricks you can use that may help you through. I’m sure with a little imagination, you’ll figure it out.
1) If the local variable is passed as the first argument of a function, look for it in the corresponding register ( r3 for a C function or r4 for a C++ member function ) right before the function call ( bl ). If you need to catch it a little earlier, start at the function call and work backwards. If you know that the value in r31 is moved into r3 right before the function call, then work your way up the code and see where r31 is being set. The lesson is don’t be afraid to work backwards.
2) look for landmarks. Often, the generated assembly won’t match the code very well. Sometimes in mixed view, you’ll have what looks like 10 lines of perfectly good C++ code that seem to have no assembly code generated. That’s when landmarks come in handy. If you have something like this
x = sqrt( y );
manually scan through the function and look for some assembly opcode that looks like it could correspond to a floating point square root. From there, you can see what the code does with the result and better trace through the assembly. Some other good landmarks include incrementing, trig functions, floating point multiplies, loop conditions, NULL checks, and any other function that would have some stand out opcodes.
3) look for constant initializers. If you have something like this
int x = 123;
and you see some assembly in the function that looks like
li r30, 123
You may have found a hint that r30 corresponds to x at this point in time. By the way, in case you didn’t already know, li stands for “load immediate” and it loads an immediate value into a register. Note that you can only load signed 16 bit constants this way. 32 bit constants usually take two instructions: lis ( “load immediate shifted” ) puts the upper 16 bits in place, then ori or addi fills in the lower 16.
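When you are hunting for a bigger constant, it helps to know what the two halves will look like in the disassembly. Here is a small sketch, assuming the compiler uses the common lis / ori pair ( it may pick something else ). The names LisOri, FitsInLi, SplitForLisOri and ExecuteLisOri are all made up for this example.

```cpp
#include <cstdint>

// The two immediates you would expect to see in the disassembly.
struct LisOri
{
    std::uint16_t upper; // operand of lis
    std::uint16_t lower; // operand of ori
};

// li takes a signed 16 bit immediate, so only this range fits in one instruction.
bool FitsInLi(std::int32_t value)
{
    return value >= -32768 && value <= 32767;
}

// Split a 32 bit constant into the halves used by lis rX, upper / ori rX, rX, lower.
LisOri SplitForLisOri(std::uint32_t value)
{
    return { static_cast<std::uint16_t>(value >> 16),
             static_cast<std::uint16_t>(value & 0xFFFFu) };
}

// What the low 32 bits of the register hold after the pair executes.
std::uint32_t ExecuteLisOri(LisOri parts)
{
    std::uint32_t reg = static_cast<std::uint32_t>(parts.upper) << 16; // lis
    reg |= parts.lower;                                                // ori
    return reg;
}
```

For 0x12345678 that means scanning for a lis with 0x1234 followed shortly by an ori with 0x5678 on the same register.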
4) if the local variable is used in a conditional, see what is being compared. Compares look something like this
if( player_controller < 8 )
bge cr6, CPlusPlusSucks::AndSoDoesThisFunc + 0064h (8283457ch)
Most compare instructions begin with cmp. Here you are comparing r3 with 8 and setting some result flags in cr6. bge means “branch if greater than or equal”. It checks the cr6 result flags that were set by the compare and then branches if appropriate. The point is that we know for sure that at this point r3 is player_controller. If needed we can work our way backwards and look for useful information.
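One thing that trips people up is that the branch looks backwards: the C++ says “less than” but the assembly says bge. That’s because the branch is there to skip the body of the if. Here is a toy model of that, assuming a signed compare ( cmpwi ). The real code may use an unsigned compare instead, and CrField, Cmpwi and BgeTaken are names I made up for the sketch.

```cpp
#include <cstdint>

// Three of the bits in a condition register field such as cr6.
struct CrField
{
    bool lt;
    bool gt;
    bool eq;
};

// Models cmpwi crN, rA, imm: a signed compare of a register with an immediate.
CrField Cmpwi(std::int32_t ra, std::int16_t imm)
{
    CrField cr;
    cr.lt = ra < imm;
    cr.gt = ra > imm;
    cr.eq = ra == imm;
    return cr;
}

// Models bge crN, target: taken when the "less than" bit is clear.
// For if( x < 8 ), taking the branch means jumping over the body of the if.
bool BgeTaken(const CrField& cr)
{
    return !cr.lt;
}
```

With player_controller equal to 3, BgeTaken( Cmpwi( 3, 8 ) ) is false, so execution falls through into the body of the if. With 8 or more the branch is taken and the body is skipped.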
Stack Frame: When All Else Fails
The above diagram is what the stack frame could look like on Xenon. If there is some weird bug you have to track down and all else fails, including good old fashioned logical thinking about the problem at a higher level, you can draw out one of these stack diagrams and extract more information than you could get using some of the previous techniques.
PPC updates its stack all at once at the beginning of the function, unlike LoseTel which seems to do it as you pop and push. The code will look something like this
stwu r1, -96(r1)
Obviously r1 is the stack pointer, and stwu is a clever way of telling people to shut up. Errr… I mean it’s “store word with update.” In one instruction it stores the old r1 at the computed address ( r1 - 96 here ) and then updates r1 to that address. The update direction is negative because the stack grows towards low memory. Since the caller’s SP ends up saved exactly at the top of the new stack frame, this is exactly what we want.
This can get you a few things. First, it enables you to get a call stack in some cases where the debugger goes nuts. It allows you to get the value of params that are too big or too numerous to pass in registers. It also leads to your religious coworkers calling you a witch and trying to burn you for your black magic.
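Here is a toy model of what stwu does to r1 and how the saved stack pointers chain together, which is the trick behind rebuilding a call stack by hand. It is only a sketch: Machine, StwuR1 and WalkBackChain are made-up names, memory is a map instead of real RAM, and a real frame holds a lot more than the saved SP.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

struct Machine
{
    std::uint32_t r1 = 0;                       // stack pointer
    std::map<std::uint32_t, std::uint32_t> mem; // toy memory: address -> word
};

// Models stwu r1, -frameSize(r1): store the old r1 at ( r1 - frameSize ),
// then set r1 to that same address.
void StwuR1(Machine& m, std::uint32_t frameSize)
{
    const std::uint32_t ea = m.r1 - frameSize;
    m.mem[ea] = m.r1; // caller's SP lands at the top of the new frame
    m.r1 = ea;
}

// Follows the saved stack pointers from the current frame outwards.
// Stops when no saved SP is found or maxFrames is reached.
std::vector<std::uint32_t> WalkBackChain(const Machine& m, std::size_t maxFrames)
{
    std::vector<std::uint32_t> frames;
    std::uint32_t sp = m.r1;
    while (frames.size() < maxFrames)
    {
        frames.push_back(sp);
        const auto it = m.mem.find(sp);
        if (it == m.mem.end())
            break; // nothing saved here, so this is the outermost frame in the model
        sp = it->second;
    }
    return frames;
}
```

In the debugger the equivalent is to read the word at r1 in a memory window, then read the word at that address, and so on, one frame per hop.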
Here is a quick way to decipher instructions you may not know:
if it starts with “L”, it’s probably a load
if it starts with “S”, it’s probably a store ( instructions starting with “sl” and “sr” are bitshifting operations )
if it starts with “F”, it’s probably a floating point math instruction
if it starts with a “B” it’s a branch.
if it has an “i” at the end of it, it’s probably taking input from an immediate rather than a register.
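Just for fun, those rules of thumb fit in a few lines of C++. GuessMnemonic is a made-up function, and it is exactly as dumb as the rules themselves: it will happily call subf ( subtract from ) a store, so treat the answer as a first guess and nothing more.

```cpp
#include <cctype>
#include <string>

// Hypothetical first-guess classifier based only on the spelling of a mnemonic.
std::string GuessMnemonic(const std::string& mnemonic)
{
    if (mnemonic.empty())
        return "unknown";

    // Lower-case the mnemonic so "LWZ" and "lwz" are treated the same.
    std::string m;
    for (unsigned char c : mnemonic)
        m += static_cast<char>(std::tolower(c));

    std::string kind;
    if (m.rfind("sl", 0) == 0 || m.rfind("sr", 0) == 0)
        kind = "shift"; // checked before the plain "s" rule
    else if (m[0] == 'l')
        kind = "load";
    else if (m[0] == 's')
        kind = "store";
    else if (m[0] == 'f')
        kind = "float";
    else if (m[0] == 'b')
        kind = "branch";
    else
        kind = "unknown";

    // A trailing "i" hints at an immediate operand.
    if (m.back() == 'i')
        kind += " (immediate)";
    return kind;
}
```

For example, GuessMnemonic( "stwu" ) gives "store", GuessMnemonic( "li" ) gives "load (immediate)", and GuessMnemonic( "mr" ) gives "unknown", which is your cue to go look it up.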
Those are the very basics. Hopefully that should be enough to get you started reading and understanding your code’s disassembly. Real understanding only comes with practice, so when you have free time (during rebuilds?), look at random bits of code in optimized and unoptimized builds and see how they differ. Don’t just look at the code and see a bunch of instructions, one of which may or may not be a bl with a function name. Instead, try to really understand every instruction and what the code is doing. It’s not easy, but someday you’ll be a hero to your unenlightened coworkers who truly believe that optimized builds cannot be debugged by humans.