At the risk of stating the obvious: debug builds of our games and tools are designed with the intention of helping us diagnose and fix bugs. The debugger, the compiler, the linker and the libraries we link against are all laden with features intended to make bugs easier to squish. But what if those features actually conspired to obscure bugs from us? Many of us have experienced a frustrating bug that happens to our testers and our end users, but never to us: the dreaded “works for me” situation.
In this post I’ll explain how some of those situations arise, and detail methods for avoiding them. To demonstrate, I’ll use the following class. All the example code given was compiled with Visual Studio 2008 Professional as a Win32 Console Application. If you’re compiling for another platform the debugger characteristics are likely to be different, but many of the same rules still apply.
    class TestClass
    {
    public:
        int size;
        int *data;

        // Deliberately leaves size and data uninitialised.
        TestClass() {}

        void RunCrashTest()
        {
            DebugPrint("Array size is %d (0x%08x)\n", size, size);
            for(int i=0; i<size; i++)
                data[i] = 0;
        }
    };
This class holds a pointer to an array of integers and the size of that array, along with a method which tries to initialise the contents of that array to 0. Notice that I haven’t initialised either the array or the size. If they were local variables inside a method the compiler would give me a C4700: uninitialized local variable warning, but because they’re class members it doesn’t track them in the same way.
When calling the RunCrashTest method you’d probably expect a crash (and not just because of the function name), but this is where the debugger starts to obscure those crashes. Here are three ways of calling the method that produce very different results:
Example 1: Allocating on the stack.
    TestClass test;
    test.RunCrashTest();
Here I’ve created an instance of my TestClass on the stack and called the RunCrashTest() method. That’ll crash, right? Well, sometimes it will. If you like tables of data, you’ll like this:
Build | Running Under Debugger | Value of size as hex | Value as signed decimal | Crash |
Debug | Yes | 0xcccccccc | -858993460 | No |
Debug | No | 0xcccccccc | -858993460 | No |
Release | Yes | 0x00403034 | 4206644 | Yes |
Release | No | 0x00403034 | 4206644 | Yes |
What you see here is that the code crashed in a release build, but not in a debug build. That’s kind of annoying – so why the variation?
It all comes down to the run-time checks that the compiler adds to a debug build (the /RTC options in Visual Studio). The compiler is being helpful and filling the class with 0xcccccccc when it gets allocated on the stack. I’m not complaining, this is a useful compiler feature that often shows up problems. However, when 0xcccccccc is interpreted as a signed integer (which the example code does) it is negative, meaning that the loop exits immediately without running its crash-inducing contents. In fact it’s only by luck (or lack of it) that it crashed in release: if you treat the leftover bits as effectively random, there’s only about a 50/50 chance of the uninitialised value being positive, so it would theoretically pass 50% of the time.
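As a rough sketch of the reasoning above, the following reinterprets a fill pattern as the signed `int` that `size` sees, and counts how often the loop body would run without touching any memory. The helper names `AsSignedInt` and `LoopIterations` are hypothetical and not part of the original test code, and the sketch assumes a 32-bit `int`, as on Win32.

```cpp
#include <cstring>

// Reinterpret a raw 32-bit pattern as a signed int, the way the
// `int size` member sees whatever bits happen to be in memory.
// Assumes int and unsigned int are both 32 bits wide.
inline int AsSignedInt(unsigned int bits)
{
    int value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

// Count how many times the loop from RunCrashTest() would execute its
// body for a given size. No memory is written, and the count is capped
// so that large positive sizes return quickly.
inline int LoopIterations(int size, int cap = 1000)
{
    int count = 0;
    for (int i = 0; i < size && count < cap; i++)
        ++count;
    return count;
}
```

With the debug stack fill, `AsSignedInt(0xCCCCCCCCu)` is -858993460 and `LoopIterations` returns 0, so the bad write never happens. With the release value 0x00403034 the size is positive, the loop body runs, and the write through the uninitialised pointer is what crashes.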
There are some changes we can implement to improve those odds, but first – here’s an example with even worse odds.
Example 2: Allocating on the heap
    TestClass *test = new TestClass;
    test->RunCrashTest();
    delete test;
Not much has changed here except that the class is now allocated on the heap rather than the stack. But look what that does to the results:
Build | Running Under Debugger | Value of size as hex | Value as signed decimal | Crash |
Debug | Yes | 0xcdcdcdcd | -842150451 | No |
Debug | No | 0xcdcdcdcd | -842150451 | No |
Release | Yes | 0xbaadf00d | -1163005939 | No |
Release | No | 0x00000018 | 24 | Yes |
Not only does it fail to crash in a debug build, it also does so when running the release build under the debugger. That’s without re-compiling or re-linking any code – simply clicking “Start Debugging” in Visual Studio IDE rather than “Start Without Debugging” is enough to alter the behaviour.
In this case the behaviour is altered by the debug heap, which Windows will initialise if the process is started under a debugger (incidentally, there’s a Win32 API function called IsDebuggerPresent() that you can use in your own code to determine debugger presence). The debug heap initialises the new memory to 0xbaadf00d, which is what we see in the release build running under the debugger. In a debug build the C-runtime will also chip in and fill the memory with 0xcdcdcdcd during initialisation. The result is that a crash only occurs in one of four configurations, coincidentally the one that we can’t debug. Not an ideal situation, but it is one we can do something about. Before that though, here’s one last example.
Example 3: Freeing the data from the heap before calling the method (yet another bug to compound the misery!).
    TestClass *test = new TestClass;
    delete test;
    test->RunCrashTest();   // bug: test has already been deleted
Build | Running Under Debugger | Value of size as hex | Value as signed decimal | Crash |
Debug | Yes | 0xfeeefeee | -17891602 | No |
Debug | No | 0xdddddddd | -572662307 | No |
Release | Yes | 0x001d9ff0 | 1941488 | Yes |
Release | No | 0x00000018 | 24 | Yes |
Again, the code refuses to crash in debug because the debug C-runtime and the debug heap are setting the freed memory to 0xdddddddd and 0xfeeefeee respectively. In a release build running under the debugger, size happens to sit in a part of the freed block that no longer holds the 0xfeeefeee fill, but had the integer been further down the member list it would have read 0xfeeefeee and failed to crash.
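The fill values seen across the three examples are worth recognising on sight. As a small sketch, a lookup like the one below could be dropped into a logging or assert helper to flag them; the function names are hypothetical, and the descriptions only restate what the tables above show for Visual Studio on Win32.

```cpp
// Map the 32-bit fill values observed in the tables above to their source.
// Returns 0 (a null pointer) when the value is not one of those patterns.
inline const char* DescribeFillPattern(unsigned int value)
{
    switch (value)
    {
        case 0xCCCCCCCCu: return "uninitialised stack memory (debug build)";
        case 0xCDCDCDCDu: return "uninitialised heap memory (debug C-runtime)";
        case 0xDDDDDDDDu: return "freed heap memory (debug C-runtime)";
        case 0xBAADF00Du: return "uninitialised heap memory (debug heap, started under debugger)";
        case 0xFEEEFEEEu: return "freed heap memory (debug heap, started under debugger)";
        default:          return 0;
    }
}

// Convenience wrapper: true if the value matches one of the patterns above.
inline bool IsKnownFillPattern(unsigned int value)
{
    return DescribeFillPattern(value) != 0;
}
```

A match is only a hint, since a legitimate value can coincide with a pattern, but seeing one of these in a size or pointer member is usually a strong sign of uninitialised or freed memory.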
So there you have it: 12 different ways of running the same function, only 5 of which crash. None of those crashes occur in a debug build, and it’s only by chance that it crashes at all. If you imagine a situation where the memory that stores the array size had previously been storing a time value, you might get a crash that only occurred at certain times of day.
The burning question then, is what can we do about it?
Solutions
1. Bypass the debug heap
One easy work-around is to start your game/application using “Start Without Debugging” and then attach the debugger using “Attach to Process” as soon as it has started up. The debug heap is initialised when the executable is starting up, so by debugging later on you bypass the debug heap. This would have let us debug Example 2 in a release build.
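If the bug happens too soon after start-up to attach by hand, one option is to make the program pause early on until a debugger shows up. The sketch below uses the real Win32 calls IsDebuggerPresent(), GetTickCount() and Sleep(); the wrapper `WaitForDebuggerAttach` is a hypothetical helper, and on non-Windows platforms it simply returns false.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Poll until a debugger attaches or timeoutMs milliseconds have passed.
// Returns true if a debugger is attached when the function returns.
inline bool WaitForDebuggerAttach(unsigned long timeoutMs)
{
#ifdef _WIN32
    const DWORD start = GetTickCount();
    while (!IsDebuggerPresent())
    {
        if (GetTickCount() - start >= timeoutMs)
            return false;           // gave up waiting
        Sleep(50);                  // avoid spinning the CPU
    }
    return true;
#else
    (void)timeoutMs;                // no-op fallback for other platforms
    return false;
#endif
}
```

Calling something like `WaitForDebuggerAttach(10000)` near the top of main(), in a build started with “Start Without Debugging”, gives you ten seconds to use “Attach to Process” before the interesting code runs.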
2. Write your own heap
While this might be seen as a symptom of NIH syndrome, writing your own heap that gives you the ability to vary the heap-fill values can be a worthwhile exercise for a large project (it can also yield performance and fragmentation benefits, but that’s another topic altogether).
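To give a flavour of the idea, here is a minimal sketch of a wrapper around malloc/free with caller-chosen fill bytes. It is not a real heap: `FillingHeap` is a hypothetical name, the caller has to pass the block size back to Free() because no header is stored, and nothing here addresses performance or fragmentation.

```cpp
#include <cstdlib>
#include <cstring>

// Thin wrapper over malloc/free that stamps memory with configurable bytes.
class FillingHeap
{
public:
    FillingHeap(unsigned char allocFill, unsigned char freeFill)
        : allocFill_(allocFill), freeFill_(freeFill) {}

    // Allocate `bytes` bytes and set every byte to the allocation fill.
    void* Allocate(std::size_t bytes)
    {
        void* p = std::malloc(bytes);
        if (p)
            std::memset(p, allocFill_, bytes);
        return p;
    }

    // Stamp the block with the free fill, then release it.
    // Note: an optimising compiler may drop this memset as a dead store;
    // a real heap would need to prevent that.
    void Free(void* p, std::size_t bytes)
    {
        if (!p)
            return;
        std::memset(p, freeFill_, bytes);
        std::free(p);
    }

private:
    unsigned char allocFill_;
    unsigned char freeFill_;
};
```

Choosing a fill such as 0x7F makes an uninitialised `int` read as a large positive number (0x7f7f7f7f), so a loop like the one in RunCrashTest() would run and fault rather than being silently skipped. Varying the fill between runs helps flush out code that only works with one particular pattern.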
3. Use unsigned integers for array sizes
Had the array size in my example been unsigned the code would have crashed in every single case.
4. Use ‘not equal’ conditions in for loops.
I hadn’t heard this tip until I was discussing the problem with a colleague recently, but if the example loop were changed from for(int i=0; i<size; i++) to for(int i=0; i!=size; i++) it would crash in every case above, because even if size was negative the loop would still execute.
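Tips 3 and 4 both work by making sure the loop body is entered when size holds a fill pattern. The sketch below evaluates just the first-iteration condition (i == 0) for each loop style; the function names are hypothetical, and a 32-bit `int` is assumed.

```cpp
// First-iteration check for: for (int i = 0; i < size; i++)
inline bool EntersSignedLessThan(int size)
{
    int i = 0;
    return i < size;
}

// First-iteration check for: for (unsigned int i = 0; i < size; i++)
inline bool EntersUnsignedLessThan(unsigned int size)
{
    unsigned int i = 0;
    return i < size;
}

// First-iteration check for: for (int i = 0; i != size; i++)
inline bool EntersSignedNotEqual(int size)
{
    int i = 0;
    return i != size;
}
```

For the debug stack fill, `EntersSignedLessThan(-858993460)` is false, while `EntersUnsignedLessThan(0xCCCCCCCCu)` and `EntersSignedNotEqual(-858993460)` are both true, so the bad write is attempted straight away. One caveat with the `!=` form: if size really is negative and nothing faults first, `i` keeps counting up until it overflows, which is undefined behaviour for a signed int, so it is a bug-finding aid rather than a guarantee.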
5. Use container classes and templates.
Using a container class guards against the kind of silly mistakes that are evident in the example code, so that you simply can’t get into many of the crash situations I demonstrated.
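As an illustration, here is one way the example class might look with std::vector in place of the raw pointer and size. `SafeTestClass` and `RunTest` are hypothetical names for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Container-based variant of TestClass: the vector owns the storage
// and tracks its own size, so neither can be left uninitialised.
class SafeTestClass
{
public:
    std::vector<int> data;   // default-constructed as an empty vector

    SafeTestClass() {}

    // Create `count` elements, each starting at -1 so that RunTest()
    // has something visible to overwrite.
    explicit SafeTestClass(std::size_t count) : data(count, -1) {}

    // Zero every element. With an empty vector the loop does not run.
    void RunTest()
    {
        for (std::size_t i = 0; i != data.size(); i++)
            data[i] = 0;
    }
};
```

A default-constructed `SafeTestClass` behaves identically in every build configuration, because there is no stale memory for the fill patterns to change. It does not protect against calling a method on an object that has already been deleted, as in Example 3.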
6. Test earlier and more often
Make a habit of running all the different configurations of your project; don’t just run debug and expect to see all the bugs. Or go a step further and set up a continuous integration server that runs your project in a variety of configurations whenever any new code is committed.
7. Implement a crash report system
To catch bugs that only seem to occur in the wild, implement a crash report system that saves off a dump file, or simply writes the callstack to a text file. Being able to see which line and data caused the crash is often enough to catch and fix it, even if you never reproduce it.
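On Windows, a bare-bones version of the dump-file approach can be sketched with SetUnhandledExceptionFilter() and MiniDumpWriteDump() from dbghelp. Those are real Win32 calls, but everything else here is an assumption for illustration: the names `WriteCrashDump`, `InstallCrashDumpHandler` and `kCrashDumpsSupported`, and the placeholder file name "crash.dmp". On other platforms the sketch does nothing.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")   // MSVC-specific way to link dbghelp

const bool kCrashDumpsSupported = true;

// Called by Windows when an exception goes unhandled.
// Writes a minidump to "crash.dmp" in the current directory.
static LONG WINAPI WriteCrashDump(EXCEPTION_POINTERS* info)
{
    HANDLE file = CreateFileA("crash.dmp", GENERIC_WRITE, 0, NULL,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file != INVALID_HANDLE_VALUE)
    {
        MINIDUMP_EXCEPTION_INFORMATION mei;
        mei.ThreadId = GetCurrentThreadId();
        mei.ExceptionPointers = info;
        mei.ClientPointers = FALSE;
        MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), file,
                          MiniDumpNormal, &mei, NULL, NULL);
        CloseHandle(file);
    }
    return EXCEPTION_EXECUTE_HANDLER;   // let the process terminate
}

// Register the handler; call once near the start of main().
inline bool InstallCrashDumpHandler()
{
    SetUnhandledExceptionFilter(WriteCrashDump);
    return true;
}
#else
const bool kCrashDumpsSupported = false;

// No-op fallback so the calling code still compiles elsewhere.
inline bool InstallCrashDumpHandler()
{
    return false;
}
#endif
```

The resulting .dmp file can be opened in Visual Studio alongside the matching executable and symbol files to see the call stack and data at the point of the crash. A production system would also want unique file names, a way to get the dump back from the user, and care about how much work is done inside a process that is already in a bad state.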
Up Next…
That concludes this post. I hope you find these tips useful in finding and avoiding bugs in your projects.
For my next post I’ll be talking about some unusual performance characteristics that debug builds and the debugger can introduce. It’s common knowledge that debug builds run slower than release, but the areas where those performance gaps are introduced might catch you by surprise.