Before You Wreck Yourself

Recently, Thaddaeus Frogley’s article inspired me to write about how we work at ArenaNet to prevent, detect, and repair errors in gameplay data, a.k.a. ‘content’.

Sources of Content Errors

The gameplay data of a modern game drives the behavior of NPCs, propels the narrative, and weaves together a tapestry of audio, art, and animation to create an immersive, virtual world. As we strive to provide a greater variety of interactive experiences to our players, so too must we provide our designers with a broader and deeper representation of the world’s mechanics.

Yet, as this model becomes more complex, it becomes increasingly difficult for designers to produce complete and correct content. Additionally, the content model itself is often dynamic, growing to support new features, or adapting to iteration on existing features. Together, the volume, complexity, and evolutionary nature of content all contribute to the emergence of content errors.

Productivity Impact of Content Errors

Some content errors may be show-stoppers. Cases where bad content can crash the game have a pronounced impact on the productivity of the design and engineering teams. A single designer may be unable to iterate on content while an engineer investigates the root cause of a crash that occurs only with that designer’s content changes. Worse still, a designer may commit content changes that prevent the entire design team from running the game until the offending content can be identified and rolled back.

The majority of potential content errors are likely to be much more subtle. An NPC may occasionally speak a line recorded by a different voice actor, or a monster in a remote location may be far too powerful for the players who are likely to enter that region. These issues can be very difficult to track down in a live game environment. No matter how many QA man-hours you can devote to in-game testing, rapid content iteration by the design team will quickly create new corner cases to explore.

Preventing Content Errors

The introduction of interactive tools is a common way to reduce the risk and burden of content production. These may be off-the-shelf products, but I’m sure we are not alone in developing an in-house application for content designers.

Interactive tools can enable a dramatic boost in productivity by filling in reasonable default values, enumerating available options, cross-referencing related data, and alerting users to inconsistencies. An advanced system may even allow common data patterns to be abstracted into reusable components. All of these features go a long way toward preventing content errors.

Our interactive design tool is built in three layers. The lowest-level content reading/writing/modification library is written in C++, where explicit memory management provides advantages in both speed and size. This layer is also used by other C++-authored tools that need to access content. Due to its simplicity, our C++ library is the most stable and changes very infrequently.

Above that, a number of C# libraries provide the bulk of the UI features, e.g. an object-oriented content model, source depot integration, undo/redo, commands, menus, tree and grid views, and view customization. The C# layer of the application is where we do most of our active feature work, and we generally release a new build weekly.

Content-type-specific code and one-off scripts are implemented in Python, integrated with C# via IronPython. The built-in Python editor (using ScintillaNET) is one of the most powerful features, allowing users to generate or repair large batches of content. It is easy to get started by loading one of the many example ‘utility scripts’ shared by engineers and technical designers.
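
For instance, a utility script that repairs a batch of content might look something like the following sketch. The object and property names (“MonsterDef”, level, and so on) are invented for illustration, and the plain dicts stand in for the content objects exposed by the tool’s scripting API; this is not our actual library.

    # Hypothetical utility script: clamp out-of-range monster levels in a
    # batch of content objects. Plain dicts stand in for real content objects.

    def clamp_monster_levels(objects, max_level=80):
        """Repair any 'MonsterDef' whose level exceeds the cap."""
        repaired = []
        for obj in objects:
            if obj.get("type") == "MonsterDef" and obj.get("level", 0) > max_level:
                obj["level"] = max_level
                repaired.append(obj["name"])
        return repaired

    if __name__ == "__main__":
        sample = [
            {"type": "MonsterDef", "name": "SwampTroll", "level": 120},
            {"type": "MonsterDef", "name": "RiverDrake", "level": 30},
        ]
        print("Repaired:", clamp_monster_levels(sample))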

Detecting Content Errors

Defensive programming within the game itself can reduce the impact of show-stopping errors, and may even provide feedback on less critical inconsistencies in data. However, this approach has two serious disadvantages.

First, defensive code increases the complexity of the game. This code may have a performance impact at runtime, possibly mitigated by conditional compilation. Even if the performance cost is minimal, defensive code also has a productivity cost that is much more difficult to measure, as it increases the difficulty of subsequent refactoring efforts.

Second, that growing list of red and yellow error messages that rapidly scrolls by every time a designer loads a map is a very poor form of feedback. Logging these errors to the bug database can dramatically improve tracking for such issues. However, there is still a significant burden on design or QA, who must sift through these bugs, determine which remain unresolved, and assign responsibility for a fix.

Early detection is the key to minimizing the impact of content errors. Detecting errors before local content changes are loaded into the game greatly improves the reliability of the game and reduces the reliance on defensive code. Detecting errors before local content changes are checked in further improves the reliability of new content as it is distributed to other designers and to QA, enhancing everyone’s productivity.

An Early Warning System

Our early warning system, the aptly named ‘ContentValidator’, is a very simple command line tool, implemented in Python. When run, ContentValidator first imports a few core data-management libraries. These libraries provide APIs to parse content into a hierarchical object/property representation, and support iteration, modification, and persistence of these content objects. ContentValidator can then load the entire content hierarchy, or just a subset, into memory and iterate over each content object, running a variety of tests.

Perhaps unsurprisingly, each concrete content object type has a simple string name, e.g. “SpawnDef”. The tests run by ContentValidator are implemented in separate Python scripts, each named for the type of content object it validates, e.g. “SpawnDef.py”. As new content types are introduced, it is trivial to add new validation scripts, and as existing types evolve, it is easy to add new tests or update old ones.
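
As a rough illustration (not our actual implementation), the driver loop in ContentValidator.py might look something like this. The ContentObject namedtuple, iter_content(), and the sample data are hypothetical stand-ins for our core data-management libraries; only the overall shape (load, iterate, dispatch to a per-type script) reflects the real tool.

    import importlib
    from collections import namedtuple

    # Stand-in for the object/property representation provided by the
    # core data-management libraries.
    ContentObject = namedtuple("ContentObject", "type_name name properties")

    def iter_content(root=None):
        """Stand-in for loading the entire content hierarchy, or just the
        subtree under 'root', into memory."""
        yield ContentObject("SpawnDef", "Spawn_Example_A", {"level": 0})
        yield ContentObject("SpawnDef", "Spawn_Example_B", {"level": 12})

    def Validate(root=None):
        """Iterate over every content object and run its type's tests."""
        failures = []
        for obj in iter_content(root):
            try:
                # Tests for each type live in a script named after it,
                # e.g. SpawnDef.py, which exposes run_tests(obj).
                tests = importlib.import_module(obj.type_name)
            except ImportError:
                continue  # no validation script for this type yet
            failures.extend(tests.run_tests(obj))
        return failures

    if __name__ == "__main__":
        for obj, message in Validate():
            print("%s: %s" % (obj.name, message))

Because each test script is an ordinary Python module named after its content type, adding coverage for a new type is just a matter of dropping a new file next to the others.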

The implementation of tests is an ongoing process. Currently, the tools team writes many of the most obvious tests for each data type as we become familiar with it. Ideally, gameplay programmers would bear the primary responsibility for maintaining validation scripts. This would obviate much of the need for defensive code in the game itself, and cover many more subtle content cases that the tools team may not fully grasp.

The simple, script-based system also empowers designers to participate in enhancing validation. By opening this system to all of the affected parties, we enable everyone to contribute. As a result, the discovery of a new content error generally results in improvements to validation.
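
As an illustration, a per-type script such as SpawnDef.py might contain tests along these lines. The property names and thresholds are invented for the example; the run_tests() interface matches the driver sketch above.

    # SpawnDef.py - hypothetical validation tests for SpawnDef content objects.

    MAX_LEVEL = 80  # illustrative cap, not a real game constant

    def run_tests(obj):
        """Return a list of (object, message) pairs for every failed check."""
        failures = []
        props = obj.properties

        level = props.get("level", 0)
        if not 1 <= level <= MAX_LEVEL:
            failures.append((obj, "level %d is outside the range 1-%d"
                                  % (level, MAX_LEVEL)))

        if not props.get("spawn_point"):
            failures.append((obj, "missing spawn_point"))

        return failures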

Test Early, Test Often, Test Fast

All the tests in the world would do us little good if they weren’t run on a regular basis. And they couldn’t be run on a regular basis if they weren’t fast. We are able to run our entire suite of tests over our entire content hierarchy in about 2 minutes. Testing the subset of content that a particular designer is editing generally takes only a few seconds.

We run a quick validation of a designer’s local content changes just before they load the game for testing. We run a full validation of all content, including the designer’s local changes, before submitting changes to the depot.

This process is automated by integrating ContentValidator into the interactive design tool that designers are already using to produce content. Using IronPython, we can call the Validate() function in ContentValidator.py and pass a specific content object as the root of the hierarchy to be validated. Starting the game for testing or submitting content changes to the depot is just a single click, and validation always runs first.
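
In practice, those one-click commands boil down to something like the following sketch. The launch_game(), submit_to_depot(), and show_validation_report() helpers are hypothetical stand-ins for the tool’s real C# commands; only the call to ContentValidator.Validate() reflects the actual integration described above.

    import ContentValidator  # the driver sketched earlier

    # Hypothetical stand-ins for the tool's real C# commands.
    def show_validation_report(failures):
        print("\n".join(message for _, message in failures))

    def launch_game():
        print("launching the game with local content...")

    def submit_to_depot(changelist):
        print("submitting changelist", changelist)

    def run_game_with_local_changes(local_root):
        """Quick validation of the designer's local content, then launch."""
        failures = ContentValidator.Validate(local_root)
        if failures:
            show_validation_report(failures)
            return
        launch_game()

    def submit_content_changes(changelist):
        """Full validation of all content, then submit to the depot."""
        failures = ContentValidator.Validate()
        if failures:
            show_validation_report(failures)
            return
        submit_to_depot(changelist)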

Validation Reports

Every warning or error message produced by ContentValidator is associated with a particular content object and accompanied by a textual description. If ContentValidator is invoked as a command-line application, the log output prepends a fully-qualified identifier for the content object, which allows us to locate and inspect that object in greater detail.
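
For example, a command-line report line might be assembled roughly like this; the identifier scheme shown is purely illustrative, not our real naming convention.

    def format_report_line(qualified_id, message):
        """Prepend the fully-qualified content identifier to the message."""
        return "%s: %s" % (qualified_id, message)

    # Hypothetical output: "maps/example_region/Spawn_Example_A: level 0 is out of range"
    print(format_report_line("maps/example_region/Spawn_Example_A",
                             "level 0 is out of range"))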

When ContentValidator is invoked from within our interactive design tool, validation reports become much more powerful. Since the design tool and ContentValidator both use the same core data-management libraries, the content object associated with each warning or error message is actually a memory-resident content object available to the tool.

When the list of validation failures is presented in the tool, each warning or error message appears as a hyperlink to the related content object. Clicking the link presents the content object alongside the message describing the problem, allowing users to very rapidly resolve warnings and errors.

Who Watches the Watchmen?

Since ContentValidator depends heavily on our core data-management libraries, it is vulnerable to bugs in those libraries. These libraries are very generic and change very infrequently; however, bugs here can have a wide-ranging impact on content. For this reason, ContentValidator performs a set of sanity checks every time it is run.

These sanity checks are implemented in separate scripts, much like the validation tests, one per content object type. Each sanity check script simply produces a number of known good and bad data objects. The existing suite of validation tests is then applied to these objects. Failing validation for a good object, or passing validation for a bad object, indicates either a bug in our core libraries or a recent change to the content model that is not yet reflected by our sanity checks or validation tests.
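
A sanity check for SpawnDef might look something like the following sketch, reusing the hypothetical ContentObject stand-in and the SpawnDef.run_tests() interface from the earlier sketches.

    from collections import namedtuple

    from SpawnDef import run_tests  # the per-type validation script

    ContentObject = namedtuple("ContentObject", "type_name name properties")

    def sanity_check():
        """Known-good data must pass validation; known-bad data must fail."""
        good = ContentObject("SpawnDef", "Spawn_KnownGood",
                             {"level": 10, "spawn_point": "sp_01"})
        bad = ContentObject("SpawnDef", "Spawn_KnownBad", {"level": 0})

        # Any other outcome points at a bug in the core libraries, or at a
        # content-model change not yet reflected in the checks and tests.
        assert not run_tests(good), "known-good SpawnDef failed validation"
        assert run_tests(bad), "known-bad SpawnDef passed validation"

    if __name__ == "__main__":
        sanity_check()
        print("SpawnDef sanity checks passed")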

Either way, we want to raise this red flag high, so we run ContentValidator on our build server as the final step during a new tool chain release build. If content validation fails, then the build fails. Due to a hobby project by a particularly industrious ArenaNet engineer, a build failure actually fires up a flashing red light in the tools team room.
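
On the build server, that final step can be as simple as the following sketch, assuming (for illustration) that ContentValidator exits with a non-zero status when validation fails.

    import subprocess
    import sys

    # Final step of the tool chain release build: run full content validation
    # and propagate any failure so that the build itself fails.
    result = subprocess.call([sys.executable, "ContentValidator.py"])
    if result != 0:
        sys.exit(result)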

Evolving with Content

The tool chain must be able to evolve in parallel with the content model; otherwise it quickly becomes a roadblock to productivity. Because ContentValidator’s primary features are composed of per-type scripts, we are able to rapidly accommodate changes in content with updated validation.

Our interactive design tool follows a similar philosophy. The core tool code is very general-purpose and can provide a default UI for any content data. The default UI may not be ideal for all content types, in which case we can customize the UI in a variety of ways simply by editing presentation metadata for a particular type. For more radical UI customization, we can register custom UI code on a per-content-type basis.

Content-type-specific requirements for values are implemented in Python. These files can actually be edited while the tool is running, which enables us to rapidly debug new scripts without reloading the tool. These scripts provide callbacks that are invoked when a content object is created, modified, or moved, which lets us enforce invariants for a variety of content values, e.g. maintaining a set of monotonically increasing, positive integer IDs as objects are added to or removed from a collection.
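
A minimal sketch of such a callback, enforcing the ID invariant, might look like this. The on_child_added() hook name and the dict-based objects are hypothetical stand-ins; the real callbacks operate on the tool’s content object model.

    def on_child_added(parent, child):
        """Invoked by the tool when an object is added to a collection."""
        if child.get("id", 0) <= 0:
            # Assign IDs from a persistent high-water mark so that IDs stay
            # positive and monotonically increasing, and are never reused
            # even after objects are removed from the collection.
            next_id = parent.get("next_id", 1)
            child["id"] = next_id
            parent["next_id"] = next_id + 1

    if __name__ == "__main__":
        collection = {"next_id": 3, "children": [{"id": 1}, {"id": 2}]}
        new_child = {}
        on_child_added(collection, new_child)
        collection["children"].append(new_child)
        print(new_child["id"])  # 3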

A Dedicated Tools Team

The tools team is primarily geared toward increasing designer productivity. In my mind, however, the tools team is ultimately responsible for the prevention, detection, and elimination of content errors. This benefits developers company-wide, reducing the risk of producing content errors, and minimizing the associated productivity costs across all disciplines.

Acknowledgements
I’d like to thank Cameron Dunn for proof-reading my article and providing invaluable feedback. Also, thanks to Mom for finding my typos.