A lot of code is about transforming data. Well, from a theoretical point of view all programming is about transforming data. Of course, claiming that a video game is a function that integrates mouse movements into an N-dimensional framebuffer is not exactly helpful, but sometimes it’s useful to see your data being processed. Some examples are quite intuitive (like using a vertex shader to transform world-space vertices into screen-space ones), but others are hard to notice, especially with a strong OOP mindset.

But how can you do transformations the smart way? Actual vertex transformation is quite an easy case (if you’re not doing anything fancy): do some (fixed) math and you’re good to go. Particle systems are harder. Some systems can be stateless, some cannot. For some of them you would like billboards, for others some degrees of freedom (DOF) unlocked. Some would use simple textures while others… etc. This can quickly get very complex. But let’s go even further: imagine a Diablo-like (or top-down shooter) game where you have hundreds of opponents, all of them casting spells or just throwing knives and chairs at you. There are many nice Flash games out there, but unfortunately most of them are no longer fun once more than 200 sprites are visible on screen.

1. Fixed function

You can start with something like this (in pseudocode):

  void SpellsSystem::Update(float dt)
  {
    for (Spell & s : spells)
    {
      // do some calculations and update the state of s
    }
  }

This is OK, but how many different effects can your spell produce? How do they relate to each other? Maybe some effects depend on certain conditions, like a time delay or the state of other objects? The bottom line is that you may end up with heavily branched code, or with class inheritance, virtuals, etc. Either way this causes several problems:

  • while the code itself may be more or less readable, it will be very hard to track what exactly is going on with a specific object
  • performance will be rather choppy (it may not be so bad with good branch prediction, but there are platforms other than x86)…
  • …but it’s certainly going to be terrible in Debug mode. And what is the purpose of using Debug mode when your game hardly hits realtime (in SIGGRAPH terms) framerates, let alone smooth gameplay?
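To make the problem concrete, here is a minimal sketch of where such an update loop tends to end up. Everything in it (the Spell fields, the SpellKind values, the damage numbers) is made up for illustration and does not come from any real game; the only point is that every new kind of spell or condition adds another branch to the same loop.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical spell record; none of these names come from a real engine.
enum class SpellKind : std::uint8_t { Fireball, AcidCloud, Lightning };

struct Spell {
    SpellKind kind;
    float age;      // seconds since the spell was cast
    float delay;    // seconds before the effect starts
    float damage;   // damage dealt so far
    bool  homing;   // extra condition that changes the behaviour
};

// One loop handles every kind of spell, so each new feature adds a branch.
void update_spells(std::vector<Spell>& spells, float dt) {
    assert(dt >= 0.0f);
    for (Spell& s : spells) {
        s.age += dt;
        if (s.age < s.delay) continue;   // time-delay condition
        switch (s.kind) {
        case SpellKind::Fireball:
            s.damage += 10.0f * dt;
            break;
        case SpellKind::AcidCloud:
            // damage over time, only for the first 5 seconds after the delay
            if (s.age - s.delay < 5.0f) s.damage += 2.0f * dt;
            break;
        case SpellKind::Lightning:
            s.damage += (s.homing ? 6.0f : 4.0f) * dt;
            break;
        }
    }
}
```

With three kinds of spells this is still readable. With thirty kinds, plus conditions that depend on other objects, following a single spell through the loop in a debugger becomes the hard part.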

2. Oh my XML

But don’t forget that you also need to specify settings for a particular data system. Some may be expressed with simple flags like Particle::isTransparent or Attack::isLethal. But what if you want to express something like “attack with an acid cloud for 5 seconds and then throw a fireball unless below 10 HP”? I’ve seen an actual game that used XML for such things:

  <effect type="bullet">
    <emitter>
      <speed value="0.0" />
      <acceleration value="0.0" />
      <ttl value="0.5" />
      <count value="1" />
      <width value="0" />
      <explode-on-fade value="1" />
      <bullet value="data/physicals/invisible-bullet.xml" />
      <filter value="none" />

      <flight-effect type="list">
        <effect type="delayed">
          <time value="0.15"/>
          <effect type="retarget">
            <target-chooser>
              <filter value="none"/>
              <range value="2.0" />
              <angle value="60.0"/>
              <aim-at-ground value="1"/>
            </target-chooser>
            <effect type="list">
              <effect type="jump">
                <name value="crawling-lightning"/>
              </effect>
              <effect type="bullet">
                <emitter>
                  <speed value="0.0" />
                  <acceleration value="0.0" />
                  <ttl value="0.5" />
                  <count value="1" />
                  <width value="0" />
                  <explode-on-fade value="1" />
                  <bullet value="data/physicals/invisible-bullet.xml" />
                  <filter value="none" />

                  <flight-effect type="delayed">
                    <time value="0.15"/>
                    <effect type="retarget">
                      <target-chooser>
                        <filter value="none"/>
                        <range value="1.5" />
                        <angle value="60.0"/>
                        <aim-at-ground value="1"/>
                      </target-chooser>
                      <effect type="list">
                        <effect type="jump">
                          <name value="crawling-lightning"/>
                        </effect>
                        <effect type="bullet">
                          <emitter>
                            <speed value="0.0" />
                            <acceleration value="0.0" />
                            <ttl value="0.5" />
                            <count value="2" />
                            <width value="0" />
                            <explode-on-fade value="1" />
                            <bullet value="data/physicals/invisible-bullet.xml" />
                            <filter value="none" />

                            <flight-effect type="delayed">
                              <time value="0.15"/>
                              <effect type="retarget">
                                <target-chooser>
                                  <filter value="none"/>
                                  <range value="1.0" />
                                  <angle value="60.0"/>
                                  <aim-at-ground value="1"/>
                                </target-chooser>
                                <effect type="jump">
                                  <name value="crawling-lightning"/>
                                </effect>
                              </effect>
                            </flight-effect>
                          </emitter>
                        </effect>
                      </effect>
                    </effect>
                  </flight-effect>
                </emitter>
              </effect>
            </effect>
          </effect>
        </effect>
      </flight-effect>
    </emitter>
  </effect>

Another possibility is to use a node graph editor and carefully connect all such effects, actions and conditions.

(image omitted — too big and scary)

3. Programmable pipeline

Some time ago I wrote a blog post about a data transformation pipeline built around Flex/Bison/AsmJit (and recently LLVM). You can think of it as something like a regular data shader (or object shader, if you like silly buzzwords). You can basically write a piece of code that is somewhere between a shader and a Lua script and compile it at runtime. You can use your favourite approach (class hierarchy-based, component-based) and it’s still going to be fast, because it will end up as a (hopefully) branchless piece of raw data processing code. You can even use XML or a node editor to prepare such code, but <rant> I have no idea why anyone would want to link graph nodes instead of writing C-like code. I mean, try to implement, debug and optimize HBAO using only your mouse. </rant>

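  // input: one array per attribute (struct-of-arrays)
  struct State
  {
      float4 * positions;
      float4 * orientations;
      uint32_t * hps;
  };

  // output: the updated state plus the attack chosen for each object
  struct Output
  {
      State new_state;
      uint32_t * attack_types;
  };

  // pointer to the transformation compiled at runtime
  void (*transform_data)(State &, Output &);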
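To give an idea of what could sit behind that function pointer, here is a hand-written stand-in for the kind of code the runtime compiler might emit. It is only a sketch under several assumptions that are not in the structs above: float4 is defined as a plain struct of four floats, State gets an extra count field (the original has no element count), orientations are treated as direction vectors, and the attack ids and the step constant are invented. The rule it encodes is the “throw a fireball unless below 10 HP” part of the earlier example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct float4 { float x, y, z, w; };   // assumption: plain 4-float vector

struct State {
    float4* positions;
    float4* orientations;    // assumption: used here as a direction vector
    std::uint32_t* hps;
    std::size_t count;       // assumption: not part of the original struct
};

struct Output {
    State new_state;
    std::uint32_t* attack_types;
};

// Hypothetical attack ids.
constexpr std::uint32_t kAttackNone = 0, kAttackFireball = 1;

// Hand-written stand-in for the code a runtime compiler would generate.
void example_transform(State& in, Output& out) {
    assert(out.new_state.count == in.count);
    const float step = 0.5f;   // hypothetical fixed distance per update
    for (std::size_t i = 0; i < in.count; ++i) {
        const float4 p = in.positions[i];
        const float4 d = in.orientations[i];
        out.new_state.positions[i] =
            { p.x + d.x * step, p.y + d.y * step, p.z + d.z * step, p.w };
        out.new_state.orientations[i] = d;
        out.new_state.hps[i] = in.hps[i];
        // "fireball unless below 10 HP", written as a simple value select
        out.attack_types[i] = in.hps[i] >= 10u ? kAttackFireball : kAttackNone;
    }
}

// The game only ever sees this pointer; which code is behind it can change.
void (*transform_data)(State&, Output&) = example_transform;
```

The loop body has no virtual calls and no per-object type checks, which is what makes it a good candidate for inlining and vectorization. Whether a compiler really turns the select into branch-free code depends on the compiler and target.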

With this approach you generally gain performance, thanks to:

  • struct-of-arrays data input & output
  • aggressive inlining
  • no unnecessary code in the transformation (which is hopefully branchless)
  • optimization of the data transform code that is independent of the general build profile, so you can still enjoy thousands of objects in Debug
  • the ability to target a specific machine (I mean, most general JIT-ted languages like C# don’t really benefit from SSE etc., but here you can enjoy full processor power without assuming that all end users actually have SSE 4.2 and fancy DPPS ops)
  • last but not least, easy parallelization
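The struct-of-arrays and parallelization points from the list above can be sketched roughly like this. The Spells layout, its fields and the chunking scheme are assumptions made for the example, and the chunks run one after another here; the point is only that each chunk reads and writes its own slice of the arrays, so handing chunks to worker threads would need no locking.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Struct-of-arrays: one contiguous array per field (hypothetical fields).
struct Spells {
    std::vector<float> x, vx;         // position and velocity along one axis
    std::vector<std::uint32_t> hp;
};

// Processes objects in [begin, end). It touches only its own slice, so
// different slices could run on different worker threads without locking.
void update_range(Spells& s, std::size_t begin, std::size_t end, float dt) {
    assert(s.vx.size() == s.x.size() && s.hp.size() == s.x.size());
    assert(begin <= end && end <= s.x.size());
    for (std::size_t i = begin; i < end; ++i) {
        s.x[i] += s.vx[i] * dt;
        // lose 1 HP per update, but never go below 0: subtract the result
        // of the comparison (0 or 1) instead of using an if statement
        s.hp[i] -= static_cast<std::uint32_t>(s.hp[i] > 0u);
    }
}

// Splits the work into fixed-size chunks; here they simply run in sequence.
void update_all(Spells& s, float dt, std::size_t chunk) {
    assert(chunk > 0);
    const std::size_t n = s.x.size();
    for (std::size_t begin = 0; begin < n; begin += chunk)
        update_range(s, begin, std::min(begin + chunk, n), dt);
}
```

In a real system the loop in update_all would submit each chunk to a job system or thread pool instead of calling update_range directly.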

Why not use OpenCL/CUDA/DirectCompute for this?

As far as I can see, CPU OpenCL implementations are still not production-ready (at least they were not when I was digging into the subject). And the other solutions are too vendor-specific.

Why not use the GPU for this?

Well, most of the time you could. But most GPGPU evangelists have trouble noticing that in a typical video game the GPU already has a lot to do. So, depending on your specific needs, using the GPU for heavy computations could just slow down the application overall. But this is also system-dependent: gamers can have a powerful GPU combined with a low-end CPU and vice versa. Additionally, don’t forget that there is a lot of overhead in copying data between the GPU and the CPU. So there is no silver bullet, but of course the ability to do those computations on the GPU is cool.

PS. I’ll try to post some benchmarks if this is interesting enough. And feel free to correct any errors. After all, I’m not an AAA veteran with 30 years of experience. ;)