This may sound a bit niche, but I thought that I’d talk a little bit about a simple design for a packet-based networking interface that we came up with on a project a while back, since it’s both a relatively elegant solution in its own right and a demonstration of some tricks for bending the C++ preprocessor into doing your dirty work for you. So, without further ado…

Basically, the problem this sets out to solve is simple – we have two machines on a network, and we want to provide an API for the user to send data between them (in practice there might be more than two, but extending things is simple once you have the basics working). Since a lot of different bits of random code will be doing this (particularly as this system is intended to handle communication for debugging purposes as well as “real” game code), we want to make the API as simple as possible, and keep the potential for errors as small as possible. “Simple” often equates to “small”, so what’s the minimal set of functionality we can get away with?

Well, there are two things the end user wants to do – send data, and receive it. For our purposes we’ll assume that the data in question is some sort of fixed-size structure (variable-size data is a fairly easy extension for later). Since we’re in C++-land, it makes sense to represent that as a class (or struct if you prefer). So the user wants to put some data into a class, and then send it across the network to the other machine, where they will do something to receive it. So our API will need send() and receive() calls…

…Or will it? Whilst at first glance receive() makes sense as a function for the user to call, in practice that can be a bit of a PITA. It opens up two big problems – firstly, on the user end, they have to poll every frame (or suitable time period) to see if data has arrived and deal with it. Secondly, and even more awkwardly, it opens up API design questions like “what happens to data that no-one is trying to receive?” and “what happens if the receive call is expecting a different type of data from what actually arrived?”. So maybe we should turn the receive side on its head, and say that the receive function is something the user implements, and it gets automatically called when data arrives. There are some timing-dependency implications for that (if, for example, a packet is received half-way through the code doing some other processing), but a little bit of common sense with scheduling and mutexes can generally resolve that. (And, as timing problems go, the “what if my receive function has a race condition against the processing code?” problem is far less scary and hard to deal with than the “what if data packet A has a race condition against data packet B?” problem, IMHO, which is where you seem to end up if you go down the road of having user-triggered receive calls).
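To make the “scheduling and mutexes” remark a little more concrete, here is a minimal sketch of one way to do it: the network thread parks each receive callback in a queue, and the game thread runs them at a point in the frame where touching game state is safe. DeferredReceiveQueue, post() and drain() are names invented for this illustration and are not part of the system described here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical helper: callbacks arriving from the network thread are parked
// here and run later on the game thread.
class DeferredReceiveQueue
{
public:
    // Called from the network thread whenever a packet has been decoded.
    void post(std::function<void()> handler)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pending.push_back(std::move(handler));
    }

    // Called once per frame from the game thread, at a point where it is
    // safe for receive handlers to touch game state. Returns how many ran.
    std::size_t drain()
    {
        std::vector<std::function<void()>> ready;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            ready.swap(m_pending);   // take everything queued so far
        }
        for (auto& handler : ready)
            handler();               // run outside the lock so handlers may post() again
        return ready.size();
    }

private:
    std::mutex m_mutex;
    std::vector<std::function<void()>> m_pending;
};
```

The lock is only held while the list is swapped, so a slow handler never stalls the network thread.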

So, we’ve got our data class:

    class PlayerPosition
    {
    public:
        float m_x, m_y;
    };

And what we want, ideally, to be able to do with it is something like this:

    PlayerPosition update_packet;
    update_packet.m_x = g_player_x;
    update_packet.m_y = g_player_y;
    update_packet.send();

…and at the other end:

    void PlayerPosition::receive()
    {
        g_player_x = this->m_x;
        g_player_y = this->m_y;
    }

That’s about the minimal amount of code we could possibly use to achieve the results we want, and since it reads pretty cleanly and looks sane, let’s aim for that. It is worth noting that one more small API requirement has crept in here, though – in the send() example, we’re sending data which is on the stack. Hence, it will be necessary for the send() function itself to copy to another buffer if it wants to do the actual processing on another thread or similar… that’s a little inefficient in some cases, but to be honest the alternative (forcing the user to ensure that the data passed exists until the send() processing is complete) is so painful to actually use that it doesn’t seem practical. (For what it’s worth, in our system we implemented a two-tier solution to this – small packets get copied to a buffer and send() returns immediately, whilst large ones stall the fiber calling send() until the data has been flushed).
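As a rough illustration of the “copy to another buffer” requirement, here is a sketch of a manager whose send() takes its own copy of the bytes, so the packet on the caller’s stack can disappear as soon as send() returns. CopyingNetworkManager, queue_position() and peek_front() are made-up names for this example; the real manager (and the two-tier small/large packet handling) is not shown in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <deque>
#include <vector>

// Hypothetical stand-in for the real network manager: send() takes a private
// copy of the bytes, so the caller's object can die as soon as send() returns.
class CopyingNetworkManager
{
public:
    void send(const void* data, std::size_t size)
    {
        const std::uint8_t* bytes = static_cast<const std::uint8_t*>(data);
        m_outgoing.emplace_back(bytes, bytes + size);   // copy into an owned buffer
    }

    // A worker thread (or a later point in the frame) would pop buffers off
    // this queue and hand them to the socket layer.
    std::deque<std::vector<std::uint8_t>> m_outgoing;
};

struct PlayerPosition { float m_x, m_y; };

// The packet lives on this function's stack and is gone once it returns.
inline void queue_position(CopyingNetworkManager& manager, float x, float y)
{
    PlayerPosition update_packet;
    update_packet.m_x = x;
    update_packet.m_y = y;
    manager.send(&update_packet, sizeof(update_packet));
}

// What the sending side would see later: a byte-for-byte copy of the packet.
inline PlayerPosition peek_front(const CopyingNetworkManager& manager)
{
    PlayerPosition copy;
    std::memcpy(&copy, manager.m_outgoing.front().data(), sizeof(copy));
    return copy;
}
```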

So, what can we do to implement this? Well, the first thing that springs out is that somewhere between the declaration and actual usage, the PlayerPosition class magically acquired a send() function. That seems like a pretty simple thing to implement, so let’s start there – we can inject the code we want if we add a macro to the class:

    class PlayerPosition
    {
        PACKET_CLASS(PlayerPosition);
    public:
        float m_x, m_y;
    };

This is a trick I’m very fond of – you can achieve an awful lot by sticking a single macro at the top of a class with the name as a parameter, and as a bonus unlike a lot of macro hackery it doesn’t actually look too bad aesthetically either. With this in place, the macro would look something like this (the trailing “private:” is there to ensure that we don’t leave the “default” scope for the class in the wrong state after being used):

    #define PACKET_CLASS(classname) \
    public: \
        void send() \
        { \
            g_NetworkManager->send(this, sizeof(classname)); \
        } \
    private:

I’m presupposing here that the network manager has some way of sending raw data in packet form – that’s out of the scope of the current discussion so we’ll just hand-wave past it. So this looks like it should work… we’ve added a send() function to our packet class which sends the contents over the network. All is well!
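If you want to try this first version of the macro in isolation, something like the following sketch should do. MockNetworkManager is a made-up stand-in that simply records what it was asked to send; the real manager’s interface is assumed, not known.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mock: the real manager would push the bytes onto the wire.
struct MockNetworkManager
{
    const void* m_last_data = nullptr;
    std::size_t m_last_size = 0;
    int m_send_count = 0;

    void send(const void* data, std::size_t size)
    {
        m_last_data = data;
        m_last_size = size;
        ++m_send_count;
    }
};

static MockNetworkManager g_mock_manager;
static MockNetworkManager* g_NetworkManager = &g_mock_manager;

// The macro injects a public send() and then restores the class default.
#define PACKET_CLASS(classname) \
    public: \
        void send() \
        { \
            g_NetworkManager->send(this, sizeof(classname)); \
        } \
    private:

class PlayerPosition
{
    PACKET_CLASS(PlayerPosition);
public:
    float m_x, m_y;
};
```

Calling send() on a PlayerPosition hands the manager the object’s address and sizeof(PlayerPosition), which is all this first version does.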

…All is well, except for the fact that the other end has no way of knowing what the data we just sent actually was. The user code wants us to call the correct receive() function for the type of data that was sent, so without that we’re pretty much snookered. What’s needed is a unique identifier that both systems can use to hook up the types correctly. The problem, though, is that we don’t have anything except the class name to work with, and whilst in theory we could send that across the network as some sort of variable-length byte stream, that’s going to be pretty inefficient all-round. So we want to assign a more compact identifier to our packets.

This is where we pull out another thing in our C++ bag of tricks – static initialisers. If you have a global static object with a constructor attached, the C++ runtime will execute it before main() gets called. Now, normally putting any significant code in these is a recipe for disaster (as other subsystems will not have had a chance to initialise yet, and the order in which constructors in different source files run is unspecified), but we can use this to construct a list of packet types.

First off, since the packets themselves are neither global nor static, we need a class that is for our constructor to hang off. Something like this should do the job:

    class PacketClassInfo
    {
    public:
        PacketClassInfo()
        {
            m_next = s_first_packet_class;
            s_first_packet_class = this;
        }

        PacketClassInfo* m_next;
        static PacketClassInfo* s_first_packet_class;
    };

As you can see, all the constructor does is to add this instance of PacketClassInfo to an internal linked-list. (This is safe even during static initialisation, because the list head is a plain pointer, which is zero-initialised before any constructors run.) We can then iterate through this at our leisure once the system has started up properly. To actually make one of these objects for each packet class, we need to expand our macro a little too:

    #define PACKET_CLASS(classname) \
    public: \
        static PacketClassInfo s_packet_class_info; \
        void send() \
        { \
            g_NetworkManager->send(this, sizeof(classname)); \
        } \
    private:

…this declares our static member, but in order to actually construct it we need a definition in a CPP file somewhere, rather than in the header. That gets a bit awkward – ideally, it would be best if we could avoid needing more than one declaration per packet class, but without using tricks like #including the header file twice (with different macro declarations), C++ doesn’t provide any mechanism for us to do that (at least, not before C++17’s inline variables). So we’ll bite the bullet and add this macro:

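    // Note: no parentheses after the name. "s_packet_class_info()" would be
    // parsed as a function declaration rather than an object definition.
    #define IMPLEMENT_PACKET_CLASS(classname) \
        PacketClassInfo classname::s_packet_class_info;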

Which is simply placed in an appropriate CPP file to add the packet class static data (we’ll be making more use of this macro for other purposes later, too).
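To see the registration in action, here is a small self-contained sketch that pulls the pieces so far together (minus send(), which isn’t relevant here) and walks the list. PlayerHealth and CountPacketClasses() are invented for the example.

```cpp
#include <cassert>

class PacketClassInfo
{
public:
    PacketClassInfo()
    {
        m_next = s_first_packet_class;   // push onto the front of the list
        s_first_packet_class = this;
    }

    PacketClassInfo* m_next;
    static PacketClassInfo* s_first_packet_class;
};

// A plain pointer is initialised before any constructors run, so it is safe
// for the static PacketClassInfo objects below to touch it.
PacketClassInfo* PacketClassInfo::s_first_packet_class = nullptr;

#define PACKET_CLASS(classname) \
    public: \
        static PacketClassInfo s_packet_class_info; \
    private:

#define IMPLEMENT_PACKET_CLASS(classname) \
    PacketClassInfo classname::s_packet_class_info;

// Two example packet types (PlayerHealth is made up for illustration).
class PlayerPosition { PACKET_CLASS(PlayerPosition); public: float m_x, m_y; };
class PlayerHealth   { PACKET_CLASS(PlayerHealth);   public: int m_health;   };

IMPLEMENT_PACKET_CLASS(PlayerPosition)
IMPLEMENT_PACKET_CLASS(PlayerHealth)

// Walk the list once start-up is complete and count the registered types.
inline int CountPacketClasses()
{
    int count = 0;
    for (PacketClassInfo* info = PacketClassInfo::s_first_packet_class; info; info = info->m_next)
        ++count;
    return count;
}
```

Because each constructor pushes onto the front, the list ends up in reverse construction order. Within a single source file that order is well defined, but across files it is not, which is worth keeping in mind when IDs get assigned from it.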

This gives us our linked-list of packet classes, so now we can think about the ID numbers themselves. Since we’ve handily added this static PacketClassInfo class, the logical place to put them would be in there, so let’s add one:

    class PacketClassInfo
    {
    public:
        PacketClassInfo()
        {
            m_next = s_first_packet_class;
            s_first_packet_class = this;
            m_packet_id = 0xFFFFFFFF;
        }

        u32 m_packet_id;
        PacketClassInfo* m_next;
        static PacketClassInfo* s_first_packet_class;
    };

As you can see, I’ve initialised the ID to 0xFFFFFFFF – this gives us a guard value we can test against in the send() code to check that we aren’t trying to send something before the packet initialisation has happened. And next up is that initialisation itself – a simple enough task given what we have now:

    void InitPackets()
    {
        u32 current_id = 0;

        PacketClassInfo* current_packet_class = PacketClassInfo::s_first_packet_class;

        while(current_packet_class)
        {
            current_packet_class->m_packet_id = current_id++;
            current_packet_class = current_packet_class->m_next;
        }
    }

As you can see, in this case we’re simply assigning indices to each packet type, starting from zero. This is fine if you know that you will always be communicating between two copies of the same executable, but if there is a chance they are different the indices will probably not match up (as they are dependent on static initialisation order, which can change even with a simple re-link). Better schemes for real use are to sort the packets into name order first, or even better to use a hash of the name itself as the ID. Implementation of those is left as an exercise for the reader, though. And so, with ID in hand, it becomes trivial to modify the macro so that send() passes this through to the communications system:

    #define PACKET_CLASS(classname) \
    public: \
        static PacketClassInfo s_packet_class_info; \
        void send() \
        { \
            g_NetworkManager->send(s_packet_class_info.m_packet_id, this, sizeof(classname)); \
        } \
    private:
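The 0xFFFFFFFF guard value mentioned earlier can be put to work inside send() itself. The sketch below is one possible way to do that. To keep it easy to test, this variant returns false instead of asserting, which is a deviation from the void send() used in the article, and INVALID_PACKET_ID and MockNetworkManager are names made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const std::uint32_t INVALID_PACKET_ID = 0xFFFFFFFF;   // name is an assumption

struct PacketClassInfo
{
    std::uint32_t m_packet_id = INVALID_PACKET_ID;     // until InitPackets() runs
};

// Hypothetical mock manager that just counts what it was asked to send.
struct MockNetworkManager
{
    int m_send_count = 0;
    void send(std::uint32_t, const void*, std::size_t) { ++m_send_count; }
};

static MockNetworkManager g_mock_manager;
static MockNetworkManager* g_NetworkManager = &g_mock_manager;

// Variant of the macro whose send() refuses to run before IDs are assigned.
#define PACKET_CLASS(classname) \
    public: \
        static PacketClassInfo s_packet_class_info; \
        bool send() \
        { \
            if (s_packet_class_info.m_packet_id == INVALID_PACKET_ID) \
                return false; \
            g_NetworkManager->send(s_packet_class_info.m_packet_id, this, sizeof(classname)); \
            return true; \
        } \
    private:

class PlayerPosition
{
    PACKET_CLASS(PlayerPosition);
public:
    float m_x, m_y;
};

PacketClassInfo PlayerPosition::s_packet_class_info;
```

In a real build an ASSERT is probably the better choice, since sending before initialisation is a programming error rather than something to recover from.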

…And with that, we’re done with the implementation of send() – (over) half of the system is complete! So, onto the receiving side. We’ve got one basic problem to solve here – how do we take a packet ID number and turn that into a call to the appropriate receive function?

One approach to this would be to search through the linked-list of packet classes to match the ID number, but efficiency-wise that isn’t particularly fantastic, so let’s make use of the fact that our ID numbers are linear indexes and build an array to map ID back into the appropriate PacketClassInfo pointer instead (you may note that if you choose to use a hashed value for the ID instead, a different approach will be needed here). All of this can be done quite conveniently inside our InitPackets() function:

    #define MAX_PACKET_TYPES 64

    PacketClassInfo* g_packet_class_info[MAX_PACKET_TYPES];

    void InitPackets()
    {
        u32 current_id = 0;

        PacketClassInfo* current_packet_class = PacketClassInfo::s_first_packet_class;

        while(current_packet_class)
        {
            ASSERT(current_id < MAX_PACKET_TYPES);
            g_packet_class_info[current_id] = current_packet_class;
            current_packet_class->m_packet_id = current_id++;
            current_packet_class = current_packet_class->m_next;
        }
    }

In this instance I’ve simply specified a hard-coded maximum number of classes – obviously a dynamic array or similar may be more appropriate in the real world (or not – there is a pretty good argument that in simple cases like this the extra thinking/coding time of “doing it right” outweighs any practical advantage from saving a few bytes and avoiding the occasional need to bump the maximum number).
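If you do go down the hashed-ID route mentioned earlier, the array lookup no longer works, since the IDs are no longer small consecutive indices. The following is a sketch of one alternative, using a 32-bit FNV-1a hash of the class name and a std::unordered_map for the reverse lookup. This is not what our system did; HashPacketName(), PacketTable(), the m_name member and the bool return from InitPackets() are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// 32-bit FNV-1a hash of a NUL-terminated string.
inline std::uint32_t HashPacketName(const char* name)
{
    std::uint32_t hash = 2166136261u;            // FNV offset basis
    for (; *name; ++name)
    {
        hash ^= static_cast<unsigned char>(*name);
        hash *= 16777619u;                       // FNV prime
    }
    return hash;
}

struct PacketClassInfo
{
    explicit PacketClassInfo(const char* name)
        : m_name(name), m_packet_id(0xFFFFFFFF), m_next(s_first_packet_class)
    {
        s_first_packet_class = this;
    }

    const char* m_name;
    std::uint32_t m_packet_id;
    PacketClassInfo* m_next;
    static PacketClassInfo* s_first_packet_class;
};

PacketClassInfo* PacketClassInfo::s_first_packet_class = nullptr;

// ID -> class info; replaces the fixed-size array when IDs are hashes.
// A function-local static avoids depending on static initialisation order.
inline std::unordered_map<std::uint32_t, PacketClassInfo*>& PacketTable()
{
    static std::unordered_map<std::uint32_t, PacketClassInfo*> table;
    return table;
}

// Call once at start-up. Returns false if two names hash to the same ID.
inline bool InitPackets()
{
    for (PacketClassInfo* info = PacketClassInfo::s_first_packet_class; info; info = info->m_next)
    {
        info->m_packet_id = HashPacketName(info->m_name);
        if (!PacketTable().emplace(info->m_packet_id, info).second)
            return false;   // collision: rename a packet or widen the hash
    }
    return true;
}

// Example registrations (names are made up for illustration).
static PacketClassInfo g_position_info("PlayerPosition");
static PacketClassInfo g_health_info("PlayerHealth");
```

The IDs now depend only on the names, so two different builds agree on them regardless of link order. The price is a hash lookup on receive and the need to detect collisions at start-up.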

So, the next problem on the list is this – how do we get from a PacketClassInfo pointer to an actual call into the receive function? The simplest answer is to add another member to PacketClassInfo – this time a function pointer:

    // Function type: takes a pointer to the raw packet data and returns nothing.
    typedef void PacketReceiveFunction(void* packet_data);

    class PacketClassInfo
    {
    public:
        PacketClassInfo(PacketReceiveFunction* receive_function)
        {
            m_receive_function = receive_function;
            m_next = s_first_packet_class;
            s_first_packet_class = this;
            m_packet_id = 0xFFFFFFFF;
        }

        u32 m_packet_id;
        PacketClassInfo* m_next;
        PacketReceiveFunction* m_receive_function;
        static PacketClassInfo* s_first_packet_class;
    };

This demonstrates one very useful trick when employing static helper classes – passing data as a parameter to the constructor allows you to put together all sorts of helpful information inside the macro and then save it off for later use. A great use of this (left as an exercise for the reader) is to use the preprocessor’s stringize operator (#) to store the name of the packet class as a char* for debugging purposes. So, with the new PacketClassInfo, our macros become:

    #define PACKET_CLASS(classname) \
    public: \
        static PacketClassInfo s_packet_class_info; \
        void send() \
        { \
            g_NetworkManager->send(s_packet_class_info.m_packet_id, this, sizeof(classname)); \
        } \
        static void ReceiveFunctionStub(void* received_data); \
    private:

    #define IMPLEMENT_PACKET_CLASS(classname) \
        PacketClassInfo classname::s_packet_class_info(classname::ReceiveFunctionStub); \
        void classname::ReceiveFunctionStub(void* received_data) \
        { \
        }

As you can see, the receive function is pointing to ReceiveFunctionStub(), which is currently empty. The reason for this stub function’s existence is that our actual receive() function is a member function, which cannot be (easily) called via the void* pointer we have to the packet data. So, instead, we call into the stub, and let it do the dirty work of casting to the right type and making the call for us. Adding in that dirty work gives us:

    #define PACKET_CLASS(classname) \
    public: \
        static PacketClassInfo s_packet_class_info; \
        void send() \
        { \
            g_NetworkManager->send(s_packet_class_info.m_packet_id, this, sizeof(classname)); \
        } \
        static void ReceiveFunctionStub(void* received_data); \
        void receive(); \
    private:

    #define IMPLEMENT_PACKET_CLASS(classname) \
        PacketClassInfo classname::s_packet_class_info(classname::ReceiveFunctionStub); \
        void classname::ReceiveFunctionStub(void* received_data) \
        { \
            ((classname *) received_data)->receive(); \
        }

ReceiveFunctionStub() acts as a trampoline to call the real receive() function, which we’ve also added a prototype for in the header (in the spirit of reducing the workload on the user as far as possible). Now all they have to do is implement that function in their code, as in our original desired API design. It is worth noting that whilst in this example we have made receive() a member function of the packet class, depending on the surrounding code design this may not be the easiest for the user – my personal preference is to make receive() a delegate on the packet, allowing the handler to live in another object/system, but that requires quite a lot of support code unless you already have a delegate implementation in your engine.
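For the curious, a delegate-style receive could look roughly like the sketch below. It leans on std::function rather than an engine delegate type, and uses a template instead of the macros purely to keep the example short; PacketDelegate, bind() and PlayerController are all invented here.

```cpp
#include <cassert>
#include <functional>
#include <utility>

struct PlayerPosition { float m_x, m_y; };

// Hypothetical per-packet-type delegate slot. In the macro-based system this
// would sit alongside s_packet_class_info and be invoked from ReceiveFunctionStub().
template <typename Packet>
struct PacketDelegate
{
    static std::function<void(const Packet&)>& handler()
    {
        static std::function<void(const Packet&)> s_handler;   // empty until bound
        return s_handler;
    }

    static void bind(std::function<void(const Packet&)> fn)
    {
        handler() = std::move(fn);
    }

    // Same signature as the stub in the article: cast, then forward.
    static void ReceiveFunctionStub(void* received_data)
    {
        if (handler())
            handler()(*static_cast<const Packet*>(received_data));
    }
};

// Some unrelated system that wants to hear about position updates.
struct PlayerController
{
    float m_x = 0.0f, m_y = 0.0f;
    void on_position(const PlayerPosition& packet) { m_x = packet.m_x; m_y = packet.m_y; }
};
```

The handler now lives in PlayerController rather than in the packet class, so the packet header no longer needs to know anything about the code that consumes it.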

At any rate, now the final piece of the puzzle is very trivial indeed – gluing together the list of packet classes and the stub function to give a generic packet receive handler the networking system can call:

    void ReceiveDispatcher(u32 packet_id, void* received_data)
    {
        ASSERT(packet_id < MAX_PACKET_TYPES);
        ASSERT(g_packet_class_info[packet_id]);
        g_packet_class_info[packet_id]->m_receive_function(received_data);
    }

After a couple of simple checks to make sure we actually have a packet matching the ID requested, this function just calls into the appropriate stub with the data and lets it get on with it.
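Here is everything assembled into a single file you can compile, as a sketch rather than production code. The “network” is a hypothetical LoopbackNetworkManager that hands the packet straight back to ReceiveDispatcher(), u32 and ASSERT are stand-ins for whatever your engine provides, and the m_name member (filled in with the # operator) is an addition for debugging.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint32_t u32;    // assumption: u32 is a 32-bit unsigned integer
#define ASSERT(x) assert(x)   // assumption: stand-in for the engine's ASSERT
#define MAX_PACKET_TYPES 64

typedef void PacketReceiveFunction(void* packet_data);

class PacketClassInfo
{
public:
    PacketClassInfo(const char* name, PacketReceiveFunction* receive_function)
        : m_name(name), m_packet_id(0xFFFFFFFF), m_next(s_first_packet_class),
          m_receive_function(receive_function)
    {
        s_first_packet_class = this;
    }

    const char* m_name;   // class name, supplied via the stringize operator
    u32 m_packet_id;
    PacketClassInfo* m_next;
    PacketReceiveFunction* m_receive_function;
    static PacketClassInfo* s_first_packet_class;
};

PacketClassInfo* PacketClassInfo::s_first_packet_class = nullptr;
PacketClassInfo* g_packet_class_info[MAX_PACKET_TYPES];

void InitPackets()
{
    u32 current_id = 0;
    for (PacketClassInfo* info = PacketClassInfo::s_first_packet_class; info; info = info->m_next)
    {
        ASSERT(current_id < MAX_PACKET_TYPES);
        g_packet_class_info[current_id] = info;
        info->m_packet_id = current_id++;
    }
}

void ReceiveDispatcher(u32 packet_id, void* received_data)
{
    ASSERT(packet_id < MAX_PACKET_TYPES);
    ASSERT(g_packet_class_info[packet_id]);
    g_packet_class_info[packet_id]->m_receive_function(received_data);
}

// Hypothetical loopback "network": delivers straight back to the dispatcher.
struct LoopbackNetworkManager
{
    void send(u32 packet_id, void* data, std::size_t) { ReceiveDispatcher(packet_id, data); }
};
static LoopbackNetworkManager g_loopback;
static LoopbackNetworkManager* g_NetworkManager = &g_loopback;

#define PACKET_CLASS(classname) \
    public: \
        static PacketClassInfo s_packet_class_info; \
        void send() \
        { \
            g_NetworkManager->send(s_packet_class_info.m_packet_id, this, sizeof(classname)); \
        } \
        static void ReceiveFunctionStub(void* received_data); \
        void receive(); \
    private:

#define IMPLEMENT_PACKET_CLASS(classname) \
    PacketClassInfo classname::s_packet_class_info(#classname, classname::ReceiveFunctionStub); \
    void classname::ReceiveFunctionStub(void* received_data) \
    { \
        ((classname *) received_data)->receive(); \
    }

class PlayerPosition
{
    PACKET_CLASS(PlayerPosition);
public:
    float m_x, m_y;
};
IMPLEMENT_PACKET_CLASS(PlayerPosition)

float g_player_x = 0.0f, g_player_y = 0.0f;

void PlayerPosition::receive()
{
    g_player_x = this->m_x;
    g_player_y = this->m_y;
}
```

After calling InitPackets(), filling in a PlayerPosition and calling send() on it should end up in PlayerPosition::receive() and update g_player_x and g_player_y, which is exactly the API we set out to build.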

So putting it all together, what does this all get us? Well, the API is about as close to the simplest model as reasonably possible, with just a couple of macro invocations added to get everything going. The packet ID system is completely hidden from the user, which eliminates any possible errors with mis-casting data pointers and suchlike, and adding a new packet doesn’t take any significant effort (and can be done entirely in the code for the system using the packet – no global enums or switch statements to maintain). Common errors such as duplicate packet names will be caught at build time (by the compiler or the linker), avoiding some potentially painful debugging situations. Finally, the code is pretty efficient – both send() and receive() have a very small (and constant) overhead over whatever the network layer is already doing.

Aside from the networking itself, hopefully reading this will have sparked some ideas about how a little bit of meta-programming can really streamline the design of many APIs and introduce a lot more user-friendliness without bringing lots of arcane incantations into the code itself.

Many thanks to Yamamoto-san, my partner-in-crime on networking systems design and bottomless wellspring of obscure C++ knowledge!