My game is an open world game. The player can freely explore the game world. It is impossible to load all the game objects when the game start, so my engine should be able to stream in the game objects when the player is playing.

In order to stream the game objects, I have to partition the whole game world into many square tiles:

I will load the world tile(s) according to the player position. I divide each world tile into 9 regions as below:

There are 3 types of region in each tile (marked as A, B & C). When the player is in region A, only the tile that the player inside is loaded. When the player is inside region B, 1 of the adjacent tile will be loaded:

And within region C, 3 of the adjacent tiles will be loaded:

So, maximum number of tiles in memory will be 4.

To maintain the maximum number of tiles in memory, unseen tiles need to be unloaded. I divide each tile into 4 region for unloading:

Say, when the player is in region 0, the nearby tiles X, Y, Z in the below figure will be unloaded:

The only thing I need to ensure is tiles X, Y, Z are completely unloaded before new tile need to be loaded according to the above rules. If this situation happens, I will block the game until they are unloaded.

Memory and Threading
Since we already know that there are only 4 maximum tiles. Beside the pool memory allocator used for physics/script in the main thread. I also have 4 linear allocator for streaming world tile and all tiles are within this limit. The memory allocation pattern is related to how the threading model works in the game. In the game there are 2 threads: main thread and streaming thread. The main thread is responsible for updating the game logic, physics and rendering, while the streaming thread is for loading resources, decompress textures, etc. When the player update the position of the ship in the main thread, it will signal the streaming thread to load the tile if needed. Several frames later, the streaming thread signal back the main thread finished loading. The communication between 2 threads are double buffered to achieve minimum locking and also ensure the linear allocator will only be used in the streaming thread which can avoid using any mutex. But things go complicated when some of the objects should be created in main thread such as graphics objects and Lua objects. For example, the streaming thread should notify back to the main thread to create an openGL texture handle after it finished decompressing a texture.

Using linear allocator for streaming can avoid memory fragmentation. Partition the game world into tiles make managing memory easier as each tile should have roughly the same amount of memory used.

To unload a world tile, those resources created on CPU side are simply freed by reseting the linear allocator. However, things are not that easy as I think originally… For example, when I want to unload the physics objects in a tile, I need to remove all of them from the collision world first before resetting the allocator. Also, for the graphics objects, I need to release all of the openGL objects in that tile before resetting the allocator, otherwise, leak will occur in the GPU side. I also need to release the script object in the tile so that those scripts can be garbage collected in Lua. Therefore, it is nearly the same as ‘delete’ all game objects in a tile one by one and cannot free all resources simply by resetting the linear allocator. Besides, creating objects in bullet physics using another custom allocator other than the currently hooked up allocator using btAlignedAllocSetCustom() is not easy too. Since bullet is not designed for allocating objects to another memory allocator(i.e. in my case, the linear allocator for streamed objects such as the rigid bodies and collision shapes). I need to modify bullet source code to make it works.

After making the streaming system, I have a more clear understanding between inter-thread communication as it should be planned carefully, and some of the objects needed to be created on specific thread. Also, I think that it is not a wise choice to divide a dedicated memory region using linear allocator for streaming objects as those objects will be deleted one by one when they need to be removed from the game world. And it costs too much work to modify bullet physics to cope with this memory model and this result in hard to maintain and update bullet physics library.