Once again I didn’t have time to write my next post in the lua-5.1.4, the latest Lua version as of today.
Disclaimer: I didn’t compile the code presented here, it’s here just to help with the explanation. “Source code is worth a thousand words. Actually, most times it is lot more than a thousand words.”
TL;DR Version
Change the lua_tolstring
Lua API function to also return the string hash, or create another function to just return the hash, and use it in your __index
and __newindex
metamethods. Thanks for reading.
The Problem
If you’re embedding Lua into your game engine you’re also writing code to expose engine functions to Lua scripts. Ok, maybe you’re using tolua++ or another binding tool to avoid coding the binding by hand, but if you’re writing the code yourself chances are you’ll end up with something like this:
class AwesomeObject { /* ... */ public: void SetThis( int value ); void SetThat( int value ); int GetThis() const; int GetThat() const; void DoThis(); void DoThat(); void LuaPush( lua_State* L ); static AwesomeObject* LuaCheck( lua_State* L, int index ); protected: static int LuaDoThis( lua_State* L ); static int LuaDoThis( lua_State* L ); static int LuaIndex( lua_State* L ); static int LuaNewIndex( lua_State* L ); }; /* ... */ int AwesomeObject::LuaIndex( lua_State* L ) { AwesomeObject* self = LuaCheck( L, 1 ); const char* key = lua_tostring( L, 2 ); if ( key ) { if ( !strcmp( key, "this" ) ) { lua_pushnumber( L, self->GetThis() ); return 1; } else if ( !strcmp( key, "that" ) ) { lua_pushnumber( L, self->GetThat() ); return 1; } else if ( !strcmp( key, "DoThis" ) ) { lua_pushcfunction( L, AwesomeObject::LuaDoThis ); return 1; } else if ( !strcmp( key, "DoThat" ) ) { lua_pushcfunction( L, AwesomeObject::LuaDoThat ); return 1; } return luaL_error( L, "Field %s not found in structure Point", key ); } return luaL_error( L, "Key is not a string" ); } int AwesomeObject::LuaNewIndex( lua_State* L ) { AwesomeObject* self = LuaCheck( L, 1 ); const char* key = lua_tostring( L, 2 ); if ( key ) { if ( !strcmp( key, "this" ) ) { self->SetThis( (int)lua_tonumber( L, 3 ) ); return 0; } else if ( !strcmp( key, "that" ) ) { self->SetThat( (int)lua_tonumber( L, 3 ) ); return 0; } return luaL_error( L, "Field %s not found in structure Point", key ); } return luaL_error( L, "Key is not a string" ); }
The obvious drawback of this approach is that we’re making a lot of string comparisons with strcmp
.
Using the Key Length to Speed Up Comparisons
Since we can get the length of a Lua string via its API for free (the length is stored along with the string) we can use it to help with the comparisons:
int AwesomeObject::LuaIndex( lua_State* L ) { AwesomeObject* self = LuaCheck( L, 1 ); size_t len; const char* key = lua_tolstring( L, 2, &len ); if ( key ) { if ( len == 4 ) { if ( !strcmp( key, "this" ) ) { lua_pushnumber( L, self->GetThis() ); return 1; } else if ( !strcmp( key, "that" ) ) { lua_pushnumber( L, self->GetThat() ); return 1; } } else if ( len == 6 ) { else if ( !strcmp( key, "DoThis" ) ) { lua_pushcfunction( L, AwesomeObject::LuaDoThis ); return 1; } else if ( !strcmp( key, "DoThat" ) ) { lua_pushcfunction( L, AwesomeObject::LuaDoThat ); return 1; } } return luaL_error( L, "Field %s not found in structure Point", key ); } return luaL_error( L, "Key is not a string" ); } int AwesomeObject::LuaNewIndex( lua_State* L ) { AwesomeObject* self = LuaCheck( L, 1 ); size_t len; const char* key = lua_tolstring( L, 2, &len ); if ( key ) { if ( len == 4 ) { if ( !strcmp( key, "this" ) ) { self->SetThis( (int)lua_tonumber( L, 3 ) ); return 0; } else if ( !strcmp( key, "that" ) ) { self->SetThat( (int)lua_tonumber( L, 3 ) ); return 0; } } return luaL_error( L, "Field %s not found in structure Point", key ); } return luaL_error( L, "Key is not a string" ); }
While this avoids unnecessary calls to strcmp
it increases the number of branches in the code. Also, once the number of fields exposed to Lua increases we end up with many keys having the same length, increasing again the calls to strcmp
. The best thing to do is to visit the key just once.
Using a Hash to Avoid string comparisons
If we compute the hash of the key we’ll avoid comparisons to strcmp
. If we don’t mind false positives then we can remove all string comparisons:
int AwesomeObject::LuaIndex( lua_State* L ) { AwesomeObject* self = LuaCheck( L, 1 ); const char* key = lua_tostring( L, 2 ); if ( key ) { unsigned int hash = Hash( key ); switch ( hash ) { case 0xTHISHASH: lua_pushnumber( L, self->GetThis() ); return 1; case 0xTHATHASH: lua_pushnumber( L, self->GetThat() ); return 1; case 0xDOTHISHASH: lua_pushcfunction( L, AwesomeObject::LuaDoThis ); return 1; case 0xDOTHATHASH: lua_pushcfunction( L, AwesomeObject::LuaDoThat ); return 1; } return luaL_error( L, "Field %s not found in structure Point", key ); } return luaL_error( L, "Key is not a string" ); } int AwesomeObject::LuaNewIndex( lua_State* L ) { AwesomeObject* self = LuaCheck( L, 1 ); const char* key = lua_tostring( L, 2 ); if ( key ) { unsigned int hash = Hash( key ); switch ( hash ) { case 0xTHISHASH: self->SetThis( (int)lua_tonumber( L, 3 ) ); return 0; case 0xTHATHASH: self->SetThat( (int)lua_tonumber( L, 3 ) ); return 0; } return luaL_error( L, "Field %s not found in structure Point", key ); } return luaL_error( L, "Key is not a string" ); }
That’s a lot better, the key is used only to compute the hash and then we use it for all comparisons. But we can do better.
Using the Lua Hash
Lua stores hashes along with strings so it’s would be nice to be able to get it and use it for comparisons. I’ve asked for this in the Lua mailing list a couple of times to no avail, so I decided to stop whining and do it myself. The changes were so few and small in size that I won’t even bother making a diff:
-
In
lapi.c
, changeLUA_API const char *lua_tolstring (lua_State *L, int idx, size_t *len)
to
LUA_API const char *lua_tolhstring (lua_State *L, int idx, size_t *len, unsigned int *hash)
-
Still in
lapi.c
, addif (hash != NULL) *hash = tsvalue(o)->hash;
immediately before the
return
statement. -
In
lua.h
, changeLUA_API const char *(lua_tolstring) (lua_State *L, int idx, size_t *len);
to
LUA_API const char *(lua_tolhstring) (lua_State *L, int idx, size_t *len, unsigned int *hash);
-
Still in
lapi.c
, change#define lua_tostring(L,i) lua_tolstring(L, (i), NULL)
to
#define lua_tostring(L,i) lua_tolhstring(L, (i), NULL, NULL)
#define lua_tolstring(L,i,l) lua_tolhstring(L, (i), l, NULL) - Recompile!
Now we can use the Lua hash in our code:
int AwesomeObject::LuaIndex( lua_State* L ) { AwesomeObject* self = LuaCheck( L, 1 ); unsigned int hash; const char* key = lua_tolhstring( L, 2, NULL, &hash ); if ( key ) { switch ( hash ) { case 0x0079406eU: lua_pushnumber( L, self->GetThis() ); return 1; case 0x00791416U: lua_pushnumber( L, self->GetThat() ); return 1; case 0xb5bdc08eU: lua_pushcfunction( L, AwesomeObject::LuaDoThis ); return 1; case 0x8549e7a3U: lua_pushcfunction( L, AwesomeObject::LuaDoThat ); return 1; } return luaL_error( L, "Field %s not found in structure Point", key ); } return luaL_error( L, "Key is not a string" ); } int AwesomeObject::LuaNewIndex( lua_State* L ) { AwesomeObject* self = LuaCheck( L, 1 ); unsigned int hash; const char* key = lua_tolhstring( L, 2, NULL, &hash ); if ( key ) { switch ( hash ) { case 0x0079406eU: self->SetThis( (int)lua_tonumber( L, 3 ) ); return 0; case 0x00791416U: self->SetThat( (int)lua_tonumber( L, 3 ) ); return 0; } return luaL_error( L, "Field %s not found in structure Point", key ); } return luaL_error( L, "Key is not a string" ); }
Great!, we’re saving the call to Hash
and our binding is faster. This alone probably isn’t worth the effort in the grand scheme of things but it’s something I always wanted to do. And small savings here and there can add up to the point where they do make a difference as a whole. And if we want to avoid one more if
in lua_tolhstring
it’s just a matter of creating an additional API function, say lua_gethash
, instead of changing lua_tolstring
.
But Wait!, From Where Did Those Hash Constants Come From?
Since we now have lua_tolhstring
available in the Lua API, it’s just a matter of writing a small program that pushes strings onto the Lua stack and queries them for their hashes:
#include <stdio.h> #include <lua.h> #include <lauxlib.h> int main( int argc, const char* argv[] ) { int i; lua_State *L = lua_open(); if ( L == NULL ) { fprintf( stderr, "Not enough memory" ); return -1; } for ( i = 1; i < argc; i++ ) { unsigned int hash; lua_pushstring( L, argv[ i ] ); lua_tolhstring( L, -1, NULL, &hash ); printf( "\"%s\" = 0x%08x\n", argv[ i ], hash ); lua_pop( L, 1 ); } return 0; }
Usage:
$ luahash this that DoThis DoThat "this" = 0x0079406e "that" = 0x00791416 "DoThis" = 0xb5bdc08e "DoThat" = 0x8549e7a3