Comments on: Ideal Hash Tries: an implementation in C++

backing up from the reserve() idea – with more theorizing, sorry :)

my original post suggested pre-allocating Log64(N)-1 levels, because the level with the most leaves is also the most likely to be sparsely populated, and we’d want to avoid the memory waste. but by definition the node count up to depth Log64(N)-1 is only a fraction of the total node count, so it wouldn’t really be worth it anyway.
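a quick back-of-envelope to make that fraction concrete (my own arithmetic, assuming an idealised, completely full 64-ary trie, which a real hash trie won’t be):

#include <cstdint>
#include <cstdio>

// in a completely full 64-ary trie whose leaves sit at depth d, count the
// nodes above the leaf level: 1 + 64 + ... + 64^(d-1) = (64^d - 1) / 63
int main() {
    for (int d = 1; d <= 5; ++d) {
        uint64_t leaves = 1, above = 0;
        for (int i = 0; i < d; ++i) { above += leaves; leaves *= 64; }
        printf("d=%d leaves=%llu above=%llu (%.2f%% of the leaf level)\n",
               d, (unsigned long long)leaves, (unsigned long long)above,
               100.0 * (double)above / (double)leaves);
    }
    return 0;
}

i.e. everything above the leaf level converges to 1/63 of the leaf count, about 1.6%, so there’s very little to win by reserving it.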

By: Alex Evans (Thu, 24 Mar 2011 15:30:36 +0000) /2011/03/22/ideal-hash-tries-an-implementation-in-c/#comment-1933

to clarify, reserve() could even allocate a single chunk of memory. it’s not easily undo-able, but the point is that you only want to reserve the minimum depth you’re prepared to keep in memory; then you can grow above that with regularly allocated nodes, and come back down to the reserved amount at will.
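a minimal sketch of that, with made-up names (illustrating the idea only, not code from the article; real nodes are variable-sized, which a proper version would have to handle):

#include <cstdlib>
#include <cstddef>

// hypothetical pool: the first k levels are carved out of one contiguous
// chunk; anything that grows past the reservation falls through to malloc
struct TriePool {
    char*  chunk  = nullptr;
    size_t size   = 0;
    size_t offset = 0;

    // a full 64-ary trie of depth k has 1 + 64 + ... + 64^(k-1) nodes
    void reserve(int k, size_t nodeBytes) {
        size_t nodes = 0, level = 1;
        for (int i = 0; i < k; ++i) { nodes += level; level *= 64; }
        size  = nodes * nodeBytes;
        chunk = (char*)std::malloc(size);
    }

    // reserved depths bump-allocate; deeper nodes use the regular heap
    void* allocNode(size_t nodeBytes) {
        if (offset + nodeBytes <= size) {
            void* p = chunk + offset;
            offset += nodeBytes;
            return p;
        }
        return std::malloc(nodeBytes);
    }

    // coming back down: overflow nodes are freed one by one; the reserved
    // chunk stays put (reset offset to reuse it, free(chunk) to drop it)
    void freeNode(void* p) {
        char* q = (char*)p;
        if (q < chunk || q >= chunk + size) std::free(p);
    }
};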

By: Omar (Thu, 24 Mar 2011 14:09:03 +0000) /2011/03/22/ideal-hash-tries-an-implementation-in-c/#comment-1927

thanks niall! You’re absolutely right, my comparison to std::set was completely unfair and sloppy. But fun :) and an opportunity for you to help out! thanks.

it definitely looks like those numbers are close enough that the trie could be made a bit more competitive… hmmm. tempting.
Also, you (‘one’) could fairly easily make the trie implementation much more STL-like/generic, by putting the internal/external flag into another u64/u32 in the header of each node (mirroring the ‘used’ field), rather than requiring it to be part of the T. Obviously it’s very slightly more memory overhead, but it means the T can be whatever you want. I haven’t tried it, but it should be a pretty simple, mechanical change.
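roughly this sort of header, for concreteness (made-up names, untested):

#include <cstdint>

// sketch: a second bitmap next to 'used' records, per occupied slot,
// whether it holds a T directly or a pointer down to a child node
// (assumes a POD-ish T, as in the article)
template <typename T>
struct Node {
    uint64_t used;   // which of the 64 logical slots are occupied
    uint64_t leaf;   // 1 = slot holds a T, 0 = slot holds a child Node*
    union Slot { T value; Node* child; };
    Slot slots[1];   // packed: popcount(used) entries really follow

    bool holdsValue(int i) const { return (leaf >> i) & 1; }  // i = symbol, 0..63
};

the T then doesn’t need a spare bit at all, at the cost of 8 extra bytes per node.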

but regardless of hypothetical improvements, many thanks (also from any future readers) for fixing my code & article with those extra results. a winner is you!

By: Niall Ryan (Wed, 23 Mar 2011 06:16:14 +0000) /2011/03/22/ideal-hash-tries-an-implementation-in-c/#comment-1891

On my PC these are the performance results:

trie
----
trie insert 1000000 485857usec
trie get 2000000 557166usec
trie get 2000000 557673usec
trie get 2000000 546991usec
trie delete 1000000 501743usec

std::unordered_set
------------------
hset insert 1000000 417596usec memory used=32777232, numAllocs=1000002
hset get 2000000 458527usec 1000000
hset get 2000000 454985usec 1000000
hset get 2000000 451403usec 1000000
hset delete 1000000 349804usec memory used=80, numAllocs=2

The unordered_set is quite a bit faster than the trie, but it’s using almost 32MB in a million separate allocations, one for each element, and that’s not including allocator overhead. Internally, this implementation’s unordered_set uses a std::list to store the elements; each list node is 16 bytes, so that accounts for 16MB, and the other 16MB goes to the hash bucket iterators. So the trie is definitely better for memory usage, but std::unordered_set is better for speed, or if you don’t have a spare bit in the element :) Would be interesting to see if the trie speed can get closer to the unordered_set with some more optimizing…
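(for anyone who wants figures like the ‘memory used / numAllocs’ ones above: a counting allocator is one way to get them. this is a rough modern-C++ sketch of the idea, not the instrumentation actually used here:)

#include <cstdlib>
#include <cstddef>
#include <cstdio>
#include <unordered_set>

// tracks live bytes and total allocations for any container that uses it
static size_t g_bytes = 0, g_allocs = 0;

template <typename T>
struct CountingAlloc {
    using value_type = T;
    CountingAlloc() = default;
    template <typename U> CountingAlloc(const CountingAlloc<U>&) {}

    T* allocate(size_t n) {
        g_bytes  += n * sizeof(T);
        g_allocs += 1;
        return (T*)std::malloc(n * sizeof(T));
    }
    void deallocate(T* p, size_t n) {
        g_bytes -= n * sizeof(T);
        std::free(p);
    }
};
template <typename T, typename U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

int main() {
    std::unordered_set<int, std::hash<int>, std::equal_to<int>, CountingAlloc<int>> s;
    for (int i = 0; i < 1000000; ++i) s.insert(i);
    printf("memory used=%zu, numAllocs=%zu\n", g_bytes, g_allocs);
}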

By: Guy Sherman (Tue, 22 Mar 2011 19:49:44 +0000) /2011/03/22/ideal-hash-tries-an-implementation-in-c/#comment-1876