Simulating a Loaded Dice in a Constant Time
A fair dice can be easily simulated with rand() function(or C++11 std::uniform_int_distribution) and integer modulo arithmetic. Then how about a loaded dice(a dice with a non-uniform probability for each face)?
No sweat. Just partition a normalized range of 0.0 to 1.0 according to each probability and bin the result of a random function to a corresponding subrange. But the search will require O(n) time complexity(here, n is the number of dice faces). Can we do better? Sure. One can do a binary search or similar, then one can get a result of rolling a loaded dice within O(logn) time complexity. How about a constant time complexity? Can we achieve that?
Suprisingly (to me, at least) it’s possible. There is an algorithm called ‘alias method‘ which guarantees O(1) generation cost once an initialization step(i.e. preprocessing) of O(n) has been done. This algorithm is from a paper called “A Linear Algorithm For Generating Random Numbers With a Given Distribution” by Michael Vose.
Original probability distribution(top) and aliased one(bottom)
Let’s suppose a dice with four faces, each of which has a probability of 1/2, 1/3, 1/12 and 1/12. The probability distribution can be visualized as the top diagram above. The gist of the alias method is to transform the diagram to something like the bottom one. The essential point is that each column in the bottom diagram has at most two items. By simulating a fair dice to select a column and then flipping a biased coin, which matches to the probability distribution of in the column, to select between the two, one can get the result. A new item added to the original one in that column is called an alias of it, so the name of the algorithm. Quite neat, isn’t it?
I learned this algorithm from this excellent(though, rather long) article, Darts, Dice, and Coins: Sampling from a Discrete Distribution. So I highly recommend reading it for more kind & thorough explanation of the matter. Especially, you can check there how the alias table can be generated in a linear time.
The author of the article also provides a Java implementation of it here:
(This article has also been posted to my personal blog.)