Discovered in 2013 by then-UPMC-student Jérémie Lumbroso, and independently re-discovered by an Apple programmer last Fall (with much ado on Twitter and HN), the basic idea is as follows:

  1. Take your (infinite) random bitstream as the binary digits ("fractional part") of random real $r\in\left[0,1\right)$
    Specifically: $r=\sum_{i=1}^{\infty}{\text{bit}_i 2^{-i}}$
  2. Read out only as many digits of $r$ as are required to determine the value of $\left\lfloor rn\right\rfloor$
  3. You now (clearly) have fairly chosen $\leftarrow_{$}\left\{0,1,\cdots,n-1\right\}$
  4. For the next number, resume reading bits from the stream where this number left off

While it's easily-shown that no single-draw algorithm can be perfectly optimal, this algorithm is noticeably more efficient than rejection sampling in the single-draw case (and Lumbroso reminds us that this it is, obviously, also perfectly compatible with arithmetic-coded batching of the draws (left as an exercise for the reader)).

from fractions import Fraction
from random import getrandbits
def educational_randbelow(n):
	# performs VERY poorly CPU-wise --
	# for illustration only!
	if n <= 1:
			# A random integer "below" 1 must be 0
			return 0
	k = (n-1).bit_length()
	if n == 2 ** k:
		# Ironically, this interpretation
		# of the algorithm
		# doesn't handle the
		# trivial case well...
		# handle it specially:
		return getrandbits(k)
	# i represents the current level of "detail" we have on r,
	# telling us how MANY bits of it we have uncovered so far:
	i = 1
	# next is the value r (from the equation):
	# (it starts out VERY coarse, just i=1 bit of detail!)
	# (*technically, it is a lower bound on r --
	#  it tells us AT LEAST how big r is known
	#  to be based on what we've seen of it.)
	r = Fraction(getrandbits(1), 2**i)
	# this next number is how big we KNOW
	# that r can NEVER QUITE reach; if
	# 100% of our "random" bits from here on out
	# were high, it would APPROACH that value:
	r_upperbound = r + Fraction(1, 2**i)
	# if int(n * r) != int(n * r_upperbound)
	# then that means that we don't yet have
	# enough detail about our random number
	# to see exactly what the true integer
	# value of n*r should be
	while int(n * r) != int(n * r_upperbound):
		# Just keep resolving more
		# bits of 0 <= r < 1
		# until a number is
		# homed-in-on unambiguously
		i += 1
		r += Fraction(getrandbits(1), 2**i)
		r_upperbound = r + Fraction(1, 2**i)
	# Once int(n * r) == int(n * r_upperbound),
	# the while-loop will break, since resolving
	# more bits of r would be pointless
	# -- time to return the random number!
	return int(n * r)

With one helping of mathematical intuition, two helpings of high-school algebra, and an invested afternoon, you can eliminate all the Fraction objects in the above example to MASSIVELY improve performance:

def randbelow(n):
	if n <= 1:
		return 0
	k = (n - 1).bit_length()
	a = getrandbits(k)
	b = 2 ** k
	if n == b:
		return a
	while (n * a // b) != (n * (a + 1) // b):
		a = (a * 2) | getrandbits(1)
		b = b * 2
	return n * a // b

Below here is an adaptation from the paper itself; it runs far faster even than the above optimization (and has integrated handling of powers-of-2), but I couldn't possibly explain to you at this time just how or why it is equivalent to the above code. Perhaps I'll understand it after more pondering, someday:

Python

def randbelow(n):
	"""Fast Dice Roller, Lumbroso, 2013"""
	a = 0
	b = 1
	while True:
		a = (a * 2) | getrandbits(1)
		b = b * 2
		if b >= n:
			if a < n:
				return a
			a = a - n
			b = b - n

C

uintmax_t randbelow(uintmax_t n, bool (*flip)())
// Fast Dice Roller
// Lumbroso, 2013
{
	uintmax_t a = 0;
	uintmax_t b = 1;
	while (1) {
		a = (a << 1) | flip();
		b = b << 1;
		if ( b >= n ) {
			if ( a < n ) return a;
			else
			{
				a = a - n;
				b = b - n;
			}
		}
	}
}

bool flip()
// example fair coin-flip function
{
	return (rand() > RAND_MAX/2);
}

JS

function randbelow(n, getrandbit=()=>(Math.random() > 1/2)) {
	// Fast Dice Roller
	// Lumbroso, 2013
	const _int = m => n.__proto__.constructor(m);
	const [_0, _1] = [_int(0), _int(1)];
	let a = _0;
	let b = _1;
	while( true ) {
		a = (a << _1) + _int(getrandbit());
		b = b << _1;
		if ( b >= n ) {
			if ( a < n ) return a;
			else
			{
				a = a - n;
				b = b - n;
			}
		}
	}
}

One thought on “Most efficient random integer generator by random bits consumed

Leave a Reply

Your email address will not be published.

Warning: This site uses Akismet to filter spam. Until or unless I can find a suitable replacement anti-spam solution, this means that (per their indemnification document) all commenters' IP addresses will be sent to Automattic, Inc., who may choose to share such with 3rd parties.
If this is unacceptable to you, I highly recommend using an anonymous proxy or public Wi-Fi connection when commenting.