{"id":1817,"date":"2022-02-23T05:51:18","date_gmt":"2022-02-23T05:51:18","guid":{"rendered":"http:\/\/www.ishygddt.xyz\/~blog\/?p=1817"},"modified":"2022-03-26T03:47:11","modified_gmt":"2022-03-26T08:47:11","slug":"most-efficient-random-integer","status":"publish","type":"post","link":"http:\/\/www.ishygddt.xyz\/~blog\/2022\/02\/most-efficient-random-integer","title":{"rendered":"Most efficient random integer generator by random bits consumed"},"content":{"rendered":"<p>Discovered <a href=\"https:\/\/arxiv.org\/abs\/1304.1916\">in 2013 by then-<abbr title=\"Universit\u00e9 Pierre-et-Marie-Curie (Pierre and Marie Curie University)\">UPMC<\/abbr>-student J\u00e9r\u00e9mie Lumbroso<\/a>, and <a href=\"https:\/\/github.com\/apple\/swift\/pull\/39143\">independently re-discovered by an Apple programmer last Fall<\/a> (with much ado on <a href=\"https:\/\/twitter.com\/stephentyrone\/status\/1433506218677518337\">Twitter<\/a> and <a href=\"https:\/\/news.ycombinator.com\/item?id=28396077\">HN<\/a>), the basic idea is as follows:<\/p>\n<ol>\n<li>Take your (infinite) random bitstream as the binary digits (\"fractional part\") of random <strong>real<\/strong> <span class=\"language-mathjax\">$r\\in\\left[0,1\\right)$<\/span><br \/>\nSpecifically: <span class=\"language-mathjax\">$r=\\sum_{i=1}^{\\infty}{\\text{bit}_i 2^{-i}}$<\/span><\/li>\n<li>Read out <em>only<\/em> as many digits of <span class=\"language-mathjax\">$r$<\/span> as are required to determine the value of <span class=\"language-mathjax\">$\\left\\lfloor rn\\right\\rfloor$<\/span><\/li>\n<li>You now (clearly) have fairly chosen <span class=\"language-mathjax\">$\\leftarrow_{$}\\left\\{0,1,\\cdots,n-1\\right\\}$<\/span><\/li>\n<li>For the next number, resume reading bits from the stream where this number left off<\/li>\n<\/ol>\n<p>While it's easily-shown that no single-draw algorithm can be perfectly optimal, this algorithm <em>is<\/em> noticeably more efficient than rejection sampling in the single-draw case (and Lumbroso reminds us that this it is, obviously, also perfectly compatible with arithmetic-coded batching of the draws (left as an exercise for the reader)).<\/p>\n<pre><code class=\"language-python\" data-line=\"\">from fractions import Fraction\nfrom random import getrandbits\ndef educational_randbelow(n):\n\t# performs VERY poorly CPU-wise --\n\t# for illustration only!\n\tif n &lt;= 1:\n\t\t\t# A random integer &quot;below&quot; 1 must be 0\n\t\t\treturn 0\n\tk = (n-1).bit_length()\n\tif n == 2 ** k:\n\t\t# Ironically, this interpretation\n\t\t# of the algorithm\n\t\t# doesn&#039;t handle the\n\t\t# trivial case well...\n\t\t# handle it specially:\n\t\treturn getrandbits(k)\n\t# i represents the current level of &quot;detail&quot; we have on r,\n\t# telling us how MANY bits of it we have uncovered so far:\n\ti = 1\n\t# next is the value r (from the equation):\n\t# (it starts out VERY coarse, just i=1 bit of detail!)\n\t# (*technically, it is a lower bound on r --\n\t#  it tells us AT LEAST how big r is known\n\t#  to be based on what we&#039;ve seen of it.)\n\tr = Fraction(getrandbits(1), 2**i)\n\t# this next number is how big we KNOW\n\t# that r can NEVER QUITE reach; if\n\t# 100% of our &quot;random&quot; bits from here on out\n\t# were high, it would APPROACH that value:\n\tr_upperbound = r + Fraction(1, 2**i)\n\t# if int(n * r) != int(n * r_upperbound)\n\t# then that means that we don&#039;t yet have\n\t# enough detail about our random number\n\t# to see exactly what the true integer\n\t# value of n*r should be\n\twhile int(n * r) != int(n * r_upperbound):\n\t\t# Just keep resolving more\n\t\t# bits of 0 &lt;= r &lt; 1\n\t\t# until a number is\n\t\t# homed-in-on unambiguously\n\t\ti += 1\n\t\tr += Fraction(getrandbits(1), 2**i)\n\t\tr_upperbound = r + Fraction(1, 2**i)\n\t# Once int(n * r) == int(n * r_upperbound),\n\t# the while-loop will break, since resolving\n\t# more bits of r would be pointless\n\t# -- time to return the random number!\n\treturn int(n * r)<\/code><\/pre>\n<p>With one helping of mathematical intuition, two helpings of high-school algebra, and an invested afternoon, you can eliminate all the Fraction objects in the above example to MASSIVELY improve performance:<\/p>\n<pre><code class=\"language-python\" data-line=\"\">def randbelow(n):\n\tif n &lt;= 1:\n\t\treturn 0\n\tk = (n - 1).bit_length()\n\ta = getrandbits(k)\n\tb = 2 ** k\n\tif n == b:\n\t\treturn a\n\twhile (n * a \/\/ b) != (n * (a + 1) \/\/ b):\n\t\ta = (a * 2) | getrandbits(1)\n\t\tb = b * 2\n\treturn n * a \/\/ b<\/code><\/pre>\n<hr \/>\n<p>Below here is an adaptation <strong>from the paper itself<\/strong>; it runs far faster even than the above optimization (<em>and<\/em> has integrated handling of powers-of-2), but I couldn't possibly explain to you at this time just how or why <em>it is equivalent to<\/em> the above code. Perhaps I'll understand it after more pondering, someday:<\/p>\n<p>Python<\/p>\n<pre><code class=\"language-python\" data-line=\"\">def randbelow(n):\n\t&quot;&quot;&quot;Fast Dice Roller, Lumbroso, 2013&quot;&quot;&quot;\n\ta = 0\n\tb = 1\n\twhile True:\n\t\ta = (a * 2) | getrandbits(1)\n\t\tb = b * 2\n\t\tif b &gt;= n:\n\t\t\tif a &lt; n:\n\t\t\t\treturn a\n\t\t\ta = a - n\n\t\t\tb = b - n<\/code><\/pre>\n<p>C<\/p>\n<pre><code class=\"language-c\" data-line=\"\">uintmax_t randbelow(uintmax_t n, bool (*flip)())\n\/\/ Fast Dice Roller\n\/\/ Lumbroso, 2013\n{\n\tuintmax_t a = 0;\n\tuintmax_t b = 1;\n\twhile (1) {\n\t\ta = (a &lt;&lt; 1) | flip();\n\t\tb = b &lt;&lt; 1;\n\t\tif ( b &gt;= n ) {\n\t\t\tif ( a &lt; n ) return a;\n\t\t\telse\n\t\t\t{\n\t\t\t\ta = a - n;\n\t\t\t\tb = b - n;\n\t\t\t}\n\t\t}\n\t}\n}\n\nbool flip()\n\/\/ example fair coin-flip function\n{\n\treturn (rand() &gt; RAND_MAX\/2);\n}<\/code><\/pre>\n<p>JS<\/p>\n<pre><code class=\"language-javascript\" data-line=\"\">function randbelow(n, getrandbit=()=&gt;(Math.random() &gt; 1\/2)) {\n\t\/\/ Fast Dice Roller\n\t\/\/ Lumbroso, 2013\n\tconst _int = m =&gt; n.__proto__.constructor(m);\n\tconst [_0, _1] = [_int(0), _int(1)];\n\tlet a = _0;\n\tlet b = _1;\n\twhile( true ) {\n\t\ta = (a &lt;&lt; _1) + _int(getrandbit());\n\t\tb = b &lt;&lt; _1;\n\t\tif ( b &gt;= n ) {\n\t\t\tif ( a &lt; n ) return a;\n\t\t\telse\n\t\t\t{\n\t\t\t\ta = a - n;\n\t\t\t\tb = b - n;\n\t\t\t}\n\t\t}\n\t}\n}<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Discovered in 2013 by then-<abbr title=\"Universit\u00e9 Pierre-et-Marie-Curie (Pierre and Marie Curie University)\">UPMC<\/abbr>-student J\u00e9r\u00e9mie Lumbroso, and independently re-discovered by an Apple programmer last Fall (with much ado on Twitter and Hacker News), the basic idea is as follows: Take your (infinite) random bitstream as the binary digits of random <span class=\"language-mathjax\">$r\\in\\left[0,1\\right)$\u2026<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[101],"tags":[92,91,44,90],"class_list":["post-1817","post","type-post","status-publish","format-standard","hentry","category-writeups","tag-algorithms","tag-math","tag-python","tag-rng"],"_links":{"self":[{"href":"http:\/\/www.ishygddt.xyz\/~blog\/wp-json\/wp\/v2\/posts\/1817","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.ishygddt.xyz\/~blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.ishygddt.xyz\/~blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.ishygddt.xyz\/~blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/www.ishygddt.xyz\/~blog\/wp-json\/wp\/v2\/comments?post=1817"}],"version-history":[{"count":87,"href":"http:\/\/www.ishygddt.xyz\/~blog\/wp-json\/wp\/v2\/posts\/1817\/revisions"}],"predecessor-version":[{"id":2232,"href":"http:\/\/www.ishygddt.xyz\/~blog\/wp-json\/wp\/v2\/posts\/1817\/revisions\/2232"}],"wp:attachment":[{"href":"http:\/\/www.ishygddt.xyz\/~blog\/wp-json\/wp\/v2\/media?parent=1817"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.ishygddt.xyz\/~blog\/wp-json\/wp\/v2\/categories?post=1817"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.ishygddt.xyz\/~blog\/wp-json\/wp\/v2\/tags?post=1817"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}