<p><em>Something Something Programming: Mostly thoughts about programming. Maybe other stuff too.</em></p>
<h1>Heaven Must Be Missing a Syllable (2022-05-18)</h1>
<p>In English-language hymns, the word “heaven”, which clearly has two syllables, is commonly treated as if it had just one syllable. It’s an old tradition, but not a proud one. It’s embarrassing, really. It’s a stupid practice, and it’s terrible to sing, and I hate it.</p>
<p>To illustrate, here’s a hymn selected at random:</p>
<p><img src="/assets/2022-05-18-heaven-syllable/lead-us-heavenly-father.png" alt="img" /></p>
<p>The first two bars have four quarter notes apiece, with the first bar going up and the second bar going down. The third verse has a standard strong-weak scansion: “SPIR-it OF our GOD de-SCEND-ing”. All well and good.</p>
<p>But look at the third “syllable” of the first verse. There it is. “Heaven” stuck to one note. “LEAD us HEAV(EN?)-ly FA-ther LEAD us”. What are you supposed to do with that? How do you sing two syllables as one? If you’re lucky the tune will be played fast, and then you just kinda blur the syllables together, so that “heaven” sounds like “hen”. If you aren’t so lucky, it will be played slow, and you will be stuck trying to deal with that extra syllable. If there’s a whole congregation singing, they won’t be able to coordinate it properly and it will sound like mush.</p>
<p>In many Protestant churches, this trainwreck is a regularly scheduled event known as the “Doxology”:</p>
<p><img src="/assets/2022-05-18-heaven-syllable/doxology.jpg" alt="img" /></p>
<p>This gets sung every week, right after the money plate is passed around, and it goes slow. As a kid I always dreaded singing it, because the crunched syllable in “heavenly host” was just so cringe-inducing. (It was also a hymn sung without a hymnal, and I think the congregation wasn’t really clear about the word “host”. That certainly didn’t help.)</p>
<p>The most famous case of pretending that “heaven” doesn’t have two syllables is in the Christmas carol “Joy to the World”:</p>
<p><img src="/assets/2022-05-18-heaven-syllable/joy-to-the-world.jpg" alt="img" /></p>
<p>“And heav’n and nature sing”. Here the ruse succeeds by cheating; a French-style liaison results in the “n” of “heaven” being pronounced as if it were the first letter of “and”, as in “heav nan nature”. Generally, heaven-with-one-syllable is workable as long as the subsequent word begins with a vowel: “heav’n above”, “heav’n eternal”, etc.</p>
<p>But wait a minute, don’t we crunch syllables in English all the time? For example, what about the phrase “every chocolate camera”? It has six syllables and not nine because each of those words gets crunched. Why not crunch “heaven” too?</p>
<p>It’s true that English speakers have a tendency to drop syllables, but only in certain circumstances. Specifically, there needs to be a weak syllable with an indeterminate vowel surrounded by strong syllables. But “heaven” is a word with just two syllables, and they’re both required. You can’t drop either of them.</p>
<p>Sometimes an apostrophe is added: “heav’n”. I interpret this as an admission of shame on the part of the composer. It says: I know there’s just one syllable of space here, but I really needed to mention heaven; please don’t think too hard about it. But this postage-stamp-sized fig leaf is inadequate to cover things up.</p>
<h1>A Solution to the Halting Problem (2022-05-05)</h1>
<p>Please excuse my <strong>clickbait</strong> title. It was intended to trigger the emotions of the kind of people who get worked up about computability theory. Of course I don’t have a solution to the halting problem – <strong>there is no such thing</strong>. Instead, I want to talk about a <strong>partial solution</strong> to the halting problem, a method to solve it for <strong>certain instances</strong>. And to make up for the misleading title, I’ll also discuss how to extend the method to <strong>other similar problems</strong>.</p>
<p>The halting problem is this: <strong>given a program, does it halt or not?</strong> Sometimes the problem is stated in terms of a program run on an input, or a program run on something like its own source code. For our purposes, we’ll consider a program to be <strong>self-contained</strong>, like a compiled binary that accepts no arguments.</p>
<p>A solution to the halting problem would be a <strong><a href="https://nickdrozd.github.io/2022/04/01/total-partial-functions.html">total function</a></strong> with the type signature <code class="language-plaintext highlighter-rouge">Program -> Bool</code>. <strong>Turing’s Theorem</strong> says that such a function cannot exist, and any function with that signature must be either <strong>partial</strong> or <strong>incorrect</strong>. (If this sounds similar to Gödel’s Incompleteness Theorem, it’s because <a href="https://scottaaronson.blog/?p=710">G’s theorem follows directly from T’s</a>.) Incorrect functions are bad, so we’ll settle for partial. Or rather, the function we’ll implement will declare a program to be <strong>non-halting</strong> only when it is <strong>absolutely sure</strong>; in the iffy cases, it will decline to answer.</p>
<p>The term “program” is a bit vague. What kind of programs are we talking about here? Python? C? No, as usual we’ll be talking about <strong><a href="https://nickdrozd.github.io/2020/10/04/turing-machine-notation-and-normal-form.html">Turing machines</a></strong>. And there is even a specific application for this: the <strong><a href="https://nickdrozd.github.io/2020/10/15/busy-beaver-well-defined.html">Busy Beaver game</a></strong>. The primary use here is in <a href="https://nickdrozd.github.io/2022/01/14/bradys-algorithm.html">paring down the search space</a>. The second use is in <a href="https://nickdrozd.github.io/2020/12/15/lin-rado-proof.html">proving the true values</a> of the Busy Beaver function.</p>
<p>Following <a href="http://turbotm.de/~heiner/BB/mabu90.html">Marxen and Buntrock</a>, we can say that there are two methods for determining nonhaltingness: <strong>forward reasoning</strong> and <strong>backward reasoning</strong>. Reasoning forward means determining that the program will repeat some kind of behavior forever, and that therefore it will not halt. Detecting <strong><a href="https://nickdrozd.github.io/2021/02/24/lin-recurrence-and-lins-algorithm.html">Lin recurrence</a></strong> is an example of forward reasoning. Reasoning backward means starting from the end of the program and showing that there is no possible path to the halt instruction. An <strong>algorithm</strong> for doing that is what we’ll be concerned with here.</p>
<p>Now, there is one quick and easy way to tell that a program will never halt. <strong>If there are no halt instructions, it definitely won’t halt.</strong> <em>Halt-free</em> programs can be dismissed out of hand, and the absence of a halt instruction can be checked using simple string functions. This kind of check belongs to <strong>static analysis</strong>.</p>
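<p>A minimal sketch of such a check, assuming a string notation where each instruction is a triple like <code class="language-plaintext highlighter-rouge">1RB</code> (print 1, move right, go to state B) and the halt state is denoted <code class="language-plaintext highlighter-rouge">H</code> (these notation details are illustrative assumptions, not a fixed standard):</p>

```python
# Static halt check: a program with no halt instruction cannot halt.
# The string notation here -- instruction triples like '1RB', with a
# designated halt state 'H' -- is an illustrative assumption.

def might_halt(program: str) -> bool:
    """False is definitive: with no halt instruction, no halting."""
    return 'H' in program

print(might_halt("1RB 1LB  1LA 0RA"))  # False: halt-free, dismiss it
print(might_halt("1RB 1LB  1LA 1RH"))  # True: needs further analysis
```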
<p>The static instruction check can be applied to behaviors other than halting. Consider, for example, <strong><a href="https://nickdrozd.github.io/2021/07/11/self-cleaning-turing-machine.html">self-cleaning</a></strong> Turing machines that erase all the marks on the tape. In order to wipe the tape clean, a program must have at least one <strong>erase instruction</strong>, an instruction that prints a blank upon scanning a mark. If a program does not have any erase instructions – and again, this is a quick and easy check – then it definitely cannot be self-cleaning, and therefore can be ignored as a candidate for the <strong><a href="https://nickdrozd.github.io/2021/02/14/blanking-beavers.html">Blanking Beaver</a></strong> problem.</p>
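<p>Under the same assumed notation (states separated by double spaces, with the <em>i</em>-th instruction of a state being the one executed upon scanning color <em>i</em>), the erase-instruction check is nearly as cheap:</p>

```python
# Static erase check: an erase instruction prints a blank (0) upon
# scanning a mark (any nonzero color). The notation is the same
# illustrative assumption as before: states separated by two spaces,
# with a state's i-th instruction executed upon scanning color i.

def might_blank(program: str) -> bool:
    """False is definitive: no erase instruction, no blank tape."""
    return any(
        instr[0] == '0' and color != 0
        for state in program.split('  ')
        for color, instr in enumerate(state.split())
    )

print(might_blank("1RB 0LB  1LA 0RA"))  # True: '0LB' erases a scanned 1
print(might_blank("1RB 1LB  1LA 1RA"))  # False: nothing ever erased
```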
<p>For another example, consider the behavior known as <strong><em><a href="https://groups.google.com/g/busy-beaver-discuss/c/Dq8PYAkoMXU">spinning out</a></em></strong>. Here, the machine is scanning a blank cell directly to the left (or right) of all the marks on the tape, and the next instruction keeps control in the current state with a move to the left (or right). A machine in this circumstance will get “stuck” doing the same move forever. This is the simplest possible form of <strong><a href="https://nickdrozd.github.io/2021/01/14/halt-quasihalt-recur.html">quasihalting</a></strong>, and all known <strong><a href="https://nickdrozd.github.io/2022/02/11/latest-beeping-busy-beaver-results.html">Beeping Busy Beaver</a></strong> champions are spinners. There is an easy static check for this behavior: a program must have at least one <strong><em>zero-reflexive</em></strong> instruction, an instruction that keeps the state the same upon scanning a blank.</p>
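<p>The zero-reflexive check fits the same mold: look at each state’s on-blank instruction and see whether any of them stays in its own state. (As before, the notation, and the assumption that states are named A, B, C, … in order, are illustrative.)</p>

```python
# Static spin-out check: a zero-reflexive instruction keeps control in
# the current state upon scanning a blank. Same illustrative string
# notation; state names assumed to be 'A', 'B', 'C', ... in order.

def might_spin_out(program: str) -> bool:
    """False is definitive: no zero-reflexive instruction, no spin-out."""
    return any(
        state.split()[0][2] == name  # the on-blank instruction's target
        for name, state in zip('ABCDEFGH', program.split('  '))
    )

print(might_spin_out("1RA 1LB  1LA 0RA"))  # True: A stays in A on blank
print(might_spin_out("1RB 1LB  1LA 0RA"))  # False
```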
<p>The halting problem asks whether a given program will halt, and along those same lines the <strong>blank tape problem</strong> and the <strong>spin-out problem</strong> ask whether a given program will blank the tape or spin out. An <strong>oracular solution</strong> for any one of these problems could be used to solve the other two, and so all three problems are <strong>co-computable</strong>.</p>
<p><strong>But in practice, the halting problem is easier.</strong> We’re assuming that programs are self-contained and run without input arguments of any kind, and therefore only one of a program’s halt instructions can actually be executed. A common way to go about Busy Beaver searching is to discard out of hand any program with more than one halt instruction, since such a program has <strong>wasted some of its precious few instructions</strong>. But a program can have multiple erase or zero-reflexive instructions and they can all be used more than once.</p>
<p>The static analysis approach to the halting (blanking, spin-out) problem is <strong>easy to implement</strong> and <strong>fast to run</strong>. It makes for a great first pass. However, it’s <strong>shallow</strong> and it leaves a lot on the table. There are programs that have the required instructions but still cannot halt (or wipe the tape, or spin out).</p>
<p>To show that these cannot reach their goals, we will <strong>run the programs backwards</strong>. Yes, we’ll actually run them, and therefore this is a form of <strong>dynamic analysis</strong>. There are some downsides to the dynamic approach: it is slow, difficult to implement, and occasionally memory-intensive. The tradeoff is that it is <strong>more thorough</strong>, and it is capable of returning correct, definitive answers for a wider range of programs. (But it still isn’t a general solution to the halting problem, because again, <strong>there is no such thing</strong>.)</p>
<p>The basic idea is that we will start from where we want to end up, and then work backwards from there. If we can <strong>reconstruct a path to that endpoint</strong> within a certain number of steps, then we will say that the program might reach that endpoint; on the other hand, if we can show that there is no possible path, then we can definitively say that the program will not reach that endpoint. In the latter case, we will have <strong>solved that instance of the halting problem</strong>.</p>
<p>This post has gone on for a while, and <strong>describing algorithms is hard</strong>, so I’m going to stop here and just post some code. This is <strong><a href="https://github.com/nickdrozd/busy-beaver-stuff/blob/main/generate/program.py">real working code</a></strong> that has been <a href="https://groups.google.com/g/busy-beaver-discuss/c/KofE0K7_AbQ">used for real</a>. I’m posting it as-is – no touch-ups, no tidying, none of that. <em>If anyone has actually read this far but finds the code unclear and wants further explanation of the backward-reasoning algorithm, please let me know and I’ll write up another post.</em></p>
<p>And with that, here’s a (partial) solution to the halting (and blanking and spin-out) problem:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python">class Program:
    # ... other stuff ...

    @property
    def cant_halt(self) -> bool:
        return self._cant_reach(
            'halted',
            self.halt_slots,
        )

    @property
    def cant_blank(self) -> bool:
        return self._cant_reach(
            'blanks',
            self.erase_slots,
            blank = True,
        )

    @property
    def cant_spin_out(self) -> bool:
        return self._cant_reach(
            'spnout',
            tuple(
                state + str(0) for state in
                self.graph.zero_reflexive_states),
        )

    def _cant_reach(
            self,
            final_prop: str,
            slots: Tuple[str, ...],
            max_attempts: int = 24,
            blank: bool = False,
    ):
        configs: List[
            Tuple[int, str, BlockTape, int, History]
        ] = [  # type: ignore
            (
                1,
                state,  # type: ignore
                BlockTape([], color, []),  # type: ignore
                0,
                History(),
            )
            for state, color in sorted(slots)
        ]

        comp = tcompile(str(self))

        max_repeats = max_attempts // 2

        seen: Dict[str, Set[str]] = defaultdict(set)

        while configs:  # pylint: disable = while-used
            step, state, tape, repeat, history = configs.pop()

            if step > max_attempts:
                return False

            if state == 'A' and tape.blank:
                return False

            if (tape_hash := str(tape)) in seen[state]:
                continue

            seen[state].add(tape_hash)

            history.add_state_at_step(step, state)  # type: ignore
            history.add_tape_at_step(step, tape)

            if history.check_for_recurrence(
                    step,
                    (state, tape.scan)) is None:  # type: ignore
                repeat = 0
            else:
                repeat += 1

            if repeat > max_repeats:
                continue

            history.add_action_at_step(
                step,
                (state, tape.scan))  # type: ignore

            # print(step, state, tape)

            for entry in sorted(self.graph.entry_points[state]):
                for _, (_, shift, trans) in self[entry].items():
                    if trans != state:
                        continue

                    for color in sorted(map(int, self.colors)):
                        next_tape = tape.copy()

                        _ = next_tape.step(
                            not (0 if shift == 'L' else 1),
                            next_tape.scan,
                        )

                        next_tape.scan = color

                        run = Machine(comp).run(
                            step_lim = step + 1,
                            tape = next_tape.copy(),
                            state = ord(entry) - 65,
                            check_blanks = blank,
                        )

                        result = getattr(run.final, final_prop)

                        if result is None:
                            if run.final.undfnd is None:
                                continue

                            result = step + 1

                        if abs(result - step) > 1:
                            continue

                        configs.append((
                            step + 1,
                            entry,
                            next_tape,
                            repeat,
                            history.copy(),
                        ))
        <span class="k">return</span> <span class="bp">True</span></code></pre></figure>
<h1>Does This Function Terminate? (2022-04-13)</h1>
<p>The function <strong><em>G</em></strong> is defined as follows:</p>
<figure class="highlight"><pre><code class="language-nil" data-lang="nil">G : ℕ × ℕ → ℕ
G(n, 0) = n
G(n, 1) = G(1, n + 3)
G(4k + 0, m) = G(7k + 7, m - 2)
G(4k + 1, m) = G(7k + 8, m - 1)
G(4k + 2, m) = G(7k + 8, m - 1)
G(4k + 3, m) = G(7k + 14, m - 2)</code></pre></figure>
<p>🛃 <strong>Question</strong> 🛃 Does <em>G(1, 1)</em> terminate?</p>
<p>🛄 <strong>Answer</strong> 🛄 Yes. <em>G(1, 1) = 2533…2210 > 18<sup>7003</sup></em>.</p>
<p>That was just a warmup. <em>G</em> is not the function referred to in the title of this post. What we really want to know about is <strong><em>H</em></strong>:</p>
<figure class="highlight"><pre><code class="language-nil" data-lang="nil">H : ℕ × ℕ → ℕ
H(n, 0) = n
H(n, 1) = H(1, n + 3)
H(4k + 0, m) = H(7k + 14, m - 2)
H(4k + 1, m) = H(7k + 8, m - 1)
H(4k + 2, m) = H(7k + 15, m - 1)
H(4k + 3, m) = H(7k + 14, m - 2)</code></pre></figure>
<p>🛃 <strong>Question</strong> 🛃 Does <em>H(1, 1)</em> terminate?</p>
<p>🛄 <strong>Answer</strong> 🛄 <strong>… ¿ 🛐 ? …</strong></p>
<p><em>G(1, 1)</em> returns its answer after <strong>28,833 iterations</strong>. <em>H(1, 1)</em> does not return an answer after any <strong>reasonable number</strong> of iterations. There are two possibilities:</p>
<ol>
<li><em>H(1, 1)</em> <strong>never</strong> returns an answer.</li>
<li><em>H(1, 1)</em> returns an answer after an <strong>unreasonable number</strong> of iterations.</li>
</ol>
<p><strong>Practically speaking</strong>, there is no difference – we get no answer either way.</p>
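<p>The same transcription works for <em>H</em>, with one addition: an iteration cap, since nobody knows whether the recursion ever bottoms out (the cap value here is arbitrary):</p>

```python
from typing import Optional

# Iterative transcription of H with an iteration cap. Returns None if
# the m == 0 base case is not reached within max_iters rule
# applications -- which, as far as anyone knows, it never is for H(1, 1).

def H(n: int, m: int, max_iters: int = 100_000) -> Optional[int]:
    for _ in range(max_iters):
        if m == 0:
            return n
        if m == 1:
            n, m = 1, n + 3
        else:
            k, r = divmod(n, 4)
            if r == 0:
                n, m = 7 * k + 14, m - 2
            elif r == 1:
                n, m = 7 * k + 8, m - 1
            elif r == 2:
                n, m = 7 * k + 15, m - 1
            else:
                n, m = 7 * k + 14, m - 2
    return None

print(H(1, 1))  # None: no answer within 100,000 iterations
```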
<p>🛃 <strong>Question</strong> 🛃 Which possibility is more likely?</p>
<p><em>G</em> and <em>H</em> are examples of <strong>iterated Collatz-like functions</strong>. The first argument accumulates the answer and the second argument counts how many times to apply the transformation. The countdown argument can get reset, giving these functions an <strong>Ackermann</strong> flavor.</p>
<p>Notice that we are <em>not</em> asking about whether <em>H</em> is <strong><a href="https://nickdrozd.github.io/2022/04/01/total-partial-functions.html">total</a></strong>, and neither are we claiming that <em>G</em> is. Those would be <strong><em>very difficult claims to prove</em></strong>. No, we are asking the <em>severely unambitious</em> question of whether these functions terminate on <strong><em>just this one particular argument</em></strong>. <em>G(1, 1)</em> terminates, but we can’t answer for <em>H(1, 1)</em>.</p>
<p>One way to look at this is to say: <em>G</em> and <em>H</em> have quite similar definitions – they differ at just two parameters. <strong><em>G(1, 1)</em> terminates, so why shouldn’t <em>H(1, 1)</em>?</strong> Maybe most functions similar to <em>G</em> terminate, and so we should assume that <em>H</em> does too.</p>
<p>On the other hand, <strong>why exactly <em>does</em> <em>G(1, 1)</em> terminate?</strong> There’s no obvious reason why it should work. In fact, it seems <strong>miraculous</strong> that it terminates at all. Maybe most of these functions <em>don’t</em> terminate, and <em>G</em> is a remarkable exception.</p>
<p>🛃 <strong>Question</strong> 🛃 Why are we talking about these weird functions?</p>
<p>🛄 <strong>Answer</strong> 🛄 In connection with the <strong><a href="https://nickdrozd.github.io/2022/02/11/latest-beeping-busy-beaver-results.html">Beeping Busy Beaver</a></strong> problem, <strong>Turing machine programs</strong> have been discovered that implement these functions. The program that implements <em>G</em> is the current 5-state 2-color BBB champion, and the program that implements <em>H</em> is a contender. Both programs are “children” of the so-called <strong><a href="https://www.sligocki.com/2022/04/03/mother-of-giants.html">Mother of Giants</a></strong> and were discovered by <strong><a href="https://github.com/sligocki/busy-beaver/">Shawn Ligocki</a></strong>.</p>
<p><strong>Proving the true value of <em>BBB(5)</em> is at least as hard as determining the outcome of <em>H(1, 1)</em>.</strong></p>
<h1>Performance Hot Spots (2022-04-12)</h1>
<p><strong>My code was running too slow.</strong> I tried changing some things. I short-circuited some loops. I manually garbage-collected some objects. I even changed some lists to tuples, because an old piece of Python folklore passed down through the ages says that “tuples are faster than lists”. <strong>But nothing worked.</strong></p>
<p>It didn’t work because none of the changes were made at the <strong><em>hot spots</em></strong>. Hot spots are places in the code where an inordinate amount of execution time is spent. Code doesn’t always have hot spots, but they aren’t rare.</p>
<p>There are several <strong>methods for detecting hot spots</strong> in common use.</p>
<ol>
<li>Experience</li>
<li>Instinct</li>
<li>Guesswork</li>
<li>Clairvoyance</li>
<li>Hunches</li>
<li>Superstition</li>
<li>Gut</li>
<li>Trial and error</li>
<li>Measurement</li>
</ol>
<p><strong>Measurement is the only reliable method for finding hot spots.</strong> The other methods range from mostly ineffective to stupid and counterproductive. Working on performance without measuring is like cutting your own hair without a mirror. Believe me, I’ve done both.</p>
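<p>In Python, a first measurement doesn’t require anything fancy; the standard library’s <code class="language-plaintext highlighter-rouge">cProfile</code> will do. (The callgraph images in this post came from a separate visualization tool; this sketch, with a made-up deepcopy-bound function, just shows the raw technique.)</p>

```python
import cProfile
import copy
import io
import pstats

def copy_heavy(n: int) -> list:
    # a deliberately deepcopy-bound function standing in for real code
    data = [[i] * 50 for i in range(n)]
    return [copy.deepcopy(data) for _ in range(20)]

# collect stats around the code under investigation
profiler = cProfile.Profile()
profiler.enable()
copy_heavy(100)
profiler.disable()

# sort by cumulative time so hot spots float to the top of the report
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats('cumulative').print_stats()
report = stream.getvalue()
print('deepcopy' in report)  # True: deepcopy shows up as a hot spot
```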
<p>I measured my code with a <strong>profiler</strong>, and here’s what I saw:</p>
<p><img src="/assets/2022-04-12-performance-hot-spots/before.png" alt="img" /></p>
<p>Can you guess where the problem is? That’s right, it’s the <strong>red box</strong>. According to the <strong>callgraph</strong>, a whopping <em>76% of execution time</em> is getting spent in that box and its sub-boxes.</p>
<p>This isn’t an uncommon situation, and it isn’t always indicative of a problem. It could be that the red box is executing some <strong>critical computation</strong>, the hard core of what the app actually “does”, and the rest of the code is just <strong>dispatching fluff</strong>. In that case, it would seem that most of the work getting done is work that has to get done, and that’s good.</p>
<p>But that isn’t what’s going on in my code. Looking closer at that red box, I can see that it’s a <strong>Python library function</strong>, namely <code class="language-plaintext highlighter-rouge">copy.deepcopy</code>. <strong>76% of execution time is spent deepcopying! Whoops!</strong> So much time is spent deepcopying that the profiler looked into that function’s innards; the green box is something called <code class="language-plaintext highlighter-rouge">_deepcopy_list</code>.</p>
<p>This is a <strong>performance bottleneck</strong>, and it’s a problem. It’s also an <strong>opportunity</strong>. It means that I can go in and change just a few lines of code and get a big performance improvement. And indeed that was accomplished in <a href="https://github.com/nickdrozd/busy-beaver-stuff/commit/9aed37844f5067bd4c91fbe3f9ae1ec853e3f60c">a single commit</a>. It was just a matter of doing the copying a little smarter. No need to think about system design or algorithms or any other difficult stuff, which is great.</p>
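<p>The actual fix was specific to my code, but the general shape of the change looks something like the following sketch. (This is a hypothetical illustration of the pattern, not the code from the commit.)</p>

```python
import copy

# Hypothetical state: a dict whose values are lists of immutable tuples.
state = {'tape': [(1, 'A'), (0, 'B')], 'marks': [0, 1]}

# Before: a blanket deepcopy recurses through every element.
slow_copy = copy.deepcopy(state)

# After: the tuples are immutable and safe to share, so copying
# just the lists themselves is enough.
fast_copy = {key: list(value) for key, value in state.items()}

assert fast_copy == slow_copy
fast_copy['tape'].append((1, 'C'))
assert state['tape'] == [(1, 'A'), (0, 'B')]  # the original is untouched
```

<p>Copies like this skip <code class="language-plaintext highlighter-rouge">deepcopy</code>’s recursive machinery while still keeping the original safe from mutation.</p>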
<p>Here’s what the profiler showed afterwards:</p>
<p><img src="/assets/2022-04-12-performance-hot-spots/after.png" alt="img" /></p>
<p>That is a <strong>healthier-looking callgraph</strong>. More green and yellow boxes indicate that the load is being spread around more. There’s still a red box, but instead of some crummy builtin, it’s a function called <code class="language-plaintext highlighter-rouge">Machine.run</code>. This is part of the core functionality, so the red box is not so concerning.</p>
<h1 id="how-to-create-a-profiler-callgraph-in-python">How to Create a Profiler Callgraph in Python</h1>
<p>The general principles of profiling apply to any language, but the specific instructions vary. I’ll tell you how I do it in <strong>Python</strong>. I do it the same way every time and I don’t have a great insight into how the tools work. These are the <strong>incantations</strong> that have been passed down to me.</p>
<p>The profiling library is called <code class="language-plaintext highlighter-rouge">yappi</code>. It works by wrapping the <code class="language-plaintext highlighter-rouge">main</code> function, or the toplevel entrypoint or whatever:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="nn">yappi</span>
<span class="n">yappi</span><span class="p">.</span><span class="n">set_clock_type</span><span class="p">(</span><span class="s">'cpu'</span><span class="p">)</span>
<span class="n">yappi</span><span class="p">.</span><span class="n">start</span><span class="p">()</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">main_or_toplevel_entrypoint_or_whatever</span><span class="p">()</span>
<span class="k">finally</span><span class="p">:</span>
<span class="n">stats</span> <span class="o">=</span> <span class="n">yappi</span><span class="p">.</span><span class="n">get_func_stats</span><span class="p">()</span>
<span class="n">stats</span><span class="p">.</span><span class="n">save</span><span class="p">(</span><span class="s">'yappi.callgrind'</span><span class="p">,</span> <span class="nb">type</span> <span class="o">=</span> <span class="s">'callgrind'</span><span class="p">)</span></code></pre></figure>
<p>Run that, then convert the profiling data into <strong>Graphviz</strong> format with <code class="language-plaintext highlighter-rouge">gprof2dot</code>:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell">gprof2dot yappi.callgrind <span class="nt">-f</span> callgrind <span class="nt">--colour-nodes-by-selftime</span> | dot <span class="nt">-T</span> png <span class="nt">-o</span> yappi.png</code></pre></figure>Collatz Arithmetic2022-04-07T00:00:00+00:002022-04-07T00:00:00+00:00https://nickdrozd.github.io/2022/04/07/collatz-arithmetic<p>Pick a natural number <em>n</em> and apply the following transformation to it: if <em>n</em> is even, the output will be <strong><em>n / 2</em></strong>; otherwise, if <em>n</em> is odd, the output will be <strong><em>3n + 1</em></strong>. Keep doing this until <em>1</em> is reached.</p>
<p>Will this procedure always reach <em>1</em>? That is, is this a <strong><a href="https://nickdrozd.github.io/2022/04/01/total-partial-functions.html">total function</a></strong>? The <strong>Collatz Conjecture</strong> says yes, but this has not been proved or disproved. It has also not been proved or disproved that it can be proved or disproved.</p>
<p><strong><a href="https://www.jstor.org/stable/10.4169/amer.math.monthly.120.03.192">John Conway</a></strong> argued that the Collatz Conjecture is both <em>true</em> and <strong><em>unsettleable</em></strong>: it cannot be proved, and it cannot be proved that it cannot be proved, and so on. This is not a particularly satisfying situation, but it could very well be the case.</p>
<p>Call the Collatz transformation <em>C</em>. Notice that if <em>C(n)</em> really does reach <em>1</em> and terminate for a given <em>n</em>, this is <strong>always provable</strong>, and indeed the number of iterations required can be explicitly calculated. That is, we can exhibit a number <em>k</em> such that <em>C<sup>k</sup>(n) = 1</em>. So if the conjecture is true but not provable, then this means that each <strong>instance</strong> is provable, but the <strong>generalization</strong> is not. In symbols:</p>
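<p>Checking an instance is a simple mechanical matter of running the transformation and counting (a quick sketch, using the unaccelerated form of <em>C</em>):</p>

```python
def collatz_steps(n: int) -> int:
    """Return the k such that C^k(n) = 1 (assuming one exists!)."""
    k = 0
    while n > 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        k += 1
    return k

assert collatz_steps(1) == 0
assert collatz_steps(6) == 8   # 6, 3, 10, 5, 16, 8, 4, 2, 1
```

<p>Of course, the while-loop here is itself unbounded: if CC were false for some <em>n</em>, this function would simply never return.</p>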
<ul>
<li>⊢ <em>∃k C<sup>k</sup>(1) = 1</em></li>
<li>⊢ <em>∃k C<sup>k</sup>(2) = 1</em></li>
<li>⊢ <em>∃k C<sup>k</sup>(3) = 1</em></li>
<li>⊢ <em>∃k C<sup>k</sup>(4) = 1</em></li>
<li>…</li>
<li>⊬ <em>∀n ∃k C<sup>k</sup>(n) = 1</em></li>
</ul>
<p>This last statement is a formalization of the Collatz conjecture, and we’ll refer to it as <strong>CC</strong>.</p>
<p>But “provable” isn’t an absolute property – provability is always with respect to some <strong>theory</strong>. What theory are we talking about here? The answer is: <strong>take your pick</strong>. The conjecture is not known to be decided by any theory in common use: <strong>Peano arithmetic (PA)</strong>, set theory, whatever. For simplicity, we’ll assume PA as our background theory.</p>
<p><strong>Let’s assume that CC is true and also not provable in PA.</strong> These assumptions imply that the <strong>negation</strong> of CC is not provable either, or in other words that CC is <strong>independent</strong> of PA. This being the case, we are free to add either CC or its negation to PA as <strong>new axioms</strong> and the resulting theory will be <strong>consistent</strong>. Adding the negation, however, yields a theory that is <strong><em>𝜔-inconsistent</em></strong>, or more simply, <strong>unsound</strong>. It says things about the natural numbers that are <strong>not true</strong>. (Alternatively, it can be looked at as a theory of <em>number-like objects</em> that are not numbers.)</p>
<p>So we’ll add CC as a new axiom to PA. Call the resulting theory <strong>Collatz arithmetic (CA)</strong>. CA is strictly stronger than PA, since CA ⊢ CC but (by hypothesis) PA ⊬ CC. <strong>What other kinds of things can be proved in CA but not PA?</strong></p>
<p>Before getting to that, let’s get <em>C</em> into a more workable form. To say that <em>n</em> is even is to say that it has the form <em>2k</em>, so instead of saying that <em>C(n) = n / 2</em>, we can say that <strong><em>C(2k) = k</em></strong>. Similarly, to say that <em>n</em> is odd is to say that it has the form <em>2k + 1</em>, and <em>3(2k + 1) + 1 = 6k + 4</em>. But notice that if <em>n</em> is odd, <em>3n + 1</em> will always be even, immediately triggering the even clause. The Collatz function can be “accelerated” slightly by applying that halving step right away. With a little algebra, this works out to <strong><em>C(2k + 1) = 3k + 2</em></strong>.</p>
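<p>The algebra is easy to spot-check mechanically (a throwaway sketch, nothing from the post itself):</p>

```python
def accelerated(n: int) -> int:
    # C(2k) = k; C(2k + 1) = 3k + 2
    k, r = divmod(n, 2)
    return k if r == 0 else 3 * k + 2

# For odd n, the accelerated step agrees with "apply 3n + 1, then halve":
for n in range(1, 1000, 2):
    assert accelerated(n) == (3 * n + 1) // 2
```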
<p>Let’s consider a variation of the Collatz function that we’ll call the <strong>Half Collatz</strong> function. Define a transformation function <em>H</em> to be just like <em>C</em> except that it leaves its even arguments alone – in other words, <strong><em>H(n = 2k) = n</em></strong> and <strong><em>H(2k + 1) = 3k + 2</em></strong>. And instead of applying its transformation until the argument reaches <em>1</em>, apply it until its argument is even.</p>
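<p>In code, the Half Collatz iteration might look like this (my sketch; the while-loop is total only if HCC holds):</p>

```python
def half_collatz(n: int) -> int:
    # Apply H(2k + 1) = 3k + 2 until the argument is even;
    # even arguments are left alone.
    while n % 2 == 1:
        n = 3 * (n // 2) + 2
    return n

assert half_collatz(4) == 4    # even inputs pass straight through
assert half_collatz(7) == 26   # 7 -> 11 -> 17 -> 26
```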
<p>Will the Half Collatz function eventually reach an even number for all inputs? Call this statement the <strong>Half Collatz Conjecture (HCC)</strong>. Like CC, HCC appears to be both true and unprovable, PA ⊬ HCC. <strong>But HCC is provable in CA, CA ⊢ HCC.</strong> For suppose that <em>n</em> is a number > 1 such that repeated applications of <em>H</em> never reach an even number. By CC, there is a <em>k</em> such that <em>C<sup>k</sup>(n) = 1</em>. <em>n > 1</em>, so it must get reduced at some point. That reduction only happens when the even clause of <em>C</em> is triggered, so <em>n</em> must eventually reach an even number. ⊣</p>
<p>Finally, consider one more transformation function: <strong><em>B(n = 3k) = n</em></strong> and <strong><em>B(3k + r) = 5k + 3 + r</em></strong> (with <em>r = 1</em> or <em>r = 2</em>). We’ll call this the <strong>Beaver Collatz</strong> function, since this is the function implemented by the <strong><a href="https://nickdrozd.github.io/2021/10/31/busy-beaver-derived.html">4-state 2-color Beeping Busy Beaver champion</a></strong>. The <strong>Beaver Collatz Conjecture (BCC)</strong> says that repeated applications of <em>B</em> will eventually reach a number divisible by <em>3</em> for all inputs. (For a good time, try starting with <em>2</em>.) BCC seems to be true, and presumably it is also unprovable in PA. <strong>Is BCC provable in CA, CA ⊢ BCC?</strong> I conjecture that it is.</p>
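<p>Here is a bounded sketch of the Beaver Collatz iteration (my own encoding; the step cap is there because an orbit that never reaches a multiple of <em>3</em> would otherwise run away):</p>

```python
def beaver_collatz(n: int, max_steps: int = 1000):
    # Apply B(3k + r) = 5k + 3 + r (r = 1 or 2) until a multiple
    # of 3 turns up; give up after max_steps applications.
    for step in range(max_steps):
        k, r = divmod(n, 3)
        if r == 0:
            return n, step
        n = 5 * k + 3 + r
    return None

assert beaver_collatz(6) == (6, 0)   # already divisible by 3
assert beaver_collatz(1) == (9, 2)   # 1 -> 4 -> 9
```

<p>Try <code class="language-plaintext highlighter-rouge">beaver_collatz(2)</code> and watch the numbers climb.</p>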
<p>We saw that CA ⊢ HCC. But this is equivalent to saying that PA ⊢ CC → HCC, so <strong>why bother with a new axiom?</strong> Why not just stick with PA and take CC as a hypothesis? Well, if Conway is right, CC is both true and unsettleable. To me this says that CC opens up <strong>a whole new method of reasoning</strong>, a method of reasoning that is totally inaccessible otherwise. “Unsettleable” is a strong word, and Collatz-based arguments must be correspondingly strong.</p>
<p>But if CA ⊬ BCC, then the new method of reasoning seems somewhat weak. BCC would presumably be <strong>unsettleable even in CA</strong>, and would therefore have to be added as <strong>yet another new axiom</strong>. There’s a whole world of <strong><a href="https://arxiv.org/pdf/1311.1029.pdf">Collatz-like functions</a></strong>; are they all <strong>co-unsettleable</strong>? I suppose that’s possible, but it’s also possible that there are classes of such functions that are <strong>co-provable</strong>. That is my feeling.</p>Total Functions and Partial Functions2022-04-01T00:00:00+00:002022-04-01T00:00:00+00:00https://nickdrozd.github.io/2022/04/01/total-partial-functions<p>Given a function and an input for which that function is defined, does the function return an answer? Functions that always return answers are called <strong><em>total</em></strong>, and functions that don’t are called <strong><em>partial</em></strong>.</p>
<p>Note that we are only considering <strong>inputs for which the function is defined</strong>. It’s easy to define a function on a certain set of inputs (the <em>domain</em>) but <strong>forget to cover some cases</strong>. This sort of thing happens all the time in both programming (for instance, writing a function to handle lists but forgetting to deal with the empty list) and math (as when a student writing an inductive proof forgets to cover the base case). It’s easy enough to handle. Just <strong>add the missing clauses</strong> and move on. Alternatively, <strong>redefine the domain</strong> of the function so that it excludes the missing case. Either way, there will be no <strong>“holes”</strong> in the modified function.</p>
<p>How could a function fail to return an answer for an input for which it is defined? This problem is essentially related to the <strong>termination of unbounded while-loops</strong>. Functions can be broken down into four classes based on their loop bounds:</p>
<ol>
<li><strong><em>Primitive computable</em></strong>: no unbounded loops required; all loop bounds can be calculated in advance. (Also known as <em>primitive recursive</em>.)</li>
<li><strong><em>General computable</em></strong>: unbounded loops required; loop bounds cannot be calculated in advance, but loops can nevertheless be guaranteed to terminate. (Also known as <em>general recursive</em>.)</li>
<li><strong><em>Uncomputable</em></strong>: unbounded loops required; loops can be guaranteed not to terminate for some inputs, but which inputs exactly cannot be determined.</li>
<li><strong>Unknown</strong>: unbounded loops required as far as anyone knows, but this has not been proved.</li>
</ol>
<p>Let’s take a look at some examples.</p>
<h1 id="primitive-computable">Primitive Computable</h1>
<p>Primitive computable functions encompass <strong>pretty much everything that is encountered in day-to-day programming and math</strong>. A good rule of thumb is that if it can be calculated in <strong>Excel</strong>, it’s primitive computable. This includes everything from simple arithmetic to solving Go.</p>
<p>Functions in this class can be written using only loops with explicit fixed bounds. Yet it’s common to see <strong><a href="https://nickdrozd.github.io/2021/09/02/new-pylint-checks.html">real-world code written with unbounded while-loops</a></strong> even in cases where there is no good reason for doing so.</p>
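<p>To make the distinction concrete, here is a contrived example (mine, not from any real codebase) of an unbounded while-loop with a perfectly good bound hiding in plain sight:</p>

```python
def next_power_of_two_unbounded(n: int) -> int:
    # The lazy version: loop until done.
    p = 1
    while True:
        if p > n:
            return p
        p *= 2

def next_power_of_two_bounded(n: int) -> int:
    # Same function, but with an explicit bound: p doubles each
    # pass, so n.bit_length() + 1 passes always suffice.
    p = 1
    for _ in range(n.bit_length() + 1):
        if p > n:
            return p
        p *= 2
    return p

assert all(next_power_of_two_bounded(n) == next_power_of_two_unbounded(n)
           for n in range(1000))
```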
<h1 id="general-computable">General Computable</h1>
<p>It’s not easy to come up with functions that are computable but not primitive. Personally I only know of two examples, and they are both named <strong>after their discoverers</strong>.</p>
<p>The first of these is the <strong>Ackermann</strong> function, also known as the <strong>Sudan-Ackermann-Peters</strong> function. It was devised in the 1920s for the specific purpose of showing that there are computable functions that are not primitive computable. Here’s a definition in <strong>Python</strong>:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">def</span> <span class="nf">ackermann</span><span class="p">(</span><span class="n">bound</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">total</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-></span> <span class="nb">int</span><span class="p">:</span>
<span class="n">stack</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
<span class="k">if</span> <span class="n">bound</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="n">total</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">stack</span><span class="p">:</span>
<span class="k">return</span> <span class="n">total</span>
<span class="n">bound</span> <span class="o">=</span> <span class="n">stack</span><span class="p">.</span><span class="n">pop</span><span class="p">()</span>
<span class="k">continue</span>
<span class="k">if</span> <span class="n">total</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="n">total</span> <span class="o">=</span> <span class="mi">1</span>
<span class="n">bound</span> <span class="o">-=</span> <span class="mi">1</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">total</span> <span class="o">-=</span> <span class="mi">1</span>
<span class="n">stack</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">bound</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span></code></pre></figure>
<p>This definition contains an <strong>open while-loop</strong>, and it <strong>cannot be rewritten with a bounded loop</strong>. There is no method for estimating the number of loop passes required short of actually calculating it. Still, it can be shown that it will <strong>always terminate</strong>. On every iteration either <code class="language-plaintext highlighter-rouge">bound</code> or <code class="language-plaintext highlighter-rouge">total</code> is decremented. <code class="language-plaintext highlighter-rouge">bound</code> is increased from time to time, but never past its initial value. The output grows very, very fast with respect to its inputs.</p>
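<p>For comparison, here is the usual two-argument recursive definition (the textbook formulation, not a replacement for the iterative version above). For small inputs the two compute the same values:</p>

```python
def ackermann_rec(m: int, n: int) -> int:
    # A(0, n) = n + 1
    # A(m, 0) = A(m - 1, 1)
    # A(m, n) = A(m - 1, A(m, n - 1))
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann_rec(m - 1, 1)
    return ackermann_rec(m - 1, ackermann_rec(m, n - 1))

assert ackermann_rec(2, 3) == 9    # A(2, n) = 2n + 3
assert ackermann_rec(3, 3) == 61   # A(3, n) = 2^(n + 3) - 3
```

<p>Keep the inputs tiny: the recursion blows up just as fast as the growth rate suggests.</p>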
<p>Even faster-growing is the <strong>Goodstein function</strong>, which has to do with representing a number in terms of sums of powers of some base and then incrementing that base. Like the Ackermann function, the Goodstein function can be proved, despite its staggering growth, to be total. But whereas the totality of Ackermann can be proved without too much difficulty, proving the totality of Goodstein is <strong>provably difficult</strong>. Specifically, it requires <strong>infinitary reasoning</strong> and therefore <strong><a href="http://www.cs.tau.ac.il/~nachumd/term/Kirbyparis.pdf">cannot be proved in Peano arithmetic</a></strong>.</p>
<h1 id="uncomputable">Uncomputable</h1>
<p>Uncomputable functions do not return answers for some inputs. The classic example of uncomputability is the <strong>halting problem</strong>: does a given program halt or not? Any attempt to implement a function to calculate haltingness for arbitrary programs is <strong>doomed</strong> to be either partial or incorrect.</p>
<p>In practice, a reasonable means of coping with uncomputability is to <strong>impose a bound</strong>. Rather than asking if a program halts <em>ever</em>, we can ask if it halts within <em>some number of steps</em>. This question in contrast can be answered definitively without any difficulty.</p>
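<p>For step-by-step processes, the bounded question is easy to implement (my sketch of the idea, illustrated with the Collatz step):</p>

```python
def reaches_one_within(step, n: int, bound: int) -> bool:
    # Does iterating `step` from n reach 1 within `bound` applications?
    # This always returns an answer -- but "False" only means
    # "not within the bound", never "never".
    for _ in range(bound):
        if n == 1:
            return True
        n = step(n)
    return n == 1

def collatz_step(n: int) -> int:
    return n // 2 if n % 2 == 0 else 3 * n + 1

assert reaches_one_within(collatz_step, 27, 200)       # 27 takes 111 steps
assert not reaches_one_within(collatz_step, 27, 50)
```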
<p>Of course, the <strong><a href="https://nickdrozd.github.io/2020/10/15/busy-beaver-well-defined.html">Busy Beaver problem</a></strong> is also uncomputable.</p>
<h1 id="unknown">Unknown</h1>
<p>Some functions seem to require unbounded loops, but nobody knows if this is actually true. The most famous of these is the <strong>Collatz function</strong>:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">def</span> <span class="nf">collatz</span><span class="p">(</span><span class="n">n</span><span class="p">:</span> <span class="nb">int</span><span class="p">):</span>
<span class="k">while</span> <span class="n">n</span> <span class="o">></span> <span class="mi">1</span><span class="p">:</span>
<span class="n">k</span><span class="p">,</span> <span class="n">r</span> <span class="o">=</span> <span class="nb">divmod</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
<span class="n">n</span> <span class="o">=</span> <span class="n">k</span> <span class="k">if</span> <span class="n">r</span> <span class="o">==</span> <span class="mi">0</span> <span class="k">else</span> <span class="p">(</span><span class="mi">3</span> <span class="o">*</span> <span class="n">k</span><span class="p">)</span> <span class="o">+</span> <span class="mi">2</span></code></pre></figure>
<p>Does this function terminate for all inputs? As far as anybody knows, the answer is yes. Can the open while-loop be rewritten with a bound? As far as anybody knows, the answer is no. This function is <strong>awfully simple</strong>, but nobody knows how to answer basic questions about its behavior.</p>
<p>The Collatz function belongs to a genre of functions of the following form: <strong>apply some transformation to an input until some condition is met</strong>. Another example of this genre is the <strong>Lychrel function</strong>, which adds a number with its digit-reverse until a palindrome turns up:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">def</span> <span class="nf">lychrel</span><span class="p">(</span><span class="n">n</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-></span> <span class="nb">int</span><span class="p">:</span>
<span class="k">while</span> <span class="n">n</span> <span class="o">!=</span> <span class="p">(</span><span class="n">rev</span> <span class="p">:</span><span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="s">''</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="nb">reversed</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">n</span><span class="p">))))):</span>
<span class="n">n</span> <span class="o">+=</span> <span class="n">rev</span>
<span class="k">return</span> <span class="n">n</span></code></pre></figure>
<p>Does this function terminate for all inputs? The answer here appears to be <strong>no</strong>: <code class="language-plaintext highlighter-rouge">lychrel(196)</code> has <strong>not been witnessed to terminate</strong>. It could be that the sequence does terminate and nobody has checked far enough yet to see it, or it could be that it really doesn’t terminate. If it doesn’t terminate, it would be nice if that could be proved, but nobody knows if that’s possible either.</p>
<h1 id="discussion-questions">Discussion Questions</h1>
<ol>
<li>General-purpose programming languages allow users to (attempt to) implement uncomputable functions. Is this a feature or a bug?</li>
<li>Suppose your usual programming language was replaced with another language that was exactly the same except unbounded while-loops were prohibited. How long would it take for you to notice? Would you complain?</li>
<li>Are there any practical applications for general computable functions?</li>
<li>What are some real-world use cases where unbounded while-loops are helpful?</li>
<li>What are some real-world use cases where unbounded while-loops are thought to be required but really aren’t?</li>
<li>The expression “primitive computable” is, ironically, not very “PC”. What is a better term?</li>
</ol>A Formal Theory of Spaghetti Code2022-03-12T00:00:00+00:002022-03-12T00:00:00+00:00https://nickdrozd.github.io/2022/03/12/formal-theory-of-spaghetti-code<p>The <strong><a href="https://nickdrozd.github.io/2021/01/26/spaghetti-code-conjecture.html">Spaghetti Code Conjecture</a></strong> (SCC) says that <strong>Busy Beaver</strong> programs – the longest-running <strong>Turing machine</strong> programs of a given length – ought to be as complicated as possible. This was first proposed by <strong><a href="https://www.scottaaronson.com/papers/bb.pdf">Scott Aaronson</a></strong>:</p>
<blockquote>
<p>A related intuition, though harder to formalize, is that Busy Beavers shouldn’t be “cleanly factorizable” into main routines and subroutines – but rather, that the way to maximize runtime should be via “spaghetti code,” or a single n-state amorphous mass.</p>
</blockquote>
<p>I think SCC is <a href="https://nickdrozd.github.io/2021/09/25/spaghetti-code-conjecture-false.html">probably false</a>, and other people think it must be true. But what, precisely, does it mean? <strong>What exactly is “spaghetti code”?</strong> As Aaronson pointed out, the conjecture was only stated at the intuitive level and hasn’t been formalized. What’s needed is a <strong>formal theory of spaghetti code</strong>: an effective procedure that will determine of a given Turing machine program whether (or to what extent) the program is spaghetti.</p>
<p>Well, happy day: just such a theory emerged recently from <a href="https://groups.google.com/g/busy-beaver-discuss/c/UzJw8R8qRK4">a discussion between me and <strong>Shawn Ligocki</strong></a> after his discovery of a new <a href="https://www.sligocki.com/2022/02/17/bbb-5-2-search-results.html">5-state Beeping Busy Beaver champion</a>.</p>
<p>Given an N-state K-color TM program, consider the program’s <strong>control flow graph</strong>. This is a directed graph with N nodes, each with K outbound arrows; nodes correspond to program states and arrows correspond to state transitions. We will subject the graph to a <strong>graph reduction procedure</strong>. Apply the following transformations until no more changes can be made:</p>
<ol>
<li><strong>Purge all nodes with no outbound arrows.</strong></li>
<li><strong>Delete all duplicate arrows.</strong></li>
<li><strong>Delete all self-pointing arrows.</strong></li>
<li><strong>Inline any node with just one exit point.</strong></li>
<li><strong>Inline any node with just one entry point.</strong></li>
</ol>
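<p>Here is one way the whole procedure might be coded up (my illustrative encoding: the graph is a dict mapping each node to its set of successors, which makes step 2 automatic; every pass that changes anything deletes a node, so the loop terminates):</p>

```python
def reduce_graph(graph: dict[str, set[str]]) -> dict[str, set[str]]:
    g = {node: set(succs) for node, succs in graph.items()}
    changed = True
    while changed:
        changed = False
        for node in list(g):
            g[node].discard(node)                       # step 3: self-arrows
            preds = [p for p in g if node in g[p]]
            if not g[node]:                             # step 1: no exits
                for p in preds:
                    g[p].discard(node)
                del g[node]
                changed = True
            elif len(g[node]) == 1 or len(preds) == 1:  # steps 4 and 5: inline
                for p in preds:
                    g[p] |= g[node]                     # pass the arrows up
                    g[p].discard(node)
                del g[node]
                changed = True
    return g

# A two-node graph is fully reducible, no matter the arrows.
assert reduce_graph({'A': {'A', 'B'}, 'B': {'A', 'B'}}) == {}

# A complete graph on three nodes is an irreducible 3-node kernel.
k3 = {'A': {'B', 'C'}, 'B': {'A', 'C'}, 'C': {'A', 'B'}}
assert reduce_graph(k3) == k3
```

<p>A program whose graph comes back empty is fully well-structured by the lights of this theory; whatever survives is its kernel.</p>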
<p>Steps 4 and 5 refer to “inlining” a node. This means deleting the node and giving its arrows to the nodes that reach it. The idea here is that we really only care about <strong>branching</strong>, and non-branching sequences don’t matter. For example, consider the first graph below. Node C can be reached from either A or B, and D can go to either E or F. These are branches. But C never goes anywhere but D, and so we might as well join them into one conglomerate node:</p>
<p><img src="/assets/2022-03-12-formal-theory-of-spaghetti-code/inline-before.png" alt="img" /> <img src="/assets/2022-03-12-formal-theory-of-spaghetti-code/inline-after.png" alt="img" /></p>
<p>Anyway, you start with the program’s full control flow graph, then apply those reduction steps until no more changes can be made. Call whatever is left over the <strong><em>kernel</em></strong> of the graph. <em>How many nodes are left in the kernel compared to how many nodes were in the original graph?</em> This, I claim, constitutes some kind of meaningful measure of program complexity. A larger kernel means a more complicated program, and a smaller kernel means a simpler one. <strong>In particular, a graph that cannot be reduced at all can be considered utter spaghetti and a graph that can be eliminated completely can be considered thoroughly well-structured.</strong></p>
<p>An important caveat to keep in mind is that <strong>this approach works best when states are many and colors are few</strong>. Here’s a fact: every graph of just two nodes can be reduced to nothing. This is because once the reflexive arrows are cut, the two nodes each have no more than one entry and one exit (to the other node), and so all remaining arrows can be cut. <em>This is true irrespective of how many arrows there were to begin with.</em> So by the lights of this theory, every 2-state gajillion-color program is simple. Obviously that is not the case, and this shows a limitation to the theory.</p>
<p>With all this in mind, we can restate the Spaghetti Code Conjecture formally: <strong>the control flow graph of (sufficiently long) Busy Beaver programs ought to be at least partially irreducible.</strong></p>
<p><strong>Do the facts support this claim? No, they do not!</strong> The following programs are all totally reducible and therefore anti-spaghetti:</p>
<ol>
<li>The BBB(4) champion (quasihalts in 32,779,478 steps)</li>
<li>The BB(5) champion (halts in 47,176,870 steps)</li>
<li>The BBB(5) champion (quasihalts in > 10<sup>502</sup> steps)</li>
<li>The BB(6) champion (halts in > 10<sup>36,534</sup> steps)</li>
</ol>
<p><strong>Pascal Michel</strong> maintains a <strong><a href="https://webusers.imj-prg.fr/~pascal.michel/ha.html">list of historical Busy Beaver champions</a></strong>. Of the 23 top-scoring 5-state halting programs, just two of them (#2 and #23) are partially irreducible; of the 12 top-scoring 6-state halting programs, just four are partially irreducible (#2, #10, #11, and #12). Finally, Shawn Ligocki maintains a <a href="https://github.com/sligocki/busy-beaver/blob/main/Machines/5x2-Beep-Top">list of the 20 top-scoring 5-state quasihalting programs</a>. Not a single one of these is even partially irreducible.</p>
<p><strong>Thus there is no concrete evidence at all that the Spaghetti Code Conjecture is true.</strong> All available evidence points towards the opposite conclusion, which we might call the <strong>Clean Code Conjecture</strong> (CCC): that all Busy Beaver champion programs are well-structured and fully graph-reducible. (A better name maybe would be “Structured Code Conjecture”, but then we would have a collision of initials.)</p>
<p>You can point all this out to people and they will still insist that the SCC must be true. Why? As far as I can tell, <strong>the only arguments in favor of SCC rely on dismissing all the available evidence</strong>. This is done in two ways.</p>
<p>The first is to say that <strong>short programs just can’t be all that complex, and therefore short programs don’t constitute real evidence</strong>. Sure, the BBB(4) champion may be simple, but that’s just because all 4-state programs are trivially simple, and the SCC only applies for “sufficiently long” programs.</p>
<p>The problem with this argument is that it assumes that 4-state programs cannot be complex, and this is stated as if it were some obvious logical truth. But it isn’t a logical truth – it amounts to an <strong>empirical claim</strong>, and in fact it’s a false one. There are indeed complex programs of just four states whose behavior cannot be easily described. <a href="https://nickdrozd.github.io/2021/09/25/spaghetti-code-conjecture-false.html">I’ve previously discussed</a> a program discovered by <strong><a href="https://github.com/boydjohnson/lin-rado-turing/">Boyd Johnson</a></strong> that enters into <a href="https://nickdrozd.github.io/2021/02/24/lin-recurrence-and-lins-algorithm.html">Lin recurrence</a> after 158,491 steps with a shocking period of 17,620. <strong>This program has an irreducible 3-node kernel, and is therefore spaghetti by the lights of our theory.</strong> So 4-state programs <em>can</em> be spaghetti, and therefore the fact that the BBB(4) champion is <em>not</em> constitutes evidence against SCC.</p>
<p>The second argument for dismissing the available evidence is that <strong>the means for discovering champions are biased in favor of simple programs</strong>. The best Turing machine simulator that I am aware of is the one written by <a href="https://github.com/sligocki/busy-beaver">Shawn and Terry Ligocki</a>. It can analyze a running program and determine if it exhibits <strong><a href="https://www.sligocki.com/2021/07/17/bb-collatz.html">Collatz-like behavior</a></strong>; if this behavior is detected, it can be extrapolated out to extreme lengths. This is how Shawn discovered, for instance, the current <strong><a href="https://www.sligocki.com/2022/02/17/bbb-5-2-search-results.html">BBB(5) champion</a></strong>.</p>
<p>But if a program does <em>not</em> exhibit such behavior, <em>the simulator will not find it</em>. This means that the available evidence is overwhelmingly colored by a <strong>selection bias</strong> in favor of Collatz-like programs, and especially those that are <strong>amenable to analysis</strong>. Simpler programs are more amenable to analysis than more complex ones, and thus we should expect simpler programs to be easier to find. <strong>There is an observable universe of programs, and it does not encompass the whole of program space.</strong></p>
<p>This is a <strong>disquieting state of affairs</strong>, to be sure, and it should be kept in mind at all times when discussing these uncomputable functions. Still though, this isn’t an argument in favor of the SCC; it’s just an argument that the available evidence isn’t all that compelling, and we should keep an open mind about counterexamples.</p>
<p>Such skepticism can be applied to the <strong>Collatz conjecture</strong>. According to <a href="https://en.wikipedia.org/wiki/Collatz_conjecture#Experimental_evidence">Wikipedia</a>, the Collatz conjecture has been verified up through about 10<sup>20</sup>. Well, whoop-de-doo! Any number that we humans can actually reach is by definition <strong>puny</strong>; the “observable universe” of numbers just doesn’t reach very far. It’s even been proved that a Collatz counterexample must have certain striking properties, like an enormously long orbit. These proofs are in effect proofs that we will not be able to find a counterexample, even if there is one.</p>
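<p>To make the scale concrete, a few lines of Python can compute Collatz orbit lengths; verifying the conjecture up to 10<sup>20</sup> amounts to running a check like this (far more cleverly optimized, of course) for every starting value in that range. The function name here is just illustrative.</p>

```python
def collatz_orbit_length(n: int) -> int:
    """Number of steps for n to reach 1 under the Collatz map."""
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

# 27 is famous for its long orbit relative to its size.
print(collatz_orbit_length(27))  # -> 111
```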
<p><strong>Is this skeptical attitude reasonable?</strong> There’s definitely something to be said for it, although taken to the extreme it takes on an almost conspiratorial that’s-what-they-want-you-to-think quality. In any case I find myself <strong>unmoved</strong> when it comes to the SCC.</p>
<h1 id="discussion-questions">Discussion Questions</h1>
<ol>
<li>Has this graph reduction procedure been discussed before? If so, what is it called?</li>
<li>Would you join a club that would accept you as a member?</li>
<li>What might exist outside the observable universe of programs?</li>
<li>Does the selection bias argument apply to all open conjectures (Goldbach, etc), or just some of them?</li>
</ol>The Spaghetti Code Conjecture (SCC) says that Busy Beaver programs – the longest-running Turing machine programs of a given length – ought to be as complicated as possible. This was first proposed by Scott Aaronson:ذس پوست س رتن ن ینگلش2022-03-10T00:00:00+00:002022-03-10T00:00:00+00:00https://nickdrozd.github.io/2022/03/10/english-in-arabic-script<p>ذ ینگلش لانگواج س وژالی رتن ن ذ رومن الفابت. بت اس وی ال نو، ذس س ا باد چویس. ینگلش هاس ساوندس ذات ار نت رپرسنتد. ان توپ ف زات، هیستوریکال ورثوگرافی س وایلدلی ینکونسیستینت. ت س ا چالینج، تو سی ذ لیست.</p>
<p>س ذیر ان التیرناتیف؟ ن پارتیکولار، کود ارابیک سکریپت سیرف اس ا رپلیسمنت؟</p>
<p>ویل، نو.</p>
<p>فور ون ثینگ، سن’ت ریلی ذ ارابیک الفابت، ت’س ذ پرژن الفابت. ذس فاریانت ینکلودس ا فیو کریتیکال لیترس میسنگ فرم ارابیک: گ، پ، اند چ. ذوس موست لانگواجس وسنگ “ارابیک” ار ریلی بپرژن.</p>
<p>ذ ارابیک سثریپت س الریدی ا تربل چویس فر پرژن. ت وورکس ویل فر ارابیک، وذر لانگواجس. سپیکرس ف پرژن فیس دیفیکولتیس وث ارابیک سکریپت سیمیلار تو ینگلش سپیکرس وث رومن سکریپت.</p>
<p>ذس پوست وس رتن بای هاند. ای’م شور ذ سپیلنگ یس نت کنسیستینت. شود وردس بی سپیلد اس ذی ساوند؟ ور شود ذی بی رتن ارابیک−ستایل، وث واولس یلایدد؟ ذیز ار هارد کویسچنس.</p>ذ ینگلش لانگواج س وژالی رتن ن ذ رومن الفابت. بت اس وی ال نو، ذس س ا باد چویس. ینگلش هاس ساوندس ذات ار نت رپرسنتد. ان توپ ف زات، هیستوریکال ورثوگرافی س وایلدلی ینکونسیستینت. ت س ا چالینج، تو سی ذ لیست.Latest Beeping Busy Beaver Results2022-02-11T00:00:00+00:002022-02-11T00:00:00+00:00https://nickdrozd.github.io/2022/02/11/latest-beeping-busy-beaver-results<p>The <a href="https://scottaaronson.blog/?p=4916"><strong>Busy Beaver</strong></a> question asks: what is the longest that a Turing machine program of <em>n</em> states and <em>k</em> colors can run when started on the blank tape before halting? The function that maps from <em>(n, k)</em> to the longest run length is <strong>uncomputable</strong> and grows faster than any computable function.</p>
<p>Variations of the Busy Beaver function (BB) can be obtained by changing the <a href="https://nickdrozd.github.io/2021/02/14/blanking-beavers.html"><strong>termination condition</strong></a>: what is the longest that a Turing machine of <em>n</em> states and <em>k</em> colors can run before doing such-and-such? The <strong>Blanking Beaver</strong> function (BLB) arises from running programs until the Turing machine tape becomes blank, and the <strong>Beeping Busy Beaver</strong> (BBB) function arises from running programs until they reach a condition known as <a href="https://nickdrozd.github.io/2021/01/14/halt-quasihalt-recur.html"><strong><em>quasihalting</em></strong></a>.</p>
<table>
<thead>
<tr>
<th>Function</th>
<th>Termination</th>
</tr>
</thead>
<tbody>
<tr>
<td>BB</td>
<td>Halt</td>
</tr>
<tr>
<td>BLB</td>
<td>Blank tape</td>
</tr>
<tr>
<td>BBB</td>
<td>Quasihalt</td>
</tr>
</tbody>
</table>
<p>Here are the <strong>latest and greatest lower bounds</strong> that have been discovered for early values of these functions:</p>
<table>
<thead>
<tr>
<th>States</th>
<th>Colors</th>
<th>BB</th>
<th>BLB</th>
<th>BBB</th>
</tr>
</thead>
<tbody>
<tr>
<td>2</td>
<td>2</td>
<td>6</td>
<td>6</td>
<td>8</td>
</tr>
<tr>
<td>3</td>
<td>2</td>
<td>21</td>
<td>34</td>
<td>55</td>
</tr>
<tr>
<td>2</td>
<td>3</td>
<td>38</td>
<td>77</td>
<td>59</td>
</tr>
<tr>
<td>4</td>
<td>2</td>
<td>107</td>
<td>32,779,477</td>
<td>32,779,478</td>
</tr>
<tr>
<td>2</td>
<td>4</td>
<td>3,932,964</td>
<td>1,367,361,263,049</td>
<td>205,770,076,433,044,242,247,859</td>
</tr>
<tr>
<td>5</td>
<td>2</td>
<td>47,176,870</td>
<td> </td>
<td> </td>
</tr>
</tbody>
</table>
<p>(I claim that <strong>all the small values are the true values</strong>, but that’s for a separate post.)</p>
<p>These numbers suggest several <strong>plausible hypotheses</strong>:</p>
<ol>
<li><strong>BLB grows faster than BB.</strong></li>
<li><strong>BBB grows faster than BB.</strong></li>
<li><strong>BBB grows faster than BLB.</strong></li>
</ol>
<p>(2) and (3) are both known from <strong>computability theory</strong> to be true. This has to do with the termination conditions. At any given step it’s possible to determine whether or not the machine is <em>currently</em> halted and whether or not the tape is <em>currently</em> blank. In other words, they are <strong>decidable predicates</strong>. Determining whether a machine will <em>eventually</em> halt or whether the tape will <em>eventually</em> become blank requires an unbounded search for a decidable predicate, and that search is <strong>semidecidable</strong>: if the condition eventually holds, it will turn up, but otherwise it won’t.</p>
<p>In contrast, checking whether a machine is currently quasihalted or not is already semidecidable, and solving this in general is equivalent to the <strong>halting problem</strong>. Checking whether a machine will <em>eventually</em> quasihalt thus requires an unbounded search for an uncomputable predicate, and this means that BBB is a <strong>super-uncomputable function</strong>. Just as BB grows faster than any computable function, BBB grows faster than any function that is <strong>“just” regular-uncomputable</strong>.</p>
<p>Computability theory is a <strong>theory</strong>, and that theory makes <strong>predictions</strong>, and one of those predictions is that BBB, as a super-uncomputable function, should grow really, really fast. Thus these <strong>empirical results</strong> about Turing machine program behavior serve to <strong>confirm the theory’s predictions</strong>.</p>
<p>What about BB and BLB? These functions are <strong>equicomputable</strong>, and one can be solved given an <strong>oracle</strong> for the other. Computability theory doesn’t make a prediction about which one grows faster. All known empirical results suggest that BB < BLB, but as <a href="https://scottaaronson.blog/?p=5661#comment-1900309">Bruce Smith</a> pointed out, <strong>we can’t even prove that BB ≤ BLB</strong>!</p>
<p><strong>Provability</strong> is important from a mathematical point of view. But the Busy Beaver problem was originally posed as a <strong>competition</strong> to see who could come up with the longest-running program. Searching for long-running Turing machines is like <strong>prospecting for gold</strong>, and it requires making predictions about where the winning programs might be found, even if these predictions cannot be backed up with proofs. <strong>I made a few predictions myself, and they turned out to be correct.</strong></p>
<p>Here are the best values that had been discovered through <strong>the end of 2021</strong>:</p>
<table>
<thead>
<tr>
<th>States</th>
<th>Colors</th>
<th>BB</th>
<th>BLB</th>
<th>BBB</th>
</tr>
</thead>
<tbody>
<tr>
<td>4</td>
<td>2</td>
<td>107</td>
<td>32,779,477</td>
<td>32,779,478</td>
</tr>
<tr>
<td>2</td>
<td>4</td>
<td>3,932,964</td>
<td>190,524</td>
<td>2,501,552</td>
</tr>
<tr>
<td>5</td>
<td>2</td>
<td>47,176,870</td>
<td> </td>
<td> </td>
</tr>
</tbody>
</table>
<p>This is the situation with which I, as a searcher, was faced. I had searched for 2-state 4-color blank-tape and quasihalting programs up through several hundred million steps, and that was the best I had found. <strong>Were these the true values?</strong> It seemed unlikely to me that BBB(2, 4) < BB(2, 4). Again, computability theory tells us that <strong>the super-uncomputable function should grow uncomputably faster than the regular-uncomputable function</strong>. There’s no good reason why it shouldn’t start early, so I figured it probably did.</p>
<p>Things were not so clear with BLB. Again, there’s no proof even that BB ≤ BLB, so maybe BLB(2, 4) < BB(2, 4). But my <a href="https://nickdrozd.github.io/2021/07/11/self-cleaning-turing-machine.html">previous discovery of the <strong>BLB(4, 2) champion</strong> </a> gave me an <strong>unshakeable hunch</strong> that there was still more to find.</p>
<p>It’s not always easy to discern <strong>justifiable faith</strong> from <strong>blind fanaticism</strong>. I certainly didn’t want to waste a bunch of time searching for something that never existed in the first place, so I set a <strong>limit</strong> beyond which I would not bother searching. I reasoned as follows: The ratio BLB(4, 2) / BB(4, 2) works out to about 306,350. If BLB(2, 4) / BB(2, 4) holds the same ratio, then we should have something like BLB(2, 4) ≈ 1,204,863,521,400, or 1.2 trillion. Rounding up, that means searching within 2 trillion steps or so.</p>
<p>This was really a <strong>grasping-at-straws</strong> kind of estimate, totally made up, with no good reason to believe that it would hold. So imagine my surprise when <a href="https://nickdrozd.github.io/2022/01/10/another-self-cleaning-turing-machine.html"><strong>the estimate turned out to be accurate!</strong></a> A new BLB champion turned up to prove that BLB(2, 4) ≥ 1,367,361,263,049. That same program quasihalts too, but earlier, establishing that BBB(2, 4) ≥ 1,367,354,345,128.</p>
<p>Here is the updated results table through <strong>mid-January 2022</strong>:</p>
<table>
<thead>
<tr>
<th>States</th>
<th>Colors</th>
<th>BB</th>
<th>BLB</th>
<th>BBB</th>
</tr>
</thead>
<tbody>
<tr>
<td>4</td>
<td>2</td>
<td>107</td>
<td>32,779,477</td>
<td>32,779,478</td>
</tr>
<tr>
<td>2</td>
<td>4</td>
<td>3,932,964</td>
<td>1,367,361,263,049</td>
<td>1,367,354,345,128</td>
</tr>
<tr>
<td>5</td>
<td>2</td>
<td>47,176,870</td>
<td> </td>
<td> </td>
</tr>
</tbody>
</table>
<p>According to this table, BLB(2, 4) > BBB(2, 4). This wouldn’t be too different from the 4-state 2-color case, where as far as we know BLB(4, 2) + 1 = BBB(4, 2). It’s just that <strong>a single program is the champion for multiple classes</strong>, and the various termination conditions are hit at different steps.</p>
<p>We know that BB and BLB are equicomputable, and so, I figured, maybe they maintain some kind of relationship in their growth. But BBB is <strong>uncomputable even with respect to these uncomputable functions</strong>. BBB grows faster than BLB, and historically Busy Beaver searchers have always underestimated how fast BB grows. Putting these facts together, I decided that the true value of BBB(2, 4) must be even further out.</p>
<p>To find the BLB(2, 4) champion, I used <a href="https://github.com/nickdrozd/busy-beaver-stuff/tree/main/idris">a simulator written in <strong>Idris</strong></a>. It’s pretty fast, but I felt I had reached the limits of what it could do. And so I turned to <a href="https://github.com/sligocki/busy-beaver">a simulator written by <strong>Shawn and Terry Ligocki</strong></a>. That simulator, which was used to discover many <a href="https://webusers.imj-prg.fr/~pascal.michel/ha.html">historical BB candidates</a>, does some sophisticated runtime analysis and is able to provide a <strong>massive speed-up</strong> in certain cases. If those kinds of programs existed in the 2-state 4-color space, this simulator would find them.</p>
<p>And it did! On 24 January 2022, I found a program that quasihalts in 67,093,892,759,901,295 steps (about 67 quadrillion). <strong>This was more like how I had expected things to look based on what I knew from theory.</strong></p>
<p>I reported this value to <a href="https://www.sligocki.com/"><strong>Shawn Ligocki</strong></a> along with the search parameters used. He then pushed his simulator even further, and on 7 February 2022 he reported a 2-state 4-color program that quasihalts in 205,770,076,433,044,242,247,859 steps (about 205 sextillion). <strong>That is where the record stands today.</strong></p>
<h1 id="2-state-4-color-champion-programs">2-state 4-color Champion Programs</h1>
<table>
<thead>
<tr>
<th>Program</th>
<th>Steps</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td><code class="language-plaintext highlighter-rouge">1RB 2RA 1RA 2RB ; 2LB 3LA 0RB 0RA</code></td>
<td>1,367,361,263,049</td>
<td>Current BLB(2, 4) champion</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">1RB 2RA 1LA 2LB ; 2LB 3RB 0RB 1RA</code></td>
<td>67,093,892,759,901,295</td>
<td>Former BBB(2, 4) champion</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">1RB 2LA 1RA 1LB ; 0LB 2RB 3RB 1LA</code></td>
<td>205,770,076,433,044,242,247,859</td>
<td>Current BBB(2, 4) champion</td>
</tr>
</tbody>
</table>
<h1 id="discussion-questions">Discussion Questions</h1>
<ol>
<li>Why should the BLB / BB ratio hold?</li>
<li>How likely is it that BLB(5, 2) > 47,176,870?</li>
<li>How likely is it that BBB(4, 2) = 32,779,478?</li>
<li>Why would a simulator only be able to provide speed-up in “certain cases”? Which cases?</li>
</ol>The Busy Beaver question asks: what is the longest that a Turing machine program of n states and k colors can run when started on the blank tape before halting? The function that maps from (n, k) to the longest run length is uncomputable and grows faster than any computable function.Brady’s Algorithm for Program Generation2022-01-14T00:00:00+00:002022-01-14T00:00:00+00:00https://nickdrozd.github.io/2022/01/14/bradys-algorithm<p>The aim of the <strong><a href="https://www.scottaaronson.com/papers/bb.pdf">Busy Beaver game</a></strong> is to find the <a href="https://nickdrozd.github.io/2021/01/14/halt-quasihalt-recur.html">longest running</a> Turing machine program of <em>n</em> states and <em>k</em> colors. Busy Beaver programs could in principle be <a href="https://nickdrozd.github.io/2021/10/31/busy-beaver-derived.html">written by hand</a>, but nobody has ever succeeded in doing so. Instead, these programs are <strong>discovered</strong> through <strong>exhaustive search</strong>. This is done by a two-stage process:</p>
<ol>
<li>Generate a list of candidate programs.</li>
<li>Run them all.</li>
</ol>
<p>Stage two raises an obvious question: run them for how long? It would be great to run them all for infinitely many steps, but that’s nonsense. Only finitely many steps can be executed. So how many? What we’re looking for is a natural number <em>S</em> and an <em>nk</em>-program <em>P</em> such that <em>P</em> halts after <em>S</em> steps. But there’s no way to <strong>upper-bound</strong> <em>S</em> in terms of <em>n</em> and <em>k</em>; for if there were, it could be used to solve the <strong>halting problem</strong>, and that’s impossible. The only way to deal with this is to pick a step limit <em>T</em> and run everything for that many steps. There isn’t any principled way to decide on a good <em>T</em>; in practice it’s determined by 1) how powerful my <a href="https://nickdrozd.github.io/2021/12/08/busy-beaver-hardware.html">hardware</a> is, 2) how long I’m willing to wait, and 3) my out-of-thin-air estimate on the upper bound of <em>S</em>. The history of Busy Beaver research is littered with estimates of <em>S</em> that turned out to be <strong>comically wrong</strong>. I’ve even made a few stupid estimates myself!</p>
<p>The first stage, <strong>program generation</strong>, is a little more nuanced. The difficulties here are not <strong>uncomputable</strong>; instead, they are “just” <strong>infeasible</strong>. It’s easy to write code to <strong>enumerate</strong> every <em>n</em>-state <em>k</em>-color program, but the number of <em>nk</em>-programs grows multiple-exponentially: there are <strong><em>O(nk<sup>nk</sup>)</em></strong> of them. This is because an <em>nk</em>-program has one instruction per state per color, and there are <em>2nk</em> possible instructions. This gets out of hand quickly.</p>
<p><strong>Normalization</strong> helps a little. Because the Busy Beaver game is always started on the blank tape, we can assume (why?) that the first instruction of every program is <code class="language-plaintext highlighter-rouge">1RB</code>. This knocks the exponent down by one, to <em>O(nk<sup>(nk-1)</sup>)</em>.</p>
<p>The <strong>halt instruction</strong> also helps. In the classic Busy Beaver game, programs are required to execute a halt instruction, so every candidate program must have at least one. Its exact location in the program might differ, introducing a multiplicative constant, but the exponent is still reduced to <em>O(nk<sup>(nk - 2)</sup>)</em>. However, this only works for classic Busy Beaver. For variants like <a href="https://nickdrozd.github.io/2021/02/14/blanking-beavers.html">Blanking Beaver</a> or <a href="https://nickdrozd.github.io/2020/08/13/beeping-busy-beavers.html">Beeping Busy Beaver</a> this optimization is not available. I regard this as evidence that <strong>these sequences are more powerful</strong>.</p>
<p>Straightforward enumeration tends to produce a lot of <strong>junk programs</strong>, or programs that are obviously not of any interest. One problem is that an enumerated program will be filled in all the way, even if it has instructions that may not be reachable starting from the blank tape. Another problem is that <strong>isomorphic duplicates</strong> are generated: programs that are equivalent through permutation of states or colors. Remember that all these programs will be passed on to stage two where they will be run for a long time. Trying to muscle through a long list of junk to find something good is wasteful, if indeed it’s feasible at all.</p>
<p>The solution to this is <strong>Brady’s algorithm</strong>, also known as the <strong>tree generation method</strong>. Devised by <strong><a href="https://nickdrozd.github.io/2021/12/08/busy-beaver-hardware.html">Allen Brady</a></strong> in <strong><a href="https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/zk51vk21c">1964</a></strong>, it is, as far as anyone knows, the best program generation algorithm there is. All non-trivial Busy Beaver champions were discovered using Brady’s algorithm, including <a href="https://nickdrozd.github.io/2021/07/11/self-cleaning-turing-machine.html">the first one</a> and <a href="https://nickdrozd.github.io/2022/01/10/another-self-cleaning-turing-machine.html">the second one</a> that I found.</p>
<p>The goal of the algorithm is to yield a list of programs that are <strong>worth investigating further</strong>. We want to avoid <strong>duds</strong>, but we also want to avoid discarding anything that might be valuable. In short, we want to pass on <strong>all and only interesting programs</strong> to stage two.</p>
<p>A nice feature of Brady’s algorithm is that it is <strong>parallelizable</strong>, so that’s how I’ll describe it. Imagine that there are some <strong>workers</strong> and a <strong>pile of programs</strong>. Initially there is just one program in the pile, which is <strong>totally undefined</strong> except for the instruction <code class="language-plaintext highlighter-rouge">A0 -> 1RB</code>. Every worker now undertakes the following task:</p>
<ol>
<li>
<p>Grab a program from the pile (or wait a short while if the pile is empty and try again, and <strong>quit</strong> if it’s still empty).</p>
</li>
<li>
<p><strong>Run</strong> that program for up to some fairly small number of steps (between one and three hundred, say) or until an undefined instruction is reached or a termination condition is detected.</p>
</li>
<li>
<p>If the step limit is reached, <strong>output the program</strong> – it might be a good one!</p>
</li>
<li>
<p>If a <strong>termination condition</strong> like <strong><a href="https://nickdrozd.github.io/2021/02/24/lin-recurrence-and-lins-algorithm.html">Lin recurrence</a></strong> is reached, <strong>throw the program away</strong> – it’s no good!</p>
</li>
<li>
<p>If an <strong>undefined instruction</strong> is reached, construct all the possible <strong>extensions</strong> of the program at that instruction slot pursuant to these <strong>constraints</strong>: the states that can be used are the states that have been visited so far plus the next one that hasn’t, and similarly for colors. Put each extension back on the pile.</p>
</li>
<li>
<p>Go back to step 1.</p>
</li>
</ol>
<p>And that’s the whole algorithm! A key optimization is the constraint introduced in step 5. This is what eliminates isomorphic duplicates. The algorithm itself solves the problem of eliminating programs with <strong>unreachable instructions</strong> by only dealing with those programs whose instructions have in fact been reached.</p>
<p>The <strong>parameters</strong> of the procedure are the run-step limit and the choice of termination conditions. I have only been able to implement detection for Lin recurrence, and it would work even better to include detection for so-called <strong>Christmas tree recurrence</strong> and <strong>counting recurrence</strong>.</p>
<p><strong>Examples</strong>. The initial 4-state 2-color program is</p>
<figure class="highlight"><pre><code class="language-nil" data-lang="nil">1RB ... ... ... ... ... ... ...</code></pre></figure>
<p>After one step it reaches the <code class="language-plaintext highlighter-rouge">B0</code> instruction, which is undefined. Its extensions are</p>
<figure class="highlight"><pre><code class="language-nil" data-lang="nil">1RB ... 0LA ... ... ... ... ...
1RB ... 0LB ... ... ... ... ...
1RB ... 0LC ... ... ... ... ...
1RB ... 0RA ... ... ... ... ...
1RB ... 0RB ... ... ... ... ...
1RB ... 0RC ... ... ... ... ...
1RB ... 1LA ... ... ... ... ...
1RB ... 1LB ... ... ... ... ...
1RB ... 1LC ... ... ... ... ...
1RB ... 1RA ... ... ... ... ...
1RB ... 1RB ... ... ... ... ...
1RB ... 1RC ... ... ... ... ...</code></pre></figure>
<p>Notice that state <code class="language-plaintext highlighter-rouge">D</code> is not used. The states that have been visited are <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code>, and state <code class="language-plaintext highlighter-rouge">C</code> is next, so <code class="language-plaintext highlighter-rouge">D</code> cannot be used.</p>
<p>The initial 2-state 4-color program is</p>
<figure class="highlight"><pre><code class="language-nil" data-lang="nil">1RB ... ... ... ... ... ... ...</code></pre></figure>
<p>After one step it reaches the <code class="language-plaintext highlighter-rouge">B0</code> instruction, which is undefined. Its extensions are</p>
<figure class="highlight"><pre><code class="language-nil" data-lang="nil">1RB ... ... ... 0LA ... ... ...
1RB ... ... ... 0LB ... ... ...
1RB ... ... ... 0RA ... ... ...
1RB ... ... ... 0RB ... ... ...
1RB ... ... ... 1LA ... ... ...
1RB ... ... ... 1LB ... ... ...
1RB ... ... ... 1RA ... ... ...
1RB ... ... ... 1RB ... ... ...
1RB ... ... ... 2LA ... ... ...
1RB ... ... ... 2LB ... ... ...
1RB ... ... ... 2RA ... ... ...
1RB ... ... ... 2RB ... ... ...</code></pre></figure>
<p>The color <code class="language-plaintext highlighter-rouge">3</code> is not used, because only colors <code class="language-plaintext highlighter-rouge">0</code> and <code class="language-plaintext highlighter-rouge">1</code> have been visited, and <code class="language-plaintext highlighter-rouge">2</code> is next.</p>
<p>Brady himself seemed to take a somewhat dim view of his algorithm, usually referring to it as an <strong>“algorithm”</strong>, with <strong>“scare quotes”</strong>.</p>
<blockquote>
<p>The [tree generation] process … is not an algorithm in the strict sense because of the dependency upon a solution to the halting problem; hence the quotes.</p>
</blockquote>
<p>I believe this is due to a <strong>fault in perspective</strong> rather than a fault in the algorithm. Viewed as a solution to the halting problem, it is certainly lacking, because there is no algorithm at all that can solve it. But it <em>is</em> an effective procedure for <em>something</em>, namely <strong>the set of programs thereby generated</strong>!</p>The aim of the Busy Beaver game is to find the longest running Turing machine program of n states and k colors. Busy Beaver programs could in principle be written by hand, but nobody has ever succeeded in doing so. Instead, these programs are discovered through exhaustive search. This is done by a two-stage process: