The classic Busy Beaver game asks how long an n-state k-color Turing machine program can run before executing a halt instruction. The phrase “executing a halt instruction” is not normally used in stating the problem; instead, people just say “halting”. But executing a halt instruction is just one way for a program to signal that it is finished. From the point of view of a programmer trying to maximize machine steps, it’s about the worst termination signal possible because it requires wasting valuable program space on the halt instruction. If other termination conditions are allowed, programs can run for much, much longer.

I recently discovered one such program. With just four states and two colors, it undertakes a computation for 32,779,478 steps before signalling termination. (Don’t worry, we’ll get to how exactly this “termination” is “signalled”.) Among programs with an explicit halt instruction the record is 107 steps, so this new champion is a shocking improvement. A slightly more liberal understanding of program behavior vastly increases expressive power.

You might think that such a program would be complicated and difficult to understand. This, at least, is what the Spaghetti Code Conjecture predicts. In fact, the Spaghetti Code Conjecture might be false, and the program is remarkably elegant. It’s so straightforward that it can be derived from scratch.

# Number Theory

First, a little number theory. Consider the following function:

Here’s a description in plain language. L checks if its input is divisible by 3. If it is, the output is 0; otherwise, some transformation is applied to the input and L is called again, recursively, on the result. Is L total? That is, does it return an answer for all inputs? That’s an open question.

Here’s another function with a similar flavor:

Is C total? The Collatz Conjecture says that it is, but that too is an open question. But forget about C. We’re interested in L.

It usually takes no more than a few iterations to caluclate L(n), but occasionally it takes longer. In particular, it is strange but true that L(2) requires fourteen iterations to calculate. Keep this fact in mind for later.

# C implementation of L

We’re going to write a Turing machine program that implements L.

No, wait, actually we won’t write a Turing machine program. My last post got submitted to a few discussion boards under the topic of “programming”. Several commenters complained that it was not about programming at all, but rather theoretical computer science, and that it didn’t have any code in it, and that it wasn’t practical. So instead of writing “theoretical” Turing machine code, we’ll implement L in an eminently practical language, namely C. Who would dare question the real-world usefulness of C? What better language is there for “real programmers” to “get things done”?

So, the goal is to write a C program that implements L. But not just any program; we need a program written pursuant to some unusual and severe constraints. First, there will be just one header included:

This file defines a `short` array, which we’ll refer to as the tape. Every cell of the tape is initialized to `0` unless otherwise specified. The exact length of the tape is unknown, but it is plenty long for our purposes. There is no need to worry about checking any bounds.

There is also a pointer into the tape array, but we won’t have direct access to it. Instead, the header provides four tape-manipulation macros:

• `PRINT`: writes a `1` to the current cell;
• `ERASE`: writes a `0` to the current cell;
• `RIGHT`: moves the tape pointer one cell to the right; and
• `LEFT`: moves the tape pointer one cell to the left.

A fifth macro, `BLANK`, returns whether the current cell contains a `0`. It doesn’t execute any side effects.

Those are the primitive operations to which we have access. Beyond that, our `main` function will have a specific form:

• There will be exactly four labels.
• Under each label will be an `if` / `else` block with `(BLANK)` as the test condition.
• The body of each branch will contain three statements:
1. either `PRINT;` or `ERASE;`, and
2. either `LEFT;` or `RIGHT;`, and
3. a jump to one of the four labels (i.e. a statement of the form `goto <LABEL>;`)

Note that if the current cell if `0` and `ERASE` is executed, nothing will happen, and similar for `PRINT` on `1`. So the print / erase statements can be regarded as optional.

We’re trying to implement L, which means a program that will calculate L(n) given an input n. The input n will be represented as a block of n consecutive marks (`1`’s) on the tape, with all other cells blank (`0`). How exactly these marks get onto the tape will be left unspecified for now, but we’ll return to this point later. The initial position of the tape pointer will be at the first blank cell directly to the right of the block of marks. We call this the initial configuration1. Here is what the initial position looks like when the input is 13 (where `^` indicates the current position of the tape pointer and `*` indicates the starting cell of the tape pointer):

Let’s take another look at the definition of L:

With a full range of standard math operations available, this would be easy. Divide n by 3 to get k, then multiply k by 5 and add the rest of the terms. That would be nice, wouldn’t it? But the available operations are too low-level for that; in this setting, division and multiplication are quite sophisticated.

Remember that n is represented in unary notation, and our operations are limited to shuffling around tally marks. The fundamental algorithm will be based around this idea: subtract 3 from n, then add 2 marks outside the initial block, then fill in the 3 that were subtracted. This procedure will then be iterated in the style of Shlemiel the Painter to count through every group of 3 marks in n. The remainder will be left alone, and a few extra marks will be added along the way to demarcate the working areas of the tape. The end result will be that where there were previously 3k + r marks, now there are 5k + r + 3.

When the program starts, control is at the first label and the current cell (according to the stipulated intial configuration) is blank. The first move will be to set down what we’ll call the post. The post demarcates the right edge of the initial block. Then we move left and continue on until we find the start of the block, which at this point will be just one cell away. Later on, however, it will be further.

Once we get to the start of the block, control is still with the initial label. As per the fundamental algorithm, we need to count exactly three marks. By virtue of having found the block, the tape pointer is already on the first mark, so that’s one. We need to count two more. Leave the current mark alone, then move left and transfer to the next label.

So far, the program looks like this:

And the tape looks like this:

Once again, we leave the mark alone, move left, and transfer to the next label.

Now the tape pointer is at the third mark from the right of the initial block:

The fundamental algorithm says to subtract 3 and then add 2 and then add the 3 back. So we’ll erase all the marks we just counted. In fact, we’ll wipe everything until we find the edge of the block, including the post:

Control will remain with this label for a few steps. Eventualy the tape pointer will reach a blank cell, namely the cell directly to the right of the initial square. We leave the cell blank, then move right one more square, then transfer control back to the starting label, `POST_AND_FIND`.

The current cell is two to the left of the initial square. Three marks have been erased from the starting input block, as per the fundamental algorithm. `POST_AND_FIND` will now set down a new post and find what’s left of the block, filling in all the blank squares along the way.

This process will go through a few more iterations, each time moving three marks further into the initial block and two spaces further to the right of the starting square. This is how 3k is transformed into 5k.

Here is the program so far:

On the program side, the blank instruction for `COUNT_SECOND_MARK` hasn’t been used, and the fourth label hasn’t yet been mentioned. On the math side, we still don’t know how to handle the remainder, if there is one, or how to end the computation if there isn’t. Eventually the tape will look like this:

There is just one mark left, representing the remainder of thirteen when divided by three. Control is with `POST_AND_FIND`, which fills in the blanks and transfers control to `COUNT_SECOND_MARK`:

The remainder has been found, which means every group of three has been extended by two. We print a mark and transfer to the fourth and final label, `FINALIZE`. This label will just move back across the block of marks until it finds a blank, at which point it will print, move right, and transfer control to the starting label:

To review: two marks were added for each group of three marks; the remainder was left alone; and the post in the middle plus a mark on either end of the block makes three. 3k + r is transformed into 5k + r + 3, and the program is once again in the initial coniguration, ready for another iteration.

So far we’ve handled the case where there is a remainder of one. What about when there is a remainder of two? `POST_AND_FIND` counts the first mark, then `COUNT_SECOND_MARK` counts the second, and then control goes to `COUNT_THIRD_MARK` on a blank cell. Well, we’ve already developed a remainder-handling subroutine, so why not reuse that? Leave the cell blank, move right, then go back to `POST_AND_FIND`. One case is reduced to the other.

At last we have the glorious final version of the program:

What happens when there is no remainder? According to the definition of L, the output should be zero. There’s no more program left to write, so hopefully that’s what happens! After the third mark is counted, `COUNT_THIRD_MARK` wipes everything to the right. But by hypothesis, there is nothing else to the left, since the third mark was the very last one. So every remaining mark on the tape will get wiped, and then control gets passed back to `POST_AND_FIND`.

When that label executes and the tape is totally blank (that is, there is an input of 0), there is no block to find, so it will start printing marks and moving to the left forever. But observe that this behavior is both obviously degenerate and easily detectable. Therefore we can plausibly regard this simple instance of Lin recurrence as the program’s means of signalling termination.

# Getting the input onto the tape

So, we have a program that will calculate L(n) given an initial input of n, and we know that L(2) runs for a long time. But the rules of the Busy Beaver game say that the tape must start off blank. How does the initial input get onto the tape?

This is a question in the field of Turing machines that is terribly under-discussed. Great efforts are made to find the “simplest possible machine” to do this or that, but the measure of “simplicity” never seems to include the tape inputs. How much computation is required to get the input onto the tape? Is the “simple machine” doing the computation, or is it really just consuming pre-digested mush? What is the true complexity of the total end-to-end computational process?

From the Busy Beaver perspective, getting an input for free is cheating. The blank tape initial condition provides a level playing field by providing nothing whatsoever in advance. Every program has to prepare its own input, and in this way the total complexity is accurately measured.

As it happens, one simple tweak will suffice to prepare the required input. Simply rearrange the labels so that program execution begins with `COUNT_SECOND_MARK` instead of `POST_AND_FIND`:

With this, the first few steps of the program on the blank tape are:

Now there are two marks on the tape and the program is in the initial configuration, ready to rock. The long computation is executed and a high score is achieved.

# From C to Turing Machine

If you’re still reading this, you’ve probably guessed that the bizarre constraints placed on the C program are based on the logical structure of Turing machine programs. When we say that a Turing machine has n “states”, this corresponds to a C program with n labels. “State” is a crappy name, and we should really talk about “labels” instead. The “colors” or “symbols” of a Turing machine correspond to the numbers that can go into the cells of the tape array.

Renaming the labels of the C program to single letters, we get the code for the champion Turing machine:

Or, in string notation:

# Structured Programming

We’ve seen how the champion program works, but the control flow is not exactly obvious from the `goto`-ridden code. Clarity can be improved by introducing standard structured programming operators. For example, consider the `COUNT_THIRD_MARK` label:

This is equivalent to a `while` loop:

Using this and a few other basic transformation techniques, the program can be transformed into fully well-structured code:

Observe that there are no `break` or `continue` statements, and there is just one branch point. There is even a `do-while` loop.

# Who wrote this program?

I hope that you are convinced by now that this program is not some kind of inscrutable tangle of spaghetti code. There are subroutines that execute particular actions as if in the service of a plan of some kind. It could have been written by somebody.

And yet, it wasn’t. Nobody wrote it. It was just floating out there in the full space of 4-state 2-color Turing machine programs, and I managed to find it. I did not write it. I woke up one morning and found it on my screen. I didn’t have a clue what it did until Shawn Ligocki reverse engineered it. He determined that the program’s behavior could be described by the function L, and from there I was able to work out the fine details of the program’s instructions.

# Turing machine programming exercise

The following exercise is due to Bruce Smith.

The program described here starts on the blank tape and then the tape is blank again after 32,779,477.

Add one label to the program so that it reaches the blank tape after 32,810,047 steps.

Hint: take advantange of the `do-while` loop.

# Footnotes

1 This initial configuration differs slightly from what some authors refer to as the standard configuration. But that “standard” is just a made-up convention, so who cares?