Monday, March 1, 2010

Fun with Large Numbers

I haven't been blogging much. In part that is because I've been using Buzz instead. (Mostly to tell a joke a day.) However this time I've got a topic of interest to blog about: large numbers.

Be warned. If thinking about how big numbers like 9^9^9^9 really are hurts your head, you may not want to read on.

It isn't hard to find lots of interesting discussion of large numbers. See Who can name the bigger number? for an example. When math people go for big numbers they tend to reach for things like the Busy Beaver problem. However there are a lot of epistemological issues involved with that; for instance, there is a school of mathematical philosophy called constructivism which denies that the Busy Beaver problem is well-formulated or that that sequence is well-defined. I may discuss mathematical philosophy at some future point, but that is definitely for another day.

So I will stick to something simpler. Many years ago in sci.math we had a discussion that saw several of us attempt to produce the largest number we could following a few simple ground rules. The rules were that we could use the symbols 0 through 9, variables, functions (using f(x, y) notation), +, *, the logical operators & (and), ^ (or), ! (not), and => (implies). All numbers are non-negative integers. The goal was to use at most 100 non-whitespace characters and finish off with Z = (the biggest number we can put here). (A computer science person might note that line endings express syntactic intent and should be counted. We did not so count.)

A non-mathematician's first approach would likely be to write down Z = 999...9 for a 98 digit number. Of course 9^9^9^9 is much larger - you would need a number hundreds of millions of digits long just to write out how many digits it has. But unfortunately we have not defined exponentiation. However that is easily fixed:

p(n, 0) = 1
p(n, m+1) = n * p(n, m)

We now have used up 25 characters and have enough room to pile up a tower of exponents 14 deep.
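Transcribed into code (Python is my choice of illustration here, not something from the contest), the definition is just recursion on the second argument:

def p(n, m):
    # Exponentiation n**m built from nothing but multiplication,
    # mirroring the two rules above.
    if m == 0:
        return 1
    return n * p(n, m - 1)

# Sanity check on a small tower: p(3, p(3, 3)) is 3**27.
assert p(3, p(3, 3)) == 3 ** 27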

Of course you can do better than that. Anyone with a CS background will start looking for the Ackermann function.

A(0, n) = n+1
A(m+1, 0) = A(m, 1)
A(m+1, n+1) = A(m, A(m+1, n))

That's 49 characters. Incidentally there are many variants of the Ackermann function out there. This one is sometimes called the Ackermann–Péter function in the interest of pedantry. But it was actually first written down by Raphael M. Robinson.
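For anyone who wants to play along, here is a direct transcription of those three rules (again in Python, my choice). Keep the arguments tiny - the whole point of what follows is that this function gets out of hand almost immediately:

import sys
sys.setrecursionlimit(100000)  # even small inputs recurse deeply

def A(m, n):
    # The Ackermann(-Peter) function, straight from the rules above.
    if m == 0:
        return n + 1
    if n == 0:
        return A(m - 1, 1)
    return A(m - 1, A(m, n - 1))

print(A(2, 3))  # 9
print(A(3, 3))  # 61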

(A random note. When mathematicians define rapidly recursing functions they often deliberately pick ones with rules involving +1, -1. This is not done out of some desire to get a lot done with a little. It is done so that they can try to understand the pattern of recursion without being distracted by overly rapid initial growth.)

However the one thing that all variants on the Ackermann function share is an insane growth rate. Don't let the little +1s fool you - what really matters to growth is the pattern of recursion, and this function has that in spades. As it recurses into itself, its growth keeps on speeding up. Here is its growth pattern for small m. (The n+3/-3 meme makes the general form easier to recognize.)

A(1, n) = 2 + (n+3) - 3
A(2, n) = 2 * (n+3) - 3
A(3, n) = 2^(n+3) - 3
A(4, n) = 2^2^...^2 - 3 (a tower of 2s, n+3 high)

There is no straightforward way to describe A(5, n): basically it takes the tower-building operation that showed up in A(4, n) and iterates it n+3 times, then subtracts 3. That in turn is the starting point for A(6, n). And so on.
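Those closed forms are easy to spot-check mechanically for the first few rows (memoized this time, since the naive recursion revisits the same calls over and over):

from functools import lru_cache

@lru_cache(maxsize=None)
def A(m, n):
    if m == 0:
        return n + 1
    if n == 0:
        return A(m - 1, 1)
    return A(m - 1, A(m, n - 1))

for n in range(5):
    assert A(1, n) == 2 + (n + 3) - 3
    assert A(2, n) == 2 * (n + 3) - 3
    assert A(3, n) == 2 ** (n + 3) - 3
assert A(4, 0) == 2 ** 2 ** 2 - 3  # the tower is 0+3 = 3 high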

By most people's standards, A(9, 9) would be a large number. We've got about 50 characters left to express something large with this function. :-)

It is worth noting that historically the importance of the Ackermann function was not just to make people's heads hurt, but to demonstrate that there are functions that can be expressed with recursion yet grow too quickly to fall into the simpler class of primitive recursive functions. In CS terms you can't express the Ackermann function with just nested for loops with variable iteration counts. You need a while loop, recursion, goto, or some other more flexible programming construct to generate it.
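In code terms again (Python, with function names of my own choosing): the usual demonstration is to compute Ackermann with a while loop driving an explicit stack, exactly the kind of unbounded control flow that primitive recursion forbids:

def ackermann_iterative(m, n):
    # A while loop plus an explicit stack stands in for recursion.
    # No fixed nest of bounded for-loops could do this job.
    stack = [m]
    while stack:
        m = stack.pop()
        if m == 0:
            n += 1
        elif n == 0:
            n = 1
            stack.append(m - 1)
        else:
            n -= 1
            stack.append(m - 1)
            stack.append(m)
    return n

assert ackermann_iterative(2, 3) == 9
assert ackermann_iterative(3, 3) == 61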



Of course with that many characters to work with, we can't be expected to be satisfied with the paltry Ackermann function. No, no, no. We're much more clever than that! But getting to our next entry takes some background.

Let us forget the rules of the contest so far, and try to dream up a function that in some way generalizes the Ackermann function's approach to iteration. Except we'll use more variables to express ever more intense levels of recursion. In fact, let's use an unbounded number of variables. I will call the function D, for Dream function, because we're just dreaming at this point. Let's give it these properties:


D(b, 0, ...) = b + 1

D(b, a_0 + 1, a_1, a_2, ..., a_n, 0, ...)
= D(D(b, a_0, a_1, ..., a_n, 0, ...), a_0, a_1, ..., a_n, 0, ...)

D(b, 0, ..., 0, a_i + 1, a_{i+1}, ..., a_n, 0, ...)
= D(
D(b, b-1, ..., b-1, a_i, a_{i+1}, ..., a_n, 0, ...),
b-1, b-1, ..., b-1,
a_i, a_{i+1}, ..., a_n, 0, ...
)

(In the last rule the runs of b-1 fill exactly the i leading positions that were 0 on the left.)

There is a reason for some of the odd details of this dream. You'll soon see why b and b-1 come into things. But for now notice that the pattern with a_0 and a_1 is somewhat similar to m and n in the Ackermann function. Details differ, but recursive patterns like the ones in the Ackermann function crop up here too.

D(b, a_0, 0, ...) = b + 2^(a_0)
D(b, a_0, 1, 0, ...) ≈ a tower of 2s roughly 2^(a_0) high

And if a_1 is 2, then the tower-building operation itself gets iterated (with some complex junk in the details). And you continue on through various such growth patterns.
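To make the dream concrete, here is a sketch of D operating on an explicit coefficient list (least significant coefficient first), under my reading of the carry rule above. Only toy inputs terminate in a human lifetime, which is rather the point:

def dream(b, coeffs):
    # D(b, a_0, a_1, ...) with the coefficients as a list.
    if all(c == 0 for c in coeffs):
        return b + 1
    # "Subtract one" from the coefficient list: borrowing turns the
    # leading zeros into b-1s, just like the carry rule says.
    i = next(k for k, c in enumerate(coeffs) if c > 0)
    m = [b - 1] * i + [coeffs[i] - 1] + list(coeffs[i + 1:])
    new_base = dream(b, m)     # inner call: D(b, <decremented>)
    return dream(new_base, m)  # same coefficients, new base

assert dream(4, [3]) == 4 + 2 ** 3  # the bottom level: b + 2**a_0
assert dream(2, [1, 1]) == 70       # one level up and already at 70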



But then we hit D(b, a0, a1, 1, 0, ...). That is kind of like calling the Ackermann function to decide how many times we will iterate calling the Ackermann function against itself. In the mathematical literature this process is called diagonalization. And it grows much, much faster than the Ackermann function. With each increment of a2 we grow much faster. And each higher variable folds in on itself to speed up even more. The result is that we get a crazy hierarchy of insane growth functions that grow much, much, much faster. Don't bother thinking too hard about how much faster, our brains aren't wired to really appreciate it.

Now we've dreamed up an insanely fast function, but isn't it too bad that we need an unbounded number of variables to write it down? Well actually, if we are clever, we don't. Suppose that b is greater than a_0, a_1, ..., a_n. Then we can represent that whole set of variables with a single number, namely m = a_0 + a_1*b + ... + a_n*b^n. Our dream function can then be recognized as computing D(b, m+1) by subtracting one, then replacing the base b with D(b, m) while leaving all of the coefficients alone. So this explains why I introduced b, and all of the details about the b-1s in the dream function I wrote.
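The bijection here is nothing deeper than base-b positional notation; a quick sketch (function names mine):

def digits(m, b):
    # The coefficient list a_0, a_1, ..., a_n of m in base b.
    out = []
    while m > 0:
        m, d = divmod(m, b)
        out.append(d)
    return out

def undigits(coeffs, b):
    return sum(a * b ** k for k, a in enumerate(coeffs))

assert digits(321, 10) == [1, 2, 3]
assert undigits([1, 2, 3], 10) == 321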

Now can we encode this using addition, multiplication, non-negative integers, functions and logic? With some minor trickiness we can write the base rewriting operation:


B(b, c, 0) = 0
i < b => B(b, c, i + j*b) = i + B(b, c, j) * c

Since all numbers are non-negative integers and i is required to be less than b, the decomposition of the third argument as i + j*b is unique, so the second rule leads to an unambiguous result. The first and second rules can both apply when the third argument is 0, but that is OK since they lead to the same answer. And so far we've used 40 characters (counting => as 2 characters).
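In runnable form (Python again), B peels off the low digit, recurses on the rest, and reassembles with the new base:

def B(b, c, n):
    # Rewrite n's base-b representation as a base-c number.
    if n == 0:
        return 0
    i, j = n % b, n // b  # the unique split n = i + j*b with i < b
    return i + B(b, c, j) * c

# 6 is 110 in base 2; read "110" as a base 10 numeral and you get 110.
assert B(2, 10, 6) == 110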

This leads us to be able to finish off defining our dream function with:

D(b, 0) = b + 1
D(b, n+1) = D(D(b, n), B(b, D(b, n), n))

This took another 42 characters.

This leaves us with 18 characters, two of which have to be Z=. So we get

Z = D(2, D(2, D(2, 9)))

So our next entry is

B(b, c, 0) = 0
i < b => B(b, c, i + j*b) = i + B(b, c, j) * c
D(b, 0) = b + 1
D(b, n+1) = D(D(b, n), B(b, D(b, n), n))
Z = D(2, D(2, D(2, 9)))
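For what it is worth, the entry runs as ordinary code, at least for the first few values (a direct transcription; everything past D(2, 3) is hopeless):

def B(b, c, n):
    if n == 0:
        return 0
    return n % b + B(b, c, n // b) * c

def D(b, n):
    if n == 0:
        return b + 1
    new_base = D(b, n - 1)
    return D(new_base, B(b, new_base, n - 1))

print([D(2, n) for n in range(4)])  # [3, 4, 6, 70]
# D(2, 4) is already a number with about 3 * 10**20 digits. Don't.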

We're nearly done. The only thing I know to improve is one minor tweak:

B(b, c, 0) = 0
i < b => B(b, c, i + j*b) = i + B(b, c, j) * c
T(b, 0) = b * b
T(b, n+1) = T(T(b, n), B(b, T(b, n), n))
Z = T(2, T(2, T(2, 9)))

Here I changed D into T, and made the 0 case be something that has some growth. This starts us off with the slowest growth being T(b, i) at around b^(2^(2^i)), and then everything else gets sped up from there. This is a trivial improvement in overall growth - adding a couple more to the second parameter would be a much bigger win. But if you're looking for the largest, every small bit helps. And modulo a minor reformatting and a slight change in the counting, this is where the conversation ended.
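The same transcription with the tweaked base case shows the doubly exponential start (and T(2, 3) is already out of reach):

def B(b, c, n):
    if n == 0:
        return 0
    return n % b + B(b, c, n // b) * c

def T(b, n):
    if n == 0:
        return b * b
    new_base = T(b, n - 1)
    return T(new_base, B(b, new_base, n - 1))

print([T(2, n) for n in range(3)])  # [4, 16, 65536], i.e. 2**(2**(2**i))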

Is this the end of our ability to discuss large numbers? Of course not. As impressive as the function that I provided may be, there are other functions that grow faster. For instance consider Goodstein's function. It works with hereditary base representations - the exponents are themselves recursively written out in base b - and all of the growth patterns in the function that I described are realized there before you even get to b^b. In a very real sense the growth of that function is as far beyond the one that I described as the one that I described is beyond the Ackermann function.
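Goodstein's function is outside our contest language, but its key step is easy to sketch (names mine): rewrite n in hereditary base b, bump the base, and subtract one. The recursive rewriting of exponents is what leaves the single-level rewriting of B so far behind:

def rebase(n, b, c):
    # Hereditary base change: unlike B, the exponents get rewritten too.
    result, k = 0, 0
    while n > 0:
        n, d = divmod(n, b)
        if d:
            result += d * c ** rebase(k, b, c)
        k += 1
    return result

def goodstein_prefix(n, steps):
    # The first few terms of the Goodstein sequence starting at n.
    seq, b = [n], 2
    for _ in range(steps):
        if n == 0:
            break
        n = rebase(n, b, b + 1) - 1
        b += 1
        seq.append(n)
    return seq

print(goodstein_prefix(3, 6))  # [3, 3, 3, 2, 1, 0]
print(goodstein_prefix(4, 3))  # [4, 26, 41, 60] and climbing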

If anyone is still reading and wants to learn more about attempts by mathematicians to discuss large (but finite) numbers in a useful way, I recommend Large Numbers at MROB.
