Author here. The paper is currently in a state of flux, as I started adding material about the graphical notation, but didn't get very far before my attention drifted to other topics. So please have a look at the following page as well:

http://tromp.github.io/cl/cl.html

which links to an explanation of the graphical notation, to my corresponding IOCCC entry from 2012, and to a Wikipedia page on Binary Lambda Calculus that has since been deleted.

I am loving this paper so far; it brings together some nice ideas with a good eye for which subtopics to explore. The graphical notation is reminiscent of certain graphics that can be expressed with a recursive PNG tool at http://williamsharkey.com/PNG-Gen.html . This paper is the first time I made the effort to learn lambda calculus notation and work through the examples, so you deserve credit for motivating me.

Is lambda calculus a good match for algorithmic information theory? The versions of AIT I've read involved either Turing machines, hand-waving about computation, or recursive functions. Lambda calculus seems more complex and thus harder to integrate with an AIT that mostly needs to know that strings are programs, that not all programs terminate, that some programs can give a substring of their string as output, and so forth.

In my view lambda calculus is a perfect match for AIT since the language is both so simple and so expressive. This should be readily apparent from the concise self-interpreter. The versions of AIT that you've seen based on TMs probably didn't have any explicit constants because determining them would be much too cumbersome. Chaitin was the first to prove theorems in AIT with actual programs and he resorted to LISP. I'm positioning lambda calculus as an even better choice.

Lambda calculus is certainly tricky for counting steps, since a single step of beta-reduction can actually do a lot of "work", unlike e.g. a Turing machine where each step takes constant resources.

For AIT, the binary encodings used in this paper have the nice property that they're complete and prefix-free, meaning that every binary string either decodes to a term (perhaps using only an initial portion of the string), or can be extended (arbitrarily) until it gives rise to a term.

At first glance, it feels like it would be hard for lambda terms to "give a substring of their string as outputs", since that would require re-encoding a subterm. However, I don't think that's as bad as it sounds. For comparison, consider that a Turing machine can only move its head along the tape one cell at a time (we could use any finite step size, but we usually stick to one), which makes it difficult/impossible to "preserve" a section of tape. Hence Turing machines which "give a substring of their string as outputs" might sound simple, but would probably need to be somewhat complicated, to ensure that any "corruption" caused while traversing the tape gets "undone" afterwards.

(Unless they're multitape machines, like monotone Turing machines!)

IMHO, function composition is a better computational model than function application for expressing algorithmic information theory. Concatenative languages (e.g. Joy) are probably the closest examples, though popular examples are not optimal by any means.

Is there a prose explanation or justification for Algorithmic Information Theory? I think I first heard of it when I saw your IOCCC entry, but I was never able to wrap my mind around what the goal of the whole field was.

I guess it provides an objective definition of when something is simple or complicated. This could have applications in statistics or machine learning, where "simpler" hypotheses could be less likely to overfit. There are computable measures of "simplicity" (unlike Kolmogorov complexity, which is uncomputable) based on compression algorithms or computation time (Levin complexity). There are also apparently connections to statistical mechanics, but I haven't looked into it and I don't know if it's useful.

If you like this, see also [1], and Joy as well as Iota and Jot [2] (programming languages). And maybe "Algorithmically probable mutations reproduce aspects of evolution such as convergence rate, genetic memory, and modularity": https://arxiv.org/abs/1709.00268v8

> In the context of his Metabiology programme, Gregory Chaitin, a founder of the theory of algorithmic information, introduced a theoretical computational model that evolves ‘organisms’ relative to their environment considerably faster than classical random mutation. While theoretically sound, the ideas had not been tested and further advancements were needed for their actual implementation. Here we follow an experimental approach heavily based on the theory that Chaitin himself helped found. We apply his ideas on evolution operating in software space on synthetic and biological examples and even if further investigation is needed this work represents the first step towards testing and advancing a sound algorithmic framework for biological evolution.

As a layman, it is a personal aspiration to be able to understand this precise paper. It's very appealing to be able to express and reason about the related problems of complexity and edit distances using a generative(?) notation like lambda calculus that you can hack on in accessible languages like Haskell.

I also like that the citations include Emperor's New Mind; Godel, Escher, Bach; the Brainfuck homepage, and Haskell. The effects of these ideas on a couple of generations of young minds are bearing fruit.

It feels like we're on the brink of gamifying a lot of important math.

Is there a video lecture version of this paper that someone can recommend?

This is a great paper. But I burn so much mental energy trying to focus when I read something vs. when I listen to a video lecture. So I end up retaining more by watching things, and even more when I take some notes.

Not sure about this specific paper, but I've been following John Tromp's work on this for a while. Its current homepage seems to be on GitHub https://tromp.github.io/cl/cl.html

The paper assumes knowledge of combinatory logic and lambda calculus, which you could probably find videos about. This paper uses pretty standard notation, so almost any video/course/book/blog/etc. should be OK. The main thing this paper does is to define a way to encode such programs as a string of bits.

Note that combinatory logic is one of the simplest Turing-complete programming languages, but it's so simple that it's basically unusable for anything more elaborate than toy examples (the paper actually has a quote from Chaitin saying this!). Once you're comfortable playing with little examples containing a handful of symbols, that's basically all you really need.

If you're familiar with other programming languages, it's usually pretty easy to implement combinatory logic and play with it. For example, here are 'S' and 'K' in Javascript:

function k(x) { return function(y) { return x; }; }
function s(x) { return function(y) { return function(z) { return x(z)(y(z)); }; }; }

This isn't quite right, due to most languages using strict evaluation, but the following Haskell is essentially correct:

k x y = x
s x y z = x z (y z)

There's also the esoteric language Unlambda, which implements these directly, including the "monadic IO" that the paper mentions: https://en.wikipedia.org/wiki/Unlambda

Lambda calculus is more tricky than combinatory logic, since it contains variables. It's also more widely used (e.g. as the basis for many functional programming languages, like Haskell and Scheme), so there should be more material available. In essence, when you see something like:

λa b c.x y z

You can think of it as acting like this Javascript:

function(a, b, c) {
return x(y)(z);
}

When it comes to Kolmogorov complexity, maybe the Wikipedia page and its citations will help ( https://en.wikipedia.org/wiki/Kolmogorov_complexity )? Notice that all of the definitions, etc. on that page assume that we're talking about some particular programming language or machine, but the examples use pseudocode rather than a "real" programming language. This paper is basically saying that we should use lambda calculus as that language, and we can measure the size/length of a program by encoding it as binary according to the method given.

HN feels psychic sometimes!... this paper happens to tie together a lot of the foundations I was lacking (out of ignorance) that I need for a personal research project. So happy to have found it :)

