The Kolakoski stream

2018-01-19 ❖ by Roberto Bonvallet

The Kolakoski sequence is composed of ones and twos. Try to guess what the pattern is, and click or tap the pink box to reveal a clue that will give away the answer:

1 2 2 1 1 2 1 2 2 1 2 2 1 1 1 2 2 1 1 2 1 2 2 Show clue

This is a very interesting sequence, with all sorts of cool properties. Some of them are very profound and, to some degree, have a level of spiritual relevance that trascends their mere existence from a set of numbers by reflecting the way in which the universe looks into itself.

The previous paragraph was just a filler so you couldn’t peek so easily at the answer, which is: the Kolakoski sequence is composed of alternating runs of ones and twos whose lengths are determined by the sequence itself.

It does look into itself. Like the universe.

Take a look again to see if you got it, and then try to determine what the next items from the sequence are. I’ll be letting you know how wrong you are:

Now that you are a Kolakoski expert —a Kolakoskeer, if you will—, I’ll teach you how to describe the sequence as a Scala collection. The entirety of it!

Streams

In Scala, a stream is a linked list whose tail is lazily evaluated. In other words, it knows what its first element is but it doesn’t know the rest until someone asks for them:

val people = Stream("Alicia", "Bernardo", "Cristóbal", "Débora", "Ernesto")
println(people)      // Stream(Alicia, ?)
println(people(2))   // Cristóbal
println(people)      // Stream(Alicia, Bernardo, Cristóbal, ?)

The printed representation of a stream tells you how much of it is already computed. Initially the stream knows that it starts with Alicia and nothing more. After you ask for its third element, Bernardo and Cristóbal become and remain computed.

There are two good reasons why streams will help us with our goal.

First, streams are good for describing infinite sequences. As long as we don’t do anything stupid, such as computing the average of its values or trying to get its last element, laziness will take care of making a stream appear as if it had no end.

Second, streams can be manipulated and transformed just like any other Scala collection. Also, the standard library provides some convenient functions to create streams that we can use as a starting point.

In order to accustom ourselves to the idea of manipulating an infinite sequence, let’s examine some examples.

No odd stuff in this stream

This is the stream of all the integers:

val integers = Stream.from(0)

And this is the stream of all the even integers:

val evenIntegers = integers.filter(_ % 2 == 0)

This is also a stream of all the even integers:

val evenIntegers = integers.map(_ * 2)

This, again, is the stream of all the even integers, described as its initial value and an operation for getting a new one from the previous one:

val evenIntegers = Stream.iterate(0)(_ + 2)

Finally, this is the stream of even integers described in terms of itself by using #::, the stream cons operator:

val evenIntegers: Stream[Int] = 0 #:: evenIntegers.map(_ + 2)

A prime stream

Let’s create, one more time, the stream of integers. When we print it, we’ll see that it only knows it starts with zero. Nobody has any idea what could be there after that:

val integers = Stream.from(0)
println(integers)  // Stream(0, ?)

The stream of prime numbers is easy to create:

def isPrime(n: Int) = n > 1 && (2 to n/2).forall(n % _ != 0)
val primes = integers.filter(isPrime)

The prime stream knows it starts with 2, and to learn that it needed to consume a couple of values from the integer stream:

println(integers)  // Stream(0, 1, 2, ?)
println(primes)    // Stream(2, ?)

If we ask for the fifth prime, even more values from the integer stream need to be evaluated:

println(primes(4)) // 11
println(integers)  // Stream(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, ?)
println(primes)    // Stream(2, 3, 5, 7, 11, ?)

And now comes the party trick: one can assign one prime number to each person without worrying about how many people there are!

val people = List("Alicia", "Bernardo", "Cristóbal", "Débora", "Ernesto")
println(people zip primes)
// List((Alicia,2), (Bernardo,3), (Cristóbal,5), (Débora,7), (Ernesto,11))

You can tell how fun my parties can get.

Back to Kolakoski

The Kolakoski sequence begins with 1, 2, and 2, and then it uses itself to calculate the rest of the values:

val kolakoski: Stream[Int] = 1 #:: 2 #:: 2 #:: rest(kolakoski.drop(2))

Note that we have to drop the first two members of the sequence since the first two runs are already provided.

The numbers in the sequence tell how long each run is, but not what the number in the run is. For that we need an infinite sequence of alternating ones and twos, which I’ll define like this:

val onesAndTwos = Stream.continually(List(1, 2)).flatten

Can you think of other ways to create that sequence? I know I can:

val onesAndTwos2 = Stream.from(0).map(_ % 2 + 1)
val onesAndTwos3 = Stream.iterate(1) { case 12 case 21 }
val onesAndTwos4 = Stream.iterate(1)(3 - _)
val onesAndTwos5: Stream[Int] = 1 #:: 2 #:: onesAndTwos5

Creating one run of the same thing is something Scala already knows how to do:

val fourHorsemen = Seq.fill(4)("horseman")
assert(fourHorsemen == Seq("horseman", "horseman", "horseman", "horseman"))

What’s left is just assembling the pieces. And when it comes to collections, “assembling the pieces” is spelled with an f:

def rest(runLengths: Stream[Int]): Stream[Int] = {
  runLengths zip onesAndTwos flatMap { case (k, n) ⇒ Seq.fill(k)(n) }
}

Simple, isn’t it? Streams are an effective way to define this infinite sequence in a way that corresponds quite directly to the prose description, once you learn what #:: is and how flatMap is pronounced.

If you liked this post, I invite you to continue lazily reading the infinite stream of articles I have created for my fine visitors. Be aware that evaluation could take a while.