Y-Not? Adventures in Functional Programming (Part III)

In Part I we explored the historical context leading up to the Y-Combinator.

In Part II we laid the foundation for understanding the Y-Combinator

Finally, in Part III we see the Y-Combinator in the solution to the recursion problem of Lambda Calculus.

Solving the Recursion Problem

Its time to attack the problem of recursion with anonymous functions. Remember our factorial example?

  ->(n) { n.zero? ? 1 : n * fact.(n-1) }

Since this function is anonymous, the name fact is not bound to anything. One thing we can do to fix this is to wrap the entire thing in a function that takes fact as an argument. Let's call this higher order function make_fact.

  make_fact = ->(fact) {
    ->(n) { n.zero? ? 1 : n * fact.(n-1) }
  }

Since fact is now bound, there is no problem calling fact from within the function. So to create a factorial function, all we need to do is call make_fact with a definition of factorial. If that sounds a little circular, you are right.

Ok, maybe we don't need a full definition of factorial to start with. I'm willing to lower my expectations a bit. Rather than require a full factorial function, I would be happy with a factorial function that operated on a subset of the possible inputs for factorial. We call this kind a function a "partial" function, it is only valid on a partial set of possible inputs.

For example, given the following definition:

  fact_improver = ->(partial) {
    ->(n) { n.zero? ? 1 : n * partial.(n-1) }
  }

If we feed the factimprover function a partial factorial that is defined over the inputs 0 through 10, we will get back a new function that is factorial for inputs 0 through 11. factimprover improves the factorial function by one step.

So we still need a partial function to kick things off. How about this function:

error = ->(n) { fail "SHOULD NEVER BE CALLED" } I will claim this is a partial factorial function where the set of inputs where it is defined is the empty set. Ok, that's a pretty weak partial, but it works. Consider:

f0 = fact_improver.(error) f0 is a function that can calculate the factorial of zero.

f0.(0) # => 1 Unfortunately it fails on factorial of 1:

f0.(1) # throws "SHOULD NEVER BE CALLED" error But that's ok. We can improve f0

f1 = fact_improver.(f0)

f1.(0) # => 1 f1.(1) # => 1 f1.(2) # throws "SHOULD NEVER BE CALLED" error How about a function that can go up to the factorial of 2?

f2 = fact_improver.(f1)

f2.(0) # => 1 f2.(1) # => 1 f2.(2) # => 2 f2.(3) # throws "SHOULD NEVER BE CALLED" error You should be getting the pattern by now. We can easily create a function that will calculate any fixed sized factorial partial.

fx = fact_improver.( fact_improver.( fact_improver.( fact_improver.( fact_improver.( fact_improver.( fact_improver.( fact_improver.( fact_improver.( fact_improver.( fact_improver.(error))))))))))) fx will calculate up to factorial of 10, but will fail on factorial 11. Although promising, just cascading more and more fact_improvers is not going to ever get us a real factorial function.

Let's go back to the definition of fact_improver

  fact_improver = ->(partial) {
    ->(n) { n.zero? ? 1 : n * partial.(n-1) }
  }

One thing that's bothering me is that we are assigning to the factimprover variable. Let's make factimprover a lambda argument and bind it by calling the lambda.

  fx = ->(improver) {
    improver.(error)
  }.(
    ->(partial) {
      ->(n) { n.zero? ? 1 : n * partial.(n-1) }
    }
    )

(Note: from this point out we will be doing a lot of refactoring. We will highlight the code that is modified at each step to make it easier to follow the changes.)

The body of the function is improver.(error), which will generate the f0 function above. No need to go through the sequence of f0, f1 ... again, we know where that leads.

But it occurs to me that we instead of using error as our kick start function, we can use any function of one argument. Let's kick start it by feeding improver itself.

  fx = ->(improver) {
    improver.(improver)
  }.(
    ->(partial) {
      ->(n) { n.zero? ? 1 : n * partial.(n-1) }
    }
    )

This version of fx correctly works for 0, but when given a 1 as an argument, we get an error that looks something like this:

Proc can't be coerced into Fixnum (TypeError) The error is occuring on the multiply operator "*" that is trying to multiply n by a function returned by partial. Actually partial is now poorly named because the improver function is no longer getting a partial, but rather the improver functions itself. Renaming it make the problem clearer:

  fx = ->(improver) {
    improver.(improver)
  }.(
    ->(improver) {
      ->(n) { n.zero? ? 1 : n * improver.(n-1) }
    }
    )

Well naturally that won't work because improver is a higher order function that expects a function as an argument, not a number. We also note that improver.(improver) is used in the outermost lambda is should be a function that takes a number (it should be, since that's the value that is being assigned to fx. So let's use improver.(improver) in the inner lambda as well.

  fx = ->(improver) {
    improver.(improver)
  }.(
    ->(improver) {
      ->(n) { n.zero? ? 1 : n * improver.(improver).(n-1) }
    }
    )

Giving it a try, we find:

fx.(0) # => 1 fx.(1) # => 1 fx.(2) # => 2 fx.(3) # => 6 fx.(5) # => 120 fx.(10) # => 3628800 fx.(20) # => 2432902008176640000 fx.(30) # => 265252859812191058636308480000000 Wow, all of a sudden, from out of nowhere, we have a fully functional factorial function.

Putting the whole thing together with no named functions anywhere, this is how you could calculate the factorial of 5:

  ->(improver) {
    improver.(improver)
  }.(
    ->(improver) {
      ->(n) { n.zero? ? 1 : n * improver.(improver).(n-1) }
    }
    ).(5)    # => 120

So, the goal of writing a recursive function (like factorial) using nothing but anonymous functions has been achieved. Lambda calculus is indeed Turing complete and this pillar of computational mathematics will not collapse under us.

However, we are not done yet. I have two problems with the our current solution.

Improving Our Solution

The first problem is that "improver" is the wrong name here. Improver originally takes a partial function and improves it. This function currently named "improver" takes itself as an argument. Clearly it is not really an improver function at all and should be renamed.

Choosing a new name is difficult, as we have no common name for this self-swallowing, factorial generating, higher order function. Since this function is generating a factorial function, I'm going to go with the name "gen".

  ->(gen) {
    gen.(gen)
  }.(
    ->(gen) {
      ->(n) { n.zero? ? 1 : n * gen.(gen).(n-1) }
    }
    ).(5)

The second problem I have with this expression is that it weaves together the separate concepts of factorial and recursion. It would never occur to me to write a recursive function in this manner. I would really like to separate this expression into a part that handled recursion and a part that handled factorial. If fact, if I could get the factorial improver function back in its original form, I would be very happy.

So let's start by applying some of the functional refactorings to the expression. We will start by performing a Tennent Correspondence Principle refactoring around the 5th line of the expression, yielding this:

  ->(gen) {
    gen.(gen)
  }.(
    ->(gen) {
      ->() {
        ->(n) { n.zero? ? 1 : n * gen.(gen).(n-1) }
      }.()
    }
    ).(5)   # => 120

Now introduce a binding to that new inserted lambda. The name of the variable we are introducing will be "code" and the value we choose will be "error".

  ->(gen) {
    gen.(gen)
  }.(
    ->(gen) {
      ->(code) {
        ->(n) { n.zero? ? 1 : n * gen.(gen).(n-1) }
      }.(error)
    }
    ).(5)    # => 120

Actually running this code still returns 120 (for the factorial of 5). Like I mentioned before, refactoring the code will not changed its behavior.

I arbitrarily chose the error function to bind to code. I could have chosen any other function. In fact, I could have chosen gen. Give it a try, and you will see that the expression still correctly calculates 120 for the factorial of 5.

At this point, I would like to point out that gen.(gen) is also a function. We return it as the value of the factorial function and we use it to recurse in the factorial definition. Well, if gen.(gen) is indeed a function, I should be able to bind it to code as well.

  ->(gen) {
    gen.(gen)
  }.(
    ->(gen) {
      ->(code) {
        ->(n) { n.zero? ? 1 : n * gen.(gen).(n-1) }
      }.(gen.(gen))
    }
    ).(5)    # => ERROR: Stack Overflow

Running this code results in a stack overflow error. Previously gen.(gen) is only evaluated if n was greater than zero. Now gen.(gen) is fully evaluated in all cases because it is bound to code at every step.

To get around the stack overflow, we need to delay the evaluation of gen.(gen). We can do that by wrapping the gen.(gen) in a lambda (i.e. the "Wrap Function" refactoring).

  ->(gen) {
    gen.(gen)
  }.(
    ->(gen) {
      ->(code) {
        ->(n) { n.zero? ? 1 : n * gen.(gen).(n-1) }
      }.(->(v) { gen.(gen).(v) })
    }
    ).(5)    # => 120

Now we also do a "Wrap Function" refactoring on the gen.(gen) in the recursive call and get:

  ->(gen) {
    gen.(gen)
  }.(
    ->(gen) {
      ->(code) {
        ->(n) { n.zero? ? 1 : n * ->(v) { gen.(gen).(v) }.(n-1) }
      }.(->(v) { gen.(gen).(v) })
    }
    ).(5)    # => 120

Finally I would like to point out that code is bound to ->(v) { gen.(gen).(v) } and we can replace any occurrence of that with the name code (essentially we are doing a reverse inline refactoring).

  ->(gen) {
    gen.(gen)
  }.(
    ->(gen) {
      ->(code) {
        ->(n) { n.zero? ? 1 : n * code.(n-1) }
      }.(->(v) { gen.(gen).(v) })
    }
    ).(5)    # => 120

A simple renaming of code to partial reveals something amazing. The original definition of the factorial improver has reappeared (highlighted in gray).

  ->(gen) {
    gen.(gen)
  }.(
    ->(gen) {
      ->(partial) {
        ->(n) { n.zero? ? 1 : n * partial.(n-1) }
      }.(->(v) { gen.(gen).(v) })
    }
    ).(5)    # => 120

Now let's extract the factorial improver definition. We will start by wrapping (almost) the entire definition with a tennent lambda, and then immediately apply a "Introduce Binding" refactoring using code/error as the name and value. The result of the two refactorings is here.

  ->(code) {
    ->(gen) {
      gen.(gen)
    }.(
      ->(gen) {
        ->(partial) {
          ->(n) { n.zero? ? 1 : n * partial.(n-1) }
        }.(->(v) { gen.(gen).(v) })
      }
      )
  }.(error).(5)    # => 120

Next we will replace error with the factorial improver function defintion. Then replace the internal improver definition (highlighted in grey) with the name code.

  ->(code) {
    ->(gen) {
      gen.(gen)
    }.(
      ->(gen) {
        code.(->(v) { gen.(gen).(v) })
      }
      )
  }.(
      ->(partial) {
        ->(n) { n.zero? ? 1 : n * partial.(n-1) }
      }
    ).(5)    # => 120

Since code is bound to the improver function, let's go ahead and rename code to improver.

  ->(improver) {
    ->(gen) {
      gen.(gen)
    }.(
      ->(gen) {
        improver.(->(v) { gen.(gen).(v) })
      }
      )
  }.(
      ->(partial) {
        ->(n) { n.zero? ? 1 : n * partial.(n-1) }
      }
    ).(5)    # => 120

This version of our factorial expression has a very nice property. All of the code involved in recursion is gathered in one spot (highlighted with a green background). All the code that defines factorial is also gathered in one spot (also highlighted with a grey background), and is in the nice, easy to understand improver format.

Now that we've proved that we can calculate factorial with a single lambda expression, it's time to break it out into individual pieces that we can talk about individually.

  fact_improver = ->(partial) {
    ->(n) { n.zero? ? 1 : n * partial.(n-1) }
  }
  y = ->(improver) {
    ->(gen) { gen.(gen) }.(
      ->(gen) { improver.(->(v) { gen.(gen).(v) }) } )
  }

fact = y.(fact_improver)

fact.(5) # => 120 So fact is the real, full blown, factorial function. What would happen if we passed fact into fact_improver?

???? = fact_improver.(fact) Since improving the full definition of factorial by one step still gives you a fully functional factorial function, we can say:

fact = fact_improver.(fact) In other words, fact is the fixpoint of the fact_improver higher order function. Yes, higher order functions can have fixpoints.

Moreover, the function y actually calculates the fixpoint of the fact_improver function. Actually y will calculate the fixpoint of any function written in the improver style. If I were to write an improver funtion for Fibonacci, y would calculate its fixpoint and return a fully recursive Fibonacci function.

The function y is a form of the Fixpoint Combinator, also known as the Y-Combinator. I say "form" because there are many ways to write the Y-Combinator (we'll see a few more in a bit).

If you were to lookup the Y-Combinator in a book or on the web, it might look a bit different than what we have here. First of all, mathematicians tend to not use intention revealing names. If we rename gen to x and improver to f, we get this form of the Y-Combinator.

  y = ->(f) {
    ->(x) { x.(x) }.(
      ->(x) { f.(->(v) { x.(x).(v) }) } )
  }

This version of Y is still not the most common form you will find in books. Two more refactorings and we will be done. First note that the x.(x) evaluates to the fixpoint function (in our example that would be fact). Since f is the improver (remember, we renamed it earlier), then we can substitute in f.(x.(x)) with no effect.

  y = ->(f) {
    ->(x) { f.(x.(x)) }.(
      ->(x) { f.(->(v) { x.(x).(v) }) } )
  }

And finally we will do a "Wrap Function" on the first x.(x) expression, and realign the indentation just a bit.

  y = ->(f) {
    ->(x) { f.(->(v) { x.(x).(v) }) }.(
    ->(x) { f.(->(v) { x.(x).(v) }) } )
  }

I like to call this the symetric form of the Y-Combinator, because the inner function body and the argument to that function are a reflection of each other.

When you look this up, you will see that this is known as the Applicative Order Y-Combinator, or also the Z-Combinator. Applicative order languages evaluate arguments to a function before the function is called. Ruby (and Scheme, Lisp, and most other languages) are applicative order, an so need to use the this version of the Y-Combinator.

The original Y-Combinator (also known as the Normal Order Y-Combinator) was expressed in Lambda Calculus, which does not evaluate its arguments before calling a function. Haskel is another normal order language.

The Normal Order version of the Y-Combinator differs from ours in that it doesn't use the ->(v) { XXX.(v) } function wrapping around the x.(x). Remember that we added that particular function wrapping to avoid a stack overflow problem.

Summary

So, now that you know all about the Y-Combinator, I bet you are ready to start using the combinator in your applications, right?

Don't bother!

Remember that the Y-Combinator was introduced to solve the problem of trying to recurse with anonymous functions. Any language you are using today will give you the option of naming your functions, so the Y-Combinator itself is just not going to be useful for you.

So why spend all this time time worrying about a theoretical construct that has no value in today's programming languages. I believe the value in the Y-Combinator is not the function itself, but the journey in understanding. In working through this problem myself, I came upon some wonderful refactorings and a better understanding functional programming. I hope you enjoyed it as well.