In Part I we explored the historical context leading up to the Y-Combinator.
In Part II we laid the foundation for understanding the Y-Combinator
Finally, in Part III we see the Y-Combinator in the solution to the recursion problem of Lambda Calculus.
Solving the Recursion Problem
Its time to attack the problem of recursion with anonymous functions. Remember our factorial example?
->(n) { n.zero? ? 1 : n * fact.(n-1) }
Since this function is anonymous, the name fact is not
bound to anything. One thing we can do to fix this is to wrap the
entire thing in a function that takes fact as an
argument. Let's call this higher order function make_fact.
make_fact = ->(fact) {
->(n) { n.zero? ? 1 : n * fact.(n-1) }
}
Since fact is now bound, there is no problem calling
fact from within the function. So to create a factorial
function, all we need to do is call make_fact with a
definition of factorial. If that sounds a little circular, you are
right.
Ok, maybe we don't need a full definition of factorial to start with. I'm willing to lower my expectations a bit. Rather than require a full factorial function, I would be happy with a factorial function that operated on a subset of the possible inputs for factorial. We call this kind a function a "partial" function, it is only valid on a partial set of possible inputs.
For example, given the following definition:
fact_improver = ->(partial) {
->(n) { n.zero? ? 1 : n * partial.(n-1) }
}
If we feed the fact_improver function a partial factorial
that is defined over the inputs 0 through 10, we will get back a new
function that is factorial for inputs 0 through 11.
fact_improver improves the factorial function by one
step.
So we still need a partial function to kick things off. How about this function:
error = ->(n) { fail "SHOULD NEVER BE CALLED" }
I will claim this is a partial factorial function where the set of inputs where it is defined is the empty set. Ok, that's a pretty weak partial, but it works. Consider:
f0 = fact_improver.(error)
f0 is a function that can calculate the factorial of
zero.
f0.(0) # => 1
Unfortunately it fails on factorial of 1:
f0.(1) # throws "SHOULD NEVER BE CALLED" error
But that's ok. We can improve f0
f1 = fact_improver.(f0)
f1.(0) # => 1
f1.(1) # => 1
f1.(2) # throws "SHOULD NEVER BE CALLED" error
How about a function that can go up to the factorial of 2?
f2 = fact_improver.(f1)
f2.(0) # => 1
f2.(1) # => 1
f2.(2) # => 2
f2.(3) # throws "SHOULD NEVER BE CALLED" error
You should be getting the pattern by now. We can easily create a function that will calculate any fixed sized factorial partial.
fx = fact_improver.(
fact_improver.(
fact_improver.(
fact_improver.(
fact_improver.(
fact_improver.(
fact_improver.(
fact_improver.(
fact_improver.(
fact_improver.(
fact_improver.(error)))))))))))
fx will calculate up to factorial of 10, but will fail on
factorial 11. Although promising, just cascading more and more
fact_improvers is not going to ever get us a real
factorial function.
Let's go back to the definition of fact_improver
fact_improver = ->(partial) {
->(n) { n.zero? ? 1 : n * partial.(n-1) }
}
One thing that's bothering me is that we are assigning to the
fact_improver variable. Let's make
fact_improver a lambda argument and bind it by calling
the lambda.
fx = ->(improver) { improver.(error) }.( ->(partial) { ->(n) { n.zero? ? 1 : n * partial.(n-1) } } )
(Note: from this point out we will be doing a lot of refactoring. We will highlight the code that is modified at each step to make it easier to follow the changes.)
The body of the function is improver.(error), which will
generate the f0 function above. No need to go through the
sequence of f0, f1 ... again, we know where
that leads.
But it occurs to me that we instead of using error as our kick start
function, we can use any function of one argument. Let's kick start
it by feeding improver itself.
fx = ->(improver) {
improver.(improver)
}.(
->(partial) {
->(n) { n.zero? ? 1 : n * partial.(n-1) }
}
)
This version of fx correctly works for 0, but when given
a 1 as an argument, we get an error that looks something like this:
Proc can't be coerced into Fixnum (TypeError)
The error is occuring on the multiply operator "*" that is trying to
multiply n by a function returned by
partial. Actually partial is now poorly
named because the improver function is no longer getting a partial,
but rather the improver functions itself. Renaming it
make the problem clearer:
fx = ->(improver) {
improver.(improver)
}.(
->(improver) {
->(n) { n.zero? ? 1 : n * improver.(n-1) }
}
)
Well naturally that won't work because improver is a
higher order function that expects a function as an argument, not a
number. We also note that improver.(improver) is used in
the outermost lambda is should be a function that takes a number (it
should be, since that's the value that is being assigned to
fx. So let's use improver.(improver) in the
inner lambda as well.
fx = ->(improver) {
improver.(improver)
}.(
->(improver) {
->(n) { n.zero? ? 1 : n * improver.(improver).(n-1) }
}
)
Giving it a try, we find:
fx.(0) # => 1
fx.(1) # => 1
fx.(2) # => 2
fx.(3) # => 6
fx.(5) # => 120
fx.(10) # => 3628800
fx.(20) # => 2432902008176640000
fx.(30) # => 265252859812191058636308480000000
Wow, all of a sudden, from out of nowhere, we have a fully functional factorial function.
Putting the whole thing together with no named functions anywhere, this is how you could calculate the factorial of 5:
->(improver) {
improver.(improver)
}.(
->(improver) {
->(n) { n.zero? ? 1 : n * improver.(improver).(n-1) }
}
).(5) # => 120
So, the goal of writing a recursive function (like factorial) using nothing but anonymous functions has been achieved. Lambda calculus is indeed Turing complete and this pillar of computational mathematics will not collapse under us.
However, we are not done yet. I have two problems with the our current solution.
Improving Our Solution
The first problem is that "improver" is the wrong name here. Improver originally takes a partial function and improves it. This function currently named "improver" takes itself as an argument. Clearly it is not really an improver function at all and should be renamed.
Choosing a new name is difficult, as we have no common name for this self-swallowing, factorial generating, higher order function. Since this function is generating a factorial function, I'm going to go with the name "gen".
->(gen) { gen.(gen) }.( ->(gen) { ->(n) { n.zero? ? 1 : n * gen.(gen).(n-1) } } ).(5)
The second problem I have with this expression is that it weaves together the separate concepts of factorial and recursion. It would never occur to me to write a recursive function in this manner. I would really like to separate this expression into a part that handled recursion and a part that handled factorial. If fact, if I could get the factorial improver function back in its original form, I would be very happy.
So let's start by applying some of the functional refactorings to the expression. We will start by performing a Tennent Correspondence Principle refactoring around the 5th line of the expression, yielding this:
->(gen) {
gen.(gen)
}.(
->(gen) {
->() {
->(n) { n.zero? ? 1 : n * gen.(gen).(n-1) }
}.()
}
).(5) # => 120
Now introduce a binding to that new inserted lambda. The name of the variable we are introducing will be "code" and the value we choose will be "error".
->(gen) {
gen.(gen)
}.(
->(gen) {
->(code) {
->(n) { n.zero? ? 1 : n * gen.(gen).(n-1) }
}.(error)
}
).(5) # => 120
Actually running this code still returns 120 (for the factorial of 5). Like I mentioned before, refactoring the code will not changed its behavior.
I arbitrarily chose the error function to bind to
code. I could have chosen any other function. In fact, I
could have chosen gen. Give it a try, and you will see
that the expression still correctly calculates 120 for the factorial
of 5.
At this point, I would like to point out that gen.(gen)
is also a function. We return it as the value of the factorial
function and we use it to recurse in the factorial definition. Well,
if gen.(gen) is indeed a function, I should be able to
bind it to code as well.
->(gen) {
gen.(gen)
}.(
->(gen) {
->(code) {
->(n) { n.zero? ? 1 : n * gen.(gen).(n-1) }
}.(gen.(gen))
}
).(5) # => ERROR: Stack Overflow
Running this code results in a stack overflow error. Previously
gen.(gen) is only evaluated if n was greater than zero.
Now gen.(gen) is fully evaluated in all cases because it
is bound to code at every step.
To get around the stack overflow, we need to delay the evaluation of
gen.(gen). We can do that by wrapping the
gen.(gen) in a lambda (i.e. the "Wrap Function"
refactoring).
->(gen) {
gen.(gen)
}.(
->(gen) {
->(code) {
->(n) { n.zero? ? 1 : n * gen.(gen).(n-1) }
}.(->(v) { gen.(gen).(v) })
}
).(5) # => 120
Now we also do a "Wrap Function" refactoring on the
gen.(gen) in the recursive call and get:
->(gen) {
gen.(gen)
}.(
->(gen) {
->(code) {
->(n) { n.zero? ? 1 : n * ->(v) { gen.(gen).(v) }.(n-1) }
}.(->(v) { gen.(gen).(v) })
}
).(5) # => 120
Finally I would like to point out that code is bound to
->(v) { gen.(gen).(v) } and we can replace any occurrence
of that with the name code (essentially we are doing a
reverse inline refactoring).
->(gen) {
gen.(gen)
}.(
->(gen) {
->(code) {
->(n) { n.zero? ? 1 : n * code.(n-1) }
}.(->(v) { gen.(gen).(v) })
}
).(5) # => 120
A simple renaming of code to partial reveals
something amazing. The original definition of the factorial improver
has reappeared (highlighted in gray).
->(gen) {
gen.(gen)
}.(
->(gen) {
->(partial) {
->(n) { n.zero? ? 1 : n * partial.(n-1) }
}.(->(v) { gen.(gen).(v) })
}
).(5) # => 120
Now let's extract the factorial improver definition. We will start by
wrapping (almost) the entire definition with a tennent lambda, and
then immediately apply a "Introduce Binding" refactoring using
code/error as the name and value. The result
of the two refactorings is here.
->(code) { ->(gen) { gen.(gen) }.( ->(gen) { ->(partial) { ->(n) { n.zero? ? 1 : n * partial.(n-1) } }.(->(v) { gen.(gen).(v) }) } ) }.(error).(5) # => 120
Next we will replace error with the factorial improver
function defintion. Then replace the internal improver definition
(highlighted in grey) with the name code.
->(code) {
->(gen) {
gen.(gen)
}.(
->(gen) {
code.(->(v) { gen.(gen).(v) })
}
)
}.(
->(partial) {
->(n) { n.zero? ? 1 : n * partial.(n-1) }
}
).(5) # => 120
Since code is bound to the improver function, let's go
ahead and rename code to improver.
->(improver) { ->(gen) { gen.(gen) }.( ->(gen) { improver.(->(v) { gen.(gen).(v) }) } ) }.( ->(partial) { ->(n) { n.zero? ? 1 : n * partial.(n-1) } } ).(5) # => 120
This version of our factorial expression has a very nice property. All of the code involved in recursion is gathered in one spot (highlighted with a green background). All the code that defines factorial is also gathered in one spot (also highlighted with a grey background), and is in the nice, easy to understand improver format.
Now that we've proved that we can calculate factorial with a single lambda expression, it's time to break it out into individual pieces that we can talk about individually.
fact_improver = ->(partial) {
->(n) { n.zero? ? 1 : n * partial.(n-1) }
}
y = ->(improver) {
->(gen) { gen.(gen) }.(
->(gen) { improver.(->(v) { gen.(gen).(v) }) } )
}
fact = y.(fact_improver)
fact.(5) # => 120
So fact is the real, full blown, factorial function. What
would happen if we passed fact into
fact_improver?
???? = fact_improver.(fact)
Since improving the full definition of factorial by one step still gives you a fully functional factorial function, we can say:
fact = fact_improver.(fact)
In other words, fact is the fixpoint of the
fact_improver higher order function. Yes, higher order
functions can have fixpoints.
Moreover, the function y actually calculates the fixpoint
of the fact_improver function. Actually y
will calculate the fixpoint of any function written in the improver
style. If I were to write an improver funtion for Fibonacci,
y would calculate its fixpoint and return a fully
recursive Fibonacci function.
The function y is a form of the Fixpoint Combinator, also
known as the Y-Combinator. I say "form" because there are many ways to
write the Y-Combinator (we'll see a few more in a bit).
If you were to lookup the Y-Combinator in a book or on the web, it
might look a bit different than what we have here. First of all,
mathematicians tend to not use intention revealing names. If we rename
gen to x and improver to
f, we get this form of the Y-Combinator.
y = ->(f) {
->(x) { x.(x) }.(
->(x) { f.(->(v) { x.(x).(v) }) } )
}
This version of Y is still not the most common form you will find in
books. Two more refactorings and we will be done. First note that the
x.(x) evaluates to the fixpoint function (in our example
that would be fact). Since f is the improver
(remember, we renamed it earlier), then we can substitute in
f.(x.(x)) with no effect.
y = ->(f) {
->(x) { f.(x.(x)) }.(
->(x) { f.(->(v) { x.(x).(v) }) } )
}
And finally we will do a "Wrap Function" on the first
x.(x) expression, and realign the indentation just a bit.
y = ->(f) {
->(x) { f.(->(v) { x.(x).(v) }) }.(
->(x) { f.(->(v) { x.(x).(v) }) } )
}
I like to call this the symetric form of the Y-Combinator, because the inner function body and the argument to that function are a reflection of each other.
When you look this up, you will see that this is known as the Applicative Order Y-Combinator, or also the Z-Combinator. Applicative order languages evaluate arguments to a function before the function is called. Ruby (and Scheme, Lisp, and most other languages) are applicative order, an so need to use the this version of the Y-Combinator.
The original Y-Combinator (also known as the Normal Order Y-Combinator) was expressed in Lambda Calculus, which does not evaluate its arguments before calling a function. Haskel is another normal order language.
The Normal Order version of the Y-Combinator differs from ours in that
it doesn't use the ->(v) { XXX.(v) } function wrapping
around the x.(x). Remember that we added that particular
function wrapping to avoid a stack overflow problem.
Summary
So, now that you know all about the Y-Combinator, I bet you are ready to start using the combinator in your applications, right?
Don't bother!
Remember that the Y-Combinator was introduced to solve the problem of trying to recurse with anonymous functions. Any language you are using today will give you the option of naming your functions, so the Y-Combinator itself is just not going to be useful for you.
So why spend all this time time worrying about a theoretical construct that has no value in today's programming languages. I believe the value in the Y-Combinator is not the function itself, but the journey in understanding. In working through this problem myself, I came upon some wonderful refactorings and a better understanding functional programming. I hope you enjoyed it as well.