This is perhaps a slight digression; just an extended expression of pleasure about Clojure’s recur statement. It neatly squashes a bee I’ve had in my bonnet for some years now; if it’s wrong to loathe “tail recursion optimization” then I don’t want to be right.
Terminology ·
I acknowledge that recur
isn’t a “statement”, it’s actually a
“special form” which is what languages that don’t want to admit having syntax
call their keywords. But hey, it’s a built-in fixed-syntax thingie that usually
stands on its own line and causes transfer of control to another place in the
program, so for my money it walks like a statement and works like a
statement.
Recur Details ·
There’s nothing to it, really. If you’re in a function and you say
(recur arg arg ...)
— the number of arguments has to be the same
as the function takes — you restart the function with the new set of arguments
from the recur
. It’s a highly controlled and structured
GOTO
. This walks like a loop and smells like a loop, and so,
sensibly enough, there’s another statement oops special form called
loop
:
(def factorial
(fn [n]
(loop [cnt n acc 1]
(if (zero? cnt)
acc
(recur (dec cnt) (* acc cnt))))))
(Gosh, I can’t help but noticing how, in Erlang, you see tons of
functions which exist only to do tail-recursive looping and have names like
looper
or whatever_loop
. Nudge nudge, wink wink.)
recur
is simple, it’s easy to understand, it doesn’t burn
stack space,
and the
function or loop arguments are still immutable in the context of any single
pass through. Also I find the name “recur” to be an example of engineering
elegance; obviously correct, once you think of it.
I’m amused by the write-ups in the Clojure literature; they all take the
form of “Since Clojure is based on Java, and the JVM doesn’t do tail-call
optimization (how sad) we were forced to provide this recur
thingie. Sorry ’bout that, and if Java gets its act together, we’ll do tail
calls right.”
But it seems obvious to me that recur
is better than
tail-call-optimized recursion.
Why I Hate Tail Call Recursion Optimization ·
It feels like a big fat lie. It looks like a subroutine call, but in the
case where it occurs as the last thing in the routine, it magically, silently,
and automatically gets turned into, now how did I put it? “A highly controlled
and structured GOTO
.”
If that’s what the programmer wants, then that’s what the programmer should bloody well ask for. I’ve always been irritated by the explanations, in the Erlang and Lisp introductory books, about how you must be careful to ensure that your recursive calls are the last thing executed, and I’ve observed this leading sometimes to some gnarly unintuitive if/else branch placement. Call me old-fashioned, but it just seems wrong when the semantics of two identical function invocations vary wildly as a function of whether there’s an executable statement between the invocation and the function exit.
I Know I Know I Know ·
Every time I write about this I get lectures in the comments, explaining in
a superior tone of voice that I don’t really understand the wonders of the
tail call, and pointing me at
Guy
Steele’s 1977 paper (PDF) on the subject.
I think the semantics of computer programs should be made explicit in their
syntax. recur
does this. It’s a good thing.
Comment feed for ongoing:
From: Joe Cheng (Oct 27 2009, at 14:56)
Not to nitpick, but tail call optimization is more than a GOTO if you have multiple functions that are mutually recursive...
[link]
From: MenTaLguY (Oct 27 2009, at 15:10)
Recursion does tend to get abused generally. It seldom has a place in a lower-order function.
In the absence of an appropriate predefined combinator (i.e. loop, or a catamorphism), it is really better to write your own combinators rather than employing recursion in lower-order functions. And of course language designers should be putting these in their standard libraries. It's just less mental burden, and often more clearly expresses the meaning of the code for human readers.
But, when you need it, real tail call optimization is pretty important particularly for things like writing catamorphisms in a natural way. You really do need to tail call between several different functions sometimes.
[link]
From: J. Random Hacker (Oct 27 2009, at 15:30)
Tail Call ... Booty Call. More than a coincidence? You decide!
[link]
From: Michael Kay (Oct 27 2009, at 15:32)
XSLT 2.1 will have a very similar construct called xsl:iterate. It's nice to know there's a precedent. It's pre-implemented as saxon:iterate in Saxon 9.2 if you want to have a look at it. Very useful for streamed processing, because it's more restricted/controlled than tail recursion; but also, I think, more understandable to users.
[link]
From: Brett (Oct 27 2009, at 15:36)
Just want to say I completely agree with you, Tim. It's comes off as much cleaner then actual tail calls. Heck, it even has the benefit of minimizing errors from refactoring functions and missing the tail call.
[link]
From: dr2chase (Oct 27 2009, at 15:51)
Generalized tail call is just what you need when you are working in a language that only has structured control flow constructs :-). The canonical irreducible loop is:
top(state) = if p1(state) then left(state) else right(state) end
left(state) = if p2(state) then left(state) else right(state) end
right(state) = if p3(state) then left(state) else right(state) end
You can't write that out without goto, or tail call elimination.
[link]
From: DGentry (Oct 27 2009, at 16:05)
Yes. Its very unfortunate to have convoluted code which suddenly stops working when some future programmer makes what they think should be an innocent change but which happens to break the tail-form requirements. It is particularly unfortunate if that future programmer has no idea how to debug a messy error from blowing the stack.
[link]
From: Leon P Smith (Oct 27 2009, at 16:08)
1. Proper handling of tail calls does not necessarily have anything to do with recursion.
2. "recur" isn't anywhere close to being capable of expressing the examples in Guy Steele's classic paper.
[link]
From: John Cowan (Oct 27 2009, at 16:08)
This is a specialization of something Scheme provides more generally: in Scheme, you get to pick your own name for the recurrence procedure at the top of the loop, and it is an ordinary procedure which may be passed out of the loop and called wherever you please.
Tail calling is to tail call optimization as garbage collection is to what we might call "garbage collection optimization". You could have a C++ implementation that provided garbage collection, but that would just be an optimization with respect to the C++ language. In practice, nobody would rely on it, because their code wouldn't be portable to other C++ implementations.
By contrast, people program quite differently in Java, where they *know* they can rely on garbage collection, because it's part of the language design. Likewise, people program differently in Scheme, where they know they can rely on tail calling, than they do in C or C++, where gcc attempts to provide tail calling as an optimization.
[link]
From: Fred Blasdel (Oct 27 2009, at 16:16)
Oh come on, you're practically a poster boy for <i>Sufficiently Smart Compiler</i> cheerleading, but you despise the most useful real-world example? Aren't you in the middle of a series about how awesome auto-parallelization will be? How is observable nondeterminism not a much bigger lie than TCO?
If you like Clojure's recur but find its syntacticalness distasteful, and clueful compilers abhorrent, the Y Combinator is exactly what you want.
[link]
From: Patrick Logan (Oct 27 2009, at 16:34)
Whether tail calls look like GOTOs or like, well, tail calls, seems to depend on whether you come from a functional programming background or an imperative programming background.
From an FP POV, there is no such thing as "loop" there is only "apply these args to this function". There is no such thing as a "GOTO" in FP.
Your perception is valid, but it is that perception that should change to adopt the FP POV if you want to use an FP language.
Also in any relatively pure FP language TCO is not an "optimization" so much as a "requirement". There's just no getting around it.
[link]
From: Vincent Geddes (Oct 27 2009, at 16:34)
My two cents :)
If the compiler uses an intermediate form based on continuation-passing style (CPS), then _all_ function calls become tail-calls and eventually GOTO's. No need to worry about where to place recursive tail-calls.
In fact, I wager that unless a compiler uses a CPS IR, then the explicit "recur" syntax is pretty much the only option for implementing a limited form of TCO.
On the other hand, I agree that TCO is a pretty bad excuse for using recursion as a general-purpose looping construct. Writing macros that abstract over low-level recursive looping constructs is probably a good idea.
[link]
From: Phil (Oct 27 2009, at 17:41)
It's worth mentioning that recur will signal an error when used in a non-tail-position. If you rely on regular TCO and make a mistake in this regard, your function will silently consume stack. That's why the JVM's lack of TCO doesn't bother me too much.
[link]
From: Jayson Vantuyl (Oct 27 2009, at 17:59)
I actually requested something similar to be added to Python a while back, but for the opposite reason (basically, to have EXPLICIT tail-recursion optimization).
I see it as basically, explicit stack management. The problem with return/recur/tro is that they create "default" performance characteristics that vary from language to language. I suspect your hate of TRO is that it is nonintuitive "magic" that causes non-uniform behavior (i.e. gotchas) when you would much rather just have a simpler performance profile and an explicit optimization.
See my suggestion here:
http://souja.net/2009/04/on-tail-recursion-elimination.html
[link]
From: Justin Forder (Oct 27 2009, at 18:10)
This is not a change in semantics, it's a change in resource use.
An earlier commenter compared it to GC - I would compare it to escape analysis that determines that an object can be stack allocated, rather than heap allocated. It's a similar step back - the stack-alocated object doesn't need GC, and tail recursion doesn't need a new stack frame.
Keep up the good work!
[link]
From: Alexander Wagner (Oct 27 2009, at 18:15)
Speaking of Erlang, I think it should be noted how hot code loading and tail calls interact:
Old code calls new code, and it just works.
With this kind of serendipity I wouldn't throw out optimized tail calls just yet.
[link]
From: Matt Brubeck (Oct 27 2009, at 18:23)
You say "structured goto" like it's a bad thing. You should re-read Knuth's classic paper, "Structured Programming with Goto":
http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf
This is the original source of the much-quoted "Premature optimization is the root of all evil." Unfortunately, few people read the rest of the paper. Many of Knuth's "structured goto" examples have since been added into mainstream programming languages as exceptions or various types of "break" statements, but the most interesting ones are practical only as tail calls or gotos, and therefore are not possible in certain languages. In fact, most programmers these days would never even think of the solutions that Knuth considers most elegant, because current languages have left them with a huge conceptual blind spot.
I agree with the previous commenters that "tail recursion" is a red herring. The uses of tail calls that make the most difference are related to coroutines or state machines, not loops or recursion.
[link]
From: Randy Hudson (Oct 27 2009, at 18:52)
As Phil points out, the real beauty of 'recur' is that it can *only* be used in a tail position. When/if Java.future supports general tail call elimination, Clojure should, I believe, provide a form like 'recur' whose first argument is the function being tail-called. You would only get TCE when you used this form, and only in tail position. E.g. dr2chase's 'left' function would look something like
(defn left [state] (if (p2 state) (=> left state) (=> right state)))
Thus it's still clear where the magic happens.
[link]
From: Matt (Oct 28 2009, at 02:35)
If you talk about tail calls in Clojure shouldn't you at least mention Clojure's trampoline http://groups.google.com/group/clojure/msg/3addf875319c5c10
[link]
From: Ryan Angilly (Oct 28 2009, at 06:19)
Extremely well written.
[link]
From: Mark Wilkinson (Oct 28 2009, at 10:09)
Phil Winterbottom's Alef programming language for the Plan 9 operating system (from 1995) had a 'become' statement that was an explicit tail-call optimisation. It's described in section 6.6.4 of the Alef reference manual at http://doc.cat-v.org/plan_9/2nd_edition/papers/alef/ref
I once used it to build the state machine for parsing VT220 control codes for a basic terminal emulator for Plan 9. Each function was a state and 'become' provided a tidy way to move from one state to the next. In combination with the multi-threading and channels it was a lovely language for writing a terminal emulator!
[link]
From: Nathan Youngman (Oct 28 2009, at 19:19)
I totally agree that explicit is better than implicit. I'd rather an error about recur being misused than a stack overflow.
Haven't read the mentioned papers by Guy Steele or Knuth, but there is a very simple state machine example in "Programming in Lua" of TCO that goes beyond recursion:
http://www.lua.org/pil/6.3.html
But it sounds like trampoline enables that sort of ability in an explicit way without requiring a new JVM.
[link]
From: Igor (Oct 31 2009, at 03:54)
Misconception: Tail recursion is primarily syntactic sleight-of-hand that allows a programmer to use a function call rather than a traditional iteration construct.
Truth: Tail recursion is a means of reducing the asymptotic space complexity of some programs. Tail recursion unifies function calls and message passing in a synchronous, sequential Actor model.
Misconception: Its purpose is to avoid writing looping constructs.
Truth: Its purpose is to allow access to more powerful programming models.
Misconception: Any code that depends on tail recursion can easily be rewritten to use a looping construct instead. Such a rewrite is a trivial local transformation.
Truth: Certain constructs that depend on tail recursion cannot be rewritten into a looping construct without global changes to the code.
Misconception: People who like tail recursion are burdening their code, but they expect the implementor to bear the weight.
Truth: Proper tail recursion cannot be added within the language without adding devices such as trampolines and imposing a calling convention on all users.
Misconception: Tail recursion causes loss of valuable debugging information.
Truth: Means to retain stack traces while preserving tail recursion have been known for close to thirty years.
Please read the pages below, that explain in excruciating detail why tail recursion is important. If that does not convince you, nothing will!
Sources:
http://funcall.blogspot.com/2009/04/you-knew-id-say-something.html
http://funcall.blogspot.com/2009/04/you-knew-id-say-something-part-ii.html
http://funcall.blogspot.com/2009/05/you-knew-id-say-something-part-iii.html
http://funcall.blogspot.com/2009/05/you-knew-id-say-something-part-iv.html
http://funcall.blogspot.com/2009/05/you-knew-id-say-something-part-v.html
[link]
From: Michel Salim (Nov 05 2009, at 17:54)
"recur" is not a statement. It cannot be defined as a normal function either, hence the term "special form", which Lisp/Scheme reserves for certain forms like if, cond, etc.
The distinction between statements and expressions are really similar to that between procedures and functions: the former does not have return values, the latter do. You can use expressions as statements -- every expression but the last one in a Clojure "do" or a Scheme "begin" has its return value ignored, for example.
When teaching Scheme, one of the mistakes I keep having to correct is that there is no "if statement", it's an "if expression".
[link]
From: Alex (Nov 09 2009, at 18:59)
Oh, when will the ignorance stop? First Guido, now this. Please, look for something called the "Lambda Papers" and read them. They are almost 40 years old by now...
[link]