I was poking through Code Complete, 2nd Edition by Steve McConnell recently, after I’d handed the book to a computer science student. The book is showing its age, as good object-oriented design is better understood than it used to be, but it remains the best book on how to program well.
One thing that struck me, though, was a particularly strong statement against recursion, on page 397 (2nd Edition):
If a programmer who worked for me used recursion to compute a factorial, I’d hire someone else.
To put this in context, McConnell had just presented a few appropriate uses for recursion (quicksort and maze solving [i.e. graph searching]), and is railing against the simple examples one typically learns in computer science classes. He sees recursion as an advanced technique, which should be avoided unless it provides a clear advantage.
McConnell says recursive algorithms are harder to read. I don’t buy that; it depends on your audience. But his biggest beef is with the stack. When a function calls itself over and over again, the stack grows (keeping track of all the variables in all those function calls) and you can run out of memory, the dreaded stack overflow.
In many computer science departments, the first language you learn is Scheme, which doesn’t support iteration, so you can’t compute a factorial (or do any other kind of looping) without recursion. Scheme handles this with tail recursion elimination, where if the last thing a function does is call itself (i.e. tail recursion), the Scheme interpreter overwrites the stack frame– essentially replacing recursion with a loop.
Since most languages don’t do tail recursion elimination, and since it can be harder to write a function as tail recursive, I mostly agree with McConnell– despite my initial reaction. However, I was thinking about what makes some algorithms amenable to recursion while others are a bad idea.
McConnell’s examples of good recursion– quicksort and graph searching– both involve branching. Most tree algorithms are recursive: one might say a tree is a recursive data structure. The recursive algorithm uses the stack to track the branching, which is cleaner than handling your own stack. (A good programmer can replace any recursive algorithm with a stack-based one.) What makes factorials different is that there’s no branching, so the stack is redundant. So we have two classes of algorithms:
- Branching algorithms, which use recursion to track which branches have been visited.
- Linear (non-branching) algorithms, which should be written in a tail-recursive manner to avoid stack overflows, or written with loops when the compiler/interpreter doesn’t support tail recursion elimination.
I should add that branching algorithms are also susceptible to stack overflows, but since the stack is providing a necessary service, it can’t be avoided with a non-recursive algorithm.
So here’s why I end up agreeing with McConnell: to properly write non-branching recursive algorithms, you must (1) write them in a tail-recursive manner, and (2) run them with tail call elimination. After you write a tail-recursive function, it’s easy to accidentally modify it to no longer be tail-recursive, and the difference can be subtle. That’s why Scala provides the tailrec
annotation: so a function that’s supposed to be tail recursive won’t compile unless Scala can apply the tail call elimination. If you need the compiler’s help to safely apply a technique, it’s an advanced technique, and shouldn’t be done lightly.