April 10, 2016

So, what is recursion^{1} anyways?

Explaining recursion might be easier if we understand mathematical *induction*.
Indunction is a proof–technique, which works somewhat like this^{2}:

- Show that we can get from one state to the next state
- Show that we have an initial state
- We can now get to any state, after the initial state.

The concept can be visalized as climbing a staircase. If we are on an arbitrary step, we know how to get to the next step.
We also know how to get to the 0th step, which (i guess) is the ground.
From this, we conclude that we can get to step `n`

, for any positive `n`

.

What does this have to do with recursion? Well, recursion is kind of the opposite. With recursion, we need the following:

- The problem can be solved by first solving a smaller instance of the same problem
- When the input is small enough, it is trivial to solve
- We can now solve an instance of any size, simply by making the problem smaller enough times.

Recursion is heavily used in functional programming, so if you are familiar with, say a `Lisp`

, or `Haskell`

,
this might be familiar.

For instance, we can find the length of a list using recursion: the length of a list is one more then the length of the same list, with the first element removed, and the length of an empty list is 0.

```
num list_length(List)
if List == []
return 0
let head be first element
let tail be rest of list
return 1 + list_length(tail)
```

Or, in actual python:

```
def list_len(lst):
if lst == []:
return 0
return 1 + list_len(lst[1:])
```

Let’s say we’re trying to sort a list. We could to a lot of different things, like keeping a sorted list, and inserting one new element at a time, or trying to build a data structure, with which extracting a sorted list is trivial^{3}.
But we’ll try something else. We can begin with the observation that if we split the list in two, and somehow manage to sort the two list separately,
we can merge the lists, pretty easily: just take out the smallest element of the two in front of the lists, and push it at the end onto a new list.
Then the new list will consist of all of the elements in sorted order, because we allways took the smaller element.

So great, we just found a way to sort a list… except we didn’t, because in order for our algorithm to work, we need an additional algorithm to sort the two lists. But, do we really? The algorithm we just made is itself a sorting algorithm! What happends if we try to use the algorithm itself?

```
list merge_sort(List)
split List at middle into A, B
merge_sort(A)
merge_sort(B)
return merge(A, B)
```

Does this work? What happends if `List`

is empty? Or contains one element? Actually, when the length of the list is less than two, the
list is already sorted (if we can say that an empty list is sorted).

```
list merge_sort(List)
if List.length < 2
return List
split List at middle into A, B
merge_sort(A)
merge_sort(B)
return merge(A, B)
```

Ok, we are (perhaps) starting to build up confidence in our new, albeit weird, algorithm. But, this call to the function itself looks a little dangerous. How can we know if it will end?

We first observe that if the length of the list is less than two, we return the list, so the algorithm returns from that call.
If the length of the list is larger or equal to two, we split the list at the middle, and call the parts `A`

and `B`

.
But the length of both `A`

and `B`

is the half of `List`

(or something very close to half).
Neither of them can possibly be larger, or even of equal length.
Hence, we know that the algorithm calls itself, but with a smaller input.
Eventually, when the input it small enough — containing less than two elements — we return.
Hence, no matter what size the input is, the algorithm always terminates!

Now, this result is kind of amazing. We have just constructed a sorting algorithm, kind of without even thinking about the problem of sorting! By assuming our own algorithm actually works, we gained confidence that the same algorithm works!

The merge step looks a bit scary, though^{4}. Maybe we can try something similar, but without having to merge the lists.

Ok, what if we sort the list just a little bit, so that we can make the recursion work.
For instance, we can take a element from the list and then split, or *partition*, the list in two, based on that one element,
such that all elements in the first list are less than or equal to our selected element, and all elements in the last list are greater than it.
At the end, we simply put them together, and put our special element, the *pivot*, in betweeen.

```
list sort(List)
if List.length < 2
return List
select e from List
make A such that a in A --> a < e
sort(A)
make B such that b in B --> b > e
sort(B)
return A + [e] + B
```

This might look more like sorting, because we partition the list into two, based on the pivot.
But one would think there is still lots to do. There is not.
This `python`

function implements this algorithm, `quick-sort`

.

```
# Python
def sort(lst):
if len(lst) < 2:
return lst
pivot = lst.pop()
A = [a for a in lst if a <= pivot]
B = [b for b in lst if b <= pivot]
return sort(A) + [pivot] + sort(B)
```

and this is the `haskell`

version:

```
-- Haskell
sort (x:xs) = l ++ [x] ++ g
where l = sort [a | a<-xs, a <= x]
g = sort [b | b<-xs, b > x]
sort l = l
```

Recursion is an amazing tool, and sometimes it works out just right.
However, there are multiple things that can go wrong^{5}.

It is very easy to get burnt on this, either by looping forever^{6}:

```
def fac(n):
if n == 1:
return 1
return n * fac(n - 1)
# fac(-2) loops infinitely
```

or incorrectly handling, or even forgetting, the base case^{7}, and result in a crash:

```
sort [x] = [x]
sort (x:xs) = l ++ [x] ++ g
where l = sort [a | a<-xs, a <= x]
g = sort [b | b<-xs, b > x]
-- Non-exhaustive patterns:
-- the empty list case is not handled
```

As mentioned here^{1}, making sure the recursion ends is tricky.

If we have multiple recursive calls it is also easy to do one thing over and over using recursion.
The classical example here is the recursive `fibonacci`

function^{8}, which runs in exponential time.
Try to run `fib(35)`

^{9}.

```
def fib(n):
if n < 3:
return 1
return fib(n-1) + fib(n-2)
```

A solution to this is *memoization*^{10}, where we save intermediate results, and look up values instead of recomputing them.
Of course, in the case on calculating fibonacci numbers, this is still worse than an iterative solution, because we are using more memory.
This is not to say that it is not possible to write a equally good implementation, while still using recursion; we just have to go upwards, instead of downwards:

```
def fib(n):
def go(current, previous, a):
if a == n:
return current
return go(current + previous, current, a + 1)
if n < 3:
return 1
return go(1, 0, 1)
```

At this point, one could argue that this is basically a loop^{11}, so there is no point in using recursion; we might be better off with simply writing

```
def fib(n):
current = 1
previous = 0
for _ in range(n - 1):
current, previous = current + previous, current
return current
```

Hopefully, we have gained a little insight in recursion — both its magic and its dangers. While there might be tricky to get recursion right, actually getting it right is too much fun to miss out on.

- While trying to make the
*Recursion: See Recursion*joke, the static site generator ended up crashing due to a never ending recursion so that the call stack became too large.^{[return]} - This is not a rigid, and probably not a very good, intro to induction — but that is fine, because this post is about recursion.
^{[return]} - This is my personal favorite.
^{[return]} - It’s not, but it is easy to mess up the merge routine.
^{[return]} - These problems aren’t unique to recursion — both infinite loops and horrible running times are naturally possible without using recursion.
^{[return]} - You could argue that the factorial of a negative number is not well defined, so that calling
`fac(-2)`

does not make any sense. However, defining the factorial of a negative number is useful in many situations.^{[return]} - This actually was the version I wrote when trying to write the correct version above.
^{[return]} - We will be using the same definition as on wikipedia, where
`fib(1) = fib(2) = 1`

.^{[return]} - … or
`fib(50)`

, if you have a lot of time on your hands.^{[return]} - Note that there is no r in memoization.
^{[return]} - The function is
*Tail–Recursive*, so in many languages this would be transformed by the compiler to a loop. However, Python is not one of those languages.^{[return]}