March 2013

# From Imperative to Functional: how to make the leap

Many programmers find functional programming difficult, if not
downright cryptic. This is because functional programming is not
merely about relinquishing the assignment statement and using
closures liberally. It's a whole different *mindset*.

## The functional mindset

…basically boils down to three things:

- Your program isn't about what it does, but about what things are.
- Your code isn't just a linear sequence of instructions, but a tree of nested expressions.
- Functions are ordinary mathematical objects, just like an integer is.

## Some notations first

For this tutorial, I won't be using the syntax of any particular programming language. That would obscure my point. Instead, I will borrow from high school mathematics. Here is an example:

```
n : ℕ
n = 42
```

Which means, "let n be a natural number whose value is 42. More generally, when I say

```
x : T
```

I mean that `x`

has type `T`

. Basically `x`

is an *element* of the
set `T`

(let's just assume that types are sets). And when I say

```
x = Expr
```

I just mean that `x`

is the value denoted by the expression `Expr`

.
Now let's define a function:

```
f : ℕ → ℝ
f = x ↦ x + π
```

Which means "Let f be a function from natural numbers to real numbers, which returns its input plus π". To make reading easier, we will use this more familiar syntax whenever possible:

```
f(x) = x + π
```

I will however insist on using type annotation such as this:

```
f : ℕ → ℝ
```

They're not central to my point, but they make useful documentation.

## Descriptive mindset

Compare the following definitions:

```
-- imperative │ -- functional
│
square : ℕ → ℕ │ square : ℕ → ℕ
square(n) │ square(n) = n * n
{ sq = n * n │
return sq │
} │
```

Even though the two definitions above are strictly equivalent, they are very different in spirit:

The imperative side reads like procedure to follow: "compute n times n, put it in sq, then return sq to whoever called you". (I'm anthropomorphising

`square()`

here, because that's how the imperative mindset often operates).The functional code reads like a description: "the square of n is n times n".

Don't be fooled by this apparent simplicity. In functional
programming the descriptive mindset is *mandatory*. If you stay stuck
into the active mindset, not only will you miss the assignment
statement dearly, but the other difficulties (reading order and first
class functions), will be almost insurmountable.

## Reading order

Functional code is rarely a sequence of instructions that one can just read from top to bottom. To quote Paul Graham's On Lisp, "an imperative program is a functional program turned inside out". Before we see an example, I need to introduce a bit of syntax: tuples.

```
p : ℕ × ℝ
p = (42, π)
```

As you may have guessed, ℕ × ℝ denotes the Cartesian product. So, let's see our example (inspired from §3.2 of the book):

```
-- imperative │ -- functional
│
compute : ℝ×ℝ → ℕ×ℝ │ compute : ℝ×ℝ → ℕ×ℝ
compute(x,y) │ compute(x, y) = (42, |x| + 2)
{ abs = |x| │
res = abs + 2 │
return (42, res) │
} │
```

This silly `compute`

function takes a pair of real numbers as input,
and returns a pair of numbers (one natural, one real). Now look at
how we can convert the imperative program into the functional one,
step by step:

```
return (42, res) -- original return statement
return (42, abs + 2) -- replacing res by its value
return (42, |x| + 2) -- replacing abs by its value
```

What we did is take the imperative program, and turned it outside in:
we stuffed the expressions on the right hand side of `abs`

and `res`

*inside* the final expression. So, after the modification, we have
this:

```
compute : ℝ×ℝ → ℕ×ℝ
compute(x,y)
{ abs = |x| -- we no longer need
res = abs + 2 -- those two lines
return (42, |x| + 2)
}
```

After removing dead code:

```
compute : ℝ×ℝ → ℕ×ℝ
compute(x,y)
{ return (42, |x| + 2)
}
```

And now we can use the functional syntax directly:

```
compute : ℝ×ℝ → ℕ×ℝ
compute(x, y) = (42, |x| + 2)
```

Now, going back and forth the two programming styles is all well and good, but what does it have to do with reading order?

Simple: the imperative program have an obvious reading order: instruction by instruction, from the first to the last. As we have seen, the first statement in the imperative program corresponds to the inner-most expression in the functional program. Likewise, the last imperative statement corresponds to the outer-most functional expression.

So, the functional program typically reads from the inner-most expressions to the outer-most ones. This is no more difficult than reading a sequence of statements, but it is different, and requires some getting used to.

By the way, I believe this getting used to is a major reason why the ternary operator in C is frowned upon:

```
// What I would write: │ // What I often see
int abs(int n) │ int abs(int n)
{ │ {
return n > 0 │ if (n > 0)
? n │ return n;
: -n; │ else
} │ return -n;
┌─────────────────┘ }
│
│ // When QA forbids multiple returns
│ int abs(int n)
│ {
│ int result;
│ if (n > 0)
│ result = n;
│ else
│ result -n;
│ return result;
│ }
│
│ // When QA forbids brace-less ifs
│ ...
```

The code on the left shows my slant towards functional programming, and can't really be read as a sequence of statements. The program on the right, on the other hand, is clearly a set of instructions of the familiar form "if this, then do that, otherwise do those".

There are other differences in style:

```
// What I sometimes see │ // What I would write
if (cond_1) │ if (cond_1 && cond_2)
{ │ {
if (cond_2) │ // some code
{ │ }
// some code │
} │
} │
```

I know some of you will really believe the code on the left is more
readable than the code on the right. It is not. While each separate
line on the left *is* at least as readable as the corresponding line
on the right, there are more of them. Overall, the structure of both
listings have the same complexity: we just trade an `if`

statement for
a boolean expression. Nevertheless, I think writing the version on
the left is a mistake, for two reasons:

- The syntax is significantly heavier in many languages.
- The control flow is is more obvious with one
`if`

instead of two.

And then there are outright mistakes:

```
// What I sometimes see │ // What I'd write
bool flag = false; │ bool flag = (userInput == "yes");
if (userInput == "yes") │
flag = true; │
```

A cursory look tells me there is just no way the code on the left is
more readable. It has an initialisation, an `if`

statement, a
relational expression, and an assignment statement. The code on the
right only has an initialisation and a relational expression. Yet
some programmers still prefer the code on the left. How so?

Well, one still cannot read the code on the right from top to bottom without thinking. Again, that requires some getting used to. But there is another "problem": this code actually goes meta.

While booleans are first class values in nearly every language I know
of, they are rarely *used* as such. For instance, booleans are often
not thought of as truth values even when it is warranted:

```
// What I often see │ // What I would write
if (flag == true) │ if (flag)
```

Many people would tell you that the listings on the left are more readable, even though they're strictly more complex. The reason is, they mentally separate the world of propositions, used for branching and looping, and the world of ordinary values, used for more ordinary calculations. They don't think of truth (or lack thereof) as an ordinary value. But believe me, when you do think of truth values as something as ordinary as integers, your code just gets a bit simpler:

```
-- naive approach
imply : (bool × bool) → bool │ imply : (bool × bool) → bool
imply(A, B) │ imply(A, B) =
{ if A = false │ if A = false
then return true │ then true
if B = true │ else if B = true
then return true │ then true
else return false │ else false
} │
-- first class booleans
imply : (bool × bool) → bool │ imply : (bool × bool) → bool
imply(A, B) │ imply(A, B) = impl
{ │ with impl = nulA ∨ B
nulA = ¬A │ nulA = ¬A
impl = nulA ∨ B │
return impl │
} │
-- full functional approach
imply : (bool × bool) → bool │ imply : (bool × bool) → bool
imply(A, B) │ imply(A, B) = ¬A ∨ B
{ return ¬A ∨ B │
} │
```

(I hope we do agree that the programs at the bottom are by far the more readable ones.)

I digress, but I needed to introduce meta thinking and first class booleans before I get to the real meat.

## First class functions

If there is one thing in functional programming that sends imperative minds running and screaming into the night, it's those.

Don't panic.

The trick is to remember that a function is a mathematical object like any other. You may think it is not, most probably because, unlike a number or a set, a function is not a passive piece of data: when you give it an input, it produces a result.

Nevertheless, functions *can* be seen as a passive piece of data.
See this?

Well, in this image the function, (which happens to map objects to
their colours) is just the set of arrows. A set. If you want
more precision, a function `f`

from `X`

to `Y`

is a subset of the
Cartesian product `X × Y`

, with the property that For each `x`

, there
is one and only one `y`

such that the pair `(x, y)`

is an element of
`f`

. If not just envision the graph of a function: it describes
the function completely, yet is a completely passive object.

So, functions are "ordinary" mathematical objects. Yet, many programming languages don't treat them as such. Mostly for ease of implementation, functions in programming languages are subject to various restrictions. In those languages, you may not be able to:

- give a function as a parameter to another function;
- write expressions that describe functions directly;
- returning a function as the result of a function call.

Making functions "first class" is all about lifting those restrictions, so you can use them as freely as you would an integer. The question is, what are the practical implications?

First, let's begin with three examples (one for each of the restrictions above)

### functional parameters

```
apply : ((ℕ → ℝ) × ℕ) → ℝ
apply(f, n) = f(n)
```

This is a re-implementation of function application, which uses the
built-in one. I know it looks contrived, but I swear it does have
its uses. So, `apply`

takes two arguments:

- a function from integers to real numbers, and
- an integer.

With those, it return an integer.

(The astute reader would have noted that there are no mention of
integers or real numbers in the definition of `apply`

. Indeed, it
could work on any type `A`

and `B`

instead. We are not ready however
to discuss system F just yet, so I will just pretend
parametric polymorphism doesn't exist.)

Using `apply()`

is easy:

```
f : ℕ → ℝ
f(x) = x + π
a : ℝ
a = apply(f, 42) -- now, the number a equals 42 + π
```

Simplicity itself.

### literal function

We usually use the term *anonymous* function, because it's a function
without a name. I don't like the term, because it gives you the hint
that functions without a name are the exception rather than the rule.
Hence the term "literal", as in "literal string". Strings can be
written directly?

```
"Hello World!"
```

Well, so can functions:

```
x ↦ x + π
```

This is the same syntax I used to define `f`

at the beginning of this
tutorial. I can use it to simplify my example on the usage of `apply`

above:

```
a : ℝ
a = apply(x ↦ x + π, 42)
```

I can also apply the function directly, and still have the same result.

```
a : ℝ
a = (x ↦ x + π)(42)
```

Though I concede that at that point, I might as well write

```
a : ℝ
a = 42 + π
```

directly. But in practice, functions can be given as argument to more
complicated functions than `apply`

, so simplifications such as this
one aren't always possible.

### returning functions

Here is another example from high school: function composition:

```
∘ : ((ℚ → ℝ) × (ℕ → ℚ)) → (ℕ → ℝ)
g ∘ f = (x ↦ g(f(x)))
```

(Let's pretend my language can define infix operators. And again, function composition can be more generic than that.)

Now, can you guess the value of this expression?

```
((x ↦ x + π) ∘ (x ↦ x ÷ 2)) (13)
```

To know this, we will have to perform step by step reduction. First, let's use the ∘ operator as a prefix function, instead of an infix operator:

```
(∘) ((x ↦ x + π), (x ↦ x ÷ 2)) (13)
```

Then, we replace `(°)`

by its value:

```
((g, f) ↦ (x ↦ g(f(x)))) ((x ↦ x + π), (x ↦ x ÷ 2)) (13)
```

(This is starting to look seriously unreadable…) Then, we start doing
actual substitutions: first we replace `g`

and `f`

by the values of
the corresponding call parameters (with a bit of renaming so names
don't clash):

```
(x ↦ (z ↦ z + π) ((y ↦ y ÷ 2) (x))) (13)
(z ↦ z + π) ((y ↦ y ÷ 2) (13))
(z ↦ z + π) (13 ÷ 2)
(13 ÷ 2) + π
```

I could also evaluate the body of `(∘)`

first. It makes it more
obvious what function composition actually does:

```
(x ↦ (z ↦ z + π) ((y ↦ y ÷ 2) (x))) (13)
(x ↦ (z ↦ z + π) (x ÷ 2)) (13)
(x ↦ (x ÷ 2) + π) (13)
(13 ÷ 2) + π
```

Compare the last line with the original expression:

```
((x ↦ x + π) ∘ (x ↦ x ÷ 2)) (13)
```

That was a lot of complications for such a simple result, wasn't it? Actually, that's kinda the point: you don't want to do yourself what the compiler could have done for you. Conceptually, chaining functions together isn't that complicated. This for instance:

```
h ∘ g ∘ f
```

is a function that works by first applying `f`

, then `g`

to the result
of `f`

, then `h`

to the result of `g`

. Without the composition
operator however, I would have to write:

```
x ↦ h(g(f(x))
```

which is arguably much less readable. That would be a shame, since functional programming is often about chaining functions together.

### The point of it all

Passing functions around is cool and all, but it looks like a heap of trouble. Why would we do that in our day to day work?

The answer is, we already do.

So called object oriented languages such as C++, Java, or Python have
this nice thing called "subtype polymorphism", which
means that when you pass around an object, its actual behaviour may
not be determined at compile time. This is achieved through
inheritance and method overriding. This can be a *great* mechanism
for code reuse, but it's quite cumbersome.

The functional approach is more direct. Instead of manipulating whole objects, we pass around the behaviour (a function) directly. With syntactic support from the language, this is way more straightforward, and enables a number of powerful idioms.

One can for instance write one's own loops. Let's say for instance you want to process lists. Well, my language doesn't have lists, so we'll work with pairs instead:

```
inc-all : ℝ×ℝ → ℝ×ℝ
inc-all(x, y) = (1 + x, y + 1)
dbl-all : ℝ×ℝ → ℝ×ℝ
dbl-all(x, y) = (2 × x, y × 2)
```

Simple, but not simple *enough*. There's quite a bit of duplication
here. We need to factor out these commonalities:

```
map : ((ℝ → ℝ) × (ℝ×ℝ)) → ℝ×ℝ
map(f, (x, y)) = (f(x), f(y))
```

As you may have noticed, the first argument of map is a function (from
numbers to numbers). And now we can redefine `inc-all`

and `dbl-all`

in a simpler fashion:

```
inc-all : ℝ×ℝ → ℝ×ℝ
inc-all(p) = map ((x ↦ 1 + x), p)
dbl-all : ℝ×ℝ → ℝ×ℝ
dbl-all(p) = map ((x ↦ 2 × x), p)
```

Well, I admit it's not exactly simpler. Nevertheless, being able to
give a function argument to `map`

did allow us to repeat ourselves
less. It can still be improved, though. Right now, `inc-all`

and
`dbl-all`

have the following structure:

```
f(x) = map(g, x)
```

This `x`

is line noise. I'd like to be able to remove it. Like this:

```
f = map(g)
```

To do that, I need not only to have a function argument, but also to
return a function result. So let's redefine `map`

:

```
map : (ℝ → ℝ) → (ℝ×ℝ → ℝ×ℝ)
map(f) = (x, y) ↦ (f(x), f(y))
```

Now I can write:

```
inc-all : ℝ×ℝ → ℝ×ℝ
inc-all = map(x ↦ 1 + x)
dbl-all : ℝ×ℝ → ℝ×ℝ
dbl-all = map(x ↦ 2 × x)
```

Better, isn't it? Still, this is not enough. See the literal functions? Their structures are the same, the only thing that changes is the operator. Now we have seen with function composition that operators are actually functions with two arguments —or functions with a pair argument. We can take advantage of that:

```
curry2 : (ℝ×ℝ → ℝ) → (ℝ → (ℝ → ℝ))
curry2(f) : x ↦ (y ↦ f(x, y))
inc-all : ℝ×ℝ → ℝ×ℝ
inc-all = map(curry2(+)(1))
dbl-all : ℝ×ℝ → ℝ×ℝ
dbl-all = map(curry2(×)(2))
```

Now we did factor out everything.

### on syntax

The last listing still leave me a bit uneasy. We can do better.

The thing is, we're trying to do functional programming here. Which means we're mostly calling functions. And whatever we do the most should incur the least cognitive load possible. In programming languages, this means using lightest possible syntax. My language does not do that:

```
f(x) -- Function call in my language and high school math.
f x -- Lightest possible syntax: nothing at all.
```

If I change my language to use the better syntax, then the definitions above will change a little:

```
map : (ℝ → ℝ) → ℝ×ℝ → ℝ×ℝ -- arrows associate to the right
map f = (x, y) ↦ (f(x), f(y))
curry2 : (ℝ×ℝ → ℝ) → ℝ → ℝ → ℝ
curry2(f) = x ↦ y ↦ f(x, y) -- arrows associate to the right
inc-all : ℝ×ℝ → ℝ×ℝ
inc-all = map (curry2 (+) 1) -- function application associates
-- to the left.
dbl-all : ℝ×ℝ → ℝ×ℝ
dbl-all = map (curry2 (×) 2)
```

Better, but frankly, not much. But we're not finished. Remember that
I wrote `map`

and `curry2`

to factor my code as much as possible.
This suggests that functions that are in "curried" form in the first
place:

```
f : ℝ → ℝ → ℝ
f = x → y → whatever
f x y = whatever -- short hand
```

are more useful than functions that are in "un-curried" form:

```
f : ℝ×ℝ → ℝ
f(x, y) = whatever
```

(Now you know why `curry2`

is named the way it is: it "curryfies"
functions with two arguments.) So, I will make one last change to my
language: operators are now curried by default:

```
map : (ℝ → ℝ) → ℝ×ℝ → ℝ×ℝ
map f (x, y) = (f(x), f(y))
inc-all : ℝ×ℝ → ℝ×ℝ
inc-all = map ((+) 1)
dbl-all : ℝ×ℝ → ℝ×ℝ
dbl-all = map ((×) 2)
```

I don't even need to define `curry2`

.

My point is, some languages use juxtaposition for function calls, and their standard libraries are curried by default. The most notable examples are ML and Haskell. The advantage is extreme conciseness in many situations. It can be quite cryptic however if you're not prepared for this.

The regular syntax for function calls however is more common, so I used it in the hope to prepare you for functional code in JavaScript or such.