# Optimising Right Recursion

*Update, 14/12/2014: Jeffrey Kegler, the author of the Marpa parser,
posted a response to this post. You probably should take a look.*

*Note: I have decided not to include this optimisation in the rest of
the series. It complicates the core algorithm, has further
repercussions in the parsing phase, and its performance benefits are
unclear: they need to be tested. (Leo didn't lie about the complexity
bounds, but he omitted less interesting considerations, such as the
ever-growing memory gap.) Simply put, more research is needed. I may
perform it. Someday.*

## The problem

Consider the following grammar:

```
A -> A 'a'
A ->
```

Let it parse this input: `"aaaaa"`. For now, there's no problem:

```
=== 0 ===
A -> • A 'a' (0)
A -> • (0)
A -> A • 'a' (0)
=== 1 ===
A -> A 'a' • (0)
A -> A • 'a' (0)
=== 2 ===
A -> A 'a' • (0)
A -> A • 'a' (0)
=== 3 ===
A -> A 'a' • (0)
A -> A • 'a' (0)
=== 4 ===
A -> A 'a' • (0)
A -> A • 'a' (0)
=== 5 ===
A -> A 'a' • (0)
A -> A • 'a' (0)
```

But then, consider this slightly different grammar (the right-hand side of the first rule has been reversed):

```
A -> 'a' A
A ->
```

Upon the same input, it yields *this*:

```
=== 0 ===
A -> • 'a' A (0)
A -> • (0)
=== 1 ===
A -> 'a' • A (0)
A -> • 'a' A (1)
A -> 'a' A • (0)
A -> • (1)
=== 2 ===
A -> 'a' • A (1)
A -> • 'a' A (2)
A -> 'a' A • (1)
A -> • (2)
A -> 'a' A • (0)
=== 3 ===
A -> 'a' • A (2)
A -> • 'a' A (3)
A -> 'a' A • (2)
A -> • (3)
A -> 'a' A • (1)
A -> 'a' A • (0)
=== 4 ===
A -> 'a' • A (3)
A -> • 'a' A (4)
A -> 'a' A • (3)
A -> • (4)
A -> 'a' A • (2)
A -> 'a' A • (1)
A -> 'a' A • (0)
=== 5 ===
A -> 'a' • A (4)
A -> • 'a' A (5)
A -> 'a' A • (4)
A -> • (5)
A -> 'a' A • (3)
A -> 'a' A • (2)
A -> 'a' A • (1)
A -> 'a' A • (0)
```

Now we have a problem: the previous grammar yielded a reasonable number of Earley items, but with this one, each set is one item bigger than the last, so the total number of items grows quadratically with the length of the input. Such problems are expected of Earley's algorithm, but come on! This grammar could be parsed by a simple LR parser in linear time! We should not need that many Earley items.
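To see that growth concretely, here is a toy recognizer. This is a Python sketch of my own (the `earley_sets` name, the tuple item encoding, and the grammar format are not from the series) that builds the Earley sets and counts the items in each:

```python
# Toy Earley recognizer, just big enough to count items per set.
# Items are tuples (lhs, production, dot, origin); a grammar maps each
# non-terminal to a list of productions (tuples of symbols), and any
# symbol absent from the grammar is treated as a terminal.

def earley_sets(grammar, start, tokens):
    n = len(tokens)
    sets = [set() for _ in range(n + 1)]
    for prod in grammar[start]:
        sets[0].add((start, prod, 0, 0))
    for i in range(n + 1):
        changed = True
        while changed:  # crude fixpoint loop; handles nullable rules
            changed = False
            for lhs, prod, dot, origin in list(sets[i]):
                if dot == len(prod):                    # completion
                    for l2, p2, d2, o2 in list(sets[origin]):
                        if d2 < len(p2) and p2[d2] == lhs:
                            new = (l2, p2, d2 + 1, o2)
                            if new not in sets[i]:
                                sets[i].add(new)
                                changed = True
                elif prod[dot] in grammar:              # prediction
                    for p in grammar[prod[dot]]:
                        new = (prod[dot], p, 0, i)
                        if new not in sets[i]:
                            sets[i].add(new)
                            changed = True
                elif i < n and prod[dot] == tokens[i]:  # scan
                    sets[i + 1].add((lhs, prod, dot + 1, origin))
    return sets

left_recursive  = {'A': [('A', 'a'), ()]}
right_recursive = {'A': [('a', 'A'), ()]}
print([len(s) for s in earley_sets(left_recursive,  'A', 'aaaaa')])
# → [3, 2, 2, 2, 2, 2]
print([len(s) for s in earley_sets(right_recursive, 'A', 'aaaaa')])
# → [2, 4, 5, 6, 7, 8]
```

The left-recursive grammar settles at two items per set; the right-recursive one gains an extra item at every step, and summing those ever-growing sets over the whole input is what makes the total quadratic.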

## The solution

The trick was published by Joop Leo in 1991. The idea is to avoid completing certain key Earley items.

### Deterministic reduction path

Picture this Earley item (Greek letters denote sequences of symbols):

```
A -> α • (i)
```

The deterministic reduction path, if any, is a chain of Earley items. The first item of this chain, if there is such a thing, satisfies 2 conditions:

- It is of the form `X -> β A • (k)`.
- There is *exactly one* item of the form `X -> β • A (k)` in the Earley set `S(i)`.

Now we can picture *this* Earley item: `X -> β A • (k)` (yes, it's
the one we just discovered), and find what's just above it. We can
then go up the chain until we can't. The last Earley item we find is
called the *topmost* complete item on the deterministic reduction
path.

By the way, you may notice the eerie similarity with the regular
completion step. See, when performing a completion, we check for all
items of the form `X -> β • A δ (k)` in `S(i)`, so we can add
`X -> β A • δ (k)` to the current Earley set. The reduction path
business is only a special case, where `δ` happens to be empty.

### The algorithm

We modify the completion step. Instead of performing a regular completion right away, we first examine our Earley item, and check if there is a deterministic reduction path.

- If there is, we just add the topmost item on this path to the current Earley set.
- Otherwise, we perform a regular completion.

The effect of these shenanigans is fairly simple: if we were doing the regular completion, we would end up adding the whole reduction path to the current Earley set. Instead, we just add the topmost item, and skip all the others.
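Here is one way this modified completion could be sketched in Python. Everything here is my own scaffolding, not code from the series: items are encoded as tuples `(lhs, production, dot, origin)`, and an item is complete when `dot == len(production)`:

```python
def det_parent(item, sets):
    """The next item up the deterministic reduction path, or None.
    It exists when S(origin) holds *exactly one* item of the form
    X -> beta • A (k), with A (our lhs) as the last symbol."""
    lhs, _, _, origin = item
    candidates = [c for c in sets[origin]
                  if c[2] == len(c[1]) - 1 and c[1][c[2]] == lhs]
    if len(candidates) != 1:
        return None
    l2, p2, d2, o2 = candidates[0]
    return (l2, p2, d2 + 1, o2)        # X -> beta A • (k)

def topmost(item, sets):
    """The topmost item of the path, or None if there is no path."""
    up = det_parent(item, sets)
    if up is None:
        return None
    while True:
        above = det_parent(up, sets)
        if above is None:
            return up
        up = above

def leo_complete(item, i, sets):
    """Leo-style completion: add only the topmost item, if any;
    otherwise fall back to a regular completion."""
    top = topmost(item, sets)
    if top is not None:
        sets[i].add(top)
    else:
        lhs, _, _, origin = item       # regular completion
        for l2, p2, d2, o2 in list(sets[origin]):
            if d2 < len(p2) and p2[d2] == lhs:
                sets[i].add((l2, p2, d2 + 1, o2))

# Demo on hand-built sets for the right-recursive grammar and input "aa":
a_A = ('a', 'A')
sets = [
    {('A', a_A, 0, 0), ('A', (), 0, 0)},
    {('A', a_A, 1, 0), ('A', a_A, 0, 1), ('A', (), 0, 1), ('A', a_A, 2, 0)},
    {('A', a_A, 1, 1), ('A', a_A, 0, 2), ('A', (), 0, 2), ('A', a_A, 2, 1)},
]
leo_complete(('A', (), 0, 2), 2, sets)
print(('A', a_A, 2, 0) in sets[2])     # → True: only the topmost was added
```

Completing `A -> • (2)` climbs through `A -> 'a' A • (1)` up to `A -> 'a' A • (0)`, and only that topmost item lands in the set.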

Let's take the construction of the final Earley set in our last example. With the regular algorithm, we get this:

```
=== 5 ===
A -> 'a' • A (4) -- scanned previously
A -> • 'a' A (5) -- Predicted
A -> 'a' A • (4) -- Special completion (because A is nullable)
A -> • (5) -- Predicted
A -> 'a' A • (3) -- Completed
A -> 'a' A • (2) -- Completed
A -> 'a' A • (1) -- Completed
A -> 'a' A • (0) -- Completed
```

Now see the last four completions? They're the deterministic
reduction path of `A -> • (5)`. With our new and improved algorithm,
we would only add the last one. Here is how it would go. First, we
would proceed as usual:

```
=== 5 ===
A -> 'a' • A (4) -- scanned previously
A -> • 'a' A (5) -- Predicted
A -> 'a' A • (4) -- Special completion (because A is nullable)
A -> • (5) -- Predicted
```

Then, things change as we complete `A -> • (5)`. First, we check for
the presence of an item of the form `X -> β • A (k)` in `S(5)`. There
is one, *and only one*:

```
A -> 'a' • A (4)
```

Now, we know that

```
A -> 'a' A • (4)
```

is in the deterministic path. So, instead of putting it in `S(5)`
right away (which by the way wouldn't change a thing, since it is
already present), we go up the chain, and look for an item of the
form `X -> β • A (k)`, but this time in `S(4)`. And lo and behold, we
find…

```
A -> 'a' • A (3)
```

So the next item up the chain is

```
A -> 'a' A • (3)
```

And so on until we find the topmost item:

```
A -> 'a' A • (0)
```

And we just add that. We therefore get this:

```
=== 5 ===
A -> 'a' • A (4) -- scanned previously
A -> • 'a' A (5) -- Predicted
A -> 'a' A • (4) -- Special completion (because A is nullable)
A -> • (5) -- Predicted
A -> 'a' A • (0) -- Topmost item in the deterministic reduction path
```

Now the number of items stops growing at each step. Here is what it would do to the whole thing:

```
=== 0 ===
A -> • 'a' A (0)
A -> • (0)
=== 1 ===
A -> 'a' • A (0)
A -> • 'a' A (1)
A -> 'a' A • (0)
A -> • (1)
=== 2 ===
A -> 'a' • A (1)
A -> • 'a' A (2)
A -> 'a' A • (1)
A -> • (2)
A -> 'a' A • (0)
=== 3 ===
A -> 'a' • A (2)
A -> • 'a' A (3)
A -> 'a' A • (2)
A -> • (3)
A -> 'a' A • (0)
=== 4 ===
A -> 'a' • A (3)
A -> • 'a' A (4)
A -> 'a' A • (3)
A -> • (4)
A -> 'a' A • (0)
=== 5 ===
A -> 'a' • A (4)
A -> • 'a' A (5)
A -> 'a' A • (4)
A -> • (5)
A -> 'a' A • (0)
```

Okay, that's still more than the left recursive version. Still, the number of Earley items per set is now bounded. We got our linear parsing back…

### The *real* algorithm

…mostly. In case you didn't notice, checking the reduction path takes
time. And the further we go into the parse, the longer the reduction
path gets, and the longer it takes to reach the top. Even worse, we
are checking and re-checking the *same* reduction path over and over!
So, while we got the space complexity back to normal, the time
complexity is still quadratic.

The obvious solution: cache the topmost item in the reduction path.
Next time we check, we won't have to go up the whole chain all over
again. To implement the cache, we use what Leo calls *transitive
items*. A transitive item is something of the form:

```
X -> α • (i), Y
```

where `X` and `Y` are non-terminal symbols, and `α` is a sequence of
symbols (terminal or non-terminal). Basically, it is a completed
Earley item, with a symbol attached. The Earley item itself is the
topmost item on some reduction path. The symbol tells us *which*
reduction path is involved.

Oh, I didn't tell you: at any given point in the parse, you can't
find more deterministic reduction paths than there are non-terminal
symbols in the grammar. Looking back at the definition of
deterministic paths, it should have been obvious. (I actually missed
it, and got the tip from Jeffrey Kegler, in his Marpa parser paper.)
In our example above, for instance, there is only one non-terminal
symbol (`A`). So, there is only one deterministic reduction path.

Now, when we find the topmost item in the reduction path, we add the corresponding transitive item to the current Earley set. While we're at it, we also add it to all the Earley sets we just examined.

*(That last one bothers me. We used to touch only two Earley sets:
the current one (predictions and completions), and the next one
(scans). This could enable the use of a neat dynamic array for the
whole thing. But with this new "optimisation", we now have to deal
with arbitrary insertions. This could be a performance killer: arrays
aren't exactly insertion friendly, and anything else likely involves
an uncanny amount of pointer chasing. Maybe we can mitigate the
damage with some kind of cache-aware data structure, but I shudder at
the sheer complexity.)*
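Here is a sketch of that caching, in Python. Again the scaffolding is mine, not Leo's: items are tuples `(lhs, production, dot, origin)`, and each Earley set carries a plain dict mapping a non-terminal to the cached topmost item of its reduction path:

```python
def det_parent(item, sets):
    """Unique item X -> beta • A (k) in S(origin), advanced one step;
    None if it does not exist or is not unique."""
    lhs, _, _, origin = item
    candidates = [c for c in sets[origin]
                  if c[2] == len(c[1]) - 1 and c[1][c[2]] == lhs]
    if len(candidates) != 1:
        return None
    l2, p2, d2, o2 = candidates[0]
    return (l2, p2, d2 + 1, o2)

def topmost_cached(item, sets, transitive):
    """Topmost item of the reduction path, or None.  Fills the cache of
    every Earley set examined along the way."""
    visited = []          # (set index, symbol) pairs to cache afterwards
    current, top = item, None
    while True:
        lhs, _, _, origin = current
        cached = transitive[origin].get(lhs)
        if cached is not None:
            top = cached                          # cache hit: stop climbing
            break
        up = det_parent(current, sets)
        if up is None:
            top = current if visited else None    # no (more) parents
            break
        visited.append((origin, lhs))
        current = up
    for k, sym in visited:    # top is never None when visited is non-empty
        transitive[k][sym] = top
    return top

# Demo on hand-built sets for the right-recursive grammar and input "aa":
a_A = ('a', 'A')
sets = [
    {('A', a_A, 0, 0), ('A', (), 0, 0)},
    {('A', a_A, 1, 0), ('A', a_A, 0, 1), ('A', (), 0, 1), ('A', a_A, 2, 0)},
    {('A', a_A, 1, 1), ('A', a_A, 0, 2), ('A', (), 0, 2), ('A', a_A, 2, 1)},
]
transitive = [{}, {}, {}]
print(topmost_cached(('A', (), 0, 2), sets, transitive))
# → ('A', ('a', 'A'), 2, 0)
print(transitive[1])   # the examined set got its transitive item too
```

The first call walks the whole chain and leaves a transitive item in every set it crossed; a later completion with the same origin finds the answer in one dict lookup instead of re-climbing the path.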

In our example it would go like this:

```
=== 0 ===
A -> • 'a' A (0) -- as usual
A -> • (0) -- as usual
=== 1 ===
A -> 'a' • A (0) -- as usual
A -> • 'a' A (1) -- as usual
A -> 'a' A • (0) -- Search for the topmost item; there is none.
A -> • (1) -- as usual
A : A -> 'a' A • (0) <─────────this transitive item────┐
                                                       │
=== 2 ===                                              │
A -> 'a' • A (1) -- as usual                           │
A -> • 'a' A (2) -- as usual                           │
A -> 'a' A • (1) -- Search for the topmost item        │
                    Finds A -> 'a' • A (0), creates ───┘
A : A -> 'a' A • (0) <─── And this one as well
                          (Yes, it is the same)
A -> • (2) -- Search for the topmost item,
              finds the transitive item above
A -> 'a' A • (0) -- Search for the same topmost item,
                    there is none.
```

This little dance then goes on in the same fashion:

```
=== 3 ===
A -> 'a' • A (2)
A -> • 'a' A (3)
A -> 'a' A • (2)
A : A -> 'a' A • (0) -- copy of the transitive item above
A -> • (3)
A -> 'a' A • (0)
=== 4 ===
A -> 'a' • A (3)
A -> • 'a' A (4)
A -> 'a' A • (3)
A : A -> 'a' A • (0) -- copy of the transitive item above
A -> • (4)
A -> 'a' A • (0)
=== 5 ===
A -> 'a' • A (4)
A -> • 'a' A (5)
A -> 'a' A • (4)
A : A -> 'a' A • (0) -- copy of the transitive item above
A -> • (5)
A -> 'a' A • (0)
```

More or less. Sorry for the lack of details (especially with respect to empty rules), I'm not very motivated to flesh it out right now: I'm not sure this optimisation is worth the trouble, and the rest of the series is more important anyway.