@ Loup's

Impossible? Like that would stop me.


Here are some experiments about metacompilation. For now, I have two toys: a re-implementation of MetaII, and a simple implementation of Parsing Expression Grammars on top of Parsec for Haskell.

Meta II for Lua

This is a straightforward re-implementation of META II from Shorre's paper, using Lua instead of a custom assembler. You can download it here. Despite this being 60's technology, it's quite impressive: this compiler needs only 50 lines of Lua support code to run, and compiles itself in 30 lines.

There's a similar implementation for C by Long Nguyen which predates mine, but I swear I didn't looked at it to do mine (which was good for my ego and for learning, but stupid from a practical perspective).

If you know nothing about such metacompilers, I suggest you read this excellent tutorial by James M. Neighbors.

Of course, you need Lua. I have tested it with Lua 5.2.


$ tar -xzf metaII-lua.tar.gz
$ cd metaII
$ ls
bootstrap.lua  makefile  runtime.lua
bootstrap.txt  meta.txt


The compiler is made up of two Lua files. The compiler itself, and runtime.lua. For now, your only working compiler is meta-bootstrap.lua. You use it like this:

$ lua meta-bootstrap.lua <meta.txt >meta.lua

(You can also compile meta-bootstrap.txt. Actually, that's how I made meta-bootstrap.lua. I didn't write it by hand. For those who see the infinite regress, I did write the first compiler by hand. I considered using Long Nguyen's compiler instead, but preferred the hard path.)

The compiler you just made can also compile itself:

$ lua meta.lua <meta.txt >meta-diff.lua
$ diff meta.lua meta-diff.lua

Yep. No difference. For the lazy, the whole process is encoded in the makefile:

$ make
$ ls
bootstrap.lua  meta.lua  makefile       runtime.lua
bootstrap.txt  meta.txt  meta-diff.lua

Note: the compilers stemming from bootstrap.txt and meta.txt are the same, except bootstrap.txt adds comments to the code of the compiler. It may help you decipher the generated Lua code.


Let's start with what I consider the hello-world of compilation: infix arithmetic expressions. Write this in a new file (say, arithmetic.txt).

.syntax expr
  expr = exp2 *( '+' exp2 {'add'     }
               | '-' exp2 {'sub'     });
  exp2 = exp3 *( '*' exp3 {'mul'     }
               | '/' exp3 {'div'     });
  exp3 = .number          {'push ' $ }
       | '(' expr ')';

The syntax is loosely based on Backus–Naur Form. What's between {} are things we print as we recognise patterns. They can be seen as patterns that allways succeed. The $ prints the last recognised token. .number is a primitive pattern that recognises integers, and the * denotes repetition. The whole thing turns arithmetic expressions into some sort of reversed Polish notation. So, let's compile this compiler:

$ lua meta.lua <arithmetic.txt >arithmetic.lua

Then, test it:

$ cat expr.txt
1 + 2 * (3 + 4) - 5
$ lua arithmetic.lua <expr.txt
push 1
push 2
push 3
push 4
push 5

Ta daa!

NoMeta for Haskell

This basically implements Parsing Expressions Grammars as syntactic sugar over parser combinators. This one is build on top of Parsec. You can get it here.

The principle is a bit different from the other compiler: instead of defining a standalone language, this one works as a pre-processor for Haskell, augmenting its syntax like a naive implementation of the do notation.


$ tar -xzf nometa-haskell-rule.tar.gz
$ cd nometa-haskell-rule
$ ls
funmeta-bootstrap.hs  funmeta.hs  makefile

funmeta-bootstrap.hs is pure hand-written Haskell code that use Parsec in a mostly readable way. funmeta.hs described the same compiler than its bootstrap version, but makes heavy use of the syntax it implements. It should be decipherable as it is.


Again, the whole bootstrap process is described in the makefile:

$ make

Basically, you compile funmeta-bootstrap.hs, then use the result to pre-process the mostly Haskell code in funmeta.hs. Compile the result, and pre-process funmeta.hs with that, and see that you get the same result than with the first time around.

Now you can use Parsec with a fancy syntax.


As it stands, this compiler not worth the trouble. My syntax is more readable than the combinators themselves only by a hair. I do have several improvements in mind however:

NoMeta for Haskell (reloaded)

This one is much better. It deals with grammars instead of just rules. It does not however have a syntax for anonymous rules in regular haskell code (I don't think we really need one: my syntax is only worth the trouble if we define many rules at once).

You can get this version here. It is installed and used just like the other.


This is basically a proof of concept. If anyone find it useful, please fork it, polish it, and put it on Hackage.