Intermediate Representation

HC

Mateusz

Dec 8, 2022

intermediate-representation
intermediate-representation
intermediate-representation

Right now we compile from Polylang to Miden in one pass, adding an intermediate representation will allow us to perform optimizations and generate more efficient code, as well as provide an alternative for writing built-ins that need to be performant.

Intermediate Representation

We have been working on an Intermediate Representation for the compiler. The idea is to have a representation, between Polylang and Miden, that is easy to work with programmatically and that can be easily compiled into the final representation which is Miden Assembly. If you wanted to do optimizations, you would do them on this representation, rather than on Polylang or Miden ASM level, because it's easier to work with. The main motivation behind starting it now rather than later was to make it easier to implement performant built-ins in the compiler, without writing Miden Assembly, which can be time-consuming and hard to debug.

The IR is a tree of expressions. An expression is either a function call, a constant stack value, an if, or a while. When we compile the IR, we start from the output expressions and recursively compile the expressions that they depend on. This can also eliminate redundant code, like an unused variable.

This is an example of a graph for a program that adds two numbers. To compile it, we would start from the result node and recursively compile the nodes that it depends on.

The resulting Miden program would look like this:

Polybase Miden intermediate representation.

This one is simple because it doesn't need to move the stack around, but it can get more complicated if you need to use an expression more than once or have expressions that return more than one value, or expressions where the order is important (like writing and reading from memory). Abstracting away the stack is a big quality-of-life improvement, it makes writing programs a lot easier.

There is also an x node that is not compiled because the result node doesn't depend on it. This hasn’t been integrated with the compiler first as we have decided to work on the other parts of the compiler first.