Tuesday, February 19, 2013

Evaluating, compiling and running arithmetical expressions.

Partly reminded to do this by recent FB posts by Graham Hutton and Andy Gill, I used foils and OHP pens to deliver my last lecture in the functional programming course "live". I like the sense of tension you get from writing stuff "out there" and interacting with the class in a more thoroughgoing way than seems to happen with pre-prepared slides.

What did I cover? The aim was to hammer home the message about the data driving the programming, and I took the example of processing arithmetical expressions, writing an evaluator, a virtual machine and a compiler … I didn't cover parsing, but that can come in the second half of the course: Stefan?

So, we started with a data type for expressions, with literals, +, - and *.

data Exp = Lit Int 
         | Add Exp Exp
         | Sub Exp Exp
         | Mul Exp Exp
          deriving (Show)

with the expression 2+((3-(-3))*3) represented by

(Lit 2) `Add` (((Lit 3) `Sub` (Lit (-3))) `Mul` (Lit 3))

The first thing to define is the evaluator, which has a straightforward data-directed definition:


eval :: Exp -> Int

eval (Lit n)       = n
eval (Add ex1 ex2) = eval ex1 + eval ex2
eval (Sub ex1 ex2) = eval ex1 - eval ex2
eval (Mul ex1 ex2) = eval ex1 * eval ex2


What sort of machine should we use to evaluate the expressions? The simplest approach is to build a stack machine, which can either push literals, or apply an operation to the top elements on the stack (from my slides):

Here's the type of code.

data Code = PushLit Int
          | DoAdd
          | DoSub
          | DoMul
            deriving (Show)


We can describe this machine by giving the function that runs a program, when we represent the stack by a list of integers, and the program by a list of code.

type Program = [Code]
type Stack = [Int]

run :: Program -> Stack -> Stack

run [] stack 
        = stack
run (PushLit int : program) stack 
        = run program (int : stack)
run (DoAdd : program) (v1:v2:stack)
        = run program (v1 + v2 : stack)
run (DoSub : program) (v1:v2:stack)
        = run program (v1 - v2 : stack)
run (DoMul : program) (v1:v2:stack)
        = run program (v1 * v2 : stack)
run _ _ = []


An alternative here would be to give the oneStep function, and then to define run in terms of that: that's an exercise for the reader.

The final piece of the puzzle is to define the compiler:


compile :: Exp -> Program

compile (Lit n) 
        = [PushLit n]
compile (Add ex1 ex2)
        = compile ex2 ++ compile ex1 ++ [DoAdd]
compile (Sub ex1 ex2)
        = compile ex2 ++ compile ex1 ++ [DoSub]
compile (Mul ex1 ex2)
        = compile ex2 ++ compile ex1 ++ [DoMul]


VoilĂ !

Why do I like this example so much?

  • It shows Haskell's strengths: I think my students get fed up of me comparing Haskell with Java, but this is a case where it really shines. The types make it very clear what's going on; the algebraic types guide the definitions, and we can even prove it correct …
  • What is the statement of correctness? That evaluation is the same as compiling and running:
    run (compile exp) [] = [eval exp]
    but there's a bit of work to do to get the right induction hypothesis (exercise).
  • There are lots of nice generalisations: we can introduce local definitions with a let construct, or indeed allow (updatable) variables within expressions. All these can be added in a nice type-directed way.
  • There are lots of related exercises too: e.g. generate truth tables for Boolean expressions, and using these write tautology checkers.

So, a very enjoyable last lecture on Haskell.

No comments:

Post a Comment

Post a Comment