A5: SSA

Due: Tuesday, 4/5

exp to Static Single Assignment RTL

Your job in this assignment: translate the Grumpy expression language (exp.mli) to a new intermediate representation called RTL ("Register Transfer Language"). The definition of RTL is given (and heavily commented) in ssa.mli. We call the file ssa.mli (rather than, e.g., rtl.mli) because your translation function will generate RTL programs in Static Single Assignment (SSA) form.

Download the entire set of A5 files here. You won't rely very heavily on the Grumpy language specification for this assignment but we link to it just in case.

Before doing any actual programming, read through all the instructions below. And as always, ask early on Piazza if something's unclear!

Changes to the Grumpy spec since A4

To maintain compatibility with LLVM (looking forward to A6), the putchar primitive now has the following specification:
  putchar(int) : int
  The [putchar] function prints an integer in the range [0..2^8-1] 
  to stdout. It returns the integer printed (or on error, the 
  integer error code EOF, just as in C).

  If the integer argument supplied to [putchar] is not in the 8-bit
  character range given above, the result is undefined.
Previously, in A4, putchar returned the unit value.

If you're re-using your own type-checker for this assignment, you'll have to update putchar's return type to match the new specification.
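To make the new specification concrete, here is a minimal OCaml model of putchar's A5 behavior. It is only an illustration, not part of the assignment files: the names putchar_model, emit, and eof are hypothetical, the value -1 for EOF is an assumption (the spec only says "the integer error code EOF"), and raising an exception on out-of-range input is just one way to treat a case the spec leaves undefined.

```ocaml
(* Assumption: EOF is -1, as on typical C platforms. The Grumpy spec
   only says "the integer error code EOF". *)
let eof = -1

(* [putchar_model emit n] models putchar(int) : int.
   [emit] is a hypothetical output hook (e.g. [print_char]) that makes
   the model easy to test without touching stdout. *)
let putchar_model (emit : char -> unit) (n : int) : int =
  if n < 0 || n > 255 then
    (* The spec leaves this case undefined; raising is one possible choice. *)
    invalid_arg "putchar_model: argument outside [0..2^8-1]"
  else
    try
      emit (Char.chr n);
      n                      (* success: return the integer printed *)
    with Sys_error _ -> eof  (* output error: return EOF *)
```

For example, putchar_model print_char 72 prints H and evaluates to 72. The point for your type-checker is only the result type: an application of putchar now has type int rather than unit.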

Pair Programming

On this assignment (as in A4), you may pair-program with up to one other person. If you do so, write that person's name in a comment at the top of your ssa.ml. For example:
  (* NAME: Your name
     OUID: Ohio University ID

     I worked with ... on this assignment. *)
Each student should individually turn in ssa.ml on Blackboard, regardless of whether you worked with someone else. Pair programming does not mean each student does half of the assignment. Instead, it means the two of you construct ssa.ml collaboratively, while both sitting at the same computer screen.

1. Download the assignment files

First, download the assignment files and unzip the resulting gzipped tarfile into a new directory.
$ tar xzvf a5.tgz
In the resulting directory src you'll find the following file structure:
  src/               -- compiler source files
    Makefile         -- the project Makefile
    _tags            -- the tags file for ocamlbuild
    AST.mli          -- language-independent abstract syntax stuff
    AST.ml           -- associated helper functions
    exp.mli          -- the definition of Grumpy's abstract syntax
    exp.ml           -- associated functions
    lexer.mll        -- ocamllex source file (stub)
    parser.mly       -- Menhir source file (stub)
    tycheck.mli      -- The type-checker interface
    tycheck.ml       -- The type-checker (stub)
    ssa.mli          -- Defines the RTL intermediate language
    ssa.ml           -- (Part II)
    grumpy.ml        -- the toplevel compiler program    
    tests/           -- test cases
To build the project, type
$ make
As in A3 and A4, you'll see a bunch of warnings at this point. That's OK. The lexer, parser, and type-checker files are the same stubs that were given to you in the last assignment. Before you get started on this assignment, copy your own lexer.mll, parser.mly, and tycheck.ml in their place.

If your type-checker doesn't work quite right, you can request a working version from Sam or from me. Just shoot one of us an email. We only ask that you don't share this file with others or post it on the internet.


Run the tests by doing

  $ make test
or by typing ./run.sh from within the tests directory. (Here's sample passing test output and sample failing test output.)

For this assignment, the *.expected files in the tests directory contain the values we expect each Grumpy program to return. Building the test target does the following:

The last two files are most relevant to this assignment. You won't have to edit the RTL interpreter (ssa_interp) for this assignment; however, it may behoove you to read through it, especially as you're diagnosing errors in your compilation function (generally, these errors will result in testcase exceptions caught at "runtime" by ssa_interp).

2. ssa.ml

TL;DR Complete the definitions of fresh_ids, instrs_of_exp, and instrs_of_explist, which together map type-annotated Grumpy source programs of type (ty, ty exp) prog to RTL programs of type (ty, instr list) prog.

Your job in this part is to implement the compilation functions sketched out as stubs in src/ssa.ml. The top-level compilation function

  val ssa_of_prog : (ty, ty exp) prog -> (ty, instr list) prog
maps type-annotated Grumpy expression ASTs to programs in which function bodies and the program result are lists of RTL instructions; it's implemented for you. The main functions you need to implement are described below.


Start by opening ssa.mli. Read the commented descriptions of the types iexp and instr, which collectively define the RTL language. For example:
  (** [ty]-annotated identifiers *)
  type iid = ty tid

  (** As in the [exp] language, we define two versions of RTL
      expressions, which together correspond to a subset of the                  
      expressions of type [exp] in [exp.mli]. The first, [raw_iexp],             
      defines the RTL language's expression constructors. The second,            
      [iexp], wraps [raw_iexp]s with a type [ty]. *)
  type raw_iexp =
    | IInt of int32                 (** 32-bit integers *)
    | IFloat of float               (** Double-precision floats *)
    | IId of id                     (** Identifiers *)

You may need to reference other files, such as AST.mli, to recall the definitions of types such as tid (typed identifiers).


Once you've thoroughly read the comments describing RTL, define the provided stub functions:

let fresh_ids (p : gensym_pkg) (n : int) : id list = ...
let rec instrs_of_exp (p : gensym_pkg) (out : id) (e : ty exp) : instr list = ...
and instrs_of_explist (p : gensym_pkg) (out : id) (el : (ty exp) list) : instr list = ...
Informal specifications for each of these functions are given in comments in the file. A few of the cases of instrs_of_exp are given for you. For example:
  | EInt n -> [IAssign(out, mk_iexp (IInt n) e.ety_of)]
  | EFloat f -> [IAssign(out, mk_iexp (IFloat f) e.ety_of)]
  | EId x -> [IAssign(out, mk_iexp (IId x) e.ety_of)]
These lines define the integer, float, and id cases. Note that each one converts an expression into an assignment instruction (of course, not every expression compiles to just a single assignment), enforcing the following invariant: in each call to instrs_of_exp, the ultimate result of expression e is always stored in variable out.

To maintain this programming discipline in other cases, you'll need to introduce fresh variable names, using fresh_id or fresh_ids, binding these vars to intermediate results.
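The discipline described above can be sketched on a toy language. Everything in this block is a stand-in invented for illustration: toy_exp, toy_instr, make_fresh, toy_fresh_ids, and toy_instrs are hypothetical and are not the types or functions from exp.mli, ssa.mli, or ssa.ml (in particular, the real gensym_pkg may look quite different). The sketch only shows the shape of the idea: compile each subexpression into its own fresh variable, then combine those variables in a final instruction that writes to out.

```ocaml
(* Toy stand-ins, NOT the definitions from exp.mli or ssa.mli. *)
type id = string

type toy_exp =
  | TInt of int
  | TId of id
  | TAdd of toy_exp * toy_exp

type toy_instr =
  | TAssignInt of id * int        (* x := n     *)
  | TAssignId of id * id          (* x := y     *)
  | TAssignAdd of id * id * id    (* x := y + z *)

(* Hypothetical gensym: each call to the returned function yields a new name. *)
let make_fresh () : unit -> id =
  let counter = ref 0 in
  fun () ->
    incr counter;
    Printf.sprintf "_t%d" !counter

(* Generate [n] fresh names, in the order they were created. *)
let rec toy_fresh_ids (fresh : unit -> id) (n : int) : id list =
  if n <= 0 then []
  else
    let x = fresh () in
    x :: toy_fresh_ids fresh (n - 1)

(* Invariant: the instructions returned leave the value of [e] in [out]. *)
let rec toy_instrs (fresh : unit -> id) (out : id) (e : toy_exp)
  : toy_instr list =
  match e with
  | TInt n -> [TAssignInt (out, n)]
  | TId x -> [TAssignId (out, x)]
  | TAdd (e1, e2) ->
    (* Each operand gets its own fresh destination... *)
    let x1 = fresh () in
    let x2 = fresh () in
    let is1 = toy_instrs fresh x1 e1 in
    let is2 = toy_instrs fresh x2 e2 in
    (* ...and only the final instruction writes to [out]. *)
    is1 @ is2 @ [TAssignAdd (out, x1, x2)]
```

Because every intermediate result lands in a name that is generated exactly once, each variable in the toy output is assigned at most once, which is the property SSA form requires.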

Other Requirements

Your implementation of instrs_of_exp must additionally account for the following:


  • As in many languages, the Grumpy operators && and || have short-circuit behavior (as explained in the Grumpy spec). Think about how to compile these operators in terms of other existing expressions in order to get the right semantics.
  • The RTL interpreter expects that phi nodes
          IPhi(phi_lbl, res, (x,l1), (y,l2))
    are entered via either label l1 or label l2, with no intervening labels. You may need to think a bit about this when implementing case EIf, the only case of instrs_of_exp in which phi nodes need appear...
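One way to read the hint about && and || is as a source-to-source rewrite into conditionals. The sketch below shows that rewrite on a toy boolean language; bexp and desugar are hypothetical names, not constructors or functions from exp.mli, and the Grumpy spec remains the authority on the exact semantics you must preserve.

```ocaml
(* Toy stand-in for a boolean expression language, NOT the [exp] type. *)
type bexp =
  | BBool of bool
  | BId of string
  | BAnd of bexp * bexp
  | BOr of bexp * bexp
  | BIf of bexp * bexp * bexp

(* Rewrite short-circuit operators into conditionals:
     a && b   becomes   if a then b else false
     a || b   becomes   if a then true else b
   In both cases [b] is evaluated only when [a] does not already
   determine the result, which is the short-circuit behavior. *)
let rec desugar (e : bexp) : bexp =
  match e with
  | BBool _ | BId _ -> e
  | BAnd (a, b) -> BIf (desugar a, desugar b, BBool false)
  | BOr (a, b) -> BIf (desugar a, BBool true, desugar b)
  | BIf (c, t, f) -> BIf (desugar c, desugar t, desugar f)
```

Under this reading, a compiler only needs a correct conditional case to get both operators right. Note that the rewritten branches can themselves contain conditionals, which is exactly the nesting situation the phi-node requirement above asks you to think about.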

3. Submit

Submit your ssa.ml on or before the due date, via Blackboard.

4. Piazza

Finally: if any of these instructions are unclear, ask for clarification early and often on Piazza! I want everyone to succeed (and have fun!) on this assignment.