CS 4100: Introduction to Formal Languages and Compilers

Spring 2017

Formal Languages and Compilers

An upper-level course for CS majors on formal language theory and compilers.

Topics (subject to revision): regular expressions; finite automata; context-free grammars; predictive parsing; LR parsing; abstract syntax; type systems and type-checking; stack layout and activation records; intermediate representations; control-flow graphs; static single assignment (SSA) form; dataflow/liveness analysis; register allocation; garbage collection/runtimes; the LLVM compiler infrastructure. Over the course of the semester, students will implement a fully functioning compiler for a small imperative programming language, targeting LLVM. The course involves a significant amount of programming.

Lecture: Tuesday, Thursday 1:30–2:50 p.m., ARC 315
Professor: Gordon Stewart (gstewart@ohio.edu)
Office Hours: T 3-4pm, Th 11am-12pm (Stocker 355), or by appointment
TA: Alex Bagnall (ab667712@ohio.edu)
Lab Hours: Mondays before assignments due, 4-5pm in Stocker 307 (tentative)
Piazza: Course Page, Signup

Course Objectives

After completing the course, students will have


Textbooks and Software

The primary text is the Appel book cited as "Appel" in the schedule below. While a hard copy of this book is certainly worthwhile, before you buy I urge you to check out the library's electronic version. If you don't mind reading on your laptop screen, the electronic copy may save you some money!

Periodically I may assign additional supplementary (optional but recommended) readings from resources such as

all of which are freely available online to registered OU students.


Prerequisites

CS 3200 and 3610, but also: some mathematical maturity (at the level of "I've seen and done a few proofs before"), facility with a couple of different programming languages (at the 3200 level of exposure), and a desire to learn.

Course Structure

The course consists of twice-weekly lectures (Tuesdays and Thursdays), attendance at which is required. To help get you up to speed with OCaml and the course programming assignments, we'll also hold biweekly lab hours (time TBD). Although attendance at the lab hours is optional, I highly recommend that you attend, at least for the first few weeks of the course. The programming assignments for this course are extensive and time-consuming, so be prepared!

In addition to biweekly homework assignments, there will be a midterm exam (Week 7, approximately 15% of your grade) and a final (Week 15, approximately 25%). The biweekly homeworks (programming assignments) are worth approximately 40%. We'll have a quiz on any given Tuesday with probability 1/3, along with biweekly offline Blackboard quizzes (total 10%). Participation and attendance at lecture are worth 5%.

Blackboard will be used only to report grades and to post lecture notes. Up-to-date information on all other aspects of the course (assignment due dates, etc.) will be posted either on this website or on the Piazza page or both.

Assignments Key:
A#: Programming Assignment
Q#: Quiz Available On Blackboard
UPDATE 1/16: In the schedule below, quizzes and assignments are now listed by due date rather than by release date.

Schedule (Tentative)

Intro. to Compilers, OCaml

W1: 1/9-13
  Introduction to compilers and functional programming in OCaml
  Reading: Appel 1; RWO I.1.
  Supplemental Reading: OCaml Manual: Core Language.
  OCaml QuickStart Lab: Monday 1/16, 4-5pm, Stocker 307

W2: 1/16-20
  More functional programming: polymorphism, higher-order functions, algebraic datatypes and pattern-matching
  Supplemental Reading: OCaml Pervasives Library (reference)
  A0 Due 1/17 at 1:30pm: Intro. to OCaml.
  Q0 Due 1/17 at 1:30pm

Lexing and Parsing

W3: 1/23-27
  Regular expressions, regular languages
  Reading: Appel 2 (up to and including 2.2)
  A1 Due 1/24: Functional Programming in OCaml.
  Q1 Due 1/24 at 11:59pm

W4: 1/30-2/3
  DFAs and NFAs, lexer generators
  Reading: Appel 2.3-2.5
  Q2 Due 1/31 at 1:30pm

W5: 2/6-10
  Context-free languages, pushdown automata
  Reading: Appel 3 (through Section 3.1)
  A2 Due 2/7 at 11:59pm: Regular Expressions Re-Examined.

W6: 2/13-17
  Recursive descent parsing, predictive parsing, parser generators
  Reading: Appel Sections 3.2-3.5
  Q3 Due 2/14 at 1:30pm

Types and Type-Checking

W7: 2/20-24
  Abstract syntax trees, type systems
  Reading: Appel 4, TAPL 8 (OU Library eBook)
  Q4 Due 2/21 at 1:30pm

W8: 2/27-3/3
  Symbol tables, type-checking
  Reading: Appel 5
  A3 Due 2/28 at 1:30pm: Lexing and Parsing with ocamllex and Menhir.
  Midterm Exam: Thursday 3/2

W9: 3/6-10
  Spring Break, No Class

Intermediate Representations

W10: 3/13-17
  Control-flow graphs, dominators
  Reading: Appel 7.1, Appel 18.1

W11: 3/20-24
  Use-def, dataflow/liveness analysis, Static Single Assignment (SSA) form, interference graphs
  Reading: Appel 10.1, Appel 19 (up to but not including 19.1)
  A4 Due 3/21 at 11:59pm: Type-checking.

W12: 3/27-31
  Dataflow analysis contd., translation to SSA form
  Q5 Due 3/28 at 11:59pm

Runtimes and Garbage Collection

W13: 4/3-7
  Stack layout and activation records; Intro. to runtimes, garbage collection; mark-and-sweep collection, copying collection, reference counting, generational collection
  Reading: Appel 13, through 13.4; Appel 6.1
  A5 Due 4/6: SSA.

W14: 4/10-14
  Intro. to LLVM assembly and the LLVM compiler toolkit; intro. to register allocation
  Reading: Appel 11 through 11.3; AOSA: LLVM
  Q6 Due 4/14 at 11:59pm

Register Allocation

W15: 4/17-21
  Register allocation contd., final exam review
  EC A6 Due 4/18: LLVM.

April 24-28: Final Exams

Homework and Collaboration Policies

Homework will usually be due Tuesdays, by the start of class (1:30 p.m.). Late homework assignments will be penalized according to the following formula:

You may discuss the homework with other students in the class, but only after you've attempted the problems on your own. If you do discuss the homework problems with others, write the names of the students you spoke with, along with a brief summary of what you discussed, in a README comment at the top of each submission. Example:

(* README Gordon Stewart, Assn #1
I worked with X and Y. We swapped tips regarding the use of pattern-matching in OCaml. *)

However, under no circumstances are you permitted to share or directly copy code or other written homework material, except with course instructors. The code and proofs you turn in must be your own. Remember: homework is there to give *you* practice in the new ideas and techniques covered by the course; it does you no good if you don't engage!

In general, students in EECS courses such as this one should adhere to the Russ College of Engineering and Technology Honor Code, and to the OU Student Code of Conduct.

Students with Disabilities

If you suspect you may need an accommodation based on the impact of a disability, please contact me privately to discuss your specific needs. If you're not yet registered as a student with a disability, contact the Office of Student Accessibility Services first.