Projects‎ > ‎

Project1

A Mini Triangle Compiler

For this project you will be implementing a language processing (compiler) for a simple programming language called Mini Triangle.  You will develop a scanner, a parser, and a bytecode compiler for the language.

The Mini Triangle Grammar:

Program            ::=  Command

Command            ::=  single-Command
                    |   Command ';' single-Command

single-Command     ::=  V-name ':=' Expression
                    |   Identifier '(' Expression ')'
                    |   if Expression then single-Command
                           else single-Command
                    |   while Expression do single-Command
                    |   let Declaration in single-Command
                    |   begin Command end

Expression         ::=  primary-Expression
                    |   Expression Operator primary-Expression

primary-Expression ::=  Integer-Literal
                    |   V-name
                    |   Operator primary-Expression
                    |   '(' Expression ')'

V-name             ::=  Identifier

Declaration        ::=  single-Declaration
                    |   Declaration ';' single-Declaration

single-Declaration ::=  const Identifier ~ Expression
                    |   var Identifier : Type-denoter

Type-denoter       ::=  Identifier

Operator           ::=  '+' | '-' | '*' | '/' | '<' | '>' | '=' | '\'

Identifier         ::=  Letter | Identifier Letter | Identifier Digit

Integer-Literal    ::=  Digit | Integer-Literal Digit

Comment            ::=  ! Graphic* <eol>

Part A - The Scanner

Due Mon Feb 11 at 11:59pm in your git repository.  

Your repo should look like this:

<username>/cs345/project1/parta/scanner.py

The lexical grammer (token grammar) looks like this:

Token             ::=  Identifier | Integer-Literal | Operator | 
                       : | ; | := | ~ | ( | ) | eot

Identifier        ::=  Letter | Identifier Letter | Identifier Digit

Integer-Literal   ::=  Digit | Integer-Literal Digit

Operator          ::=  '+' | '-' | '*' | '/' | '<' | '>' | '=' | '\'

Separator         ::=  Comment | <space> | '\t' | eol

Comment           ::=  ! Graphic* eol

However, using left factorization, removal of left recursion, and substitution of non-terminals.  This lexical grammar can be reduced to:

Token             ::=  Letter (Letter | Digit)* | Digit Digit* |
                       '+' | '-' | '*' | '/' | '<' | '>' | '=' | '\'
                       ';' | ':' ('=' | <empty>) | '~' | '(' | ')' | <eot>

Separator         ::=  '!' Graphic* <eol> | <space> | <eol>

Note that when scanning it is not possible to tell the different between an identifier from a keyword until after the identifier has been scanned.  Once scanned it can be turned into a specific keyword token.

Part B - The Parser and AST

Due Mon Feb 25 at 11:59pm in your git repository.  

Your repo should look like this:

<username>/cs345/project1/partb/parser.py

Your parser should produce an AST defined by ast.py (see attachments below):

Part C - The Code Generator

Due Wed Mar 20 at 11:59pm in your git repository.

Your repo should look like this:

<username>/cs345/project1/partc/codegen.py

You will create a CodeGen() class with a generate method that returns a Python code object.  For example:

s = scanner.Scan(mini_triangle_program)
tokens = s.scan()
p = parser.Parse(tokens)
tree = p.parse()
c = codegen.CodeGen(tree)
bytecode = c.generate()

You can then call the bytecode to run the program:

bytecode()

Your code generator should create an executable python program that is equivalent in meaning to the given Mini Triangle program.

You should be able to "compile" Mini Triangle Programs on the command line:

$ python codegen.py factorial.mt
$ python factorial.py

Part D - Advanced Features

Due Mon Mar 25 at 11:59pm in your git repository.

You need to add the following features to Mini Triangle:

  1. The semicolon should now be a statement terminator and not a separator.  No termination is needed after "end".
  2. You need to add proper precedence to the operators.
  3. Add function definitions.  Functions can take multiple parameters and return a single value.
  4. Allow function calls in expressions.
  5. Generate proper bytecode for these new features.
Extra Credit
  1. Add a string type with some basic string operations and the ability to print strings with putstr() and input strings with getstr().
    • String operations: + (concatenate), concat(s1, s2), substr(s, start, end), str(int), int(str)
  2. Add compile-time type checking.
  3. Add llvm code generation.

ċ
ast.py
(4k)
Greg Benson,
Feb 28, 2013, 4:37 PM
ċ
byteplay_test_jump.py
(0k)
Greg Benson,
Mar 20, 2013, 11:23 AM
ċ
byteplay_test_simple.py
(0k)
Greg Benson,
Mar 20, 2013, 11:24 AM
ċ
byteplay_test_while.py
(1k)
Greg Benson,
Mar 20, 2013, 11:24 AM
ċ
calc.tar.bz2
(4k)
Greg Benson,
Mar 19, 2013, 4:23 PM
ċ
eval.py
(5k)
Greg Benson,
Feb 28, 2013, 4:38 PM
ċ
fact.mt
(0k)
Greg Benson,
Mar 20, 2013, 4:10 PM
ċ
isprime.mt
(0k)
Greg Benson,
Mar 20, 2013, 4:10 PM
ċ
parser_test.py
(2k)
Greg Benson,
Feb 21, 2013, 1:56 PM
ċ
partc_func.tar.bz2
(6k)
Greg Benson,
Mar 19, 2013, 4:23 PM
ċ
scanner_given.py
(4k)
Greg Benson,
Feb 7, 2013, 12:43 PM
ċ
scanner_test.py
(3k)
Greg Benson,
Feb 9, 2013, 9:30 AM
Comments