m-zakeri/IUSTCompiler

Compiler Course Code Snippets

Welcome to the repository of course materials and code snippets developed for teaching compiler design at Iran University of Science and Technology (IUST). The repository includes several language grammars written in ANTLR v4 format. For each grammar, the source code of the generated lexer and parser is available in Python 3.8.x.

Please note that this repository is intended to be updated regularly. If you use it, forking it would be appreciated.

For any queries or concerns, feel free to reach out to me at m-zakeri[at]live.com. Alternatively, you can refer to the documentation for more details.

Examples

This section provides examples of the outputs that can be generated by the code snippets in this repository.

Three Address Codes

Figure 1 illustrates how a single-pass compiler can generate three-address code for assignment statements using the minimum number of temporary variables, whose names start with T:

Fig 1: Examples of three-address code generated for the AssignmentStatement grammar using ANTLR.
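To make the idea of reusing temporaries concrete, here is a minimal Python sketch. It is not the repository's ANTLR-based implementation: it assumes the right-hand side is plain arithmetic that Python's own `ast` module can parse, and the names `TacGenerator` and `translate` are hypothetical. It uses a simple free-list scheme (an operand's temporary is released as soon as it has been read), which keeps the count low but is not guaranteed to be minimal for every expression.

```python
import ast


class TacGenerator:
    """Emit three-address code for one arithmetic expression, reusing temporaries."""

    OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

    def __init__(self):
        self.code = []      # emitted instructions, in order
        self.free = []      # temporaries that are free for reuse
        self.temps = set()  # every temporary name created so far

    def new_temp(self):
        if self.free:
            return self.free.pop()
        name = f"T{len(self.temps) + 1}"  # assumes no user variable is named T<n>
        self.temps.add(name)
        return name

    def release(self, name):
        # Only temporaries are recycled; variables and constants are ignored.
        if name in self.temps:
            self.free.append(name)

    def gen(self, node):
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.BinOp) and type(node.op) in self.OPS:
            left = self.gen(node.left)
            right = self.gen(node.right)
            # Both operands are dead once read, so free them before choosing a target.
            self.release(right)
            self.release(left)
            target = self.new_temp()
            self.code.append(f"{target} := {left} {self.OPS[type(node.op)]} {right}")
            return target
        raise ValueError(f"unsupported construct: {ast.dump(node)}")


def translate(statement):
    """Translate 'name := expression' into a list of three-address instructions."""
    target, expr = (part.strip() for part in statement.split(":=", 1))
    generator = TacGenerator()
    result = generator.gen(ast.parse(expr, mode="eval").body)
    generator.code.append(f"{target} := {result}")
    return generator.code


if __name__ == "__main__":
    for line in translate("a1 := (2 + 12 * 3) / (6 - 19)"):
        print(line)
```

With this scheme the statement `a1 := (2 + 12 * 3) / (6 - 19)` needs only two temporaries, because `T1` is reused for both the product and the sum before the division.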

Abstract Syntax Tree (AST)

Figure 2 demonstrates how a single-pass compiler can generate an abstract syntax tree (AST) for assignment statements:

Fig 2: Examples of abstract syntax trees (AST) generated for the AssignmentStatement grammar using ANTLR.

The trees above correspond to the following assignment statements:

a1 := (2 + 12 * 3) / (6 - 19)
a2 := 2 + 3 * 4
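As a rough illustration of how such trees can be built in one pass, the following Python sketch uses a hand-written recursive-descent parser instead of ANTLR. The grammar it assumes (integers, identifiers, `+ - * /`, parentheses, and `:=`) and the names `tokenize`, `Parser`, and `parse` are assumptions made for this example, not taken from the repository. Each tree node is a nested tuple `(operator, left, right)`.

```python
import re

TOKEN = re.compile(r"\s*(:=|\d+|[A-Za-z_]\w*|[-+*/()])")
OPERAND = re.compile(r"\d+|[A-Za-z_]\w*")


def tokenize(text):
    """Split the input into operator, number, and identifier tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected!r}, found {token!r}")
        self.pos += 1
        return token

    def assignment(self):
        # assignment : ID ':=' expr
        target = self.take()
        self.take(":=")
        tree = (":=", target, self.expr())
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        # expr : term (('+' | '-') term)*   (left-associative)
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.term())
        return node

    def term(self):
        # term : factor (('*' | '/') factor)*   (binds tighter than + and -)
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            node = (op, node, self.factor())
        return node

    def factor(self):
        # factor : '(' expr ')' | NUMBER | ID
        if self.peek() == "(":
            self.take("(")
            node = self.expr()
            self.take(")")
            return node
        token = self.take()
        if not OPERAND.fullmatch(token):
            raise SyntaxError(f"expected an operand, found {token!r}")
        return token


def parse(statement):
    return Parser(tokenize(statement)).assignment()


if __name__ == "__main__":
    print(parse("a1 := (2 + 12 * 3) / (6 - 19)"))
    print(parse("a2 := 2 + 3 * 4"))
```

Operator precedence is encoded in the rule nesting: `term` is called from `expr`, so `3 * 4` becomes a subtree of the `+` node in the second statement.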

Repository Structure

This section describes the structure of the repository:

Grammars

The grammars directory contains various grammar files:

  • gram1: ANTLR hello world grammar.
  • Expr1: Simple grammar for handling mathematical expressions without any attributes or actions.
  • Expr2: Simple attributed grammar for handling mathematical expressions with a code() attribute.
  • Expr3: Same as the Expr2 grammar.
  • AssignmentStatement1.g4: Grammar to handle multiple assignment statements and mathematical expressions in languages like Pascal and C/C++.
  • AssignmentStatement2.g4: Same as the AssignmentStatement1.g4 grammar, with attributes for holding the rule code and rule type.
  • AssignmentStatement3.g4: Grammar to handle multiple assignment statements and mathematical expressions in languages like Pascal and C/C++. It provides semantic rules to perform type checking and semantic routines to generate an intermediate representation.
  • AssignmentStatement4.g4: Similar to the AssignmentStatement3.g4 grammar, but designed to generate the intermediate representation (three-address code) with the minimum number of "temp" variables.
  • CPP14_v2: ANTLR grammar for C++14, forked from the official ANTLR website. Some bugs have been fixed, and rule identifiers have been added to the grammar rules.
  • EMail.g4: Lexical grammar to validate email addresses.
  • EMail2.g4: Lexical grammar to validate email addresses, fixing bugs in EMail.g4.
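The kind of lexical check that the EMail grammars perform can be approximated with a regular expression. The sketch below is an assumption-laden stand-in, not a translation of EMail.g4 or EMail2.g4: the character classes, the rule that the domain must contain at least one dot, and the name `is_valid_email` are all choices made for this example, and the pattern is much stricter and simpler than the full e-mail address standards.

```python
import re

# Local part: dot-separated runs of allowed characters (no leading, trailing, or double dots).
LOCAL = r"[A-Za-z0-9_%+-]+(?:\.[A-Za-z0-9_%+-]+)*"
# Domain label: starts and ends with a letter or digit; hyphens are allowed inside.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
# Full address: local part, '@', then at least two dot-separated labels.
EMAIL = re.compile(rf"{LOCAL}@{LABEL}(?:\.{LABEL})+")


def is_valid_email(text):
    """Return True if the whole string matches the simplified pattern above."""
    return EMAIL.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("m-zakeri@live.com", "a..b@example.org", "user@localhost"):
        print(candidate, is_valid_email(candidate))
```

Using `fullmatch` rather than `search` matters here: a validator has to reject input with trailing garbage, just as a lexer rule must account for the entire token.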

Language Applications

The language_apps package contains the lexer and parser code for each grammar in the grammars directory, along with a main driver script that demonstrates type checking and intermediate code generation based on semantic rules and semantic routines.
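The type-checking step can be pictured with a small, self-contained Python sketch. It does not use the generated ANTLR classes; instead it assumes the expression is already available as nested `(operator, left, right)` tuples and that declared types live in a dictionary. The two rules it applies (an operation involving a float yields a float, and a float may not be assigned to an int variable) and the names `type_of` and `check_assignment` are assumptions chosen for illustration.

```python
def type_of(node, symbols):
    """Return 'int' or 'float' for an expression tree of nested (op, left, right) tuples."""
    if isinstance(node, tuple):
        _op, left, right = node
        operand_types = (type_of(left, symbols), type_of(right, symbols))
        # Assumed widening rule: any float operand makes the result a float.
        return "float" if "float" in operand_types else "int"
    if isinstance(node, int):
        return "int"
    if isinstance(node, float):
        return "float"
    if node in symbols:
        return symbols[node]
    raise NameError(f"undeclared identifier: {node}")


def check_assignment(target, expression, symbols):
    """Check 'target := expression' and return the type of the target."""
    if target not in symbols:
        raise NameError(f"undeclared identifier: {target}")
    target_type = symbols[target]
    value_type = type_of(expression, symbols)
    # Assumed rule: narrowing a float into an int variable is rejected.
    if target_type == "int" and value_type == "float":
        raise TypeError(f"cannot assign float to int variable {target}")
    return target_type


if __name__ == "__main__":
    symbols = {"a": "int", "b": "float"}
    print(check_assignment("b", ("+", "a", ("*", 2, 3)), symbols))
```

In an attributed grammar, the value returned by `type_of` corresponds to a synthesized type attribute that each rule computes from the attributes of its children.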

Terminal Batch Scripts

The terminal_batch_script directory contains several batch scripts that run ANTLR from the Windows terminal to generate target code in Java. These scripts date from my early experiments with ANTLR.

Lectures & Tutorials

The Lectures section of this repository offers comprehensive resources for learning Compiler Design. Each lecture is designed to simplify complex concepts and explain them in an intuitive manner through practical examples. The aim is to make the subject matter accessible and engaging for students, regardless of their prior knowledge or experience with compiler design.

Further Reading

ANTLR slides:

For Reading Compiler Design Lectures: See HERE