Skip to content
Andrew Kelley edited this page Jan 9, 2024 · 10 revisions

The below terms may be helpful if you're new to Zig, compilers, or low-level programming.

AIR

Analyzed Intermediate Representation. This data is produced by Sema and consumed by codegen. Unlike ZIR where there is one instance for an entire source file, each function gets its own AIR. Also, unlike ZIR, each AIR instruction has an associated type.

See src/Air.zig.

Comptime

Compile-time, as opposed to runtime. Introduction to comptime in zig

Container

An array list, hash map, or non-intrusive linked list.

Namespace

Structs, enums, unions, and opaques declare namespaces. Each Zig source file is a struct and therefore also a namespace.

Namespaces contain order-independent declarations.

When deciding upon a name for a namespace, keep in mind the fully qualified namespace.

For example:

  • Bad namespaces:
    • hash.HashMap
    • crypto.sha.Sha3
  • Good namespaces:
    • hash.Map
    • crypto.Sha3

ELF

"Executable and Linkable Format, formerly named Extensible Linking Format, is a common standard file format for executable files, object code, shared libraries, and core dumps." -Executable and Linkable Format on Wikipedia

ELF is the format used by object files and executables on most operating systems. Exceptions are macOS which uses Macho, and Windows, which uses PE for executables and COFF for object files.

IR

Intermediate representation. Typically represented as instructions that reference each other. Zig has several kinds of IR:

  • ZIR
  • AIR
  • MIR (stage2 only)

There is also LLVM IR.

LLVM

LLVM Website

LLVM is a set of compiler and toolchain technologies. Zig uses LLVM as one of many Backends. Zig's LLVM backend inputs AIR and outputs LLVM IR. LLVM inputs LLVM IR, performs optimizations, and then outputs object files.

See src/codegen/llvm.zig.

LLVM IR

LLVM Intermediate Representation. This is the format that LLVM inputs. It then performs optimizations and then outputs an object file.

Semantic Analysis

Semantic analysis of ZIR instructions. Transforms untyped ZIR instructions into semantically-analyzed AIR instructions. Does type checking, comptime control flow, and safety-check generation. This is the the heart of the Zig compiler.

See src/Sema.zig.

MIR

Machine Intermediate Representation, lowered to from AIR. Each instruction set architecture will have its own MIR dialect as it is designed to closely match available instructions. Designed for stage2+, since LLVM and C backends don't need MIR. Partially implemented. See the design issue.

ZIR

Zig Intermediate Representation. src/Astgen.zig converts AST nodes to these untyped IR instructions. Next, src/Sema.zig processes these into AIR. The minimum amount of information needed to represent a list of ZIR instructions. Once this structure is completed, it can be used to generate AIR, followed by machine code, without any memory access into the AST tree token list, node list, or source bytes. Exceptions include:

  • Compile errors, which may need to reach into these data structures to create a useful report.
  • In the future, possibly inline assembly, which needs to get parsed and handled by the codegen backend, and errors reported there. However for now, inline assembly is not an exception.

Each Zig source file has a corresponding ZIR representation. You can use zig ast-check -t example.zig to see a textual representation of the ZIR for any Zig source file.

See src/Zir.zig.

stage1

These files. There used to be a C++ implementation of Zig that was referred to as "stage1" but it was replaced by these files. They are used to build Zig from a WebAssembly kernel.

stage2

These files. This is an implementation of Zig written in Zig.

stage3

stage2 compiled with itself produces stage3. Compiling again with stage3 produces stage3 (input and output are the same).