Assembler
The Hack Machine Language Specification
We distinguish two kinds of programs:
Binary Hack program: A binary Hack program is a sequence of text lines, each consisting of sixteen 0 and 1 characters. It holds the instructions that the CPU executes.
Assembly Hack program: An assembly Hack program is a sequence of text lines, each being one of three kinds:
- Assembly instruction: A symbolic A-instruction or a symbolic C-instruction.
- Label declaration: A line of the form (xxx), where xxx is a symbol.
- Comment: A line beginning with two slashes (//) is considered a comment and is ignored.
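The three kinds of lines can be told apart by their first characters. The following is a minimal sketch, not a definitive implementation: the function name `classify_line` and the returned tag strings are hypothetical, and treating blank lines as their own kind is an assumption that the text above does not state.

```python
def classify_line(line):
    """Return a tag describing what kind of assembly line this is."""
    text = line.strip()
    if not text:
        return "blank"            # assumption: blank lines are skipped like comments
    if text.startswith("//"):
        return "comment"          # ignored by the assembler
    if text.startswith("(") and text.endswith(")"):
        return "label"            # label declaration of the form (xxx)
    if text.startswith("@"):
        return "a_instruction"    # symbolic A-instruction
    return "c_instruction"        # anything else is taken to be a C-instruction


if __name__ == "__main__":
    for sample in ["// setup", "(LOOP)", "@LOOP", "D=M"]:
        print(sample, "->", classify_line(sample))
```

The sketch does no validation, so a malformed line falls through to the C-instruction case.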
See Figure 4.5 for the specification of the Hack instruction set.
Handling Instructions
For each assembly instruction, the assembler:
- Parses the instruction into its underlying fields.
- For each field, generates the corresponding bit-code, as specified in Figure 4.5.
- If the instruction contains a symbolic reference, resolves the symbol into its numeric value.
- Assembles the resulting binary codes into a string of sixteen 0 and 1 characters.
- Writes the assembled string to the output file.
- Writes the assembled string to the output file.
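The steps above can be sketched for a single instruction as follows. This is an illustrative sketch under stated assumptions: the names `parse_c` and `assemble_instruction` are hypothetical, the bit-code tables are transcribed from the standard Hack instruction set rather than from this section, and `COMP` covers only a subset of the comp mnemonics. Writing to the output file is left out.

```python
# Bit-codes per field. COMP is a partial table for illustration;
# its first bit is the a-bit, which selects A or M as the operand.
COMP = {
    "0": "0101010", "1": "0111111", "-1": "0111010",
    "D": "0001100", "A": "0110000", "M": "1110000",
    "D+1": "0011111", "A+1": "0110111", "M+1": "1110111",
    "D-1": "0001110", "A-1": "0110010", "M-1": "1110010",
    "D+A": "0000010", "D+M": "1000010",
    "D-A": "0010011", "D-M": "1010011",
    "D&A": "0000000", "D&M": "1000000",
    "D|A": "0010101", "D|M": "1010101",
}
DEST = {"": "000", "M": "001", "D": "010", "MD": "011",
        "A": "100", "AM": "101", "AD": "110", "AMD": "111"}
JUMP = {"": "000", "JGT": "001", "JEQ": "010", "JGE": "011",
        "JLT": "100", "JNE": "101", "JLE": "110", "JMP": "111"}


def parse_c(instruction):
    """Split 'dest=comp;jump' into fields; dest and jump may be empty."""
    if "=" in instruction:
        dest, rest = instruction.split("=", 1)
    else:
        dest, rest = "", instruction
    comp, _, jump = rest.partition(";")
    return dest, comp, jump


def assemble_instruction(instruction, symbols=None):
    """Translate one symbolic instruction into a 16-character bit string."""
    symbols = symbols or {}
    if instruction.startswith("@"):
        operand = instruction[1:]
        # Resolve a symbolic reference into its numeric value if needed.
        value = int(operand) if operand.isdigit() else symbols[operand]
        return format(value, "016b")      # leading 0 marks an A-instruction
    dest, comp, jump = parse_c(instruction)
    return "111" + COMP[comp] + DEST[dest] + JUMP[jump]


if __name__ == "__main__":
    for line in ["@2", "D=A", "@3", "D=D+A", "@0", "M=D"]:
        print(assemble_instruction(line))
```

Keeping one lookup table per field mirrors the way the instruction set is specified, so each field is translated independently and the results are simply concatenated.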
Handling Symbols
An assembly program may refer to a label before the line that declares it, so symbols cannot always be resolved in a single read of the source. A common solution is to develop a two-pass assembler.
- Initialization: the assembler creates a symbol table and initializes it with all the predefined symbols and their pre-allocated values.
- In the first pass, the assembler scans the program, adds all the label symbols to the symbol table, and generates no code.
- In the second pass, the assembler handles the variable symbols and generates binary code, using the symbol table.
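A minimal sketch of the two passes is shown below. It makes several assumptions not stated in this section: the function names are hypothetical, the predefined symbol values and the convention of allocating variables from address 16 onward come from the standard Hack specification, and the second pass here stops at replacing symbols with numbers instead of emitting binary code.

```python
def predefined_symbols():
    """Symbol table pre-loaded with the predefined Hack symbols (assumed values)."""
    table = {"SP": 0, "LCL": 1, "ARG": 2, "THIS": 3, "THAT": 4,
             "SCREEN": 16384, "KBD": 24576}
    table.update({f"R{i}": i for i in range(16)})   # R0..R15
    return table


def resolve_symbols(lines):
    """Return (instructions with numeric addresses, final symbol table)."""
    table = predefined_symbols()

    # First pass: record each label's instruction address; generate no code.
    instructions = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("//"):
            continue
        if line.startswith("(") and line.endswith(")"):
            # The label refers to the address of the next instruction.
            table[line[1:-1]] = len(instructions)
        else:
            instructions.append(line)

    # Second pass: allocate variables and substitute numeric values.
    next_variable = 16          # assumption: variables start at address 16
    resolved = []
    for line in instructions:
        if line.startswith("@") and not line[1:].isdigit():
            name = line[1:]
            if name not in table:           # first time seen: a new variable
                table[name] = next_variable
                next_variable += 1
            line = "@" + str(table[name])
        resolved.append(line)
    return resolved, table


if __name__ == "__main__":
    program = ["// loop forever", "@i", "M=1", "(LOOP)", "@LOOP", "0;JMP"]
    resolved, table = resolve_symbols(program)
    print(resolved)
```

Because labels are all entered during the first pass, any symbol still missing from the table in the second pass can safely be treated as a variable.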