Arithmetic
Compiler Implementation
COP-3402
Arithmetic
Goal
Map SimpleIR operations
x := a + b
To assembly, e.g.,
add %rbx, %rax
Challenges
- Handling variables
- Handling integer constants
- Handling assignment
SimpleIR operations
operation:
NAME ':=' operand1=(NAME | NUM)
operator=('+' | '-' | '*' | '/' | '%')
operand2=(NAME | NUM);
Intuition
x := a + b
- NAME for the assignment, e.g., x
- Operands, e.g., a and b
- An operator, e.g., +
Variations
Operands can be integers
x := a + 1
- Five operators available: +, -, *, /, %
- Add, subtract, multiply, divide, modulo
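To make the shape of an operation concrete, here is a minimal sketch that splits one operation line into its four parts. It uses a regular expression only as a stand-in for the grammar rule; the name `parse_operation` and the exact token patterns (C-style identifiers for NAME, unsigned decimal digits for NUM) are assumptions for illustration, not the course's parser.

```python
import re

# Hypothetical pattern mirroring the grammar rule:
#   NAME ':=' (NAME | NUM) operator (NAME | NUM)
# Assumption: NAME is a C-style identifier, NUM is an unsigned decimal integer.
OPERATION = re.compile(
    r"^\s*(?P<dest>[A-Za-z_]\w*)\s*:=\s*"
    r"(?P<operand1>[A-Za-z_]\w*|\d+)\s*"
    r"(?P<operator>[-+*/%])\s*"
    r"(?P<operand2>[A-Za-z_]\w*|\d+)\s*$"
)

def parse_operation(line):
    """Split one SimpleIR operation into its four parts, or return None."""
    match = OPERATION.match(line)
    return match.groupdict() if match else None

print(parse_operation("x := a + 1"))
# {'dest': 'x', 'operand1': 'a', 'operator': '+', 'operand2': '1'}
```

Each of the four parts is handled separately when generating assembly: the destination and operands need a location, and the operator needs an opcode.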
Handling variables
Symbol table
localVariables x a b
| Variable | Offset |
|---|---|
| x | -16 |
| a | -24 |
| b | -32 |
Accessing operands
Accessing variable operands
- Stored in the %rbp "array"
- a maps to rbp[-24]
- Assembly syntax: -24(%rbp) is rbp[-24]
mov -24(%rbp), %rax # load variable a into rax
Use the local variable storage scheme in the stack frame.
Representing integer constants
- Assembly supports constants (move immediate)
- Assembly syntax: prefix with $
mov $1, %rbx
The machine code stores the integer, in binary, directly inside the encoding of the move instruction.
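The two operand kinds can be translated by one small helper. This is a sketch under the assumption that a NUM operand is a string of decimal digits and anything else is a NAME in the symbol table; the helper name `asm_operand` is made up for illustration.

```python
# Offsets copied from the symbol table for localVariables x a b.
SYMBOL_TABLE = {"x": -16, "a": -24, "b": -32}

def asm_operand(operand, symbol_table):
    """Translate a SimpleIR operand (NAME or NUM) into an AT&T assembly operand."""
    if operand.isdigit():
        # NUM: an immediate value, prefixed with $
        return f"${operand}"
    # NAME: a memory operand relative to the base pointer
    return f"{symbol_table[operand]}(%rbp)"

print(asm_operand("a", SYMBOL_TABLE))  # -24(%rbp)
print(asm_operand("1", SYMBOL_TABLE))  # $1
```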
Assigning to variables
- %rbp plus an offset gives each variable's address
- Make the variable's address the destination of the move
mov %rax, -32(%rbp) # store rax into variable b
Remember that in AT&T assembly syntax, the second operand is the destination operand.
Performing arithmetic
Addition
add %rbx, %rax
is equivalent to
rax = rax + rbx
The left operand and the destination operand are the same register, and that register is the second argument to the assembly instruction.
Remember, in AT&T syntax the destination is always the second argument of a two-operand instruction.
Subtraction
sub %rbx, %rax
Behaves the same as the addition operation, except that it subtracts, i.e., rax = rax - rbx.
Getting the operands mixed up
What are the results of these additions?
%rax first:
mov $9, %rax
mov $5, %rbx
add %rax, %rbx
%rax second:
mov $9, %rax
mov $5, %rbx
add %rbx, %rax
Mixing up subtraction operands
What are the results of these subtractions?
%rax first:
mov $9, %rax
mov $5, %rbx
sub %rax, %rbx
%rax second:
mov $9, %rax
mov $5, %rbx
sub %rbx, %rax
Remember that the left operand and the destination are the same register, and that register is the last argument to the assembly instruction.
There are two consequences to getting the arguments swapped. First, the result ends up in the register you did not expect. Second, for subtraction, the value itself is different, because subtraction is not commutative.
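One way to check the four snippets is to trace them by hand or with a tiny interpreter. The sketch below is a toy model written for this purpose, not a real emulator: it handles only `mov`, `add`, and `sub` with an immediate or register source, and it ignores 64-bit wraparound and flags.

```python
def run(instructions):
    """Interpret a few register-only AT&T instructions and return the registers."""
    regs = {}
    for line in instructions:
        opcode, operands = line.split(maxsplit=1)
        src, dst = [part.strip() for part in operands.split(",")]
        # A $-prefixed source is an immediate; otherwise it names a register.
        value = int(src[1:]) if src.startswith("$") else regs[src]
        if opcode == "mov":
            regs[dst] = value
        elif opcode == "add":
            regs[dst] = regs[dst] + value   # dst = dst + src
        elif opcode == "sub":
            regs[dst] = regs[dst] - value   # dst = dst - src
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    return regs

setup = ["mov $9, %rax", "mov $5, %rbx"]
print(run(setup + ["add %rax, %rbx"]))  # {'%rax': 9, '%rbx': 14}
print(run(setup + ["add %rbx, %rax"]))  # {'%rax': 14, '%rbx': 5}
print(run(setup + ["sub %rax, %rbx"]))  # {'%rax': 9, '%rbx': -4}
print(run(setup + ["sub %rbx, %rax"]))  # {'%rax': 4, '%rbx': 5}
```

For addition, only the location of the result changes. For subtraction, both the location and the value change (-4 versus 4).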
Integer Multiplication
imul %rbx, %rax
Translating SimpleIR to assembly
function main
localVariables x a b
a := 9
b := 5
x := a + b
return x
end function
What are the base pointer offsets of x, a, and b?
Local variable offsets
localVariables x a b
| Variable | Offset |
|---|---|
| x | -16 |
| a | -24 |
| b | -32 |
x starts at -16 in our table, because our implementation also saves %rbx.
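The offsets in the table follow a simple rule, sketched below. The function name and the default parameters are assumptions for illustration: the first variable sits at -16 as stated above, and each variable is assumed to take one 8-byte slot, which matches the step between the offsets in the table.

```python
def build_symbol_table(names, first_offset=-16, slot_size=8):
    """Assign each local variable one slot below the base pointer, in declaration order."""
    return {name: first_offset - i * slot_size for i, name in enumerate(names)}

# Drop the leading keyword and keep the variable names.
print(build_symbol_table("localVariables x a b".split()[1:]))
# {'x': -16, 'a': -24, 'b': -32}
```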
Three pieces to the operation
- Load variable data to registers (a and b)
- Perform arithmetic (a + b)
- Store resulting variable data (x)
Assembly code
# load a and b
mov -24(%rbp), %rax
mov -32(%rbp), %rbx
# perform addition, i.e., rax = rax + rbx
add %rbx, %rax
# store result in x
mov %rax, -16(%rbp)
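The load, compute, store sequence can be traced with a toy model of the stack frame. This is an illustrative sketch only: the frame is a dictionary from offset to value, only `mov` and `add` are supported, and the function name `execute` is made up here.

```python
def execute(lines, frame):
    """Run mov/add lines against registers and a {offset: value} model of the frame."""
    regs = {}

    def read(operand):
        if operand.startswith("$"):               # immediate
            return int(operand[1:])
        if operand.startswith("%"):               # register
            return regs[operand]
        return frame[int(operand.split("(")[0])]  # OFFSET(%rbp)

    def write(operand, value):
        if operand.startswith("%"):
            regs[operand] = value
        else:
            frame[int(operand.split("(")[0])] = value

    for line in lines:
        code = line.split("#")[0].strip()         # drop comments
        if not code:
            continue
        opcode, operands = code.split(maxsplit=1)
        src, dst = [part.strip() for part in operands.split(",")]
        if opcode == "mov":
            write(dst, read(src))
        elif opcode == "add":
            write(dst, read(dst) + read(src))
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    return frame

program = [
    "# load a and b",
    "mov -24(%rbp), %rax",
    "mov -32(%rbp), %rbx",
    "add %rbx, %rax",
    "mov %rax, -16(%rbp)",
]
frame = execute(program, {-16: 0, -24: 9, -32: 5})  # a = 9, b = 5
print(frame[-16])  # 14
```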
Handling integer constants
function main
localVariables x a b
b := 5
x := 7 + b
return x
One or both of the operands can be an integer constant
Assembly code
# move immediate 7
mov $7, %rax
# load b
mov -32(%rbp), %rbx
# perform addition, i.e., rax = rax + rbx
add %rbx, %rax
# store result in x
mov %rax, -16(%rbp)
There are many ways to generate this code. This is a straightforward way to handle it when writing the compiler. Everything is the same except for how the register is originally given a value.
Pattern for add, sub, imul
SimpleIR pattern
DESTINATION := OPERAND1 OPERATOR OPERAND2
| SimpleIR | Assembly | Notes |
|---|---|---|
| DESTINATION | OFFSET | Find offset in symbol table |
| OPERAND1 | ASM_OPERAND1 | Find offset or set immediate value |
| OPERAND2 | ASM_OPERAND2 | Find offset or set immediate value |
| OPERATOR | ASM_OP | Find corresponding asm opcode |
Assembly code template:
mov ASM_OPERAND1, %rax
mov ASM_OPERAND2, %rbx
ASM_OP %rbx, %rax
mov %rax, OFFSET(%rbp)
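Filling in the template is mostly string substitution. The sketch below shows one possible way to do it; the function name `gen_operation`, the `ASM_OPS` mapping, and the digit test for NUM operands are assumptions for illustration, and the project itself uses C++ rather than Python.

```python
# Assumed opcode mapping for the three operators that share this template.
ASM_OPS = {"+": "add", "-": "sub", "*": "imul"}

def gen_operation(dest, operand1, operator, operand2, symbol_table):
    """Fill in the assembly template for DESTINATION := OPERAND1 OPERATOR OPERAND2."""
    def asm_operand(operand):
        # NUM becomes an immediate ($N); NAME becomes OFFSET(%rbp).
        if operand.isdigit():
            return f"${operand}"
        return f"{symbol_table[operand]}(%rbp)"

    return [
        f"mov {asm_operand(operand1)}, %rax",
        f"mov {asm_operand(operand2)}, %rbx",
        f"{ASM_OPS[operator]} %rbx, %rax",
        f"mov %rax, {symbol_table[dest]}(%rbp)",
    ]

table = {"x": -16, "a": -24, "b": -32}
print("\n".join(gen_operation("x", "7", "+", "b", table)))
# mov $7, %rax
# mov -32(%rbp), %rbx
# add %rbx, %rax
# mov %rax, -16(%rbp)
```

Because both operands are always moved into registers first, the same four lines work whether an operand is a variable or a constant.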
Division
- Division and remainder happen together
- Only specific registers can be used
Why might a processor architecture design division this way?
Performing integer division
mov $9, %rax # %rax holds the numerator
mov $5, %rbx
cqo # now %rdx:%rax holds the sign-extended numerator (cqo is the 64-bit form of cdq)
idiv %rbx # divide %rdx:%rax by %rbx
# quotient (result) now in %rax
# remainder now in %rdx
Notes
- Use idiv for division and modulo
- Set up predefined registers first (%rax and %rdx)
- Retrieve results from predefined registers (%rax and %rdx)
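The results idiv leaves in the two registers can be modeled as below. Signed idiv truncates the quotient toward zero and gives the remainder the sign of the numerator, which differs from Python's floor-based `//` and `%` for negative values. This sketch ignores the fixed register width, so it does not model the overflow case; the function name `idiv` is chosen here only to mirror the instruction.

```python
def idiv(numerator, denominator):
    """Model signed idiv: return (quotient, remainder) as left in (%rax, %rdx)."""
    if denominator == 0:
        raise ZeroDivisionError("idiv raises a divide error on a zero divisor")
    # idiv truncates toward zero; Python's // floors, so divide the magnitudes.
    quotient = abs(numerator) // abs(denominator)
    if (numerator < 0) != (denominator < 0):
        quotient = -quotient
    remainder = numerator - quotient * denominator
    return quotient, remainder

print(idiv(9, 5))   # (1, 4)
print(idiv(-9, 5))  # (-1, -4)
```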
Example
function main
localVariables x a b
a := 9
b := 5
x := a / b
return x
Assembly code
# setup operands (%rax must hold numerator)
mov -24(%rbp), %rax
mov -32(%rbp), %rbx
# perform division
cqo # sign-extend %rax into %rdx:%rax (64-bit form of cdq)
idiv %rbx # %rbx is the denominator
# store the result in x
mov %rax, -16(%rbp)
Modulo
How would the assembly code change for modulo?
# original: store the quotient (%rax) in x
mov %rax, -16(%rbp)
# new: store the remainder (%rdx) in x
mov %rdx, -16(%rbp)
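Since division and modulo differ only in which register is stored, one generator can cover both. The following is a sketch with made-up names (`gen_division`, `asm_operand`). It emits `cqo`, which sign-extends %rax into %rdx:%rax and is the 64-bit counterpart of `cdq`.

```python
def gen_division(dest, operand1, operator, operand2, symbol_table):
    """Emit the idiv sequence for DESTINATION := OPERAND1 (/ or %) OPERAND2."""
    def asm_operand(operand):
        # NUM becomes an immediate ($N); NAME becomes OFFSET(%rbp).
        if operand.isdigit():
            return f"${operand}"
        return f"{symbol_table[operand]}(%rbp)"

    # idiv leaves the quotient in %rax and the remainder in %rdx.
    result_register = {"/": "%rax", "%": "%rdx"}[operator]
    return [
        f"mov {asm_operand(operand1)}, %rax",  # numerator must be in %rax
        f"mov {asm_operand(operand2)}, %rbx",  # denominator
        "cqo",                                 # sign-extend %rax into %rdx:%rax
        "idiv %rbx",
        f"mov {result_register}, {symbol_table[dest]}(%rbp)",
    ]

table = {"x": -16, "a": -24, "b": -32}
print("\n".join(gen_division("x", "a", "%", "b", table)))
# mov -24(%rbp), %rax
# mov -32(%rbp), %rbx
# cqo
# idiv %rbx
# mov %rdx, -16(%rbp)
```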
Project
Code Generation I (codegen1) Project
- Show C++ and ANTLR usage
- Examples in given functions
- Show project setup
- Show given functions
- Show hard-coded functions