
Machine code generation
Lecture 17

Machine code generation

By this stage the program is already in intermediate code, so we no longer need to worry about SimpleC semantics at all.

Intermediate code

  • Closer to assembly, easier to translate
  • What makes three-address code (TAC) different from assembly?

X86 assembly primer (AT&T style)

Operands

  • Immediate: $1
  • Register: %rax
  • Memory (absolute address): 0xdeadbeef
  • Register indirect (pointers): (%rbp)
    • %rbp is a register that holds an address
  • Register indirect plus offset: -32(%rbp)
    • Refers to the value stored at the address in %rbp minus 32 bytes
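
A code generator has to print these operand forms over and over, so it can help to wrap each one in a tiny helper. The following is a minimal Python sketch; the helper names (imm, reg, absolute, indirect) are hypothetical and not part of the course project.

```python
# Hypothetical helpers that render each AT&T operand form as text.

def imm(value):
    """Immediate operand: a literal constant, prefixed with $."""
    return f"${value}"

def reg(name):
    """Register operand, prefixed with %."""
    return f"%{name}"

def absolute(address):
    """Memory operand at a fixed address (no prefix)."""
    return hex(address)

def indirect(base, offset=0):
    """Register indirect, with an optional byte offset from the base register."""
    if offset == 0:
        return f"({reg(base)})"
    return f"{offset}({reg(base)})"

print(imm(1), reg("rax"), absolute(0xdeadbeef), indirect("rbp"), indirect("rbp", -32))
```

The last call renders the form used throughout the rest of these notes: a stack slot addressed relative to %rbp.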

Memory layout for SimpleC

Opcodes

CONST

CONST _t0 1
movq	$1, -32(%rbp)

This assumes _t0 has already been allocated a slot in the stack frame (here at -32(%rbp)).
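
One way to picture the translation is a function from a CONST instruction to a list of assembly lines. This is a sketch under stated assumptions: the function name emit_const and the slots dictionary (TAC name to %rbp-relative byte offset) are hypothetical, not the project's actual interface.

```python
def emit_const(slots, dst, value):
    """Translate `CONST dst value` into a single store to dst's stack slot.

    Assumption: `slots` maps each TAC name to its byte offset from %rbp.
    """
    # The q suffix is required: with an immediate source and a memory
    # destination, no register operand tells the assembler the operand size.
    # Caveat: this form only encodes immediates that fit in 32 bits
    # (sign-extended); larger constants need a different sequence.
    return [f"movq\t${value}, {slots[dst]}(%rbp)"]

slots = {"_t0": -32}  # offset taken from the example above
print("\n".join(emit_const(slots, "_t0", 1)))
```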

ASSIGN

ASSIGN true _t0
mov	-32(%rbp), %rax
mov	%rax, -8(%rbp)

This assumes true and _t0 have already been allocated stack slots (here -8(%rbp) and -32(%rbp)).

How else can this be implemented?

A later optimization step can reduce the number of instructions, substitute faster instructions, and so on.
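
The copy goes through a register because an x86 mov cannot take two memory operands. A minimal sketch of that pattern follows; emit_assign, the slots dictionary, and the choice of %rax as the scratch register are assumptions made for illustration.

```python
def emit_assign(slots, dst, src, scratch="rax"):
    """Translate `ASSIGN dst src`: copy src's stack slot into dst's slot.

    Assumption: `slots` maps each TAC name to its byte offset from %rbp.
    """
    return [
        f"mov\t{slots[src]}(%rbp), %{scratch}",  # load the source slot
        f"mov\t%{scratch}, {slots[dst]}(%rbp)",  # store to the destination slot
    ]

slots = {"true": -8, "_t0": -32}  # offsets taken from the example above
print("\n".join(emit_assign(slots, "true", "_t0")))
```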

Arithmetic operators

SUB _t5 x _t4
mov	-24(%rbp), %rax
mov	-64(%rbp), %rcx
sub	%rcx, %rax
mov	%rax, -72(%rbp)

All temporaries and locals live in the stack frame in memory. In this example we load every operand into a register before operating on it and store the result back to memory afterwards.

Are there any ways to make this more efficient? Will they work for any given SimpleC program?
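
The load, operate, store pattern can be sketched as one function parameterized by the opcode. Only SUB appears in these notes; the ADD entry is an assumed opcode name, included to show how the same shape would generalize. The names emit_arith and MNEMONIC are hypothetical.

```python
# Opcode-to-mnemonic table. SUB comes from the notes; ADD is an assumption.
MNEMONIC = {"SUB": "sub", "ADD": "add"}

def emit_arith(slots, op, dst, lhs, rhs):
    """Translate `op dst lhs rhs`, meaning dst = lhs op rhs.

    Assumption: `slots` maps each TAC name to its byte offset from %rbp.
    """
    return [
        f"mov\t{slots[lhs]}(%rbp), %rax",   # left operand into %rax
        f"mov\t{slots[rhs]}(%rbp), %rcx",   # right operand into %rcx
        f"{MNEMONIC[op]}\t%rcx, %rax",      # AT&T order: %rax = %rax op %rcx
        f"mov\t%rax, {slots[dst]}(%rbp)",   # result back to dst's slot
    ]

slots = {"x": -24, "_t4": -64, "_t5": -72}  # offsets from the example above
print("\n".join(emit_arith(slots, "SUB", "_t5", "x", "_t4")))
```

Operand order matters for sub: the destination is the second operand in AT&T syntax, so the left-hand TAC operand must be the one loaded into %rax.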

Labels and branching

GOTOLE _l2_main x _t2
...
LABEL _l2_main
...

  mov	-24(%rbp), %rax
  mov	-48(%rbp), %rcx
  cmp	%rcx, %rax
  jle	_l2_main
  ...
_l2_main:
  ...

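x86 has a flags register (RFLAGS). cmp sets the flags from the difference of its operands, and the conditional jump (jXX) instructions test those flags. In AT&T syntax, cmp %rcx, %rax compares %rax against %rcx, so the jle above is taken when x <= _t2.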
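
The branching opcodes used in these notes (LABEL, GOTO, GOTOLE, GOTOZE) can be sketched in the same style. The function names and the slots dictionary are hypothetical; the instruction sequences mirror the ones shown in the examples on this page.

```python
def emit_label(name):
    """LABEL name: an assembly label is the name followed by a colon."""
    return [f"{name}:"]

def emit_goto(label):
    """GOTO label: unconditional jump."""
    return [f"jmp\t{label}"]

def emit_gotole(slots, label, lhs, rhs):
    """GOTOLE label lhs rhs: jump to label when lhs <= rhs (signed)."""
    return [
        f"mov\t{slots[lhs]}(%rbp), %rax",
        f"mov\t{slots[rhs]}(%rbp), %rcx",
        "cmp\t%rcx, %rax",   # sets flags from %rax - %rcx
        f"jle\t{label}",     # taken when %rax <= %rcx
    ]

def emit_gotoze(slots, label, src):
    """GOTOZE label src: jump to label when src is zero."""
    return [
        f"mov\t{slots[src]}(%rbp), %rax",
        "cmp\t$0, %rax",     # sets the zero flag when %rax == 0
        f"jz\t{label}",
    ]

slots = {"x": -24, "_t2": -48, "_t3": -56}  # offsets from the examples
print("\n".join(emit_gotole(slots, "_l2_main", "x", "_t2")))
```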

Example program

SimpleC

main() {
  int x;
  input x;
  while(x > 0) { x = x - 1; }
  return 0;
}

Intermediate code

[main
CONST _t0 1
ASSIGN true _t0
CONST _t1 0
ASSIGN false _t1
INPUT x
LABEL _l0_main
CONST _t2 0
GOTOLE _l2_main x _t2
CONST _t3 1
GOTO _l3_main
LABEL _l2_main
CONST _t3 0
LABEL _l3_main
GOTOZE _l1_main _t3
CONST _t4 1
SUB _t5 x _t4
ASSIGN x _t5
GOTO _l0_main
LABEL _l1_main
CONST _t6 0
RETURN _t6
]

Assembly code

.text
.globl main
.type main, @function
main:
  push	%rbp
  mov	%rsp, %rbp
  sub	$96, %rsp
  movq	$1, -32(%rbp)
  mov	-32(%rbp), %rax
  mov	%rax, -8(%rbp)
  movq	$0, -40(%rbp)
  mov	-40(%rbp), %rax
  mov	%rax, -16(%rbp)
  call	input_int64_t@PLT
  mov	%rax, -24(%rbp)
_l0_main:
  movq	$0, -48(%rbp)
  mov	-24(%rbp), %rax
  mov	-48(%rbp), %rcx
  cmp	%rcx, %rax
  jle	_l2_main
  movq	$1, -56(%rbp)
  jmp	_l3_main
_l2_main:
  movq	$0, -56(%rbp)
_l3_main:
  mov	-56(%rbp), %rax
  cmp	$0, %rax
  jz	_l1_main
  movq	$1, -64(%rbp)
  mov	-24(%rbp), %rax
  mov	-64(%rbp), %rcx
  sub	%rcx, %rax
  mov	%rax, -72(%rbp)
  mov	-72(%rbp), %rax
  mov	%rax, -24(%rbp)
  jmp	_l0_main
_l1_main:
  movq	$0, -80(%rbp)
  mov	-80(%rbp), %rax
  jmp	_main_return
_main_return:
  mov	%rbp, %rsp
  pop	%rbp
  ret
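
The offsets in the assembly above follow a simple pattern: each local and temporary gets its own 8-byte slot below %rbp, with true, false, and x at -8, -16, and -24 and the temporaries _t0 through _t6 at -32 through -80. The sketch below reproduces that assignment. The function name allocate_slots and the ordering rule (locals first, then temporaries in numeric order) are assumptions inferred from the listing, not a description of the actual compiler.

```python
def allocate_slots(names, slot_size=8):
    """Give each name its own slot below %rbp: -8, -16, -24, ..."""
    return {name: -slot_size * (i + 1) for i, name in enumerate(names)}

# Assumed ordering: locals first, then temporaries in numeric order.
locals_ = ["true", "false", "x"]
temps = [f"_t{i}" for i in range(7)]
slots = allocate_slots(locals_ + temps)

print(slots["x"], slots["_t0"], slots["_t6"])
```

The prologue above reserves 96 bytes, which is more than the 80 bytes these ten slots occupy; the sketch does not try to reproduce the frame-size calculation.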

Register allocation

Next time

Assembly file layout

Function calls

I/O

Using GDB

Compiler project

Author: Paul Gazzillo

Created: 2022-03-21 Mon 13:50