I. Supporting Subroutines in the ISA A. feature design to support subroutines What features of an ISA are needed to make writing subroutines easier and have them execute faster? ------------------------------------------ GOALS FOR SUBROUTINES subroutine = function or procedure (abstraction of expressions or commands) ------------------------------------------ ------------------------------------------ VM FEATURES TO SUPPORT SUBROUTINES ------------------------------------------ B. design for statically-scoped subroutines 1. what is static scoping? ------------------------------------------ STATIC SCOPING def: In *static scoping*, each identifier, x def: In *dynamic scoping*, each identifier, x, ------------------------------------------ Is there another way that identifiers could be found in a program? What kind of scoping is in C, C++, and Java? The Unix shell? 2. motivation for static scoping ------------------------------------------ MOTIVATION FOR STATIC SCOPING int incr = 1; int addOne(int y) { return y+incr; } int client() { int incr = 2; int z = addOne(3); // what is the value of z here? return z; } ------------------------------------------ What should addOne do? What should the value of z be? 3. block structure ------------------------------------------ BLOCK STRUCTURE def: A *block* is Usual Grammar: Example in C { int next = 3*x+1; next = next / 2; return next; } ------------------------------------------ ------------------------------------------ ADVANTAGES OF BLOCK STRUCTURE - Local storage - Control of names - Easier to extract procedures ------------------------------------------ 4. motivation for recursion Do you think recursion is hard to use and understand? ------------------------------------------ RECURSIVE DATA ==> RECURSIVE PROGRAMS A good rule of design: make program structure like the data structure #include #include "Tdef.h" typedef struct treeNode { T value; struct treeNode *left, *right; } tree; // helper to compute maximum of its args int max(int a, int b) { return (a >= b) ? a : b; } ------------------------------------------ How should we write a program to find the depth of a tree? Is that better than using while loops and an explicit stack? ------------------------------------------ RECURSIVE GRAMMARS Grammar for statements: ::= ... | while () Structure of a parser: // typedef /* ... */ stmtTree; stmtTree *parseStatement() { /* ... */ parseWhileLoop(); /* ... */ } stmtTree *parseWhileLoop() { /* ... */ parseStatement(); /* ... */ } ------------------------------------------ Is it easy to follow what these routines are doing? Why are natural languages structured recursively? 5. design for subroutines ------------------------------------------ VM DESIGN FOR SUBROUTINES For each call: - storage for a subroutine's variables (local storage) organized as - storage for a single call is called an AR: def: An *activation record* (AR) ------------------------------------------ Why a stack? ------------------------------------------ STACK ORGANIZATION Initially: [ 0 ] Call of subroutine P: [ AR for P ] [ 0 ] After P calls Q: [ AR for Q ] [ AR for P ] [ 0 ] After Q calls R: [ AR for R ] [ AR for Q ] [ AR for P ] [ 0 ] After R returns: [ AR for Q ] [ AR for P ] [ 0 ] After Q returns: [ AR for P ] [ 0 ] After P returns: [ 0 ] ------------------------------------------ 6. stack implementation Since a computer's memory is like a big (1D) array, how should we implement the runtime stack? ------------------------------------------ STACK IMPLEMENTATION AR delimited by two indexes: - SP: - BP: Notes, assuming stack is word-addressed, and grows towards higher addresses (up) ------------------------------------------ Does the storage for a subroutine vary dynamically? ------------------------------------------ STACK IMPLEMENTATION: HEADER FILES Assume: - stack is word addressed - stack grows towards higher addresses (upwards) // File: machine_types.h #ifndef _MACHINE_TYPES_H #define _MACHINE_TYPES_H // words for this machine typedef int word; // addresses for this machine typedef unsigned short int address; #endif // File: stack.h #ifndef _STACK_H #define _STACK_H #include #include #include "machine_types.h" #define MAX_STACK_HEIGHT 2048 // Initialize the stack data structure extern void stack_initialize(); /* ... other extern declarations ... */ #endif ------------------------------------------ ------------------------------------------ STACK IMPLEMENTATION: STACK.C FILE #include #include "utilities.h" #include "stack.h" // the stack's storage static word stack[MAX_STACK_HEIGHT]; // index of next free element static address sp; // first index of current AR static address bp; // Initialize the stack data structure void stack_initialize() { } // Return the stack's num. of elements // (SP value) address stack_size() { } // Return the address of the base // of the current AR (BP value) address stack_AR_base() { return bp; } // Is the stack empty? bool stack_empty() { } // Is the stack full? bool stack_full() { } // Requires: !stack_full() // push a word on the stack void stack_push(word val) { } // Requires: stack_size() + n // < MAX_STACK_HEIGHT // Increase the size of the stack by n void stack_allocate(unsigned int n) { int new_sp = sp + n; if (!legal_stack_index(new_sp)) { // report error } sp = new_sp; } // Requires: !stack_empty() // pop the stack and return the top elem word stack_pop() { } // return the top element without popping word stack_top() { } ------------------------------------------ ------------------------------------------ SOME OTHER FUNCTIONS // Is the given address legal? static bool legal_stack_index(address addr) { return addr < MAX_STACK_HEIGHT; } // fetch the value from the given address word stack_fetch(address addr) { if (!legal_stack_index(addr)) { // report error } return stack[addr]; } // assign val to the given address, addr, on the stack void stack_assign(address addr, word val) { if (!legal_stack_index(addr)) { // report error } stack[addr] = val; } ------------------------------------------ Why do we have fetching and assigning to other than the top element? C. Addressing for nested routines 1. Problem of addressing locals ------------------------------------------ HOW TO ADDRESS LOCAL VARIABLES? proc p0() { var int x; fun p1() { var int y; var int z; proc p2() { var int a; return a+x*x+z; } /* ... */ p2(); /* ... */ p0(); /* ... */ } /* ... */ p1(); /*... */ p0(); /* ... */ } ------------------------------------------ How manycalls to p0 and p1 will be on the stack when the return from p2 is evaluated? At the return, how can the compiler find the locations of x and y? ------------------------------------------ THE PROBLEM Programming language features - subroutines - nesting of subroutines (Pascal, JS) - static scoping ==> absolute address of variables is unpredictable ------------------------------------------ Can we tell from the text of a program where a routine's ARs will be on the runtime stack? If we don't know where the AR will be on the runtime stack how can local variables in an AR be addressed? 2. Compiler response to solve the problem ------------------------------------------ COMPILER-BASED SOLUTION What can compiler know statically about local variable locations? What would we need to find exact location of a local variable? When can an AR be created that needs to know the base of the surrounding AR? Could we pass the base of the AR for the surrounding scope in a call? If each AR stores a (static) link to the AR of the surrounding scope, how can we address two layers out? 3? What information is needed to address a local variable in a surrounding scope? ------------------------------------------ Are all calls like that? a. Summary: two-part addresses ------------------------------------------ SUMMARY Compilers use two-part addresses, called *lexical addresses* that consist of: ------------------------------------------ ------------------------------------------ HOW TO ADDRESS LOCAL VARIABLES? proc p0() { var int x; fun f1() { var int y; var int z; proc p2() { var int a; fun f3() { /* ... */ } // here --v return f1()+f3()+a+x*x+z; } /* ... */ p2(); /* ... */ p0(); /* ... */ } /* ... */ p1(); /*... */ p0(); /* ... */ } ------------------------------------------ At the statement marked "here", what is the lexical address of x? What is the lexical address of z? What is the lexical address of a? What is the lexical address of f3? What is the lexical address of f1? 3. What is stored in an AR ------------------------------------------ INFORMATION STORED IN ARs What information is needed in an AR? ------------------------------------------ 4. VM stack operations for call/return ------------------------------------------ WORKING WITH ARs // Requires: stack_size()+3 // < MAX_STACK_HEIGHT // call a subroutine void stack_call(int ret_addr, address static_link) { } // Requires: restored BP and SP values // satisfy stack's invariant // return given value from a subroutine extern void stack_return_value( int *PC, word fun_value) { } ------------------------------------------ What is sp now? a. pictures ------------------------------------------ PICTURE OF CALL OPERATION Execution of stack_call(PC, SL): PC: 10 BP: 50 SP: 57 63 | | 62 | | 61 | | 60 | | 59 | | 58 | | 57 | | <- SP 56 | arg[3] | 55 | arg[2] | 54 | arg[1] | 53 | | 52 | | 51 | | 50 | | <- BP | | ------------------------------------------ ------------------------------------------ PICTURE OF RETURN OPERATION Execution of stack_return_value(&pc, v): PC: 215 BP: 57 SP: 63 63 | | <- SP 62 | | 61 | | 60 | | 59 | 10 (old PC) | 58 | 50 (old BP) | 57 | | <- BP 56 | arg[3] | 55 | arg[2] | 54 | arg[1] | 53 | | 52 | | 51 | | 50 | | | | ------------------------------------------ b. determining the static link to pass ------------------------------------------ WHAT STATIC LINK TO PASS? If routine R calls E, what static link is passed? ------------------------------------------ II. Other VM designs A. Designs seen previously ------------------------------------------ VM DESIGNS PREVIOUSLY SEEN 1-register machine == Tiny VM Stack machine == VM of Homework 1 ------------------------------------------ B. P-machine or VM/0 ------------------------------------------ P-MACHINE (VM/0) Several registers: R0, R1, R2, R3 Arithmetic names 3 registers: target + 2 sources Comparisons name 2 registers: Load, Store name 1 register: Call, Return use static links ------------------------------------------ 1. P-machine instructions ------------------------------------------ P-MACHINE INSTRUCTIONS Word length == 16 bits Memory is word addressed Instruction Format: OP code: 4 bits Registers: 2 bits Memory location: 8 bits Code location: 12 bits Examples: 1 register, 1 address/value: LIT Rd, 0, M [ Op | Rd | - | M ] <-4--> 2 2 <--8 bits-----> LOD Rd, 0, M 3 registers: SUB Rd, Ry, Rz [ Op | Rd | Ry | Rz | func ] <-4--> 2 2 2 <-6 bits-> System operations, I/O: SYS 0, 0, 0 [ Op | - | - | - | Dv ] <-4--> 2 2 <-6 bits-> 2 Jumps: JMP 0,0,M [ Op | M ] <-4--> <-- 12 bits -----------> ------------------------------------------ What happens to the PC during execute cycle? 2. adapting to changes ------------------------------------------ ADAPTING TO CHANGES Larger word size of 32 bits? Byte addressing? with 16 bit words: with 32 bit words: ------------------------------------------ How does the x86 architecture address more memory?