COP 3402
Spring 2002
Lab #2 (30 pts - out of a total of 400 points for the course)
Due: Tuesday, February 26

Notes from the previous lab:

- You may work in groups of one or two students. If you work in a
  group of two, turn in only one copy of the lab with both names on
  it, i.e., don't turn in two copies of the same program.
- The program (in whatever condition) must be turned in (in class)
  on the day it is due. You'll get partial credit if the program is
  not completely done. You'll get NO credit if you turn in the
  program late.
- If you use this file as the "skeleton" for your program, make sure
  you delete the extra stuff. Also, some of my explanations are more
  detailed than a normal comment would be, to make them easier to
  follow. You shouldn't keep such long explanations as comments;
  comments don't need to be that long.

This program will implement Pass-1 of a two-pass assembler for the
SIC/XE.

Notes:

- You don't have to use exactly the same names as I have used to
  explain the lab (just make sure they are meaningful).
- Your program structure does not have to be exactly the same as the
  structure I have used to explain the lab (just make sure your
  program is well structured).
- Your data structures and algorithms for their manipulations must
  be exactly as specified in the lab.

It may be helpful to look at Sample Input/Output as you read the lab
description.

/********************************************************************/

#include <stdio.h>

FILE *ot_fptr,    /* Operation Table File Pointer; points to the file
                     containing the input operation table */
     *in_fptr,    /* Input File Pointer; points to the file containing
                     the input assembly program */
     *out_fptr,   /* Output File Pointer; points to the file to
                     contain all the output of the program */
     *fopen();

#define TRUE  1
#define FALSE 0

#define NAME_LENGTH 10   /* maximum number of characters in name */
#define BKT_SIZE    19   /* number of entries in bucket directory */
#define DIG_FACTOR   2   /* digit factor used in hashing */
#define LET_FACTOR   3   /* letter factor used in hashing */

/* an entry in Symbol Table */
struct st_entry
{
   char st_name[NAME_LENGTH + 1];
   char st_type;                /* R (Relocatable) or A (Absolute) */
   int  st_value;
   struct st_entry *st_next;
};

struct st_entry *symtbl[BKT_SIZE];
   /* bucket directory (hash index); symbol table is maintained using
      hashing (we did this in Lab #1 - we had a set of linked lists);
      An entry in this table will indicate a label name, its type and
      its value (st_name is used for hashing purposes). */

#define OP_LENGTH  6     /* maximum number of characters in operation */
#define OT_SIZE  100     /* maximum number of entries in Operation
                            Table */

/* an entry in Operation Table */
struct ot_entry
{
   char ot_mnemonic[OP_LENGTH + 1];
   int  ot_format;   /* 0 (assembler directive), 1 (format 1),
                        2 (format 2), 3 (format 3 or 4) */
};

struct ot_entry optbl[OT_SIZE];
   /* operation table is maintained as an array. The optbl will be
      read from a file (using ot_fptr). I will put the file on the web
      and you can download it so that only one person (me) has to type
      the file. After reading values into optbl (i.e., after loading
      it), the optbl must be sorted by your program (you may use any
      sorting technique). Binary search will then be used for
      searching this table. The assembler directives are also kept in
      this table. So the directives will also be in the file (pointed
      to by ot_fptr) that you read into optbl (note that you read only
      one file to load optbl). */

int ot_count;
   /* number of entries in optbl; when reading optbl from the file,
      this variable must be set to indicate how many entries there are
      in optbl (note that OT_SIZE indicates the maximum number of
      entries and not the actual number of entries, i.e., ot_count
      will most likely be less than OT_SIZE) */

char ERR_MSGS[2][25] = { "Invalid Operation", "Duplicate Label" };
   /* error messages; note that since we are not going to change the
      value of this array in the program, we have used upper case for
      the array name to indicate it is really a constant */

#define INV_OP  0   /* index to ERR_MSGS for invalid operation */
#define DUP_LBL 1   /* index to ERR_MSGS for duplicate label */

/* an entry in Error Queue */
struct eq_entry
{
   int eq_lineno;    /* line number */
   int eq_msgind;    /* index into ERR_MSGS (error message index) */
   struct eq_entry *eq_next;
};

struct
{
   struct eq_entry *front;
   struct eq_entry *rear;
} errtbl;
   /* error table; all the errors for the input assembly-language
      program are maintained in a queue; when there is an error in the
      input program, a new node is inserted into the queue; this node
      will contain the line number of the erroneous input line and an
      index to ERR_MSGS indicating the type of the error (note that
      the index is set to INV_OP or DUP_LBL) */

int locctr;
   /* location counter; this is incremented by the appropriate value
      for each instruction; increment this only if the instruction
      affects it, e.g., if the input line is a comment or has an
      error, locctr is not incremented */

int lineno;
   /* line number; this is incremented by 1 for each input assembly
      language line; increment this regardless of input being comment,
      directive, erroneous, ... */

#define OPND_LENGTH 50   /* maximum number of characters in operand */
#define INP_LENGTH  70   /* maximum number of characters in an input
                            assembly line */

/********************************************************************/

int main()
{
   void init(), pass1(), prntinfo();

   init();
   pass1();
   prntinfo();
   return 0;
}/* end of main */

/********************************************************************/

void init()
{
   This routine is used by the main routine.
   It initializes the global variables such as locctr and lineno.
   It opens the three files (sets ot_fptr, in_fptr, out_fptr).
   It loads optbl (using ot_fptr) and sorts optbl.
   It writes the sorted optbl to output file.
   It initializes symtbl.
   It initializes errtbl.
}/* end of init */

/********************************************************************/

void pass1()
/* This routine is used by the main routine. It performs Pass-1     */
/* of the assembler.                                                */
{
   char inp_line[INP_LENGTH + 1],   /* input assembly line */
        label[NAME_LENGTH + 1],     /* label in the input */
        oper[OP_LENGTH + 1],        /* operation in the input */
        opnd[OPND_LENGTH + 1];      /* operand in the input; this will
                                       contain everything after the
                                       operation in the input */
   void get_tokens(), dir_processing(), machine_processing();

   /* read each input assembly line and perform pass-1 */
   while ( fgets(&inp_line[0], INP_LENGTH, in_fptr) != NULL )
   {
      /* increment the line number */
      ++lineno;

      if input is a comment
      {
         write a line to output file;
            (comments are indicated by a period in the first column;
            all comments are on separate lines, i.e., there won't be
            in-line commenting);
         don't increment locctr;
         done with this input line;
      }

      /* get different parts of the input */
      get_tokens(inp_line, &label[0], &oper[0], &opnd[0]);

      search for operation in optbl;
      if not found (i.e., Invalid Operation error)
      {
         add a node to error queue;
         write a line to output file;
         don't increment locctr;
         done with this input line
            (note that we are not processing the label field);
      }

      if assembler directive
         invoke the routine that processes assembler directives
      else
         invoke the routine that processes machine operations
   }/* end while */
}/* end of pass1 */

/********************************************************************/

void dir_processing(/* parameters */)
/* This routine is used by pass1 to process assembler directives.   */
{
   If there is a label at the beginning of the input line, the label
   is processed in the same way as explained in the machine_processing
   routine. st_value for the label is locctr and st_type is R. The
   only exception is the assembler directive EQU, where st_value and
   st_type for the label depend on the operand in the input line.

   The assembler directives to be handled in this lab are START, END,
   BYTE, RESB, RESW, and EQU.

   START: Put the label in symtbl and set different fields for the
          label.

   END:   If there is a label at the beginning of the input line, put
          the label in symtbl and set different fields for it.

   BYTE:  This can be one of two forms:
             LABEL BYTE C'...'
             LABEL BYTE X'...'
          In the first case, each character represents one byte. In
          the second case, every two hex digits represent one byte.
          In either case, determine the number of bytes needed and
          increment locctr accordingly. If the input has X but the
          number of hex digits is not a multiple of 2, we still have
          to reserve enough space for the whole value, e.g., if we
          have X but the value is only, say, 3 hex digits, we have to
          reserve 2 bytes (in Pass-2, we worry about what to put in
          unused bits).

   RESB:  Increment locctr accordingly (assume the operand is an
          integer).

   RESW:  Increment locctr accordingly (assume the operand is an
          integer).

   EQU:   This can be one of two forms:
             LABEL EQU lblvalue
             LABEL EQU part1-part2
          In the first case, lblvalue can be an integer (such as
          1024), another label (a relocatable or an absolute
          address), or * (refers to the current value of location
          counter; * represents a relocatable address).
          In the second case, part1 and part2 are both relocatable
          addresses, i.e., the result is an absolute address.
          You have to determine the value and the type for the label
          and put it in symtbl.

   In all cases, write a line to output file.
}/* end of dir_processing */

/********************************************************************/

void machine_processing(/* parameters */)
/* This routine is used by pass1 to process machine operations.     */
{
   if there is a label at the beginning of the input line,
      search for the label in symtbl
      if found (i.e., Duplicate Label error)
      {
         add a node to error queue;
         write a line to output file;
         don't increment locctr;
         done with this input line (i.e., done with this routine);
            (note that we are not inserting the label in symtbl)
      }
      else
      {
         put the label in symtbl;
         set different fields for the label
            (st_value is locctr and st_type is R);
      }

   Find number of bytes needed by the instruction and increment
   locctr accordingly.
   Write a line to output file.
}/* end of machine_processing */

/********************************************************************/

void prntinfo()
/* This routine is used by the main routine after all the input     */
/* assembly statements have been processed. This routine writes     */
/* errtbl and symtbl to the output file.                            */
{
   Note that when this routine is invoked, the sorted optbl and the
   output for the input assembly lines have already been written to
   the output file.

   Write each node of the errtbl queue to the output file.
   Write the symtbl to the output file.
}/* end of prntinfo */

/********************************************************************/

- Input data (optbl and assembly program) will be on the course web
  site. The optbl will be available immediately, but the other file
  will be available one week before the lab is due.
- You are required to do only the error checking specified in the
  lab. When a line is in error, that line will have no effect on
  anything (except lineno), e.g., if the erroneous line is defining
  a duplicate label, the label will keep its old definition.
- The input for optbl has a string and an integer (separated by
  blanks) on each line. The assembly program input is of the form
     LABEL OP OPERAND
  with label starting in column 1, op starting in column 12, and
  operand starting in column 19.
- There is no in-line commenting, i.e., all comments are on separate
  lines (a period in column 1 indicates comment). locctr is not
  incremented for a comment line, but lineno is.
- The input data will follow the exact format specified in Sample
  Input, and your output must follow the exact format specified in
  Sample Output.

Sample Input:

The file for optbl

12345678901234567890  -- this line won't be in the file
RESB   0
RESW   0
ADD    3
ADDR   2
ADDF   3
CLEAR  2
START  0
END    0
EQU    0
JSUB   3
LDB    3
RSUB   3
STL    3
...

The file for input assembly program (note that we are not concerned
with whether or not the program is semantically correct or makes any
sense)

12345678901234567890  -- this line won't be in the file
DEMO       START  0
. this is a comment
FIRST      STL    RETADR
           LDB    #LENGTH
           ALI    0
           +JSUB  RDREC
RETADR     RESW   1
LENGTH     RESW   1
RETADR     RESB   100
BUFFER     RESB   4096
BUFEND     EQU    *
MAXLEN     EQU    4096
.
RDREC      CLEAR  X
           LDB    @RETADR
           RSUB
           END    FIRST

Sample Output:

123456789012345678901234567890  -- this line won't be in the file
The Operation Table

Index   Operation   Format
  0     ADD           3
  1     ADDF          3
  2     ADDR          2
  3     CLEAR         2
  4     END           0
  5     EQU           0
  6     JSUB          3
  7     LDB           3
  8     RESB          0
  9     RESW          0
 10     RSUB          3
 11     START         0
 12     STL           3
...

This line and the next two lines won't be in the file
         1         2         3         4         5         6
1234567890123456789012345678901234567890123456789012345678901234567
Line  Loc   Source Statement                     Error   Op Index
   1  0000  DEMO       START  0                             11
   2        . this is a comment
   3  0000  FIRST      STL    RETADR                        12
   4  0003             LDB    #LENGTH                        7
   5                   ALI    0                    Y
   6  0006             +JSUB  RDREC                          6
   7  000A  RETADR     RESW   1                              9
   8  000D  LENGTH     RESW   1                              9
   9        RETADR     RESB   100                  Y
  10  0010  BUFFER     RESB   4096                           8
  11  1010  BUFEND     EQU    *                              5
  12  1010  MAXLEN     EQU    4096                           5
  13        .
  14  1010  RDREC      CLEAR  X                              3
  15  1012             LDB    @RETADR                        7
  16  1015             RSUB                                 10
  17  1018             END    FIRST                          4

Note: When there is
         LABEL EQU absolute-value
      your textbook puts the absolute-value under the column Loc.
      We will put the value of locctr under this column all the time.

123456789012345678901234567890  -- this line won't be in the file
Error Table

Line   Error
   5   Invalid Operation
   9   Duplicate Label

This line and the next two lines won't be in the file
         1         2         3         4
1234567890123456789012345678901234567890
Symbol Table

Hash Index   Name         Value   Type
    12       RDREC        1010     R
             DEMO         0000     R
    15       BUFFER       0010     R
             RETADR       000A     R
             FIRST        0000     R
    18       MAXLEN       1000     A
             BUFEND       1010     R
             LENGTH       000D     R

/********************************************************************/

What To Turn In
---------------

- A disk containing source code, executable code, and other relevant
  files (if any). Write the language and platform on the disk to make
  it easier for us to compile and run your program, i.e., we
  shouldn't have to spend time figuring out what you used in order to
  use the same.
- A hard copy of source code.
- One week before the lab is due, we will put the input on the course
  web site. Download the input file and run your program with this
  input to generate an output file. Turn in a hard copy of this
  output file as well.

Put all of these in an envelope; they will be returned to you in the
envelope. Please use a large envelope so that you don't have to fold
your printouts.

/********************************************************************/

- Make your program easy to read, i.e., no cryptic programming.
- Comment your program properly, using three types of comments:
   - comments explaining what each routine does
   - comments explaining different segments of each routine
   - comments describing each variable
- Indent your program properly.
- Use meaningful names.
- Use constants (#define), i.e., don't hard code numbers in your
  program.
- You can (and should) use more routines if you want to.
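
As one illustration of splitting work into small routines, the locctr
arithmetic described for BYTE and for machine operations can be
sketched as below. This is only a sketch under stated assumptions, not
the required solution: the names byte_length and instr_length are
hypothetical, a C'...' constant is assumed to contain no embedded
apostrophe, and a leading + on the operation is assumed to select
format 4 (the Sample Output advances locctr by 4 after +JSUB).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: number of bytes reserved by a BYTE operand of
   the form C'...' or X'...'. Returns 0 if the operand has neither
   form. Assumes no apostrophe appears inside the quotes. */
int byte_length(const char *opnd)
{
   const char *open, *close;
   int count;

   assert(opnd != NULL);
   if ((opnd[0] != 'C' && opnd[0] != 'X') || opnd[1] != '\'')
      return 0;

   open  = opnd + 1;                    /* opening quote */
   close = strchr(open + 1, '\'');      /* closing quote */
   if (close == NULL)
      return 0;

   count = (int)(close - open - 1);     /* characters between quotes */
   if (opnd[0] == 'C')
      return count;                     /* one byte per character */
   return (count + 1) / 2;              /* two hex digits per byte,
                                           rounded up for odd counts */
}

/* Hypothetical helper: number of bytes taken by a machine operation.
   ot_format is the value stored in optbl (1, 2, or 3); a leading '+'
   on a format 3 operation is assumed to mean format 4. */
int instr_length(const char *oper, int ot_format)
{
   assert(oper != NULL);
   if (ot_format == 3 && oper[0] == '+')
      return 4;
   return ot_format;
}
```

With these helpers, byte_length("X'ABC'") yields 2 (three hex digits
still need two bytes), byte_length("C'EOF'") yields 3, and
instr_length("+JSUB", 3) yields 4.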
/********************************************************************/
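
The optbl handling described in the lab (sort the table after loading
it, then look operations up with binary search) can be sketched as
follows. Treat this as a hedged illustration rather than the expected
implementation: the routine names ot_sort, ot_search, and ot_compare
are hypothetical, the standard library qsort is used only because the
lab allows any sorting technique, and skipping a leading + before the
lookup is an assumption based on the Sample Output, where +JSUB
reports the Op Index of JSUB.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define OP_LENGTH 6   /* maximum number of characters in operation */

/* an entry in Operation Table (same layout as in the lab) */
struct ot_entry
{
   char ot_mnemonic[OP_LENGTH + 1];
   int  ot_format;
};

/* comparison routine for qsort: orders entries by mnemonic */
static int ot_compare(const void *a, const void *b)
{
   const struct ot_entry *x = a;
   const struct ot_entry *y = b;

   return strcmp(x->ot_mnemonic, y->ot_mnemonic);
}

/* Hypothetical helper: sort the first count entries of tbl. */
void ot_sort(struct ot_entry tbl[], int count)
{
   qsort(tbl, (size_t)count, sizeof tbl[0], ot_compare);
}

/* Hypothetical helper: binary search of a sorted table.
   Returns the index of oper, or -1 if it is not in the table. */
int ot_search(const struct ot_entry tbl[], int count, const char *oper)
{
   int low = 0, high = count - 1;

   assert(oper != NULL);
   if (oper[0] == '+')      /* assumption: + is not part of the name */
      ++oper;

   while (low <= high)
   {
      int mid = low + (high - low) / 2;
      int cmp = strcmp(oper, tbl[mid].ot_mnemonic);

      if (cmp == 0)
         return mid;
      if (cmp < 0)
         high = mid - 1;    /* continue in the lower half */
      else
         low = mid + 1;     /* continue in the upper half */
   }
   return -1;
}
```

A result of -1 corresponds to the Invalid Operation case in pass1;
any other result is the value shown under Op Index in the Sample
Output.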