COP 3402 Spring 2002 Lab #1 (20 pts - out of a total of 400 points for the course) Due: Tuesday, February 5 Notes: - You may work in groups of one or two students. If you work in a group of two, turn in only one copy of the lab with both names on it, i.e., don't turn in two copies of the same program. - The program (in whatever condition) must be turned in (in class) on the day it is due. You'll get partial credit if the program is not completely done. You'll get NO credit if you turn in the program late. - If you use this file as the "skeleton" for your program, make sure you delete the extra stuff. Also, some of my explanations are too detailed to make them easier to follow. You shouldn't keep such long explanations as comments; comments don't need to be that long. This program will implement some table handling routines since table manipulation is needed for all system software. The information will be stored and searched for using hashing due to the efficiency of this technique. It may be helpful to look at Sample Input/Output as you read the lab description. /********************************************************************/ #include FILE *in_fptr, /* Input File Pointer; points to the file containing the input to the program */ *out_fptr, /* Output File Pointer; points to the file to contain all the output of the program */ *fopen(); #define TRUE 1 #define FALSE 0 #define NAME_LENGTH 10 /* maximum number of characters in name */ #define COM_LENGTH 3 /* number of characters in a command */ #define BKT_SIZE 19 /* number of entries in bucket directory */ #define DIG_FACTOR 2 /* digit factor used in hashing */ #define LET_FACTOR 3 /* letter factor used in hashing */ /* an entry (node) in hash table */ struct ht_entry { char ht_name[NAME_LENGTH + 1]; int ref_count; /* number of references to the name */ struct ht_entry *next; }; struct ht_entry *hash_tbl[BKT_SIZE]; /* bucket directory (hash index) */ /********************************************************************/ main() { char command[COM_LENGTH + 1], /* the command in the input */ inp_name[NAME_LENGTH + 1]; /* the name in the input */ void init(), tbl_flush(); struct ht_entry *search(); The main program first performs the initializations. It then reads the input (from the input file) into command and inp_name (it reads one pair at a time). For each input, it writes the input information to the output file and it then checks to see if the input command is DEF or USE. If DEF command, it checks to see if the input name already exists. If the name does not exist, the name is stored in the appropriate hash table entry (linked list); the new name is inserted at the beginning of the linked list and not at the end. If the name already exists, an error message is printed. If USE command, it checks to see if the input name is in the hash table. If it is, the reference count for the name is incremented by one. If the input name is not in the hash table, an error message is printed. After processing all the input, the main program will write the information in the hash table to the output file. }/* end of main */ /********************************************************************/ void init() { This routine is used by the main routine. It initializes hash_tbl. It opens the two files (sets in_fptr and out_fptr). }/* end of init */ /********************************************************************/ struct ht_entry *search(name_ptr, ind_ptr) char *name_ptr; /* name to be searched */ int *ind_ptr; /* hash index (index into the hash table) */ { int hash(); This routine is used by the main routine. It checks to see if a name already exists in the hash table. If it does, this routine gives the hash index entry and a pointer to the node (in the corresponding linked list) containing the name. The first value (hash index) is given to main by setting *ind_ptr. The second value (pointer to node) is returned using the return facility. If the name is not in the hash table, this routine sets *ind_ptr (to index into hash table for the name) to indicate where the name should go, and it returns NULL to indicate it did not find the name. Note that this routine does not store the name in the hash table if the name is not already there. The reason for this is that a "search" routine should not, in general, perform the insertion since not finding an entry may mean an error rather than a new value. }/* end of search */ /********************************************************************/ int hash(name_ptr) char *name_ptr; /* name to be hashed */ { int digit_count, /* number of unique digits in the name */ letter_count; /* number of unique letters in the name */ This routine is used by the search routine. It hashes a name, i.e., it returns the hash index for a name. Hashing Algorithm: Find number of unique digits and number of unique letters (upper case and lower case) in the name; these values are stored in digit_count and letter_count, respectively. For example, if the name is AB5a_A75_C8J, digit_count will be 3 (digits 5, 7 and 8) and letter_count will be 5 (letters A, B, a, C and J). To get the hash value for the name, compute: (digit_count * DIG_FACTOR + letter_count * LET_FACTOR) mod BKT_SIZE So, the hash value for AB5a_A75_C8J is: (3 * 2 + 5 * 3) % 19 = 21 % 19 = 2. }/* end of hash */ /********************************************************************/ void tbl_flush() { This routine is used by the main routine after all the input have been processed. It writes, to the output file, each hash index and the associated names (names in the linked list for the hash index). The routine also writes the reference count for each name. Note that if there are more than one name associated with a hash index, the index is printed only once. The routine writes only the hash indexes that have some nodes associated with them, i.e., if a hash index does not point to any node, that index will not be written. Also, the reference count for a name is printed only if the name has been referenced at least once, i.e., the count is greater than zero. }/* end of tbl_flush */ /********************************************************************/ Sample Input: 123456789012345678901234567890 -- this line won't be in the file DEF GONE DEF KMN127 DEF EGG DEF VAN DEF AKK DEF EGG USE VAN USE AKK USE JEEP USE AKK DEF FUN DEF ALI USE KMN127 Sample Output: 123456789012345678901234567890 -- this line won't be in the file Input Data Command Name DEF GONE DEF KMN127 DEF EGG DEF VAN DEF AKK DEF EGG *** Duplicate Name *** USE VAN USE AKK USE JEEP *** Undefined Name *** USE AKK DEF FUN DEF ALI USE KMN127 1234567890123456789012345678901234567 -- this line won't be in the file Hash Table Hash Index Name Reference Count 6 AKK 2 EGG 9 ALI FUN VAN 1 12 GONE 15 KMN127 1 /********************************************************************/ What To Turn In --------------- - A disk containing source code, executable code, and other relevant files (if any). Write the language and platform on the disk to make it easier for us to compile and run your program, i.e., we shouldn't have to spend time trying to figure out what you used so that we use the same. - A hard copy of source code. - One week before the lab is due, we will put the input on the course web site. Download the input file and run your program with this input to generate an output file. Turn in a hard copy of this output file as well. Put all of these in an envelope; they will be returned to you in the envelope. Please use a large envelope so that you don't have to fold your printouts. /********************************************************************/ - Make your program easy to read, i.e., no cryptic programming. - Comment your program properly three types of comments - comments explaining what each routine does - comments explaining different segments of each routine - comments describing each variable - Indent your program properly. - Use meaningful names. - You can use more routines if you want to. - Use constants (#define), i.e., don't hard code numbers in your program. - The input data will follow the exact format specified in Sample Input, and your output must follow the exact format specified in Sample Output. - Several do's and don't's will be discussed in class. /********************************************************************/