Example Code

This page provides some samples of C code shown in class.

Connection Between Unix and C

Return to top

Const Language (demo of flex and bison)

The "const language" is a very simple demonstration of the use of flex and bison to create lexical analyzers and parsers.

As can be seen in the const.y grammar file, there a program consists of a single constant definition (like name = 32).

The Makefile has rules that can be used to build lexer and run it on various tests, which are the .const files in the directory.

The flex input file for the lexical analyzer is given in the file const_lexer.l.

Some other files used to build the lexer are also present, such as the ast module (files ast.h and ast.c), the lexer module (files lexer.h and lexer.c) and the utilities module (files utilities.c and utilities.c), and some header files (machine_types.h and parser_types.h).

Return to top

Const Language with Multiple Definitions

This is a version of the const language from the previous section, that was developed in class on October 20, 2023 as a demonstration of how to work with lists in a grammar and with the ASTs.

The "const multiple language" is a simple demonstration of the use of flex and bison to create lexical analyzers and parsers.

As can be seen in the const.y grammar file, there a program consists of zero or more constant definitions (which each look like name = 32).

The Makefile has rules that can be used to build lexer and run it on various tests, which are the .const files in the directory.

The flex input file for the lexical analyzer is given in the file const_lexer.l.

Some other files used to build the lexer are also present, such as the ast module (files ast.h and ast.c), the lexer module (files lexer.h and lexer.c) and the utilities module (files utilities.c and utilities.c), and some header files (machine_types.h and parser_types.h).

Return to top

Simplified RISC Machine Assembler and Disassembler

The code for the assembler and disassembler for the Simplified RISC Machine (SRM) is accessible from this section and in a subdirectory named srm-asm.

The Makefile for the assembler and disassembler has targets (asm and disasm) for building these programs.

Assembler

The asm_main.c file holds the main program for the assembler.

The lexical analyzer for the assembler is defined in the file asm_lexer.l, which is processed by the Flex lexical analyzer generator to produce the file asm_lexer.c. Some parts of lexical analysis are defined in the lexer module (files lexer.c and lexer.h).

The asm.y describes the assembler's grammar and is used with the Bison parser generator to generate the files asm.tab.c asm.tab.h that do the actual parsing. The parser_types.h file connects the parser to the definitions of the abstract syntax trees (ASTs).

The abstract syntax trees produced by the parser are defined by the ast module (files ast.c and ast.h). The parser_types.h file connects the parser to the definitions of the ASTs.

The asm's unparser, which prints ASTs, is are defined by the asm_unparser module (files asm_unparser.c and asm_unparser.h).

The file locations used by the parser and lexer are defined in the file_location module (files file_location.c and file_location.h).

The pass1 module builds the symbol table for the assembler (files pass1.c and pass1.h).

The symtab module builds the symbol table for the assembler (files symtab.c and symtab.h). The id_attrs.h file defines the attributes used in the symbol table.

The main work of the assembler happens in the assemble module (files assemble.c and assemble.h).

The instruction module (files instruction.c and instruction.h) defines the types for machine instructions and functions that work with them. Some of these functions, such as instruction_assembly_form, that returns a string with the human-readable input of a binary format instruction.

The human-readable names of the SRM's registers are defined in the regname module (files regname.c and regname.h).

The output of the assembler is a binary object file, which has a format defined by in the bof module (files bof.c and bof.h).

The machine_types module (files machine_types.h and machine_types.c) has some basic type definitions used in other modules, and some functions related to the machine definition.

The utilities module (files utilities.c and utilities.h) has some functions used by other modules, such as the bail_with_error function.

Disassembler

The disassembler (disasm) performs the opposite task of the assembler. It uses many of the same files used by the assembler, above.

The main program for the disassembler is in the file disasm_main.c.

The main work of the disassembler happens in the disasm module (files disasm.c and disasm.h).

Important modules for the disassembler are the instruction module (files instruction.c and instruction.h) and the bof module (files bof.c and bof.h).

Tiny Virtual Machine (from Spring 2023)

The tiny machine is a Von Neumann VM that is very simple. The ISA for this machine is also available. The implementation uses the following files:

Return to top

Stack (from Fall 2023)

The stack module shown here is a byte-addressed stack that grows downwards. It is an abstract data type in C, which hides some state by making it static.

Return to top

Old Stack (from Spring 2023)

The stack module shown here has support for activation records (ARs) and static scoping; it is an abstract data type in C, which hides some state by making it static.

Trees: a motivating example for recursion

The trees example shows how to compute the depth of a binary tree both recursively and non-recursively, so that one can see how much simpler the recursive solution is. In fact, the non-recursive solution is more difficult, and seems to be easiest to express using two queues.

Strings in C

The testre.c function shows that the standard C library's strcmp() function does not do regular expression matching.

Return to top

Modular C

The decldef.h file shows several declarations that are not definitions in C; there are corresponding definitions in the file decldef.c file.

The non-modular program sayHelloProgram.c is broken into 3 modules in:

There is a simple Makefile available that can separately compile these modules and link together a complete program.

The ADT for string sets is a module composed of:

Return to top

Building ASTs Manually

The directory building_ASTs_manually shows two ways to build abstract syntax trees (ASTs) for PL/0, using the ast module from homework 3.

See the file demo.c for how to build ASTs for some simple statements. There is also a header in demo.h and a main program in demo_main.c, which uses the input file write_x.pl0 in one technique.

The directory also contains several other files needed for the demonstration, including a Makefile.

Return to top

FLOAT Calculator Compiler

The directory FLOAT calculator is an example compiler for a very simple calculator language called FLOAT. It shows how to do lexical analysis, parsing, construction of ASTs, declaration checking, and code generation. A Makefile that builds the compiler and can run tests is included.

The main function (in the file compiler_main.c):

  1. checks the arguments and options
  2. If the -l option was used, then it initializes the lexer, which is specified in the flex input file float_lexer.l. (See also the lexer module, files lexer.h, lexer.c.) Then the lexer prints the tokens from the argument file on stdout and exits. This can be helpful in debugging the lexical analysis phase.
  3. Otherwise it calls the parser, which is specified in the bison input file float.y. There is also a parser module (files parser.h, parser.c). (Note that the parseProgram function that is called also initializes the lexer.)
  4. the parser returns an AST for the program, which is defined in the ast module (files ast.h and ast.c),
  5. if the -u option was used, then the compiler prints a textual version of the AST on standard output by calling the unparser from the unparser module (files unparser.h and unparser.c),
  6. then the compiler initializes the symbol table (in the module symtab, which consists of the files symtab.h and symtab.c, and the module scope, which consists of the files scope.h and scope.c), and then the scope_check module (files scope_check.h and scope_check.c) performs declaration checking; that is the program checks for duplicate declarations of identifiers and undeclared identifiers.
  7. The modules id_attrs (files id_attrs.h and id_attrs.c) and the module id_use (files id_use.h and id_use.h) are used in declaration checking, and the AST is "decorated" with (pointers to) id_use structures to record information about the lexical address and attibutes of each identifier used.
  8. If the -u option was used, then the compiler exits. Otherwise the proceeds to code generation, by initializing the gen_code module (files gen_code.h and gen_code.c), opening a BOF file (see module bof in files bof.h and bof.c) and writing the binary object code into it.
The generated binary object file can be run using the floating-point SRM.

Error messages are handled in the utilities module (see utilities.h and utilities.c). and the lexer_utilities module (see files lexer_utilities.h and lexer_utilities.c).

The type AST of abstract syntax trees for the FLOAT language is declared in the ast module (files ast.h and ast.c). This uses the file location type from the file_location module (files file_location.h and file_location.c).

Return to top

Floating-point Simplified RISC Machine

The virtual machine (VM) for FLOAT is simplified from the MIPS processor. It is found in the vm subdirectory of the float calculator directory. The Floating Point Simplified RISC Machine Manual describes this VM in detail.

The implementation can be built using its Makefile, and uses the following files:

An assembler and disassembler for this VM are also included in this directory. See the Floating-point Simplified RISC Machine Assembler Manual for details. The implemention is very similar to that of the SRM's assembler and disassembler described above.

Return to top

Last modified .