Research Papers

Research paper projects involve reading and understanding the contents of assigned papers, including all major concepts discussed. Each member of the group will find and document a prior research paper serving as the basis for this work. In addition, each member of the group will find and document one research paper using the current one as the basis (e.g., extending presented work).

The following matters need to be addressed:

  • Clear statement of a problem in question
  • Significance of the material presented
  • Novelty of ideas discussed as of publication date
  • Known and hidden limitations (e.g., discussed by authors; brought up by a referring paper, etc)
  • Ideas on how the solution could be extended (including your personal ideas)
  • Possible research directions
  • Research statistics (e.g., how many papers use this as a reference; temporal citation summary; number of papers published by authors on the topic)

The final presentation in class will be similar to infosessions - clear, conscise summary of the papers, taking 25 minutes including Q&A session.

Paper 1. M. Bridges et al, "Revisiting the Sequential Programming Model for Multi-Core" [Download]

Paper 2. J. Cantin et al, "Improving Multiprocessor Performance with Coarse-Grain Coherence Tracking" [Download]

Paper 3. S. McFarling, "Combining Branch Predictors" [Download]

Paper 4. D. A. Patterson, G. Gibson, and R. H. Katz, "A Case for Redundant Arrays of Inexpensive Disks (RAID)", Sigmod Record (ACM Special Interest Group on Management of Data), vol.17, no.3, pp. 109-116, Sept. 1988.

Paper 5. D. M. Tullsen, S. J. Eggers, J. S. Emer, H. M. Levy, J. L. Lo, and R. L. Stamm, "Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneously Multithreading Processor", Proceedings of the 23rd Annual International Symposium on Computer Architecture, pp. 191-202, May 1996.

Paper 6. J.-K. Peir, W. W. Hsu, and A. J. Smith , "Functional Implementation Techniques for CPU Cache Memories," IEEE Transactions on Computers, Vol. 48, No. 2, pp. 100-110, February 1999.

Paper 7. Monica S. Lam and Robert P. Wilson. Limits of Control Flow on Parallelism, Proc. 19th Annual International Symposium on Computer Architecture , May 1992.

Paper 8. A. Moshovos, S. E. Breach, T. N. Vijaykumar and G. S. Sohi. Dynamic Speculation and Synchronization of Data Dependences, Proc. 24th Annual International Symposium on Computer Architecture , June 1997.

Paper 9. M. Schlansker and B. R. Rau. EPIC: An Architecture for Instruction-Level Parallel Processors Technical Report HPL-1999-111 , Hewlett-Packard Laboratories Palo Alto, California February 2000.

Paper 10. Robert S. Chappell, Jared Stark, Sangwook P. Kim, Steven K. Reinhardt and Yale N. Patt. Simultaneous Subordinate Microthreading (SSMT), In Proc. 26nd Annual International Symposium on Computer Architecture , May 1999.

Paper 11. Amir Roth, Andreas Moshovos, and Gurindar S. Sohi Dependence Based Prefetching for Linked Data Structures, In 8th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), Oct 1998.

Paper 12. Tse-Yu Yeh, D. Marr, and Yale N. Patt. Increasing the Instruction Fetch Rate Via Multiple Branch Prediction and Branch Address Cache. In Proceedings of the 1993 ACM International Conference on Supercomputing, pages 51-61, July 1993.

Note: For papers where download is not available, use links at right (IEEE Xplore, ACM portal via UCF library online database access to get an electronic copy of the paper.

Programming Projects

Important Notes: Each project may be implemented in a programming language of your choice, but has to be approved by instructor. There will be no help on the programming details from the instructor or the TA.

While your code may utilize any subset of MIPS instructions, it should support at a minimum integer add, subtract, multiply, divide, load, store, immediate load (and branches, where required by the project). You may ignore any structural hazards (i.e., assume that any hardware unit is available at any time - ALU, register file ports, etc). The only hazards to consider are the data hazards. Any unrecognized instruction or exception (e.g., divide by zero) should halt program execution and output verbose error message.

Any number of teams may pick the same project, but must work independently. No plagiarism on programming projects will be tolerated (including Internet-derived code libraries). This includes instruction-parsing code as well. The grading will be per group, and no comparison to other groups will be made (if the same project is chosen by multiple groups). The source code, methodology and final write-up with example runs are to be submitted by each team. There will be time allocated towards the end of semester for in-class project presentations.

Hints: Assume the input contains one instruction per line of text file. Instructions may be parsed easily by using regular expressions or string comparison/tokenizing operations. You may use a fixed array for integer register contents. Except loop unrolling project, assume the code contains no memory references. For loop unrolling, the loop control variable may be loaded with a known integer value.

Project 1. Design and implement an optimizing compiler (static scheduling). Accept a text file with a subset of MIPS assembly instructions as an input. Schedule the code to minimize the stalls due to data hazards, while preserving data dependencies. You may limit your compiler to process integer instructions only, and make your own realistic assumptions on the latency of each instruction. Output produced should be the optimized code, printed out on the screen, together with the new execution time in clock cycles.

Project 2. Design and implement a loop unrolling compiler. Accept a text file with a subset of MIPS assembly instructions as an input (including Branch instruction). If a value of the loop variable is known (e.g., N), unroll the loop N times, renaming registers as required. Make sure you do not use any more registers than available (32 the in integer case). Make your own realistic assumptions on the latency of each instruction.

Project 3. Implement a step-through MIPS assembly debugger. Given a text file with a subset of the MIPS assembly instructions, display in a graphical interface the contents of all registers, current activity (including stalling), remaining clocks based on current instruction's latency and instruction queue at each clock cycle of execution, as the user steps through the code execution in single clock cycle increments. You may limit your compiler to process integer instructions only, and make your own realistic assumptions on the latency of each instruction.

Project 4. Implement a MIPS integer pipeline visualizer. Given a text file with a subset of the MIPS assembly instructions, generate a pipeline trace of program execution, showing each pipeline stage (IF, ID, EX, ME - abbreviated MEM, WB, or stall - S_) as a delimited output (e.g., use '|' character to separate stages). Show stalls arising from data dependencies. You may make certain assumptions (e.g., on instruction latency, subset of instructions used, etc). For instance, initially exclude branches, but if time allows, you may add the branch component later. If EX stage takes more than one clock cycle per your assumptions for a particular instruction type, you should show it as multiple EX stages within the pipeline.