CS 572 Micro Architecture

Laboratory Assignment 2: Single vs. Multi-cycle Machines

Due 3 November 2017 (Friday) by 11:59 pm

Requirements

  1. Each student should independently accomplish this laboratory assignment. You are allowed to discuss with your friends to solve the coding problems.

  2. You must submit your partially assembled code, the source code of your simulator (see Part 1 and Part 2), a README file, and the values of C, IC, and speed-up (see Part 3).

  3. Once you submit this assignment by email, please immediately schedule a demo session with the Grader during which you will walk him through your simulator in operation.

Grading Criteria

  1. Partially assembled code palindrome.s: 10%
  2. A simulator for multi-cycle machines: 40%
  3. Adhering to coding style: 10%
  4. README file: 10%
  5. Values obtained for C, IC, and speed-up: 15%.
  6. Demonstration: 15%

 0. Introduction                                                                                                                              

In Lab 1, you created two simulators - one  for a stack-based machine and one for an accumulator-based machine. In this lab assignment you will be extending your accumulator-based machine into a General Purpose Register (GPR) machine that runs different instructions in different numbers of cycles (i.e. a multi-cycle machine). You must perform each part as described, answer all related questions, submit your programs and schedule a demo session with the Grader during which you will walk him through your simulator in operation.

 Part 1. More Registers                                                                                                                 

The starting point is your accumulator-based machine. It has only one register which is the accumulator, and that is referenced implicitly in all instructions. Now you need to add 32 general-purpose registers, which can be named $0 through $31. The more complex part is that you will need to parse the instruction stream to find out which registers are being used in each instruction. Let's assume that the only place in the instruction stream that a "$" will appear is right before a register number. 

 Part 2. More Instructions                                                                                                             

Your accumulator-based machine implemented LOAD, STO, ADD, MULT and END. We will have to change the instruction set a bit, to start to line it up with MIPS I. To run the example code we will be working with, you need ADDI, B, BEQZ, BGE, BNE, LA, LB, LI, SUBI, and SYSCALL. We add a table to the simulator which specifies the number of cycles required by each instruction, in each "functional unit" of the machine:

Instruction

Instruction Memory

Register file
- read

ALU

Data Memory

Register file
- write

Total

ADDI

2

1

2


1

6

B

2


2



4

BEQZ, BGE, BNE

2

1

2



5

LA

2


2


1

5

LB

2

1

2


1

6

LI

2




1

3

SUBI

2

1

2


1

6

SYSCALL

2

1

2

2

1

8


It will probably make Lab 3 simpler if you start thinking in terms of five separate functional units right now, and keep track of the time required by each, even if you don't step each instruction through the five functional units in sequence. Strictly speaking, though, for this lab all you need is the "Total" time for each instruction type.

Detailed syntax:

ADDI  Rdest, Rsrc1, Imm
B  label
BEQZ  Rsrc1, label
BGE  Rsrc1, Rsrc2, label
BNE  Rsrc1, Rsrc2, label
LA  Rdest, label
LB  Rdest, offset(Rsrc1)
LI  Rdest, Imm
SUBI  Rdest, Rsrc1, Imm
SYSCALL

 Part 3. Performance Comparison                                                                                                        

Now you will have to evaluate how long it takes to execute an example program on your multi-cycle GPR-based machine, and compare that execution time to the time it would have taken on a "single-cycle" machine. Of course, the key word in all this is "time". Your multi-cycle machine will take more cycles to execute the program, but each cycle would be some fraction of the cycle time on the "matching" single-cycle machine. Why? Well, because the single-cycle machine must be clocked at the rate of the slowest (longest-running) instruction. In this example, SYSCALL takes 8 cycles on the multi-cycle implementation. As such, if the clock period (cycle time) on the multi-cycle machine is 1 ns, the cycle time on the single-cycle machine must be 8 ns so that SYSCALL will be able to complete in a single cycle.

Your task now is to take the example program written in MIPS assembler and convert it into a source file for your simulator. The instructions should stay the same, but you must lay out the code and data in memory and calculate the proper values for the offset and label values (noting carefully that label values will change depending on the address of the instruction where they're used !). You might want to think of this as a "partial assembly", where you are relocating the code and data to reside at chosen locations in memory.

As in Lab 1, your simulator will just be a big case statement inside a loop that reads instructions one by one and simulates the action(s) of each instruction. Of course, before dropping into the main loop you have to "load" your source code into memory and perform any necessary initializations of the data segment. Next you need to initialize the program counter, and start up the main loop.

This simulator requires what we call "instrumentation". As you execute the code, you need to keep track of the total number of instructions executed (IC), and the total number of cycles spent in execution (C). The ratio [8*IC] / C is the speed-up of the multi-cycle implementation. Report your values for IC, C, and [8*IC] / C.

We've added branches to the instruction set, so thorough testing of both your simulator and your "partially assembled" code will be very important. The debugger is your friend - get to know and love it.

Recap: Simulator outline, with instrumentation for C and IC

Open source code file
Load source code and data into memory (using all-powerful initialization routine)
Initialize PC
Initialize IC = 0
Initialize C = 0
user_mode = true;
while ( user_mode ) {
read memory at PC
increment PC
increment IC
case ADDI:    do something; add cycles for ADDI to C; break;
case another_instruction_type:    do something; add cycles for this_type to C; break;
etc.
}
Print summary statistics (including C and IC !) and selected memory locations
You will have to demo your simulator to the Grader and convince him that it is behaving correctly (as well as producing the correct final result). You could use the debugger to do a very convincing demo.

 Lab Assignment 2 Submission                                                                                                      

E-mail the four files to chinmayk93@gmail.com and txie@mail.sdsu.edu with the subject line of "CS 572 Lab Assignment 2". In your Email, please indicate your name and your student ID. Note that this assignment counts 10% towards your course grade.