CS 572 Micro Architecture

Written Assignment 1

Due Oct. 4 in class

1. 50 points: (a) 10 points, (b) 10 points, (c) 10 points, (d) 10 points, (e) 10 points

Your company has a benchmark that is considered
representative of your typical applications. An embedded processor under consideration
to support your task does not have a floating-point unit and must emulate
each floating-point instruction by a sequence of integer instructions. This processor
is rated at 120 MIPS on the benchmark. A third-party vendor offers a compatible
coprocessor to boost performance. That coprocessor executes each floating-point
instruction in hardware (i.e., no emulation is necessary). The processor/coprocessor
combination rates 80 MIPS on the same benchmark. The following
symbols are used to answer parts (a)-(e) of this exercise:
I—Number of integer instructions executed on the benchmark.
F—Number of floating-point instructions executed on the benchmark.
Y—Number of integer instructions to emulate one floating-point instruction.
W—Time to execute the benchmark on the processor alone.
B—Time to execute the benchmark on the processor/coprocessor combination.
a. [10 points] Write an equation for the MIPS rating of each configuration
using the symbols above.

b. [10 points] For the configuration without the coprocessor, we measure that
F = 8 × 10^6, Y = 50, and W = 4 seconds. Find I.
c. [10 points] What is the value of B?
d. [10 points] What is the MFLOPS rating of the system with the coprocessor?
Note: one MFLOPS (megaFLOPS) equals one million floating-point operations per second.
e. [10 points] Your colleague wants to purchase the coprocessor even though
the MIPS rating for the configuration using the coprocessor is less than that of
the processor alone. Is your colleague’s evaluation correct? Defend your
answer.
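
The rate definitions used throughout this problem can be sketched in a few lines of Python. This is only an illustration, assuming the standard definition of MIPS (instructions executed, divided by execution time in seconds, divided by 10^6) and the MFLOPS definition given in part (d). The function names and the sample numbers are hypothetical and are not the assignment's values.

```python
def mips(instruction_count, seconds):
    """Millions of instructions executed per second."""
    return instruction_count / (seconds * 1e6)


def mflops(fp_operation_count, seconds):
    """Millions of floating-point operations per second."""
    return fp_operation_count / (seconds * 1e6)


# Hypothetical numbers, chosen only to exercise the formulas.
print(mips(250e6, 2.0))    # 125.0
print(mflops(30e6, 2.0))   # 15.0
```

The instruction count passed to mips is whatever a given configuration actually executes, so it can differ between two configurations running the same benchmark.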

2. 50 points: (a) 30 points, (b) 20 points

Several researchers have suggested that adding a register-memory addressing
mode to a load-store computer might be useful. The idea is to replace sequences of
LOAD R1,0(Rb)
ADD R2,R2,R1

by
ADD R2,0(Rb)


Assume the new instruction will cause the clock cycle time to increase by 5%. Use the instruction
frequencies for the gcc benchmark on the load-store computer from Figure 2.32. The
new instruction affects only the clock cycle time and not the CPI.
a. [30 points] What percentage of the loads must be eliminated for the computer with the
new instruction to have at least the same performance?
b. [20 points] Show a situation in a multiple-instruction sequence where a load of R1 followed
immediately by a use of R1 (with some type of opcode) could not be replaced
by a single instruction of the form proposed, assuming that the same opcode exists.
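
The break-even condition in part (a) can be sketched as follows. This is a sketch under the problem's stated assumptions (CPI unchanged, cycle time 5% longer) plus one added assumption: each replaced LOAD/ADD pair removes exactly one instruction from the dynamic instruction count. CPU time is taken as instruction count × CPI × cycle time. The function name and the load_fraction value below are hypothetical placeholders; the actual load frequency must be read from Figure 2.32.

```python
def min_fraction_of_loads_removed(load_fraction, cycle_increase=0.05):
    """Smallest fraction of loads that must be eliminated to break even.

    Assumes CPU time = instruction count * CPI * cycle time, CPI unchanged,
    and one instruction removed per eliminated load.
    """
    # Break-even requires: new_count * (1 + cycle_increase) <= old_count,
    # so this fraction of all instructions has to disappear.
    removed_instruction_fraction = 1.0 - 1.0 / (1.0 + cycle_increase)
    # Express that as a fraction of the loads alone.
    return removed_instruction_fraction / load_fraction


# Placeholder load frequency, NOT the Figure 2.32 value.
print(round(min_fraction_of_loads_removed(0.20), 4))  # 0.2381
```

A result above 1 would mean that break-even is impossible under these assumptions, since more loads would have to be removed than exist.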