Evaluation of Computer
performance
Dr. Prasenjit Dey
Performance
 The performance of a computer is defined by the speed at which it
processes instructions
 The faster the machine, the better the performance
 What factors influence the performance of a computer?
Performance Metrics
 Response Time
 The total time an instruction takes to complete
 Measured from the start time of the instruction to its finish time
 It is also known as elapsed time
 Response time = memory access time + waiting time + CPU
time
 Throughput
 The number of instructions completed per unit of time
Execution Time
 CPU time
 The time during which an instruction uses the CPU
 The duration for which an instruction is in the running state
 It is also known as execution time
 The time spent executing the lines of code in a program
 CPU time = user CPU time + system CPU time
 Elapsed time = user CPU time + system CPU time + waiting time
 Computer performance is measured on the basis of CPU time
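The difference between CPU time and elapsed time can be observed directly. The following is a minimal Python sketch, assuming the standard `os.times()` and `time.perf_counter()` calls; the `measure` helper and the `busy_then_wait` workload are hypothetical names chosen only for illustration.

```python
import os
import time


def measure(work):
    """Run work() and return (user_cpu, system_cpu, elapsed) in seconds."""
    cpu_start = os.times()
    wall_start = time.perf_counter()
    work()
    wall_end = time.perf_counter()
    cpu_end = os.times()
    return (cpu_end.user - cpu_start.user,
            cpu_end.system - cpu_start.system,
            wall_end - wall_start)


def busy_then_wait():
    sum(i * i for i in range(200_000))  # computing: counts as CPU time
    time.sleep(0.05)                    # waiting: counts only as elapsed time


user, system, elapsed = measure(busy_then_wait)
cpu_time = user + system  # CPU time = user CPU time + system CPU time
print(f"CPU time = {cpu_time:.3f} s, elapsed time = {elapsed:.3f} s")
```

The sleep adds to the elapsed time but not to the CPU time, which is why the two numbers differ.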
Comparison of Performance
 The performance of a machine is inversely proportional to
the CPU time
 Performance = 1/CPU time
 If machine A is n times faster than machine B, then
 Performance(A)/Performance(B) = CPU time(B)/CPU time(A) = n
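The comparison rule can be sketched in a few lines of Python. The function names and the CPU times of 10 s and 15 s are made-up values for illustration, not figures from the slides.

```python
def performance(cpu_time_s):
    """Performance is the reciprocal of CPU time."""
    return 1.0 / cpu_time_s


def times_faster(cpu_time_a_s, cpu_time_b_s):
    """Return n such that machine A is n times faster than machine B."""
    return performance(cpu_time_a_s) / performance(cpu_time_b_s)


# Hypothetical CPU times: A needs 10 s, B needs 15 s for the same program.
n = times_faster(10.0, 15.0)
print(f"A is {n:.2f} times faster than B")
```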
Clock Cycles
 CPU time is measured in CPU clock cycles
 Each instruction takes a certain number of CPU cycles to execute
 The time taken by an instruction is
 (number of CPU cycles) x (time to execute 1 CPU cycle)
 If all instructions take an equal number of cycles, then
 The time required to execute a program of N instructions is
 N x (number of CPU cycles per instruction) x (time to execute 1 CPU cycle)
 In general, program execution time is
 (cycles/program) x (seconds/cycle)
 seconds/cycle is the cycle time
 cycles/second is the clock rate (frequency), which is used more often
Performance of a Program
 CPU execution time of a program is
 (No. of CPU cycles) x (cycle time)
 (No. of CPU cycles)/(clock rate)
 An efficient program requires fewer CPU cycles; a higher clock
rate also shortens its execution time
 Clock rate is measured in hertz: 1 Hz = 1 cycle/sec
 If the clock rate is 200 MHz, what is the cycle time?
 Cycle time = $1/(200 \times 10^{6}) = 5 \times 10^{-3} \times 10^{-6} = 5 \times 10^{-9}$ sec (5 ns)
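As a quick check of the 200 MHz figure, the sketch below computes the cycle time as the reciprocal of the clock rate and then applies (No. of CPU cycles)/(clock rate). The function names and the 1,000,000-cycle program are assumptions made for illustration.

```python
def cycle_time_s(clock_rate_hz):
    """Cycle time (seconds/cycle) is the reciprocal of the clock rate."""
    return 1.0 / clock_rate_hz


def cpu_time_s(cpu_cycles, clock_rate_hz):
    """CPU execution time = (number of CPU cycles) / (clock rate)."""
    return cpu_cycles / clock_rate_hz


CLOCK_RATE_HZ = 200e6  # 200 MHz, as in the slide

print(f"cycle time = {cycle_time_s(CLOCK_RATE_HZ) * 1e9:.1f} ns")  # 5.0 ns

# Hypothetical program that needs 1,000,000 cycles (not from the slides).
print(f"program time = {cpu_time_s(1_000_000, CLOCK_RATE_HZ) * 1e3:.1f} ms")  # 5.0 ms
```

Note that 5 ns is the duration of one cycle; the execution time of a program also depends on how many cycles it needs.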
Performance of a Program
 In a computer, different instructions use different numbers of
CPU cycles
 Bitwise instructions require fewer CPU cycles, whereas
multiplication and floating-point instructions require more
 To compute the execution time of a program, one should
compute the average number of CPU cycles per instruction
CPI: Cycles Per Instruction
 For a given program
 Count the number of instructions
 Compute the total number of cycles needed to execute all instructions
 Divide the total number of cycles by the number of instructions
 Cycles per instruction (CPI)
 The average number of clock cycles required to execute an instruction in a
program
 A related rate metric is MIPS (millions of instructions per second) = clock rate / (CPI x $10^{6}$)
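The steps above can be sketched as follows. CPI is a count of cycles, while MIPS is a rate that follows from it as clock rate / (CPI x 10^6). The function names and the numbers (1000 instructions, 2500 cycles, 500 MHz) are hypothetical.

```python
def cpi(total_cycles, instruction_count):
    """Average cycles per instruction for one program."""
    return total_cycles / instruction_count


def mips(clock_rate_hz, cpi_value):
    """Millions of instructions per second for a given clock rate and CPI."""
    return clock_rate_hz / (cpi_value * 1e6)


# Hypothetical program: 1000 instructions that take 2500 cycles in total,
# running on a 500 MHz processor.
program_cpi = cpi(2500, 1000)
program_mips = mips(500e6, program_cpi)
print(f"CPI = {program_cpi:.2f}, MIPS = {program_mips:.1f}")  # CPI = 2.50, MIPS = 200.0
```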
CPU Execution Time
 CPU execution time of a program is
 $N \times \dfrac{\sum_{i=1}^{N} (\text{cycles for instruction } i)}{N} \times \text{cycle time}$
 Here N is the total number of instructions in the program
 = Average CPI x instruction count x cycle time
 = (Average CPI x instruction count) / clock rate
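The two forms of the formula give the same result, which a short sketch can confirm. The per-instruction cycle counts and the 1 GHz clock rate below are invented for illustration, as are the function names.

```python
def cpu_time_from_cycles(cycles_per_instruction, cycle_time_s):
    """Sum the cycles of every instruction, then multiply by the cycle time."""
    return sum(cycles_per_instruction) * cycle_time_s


def cpu_time_from_cpi(avg_cpi, instruction_count, clock_rate_hz):
    """(Average CPI x instruction count) / clock rate."""
    return avg_cpi * instruction_count / clock_rate_hz


# Hypothetical program of N = 6 instructions with these cycle counts.
cycles = [1, 4, 2, 3, 1, 4]
clock_rate_hz = 1e9  # assumed 1 GHz clock

n = len(cycles)
avg_cpi = sum(cycles) / n
t_sum = cpu_time_from_cycles(cycles, 1.0 / clock_rate_hz)
t_cpi = cpu_time_from_cpi(avg_cpi, n, clock_rate_hz)
print(f"avg CPI = {avg_cpi}, time = {t_sum * 1e9:.1f} ns = {t_cpi * 1e9:.1f} ns")
```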
Problem 1
 Consider two machines, machine A and machine B, which
execute the same program, where
 Machine A clock cycle is 10ns and CPI is 2
 Machine B clock cycle is 20ns and CPI is 1.2
 Which one has better performance?
 Computation for machine A
 Avg. CPI x clock cycle time = 2 x 10 = 20ns
 Computation for machine B
 Avg. CPI x clock cycle time = 1.2 x 20 = 24ns
 Machine A has better performance for this program
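The comparison in Problem 1 can be reproduced with a small sketch. Because both machines run the same program, the instruction count cancels and only CPI x cycle time matters. The helper name and the dictionary layout are illustrative choices.

```python
def avg_time_per_instruction_ns(cpi, cycle_time_ns):
    """Average time per instruction = average CPI x clock cycle time."""
    return cpi * cycle_time_ns


# (CPI, clock cycle time in ns) for each machine, taken from the problem.
machines = {"A": (2.0, 10.0), "B": (1.2, 20.0)}

times_ns = {name: avg_time_per_instruction_ns(cpi, cycle)
            for name, (cpi, cycle) in machines.items()}
faster = min(times_ns, key=times_ns.get)

for name, t in times_ns.items():
    print(f"machine {name}: {t:.1f} ns per instruction")
print(f"machine {faster} has better performance")
```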
Problem 2
 Consider two programs that contain three different
types of instructions: type A, type B, and type C
 Type A, type B, and type C instructions require 4, 3, and 5 cycles
respectively
 Program 1 uses 1 type A instruction, 2 type B
instructions, and 2 type C instructions
 Program 2 uses 1 type A instruction, 4 type B
instructions, and 1 type C instruction
 Which program is faster?
 Computation for program 1
 1*4 + 2*3 + 2*5 = 20 cycles
 Computation for program 2
 1*4 + 4*3 + 1*5 = 21 cycles
 Program 1 is faster
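The cycle counts in Problem 2 are weighted sums, which the following sketch computes from the instruction mix of each program. The names `CYCLES` and `total_cycles` are illustrative.

```python
# Cycles required by each instruction type, from the problem statement.
CYCLES = {"A": 4, "B": 3, "C": 5}


def total_cycles(instruction_counts):
    """Total cycles = sum over types of (count x cycles per instruction)."""
    return sum(CYCLES[kind] * count for kind, count in instruction_counts.items())


program_1 = {"A": 1, "B": 2, "C": 2}
program_2 = {"A": 1, "B": 4, "C": 1}

cycles_1 = total_cycles(program_1)
cycles_2 = total_cycles(program_2)
print(cycles_1, cycles_2)  # 20 21
print("program 1 is faster" if cycles_1 < cycles_2 else "program 2 is faster")
```

On the same machine the cycle time is identical for both programs, so the program with fewer cycles finishes first.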
Problem 3
 Suppose your program consists of 2500 instructions. The
proportions of the different kinds of instructions in the program are as
follows: data transfer instructions 50%, arithmetic instructions 30%,
and branch instructions 20%. These types of instructions
take 2, 5, and 10 cycles respectively. What is
the execution time of your program on a 4 GHz processor?
 Avg. CPI = 0.5*2 + 0.3*5 + 0.2*10 = 4.5
 Execution time = $2500 \times 4.5 / (4 \times 10^{9})$ sec
= $2500 \times 1.125 \times 10^{-9}$ sec
= $2.8125 \times 10^{-6}$ sec = 2.8125 µs
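Problem 3 can be checked numerically. The sketch below takes 4 GHz as 4 x 10^9 cycles per second, which puts the execution time in the microsecond range. Variable and function names are illustrative.

```python
# (fraction of instructions, cycles per instruction), from the problem.
MIX = {
    "data transfer": (0.50, 2),
    "arithmetic": (0.30, 5),
    "branch": (0.20, 10),
}
INSTRUCTION_COUNT = 2500
CLOCK_RATE_HZ = 4e9  # 4 GHz = 4 x 10^9 cycles per second


def average_cpi(mix):
    """Weighted average of the cycles per instruction type."""
    return sum(fraction * cycles for fraction, cycles in mix.values())


avg_cpi = average_cpi(MIX)
execution_time_s = INSTRUCTION_COUNT * avg_cpi / CLOCK_RATE_HZ
print(f"avg CPI = {avg_cpi:.2f}")
print(f"execution time = {execution_time_s * 1e6:.4f} microseconds")
```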
Amdahl's Law
 It computes the overall performance enhancement when the
performance of a fraction of the code (program) is enhanced
 Overall performance enhancement
 Overall speedup = old execution time / new execution time
 $\text{Overall speedup} = \dfrac{1}{(1 - \text{fraction\_enhanced}) + \dfrac{\text{fraction\_enhanced}}{\text{speedup}}}$
 Fraction_enhanced
 The part of the code/program that has been enhanced by
some hardware or compiler improvement, measured as a share of the old execution time
 Speedup
 The performance gain in the enhanced fraction of the code
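Amdahl's law translates directly into a small function. This is a minimal sketch; the function name, the argument checks, and the sample values (40% of the time enhanced, sped up 2 times) are assumptions for illustration.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup = 1 / ((1 - f) + f / s).

    fraction_enhanced: share of the old execution time that is enhanced (0..1)
    speedup_enhanced:  speedup of that enhanced part alone (> 0)
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must lie between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Hypothetical case: 40% of the execution time is made 2 times faster.
print(f"{amdahl_speedup(0.4, 2.0):.2f}")  # 1.25

# Even a huge speedup of that part cannot exceed 1 / (1 - 0.4).
print(f"{amdahl_speedup(0.4, 1e9):.3f}")  # 1.667
```

The second call shows the well-known limit of the law: the part that is not enhanced bounds the overall gain.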
Amdahl's Law: example
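 Let a program contain 10 multiplication instructions and 10
addition instructions. Each multiplication instruction takes 50 ns and
each addition instruction takes 10 ns.
 By adding some hardware, we enhance the performance of
multiplication instructions so that each one completes
in 20 ns.
 What will be the overall speedup?
 Old execution time = 50*10 + 10*10 = 600 ns
 New execution time = 20*10 + 10*10 = 300 ns
 Fraction_enhanced = 500/600 = 5/6, speedup = 500/200 = 2.5
 Overall speedup
 $\dfrac{1}{(1 - \text{fraction\_enhanced}) + \dfrac{\text{fraction\_enhanced}}{\text{speedup}}} = \dfrac{1}{\left(1 - \frac{500}{600}\right) + \dfrac{500/600}{500/200}} = \dfrac{1}{\frac{1}{6} + \frac{1}{3}} = \dfrac{1}{\frac{3}{6}} = \dfrac{6}{3} = 2$
 This agrees with old execution time / new execution time = 600/300 = 2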
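The example can be verified in two ways: by dividing the old execution time by the new one, and by plugging the enhanced fraction (500/600) and its speedup (50/20) into Amdahl's formula. The variable names below are illustrative.

```python
MUL_COUNT, ADD_COUNT = 10, 10
MUL_OLD_NS, MUL_NEW_NS, ADD_NS = 50, 20, 10

old_time_ns = MUL_COUNT * MUL_OLD_NS + ADD_COUNT * ADD_NS  # 600
new_time_ns = MUL_COUNT * MUL_NEW_NS + ADD_COUNT * ADD_NS  # 300

# Direct definition: old execution time / new execution time.
direct_speedup = old_time_ns / new_time_ns

# Amdahl's law with the enhanced fraction and its local speedup.
fraction_enhanced = (MUL_COUNT * MUL_OLD_NS) / old_time_ns  # 500/600
speedup_enhanced = MUL_OLD_NS / MUL_NEW_NS                  # 2.5
amdahl = 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

print(f"direct = {direct_speedup:.2f}, Amdahl = {amdahl:.2f}")  # direct = 2.00, Amdahl = 2.00
```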
Problems on Amdahl's Law
 Using Amdahl's law, determine which of the given possible
improvements is the best one
 Possible improvements
A. Decrease branch CPI from 4 to 3
B. Increase clock frequency from 2 GHz to 2.3 GHz
C. Decrease store CPI from 3 to 2
Instruction type   Frequency   CPI
ALU                40%         1
Branch             20%         4
Load               30%         2
Store              10%         3
Solution
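 Avg. CPI = (0.4*1 + 0.2*4 + 0.3*2 + 0.1*3) = 2.1
 Clock rate = 2 GHz
 Avg. instruction execution time = Avg. CPI / clock rate = $2.1 / (2 \times 10^{9})$ =
$1.05 \times 10^{-9}$ sec
 Case A
 New execution time = $(0.4 \times 1 + 0.2 \times 3 + 0.3 \times 2 + 0.1 \times 3) / (2 \times 10^{9}) = 0.95 \times 10^{-9}$ sec
 Case B
 Increase clock frequency from 2 GHz to 2.3 GHz
 New execution time = $(0.4 \times 1 + 0.2 \times 4 + 0.3 \times 2 + 0.1 \times 3) / (2.3 \times 10^{9}) \approx 0.91 \times 10^{-9}$ sec
 Case C
 New execution time = $(0.4 \times 1 + 0.2 \times 4 + 0.3 \times 2 + 0.1 \times 2) / (2 \times 10^{9}) = 1 \times 10^{-9}$ sec
 Option B gives the lowest average instruction execution time (speedup = 2.3/2 = 1.15), so it is the best improvement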
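The three cases can be compared programmatically. The sketch below expresses the clock rate in GHz so that cycles divided by GHz gives nanoseconds per instruction. The data layout and helper names are assumptions for illustration.

```python
# Instruction type: (frequency, CPI), from the table in the problem.
BASE_MIX = {"ALU": (0.40, 1), "Branch": (0.20, 4), "Load": (0.30, 2), "Store": (0.10, 3)}
BASE_CLOCK_GHZ = 2.0


def avg_cpi(mix):
    """Frequency-weighted average cycles per instruction."""
    return sum(freq * cpi for freq, cpi in mix.values())


def time_per_instruction_ns(mix, clock_ghz):
    """Average CPI divided by the clock rate in GHz gives nanoseconds."""
    return avg_cpi(mix) / clock_ghz


def with_cpi(mix, kind, new_cpi):
    """Copy of the mix with the CPI of one instruction type replaced."""
    changed = dict(mix)
    changed[kind] = (mix[kind][0], new_cpi)
    return changed


options = {
    "A": (with_cpi(BASE_MIX, "Branch", 3), BASE_CLOCK_GHZ),
    "B": (BASE_MIX, 2.3),
    "C": (with_cpi(BASE_MIX, "Store", 2), BASE_CLOCK_GHZ),
}

baseline_ns = time_per_instruction_ns(BASE_MIX, BASE_CLOCK_GHZ)
times_ns = {name: time_per_instruction_ns(mix, ghz) for name, (mix, ghz) in options.items()}
best = min(times_ns, key=times_ns.get)

for name, t in times_ns.items():
    print(f"option {name}: {t:.3f} ns, speedup {baseline_ns / t:.3f}")
print(f"best option: {best}")
```

Under these numbers the clock frequency increase (option B) yields the shortest average instruction time.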
Conclusion
 The performance of a machine is measured in terms of
clock cycles
 Different instructions require different numbers of clock cycles, so we need
to compute the average cycles per instruction (CPI)
 An overall performance gain can be achieved by
enhancing a fraction of a program
Thank you
