CSC 203 1.5
Computer System Architecture
Budditha Hettige
Department of Statistics and Computer Science
University of Sri Jayewardenepura

Performance of Computers

Improving Performance of Computers
• Increasing clock speed
– Physical limitations (needs new hardware)
• Parallelism (doing more things at once)
– Instruction-level parallelism
• Getting more instructions executed per second
– Processor-level parallelism
• Having multiple CPUs work on the same problem

Instruction-level parallelism
• Pipelining
– Instruction execution speed is limited by the time taken to fetch instructions from memory
– Early computers fetched instructions in advance and stored them in registers (the prefetch buffer)
• Prefetching divides instruction execution into two parts
– Fetching
– Actual execution
– Pipelining divides instruction execution into many parts, each handled by a dedicated piece of hardware, all of which can run in parallel

Pipelining example
• Packaging cakes
– W1: Place an empty box on the belt every 10 seconds
– W2: Place the cake in the empty box
– W3: Close and seal the box
– W4: Label the box
– W5: Remove the box and place it in the large container

Computer Pipelines
• S1: Fetch the instruction from memory and place it in a buffer until it is needed
• S2: Decode the instruction; determine its type and the operands it needs
• S3: Locate and fetch the operands from memory (or registers)
• S4: Execute the instruction
• S5: Write the result back to a register

Example
T = cycle time
N = number of stages in the pipeline
Latency:
Time taken to execute one instruction = N × T
Processor bandwidth:
Number of MIPS the CPU has = 1000 / T MIPS (with T in nanoseconds)
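
As a rough illustration of the two formulas above, the following Python sketch computes latency and bandwidth for one pipeline. The function names and the five-stage, 2 ns figures are illustrative assumptions, not values from the slides; T is taken in nanoseconds so that 1000 / T comes out in MIPS.

```python
def pipeline_latency_ns(cycle_time_ns: float, stages: int) -> float:
    """Latency: time for one instruction to pass through all N stages (N x T)."""
    return stages * cycle_time_ns


def pipeline_bandwidth_mips(cycle_time_ns: float) -> float:
    """Bandwidth: one instruction completes per cycle, so 1000 / T MIPS (T in ns)."""
    return 1000.0 / cycle_time_ns


# Hypothetical pipeline: 5 stages with a 2 ns cycle time.
latency = pipeline_latency_ns(2.0, 5)
bandwidth = pipeline_bandwidth_mips(2.0)
print(f"latency = {latency} ns, bandwidth = {bandwidth} MIPS")
```

Note that a deeper pipeline raises latency for a fixed cycle time, while bandwidth depends only on the cycle time.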

Processor - pipeline depth

Dual pipelines
• The instruction fetch unit fetches a pair of instructions and puts each one into its own pipeline
• The Pentium has two five-stage pipelines
– U pipeline (main): can execute an arbitrary Pentium instruction
– V pipeline (second): can execute only simple integer instructions and one simple floating-point instruction
• If the instructions in a pair conflict, only the instruction in the U pipeline is executed; the other instruction is held and paired with the next instruction

Superscalar architecture
• Single pipeline with multiple functional units

Processor-level parallelism
• High bus traffic
• Low bus traffic

Measuring Performance

Moore’s law
• Describes a long-term trend in the history of computing hardware
• Formulated by Dr. Gordon Moore in the 1960s
• Predicts an exponential increase in component density over time, with a doubling time of 18 months
• Applicable to microprocessors, DRAMs, DSPs and other microelectronics

Moore's Law and Performance
• The performance of computers is determined
by architecture and clock speed.
• Clock speed doubles over a 3 year period due
to the scaling laws on chip.
• Processors using identical or similar
architectures gain performance directly as a
function of Moore's Law.
• Improvements in internal architecture can
yield better gains than predicted by Moore's
Law.

Measuring Performance
• Execution time:
– Time between the start and completion of a task (including disk accesses and memory accesses)
• Throughput:
– Total amount of work done in a given time

Performance of a Computer
For two computers X and Y:
Performance (X) > Performance (Y)
means that
Execution Time (Y) > Execution Time (X)

Performance difference between two computers
X is n times faster than Y when
Execution Time (Y) / Execution Time (X) = n
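
A minimal sketch of this comparison, assuming performance is defined as the reciprocal of execution time, so that n = Performance(X) / Performance(Y) = Execution Time(Y) / Execution Time(X). The function name and the timings are hypothetical.

```python
def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    With performance = 1 / execution time,
    n = Performance(X) / Performance(Y) = ExecTime(Y) / ExecTime(X).
    """
    return exec_time_y / exec_time_x


# Hypothetical timings: X finishes the task in 10 s, Y in 15 s.
n = times_faster(10.0, 15.0)
print(f"X is {n:.1f} times faster than Y")
```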

CPU Time
• Time CPU spends on a task
• User CPU time
– CPU time spent in the program
• System CPU time
– CPU time spent in OS performing tasks on behalf
of the program

CPU Time (Example)
• User CPU time = 90.7 s
• System CPU time = 12.9 s
• Execution time = 2 min 39 s = 159 s
• % of CPU time = (User CPU time + System CPU time) / Execution time × 100%

CPU Time
% CPU time = (90.7 + 12.9) / 159 × 100%
≈ 65%
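
The same calculation as a short Python sketch, using the figures from the example; the function name is an assumption made for illustration.

```python
def cpu_time_percentage(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Share of the elapsed (wall-clock) time during which the CPU worked on the task."""
    return (user_s + system_s) / elapsed_s * 100.0


# Values from the example: 90.7 s user, 12.9 s system, 159 s elapsed.
pct = cpu_time_percentage(90.7, 12.9, 159.0)
print(f"CPU time: {pct:.0f}% of execution time")
```

The remaining share of the elapsed time is spent waiting, for example on I/O or on other programs.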

Clock Rate
• The computer clock runs at a constant rate and determines when events take place in the hardware
Clock rate = 1 / Clock cycle time

Amdahl’s law
• The performance improvement that can be gained from some faster mode of execution is limited by the fraction of the time the faster mode can be used

Amdahl’s law
• Speedup depends on
– The fraction of the computation time on the original machine that can be converted to take advantage of the enhancement (Fraction Enhanced)
– The improvement gained by the enhanced execution mode (Speedup Enhanced)

Example
Total execution time of a program = 50 s
Execution time that can be enhanced = 30 s
Fraction Enhanced = 30 / 50 = 0.6

Speedup

Example
Normal-mode execution time for some portion of a program = 6 s
Enhanced-mode execution time for the same portion = 2 s
Speedup Enhanced = 6 / 2 = 3

Execution Time
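
In its usual textbook form, Amdahl's law relates the new execution time and the overall speedup to Fraction Enhanced and Speedup Enhanced as follows. The abbreviations F and S are introduced here for brevity and are not taken from the slides.

```latex
% F = Fraction Enhanced, S = Speedup Enhanced
\[
\text{Execution time}_{\text{new}}
  = \text{Execution time}_{\text{old}} \times \left( (1 - F) + \frac{F}{S} \right)
\]
\[
\text{Speedup}_{\text{overall}}
  = \frac{\text{Execution time}_{\text{old}}}{\text{Execution time}_{\text{new}}}
  = \frac{1}{(1 - F) + \dfrac{F}{S}}
\]
```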

Example
• Suppose we consider an enhancement to the processor of a server system used for Web serving. The new CPU is 10 times faster on computation in the Web-serving application than the original CPU. Assume the original CPU is busy with computation 40% of the time and is waiting for I/O 60% of the time.
What is the overall speedup gained from the enhancement?

Answer
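
One way to work the Web-serving example is sketched below, assuming the standard form of Amdahl's law, Speedup overall = 1 / ((1 - Fraction Enhanced) + Fraction Enhanced / Speedup Enhanced). Only computation is accelerated, so Fraction Enhanced is taken as 0.4 and Speedup Enhanced as 10; the function name is hypothetical.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is accelerated."""
    unaffected = 1.0 - fraction_enhanced
    return 1.0 / (unaffected + fraction_enhanced / speedup_enhanced)


# Computation is 40% of the time and becomes 10 times faster;
# the 60% spent waiting for I/O is unchanged.
overall = amdahl_speedup(0.4, 10.0)
print(f"Overall speedup = {overall:.2f}")
```

Under these assumptions the overall speedup is 1 / 0.64, or about 1.56, far below the factor of 10 achieved on the computation alone.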

Remark
• If an enhancement is only usable for a fraction of a task, we cannot speed up the task by more than 1 / (1 - Fraction Enhanced)

Example
• A common transformation required in graphics engines is square root. Implementations of floating-point (FP) square root vary significantly in performance, especially among processors designed for graphics
• Suppose FP square root (FPSQR) is responsible for 20% of the execution time of a critical graphics program
• Design alternatives
1. Enhance the FPSQR hardware and speed up this operation by a factor of 10
2. Make all FP instructions run faster by a factor of 1.6

Example
• FP instructions are responsible for a total of 50% of the execution time. The design team believes they can make all FP instructions run 1.6 times faster with the same effort as required for the fast square root.
Compare these two design alternatives
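
The comparison can be sketched with Amdahl's law in its standard form. The fractions and speedup factors come from the example; the function and variable names are assumptions made for illustration.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is accelerated."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Alternative 1: FPSQR is 20% of execution time and becomes 10 times faster.
speedup_fpsqr = amdahl_speedup(0.2, 10.0)

# Alternative 2: all FP instructions are 50% of execution time and become 1.6 times faster.
speedup_fp = amdahl_speedup(0.5, 1.6)

print(f"FPSQR hardware: {speedup_fpsqr:.2f}")
print(f"All FP faster:  {speedup_fp:.2f}")
```

With these figures the second alternative comes out slightly ahead (about 1.23 against about 1.22), because the smaller improvement applies to a much larger fraction of the execution time.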

CPU performance equation
CPU time = CPU clock cycles for a program × Clock cycle time
= CPU clock cycles for a program / Clock rate

Example
A program runs in 10 s on computer A, which has a 400 MHz clock. A new machine B, which could run the same program in 6 s, has to be designed. Further, B will require 1.2 times as many clock cycles as A for this program.
What should the clock rate of B be?

Answer
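
A possible working of this example, using CPU time = CPU clock cycles / Clock rate from the CPU performance equation. The variable names are assumptions made for illustration.

```python
# Machine A: 10 s at 400 MHz.
time_a_s = 10.0
clock_rate_a_hz = 400e6
cycles_a = time_a_s * clock_rate_a_hz      # CPU clock cycles = CPU time x clock rate

# Machine B: 1.2 times as many cycles, and it must finish in 6 s.
cycles_b = 1.2 * cycles_a
time_b_s = 6.0
clock_rate_b_hz = cycles_b / time_b_s      # Clock rate = CPU clock cycles / CPU time

print(f"Clock rate of B = {clock_rate_b_hz / 1e6:.0f} MHz")
```

Under these figures B needs a clock rate of about 800 MHz, twice that of A.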

CPU Clock Cycles
CPI (clock cycles per instruction):
average number of clock cycles each instruction takes to execute
IC (instruction count):
number of instructions executed in the program
CPU clock cycles = CPI × IC
Note: CPI can be used to compare two different implementations of the same instruction set architecture (because the IC required for a program is the same)

Example
• Consider two implementations of the same instruction set architecture. For a certain program, the time measurements of the two machines are given below
• Which machine is faster for this program, and by how much?

Answer

Measuring components of CPU performance equation
• CPU time: by running the program
• Clock cycle time: published in documentation
• IC: by software tools or a simulator of the architecture (more difficult to obtain)
• CPI: by simulation of an implementation (more difficult to obtain)

CPU clock cycles
Suppose there are n different types of instruction
Let
ICi – number of times instruction i is executed in a program
CPIi – average number of clock cycles for instruction i
Then CPU clock cycles = sum over i = 1..n of (CPIi × ICi)
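
A small sketch of how the per-type counts combine, assuming total CPU clock cycles are the sum of CPIi × ICi over all instruction types and the overall CPI is that total divided by the total instruction count. The instruction mix below is hypothetical.

```python
def cpu_clock_cycles(mix):
    """Total cycles for a list of (ic_i, cpi_i) pairs: sum of CPI_i x IC_i."""
    return sum(ic * cpi for ic, cpi in mix)


def average_cpi(mix):
    """Overall CPI: total clock cycles divided by total instruction count."""
    total_ic = sum(ic for ic, _ in mix)
    return cpu_clock_cycles(mix) / total_ic


# Hypothetical mix: (instruction count, cycles per instruction) for three types.
mix = [(50, 1.0), (30, 2.0), (20, 3.0)]
print(cpu_clock_cycles(mix), average_cpi(mix))
```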

Example
Suppose we have made the following measurements:
– Frequency of FP operations (other than FPSQR) = 25%
– Average CPI of FP operations = 4.0
– Average CPI of other instructions = 1.33
– Frequency of FPSQR = 2%
– CPI of FPSQR = 20
Design alternatives:
1. Decrease the CPI of FPSQR to 2
2. Decrease the average CPI of all FP operations to 2.5
Compare these two design alternatives using the CPU performance equation

Answers
• Note that only the CPI changes; the clock rate and IC remain identical
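
Because clock rate and IC are unchanged, the speedup of each alternative is simply the ratio of the original CPI to the new CPI. The sketch below follows the usual textbook reading of this example, in which the 25% FP frequency and the average FP CPI of 4.0 already cover FPSQR; that reading is an assumption, and the variable names are illustrative.

```python
# Measurements from the example.
FP_FREQ = 0.25        # assumed here to include FPSQR
FP_CPI = 4.0
OTHER_CPI = 1.33
FPSQR_FREQ = 0.02
FPSQR_CPI = 20.0
OTHER_FREQ = 1.0 - FP_FREQ

# Original average CPI: weighted sum over FP and non-FP instructions.
cpi_original = FP_FREQ * FP_CPI + OTHER_FREQ * OTHER_CPI

# Alternative 1: FPSQR drops from 20 cycles to 2, saving 18 cycles on 2% of instructions.
cpi_new_fpsqr = cpi_original - FPSQR_FREQ * (FPSQR_CPI - 2.0)

# Alternative 2: the average CPI of all FP operations drops to 2.5.
cpi_new_fp = OTHER_FREQ * OTHER_CPI + FP_FREQ * 2.5

speedup_fpsqr = cpi_original / cpi_new_fpsqr
speedup_fp = cpi_original / cpi_new_fp
print(f"FPSQR alternative: CPI {cpi_new_fpsqr:.2f}, speedup {speedup_fpsqr:.2f}")
print(f"FP alternative:    CPI {cpi_new_fp:.2f}, speedup {speedup_fp:.2f}")
```

Under this reading, improving all FP operations gives the lower CPI and therefore the slightly better speedup, roughly 1.23 against 1.22.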

MIPS as a performance measure

Problems
MIPS as a performance measure
• MIPS is dependent on the instruction set
– It is difficult to compare the MIPS of computers with different instruction sets
• MIPS can vary inversely to performance
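
The last point can be illustrated with a small sketch, assuming the common definition MIPS = Instruction count / (Execution time × 10^6). The two program versions, their instruction counts and CPIs, and the 1 GHz clock are hypothetical values chosen for illustration.

```python
def exec_time_s(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = IC x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def mips(instruction_count: float, time_s: float) -> float:
    """Millions of instructions executed per second."""
    return instruction_count / (time_s * 1e6)


CLOCK_HZ = 1e9  # hypothetical 1 GHz machine

# Version 1: fewer, more complex instructions. Version 2: many simple ones.
time_1 = exec_time_s(2e9, 2.0, CLOCK_HZ)
time_2 = exec_time_s(6e9, 1.0, CLOCK_HZ)
mips_1 = mips(2e9, time_1)
mips_2 = mips(6e9, time_2)

print(f"Version 1: {time_1:.1f} s, {mips_1:.0f} MIPS")
print(f"Version 2: {time_2:.1f} s, {mips_2:.0f} MIPS")
```

Here version 2 reports twice the MIPS rating yet takes longer to finish, so the higher rating goes with the lower performance.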

MFLOPS as a performance measure

Problems
MFLOPS as a performance measure
• MFLOPS is not dependable
– The Cray C90 has no divide instruction, while the Pentium does
• MFLOPS depends on the mixture of fast and slow floating-point operations
– add (fast) and divide (slow) operations
CSC 203 1.5 Computer System Architecture Performance

  • 1. CSC 203 1.5 Computer System Architecture Budditha Hettige Department of Statistics and Computer Science University of Sri Jayewardenepura
  • 2. Performance of Computers
  • 3. Improving Performance of Computers • Increasing clock speed – Physical limitation (needs new hardware) • Parallelism (doing more things at once) – Instruction-level parallelism • Getting more instructions per second – Processor-level parallelism • Having multiple CPUs working on the same problem
  • 4. Instruction-level parallelism • Pipelining – Instruction execution speed is affected by the time taken to fetch instructions from memory – Early computers fetched instructions in advance and stored them in registers (prefetch buffer) • Prefetching divides instruction execution into two parts – Fetching – Actual execution – Pipelining divides instruction execution into many parts; each part is handled by dedicated hardware, and all parts can run in parallel
  • 5. Pipelining example • Packaging cakes – W1: Place an empty box on the belt every 10 seconds – W2: Place the cake in the empty box – W3: Close and seal the box – W4: Label the box – W5: Remove the box and place it in the large container
  • 6. Computer Pipelines • S1: Fetch the instruction from memory and place it in a buffer until it is needed • S2: Decode the instruction; determine its type and the operands it needs • S3: Locate and fetch the operands from memory (or registers) • S4: Execute the instruction • S5: Write the result back to a register
  • 7. Example • T - Cycle time • N - Number of stages in the pipeline • Latency: time taken to execute an instruction = N × T • Processor bandwidth: number of MIPS the CPU has = 1000 / T MIPS (with T in nanoseconds)
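The latency and bandwidth formulas on slide 7 can be sketched in a few lines of Python. The function names and the sample values (a five-stage pipeline with a 2 ns cycle time) are illustrative assumptions, not figures from the slides.

```python
def pipeline_latency_ns(stages: int, cycle_time_ns: float) -> float:
    """Latency of one instruction: N stages, each taking one cycle of T ns."""
    return stages * cycle_time_ns


def bandwidth_mips(cycle_time_ns: float) -> float:
    """Processor bandwidth: one instruction completes per cycle, so 1000 / T MIPS."""
    return 1000.0 / cycle_time_ns


# Hypothetical pipeline: N = 5 stages, T = 2 ns (assumed values).
latency = pipeline_latency_ns(5, 2.0)
bandwidth = bandwidth_mips(2.0)
print(f"latency = {latency} ns, bandwidth = {bandwidth} MIPS")
```

Under these assumed values, each instruction still takes N × T to pass through the pipeline, but a new instruction finishes every cycle, which is why bandwidth depends only on T.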
  • 8. Processor - pipeline depth
  • 9. Dual pipelines • The instruction fetch unit fetches a pair of instructions and puts each one into its own pipeline • The Pentium has two five-stage pipelines – U pipeline (main) executes arbitrary Pentium instructions – V pipeline (second) executes only simple integer instructions and one simple floating-point instruction • If the instructions in a pair conflict, only the instruction in the U pipeline is executed. The other instruction is held and paired with the next instruction
  • 10. Superscalar architecture • Single pipeline with multiple functional units
  • 11. Processor level parallelism • High bus traffic • Low bus traffic
  • 13. Moore's law • Describes a long-term trend in the history of computing hardware • Defined by Dr. Gordon Moore during the sixties • Predicts an exponential increase in component density over time, with a doubling time of 18 months • Applicable to microprocessors, DRAMs, DSPs and other microelectronics
  • 15. Moore's Law and Performance • The performance of computers is determined by architecture and clock speed • Clock speed doubles over a 3-year period due to the scaling laws on chip • Processors using identical or similar architectures gain performance directly as a function of Moore's Law • Improvements in internal architecture can yield better gains than predicted by Moore's Law
  • 17. Measuring Performance • Execution time: – Time between the start and completion of a task (including disk accesses and memory accesses) • Throughput: – Total amount of work done in a given time
  • 18. Performance of a Computer • For two computers X and Y: Performance(X) > Performance(Y) implies Execution Time(Y) > Execution Time(X)
  • 19. Performance difference of 2 computers • X is n times faster than Y when Execution Time(Y) / Execution Time(X) = n
  • 20. CPU Time • Time the CPU spends on a task • User CPU time – CPU time spent in the program • System CPU time – CPU time spent in the OS performing tasks on behalf of the program
  • 21. CPU Time (Example) • User CPU time = 90.7 s • System CPU time = 12.9 s • Execution time = 2 min 39 s = 159 s • % of CPU time = (User CPU time + System CPU time) / Execution time × 100%
  • 22. CPU Time • % CPU time = (90.7 + 12.9) / 159 × 100% ≈ 65%
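The percentage worked out on slides 21 and 22 can be checked with a short Python sketch. The function name is an assumption chosen for illustration; the three timings are the ones given on the slides.

```python
def cpu_time_percent(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Share of elapsed (wall-clock) time that the CPU spent on the task."""
    return (user_s + system_s) / elapsed_s * 100.0


# Values from the slides: 90.7 s user, 12.9 s system, 2 min 39 s elapsed.
elapsed = 2 * 60 + 39
percent = cpu_time_percent(90.7, 12.9, elapsed)
print(f"CPU time share: {percent:.0f}%")
```

The remaining share of the elapsed time is time the program spent waiting, for example on I/O or on other programs using the CPU.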
  • 23. Clock Rate • The computer clock runs at a constant rate and determines when events take place in the hardware • Clock rate = 1 / Clock cycle time
  • 24. Amdahl's law • The performance improvement that can be gained from a faster mode of execution is limited by the fraction of the time the faster mode can be used
  • 25. Amdahl's law • Speedup depends on – The fraction of computation time in the original machine that can be converted to take advantage of the enhancement (Fraction Enhanced) – The improvement gained by the enhanced execution mode (Speedup Enhanced)
  • 26. Example • Total execution time of a program = 50 s • Execution time that can be enhanced = 30 s • Fraction Enhanced = 30 / 50 = 0.6
  • 28. Example • Normal mode execution time for some portion of a program = 6 s • Enhanced mode execution time for the same portion = 2 s • Speedup Enhanced = 6 / 2 = 3
  • 30. Example • Suppose we consider an enhancement to the processor of a server system used for Web serving. The new CPU is 10 times faster on computation in the Web application than the original CPU. Assume the original CPU is busy with computation 40% of the time and is waiting for I/O 60% of the time. What is the overall speedup gained from the enhancement?
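The slide with the worked answer is not part of this transcript, so the following is a minimal sketch that assumes the standard form of Amdahl's law, overall speedup = 1 / ((1 − Fraction Enhanced) + Fraction Enhanced / Speedup Enhanced). The function name is an assumption; the 40% and 10× figures come from the example.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup under the standard form of Amdahl's law (assumed here)."""
    unenhanced = 1.0 - fraction_enhanced
    enhanced = fraction_enhanced / speedup_enhanced
    return 1.0 / (unenhanced + enhanced)


# Web server example: computation is 40% of the time and becomes 10x faster.
overall = amdahl_speedup(0.4, 10.0)
print(f"overall speedup = {overall:.4f}")
```

Because the 60% of time spent waiting for I/O is untouched, the overall gain stays far below the 10× improvement in computation.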
  • 32. Remark • If an enhancement is only usable for a fraction of a task, we cannot speed up the task by more than 1 / (1 − Fraction Enhanced)
  • 33. Example • A common transformation required in graphics engines is square root. Implementations of floating-point (FP) square root vary significantly in performance, especially among processors designed for graphics • Suppose FP square root (FPSQR) is responsible for 20% of the execution time of a critical graphics program • Design alternatives 1. Enhance the FPSQR hardware and speed up this operation by a factor of 10 2. Make all FP instructions run faster by a factor of 1.6
  • 34. Example • FP instructions are responsible for a total of 50% of the execution time. The design team believes they can make all FP instructions run 1.6 times faster with the same effort as required for the fast square root. Compare these two design alternatives
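The comparison can be sketched in Python, again assuming the standard form of Amdahl's law since the answer slide is not in this transcript. The function and variable names are illustrative; the fractions and factors are those stated on slides 33 and 34.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup = 1 / ((1 - F) + F / S), the standard Amdahl form."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Alternative 1: FPSQR is 20% of execution time and becomes 10x faster.
speedup_fpsqr = amdahl_speedup(0.20, 10.0)

# Alternative 2: all FP instructions are 50% of execution time and become 1.6x faster.
speedup_fp = amdahl_speedup(0.50, 1.6)

print(f"FPSQR hardware: {speedup_fpsqr:.3f}")
print(f"All FP faster:  {speedup_fp:.3f}")
```

Under this assumption, speeding up all FP instructions comes out slightly ahead, because the modest 1.6× gain applies to a much larger fraction of the execution time.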
  • 36. CPU performance equation • CPU time = CPU clock cycles for a program × Clock cycle time = CPU clock cycles for a program / Clock rate
  • 37. Example • A program runs in 10 s on computer A, which has a 400 MHz clock. A new machine B, which could run the same program in 6 s, has to be designed. Further, B will require 1.2 times as many clock cycles as A for this program. What should the clock rate of B be?
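This example follows directly from the CPU performance equation (CPU time = CPU clock cycles / Clock rate). A small Python sketch, with variable names chosen for illustration, works through the steps.

```python
# Machine A: 10 s at 400 MHz gives the cycle count for the program.
time_a_s = 10.0
clock_rate_a_hz = 400e6
cycles_a = time_a_s * clock_rate_a_hz

# Machine B needs 1.2 times as many cycles and must finish in 6 s.
cycles_b = 1.2 * cycles_a
time_b_s = 6.0
clock_rate_b_hz = cycles_b / time_b_s

print(f"clock rate of B = {clock_rate_b_hz / 1e6:.0f} MHz")
```

The result is a clock rate of about 800 MHz: B has to run twice as fast as A to absorb the extra cycles and still cut the run time from 10 s to 6 s.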
  • 39. CPU Clock Cycles • CPI (clock cycles per instruction): average number of clock cycles each instruction takes to execute • IC (instruction count): number of instructions executed in the program • CPU clock cycles = CPI × IC • Note: CPI can be used to compare two different implementations of the same instruction set architecture (as the IC required for a program is the same)
  • 40. Example • Consider two implementations of the same instruction set architecture. For a certain program, details of the time measurements of the two machines are given below • Which machine is faster for this program and by how much?
  • 42. Measuring components of CPU performance equation • CPU time: by running the program • Clock cycle time: published in documentation • IC: by software tools or a simulator of the architecture (more difficult to obtain) • CPI: by simulation of an implementation (more difficult to obtain)
  • 43. CPU clock cycles • Suppose there are n different types of instruction • Let IC_i = number of times instruction i is executed in a program • CPI_i = average number of clock cycles for instruction i • CPU clock cycles = sum over i = 1..n of (CPI_i × IC_i)
  • 44. Example • Suppose we have made the following measurements: – Frequency of FP operations (other than FPSQR) = 25% – Average CPI of FP operations = 4.0 – Average CPI of other instructions = 1.33 – Frequency of FPSQR = 2% – CPI of FPSQR = 20 • Design alternatives: 1. Decrease the CPI of FPSQR to 2 2. Decrease the average CPI of all FP operations to 2.5 • Compare these two design alternatives using the CPU performance equation
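One way to work this example is sketched below in Python. It rests on two assumptions that the transcript does not spell out: the original overall CPI is taken as the frequency-weighted average of the FP CPI (25% of instructions) and the CPI of other instructions (75%), and the FPSQR improvement is applied by subtracting its saved cycles from that overall average. Variable names are illustrative.

```python
# Measurements from the slide.
freq_fp = 0.25       # frequency of FP operations
cpi_fp = 4.0         # average CPI of FP operations
cpi_other = 1.33     # average CPI of other instructions
freq_fpsqr = 0.02    # frequency of FPSQR
cpi_fpsqr = 20.0     # CPI of FPSQR

# Assumption: overall CPI is the weighted average of FP and other instructions.
cpi_original = freq_fp * cpi_fp + (1.0 - freq_fp) * cpi_other

# Alternative 1: FPSQR CPI drops from 20 to 2 (assumed to reduce the overall average).
cpi_alt1 = cpi_original - freq_fpsqr * (cpi_fpsqr - 2.0)

# Alternative 2: average CPI of all FP operations drops to 2.5.
cpi_alt2 = freq_fp * 2.5 + (1.0 - freq_fp) * cpi_other

# With IC and clock rate unchanged, speedup is the ratio of CPIs.
speedup_alt1 = cpi_original / cpi_alt1
speedup_alt2 = cpi_original / cpi_alt2
print(f"CPI: original={cpi_original:.4f}, alt1={cpi_alt1:.4f}, alt2={cpi_alt2:.4f}")
print(f"speedup: alt1={speedup_alt1:.3f}, alt2={speedup_alt2:.3f}")
```

Under these assumptions, lowering the CPI of all FP operations gives the slightly lower overall CPI, so it is the marginally better alternative.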
  • 45. Answers • Note that only the CPI changes; the clock rate and IC remain identical
  • 46. MIPS as a performance measure
  • 47. Problems with MIPS as a performance measure • MIPS is dependent on the instruction set – It is difficult to compare the MIPS of computers with different instruction sets • MIPS can vary inversely to performance
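The claim that MIPS can vary inversely to performance can be illustrated with a small Python sketch. It assumes the usual definition MIPS = Clock rate / (CPI × 10^6), which is not shown in this transcript, and every number below is hypothetical: two versions of the same program on a 100 MHz machine, one using many simple instructions and one using fewer, slower instructions.

```python
def mips(clock_rate_hz: float, cpi: float) -> float:
    """Native MIPS rating, assuming MIPS = clock rate / (CPI * 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


def cpu_time_s(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = IC * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


clock_hz = 100e6  # hypothetical 100 MHz clock

# Hypothetical version A: 100 million simple instructions, CPI = 1.
mips_a = mips(clock_hz, 1.0)
time_a = cpu_time_s(100e6, 1.0, clock_hz)

# Hypothetical version B: 20 million complex instructions, CPI = 4.
mips_b = mips(clock_hz, 4.0)
time_b = cpu_time_s(20e6, 4.0, clock_hz)

print(f"A: {mips_a:.0f} MIPS, {time_a:.2f} s")
print(f"B: {mips_b:.0f} MIPS, {time_b:.2f} s")
```

In this made-up case version B has the lower MIPS rating yet finishes sooner, because MIPS ignores how much work each instruction does.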
  • 48. MFLOPS as a performance measure
  • 49. Problems with MFLOPS as a performance measure • MFLOPS is not dependable – The Cray C90 has no divide instruction while the Pentium does • MFLOPS depends on the mixture of fast and slow floating-point operations – Add (fast) and divide (slow) operations