SlideShare a Scribd company logo
1 of 24
American International
University-Bangladesh (AIUB)
Presenter
Nusrat Irin Chowdhury Mary
Superscalar
Architecture
 Superscalar Architecture (SSA) describes a
microprocessor design that execute more than one
instruction at a time during a single clock cycle .
 In a SSA design, the processor or the instruction compiler
is able to determine whether an instruction can be
carried out independently of other sequential
instructions, or whether it has a dependency on another
instruction and must be executed sequentially.
 The design is sometimes called “Second Generation
RISC”. Another term used to describe superscalar
processors is multiple instruction issue processors. 2
Superscalar Architecture
 In a SSA, several scalar instructions can be initiated
simultaneously and executed independently.
 A long series of innovations aimed at producing ever-
faster microprocessors.
 Includes all features of pipelining but, in addition, there
can be several instructions executing simultaneously in
the same pipeline stage.
 SSA introduces a new level of parallelism, called
instruction-level parallelism.
3
Superscalar Architecture cont’d
 In Superscalar CPU Architecture implementation of
Instruction Level Parallelism (ILP) within a single
processor allows faster CPU at a given clock rate.
 A superscalar processor executes more than one
instruction during a clock cycle by simultaneously
dispatching multiple instructions to functional units.
 Each functional unit is not a separate CPU core but an
execution resource within a single CPU such as an
arithmetic logic unit, a bit shifter, or a multiplier.
4
Superscalar CPU Architecture
 SimpleScalar is an open source computer architecture
simulator which is written using ‘C’ programming language.
 A set of tools that model a virtual computer system with
CPU, Cache and Memory Hierarchy.
 Using the tool, users can model applications that simulate
programs running on a range of modern processors and
systems.
 The tool set includes sample simulators ranging from a fast
functional simulator to a detailed.
5
SimpleScalar Architecture
 The simplest processors are scalar processors. Each
instruction executed by a scalar processor typically
manipulates one or two data items at a time.
 In a superscalar CPU the dispatcher reads instructions
from memory and decides which one can be run in
parallel.
 Therefore a superscalar processor can be proposed
having multiple parallel pipelines, each of which is
processing instructions simultaneously from a single
instruction thread.
6
Scalar to Superscalar
 Pipelining is the process of breaking down task
into substeps and executing them in different
parts of processor.
 In order to fully utilise a superscalar processor of
degree m with pipelining, m instructions must be
executable parallely. This situation may not be
true in all clock cycles.
 In a superscalar processor, the simple operation
latency should require only one cycle, as in the
base scalar processor.
7
Pipelining in Superscalar Architecture
A SSA processor fetches multiple instructions at a time,
and attempts to find nearby instructions that are
independent of each other and therefore can be executed
in parallel.
Based on the dependency analysis, the processor may
issue and execute instructions in an order that differs
from that of the original machine code.
The processor may eliminate some unnecessary
dependencies by the use of additional registers.
8
Implement Superscalar
I: Instructions from 1 to
corresponding sequences
Fetch: fetches instructions
from memory (ideally one
per cycle for Scalar)
Decode: reveals instruction
operations to be performed
and identifies the resources
needed
Execute: actual processing
of operations as indicated by
instruction
Store (Write Back): writing
results into the registers.
9
Superscalar with Scalar Instructions
Flow
10
Effect of Dependencies
 ADD r1, r2 (r1 :=
r1+r2;)
 MOVE r3,r1 (r3 :=
r1;)
 Can fetch and
decode second
instruction in
parallel with first
 Can NOT execute
second instruction
until first is
finished
11
General Superscalar Organization
12
Superscalar Operational Block
Diagram
13
Instruction Flow in Superscalar
Architecture
Superpipelining is based on dividing the stages into several sub-
stages, and thus increasing the number of instructions which
are handled by the pipeline at the same time.
For example, by dividing each stage into two sub-stages, a
pipeline can perform at twice the speed in the ideal situation:
 Tasks that require less than half a clock cycle.
 No duplication of hardware is needed for these stages.
14
Superpipelining
Figure:
Duplication of
hardware is for
Superscalar
Base machine: 4-stage pipeline
• Instruction fetch
• Operation decode
• Operation execution
• Result write back
Superpipeline of degree 2
• A sub-stage often takes half
a clock cycle to finish.
Superscalar of degree 2
• Two instructions are
executed.
• Duplication of hardware is
required by definition.
15
Superscalar vs. Superpipeline
16
Superpipelined Superscalar
Superpipeline of degree 3 and
superscalar of degree 4:
• 12 times speed-up over the base
machine.
• 48 times speedup over
sequential execution.
This is a new trend of architecture
design:
• Pentium Pro(P6): 3-degree
superscalar, 12-stage
“superpipeline”.
• PowerPC 620: 4-degree
superscalar, 4/6-stage pipeline.
A Pipeline
architecture in more
detail
17
Instruction Flow
A Superscalar
microarchitecture
18
Instruction Execution
Tasks can be divided into the following
 Parallel decoding
 Superscalar instruction issue
 Parallel instruction execution
preserving sequential consistency of exception
processing
preserving sequential consistency of execution
19
Superscalar Issues to Consider
 Parallel decoding – more complex task for scalar
processors.
 Superscalar instruction issue – A higher issue rate
gives rise to higher processor performance, but
amplifies the restrictive effects of control and data
dependencies on the processor performance.
 Parallel instruction execution task – While instructions
are executed in parallel, instructions are usually
completed out of order in respect to a sequential
operating procedure.
20
Superscalar Issues to Consider
21
Two way Superscalar Execution
Figure: Two instruction parallel execution in one clock.
 Instruction-fetch inefficiencies caused by both branch
delays and instruction misalignment
 not worthwhile to explore highly- concurrent execution
hardware, rather, it is more appropriate to explore
economical execution hardware
 degree of intrinsic parallelism in the instruction
stream (instructions requiring the same
computational resources from the CPU)
 complexity and time cost of the dispatcher and
associated dependency checking logic
 branch instruction processing.
22
Limitations of Superscalar
[1] Zebo Peng. (2010). General format, “Superscalar Architecture – IDA”, Lecture 3
Introduction to Parallel Architectures.
[2] William M. Johnson, “Super-Scalar Processor Design”, Stanford University
Since 1989
[3] James E. Smith (1995), “The Microarchitecture of Superscalar Processors”,
senior member IEEE
[4] Subbarao Palacharla, “Complexity -Effective Superscalar Processors”,
UNIVERSITY OF WISCONSIN—MADISON, Since 1998
[5] Alaa Alameldeen and Haitham Akkary, "The Microarchitecture of Superscalar
Processors", Portland State University, since 2014
[5] http://www.techterms.com/definition/superscalar
[6] https://www.google.com.bd/search?q=superscalar+archit
ecture&newwindow=1&tbm=isch&tbo=u&source=univ&sa=X&ei
=Ll1oU9umJNWHuASgwoDACA&ved=0CDUQsAQ&biw=1048&bih=921 23
Citation
Superscalar Architecture_AIUB

More Related Content

What's hot

Hardware Multi-Threading
Hardware Multi-ThreadingHardware Multi-Threading
Hardware Multi-Threadingbabuece
 
Artificial Neural Networks Lect3: Neural Network Learning rules
Artificial Neural Networks Lect3: Neural Network Learning rulesArtificial Neural Networks Lect3: Neural Network Learning rules
Artificial Neural Networks Lect3: Neural Network Learning rulesMohammed Bennamoun
 
Geographic Routing in WSN
Geographic Routing in WSNGeographic Routing in WSN
Geographic Routing in WSNMahbubur Rahman
 
Important 16 marks questions
Important 16 marks questionsImportant 16 marks questions
Important 16 marks questionsvaidheeswari
 
Digital signal processor architecture
Digital signal processor architectureDigital signal processor architecture
Digital signal processor architecturekomal mistry
 
Multivector and multiprocessor
Multivector and multiprocessorMultivector and multiprocessor
Multivector and multiprocessorKishan Panara
 
Multiple processor (ppt 2010)
Multiple processor (ppt 2010)Multiple processor (ppt 2010)
Multiple processor (ppt 2010)Arth Ramada
 
Multicore Computers
Multicore ComputersMulticore Computers
Multicore ComputersA B Shinde
 
RTOS for Embedded System Design
RTOS for Embedded System DesignRTOS for Embedded System Design
RTOS for Embedded System Designanand hd
 
Difference b/w 8085 & 8086
Difference b/w 8085 & 8086Difference b/w 8085 & 8086
Difference b/w 8085 & 8086j4jiet
 
Pipeline hazards in computer Architecture ppt
Pipeline hazards in computer Architecture pptPipeline hazards in computer Architecture ppt
Pipeline hazards in computer Architecture pptmali yogesh kumar
 
Vliw and superscaler
Vliw and superscalerVliw and superscaler
Vliw and superscalerRafi Dar
 
Task communication
Task communicationTask communication
Task communication1jayanti
 
Multiprocessor
MultiprocessorMultiprocessor
MultiprocessorA B Shinde
 
Real Time Kernels
Real Time KernelsReal Time Kernels
Real Time KernelsArnav Soni
 

What's hot (20)

Hardware Multi-Threading
Hardware Multi-ThreadingHardware Multi-Threading
Hardware Multi-Threading
 
Artificial Neural Networks Lect3: Neural Network Learning rules
Artificial Neural Networks Lect3: Neural Network Learning rulesArtificial Neural Networks Lect3: Neural Network Learning rules
Artificial Neural Networks Lect3: Neural Network Learning rules
 
ARM Architecture
ARM ArchitectureARM Architecture
ARM Architecture
 
Presentation on risc pipeline
Presentation on risc pipelinePresentation on risc pipeline
Presentation on risc pipeline
 
Geographic Routing in WSN
Geographic Routing in WSNGeographic Routing in WSN
Geographic Routing in WSN
 
Important 16 marks questions
Important 16 marks questionsImportant 16 marks questions
Important 16 marks questions
 
Parallel computing persentation
Parallel computing persentationParallel computing persentation
Parallel computing persentation
 
Digital signal processor architecture
Digital signal processor architectureDigital signal processor architecture
Digital signal processor architecture
 
Multivector and multiprocessor
Multivector and multiprocessorMultivector and multiprocessor
Multivector and multiprocessor
 
Multiple processor (ppt 2010)
Multiple processor (ppt 2010)Multiple processor (ppt 2010)
Multiple processor (ppt 2010)
 
Multicore Computers
Multicore ComputersMulticore Computers
Multicore Computers
 
RTOS for Embedded System Design
RTOS for Embedded System DesignRTOS for Embedded System Design
RTOS for Embedded System Design
 
Difference b/w 8085 & 8086
Difference b/w 8085 & 8086Difference b/w 8085 & 8086
Difference b/w 8085 & 8086
 
Pipeline hazards in computer Architecture ppt
Pipeline hazards in computer Architecture pptPipeline hazards in computer Architecture ppt
Pipeline hazards in computer Architecture ppt
 
Vliw and superscaler
Vliw and superscalerVliw and superscaler
Vliw and superscaler
 
Task communication
Task communicationTask communication
Task communication
 
Multiprocessor
MultiprocessorMultiprocessor
Multiprocessor
 
Real Time Kernels
Real Time KernelsReal Time Kernels
Real Time Kernels
 
Cache memory
Cache memoryCache memory
Cache memory
 
Parallel processing
Parallel processingParallel processing
Parallel processing
 

Viewers also liked

VLIW(Very Long Instruction Word)
VLIW(Very Long Instruction Word)VLIW(Very Long Instruction Word)
VLIW(Very Long Instruction Word)Pragnya Dash
 
Multicore Processsors
Multicore ProcesssorsMulticore Processsors
Multicore ProcesssorsAveen Meena
 
Multi core processors
Multi core processorsMulti core processors
Multi core processorsAdithya Bhat
 
Nodes and Networks for HPC computing
Nodes and Networks for HPC computingNodes and Networks for HPC computing
Nodes and Networks for HPC computingrinnocente
 
Embedded Systems Introdution
Embedded Systems IntrodutionEmbedded Systems Introdution
Embedded Systems IntrodutionSheikh Ismail
 
3D Microprocessor Design: Stacking at different granularities
3D Microprocessor Design: Stacking at different granularities3D Microprocessor Design: Stacking at different granularities
3D Microprocessor Design: Stacking at different granularitiesAlberto Villegas
 
Parallel architecture-programming
Parallel architecture-programmingParallel architecture-programming
Parallel architecture-programmingShaveta Banda
 
MULTITHREADING CONCEPT
MULTITHREADING CONCEPTMULTITHREADING CONCEPT
MULTITHREADING CONCEPTRAVI MAURYA
 
Enhancement of Palmprint using Median Filter for Biometrics Application
Enhancement of Palmprint using Median Filter for Biometrics ApplicationEnhancement of Palmprint using Median Filter for Biometrics Application
Enhancement of Palmprint using Median Filter for Biometrics ApplicationMangilal Saraswat
 
Benchmark Processors- VAX 8600,MC68040,SPARC and Superscalar RISC
Benchmark Processors- VAX 8600,MC68040,SPARC and Superscalar RISCBenchmark Processors- VAX 8600,MC68040,SPARC and Superscalar RISC
Benchmark Processors- VAX 8600,MC68040,SPARC and Superscalar RISCPriyodarshini Dhar
 
Multithreading
MultithreadingMultithreading
Multithreadingsagsharma
 
Multithreading
Multithreading Multithreading
Multithreading WafaQKhan
 
Advanced computer architecture lesson 5 and 6
Advanced computer architecture lesson 5 and 6Advanced computer architecture lesson 5 and 6
Advanced computer architecture lesson 5 and 6Ismail Mukiibi
 
11. Computer Systems Hardware 1
11. Computer Systems   Hardware 111. Computer Systems   Hardware 1
11. Computer Systems Hardware 1New Era University
 
Multithreading and Parallelism on iOS [MobOS 2013]
 Multithreading and Parallelism on iOS [MobOS 2013] Multithreading and Parallelism on iOS [MobOS 2013]
Multithreading and Parallelism on iOS [MobOS 2013]Kuba Břečka
 

Viewers also liked (20)

Superscalar processors
Superscalar processorsSuperscalar processors
Superscalar processors
 
Vliw
VliwVliw
Vliw
 
VLIW(Very Long Instruction Word)
VLIW(Very Long Instruction Word)VLIW(Very Long Instruction Word)
VLIW(Very Long Instruction Word)
 
Multicore Processsors
Multicore ProcesssorsMulticore Processsors
Multicore Processsors
 
CISC & RISC Architecture
CISC & RISC Architecture CISC & RISC Architecture
CISC & RISC Architecture
 
Multi core processors
Multi core processorsMulti core processors
Multi core processors
 
Nodes and Networks for HPC computing
Nodes and Networks for HPC computingNodes and Networks for HPC computing
Nodes and Networks for HPC computing
 
Embedded Systems Introdution
Embedded Systems IntrodutionEmbedded Systems Introdution
Embedded Systems Introdution
 
3D Microprocessor Design: Stacking at different granularities
3D Microprocessor Design: Stacking at different granularities3D Microprocessor Design: Stacking at different granularities
3D Microprocessor Design: Stacking at different granularities
 
Parallel architecture-programming
Parallel architecture-programmingParallel architecture-programming
Parallel architecture-programming
 
MULTITHREADING CONCEPT
MULTITHREADING CONCEPTMULTITHREADING CONCEPT
MULTITHREADING CONCEPT
 
13 superscalar
13 superscalar13 superscalar
13 superscalar
 
Enhancement of Palmprint using Median Filter for Biometrics Application
Enhancement of Palmprint using Median Filter for Biometrics ApplicationEnhancement of Palmprint using Median Filter for Biometrics Application
Enhancement of Palmprint using Median Filter for Biometrics Application
 
Benchmark Processors- VAX 8600,MC68040,SPARC and Superscalar RISC
Benchmark Processors- VAX 8600,MC68040,SPARC and Superscalar RISCBenchmark Processors- VAX 8600,MC68040,SPARC and Superscalar RISC
Benchmark Processors- VAX 8600,MC68040,SPARC and Superscalar RISC
 
Chapter6 pipelining
Chapter6  pipeliningChapter6  pipelining
Chapter6 pipelining
 
Multithreading
MultithreadingMultithreading
Multithreading
 
Multithreading
Multithreading Multithreading
Multithreading
 
Advanced computer architecture lesson 5 and 6
Advanced computer architecture lesson 5 and 6Advanced computer architecture lesson 5 and 6
Advanced computer architecture lesson 5 and 6
 
11. Computer Systems Hardware 1
11. Computer Systems   Hardware 111. Computer Systems   Hardware 1
11. Computer Systems Hardware 1
 
Multithreading and Parallelism on iOS [MobOS 2013]
 Multithreading and Parallelism on iOS [MobOS 2013] Multithreading and Parallelism on iOS [MobOS 2013]
Multithreading and Parallelism on iOS [MobOS 2013]
 

Similar to Superscalar Architecture_AIUB

Superscalar processor
Superscalar processorSuperscalar processor
Superscalar processornoor ul ain
 
Two students were debating how to classify superscalar operation ver.pdf
Two students were debating how to classify superscalar operation ver.pdfTwo students were debating how to classify superscalar operation ver.pdf
Two students were debating how to classify superscalar operation ver.pdfbirajdar2
 
SOC System Design Approach
SOC System Design ApproachSOC System Design Approach
SOC System Design ApproachA B Shinde
 
Advanced processor principles
Advanced processor principlesAdvanced processor principles
Advanced processor principlesDhaval Bagal
 
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...IDES Editor
 
FIne Grain Multithreading
FIne Grain MultithreadingFIne Grain Multithreading
FIne Grain MultithreadingDharmesh Tank
 
Pipeline Mechanism
Pipeline MechanismPipeline Mechanism
Pipeline MechanismAshik Iqbal
 
STUDY OF VARIOUS FACTORS AFFECTING PERFORMANCE OF MULTI-CORE PROCESSORS
STUDY OF VARIOUS FACTORS AFFECTING PERFORMANCE OF MULTI-CORE PROCESSORSSTUDY OF VARIOUS FACTORS AFFECTING PERFORMANCE OF MULTI-CORE PROCESSORS
STUDY OF VARIOUS FACTORS AFFECTING PERFORMANCE OF MULTI-CORE PROCESSORSijdpsjournal
 
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxCS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxfaithxdunce63732
 
Java microarchitecture of a coarse-grain out-of-order superscalar processor
Java  microarchitecture of a coarse-grain out-of-order superscalar processorJava  microarchitecture of a coarse-grain out-of-order superscalar processor
Java microarchitecture of a coarse-grain out-of-order superscalar processorecwayerode
 
Microarchitecture of a coarse grain out-of-order superscalar processor
Microarchitecture of a coarse grain out-of-order superscalar processorMicroarchitecture of a coarse grain out-of-order superscalar processor
Microarchitecture of a coarse grain out-of-order superscalar processorecway
 
Parallel Computing
Parallel ComputingParallel Computing
Parallel ComputingMohsin Bhat
 

Similar to Superscalar Architecture_AIUB (20)

Superscalar processor
Superscalar processorSuperscalar processor
Superscalar processor
 
Two students were debating how to classify superscalar operation ver.pdf
Two students were debating how to classify superscalar operation ver.pdfTwo students were debating how to classify superscalar operation ver.pdf
Two students were debating how to classify superscalar operation ver.pdf
 
SOC System Design Approach
SOC System Design ApproachSOC System Design Approach
SOC System Design Approach
 
Advanced processor principles
Advanced processor principlesAdvanced processor principles
Advanced processor principles
 
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
 
FIne Grain Multithreading
FIne Grain MultithreadingFIne Grain Multithreading
FIne Grain Multithreading
 
Pipeline Mechanism
Pipeline MechanismPipeline Mechanism
Pipeline Mechanism
 
Chap2 slides
Chap2 slidesChap2 slides
Chap2 slides
 
1.prallelism
1.prallelism1.prallelism
1.prallelism
 
1.prallelism
1.prallelism1.prallelism
1.prallelism
 
STUDY OF VARIOUS FACTORS AFFECTING PERFORMANCE OF MULTI-CORE PROCESSORS
STUDY OF VARIOUS FACTORS AFFECTING PERFORMANCE OF MULTI-CORE PROCESSORSSTUDY OF VARIOUS FACTORS AFFECTING PERFORMANCE OF MULTI-CORE PROCESSORS
STUDY OF VARIOUS FACTORS AFFECTING PERFORMANCE OF MULTI-CORE PROCESSORS
 
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxCS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
 
Scope of parallelism
Scope of parallelismScope of parallelism
Scope of parallelism
 
Java microarchitecture of a coarse-grain out-of-order superscalar processor
Java  microarchitecture of a coarse-grain out-of-order superscalar processorJava  microarchitecture of a coarse-grain out-of-order superscalar processor
Java microarchitecture of a coarse-grain out-of-order superscalar processor
 
Microarchitecture of a coarse grain out-of-order superscalar processor
Microarchitecture of a coarse grain out-of-order superscalar processorMicroarchitecture of a coarse grain out-of-order superscalar processor
Microarchitecture of a coarse grain out-of-order superscalar processor
 
Lecture2
Lecture2Lecture2
Lecture2
 
Parallel Computing
Parallel ComputingParallel Computing
Parallel Computing
 
shashank_hpca1995_00386533
shashank_hpca1995_00386533shashank_hpca1995_00386533
shashank_hpca1995_00386533
 
Computer architecture
Computer architectureComputer architecture
Computer architecture
 
Oversimplified CA
Oversimplified CAOversimplified CA
Oversimplified CA
 

Recently uploaded

Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 

Recently uploaded (20)

young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 

Superscalar Architecture_AIUB

  • 1. American International University-Bangladesh (AIUB) Presenter Nusrat Irin Chowdhury Mary Superscalar Architecture
  • 2.  Superscalar Architecture (SSA) describes a microprocessor design that execute more than one instruction at a time during a single clock cycle .  In a SSA design, the processor or the instruction compiler is able to determine whether an instruction can be carried out independently of other sequential instructions, or whether it has a dependency on another instruction and must be executed sequentially.  The design is sometimes called “Second Generation RISC”. Another term used to describe superscalar processors is multiple instruction issue processors. 2 Superscalar Architecture
  • 3.  In a SSA, several scalar instructions can be initiated simultaneously and executed independently.  A long series of innovations aimed at producing ever- faster microprocessors.  Includes all features of pipelining but, in addition, there can be several instructions executing simultaneously in the same pipeline stage.  SSA introduces a new level of parallelism, called instruction-level parallelism. 3 Superscalar Architecture cont’d
  • 4.  In Superscalar CPU Architecture implementation of Instruction Level Parallelism (ILP) within a single processor allows faster CPU at a given clock rate.  A superscalar processor executes more than one instruction during a clock cycle by simultaneously dispatching multiple instructions to functional units.  Each functional unit is not a separate CPU core but an execution resource within a single CPU such as an arithmetic logic unit, a bit shifter, or a multiplier. 4 Superscalar CPU Architecture
  • 5.  SimpleScalar is an open source computer architecture simulator which is written using ‘C’ programming language.  A set of tools that model a virtual computer system with CPU, Cache and Memory Hierarchy.  Using the tool, users can model applications that simulate programs running on a range of modern processors and systems.  The tool set includes sample simulators ranging from a fast functional simulator to a detailed. 5 SimpleScalar Architecture
  • 6.  The simplest processors are scalar processors. Each instruction executed by a scalar processor typically manipulates one or two data items at a time.  In a superscalar CPU the dispatcher reads instructions from memory and decides which one can be run in parallel.  Therefore a superscalar processor can be proposed having multiple parallel pipelines, each of which is processing instructions simultaneously from a single instruction thread. 6 Scalar to Superscalar
  • 7.  Pipelining is the process of breaking down task into substeps and executing them in different parts of processor.  In order to fully utilise a superscalar processor of degree m with pipelining, m instructions must be executable parallely. This situation may not be true in all clock cycles.  In a superscalar processor, the simple operation latency should require only one cycle, as in the base scalar processor. 7 Pipelining in Superscalar Architecture
  • 8. A SSA processor fetches multiple instructions at a time, and attempts to find nearby instructions that are independent of each other and therefore can be executed in parallel. Based on the dependency analysis, the processor may issue and execute instructions in an order that differs from that of the original machine code. The processor may eliminate some unnecessary dependencies by the use of additional registers. 8 Implement Superscalar
  • 9. I: Instructions from 1 to corresponding sequences Fetch: fetches instructions from memory (ideally one per cycle for Scalar) Decode: reveals instruction operations to be performed and identifies the resources needed Execute: actual processing of operations as indicated by instruction Store (Write Back): writing results into the registers. 9 Superscalar with Scalar Instructions Flow
  • 10. 10 Effect of Dependencies  ADD r1, r2 (r1 := r1+r2;)  MOVE r3,r1 (r3 := r1;)  Can fetch and decode second instruction in parallel with first  Can NOT execute second instruction until first is finished
  • 13. 13 Instruction Flow in Superscalar Architecture
  • 14. Superpipelining is based on dividing the stages into several sub- stages, and thus increasing the number of instructions which are handled by the pipeline at the same time. For example, by dividing each stage into two sub-stages, a pipeline can perform at twice the speed in the ideal situation:  Tasks that require less than half a clock cycle.  No duplication of hardware is needed for these stages. 14 Superpipelining Figure: Duplication of hardware is for Superscalar
  • 15. Base machine: 4-stage pipeline • Instruction fetch • Operation decode • Operation execution • Result write back Superpipeline of degree 2 • A sub-stage often takes half a clock cycle to finish. Superscalar of degree 2 • Two instructions are executed. • Duplication of hardware is required by definition. 15 Superscalar vs. Superpipeline
  • 16. 16 Superpipelined Superscalar Superpipeline of degree 3 and superscalar of degree 4: • 12 times speed-up over the base machine. • 48 times speedup over sequential execution. This is a new trend of architecture design: • Pentium Pro(P6): 3-degree superscalar, 12-stage “superpipeline”. • PowerPC 620: 4-degree superscalar, 4/6-stage pipeline.
  • 17. A Pipeline architecture in more detail 17 Instruction Flow A Superscalar microarchitecture
  • 19. Tasks can be divided into the following  Parallel decoding  Superscalar instruction issue  Parallel instruction execution preserving sequential consistency of exception processing preserving sequential consistency of execution 19 Superscalar Issues to Consider
  • 20.  Parallel decoding – more complex task for scalar processors.  Superscalar instruction issue – A higher issue rate gives rise to higher processor performance, but amplifies the restrictive effects of control and data dependencies on the processor performance.  Parallel instruction execution task – While instructions are executed in parallel, instructions are usually completed out of order in respect to a sequential operating procedure. 20 Superscalar Issues to Consider
  • 21. 21 Two way Superscalar Execution Figure: Two instruction parallel execution in one clock.
  • 22.  Instruction-fetch inefficiencies caused by both branch delays and instruction misalignment  not worthwhile to explore highly- concurrent execution hardware, rather, it is more appropriate to explore economical execution hardware  degree of intrinsic parallelism in the instruction stream (instructions requiring the same computational resources from the CPU)  complexity and time cost of the dispatcher and associated dependency checking logic  branch instruction processing. 22 Limitations of Superscalar
  • 23. [1] Zebo Peng. (2010). General format, “Superscalar Architecture – IDA”, Lecture 3 Introduction to Parallel Architectures. [2] William M. Johnson, “Super-Scalar Processor Design”, Stanford University Since 1989 [3] James E. Smith (1995), “The Microarchitecture of Superscalar Processors”, senior member IEEE [4] Subbarao Palacharla, “Complexity -Effective Superscalar Processors”, UNIVERSITY OF WISCONSIN—MADISON, Since 1998 [5] Alaa Alameldeen and Haitham Akkary, "The Microarchitecture of Superscalar Processors", Portland State University, since 2014 [5] http://www.techterms.com/definition/superscalar [6] https://www.google.com.bd/search?q=superscalar+archit ecture&newwindow=1&tbm=isch&tbo=u&source=univ&sa=X&ei =Ll1oU9umJNWHuASgwoDACA&ved=0CDUQsAQ&biw=1048&bih=921 23 Citation