1
 Very long instruction word (VLIW) refers to a processor
architecture designed to take advantage of instruction-level
parallelism (ILP).
 Conventional processors mostly only allow programs that specify
instructions to be executed one after another, whereas a VLIW
processor allows programs to explicitly specify instructions to be
executed at the same time (i.e. in parallel).
 This type of processor architecture is intended to allow higher
performance without the inherent complexity of some other
approaches.
 The term VLIW, and the concept of VLIW architecture itself, were
invented by Josh Fisher in his research group at Yale University in
the early 1980s.
Very long instruction word
2
 Explicitly parallel instruction computing (EPIC) is a term coined in
1997 by the HP–Intel alliance to describe a computing paradigm that
researchers had been investigating since the early 1980s.
 This paradigm is also called Independence architectures. It was the
basis for Intel and HP development of the Intel Itanium architecture,
and HP later asserted that "EPIC" was merely an old term for the
Itanium architecture.
 EPIC permits microprocessors to execute software instructions in
parallel by using the compiler, rather than complex on-die circuitry,
to control parallel instruction execution.
 This was intended to allow simple performance scaling without
resorting to higher clock frequencies.
Explicitly parallel instruction computing
3
 VLIW (very long instruction word):
- Compiler packs a fixed number of instructions into a single VLIW
instruction.
- The instructions within a VLIW instruction are issued and executed in
parallel
Example: High-end signal processors (TMS320C6201)
 EPIC (explicitly parallel instruction computing):
- Evolution of VLIW
Example: Intel’s IA-64, exemplified by the Itanium processor
VLIW or EPIC
4
VLIW
 VLIW (very long instruction word)
processors use a long instruction word that
contains a usually fixed number of
operations that are fetched, decoded,
issued, and executed synchronously.
 All operations specified within a VLIW
instruction must be independent of one
another.
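The independence rule above can be sketched in a few lines of Python. This is a minimal illustration, not taken from any real toolchain: the `Op` tuple and `is_valid_word` helper are hypothetical names, and "independent" is simplified to mean that no two operations write the same register and no operation reads a register written by another operation in the same word.

```python
from typing import NamedTuple

class Op(NamedTuple):
    """One operation slot: a destination register and its source registers."""
    name: str
    dest: str
    srcs: tuple

def is_valid_word(ops):
    """Return True if no operation depends on another in the same word.

    Simplified rule (an assumption of this sketch): no two operations write
    the same register, and no operation reads a register that a different
    operation in the word writes.
    """
    dests = [op.dest for op in ops]
    if len(set(dests)) != len(dests):
        return False  # two operations write the same register
    for i, op in enumerate(ops):
        for j, other in enumerate(ops):
            if i != j and other.dest in op.srcs:
                return False  # op would need a result produced in this same word
    return True

independent = [Op("add", "r1", ("r2", "r3")), Op("mul", "r4", ("r5", "r6"))]
dependent = [Op("add", "r1", ("r2", "r3")), Op("mul", "r4", ("r1", "r6"))]
print(is_valid_word(independent))  # True
print(is_valid_word(dependent))    # False
```

In a real VLIW this check is the compiler's job; the hardware simply trusts the word it is given.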
5
VLIW
 Some of the key issues of a (V)LIW processor:
• (very) long instruction word (up to 1,024 bits per
instruction),
• each instruction consists of multiple independent
parallel operations,
• each operation requires a statically known number
of cycles to complete,
• a central controller that issues a long instruction
word every cycle,
• multiple FUs connected through a global shared
register file.
6
VLIW
 sequential stream of long instruction words
 instructions scheduled statically by the
compiler
 number of simultaneously issued
instructions is fixed during compile-time
 instruction issue is less complicated than in
a superscalar processor
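Static scheduling by the compiler can be illustrated with a small greedy packer. This is a sketch under stated assumptions, not a production list scheduler: the `schedule` function and its arguments are hypothetical, every operation is assumed to take one cycle, and dependences may only point to operations earlier in program order.

```python
def schedule(ops, deps, width):
    """Greedily pack operations into long instruction words.

    ops   -- operation names in program order
    deps  -- dict mapping an operation to the operations it depends on
    width -- fixed number of slots per long instruction word
    Every operation is assumed to take one cycle (a simplification).
    """
    cycle_of = {}
    words = []
    for op in ops:
        # Earliest word that comes after every producer has issued.
        earliest = max((cycle_of[d] + 1 for d in deps.get(op, ())), default=0)
        cycle = earliest
        while cycle < len(words) and len(words[cycle]) >= width:
            cycle += 1  # that word is full, try the next one
        while len(words) <= cycle:
            words.append([])
        words[cycle].append(op)
        cycle_of[op] = cycle
    return words

ops = ["a", "b", "c", "d", "e"]
deps = {"c": ["a", "b"], "d": ["c"], "e": ["a"]}
print(schedule(ops, deps, width=2))  # [['a', 'b'], ['c', 'e'], ['d']]
```

The issue width is an input to the compiler here, which is why the number of simultaneously issued instructions is fixed at compile time.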
7
VLIW
 Disadvantage: VLIW processors cannot react to dynamic events,
e.g. cache misses, with the same flexibility as superscalars.
 The number of instructions in a VLIW instruction word is
usually fixed.
 Padding VLIW instructions with no-ops is needed when
the full issue bandwidth is not met. This increases code
size. More recent VLIW architectures use a denser code
format that allows the no-ops to be removed.
 VLIW is an architectural technique, whereas superscalar is
a microarchitecture technique.
 VLIW processors take advantage of spatial parallelism.
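The code-size cost of no-op padding can be made concrete with a short calculation. The helper names and the example words below are hypothetical; the sketch only assumes a fixed-width word in which every unused slot must hold a no-op.

```python
def pad_with_nops(words, width):
    """Pad every long instruction word to exactly `width` slots."""
    return [word + ["nop"] * (width - len(word)) for word in words]

def nop_fraction(padded):
    """Fraction of all slots that hold a no-op."""
    slots = [op for word in padded for op in word]
    return slots.count("nop") / len(slots)

scheduled_words = [["ld", "add"], ["mul"], ["st", "sub", "br"]]
padded = pad_with_nops(scheduled_words, width=4)
print(padded[1])             # ['mul', 'nop', 'nop', 'nop']
print(nop_fraction(padded))  # 0.5
```

Here six useful operations occupy twelve slots, so half of the encoded program is padding. A denser format would store only the six operations plus some marker for word boundaries.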
8
EPIC: a paradigm shift
 Superscalar RISC solution
• Based on sequential execution semantics
• Compiler’s role is limited by the instruction set architecture
• Superscalar hardware identifies and exploits parallelism
 EPIC solution – (the evolution of VLIW)
• Based on parallel execution semantics
• EPIC ISA enhancements support static parallelization
• Compiler takes greater responsibility for exploiting parallelism
• Compiler / hardware collaboration often resembles superscalar
9
EPIC: a paradigm shift
 Advantages of pursuing EPIC architectures
• Make wide issue & deep latency less expensive in
hardware
• Allow processor parallelism to scale with additional
VLSI density
 Architect the processor to do well with in-order execution
• Enhance the ISA to allow static parallelization
• Use compiler technology to parallelize program
• However, a purely static VLIW is not appropriate for
general-purpose use
10
The fusion of VLIW and superscalar techniques
 Superscalars need improved support for static parallelization
• Static scheduling
• Limited support for predicated execution
 VLIWs need improved support for dynamic parallelization
• Caches introduce dynamically changing memory latency
• Compatibility: issue width and latency may change with new
hardware
• Application requirements - e.g. object oriented programming with
dynamic binding
 EPIC processors exhibit features derived from both
• Interlock & out-of-order execution hardware are compatible with
EPIC (but not required!)
• EPIC processors can use dynamic translation to parallelize in
software
11
Many EPIC features are taken from VLIWs
 Minisupercomputer products stimulated VLIW research (FPS,
Multiflow, Cydrome)
 Minisupercomputers were specialized, costly, and short-lived
 Traditional VLIWs not suited to general purpose computing
 VLIW resurgence in single chip DSP & media processors
 Minisupercomputers exaggerated forward-looking challenges:
• Long latency
• Wide issue
• Large number of architected registers
• Compile-time scheduling to exploit exotic amounts of parallelism
 EPIC exploits many VLIW techniques
12
Shortcomings of early VLIWs
 Expensive multi-chip implementations
 No data cache
 Poor "scalar" performance
 No strategy for object code compatibility
13
EPIC design challenges
 Develop architectures applicable to general-purpose
computing
• Find substantial parallelism in “difficult to parallelize”
scalar programs
• Provide compatibility across hardware generations
• Support emerging applications (e.g. multimedia)
 Compiler must find or create sufficient ILP
 Combine the best attributes of VLIW & superscalar RISC
(incorporated best concepts from all available sources)
 Scale architectures for modern single-chip implementation
14
EPIC Processors, Intel's IA-64 ISA and Itanium
 Joint R&D project by Hewlett-Packard and Intel
(announced in June 1994)
 This resulted in explicitly parallel instruction
computing (EPIC) design style:
• specifying ILP explicitly in the machine code, that is, the
parallelism is encoded directly into the instructions
similarly to VLIW;
• a fully predicated instruction set;
• an inherently scalable instruction set (i.e., the ability to
scale to a lot of FUs);
• many registers;
• speculative execution of load instructions
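To show what a fully predicated instruction set buys, here is a small Python model of if-conversion. It is a sketch, not IA-64 semantics in detail: `run_predicated` and the three actions are hypothetical, and the only behaviour modeled is that an operation updates state only when its qualifying predicate is true.

```python
def run_predicated(program, regs, preds):
    """Execute a straight-line list of predicated operations.

    Each entry is (qualifying_predicate, action). The action always occupies
    an issue slot, but it only updates state when its predicate is true.
    """
    for qp, action in program:
        if preds[qp]:
            action(regs, preds)
    return regs

def compare_gt(regs, preds):
    # Writes a complementary predicate pair from one comparison.
    preds["p1"] = regs["a"] > regs["b"]
    preds["p2"] = not preds["p1"]

def sub_ab(regs, preds):
    regs["x"] = regs["a"] - regs["b"]

def sub_ba(regs, preds):
    regs["x"] = regs["b"] - regs["a"]

# If-converted form of: x = a - b if a > b else b - a
program = [("p0", compare_gt), ("p1", sub_ab), ("p2", sub_ba)]

def abs_diff(a, b):
    preds = {"p0": True, "p1": False, "p2": False}  # p0 is always true
    return run_predicated(program, {"a": a, "b": b, "x": 0}, preds)["x"]

print(abs_diff(7, 3))  # 4
print(abs_diff(3, 7))  # 4
```

The branch has disappeared: both arms sit in straight-line code, and the two subtractions are independent of each other, so a compiler could place them in the same parallel group.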
15
IA-64 Architecture
 Unique architecture features & enhancements
• Explicit parallelism and templates
• Predication, speculation, memory support, and
others
• Floating-point and multimedia architecture
 IA-64 resources available to applications
• Large, application visible register set
• Rotating registers, register stack, register stack
engine
 IA-32 & PA-RISC compatibility models
16
Today’s Architecture Challenges
 Performance barriers :
• Memory latency
• Branches
• Loop pipelining and call / return overhead
 Headroom constraints :
• Hardware-based instruction scheduling
- Unable to efficiently schedule parallel execution
• Resource constrained
- Too few registers
- Unable to fully utilize multiple execution units
 Scalability limitations :
• Memory addressing efficiency
17
Intel's IA-64 ISA
 Intel 64-bit Architecture (IA-64) register model:
• 128, 64-bit general purpose registers GR0-GR127
to hold values for integer and multimedia computations
- each register has one additional NaT (Not a Thing) bit to
indicate whether the value stored is valid,
• 128, 82-bit floating-point registers FR0-FR127
- registers FR0 and FR1 are read-only with values +0.0 and +1.0,
• 64, 1-bit predicate registers PR0-PR63
- the first register PR0 is read-only and always reads 1 (true)
• 8, 64-bit branch registers BR0-BR7 to specify the target
addresses of indirect branches
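The role of the NaT bit is easiest to see together with speculative loads. The following Python model is a loose sketch: the class and function names are hypothetical, memory is a plain dictionary, and a missing address stands in for any fault that the hardware would defer.

```python
class GR:
    """A general register value with its NaT (Not a Thing) bit."""
    def __init__(self, value=0, nat=False):
        self.value = value
        self.nat = nat

def speculative_load(memory, addr):
    """Model of a speculative load: a bad address sets NaT instead of faulting."""
    if addr in memory:
        return GR(memory[addr])
    return GR(nat=True)  # the exception is deferred

def add(a, b):
    """NaT propagates through dependent computations."""
    return GR(a.value + b.value, a.nat or b.nat)

def check(reg, recover):
    """Model of a speculation check: run recovery code only if NaT is set."""
    return recover() if reg.nat else reg.value

memory = {0x10: 5}
ok = add(speculative_load(memory, 0x10), GR(1))
bad = add(speculative_load(memory, 0x99), GR(1))
print(check(ok, lambda: -1))   # 6
print(check(bad, lambda: -1))  # -1
```

Because the invalid value is only tagged, the compiler can hoist a load above a branch and pay for a fault only if the result is actually used.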
18
IA-64’s Large Register File
[Figure: IA-64 register file. Integer registers GR0-GR127 (bits 63-0 plus a NaT bit; GR0 reads 0): 32 static (GR0-GR31) and 96 stacked, rotating (GR32-GR127). Floating-point registers (bits 81-0; first register reads 0.0): 32 static and 96 rotating. Predicate registers PR0-PR63 (1 bit each; PR0 reads 1): 16 static (PR0-PR15) and 48 rotating (PR16-PR63). Branch registers BR0-BR7 (bits 63-0).]
19
Intel's IA-64 ISA
• IA-64 instructions are 41 bits long (earlier literature stated 40 bits) and consist of
- op-code,
- predicate field (6 bits),
- two source register addresses (7 bits each),
- destination register address (7 bits), and
- special fields (includes integer and floating-point arithmetic).
• The 6-bit predicate field in each IA-64 instruction refers to a set of 64 predicate
registers.
• 6 types of instructions
- A: Integer ALU ==> I-unit or M-unit
- I: Non-ALU integer ==> I-unit
- M: Memory ==> M-unit
- B: Branch ==> B-unit
- F: Floating-point ==> F-unit
- L: Long Immediate ==> I-unit
• IA-64 instructions are packed by compiler into bundles.
20
IA-64 Bundles
 A bundle is a 128-bit long instruction word (LIW) containing three 41-bit IA-64
instructions along with a so-called 5-bit template that contains instruction
grouping information.
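 IA-64 does not need no-op instructions to pad a group of parallel instructions out to the full issue width; a slot that the chosen template cannot fill usefully still holds an explicit no-op.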
 The template explicitly indicates :
• first 4 bits: types of instructions
• last bit (stop bit): whether the bundle can be executed in parallel with the
next bundle
• (previous literature): whether the instructions in the bundle can be executed
in parallel or if one or more must be executed serially.
 Bundled instructions don't have to be in their original program order, and they
can even represent entirely different paths of a branch.
 Also, the compiler can mix dependent and independent instructions together in a
bundle, because the template keeps track of which is which.
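The bundle arithmetic (3 × 41 + 5 = 128 bits) can be checked by packing and unpacking a bundle as a Python integer. The function names are hypothetical, and the field positions are an assumption of this sketch: template in the lowest five bits, followed by instruction slots 0, 1 and 2.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1

assert 3 * SLOT_BITS + TEMPLATE_BITS == 128

def pack_bundle(template, slots):
    """Pack a 5-bit template and three 41-bit instructions into 128 bits.

    Assumed layout: template in bits 0-4, slot 0 in bits 5-45,
    slot 1 in bits 46-86, slot 2 in bits 87-127.
    """
    if len(slots) != 3:
        raise ValueError("a bundle holds exactly three instructions")
    if template & ~TEMPLATE_MASK or any(s & ~SLOT_MASK for s in slots):
        raise ValueError("field does not fit its width")
    bundle = template
    for i, slot in enumerate(slots):
        bundle |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle

def unpack_bundle(bundle):
    """Split a 128-bit bundle back into (template, [slot0, slot1, slot2])."""
    template = bundle & TEMPLATE_MASK
    slots = [(bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK
             for i in range(3)]
    return template, slots

bundle = pack_bundle(0x10, [0x1, 0x2, 0x3])
print(bundle.bit_length() <= 128)  # True
print(unpack_bundle(bundle))       # (16, [1, 2, 3])
```

The slot values here are placeholders, not real instruction encodings; only the field widths come from the slide.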
21
IA-64 : Explicitly Parallel Architecture
 IA-64 template specifies
• The type of operation for each instruction
- MFI, MMI, MII, MLI, MIB, MMF, MFB, MMB,
MBB, BBB
• Intra-bundle relationship
- M / MI or MI / I
• Inter-bundle relationship
 Most common combinations covered by templates
• Headroom for additional templates
 Simplifies hardware requirements
 Scales compatibly to future generations
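A template decoder can be sketched as a table lookup. The numeric codes below are an assumption to be verified against the Itanium architecture manual before relying on them; the names `TEMPLATES` and `decode_template` are hypothetical. The sketch treats the lowest template bit as the end-of-bundle stop, and writes the slide's MLI combination as MLX because a long immediate occupies two slots.

```python
# Slot types for each even template code (assumed values, to be checked
# against the architecture manual). "|" marks a stop inside the bundle.
TEMPLATES = {
    0x00: "MII", 0x02: "MI|I", 0x04: "MLX", 0x08: "MMI", 0x0A: "M|MI",
    0x0C: "MFI", 0x0E: "MMF", 0x10: "MIB", 0x12: "MBB", 0x16: "BBB",
    0x18: "MMB", 0x1C: "MFB",
}

def decode_template(code):
    """Return (slot types, stop inside the bundle, stop at the end)."""
    layout = TEMPLATES.get(code & ~1)  # clear the end-of-bundle stop bit
    if layout is None:
        raise ValueError(f"reserved template {code:#04x}")
    return layout.replace("|", ""), "|" in layout, bool(code & 1)

print(decode_template(0x11))  # ('MIB', False, True)
print(decode_template(0x0A))  # ('MMI', True, False)
```

Codes missing from the table are treated as reserved, which corresponds to the headroom for additional templates noted above.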
[Figure: 128-bit bundle layout: Instruction 2 (41 bits), Instruction 1 (41 bits), Instruction 0 (41 bits), Template (5 bits). Legend: M = Memory, F = Floating-point, I = Integer, L = Long Immediate, B = Branch. Example template MMI: Memory (M), Memory (M), Integer (I).]

Study on Air-Water & Water-Water Heat Exchange in a Finned ďťżTube Exchanger
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 

Vliw or epic

  • 1. Very long instruction word
    - Very long instruction word (VLIW) refers to a processor architecture designed to take advantage of instruction-level parallelism (ILP).
    - Conventional processors mostly only allow programs that specify instructions to be executed one after another, whereas a VLIW processor allows programs to explicitly specify instructions to be executed at the same time (i.e. in parallel).
    - This type of processor architecture is intended to allow higher performance without the inherent complexity of some other approaches.
    - The term VLIW, and the concept of VLIW architecture itself, were invented by Josh Fisher in his research group at Yale University in the early 1980s.
  • 2. Explicitly parallel instruction computing
    - Explicitly parallel instruction computing (EPIC) is a term coined in 1997 by the HP–Intel alliance to describe a computing paradigm that researchers had been investigating since the early 1980s.
    - This paradigm is also called Independence architectures. It was the basis for Intel and HP's development of the Intel Itanium architecture, and HP later asserted that "EPIC" was merely an old term for the Itanium architecture.
    - EPIC permits microprocessors to execute software instructions in parallel by using the compiler, rather than complex on-die circuitry, to control parallel instruction execution.
    - This was intended to allow simple performance scaling without resorting to higher clock frequencies.
  • 3. VLIW or EPIC
    - VLIW (very long instruction word):
      - The compiler packs a fixed number of instructions into a single VLIW instruction.
      - The instructions within a VLIW instruction are issued and executed in parallel.
      - Example: high-end signal processors (TMS320C6201)
    - EPIC (explicitly parallel instruction computing):
      - Evolution of VLIW
      - Example: Intel's IA-64, exemplified by the Itanium processor
  • 4. VLIW
    - VLIW (very long instruction word) processors use a long instruction word that contains a (usually fixed) number of operations that are fetched, decoded, issued, and executed synchronously.
    - All operations specified within a VLIW instruction must be independent of one another.
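The independence requirement can be illustrated with a small check. This is a minimal sketch, not taken from the slides: the `Op` tuple, the register names, and the conservative rule (no operation may read or write a register that another operation in the same word writes) are assumptions chosen for illustration. Real VLIW machines often permit some of these cases, for example reading a register that a neighbouring operation overwrites.

```python
from typing import NamedTuple


class Op(NamedTuple):
    """One operation inside a long instruction word (hypothetical format)."""
    name: str
    dest: str      # register written by the operation
    srcs: tuple    # registers read by the operation


def independent(ops):
    """Return True if no operation touches a register written by another one."""
    for i, a in enumerate(ops):
        for j, b in enumerate(ops):
            if i == j:
                continue
            # b reads a register that a writes (read/write conflict)
            if a.dest in b.srcs:
                return False
            # both operations write the same register
            if i < j and a.dest == b.dest:
                return False
    return True


ok_word = [Op("add", "r1", ("r2", "r3")), Op("mul", "r4", ("r5", "r6"))]
bad_word = [Op("add", "r1", ("r2", "r3")), Op("sub", "r7", ("r1", "r8"))]

print(independent(ok_word))   # True
print(independent(bad_word))  # False: sub reads r1, which add writes
```

In a VLIW toolchain this kind of check belongs to the compiler, since the hardware issues the word as given.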
  • 5. VLIW
    - Some key features of a (V)LIW processor:
      - a (very) long instruction word (up to 1,024 bits per instruction),
      - each instruction consists of multiple independent parallel operations,
      - each operation requires a statically known number of cycles to complete,
      - a central controller issues one long instruction word every cycle,
      - multiple functional units (FUs) are connected through a global shared register file.
  • 6. VLIW
    - Sequential stream of long instruction words
    - Instructions are scheduled statically by the compiler
    - The number of simultaneously issued instructions is fixed at compile time
    - Instruction issue is less complicated than in a superscalar processor
  • 7. VLIW
    - Disadvantage: VLIW processors cannot react to dynamic events, e.g. cache misses, with the same flexibility as superscalar processors.
    - The number of instructions in a VLIW instruction word is usually fixed.
    - VLIW instructions must be padded with no-ops whenever the full issue bandwidth cannot be used. This increases code size. More recent VLIW architectures use a denser code format that allows the no-ops to be removed.
    - VLIW is an architectural technique, whereas superscalar is a microarchitectural technique.
    - VLIW processors take advantage of spatial parallelism.
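The cost of no-op padding can be made concrete with a toy packer. This is a minimal sketch under stated assumptions: the issue width of 4, the `(name, dest, srcs)` operation format, and the greedy in-order policy are all hypothetical, and operation latencies are ignored. A new word is started whenever an operation depends on a result produced earlier in the current word.

```python
WIDTH = 4  # hypothetical issue width (operations per long instruction word)


def pack(ops, width=WIDTH):
    """Greedily pack (name, dest, srcs) operations into fixed-width words."""
    words, current, written = [], [], set()
    for name, dest, srcs in ops:
        # The operation depends on the current word if it reads or rewrites
        # a register that an earlier operation in this word writes.
        depends = any(s in written for s in srcs) or dest in written
        if depends or len(current) == width:
            # Close the current word and pad the unused slots with no-ops.
            words.append(current + ["nop"] * (width - len(current)))
            current, written = [], set()
        current.append(name)
        written.add(dest)
    if current:
        words.append(current + ["nop"] * (width - len(current)))
    return words


program = [
    ("add", "r1", ("r2", "r3")),
    ("mul", "r4", ("r1", "r5")),   # needs r1 from add
    ("sub", "r6", ("r7", "r8")),
    ("ld",  "r9", ("r6",)),        # needs r6 from sub
]

words = pack(program)
nops = sum(word.count("nop") for word in words)
for word in words:
    print(word)
print(f"{nops} of {len(words) * WIDTH} slots are no-ops")
```

For this four-operation program the packer emits three words and fills 8 of the 12 slots with no-ops, which is the code-size penalty that denser encodings try to avoid.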
  • 8. EPIC: a paradigm shift
    - Superscalar RISC solution
      - Based on sequential execution semantics
      - Compiler's role is limited by the instruction set architecture
      - Superscalar hardware identifies and exploits parallelism
    - EPIC solution (the evolution of VLIW)
      - Based on parallel execution semantics
      - EPIC ISA enhancements support static parallelization
      - Compiler takes greater responsibility for exploiting parallelism
      - Compiler / hardware collaboration often resembles superscalar
  • 9. EPIC: a paradigm shift
    - Advantages of pursuing EPIC architectures
      - Make wide issue and deep latency less expensive in hardware
      - Allow processor parallelism to scale with additional VLSI density
    - Architect the processor to do well with in-order execution
      - Enhance the ISA to allow static parallelization
      - Use compiler technology to parallelize the program
      - However, a purely static VLIW is not appropriate for general-purpose use
  • 10. The fusion of VLIW and superscalar techniques
    - Superscalars need improved support for static parallelization
      - Static scheduling
      - Limited support for predicated execution
    - VLIWs need improved support for dynamic parallelization
      - Caches introduce dynamically changing memory latency
      - Compatibility: issue width and latency may change with new hardware
      - Application requirements, e.g. object-oriented programming with dynamic binding
    - EPIC processors exhibit features derived from both
      - Interlock and out-of-order execution hardware are compatible with EPIC (but not required!)
      - EPIC processors can use dynamic translation to parallelize in software
  • 11. Many EPIC features are taken from VLIWs
    - Minisupercomputer products stimulated VLIW research (FPS, Multiflow, Cydrome)
    - Minisupercomputers were specialized, costly, and short-lived
    - Traditional VLIWs were not suited to general-purpose computing
    - VLIW resurgence in single-chip DSP and media processors
    - Minisupercomputers exaggerated forward-looking challenges:
      - Long latency
      - Wide issue
      - Large number of architected registers
      - Compile-time scheduling to exploit exotic amounts of parallelism
    - EPIC exploits many VLIW techniques
  • 12. Shortcomings of early VLIWs
    - Expensive multi-chip implementations
    - No data cache
    - Poor "scalar" performance
    - No strategy for object code compatibility
  • 13. EPIC design challenges
    - Develop architectures applicable to general-purpose computing
      - Find substantial parallelism in "difficult to parallelize" scalar programs
      - Provide compatibility across hardware generations
      - Support emerging applications (e.g. multimedia)
    - Compiler must find or create sufficient ILP
    - Combine the best attributes of VLIW and superscalar RISC (incorporate the best concepts from all available sources)
    - Scale architectures for modern single-chip implementation
  • 14. EPIC Processors, Intel's IA-64 ISA and Itanium
    - Joint R&D project by Hewlett-Packard and Intel (announced in June 1994)
    - This resulted in the explicitly parallel instruction computing (EPIC) design style:
      - ILP is specified explicitly in the machine code, that is, the parallelism is encoded directly into the instructions, similarly to VLIW;
      - a fully predicated instruction set;
      - an inherently scalable instruction set (i.e., the ability to scale to a large number of FUs);
      - many registers;
      - speculative execution of load instructions
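What "fully predicated" means can be sketched with a tiny interpreter. This is an illustrative sketch only: the instruction tuples and the helper names (`cmp_lt`, `mov`, `run_predicated`) are hypothetical and do not reproduce IA-64 syntax. Every instruction names a qualifying predicate and takes effect only if that predicate is true, so a short if/else can be executed without any branch.

```python
def cmp_lt(regs, preds, p_true, p_false, a, b):
    """Compare two registers and set a complementary pair of predicates."""
    preds[p_true] = regs[a] < regs[b]
    preds[p_false] = not preds[p_true]


def mov(regs, preds, dst, src):
    """Copy one register to another."""
    regs[dst] = regs[src]


def run_predicated(program, regs, preds):
    """Execute every instruction whose qualifying predicate is true."""
    for qp, op, args in program:
        if preds[qp]:
            op(regs, preds, *args)
        # otherwise the instruction is fetched but has no effect


def minimum(a, b):
    """Branch-free version of: r3 = r1 if r1 < r2 else r2."""
    regs = {"r1": a, "r2": b, "r3": 0}
    preds = [True] + [False] * 63          # predicate 0 is always true
    program = [
        (0, cmp_lt, (1, 2, "r1", "r2")),   # p1 = r1 < r2, p2 = not p1
        (1, mov, ("r3", "r1")),            # executes only if p1
        (2, mov, ("r3", "r2")),            # executes only if p2
    ]
    run_predicated(program, regs, preds)
    return regs["r3"]


print(minimum(3, 7))  # 3
print(minimum(9, 4))  # 4
```

Both `mov` instructions occupy issue slots on every run, which is the trade-off of predication: a possibly mispredicted branch is exchanged for some wasted work.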
  • 15. IA-64 Architecture
    - Unique architecture features and enhancements
      - Explicit parallelism and templates
      - Predication, speculation, memory support, and others
      - Floating-point and multimedia architecture
    - IA-64 resources available to applications
      - Large, application-visible register set
      - Rotating registers, register stack, register stack engine
    - IA-32 and PA-RISC compatibility models
  • 16. Today's Architecture Challenges
    - Performance barriers:
      - Memory latency
      - Branches
      - Loop pipelining and call / return overhead
    - Headroom constraints:
      - Hardware-based instruction scheduling
        - Unable to efficiently schedule parallel execution
      - Resource constrained
        - Too few registers
        - Unable to fully utilize multiple execution units
    - Scalability limitations:
      - Memory addressing efficiency
  • 17. Intel's IA-64 ISA
    - Intel 64-bit Architecture (IA-64) register model:
      - 128 64-bit general-purpose registers GR0-GR127 to hold values for integer and multimedia computations
        - each register has one additional NaT (Not a Thing) bit to indicate whether the stored value is valid,
      - 128 82-bit floating-point registers FR0-FR127
        - registers FR0 and FR1 are read-only with values +0.0 and +1.0,
      - 64 1-bit predicate registers PR0-PR63
        - the first register PR0 is read-only and always reads 1 (true),
      - 8 64-bit branch registers BR0-BR7 to specify the target addresses of indirect branches
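The role of the NaT bit in speculative loads can be sketched as follows. This is a simplified model, not IA-64 code: the `GR` class and the helpers `spec_load`, `add`, and `check` are hypothetical names, and a missing dictionary key stands in for a load that would have faulted. The idea shown is that a speculative load records a failure in the NaT bit instead of raising an exception, the bit propagates through dependent computations, and a later check decides whether recovery is needed.

```python
from dataclasses import dataclass


@dataclass
class GR:
    """A general-purpose register value with its NaT (Not a Thing) bit."""
    value: int = 0
    nat: bool = False


def spec_load(memory, addr):
    """Speculative load: defer a fault by setting NaT instead of raising."""
    if addr not in memory:
        return GR(0, nat=True)
    return GR(memory[addr])


def add(a, b):
    """NaT propagates: the result is invalid if either input is invalid."""
    return GR(a.value + b.value, nat=a.nat or b.nat)


def check(reg, recover):
    """Run the recovery code only if the speculated value turned out invalid."""
    return recover() if reg.nat else reg


memory = {0x10: 5}

good = add(spec_load(memory, 0x10), GR(1))
bad = add(spec_load(memory, 0x99), GR(1))

print(good)                              # GR(value=6, nat=False)
print(bad.nat)                           # True
print(check(bad, lambda: GR(-1)).value)  # -1
```

Because the failure is only recorded, the compiler can hoist a load above the branch that guards it and pay for recovery only when the speculated value is actually used.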
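  • 18. IA-64's Large Register File
    [Figure: register file diagram]
    - Branch registers: BR0-BR7, bits 63-0
    - Integer registers: GR0-GR127, bits 63-0 plus a NaT bit; GR0-GR31 are static (32), GR32-GR127 are stacked and rotating (96); GR0 holds 0
    - Predicate registers: PR0-PR63, 1 bit each; PR0-PR15 are static (16), PR16-PR63 are rotating (48); PR0 holds 1
    - Floating-point registers: 128 registers, bits 81-0; the first 32 are static, the remaining 96 are rotating; the first register holds 0.0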
  • 19. Intel's IA-64 ISA
    - IA-64 instructions are 41 bits long (some earlier sources state 40 bits) and consist of
      - an op-code,
      - a predicate field (6 bits),
      - two source register addresses (7 bits each),
      - a destination register address (7 bits), and
      - special fields (including those for integer and floating-point arithmetic).
    - The 6-bit predicate field in each IA-64 instruction refers to a set of 64 predicate registers.
    - 6 types of instructions
      - A: Integer ALU ==> I-unit or M-unit
      - I: Non-ALU integer ==> I-unit
      - M: Memory ==> M-unit
      - B: Branch ==> B-unit
      - F: Floating-point ==> F-unit
      - L: Long Immediate ==> I-unit
    - IA-64 instructions are packed into bundles by the compiler.
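The field widths listed above can be checked with a small encoder and decoder. This is a sketch under an assumed bit layout: the slide gives only the widths, so placing the predicate in the lowest 6 bits, followed by the destination and the two sources, and lumping the op-code and special fields into the remaining 14 bits, is an illustrative choice. The function and field names are hypothetical.

```python
INSTRUCTION_BITS = 41


def encode(op_special, dest, src1, src2, qp=0):
    """Pack the fields into one 41-bit integer (assumed layout, see lead-in)."""
    fields = [(op_special, 14), (src2, 7), (src1, 7), (dest, 7), (qp, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode(word):
    """Split a 41-bit integer back into its fields."""
    return {
        "qp": word & 0x3F,                     # 6-bit predicate field
        "dest": (word >> 6) & 0x7F,            # 7-bit destination register
        "src1": (word >> 13) & 0x7F,           # 7-bit source register
        "src2": (word >> 20) & 0x7F,           # 7-bit source register
        "op_special": (word >> 27) & 0x3FFF,   # remaining 14 bits
    }


# 14 + 7 + 7 + 7 + 6 = 41 bits in total
word = encode(op_special=0x2000, dest=3, src1=1, src2=2, qp=5)
print(word.bit_length())  # 41
print(decode(word))
```

Seven bits per register address are exactly enough to name 128 registers, and six predicate bits are exactly enough to name 64 predicate registers.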
  • 20. IA-64 Bundles
    - A bundle is a 128-bit long instruction word (LIW) containing three 41-bit IA-64 instructions along with a 5-bit template that contains instruction grouping information.
    - IA-64 aims to avoid much of the no-op padding of classic VLIW: the template lets dependent instructions share a bundle, so a no-op is needed only when no instruction of a suitable type is available for a slot.
    - The template explicitly indicates:
      - first 4 bits: the types of the instructions
      - last bit (stop bit): whether the bundle can be executed in parallel with the next bundle
      - (according to earlier literature): whether the instructions in the bundle can be executed in parallel or whether one or more must be executed serially.
    - Bundled instructions do not have to be in their original program order, and they can even represent entirely different paths of a branch.
    - The compiler can also mix dependent and independent instructions in a bundle, because the template keeps track of which is which.
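The bundle arithmetic (5 + 3 × 41 = 128 bits) can be verified with a short pack/unpack sketch. The function names are hypothetical; the layout assumed here places the template in the lowest 5 bits with instruction slots 0, 1, and 2 above it.

```python
TEMPLATE_BITS = 5
SLOT_BITS = 41
SLOTS = 3
SLOT_MASK = (1 << SLOT_BITS) - 1


def pack_bundle(template, instructions):
    """Combine a 5-bit template and three 41-bit instructions into 128 bits."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(instructions) != SLOTS:
        raise ValueError("a bundle holds exactly three instructions")
    bundle = template
    for i, instruction in enumerate(instructions):
        if not 0 <= instruction <= SLOT_MASK:
            raise ValueError("instruction must fit in 41 bits")
        bundle |= instruction << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit bundle back into its template and three instructions."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    instructions = [
        (bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK
        for i in range(SLOTS)
    ]
    return template, instructions


bundle = pack_bundle(0b01000, [0x123, 0x456, SLOT_MASK])
print(len(bundle.to_bytes(16, "little")))  # 16 bytes = 128 bits
print(unpack_bundle(bundle) == (0b01000, [0x123, 0x456, SLOT_MASK]))  # True
```

The template value used in the example is arbitrary; the sketch does not interpret it.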
  • 21. IA-64: Explicitly Parallel Architecture
    - The IA-64 template specifies
      - The type of operation for each instruction
        - MFI, MMI, MII, MLI, MIB, MMF, MFB, MMB, MBB, BBB
      - Intra-bundle relationship
        - M / MI or MI / I
      - Inter-bundle relationship
    - Most common combinations are covered by templates
      - Headroom for additional templates
    - Simplifies hardware requirements
    - Scales compatibly to future generations
    [Figure: bundle layout (128 bits): Instruction 2 (41 bits) | Instruction 1 (41 bits) | Instruction 0 (41 bits) | Template (5 bits)]
    Legend: M = Memory, F = Floating-point, I = Integer, L = Long Immediate, B = Branch
    Example: MMI = Memory (M), Memory (M), Integer (I)
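How a compiler might choose a template for three instructions can be sketched from the two lists in the slides. This is an illustration only: the template names are copied from the slide above, the rule that an A-type (integer ALU) instruction fits either an I or an M slot follows the instruction-type list in slide 19, and every other type is assumed to need a slot with its own letter. The function names are hypothetical, and stops within or between bundles are ignored.

```python
# Template names as listed on the slide, one letter per instruction slot.
TEMPLATES = ["MFI", "MMI", "MII", "MLI", "MIB", "MMF", "MFB", "MMB", "MBB", "BBB"]


def fits(instr_type, slot):
    """Return True if an instruction of this type may occupy this slot."""
    if instr_type == "A":             # integer ALU: I-unit or M-unit
        return slot in ("I", "M")
    return instr_type == slot         # assumed: other types need their own slot


def matching_templates(instr_types):
    """List the templates that can hold the three instruction types in order."""
    return [
        template
        for template in TEMPLATES
        if all(fits(t, slot) for t, slot in zip(instr_types, template))
    ]


print(matching_templates(["M", "A", "I"]))  # ['MMI', 'MII']
print(matching_templates(["A", "A", "B"]))  # ['MIB', 'MMB']
print(matching_templates(["F", "F", "F"]))  # []
```

An empty result, as for three floating-point instructions, means the instructions cannot share one bundle and would have to be spread over several.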