Computer Architecture Instruction-Level paraallel processors

•Download as PPT, PDF•

0 likes•75 views

Haris456

presentation

Software

ComputerComputer
ArchitectureArchitecture
Instruction-Level Parallel
Processors

Improve CPU performance by
• increasing clock rates
• (CPU running at even higher frequencies.)
• increasing the number of instructions to be executed in parallel
• (executing 6-10 instructions at the same time)
• What is the limit for these?

• Very long instruction word (VLIW) refers to processor
architectures designed to exploit instruction level
parallelism (ILP).
• VLIW processor allows programs to explicitly specify
instructions to execute at the same time, concurrently, in
parallel
VLIW (very long instruction
word,1024 bits!)

VLIW (very long instruction
word,1024 bits!)

Superscalar (sequential stream of
instructions)
superscalar processor is a CPU that
implements a form of parallelism called
instruction-level parallelism within a single
processor. It therefore allows for more
throughput (the number of instructions that
can be executed in a unit of time) than would
otherwise be possible at a given clock rate

Scalar Vs Superscalar
•In contrast to a scalar processor that can execute at
most one single instruction per clock cycle
• superscalar processor can execute more than one
instruction during a clock cycle by simultaneously
dispatching multiple instructions to different
execution units on the processor.

Superscalar (sequential stream of
instructions)

From Sequential instructions
to parallel execution
• Dependencies between instructions
• Instruction scheduling
• Preserving sequential consistency

Dependencies between instructions
Instructions often depend on each other in such a way that a particular
instruction cannot be executed until a preceding instruction or even two or
three preceding instructions have been executed.
1 Data dependencies
2 Control dependencies
3 Resource dependencies

Data dependencies
• Read after Write (RAW)
• Write after Read (WAR)
• Write after Write (WAW)
• Recurrences

Data dependencies in straight-line code
(RAW)
•RAW dependencies
•i1: load r1, a
•r2: add r2, r1, r1
•flow dependencies
•true dependencies
•cannot be abandoned

Data dependencies in straight-line code
(WAR)
•WAR dependencies
• i1: mul r1, r2, r3
• r2: add r2, r4, r5
•anti-dependencies
•false dependencies
•can be eliminated through register renaming
• i1: mul r1, r2, r3
• r2: add r6, r4, r5
• by using compiler or ILP-processor

Data dependencies in straight-line code
(WAW)
•WAW dependencies
• i1: mul r1, r2, r3
• r2: add r1, r4, r5
•output dependencies
•false dependencies
•can be eliminated through register renaming
• i1: mul r1, r2, r3
• r2: add r6, r4, r5
• by using compiler or ILP-processor

Data dependencies in loops
(recurrences)
for (int i=2; i<10; i++) {
x[i] = a*x[i-1] + b
}
• cannot be executed in parallel
S1. if (a == b)
S2. a = a + b
S3. b = a + b
Data dependencies in conditional
statement

Data dependency graphs
• i1: load r1, a;
• i2: load r2, b;
• i3: add r3, r1, r2; RAW -> δt
• i4: mul r1, r2, r4; WAR -> δa
• i5: div r1, r2, r4; WAW -> δo
i1 i2
i3
i4
i5
δt δt
δa
δo

Control dependencies
mul r1, r2, r3
jz zproc
:
zproc: load r1, x
:
•actual path of execution depends on the outcome
of multiplication
•impose dependencies on the logical subsequent
instructions

Resource dependencies
•An instruction is resource-dependent on a
previously issued instruction if it requires a
hardware resource which is still being used by a
previously issued instruction.
•e.g.
• div r1, r2, r3
• div r4, r2, r5

What's hot

Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...HostedbyConfluent

ApacheCon2019 Talk: Kafka, Cassandra and Kubernetesat Scale – Real-time Ano...Paul Brebner

Apache kafkaLong Nguyen

WSO2Con USA 2015: Deployment Patterns and Capacity PlanningWSO2

Streaming and MessagingXin Wang

Topic and schema management-meetupberlinconfluent

How to manage large amounts of data with akka streamsIgor Mielientiev

Serverless log analytics with Amazon KinesisRob Greenwood

Deploying Kafka at Dropbox, Mark Smith, Sean Fellowsconfluent

Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...Spark Summit

Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Exampleconfluent

Building your own Distributed System The easy way - Cassandra Summit EU 2014Kévin LOVATO

Vitalii Korzh - "Exciting Migrations"LogeekNightUkraine

Openzipkin conf: Zipkin at YelpPrateek Agarwal

KELK Stack on AWSSteamhaus

How to Lock Down Apache Kafka and Keep Your Streams Safeconfluent

The Many Faces of Apache Kafka: Leveraging real-time data at scaleNeha Narkhede

Apache Kafka® at Dropboxconfluent

SignalFx Kafka Consumer OptimizationSignalFx

Kafka Streams: The Stream Processing Engine of Apache KafkaEno Thereska

What's hot (20)

Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...

ApacheCon2019 Talk: Kafka, Cassandra and Kubernetesat Scale – Real-time Ano...

Apache kafka

WSO2Con USA 2015: Deployment Patterns and Capacity Planning

Streaming and Messaging

Topic and schema management-meetupberlin

How to manage large amounts of data with akka streams

Serverless log analytics with Amazon Kinesis

Deploying Kafka at Dropbox, Mark Smith, Sean Fellows

Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...

Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example

Building your own Distributed System The easy way - Cassandra Summit EU 2014

Vitalii Korzh - "Exciting Migrations"

Openzipkin conf: Zipkin at Yelp

KELK Stack on AWS

How to Lock Down Apache Kafka and Keep Your Streams Safe

The Many Faces of Apache Kafka: Leveraging real-time data at scale

Apache Kafka® at Dropbox

SignalFx Kafka Consumer Optimization

Kafka Streams: The Stream Processing Engine of Apache Kafka

Similar to Computer Architecture Instruction-Level paraallel processors

2. ILP Processors.pptShifaZahra7

14 superscalardilip kumar

VLIW(Very Long Instruction Word)Pragnya Dash

13 superscalarHammad Farooq

Lec1 finalGichelle Amon

13_Superscalar.pptLavleshkumarBais

Instruction Level Parallelism and Superscalar ProcessorsSyed Zaid Irshad

The Challenges facing Libraries and Imperative Languages from Massively Paral...Jason Hearne-McGuiness

Superscalar processornoor ul ain

Lect 1.pptxJimingKang

Optimization TechniquesJoud Khattab

Load Balancingoptalink

Instruction Level Parallelism (ILP) LimitationsJose Pinilla

13 riscSher Shah Merkhel

AP4R on RubyKaigi2007 (English only)Kato Kiwamu

14 superscalarAnwal Mirza

Cassandra Day SV 2014: Spark, Shark, and Apache CassandraDataStax Academy

Network Functions Virtualization and CloudStackChiradeep Vittal

Intro to Apache Apex @ Women in Big DataApache Apex

Vliw or epicAmit Kumar Rathi

Similar to Computer Architecture Instruction-Level paraallel processors (20)

2. ILP Processors.ppt

14 superscalar

VLIW(Very Long Instruction Word)

13 superscalar

Lec1 final

13_Superscalar.ppt

Instruction Level Parallelism and Superscalar Processors

The Challenges facing Libraries and Imperative Languages from Massively Paral...

Superscalar processor

Lect 1.pptx

Optimization Techniques

Load Balancing

Instruction Level Parallelism (ILP) Limitations

13 risc

AP4R on RubyKaigi2007 (English only)

14 superscalar

Cassandra Day SV 2014: Spark, Shark, and Apache Cassandra

Network Functions Virtualization and CloudStack

Intro to Apache Apex @ Women in Big Data

Vliw or epic

Recently uploaded

BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp

Asset Management Software - InfographicHr365.us smith

Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden

Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01

Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran

The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171

EY_Graph Database Powered SustainabilityNeo4j

KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app

chapter--4-software-project-planning.pptkotipi9215

Implementing Zero Trust strategy with AzureDinusha Kumarasiri

XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan

Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh

cybersecurity notes for mca students for learningVitsRangannavar

Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110

Professional Resume Template for Software DevelopersVinodh Ram

What is Fashion PLM and Why Do You Need ItWave PLM

What is Binary Language? Computer Number SystemsJheuzeDellosa

(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700

Recently uploaded (20)

BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE

Asset Management Software - Infographic

Engage Usergroup 2024 - The Good The Bad_The Ugly

Cloud Management Software Platforms: OpenStack

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...

Intelligent Home Wi-Fi Solutions | ThinkPalm

The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf

EY_Graph Database Powered Sustainability

KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx

chapter--4-software-project-planning.ppt

Implementing Zero Trust strategy with Azure

XpertSolvers: Your Partner in Building Innovative Software Solutions

Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...

cybersecurity notes for mca students for learning

Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...

Professional Resume Template for Software Developers

What is Fashion PLM and Why Do You Need It

What is Binary Language? Computer Number Systems

(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...

Computer Architecture Instruction-Level paraallel processors

1. ComputerComputer ArchitectureArchitecture Instruction-Level Parallel Processors

2. Improve CPU performance by • increasing clock rates • (CPU running at even higher frequencies.) • increasing the number of instructions to be executed in parallel • (executing 6-10 instructions at the same time) • What is the limit for these?

3. Pipeline (assembly line)

4. Result of pipeline (e.g.)

5. • Very long instruction word (VLIW) refers to processor architectures designed to exploit instruction level parallelism (ILP). • VLIW processor allows programs to explicitly specify instructions to execute at the same time, concurrently, in parallel VLIW (very long instruction word,1024 bits!)

6. VLIW (very long instruction word,1024 bits!)

7. Superscalar (sequential stream of instructions) superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. It therefore allows for more throughput (the number of instructions that can be executed in a unit of time) than would otherwise be possible at a given clock rate

8. Scalar Vs Superscalar •In contrast to a scalar processor that can execute at most one single instruction per clock cycle • superscalar processor can execute more than one instruction during a clock cycle by simultaneously dispatching multiple instructions to different execution units on the processor.

9. Superscalar (sequential stream of instructions)

10.

11. Flynn's taxonomy,

12. From Sequential instructions to parallel execution • Dependencies between instructions • Instruction scheduling • Preserving sequential consistency

13. Dependencies between instructions Instructions often depend on each other in such a way that a particular instruction cannot be executed until a preceding instruction or even two or three preceding instructions have been executed. 1 Data dependencies 2 Control dependencies 3 Resource dependencies

14. Data dependencies • Read after Write (RAW) • Write after Read (WAR) • Write after Write (WAW) • Recurrences

15. Data dependencies in straight-line code (RAW) •RAW dependencies •i1: load r1, a •r2: add r2, r1, r1 •flow dependencies •true dependencies •cannot be abandoned

16. Data dependencies in straight-line code (WAR) •WAR dependencies • i1: mul r1, r2, r3 • r2: add r2, r4, r5 •anti-dependencies •false dependencies •can be eliminated through register renaming • i1: mul r1, r2, r3 • r2: add r6, r4, r5 • by using compiler or ILP-processor

17. Data dependencies in straight-line code (WAW) •WAW dependencies • i1: mul r1, r2, r3 • r2: add r1, r4, r5 •output dependencies •false dependencies •can be eliminated through register renaming • i1: mul r1, r2, r3 • r2: add r6, r4, r5 • by using compiler or ILP-processor

18.

19. Data dependencies in loops (recurrences) for (int i=2; i<10; i++) { x[i] = a*x[i-1] + b } • cannot be executed in parallel S1. if (a == b) S2. a = a + b S3. b = a + b Data dependencies in conditional statement

20. Data dependency graphs • i1: load r1, a; • i2: load r2, b; • i3: add r3, r1, r2; RAW -> δt • i4: mul r1, r2, r4; WAR -> δa • i5: div r1, r2, r4; WAW -> δo i1 i2 i3 i4 i5 δt δt δa δo

21. Control dependencies mul r1, r2, r3 jz zproc : zproc: load r1, x : •actual path of execution depends on the outcome of multiplication •impose dependencies on the logical subsequent instructions

22. Control Dependency Graph

23. Resource dependencies •An instruction is resource-dependent on a previously issued instruction if it requires a hardware resource which is still being used by a previously issued instruction. •e.g. • div r1, r2, r3 • div r4, r2, r5

Computer Architecture Instruction-Level paraallel processors

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Computer Architecture Instruction-Level paraallel processors

Similar to Computer Architecture Instruction-Level paraallel processors (20)

More from Haris456

More from Haris456 (11)

Recently uploaded

Recently uploaded (20)

Computer Architecture Instruction-Level paraallel processors