Welcome!
Today, we will present an interesting topic.
So, just enjoy our presentation.
Any questions will be answered at the end.
By: Ibrahim Hassan
Core Pipelining
Computer Architecture
Agenda
01 Overview
Get a clear overview of pipelining
02 What is pipelining?
Get the meaning and description of the concept
03 How can pipelining improve performance?
Learn how it can be useful
04 Data hazards and two ways to reduce them
Problems that we meet when using pipelining
Overview
• Pipelining is widely used in modern processors.
• Pipelining improves system performance in terms of throughput.
• Pipelined organization requires sophisticated compilation techniques.
What is pipelining?
• Pipelining is the process of accumulating computer instructions and tasks and executing them in the processor through a logical pipeline.
• It allows storing, prioritizing, managing and executing tasks and instructions in an orderly process within a single processor.
How can pipelining improve performance?
First, we need to remember what determines performance.
• Recall that performance is a function of:
  • CPI: cycles per instruction
  • Clock cycle time
  • Instruction count
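The three factors above combine in the classic relation CPU time = instruction count × CPI × clock cycle time. A minimal sketch of that relation follows. The workload numbers are made up for illustration and are not from the slides, and the pipelined CPI of 1 is an idealized assumption (no stalls).

```python
def cpu_time(instruction_count, cpi, clock_cycle_seconds):
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_seconds


# Hypothetical workload: 1 million instructions, 1 ns clock cycle.
INSTRUCTIONS = 1_000_000
CYCLE = 1e-9

# Assumption: a 4-step unpipelined machine needs 4 cycles per instruction,
# while an ideal 4-stage pipeline approaches 1 cycle per instruction.
unpipelined = cpu_time(INSTRUCTIONS, cpi=4, clock_cycle_seconds=CYCLE)
pipelined = cpu_time(INSTRUCTIONS, cpi=1, clock_cycle_seconds=CYCLE)

print(f"unpipelined: {unpipelined:.6f} s, pipelined: {pipelined:.6f} s")
```

Pipelining leaves the instruction count and the clock cycle unchanged here; the gain comes from the lower effective CPI.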
Cont.
• Pipelining therefore allows higher CPU throughput (the number of instructions that can be executed in a unit of time) than would otherwise be possible at a given clock rate.
• The basic instruction cycle is broken up into a series of steps called a pipeline. Rather than processing each instruction sequentially (finishing one instruction before starting the next), the processor works on different steps of several instructions at the same time.
Traditional Pipeline Concept
Laundry Example
• Ann, Brian, Cathy and Dave each have one load of clothes to wash, dry, and fold.
• Washer takes 30 minutes
• Dryer takes 40 minutes
• Folder takes 20 minutes
[Figure: the four loads, labelled A, B, C and D]
Cont.
[Figure: sequential laundry timeline from 6 PM to midnight; loads A, B, C and D run one after another, each taking 30 + 40 + 20 minutes]
• Sequential laundry takes 6 hours for 4 loads.
• If they learned pipelining, how long would laundry take?
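Using pipelining
[Figure: pipelined laundry timeline (time axis from 6 PM to 9 PM); loads A, B, C and D in task order; successive intervals of 30, 40, 40, 40, 40 and 20 minutes]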
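The laundry timings can be checked with a few lines of arithmetic. The sketch below assumes that every load goes through the same three stages and that a finished load can wait between stages until the next machine is free. The function names are illustrative.

```python
def sequential_minutes(stage_minutes, loads):
    """Each load finishes all stages before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_minutes(stage_minutes, loads):
    """The first load passes through every stage; each later load
    finishes one slowest-stage interval after the previous one."""
    return sum(stage_minutes) + (loads - 1) * max(stage_minutes)


stages = [30, 40, 20]  # washer, dryer, folder (minutes, from the slide)

print(sequential_minutes(stages, 4))  # 360 minutes = 6 hours
print(pipelined_minutes(stages, 4))   # 210 minutes = 3.5 hours
```

The pipelined total matches the intervals in the figure: 30 + 40 + 40 + 40 + 40 + 20 = 210 minutes. The 40-minute dryer, the slowest stage, sets the pace.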
Does anyone want to participate?
Raise your hand!
Instruction Execution Cycle
01 Fetch
Read the instruction from memory.
02 Decode
Decode the instruction and fetch the source operands.
03 Execute
Perform the operation specified by the instruction.
04 Store
Store the result in the destination location.
Applying pipelining in CPU
Clock cycle   1     2     3     4     5     6     7
I1            F1    D1    E1    W1
I2                  F2    D2    E2    W2
I3                        F3    D3    E3    W3
I4                              F4    D4    E4    W4
(a) Instruction execution divided into four steps
F: Fetch instruction
D: Decode instruction and fetch operands
E: Execute operation
W: Write results
Interstage buffers: B1, B2, B3
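The staircase pattern in the timing diagram can be generated programmatically. The sketch below assumes an ideal pipeline in which every stage takes exactly one clock cycle and nothing stalls. The function name `pipeline_rows` is illustrative.

```python
def pipeline_rows(num_instructions, stages=("F", "D", "E", "W")):
    """Return one row per instruction; each row lists what that
    instruction does in every clock cycle ('' means not in the pipeline)."""
    total_cycles = num_instructions + len(stages) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            # Instruction i enters stage s in cycle i + s (0-based).
            row[i + s] = f"{name}{i + 1}"
        rows.append(row)
    return rows


rows = pipeline_rows(4)
for number, row in enumerate(rows, start=1):
    print(f"I{number}: " + " ".join(cell.ljust(3) for cell in row))
```

With 4 instructions and 4 stages, the rows span 4 + 4 - 1 = 7 clock cycles, which agrees with the diagram.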
Problems that can occur during pipelining
• Each pipeline stage is expected to complete in one clock cycle.
• The clock period should be long enough to let the slowest pipeline stage complete.
• Faster stages can only wait for the slowest one to complete.
• Main memory is very slow compared to execution, so if each instruction had to be fetched from main memory, the fetch stage would hold up the whole pipeline.
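The effect of the slowest stage on the clock period can be illustrated as follows. The stage delays below are hypothetical values chosen for the example and are not taken from the slides.

```python
# Hypothetical stage delays in nanoseconds (assumed for illustration).
stage_delays_ns = {"F": 2.0, "D": 1.0, "E": 1.5, "W": 1.0}


def clock_period(delays):
    """The clock must be long enough for the slowest stage."""
    return max(delays.values())


def idle_time_per_cycle(delays):
    """Time each stage spends waiting for the slowest one, per cycle."""
    period = clock_period(delays)
    return {stage: period - delay for stage, delay in delays.items()}


print(clock_period(stage_delays_ns))
print(idle_time_per_cycle(stage_delays_ns))
```

In this example the fetch stage sets the clock period, and every other stage sits idle for part of each cycle.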
Let’s see the pipeline performance!
[Figure: timing diagram of instructions I1 to I5 (stages F, D, E, W) over clock cycles 1 to 8]
Effect of an execution operation taking more than one clock cycle.
So, what are the problems?
• The previous pipeline is said to have been stalled for more than one clock cycle.
• Any condition that causes a pipeline to stall is called a hazard.
• Data hazard: any condition in which either the source or the destination operands of an instruction are not available at the time expected in the pipeline.
• Instruction (control) hazard: a delay in the availability of an instruction causes the pipeline to stall (for example, a cache miss).
• Structural hazard: the situation when two instructions require the use of a given hardware resource at the same time.
Instruction hazard
Clock cycle   1     2     3     4     5     6     7     8     9
I1            F1    D1    E1    W1
I2                  F2    F2    F2    F2    D2    E2    W2
I3                                          F3    D3    E3    W3
(a) Instruction execution steps in successive clock cycles

Clock cycle   1     2     3     4     5     6     7     8     9
F: Fetch      F1    F2    F2    F2    F2    F3
D: Decode           D1    idle  idle  idle  D2    D3
E: Execute                E1    idle  idle  idle  E2    E3
W: Write                        W1    idle  idle  idle  W2    W3
(b) Function performed by each processor stage in successive clock cycles

Instruction hazard (cache miss)
The Decode unit is idle in cycles 3 through 5, the Execute unit is idle in cycles 4 through 6, and the Write unit is idle in cycles 5 through 7. Such idle periods are called stalls.
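The idle cycles listed above can be reproduced with a small scheduling sketch. It assumes an in-order four-stage pipeline in which only the fetch of I2 is slow (4 cycles, standing in for the cache miss) and every other stage takes one cycle. The function names are illustrative.

```python
STAGES = ["F", "D", "E", "W"]


def schedule(fetch_cycles):
    """Return {stage: [(start, end), ...]} with one interval per instruction.

    fetch_cycles[i] is how many cycles the fetch of instruction i takes;
    all other stages are assumed to take exactly one cycle.
    """
    busy = {stage: [] for stage in STAGES}
    for fetch_length in fetch_cycles:
        previous_end = 0
        for stage in STAGES:
            length = fetch_length if stage == "F" else 1
            # The unit is free one cycle after its previous user finished.
            unit_free = busy[stage][-1][1] + 1 if busy[stage] else 1
            start = max(previous_end + 1, unit_free)
            end = start + length - 1
            busy[stage].append((start, end))
            previous_end = end
    return busy


def idle_cycles(intervals):
    """Cycles between a unit's first and last use in which it does nothing."""
    used = {c for start, end in intervals for c in range(start, end + 1)}
    return [c for c in range(min(used), max(used) + 1) if c not in used]


busy = schedule([1, 4, 1])  # I1, I2 (cache miss on fetch), I3
for stage in STAGES:
    print(stage, busy[stage], "idle:", idle_cycles(busy[stage]))
```

The stalls come out as cycles 3 to 5 for Decode, 4 to 6 for Execute and 5 to 7 for Write, the same idle periods as in part (b) of the figure.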
Data Hazards
We must ensure that the results obtained when instructions are executed in a pipelined processor are identical to those obtained when the same instructions are executed sequentially.
Hazard occurs:
A ← 3 + A
B ← 4 × A
No hazard:
A ← 5 × C
B ← 20 + C
When two operations depend on each other, they must be executed sequentially in the correct order.
Another example:
Mul R2, R3, R4
Add R5, R4, R6
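A dependence like the one between Mul and Add can be detected mechanically. The sketch below assumes that the last operand of each instruction is the destination and the others are sources, which the slides do not state explicitly. The helper names are illustrative.

```python
def parse(instruction):
    """Split 'Mul R2, R3, R4' into (opcode, sources, destination).

    Assumption: the last operand is the destination register.
    """
    opcode, operand_text = instruction.split(None, 1)
    registers = [r.strip() for r in operand_text.split(",")]
    return opcode, registers[:-1], registers[-1]


def has_data_hazard(first, second):
    """True if `second` reads the register that `first` writes."""
    _, _, destination = parse(first)
    _, sources, _ = parse(second)
    return destination in sources


print(has_data_hazard("Mul R2, R3, R4", "Add R5, R4, R6"))  # True
print(has_data_hazard("Mul R2, R3, R4", "Add R5, R7, R6"))  # False
```

Under that assumption, Add reads R4 before Mul has written it unless the pipeline stalls or forwards the result.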
Handling Data Hazards
• Forwarding
• NOPs
Forwarding
• Instead of reading from the register file, the second instruction can get the data directly from the output of the ALU after the previous instruction is completed.
• A special arrangement needs to be made to “forward” the output of the ALU to the input of the ALU.
NOPs
• Let the compiler detect and handle the hazard:
I1: Mul R2, R3, R4
NOP
NOP
I2: Add R5, R4, R6
• The compiler can reorder the instructions to perform some useful work during the NOP slots.
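The compiler-side fix can be sketched as a pass that pads dependent instructions with NOPs. The sketch assumes, as in the example, that two slots must separate a producer from its consumer and that the last operand is the destination. `insert_nops` is a hypothetical helper, not a real compiler API.

```python
def operands(instruction):
    """Return the register operands of e.g. 'Mul R2, R3, R4'."""
    return [r.strip() for r in instruction.split(None, 1)[1].split(",")]


def insert_nops(program, gap=2):
    """Pad with NOPs so that `gap` slots (gap >= 1) separate an instruction
    from any later instruction that reads its destination register."""
    output = []
    for instruction in program:
        sources = operands(instruction)[:-1]
        # Scan the most recent `gap` slots, nearest first.
        for back, earlier in enumerate(reversed(output[-gap:])):
            if earlier != "NOP" and operands(earlier)[-1] in sources:
                output.extend(["NOP"] * (gap - back))
                break
        output.append(instruction)
    return output


print(insert_nops(["Mul R2, R3, R4", "Add R5, R4, R6"]))
print(insert_nops(["Mul R2, R3, R4", "Sub R7, R8, R9", "Add R5, R4, R6"]))
```

In the second call the independent Sub instruction fills one of the slots, so only one NOP is needed. This is the "useful work" that reordering can provide.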
Side Effects
• The previous example is explicit and easily detected.
• Sometimes an instruction changes the contents of a register other than the one named as the destination.
• When a location other than one explicitly named in an instruction as a destination operand is affected, the instruction is said to have a side effect.
• Example (condition code flags):
Add R1, R3
AddWithCarry R2, R4
• Instructions designed for execution on pipelined hardware should have few side effects.
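The hidden dependence in the Add / AddWithCarry pair can be made visible by modelling the carry flag explicitly. The sketch below assumes 32-bit registers and the usual meaning of these instructions (Add sets the carry flag, AddWithCarry consumes it). The slides do not spell out these semantics, so treat them as assumptions.

```python
MASK_32 = 0xFFFFFFFF  # assumed register width: 32 bits


def add(a, b):
    """Return (result, carry); the carry flag is the side effect."""
    total = a + b
    return total & MASK_32, total >> 32


def add_with_carry(a, b, carry_in):
    """Reads the carry flag even though no operand names it."""
    total = a + b + carry_in
    return total & MASK_32, total >> 32


# Two-word addition: the low words are added first, the high words second.
low, carry = add(0xFFFFFFFF, 0x00000001)
high, _ = add_with_carry(0x00000000, 0x00000000, carry)

print(hex(high), hex(low))
```

The second instruction depends on the first only through the carry flag, so a hazard check that looks only at the named registers would miss it.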
Any questions?
References
• Computer Organization and Design, 5th Edition. Morgan Kaufmann.
• https://cs.stanford.edu/people/eroberts/courses/soco/projects/risc/pipelining/index.html
Thank you
