Pipelining and co processor.

What is Pipelining
• In simple words, pipelining means starting the execution of the second instruction before the first has completed.

Overview
• Pipelining is widely used in modern processors.
• Pipelining improves system performance in terms of throughput.
• Pipelined organization requires sophisticated compilation techniques.

Basic Concept
Faster Execution
Multi Tasking

Making the Execution of Programs Faster
• Use faster circuit technology to build the processor and the main memory.
• Arrange the hardware so that more than one operation can be performed at the same time.
• In the latter approach, the number of operations performed per second is increased even though the elapsed time needed to perform any one operation is not changed.

Traditional Pipeline Concept
Laundry Example
• A, B, C, and D each have one load of clothes to wash, dry, and fold.
• The “Washer” takes 30 minutes.
• The “Dryer” takes 40 minutes.
• The “Folder” takes 20 minutes.

• Sequential laundry takes 6 hours for 4 loads.
• If they learned pipelining, how long would laundry take?
[Figure: sequential laundry timeline from 6 PM to midnight. Loads A, B, C, D run one after another in task order, each taking 30 + 40 + 20 minutes.]

• Pipelined laundry takes 3.5 hours for 4 loads.
[Figure: pipelined laundry timeline from 6 PM to 9:30 PM. Loads A, B, C, D overlap in task order; the time axis is divided into 30, 40, 40, 40, 40, and 20 minutes.]
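
The laundry numbers above can be checked with a short simulation. This is only a sketch: the function names are made up for illustration, and it assumes each machine handles one load at a time and that a finished load can wait between machines.

```python
def sequential_time(stage_minutes, loads):
    """Each load finishes all stages before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_time(stage_minutes, loads):
    """A load enters a stage as soon as it is ready and the stage is free."""
    stage_free_at = [0] * len(stage_minutes)  # minute at which each machine becomes free
    for _ in range(loads):
        ready = 0  # minute at which this load is ready for its next stage
        for s, minutes in enumerate(stage_minutes):
            start = max(ready, stage_free_at[s])
            ready = start + minutes
            stage_free_at[s] = ready
    return stage_free_at[-1]


stages = [30, 40, 20]  # washer, dryer, folder (minutes)
print(sequential_time(stages, 4) / 60)  # 6.0 hours
print(pipelined_time(stages, 4) / 60)   # 3.5 hours
```

The pipelined total is set by the slowest stage: after the first load leaves the washer, a new load can finish only every 40 minutes because the dryer is the bottleneck.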

Use the Idea of Pipelining in a Computer
Fetch + Execution

Figure: Basic idea of instruction pipelining.

(a) Sequential execution (time runs left to right)
I1: F1 E1, then I2: F2 E2, then I3: F3 E3

(c) Pipelined execution
Clock cycle   1    2    3    4
I1            F1   E1
I2                 F2   E2
I3                      F3   E3
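
The pipelined fetch/execute timing in the figure can be reproduced with a few lines of code. This is a minimal sketch of an ideal pipeline in which every stage takes exactly one clock cycle; `pipeline_schedule` is a hypothetical helper name, not something taken from the slides.

```python
def pipeline_schedule(num_instructions, stages=("F", "E")):
    """Map each clock cycle to the stage steps active in it (ideal pipeline)."""
    schedule = {}
    for i in range(num_instructions):
        for s, name in enumerate(stages):
            cycle = i + s + 1  # instruction i enters stage s in cycle i + s + 1
            schedule.setdefault(cycle, []).append(f"{name}{i + 1}")
    return schedule


for cycle, active in sorted(pipeline_schedule(3).items()):
    print(cycle, " ".join(active))
# 1 F1
# 2 E1 F2
# 3 E2 F3
# 4 E3
```

Three instructions finish in 4 clock cycles, whereas running F and E strictly one after another would need 6.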

Role of Cache Memory
• Each pipeline stage is expected to complete in one clock cycle.
• The clock period must be long enough for the slowest pipeline stage to complete.
• Faster stages can only wait for the slowest one to complete.
• Main memory is very slow compared to instruction execution, so if every instruction had to be fetched from main memory, the pipeline would be almost useless.
• Fortunately, the cache memory supplies instructions much faster than main memory.

Pipeline Performance
• The potential increase in performance resulting from pipelining is proportional to the number of pipeline stages.
• However, this increase would be achieved only if all pipeline stages required the same time to complete and there were no interruptions throughout program execution.
• Unfortunately, these conditions do not hold in practice.

Figure 8.4. Pipeline stall caused by a cache miss in F2.

(a) Instruction execution steps in successive clock cycles
Clock cycle   1    2    3    4    5    6    7    8    9
I1            F1   D1   E1   W1
I2                 F2   F2   F2   F2   D2   E2   W2
I3                                     F3   D3   E3   W3

(b) Function performed by each processor stage in successive clock cycles
Clock cycle   1    2    3    4    5    6    7    8    9
F: Fetch      F1   F2   F2   F2   F2   F3
D: Decode          D1   idle idle idle D2   D3
E: Execute              E1   idle idle idle E2   E3
W: Write                     W1   idle idle idle W2   W3

Idle periods are called stalls (bubbles).
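
The stall in Figure 8.4 can be reproduced with a small in-order pipeline model. This is a simplified sketch, not a description of any real processor: `completion_cycles` is a hypothetical name, and the model assumes an instruction enters a stage as soon as it has left the previous stage and the stage itself is free.

```python
def completion_cycles(latencies):
    """latencies[i][s] = clock cycles instruction i spends in stage s.

    Returns the clock cycle in which each instruction leaves the last stage.
    """
    num_stages = len(latencies[0])
    stage_busy_until = [0] * num_stages  # last cycle in which each stage was occupied
    done = []
    for instr in latencies:
        cycle = 0  # last cycle used by this instruction so far
        for s, cycles in enumerate(instr):
            start = max(cycle, stage_busy_until[s]) + 1
            cycle = start + cycles - 1
            stage_busy_until[s] = cycle
        done.append(cycle)
    return done


ideal = [[1, 1, 1, 1]] * 3                             # F, D, E, W take one cycle each
miss = [[1, 1, 1, 1], [4, 1, 1, 1], [1, 1, 1, 1]]      # F2 takes 4 cycles (cache miss)
print(completion_cycles(ideal))  # [4, 5, 6]
print(completion_cycles(miss))   # [4, 8, 9]
```

The three extra cycles spent in F2 delay both I2 and I3, so the last instruction completes in cycle 9 instead of cycle 6.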

Figure 8.5. Effect of a Load instruction on pipeline timing.
[Figure: instruction timing over clock cycles 1 to 7. I1: F1 D1 E1 W1. I2 (Load X(R1), R2): F2 D2 E2 M2 W2. I3: F3 D3 E3 W3. I4: F4 D4 E4. I5: F5 D5. The figure is labelled "Structural hazard".]

• Again, pipelining does not result in individual instructions being executed faster; rather, it is the throughput that increases.
• Throughput is measured by the rate at which instruction execution is completed.
• Pipeline stalls degrade pipeline performance.
• We need to identify all hazards that may cause the pipeline to stall and find ways to minimize their impact.
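
The effect of stalls on throughput can be put into numbers. The sketch below assumes an ideal k-stage pipeline that needs k + (n - 1) clock cycles for n instructions, plus any stall cycles, and an unpipelined baseline of k cycles per instruction; the function names are illustrative only.

```python
def total_cycles(n, k, stall_cycles=0):
    """Clock cycles to complete n instructions on a k-stage pipeline."""
    return k + (n - 1) + stall_cycles


def throughput(n, k, stall_cycles=0):
    """Instructions completed per clock cycle."""
    return n / total_cycles(n, k, stall_cycles)


def speedup(n, k, stall_cycles=0):
    """Speedup over an unpipelined machine that needs k cycles per instruction."""
    return (n * k) / total_cycles(n, k, stall_cycles)


print(throughput(3, 4))                   # 0.5 instructions per cycle, no stalls
print(throughput(3, 4, stall_cycles=3))   # about 0.33 with three stall cycles
print(speedup(1000, 4))                   # close to, but below, the stage count 4
```

For a long instruction stream without stalls, the speedup approaches the number of stages, while every stall cycle lowers the completion rate.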

Pipeline Hazards
• There are situations, called hazards, that prevent the next instruction in the instruction stream from executing during its designated cycle.
• There are three classes of hazards:
  - Structural hazard
  - Data hazard
  - Branch hazard

Pipeline Hazards
• Structural hazard
  - Resource conflicts that arise when the hardware cannot support all possible combinations of instructions simultaneously
• Data hazard
  - An instruction depends on the result of a previous instruction
• Branch hazard
  - Caused by instructions that change the PC

Pipeline Stall
• When a hazard prevents an instruction step from happening, the processor pauses execution of that step until the hazard is resolved.
• Pipeline stalls slow the execution of an instruction, but do not prevent it from executing correctly.
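
As an illustration of how a data hazard is recognised before a stall is inserted, the sketch below flags instructions that read a register written by a recent instruction. Everything here is hypothetical: the `Instr` record, the register names, and the `window` parameter, which stands in for how many following instructions can be affected in a given pipeline.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    dest: Optional[str]      # register written, or None
    srcs: Tuple[str, ...]    # registers read


def raw_hazards(program, window=1):
    """Return (producer, consumer, register) for read-after-write dependences
    between an instruction and the `window` instructions before it."""
    hazards = []
    for j, consumer in enumerate(program):
        for i in range(max(0, j - window), j):
            producer = program[i]
            if producer.dest is not None and producer.dest in consumer.srcs:
                hazards.append((i, j, producer.dest))
    return hazards


program = [
    Instr("R3", ("R1", "R2")),  # R3 <- R1 op R2
    Instr("R5", ("R3", "R4")),  # reads R3 before it has been written back
    Instr("R6", ("R1", "R2")),  # independent
]
print(raw_hazards(program))  # [(0, 1, 'R3')]
```

A pipeline that detects such a dependence delays the consumer until the producer's result is available, which slows execution but keeps the result correct.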

WHAT IS A CO-PROCESSOR
• A computer co-processor is a processor used to supplement the functions of the primary processor.
• Co-processors were first seen on mainframe computers.
• They accelerate system performance.

HISTORY OF CO-PROCESSOR
• Co-processors for floating-point arithmetic first appeared in desktop computers in the 1970s.
• Coprocessors became common in the 1980s and into the early 1990s.
• Early 8-bit and 16-bit processors used software to carry out floating-point arithmetic operations.
• Math co-processors were a popular purchase for users of computer-aided design (CAD) software and for scientific and engineering calculations.

OPERATIONS PERFORMED BY COPROCESSOR
• Floating-point arithmetic
• Graphics and signal processing
• String processing
• Encryption
• Coprocessors are unable to fetch code from memory, so they work under the control of the main processor.

INTEL 8087
• Numeric processor.
• Packaged in a 40-pin ceramic DIP.
• Available in 5 MHz, 8 MHz, and 10 MHz versions compatible with the 8086, 8088, 80186, and 80188.
• It adds 68 new instructions to the instruction set of the 8086.

How it works
• 8087 instructions may be interleaved in the 8086 program. It is the task of the 8086 to identify the 8087 instructions in the program and send them to the 8087 for execution; after the execution cycle completes, the result may be passed back to the CPU.
• Operation of the 8087 does not require any software support from the system software or operating system.

Two major sections:
1) Control unit
2) Numeric Execution unit

Control Unit
Functions:
• It interfaces the coprocessor to the microprocessor-system data bus.
• It monitors the instruction stream.
• If the instruction is an ESCape (coprocessor) instruction, the coprocessor executes it; if not, the microprocessor executes it.
• It receives and decodes instructions, reads and writes memory operands, and executes the 8087 control instructions.
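
The check for an ESCape instruction can be sketched in a few lines. On the 8086 family the ESC opcodes occupy the byte range D8h to DFh, so the top five bits of the first opcode byte are 11011. The function names below are hypothetical, and the sketch ignores prefix bytes such as segment overrides and WAIT.

```python
def is_esc_opcode(first_byte: int) -> bool:
    """True if the opcode byte is in the ESC range 0xD8..0xDF (bit pattern 11011xxx)."""
    return (first_byte & 0xF8) == 0xD8


def executed_by(first_byte: int) -> str:
    """Name the unit that executes an instruction starting with this opcode byte."""
    return "coprocessor" if is_esc_opcode(first_byte) else "microprocessor"


print(executed_by(0xD9))  # coprocessor
print(executed_by(0x90))  # microprocessor (NOP)
```

Because the test is a single mask-and-compare on the opcode byte, both chips can apply it to the same instruction stream and reach the same decision.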

Numeric Execution Unit (NEU)
Functions:
• Executes all the numeric processor instructions.
• It has an eight-register stack (80 bits per register) that holds the operands for arithmetic instructions and the results.
• Instructions either address data in specific stack data registers or use a push-and-pop mechanism to store and retrieve data.
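
The two ways of reaching data on the register stack can be modelled roughly as follows. This is a toy sketch with assumed names: it uses ordinary Python floats rather than 80-bit values and omits the tag bits and overflow checks of the real unit. It follows the x87 convention that a push decrements the top-of-stack pointer and a pop increments it.

```python
class RegisterStack:
    """Toy model of an eight-register stack addressed relative to its top."""

    SIZE = 8

    def __init__(self):
        self.regs = [0.0] * self.SIZE
        self.top = 0  # register 0 is the top of the stack after initialization

    def push(self, value):
        self.top = (self.top - 1) % self.SIZE  # wraps from 0 to 7
        self.regs[self.top] = value

    def pop(self):
        value = self.regs[self.top]
        self.top = (self.top + 1) % self.SIZE
        return value

    def st(self, i):
        """Read the register i places below the top, like ST(i)."""
        return self.regs[(self.top + i) % self.SIZE]


stack = RegisterStack()
stack.push(1.5)
stack.push(2.5)
print(stack.st(0), stack.st(1))  # 2.5 1.5
print(stack.pop())               # 2.5
```

Addressing relative to the top means the same instruction encoding works no matter which physical register currently holds the top of the stack.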

Coprocessor Control Instructions
• The coprocessor has control instructions for initialization, exception handling, and task switching.
• All control instructions have two forms (for example, FINIT and FNINIT).

FINIT/FNINIT
• Performs a reset (initialize) operation on the arithmetic coprocessor.
• When reset or initialized, the coprocessor operates with projective closure (unsigned infinity), rounds to the nearest or even, and uses extended precision.
• Also sets register 0 as the top of the stack.

FSETPM
• Changes the coprocessor to protected-addressing mode.
• Used when the microprocessor is operated in protected mode.
• Protected mode can be exited only by a hardware reset or, in the 80386 through Pentium 4, by a change to the control register.

FLDCW
• Loads the control register with the word addressed by the operand.
FSTCW
• Stores the control register into the word-sized memory operand.
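
To show what the word moved by FLDCW and FSTCW contains, the sketch below decodes the main fields of an x87 control word: exception masks in bits 0 to 5, precision control in bits 8 and 9, rounding control in bits 10 and 11, and infinity control in bit 12. The function name is hypothetical, and the example value 037Fh is the default commonly documented for later x87 units after FINIT; treat it as an assumption for the 8087 itself.

```python
PRECISION = {0: "24-bit (single)", 1: "reserved", 2: "53-bit (double)", 3: "64-bit (extended)"}
ROUNDING = {0: "nearest or even", 1: "down", 2: "up", 3: "toward zero (chop)"}


def decode_control_word(cw: int) -> dict:
    """Split a 16-bit control word into its main fields."""
    return {
        "exception_masks": cw & 0x3F,                   # bits 0-5
        "precision": PRECISION[(cw >> 8) & 0x3],        # bits 8-9
        "rounding": ROUNDING[(cw >> 10) & 0x3],         # bits 10-11
        "infinity": "affine" if (cw >> 12) & 1 else "projective",  # bit 12
    }


print(decode_control_word(0x037F))
# {'exception_masks': 63, 'precision': '64-bit (extended)',
#  'rounding': 'nearest or even', 'infinity': 'projective'}
```

The decoded fields of the example value match the state described for FINIT: projective closure, rounding to nearest or even, and extended precision.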

FSTSW AX
• Copies the contents of the status register to the AX register.
• Not available on the 8087.
FCLEX
• Clears the error flags in the status register and also the busy flag.

Graphics Coprocessor
• A high-speed display adapter that is dedicated to graphics operations such as line drawing and plotting.
• A coprocessor used to accelerate the display of graphics, significantly speeding up the updating of images on a screen and freeing the CPU to take care of other tasks.
• A graphics coprocessor may be incorporated into a graphics accelerator, or may be part of a separate subsystem. Also called a graphics processor.

More from Piyush Rochwani