The document discusses the central processing unit (CPU) and pipeline. It covers major CPU components like registers, arithmetic logic unit (ALU), control unit, and bus. It describes general register and stack organizations, instruction formats, addressing modes, and data transfer and manipulation instructions. It provides details on register selection, ALU operations, and how the control unit directs information flow through the CPU.
I am working as a Assistant Professor in ITS, Ghaziabad.This slides is very useful for UPTU,UTU,JNU,BHU etc.please give feedback to me in friendly_rakesh2003@yahoo.co.in
I am working as a Assistant Professor in ITS, Ghaziabad.This slides is very useful for UPTU,UTU,JNU,BHU etc.please give feedback to me in friendly_rakesh2003@yahoo.co.in
Instruction Set, Instruction Format, Instruction Encoding, Instruction Cycle, Types of the instructions, Types of Operand, Instruction Length, Number of Addresses in an instruction
Instruction Set, Instruction Format, Instruction Encoding, Instruction Cycle, Types of the instructions, Types of Operand, Instruction Length, Number of Addresses in an instruction
Rai University provides high quality education for MSc, Law, Mechanical Engineering, BBA, MSc, Computer Science, Microbiology, Hospital Management, Health Management and IT Engineering.
Operation “Blue Star” is the only event in the history of Independent India where the state went into war with its own people. Even after about 40 years it is not clear if it was culmination of states anger over people of the region, a political game of power or start of dictatorial chapter in the democratic setup.
The people of Punjab felt alienated from main stream due to denial of their just demands during a long democratic struggle since independence. As it happen all over the word, it led to militant struggle with great loss of lives of military, police and civilian personnel. Killing of Indira Gandhi and massacre of innocent Sikhs in Delhi and other India cities was also associated with this movement.
Honest Reviews of Tim Han LMA Course Program.pptxtimhan337
Personal development courses are widely available today, with each one promising life-changing outcomes. Tim Han’s Life Mastery Achievers (LMA) Course has drawn a lot of interest. In addition to offering my frank assessment of Success Insider’s LMA Course, this piece examines the course’s effects via a variety of Tim Han LMA course reviews and Success Insider comments.
Read| The latest issue of The Challenger is here! We are thrilled to announce that our school paper has qualified for the NATIONAL SCHOOLS PRESS CONFERENCE (NSPC) 2024. Thank you for your unwavering support and trust. Dive into the stories that made us stand out!
Introduction to AI for Nonprofits with Tapp NetworkTechSoup
Dive into the world of AI! Experts Jon Hill and Tareq Monaur will guide you through AI's role in enhancing nonprofit websites and basic marketing strategies, making it easy to understand and apply.
Francesca Gottschalk - How can education support child empowerment.pptxEduSkills OECD
Francesca Gottschalk from the OECD’s Centre for Educational Research and Innovation presents at the Ask an Expert Webinar: How can education support child empowerment?
2024.06.01 Introducing a competency framework for languag learning materials ...Sandy Millin
http://sandymillin.wordpress.com/iateflwebinar2024
Published classroom materials form the basis of syllabuses, drive teacher professional development, and have a potentially huge influence on learners, teachers and education systems. All teachers also create their own materials, whether a few sentences on a blackboard, a highly-structured fully-realised online course, or anything in between. Despite this, the knowledge and skills needed to create effective language learning materials are rarely part of teacher training, and are mostly learnt by trial and error.
Knowledge and skills frameworks, generally called competency frameworks, for ELT teachers, trainers and managers have existed for a few years now. However, until I created one for my MA dissertation, there wasn’t one drawing together what we need to know and do to be able to effectively produce language learning materials.
This webinar will introduce you to my framework, highlighting the key competencies I identified from my research. It will also show how anybody involved in language teaching (any language, not just English!), teacher training, managing schools or developing language learning materials can benefit from using the framework.
The Roman Empire A Historical Colossus.pdfkaushalkr1407
The Roman Empire, a vast and enduring power, stands as one of history's most remarkable civilizations, leaving an indelible imprint on the world. It emerged from the Roman Republic, transitioning into an imperial powerhouse under the leadership of Augustus Caesar in 27 BCE. This transformation marked the beginning of an era defined by unprecedented territorial expansion, architectural marvels, and profound cultural influence.
The empire's roots lie in the city of Rome, founded, according to legend, by Romulus in 753 BCE. Over centuries, Rome evolved from a small settlement to a formidable republic, characterized by a complex political system with elected officials and checks on power. However, internal strife, class conflicts, and military ambitions paved the way for the end of the Republic. Julius Caesar’s dictatorship and subsequent assassination in 44 BCE created a power vacuum, leading to a civil war. Octavian, later Augustus, emerged victorious, heralding the Roman Empire’s birth.
Under Augustus, the empire experienced the Pax Romana, a 200-year period of relative peace and stability. Augustus reformed the military, established efficient administrative systems, and initiated grand construction projects. The empire's borders expanded, encompassing territories from Britain to Egypt and from Spain to the Euphrates. Roman legions, renowned for their discipline and engineering prowess, secured and maintained these vast territories, building roads, fortifications, and cities that facilitated control and integration.
The Roman Empire’s society was hierarchical, with a rigid class system. At the top were the patricians, wealthy elites who held significant political power. Below them were the plebeians, free citizens with limited political influence, and the vast numbers of slaves who formed the backbone of the economy. The family unit was central, governed by the paterfamilias, the male head who held absolute authority.
Culturally, the Romans were eclectic, absorbing and adapting elements from the civilizations they encountered, particularly the Greeks. Roman art, literature, and philosophy reflected this synthesis, creating a rich cultural tapestry. Latin, the Roman language, became the lingua franca of the Western world, influencing numerous modern languages.
Roman architecture and engineering achievements were monumental. They perfected the arch, vault, and dome, constructing enduring structures like the Colosseum, Pantheon, and aqueducts. These engineering marvels not only showcased Roman ingenuity but also served practical purposes, from public entertainment to water supply.
Acetabularia Information For Class 9 .docxvaibhavrinwa19
Acetabularia acetabulum is a single-celled green alga that in its vegetative state is morphologically differentiated into a basal rhizoid and an axially elongated stalk, which bears whorls of branching hairs. The single diploid nucleus resides in the rhizoid.
How to Make a Field invisible in Odoo 17Celine George
It is possible to hide or invisible some fields in odoo. Commonly using “invisible” attribute in the field definition to invisible the fields. This slide will show how to make a field invisible in odoo 17.
Macroeconomics- Movie Location
This will be used as part of your Personal Professional Portfolio once graded.
Objective:
Prepare a presentation or a paper using research, basic comparative analysis, data organization and application of economic information. You will make an informed assessment of an economic climate outside of the United States to accomplish an entertainment industry objective.
B.sc cs-ii-u-4 central processing unit and pipeline
1. Central Processing Unit and
Pipeline
Course: B.Sc-CS-II
Subject: Computer Organization
And Architecture
Unit-4
1
2. CENTRAL PROCESSING UNIT
2
• Introduction
• General Register Organization
• Stack Organization
• Instruction Formats
• Addressing Modes
• Data Transfer and Manipulation
• Program Control
• Reduced Instruction Set Computer (RISC)
3. MAJOR COMPONENTS OF CPU[1]
3
Storage Components:
Registers
Flip-flops
Execution (Processing) Components:
Arithmetic Logic Unit (ALU):
Arithmetic calculations, Logical computations, Shifts/Rotates
Transfer Components:
Bus
Control Components:
Control Unit
Register
File ALU
Control Unit
5. OPERATION OF CONTROL UNIT
5
The control unit directs the information flow through ALU by:
- Selecting various Components in the system
- Selecting the Function of ALU
Example: R1 <- R2 + R3
[1] MUX A selector (SELA): BUS A ← R2
[2] MUX B selector (SELB): BUS B ← R3
[3] ALU operation selector (OPR): ALU to ADD
[4] Decoder destination selector (SELD): R1 ← Out Bus
Control Word
Encoding of register selection fields
Binary
Code SELA SELB SELD
000 Input Input None
001 R1 R1 R1
010 R2 R2 R2
011 R3 R3 R3
100 R4 R4 R4
101 R5 R5 R5
110 R6 R6 R6
111 R7 R7 R7
SELA SELB SELD OPR
3 3 3 5
6. ALU CONTROL
6
Encoding of ALU operations OPR
Select Operation Symbol
00000 Transfer A TSFA
00001 Increment A INCA
00010 ADD A + B ADD
00101 Subtract A - B SUB
00110 Decrement A DECA
01000 AND A and B AND
01010 OR A and B OR
01100 XOR A and B XOR
01110 Complement A COMA
10000 Shift right A SHRA
11000 Shift left A SHLA
Examples of ALU Microoperations
Symbolic Designation
Microoperation SELA SELB SELD OPR Control Word
R1 R2 - R3 R2 R3 R1 SUB 010 011 001 00101
R4 R4 R5 R4 R5 R4 OR 100 101 100 01010
R6 R6 + 1 R6 - R6 INCA 110 000 110 00001
R7 R1 R1 - R7 TSFA 001 000 111 00000
Output R2 R2 - None TSFA 010 000 000 00000
Output Input Input - None TSFA 000 000 000 00000
R4 shl R4 R4 - R4 SHLA 100 000 100 11000
R5 0 R5 R5 R5 XOR 101 101 101 01100
7. REGISTER STACK ORGANIZATION[1]
7
Register Stack
Push, Pop operations
/* Initially, SP = 0, EMPTY = 1, FULL = 0 */
PUSH POP
SP SP + 1 DR M[SP]
M[SP] DR SP SP - 1
If (SP = 0) then (FULL 1) If (SP = 0) then (EMPTY 1)
EMPTY 0 FULL 0
Stack
- Very useful feature for nested subroutines, nested loops control
- Also efficient for arithmetic expression evaluation
- Storage which can be accessed in LIFO
- Pointer: SP
- Only PUSH and POP operations are applicable
A
B
C
0
1
2
3
4
63
Address
FULL EMPTY
SP
DR
Flags
Stack pointer
stack
8. MEMORY STACK ORGANIZATION
8
- A portion of memory is used as a stack with a
processor register as a stack pointer
- PUSH: SP ← SP - 1
M[SP] ← DR
- POP: DR ← M[SP]
SP ← SP + 1
- Most computers do not provide hardware to check
stack overflow (full stack) or underflow(empty stack)
Memory with Program, Data,
and Stack Segments
DR
4001
4000
3999
3998
3997
3000
Data
(operands)
Program
(instructions)
1000
PC
AR
SP
stack
9. REVERSE POLISH NOTATION
9
A + B Infix notation
+ A B Prefix or Polish notation
A B + Postfix or reverse Polish notation
- The reverse Polish notation is very suitable for stack
manipulation
Evaluation of Arithmetic Expressions
Any arithmetic expression can be expressed in parenthesis-free
Polish notation, including reverse Polish notation
(3 * 4) + (5 * 6) ⇒ 3 4 * 5 6 * +
Arithmetic Expressions: A + B
3 3 12 12 12 12 42
4 5 5
6
30
3 4 * 5 6 * +
10. INSTRUCTION FORMAT
10
OP-code field - specifies the operation to be performed
Address field - designates memory address(s) or a processor register(s)
Mode field - specifies the way the operand or the
effective address is determined
The number of address fields in the instruction format
depends on the internal organization of CPU
- The three most common CPU organizations:
Single accumulator organization:
ADD X /* AC ← AC + M[X] */
General register organization:
ADD R1, R2, R3 /* R1 ← R2 + R3 */
ADD R1, R2 /* R1 ← R1 + R2 */
MOV R1, R2 /* R1 ← R2 */
ADD R1, X /* R1 ← R1 + M[X] */
Stack organization:
PUSH X /* TOS ← M[X] */
ADD
Instruction Fields
11. THREE, and TWO-ADDRESS INSTRUCTIONS
11
Three-Address Instructions:
Program to evaluate X = (A + B) * (C + D) :
ADD R1, A, B /* R1 M[A] + M[B] */
ADD R2, C, D /* R2 M[C] + M[D] */
MUL X, R1, R2 /* M[X] R1 * R2 */
- Results in short programs
- Instruction becomes long (many bits)
Two-Address Instructions:
Program to evaluate X = (A + B) * (C + D) :
MOV R1, A /* R1 M[A] */
ADD R1, B /* R1 R1 + M[B] */
MOV R2, C /* R2 M[C] */
ADD R2, D /* R2 R2 + M[D] */
MUL R1, R2 /* R1 R1 * R2 */
MOV X, R1 /* M[X] R1 */
12. ONE, and ZERO-ADDRESS INSTRUCTIONS
12
One-Address Instructions:
- Use an implied AC register for all data manipulation
- Program to evaluate X = (A + B) * (C + D) :
LOAD A /* AC M[A] */
ADD B /* AC AC + M[B] */
STORE T /* M[T] AC */
LOAD C /* AC M[C] */
ADD D /* AC AC + M[D] */
MUL T /* AC AC * M[T] */
STORE X /* M[X] AC */
Zero-Address Instructions:
- Can be found in a stack-organized computer
- Program to evaluate X = (A + B) * (C + D) :
PUSH A /* TOS A */
PUSH B /* TOS B */
ADD /* TOS (A + B) */
PUSH C /* TOS C */
PUSH D /* TOS D */
ADD /* TOS (C + D) */
MUL /* TOS (C + D) * (A + B) */
POP X /* M[X] TOS */
13. ADDRESSING MODES[2]
13
Addressing Modes:
* Specifies a rule for interpreting or modifying the
address field of the instruction (before the operand
is actually referenced)
* Variety of addressing modes
- to give programming flexibility to the user
- to use the bits in the address field of the
instruction efficiently
14. TYPES OF ADDRESSING MODES
14
Implied Mode
Address of the operands are specified implicitly
in the definition of the instruction
- No need to specify address in the instruction
- EA = AC, or EA = Stack[SP], EA: Effective Address.
Immediate Mode
Instead of specifying the address of the operand,
operand itself is specified
- No need to specify address in the instruction
- However, operand itself needs to be specified
- Sometimes, require more bits than the address
- Fast to acquire an operand
Register Mode
Address specified in the instruction is the register address
- Designated operand need to be in a register
- Shorter address than the memory address
- Saving address field in the instruction
- Faster to acquire an operand than the memory addressing
- EA = IR(R) (IR(R): Register field of IR)
15. TYPES OF ADDRESSING MODES
15
Register Indirect Mode
Instruction specifies a register which contains
the memory address of the operand
- Saving instruction bits since register address
is shorter than the memory address
- Slower to acquire an operand than both the
register addressing or memory addressing
- EA = [IR(R)] ([x]: Content of x)
Auto-increment or Auto-decrement features:
Same as the Register Indirect, but:
- When the address in the register is used to access memory, the
value in the register is incremented or decremented by 1 (after or before the
execution of the instruction)
16. TYPES OF ADDRESSING MODES
16
Direct Address Mode
Instruction specifies the memory address which
can be used directly to the physical memory
- Faster than the other memory addressing modes
- Too many bits are needed to specify the address
for a large physical memory space
- EA = IR(address), (IR(address): address field of IR)
Indirect Addressing Mode
The address field of an instruction specifies the address of a memory
location that contains the address of the operand
- When the abbreviated address is used, large physical memory can
be addressed with a relatively small number of bits
- Slow to acquire an operand because of an additional memory
access
- EA = M[IR(address)]
17. TYPES OF ADDRESSING MODES
17
Relative Addressing Modes
The Address fields of an instruction specifies the part of the address
(abbreviated address) which can be used along with a
designated register to calculate the address of the operand
PC Relative Addressing Mode(R = PC)
- EA = PC + IR(address)
- Address field of the instruction is short
- Large physical memory can be accessed with a small number of
address bits
Indexed Addressing Mode
XR: Index Register:
- EA = XR + IR(address)
Base Register Addressing Mode
BAR: Base Address Register:
- EA = BAR + IR(address)
18. ADDRESSING MODES - EXAMPLES
18
Addressing
Mode
Effective
Address
Content
of AC
Direct address 500 /* AC ← (500) */ 800
Immediate operand - /* AC ← 500 */ 500
Indirect address 800 /* AC ← ((500)) */ 300
Relative address 702 /* AC ← (PC+500) */ 325
Indexed address 600 /* AC ← (XR+500) */ 900
Register - /* AC ← R1 */ 400
Register indirect 400 /* AC ← (R1) */ 700
Autoincrement 400 /* AC ← (R1)+ */ 700
Autodecrement 399 /* AC ← -(R) */ 450
Load to AC Mode
Address = 500
Next instruction
200
201
202
399
400
450
700
500 800
600 900
702 325
800 300
MemoryAddress
PC = 200
R1 = 400
XR = 100
AC
19. DATA TRANSFER INSTRUCTIONS[2]
19
Load LD
Store ST
Move MOV
Exchange XCH
Input IN
Output OUT
Push PUSH
Pop POP
Name Mnemonic
Typical Data Transfer Instructions
Direct address LD ADR AC M[ADR]
Indirect address LD @ADR AC M[M[ADR]]
Relative address LD $ADR AC M[PC + ADR]
Immediate operand LD #NBR AC NBR
Index addressing LD ADR(X) AC M[ADR + XR]
Register LD R1 AC R1
Register indirect LD (R1) AC M[R1]
Autoincrement LD (R1)+ AC M[R1], R1 R1 + 1
Autodecrement LD -(R1) R1 R1 - 1, AC M[R1]
Mode
Assembly
Convention Register Transfer
Data Transfer Instructions with Different Addressing Modes
20. DATA MANIPULATION INSTRUCTIONS
20
Three Basic Types: Arithmetic instructions
Logical and bit manipulation instructions
Shift instructions
Arithmetic Instructions
Name Mnemonic
Clear CLR
Complement COM
AND AND
OR OR
Exclusive-OR XOR
Clear carry CLRC
Set carry SETC
Complement carry COMC
Enable interrupt EI
Disable interrupt DI
Name Mnemonic
Logical shift right SHR
Logical shift left SHL
Arithmetic shift right SHRA
Arithmetic shift left SHLA
Rotate right ROR
Rotate left ROL
Rotate right thru carry RORC
Rotate left thru carry ROLC
Name Mnemonic
Logical and Bit Manipulation Instructions Shift Instructions
Increment INC
Decrement DEC
Add ADD
Subtract SUB
Multiply MUL
Divide DIV
Add with Carry ADDC
Subtract with Borrow SUBB
Negate(2’s Complement) NEG
21. PROGRAM CONTROL INSTRUCTIONS
21
PC
+1
In-Line Sequencing
(Next instruction is fetched from the
next adjacent location in the memory)
Address from other source; Current Instruction, Stack, etc
Branch, Conditional Branch, Subroutine, etc
Program Control Instructions
Name Mnemonic
Branch BR
Jump JMP
Skip SKP
Call CALL
Return RTN
Compare(by - ) CMP
Test (by AND) TST
* CMP and TST instructions do not retain their
results of operations(- and AND, respectively).
They only set or clear certain Flags.
Status Flag Circuit
c7
c8
A B
8 8
8-bit ALU
V Z S C
F7
F7 - F0
8
F
Check for
zero output
22. CONDITIONAL BRANCH INSTRUCTIONS
22
BZ Branch if zero Z = 1
BNZ Branch if not zero Z = 0
BC Branch if carry C = 1
BNC Branch if no carry C = 0
BP Branch if plus S = 0
BM Branch if minus S = 1
BV Branch if overflow V = 1
BNV Branch if no overflow V = 0
BHI Branch if higher A > B
BHE Branch if higher or equal A ≥ B
BLO Branch if lower A < B
BLOE Branch if lower or equal A ≤ B
BE Branch if equal A = B
BNE Branch if not equal A ≠ B
BGT Branch if greater than A > B
BGE Branch if greater or equal A ≥ B
BLT Branch if less than A < B
BLE Branch if less or equal A ≤ B
BE Branch if equal A = B
BNE Branch if not equal A ≠ B
Unsigned compare conditions (A - B)
Signed compare conditions (A - B)
Mnemonic Branch condition Tested condition
23. SUBROUTINE CALL AND RETURN
23
Call subroutine
Jump to subroutine
Branch to subroutine
Branch and save return address
• Fixed Location in the subroutine(Memory)
• Fixed Location in memory
• In a processor Register
• In a memory stack
- most efficient way
SUBROUTINE CALL
Two Most Important Operations are Implied;
* Branch to the beginning of the Subroutine
- Same as the Branch or Conditional Branch
* Save the Return Address to get the address
of the location in the Calling Program upon
exit from the Subroutine
- Locations for storing Return Address: CALL
SP ← SP - 1
M[SP] ← PC
PC ← EA
RTN
PC ← M[SP]
SP ← SP + 1
24. PROGRAM INTERRUPT[3]
24
Types of Interrupts:
External interrupts
External Interrupts initiated from the outside of CPU and Memory
- I/O Device -> Data transfer request or Data transfer complete
- Timing Device -> Timeout
- Power Failure
Internal interrupts (traps)
Internal Interrupts are caused by the currently running program
- Register, Stack Overflow
- Divide by zero
- OP-code Violation
- Protection Violation
Software Interrupts
Both External and Internal Interrupts are initiated by the computer Hardware.
Software Interrupts are initiated by texecuting an instruction.
- Supervisor Call -> Switching from a user mode to the supervisor mode
-> Allows to execute a certain class of operations
which are not allowed in the user mode
25. INTERRUPT PROCEDURE
25
- The interrupt is usually initiated by an internal or
an external signal rather than from the execution of
an instruction (except for the software interrupt)
- The address of the interrupt service program is
determined by the hardware rather than from the
address field of an instruction
- An interrupt procedure usually stores all the
information necessary to define the state of CPU
rather than storing only the PC.
The state of the CPU is determined from;
Content of the PC
Content of all processor registers
Content of status bits
Many ways of saving the CPU state depending on the CPU architectures
Interrupt Procedure and Subroutine Call
26. RISC: REDUCED INSTRUCTION SET
COMPUTERS
26
Historical Background
IBM System/360, 1964
- The real beginning of modern computer architecture
- Distinction between Architecture and Implementation
- Architecture: The abstract structure of a computer
seen by an assembly-language programmer
High-Level
Language
Instruction
Set
Hardware
Compiler -program
Architecture Implementation
Continuing growth in semiconductor memory and microprogramming
-> A much richer and complicated instruction sets
=> CISC(Complex Instruction Set Computer)
- Arguments advanced at that time
Richer instruction sets would simplify compilers
Richer instruction sets would alleviate the software crisis
- move as much functions to the hardware as possible
- close Semantic Gap between machine language
and the high-level language
Richer instruction sets would improve the architecture quality
27. COMPLEX INSTRUCTION SET COMPUTERS: CISC
27
High Performance General Purpose Instructions
Characteristics of CISC:
1. A large number of instructions (from 100-250 usually)
2. Some instructions that performs a certain tasks are not used frequently.
3. Many addressing modes are used (5 to 20)
4. Variable length instruction format.
5. Instructions that manipulate operands in memory.
28. PHYLOSOPHY OF RISC
28
Reduce the semantic gap between machine instruction and microinstruction
1-Cycle instruction
Most of the instructions complete their execution
in 1 CPU clock cycle - like a microoperation
* Functions of the instruction (contrast to CISC)
- Very simple functions
- Very simple instruction format
- Similar to microinstructions
=> No need for microprogrammed control
* Register-Register Instructions
- Avoid memory reference instructions except
Load and Store instructions
- Most of the operands can be found in the
registers instead of main memory
=> Shorter instructions
=> Uniform instruction cycle
=> Requirement of large number of registers
* Employ instruction pipeline
29. CHARACTERISTICS OF RISC
29
Common RISC Characteristics
- Operations are register-to-register, with only LOAD and STORE
accessing memory
- The operations and addressing modes are reduced
Instruction formats are simple
30. CHARACTERISTICS OF RISC
30
RISC Characteristics
- Relatively few instructions
- Relatively few addressing modes
- Memory access limited to load and store instructions
- All operations done within the registers of the CPU
- Fixed-length, easily decoded instruction format
- Single-cycle instruction format
- Hardwired rather than microprogrammed control
-A relatively large numbers of registers in the processor unit.
-Efficient instruction pipeline
-Compiler support: provides efficient translation of high-level language
programs into machine language programs.
More RISC Characteristics
Advantages of RISC
- VLSI Realization
- Computing Speed
- Design Costs and Reliability
- High Level Language Support
31. Parallel processing[4]
• A parallel processing system is able to perform concurrent
data processing to achieve faster execution time
• The system may have two or more ALUs and be able to
execute two or more instructions at the same time
• Goal is to increase the throughput – the amount of processing
that can be accomplished during a given interval of time
32. Parallel processing classification
Single instruction stream, single data stream – SISD
Single instruction stream, multiple data stream – SIMD
Multiple instruction stream, single data stream – MISD
Multiple instruction stream, multiple data stream – MIMD
33. Single instruction stream, single data stream – SISD
• Single control unit, single computer, and a memory unit
• Instructions are executed sequentially. Parallel processing may
be achieved by means of multiple functional units or by
pipeline processing
34. Single instruction stream, multiple data stream – SIMD
• Represents an organization that includes many processing
units under the supervision of a common control unit.
• Includes multiple processing units with a single control unit.
All processors receive the same instruction, but operate on
different data.
35. Multiple instruction stream, single data stream – MISD
• Theoretical only
• processors receive different instructions, but operate on same
data.
36. Multiple instruction stream, multiple data
stream – MIMD
• A computer system capable of processing several programs at
the same time.
• Most multiprocessor and multicomputer systems can be
classified in this category
37. Pipelining: Laundry Example[4]
• Small laundry has one washer,
one dryer and one operator, it
takes 90 minutes to finish one
load:
– Washer takes 30 minutes
– Dryer takes 40 minutes
– “operator folding” takes 20
minutes
A B C D
38. Sequential Laundry
• This operator scheduled his loads to be delivered to the laundry every 90 minutes
which is the time required to finish one load. In other words he will not start a
new task unless he is already done with the previous task
• The process is sequential. Sequential laundry takes 6 hours for 4 loads
A
B
C
D
30 40 20 30 40 20 30 40 20 30 40 20
6 PM 7 8 9 10 11 Midnight
T
a
s
k
O
r
d
e
r
Time
90 min
39. Efficiently scheduled laundry: Pipelined Laundry
Operator start work ASAP
• Another operator asks for the delivery of loads to the laundry every 40
minutes!?.
• Pipelined laundry takes 3.5 hours for 4 loads
A
B
C
D
6 PM 7 8 9 10 11 Midnight
T
a
s
k
O
r
d
e
r
Time
30 40 40 40 40 20
40 40 40
40. Pipelining Facts
• Multiple tasks operating
simultaneously
• Pipelining doesn’t help
latency of single task, it
helps throughput of
entire workload
• Pipeline rate limited by
slowest pipeline stage
• Potential speedup =
Number of pipe stages
• Unbalanced lengths of
pipe stages reduces
speedup
• Time to “fill” pipeline
and time to “drain” it
reduces speedup
A
B
C
D
6 PM 7 8 9
T
a
s
k
O
r
d
e
r
Time
30 40 40 40 40 20
The washer
waits for the
dryer for 10
minutes
41. Pipelining
• Decomposes a sequential process into segments.
• Divide the processor into segment processors each one is
dedicated to a particular segment.
• Each segment is executed in a dedicated segment-processor
operates concurrently with all other segments.
• Information flows through these multiple hardware segments.
42. Pipelining
• Instruction execution is divided into k segments or stages
– Instruction exits pipe stage k-1 and proceeds into pipe stage
k
– All pipe stages take the same amount of time; called one
processor cycle
– Length of the processor cycle is determined by the slowest
pipe stage
k segments
43. Pipelining
• Suppose we want to perform the combined
multiply and add operations with a stream of
numbers:
• Ai * Bi + Ci for i =1,2,3,…,7
44. Pipelining
• The suboperations performed in each segment
of the pipeline are as follows:
– R1 Ai, R2 Bi
– R3 R1 * R2 R4 Ci
– R5 R3 + R4
45.
46.
47. Pipeline Performance
• n:instructions
• k: stages in pipeline
• τ: clockcycle
• Tk: total time
τ))1(( −+= nkTk
)1(
1
−+
==
nk
nk
T
T
Speedup
k
n is equivalent to number of loads in
the laundry example
k is the stages (washing, drying and
folding.
Clock cycle is the slowest task time
n
k
48. SPEEDUP
• • Consider a k-segment pipeline operating on n data sets. (In
the above example, k = 3 and n = 4.)
• > It takes k clock cycles to fill the pipeline and get the first
result from the output of the pipeline.
• After that the remaining (n - 1) results will come out at each
clock cycle.
• > It therefore takes (k + n - 1) clock cycles to complete the
task.
49. SPEEDUP
• If we execute the same task sequentially in a
single processing unit, it takes (k * n) clock
cycles.
• • The speedup gained by using the pipeline is:
• S = k * n / (k + n - 1 )
50. SPEEDUP
• S = k * n / (k + n - 1 )
For n >> k (such as 1 million data sets on a 3-stage
pipeline),
• S ~ k
• So we can gain the speedup which is equal to the
number of functional units for a large data sets. This
is because the multiple functional units can work in
parallel except for the filling and cleaning-up cycles.
51. Example: 6 tasks, divided into 4
segments
1 2 3 4 5 6 7 8 9
T1 T2 T3 T4 T5 T6
T1 T2 T3 T4 T5 T6
T1 T2 T3 T4 T5 T6
T1 T2 T3 T4 T5 T6
52. References
1. Computer Organization and Architecture,
Designing for performance by William Stallings,
Prentice Hall of India.
2. Modern Computer Architecture, by Morris Mano,
Prentice Hall of India.
3. Computer Architecture and Organization by John P.
Hayes, McGraw Hill Publishing Company.
4. Computer Organization by V. Carl Hamacher,
Zvonko G. Vranesic, Safwat G. Zaky, McGraw Hill
Publishing Company.