Course – Computer Organization
and Architecture
Course Instructor
Dr. Umadevi V
Department of CSE, BMSCE
26 October 2023 CSE, BMSCE
1
Unit-5
Basic Processing Unit: Some Fundamental Concepts,
Instruction Execution, Hardware
Components, Instruction Fetch and Execution Steps, Control
Signals, Hardwired Control
Parallel Computer Architecture: Processor Architecture and
Technology Trends, Flynn’s Taxonomy of Parallel Architectures,
Memory Organization of Parallel Computers: Computers with
Distributed Memory Organization, Computers with Shared
Memory Organization, Thread-Level Parallelism: Simultaneous
Multithreading, Multicore Processors
26 October 2023 CSE, BMSCE 2
Basic Processing Unit
26 October 2023 CSE, BMSCE 3
Fundamental Concepts
 Processor fetches one instruction at a time and performs
the operation specified.
 Instructions are fetched from successive memory
locations until a branch or a jump instruction is
encountered.
 Processor keeps track of the address of the memory
location containing the next instruction to be fetched
using the Program Counter (PC).
 Instruction Register (IR) holds the instruction currently being executed.
26 October 2023 CSE, BMSCE 4
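As a rough illustration of how the PC and IR cooperate, here is a minimal Python sketch of the fetch-and-execute loop; the toy memory contents and the 4-byte instruction size are assumptions made only for this example.

# Minimal fetch-and-execute loop sketch: a toy memory holds instruction
# strings, the PC selects the next one, and the IR holds the fetched copy.
# The instruction names and 4-byte instruction size are illustrative only.
memory = {0: "Add R3, R4, R5",
          4: "Load R5, X(R7)",
          8: "Branch LOOP"}

pc = 0                              # Program Counter: address of next instruction
while pc in memory:
    ir = memory[pc]                 # fetch: the instruction goes into the IR
    pc += 4                         # instruction address generator updates the PC
    print(f"executing: {ir}")       # control circuitry would decode ir here
    if ir.startswith("Branch"):
        break                       # a branch or jump ends straight-line fetching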
Fundamental Concepts (Contd…)
26 October 2023 CSE, BMSCE 5
Fundamental Concepts (Contd…)
26 October 2023 CSE, BMSCE 6
 PC provides
instruction address.
 Instruction is fetched
into IR
 Instruction address
generator updates PC
 Control circuitry
interprets instruction
and generates control
signals to perform the
actions needed.
Main hardware components of a processor
Data Processing Hardware
Contents of register A are processed and deposited in register B.
26 October 2023 CSE, BMSCE 7
Basic structure for data processing
Data Processing Hardware (Contd…)
26 October 2023 CSE, BMSCE 8
A hardware structure with multiple stages.
Processing moves from one stage to the next in each clock cycle.
Such a multi-stage system is known as a pipeline.
High-performance processors have a pipelined organization.
Pipelining enables the execution of successive instructions to be overlapped.
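A small Python sketch of the overlap idea (stage names paraphrased from the five-step sequence described later); it only prints which instruction occupies each stage in every clock cycle.

# Pipeline overlap sketch: in every clock cycle each instruction advances
# one stage, so up to five instructions are in flight at once.
STAGES = ["Fetch", "Decode", "Compute", "Memory", "Write"]   # assumed names
program = ["I1", "I2", "I3", "I4"]

for cycle in range(len(program) + len(STAGES) - 1):
    occupancy = []
    for s, stage in enumerate(STAGES):
        i = cycle - s                       # which instruction sits in this stage
        if 0 <= i < len(program):
            occupancy.append(f"{stage}:{program[i]}")
    print(f"cycle {cycle + 1}: " + "  ".join(occupancy))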
A five-step sequence of actions to fetch and execute an instruction.
26 October 2023 CSE, BMSCE 9
Instruction Execution
Load Instructions
Consider the instruction
Load R5, X(R7)
which uses the Index addressing mode to load a word of data from memory location X
+ [R7] into register R5.
Fetch and Execution of this instruction involves the following actions:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read the contents of register R7 in the register file.
3. Compute the effective address.
4. Read the memory source operand.
5. Load the operand into the destination register, R5.
26 October 2023 CSE, BMSCE 10
Instruction Execution (Contd…)
Arithmetic and Logic Instructions:
A typical instruction of this type is
Add R3, R4, R5
Fetch and Execution of this instruction involves the following actions:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read registers R4 and R5.
3. Compute the sum [R4] + [R5].
4. No action.
5. Load the result into the destination register, R3.
26 October 2023 CSE, BMSCE 11
Instruction Execution (Contd…)
Store Instruction
Store R6, X(R8)
stores the contents of register R6 into memory location X + [R8].
Fetch and Execution of this instruction involves the following actions:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read registers R6 and R8.
3. Compute the effective address X + [R8].
4. Store the contents of register R6 into memory location X + [R8].
5. No action.
26 October 2023 CSE, BMSCE 12
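The Load, Add, and Store examples above differ only in which of the five steps do useful work. A table-driven Python sketch of that observation (step wording paraphrased from the slides):

# Which of the five generic steps are active for each instruction class.
# "-" marks the steps described above as "No action".
five_step_actions = {
    "Load R5, X(R7)":  ["fetch + PC update", "decode, read R7",     "X + [R7]",    "read memory",  "write R5"],
    "Add R3, R4, R5":  ["fetch + PC update", "decode, read R4, R5", "[R4] + [R5]", "-",            "write R3"],
    "Store R6, X(R8)": ["fetch + PC update", "decode, read R6, R8", "X + [R8]",    "write memory", "-"],
}

for instr, steps in five_step_actions.items():
    print(instr)
    for n, action in enumerate(steps, start=1):
        print(f"  step {n}: {action}")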
Hardware Components
The discussion in previous slides indicates that all instructions of a RISC-
style processor can be executed using the five-step sequence. Hence, the
processor hardware may be organized in five stages, such that each stage
performs the actions needed in one of the steps.
Register File
 A 2-port register file is needed to read the two source registers at the
same time.
 It may be implemented using a 2-port memory.
26 October 2023 CSE, BMSCE 13
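A minimal Python sketch of a register file with two read ports and one write port, so both source registers can be read in the same cycle; the register count and method names are illustrative assumptions.

# Two read ports (A and B) deliver both source operands in one access;
# one write port stores the result.  32 registers is an assumption.
class RegisterFile:
    def __init__(self, n=32):
        self.regs = [0] * n

    def read(self, addr_a, addr_b):
        # both source registers are read at the same time
        return self.regs[addr_a], self.regs[addr_b]

    def write(self, addr_c, value, enable=True):
        if enable:                        # acts like an RF_write control signal
            self.regs[addr_c] = value

rf = RegisterFile()
rf.write(4, 10)
rf.write(5, 32)
ra, rb = rf.read(4, 5)
print(ra + rb)                            # 42, as if computed by the ALU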
Hardware Components: ALU (Contd…)
- Two source operands are from registers.
26 October 2023 CSE, BMSCE 14
Conceptual view of the hardware needed for computation.
 Both source operands
and the destination
location are in the
register file.
 [RA] and [RB] denote
values of registers
that are identified by
addresses A and B
 new [RC] denotes the
result that is stored to
the register identified
by address C
Hardware Components: ALU (Contd…)
- One of the source operands is the immediate value in the IR.
26 October 2023 CSE, BMSCE 15
Conceptual view of the hardware needed for computation.
A five-step sequence of actions to fetch and execute an instruction.
26 October 2023 CSE, BMSCE 16
Hardware Components: DataPath
26 October 2023 CSE, BMSCE 17
 Instruction processing
moves from stage to stage
in every clock cycle,
starting with fetch.
 The instruction is decoded
and the source registers
are read in stage 2.
 Computation takes place in
the ALU in stage 3.
A five-stage organization
Hardware Components: DataPath
26 October 2023 CSE, BMSCE 18
A five-stage organization
 If a memory operation is
involved, it takes place in
stage 4.
 The result of the instruction
is stored in the destination
register in stage 5.
Hardware Components:
DataPath
26 October 2023 CSE, BMSCE 19
Datapath in
a processor
Hardware Components:
DataPath
26 October 2023 20
Sequence of actions needed
to fetch and execute the instruction:
Add R3, R4, R5.
Datapath in a processor
Hardware Components:
DataPath
26 October 2023 21
Sequence of actions needed to
fetch and execute the instruction:
Load R5, X(R7).
Datapath in a processor
Hardware Components:
DataPath
26 October 2023 22
Sequence of actions needed to
fetch and execute the instruction:
Store R6, X(R8).
Datapath in a processor
A five-step sequence of actions to fetch and execute an instruction.
26 October 2023 CSE, BMSCE 23
Hardware Components:
DataPath
26 October 2023 24
Sequence of actions needed to
fetch and execute the instruction:
Unconditional Branch Instruction
Datapath in a processor
Hardware Components:
DataPath
26 October 2023 25
Sequence of actions needed to
fetch and execute the instruction:
Conditional Branch Instruction
Branch_if_[R5]=[R6] LOOP
Datapath in a processor
Hardware Components:
DataPath
26 October 2023 26
Sequence of actions needed to
fetch and execute the instruction:
Subroutine Call Instructions
Call_Register R9
which calls a subroutine whose address
is in register R9
Datapath in a processor
Control Signals
Select multiplexer inputs to guide the flow of data.
Set the function performed by the ALU.
Determine when data are written into the PC, the IR,
the register file, and the memory.
Inter-stage registers are always enabled because
their contents are only relevant in the cycles for which
the stages connected to the register outputs are active.
26 October 2023 CSE, BMSCE 27
Control signals for the datapath
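As an illustration only, the control information for one execution step could be pictured as a record of the signals listed above; the field names below (b_select, rf_write, and so on) are assumed for the sketch and are not the textbook's exact labels.

# Illustrative bundle of control signals for one step: multiplexer selects,
# the ALU function, and write enables for the PC, IR, register file, and memory.
from dataclasses import dataclass

@dataclass
class ControlSignals:
    b_select: str = "register"   # mux: second ALU operand from register or immediate
    alu_op: str = "ADD"          # function performed by the ALU
    pc_write: bool = False       # when the PC is written
    ir_write: bool = False       # when the IR is written
    rf_write: bool = False       # when the register file is written
    mem_write: bool = False      # when memory is written

# Example: compute-stage signals for "Add R3, R4, R5"
print(ControlSignals(b_select="register", alu_op="ADD"))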
Rough Slide: to Explain Control Signals
Register File Control Signals
26 October 2023 CSE, BMSCE 28
Rough Slide: to Explain Control Signals
ALU Control Signals
26 October 2023 CSE, BMSCE 29
Rough Slide: to Explain Control Signals
Result Selection Signals
26 October 2023 CSE, BMSCE 30
Control signal generation:
 Actions to fetch & execute instructions have been described.
 The necessary control signals have also been described.
 Circuitry must be implemented to generate control signals
so that actions take place in the correct sequence and at the correct time.
 There are two basic approaches:
Hardwired control and Microprogramming
26 October 2023 CSE, BMSCE 31
Hardwired Control
26 October 2023 CSE, BMSCE 32
Generation of the control signals.
Hardwired control involves implementing circuitry that
considers the step counter, the IR, the ALU result, and external inputs.
The step counter keeps track of execution progress, taking
one clock cycle for each of the five steps described earlier
(unless a memory access takes longer than one cycle).
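A minimal sketch of the hardwired approach: the control signals are a combinational function of the step counter and the instruction class held in the IR. The particular mapping below is an assumption used only to show the structure.

# Hardwired control sketch: signals depend only on the current step and
# the decoded instruction class.  The mapping itself is illustrative.
def control(step, instr_class):
    signals = {"pc_write": False, "rf_write": False, "mem_write": False}
    if step == 1:
        signals["pc_write"] = True            # fetch and increment the PC
    elif step == 4 and instr_class == "Store":
        signals["mem_write"] = True           # memory access step
    elif step == 5 and instr_class in ("Load", "Add"):
        signals["rf_write"] = True            # write result to the destination register
    return signals

for step in range(1, 6):                      # step counter: one cycle per step
    print(step, control(step, "Load"))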
Parallel Computer Architecture
Parallel Computer Architecture: Processor Architecture and
Technology Trends, Flynn’s Taxonomy of Parallel Architectures,
Memory Organization of Parallel Computers: Computers
with Distributed Memory Organization, Computers with Shared
Memory Organization,
Thread-Level Parallelism: Simultaneous Multithreading,
Multicore Processors
26 October 2023 CSE, BMSCE 33
What is Parallel Computing?
Serial vs Parallel Computing
26 October 2023 CSE, BMSCE 34
Serial Computing
Traditionally, software has been written for serial computation: A problem is broken
into a discrete series of instructions
 Instructions are executed sequentially one after another
 Executed on a single processor
 Only one instruction may execute at any moment in time
26 October 2023 CSE, BMSCE 35
Parallel Computing
In the simplest sense, parallel computing is the simultaneous use of multiple
compute resources to solve a computational problem: A problem is broken into
discrete parts that can be solved concurrently
 Each part is further broken down to a series of instructions
 Instructions from each part execute simultaneously on different processors
 An overall control/coordination mechanism is employed
26 October 2023 CSE, BMSCE 36
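A small Python sketch of this decomposition, assuming the problem is summing a list: the data is split into four parts, each part is processed by its own worker process, and the partial results are combined.

# Parallel computing sketch: break the data into parts, let each part be
# processed by its own worker, and combine the partial results.
from multiprocessing import Pool

def part_sum(chunk):
    return sum(chunk)                            # the "series of instructions" for one part

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i::4] for i in range(4)]      # four discrete parts
    with Pool(processes=4) as pool:
        partials = pool.map(part_sum, chunks)    # parts execute simultaneously
    print(sum(partials))                         # coordination: combine the results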
Internal Parallelism Levels
Question:
List and explain the different levels of internal parallelism.
26 October 2023 CSE, BMSCE 37
Internal Parallelism Levels
1. Bit level parallelism
2. Parallelism by pipelining
3. Parallelism by multiple functional units
4. Parallelism at process or thread level
26 October 2023 CSE, BMSCE 38
Internal Parallelism Levels
1. Bit level parallelism: 1970 to ~1985
 4-bit, 8-bit, 16-bit, 32-bit, and 64-bit microprocessors
2. Parallelism by pipelining: ~1985 through today
 Instruction level parallelism (ILP): The idea of pipelining at instruction level is an
overlapping of the execution of multiple instructions.
 The execution of each instruction is partitioned into several steps which are
performed by dedicated hardware units (pipeline stages) one after another. A
typical partitioning could result in the following steps:
(a) fetch: fetch the next instruction to be executed from memory;
(b) decode: decode the instruction fetched in step (a);
(c) execute: load the operands specified and execute the instruction;
(d) write-back: write the result into the target register/memory.
26 October 2023 CSE, BMSCE 39
Internal Parallelism Levels (Contd…)
3. Parallelism by multiple functional units:
 Many processors are multiple-issue processors.
 They use multiple, independent functional units like ALUs (arithmetic
logical units), FPUs (floating-point units), load/store units, or branch
units.
 These units can work in parallel, i.e., different independent instructions
can be executed in parallel by different functional units. Thus, the
average execution rate of instructions can be increased.
4. Parallelism at process or thread level:
 The three techniques described so far assume a single sequential
control flow which is provided by the compiler and which determines
the execution order if there are dependencies between instructions.
 An alternative approach is to add multiple, independent processor
cores onto a single processor chip.
 This approach has been used for typical desktop processors since
2005. The resulting processor chips are called multicore processors.
Examples: dual-core and quad-core processors
26 October 2023 CSE, BMSCE 40
Flynn’s Taxonomy of Parallel Computers
Question:
List and explain the different classifications of parallel computers
according to Flynn’s Taxonomy.
26 October 2023 CSE, BMSCE 41
Flynn’s Taxonomy of Parallel Computers
Flynn’s Taxonomy: Classification according to important characteristics of a parallel computer.
Four categories are distinguished based on:
- How Many Instruction Streams
- How Many Data Streams
1. Single-Instruction, Single-Data (SISD)
2. Multiple-Instruction, Single-Data (MISD)
3. Single-Instruction, Multiple-Data (SIMD)
4. Multiple-Instruction, Multiple-Data (MIMD)
26 October 2023 CSE, BMSCE 42
         Instruction Stream   Data Stream
SISD     1                    1
MISD     >1                   1
SIMD     1                    >1
MIMD     >1                   >1
Flynn’s Taxonomy of Parallel Computers (Contd…)
1. Single-Instruction, Single-Data (SISD): There is one processing
element which has access to a single program and data storage. In each
step, the processing element loads an instruction and the corresponding
data and executes the instruction. The result is stored back in the data
storage. Thus, SISD is the conventional sequential computer according to
the von Neumann model.
2. Multiple-Instruction, Single-Data (MISD): There are multiple
processing elements each of which has a private program memory, but
there is only one common access to a single global data memory. In each
step, each processing element obtains the same data element from the
data memory and loads an instruction from its private program memory.
These possibly different instructions are then executed in parallel by the
processing elements using the previously obtained (identical) data element
as operand. This execution model is very restrictive and no commercial
parallel computer of this type has ever been built.
26 October 2023 CSE, BMSCE 47
Flynn’s Taxonomy of Parallel Computers (Contd…)
3. Single-Instruction, Multiple-Data (SIMD): There are multiple processing
elements each of which has a private access to a (shared or distributed) data memory.
But there is only one program memory from which a special control processor fetches
and dispatches instructions. In each step, each processing element obtains from the
control processor the same instruction and loads a separate data element through its
private data access on which the instruction is performed. Thus, the instruction
is synchronously applied in parallel by all processing elements to different data
elements.
For applications with a significant degree of data parallelism, the SIMD approach can
be very efficient. Examples are multimedia applications or computer graphics
algorithms to generate realistic three-dimensional views of computer-generated
environments.
4. Multiple-Instruction, Multiple-Data (MIMD): There are multiple processing
elements each of which has a separate instruction and data access to a (shared or
distributed) program and data memory. In each step, each processing element loads a
separate instruction and a separate data element, applies the instruction to the data
element, and stores a possible result back into the data storage. The processing
elements work asynchronously with each other. Multicore processors or cluster
systems are examples of the MIMD model.
26 October 2023 CSE, BMSCE 48
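To make the SISD/SIMD contrast concrete, the following sketch applies the same addition either one element at a time or to a whole array in one step; NumPy is used here only as an analogy for data parallelism and is not part of the course material.

# SIMD flavour: one operation is applied to many data elements at once.
import numpy as np

a = np.arange(8)
b = np.arange(8, 16)

# SISD flavour: one instruction operates on one data element at a time
c_serial = [a[i] + b[i] for i in range(len(a))]

# SIMD flavour: the same add is applied to all elements in one step
c_simd = a + b

print(c_serial)
print(c_simd)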
Memory Organization of Parallel Computers
A further classification of MIMD computers can be made
according to their memory organization:
1. Computers with Distributed Memory Organization
2. Computers with Shared Memory Organization
26 October 2023 CSE, BMSCE 49
Shared vs. Distributed Memory
The simplest and most useful way to classify modern parallel
computers is by their memory model:
Shared memory and Distributed memory
26 October 2023 CSE, BMSCE 50
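A hedged Python sketch of the two models: threads that update a shared variable directly (shared memory) versus processes that keep private data and exchange results as messages (distributed memory); the Queue merely stands in for an interconnection network.

# Shared memory: threads see the same variables directly.
# Distributed memory: processes have private data and communicate by messages.
import threading
from multiprocessing import Process, Queue

shared = {"total": 0}               # visible to every thread in the process
lock = threading.Lock()

def add(n):
    with lock:                      # threads coordinate directly on shared data
        shared["total"] += n

def node(q):
    q.put(21)                       # a node's only way out is to send a message

if __name__ == "__main__":
    # Shared-memory flavour: two threads update the same variable.
    threads = [threading.Thread(target=add, args=(21,)) for _ in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()
    print("shared-memory total:", shared["total"])

    # Distributed-memory flavour: two processes send results over a queue.
    q = Queue()
    procs = [Process(target=node, args=(q,)) for _ in range(2)]
    for p in procs: p.start()
    print("message-passing total:", q.get() + q.get())
    for p in procs: p.join()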
Illustration of computers with distributed memory
Illustration of computers with distributed memory: (a) abstract structure,
(b) computer with distributed memory and hypercube as interconnection structure,
(c) DMA (direct memory access), (d) processor–memory node with router, and (e)
interconnection network in the form of a mesh to connect the routers of the different
processor–memory nodes
26 October 2023 CSE, BMSCE 51
Illustration of a computer with shared memory
Illustration of a computer with shared memory: (a) abstract view and (b)
implementation of the shared memory with memory modules
26 October 2023 CSE, BMSCE 52
Illustration of the architecture of computers with shared memory:
Illustration of the architecture of computers with shared memory: (a) SMP –
symmetric multiprocessors, (b) NUMA – non-uniform memory access, (c) CC-NUMA –
cache-coherent NUMA, and (d) COMA – cache-only memory access
26 October 2023 CSE, BMSCE 53
Thread-Level Parallelism
26 October 2023 CSE, BMSCE 54
What is a Thread w.r.t. Computers?
Thread: A process (or program) with its own instructions and
data
- Each thread has all the state (instructions, data, PC,
register state, and so on) necessary to allow it to execute
26 October 2023 CSE, BMSCE 55
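A minimal illustration, assuming Python's threading module: both threads run the same code, but each keeps its own local state (its own stack, program position, and local variables).

# Each thread runs the same code but keeps its own private local state.
import threading

def worker(name, start):
    local_counter = start           # private to this thread's stack
    for _ in range(3):
        local_counter += 1
    print(f"{name}: {local_counter}")

t1 = threading.Thread(target=worker, args=("thread-1", 0))
t2 = threading.Thread(target=worker, args=("thread-2", 100))
t1.start(); t2.start()
t1.join(); t2.join()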
Thread-Level Parallelism
Multi-Core Processors: Placement of multiple independent
execution cores with all execution resources onto a single
processor chip.
Design choices for multicore chips
26 October 2023 CSE, BMSCE 56
Hierarchical design          Pipelined design          Network-based design
Thanks for Listening
26 October 2023 CSE, BMSCE 59